Re: Missed warning (-Wuse-after-free)

2023-02-24 Thread Martin Uecker via Gcc
Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn:
> On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote:
> > Hi Martin,
> > 
> > On 2/23/23 20:57, Martin Uecker wrote:
> > > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar:
> > > > Hi Martin,
> > > > 
> > > > On 2/17/23 14:48, Martin Uecker wrote:
> > > > > > This new wording doesn't even allow one to use memcmp(3);
> > > > > > just reading the pointer value, however you do it, is UB.
> > > > > 
> > > > > memcmp would not use the pointer value but work
> > > > > on the representation bytes and is still allowed.
> > > > 
> > > > Hmm, interesting.  It's rather unspecified behavior. Still
> > > > unpredictable: (memcmp(&p, &p, sizeof(p)) == 0) might evaluate to true or
> > > > false randomly; the compiler may compile out the call to memcmp(3),
> > > > since it knows it won't produce any observable behavior.
> > > > 
> > > > 
> > > 
> > > No, I think several things get mixed up here.
> > > 
> > > The representation of a pointer that becomes invalid
> > > does not change.
> > > 
> > > So (0 == memcmp(&p, &p, sizeof(p))) always
> > > evaluates to true.
> > > 
> > > Also in general, an unspecified value is simply unspecified
> > > but does not change anymore.
> 
> Right.  p is its own thing - n bytes on the stack containing some value.
> Once it comes into scope, it doesn't change on its own.  And if I do
> free(p) or o = realloc(p), then the value of p itself - the n bytes on
> the stack - does not change.

Yes, but one comment about terminology: the C standard
differentiates between the representation, i.e. the bytes on
the stack, and the value.  The representation is converted to
a value during lvalue conversion.  For an invalid pointer
the representation is indeterminate because it now does not
point to a valid object anymore.  So it is not possible to
convert the representation to a value during lvalue conversion.
In other words, it does not make sense to speak of the value
of the pointer anymore.

> I realize C11 appears to have changed that.  I fear that in doing so it
> actually risks increasing the confusion about pointers.  IMO it's much
> easier to reason about
> 
>   o = realloc(p, X);
> 
> (and more baroque constructions) when keeping in mind that o, p, and the
> object pointed to by either one are all different things.
> 

What did change in C11? As far as I know, the pointer model
did not change in C11.

> > > Reading an uninitialized value of automatic storage whose
> > > address was not taken is undefined behavior, so everything
> > > is possible afterwards.
> > > 
> > > An uninitialized variable whose address was taken has a
> > > representation which can represent an unspecified value
> > > or a no-value (trap) representation. Reading the
> > > representation itself is always ok and gives consistent
> > > results. Reading the variable can be undefined behavior
> > > iff it is a trap representation, otherwise you get
> > > the unspecified value which is stored there.
> > > 
> > > At least this is my reading of the C standard. Compilers
> > > are not fully conformant.
> > 
> > Does all this imply that the following is well defined behavior (and shall
> > print what one would expect)?
> > 
> >   free(p);
> > 
> >   (void) &p;  // take the address
> >   // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ?
> > 
> >   printf("%p\n", p);  // we took previously its address,
> >   // so now it has to hold consistently
> >   // the previous value
> > 
> > 

No, the printf is not well defined, because the lvalue conversion
of the pointer with indeterminate representation may lead to
undefined behavior.


Martin


> > This feels weird.  And a bit of a Schroedinger's pointer.  I'm not entirely
> > convinced, but might be.
> 
> Again, p is just an n byte variable which happens to have (one hopes)
> pointed at a previously malloc'd address.
> 
> And I'd argue that pre-C11, this was not confusing, and would not have
> felt weird to you.
> 
> But I am most grateful to you for having brought this to my attention.
> I may not agree with it and not like it, but it's right there in the
> spec, so time for me to adjust :)
> 







Re: Missed warning (-Wuse-after-free)

2023-02-24 Thread Martin Uecker via Gcc
Am Freitag, dem 24.02.2023 um 02:42 +0100 schrieb Alex Colomar:
> Hi Serge, Martin,
> 
> On 2/24/23 02:21, Serge E. Hallyn wrote:
> > > Does all this imply that the following is well defined behavior (and shall
> > > print what one would expect)?
> > > 
> > >    free(p);
> > > 
> > >    (void) &p;  // take the address
> > >    // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ?
> > > 
> > >    printf("%p\n", p);  // we took previously its address,
> > >    // so now it has to hold consistently
> > >    // the previous value
> > > 
> > > 
> > > This feels weird.  And a bit of a Schroedinger's pointer.  I'm not
> > > entirely convinced, but might be.
> > 
> > Again, p is just an n byte variable which happens to have (one hopes)
> > pointed at a previously malloc'd address.
> > 
> > And I'd argue that pre-C11, this was not confusing, and would not have
> > felt weird to you.
> > 
> > But I am most grateful to you for having brought this to my attention.
> > I may not agree with it and not like it, but it's right there in the
> > spec, so time for me to adjust :)
> 
> I'll try to show why this feels weird to me (even in C89):
> 
> 
> alx@dell7760:~/tmp$ cat pointers.c
> #include <stdio.h>
> #include <stdlib.h>
> 
> 
> int
> main(void)
> {
>   char  *p, *q;
> 
>   p = malloc(42);
>   if (p == NULL)
>   exit(1);
> 
>   q = realloc(p, 42);
>   if (q == NULL)
>   exit(1);
> 
>   (void) &p;  // If we remove this, we get -Wuse-after-free
> 
>   printf("(%p == %p) = %i\n", p, q, (p == q));
> }
> alx@dell7760:~/tmp$ cc -Wall -Wextra pointers.c  -Wuse-after-free=3
> alx@dell7760:~/tmp$ ./a.out
> (0x5642cd9022a0 == 0x5642cd9022a0) = 1
> 

No, you can't do the comparison or use the value of 'p'
because 'p' is not a valid pointer. (The address taken
makes no difference here, but it may confuse the
compiler so that it does not warn.)

> 
> These pointers point to different objects (actually, one of them doesn't
> even point to an object anymore), so they can't compare equal, according 
> to both:
> 
> 
> 
> 
> 
> (I believe C89 already had the concept of lifetime well defined as it is 
> now, so the object had finished its lifetime after realloc(3)).
> 
> How can we justify that true, if the pointers don't point to the same
> object?  And how can we justify a hypothetical false (which compilers 
> don't implement), if compilers will really just read the value?  To 
> implement this as well defined behavior, it could result in no other 
> than false, and it would require heavy overhead for the compilers to 
> detect that the seemingly-equal values are indeed different, don't you 
> think?  The easiest solution is for the standard to just declare this 
> outlaw, IMO.

This is undefined behavior, so the comparison can return false
or true or crash or whatever.  

Martin

> 
> Maybe it could do an exception for printing, that is, reading a pointer 
> is not a problem in itself, as long as you don't compare it, but I'm not
> such an expert about this.
> 
> Cheers,
> 
> Alex
> 
> > 
> > -serge
> 
> -- 
> 
> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
> 




Re: Missed warning (-Wuse-after-free)

2023-02-24 Thread Martin Uecker via Gcc
Am Freitag, dem 24.02.2023 um 03:01 + schrieb Peter Lafreniere:

...
> 
> > Maybe it could do an exception for printing, that is, reading a pointer
> > is not a problem in itself, as long as you don't compare it, but I'm not
> > such an expert about this.
> 
> One last thought: with the above strict interpretation of the c standard,
> it would become nigh on impossible to implement the malloc(3) family of
> functions in c themselves. I vote for the "shared storage" interpretation
> of the c11 standard that is actually implemented rather than this abstract
> liveness oriented interpretation.

This is a bit of a misunderstanding about what "undefined behavior" means
in ISO C. It simply means that ISO C does not specify the behavior.  This
does not mean it is illegal to do something which has undefined behavior.

Instead, it means you can not rely on the ISO C standard for portable
behavior.  So if you implement "malloc" in C itself you will probably
rely on "undefined behavior", but this is perfectly fine.  The C standard
specifies behavior of "malloc", but does not care how it is implemented.


Martin



[GSoC][C++: Compiler Built-in Traits]: Example Impls & Small Patches

2023-02-24 Thread Ken Matsui via Gcc
Hi,

My name is Ken Matsui. I am highly interested in contributing to the
project idea, "C++: Implement compiler built-in traits for the
standard library traits." To understand how to implement those traits,
could you please give me some example implementations of the compiler
built-in traits, as well as some recommended traits to get started
with making small patches?

Also, I would appreciate receiving the contact information for the
project mentor, Patrick Palka.

Sincerely,
Ken Matsui


Re: Missed warning (-Wuse-after-free)

2023-02-24 Thread Serge E. Hallyn
On Fri, Feb 24, 2023 at 09:36:45AM +0100, Martin Uecker wrote:
> Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn:
> > On Fri, Feb 24, 2023 at 01:02:54AM +0100, Alex Colomar wrote:
> > > Hi Martin,
> > > 
> > > On 2/23/23 20:57, Martin Uecker wrote:
> > > > Am Donnerstag, dem 23.02.2023 um 20:23 +0100 schrieb Alex Colomar:
> > > > > Hi Martin,
> > > > > 
> > > > > On 2/17/23 14:48, Martin Uecker wrote:
> > > > > > > This new wording doesn't even allow one to use memcmp(3);
> > > > > > > just reading the pointer value, however you do it, is UB.
> > > > > > 
> > > > > > memcmp would not use the pointer value but work
> > > > > > on the representation bytes and is still allowed.
> > > > > 
> > > > > Hmm, interesting.  It's rather unspecified behavior. Still
> > > > > > unpredictable: (memcmp(&p, &p, sizeof(p)) == 0) might evaluate to true or
> > > > > false randomly; the compiler may compile out the call to memcmp(3),
> > > > > since it knows it won't produce any observable behavior.
> > > > > 
> > > > > 
> > > > 
> > > > No, I think several things get mixed up here.
> > > > 
> > > > The representation of a pointer that becomes invalid
> > > > does not change.
> > > > 
> > > > So (0 == memcmp(&p, &p, sizeof(p))) always
> > > > evaluates to true.
> > > > 
> > > > Also in general, an unspecified value is simply unspecified
> > > > but does not change anymore.
> > 
> > Right.  p is its own thing - n bytes on the stack containing some value.
> > Once it comes into scope, it doesn't change on its own.  And if I do
> > free(p) or o = realloc(p), then the value of p itself - the n bytes on
> > the stack - does not change.
> 
> Yes, but one comment about terminology: the C standard
> differentiates between the representation, i.e. the bytes on
> the stack, and the value.  The representation is converted to
> a value during lvalue conversion.  For an invalid pointer
> the representation is indeterminate because it now does not
> point to a valid object anymore.  So it is not possible to
> convert the representation to a value during lvalue conversion.
> In other words, it does not make sense to speak of the value
> of the pointer anymore.

I'm sure there are, especially from an implementer's point of view,
great reasons for this.

However, as just a user, the "value" of 'void *p' should absolutely
not be tied to whatever is at that address.  I'm given a simple
linear memory space, under which sits an entirely different view
obfuscated by page tables, but that doesn't concern me.  If I say
void *p = -1, then if I print p, then I expect to see that value.

Since I'm complaining about standards I'm picking and choosing here,
but I'll still point at the printf(3) manpage :)  :

   p  The void * pointer argument is printed in hexadecimal (as if by %#x
      or %#lx).

> > I realize C11 appears to have changed that.  I fear that in doing so it
> > actually risks increasing the confusion about pointers.  IMO it's much
> > easier to reason about
> > 
> > o = realloc(p, X);
> > 
> > (and more baroque constructions) when keeping in mind that o, p, and the
> > object pointed to by either one are all different things.
> > 
> 
> What did change in C11? As far as I know, the pointer model
> did not change in C11.

I haven't looked in more detail, and don't really plan to, but my
understanding is that the text of:

  The lifetime of an object is the portion of program execution during
  which storage is guaranteed to be reserved for it. An object exists, has
  a constant address, and retains its last-stored value throughout its
  lifetime. If an object is referred to outside of its lifetime, the
  behavior is undefined. The value of a pointer becomes indeterminate when
  the object it points to (or just past) reaches the end of its lifetime.

(especially the last sentence) was new.

Maybe the words "value of a pointer" don't mean what I think they
mean.  But that's the phrase to which I object.  The n bytes on
the stack, p, are not changed just because something happened with
the accounting for the memory at the address represented by that
value.  If they do, then that's not 'C' any more.

> > > > Reading an uninitialized value of automatic storage whose
> > > > address was not taken is undefined behavior, so everything
> > > > is possible afterwards.
> > > > 
> > > > An uninitialized variable whose address was taken has a
> > > > representation which can represent an unspecified value
> > > > or a no-value (trap) representation. Reading the
> > > > representation itself is always ok and gives consistent
> > > > results. Reading the variable can be undefined behavior
> > > > iff it is a trap representation, otherwise you get
> > > > the unspecified value which is stored there.
> > > > 
> > > > At least this is my reading of the C standard. Compilers
> > > > are not fully conformant.
> > > 
> > > Does all this imply that the following is well 

[GSoC] Introduction and query on LTO object emission project

2023-02-24 Thread Peter Lafreniere via Gcc
Hi! I've been interested in compiler development for a while, and would love to
work with any of you as part of GSoC, or even just as a side-project on my own.

I'm an 18-year-old student going into university next year with a passion for
all things open source and low level. I consider myself fluent in C, and
proficient with C++, Rust, and x86 assembly, but unfamiliar with practical
compiler design.
I have done some reading on the theoretical aspects of compilers, however.

While I haven't worked with the GCC community before, I have worked with the
Linux community and have made several small patches there, so I am familiar
with both email-based workflows and the principles of open-source development.

This summer, I'm looking for more experience working on larger projects, as well
as getting into real compilers.

Of particular interest to me is the project idea labelled "Bypass assembler when
generating LTO object files." I see that the project was taken last year, but
I can find no sign of any changes committed to trunk
(`git shortlog --after=2022-01-01 | grep -i -E "lto|assembl(er|y)"` shows
nothing related to this project) and no sign of any needed change made in the
code. Is this project still available?

I'm also willing to work on other projects, ideally in the middle/backend, but
currently I have only been experimenting with the gcc/[lto,data]-streamer*
files. If anyone has a small or medium sized project idea, please feel free to
let me know.


I look forward to working with all of you in the future,

Peter Lafreniere




Re: Missed warning (-Wuse-after-free)

2023-02-24 Thread Serge E. Hallyn
On Fri, Feb 24, 2023 at 02:42:32AM +0100, Alex Colomar wrote:
> Hi Serge, Martin,
> 
> On 2/24/23 02:21, Serge E. Hallyn wrote:
> > > Does all this imply that the following is well defined behavior (and shall
> > > print what one would expect)?
> > > 
> > > >    free(p);
> > > > 
> > > >    (void) &p;  // take the address
> > > >    // or maybe we should (void) memcmp(&p, &p, sizeof(p)); ?
> > > > 
> > > >    printf("%p\n", p);  // we took previously its address,
> > > >    // so now it has to hold consistently
> > > >    // the previous value
> > > 
> > > 
> > > > This feels weird.  And a bit of a Schroedinger's pointer.  I'm not
> > > > entirely convinced, but might be.
> > 
> > Again, p is just an n byte variable which happens to have (one hopes)
> > pointed at a previously malloc'd address.
> > 
> > And I'd argue that pre-C11, this was not confusing, and would not have
> > felt weird to you.
> > 
> > But I am most grateful to you for having brought this to my attention.
> > I may not agree with it and not like it, but it's right there in the
> > spec, so time for me to adjust :)
> 
> I'll try to show why this feels weird to me (even in C89):
> 
> 
> alx@dell7760:~/tmp$ cat pointers.c
> #include <stdio.h>
> #include <stdlib.h>
> 
> 
> int
> main(void)
> {
>   char  *p, *q;
> 
>   p = malloc(42);
>   if (p == NULL)
>   exit(1);
> 
>   q = realloc(p, 42);
>   if (q == NULL)
>   exit(1);
> 
>   (void) &p;  // If we remove this, we get -Wuse-after-free

(which I would argue is a bug in the compiler)

>   printf("(%p == %p) = %i\n", p, q, (p == q));
> }
> alx@dell7760:~/tmp$ cc -Wall -Wextra pointers.c  -Wuse-after-free=3
> alx@dell7760:~/tmp$ ./a.out
> (0x5642cd9022a0 == 0x5642cd9022a0) = 1
> 
> 
> These pointers point to different objects (actually, one of them doesn't even
> point to an object anymore), so they can't compare equal, according to both:
> 
> 
> 
> 
> 
> (I believe C89 already had the concept of lifetime well defined as it is
> now, so the object had finished its lifetime after realloc(3)).
> 
> How can we justify that true, if the pointers don't point to the same object?

Because what's pointed to does not matter.

You are comparing the memory address p, not the contents of the memory address.

By way of analogy, if I do

   mkdir -p /tmp/1/a
   ln -s /tmp/1 /tmp/2
   rm -rf /tmp/1

then /tmp/2 is still a symlink.  'stat /tmp/2' still works and is well
defined.  And if I create a new /tmp/1, then /tmp/2 starts pointing to
that.  Yes, reusing p like that is a very bad idea, in many cases :)

> And how can we justify a hypothetical false (which compilers don't
> implement), if compilers will really just read the value?  To implement this
> as well defined behavior, it could result in no other than false, and it
> would require heavy overhead for the compilers to detect that the
> seemingly-equal values are indeed different, don't you think?  The easiest
> solution is for the standard to just declare this outlaw, IMO.
> 
> Maybe it could do an exception for printing, that is, reading a pointer is
> not a problem in itself, as long as you don't compare it, but I'm not such an
> expert about this.
> 
> Cheers,
> 
> Alex
> 
> > 
> > -serge
> 
> -- 
> 
> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
> 





Re: Missed warning (-Wuse-after-free)

2023-02-24 Thread Martin Uecker via Gcc
Am Freitag, dem 24.02.2023 um 10:01 -0600 schrieb Serge E. Hallyn:
> On Fri, Feb 24, 2023 at 09:36:45AM +0100, Martin Uecker wrote:
> > Am Donnerstag, dem 23.02.2023 um 19:21 -0600 schrieb Serge E. Hallyn:

...
> > 
> > Yes, but one comment about terminology: the C standard
> > differentiates between the representation, i.e. the bytes on
> > the stack, and the value.  The representation is converted to
> > a value during lvalue conversion.  For an invalid pointer
> > the representation is indeterminate because it now does not
> > point to a valid object anymore.  So it is not possible to
> > convert the representation to a value during lvalue conversion.
> > In other words, it does not make sense to speak of the value
> > of the pointer anymore.
> 
> I'm sure there are, especially from an implementer's point of view,
> great reasons for this.
> 
> However, as just a user, the "value" of 'void *p' should absolutely
> not be tied to whatever is at that address.

Think about it in this way: The set of possible values for a pointer
is the set of objects that exist at a point in time. If one object
disappears, a pointer can not point to it anymore. So it is not that
the pointer changes, but the set of valid values.

>   I'm given a simple
> linear memory space, under which sits an entirely different view
> obfuscated by page tables, but that doesn't concern me.  If I say
> void *p = -1, then if I print p, then I expect to see that value.

If you store an integer into a pointer (you need a cast), then
this is implementation-defined and may also produce an invalid
pointer.

> 
> Since I'm complaining about standards I'm picking and choosing here,
> but I'll still point at the printf(3) manpage :)  :
> 
>    p  The void * pointer argument is printed in hexadecimal (as if by %#x
>       or %#lx).

This is valid if the pointer is valid, but if the pointer
is invalid, this is undefined behavior.

In C one should not think about pointers as addresses. They
are abstract handles that point to objects, and compilers
do exploit this for optimization.

If you need an address, you can cast it to uintptr_t
(but see below).

> 
> > > I realize C11 appears to have changed that.  I fear that in doing so it
> > > actually risks increasing the confusion about pointers.  IMO it's much
> > > easier to reason about
> > > 
> > >   o = realloc(p, X);
> > > 
> > > (and more baroque constructions) when keeping in mind that o, p, and the
> > > object pointed to by either one are all different things.
> > > 
> > 
> > What did change in C11? As far as I know, the pointer model
> > did not change in C11.
> 
> I haven't looked in more detail, and don't really plan to, but my
> understanding is that the text of:
> 
>   The lifetime of an object is the portion of program execution during
>   which storage is guaranteed to be reserved for it. An object exists, has
>   a constant address, and retains its last-stored value throughout its
>   lifetime. If an object is referred to outside of its lifetime, the
>   behavior is undefined. The value of a pointer becomes indeterminate when
>   the object it points to (or just past) reaches the end of its lifetime.
> 
> (especially the last sentence) was new.

This is not new.

C99 "The value of a pointer becomes indeterminate when
the object it points to reaches the end of its lifetime."

C90: "The value of a pointer that referred to an object
with automatic storage duration that is no longer
guaranteed to be reserved is indeterminate."

and

"The value of a pointer that refers to freed space is
indeterminate."

> Maybe the words "value of a pointer" don't mean what I think they
> mean.  But that's the phrase to which I object.  The n bytes on
> the stack, p, are not changed just because something happened with
> the accounting for the memory at the address represented by that
> value.  If they do, then that's not 'C' any more.

It is not about the bytes of the pointer changing. But if
the object is freed they do not represent a valid pointer
anymore.  There were CPUs that trapped when an invalid
address is loaded, e.g. because the data segment for the
object was removed from the segment tables. So this has been a
rule in portable 'C' for more than 30 years.

Nowadays compilers exploit the knowledge that the
object is freed. So you can not reliably use such
a pointer. If you do this, your code will be broken on
most modern compilers.


> 
> > > > > Reading an uninitialized value of automatic storage whose
> > > > > address was not taken is undefined behavior, so everything
> > > > > is possible afterwards.
> > > > > 
> > > > > An uninitialized variable whose address was taken has a
> > > > > representation which can represent an unspecified value
> > > > > or a no-value (trap) representation. Reading the
> > > > > representation itself is always ok and gives consistent
> > > > > results. Reading the variable can be undefined behavior
> > > > > iff it is a trap representation, 

gcc-11-20230224 is now available

2023-02-24 Thread GCC Administrator via Gcc
Snapshot gcc-11-20230224 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20230224/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-11 revision d832f1977440565f9cb8e154c8ff92c36714d2e8

You'll find:

 gcc-11-20230224.tar.xz   Complete GCC

  SHA256=73ac9c6d8dedf9f160e3a58815485282646dd802b1f561b56f274fc786867917
  SHA1=5a84a87983bd6ae8b1e557cc9d8b05a86ae45e96

Diffs from 11-20230217 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Bug target/55218] armv6 doesn't use unaligned access for packed structures

2023-02-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55218

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-02-25
     Ever confirmed|0                           |1

--- Comment #3 from Andrew Pinski  ---
Confirmed. Very similar to PR 51709, if not the same.

[Bug middle-end/108920] Condition falsely optimized out

2023-02-24 Thread agner at agner dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108920

--- Comment #3 from Agner Fog  ---
It seems to work with gcc 9.4.0.
Thank you

[Bug target/108922] fmod() 13x slowdown in gcc4.9 dropping "fprem" and calling fmod()

2023-02-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922

--- Comment #1 from Andrew Pinski  ---
>The committer also claims "fixes ieee_2.f90 testsuite failure" but I have no
>idea where to find this testsuite.


./testsuite/gfortran.dg/ieee/ieee_2.f90

[Bug middle-end/55658] bitfields and __attribute__((packed)) generate horrible code on x86_64

2023-02-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55658

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |pinskia at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Andrew Pinski  ---
Mine, I am going to look into this for GCC 14 (separately from the bitfield
lower since it also still happens with that pass).

That pass does give some extra information though on this (a slightly different
but similar enough testcase):
```
Trying to expand bitfield reference:
a_3(D)->la
base: *a_3(D) orig bitpos: 0 bytepos: 0
after bit_range bitpos: 0 bytepos: 0 bitregion_start: 0 bitregion_end: 63
align: 8 word_size: 64.
failed, get_best_mode return false.
```

So basically get_best_mode failed with the above arguments ...

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Ajit Agarwal via Gcc-patches
Hello Segher:

On 24/02/23 8:41 pm, Segher Boessenkool wrote:
> Hi!
> 
> For future patches: please don't send patches as replies to existing
> threads.  Just start a new thread for a new patch (series).  You can
> mark it as [PATCH v2] in the subject, if you want.
> 
> On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote:
>> Here is the patch that uses xxlor instead of fmr where possible.
>> Performance results show that fmr is better in power9 and
>> power10 architectures whereas xxlor is better in power7 and
>> power8 architectures.
> 
> And fmr is the only option before p7.
> 
>>  rs6000: Use xxlor instead of fmr where possible
>>
>>  This patch replaces fmr with xxlor instruction for power7
>>  and power8 architectures whereas for power9 and power10
>>  replaces xxlor with fmr instruction.
> 
> Saying "this patch" in a commit message reads strangely.  Just "Replace
> fmr with" etc.?
> 

I will correct this.

> The second part is just wrong, you cannot replace xxlor by fmr in
> general.
> 
>>  Perf measurement results:
>>
>>  Power9 fmr:  201,847,661 cycles.
>>  Power9 xxlor: 201,877,78 cycles.
>>  Power8 fmr: 201,057,795 cycles.
>> Power8 xxlor: 201,004,671 cycles.
> 
> What is this measuring?  100M insns back-to-back, each dependent on the
> previous one?
> 
Yes.

> What are the results on p7 and p10?
> 
> These numbers show there is no difference on p8 either.  Did you paste
> the wrong numbers maybe?
>

I will measure it again and update with a new patch.
 
>>  * config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
>>  for power7 and power8 and fmr for power9 and power10.
> 
> Please don't break lines early.  Changelogs lines can be 80 columns
> wide, just like source code lines.
> 
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -354,7 +354,7 @@ (define_attr "cpu"
>>(const (symbol_ref "(enum attr_cpu) rs6000_tune")))
>>  
>>  ;; The ISA we implement.
>> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
>> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"
> 
> p78v, and sort it after p8v please.
> 
>> + (and (eq_attr "isa" "p7p8")
>> +  (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
>> + (const_int 1)
> 
> Okay.
> 
>>  (define_insn "*mov_hardfloat64"
>>[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
>> -   "=m,   d,  d,  ,   wY,
>> - ,Z,  ,  ,  !r,
>> - YZ,  r,  !r, *c*l,   !r,
>> -*h,   r,  ,   wa")
>> +   "=m,   d,  ,  ,   wY,
>> +, Z,  wa, ,  !r,
>> +YZ,   r,  !r, *c*l,   !r,
>> +*h,   r,  ,   d,  wn,
>> +wa")
>>  (match_operand:FMOVE64 1 "input_operand"
> 
> (You posted this mail as wrapping.  That means the patch cannot be
> applied non-manually, and that replies to your mail will be mangled.
> Just get a Real mail client, and configure it correctly :-) )
>

I am using Thunderbird as my mail client and the settings are all correct.
I have set mailnews.wrapLength to 0.

 
>> -"d,   m,  d,  wY, ,
>> - Z,   ,   ,  ,  ,
>> +"d,   m,  ,  wY, ,
>> + Z,   ,   wa, ,  ,
>>   r,   YZ, r,  r,  *h,
>> - 0,   ,   r,  eP"))]
>> + 0,   ,   r,  d,  wn,
>> + eP"))]
> 
> No.  It is impossible to figure out what you changed here by just
> reading it.
> 
> There is no requirement there should be exactly five alternatives per
> line, and/or that there should be the same number everywhere.
> 
> If the indentation was incorrect, and you want to fix that, do that in a
> separate *earlier* patch in the series, please.
> 

I will keep the indentation the same.

>>"TARGET_POWERPC64 && TARGET_HARD_FLOAT
>> && (gpc_reg_operand (operands[0], mode)
>> || gpc_reg_operand (operands[1], mode))"
>>"@
>> stfd%U0%X0 %1,%0
>> lfd%U1%X1 %0,%1
>> -   fmr %0,%1
>> +   xxlor %x0,%x1,%x1
>> lxsd %0,%1
>> stxsd %1,%0
>> lxsdx %x0,%y1
>> stxsdx %x1,%y0
>> -   xxlor %x0,%x1,%x1
>> +   fmr %0,%1
>> xxlxor %x0,%x0,%x0
>> li %0,0
>> std%U0%X0 %1,%0
>> @@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
>> nop
>> mfvsrd %0,%x1
>> mtvsrd %x0,%1
>> +   fmr %0,%1
>> +   fmr %0,%1
>> #"
>>[(set_attr "type"
>> -"fpstore, fpload, fpsimple,   fpload, fpstore,
>> +"fpstore, fpload, veclogical, fpload, fpstore,
>>   fpload,  fpstore,veclogical, veclogical, integer,
>>   store,   load,   *,  mtjmpr, mfjmpr,
>> -   

[Bug target/108922] fmod() 13x slowdown in gcc4.9 dropping "fprem" and calling fmod()

2023-02-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-02-25
             Status|UNCONFIRMED                 |WAITING

--- Comment #2 from Andrew Pinski  ---
So the simple test is to run the full GCC bootstrap/test with all languages and
check if the testcase fails or not. I suspect it will.

[Bug tree-optimization/66364] poor optimization of packed structs containing bitfields

2023-02-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66364

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Andrew Pinski  ---
Dup of bug 55658.

*** This bug has been marked as a duplicate of bug 55658 ***

[Bug middle-end/55658] bitfields and __attribute__((packed)) generate horrible code on x86_64

2023-02-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55658

Andrew Pinski  changed:

   What|Removed |Added

 CC||rogero at howzatt dot co.uk

--- Comment #5 from Andrew Pinski  ---
*** Bug 66364 has been marked as a duplicate of this bug. ***

Re: [PATCH v3 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index cf0cd669be4..5cd3f7673f0 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.h
> @@ -215,4 +215,7 @@ enum stack_protector_guard {
>  #define TARGET_XTHEADMEMPAIR ((riscv_xthead_subext & MASK_XTHEADMEMPAIR) != 
> 0)
>  #define TARGET_XTHEADSYNC((riscv_xthead_subext & MASK_XTHEADSYNC) != 0)
>
> +#define HAVE_POST_MODIFY_DISP TARGET_XTHEADMEMIDX
> +#define HAVE_PRE_MODIFY_DISP  TARGET_XTHEADMEMIDX
> +
>  #endif /* ! GCC_RISCV_OPTS_H */
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 1b7ba02726d..019a0e08285 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -65,6 +65,24 @@ extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, 
> rtx);
>  extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx);
>  extern void riscv_expand_conditional_branch (rtx, enum rtx_code, rtx, rtx);
>  #endif
> +
> +extern bool
> +riscv_classify_address_index (struct riscv_address_info *info, rtx x,
> + machine_mode mode, bool strict_p);
> +extern bool
> +riscv_classify_address_modify (struct riscv_address_info *info, rtx x,
> +  machine_mode mode, bool strict_p);
> +
> +extern const char *
> +riscv_output_move_index (rtx x, machine_mode mode, bool ldr);
> +extern const char *
> +riscv_output_move_modify (rtx x, machine_mode mode, bool ldi);
> +
> +extern bool
> +riscv_legitimize_address_index_p (rtx x, machine_mode mode, bool uindex);
> +extern bool
> +riscv_legitimize_address_modify_p (rtx x, machine_mode mode, bool post);
> +
>  extern bool riscv_expand_conditional_move (rtx, rtx, rtx, rtx);
>  extern rtx riscv_legitimize_call_address (rtx);
>  extern void riscv_set_return_address (rtx, rtx);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 33854393bd2..2980dbd69f9 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -83,6 +83,19 @@ along with GCC; see the file COPYING3.  If not see
>
>  /* Classifies an address.
>
> +   ADDRESS_REG_REG
> +   A base register indexed by (optionally scaled) register.
> +
> +   ADDRESS_REG_UREG
> +   A base register indexed by (optionally scaled) zero-extended register.
> +
> +   ADDRESS_REG_WB
> +   A base register indexed by immediate offset with writeback.
> +
> +   ADDRESS_REG
> +   A natural register + offset address.  The register satisfies
> +   riscv_valid_base_register_p and the offset is a const_arith_operand.
> +
> ADDRESS_REG
> A natural register + offset address.  The register satisfies
> riscv_valid_base_register_p and the offset is a const_arith_operand.
> @@ -97,6 +110,9 @@ along with GCC; see the file COPYING3.  If not see
> ADDRESS_SYMBOLIC:
> A constant symbolic address.  */
>  enum riscv_address_type {
> +  ADDRESS_REG_REG,
> +  ADDRESS_REG_UREG,
> +  ADDRESS_REG_WB,
>ADDRESS_REG,
>ADDRESS_LO_SUM,
>ADDRESS_CONST_INT,
> @@ -201,6 +217,7 @@ struct riscv_address_info {
>rtx reg;
>rtx offset;
>enum riscv_symbol_type symbol_type;
> +  int shift;
>  };
>
>  /* One stage in a constant building sequence.  These sequences have
> @@ -1025,12 +1042,31 @@ riscv_classify_address (struct riscv_address_info 
> *info, rtx x,
>if (riscv_v_ext_vector_mode_p (mode))
> return false;
>
> +  if (riscv_valid_base_register_p (XEXP (x, 0), mode, strict_p)
> + && riscv_classify_address_index (info, XEXP (x, 1), mode, strict_p))
> +   {
> + info->reg = XEXP (x, 0);
> + return true;
> +   }
> +  else if (riscv_valid_base_register_p (XEXP (x, 1), mode, strict_p)
> +   && riscv_classify_address_index (info, XEXP (x, 0),
> +mode, strict_p))
> +   {
> + info->reg = XEXP (x, 1);
> + return true;
> +   }
> +
>info->type = ADDRESS_REG;
>info->reg = XEXP (x, 0);
>info->offset = XEXP (x, 1);
>return (riscv_valid_base_register_p (info->reg, mode, strict_p)
>   && riscv_valid_offset_p (info->offset, mode));
>
> +case POST_MODIFY:
> +case PRE_MODIFY:
> +
> +  return riscv_classify_address_modify (info, x, mode, strict_p);
> +
>  case LO_SUM:
>/* RVV load/store disallow LO_SUM.  */
>if (riscv_v_ext_vector_mode_p (mode))
> @@ -1269,6 +1305,263 @@ riscv_emit_move (rtx dest, rtx src)
>   : emit_move_insn_1 (dest, src));
>  }
>
> +/* Return true if address offset is a valid index.  If it is, fill in INFO
> +   appropriately.  STRICT_P is true if REG_OK_STRICT is in effect.  */
> +
> +bool
> +riscv_classify_address_index (struct riscv_address_info *info, rtx x,
> +  machine_mode mode, bool strict_p)

indent

> +{
> +  enum riscv_address_type type = ADDRESS_REG_REG;;
> +  rtx index;
> +  int shift = 0;
> +
> +  

Re: [PATCH v3 11/11] riscv: thead: Add support for the XTheadFMemIdx ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
> > +(define_memory_constraint "Qmx"
> > +  "@internal
> > +   An address valid for GPR."
> > +  (and (match_code "mem")
> > +   (match_test "!riscv_legitimize_address_index_p (
> > +   XEXP (op, 0), GET_MODE (op), false)")))
>
> Check TARGET_XTHEADFMEMIDX, and I don't quite understand why it

I changed my mind: don't check TARGET_XTHEADFMEMIDX here;
check the extension in the pattern instead.


Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 09:50, Jonathan Wakely  wrote:
>
> On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote:
> >
> > Assuming a compiler treats T m_vecdata[1]; like a flexible array member
> > (which we need, because standard C++ has neither flexible array members
> > nor [0] arrays), I wonder whether, instead of the trick of m_auto followed
> > by m_data, we could make auto_vec have
> > alignas(vec) unsigned char m_data[sizeof (vec) + (N - 1) * sizeof (T)];
> > and do a placement new of vec into that m_data during auto_vec
> > construction.  Isn't that similar to how flexible array members are
> > normally used in C, where one uses malloc or alloca to allocate storage
> > for them, the storage can be larger than the structure itself, and the
> > flexible array member can then use the storage after it?
>
> You would still be accessing past the end of the
> vec::m_vecdata array which is UB.

My thinking is something like:

// New tag type
struct vl_relative { };

// This must only be used as a member subobject of another type
// which provides the trailing storage.
template
struct vec
{
  T *address (void) { return (T*)(&m_vecpfx+1); }
  const T *address (void) const { return (const T*)(&m_vecpfx+1); }

  alignas(T) alignas(vec_prefix) vec_prefix m_vecpfx;
};

template
class auto_vec : public vec
{
  // ...
private:
  vec m_head;
  T m_data[N];

static_assert(...);
};
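
For experimenting outside GCC, the idea above can be reduced to a small standalone sketch. Everything in it is an assumption made for illustration: prefix_sketch, vec_sketch and auto_vec_sketch are hypothetical names, the trailing storage is raw bytes rather than T m_data[N], and the element address is computed as this + 1 so that any padding caused by over-aligning the header is skipped. Whether reaching past the header object like this is strictly conforming is exactly what the thread is debating, and optimizing GCC builds may warn about it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for GCC's vec_prefix: just the two counters.
struct prefix_sketch { unsigned m_alloc; unsigned m_num; };

// Header with no element array of its own.  It must only be used as a
// member of another object that places element storage right after it.
template<typename T>
struct vec_sketch
{
  // Elements start at the first byte past the header object.
  T *address () { return reinterpret_cast<T *> (this + 1); }
  const T *address () const { return reinterpret_cast<const T *> (this + 1); }
  unsigned length () const { return m_vecpfx.m_num; }

  void quick_push (const T &obj)
  {
    assert (m_vecpfx.m_num < m_vecpfx.m_alloc);
    // Construct the element in the trailing raw storage.
    ::new (static_cast<void *> (address () + m_vecpfx.m_num++)) T (obj);
  }

  // Over-align the header so sizeof (vec_sketch) is a multiple of
  // alignof (T) and the trailing elements are suitably aligned.
  alignas (T) alignas (prefix_sketch) prefix_sketch m_vecpfx;
};

template<typename T, unsigned N>
class auto_vec_sketch
{
public:
  auto_vec_sketch ()
  {
    // The layout trick relies on m_data starting exactly where m_head ends.
    static_assert (offsetof (auto_vec_sketch, m_data) == sizeof (vec_sketch<T>),
                   "element storage must directly follow the header");
    m_head.m_vecpfx.m_alloc = N;
    m_head.m_vecpfx.m_num = 0;
  }

  vec_sketch<T> &vec () { return m_head; }

private:
  vec_sketch<T> m_head;
  // Raw bytes, so constructing the object never default-initializes any T.
  // (No destructors are run; the sketch assumes trivially destructible T.)
  alignas (T) unsigned char m_data[N * sizeof (T)];
};
```

With auto_vec_sketch<int, 4> v, calling v.vec ().quick_push (7) stores the element in m_data, and v.vec ().address ()[0] reads it back through the header.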



Re: C++ modules and AAPCS/ARM EABI clash on inline key methods

2023-02-24 Thread Richard Earnshaw via Gcc-patches




On 23/02/2023 21:20, Alexandre Oliva wrote:

On Feb 23, 2023, Alexandre Oliva  wrote:


On Feb 23, 2023, Richard Earnshaw  wrote:

On 22/02/2023 19:57, Alexandre Oliva wrote:

On Feb 21, 2023, Richard Earnshaw  wrote:


Rather than scanning for the triplet, a better test would be



{ xfail { arm_eabi } }


Indeed, thanks.  Here's the updated patch, retested.  Ok to install?



Based on Nathan's comments, we should just skip the test on arm_eabi,
it's simply not applicable.



Like this, I suppose.  Retested on x86_64-linux-gnu (trunk) and
arm-wrs-vxworks7 (gcc-12).  Ok to install?


Erhm, actually, that version still ran the assembler scans and failed.
This one skips the testset entirely.


Yeah, I tried something like that and it didn't appear to work. Perhaps 
it's a bug in the way dg-do-module is implemented.





[PR105224] C++ modules and AAPCS/ARM EABI clash on inline key methods

From: Alexandre Oliva 

g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
that use the AAPCS variant.  ARM is the only target that overrides
TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way
the clash between AAPCS and C++ Modules design should be resolved, but
currently it favors AAPCS and thus the test fails, so skip it on
arm_eabi.


for  gcc/testsuite/ChangeLog

PR c++/105224
* g++.dg/modules/virt-2_a.C: Skip on arm_eabi.
---
  gcc/testsuite/g++.dg/modules/virt-2_a.C |3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/g++.dg/modules/virt-2_a.C 
b/gcc/testsuite/g++.dg/modules/virt-2_a.C
index 580552be5a0d8..ede711c3e83be 100644
--- a/gcc/testsuite/g++.dg/modules/virt-2_a.C
+++ b/gcc/testsuite/g++.dg/modules/virt-2_a.C
@@ -1,3 +1,6 @@
+// AAPCS overrides TARGET_CXX_KEY_METHOD_MAY_BE_INLINE,
+// in a way that invalidates this test.
+// { dg-skip-if "TARGET_CXX_KEY_METHOD_MAY_BE_INLINE" { arm_eabi } }


Given the logic of this macro, the text should be 
"!TARGET_CXX_KEY_METHOD_MAY_BE_INLINE".


OK with that change.

R.


  // { dg-module-do run }
  // { dg-additional-options -fmodules-ts }
  export module foo;




Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote:
> Maybe this would work, vl_relative even could be vl_embed.
> Because vl_embed I believe is used in two spots, part of
> auto_vec where it is followed by m_data and on heap or GGC
> allocated memory where vec<..., vl_embed> is followed by
> further storage for the vector.

So roughly something like below?  Except I get weird crashes with it
in the gen* tools.  And we'd need to adjust the gdb python hooks
which also use m_vecdata.

--- gcc/vec.h.jj2023-01-02 09:32:32.177143804 +0100
+++ gcc/vec.h   2023-02-24 11:19:37.900157177 +0100
@@ -586,8 +586,8 @@ public:
   unsigned allocated (void) const { return m_vecpfx.m_alloc; }
   unsigned length (void) const { return m_vecpfx.m_num; }
   bool is_empty (void) const { return m_vecpfx.m_num == 0; }
-  T *address (void) { return m_vecdata; }
-  const T *address (void) const { return m_vecdata; }
+  T *address (void) { return (T *) (this + 1); }
+  const T *address (void) const { return (const T *) (this + 1); }
   T *begin () { return address (); }
   const T *begin () const { return address (); }
   T *end () { return address () + length (); }
@@ -629,10 +629,9 @@ public:
   friend struct va_gc_atomic;
   friend struct va_heap;
 
-  /* FIXME - These fields should be private, but we need to cater to
+  /* FIXME - This field should be private, but we need to cater to
 compilers that have stricter notions of PODness for types.  */
-  vec_prefix m_vecpfx;
-  T m_vecdata[1];
+  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
 };
 
 
@@ -879,7 +878,7 @@ inline const T &
 vec::operator[] (unsigned ix) const
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 template
@@ -887,7 +886,7 @@ inline T &
 vec::operator[] (unsigned ix)
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 
@@ -929,7 +928,7 @@ vec::iterate (unsigned i
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = m_vecdata[ix];
+  *ptr = address ()[ix];
   return true;
 }
   else
@@ -955,7 +954,7 @@ vec::iterate (unsigned i
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = CONST_CAST (T *, &m_vecdata[ix]);
+  *ptr = CONST_CAST (T *, &address ()[ix]);
   return true;
 }
   else
@@ -978,7 +977,7 @@ vec::copy (ALONE_MEM_STA
 {
   vec_alloc (new_vec, len PASS_MEM_STAT);
   new_vec->embedded_init (len, len);
-  vec_copy_construct (new_vec->address (), m_vecdata, len);
+  vec_copy_construct (new_vec->address (), address (), len);
 }
   return new_vec;
 }
@@ -1018,7 +1017,7 @@ inline T *
 vec::quick_push (const T &obj)
 {
   gcc_checking_assert (space (1));
-  T *slot = &m_vecdata[m_vecpfx.m_num++];
+  T *slot = &address ()[m_vecpfx.m_num++];
   *slot = obj;
   return slot;
 }
@@ -1031,7 +1030,7 @@ inline T &
 vec::pop (void)
 {
   gcc_checking_assert (length () > 0);
-  return m_vecdata[--m_vecpfx.m_num];
+  return address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1056,7 +1055,7 @@ vec::quick_insert (unsig
 {
   gcc_checking_assert (length () < allocated ());
   gcc_checking_assert (ix <= length ());
-  T *slot = &m_vecdata[ix];
+  T *slot = &address ()[ix];
   memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
   *slot = obj;
 }
@@ -1071,7 +1070,7 @@ inline void
 vec::ordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  T *slot = &m_vecdata[ix];
+  T *slot = &address ()[ix];
   memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
 }
 
@@ -1118,7 +1117,7 @@ inline void
 vec::unordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
+  address ()[ix] = address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1130,7 +1129,7 @@ inline void
 vec::block_remove (unsigned ix, unsigned len)
 {
   gcc_checking_assert (ix + len <= length ());
-  T *slot = &m_vecdata[ix];
+  T *slot = &address ()[ix];
   m_vecpfx.m_num -= len;
   memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
 }
@@ -1309,7 +1308,7 @@ vec::embedded_size (unsi
vec, vec_embedded>::type vec_stdlayout;
   static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
   static_assert (alignof (vec_stdlayout) == alignof (vec), "");
-  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
+  return sizeof (vec_stdlayout) + alloc * sizeof (T);
 }
 
 
@@ -1476,10 +1475,10 @@ public:
   { return m_vec ? m_vec->length () : 0; }
 
   T *address (void)
-  { return m_vec ? m_vec->m_vecdata : NULL; }
+  { return m_vec ? m_vec->address () : NULL; }
 
   const T *address (void) const
-  { return m_vec ? m_vec->m_vecdata : NULL; }
+  { return m_vec ? m_vec->address () : NULL; }
 
   T *begin () { return address (); }
   const T *begin () const { return address (); }
@@ -1584,7 +1583,7 @@ public:
 
 private:
   vec m_auto;
-  T m_data[MAX (N - 1, 1)];
+  unsigned char m_data[MAX (N - 1, 1)];
 };
 
 /* auto_vec is a sub 

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 10:24, Jakub Jelinek  wrote:
>
> On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote:
> > Maybe this would work, vl_relative even could be vl_embed.
> > Because vl_embed I believe is used in two spots, part of
> > auto_vec where it is followed by m_data and on heap or GGC
> > allocated memory where vec<..., vl_embed> is followed by
> > further storage for the vector.
>
> So roughly something like below?  Except I get weird crashes with it
> in the gen* tools.  And we'd need to adjust the gdb python hooks
> which also use m_vecdata.
>
> --- gcc/vec.h.jj2023-01-02 09:32:32.177143804 +0100
> +++ gcc/vec.h   2023-02-24 11:19:37.900157177 +0100
> @@ -586,8 +586,8 @@ public:
>unsigned allocated (void) const { return m_vecpfx.m_alloc; }
>unsigned length (void) const { return m_vecpfx.m_num; }
>bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> -  T *address (void) { return m_vecdata; }
> -  const T *address (void) const { return m_vecdata; }
> +  T *address (void) { return (T *) (this + 1); }
> +  const T *address (void) const { return (const T *) (this + 1); }
>T *begin () { return address (); }
>const T *begin () const { return address (); }
>T *end () { return address () + length (); }
> @@ -629,10 +629,9 @@ public:
>friend struct va_gc_atomic;
>friend struct va_heap;
>
> -  /* FIXME - These fields should be private, but we need to cater to
> +  /* FIXME - This field should be private, but we need to cater to
>  compilers that have stricter notions of PODness for types.  */
> -  vec_prefix m_vecpfx;
> -  T m_vecdata[1];
> +  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
>  };
>
>
> @@ -879,7 +878,7 @@ inline const T &
>  vec::operator[] (unsigned ix) const
>  {
>gcc_checking_assert (ix < m_vecpfx.m_num);
> -  return m_vecdata[ix];
> +  return address ()[ix];
>  }
>
>  template
> @@ -887,7 +886,7 @@ inline T &
>  vec::operator[] (unsigned ix)
>  {
>gcc_checking_assert (ix < m_vecpfx.m_num);
> -  return m_vecdata[ix];
> +  return address ()[ix];
>  }
>
>
> @@ -929,7 +928,7 @@ vec::iterate (unsigned i
>  {
>if (ix < m_vecpfx.m_num)
>  {
> -  *ptr = m_vecdata[ix];
> +  *ptr = address ()[ix];
>return true;
>  }
>else
> @@ -955,7 +954,7 @@ vec::iterate (unsigned i
>  {
>if (ix < m_vecpfx.m_num)
>  {
> -  *ptr = CONST_CAST (T *, &m_vecdata[ix]);
> +  *ptr = CONST_CAST (T *, &address ()[ix]);
>return true;
>  }
>else
> @@ -978,7 +977,7 @@ vec::copy (ALONE_MEM_STA
>  {
>vec_alloc (new_vec, len PASS_MEM_STAT);
>new_vec->embedded_init (len, len);
> -  vec_copy_construct (new_vec->address (), m_vecdata, len);
> +  vec_copy_construct (new_vec->address (), address (), len);
>  }
>return new_vec;
>  }
> @@ -1018,7 +1017,7 @@ inline T *
>  vec::quick_push (const T &obj)
>  {
>gcc_checking_assert (space (1));
> -  T *slot = &m_vecdata[m_vecpfx.m_num++];
> +  T *slot = &address ()[m_vecpfx.m_num++];
>*slot = obj;
>return slot;
>  }
> @@ -1031,7 +1030,7 @@ inline T &
>  vec::pop (void)
>  {
>gcc_checking_assert (length () > 0);
> -  return m_vecdata[--m_vecpfx.m_num];
> +  return address ()[--m_vecpfx.m_num];
>  }
>
>
> @@ -1056,7 +1055,7 @@ vec::quick_insert (unsig
>  {
>gcc_checking_assert (length () < allocated ());
>gcc_checking_assert (ix <= length ());
> -  T *slot = &m_vecdata[ix];
> +  T *slot = &address ()[ix];
>memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
>*slot = obj;
>  }
> @@ -1071,7 +1070,7 @@ inline void
>  vec::ordered_remove (unsigned ix)
>  {
>gcc_checking_assert (ix < length ());
> -  T *slot = &m_vecdata[ix];
> +  T *slot = &address ()[ix];
>memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
>  }
>
> @@ -1118,7 +1117,7 @@ inline void
>  vec::unordered_remove (unsigned ix)
>  {
>gcc_checking_assert (ix < length ());
> -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> +  address ()[ix] = address ()[--m_vecpfx.m_num];
>  }
>
>
> @@ -1130,7 +1129,7 @@ inline void
>  vec::block_remove (unsigned ix, unsigned len)
>  {
>gcc_checking_assert (ix + len <= length ());
> -  T *slot = &m_vecdata[ix];
> +  T *slot = &address ()[ix];
>m_vecpfx.m_num -= len;
>memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
>  }
> @@ -1309,7 +1308,7 @@ vec::embedded_size (unsi
> vec, vec_embedded>::type vec_stdlayout;
>static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
>static_assert (alignof (vec_stdlayout) == alignof (vec), "");
> -  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
> +  return sizeof (vec_stdlayout) + alloc * sizeof (T);
>  }
>
>
> @@ -1476,10 +1475,10 @@ public:
>{ return m_vec ? m_vec->length () : 0; }
>
>T *address (void)
> -  { return m_vec ? m_vec->m_vecdata : NULL; }
> +  { return m_vec ? m_vec->address () : NULL; }
>
>const T *address (void) const
> -  { return 

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 11:59:53AM +0100, Jakub Jelinek via Gcc-patches wrote:
> > This needs to be alignas(T) unsigned char m_data[sizeof(T) * N];
> 
>   unsigned char m_data[MAX (N, 2) * sizeof (T)];
> 
> if we want to preserve current behavior I think.
> 
> I've screwed up, when I was about to change this line, I've realized
> I want to look at the embedded_size stuff first and then forgot
> to update this.  With the above line it builds, though I bet one will need
> to compare the generated code etc. and test with GCC 4.8.
> 
> Though I guess we now have also an option to change auto_vec with N 0/1
> to have 1 embedded element instead of 2, so that would also mean changing
> all of MAX (N, 2) to MAX (N, 1) if we want.

With that it builds in a non-optimized build, but when I try to build it in
stage3, the useless middle-end warnings kick in:

In member function ‘T* vec::quick_push(const T&) [with T = 
parameter; A = va_heap]’,
inlined from ‘T* vec::quick_push(const T&) [with T = parameter]’ at 
../../gcc/vec.h:1957:28,
inlined from ‘void populate_pattern_routine(create_pattern_info*, 
merge_state_info*, state*, const vec&)’ at 
../../gcc/genrecog.cc:3008:29:
../../gcc/vec.h:1021:3: error: writing 1 byte into a region of size 0 
[-Werror=stringop-overflow=]
 1021 |   *slot = obj;
  |   ^
../../gcc/vec.h: In function ‘void 
populate_pattern_routine(create_pattern_info*, merge_state_info*, state*, const 
vec&)’:
../../gcc/vec.h:1585:29: note: at offset 12 into destination object 
‘auto_vec::m_auto’ of size 8
 1585 |   vec m_auto;
  | ^~
In member function ‘T* vec::quick_push(const T&) [with T = 
parameter; A = va_heap]’,
inlined from ‘T* vec::quick_push(const T&) [with T = parameter]’ at 
../../gcc/vec.h:1957:28,
inlined from ‘decision* init_pattern_use(create_pattern_info*, 
merge_state_info*, const vec&)’ at 
../../gcc/genrecog.cc:2886:26:
../../gcc/vec.h:1021:3: error: writing 1 byte into a region of size 0 
[-Werror=stringop-overflow=]
 1021 |   *slot = obj;
  |   ^
../../gcc/vec.h: In function ‘decision* init_pattern_use(create_pattern_info*, 
merge_state_info*, const vec&)’:
../../gcc/vec.h:1585:29: note: at offset 12 into destination object 
‘auto_vec::m_auto’ of size 8
 1585 |   vec m_auto;
  | ^~


Jakub



Re: [wwwdocs, patch] OpenMP update for gcc-13/changes.html + projects/gomp/

2023-02-24 Thread Benson Muite via Gcc-patches
On 2/24/23 10:32, Benson Muite via Gcc-patches wrote:
> On 2/24/23 04:02, Gerald Pfeifer wrote:
>> On Thu, 23 Feb 2023, Tobias Burnus wrote:
>>> PS: I also removed a stray , but admittedly only after the
>>> commit. I found it by manually running those through the w3 validator
>>> site. However, I did not see an automatic email, either it takes longer
>>> or does it no longer run? It did in the past!
>>
>> You are right, and this is a sore / sad point: validator.w3.org that we
>> used in the past now only supports interactive sessions. And they even
>> broke support for the Referer header, so I also had to remove the checking 
>> link I had embedded in all of our pages.
>>
>> These days I invoke the validator (via a version of the original script) 
>> when I see a commit. Which indeed leads to many orders of magnitude longer 
>> delays.
>>
>> Sadly I don't have a better alternative. :-(
>>
> Could one of the following be used or used to generate a better workflow:
> https://html-validate.org/usage/cli.html - written in Javascript, but
> has a command line interface
> https://github.com/validator/validator - packaged, but a little
> cumbersome, may need a wrapper
> https://github.com/w3c-validators/w3c_validators - Wrapper written in
> Ruby, with a nice interface to validate a local file
> 
html-tidy could work well; it is written in C. A typical session from the
Git sources, following [1]:

git pull
cd gcc
./configure
mkdir HTML
makeinfo --html --no-split -Idoc -Idoc/include -o HTML doc/gcc.texi
tidy -f HTML/errs.txt -imu HTML/gcc.html

Typical current reported errors are
line 23080 column 22 - Warning: nested emphasis 
line 40445 column 11 - Warning: nested emphasis 
line 3541 column 1 - Warning: <a> anchor "index-g_002b_002b" already defined
line 54489 column 1 - Warning: <table> lacks "summary" attribute


[1]
https://unix.stackexchange.com/questions/493013/how-to-build-the-gcc-html-documentation-from-source-into-a-single-page


[Bug d/106977] [13 regression] d21 dies with SIGBUS on 32-bit Darwin

2023-02-24 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106977

--- Comment #26 from ibuclaw at gcc dot gnu.org ---
Comparing the D and C++ trees side by side.

At the point of `finish_function` for the ::vis() method, I see the following:

D:   type 

[Bug d/106977] [13 regression] d21 dies with SIGBUS on 32-bit Darwin

2023-02-24 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106977

--- Comment #25 from Iain Sandoe  ---
(In reply to ibuclaw from comment #24)
> (In reply to Iain Sandoe from comment #23)
> > So the ABIs differ in this (as noted on IRC, the Darwin 32b ABIs are not the
> > same as Linux).
> I'm still yet to work out why D on 32-bit Darwin behaves the same as 32-bit
> Linux though.  I would have assumed the decision to generate an sret would
> occur long after the front-end has freed itself from the compilation process.

There should be some target hook to query whether a struct is returned in regs.

> Regardless, the ABI issue can be raised in a separate PR. Because of it
> though, that means for this bootstrap PR we just have to avoid calling any
> extern(C++) method implemented in D that returns a struct by value.

Note that this would affect any interface to libc or external C++ that returns
a small struct.  AFAIR, X86 Darwin is not the only platform that returns small
structs in regs.
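
To make the affected case concrete, here is a minimal sketch of the kind of function under discussion: a small trivially copyable struct returned by value across a C-linkage boundary. The names pair_sketch and make_pair_sketch are hypothetical. The source code is identical on every target; whether the result comes back in registers or through a hidden pointer argument (sret) is decided by the platform ABI, which is why caller and callee must agree on it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical small aggregate: two ints, trivially copyable.
struct pair_sketch { int lo; int hi; };

// Small enough that some ABIs return it in registers, while others
// return it in memory through a hidden pointer supplied by the caller.
static_assert (std::is_trivially_copyable<pair_sketch>::value,
               "only trivially copyable structs are candidates for registers");
static_assert (sizeof (pair_sketch) == 2 * sizeof (int),
               "no padding expected between the two members");

// C linkage, as for an interface to libc or another external C function.
extern "C" pair_sketch
make_pair_sketch (int lo, int hi)
{
  return pair_sketch { lo, hi };
}
```

If a front end assumes one convention while the callee was compiled with the other, the call misbehaves even though both sides see the same declaration.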

Re: [PATCH] asan: adjust module name for global variables

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 10:00:01AM +0100, Martin Liška wrote:
> As mentioned in the PR, when we use LTO, we wrongly use ltrans output
> file name as a module name of a global variable. That leads to a
> non-reproducible output.
> 
> After the suggested change, we emit context name of normal global
> variables. And for artificial variables (like .Lubsan_data3), we use
> aux_base_name (e.g. "./a.ltrans0.ltrans").
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
>   PR asan/108834
> 
> gcc/ChangeLog:
> 
>   * asan.cc (asan_add_global): Use proper TU name for normal
> global variables (and aux_base_name for the artificial one).
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/asan/global-overflow-1.c: Test line and column
>   info for a global variable.
> ---
>  gcc/asan.cc | 7 ++-
>  gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 2 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/asan.cc b/gcc/asan.cc
> index f56d084bc7a..245abb14388 100644
> --- a/gcc/asan.cc
> +++ b/gcc/asan.cc
> @@ -3287,7 +3287,12 @@ asan_add_global (tree decl, tree type, 
> vec *v)
>  pp_string (&asan_pp, "<unknown>");
>str_cst = asan_pp_string (&asan_pp);
>
> -  pp_string (&module_name_pp, main_input_filename);
> +  const_tree tu = get_ultimate_context ((const_tree)decl);
> +  if (tu != NULL_TREE)
> +pp_string (&module_name_pp, IDENTIFIER_POINTER (DECL_NAME (tu)));
> +  else
> +pp_string (&module_name_pp, aux_base_name);

I think for !in_lto_p we don't need to bother with get_ultimate_context
and should just use main_input_filename as before.

Otherwise LGTM.
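
The selection logic suggested above can be sketched in isolation as follows. This is a standalone model, not the asan.cc change itself: module_name_sketch and all of its parameters are hypothetical stand-ins for in_lto_p, the name of the decl's ultimate context, main_input_filename and aux_base_name.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the module-name choice:
//   in_lto     - stands in for in_lto_p
//   tu_name    - name of the decl's ultimate context, or nullptr if none
//   main_input - stands in for main_input_filename
//   aux_base   - stands in for aux_base_name
std::string
module_name_sketch (bool in_lto, const char *tu_name,
                    const char *main_input, const char *aux_base)
{
  if (!in_lto)
    return main_input;  // outside LTO, keep the previous behaviour
  if (tu_name != nullptr)
    return tu_name;     // normal globals: the originating translation unit
  return aux_base;      // artificial variables have no context to look up
}
```

Only the LTO path consults the translation unit, so non-LTO output stays as it was before the patch.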

Jakub



Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 10:01 AM Kito Cheng  wrote:
>
> Got one fail:
>
> FAIL: gcc.target/riscv/xtheadmempair-1.c   -O2   scan-assembler-times
> th.luwd\t 4
>
> It should scan lwud rather than luwd?

Yes, this should be th.lwud.
Must have been introduced after testing.

I also ran the whole patchset again with RV32 and RV64.
This should be the only issue of this kind in the series.
Sorry for that!


Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 09:34:46AM +, Richard Biener wrote:
> > Looking at vec::operator[] which just does
> > 
> > template
> > inline const T &
> > vec::operator[] (unsigned ix) const
> > {
> >   gcc_checking_assert (ix < m_vecpfx.m_num);
> >   return m_vecdata[ix];
> > } 
> > 
> > the whole thing looks fragile at best - we basically have
> > 
> > struct auto_vec
> > {
> >   struct vec
> >   {
> > ...
> > T m_vecdata[1];
> >   } m_auto;
> >   T m_data[N-1];
> > };

Assuming a compiler treats T m_vecdata[1]; like a flexible array member
(which we need, because standard C++ has neither flexible array members
nor [0] arrays), I wonder whether, instead of the trick of m_auto followed
by m_data, we could make auto_vec have
alignas(vec) unsigned char m_data[sizeof (vec) + (N - 1) * sizeof (T)];
and do a placement new of vec into that m_data during auto_vec
construction.  Isn't that similar to how flexible array members are
normally used in C, where one uses malloc or alloca to allocate storage
for them, the storage can be larger than the structure itself, and the
flexible array member can then use the storage after it?

Though, of course, we'd need to test it with various compilers,
GCC 4.8 till now, various versions of clang, ICC, ...
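
A minimal standalone sketch of that idea, under stated assumptions: header_sketch and buffer_vec_sketch are hypothetical names, the header holds only the two counters (so the buffer is sized as header plus N elements rather than sizeof (vec) + (N - 1) * sizeof (T)), and element addresses are derived from the buffer itself rather than from the header.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical header type standing in for the embedded vec.
struct header_sketch { unsigned m_alloc; unsigned m_num; };

template<typename T, unsigned N>
class buffer_vec_sketch
{
  // Header size rounded up so the elements that follow are aligned for T.
  static constexpr std::size_t header_size
    = (sizeof (header_sketch) + alignof (T) - 1) / alignof (T) * alignof (T);

public:
  buffer_vec_sketch ()
  {
    // Placement new: construct only the header inside the raw buffer.
    m_hdr = ::new (static_cast<void *> (m_data)) header_sketch { N, 0 };
  }

  // m_hdr points into this object's own buffer, so copying is disabled.
  buffer_vec_sketch (const buffer_vec_sketch &) = delete;
  buffer_vec_sketch &operator= (const buffer_vec_sketch &) = delete;

  T *address () { return reinterpret_cast<T *> (m_data + header_size); }
  unsigned length () const { return m_hdr->m_num; }

  void quick_push (const T &obj)
  {
    assert (m_hdr->m_num < m_hdr->m_alloc);
    // Elements are constructed one at a time, only when pushed.
    ::new (static_cast<void *> (address () + m_hdr->m_num++)) T (obj);
  }

private:
  // One buffer for header plus elements, as with malloc'ed storage in C.
  alignas (header_sketch) alignas (T)
    unsigned char m_data[header_size + N * sizeof (T)];
  header_sketch *m_hdr;
};
```

Because m_data is an array of unsigned char, constructing the object touches none of the element storage, which is the point of the original patch. The sketch assumes a trivially destructible T and runs no destructors.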

Jakub



[Bug middle-end/108854] [10/11/12/13 Regression] tbb-2021.8.0 fails on i686-linux (32-bit), internal compiler error: in expand_expr_real_1, at expr.c:10281 since r10-4511-g6cf67b62c8cda035dccac

2023-02-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108854

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:2f1691be517fcdcabae9cd671ab511eb0e08b1d5

commit r13-6319-g2f1691be517fcdcabae9cd671ab511eb0e08b1d5
Author: Jakub Jelinek 
Date:   Fri Feb 24 11:05:27 2023 +0100

cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial
thunk [PR108854]

The following testcase ICEs on x86_64-linux with -m32.  The problem is
we create an artificial thunk and because of -fPIC, ia32 and thunk
destination which doesn't bind locally can't use a mi thunk.
The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
only if some arguments need modification.

The following patch fixes it by copying the DECL_ARGUMENTS list even if
the arguments can stay as is, to update DECL_CONTEXT on them.  While for
mi thunks it doesn't really matter because we don't use those arguments
in any way, for other thunks it is important.

2023-02-23  Jakub Jelinek  

PR middle-end/108854
* cgraphclones.cc (duplicate_thunk_for_node): If no parameter
changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
nodes and adjust their DECL_CONTEXT.

* g++.dg/opt/pr108854.C: New test.
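
As a toy model of the fix described in the commit message above (not GCC code): fn_sketch, parm_sketch and copy_args_sketch are hypothetical names, with parm_sketch::context standing in for DECL_CONTEXT and fn_sketch::args for the DECL_ARGUMENTS chain. The point is that the duplicate always gets its own parameter nodes, re-pointed at itself, even when nothing else about them changes.

```cpp
#include <cassert>
#include <vector>

struct fn_sketch;

// Hypothetical parameter node: remembers which function owns it.
struct parm_sketch { const fn_sketch *context; int id; };

// Hypothetical function node with its argument list.
struct fn_sketch { std::vector<parm_sketch> args; };

// Give TO its own copies of FROM's parameters and update their owner,
// instead of letting both functions share the same nodes.
void
copy_args_sketch (const fn_sketch &from, fn_sketch &to)
{
  to.args = from.args;
  for (parm_sketch &p : to.args)
    p.context = &to;
}
```

Sharing the list instead would leave every parameter of the duplicate claiming the original function as its owner, which is the inconsistency the ICE exposed.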

[Bug middle-end/108854] [10/11/12 Regression] tbb-2021.8.0 fails on i686-linux (32-bit), internal compiler error: in expand_expr_real_1, at expr.c:10281 since r10-4511-g6cf67b62c8cda035dccac

2023-02-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108854

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[10/11/12/13 Regression]|[10/11/12 Regression]
   |tbb-2021.8.0 fails on   |tbb-2021.8.0 fails on
   |i686-linux (32-bit),|i686-linux (32-bit),
   |internal compiler error: in |internal compiler error: in
   |expand_expr_real_1, at  |expand_expr_real_1, at
   |expr.c:10281 since  |expr.c:10281 since
   |r10-4511-g6cf67b62c8cda035d |r10-4511-g6cf67b62c8cda035d
   |ccac|ccac

--- Comment #14 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug fortran/108921] New: ICE: using the result of an impure function in automatic character allocation

2023-02-24 Thread vterzi1996 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108921

Bug ID: 108921
   Summary: ICE: using the result of an impure function in
automatic character allocation
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vterzi1996 at gmail dot com
  Target Milestone: ---

Created attachment 54527
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54527&action=edit
Complete backtrace

Consider the following minimal working example:
```
module lib
type t
contains
procedure::f,g
end type
contains
integer function f(this)
class(t),intent(in)::this
f=10
end
function g(this)result(r)
class(t),intent(in)::this
character(len=this%f())::r  ! problem appears here
r='42'
end
end

program prog
use lib
type(t)::o
print*,o%g()
end
```

This example was already discussed here:
https://stackoverflow.com/questions/75544072/using-function-result-as-character-length-in-fortran

The compilation of this code with the GNU compiler (gfortran 12.2.0 on Rocky
Linux 8.7) fails with the message `f951: internal compiler error: Segmentation
fault` (full output is in the attachment). It also fails with 7.5.0, 9.4.0, and
11.1.0 on Ubuntu 18.04.2 LTS. The cause of the problem is the disallowed use of
the result of an impure function (here: `f`) in the automatic allocation of the
CHARACTER variable; instead of diagnosing the mistake in the code, the compiler
crashes.

[wwwdocs] gcc-13: riscv: Document the T-Head CPU support

2023-02-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch documents the new T-Head CPU support for RISC-V.

Signed-off-by: Christoph Müllner 
---
 htdocs/gcc-13/changes.html | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index a803f501..ce5ba35c 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -490,7 +490,29 @@ a work-in-progress.
 
 RISC-V
 
-New ISA extension support for zawrs.
+  New ISA extension support for Zawrs.
+  Support for the following vendor extensions has been added:
+
+  XTheadBa
+  XTheadBb
+  XTheadBs
+  XTheadCmo
+  XTheadCondMov
+  XTheadFMemIdx
+  XTheadFmv
+  XTheadInt
+  XTheadMac
+  XTheadMemIdx
+  XTheadMemPair
+  XTheadSync
+
+  
+  The following new CPUs are supported through the -mcpu
+  option (GCC identifiers in parentheses).
+
+  T-Head's XuanTie C906 (thead-c906).
+
+  
 
 
 
-- 
2.39.2



[Patch] Fortran: Skip bound conv in gfc_conv_gfc_desc_to_cfi_desc with intent(out) ptr [PR108621]

2023-02-24 Thread Tobias Burnus

[The following is about Fortran pointers as actual argument to a CFI taking 
procedure.]

The issue has been marked as 12/13 regression but the issue is just a 
diagnostic one.

To disentangle:

(A) Bogus warning
[Now tracked as middle-end https://gcc.gnu.org/PR108906 ]
Assume:
   nullify(p)
   call bind_c_proc(p)

For some reason, the compiler does not propagate the NULL and thus assumes that
if (addr != NULL) could be true. This leads to a may-be-uninit warning with
'-Wall -O0'.
(And no warning with -Og/-Os/-O1.)

We could silence it on the gfortran side, I think, by tweaking some tree
properties, but I am not sure we want to do it.

(The same kind of code exists in GCC 11, but there it is hidden in libgfortran;
hence, there are no warnings when compiling user code. Since GCC 12, the compiler
does the conversion in place when generating the code.)



(B) The attached patch:

With 'intent(out)' there is no reason to do the conversions. While for nullified
pointers the bounds-conversion loop is skipped, it may still be executed for
undefined pointers. (Which is usually harmless.) In either case, not generating
this code makes sense.

OK for mainline?

Regarding GCC 12:  I am not really sure as it is no real regression. Besides 
bogus
warnings, there might be an issue for undefined pointers and 
-fsanitize=undefined, namely
if 'ubound - lbound' evaluated on random numbers overflows (such as for ubound 
= huge(..)
and lbound = -huge(..)). But that looks like a rather special case. - Thoughts?
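
To make the overflow concern concrete, here is a minimal C++ sketch of the arithmetic only. It is not the code gfortran generates; the helper name `extent_overflows` and the choice of `long` as the bound type are assumptions for illustration.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper, not GCC code: returns true when 'ubound - lbound'
// is not representable in long, i.e. the signed subtraction would overflow
// (which is what -fsanitize=undefined would report).
inline bool extent_overflows(long lbound, long ubound)
{
  const long max = std::numeric_limits<long>::max();
  const long min = std::numeric_limits<long>::min();
  if (lbound < 0)
    return ubound > max + lbound;  // ubound - lbound would exceed max
  return ubound < min + lbound;    // ubound - lbound would fall below min
}
```

With bounds read from an undefined pointer, any pair of values is possible, including the huge/-huge pair mentioned above, for which this check returns true.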

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran: Skip bound conv in gfc_conv_gfc_desc_to_cfi_desc with intent(out) ptr [PR108621]

When the dummy argument of the bind(C) proc is 'pointer, intent(out)', the conversion
of the GFC to the CFI bounds can be skipped: it is not needed and avoids issues with
noninit memory.


Note that the 'cfi->base_addr = gfc->addr' assignment is kept as the C code of a user
might assume that a nullified pointer arrives as NULL (or even a specific value).
For instance, gfortran.dg/c-interop/section-{1,2}.f90 assumes the value NULL.

Note 2: The PR is about a may-be-uninitialized warning with intent(out). In the PR's
testcase, the pointer was nullified and should not have produced that warning.
That is a diagnostic issue, now tracked as PR middle-end/108906 as the issue in principle
still exists (e.g. with 'intent(inout)'). [But no longer for intent(out).]

Note 3: With undefined pointers and no 'intent', accessing uninit memory is unavoidable
on the caller side as the compiler cannot know what the C function does (but this usage
determines whether the pointer is permitted to be undefined or whether the bounds must be
gfc-to-cfi converted).


gcc/fortran/ChangeLog:

	PR fortran/108621
	* trans-expr.cc (gfc_conv_gfc_desc_to_cfi_desc):

gcc/testsuite/ChangeLog:

	PR fortran/108621
	* gfortran.dg/c-interop/fc-descriptor-pr108621.f90: New test.

 gcc/fortran/trans-expr.cc  |  6 ++
 .../c-interop/fc-descriptor-pr108621.f90   | 65 ++
 2 files changed, 71 insertions(+)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index e85b53fae85..045c8b00b90 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -5673,6 +5673,9 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym)
   gfc_add_modify (, tmp,
 		  build_int_cst (TREE_TYPE (tmp), attr));
 
+  /* The cfi-base_addr assignment could be skipped for 'pointer, intent(out)'.
+ That is very sensible for undefined pointers, but the C code might assume
+ that the pointer retains the value, in particular, if it was NULL.  */
   if (e->rank == 0)
 {
   tmp = gfc_get_cfi_desc_base_addr (cfi);
@@ -5695,6 +5698,9 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym)
   gfc_add_modify (, tmp2, fold_convert (TREE_TYPE (tmp2), tmp));
 }
 
+  if (fsym->attr.pointer && fsym->attr.intent == INTENT_OUT)
+goto done;
+
   /* When allocatable + intent out, free the cfi descriptor.  */
   if (fsym->attr.allocatable && fsym->attr.intent == INTENT_OUT)
 {
diff --git a/gcc/testsuite/gfortran.dg/c-interop/fc-descriptor-pr108621.f90 b/gcc/testsuite/gfortran.dg/c-interop/fc-descriptor-pr108621.f90
new file mode 100644
index 000..9c9062bd62d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/c-interop/fc-descriptor-pr108621.f90
@@ -0,0 +1,65 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/108621
+!
+! If the bind(C) procedure's dummy argument is a POINTER with INTENT(OUT),
+! avoid converting the array bounds for the CFI descriptor before the call.
+!
+! Rational: Fewer code and, esp. for undefined pointers, there might be a
+! 

[PATCH 1/2] Change vec<, , vl_embed>::m_vecdata refrences into address ()

2023-02-24 Thread Richard Biener via Gcc-patches
As preparation to remove m_vecdata in the vl_embed vector this
changes references to it into calls to address ().

As I was here it also fixes ::contains to avoid repeated bounds
checking and the same issue in ::lower_bound which also suffers
from unnecessary copying around values.
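
The shape of the lower_bound change can be sketched outside of vec.h as follows. This is a simplified stand-in, assuming a plain pointer/length pair instead of the vec class; `lower_bound_sketch` and `int_less` are hypothetical names. The key is taken by const reference and the middle element is bound to a const reference, so no element is copied and no per-access bounds check is repeated.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a lower_bound over contiguous storage.
// Returns the first index whose element is not less than 'key'.
template <typename T>
std::size_t lower_bound_sketch(const T *data, std::size_t len, const T &key,
                               bool (*lessthan)(const T &, const T &))
{
  std::size_t first = 0;
  while (len > 0)
    {
      std::size_t half = len / 2;
      // Bind a reference instead of copying the element.
      const T &middle_elem = data[first + half];
      if (lessthan(middle_elem, key))
        {
          first += half + 1;
          len -= half + 1;
        }
      else
        len = half;
    }
  return first;
}

// Example comparison function for int elements.
inline bool int_less(const int &a, const int &b) { return a < b; }
```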

* vec.h: Change m_vecdata references to address ().
* vec.h (vec::lower_bound): Adjust to
take a const reference to the object, use address to
access data.
(vec::contains): Use address to access data.
(vec::operator[]): Use address instead of
m_vecdata to access data.
(vec::iterate): Likewise.
(vec::copy): Likewise.
(vec::quick_push): Likewise.
(vec::pop): Likewise.
(vec::quick_insert): Likewise.
(vec::ordered_remove): Likewise.
(vec::unordered_remove): Likewise.
(vec::block_remove): Likewise.
(vec::address): Likewise.
---
 gcc/vec.h | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index a536b68732d..5a2ee9c0294 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -614,7 +614,7 @@ public:
   T *bsearch (const void *key, int (*compar)(const void *, const void *));
   T *bsearch (const void *key,
  int (*compar)(const void *, const void *, void *), void *);
-  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
+  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;
   bool contains (const T ) const;
   static size_t embedded_size (unsigned);
   void embedded_init (unsigned, unsigned = 0, unsigned = 0);
@@ -879,7 +879,7 @@ inline const T &
 vec::operator[] (unsigned ix) const
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 template
@@ -887,7 +887,7 @@ inline T &
 vec::operator[] (unsigned ix)
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 
@@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = m_vecdata[ix];
+  *ptr = address()[ix];
   return true;
 }
   else
@@ -955,7 +955,7 @@ vec::iterate (unsigned ix, T **ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = CONST_CAST (T *, _vecdata[ix]);
+  *ptr = CONST_CAST (T *,  ()[ix]);
   return true;
 }
   else
@@ -978,7 +978,7 @@ vec::copy (ALONE_MEM_STAT_DECL) const
 {
   vec_alloc (new_vec, len PASS_MEM_STAT);
   new_vec->embedded_init (len, len);
-  vec_copy_construct (new_vec->address (), m_vecdata, len);
+  vec_copy_construct (new_vec->address (), address (), len);
 }
   return new_vec;
 }
@@ -1018,7 +1018,7 @@ inline T *
 vec::quick_push (const T )
 {
   gcc_checking_assert (space (1));
-  T *slot = _vecdata[m_vecpfx.m_num++];
+  T *slot =  ()[m_vecpfx.m_num++];
   *slot = obj;
   return slot;
 }
@@ -1031,7 +1031,7 @@ inline T &
 vec::pop (void)
 {
   gcc_checking_assert (length () > 0);
-  return m_vecdata[--m_vecpfx.m_num];
+  return address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1056,7 +1056,7 @@ vec::quick_insert (unsigned ix, const T 
)
 {
   gcc_checking_assert (length () < allocated ());
   gcc_checking_assert (ix <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
   *slot = obj;
 }
@@ -1071,7 +1071,7 @@ inline void
 vec::ordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
 }
 
@@ -1118,7 +1118,7 @@ inline void
 vec::unordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
+  address ()[ix] = address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1130,7 +1130,7 @@ inline void
 vec::block_remove (unsigned ix, unsigned len)
 {
   gcc_checking_assert (ix + len <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   m_vecpfx.m_num -= len;
   memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
 }
@@ -1249,8 +1249,11 @@ vec::contains (const T ) const
 {
   unsigned int len = length ();
   for (unsigned int i = 0; i < len; i++)
-if ((*this)[i] == search)
-  return true;
+{
+  const T *slot =  ()[i];
+  if (*slot == search)
+   return true;
+}
 
   return false;
 }
@@ -1262,7 +1265,8 @@ vec::contains (const T ) const
 
 template
 unsigned
-vec::lower_bound (T obj, bool (*lessthan)(const T &, const T 
&))
+vec::lower_bound (const T ,
+ bool (*lessthan)(const T &, const T &))
   const
 {
   unsigned int len = length ();
@@ -1273,7 +1277,7 @@ vec::lower_bound (T obj, bool 
(*lessthan)(const T &, const T &))
   half = len / 2;
   middle = first;
   middle += half;
-  T middle_elem = (*this)[middle];
+  const T _elem = address ()[middle];
   if 

Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -586,8 +586,8 @@ public:
>unsigned allocated (void) const { return m_vecpfx.m_alloc; }
>unsigned length (void) const { return m_vecpfx.m_num; }
>bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> -  T *address (void) { return m_vecdata; }
> -  const T *address (void) const { return m_vecdata; }
> +  T *address (void) { return reinterpret_cast  (this + 1); }
> +  const T *address (void) const { return reinterpret_cast  (this 
> + 1); }

This is now too long.

>T *begin () { return address (); }
>const T *begin () const { return address (); }
>T *end () { return address () + length (); }
> @@ -631,8 +631,7 @@ public:
>  
>/* FIXME - These fields should be private, but we need to cater to
>compilers that have stricter notions of PODness for types.  */
> -  vec_prefix m_vecpfx;
> -  T m_vecdata[1];
> +  alignas (T) vec_prefix m_vecpfx;

The comment needs adjustment and don't we need
alignas (T) alignas (vec_prefix) ?

> @@ -1588,7 +1587,7 @@ public:
>  
>  private:
>vec m_auto;
> -  T m_data[MAX (N - 1, 1)];
> +  alignas(T) unsigned char m_data[sizeof (T) * N];
>  };

I still believe you don't need alignas(T) here (and space before (T) ).
Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
standard C.
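
A minimal sketch of the sizing point, assuming a hypothetical helper `storage_elems` in place of the MAX macro and a stand-in struct `raw_storage` rather than the real auto_vec: clamping the element count keeps the byte array from ever having zero length, including for N == 0.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the MAX (N, 2) idea from the review.
constexpr std::size_t storage_elems(std::size_t n) { return n < 2 ? 2 : n; }

// Stand-in for the raw storage member; not the real auto_vec.
template <typename T, std::size_t N>
struct raw_storage
{
  // Never zero-sized, because storage_elems (N) is at least 2.
  unsigned char bytes[sizeof(T) * storage_elems(N)];
};

static_assert(sizeof(raw_storage<int, 0>) == 2 * sizeof(int),
              "N == 0 still reserves room for two elements");
```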

Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
bootstrap that I got with my version or not.

Jakub



[Bug target/108922] New: fmod() 13x slowdown in gcc 4.8->4.9 dropping "fprem" and calling fmod()

2023-02-24 Thread jkratochvil at azul dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922

Bug ID: 108922
   Summary: fmod() 13x slowdown in gcc 4.8->4.9 dropping "fprem"
and calling fmod()
   Product: gcc
   Version: 12.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jkratochvil at azul dot com
  Target Milestone: ---

Created attachment 54528
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54528&action=edit
bench.cpp

This performance regression is since:

[PATCH, i386]: Enable reminder{sd,df,xf} and fmod{sf,df,xf} only for
flag_finite_math_only.
https://gcc.gnu.org/pipermail/gcc-patches/2014-September/400104.html

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=93ba85fdd253b4b9cf2b9e54e8e5969b1a3db098

Reproducible with attached "bench.cpp":
g++ (GCC) 4.8.3 20140517 (prerelease)
real    0m0.329s
g++ (GCC) 4.9.3 20150207 (prerelease)
real    0m4.396s

The committer claims "do not return NaN for infinities, but generate
invalid-arithmetic-operand exception.". But my attached testcase tests that all
the corner cases do have both the same result value and the same exceptions
generated.
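
The kind of corner-case check described above can be sketched as follows. This is not the attached bench.cpp; `checked_fmod` and `fmod_result` are hypothetical names, and the volatile variables are only an assumption to keep the compiler from folding the call or moving it across the floating-point flag queries.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Result value plus whether FE_INVALID was raised by the call.
struct fmod_result
{
  double value;
  bool invalid_raised;
};

// Hypothetical helper: calls std::fmod and records the invalid-operation flag.
inline fmod_result checked_fmod(double x, double y)
{
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double vx = x, vy = y;         // force a run-time evaluation
  volatile double r = std::fmod(vx, vy);  // store pins the call before the query
  bool invalid = std::fetestexcept(FE_INVALID) != 0;
  return {r, invalid};
}
```

Running the same inputs through both code paths (inline "fprem" and the library call) and comparing both fields is the comparison the report refers to.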

The committer also claims "fixes ieee_2.f90 testsuite failure" but I have no
idea where to find this testsuite.

g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
/home/azul/t/zuc1182/fmod.C:7
  4005f8:   dd 44 24 30             fldl   0x30(%rsp)
  4005fc:   dd 44 24 38             fldl   0x38(%rsp)
  400600:   d9 c1                   fld    %st(1)
  400602:   d9 c1                   fld    %st(1)
  400604:   d9 f8                   fprem
  400606:   df e0                   fnstsw %ax
  400608:   f6 c4 04                test   $0x4,%ah
  40060b:   75 f7                   jne    400604
  40060d:   dd d9                   fstp   %st(1)
  40060f:   dd 5c 24 18             fstpl  0x18(%rsp)
  400613:   f2 0f 10 44 24 18       movsd  0x18(%rsp),%xmm0
  400619:   66 0f 2e c0             ucomisd %xmm0,%xmm0
^^^
Here it tests the result is finite;
if it is not it will fallback to calling fmod().
But I do not find even that needed, one could just use the "fprem" result.
  40061d:   7a 06                   jp     400625
  40061f:   74 2f                   je     400650
  400621:   d9 c9                   fxch   %st(1)
  400623:   eb 0b                   jmp    400630
  400625:   d9 c9                   fxch   %st(1)
  400627:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  40062e:   00 00
  400630:   dd 5c 24 08             fstpl  0x8(%rsp)
  400634:   f2 0f 10 4c 24 08       movsd  0x8(%rsp),%xmm1
  40063a:   dd 5c 24 08             fstpl  0x8(%rsp)
  40063e:   f2 0f 10 44 24 08       movsd  0x8(%rsp),%xmm0
  400644:   e8 6f fe ff ff          callq  4004b8
  400649:   eb 09                   jmp    400654
  40064b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  400650:   dd d8                   fstp   %st(0)
  400652:   dd d8                   fstp   %st(0)
  400654:   83 c3 01                add    $0x1,%ebx
  400657:   f2 0f 11 44 24 28       movsd  %xmm0,0x28(%rsp)
/home/azul/t/zuc1182/fmod.C:6
  40065d:   81 fb 00 e1 f5 05       cmp    $0x5f5e100,%ebx
  400663:   75 93                   jne    4005f8

Similar issue may be with drem() (=remainder()) vs. "fprem1" instruction.

I expect the same issue also affects fmodf(), dremf() and remainderf().

Another topic is why the glibc fmod() implementation just does not use "fprem"
on i686/x86_64 arch.

[Bug target/108919] New: pure nested function may clobber its static chain pointer in windowed ABI on xtensa

2023-02-24 Thread jcmvbkbc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108919

Bug ID: 108919
   Summary: pure nested function may clobber its static chain
pointer in windowed ABI on xtensa
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jcmvbkbc at gcc dot gnu.org
  Target Milestone: ---

Created attachment 54525
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54525&action=edit
test.c

The following gfortran tests fail on windowed ABI xtensa target, but pass on
call0 ABI:

gfortran.dg/character_workout_1.f90   -O[123s]  execution test
gfortran.dg/character_workout_4.f90   -O[123s]  execution test

E.g. gfortran.dg/character_workout_1.f90 -O1 ends with the message 'STOP 209'.

This happens because in these tests pure nested functions are generated for
subroutines achar_cm and achar_am and the fwprop1 pass removes static chain
pointer initialization between the calls to these functions:

.L143:
        addi    a3, sp, 108
        movi    a4, 0xd4
        add.n   a4, sp, a4
        addi    a8, sp, -20
        s32i.n  a4, a8, 0
        movi.n  a11, 7
        mov.n   a10, a3
        call8   achar_cm.21
        movi.n  a11, 7
        mov.n   a10, a3
        call8   achar_am.15

Windowed ABI for xtensa specifies that the static chain is passed on the stack,
at offset -20 from the called function's CFA, i.e. outside the caller's stack
frame. This location may be overwritten if the called function makes function
calls that exceed certain depth.

The issue may be reproduced with the attached small test case.

[Bug c/63357] Warn for P && P and P || P (same expression used multiple times in a condition)

2023-02-24 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63357

David Binderman  changed:

   What|Removed |Added

 CC||dcb314 at hotmail dot com

--- Comment #8 from David Binderman  ---
This could probably be extended to other operators.

Static analyser cppcheck can be made to say things like:

linux-6.2/drivers/spi/spi-sn-f-ospi.c:614:31: style: Same expression
'SPI_TX_OCTAL' found multiple times in chain of '|' operators.
[duplicateExpression]

Source code is

ctlr->mode_bits = SPI_TX_DUAL | SPI_TX_QUAD | SPI_TX_OCTAL
| SPI_RX_DUAL | SPI_RX_QUAD | SPI_TX_OCTAL
| SPI_MODE_0 | SPI_MODE_1 | SPI_LSB_FIRST;

If |, then probably also &.
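
As a rough illustration of what such a check looks for, here is a small C++ sketch that works on values rather than on expressions (a real warning would compare the operand expressions in the front end). The flag constants and the function name `has_duplicate_operand` are hypothetical and are not the kernel's definitions.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical single-bit flags for illustration only.
constexpr unsigned TX_DUAL = 1u << 0;
constexpr unsigned TX_QUAD = 1u << 1;
constexpr unsigned TX_OCTAL = 1u << 2;
constexpr unsigned RX_DUAL = 1u << 3;
constexpr unsigned RX_QUAD = 1u << 4;
constexpr unsigned RX_OCTAL = 1u << 5;

// Returns true if some operand of an '|' chain contributes no new bits,
// which for single-bit flags means the same flag appears twice.
inline bool has_duplicate_operand(std::initializer_list<unsigned> operands)
{
  unsigned seen = 0;
  for (unsigned op : operands)
    {
      if (op != 0 && (seen & op) == op)
        return true;
      seen |= op;
    }
  return false;
}
```

The same loop with '&' in place of '|' semantics would need a different redundancy test, but the duplicate-operand case is identical for both operators.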

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Richard Biener via Gcc-patches
On Thu, 23 Feb 2023, Jakub Jelinek wrote:

> On Thu, Feb 23, 2023 at 03:02:01PM +, Richard Biener wrote:
> > > > * vec.h (auto_vec): Turn m_data storage into
> > > > uninitialized unsigned char.
> > > 
> > > Given that we actually never reference the m_data array anywhere,
> > > it is just to reserve space, I think even the alignas(T) there is
> > > useless.  The point is that m_auto has as data members:
> > >   vec_prefix m_vecpfx;
> > >   T m_vecdata[1];
> > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or
> > > > -fsanitize=bounds-strict incompatible) being treated as
> > > > flexible array member flowing into the m_data storage after it.
> > 
> > Doesn't the array otherwise eventually overlap with tail padding
> > in m_auto?  Or does an array of T never produce tail padding?
> 
> The array can certainly overlap with tail padding in m_auto if any.
> But whether m_data is aligned to alignof (T) or not doesn't change anything
> on it.
> m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, m_num; 
> },
> so I think there is on most arches tail padding if T has smaller alignment
> than int, so typically char/short or structs with the same size/alignments.
> If that happens, alignof (auto_vec_x.m_auto) will be alignof (int),
> there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto)
> is 3 * sizeof (int), m_data will have offset always aligned to alignof (T).
> If alignof (T) >= alignof (int), then there won't be any tail padding
> at the end of m_auto, there could be padding between m_vecpfx and
> m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and
> so m_data will be again already properly aligned.
> 
> So, I think your patch is fine without alignas(T), the rest is just that
> there is more work to do incrementally, even for the case you want to
> deal with (the point 1) in particular).
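
The alignment argument in the quoted text can be checked with a small sketch of the old layout. The types below are simplified stand-ins (`old_prefix`, `old_embed` are hypothetical names), not the real vec.h; exact sizes and padding amounts are target-dependent, so only the alignment property is asserted.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for vec_prefix as described in the quoted text.
struct old_prefix
{
  unsigned m_alloc : 31;
  unsigned m_using_auto_storage : 1;
  unsigned m_num;
};

// Stand-in for the old vl_embed layout: prefix followed by T m_vecdata[1].
template <typename T>
struct old_embed
{
  old_prefix m_vecpfx;
  T m_vecdata[1];
};

// sizeof is always a multiple of the struct's alignment, which is at least
// alignof (T), so storage placed right after the struct is aligned for T.
template <typename T>
constexpr bool end_is_aligned_for()
{
  return sizeof(old_embed<T>) % alignof(T) == 0;
}

static_assert(end_is_aligned_for<char>(), "char");
static_assert(end_is_aligned_for<short>(), "short");
static_assert(end_is_aligned_for<double>(), "double");
```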

Looking at vec::operator[] which just does

template
inline const T &
vec::operator[] (unsigned ix) const
{
  gcc_checking_assert (ix < m_vecpfx.m_num);
  return m_vecdata[ix];
} 

the whole thing looks fragile at best - we basically have

struct auto_vec
{
  struct vec
  {
...
T m_vecdata[1];
  } m_auto;
  T m_data[N-1];
};

and access m_auto.m_vecdata[] as if it extends to m_data.  That's
not something supported by the middle-end - not by design at least.
auto_vec *p; p->m_auto.m_vecdata[i] would never alias
p->m_data[j], in practice we might not see this though.  Also
get_ref_base_and_extent will compute a maxsize/size of sizeof(T)
for any m_auto.m_vecdata[i] access, but I think we nowhere
actually replace 'i' by zero based on this knowledge, but we'd
perform CSE with earlier m_auto.m_vecdata[0] stores, so that
might be something one could provoke.  Doing a self-test like

static __attribute__((noipa)) void
test_auto_alias (int i)
{ 
  auto_vec v;
  v.quick_grow (2);
  v[0] = 1;
  v[1] = 2;
  int val = v[i];
  ASSERT_EQ (val, 2);
} 

shows

  _27 = &_25->m_vecdata[0];
  *_27 = 1;
...
  _7 = &_12->m_vecdata[i.235_3];
  val_13 = *_7;

which is safe in middle-end rules though.  So what "saves" us
here is that we always return a reference and never a value.
There's the ::iterate member function which fails to do this,
the ::quick_push function does

  T *slot = _vecdata[m_vecpfx.m_num++];
  *slot = obj;

with

static __attribute__((noipa)) void
test_auto_alias (int i)
{ 
  auto_vec v;
  v.quick_grow (2);
  v[0] = 1;
  v[1] = 2;
  int val;
  for (int ix = i; v.iterate (ix, ); ix++)
;
  ASSERT_EQ (val, 2);
} 

I get that optimized to a FAIL.  I have a "fix" for this.
unordered_remove has a similar issue accessing the last element.
There are a few functions using the [] access member which is
at least sub-optimal due to repeated bounds checking but also safe.

I suppose if auto_vec would be a union of vec and
a storage member with the vl_embed active that would work, but then
that's likely not something C++11 supports.

So I think to support auto_vec we'd need to make the m_vecdata[]
member in vec of templated size (defaulted to 1)
and get rid of the m_data member in auto_vec instead.  Or have
another C++ way of increasing the size of auto_vec without
actually adding any member?

The vec data accesses then would need to go through
a wrapper obtaining a correctly typed pointer to m_vecdata[]
since we'd like to have that as unsigned char[] to avoid the
initialization.
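
A minimal sketch of the "wrapper obtaining a correctly typed pointer" idea, assuming the data simply starts right after the header object. The names `hdr_prefix`, `trailing_vec` and `auto_sketch` are hypothetical, the element type is assumed to be trivially copyable, and computing the pointer from `this + 1` relies on implementation behaviour rather than on guarantees of the C++ standard.

```cpp
#include <cassert>
#include <cstddef>

// Simplified header; the real vec_prefix uses bit-fields.
struct hdr_prefix
{
  unsigned m_alloc;
  unsigned m_num;
};

// Header-only vector: elements live in storage that follows the object.
template <typename T>
struct trailing_vec
{
  // Request both alignments so the end of the header is aligned for T.
  alignas(T) alignas(hdr_prefix) hdr_prefix m_vecpfx;

  // All data accesses go through these typed accessors.
  T *address() { return reinterpret_cast<T *>(this + 1); }
  const T *address() const { return reinterpret_cast<const T *>(this + 1); }
  unsigned length() const { return m_vecpfx.m_num; }

  void quick_push(const T &obj)
  {
    assert(m_vecpfx.m_num < m_vecpfx.m_alloc);
    address()[m_vecpfx.m_num++] = obj;
  }
};

// Header plus uninitialized byte storage for N elements.
template <typename T, std::size_t N>
struct auto_sketch
{
  trailing_vec<T> m_auto;
  alignas(T) unsigned char m_data[sizeof(T) * N];

  auto_sketch()
  {
    m_auto.m_vecpfx.m_alloc = N;
    m_auto.m_vecpfx.m_num = 0;
  }
};
```

Because the storage is an unsigned char array, no element is default-initialized, and because no array member of type T remains, there is no m_vecdata[0]-style access for the optimizer to reason about.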

> > Yes, I'm not proposing to fix non-POD support.  I want to make
> > as-if-POD stuff like std::pair to work like it was intended.
> > 
> > > Oh, and perhaps we should start marking such spots in GCC with
> > > strict_flex_array attribute to make it clear where we rely on the
> > > non-strict behavior.
> > 
> > I think we never access the array directly as array, do we?
> 
> Sure, the attribute should go to m_vecdata array, not to m_data.
> And to op array in gimple_statement_with_ops, operands array in
> operands, ops array in tree_omp_clause, val in tree_int_cst,
> 

[Bug target/108881] "__builtin_ia32_cvtne2ps2bf16_v16hi" compiled only with option -mavx512bf16 report ICE.

2023-02-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108881

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:0ccfa3884f638816af0f5a3f0ee2695e0771ef6d

commit r13-6318-g0ccfa3884f638816af0f5a3f0ee2695e0771ef6d
Author: Jakub Jelinek 
Date:   Fri Feb 24 10:12:44 2023 +0100

i386: Fix up builtins used in avx512bf16vlintrin.h [PR108881]

The builtins used in avx512bf16vlintrin.h implementation need both
avx512bf16 and avx512vl ISAs, which the header ensures for them, but
the builtins weren't actually requiring avx512vl, so when used by hand
with just -mavx512bf16 -mno-avx512vl it resulted in ICEs.

Fixed by adding OPTION_MASK_ISA_AVX512VL to their BDESC.

2023-02-24  Jakub Jelinek  

PR target/108881
* config/i386/i386-builtin.def (__builtin_ia32_cvtne2ps2bf16_v16bf,
__builtin_ia32_cvtne2ps2bf16_v16bf_mask,
__builtin_ia32_cvtne2ps2bf16_v16bf_maskz,
__builtin_ia32_cvtne2ps2bf16_v8bf,
__builtin_ia32_cvtne2ps2bf16_v8bf_mask,
__builtin_ia32_cvtne2ps2bf16_v8bf_maskz,
__builtin_ia32_cvtneps2bf16_v8sf_mask,
__builtin_ia32_cvtneps2bf16_v8sf_maskz,
__builtin_ia32_cvtneps2bf16_v4sf_mask,
__builtin_ia32_cvtneps2bf16_v4sf_maskz,
__builtin_ia32_dpbf16ps_v8sf, __builtin_ia32_dpbf16ps_v8sf_mask,
__builtin_ia32_dpbf16ps_v8sf_maskz, __builtin_ia32_dpbf16ps_v4sf,
__builtin_ia32_dpbf16ps_v4sf_mask,
__builtin_ia32_dpbf16ps_v4sf_maskz): Require also
OPTION_MASK_ISA_AVX512VL.

* gcc.target/i386/avx512bf16-pr108881.c: New test.

Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 10:54 AM Kito Cheng  wrote:
>
> My impression is that md patterns will use first-match patterns? so
> the zba will get higher priority than xtheadba if both patterns are
> matched?

Yes, I was just about to write this.

/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c

The resulting xtheadba-addsl.s file has:
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zba1p0_xtheadba1p0"
[...]
sh1add  a0,a1,a0

So the standard extension will be preferred over the custom extension.


>
> On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
> >  wrote:
> > >
> > > From: Christoph Müllner 
> > >
> > > This patch adds support for the XTheadBa ISA extension.
> > > The new INSN pattern is defined in a new file to separate
> > > this vendor extension from the standard extensions.
> >
> > How does this interact with doing -march=rv32gc_xtheadba_zba ?
> > > Seems like it might be better to handle that case correctly. I suspect
> > > all these XTheadB* extensions have a similar problem too.
> >
> > Thanks,
> > Andrew Pinski
> >
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv.md: Include thead.md
> > > * config/riscv/thead.md: New file.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/riscv/xtheadba-addsl.c: New test.
> > >
> > > Changes in v3:
> > > - Fix operand order for th.addsl.
> > >
> > > Signed-off-by: Christoph Müllner 
> > > ---
> > >  gcc/config/riscv/riscv.md |  1 +
> > >  gcc/config/riscv/thead.md | 31 +++
> > >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> > >  3 files changed, 87 insertions(+)
> > >  create mode 100644 gcc/config/riscv/thead.md
> > >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > >
> > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > > index 05924e9bbf1..d6c2265e9d4 100644
> > > --- a/gcc/config/riscv/riscv.md
> > > +++ b/gcc/config/riscv/riscv.md
> > > @@ -3093,4 +3093,5 @@ (define_insn "riscv_prefetchi_"
> > >  (include "pic.md")
> > >  (include "generic.md")
> > >  (include "sifive-7.md")
> > > +(include "thead.md")
> > >  (include "vector.md")
> > > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > > new file mode 100644
> > > index 000..158e9124c3a
> > > --- /dev/null
> > > +++ b/gcc/config/riscv/thead.md
> > > @@ -0,0 +1,31 @@
> > > +;; Machine description for T-Head vendor extensions
> > > +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> > > +
> > > +;; This file is part of GCC.
> > > +
> > > +;; GCC is free software; you can redistribute it and/or modify
> > > +;; it under the terms of the GNU General Public License as published by
> > > +;; the Free Software Foundation; either version 3, or (at your option)
> > > +;; any later version.
> > > +
> > > +;; GCC is distributed in the hope that it will be useful,
> > > +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +;; GNU General Public License for more details.
> > > +
> > > +;; You should have received a copy of the GNU General Public License
> > > +;; along with GCC; see the file COPYING3.  If not see
> > > +;; .
> > > +
> > > +;; XTheadBa
> > > +
> > > +(define_insn "*th_addsl"
> > > +  [(set (match_operand:X 0 "register_operand" "=r")
> > > +   (plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
> > > + (match_operand:QI 2 "immediate_operand" "I"))
> > > +   (match_operand:X 3 "register_operand" "r")))]
> > > +  "TARGET_XTHEADBA
> > > +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"
> > > +  "th.addsl\t%0,%3,%1,%2"
> > > +  [(set_attr "type" "bitmanip")
> > > +   (set_attr "mode" "")])
> > > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c 
> > > b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > > new file mode 100644
> > > index 000..5004735a246
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > > @@ -0,0 +1,55 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-march=rv32gc_xtheadba" { target { rv32 } } } */
> > > +/* { dg-options "-march=rv64gc_xtheadba" { target { rv64 } } } */
> > > +/* { dg-skip-if "" { *-*-* } { "-O0" } } */
> > > +
> > > +long
> > > +test_1 (long a, long b)
> > > +{
> > > +  /* th.addsl aX, aX, 1  */
> > > +  return a + (b << 1);
> > > +}
> > > +
> > > +int
> > > +foos (short *x, int n)
> > > +{
> > > +  /* th.addsl aX, aX, 1  */
> > > +  return x[n];
> > > +}
> > > +
> > > +long
> > > +test_2 (long a, long b)
> > > +{
> > > +  /* th.addsl aX, aX, 2  */
> > > +  return a + (b << 2);
> > > +}
> > > +
> > > +int
> > > +fooi (int *x, 

[Bug tree-optimization/107409] Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark with r10-5090-ga9a4edf0e71bba

2023-02-24 Thread rvmallad at amazon dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409

--- Comment #21 from Rama Malladi  ---
I did another triage for perf loss on Graviton 2 processor (neoverse-n1) based
instance and found this commit: `a9a4edf0e71bbac9f1b5dcecdcf9250111d16889` to
be the reason. As I had indicated in my earlier reply, I was doing a triage of
perf loss going from gcc-7 to gcc-10.

The perf of 519.lbm_r 1-copy run improved 1.08x with the revert of commit:
`a9a4edf0e71bbac9f1b5dcecdcf9250111d16889` on gcc-mainline
(`2f1691be517fcdcabae9cd671ab511eb0e08b1d5`).

I am guessing that we don't see it on LNT/ Altra CPUs.

So, please look into a fix for this issue. Let me know if you have any
questions. Thanks.

Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 11:52, Jakub Jelinek  wrote:
>
> On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> > --- a/gcc/vec.h
> > +++ b/gcc/vec.h
> > @@ -586,8 +586,8 @@ public:
> >unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> >unsigned length (void) const { return m_vecpfx.m_num; }
> >bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > -  T *address (void) { return m_vecdata; }
> > -  const T *address (void) const { return m_vecdata; }
> > +  T *address (void) { return reinterpret_cast <T *> (this + 1); }
> > +  const T *address (void) const { return reinterpret_cast <const T *> 
> > (this + 1); }
>
> This is now too long.
>
> >T *begin () { return address (); }
> >const T *begin () const { return address (); }
> >T *end () { return address () + length (); }
> > @@ -631,8 +631,7 @@ public:
> >
> >/* FIXME - These fields should be private, but we need to cater to
> >compilers that have stricter notions of PODness for types.  */
> > -  vec_prefix m_vecpfx;
> > -  T m_vecdata[1];
> > +  alignas (T) vec_prefix m_vecpfx;
>
> The comment needs adjustment and don't we need
> alignas (T) alignas (vec_prefix) ?

Yes. If alignas(T) is less than the natural alignment then this will
be an error. We want it to be the larger of the two alignments, so we
need to specify both.

>
> > @@ -1588,7 +1587,7 @@ public:
> >
> >  private:
> >vec m_auto;
> > -  T m_data[MAX (N - 1, 1)];
> > +  alignas(T) unsigned char m_data[sizeof (T) * N];
> >  };
>
> I still believe you don't need alignas(T) here (and space before (T) ).
> Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> standard C.
>
> Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> bootstrap that I got with my version or not.
>
> Jakub
>



Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 11:54:54AM +, Jonathan Wakely wrote:
> > The comment needs adjustment and don't we need
> > alignas (T) alignas (vec_prefix) ?
> 
> Yes. If alignas(T) is less than the natural alignment then this will
> be an error. We want it to be the larger of the two alignments, so we
> need to specify both.

Seems g++ doesn't diagnose this but clang++ does:
struct S { int a; };
alignas (char) alignas (S) S s;
alignas (char) S t;
$ g++ -S -o /tmp/1.s /tmp/1.C -pedantic-errors
$ clang++ -S -o /tmp/1.s /tmp/1.C -pedantic-errors
/tmp/1.C:3:1: error: requested alignment is less than minimum alignment of 4 
for type 'S'
alignas (char) S t;
^
1 error generated.

Jakub
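
The rule discussed above, that several alignment specifiers on one declaration combine to the strictest of them, can be checked at compile time. The following is a minimal sketch and not GCC code: `prefix` and `holder` are hypothetical stand-ins for `vec_prefix` and the embedded `vec`, and `has_expected_alignment` is a helper invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for vec_prefix: two unsigned counters.
struct prefix { unsigned alloc; unsigned num; };

// Specifying both alignments requests the stricter of the two, so the
// declaration stays valid even when alignof (T) < alignof (prefix).
template <typename T>
struct holder
{
  alignas (T) alignas (prefix) prefix pfx;
};

// True when holder<T> ends up aligned to the larger of the two types.
template <typename T>
constexpr bool has_expected_alignment ()
{
  return alignof (holder<T>) == std::max (alignof (T), alignof (prefix));
}

static_assert (has_expected_alignment<char> (), "char must not weaken prefix");
static_assert (has_expected_alignment<double> (), "double may strengthen it");
static_assert (has_expected_alignment<long double> (), "");
```

With only `alignas (T)` the `holder<char>` case would be the one clang++ rejects in the transcript above.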



[PATCH] asan: adjust module name for global variables

2023-02-24 Thread Martin Liška
As mentioned in the PR, when we use LTO, we wrongly use ltrans output
file name as a module name of a global variable. That leads to a
non-reproducible output.

After the suggested change, we emit context name of normal global
variables. And for artificial variables (like .Lubsan_data3), we use
aux_base_name (e.g. "./a.ltrans0.ltrans").

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR asan/108834

gcc/ChangeLog:

* asan.cc (asan_add_global): Use proper TU name for normal
  global variables (and aux_base_name for the artificial one).

gcc/testsuite/ChangeLog:

* c-c++-common/asan/global-overflow-1.c: Test line and column
info for a global variable.
---
 gcc/asan.cc | 7 ++-
 gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/asan.cc b/gcc/asan.cc
index f56d084bc7a..245abb14388 100644
--- a/gcc/asan.cc
+++ b/gcc/asan.cc
@@ -3287,7 +3287,12 @@ asan_add_global (tree decl, tree type, 
vec *v)
 pp_string (&asan_pp, "");
   str_cst = asan_pp_string (&asan_pp);
 
-  pp_string (&module_name_pp, main_input_filename);
+  const_tree tu = get_ultimate_context ((const_tree)decl);
+  if (tu != NULL_TREE)
+    pp_string (&module_name_pp, IDENTIFIER_POINTER (DECL_NAME (tu)));
+  else
+    pp_string (&module_name_pp, aux_base_name);
+
   module_name_cst = asan_pp_string (&module_name_pp);
 
   if (asan_needs_local_alias (decl))
diff --git a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c 
b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
index b97801da2b7..7e167cee67a 100644
--- a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
@@ -26,4 +26,4 @@ int main() {
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
 /* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0|\[^\n\r]*\\+0x\[0-9a-z\]*)|\[(\])\[^\n\r]*(\n|\r\n|\r).*"
 } */
 /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes after global variable" } */
-/* { dg-output ".*YYY\[^\n\r]* of size 10\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output ".*YYY\[^\n\r]*asan/global-overflow-1.c:15:15'\[^\n\r]*of size 
10\[^\n\r]*(\n|\r\n|\r)" } */
-- 
2.39.2



Re: Rust: In 'type_for_mode' langhook also consider all 'int_n' modes/types (was: Modula-2 / Rust: Many targets failing)

2023-02-24 Thread Jan-Benedict Glaw
Hi Thomas / Arthur!

On Wed, 2023-02-22 15:30:37 +0100, Arthur Cohen  
wrote:
[..]
> > >   --target=msp430-elfbare
> > > ~
> > >
> > > /var/lib/laminar/run/gcc-msp430-elfbare/24/toolchain-build/./gcc/xgcc 
> > > -B/var/lib/laminar/run/gcc-msp430-elfbare/24/toolchain-build/./gcc/  
> > > -xrust -frust-incomplete-and-experimental-compiler-do-not-use -nostdinc 
> > > /dev/null -S -o /dev/null -fself-test=../../gcc/gcc/testsuite/selftests
> > >: internal compiler error: Segmentation fault
> > >0xf2efbf crash_signal
> > >  ../../gcc/gcc/toplev.cc:314
> > >0x120c8c7 build_function_type(tree_node*, tree_node*, bool)
> > >  ../../gcc/gcc/tree.cc:7360
> > >0x120cc20 build_function_type_list(tree_node*, ...)
> > >  ../../gcc/gcc/tree.cc:7442
> > >0x120d16b build_common_builtin_nodes()
> > >  ../../gcc/gcc/tree.cc:9883
> > >0x8449b4 grs_langhook_init
> > >  ../../gcc/gcc/rust/rust-lang.cc:132
> > >0x8427b2 lang_dependent_init
> > >  ../../gcc/gcc/toplev.cc:1815
> > >0x8427b2 do_compile
> > >  ../../gcc/gcc/toplev.cc:2110
> > >Please submit a full bug report, with preprocessed source (by 
> > > using -freport-bug).
> > >Please include the complete backtrace with any bug report.
> > >See  for instructions.
> > >make[1]: *** [../../gcc/gcc/rust/Make-lang.in:275: 
> > > s-selftest-rust] Error 1

Confirmed successful build #37 for my msp430-elfbare build at
http://toolchain.lug-owl.de/laminar/jobs/gcc-msp430-elfbare

Thanks,
  Jan-Benedict




[PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs on x86_64-linux with -m32.  The problem is
we create an artificial thunk and because of -fPIC, ia32 and thunk
destination which doesn't bind locally can't use a mi thunk.
The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
only if some arguments need modification.

The following patch fixes it by copying the DECL_ARGUMENTS list even if
the arguments can stay as is, to update DECL_CONTEXT on them.  While for
mi thunks it doesn't really matter because we don't use those arguments
in any way, for other thunks it is important.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-23  Jakub Jelinek  

PR middle-end/108854
* cgraphclones.cc (duplicate_thunk_for_node): If no parameter
changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
nodes and adjust their DECL_CONTEXT.

* g++.dg/opt/pr108854.C: New test.

--- gcc/cgraphclones.cc.jj  2023-02-22 20:50:27.417519830 +0100
+++ gcc/cgraphclones.cc 2023-02-23 17:12:59.875133883 +0100
@@ -218,7 +218,17 @@ duplicate_thunk_for_node (cgraph_node *t
   body_adj.modify_formal_parameters ();
 }
   else
-new_decl = copy_node (thunk->decl);
+{
+  new_decl = copy_node (thunk->decl);
+  for (tree *arg = &DECL_ARGUMENTS (new_decl);
+  *arg; arg = &DECL_CHAIN (*arg))
+   {
+ tree next = DECL_CHAIN (*arg);
+ *arg = copy_node (*arg);
+ DECL_CONTEXT (*arg) = new_decl;
+ DECL_CHAIN (*arg) = next;
+   }
+}
 
   gcc_checking_assert (!DECL_STRUCT_FUNCTION (new_decl));
   gcc_checking_assert (!DECL_INITIAL (new_decl));
--- gcc/testsuite/g++.dg/opt/pr108854.C.jj  2023-02-23 17:11:19.275583506 
+0100
+++ gcc/testsuite/g++.dg/opt/pr108854.C 2023-02-23 17:11:02.723822009 +0100
@@ -0,0 +1,37 @@
+// PR middle-end/108854
+// { dg-do compile { target c++11 } }
+// { dg-options "-O3" }
+// { dg-additional-options "-fPIC" { target fpic } }
+
+struct A { A (int); ~A (); };
+struct B { B (int, bool); ~B (); };
+template 
+struct C { void m1 (T); void m2 (T &&); };
+class D;
+struct E { virtual void m3 (); };
+template 
+struct F { virtual bool m4 (D &); };
+struct D { virtual D m5 () { return D (); } };
+void foo (void *, void *);
+struct G {
+  int a;
+  C  b;
+  void m4 (D ) { B l (a, true); r.m5 (); b.m1 (); b.m2 (); }
+};
+struct H : E, F  {
+  template 
+  H (int, T);
+  bool m4 (D ) { A l (a); b.m4 (r); if (c) return true; } // { dg-warning 
"control reaches end of non-void function" }
+  int a;
+  bool c;
+  G b;
+};
+inline void bar (F  ) { D s, t; p.m4 (t); foo (, ); }
+enum I { I1, I2 };
+template 
+struct J;
+template 
+void baz () { int g = 0, h = 0; T i (g, h); bar (i); }
+template 
+void qux () { baz > (); }
+void corge () { qux  (); qux  (); }

Jakub
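
The loop added to duplicate_thunk_for_node walks the argument chain through a pointer to the link rather than a pointer to the node, so each link can be redirected to a fresh copy. The following is a standalone sketch of that idiom under simplifying assumptions: `param`, `copy_chain` and `free_chain` are hypothetical names, with `context` modelling DECL_CONTEXT and `chain` modelling DECL_CHAIN; none of this is the GCC tree API.

```cpp
#include <cassert>

// Hypothetical, much simplified model of a PARM_DECL.
struct param
{
  int id;
  const void *context;  // plays the role of DECL_CONTEXT
  param *chain;         // plays the role of DECL_CHAIN
};

// Replace every node reachable from *head with a fresh copy owned by
// new_context.  Walking with a pointer-to-pointer lets the loop rewrite
// the link that points at each node, so the head needs no special case.
inline void
copy_chain (param **head, const void *new_context)
{
  for (param **arg = head; *arg; arg = &(*arg)->chain)
    {
      param *next = (*arg)->chain;
      *arg = new param (**arg);
      (*arg)->context = new_context;
      (*arg)->chain = next;
    }
}

// Release a chain produced by copy_chain.
inline void
free_chain (param *head)
{
  while (head)
    {
      param *next = head->chain;
      delete head;
      head = next;
    }
}
```

The original nodes are left untouched, which is the property the patch relies on: the old function keeps its own parameters while the copy gets nodes whose context points at the new declaration.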



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote:
>
> Assuming a compiler handles the T m_vecdata[1]; as flexible array member
> like (which we need because standard C++ doesn't have flexible array members
> nor [0] arrays), I wonder if we instead of the m_auto followed by m_data
> trick couldn't make auto_vec have
> alignas(vec) unsigned char buf m_data[sizeof (vec) + (N - 
> 1) * sizeof (T)];
> and do a placement new of vec into that m_data during auto_vec
> construction.  Isn't it then similar to how are flexible array members
> normally used in C, where one uses malloc or alloca to allocate storage
> for them and the storage can be larger than the structure itself and
> flexible array member then can use storage after it?

You would still be accessing past the end of the
vec::m_vecdata array which is UB.
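
For comparison, here is a minimal sketch of the "header placement-new'd into a raw buffer" idea in which every element pointer is derived from the buffer itself, so no member array is ever indexed past its end. It assumes C++17 (for `std::launder`); `header` and `inline_store` are hypothetical names and this is not the layout any of the patches in this thread use.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for vec_prefix.
struct header { unsigned alloc; unsigned num; };

template <typename T, std::size_t N>
class inline_store
{
  // Offset of the first element: sizeof (header) rounded up to alignof (T).
  static constexpr std::size_t offset
    = (sizeof (header) + alignof (T) - 1) / alignof (T) * alignof (T);

  // One raw buffer provides storage for the header and for N elements.
  alignas (header) alignas (T) unsigned char m_buf[offset + N * sizeof (T)];

  header *hdr () { return std::launder (reinterpret_cast<header *> (m_buf)); }
  void *raw (std::size_t i) { return m_buf + offset + i * sizeof (T); }

public:
  inline_store () { new (m_buf) header{static_cast<unsigned> (N), 0u}; }
  inline_store (const inline_store &) = delete;
  inline_store &operator= (const inline_store &) = delete;

  // Destroy the constructed elements in reverse order.
  ~inline_store ()
  {
    for (std::size_t i = size (); i > 0; --i)
      at (i - 1).~T ();
  }

  std::size_t size () { return hdr ()->num; }

  T &at (std::size_t i)
  {
    assert (i < size ());
    return *std::launder (reinterpret_cast<T *> (raw (i)));
  }

  // Construct a copy of obj in the next free slot.
  T &push (const T &obj)
  {
    assert (hdr ()->num < hdr ()->alloc);
    return *new (raw (hdr ()->num++)) T (obj);
  }
};
```

Only slots that have been pushed ever hold a constructed `T`, so the buffer itself is never default-initialized element by element.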



[Bug target/108881] "__builtin_ia32_cvtne2ps2bf16_v16hi" compiled only with option -mavx512bf16 report ICE.

2023-02-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108881

--- Comment #6 from Jakub Jelinek  ---
Fixed on the trunk so far.
I think we want to backport to 10/11/12, though in that case it won't be v*bf
but v*hi.

Re: [PATCH v3 04/11] riscv: thead: Add support for the XTheadBs ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 8:37 AM Kito Cheng  wrote:
>
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > index 158e9124c3a..2c684885850 100644
> > --- a/gcc/config/riscv/thead.md
> > +++ b/gcc/config/riscv/thead.md
> > @@ -29,3 +29,14 @@ (define_insn "*th_addsl"
> >"th.addsl\t%0,%3,%1,%2"
> >[(set_attr "type" "bitmanip")
> > (set_attr "mode" "")])
> > +
> > +;; XTheadBs
> > +
> > +(define_insn "*th_tst"
> > +  [(set (match_operand:X 0 "register_operand" "=r")
> > +   (zero_extract:X (match_operand:X 1 "register_operand" "r")
> > +   (const_int 1)
> > +   (match_operand 2 "immediate_operand" "i")))]
> > +  "TARGET_XTHEADBS"
>
> Add range check like *bexti pattern?
>
> TARGET_XTHEADBS && UINTVAL (operands[2]) < GET_MODE_BITSIZE (mode)

Ok.

Thanks,
Christoph

>
> > +  "th.tst\t%0,%1,%2"
> > +  [(set_attr "type" "bitmanip")])
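
The requested range check can be illustrated outside of the machine description. The following sketch models a one-bit zero_extract on a 64-bit register, assuming bit positions are counted from the least significant bit; `xlen`, `valid_bit_position` and `extract_bit` are hypothetical names chosen for the example and are not GCC macros.

```cpp
#include <cassert>
#include <cstdint>

// Register width in bits; 64 models RV64 (an assumption of this sketch).
constexpr unsigned xlen = 64;

// Mirrors the suggested insn condition: the immediate must name a bit
// that actually exists in the register.
constexpr bool
valid_bit_position (std::uint64_t pos)
{
  return pos < xlen;
}

// Value of a one-bit extract: bit `pos` of x, zero-extended.  A shift by
// xlen or more is undefined in C++, so callers must check the position.
inline std::uint64_t
extract_bit (std::uint64_t x, std::uint64_t pos)
{
  assert (valid_bit_position (pos));
  return (x >> pos) & 1u;
}
```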


[Bug c++/108920] New: Condition falsely optimized out

2023-02-24 Thread agner at agner dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108920

Bug ID: 108920
   Summary: Condition falsely optimized out
   Product: gcc
   Version: 9.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: agner at agner dot org
  Target Milestone: ---

Created attachment 54526
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54526=edit
code to reproduce error

The attached file test.cpp gives wrong code when optimized with -O2 or higher.
To reproduce error, do:

g++ -O2 -m64 -S -o t1.s test.cpp
g++ -O2 -m64 -S -DFIX -o t2.s test.cpp


The condition in line 104 in test.cpp is optimized away in t1.s

The workaround on line 73 is preventing this false optimization with -DFIX to
generate correct code in t2.s
See t2.s line 252-255

Re: C++ modules and AAPCS/ARM EABI clash on inline key methods

2023-02-24 Thread Iain Sandoe



> On 24 Feb 2023, at 10:23, Richard Earnshaw via Gcc-patches 
>  wrote:
> 
> 
> 
> On 23/02/2023 21:20, Alexandre Oliva wrote:
>> On Feb 23, 2023, Alexandre Oliva  wrote:
>>> On Feb 23, 2023, Richard Earnshaw  wrote:
 On 22/02/2023 19:57, Alexandre Oliva wrote:
> On Feb 21, 2023, Richard Earnshaw  wrote:
> 
>> Rather than scanning for the triplet, a better test would be
> 
>> { xfail { arm_eabi } }
> 
> Indeed, thanks.  Here's the updated patch, retested.  Ok to install?
 Based on Nathan's comments, we should just skip the test on arm_eabi,
 it's simply not applicable.
>>> Like this, I suppose.  Retested on x86_64-linux-gnu (trunk) and
>>> arm-wrs-vxworks7 (gcc-12).  Ok to install?
>> Erhm, actually, that version still ran the assembler scans and failed.
>> This one skips the testset entirely.
> 
> Yeah, I tried something like that and it didn't appear to work. Perhaps it's 
> a bug in the way dg-do-module is implemented.

I think if you suppress the dg-do run line (with the target selector) then it 
will just do the default (which is to compile only?)

Skip seems like the correct thing to do here ..
Iain

> 
>> [PR105224] C++ modules and AAPCS/ARM EABI clash on inline key methods
>> From: Alexandre Oliva 
>> g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
>> that use the AAPCS variant.  ARM is the only target that overrides
>> TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way
>> the clash between AAPCS and C++ Modules design should be resolved, but
>> currently it favors AAPCS and thus the test fails, so skip it on
>> arm_eabi.
>> for  gcc/testsuite/ChangeLog
>>  PR c++/105224
>>  * g++.dg/modules/virt-2_a.C: Skip on arm_eabi.
>> ---
>>  gcc/testsuite/g++.dg/modules/virt-2_a.C |3 +++
>>  1 file changed, 3 insertions(+)
>> diff --git a/gcc/testsuite/g++.dg/modules/virt-2_a.C 
>> b/gcc/testsuite/g++.dg/modules/virt-2_a.C
>> index 580552be5a0d8..ede711c3e83be 100644
>> --- a/gcc/testsuite/g++.dg/modules/virt-2_a.C
>> +++ b/gcc/testsuite/g++.dg/modules/virt-2_a.C
>> @@ -1,3 +1,6 @@
>> +// AAPCS overrides TARGET_CXX_KEY_METHOD_MAY_BE_INLINE,
>> +// in a way that invalidates this test.
>> +// { dg-skip-if "TARGET_CXX_KEY_METHOD_MAY_BE_INLINE" { arm_eabi } }
> 
> Given the logic of this macro, the text should be 
> "!TARGET_CXX_METHOD_MAY_BE_INLINE".
> 
> OK with that change.
> 
> R.
> 
>>  // { dg-module-do run }
>>  // { dg-additional-options -fmodules-ts }
>>  export module foo;



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 10:30:00AM +, Jonathan Wakely wrote:
> On Fri, 24 Feb 2023 at 10:24, Jakub Jelinek  wrote:
> >
> > On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches 
> > wrote:
> > > Maybe this would work, vl_relative even could be vl_embed.
> > > Because vl_embed I believe is used in two spots, part of
> > > auto_vec where it is followed by m_data and on heap or GGC
> > > allocated memory where vec<..., vl_embed> is followed by
> > > further storage for the vector.
> >
> > So roughly something like below?  Except I get weird crashes with it
> > in the gen* tools.  And we'd need to adjust the gdb python hooks
> > which also use m_vecdata.
> >
> > --- gcc/vec.h.jj2023-01-02 09:32:32.177143804 +0100
> > +++ gcc/vec.h   2023-02-24 11:19:37.900157177 +0100
> > @@ -586,8 +586,8 @@ public:
> >unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> >unsigned length (void) const { return m_vecpfx.m_num; }
> >bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > -  T *address (void) { return m_vecdata; }
> > -  const T *address (void) const { return m_vecdata; }
> > +  T *address (void) { return (T *) (this + 1); }
> > +  const T *address (void) const { return (const T *) (this + 1); }
> >T *begin () { return address (); }
> >const T *begin () const { return address (); }
> >T *end () { return address () + length (); }
> > @@ -629,10 +629,9 @@ public:
> >friend struct va_gc_atomic;
> >friend struct va_heap;
> >
> > -  /* FIXME - These fields should be private, but we need to cater to
> > +  /* FIXME - This field should be private, but we need to cater to
> >  compilers that have stricter notions of PODness for types.  */
> > -  vec_prefix m_vecpfx;
> > -  T m_vecdata[1];
> > +  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
> >  };
> >
> >
> > @@ -879,7 +878,7 @@ inline const T &
> >  vec::operator[] (unsigned ix) const
> >  {
> >gcc_checking_assert (ix < m_vecpfx.m_num);
> > -  return m_vecdata[ix];
> > +  return address ()[ix];
> >  }
> >
> >  template
> > @@ -887,7 +886,7 @@ inline T &
> >  vec::operator[] (unsigned ix)
> >  {
> >gcc_checking_assert (ix < m_vecpfx.m_num);
> > -  return m_vecdata[ix];
> > +  return address ()[ix];
> >  }
> >
> >
> > @@ -929,7 +928,7 @@ vec::iterate (unsigned i
> >  {
> >if (ix < m_vecpfx.m_num)
> >  {
> > -  *ptr = m_vecdata[ix];
> > +  *ptr = address ()[ix];
> >return true;
> >  }
> >else
> > @@ -955,7 +954,7 @@ vec::iterate (unsigned i
> >  {
> >if (ix < m_vecpfx.m_num)
> >  {
> > -  *ptr = CONST_CAST (T *, &m_vecdata[ix]);
> > +  *ptr = CONST_CAST (T *, &address ()[ix]);
> >return true;
> >  }
> >else
> > @@ -978,7 +977,7 @@ vec::copy (ALONE_MEM_STA
> >  {
> >vec_alloc (new_vec, len PASS_MEM_STAT);
> >new_vec->embedded_init (len, len);
> > -  vec_copy_construct (new_vec->address (), m_vecdata, len);
> > +  vec_copy_construct (new_vec->address (), address (), len);
> >  }
> >return new_vec;
> >  }
> > @@ -1018,7 +1017,7 @@ inline T *
> >  vec::quick_push (const T )
> >  {
> >gcc_checking_assert (space (1));
> > -  T *slot = &m_vecdata[m_vecpfx.m_num++];
> > +  T *slot = &address ()[m_vecpfx.m_num++];
> >*slot = obj;
> >return slot;
> >  }
> > @@ -1031,7 +1030,7 @@ inline T &
> >  vec::pop (void)
> >  {
> >gcc_checking_assert (length () > 0);
> > -  return m_vecdata[--m_vecpfx.m_num];
> > +  return address ()[--m_vecpfx.m_num];
> >  }
> >
> >
> > @@ -1056,7 +1055,7 @@ vec::quick_insert (unsig
> >  {
> >gcc_checking_assert (length () < allocated ());
> >gcc_checking_assert (ix <= length ());
> > -  T *slot = &m_vecdata[ix];
> > +  T *slot = &address ()[ix];
> >memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
> >*slot = obj;
> >  }
> > @@ -1071,7 +1070,7 @@ inline void
> >  vec::ordered_remove (unsigned ix)
> >  {
> >gcc_checking_assert (ix < length ());
> > -  T *slot = &m_vecdata[ix];
> > +  T *slot = &address ()[ix];
> >memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
> >  }
> >
> > @@ -1118,7 +1117,7 @@ inline void
> >  vec::unordered_remove (unsigned ix)
> >  {
> >gcc_checking_assert (ix < length ());
> > -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> > +  address ()[ix] = address ()[--m_vecpfx.m_num];
> >  }
> >
> >
> > @@ -1130,7 +1129,7 @@ inline void
> >  vec::block_remove (unsigned ix, unsigned len)
> >  {
> >gcc_checking_assert (ix + len <= length ());
> > -  T *slot = &m_vecdata[ix];
> > +  T *slot = &address ()[ix];
> >m_vecpfx.m_num -= len;
> >memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
> >  }
> > @@ -1309,7 +1308,7 @@ vec::embedded_size (unsi
> > vec, vec_embedded>::type vec_stdlayout;
> >static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
> >static_assert (alignof (vec_stdlayout) == alignof (vec), "");
> > -  return offsetof (vec_stdlayout, 

[PATCH][committed] aarch64: Update FLAGS field documentation comment in aarch64-cores.def

2023-02-24 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

With the cleanup of the arch features in GCC 13 the comment on the FLAGS field 
in aarch64-cores.def
is now outdated. It's now a comma-separated list rather than a bitwise or.
Spotted while reviewing an aarch64-cores.def patch.
Update the comment.

Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (FLAGS): Update comment.


aarch64-comment.patch
Description: aarch64-comment.patch


[PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
The following avoids default-initializing auto_vec storage for
non-POD T since that's not what the allocated storage fallback
will do and it's also not expected for existing cases like

  auto_vec, 64> elts;

which exist to optimize the allocation.

It also fixes the array accesses done by vec to not
use its own m_vecdata member but instead access the container
provided storage via pointer arithmetic.

This seems to work but it also somehow breaks genrecog which now
goes OOM with this change.  I'm going to see if the testsuite
shows anything, but maybe it's obvious from a second eye what
I did wrong ...

Comments welcome of course.

Thanks,
Richard.

* vec.h (vec::m_vecdata): Remove.
(vec::m_vecpfx): Align as T to avoid
changing alignment of vec and simplifying
address.
(vec::address): Compute as this + 1.
(vec::embedded_size): Use sizeof the
vector instead of the offset of the m_vecdata member.
(auto_vec): Turn m_data storage into
uninitialized unsigned char aligned as T.
* vec.cc (test_auto_alias): New.
(vec_cc_tests): Call it.
---
 gcc/vec.cc | 17 +
 gcc/vec.h  | 11 +--
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/gcc/vec.cc b/gcc/vec.cc
index 511e6dff50d..2128fb1 100644
--- a/gcc/vec.cc
+++ b/gcc/vec.cc
@@ -568,6 +568,22 @@ test_auto_delete_vec ()
   ASSERT_EQ (dtor_count, 2);
 }
 
+/* Verify accesses to m_vecdata are done indirectly.  */
+
+static void
+test_auto_alias ()
+{
+  volatile int i = 1;
+  auto_vec v;
+  v.quick_grow (2);
+  v[0] = 1;
+  v[1] = 2;
+  int val;
+  for (int ix = i; v.iterate (ix, &val); ix++)
+ASSERT_EQ (val, 2);
+  ASSERT_EQ (val, 0);
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -587,6 +603,7 @@ vec_cc_tests ()
   test_qsort ();
   test_reverse ();
   test_auto_delete_vec ();
+  test_auto_alias ();
 }
 
 } // namespace selftest
diff --git a/gcc/vec.h b/gcc/vec.h
index 5a2ee9c0294..b680efebe7a 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -586,8 +586,8 @@ public:
   unsigned allocated (void) const { return m_vecpfx.m_alloc; }
   unsigned length (void) const { return m_vecpfx.m_num; }
   bool is_empty (void) const { return m_vecpfx.m_num == 0; }
-  T *address (void) { return m_vecdata; }
-  const T *address (void) const { return m_vecdata; }
+  T *address (void) { return reinterpret_cast <T *> (this + 1); }
+  const T *address (void) const { return reinterpret_cast <const T *> (this + 
1); }
   T *begin () { return address (); }
   const T *begin () const { return address (); }
   T *end () { return address () + length (); }
@@ -631,8 +631,7 @@ public:
 
   /* FIXME - These fields should be private, but we need to cater to
 compilers that have stricter notions of PODness for types.  */
-  vec_prefix m_vecpfx;
-  T m_vecdata[1];
+  alignas (T) vec_prefix m_vecpfx;
 };
 
 
@@ -1313,7 +1312,7 @@ vec::embedded_size (unsigned alloc)
vec, vec_embedded>::type vec_stdlayout;
   static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
   static_assert (alignof (vec_stdlayout) == alignof (vec), "");
-  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
+  return sizeof (vec_stdlayout) + alloc * sizeof (T);
 }
 
 
@@ -1588,7 +1587,7 @@ public:
 
 private:
   vec m_auto;
-  T m_data[MAX (N - 1, 1)];
+  alignas(T) unsigned char m_data[sizeof (T) * N];
 };
 
 /* auto_vec is a sub class of vec whose storage is released when it is
-- 
2.35.3


Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jonathan Wakely wrote:

> On Fri, 24 Feb 2023 at 11:52, Jakub Jelinek  wrote:
> >
> > On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> > > --- a/gcc/vec.h
> > > +++ b/gcc/vec.h
> > > @@ -586,8 +586,8 @@ public:
> > >unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> > >unsigned length (void) const { return m_vecpfx.m_num; }
> > >bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > > -  T *address (void) { return m_vecdata; }
> > > -  const T *address (void) const { return m_vecdata; }
> > > +  T *address (void) { return reinterpret_cast <T *> (this + 1); }
> > > +  const T *address (void) const { return reinterpret_cast <const T *> 
> > > (this + 1); }
> >
> > This is now too long.

Fixed.

> > >T *begin () { return address (); }
> > >const T *begin () const { return address (); }
> > >T *end () { return address () + length (); }
> > > @@ -631,8 +631,7 @@ public:
> > >
> > >/* FIXME - These fields should be private, but we need to cater to
> > >compilers that have stricter notions of PODness for types.  */
> > > -  vec_prefix m_vecpfx;
> > > -  T m_vecdata[1];
> > > +  alignas (T) vec_prefix m_vecpfx;
> >
> > The comment needs adjustment and don't we need
> > alignas (T) alignas (vec_prefix) ?
> 
> Yes. If alignas(T) is less than the natural alignment then this will
> be an error. We want it to be the larger of the two alignments, so we
> need to specify both.

OK, changed to specify both and adjusted the comment, also noting why
we do this - it simplifies address (), otherwise we'd have to round up
to an aligned address.

> >
> > > @@ -1588,7 +1587,7 @@ public:
> > >
> > >  private:
> > >vec m_auto;
> > > -  T m_data[MAX (N - 1, 1)];
> > > +  alignas(T) unsigned char m_data[sizeof (T) * N];
> > >  };
> >
> > I still believe you don't need alignas(T) here (and space before (T) ).

I was worried that with auto_vec<__int128> we get tail-padding in m_auto
re-used, but since this isn't inheritance we're probably safe.  So 
removed, given that m_auto is aligned to T.

> > Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> > ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> > now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> > standard C.

I've removed the MAX (N, 2) now, I think that N == 0 cannot happen
because we have a specialization covering that.  So we know N is
at least 1.

> > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > bootstrap that I got with my version or not.

Yes, I get this as well, not sure how to suppress it.  I guess there's
no standard way to get at the address after some object without going
through uintptr obfuscation - and obviously we do not want to have
that (and if we optimize it away that doesn't help the diagnostic ...)

Richard.
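
The two points made about m_data, that it should be raw storage so no T is default-constructed and that N == 0 is handled by a separate specialization, can be sketched in isolation. This is an illustration under assumptions, not the auto_vec definition: `inline_storage` and `noisy` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Primary template: raw, uninitialized storage for N objects of type T.
// No T is constructed here, so a non-trivial T is not default-initialized.
template <typename T, std::size_t N>
struct inline_storage
{
  static constexpr std::size_t capacity = N;
  alignas (T) unsigned char data[sizeof (T) * N];
};

// Specialization for N == 0: there is no array member at all, so the
// zero-length array `data[sizeof (T) * 0]` is never instantiated.
template <typename T>
struct inline_storage<T, 0>
{
  static constexpr std::size_t capacity = 0;
};

// A type that counts how often its default constructor runs.
struct noisy
{
  static int ctor_calls;
  noisy () { ++ctor_calls; }
};
int noisy::ctor_calls = 0;
```

Declaring an `inline_storage<noisy, 8>` leaves `noisy::ctor_calls` at zero, whereas a member declared as `noisy m_data[8]` would run the constructor eight times.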


[committed] i386: Update i386-builtin.def file comment description of BDESC{,_FIRST}

2023-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

I've noticed the description of these wasn't updated when the mask2
argument has been added in 2019.

Committed to trunk as obvious.

2023-02-24  Jakub Jelinek  

* config/i386/i386-builtin.def: Update description of BDESC
and BDESC_FIRST in file comment to include mask2.

--- gcc/config/i386/i386-builtin.def.jj 2023-02-24 10:12:37.027390923 +0100
+++ gcc/config/i386/i386-builtin.def2023-02-24 13:07:29.453512100 +0100
@@ -23,9 +23,9 @@
.  */
 
 /* Before including this file, some macros must be defined:
-   BDESC (mask, icode, name, code, comparison, flag)
+   BDESC (mask, mask2, icode, name, code, comparison, flag)
  -- definition of each builtin
-   BDESC_FIRST (kind, KIND, mask, icode, name, code, comparison, flag)
+   BDESC_FIRST (kind, KIND, mask, mask2, icode, name, code, comparison, flag)
  -- like BDESC, but used for the first builtin in each category;
bdesc_##kind will be used in the name of the array and
IX86_BUILTIN__BDESC_##KIND##_FIRST will be the low boundary

Jakub
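
The "define a macro, then include the list" convention described in that file comment is the X-macro pattern. The following single-file sketch shows the idea with a reduced field list (mask, mask2, name, code); `DEMO_BUILTINS`, `demo_code`, `demo_desc` and `demo_table` are hypothetical, and a list macro stands in for the separate .def file that GCC includes.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical builtin list.  In GCC the entries live in a .def file that
// is #included after BDESC has been defined; a list macro keeps this
// sketch in one file.  Fields: mask, mask2, name, code.
#define DEMO_BUILTINS(X)            \
  X (0x1, 0x0, "demo_add", DEMO_ADD) \
  X (0x1, 0x2, "demo_mul", DEMO_MUL)

// First expansion: one enumerator per builtin.
enum demo_code
{
#define BDESC(mask, mask2, name, code) code,
  DEMO_BUILTINS (BDESC)
#undef BDESC
  DEMO_MAX
};

struct demo_desc
{
  unsigned mask;
  unsigned mask2;
  const char *name;
  demo_code code;
};

// Second expansion: a descriptor table in the same order as the enum.
static const demo_desc demo_table[] = {
#define BDESC(mask, mask2, name, code) { mask, mask2, name, code },
  DEMO_BUILTINS (BDESC)
#undef BDESC
};
```

Because every expansion site spells out the parameter list, adding a field such as mask2 means updating each BDESC definition, which is also why a stale description in the file comment is easy to miss.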



Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
Got one fail:

FAIL: gcc.target/riscv/xtheadmempair-1.c   -O2   scan-assembler-times
th.luwd\t 4

It should scan lwud rather than luwd?


[committed] i386: Fix up builtins used in avx512bf16vlintrin.h [PR108881]

2023-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

The builtins used in avx512bf16vlintrin.h implementation need both
avx512bf16 and avx512vl ISAs, which the header ensures for them, but
the builtins weren't actually requiring avx512vl, so when used by hand
with just -mavx512bf16 -mno-avx512vl it resulted in ICEs.

Fixed by adding OPTION_MASK_ISA_AVX512VL to their BDESC.
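
As a rough model of why the missing mask mattered, the sketch below treats a builtin as usable only when every ISA bit it names is enabled. This is a deliberately simplified assumption about the availability rule, and the bit values and names (`isa_avx512vl`, `isa2_avx512bf16`, `builtin_req`, `builtin_enabled`) are hypothetical rather than GCC's real OPTION_MASK_ISA* definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits; the real OPTION_MASK_ISA* values differ.
constexpr std::uint64_t isa_avx512vl = 1u << 0;     // bit in the first ISA word
constexpr std::uint64_t isa2_avx512bf16 = 1u << 0;  // bit in the second ISA word

// The two masks carried by a builtin descriptor.
struct builtin_req
{
  std::uint64_t mask;
  std::uint64_t mask2;
};

// Simplified rule: every requested bit in both words must be enabled.
constexpr bool
builtin_enabled (builtin_req req, std::uint64_t isa, std::uint64_t isa2)
{
  return (isa & req.mask) == req.mask && (isa2 & req.mask2) == req.mask2;
}

// Descriptor before the fix (no AVX512VL requirement) and after it.
constexpr builtin_req before_fix = {0, isa2_avx512bf16};
constexpr builtin_req after_fix = {isa_avx512vl, isa2_avx512bf16};
```

Under this model the old descriptor accepts -mavx512bf16 -mno-avx512vl, which is the combination that reached code paths needing AVX512VL, while the new descriptor rejects it.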

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by
Hongtao in the PR, committed to trunk.

2023-02-23  Jakub Jelinek  

PR target/108881
* config/i386/i386-builtin.def (__builtin_ia32_cvtne2ps2bf16_v16bf,
__builtin_ia32_cvtne2ps2bf16_v16bf_mask,
__builtin_ia32_cvtne2ps2bf16_v16bf_maskz,
__builtin_ia32_cvtne2ps2bf16_v8bf,
__builtin_ia32_cvtne2ps2bf16_v8bf_mask,
__builtin_ia32_cvtne2ps2bf16_v8bf_maskz,
__builtin_ia32_cvtneps2bf16_v8sf_mask,
__builtin_ia32_cvtneps2bf16_v8sf_maskz,
__builtin_ia32_cvtneps2bf16_v4sf_mask,
__builtin_ia32_cvtneps2bf16_v4sf_maskz,
__builtin_ia32_dpbf16ps_v8sf, __builtin_ia32_dpbf16ps_v8sf_mask,
__builtin_ia32_dpbf16ps_v8sf_maskz, __builtin_ia32_dpbf16ps_v4sf,
__builtin_ia32_dpbf16ps_v4sf_mask,
__builtin_ia32_dpbf16ps_v4sf_maskz): Require also
OPTION_MASK_ISA_AVX512VL.

* gcc.target/i386/avx512bf16-pr108881.c: New test.

--- gcc/config/i386/i386-builtin.def.jj 2023-01-16 11:52:15.955735951 +0100
+++ gcc/config/i386/i386-builtin.def2023-02-23 18:20:37.139676726 +0100
@@ -2814,30 +2814,30 @@ BDESC (0, OPTION_MASK_ISA2_VAES, CODE_FO
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtne2ps2bf16_v32bf, 
"__builtin_ia32_cvtne2ps2bf16_v32bf", IX86_BUILTIN_CVTNE2PS2BF16_V32BF, 
UNKNOWN, (int) V32BF_FTYPE_V16SF_V16SF)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v32bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v32bf_mask", 
IX86_BUILTIN_CVTNE2PS2BF16_V32BF_MASK, UNKNOWN, (int) 
V32BF_FTYPE_V16SF_V16SF_V32BF_USI)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v32bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v32bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V32BF_MASKZ, UNKNOWN, (int) 
V32BF_FTYPE_V16SF_V16SF_USI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtne2ps2bf16_v16bf, 
"__builtin_ia32_cvtne2ps2bf16_v16bf", IX86_BUILTIN_CVTNE2PS2BF16_V16BF, 
UNKNOWN, (int) V16BF_FTYPE_V8SF_V8SF)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_mask", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASK, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_V16BF_UHI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASKZ, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_UHI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtne2ps2bf16_v8bf, 
"__builtin_ia32_cvtne2ps2bf16_v8bf", IX86_BUILTIN_CVTNE2PS2BF16_V8BF, UNKNOWN, 
(int) V8BF_FTYPE_V4SF_V4SF)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_mask", IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASK, 
UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_V8BF_UQI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASKZ, UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf, "__builtin_ia32_cvtne2ps2bf16_v16bf", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF, UNKNOWN, (int) V16BF_FTYPE_V8SF_V8SF)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_mask", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASK, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_V16BF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASKZ, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf, "__builtin_ia32_cvtne2ps2bf16_v8bf", 
IX86_BUILTIN_CVTNE2PS2BF16_V8BF, UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_mask", IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASK, 
UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_V8BF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASKZ, UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_UQI)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtneps2bf16_v16sf, 
"__builtin_ia32_cvtneps2bf16_v16sf", IX86_BUILTIN_CVTNEPS2BF16_V16SF, UNKNOWN, 
(int) V16BF_FTYPE_V16SF)
 BDESC (0, 

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Richard Biener wrote:

> On Thu, 23 Feb 2023, Jakub Jelinek wrote:
> 
> > On Thu, Feb 23, 2023 at 03:02:01PM +, Richard Biener wrote:
> > > > >   * vec.h (auto_vec): Turn m_data storage into
> > > > >   uninitialized unsigned char.
> > > > 
> > > > Given that we actually never reference the m_data array anywhere,
> > > > it is just to reserve space, I think even the alignas(T) there is
> > > > useless.  The point is that m_auto has as data members:
> > > >   vec_prefix m_vecpfx;
> > > >   T m_vecdata[1];
> > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or
> > > > -fsanitize=bounds-strict incompatible) being treated as
> > > > flexible array member flowing into the m_data storage after it.
> > > 
> > > Doesn't the array otherwise eventually overlap with tail padding
> > > in m_auto?  Or does an array of T never produce tail padding?
> > 
> > The array can certainly overlap with tail padding in m_auto if any.
> > But whether m_data is aligned to alignof (T) or not doesn't change anything
> > on it.
> > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, 
> > m_num; },
> > so I think there is on most arches tail padding if T has smaller alignment
> > than int, so typically char/short or structs with the same size/alignments.
> > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int),
> > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto)
> > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T).
> > If alignof (T) >= alignof (int), then there won't be any tail padding
> > at the end of m_auto, there could be padding between m_vecpfx and
> > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and
> > so m_data will be again already properly aligned.
> > 
> > So, I think your patch is fine without alignas(T), the rest is just that
> > there is more work to do incrementally, even for the case you want to
> > deal with (the point 1) in particular).
> 
> Looking at vec::operator[] which just does
> 
> template<typename T, typename A>
> inline const T &
> vec<T, A, vl_embed>::operator[] (unsigned ix) const
> {
>   gcc_checking_assert (ix < m_vecpfx.m_num);
>   return m_vecdata[ix];
> } 
> 
> the whole thing looks fragile at best - we basically have
> 
> struct auto_vec
> {
>   struct vec
>   {
> ...
> T m_vecdata[1];
>   } m_auto;
>   T m_data[N-1];
> };
> 
> and access m_auto.m_vecdata[] as if it extends to m_data.  That's
> not something supported by the middle-end - not by design at least.
> auto_vec *p; p->m_auto.m_vecdata[i] would never alias
> p->m_data[j], in practice we might not see this though.  Also
> get_ref_base_and_extent will compute a maxsize/size of sizeof(T)
> for any m_auto.m_vecdata[i] access, but I think we nowhere
> actually replace 'i' by zero based on this knowledge, but we'd
> perform CSE with earlier m_auto.m_vecdata[0] stores, so that
> might be something one could provoke.  Doing a self-test like
> 
> static __attribute__((noipa)) void
> test_auto_alias (int i)
> { 
>   auto_vec v;
>   v.quick_grow (2);
>   v[0] = 1;
>   v[1] = 2;
>   int val = v[i];
>   ASSERT_EQ (val, 2);
> } 
> 
> shows
> 
>   _27 = &_25->m_vecdata[0];
>   *_27 = 1;
> ...
>   _7 = &_12->m_vecdata[i.235_3];
>   val_13 = *_7;
> 
> which is safe in middle-end rules though.  So what "saves" us
> here is that we always return a reference and never a value.
> There's the ::iterate member function which fails to do this,
> the ::quick_push function does
> 
>   T *slot = &m_vecdata[m_vecpfx.m_num++];
>   *slot = obj;
> 
> with
> 
> static __attribute__((noipa)) void
> test_auto_alias (int i)
> { 
>   auto_vec v;
>   v.quick_grow (2);
>   v[0] = 1;
>   v[1] = 2;
>   int val;
>   for (int ix = i; v.iterate (ix, &val); ix++)
> ;
>   ASSERT_EQ (val, 2);
> } 
> 
> I get that optimized to a FAIL.  I have a "fix" for this.
> unordered_remove has a similar issue accessing the last element.

Turns out forwprop "breaks" this still, so the fix doesn't work.
That means we have a hole here in the middle-end.  We can
avoid this by obfuscating things even more, like to

  const T *first = m_vecdata;
  const T *slot = first + ix;
  *ptr = *slot;

which at least for variable 'ix' avoids forwprop from triggering.

But this also means that the existing [] accessor isn't really safe,
we're just lucky that we turn constant accesses to
MEM[(int &) + off] = val; and that we now have PR108355
which made the get_ref_base_and_extent info used less often in VN.

I'm testing the patch now without the new selftest, it should be
good to avoid these issues for constant indexes.  I can also split
the patch up.

But in the end I think we have to fix auto_vec in a better way.
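
To make the layout under discussion concrete, here is a minimal sketch
that mimics it with stand-in types and measures the padding Jakub
describes.  The names prefix_sketch, embed_sketch, auto_sketch and the
two helper functions are hypothetical and are not GCC's vec.h; the
numbers in the comments assume a common ABI where unsigned is 4 bytes
with 4-byte alignment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for vec_prefix as described in the thread.
struct prefix_sketch
{
  unsigned m_alloc : 31;
  unsigned m_using_auto_storage : 1;
  unsigned m_num;
};

// Hypothetical stand-in for the embedded vector with its one-element array.
template <typename T>
struct embed_sketch
{
  prefix_sketch m_vecpfx;
  T m_vecdata[1];
};

// Hypothetical stand-in for auto_vec: m_auto followed by N-1 more elements.
template <typename T, std::size_t N>
struct auto_sketch
{
  embed_sketch<T> m_auto;
  T m_data[N - 1];
};

// Bytes of tail padding inside m_auto, i.e. after m_vecdata[0].
// Assumed ABI: 3 for char, 2 for short, 0 for int.
template <typename T>
std::size_t tail_padding ()
{
  typedef embed_sketch<T> E;
  return sizeof (E) - (offsetof (E, m_vecdata) + sizeof (T));
}

// Byte distance from m_vecdata[0] to m_data[0].  When this exceeds
// sizeof (T), indexes past 0 first walk through m_auto's tail padding.
template <typename T, std::size_t N>
std::size_t bytes_from_vecdata_to_data ()
{
  typedef auto_sketch<T, N> A;
  typedef embed_sketch<T> E;
  return offsetof (A, m_data)
	 - (offsetof (A, m_auto) + offsetof (E, m_vecdata));
}
```

This only inspects sizes and offsets.  It deliberately does not index
m_vecdata past its single element, since that is exactly the access
the middle-end does not promise to support.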

Richard.

> There are a few functions using the [] access member which is
> at least sub-optimal due to repeated bounds checking but also safe.
> 
> I suppose if auto_vec would be a union of vec and
> a storage member with the vl_embed active that would work, but then
> 

Re: [PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on x86_64-linux with -m32.  The problem is
> we create an artificial thunk and because of -fPIC, ia32 and thunk
> destination which doesn't bind locally can't use a mi thunk.
> The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
> but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
> This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
> only if some arguments need modification.
> 
> The following patch fixes it by copying the DECL_ARGUMENTS list even if
> the arguments can stay as is, to update DECL_CONTEXT on them.  While for
> mi thunks it doesn't really matter because we don't use those arguments
> in any way, for other thunks it is important.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.
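
For readers unfamiliar with the idiom, the loop in the quoted patch
walks the argument chain through a pointer to the link itself, so each
node can be replaced in place.  A minimal sketch with a hypothetical
parm_sketch type (plain new stands in for copy_node; none of this is
GCC's tree API):

```cpp
#include <cassert>

// Hypothetical stand-in for a PARM_DECL: an id, an owning context and
// a link to the next parameter.
struct parm_sketch
{
  int id;
  const void *context;
  parm_sketch *chain;
};

// Replace every node reachable from *head by a fresh copy whose context
// is new_context.  Order is preserved and the original nodes are left
// untouched, so the old owner keeps a consistent chain.
inline void
copy_chain (parm_sketch **head, const void *new_context)
{
  for (parm_sketch **arg = head; *arg; arg = &(*arg)->chain)
    {
      parm_sketch *next = (*arg)->chain;
      *arg = new parm_sketch (**arg);	/* stands in for copy_node  */
      (*arg)->context = new_context;
      (*arg)->chain = next;		/* still the original next node  */
    }
}
```

After the loop every link points at a copy, because the next
iteration overwrites the chain field that was just set to the
original successor.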

> 2023-02-23  Jakub Jelinek  
> 
>   PR middle-end/108854
>   * cgraphclones.cc (duplicate_thunk_for_node): If no parameter
>   changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
>   nodes and adjust their DECL_CONTEXT.
> 
>   * g++.dg/opt/pr108854.C: New test.
> 
> --- gcc/cgraphclones.cc.jj	2023-02-22 20:50:27.417519830 +0100
> +++ gcc/cgraphclones.cc   2023-02-23 17:12:59.875133883 +0100
> @@ -218,7 +218,17 @@ duplicate_thunk_for_node (cgraph_node *t
>body_adj.modify_formal_parameters ();
>  }
>else
> -new_decl = copy_node (thunk->decl);
> +{
> +  new_decl = copy_node (thunk->decl);
> +  for (tree *arg = &DECL_ARGUMENTS (new_decl);
> +*arg; arg = &DECL_CHAIN (*arg))
> + {
> +   tree next = DECL_CHAIN (*arg);
> +   *arg = copy_node (*arg);
> +   DECL_CONTEXT (*arg) = new_decl;
> +   DECL_CHAIN (*arg) = next;
> + }
> +}
>  
>gcc_checking_assert (!DECL_STRUCT_FUNCTION (new_decl));
>gcc_checking_assert (!DECL_INITIAL (new_decl));
> --- gcc/testsuite/g++.dg/opt/pr108854.C.jj	2023-02-23 17:11:19.275583506 +0100
> +++ gcc/testsuite/g++.dg/opt/pr108854.C	2023-02-23 17:11:02.723822009 +0100
> @@ -0,0 +1,37 @@
> +// PR middle-end/108854
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O3" }
> +// { dg-additional-options "-fPIC" { target fpic } }
> +
> +struct A { A (int); ~A (); };
> +struct B { B (int, bool); ~B (); };
> +template 
> +struct C { void m1 (T); void m2 (T &&); };
> +class D;
> +struct E { virtual void m3 (); };
> +template 
> +struct F { virtual bool m4 (D &); };
> +struct D { virtual D m5 () { return D (); } };
> +void foo (void *, void *);
> +struct G {
> +  int a;
> +  C  b;
> +  void m4 (D ) { B l (a, true); r.m5 (); b.m1 (); b.m2 (); }
> +};
> +struct H : E, F  {
> +  template 
> +  H (int, T);
> +  bool m4 (D ) { A l (a); b.m4 (r); if (c) return true; } // { dg-warning 
> "control reaches end of non-void function" }
> +  int a;
> +  bool c;
> +  G b;
> +};
> +inline void bar (F  ) { D s, t; p.m4 (t); foo (, ); }
> +enum I { I1, I2 };
> +template 
> +struct J;
> +template 
> +void baz () { int g = 0, h = 0; T i (g, h); bar (i); }
> +template 
> +void qux () { baz > (); }
> +void corge () { qux  (); qux  (); }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
My impression is that md patterns will use first-match patterns? so
the zba will get higher priority than xtheadba if both patterns are
matched?

On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
 wrote:
>
> On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This patch adds support for the XTheadBa ISA extension.
> > The new INSN pattern is defined in a new file to separate
> > this vendor extension from the standard extensions.
>
> How does this interact with doing -march=rv32gc_xtheadba_zba ?
> Seems like it might be better to handle that case correctly. I suspect
> all these XTheadB* extensions have a similar problem too.
>
> Thanks,
> Andrew Pinski
>
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.md: Include thead.md
> > * config/riscv/thead.md: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/xtheadba-addsl.c: New test.
> >
> > Changes in v3:
> > - Fix operand order for th.addsl.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  gcc/config/riscv/riscv.md |  1 +
> >  gcc/config/riscv/thead.md | 31 +++
> >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> >  3 files changed, 87 insertions(+)
> >  create mode 100644 gcc/config/riscv/thead.md
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index 05924e9bbf1..d6c2265e9d4 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -3093,4 +3093,5 @@ (define_insn "riscv_prefetchi_"
> >  (include "pic.md")
> >  (include "generic.md")
> >  (include "sifive-7.md")
> > +(include "thead.md")
> >  (include "vector.md")
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > new file mode 100644
> > index 000..158e9124c3a
> > --- /dev/null
> > +++ b/gcc/config/riscv/thead.md
> > @@ -0,0 +1,31 @@
> > +;; Machine description for T-Head vendor extensions
> > +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> > +
> > +;; This file is part of GCC.
> > +
> > +;; GCC is free software; you can redistribute it and/or modify
> > +;; it under the terms of the GNU General Public License as published by
> > +;; the Free Software Foundation; either version 3, or (at your option)
> > +;; any later version.
> > +
> > +;; GCC is distributed in the hope that it will be useful,
> > +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +;; GNU General Public License for more details.
> > +
> > +;; You should have received a copy of the GNU General Public License
> > +;; along with GCC; see the file COPYING3.  If not see
> > +;; .
> > +
> > +;; XTheadBa
> > +
> > +(define_insn "*th_addsl"
> > +  [(set (match_operand:X 0 "register_operand" "=r")
> > +   (plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
> > + (match_operand:QI 2 "immediate_operand" "I"))
> > +   (match_operand:X 3 "register_operand" "r")))]
> > +  "TARGET_XTHEADBA
> > +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"
> > +  "th.addsl\t%0,%3,%1,%2"
> > +  [(set_attr "type" "bitmanip")
> > +   (set_attr "mode" "")])
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > new file mode 100644
> > index 000..5004735a246
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > @@ -0,0 +1,55 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32gc_xtheadba" { target { rv32 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadba" { target { rv64 } } } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" } } */
> > +
> > +long
> > +test_1 (long a, long b)
> > +{
> > +  /* th.addsl aX, aX, 1  */
> > +  return a + (b << 1);
> > +}
> > +
> > +int
> > +foos (short *x, int n)
> > +{
> > +  /* th.addsl aX, aX, 1  */
> > +  return x[n];
> > +}
> > +
> > +long
> > +test_2 (long a, long b)
> > +{
> > +  /* th.addsl aX, aX, 2  */
> > +  return a + (b << 2);
> > +}
> > +
> > +int
> > +fooi (int *x, int n)
> > +{
> > +  /* th.addsl aX, aX, 2  */
> > +  return x[n];
> > +}
> > +
> > +long
> > +test_3 (long a, long b)
> > +{
> > +  /* th.addsl aX, aX, 3  */
> > +  return a + (b << 3);
> > +}
> > +
> > +long
> > +fool (long *x, int n)
> > +{
> > +  /* th.addsl aX, aX, 2 (rv32)  */
> > +  /* th.addsl aX, aX, 3 (rv64)  */
> > +  return x[n];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "th.addsl\[ \t\]*a\[0-9\]+,a\[0-9\]+,a\[0-9\]+,1" 2 } } */
> > +
> > +/* { dg-final { scan-assembler-times "th.addsl\[ \t\]*a\[0-9\]+,a\[0-9\]+,a\[0-9\]+,2" 3 { target { rv32 } } } } */
> > +/* { dg-final { scan-assembler-times "th.addsl\[ \t\]*a\[0-9\]+,a\[0-9\]+,a\[0-9\]+,2" 2 { target { rv64 } } } } */
> > +

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 09:50:33AM +, Jonathan Wakely wrote:
> On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote:
> >
> > Assuming a compiler handles the T m_vecdata[1]; as flexible array member
> > like (which we need because standard C++ doesn't have flexible array members
> > nor [0] arrays), I wonder if we instead of the m_auto followed by m_data
> > trick couldn't make auto_vec have
> > alignas(vec) unsigned char buf m_data[sizeof (vec) + (N 
> > - 1) * sizeof (T)];
> > and do a placement new of vec into that m_data during auto_vec
> > construction.  Isn't that then similar to how flexible array members are
> > normally used in C, where one uses malloc or alloca to allocate storage
> > for them and the storage can be larger than the structure itself and
> > the flexible array member can then use the storage after it?
> 
> You would still be accessing past the end of the
> vec::m_vecdata array which is UB.

Pedantically sure, but because C++ doesn't have flexible array members,
people in the wild use the flexible array member like arrays for that
purpose.
If there was T m_vecdata[];, would it still be UB (with the flexible
array member extensions)?
We could use T m_vecdata[]; if the host compiler supports them and
T m_vecdata[1]; otherwise in the hope that the compiler handles it
similarly.  After all, I think lots of other real-world programs do the
same.
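
A rough sketch of the byte-buffer idea, under stated assumptions: it is
a variation rather than the exact proposal, because it computes the
element address from the start of the buffer instead of indexing a
one-element array, which is the part Jonathan objects to.
header_sketch and buffer_vec_sketch are hypothetical names and not
GCC's vec.h; strictly conforming C++17 code may additionally want
std::launder on the element pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical header, standing in for the vector prefix.
struct header_sketch
{
  unsigned m_alloc;
  unsigned m_num;
};

template <typename T, std::size_t N>
class buffer_vec_sketch
{
  // Offset of the first element: header size rounded up to alignof (T).
  static constexpr std::size_t data_offset
    = (sizeof (header_sketch) + alignof (T) - 1) / alignof (T) * alignof (T);

  // One buffer holds the header followed by room for N elements.
  alignas (header_sketch) alignas (T)
    unsigned char m_buf[data_offset + N * sizeof (T)];
  header_sketch *m_hdr;

public:
  buffer_vec_sketch () : m_hdr (new (m_buf) header_sketch ())
  {
    m_hdr->m_alloc = static_cast<unsigned> (N);
    m_hdr->m_num = 0;
  }
  buffer_vec_sketch (const buffer_vec_sketch &) = delete;
  buffer_vec_sketch &operator= (const buffer_vec_sketch &) = delete;
  ~buffer_vec_sketch ()
  {
    for (unsigned i = 0; i < m_hdr->m_num; i++)
      address ()[i].~T ();
  }

  // Elements live right after the (padded) header inside m_buf.
  T *address () { return reinterpret_cast<T *> (m_buf + data_offset); }
  unsigned length () const { return m_hdr->m_num; }

  void quick_push (const T &obj)
  {
    assert (m_hdr->m_num < m_hdr->m_alloc);
    new (m_buf + data_offset + m_hdr->m_num * sizeof (T)) T (obj);
    m_hdr->m_num++;
  }

  T &operator[] (unsigned ix)
  {
    assert (ix < m_hdr->m_num);
    return address ()[ix];
  }
};
```

Whether the middle-end treats this any better than the m_auto/m_data
layout is not something the sketch can show; it only illustrates the
storage arrangement.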

Jakub



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 09:55:13AM +, Jonathan Wakely wrote:
> > You would still be accessing past the end of the
> > vec::m_vecdata array which is UB.
> 
> My thinking is something like:
> 
> // New tag type
> struct vl_relative { };
> 
> // This must only be used as a member subobject of another type
> // which provides the trailing storage.
> template
> struct vec
> {
>   T *address (void) { return (T*)(m_vecpfx+1); }
>   const T *address (void) const { return (T*)(m_vecpfx+1); }
> 
>   alignas(T) alignas(vec_prefix) vec_prefix m_vecpfx;
> };
> 
> template
> class auto_vec : public vec
> {
>   // ...
> private:
>   vec m_head;
>   T m_data[N];
> 
> static_assert(...);
> };

Maybe this would work, vl_relative even could be vl_embed.
Because vl_embed I believe is used in two spots, part of
auto_vec where it is followed by m_data and on heap or GGC
allocated memory where vec<..., vl_embed> is followed by
further storage for the vector.

Jakub



Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 11:05 AM Christoph Müllner
 wrote:
>
> On Fri, Feb 24, 2023 at 10:54 AM Kito Cheng  wrote:
> >
> > My impression is that md patterns will use first-match patterns? so
> > the zba will get higher priority than xtheadba if both patterns are
> > matched?
>
> Yes, I was just about to write this.
>
> /opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
> -march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
> ./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
>
> The resulting xtheadba-addsl.s file has:
> .attribute arch, 
> "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zba1p0_xtheadba1p0"
> [...]
> sh1add  a0,a1,a0
>
> So the standard extension will be preferred over the custom extension.

I tested now with all of them (RV32 and RV64):

/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu-2.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-srri.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbs -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu-2.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-srri.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbs -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

All behave ok (also when dropping the xtheadb* from the -march).
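
For context on why either instruction can be selected for the same
source, here is a small model of the two operations.  The function
names are made up for illustration; the semantics are taken from the
th_addsl pattern in this patch and from the Zba shNadd definition.

```cpp
#include <cassert>
#include <cstdint>

// Zba: shNadd rd, rs1, rs2 computes rs2 + (rs1 << N), N in 1..3.
inline std::uint64_t
shnadd (unsigned n, std::uint64_t rs1, std::uint64_t rs2)
{
  return rs2 + (rs1 << n);
}

// XTheadBa: th.addsl rd, rs1, rs2, imm2 computes rs1 + (rs2 << imm2),
// with imm2 in 0..3 as enforced by the insn condition.
inline std::uint64_t
th_addsl (std::uint64_t rs1, std::uint64_t rs2, unsigned imm2)
{
  return rs1 + (rs2 << imm2);
}

// The source-level expression from test_1 in xtheadba-addsl.c,
// written with unsigned types to keep the model free of overflow UB.
inline std::uint64_t
test_1_model (std::uint64_t a, std::uint64_t b)
{
  return a + (b << 1);
}
```

Both instructions implement the same shift-and-add, only with the
operands in a different order, so which one is emitted depends on
which pattern the compiler matches first.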

Is it ok to leave this as is?

Thanks,
Christoph

>
>
> >
> > On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
> >  wrote:
> > >
> > > On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
> > >  wrote:
> > > >
> > > > From: Christoph Müllner 
> > > >
> > > > This patch adds support for the XTheadBa ISA extension.
> > > > The new INSN pattern is defined in a new file to separate
> > > > this vendor extension from the standard extensions.
> > >
> > > How does this interact with doing -march=rv32gc_xtheadba_zba ?
> > > > Seems like it might be better to handle that case correctly. I suspect
> > > > all these XTheadB* extensions have a similar problem too.
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/riscv/riscv.md: Include thead.md
> > > > * config/riscv/thead.md: New file.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/riscv/xtheadba-addsl.c: New test.
> > > >
> > > > Changes in v3:
> > > > - Fix operand order for th.addsl.
> > > >
> > > > Signed-off-by: Christoph Müllner 
> > > > ---
> > > >  gcc/config/riscv/riscv.md |  1 +
> > > >  gcc/config/riscv/thead.md | 31 +++
> > > >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> > > >  3 files changed, 87 insertions(+)
> > > >  create mode 100644 gcc/config/riscv/thead.md
> > > >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > > >
> > > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > > > index 05924e9bbf1..d6c2265e9d4 100644
> > > > --- a/gcc/config/riscv/riscv.md
> > > > +++ b/gcc/config/riscv/riscv.md
> > > > @@ 

[Bug c/63357] Warn for P && P and P || P (same expression used multiple times in a condition)

2023-02-24 Thread manu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63357

--- Comment #9 from Manuel López-Ibáñez  ---
(In reply to David Binderman from comment #8)
> This could probably be extended to other operators.

Please open a new PR mentioning this one.

Re: [PATCH v3 00/11] RISC-V: Add XThead* extension support

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 9:09 AM Kito Cheng  wrote:
>
> Hi Christoph:
>
> OK for trunk for the 1~8, feel free to commit 1~8 after you address
> those minor comments, and could you also prepare release notes for
> those extensions?

I addressed the comment regarding XTheadBs.
But I have not done anything regarding XTheadB* and Zb*.

Release notes patch can be found here:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612763.html

> And 9~11 needs to take a few more rounds of review and test.

I've seen the comments regarding patch 10 and 11.
We will try to clean this up asap.

In the patch for XTheadMemPair there was this nasty typo in one of the tests,
is there anything else that is needed?
I believe that patch should be in a better shape than the last two patches
and it is much less invasive.
Further similar code can be found in other backends.

Thanks,
Christoph

>
>
>
>
> On Fri, Feb 24, 2023 at 1:52 PM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This series introduces support for the T-Head specific RISC-V ISA extensions
> > which are available e.g. on the T-Head XuanTie C906.
> >
> > The ISA spec can be found here:
> >   https://github.com/T-head-Semi/thead-extension-spec
> >
> > This series adds support for the following XThead* extensions:
> > * XTheadBa
> > * XTheadBb
> > * XTheadBs
> > * XTheadCmo
> > * XTheadCondMov
> > * XTheadFMemIdx
> > * XTheadFmv
> > * XTheadInt
> > * XTheadMac
> > * XTheadMemIdx
> > * XTheadMemPair
> > * XTheadSync
> >
> > All extensions are properly integrated and the included tests
> > demonstrate the improvements of the generated code.
> >
> > The series also introduces support for "-mcpu=thead-c906", which also
> > enables all available XThead* ISA extensions of the T-Head C906.
> >
> > All patches have been tested and don't introduce regressions for RV32 or 
> > RV64.
> > The patches have also been tested with SPEC CPU2017 on QEMU and real HW
> > (D1 board).
> >
> > Support patches for these extensions for Binutils, QEMU, and LLVM have
> > already been merged in the corresponding upstream projects.
> >
> > Changes in v3:
> > - Bugfix in XTheadBa
> > - Rewrite of XTheadMemPair
> > - Inclusion of XTheadMemIdx and XTheadFMemIdx
> >
> > Christoph Müllner (9):
> >   riscv: Add basic XThead* vendor extension support
> >   riscv: riscv-cores.def: Add T-Head XuanTie C906
> >   riscv: thead: Add support for the XTheadBa ISA extension
> >   riscv: thead: Add support for the XTheadBs ISA extension
> >   riscv: thead: Add support for the XTheadBb ISA extension
> >   riscv: thead: Add support for the XTheadCondMov ISA extensions
> >   riscv: thead: Add support for the XTheadMac ISA extension
> >   riscv: thead: Add support for the XTheadFmv ISA extension
> >   riscv: thead: Add support for the XTheadMemPair ISA extension
> >
> > moiz.hussain (2):
> >   riscv: thead: Add support for the XTheadMemIdx ISA extension
> >   riscv: thead: Add support for the XTheadFMemIdx ISA extension
> >
> >  gcc/common/config/riscv/riscv-common.cc   |   26 +
> >  gcc/config/riscv/bitmanip.md  |   52 +-
> >  gcc/config/riscv/constraints.md   |   43 +
> >  gcc/config/riscv/iterators.md |4 +
> >  gcc/config/riscv/peephole.md  |   56 +
> >  gcc/config/riscv/riscv-cores.def  |4 +
> >  gcc/config/riscv/riscv-opts.h |   29 +
> >  gcc/config/riscv/riscv-protos.h   |   28 +-
> >  gcc/config/riscv/riscv.cc | 1090 +++--
> >  gcc/config/riscv/riscv.h  |8 +-
> >  gcc/config/riscv/riscv.md |  169 ++-
> >  gcc/config/riscv/riscv.opt|3 +
> >  gcc/config/riscv/thead.md |  351 ++
> >  .../gcc.target/riscv/mcpu-thead-c906.c|   28 +
> >  .../gcc.target/riscv/xtheadba-addsl.c |   55 +
> >  gcc/testsuite/gcc.target/riscv/xtheadba.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c |   20 +
> >  .../gcc.target/riscv/xtheadbb-extu-2.c|   22 +
> >  .../gcc.target/riscv/xtheadbb-extu.c  |   22 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c |   18 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c |   45 +
> >  .../gcc.target/riscv/xtheadbb-srri.c  |   21 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c |   13 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbs.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadcmo.c|   14 +
> >  .../riscv/xtheadcondmov-mveqz-imm-eqz.c   |   38 +
> >  .../riscv/xtheadcondmov-mveqz-imm-not.c   |   38 +
> >  .../riscv/xtheadcondmov-mveqz-reg-eqz.c   |   38 +
> >  .../riscv/xtheadcondmov-mveqz-reg-not.c   |   38 +
> >  .../riscv/xtheadcondmov-mvnez-imm-cond.c  |   38 +
> >  .../riscv/xtheadcondmov-mvnez-imm-nez.c   |   38 +
> >  .../riscv/xtheadcondmov-mvnez-reg-cond.c  |   

Re: [PATCH 1/2] Change vec<, , vl_embed>::m_vecdata references into address ()

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:32:45PM +0100, Richard Biener via Gcc-patches wrote:
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -614,7 +614,7 @@ public:
>T *bsearch (const void *key, int (*compar)(const void *, const void *));
>T *bsearch (const void *key,
> int (*compar)(const void *, const void *, void *), void *);
> -  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
> +  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;

Missing space after (*) while you're there.

> @@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
>  {
>if (ix < m_vecpfx.m_num)
>  {
> -  *ptr = m_vecdata[ix];
> +  *ptr = address()[ix];

Missing space before ().

> @@ -1118,7 +1118,7 @@ inline void
>  vec::unordered_remove (unsigned ix)
>  {
>gcc_checking_assert (ix < length ());
> -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> +  address ()[ix] = address ()[--m_vecpfx.m_num];
>  }

As address () is used twice here, can't we stick it into a temporary
and use twice then?

> @@ -1249,8 +1249,11 @@ vec::contains (const T &search) const
>  {
>unsigned int len = length ();
>for (unsigned int i = 0; i < len; i++)
> -if ((*this)[i] == search)
> -  return true;
> +{
> +  const T *slot = &address ()[i];
> +  if (*slot == search)
> + return true;

Similarly, can't we do address () once before the loop into a temporary?

>  template
>  unsigned
> -vec::lower_bound (T obj, bool (*lessthan)(const T &, const T &))
> +vec::lower_bound (const T &obj,
> +   bool (*lessthan)(const T &, const T &))

) ( while you're at it.

Otherwise LGTM.
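
The two temporaries suggested above could look roughly like this.  It
is only a sketch on a hypothetical vec_view_sketch type that wraps a
plain array, not the actual vec.h code:

```cpp
#include <cassert>

// Hypothetical minimal container: a base pointer plus an element count.
template <typename T>
struct vec_view_sketch
{
  T *m_base;
  unsigned m_num;

  T *address () { return m_base; }
  const T *address () const { return m_base; }
  unsigned length () const { return m_num; }

  // address () is evaluated once, before the loop.
  bool contains (const T &search) const
  {
    const T *base = address ();
    unsigned len = length ();
    for (unsigned i = 0; i < len; i++)
      if (base[i] == search)
	return true;
    return false;
  }

  // address () is evaluated once and reused for both accesses.
  void unordered_remove (unsigned ix)
  {
    assert (ix < length ());
    T *base = address ();
    base[ix] = base[--m_num];
  }
};
```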

Jakub



[Bug ipa/108695] [13 Regression] Wrong code since r13-5215-gb1f30bf42d8d47 for dd_rescue package

2023-02-24 Thread kurt at garloff dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108695

--- Comment #18 from Kurt Garloff  ---
dd_rescue-1.99.13 has been released including the fix to XORN.
(Fix uses uint* casts rather than uchar*.)

[PATCH] use subreg for movsf_from_si and remove UNSPEC_SF_FROM_SI

2023-02-24 Thread Jiufu Guo via Gcc-patches
Hi,

In patch https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612168.html,
we improved the bitcast from the lowpart/highpart of DI to SF by using mtvsrws
or mtvsrd.

While investigating this functionality, I found that we can improve the
related code by using a bitcast subreg from SI to SF, and avoid generating
UNSPEC_SF_FROM_SI.

We can also improve cases like "subreg:SI(reg:SF)=reg:SI", which casts
SI to SF (e.g. pr48335-1.c).

This patch also reduces clobber usage, adding a clobber only for p8, where an
additional register is required.

This patch passes bootstrap and regtest on ppc64 (p7, p8 and p9) and
ppc64le (p10, p9).

Is this patch ok for trunk (or maybe stage1)? Thanks for comments and
suggestions!


BR,
Jeff (Jiufu)

gcc/ChangeLog:

* config/rs6000/predicates.md: Rename TARGET_NO_SF_SUBREG to
BITCAST_SI_SF_IN_REGS, and rename TARGET_ALLOW_SF_SUBREG to
BITCAST_SI_SF_IN_MEM.
* config/rs6000/rs6000.cc (valid_sf_si_move): Likewise.
(is_lfs_stfs_insn): Split to is_stfs_insn and is_lfs_insn.
(is_stfs_insn): Split from is_lfs_stfs_insn.
(is_lfs_insn): Split from is_lfs_stfs_insn.
(prefixed_load_p): Call is_lfs_insn.
(prefixed_store_p): Call is_stfs_insn.
* config/rs6000/rs6000.h (TARGET_NO_SF_SUBREG): Rename to ...
(BITCAST_SI_SF_IN_REGS): ... this.
(TARGET_ALLOW_SF_SUBREG): Rename to ...
(BITCAST_SI_SF_IN_MEM): ... this.
* config/rs6000/rs6000.md (movsf_from_si_p8): New define_insn.

---
 gcc/config/rs6000/predicates.md | 16 +++---
 gcc/config/rs6000/rs6000.cc | 36 
 gcc/config/rs6000/rs6000.h  |  4 +-
 gcc/config/rs6000/rs6000.md | 98 +
 4 files changed, 97 insertions(+), 57 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index e57c9d99c6b..4a7d5893126 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -47,7 +47,7 @@ (define_predicate "sf_subreg_operand"
   rtx inner_reg = SUBREG_REG (op);
   machine_mode inner_mode = GET_MODE (inner_reg);
 
-  if (TARGET_ALLOW_SF_SUBREG || !REG_P (inner_reg))
+  if (BITCAST_SI_SF_IN_MEM || !REG_P (inner_reg))
 return 0;
 
   if ((mode == SFmode && GET_MODE_CLASS (inner_mode) == MODE_INT)
@@ -67,7 +67,7 @@ (define_predicate "altivec_register_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -88,7 +88,7 @@ (define_predicate "vsx_register_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -126,7 +126,7 @@ (define_predicate "vfloat_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -148,7 +148,7 @@ (define_predicate "vint_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -170,7 +170,7 @@ (define_predicate "vlogical_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -346,7 +346,7 @@ (define_predicate "gpc_reg_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -375,7 +375,7 @@ (define_predicate "int_reg_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 16ca3a31757..b8a9f01cbfa 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10565,7 +10565,7 @@ rs6000_emit_le_vsx_move (rtx dest, rtx source, machine_mode mode)
 bool
 valid_sf_si_move (rtx dest, rtx src, machine_mode mode)
 {
-  if (TARGET_ALLOW_SF_SUBREG)
+  if (BITCAST_SI_SF_IN_MEM)
 return true;
 
   if (mode != SFmode && GET_MODE_CLASS (mode) != MODE_INT)
@@ -26425,13 +26425,10 @@ pcrel_opt_valid_mem_p (rtx reg, machine_mode mode, rtx mem)
- stfs:
 - SET is from UNSPEC_SI_FROM_SF to MEM:SI
 - CLOBBER is a V4SF
-   - lfs:
-- SET is from UNSPEC_SF_FROM_SI to REG:SF
-- CLOBBER is a DI
  */
 
 static bool
-is_lfs_stfs_insn (rtx_insn *insn)
+is_stfs_insn (rtx_insn *insn)
 {
   rtx 

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Ajit Agarwal via Gcc-patches
Hello All:

Here is the patch that uses xxlor instead of fmr where possible.
Performance results show that fmr is better on power9 and
power10 architectures, whereas xxlor is better on power7 and
power8 architectures.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit

rs6000: Use xxlor instead of fmr where possible

This patch replaces fmr with the xxlor instruction for power7
and power8 architectures, whereas for power9 and power10 it
replaces xxlor with the fmr instruction.

Perf measurement results:

Power9 fmr:  201,847,661 cycles.
Power9 xxlor: 201,877,78 cycles.
Power8 fmr: 201,057,795 cycles.
Power8 xxlor: 201,004,671 cycles.

2023-02-24  Ajit Kumar Agarwal  

gcc/ChangeLog:

* config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
for power7 and power8 and fmr for power9 and power10.
---
 gcc/config/rs6000/rs6000.md | 46 +++--
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 81bffb04ceb..1253b8622a7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -354,7 +354,7 @@ (define_attr "cpu"
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
 
 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
+(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"
   (const_string "any"))
 
 ;; Is this alternative enabled for the current CPU/ISA/etc.?
@@ -402,6 +402,11 @@ (define_attr "enabled" ""
  (and (eq_attr "isa" "p10")
  (match_test "TARGET_POWER10"))
  (const_int 1)
+  
+ (and (eq_attr "isa" "p7p8")
+ (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
+ (const_int 1)
+
 ] (const_int 0)))
 
 ;; If this instruction is microcoded on the CELL processor
@@ -8436,27 +8441,29 @@ (define_insn "*mov_softfloat32"
 
 (define_insn "*mov_hardfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
-   "=m,   d,  d,  ,   wY,
- ,Z,  ,  ,  !r,
- YZ,  r,  !r, *c*l,   !r,
-*h,   r,  ,   wa")
+   "=m,   d,  ,  ,   wY,
+, Z,  wa, ,  !r,
+YZ,   r,  !r, *c*l,   !r,
+*h,   r,  ,   d,  wn,
+wa")
(match_operand:FMOVE64 1 "input_operand"
-"d,   m,  d,  wY, ,
- Z,   ,   ,  ,  ,
+"d,   m,  ,  wY, ,
+ Z,   ,   wa, ,  ,
  r,   YZ, r,  r,  *h,
- 0,   ,   r,  eP"))]
+ 0,   ,   r,  d,  wn,
+ eP"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], mode)
|| gpc_reg_operand (operands[1], mode))"
   "@
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
-   fmr %0,%1
+   xxlor %x0,%x1,%x1
lxsd %0,%1
stxsd %1,%0
lxsdx %x0,%y1
stxsdx %x1,%y0
-   xxlor %x0,%x1,%x1
+   fmr %0,%1
xxlxor %x0,%x0,%x0
li %0,0
std%U0%X0 %1,%0
@@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
nop
mfvsrd %0,%x1
mtvsrd %x0,%1
+   fmr %0,%1
+   fmr %0,%1
#"
   [(set_attr "type"
-"fpstore, fpload, fpsimple,   fpload, fpstore,
+"fpstore, fpload, veclogical, fpload, fpstore,
  fpload,  fpstore,veclogical, veclogical, integer,
  store,   load,   *,  mtjmpr, mfjmpr,
- *,   mfvsr,  mtvsr,  vecperm")
+ *,   mfvsr,  mtvsr,  fpsimple,   fpsimple,
+ vecperm")
(set_attr "size" "64")
(set_attr "isa"
-"*,   *,  *,  p9v,p9v,
- p7v, p7v,*,  *,  *,
- *,   *,  *,  *,  *,
- *,   p8v,p8v,p10")
+"*,   *,  p7p8,p9v,p9v,
+ p7v, p7v,*,   *,  *,
+ *,   *,  *,   *,  *,
+ *,   p8v,p8v, *,  *,
+ p10")
(set_attr "prefixed"
 "*,   *,  *,  *,  *,
  *,   *,  *,  *,  *,
  *,   *,  *,  *,  *,
- *,   *,  *,  *")])
+ *,   *,  *,  *,  *,
+ *")])
 
 ;;   STD  LD   MR  MT MF G-const
 ;;   H-const  F-const  

Re: [PATCH v3 00/11] RISC-V: Add XThead* extension support

2023-02-24 Thread Kito Cheng via Gcc-patches
Hi Christoph:

Patches 1~8 are OK for trunk; feel free to commit them after you address
those minor comments. Could you also prepare release notes for
those extensions?

Patches 9~11 need a few more rounds of review and testing.




On Fri, Feb 24, 2023 at 1:52 PM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> This series introduces support for the T-Head specific RISC-V ISA extensions
> which are available e.g. on the T-Head XuanTie C906.
>
> The ISA spec can be found here:
>   https://github.com/T-head-Semi/thead-extension-spec
>
> This series adds support for the following XThead* extensions:
> * XTheadBa
> * XTheadBb
> * XTheadBs
> * XTheadCmo
> * XTheadCondMov
> * XTheadFMemIdx
> * XTheadFmv
> * XTheadInt
> * XTheadMac
> * XTheadMemIdx
> * XTheadMemPair
> * XTheadSync
>
> All extensions are properly integrated and the included tests
> demonstrate the improvements of the generated code.
>
> The series also introduces support for "-mcpu=thead-c906", which also
> enables all available XThead* ISA extensions of the T-Head C906.
>
> All patches have been tested and don't introduce regressions for RV32 or RV64.
> The patches have also been tested with SPEC CPU2017 on QEMU and real HW
> (D1 board).
>
> Support patches for these extensions for Binutils, QEMU, and LLVM have
> already been merged in the corresponding upstream projects.
>
> Changes in v3:
> - Bugfix in XTheadBa
> - Rewrite of XTheadMemPair
> - Inclusion of XTheadMemIdx and XTheadFMemIdx
>
> Christoph Müllner (9):
>   riscv: Add basic XThead* vendor extension support
>   riscv: riscv-cores.def: Add T-Head XuanTie C906
>   riscv: thead: Add support for the XTheadBa ISA extension
>   riscv: thead: Add support for the XTheadBs ISA extension
>   riscv: thead: Add support for the XTheadBb ISA extension
>   riscv: thead: Add support for the XTheadCondMov ISA extensions
>   riscv: thead: Add support for the XTheadMac ISA extension
>   riscv: thead: Add support for the XTheadFmv ISA extension
>   riscv: thead: Add support for the XTheadMemPair ISA extension
>
> moiz.hussain (2):
>   riscv: thead: Add support for the XTheadMemIdx ISA extension
>   riscv: thead: Add support for the XTheadFMemIdx ISA extension
>
>  gcc/common/config/riscv/riscv-common.cc   |   26 +
>  gcc/config/riscv/bitmanip.md  |   52 +-
>  gcc/config/riscv/constraints.md   |   43 +
>  gcc/config/riscv/iterators.md |4 +
>  gcc/config/riscv/peephole.md  |   56 +
>  gcc/config/riscv/riscv-cores.def  |4 +
>  gcc/config/riscv/riscv-opts.h |   29 +
>  gcc/config/riscv/riscv-protos.h   |   28 +-
>  gcc/config/riscv/riscv.cc | 1090 +++--
>  gcc/config/riscv/riscv.h  |8 +-
>  gcc/config/riscv/riscv.md |  169 ++-
>  gcc/config/riscv/riscv.opt|3 +
>  gcc/config/riscv/thead.md |  351 ++
>  .../gcc.target/riscv/mcpu-thead-c906.c|   28 +
>  .../gcc.target/riscv/xtheadba-addsl.c |   55 +
>  gcc/testsuite/gcc.target/riscv/xtheadba.c |   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c |   20 +
>  .../gcc.target/riscv/xtheadbb-extu-2.c|   22 +
>  .../gcc.target/riscv/xtheadbb-extu.c  |   22 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c |   18 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c |   45 +
>  .../gcc.target/riscv/xtheadbb-srri.c  |   21 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb.c |   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c |   13 +
>  gcc/testsuite/gcc.target/riscv/xtheadbs.c |   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadcmo.c|   14 +
>  .../riscv/xtheadcondmov-mveqz-imm-eqz.c   |   38 +
>  .../riscv/xtheadcondmov-mveqz-imm-not.c   |   38 +
>  .../riscv/xtheadcondmov-mveqz-reg-eqz.c   |   38 +
>  .../riscv/xtheadcondmov-mveqz-reg-not.c   |   38 +
>  .../riscv/xtheadcondmov-mvnez-imm-cond.c  |   38 +
>  .../riscv/xtheadcondmov-mvnez-imm-nez.c   |   38 +
>  .../riscv/xtheadcondmov-mvnez-reg-cond.c  |   38 +
>  .../riscv/xtheadcondmov-mvnez-reg-nez.c   |   38 +
>  .../gcc.target/riscv/xtheadcondmov.c  |   14 +
>  .../riscv/xtheadfmemidx-fldr-fstr.c   |   58 +
>  .../gcc.target/riscv/xtheadfmemidx.c  |   14 +
>  .../gcc.target/riscv/xtheadfmv-fmv.c  |   24 +
>  gcc/testsuite/gcc.target/riscv/xtheadfmv.c|   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadint.c|   14 +
>  .../gcc.target/riscv/xtheadmac-mula-muls.c|   43 +
>  gcc/testsuite/gcc.target/riscv/xtheadmac.c|   14 +
>  .../gcc.target/riscv/xtheadmemidx-ldi-sdi.c   |   72 ++
>  .../riscv/xtheadmemidx-ldr-str-32.c   |   23 +
>  .../riscv/xtheadmemidx-ldr-str-64.c   |   53 +
>  .../gcc.target/riscv/xtheadmemidx-macros.h|  110 ++
>  gcc/testsuite/gcc.target/riscv/xtheadmemidx.c |   14 +
>  

[PATCH (pushed)] libsanitizer: cherry-pick commit 8f5962b1ccb5fcd4d4544121d43efb860ac3cc6d from upstream

2023-02-24 Thread Martin Liška
ASAN: keep support for Global::location

GCC still emits __asan_global_source_location for global variables
and we would like to use it in the future. On the other hand, we don't
support llvm-symbolizer and the default libbacktrace symbolizer
does not support location info.
---
 libsanitizer/asan/asan_globals.cpp  | 9 +
 libsanitizer/asan/asan_interface_internal.h | 7 ---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/libsanitizer/asan/asan_globals.cpp b/libsanitizer/asan/asan_globals.cpp
index 8f3491f0199..01a243927ca 100644
--- a/libsanitizer/asan/asan_globals.cpp
+++ b/libsanitizer/asan/asan_globals.cpp
@@ -92,6 +92,10 @@ static void ReportGlobal(const Global &g, const char *prefix) {
   if (info.line != 0) {
 Report("  location: name=%s, %d\n", info.file, static_cast<int>(info.line));
   }
+  else if (g.gcc_location != 0) {
+// Fallback to Global::gcc_location
+Report("  location: name=%s, %d\n", g.gcc_location->filename, g.gcc_location->line_no);
+  }
 }
 
 static u32 FindRegistrationSite(const Global *g) {
@@ -283,6 +287,11 @@ void PrintGlobalLocation(InternalScopedString *str, const __asan_global &g) {
 
   if (info.line != 0) {
 str->append("%s:%d", info.file, static_cast<int>(info.line));
+  } else if (g.gcc_location != 0) {
+// Fallback to Global::gcc_location
+str->append("%s", g.gcc_location->filename ? g.gcc_location->filename : g.module_name);
+if (g.gcc_location->line_no) str->append(":%d", g.gcc_location->line_no);
+if (g.gcc_location->column_no) str->append(":%d", g.gcc_location->column_no);
   } else {
 str->append("%s", g.module_name);
   }
diff --git a/libsanitizer/asan/asan_interface_internal.h b/libsanitizer/asan/asan_interface_internal.h
index 987f855c0f9..a9982637802 100644
--- a/libsanitizer/asan/asan_interface_internal.h
+++ b/libsanitizer/asan/asan_interface_internal.h
@@ -53,9 +53,10 @@ extern "C" {
 const char *module_name; // Module name as a C string. This pointer is a
  // unique identifier of a module.
 uptr has_dynamic_init;   // Non-zero if the global has dynamic initializer.
-uptr windows_padding;// TODO: Figure out how to remove this padding
- // that's simply here to make the MSVC incremental
- // linker happy...
+__asan_global_source_location *gcc_location;  // Source location of a global,
+  // used by GCC compiler. LLVM uses
+  // llvm-symbolizer that relies
+  // on DWARF debugging info.
 uptr odr_indicator;  // The address of the ODR indicator symbol.
   };
 
-- 
2.39.2



[Bug d/106977] [13 regression] d21 dies with SIGBUS on 32-bit Darwin

2023-02-24 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106977

--- Comment #24 from ibuclaw at gcc dot gnu.org ---
(In reply to Iain Sandoe from comment #23)
> So the ABIs differ in this (as noted on IRC, the Darwin 32b ABIs are not the
> same as Linux).
I have yet to work out why D on 32-bit Darwin behaves the same as 32-bit
Linux, though.  I would have assumed the decision to generate an sret would
occur long after the front-end has freed itself from the compilation process.

Regardless, the ABI issue can be raised in a separate PR.  Because of it, though,
for this bootstrap PR we just have to avoid calling any extern(C++)
method implemented in D that returns a struct by value.

Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:15:04PM +, Richard Biener wrote:
> > > Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> > > ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> > > now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> > > standard C.
> 
> I've removed the MAX (N, 2) now, I think that N == 0 cannot happen
> because we have a specialization covering that.  So we know N is
> at least 1.

I think you're right.

> > > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > > bootstrap that I got with my version or not.
> 
> Yes, I get this as well, not sure how to suppress it.  I guess there's
> no standard way to get at the address after some object without going
> through uintptr obfuscation - and obviously we do not want to have
> that (and if we optimize it away that doesn't help the diagnostic ...)

I think we need to look at the exact IL on which it warns and see what
our options are.

Jakub



Re: [PATCH 1/2] Change vec<, , vl_embed>::m_vecdata references into address ()

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jakub Jelinek wrote:

> On Fri, Feb 24, 2023 at 12:32:45PM +0100, Richard Biener via Gcc-patches wrote:
> > --- a/gcc/vec.h
> > +++ b/gcc/vec.h
> > @@ -614,7 +614,7 @@ public:
> >T *bsearch (const void *key, int (*compar)(const void *, const void *));
> >T *bsearch (const void *key,
> >   int (*compar)(const void *, const void *, void *), void *);
> > -  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
> > +  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;
> 
> Missing space after (*) while you're there.
> 
> > @@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
> >  {
> >if (ix < m_vecpfx.m_num)
> >  {
> > -  *ptr = m_vecdata[ix];
> > +  *ptr = address()[ix];
> 
> Missing space before ().
> 
> > @@ -1118,7 +1118,7 @@ inline void
> >  vec::unordered_remove (unsigned ix)
> >  {
> >gcc_checking_assert (ix < length ());
> > -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> > +  address ()[ix] = address ()[--m_vecpfx.m_num];
> >  }
> 
> As address () is used twice here, can't we stick it into a temporary
> and use twice then?
> 
> > @@ -1249,8 +1249,11 @@ vec::contains (const T ) const
> >  {
> >unsigned int len = length ();
> >for (unsigned int i = 0; i < len; i++)
> > -if ((*this)[i] == search)
> > -  return true;
> > +{
> > +  const T *slot = &address ()[i];
> > +  if (*slot == search)
> > +   return true;
> 
> Similarly, can't we do address () once before the loop into a temporary?
> 
> >  template
> >  unsigned
> > -vec::lower_bound (T obj, bool (*lessthan)(const T &, const T &))
> > +vec::lower_bound (const T &obj,
> > + bool (*lessthan)(const T &, const T &))
> 
> ) ( while you're at it.
> 
> Otherwise LGTM.

All fixed.

Richard.
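
The review comments above ask for address () to be evaluated once and reused.
A minimal sketch of that idea follows; small_vec and its members are
hypothetical stand-ins invented for illustration, not GCC's
vec<T, A, vl_embed>, and the committed code may look different.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an embedded vector: a length plus inline storage.
// This is not GCC's vec; all names are illustrative assumptions.
template <typename T, std::size_t N>
struct small_vec
{
  unsigned m_num = 0;
  T m_data[N];

  T *address () { return m_data; }
  const T *address () const { return m_data; }
  unsigned length () const { return m_num; }

  void push (const T &v)
  {
    assert (m_num < N);
    m_data[m_num++] = v;
  }

  // Fetch the base address once and reuse it for both accesses.
  void unordered_remove (unsigned ix)
  {
    assert (ix < m_num);
    T *p = address ();
    p[ix] = p[--m_num];
  }

  // Hoist address () out of the loop instead of calling it per iteration.
  bool contains (const T &search) const
  {
    const T *p = address ();
    unsigned len = length ();
    for (unsigned i = 0; i < len; i++)
      if (p[i] == search)
	return true;
    return false;
  }
};
```

With three elements 1, 2, 3 pushed, unordered_remove (0) moves the last
element into slot 0, so the vector then holds 3, 2 in that order.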


Re: [PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Jan Hubicka via Gcc-patches
> Hi!
> 
> The following testcase ICEs on x86_64-linux with -m32.  The problem is
> we create an artificial thunk and because of -fPIC, ia32 and thunk
> destination which doesn't bind locally can't use a mi thunk.
> The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
> but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
> This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
> only if some arguments need modification.
> 
> The following patch fixes it by copying the DECL_ARGUMENTS list even if
> the arguments can stay as is, to update DECL_CONTEXT on them.  While for
> mi thunks it doesn't really matter because we don't use those arguments
> in any way, for other thunks it is important.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2023-02-23  Jakub Jelinek  
> 
>   PR middle-end/108854
>   * cgraphclones.cc (duplicate_thunk_for_node): If no parameter
>   changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
>   nodes and adjust their DECL_CONTEXT.
> 
>   * g++.dg/opt/pr108854.C: New test.
> 
> --- gcc/cgraphclones.cc.jj2023-02-22 20:50:27.417519830 +0100
> +++ gcc/cgraphclones.cc   2023-02-23 17:12:59.875133883 +0100
> @@ -218,7 +218,17 @@ duplicate_thunk_for_node (cgraph_node *t
>body_adj.modify_formal_parameters ();
>  }
>else
> -new_decl = copy_node (thunk->decl);
> +{
> +  new_decl = copy_node (thunk->decl);
> +  for (tree *arg = &DECL_ARGUMENTS (new_decl);
> +*arg; arg = &DECL_CHAIN (*arg))
> + {
> +   tree next = DECL_CHAIN (*arg);
> +   *arg = copy_node (*arg);
> +   DECL_CONTEXT (*arg) = new_decl;
> +   DECL_CHAIN (*arg) = next;

This makes sense to me. I wonder if we don't want to update abstract
origin too like we do in tree-inline?
Maybe it is unnecessary since we don't do debug info for thunks.

Jan
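
The loop in the quoted patch walks the parameter chain through a pointer to
the link being rewritten, so each node can be replaced by a copy in place.
The sketch below shows the same pattern on a hypothetical node type; it is
not GCC's tree representation, and node, context and copy_chain are names
assumed for illustration only.

```cpp
#include <cassert>

// Hypothetical miniature of a parameter declaration; not GCC's tree type.
struct node
{
  int id;
  const void *context;  // which function owns this parameter
  node *chain;          // next parameter, or nullptr
};

// Replace every node reachable from *head with a fresh copy whose context
// is new_ctx.  'arg' always points at the link that refers to the current
// node, so assigning through it splices the copy into the new chain.
inline void copy_chain (node **head, const void *new_ctx)
{
  for (node **arg = head; *arg; arg = &(*arg)->chain)
    {
      node *next = (*arg)->chain;
      *arg = new node (**arg);   // copy; the original node is left untouched
      (*arg)->context = new_ctx;
      (*arg)->chain = next;
    }
}
```

The original chain stays intact for its owner, while the new head refers
only to copies that name the new owner as their context.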


[Bug tree-optimization/105329] [12/13 Regression] Bogus restrict warning when assigning 1-char string literal to std::string since r12-3347-g8af8abfbbace49e6

2023-02-24 Thread wielkiegie at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105329

Gustaw Smolarczyk  changed:

   What|Removed |Added

 CC||wielkiegie at gmail dot com

--- Comment #30 from Gustaw Smolarczyk  ---
I have encountered a similar bug, reproducer:


#include <string>

std::string foo()
{
return "_" + std::to_string(0);
}

Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 02:47:39PM +0100, Richard Biener wrote:
>   * vec.h (vec::m_vecdata): Remove.
>   (vec::m_vecpfx): Align as T to avoid
>   changing alignment of vec and simplifying
>   address.
>   (vec::address): Compute as this + 1.
>   (vec::embedded_size): Use sizeof the
>   vector instead of the offset of the m_vecdata member.
>   (auto_vec::m_data): Turn storage into
>   uninitialized unsigned char.
>   (auto_vec::auto_vec): Allow allocation of one
>   stack member.  Initialize m_vec in a special way to
>   avoid later stringop overflow diagnostics.
>   * vec.cc (test_auto_alias): New.
>   (vec_cc_tests): Call it.
> @@ -1559,8 +1560,14 @@ class auto_vec : public vec
>  public:
>auto_vec ()
>{
> -m_auto.embedded_init (MAX (N, 2), 0, 1);
> -this->m_vec = &m_auto;
> +m_auto.embedded_init (N, 0, 1);
> +/* ???  Instead of initializing m_vec from &m_auto directly use an
> +   expression that avoids refering to a specific member of 'this'
> +   to derail the -Wstringop-overflow diagnostic code, avoiding
> +   the impression that data accesses are supposed to be to the
> +   m_auto memmber storage.  */

s/memmber/member/

> +size_t off = (char *) &m_auto - (char *) this;
> +this->m_vec = (vec *) ((char *) this + off);
>}
>  
>auto_vec (size_t s CXX_MEM_STAT_INFO)
> @@ -1571,7 +1578,7 @@ public:
>   return;
>}
>  
> -m_auto.embedded_init (MAX (N, 2), 0, 1);
> +m_auto.embedded_init (N, 0, 1);
>  this->m_vec = &m_auto;

Don't we need the above 2 lines here as well (perhaps with a shorter comment
just referencing the earlier comment)?

Otherwise LGTM, thanks.

Jakub



Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jakub Jelinek wrote:

> On Fri, Feb 24, 2023 at 02:47:39PM +0100, Richard Biener wrote:
> > * vec.h (vec::m_vecdata): Remove.
> > (vec::m_vecpfx): Align as T to avoid
> > changing alignment of vec and simplifying
> > address.
> > (vec::address): Compute as this + 1.
> > (vec::embedded_size): Use sizeof the
> > vector instead of the offset of the m_vecdata member.
> > (auto_vec::m_data): Turn storage into
> > uninitialized unsigned char.
> > (auto_vec::auto_vec): Allow allocation of one
> > stack member.  Initialize m_vec in a special way to
> > avoid later stringop overflow diagnostics.
> > * vec.cc (test_auto_alias): New.
> > (vec_cc_tests): Call it.
> > @@ -1559,8 +1560,14 @@ class auto_vec : public vec
> >  public:
> >auto_vec ()
> >{
> > -m_auto.embedded_init (MAX (N, 2), 0, 1);
> > -this->m_vec = &m_auto;
> > +m_auto.embedded_init (N, 0, 1);
> > +/* ???  Instead of initializing m_vec from &m_auto directly use an
> > +   expression that avoids refering to a specific member of 'this'
> > +   to derail the -Wstringop-overflow diagnostic code, avoiding
> > +   the impression that data accesses are supposed to be to the
> > +   m_auto memmber storage.  */
> 
> s/memmber/member/
> 
> > +size_t off = (char *) &m_auto - (char *) this;
> > +this->m_vec = (vec *) ((char *) this + off);
> >}
> >  
> >auto_vec (size_t s CXX_MEM_STAT_INFO)
> > @@ -1571,7 +1578,7 @@ public:
> > return;
> >}
> >  
> > -m_auto.embedded_init (MAX (N, 2), 0, 1);
> > +m_auto.embedded_init (N, 0, 1);
> >  this->m_vec = &m_auto;
> 
> Don't we need the above 2 lines here as well (perhaps with a shorter comment
> just referencing the earlier comment)?

I noticed that as well and have put it there now; it wasn't necessary
to get bootstrap working.

> Otherwise LGTM, thanks.

Thanks,
Richard.


[committed 1/5] libstdc++: Optimize net::ip::address_v4::to_string()

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This is an order of magnitude faster than calling inet_ntop (and not
only because we now avoid allocating a string that is one byte larger
than the SSO buffer).

libstdc++-v3/ChangeLog:

* include/experimental/internet (address_v4::to_string):
Optimize.
* testsuite/experimental/net/internet/address/v4/members.cc:
Check more addresses.
---
 libstdc++-v3/include/experimental/internet| 28 +--
 .../net/internet/address/v4/members.cc| 11 
 2 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/experimental/internet b/libstdc++-v3/include/experimental/internet
index 707370d5611..08bd0db4bb2 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef _GLIBCXX_HAVE_UNISTD_H
 # include 
 #endif
@@ -241,17 +242,28 @@ namespace ip
   __string_with<_Allocator>
   to_string(const _Allocator& __a = _Allocator()) const
   {
-#ifdef _GLIBCXX_HAVE_ARPA_INET_H
+   auto __write = [__addr = to_uint()](char* __p, size_t __n) {
+ auto __to_chars = [](char* __p, uint8_t __v) {
+   unsigned __n = __v >= 100u ? 3 : __v >= 10u ? 2 : 1;
+   std::__detail::__to_chars_10_impl(__p, __n, __v);
+   return __p + __n;
+ };
+ const auto __begin = __p;
+ __p = __to_chars(__p, uint8_t(__addr >> 24));
+ for (int __i = 2; __i >= 0; __i--) {
+   *__p++ = '.';
+   __p = __to_chars(__p, uint8_t(__addr >> (__i * 8)));
+ }
+ return __p - __begin;
+   };
__string_with<_Allocator> __str(__a);
-   __str.resize(INET_ADDRSTRLEN);
-   if (inet_ntop(AF_INET, &_M_addr, &__str.front(), __str.size()))
- __str.erase(__str.find('\0'));
-   else
- __str.resize(0);
-   return __str;
+#if __cpp_lib_string_resize_and_overwrite
+   __str.resize_and_overwrite(15, __write);
 #else
-   std::__throw_system_error((int)__unsupported_err());
+   __str.resize(15);
+   __str.resize(__write(&__str.front(), 15));
 #endif
+   return __str;
   }
 
 // static members:
diff --git a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc
index df19b11804d..c40a8103664 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 using std::experimental::net::ip::address_v4;
 
@@ -100,6 +101,16 @@ test04()
   VERIFY( address_v4::any().to_string() == "0.0.0.0" );
   VERIFY( address_v4::loopback().to_string() == "127.0.0.1" );
   VERIFY( address_v4::broadcast().to_string() == "255.255.255.255" );
+  using b = address_v4::bytes_type;
+  VERIFY( address_v4(b(1, 23, 45, 67)).to_string() == "1.23.45.67" );
+  VERIFY( address_v4(b(12, 34, 56, 78)).to_string() == "12.34.56.78" );
+  VERIFY( address_v4(b(123, 4, 5, 6)).to_string() == "123.4.5.6" );
+  VERIFY( address_v4(b(123, 234, 124, 235)).to_string() == "123.234.124.235" );
+
+  __gnu_test::uneq_allocator alloc(123);
+  auto str = address_v4(b(12, 34, 56, 78)).to_string(alloc);
+  VERIFY(str.get_allocator().get_personality() == alloc.get_personality());
+  VERIFY( str == "12.34.56.78" );
 }
 
 void
-- 
2.39.2
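
The patch above formats the four octets directly into the string buffer
instead of calling inet_ntop.  A simplified sketch of the same idea in
portable C++ follows.  It does not use the libstdc++ internal helper
__to_chars_10_impl or resize_and_overwrite; write_octet and ipv4_to_string
are hypothetical names, and the address is assumed to be in host byte order.

```cpp
#include <cstdint>
#include <string>

// Write the decimal digits of v (0..255) at p and return one past the end.
inline char *write_octet (char *p, std::uint8_t v)
{
  if (v >= 100)
    *p++ = char ('0' + v / 100);
  if (v >= 10)
    *p++ = char ('0' + v / 10 % 10);
  *p++ = char ('0' + v % 10);
  return p;
}

// Format a host-order 32-bit IPv4 address as dotted decimal.
inline std::string ipv4_to_string (std::uint32_t addr)
{
  // "255.255.255.255" is 15 characters; no terminator is stored here.
  char buf[15];
  char *p = write_octet (buf, std::uint8_t (addr >> 24));
  for (int i = 2; i >= 0; i--)
    {
      *p++ = '.';
      p = write_octet (p, std::uint8_t (addr >> (i * 8)));
    }
  return std::string (buf, p);
}
```

For example, ipv4_to_string (0x7F000001u) yields "127.0.0.1".  The longest
possible result is 15 characters, which is why the patch resizes to 15.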



[committed 5/5] libstdc++: Constrain net::executor constructors

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The TS says the arguments to these constructors shall meet the Executor
requirements, so it's undefined if they don't. Constraining on a subset
of those requirements won't affect valid cases, but prevents the
majority of invalid cases from trying to instantiate the constructor.

This prevents the non-explicit executor(Executor) constructor being a
candidate anywhere that a net::executor could be constructed e.g.
comparing ip::tcp::v4() == ip::udp::v4() would try to convert both
operands to executor using that constructor, then compare them using
operator==(const executor&, const executor&).

libstdc++-v3/ChangeLog:

* include/experimental/executor (executor): Constrain template
constructors.
---
 libstdc++-v3/include/experimental/executor | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/experimental/executor b/libstdc++-v3/include/experimental/executor
index cd75d99ddb3..1dae8925916 100644
--- a/libstdc++-v3/include/experimental/executor
+++ b/libstdc++-v3/include/experimental/executor
@@ -1012,6 +1012,9 @@ inline namespace v1
 
   class executor
   {
+template
+  using _Context_t = decltype(std::declval<_Executor&>().context());
+
   public:
 // construct / copy / destroy:
 
@@ -1021,12 +1024,14 @@ inline namespace v1
 executor(const executor&) noexcept = default;
 executor(executor&&) noexcept = default;
 
-template
+template>>>
   executor(_Executor __e)
   : _M_target(make_shared<_Tgt1<_Executor>>(std::move(__e)))
   { }
 
-template
+template>>>
   executor(allocator_arg_t, const _ProtoAlloc& __a, _Executor __e)
   : _M_target(allocate_shared<_Tgt2<_Executor, _ProtoAlloc>>(__a,
std::move(__e), __a))
-- 
2.39.2
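
The commit message above describes constraining a non-explicit converting
constructor so that unrelated types no longer convert implicitly.  A minimal
sketch of that technique follows.  any_exec, with_context and protocol_like
are hypothetical names, and the constraint shown (only "has a context ()
member") is an assumption chosen for brevity, not the exact condition used
in the libstdc++ patch.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type-erased wrapper; not the Networking TS executor class.
class any_exec
{
  // Well-formed only if E has a callable context () member.
  template <typename E>
  using context_t = decltype (std::declval<E &> ().context ());

public:
  any_exec () = default;

  // The defaulted template argument fails substitution for types without
  // context (), which removes this constructor from overload resolution.
  template <typename E, typename = context_t<E>>
  any_exec (E) { }
};

struct with_context { int context () { return 0; } };  // executor-like
struct protocol_like { };                               // unrelated type

static_assert (std::is_convertible<with_context, any_exec>::value,
	       "executor-like types still convert");
static_assert (!std::is_convertible<protocol_like, any_exec>::value,
	       "unrelated types no longer convert");
```

Without the constraint, any type at all would convert to any_exec, so an
operator== taking two any_exec operands would become a candidate for
comparisons between unrelated types.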



[committed 3/5] libstdc++: Fix members of net::ip::network_v4

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/experimental/internet (network_v4::netmask()): Avoid
undefined shift.
(network_v4::broadcast()): Optimize and fix for targets with
uint_least32_t wider than 32 bits.
(network_v4::to_string(const Allocator&)): Fix for custom
allocators and optimize using to_chars.
(operator==(const network_v4&, const network_v4&)): Add missing
constexpr.
(operator==(const network_v6&, const network_v6&)): Likewise.
* testsuite/experimental/net/internet/network/v4/cons.cc: New test.
* testsuite/experimental/net/internet/network/v4/members.cc: New test.
---
 libstdc++-v3/include/experimental/internet|  41 ++--
 .../net/internet/network/v4/cons.cc   | 129 
 .../net/internet/network/v4/members.cc| 186 ++
 3 files changed, 343 insertions(+), 13 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc
 create mode 100644 libstdc++-v3/testsuite/experimental/net/internet/network/v4/members.cc

diff --git a/libstdc++-v3/include/experimental/internet b/libstdc++-v3/include/experimental/internet
index 3fd200251fa..5336b8a8ce3 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -1219,10 +1219,10 @@ namespace ip
 
   /// @}
 
-  bool
+  constexpr bool
   operator==(const network_v4& __a, const network_v4& __b) noexcept;
 
-  bool
+  constexpr bool
   operator==(const network_v6& __a, const network_v6& __b) noexcept;
 
 
@@ -1263,10 +1263,10 @@ namespace ip
 constexpr address_v4
 netmask() const noexcept
 {
-  address_v4::uint_type __val = address_v4::broadcast().to_uint();
-  __val >>= (32 - _M_prefix_len);
-  __val <<= (32 - _M_prefix_len);
-  return address_v4{__val};
+  address_v4 __m;
+  if (_M_prefix_len)
+   __m = address_v4(0xFFFFFFFFu << (32 - _M_prefix_len));
+  return __m;
 }
 
 constexpr address_v4
@@ -1275,7 +1275,7 @@ namespace ip
 
 constexpr address_v4
 broadcast() const noexcept
-{ return address_v4{_M_addr.to_uint() | ~netmask().to_uint()}; }
+{ return address_v4{_M_addr.to_uint() | (0xFFFFFFFFu >> _M_prefix_len)}; }
 
 address_v4_range
 hosts() const noexcept
@@ -1306,8 +1306,23 @@ namespace ip
   __string_with<_Allocator>
   to_string(const _Allocator& __a = _Allocator()) const
   {
-   return address().to_string(__a) + '/'
- + std::to_string(prefix_length());
+   auto __str = address().to_string(__a);
+   const unsigned __addrlen = __str.length();
+   const unsigned __preflen = prefix_length() >= 10 ? 2 : 1;
+   auto __write = [=](char* __p, size_t __n) {
+ __p[__addrlen] = '/';
+ std::__detail::__to_chars_10_impl(__p + __addrlen + 1, __preflen,
+   (unsigned char)prefix_length());
+ return __n;
+   };
+   const unsigned __len = __addrlen + 1 + __preflen;
+#if __cpp_lib_string_resize_and_overwrite
+   __str.resize_and_overwrite(__len, __write);
+#else
+   __str.resize(__len);
+   __write(&__str.front(), __len);
+#endif
+   return __str;
   }
 
   private:
@@ -1379,14 +1394,14 @@ namespace ip
* @{
*/
 
-  inline bool
+  constexpr bool
   operator==(const network_v4& __a, const network_v4& __b) noexcept
   {
 return __a.address() == __b.address()
   && __a.prefix_length() == __b.prefix_length();
   }
 
-  inline bool
+  constexpr bool
   operator!=(const network_v4& __a, const network_v4& __b) noexcept
   { return !(__a == __b); }
 
@@ -1396,14 +1411,14 @@ namespace ip
* @{
*/
 
-  inline bool
+  constexpr bool
   operator==(const network_v6& __a, const network_v6& __b) noexcept
   {
 return __a.address() == __b.address()
   && __a.prefix_length() == __b.prefix_length();
   }
 
-  inline bool
+  constexpr bool
   operator!=(const network_v6& __a, const network_v6& __b) noexcept
   { return !(__a == __b); }
 
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc
new file mode 100644
index 000..7784b6f6f58
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc
@@ -0,0 +1,129 @@
+// { dg-do run { target c++14 } }
+// { dg-require-effective-target net_ts_ip }
+// { dg-add-options net_ts }
+
+#include 
+#include 
+#include 
+
+using std::experimental::net::ip::network_v4;
+using std::experimental::net::ip::address_v4;
+
+constexpr void
+test01()
+{
+  network_v4 n0;
+  VERIFY( n0.address().is_unspecified() );
+  VERIFY( n0.prefix_length() == 0 );
+}
+
+constexpr void
+test02()
+{
+  address_v4 a0;
+  network_v4 n0{ a0, 0 };
+  VERIFY( n0.address() == a0 );
+  VERIFY( n0.prefix_length() == 0 );
+
+  address_v4 a1{ 

[committed 2/5] libstdc++: Fix conversion to/from net::ip::address_v4::bytes_type

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

I messed up the endianness of the address_v4::bytes_type array, which
should always be in network byte order. We can just use bit_cast to
convert the _M_addr member to/from bytes_type.
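
As an illustration of what "network byte order" means for the four-byte
array, here is a minimal sketch of the shift-based conversion, which gives the
same result on any host endianness. It needs C++14 or later. The alias bytes4
and both function names are hypothetical stand-ins, not the address_v4
interface.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for address_v4::bytes_type.
using bytes4 = std::array<unsigned char, 4>;

// Bytes in network (big-endian) order -> integer in host order.
constexpr std::uint32_t
host_uint_from_bytes(const bytes4& b)
{
  return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16)
       | (std::uint32_t(b[2]) << 8) | std::uint32_t(b[3]);
}

// Integer in host order -> bytes in network (big-endian) order.
constexpr bytes4
bytes_from_host_uint(std::uint32_t u)
{
  return bytes4{{ static_cast<unsigned char>((u >> 24) & 0xFF),
                  static_cast<unsigned char>((u >> 16) & 0xFF),
                  static_cast<unsigned char>((u >> 8) & 0xFF),
                  static_cast<unsigned char>(u & 0xFF) }};
}

static_assert(host_uint_from_bytes(bytes4{{1, 2, 3, 4}}) == 0x01020304u,
              "1.2.3.4 as a host-order integer");
```

Because the most significant byte always goes to index 0, a round trip through
both functions returns the original array regardless of the platform.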

libstdc++-v3/ChangeLog:

* include/experimental/internet (address_v4(const bytes_type&)):
Use __builtin_bit_cast if available, otherwise convert to
network byte order.
(address_v4::to_bytes()): Likewise, but convert from network
byte order.
* testsuite/experimental/net/internet/address/v4/cons.cc: Fix
incorrect tests. Check for constexpr too.
* testsuite/experimental/net/internet/address/v4/creation.cc:
Likewise.
* testsuite/experimental/net/internet/address/v4/members.cc:
Check that bytes_type is a standard-layout type.
---
 libstdc++-v3/include/experimental/internet| 20 ---
 .../net/internet/address/v4/cons.cc   | 33 ---
 .../net/internet/address/v4/creation.cc   | 24 +++---
 .../net/internet/address/v4/members.cc|  3 ++
 4 files changed, 60 insertions(+), 20 deletions(-)

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index 08bd0db4bb2..3fd200251fa 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -198,7 +198,12 @@ namespace ip
 
 constexpr
 address_v4(const bytes_type& __b)
-: _M_addr((__b[0] << 24) | (__b[1] << 16) | (__b[2] << 8) | __b[3])
+#if __has_builtin(__builtin_bit_cast)
+: _M_addr(__builtin_bit_cast(uint_type, __b))
+#else
+: _M_addr(_S_hton_32((__b[0] << 24) | (__b[1] << 16)
+  | (__b[2] << 8) | __b[3]))
+#endif
 { }
 
 explicit constexpr
@@ -227,12 +232,17 @@ namespace ip
 constexpr bytes_type
 to_bytes() const noexcept
 {
+#if __has_builtin(__builtin_bit_cast)
+  return __builtin_bit_cast(bytes_type, _M_addr);
+#else
+  auto __host = to_uint();
   return bytes_type{
- (_M_addr >> 24) & 0xFF,
- (_M_addr >> 16) & 0xFF,
- (_M_addr >> 8) & 0xFF,
- _M_addr & 0xFF
+   (__host >> 24) & 0xFF,
+   (__host >> 16) & 0xFF,
+   (__host >> 8) & 0xFF,
+   __host & 0xFF
   };
+#endif
 }
 
 constexpr uint_type
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
index 65f23642de4..af9fef2215e 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
@@ -24,41 +24,45 @@
 
 using std::experimental::net::ip::address_v4;
 
-void
+#if __cplusplus < 202002L
+// Naughty, but operator== for std::array is not constexpr until C++20.
+constexpr bool
+operator==(const address_v4::bytes_type& lhs, const address_v4::bytes_type& 
rhs)
+{
+  return lhs[0] == rhs[0] && lhs[1] == rhs[1]
+  && lhs[2] == rhs[2] && lhs[3] == rhs[3];
+}
+#endif
+
+constexpr void
 test01()
 {
-  bool test __attribute__((unused)) = false;
-
   address_v4 a0;
   VERIFY( a0.to_uint() == 0 );
   VERIFY( a0.to_bytes() == address_v4::bytes_type{} );
 }
 
-void
+constexpr void
 test02()
 {
-  bool test __attribute__((unused)) = false;
-
   address_v4 a0{ address_v4::bytes_type{} };
   VERIFY( a0.to_uint() == 0 );
   VERIFY( a0.to_bytes() == address_v4::bytes_type{} );
 
   address_v4::bytes_type b1{ 1, 2, 3, 4 };
   address_v4 a1{ b1 };
-  VERIFY( a1.to_uint() == ntohl((1 << 24) | (2 << 16) | (3 << 8) | 4) );
+  VERIFY( a1.to_uint() == ((1 << 24) | (2 << 16) | (3 << 8) | 4) );
   VERIFY( a1.to_bytes() == b1 );
 }
 
-void
+constexpr void
 test03()
 {
-  bool test __attribute__((unused)) = false;
-
   address_v4 a0{ 0u };
   VERIFY( a0.to_uint() == 0 );
   VERIFY( a0.to_bytes() == address_v4::bytes_type{} );
 
-  address_v4::uint_type u1 = ntohl((5 << 24) | (6 << 16) | (7 << 8) | 8);
+  address_v4::uint_type u1 = (5 << 24) | (6 << 16) | (7 << 8) | 8;
   address_v4 a1{ u1 };
   VERIFY( a1.to_uint() == u1 );
   VERIFY( a1.to_bytes() == address_v4::bytes_type( 5, 6, 7, 8 ) );
@@ -70,4 +74,11 @@ main()
   test01();
   test02();
   test03();
+
+  constexpr bool c = []{
+test01();
+test02();
+test03();
+return true;
+  };
 }
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc
index 441c832bf54..84aebbb7adc 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc
@@ -25,7 +25,17 @@
 namespace net = std::experimental::net;
 using net::ip::address_v4;
 
-void
+#if __cplusplus < 202002L
+// Naughty, but operator== for std::array is not constexpr until C++20.
+constexpr bool
+operator==(const address_v4::bytes_type& lhs, 

[committed 4/5] libstdc++: Make net::ip::basic_endpoint comparisons constexpr

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/experimental/internet (basic_endpoint): Add missing
constexpr to comparison operators.
* testsuite/experimental/net/internet/endpoint/cons.cc: New test.
---
 libstdc++-v3/include/experimental/internet| 12 ++--
 .../net/internet/endpoint/cons.cc | 66 +++
 2 files changed, 72 insertions(+), 6 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index 5336b8a8ce3..cae07f466da 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -1626,19 +1626,19 @@ namespace ip
*/
 
   template
-inline bool
+constexpr bool
 operator==(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return __a.address() == __b.address() && __a.port() == __b.port(); }
 
   template
-inline bool
+constexpr bool
 operator!=(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return !(__a == __b); }
 
   template
-inline bool
+constexpr bool
 operator< (const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 {
@@ -1647,19 +1647,19 @@ namespace ip
 }
 
   template
-inline bool
+constexpr bool
 operator> (const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return __b < __a; }
 
   template
-inline bool
+constexpr bool
 operator<=(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return !(__b < __a); }
 
   template
-inline bool
+constexpr bool
 operator>=(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return !(__a < __b); }
diff --git a/libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc
new file mode 100644
index 000..1b5c92c0b58
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc
@@ -0,0 +1,66 @@
+// { dg-do run { target c++14 } }
+// { dg-require-effective-target net_ts_ip }
+// { dg-add-options net_ts }
+
+#include 
+#include 
+
+using namespace std::experimental::net;
+
+constexpr void
+test_default()
+{
+  ip::tcp::endpoint t1;
+  VERIFY( t1.protocol() == ip::tcp::v4() );
+  VERIFY( t1.address() == ip::address() );
+  VERIFY( t1.port() == 0 );
+
+  ip::udp::endpoint t2;
+  VERIFY( t2.protocol() == ip::udp::v4() );
+  VERIFY( t2.address() == ip::address() );
+  VERIFY( t2.port() == 0 );
+}
+
+constexpr void
+test_proto()
+{
+  ip::tcp::endpoint t1(ip::tcp::v4(), 22);
+  VERIFY( t1.protocol() == ip::tcp::v4() );
+  VERIFY( t1.address() == ip::address_v4() );
+  VERIFY( t1.port() == 22 );
+
+  ip::tcp::endpoint t2(ip::tcp::v6(), 80);
+  VERIFY( t2.protocol() == ip::tcp::v6() );
+  VERIFY( t2.address() == ip::address_v6() );
+  VERIFY( t2.port() == 80 );
+}
+
+constexpr void
+test_addr()
+{
+  ip::address_v4 a1(ip::address_v4::bytes_type(1, 2, 3, 4));
+  ip::tcp::endpoint t1(a1, 22);
+  VERIFY( t1.protocol() == ip::tcp::v4() );
+  VERIFY( t1.address() == a1 );
+  VERIFY( t1.port() == 22 );
+
+  ip::address_v6 a2(ip::address_v6::bytes_type(21,22,23,24,25,26,27,28,29));
+  ip::tcp::endpoint t2(a2, 80);
+  VERIFY( t2.protocol() == ip::tcp::v6() );
+  VERIFY( t2.address() == a2 );
+  VERIFY( t2.port() == 80 );
+}
+
+int main()
+{
+  test_default();
+  test_proto();
+  test_addr();
+
+  constexpr bool c = [] {
+test_default();
+test_proto();
+test_addr();
+return true;
+  };
+}
-- 
2.39.2



[Bug c++/105224] [modules] g++.dg/modules/virt-2_a.C: inline key methods: c++ modules and arm aapcs clash

2023-02-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105224

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:3d1d3ece9bc5a1baa2feb4bf231b709c097b8434

commit r13-6329-g3d1d3ece9bc5a1baa2feb4bf231b709c097b8434
Author: Alexandre Oliva 
Date:   Fri Feb 24 11:31:05 2023 -0300

[PR105224] C++ modules and AAPCS/ARM EABI clash on inline key methods

g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
that use the AAPCS variant.  ARM is the only target that overrides
TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way
the clash between AAPCS and C++ Modules design should be resolved, but
currently it favors AAPCS and thus the test fails, so skip it on
arm_eabi.


for  gcc/testsuite/ChangeLog

PR c++/105224
* g++.dg/modules/virt-2_a.C: Skip on arm_eabi.

[Bug middle-end/108545] [13 Regression] ICE in install_var_field, at omp-low.cc:799 since r13-2665-g23baa717c991d77f

2023-02-24 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108545

--- Comment #3 from Tobias Burnus  ---
Fortran: Same issue (ICE) also with:
   !$omp target enter data map(to: x)

Crucial is the VOLATILE attribute.

 * * *

The following C code already gives an ICE with GCC 12; it works with GCC 11.
(Either of the two lines fails. I think that's invalid OpenMP code, but
I do not have a real overview of 'map' - and I fear no one has.)


volatile struct t {
  struct t2 { int *a; int c; } u;
  int b;
} my_struct;
volatile struct t3 { int *a; int c; } my_struct3;

void f() {
  #pragma omp target enter data map(to:my_struct.u) map(to:my_struct.u.a)
  #pragma omp target enter data map(to:my_struct3) map(to:my_struct3.a)
}

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Segher Boessenkool
Hi!

For future patches: please don't send patches as replies to existing
threads.  Just start a new thread for a new patch (series).  You can
mark it as [PATCH v2] in the subject, if you want.

On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote:
> Here is the patch that uses xxlor instead of fmr where possible.
> Performance results show that fmr is better in power9 and 
> power10 architectures whereas xxlor is better in power7 and
> power8 architectures.

And fmr is the only option before p7.

>   rs6000: Use xxlor instead of fmr where possible
> 
>   This patch replaces fmr with xxlor instruction for power7
>   and power8 architectures whereas for power9 and power10
>   replaces xxlor with fmr instruction.

Saying "this patch" in a commit message reads strangely.  Just "Replace
fmr with" etc.?

The second part is just wrong, you cannot replace xxlor by fmr in
general.

>   Perf measurement results:
> 
>   Power9 fmr:  201,847,661 cycles.
>   Power9 xxlor: 201,877,78 cycles.
>   Power8 fmr: 201,057,795 cycles.
> Power8 xxlor: 201,004,671 cycles.

What is this measuring?  100M insns back-to-back, each dependent on the
previous one?

What are the results on p7 and p10?

These numbers show there is no difference on p8 either.  Did you paste
the wrong numbers maybe?

>   * config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
>   for power7 and power8 and fmr for power9 and power10.

Please don't break lines early.  Changelog lines can be 80 columns
wide, just like source code lines.

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -354,7 +354,7 @@ (define_attr "cpu"
>(const (symbol_ref "(enum attr_cpu) rs6000_tune")))
>  
>  ;; The ISA we implement.
> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"

p78v, and sort it after p8v please.

> + (and (eq_attr "isa" "p7p8")
> +   (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
> + (const_int 1)

Okay.

>  (define_insn "*mov_hardfloat64"
>[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
> -   "=m,   d,  d,  ,   wY,
> - ,Z,  ,  ,  !r,
> - YZ,  r,  !r, *c*l,   !r,
> -*h,   r,  ,   wa")
> +   "=m,   d,  ,  ,   wY,
> +, Z,  wa, ,  !r,
> +YZ,   r,  !r, *c*l,   !r,
> +*h,   r,  ,   d,  wn,
> +wa")
>   (match_operand:FMOVE64 1 "input_operand"

(You posted this mail as wrapping.  That means the patch cannot be
applied non-manually, and that replies to your mail will be mangled.
Just get a Real mail client, and configure it correctly :-) )

> -"d,   m,  d,  wY, ,
> - Z,   ,   ,  ,  ,
> +"d,   m,  ,  wY, ,
> + Z,   ,   wa, ,  ,
>   r,   YZ, r,  r,  *h,
> - 0,   ,   r,  eP"))]
> + 0,   ,   r,  d,  wn,
> + eP"))]

No.  It is impossible to figure out what you changed here by just
reading it.

There is no requirement there should be exactly five alternatives per
line, and/or that there should be the same number everywhere.

If the indentation was incorrect, and you want to fix that, do that in a
separate *earlier* patch in the series, please.

>"TARGET_POWERPC64 && TARGET_HARD_FLOAT
> && (gpc_reg_operand (operands[0], mode)
> || gpc_reg_operand (operands[1], mode))"
>"@
> stfd%U0%X0 %1,%0
> lfd%U1%X1 %0,%1
> -   fmr %0,%1
> +   xxlor %x0,%x1,%x1
> lxsd %0,%1
> stxsd %1,%0
> lxsdx %x0,%y1
> stxsdx %x1,%y0
> -   xxlor %x0,%x1,%x1
> +   fmr %0,%1
> xxlxor %x0,%x0,%x0
> li %0,0
> std%U0%X0 %1,%0
> @@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
> nop
> mfvsrd %0,%x1
> mtvsrd %x0,%1
> +   fmr %0,%1
> +   fmr %0,%1
> #"
>[(set_attr "type"
> -"fpstore, fpload, fpsimple,   fpload, fpstore,
> +"fpstore, fpload, veclogical, fpload, fpstore,
>   fpload,  fpstore,veclogical, veclogical, integer,
>   store,   load,   *,  mtjmpr, mfjmpr,
> - *,   mfvsr,  mtvsr,  vecperm")
> + *,   mfvsr,  mtvsr,  fpsimple,   fpsimple,
> + vecperm")
> (set_attr "size" "64")
> (set_attr "isa"
> -"*,   *,  *,  p9v,p9v,
> - p7v, p7v,*,  *,  *,
> - *,   *,  *,  *,  *,
> - *,   p8v,p8v,   

Re: [PATCH] asan: adjust module name for global variables

2023-02-24 Thread Martin Liška
On 2/24/23 10:07, Jakub Jelinek wrote:
> On Fri, Feb 24, 2023 at 10:00:01AM +0100, Martin Liška wrote:
>> As mentioned in the PR, when we use LTO, we wrongly use ltrans output
>> file name as a module name of a global variable. That leads to a
>> non-reproducible output.
>>
>> After the suggested change, we emit context name of normal global
>> variables. And for artificial variables (like .Lubsan_data3), we use
>> aux_base_name (e.g. "./a.ltrans0.ltrans").
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>>  PR asan/108834
>>
>> gcc/ChangeLog:
>>
>>  * asan.cc (asan_add_global): Use proper TU name for normal
>>global variables (and aux_base_name for the artificial one).
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * c-c++-common/asan/global-overflow-1.c: Test line and column
>>  info for a global variable.
>> ---
>>  gcc/asan.cc | 7 ++-
>>  gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 2 +-
>>  2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/asan.cc b/gcc/asan.cc
>> index f56d084bc7a..245abb14388 100644
>> --- a/gcc/asan.cc
>> +++ b/gcc/asan.cc
>> @@ -3287,7 +3287,12 @@ asan_add_global (tree decl, tree type, 
>> vec *v)
>>  pp_string (_pp, "");
>>str_cst = asan_pp_string (_pp);
>>  
>> -  pp_string (_name_pp, main_input_filename);
>> +  const_tree tu = get_ultimate_context ((const_tree)decl);
>> +  if (tu != NULL_TREE)
>> +pp_string (_name_pp, IDENTIFIER_POINTER (DECL_NAME (tu)));
>> +  else
>> +pp_string (_name_pp, aux_base_name);
> 
> I think for !in_lto_p we don't need to bother with get_ultimate_context
> and should just use main_input_filename as before.

All right, pushed with that change.

Thanks,
Martin

> 
> Otherwise LGTM.
> 
>   Jakub
> 



[Bug c/108896] provide "element_count" attribute to give more context to __builtin_dynamic_object_size() and -fsanitize=bounds

2023-02-24 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896

Martin Uecker  changed:

   What|Removed |Added

 CC||muecker at gwdg dot de

--- Comment #4 from Martin Uecker  ---

What could work is using a subset of designator syntax.

struct {
  int count;
  char data[.count];
};

But I do not think this should be a flexible array member that is simply
ignored except for sizeof but a complete type (maybe with some restrictions). 

Then we could also have

struct {
  int count;
  char (*buf)[.count];
};

which would be incredibly useful.

Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
Could you move those thead_* and th_* functions into thead.cc?

> +static bool
> +thead_mempair_operand_p (rtx mem, machine_mode mode)
> +{
> +  if (!MEM_SIZE_KNOWN_P (mem))
> +return false;
> +
> +  /* Only DI or SI mempair instructions exist.  */

add gcc_assert (mode == SImode || mode == DImode); here


[PATCH 1/2] Change vec<, , vl_embed>::m_vecdata references into address ()

2023-02-24 Thread Richard Biener via Gcc-patches
As preparation to remove m_vecdata in the vl_embed vector this
changes references to it into calls to address ().

While here, it also fixes ::contains to avoid repeated bounds
checking, and fixes the same issue in ::lower_bound, which also
suffered from unnecessary copying of values.
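
To illustrate the ::contains point, here is a minimal sketch of hoisting the
data pointer out of the loop so that the per-element access skips the checked
operator[]. The small_vec type below is a hypothetical stand-in and not GCC's
vec; only the shape of contains () mirrors the change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity container for illustration; not GCC's vec.
template <typename T, std::size_t N>
struct small_vec
{
  T data_[N];
  unsigned num_ = 0;

  const T *address () const { return data_; }
  unsigned length () const { return num_; }

  // Checked element access, analogous to an operator[] that asserts.
  const T &operator[] (unsigned ix) const
  {
    assert (ix < num_);
    return data_[ix];
  }

  void quick_push (const T &obj)
  {
    assert (num_ < N);  // the caller must have reserved space
    data_[num_++] = obj;
  }

  // Fetch the length and data pointer once, then walk the raw array;
  // the loop bound already guarantees every index is in range.
  bool contains (const T &search) const
  {
    const unsigned len = length ();
    const T *p = address ();
    for (unsigned i = 0; i < len; i++)
      if (p[i] == search)
        return true;
    return false;
  }
};
```

The checked operator[] stays available for callers that index arbitrarily;
only the internal loop bypasses it.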

* vec.h: Change m_vecdata references to address ().
* vec.h (vec::lower_bound): Adjust to
take a const reference to the object, use address to
access data.
(vec::contains): Use address to access data.
(vec::operator[]): Use address instead of
m_vecdata to access data.
(vec::iterate): Likewise.
(vec::copy): Likewise.
(vec::quick_push): Likewise.
(vec::pop): Likewise.
(vec::quick_insert): Likewise.
(vec::ordered_remove): Likewise.
(vec::unordered_remove): Likewise.
(vec::block_remove): Likewise.
(vec::address): Likewise.
---
 gcc/vec.h | 44 +---
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index a536b68732d..2b36f065234 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -611,10 +611,10 @@ public:
   void qsort (int (*) (const void *, const void *));
   void sort (int (*) (const void *, const void *, void *), void *);
   void stablesort (int (*) (const void *, const void *, void *), void *);
-  T *bsearch (const void *key, int (*compar)(const void *, const void *));
+  T *bsearch (const void *key, int (*compar) (const void *, const void *));
   T *bsearch (const void *key,
  int (*compar)(const void *, const void *, void *), void *);
-  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
+  unsigned lower_bound (const T &, bool (*) (const T &, const T &)) const;
   bool contains (const T ) const;
   static size_t embedded_size (unsigned);
   void embedded_init (unsigned, unsigned = 0, unsigned = 0);
@@ -879,7 +879,7 @@ inline const T &
 vec::operator[] (unsigned ix) const
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 template
@@ -887,7 +887,7 @@ inline T &
 vec::operator[] (unsigned ix)
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 
@@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = m_vecdata[ix];
+  *ptr = address ()[ix];
   return true;
 }
   else
@@ -955,7 +955,7 @@ vec::iterate (unsigned ix, T **ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = CONST_CAST (T *, _vecdata[ix]);
+  *ptr = CONST_CAST (T *,  ()[ix]);
   return true;
 }
   else
@@ -978,7 +978,7 @@ vec::copy (ALONE_MEM_STAT_DECL) const
 {
   vec_alloc (new_vec, len PASS_MEM_STAT);
   new_vec->embedded_init (len, len);
-  vec_copy_construct (new_vec->address (), m_vecdata, len);
+  vec_copy_construct (new_vec->address (), address (), len);
 }
   return new_vec;
 }
@@ -1018,7 +1018,7 @@ inline T *
 vec::quick_push (const T )
 {
   gcc_checking_assert (space (1));
-  T *slot = _vecdata[m_vecpfx.m_num++];
+  T *slot =  ()[m_vecpfx.m_num++];
   *slot = obj;
   return slot;
 }
@@ -1031,7 +1031,7 @@ inline T &
 vec::pop (void)
 {
   gcc_checking_assert (length () > 0);
-  return m_vecdata[--m_vecpfx.m_num];
+  return address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1056,7 +1056,7 @@ vec::quick_insert (unsigned ix, const T 
)
 {
   gcc_checking_assert (length () < allocated ());
   gcc_checking_assert (ix <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
   *slot = obj;
 }
@@ -1071,7 +1071,7 @@ inline void
 vec::ordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
 }
 
@@ -1118,7 +1118,8 @@ inline void
 vec::unordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
+  T *p = address ();
+  p[ix] = p[--m_vecpfx.m_num];
 }
 
 
@@ -1130,7 +1131,7 @@ inline void
 vec::block_remove (unsigned ix, unsigned len)
 {
   gcc_checking_assert (ix + len <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   m_vecpfx.m_num -= len;
   memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
 }
@@ -1248,9 +1249,13 @@ inline bool
 vec::contains (const T ) const
 {
   unsigned int len = length ();
+  const T *p = address ();
   for (unsigned int i = 0; i < len; i++)
-if ((*this)[i] == search)
-  return true;
+{
+  const T *slot = [i];
+  if (*slot == search)
+   return true;
+}
 
   return false;
 }
@@ -1262,7 +1267,8 @@ vec::contains (const T ) const
 
 template
 unsigned
-vec::lower_bound (T obj, bool (*lessthan)(const T &, const T 
&))
+vec::lower_bound (const T ,
+ bool 

[PATCH] RISC-V: Add scalar move support and fix VSETVL bugs

2023-02-24 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/constraints.md (Wb1): New constraint.
* config/riscv/predicates.md 
(vector_least_significant_set_mask_operand): New predicate.
(vector_broadcast_mask_operand): Ditto.
* config/riscv/riscv-protos.h (enum vlmul_type): Adjust.
(gen_scalar_move_mask): New function.
* config/riscv/riscv-v.cc (gen_scalar_move_mask): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class vmv): New class.
(class vmv_s): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmv_x): Ditto.
(vmv_s): Ditto.
(vfmv_f): Ditto.
(vfmv_s): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct 
scalar_move_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (function_expander::mask_mode): 
Ditto.
(function_expander::use_exact_insn): New function.
(function_expander::use_contiguous_load_insn): New function.
(function_expander::use_contiguous_store_insn): New function.
(function_expander::use_ternop_insn): New function.
(function_expander::use_widen_ternop_insn): New function.
(function_expander::use_scalar_move_insn): New function.
* config/riscv/riscv-vector-builtins.def (s): New operand suffix.
* config/riscv/riscv-vector-builtins.h 
(function_expander::add_scalar_move_mask_operand): New class.
* config/riscv/riscv-vsetvl.cc (ignore_vlmul_insn_p): New function.
(scalar_move_insn_p): Ditto.
(has_vsetvl_killed_avl_p): Ditto.
(anticipatable_occurrence_p): Ditto.
(insert_vsetvl): Ditto.
(get_vl_vtype_info): Ditto.
(calculate_sew): Ditto.
(calculate_vlmul): Ditto.
(incompatible_avl_p): Ditto.
(different_sew_p): Ditto.
(different_lmul_p): Ditto.
(different_ratio_p): Ditto.
(different_tail_policy_p): Ditto.
(different_mask_policy_p): Ditto.
(possible_zero_avl_p): Ditto.
(first_ratio_invalid_for_second_sew_p): Ditto.
(first_ratio_invalid_for_second_lmul_p): Ditto.
(second_ratio_invalid_for_first_sew_p): Ditto.
(second_ratio_invalid_for_first_lmul_p): Ditto.
(second_sew_less_than_first_sew_p): Ditto.
(first_sew_less_than_second_sew_p): Ditto.
(compare_lmul): Ditto.
(second_lmul_less_than_first_lmul_p): Ditto.
(first_lmul_less_than_second_lmul_p): Ditto.
(first_ratio_less_than_second_ratio_p): Ditto.
(second_ratio_less_than_first_ratio_p): Ditto.
(DEF_INCOMPATIBLE_COND): Ditto.
(greatest_sew): Ditto.
(first_sew): Ditto.
(second_sew): Ditto.
(first_vlmul): Ditto.
(second_vlmul): Ditto.
(first_ratio): Ditto.
(second_ratio): Ditto.
(vlmul_for_first_sew_second_ratio): Ditto.
(ratio_for_second_sew_first_vlmul): Ditto.
(DEF_SEW_LMUL_FUSE_RULE): Ditto.
(always_unavailable): Ditto.
(avl_unavailable_p): Ditto.
(sew_unavailable_p): Ditto.
(lmul_unavailable_p): Ditto.
(ge_sew_unavailable_p): Ditto.
(ge_sew_lmul_unavailable_p): Ditto.
(ge_sew_ratio_unavailable_p): Ditto.
(DEF_UNAVAILABLE_COND): Ditto.
(same_sew_lmul_demand_p): Ditto.
(propagate_avl_across_demands_p): Ditto.
(reg_available_p): Ditto.
(avl_info::has_non_zero_avl): Ditto.
(vl_vtype_info::has_non_zero_avl): Ditto.
(vector_insn_info::operator>=): Refactor.
(vector_insn_info::parse_insn): Adjust for scalar move.
(vector_insn_info::demand_vl_vtype): Remove.
(vector_insn_info::compatible_p): New function.
(vector_insn_info::compatible_avl_p): Ditto.
(vector_insn_info::compatible_vtype_p): Ditto.
(vector_insn_info::available_p): Ditto.
(vector_insn_info::merge): Ditto.
(vector_insn_info::fuse_avl): Ditto.
(vector_insn_info::fuse_sew_lmul): Ditto.
(vector_insn_info::fuse_tail_policy): Ditto.
(vector_insn_info::fuse_mask_policy): Ditto.
(vector_insn_info::dump): Ditto.
(vector_infos_manager::release): Ditto.
(pass_vsetvl::compute_local_backward_infos): Adjust for scalar move 
support.
(pass_vsetvl::get_backward_fusion_type): Adjust for scalar move support.
(pass_vsetvl::hard_empty_block_p): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): Ditto.
* config/riscv/riscv-vsetvl.h (enum demand_status): New class.
(struct 

[Bug demangler/107884] H8/300: cp-demangle.c fix warning related demangle.h

2023-02-24 Thread mike at mnmoran dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107884

--- Comment #8 from Michael N. Moran  ---
I tried to build after patching with
https://gcc.gnu.org/bugzilla/attachment.cgi?id=53980 and now get an assembler
failure.

/tmp/cc2C1wMh.s: Assembler messages:
/tmp/cc2C1wMh.s:82060: Error: value of 00012570 too large for field of 2 bytes
at 0002

I created a corresponding cp-demangle.s file and attached it
https://gcc.gnu.org/bugzilla/attachment.cgi?id=54530

The offending line looks like this:


.Ldebug_line0:
.2byte  0, .LELT0-.LSLT0 << line 82060
.LSLT0:
.2byte  0x5

[Bug middle-end/108545] [13 Regression] ICE in install_var_field, at omp-low.cc:799 since r13-2665-g23baa717c991d77f

2023-02-24 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108545

--- Comment #4 from Tobias Burnus  ---
For the C/C++ testcase of comment 3, bisecting points to
  commit r12-5835-g0ab29cf0bb68960c1f87405f14b4fb2109254e2f
  "openmp: Improve OpenMP target support for C++ (PR92120)"

[Bug fortran/108925] New: memory leak of gfc_get_namespace result

2023-02-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108925

Bug ID: 108925
   Summary: memory leak of gfc_get_namespace result
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

valgrind leak check complains

==8747== 3,976 (2,792 direct, 1,184 indirect) bytes in 1 blocks are definitely
lost in loss record 2,248 of 2,309
==8747==at 0x4C39571: calloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)   
==8747==by 0x1C25680: xcalloc (xmalloc.c:164)
==8747==by 0x7C998E: gfc_get_namespace(gfc_namespace*, int)
(symbol.cc:2869)
==8747==by 0x767D44: load_needed(pointer_info*) (module.cc:5175) 
==8747==by 0x767BB5: load_needed(pointer_info*) (module.cc:5153)
==8747==by 0x767BB5: load_needed(pointer_info*) (module.cc:5153)
==8747==by 0x767BB5: load_needed(pointer_info*) (module.cc:5153)
==8747==by 0x767BB5: load_needed(pointer_info*) (module.cc:5153) 
==8747==by 0x767BC0: load_needed(pointer_info*) (module.cc:5154)
==8747==by 0x767BC0: load_needed(pointer_info*) (module.cc:5154)
==8747==by 0x767BC0: load_needed(pointer_info*) (module.cc:5154)
==8747==by 0x767BB5: load_needed(pointer_info*) (module.cc:5153)

[Bug tree-optimization/108906] Bogus may be used uninitialized warning

2023-02-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108906

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-02-24
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
We diagnose this from the early diagnostic pass where propagation is limited.
At some cost we could improve things here.
