[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-06-15 Thread hiraditya at msn dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #17 from AK  ---
Even after vector::size() is hoisted, the codegen is sub-optimal compared to
iterator version.

```
void use_idx_const_size(std::vector v) {
auto s = v.size();
for (std::vector::size_type i = 0; i < s; i++)
f(v[i]);
}
```

$ g++ -O3

use_idx_const_size(std::vector >):
pushr12
pushrbp
pushrbx
mov rdx, QWORD PTR [rdi+8]
mov rax, QWORD PTR [rdi]
mov r12, rdx
sub r12, rax
sar r12, 2
cmp rax, rdx
je  .L1
mov rbp, rdi
xor ebx, ebx
jmp .L3
.L6:
mov rax, QWORD PTR [rbp+0]
.L3:
mov edi, DWORD PTR [rax+rbx*4]
add rbx, 1
callf(int)
cmp rbx, r12
jb  .L6
.L1:
pop rbx
pop rbp
pop r12
ret

It seems compiler is assuming that vector `v` is not loop-invariant?

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #16 from rguenther at suse dot de  ---
On Thu, 13 Apr 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443
> 
> Jakub Jelinek  changed:
> 
>What|Removed |Added
> 
>  CC||redi at gcc dot gnu.org
> 
> --- Comment #15 from Jakub Jelinek  ---
> (In reply to rguent...@suse.de from comment #14)
> > If that's valid then all bets for this PR are off since f() then
> > _can_ change v.size ().
> 
> Well, in C++ the ctors are typically defined inline, so if we wanted and they
> would be inline the FE could do some quick check whether the ctor could leak
> the this address or some address derived from it and we could do the
> optimization only if we prove that the copy constructor (and default
> constructor?) doesn't do this.

Yes, with IPA we could also figure out cases where arguments / return
by reference could be effectively restrict.

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

Jakub Jelinek  changed:

   What|Removed |Added

 CC||redi at gcc dot gnu.org

--- Comment #15 from Jakub Jelinek  ---
(In reply to rguent...@suse.de from comment #14)
> If that's valid then all bets for this PR are off since f() then
> _can_ change v.size ().

Well, in C++ the ctors are typically defined inline, so if we wanted and they
would be inline the FE could do some quick check whether the ctor could leak
the this address or some address derived from it and we could do the
optimization only if we prove that the copy constructor (and default
constructor?) doesn't do this.
CCing Jonathan on whether it is valid.

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #14 from rguenther at suse dot de  ---
On Thu, 13 Apr 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443
> 
> --- Comment #13 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #12)
> > Note maybe the restrict qualification isn't the best representation since
> > it doesn't capture the value will die upon function return (does it?  I
> > gues the DTOR if any will run in caller context and thus stores to it
> > are not necessarily "dead").
> 
> Yes, the dtors will be invoked by the caller and those can inspect the values
> and do all kinds of things with them.
> In fact, the restrict isn't probably right either, the constructor of the
> object in the caller could legally save the address of the object somewhere
> else (say global pointer,
> or inside of the object, or wherever else) and as long as say the destructor
> undoes that, it could be valid.
> 
> Something like:
> 
> struct S {
>   static bool enabled;
>   static S *p;
>   S () : s (0) {}
>   ~S () { if (enabled) p = nullptr; }
>   S (const S ) : s (x.s) { if (enabled) p = this; }
>   int s;
> };
> 
> [[gnu::noipa]] void
> bar (S s)
> {
>   s.s++;
>   S::p->s = 0;
>   s.s++;
> }
> 
> void
> foo ()
> {
>   S s;
>   S::enabled = true;
>   bar (s);
> }

If that's valid then all bets for this PR are off since f() then
_can_ change v.size ().

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #13 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #12)
> Note maybe the restrict qualification isn't the best representation since
> it doesn't capture the value will die upon function return (does it?  I
> gues the DTOR if any will run in caller context and thus stores to it
> are not necessarily "dead").

Yes, the dtors will be invoked by the caller and those can inspect the values
and do all kinds of things with them.
In fact, the restrict isn't probably right either, the constructor of the
object in the caller could legally save the address of the object somewhere
else (say global pointer,
or inside of the object, or wherever else) and as long as say the destructor
undoes that, it could be valid.

Something like:

struct S {
  static bool enabled;
  static S *p;
  S () : s (0) {}
  ~S () { if (enabled) p = nullptr; }
  S (const S ) : s (x.s) { if (enabled) p = this; }
  int s;
};

[[gnu::noipa]] void
bar (S s)
{
  s.s++;
  S::p->s = 0;
  s.s++;
}

void
foo ()
{
  S s;
  S::enabled = true;
  bar (s);
}

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #12 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #11)
> And I don't see any code generation changes on the #c0 testcase with added
> #include  with the patch.

Yes, that's because we cannot disambiguate the call against a restrict
qualified memory access.  But otherwise it "works".

Note maybe the restrict qualification isn't the best representation since
it doesn't capture the value will die upon function return (does it?  I
gues the DTOR if any will run in caller context and thus stores to it
are not necessarily "dead").

It's on my list of things to do to investigate adding "'restrict' support"
to calls for GCC14.

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #11 from Jakub Jelinek  ---
And I don't see any code generation changes on the #c0 testcase with added
#include  with the patch.

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #10 from Jakub Jelinek  ---
Making the reference types to return or parameter non-POD types passed by value
restrict could be
--- gcc/cp/call.cc.jj   2023-03-30 09:34:05.609725768 +0200
+++ gcc/cp/call.cc  2023-04-13 09:56:53.226908996 +0200
@@ -9223,7 +9223,11 @@ type_passed_as (tree type)
 {
   /* Pass classes with copy ctors by invisible reference.  */
   if (TREE_ADDRESSABLE (type))
-type = build_reference_type (type);
+{
+  type = build_reference_type (type);
+  type = build_qualified_type (type,
+  TYPE_QUALS (type) | TYPE_QUAL_RESTRICT);
+}
   else if (targetm.calls.promote_prototypes (NULL_TREE)
   && INTEGRAL_TYPE_P (type)
   && COMPLETE_TYPE_P (type)
--- gcc/cp/cp-gimplify.cc.jj2023-03-15 15:36:02.500430556 +0100
+++ gcc/cp/cp-gimplify.cc   2023-04-13 09:57:51.989059798 +0200
@@ -2000,6 +2000,9 @@ cp_genericize (tree fndecl)
 {
   t = DECL_RESULT (fndecl);
   TREE_TYPE (t) = build_reference_type (TREE_TYPE (t));
+  TREE_TYPE (t)
+   = build_qualified_type (TREE_TYPE (t), TYPE_QUALS (TREE_TYPE (t))
+  | TYPE_QUAL_RESTRICT);
   DECL_BY_REFERENCE (t) = 1;
   TREE_ADDRESSABLE (t) = 0;
   relayout_decl (t);

Completely untested.  Does this what we want?  Stage1 material anyway...

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #9 from Jonathan Wakely  ---
No:

This elision of copy/move operations, called copy elision, is permitted in the
following circumstances (which may be combined to eliminate multiple copies):
 — in a return statement in a function with a class return type, when the
expression is the name of a non-volatile object with automatic storage duration
(other than a function parameter or a variable introduced by the
exception-declaration of a handler (14.4)) with the same type (ignoring
cv-qualification) as the function return type, the copy/move operation can be
omitted by constructing the object directly into the function call’s return
object

"Other than a function parameter" means the return object must be at a distinct
address.

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-13 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

--- Comment #8 from rguenther at suse dot de  ---
On Wed, 12 Apr 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443
> 
> Jakub Jelinek  changed:
> 
>What|Removed |Added
> 
>  CC||jakub at gcc dot gnu.org,
>||jason at gcc dot gnu.org,
>||mpolacek at gcc dot gnu.org
> 
> --- Comment #7 from Jakub Jelinek  ---
> I think the middle-end could figure this out by itself, such PARM_DECLs or
> RESULT_DECLs
> are DECL_BY_REFERENCE and TREE_ADDRESSABLE (TREE_TYPE (TREE_TYPE
> (parm_or_decl))).

We currently require an additional TYPE_RESTRICT on the pointer
but do not require TREE_ADDRESSABLE.  Do you say TREE_ADDRESSABLE
provides the same guarantees as TYPE_RESTRICT here?

> I think that means that if one calls a function with two arguments the two
> arguments won't alias:
> struct S { S (); ~S (); S (const S &); int s; };
> void
> foo (S a, S b)
> {
> }
> void
> bar (S c)
> {
>   foo (c, c);
> }
> Inside of the function one can take address of such a parameter as many times
> as one wants and dereference through that of course.  But I think one can
> assume that when the
> function is called, the argument (i.e. the address of the object in the 
> caller)
> isn't stored anywhere else.

I'd prefer if the frontend is explicit about this.

What about

S foo (S a)
{
  return a;
}

can the return object alias the argument?

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org,
   ||mpolacek at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
I think the middle-end could figure this out by itself, such PARM_DECLs or
RESULT_DECLs
are DECL_BY_REFERENCE and TREE_ADDRESSABLE (TREE_TYPE (TREE_TYPE
(parm_or_decl))).
I think that means that if one calls a function with two arguments the two
arguments won't alias:
struct S { S (); ~S (); S (const S &); int s; };
void
foo (S a, S b)
{
}
void
bar (S c)
{
  foo (c, c);
}
Inside of the function one can take address of such a parameter as many times
as one wants and dereference through that of course.  But I think one can
assume that when the
function is called, the argument (i.e. the address of the object in the caller)
isn't stored anywhere else.

[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)

2023-04-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-04-11
 Status|UNCONFIRMED |NEW
Version|unknown |13.0
  Component|tree-optimization   |c++
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=35269

--- Comment #6 from Richard Biener  ---
Well, but v.size () is not loop invariant.  As Andrew says 'v' is passed by
(const?) reference but the PTA solution doesn't exactly reflect this
so we do think the call clobbers it:

  # USE = nonlocal escaped
  # CLB = nonlocal escaped
  f (_1);
...
  # PT = nonlocal escaped null
  _10 = MEM[(const struct vector *)v_5(D)].D.35644._M_impl.D.34955._M_finish;
  # PT = nonlocal escaped null
  _11 = MEM[(const struct vector *)v_5(D)].D.35644._M_impl.D.34955._M_start;
  _12 = _10 - _11;
  _13 = _12 /[ex] 4;
  _14 = (long unsigned int) _13;
  if (i_8 < _14)
goto ; [89.00%]

so somehow the C++ handling of value-as-reference passing doesn't work (it
should set TYPE_RESTRICT on the argument type used.