[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #17 from AK --- Even after vector::size() is hoisted, the codegen is sub-optimal compared to iterator version. ``` void use_idx_const_size(std::vector v) { auto s = v.size(); for (std::vector::size_type i = 0; i < s; i++) f(v[i]); } ``` $ g++ -O3 use_idx_const_size(std::vector >): pushr12 pushrbp pushrbx mov rdx, QWORD PTR [rdi+8] mov rax, QWORD PTR [rdi] mov r12, rdx sub r12, rax sar r12, 2 cmp rax, rdx je .L1 mov rbp, rdi xor ebx, ebx jmp .L3 .L6: mov rax, QWORD PTR [rbp+0] .L3: mov edi, DWORD PTR [rax+rbx*4] add rbx, 1 callf(int) cmp rbx, r12 jb .L6 .L1: pop rbx pop rbp pop r12 ret It seems compiler is assuming that vector `v` is not loop-invariant?
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #16 from rguenther at suse dot de --- On Thu, 13 Apr 2023, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 > > Jakub Jelinek changed: > >What|Removed |Added > > CC||redi at gcc dot gnu.org > > --- Comment #15 from Jakub Jelinek --- > (In reply to rguent...@suse.de from comment #14) > > If that's valid then all bets for this PR are off since f() then > > _can_ change v.size (). > > Well, in C++ the ctors are typically defined inline, so if we wanted and they > would be inline the FE could do some quick check whether the ctor could leak > the this address or some address derived from it and we could do the > optimization only if we prove that the copy constructor (and default > constructor?) doesn't do this. Yes, with IPA we could also figure out cases where arguments / return by reference could be effectively restrict.
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 Jakub Jelinek changed: What|Removed |Added CC||redi at gcc dot gnu.org --- Comment #15 from Jakub Jelinek --- (In reply to rguent...@suse.de from comment #14) > If that's valid then all bets for this PR are off since f() then > _can_ change v.size (). Well, in C++ the ctors are typically defined inline, so if we wanted and they would be inline the FE could do some quick check whether the ctor could leak the this address or some address derived from it and we could do the optimization only if we prove that the copy constructor (and default constructor?) doesn't do this. CCing Jonathan on whether it is valid.
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #14 from rguenther at suse dot de --- On Thu, 13 Apr 2023, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 > > --- Comment #13 from Jakub Jelinek --- > (In reply to Richard Biener from comment #12) > > Note maybe the restrict qualification isn't the best representation since > > it doesn't capture the value will die upon function return (does it? I > > gues the DTOR if any will run in caller context and thus stores to it > > are not necessarily "dead"). > > Yes, the dtors will be invoked by the caller and those can inspect the values > and do all kinds of things with them. > In fact, the restrict isn't probably right either, the constructor of the > object in the caller could legally save the address of the object somewhere > else (say global pointer, > or inside of the object, or wherever else) and as long as say the destructor > undoes that, it could be valid. > > Something like: > > struct S { > static bool enabled; > static S *p; > S () : s (0) {} > ~S () { if (enabled) p = nullptr; } > S (const S ) : s (x.s) { if (enabled) p = this; } > int s; > }; > > [[gnu::noipa]] void > bar (S s) > { > s.s++; > S::p->s = 0; > s.s++; > } > > void > foo () > { > S s; > S::enabled = true; > bar (s); > } If that's valid then all bets for this PR are off since f() then _can_ change v.size ().
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #13 from Jakub Jelinek --- (In reply to Richard Biener from comment #12) > Note maybe the restrict qualification isn't the best representation since > it doesn't capture the value will die upon function return (does it? I > gues the DTOR if any will run in caller context and thus stores to it > are not necessarily "dead"). Yes, the dtors will be invoked by the caller and those can inspect the values and do all kinds of things with them. In fact, the restrict isn't probably right either, the constructor of the object in the caller could legally save the address of the object somewhere else (say global pointer, or inside of the object, or wherever else) and as long as say the destructor undoes that, it could be valid. Something like: struct S { static bool enabled; static S *p; S () : s (0) {} ~S () { if (enabled) p = nullptr; } S (const S ) : s (x.s) { if (enabled) p = this; } int s; }; [[gnu::noipa]] void bar (S s) { s.s++; S::p->s = 0; s.s++; } void foo () { S s; S::enabled = true; bar (s); }
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #12 from Richard Biener --- (In reply to Jakub Jelinek from comment #11) > And I don't see any code generation changes on the #c0 testcase with added > #include with the patch. Yes, that's because we cannot disambiguate the call against a restrict qualified memory access. But otherwise it "works". Note maybe the restrict qualification isn't the best representation since it doesn't capture the value will die upon function return (does it? I gues the DTOR if any will run in caller context and thus stores to it are not necessarily "dead"). It's on my list of things to do to investigate adding "'restrict' support" to calls for GCC14.
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #11 from Jakub Jelinek --- And I don't see any code generation changes on the #c0 testcase with added #include with the patch.
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #10 from Jakub Jelinek --- Making the reference types to return or parameter non-POD types passed by value restrict could be --- gcc/cp/call.cc.jj 2023-03-30 09:34:05.609725768 +0200 +++ gcc/cp/call.cc 2023-04-13 09:56:53.226908996 +0200 @@ -9223,7 +9223,11 @@ type_passed_as (tree type) { /* Pass classes with copy ctors by invisible reference. */ if (TREE_ADDRESSABLE (type)) -type = build_reference_type (type); +{ + type = build_reference_type (type); + type = build_qualified_type (type, + TYPE_QUALS (type) | TYPE_QUAL_RESTRICT); +} else if (targetm.calls.promote_prototypes (NULL_TREE) && INTEGRAL_TYPE_P (type) && COMPLETE_TYPE_P (type) --- gcc/cp/cp-gimplify.cc.jj2023-03-15 15:36:02.500430556 +0100 +++ gcc/cp/cp-gimplify.cc 2023-04-13 09:57:51.989059798 +0200 @@ -2000,6 +2000,9 @@ cp_genericize (tree fndecl) { t = DECL_RESULT (fndecl); TREE_TYPE (t) = build_reference_type (TREE_TYPE (t)); + TREE_TYPE (t) + = build_qualified_type (TREE_TYPE (t), TYPE_QUALS (TREE_TYPE (t)) + | TYPE_QUAL_RESTRICT); DECL_BY_REFERENCE (t) = 1; TREE_ADDRESSABLE (t) = 0; relayout_decl (t); Completely untested. Does this what we want? Stage1 material anyway...
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #9 from Jonathan Wakely --- No: This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies): — in a return statement in a function with a class return type, when the expression is the name of a non-volatile object with automatic storage duration (other than a function parameter or a variable introduced by the exception-declaration of a handler (14.4)) with the same type (ignoring cv-qualification) as the function return type, the copy/move operation can be omitted by constructing the object directly into the function call’s return object "Other than a function parameter" means the return object must be at a distinct address.
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 --- Comment #8 from rguenther at suse dot de --- On Wed, 12 Apr 2023, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 > > Jakub Jelinek changed: > >What|Removed |Added > > CC||jakub at gcc dot gnu.org, >||jason at gcc dot gnu.org, >||mpolacek at gcc dot gnu.org > > --- Comment #7 from Jakub Jelinek --- > I think the middle-end could figure this out by itself, such PARM_DECLs or > RESULT_DECLs > are DECL_BY_REFERENCE and TREE_ADDRESSABLE (TREE_TYPE (TREE_TYPE > (parm_or_decl))). We currently require an additional TYPE_RESTRICT on the pointer but do not require TREE_ADDRESSABLE. Do you say TREE_ADDRESSABLE provides the same guarantees as TYPE_RESTRICT here? > I think that means that if one calls a function with two arguments the two > arguments won't alias: > struct S { S (); ~S (); S (const S &); int s; }; > void > foo (S a, S b) > { > } > void > bar (S c) > { > foo (c, c); > } > Inside of the function one can take address of such a parameter as many times > as one wants and dereference through that of course. But I think one can > assume that when the > function is called, the argument (i.e. the address of the object in the > caller) > isn't stored anywhere else. I'd prefer if the frontend is explicit about this. What about S foo (S a) { return a; } can the return object alias the argument?
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org, ||jason at gcc dot gnu.org, ||mpolacek at gcc dot gnu.org --- Comment #7 from Jakub Jelinek --- I think the middle-end could figure this out by itself, such PARM_DECLs or RESULT_DECLs are DECL_BY_REFERENCE and TREE_ADDRESSABLE (TREE_TYPE (TREE_TYPE (parm_or_decl))). I think that means that if one calls a function with two arguments the two arguments won't alias: struct S { S (); ~S (); S (const S &); int s; }; void foo (S a, S b) { } void bar (S c) { foo (c, c); } Inside of the function one can take address of such a parameter as many times as one wants and dereference through that of course. But I think one can assume that when the function is called, the argument (i.e. the address of the object in the caller) isn't stored anywhere else.
[Bug c++/109443] missed optimization of std::vector access (Related to issue 35269)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-04-11 Status|UNCONFIRMED |NEW Version|unknown |13.0 Component|tree-optimization |c++ See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=35269 --- Comment #6 from Richard Biener --- Well, but v.size () is not loop invariant. As Andrew says 'v' is passed by (const?) reference but the PTA solution doesn't exactly reflect this so we do think the call clobbers it: # USE = nonlocal escaped # CLB = nonlocal escaped f (_1); ... # PT = nonlocal escaped null _10 = MEM[(const struct vector *)v_5(D)].D.35644._M_impl.D.34955._M_finish; # PT = nonlocal escaped null _11 = MEM[(const struct vector *)v_5(D)].D.35644._M_impl.D.34955._M_start; _12 = _10 - _11; _13 = _12 /[ex] 4; _14 = (long unsigned int) _13; if (i_8 < _14) goto ; [89.00%] so somehow the C++ handling of value-as-reference passing doesn't work (it should set TYPE_RESTRICT on the argument type used.