[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu since r10-3311
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 --- Comment #8 from Martin Jambor --- Fixed on trunk. I did not want to backport this but because this variant does not require disabling DCE, I will probably do after a few weeks on master, if there are no issues.
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu since r10-3311
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 --- Comment #7 from GCC Commits --- The master branch has been updated by Martin Jambor : https://gcc.gnu.org/g:a9a8426e534760b8d3a250e9bd3cff4db131a2be commit r14-8398-ga9a8426e534760b8d3a250e9bd3cff4db131a2be Author: Martin Jambor Date: Wed Jan 24 19:12:31 2024 +0100 ipa: Self-DCE of uses of removed call LHSs (PR 108007) PR 108007 is another manifestation where we rely on DCE to clean-up after IPA-SRA and if the user explicitely switches DCE off, IPA-SRA can leave behind statements which are fed uninitialized values and trap, even though their results are themselves never used. I have already fixed this for unused parameters in callees, this bug shows that almost the same thing can happen for removed returns, on the side of callers. This means that the issue has to be fixed elsewhere, in call redirection. This patch adds a function which looks for (and through, using a work-list) uses of operations fed specific SSA names and removes them all. That would have been easy if it wasn't for debug statements during tree-inline (from which call redirection is also invoked). Debug statements are decoupled from the rest at this point and iterating over uses of SSAs does not bring them up. During tree-inline they are handled especially at the end, I assume in order to make sure that relative ordering of UIDs are the same with and without debug info. This means that during tree-inline we need to make a hash of killed SSAs, that we already have in copy_body_data, available to the function making the purging. So the patch duly does also that, making the interface slightly ugly. Moreover, all newly unused SSA names need to be freed and as PR 112616 showed, it must be done in a defined order, which is what newly added ipa_release_ssas_in_hash does. gcc/ChangeLog: 2024-01-12 Martin Jambor PR ipa/108007 PR ipa/112616 * cgraph.h (cgraph_edge): Add a parameter to redirect_call_stmt_to_callee. * ipa-param-manipulation.h (ipa_param_adjustments): Add a parameter to modify_call. (ipa_release_ssas_in_hash): Declare. * cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New parameter killed_ssas, pass it to padjs->modify_call. * ipa-param-manipulation.cc (purge_all_uses): New function. (ipa_param_adjustments::modify_call): New parameter killed_ssas. Instead of substituting uses, invoke purge_all_uses. If hash of killed SSAs has not been provided, create a temporary one and release SSAs that have been added to it. (compare_ssa_versions): New function. (ipa_release_ssas_in_hash): Likewise. * tree-inline.cc (redirect_all_calls): Create id->killed_new_ssa_names earlier, pass it to edge redirection, adjust a comment. (copy_body): Release SSAs in id->killed_new_ssa_names. gcc/testsuite/ChangeLog: 2024-01-15 Martin Jambor PR ipa/108007 PR ipa/112616 * gcc.dg/ipa/pr108007.c: New test. * gcc.dg/ipa/pr112616.c: Likewise.
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu since r10-3311
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 --- Comment #6 from Martin Jambor --- (In reply to Andrew Pinski from comment #1) > # q_11 = PHI <0B(2), removed_return.14_14(D)(4), > removed_return.14_14(D)(3)> > _12 = *q_11; > > > WTF Well, _12 is not used anywhere, so the code expects the entire load to be DCEd. But it gets optimized to _2 = MEM[(int *)0B]; before DCE sees it and then even if _2 is never used anywhere, apparently the statement is kept there as an intended trap (I guess). I have adjusted my patch to make DCE for removed returnd part of IPA edge redirection so that it does not have compare-debug problems and submitted it for review in: https://inbox.sourceware.org/gcc-patches/ri6cyu1e9kw.fsf@/T/#u
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu since r10-3311
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 Martin Jambor changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jamborm at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #5 from Martin Jambor --- (In reply to Andrew Pinski from comment #2) > This is like PR 108007 but unlike that one, -fno-tree-dce is not used. But the patch fixes it, so I gess it's time to make it pass ppc64le bootstrap. (But I did not want to backport that patch, I wonder whether we can't figure out something simpler :-/ )
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu since r10-3311
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Summary|[11/12/13/14 Regression]|[11/12/13/14 Regression] |wrong code at -O{s,2,3} on |wrong code at -O{s,2,3} on |x86_64-linux-gnu|x86_64-linux-gnu since ||r10-3311 --- Comment #4 from Jakub Jelinek --- Started with r10-3311-gff6686d2e5f797d6c6a36ad14a7084bc1dc350e4
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 Richard Biener changed: What|Removed |Added Priority|P3 |P2 CC||hubicka at gcc dot gnu.org, ||jamborm at gcc dot gnu.org Version|unknown |13.2.1 --- Comment #3 from Richard Biener --- IPA-SRA at work. With -fno-inline added you see -long int p () +void p.isra () { ... @@ -106,13 +82,13 @@ i.2_2 = i; h.3_3 = h; *i.2_2 = h.3_3; - _4 = o (); + o.isra (); [local count: 1073741824]: - # q_11 = PHI <0B(2), _4(3)> - _5 = *q_11; - _10 = (long int) _5; - return _10; + # q_5 = PHI <0B(2), removed_return.17_12(D)(3)> + _6 = *q_5; + _7 = (long int) _6; + return; Note the issue is that we end up with the following after inlining: [local count: 1073741824]: # q_11 = PHI <0B(2), removed_return.14_14(D)(4), removed_return.14_14(D)(3)> _12 = *q_11; and CCP will optimistically simplify q_11 to 0B because accessing the removed_return.14_14(D) would be undefined behavior. That means substituting a default-def isn't correct - it exposes undefined behavior on valid paths of the program.
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 --- Comment #2 from Andrew Pinski --- This is like PR 108007 but unlike that one, -fno-tree-dce is not used.
[Bug ipa/112616] [11/12/13/14 Regression] wrong code at -O{s, 2, 3} on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112616 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-11-19 CC||marxin at gcc dot gnu.org Status|UNCONFIRMED |NEW Component|tree-optimization |ipa --- Comment #1 from Andrew Pinski --- # q_11 = PHI <0B(2), removed_return.14_14(D)(4), removed_return.14_14(D)(3)> _12 = *q_11; WTF