[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|7.0                         |6.4

--- Comment #16 from Richard Biener ---
Fixed on trunk and branch.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #15 from Richard Biener ---
Author: rguenth
Date: Thu Jan 19 12:02:43 2017
New Revision: 244625

URL: https://gcc.gnu.org/viewcvs?rev=244625&root=gcc&view=rev
Log:
2017-01-19  Richard Biener

	PR tree-optimization/72488
	* tree-ssa-sccvn.c (run_scc_vn): When we abort the VN make sure
	to restore SSA info.

Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/tree-ssa-sccvn.c
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #14 from Richard Biener ---
Author: rguenth
Date: Thu Jan 19 12:00:42 2017
New Revision: 244623

URL: https://gcc.gnu.org/viewcvs?rev=244623&root=gcc&view=rev
Log:
2017-01-19  Richard Biener

	PR tree-optimization/72488
	* tree-ssa-sccvn.c (run_scc_vn): When we abort the VN make sure
	to restore SSA info.
	* tree-ssa.c (verify_ssa): Verify SSA info is not shared.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-sccvn.c
    trunk/gcc/tree-ssa.c
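The verify_ssa change in the log above checks that SSA info is not shared. A minimal sketch of such a check follows. It is a toy model, not the code that was committed: the struct names, toy_verify_info_not_shared and toy_verify_demo are all hypothetical, and the quadratic scan is only an assumption chosen for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the per-name info record and for an SSA name.
   These are hypothetical types, not GCC's.  */
struct toy_info { unsigned long nonzero_bits; };
struct toy_name { struct toy_info *info; };

/* Return true when no two names point at the same non-NULL info record.
   Names without info (NULL) are skipped.  The scan is quadratic to keep
   the sketch short.  */
bool toy_verify_info_not_shared(const struct toy_name *names, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (names[i].info == NULL)
            continue;
        for (size_t j = i + 1; j < count; j++)
            if (names[j].info == names[i].info)
                return false;
    }
    return true;
}

/* Build three names; optionally alias the third one to the first.  */
bool toy_verify_demo(bool share)
{
    struct toy_info a = { 0x1ul }, b = { 0xfful };
    struct toy_name names[3] = { { &a }, { NULL }, { &b } };

    if (share)
        names[2].info = names[0].info;  /* the state the verifier rejects */
    return toy_verify_info_not_shared(names, 3);
}
```

In this model toy_verify_demo(false) passes and toy_verify_demo(true) fails, which is the kind of early failure a verifier gives instead of wrong code much later in the pipeline.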
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #13 from Richard Biener ---
Ah.

Setting value number of t34_9752(D) to t34_9752(D) (changed)
Setting value number of t42_9760(D) to t42_9760(D) (changed)
WARNING: Giving up with SCCVN due to SCC size 10003 exceeding 1

So we're failing to call scc_vn_restore_ssa_info when we fail this way.  Oops.
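The missing call on the give-up path can be sketched with a toy model. Everything below is an assumption made for illustration: the struct, the toy_* functions and the idea of stashing an int are hypothetical and only mimic the shape of "overwrite the info while value numbering runs, restore it afterwards". The restore_on_abort flag switches between the behaviour described above and the fixed behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy SSA name: `info` stands in for the SSA info, `saved_info` for the
   copy stashed while value numbering runs.  Hypothetical layout.  */
struct toy_name {
    int info;
    int saved_info;
    bool has_saved;
};

/* Overwrite a name's info, remembering the original the first time.  */
void toy_overwrite_info(struct toy_name *n, int new_info)
{
    if (!n->has_saved) {
        n->saved_info = n->info;
        n->has_saved = true;
    }
    n->info = new_info;
}

/* Put back every stashed original.  */
void toy_restore_info(struct toy_name *names, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (names[i].has_saved) {
            names[i].info = names[i].saved_info;
            names[i].has_saved = false;
        }
    }
}

/* Tentatively give every name the info of name 0 (count must be >= 1),
   then give up if the "SCC" is too large.  Returns false on give-up.
   With restore_on_abort == false the give-up path leaks the tentative
   info, which is the failure mode described above.  */
bool toy_run_vn(struct toy_name *names, size_t count,
                size_t max_scc_size, bool restore_on_abort)
{
    int leader = names[0].info;
    for (size_t i = 0; i < count; i++)
        toy_overwrite_info(&names[i], leader);

    bool gave_up = count > max_scc_size;
    if (!gave_up || restore_on_abort)
        toy_restore_info(names, count);
    return !gave_up;
}

/* Run the model on two names with a limit of 1 so that it gives up,
   and report what the second name's info looks like afterwards.  */
int toy_info_after_abort(bool restore_on_abort)
{
    struct toy_name names[2] = { { 10, 0, false }, { 20, 0, false } };
    toy_run_vn(names, 2, 1, restore_on_abort);
    return names[1].info;
}
```

Without the restore, the second name is left carrying the first name's info (10 instead of 20). With it, the original value comes back.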
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #12 from Richard Biener ---
Ok, with r235817 and a tiny verifier I can reproduce the sharing.  Will
investigate/fix.  It's gone latent on trunk...
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #11 from Jeffrey A. Law ---
We have two different SSA_NAMEs whose SSA_NAME_INFO is the same pointer.  Thus
modifying range info by way of set_range_info changes the underlying range on
both SSA_NAMEs.

From a debugging session:

(gdb) p cfun->gimple_df->ssa_names.m_vecdata[1045].ssa_name.info.range_info
$33 = (range_info_def *) 0x7fffefa4a420
(gdb) p cfun->gimple_df->ssa_names.m_vecdata[938].ssa_name.info.range_info
$34 = (range_info_def *) 0x7fffefa4a420

So, to follow how we get to that state:

First, ccp_finalize calls set_nonzero_bits on SSA_NAME 938.  That allocates a
hunk of memory and initializes it with the right information.

Then ccp_finalize calls set_nonzero_bits on SSA_NAME 1045.  That allocates
another hunk of memory and initializes it appropriately.

The actual recorded ranges are different -- they differ in nonzero_bits.  So
far, so good.  No problems.

Then FRE comes along, decides the two SSA_NAMEs are equal, and executes this
code:

          /* Use that from the dominator.  */
          SSA_NAME_RANGE_INFO (to) = SSA_NAME_RANGE_INFO (from);
          SSA_NAME_ANTI_RANGE_P (to) = SSA_NAME_ANTI_RANGE_P (from);

TO is SSA_NAME 1045, FROM is SSA_NAME 938.  At that point we have two distinct
SSA_NAMEs with a shared SSA_NAME_RANGE_INFO pointer.

EVRP then comes along and calls set_range_info on SSA_NAME 1045, which has the
side effect of changing the range info on SSA_NAME 938.  This changes the
nonzero bits associated with SSA_NAME 938 (which ultimately results in
incorrect code starting in a later CCP pass).

Shortly thereafter we release SSA_NAME 1045 because its LHS is going to be
fully propagated away.

And it makes perfect sense why last week's change to not set range info on
statements marked for removal can cause this bug to go latent.  We avoid the
call to set_range_info on SSA_NAME 1045 and thus don't clobber the nonzero
bits for SSA_NAME 938.

FWIW, knowing that I can diff the .ccp2 dumps to look for evidence of this
issue gave me some hope I could reduce the testcase further without worrying
about still being able to execute the test.  Alas, that didn't make any
significant difference -- a few things were removed, but not enough to
significantly simplify the test.

BTW, I don't expect to be online much over the next few days...  Good luck...
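The sequence in this comment (two separate allocations, a pointer copy, then a write through the alias) can be replayed in a small self-contained model. This is only a sketch: the toy_* types and functions are hypothetical stand-ins for range_info_def, SSA_NAME, set_nonzero_bits and the FRE assignment, and the mask values 0x1, 0xff and 0xf0 are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; hypothetical, not GCC's types.  */
struct toy_range_info { unsigned long nonzero_bits; };
struct toy_ssa_name { struct toy_range_info *range_info; };

/* Models a setter that "allocates" on first use (here: adopts the
   caller-provided `fresh` slot) and otherwise writes in place.
   `fresh` may be NULL when the name already has range info.  */
void toy_set_nonzero_bits(struct toy_ssa_name *name,
                          struct toy_range_info *fresh,
                          unsigned long mask)
{
    if (name->range_info == NULL)
        name->range_info = fresh;
    name->range_info->nonzero_bits = mask;
}

/* Models the FRE step: TO adopts FROM's pointer rather than a copy.  */
void toy_share_range_info(struct toy_ssa_name *to,
                          const struct toy_ssa_name *from)
{
    to->range_info = from->range_info;
}

/* Replay the sequence and return name 938's mask at the end.  */
unsigned long toy_replay_sharing(void)
{
    struct toy_range_info slot_a, slot_b;
    struct toy_ssa_name n938 = { NULL }, n1045 = { NULL };

    toy_set_nonzero_bits(&n938, &slot_a, 0x1ul);   /* first CCP call   */
    toy_set_nonzero_bits(&n1045, &slot_b, 0xfful); /* second CCP call  */
    toy_share_range_info(&n1045, &n938);           /* FRE: now aliased */
    toy_set_nonzero_bits(&n1045, NULL, 0xf0ul);    /* EVRP-style write */

    return n938.range_info->nonzero_bits;          /* no longer 0x1    */
}
```

In this model name 938 ends up with the mask written for name 1045, even though nothing ever wrote to 938 directly after the first call.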
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #10 from rguenther at suse dot de ---
On Wed, 18 Jan 2017, law at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488
>
> --- Comment #9 from Jeffrey A. Law ---
> So, just to record some thoughts.
>
> There's about a half-dozen patches, mostly from the August timeframe that will
> make this bug go latent.  The general theme across them is they change the
> order in which we visit statements in sccvn/vrp, twiddle the set of SSA_NAMEs
> live ever so slightly prior to ccp & vrp or avoid setting ranges to names that
> we're going to full propagate away.
>
> The last has been the most fruitful in terms of starting to understand what's
> going on.  Essentially by avoiding setting ranges on SSA_NAMES we're going to
> propagate away in evrp, we're changing the behavior of ccp!  That (of course)
> doesn't seem right.
>
> I do know that we end up equating two SSA_NAMEs in the sccvn code.  It
> *appears* that we don't undo the sharing of range info between them.  EVRP
> comes along and stomps on the (now shared) range info (particularly the mask
> used by bitcpp).  That in turn appears to be causing bitcpp to incorrectly
> simplify some integer arithmetic that presumably causes things to go awry
> later.

Range info shouldn't be "shared" - where does this bogus sharing happen?

Richard.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #9 from Jeffrey A. Law ---
So, just to record some thoughts.

There are about a half-dozen patches, mostly from the August timeframe, that
will make this bug go latent.  The general theme across them is that they
change the order in which we visit statements in sccvn/vrp, twiddle the set of
SSA_NAMEs live ever so slightly prior to ccp & vrp, or avoid setting ranges on
names that we're going to fully propagate away.

The last has been the most fruitful in terms of starting to understand what's
going on.  Essentially, by avoiding setting ranges on SSA_NAMEs we're going to
propagate away in evrp, we're changing the behavior of ccp!  That (of course)
doesn't seem right.

I do know that we end up equating two SSA_NAMEs in the sccvn code.  It
*appears* that we don't undo the sharing of range info between them.  EVRP
comes along and stomps on the (now shared) range info (particularly the mask
used by bit-CCP).  That in turn appears to be causing bit-CCP to incorrectly
simplify some integer arithmetic that presumably causes things to go awry
later.

I'm going to try and reduce the testcase tomorrow, looking for the dump file
differences in ccp2.  While that won't produce a working executable, it ought
to make the analysis phase of how/why we're sharing range info, why the info
gets overwritten, etc. a bit easier to follow (not to mention faster).

Anyway, just recording thoughts from last night's late debugging session.
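To see why a stomped mask can lead to a wrong simplification, here is a minimal sketch of how a nonzero-bits mask is typically consumed. The specific fold shown (x & C becoming 0 when the mask and C do not overlap) is an illustrative assumption, as are the function names and the values. The comment above only says that some integer arithmetic is simplified incorrectly.

```c
#include <assert.h>
#include <stdbool.h>

/* A clear bit in nonzero_bits means "this bit of the value is known to be
   zero".  If the mask and the constant have no bit in common, x & constant
   can be folded to 0 at compile time.  */
bool toy_and_folds_to_zero(unsigned long nonzero_bits, unsigned long constant)
{
    return (nonzero_bits & constant) == 0;
}

/* Check the fold against an actual runtime value: the fold is sound only
   if the real result of x & constant is 0 whenever we decided to fold.  */
bool toy_fold_is_sound(unsigned long runtime_value,
                       unsigned long nonzero_bits,
                       unsigned long constant)
{
    if (!toy_and_folds_to_zero(nonzero_bits, constant))
        return true;                      /* nothing was folded */
    return (runtime_value & constant) == 0;
}
```

With an accurate mask of 0xff, a value of 0x10 and a constant of 0x10, nothing is folded and the result is sound. If the mask has been overwritten with 0x0f (a mask that belonged to some other name), the expression folds to 0 even though the real result is 0x10, so toy_fold_is_sound(0x10, 0x0f, 0x10) reports an unsound fold.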
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488 --- Comment #8 from Jeffrey A. Law --- I've got a solid theory here. But it's too late to test and write up.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488 --- Comment #7 from Jeffrey A. Law --- I poked at this a bit yesterday (given the irreducible loops I've got some concerns that jump threading might be involved). Whatever is going on, it is highly sensitive to just about any codegen changes. I've identified 3 separate patches from August that can make the bug go latent. Arggh.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #6 from rguenther at suse dot de ---
On Tue, 22 Nov 2016, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488
>
> Jakub Jelinek changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
>
> --- Comment #5 from Jakub Jelinek ---
> This doesn't reproduce anymore starting with r239357 (which on the other side
> introduced PR77766).  Has it just gone latent?

I think so.  I never fully analyzed the issue.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek ---
This doesn't reproduce anymore starting with r239357 (which on the other side
introduced PR77766).  Has it just gone latent?
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
                 CC|                            |law at redhat dot com
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #4 from Richard Biener ---
-Os -fno-tree-sink -fno-tree-reassoc -fno-tree-loop-im -fno-tree-pre
-fdisable-tree-ifcombine -fno-tree-loop-optimize -fno-tree-forwprop
-fdisable-tree-vrp2 -fdisable-tree-phiopt3 -fdisable-tree-phiopt2
-fdump-tree-all-lineno -fdisable-tree-widening_mul -fdisable-tree-dom3
-fdisable-tree-phicprop2 -fdisable-tree-slsr -fdisable-tree-pre
-fdisable-tree-dom2 -fdisable-tree-cselim -fdisable-tree-phiopt1
-fdisable-tree-tailr2 -fdisable-tree-ch2 -fdisable-tree-cdce
-fdisable-tree-isolate-paths -fno-if-conversion -fno-if-conversion2
-fdbg-cnt=registered_jump_thread:0 -fdisable-tree-phicprop1
-fdisable-tree-bswap -fdisable-tree-laddress -fdisable-tree-sra
-fno-move-loop-invariants -da -fdisable-rtl-hoist

still reproduces it.  Looking at VRP1 in more detail may be worthwhile,
possibly adding a debug counter for substitute-and-fold to force some ranges
to varying.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

--- Comment #3 from Richard Biener ---
-fdisable-tree-vrp1 "fixes" it (yes, the revision uncovered a latent bug).
VRP1 performs a lot of jump-threading, thus it isn't very likely the culprit.

Program received signal SIGFPE, Arithmetic exception.
0x00401194 in fn1 () at t.c:130
130       t4 = -(~r % ~e | t17 & t23 / t18 / t4 - t9 % l ^ o);
(gdb) p t18
$1 = 0

The testcase is extremely large and has many irreducible regions, so it's hard
to see what goes wrong.  Passes so far trimmed down to

-Os -fno-tree-sink -fno-tree-reassoc -fno-tree-loop-im -fno-tree-pre
-fdisable-tree-ifcombine -fno-tree-loop-optimize -fno-tree-forwprop
-fdisable-tree-vrp2 -fdisable-tree-phiopt3 -fdisable-tree-phiopt2
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener ---
I will have a looksee.
[Bug rtl-optimization/72488] [7 Regression] wrong code (SIGFPE) at -Os and above on x86_64-linux-gnu (in the 64-bit mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72488

Marek Polacek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-07-26
                 CC|                            |mpolacek at gcc dot gnu.org
   Target Milestone|---                         |7.0
            Summary|wrong code (SIGFPE) at -Os  |[7 Regression] wrong code
                   |and above on                |(SIGFPE) at -Os and above
                   |x86_64-linux-gnu (in the    |on x86_64-linux-gnu (in the
                   |64-bit mode)                |64-bit mode)
     Ever confirmed|0                           |1

--- Comment #1 from Marek Polacek ---
Uh.  Started with r235817.