[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #11 from jamborm at gcc dot gnu dot org 2010-05-24 09:43 ---
(In reply to comment #9)
> (In reply to comment #7)
> > This is now fixed on both the trunk and the 4.5 branch.
>
> this commit produces broken libkhtml.so.5.4.0 from kdelibs-4.4.3.
> in details, it produces different/broken binaries for khtml/css/parser.cpp
> and khtml/svg/SVGGradientElement.cpp.

Please file this as a separate bug and CC me. I can't promise I'll be able to look at it this week though.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #12 from pluto at agmk dot net 2010-05-24 11:04 ---
(From update of attachment 20731)
Moved to a separate PR, PR44258.

-- 

pluto at agmk dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #20731|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #9 from pluto at agmk dot net 2010-05-23 11:53 ---
(In reply to comment #7)
> This is now fixed on both the trunk and the 4.5 branch.

This commit produces a broken libkhtml.so.5.4.0 from kdelibs-4.4.3. In detail, it produces different/broken binaries for khtml/css/parser.cpp and khtml/svg/SVGGradientElement.cpp. In the end we get a nice GPF during knode/kmail/konqueror startup:

[KCrash Handler]
#5  memcpy () at ../sysdeps/x86_64/memcpy.S:78
#6  0x7f546e63fc5e in QString::QString(QChar const*, int) () from /usr/lib64/libQtCore.so.4
#7  0x7f5469f70e2e in qString (ps=<value optimized out>) at /usr/src/debug/kdelibs-4.4.3/khtml/css/cssparser.h:84
#8  DOM::CSSParser::parseValue (ps=<value optimized out>) at /usr/src/debug/kdelibs-4.4.3/khtml/css/cssparser.cpp:518
#9  0x7f5469f95075 in cssyyparse (parser=0x7fff08c22820) at /usr/src/debug/kdelibs-4.4.3/khtml/css/parser.cpp:2969
#10 0x7f5469f67d00 in DOM::CSSParser::runParser (this=0x7fff08c22820) at /usr/src/debug/kdelibs-4.4.3/khtml/css/cssparser.cpp:151
(...)

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #10 from pluto at agmk dot net 2010-05-23 21:25 ---
Created an attachment (id=20731)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20731&action=view)
parser.i from kdelibs.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #6 from jamborm at gcc dot gnu dot org 2010-04-28 13:10 ---
Subject: Bug 43846

Author: jamborm
Date: Wed Apr 28 13:09:56 2010
New Revision: 158826

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158826
Log:
2010-04-28  Martin Jambor  mjam...@suse.cz

        PR tree-optimization/43846
        * tree-sra.c (struct access): New flag grp_assignment_read.
        (build_accesses_from_assign): Set grp_assignment_read.
        (sort_and_splice_var_accesses): Propagate grp_assignment_read.
        (enum mark_read_status): New type.
        (analyze_access_subtree): Propagate grp_assignment_read, create
        accesses also if both direct_read and root->grp_assignment_read.

        * testsuite/gcc.dg/tree-ssa/sra-10.c: New test.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/tree-ssa/sra-10.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_5-branch/gcc/tree-sra.c

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #7 from jamborm at gcc dot gnu dot org 2010-04-28 13:15 ---
This is now fixed on both the trunk and the 4.5 branch.

-- 

jamborm at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #8 from tbptbp at gmail dot com 2010-04-28 13:43 ---
Allow me to extend to you my most profuse praises and blessings; may all the women in your vicinity fall pregnant and your male progeny be granted abounding chest hair.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #5 from jamborm at gcc dot gnu dot org 2010-04-23 14:52 ---
Subject: Bug 43846

Author: jamborm
Date: Fri Apr 23 14:52:06 2010
New Revision: 158668

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158668
Log:
2010-04-23  Martin Jambor  mjam...@suse.cz

        PR tree-optimization/43846
        * tree-sra.c (struct access): New flag grp_assignment_read.
        (build_accesses_from_assign): Set grp_assignment_read.
        (sort_and_splice_var_accesses): Propagate grp_assignment_read.
        (enum mark_read_status): New type.
        (analyze_access_subtree): Propagate grp_assignment_read, create
        accesses also if both direct_read and root->grp_assignment_read.

        * testsuite/gcc.dg/tree-ssa/sra-10.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/sra-10.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #1 from rguenth at gcc dot gnu dot org 2010-04-22 09:07 ---
Hm, frob1 looks like

_Z5frob1RK5foo_tRS_:
.LFB18:
        movss   (%rdi), %xmm3
        movss   4(%rdi), %xmm2
        movaps  %xmm3, %xmm4
        movaps  %xmm2, %xmm0
        mulss   %xmm3, %xmm4
        movss   8(%rdi), %xmm1
        mulss   %xmm2, %xmm0
        addss   %xmm4, %xmm0
        movaps  %xmm1, %xmm4
        mulss   %xmm1, %xmm4
        addss   %xmm4, %xmm0
        rsqrtss %xmm0, %xmm4
        mulss   %xmm4, %xmm0
        mulss   %xmm4, %xmm0
        mulss   .LC1(%rip), %xmm4
        addss   .LC0(%rip), %xmm0
        mulss   %xmm4, %xmm0
        mulss   %xmm0, %xmm3
        mulss   %xmm0, %xmm2
        mulss   %xmm1, %xmm0
        movss   %xmm3, (%rsi)
        movss   %xmm2, 4(%rsi)
        movss   %xmm0, 8(%rsi)
        ret

and frob2 like

_Z5frob2RK5bar_tRS_:
.LFB19:
        movss   (%rdi), %xmm3
        movss   4(%rdi), %xmm2
        movaps  %xmm3, %xmm4
        movaps  %xmm2, %xmm0
        mulss   %xmm3, %xmm4
        movss   8(%rdi), %xmm1
        mulss   %xmm2, %xmm0
        addss   %xmm4, %xmm0
        movaps  %xmm1, %xmm4
        mulss   %xmm1, %xmm4
        addss   %xmm4, %xmm0
        rsqrtss %xmm0, %xmm4
        mulss   %xmm4, %xmm0
        mulss   %xmm4, %xmm0
        mulss   .LC1(%rip), %xmm4
        addss   .LC0(%rip), %xmm0
        mulss   %xmm4, %xmm0
        mulss   %xmm0, %xmm3
        mulss   %xmm0, %xmm2
        mulss   %xmm1, %xmm0
        movss   %xmm3, -24(%rsp)
        movss   %xmm2, -20(%rsp)
        movq    -24(%rsp), %rax
        movss   %xmm0, -16(%rsp)
        movq    %rax, (%rsi)
        movl    -16(%rsp), %eax
        movl    %eax, 8(%rsi)
        ret

so it's an aggregate copy that is not scalarized in frob2:

  b_1(D)->x = D.2444_20;
  b_1(D)->y = D.2443_19;
  b_1(D)->z = D.2442_18;
  return;

vs.

  D.2464.m[0] = D.2473_20;
  D.2464.m[1] = D.2472_19;
  D.2464.m[2] = D.2471_18;
  *b_1(D) = D.2464;
  return;

All inlining happens during early inlining, and frob1 and frob2 are reasonably similar after early inlining. But then we have early SRA, which does

;; Function void frob1(const foo_t&, foo_t&) (_Z5frob1RK5foo_tRS_)

Candidate (2452): D.2452
Candidate (2434): v
Candidate (2435): D.2435
Will attempt to totally scalarize D.2435 (UID: 2435):
Will attempt to totally scalarize D.2452 (UID: 2452):
Marking v offset: 0, size: 32: to be replaced.
Marking v offset: 32, size: 32: to be replaced.
Marking v offset: 64, size: 32: to be replaced.
...

;; Function void frob2(const bar_t&, bar_t&) (_Z5frob2RK5bar_tRS_)

Candidate (2481): D.2481
Candidate (2464): D.2464
Candidate (2463): v
Marking v offset: 0, size: 32: to be replaced.
Marking v offset: 32, size: 32: to be replaced.
Marking v offset: 64, size: 32: to be replaced.
...
! Disqualifying D.2464 - No scalar replacements to be created.

so it doesn't consider the struct with the array for total scalarization for some reason. Martin?

-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2010-04-22 09:07:40
               date|                            |
            Summary|4.5.0 regression, array vs  |[4.5 Regression] array vs
                   |members, dead code removal  |members, total scalarization
                   |issues                      |issues
   Target Milestone|---                         |4.5.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
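For readers who do not have the original test case at hand, here is a minimal sketch of source code with the shape that the dumps in comment #1 imply. It is an assumption, not the reporter's code: the signatures come from the mangled names, the member names `x`, `y`, `z` and `m[]` come from the GIMPLE, the overloaded helper `normalized` is hypothetical, and a plain `1.0f / std::sqrt(...)` stands in for whatever expression produced the `rsqrtss` sequence.

```cpp
#include <cmath>

// Three named float members: the variant that gets scalarized.
struct foo_t { float x, y, z; };

// Same size, but stored as a small array: the variant from the report.
struct bar_t { float m[3]; };

static_assert(sizeof(foo_t) == sizeof(bar_t), "both variants hold three floats");

// Hypothetical helpers: build a normalized temporary and return it by value.
static inline foo_t normalized(const foo_t &v) {
    float s = 1.0f / std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    foo_t r = { v.x * s, v.y * s, v.z * s };
    return r;
}

static inline bar_t normalized(const bar_t &v) {
    float s = 1.0f / std::sqrt(v.m[0] * v.m[0] + v.m[1] * v.m[1] + v.m[2] * v.m[2]);
    bar_t r = { { v.m[0] * s, v.m[1] * s, v.m[2] * s } };
    return r;
}

// Both functions compute a temporary aggregate and then copy it into b
// as a whole; only the layout of the aggregate differs.
void frob1(const foo_t &a, foo_t &b) { b = normalized(a); }
void frob2(const bar_t &a, bar_t &b) { b = normalized(a); }
```

Both functions do the same arithmetic, so any difference in the generated code comes from how the temporary aggregate is handled, which is the point of the comparison in comment #1.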
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #2 from jamborm at gcc dot gnu dot org 2010-04-22 12:35 ---
(In reply to comment #1)
> so it doesn't consider the struct with the array for total scalarization
> for some reason. Martin?

Well, that was a deliberate decision when fixing PR 42585 (see type_consists_of_records_p). The code is simpler because it does not have to know how to iterate over the array index domain. Of course, we can alleviate this restriction and learn how to iterate. However, all the accesses for the whole array are already created, that is not the issue.

The problem basically is that when we see the sequence

  D.2035.m[0] = D.2044_20;
  D.2035.m[1] = D.2043_19;
  D.2035.m[2] = D.2042_18;
  *b_1(D) = D.2035;

(and there are no other accesses to D.2035) the condition that tries to prevent us from creating unnecessary replacements kicks in and we decide not to scalarize.

The intent of the current code (possibly among other reasons) was to avoid going through a replacement when the whole structure was then passed as an argument to a function and similar situations. But it should not be very difficult to change the condition (in analyze_access_subtree) to handle both situations right. Doing this, rather than total scalarization for arrays (which should be only useful as a substitute for a copy propagation) should enable us to handle even huge arrays.

I'll get to this right after dealing with PR 43835.

-- 

jamborm at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |jamborm at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2010-04-22 09:07:40         |2010-04-22 12:35:41
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #3 from davidxl at gcc dot gnu dot org 2010-04-22 17:04 ---
(In reply to comment #2)
> (In reply to comment #1)
> > so it doesn't consider the struct with the array for total scalarization
> > for some reason. Martin?
>
> Well, that was a deliberate decision when fixing PR 42585 (see
> type_consists_of_records_p). The code is simpler because it does not
> have to know how to iterate over the array index domain. Of course,
> we can alleviate this restriction and learn how to iterate. However,
> all the accesses for the whole array are already created, that is not
> the issue.
>
> The problem basically is that when we see the sequence
>
>   D.2035.m[0] = D.2044_20;
>   D.2035.m[1] = D.2043_19;
>   D.2035.m[2] = D.2042_18;
>   *b_1(D) = D.2035;
>
> (and there are no other accesses to D.2035) the condition that tries
> to prevent us from creating unnecessary replacements kicks in and we
> decide not to scalarize.

This code sequence looks like a good motivating factor for scalarizing/expansion. In fact, small arrays should be treated the same way as records if all accesses are through compile time constant indices. This is a common scenario after full unrolling.

> The intent of the current code (possibly among other reasons) was to
> avoid going through a replacement when the whole structure was then
> passed as an argument to a function and similar situations.

If the temp aggregate is passed to call and the calling convention is not exposed at the IL level, then it is not a good sra candidate as no copy (both code and storage) elimination will be exposed. In this one, the temp aggregate is used as the RHS of an assignment, thus it is a good candidate to expand. So will be the reverse case:

  aggregate1 = aggregate2;
  ..
  ... = aggregate1.e1;
  ... = aggregate1.e2;

David

> But it should not be very difficult to change the condition (in
> analyze_access_subtree) to handle both situations right. Doing this,
> rather than total scalarization for arrays (which should be only
> useful as a substitute for a copy propagation) should enable us to
> handle even huge arrays.
>
> I'll get to this right after dealing with PR 43835.

-- 

davidxl at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xinliangli at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846
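The two situations discussed in comment #3 can be written out at source level roughly as follows. This is only an illustrative sketch: `pair_t`, `store_pair` and `sum_pair` are hypothetical names, and the member names `e1` and `e2` are borrowed from the pseudo-code in that comment.

```cpp
// A small aggregate with two scalar members.
struct pair_t { int e1; int e2; };

// Forward case: a temporary is filled member by member and is then
// used only as the right-hand side of an aggregate assignment.
void store_pair(pair_t *dst, int a, int b) {
    pair_t tmp;
    tmp.e1 = a;
    tmp.e2 = b;
    *dst = tmp;
}

// Reverse case: the temporary is written by an aggregate assignment
// and afterwards only its individual members are read.
int sum_pair(const pair_t *src) {
    pair_t tmp = *src;
    return tmp.e1 + tmp.e2;
}
```

In both functions the temporary never escapes to a call, so replacing it with scalars can, in principle, remove the intermediate copy; that is the distinction from the argument-passing case that comment #3 draws.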
[Bug tree-optimization/43846] [4.5 Regression] array vs members, total scalarization issues
--- Comment #4 from jamborm at gcc dot gnu dot org 2010-04-22 17:18 ---
Created an attachment (id=20464)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20464&action=view)
Proposed fix

I'm currently testing this patch and will submit it tomorrow if everything goes OK.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43846