[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 --- Comment #35 from Richard Biener --- Author: rguenth Date: Wed Dec 19 11:10:08 2018 New Revision: 267262 URL: https://gcc.gnu.org/viewcvs?rev=267262&root=gcc&view=rev Log: 2018-12-19 Richard Biener PR tree-optimization/88533 Revert 2018-04-30 Richard Biener PR tree-optimization/28364 PR tree-optimization/85275 * tree-ssa-loop-ch.c (ch_base::copy_headers): Stop after copying first exit test. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust. * tree-ssa-loop-ch.c: Include tree-phinodes.h and ssa-iterators.h. (should_duplicate_loop_header_p): Track whether stmt compute loop invariants or values based on IVs. Apart from the original loop header only duplicate blocks with exit tests that are based on IVs or invariants. * gcc.dg/tree-ssa/copy-headers-6.c: New testcase. * gcc.dg/tree-ssa/copy-headers-7.c: Likewise. * gcc.dg/tree-ssa/ivopt_mult_1.c: Un-XFAIL. * gcc.dg/tree-ssa/ivopt_mult_2.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-6.c trunk/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c trunk/gcc/tree-ssa-loop-ch.c
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Known to work||9.0 Resolution|--- |FIXED --- Comment #34 from Richard Biener --- Fixed.
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 --- Comment #33 from Richard Biener --- Author: rguenth Date: Mon Apr 30 07:23:36 2018 New Revision: 259754 URL: https://gcc.gnu.org/viewcvs?rev=259754&root=gcc&view=rev Log: 2018-04-30 Richard Biener PR tree-optimization/28364 PR tree-optimization/85275 * tree-ssa-loop-ch.c (ch_base::copy_headers): Stop after copying first exit test. * gcc.dg/tree-ssa/copy-headers-5.c: New testcase. * gcc.dg/tree-ssa/predcom-8.c: Likewise. * gcc.dg/tree-ssa/cunroll-13.c: Rewrite to gimple testcase. * gcc.dg/tree-ssa/ivopt_mult_1.c: XFAIL. * gcc.dg/tree-ssa/ivopt_mult_1g.c: Add gimple variant that still passes. * gcc.dg/tree-ssa/ivopt_mult_2.c: XFAIL. * gcc.dg/tree-ssa/ivopt_mult_2g.c: Add gimple variant that still passes. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust. * gcc.dg/tree-ssa/20030710-1.c: Likewise. * gcc.dg/tree-ssa/20030711-1.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-5.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1g.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2g.c trunk/gcc/testsuite/gcc.dg/tree-ssa/predcom-8.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/20030710-1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/20030711-1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c trunk/gcc/tree-ssa-loop-ch.c
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 amker at gcc dot gnu.org changed: What|Removed |Added CC||amker at gcc dot gnu.org --- Comment #32 from amker at gcc dot gnu.org --- (In reply to bin.cheng from comment #31) > This is a really old issue! I will also check status of this issue on trunk. For multi-exit cases, I wonder if it's possible to copy only the exit test that checks the IV. It should still satisfy ch's purpose.
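To make the suggestion concrete, here is a rough source-level picture of a two-exit loop in which only the IV-based exit test has been copied in front of the loop. It is a sketch written for illustration, not compiler output: the function name is hypothetical, and the loop body is modelled on the character-scanning test cases discussed in this bug.

```c
#include <stddef.h>

/* Hypothetical illustration: only the exit test on the induction
   variable (c < end) is duplicated as a guard.  The data-dependent
   exit (the bad-character test) exists exactly once, inside the loop. */
int has_bad_chars_iv_test_copied(const char *str, size_t len)
{
    const char *c = str;
    const char *end = str + len;

    if (c < end) {                      /* copied IV-based exit test */
        do {
            unsigned char x = (unsigned char)*c;
            if (x <= 0x1f || x == 0x5c || x == 0x7f)
                return 1;               /* early exit, not duplicated */
            c++;
        } while (c < end);              /* loop is now bottom-tested */
    }
    return 0;
}
```

The guard still establishes that the loop runs at least once, which is what later loop passes want, while the body is not duplicated.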
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 bin.cheng changed: What|Removed |Added CC||amker.cheng at gmail dot com --- Comment #31 from bin.cheng --- This is a really old issue! I will also check status of this issue on trunk.
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 --- Comment #30 from Zack Weinberg --- It's been a very long time and I don't know exactly what changed, but GCC 7.3 generates essentially the same code for both of the functions in the "C test case" and I would not describe that code as "bad". I can still make it duplicate the entire body of the loop by relatively small tweaks, though. For instance,

    int has_bad_chars(char *str, __SIZE_TYPE__ len)
    {
      for (char *c = str; c < str + len; c++)
        {
          unsigned char x = (unsigned char)(*c);
          if (x <= 0x1f || x == 0x5c || x == 0x7f)
            return 1;
        }
      return 0;
    }

generates significantly worse code (doubling cache footprint for no gain in branch predictability or any other metric) with -O2 than -Os. Also, the code generated for the body of the loop (with both the original test case and the above) is more complicated than it needs to be, but perhaps that should be a new bug report.
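For readers unfamiliar with what "duplicate the entire body of the loop" means here, the following is a rough source-level picture, assuming both exit tests get copied in front of the loop. It is an illustration only, not GCC's actual output; the helper is_bad and the second function name are invented for this sketch, and size_t stands in for __SIZE_TYPE__.

```c
#include <stddef.h>

/* Helper introduced for this sketch only; same predicate as above. */
static int is_bad(unsigned char x)
{
    return x <= 0x1f || x == 0x5c || x == 0x7f;
}

/* The loop as written in the comment. */
int has_bad_chars(const char *str, size_t len)
{
    for (const char *c = str; c < str + len; c++)
        if (is_bad((unsigned char)*c))
            return 1;
    return 0;
}

/* Hypothetical picture of the loop after every exit test has been
   copied: since the body consists of nothing but exit tests, the
   whole first iteration ends up duplicated ahead of the loop. */
int has_bad_chars_all_exits_copied(const char *str, size_t len)
{
    const char *c = str;
    const char *end = str + len;

    /* Copy of the first iteration (both exit tests). */
    if (c >= end)
        return 0;
    if (is_bad((unsigned char)*c))
        return 1;
    c++;

    /* The remaining iterations run the same tests again. */
    while (c < end) {
        if (is_bad((unsigned char)*c))
            return 1;
        c++;
    }
    return 0;
}
```

Both functions return the same result for every input; the second one simply carries two copies of the scanning code.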
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364 Eric Gallager changed: What|Removed |Added CC||egallager at gcc dot gnu.org --- Comment #29 from Eric Gallager --- (In reply to Zdenek Dvorak from comment #28) > Subject: Bug 28364 > > Author: rakdver > Date: Wed Aug 16 21:14:11 2006 > New Revision: 116189 > > URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116189 > Log: > PR tree-optimization/28364 > * tree-ssa-loop-ivopts.c (aff_combination_to_tree): Handle zero > correctly. > (fold_affine_expr): New function. > (may_eliminate_iv): Use fold_affine_expr. > > > Modified: > trunk/gcc/ChangeLog > trunk/gcc/tree-ssa-loop-ivopts.c Did this fix it?
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #28 from rakdver at gcc dot gnu dot org 2006-08-16 21:14 --- Subject: Bug 28364 Author: rakdver Date: Wed Aug 16 21:14:11 2006 New Revision: 116189 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116189 Log: PR tree-optimization/28364 * tree-ssa-loop-ivopts.c (aff_combination_to_tree): Handle zero correctly. (fold_affine_expr): New function. (may_eliminate_iv): Use fold_affine_expr. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-loop-ivopts.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #27 from rakdver at gcc dot gnu dot org 2006-07-26 19:38 --- Patch for the wrong choice of induction variable: http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01125.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #26 from rakdver at gcc dot gnu dot org 2006-07-25 15:20 --- A patch for the "return in the middle of the loop" problem: http://gcc.gnu.org/ml/gcc-patches/2006-07/msg00893.html (to be committed once mainline is open). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #25 from rakdver at gcc dot gnu dot org 2006-07-18 00:45 --- Created an attachment (id=11906) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11906&action=view) A patch for loop header selection. I tried improving the heuristics for the selection of the loop header, however without success. In ch, copying all the exits simply looks like a good idea, since it makes the loop header contain the most important looking part of the loop, and makes the latch of the loop empty. I append the patch I made, in case someone wanted to play with it. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
-- rakdver at gcc dot gnu dot org changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rakdver at gcc dot gnu dot |dot org |org Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2006-07-14 14:12:12 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #24 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2006-07-13 09:03 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > > > However, ch isn't just copying the loop header; it is also > > > copying the *entire body of the loop*, which nothing can fix. I call > > > that a clear bug. > > > > how do you define a loop header? > > I was under the impression it was just the one basic block called out > in the .ch dump, e.g. > > ;; Loop 1 > ;; header 6, latch 5 > ;; depth 1, level 1, outer 0 > > -- basic block 6 happens to contain just the code from the syntactic > loop condition. Andrew informs me that this is wrong, and that in > this case the header is the entire loop, but I will come back at that > with 'ch should never be duplicating the entire loop; if the header is > the entire loop, it should do something more sensible, like duplicate > just the first basic block or something.' currently, we stop once the copied region is too large. This means that on "normal" loops that have a body that does something, we won't copy whole loop. Of course, any heuristics will have cases when it won't perform ideally. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #23 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2006-07-13 09:00 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > > on your real program, how much performance do you gain by hand-rewriting > > the assembler to look the way you like? Just to make sure there really > > is a problem. > > I'm a little annoyed by the suggestion that this wouldn't be a real > problem if I couldn't measure a performance difference. sorry, but it indeed would not be. I have seen so many examples of code that looks bad at first sight and performs just fine that unless there is something obviously wrong (which is not the case here, IMO), I am somewhat more careful before I spend my time on fixing this type of "bugs". > Depending on workload, other activity on the same machine, and phase > of moon, this loop is between .1% and 1% of runtime, and my tweaks > make it go about a third faster. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #22 from rguenth at gcc dot gnu dot org 2006-07-13 08:28 --- For practical purposes (determining the loop runs at least once) it needs to duplicate the exit condition. Which happens to be difficult here, as there are multiple loop exits. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
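As a minimal sketch of what duplicating the exit condition buys in the simple single-exit case: the test is copied once in front of the loop as a guard, and the loop itself becomes bottom-tested, so inside the guarded region the loop is known to run at least once. The function names below are made up for illustration and are not taken from GCC or from the test cases in this bug.

```c
#include <stddef.h>

/* Original shape: the exit test sits at the top of the loop. */
long sum_top_tested(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;
    while (i < n) {
        sum += a[i];
        i++;
    }
    return sum;
}

/* Shape after header copying: the exit test is duplicated as a guard,
   and the loop proper tests its condition at the bottom. */
long sum_header_copied(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;
    if (i < n) {                /* copied exit test */
        do {
            sum += a[i];        /* body executes at least once here */
            i++;
        } while (i < n);        /* original exit test, now in the latch */
    }
    return sum;
}
```

With several exits, as in this bug, the question becomes which of the tests to copy, since copying all of them can amount to copying the whole body.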
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #21 from zackw at panix dot com 2006-07-13 08:28 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > on your real program, how much performance do you gain by hand-rewriting > the assembler to look the way you like? Just to make sure there really > is a problem. I'm a little annoyed by the suggestion that this wouldn't be a real problem if I couldn't measure a performance difference. Depending on workload, other activity on the same machine, and phase of moon, this loop is between .1% and 1% of runtime, and my tweaks make it go about a third faster. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #20 from zackw at panix dot com 2006-07-13 08:25 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > > However, ch isn't just copying the loop header; it is also > > copying the *entire body of the loop*, which nothing can fix. I call > > that a clear bug. > > how do you define a loop header? I was under the impression it was just the one basic block called out in the .ch dump, e.g. ;; Loop 1 ;; header 6, latch 5 ;; depth 1, level 1, outer 0 -- basic block 6 happens to contain just the code from the syntactic loop condition. Andrew informs me that this is wrong, and that in this case the header is the entire loop, but I will come back at that with 'ch should never be duplicating the entire loop; if the header is the entire loop, it should do something more sensible, like duplicate just the first basic block or something.' -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #19 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2006-07-13 08:01 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > > > I-cache. > > this only matters if this increase in code size will make the code go > > out of instruction cache. > > The real program that this is taken from is a large C++ application > which is guaranteed to go out of cache - it's got slightly less than > four megabytes of .text - the actual goal is to make sure all of its > inner inner inner loops do stay in cache. And this is one of 'em. on your real program, how much performance do you gain by hand-rewriting the assembler to look the way you like? Just to make sure there really is a problem. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #18 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2006-07-13 07:58 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > > > I-cache. > > this only matters if this increase in code size will make the code go > > out of instruction cache. > > The real program that this is taken from is a large C++ application > which is guaranteed to go out of cache - it's got slightly less than > four megabytes of .text - the actual goal is to make sure all of its > inner inner inner loops do stay in cache. And this is one of 'em. > > > > Also, more iterations before the branch predictors figure out what's > > > going on. > > But also possibly more consistent behavior with respect to branch > > prediction, in case the loop is often exited in the first iteration. > > Again, in real life I know a priori that the function is rarely, if > ever, called with a zero-length string. > > - > > I went through the tree dumps for my week-old 4.2.0 for the test > program with a fine comb. They are quite instructive. If tree-ch > were doing what it says on the label -- copying the loop header -- > everything would be fine; dom1 cleans up the two copies of the header > later. However, ch isn't just copying the loop header; it is also > copying the *entire body of the loop*, which nothing can fix. I call > that a clear bug. how do you define a loop header? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #17 from zackw at panix dot com 2006-07-13 04:23 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > One more comment: a loop with an early exit is the whole issue, and that is the > reason why all of the code in the loop is considered the header. There are a > couple of loop headers in this case, one for each exit, which makes it harder > to deal with in general. I didn't know that, and it is not obvious from the optimizer dumps. Thanks for explaining. > What you did not mention is how this loop would normally exit: via the > return 1 or all the way through the loop. There is not enough information from > GCC's point of view to figure that out without profiling (for this case). GCC > is assuming that the loop exits in the first if statement, which seems > reasonable. Maybe you should try with profiling information and see what GCC > does for this testcase. Feedback-directed optimization is only good for making compilers look better on benchmarks. It's useless in real life. I can, in fact, get good code out of gcc 4.1 by beating it over the head with __builtin_expect, but I don't think I should have to do that. I think my suggested version is better code no matter whether or not the loop exits early. 4.2 still makes what I consider to be bad addressing mode choices after that change, but Zdenek did say he would look at that. It also puts the "return 1" exit block in the middle of the loop in spite of being told that all three conditions leading to that are unlikely.

    struct rep { unsigned long len; unsigned long alloc; unsigned long dummy; };
    struct data { char * ptr; };
    struct string { struct rep R; struct data D; };

    int has_bad_chars(struct data *path)
    {
      char *c;
      for (c = path->ptr;
           __builtin_expect(c < path->ptr + ((struct rep *)path)[-1].len, 1);
           c++)
        {
          unsigned char x = (unsigned char)(*c);
          if (__builtin_expect(x <= 0x1f || x == 0x5c || x == 0x7f, 0))
            return 1;
        }
      return 0;
    }

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #16 from pinskia at gcc dot gnu dot org 2006-07-13 04:01 --- If this is really a program's innermost loop, then the program is messed up, as there is no calculation going on here at all. What type of program is this? Do you cache the result of this function? Maybe caching the results will show other bottlenecks. Maybe even, instead of doing this loop, find another loop which loops over the text and process it at the same time. These are normal optimization tricks which usually cannot be done by the compiler. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #15 from pinskia at gcc dot gnu dot org 2006-07-13 03:45 --- One more comment: a loop with an early exit is the whole issue, and that is the reason why all of the code in the loop is considered the header. There are a couple of loop headers in this case, one for each exit, which makes it harder to deal with in general. What you did not mention is how this loop would normally exit: via the return 1 or all the way through the loop. There is not enough information from GCC's point of view to figure that out without profiling (for this case). GCC is assuming that the loop exits in the first if statement, which seems reasonable. Maybe you should try with profiling information and see what GCC does for this testcase. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #14 from zackw at panix dot com 2006-07-13 03:40 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) It's a validation routine, yes, which is run over every pathname the program is working on, and there can be hundreds or thousands of those. And why the heck shouldn't I be able to use std::string in inner loops? I sure don't want to be using bare char*... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #13 from pinskia at gcc dot gnu dot org 2006-07-13 03:37 --- Hmm, for some reason I don't like the idea of using std::string in the inner loop :). Even the C testcase does not seem like a good inner loop in general anyway, as there is no calculation going on here. To me these look like loops which are run for testing only, looking for bad characters to see if a problem had happened. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #12 from zackw at panix dot com 2006-07-13 03:09 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > > I-cache. > this only matters if this increase in code size will make the code go > out of instruction cache. The real program that this is taken from is a large C++ application which is guaranteed to go out of cache - it's got slightly less than four megabytes of .text - the actual goal is to make sure all of its inner inner inner loops do stay in cache. And this is one of 'em. > > Also, more iterations before the branch predictors figure out what's > > going on. > But also possibly more consistent behavior with respect to branch > prediction, in case the loop is often exited in the first iteration. Again, in real life I know a priori that the function is rarely, if ever, called with a zero-length string. - I went through the tree dumps for my week-old 4.2.0 for the test program with a fine comb. They are quite instructive. If tree-ch were doing what it says on the label -- copying the loop header -- everything would be fine; dom1 cleans up the two copies of the header later. However, ch isn't just copying the loop header; it is also copying the *entire body of the loop*, which nothing can fix. I call that a clear bug. zw -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #11 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2006-07-12 23:39 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > I-cache. this only matters if this increase in code size will make the code go out of instruction cache. It definitely is possible to artificially construct programs where it matters, but I haven't seen one yet (ch increases the total code size by less than 1% on all the te > Also, more iterations before the branch predictors figure out what's > going on. But also possibly more consistent behavior with respect to branch prediction, in case the loop is often exited in the first iteration. In general, it is not possible to determine in ch whether loop header copying will be profitable or not. Undoing the loop header copying by some later pass might be doable, although I am not quite sure how much profitable. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #10 from zackw at panix dot com 2006-07-12 23:33 --- I-cache. Also, more iterations before the branch predictors figure out what's going on. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #9 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2006-07-12 23:30 --- Subject: Re: poor optimization choices when iterating over a std::string (probably not c++-specific) > Zdenek: I don't see how you can say that what we get is optimal code "unless > optimizing for size". The code generated *will* be slower than the > alternative. why? Exactly the same number of instructions is executed. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #8 from zackw at panix dot com 2006-07-12 23:21 --- Zdenek: I don't see how you can say that what we get is optimal code "unless optimizing for size". The code generated *will* be slower than the alternative. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #7 from zackw at panix dot com 2006-07-12 23:19 --- Created an attachment (id=11875) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11875&action=view) C test case (with interesting implications) I've found a plain C test case. In the process, I've found that the way libstdc++ is coded interacts interestingly with the optimizer. In the attached file, has_bad_chars_bad is a literal translation to C of the code seen by the optimizers after inlining for the original C++ test case. Yes, libstdc++ does the moral equivalent of ((struct rep*)path)[-1].len. This function compiles to the same bad code as my original test case. has_bad_chars_good, on the other hand, is how I naively thought it worked on the first read-through. That one compiles to code which looks optimal to me. I suspect some optimizer or other is not smart enough to see through this particular construct ... it would be good to make it do so, since we want libstdc++ to generate good code. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
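The attachment itself is not reproduced in this thread, so the following is only a sketch of the layout trick being described, using the struct definitions posted in comment #17. The two function names are hypothetical, and the sketch swaps in the offsetof-based "container of" idiom in place of the ((struct rep*)path)[-1] indexing; both reach the length header that sits immediately before the data member.

```c
#include <stddef.h>

/* Struct layout as posted in comment #17 of this bug. */
struct rep { unsigned long len; unsigned long alloc; unsigned long dummy; };
struct data { char *ptr; };
struct string { struct rep R; struct data D; };

/* Hypothetical name: recover the length when only a pointer to the
   data member is available, by stepping back to the enclosing
   struct string.  This stands in for the negative-index access. */
unsigned long len_from_data(const struct data *d)
{
    const struct string *s =
        (const struct string *)((const char *)d - offsetof(struct string, D));
    return s->R.len;
}

/* Hypothetical name: the direct form, reading the length as an
   ordinary member access. */
unsigned long len_direct(const struct string *s)
{
    return s->R.len;
}
```

Both functions read the same field; the difference that matters for this bug is that in the first form the optimizer has to see through the pointer arithmetic.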
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #6 from rakdver at gcc dot gnu dot org 2006-07-12 23:13 --- Loop header copying is OK; the result is the one I would expect, it certainly does not make the code worse (unless you are optimizing for code size), and in many cases may make it better. I will have a look at the addressing mode choices. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #5 from pinskia at gcc dot gnu dot org 2006-07-12 22:52 --- For me on PPC-darwin, it generates pretty good code at just -O2 even though there is a duplicated "header". The loop is pretty good at scheduling the code too:

    L9:
            lbz r0,0(r3)
            cmplwi cr7,r0,31
            extsb r0,r0
            cmpwi cr1,r0,127
            cmpwi cr6,r0,92
            ble- cr7,L4
            beq- cr6,L4
            beq- cr1,L4
    L8:
            addi r3,r3,1
            bdnz L9

Though the branches throw off everything (though that is a different issue), for the Cell really cror should be used (I think). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #4 from zackw at panix dot com 2006-07-12 22:48 --- I remembered that I had a build of 4.2 from last week lying around. It generates even worse code - still with the duplication of most of the loop, plus a bunch of unnecessary register fiddling and bad addressing mode choice. -- zackw at panix dot com changed: What|Removed |Added Known to fail||4.1.2 4.2.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #3 from zackw at panix dot com 2006-07-12 22:42 --- I should mention that the exact command line flags were -O2 -fomit-frame-pointer -march=pentium4, and that I hand-tweaked the label numbers for ease of reading. Also, -fno-tree-ch does suppress this bad optimization, but in exchange we get mildly worse code from the loop optimizer proper - it uses [reg+reg] indexing and a 0..n count instead of [reg] indexing and a base..limit count. The code is pretty short so I'll just paste it here (meaningless labels removed):

    _Z17has_bad_chars_newRKSs:
            pushl   %ebx
            movl    8(%esp), %eax
            movl    (%eax), %eax
            xorl    %ecx, %ecx
            movl    -12(%eax), %ebx
    .L2:
            cmpl    %ecx, %ebx
            je      .L10
            movzbl  (%ecx,%eax), %edx
            cmpb    $31, %dl
            jbe     .L4
            cmpb    $92, %dl
            je      .L4
            addl    $1, %ecx
            cmpb    $127, %dl
            jne     .L2
    .L4:
            movl    $1, %eax
            popl    %ebx
            .p2align 4,,2
            ret
    .L10:
            xorl    %eax, %eax
            popl    %ebx
            .p2align 4,,2
            ret

Looking at the code, I see that the entire purpose of tree-ch is to duplicate loop bodies in this fashion, and the justification given is that it "increases effectiveness of code motion and reduces the need for loop preconditioning", which I take to cover the above degradation in addressing mode choice. I'm not an optimizer expert, but surely there is a way to get the best of both worlds here...? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
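The two loop forms contrasted above can be written out at the source level roughly as follows. This is only an illustration of the terminology; the function names are invented, and the compiler's induction variable optimization is free to pick either form regardless of how the source is written.

```c
#include <stddef.h>

/* "0..n count with [reg+reg] indexing": a counter runs from 0 to n
   and every load addresses base + i. */
int scan_indexed(const char *base, size_t n)
{
    for (size_t i = 0; i != n; i++) {
        unsigned char x = (unsigned char)base[i];
        if (x <= 0x1f || x == 0x5c || x == 0x7f)
            return 1;
    }
    return 0;
}

/* "base..limit count with [reg] indexing": a single pointer walks
   from base to a precomputed limit and every load addresses *p. */
int scan_pointer(const char *base, size_t n)
{
    const char *limit = base + n;
    for (const char *p = base; p != limit; p++) {
        unsigned char x = (unsigned char)*p;
        if (x <= 0x1f || x == 0x5c || x == 0x7f)
            return 1;
    }
    return 0;
}
```

The pointer form needs one register fewer inside the loop and a simpler addressing mode, which is why it is the preferred outcome in this report.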
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #2 from pinskia at gcc dot gnu dot org 2006-07-12 22:41 --- Loop header copying (ch) is doing it, which means there is confusion about what the real header of the loop is here. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364
[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)
--- Comment #1 from zackw at panix dot com 2006-07-12 22:33 --- Created an attachment (id=11874) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11874&action=view) assembly output (bad on top, hand-corrected below) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364