[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-12-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

--- Comment #35 from Richard Biener  ---
Author: rguenth
Date: Wed Dec 19 11:10:08 2018
New Revision: 267262

URL: https://gcc.gnu.org/viewcvs?rev=267262&root=gcc&view=rev
Log:
2018-12-19  Richard Biener  

PR tree-optimization/88533
Revert
2018-04-30  Richard Biener  

PR tree-optimization/28364
PR tree-optimization/85275
* tree-ssa-loop-ch.c (ch_base::copy_headers): Stop after
copying first exit test.

* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.

* tree-ssa-loop-ch.c: Include tree-phinodes.h and
ssa-iterators.h.
(should_duplicate_loop_header_p): Track whether stmt compute
loop invariants or values based on IVs.  Apart from the
original loop header only duplicate blocks with exit tests
that are based on IVs or invariants.

* gcc.dg/tree-ssa/copy-headers-6.c: New testcase.
* gcc.dg/tree-ssa/copy-headers-7.c: Likewise.
* gcc.dg/tree-ssa/ivopt_mult_1.c: Un-XFAIL.
* gcc.dg/tree-ssa/ivopt_mult_2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-6.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
trunk/gcc/tree-ssa-loop-ch.c

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-04-30 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
  Known to work||9.0
 Resolution|--- |FIXED

--- Comment #34 from Richard Biener  ---
Fixed.

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-04-30 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

--- Comment #33 from Richard Biener  ---
Author: rguenth
Date: Mon Apr 30 07:23:36 2018
New Revision: 259754

URL: https://gcc.gnu.org/viewcvs?rev=259754=gcc=rev
Log:
2018-04-30  Richard Biener  

PR tree-optimization/28364
PR tree-optimization/85275
* tree-ssa-loop-ch.c (ch_base::copy_headers): Stop after
copying first exit test.

* gcc.dg/tree-ssa/copy-headers-5.c: New testcase.
* gcc.dg/tree-ssa/predcom-8.c: Likewise.
* gcc.dg/tree-ssa/cunroll-13.c: Rewrite to gimple testcase.
* gcc.dg/tree-ssa/ivopt_mult_1.c: XFAIL.
* gcc.dg/tree-ssa/ivopt_mult_1g.c: Add gimple variant that
still passes.
* gcc.dg/tree-ssa/ivopt_mult_2.c: XFAIL.
* gcc.dg/tree-ssa/ivopt_mult_2g.c: Add gimple variant that
still passes.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.
* gcc.dg/tree-ssa/20030710-1.c: Likewise.
* gcc.dg/tree-ssa/20030711-1.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-5.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1g.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2g.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/predcom-8.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/20030710-1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/20030711-1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
trunk/gcc/tree-ssa-loop-ch.c

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-03-06 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

amker at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amker at gcc dot gnu.org

--- Comment #32 from amker at gcc dot gnu.org ---
(In reply to bin.cheng from comment #31)
> This is a really old issue!  I will also check status of this issue on trunk.

For multi-exit cases, I wonder if it's possible to copy only the exit test
that checks the IV.  That should still satisfy ch's purpose.
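
To make the suggestion concrete, here is a hand-written C sketch (not GCC output) of a two-exit loop after copying only the exit test that checks the induction variable. The function name has_bad_chars_iv_only and the const-qualified signature are assumptions for illustration; the loop body follows the has_bad_chars example from comment #30.

```c
#include <stddef.h>

/* Hypothetical hand-rotated loop: only the IV-based test (c < end) is
   copied in front of the loop.  The data-dependent test stays in the
   body and appears exactly once.  */
int has_bad_chars_iv_only(const char *str, size_t len)
{
  const char *c = str;
  const char *end = str + len;

  if (c < end)                 /* copied exit test: guards the loop */
    do
      {
        unsigned char x = (unsigned char)*c;
        if (x <= 0x1f || x == 0x5c || x == 0x7f)
          return 1;            /* second exit, not duplicated */
        c++;
      }
    while (c < end);           /* original exit test, now at the bottom */
  return 0;
}
```

In this shape only the cheap pointer comparison exists twice, so the guard still tells later passes that the loop body runs at least once without doubling the size of the body.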

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-03-04 Thread amker.cheng at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

bin.cheng  changed:

   What|Removed |Added

 CC||amker.cheng at gmail dot com

--- Comment #31 from bin.cheng  ---
This is a really old issue!  I will also check status of this issue on trunk.

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-03-02 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

--- Comment #30 from Zack Weinberg  ---
It's been a very long time and I don't know exactly what changed, but GCC 7.3
generates essentially the same code for both of the functions in the "C test
case" and I would not describe that code as "bad".

I can still make it duplicate the entire body of the loop by relatively small
tweaks, though.  For instance, 

int has_bad_chars(char *str, __SIZE_TYPE__ len)
{
  for (char *c = str; c < str + len; c++)
    {
      unsigned char x = (unsigned char)(*c);
      if (x <= 0x1f || x == 0x5c || x == 0x7f)
        return 1;
    }
  return 0;
}

generates significantly worse code (doubling cache footprint for no gain in
branch predictability or any other metric) with -O2 than -Os.

Also, the code generated for the body of the loop (with both the original test
case and the above) is more complicated than it needs to be, but perhaps that
should be a new bug report.

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-03-02 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #29 from Eric Gallager  ---
(In reply to Zdenek Dvorak from comment #28)
> Subject: Bug 28364
> 
> Author: rakdver
> Date: Wed Aug 16 21:14:11 2006
> New Revision: 116189
> 
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116189
> Log:
>   PR tree-optimization/28364
>   * tree-ssa-loop-ivopts.c (aff_combination_to_tree): Handle zero
>   correctly.
>   (fold_affine_expr): New function.
>   (may_eliminate_iv): Use fold_affine_expr.
> 
> 
> Modified:
> trunk/gcc/ChangeLog
> trunk/gcc/tree-ssa-loop-ivopts.c

Did this fix it?

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-08-16 Thread rakdver at gcc dot gnu dot org


--- Comment #28 from rakdver at gcc dot gnu dot org  2006-08-16 21:14 
---
Subject: Bug 28364

Author: rakdver
Date: Wed Aug 16 21:14:11 2006
New Revision: 116189

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116189
Log:
PR tree-optimization/28364
* tree-ssa-loop-ivopts.c (aff_combination_to_tree): Handle zero
correctly.
(fold_affine_expr): New function.
(may_eliminate_iv): Use fold_affine_expr.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-loop-ivopts.c





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-26 Thread rakdver at gcc dot gnu dot org


--- Comment #27 from rakdver at gcc dot gnu dot org  2006-07-26 19:38 
---
Patch for the wrong choice of induction variable:
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01125.html





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-25 Thread rakdver at gcc dot gnu dot org


--- Comment #26 from rakdver at gcc dot gnu dot org  2006-07-25 15:20 
---
A patch for the return in the middle of the loop problem:
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg00893.html
(to be committed once mainline is open).





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-17 Thread rakdver at gcc dot gnu dot org


--- Comment #25 from rakdver at gcc dot gnu dot org  2006-07-18 00:45 
---
Created an attachment (id=11906)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11906&action=view)
A patch for loop header selection.

I tried improving the heuristics for the selection of the loop header, however
without success.  In ch, copying all the exits simply looks like a good idea,
since it makes the loop header contain the most important looking part of the
loop, and makes the latch of the loop empty.  I append the patch I made, in
case someone wanted to play with it.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-14 Thread rakdver at gcc dot gnu dot org


-- 

rakdver at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu dot org |rakdver at gcc dot gnu dot org
 Status|UNCONFIRMED |ASSIGNED
 Ever Confirmed|0 |1
 Last reconfirmed date|0000-00-00 00:00:00 |2006-07-14 14:12:12


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #18 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-07-13 07:58 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > > I-cache.
> > this only matters if this increase in code size will make the code go
> > out of instruction cache.
>
> The real program that this is taken from is a large C++ application
> which is guaranteed to go out of cache - it's got slightly less than
> four megabytes of .text - the actual goal is to make sure all of its
> inner inner inner loops do stay in cache.  And this is one of 'em.
>
> > > Also, more iterations before the branch predictors figure out what's
> > > going on.
> > But also possibly more consistent behavior with respect to branch
> > prediction, in case the loop is often exited in the first iteration.
>
> Again, in real life I know a priori that the function is rarely, if
> ever, called with a zero-length string.
>
> -
>
> I went through the tree dumps for my week-old 4.2.0 for the test
> program with a fine comb.  They are quite instructive.  If tree-ch
> were doing what it says on the label -- copying the loop header --
> everything would be fine; dom1 cleans up the two copies of the header
> later.  However, ch isn't just copying the loop header; it is also
> copying the *entire body of the loop*, which nothing can fix. I call
> that a clear bug.

how do you define a loop header?





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #19 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-07-13 08:01 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > > I-cache.
> > this only matters if this increase in code size will make the code go
> > out of instruction cache.
>
> The real program that this is taken from is a large C++ application
> which is guaranteed to go out of cache - it's got slightly less than
> four megabytes of .text - the actual goal is to make sure all of its
> inner inner inner loops do stay in cache.  And this is one of 'em.

on your real program, how much performance do you gain by hand-rewriting
the assembler to look the way you like?  Just to make sure there really
is a problem.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread zackw at panix dot com


--- Comment #20 from zackw at panix dot com  2006-07-13 08:25 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > However, ch isn't just copying the loop header; it is also
> > copying the *entire body of the loop*, which nothing can fix. I call
> > that a clear bug.

> how do you define a loop header?

I was under the impression it was just the one basic block called out
in the .ch dump, e.g.

;; Loop 1
;;  header 6, latch 5
;;  depth 1, level 1, outer 0

-- basic block 6 happens to contain just the code from the syntactic
loop condition.  Andrew informs me that this is wrong, and that in
this case the header is the entire loop, but I will come back at that
with 'ch should never be duplicating the entire loop; if the header is
the entire loop, it should do something more sensible, like duplicate
just the first basic block or something.'





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread zackw at panix dot com


--- Comment #21 from zackw at panix dot com  2006-07-13 08:28 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> on your real program, how much performance do you gain by hand-rewriting
> the assembler to look the way you like?  Just to make sure there really
> is a problem.

I'm a little annoyed by the suggestion that this wouldn't be a real
problem if I couldn't measure a performance difference.

Depending on workload, other activity on the same machine, and phase
of moon, this loop is between .1% and 1% of runtime, and my tweaks
make it go about a third faster.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread rguenth at gcc dot gnu dot org


--- Comment #22 from rguenth at gcc dot gnu dot org  2006-07-13 08:28 
---
For practical purposes (determining the loop runs at least once) it needs to
duplicate the exit condition.  Which happens to be difficult here, as there are
multiple loop exits.
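
As an illustration of why multiple exits make this awkward, the following hand-written C sketch (an assumption about the general shape, not actual GCC output) shows a loop in which copies of both exit tests are placed in front of the loop. Because the second test is essentially the whole loop body, the effect is the same as peeling the first iteration. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical shape after copying every exit test: the first
   iteration is effectively peeled, so the body exists twice.  */
int has_bad_chars_all_exits_copied(const char *str, size_t len)
{
  const char *c = str;
  const char *end = str + len;

  /* Copy of exit test 1 (based on the induction variable).  */
  if (!(c < end))
    return 0;

  /* Copy of exit test 2 (data dependent); this is the loop body.  */
  {
    unsigned char x = (unsigned char)*c;
    if (x <= 0x1f || x == 0x5c || x == 0x7f)
      return 1;
  }

  /* The remaining loop repeats both tests.  */
  for (;;)
    {
      c++;
      if (!(c < end))
        return 0;
      unsigned char x = (unsigned char)*c;
      if (x <= 0x1f || x == 0x5c || x == 0x7f)
        return 1;
    }
}
```

The result is functionally identical to the plain for loop; the only difference is the duplicated body, which is the code-size cost debated in this thread.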





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #23 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-07-13 09:00 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > on your real program, how much performance do you gain by hand-rewriting
> > the assembler to look the way you like?  Just to make sure there really
> > is a problem.
>
> I'm a little annoyed by the suggestion that this wouldn't be a real
> problem if I couldn't measure a performance difference.

sorry, but it indeed would not be.  I have seen so many examples of code
that looks bad at first sight and performs just fine that unless
there is something obviously wrong (which is not the case here, IMO), I
am somewhat more careful before I spend my time on fixing this type of
bug.

> Depending on workload, other activity on the same machine, and phase
> of moon, this loop is between .1% and 1% of runtime, and my tweaks
> make it go about a third faster.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #24 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-07-13 09:03 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > > However, ch isn't just copying the loop header; it is also
> > > copying the *entire body of the loop*, which nothing can fix. I call
> > > that a clear bug.
>
> > how do you define a loop header?
>
> I was under the impression it was just the one basic block called out
> in the .ch dump, e.g.
>
> ;; Loop 1
> ;;  header 6, latch 5
> ;;  depth 1, level 1, outer 0
>
> -- basic block 6 happens to contain just the code from the syntactic
> loop condition.  Andrew informs me that this is wrong, and that in
> this case the header is the entire loop, but I will come back at that
> with 'ch should never be duplicating the entire loop; if the header is
> the entire loop, it should do something more sensible, like duplicate
> just the first basic block or something.'

currently, we stop once the copied region is too large.  This means that
on normal loops that have a body that does something, we won't copy the
whole loop.  Of course, any heuristic will have cases where it does not
perform ideally.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #1 from zackw at panix dot com  2006-07-12 22:33 ---
Created an attachment (id=11874)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11874&action=view)
assembly output (bad on top, hand-corrected below)





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread pinskia at gcc dot gnu dot org


--- Comment #2 from pinskia at gcc dot gnu dot org  2006-07-12 22:41 ---
The loop header copying pass (ch) is doing it, which means there is some
confusion about what the real header of the loop is here.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #3 from zackw at panix dot com  2006-07-12 22:42 ---
I should mention that the exact command line flags were -O2
-fomit-frame-pointer -march=pentium4, and that I hand-tweaked the label numbers
for ease of reading.

Also, -fno-tree-ch does suppress this bad optimization, but in exchange we get
mildly worse code from the loop optimizer proper - it uses [reg+reg] indexing
and a 0..n count instead of [reg] indexing and a base..limit count.  The code
is pretty short so I'll just paste it here (meaningless labels removed):

_Z17has_bad_chars_newRKSs:
        pushl   %ebx
        movl    8(%esp), %eax
        movl    (%eax), %eax
        xorl    %ecx, %ecx
        movl    -12(%eax), %ebx
.L2:
        cmpl    %ecx, %ebx
        je      .L10
        movzbl  (%ecx,%eax), %edx
        cmpb    $31, %dl
        jbe     .L4
        cmpb    $92, %dl
        je      .L4
        addl    $1, %ecx
        cmpb    $127, %dl
        jne     .L2
.L4:
        movl    $1, %eax
        popl    %ebx
        .p2align 4,,2
        ret
.L10:
        xorl    %eax, %eax
        popl    %ebx
        .p2align 4,,2
        ret

Looking at the code, I see that the entire purpose of tree-ch is to duplicate
loop bodies in this fashion, and the justification given is that it increases
effectiveness of code motion and reduces the need for loop preconditioning,
which I take to cover the above degradation in addressing mode choice.  I'm not
an optimizer expert, but surely there is a way to get the best of both worlds
here...?
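
The two addressing styles mentioned above can be written out by hand in C. This is only a sketch: the function names are made up for illustration, and the compiler is free to pick either induction variable regardless of how the source is written.

```c
#include <stddef.h>

/* 0..n count with base+index addressing ([reg+reg] on x86).  */
int has_bad_chars_indexed(const char *base, size_t len)
{
  for (size_t i = 0; i != len; i++)
    {
      unsigned char x = (unsigned char)base[i];
      if (x <= 0x1f || x == 0x5c || x == 0x7f)
        return 1;
    }
  return 0;
}

/* base..limit count with plain pointer addressing ([reg]).  */
int has_bad_chars_pointer(const char *base, size_t len)
{
  const char *limit = base + len;
  for (const char *c = base; c != limit; c++)
    {
      unsigned char x = (unsigned char)*c;
      if (x <= 0x1f || x == 0x5c || x == 0x7f)
        return 1;
    }
  return 0;
}
```

Both functions return the same result for every input; they differ only in which value is kept live across iterations (an index plus the base, or a single moving pointer plus a limit).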





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #4 from zackw at panix dot com  2006-07-12 22:48 ---
I remembered that I had a build of 4.2 from last week lying around.  It
generates even worse code - still with the duplication of most of the loop,
plus a bunch of unnecessary register fiddling and bad addressing mode choice.


-- 

zackw at panix dot com changed:

   What|Removed |Added

  Known to fail||4.1.2 4.2.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread pinskia at gcc dot gnu dot org


--- Comment #5 from pinskia at gcc dot gnu dot org  2006-07-12 22:52 ---
For me on PPC-darwin, it generates pretty good code at just -O2 even though
there is a duplicated header.  The code in the loop is scheduled pretty well
too:
L9:
lbz r0,0(r3)
cmplwi cr7,r0,31
extsb r0,r0
cmpwi cr1,r0,127
cmpwi cr6,r0,92
ble- cr7,L4
beq- cr6,L4
beq- cr1,L4
L8:
addi r3,r3,1
bdnz L9

Though the branches throw off everything (though that is a different issue),
for the Cell really cror should be used (I think).





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread rakdver at gcc dot gnu dot org


--- Comment #6 from rakdver at gcc dot gnu dot org  2006-07-12 23:13 ---
Loop header copying is OK; the result is the one I would expect, it certainly
does not make the code worse (unless you are optimizing for code size), and in
many cases may make it better.

I will have a look at the addressing mode choices.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #7 from zackw at panix dot com  2006-07-12 23:19 ---
Created an attachment (id=11875)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11875&action=view)
C test case (with interesting implications)

I've found a plain C test case.  In the process, I've found that the way
libstdc++ string is coded interacts interestingly with the optimizer.

In the attached file, has_bad_chars_bad is a literal translation to C of the
code seen by the optimizers after inlining for the original C++ test case. 
Yes, libstdc++ string does the moral equivalent of ((struct
rep*)path)[-1].len.  This function compiles to the same bad code as my original
test case.

has_bad_chars_good, on the other hand, is how I naively thought string worked
on the first read-through.  That one compiles to code which looks optimal to
me.  I suspect some optimizer or other is not smart enough to see through this
particular construct ... it would be good to make it do so, since we want
libstdc++ string to generate good code.
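
Since the attachment is not reproduced here, the following is only a guess at the two shapes being compared. has_bad_chars_bad mirrors the struct layout posted in comment #17, without the __builtin_expect hints; struct flat_string and the body of has_bad_chars_good are hypothetical stand-ins for the layout described as the naive reading.

```c
struct rep { unsigned long len; unsigned long alloc; unsigned long dummy; };
struct data { char *ptr; };
struct string { struct rep R; struct data D; };

/* The length lives in the struct rep stored immediately before the
   struct data, so it is reached by stepping back one struct rep.
   This relies on there being no padding between R and D.  */
int has_bad_chars_bad(struct data *path)
{
  char *c;
  for (c = path->ptr; c < path->ptr + ((struct rep *)path)[-1].len; c++)
    {
      unsigned char x = (unsigned char)(*c);
      if (x <= 0x1f || x == 0x5c || x == 0x7f)
        return 1;
    }
  return 0;
}

/* Hypothetical flat layout: length and pointer sit side by side.  */
struct flat_string { unsigned long len; char *ptr; };

int has_bad_chars_good(struct flat_string *path)
{
  char *c;
  for (c = path->ptr; c < path->ptr + path->len; c++)
    {
      unsigned char x = (unsigned char)(*c);
      if (x <= 0x1f || x == 0x5c || x == 0x7f)
        return 1;
    }
  return 0;
}
```

The loops are identical apart from how the length is loaded, which is the construct suspected of confusing the optimizer.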





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #8 from zackw at panix dot com  2006-07-12 23:21 ---
Zdenek: I don't see how you can say that what we get is optimal code unless
optimizing for size.  The code generated *will* be slower than the
alternative.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #9 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-07-12 23:30 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> Zdenek: I don't see how you can say that what we get is optimal code unless
> optimizing for size.  The code generated *will* be slower than the
> alternative.

why?  Exactly the same number of instructions is executed.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #10 from zackw at panix dot com  2006-07-12 23:33 ---
I-cache.  Also, more iterations before the branch predictors figure out what's
going on.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #11 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-07-12 23:39 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> I-cache.

this only matters if this increase in code size will make the code go
out of instruction cache.  It definitely is possible to artificially
construct programs where it matters, but I haven't seen one yet (ch
increases the total code size by less than 1% on all the te

> Also, more iterations before the branch predictors figure out what's
> going on.

But also possibly more consistent behavior with respect to branch
prediction, in case the loop is often exited in the first iteration.

In general, it is not possible to determine in ch whether loop header
copying will be profitable or not.  Undoing the loop header copying
in some later pass might be doable, although I am not sure how
profitable that would be.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #12 from zackw at panix dot com  2006-07-13 03:09 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > I-cache.
> this only matters if this increase in code size will make the code go
> out of instruction cache.

The real program that this is taken from is a large C++ application
which is guaranteed to go out of cache - it's got slightly less than
four megabytes of .text - the actual goal is to make sure all of its
inner inner inner loops do stay in cache.  And this is one of 'em.

> > Also, more iterations before the branch predictors figure out what's
> > going on.
> But also possibly more consistent behavior with respect to branch
> prediction, in case the loop is often exited in the first iteration.

Again, in real life I know a priori that the function is rarely, if
ever, called with a zero-length string.

-

I went through the tree dumps for my week-old 4.2.0 for the test
program with a fine comb.  They are quite instructive.  If tree-ch
were doing what it says on the label -- copying the loop header --
everything would be fine; dom1 cleans up the two copies of the header
later.  However, ch isn't just copying the loop header; it is also
copying the *entire body of the loop*, which nothing can fix. I call
that a clear bug.

zw





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread pinskia at gcc dot gnu dot org


--- Comment #13 from pinskia at gcc dot gnu dot org  2006-07-13 03:37 
---
Hmm, for some reason I don't like the idea of using std::string in the inner
loop :).  Even the C testcase does not seem like a good inner loop in general
anyway, as there is no calculation going on here.  To me these look like loops
which are run for testing only, looking for bad characters to see if a problem
had happened.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #14 from zackw at panix dot com  2006-07-13 03:40 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

It's a validation routine, yes, which is run over every pathname the
program is working on, and there can be hundreds or thousands of
those.

And why the heck shouldn't I be able to use std::string in inner
loops?  I sure don't want to be using bare char*...





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread pinskia at gcc dot gnu dot org


--- Comment #15 from pinskia at gcc dot gnu dot org  2006-07-13 03:45 
---
One more comment: a loop with an early exit is the whole issue, and that is the
reason why all of the code in the loop is considered the header.  There are a
couple of loop headers in this case, one for each exit, which makes it harder to
deal with in general.

What you did not mention is how this loop would normally exit: via the
return 1 or all the way through the loop.  There is not enough information from
GCC's point of view to figure that out without profiling (for this case).  GCC
is assuming that the loop exits in the first if statement, which seems
reasonable.  Maybe you should try with profiling information and see what GCC
does for this testcase.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread pinskia at gcc dot gnu dot org


--- Comment #16 from pinskia at gcc dot gnu dot org  2006-07-13 04:01 
---
If this is really a program's innermost loop, then the program is messed up, as
there is no calculation going on here at all.
What type of program is this?  Do you cache the result of this function?  Maybe
caching the results will show other bottlenecks.  Maybe even, instead of doing
this loop, find another loop which loops over the text and process it at
the same time.  These are normal optimization tricks which usually cannot be
done by the compiler.





[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #17 from zackw at panix dot com  2006-07-13 04:23 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> One more comment: a loop with an early exit is the whole issue, and that is
> the reason why all of the code in the loop is considered the header.  There
> are a couple of loop headers in this case, one for each exit, which makes it
> harder to deal with in general.

I didn't know that, and it is not obvious from the optimizer dumps.  Thanks for
explaining.

> What you did not mention is how this loop would normally exit: via the
> return 1 or all the way through the loop.  There is not enough information
> from GCC's point of view to figure that out without profiling (for this
> case).  GCC is assuming that the loop exits in the first if statement, which
> seems reasonable.  Maybe you should try with profiling information and see
> what GCC does for this testcase.

<flamebait> Feedback-directed optimization is only good for making
compilers look better on benchmarks.  It's useless in real life. </flamebait>

I can, in fact, get good code out of gcc 4.1 by beating it over the
head with __builtin_expect, but I don't think I should have to do
that.  I think my suggested version is better code no matter whether
or not the loop exits early.

4.2 still makes what I consider to be bad addressing mode choices
after that change, but Zdenek did say he would look at that.  It also
puts the return 1 exit block in the middle of the loop in spite of
being told that all three conditions leading to that are unlikely.

struct rep
{
  unsigned long len;
  unsigned long alloc;
  unsigned long dummy;
};

struct data
{
  char * ptr;
};

struct string
{
  struct rep R;
  struct data D;
};

int
has_bad_chars(struct data *path)
{
  char *c;
  for (c = path->ptr;
       __builtin_expect(c < path->ptr + ((struct rep *)path)[-1].len, 1);
       c++)
    {
      unsigned char x = (unsigned char)(*c);
      if (__builtin_expect(x <= 0x1f || x == 0x5c || x == 0x7f, 0))
        return 1;
    }
  return 0;
}

