Re: optimizing predictable branches on x86

2008-03-02 Thread Nick Piggin
On Wednesday 27 February 2008 03:06, J.C. Pizarro wrote:
> Compiling and executing the code of Nick Piggin at
> http://gcc.gnu.org/ml/gcc/2008-02/msg00601.html
>
> in my old Athlon64 Venice 3200+ 2.0 GHz,
> 3 GiB DDR400, 32-bit kernel, gcc 3.4.6, i got
>
> $ gcc -O3 -falign-functions=64 -falign-loo

Re: optimizing predictable branches on x86

2008-02-27 Thread Jan Hubicka
> > At least on x86 it should also be a good idea to know which way
> > the branch is going to go, because it doesn't have explicit branch
> > hints, you really want to be able to optimize the cold branch
> > predictor case if converting from cmov to conditional branches.
>
> x86 as of Pentium 4 d

Re: optimizing predictable branches on x86

2008-02-27 Thread Kenny Simpson
> At least on x86 it should also be a good idea to know which way
> the branch is going to go, because it doesn't have explicit branch
> hints, you really want to be able to optimize the cold branch
> predictor case if converting from cmov to conditional branches.

x86 as of Pentium 4 does have bra

Re: optimizing predictable branches on x86

2008-02-26 Thread J.C. Pizarro
On Tuesday 26 February 2008 21:14, Jan Hubicka wrote:
> Only cases we do so quite reliably IMO are:
> 1) loop branches that are not interesting for cmov conversion
> 2) branches leading to noreturn calls, also not interesting
> 3) builtin_expect mentioned.
> 4) when profile feedback is arou

Re: optimizing predictable branches on x86

2008-02-26 Thread J.C. Pizarro
It's a final summary for good performance of the tested machines:

+ unpredictable:
    * don't use conditional jmp (the worst).
    * use cmov or C version.
+ predictable:
    + no deps:
        * use cmov or C version.
    + has deps:
        * do

Re: optimizing predictable branches on x86

2008-02-26 Thread J.C. Pizarro
On 2008/2/26, J.C. Pizarro <[EMAIL PROTECTED]>, I wrote:
> 4. C > cmov >> jmp when it's unpredictable and has not data dependencies.

Sorry for my typo; the correct version (without the "not") is

4. C > cmov >> jmp when it's unpredictable and has data dependencies.

and my forgotten 3rd annotatio

Re: optimizing predictable branches on x86

2008-02-26 Thread J.C. Pizarro
Compiling and executing the code of Nick Piggin at
http://gcc.gnu.org/ml/gcc/2008-02/msg00601.html
in my old Athlon64 Venice 3200+ 2.0 GHz, 3 GiB DDR400, 32-bit kernel, gcc 3.4.6, I got

$ gcc -O3 -falign-functions=64 -falign-loops=64 -falign-jumps=64 -falign-labels=64 -march=i686 foo.c -o foo
$ .

Re: optimizing predictable branches on x86

2008-02-26 Thread Nick Piggin
On Tuesday 26 February 2008 21:14, Jan Hubicka wrote:
> Hi,
>
> > Core2 follows a similar pattern, although it's not seeing any
> > slowdown in the "no deps, predictable, jmp" case like K8 does.
> >
> > Any comments? (please cc me) Should gcc be using conditional jumps
> > more often eg. in the cas

Re: optimizing predictable branches on x86

2008-02-26 Thread Jan Hubicka
> Hi,
> > Core2 follows a similar pattern, although it's not seeing any
> > slowdown in the "no deps, predictable, jmp" case like K8 does.
> >
> > Any comments? (please cc me) Should gcc be using conditional jumps
> > more often eg. in the case of __builtin_expect())?
>
> The problem is that in g

Re: optimizing predictable branches on x86

2008-02-26 Thread Jan Hubicka
Hi,
> Core2 follows a similar pattern, although it's not seeing any
> slowdown in the "no deps, predictable, jmp" case like K8 does.
>
> Any comments? (please cc me) Should gcc be using conditional jumps
> more often eg. in the case of __builtin_expect())?

The problem is that in general GCC's bra
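
For readers following the thread, here is a minimal sketch of the two source-level forms under discussion: a branch annotated with __builtin_expect, and a plain select that the compiler may turn into cmov. It is not the benchmark referenced in the thread. The function names and the likely/unlikely macros are hypothetical, chosen only for illustration. Whether gcc emits a conditional jump or a cmov for either form depends on the gcc version, the flags and the target, so it is worth inspecting the output of gcc -S rather than assuming.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convenience macros around GCC's __builtin_expect.
   The names are illustrative, not taken from the thread. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Branchy form: the hint says the clamp is rarely taken, which gives
   the compiler a reason to keep a conditional jump with the common
   path as the fall-through. The compiler is free to ignore the hint. */
long sum_clamped_branchy(const int *v, size_t n, int limit)
{
    long sum = 0;
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int x = v[i];
        if (unlikely(x > limit))
            x = limit;
        sum += x;
    }
    return sum;
}

/* Select form: no hint is given, so the ternary is a natural candidate
   for if-conversion to cmov. Again, this is only a candidate; the
   generated code is the compiler's choice. */
long sum_clamped_select(const int *v, size_t n, int limit)
{
    long sum = 0;
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int x = v[i] > limit ? limit : v[i];
        sum += x;
    }
    return sum;
}
```

Both functions compute the same result; only the shape handed to the compiler differs. Timing them on predictable versus unpredictable input is one way to reproduce the kind of comparison discussed in the thread.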