We should also be wary of compiler options that are a win on one processor family and a loss on another. Anything that schedules instructions may be specific to a particular generation of CPUs, for instance, or behave differently on i5 vs i7 vs M(obile) parts.

                        ...jim

On 10/2/15 9:10 AM, Laurent Bourgès wrote:
Sergey,

thanks for the information:

        I tried your gcc options on my Ubuntu 14.04 (gcc 4.8.4) and it is
        actually
        slightly faster: 10% on my fill ellipse test (450ms vs 490ms).


    I tested with your JMH test, and the difference became bigger at
    size 1400.


Interesting; I will try too.



        Do you know which gcc compiler and options are used to build
        JavaSE EA?

    I guess that the compiler options in the makefile are the same,

    plus some default gcc options:
    jdk8:
    gcc (GCC) 4.3.0 20080428 (Red Hat-8) C compiler version 4.3.0-8)

    jdk9:
    gcc-4.8.2 - OEL5.5


However, the gcc compilers are different: 4.3 vs 4.8.2!

So it may be worth comparing their different optimization options; I
guess somebody has already looked at that!


        Moreover, the Linux distribution may define default options.

        I will try to figure out all compiler options (command line +
        defaults)
        on my machine.


    It is not simple to find an option that helps everyone.
    The two options I suggested are the minimal subset of -O3 needed to
    get the maximum performance; both seem reasonable. Actually, if I
    change -O2 to -O3 (OPTIMIZATION := LOW => OPTIMIZATION := HIGHEST),
    performance becomes worse.


That is often the case with -O3, but your patch seems a good win with only
2 options enabled.
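
One way to confirm that an individual -O3 option really is active on top of -O2 is to ask gcc for the resulting optimizer state. The following is only a sketch: the helper name flag_state is made up for illustration, the compiler is taken from $CC (defaulting to gcc), and -ftree-vectorize is used purely as an example flag, not necessarily one of the two options in the patch.

```shell
# Hypothetical helper: print the state gcc reports for one optimizer flag.
# usage: flag_state FLAG [GCC_OPTIONS...]
# Relies on `gcc -c -Q ... --help=optimizers`, the same query used elsewhere
# in this thread, and picks out the line whose first field is FLAG.
flag_state() {
    flag=$1
    shift
    "${CC:-gcc}" -c -Q "$@" --help=optimizers |
        awk -v f="$flag" '$1 == f { print $2 }'
}
```

For example, `flag_state -ftree-vectorize -O2` shows the default state at -O2, and `flag_state -ftree-vectorize -O2 -ftree-vectorize` shows the state once the flag is added explicitly.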


        What is your build environment?


    Ubuntu 14.04 gcc 4.8.4


I have the same, and I finally got my gcc options:
gcc -c -Q -O2 --help=common

Here are the differences between -O2 and -O3 with gcc 4.8.4:

gcc -c -Q -O3 --help=optimizers > /tmp/O3-opts
gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts
diff /tmp/O2-opts /tmp/O3-opts | grep enabled

*>   -fgcse-after-reload                 [enabled]
*>   -finline-functions                  [enabled]
 >   -fipa-cp-clone                      [enabled]
 >   -fpredictive-commoning              [enabled]
 >   -ftree-loop-distribute-patterns     [enabled]
 >   -ftree-partial-pre                  [enabled]
*>   -ftree-vectorize                    [enabled]
*>   -funswitch-loops                    [enabled]
 >   -fvect-cost-model                   [enabled]
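
The same comparison can be made on flag names only, which avoids depending on diff's line markers. This is a rough sketch rather than a tested tool: the function name newly_enabled is invented here, and it assumes the two files hold saved `gcc -c -Q -Ox --help=optimizers` output with the flag in the first column and its state in the second.

```shell
# Hypothetical helper: list flags that are [enabled] in the second saved
# listing but not in the first.
# usage: newly_enabled BASE_OPTS_FILE OTHER_OPTS_FILE
newly_enabled() {
    awk '
        $2 != "[enabled]"   { next }                # ignore headers and disabled flags
        FILENAME == ARGV[1] { base[$1] = 1; next }  # remember flags enabled in the base file
        !($1 in base)       { print $1 }            # enabled only in the second file
    ' "$1" "$2"
}
```

With the files produced above, `newly_enabled /tmp/O2-opts /tmp/O3-opts` prints one flag name per line, in the order they appear in the -O3 listing.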

So we could evaluate some of these options and see which give the best
compromise for libawt on gcc 4.8!

Regards,
Laurent
