On Sun, 23 May 2010 11:06:24 +0200 Benjamin Zores <b...@geexbox.org> said:

> On Sun, May 23, 2010 at 6:22 AM, Carsten Haitzler <ras...@rasterman.com>
> wrote:
> > On Sat, 22 May 2010 23:46:56 +0200 Benjamin Zores <b...@geexbox.org> said:
> >
> > any reason you use just -O not -O2?
> >
> > fyi - no attached file - sf.net filtered it out. :) though generally this
> > kind of a problem is a result of compiler issues - eg no neon support in
> > gcc (or poor/older support) etc. generally anyway.
> 
> It used to be -O4 actually. Then I just switched to -O just to ensure it
> was not triggered by any compiler optimization bug.
> Apparently this wasn't the case.

aaah gotcha. tho in my experience -O2 is about as good as it gets without using
-mtune/cpu etc to generate instructions for a specific architecture level or
tune for it. i have a suggestion: don't go above -O2 unless you can really
justify it - that means benchmarks show real solid speedups (consistently more
than 5%). in the past at least, -O3 and above have also been wonderful sources
of compiler bugs that produce incorrect code - and you may end up suffering
from bugs that don't actually exist in the code but lie in the compiler. so...
just beware of high -O levels. test to be sure it actually is worth it. do real
benchmarking. remember it's a risk tradeoff - you gain N% more speed for a
higher chance of bugs (that can't actually be fixed in the src - only in the
compiler, which makes it harder). that's why i say more than 5% - it is, of
course, a fuzzy number, but it means "you need a real, significant and
noticeable speedup".
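
a rough sketch of that rule of thumb in python, in case it helps. the function
name, the sample scores and the way "consistently" is checked (every paired run
has to clear the threshold) are assumptions made up for illustration - nothing
here comes from expedite or gcc:

```python
def worth_raising_opt_level(baseline_scores, candidate_scores, min_gain=0.05):
    """Return True only if every paired run beats the baseline by more than min_gain.

    Scores are "higher is better" benchmark numbers. min_gain=0.05 encodes the
    fuzzy "more than 5%" threshold; it is a judgment call, not a hard limit.
    """
    if not baseline_scores or len(baseline_scores) != len(candidate_scores):
        raise ValueError("need the same non-zero number of runs for both builds")
    # relative gain of the candidate build over the baseline, run by run
    gains = [(c - b) / b for b, c in zip(baseline_scores, candidate_scores)]
    # "consistently" is interpreted here as: every single run clears the bar
    return all(g > min_gain for g in gains)


# hypothetical scores from three repeated runs of each build
o2_runs = [100.0, 101.0, 99.5]
o3_runs = [103.0, 107.0, 104.0]
print(worth_raising_opt_level(o2_runs, o3_runs))
```

requiring every run to clear the bar is deliberately strict: one lucky run
should not be enough to justify the extra risk.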

for evas at any rate, use expedite and do benchmarks. better results come from
higher -c counts (-c is the loop count - the more loops it does, the less
run-to-run noise will affect the result. the default is 128, so use -c 256 if
you are patient and want very accurate results). as an example - results for
x86 (32bit) evas speed (weighted):

-O0 185.74
-O1 271.33
-O2 274.52
-O3 274.30
-O4 272.46

(all of them used -march=nocona as well in CFLAGS - on a core2 duo laptop).
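
to put those numbers on the same footing as the 5% threshold, here is a small
python sketch that expresses each score relative to -O2. the scores are the
ones listed above; the helper name percent_vs is just something made up for
this example:

```python
# expedite weighted scores quoted above (higher is better)
scores = {"-O0": 185.74, "-O1": 271.33, "-O2": 274.52, "-O3": 274.30, "-O4": 272.46}


def percent_vs(baseline, scores):
    """Return each level's score as a percentage change relative to baseline."""
    base = scores[baseline]
    return {level: (value - base) / base * 100.0 for level, value in scores.items()}


relative = percent_vs("-O2", scores)
best = max(scores, key=scores.get)  # level with the highest score

for level, pct in relative.items():
    print(f"{level}: {pct:+.2f}% vs -O2")
print("best:", best)
```

on this data -O3 and -O4 come out slightly below -O2, so neither gets anywhere
near a 5% gain.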

notice performance actually peaks at -O2 :) (also -O0 is horrible - but just to
note - i write a lot of my code to assume a compiler will have a decent
optimiser, so it will pick up the pieces with what may seem "stupid code", but
that "stupid code" is meant to be more readable/maintainable.)

(need i get into -funroll-loops ?) :)

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com



