Vincent Snijders wrote:
willem wrote:
Peter Vreman wrote:
At 17:01 1-1-2008, you wrote:
Vincent Snijders wrote:
willem wrote:

Benchmark results from:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all#about

Conclusion: It would be good to have a compiler switch that optimizes for speed.

That is a wrong conclusion. All these programs were compiled for speed and not for low memory use.

Vincent
No, I mean a new G1 switch that optimizes 30% better at the expense of memory usage.
The memory usage may be ten times greater than with the G switch.

I think your view is a bit too simplistic. Optimizing code is a complex task and we are continuously working on it. Of course, patches to improve the compiler's optimizer are always welcome. Simply referring to a couple of already known benchmarks will not help to get things improved. Besides that, Free Pascal does a fairly good job against the other, commercially funded compilers.

Peter
Yes, optimizing code is a complex job.
The Free Pascal compiler already does a lot of optimization,
such as constant merging, shifts instead of multiplies, stack frame omission and so on.
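
To illustrate the "shifts instead of multiply" item: a multiplication by a power of two gives the same result as a left shift, which is why a compiler may substitute one for the other. The program below is only a sketch that checks this equivalence for a small range of non-negative values; the program name and the tested range are my own choices, and it says nothing about what code Free Pascal actually emits.

```pascal
program ShiftDemo;
{ Hypothetical demo: checks that x * 8 equals x shl 3 for small non-negative x. }
var
  x: LongInt;
begin
  for x := 0 to 1000 do
    if (x * 8) <> (x shl 3) then
    begin
      WriteLn('mismatch at ', x);
      Halt(1);
    end;
  WriteLn('x * 8 = x shl 3 for all tested x');
end.
```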

But the Free Pascal compiler performs badly in the Mandelbrot benchmark.

number 1 is C++ g++
number 4 is Java 6
number 19 is Free Pascal

In the test "CPU time as N increases":

number 3 is C gcc
number 18 is Java 6
number 26 is Free Pascal

I downloaded the Mandelbrot Pascal source and compiled it in the Lazarus IDE.
I got two hints about using div instead of / ,
and I got a runtime error when I tried to run the program.

The problem lies in the conversion from integer to real and vice versa.

When I did an explicit conversion from real to integer with round(), the Mandelbrot benchmark ran fine.
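
For readers unfamiliar with the explicit conversion mentioned here: Pascal needs a function such as Round or Trunc to turn a real into an integer. The sketch below shows both on a made-up value; it is not taken from the benchmark source, whose affected lines are not shown in this thread.

```pascal
program RoundDemo;
{ Hypothetical demo of explicit real-to-integer conversion. }
var
  r: Double;
  n: LongInt;
begin
  r := 2.75;
  { A direct assignment n := r is not accepted; an explicit conversion is needed. }
  n := Round(r);   { rounds to the nearest integer }
  WriteLn('Round(2.75) = ', n);
  n := Trunc(r);   { drops the fractional part }
  WriteLn('Trunc(2.75) = ', n);
end.
```

Round and Trunc give different integers here (3 and 2), so the choice can change the benchmark output.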

How much speed improvement did that give? Did it give the exact same results?
Where are reals converted to integers?


If you implement the G1 switch with an automatic conversion of the / arithmetic operator to the div operator, then you gain speed at the cost of memory usage. The developer can then always use the arithmetic operator / .


var
  cx: double;
  i, j: integer;

i := 3; j := 4;

cx := i / j;   { cx = 0.75 }
cx := i div j; { cx = 0 }

The results are different. So this optimization is not correct.
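
The fragment above can be turned into a complete program to see the difference directly. This is a minimal sketch; the program name and the output formatting are my own additions.

```pascal
program DivDemo;
{ Compares real division (/) with integer division (div) on the same operands. }
var
  cx: Double;
  i, j: LongInt;
begin
  i := 3;
  j := 4;
  cx := i / j;     { real division keeps the fraction }
  WriteLn('i / j   = ', cx:0:2);
  cx := i div j;   { integer division discards the fraction before the assignment }
  WriteLn('i div j = ', cx:0:2);
end.
```

The first line prints 0.75 and the second prints 0.00, so replacing / by div automatically would change the program's results.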

Vincent

Ha ha Vincent, optimization is complex :-)

Well, I have a dual-core Pentium D processor at 2.80 GHz with 1024 MB of memory, so I cannot compare my benchmark results
with the Gentoo Pentium 4 benchmarks.

I looked at the Mandelbrot cpp code. That code is very complex; the Lazarus Pascal Mandelbrot code is simple
and much easier to understand. The cpp code is compiled for the Pentium 4.
To give you an idea of how complex optimization is, here are the optimization switches of the C++ compiler (g++):

Optimization Options
-falign-functions=n -falign-jumps=n -falign-labels=n
-falign-loops=n -fbounds-check -fmudflap -fmudflapth -fmudflapir
-fbranch-probabilities -fprofile-values -fvpt
-fbranch-target-load-optimize -fbranch-target-load-optimize2
-fbtr-bb-exclusive -fcaller-saves -fcprop-registers
-fcse-follow-jumps -fcse-skip-blocks -fcx-limited-range
-fdata-sections -fdelayed-branch -fdelete-null-pointer-checks
-fearly-inlining -fexpensive-optimizations -ffast-math
-ffloat-store -fforce-addr
-ffunction-sections -fgcse -fgcse-lm -fgcse-sm -fgcse-las
-fgcse-after-reload -floop-optimize -fcrossjumping -fif-conversion
-fif-conversion2 -finline-functions -finline-functions-called-once
-finline-limit=n -fkeep-inline-functions -fkeep-static-consts
-fmerge-constants -fmerge-all-constants -fmodulo-sched
-fno-branch-count-reg -fno-default-inline -fno-defer-pop
-floop-optimize2 -fmove-loop-invariants -fno-function-cse
-fno-guess-branch-probability -fno-inline -fno-math-errno
-fno-peephole -fno-peephole2 -funsafe-math-optimizations
-funsafe-loop-optimizations -ffinite-math-only -fno-trapping-math
-fno-zero-initialized-in-bss -fomit-frame-pointer
-foptimize-register-move -foptimize-sibling-calls
-fprefetch-loop-arrays -fprofile-generate -fprofile-use -fregmove
-frename-registers -freorder-blocks -freorder-blocks-and-partition
-freorder-functions -frerun-cse-after-loop -frerun-loop-opt
-frounding-math -fschedule-insns -fschedule-insns2
-fno-sched-interblock -fno-sched-spec
-fsched-spec-load -fsched-spec-load-dangerous
-fsched-stalled-insns=n -fsched-stalled-insns-dep=n
-fsched2-use-superblocks -fsched2-use-traces
-freschedule-modulo-scheduled-loops -fsignaling-nans
-fsingle-precision-constant
-fstack-protector -fstack-protector-all -fstrength-reduce
-fstrict-aliasing -ftracer -fthread-jumps -funroll-all-loops
-funroll-loops -fpeel-loops -fsplit-ivs-in-unroller
-funswitch-loops -fvariable-expansion-in-unroller -ftree-pre
-ftree-ccp -ftree-dce -ftree-loop-optimize -ftree-loop-linear
-ftree-loop-im -ftree-loop-ivcanon -fivopts -ftree-dominator-opts
-ftree-dse -ftree-copyrename -ftree-sink -ftree-ch -ftree-sra
-ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize
-ftree-vect-loop-version -ftree-salias -fweb -ftree-copy-prop
-ftree-store-ccp -ftree-store-copy-prop -fwhole-program --param

That is quite a lot.

Regards Wim
