Vincent Snijders wrote:
willem wrote:
Peter Vreman wrote:
At 17:01 1-1-2008, you wrote:
Vincent Snijders wrote:
willem wrote:
Benchmark results from:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all#about
Conclusion: it would be good to have a compiler switch that
optimizes for speed.
That is a wrong conclusion. All these programs were compiled for
speed and not for low memory use.
Vincent
No, I mean a new G1 switch that optimizes 30% better at the expense
of memory usage.
The memory usage may be ten times greater than with the G switch.
I think your view is a bit too simplistic. Optimizing code is a
complex task and we are continuously working on it. Of course, patches
to improve the optimizer of the compiler are always welcome. Simply
referring to a couple of already known benchmarks will not help to
get things improved. Besides that, Free Pascal does a fairly good job
against the other, commercially funded compilers.
Peter
Yes, optimizing code is a complex job.
The Free Pascal compiler already does a lot of optimization,
like constant merging, shifts instead of multiplies, stack frame
omission and so on.
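To illustrate the "shifts instead of multiply" case, here is a minimal sketch. The program name and the variable n are made up for illustration, and whether the compiler really emits a shift depends on the target and the optimization level. It only shows that the two forms give the same value for a power-of-two factor.

```pascal
program ShiftDemo;
{ Hypothetical example, not taken from the benchmark sources. }
var
  n: LongInt;
begin
  n := 5;
  { Multiplying by 8 (2 to the power 3) gives the same value as
    shifting left by 3 bits, so a compiler may replace one by the other. }
  WriteLn(n * 8);    { prints 40 }
  WriteLn(n shl 3);  { prints 40 }
end.
```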
But the Free Pascal compiler performs badly in the Mandelbrot benchmark:
number 1 is C++ g++
number 4 is Java 6
number 19 is Free Pascal
In the test "CPU time as N increases":
number 3 is C gcc
number 18 is Java 6
number 26 is Free Pascal
I downloaded the Mandelbrot Pascal source and compiled it in the
Lazarus IDE.
I got two hints about div instead of / ,
and I got a runtime error when I tried to run the program.
The problem lies in the conversion from integer to real and vice versa.
When I did an explicit conversion from real to integer with round(),
the Mandelbrot benchmark ran fine.
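For reference, a minimal sketch of what an explicit real to integer conversion looks like. This is not the benchmark code; the program name and variables are assumptions for illustration only. Round() rounds to the nearest integer, while Trunc() drops the fraction, so the choice can change the result.

```pascal
program RoundDemo;
{ Hypothetical example, not taken from the Mandelbrot source. }
var
  x: Double;
  n: LongInt;
begin
  x := 2.75;
  n := Round(x);  { explicit conversion, rounds to the nearest integer }
  WriteLn(n);     { prints 3 }
  n := Trunc(x);  { explicit conversion, drops the fractional part }
  WriteLn(n);     { prints 2 }
end.
```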
How much speed improvement did that give? Did it give the exact same
results?
Where are reals converted to integers?
If you implement the G1 switch with an automatic conversion of the /
arithmetic operator to the div operator, then you gain
speed at the cost of memory usage. The developer can then always use
the arithmetic operator / .
var
  cx : double;
  i, j: integer;
begin
  i := 3; j := 4;
  cx := i / j;    { cx = 0.75 }
  cx := i div j;  { cx = 0 }
end.
The results are different. So this optimization is not correct.
Vincent
Ha ha Vincent, optimization is complex :-)
Well, I have a Pentium D processor, 2.80 GHz, dual core, with 1024 MB
of memory, so I cannot compare my benchmark results
with the Gentoo Pentium 4 benchmarks.
I looked at the Mandelbrot cpp code; it is very complex. The Lazarus
Pascal Mandelbrot code is simple
and much easier to understand. The cpp code is compiled for the Pentium 4.
To give you an idea of how complex optimization is, here are the
optimization switches of c++:
Optimization Options
-falign-functions=n -falign-jumps=n -falign-labels=n
-falign-loops=n -fbounds-check -fmudflap -fmudflapth -fmudflapir
-fbranch-probabilities -fprofile-values -fvpt
-fbranch-target-load-optimize -fbranch-target-load-optimize2
-fbtr-bb-exclusive -fcaller-saves -fcprop-registers
-fcse-follow-jumps -fcse-skip-blocks -fcx-limited-range
-fdata-sections -fdelayed-branch -fdelete-null-pointer-checks
-fearly-inlining -fexpensive-optimizations -ffast-math
-ffloat-store -fforce-addr
-ffunction-sections -fgcse -fgcse-lm -fgcse-sm -fgcse-las
-fgcse-after-reload -floop-optimize -fcrossjumping -fif-conversion
-fif-conversion2 -finline-functions -finline-functions-called-once
-finline-limit=n -fkeep-inline-functions -fkeep-static-consts
-fmerge-constants -fmerge-all-constants -fmodulo-sched
-fno-branch-count-reg -fno-default-inline -fno-defer-pop
-floop-optimize2 -fmove-loop-invariants -fno-function-cse
-fno-guess-branch-probability -fno-inline -fno-math-errno
-fno-peephole -fno-peephole2 -funsafe-math-optimizations
-funsafe-loop-optimizations -ffinite-math-only -fno-trapping-math
-fno-zero-initialized-in-bss -fomit-frame-pointer
-foptimize-register-move -foptimize-sibling-calls
-fprefetch-loop-arrays -fprofile-generate -fprofile-use -fregmove
-frename-registers -freorder-blocks
-freorder-blocks-and-partition -freorder-functions
-frerun-cse-after-loop -frerun-loop-opt -frounding-math
-fschedule-insns -fschedule-insns2 -fno-sched-interblock
-fno-sched-spec -fsched-spec-load -fsched-spec-load-dangerous
-fsched-stalled-insns=n -fsched-stalled-insns-dep=n
-fsched2-use-superblocks -fsched2-use-traces
-freschedule-modulo-scheduled-loops -fsignaling-nans
-fsingle-precision-constant
-fstack-protector -fstack-protector-all -fstrength-reduce
-fstrict-aliasing -ftracer -fthread-jumps -funroll-all-loops
-funroll-loops -fpeel-loops -fsplit-ivs-in-unroller
-funswitch-loops -fvariable-expansion-in-unroller -ftree-pre
-ftree-ccp -ftree-dce -ftree-loop-optimize -ftree-loop-linear
-ftree-loop-im -ftree-loop-ivcanon -fivopts -ftree-dominator-opts
-ftree-dse -ftree-copyrename -ftree-sink -ftree-ch -ftree-sra
-ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize
-ftree-vect-loop-version -ftree-salias -fweb -ftree-copy-prop
-ftree-store-ccp -ftree-store-copy-prop -fwhole-program --param
quite a lot.
Regards Wim