I've done a bit of assembly programming for IA-32, none of it recently, and
no compiler development beyond simple parsers, but from what I know,
optimizing with newer instructions like SSEx is only part of the story.
There's also optimizing for out-of-order instruction execution (OoOIE), or
instruction reordering. Even on a single core, when done properly it can
yield a 4x or greater increase in processing speed by avoiding CPU stalls,
which are likely happening a lot given the descriptions people have
provided in this thread.
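
To make the reordering idea concrete, here is a minimal sketch at the
source level. The names (AccumDemo, SumSerial, SumSplit) and the array size
are made up for illustration, and I have not measured it. The first
function has one long dependency chain; the second splits the work into
four independent chains, which is the kind of arrangement that gives an
out-of-order core something to overlap. Whether FPC turns it into faster
code is exactly what would need checking.

```pascal
program AccumDemo;
{$mode objfpc}

const
  N = 1000000;  // assumed size; must be a multiple of 4 for SumSplit

type
  TData = array[0..N - 1] of Double;

var
  Data: TData;

// One accumulator: every addition waits on the result of the previous one,
// so the additions form a single serial dependency chain.
function SumSerial(const A: TData): Double;
var
  i: Integer;
begin
  Result := 0.0;
  for i := 0 to N - 1 do
    Result := Result + A[i];
end;

// Four accumulators: the four chains do not depend on each other,
// so the CPU is free to have several additions in flight at once.
function SumSplit(const A: TData): Double;
var
  i: Integer;
  s0, s1, s2, s3: Double;
begin
  s0 := 0.0; s1 := 0.0; s2 := 0.0; s3 := 0.0;
  i := 0;
  while i < N do
  begin
    s0 := s0 + A[i];
    s1 := s1 + A[i + 1];
    s2 := s2 + A[i + 2];
    s3 := s3 + A[i + 3];
    Inc(i, 4);
  end;
  // Combine the partial sums once, after the loop.
  Result := (s0 + s1) + (s2 + s3);
end;

var
  k: Integer;
begin
  for k := 0 to N - 1 do
    Data[k] := 1.0;
  WriteLn(SumSerial(Data):0:1);
  WriteLn(SumSplit(Data):0:1);
end.
```

With the all-ones input both calls should print the same total. With
arbitrary data the two can differ in the last digits, because
floating-point addition is not associative, which is one reason a compiler
cannot always make this change on its own.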

Obviously, using SSEx instructions properly and optimizing for OoOIE is a
non-trivial undertaking. I think the best way to test FPC is probably to
stop writing speed tests against other languages and instead view the CPU
instructions generated for a few different loop types, including ones that
call trunc()/floor(). Next, rewrite those parts in assembler, using
whichever SSE instructions apply and reordering for OoOIE, and finally
measure the speed difference.
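
As a starting point for that kind of inspection, something like the small
program below could be used. It is only a sketch: the program name, the
procedure names and the element count are hypothetical, and the loops are
just the obvious candidates (a for loop with Trunc, a while loop with
Floor, and a plain floating-point sum). Compiling it with something like
"fpc -O2 -al looptest.pas" should leave the generated assembler in
looptest.s with the source lines interleaved, assuming the file is saved
under that name.

```pascal
program LoopTest;
{$mode objfpc}

uses
  Math;  // Floor() is declared in the Math unit; Trunc() is built in

const
  Count = 1000;  // hypothetical element count

var
  Src: array[0..Count - 1] of Double;
  Dst: array[0..Count - 1] of Int64;

// Loop type 1: for loop calling Trunc() (rounds toward zero).
procedure TruncLoop;
var
  i: Integer;
begin
  for i := 0 to Count - 1 do
    Dst[i] := Trunc(Src[i]);
end;

// Loop type 2: while loop calling Floor() (rounds toward minus infinity).
procedure FloorLoop;
var
  i: Integer;
begin
  i := 0;
  while i < Count do
  begin
    Dst[i] := Floor(Src[i]);
    Inc(i);
  end;
end;

// Loop type 3: plain floating-point reduction, no conversion involved.
function SumLoop: Double;
var
  i: Integer;
begin
  Result := 0.0;
  for i := 0 to Count - 1 do
    Result := Result + Src[i];
end;

var
  k: Integer;
begin
  // Fill with values on both sides of zero so Trunc and Floor differ.
  for k := 0 to Count - 1 do
    Src[k] := k * 0.75 - 100.0;

  TruncLoop;
  WriteLn('Trunc: ', Dst[1]);
  FloorLoop;
  WriteLn('Floor: ', Dst[1]);
  WriteLn('Sum:   ', SumLoop:0:2);
end.
```

The output is only there to keep the compiler from discarding the work;
the interesting part is what each loop body looks like in the .s file, and
how it compares with a hand-written version.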

My instincts tell me that if the compiler could then generate both SSEx and
OoOIE-friendly code properly, this could be applied automatically in many
places, resulting in a huge speedup for FPC programs. But for me, the main
advantage of Free Pascal is that it generates native code, leading to
easier reuse of C-style APIs such as Cairo/GTK/SDL, while also allowing for
PME (properties, methods, events) plus all the other nice features Free
Pascal offers. Most of the time my programs are idle, waiting for user
input, for other APIs, or, in the case of OpenGL, for the graphics hardware
to complete. As such, the execution speed of the Pascal portion of my
programs isn't as important.
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
