On 05/24/2017 06:15 AM, Martin Schreiber wrote:
> On Wednesday 24 May 2017 07:47:09 Martin Schreiber wrote:
>
>>> So what's MSElang's frame rate and on what kind of machine/OS?
>> Not ready yet.
>>
> Linux 32 bit
> Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>
> With round() operations
> https://gitlab.com/mseide-msegui/mselang/blob/master/mselang/benchmark/mctest/mctest.pas
>
> FPC 3.0.2
> -O- -> 8 FPS
> -O4 -CfSSE3 -CpCOREI -> access violation
> -O3 -CfSSE3 -CpCOREI -> access violation
> -O4 -CpCOREI -> 8.2 FPS
>
> MSElang, LLVM 3.8.0
> No options -> 4.2 FPS
> -O3 -> 5.9 FPS
> -O3 -mcpu=corei7 -mattr=+sse3,+ssse3 -> 33.5 FPS
>
> With trunci32() operations
> https://gitlab.com/mseide-msegui/mselang/blob/master/mselang/benchmark/mctest/mctest_trunc.pas
> -O3 -> 8.1
> -O3 -mcpu=corei7 -mattr=+sse3,+ssse3 -> 41.5 FPS
ROCK ON!

Since a server crash got me up *WAY* early this morning I ended up with an 
hour of extra time, so I went over Jonas' suggestions again. I made tweaks: 
fixing integer types, using trunc() instead of round(), ... The overall 
improvement is only +0.6%. As I predicted, that is nowhere near the 10x+ 
needed to compete with the other languages. I'd also like to point out that 
most of these things deal with internals of the compiler, which is what the 
initial complaint was about.

I also tried the SSE3 flag, which for a brief period of time brought me up 
to 10fps on my 1.9GHz i3 notebook. That is just under a 3x improvement. But 
then it segfaulted. As Jonas pointed out, -O1 is about as far as you can 
take it with -CfSSE3 due to bugs in FPC. Jonas also pointed out that they 
don't have a floor() with a float output. Indeed, I could not find any 
rounding mechanism in the docs that produced a float output. So I tried 
substituting -frac(). It was actually marginally slower. :-)
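
For anyone following along, here is the arithmetic behind that substitution as a rough sketch. It is in Python rather than Pascal only so it is quick to check; frac(), trunc_f() and floor_f() are hypothetical helper names, not anything from FPC or from mctest.pas. The thing to watch is that x - frac(x) rounds toward zero, so it only matches floor() for non-negative input.

```python
import math


def frac(x: float) -> float:
    """Fractional part carrying the sign of x (stand-in for Pascal's frac())."""
    return math.modf(x)[0]


def trunc_f(x: float) -> float:
    """x - frac(x): rounds toward zero and keeps a float result."""
    return x - frac(x)


def floor_f(x: float) -> float:
    """Floor with a float result: step down by one when truncation rounded up."""
    t = trunc_f(x)
    return t - 1.0 if t > x else t
```

If the values being rounded are never negative, trunc_f() alone would do and the extra compare in floor_f() can be skipped.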

His item #4 about the time base is my doing. It was the method I 
substituted for the SDL_GetTicks() time source, which is probably much 
faster than FPC's time(). I'm sure there is a much better system call for 
that. But the short of it is that it's only happening once per render and 
its total contribution to the performance deficit is inconsequential.
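
To illustrate what "once per render" means, here is a minimal sketch of a frame-rate loop, again in Python and not the actual benchmark code. measure_fps() and its parameters are made-up names, and time.perf_counter stands in for whatever time source the real program uses. The clock is read once per rendered frame, so its cost is paid once per frame no matter how much work the render does.

```python
import time


def measure_fps(render_frame, frames, clock=time.perf_counter):
    """Render `frames` frames and return the average frames per second."""
    start = clock()
    last = start
    for _ in range(frames):
        render_frame()
        last = clock()  # the only clock read for this frame
    elapsed = last - start
    return frames / elapsed if elapsed > 0 else float("inf")
```

Passing the clock in as a parameter also makes it easy to swap in a different time source and compare.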

Your machine is closer to what Graeme's is/was than mine is and that 
41.5FPS is *SWEET*!

THX - Jon

-- 
Sent from my Debian Linux laptop
Jon Foster
JF Possibilities, Inc.
j...@jfpossibilities.com


_______________________________________________
mseide-msegui-talk mailing list
mseide-msegui-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mseide-msegui-talk