On 05/24/2017 06:15 AM, Martin Schreiber wrote:
> On Wednesday 24 May 2017 07:47:09 Martin Schreiber wrote:
>
>>> So what's MSElang's frame rate and on what kind of machine/OS?
>> Not ready yet.
>>
> Linux 32 bit
> Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>
> With round() operations
> https://gitlab.com/mseide-msegui/mselang/blob/master/mselang/benchmark/mctest/mctest.pas
>
> FPC 3.0.2
> -O- -> 8 FPS
> -O4 -CfSSE3 -CpCOREI -> access violation
> -O3 -CfSSE3 -CpCOREI -> access violation
> -O4 -CpCOREI -> 8.2 FPS
>
> MSElang, LLVM 3.8.0
> No options -> 4.2 FPS
> -O3 -> 5.9 FPS
> -O3 -mcpu=corei7 -mattr=+sse3,+ssse3 -> 33.5 FPS
>
> With trunci32() operations
> https://gitlab.com/mseide-msegui/mselang/blob/master/mselang/benchmark/mctest/mctest_trunc.pas
> -O3 -> 8.1
> -O3 -mcpu=corei7 -mattr=+sse3,+ssse3 -> 41.5 FPS

ROCK ON!

Since a server crash got me up *WAY* early this morning, I ended up with an hour of extra time, so I went over Jonas' suggestions again. I made tweaks fixing integer types, using trunc() instead of round(), ... The overall improvement is only +0.6%, as I predicted, nowhere near the 10x+ needed to compete with the other languages. I'd also like to point out that most of these things deal with internals of the compiler, which is what the initial complaint was about.

I also tried the SSE3 flag, which for a brief period of time brought me up to 10fps on my 1.9GHz i3 notebook. That is just under a 3x improvement. But then it segfaulted. As Jonas pointed out, -O1 is about as far as you can take it with -CfSSE3 due to bugs in FPC.

Jonas also pointed out that they don't have a floor() with a float output. Indeed, I could not find any rounding mechanism in the docs that produced a float output. So I tried substituting -frac(). It was actually marginally slower. :-)

His item #4 about the time base is my doing. It was the method I substituted for the SDL_GetTicks() time source, which is probably much faster than FPC's time(). I'm sure there is a much better system call for that. But the short of it is that it's only happening once per render and its total contribution to the performance deficit is inconsequential.

Your machine is closer to what Graeme's is/was than mine is, and that 41.5FPS is *SWEET*!

THX - Jon

--
Sent from my Debian Linux laptop
Jon Foster
JF Possibilities, Inc.
j...@jfpossibilities.com

_______________________________________________
mseide-msegui-talk mailing list
mseide-msegui-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mseide-msegui-talk