On 05/18/2017 07:19 AM, Reimar Grabowski wrote:
On Thu, 18 May 2017 20:32:57 +0700
Ryan Joseph<r...@thealchemistguild.com>  wrote:

On May 18, 2017, at 8:23 PM, Graeme Geldenhuys<mailingli...@geldenhuys.co.uk>  
wrote:
The compiler must be doing something really stupid for it to mess up like that but 
how can we know?
By getting the source of Graemes test, using a profiler on it and having a look 
at the results?

I compiled Graeme's code, which was surprisingly easy. I guess I already had SDL installed with the dev files. FPC on my old 2GHz Intel gets 2fps. With Java (OpenJDK 1.6.x) I get a pretty steady 14fps. On my 3-year-old notebook with a 1.9GHz Intel I get 4fps for FPC and 18fps for Java. I find it amusing that the slower-clocked chip is faster. But there have been significant hardware improvements over the years.

I would have provided the fps for C++ but I haven't figured out how to get it to render the same content. It was rendering at half the resolution. I upped the window size, which was the obvious thing to try, but it appears to just magnify the same low pixel count. I've got to get back to work so I'll have to stop fiddling with this.

Both machines have multi-core Intel processors, obviously from two different families, and run 32-bit Linux with Xorg for the display. FPC is version 3.0.0. For the benchmark runs I used "-XXs -O3" for compiler switches.

I limited run time to 10secs, and used "time" to verify actual run time. Here are the results, time output listed first and the first section of the gprof output without comments:

10.05user 0.01system 0:10.11elapsed 99%CPU (0avgtext+0avgdata 9188maxresident)k
0inputs+272outputs (0major+1010minor)pagefaults 0swaps

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 62.44      1.33     1.33                             fpc_frac_real
 26.76      1.90     0.57                             MATH_$$_FLOOR$EXTENDED$$LONGINT
 10.33      2.12     0.22                             FPC_DIV_INT64
  0.47      2.13     0.01                             SYSTEM_$$_GENRAND_MT19937$$LONGINT
  0.00      2.13     0.00       43     0.00     0.00  P$TEST_$$_RENDER
  0.00      2.13     0.00        8     0.00     0.00  P$TEST_$$_INTTOSTR$LONGINT$$ANSISTRING
  0.00      2.13     0.00        1     0.00     0.00  P$TEST_$$_INIT
  0.00      2.13     0.00        1     0.00     0.00  main


The code that is slow appears to be all standard floating point and integer math (* / mod + - >> <<). All of the time is spent in the render() function's main loops (lines 191-282 inclusive). I might have missed something but I didn't see any function calls of any concern. I commented those lines out and my frame rate jumped to ~1200. Obviously all I saw was a black screen, but I wanted to eliminate the SDL calls from the suspect list and make sure I had targeted the right lines.

Based on the benchmarks I published earlier, I'm convinced that it's not the integer math. And I thought everyone used the FPU, so I never imagined there could be a significant speed difference with floating point. But this feels like the compiler is using software math instead of the FPU. I tried various -Cf... switches and either got the same results or an "Access Violation". I even tried "-Ce" and the compiler said it was an invalid switch.

I suppose there could be something else I'm missing. Much more work to do to find the culprit.
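
One way to narrow this down might be to time the three routines at the top of the profile in isolation. The following is only a rough sketch, not code from Graeme's test: the program name, the iteration count and the input values are made up for illustration. It assumes that Frac() is what shows up as fpc_frac_real, that Math.Floor() is the MATH_$$_FLOOR entry, and that an Int64 "div" on a 32-bit target ends up in FPC_DIV_INT64.

```pascal
program fracbench;  // hypothetical name, not part of the original test
{$mode objfpc}

uses
  SysUtils, Math;

const
  N = 5000000;  // arbitrary iteration count, adjust to taste

var
  i: LongInt;
  x, fsum: Double;
  isum, q: Int64;
  t0: QWord;
begin
  // Frac(): assumed to be the fpc_frac_real entry in the profile
  fsum := 0;
  x := 0.37;
  t0 := GetTickCount64;
  for i := 1 to N do
  begin
    fsum := fsum + Frac(x);
    x := x + 1.25;
  end;
  WriteLn('Frac:      ', GetTickCount64 - t0, ' ms (sum ', fsum:0:2, ')');

  // Math.Floor(): assumed to be MATH_$$_FLOOR$EXTENDED$$LONGINT
  isum := 0;
  x := 0.37;
  t0 := GetTickCount64;
  for i := 1 to N do
  begin
    isum := isum + Floor(x);
    x := x + 1.25;
  end;
  WriteLn('Floor:     ', GetTickCount64 - t0, ' ms (sum ', isum, ')');

  // Int64 div: assumed to call FPC_DIV_INT64 on 32-bit targets
  isum := 0;
  q := 7;
  t0 := GetTickCount64;
  for i := 1 to N do
  begin
    isum := isum + (Int64(i) * 1000003) div q;
    q := q + 1;
  end;
  WriteLn('Int64 div: ', GetTickCount64 - t0, ' ms (sum ', isum, ')');
end.
```

The sums are printed only so the loops cannot be dropped as dead code. Building this with the same "-XXs -O3" switches and comparing the three timings should at least show which of the routines dominates outside of the render loop.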

Seems like the compiler team should be very interested in this. Fortunately I don't do much with floating point math or I'd probably be heading back to C/C++. Ugg. I can still churn through millions of web log lines in pretty quick order. :-D
--
Sent from my Debian Linux workstation -- http://www.debian.org/intro/about

Jon Foster
JF Possibilities, Inc.
j...@jfpossibilities.com
541-410-2760
Making computers work for you!

_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
