Re: [fpc-devel] LEA instruction speed

2023-10-27 Thread J. Gareth Moreton via fpc-devel
I should have figured.  Thank you! Kit On 27/10/2023 01:51, Nikolay Nikolov via fpc-devel wrote: On 10/11/23 11:21, Tomas Hajny via fpc-devel wrote: On 2023-10-11 04:15, J. Gareth Moreton via fpc-devel wrote: Sweet, thank you.  Would you be willing to share your modified test's source? I

Re: [fpc-devel] LEA instruction speed

2023-10-26 Thread Nikolay Nikolov via fpc-devel
On 10/11/23 11:21, Tomas Hajny via fpc-devel wrote: On 2023-10-11 04:15, J. Gareth Moreton via fpc-devel wrote: Sweet, thank you.  Would you be willing to share your modified test's source? I was worried that if CPUID wasn't present it would cause a SIGILL. Sure, attached, but I didn't do

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
It was a thought that crossed my mind when Stefan pointed out the translated Google Benchmark, but given that it hasn't yet been adapted to work outside of i386 and x86_64, you are right that it probably shouldn't be used for the time being.  The framework uses CPU timings to decide how many

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread Tomas Hajny via fpc-devel
On 2023-10-13 17:08, J. Gareth Moreton via fpc-devel wrote: Interesting!  That's a bug report to send to the maintainers of the framework.  I'll need to have them fix it before I'd be willing to try again with its use in FPC. Removed the reference.  Apologies - I'm rushing a bit. BTW, it's

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
This one's for you Stefan! https://github.com/spring4d/benchmark/issues/4 Kit On 13/10/2023 16:03, Tomas Hajny via fpc-devel wrote: On 2023-10-13 16:25, J. Gareth Moreton via fpc-devel wrote: GetLogicalProcessorInformation returns a Boolean - if false, an error occurred, and is handled as

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
Interesting!  That's a bug report to send to the maintainers of the framework.  I'll need to have them fix it before I'd be willing to try again with its use in FPC. Removed the reference.  Apologies - I'm rushing a bit. Kit On 13/10/2023 16:03, Tomas Hajny via fpc-devel wrote: On

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread Tomas Hajny via fpc-devel
On 2023-10-13 16:25, J. Gareth Moreton via fpc-devel wrote: GetLogicalProcessorInformation returns a Boolean - if false, an error occurred, and is handled as follows: DiagnoseAndExit('Failed during call to GetLogicalProcessorInformation: ' + GetLastError.ToString); GetLastError = 8 indicates

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
GetLogicalProcessorInformation returns a Boolean - if false, an error occurred, and is handled as follows: DiagnoseAndExit('Failed during call to GetLogicalProcessorInformation: ' + GetLastError.ToString); GetLastError = 8 indicates "out of memory", which I will say is odd. Nevertheless,

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
Oops - that was a silly mistake of mine with R8.  As for the other error, that sounds like it's in the third party benchmark suite.  I'll do some investigating on my virtual machine. In the meantime, here's the fixed test with the stray R8 call properly filtered out on i386 (it's replaced

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread Tomas Hajny via fpc-devel
On 2023-10-13 09:26, Tomas Hajny wrote: On 2023-10-12 20:02, J. Gareth Moreton via fpc-devel wrote: So an update. . . The latest version of blea.pp doesn't compile with a 32-bit compiler - line 76 contains an unconditional reference to the R8 register, which obviously doesn't exist in the 32-bit

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread Tomas Hajny via fpc-devel
On 2023-10-12 20:02, J. Gareth Moreton via fpc-devel wrote: So an update. . . The latest version of blea.pp doesn't compile with a 32-bit compiler - line 76 contains an unconditional reference to the R8 register, which obviously doesn't exist in 32-bit mode. Tomas

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
So an update. I've added Spring.Benchmark to "tests/bench/spring" on my local branch, along with its readme and licence file.  It seems to work quite well even if it feels a bit like overkill for this small a benchmark.  Still, I've attached the version with Stefan's translated Google

Re: [fpc-devel] LEA instruction speed

2023-10-11 Thread Tomas Hajny via fpc-devel
On 2023-10-11 04:15, J. Gareth Moreton via fpc-devel wrote: Sweet, thank you.  Would you be willing to share your modified test's source? I was worried that if CPUID wasn't present it would cause a SIGILL. Sure, attached, but I didn't do anything special - I modified it in a way allowing easy

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
The LEA and ADD times are close enough that I can consider them identical.  And Braswell (the architecture behind that brand of Celeron) doesn't support AVX, I don't think, so that lines up with COREI having a fast LEA instruction but not COREAVX. Given the many different x86-compatible CPUs,

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread Christo Crause via fpc-devel
On Tue, Oct 10, 2023 at 11:13 AM J. Gareth Moreton via fpc-devel wrote: > > Thanks Tomas, > > Nothing is broken, but the timing measurement isn't precise enough. > > Normally I have a much higher iteration count (e.g. 1,000,000), but I > had reduced it to 10,000 because, coupled with the 1,000

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
Sweet, thank you.  Would you be willing to share your modified test's source? I was worried that if CPUID wasn't present it would cause a SIGILL. Kit On 11/10/2023 01:47, Tomas Hajny via fpc-devel wrote: On 2023-10-10 13:24, J. Gareth Moreton via fpc-devel wrote: I'm all for receiving

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread Tomas Hajny via fpc-devel
On 2023-10-10 13:24, J. Gareth Moreton via fpc-devel wrote: I'm all for receiving results for all kinds of processor, as it helps me to make more informed choices on flags as well as confirming that Agner Fog's instruction tables are correct. Also, results for older processors can be hard to

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
I'm all for receiving results for all kinds of processor, as it helps me to make more informed choices on flags as well as confirming that Agner Fog's instruction tables are correct. Also, results for older processors can be hard to come by sometimes. Currently, most architectures have a

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread Tomas Hajny via fpc-devel
On 2023-10-10 12:19, Marco van de Voort via fpc-devel wrote: Op 10-10-2023 om 11:13 schreef J. Gareth Moreton via fpc-devel: Thanks Tomas, Nothing is broken, but the timing measurement isn't precise enough. Normally I have a much higher iteration count (e.g. 1,000,000), but I had reduced it

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread Marco van de Voort via fpc-devel
Op 10-10-2023 om 11:13 schreef J. Gareth Moreton via fpc-devel: Thanks Tomas, Nothing is broken, but the timing measurement isn't precise enough. Normally I have a much higher iteration count (e.g. 1,000,000), but I had reduced it to 10,000 because, coupled with the 1,000 iterations in the

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
Ooo, that might be just what we need.  Thank you Stefan. Kit On 10/10/2023 10:57, Stefan Glienke via fpc-devel wrote: Be my guest making https://github.com/spring4d/benchmark compatible for all platforms you need it for. On 10/10/2023 11:13 CEST J. Gareth Moreton via fpc-devel wrote:

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread Stefan Glienke via fpc-devel
Be my guest making https://github.com/spring4d/benchmark compatible for all platforms you need it for. > On 10/10/2023 11:13 CEST J. Gareth Moreton via fpc-devel > wrote: > > > Thanks Tomas, > > Nothing is broken, but the timing measurement isn't precise enough. > > Normally I have a much

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
Looking at the text log, the results are a bit strange and I can't easily explain it.  Normally a system interrupt would increase the time taken. Let me know if increasing the iteration count fixes it or not. Kit On 10/10/2023 09:57, Tomas Hajny wrote: On 2023-10-09 20:51, J. Gareth Moreton

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
Thanks Tomas, Nothing is broken, but the timing measurement isn't precise enough. Normally I have a much higher iteration count (e.g. 1,000,000), but I had reduced it to 10,000 because the higher count, coupled with the 1,000 iterations in the subroutines themselves, would have led to 1,000,000,000 passes and

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread Tomas Hajny via fpc-devel
On 2023-10-09 20:51, J. Gareth Moreton via fpc-devel wrote: Hi Kit, I updated the "blea" test in the merge request so it now displays the processor brand name on x86_64; however, it is not fetched under i386 because CPUID was not introduced until later 486 processors.  I've attached it to

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread Jean SUZINEAU via fpc-devel
My results on Windows: E:\temp>C:\lazarus\fpc\3.2.2\bin\x86_64-win64\fpc.exe -MObjFPC -Scghi -O1 -g -gl -l -vewnhibq -Fu. -FUlib\x86_64-win64 -FE. -oblea.exe blea.pp Hint: (11030) Start of reading config file C:\lazarus\fpc\3.2.2\bin\x86_64-win64\fpc.cfg Hint: (11031) End of reading config

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread J. Gareth Moreton via fpc-devel
I updated the "blea" test in the merge request so it now displays the processor brand name on x86_64; however, it is not fetched under i386 because CPUID was not introduced until later 486 processors.  I've attached it to this e-mail if anyone wants to take a look to ensure I haven't broken

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread J. Gareth Moreton via fpc-devel
Thank you very much!  That processor is built on the Excavator architecture and lines up with the flag I put in the merge request (i.e. it has the "fast LEA" hint). I honestly didn't expect this much testing feedback, so thank you all! Gareth aka. Kit P.S. I'm tempted to extend the test

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread Jean SUZINEAU via fpc-devel
My results: jean@First-Boss:~/temp$ cat /proc/cpuinfo | grep "model name" model name    : AMD A6-7480 Radeon R5, 8 Compute Cores 2C+6G jean@First-Boss:~/temp$ /usr/bin/fpc blea.pp Free Pascal Compiler version 3.2.2 [2021/07/09] for x86_64 Copyright (c) 1993-2021 by Florian Klaempfl and others

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread J. Gareth Moreton via fpc-devel
Thank you for the report. According to Agner Fog's table, complex LEA instructions should have a 3-cycle latency on that architecture (Haswell). Optimisations with this instruction are proving interesting because there's such a variety between processor architectures. There are some that are

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread Nataraj S Narayan via fpc-devel
Hi Gareth model name : Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz Regards Nataraj S Narayan Synergy Info Systems Software & Technology Consultants Ettumanoor, INDIA Ph:+91 9443211326 On Sun, Oct 8, 2023 at 6:40 PM J. Gareth Moreton via fpc-devel < fpc-devel@lists.freepascal.org> wrote: > Hi

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread Tomas Hajny via fpc-devel
On 2023-10-08 13:45, J. Gareth Moreton via fpc-devel wrote: Sorry, ignore last attachment - I forgot to change a line of assembly (it was correct for x86_64-win64!!). Here is the corrected version. Alright, results for this version for AMD A9 9425 under Linux (the same trunk compiler as

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Did some checking of the test I copied the code from, and I forgot that Rika's original code only exited once a certain time period had elapsed (e.g. 0.5 seconds).  I had changed it to a standard iteration count since I was concerned about fairness and accuracy, but I only changed the loop

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
In the meantime, here's the merge request for the feature based on user tests and studying of Agner Fog's instruction tables: https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/502 Kit

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Hi Nataraj Which processor is that run on? (although too close to call, it implies LEA has a latency of 2 in that case) Kit On 08/10/2023 14:06, Nataraj S Narayan via fpc-devel wrote: Hi [nataraj@dflyHP ~]$ fpc ttt.pas Free Pascal Compiler version 3.2.2 [2023/07/04] for x86_64 Copyright

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread Nataraj S Narayan via fpc-devel
Hi [nataraj@dflyHP ~]$ fpc ttt.pas Free Pascal Compiler version 3.2.2 [2023/07/04] for x86_64 Copyright (c) 1993-2021 by Florian Klaempfl and others Target OS: DragonFly for x86-64 Compiling ttt.pas Linking ttt /usr/local/bin/ld.bfd: warning:

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Sorry, ignore last attachment - I forgot to change a line of assembly (it was correct for x86_64-win64!!). Here is the corrected version. Kit On 08/10/2023 12:38, J. Gareth Moreton via fpc-devel wrote: Sorry, I got careless and was in a rush, as the Pascal code is wrong and I didn't

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Sorry, I got careless and was in a rush, as the Pascal code is wrong and I didn't store the result of the benchmark test, hence the error check at the end returned a false negative. The benchmark code was from Rika's SHA-1 test code, which I didn't properly check, although I assumed the

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread Marģers . via fpc-devel
1. Why do you leave "time:=..." in the benchmark loop? It adds 50% to the execution time per call. 2. The Pascal version does not match the assembler version. Had to fix it.   //Result := X + Counter + $87654321;   Result:=Result + X + $87654321;   Result:=Result xor y; 3. Assembler functions can be

Re: [fpc-devel] LEA instruction speed

2023-10-07 Thread J. Gareth Moreton via fpc-devel
I'm still slightly curious, but if full optimisations make better code, then indeed it's probably not worth the effort. Your timings are incredibly helpful - thank you!  If I understand, AMD A9 is the Excavator architecture, which implies that AMD processors don't suffer from the same latency

Re: [fpc-devel] LEA instruction speed

2023-10-07 Thread Tomas Hajny via fpc-devel
On 2023-10-07 18:09, J. Gareth Moreton via fpc-devel wrote: That's interesting; I am interested to see the assembly output for the Pascal control cases.  As for the 64-bit version, that was my fault since the assembly language is for Microsoft's ABI rather than the System V ABI, so it was

Re: [fpc-devel] LEA instruction speed

2023-10-07 Thread J. Gareth Moreton via fpc-devel
That's interesting; I am interested to see the assembly output for the Pascal control cases.  As for the 64-bit version, that was my fault since the assembly language is for Microsoft's ABI rather than the System V ABI, so it was checking a register with an undefined value.  Find attached the

Re: [fpc-devel] LEA instruction speed

2023-10-07 Thread Tomas Hajny via fpc-devel
On 2023-10-07 03:57, J. Gareth Moreton via fpc-devel wrote: Hi Kit, Do you think this should suffice? Originally it ran for 1,000,000 repetitions but I fear that will take way too long on a 486, so I reduced it to 10,000. OK, I tried it now. First of all, after turning on the old machine, I

Re: [fpc-devel] LEA instruction speed

2023-10-06 Thread J. Gareth Moreton via fpc-devel
Hi Tomas, Do you think this should suffice? Originally it ran for 1,000,000 repetitions but I fear that will take way too long on a 486, so I reduced it to 10,000. Kit On 03/10/2023 06:30, Tomas Hajny via fpc-devel wrote: On October 3, 2023 03:32:34 +0200, "J. Gareth Moreton via fpc-devel"

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread J. Gareth Moreton via fpc-devel
What should I call a new sub-CPU option?  Should it be "ICELAKE" or is there a better name like "CORE10" or "COREX" (X being the Roman numeral for 10, standing in for the 10th generation of Intel Core)? Kit On 03/10/2023 08:02, Florian Klämpfl via fpc-devel wrote: Am 03.10.2023 um 03:32

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread J. Gareth Moreton via fpc-devel
I don't think any of them currently fit. Zen 3 is later than Ice Lake, but I'm not sure if it has a faster LEA or not. I'll do some investigation.  I'll take up Tomas' offer on the 486 test though.  Personally I think the best test might actually be one of the recently-optimised

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread Florian Klämpfl via fpc-devel
> Am 03.10.2023 um 03:32 schrieb J. Gareth Moreton via fpc-devel > : > > Hi everyone, > > This is mainly to Florian, but also to anyone else who can answer the > question - at which point did a complex LEA instruction (using all three > input operands and some other specific circumstances)

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread J. Gareth Moreton via fpc-devel
Hmmm, could be fun to attempt to test - I'll see what I can set up. Kit On 03/10/2023 06:30, Tomas Hajny via fpc-devel wrote: On October 3, 2023 03:32:34 +0200, "J. Gareth Moreton via fpc-devel" wrote: Hi Kit, This is mainly to Florian, but also to anyone else who can answer the

Re: [fpc-devel] LEA instruction speed

2023-10-02 Thread Tomas Hajny via fpc-devel
On October 3, 2023 03:32:34 +0200, "J. Gareth Moreton via fpc-devel" wrote: Hi Kit, >This is mainly to Florian, but also to anyone else who can answer the question >- at which point did a complex LEA instruction (using all three input operands >and some other specific circumstances) get

Re: [fpc-devel] LEA instruction speed

2023-10-02 Thread J. Gareth Moreton via fpc-devel
(And I meant "Ice Lake", not "Icy Lake") On 03/10/2023 02:32, J. Gareth Moreton via fpc-devel wrote: Hi everyone, This is mainly to Florian, but also to anyone else who can answer the question - at which point did a complex LEA instruction (using all three input operands and some other

[fpc-devel] LEA instruction speed

2023-10-02 Thread J. Gareth Moreton via fpc-devel
Hi everyone, This is mainly to Florian, but also to anyone else who can answer the question - at which point did a complex LEA instruction (using all three input operands and some other specific circumstances) get slow?  Preliminary research suggests the 486 was when it gained extra latency,