Re: [fpc-devel] Windows for AArch64

2024-04-29 Thread J. Gareth Moreton via fpc-devel
Thanks Sven.  I'm predicting a future of Windows on AArch64, since Windows is not going anywhere but Arm processors are starting to really take off beyond mobile devices. Kit On 29/04/2024 21:31, Sven Barth via fpc-devel wrote: Am 29.04.2024 um 08:42 schrieb J. Gareth Moreton via fpc-devel

Re: [fpc-devel] Windows for AArch64

2024-04-29 Thread J. Gareth Moreton via fpc-devel
Aah, partially answered.  It's not supported in 3.2.2, but there is better support for it in the trunk. Kit On 29/04/2024 06:42, J. Gareth Moreton via fpc-devel wrote: Hi everyone, I may need some help with this one.  Is there a tried and tested way of getting FPC to build and install

[fpc-devel] Windows for AArch64

2024-04-28 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I may need some help with this one.  Is there a tried and tested way of getting FPC to build and install on aarch64-win64? (I assume that's the correct OS for Windows for ARM64).  The make script doesn't seem to accept the combination of CPU_TARGET=aarch64 OS_TARGET=win64 and

Re: [fpc-devel] Free Pascal for Windows aarch64 Bug Bounties

2024-04-27 Thread J. Gareth Moreton via fpc-devel
I figured that would be the case with the PE format.  Fun times ahead! Kit On 27/04/2024 13:06, Sven Barth via fpc-devel wrote: J. Gareth Moreton via fpc-devel schrieb am Sa., 27. Apr. 2024, 10:00: You've piqued my interest.  I currently only have the ability to develop on aarch64

Re: [fpc-devel] error target i386 -Cp80486

2024-04-23 Thread J. Gareth Moreton via fpc-devel
enance since I shouldn't need to set the operand sizes, but I'll save that for another day. On 24/04/2024 05:08, J. Gareth Moreton via fpc-devel wrote: I've gotten back as far as here, and it's still bad: commit 7fbda0e0e8b1d071e72ccbc5e487dbb1c2173c63 (HEAD) Author: Jonas Maebe Date:  

Re: [fpc-devel] error target i386 -Cp80486

2024-04-23 Thread J. Gareth Moreton via fpc-devel
doesn't work.  However the hang (and also an eventual out of memory error) doesn't occur if -OoNOPEEPHOLE is specified, meaning the problem is located in the peephole optimizer.  So it looks like I'll have to try to debug this the hard way! Kit On 23/04/2024 17:27, J. Gareth Moreton via fpc

Re: [fpc-devel] error target i386 -Cp80486

2024-04-23 Thread J. Gareth Moreton via fpc-devel
I've reproduced the hang doing "make clean all CPU_TARGET=i386 OS_TARGET=win32 OPT="-Cp80486 -Op80486"" on my x86_64-win64 machine. So far I haven't found the bad commit - this problem has been here a while. Kit I still haven't found the bad commit! On 23/04/2024 12:46, J

Re: [fpc-devel] error target i386 -Cp80486

2024-04-23 Thread J. Gareth Moreton via fpc-devel
Absolutely I can.  I'll see what I can find. Gareth aka. Kit On 23/04/2024 12:09, Tomas Hajny via fpc-devel wrote: On 2024-04-23 11:50, Marģers . via fpc-devel wrote: 1) does not work make clean singlezipinstall OS_TARGET=win32 CPU_TARGET=i386 ALLOW_WARNINGS=1 OPT="  -O2 -vxitl -Cp80486

[fpc-devel] Planning refactor of x86 OptPass1MOV

2024-04-15 Thread J. Gareth Moreton via fpc-devel
Hi everyone, While I'm starting to focus more on node-level optimisations, since they benefit more platforms and can produce better optimisations in some situations due to cross-platform support and things like register allocation, I'm thinking about refactoring and improving x86's

Re: [fpc-devel] wrong result for abs(low(int64))

2024-04-04 Thread J. Gareth Moreton via fpc-devel
Essentially, an arithmetic overflow is happening.  Since the largest Int64 possible is 9,223,372,036,854,775,807, going one above that (the result of abs(low(int64))) wraps back around to -9,223,372,036,854,775,808. Internally, you can think about negating (positing?) a negative number as

Re: [fpc-devel] Unaligned access on Cortex-M0 in Initialization code

2024-04-01 Thread J. Gareth Moreton via fpc-devel
Oops - I'm sorry for my introduced bug! Gareth aka. Kit On 31/03/2024 21:48, Michael Ring via fpc-devel wrote: Works, thank you! Michael Am 31.03.24 um 22:18 schrieb Florian Klämpfl via fpc-devel: Am 31.03.2024 um 21:58 schrieb Michael Ring via fpc-devel : This is what I see (guess

Re: [fpc-devel] i386-win32 -CriotR fails to build

2024-03-01 Thread J. Gareth Moreton via fpc-devel
Excellent, thank you Michael. Kit On 01/03/2024 20:56, Michael Van Canneyt via fpc-devel wrote: On Fri, 1 Mar 2024, J. Gareth Moreton via fpc-devel wrote: Just want to confirm that the failure also occurs on x86_64-win64 under -CriotR rules. On all platforms. I fixed compilation

Re: [fpc-devel] i386-win32 -CriotR fails to build

2024-03-01 Thread J. Gareth Moreton via fpc-devel
Just want to confirm that the failure also occurs on x86_64-win64 under -CriotR rules. Kit On 01/03/2024 18:18, J. Gareth Moreton via fpc-devel wrote: Hi everyone. As part of my automated tests I try to build the compiler and packages on i386-win32 under the options "-O4 -CriotR".

[fpc-devel] i386-win32 -CriotR fails to build

2024-03-01 Thread J. Gareth Moreton via fpc-devel
Hi everyone. As part of my automated tests I try to build the compiler and packages on i386-win32 under the options "-O4 -CriotR".  Doing so gives a failure with the vcl_compat package (the failure also occurs with just "-CriotR").  Can others confirm? External command

Re: [fpc-devel] ARM: AND/CMP -> TST optimisation produces incorrect results

2024-02-28 Thread J. Gareth Moreton via fpc-devel
Hi Garry, Hopefully I have fixed this issue now, which is also causing problems elsewhere. https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/598 - just waiting on it to be verified, approved and merged. Gareth aka. Kit On 20/02/2024 06:32, J. Gareth Moreton via fpc-devel wrote

[fpc-devel] Possible bug in "chmreader"

2024-02-21 Thread J. Gareth Moreton via fpc-devel
Hi everyone, While evaluating a new peephole optimisation, I came across a null pointer dereference in the assembly language.  After looking at the original Pascal code, I came across this starting at line 525 of packages/chm/src/chmreader.pas: procedure

Re: [fpc-devel] ARM: AND/CMP -> TST optimisation produces incorrect results

2024-02-19 Thread J. Gareth Moreton via fpc-devel
Thanks for the report and especially your investigative work. I'll take a look to see what's going on. Gareth aka. Kit On 20/02/2024 01:30, Garry Wood via fpc-devel wrote: Hello, Commit 6b2e4fa4 (main) entitled “* arm: "OpCmp2OpS" moved to Pass 2 so it doesn't conflict with AND; CMP ->

[fpc-devel] Compiler warning when built with -dDEBUG_NODE_XML

2024-02-14 Thread J. Gareth Moreton via fpc-devel
Hi everyone, After some recent updates to the trunk, the compiler no longer successfully builds when -dDEBUG_NODE_XML is specified: symsym.pas(2885,9) Warning: (treated as error) Case statement does not handle all possible cases This is located within "procedure

Re: [fpc-devel] Modifiers...

2024-01-24 Thread J. Gareth Moreton via fpc-devel
Note that for 1), it should be "end.", not "end;" - make sure that isn't causing your error. Subroutine directives like "vectorcall" I think are usable as variable names. Kit On 24/01/2024 22:29, Martin Frb via fpc-devel wrote: https://www.freepascal.org/docs-html/ref/refsu3.html Is this

[fpc-devel] Internal "Signed wrap" function

2023-12-01 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I'm just fixing a bugged optimisation, and I came across a situation where I need to essentially do a "signed wrap" of a constant.  At one point I'm optimising "subl $-12,%eax", "addl $1,%eax" into a single instruction, but to make sure (untrapped) overflows are handled

[fpc-devel] Accidental file inclusion in repository

2023-11-29 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I hate to point fingers, but there's a 0-byte file named "HEAD" in the repository, which causes git to throw a tantrum sometimes - it was introduced in the following commit: commit a4c324ee237674950e4675894df386519b75a130 Author: Rika Ichinose Date:   Fri Apr 14 09:24:55 2023

[fpc-devel] Interesting short article about optimisation

2023-11-25 Thread J. Gareth Moreton via fpc-devel
I just stumbled across this article about micro-architecture-specific optimisations in ARM: https://www.phoronix.com/news/ARM64-Linux-No-Uarch-Opts They briefly mention x86_64, and I agree it's good to avoid micro-architecture-specific optimisations and now it makes me wonder where the line

[fpc-devel] Arm compiler limitation

2023-11-23 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So one of my recent merge requests (https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/516) has been having test failures on arm-linux, and this has confused me for a while because they don't occur on aarch64-linux (the two platforms share the same code in this

[fpc-devel] Quirk is "IsJumpToLabel"

2023-11-10 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I've been developing a new optimisation for x86, and in one situation a JMP becomes a Jcc.  To make sure it's valid, I ensure that "IsJumpToLabel" returns True before the change is made.  All was well in x86_64-win64 and x86_64-linux, but on i386-linux, I came across a bit of an

Re: [fpc-devel] Kit's current work

2023-11-08 Thread J. Gareth Moreton via fpc-devel
On 08/11/2023 21:11, Florian Klämpfl via fpc-devel wrote: Am 08.11.2023 um 21:22 schrieb J. Gareth Moreton via fpc-devel: - I don't know what the eventual support for intrinsics will be for FPC, if it will ever get implemented, but I at the very least hope the internal nodes

[fpc-devel] Kit's current work

2023-11-08 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I just thought I'd give a heads up on what I'm currently doing for the Free Pascal Compiler. - Pure functions are still my main target.  There are a few sticking points that I'm trying to resolve, like handling certain internal functions and how to deal with out variables that

[fpc-devel] Optimisation question

2023-10-30 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I'm still exploring optimisations in generated x86 code, something which has become my speciality, and I found one new potential optimisation sequence that aims to reduce unnecessary calls to CMP and TEST when the result is already known.  However there are some situations where

Re: [fpc-devel] LEA instruction speed

2023-10-27 Thread J. Gareth Moreton via fpc-devel
I should have figured.  Thank you! Kit On 27/10/2023 01:51, Nikolay Nikolov via fpc-devel wrote: On 10/11/23 11:21, Tomas Hajny via fpc-devel wrote: On 2023-10-11 04:15, J. Gareth Moreton via fpc-devel wrote: Sweet, thank you.  Would you be willing to share your modified test's source? I

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
rs has proven useful in determining the correctness of the new optimisation hint, which I intend to use to make the i386/x86_64 peephole optimizer smarter in regards to using LEA statements. Kit On 13/10/2023 16:36, Tomas Hajny via fpc-devel wrote: On 2023-10-13 17:08, J. Gareth Moreton via fpc-devel

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
This one's for you Stefan! https://github.com/spring4d/benchmark/issues/4 Kit On 13/10/2023 16:03, Tomas Hajny via fpc-devel wrote: On 2023-10-13 16:25, J. Gareth Moreton via fpc-devel wrote: GetLogicalProcessorInformation returns a Boolean - if false, an error occurred, and is handled

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
-13 16:25, J. Gareth Moreton via fpc-devel wrote: GetLogicalProcessorInformation returns a Boolean - if false, an error occurred, and is handled as follows: DiagnoseAndExit('Failed during call to GetLogicalProcessorInformation: ' + GetLastError.ToString); GetLastError = 8 indicates "out of m

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
Hajny via fpc-devel wrote: On 2023-10-13 09:26, Tomas Hajny wrote: On 2023-10-12 20:02, J. Gareth Moreton via fpc-devel wrote: So an update.  .  . The latest version of blea.pp doesn't compile with a 32-bit compiler - line 76 contains an unconditional reference to R8 register, which obviously do

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
with "CPUName" on 32-bit).  I wasn't sure if global variables were initialised or not, hence me playing safe. Kit On 13/10/2023 08:34, Tomas Hajny via fpc-devel wrote: On 2023-10-13 09:26, Tomas Hajny wrote: On 2023-10-12 20:02, J. Gareth Moreton via fpc-devel wrote: So

Re: [fpc-devel] LEA instruction speed

2023-10-13 Thread J. Gareth Moreton via fpc-devel
So an update. I've added Spring.Benchmark to "tests/bench/spring" on my local branch, along with its readme and licence file.  It seems to work quite well even if it feels a bit like overkill for this small a benchmark.  Still, I've attached the version with Stefan's translated Google

[fpc-devel] 47k attachment

2023-10-12 Thread J. Gareth Moreton via fpc-devel
To whom it may concern, I have a new message for the "LEA instruction speed" chain, but it is currently in holding as it contains a 47k ZIP file (source code only, and a third-party licence agreement).  Can the mailing list maintainer confirm (or deny) that it's okay? Kit

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
:13 AM J. Gareth Moreton via fpc-devel wrote: Thanks Tomas, Nothing is broken, but the timing measurement isn't precise enough. Normally I have a much higher iteration count (e.g. 1,000,000), but I had reduced it to 10,000 because, coupled with the 1,000 iterations in the subroutines themselves

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
Sweet, thank you.  Would you be willing to share your modified test's source? I was worried that if CPUID wasn't present it would cause a SIGILL. Kit On 11/10/2023 01:47, Tomas Hajny via fpc-devel wrote: On 2023-10-10 13:24, J. Gareth Moreton via fpc-devel wrote: I'm all for receiving

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
sed, while AGUs tend to be used one at a time. On 10/10/2023 11:54, Tomas Hajny via fpc-devel wrote: On 2023-10-10 12:19, Marco van de Voort via fpc-devel wrote: Op 10-10-2023 om 11:13 schreef J. Gareth Moreton via fpc-devel: Thanks Tomas, Nothing is broken, but the timing measurement isn't prec

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
Ooo, that might be just what we need.  Thank you Stefan. Kit On 10/10/2023 10:57, Stefan Glienke via fpc-devel wrote: Be my guest making https://github.com/spring4d/benchmark compatible for all platforms you need it for. On 10/10/2023 11:13 CEST J. Gareth Moreton via fpc-devel wrote

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
sion shows 0.0 ns/call, sometimes the LEA version shows 0.0 ns/call (32-bits) or 0.1 ns/call (64-bits). See the attached results (the CPU is only displayed for the 64-bit compilation, but it's obviously the same CPU). Tomas On 09/10/2023 18:01, J. Gareth Moreton via fpc-devel wrote: Thank you

Re: [fpc-devel] LEA instruction speed

2023-10-10 Thread J. Gareth Moreton via fpc-devel
routines.  Still, let's see if 100,000 gives better results for you. Kit On 10/10/2023 09:57, Tomas Hajny wrote: On 2023-10-09 20:51, J. Gareth Moreton via fpc-devel wrote: Hi Kit, I updated the "blea" test in the merge request so it now displays the processor brand name on x86_6

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread J. Gareth Moreton via fpc-devel
n't broken something. Kit On 09/10/2023 18:01, J. Gareth Moreton via fpc-devel wrote: Thank you very much!  That processor is built on the Excavator architecture and lines up with the flag I put in the merge request (i.e. it has the "fast LEA" hint). I honestly didn't expect this much

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread J. Gareth Moreton via fpc-devel
Thank you very much!  That processor is built on the Excavator architecture and lines up with the flag I put in the merge request (i.e. it has the "fast LEA" hint). I honestly didn't expect this much testing feedback, so thank you all! Gareth aka. Kit P.S. I'm tempted to extend the test

Re: [fpc-devel] LEA instruction speed

2023-10-09 Thread J. Gareth Moreton via fpc-devel
noor, INDIA Ph:+91 9443211326 On Sun, Oct 8, 2023 at 6:40 PM J. Gareth Moreton via fpc-devel wrote: Hi Nataraj Which processor is that run on? (although too close to call, it implies LEA has a latency of 2 in that case) Kit On 08/10/2023 14:06, Nataraj S Narayan via fpc-d

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Did some checking of the test I copied the code from, and I forgot that Rika's original code only exited once a certain time period had elapsed (e.g. 0.5 seconds).  I had changed it to a standard iteration count since I was concerned about fairness and accuracy, but I only changed the loop

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
In the meantime, here's the merge request for the feature based on user tests and studying of Agner Fog's instruction tables: https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/502 Kit

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
logy Consultants Ettumanoor, INDIA Ph:+91 9443211326 On Sat, Oct 7, 2023 at 9:39 PM J. Gareth Moreton via fpc-devel wrote: That's interesting; I am interested to see the assembly output for the Pascal control cases.  As for the 64-bit version, that was my fault since the asse

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Sorry, ignore last attachment - I forgot to change a line of assembly (it was correct for x86_64-win64!!). Here is the corrected version. Kit On 08/10/2023 12:38, J. Gareth Moreton via fpc-devel wrote: Sorry, I got careless and was in a rush, as both the Pascal code is wrong and I didn't

Re: [fpc-devel] LEA instruction speed

2023-10-08 Thread J. Gareth Moreton via fpc-devel
Sorry, I got careless and was in a rush, as both the Pascal code is wrong and I didn't store the result of the benchmark test, hence the error check at the end returned a false negative. The benchmark code was from Rika's SHA-1 test code, which I didn't properly check, although I assumed the

Re: [fpc-devel] LEA instruction speed

2023-10-07 Thread J. Gareth Moreton via fpc-devel
486 shows it is at least as fast as two ADDs in a dependency chain. That should be all the information I need - thanks again! Kit On 07/10/2023 19:03, Tomas Hajny via fpc-devel wrote: On 2023-10-07 18:09, J. Gareth Moreton via fpc-devel wrote: That's interesting; I am interested to see

Re: [fpc-devel] LEA instruction speed

2023-10-07 Thread J. Gareth Moreton via fpc-devel
the fixed test. Kit P.S. Results on my Intel(R) Core(TM) i7-10750H    Pascal control case: 2.0 ns/call  Using LEA instruction: 1.7 ns/call Using ADD instructions: 1.3 ns/call On 07/10/2023 16:51, Tomas Hajny via fpc-devel wrote: On 2023-10-07 03:57, J. Gareth Moreton via fpc-devel wrote: Hi Kit

Re: [fpc-devel] LEA instruction speed

2023-10-06 Thread J. Gareth Moreton via fpc-devel
Hi Tomas, Do you think this should suffice? Originally it ran for 1,000,000 repetitions but I fear that will take way too long on a 486, so I reduced it to 10,000. Kit On 03/10/2023 06:30, Tomas Hajny via fpc-devel wrote: On October 3, 2023 03:32:34 +0200, "J. Gareth Moreton via fpc-

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread J. Gareth Moreton via fpc-devel
ote: Am 03.10.2023 um 03:32 schrieb J. Gareth Moreton via fpc-devel : Hi everyone, This is mainly to Florian, but also to anyone else who can answer the question - at which point did a complex LEA instruction (using all three input operands and some other specific circumstances) get s

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread J. Gareth Moreton via fpc-devel
cryptographic functions. Kit On 03/10/2023 08:02, Florian Klämpfl via fpc-devel wrote: Am 03.10.2023 um 03:32 schrieb J. Gareth Moreton via fpc-devel : Hi everyone, This is mainly to Florian, but also to anyone else who can answer the question - at which point did a complex LEA instruction

Re: [fpc-devel] LEA instruction speed

2023-10-03 Thread J. Gareth Moreton via fpc-devel
Hmmm, could be fun to attempt to test - I'll see what I can set up. Kit On 03/10/2023 06:30, Tomas Hajny via fpc-devel wrote: On October 3, 2023 03:32:34 +0200, "J. Gareth Moreton via fpc-devel" wrote: Hi Kit, This is mainly to Florian, but also to anyone else who

Re: [fpc-devel] LEA instruction speed

2023-10-02 Thread J. Gareth Moreton via fpc-devel
(And I meant "Ice Lake", not "Icy Lake") On 03/10/2023 02:32, J. Gareth Moreton via fpc-devel wrote: Hi everyone, This is mainly to Florian, but also to anyone else who can answer the question - at which point did a complex LEA instruction (using all three input ope

[fpc-devel] LEA instruction speed

2023-10-02 Thread J. Gareth Moreton via fpc-devel
Hi everyone, This is mainly to Florian, but also to anyone else who can answer the question - at which point did a complex LEA instruction (using all three input operands and some other specific circumstances) get slow?  Preliminary research suggests the 486 was when it gained extra latency,

Re: [fpc-devel] A call to help test pure functions

2023-10-02 Thread J. Gareth Moreton via fpc-devel
As an additional note - apologies to those who responded to me directly, but for some reason, GMail doesn't like e-mails coming from my domain name, so I have to use my own GMail account, watercran...@gmail.com to respond. Kit On 02/10/2023 18:21, J. Gareth Moreton via fpc-devel wrote

Re: [fpc-devel] A call to help test pure functions

2023-10-02 Thread J. Gareth Moreton via fpc-devel
", say. Given it's a Free Pascal construct, it probably should be disabled in Delphi mode etc, but currently it isn't. Kit On 02/10/2023 12:43, Mattias Gaertner via fpc-devel wrote: On 29.09.23 21:28, J. Gareth Moreton via fpc-devel wrote: [...]  As the examples imply, to mark as a functi

[fpc-devel] A call to help test pure functions

2023-09-29 Thread J. Gareth Moreton via fpc-devel
Hi everyone, This has been something that's been in development for some time now, and I invite other Free Pascal users and developers to test the new feature... pure functions.  Its closest equivalent in C++ would be "constexpr". A pure function has no side-effects - it doesn't affect the

[fpc-devel] Request to add SHA-2 and Keccak (SHA-3) to hash package

2023-09-28 Thread J. Gareth Moreton via fpc-devel
Hi everyone, Given the current work on optimising and fixing the hash package, I would like to propose adding two additional families of hash functions to the package: * SHA-2: namely SHA-224 and SHA-256 (they essentially share the algorithm with some changes to the initial constants and

[fpc-devel] Some handy information regarding LEA instructions

2023-09-22 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I just discovered this while trying to optimise some of the hash functions.  This might already be known, but in case it isn't, here's something useful to know. The LEA instruction is useful because you can essentially perform "x := y + z + const" with one instruction, or just

Re: [fpc-devel] x86_64 SHA1 implementation (J. Gareth Moreton)

2023-09-17 Thread J. Gareth Moreton via fpc-devel
I will admit that part of me likes to program my own implementations in assembly language if just for the practice, especially learning where latency and stalls happen.  The problem with most of the examples given in this chain is that they use a 'high-level' assembly language with macros and

Re: [fpc-devel] x86_64 SHA1 implementation

2023-09-16 Thread J. Gareth Moreton via fpc-devel
ere need to be a new one? Kit On 15/09/2023 22:48, Florian Klämpfl via fpc-devel wrote: Am 16.09.23 um 15:13 schrieb J. Gareth Moreton via fpc-devel: Hi everyone, So this past week I've been building on Rika's work by adding an assembly version of SHA-1 for x86_64 to complement Rika's i386

Re: [fpc-devel] x86_64 SHA1 implementation

2023-09-16 Thread J. Gareth Moreton via fpc-devel
er, but I'm not sure what the equivalent Intel processor is... is "cpu_core_avx2" okay or does there need to be a new one? Kit On 15/09/2023 22:48, Florian Klämpfl via fpc-devel wrote: Am 16.09.23 um 15:13 schrieb J. Gareth Moreton via fpc-devel: Hi everyone, So this past week I'v

Re: [fpc-devel] x86_64 SHA1 implementation

2023-09-16 Thread J. Gareth Moreton via fpc-devel
that is guaranteed to be available on all x86_64 processors.  I can write versions for SSSE3 and AVX later, but currently I'm trying to identify the mysterious performance drops. Kit On 16/09/2023 16:18, Wayne Sherman wrote: J. Gareth Moreton via fpc-devel wrote: So this past week I've been

[fpc-devel] x86_64 SHA1 implementation

2023-09-16 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So this past week I've been building on Rika's work by adding an assembly version of SHA-1 for x86_64 to complement Rika's i386 version.  So far I've successfully made a version that runs twice as fast as the Pascal code.  I hoped to go even faster by making use of the SSE2

Re: [fpc-devel] Progress on pure functions

2023-08-17 Thread J. Gareth Moreton via fpc-devel
n't be analysed) so I'm trying to work out if I can create a temporary procdef and symtable.  So far I haven't had much luck, but I'll keep up the work. Kit On 16/08/2023 05:05, J. Gareth Moreton via fpc-devel wrote: Fixed my problem with the recursive function (enabling range check and overflo

Re: [fpc-devel] Progress on pure functions

2023-08-15 Thread J. Gareth Moreton via fpc-devel
Fixed my problem with the recursive function (enabling range check and overflow errors blocked dead-store elimination, so I worked around that) and the warning no longer cascades.  Progress is being made! Kit On 16/08/2023 04:02, J. Gareth Moreton via fpc-devel wrote: So managed to stop

Re: [fpc-devel] Progress on pure functions

2023-08-15 Thread J. Gareth Moreton via fpc-devel
" flag is removed from the subroutine. Currently the "analysis did not produce simple assignment" part is a hard-coded string and not a part of errore.msg, for example, so there may need to be a way to adapt this to be multi-lingual. Kit On 12/08/2023 18:14, J. Gareth Moreton via

[fpc-devel] Progress on pure functions

2023-08-12 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So I'm still working on pure functions and have pushed some merge requests that are indirectly related to it, mostly simplifying the node tree so it can more easily be collapsed into simple assignments (what pure functions should simplify to).  Negative testing is still limited,

[fpc-devel] "Ordinal expression expected" awkwardness

2023-07-19 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So I've come across a bit of awkwardness with the compiler.  I'm not sure if it's a well-defined rule that I've overlooked, but in a for-loop, you can't use a 64-bit control variable when compiling for i386-win32 (and presumably other 32-bit platforms), but you can under

Re: [fpc-devel] Division nodes

2023-05-23 Thread J. Gareth Moreton via fpc-devel
NOT generate a conditional check for a divisor of -1, so LongInt($8000) div LongInt(-1) will raise an exception, but won't raise an exception on x86_64-win64, thus behaviour between platforms is different. Can others confirm this? Kit On 21/05/2023 00:00, J. Gareth Moreton via fpc-devel

Re: [fpc-devel] Division nodes

2023-05-20 Thread J. Gareth Moreton via fpc-devel
ility, should raise an exception if you try to divide it by -1 since the programmer is asking to downsize values that could potentially be out of range. Kit On 19/05/2023 21:55, Florian Klämpfl via fpc-devel wrote: Am 19.05.23 um 21:14 schrieb J. Gareth Moreton via fpc-devel: So I need to ask

Re: [fpc-devel] Division nodes

2023-05-19 Thread J. Gareth Moreton via fpc-devel
/05/2023 21:55, Florian Klämpfl via fpc-devel wrote: Am 19.05.23 um 21:14 schrieb J. Gareth Moreton via fpc-devel: So I need to ask... should the check for a divisor of -1 still be performed? Yes. This is the result of "down sizing" a division. In case of longint(int64 div int64) can

Re: [fpc-devel] Division nodes

2023-05-19 Thread J. Gareth Moreton via fpc-devel
" to "-x", and Intel's NEG instruction doesn't return an error if min_int is its input operand, but I can't be sure if the same applies to non-Intel processors and their equivalent instructions). Kit On 17/05/2023 09:51, J. Gareth Moreton via fpc-devel wrote: Logically yes, but using

Re: [fpc-devel] Division nodes

2023-05-17 Thread J. Gareth Moreton via fpc-devel
Logically yes, but using 16-bit as an example, min_int is -32,768, and signed 16-bit integers range from -32,768 to 32,767. So -32,768 ÷ -1 = 32,768, which is out of range.  This is where the problem lies. Internally, negation involves inverting all of the bits and then adding 1 (essentially

Re: [fpc-devel] Division nodes

2023-05-15 Thread J. Gareth Moreton via fpc-devel
change? And, what min_int is of course depends on whether targeting a 32-bit or 64-bit system, so best check both cases. ~Kirinn On Mon, 15 May 2023 17:21:30 +0100 "J. Gareth Moreton via fpc-devel" wrote: I made a merge request that removes the comparison against -1. x86_64-win64 and

Re: [fpc-devel] Division nodes

2023-05-15 Thread J. Gareth Moreton via fpc-devel
On 11/05/2023 23:04, J. Gareth Moreton via fpc-devel wrote: Fair enough, but I would like an explanation as to why it's necessary, because the reason for testing -1 in particular is very unclear, and I wonder if there's a known misbehaviour with a particular division function with -1. Kit On 11

Re: [fpc-devel] Division nodes

2023-05-11 Thread J. Gareth Moreton via fpc-devel
Fair enough, but I would like an explanation as to why it's necessary, because the reason for testing -1 in particular is very unclear, and I wonder if there's a known misbehaviour with a particular division function with -1. Kit On 11/05/2023 21:27, Wayne Sherman wrote: On Thu, May 11,

Re: [fpc-devel] Division nodes

2023-05-11 Thread J. Gareth Moreton via fpc-devel
is then converted elsewhere) Kit On 11/05/2023 18:01, J. Gareth Moreton via fpc-devel wrote: P.S. I found the code that adds the conditional checks; it's "doremoveinttypeconvs" in the ncnv unit.  However, it's very unclear as to WHY it's doing it as there's no comments around the code block. Kit

Re: [fpc-devel] Division nodes

2023-05-11 Thread J. Gareth Moreton via fpc-devel
P.S. I found the code that adds the conditional checks; it's "doremoveinttypeconvs" in the ncnv unit.  However, it's very unclear as to WHY it's doing it as there's no comments around the code block. Kit On 11/05/2023 15:39, J. Gareth Moreton via fpc-devel wrote: It doe

Re: [fpc-devel] Division nodes

2023-05-11 Thread J. Gareth Moreton via fpc-devel
5% of your divisors are -1 and you really need to save those few extra cycles of calling idiv. On 11/05/2023 11:04 CEST J. Gareth Moreton via fpc-devel wrote: Hi everyone, I need to ask a question about how division nodes are set up (I'm looking at possible optimisation techniques).  I'

[fpc-devel] Division nodes

2023-05-11 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I need to ask a question about how division nodes are set up (I'm looking at possible optimisation techniques).  I've written the following procedure: procedure DoDivMod(N, D: Integer; out Q, R: Integer); begin   Q := N div D;   R := N mod D; end; Fairly simple and to the

Re: [fpc-devel] Curious about the effect of all the new optimizations....

2023-03-01 Thread J. Gareth Moreton via fpc-devel
On 01/03/2023 13:10, Sven Barth wrote: It's a German proverb: "Mühsam ernährt sich das Eichhörnchen" Regards, Sven Thanks Sven! Kit ___ fpc-devel maillist - fpc-devel@lists.freepascal.org

Re: [fpc-devel] Curious about the effect of all the new optimizations....

2023-03-01 Thread J. Gareth Moreton via fpc-devel
On 01/03/2023 13:11, Martin Frb via fpc-devel wrote: Hence testing back to 3.2.3 (unfortunately 3.2.2 has a bug that matters in this code). Also, I didn't expect any huge diffs, just wanted to see if anything can be noted at all (and if lucky, in that test I run). I did a test on a more

Re: [fpc-devel] Curious about the effect of all the new optimizations....

2023-03-01 Thread J. Gareth Moreton via fpc-devel
My peephole optimisations mostly save only a handful of cycles each time which probably won't add up to much for a relatively short test.  The most major optimisation I can think of, although I'm not quite sure when it was merged, is the method of replacing divisions by a constant with an

Re: [fpc-devel] Unexpected "Range check error while evaluating constants" when compiling for Win64

2023-02-12 Thread J. Gareth Moreton via fpc-devel
Yeah, of course, since LongInt($8001), before typecasting to HKEY, is specifically a signed constant.  So obvious!! Kit On 12/02/2023 20:43, Bart via fpc-devel wrote: On Sun, Feb 12, 2023 at 6:26 PM J. Gareth Moreton via fpc-devel wrote: If HKey is unsigned, then yes, the definition

Re: [fpc-devel] Unexpected "Range check error while evaluating constants" when compiling for Win64

2023-02-12 Thread J. Gareth Moreton via fpc-devel
If HKey is unsigned, then yes, the definition should be HKEY(DWORD($8001)).  I do wonder why Win64 produces that particular error though because the final destination is an unsigned 64-bit integer. Kit On 12/02/2023 17:17, Bart via fpc-devel wrote: Hi, This code compiles happily for

[fpc-devel] Fixing bugs

2023-02-02 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I've just made an update to https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/366, the request that fixes i40111, removing the band-aid in aoptx86 and hopefully still fixing the original bug.  Can everyone confirm that i386-linux no longer crashes? In the meantime

Re: [fpc-devel] Incorrect hint (5023) "unit not used", if unit is only used in a conditional compiler expression (like: {$IF ..})

2023-01-13 Thread J. Gareth Moreton via fpc-devel
In my opinion, yes, report this as a bug.  Sure, it's what I'd consider "low priority" since it's just an incorrect informative hint rather than something critical, but it's a bug nonetheless. Kit On 13/01/2023 11:54, Bart via fpc-devel wrote: Consider the following program: === program

[fpc-devel] Happy New Year!

2022-12-31 Thread J. Gareth Moreton via fpc-devel
Happy New Year everybody!  Free Pascal lives on! Kit

Re: [fpc-devel] Progress on pure functions

2022-12-15 Thread J. Gareth Moreton via fpc-devel
ions copy and analyse the same tree.  If a function is both pure and inline, these initial trees will always be identical, so it is redundant to store two copies. Kit On 16/12/2022 06:44, Sven Barth wrote: Am 16.12.2022 um 02:02 schrieb J. Gareth Moreton via fpc-devel: The purity analysi

Re: [fpc-devel] Progress on pure functions

2022-12-15 Thread J. Gareth Moreton via fpc-devel
The purity analysis process is very dependent on the node tree being as clean as possible, and so depends on a fair few merge requests that have not yet been approved.  I'm guessing Florian and Jonas and others are somewhat busy, what with being December and all. -

Re: [fpc-devel] Progress on pure functions

2022-12-15 Thread J. Gareth Moreton via fpc-devel
ded.  Because only the unoptimised tree is stored, I felt there was no need to store this twice (doing so would also increase the size of PPU files). Kit On 15/12/2022 21:39, Sven Barth wrote: Am 14.12.2022 um 12:15 schrieb J. Gareth Moreton via fpc-devel: To better explain how purity analysi

Re: [fpc-devel] Progress on pure functions

2022-12-14 Thread J. Gareth Moreton via fpc-devel
ight be that I have to find a way to perform purity analysis without ever calling firstpass on the node tree, which is currently needed for processing nested pure and inline functions... granted, I can possibly do that selectively using "foreachnode" or "foreachnodestatic".

Re: [fpc-devel] Progress on pure functions

2022-12-14 Thread J. Gareth Moreton via fpc-devel
d is set to a value the compiler doesn't recognise, it just falls back to regular call-node handling. Kit On 14/12/2022 11:39, J. Gareth Moreton via fpc-devel wrote: On 14/12/2022 10:18, Sven Barth via fpc-devel wrote: Wouldn't it make more sense to ensure that the Str() and Val() intr

Re: [fpc-devel] Progress on pure functions

2022-12-14 Thread J. Gareth Moreton via fpc-devel
On 14/12/2022 10:18, Sven Barth via fpc-devel wrote: Wouldn't it make more sense to ensure that the Str() and Val() intrinsics work correctly inside "pure" functions? After all the compiler can then simply evaluate the inline nodes and does not have to "interpret" a ton of

Re: [fpc-devel] Progress on pure functions

2022-12-14 Thread J. Gareth Moreton via fpc-devel
Kit On 14/12/2022 10:18, Sven Barth via fpc-devel wrote: J. Gareth Moreton via fpc-devel schrieb am Di., 13. Dez. 2022, 22:09: The next big milestone that I want to achieve is to make this a pure function: procedure int_str_unsigned(l:longword;out s:shortstring); pure

Re: [fpc-devel] Progress on pure functions

2022-12-13 Thread J. Gareth Moreton via fpc-devel
h also correctly replaces the function call with the number 12. Kit On 14/12/2022 01:17, J. Gareth Moreton via fpc-devel wrote: So there are bugs in my pure function code, specifically with the use of current_procinfo - I didn't realise until now that the one relating to the current function is act
