Re: [fpc-devel] Thoughts: Make FillChar etc. an intrinsic for specialised performance potential

2022-04-16 Thread J. Gareth Moreton via fpc-devel
It's funny - for some reason I was expecting a lot of opposition! I knew about FillChar being written in assembly language and know from experience that FPC will never support the inlining of pure assembler routines (as Florian said, it will just open a huge can of worms).  My thought was how

Re: [fpc-devel] x86: Efficiency of opposing CMOVs

2022-04-16 Thread J. Gareth Moreton via fpc-devel
ing a MOV/CMP/CMOV triplet.  In other words, it tries to convert a CMOV to a MOV if it can. On 16/04/2022 20:03, Florian Klämpfl via fpc-devel wrote: On 16.04.2022 at 12:31, Thorsten Otto via fpc-devel wrote: On Saturday, 16 April 2022 06:49:07 CEST J. Gareth Moreton via fpc-devel

[fpc-devel] x86: Efficiency of opposing CMOVs

2022-04-15 Thread J. Gareth Moreton via fpc-devel
Hi everyone, In the x86_64 assembly dumps, I frequently come across combinations such as the following:     cmpl    %ebx,%edx     cmovll    %ebx,%eax     cmovnll    %edx,%eax This is essentially the ternary C operator "x = cond ? trueval : falseval", or in Pascal "if (cond) then x := trueva
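
For readers following the thread, the select pattern quoted above can be sketched in C. This is only an illustration: the function names are hypothetical, the operands are named after the registers in the dump (d for %edx, b for %ebx), and whether a given compiler emits a branch, a single CMOV, or the opposing CMOV pair depends on the compiler and its settings.

```c
#include <stdint.h>

/* Branching form: "if (cond) then x := trueval else x := falseval". */
int32_t select_branch(int32_t d, int32_t b)
{
    int32_t x;
    if (d < b)      /* same signed comparison as "cmpl %ebx,%edx" + "l" */
        x = b;      /* cmovll  %ebx,%eax */
    else
        x = d;      /* cmovnll %edx,%eax */
    return x;
}

/* Ternary form: "x = cond ? trueval : falseval". */
int32_t select_ternary(int32_t d, int32_t b)
{
    return (d < b) ? b : d;
}
```

Both functions return the larger of the two inputs, so exactly one of the two conditional moves in the quoted sequence takes effect for any given pair of values.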

[fpc-devel] Thoughts: Make FillChar etc. an intrinsic for specialised performance potential

2022-04-15 Thread J. Gareth Moreton via fpc-devel
Hi everyone, This is something that sprung to mind when thinking about code speed and the like, and one thing that cropped up is the initialisation of large variables such as arrays or records.  A common means of doing this is, say: FillChar(MyVar, SizeOf(MyVar), 0); To keep things as genera
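
As a point of comparison only, the closest C analogue of the FillChar idiom above is memset with a size known at compile time. The record type and function name below are hypothetical. C compilers such as GCC and Clang commonly treat memset as a built-in and may expand a fixed-size call inline; that is roughly the kind of specialisation the subject line asks about, though nothing here reflects how FPC would implement it.

```c
#include <string.h>

/* Hypothetical record type standing in for "MyVar". */
typedef struct {
    int    id;
    double values[8];
    char   name[16];
} Record;

/* C counterpart of FillChar(MyVar, SizeOf(MyVar), 0):
   the size is a compile-time constant taken from the type. */
void clear_record(Record *r)
{
    memset(r, 0, sizeof *r);
}
```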

Re: [fpc-devel] Aligned array feature

2022-04-12 Thread J. Gareth Moreton via fpc-devel
ogress, I stripped everything out except what was advertised... making __m128 etc. aligned. My current question though is regarding testing.  Writing tests for these aligned arrays and records is simple enough, but I'm not sure what subdirectory/class they fall under... tbs or test/cg etc.

[fpc-devel] Aligned array feature

2022-04-12 Thread J. Gareth Moreton via fpc-devel
Hi everyone, To complement aligned records, I'm trying out an implementation of aligned arrays.  Like how you might declare an aligned record as follows: type AlignedVector = packed record     X, Y: Double; end align 16; At present I've gone for the following for arrays: type AlignedVector
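
For comparison, here is how the same 16-byte alignment requirement looks in C11. This is not FPC syntax and says nothing about the proposed Pascal array syntax, which is cut off above; the type names are hypothetical. C cannot attach an alignment to a plain array typedef, so the array is wrapped in a struct.

```c
#include <stdalign.h>

/* Record of two doubles, aligned to 16 bytes
   (compare "record X, Y: Double; end align 16"). */
typedef struct {
    alignas(16) double x;
    double y;
} AlignedVector;

/* Two-element array of doubles with the same alignment,
   wrapped in a struct because alignas applies to objects and members. */
typedef struct {
    alignas(16) double v[2];
} AlignedPair;
```

Both types occupy 16 bytes, which is the layout an aligned SSE load or store expects.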

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-11 Thread J. Gareth Moreton via fpc-devel
Since this feature is still a work in progress and bugs are inevitable, my merge request over here now only does what it was originally designed to do... make __m128 and the like aligned, although admittedly code that ensures they are treated as vector types is tied into the same commit: https

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-09 Thread J. Gareth Moreton via fpc-devel
course I don't want to force the call to make On 08/04/2022 20:58, Jonas Maebe via fpc-devel wrote: On 08/04/2022 20:31, J. Gareth Moreton via fpc-devel wrote: That might explain a few things.  The problem is that under vectorcall and the System V ABI (the default x86_64 calling convention

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-08 Thread J. Gareth Moreton via fpc-devel
On 08/04/2022 19:19, Jonas Maebe via fpc-devel wrote: On 08/04/2022 19:57, J. Gareth Moreton via fpc-devel wrote: It looks like support for writing to arrays that are wholly stored in registers is a little limited and buggy Modifying individual elements of arrays stored in registers has never

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-08 Thread J. Gareth Moreton via fpc-devel
.1] of Double align 16;", which is essentially __m128d). Gareth aka. Kit On 08/04/2022 18:57, J. Gareth Moreton via fpc-devel wrote: It looks like support for writing to arrays that are wholly stored in registers is a little limited and buggy - while it writes to temporary memory when modi

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-08 Thread J. Gareth Moreton via fpc-devel
It looks like support for writing to arrays that are wholly stored in registers is a little limited and buggy - while it writes to temporary memory when modifying an individual element, the compiler sometimes doesn't write back the final result into the original register.  I'm seeing if I can f

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-06 Thread J. Gareth Moreton via fpc-devel
On 06/04/2022 22:58, J. Gareth Moreton via fpc-devel wrote: On 06/04/2022 21:16, Jonas Maebe via fpc-devel wrote: On 06/04/2022 19:20, J. Gareth Moreton via fpc-devel wrote: I recently made a merge request that initially just fixed the incorrect memory alignment for __m128 and similar types, but

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-06 Thread J. Gareth Moreton via fpc-devel
I used it because it was easy to hot-swap the constructor in the definition of __m128 and the like, and it was a quick and convenient way to ensure the alignment was correct. Gareth aka. Kit On 06/04/2022 21:16, Jonas Maebe via fpc-devel wrote: On 06/04/2022 19:20, J. Gareth Moreton via fpc

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-06 Thread J. Gareth Moreton via fpc-devel
Pascal simply is a strongly typed language. Vector intrinsics are no reason to weaken this. Thus you need to declare operator overloads that hide the nitty, gritty details of assigning a TVector4 to a __m128, e.g.: type   TVector4 = packed record     X, Y, Z, W: Single;     class operator := (c

Re: [fpc-devel] Problems with MM types (__m128 etc).

2022-04-06 Thread J. Gareth Moreton via fpc-devel
Another problem... I've tried to declare an ADDPD intrinsic as follows: function x86_addpd(r0, r1: __m128d): __m128d; [INTERNPROC: fpc_in_x86_addpd]; I thought using __m128d instead of __m128 was fairly logical since ADDPD works with Doubles, not Singles, but this can cause problems.  For ex

[fpc-devel] Problems with MM types (__m128 etc).

2022-04-06 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I recently made a merge request that initially just fixed the incorrect memory alignment for __m128 and similar types, but doing so revealed a whole plethora of other bugs.  First, when I fixed it, __m128 etc were no longer recognised as a valid SIMD or aggregate type due to the wr

[fpc-devel] Question about typeconv nodes

2022-04-04 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I've noticed that, in many units, the generated node trees produce a lot of typeconv nodes with convtype=tc_equal.  For example (4th line - this is generated with DEBUG_NODE_XML): ... .. .convtype="tc_equal"> ...result .

Re: [fpc-devel] Package build failure under i386-win32

2022-04-03 Thread J. Gareth Moreton via fpc-devel
Ah, sorry for the confusion, but it occurs even if -a is omitted.  I got mixed up with something else.  So it's probably a simple range check error in the source and not an assembler issue. Gareth aka. Kit

Re: [fpc-devel] Package build failure under i386-win32

2022-04-03 Thread J. Gareth Moreton via fpc-devel
al: Compilation aborted The installer encountered the following error: Compilation of "BuildUnit_chm.pp" failed As mentioned, it doesn't show up if -a is omitted. Gareth aka. Kit On 03/04/2022 15:42, Florian Klämpfl via fpc-devel wrote: On 03.04.2022 at 15:44, J. Gareth Moreton

[fpc-devel] Package build failure under i386-win32

2022-04-03 Thread J. Gareth Moreton via fpc-devel
Hi everyone, It seems at some point, something was introduced to the compiler that causes a package to fail to build (specifically packages\chm\src\chmwriter.pas) with an assembler-level range check error.  If you run "make all" under i386-win32 with "-CriotR -a" options, the error is trigger

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-03-27 Thread J. Gareth Moreton via fpc-devel
It's done! https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/191 Whitepaper can be found as an attachment in that merge request. Go ham on trying to break it! It will likely need refactoring later, especially as I want to port it to AArch64 at some point to see how it affects cod

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-03-11 Thread J. Gareth Moreton via fpc-devel
id it make much of a difference in the compilation time? On Fri, 25 Feb 2022 at 02:08, J. Gareth Moreton via fpc-devel wrote: I did it! After a good week of work and getting things wrong, I finally found a solution that works nicely and is extensible, at least for x86. A bit of refac

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-03-07 Thread J. Gareth Moreton via fpc-devel
o possibly very out of practice or going senile.  I'll keep trying. Gareth aka. Kit On 27/02/2022 23:37, J. Gareth Moreton via fpc-devel wrote: I will need to refactor this feature later on, especially with names, because the data structure is not a true sliding window because it only ope

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-27 Thread J. Gareth Moreton via fpc-devel
I will need to refactor this feature later on, especially with names, because the data structure is not a true sliding window because it only operates over a subset of instructions that are added to a list based on whether they fit the criteria of what I call 'seed instructions'.  In future, th

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-27 Thread J. Gareth Moreton via fpc-devel
All tests passed successfully, including an extra addition that shaved another 5kb off the compiler!  Now the hard part... writing that whitepaper for everyone, since I think this is one of those times where it will be necessary. But if anyone wants to analyse and test it out before I make a m

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-25 Thread J. Gareth Moreton via fpc-devel
25/02/2022 13:08, J. Gareth Moreton via fpc-devel wrote: Well I'm not out of the woods yet, I've got one failure on x86_64-win64 and two on i386-win32: x86_64-win64: Failed to run webtbs/tw16040.pp 2021/09/13 08:19:30 i386-win32: Failed to run test/packages/bzip2/tbzip2streamtest.pp

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-25 Thread J. Gareth Moreton via fpc-devel
On 25/02/2022 08:29, Marco Borsari via fpc-devel wrote: This is very useful, thank you. I think FPC has an excellent register allocator, but frustrated on 32 bit by scarce resources and by the lack of reloading check. Unfortunately the equivalent procedure isn't optimised on i386-win32: .Lj67

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-25 Thread J. Gareth Moreton via fpc-devel
Well I'm not out of the woods yet, I've got one failure on x86_64-win64 and two on i386-win32: x86_64-win64: Failed to run webtbs/tw16040.pp 2021/09/13 08:19:30 i386-win32: Failed to run test/packages/bzip2/tbzip2streamtest.pp 2021/09/13 08:19:28 Failed to run webtbs/tw16040.pp 2021/09/13 08:

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-24 Thread J. Gareth Moreton via fpc-devel
I did it! After a good week of work and getting things wrong, I finally found a solution that works nicely and is extensible, at least for x86.  A bit of refactoring and it can be ported to other platforms.  I'm just running the test suites to see if I can break things now.  Honestly the hard

Re: [fpc-devel] Prototype optimisation... Sliding Window

2022-02-17 Thread J. Gareth Moreton via fpc-devel
hether another register is pointing to this location and is hence volatile. Gareth aka. Kit On 17/02/2022 21:38, Jonas Maebe via fpc-devel wrote: On 17/02/2022 20:25, J. Gareth Moreton via fpc-devel wrote: P.S. The term "sliding window" comes from the LZ77 compression algorithm and

[fpc-devel] Prototype optimisation... Sliding Window

2022-02-17 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So I've started experimenting with a new technique in the peephole optimizer for x86 platforms that I've named the Sliding Window.  The intention is to use it to help replace common blocks of code within a procedure, such as pointer dereferences.  So far I'm having a degree of su

Re: [fpc-devel] Peephole optimizer passes

2022-01-26 Thread J. Gareth Moreton via fpc-devel
In the meantime, my research has revealed some peephole optimizer bugs.  Normally these bugs aren't triggered, but given one of them manifested simply by running Pass 2 twice, I feel they may have the potential to manifest themselves in contrived examples. Therefore, I have made a merge reques

[fpc-devel] Peephole optimizer passes

2022-01-25 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So I've found with the peephole optimizer, at least on x86, that if you run pass 2 more than once, it often catches even more optimisations that otherwise get missed.  At the same time I've found some bugs that get triggered when pass 2 is run again (which is why I asked about Re

Re: [fpc-devel] Any word on this ARM / AArch64 optimisation?

2022-01-22 Thread J. Gareth Moreton via fpc-devel
a fpc-devel wrote: On 21.01.2022 at 18:23, J. Gareth Moreton via fpc-devel wrote: Hi everyone, Any word on the validity of this ARM / AArch64 optimisation? It's quite good at increasing speed and shrinking code size by concatenating writes to the stack, among other things: https://

[fpc-devel] Any word on this ARM / AArch64 optimisation?

2022-01-21 Thread J. Gareth Moreton via fpc-devel
Hi everyone, Any word on the validity of this ARM / AArch64 optimisation? It's quite good at increasing speed and shrinking code size by concatenating writes to the stack, among other things: https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/104 - the new CPU feature flag I put in

Re: [fpc-devel] Issue 3.2.3 (ok in 3.3.1) Win64 "raise exception" does not go to "except"

2022-01-18 Thread J. Gareth Moreton via fpc-devel
Commit af107ca8fee33355e8c35fab6fc5ba5290bd3ebc fixes the problem in the main branch, so this one should be merged into fixes_3_2. Note that this commit also extends an optimisation in OptPass2JMP that is unrelated to the bug fix.  If this proves incompatible with the fixes branch, it can be i

Re: [fpc-devel] Issue 3.2.3 (ok in 3.3.1) Win64 "raise exception" does not go to "except"

2022-01-18 Thread J. Gareth Moreton via fpc-devel
Found the reason for it, or at least what looks like the reason... the RemoveDeadCodeAfterJump routine, which removes all instructions between a "jmp" instruction and the next live label (since those instructions will never get executed), doesn't stop if it hits the SEH section and strips all t

Re: [fpc-devel] Proposed new utility functions for x86 peephole optimiser (and maybe others)

2022-01-11 Thread J. Gareth Moreton via fpc-devel
And I got the equation wrong for the permissive one.  It's meant to be something like the following instead: "(r1.volatility + r2.volatility - permitted_volatility) = []" Gareth aka. Kit On 11/01/2022 08:59, J. Gareth Moreton via fpc-devel wrote: Hi everyone, During my impl
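
The corrected check above is Pascal set arithmetic: take the union of the two volatility sets, remove the permitted flags, and require that nothing is left. A minimal C sketch with bit flags follows; the flag and function names are hypothetical and are not taken from the merge request.

```c
#include <stdbool.h>

/* Hypothetical volatility flags, one bit each. */
enum { VOL_READ = 1u << 0, VOL_WRITE = 1u << 1 };

/* Equivalent of "(v1 + v2 - permitted) = []":
   union of both sets, minus the permitted flags, must be empty. */
bool volatility_permitted(unsigned v1, unsigned v2, unsigned permitted)
{
    return ((v1 | v2) & ~permitted) == 0;
}
```

With permitted = VOL_READ, two references that are at most read-volatile pass the check, while a pair in which either reference is write-volatile does not.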

[fpc-devel] Proposed new utility functions for x86 peephole optimiser (and maybe others)

2022-01-11 Thread J. Gareth Moreton via fpc-devel
Hi everyone, During my implementation of a new optimisation, https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/136, which merges some references, Florian asked me to make sure I check the volatility fields of the references, something which I forgot about, but which turned out was

Re: [fpc-devel] Plans for 2022

2022-01-11 Thread J. Gareth Moreton via fpc-devel
can be adapted or if we have to make a new one. Gareth aka. Kit On 10/01/2022 09:23, J. Gareth Moreton via fpc-devel wrote: I have a passion for games programming so this one really rings close for me.  The tricky thing is that early SSE and AVX instructions have to fill and use the entire re

Re: [fpc-devel] Plans for 2022

2022-01-10 Thread J. Gareth Moreton via fpc-devel
ittle difficult to manage.  One can't just add a dummy 4th component because then storage won't behave as expected. Gareth aka. Kit On 10/01/2022 03:08, Ryan Joseph via fpc-devel wrote: On Jan 9, 2022, at 2:09 PM, J. Gareth Moreton via fpc-devel wrote: https://www.patreon.com/posts

Re: [fpc-devel] Double-checking an optimisation

2022-01-09 Thread J. Gareth Moreton via fpc-devel
On 09/01/2022 15:28, Martin Frb via fpc-devel wrote: Btw, have you seen this? https://www.agner.org/optimize/optimizing_assembly.pdf Page 70, it says that under some conditions a branch may be faster than a conditional move. I'm definitely saving a local copy of that!  It could prove insightf

Re: [fpc-devel] Double-checking an optimisation

2022-01-09 Thread J. Gareth Moreton via fpc-devel
On 09/01/2022 12:35, Florian Klämpfl via fpc-devel wrote:   It removes a jump and a label, which might permit other long-range optimisations, but it's 3 instructions that are in a dependency chain. Didn't you implement something which transformed the code above in   xorl    %ebx,%ebx  

Re: [fpc-devel] Plans for 2022

2022-01-09 Thread J. Gareth Moreton via fpc-devel
f there's some kind of register tracking problem. Gareth aka. Kit On 09/01/2022 13:38, J. Gareth Moreton via fpc-devel wrote: It's probably a good idea, yes.  Normally the optimisations I implement are due to me spotting something in the RTL or compiler disassembly while optimisi

Re: [fpc-devel] Plans for 2022

2022-01-09 Thread J. Gareth Moreton via fpc-devel
tart with making these tests? Gareth aka. Kit On 09/01/2022 11:14, Florian Klämpfl via fpc-devel wrote: On 09.01.2022 at 08:09, J. Gareth Moreton via fpc-devel wrote: Some people requested a Patreon post as to my plans for 2022 with FPC, so I was happy to oblige. Plans may change a bit tho

[fpc-devel] Plans for 2022

2022-01-08 Thread J. Gareth Moreton via fpc-devel
Some people requested a Patreon post as to my plans for 2022 with FPC, so I was happy to oblige.  Plans may change a bit though depending on what happens in life and also what Florian's own vision is with the compiler, but this is the gist of it: https://www.patreon.com/posts/60922821 Gareth

Re: [fpc-devel] Double-checking an optimisation

2022-01-08 Thread J. Gareth Moreton via fpc-devel
On 09/01/2022 01:47, Martin Frb via fpc-devel wrote: I take it, it also is one (or two?) bytes longer? If that is in a loop, which otherwise is exactly within a 32 byte aligned block, then that could cause a slow down too. (If the loop is 16 bytes long, but aligned to a 32byte-bound+16, then

[fpc-devel] Double-checking an optimisation

2022-01-08 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So a merge request of mine was just approved that allows the peephole optimizer access to more registers when it needs one for temporary storage.  It allows it to make an optimisation on x86_64-win64 that wasn't possible before due to the lack of available volatile registers.  In

[fpc-devel] Is the ARM/AArch64 optimisation okay now?

2022-01-05 Thread J. Gareth Moreton via fpc-devel
Hey everyone, I uploaded a merge request for ARMv7A and AArch64 a while back. I'm just wondering if it's okay now. https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/104 To explain, because the optimisation requires a specific version of ARM to work (it always works under AArch64)

Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-04 Thread J. Gareth Moreton via fpc-devel
It's why I like going for optimisations that try to reduce code size without sacrificing speed, because of reducing the number of 16-byte or 32-byte sections.  Anyhow, back to work with optimising! Gareth aka. Kit On 04/01/2022 19:33, Martin Frb via fpc-devel wrote: On 04/01/2022 18:43, Jonas

Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-04 Thread J. Gareth Moreton via fpc-devel
I neglected to include -Cpcoreavx, that was my bad.  I'll try again. According to Intel® 64 and IA-32 Architectures Software Developer’s Manual, Vol 2B, Page 4-391.  The zero flag is set if the source is zero, and cleared otherwise.  Regarding an undefined result, I got confused with the BSF a

Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-03 Thread J. Gareth Moreton via fpc-devel
Prepare for a lot of technical rambling! This is just an analysis of the compilation of utf8lentest.lpr, not any of the System units.  Notably, POPCNT isn't called directly, but instead goes through the System unit via "call fpc_popcnt_qword" on both 3.2.x and 3.3.1.  A future study of "fpc_po

Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-03 Thread J. Gareth Moreton via fpc-devel
Interesting - thank you.  Will be interesting to study the assembler output to see what's going on. I'm honoured that I've become the go-to person when optimisation is concerned! Gareth aka. Kit On 03/01/2022 11:54, Martin Frb via fpc-devel wrote: Hi Gareth, not sure if this is of interest

Re: [fpc-devel] Fixed up !104

2021-12-27 Thread J. Gareth Moreton via fpc-devel
" to get the list of registers available to the current procedure.  While this is slightly slower, I figure it's a bit more flexible and backward-compatible. Gareth aka. Kit On 27/12/2021 19:32, J. Gareth Moreton via fpc-devel wrote: Hi Florian et al, I fixed up the CPU subtype in htt

[fpc-devel] Fixed up !104

2021-12-27 Thread J. Gareth Moreton via fpc-devel
Hi Florian et al, I fixed up the CPU subtype in https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/104 - hopefully the merge request is good now. Gareth aka. Kit

[fpc-devel] Random branchless coding video!

2021-12-20 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So I stumbled across this Australian programmer who talks about branchless programming and assembly-level optimisations, and it's interesting because he points out the pitfalls of trying to out-optimise the compiler.  Just a random thing I thought the compiler developers might be

Re: [fpc-devel] I've asked this before, but perhaps I wasn't specific enough that time: what do ...

2021-12-19 Thread J. Gareth Moreton via fpc-devel
To throw my hat into the ring, I'd be willing to help out with developing some library routines.  I did experiment once with using a truncated and factorised MacLaurin series to calculate Double-precision sin and cos simultaneously in native SSE2 (with sin and cos filling two elements of the sa
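
To illustrate what a "truncated and factorised" Maclaurin series looks like, here is a scalar C sketch. It is not the SSE2 routine mentioned above, which is not shown in the archive. The function names, the number of terms, and the absence of range reduction are all assumptions, so the result is only accurate for small |x|.

```c
/* sin x ~ x - x^3/3! + x^5/5! - x^7/7! + x^9/9!, written in nested
   (factorised) form so each step needs one multiply by x^2. */
double sin_maclaurin(double x)
{
    double x2 = x * x;
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0 *
               (1.0 - x2 / 42.0 * (1.0 - x2 / 72.0))));
}

/* cos x ~ 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8! - x^10/10!,
   in the same nested form. */
double cos_maclaurin(double x)
{
    double x2 = x * x;
    return 1.0 - x2 / 2.0 * (1.0 - x2 / 12.0 * (1.0 - x2 / 30.0 *
                 (1.0 - x2 / 56.0 * (1.0 - x2 / 90.0))));
}
```

Both functions share x * x and have the same nested shape, which is what makes it plausible to evaluate them side by side in the two lanes of one vector register.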

[fpc-devel] Made improvements for the "extra optimisation information" feature

2021-12-13 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I made some updates and cleaned things up with the "extra optimisation feature" I started to experiment with a while ago. Well, I've managed to find a better showcase for the feature (although it might need refactoring to speed the compiler up): https://gitlab.com/freepascal.org/

[fpc-devel] Optimisation challenge!

2021-12-08 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I've noticed a bit of a deep potential optimisation for faster code:     jne     .Lj1806     movb    $13,-40(%rbp)     cmpq    $0,-32(%rbp)     je      .Lj1798     ... .Lj1806:     movb    $1,-40(%rbp) .Lj1798:     cmpb    $1,-40(%rbp)     jne     .Lj1812 If you analyse the jumps a

[fpc-devel] Ordinal optimisation question

2021-12-01 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I'm playing around with some node-level optimisations because I've noticed that some Int64 operations on 32-bit platforms can be sped up if the arithmetic is changed slightly.  For example, when performing x + x with Int64s, it's faster to do x shl 1, especially if x is on the sta
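
The equivalence behind the x + x example can be checked with a small C sketch. The function names are hypothetical, and unsigned 64-bit values are used so that overflow wraps in a well-defined way; the thread itself is about Pascal's signed Int64 and about which form is cheaper on 32-bit targets, which this sketch does not measure.

```c
#include <stdint.h>

/* Doubling by addition. */
uint64_t double_by_add(uint64_t x)
{
    return x + x;
}

/* Doubling by a left shift of one bit; same result modulo 2^64. */
uint64_t double_by_shift(uint64_t x)
{
    return x << 1;
}
```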

Re: [fpc-devel] Fix for an annoying error

2021-11-30 Thread J. Gareth Moreton via fpc-devel
ctions are used in the unit because they're not declared. Gareth aka. Kit On 30/11/2021 16:22, Michael Van Canneyt via fpc-devel wrote: On Tue, 30 Nov 2021, J. Gareth Moreton via fpc-devel wrote: That was a conundrum I was trying to answer when making the patch.  What is a warn

Re: [fpc-devel] Fix for an annoying error

2021-11-30 Thread J. Gareth Moreton via fpc-devel
Of course, I'm not fussed if the patch is rejected.  It was more of compensation for something that isn't always the user's fault and tended to block "make all". Gareth aka. Kit On 30/11/2021 19:01, Bart via fpc-devel wrote: On Tue, Nov 30, 2021 at 7:53 PM Bart wrote: I think I also discus

Re: [fpc-devel] Fix for an annoying error

2021-11-30 Thread J. Gareth Moreton via fpc-devel
Yeah, that's the exact same file that I have problems with. Gareth aka. Kit On 30/11/2021 18:34, Bart via fpc-devel wrote: On Tue, Nov 30, 2021 at 8:33 AM J. Gareth Moreton via fpc-devel wrote: For a while now I've had problems building the i386-win32 compiler under my 64-bit Wind

Re: [fpc-devel] Fix for an annoying error

2021-11-30 Thread J. Gareth Moreton via fpc-devel
're not declared. Gareth aka. Kit On 30/11/2021 16:22, Michael Van Canneyt via fpc-devel wrote: On Tue, 30 Nov 2021, J. Gareth Moreton via fpc-devel wrote: That was a conundrum I was trying to answer when making the patch.  What is a warning and what is an error? A lot of the verificatio

Re: [fpc-devel] Fix for an annoying error

2021-11-30 Thread J. Gareth Moreton via fpc-devel
Windows is a warning at the very least because the project will probably break when you try to run it. Gareth aka. Kit On 30/11/2021 09:47, Tomas Hajny via fpc-devel wrote: On 2021-11-30 08:33, J. Gareth Moreton via fpc-devel wrote: Hi Gareth, For a while now I've had problems bui

[fpc-devel] Fix for an annoying error

2021-11-29 Thread J. Gareth Moreton via fpc-devel
Hi everyone, For a while now I've had problems building the i386-win32 compiler under my 64-bit Windows system because one of the packages fails to build - this is because it thinks a statically-imported DLL (done through $linklib) is invalid. Technically it is, but this system DLL (for me, it

Re: [fpc-devel] Building 3.3.1 make fails for Win 32

2021-11-26 Thread J. Gareth Moreton via fpc-devel
I've always found that Windows is a little bit temperamental with the make tool, or at the very least it is very slow.  Even my Linux virtual machine runs faster! Gareth aka. Kit On 26/11/2021 21:10, Martin Frb via fpc-devel wrote: On 26/11/2021 20:28, J. Gareth Moreton via fpc-devel

Re: [fpc-devel] Building 3.3.1 make fails for Win 32

2021-11-26 Thread J. Gareth Moreton via fpc-devel
I wonder if this is related to the current problems on x86_64. Do you know how long this failure has occurred?  It might be possible to bisect it if it's been a while. Gareth aka. Kit On 26/11/2021 11:39, Martin Frb via fpc-devel wrote: Start compiler 3.2.2 make.exe  all LINKSMART=1  CREATES

Re: [fpc-devel] The "magic div" algorithm

2021-11-16 Thread J. Gareth Moreton via fpc-devel
ide by 10 all the time, e.g. https://github.com/benibela/bigdecimalmath/blob/master/bigdecimalmath.pas#L1324-L1325 would the magic div help there much? Bye, Benito On 09.11.21 22:12, J. Gareth Moreton via fpc-devel wrote: This one for Marģers specifically, You'll be pleased

Re: [fpc-devel] Optimisation and thread safety

2021-11-13 Thread J. Gareth Moreton via fpc-devel
s the programmer's job to handle thread synchronisation, that alleviates some pressure. Thanks for the insight, Florian and Michael. Gareth aka. Kit On 13/11/2021 10:54, Florian Klämpfl via fpc-devel wrote: On 13.11.2021 at 00:55, J. Gareth Moreton via fpc-devel wrote: Hi everyone, I

[fpc-devel] Optimisation and thread safety

2021-11-12 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I have a question when it comes to optimising memory reads and writes.  What are the rules for FPC when it comes to writing to memory and then reading from it later within a single subroutine? For example, say I had this pair of commands:     movq    %rdx,-584(%rbp)     movl   

Re: [fpc-devel] The "magic div" algorithm

2021-11-09 Thread J. Gareth Moreton via fpc-devel
This one for Marģers specifically, You'll be pleased to know that your insight has been partially implemented! https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/51 This only expands 32-bit divisions to 64-bit, since the smaller sizes requires more work at the node level, but is cer
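
As background for this thread, the general idea of replacing a division by a constant with a multiplication by a precomputed reciprocal can be sketched in C for unsigned 32-bit division by 10. This is the textbook form of the technique, not the code from the merge request; the function name is hypothetical.

```c
#include <stdint.h>

/* x / 10 for any unsigned 32-bit x, without a divide instruction.
   0xCCCCCCCD is ceil(2^35 / 10); widening the product to 64 bits keeps
   every bit of the multiplication before the shift. */
uint32_t div10_magic(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

The 32-bit operand is widened to 64 bits so that the full product is available before shifting, which matches the "expands 32-bit divisions to 64-bit" description above only in spirit.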

Re: [fpc-devel] Need help!

2021-10-30 Thread J. Gareth Moreton via fpc-devel
2021 21:43, Jonas Maebe via fpc-devel wrote: On 30/10/2021 22:26, J. Gareth Moreton via fpc-devel wrote: Just to clarify, I cannot delete the branch because it's the default branch, and it won't update with the mirror because it's diverged, and I can't revert the accidental co

Re: [fpc-devel] Need help!

2021-10-30 Thread J. Gareth Moreton via fpc-devel
s diverged. Gareth aka. Kit On 30/10/2021 21:12, J. Gareth Moreton via fpc-devel wrote: Hi everyone, I need a little bit of help.  I made a mistake and accidentally pushed on my local main branch.  I'm trying to revert it so I can resynchronise it with FPC's main branch, but it

[fpc-devel] Need help!

2021-10-30 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I need a little bit of help.  I made a mistake and accidentally pushed on my local main branch.  I'm trying to revert it so I can resynchronise it with FPC's main branch, but it's not letting me. Do you know what I have to do? Gareth aka. Kit

Re: [fpc-devel] Main build fails

2021-10-30 Thread J. Gareth Moreton via fpc-devel
(Meant to send that to the Core team, sorry, but it's probably applicable here too) On 30/10/2021 18:33, J. Gareth Moreton via fpc-devel wrote: Hi everyone, It seems that as of this e-mail, the build-and-test-job script fails for the main branch.  It's failing on a package:

[fpc-devel] Main build fails

2021-10-30 Thread J. Gareth Moreton via fpc-devel
Hi everyone, It seems that as of this e-mail, the build-and-test-job script fails for the main branch.  It's failing on a package: External command "/builds/freepascal.org/fpc/source/compiler/ppcx64 -Tlinux -FUhash/units/x86_64-linux/ -Fu/builds/freepascal.org/fpc/source/rtl/units/x86_64-lin

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-19 Thread J. Gareth Moreton via fpc-devel
machine learning if I'm not careful! Gareth aka. Kit On 17/10/2021 15:24, J. Gareth Moreton via fpc-devel wrote: That's why I was discussing with Jonas in how to handle that, since currently tai objects don't have a clean way to free them themselves, and optinfo is an untype

Re: [fpc-devel] Register renaming and false dependency question

2021-10-18 Thread J. Gareth Moreton via fpc-devel
as to execute after the mov. I don't know though how that interferes with rax also being input of the imul instruction and when the shl can actually execute. You will have to profile this with something like https://github.com/travisdowns/uarch-bench or alike. On 18/10/2021 11:14 J. Gareth Mo

Re: [fpc-devel] Register renaming and false dependency question

2021-10-18 Thread J. Gareth Moreton via fpc-devel
efan Glienke via fpc-devel wrote: According to compiler explorer clang, gcc and msvc compile this to the same code with -O3 as FPC does. So I would assume that is fine. On 17.10.2021 at 13:25, J. Gareth Moreton via fpc-devel wrote: Hi everyone, While reading up on some algorithms, I came

Re: [fpc-devel] Register renaming and false dependency question

2021-10-17 Thread J. Gareth Moreton via fpc-devel
/10/2021 14:52, Florian Klämpfl via fpc-devel wrote: On 17.10.2021 at 13:25, J. Gareth Moreton via fpc-devel wrote: Hi everyone, While reading up on some algorithms, I came across a recommendation of using a shorter arithmetic function to change the value of a constant in a register ra

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-17 Thread J. Gareth Moreton via fpc-devel
or and destructor handle initialisation and cleanup. Gareth aka. Kit On 17/10/2021 15:00, Florian Klämpfl via fpc-devel wrote: Am 11.10.2021 um 10:00 schrieb J. Gareth Moreton via fpc-devel : One for Jonas mainly, but also for Florian. This is a new "extra optimisation information"

[fpc-devel] Register renaming and false dependency question

2021-10-17 Thread J. Gareth Moreton via fpc-devel
Hi everyone, While reading up on some algorithms, I came across a recommendation of using a shorter arithmetic function to change the value of a constant in a register rather than loading the new value directly.  However, the algorithm assumes a RISC-like processor, so I'm not sure if it appli

Re: [fpc-devel] Merging identical procedure proposals

2021-10-16 Thread J. Gareth Moreton via fpc-devel
On 16/10/2021 20:33, Yuriy Sydorov via fpc-devel wrote: On 16.10.2021 21:45, J. Gareth Moreton via fpc-devel wrote: I figured that virtual methods would be no-go and that this would only apply to static methods.  It seems a shame to dismiss it completely though because there's a huge numb

Re: [fpc-devel] Merging identical procedure proposals

2021-10-16 Thread J. Gareth Moreton via fpc-devel
fair number in the compiler itself (at least when compiled under x86_64-win64). I guess it would be something that would have to be showcased and thoroughly tested.  I can only try! Gareth aka. Kit On 16/10/2021 19:21, Jonas Maebe via fpc-devel wrote: On 16/10/2021 19:59, J. Gareth Moreton vi

Re: [fpc-devel] Merging identical procedure proposals

2021-10-16 Thread J. Gareth Moreton via fpc-devel
Sounds like "procvar = @myproc" would be -O4 at best due to the side-effects, otherwise I would wonder if it's possible to track such references, especially with units that are pre-compiled. Gareth aka. Kit On 16/10/2021 15:32, Jonas Maebe via fpc-devel wrote: On 13/10/2021 1

[fpc-devel] Merging identical procedure proposals

2021-10-13 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So one optimisation that has cropped up a couple of times is finding ways to merge subroutines that, while containing different source code, compile into the exact same assembly language.  For example, TStream.WriteData has implementations for numerous input types, and the compil

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-11 Thread J. Gareth Moreton via fpc-devel
One for Jonas mainly, but also for Florian.  This is a new "extra optimisation information" feature that allows the peephole optimizer to leave 'notes' and other extra information on individual tai objects for later reference.  An initial showcase is to store a link to the destination label if

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-07 Thread J. Gareth Moreton via fpc-devel
, J. Gareth Moreton via fpc-devel wrote: Would you approve something like this, Jonas?  I admit it's not properly tested yet, but I'm descending from TLinkedListItem (and the descendant class can itself be descended from) and the TAOptObj class handles allocation and cleanup of extra i

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-05 Thread J. Gareth Moreton via fpc-devel
Would you approve something like this, Jonas?  I admit it's not properly tested yet, but I'm descending from TLinkedListItem (and the descendant class can itself be descended from) and the TAOptObj class handles allocation and cleanup of extra information.  I'm not sure how practical it is, but

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-05 Thread J. Gareth Moreton via fpc-devel
h aka. Kit On 05/10/2021 19:54, Jonas Maebe via fpc-devel wrote: On 03/10/2021 23:32, J. Gareth Moreton via fpc-devel wrote: One drawback I've noticed is that there's no clean way to free the optinfo pointer when the tai object is destroyed, The best way to handle this is by allocatin

Re: [fpc-devel] Anyone an idea were/how to look for the missing merge in 3.2.0 [[peephole / fixed]]

2021-10-04 Thread J. Gareth Moreton via fpc-devel
Ah, fair enough.  I don't recall touching much of the CMOV source code, if any, but I'm glad it got fixed. Gareth aka. Kit On 04/10/2021 19:36, Yuriy Sydorov via fpc-devel wrote: On 04.10.2021 20:24, J. Gareth Moreton via fpc-devel wrote: I have a suspicion as to what it might be

Re: [fpc-devel] Anyone an idea were/how to look for the missing merge in 3.2.0 [[peephole / fixed]]

2021-10-04 Thread J. Gareth Moreton via fpc-devel
I have a suspicion as to what it might be.  Can you produce the faulty assembly language with DEBUG_AOPTCPU so it shows the comments?  Does it say "Mov2Nop 3" where the missing instruction lies? Gareth aka. Kit P.S. Good job in spotting the fault. On 04/10/2021 13:16, Yuriy Sydorov via fpc-de

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-03 Thread J. Gareth Moreton via fpc-devel
often used to hold an object pointer, especially in the 32-bit days).  I would like to propose having a Boolean field named "OwnsOptInfo" that, if True, calls Dispose on optinfo's value in tai's destructor so we don't get memory leaks. Gareth aka. Kit On 03/10/2021 13:1

Re: [fpc-devel] Peephole optimizer tai class change proposals

2021-10-03 Thread J. Gareth Moreton via fpc-devel
That's useful to know - thanks Jonas. On 03/10/2021 13:10, Jonas Maebe via fpc-devel wrote: On 03/10/2021 14:04, J. Gareth Moreton via fpc-devel wrote: I'm aware that the tai class declares an "optinfo" field, although I'm uncertain if this is safe to use or not

[fpc-devel] Peephole optimizer tai class change proposals

2021-10-03 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So as my optimisations get more and more sophisticated and intelligent, I'm realising that I may need ways to store more information than is currently possible.  Obviously I want to avoid enlarging the internal state too much or making the code unwieldy, but the additions I have

Re: [fpc-devel] New deep optimisation

2021-10-01 Thread J. Gareth Moreton via fpc-devel
/cmp and jcc instructions are macrofused but only if they are directly adjacent. Am 01.10.2021 um 18:10 schrieb J. Gareth Moreton via fpc-devel: Hi everyone, I've started playing around with an optimisation on x86 platforms that looks for common instructions that appear on both branches

[fpc-devel] New deep optimisation

2021-10-01 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I've started playing around with an optimisation on x86 platforms that looks for common instructions that appear on both branches of a Jcc instruction (i.e. after the label it jumps to and after the jump itself), and so far I'm having a lot of success.  For example, in the Math u

Re: [fpc-devel] Difficulty with a failed build

2021-09-23 Thread J. Gareth Moreton via fpc-devel
Problem seems to be resolved for now after rebasing.  I guess it was a transient problem on the main branch. Gareth aka. Kit On 22/09/2021 21:07, J. Gareth Moreton via fpc-devel wrote: Hi everyone, So I've made a merge request for Marģers' division improvement over here: https://

[fpc-devel] Difficulty with a failed build

2021-09-22 Thread J. Gareth Moreton via fpc-devel
Hi everyone, So I've made a merge request for Marģers' division improvement over here: https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/51 I am getting a build failure that indicates "Binary files ppc3 and ppcx64 differ". This usually implies a fault in the compiler code.  H
