It's funny - for some reason I was expecting a lot of opposition!
I knew about FillChar being written in assembly language and know from
experience that FPC will never support the inlining of pure assembler
routines (as Florian said, it will just open a huge can of worms). My
thought was how
ing a MOV/CMP/CMOV
triplet. In other words, it tries to convert a CMOV to a MOV if it can.
On 16/04/2022 20:03, Florian Klämpfl via fpc-devel wrote:
On 16.04.2022 at 12:31, Thorsten Otto via fpc-devel wrote:
On Saturday, 16 April 2022 06:49:07 CEST J. Gareth Moreton via
fpc-devel
Hi everyone,
In the x86_64 assembly dumps, I frequently come across combinations such
as the following:
cmpl %ebx,%edx
cmovll %ebx,%eax
cmovnll %edx,%eax
This is essentially the ternary C operator "x = cond ? trueval :
falseval", or in Pascal "if (cond) then x := trueva
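For anyone reading along, here is a minimal sketch of the kind of source that has this shape. PickLarger is a hypothetical name of mine, not something from the compiler or RTL, and whether FPC actually emits a CMP/CMOV sequence for it depends on the target CPU and the optimisation settings. As I read the quoted AT&T triplet, it selects %ebx when %edx is less than %ebx and %edx otherwise, which is what the function mirrors.

```pascal
program SelectDemo;
{$mode objfpc}

{ Hypothetical helper: same shape as "x = cond ? trueval : falseval". }
function PickLarger(A, B: LongInt): LongInt;
begin
  if A < B then
    Result := B   { taken when the comparison holds }
  else
    Result := A;  { taken when it does not }
end;

begin
  WriteLn(PickLarger(3, 5));  { prints 5 }
  WriteLn(PickLarger(7, 2));  { prints 7 }
end.
```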
Hi everyone,
This is something that sprung to mind when thinking about code speed and
the like, and one thing that cropped up is the initialisation of large
variables such as arrays or records. A common means of doing this is, say:
FillChar(MyVar, SizeOf(MyVar), 0);
To keep things as genera
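To make the FillChar idiom above concrete, here is a small self-checking sketch. TSample is a made-up record for illustration only. It holds no managed fields such as AnsiString, which matters because blindly zeroing a record with managed fields is not safe.

```pascal
program ZeroInitDemo;
{$mode objfpc}

type
  { Hypothetical record, used only for illustration. }
  TSample = record
    Id: LongInt;
    Values: array[0..15] of Double;
  end;

var
  MyVar: TSample;
  I: Integer;
begin
  MyVar.Id := 42;
  { Overwrite every byte of MyVar with zero. }
  FillChar(MyVar, SizeOf(MyVar), 0);

  { Check that all fields now read back as zero. }
  if MyVar.Id <> 0 then
    Halt(1);
  for I := Low(MyVar.Values) to High(MyVar.Values) do
    if MyVar.Values[I] <> 0.0 then
      Halt(1);
  WriteLn('MyVar zeroed, ', SizeOf(MyVar), ' bytes');
end.
```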
ogress, I stripped everything out except what was advertised...
making __m128 etc. aligned.
My current question though is regarding testing. Writing tests for
these aligned arrays and records is simple enough, but I'm not sure what
subdirectory/class they fall under... tbs or test/cg etc.
Hi everyone,
To complement aligned records, I'm trying out an implementation of
aligned arrays. Like how you might declare an aligned record as follows:
type AlignedVector = packed record
X, Y: Double;
end align 16;
At present I've gone for the following for arrays:
type AlignedVector
Since this feature is still a work in progress and bugs are inevitable,
my merge request over here now only does what it was originally designed
to do... make __m128 and the like aligned, although admittedly code that
ensures they are treated as vector types is tied into the same commit:
https
course I don't
want to force the call to make
On 08/04/2022 20:58, Jonas Maebe via fpc-devel wrote:
On 08/04/2022 20:31, J. Gareth Moreton via fpc-devel wrote:
That might explain a few things. The problem is that under
vectorcall and the System V ABI (the default x86_64 calling
convention
On 08/04/2022 19:19, Jonas Maebe via fpc-devel wrote:
On 08/04/2022 19:57, J. Gareth Moreton via fpc-devel wrote:
It looks like support for writing to arrays that are wholly stored in
registers is a little limited and buggy
Modifying individual elements of arrays stored in registers has never
.1] of Double align 16;", which is
essentially __m128d).
Gareth aka. Kit
On 08/04/2022 18:57, J. Gareth Moreton via fpc-devel wrote:
It looks like support for writing to arrays that are wholly stored in
registers is a little limited and buggy - while it writes to temporary
memory when modi
It looks like support for writing to arrays that are wholly stored in
registers is a little limited and buggy - while it writes to temporary
memory when modifying an individual element, the compiler sometimes
doesn't write back the final result into the original register. I'm
seeing if I can f
On 06/04/2022 22:58, J. Gareth Moreton via fpc-devel wrote:
On 06/04/2022 21:16, Jonas Maebe via fpc-devel wrote:
On 06/04/2022 19:20, J. Gareth Moreton via fpc-devel wrote:
I recently made a merge request that initially just fixed the
incorrect memory alignment for __m128 and similar types, but
I used it because it was easy to hot-swap the constructor in the
definition of __m128 and the like, and it was a quick and convenient way
to ensure the alignment was correct.
Gareth aka. Kit
On 06/04/2022 21:16, Jonas Maebe via fpc-devel wrote:
On 06/04/2022 19:20, J. Gareth Moreton via fpc
Pascal simply is a strongly typed language. Vector intrinsics are no
reason to weaken this. Thus you need to declare operator overloads that
hide the nitty, gritty details of assigning a TVector4 to a __m128, e.g.:
type
TVector4 = packed record
X, Y, Z, W: Single;
class operator := (c
Another problem... I've tried to declare an ADDPD intrinsic as follows:
function x86_addpd(r0, r1: __m128d): __m128d; [INTERNPROC:
fpc_in_x86_addpd];
I thought using __m128d instead of __m128 was fairly logical since ADDPD
works with Doubles, not Singles, but this can cause problems. For
ex
Hi everyone,
I recently made a merge request that initially just fixed the incorrect
memory alignment for __m128 and similar types, but doing so revealed a
whole plethora of other bugs. First, when I fixed it, __m128 etc were
no longer recognised as a valid SIMD or aggregate type due to the wr
Hi everyone,
I've noticed that, in many units, the generated node trees produce a lot
of typeconv nodes with convtype=tc_equal. For example (4th line - this
is generated with DEBUG_NODE_XML):
...
..
.convtype="tc_equal">
...result
.
Ah, sorry for the confusion, but it occurs even if -a is omitted. I got
mixed up with something else. So it's probably a simple range check
error in the source and not an assembler issue.
Gareth aka. Kit
--
This email has been checked for viruses by Avast antivirus software.
https://www.ava
al: Compilation aborted
The installer encountered the following error:
Compilation of "BuildUnit_chm.pp" failed
As mentioned, it doesn't show up if -a is omitted.
Gareth aka. Kit
On 03/04/2022 15:42, Florian Klämpfl via fpc-devel wrote:
At 15:44 on 03.04.2022, J. Gareth Moreton
Hi everyone,
It seems at some point, something was introduced to the compiler that
causes a package to fail to build (specifically
packages\chm\src\chmwriter.pas) with an assembler-level range check
error. If you run "make all" under i386-win32 with "-CriotR -a"
options, the error is trigger
It's done!
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/191
Whitepaper can be found as an attachment in that merge request. Go ham
on trying to break it!
It will likely need refactoring later, especially as I want to port it
to AArch64 at some point to see how it affects cod
id it make much of a difference in the compilation time?
On Fri, 25 Feb 2022 at 02:08, J. Gareth Moreton via fpc-devel
wrote:
I did it!
After a good week of work and getting things wrong, I finally found a
solution that works nicely and is extensible, at least for x86. A bit
of refac
o possibly very
out of practice or going senile. I'll keep trying.
Gareth aka. Kit
On 27/02/2022 23:37, J. Gareth Moreton via fpc-devel wrote:
I will need to refactor this feature later on, especially with names,
because the data structure is not a true sliding window because it
only ope
I will need to refactor this feature later on, especially with names,
because the data structure is not a true sliding window because it only
operates over a subset of instructions that are added to a list based on
whether they fit the criteria of what I call 'seed instructions'. In
future, th
All tests passed successfully, including an extra addition that shaved
another 5kb off the compiler! Now the hard part... writing that
whitepaper for everyone, since I think this is one of those times where
it will be necessary.
But if anyone wants to analyse and test it out before I make a m
25/02/2022 13:08, J. Gareth Moreton via fpc-devel wrote:
Well I'm not out of the woods yet, I've got one failure on
x86_64-win64 and two on i386-win32:
x86_64-win64:
Failed to run webtbs/tw16040.pp 2021/09/13 08:19:30
i386-win32:
Failed to run test/packages/bzip2/tbzip2streamtest.pp
On 25/02/2022 08:29, Marco Borsari via fpc-devel wrote:
This is very useful, thank you.
I think FPC has an excellent register allocator, but on 32-bit targets it
is hampered by scarce resources and by the lack of a reloading check.
Unfortunately the equivalent procedure isn't optimised on i386-win32:
.Lj67
Well I'm not out of the woods yet, I've got one failure on x86_64-win64
and two on i386-win32:
x86_64-win64:
Failed to run webtbs/tw16040.pp 2021/09/13 08:19:30
i386-win32:
Failed to run test/packages/bzip2/tbzip2streamtest.pp 2021/09/13 08:19:28
Failed to run webtbs/tw16040.pp 2021/09/13 08:
I did it!
After a good week of work and getting things wrong, I finally found a
solution that works nicely and is extensible, at least for x86. A bit
of refactoring and it can be ported to other platforms. I'm just
running the test suites to see if I can break things now. Honestly the
hard
hether another
register is pointing to this location and is hence volatile.
Gareth aka. Kit
On 17/02/2022 21:38, Jonas Maebe via fpc-devel wrote:
On 17/02/2022 20:25, J. Gareth Moreton via fpc-devel wrote:
P.S. The term "sliding window" comes from the LZ77 compression
algorithm and
Hi everyone,
So I've started experimenting with a new technique in the peephole
optimizer for x86 platforms that I've named the Sliding Window. The
intention is to use it to help replace common blocks of code within a
procedure, such as pointer dereferences. So far I'm having a degree of
su
In the meantime, my research has revealed some peephole optimizer bugs.
Normally these bugs aren't triggered, but given one of them manifested
simply by running Pass 2 twice, I feel they may have the potential to
manifest themselves in contrived examples. Therefore, I have made a
merge reques
Hi everyone,
So I've found with the peephole optimizer, at least on x86, that if you
run pass 2 more than once, it often catches even more optimisations that
otherwise get missed. At the same time I've found some bugs that get
triggered when pass 2 is run again (which is why I asked about
Re
a fpc-devel wrote:
On 21.01.2022 at 18:23, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
Any word on the validity of this ARM / AArch64 optimisation? It's quite good at
increasing speed and shrinking code size by concatenating writes to the stack,
among other things:
https://
Hi everyone,
Any word on the validity of this ARM / AArch64 optimisation? It's quite
good at increasing speed and shrinking code size by concatenating writes
to the stack, among other things:
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/104 - the
new CPU feature flag I put in
Commit af107ca8fee33355e8c35fab6fc5ba5290bd3ebc fixes the problem in the
main branch, so this one should be merged into fixes_3_2.
Note that this commit also extends an optimisation in OptPass2JMP that
is unrelated to the bug fix. If this proves incompatible with the fixes
branch, it can be i
Found the reason for it, or at least what looks like the reason... the
RemoveDeadCodeAfterJump routine, which removes all instructions between
a "jmp" instruction and the next live label (since those instructions
will never get executed), doesn't stop if it hits the SEH section and
strips all t
And I got the equation wrong for the permissive one. It's meant to be
something like the following instead:
"(r1.volatility + r2.volatility - permitted_volatility) = []"
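The corrected expression above is ordinary Pascal set arithmetic: take the union of both volatility sets, remove whatever is permitted, and require that nothing is left. A minimal sketch follows. The type, the enumeration values and MayMerge are hypothetical stand-ins of mine and are not the compiler's real declarations.

```pascal
program VolatilityDemo;
{$mode objfpc}

type
  { Hypothetical stand-ins for the compiler's volatility flags. }
  TVolatility = (volRead, volWrite);
  TVolatilitySet = set of TVolatility;

{ True when no volatility flag outside Permitted is set on either side. }
function MayMerge(const V1, V2, Permitted: TVolatilitySet): Boolean;
begin
  { "+" is set union, "-" is set difference, evaluated left to right. }
  Result := (V1 + V2 - Permitted) = [];
end;

begin
  WriteLn(MayMerge([], [], []));                       { TRUE }
  WriteLn(MayMerge([volRead], [], [volRead]));         { TRUE }
  WriteLn(MayMerge([volRead], [volWrite], [volRead])); { FALSE }
end.
```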
Gareth aka. Kit
On 11/01/2022 08:59, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
During my impl
Hi everyone,
During my implementation of a new optimisation,
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/136, which
merges some references, Florian asked me to make sure I check the
volatility fields of the references, something which I forgot about, but
which turned out was
can be
adapted or if we have to make a new one.
Gareth aka. Kit
On 10/01/2022 09:23, J. Gareth Moreton via fpc-devel wrote:
I have a passion for games programming so this one really rings close
for me. The tricky thing is that early SSE and AVX instructions have
to fill and use the entire re
ittle difficult to manage. One can't just add a dummy 4th component
because then storage won't behave as expected.
Gareth aka. Kit
On 10/01/2022 03:08, Ryan Joseph via fpc-devel wrote:
On Jan 9, 2022, at 2:09 PM, J. Gareth Moreton via fpc-devel
wrote:
https://www.patreon.com/posts
On 09/01/2022 15:28, Martin Frb via fpc-devel wrote:
Btw, have you seen this?
https://www.agner.org/optimize/optimizing_assembly.pdf
Page 70, it says that under some conditions a branch may be faster
than a conditional move.
I'm definitely saving a local copy of that! It could prove insightf
On 09/01/2022 12:35, Florian Klämpfl via fpc-devel wrote:
It removes a jump and a label, which might permit other long-range
optimisations, but it's 3 instructions that are in a dependency chain.
Didn't you implement something which transformed the code above in
xorl %ebx,%ebx
f
there's some kind of register tracking problem.
Gareth aka. Kit
On 09/01/2022 13:38, J. Gareth Moreton via fpc-devel wrote:
It's probably a good idea, yes. Normally the optimisations I
implement are due to me spotting something in the RTL or compiler
disassembly while optimisi
tart with making these tests?
Gareth aka. Kit
On 09/01/2022 11:14, Florian Klämpfl via fpc-devel wrote:
On 09.01.2022 at 08:09, J. Gareth Moreton via fpc-devel wrote:
Some people requested a Patreon post as to my plans for 2022 with FPC, so I was
happy to oblige. Plans may change a bit tho
Some people requested a Patreon post as to my plans for 2022 with FPC,
so I was happy to oblige. Plans may change a bit though depending on
what happens in life and also what Florian's own vision is with the
compiler, but this is the gist of it:
https://www.patreon.com/posts/60922821
Gareth
On 09/01/2022 01:47, Martin Frb via fpc-devel wrote:
I take it, it also is one (or two?) bytes longer? If that is in a
loop, which otherwise is exactly within a 32 byte aligned block, then
that could cause a slow down too. (If the loop is 16 bytes long, but
aligned to a 32byte-bound+16, then
Hi everyone,
So a merge request of mine was just approved that allows the peephole
optimizer access to more registers when it needs one for temporary
storage. It allows it to make an optimisation on x86_64-win64 that
wasn't possible before due to the lack of available volatile registers.
In
Hey everyone,
I uploaded a merge request for ARMv7A and AArch64 a while back. I'm just
wondering if it's okay now.
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/104
To explain, because the optimisation requires a specific version of ARM
to work (it always works under AArch64)
It's why I like going for optimisations that try to reduce code size
without sacrificing speed, because doing so reduces the number of 16-byte
or 32-byte sections. Anyhow, back to work with optimising!
Gareth aka. Kit
On 04/01/2022 19:33, Martin Frb via fpc-devel wrote:
On 04/01/2022 18:43, Jonas
I neglected to include -Cpcoreavx, that was my bad. I'll try again.
According to the Intel® 64 and IA-32 Architectures Software Developer’s
Manual, Vol 2B, page 4-391, the zero flag is set if the source is zero,
and cleared otherwise. Regarding an undefined result, I got confused
with the BSF a
Prepare for a lot of technical rambling!
This is just an analysis of the compilation of utf8lentest.lpr, not any
of the System units. Notably, POPCNT isn't called directly, but instead
goes through the System unit via "call fpc_popcnt_qword" on both 3.2.x
and 3.3.1. A future study of "fpc_po
Interesting - thank you. Will be interesting to study the assembler
output to see what's going on.
I'm honoured that I've become the go-to person when optimisation is
concerned!
Gareth aka. Kit
On 03/01/2022 11:54, Martin Frb via fpc-devel wrote:
Hi Gareth,
not sure if this is of interest
" to get the list of registers
available to the current procedure. While this is slightly slower, I
figure it's a bit more flexible and backward-compatible.
Gareth aka. Kit
On 27/12/2021 19:32, J. Gareth Moreton via fpc-devel wrote:
Hi Florian et al,
I fixed up the CPU subtype in
htt
Hi Florian et al,
I fixed up the CPU subtype in
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/104 -
hopefully the merge request is good now.
Gareth aka. Kit
Hi everyone,
So I stumbled across this Australian programmer who talks about
branchless programming and assembly-level optimisations, and it's
interesting because he points out the pitfalls of trying to out-optimise
the compiler. Just a random thing I thought the compiler developers
might be
To throw my hat into the ring, I'd be willing to help out with
developing some library routines. I did experiment once with using a
truncated and factorised Maclaurin series to calculate Double-precision
sin and cos simultaneously in native SSE2 (with sin and cos filling two
elements of the sa
Hi everyone,
I made some updates and cleaned things up with the "extra optimisation
feature" I started to experiment with a while ago. Well, I've managed to
find a better showcase for the feature (although it might need
refactoring to speed the compiler up):
https://gitlab.com/freepascal.org/
Hi everyone,
I've noticed a bit of a deep potential optimisation for faster code:
jne .Lj1806
movb $13,-40(%rbp)
cmpq $0,-32(%rbp)
je .Lj1798
...
.Lj1806:
movb $1,-40(%rbp)
.Lj1798:
cmpb $1,-40(%rbp)
jne .Lj1812
If you analyse the jumps a
Hi everyone,
I'm playing around with some node-level optimisations because I've
noticed that some Int64 operations on 32-bit platforms can be sped up if
the arithmetic is changed slightly. For example, when performing x + x
with Int64s, it's faster to do x shl 1, especially if x is on the sta
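As a quick sanity check of the identity being relied on here, the sketch below confirms that x + x and x shl 1 agree for a few Int64 values. The sample values are my own and are deliberately non-negative and small enough not to overflow. It says nothing about which form is faster on a given target, since that depends on the code the compiler generates.

```pascal
program DoubleViaShift;
{$mode objfpc}

const
  { Hypothetical sample values; the largest is 2^62 - 1, so doubling fits. }
  Samples: array[0..3] of Int64 =
    (0, 1, 123456789012, 4611686018427387903);

var
  I: Integer;
  X: Int64;
begin
  for I := Low(Samples) to High(Samples) do
  begin
    X := Samples[I];
    { Doubling by addition and by shifting must give the same value. }
    if (X + X) <> (X shl 1) then
      Halt(1);
  end;
  WriteLn('x + x matches x shl 1 for all samples');
end.
```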
ctions are used in the unit because they're not declared.
Gareth aka. Kit
On 30/11/2021 16:22, Michael Van Canneyt via fpc-devel wrote:
On Tue, 30 Nov 2021, J. Gareth Moreton via fpc-devel wrote:
That was a conundrum I was trying to answer when making the patch.
What is a warn
Of course, I'm not fussed if the patch is rejected. It was more of
compensation for something that isn't always the user's fault and tended
to block "make all".
Gareth aka. Kit
On 30/11/2021 19:01, Bart via fpc-devel wrote:
On Tue, Nov 30, 2021 at 7:53 PM Bart wrote:
I think I also discus
Yeah, that's the exact same file that I have problems with.
Gareth aka. Kit
On 30/11/2021 18:34, Bart via fpc-devel wrote:
On Tue, Nov 30, 2021 at 8:33 AM J. Gareth Moreton via fpc-devel
wrote:
For a while now I've had problems building the i386-win32 compiler under
my 64-bit Wind
're not declared.
Gareth aka. Kit
On 30/11/2021 16:22, Michael Van Canneyt via fpc-devel wrote:
On Tue, 30 Nov 2021, J. Gareth Moreton via fpc-devel wrote:
That was a conundrum I was trying to answer when making the patch.
What is a warning and what is an error?
A lot of the verificatio
Windows is a warning at
the very least because the project will probably break when you try to
run it.
Gareth aka. Kit
On 30/11/2021 09:47, Tomas Hajny via fpc-devel wrote:
On 2021-11-30 08:33, J. Gareth Moreton via fpc-devel wrote:
Hi Gareth,
For a while now I've had problems bui
Hi everyone,
For a while now I've had problems building the i386-win32 compiler under
my 64-bit Windows system because one of the packages fails to build -
this is because it thinks a statically-imported DLL (done through
$linklib) is invalid. Technically it is, but this system DLL (for me, it
I've always found that Windows is a little bit temperamental with the
make tool, or at the very least it is very slow. Even my Linux virtual
machine runs faster!
Gareth aka. Kit
On 26/11/2021 21:10, Martin Frb via fpc-devel wrote:
On 26/11/2021 20:28, J. Gareth Moreton via fpc-devel
I wonder if this is related to the current problems on x86_64. Do you
know how long this failure has occurred? It might be possible to bisect
it if it's been a while.
Gareth aka. Kit
On 26/11/2021 11:39, Martin Frb via fpc-devel wrote:
Start compiler 3.2.2
make.exe all LINKSMART=1 CREATES
ide by 10 all the time,
e.g.
https://github.com/benibela/bigdecimalmath/blob/master/bigdecimalmath.pas#L1324-L1325
would the magic div help there much?
Bye,
Benito
On 09.11.21 22:12, J. Gareth Moreton via fpc-devel wrote:
This one is for Marģers specifically,
You'll be pleased
s the programmer's job to handle thread synchronisation, that
alleviates some pressure.
Thanks for the insight, Florian and Michael.
Gareth aka. Kit
On 13/11/2021 10:54, Florian Klämpfl via fpc-devel wrote:
On 13.11.2021 at 00:55, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
I
Hi everyone,
I have a question when it comes to optimising memory reads and writes.
What are the rules for FPC when it comes to writing to memory and then
reading from it later within a single subroutine? For example, say I had
this pair of commands:
movq %rdx,-584(%rbp)
movl
This one is for Marģers specifically,
You'll be pleased to know that your insight has been partially implemented!
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/51
This only expands 32-bit divisions to 64-bit, since the smaller sizes
require more work at the node level, but is cer
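For readers unfamiliar with the general idea of widening a 32-bit division, here is a sketch of the well-known reciprocal-multiplication trick for dividing by 10. It is not taken from the merge request and does not claim to show what the compiler emits. DivBy10 is a hypothetical name, and the constant $CCCCCCCD is the ceiling of 2^35 / 10.

```pascal
program DivBy10Demo;
{$mode objfpc}

const
  { Hypothetical sample inputs, including both ends of the 32-bit range. }
  Samples: array[0..5] of LongWord =
    (0, 9, 10, 99, 1000000007, $FFFFFFFF);

{ Divides an unsigned 32-bit value by 10 using a 64-bit multiply and shift. }
function DivBy10(X: LongWord): LongWord;
begin
  Result := LongWord((QWord(X) * QWord($CCCCCCCD)) shr 35);
end;

var
  I: Integer;
begin
  { Compare against the ordinary div operator for every sample. }
  for I := Low(Samples) to High(Samples) do
    if DivBy10(Samples[I]) <> Samples[I] div 10 then
      Halt(1);
  WriteLn('all samples match');
end.
```

The widening matters because the intermediate product needs more than 32 bits, so the multiply has to be done at 64-bit width before shifting back down.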
2021 21:43, Jonas Maebe via fpc-devel wrote:
On 30/10/2021 22:26, J. Gareth Moreton via fpc-devel wrote:
Just to clarify, I cannot delete the branch because it's the default
branch, and it won't update with the mirror because it's diverged,
and I can't revert the accidental co
s diverged.
Gareth aka. Kit
On 30/10/2021 21:12, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
I need a little bit of help. I made a mistake and accidentally pushed
on my local main branch. I'm trying to revert it so I can
resynchronise it with FPC's main branch, but it
Hi everyone,
I need a little bit of help. I made a mistake and accidentally pushed
on my local main branch. I'm trying to revert it so I can resynchronise
it with FPC's main branch, but it's not letting me. Do you know what I
have to do?
Gareth aka. Kit
(Meant to send that to the Core team, sorry, but it's probably
applicable here too)
On 30/10/2021 18:33, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
It seems that as of this e-mail, the build-and-test-job script fails
for the main branch. It's failing on a package:
Hi everyone,
It seems that as of this e-mail, the build-and-test-job script fails for
the main branch. It's failing on a package:
External command "/builds/freepascal.org/fpc/source/compiler/ppcx64
-Tlinux -FUhash/units/x86_64-linux/
-Fu/builds/freepascal.org/fpc/source/rtl/units/x86_64-lin
machine learning if I'm not careful!
Gareth aka. Kit
On 17/10/2021 15:24, J. Gareth Moreton via fpc-devel wrote:
That's why I was discussing with Jonas in how to handle that, since
currently tai objects don't have a clean way to free them themselves,
and optinfo is an untype
as to execute after the mov. I don't know though
how that interferes with rax also being input of the imul instruction and when
the shl can actually execute. You will have to profile this with something like
https://github.com/travisdowns/uarch-bench or alike.
On 18/10/2021 11:14 J. Gareth Mo
efan Glienke via fpc-devel wrote:
According to compiler explorer clang, gcc and msvc compile this to the
same code with -O3 as FPC does. So I would assume that is fine.
On 17.10.2021 at 13:25, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
While reading up on some algorithms, I came
/10/2021 14:52, Florian Klämpfl via fpc-devel wrote:
On 17.10.2021 at 13:25, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
While reading up on some algorithms, I came across a recommendation of using a
shorter arithmetic function to change the value of a constant in a register
ra
or and
destructor handle initialisation and cleanup.
Gareth aka. Kit
On 17/10/2021 15:00, Florian Klämpfl via fpc-devel wrote:
On 11.10.2021 at 10:00, J. Gareth Moreton via fpc-devel wrote:
One for Jonas mainly, but also for Florian. This is a new "extra optimisation
information"
Hi everyone,
While reading up on some algorithms, I came across a recommendation of
using a shorter arithmetic function to change the value of a constant in
a register rather than loading the new value directly. However, the
algorithm assumes a RISC-like processor, so I'm not sure if it appli
On 16/10/2021 20:33, Yuriy Sydorov via fpc-devel wrote:
On 16.10.2021 21:45, J. Gareth Moreton via fpc-devel wrote:
I figured that virtual methods would be no-go and that this would
only apply to static methods. It seems a shame to dismiss it
completely though because there's a huge numb
fair number in the compiler itself (at least when compiled
under x86_64-win64).
I guess it would be something that would have to be showcased and
thoroughly tested. I can only try!
Gareth aka. Kit
On 16/10/2021 19:21, Jonas Maebe via fpc-devel wrote:
On 16/10/2021 19:59, J. Gareth Moreton vi
Sounds like "procvar = @myproc" would be -O4 at best due to the
side-effects, otherwise I would wonder if it's possible to track such
references, especially with units that are pre-compiled.
Gareth aka. Kit
On 16/10/2021 15:32, Jonas Maebe via fpc-devel wrote:
On 13/10/2021 1
Hi everyone,
So one optimisation that has cropped up a couple of times is finding
ways to merge subroutines that, while containing different source code,
compile into the exact same assembly language. For example,
TStream.WriteData has implementations for numerous input types, and the
compil
One for Jonas mainly, but also for Florian. This is a new "extra
optimisation information" feature that allows the peephole optimizer to
leave 'notes' and other extra information on individual tai objects for
later reference. An initial showcase is to store a link to the
destination label if
, J. Gareth Moreton via fpc-devel wrote:
Would you approve something like this, Jonas? I admit it's not
properly tested yet, but I'm descending from TLinkedListItem (and the
descendant class can itself be descended from) and the TAOptObj class
handles allocation and cleanup of extra i
Would you approve something like this, Jonas? I admit it's not properly
tested yet, but I'm descending from TLinkedListItem (and the descendant
class can itself be descended from) and the TAOptObj class handles
allocation and cleanup of extra information. I'm not sure how practical
it is, but
h aka. Kit
On 05/10/2021 19:54, Jonas Maebe via fpc-devel wrote:
On 03/10/2021 23:32, J. Gareth Moreton via fpc-devel wrote:
One drawback I've noticed is that there's no clean way to free the
optinfo pointer when the tai object is destroyed,
The best way to handle this is by allocatin
Ah, fair enough. I don't recall touching much of the CMOV source code,
if any, but I'm glad it got fixed.
Gareth aka. Kit
On 04/10/2021 19:36, Yuriy Sydorov via fpc-devel wrote:
On 04.10.2021 20:24, J. Gareth Moreton via fpc-devel wrote:
I have a suspicion as to what it might be
I have a suspicion as to what it might be. Can you produce the faulty
assembly language with DEBUG_AOPTCPU so it shows the comments? Does it
say "Mov2Nop 3" where the missing instruction lies?
Gareth aka. Kit
P.S. Good job in spotting the fault.
On 04/10/2021 13:16, Yuriy Sydorov via fpc-de
often used to hold an object pointer, especially in the 32-bit days). I
would like to propose having a Boolean field named "OwnsOptInfo" that,
if True, calls Dispose on optinfo's value in tai's destructor so we
don't get memory leaks.
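A rough sketch of what the proposed OwnsOptInfo flag could look like. TItem is a hypothetical stand-in for tai, and I have given the pointer a concrete type (PNote) purely so that Dispose knows what it is freeing. The real optinfo field is untyped, so an actual implementation would need some other way to know how to release it.

```pascal
program OwnsOptInfoDemo;
{$mode objfpc}

type
  { Hypothetical payload type, for illustration only. }
  PNote = ^TNote;
  TNote = record
    Tag: LongInt;
  end;

  { Hypothetical stand-in for tai. }
  TItem = class
  public
    OptInfo: PNote;
    OwnsOptInfo: Boolean;
    destructor Destroy; override;
  end;

destructor TItem.Destroy;
begin
  { Only free the extra information if this item owns it. }
  if OwnsOptInfo and (OptInfo <> nil) then
    Dispose(OptInfo);
  inherited Destroy;
end;

var
  Item: TItem;
begin
  Item := TItem.Create;
  New(Item.OptInfo);
  Item.OptInfo^.Tag := 1;
  Item.OwnsOptInfo := True;
  Item.Free;  { Destroy releases OptInfo as well }
  WriteLn('done');
end.
```

When OwnsOptInfo is left False, the destructor leaves the pointer alone, so existing code that stores something it manages elsewhere would keep working.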
Gareth aka. Kit
On 03/10/2021 13:1
That's useful to know - thanks Jonas.
On 03/10/2021 13:10, Jonas Maebe via fpc-devel wrote:
On 03/10/2021 14:04, J. Gareth Moreton via fpc-devel wrote:
I'm aware that the tai class declares an "optinfo" field, although I'm
uncertain if this is safe to use or not
Hi everyone,
So as my optimisations get more and more sophisticated and intelligent,
I'm realising that I may need ways to store more information than is
currently possible. Obviously I want to avoid enlarging the internal
state too much or making the code unwieldy, but the additions I have
/cmp and jcc instructions are macrofused
but only if they are directly adjacent.
On 01.10.2021 at 18:10, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
I've started playing around with an optimisation on x86 platforms
that looks for common instructions that appear on both branches
Hi everyone,
I've started playing around with an optimisation on x86 platforms that
looks for common instructions that appear on both branches of a Jcc
instruction (i.e. after the label it jumps to and after the jump
itself), and so far I'm having a lot of success. For example, in the
Math u
Problem seems to be resolved for now after rebasing. I guess it was a
transient problem on the main branch.
Gareth aka. Kit
On 22/09/2021 21:07, J. Gareth Moreton via fpc-devel wrote:
Hi everyone,
So I've made a merge request for Marģers' division improvement over
here: https://
Hi everyone,
So I've made a merge request for Marģers' division improvement over
here: https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/51
I am getting a build failure that indicates "Binary files ppc3 and
ppcx64 differ". This usually implies a fault in the compiler code.
H