Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-28 Thread J. Gareth Moreton
 I figured so, although when I implemented such a pooled object, it
unintentionally fixed a couple of memory leaks!  Of course, it might just
be a lazy workaround instead of putting in the missing "ReleaseUsedRegs"
commands.

 Gareth aka. Kit

 On Fri 28/12/18 17:41, Florian Klämpfl flor...@freepascal.org sent:
 On 15.12.2018 at 16:18, J. Gareth Moreton wrote:
 > Ah right, so things like "TmpUsedRegs" (an array of TUsedRegs) constantly
 > being created and destroyed in the peephole optimizer is actually not that
 > much of a penalty hit, and creating a pooled object for continuous use
 > doesn't give that much of a performance gain?

 I do not expect so.
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-28 Thread Florian Klämpfl
On 15.12.2018 at 16:18, J. Gareth Moreton wrote:
> Ah right, so things like "TmpUsedRegs" (an array of TUsedRegs) constantly
> being created and destroyed in the peephole optimizer is actually not that
> much of a penalty hit, and creating a pooled object for continuous use
> doesn't give that much of a performance gain?

I do not expect so.


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-27 Thread J. Gareth Moreton
Is it possible to still consider these changes? They do give big time
savings when compiling large projects under x86_64 and a couple of the
rewritten functions perform better in finding optimisations with jumps.
I'm holding off doing additional peephole additions until I know whether
this will be discarded or not, since any new changes will involve updating
the patch files (which might have to be updated with the recent changes to
the trunk).

Gareth aka. Kit



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-23 Thread J. Gareth Moreton
 Updated https://bugs.freepascal.org/view.php?id=34628 - new
"overhaul-mov-refactor.patch" that now changes "movl addl movq" to "addl"
(and equivalently with "incl").  Currently the patches only apply to
x86_64, but i386 is ready for upload if approved... much less splitting
involved!

 Gareth aka. Kit
 P.S. To the moderators... I believe there's a message in the moderation
queue because it contains a 60kb attachment (a patch file).



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-22 Thread J. Gareth Moreton
 Saying that, it might not be a bug but a design choice.  If the compiler
is able to extend the variable to 64 bits on the stack, it will do,
including the use of "incq" over "incl" - whenever I try to exploit the
upper 32 bits, the compiler is too smart for that!

 The only problem is that it can hinder the peephole optimiser, and if I
put in an exception that optimises, say, "movl addl movq" to "addl" or
"incl" or whatnot, then that could be exploited.  I'll have a think in
regards to what to do with that one.

 Gareth aka. Kit



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-22 Thread J. Gareth Moreton
 Have to apologise again for my web client making life difficult for the
mail archive system.

 Currently I'm a little reluctant to put in the "incq" fix because the code
isn't equivalent.  More than anything, it's a very minor bug with the node
system in that it writes the full 64-bit register instead of the 32-bit
register for LongInts and LongWords, it seems.  I might be able to exploit
the bug in a very narrow range of circumstances, and if I'm able to make a
reproducible test case, I'll report it.

 Nevertheless, if I'm not able to exploit it, it's something that might be
worth fixing if only because it will make optimisation easier and correct.

 Also note that "INC" is generally only generated if you're optimising for
size, because modern CPUs work better with "ADD" due to how it modifies the
RFLAGS register.

 For the overhaul in general, I have it ported over to i386 on my working
branch, since I've got x86_64 working without any breaking bugs.

 Even if just for a temporary debugging measure, I'm considering options to
allow the writing of node trees to an XML file or some such, since I think
this will allow easier debugging of that part of the compiler.

 Gareth aka. Kit


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-17 Thread Marco van de Voort


On 12/17/2018 at 8:23 AM, Ryan Joseph wrote:
>> On Dec 16, 2018, at 10:57 PM, Marco van de Voort wrote:
>>
>> I'm no expert, but afaik creating an object involves an exception frame,
>> which is afaik cheaper in Delphi with SEH than in FPC with setjmp.
>
> Even if there is no try..except block?

Try..finally block is more like it. If the constructor segfaults, the
already allocated memory is released and an exception is raised.
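
A minimal sketch of that implicit try..finally, assuming FPC in objfpc mode.
TFailing and DestroyCalled are hypothetical names, and a raised exception
stands in for the segfault case. The point it is meant to show: when a
constructor fails, the destructor runs and the exception reaches the caller,
so every object creation needs an exception frame even if the program itself
never writes try..except.

```pascal
program CtorFrame;
{$mode objfpc}
uses
  SysUtils;

type
  { Hypothetical class whose constructor always fails }
  TFailing = class
    constructor Create;
    destructor Destroy; override;
  end;

var
  DestroyCalled: Boolean = False;

constructor TFailing.Create;
begin
  inherited Create;
  { stand-in for any failure inside a constructor }
  raise Exception.Create('constructor failed');
end;

destructor TFailing.Destroy;
begin
  DestroyCalled := True;  { records that clean-up ran }
  inherited Destroy;
end;

var
  Obj: TFailing;
begin
  Obj := nil;
  try
    Obj := TFailing.Create;  { the assignment never happens }
  except
    on E: Exception do
      WriteLn('caught: ', E.Message);
  end;
  WriteLn('destructor ran: ', DestroyCalled);
  WriteLn('Obj assigned: ', Obj <> nil);
end.
```

The instance is never assigned to Obj, so the caller has nothing to free; the
clean-up happens inside the constructor call itself.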





Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-16 Thread Ryan Joseph


> On Dec 16, 2018, at 10:57 PM, Marco van de Voort wrote:
>
> I'm no expert, but afaik creating an object involves an exception frame,
> which is afaik cheaper in Delphi with SEH than in FPC with setjmp.

Even if there is no try..except block? I don’t use exceptions in FPC so 
shouldn’t this be turned off? I may not understand what you mean by exception 
frame though.

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-16 Thread Marco van de Voort


On 2018-12-15 at 19:01, Martok wrote:
> memory manager in daily use. Doing that is a C++-ism that shouldn't exist
> in a sane environment ;-)
>
> I just tested something, and I'm surprised by how big the difference is. This
> simple test is 1.5 times slower in FPC/trunk/win32 than Delphi 2007 and 2.8
> times slower for instances of TComponent. Medium-size GetMemory (I tested 123
> bytes) is 22 times slower in FPC.
> Looks like there is quite some potential there.

I'm no expert, but afaik creating an object involves an exception frame,
which is afaik cheaper in Delphi with SEH than in FPC with setjmp.

The test should probably be repeated with an FPC recompiled with
OPTS="-dTEST_WIN32_SEH"






Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Jonas Maebe

On 15/12/18 21:04, Ben Grasset wrote:
> On Sat, Dec 15, 2018 at 2:43 PM Jonas Maebe wrote:
>
>> That is incorrect.
>
> I didn't mean that it doesn't *care* about being fast, but more that it
> will not necessarily use more memory in all cases that it might result
> in a speed gain, and generally is more concerned with a "balanced"
> approach, which is what I've always heard to be the case. Is that really
> not true at all?


It makes use of thread-local heaps and pooling for small allocation 
sizes, both of which are aimed at increasing speed. For blocks that are 
too large for the pools, there are no particular trade-offs afaik (it's 
just a linked list of free blocks per thread that can be reused). It's 
probably true that it will have a smaller memory footprint than 
something like jemalloc, at least as long as memory fragmentation (which 
jemalloc is very good at avoiding) doesn't take the upper hand.
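
To make the "linked list of free blocks that can be reused" idea concrete,
here is a minimal sketch of a fixed-size free-list pool. It is not FPC's
actual heap manager; PoolAlloc, PoolFree, PoolDone and HeapCalls are
hypothetical names, and a per-thread variant would presumably declare
FreeList as a threadvar.

```pascal
program FreeListPool;
{$mode objfpc}

type
  PBlock = ^TBlock;
  TBlock = record
    Next: PBlock;                   { link used only while the block is free }
    Payload: array[0..55] of Byte;  { fixed-size user area }
  end;

var
  FreeList: PBlock = nil;   { head of the list of released blocks }
  HeapCalls: Integer = 0;   { how often the underlying allocator was used }

function PoolAlloc: PBlock;
begin
  if FreeList <> nil then
  begin
    Result := FreeList;             { reuse a previously released block }
    FreeList := FreeList^.Next;
  end
  else
  begin
    New(Result);                    { fall back to the underlying allocator }
    Inc(HeapCalls);
  end;
end;

procedure PoolFree(B: PBlock);
begin
  B^.Next := FreeList;              { push the block onto the free list }
  FreeList := B;
end;

procedure PoolDone;
var
  B: PBlock;
begin
  { hand every cached block back to the underlying allocator }
  while FreeList <> nil do
  begin
    B := FreeList;
    FreeList := B^.Next;
    Dispose(B);
  end;
end;

var
  i: Integer;
  B: PBlock;
begin
  for i := 1 to 1000 do
  begin
    B := PoolAlloc;
    PoolFree(B);
  end;
  WriteLn('underlying allocations: ', HeapCalls);
  PoolDone;
end.
```

In this allocate-then-free pattern only the first request reaches the
underlying allocator; every later request is served from the free list.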



Jonas


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread J. Gareth Moreton
 Apologies for my webmail client problems.  There's little I can do about
that.
 This was the object pooling that I did -
https://bugs.freepascal.org/view.php?id=34679 - though there's some cycle
counting involved (e.g. in OptPass1MOV, only the integer registers are
updated instead of all 8 types), this little bit of refactoring also fixed
a couple of memory leaks as a side-effect, since two of the peephole
optimisation routines created an array of TUsedRegs objects, but failed to
free them afterwards.  There are also quite a few locations where
ReleaseUsedRegs is called multiple times in the same routine, mostly in an
attempt to avoid using a "try...finally" block.  Such configurations are
just asking for a bug to be introduced where it's forgotten in one of the
branches.
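
A minimal sketch of the alternative, assuming objfpc mode. TScratch,
AllocScratch, ReleaseScratch and TryOptimise are hypothetical stand-ins, not
the compiler's real TUsedRegs API. A single "try...finally" releases the
scratch object on every exit path, so no branch can forget it.

```pascal
program ReleaseOnce;
{$mode objfpc}

type
  { Hypothetical stand-in for a scratch object such as TUsedRegs }
  TScratch = class
  end;

var
  LiveCount: Integer = 0;  { number of scratch objects not yet released }

function AllocScratch: TScratch;
begin
  Inc(LiveCount);
  Result := TScratch.Create;
end;

procedure ReleaseScratch(var S: TScratch);
begin
  S.Free;
  S := nil;
  Dec(LiveCount);
end;

function TryOptimise(Pattern: Integer): Boolean;
var
  Tmp: TScratch;
begin
  Result := False;
  Tmp := AllocScratch;
  try
    if Pattern = 1 then
      Exit(True);    { early exit still runs the finally block }
    if Pattern = 2 then
      Exit(False);
    Result := Pattern > 2;
  finally
    ReleaseScratch(Tmp);  { the only release call in the routine }
  end;
end;

begin
  TryOptimise(1);
  TryOptimise(2);
  TryOptimise(3);
  WriteLn('live scratch objects: ', LiveCount);
end.
```

The cost is the exception frame discussed elsewhere in this thread, which is
presumably why the existing code avoids it.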

 Gareth aka. Kit



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Ben Grasset
On Sat, Dec 15, 2018 at 2:37 PM Martok wrote:

> In any case, FPC's cmem on Win32 calls into msvcrt, and that is so slow
> that I killed the test code after a couple of minutes, where even
> FPC-builtin was done after 10 seconds.

Interesting. On Win64 I've found it to be consistently faster. Also, the
cmem unit changed to call into a copy of Jemalloc.dll (i.e. je_malloc
instead of malloc and such), something I've experimented with, is WAY
faster.

On Sat, Dec 15, 2018 at 2:43 PM Jonas Maebe wrote:

> That is incorrect.


I didn't mean that it doesn't *care* about being fast, but more that it
will not necessarily use more memory in all cases that it might result in a
speed gain, and generally is more concerned with a "balanced" approach,
which is what I've always heard to be the case. Is that really not true at
all?


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Jonas Maebe

On 15/12/18 19:37, Ben Grasset wrote:
> Should this really be surprising at all though? To me it seems obvious
> why that would be the case. Delphi the compiler (not the IDE) is not
> written in Pascal. It's written in a combination of C and C++. Thus, I
> would imagine that Delphi's *default* internal memory management system
> is more along the lines of what is done in FPC's cmem unit, which is
> well known to be objectively faster than FPC's default memory manager

As Martok wrote, Delphi's memory manager is FastMM. That one is written
in assembler, afaik. Additionally, cmem is mainly faster (on Unix
platforms) if you reallocate many (large) memory blocks, which is
related to what Florian talked about earlier.

> as FPC's default memory manager simply does not aim to be fast but rather
> to use the smallest amount of memory possible.

That is incorrect.


Jonas


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Martok
Sorry for hijacking the thread. Your mail client issue makes the conversation
really hard to follow, so I have literally no idea what the current subtopic of
a reply chain is, and there's little point in properly detaching a thread.


On 15.12.2018 at 18:13, J. Gareth Moreton wrote:
> I dare ask, does that mean we should avoid workarounds in the compiler (and
> our own programs) that aim to avoid constant construction and destruction of 
> objects, and instead try to improve the memory manager?

I was thinking more along the lines of avoiding cycle-counting special paths at
the cost of reliability, when there are much larger issues that would benefit
every program.

I would not be surprised if some of the large difference Simon listed when
calling out the bounty comes from this side, instead of raw instruction
throughput.


> Thus, I would imagine that Delphi's *default* internal memory management
> system is more along the lines of what is done in FPC's cmem unit, which is
> well known to be objectively faster than FPC's default memory manager
I'm fairly certain the runtime is written in Pascal, except for parts of the
startup code. The memory manager at some point (I think D2006?) adopted FastMM:
http://docwiki.embarcadero.com/RADStudio/2010/en/Configuring_the_Memory_Manager

In any case, FPC's cmem on Win32 calls into msvcrt, and that is so slow that I
killed the test code after a couple of minutes, where even FPC-builtin was done
after 10 seconds.



-- 
Regards,
Martok




Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread J. Gareth Moreton
 Well, Florian did say he was concerned about the increased maintenance
costs, given how complex the compiler is already.  Granted, it's one of
the few surefire ways that I've sped up the compiler quite significantly. 
Other speed-ups like other case block algorithms may also help.

 Though the modifications as they stand are perfectly adequate, some "inc"
instructions get changed to "mov, add, mov" that, while not particularly
bad, are a little unexpected, since one of my criteria was that optimised
code is either identical to or better than what it was before, not worse.  I
can always introduce a fix for that later though.

 If I know the changes will be approved or rejected, I can work on
additional peephole optimisations that depend somewhat on the overhaul.

  Gareth aka. Kit



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Ben Grasset
On Sat, Dec 15, 2018 at 1:14 PM J. Gareth Moreton wrote:

> P.S. This thread is supposed to be for the x86_64 optimizer overhaul that
> I presented!
>

Despite the other reply I just sent about the memory management stuff I
also agree here! Your changes look very beneficial and it would be nice to
see them get formally addressed ASAP.


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Ben Grasset
On Sat, Dec 15, 2018 at 1:01 PM Martok wrote:

> I just tested something, and I'm surprised by how big the difference is.
>

Should this really be surprising at all though? To me it seems obvious why
that would be the case. Delphi the compiler (not the IDE) is not written in
Pascal. It's written in a combination of C and C++. Thus, I would imagine
that Delphi's *default* internal memory management system is more along the
lines of what is done in FPC's cmem unit, which is well known to be
objectively faster than FPC's default memory manager as FPC's default
memory manager simply does not aim to be fast but rather to use the
smallest amount of memory possible.

It seems to me like a clear cut issue of just deciding what the biggest
priority is. If anything, it might make sense to implement a
secondary-default memory manager for FPC that does not necessarily call C
functions like malloc but *does* aim specifically for speed, and that could
perhaps be made available either via a command-line flag or perhaps just by
using a unit the same way cmem is.


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread J. Gareth Moreton
 I dare ask, does that mean we should avoid workarounds in the compiler
(and our own programs) that aim to avoid constant construction and
destruction of objects, and instead try to improve the memory manager?
 So many discoveries!
 Gareth aka. Kit
 P.S. This thread is supposed to be for the x86_64 optimizer overhaul that
I presented!



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Martok
On 15.12.2018 at 17:12, Florian Klämpfl wrote:
> The memory manager itself pools already, so no need for the compiler. If
> somebody wants to improve the heap manager: implement OS supported
> re-allocations (OS can move memory by just shuffling pages).

Very much agree, it's not a user program's job to work around the standard
memory manager in daily use. Doing that is a C++-ism that shouldn't exist in a
sane environment ;-)

I just tested something, and I'm surprised by how big the difference is. This
simple test is 1.5 times slower in FPC/trunk/win32 than Delphi 2007 and 2.8
times slower for instances of TComponent. Medium-size GetMemory (I tested 123
bytes) is 22 times slower in FPC.
Looks like there is quite some potential there.


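uses
  Windows;  { assumed: provides GetTickCount on Win32 }

const COUNT=1;
var
  t1, t2: dword;
  objs: array[0..1] of TObject;
  i, j: integer;
begin
  t1:= Gettickcount;
  for i := 0 to COUNT - 1 do begin
    { allocate every slot, then free every slot }
    for j := 0 to high(objs) do
      objs[j]:= TObject.Create;
    for j := 0 to high(objs) do
      objs[j].Free;
  end;
  t2:= Gettickcount;

  { average time per outer iteration, in milliseconds }
  writeln((t2-t1)/COUNT:10:3, 'ms');
  Readln;
end.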

-- 
Regards,
Martok




Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread J. Gareth Moreton
 Ah right, so things like "TmpUsedRegs" (an array of TUsedRegs) constantly
being created and destroyed in the peephole optimizer is actually not that
much of a penalty hit, and creating a pooled object for continuous use
doesn't give that much of a performance gain?

 Gareth



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-15 Thread Florian Klämpfl
On 12.12.2018 at 13:49, Ryan Joseph wrote:
>> On Dec 12, 2018, at 7:20 PM, Martok wrote:
>>
>> Checking out the memory manager(s) could be useful as well - there are a
>> lot of small allocations, that generally tends to put much stress on it.
>> And any improvement there would also directly benefit user applications.
>
> I was going to say the same thing myself and even planned to do a test. My
> profiles show the top hits being getmem/freemem which really don’t need to
> be there.
>
> There’s no reason to be allocating and freeing nodes (for example) over and
> over again when we could just allocate a large pool at startup and return to
> the pool instead of freeing. It would make the compiler utilize more memory
> but that’s a good trade off for me personally. This is especially a good idea
> because the compiler is a one pass program so leaks over the long term aren’t
> a problem.

The memory manager itself pools already, so no need for the compiler. If 
somebody wants to improve the heap manager:
implement OS supported re-allocations (OS can move memory by just shuffling 
pages).


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Marģers . via fpc-devel
 

- Reply to message -
Subject: Re: [fpc-devel] x86_64 Optimizer Overhaul
Date: 12 December 2018, 17:02:02
From:  J. Gareth Moreton 
To:  FPC developers' list

> By the way, what generates that set of
> operations? I'm curious because I want to
> see what's going on in the compiler. You
> see, "incq" and that "mov, add, mov" set
> aren't equivalent; anything over
> $1 gets truncated with the set,
> but not with "incq", although it's not a
> concern if only the lower 32 bits are
> used.

I have to agree, they are not equivalent. I have added an
example program for you to examine this situation.
It might or might not be an error.
Note: I use the compiler parameter -O4.

> If both combinations run at about the same
> speed, then "incq" is better just on
> account of code size.
I spent some time examining "incq mem" and "mov
add mov".
On my particular CPU, if "incq" is an independent
instruction, then actual performance is 1 clock
cycle.
The combination "mov add mov" ended up at about 1 -
1.2 clock cycles. A chain of "mov add mov" always took
a few clocks more than a chain of "incq" of the same
length.
But if "incq" falls into a severe dependency
chain, then "incq" executes 25% worse than "mov add
mov":
"incq" 4.5 clock cycles
"mov add mov" 3.8 clock cycles

I vote for shorter code and prefer "incq".

margers

program overhaul_incq;

var globalQ : longint;

function dummycall(a,b: longint):longint;
begin
 dummycall:=a+b;
end;

procedure fuu;
var k : longint;  { rbx is used for the loop counter }
    a,b,c,m,z,q : longint;  { no real use, just to occupy r12-r15 }
    sk : longint;  { no free registers left, so this becomes a temp on the stack }
begin
 sk:=0;
 q:=0; a:=0;
 for k:=0 to 100 do  { k takes rbx }
 begin
  { dummy math to keep registers r12 - r15 busy }
  c:=q+a;
  m:=k+1;
  { the call clobbers r8 - r11, rax, rdx, rcx, rdi, rsi, so they are of no use }
  z:=dummycall(k,c);
  q:=c+z;

  { as fpc doesn't use rbp for variables, }
  { we have no usable register left }
  { incq [mem] }
  inc(sk);
  {writeln(k,' ',q);}
 end;
 globalQ:=q;
end;

begin
 fuu;
 writeln(globalQ);
end.


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread J. Gareth Moreton
By the way, what generates that set of 
operations? I'm curious because I want to 
see what's going on in the compiler. You 
see, "incq" and that "mov, add, mov" set 
aren't equivalent; anything over 
$1 gets truncated with the set, 
but not with "incq", although it's not a 
concern if only the lower 32 bits are 
used.
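
To make the difference concrete, here is a minimal sketch that models the two
sequences on a 64-bit stack slot. The function names are made up for
illustration, and the masking is only a model of what "movl / addl / movq"
does (32-bit load and add, zero-extended into the full register, then a
64-bit store); it is not taken from the compiler sources.

```pascal
program incq_vs_movset;
{$mode objfpc}
{$Q-}{$R-}

{ Models "incq mem": a full 64-bit increment of the slot. }
function ViaIncq(slot: QWord): QWord;
begin
  Result := slot + 1;
end;

{ Models "movl mem,%eax; addl $1,%eax; movq %rax,mem":
  only the low 32 bits take part in the add, the upper half of rax is
  zeroed, and the whole 64-bit register is written back. }
function ViaMovAddMov(slot: QWord): QWord;
begin
  Result := (slot + 1) and QWord($FFFFFFFF);
end;

var
  small, big: QWord;
begin
  small := 41;                { fits comfortably in 32 bits }
  big := QWord($FFFFFFFF);    { low 32 bits all set }
  WriteLn(ViaIncq(small) = ViaMovAddMov(small));  { both give 42 }
  WriteLn(ViaIncq(big) = ViaMovAddMov(big));      { $100000000 versus 0 }
end.
```

As long as the slot only ever holds a 32-bit value that does not wrap, both
forms leave the same low 32 bits behind, which is why it does not matter for
a longint such as "sk".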

If both combinations run at about the same 
speed, then "incq" is better just on 
account of code size.

Gareth aka. Kit

On Wed 12/12/18 14:38 , "Marģers ." margers.ro...@inbox.lv sent:
> > Nice spot with the "incq" command there.  It wasn't intentional for
> > that to be split into 3 commands, but is likely just a side-effect of
> > pass 1 not being run twice now... granted, since one of my criteria
> > was that the code should not be less optimal, I'll see if I can watch
> > out for that one.
>
> Both versions are roughly equivalent in execution speed.
>
> > One interesting thing to note though is that the read and add work on
> > the 32-bit register, but then the full 64-bit register is written.
>
> Local variables are meant to be allocated in registers, but because the
> procedure calls other procedures, they are stored "temporarily" on the
> stack as 64-bit registers.
> It's not an error, or at least not an error for the program logic in
> this case.
>
> > > # [468] inc(sk);
> > > --trunk -
> > > incq 272(%rsp)
> > >
> > > -- overhaul ---
> > > movl 272(%rsp),%eax
> > > addl $1,%eax
> > > movq %rax,272(%rsp)
> > >
> > > did you mean to be so?
> > >
> > > margers



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Ryan Joseph


> On Dec 12, 2018, at 7:59 PM, Ryan Joseph  wrote:
> 
> For example, every time it parses “1 + 1” a large code block is entered

Correction, 1+1 doesn’t enter a large code block unless there’s an overload 
present. Once you add overloads however that’s when a caching solution would 
start helping.

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Marģers . via fpc-devel
 

> Nice spot with the "incq" command there.  It
> wasn't intentional for that to be split into 3
> commands, but is likely just a side-effect of pass
> 1 not being run twice now... granted, since one of
> my criteria was that the code should not be less
> optimal, I'll see if I can watch out for that one.

Both versions are roughly equivalent in execution
speed.

> One interesting thing to note though is that the
> read and add work on the 32-bit register, but then
> the full 64-bit register is written.

Local variables are meant to be allocated in
registers, but because the procedure calls other
procedures, they are stored "temporarily" on the stack
as 64-bit registers.
It's not an error, or at least not an error for the
program logic in this case.


> > # [468] inc(sk);
> > --trunk -
> > incq 272(%rsp)

> > -- overhaul ---
> > movl 272(%rsp),%eax
> > addl $1,%eax
> > movq %rax,272(%rsp)

> > did you mean to be so?

> > margers



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread J. Gareth Moreton
 Currently, compiled programs shouldn't show any measurable difference in
running speed because the overhaul just attempts to make compilation faster
overall.

 Nice spot with the "incq" command there.  It wasn't intentional for that
to be split into 3 commands, but is likely just a side-effect of pass 1 not
being run twice now... granted, since one of my criteria was that the code
should not be less optimal, I'll see if I can watch out for that one.

 One interesting thing to note though is that the read and add work on the
32-bit register, but then the full 64-bit register is written.

 Gareth aka. Kit

 On Wed 12/12/18 13:08 , "Marģers ." margers.ro...@inbox.lv sent:
   

 - Reply to message ----- 
 Subject: Re: [fpc-devel] x86_64 Optimizer Overhaul 
 Date: 6 December 2018, 18:57:29 
 From: J. Gareth Moreton  
 To: FPC developers' list 

 > I believe I've fixed the bug.  Thanks for your 
 > help. 

 Now it's way better. -O3 and -O4 work fine. 
 Speed tests for my programs show no measurable 
 difference. 

 # [468] inc(sk); 
 --trunk - 
 incq 272(%rsp) 

 -- overhaul --- 
 movl 272(%rsp),%eax 
 addl $1,%eax 
 movq %rax,272(%rsp) 

 did you mean to be so? 

 margers 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Marģers . via fpc-devel
 

- Reply to message -
Subject: Re: [fpc-devel] x86_64 Optimizer Overhaul
Date: 6 December 2018, 18:57:29
From:  J. Gareth Moreton 
To:  FPC developers' list

> I believe I've fixed the bug.  Thanks for your
> help.

Now it's way better. -O3 and -O4 work fine.
Speed tests for my programs show no measurable
difference.


# [468] inc(sk);
--trunk  - 
incq 272(%rsp)

-- overhaul --- 
movl 272(%rsp),%eax
addl $1,%eax
movq %rax,272(%rsp)

did you mean to be so?

margers



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Ryan Joseph


> On Dec 12, 2018, at 7:20 PM, Martok  wrote:
> 
> Checking out the memory manager(s) could be useful as well - there are a
> lot of small allocations, that generally tends to put much stress on it.
> And any improvement there would also directly benefit user applications.

I noticed today, when working on operator overloads (for default properties), 
that there is no attempt to cache anything.

For example, every time it parses “1 + 1” a large code block is entered, with 
lots of dynamic allocations etc… Obviously the resolved overload for 
integer+integer could be cached, at least on a per-block level. That’s a 
particularly wasteful detail I noticed, and there are probably many more like this.

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Marco van de Voort


On 12/12/2018 at 1:49 PM, Ryan Joseph wrote:
> This is especially a good idea because the compiler is a one pass 
> program so leaks over the long term aren’t a problem. 

(well, unless it is integrated in the textmode IDE "fp" of course)


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Ryan Joseph


> On Dec 12, 2018, at 7:20 PM, Martok  wrote:
> 
> Checking out the memory manager(s) could be useful as well - there are a
> lot of small allocations, that generally tends to put much stress on it.
> And any improvement there would also directly benefit user applications.

I was going to say the same thing myself and even planned to do a test. My 
profiles show the top hits being getmem/freemem which really don’t need to be 
there.

There’s no reason to be allocating and freeing nodes (for example) over and 
over again when we could just allocate a large pool at startup and return to 
the pool instead of freeing. It would make the compiler utilize more memory but 
that’s a good trade off for me personally. This is especially a good idea 
because the compiler is a one pass program so leaks over the long term aren’t a 
problem.
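
As a rough illustration of the idea, here is a minimal sketch of a fixed-size
pool with a free list. TNode, PoolSize, AllocNode and ReleaseNode are
hypothetical names invented for this example and have nothing to do with the
compiler's real node classes; whether such a pool would actually beat the RTL
heap manager would have to be measured.

```pascal
program nodepool;
{$mode objfpc}

type
  PNode = ^TNode;
  TNode = record
    Value: LongInt;
    Next: PNode;   { doubles as the free-list link while the node is unused }
  end;

const
  PoolSize = 1024;  { arbitrary size chosen for the sketch }

var
  Pool: array[0..PoolSize - 1] of TNode;
  FreeList: PNode = nil;

{ Thread every slot of the pool onto the free list once at startup. }
procedure InitPool;
var
  i: Integer;
begin
  FreeList := nil;
  for i := High(Pool) downto Low(Pool) do
  begin
    Pool[i].Next := FreeList;
    FreeList := @Pool[i];
  end;
end;

{ Take a node from the free list; returns nil if the pool is exhausted. }
function AllocNode: PNode;
begin
  Result := FreeList;
  if Result = nil then
    Exit;
  FreeList := Result^.Next;
  Result^.Next := nil;
end;

{ Return a node to the pool instead of freeing it. }
procedure ReleaseNode(n: PNode);
begin
  n^.Next := FreeList;
  FreeList := n;
end;

var
  a, b: PNode;
begin
  InitPool;
  a := AllocNode;
  ReleaseNode(a);
  b := AllocNode;
  WriteLn(a = b);  { the released slot is handed out again }
end.
```

Allocation and release are both a couple of pointer assignments here, with no
call into getmem/freemem after startup.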

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-12 Thread Martok
On 12.12.2018 at 04:51, Ryan Joseph wrote:
> I’ve spent some time in the compiler sources now and I’m curious just where 
> people think the bottlenecks for performance actually are. It’s such a 
> complicated system for any one person to have a good understanding of, so 
> it’s not clear where to begin looking.

Checking out the memory manager(s) could be useful as well - there are a lot of
small allocations, which generally tends to put a lot of stress on it.
And any improvement there would also directly benefit user applications.

-- 
Regards,
Martok




Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-11 Thread J. Gareth Moreton
 It is indeed such a complex system.  I wouldn't say that I've identified
a bottleneck per se, but I've chosen to focus my improvements there.  The
idea behind the overhaul is that it attempts to reduce the number of passes
during the peephole optimizer stage - given that I've managed to shave off
15 seconds from the compile time of Lazarus, I figure I might be onto
something.

 Generally, a good place to start with bottlenecks is the routines that are
most frequently entered, because any slow-downs there can very quickly
multiply.  For a recent example, I looked at OptPass1MOV and figured I
could refactor parts of it to reduce the number of calls to
"GetNextInstruction", which can take a while sometimes because it's
stepping through a linked list which might not always be cached. 
Otherwise it's a matter of simplifying some of the conditions.

 Otherwise, I'm the kind of perfectionist who just looks at a wall of
assembly language and thinks "that could be improved", even if it's just
one cycle.

 But the nice thing about open source projects like this is that we can all
have our individual specialisations and skillsets and choose to focus our
efforts on individual parts of the compiler.  If you ask me, if you see
something that could be improved, pass your ideas on and submit a patch if
you like.  It's worth doing some tests to confirm if you've made a saving,
although the hardest one to determine is if your compiled binary runs
faster or not.

 Gareth

 On Wed 12/12/18 03:51 , "Ryan Joseph" r...@thealchemistguild.com sent:
 I’ve spent some time in the compiler sources now and I’m curious just
where people think the bottle necks for performance actually are. It’s
such a complicated system for anyone one person to have a good
understanding of so it’s not clear where to begin looking. 

 > On Dec 12, 2018, at 9:42 AM, J. Gareth Moreton  wrote: 
 > 
 > The overhaul primarily increases the speed of compilation, but it makes
some minor improvements to conditional branches here and there.
Nevertheless, I'm always happy to find a saving here and there in the
compiled assembly language! 

 Regards, 
 Ryan Joseph 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-11 Thread Ryan Joseph
I’ve spent some time in the compiler sources now and I’m curious just where 
people think the bottlenecks for performance actually are. It’s such a 
complicated system for any one person to have a good understanding of, so 
it’s not clear where to begin looking.

> On Dec 12, 2018, at 9:42 AM, J. Gareth Moreton  
> wrote:
> 
> The overhaul primarily increases the speed of compilation, but it makes some 
> minor improvements to conditional branches here and there.  Nevertheless, I'm 
> always happy to find a saving here and there in the compiled assembly 
> language!

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-11 Thread J. Gareth Moreton
 I think this was intended for the mailing list - I'm looking forward to it
too.  Depends on what Florian says though.

 The overhaul primarily increases the speed of compilation, but it makes
some minor improvements to conditional branches here and there. 
Nevertheless, I'm always happy to find a saving here and there in the
compiled assembly language!

 Gareth

 On Tue 11/12/18 20:57 , Ched charles.edouard.des.vastes.vig...@gmail.com sent:
 Hello Gareth, 

 I'm looking forward to the implementation of your optimizer, as the gain
 in execution speed for programs running hours a day is very welcome! But
 I'll wait for an official upgrade of the production trunk. 

 Cheers, Ched' 

 On 09.12.18 at 22:39, J. Gareth Moreton wrote: 
 > Because of how intertwined my work is, I can't easily work on something
 > else until I know if this overhaul is accepted or rejected.  However, in
 > the meantime, would anyone object if I start porting it to i386, so I can
 > get rid of all those horrible $ifdef's more than anything? From the little
 > I've observed, i386 still works as it does normally, which was the original
 > intention so x86_64 can be tested in isolation. 
 > 
 > Gareth aka. Kit 
 > 


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-09 Thread J. Gareth Moreton
 Because of how intertwined my work is, I can't easily work on something
else until I know if this overhaul is accepted or rejected.  However, in
the meantime, would anyone object if I start porting it to i386, so I can
get rid of all those horrible $ifdef's more than anything? From the little
I've observed, i386 still works as it does normally, which was the original
intention so x86_64 can be tested in isolation.

 Gareth aka. Kit


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-09 Thread J. Gareth Moreton
 I think patch.exe gets a bit confused if the changes have to be offset -
in my case, there are 6 patches that should be applied together, and some
of them modify the same file, hence causing movement of procedures in the
source file.
 Sorry if I'm getting pushy and impatient with this change - I guess I'm a
bit too passionate for my own good!

 Gareth aka. Kit

 On Sun 09/12/18 11:57 , Marco van de Voort f...@pascalprogramming.org sent:

 On 2018-12-09 at 02:05, J. Gareth Moreton wrote: 
 > I'm not sure. I've always had problems 
 > with patch.exe. I personally use "svn 
 > patch", which works for me both under 
 > Windows and Linux. 

 Cygwin patch afaik also works fine. 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-09 Thread Marco van de Voort


On 2018-12-09 at 02:05, J. Gareth Moreton wrote:

> I'm not sure. I've always had problems
> with patch.exe. I personally use "svn
> patch", which works for me both under
> Windows and Linux.

Cygwin patch afaik also works fine.




Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread J. Gareth Moreton
 Saying that though, despite the near-identical time, what's the size of
the binary like?  It should be the same or slightly smaller, but
(hopefully) never larger.

 Gareth aka. Kit

 On Sun 09/12/18 03:32 , "Ryan Joseph" r...@thealchemistguild.com sent:

 > On Dec 9, 2018, at 9:15 AM, J. Gareth Moreton  wrote: 
 > 
 > Hmmm, that's a shame if the time difference is so small. Up to you if
 > it's worth it or not. I hoped it would be slightly better than that,
 > although if it's consistently faster, especially with large projects, then
 > it's a winner in my eyes. Fingers crossed! 

 My biggest project is only ~20 seconds so it’s just not a big enough
code base I think. 

 How do we even know how much of the time was spent on code generation vs
parsing? All I looked at was the final time using -vs when it got to the
linking phase. 

 Regards, 
 Ryan Joseph 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread J. Gareth Moreton
 Hmmm, it might imply that the overhaul isn't worth it except for the
largest projects.  I guess we'll have to let Florian make that call.

 I'm not sure how to time the optimisation stage separately, unless you're
able to pass in the PPU files directly.  Other factors like reading from
the disk can take up proportionally more time as well.  Thanks for your
help though - at least it's not considerably worse!

 I'm just hoping my changes are successful so I can port it to i386 and
then implement some new peephole optimisations that I've found (the changes
clash slightly with the overhaul).

 Gareth aka. Kit.

 On Sun 09/12/18 03:32 , "Ryan Joseph" r...@thealchemistguild.com sent:

 > On Dec 9, 2018, at 9:15 AM, J. Gareth Moreton  wrote: 
 > 
 > Hmmm, that's a shame if the time difference is so small. Up to you if
 > it's worth it or not. I hoped it would be slightly better than that,
 > although if it's consistently faster, especially with large projects, then
 > it's a winner in my eyes. Fingers crossed! 

 My biggest project is only ~20 seconds so it’s just not a big enough
code base I think. 

 How do we even know how much of the time was spent on code generation vs
parsing? All I looked at was the final time using -vs when it got to the
linking phase. 

 Regards, 
 Ryan Joseph 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread Ryan Joseph


> On Dec 9, 2018, at 9:15 AM, J. Gareth Moreton  
> wrote:
> 
> Hmmm, that's a shame if the time difference is so small.  Up to you if it's 
> worth it or not.  I hoped it would be slightly better than that, although if 
> it's consistently faster, especially with large projects, then it's a winner 
> in my eyes.  Fingers crossed!

My biggest project is only ~20 seconds so it’s just not a big enough code base 
I think. 

How do we even know how much of the time was spent on code generation vs 
parsing? All I looked at was the final time using -vs  when it got to the 
linking phase.

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread Ryan Joseph
Got it compiling, but I need a better way to specify the changed rtl/package 
units. I just did a “make clean all” but I need to specify the new location of 
the rtl/packages units, which are no longer in the default locations. 

Is there a better way than adding tons of -Fu’s in the command line?

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread J. Gareth Moreton
 Hmmm, that's a shame if the time difference is so small.  Up to you if
it's worth it or not.  I hoped it would be slightly better than that,
although if it's consistently faster, especially with large projects, then
it's a winner in my eyes.  Fingers crossed!

 Gareth aka. Kit

 On Sun 09/12/18 03:10 , "Ryan Joseph" r...@thealchemistguild.com sent:
 Got everything building finally but the time difference is so small I'll
need to make a script to compile multiple times and average all the runs.
Is it even worth the time doing that? 

 Regards, 
 Ryan Joseph 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread Ryan Joseph


> On Dec 9, 2018, at 9:02 AM, J. Gareth Moreton  
> wrote:
> 
> (This should probably be on the mailing list because it's helpful to everyone)
> 
> Hmmm, I'm not sure about that one - those shouldn't be affected.  Just the 
> standard "make clean all" should work.
> 
> However, the document here contains everything about building FPC: 
> http://www.stack.nl/~marcov/buildfaq.pdf - not sure why your packages aren't 
> in the default directories now though... the patches only change a few source 
> files.
> 
> Gareth aka. Kit

The reason is that PPU versions changed so I can’t use the units at the default 
system locations. Not sure how the compiler decides but it was looking for 
units at the 3.1.1 install location which is an older PPU version.

There should be a single command to specify a top-level directory that wins out 
over default locations.

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread Ryan Joseph
Got everything building finally but the time difference is so small I'll need 
to make a script to compile multiple times and average all the runs. Is it even 
worth the time doing that? 
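
Such a script could be sketched roughly as below. This is only an
illustration under some assumptions: the program name "avgcompile", the run
count and the choice of passing "-B" (rebuild all units) are mine, and it
measures wall-clock time for the whole compiler run, not the optimizer alone.

```pascal
program avgcompile;
{$mode objfpc}{$H+}

uses
  SysUtils, Process;

const
  Runs = 5;  { arbitrary number of repetitions for the sketch }

var
  i: Integer;
  t0, total: QWord;
  output: string;
begin
  if ParamCount < 2 then
  begin
    WriteLn('usage: avgcompile <compiler> <source file>');
    Halt(1);
  end;

  total := 0;
  for i := 1 to Runs do
  begin
    t0 := GetTickCount64;
    { -B forces a rebuild so every run does the same amount of work }
    if not RunCommand(ParamStr(1), ['-B', ParamStr(2)], output) then
    begin
      WriteLn('compiler run failed');
      Halt(1);
    end;
    Inc(total, GetTickCount64 - t0);
  end;

  WriteLn('average: ', total div Runs, ' ms over ', Runs, ' runs');
end.
```

Running it once with the trunk compiler and once with the patched compiler on
the same project gives two averages to compare; disk caching will still skew
the first run of each.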

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread J. Gareth Moreton
 (This should probably be on the mailing list because it's helpful to
everyone)

 Hmmm, I'm not sure about that one - those shouldn't be affected.  Just
the standard "make clean all" should work.

 However, the document here contains everything about building FPC:
http://www.stack.nl/~marcov/buildfaq.pdf [1] - not sure why your packages
aren't in the default directories now though... the patches only change a
few source files.

 Gareth aka. Kit

 On Sun 09/12/18 02:56 , Ryan Joseph r...@thealchemistguild.com sent:
 Got it compiling, but I need a better way to specify the changed
rtl/package units. I just did a “make clean all” but I need to specify
the new location of the rtl/packages units, which are no longer in the
default locations. 

 Is there a better way than adding tons of -Fu’s in the command line? 

 Regards, 
 Ryan Joseph 

 

Links:
--
[1] http://www.stack.nl/~marcov/buildfaq.pdf


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread J. Gareth Moreton
I'm not sure. I've always had problems 
with patch.exe. I personally use "svn 
patch", which works for me both under 
Windows and Linux.

I hope this works better.

Gareth aka. Kit

On Sun 09/12/18 01:56 , "Ryan Joseph" r...@thealchemistguild.com sent:
> I was stupid and didn’t use the right options. They work now except
> this one:
>
> sudo patch -p0 < /Users/ryanjoseph/Downloads/overhaul-base.patch
> (Stripping trailing CRs from patch.)
> patching file compiler/aopt.pas
> patch:  malformed patch at line 15: Index: compiler/aoptbase.pas
>
> did it fail?
>
> > On Dec 9, 2018, at 8:36 AM, Ryan Joseph  a...@thealchemistguild.com> wrote:
> >
> > Couldn’t figure out the patching. I tried a dry run but it doesn’t
> > seem to find the file.
>
> Regards,
> Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread Ryan Joseph
Couldn’t figure out the patching. I tried a dry run but it doesn’t seem to find 
the file.

Downloaded from svn

cd trunk
patch < /Users/ryanjoseph/Downloads/overhaul-64-32-split.patch --dry-run

(Stripping trailing CRs from patch.)
can't find file to patch at input line 5
Perhaps you should have used the -p or --strip option?
The text leading up to this was:
--
|Index: compiler/x86/aoptx86.pas
|===
|--- compiler/x86/aoptx86.pas   (revision 40472)
|+++ compiler/x86/aoptx86.pas   (working copy)
--
File to patch:


> On Dec 9, 2018, at 7:11 AM, J. Gareth Moreton  
> wrote:
> 
> Had any luck with this?

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread Ryan Joseph
I was stupid and didn’t use the right options. They work now except this one:

sudo patch -p0 < /Users/ryanjoseph/Downloads/overhaul-base.patch 
(Stripping trailing CRs from patch.)
patching file compiler/aopt.pas
patch:  malformed patch at line 15: Index: compiler/aoptbase.pas

did it fail?


> On Dec 9, 2018, at 8:36 AM, Ryan Joseph  wrote:
> 
> Couldn’t figure out the patching. I tried a dry run but it doesn’t seem to 
> find the file.

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-08 Thread J. Gareth Moreton
Had any luck with this?

Gareth aka. Kit


On Fri 07/12/18 01:26 , Ryan Joseph r...@thealchemistguild.com sent:
> > On Dec 7, 2018, at 5:11 AM, J. Gareth Moreton  e...@moreton-family.com> wrote:
> >
> > Does anyone have other test projects to compile that would give more
> > coverage for the timing metrics?
>
> Sure. How do I download and build? Are you just relying on the FPC standard
> output for timing, or are there special switches to show compile times
> more accurately?
>
> Regards,
> Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-06 Thread J. Gareth Moreton
 The patches in question can be found here:
https://bugs.freepascal.org/view.php?id=34628 - just get the latest source
files from the SVN trunk and apply the patches - the order shouldn't
matter, but be careful you don't accidentally apply the same patch twice. 
After that, you just "make clean all install" as normal.  If you've never
built FPC before, you might want to hunt around the website for
instructions.

 For the most accurate timing information, specify the "-vs" flag and it
will put timestamps at the front of all of the messages.

 Hope this helps.

 Gareth aka. Kit

 On Fri 07/12/18 01:26 , Ryan Joseph r...@thealchemistguild.com sent:

 > On Dec 7, 2018, at 5:11 AM, J. Gareth Moreton  wrote: 
 > 
 > Does anyone have other test projects to compile that would give more
 > coverage for the timing metrics? 

 Sure. How do I download and build? Are you just relying on the FPC standard
 output for timing, or are there special switches to show compile times
 more accurately? 

 Regards, 
 Ryan Joseph 



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-06 Thread Ryan Joseph


> On Dec 7, 2018, at 5:11 AM, J. Gareth Moreton  
> wrote:
> 
> Does anyone have other test projects to compile that would give more coverage 
> for the timing metrics?

Sure. How do I download and build? Are you just relying on the FPC standard 
output for timing, or are there special switches to show compile times more 
accurately?

Regards,
Ryan Joseph



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-06 Thread J. Gareth Moreton
 -- with overhaul: it never sets r8d to the new value, but should
     lea esi, [r11d+1H]
 -- end overhaul

     mov r10d, dword [rdi+rsi*4]
     jmp ?_00144

         }
  end else
  if (m > toFind) then
  begin
   high := mid - 1;
   h := sortedArray^[high];
  end else
  begin
   binarySearchLong:=mid;
   exit;
  end;

     end;

     if (sortedArray^[low] = toFind) then
     begin
  binarySearchLong:=low;
     end else
     binarySearchLong := -1; { Not found}
 end;

 ----- Reply to message -----
 Subject: Re: [fpc-devel] x86_64 Optimizer Overhaul
 Date: 2 December 2018, 23:32:36
 From:  J. Gareth Moreton 
 To:  FPC developers' list

 Thanks for the feedback.  Do you have a reproducible case, and does it
 fail on Linux or Windows?  I'll have a look for the infinite loops in
 the meantime.

 Gareth aka. Kit


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-06 Thread J. Gareth Moreton
 I believe I've fixed the bug.  Thanks for your help.

 I had misunderstood one of the internal methods and, as a result, it
wasn't resetting the register allocation usage with each iteration of the
loop (and to add insult to injury, caused a memory leak!).  By sheer
coincidence, this wasn't a problem under Windows because of some additional
code that skipped over the function prologue, but got triggered under
Linux.

 I've updated all of the patch files in the bug report and added an
additional one, since one function in particular got a bigger rework than
everything else (overhaul-mov-refactor).

 I haven't had a chance to re-test the timings yet, although I've tried to
provide a couple of additional savings for -O1 and -O2.

 Gareth aka. Kit

 P.S. Note that the code is very messy with functions being split between
i386 and x86_64. This is for testing and control cases.  If x86_64 is
successful, I intend to remove the distinctions and have i386 and x86_64
share the same overhaul.  One platform at a time though!

 On Sun 02/12/18 23:21 , "Marģers ." margers.ro...@inbox.lv sent:
 I ran it on Linux. This is the part of the code with the problem.

 type PLongData = ^TLongData;
   TLongData = array [0..100] of longint;

 function binarySearchLong(sortedArray: PLongData; nLen, toFind: longint): longint;
 var low, high, mid, l, h, m : longint;
 begin
     { Returns index of toFind in sortedArray, or -1 if not found }
     low := 0;
     high := nLen - 1;

     l := sortedArray^[low];
     h := sortedArray^[high];

     while ((l <= toFind) and (h >= toFind)) do
     begin
         mid := (low + high) shr 1;   { var "low" in register r8d }
         m := sortedArray^[mid];

         if (m < toFind) then
         begin
             low := mid + 1;
             l := sortedArray^[low];

             { asm code generated
               -- with trunk
                   lea r8d, [r11d+1H]
                   mov esi, r8d
               -- end trunk
               -- with overhaul: it never sets r8d to the new value, but should
                   lea esi, [r11d+1H]
               -- end overhaul

                   mov r10d, dword [rdi+rsi*4]
                   jmp ?_00144
             }
         end else
         if (m > toFind) then
         begin
             high := mid - 1;
             h := sortedArray^[high];
         end else
         begin
             binarySearchLong := mid;
             exit;
         end;
     end;

     if (sortedArray^[low] = toFind) then
     begin
         binarySearchLong := low;
     end else
         binarySearchLong := -1; { Not found }
 end;

 ----- Reply to message -----
 Subject: Re: [fpc-devel] x86_64 Optimizer Overhaul
 Date: 2 December 2018, 23:32:36
 From: J. Gareth Moreton
 To: FPC developers' list

 Thanks for the feedback.  Do you have a reproducible case, and does it
 fail on Linux or Windows?  I'll have a look for the infinite loops in the
 meantime.

 Gareth aka. Kit
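The report quoted above can be turned into a standalone check. The following is only a sketch, not part of the original report: the program name, the test data and the driver loop are hypothetical, and the loop condition `(l <= toFind) and (h >= toFind)` is an assumption, because the condition as archived appears to have lost characters. If the compiler miscompiles the loop as described (the register holding "low" is never updated), one would expect this program to hang or print a failure rather than "ok".

```pascal
program bsearchtest;
{$mode objfpc}

type
  PLongData = ^TLongData;
  TLongData = array [0..100] of longint;

{ Same routine as in the report; the while condition is an assumed
  reconstruction, not confirmed by the original author. }
function binarySearchLong(sortedArray: PLongData; nLen, toFind: longint): longint;
var
  low, high, mid, l, h, m: longint;
begin
  low := 0;
  high := nLen - 1;
  l := sortedArray^[low];
  h := sortedArray^[high];

  while (l <= toFind) and (h >= toFind) do
  begin
    mid := (low + high) shr 1;
    m := sortedArray^[mid];
    if m < toFind then
    begin
      low := mid + 1;
      l := sortedArray^[low];
    end
    else if m > toFind then
    begin
      high := mid - 1;
      h := sortedArray^[high];
    end
    else
    begin
      binarySearchLong := mid;
      exit;
    end;
  end;

  if sortedArray^[low] = toFind then
    binarySearchLong := low
  else
    binarySearchLong := -1; { not found }
end;

var
  data: TLongData;   { hypothetical test data: the even numbers 0..200 }
  i: longint;
begin
  for i := 0 to 100 do
    data[i] := 2 * i;

  { Every stored value must be found at its own index. }
  for i := 0 to 100 do
    if binarySearchLong(@data, 101, 2 * i) <> i then
    begin
      writeln('FAIL: value ', 2 * i);
      halt(1);
    end;

  { An odd value is not in the array, so -1 is expected. }
  if binarySearchLong(@data, 101, 41) <> -1 then
  begin
    writeln('FAIL: 41 should not be found');
    halt(1);
  end;

  writeln('ok');
end.
```

Building the same file with `fpc -O2` and `fpc -O3`, once with trunk and once with the patched compiler, and comparing the behaviour would show whether the problem depends on the optimisation level.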


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-02 Thread J. Gareth Moreton
 That's interesting. Thanks for that. Time to get fixing.

 In the meantime I'm also fixing up the buggy optimisation that caused the
original crash on Linux... nothing against the contributor, but it looks
like some badly-copied code from the MovXX routine... it even still
mentions "movsx" in the comments!

 Hopefully this effort on the overhaul won't all be for naught.

 Gareth aka. Kit



Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-02 Thread J. Gareth Moreton
 Thanks for the feedback.  Do you have a reproducible case, and does it
fail on Linux or Windows?  I'll have a look for the infinite loops in the
meantime.

 Gareth aka. Kit

 On Sun 02/12/18 20:54 , "Marģers ." margers.ro...@inbox.lv sent:
 > I've had problems testing it under Linux due to configuration
difficulties, so if anyone is willing to try out "make all", I'll be most
grateful. 

 "make all" works well on Linux.

 Compiler options -O3 and -O4 are broken.
 It was possible to compile my program, but at some point the program went
into a never-ending loop: CPU usage at 100% and no response.

 When I compiled my speed test program with -O2, the optimizations made by
the overhaul gave a 2% speed loss compared to current trunk.  I guess the
optimizations are good for the compiler itself, but not so much for user
programs.

 margers


Re: [fpc-devel] x86_64 Optimizer Overhaul

2018-12-01 Thread J. Gareth Moreton
 Following advice from Florian, I've split my submission into five separate
patches so they are easier to test.  It also now compiles under
x86_64-linux.  There appears to be a fault in one of the MOV
optimisations that was causing incorrect code to be generated in some
instances.  I have a good idea as to what's going on and can try to fix
this at another time.

 Hopefully now it's stable enough for time metrics to be taken and to
confirm it doesn't break other platforms.

 Some more refactoring should be performed down the line; I plan to do this
once my code is confirmed reasonable and I begin adapting it for i386,
where there's a bounty for speed gains!

 Find all the new patch files over here:
https://bugs.freepascal.org/view.php?id=34628 - note that some of the
patches require others to work; prerequisite information is given in the
first note.

 Gareth aka. Kit