Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-03 Thread J. Gareth Moreton via fpc-devel
Prepare for a lot of technical rambling! This is just an analysis of the compilation of utf8lentest.lpr, not any of the System units.  Notably, POPCNT isn't called directly, but instead goes through the System unit via "call fpc_popcnt_qword" on both 3.2.x and 3.3.1.  A future study of

Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-03 Thread J. Gareth Moreton via fpc-devel
Interesting - thank you.  Will be interesting to study the assembler output to see what's going on. I'm honoured that I've become the go-to person when optimisation is concerned!

Gareth aka. Kit

On 03/01/2022 11:54, Martin Frb via fpc-devel wrote:
> Hi Gareth, not sure if this is of

Re: [fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-03 Thread Marco van de Voort via fpc-devel
On 3-1-2022 12:54, Martin Frb via fpc-devel wrote:
> fpc 3.2.3 /   fpc 3.3.1
> fst 594       fst 688
> fst 578       fst 703
> fst 578       fst 687
> fst 562       fst 688

Fyi, the latest asm version (+fst/pop/add/naieve) is at http://www.stack.nl/~marcov/utf8lentest.lpr

[fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

2022-01-03 Thread Martin Frb via fpc-devel
Hi Gareth, not sure if this is of interest to you, but I see you do a lot on the optimizer. While testing the attached, I found that one of the functions was notably slower when compiled with 3.3.1 (compared to 3.2.3). So maybe something you are interested in looking at? The Code in