On 29/12/21 01:26, Bart via lazarus wrote:
fpc -al ulen.pas
> This will produce the file ulen.s
> You can attach or copy that here.
File is attached.
The output from running this program is:
% ./ulen
Signed version
Len = -100663283
Unsigned version
Len = 1
To add another wrinkle to this,
On 27-12-2021 22:10, Noel Duffy via lazarus wrote:
On 28/12/21 01:47, Juha Manninen via lazarus wrote:
On Mon, Dec 27, 2021 at 1:44 AM Noel Duffy via lazarus <
lazarus@lists.lazarus-ide.org> wrote:
I need some help getting to the root of a problem with incorrect results
on Apple hardware (M1,
On Tue, Dec 28, 2021 at 3:56 PM Florian Klämpfl via lazarus
wrote:
>
> It will crash at run time with SIGILL. Popcnt was introduced with Nehalem,
> so more than 10 years ago.
Thanks.
Do any other CPUs support something like this?
--
Bart
--
___
lazarus mailing list
On 28.12.2021 at 15:50, Bart via lazarus wrote:
On Tue, Dec 28, 2021 at 3:39 PM Marco van de Voort via lazarus
wrote:
On what machine did you test? The setting is for the generated code,
but the actual processor determines the effective speed.
I have an Intel i5 7th generation on my
On Tue, Dec 28, 2021 at 3:39 PM Marco van de Voort via lazarus
wrote:
> On what machine did you test? The setting is for the generated code,
> but the actual processor determines the effective speed.
I have an Intel i5 7th generation on my Win10-64 laptop from approx.
2017 (so, it's really old
On Tue, Dec 28, 2021 at 3:31 PM Florian Klämpfl via lazarus
wrote:
> For X86, check for the define CPUX86_HAS_POPCNT (compile time!).
Thanks.
--
Bart
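
The compile-time check on CPUX86_HAS_POPCNT could be sketched as follows. This is only an illustration, not the code in lazutf8: CountLowBits and the local ONEMASK value are hypothetical, and the sketch assumes the input has at most bit 0 set in each byte.

```pascal
program popcntselect;
{$mode objfpc}

const
  // Assumed value for illustration: bit 0 set in every byte of a QWord.
  ONEMASK = QWord($0101010101010101);

// Counts the bytes of nx whose lowest bit is set.
// Assumes nx has no other bits set (nx and not ONEMASK = 0).
function CountLowBits(nx: QWord): QWord;
begin
{$ifdef CPUX86_HAS_POPCNT}
  // Compiled in only when the selected -Cp target guarantees the instruction.
  Result := PopCnt(nx);
{$else}
  {$push}{$Q-}{$R-}
  // The multiplication sums all bytes into the top byte; wrap-around is intended.
  Result := (nx * ONEMASK) shr 56;
  {$pop}
{$endif}
end;

begin
  // Four bytes of this sample value have their lowest bit set.
  WriteLn(CountLowBits(QWord($0001000100000101)));
end.
```

Because the define is evaluated at compile time, a binary built this way with a -Cp target that has popcnt would still crash on an older CPU; a run-time check would need a different mechanism.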
On 12/28/2021 at 3:01 PM, Bart via lazarus wrote:
-Cpcoreavx for core 3000 series and higher
Thanks for that.
Up to PENTIUMM: PopCnt slower
COREI: approximately equally fast
COREAVX: PopCnt slightly faster
COREAVX2: PopCnt slightly faster
On what machine did you test? The setting is for
On 28.12.2021 at 15:01, Bart via lazarus wrote:
On Tue, Dec 28, 2021 at 2:46 PM Marco van de Voort via lazarus
wrote:
You need an appropriate minimal CPU with -Cp
Try e.g. -Cpcoreavx for core 3000 series and higher
Thanks for that.
Up to PENTIUMM: PopCnt slower
COREI : approximately
On Tue, Dec 28, 2021 at 2:46 PM Marco van de Voort via lazarus
wrote:
> You need an appropriate minimal CPU with -Cp
>
>
> Try e.g. -Cpcoreavx for core 3000 series and higher
Thanks for that.
Up to PENTIUMM: PopCnt slower
COREI: approximately equally fast
COREAVX: PopCnt slightly faster
On 12/28/2021 at 1:53 PM, Bart via lazarus wrote:
I just tested PopCnt vs Multiplication on win32 and win64.
The version with PopCnt is approximately 3 times slower on both 32 and 64 bit!
You need an appropriate minimal CPU with -Cp
Try e.g. -Cpcoreavx for core 3000 series and higher
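
For anyone wanting to repeat the comparison, here is a minimal sketch that computes the same count both ways, so the two variants can be built with and without -Cpcoreavx. The program name, the sample value and the ONEMASK value are assumptions for illustration; the real code in lazutf8 may differ.

```pascal
program popcmp;
{$mode objfpc}
// Build e.g. with:  fpc -Cpcoreavx popcmp.pas   (x86 only)
// Without a suitable -Cp setting, PopCnt presumably falls back to a
// generic software routine, which would explain the slower timing.

const
  // Assumed value for illustration: bit 0 set in every byte of a QWord.
  ONEMASK = QWord($0101010101010101);

var
  nx, viaMul, viaPop: QWord;
begin
  // Sample input with at most bit 0 set per byte.
  nx := QWord($0001000100000101);

  {$push}{$Q-}{$R-}
  // Multiplying by ONEMASK adds all bytes of nx into the top byte.
  viaMul := (nx * ONEMASK) shr 56;
  {$pop}

  // PopCnt counts set bits directly; the result is the same as long as
  // only bit 0 of each byte can be set.
  viaPop := PopCnt(nx);

  WriteLn('multiplication: ', viaMul);
  WriteLn('PopCnt:         ', viaPop);
end.
```

A single call like this says nothing about speed; for timing, the two expressions would have to run in a loop over a large buffer.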
--
On Tue, Dec 28, 2021 at 1:09 PM Juha Manninen via lazarus
wrote:
>> I will patch the function using unsigned types where applicable.
>> I will keep the loop variables unsigned though.
>
>
> Yes, thank you.
Done.
Should that be merged to fixes?
--
Bart
--
On Tue, Dec 28, 2021 at 1:09 PM Juha Manninen via lazarus
wrote:
> I confess I didn't remember what PopCnt does. I checked from the net.
> FPC implements it as internproc.
> function PopCnt(Const AValue : QWord): QWord;[internproc:fpc_in_popcnt_x];
> I guess it translates to one x86_64
On Tue, Dec 28, 2021 at 12:08 PM Martin Frb via lazarus
wrote:
> I would like to see the generated assembler on M1, if that is possible? (for
> code with optimization off, as well as code with whatever optimization was
> used so far)
@Noel:
Here's example code (standalone) you can use to
On Tue, Dec 28, 2021 at 1:45 PM Bart via lazarus <
lazarus@lists.lazarus-ide.org> wrote:
> @Juha: can you please comment on my possible improvement using PopCnt
> instead of a multiplication with ONEMASK.
>
I confess I didn't remember what PopCnt does. I checked from the net.
FPC implements it
On Tue, Dec 28, 2021 at 11:52 AM Juha Manninen via lazarus
wrote:
> Can you please create a patch for UTFLengthFast. You can upload it here or
> create a merge request in GitLab or anything.
@Juha: can you please comment on my possible improvement using PopCnt
instead of a multiplication with
On 28/12/2021 11:52, Juha Manninen via lazarus wrote:
On Tue, Dec 28, 2021 at 3:29 AM Noel Duffy via lazarus
wrote:
So it appears to me that an unsigned pointer type is required in
UTFLengthFast.
Can you please create a patch for UTFLengthFast. You can upload it
here or create a
On Tue, Dec 28, 2021 at 3:29 AM Noel Duffy via lazarus <
lazarus@lists.lazarus-ide.org> wrote:
> So it appears to me that an unsigned pointer type is required in
> UTFLengthFast.
>
Can you please create a patch for UTFLengthFast. You can upload it here or
create a merge request in GitLab or
On 28/12/21 07:21, Bart via lazarus wrote:
On Mon, Dec 27, 2021 at 6:35 PM Marco van de Voort via lazarus
wrote:
The expression seems to be 1 when the top bits are 10, i.e. when it is a
follow byte of utf8; that is what the comment says, and as far as I
can see the signedness doesn't
On Mon, Dec 27, 2021 at 10:02 PM Noel Duffy via lazarus
wrote:
> It's not just the euro, though. It's any utf-8 sequence.
What I meant was that a single '€' (or any other single UTF8
"character") will not enter the mentioned block.
Can you add some debug statements to display the values of the
On 28/12/21 01:47, Juha Manninen via lazarus wrote:
On Mon, Dec 27, 2021 at 1:44 AM Noel Duffy via lazarus <
lazarus@lists.lazarus-ide.org> wrote:
I need some help getting to the root of a problem with incorrect results
on Apple hardware (M1, aarch64) for the function UTF8LengthFast in
On 28/12/21 04:39, Bart via lazarus wrote:
On Mon, Dec 27, 2021 at 3:41 PM Juha Manninen via lazarus
wrote:
It must be a Big endian / Little endian issue. IIRC it can be adjusted in ARM
CPUs.
Why do MacOS and Linux use a different setting there? I have no idea.
On second thought: if the
On Mon, Dec 27, 2021 at 6:35 PM Marco van de Voort via lazarus
wrote:
> The expression seems to be 1 when the top bits are 10, i.e. when it is a
> follow byte of utf8; that is what the comment says, and as far as I
> can see the signedness doesn't matter.
>
> Basically to me that seems to be a
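
The follow-byte test described in the quote above can be tried on single unsigned bytes with a sketch like the one below. IsFollowByte and Utf8LengthSimple are hypothetical helpers written for this illustration; they are not the word-at-a-time code in UTF8LengthFast and do not reproduce the M1 problem.

```pascal
program followbyte;
{$mode objfpc}

// Returns 1 when b has the bit pattern 10xxxxxx (a UTF-8 follow byte), else 0.
// The operand is an unsigned Byte, and the inverted value is cast back to
// Byte so that no higher bits leak into the shift.
function IsFollowByte(b: Byte): Byte;
begin
  Result := (b shr 7) and (Byte(not b) shr 6);
end;

// Counts code points as "bytes minus follow bytes", one byte at a time.
function Utf8LengthSimple(const s: RawByteString): SizeInt;
var
  i: SizeInt;
begin
  Result := Length(s);
  for i := 1 to Length(s) do
    Dec(Result, IsFollowByte(Ord(s[i])));
end;

begin
  // The euro sign is encoded as the three bytes E2 82 AC;
  // the last two are follow bytes.
  WriteLn('Len = ', Utf8LengthSimple(#$E2#$82#$AC));
end.
```

Comparing this byte-wise result with the output of UTF8LengthFast on the same string is one way to check whether the fast path is at fault on a given platform.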
On 12/27/2021 at 4:39 PM, Bart via lazarus wrote:
pn8^ =11100010 //first byte
(pn8^ shr 7) = //<<-- I would have expected that to be 0001 ?
It depends on whether pn8^ is signed or not; for a signed shift it makes sense.
The definition as pint8 (instead of puint8) is
On 27.12.2021 at 13:28, Bart via lazarus wrote:
On Mon, Dec 27, 2021 at 12:44 AM Noel Duffy via lazarus
wrote:
I need some help getting to the root of a problem with incorrect results
on Apple hardware (M1, aarch64) for the function UTF8LengthFast in lazutf8.
Your M1 architecture is
On Mon, Dec 27, 2021 at 3:41 PM Juha Manninen via lazarus
wrote:
> It must be a Big endian / Little endian issue. IIRC it can be adjusted in ARM
> CPUs.
> Why do MacOS and Linux use a different setting there? I have no idea.
On second thought: if the function returns garbage for just a single
On Mon, Dec 27, 2021 at 1:44 AM Noel Duffy via lazarus <
lazarus@lists.lazarus-ide.org> wrote:
> I need some help getting to the root of a problem with incorrect results
> on Apple hardware (M1, aarch64) for the function UTF8LengthFast in lazutf8.
>
> On MacOS, when given a string containing one
On Mon, Dec 27, 2021 at 12:44 AM Noel Duffy via lazarus
wrote:
> I need some help getting to the root of a problem with incorrect results
> on Apple hardware (M1, aarch64) for the function UTF8LengthFast in lazutf8.
Your M1 architecture is BigEndian perhaps?
(I really have no idea)
--
Bart
--
I need some help getting to the root of a problem with incorrect results
on Apple hardware (M1, aarch64) for the function UTF8LengthFast in lazutf8.
On MacOS, when given a string containing one or more UTF8 characters,
UTF8LengthFast returns wildly incorrect results. On Fedora, the function