On 30/12/2021 17:16, Florian Klämpfl via fpc-devel wrote:
On 30.12.21 at 14:52, Jonas Maebe via fpc-devel wrote:
On 29/12/2021 00:48, Martin Frb via fpc-devel wrote:
I don't have an M1 myself, but according to the data from the thread on the Lazarus mailing list, there is a bug in the 3.3.1 asm generator for M1.

var pn8: pint8; // pointer to a signed byte

In the expression below, ...(not pn8^)...

"pn8^" is loaded into w0 and sign-extended. From this point onwards, operations on the value should be 32 bits wide (the value has been extended, and the full 32 bits are used later),
but "not" only affects the lowest 8 bits.
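
To make the difference concrete, here is a minimal C model of the two behaviours described above. It is written in C rather than Pascal so that the register-level operations are explicit and do not depend on the compiler under discussion. All helper names are hypothetical. Modelling the faulty "not" as an XOR with 0xFF is an assumption based only on the description that the lowest 8 bits are affected; the actual AArch64 instruction was not posted.

```c
#include <stdint.h>

/* Sign-extend a byte to 32 bits, as a sign-extending byte load would. */
static uint32_t sext8(uint8_t b) {
    return (b & 0x80u) ? (0xFFFFFF00u | b) : (uint32_t)b;
}

/* Assumed model of the reported fault: only bits 0..7 are inverted. */
static uint32_t not_low8_only(uint32_t w) {
    return w ^ 0xFFu;
}

/* Full-width inversion of the 32-bit register, as "mvn w0,w0" performs. */
static uint32_t not_full32(uint32_t w) {
    return ~w;
}

/* The "(not x) shr 6" operand, using a 32-bit logical shift. */
static uint32_t term_faulty(uint8_t b) {
    return not_low8_only(sext8(b)) >> 6;
}

static uint32_t term_expected(uint8_t b) {
    return not_full32(sext8(b)) >> 6;
}
```

For the byte 0x80, term_expected gives 1, while term_faulty gives 0x03FFFFFD because the upper 24 bits produced by the sign extension are never inverted.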

Apparently in 3.2.2 (or was it 3.2.0?) there was
mvn w0,w0

If someone can confirm this....

It's probably caused by commit c90616944d3bde7b36e924d27a0790195d61f95c (Florian).


Isn't the sign extension during the load wrong? Martin didn't post the whole assembly code, but I would expect that 3.2.2 produced a uxtb instruction afterwards, which hid the problem.

The code is from the "old" LazUtils UTF8LengthFast.
"old" => about a week ago, since it was recently changed to use uint8.


function UTF8LengthFast(p: PChar; ByteCount: PtrInt): PtrInt;
var
  pnx: PPtrInt absolute p; // To get contents of text in PtrInt blocks. x refers to 32 or 64 bits
  pn8: pint8 absolute pnx; // To read text as Int8 in the initial and final loops
begin
....
    Result += (pn8^ shr 7) and ((not pn8^) shr 6);

The problem concerns the "((not pn8^) shr 6)" part.
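
As a rough illustration of what this line is meant to compute, the C sketch below models the expression on a sign-extended 32-bit value with logical shifts. The term evaluates to 1 exactly for UTF-8 continuation bytes (10xxxxxx) and to 0 otherwise. The function names are hypothetical, and the final "byte count minus continuation bytes" step is an assumption, since that part of the routine is elided above.

```c
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a byte to 32 bits, modelling a read through an Int8 pointer. */
static uint32_t sext8(uint8_t b) {
    return (b & 0x80u) ? (0xFFFFFF00u | b) : (uint32_t)b;
}

/* (x shr 7) and ((not x) shr 6), evaluated on 32 bits with logical shifts. */
static uint32_t continuation_term(uint8_t b) {
    uint32_t x = sext8(b);
    return (x >> 7) & (~x >> 6);
}

/* Assumed use: code points = bytes minus continuation bytes (valid UTF-8). */
static size_t utf8_length_model(const char *s, size_t byte_count) {
    size_t continuation = 0;
    for (size_t i = 0; i < byte_count; i++) {
        continuation += continuation_term((uint8_t)s[i]);
    }
    return byte_count - continuation;
}
```

With this model, the 8-byte UTF-8 encoding of "Klämpfl" contains one continuation byte (0xA4), so utf8_length_model reports 7.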


For x86 the "not" is applied to the byte only, then the result is sign-extended, then shifted
(interesting that the operand of a logical shift is sign-extended).

Project1.pas:276                          Result += (pn8^ shr 7) and ((not pn8^) shr 6);
0000000100001B30 488b45f8                 mov    -0x8(%rbp),%rax
0000000100001B34 8a00                     mov    (%rax),%al
0000000100001B36 f6d0                     not    %al
0000000100001B38 0fbec0                   movsbl %al,%eax
0000000100001B3B c1e806                   shr    $0x6,%eax
0000000100001B3E 488b55f8                 mov    -0x8(%rbp),%rdx
0000000100001B42 0fbe12                   movsbl (%rdx),%edx
0000000100001B45 c1ea07                   shr    $0x7,%edx
0000000100001B48 21d0                     and    %edx,%eax
0000000100001B4A 4863c0                   movslq %eax,%rax
0000000100001B4D 480345e8                 add    -0x18(%rbp),%rax
0000000100001B51 488945e8                 mov    %rax,-0x18(%rbp)
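
The listing can be followed step by step with a small C model that uses one variable per register. This is only a sketch: the function names are hypothetical, and the trailing movslq/add into Result is left out. It also shows that inverting the byte first and then sign-extending gives the same 32-bit value as sign-extending first and then inverting all 32 bits.

```c
#include <stdint.h>

/* movsbl: sign-extend an 8-bit value into a 32-bit register. */
static uint32_t movsbl(uint8_t b) {
    return (b & 0x80u) ? (0xFFFFFF00u | b) : (uint32_t)b;
}

/* Order used in the x86 listing: 8-bit "not", then sign extension. */
static uint32_t not_then_extend(uint8_t b) {
    return movsbl((uint8_t)~b);
}

/* Alternative order: sign extension, then a full 32-bit "not". */
static uint32_t extend_then_not(uint8_t b) {
    return ~movsbl(b);
}

/* Value left in %eax after "and %edx,%eax" for the byte at (%rax). */
static uint32_t x86_trace(uint8_t mem) {
    uint32_t eax = not_then_extend(mem); /* mov, not %al, movsbl %al,%eax */
    eax >>= 6;                           /* shr $0x6,%eax                 */
    uint32_t edx = movsbl(mem);          /* movsbl (%rdx),%edx            */
    edx >>= 7;                           /* shr $0x7,%edx                 */
    eax &= edx;                          /* and %edx,%eax                 */
    return eax;
}
```

The high bits that the sign extension leaves in either operand after the logical shift are cancelled by the "and", so x86_trace returns 1 for bytes 0x80..0xBF and 0 for all others.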


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
