On 16.01.2020 at 23:22, J. Gareth Moreton wrote:
Hey everyone,
Maybe I'm being a bit pedantic with this, but must we abide by C/C++
standards and go by the name __m128 etc. for the 128-bit data type?
Being as how Pascal tended to go for more readable and BASIC-inspired
names like Integer and Single, might it be better to name them TM128
instead? If not that, then is it possible to add a union-like record
type to the System unit or the inc files that contain all of the
intrinsics?
I agree that the __xxx names for the SIMD types are a bad choice. In
C/C++ they did this to avoid type conflicts (after all, names with two
underscores are "reserved"), but in Pascal we don't have this problem:
System types are hidden by other units that declare types with the same
name, yet they can still be reached by writing System.TheType.
Thus I personally would prefer more Pascal-style names for these as well
(though I don't think that TXXX is good, because no other primitive type
starts with a T and that's what those types essentially are: primitive,
base types). So maybe simply M128 instead of __m128 would be better (and
analogous for the other types). This would be similar to the "new"
integer aliases: UInt8, Int8, Int32, UInt32, etc.
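To illustrate the shadowing rule mentioned above, here is a minimal sketch. It uses the existing System alias UInt8 as a stand-in, since a System.M128 type does not exist; the program name and the local record are made up purely for the example.

```pascal
program shadowdemo;
{$mode objfpc}

type
  // Hypothetical local type that deliberately reuses the name of a
  // System alias. From here on, plain "UInt8" means this record.
  UInt8 = record
    Lo, Hi: Byte;
  end;

var
  Mine: UInt8;          // resolves to the local record
  Orig: System.UInt8;   // the qualified name still reaches the System type

begin
  Mine.Lo := 1;
  Mine.Hi := 2;
  Orig := 200;
  WriteLn('local UInt8:  ', SizeOf(Mine), ' bytes');
  WriteLn('System.UInt8: ', SizeOf(Orig), ' bytes');
end.
```

The same mechanism is what would keep an unprefixed System type name from clashing with user code that already declares a type of that name.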
My vectorcall tests (e.g. tests\test\cg\tvectorcall1.pp) have
something like this:
{$PUSH}
{$CODEALIGN RECORDMIN=16}
{$PACKRECORDS C}
type
  TM128 = record
    case Byte of
      0: (M128_F32: array[0..3] of Single);
      1: (M128_F64: array[0..1] of Double);
  end;
{$POP}
Granted, given that __m128 will be automatically aligned, all of the
codealign directives may not be necessary - for example:
type
  TM128 = record
    case Byte of
      0: (M128_F32: array[0..3] of Single);
      2: (M128_F64: array[0..1] of Double);
      3: (M128_Internal: __m128);
  end;
The main thing I'm thinking about is that it's actually rather
difficult to modify the elements of a variable of type __m128 directly
in C/C++ because of the type being opaque and difficult to typecast
sometimes (some compilers will treat it as an array, others will treat
it as a record type like the above (Visual C++ does this), while
others may not allow access to its elements at all). Often, I might
want to map a 4-component vector with Single-type fields x, y, z and w
to an aligned __m128 type, or Double-type fields Re and Im when
dealing with complex numbers. That way, I can read from and write to
them outside of intrinsic calls.
I suppose I'm suggesting we introduce something more usable than what
C has so people can actually use intrinsics more easily.
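As a rough sketch of the mapping described above, the following variant record overlays named x/y/z/w and Re/Im fields on the same 16 bytes. It compiles without any SIMD types; the name TVec128 and its field names are assumptions made for this example, and the alignment directives are simply copied from the test snippet quoted earlier.

```pascal
program vecrecord;
{$mode objfpc}

{$PUSH}
{$CODEALIGN RECORDMIN=16}
{$PACKRECORDS C}
type
  // Hypothetical union-like record: every variant shares the same storage.
  TVec128 = record
    case Byte of
      0: (F32: array[0..3] of Single);
      1: (X, Y, Z, W: Single);          // 4-component vector view
      2: (F64: array[0..1] of Double);
      3: (Re, Im: Double);              // complex-number view
  end;
{$POP}

var
  V: TVec128;

begin
  // Write through the named fields, read back through the array view.
  V.X := 1.0;
  V.Y := 2.0;
  V.Z := 3.0;
  V.W := 4.0;
  WriteLn('F32[3] = ', V.F32[3]:0:1);

  V.Re := 0.5;
  V.Im := -1.5;
  WriteLn('F64[1] = ', V.F64[1]:0:1);

  WriteLn('SizeOf = ', SizeOf(TVec128));
end.
```

Adding a further variant of type __m128 to such a record, as in the second snippet above, would then give intrinsics and ordinary field access a common storage location.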
I don't know the plans of Florian, but I would very well imagine that
code like the following is going to be valid:
=== code begin ===
var
  i: array[0..3] of LongInt;
  m: __m128i;
begin
  m := i;
  // or
  i := m;
end.
=== code end ===
With that working and type helpers one can implement the following:
=== code begin ===
type
  TM128Helper = type helper for __m128
  public type
    TLongIntIndex = 0..3;
  private type
    TLongIntArray = array[TLongIntIndex] of LongInt;
  private
    procedure SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt); inline;
    function GetAsLongInt(aIndex: TLongIntIndex): LongInt; inline;
  public
    property AsLongInt[Index: TLongIntIndex]: LongInt read GetAsLongInt
      write SetAsLongInt;
  end;
//
procedure TM128Helper.SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt);
begin
  TLongIntArray(Self)[aIndex] := aValue;
end;

function TM128Helper.GetAsLongInt(aIndex: TLongIntIndex): LongInt;
begin
  Result := TLongIntArray(Self)[aIndex];
end;
=== code end ===
This would allow moving those conversions out of compiler magic and
into the runtime library.
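The helper pattern itself can be tried without the new SIMD types. The sketch below applies the same idea (a typecast of Self to an array, exposed through an indexed property) to a plain LongWord; the helper name TLongWordBytesHelper and the AsByte property are invented for this example and are not part of the RTL.

```pascal
program helperdemo;
{$mode objfpc}
{$modeswitch typehelpers}

type
  // Hypothetical helper: byte-wise access to a LongWord, mirroring the
  // AsLongInt property sketched for __m128 above.
  TLongWordBytesHelper = type helper for LongWord
  public type
    TByteIndex = 0..3;
  private type
    TByteArray = array[TByteIndex] of Byte;
  private
    procedure SetAsByte(aIndex: TByteIndex; aValue: Byte); inline;
    function GetAsByte(aIndex: TByteIndex): Byte; inline;
  public
    property AsByte[Index: TByteIndex]: Byte read GetAsByte write SetAsByte;
  end;

procedure TLongWordBytesHelper.SetAsByte(aIndex: TByteIndex; aValue: Byte);
begin
  // Self and TByteArray are both 4 bytes, so a variable typecast works.
  TByteArray(Self)[aIndex] := aValue;
end;

function TLongWordBytesHelper.GetAsByte(aIndex: TByteIndex): Byte;
begin
  Result := TByteArray(Self)[aIndex];
end;

var
  V: LongWord;

begin
  V := 0;
  V.AsByte[1] := $AB;       // writes one byte inside V
  WriteLn(V.AsByte[1]);     // reads the same byte back
end.
```

Which numeric value V ends up with depends on the byte order of the target, which is why the example only reads back the byte it wrote.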
In fact quite a bit of it is already working now, though the generated
assembly is not yet optimal (but the feature is still work in progress
after all):
=== code begin ===
program tmmtest;

{$mode objfpc}
{$modeswitch typehelpers}

type
  TM128Helper = type helper for __m128
  public type
    TLongIntIndex = 0..3;
  private type
    TLongIntArray = array[0..3] of LongInt;
  private
    procedure SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt);
      inline; vectorcall;
    function GetAsLongInt(aIndex: TLongIntIndex): LongInt; inline;
      vectorcall;
  public
    property AsLongInt[Index: TLongIntIndex]: LongInt read GetAsLongInt
      write SetAsLongInt;
  end;

procedure TM128Helper.SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt);
  vectorcall;
var
  arr: TLongIntArray;
begin
  x86_movups(@arr[0], Self);
  arr[aIndex] := aValue;
  // triggers internal error 200310081
  //Self := x86_movups(@arr[0]);
end;

function TM128Helper.GetAsLongInt(aIndex: TLongIntIndex): LongInt;
  vectorcall;
var
  arr: TLongIntArray;
begin
  x86_movups(@arr[0], Self);
  Result := arr[aIndex];
end;

procedure Test;
var
  m: __m128;
  i: LongInt;
begin
  m.AsLongInt[0] := 42;
  i := m.AsLongInt[0];
end;

begin
  Test;
end.
=== code end ===
The generated assembly for Test is this:
=== code begin ===
# Var m located at rbp-16, size=OS_M128
# Var i located at rbp-20, size=OS_S32
# [42] m.AsLongInt[0] := 42;
leaq -36(%rbp),%rax
movdqa -16(%rbp),%xmm0
movups %xmm0,(%rax)
movl $42,-36(%rbp)
# [43] i := m.AsLongInt[0];
leaq -36(%rbp),%rax
movdqa -16(%rbp),%xmm0
movups %xmm0,(%rax)
movl -36(%rbp),%eax
movl %eax,-20(%rbp)
# [44] end;
=== code end ===
Regards,
Sven