Re: [fpc-devel] The new XMM intrinsics

Sven Barth via fpc-devel Sat, 18 Jan 2020 03:51:40 -0800

On 16.01.2020 at 23:22, J. Gareth Moreton wrote:

Hey everyone,
Maybe I'm being a bit pedantic with this, but must we abide by C/C++ standards and go by the name __m128 etc. for the 128-bit data type? Seeing as Pascal has tended to go for more readable, BASIC-inspired names like Integer and Single, might it be better to name them TM128 instead? If not that, then is it possible to add a union-like record type to the System unit or to the inc files that contain all of the intrinsics?

I agree that the __xxx names for the SIMD types are a bad choice. In C/C++ they did this to avoid type conflicts (after all, types with two underscores are "reserved"), but in Pascal we don't have this problem: the System types will be hidden by other units that declare similar types, yet can still be reached by writing System.TheType.

Thus I personally would prefer more Pascal-style names for these as well (though I don't think that TXXX is good, because no other primitive type starts with a T, and that's what those types essentially are: primitive, base types). So maybe simply M128 instead of __m128 would be better (and analogously for the other types). This would be similar to the "new" integer aliases: UInt8, Int8, Int32, UInt32, etc.

My vectorcall tests (e.g. tests\test\cg\tvectorcall1.pp) have something like this:
{$PUSH}
{$CODEALIGN RECORDMIN=16}
{$PACKRECORDS C}
type
  TM128 = record
    case Byte of
      0: (M128_F32: array[0..3] of Single);
      1: (M128_F64: array[0..1] of Double);
  end;
{$POP}
Granted, given that __m128 will be automatically aligned, all of the codealign directives may not be necessary - for example:
type
  TM128 = record
    case Byte of
      0: (M128_F32: array[0..3] of Single);
      2: (M128_F64: array[0..1] of Double);
      3: (M128_Internal: __m128);
  end;
The main thing I'm thinking about is that it's actually rather difficult to modify the elements of a variable of type __m128 directly in C/C++, because the type is opaque and sometimes difficult to typecast (some compilers will treat it as an array, others will treat it as a record type like the above (Visual C++ does this), while others may not allow access to its elements at all). Often, I might want to map a 4-component vector with Single-type fields x, y, z and w to an aligned __m128 type, or Double-type fields Re and Im when dealing with complex numbers. That way, I can read from and write to them outside of intrinsic calls.
I suppose I'm suggesting we introduce something more usable than what C has, so people can actually use intrinsics more easily.
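
To make the mapping described above concrete, here is a minimal sketch in plain Pascal that does not depend on __m128 at all. The record name TVec4 and its field names are hypothetical and chosen only for illustration; the point is simply that one variant record can expose the same 16 bytes as named Single fields, as an array of Single, or as a Re/Im pair of Doubles. Alignment to 16 bytes is not enforced here.

```pascal
program vec4overlay;

{$mode objfpc}

type
  // Hypothetical record, not part of the RTL: three views of the same 16 bytes.
  TVec4 = record
    case Byte of
      0: (X, Y, Z, W: Single);           // named vector components
      1: (F32: array[0..3] of Single);   // indexed access to the same data
      2: (Re, Im: Double);               // complex-number view
  end;

var
  v: TVec4;
begin
  v.X := 1.0;
  v.Y := 2.0;
  v.Z := 3.0;
  v.W := 4.0;
  // The array view aliases the named fields, so F32[2] reads back Z.
  WriteLn(v.F32[2]:0:1);
  // All variants occupy 16 bytes, the size of one XMM register.
  WriteLn(SizeOf(TVec4));
end.
```

A further variant holding the intrinsic type itself, as in the second record above, would be the piece that ties such a layout to the new intrinsics.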

I don't know the plans of Florian, but I would very well imagine that code like the following is going to be valid:


=== code begin ===

var
  i: array[0..3] of LongInt;
  m: __m128i;
begin
  m := i;
  // or
  i := m;
end.

=== code end ===
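
Until such assignment compatibility exists, the same round trip can be written out by hand with Move. The sketch below is only an illustration of that idea: TRaw128 is a hypothetical 16-byte record standing in for __m128i, so that it compiles without the new intrinsic types.

```pascal
program movedemo;

{$mode objfpc}

type
  // Hypothetical stand-in for __m128i: 16 bytes of plain storage.
  TRaw128 = record
    Bytes: array[0..15] of Byte;
  end;

var
  i, j: array[0..3] of LongInt;
  m: TRaw128;
  k: Integer;
begin
  for k := 0 to 3 do
    i[k] := k + 1;
  // "m := i" spelled out as a raw 16-byte copy
  Move(i, m, SizeOf(m));
  // "i := m" in the other direction, into a second array
  Move(m, j, SizeOf(j));
  for k := 0 to 3 do
    WriteLn(j[k]);
end.
```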

With that working and type helpers one can implement the following:

=== code begin ===

type
  TM128Helper = type helper for __m128
  public type
    TLongIntIndex = 0..3;
  private type
    TLongIntArray = array[TLongIntIndex] of LongInt;
  private
    procedure SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt); inline;
    function GetAsLongInt(aIndex: TLongIntIndex): LongInt; inline;
  public
    property AsLongInt[Index: TLongIntIndex]: LongInt read GetAsLongInt write SetAsLongInt;
  end;

//

procedure TM128Helper.SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt);
begin
  TLongIntArray(Self)[aIndex] := aValue;
end;

function TM128Helper.GetAsLongInt(aIndex: TLongIntIndex): LongInt;
begin
  Result := TLongIntArray(Self)[aIndex];
end;

=== code end ===

This would allow those conversions to be moved out of compiler magic and into the runtime library.
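
The helper pattern itself can be tried on an ordinary type. The following is a sketch under that assumption: TRaw128 is a hypothetical 16-byte record used in place of __m128, which means a record helper is needed instead of a type helper, and the typecast works only because both types have the same size.

```pascal
program helperdemo;

{$mode objfpc}
{$modeswitch advancedrecords}

type
  // Hypothetical stand-in for __m128: 16 bytes of plain storage.
  TRaw128 = record
    Bytes: array[0..15] of Byte;
  end;

  TRaw128Helper = record helper for TRaw128
  public type
    TLongIntIndex = 0..3;
  private type
    TLongIntArray = array[TLongIntIndex] of LongInt;
  private
    procedure SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt);
    function GetAsLongInt(aIndex: TLongIntIndex): LongInt;
  public
    property AsLongInt[aIndex: TLongIntIndex]: LongInt read GetAsLongInt write SetAsLongInt;
  end;

procedure TRaw128Helper.SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt);
begin
  // Variable typecast: reinterpret the 16 bytes as four LongInts.
  TLongIntArray(Self)[aIndex] := aValue;
end;

function TRaw128Helper.GetAsLongInt(aIndex: TLongIntIndex): LongInt;
begin
  Result := TLongIntArray(Self)[aIndex];
end;

var
  m: TRaw128;
begin
  FillChar(m, SizeOf(m), 0);
  m.AsLongInt[2] := 42;
  WriteLn(m.AsLongInt[2]);
end.
```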

In fact, quite a bit of it is already working now, though the generated assembly is not yet optimal (but the feature is still a work in progress after all):


=== code begin ===

program tmmtest;

{$mode objfpc}
{$modeswitch typehelpers}

type
  TM128Helper = type helper for __m128
  public type
    TLongIntIndex = 0..3;
  private type
    TLongIntArray = array[0..3] of LongInt;
  private
    procedure SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt); inline; vectorcall;
    function GetAsLongInt(aIndex: TLongIntIndex): LongInt; inline; vectorcall;
  public
    property AsLongInt[Index: TLongIntIndex]: LongInt read GetAsLongInt write SetAsLongInt;
  end;

procedure TM128Helper.SetAsLongInt(aIndex: TLongIntIndex; aValue: LongInt); vectorcall;
var
  arr: TLongIntArray;
begin
  x86_movups(@arr[0], Self);
  arr[aIndex] := aValue;
  // triggers internal error 200310081
  //Self := x86_movups(@arr[0]);
end;

function TM128Helper.GetAsLongInt(aIndex: TLongIntIndex): LongInt; vectorcall;
var
  arr: TLongIntArray;
begin
  x86_movups(@arr[0], Self);
  Result := arr[aIndex];
end;

procedure Test;
var
  m: __m128;
  i: LongInt;
begin
  m.AsLongInt[0] := 42;
  i := m.AsLongInt[0];
end;

begin
  Test;
end.

=== code end ===

The generated assembly for Test is this:

=== code begin ===

# Var m located at rbp-16, size=OS_M128
# Var i located at rbp-20, size=OS_S32
# [42] m.AsLongInt[0] := 42;
    leaq    -36(%rbp),%rax
    movdqa    -16(%rbp),%xmm0
    movups    %xmm0,(%rax)
    movl    $42,-36(%rbp)
# [43] i := m.AsLongInt[0];
    leaq    -36(%rbp),%rax
    movdqa    -16(%rbp),%xmm0
    movups    %xmm0,(%rax)
    movl    -36(%rbp),%eax
    movl    %eax,-20(%rbp)
# [44] end;

=== code end ===

Regards,
Sven
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
