Re: primitive vector types

Denis Koroskin Fri, 20 Feb 2009 00:45:20 -0800

On Fri, 20 Feb 2009 08:55:16 +0300, Denis Koroskin <[email protected]> wrote:

On Fri, 20 Feb 2009 06:22:40 +0300, Andrei Alexandrescu <[email protected]> wrote:
Denis Koroskin wrote:
On Thu, 19 Feb 2009 23:05:34 +0300, Andrei Alexandrescu <[email protected]> wrote:
Denis Koroskin wrote:
On Thu, 19 Feb 2009 22:25:04 +0300, Mattias Holm <[email protected]> wrote:
Since (SIMD) vectors are so common and every reasonable system supports them in one way or another (and scalar emulation of this is rather simple), why not have support for this in D directly?
Yes, the array operations are nice (and one of the main reasons why I like D :) ), but they have the problem that an array of floats must be aligned on float boundaries and not vector boundaries. In my mind vectors are a primitive data type that should be exposed by the programming language.
Something OpenCL-like:

    float4 vec;
    vec.xyzw = {1.0,1.0, 1.0, 1.0}; // assignment
    vec.xyzw = vec.wyxz; // permutation
    vec[i] = 1.0; // indexing
And then we can easily imagine some extra nice features to have with respect to operators:
vec ^ vec2; // 3d cross product for float vectors, for int vectors xor
Has this been discussed before?

/ Mattias
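
To make the permutation and cross-product idea above concrete, here is a minimal scalar sketch in C (not D). The names `float4s`, `swizzle_yzx`, `swizzle_zxy` and `cross3` are hypothetical and appear nowhere in the thread; the sketch only shows how a 3D cross product falls out of two permutations per operand, which is the shape a hardware shuffle-based version would also take.

```c
#include <assert.h>

/* Hypothetical scalar stand-in for the proposed float4 type. */
typedef struct { float x, y, z, w; } float4s;

static_assert(sizeof(float4s) == 4 * sizeof(float),
              "four floats, no padding");

/* Permutation v.yzxw: rotate the first three lanes, keep w. */
static float4s swizzle_yzx(float4s v)
{
    float4s r = { v.y, v.z, v.x, v.w };
    return r;
}

/* Permutation v.zxyw: rotate the other way, keep w. */
static float4s swizzle_zxy(float4s v)
{
    float4s r = { v.z, v.x, v.y, v.w };
    return r;
}

/* 3D cross product as a.yzx * b.zxy - a.zxy * b.yzx; w is set to 0. */
static float4s cross3(float4s a, float4s b)
{
    float4s ayzx = swizzle_yzx(a), azxy = swizzle_zxy(a);
    float4s byzx = swizzle_yzx(b), bzxy = swizzle_zxy(b);
    float4s r;
    r.x = ayzx.x * bzxy.x - azxy.x * byzx.x;
    r.y = ayzx.y * bzxy.y - azxy.y * byzx.y;
    r.z = ayzx.z * bzxy.z - azxy.z * byzx.z;
    r.w = 0.0f;
    return r;
}
```

Calling `cross3` on the unit x and unit y vectors yields the unit z vector, which is a quick sanity check for the lane order.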
 I don't see any reason why float4 can't be made a library type.
Yah, I was thinking the same:

struct float4
{
     __align(16) float[4] data; // right syntax and value?
     alias data this;
}
This looks like something that should go into std.matrix pronto. It even has value semantics even though fixed arrays don't :o/.
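
For comparison only, the same idea can be sketched in C11, where the alignment request is spelled `alignas(16)`. This is an illustration under assumptions, not the D syntax being asked about: the struct mirrors the `float4` above, and `is_vector_aligned` is a hypothetical helper added for the check.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical C11 counterpart of the D struct above. */
typedef struct {
    alignas(16) float data[4];
} float4;

/* The member's alignment propagates to the whole struct. */
static_assert(alignof(float4) == 16, "float4 must sit on a 16-byte boundary");
static_assert(sizeof(float4) == 16, "no padding beyond the four floats");

/* Returns nonzero when p is on a 16-byte boundary, which is what
   aligned SSE loads and stores require. */
static int is_vector_aligned(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}
```

Because `sizeof(float4)` is a multiple of its alignment, every element of a `float4` array is also 16-byte aligned, which is the property a plain array of floats does not guarantee.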
Andrei
That would be great. If float4 makes its way into D, I'll share our blazing fast math code with the community (most common operations on vectors, matrices, quaternions etc). It is written entirely in SSE (intrinsics, not asm; there is a problem with inlining asm in D, IIRC. Can anyone elaborate on this?) and *very* fast. According to our benchmarks, that's the best we can squeeze out of the hardware.
I know LLVM has support for a *very* wide range of intrinsics:
http://www.cs.ucla.edu/classes/spring08/cs259/llvm-2.2/include/llvm/Intrinsics.gen
Hopefully they will get into LDC (and DMD *hint* Walter *hint*) very soon.
Put me down for that. What do I need to do?

Andrei
Convince Walter to add a float4 type and some intrinsics to DMD (I'll post a list of those we use later); LDC will follow, I believe. There should be some type that would be treated specially. After all, intrinsics have function signatures and those should specify some concrete types.


Here is some nice documentation on the MMX, SSE and SSE2 intrinsics:
http://msdn.microsoft.com/en-us/library/y0dh78ez(VS.80).aspx

Here are some quick statistics on which intrinsics are used in our code and how many times. Note that this doesn't directly map to how many times each is *actually* used in user code.

This info may give Walter some idea of priorities (those intrinsics that aren't often used may be given lower priority, for example).


Arithmetic Operations (Floating-Point SSE2 Intrinsics)
http://msdn.microsoft.com/en-us/library/708ya3be(VS.80).aspx
_mm_add_ss - 2
_mm_add_ps - 48
_mm_sub_ss - 4
_mm_sub_ps - 24
_mm_mul_ss - 2
_mm_mul_ps - 100
_mm_div_ss - 0
_mm_div_ps - 1
_mm_sqrt_ss - 0
_mm_sqrt_ps - 0
_mm_rcp_ss - 1
_mm_rcp_ps - 0
_mm_rsqrt_ss - 0
_mm_rsqrt_ps - 1
_mm_min_ss - 0
_mm_min_ps - 1
_mm_max_ss - 0
_mm_max_ps - 1

Store Operations (SSE)
http://msdn.microsoft.com/en-us/library/ybhzf6dk(VS.80).aspx
_mm_store_ss - 1
_mm_store1_ps - 0
_mm_store_ps1 - 0
_mm_store_ps - 0
_mm_storeu_ps - 0
_mm_storer_ps - 0
_mm_move_ss - 2

Set Operations (SSE)
http://msdn.microsoft.com/en-us/library/wbzwdy6a(VS.80).aspx
_mm_set_ss - 0
_mm_set1_ps - 0
_mm_set_ps1 - 19
_mm_set_ps - 45
_mm_setr_ps - 0
_mm_setzero_ps - 2

Logical Operations (SSE)
http://msdn.microsoft.com/en-us/library/9759as73(VS.80).aspx
_mm_and_ps - 2
_mm_andnot_ps - 0
_mm_or_ps - 0
_mm_xor_ps - 3

Miscellaneous Instructions That Use Streaming SIMD Extensions
http://msdn.microsoft.com/en-us/library/dzs626wx.aspx
_mm_shuffle_ps - 124
_mm_shuffle_pi16 - 0
_mm_unpackhi_ps - 0
_mm_unpacklo_ps - 0
_mm_loadh_pi - 0
_mm_storeh_pi - 0
_mm_movehl_ps - 0
_mm_movelh_ps - 0
_mm_loadl_pi - 0
_mm_storel_pi - 0
_mm_movemask_ps - 0
_mm_getcsr - 0
_mm_setcsr - 0
_mm_extract_si64 - 0
_mm_extracti_si64 - 0
_mm_insert_si64 - 0
_mm_inserti_si64 - 0

Comparison Intrinsics (SSE)
http://msdn.microsoft.com/en-us/library/w8kez9sf(VS.80).aspx
Not used

Conversion Operations (SSE)
http://msdn.microsoft.com/en-us/library/0d4dtzhb(VS.80).aspx
Not used

Macros

_MM_SHUFFLE - 100 - #define _MM_SHUFFLE(fp3,fp2,fp1,fp0) (((fp3) << 6) | ((fp2) << 4) | ((fp1) << 2) | ((fp0)))
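
Since _mm_shuffle_ps and _MM_SHUFFLE dominate the counts above, a small scalar model in C may help show what the pair does. This is a sketch under assumptions, not our actual code: `SHUFFLE_IMM` is a local copy of the macro (renamed so it cannot clash with <xmmintrin.h>), and `vec4` and `shuffle_ps_scalar` are hypothetical names that emulate the lane selection without needing SSE hardware.

```c
#include <assert.h>

/* Local copy of the macro quoted above: packs four 2-bit lane
   indices into one 8-bit immediate, fp0 in the lowest bits. */
#define SHUFFLE_IMM(fp3, fp2, fp1, fp0) \
    (((fp3) << 6) | ((fp2) << 4) | ((fp1) << 2) | (fp0))

/* Hypothetical plain-C stand-in for a 4-float SSE register. */
typedef struct { float lane[4]; } vec4;

/* Scalar model of _mm_shuffle_ps(a, b, imm): the two low result
   lanes are picked from a, the two high result lanes from b. */
static vec4 shuffle_ps_scalar(vec4 a, vec4 b, unsigned imm)
{
    vec4 r;
    assert(imm <= 0xFFu); /* the immediate is an 8-bit value */
    r.lane[0] = a.lane[imm & 3u];
    r.lane[1] = a.lane[(imm >> 2) & 3u];
    r.lane[2] = b.lane[(imm >> 4) & 3u];
    r.lane[3] = b.lane[(imm >> 6) & 3u];
    return r;
}
```

Passing the same vector as both operands turns the shuffle into a general permutation: `SHUFFLE_IMM(3, 2, 1, 0)` leaves the lanes in place and `SHUFFLE_IMM(0, 1, 2, 3)` reverses them, which is roughly what a `vec.wzyx`-style swizzle would compile down to.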
