Re: Does dmd have SSE intrinsics?

Jeremie Pelletier Mon, 21 Sep 2009 15:35:15 -0700

bearophile wrote:

Don:

(1) They don't take advantage of fixed-length arrays. In particular, operations on float[4] should be a single SSE instruction (no function call, no loop, nothing). This will make a huge difference to game and graphics programmers, I believe.

[...]

It's issue (1) which is the killer.


In my answer I have forgotten to say another small thing.

The std.gc.malloc() of D returns pointers aligned to 16 bytes (though I would
like a second argument on that GC malloc to specify the alignment, which could
save some memory when the alignment isn't necessary), while I think
std.c.stdlib.malloc() doesn't guarantee pointers aligned to 16 bytes.

In the following code if you want to implement the last line with one vector 
instruction then a and b arrays have to be aligned to 16 bytes. I think that 
currently LDC doesn't align a and b to 16 bytes.

float[4] a = [1.f, 2., 3., 4.];
float[4] b[] = 10f;
float[4] c[] = a[] + b[];

So you may need a syntax like the following, that's not handy:

align(16) float[4] a = [1.f, 2., 3., 4.];
align(16) float[4] b[] = 10f;
align(16) float[4] c[] = a[] + b[];

A possible solution is to automatically align to 16 (by default, but it can be 
changed to save stack space in specific situations) all static arrays allocated 
on the stack too :-)
A note: in future probably CPU vector instructions will relax their alignment 
requirements... it's already happening.

Bye,
bearophile

That 16-byte alignment is a restriction of the GC's current usage of bit fields. Since every bit in the field indexes a single 16-byte block, a simple shift of 4 bits to the right translates a pointer into its index in the bit field. You could align on 4-byte boundaries, but at the cost of quadrupling the size of the bit fields (one bit per 4-byte block instead of one per 16-byte block), and possibly having slower collection runs.
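
To make the shift concrete, here is a minimal sketch in D. It is not the actual GC source: bitIndex, blockShift and poolBase are names I made up for illustration, and I assume the index is taken relative to the start of the pool.

```d
import std.stdio;

enum uint blockShift = 4;   // log2(16): one bit covers one 16-byte block

// Hypothetical helper, not the real GC code: which bit describes address p
// inside a pool that starts at poolBase?
size_t bitIndex(size_t poolBase, size_t p)
{
    return (p - poolBase) >> blockShift;
}

void main()
{
    size_t poolBase = 0x10000;
    writeln(bitIndex(poolBase, poolBase));         // first block
    writeln(bitIndex(poolBase, poolBase + 0x10));  // second block
    writeln(bitIndex(poolBase, poolBase + 0x4F));  // interior pointer into the fifth block
}
```

Any pointer into a block, not just the block's first byte, lands on the same bit, because the shift throws away the low 4 bits.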

Doesn't SSE have aligned and unaligned versions of its move instructions, like MOVAPS and MOVUPS?
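
As far as I know, MOVAPS faults on a memory operand that isn't on a 16-byte boundary, while MOVUPS accepts any address. So a compiler or library could test the pointer and pick the right form. Below is a rough sketch of that check in D. It only reports which instruction would be legal and emits no SSE code; isAligned16 and moveFor are hypothetical names of mine, not anything in dmd or LDC.

```d
import std.stdio;

// True when p sits on a 16-byte boundary, the precondition for MOVAPS.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

// Hypothetical dispatcher: names the move a vectorised loop could pick for p.
string moveFor(const(void)* p)
{
    return isAligned16(p) ? "MOVAPS" : "MOVUPS";
}

void main()
{
    // Over-allocate, then round up to the next 16-byte boundary by hand.
    auto raw = new ubyte[](64 + 15);
    auto base = cast(float*)((cast(size_t) raw.ptr + 15) & ~cast(size_t) 15);

    writeln(moveFor(base));      // aligned start of the buffer
    writeln(moveFor(base + 1));  // one float (4 bytes) past the boundary
}
```

The run-time test is a single AND against the low 4 bits, so it is cheap next to the loop it guards.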
