Don:
> I don't think that's messy at all. I can't see much difference between
> special support for float[4] versus float4. It's better if the code can
> take advantage of hardware without specific support. Bear in mind that
> SSE/SSE2 is a temporary situation. AVX provides for much longer arrays
> of vectors; and it's extensible. You'd end up needing to keep adding on
> special types whenever a new CPU comes out.
>
> Note that the fundamental concept which is missing from the C virtual
> machine is that all modern machines can efficiently perform operations
> on arrays of built-in types of length 2^n, for some small value of n.
> We need to get this into the language abstraction. Not follow C++ in
> hacking a few extra special types onto the old, deficient C model. And I
> think D is actually in a position to do this.
>
> float[4] would be a greatly superior option if it could be done.
> The key requirements are:
> (1) need to specify that static arrays are passed by value.
> (2) need to keep stack aligned to 16.
> The good news is that both of these appear to be done on DMD2-Mac!

I have quoted it all because I like the experience and ideas you bring to D.

But how can operations like shuffling be expressed? And how do you map a sqrt onto just the first item of such 4 floats, or onto all of them, etc.? What syntax can be used?

Regarding the array operations already implemented, are there ways to force the compiler to inline such code/operations?

Bye,
bearophile
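
P.S. To make the question concrete, here is a minimal sketch of what I mean, using only plain D2 static arrays and the existing array-operation syntax. The variable names are made up for illustration. The shuffle and the per-item sqrt are written out by hand precisely because I don't know of a built-in syntax for them, and nothing here guarantees that the compiler emits SSE instructions for any of it.

```d
import std.math : sqrt;
import std.stdio : writeln;

void main()
{
    float[4] a = [1.0f, 4.0f, 9.0f, 16.0f];
    float[4] b = [2.0f, 2.0f, 2.0f, 2.0f];

    // Existing array operation: element-wise multiply over all 4 items.
    float[4] c;
    c[] = a[] * b[];

    // sqrt on all 4 items: no array-op form that I know of, so an explicit loop.
    float[4] r;
    foreach (i, x; a)
        r[i] = sqrt(x);

    // sqrt on the first item only, the other three copied unchanged.
    float[4] s = a;
    s[0] = sqrt(a[0]);

    // A "shuffle" (here: reverse the items), spelled out index by index.
    float[4] t = [a[3], a[2], a[1], a[0]];

    writeln(c);
    writeln(r);
    writeln(s);
    writeln(t);
}
```

The last three cases are the ones I'd like a compact syntax for, one that the compiler can map directly onto the hardware operations.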
