On 06.01.2012 02:42, Manu wrote:
> I like v128, or something like that. I'll use that for the sake of this
> document. I think it is preferable to float4 for a few reasons...

I do not agree at all. That way, the type loses all semantic information. This not only breaks with the C/C++/D philosophy but actually *hides* an essential hardware detail of Intel SSE:

An SSE register is 128 bits wide, but the processor actually cares about the semantics of its contents:

There are different instructions for loading two doubles, four singles, or integers into a register. They all load the same 128 bits from memory into the same register. However, the specs warn about a performance penalty when a register is loaded as one type and then used as another. I do not know the internals of the processor, but my understanding is that the CPU splits the floats into mantissa, exponent, and sign at the moment of loading and has to drop that information when you reinterpret the bit pattern stored in the register.
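To illustrate the point about typed loads, here is a minimal sketch using the C intrinsics from <emmintrin.h> (not D, and assuming an x86 target with SSE2). The function names are made up for this example. Each function reads the same kind of 16-byte buffer, but through a different register type (__m128, __m128d, __m128i), and compilers typically pick a different load instruction for each one.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86 target with SSE2 */

/* Hypothetical helper: treat 16 bytes as four single-precision floats. */
float sum_as_floats(const void *bytes)
{
    __m128 v = _mm_loadu_ps((const float *)bytes);       /* typically movups */
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}

/* Hypothetical helper: treat 16 bytes as two double-precision floats. */
double sum_as_doubles(const void *bytes)
{
    __m128d v = _mm_loadu_pd((const double *)bytes);     /* typically movupd */
    double out[2];
    _mm_storeu_pd(out, v);
    return out[0] + out[1];
}

/* Hypothetical helper: treat 16 bytes as four 32-bit integers. */
int sum_as_ints(const void *bytes)
{
    __m128i v = _mm_loadu_si128((const __m128i *)bytes); /* typically movdqu */
    int out[4];
    _mm_storeu_si128((__m128i *)out, v);
    return out[0] + out[1] + out[2] + out[3];
}
```

Note that the C intrinsics already keep the three register types distinct: going from __m128 to __m128d requires an explicit cast intrinsic such as _mm_castps_pd, so the reinterpretation is at least visible in the source.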

A type v128 would not give the compiler the information it needs to emit the correct mov instructions.

There definitely must be a float4 and a double2 type to express these semantics. For integers, I am not quite sure. I believe that integer SSE instructions can be mixed more freely, so a single 128-bit type would be sufficient.

Considering these hardware details of the SSE architecture alone, I fear that portable low-level support for SIMD is very hard to achieve. If you want to offer access to the raw power of each architecture, it might be simpler to have machine-specific language extensions for SIMD and leave portability to a wrapper library with a common front-end and various back-ends for the different architectures.
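A rough sketch of what such a wrapper could look like, again in C rather than D. The float4 type and the float4_* and add_arrays4 names are invented for this example and do not come from any existing library. The front-end is the same everywhere; the back-end is chosen at compile time, here either SSE or a plain scalar fallback.

```c
#if defined(__SSE__)
/* Back-end 1: x86 with SSE, the type wraps a real 128-bit register. */
#include <xmmintrin.h>

typedef struct { __m128 raw; } float4;

static float4 float4_load(const float *p)
{
    float4 r;
    r.raw = _mm_loadu_ps(p);
    return r;
}

static float4 float4_add(float4 a, float4 b)
{
    float4 r;
    r.raw = _mm_add_ps(a.raw, b.raw);
    return r;
}

static void float4_store(float *p, float4 v)
{
    _mm_storeu_ps(p, v.raw);
}
#else
/* Back-end 2: scalar fallback for targets without SSE. */
typedef struct { float lane[4]; } float4;

static float4 float4_load(const float *p)
{
    float4 r;
    for (int i = 0; i < 4; ++i) r.lane[i] = p[i];
    return r;
}

static float4 float4_add(float4 a, float4 b)
{
    float4 r;
    for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

static void float4_store(float *p, float4 v)
{
    for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
}
#endif

/* Front-end: user code only sees float4 and the float4_* functions. */
void add_arrays4(float *dst, const float *a, const float *b)
{
    float4_store(dst, float4_add(float4_load(a), float4_load(b)));
}
```

The user-visible type stays float4 in both back-ends, so the semantic information is kept even where no SIMD hardware is available.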
