On Fri, Jan 6, 2012 at 2:43 AM, Walter Bright <[email protected]> wrote:
> On 1/5/2012 5:42 PM, Manu wrote:
>>
>> So I've been hassling about this for a while now, and Walter asked me to
>> pitch an email detailing a minimal implementation with some initial
>> thoughts.
>
> Takeaways:
>
> 1. SIMD behavior is going to be very machine specific.
>
> 2. Even trying to do something with + is fraught with peril, as integer adds
> with SIMD can be saturated or unsaturated.
>
> 3. Trying to build all the details about how each of the various adds and
> other ops work into the compiler/optimizer is a large undertaking. D would
> have to support internally maybe a 100 or more new operators.
>
> So some simplification is in order, perhaps a low level layer that is fairly
> extensible for new instructions, and for which a library can be layered over
> for a more presentable interface. A half-formed idea of mine is, taking a
> cue from yours:
>
> Declare one new basic type:
>
>     __v128
>
> which represents the 16 byte aligned 128 bit vector type. The only
> operations defined to work on it would be construction and assignment. The
> __ prefix signals that it is non-portable.
>
> Then, have:
>
>     import core.simd;
>
> which provides two functions:
>
>     __v128 simdop(operator, __v128 op1);
>     __v128 simdop(operator, __v128 op1, __v128 op2);
>
> This will be a function built in to the compiler, at least for the x86.
> (Other architectures can provide an implementation of it that simulates its
> operation, but I doubt that it would be worth anyone's while to use that.)
>
> The operators would be an enum listing of the SIMD opcodes,
>
>     PFACC, PFADD, PFCMPEQ, etc.
>
> For:
>
>     z = simdop(PFADD, x, y);
>
> the compiler would generate:
>
>     MOV z,x
>     PFADD z,y
>
Would this tie SIMD support directly to x86/x86_64, or would it be possible to also support NEON on ARM (also 128-bit SIMD, see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0409g/index.html)? Obviously not for DMD, but if the syntax weren't directly tied to x86/x86_64, GDC and LDC could support this.

It seems like using a standard naming convention instead of directly referencing instructions would let the underlying SIMD instructions vary across platforms, but I don't know enough about the technologies to say whether NEON's capabilities match SSE closely enough that they could be handled the same way.
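To make the "standard naming convention" idea a bit more concrete, here is a minimal sketch in C rather than D, since the core.simd interface above is only a proposal. The name vec4_add is hypothetical and invented for illustration; the intrinsics it wraps are the existing SSE and NEON ones for a four-lane single-precision add, with a scalar loop as a fallback. It covers a single operation for which both instruction sets happen to have a direct equivalent, so it says nothing about whether the rest of NEON and SSE line up.

```c
#include <assert.h>

/* Pick a backend at compile time. The VEC4_BACKEND_* macros are
   hypothetical names used only in this sketch. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>            /* SSE intrinsics */
#  define VEC4_BACKEND_SSE 1
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#  include <arm_neon.h>             /* NEON intrinsics */
#  define VEC4_BACKEND_NEON 1
#endif

/* One platform-neutral name; the instruction behind it varies. */
void vec4_add(float out[4], const float a[4], const float b[4])
{
#if defined(VEC4_BACKEND_SSE)
    /* Unaligned loads/stores, ADDPS for the lane-wise add. */
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#elif defined(VEC4_BACKEND_NEON)
    /* VADD.F32 on a 128-bit Q register. */
    vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    /* Scalar fallback for targets with neither instruction set. */
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}

/* Usage: add two 4-float vectors and check each lane. */
void vec4_add_demo(void)
{
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4];

    vec4_add(out, a, b);
    for (int i = 0; i < 4; ++i)
        assert(out[i] == a[i] + b[i]);
}
```

A floating-point add is the easy case. The saturated versus unsaturated integer adds mentioned in the quoted message are where a neutral name would have to encode the semantics explicitly, for example as two separately named operations.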
