I am trying to code a simple operation using SSE2 instructions where possible. I have a feeling that what I want is just a matter of a couple of shufps and haddps instructions (though haddps is strictly SSE3), but I can't get it. Lazyweb, please help!
The operation is integration. I have a vector of 4 single floats (v4sf) and a carry-in float to start. For example

    CI   F0  F1  F2  F3
     5    1   0  10  -5

yields

    F0  F1  F2  F3
     6   6  16  11

So far iteration on plain floats seems to be the best I can come up with, but HADDPS is tantalizingly close to what I want to do. Any hints?

Thanks,
Bill Gribble
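One possible approach to the integration described above, offered as a sketch rather than a definitive answer: a 4-wide running sum can be built from two shift-and-add steps plus a broadcast of the carry-in, using only SSE2 intrinsics. The function names integrate4_scalar and integrate4_sse2 are made up for illustration, and the code assumes F0 sits in the lowest lane of the vector.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */

/* Scalar reference: out[i] = ci + in[0] + ... + in[i]. */
static void integrate4_scalar(float ci, const float in[4], float out[4])
{
    float acc = ci;
    for (int i = 0; i < 4; i++) {
        acc += in[i];
        out[i] = acc;
    }
}

/* SSE2 sketch of the same operation (hypothetical helper name).
 * Assumes in[0] (F0) is loaded into the lowest lane. */
static void integrate4_sse2(float ci, const float in[4], float out[4])
{
    __m128 v = _mm_loadu_ps(in);                    /* [F0, F1, F2, F3] */

    /* Shift up by one lane (4 bytes), filling with zero: [0, F0, F1, F2] */
    __m128 s = _mm_castsi128_ps(_mm_slli_si128(_mm_castps_si128(v), 4));
    v = _mm_add_ps(v, s);                           /* [F0, F0+F1, F1+F2, F2+F3] */

    /* Shift up by two lanes (8 bytes): [0, 0, F0, F0+F1] */
    s = _mm_castsi128_ps(_mm_slli_si128(_mm_castps_si128(v), 8));
    v = _mm_add_ps(v, s);                           /* full prefix sums */

    /* Add the carry-in to every lane. */
    v = _mm_add_ps(v, _mm_set1_ps(ci));

    _mm_storeu_ps(out, v);
}
```

For a longer signal, out[3] of one block would serve as the carry-in for the next. Note that the vector version adds the terms in a different order than the scalar loop, so results can differ in the last bits for inputs that do not sum exactly.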
