You still have to load 32-bit scalars and registers into 128/256-bit data (apart from pre-prepared memory). This "conversion" is poorly optimized with intrinsics, which limits the use of SSE.
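As a rough illustration of the scalar-to-vector "conversion" in question, here is a minimal C sketch using SSE2 intrinsics from `<emmintrin.h>`. The function names are made up for this example; only the `_mm_*` intrinsics are real. It assumes an x86 or x86_64 target with SSE2 and says nothing about how BitC would expose this.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: put a 32-bit scalar in lane 0 of an XMM register.
   The upper 96 bits are zeroed. */
__m128i scalar_to_lane0(int32_t x) {
    return _mm_cvtsi32_si128(x);
}

/* Hypothetical helper: broadcast a 32-bit scalar to all four lanes. */
__m128i scalar_to_all_lanes(int32_t x) {
    return _mm_set1_epi32(x);
}

/* Hypothetical helper: read the low 32 bits back into a scalar. */
int32_t lane0_to_scalar(__m128i v) {
    return _mm_cvtsi128_si32(v);
}

/* Add the scalar k to each of four packed 32-bit integers.
   Unaligned loads and stores are used so the arrays need no special alignment. */
void add_scalar4(int32_t out[4], const int32_t in[4], int32_t k) {
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i r = _mm_add_epi32(v, scalar_to_all_lanes(k));
    _mm_storeu_si128((__m128i *)out, r);
}
```

Every scalar that takes part in the vector add has to go through one of these moves first, which is the overhead being discussed.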
For SIMD I agree with you (treating it as a coprocessor), but on x86_64 the SSE registers are almost GP registers and have a full instruction set. The only limitation is that addition, subtraction, multiplication and division are limited to 32 bits (when used as a GP register); all other operations are 128-bit. I even suspect that on x86 they will be used more and more in place of the traditional registers, due to better interop with SSE and faster moves and ops. So on x86 you can do full bit ops, comparisons etc. on an int128 or int256 type; it just has a 32-bit maximum value for the basic arithmetic ops (but not for bit ops, shifts, masks etc.).

The registers in this form are especially attractive as loop counters for SIMD code, memory copy/scan, temporal memory writes and bit operations (the traditional ones as well as the mask or bit-scan ones).

While it's easy to support these intrinsics, the question is how you treat such a data type/register in BitC, and how you convert between these and traditional int32 and int64 efficiently. I can see a few options: treat the register as an int128_32 which can be cast to int[]; represent the SSE register as a union; or treat it as an int128 and produce an error when it is used for arithmetic on values wider than 32 bits on x86 platforms.

Ben

The compiler doesn't convert it. Just as we have integral types that are register allocatable, we introduce SIMD types and corresponding operations. The main problem with this approach is the fact that no two vendors agree on the implementation of the SIMD unit. What I think I'm saying here is that the right way to think about the SIMD unit is that it's a heterogeneous coprocessor that shares a memory (at least if ARM didn't break it) with the primary processor.

2. Using the 128- and 256-bit registers as pseudo GP registers for general work. This is at least as common, e.g. memcpy. This case does not use aggregate data and is not even SIMD, but it is still important.
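To make the memcpy case concrete, here is a minimal C sketch that uses a 128-bit register purely as a wide move register, with no per-lane arithmetic at all. The function name `copy_bytes_sse2` is hypothetical, the buffers are assumed not to overlap, and an x86 or x86_64 target with SSE2 is assumed. It is not meant as a tuned memcpy.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n bytes from src to dst (non-overlapping),
   moving 16 bytes at a time through one XMM register. */
void copy_bytes_sse2(void *dst, const void *src, size_t n) {
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;
    size_t i = 0;

    /* Main loop: one unaligned 128-bit load and one unaligned 128-bit store. */
    for (; i + 16 <= n; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(s + i));
        _mm_storeu_si128((__m128i *)(d + i), chunk);
    }

    /* Tail: copy the remaining 0..15 bytes one at a time. */
    for (; i < n; i++) {
        d[i] = s[i];
    }
}
```

Nothing here treats the register as an aggregate of lanes; it is only a 16-byte-wide container.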
This case arises only in a few intrinsics, and isn't that hard to teach the compiler as a special case.

shap
