You still have to load 32-bit scalars and registers into 128/256-bit data
(apart from pre-prepared memory). This "conversion" is poorly optimized with
intrinsics, which limits the use of SSE.
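
To make the "conversion" concrete, here is a minimal sketch in C of moving
32-bit scalars into and out of a 128-bit register, assuming an x86 target with
SSE2. The intrinsics are the standard ones from <emmintrin.h>; the helper
names (scalar_to_lane0, add_scalar4, and so on) are made up for illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Move one 32-bit scalar into lane 0 of an XMM register; upper lanes are zeroed. */
static __m128i scalar_to_lane0(int32_t x) {
    return _mm_cvtsi32_si128(x);
}

/* Broadcast one 32-bit scalar to all four 32-bit lanes. */
static __m128i scalar_to_all_lanes(int32_t x) {
    return _mm_set1_epi32(x);
}

/* Move lane 0 back into a general-purpose register. */
static int32_t lane0_to_scalar(__m128i v) {
    return _mm_cvtsi128_si32(v);
}

/* Add the scalar k to four packed 32-bit ints at p (no alignment required). */
static void add_scalar4(int32_t *p, int32_t k) {
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    v = _mm_add_epi32(v, scalar_to_all_lanes(k));
    _mm_storeu_si128((__m128i *)p, v);
}
```

Every crossing between the scalar and vector sides is an explicit call here,
which is the overhead in question; how well a given compiler optimizes these
moves is not something this sketch shows.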

 

For SIMD I agree with you (treating it as a coprocessor), but on x86_64 the
SSE registers are almost GP registers and have a full instruction set. The
only limitation is that addition, subtraction, multiplication and division
are limited to 32 bits (when used as a GP register); all other operations
are 128-bit. I even suspect that on x86 they will be used more and more,
replacing the traditional registers, because of better interop with SSE and
faster moves and ops. So on x86 you can do full bit ops, comparisons etc. on
an int128 or int256 type; it just has a 32-bit maximum value for the basic
arithmetic ops (but not for bit ops, shifts, masks etc.). In this form the
registers are especially attractive as loop counters for SIMD code, for mem
copy/scan, for non-temporal memory writes, and for bit operations
(traditional ones as well as mask and bit-scan ones). While it's easy to
support these intrinsics, the question is how you treat such a data
type/register in BitC, and how you convert between it and traditional int32
and int64 efficiently.

 

I can see a few options: treat the register as an int128_32 that can be cast
to int[]; represent the SSE register as a union; or treat it as an int128
and produce an error when it is used for arithmetic on values wider than
32 bits on x86 platforms.
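
As a rough illustration of the union option, here is a sketch in C (not
BitC), assuming SSE2. The type name int128_32 is borrowed from above; the
helper names are hypothetical. Bitwise operations act on all 128 bits, while
addition is done per 32-bit lane.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical type: 128 bits seen either as one register or as four int32 lanes. */
typedef union {
    __m128i v;
    int32_t lane[4];
} int128_32;

/* Build a value from four lanes; a ends up in lane 0. */
static int128_32 i128_from_lanes(int32_t a, int32_t b, int32_t c, int32_t d) {
    int128_32 r;
    r.v = _mm_setr_epi32(a, b, c, d);
    return r;
}

/* Bitwise AND over the full 128 bits. */
static int128_32 i128_and(int128_32 x, int128_32 y) {
    int128_32 r;
    r.v = _mm_and_si128(x.v, y.v);
    return r;
}

/* Addition per 32-bit lane; each lane wraps and carries do not cross lanes. */
static int128_32 i128_add32(int128_32 x, int128_32 y) {
    int128_32 r;
    r.v = _mm_add_epi32(x.v, y.v);
    return r;
}
```

Reading lane[i] after writing v is the "cast to int[]" view. The third
option, rejecting arithmetic on values wider than 32 bits, would have to be
a compile-time check in the language and is not modelled here.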

 

Ben

 

 

The compiler doesn't convert it. Just as we have integral types that are
register allocatable, we introduce SIMD types and corresponding operations.
The main problem with this approach is the fact that no two vendors agree on
the implementation of the SIMD unit.

 

What I think I'm saying here is that the right way to think about the SIMD
unit is that it's a heterogeneous coprocessor that shares a memory (at least
if ARM didn't break it) with the primary processor.

 

 

2. Using the 128- and 256-bit registers as pseudo-GP registers for general
work; this is at least as common, e.g. memcpy. This case does not use
aggregate data and is not even SIMD, but it is still important.

This case arises only in a few intrinsics, and isn't that hard to teach the
compiler as a special case.
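
For reference, the memcpy case looks roughly like this when written by hand
with SSE2 intrinsics in C. This is an illustrative sketch only: the function
name copy_bytes_sse2 is made up, and it is not a claim about how any
particular compiler or libc implements memcpy.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Copy n bytes through a 128-bit register, 16 bytes at a time, then finish
   the remainder byte by byte. The two regions must not overlap. */
static void copy_bytes_sse2(void *dst, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (n >= 16) {
        /* Unaligned load/store: the register is just a 16-byte-wide temporary. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)s);
        _mm_storeu_si128((__m128i *)d, chunk);
        s += 16;
        d += 16;
        n -= 16;
    }
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

No lane structure is used anywhere: the register carries 16 opaque bytes,
which is the sense in which this case is not SIMD.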

 

shap


_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
