On 4/21/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >
> > We don't.  However, we could use software (the compiler/assembler) to do
> > it.
>
> So you need pack/unpack instructions and have to balance their use,
> because each of them also takes one cycle. Or you need individually
> selectable words inside the SIMD register, which produces a big switch.

Here's a solution:

One fully-pipelined vmul unit that can start a new vmul every cycle.
The vmul unit can also do individual scalar ops (in which case the
other three multipliers have a bubble).
Sometimes, the compiler can do whole-program analysis, carefully
allocating registers and using pack/unpack/swizzle instructions to get
a bit more throughput.

We don't need separate hardware for smul, just a little multiplexing on vmul.
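
A minimal software model of that idea, as a sketch only: the names
vec4, vmul_masked, vmul and smul, the 4-bit lane mask, and the choice to
leave disabled destination lanes unchanged are all assumptions made for
illustration, not part of any agreed design. It just shows that a scalar
multiply can be expressed as a vector multiply with three lanes idle.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

typedef struct { float lane[LANES]; } vec4;

/* Hypothetical model of one vmul issue. Bit i of mask enables lane i.
 * A disabled lane is a bubble: its multiplier does nothing this cycle
 * and the destination lane keeps its old value (an assumption). */
static vec4 vmul_masked(vec4 dst, vec4 a, vec4 b, uint8_t mask)
{
    for (int i = 0; i < LANES; i++) {
        if (mask & (1u << i))
            dst.lane[i] = a.lane[i] * b.lane[i];
    }
    return dst;
}

/* Full vector multiply: all four lanes enabled. */
static vec4 vmul(vec4 a, vec4 b)
{
    return vmul_masked(a, a, b, 0xF);
}

/* Scalar multiply: the same unit with a single lane enabled. */
static vec4 smul(vec4 dst, vec4 a, vec4 b, int lane)
{
    assert(lane >= 0 && lane < LANES);
    return vmul_masked(dst, a, b, (uint8_t)(1u << lane));
}
```

In this model an smul still occupies a full issue slot but uses only one
of the four multipliers, which is the throughput the compiler would try
to win back by packing independent scalars into one vmul.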

It's the divides that'll kill us.  Can we cheat by always using
reciprocal and multiply?
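
One way the reciprocal-and-multiply trick could look, sketched in C under
stated assumptions: the function names are made up, the linear seed
48/17 - 32/17 * m stands in for whatever lookup table hardware would
use, and three Newton-Raphson steps x = x * (2 - b * x) are chosen
because each step roughly squares the relative error (about 1/17 for
this seed). The result is close to a / b but is not guaranteed to match
a correctly rounded divide in the last bit.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical seed for 1/b: split b = m * 2^e with 0.5 <= |m| < 1,
 * approximate 1/|m| with a straight line, then undo the scaling. */
static float recip_seed(float b)
{
    int e;
    float m = frexpf(b, &e);
    float mag = (m < 0.0f) ? -m : m;
    float x = 48.0f / 17.0f - (32.0f / 17.0f) * mag;
    if (m < 0.0f)
        x = -x;
    return ldexpf(x, -e);
}

/* Refine the seed with Newton-Raphson; uses only multiplies and
 * subtracts, so it maps onto a multiplier pipeline. */
static float recip(float b)
{
    assert(b != 0.0f);  /* zero, infinities and NaN are not handled */
    float x = recip_seed(b);
    for (int i = 0; i < 3; i++)
        x = x * (2.0f - b * x);
    return x;
}

/* a / b replaced by a * (1 / b). */
static float div_via_recip(float a, float b)
{
    return a * recip(b);
}
```

For example, div_via_recip(1.0f, 4.0f) comes out within about 1e-6 of
0.25. Whether that accuracy is acceptable depends on how strict the
target needs to be about rounding.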
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
