On 4/21/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> > One fully-pipelined vmul unit that can start a new vmul every cycle.
> > The vmul unit can also do individual scalar ops (in which case the
> > other three multipliers have a bubble).
> > Sometimes, the compiler can do whole-program analysis, carefully
> > allocating registers and using pack/unpack/swizzle instructions to get
> > a bit more throughput.
>
> It could. But GCC doesn't do it. ICC from Intel did it a little bit. Do you
> think you could have enough compiler people to do better than Intel?

Part of GCC's problem is that it's trying to be very general.  We'd
develop a very special-purpose compiler, and we can evolve the
compiler and architecture together so that the architecture is more
convenient to compile for.  I suspect that there are challenges with
vectorizing for SSE that we can avoid.
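
To make the packing idea concrete, here is a rough sketch in Python of
what such a compiler pass might do with a run of independent scalar
multiplies.  Everything in it is an assumption for illustration only:
the four-lane width, the names vmul and pack_scalar_muls, and the
zero padding of unused lanes are not taken from any actual design.

```python
LANES = 4  # assumed vector width (hypothetical, for illustration)


def vmul(a, b):
    """Model one vmul issue: multiply all lanes element-wise."""
    return [x * y for x, y in zip(a, b)]


def pack_scalar_muls(pairs):
    """Group independent scalar multiplies (a, b) into 4-wide vmul issues.

    Returns (results, issues), where issues counts how many vmul
    instructions were needed. Unused lanes in the last group are
    padded with zeros and their results are discarded.
    """
    results = []
    issues = 0
    for i in range(0, len(pairs), LANES):
        group = pairs[i:i + LANES]
        pad = LANES - len(group)
        a = [p[0] for p in group] + [0] * pad
        b = [p[1] for p in group] + [0] * pad
        out = vmul(a, b)
        results.extend(out[:len(group)])  # drop the padded lanes
        issues += 1
    return results, issues


products, issues = pack_scalar_muls([(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)])
print(products, issues)
```

Five scalar multiplies fit into two vmul issues here instead of five,
which is the kind of gain packing is after.  The sketch ignores the
cost of the pack/unpack/swizzle instructions themselves, which a real
compiler would have to weigh against the saved issues.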

> > We don't need separate hardware for smul, just a little multiplexing on
> > vmul.
>
> Yep. Why does nobody like the idea of keeping everything simple with 4 scalar cores?

The control and routing hardware for doing vectors is simpler than
what you need for doing four scalars at the same time.
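
As a rough illustration of the "little multiplexing on vmul" point,
here is a small Python model of a scalar multiply expressed as a
vector multiply with a per-lane write enable.  The write mask, the
function names, and the four-lane width are all assumptions made for
this sketch, not a description of any actual hardware.

```python
LANES = 4  # assumed vector width (hypothetical)


def vmul_masked(a, b, dest, mask):
    """Vector multiply with a per-lane write enable.

    Each lane writes its product only if its mask bit is set;
    otherwise the old destination value is kept (the lane idles).
    """
    return [x * y if m else d for x, y, d, m in zip(a, b, dest, mask)]


def smul(a, b, dest, lane):
    """Scalar multiply modeled as a vmul with a single active lane."""
    mask = [i == lane for i in range(LANES)]
    return vmul_masked(a, b, dest, mask)


# Full vector op: all four lanes active.
print(vmul_masked([1, 2, 3, 4], [5, 6, 7, 8], [0, 0, 0, 0], [True] * LANES))
# Scalar op on lane 2: the other three lanes are left untouched.
print(smul([1, 2, 3, 4], [5, 6, 7, 8], [0, 0, 0, 0], 2))
```

In this model the scalar case shares one instruction stream with the
vector case and only changes the write mask, whereas four independent
scalar cores would each need their own control and routing.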
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
