2012/5/31 Timothy Normand Miller <[email protected]>:
> Nicolas, your idea regarding LIW instructions has merit except for one
> problem:  Register file bandwidth.

Sure. In ASICs, register files with 8 read and 8 write ports exist, but
as I understand it they are neither fast nor common. But is the extra
logic needed to manage 4 read and 2 write ports a good deal?

> With very clever pipeline organization, we can read two regs at once,
> and when the write-back occurs, it is always timed with a
> non-conflicting bank.  Note that even in ASICs, SRAM blocks are almost
> always dual-ported.  Now, if we were to have separate FP and INT reg

SRAM blocks are good for big register files. For 4 or 8 registers, you
could even implement them with ordinary gates (or let the router do the
job).

> files (which CPUs do for bandwidth reasons), that would be fine, but
> now that doubles the SRAM resources we need for a core in a way that
> isn't economical, since the int registers will go underutilized.

It depends on the total number of registers. 8 is the minimum. 32+32 is
the norm for RISC CPUs. 256 are used in Fujitsu's SPARC VII. Fermi
has 32K registers, if I remember correctly.

> So for very practical reasons, every instruction must have two
> register inputs and one register output.  There are very few
> combinations that would work otherwise.  Only memory write and branch
> have no target register.  They could be combined, sometimes, with a
> single-input ALU op, but we haven't decided that we'll have any of
> those.  How often is it useful to combine a memory write with an
> FPNEG?  Do we really want to spend the extra logic to optimize such a
> low-probability case?
>

And what about using explicit register banks? You could split the
register file in half, with a lost cycle in case of conflict.
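
To make the cost of the split concrete, here is a minimal sketch of the
idea. Everything in it is an assumption for illustration: the register
count (32), the rule that the lower half of the register numbers lives
in bank 0 and the upper half in bank 1, and one read port per bank. The
names bank_of, read_cycles and total_read_cycles are hypothetical.

```python
NUM_REGS = 32           # hypothetical register count, not from the thread
HALF = NUM_REGS // 2    # registers 0..HALF-1 -> bank 0, the rest -> bank 1


def bank_of(reg):
    """Return the bank (0 or 1) that holds register number `reg`."""
    if not 0 <= reg < NUM_REGS:
        raise ValueError(f"register {reg} out of range")
    return reg // HALF


def read_cycles(src_a, src_b):
    """Cycles needed to fetch two source operands.

    Assumes each bank can serve one read per cycle, so two different
    registers in the same bank cost one extra (lost) cycle.
    """
    if src_a == src_b:
        return 1  # a single read serves both operands
    return 2 if bank_of(src_a) == bank_of(src_b) else 1


def total_read_cycles(instrs):
    """Sum the operand-fetch cycles over a list of (src_a, src_b) pairs."""
    return sum(read_cycles(a, b) for a, b in instrs)
```

With this model, a compiler that keeps the two sources of each
instruction in different halves never pays the lost cycle; only badly
allocated pairs such as (1, 2) do.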

You could also have fast registers (like an L1 cache) and slow ones
(like L2). Imagine 8 "fast" registers built from logic gates with many
ports, and a large 248-register file that is only dual-ported, but with
enough registers to avoid RAM accesses.
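
A rough sketch of how such a two-level register file would behave for
one LIW bundle. The assumptions are mine: registers 0-7 are the "fast"
ones with no port limit, registers 8-255 are the "slow" file, and the
slow file can serve 2 accesses per cycle. Each access is counted
separately, even if the same register appears twice. The name
bundle_cycles is hypothetical.

```python
FAST_REGS = 8        # registers 0..7: gate-based, many ports (assumed unlimited)
TOTAL_REGS = 256     # registers 8..255: large dual-ported file
SLOW_PORTS = 2       # assumed accesses per cycle on the slow file


def bundle_cycles(regs):
    """Cycles needed to serve all register accesses of one bundle.

    `regs` is a list of register numbers accessed by the bundle.
    Fast registers never stall; slow accesses are served SLOW_PORTS
    at a time, and a bundle always takes at least one cycle.
    """
    for r in regs:
        if not 0 <= r < TOTAL_REGS:
            raise ValueError(f"register {r} out of range")
    slow = sum(1 for r in regs if r >= FAST_REGS)
    # Ceiling division: number of cycles needed on the slow file.
    slow_cycles = (slow + SLOW_PORTS - 1) // SLOW_PORTS
    return max(1, slow_cycles)
```

In this model a bundle that keeps its hot operands in the 8 fast
registers issues in one cycle, while one that touches three or more
slow registers stalls.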

If a "many"-ported register file is too costly, you could have a
specific register bank for the FPU only (only the FPU could write to
it). It would look like the big A register of some TI DSPs.

All of these solutions complicate the programming model, but enable
100% use of the FPU.
>
>
>
> --
> Timothy Normand Miller
> http://www.cse.ohio-state.edu/~millerti
> Open Graphics Project