2012/5/31 Timothy Normand Miller <[email protected]>:
> Nicolas, your idea regarding LIW instructions has merit except for one
> problem: Register file bandwidth.

Sure. In an ASIC, register files with 8 read and 8 write ports exist, but as I understand it they are neither fast nor common. But is the extra logic needed to manage 4 read and 2 write ports a good deal?

> With very clever pipeline organization, we can read two regs at once,
> and when the write-back occurs, it is always timed with a
> non-conflicting bank. Note that even in ASICs, SRAM blocks are almost
> always dual-ported.

SRAM blocks are good for big register files. For 4 or 8 registers, you could even implement the file with ordinary gates (or let the router do the job).

> Now, if we were to have separate FP and INT reg
> files (which CPUs do for bandwidth reasons), that would be fine, but
> now that doubles the SRAM resources we need for a core in a way that
> isn't economical, since the int registers will go underutilized.

It depends on the total number of registers. 8 is the minimum. 32+32 is the norm for RISC CPUs. 256 are used in Fujitsu's SPARC VII. Fermi has 32K registers, if I remember correctly.

> So for very practical reasons, every instruction must have two
> register inputs and one register output. There are very few
> combinations that would work otherwise. Only memory write and branch
> have no target register. They could be combined, sometimes, with a
> single-input ALU op, but we haven't decided that we'll have any of
> those. How often is it useful to combine a memory write with an
> FPNEG? Do we really want to spend the extra logic to optimize such a
> low-probability case?
>

And what about using explicit register banks? You could split the register file in half, with a lost cycle in case of conflict. You could also have fast registers (like an L1 cache) and slow ones (like an L2). Imagine 8 "fast" registers built from logic gates with many ports, plus a large 248-register file that is only dual-ported, but with enough registers to avoid RAM accesses.

If a "many"-ported register file is too costly, you could have a dedicated register bank for the FPU only (only the FPU could write to it). It would look like the big A register of some TI DSPs.

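To make the "split in half, lost cycle on conflict" idea concrete, here is a rough sketch of the cost model. Everything in it is an assumption for illustration: 32 registers, the lower half in bank 0 and the upper half in bank 1, one read port per bank, and the hypothetical helper names `bank_of`, `read_cycles` and `total_read_cycles`. It only counts operand-read cycles; it is not a model of the real pipeline.

```python
NUM_REGS = 32               # assumed register count, not from this thread
BANK_SIZE = NUM_REGS // 2   # the file is split in half: two banks


def bank_of(reg):
    """Bank 0 holds the lower half of the registers, bank 1 the upper half."""
    if not 0 <= reg < NUM_REGS:
        raise ValueError(f"register r{reg} out of range")
    return reg // BANK_SIZE


def read_cycles(src_a, src_b):
    """Cycles to fetch both source operands, assuming one read port per bank."""
    if src_a == src_b:
        return 1  # same register: a single read feeds both operands
    # Two different registers in the same bank conflict: one lost cycle.
    return 2 if bank_of(src_a) == bank_of(src_b) else 1


def total_read_cycles(program):
    """Sum the operand-read cycles over a list of (src_a, src_b) pairs."""
    return sum(read_cycles(a, b) for a, b in program)


# Four instructions; the second and fourth read two registers from one bank.
program = [(0, 16), (1, 2), (17, 3), (20, 21)]
print(total_read_cycles(program))  # 6 = 1 + 2 + 1 + 2
```

With a model like this, the compiler (or the programmer) can see the price of a bad register allocation: choosing source registers from opposite banks keeps every instruction at one read cycle.
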
All of these solutions complicate the programming model, but they allow 100% utilization of the FPU.

> --
> Timothy Normand Miller
> http://www.cse.ohio-state.edu/~millerti
> Open Graphics Project

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
