More than one write port on the register file (RF) is very costly, especially if only a minority of instructions need it.

On Mon, Mar 18, 2013 at 10:16 AM, Nicolas Boulay <[email protected]> wrote:

> 2013/3/18 Timothy Normand Miller <[email protected]>
>
>> On Mon, Mar 18, 2013 at 4:53 AM, Nicolas Boulay <[email protected]> wrote:
>>
>>> Registers are a very precious resource. Memory keeps getting slower
>>> relative to the CPU (it's even worse from the latency point of view).
>>> So spending a register code on /dev/null is a costly solution if we
>>> have constraints on the instruction size. A CPU with large code is
>>> under more pressure to reduce code size than a GPU, where the code is
>>> smaller.
>>>
>>> The MSP430 uses 16- and 32-bit instruction sizes; 32-bit instructions
>>> use the second 16-bit half as an immediate, which is quite clean.
>>>
>>> One of the newer CPUs has a specific encoding for constants. It is
>>> like having 3 bits that encode 8 values, including -1, 0, 1, 2, 4, 8,
>>> 16, the most-used constants, to avoid needing larger code.
>>>
>>> - Large instruction words are costly only for large code
>>> - Dependencies between registers are always a plague for pipeline
>>>   performance
>>> - Registers and register address space are among the most precious
>>>   resources of a CPU
>>
>> All very true.
>>
>>> - Immediates could be coded as an enum or constant name for the
>>>   most-used values
>>
>> Yes. This is equivalent to having a shared extension to the register
>> file that contains constants.
>>
>> I have gaps in my knowledge about some architectures, so there are some
>> features (such as a constant file) that I am more inclined to adopt
>> because earlier architectures have proved them to be useful. Once I
>> understand more of this, I'll be more willing to consider creative new
>> features, and by that point I hope to have some infrastructure for
>> testing.
>
> If instruction size is a problem, I think a large register bank whose
> contents could only be moved to and from normal registers and memory
> could be useful. This kind of register could replace the write buffer
> and prefetch by preloading. The idea is to fill 2 or 4 registers with a
> single load from (or store to) main memory (preload), but partial
> writes should be impossible. Each loop could be split in 2 or 4 using
> this special register bank. This makes better use of DRAM bursts
> without the timing problems of prefetch.
>
> I would also like to see a load_load instruction, to have only a single
> stall instead of 2 for variable accesses like "struct->struct.i".
>
>>> Nicolas

-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
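
For reference, the 3-bit constant encoding mentioned in the quoted text could look roughly like the sketch below. This is only an illustration, not any real CPU's encoding: the names `short_const`, `encode_const`, `decode_const` and `CONST_ESCAPE` are made up, and the thread names only seven of the eight values, so treating the eighth code as "a full immediate follows" is purely an assumption.

```c
#include <stdint.h>

/* Assumption: code 7 is an escape meaning "a full-size immediate follows".
   The thread does not say what the eighth code is used for. */
#define CONST_ESCAPE 7u

/* The seven constants named in the thread, indexed by their 3-bit code. */
static const int32_t short_const[7] = { -1, 0, 1, 2, 4, 8, 16 };

/* Return the 3-bit code for `value`, or CONST_ESCAPE if the value is not
   in the table and would need a longer instruction. */
static unsigned encode_const(int32_t value)
{
    for (unsigned code = 0; code < 7; code++) {
        if (short_const[code] == value)
            return code;
    }
    return CONST_ESCAPE;
}

/* Decode a 3-bit code. Returns 1 and stores the constant in *out on a
   table hit; returns 0 for the escape code (or any out-of-range code). */
static int decode_const(unsigned code, int32_t *out)
{
    if (code >= 7)
        return 0;
    *out = short_const[code];
    return 1;
}
```

With this table, `encode_const(4)` fits in the short field, while `encode_const(3)` falls back to the escape code and would cost the extra immediate word. This is the same trade-off as the shared constant file discussed above, just expressed as a lookup.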
