2013/3/18 Timothy Normand Miller <[email protected]>

> On Mon, Mar 18, 2013 at 10:10 AM, Troy Benjegerdes <[email protected]> wrote:
>
>> If registers are precious, why not add more? .. what is the relative cost
>> (in terms of latency, silicon area, and energy) to double the register set
>> size?
>
> For CPUs, 32 was found to be optimal by some paper published back in the
> early 90's, I think. 16 was a second best, while 64 had diminishing
> returns. I'm not sure how this applies to GPUs, however. One problem with
> doubling the RF size is that you slow it down.

That number was derived without superpipelining or superscalar execution in mind. Unrolling loops is a good way to avoid control-flow instructions and to remove dependencies between instructions, but it needs at least twice as many registers.

>> So what if we have 8 'bitbucket/constant' registers of these most used
>> constants, and then instead of hardcoding the constants, make them be
>> something the application can load.
>
> This would be a good alternative to immediates. Some CPU architectures
> don't/didn't have immediates.
>
>> As for the dependency issue, I think the point of the bitbucket register(s)
>> was that they have none, and we throw away all writes going to those
>> registers.
>
> It's a good alternative to the 'wr' bit in the ISA I posted.
>
> --
> Timothy Normand Miller, PhD
> Assistant Professor of Computer Science, Binghamton University
> http://www.cs.binghamton.edu/~millerti/
> Open Graphics Project
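
To illustrate the unrolling point made earlier in this message, here is a minimal C sketch. The function names (sum_rolled, sum_unrolled2) are made up for illustration and are not taken from the ISA discussed on this list. The sketch assumes each accumulator ends up in its own register, although a real compiler is free to allocate registers differently.

```c
#include <stddef.h>

/* Rolled loop: one accumulator, one compare-and-branch per element.
 * Every add depends on the result of the previous add. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by two (hypothetical example): one compare-and-branch per
 * two elements, and two accumulators so the two adds in the body are
 * independent of each other. The price is twice as many live values
 * (s0 and s1, plus two loaded elements) inside the loop body. */
long sum_unrolled2(const int *a, size_t n)
{
    long s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        s0 += a[i];
    return s0 + s1;
}
```

Both functions return the same sum. The unrolled version trades branches and a serial dependency chain for extra live values, which is where the added register pressure comes from.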
