Having more than one write port on the register file (RF) is very costly,
especially when only a minority of instructions would use it.


On Mon, Mar 18, 2013 at 10:16 AM, Nicolas Boulay <[email protected]> wrote:

>
>
> 2013/3/18 Timothy Normand Miller <[email protected]>
>
>>
>>
>>
>> On Mon, Mar 18, 2013 at 4:53 AM, Nicolas Boulay <[email protected]> wrote:
>>
>>> Registers are a very precious resource. Memory keeps getting slower
>>> relative to the CPU (and it is even worse from the latency point of view).
>>> So dedicating a register code to /dev/null is a costly solution if we are
>>> constrained on instruction size. A CPU running large programs is under more
>>> pressure to reduce code size than a GPU, where the code is smaller.
>>>
>>> The MSP430 uses 16- and 32-bit instructions; a 32-bit instruction uses
>>> its second 16-bit word as an immediate, which is quite clean.
>>>
>>> One of the newer CPUs has a specific encoding for constants. It is like
>>> having 3 bits that encode 8 values, including -1, 0, 1, 2, 4, 8, 16, the
>>> most commonly used constants, to avoid larger code.
>>>
>>> - A large instruction word is costly only for large code
>>>
>>
>>
>>> - Dependencies between registers are always a plague for pipeline
>>> performance
>>>
>>
>>
>>> - Registers and register address space are among the most precious
>>> resources of a CPU
>>>
>>
>> All very true.
>>
>>
>>
>>> - Immediates could be encoded as an enum or constant name for the most
>>> used values
>>
>>
>> Yes.  This is equivalent to having a shared extension to the register
>> file that contains constants.
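
To make the constant-file idea concrete, here is a minimal sketch of a 3-bit
constant field treated as a small read-only extension of the register file.
The table order and the function names are assumptions made for illustration,
not part of any proposed ISA. The thread names only seven constants, so the
eighth code is left unassigned here.

```python
# Hypothetical 3-bit constant field. The ordering of the table is an
# assumption; only the seven values themselves come from the thread.
CONSTANT_TABLE = (-1, 0, 1, 2, 4, 8, 16)


def encode_constant(value):
    """Return the 3-bit code for value, or None if a full immediate is needed."""
    try:
        return CONSTANT_TABLE.index(value)
    except ValueError:
        return None


def decode_constant(code):
    """Map a 3-bit code back to its constant, like reading a read-only register."""
    if not 0 <= code < 8:
        raise ValueError("code must fit in 3 bits")
    if code >= len(CONSTANT_TABLE):
        raise ValueError("code is not assigned in this sketch")
    return CONSTANT_TABLE[code]
```

An assembler built this way would call encode_constant first and fall back to
a longer instruction with a full immediate only when it returns None.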
>>
>>
>> I have gaps in my knowledge about some architectures, so there are some
>> features (such as a constant file) that I am more inclined to adopt because
>> earlier architectures have proved them to be useful.  Once I understand
>> more of this, I'll be more willing to consider creative new features, and
>> by that point I hope to have some infrastructure for testing.
>>
>>
> If instruction size is a problem, I think a large register bank
> that can only be moved to and from normal registers and memory could be
> useful. This kind of register could replace the write buffer and prefetching
> by preloading. The idea is to fill 2 or 4 registers in a single load or store
> to main memory (preload), but partial writes should be impossible.
> Each loop could be split in 2 or 4 using this special register bank. This
> makes better use of DRAM bursts without the timing problems that come with
> prefetching.
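
As a rough illustration of the preload idea, the sketch below counts memory
transactions for a simple sum, once with one load per element and once with a
bank that is filled a whole burst at a time. It models transaction counts
only, not timing, and the function names and the default width of 4 are
assumptions made for the example.

```python
def sum_scalar(memory):
    """Sum the words using one memory transaction per element."""
    transactions = 0
    total = 0
    for word in memory:
        transactions += 1
        total += word
    return total, transactions


def sum_with_bank(memory, width=4):
    """Sum the words using one whole-burst transaction per `width` elements."""
    if len(memory) % width != 0:
        # Partial fills are not allowed in this model.
        raise ValueError("length must be a multiple of the bank width")
    transactions = 0
    total = 0
    for base in range(0, len(memory), width):
        bank = memory[base:base + width]  # preload: one burst fills the bank
        transactions += 1
        for register in bank:  # move bank -> normal register, then use it
            total += register
    return total, transactions
```

For 16 words and a width of 4, the second version issues a quarter of the
transactions, which is the loop being "split in 4" around the bank.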
>
> I would also like to see a load_load instruction that incurs only a single
> stall instead of 2, for variable accesses like "struct->struct.i".
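
A sketch of what load_load might compute, assuming the access involves two
dependent loads (a pointer fetched from one structure, then a field read
through it). The operand layout and names are hypothetical, and the sketch
covers only the semantics, not the stall behaviour.

```python
def load(memory, address):
    """Ordinary load: read one word from memory."""
    return memory[address]


def load_load(memory, base, first_offset, second_offset):
    """Fused form: fetch a pointer at base + first_offset, then read through it."""
    pointer = memory[base + first_offset]
    return memory[pointer + second_offset]
```

The fused form returns the same value as two chained load calls; the hoped-for
gain is that the hardware stalls once for the pair rather than once per load.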
>
>
>>
>>>
>>>
>>> Nicolas
>>>
>>>
>>>
>>> --
>> Timothy Normand Miller, PhD
>> Assistant Professor of Computer Science, Binghamton University
>> http://www.cs.binghamton.edu/~millerti/
>> Open Graphics Project
>>
>
>


-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project