On Thu, Aug 14, 2008 at 4:47 PM, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> On 2008-08-14, Timothy Normand Miller wrote:
>> Another goofy thing that we're going to have to think about...
>>
>> The S3, when doing memory access, relies on reads being in even
>> numbers of 32-bit words.  So if we turn caching off, but HQ is not
>> running, then a memory read will turn into a single-word request,
>> which the S3 won't handle correctly.  I think I may not try to fix
>> that, though.
>>
>> Basically, we can only turn off memory caching if HQ is intercepting
>> traffic.  If HQ gets a single word read request from memory (and it
>> doesn't have it cached somewhere), then it must make at least a 2-word
>> request to the S3 and then return the correct half of what comes back.
>
> I was just about to start the assembly for target read when I got your
> message, but for simplicity I ignored this and put a comment in the
> source for the time being.  Does this also mean the address sent to the
> bridge must always be even for reads?

I looked.  It's worse than that.  It assumes that you're going to read
at least 8 words starting on an 8-word boundary.  I designed it under
the assumption that memory reads would always come in 16-word blocks.
But I THINK that the real enforced granularity is 8.  Also, the count
is 7 bits, and the address auto-inc only considers the lower 7 bits.
(Actually, since it's doing even numbers, only bits [6:1] are
inc'd/dec'd.)  So I think you could safely read up to 15 blocks of 8
(120 words), as long as you don't cross a 128-word boundary.
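
A rough sketch of those read constraints in Python, in case it helps
whoever writes the HQ side.  The function names and constants here are
hypothetical, and the 8-word granularity is only the suspected value
from above, so treat this as a model under those assumptions and not
as the bridge's actual logic.  Addresses and counts are in 32-bit words.

```python
BLOCK_WORDS = 8     # assumed enforced read granularity (suspected, not confirmed)
WINDOW_WORDS = 128  # address auto-increment only covers the low 7 bits
MAX_WORDS = 120     # 15 blocks of 8


def is_safe_read(addr, count):
    """Return True if a read of `count` words at word address `addr`
    satisfies the constraints described above."""
    if count <= 0 or count > MAX_WORDS:
        return False
    # Both the start address and the length must be whole 8-word blocks.
    if addr % BLOCK_WORDS or count % BLOCK_WORDS:
        return False
    # The first and last word must fall in the same 128-word window.
    return addr // WINDOW_WORDS == (addr + count - 1) // WINDOW_WORDS


def widen_read(addr, count):
    """Expand an arbitrary request to whole blocks.

    Returns (start, length, offset): the block-aligned read to issue and
    the offset of the originally requested data within what comes back.
    """
    start = addr - addr % BLOCK_WORDS
    end = addr + count
    end += -end % BLOCK_WORDS  # round up to the next block boundary
    return start, end - start, addr - start
```

For a single-word read at word address 13, widen_read(13, 1) gives an
8-word read starting at address 8, and the wanted word sits at offset 5
in the returned data.  A widened request should still be passed through
is_safe_read before it goes to the S3.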

Also, interestingly, it's the XP10 that decides when the read is done.
The S3 code sends the right number of requests to the memory
controllers, then basically forgets about it except to clear a counter
that assumes you're on an 8-word boundary.  Then as read data comes
back from memory, the S3 bridge just assumes it's valid and sends it
on to the XP10, using the counter to fetch from the memory controllers
in the right order.  When the XP10 gets all the words it asked for,
it switches the bus state back.  From the perspective of the
XP10, the S3 is a slave device.

>
> Another goofy thing:  It seems tricky at best to unroll the
> transfer-loop for target write.  The reason is that we only know the
> number of queued commands, but what we need is the number of queued
> write-data commands.  Any idea?

That is tricky, and we may have no good answer for that.

I suggest we do nothing about it right now.  We should get a working
revision out, then we can go back later and see if we can do anything
clever with the CPU design.

One thing I've thought of is going to a wide instruction word.  Two
instructions are side-by-side, and you have two register files and two
ALUs.  On a fetch, you can fetch any two regs from file A and any two
from file B and then cross them over in any way you like to the ALUs,
then on writeback, you can do one write to each file.  It's like two
processors in parallel but with the ability to cross registers over
between them with restrictions.  Now we can deal with some of the
inefficiencies, if we can schedule instructions properly.  Obviously
only one instruction could hold a branch (but it could be either, and
we could allow it to be both if only one could compute to true).
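
A toy Python model of one such two-slot bundle, just to make the
crossover idea concrete.  Everything here is hypothetical: the bundle
encoding, the operation names, and the choice that slot 0 writes back
to file A and slot 1 to file B are assumptions for illustration, not a
proposed instruction format.  Branches are left out.

```python
import operator

# Hypothetical ALU operations for the sketch.
OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}


def run_bundle(file_a, file_b, reads_a, reads_b, slot0, slot1):
    """Execute one two-instruction bundle.

    reads_a, reads_b: two register indices to fetch from each file.
    slot0, slot1: (op, x, y, dest), where x and y index the four fetched
    operands [a0, a1, b0, b1] (this is the crossover) and dest is the
    register written in that slot's own file.
    """
    fetched = [file_a[reads_a[0]], file_a[reads_a[1]],
               file_b[reads_b[0]], file_b[reads_b[1]]]
    # Both ALUs compute from the same fetched operands before any writeback.
    results = [(dest, OPS[op](fetched[x], fetched[y]))
               for op, x, y, dest in (slot0, slot1)]
    # One write per register file: slot 0 to file A, slot 1 to file B.
    file_a[results[0][0]] = results[0][1]
    file_b[results[1][0]] = results[1][1]
```

For example, with file_a = [1, 2, 0, 0] and file_b = [10, 20, 0, 0],
a bundle with slots ("add", 0, 2, 2) and ("sub", 3, 1, 3) computes
a0 + b0 into register 2 of file A and b1 - a1 into register 3 of file B
in the same step, with each ALU taking one operand from each file.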


-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project