On 12/28/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
>
> --- [EMAIL PROTECTED] wrote:
> I am going through s3/ and adding comments for my understanding
> (which is most likely wrong in a few places, but it should help),
> and should be ready by tomorrow afternoon. I will just post it as
> a diff since my SVN account doesn't appear to work any more.

We need to get Attila to help you fix that.

>
> This initial version is very rough, and I am probably going to hit
> myself for some of the questions, but release early, release often
> wins.

I'm looking through it.  Very nice work.  I have some comments.  You
can integrate this, and one of us will check it in.


-    .br2arb_addr(br2arb_addr_out >> 2),
+    .br2arb_addr(br2arb_addr_out >> 2),
+        // Hmm... this seems suspicious.
+        // We are accepting the address of a 64 bit word.
+        // br2arb_addr_out is 27 bits, so we do need to
+        // get rid of two, but shouldn't they be MSBs?

We're getting rid of the lower two bits because there are four memory
controllers, and what we want to do is have each 64-bit word in four
go to a different controller.  When we have a drawing engine, this
will be important due to the typical access patterns.
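The interleave described here can be modeled in a few lines. This is a Python sketch of the routing rule, not the RTL; the function names are mine:

```python
# Sketch of the interleave described above: consecutive 64-bit word
# addresses rotate across the four memory controllers, so a typical
# streaming access pattern keeps all four busy.

NUM_CONTROLLERS = 4

def route(word64_addr):
    """Map a 64-bit-word address to (controller, address within controller)."""
    controller = word64_addr & (NUM_CONTROLLERS - 1)  # lower two bits pick the controller
    local_addr = word64_addr >> 2                     # remaining bits index within it
    return controller, local_addr

# Adjacent words land on different controllers:
for a in range(8):
    print(a, route(a))
```

This is why the lower two bits are the ones stripped: dropping MSBs instead would send large contiguous regions to a single controller.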

However, this raises a serious issue that we have to work out.  Yet
another of my weaknesses:  Getting address bits straight.

The address from PCI is a byte address, [29:0], a 1GiB address space.
We throw away the lower two bits because they don't mean anything for
memory space access, and we have the byte enables.  Issue:  Did we
already throw them away in the XP10?  We need to find out, and we need
to make sure we don't throw them away twice.

Next, now that we, presumably, have an address of 32-bit words, we
want to send it to a memory controller.  It's one of four.  But each
one takes a 64-bit word.  Right now, we cheat and just put our word in
the upper or lower half of the 64-bit word and set the byte enables
accordingly.  That's what goes into the command fifo to the arbiter.
But now, this is an address of a 64-bit word.  We need to throw away
another lower address bit.  Did I forget to do that?

Now we have the address of a 64-bit word.  Each adjacent word goes to a
different memory, so we use the lower two bits of the address to
select which controller, and the address passed into the controller has
those two bits shifted off.  Inside the memory wrapper, we throw away
some UPPER bits because the address space isn't as large as we're
pretending it is.

So we have a 1GiB address space that is [29:0].  For 32-bit words,
that's [29:2] or [27:0].  For 64-bit, that's [29:4] or [25:0].  For
the address going into the arbiter, that's [29:6] or [23:0].  In the
arbiter, we use [22:0].

I'm off by one somewhere.  Where did I go wrong?
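If I'm following the widths right, the slip looks like it's at the 64-bit step: a 64-bit word covers 8 bytes, so the byte address loses three bits there, not four.  A quick Python check of the arithmetic (just the math above, not the RTL):

```python
# Width bookkeeping for the 1 GiB (2^30 byte) address space described above.

BYTE_ADDR_BITS = 30                # byte address [29:0]

word32_bits = BYTE_ADDR_BITS - 2   # 4 bytes/word -> [29:2], 28 bits
word64_bits = BYTE_ADDR_BITS - 3   # 8 bytes/word -> [29:3], 27 bits (not [29:4])
arb_bits    = word64_bits - 2      # 4-way interleave strips two more -> 25 bits

print(word32_bits, word64_bits, arb_bits)  # 28 27 25
```

That would make the address bound for the arbiter [29:5] or [24:0], one bit wider than the [23:0] quoted in the text.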


 // Pipeline Pixels for Output
+// This sends out the out1 signal (output RGB[23:0] pixels) at
+// the correct phase in relation to the out0 signal, right?

I'll have to look at this more closely.  A version of this was tested
in real hardware and worked.  I may, however, have broken it.


             mem_addr <= address[27:1];  // shift correct?
+                                        // No. We are reading in 64 bit words.
+                                        // Looking at s3_top_level, it should be
+                                        // address[28:2]. address is the raw address.

See above.  I can't seem to get it to work out right.


+        address[6:1] <= address[6:1] + 1;   // For a similar reason to above,
+                                            // this should change as well.

Now, _THIS_ is for reads, where we're actually requesting 64-bit
words, but we've been handed a 32-bit word address by the bridge.
Here's how this needs to work (and probably doesn't):  We get a
request for 16 32-bit words from the bridge.  We turn that into 8
requests for 64-bit words into the command fifo.  So we have to
increment the address by 2 each time (which is what that does).  On
the output of the command fifo, we look at the lower 2 address bits
(address of 64-bit, it needs to be) and send to the right arbiter.
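The read-side bookkeeping just described can be sketched like this (a Python model of the intent, with names of my own invention):

```python
# Sketch of the read-side split described above: a 16-beat burst of 32-bit
# words becomes 8 command-fifo entries for 64-bit words. Addresses here are
# 32-bit-word addresses; stepping by 2 advances one 64-bit word, which is
# what the "address[6:1] <= address[6:1] + 1" increment accomplishes.

def burst_to_commands(start_word32_addr, beats=16):
    commands = []
    addr = start_word32_addr
    for _ in range(beats // 2):      # two 32-bit beats per 64-bit word
        commands.append(addr >> 1)   # address of the 64-bit word
        addr += 2                    # advance to the next 64-bit word
    return commands

print(burst_to_commands(0x40))  # 8 consecutive 64-bit-word addresses
```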


 // Dequeue when we're sending the top half
+// This is because we are splitting the 64 bit
+// data word into two 32 bit writes for the bridge.

Exactly.
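The dequeue rule above boils down to this (a Python sketch; which half goes out first is an assumption, not something the thread states):

```python
# Each 64-bit word from memory is returned to the bridge as two 32-bit
# beats; the fifo entry is only popped after the top half has been sent.

def to_bridge_beats(word64):
    low  = word64 & 0xFFFF_FFFF
    high = (word64 >> 32) & 0xFFFF_FFFF
    return [low, high]   # dequeue happens when the top half goes out

print(to_bridge_beats(0xDEADBEEF_01234567))
```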


     // Read end of request queue from bridge
     // This is only a 32-bit interface
+    // No, this is 64 bits. Data to the memory controller is 64 bits wide.

Yes, you're right.  Although for writes we do cheat.  Instead of
putting two 32-bit writes together into 1 64-bit, we turn one 32-bit
into one 64-bit by just setting the byte enables to mask off the
unwritten bytes.  The thing is, we want to keep it a 64-bit interface
for later when we fix this shortcut.  (Although, given the relatively
slow rate of PCI data coming in, we might not.  Then we can save some
LUTs by making the command fifo 32 bits less wide.)
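The write-path shortcut described above looks like this in sketch form (Python model; the byte-enable polarity and lane ordering are assumptions, not the RTL):

```python
# A single 32-bit write is widened to 64 bits, with the eight byte enables
# masking off the half that was never written, instead of pairing two
# 32-bit writes into one real 64-bit write.

def widen_write(word32, in_upper_half):
    if in_upper_half:
        data64 = word32 << 32
        byte_en = 0b1111_0000   # enable only the upper four byte lanes
    else:
        data64 = word32
        byte_en = 0b0000_1111   # enable only the lower four byte lanes
    return data64, byte_en

print(widen_write(0xCAFEBABE, in_upper_half=True))
```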


         end else if (vid1_access) begin
             allow_vid0 <= 1;
+                // Why is vid0 allowed? It isn't accessing.
+                // Won't this mess up cmd_mem below?

You found a bug.


+// What happened to precharging a row -- auto precharge? I assume we are using
+// auto precharge, so doesn't column[9] need to be set?

The memctl_rh stage figures out if we had a row miss.  Then the fsm
does an explicit precharge on the row.  This is all automatic in the
memory controller.  Similarly, the precharge for all banks that must
precede a refresh is also built into the fsm.

The precharge command into the memory controller is actually a
precharge-all that is manually initiated.  It's for setting up the
controller.  When we power up, the memories are off.  We need to wait
until they've been on for some minimum period, then enable clock and
enable chip select (in some order).  Then we do a precharge all, wait,
do an LMR, wait, do the extended LMR, wait, then finally enable
refresh.

This sequence needs to be handled in the software driver, although
some of the support logic is missing from the hardware at the moment!
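The sequence spelled out above, as the ordered list of steps the driver would issue (step names are illustrative, not the real driver API):

```python
# Power-up sequence for the memories, per the description above.

INIT_SEQUENCE = [
    "wait_power_stable",     # memories must be powered for a minimum period
    "enable_clock",
    "enable_chip_select",    # clock/CS enables happen in some order
    "precharge_all",         # the manually initiated precharge-all command
    "wait",
    "load_mode_register",    # LMR
    "wait",
    "extended_lmr",          # extended LMR
    "wait",
    "enable_refresh",        # only now is refresh turned on
]

for step in INIT_SEQUENCE:
    print(step)
```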


 assign bank_mem =
-    ({2{lmr_cmd}} & lmr_data[14:13]) |
+    ({2{lmr_cmd}} & lmr_data[14:13]) | // configure the mode of the bank

I don't know what you mean here.  These bits are routed this way so
that when the driver initiates an LMR, those two bits are asserted on
the bank signals.  IIRC, which of the two LMR commands you want is
selected on the bank pins.  This is just how we decided software would
select.



 assign rtag_mem = {pci_read_cmd, vid0_cmd, vid1_cmd};
+    // This tells the memory_wrapper, and therefore memctl_top, what is being sent.
+    // This is sent through from write to read and lets the arbiter know who requested
+    // data once it finally comes.

When making a read request, we use the same signals into the memory
controller.  It's just that the write data bits are unused.  When it's
a read, we send a tag along with the read address.  When the read data
come out, the tag comes with it.  This way, we know where to send the
data.
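A minimal model of that tag mechanism (Python sketch; the one-hot encoding mirrors {pci_read_cmd, vid0_cmd, vid1_cmd}, and in-order return from the memory controller is an assumption):

```python
# The tag rides along with the read address; when the read data finally
# comes out, the same tag comes with it, telling the arbiter which
# requester the data belongs to.

from collections import deque

TAGS = {0b100: "pci", 0b010: "vid0", 0b001: "vid1"}

in_flight = deque()

def issue_read(addr, tag):
    in_flight.append((addr, tag))    # tag travels with the request

def read_data_returned(data):
    addr, tag = in_flight.popleft()  # assumes reads return in order
    return TAGS[tag], data           # route data back to the requester

issue_read(0x100, 0b010)             # vid0 requests a word
issue_read(0x200, 0b100)             # then pci requests one
print(read_data_returned(0xAAAA))    # ('vid0', 43690)
```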


This is some nice work here, and it's definitely put some bullets on
the to-do list.


-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
