On Saturday 29 December 2007 03:41:38 Timothy Normand Miller wrote:
>
> However, this raises a serious issue that we have to work out.  Yet
> another of my weaknesses:  Getting address bits straight.
>
> The address from PCI is a byte address, [29:0], a 1GB address space.
> We throw away the lower two bits because they don't mean anything for
> memory space access, and we have the byte enables.  Issue:  Did we
> already throw them away in the XP10 already?  We need to find out. 
> We need to make sure we don't throw them away twice.
>
> Next, now that we, presumably, have an address of 32-bit words, we
> want to send it to a memory controller.  It's one of four.  But each
> one takes a 64-bit word.  Right now, we cheat and just put our word
> in the upper or lower half of the 64-bit word and set the byte
> enables accordingly.  That's what goes into the command fifo to the
> arbiter. But now, this is an address of a 64-bit word.  We need to
> throw away another lower address bit.  Did I forget to do that?
>
> Now, we have an address of 64-bit word.  Each adjacent word goes to a
> different memory, so we use the lower two bits of the address to
> select which controller and the address passed into the controller
> has those two bits shifted off.  Inside the memory wrapper, we throw
> away some UPPER bits because the address space isn't as large as
> we're pretending it is.
>
> So we have a 1GiB address space that is [29:0].  For 32-bit words,
> that's [29:2] or [27:0].  For 64-bit, that's [29:4] or [25:0].  For
> the address going into the arbiter, that's [29:6] or [23:0].  In the
> arbiter, we use [22:0].
>
> I'm off by one somewhere.  Where did I go wrong?

On Friday 14 December 2007 21:43:12 Timothy Normand Miller wrote:
>
> (2) Addresses -- Different parts of the design use addresses with
> different numbers of bits because they refer to different word sizes.
> Here's how I think the 1GB memory address space should work out:
>
>     PCI addresses are 32-bit, but we want only [29:0]
>     Stripped of the byte offset, that gives us [29:2] or [27:0]
>         (Address of 32-bit word)
>     For 64-bit words (wanted by the arbiter), that's [26:0]
>     Since there are four memory controllers, we chop off again
>         the lower 2 bits, leaving us with [24:0] into an arbiter.
>     However, the memory space isn't that big, leaving [22:0] to the
>         memories.  (Ignore top two.)

Okay, I'm actually starting to understand more and more of this (this 
hardware stuff is fun! :-)), but the memory subsystem and the addressing 
still have me somewhat mystified (I'm not the only one, it looks like 
:-)). Also, there appear to be discrepancies between the code in SVN and 
the documentation on the mailing list. Below is how I think it works.

Comments please?


We have four pairs of 16Mx16 memory chips. To address the 16M words of 
one chip, you need 24 address bits: 2 bits bank select, 13 bits row 
address, 9 bits column address. The two chips of a pair take the same 
24-bit address, so each address accesses one 32-bit word.
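
As a sanity check on those numbers, here is a small Python sketch of 
the arithmetic. The constant names are mine, not from the SVN code, and 
the totals assume nothing beyond the chip geometry given above.

```python
# Geometry of one 16Mx16 chip, as described above.
BANK_BITS, ROW_BITS, COL_BITS = 2, 13, 9
CHIP_ADDR_BITS = BANK_BITS + ROW_BITS + COL_BITS   # 24

# 2**24 locations is exactly 16M.
assert 1 << CHIP_ADDR_BITS == 16 * 1024 * 1024

# Two x16 chips side by side give one 32-bit word per address.
PAIR_WORD_BYTES = 2 * 16 // 8                      # 4 bytes
PAIR_BYTES = (1 << CHIP_ADDR_BITS) * PAIR_WORD_BYTES
TOTAL_BYTES = 4 * PAIR_BYTES                       # four pairs

print(PAIR_BYTES >> 20, "MiB per pair,", TOTAL_BYTES >> 20, "MiB total")
```

That gives 64MB per pair and 256MB in total, which is the populated 
size used further down.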

Each of these pairs has a memory controller attached to it. To clients 
it looks (for future expandability?) like a 32Mx64 memory. So, it takes 
a 25-bit address to access a 64-bit word (24 bits, minus 1 because a 
64-bit word covers two 32-bit words, plus 2 spare bits at the top). The 
topmost two bits of the address must be zero, since we only actually 
have 8Mx64 installed. The memory controller has to do a two-word burst 
to access 64 bits, since its chip pair only does 32 bits per access.
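
To illustrate the two-word burst, here is a rough Python model of how a 
controller address could map onto chip addresses. The function name is 
made up, and the order of the two beats (low word first) is my 
assumption, not something taken from mem_ctl.v.

```python
def burst_addresses(ctl_addr):
    """Map a 25-bit controller address (64-bit words) to the two
    24-bit chip-pair addresses (32-bit words) of the burst.

    Assumption: the low 32-bit word is transferred first.
    """
    # Only 8Mx64 is installed, so bits 24:23 must be zero.
    if ctl_addr < 0 or ctl_addr >> 23:
        raise ValueError("top two bits of the 25-bit address must be zero")
    # The controller appends one low-order bit to pick each 32-bit half.
    return [(ctl_addr << 1) | beat for beat in (0, 1)]


print(burst_addresses(3))
```

So 64-bit word 3 of a controller would land in 32-bit words 6 and 7 of 
its chip pair.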

Next, we get to the arbiter. Terminology is a bit confusing here: there 
is conceptually one arbiter, but it consists of four instantiations of 
arbiter.v if I understand correctly, each wired to one memory 
controller. The arbiter multiplexes memory access requests from various 
sources. The whole arbiter takes 27-bit addresses (upper two bits zero) 
to access 64-bit data words, and uses the lowest two bits to select a 
memory controller. When it's time for the request to be serviced, it 
passes the upper 25 bits of the address on to that controller.
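
In Python terms, the split I have in mind looks roughly like this. It 
is only a model of the address handling as I understand it, with a 
function name I made up; it is not taken from arbiter.v.

```python
def arbiter_split(addr27):
    """Split a 27-bit arbiter address (64-bit words) into
    (controller select, 25-bit controller address)."""
    if not 0 <= addr27 < (1 << 27):
        raise ValueError("arbiter address must fit in 27 bits")
    controller = addr27 & 0x3   # lowest two bits pick one of four
    ctl_addr = addr27 >> 2      # remaining 25 bits go to that controller
    return controller, ctl_addr


# Adjacent 64-bit words rotate through the four controllers.
for addr in range(6):
    print(addr, arbiter_split(addr))
```

Words 0 to 3 go to controllers 0 to 3 at controller address 0, and word 
4 wraps around to controller 0 at controller address 1.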

The bridge between the FPGAs has 32 physical data lines. Requests for 
memory access across the bridge therefore go 32 bits at a time, 
requiring a 28-bit address with the upper two bits zero.

Finally, if you want to refer to a single byte within this memory space, 
then you need two more bits at the low end, for a 30-bit address and a 
1GB overall memory space of which only the lowest 256MB is populated.
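
The address widths at each level follow from the unit size, and can be 
recomputed with a few lines of Python. The helper and the variable 
names are mine; the sizes are the ones given above.

```python
def addr_bits(units):
    """Address bits needed to index `units` locations (a power of two)."""
    return (units - 1).bit_length()


SPACE_BYTES = 1 << 30       # 1GB address space
POPULATED_BYTES = 1 << 28   # 256MB actually installed
CONTROLLERS = 4

byte_bits = addr_bits(SPACE_BYTES)                          # 30
word32_bits = addr_bits(SPACE_BYTES // 4)                   # 28, bridge
# A 64-bit word is two 32-bit words, so this step drops one bit, not two.
word64_bits = addr_bits(SPACE_BYTES // 8)                   # 27, arbiter
memctl_bits = addr_bits(SPACE_BYTES // 8 // CONTROLLERS)    # 25
used_bits = addr_bits(POPULATED_BYTES // 8 // CONTROLLERS)  # 23

print(byte_bits, word32_bits, word64_bits, memctl_bits, used_bits)
```

The difference between the last two numbers is the pair of upper bits 
that must be zero at every level.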


Summarising, here is a diagram. Each line describes the input address of 
the named module, and how it uses it. For example, the arbiter gets a 
27-bit address, of which it uses the lowest two bits to select one of 
the four memory controllers. The remaining bits are sent to the memory 
controller in the line below.

Note how, going from top to bottom, each module consumes bits from the 
low end of the address. The memory controller receives an 8-bit column 
address; it generates one additional low-order bit itself, so the chips 
get a 9-bit column address.

bits    |31|30|29|28|27|26|25|24|23|22|21|20|19|18|17|16|
--------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
PCI     |00|00|00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
1GB     |  |  |00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Bridge  |  |  |00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Arbiter |  |  |00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Memctl  |  |  |00|00|BB|BB|RR|RR|RR|RR|RR|RR|RR|RR|RR|RR|


bits    |15|14|13|12|11|10|09|08|07|06|05|04|03|02|01|00|
--------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
PCI     |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
1GB     |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Bridge  |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|WW|  |  |
Arbiter |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|MM|MM|  |  |  |
Memctl  |RR|RR|RR|CC|CC|CC|CC|CC|CC|CC|CC|  |  |  |  |  |

XX = some value
00 = zero because we only have 256MB of DRAM
WW = select upper or lower 32-bit word of arbiter output
MM = memory controller select
BB = bank select
RR = row select
CC = column select
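
The same field layout, written out as a Python decoder for a byte 
address. This just mirrors the diagram (so it is only as right as the 
diagram is), and the function and key names are invented for the 
example.

```python
def decode(byte_addr):
    """Split a byte address into the fields of the diagram above."""
    # Bits 29:28 and up must be zero: only 256MB is populated.
    if byte_addr < 0 or byte_addr >> 28:
        raise ValueError("address outside the populated 256MB")
    return {
        "byte":       byte_addr & 0x3,             # bits  1:0
        "word_half":  (byte_addr >> 2) & 0x1,      # bit   2    (WW)
        "controller": (byte_addr >> 3) & 0x3,      # bits  4:3  (MM)
        "column":     (byte_addr >> 5) & 0xFF,     # bits 12:5  (CC)
        "row":        (byte_addr >> 13) & 0x1FFF,  # bits 25:13 (RR)
        "bank":       (byte_addr >> 26) & 0x3,     # bits 27:26 (BB)
    }


print(decode(0x0C))
```

For example, byte address 0x0C comes out as the upper 32-bit half of a 
64-bit word on controller 1, at bank 0, row 0, column 0.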


NOTE:

In [1], Timothy specifies a 64-bit interface with separate read and 
write data lines between the memory controller and the arbiter, but in 
SVN mem_ctl.v, there is a 32-bit interface with combined read and write 
data lines. That matches neither the spec nor the arbiter, so something 
has to be adjusted here. The key question is: who does the 2-word burst 
to convert from a 32-bit to a 64-bit interface? I'd say that is up to 
the memory controller (as I described above); then the arbiter can 
worry about scheduling and not have to do address/datawidth conversion 
as well.

NOTE:

The arbiter currently has

    output [12:0] col_mem,

If I understand correctly and the diagram above is right, then that 12 
should be a 7, since only 8 column bits are passed on. It could also be 
10, 9 or 8, but not 12 I think :-).

Cheers,

Lourens


[1] http://lists.duskglow.com/open-graphics/2007-December/010944.html
