On Saturday 29 December 2007 03:41:38 Timothy Normand Miller wrote:
>
> However, this raises a serious issue that we have to work out. Yet
> another of my weaknesses: Getting address bits straight.
>
> The address from PCI is a byte address, [29:0], a 1GB address space.
> We throw away the lower two bits because they don't mean anything for
> memory space access, and we have the byte enables. Issue: Did we
> already throw them away in the XP10 already? We need to find out.
> We need to make sure we don't throw them away twice.
>
> Next, now that we, presumably, have an address of 32-bit words, we
> want to send it to a memory controller. It's one of four. But each
> one takes a 64-bit word. Right now, we cheat and just put our word
> in the upper or lower half of the 64-bit word and set the byte
> enables accordingly. That's what goes into the command fifo to the
> arbiter. But now, this is an address of a 64-bit word. We need to
> throw away another lower address bit. Did I forget to do that?
>
> Now, we have an address of 64-bit word. Each adjacent word goes to a
> different memory, so we use the lower two bits of the address to
> select which controller and the address passed into the controller
> has those two bits shifted off. Inside the memory wrapper, we throw
> away some UPPER bits because the address space isn't as large as
> we're pretending it is.
>
> So we have a 1GiB address space that is [29:0]. For 32-bit words,
> that's [29:2] or [27:0]. For 64-bit, that's [29:4] or [25:0]. For
> the address going into the arbiter, that's [29:6] or [23:0]. In the
> arbiter, we use [22:0].
>
> I'm off by one somewhere. Where did I go wrong?

On Friday 14 December 2007 21:43:12 Timothy Normand Miller wrote:
>
> (2) Addresses -- Different parts of the design use addresses with
> different numbers of bits because they refer to different word sizes.
> Here's how I think the 1GB memory address space should work out:
>
> PCI addresses are 32-bit, but we want only [29:0]
> Stripped of the byte offset, that gives us [29:2] or [27:0]
> (Address of 32-bit word)
> For 64-bit words (wanted by the arbiter), that's [26:0]
> Since there are four memory controllers, we chop off again
> the lower 2 bits, leaving us with [24:0] into an arbiter.
> However, the memory space isn't that big, leaving [22:0] to the
> memories. (Ignore top two.)
Okay, I'm actually starting to understand more and more of this (this
hardware stuff is fun! :-)), but the memory subsystem and its addressing
still have me somewhat mystified (I'm not the only one, it looks like :-)).
Also, there appear to be discrepancies between the code in SVN and the
documentation on the mailing list. Below is how I think it works.
Comments please?
We have four pairs of 16Mx16 memory chips. To address 16M words, you
need 24 address bits: 2 bits bank select, 13 bits row address, 9 bits
column address. Each pair of chips takes a 24-bit address, through
which you access a 32-bit word (16 bits from each chip).
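
The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch; the constant names are made up for illustration and do not come from the SVN code.

```python
# Hypothetical names; the numbers are the ones given in the text above.
BANK_BITS = 2
ROW_BITS = 13
COL_BITS = 9

chip_addr_bits = BANK_BITS + ROW_BITS + COL_BITS    # address bits per chip
words_per_chip = 1 << chip_addr_bits                # addressable locations
pair_width_bits = 2 * 16                            # two x16 chips side by side
pair_bytes = words_per_chip * pair_width_bits // 8  # bytes per chip pair
total_bytes = 4 * pair_bytes                        # four pairs in total

print(chip_addr_bits, words_per_chip, pair_bytes >> 20, total_bytes >> 20)
```

The last two printed values are in MB, per chip pair and for all four pairs together.
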
Each of these pairs has a memory controller attached to it. To clients
it looks (for future expandability?) like a 32Mx64 memory. So it takes
a (24+2-1=)25-bit address to access a 64-bit word: 24 bits for the
chips, plus 2 spare bits, minus 1 because each 64-bit word covers two
32-bit chip words. The topmost two bits of the address must be zero,
since we only actually have 8Mx64 bits installed. The memory controller
has to do a two-word burst to access 64 bits, since its chip pair only
does 32 bits per access.
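
To make the two-word burst concrete, here is a minimal Python sketch of how one 64-bit word address could map onto two 32-bit chip-pair addresses. The function name is hypothetical, and I am assuming the controller appends its extra bit at the low end of the address; the real mem_ctl.v may do this differently.

```python
MEMCTL_ADDR_BITS = 25   # address width the memory controller presents
INSTALLED_BITS = 23     # 8M 64-bit words actually installed per chip pair

def chip_pair_addresses(memctl_addr):
    """Return the two 24-bit chip-pair addresses for one 64-bit word."""
    if not 0 <= memctl_addr < (1 << MEMCTL_ADDR_BITS):
        raise ValueError("address does not fit in 25 bits")
    if memctl_addr >> INSTALLED_BITS:
        raise ValueError("top two bits must be zero (only 8Mx64 installed)")
    # Assumption: the extra bit generated by the controller is the lowest
    # bit, so the two halves of the 64-bit word sit at adjacent addresses.
    first = (memctl_addr << 1) | 0
    second = (memctl_addr << 1) | 1
    return first, second

print(chip_pair_addresses(5))
```
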
Next, we get to the arbiter. Terminology is a bit confusing here: there
is conceptually one arbiter, but it consists of four instantiations of
arbiter.v if I understand correctly, each wired to one memory
controller. The arbiter multiplexes memory access requests from various
sources. The whole arbiter takes 27-bit addresses (upper two bits zero)
to access 64-bit data words, and uses the lowest two bits to select a
memory controller to pass the upper 25 bits of the address on to when
it's time for the request to be serviced.
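
The split the arbiter performs can be sketched in Python as follows. The function name is hypothetical; only the bit positions come from the description above.

```python
ARBITER_ADDR_BITS = 27  # address of a 64-bit word, as seen by the arbiter

def arbiter_route(arb_addr):
    """Split an arbiter address into (controller index, memctl address)."""
    if not 0 <= arb_addr < (1 << ARBITER_ADDR_BITS):
        raise ValueError("address does not fit in 27 bits")
    controller = arb_addr & 0b11   # lowest two bits pick one of four
    memctl_addr = arb_addr >> 2    # upper 25 bits go to that controller
    return controller, memctl_addr

# Four consecutive 64-bit words land on four different controllers.
for a in range(4):
    print(a, arbiter_route(a))
```
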
The bridge between the FPGAs has 32 physical data lines. Memory
requests across the bridge therefore transfer 32 bits at a time,
requiring a 28-bit address with the upper two bits zero.
Finally, if you want to refer to a single byte within this memory space,
then you need two more bits at the low end, for a 30-bit address and a
1GB overall memory space of which only the lowest 256MB is populated.
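
Counting the address widths layer by layer, as described in the paragraphs above, gives the following. This is a small Python sketch with made-up variable names.

```python
# Hypothetical names; the widths are the ones described above.
BYTE_ADDR_BITS = 30                # 1GB byte address space, [29:0]

bridge_bits = BYTE_ADDR_BITS - 2   # drop the byte offset: 32-bit words
arbiter_bits = bridge_bits - 1     # drop the word select: 64-bit words
memctl_bits = arbiter_bits - 2     # drop the controller select
installed_bits = memctl_bits - 2   # top two bits are always zero

for name, bits in [("bridge", bridge_bits), ("arbiter", arbiter_bits),
                   ("memctl", memctl_bits), ("installed", installed_bits)]:
    print(f"{name:9s} {bits:2d} bits  [{bits - 1}:0]")
```

These widths agree with the [27:0], [26:0], [24:0] and [22:0] in the 14 December message quoted above. Going from 32-bit to 64-bit words drops one bit ([29:3]), not two ([29:4]), which looks like the source of the off-by-one in the 29 December message.
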
Summarising, here is a diagram. Each line describes the input address of
the named module, and how it uses it. For example, the arbiter gets a
27-bit address, of which it uses the lowest two bits to select one of the
four memory controllers. The remaining bits are sent to the memory
controller in the line below.
Note how the address bits are consumed row by row, from the top of the
diagram to the bottom. The memory controller receives an 8-bit column
address; it generates one additional bit itself and appends it to the
end, so the chips get a 9-bit column address.

bits    |31|30|29|28|27|26|25|24|23|22|21|20|19|18|17|16|
--------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
PCI     |00|00|00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
1GB     |  |  |00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Bridge  |  |  |00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Arbiter |  |  |00|00|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Memctl  |  |  |00|00|BB|BB|RR|RR|RR|RR|RR|RR|RR|RR|RR|RR|

bits    |15|14|13|12|11|10|09|08|07|06|05|04|03|02|01|00|
--------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
PCI     |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
1GB     |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|
Bridge  |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|WW|  |  |
Arbiter |XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|XX|MM|MM|  |  |  |
Memctl  |RR|RR|RR|CC|CC|CC|CC|CC|CC|CC|CC|  |  |  |  |  |

XX = some value
00 = zero because we only have 256MB of DRAM
WW = select upper or lower 32-bit word of arbiter output
MM = memory controller select
BB = bank select
RR = row select
CC = column select

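
Read as bit fields of a byte address, the diagram can be sketched in Python like this. The function and field names are hypothetical; the bit positions are taken from the diagram, so if the diagram is wrong, so is this.

```python
def decode_byte_address(byte_addr):
    """Split a byte address into the fields shown in the diagram."""
    if not 0 <= byte_addr < (1 << 28):
        # Bits 29:28 must be zero: only 256MB of the 1GB space is populated.
        raise ValueError("address outside the populated 256MB")
    return {
        "byte":   byte_addr & 0x3,             # bits 1:0, byte within 32-bit word
        "word":   (byte_addr >> 2) & 0x1,      # bit 2, WW: upper or lower word
        "memctl": (byte_addr >> 3) & 0x3,      # bits 4:3, MM: controller select
        "column": (byte_addr >> 5) & 0xFF,     # bits 12:5, CC: 8-bit column
        "row":    (byte_addr >> 13) & 0x1FFF,  # bits 25:13, RR: 13-bit row
        "bank":   (byte_addr >> 26) & 0x3,     # bits 27:26, BB: bank select
    }

print(decode_byte_address(0x0000002C))
```
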
NOTE:
In [1], Timothy specifies a 64-bit interface with separate read and
write data lines between the memory controller and the arbiter, but in
SVN mem_ctl.v, there is a 32-bit interface with combined read and write
data lines. That doesn't match the spec, nor the arbiter, so something
has to be adjusted here. The key question here is who does the 2-word
burst to convert from a 32-bit to a 64-bit interface? I'd say that that
is up to the memory controller (as I described above); then the arbiter
can worry about scheduling and not have to do address/datawidth
conversion as well.
NOTE:
The arbiter currently has
    output [12:0] col_mem,
If I understand correctly and the diagram above is right, then that 12
should be a 7 (an 8-bit column address). It could also be 10, 9 or 8,
but not 12 I think :-).
Cheers,
Lourens
[1] http://lists.duskglow.com/open-graphics/2007-December/010944.html
