I defer to people who are probably more expert (folks at AMD for
instance?), but this is my understanding. It seems there's also some info
here which may be more correct/complete than my explanation that follows:

https://wiki.osdev.org/Memory_Map_(x86)

The lowest 1MB of memory was at one time all the CPU could address, and so
a lot of stuff lives down there. The framebuffer is there, as are some
regions of memory reserved for the BIOS, some space for RAM to be mapped,
etc. There are some regions in this area which can actually serve two
purposes. For instance, if the CPU gets an SMI (system management
interrupt), it will transparently and (almost) unavoidably switch into a
different mode called SMM (system management mode), which is sort of like
real mode. SMM has a special region of memory (SMRAM) which then becomes
visible where the framebuffer was (the chipset handles the magic remapping,
I believe), and in that area the state of the system is saved so it can be
restored when SMM exits. The assumption here was that when you're adjusting
fan speeds, etc., you don't need to update the framebuffer, and in normal
operation the system management window should not be accessible. SMM was
designed to operate behind the OS's back, without its interference. It
turns out that makes it a great dark corner to hide viruses, but that's
getting fairly far from your original question.
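
To make the low-memory picture a bit more concrete, here's a rough Python
sketch of the traditional sub-1MB layout. The boundaries are the
conventional ones (roughly what the osdev page above lists), not gospel;
real firmware moves some of these around:

    # Traditional sub-1MB physical map; exact boundaries vary by machine.
    LOW_MEM_MAP = [
        (0x00000, 0x003FF, "real-mode interrupt vector table"),
        (0x00400, 0x004FF, "BIOS data area"),
        (0x00500, 0x9FFFF, "conventional RAM (EBDA usually near the top)"),
        (0xA0000, 0xBFFFF, "VGA framebuffer window / legacy SMRAM window"),
        (0xC0000, 0xC7FFF, "video BIOS (option ROM)"),
        (0xC8000, 0xEFFFF, "other option ROMs, platform reserved"),
        (0xF0000, 0xFFFFF, "motherboard BIOS ROM"),
    ]

    def describe_low_addr(addr):
        """Return the traditional owner of a sub-1MB physical address."""
        for start, end, what in LOW_MEM_MAP:
            if start <= addr <= end:
                return what
        return "above the low 1MB"

    print(hex(0xB8000), "->", describe_low_addr(0xB8000))  # VGA text buffer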

Then if you bump up to the 32-bit world, you have an (approximately) 4GB
physical address space. With PAE (physical address extension) you get, I
think, up to 36 bits of physical address, but a lot of the layout was done
with 4GB in mind. In that space you have room for RAM starting at around
1MB and extending upwards to, say, 3GB. Above that you'd have a big window
which lets you map in, for instance, 32-bit PCI devices.

Then when you bump up to 64-bit (really more like 48 bits in real systems,
I think), you have more room for RAM again, so more of it goes above the
hole at the top of the 4GB address space.
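
As a toy illustration of that hole, here's a little Python sketch that
builds a fake System RAM map for a machine with a given amount of DRAM.
The 3GB hole base is just an assumption for the example; real chipsets put
the hole (and often a second, smaller one) wherever they like:

    # Hypothetical sketch: carve a 32-bit MMIO hole out of the physical map
    # and push the displaced DRAM above 4GB.
    GiB = 1 << 30

    def build_ram_map(dram_size, hole_base=3 * GiB, hole_top=4 * GiB):
        """Return (start, end) System RAM ranges for dram_size bytes of DRAM."""
        ranges = []
        below = min(dram_size, hole_base)
        if below:
            ranges.append((0x0, below))            # RAM below the MMIO hole
        remaining = dram_size - below
        if remaining:
            ranges.append((hole_top, hole_top + remaining))  # remapped above 4GB
        return ranges

    for start, end in build_ram_map(8 * GiB):
        print(f"{start:#012x}-{end - 1:#012x} : System RAM")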

On top of this, the BIOS/EFI/whatever runs first, and may take regions of
the physical address space which *are* RAM and *are* available to software,
but lie about that to the OS since it needs to keep its own stuff there
for whatever reason. For instance, there are components of EFI, called
runtime services, which continue to exist and to be accessible after the
OS starts. ACPI also lives in memory persistently, and the OS needs to not
stomp on top of it. These regions are generally either not reported to the
OS at all (dangerous), or reported to the OS as a different type of memory
which is reserved for some purpose.
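
For reference, the classic E820 type codes look roughly like this (EFI has
its own, richer set of types, but the idea is the same). The example map
at the bottom is made up:

    # Classic BIOS E820 type codes.
    E820_TYPES = {
        1: "usable RAM",
        2: "reserved (firmware, MMIO, ...)",
        3: "ACPI reclaimable (tables the OS may reuse after parsing)",
        4: "ACPI NVS (must be preserved across sleep states)",
        5: "bad memory",
    }

    def usable_ranges(e820):
        """Filter a (start, length, type) E820-style list down to usable RAM."""
        return [(s, s + l) for s, l, t in e820 if t == 1]

    # A made-up map: low RAM, a reserved firmware region, RAM above 4GB.
    example = [
        (0x00100000, 0xCB2A9000, 1),
        (0xCB3A9000, 0x00002000, 2),
        (0x100000000, 0x700000000, 1),
    ]
    print([(hex(a), hex(b)) for a, b in usable_ranges(example)])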

The way that fixed-address resources like the VGA framebuffer, the legacy
IDE controller interface, etc., work is a bit of a wonky hack on top of how
PCI works, and I don't remember the specifics. Reducing it to a hand-wavy
high level, those interfaces/features respond to the right IO ports and
memory addresses regardless of where they actually live according to PCI.
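
Hand-waving in Python, the routing looks something like this. The exact
ranges a given chipset steals are an assumption here, but 0xA0000-0xBFFFF
and the 0xC0000 ROM shadow are the classic ones (and the latter is exactly
the range Pouya is fighting with below):

    # Sketch: these ranges are decoded by the device/chipset no matter what
    # the PCI BARs say, so a bridge model has to steer them away from DRAM.
    LEGACY_MMIO = [
        (0xA0000, 0xBFFFF, "VGA framebuffer"),
        (0xC0000, 0xDFFFF, "option ROM shadow area, VGA BIOS at 0xC0000"),
    ]

    def route(addr):
        """Decide whether a physical access hits a legacy device or DRAM."""
        for start, end, what in LEGACY_MMIO:
            if start <= addr <= end:
                return f"device ({what})"
        return "memory"

    print(hex(0xC0000), "->", route(0xC0000))
    print(hex(0x100000), "->", route(0x100000))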

All of the above is how software sees RAM, but not necessarily how the
hardware is laid out. The chipset sits between the CPU and the actual banks
of DRAM, and will futz with addresses as they go by so that the actual RAM
sees, for instance, a contiguous range of addresses starting at 0.
Individual parts might even each see addresses as if they started at 0,
even though they clearly can't all be at the same place. The relationship
between the physical addresses visible to software and the addresses the
DRAM or even the memory controller sees can be arbitrarily complicated.
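
Purely as an illustration (this is not how any particular chipset documents
it), you can think of the memory controller as collapsing the holes out of
the CPU's physical address before it ever touches DRAM:

    # Illustrative only: collapse a holey physical map into a contiguous
    # 0-based DRAM address space by subtracting the holes below the address.
    GiB = 1 << 30
    HOLES = [(3 * GiB, 4 * GiB)]   # assumed 1GB MMIO hole below 4GB

    def phys_to_dram(addr):
        """Map a CPU physical address to a 0-based DRAM address (sketch)."""
        offset = 0
        for start, end in HOLES:
            if addr >= end:
                offset += end - start
            elif addr >= start:
                raise ValueError("address falls in an MMIO hole, not DRAM")
        return addr - offset

    print(hex(phys_to_dram(0x100000000)))   # 4GB physical -> 3GB into DRAM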

It's at this other, more basic level that I would expect address
interleaving, banking, blah blah to happen.
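
For example, a toy 2-channel interleave at that level might look like the
sketch below. The 256-byte granularity and the single select bit are
assumptions for the example; real controllers typically hash several
address bits:

    # Toy channel-interleave decode: one bit picks the channel, and that bit
    # is squeezed out to form the address within the channel.
    INTLV_BIT = 8          # 256-byte interleave granularity (assumption)

    def decode(dram_addr):
        channel = (dram_addr >> INTLV_BIT) & 0x1
        low = dram_addr & ((1 << INTLV_BIT) - 1)
        high = dram_addr >> (INTLV_BIT + 1)
        return channel, (high << INTLV_BIT) | low

    print(decode(0x100))   # (1, 0x000): second channel, offset 0
    print(decode(0x200))   # (0, 0x100): back to channel 0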

To throw even more complexity into the picture, there's the A20 address
line, which did not exist on older x86 chips with their 20-bit address
buses. If you wrapped around the end of the address space on those you'd
alias back to the start, but on newer chips which had an A20 line, if you
fell off the top you would actually go to different addresses. The keyboard
controller on x86 was a relatively competent (for the time) microcontroller
which actually had its own internal RAM, etc., and to make this
incompatibility go away, the keyboard controller can optionally mask the
A20 (and only the A20) address line to make those addresses wrap like they
used to. A21 and up are not masked. So you can, on any modern hardware as
far as I know, alias every other MB to its neighbor in the physical address
space. Woe be unto you if you do this, but it should actually do what it
claims to do if you try. gem5 does not implement this "feature". It's also
invisible to the physical DRAM as far as I know.
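
In code, the whole A20 gate boils down to roughly this (a sketch, not gem5
code):

    # When the gate is disabled, bit 20 of every physical address is forced
    # to 0, so 0x100000 aliases to 0x00000 -- the old 8086 wrap-around.
    # Only A20 is masked; A21 and up pass through untouched.
    def apply_a20(addr, a20_enabled):
        return addr if a20_enabled else addr & ~(1 << 20)

    print(hex(apply_a20(0x100000, False)))   # 0x0      (wraps like an 8086)
    print(hex(apply_a20(0x100000, True)))    # 0x100000
    print(hex(apply_a20(0x300000, False)))   # 0x200000 (A21 not masked)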

Gabe

On Thu, Oct 10, 2019 at 12:10 PM Jason Lowe-Power <ja...@lowepower.com>
wrote:

> Let me follow up on this a little.
>
> Does anyone (likely Gabe) have pointers to information about how the
> *physical addresses* are mapped in normal x86 systems.
>
> Here's what a couple of real systems look like when looking at /proc/iomem
> (populated from the E820 entries in the BIOS). I'm omitting all of the IO
> ranges, but they fill in the gaps.
>
> This is a simple Intel desktop system with one NUMA node and 32 GB DRAM (4
> 8GB DIMMs, IIRC). /proc/meminfo says there's 31.3 GB memory total
>
> 00001000-0009c3ff : System RAM    0.000-0.001 (0.001GB)
> 00100000-cb3a8fff : System RAM    0.001-3.175 (3.174GB)
> cb3ab000-dac45fff : System RAM    3.175-3.418 (0.243GB)
> dbdff000-dbdfffff : System RAM    3.436-3.436 (0.000GB)
> 100000000-81dffffff : System RAM    4.000-32.469 (28.469GB)
>
>
> This is an 8 NUMA node AMD system with 512 GB RAM (16 32 GB DIMMs, IIRC).
> /proc/meminfo says there's 503.79 GB memory total
>
> 00001000-000997ff : System RAM    0.000-0.001 (0.001GB)
> 00100000-76daffff : System RAM    0.001-1.857 (1.856GB)
> 77000000-c9d69fff : System RAM    1.859-3.154 (1.294GB)
> c9dda000-c9e90fff : System RAM    3.154-3.155 (0.001GB)
> cacc9000-cbffffff : System RAM    3.169-3.187 (0.019GB)
> 100000000-102f37ffff : System RAM    4.000-64.738 (60.738GB)
> 1030000000-202ff7ffff : System RAM    64.750-128.750 (64.000GB)
> 2030000000-302ff7ffff : System RAM    128.750-192.750 (64.000GB)
> 3030000000-402ff7ffff : System RAM    192.750-256.750 (64.000GB)
> 4030000000-502ff7ffff : System RAM    256.750-320.750 (64.000GB)
> 5030000000-602ff7ffff : System RAM    320.750-384.750 (64.000GB)
> 6030000000-702ff7ffff : System RAM    384.750-448.750 (64.000GB)
> 7030000000-802ff7ffff : System RAM    448.750-512.750 (64.000GB)
>
>
> Our main question comes down to how to get reasonable interleavings at the
> memory controllers across banks/ranks/channels? We've been assuming that
> physical memory starts at 0 and goes to the size of memory, but that's
> clearly not how real (x86) systems are set up. If we were to use the
> addresses above, I don't believe the default interleavings will work
> correctly (I could be wrong here...).
>
> Any ideas on what is going on under the covers? Does the bus controller
> have a level of address "translation" before the memory controllers? Do the
> gaps just mask parts of DRAM?
>
> Right now in gem5's x86 FS configuration, we have one physical memory range
> (AddrRange) for each E820 entry. Would it be more correct to set up the
> E820 regions and physical memory separately and add a translation layer
> somewhere in the memory system (e.g., just before each memory controller)?
>
> Any pointers to documentation on this (OSDev wiki was slightly helpful but
> didn't explain how hardware interleavings work) or other ideas would be
> greatly appreciated!
>
> Cheers,
> Jason
>
> On Thu, Oct 10, 2019 at 11:27 AM Pouya Fotouhi <pfoto...@ucdavis.edu>
> wrote:
>
> > Hi All,
> >
> > I am trying to add a GPU as a PCI device in full system. The kernel
> > expects the shadowed ROM for vga devices to be at 0xc0000, and would
> > attempt to read the ROM from there. However, we start mapping the memory
> > range from address 0x0, and that would result in any accesses to the vga
> > ROM going through the memory and not the device itself.
> >
> > As a workaround, we can check for this particular address range before
> > dispatching the incoming request and send it to pio instead of memory.
> > However, this would be more of a hack, and would not solve the issue for
> > KVM cpu.
> >
> > I was wondering if anyone has faced similar issues, or has any feedback
> > on how to properly handle this. Looking at how this is handled in real
> > systems, it seems like DRAM is often not mapped starting from address
> > 0x0. Other than simplicity, is there any particular reason for mapping
> > DRAM starting from address zero?
> >
> > Best,
> > --
> > Pouya Fotouhi
> > PhD Candidate
> > Department of Electrical and Computer Engineering
> > University of California, Davis