Hey Daniel, Got through the stuff I wanted to get done tonight faster than expected...
On Thu, Jan 19, 2023 at 05:17:33PM -0300, Daniel Henrique Barboza wrote: > Are you testing it by using the command line > you mentioned in the "qemu icicle kit es" thread? > > $(QEMU)/qemu-system-riscv64 \ > -M microchip-icicle-kit \ > -m 2G -smp 5 \ > -kernel $(vmlinux_bin) \ > -dtb $(devkit).dtb \ > -initrd $(initramfs) \ > -display none \ > -serial null \ > -serial stdio Yah, effectively. It's not quite that, but near enough as makes no real difference: qemu-icicle: $(QEMU)/qemu-system-riscv64 -M microchip-icicle-kit \ -m 2G -smp 5 \ -kernel $(vmlinux_bin) \ -dtb $(wrkdir)/riscvpc.dtb \ -initrd $(initramfs) \ -display none -serial null \ -serial stdio \ -D qemu.log -d unimp I just tried to make things somewhat more intelligible for that thread. Also in case it is not obvious, I do work for Microchip. As I mentioned to Alistair at LPC, I/we don't have the cycles at the moment to do anything with QEMU, so the bits of fixes I have sent are things I fixed while debugging other issues etc, mostly in the evenings. Anways, I'll attempt to explain what the craic is here.. On Thu, Jan 19, 2023 at 04:17:24PM -0300, Daniel Henrique Barboza wrote: > The Icicle Kit board works with 2 distinct RAM banks that are separated Ehh, 2 isn't really true. There are 6 possible "windows" into the DDR on MPFS, list here as with their start addresses. 32-bit cached 0x0080000000 64-bit cached 0x1000000000 32-bit non-cached 0x00c0000000 64-bit non-cached 0x1400000000 32-bit WCB 0x00d0000000 64-bit WCB 0x1800000000 These are the "bus" addresses, where the harts think the memory is, but the memory is not actually connected there. There are some runtime configurable registers which determine what addresses these correspond to in the DDR itself. When the QEMU port for MPFS was written, only two of these were in use, the 32-bit and 64-bit non-cached regions. The config (seg) registers were set up so that the 32-bit cached region pointed to 0x0 in DDR and the 64-bit region pointed to 0x3000_0000 in DDR. ⢰⠒⠒⠒⠒⡖⠒⠒⠒⣶⠒0x80000000 ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸⡖⠒⠒⢲⡇ ⡇ 0x40000000 ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⠓⠒⠒⠒⠃ ⡇ <-- 64-bit starts here ⢸⡇ ⢸ ⡇ ⢸⡇ ⢸ ⡇ ⢸⡇ ⢸ ⡇ ⢸⡇ ⢸ ⡇ ⢸⡇ ⢸ ⡇ <-- 32-bit starts at 0x0 ⠘⠓⠒0⠚⠒⠒1⠒⠒⠒0x00000000 (These diagrams are a bit crap, I'm copy pasting them from a TUI tool for visualising these I made for myself. The ~s can be ignored. https://github.com/ConchuOD/memory-aperature-configurator) > by a gap. We have a lower bank with 1GiB size, a gap follows, > then at 64GiB the high memory starts. As you correctly pointed out, that lower region is in fact 1 GiB & hence there is actually an overlapping region of 256 MiB. The Devicetree at this point in time looked like: ddrc_cache_lo: memory@80000000 { device_type = "memory"; reg = <0x0 0x80000000 0x0 0x30000000>; clocks = <&clkcfg CLK_DDRC>; status = "okay"; }; ddrc_cache_hi: memory@1000000000 { device_type = "memory"; reg = <0x10 0x0 0x0 0x40000000>; clocks = <&clkcfg CLK_DDRC>; status = "okay"; }; At some point, it was decided that instead we would use a configuration with ~no memory at 32-bit addresses. I think it was this one here: ⢰⡖⠒⠒⢲⡖⠒⠒⠒⣶⠒0x80000000 ⢸⡇ ⢸⡇ ⣿ ⡇ ⢸⠓⠒⠒⠚⡇ ⡟ ⡇ <-- 32-bit starts here ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ 0x40000000 ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ <-- 64-bit starts at 0x0 ⠘⠒⠒0⠒⠓⠒1⠒⠓⠒0x00000000 Because of how these windows work, the 32-bit cached region was always there, just not used as the Devicetree became: ddrc_cache: memory@1000000000 { device_type = "memory"; reg = <0x10 0x0 0x0 0x76000000>; status = "okay"; }; The remaining bit of memory is being used for some WCB buffers etc & not for the OS itself. This was never upstreamed anywhere AFAIK as it was a workaround. The current Devicetree in Linux & U-Boot corresponds to a configuration like: ⢰⠒⠒⠒⠒⡖⠒⠒⠒⣶⠒0x80000000 ⢸ ⡇ ⣿ ⡇ ⢸ ⡇ ⡟ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸ ⡇ ⡇ ⡇ ⢸⡖⠒⠒⢲⡇ ⡇ 0x40000000 ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⡇ ⡇ ⡇ ⢸⡇ ⢸⡇ ⡇ ⡇ <-- 32- & 64-bit start at 0x0 ⠘⠓⠒0⠚⠓⠒1⠒⠓⠒0x00000000 That DT looks like: ddrc_cache_lo: memory@80000000 { device_type = "memory"; reg = <0x0 0x80000000 0x0 0x40000000>; status = "okay"; }; ddrc_cache_hi: memory@1040000000 { device_type = "memory"; reg = <0x10 0x40000000 0x0 0x40000000>; status = "okay"; }; Each of these changes came as part of an FPGA reference design change & a corresponding compatible change. I believe rtlv2203 was the second configuration & rtlv2210 the third. I can't boot the current configuration in QEMU, probably due to some of the things you point out below. To get it working, I remove the ddrc_cache_hi from my DT and boot with the 32-bit cached memory only. This is what the current changes have broken for me. IMO it is a perfectly valid thing to boot a system using less than the memory it *can* use. I guess you read the other thread in which I stated that the HSS boot that is documented doesn't work with recent HSSes. Ideally, and I am most certainly _not_ expecting anyone to do this, when the HSS writes the "seg" registers during boot to configure the memory layout as per the FPGA bitstream QEMU would configure the memory layout it is emulating to match. Since direct kernel boot is a thing too, I was thinking that for that mode, the config in the dtb should probably be used. I don't know enough about QEMU to know if this is even possible! The other possibility I was thinking of was just relaxing the DDR limit entirely (and ignoring the overlaying) so that QEMU thinks there is 1 GiB at 0x8000_0000 and 16 GiB at 0x10_0000_0000. Again, I've not had the cycles to look into any of this at all nor am I expecting anyone else to - just while I am already typing about this stuff there's no harm in broadcasting the other thoughts I had. > MachineClass::default_ram_size is set to 1.5Gb and machine_init() is > enforcing it as minimal RAM size, meaning that there we'll always have I don't think that this is > at least 512 MiB in the Hi RAM area, and that the FDT will be located > there all the time. All the time? That's odd. I suppose my kernel then remaps the dtb into the memory range it can access, and therefore things keep ticking. I don't think that machine_init() should be enforcing a minimum ram size of 1.5 GiB - although maybe Bin Meng has a reason for that that I don't understand. > riscv_compute_fdt_addr() can't handle this setup because it assumes that > the RAM is always contiguous. It's also returning an uint32_t because > it's enforcing that fdt address is sitting on an area that is addressable > to 32 bit CPUs, but 32 bits won't be enough to point to the Hi area of > the Icicle Kit RAM (and to its FDT itself). > > Create a new function called microchip_compute_fdt_addr() that is able > to deal with all these details that are particular to the Icicle Kit. > Ditch riscv_compute_fdt_addr() and use it instead. > > Signed-off-by: Daniel Henrique Barboza <dbarb...@ventanamicro.com> > --- > hw/riscv/microchip_pfsoc.c | 46 +++++++++++++++++++++++++++++++++++--- > 1 file changed, 43 insertions(+), 3 deletions(-) > > diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c > index dcdbc2cac3..9b829e4d1a 100644 > --- a/hw/riscv/microchip_pfsoc.c > +++ b/hw/riscv/microchip_pfsoc.c > @@ -54,6 +54,8 @@ > #include "sysemu/device_tree.h" > #include "sysemu/sysemu.h" > > +#include <libfdt.h> > + > /* > * The BIOS image used by this machine is called Hart Software Services > (HSS). > * See https://github.com/polarfire-soc/hart-software-services > @@ -513,6 +515,46 @@ static void microchip_pfsoc_soc_register_types(void) > > type_init(microchip_pfsoc_soc_register_types) > > +static hwaddr microchip_compute_fdt_addr(MachineState *ms) > +{ > + const MemMapEntry *memmap = microchip_pfsoc_memmap; > + hwaddr mem_low_size = memmap[MICROCHIP_PFSOC_DRAM_LO].size; > + hwaddr mem_high_size, fdt_base; > + int ret = fdt_pack(ms->fdt); > + int fdtsize; > + > + /* Should only fail if we've built a corrupted tree */ > + g_assert(ret == 0); > + > + fdtsize = fdt_totalsize(ms->fdt); > + if (fdtsize <= 0) { > + error_report("invalid device-tree"); > + exit(1); > + } > + > + /* > + * microchip_icicle_kit_machine_init() does a validation > + * that guarantees that ms->ram_size is always greater > + * than mem_low_size and that mem_high_size will be > + * at least 512MiB. Again, I don't think it should be doing this at all. I see the comment about that size refers to DDR training, but given the overlaying of memory it's entirely possible to train against 64-bit addresses but then boot a kernel using only low memory addresses. Perhaps by default & for booting via the bootloader, but I don't think enforcing this makes sense when the bootloader is not involved. If a dtb is used as the source for the memory layout, requiring memory at high addresses doesn't make sense to me. I have no idea if there is a mechanism for figuring that out though nor am I au fait with how these memory sizes are calculated. It is getting kinda late here, so I am sending this without having investigated any of the detail, sorry. Hopefully that wasn't too deranged and you can at least understand why I have been doing what I have... Thanks, Conor. > + * > + * This also means that our fdt_addr will be based > + * on the starting address of the HI DRAM block. > + */ > + mem_high_size = ms->ram_size - mem_low_size; > + fdt_base = memmap[MICROCHIP_PFSOC_DRAM_HI].base; > + > + /* > + * In theory we could copy riscv_compute_fdt_addr() > + * and put the FDT capped at maximum 3Gb from fdt_base, > + * but fdt_base is set at 0x1000000000 (64GiB). We > + * make the assumption here that the OS is ready to > + * handle the FDT, 2MB aligned, at the very end of > + * the available RAM. > + */ > + return QEMU_ALIGN_DOWN(fdt_base + mem_high_size - fdtsize, 2 * MiB); > +} > + > static void microchip_icicle_kit_machine_init(MachineState *machine) > { > MachineClass *mc = MACHINE_GET_CLASS(machine); > @@ -640,9 +682,7 @@ static void > microchip_icicle_kit_machine_init(MachineState *machine) > "bootargs", machine->kernel_cmdline); > } > > - /* Compute the fdt load address in dram */ > - fdt_load_addr = > riscv_compute_fdt_addr(memmap[MICROCHIP_PFSOC_DRAM_LO].base, > - machine->ram_size, > machine->fdt); > + fdt_load_addr = microchip_compute_fdt_addr(machine); > riscv_load_fdt(fdt_load_addr, machine->fdt); > > /* Load the reset vector */ > -- > 2.39.0 > > >
signature.asc
Description: PGP signature