Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Wu, Fei
On 9/7/2023 11:46 PM, Anup Patel wrote:
> On Tue, Aug 1, 2023 at 4:16 AM Daniel Henrique Barboza
>  wrote:
>>
>>
>>
>> On 7/30/23 22:53, Fei Wu wrote:
>>> riscv virt platform's memory started at 0x80000000 and
>>> straddled the 4GiB boundary. Curiously enough, this choice
>>> of a memory layout will prevent from launching a VM with
>>> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
>>> to identity mapping requirements for the MSI doorbell on x86,
>>> and these (APIC/IOAPIC) live right below 4GiB.
>>>
>>> So just split the RAM range into two portions:
>>> - 1 GiB range from 0x80000000 to 0xc0000000.
>>> - The remainder at 0x100000000
>>>
>>> ...leaving a hole between the ranges.
>>
>> I am afraid this breaks some existing distro setups, like Ubuntu. After this 
>> patch
>> this emulation stopped working:
>>
>> ~/work/qemu/build/qemu-system-riscv64 \
>> -machine virt -nographic -m 8G -smp 8 \
>> -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
>> -drive file=snapshot.img,format=qcow2,if=virtio \
>> -netdev bridge,id=bridge1,br=virbr0 -device 
>> virtio-net-pci,netdev=bridge1
>>
>>
>> This is basically a guest created via the official Canonical tutorial:
>>
>> https://wiki.ubuntu.com/RISC-V/QEMU
>>
>> The error being thrown:
>>
>> =
>>
>> Boot HART ID  : 4
>> Boot HART Domain  : root
>> Boot HART Priv Version: v1.12
>> Boot HART Base ISA: rv64imafdch
>> Boot HART ISA Extensions  : time,sstc
>> Boot HART PMP Count   : 16
>> Boot HART PMP Granularity : 4
>> Boot HART PMP Address Bits: 54
>> Boot HART MHPM Count  : 16
>> Boot HART MIDELEG : 0x1666
>> Boot HART MEDELEG : 0x00f0b509
>>
>>
>> U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +0000)
>>
>> CPU:   
>> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
>> Model: riscv-virtio,qemu
>> DRAM:  Unhandled exception: Store/AMO access fault
>> EPC: 802018b8 RA: 802126a0 TVAL: ff733f90
>>
>> Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)
>>
>>
>> resetting ...
>> System reset not supported on this platform
>> ### ERROR ### Please RESET the board ###
>> QEMU 8.0.90 monitor - type 'help' for more information
>> =
> 
> Can you try again after setting CONFIG_NR_DRAM_BANKS=2 in
> qemu-riscv64_smode_defconfig and qemu-riscv64_spl_defconfig ?
> 
Yes, I made a u-boot patch to change this setting and also to use
fdtdec_setup_mem_size_base_lowest() instead of fdtdec_setup_mem_size_base()
in dram_init(); the latter change is also necessary. The patch has been posted
to the u-boot mailing list but has got no reply yet:
https://lists.denx.de/pipermail/u-boot/2023-September/529729.html

Thanks,
Fei.

> Regards,
> Anup
> 
>>
>>
>> Based on the change made I can make an educated guess on what is going wrong.
>> We have another board with a similar memory topology you're making here, the
>> Microchip Polarfire (microchip_pfsoc.c). We were having some problems with 
>> this
>> board while trying to consolidate the boot process between all boards in
>> hw/riscv/boot.c because of its non-continuous RAM bank. The full story can be
>> read in the commit message of 4b402886ac89 ("hw/riscv: change 
>> riscv_compute_fdt_addr()
>> semantics") but the short version can be seen in riscv_compute_fdt_addr()
>> from boot.c:
>>
>>   - if ram_start is less than 3072 MiB, the FDT will be put at the lowest
>> value between 3072 MiB and the end of that RAM bank;
>>
>> - if ram_start is higher than 3072 MiB, the FDT will be put at the end of
>> the RAM bank.
>>
>> So, after this patch, since riscv_compute_fdt_addr() is being used with the 
>> now
>> lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup 
>> that has
>> more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu and 
>> possibly
>> others that are trying to retrieve the FDT from the gap that you created 
>> between
>> low and hi mem in this patch.
>>
>> In fact, this same Ubuntu guest I mentioned above will boot if I put only 1 
>> Gb of RAM
>> (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a 
>> validation of
>> the guess I'm making here: Ubuntu is trying to fetch stuff (probably the 
>> fdt) from
>> the gap between the memory areas.
>>
>> This change on top of this patch doesn't work either:
>>
>> $ git diff
>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>> index 8fbdc7220c..dfff48d849 100644
>> --- a/hw/riscv/virt.c
>> +++ b/hw/riscv/virt.c
>> @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, 
>> void *data)
>>kernel_start_addr, true, NULL);
>>   }
>>
>> -fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>> +if (machine->ram_size < memmap[VIRT_DRAM].size) {
>> +fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>> 

Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Anup Patel
On Thu, Sep 7, 2023 at 8:48 AM Alistair Francis  wrote:
>
> On Thu, Aug 3, 2023 at 10:47 AM Wu, Fei  wrote:
> >
> > On 8/1/2023 6:46 AM, Daniel Henrique Barboza wrote:
> > >
> > >
> > > On 7/30/23 22:53, Fei Wu wrote:
> > >> riscv virt platform's memory started at 0x80000000 and
> > >> straddled the 4GiB boundary. Curiously enough, this choice
> > >> of a memory layout will prevent from launching a VM with
> > >> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
> > >> to identity mapping requirements for the MSI doorbell on x86,
> > >> and these (APIC/IOAPIC) live right below 4GiB.
> > >>
> > >> So just split the RAM range into two portions:
> > >> - 1 GiB range from 0x80000000 to 0xc0000000.
> > >> - The remainder at 0x100000000
> > >>
> > >> ...leaving a hole between the ranges.
> > >
> > > I am afraid this breaks some existing distro setups, like Ubuntu. After
> > > this patch
> > > this emulation stopped working:
> > >
> > > ~/work/qemu/build/qemu-system-riscv64 \
> > > -machine virt -nographic -m 8G -smp 8 \
> > > -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
> > > -drive file=snapshot.img,format=qcow2,if=virtio \
> > > -netdev bridge,id=bridge1,br=virbr0 -device
> > > virtio-net-pci,netdev=bridge1
> > >
> > >
> > > This is basically a guest created via the official Canonical tutorial:
> > >
> > > https://wiki.ubuntu.com/RISC-V/QEMU
> > >
> > > The error being thrown:
> > >
> > > =
> > >
> > > Boot HART ID  : 4
> > > Boot HART Domain  : root
> > > Boot HART Priv Version: v1.12
> > > Boot HART Base ISA: rv64imafdch
> > > Boot HART ISA Extensions  : time,sstc
> > > Boot HART PMP Count   : 16
> > > Boot HART PMP Granularity : 4
> > > Boot HART PMP Address Bits: 54
> > > Boot HART MHPM Count  : 16
> > > Boot HART MIDELEG : 0x1666
> > > Boot HART MEDELEG : 0x00f0b509
> > >
> > >
> > > U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +0000)
> > >
> > > CPU:
> > > rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
> > > Model: riscv-virtio,qemu
> > > DRAM:  Unhandled exception: Store/AMO access fault
> > > EPC: 802018b8 RA: 802126a0 TVAL: ff733f90
> > >
> > > Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)
> > >
> > >
> > > resetting ...
> > > System reset not supported on this platform
> > > ### ERROR ### Please RESET the board ###
> > > QEMU 8.0.90 monitor - type 'help' for more information
> > > =
> > >
> > >
> > > Based on the change made I can make an educated guess on what is going
> > > wrong.
> > > We have another board with a similar memory topology you're making here,
> > > the
> > > Microchip Polarfire (microchip_pfsoc.c). We were having some problems
> > > with this
> > > board while trying to consolidate the boot process between all boards in
> > > hw/riscv/boot.c because of its non-continuous RAM bank. The full story
> > > can be
> > > read in the commit message of 4b402886ac89 ("hw/riscv: change
> > > riscv_compute_fdt_addr()
> > > semantics") but the short version can be seen in riscv_compute_fdt_addr()
> > > from boot.c:
> > >
> > >  - if ram_start is less than 3072MiB, the FDT will be  put at the lowest
> > > value
> > > between 3072 MiB and the end of that RAM bank;
> > >
> > > - if ram_start is higher than 3072 MiB the FDT will be put at the end of
> > > the
> > > RAM bank.
> > >
> > > So, after this patch, since riscv_compute_fdt_addr() is being used with
> > > the now
> > > lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup
> > > that has
> > > more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu
> > > and possibly
> > > others that are trying to retrieve the FDT from the gap that you created
> > > between
> > > low and hi mem in this patch.
> > >
> > > In fact, this same Ubuntu guest I mentioned above will boot if I put
> > > only 1 Gb of RAM
> > > (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a
> > > validation of
> > > the guess I'm making here: Ubuntu is trying to fetch stuff (probably the
> > > fdt) from
> > > the gap between the memory areas.
> > >
> > > This change on top of this patch doesn't work either:
> > >
> > > $ git diff
> > > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> > > index 8fbdc7220c..dfff48d849 100644
> > > --- a/hw/riscv/virt.c
> > > +++ b/hw/riscv/virt.c
> > > @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier,
> > > void *data)
> > >   kernel_start_addr, true, NULL);
> > >  }
> > >
> > > -fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> > > +if (machine->ram_size < memmap[VIRT_DRAM].size) {
> > > +fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> > > memmap[VIRT_DRAM].size,
> > >

Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Anup Patel
On Tue, Aug 1, 2023 at 4:16 AM Daniel Henrique Barboza
 wrote:
>
>
>
> On 7/30/23 22:53, Fei Wu wrote:
> > riscv virt platform's memory started at 0x80000000 and
> > straddled the 4GiB boundary. Curiously enough, this choice
> > of a memory layout will prevent from launching a VM with
> > a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
> > to identity mapping requirements for the MSI doorbell on x86,
> > and these (APIC/IOAPIC) live right below 4GiB.
> >
> > So just split the RAM range into two portions:
> > - 1 GiB range from 0x80000000 to 0xc0000000.
> > - The remainder at 0x100000000
> >
> > ...leaving a hole between the ranges.
>
> I am afraid this breaks some existing distro setups, like Ubuntu. After this 
> patch
> this emulation stopped working:
>
> ~/work/qemu/build/qemu-system-riscv64 \
> -machine virt -nographic -m 8G -smp 8 \
> -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
> -drive file=snapshot.img,format=qcow2,if=virtio \
> -netdev bridge,id=bridge1,br=virbr0 -device 
> virtio-net-pci,netdev=bridge1
>
>
> This is basically a guest created via the official Canonical tutorial:
>
> https://wiki.ubuntu.com/RISC-V/QEMU
>
> The error being thrown:
>
> =
>
> Boot HART ID  : 4
> Boot HART Domain  : root
> Boot HART Priv Version: v1.12
> Boot HART Base ISA: rv64imafdch
> Boot HART ISA Extensions  : time,sstc
> Boot HART PMP Count   : 16
> Boot HART PMP Granularity : 4
> Boot HART PMP Address Bits: 54
> Boot HART MHPM Count  : 16
> Boot HART MIDELEG : 0x1666
> Boot HART MEDELEG : 0x00f0b509
>
>
> U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +0000)
>
> CPU:   
> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
> Model: riscv-virtio,qemu
> DRAM:  Unhandled exception: Store/AMO access fault
> EPC: 802018b8 RA: 802126a0 TVAL: ff733f90
>
> Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)
>
>
> resetting ...
> System reset not supported on this platform
> ### ERROR ### Please RESET the board ###
> QEMU 8.0.90 monitor - type 'help' for more information
> =

Can you try again after setting CONFIG_NR_DRAM_BANKS=2 in
qemu-riscv64_smode_defconfig and qemu-riscv64_spl_defconfig ?

Regards,
Anup

>
>
> Based on the change made I can make an educated guess on what is going wrong.
> We have another board with a similar memory topology you're making here, the
> Microchip Polarfire (microchip_pfsoc.c). We were having some problems with 
> this
> board while trying to consolidate the boot process between all boards in
> hw/riscv/boot.c because of its non-continuous RAM bank. The full story can be
> read in the commit message of 4b402886ac89 ("hw/riscv: change 
> riscv_compute_fdt_addr()
> semantics") but the short version can be seen in riscv_compute_fdt_addr()
> from boot.c:
>
>   - if ram_start is less than 3072MiB, the FDT will be  put at the lowest 
> value
> between 3072 MiB and the end of that RAM bank;
>
> - if ram_start is higher than 3072 MiB the FDT will be put at the end of the
> RAM bank.
>
> So, after this patch, since riscv_compute_fdt_addr() is being used with the 
> now
> lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup that 
> has
> more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu and 
> possibly
> others that are trying to retrieve the FDT from the gap that you created 
> between
> low and hi mem in this patch.
>
> In fact, this same Ubuntu guest I mentioned above will boot if I put only 1 
> Gb of RAM
> (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a 
> validation of
> the guess I'm making here: Ubuntu is trying to fetch stuff (probably the fdt) 
> from
> the gap between the memory areas.
>
> This change on top of this patch doesn't work either:
>
> $ git diff
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 8fbdc7220c..dfff48d849 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, void 
> *data)
>kernel_start_addr, true, NULL);
>   }
>
> -fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> +if (machine->ram_size < memmap[VIRT_DRAM].size) {
> +fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>  memmap[VIRT_DRAM].size,
>  machine);
> +} else {
> +fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM_HIGH].base,
> +   memmap[VIRT_DRAM_HIGH].size,
> +   machine);
> +}
> +
>
>
> This would put the fdt at the end of the HI RAM for guests with more than 1Gb 
> of RAM.
> This change in fact makes the situation even worse, 

Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Eric Auger
Hi,

On 9/7/23 12:04, Wu, Fei wrote:
> On 9/7/2023 5:10 PM, Eric Auger wrote:
>> Hi,
>>
>> On 9/7/23 09:16, Philippe Mathieu-Daudé wrote:
>>> Widening Cc to ARM/VFIO.
>>>
>>> On 4/8/23 11:15, Wu, Fei wrote:
 On 8/3/2023 11:07 PM, Andrew Jones wrote:
> On Mon, Jul 31, 2023 at 09:53:17AM +0800, Fei Wu wrote:
>> riscv virt platform's memory started at 0x80000000 and
>> straddled the 4GiB boundary. Curiously enough, this choice
>> of a memory layout will prevent from launching a VM with
>> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
>> to identity mapping requirements for the MSI doorbell on x86,
>> and these (APIC/IOAPIC) live right below 4GiB.
>>
>> So just split the RAM range into two portions:
>> - 1 GiB range from 0x80000000 to 0xc0000000.
>> - The remainder at 0x100000000
>>
>> ...leaving a hole between the ranges.
> Can you elaborate on the use case? Maybe provide details of the host
> system and the QEMU command line? I'm wondering why we didn't have
> any problems with the arm virt machine type. Has nobody tried this
> use case with that? Is the use case something valid for riscv, but
> not arm?
>
 Firstly we have to enable pcie passthru on host, find the device groups,
 e.g. the vga card, and add their pci ids to host kernel cmdline:
 vfio-pci.ids=10de:0f02,10de:0e08

 then start vm through qemu as follows:
 $Q -machine virt -m 4G -smp 4 -nographic \
    -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
    -kernel ./vmlinuz -initrd initrd.img -append "root=/dev/vda1 rw" \
    -drive
 file=ubuntu-22.04.1-preinstalled-server-riscv64+unmatched.img,if=virtio,format=raw

 \
    -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1 \
    -netdev user,id=vnet,hostfwd=:127.0.0.1:2223-:22 -device
 virtio-net-pci,netdev=vnet

 Without this patch, qemu exits immediately instead of boots up.

 Just tried pcie passthru on arm, it cannot handle 4G memory either.
 $Q -m 4G -smp 4 -cpu max -M virt -nographic \
    -pflash /usr/share/AAVMF/AAVMF_CODE.fd -pflash flash1.img \
    -drive if=none,file=ubuntu-22.04-server-cloudimg-arm64.img,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1

 qemu-system-aarch64: -device vfio-pci,host=01:00.0: VFIO_MAP_DMA failed:
 Invalid argument
 qemu-system-aarch64: -device vfio-pci,host=01:00.0: vfio 0000:01:00.0:
 failed to setup container for group 11: memory listener initialization
 failed: Region mach-virt.ram: vfio_dma_map(0x55de3c2a97f0, 0x40000000,
 0x100000000, 0x7f8fcbe0) = -22 (Invalid argument)
>> The collision between the x86 host MSI reserved region [0xfee00000,
>> 0xfeefffff] and the ARM guest RAM starting at 1GB has also always
>> existed. But now this collision is properly detected instead of being
>> silenced. People have not really complained about this so far. Since the
>> existing guest RAM layout couldn't be changed, I am afraid we couldn't
>> do much.
>>
> Just as this patch does for riscv, an arm guest on x86 could do the
> same thing and adjust the guest RAM layout if necessary. This looks like
> a nice-to-have feature for arm if there is still a requirement to run
> arm guests on x86.

Well Peter was opposed to any change in the legacy RAM layout. At some
point on ARM VIRT we added some extra RAM but never changed the existing
layout. But actually nobody has ever complained about the lack of support
for this use case (a TCG arm guest with a host-assigned device on an x86 host).

Eric
>
> Thanks,
> Fei.
>
>> Eric
 Thanks,
 Fei.

> Thanks,
> drew





Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Wu, Fei
On 9/7/2023 5:10 PM, Eric Auger wrote:
> Hi,
> 
> On 9/7/23 09:16, Philippe Mathieu-Daudé wrote:
>> Widening Cc to ARM/VFIO.
>>
>> On 4/8/23 11:15, Wu, Fei wrote:
>>> On 8/3/2023 11:07 PM, Andrew Jones wrote:
 On Mon, Jul 31, 2023 at 09:53:17AM +0800, Fei Wu wrote:
> riscv virt platform's memory started at 0x80000000 and
> straddled the 4GiB boundary. Curiously enough, this choice
> of a memory layout will prevent from launching a VM with
> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
> to identity mapping requirements for the MSI doorbell on x86,
> and these (APIC/IOAPIC) live right below 4GiB.
>
> So just split the RAM range into two portions:
> - 1 GiB range from 0x80000000 to 0xc0000000.
> - The remainder at 0x100000000
>
> ...leaving a hole between the ranges.

 Can you elaborate on the use case? Maybe provide details of the host
 system and the QEMU command line? I'm wondering why we didn't have
 any problems with the arm virt machine type. Has nobody tried this
 use case with that? Is the use case something valid for riscv, but
 not arm?

>>> Firstly we have to enable pcie passthru on host, find the device groups,
>>> e.g. the vga card, and add their pci ids to host kernel cmdline:
>>> vfio-pci.ids=10de:0f02,10de:0e08
>>>
>>> then start vm through qemu as follows:
>>> $Q -machine virt -m 4G -smp 4 -nographic \
>>>    -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
>>>    -kernel ./vmlinuz -initrd initrd.img -append "root=/dev/vda1 rw" \
>>>    -drive
>>> file=ubuntu-22.04.1-preinstalled-server-riscv64+unmatched.img,if=virtio,format=raw
>>>
>>> \
>>>    -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1 \
>>>    -netdev user,id=vnet,hostfwd=:127.0.0.1:2223-:22 -device
>>> virtio-net-pci,netdev=vnet
>>>
>>> Without this patch, qemu exits immediately instead of boots up.
>>>
>>> Just tried pcie passthru on arm, it cannot handle 4G memory either.
>>> $Q -m 4G -smp 4 -cpu max -M virt -nographic \
>>>    -pflash /usr/share/AAVMF/AAVMF_CODE.fd -pflash flash1.img \
>>>    -drive if=none,file=ubuntu-22.04-server-cloudimg-arm64.img,id=hd0 \
>>>    -device virtio-blk-device,drive=hd0 \
>>>    -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1
>>>
>>> qemu-system-aarch64: -device vfio-pci,host=01:00.0: VFIO_MAP_DMA failed:
>>> Invalid argument
>>> qemu-system-aarch64: -device vfio-pci,host=01:00.0: vfio 0000:01:00.0:
>>> failed to setup container for group 11: memory listener initialization
>>> failed: Region mach-virt.ram: vfio_dma_map(0x55de3c2a97f0, 0x40000000,
>>> 0x100000000, 0x7f8fcbe0) = -22 (Invalid argument)
> 
> The collision between the x86 host MSI reserved region [0xfee00000,
> 0xfeefffff] and the ARM guest RAM starting at 1GB has also always
> existed. But now this collision is properly detected instead of being
> silenced. People have not really complained about this so far. Since the
> existing guest RAM layout couldn't be changed, I am afraid we couldn't
> do much.
> 
Just as this patch does for riscv, an arm guest on x86 could do the
same thing and adjust the guest RAM layout if necessary. This looks like
a nice-to-have feature for arm if there is still a requirement to run
arm guests on x86.

Thanks,
Fei.

> Eric
>>>
>>> Thanks,
>>> Fei.
>>>
 Thanks,
 drew
>>>
>>>
>>
> 




Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Eric Auger
Hi,

On 9/7/23 09:16, Philippe Mathieu-Daudé wrote:
> Widening Cc to ARM/VFIO.
>
> On 4/8/23 11:15, Wu, Fei wrote:
>> On 8/3/2023 11:07 PM, Andrew Jones wrote:
>>> On Mon, Jul 31, 2023 at 09:53:17AM +0800, Fei Wu wrote:
 riscv virt platform's memory started at 0x80000000 and
 straddled the 4GiB boundary. Curiously enough, this choice
 of a memory layout will prevent from launching a VM with
 a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
 to identity mapping requirements for the MSI doorbell on x86,
 and these (APIC/IOAPIC) live right below 4GiB.

 So just split the RAM range into two portions:
 - 1 GiB range from 0x80000000 to 0xc0000000.
 - The remainder at 0x100000000

 ...leaving a hole between the ranges.
>>>
>>> Can you elaborate on the use case? Maybe provide details of the host
>>> system and the QEMU command line? I'm wondering why we didn't have
>>> any problems with the arm virt machine type. Has nobody tried this
>>> use case with that? Is the use case something valid for riscv, but
>>> not arm?
>>>
>> Firstly we have to enable pcie passthru on host, find the device groups,
>> e.g. the vga card, and add their pci ids to host kernel cmdline:
>> vfio-pci.ids=10de:0f02,10de:0e08
>>
>> then start vm through qemu as follows:
>> $Q -machine virt -m 4G -smp 4 -nographic \
>>    -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
>>    -kernel ./vmlinuz -initrd initrd.img -append "root=/dev/vda1 rw" \
>>    -drive
>> file=ubuntu-22.04.1-preinstalled-server-riscv64+unmatched.img,if=virtio,format=raw
>>
>> \
>>    -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1 \
>>    -netdev user,id=vnet,hostfwd=:127.0.0.1:2223-:22 -device
>> virtio-net-pci,netdev=vnet
>>
>> Without this patch, qemu exits immediately instead of boots up.
>>
>> Just tried pcie passthru on arm, it cannot handle 4G memory either.
>> $Q -m 4G -smp 4 -cpu max -M virt -nographic \
>>    -pflash /usr/share/AAVMF/AAVMF_CODE.fd -pflash flash1.img \
>>    -drive if=none,file=ubuntu-22.04-server-cloudimg-arm64.img,id=hd0 \
>>    -device virtio-blk-device,drive=hd0 \
>>    -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1
>>
>> qemu-system-aarch64: -device vfio-pci,host=01:00.0: VFIO_MAP_DMA failed:
>> Invalid argument
>> qemu-system-aarch64: -device vfio-pci,host=01:00.0: vfio 0000:01:00.0:
>> failed to setup container for group 11: memory listener initialization
>> failed: Region mach-virt.ram: vfio_dma_map(0x55de3c2a97f0, 0x40000000,
>> 0x100000000, 0x7f8fcbe0) = -22 (Invalid argument)

The collision between the x86 host MSI reserved region [0xfee00000,
0xfeefffff] and the ARM guest RAM starting at 1GB has also always
existed. But now this collision is properly detected instead of being
silenced. People have not really complained about this so far. Since the
existing guest RAM layout couldn't be changed, I am afraid we couldn't
do much.

Eric
>>
>> Thanks,
>> Fei.
>>
>>> Thanks,
>>> drew
>>
>>
>




Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Philippe Mathieu-Daudé

Widening Cc to ARM/VFIO.

On 4/8/23 11:15, Wu, Fei wrote:

On 8/3/2023 11:07 PM, Andrew Jones wrote:

On Mon, Jul 31, 2023 at 09:53:17AM +0800, Fei Wu wrote:

riscv virt platform's memory started at 0x80000000 and
straddled the 4GiB boundary. Curiously enough, this choice
of a memory layout will prevent from launching a VM with
a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
to identity mapping requirements for the MSI doorbell on x86,
and these (APIC/IOAPIC) live right below 4GiB.

So just split the RAM range into two portions:
- 1 GiB range from 0x80000000 to 0xc0000000.
- The remainder at 0x100000000

...leaving a hole between the ranges.


Can you elaborate on the use case? Maybe provide details of the host
system and the QEMU command line? I'm wondering why we didn't have
any problems with the arm virt machine type. Has nobody tried this
use case with that? Is the use case something valid for riscv, but
not arm?


Firstly we have to enable pcie passthru on host, find the device groups,
e.g. the vga card, and add their pci ids to host kernel cmdline:
vfio-pci.ids=10de:0f02,10de:0e08

then start vm through qemu as follows:
$Q -machine virt -m 4G -smp 4 -nographic \
   -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
   -kernel ./vmlinuz -initrd initrd.img -append "root=/dev/vda1 rw" \
   -drive
file=ubuntu-22.04.1-preinstalled-server-riscv64+unmatched.img,if=virtio,format=raw
\
   -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1 \
   -netdev user,id=vnet,hostfwd=:127.0.0.1:2223-:22 -device
virtio-net-pci,netdev=vnet

Without this patch, qemu exits immediately instead of boots up.

Just tried pcie passthru on arm, it cannot handle 4G memory either.
$Q -m 4G -smp 4 -cpu max -M virt -nographic \
   -pflash /usr/share/AAVMF/AAVMF_CODE.fd -pflash flash1.img \
   -drive if=none,file=ubuntu-22.04-server-cloudimg-arm64.img,id=hd0 \
   -device virtio-blk-device,drive=hd0 \
   -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1

qemu-system-aarch64: -device vfio-pci,host=01:00.0: VFIO_MAP_DMA failed:
Invalid argument
qemu-system-aarch64: -device vfio-pci,host=01:00.0: vfio 0000:01:00.0:
failed to setup container for group 11: memory listener initialization
failed: Region mach-virt.ram: vfio_dma_map(0x55de3c2a97f0, 0x40000000,
0x100000000, 0x7f8fcbe0) = -22 (Invalid argument)

Thanks,
Fei.


Thanks,
drew








Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-06 Thread Alistair Francis
On Thu, Aug 3, 2023 at 10:47 AM Wu, Fei  wrote:
>
> On 8/1/2023 6:46 AM, Daniel Henrique Barboza wrote:
> >
> >
> > On 7/30/23 22:53, Fei Wu wrote:
> >> riscv virt platform's memory started at 0x80000000 and
> >> straddled the 4GiB boundary. Curiously enough, this choice
> >> of a memory layout will prevent from launching a VM with
> >> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
> >> to identity mapping requirements for the MSI doorbell on x86,
> >> and these (APIC/IOAPIC) live right below 4GiB.
> >>
> >> So just split the RAM range into two portions:
> >> - 1 GiB range from 0x80000000 to 0xc0000000.
> >> - The remainder at 0x100000000
> >>
> >> ...leaving a hole between the ranges.
> >
> > I am afraid this breaks some existing distro setups, like Ubuntu. After
> > this patch
> > this emulation stopped working:
> >
> > ~/work/qemu/build/qemu-system-riscv64 \
> > -machine virt -nographic -m 8G -smp 8 \
> > -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
> > -drive file=snapshot.img,format=qcow2,if=virtio \
> > -netdev bridge,id=bridge1,br=virbr0 -device
> > virtio-net-pci,netdev=bridge1
> >
> >
> > This is basically a guest created via the official Canonical tutorial:
> >
> > https://wiki.ubuntu.com/RISC-V/QEMU
> >
> > The error being thrown:
> >
> > =
> >
> > Boot HART ID  : 4
> > Boot HART Domain  : root
> > Boot HART Priv Version: v1.12
> > Boot HART Base ISA: rv64imafdch
> > Boot HART ISA Extensions  : time,sstc
> > Boot HART PMP Count   : 16
> > Boot HART PMP Granularity : 4
> > Boot HART PMP Address Bits: 54
> > Boot HART MHPM Count  : 16
> > Boot HART MIDELEG : 0x1666
> > Boot HART MEDELEG : 0x00f0b509
> >
> >
> > U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +0000)
> >
> > CPU:
> > rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
> > Model: riscv-virtio,qemu
> > DRAM:  Unhandled exception: Store/AMO access fault
> > EPC: 802018b8 RA: 802126a0 TVAL: ff733f90
> >
> > Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)
> >
> >
> > resetting ...
> > System reset not supported on this platform
> > ### ERROR ### Please RESET the board ###
> > QEMU 8.0.90 monitor - type 'help' for more information
> > =
> >
> >
> > Based on the change made I can make an educated guess on what is going
> > wrong.
> > We have another board with a similar memory topology you're making here,
> > the
> > Microchip Polarfire (microchip_pfsoc.c). We were having some problems
> > with this
> > board while trying to consolidate the boot process between all boards in
> > hw/riscv/boot.c because of its non-continuous RAM bank. The full story
> > can be
> > read in the commit message of 4b402886ac89 ("hw/riscv: change
> > riscv_compute_fdt_addr()
> > semantics") but the short version can be seen in riscv_compute_fdt_addr()
> > from boot.c:
> >
> >  - if ram_start is less than 3072MiB, the FDT will be  put at the lowest
> > value
> > between 3072 MiB and the end of that RAM bank;
> >
> > - if ram_start is higher than 3072 MiB the FDT will be put at the end of
> > the
> > RAM bank.
> >
> > So, after this patch, since riscv_compute_fdt_addr() is being used with
> > the now
> > lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup
> > that has
> > more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu
> > and possibly
> > others that are trying to retrieve the FDT from the gap that you created
> > between
> > low and hi mem in this patch.
> >
> > In fact, this same Ubuntu guest I mentioned above will boot if I put
> > only 1 Gb of RAM
> > (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a
> > validation of
> > the guess I'm making here: Ubuntu is trying to fetch stuff (probably the
> > fdt) from
> > the gap between the memory areas.
> >
> > This change on top of this patch doesn't work either:
> >
> > $ git diff
> > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> > index 8fbdc7220c..dfff48d849 100644
> > --- a/hw/riscv/virt.c
> > +++ b/hw/riscv/virt.c
> > @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier,
> > void *data)
> >   kernel_start_addr, true, NULL);
> >  }
> >
> > -fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> > +if (machine->ram_size < memmap[VIRT_DRAM].size) {
> > +fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> > memmap[VIRT_DRAM].size,
> > machine);
> > +} else {
> > +fdt_load_addr =
> > riscv_compute_fdt_addr(memmap[VIRT_DRAM_HIGH].base,
> > +   memmap[VIRT_DRAM_HIGH].size,
> > +   machine);
> > +}
> > +
> >
> > This would put 

Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-08-04 Thread Wu, Fei
On 8/3/2023 11:07 PM, Andrew Jones wrote:
> On Mon, Jul 31, 2023 at 09:53:17AM +0800, Fei Wu wrote:
>> riscv virt platform's memory started at 0x80000000 and
>> straddled the 4GiB boundary. Curiously enough, this choice
>> of a memory layout will prevent from launching a VM with
>> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
>> to identity mapping requirements for the MSI doorbell on x86,
>> and these (APIC/IOAPIC) live right below 4GiB.
>>
>> So just split the RAM range into two portions:
>> - 1 GiB range from 0x80000000 to 0xc0000000.
>> - The remainder at 0x100000000
>>
>> ...leaving a hole between the ranges.
> 
> Can you elaborate on the use case? Maybe provide details of the host
> system and the QEMU command line? I'm wondering why we didn't have
> any problems with the arm virt machine type. Has nobody tried this
> use case with that? Is the use case something valid for riscv, but
> not arm?
> 
First we have to enable PCIe passthrough on the host: find the devices' IOMMU
groups (e.g. for the VGA card) and add their PCI IDs to the host kernel cmdline:
vfio-pci.ids=10de:0f02,10de:0e08
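The "find the device groups" step can be scripted. Below is a sketch (not from
this thread) that walks the sysfs IOMMU hierarchy and prints every group member
with its vendor:device pair, which is the format vfio-pci.ids= expects; the
root parameter is only there so the walk can be exercised away from a real host:

```python
import glob
import os

def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Yield (group, bdf, vendor:device) for every device in every IOMMU group."""
    for dev in sorted(glob.glob(os.path.join(root, "*", "devices", "*"))):
        group = dev.split(os.sep)[-3]          # .../iommu_groups/<group>/devices/<bdf>
        ids = []
        for name in ("vendor", "device"):
            try:
                with open(os.path.join(dev, name)) as f:
                    ids.append(f.read().strip().replace("0x", ""))
            except OSError:
                ids.append("????")             # non-PCI group member
        yield group, os.path.basename(dev), ":".join(ids)

for group, bdf, pci_id in list_iommu_groups():
    print(f"group {group}: {bdf} [{pci_id}]")
```

Note that every device sharing a group with the one you want to pass through
must be bound to vfio-pci together, which is why the whole group is listed.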

then start the VM through QEMU as follows:
$Q -machine virt -m 4G -smp 4 -nographic \
  -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
  -kernel ./vmlinuz -initrd initrd.img -append "root=/dev/vda1 rw" \
  -drive
file=ubuntu-22.04.1-preinstalled-server-riscv64+unmatched.img,if=virtio,format=raw
\
  -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1 \
  -netdev user,id=vnet,hostfwd=:127.0.0.1:2223-:22 -device
virtio-net-pci,netdev=vnet

Without this patch, QEMU exits immediately instead of booting up.

I just tried PCIe passthrough on arm; it cannot handle 4G of memory either.
$Q -m 4G -smp 4 -cpu max -M virt -nographic \
  -pflash /usr/share/AAVMF/AAVMF_CODE.fd -pflash flash1.img \
  -drive if=none,file=ubuntu-22.04-server-cloudimg-arm64.img,id=hd0 \
  -device virtio-blk-device,drive=hd0 \
  -device vfio-pci,host=01:00.0 -device vfio-pci,host=01:00.1

qemu-system-aarch64: -device vfio-pci,host=01:00.0: VFIO_MAP_DMA failed:
Invalid argument
qemu-system-aarch64: -device vfio-pci,host=01:00.0: vfio :01:00.0:
failed to setup container for group 11: memory listener initialization
failed: Region mach-virt.ram: vfio_dma_map(0x55de3c2a97f0, 0x4000,
0x1, 0x7f8fcbe0) = -22 (Invalid argument)
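Both failures trace back to the same constraint: guest RAM is DMA-mapped 1:1,
so a contiguous bank starting at 0x80000000 eventually collides with IOVA
ranges the host must keep reserved. On x86 that is the MSI doorbell window,
which is why the commit message says "a bit more than 2000MiB" is the limit.
A back-of-the-envelope check (illustrative sketch; the addresses are the virt
machine's DRAM base and the standard x86 APIC window):

```python
DRAM_BASE = 0x80000000       # riscv 'virt' machine DRAM base
MSI_DOORBELL = 0xfee00000    # x86 (IO)APIC MSI window, must stay identity-mapped

def ram_hits_doorbell(ram_size: int) -> bool:
    """True if a contiguous RAM bank at DRAM_BASE would overlap the doorbell."""
    return DRAM_BASE + ram_size > MSI_DOORBELL

# Largest contiguous bank that still fits below the doorbell, in MiB:
limit_mib = (MSI_DOORBELL - DRAM_BASE) // (1 << 20)
```

This lands at 2030 MiB, matching the "a bit more than 2000MiB" figure from the
commit message; splitting RAM around the hole avoids the overlap entirely.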

Thanks,
Fei.

> Thanks,
> drew




Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-08-03 Thread Andrew Jones
On Mon, Jul 31, 2023 at 09:53:17AM +0800, Fei Wu wrote:
> riscv virt platform's memory started at 0x80000000 and
> straddled the 4GiB boundary. Curiously enough, this choice
> of a memory layout will prevent launching a VM with
> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
> to identity mapping requirements for the MSI doorbell on x86,
> and these (APIC/IOAPIC) live right below 4GiB.
> 
> So just split the RAM range into two portions:
> - 1 GiB range from 0x80000000 to 0xc0000000.
> - The remainder at 0x100000000
> 
> ...leaving a hole between the ranges.

Can you elaborate on the use case? Maybe provide details of the host
system and the QEMU command line? I'm wondering why we didn't have
any problems with the arm virt machine type. Has nobody tried this
use case with that? Is the use case something valid for riscv, but
not arm?

Thanks,
drew



Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-08-02 Thread Wu, Fei
On 8/1/2023 6:46 AM, Daniel Henrique Barboza wrote:
> 
> 
> On 7/30/23 22:53, Fei Wu wrote:
>> riscv virt platform's memory started at 0x80000000 and
>> straddled the 4GiB boundary. Curiously enough, this choice
>> of a memory layout will prevent launching a VM with
>> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
>> to identity mapping requirements for the MSI doorbell on x86,
>> and these (APIC/IOAPIC) live right below 4GiB.
>>
>> So just split the RAM range into two portions:
>> - 1 GiB range from 0x80000000 to 0xc0000000.
>> - The remainder at 0x100000000
>>
>> ...leaving a hole between the ranges.
> 
> I am afraid this breaks some existing distro setups, like Ubuntu. After
> this patch
> this emulation stopped working:
> 
> ~/work/qemu/build/qemu-system-riscv64 \
> -machine virt -nographic -m 8G -smp 8 \
> -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
> -drive file=snapshot.img,format=qcow2,if=virtio \
> -netdev bridge,id=bridge1,br=virbr0 -device
> virtio-net-pci,netdev=bridge1
> 
> 
> This is basically a guest created via the official Canonical tutorial:
> 
> https://wiki.ubuntu.com/RISC-V/QEMU
> 
> The error being thrown:
> 
> =
> 
> Boot HART ID  : 4
> Boot HART Domain  : root
> Boot HART Priv Version    : v1.12
> Boot HART Base ISA    : rv64imafdch
> Boot HART ISA Extensions  : time,sstc
> Boot HART PMP Count   : 16
> Boot HART PMP Granularity : 4
> Boot HART PMP Address Bits: 54
> Boot HART MHPM Count  : 16
> Boot HART MIDELEG : 0x1666
> Boot HART MEDELEG : 0x00f0b509
> 
> 
> U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +)
> 
> CPU:  
> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
> Model: riscv-virtio,qemu
> DRAM:  Unhandled exception: Store/AMO access fault
> EPC: 802018b8 RA: 802126a0 TVAL: ff733f90
> 
> Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)
> 
> 
> resetting ...
> System reset not supported on this platform
> ### ERROR ### Please RESET the board ###
> QEMU 8.0.90 monitor - type 'help' for more infor
> =
> 
> 
> Based on the change made I can make an educated guess on what is going
> wrong.
> We have another board with a similar memory topology you're making here,
> the
> Microchip Polarfire (microchip_pfsoc.c). We were having some problems
> with this
> board while trying to consolidate the boot process between all boards in
> hw/riscv/boot.c because of its non-continuous RAM bank. The full story
> can be
> read in the commit message of 4b402886ac89 ("hw/riscv: change
> riscv_compute_fdt_addr()
> semantics") but the short version can be seen in riscv_compute_fdt_addr()
> from boot.c:
> 
>  - if ram_start is less than 3072MiB, the FDT will be  put at the lowest
> value
> between 3072 MiB and the end of that RAM bank;
> 
> - if ram_start is higher than 3072 MiB the FDT will be put at the end of
> the
> RAM bank.
> 
> So, after this patch, since riscv_compute_fdt_addr() is being used with
> the now
> lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup
> that has
> more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu
> and possibly
> others that are trying to retrieve the FDT from the gap that you created
> between
> low and hi mem in this patch.
> 
> In fact, this same Ubuntu guest I mentioned above will boot if I put
> only 1 Gb of RAM
> (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a
> validation of
> the guess I'm making here: Ubuntu is trying to fetch stuff (probably the
> fdt) from
> the gap between the memory areas.
> 
> This change on top of this patch doesn't work either:
> 
> $ git diff
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 8fbdc7220c..dfff48d849 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, void *data)
>                                   kernel_start_addr, true, NULL);
>      }
> 
> -    fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
> +    if (machine->ram_size < memmap[VIRT_DRAM].size) {
> +        fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>                                                 memmap[VIRT_DRAM].size,
>                                                 machine);
> +    } else {
> +        fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM_HIGH].base,
> +                                               memmap[VIRT_DRAM_HIGH].size,
> +                                               machine);
> +    }
> +
>   
> This would put the fdt at the end of the HI RAM for guests with more
> than 1Gb of RAM.
> This change in fact makes the situation even worse, breaking setups that
> were working
> before with this patch.
> 
> There's a chance that reducing the gap between the RAM banks can make
> Ubuntu work
> reliably 


Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-07-31 Thread Daniel Henrique Barboza




On 7/30/23 22:53, Fei Wu wrote:

riscv virt platform's memory started at 0x80000000 and
straddled the 4GiB boundary. Curiously enough, this choice
of a memory layout will prevent launching a VM with
a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
to identity mapping requirements for the MSI doorbell on x86,
and these (APIC/IOAPIC) live right below 4GiB.

So just split the RAM range into two portions:
- 1 GiB range from 0x80000000 to 0xc0000000.
- The remainder at 0x100000000

...leaving a hole between the ranges.


I am afraid this breaks some existing distro setups, like Ubuntu. After this 
patch
this emulation stopped working:

~/work/qemu/build/qemu-system-riscv64 \
-machine virt -nographic -m 8G -smp 8 \
-kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
-drive file=snapshot.img,format=qcow2,if=virtio \
-netdev bridge,id=bridge1,br=virbr0 -device 
virtio-net-pci,netdev=bridge1


This is basically a guest created via the official Canonical tutorial:

https://wiki.ubuntu.com/RISC-V/QEMU

The error being thrown:

=

Boot HART ID  : 4
Boot HART Domain  : root
Boot HART Priv Version: v1.12
Boot HART Base ISA: rv64imafdch
Boot HART ISA Extensions  : time,sstc
Boot HART PMP Count   : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count  : 16
Boot HART MIDELEG : 0x1666
Boot HART MEDELEG : 0x00f0b509


U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +)

CPU:   
rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
Model: riscv-virtio,qemu
DRAM:  Unhandled exception: Store/AMO access fault
EPC: 802018b8 RA: 802126a0 TVAL: ff733f90

Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)


resetting ...
System reset not supported on this platform
### ERROR ### Please RESET the board ###
QEMU 8.0.90 monitor - type 'help' for more infor
=


Based on the change made I can make an educated guess on what is going wrong.
We have another board with a memory topology similar to the one you're making
here, the Microchip PolarFire (microchip_pfsoc.c). We were having some problems
with this board while trying to consolidate the boot process between all boards
in hw/riscv/boot.c because of its non-contiguous RAM banks. The full story can
be read in the commit message of 4b402886ac89 ("hw/riscv: change
riscv_compute_fdt_addr() semantics") but the short version can be seen in
riscv_compute_fdt_addr() from boot.c:

- if ram_start is less than 3072 MiB, the FDT will be put at the lowest value
between 3072 MiB and the end of that RAM bank;

- if ram_start is higher than 3072 MiB, the FDT will be put at the end of that
RAM bank.
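A minimal model of that placement rule (a sketch of the behavior as described
above, not the actual riscv_compute_fdt_addr() code, which also applies
alignment constraints):

```python
MiB = 1024 * 1024

def compute_fdt_addr(dram_base, dram_size, fdt_size):
    """Simplified model of the FDT placement rule described above."""
    threshold = 3072 * MiB
    dram_end = dram_base + dram_size
    if dram_base < threshold:
        # Bank starts below 3072 MiB: FDT goes below min(3072 MiB, end of bank)
        upper = min(threshold, dram_end)
    else:
        # Bank starts above 3072 MiB: FDT goes at the end of the bank
        upper = dram_end
    return upper - fdt_size

# With the split low bank (1 GiB at 0x80000000), the FDT lands just below
# 0xc0000000 -- the very edge of low memory, right next to the new hole.
```

This makes the failure mode concrete: with the patched layout the FDT sits at
the top of the 1 GiB low bank, so any firmware that assumes contiguous RAM
above that point reads straight into the hole.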

So, after this patch, since riscv_compute_fdt_addr() is being used with the now
lower RAM bank, the FDT is being put at LOW_MEM - fdt_size for any setup that
has more than 1 GiB of RAM, and this breaks assumptions made by U-Boot, Ubuntu
and possibly others that are trying to retrieve the FDT from the gap that you
created between low and high memory in this patch.

In fact, this same Ubuntu guest I mentioned above will boot if I give it only
1 GiB of RAM (-m 1G). If I try with -m 1.1G I reproduce this error. This can be
a validation of the guess I'm making here: Ubuntu is trying to fetch stuff
(probably the FDT) from the gap between the memory areas.

This change on top of this patch doesn't work either:

$ git diff
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 8fbdc7220c..dfff48d849 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, void *data)
                                  kernel_start_addr, true, NULL);
     }
 
-    fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
+    if (machine->ram_size < memmap[VIRT_DRAM].size) {
+        fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
                                                memmap[VIRT_DRAM].size,
                                                machine);
+    } else {
+        fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM_HIGH].base,
+                                               memmap[VIRT_DRAM_HIGH].size,
+                                               machine);
+    }
+


This would put the FDT at the end of the high RAM bank for guests with more
than 1 GiB of RAM. This change in fact makes the situation even worse, breaking
setups that were working before this patch.

There's a chance that reducing the gap between the RAM banks can make Ubuntu
work reliably again, but now I've got a bit of cold feet about this change.

I think we'll need some kernel/OpenSBI folks to weigh in here to see if there's
a failsafe memory setup that won't break the distros out there while still
allowing your passthrough to work.



Thanks,

Daniel



Signed-off-by: Andrei Warkentin 
Signed-off-by: 

[PATCH] hw/riscv: split RAM into low and high memory

2023-07-30 Thread Fei Wu
riscv virt platform's memory started at 0x80000000 and
straddled the 4GiB boundary. Curiously enough, this choice
of a memory layout will prevent launching a VM with
a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
to identity mapping requirements for the MSI doorbell on x86,
and these (APIC/IOAPIC) live right below 4GiB.

So just split the RAM range into two portions:
- 1 GiB range from 0x80000000 to 0xc0000000.
- The remainder at 0x100000000

...leaving a hole between the ranges.

Signed-off-by: Andrei Warkentin 
Signed-off-by: Fei Wu 
---
 hw/riscv/virt.c         | 74 ++++++++++++++++++++++++++++++++++++++++------
 include/hw/riscv/virt.h |  1 +
 2 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index d90286dc46..8fbdc7220c 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -75,7 +75,9 @@
 #error "Can't accomodate all IMSIC groups in address space"
 #endif
 
-static const MemMapEntry virt_memmap[] = {
+#define LOW_MEM (1 * GiB)
+
+static MemMapEntry virt_memmap[] = {
 [VIRT_DEBUG] =        {        0x0,         0x100 },
 [VIRT_MROM] =         {     0x1000,        0xf000 },
 [VIRT_TEST] =         {   0x100000,        0x1000 },
@@ -96,6 +98,7 @@ static const MemMapEntry virt_memmap[] = {
 [VIRT_PCIE_ECAM] =    { 0x30000000,    0x10000000 },
 [VIRT_PCIE_MMIO] =    { 0x40000000,    0x40000000 },
 [VIRT_DRAM] =         { 0x80000000,           0x0 },
+[VIRT_DRAM_HIGH] =    { 0x100000000,          0x0 },
 };
 
 /* PCIe high mmio is fixed for RV32 */
@@ -295,15 +298,12 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, int socket,
     }
 }
 
-static void create_fdt_socket_memory(RISCVVirtState *s,
-                                     const MemMapEntry *memmap, int socket)
+static void create_fdt_socket_mem_range(RISCVVirtState *s, uint64_t addr,
+                                        uint64_t size, int socket)
 {
     char *mem_name;
-    uint64_t addr, size;
     MachineState *ms = MACHINE(s);
 
-    addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(ms, socket);
-    size = riscv_socket_mem_size(ms, socket);
     mem_name = g_strdup_printf("/memory@%lx", (long)addr);
     qemu_fdt_add_subnode(ms->fdt, mem_name);
     qemu_fdt_setprop_cells(ms->fdt, mem_name, "reg",
@@ -313,6 +313,34 @@ static void create_fdt_socket_memory(RISCVVirtState *s,
     g_free(mem_name);
 }
 
+static void create_fdt_socket_memory(RISCVVirtState *s,
+                                     const MemMapEntry *memmap, int socket)
+{
+    uint64_t addr, size;
+    MachineState *mc = MACHINE(s);
+    uint64_t sock_offset = riscv_socket_mem_offset(mc, socket);
+    uint64_t sock_size = riscv_socket_mem_size(mc, socket);
+
+    if (sock_offset < memmap[VIRT_DRAM].size) {
+        uint64_t low_mem_end = memmap[VIRT_DRAM].base + memmap[VIRT_DRAM].size;
+
+        addr = memmap[VIRT_DRAM].base + sock_offset;
+        size = MIN(low_mem_end - addr, sock_size);
+        create_fdt_socket_mem_range(s, addr, size, socket);
+
+        size = sock_size - size;
+        if (size > 0) {
+            create_fdt_socket_mem_range(s, memmap[VIRT_DRAM_HIGH].base,
+                                        size, socket);
+        }
+    } else {
+        addr = memmap[VIRT_DRAM_HIGH].base +
+               sock_offset - memmap[VIRT_DRAM].size;
+
+        create_fdt_socket_mem_range(s, addr, sock_size, socket);
+    }
+}
+
 static void create_fdt_socket_clint(RISCVVirtState *s,
                                     const MemMapEntry *memmap, int socket,
                                     uint32_t *intc_phandles)
@@ -1334,10 +1362,12 @@ static void virt_machine_done(Notifier *notifier, void *data)
 
 static void virt_machine_init(MachineState *machine)
 {
-    const MemMapEntry *memmap = virt_memmap;
+    MemMapEntry *memmap = virt_memmap;
     RISCVVirtState *s = RISCV_VIRT_MACHINE(machine);
     MemoryRegion *system_memory = get_system_memory();
     MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
+    MemoryRegion *ram_below_4g, *ram_above_4g;
+    uint64_t ram_size_low, ram_size_high;
     char *soc_name;
     DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
     int i, base_hartid, hart_count;
@@ -1448,6 +1478,17 @@ static void virt_machine_init(MachineState *machine)
         }
     }
 
+    if (machine->ram_size > LOW_MEM) {
+        ram_size_high = machine->ram_size - LOW_MEM;
+        ram_size_low = LOW_MEM;
+    } else {
+        ram_size_high = 0;
+        ram_size_low = machine->ram_size;
+    }
+
+    memmap[VIRT_DRAM].size = ram_size_low;
+    memmap[VIRT_DRAM_HIGH].size = ram_size_high;
+
     if (riscv_is_32bit(&s->soc[0])) {
 #if HOST_LONG_BITS == 64
         /* limit RAM size in a 32-bit system */
@@ -1460,7 +1501,8 @@ static void virt_machine_init(MachineState *machine)
         virt_high_pcie_memmap.size = VIRT32_HIGH_PCIE_MMIO_SIZE;
     } else {
         virt_high_pcie_memmap.size = VIRT64_HIGH_PCIE_MMIO_SIZE;
-
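To summarize the patch above: virt_machine_init() sizes the two DRAM banks with
a simple 1 GiB split. An illustrative Python restatement of that arithmetic
(the base addresses are the virt machine's DRAM bases, assuming the memmap in
the patch):

```python
GiB = 1 << 30
LOW_MEM = 1 * GiB             # mirrors the LOW_MEM define in the patch
DRAM_BASE = 0x80000000        # VIRT_DRAM base
DRAM_HIGH_BASE = 0x100000000  # VIRT_DRAM_HIGH base, above the 4 GiB hole

def split_ram(ram_size):
    """Return (low_size, high_size) as virt_machine_init() computes them."""
    if ram_size > LOW_MEM:
        return LOW_MEM, ram_size - LOW_MEM
    return ram_size, 0

low, high = split_ram(8 * GiB)
# An 8 GiB guest gets a 1 GiB bank at DRAM_BASE and a 7 GiB bank at
# DRAM_HIGH_BASE, with [0xc0000000, 0x100000000) left as a hole.
```

Guests below 1 GiB see a single contiguous bank, which is consistent with
Daniel's observation that -m 1G boots while anything larger trips over the
hole.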