Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 07/11/2013 22:51, Peter Maydell wrote:
>> 1. Not all architectures have the behavior: Address space that is not
>> RAM (and friends) is for sure PCI. Only x86 behaves like this (I think).
>
> More specifically, the x86 pc behaves like this. Other x86 based systems
> could in theory behave differently (not that we actually model any, I think).

After Marcel's patch, we have changed behavior for at least all boards that pass get_system_memory() to pci_register_bus or pci_bus_new:

* mips/gt64xxx_pci.c
* pci-host/bonito.c
* pci-host/ppce500.c
* ppc/ppc4xx_pci.c
* sh4/sh_pci.c

These will no longer go through unassigned_mem_ops, which is a behavioral change for at least the MIPS boards (gt64xxx_pci and bonito). Furthermore, the default behavior of the memory API _is_ read all-ones/ignore writes, so I'm not sure what the benefit of adding a separate region for master abort is...

Paolo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 8 November 2013 08:05, Paolo Bonzini pbonz...@redhat.com wrote:
> After Marcel's patch, we have changed behavior for at least all boards
> that pass get_system_memory() to pci_register_bus or pci_bus_new:
> * mips/gt64xxx_pci.c
> * pci-host/bonito.c
> * pci-host/ppce500.c
> * ppc/ppc4xx_pci.c
> * sh4/sh_pci.c

Oh, right. Ideally those boards should not do that (I fixed the versatile PCI controller not to do that a while back), but it's long-standing behaviour, so it would be better had we not broken it. I think it should in general be fixable by just having those PCI controllers create an empty memory region to pass in, like versatile does:

    memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
    memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
    pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
                        &s->pci_mem_space, &s->pci_io_space,
                        PCI_DEVFN(11, 0), TYPE_PCI_BUS);

That doesn't affect where PCI DMA goes. It might also require mapping an alias into that region at wherever the PCI hole is on those boards. Kind of awkward to do and test at this point in the release cycle, though.

> These will no longer go through unassigned_mem_ops, which is a
> behavioral change for at least the MIPS boards (gt64xxx_pci and bonito).
> Furthermore, the default behavior of the memory API _is_ read
> all-ones/ignore writes, so I'm not sure what the benefit of adding a
> separate region for master abort is...

It gives you a place to set the appropriate PCI controller or device register status bits on an abort caused by a PCI device access, IIRC.

-- PMM
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 08/11/2013 11:44, Peter Maydell wrote:
> Oh, right. Ideally those boards should not do that (I fixed the
> versatile PCI controller not to do that a while back), but it's
> long-standing behaviour, so it would be better had we not broken it.
> I think it should in general be fixable by just having those PCI
> controllers create an empty memory region to pass in, like versatile
> does. [...] It might also require mapping an alias into that region at
> wherever the PCI hole is on those boards. Kind of awkward to do and
> test at this point in the release cycle, though.

Yes, for now we should just revert the patch.

BTW, I tried optimizing MMIO dispatch and managed to save ~30 cycles, so I don't think we should be afraid of increasing the depth of the radix tree. All stuff for 1.8, though.

Paolo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Fri, 2013-11-08 at 09:05 +0100, Paolo Bonzini wrote:
> These will no longer go through unassigned_mem_ops, which is a
> behavioral change for at least the MIPS boards (gt64xxx_pci and bonito).
> Furthermore, the default behavior of the memory API _is_ read
> all-ones/ignore writes, so I'm not sure what the benefit of adding a
> separate region for master abort is...

Actually, as I see it, the default behavior of the system memory region is to use unassigned_mem_ops, whose valid.accepts method returns false (there are no read/write methods). I don't see that read all-ones/ignore writes is implemented anywhere. This was the main reason I submitted this patch; I had *no* clue that it would impact the system so much...

I still think the patch is needed, and guests will benefit from more accurate PCI spec emulation.

Thanks,
Marcel
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Fri, 2013-11-08 at 10:44 +0000, Peter Maydell wrote:
> I think it should in general be fixable by just having those PCI
> controllers create an empty memory region to pass in, like versatile
> does:
>
>     memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
>     memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
>     pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
>                         &s->pci_mem_space, &s->pci_io_space,
>                         PCI_DEVFN(11, 0), TYPE_PCI_BUS);
>
> [...]
>
>> Furthermore, the default behavior of the memory API _is_ read
>> all-ones/ignore writes, so I'm not sure what the benefit of adding a
>> separate region for master abort is...
>
> It gives you a place to set the appropriate PCI controller or device
> register status bits on an abort caused by a PCI device access, IIRC.

Right, thanks for pointing this out. This was indeed the direction of the patches. But even the first patch has meaning by itself, not just as a preparation: as I mentioned in another mail in this thread, read all-ones/ignore writes is not implemented yet, because unassigned_mem_ops has no read/write methods and its valid.accepts returns false.

Thanks,
Marcel
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 07.11.2013 21:27, Jordan Justen wrote:
> On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum marce...@redhat.com wrote:
>> The commit:
>>
>>   Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6
>>   Author: Marcel Apfelbaum marce...@redhat.com
>>   Date:   Mon Sep 16 11:21:16 2013 +0300
>>
>>       hw/pci: partially handle pci master abort
>>
>> introduced a regression on make check:
>
> Laszlo pointed out that my OVMF flash support series was not working
> with QEMU master. It was working with QEMU 1.6.0. I then bisected the
> issue to this commit. It seems this commit regresses -pflash support
> for both KVM and non-KVM modes. Can you reproduce the issue with this
> command?
>
>   x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin
>
> (with or without adding -enable-kvm)

Jordan or Laszlo, can either of you please add a small test case to i440fx-test using -pflash and doing a read in the PCI hole (or wherever exactly) so that we can avoid regressing yet again? :)

Regards,
Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 08/11/2013 16:08, Marcel Apfelbaum wrote:
> Actually, as I see it, the default behavior of the system memory region
> is to use unassigned_mem_ops, whose valid.accepts method returns false
> (there are no read/write methods). I don't see that read all-ones/ignore
> writes is implemented.

Right, it's read-all-zeroes (see unassigned_mem_read). It was read-all-ones for a brief time, then it got changed back to read-all-zeroes.

> This was the main reason I submitted this patch. I had *no* clue that
> it would impact the system so much...

Yeah, it's not your fault. But it should have been held back until 1.8. Do you agree with reverting it?

> I still think the patch is needed, and guests will benefit from more
> accurate PCI spec emulation.

Only buggy guests :) but yes, I agree it's a good thing to have.

Paolo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 11/08/13 16:42, Andreas Färber wrote:
> Jordan or Laszlo, can either of you please add a small test case to
> i440fx-test using -pflash and doing a read in the PCI hole (or wherever
> exactly) so that we can avoid regressing yet again? :)

For -pflash we need a small test file. I'm thinking about creating a 512-byte (one-sector) file and modifying the qemu command line in tests/i440fx-test.c.

I'm not very familiar with external files in tests, though. Can I model it on qdict-test-data.txt? qdict-test-data.txt is located in the root source directory. When configure runs outside the root source directory (= separate build dir), it symlinks it. And the check-qdict.c test program opens it (with fopen()) simply by basename (no path prefix). Can I follow that?

Thanks
Laszlo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Fri, 2013-11-08 at 17:12 +0100, Paolo Bonzini wrote:
> Right, it's read-all-zeroes (see unassigned_mem_read). It was
> read-all-ones for a brief time, then it got changed back to
> read-all-zeroes.

I remember something: Jan's patches got reverted because the modification was system-wide and not only for the PCI address space.

> Yeah, it's not your fault. But it should have been held back until 1.8.
> Do you agree with reverting it?

Given the actual impact on the version's stability, I agree it is the right thing to do.

> Only buggy guests :) but yes, I agree it's a good thing to have.

Yes, it may be a driver there making a wrong decision, and reading all-ones may give it a clue (the same for writes).

Thanks,
Marcel
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 08/11/2013 17:19, Laszlo Ersek wrote:
> I'm not very familiar with external files in tests, though. Can I model
> it on qdict-test-data.txt? qdict-test-data.txt is located in the root
> source directory. When configure runs outside the root source directory
> (= separate build dir), it symlinks it. And the check-qdict.c test
> program opens it (with fopen()) simply by basename (no path prefix).
> Can I follow that?

Yes, tests are always run from the root build directory. Symlinking files is not particularly awesome, but it works. A patch to move qdict-test-data.txt to tests/ is welcome. :)

You can create a temporary file from the qtest as well. That's probably easier; just write 64 kB of 0x00 0x01 0x02 0x03 ... 0xFF 0x00 0x01 ...

Also, please test both -bios and -pflash.

Paolo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 08.11.2013 17:19, Laszlo Ersek wrote:
> For -pflash we need a small test file. I'm thinking about creating a
> 512-byte (one-sector) file and modifying the qemu command line in
> tests/i440fx-test.c.
>
> I'm not very familiar with external files in tests, though. Can I model
> it on qdict-test-data.txt? [...]

I don't have personal experience with using external files there yet, but I was hoping that using -pflash pc-bios/bios.bin would just work, since that'll be symlinked for execution from the build directory, IIUC. My thinking was that the test could then verify that the BIOS does not read as all 0xff, whereas Paolo's suggestion sounds more elaborate, ruling out actual 0xff within SeaBIOS by having a positive pattern to check for.

Regards,
Andreas
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 08/11/2013 18:09, Andreas Färber wrote:
> My thinking was that the test could then verify that the BIOS does not
> read as all 0xff, whereas Paolo's suggestion sounds more elaborate,
> ruling out actual 0xff within SeaBIOS by having a positive pattern to
> check for.

Yeah, that's also a good test, and easier!

Paolo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 11/08/13 18:15, Paolo Bonzini wrote:
>> My thinking was that the test could then verify that the BIOS does not
>> read as all 0xff, whereas Paolo's suggestion sounds more elaborate,
>> ruling out actual 0xff within SeaBIOS by having a positive pattern to
>> check for.
>
> Yeah, that's also a good test, and easier!

Believe it or not, I did think of both 0x00..0xFF (from a small static file) and using an actual BIOS image :) I think I'd prefer 0x00..0xFF (with a temporary file, as Paolo suggested), because "what it is" looks more attractive to me than "what it is not". Also, opening the packaged BIOS binary from the build dir would need extra symlinking again; mkstemp() seems cleaner.

Thanks!
Laszlo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum marce...@redhat.com wrote:
> The commit:
>
>   Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6
>   Author: Marcel Apfelbaum marce...@redhat.com
>   Date:   Mon Sep 16 11:21:16 2013 +0300
>
>       hw/pci: partially handle pci master abort
>
> introduced a regression on make check:

Laszlo pointed out that my OVMF flash support series was not working with QEMU master. It was working with QEMU 1.6.0. I then bisected the issue to this commit. It seems this commit regresses -pflash support for both KVM and non-KVM modes. Can you reproduce the issue with this command?

  x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin

(with or without adding -enable-kvm)

I tried adding this patch (exec: fix regression by making system-memory region UINT64_MAX size) and it did not help the -pflash regression.

-Jordan
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Thu, 2013-11-07 at 12:27 -0800, Jordan Justen wrote:
> Laszlo pointed out that my OVMF flash support series was not working
> with QEMU master. It was working with QEMU 1.6.0. I then bisected the
> issue to this commit. It seems this commit regresses -pflash support
> for both KVM and non-KVM modes. Can you reproduce the issue with this
> command?
>
>   x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin
>
> (with or without adding -enable-kvm)

Thanks, Jordan, for pointing this out. This patch revealed a lot of hidden issues inside qemu... I managed to reproduce the issue. I saw that get_page_addr_code associated address 0xfff0 (after calling io_tlb_to_region) with the new master abort region. It breaks because memory_region_is_unassigned returns true for the new region. It is interesting what happened before this region was added. I will investigate this issue.

Thanks,
Marcel

> I tried adding this patch (exec: fix regression by making system-memory
> region UINT64_MAX size) and it did not help the -pflash regression.
>
> -Jordan
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
This is a QEMU bug report, only disguised as an edk2-devel followup. The problem in a nutshell is that the OVMF binary, placed into pflash (read-only KVM memslot) used to run in qemu-1.6, but it fails to start in qemu-1.7. The OVMF reset vector reads as 0xFF bytes. (Jordan and myself started writing these emails in parallel.) On 11/07/13 21:27, Jordan Justen wrote: On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum marce...@redhat.com wrote: The commit: Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6 Author: Marcel Apfelbaum marce...@redhat.com Date: Mon Sep 16 11:21:16 2013 +0300 hw/pci: partially handle pci master abort introduced a regression on make check: Laszlo pointed out that my OVMF flash support series was not working with QEMU master. It was working with QEMU 1.6.0. I then bisected the issue to this commit. It seems this commit regresses -pflash support for both KVM and non-KVM modes. Can you reproduce the issue this with command? x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin (with or without adding -enable-kvm) I tried adding this patch (exec: fix regression by making system-memory region UINT64_MAX size) and it did not help the -pflash regression. From the edk2-devel discussion: http://thread.gmane.org/gmane.comp.bios.tianocore.devel/4787/focus=4812 On 11/07/13 19:07, Laszlo Ersek wrote: On 11/07/13 17:28, Laszlo Ersek wrote: On 11/06/13 23:29, Jordan Justen wrote: https://github.com/jljusten/edk2.git ovmf-nvvars-v2 This series implements support for QEMU's emulated system flash. This allows for persistent UEFI non-volatile variables. Previously we attempted to emulate non-volatile variables in a few ways, but each of them would fail in particular situations. To support flash based variable storage, we: * Reserve space in the firmware device image * Add a new flash firmware volume block driver * Disable EmuVariableFvbRuntimeDxe (at runtime) if QEMU flash is available. 
To use: * QEMU version 1.1 or newer is required without KVM * KVM support requires Linux 3.7 and QEMU 1.6 * Run QEMU with -pflash OVMF.fd instead of -L or -bios or use OvmfPkg/build.sh --enable-flash qemu ... * If QEMU is 1.6 or newer, then OvmfPkg/build.sh will automatically enable flash when running QEMU, so in that case --enable-flash is not required. See also: * http://wiki.qemu.org/Features/PC_System_Flash v2: * Replace OvmfPkg/AcpiPlatformDxe/Qemu: Allow high runtime memory regions with OvmfPkg/AcpiPlatformDxe/Qemu: Decrease upper limit for PCI window 32 * Separate portions of OvmfPkg/build.sh: Support --enable-flash switch out into a new patch OvmfPkg/build.sh: Enable flash for QEMU = 1.6 * Add OvmfPkg/README: Add information about OVMF flash layout * Update OvmfPkg: Add NV Variable storage within FD to also change the size of PcdVariableStoreSize * Update commit messages on a couple patches for better clarity Tested in the following configurations: (1) RHEL-6 host (no KVM support, no qemu support -- that is, regression test): RHEL-6, Fedora 19, Windows 2008 R2 (needs CSM), Windows 2012 R2 boot tests work OK. (2) 3.10-based host kernel, qemu v1.7.0-rc0, Xeon W3550 host CPU: Unfortunately qemu dies with the following KVM trace: KVM internal error. Suberror: 1 emulation failure EAX= EBX= ECX= EDX=0623 ESI= EDI= EBP= ESP= EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES = 9300 CS =f000 9b00 SS = 9300 DS = 9300 FS = 9300 GS = 9300 LDT= 8200 TR = 8b00 GDT= IDT= CR0=6010 CR2= CR3= CR4= DR0= DR1= DR2= DR3= DR6=0ff0 DR7=0400 EFER= Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff I'm quite unsure, but the CS:IP value seems to point at the reset vector, no? However, the Code=... log only shows 0xFF bytes. 
(3) 3.10.17 host kernel, qemu v1.7.0-rc0, Athlon II X2 B22 host CPU: almost identical KVM error, with the following difference: --- vmx 2013-11-07 17:23:45.612973935 +0100 +++ svm 2013-11-07 17:24:17.237973059 +0100 @@ -1,6 +1,6 @@ KVM internal error. Suberror: 1 emulation failure -EAX= EBX= ECX= EDX=0623 +EAX= EBX= ECX= EDX=0663 ESI= EDI= EBP= ESP= EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Thu, 2013-11-07 at 22:12 +0100, Laszlo Ersek wrote: This is a QEMU bug report, only disguised as an edk2-devel followup. The problem in a nutshell is that the OVMF binary, placed into pflash (read-only KVM memslot) used to run in qemu-1.6, but it fails to start in qemu-1.7. The OVMF reset vector reads as 0xFF bytes. (Jordan and myself started writing these emails in parallel.) On 11/07/13 21:27, Jordan Justen wrote: On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum marce...@redhat.com wrote: The commit: Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6 Author: Marcel Apfelbaum marce...@redhat.com Date: Mon Sep 16 11:21:16 2013 +0300 hw/pci: partially handle pci master abort introduced a regression on make check: Laszlo pointed out that my OVMF flash support series was not working with QEMU master. It was working with QEMU 1.6.0. I then bisected the issue to this commit. It seems this commit regresses -pflash support for both KVM and non-KVM modes. Can you reproduce the issue this with command? x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin (with or without adding -enable-kvm) I tried adding this patch (exec: fix regression by making system-memory region UINT64_MAX size) and it did not help the -pflash regression. From the edk2-devel discussion: http://thread.gmane.org/gmane.comp.bios.tianocore.devel/4787/focus=4812 On 11/07/13 19:07, Laszlo Ersek wrote: On 11/07/13 17:28, Laszlo Ersek wrote: On 11/06/13 23:29, Jordan Justen wrote: https://github.com/jljusten/edk2.git ovmf-nvvars-v2 This series implements support for QEMU's emulated system flash. This allows for persistent UEFI non-volatile variables. Previously we attempted to emulate non-volatile variables in a few ways, but each of them would fail in particular situations. To support flash based variable storage, we: * Reserve space in the firmware device image * Add a new flash firmware volume block driver * Disable EmuVariableFvbRuntimeDxe (at runtime) if QEMU flash is available. 
> [...]
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 07/11/2013 22:24, Marcel Apfelbaum wrote:
> Thank you Laszlo for the detailed info! I think the problem is right
> above. Why do pci-hole and system.flash collide? IMHO we should not
> play with priorities here; better to solve the collision.

We need to audit all the other boards that support PCI... I'll take a look tomorrow, since you guys are off.

Paolo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 11/07/13 22:24, Marcel Apfelbaum wrote:
> On Thu, 2013-11-07 at 22:12 +0100, Laszlo Ersek wrote:
>> adding subregion 'pci-hole' to region 'system' at offset 6000
>> warning: subregion collision 6000/a000 (pci-hole) vs ffe0/20 (system.flash)
>
> Thank you Laszlo for the detailed info! I think the problem is right
> above. Why do pci-hole and system.flash collide? IMHO we should not
> play with priorities here; better to solve the collision.

The call chain is:

  pc_init1()
    pc_memory_init()
      pc_system_firmware_init()
        pc_system_flash_init()      -- sets base address to 0x100000000ULL - flash_size
          pflash_cfi01_register()
            sysbus_mmio_map()
              sysbus_mmio_map_common()
                memory_region_add_subregion()
    i440fx_init()
      memory_region_init_alias(pci-hole)

pc_init1() passes 0x100000000ULL - below_4g_mem_size to i440fx_init() as pci_hole_size, which is then used as the size of the pci-hole alias. We should probably subtract the size of the flash from this, but I don't know how to do that elegantly. Yet another (output) parameter for pc_memory_init()? Blech. Or look up the end address of system.flash by name?

Thanks
Laszlo
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 7 November 2013 21:38, Marcel Apfelbaum marce...@redhat.com wrote:
> Thanks Paolo, let me just point out what I know (or I think I know):
>
> 1. Not all architectures have the behavior: Address space that is not
> RAM (and friends) is for sure PCI. Only x86 behaves like this (I think).

More specifically, the x86 pc behaves like this. Other x86 based systems could in theory behave differently (not that we actually model any, I think).

> That means that you cannot have a 64-bit wide pci-hole with lower
> priority that catches all accesses that are not for RAM (and friends).

...but this conclusion is wrong, because the pci-hole region is created by the pc model. So we should create it at the correct priority and size to give the behaviour relative to other devices in the pc model that is required (i.e. that the hardware has). This doesn't affect any other target architecture or board.

That said, I don't know enough about the PC to know the exact details of the pci-hole, so I'm not making a statement about what the correct model is. I'm just saying that what you do with the pci-hole and the container it lives in and the other devices in that container is not going to change the behaviour of any other target board.

> 2. If the above is right, and making pci-hole 64-bit wide is not an
> option, playing with pci-hole/other-region priorities would be just
> wrong; it would only be to fight with the locality of the memory
> region's priority.

I have no idea what you mean by "fighting" here. MR priorities apply only within a specific container region[*], to set which of that container's children appears 'above' another. They're totally local to the container (which in this case is part of the PC model, not the generic PCI code), and so the PC model can freely set them to whatever makes most sense.

[*] if you didn't already know this, see the recently committed updates to docs/memory.txt for a more detailed explanation.

-- PMM
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Thu, 2013-11-07 at 21:51 +0000, Peter Maydell wrote:
> On 7 November 2013 21:38, Marcel Apfelbaum marce...@redhat.com wrote:
>> Thanks Paolo,
>> Let me just point out what I know (or I think I know):
>> 1. Not all architectures have the behavior: address space that is
>>    not RAM (and friends) is for sure PCI. Only x86 behaves like this
>>    (I think).
>
> More specifically, the x86 pc behaves like this. Other x86 based
> systems could in theory behave differently (not that we actually
> model any, I think).
>
>> That means that you cannot have a 64-bit wide pci-hole with lower
>> priority that catches all accesses that are not for RAM (and
>> friends).
>
> ...but this conclusion is wrong, because the pci-hole region is
> created by the pc model. So we should create

Yes... I think you are right. I was thinking for some reason of the
master-abort region as belonging to the PCI bus, which is not specific
to the x86 pc...

> it at the correct priority and size to give the behaviour relative to
> other devices in the pc model that is required (ie that the hardware
> has). This doesn't affect any other target architecture or board.
>
> That said, I don't know enough about the PC to know what the exact
> details of the pci-hole are, so I'm not making a statement about what
> the correct model is. I'm just saying that what you do with the
> pci-hole and the container it lives in and the other devices in that
> container is not going to change the behaviour of any other target
> board.
>
>> 2. If the above is right, and making the pci-hole 64 bits wide is
>>    not an option, playing with pci-hole/other-region priorities
>>    would be just wrong; it would only be fighting the locality of
>>    the memory region's priority.
>
> I have no idea what you mean by fighting here. MR priorities

By "fighting" I was referring exactly to the fact that, because
priorities are container-specific, we need to use them twice to get the
wanted behavior (master-abort having the lowest priority):
1. master-abort region priority for the pci address space
2. pci-hole priority for the system address space
But this is as designed...
Thanks,
Marcel

> apply only within a specific container region[*], to set which of
> that container's children appears 'above' another. They're totally
> local to the container (which in this case is part of the PC model,
> not the generic PCI code) and so the PC model can freely set them to
> whatever makes most sense.
>
> [*] if you didn't already know this, see the recently committed
> updates to docs/memory.txt for a more detailed explanation.
>
> -- PMM
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Thu, 2013-11-07 at 22:48 +0100, Laszlo Ersek wrote:
> On 11/07/13 22:24, Marcel Apfelbaum wrote:
>> On Thu, 2013-11-07 at 22:12 +0100, Laszlo Ersek wrote:
>>> adding subregion 'pci-hole' to region 'system' at offset 6000
>>> warning: subregion collision 6000/a000 (pci-hole) vs ffe0/20
>>> (system.flash)
>>
>> Thank you Laszlo for the detailed info!
>> I think the problem is right above. Why do pci-hole and system.flash
>> collide? IMHO we should not play with priorities here, better solve
>> the collision.
>
> pc_init1()
>   pc_memory_init()
>     pc_system_firmware_init()
>       pc_system_flash_init()
>         sets base address to (0x1ULL << 32) - flash_size
>         pflash_cfi01_register()
>           sysbus_mmio_map()
>             sysbus_mmio_map_common()
>               memory_region_add_subregion()
>   i440fx_init()
>     memory_region_init_alias(pci-hole)
>
> pc_init1() passes (0x1ULL << 32) - below_4g_mem_size to i440fx_init()
> as pci_hole_size, which is then used as the size of the pci-hole
> alias. We should probably subtract the size of the flash from this,
> but I don't know how to do that elegantly. Yet another (output)
> parameter for pc_memory_init()? Blech. Or look up the end address of
> system.flash by name?

Both seem ugly to me... As Peter Maydell pointed out, the pci-hole
belongs to the pc model, so we can actually change its priority without
affecting other architectures.

> Thanks
> Laszlo
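[Editor's note] The collision Laszlo traces can be sketched numerically. The figures below are illustrative assumptions (1.5 GiB of RAM below 4 GiB, a 2 MiB flash), not values taken from the report: the pci-hole alias spans from the end of low RAM all the way to 4 GiB, so unless flash_size is subtracted from its length it necessarily overlaps a flash mapped just below the 4 GiB boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open interval overlap check: [a, a+alen) vs [b, b+blen). */
static int overlaps(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Illustrative layout (assumed numbers, not from the warning above):
 * 1.5 GiB of RAM below 4 GiB and a 2 MiB flash mapped right below the
 * 4 GiB boundary.  The pci-hole covers [end of low RAM, 4 GiB). */
static const uint64_t flash_size     = 0x200000ULL;                 /* 2 MiB  */
static const uint64_t pci_hole_start = 0x60000000ULL;               /* 1.5 GiB */
static const uint64_t pci_hole_size  = (1ULL << 32) - 0x60000000ULL;
static const uint64_t flash_base     = (1ULL << 32) - 0x200000ULL;
```

Shrinking pci_hole_size by flash_size (Laszlo's "subtract" option) removes the overlap; the awkward part he points out is purely the plumbing of who knows flash_size at i440fx_init() time.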
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Thu, 2013-11-07 at 22:31 +0100, Paolo Bonzini wrote:
> Il 07/11/2013 22:24, Marcel Apfelbaum ha scritto:
>> Thank you Laszlo for the detailed info!
>> I think the problem is right above. Why do pci-hole and system.flash
>> collide? IMHO we should not play with priorities here, better solve
>> the collision.
>
> We need to audit all the other boards that support PCI... I'll take a
> look tomorrow since you guys are off.
>
> Paolo

Thanks Paolo,
Let me just point out what I know (or I think I know):

1. Not all architectures have the behavior: address space that is not
   RAM (and friends) is for sure PCI. Only x86 behaves like this
   (I think). That means that you cannot have a 64-bit wide pci-hole
   with lower priority that catches all accesses that are not for RAM
   (and friends).

2. If the above is right, and making the pci-hole 64 bits wide is not
   an option, playing with pci-hole/other-region priorities would be
   just wrong; it would only be fighting the locality of the memory
   region's priority.

That being said, I am looking forward to your findings.
Thanks!
Marcel
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Mon, 2013-11-04 at 08:18 +0200, Michael S. Tsirkin wrote:
> On Sun, Nov 03, 2013 at 09:26:06PM +0000, Peter Maydell wrote:
>> On 3 November 2013 20:48, Marcel Apfelbaum marce...@redhat.com wrote:
>>> The problem appears when a root memory region within an address
>>> space with size UINT64_MAX has overlapping children with the same
>>> size. If the size of the root memory region is UINT64_MAX
>>> everything is ok.
>>> Solved the regression by making the system-memory region of size
>>> UINT64_MAX instead of INT64_MAX.
>>>
>>> Signed-off-by: Marcel Apfelbaum marce...@redhat.com
>>> ---
>>> In the mean time I am investigating why the root memory region has
>>> to be UINT64_MAX size in order to have overlapping children
>>>
>>>      system_memory = g_malloc(sizeof(*system_memory));
>>> -    memory_region_init(system_memory, NULL, "system", INT64_MAX);
>>> +    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>>>      address_space_init(&address_space_memory, system_memory, "memory");
>>
>> As you say above we should investigate why this caused a problem,
>> but I was surprised the system memory space isn't already maximum
>> size. It turns out that that change was introduced in commit
>> 8417cebf in an attempt to avoid overflow issues by sticking to
>> signed 64 bit arithmetic. This approach was subsequently ditched in
>> favour of using proper 128 bit arithmetic in commit 08dafab4, but we
>> never changed the init call for the system memory back to
>> UINT64_MAX. So I think this is a good change in itself.
>>
>> -- PMM
>
> I think I debugged it. So this patch seems to help simply because we
> only have sanity checking asserts in the subpage path. UINT64_MAX
> will make the region a number of full pages and avoid hitting the
> checks.
>
> I think I see what the issue is: exec.c assumes that
> TARGET_PHYS_ADDR_SPACE_BITS is enough to render any section in system
> memory: the number of page table levels is calculated from that:
>
>     #define P_L2_LEVELS \
>         (((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / L2_BITS) + 1)
>
> any other bits are simply ignored:
>
>     for (i = P_L2_LEVELS - 1; i >= 0 && !lp.is_leaf; i--) {
>         if (lp.ptr == PHYS_MAP_NODE_NIL) {
>             return &sections[PHYS_SECTION_UNASSIGNED];
>         }
>         p = nodes[lp.ptr];
>         lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
>     }
>
> so the mask by L2_SIZE - 1 means that each round looks at L2_BITS
> bits, and there are at most P_L2_LEVELS. Any other bits are simply
> ignored.

Michael, thanks for helping to debug this issue.
Let me see if I got it right: if the system memory size is INT64_MAX
(0x7fffffffffffffff), the address of the last page (0x7...) has more
bits (55) than TARGET_PHYS_ADDR_SPACE_BITS (52) and cannot be
correctly mapped into page levels?

Thanks,
Marcel

> This is very wrong and can break in a number of other ways, for
> example I think we will also hit this assert if we have a non-aligned
> 64-bit BAR of a PCI device. I think the fastest solution is to just
> limit system memory size of TARGET_PAGE_BITS. I sent a patch like
> this.
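[Editor's note] The level arithmetic discussed above can be checked directly. TARGET_PHYS_ADDR_SPACE_BITS = 52 is from the thread; TARGET_PAGE_BITS = 12 (the usual 4 KiB pages) and L2_BITS = 10 are assumptions about the exec.c of that era:

```c
#include <assert.h>
#include <stdint.h>

enum {
    TARGET_PAGE_BITS = 12,            /* assumed: 4 KiB pages */
    TARGET_PHYS_ADDR_SPACE_BITS = 52, /* from the thread */
    L2_BITS = 10                      /* assumed value */
};

/* The macro quoted from exec.c. */
#define P_L2_LEVELS \
    (((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / L2_BITS) + 1)

/* Page-index bits the radix tree can distinguish before the
 * L2_SIZE - 1 mask throws the rest away. */
static int covered_index_bits(void)
{
    return P_L2_LEVELS * L2_BITS;
}

/* Significant bits in the page index of a given last-byte address. */
static int index_bits(uint64_t last_addr)
{
    uint64_t idx = last_addr >> TARGET_PAGE_BITS;
    int bits = 0;
    while (idx) {
        bits++;
        idx >>= 1;
    }
    return bits;
}
```

With these constants the walk resolves 4 levels of 10 bits = 40 page-index bits, while the last page of an INT64_MAX-sized region needs 51; the surplus is silently masked off, which is what the subpage sanity asserts eventually trip over.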
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Mon, Nov 04, 2013 at 11:33:56AM +0200, Marcel Apfelbaum wrote:
> On Mon, 2013-11-04 at 08:18 +0200, Michael S. Tsirkin wrote:
>> On Sun, Nov 03, 2013 at 09:26:06PM +0000, Peter Maydell wrote:
>>> On 3 November 2013 20:48, Marcel Apfelbaum marce...@redhat.com wrote:
>>>> The problem appears when a root memory region within an address
>>>> space with size UINT64_MAX has overlapping children with the same
>>>> size. If the size of the root memory region is UINT64_MAX
>>>> everything is ok.
>>>> Solved the regression by making the system-memory region of size
>>>> UINT64_MAX instead of INT64_MAX.
>>>>
>>>> Signed-off-by: Marcel Apfelbaum marce...@redhat.com
>>>> ---
>>>> In the mean time I am investigating why the root memory region has
>>>> to be UINT64_MAX size in order to have overlapping children
>>>>
>>>>      system_memory = g_malloc(sizeof(*system_memory));
>>>> -    memory_region_init(system_memory, NULL, "system", INT64_MAX);
>>>> +    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>>>>      address_space_init(&address_space_memory, system_memory, "memory");
>>>
>>> As you say above we should investigate why this caused a problem,
>>> but I was surprised the system memory space isn't already maximum
>>> size. It turns out that that change was introduced in commit
>>> 8417cebf in an attempt to avoid overflow issues by sticking to
>>> signed 64 bit arithmetic. This approach was subsequently ditched in
>>> favour of using proper 128 bit arithmetic in commit 08dafab4, but
>>> we never changed the init call for the system memory back to
>>> UINT64_MAX. So I think this is a good change in itself.
>>>
>>> -- PMM
>>
>> I think I debugged it. So this patch seems to help simply because we
>> only have sanity checking asserts in the subpage path. UINT64_MAX
>> will make the region a number of full pages and avoid hitting the
>> checks.
>>
>> I think I see what the issue is: exec.c assumes that
>> TARGET_PHYS_ADDR_SPACE_BITS is enough to render any section in
>> system memory: the number of page table levels is calculated from
>> that:
>>
>>     #define P_L2_LEVELS \
>>         (((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / L2_BITS) + 1)
>>
>> any other bits are simply ignored:
>>
>>     for (i = P_L2_LEVELS - 1; i >= 0 && !lp.is_leaf; i--) {
>>         if (lp.ptr == PHYS_MAP_NODE_NIL) {
>>             return &sections[PHYS_SECTION_UNASSIGNED];
>>         }
>>         p = nodes[lp.ptr];
>>         lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
>>     }
>>
>> so the mask by L2_SIZE - 1 means that each round looks at L2_BITS
>> bits, and there are at most P_L2_LEVELS. Any other bits are simply
>> ignored.
>
> Michael, thanks for helping to debug this issue.
> Let me see if I got it right: if the system memory size is INT64_MAX
> (0x7fffffffffffffff), the address of the last page (0x7...) has more
> bits (55) than TARGET_PHYS_ADDR_SPACE_BITS (52) and cannot be
> correctly mapped into page levels?
>
> Thanks,
> Marcel

Yes, I think that's it.

>> This is very wrong and can break in a number of other ways, for
>> example I think we will also hit this assert if we have a
>> non-aligned 64-bit BAR of a PCI device. I think the fastest solution
>> is to just limit system memory size of TARGET_PAGE_BITS. I sent a
>> patch like this.
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On 3 November 2013 20:48, Marcel Apfelbaum marce...@redhat.com wrote:
> The problem appears when a root memory region within an address space
> with size UINT64_MAX has overlapping children with the same size. If
> the size of the root memory region is UINT64_MAX everything is ok.
> Solved the regression by making the system-memory region of size
> UINT64_MAX instead of INT64_MAX.
>
> Signed-off-by: Marcel Apfelbaum marce...@redhat.com
> ---
> In the mean time I am investigating why the root memory region has to
> be UINT64_MAX size in order to have overlapping children
>
>      system_memory = g_malloc(sizeof(*system_memory));
> -    memory_region_init(system_memory, NULL, "system", INT64_MAX);
> +    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>      address_space_init(&address_space_memory, system_memory, "memory");

As you say above we should investigate why this caused a problem, but I
was surprised the system memory space isn't already maximum size. It
turns out that that change was introduced in commit 8417cebf in an
attempt to avoid overflow issues by sticking to signed 64 bit
arithmetic. This approach was subsequently ditched in favour of using
proper 128 bit arithmetic in commit 08dafab4, but we never changed the
init call for the system memory back to UINT64_MAX. So I think this is
a good change in itself.

-- PMM
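[Editor's note] The 128-bit arithmetic Peter refers to can be sketched like this, using GCC/Clang's `unsigned __int128` as a stand-in for QEMU's Int128. The "UINT64_MAX means 2^64" convention matches my reading of the memory API of that era, but treat it as an assumption:

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128; /* stand-in for QEMU's Int128 */

/* A region covering the whole 64-bit space has size 2^64, which does
 * not fit in a uint64_t; the memory API (assumed here) treats a size
 * argument of UINT64_MAX as shorthand for 2^64. */
static u128 region_size(uint64_t size_arg)
{
    if (size_arg == UINT64_MAX) {
        return (u128)1 << 64;
    }
    return size_arg;
}

/* One past the region's last byte; in 128 bits this cannot overflow
 * for any 64-bit base and size, which is why the signed-64-bit
 * INT64_MAX workaround became unnecessary. */
static u128 region_end(uint64_t base, uint64_t size_arg)
{
    return (u128)base + region_size(size_arg);
}
```

With only signed 64-bit arithmetic (the pre-08dafab4 state), INT64_MAX was the largest safe size, which is what 8417cebf picked and what this patch now reverts.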
Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size
On Sun, Nov 03, 2013 at 09:26:06PM +0000, Peter Maydell wrote:
> On 3 November 2013 20:48, Marcel Apfelbaum marce...@redhat.com wrote:
>> The problem appears when a root memory region within an address
>> space with size UINT64_MAX has overlapping children with the same
>> size. If the size of the root memory region is UINT64_MAX everything
>> is ok.
>> Solved the regression by making the system-memory region of size
>> UINT64_MAX instead of INT64_MAX.
>>
>> Signed-off-by: Marcel Apfelbaum marce...@redhat.com
>> ---
>> In the mean time I am investigating why the root memory region has
>> to be UINT64_MAX size in order to have overlapping children
>>
>>      system_memory = g_malloc(sizeof(*system_memory));
>> -    memory_region_init(system_memory, NULL, "system", INT64_MAX);
>> +    memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>>      address_space_init(&address_space_memory, system_memory, "memory");
>
> As you say above we should investigate why this caused a problem, but
> I was surprised the system memory space isn't already maximum size.
> It turns out that that change was introduced in commit 8417cebf in an
> attempt to avoid overflow issues by sticking to signed 64 bit
> arithmetic. This approach was subsequently ditched in favour of using
> proper 128 bit arithmetic in commit 08dafab4, but we never changed
> the init call for the system memory back to UINT64_MAX. So I think
> this is a good change in itself.
>
> -- PMM

I think I debugged it. So this patch seems to help simply because we
only have sanity checking asserts in the subpage path. UINT64_MAX will
make the region a number of full pages and avoid hitting the checks.

I think I see what the issue is: exec.c assumes that
TARGET_PHYS_ADDR_SPACE_BITS is enough to render any section in system
memory: the number of page table levels is calculated from that:

    #define P_L2_LEVELS \
        (((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / L2_BITS) + 1)

any other bits are simply ignored:

    for (i = P_L2_LEVELS - 1; i >= 0 && !lp.is_leaf; i--) {
        if (lp.ptr == PHYS_MAP_NODE_NIL) {
            return &sections[PHYS_SECTION_UNASSIGNED];
        }
        p = nodes[lp.ptr];
        lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
    }

so the mask by L2_SIZE - 1 means that each round looks at L2_BITS bits,
and there are at most P_L2_LEVELS. Any other bits are simply ignored.

This is very wrong and can break in a number of other ways, for example
I think we will also hit this assert if we have a non-aligned 64-bit
BAR of a PCI device. I think the fastest solution is to just limit
system memory size of TARGET_PAGE_BITS. I sent a patch like this.

-- MST