On Wed, 4 Mar 2015 20:12:31 +0100 "Michael S. Tsirkin" <m...@redhat.com> wrote:
> On Wed, Mar 04, 2015 at 05:33:42PM +0100, Igor Mammedov wrote:
> > On Wed, 4 Mar 2015 16:31:39 +0100
> > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > 
> > > On Wed, Mar 04, 2015 at 04:14:44PM +0100, Igor Mammedov wrote:
> > > > On Wed, 4 Mar 2015 14:49:00 +0100
> > > > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > > > 
> > > > > On Wed, Mar 04, 2015 at 02:12:32PM +0100, Igor Mammedov wrote:
> > > > > > On Wed, 4 Mar 2015 13:11:48 +0100
> > > > > > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > > > > > 
> > > > > > > On Tue, Mar 03, 2015 at 09:33:51PM +0100, Igor Mammedov wrote:
> > > > > > > > On Tue, 3 Mar 2015 18:35:39 +0100
> > > > > > > > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > > > > > > > 
> > > > > > > > > On Tue, Mar 03, 2015 at 05:18:14PM +0100, Igor Mammedov wrote:
> > > > > > > > > > Based on Microsoft's specifications (paper can be
> > > > > > > > > > downloaded from
> > > > > > > > > > http://go.microsoft.com/fwlink/?LinkId=260709), add a device
> > > > > > > > > > description to the SSDT ACPI table and its implementation.
> > > > > > > > > > 
> > > > > > > > > > The GUID is set using the "vmgenid.uuid" property.
> > > > > > > > > > 
> > > > > > > > > > Example of using the vmgenid device:
> > > > > > > > > >  -device vmgenid,id=FOO,uuid="324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
[...]
> > > > - no consuming of a slot and/or addrX.functionY
> > > > - we would know immediately whether the device is correctly
> > > >   initialized, even before the BIOS runs, i.e. no guest involved
> > > >   and with a clear end result.
> > > 
> > > Been there, done that.
> > > Each time we try to steal memory, we get pain.
> > 
> > We are stealing it anyway, just by more complex means.
> 
> Not really. guest reserves memory and gives us the address.
> With linker this is standard DMA that happens with a ton of devices.

Let's see how "not really complex" it is in the linker case:

1.
   QEMU creates a VGID buffer entry in the linker table, so that the
   BIOS can allocate the buffer in conventional RAM, reducing available
   RAM by one page.
2. QEMU creates an ACPI description for the VGID device with a fake
   address value that is to be patched with the VGID buffer address by
   the linker in the BIOS when it loads the SSDT table.
3. The BIOS linker loads the linker table and allocates the VGID buffer
   (adding the space occupied by the buffer to the E820 table as
   E820_RESERVED), then it loads the SSDT table and patches the
   VGID.ADDR object with the address of the allocated VGID buffer.
4. QEMU then needs to know the VGID address as well, so it can write
   the UUID into that buffer; for that we introduce an MMIO interface
   (consuming resources in the limited IO address space) so that ACPI
   code can write the VGID buffer address back.
5. We wait until the guest OS boots and executes the VGID.INIT() method,
   which writes to the MMIO ports the VGID.ADDR address that the linker
   patched in step 3, so that QEMU knows where the VGID buffer is and
   can write the UUID.
6. When QEMU gets the VGID buffer address from the OS, it writes the
   UUID there and sends a GPE0._E00 notification, which tells the guest
   that there is a new/updated UUID in the VGID buffer.
7. The guest OS can query VGID.ADDR to get access to the UUID at that
   address.

Benefits of this approach:
 - there is no additional state to migrate, since the buffer is in
   initial RAM provided by QEMU and is migrated along with it.

Drawbacks:
 - resource wise:
   - guest's RAM size is reduced by 1 page
   - IO address space is reduced by 4 bytes
 - the device is not in a correct state until the guest OS has booted
   and executed the VGID.INIT ACPI method
 - too complex for what needs to be done:
   - harder to maintain, since a maintainer would have to know/recall
     how it works when considering changes to this code
   - difficult to debug the QEMU -> BIOS (3) -> OSPM (5) -> QEMU loop
     if something goes wrong in there

> With pci this is standard BAR mapping.

With a PCI device it's quite a bit simpler:

1. QEMU creates a PCI device with a RAM BAR, not mapped into the guest
   address space, which is the VGID buffer, and writes the UUID into it.
2.
   The BIOS initializes PCI devices and maps the VGID BAR somewhere in
   the PCI hole (i.e. the VGID RAM buffer is allocated by QEMU and
   doesn't reduce initial RAM).
3. The BIOS loads the ACPI tables from QEMU; at that time QEMU
   generates them, taking the mapped address of the VGID BAR directly
   from the VGID PCI device to create the VGID.ADDR ACPI object.
4. The guest OS boots and can query VGID.ADDR to get access to the
   UUID at that address.

Benefits of this approach:
 - resource wise:
   - doesn't consume IO address space ports
 - quite a bit less complex than the BIOS linker approach
 - the incorrect device state lasts only until the BIOS initializes
   PCI devices, i.e. the guest OS is not involved.

Drawbacks:
 - resource wise:
   - consumes at least a PCI function
 - still has the device in an incorrect state until the BIOS maps the
   RAM provided by VGID somewhere
 - debugging is a bit easier, since it's contained to QEMU+BIOS only:
   QEMU -> BIOS (maps BAR) -> QEMU (updates ACPI tables) -> BIOS (loads SSDT)
 - a unit test has to wait until the BIOS initializes PCI devices
   before starting tests
 - the migration stream will have extra state to carry the device's
   VGID buffer

> > Is there any technical reasons/concerns why direct stealing
> > from QEMU is bad and why it is better to use PCI/linker?
> > 
> > I'd really like to know the answer instead of getting "just because
> > it's bad". Gal was also curious regarding this question.
> 
> Simply put reserving RAM by hardware is not something
> that happens on real hardware.

That's not really a technical reason against direct mapping in QEMU; if
it were, ACPI tables would never have been moved into QEMU, since
building them is the BIOS's job. But to reduce complexity and to avoid
split-brain issues between QEMU and the BIOS, ACPI generation was moved
into QEMU.

> Rather it's bios that reserves RAM.

The same applies in this case: it's not RAM reservation, since it's a
device-backed RAM block and it doesn't interfere with conventional RAM
in any way. It's exactly the same as QEMU allocating and mapping the
low and high RAM regions at certain addresses.
Compared to the PCI approach, the difference is that it's not the BIOS
selecting where to map VGID's RAM, but rather QEMU itself deciding
where to map the VGID buffer.

So the steps with direct mapping would look like:

1. QEMU creates the device, maps the device's VGID buffer into the
   guest address space and writes the UUID into it.
2. The BIOS loads the ACPI tables from QEMU with a correctly set
   VGID.ADDR ACPI object, since the address is known in advance, even
   before the BIOS runs.
3. The guest OS boots and can query VGID.ADDR to get access to the
   UUID at that address.

Benefits of this approach over the PCI one:
 - the device is in a correct state once it's realized, i.e. even
   before the guest runs
 - there is no need to worry about which PCI class ID to choose so that
   Windows wouldn't ask for a non-existent driver or confuse the stub
   PCI device with something else
 - simple, straightforward design that doesn't depend on the guest
   side; easy to debug and understand
 - a unit test doesn't need to execute guest code at all; we could
   actually switch accel=tcg to qtest in it
 - we don't even need a separate device for it; it could be a
   feature/property of the chipset/board

Drawbacks:
 - the migration stream will have extra state to carry the device's
   VGID buffer
 - you name it