Re: [Qemu-devel] live migration vs device assignment (motivation)
On Fri, Dec 25, 2015 at 02:31:14PM -0800, Alexander Duyck wrote:
> The PCI hot-plug specification calls out that the OS can optionally
> implement a "pause" mechanism which is meant to be used for high
> availability type environments. What I am proposing is basically
> extending the standard SHPC capable PCI bridge so that we can support
> the DMA page dirtying for everything hosted on it, add a vendor
> specific block to the config space so that the guest can notify the
> host that it will do page dirtying, and add a mechanism to indicate
> that all hot-plug events during the warm-up phase of the migration are
> pause events instead of full removals.

Two comments:

1. A vendor specific capability will always be problematic.
   Better to register a capability id with the PCI SIG.

2. There are actually several capabilities:

   A. support for memory dirtying
      If not supported, we must stop the device before migration.

      This is supported by core guest OS code, using patches similar
      to those posted by you.

   B. support for device replacement
      This is a faster form of hotplug, where a device is removed and
      later another device using the same driver is inserted in the
      same slot.

   This is a possible optimization, but I am convinced (A) should be
   implemented independently of (B).

> I've been poking around in the kernel and QEMU code and the part I
> have been trying to sort out is how to get QEMU based pci-bridge to
> use the SHPC driver because from what I can tell the driver never
> actually gets loaded on the device as it is left in the control of
> ACPI hot-plug.
>
> - Alex

There are ways, but you can just use PCI Express, it's easier.

--
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] live migration vs device assignment (motivation)
On Sun, Dec 27, 2015 at 1:21 AM, Michael S. Tsirkin wrote:
> On Fri, Dec 25, 2015 at 02:31:14PM -0800, Alexander Duyck wrote:
>> The PCI hot-plug specification calls out that the OS can optionally
>> implement a "pause" mechanism which is meant to be used for high
>> availability type environments. What I am proposing is basically
>> extending the standard SHPC capable PCI bridge so that we can support
>> the DMA page dirtying for everything hosted on it, add a vendor
>> specific block to the config space so that the guest can notify the
>> host that it will do page dirtying, and add a mechanism to indicate
>> that all hot-plug events during the warm-up phase of the migration are
>> pause events instead of full removals.
>
> Two comments:
>
> 1. A vendor specific capability will always be problematic.
> Better to register a capability id with pci sig.
>
> 2. There are actually several capabilities:
>
> A. support for memory dirtying
> if not supported, we must stop device before migration
>
> This is supported by core guest OS code,
> using patches similar to posted by you.
>
> B. support for device replacement
> This is a faster form of hotplug, where device is removed and
> later another device using same driver is inserted in the same slot.
>
> This is a possible optimization, but I am convinced
> (A) should be implemented independently of (B).

My thought on this was that we don't need much to really implement
either feature. Really only a bit or two for either one. I had thought
about extending the PCI Advanced Features, but for now it might make
more sense to just implement it as a vendor capability for the QEMU
based bridges instead of trying to make this a true PCI capability,
since I am not sure if this would in any way apply to physical
hardware.
The fact is the PCI Advanced Features capability is essentially just a
vendor specific capability with a different ID, so if we were to use 2
bits that are currently reserved in the capability we could later merge
the functionality without much overhead. I fully agree that the two
implementations should be separate, but nothing says we have to
implement them completely differently. If we are just using 3 bits for
capability, status, and control of each feature there is no reason for
them to need to be stored in separate locations.

>> I've been poking around in the kernel and QEMU code and the part I
>> have been trying to sort out is how to get QEMU based pci-bridge to
>> use the SHPC driver because from what I can tell the driver never
>> actually gets loaded on the device as it is left in the control of
>> ACPI hot-plug.
>
> There are ways, but you can just use pci express, it's easier.

That's true. I should probably just give up on trying to do an
implementation that works with the i440fx implementation. I could
probably move over to the q35, and once that is done we could look at
something like the PCI Advanced Features solution for something like
the PCI-bridge drivers.

- Alex
How to reserve guest physical region for ACPI
Hi Michael, Paolo,

Now it is time to return to the question of how to reserve a guest
physical region for internal use by ACPI.

Igor suggested that:

| An alternative place to allocate reserve from could be high memory.
| For pc we have "reserved-memory-end" which currently makes sure
| that hotpluggable memory range isn't used by firmware
(https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00926.html)

He also suggested a way to use 64-bit addresses in DSDT/SSDT rev = 1:

| when writing ASL one shall make sure that only XP supported
| features are in global scope, which is evaluated when tables
| are loaded and features of rev2 and higher are inside methods.
| That way XP doesn't crash as far as it doesn't evaluate unsupported
| opcode and one can guard those opcodes checking _REV object if necessary.
(https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg01010.html)

Michael, Paolo, what do you think about these ideas?

Thanks!
Re: [Qemu-devel] live migration vs device assignment (motivation)
On Sun, Dec 27, 2015 at 7:20 PM, Dong, Eddie wrote:
>> >
>> > Even if the device driver doesn't support migration, you still want
>> > to migrate the VM? That may be risky, and we should add the "bad
>> > path" for the driver at least.
>>
>> At a minimum we should have support for hot-plug if we are expecting
>> to support migration. You would simply have to hot-plug the device
>> before you start migration and then return it after. That is how the
>> current bonding approach for this works if I am not mistaken.
>
> Hotplug is good to eliminate the device specific state clone, but the
> bonding approach is very network specific; it doesn't work for other
> devices such as FPGA devices, QAT devices and GPU devices, which we
> plan to support gradually :)

Hotplug would be usable for that assuming the guest supports the
optional "pause" implementation as called out in the PCI hotplug spec.
With that the device can maintain state for some period of time after
the hotplug remove event has occurred. The problem is that you have to
get the device to quiesce at some point, as you cannot complete the
migration with the device still active.

The way you were doing it was using the per-device configuration space
mechanism. That doesn't scale when you have to implement it for each
and every driver for each and every OS you have to support. Using the
"pause" implementation for hot-plug would have a much greater
likelihood of scaling, as you could either take the fast path approach
of "pausing" the device and resuming it when migration has completed,
or you could just remove the device and restart the driver on the other
side if the pause support is not yet implemented. You would lose the
state under such a migration, but it is much more practical than having
to implement a per-device solution.

- Alex
RE: [Qemu-devel] live migration vs device assignment (motivation)
> >
> > Even if the device driver doesn't support migration, you still want
> > to migrate the VM? That may be risky, and we should add the "bad
> > path" for the driver at least.
>
> At a minimum we should have support for hot-plug if we are expecting to
> support migration. You would simply have to hot-plug the device before
> you start migration and then return it after. That is how the current
> bonding approach for this works if I am not mistaken.

Hotplug is good to eliminate the device specific state clone, but the
bonding approach is very network specific; it doesn't work for other
devices such as FPGA devices, QAT devices and GPU devices, which we plan
to support gradually :)

> The advantage we are looking to gain is to avoid removing/disabling the
> device for as long as possible. Ideally we want to keep the device
> active through the warm-up period, but if the guest doesn't do that we
> should still be able to fall back on the older approaches if needed.