On Fri, Jul 23, 2021 at 08:56:54PM +0200, David Hildenbrand wrote:
> > As I've asked this question previously elsewhere, it's more or less also
> > related to the design decision of having virtio-mem be pluggable sparsely
> > at such a small granularity, rather than keeping plug/unplug contiguous
> > within the GPA range (so we would move pages on unplug).
> 
> Yes, in an ideal world that would be the optimal solution. Unfortunately,
> we're not living in an ideal world :)
> 
> virtio-mem in Linux guests will by default try unplugging from highest to
> lowest address, and I have on my TODO list an item to shrink the usable
> region (-> later, shrinking the actual RAMBlock) once possible.
> 
> So virtio-mem is prepared for that, but it will only apply in some cases.
> 
> > There are definitely reasons for that, and I believe you're the expert
> > here (as you mentioned once: some guest GUPed pages cannot migrate, so
> > those ranges cannot be offlined otherwise), but so far I'm still not sure
> > whether that's a kernel issue to solve in GUP, although I agree it's a
> > complicated one anyway!
> 
> To do something like that reliably, you have to manage hotplugged memory in
> a special way, for example, in a movable zone.
> 
> We have at least 4 cases:
> 
> a) The guest OS supports the movable zone and uses it for all hotplugged
>    memory
> b) The guest OS supports the movable zone and uses it for some hotplugged
>    memory
> c) The guest OS supports the movable zone and uses it for no hotplugged
>    memory
> d) The guest OS does not support the concept of movable zones
> 
> a) is the dream, but only applies in some cases if Linux is properly
>    configured (e.g., never hotplug more than 3 times the boot memory)
> b) will be possible under Linux soon (e.g., when hotplugging more than 3
>    times the boot memory)
> c) is the default under Linux for most Linux distributions
> d) is Windows
> 
> In addition, we can still have random unplug errors when using the movable
> zone, for example, if someone references a page just a little too long.
> 
> Maybe that helps.

Yes, thanks.
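For anyone following along: whether cases a)-c) apply comes down to guest
configuration. A minimal sketch (not QEMU code; the block id "memory42" is a
made-up example) of how a Linux guest onlines a hotplugged memory block into
ZONE_MOVABLE through the standard memory-hotplug sysfs interface:

/* Illustration only: online one hotplugged memory block as movable.
 * The block number is hypothetical; real tooling (udev rules or a
 * daemon) discovers the block ids under /sys/devices/system/memory/. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/sys/devices/system/memory/memory42/state", "w");

    if (!f) {
        perror("fopen");
        return 1;
    }
    /* "online_movable" places the block in the movable zone so its
     * pages stay migratable; plain "online" lets the kernel hand the
     * memory to unmovable allocations, which is what makes a later
     * unplug unreliable. */
    if (fputs("online_movable", f) == EOF) {
        perror("fputs");
        fclose(f);
        return 1;
    }
    fclose(f);
    return 0;
}

Whether this happens automatically is up to the distribution (udev rules,
kernel defaults), which is exactly why all four cases occur in practice.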
> > Maybe it's a trade-off you made in the end; I don't have enough knowledge
> > to tell.
> 
> That's the precise description of what virtio-mem is. It's a trade-off
> between which OSs we want to support, what the guest OS can actually do,
> how we can manage memory in the hypervisor efficiently, ...
> 
> > The patch itself looks okay to me; there's just a slight worry about how
> > long the list could get in the end if it's chopped into small 1M/2M
> > chunks.
> 
> I don't think that's really an issue: take a look at
> qemu_get_guest_memory_mapping(), which will create as many entries as
> necessary to express the guest physical mapping of the guest virtual (!)
> address space with such chunks. That can be a lot :)

I'm indeed a bit surprised by the "paging" parameter... I gave it a try, and
the list does grow into tens of thousands of entries.
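Just to illustrate the scaling (a toy sketch, not the QEMU code; the region
size, block size, and plug pattern are made-up values): with a 2M
granularity, a pathologically sparse plug pattern defeats any coalescing of
adjacent ranges, so the entry count approaches the number of plugged blocks:

/* Toy sketch: why a sparsely plugged virtio-mem region can produce a
 * huge guest-memory-mapping list. All values are example assumptions. */
#include <inttypes.h>
#include <stdio.h>

#define BLOCK_SIZE   (2ULL << 20)   /* assumed 2M plug granularity */
#define REGION_SIZE  (16ULL << 30)  /* assumed 16G virtio-mem region */

/* Stand-in for virtio-mem's plugged-block bitmap: worst case, every
 * other block is plugged, so no two ranges can ever be merged. */
static int block_plugged(uint64_t block)
{
    return block % 2 == 0;
}

int main(void)
{
    uint64_t nb_blocks = REGION_SIZE / BLOCK_SIZE;
    uint64_t entries = 0;

    for (uint64_t block = 0; block < nb_blocks; ) {
        if (!block_plugged(block)) {
            block++;
            continue;
        }
        /* Coalesce a run of adjacent plugged blocks into one entry,
         * the way a guest-physical mapping list would. */
        while (block < nb_blocks && block_plugged(block)) {
            block++;
        }
        entries++;
    }
    /* Prints 4096 here; with 1M blocks or a larger region this is
     * easily tens of thousands. */
    printf("mapping list entries: %" PRIu64 "\n", entries);
    return 0;
}

One last question: will virtio-mem still make a best effort to move pages
around, so as to leave as few holes as possible?

Thanks,

-- 
Peter Xu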