On Wed, Jul 7, 2021 at 1:08 PM Peter Xu <pet...@redhat.com> wrote:
>
> On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote:
> > On 07.07.21 20:02, Peter Xu wrote:
> > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
> > > > As it never worked properly, let's disable it via the postcopy notifier on
> > > > the destination. Trying to set "migrate_set_capability postcopy-ram on"
> > > > on the destination now results in "virtio-balloon: 'free-page-hint' does
> > > > not support postcopy Error: Postcopy is not supported".
> > >
> > > Would it be possible to do this in reversed order? Say, dynamically disable
> > > free-page-hinting if postcopy capability is set when migration starts? Perhaps
> > > it can also be re-enabled automatically when migration completes?
> >
> > I remember that this might be quite racy. We would have to make sure that no
> > hinting happens before we enable the capability.
> >
> > As soon as we messed with the dirty bitmap (during precopy), postcopy is no
> > longer safe. As noted in the patch, the only runtime alternative is to
> > disable postcopy as soon as we actually do clear a bit. Alternatively, we
> > could ignore any hints if the postcopy capability was enabled.
>
> Logically migration capabilities are applied at VM starts, and these
> capabilities should be constant during migration (I didn't check if there's a
> hard requirement; easy to add that if we want to assure it), and in most cases
> for the lifecycle of the vm.

Would it make sense to add a postcopy value to the PrecopyNotifyData
that you could populate with migration_in_postcopy() in
precopy_notify()?

Then all you would need to do is check for that value and, if it is
set, shut down the page hinting or not start it. I suspect flagging
unused pages doesn't add much value in a postcopy environment anyway.

> >
> > Whatever we do, we have to make sure that a user cannot trick the system
> > into an inconsistent state. Like enabling hinting, starting migration, then
> > enabling the postcopy capability and kicking of postcopy. I did not check if
> > we allow for that, though.
>
> We could turn free page hinting off when migration starts with postcopy-ram=on,
> then re-enable it after migration finishes. That looks very safe to me. And I
> don't even worry on user trying to mess it up - as that only put their own VM
> at risk; that's mostly fine to me.

We wouldn't necessarily even need to turn it off, just not start it. I
wonder if we couldn't get away with adding a check to the existing
virtio_balloon_free_page_hint_notify to see if we are in the postcopy
state, and then either shut things down or not start them.
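
To make the idea concrete, here is a rough standalone sketch of what I have in mind. It is not a patch against the tree: the "postcopy" field is the hypothetical addition, MockBalloon and the mock_* functions are stand-ins I made up so that this compiles on its own, and the start/stop handling per reason is only loosely modeled on the existing notifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for migration state; real code would call migration_in_postcopy(). */
bool mock_in_postcopy;

typedef enum {
    PRECOPY_NOTIFY_SETUP,
    PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC,
    PRECOPY_NOTIFY_AFTER_BITMAP_SYNC,
    PRECOPY_NOTIFY_COMPLETE,
    PRECOPY_NOTIFY_CLEANUP,
} PrecopyNotifyReason;

typedef struct {
    PrecopyNotifyReason reason;
    bool postcopy;              /* hypothetical new field */
} PrecopyNotifyData;

/* Hypothetical stand-in for the balloon device state. */
typedef struct {
    bool hinting_active;
} MockBalloon;

/* Mock of the balloon's precopy notifier with the proposed check added. */
int mock_free_page_hint_notify(MockBalloon *dev, const PrecopyNotifyData *pnd)
{
    assert(dev != NULL && pnd != NULL);

    /* Proposed check: in postcopy, shut hinting down and never start it. */
    if (pnd->postcopy) {
        dev->hinting_active = false;
        return 0;
    }

    switch (pnd->reason) {
    case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC:
        dev->hinting_active = true;     /* start hinting */
        break;
    case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
    case PRECOPY_NOTIFY_COMPLETE:
    case PRECOPY_NOTIFY_CLEANUP:
        dev->hinting_active = false;    /* stop hinting */
        break;
    default:
        break;
    }
    return 0;
}

/* Mock of the notify call site: fill in the postcopy flag once, up front. */
int mock_precopy_notify(MockBalloon *dev, PrecopyNotifyReason reason)
{
    PrecopyNotifyData pnd = {
        .reason = reason,
        .postcopy = mock_in_postcopy,
    };
    return mock_free_page_hint_notify(dev, &pnd);
}
```

With something along these lines, the notifier would not need to query migration state itself; it only looks at the flag it is handed. Whether the flag should reflect the postcopy state or the postcopy capability is a separate question that this sketch does not answer.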