On 8/18/20 2:11 AM, Nicholas Piggin wrote:
> Very reasonable point.
>
> The problem we're trying to get a handle on is live partition migration
> where a running guest might be using SAO then get migrated to a P10. I
> don't think we have a good way to handle this case. Potentially the
> hypervisor could revoke the page tables if the guest is running in hash
> mode and the guest kernel could be taught about that and sigbus the
> process, but in radix the guest controls those page tables and the SAO
> state and I don't think there's a way to cause it to take a fault.
>
> I also don't know what the proprietary hypervisor does here.
>
> We could add it back, default to n, or make it bare metal only, or
> somehow try to block live migration to a later CPU without the facility.
> I wouldn't be against that.

Admittedly I'm not too familiar with the specifics of live migration
or guest memory management, but restoring the functionality and adding
a way to prevent migration of SAO-using guests seems like a reasonable
choice to me. Would this be done with help from the guest using some
sort of infrastructure to signal to the hypervisor that SAO is in use,
or entirely on the hypervisor by e.g. scanning through the process
table for SAO pages?

> It would be very interesting to know how it performs in such a "real"
> situation. I don't know how well POWER9 has optimised it -- it's
> possible that it's not much better than putting lwsync after every load
> or store.

This is definitely worth investigating in depth. That said, even if the
performance on P9 isn't super great, I think the feature could still be
useful, since it would offer more granularity than the sledgehammer
approach of emitting lwsync everywhere.
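
To make the granularity point concrete, here is a minimal userspace sketch of
requesting SAO for a single mapping instead of adding barriers throughout the
code. It assumes the powerpc PROT_SAO flag from <sys/mman.h>; the helper name
map_ordered() and the zero fallback definition are hypothetical and exist only
so the sketch builds on other architectures. It has not been measured on P9.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/*
 * PROT_SAO is powerpc-specific. The zero fallback below is an assumption
 * made for illustration so this file compiles elsewhere; with it, the
 * helper simply never asks for SAO.
 */
#ifndef PROT_SAO
#define PROT_SAO 0
#endif

/*
 * Hypothetical helper: map an anonymous read/write region and ask for
 * strong access ordering on it. If SAO is unavailable (flag not defined,
 * or the kernel rejects it), fall back to an ordinary mapping and report
 * that through *got_sao so the caller knows it must use explicit barriers.
 * Returns NULL if both attempts fail.
 */
static void *map_ordered(size_t len, int *got_sao)
{
    void *p = MAP_FAILED;

    *got_sao = 0;

    if (PROT_SAO != 0) {
        p = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_SAO,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p != MAP_FAILED)
            *got_sao = 1;
    }

    if (p == MAP_FAILED)
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

A caller would check got_sao after the call: when it is 1, only accesses to
that region get the stronger ordering, and when it is 0 the caller has to fall
back to explicit barriers such as lwsync around the relevant loads and stores.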

I'd be happy to put in some of the work required to get this to a point
where it can be reintroduced without breaking guest migration - I'd just
need some pointers on getting started with whatever approach is decided on.

