Excerpts from Shawn Anastasio's message of August 19, 2020 6:59 am:
> On 8/18/20 2:11 AM, Nicholas Piggin wrote:
>> Very reasonable point.
>> The problem we're trying to get a handle on is live partition migration
>> where a running guest might be using SAO then get migrated to a P10. I
>> don't think we have a good way to handle this case. Potentially the
>> hypervisor could revoke the page tables if the guest is running in hash
>> mode and the guest kernel could be taught about that and sigbus the
>> process, but in radix the guest controls those page tables and the SAO
>> state and I don't think there's a way to cause it to take a fault.
>> I also don't know what the proprietary hypervisor does here.
>> We could add it back, default to n, or make it bare metal only, or
>> somehow try to block live migration to a later CPU without the facility.
>> I wouldn't be against that.
> Admittedly I'm not too familiar with the specifics of live migration
> or guest memory management, but restoring the functionality and adding
> a way to prevent migration of SAO-using guests seems like a reasonable
> choice to me. Would this be done with help from the guest using some
> sort of infrastructure to signal to the hypervisor that SAO is in use,
> or entirely on the hypervisor by e.g. scanning through the process
> table for SAO pages?

The first step might be to just re-add the functionality but disable
it by default if firmware_has_feature(FW_FEATURE_LPAR). You could have
a config or boot option to allow guests to use it at the cost of
migration compatibility.
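
As a rough sketch of that policy (a userspace mock, not kernel code):
firmware_has_feature() and FW_FEATURE_LPAR are stubbed so it compiles
standalone, and mock_fw_features, sao_guest_override and sao_allowed()
are hypothetical names made up for illustration.

```c
#include <stdbool.h>

/* Stand-in for the kernel's FW_FEATURE_LPAR bit; the value is arbitrary. */
#define FW_FEATURE_LPAR (1UL << 0)

/* Mock firmware feature mask (hypothetical; the kernel fills this at boot). */
unsigned long mock_fw_features;

/* Stub that mimics the kernel's firmware_has_feature() helper. */
bool firmware_has_feature(unsigned long feature)
{
	return (mock_fw_features & feature) != 0;
}

/*
 * Hypothetical config/boot option: lets a guest opt in to SAO,
 * giving up migration compatibility. Defaults to off.
 */
bool sao_guest_override;

/* Hypothetical policy check: may SAO mappings be handed out? */
bool sao_allowed(bool cpu_has_sao)
{
	if (!cpu_has_sao)
		return false;

	/* Running as an LPAR guest: disabled unless explicitly overridden. */
	if (firmware_has_feature(FW_FEATURE_LPAR))
		return sao_guest_override;

	/* Bare metal: no live migration to worry about. */
	return true;
}
```

With this shape, bare metal keeps the feature, and a guest only gets it
when the admin has explicitly asked for it.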

That would probably be good enough for experimenting with the feature.
I think modifying the hypervisor and/or guest to deal with migration
is probably too much work to be justified at the moment.

>> It would be very interesting to know how it performs in such a "real"
>> situation. I don't know how well POWER9 has optimised it -- it's
>> possible that it's not much better than putting lwsync after every load
>> or store.
> This is definitely worth investigating in depth. That said, even if the
> performance on P9 isn't super great, I think the feature could still be
> useful, since it would offer more granularity than the sledgehammer
> approach of emitting lwsync everywhere.

Sure, we'd be interested to hear of results.

> I'd be happy to put in some of the work required to get this to a point
> where it can be reintroduced without breaking guest migration - I'd just
> need some pointers on getting started with whatever approach is decided on.

I think re-adding it as I said above would be okay. The code itself is 
not complex so that was not the reason for removal.

