On Wed, Jun 10, 2020 at 03:57:03PM +0200, Halil Pasic wrote:
> On Wed, 10 Jun 2020 14:29:29 +1000
> David Gibson <da...@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Jun 09, 2020 at 06:28:39PM +0200, Halil Pasic wrote:
> > > On Tue, 9 Jun 2020 17:47:47 +0200
> > > Claudio Imbrenda <imbre...@linux.ibm.com> wrote:
> > > 
> > > > On Tue, 9 Jun 2020 11:41:30 +0200
> > > > Halil Pasic <pa...@linux.ibm.com> wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > I don't know. Janosch could answer that, but he is on vacation.
> > > > > Adding Claudio, maybe he can answer. My understanding is that,
> > > > > while it might be possible, it is ugly at best. The ability to do
> > > > > a transition is indicated by a CPU model feature. Indicating the
> > > > > feature to the guest and then failing the transition sounds wrong
> > > > > to me.
> > > > 
> > > > I agree. If the feature is advertised, then it has to work. I don't
> > > > think we even have an architected way to fail the transition for that
> > > > reason.
> > > > 
> > > > What __could__ be done is to prevent qemu from even starting if an
> > > > incompatible device is specified together with PV.
> > > 
> > > AFAIU, the "specified together with PV" is the problem here. Currently
> > > we don't "specify PV"; PV is just a capability that is managed by the
> > > CPU model (like so many others). I.e. the fact that the virtualization
> > > environment is capable of providing PV (unpack facility available),
> > > and the fact that the end user didn't fence the unpack facility, does
> > > not mean the user is dead set on using PV.
> > > 
> > > My understanding is that we want PV to just work, without having to
> > > put together a peculiar VM definition that says: this is going to be
> > > used as a PV VM.
> > 
> > Having had a similar discussion for POWER, I no longer think this is a
> > wise model. I think we want to have an explicit "allow PV" option -
> > but we do want it to be a *single* option, rather than having to
> > change configuration in a whole bunch of places.
> > 
> > My intention is for my 'host-trust-limitation' series to accomplish
> > that.
> 
> Dave, many thanks for your input. I would be interested to read up on
> that discussion you had for POWER to try to catch the train of thought.
> Can you give me a pointer?
Urgh.. not really.. it was spread out over several discussions, some of
which were on IRC or Slack, rather than email.

> My current understanding is that s390x already has the "allow PV" option,
> which is the CPU model feature. But its dynamics are just like those of
> other CPU model features, in the sense that you may have to disable it
> explicitly.
> 
> Our problem is that iommu_platform=on comes at a price for us, and we
> don't want to enforce it when it is not needed. And if the guest does
> not decide to do the transition to protected, it is not needed.
> 
> Thus any scheme where we pessimise based on the sheer possibility of
> protected virtualization seems wrong to me.

Hrm, I see your point.

So... I guess my thinking is that although the strict meaning of the
proposed host-trust-limitation option is just that "protection _can_ be
used, at the guest/platform's option", it is a strong hint that we're
expecting protection to be used.

So would this work for s390:

 * The cpu feature remains, as now, enabled by default

 * The host-trust-limitation option would apply the virtio options
   necessary for protection (and any other changes to defaults we
   discover we need), just as it does for SEV and POWER PEF

 * Optionally, the s390 machine type code could error out if
   host-trust-limitation is specified, but the cpu option is explicitly
   disabled

> The sad thing is that QEMU has all the information it needs to do what
> is best: for paravirtualized devices
> * use F_ACCESS_PLATFORM when needed, to make the guest work harder and
>   work around the access restrictions imposed by memory protection, and
> * don't use F_ACCESS_PLATFORM when not needed, and allow for
>   optimization based on the fact that no such access restrictions exist.

Right.. IIUC you're suggesting delaying finalization of the device's
feature set until the guest driver actually starts probing it.

> Sure, we can burden the user to tell us whether the VM is intended to
> be used with memory protection or not. But what does it buy us? The
> opportunity to create dodgy configurations?

So, I don't know what the situation is with z, but for POWER, machines
with the ultravisor running are rare (read: not actually available
outside IBM yet), and not directly tied to a cpu version (obviously you
need a cpu with support, but you also need to actually be running under
an ultravisor, which is optional).

So what are our options:

1) Require explicitly enabling PEF support - this is burdening the
   user, as you say, but..

2) Allow by default - but fail if the host doesn't have support. That
   means explicitly *disabling* on non-ultravisor machines, a much
   bigger imposition on the user

3) Enable conditionally depending on host support. Seems nice, but it's
   badly broken, as we've found the previous times we've tried to
   automatically do things based on host capabilities. The problem is
   that once you have this, it's not obvious, without knowing a bunch
   about the hosts, which ones it will be safe to migrate between.
   That horribly breaks things like RHV that want to do load-balancing
   migrations within a cluster.

Basically, having different guest-visible features depending on host
properties is just unworkable, which brings us back to (1) being the
least bad option.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
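
To make the two positions in the thread concrete, here is a minimal C
sketch (not QEMU code; every struct field and function name here, such
as guest_is_protected(), is invented purely for illustration) of the
difference between fixing VIRTIO_F_ACCESS_PLATFORM at configuration
time and deferring the decision until the guest driver probes the
device, based on whether the guest has already transitioned to
protected mode:

/*
 * Illustrative sketch only.  Policy (a) is the static model: the
 * feature is offered iff the user (or a machine-wide option in the
 * spirit of host-trust-limitation) asked for it up front.  Policy (b)
 * is the deferred model Halil alludes to: decide when the guest driver
 * starts feature negotiation, based on whether the guest has already
 * entered protected/secure execution.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VIRTIO_F_ACCESS_PLATFORM (1ULL << 33)   /* virtio feature bit 33 */

struct machine_state {
    bool host_trust_limitation;   /* machine-wide "PV may be used" knob */
    bool guest_protected;         /* has the guest entered protected mode? */
};

struct virtio_dev {
    const char *name;
    bool iommu_platform;          /* per-device user override */
    uint64_t host_features;
};

/* Hypothetical query of the ultravisor/firmware state. */
static bool guest_is_protected(const struct machine_state *ms)
{
    return ms->guest_protected;
}

/* Policy (a): decided once, at device configuration time. */
static uint64_t features_static(const struct machine_state *ms,
                                const struct virtio_dev *dev)
{
    uint64_t f = dev->host_features;

    if (dev->iommu_platform || ms->host_trust_limitation) {
        f |= VIRTIO_F_ACCESS_PLATFORM;
    }
    return f;
}

/* Policy (b): evaluated only when the guest driver probes the device. */
static uint64_t features_deferred(const struct machine_state *ms,
                                  const struct virtio_dev *dev)
{
    uint64_t f = dev->host_features;

    if (dev->iommu_platform || guest_is_protected(ms)) {
        f |= VIRTIO_F_ACCESS_PLATFORM;
    }
    return f;
}

int main(void)
{
    /* No explicit opt-in, but the guest did switch to protected mode. */
    struct machine_state ms = { .host_trust_limitation = false,
                                .guest_protected = true };
    struct virtio_dev net = { .name = "virtio-net", .iommu_platform = false };

    printf("%s static:   ACCESS_PLATFORM=%d\n", net.name,
           !!(features_static(&ms, &net) & VIRTIO_F_ACCESS_PLATFORM));
    printf("%s deferred: ACCESS_PLATFORM=%d\n", net.name,
           !!(features_deferred(&ms, &net) & VIRTIO_F_ACCESS_PLATFORM));
    return 0;
}

With these example values the static policy leaves the feature off while
the deferred policy turns it on.  The catch, as the thread argues, is
that the deferred policy makes a guest-visible feature depend on runtime
and host state, which is exactly what makes migration reasoning hard.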