Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 15/11/16 17:04, Boris Ostrovsky wrote:
> On 11/15/2016 11:41 AM, Andrew Cooper wrote:
>>>>>> It also occurs to me that you necessarily need a get_avail_vcpus
>>>>>> hypercall to be able to use this interface sensibly from the toolstack.
>>>>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>>>>> non-symmetrical.
>>>>>
>>>>> And what would the use of this information be anyway?
>>>> Well, for a start, this information needs to move in the migration
>>>> stream, or by migrating a VM you will lose its current availability
>>>> bitmap and reintroduce the problem we are trying to solve.
>>> Oh, right, of course.
>> Everyone forgets migrate.
> (Or nested virt.)

Or vnuma, PoD, ballooning. All examples of features which fail or blow
up spectacularly when combined with migration.

> As I was poking at this yesterday I realized that we may not need the
> get interface: this information is available in xenstore via the
> cpu/available node (well, it actually doesn't fully function for HVM
> guests, but I think it should be fixed).

Xenstore data isn't propagated on migrate, nor should it start to be.

(This isn't quite true. There is currently one piece of data which is
propagated, because of a QEMU hack, and which needs to be dropped.)

> But if you still feel we should have it I'd rather add another field to
> XEN_DOMCTL_getdomaininfo.

There are two points to consider:

1) Where should the information live in an ideal migration stream which
includes hypervisor migration-v2? As this is being implemented by a new
piece of emulation in Xen, there should be a new Xen record in the stream.

2) How are utility functions like `xl vcpu-info` going to find this
information out? This just requires the information being available
somehow in a hypercall. Extending XEN_DOMCTL_getdomaininfo would be
fine for this purpose.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/15/2016 11:41 AM, Andrew Cooper wrote:
>>>>> It also occurs to me that you necessarily need a get_avail_vcpus
>>>>> hypercall to be able to use this interface sensibly from the toolstack.
>>>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>>>> non-symmetrical.
>>>>
>>>> And what would the use of this information be anyway?
>>> Well, for a start, this information needs to move in the migration
>>> stream, or by migrating a VM you will lose its current availability
>>> bitmap and reintroduce the problem we are trying to solve.
>> Oh, right, of course.
> Everyone forgets migrate.

(Or nested virt.)

As I was poking at this yesterday I realized that we may not need the
get interface: this information is available in xenstore via the
cpu/available node (well, it actually doesn't fully function for HVM
guests, but I think it should be fixed).

But if you still feel we should have it I'd rather add another field to
XEN_DOMCTL_getdomaininfo.

-boris

> When we eventually get a checklist for new features, migration will
> definitely be on there somewhere.
>
> ~Andrew
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 14/11/16 18:44, Boris Ostrovsky wrote:
> On 11/14/2016 01:19 PM, Andrew Cooper wrote:
>> On 14/11/16 17:48, Boris Ostrovsky wrote:
>>> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>>>>>> I am not convinced though that we can start enforcing this new VCPU
>>>>>> count, at least for PV guests. They expect to start all max VCPUs and
>>>>>> then offline them. This, of course, can be fixed but all non-updated
>>>>>> kernels will stop booting.
>>>>> How about we don't clear _VPF_down if the bit in the availability
>>>>> bitmap is not set?
>>>> This is yet another PV mess. We clearly need to quirk PV guests as the
>>>> exception to sanity, given that they expect (and have been able to)
>>>> online all cpus at start-of-day.
>>>>
>>>> To avoid race conditions, you necessarily need to be able to set a
>>>> reduced permitted map before asking the VM to unplug.
>>>>
>>>> For HVM guests, we can set a proper permitted map at boot, and really
>>>> should do so.
>>>>
>>>> For PV guests, we have to wait until it has completed its SMP bringup
>>>> before reducing the permitted set. Therefore, making the initial
>>>> set_avail_vcpus call could be deferred until the first unplug request?
>>> I am not sure how we can determine in the hypervisor that a guest has
>>> completed the bringup: I don't think we can rely on the last VCPU (which
>>> is maxvcpu-1) doing VCPUOP_up. Just to mess with the hypervisor the
>>> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
>>> can't assume a well-behaved guest.
>> I wasn't suggesting relying on the guest. I was referring to the first
>> unplug request at the toolstack level.
> I don't think waiting for toolstack's (un)plug request is going to help
> much --- the request may never come and the guest will be able to use
> all maxvcpus.

How does this currently work for PV guests, assuming there is no
explicit user input beyond what was written in the domain configuration
file?

>>> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
>>> all up, when do we clamp them down to avail_vcpus? For the same reason,
>>> we can't assume that the guest will VCPUOP_down all extra VCPUs.
>> If at some point we observe all vcpus being up, then we could set the
>> restricted map then. However, I can't think of a useful way of
>> identifying this point.
> Exactly.
>
> The question is then, if we can't do this for PV, should we still do it
> for HVM?

Absolutely. It is embarrassing that there isn't enforcement for PV.
Getting extra vcpu power for a short while is something that you
typically buy from a cloud provider.

>>>> It also occurs to me that you necessarily need a get_avail_vcpus
>>>> hypercall to be able to use this interface sensibly from the toolstack.
>>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>>> non-symmetrical.
>>>
>>> And what would the use of this information be anyway?
>> Well, for a start, this information needs to move in the migration
>> stream, or by migrating a VM you will lose its current availability
>> bitmap and reintroduce the problem we are trying to solve.
> Oh, right, of course.

Everyone forgets migrate.

When we eventually get a checklist for new features, migration will
definitely be on there somewhere.

~Andrew
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/15/2016 09:33 AM, Jan Beulich wrote:
> On 15.11.16 at 15:28, Boris Ostrovsky wrote:
>> On 11/15/2016 03:34 AM, Jan Beulich wrote:
>>> On 09.11.16 at 15:39, Boris Ostrovsky wrote:
>>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>> The discussion on the actual change seems to have pointed out all
>>> needs of change, but what I wasn't able to understand yet is why
>>> this is needed in the first place. From hypervisor pov, so far it's been
>>> up to the guest which CPUs get onlined/offlined, and the interface
>>> to request offlining (not an issue for onlining) was - afaict - a purely
>>> voluntary one. Why does this change with PVH? Any such rationale
>>> should be put in the commit message.
>> If the question is why we need to have a hypervisor interface for PVH
>> guests then it's because we need someone to send an SCI and set GPE
>> registers, and there is no one but the hypervisor to do that for PVH (I
>> will add it to the commit message).
> Yes, that was the primary question. And it took me until quite late
> in the series until I've seen the purpose, so I appreciate you
> extending the description, even if just slightly.
>
>> As for whether we want to enforce available VCPU count --- I think we
>> decided that we can't do this for PV and so the question is whether it's
>> worth doing only for some types of guests. And as you pointed out, the
>> second question (or maybe the first) is whether enforcing it is the
>> right thing in the first place.
>>
>> (BTW, I am thinking to move the domctl from x86-specific to common code
>> since if we are no longer saying that it's PVH-only then ARM should have
>> it available too)
> Well, without us wanting to enforce anything, would we still need the
> domctl for anything?

While we can send an SCI directly from the toolstack, we need to set the
GPE registers for the guest, and that's what the domctl's primary use is
(was).

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
>>> On 15.11.16 at 15:28, Boris Ostrovsky wrote:
> On 11/15/2016 03:34 AM, Jan Beulich wrote:
>>> On 09.11.16 at 15:39, Boris Ostrovsky wrote:
>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>> The discussion on the actual change seems to have pointed out all
>> needs of change, but what I wasn't able to understand yet is why
>> this is needed in the first place. From hypervisor pov, so far it's been
>> up to the guest which CPUs get onlined/offlined, and the interface
>> to request offlining (not an issue for onlining) was - afaict - a purely
>> voluntary one. Why does this change with PVH? Any such rationale
>> should be put in the commit message.
>
> If the question is why we need to have a hypervisor interface for PVH
> guests then it's because we need someone to send an SCI and set GPE
> registers, and there is no one but the hypervisor to do that for PVH (I
> will add it to the commit message).

Yes, that was the primary question. And it took me until quite late
in the series until I've seen the purpose, so I appreciate you
extending the description, even if just slightly.

> As for whether we want to enforce available VCPU count --- I think we
> decided that we can't do this for PV and so the question is whether it's
> worth doing only for some types of guests. And as you pointed out, the
> second question (or maybe the first) is whether enforcing it is the
> right thing in the first place.
>
> (BTW, I am thinking to move the domctl from x86-specific to common code
> since if we are no longer saying that it's PVH-only then ARM should have
> it available too)

Well, without us wanting to enforce anything, would we still need the
domctl for anything?

Jan
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/15/2016 03:34 AM, Jan Beulich wrote:
>>> On 09.11.16 at 15:39, Boris Ostrovsky wrote:
>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>> 'xl vcpu-set'). While this currently is only intended to be needed by
>> PVH guests we will call this domctl for all (x86) guests for consistency.
> The discussion on the actual change seems to have pointed out all
> needs of change, but what I wasn't able to understand yet is why
> this is needed in the first place. From hypervisor pov, so far it's been
> up to the guest which CPUs get onlined/offlined, and the interface
> to request offlining (not an issue for onlining) was - afaict - a purely
> voluntary one. Why does this change with PVH? Any such rationale
> should be put in the commit message.

If the question is why we need to have a hypervisor interface for PVH
guests then it's because we need someone to send an SCI and set GPE
registers, and there is no one but the hypervisor to do that for PVH (I
will add it to the commit message).

As for whether we want to enforce available VCPU count --- I think we
decided that we can't do this for PV and so the question is whether it's
worth doing only for some types of guests. And as you pointed out, the
second question (or maybe the first) is whether enforcing it is the
right thing in the first place.

(BTW, I am thinking to move the domctl from x86-specific to common code
since if we are no longer saying that it's PVH-only then ARM should have
it available too)

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
>>> On 09.11.16 at 15:39, Boris Ostrovsky wrote:
> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
> 'xl vcpu-set'). While this currently is only intended to be needed by
> PVH guests we will call this domctl for all (x86) guests for consistency.

The discussion on the actual change seems to have pointed out all
needs of change, but what I wasn't able to understand yet is why
this is needed in the first place. From hypervisor pov, so far it's been
up to the guest which CPUs get onlined/offlined, and the interface
to request offlining (not an issue for onlining) was - afaict - a purely
voluntary one. Why does this change with PVH? Any such rationale
should be put in the commit message.

Jan
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/14/2016 01:19 PM, Andrew Cooper wrote:
> On 14/11/16 17:48, Boris Ostrovsky wrote:
>> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>>>>> I am not convinced though that we can start enforcing this new VCPU
>>>>> count, at least for PV guests. They expect to start all max VCPUs and
>>>>> then offline them. This, of course, can be fixed but all non-updated
>>>>> kernels will stop booting.
>>>> How about we don't clear _VPF_down if the bit in the availability
>>>> bitmap is not set?
>>> This is yet another PV mess. We clearly need to quirk PV guests as the
>>> exception to sanity, given that they expect (and have been able to)
>>> online all cpus at start-of-day.
>>>
>>> To avoid race conditions, you necessarily need to be able to set a
>>> reduced permitted map before asking the VM to unplug.
>>>
>>> For HVM guests, we can set a proper permitted map at boot, and really
>>> should do so.
>>>
>>> For PV guests, we have to wait until it has completed its SMP bringup
>>> before reducing the permitted set. Therefore, making the initial
>>> set_avail_vcpus call could be deferred until the first unplug request?
>> I am not sure how we can determine in the hypervisor that a guest has
>> completed the bringup: I don't think we can rely on the last VCPU (which
>> is maxvcpu-1) doing VCPUOP_up. Just to mess with the hypervisor the
>> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
>> can't assume a well-behaved guest.
> I wasn't suggesting relying on the guest. I was referring to the first
> unplug request at the toolstack level.

I don't think waiting for toolstack's (un)plug request is going to help
much --- the request may never come and the guest will be able to use
all maxvcpus.

>> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
>> all up, when do we clamp them down to avail_vcpus? For the same reason,
>> we can't assume that the guest will VCPUOP_down all extra VCPUs.
> If at some point we observe all vcpus being up, then we could set the
> restricted map then. However, I can't think of a useful way of
> identifying this point.

Exactly.

The question is then, if we can't do this for PV, should we still do it
for HVM?

>>> It also occurs to me that you necessarily need a get_avail_vcpus
>>> hypercall to be able to use this interface sensibly from the toolstack.
>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>> non-symmetrical.
>>
>> And what would the use of this information be anyway?
> Well, for a start, this information needs to move in the migration
> stream, or by migrating a VM you will lose its current availability
> bitmap and reintroduce the problem we are trying to solve.

Oh, right, of course.

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 14/11/16 17:48, Boris Ostrovsky wrote:
> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>>>> I am not convinced though that we can start enforcing this new VCPU
>>>> count, at least for PV guests. They expect to start all max VCPUs and
>>>> then offline them. This, of course, can be fixed but all non-updated
>>>> kernels will stop booting.
>>> How about we don't clear _VPF_down if the bit in the availability
>>> bitmap is not set?
>> This is yet another PV mess. We clearly need to quirk PV guests as the
>> exception to sanity, given that they expect (and have been able to)
>> online all cpus at start-of-day.
>>
>> To avoid race conditions, you necessarily need to be able to set a
>> reduced permitted map before asking the VM to unplug.
>>
>> For HVM guests, we can set a proper permitted map at boot, and really
>> should do so.
>>
>> For PV guests, we have to wait until it has completed its SMP bringup
>> before reducing the permitted set. Therefore, making the initial
>> set_avail_vcpus call could be deferred until the first unplug request?
> I am not sure how we can determine in the hypervisor that a guest has
> completed the bringup: I don't think we can rely on the last VCPU (which
> is maxvcpu-1) doing VCPUOP_up. Just to mess with the hypervisor the
> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
> can't assume a well-behaved guest.

I wasn't suggesting relying on the guest. I was referring to the first
unplug request at the toolstack level.

> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
> all up, when do we clamp them down to avail_vcpus? For the same reason,
> we can't assume that the guest will VCPUOP_down all extra VCPUs.

If at some point we observe all vcpus being up, then we could set the
restricted map then. However, I can't think of a useful way of
identifying this point.

>> It also occurs to me that you necessarily need a get_avail_vcpus
>> hypercall to be able to use this interface sensibly from the toolstack.
> We could modify getdomaininfo but that would make set_avail_vcpus domctl
> non-symmetrical.
>
> And what would the use of this information be anyway?

Well, for a start, this information needs to move in the migration
stream, or by migrating a VM you will lose its current availability
bitmap and reintroduce the problem we are trying to solve.

> -boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>>> I am not convinced though that we can start enforcing this new VCPU
>>> count, at least for PV guests. They expect to start all max VCPUs and
>>> then offline them. This, of course, can be fixed but all non-updated
>>> kernels will stop booting.
>> How about we don't clear _VPF_down if the bit in the availability bitmap
>> is not set?
> This is yet another PV mess. We clearly need to quirk PV guests as the
> exception to sanity, given that they expect (and have been able to)
> online all cpus at start-of-day.
>
> To avoid race conditions, you necessarily need to be able to set a
> reduced permitted map before asking the VM to unplug.
>
> For HVM guests, we can set a proper permitted map at boot, and really
> should do so.
>
> For PV guests, we have to wait until it has completed its SMP bringup
> before reducing the permitted set. Therefore, making the initial
> set_avail_vcpus call could be deferred until the first unplug request?

I am not sure how we can determine in the hypervisor that a guest has
completed the bringup: I don't think we can rely on the last VCPU (which
is maxvcpu-1) doing VCPUOP_up. Just to mess with the hypervisor the
guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
can't assume a well-behaved guest.

And then, even if we do determine the point when (maxvcpus-1) VCPUs are
all up, when do we clamp them down to avail_vcpus? For the same reason,
we can't assume that the guest will VCPUOP_down all extra VCPUs.

> It also occurs to me that you necessarily need a get_avail_vcpus
> hypercall to be able to use this interface sensibly from the toolstack.

We could modify getdomaininfo but that would make set_avail_vcpus domctl
non-symmetrical.

And what would the use of this information be anyway?

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 14/11/16 14:59, Boris Ostrovsky wrote:
> On 11/09/2016 02:47 PM, Boris Ostrovsky wrote:
>> On 11/09/2016 02:23 PM, Andrew Cooper wrote:
>>> On 09/11/16 15:29, Boris Ostrovsky wrote:
>>>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>>>>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>>>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>>>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>>>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>>>>>
>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>> Acked-by: Daniel De Graaf
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> * Added comment in arch_do_domctl() stating that the domctl is only
>>>>>>   required for PVH guests
>>>>> I am not happy with this change until we understand why it is needed.
>>>>>
>>>>> Are we genuinely saying that there is no current enforcement in the
>>>>> PV-hotplug mechanism?
>>>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>>>> running with maxvcpus.
>>> /sigh - Quality engineering there...
>>>
>>> Yes - let's take the time to actually do this properly.
>>>
>>>>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>>>>> index 2a2fe04..b736d4c 100644
>>>>>> --- a/xen/arch/x86/domctl.c
>>>>>> +++ b/xen/arch/x86/domctl.c
>>>>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>>>>          }
>>>>>>          break;
>>>>>>
>>>>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>>>>> +    {
>>>>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>>>>> +
>>>>>> +        /*
>>>>>> +         * This is currently only needed by PVH guests (but
>>>>>> +         * any guest is free to make this call).
>>>>>> +         */
>>>>>> +        ret = -EINVAL;
>>>>>> +        if ( num > d->max_vcpus )
>>>>>> +            break;
>>>>>> +
>>>>>> +        d->arch.avail_vcpus = num;
>>>>>> +        ret = 0;
>>>>>> +        break;
>>>>>> +    }
>>>>> What do you actually mean by "avail_vcpus"? What happens if a vcpu
>>>>> higher than the new number is currently online and running? What
>>>>> happens to the already-existing vcpus-at-startup value?
>>>> It shouldn't happen: we set avail_vcpus at domain creation time to
>>>> vcpus-at-startup.
>>>>
>>>> The name is not great. It would have been better to have it online_vcpus
>>>> but that usually means that VPF_down is set (which can happen, for
>>>> example, when the guest offlines a VCPU).
>>> How about an availability bitmap instead, which always has max_vcpus
>>> bits in it? Xen should consult the bitmap before allowing a VM to
>>> online a new vcpu.
>> We could indeed use bitmap (and then it will actually be easier to
>> handle io request as we can just copy appropriate bytes of the map
>> instead of going bit-by-bit). This will still require the new domctl.

Given the lack of any enforcing mechanism, introducing one is clearly
needed. Please mention so in the commit message.

>> I am not convinced though that we can start enforcing this new VCPU
>> count, at least for PV guests. They expect to start all max VCPUs and
>> then offline them. This, of course, can be fixed but all non-updated
>> kernels will stop booting.
> How about we don't clear _VPF_down if the bit in the availability bitmap
> is not set?

This is yet another PV mess. We clearly need to quirk PV guests as the
exception to sanity, given that they expect (and have been able to)
online all cpus at start-of-day.

To avoid race conditions, you necessarily need to be able to set a
reduced permitted map before asking the VM to unplug.

For HVM guests, we can set a proper permitted map at boot, and really
should do so.

For PV guests, we have to wait until it has completed its SMP bringup
before reducing the permitted set. Therefore, making the initial
set_avail_vcpus call could be deferred until the first unplug request?

It also occurs to me that you necessarily need a get_avail_vcpus
hypercall to be able to use this interface sensibly from the toolstack.

~Andrew
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/09/2016 02:47 PM, Boris Ostrovsky wrote:
> On 11/09/2016 02:23 PM, Andrew Cooper wrote:
>> On 09/11/16 15:29, Boris Ostrovsky wrote:
>>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>>>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>>>>
>>>>> Signed-off-by: Boris Ostrovsky
>>>>> Acked-by: Daniel De Graaf
>>>>> ---
>>>>> Changes in v2:
>>>>> * Added comment in arch_do_domctl() stating that the domctl is only
>>>>>   required for PVH guests
>>>> I am not happy with this change until we understand why it is needed.
>>>>
>>>> Are we genuinely saying that there is no current enforcement in the
>>>> PV-hotplug mechanism?
>>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>>> running with maxvcpus.
>> /sigh - Quality engineering there...
>>
>> Yes - let's take the time to actually do this properly.
>>
>>>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>>>> index 2a2fe04..b736d4c 100644
>>>>> --- a/xen/arch/x86/domctl.c
>>>>> +++ b/xen/arch/x86/domctl.c
>>>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>>>          }
>>>>>          break;
>>>>>
>>>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>>>> +    {
>>>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>>>> +
>>>>> +        /*
>>>>> +         * This is currently only needed by PVH guests (but
>>>>> +         * any guest is free to make this call).
>>>>> +         */
>>>>> +        ret = -EINVAL;
>>>>> +        if ( num > d->max_vcpus )
>>>>> +            break;
>>>>> +
>>>>> +        d->arch.avail_vcpus = num;
>>>>> +        ret = 0;
>>>>> +        break;
>>>>> +    }
>>>> What do you actually mean by "avail_vcpus"? What happens if a vcpu
>>>> higher than the new number is currently online and running? What
>>>> happens to the already-existing vcpus-at-startup value?
>>> It shouldn't happen: we set avail_vcpus at domain creation time to
>>> vcpus-at-startup.
>>>
>>> The name is not great. It would have been better to have it online_vcpus
>>> but that usually means that VPF_down is set (which can happen, for
>>> example, when the guest offlines a VCPU).
>> How about an availability bitmap instead, which always has max_vcpus
>> bits in it? Xen should consult the bitmap before allowing a VM to
>> online a new vcpu.
> We could indeed use bitmap (and then it will actually be easier to
> handle io request as we can just copy appropriate bytes of the map
> instead of going bit-by-bit). This will still require the new domctl.
>
> I am not convinced though that we can start enforcing this new VCPU
> count, at least for PV guests. They expect to start all max VCPUs and
> then offline them. This, of course, can be fixed but all non-updated
> kernels will stop booting.

How about we don't clear _VPF_down if the bit in the availability bitmap
is not set?

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/09/2016 02:23 PM, Andrew Cooper wrote:
> On 09/11/16 15:29, Boris Ostrovsky wrote:
>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>>>
>>>> Signed-off-by: Boris Ostrovsky
>>>> Acked-by: Daniel De Graaf
>>>> ---
>>>> Changes in v2:
>>>> * Added comment in arch_do_domctl() stating that the domctl is only
>>>>   required for PVH guests
>>> I am not happy with this change until we understand why it is needed.
>>>
>>> Are we genuinely saying that there is no current enforcement in the
>>> PV-hotplug mechanism?
>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>> running with maxvcpus.
> /sigh - Quality engineering there...
>
> Yes - let's take the time to actually do this properly.
>
>>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>>> index 2a2fe04..b736d4c 100644
>>>> --- a/xen/arch/x86/domctl.c
>>>> +++ b/xen/arch/x86/domctl.c
>>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>>          }
>>>>          break;
>>>>
>>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>>> +    {
>>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>>> +
>>>> +        /*
>>>> +         * This is currently only needed by PVH guests (but
>>>> +         * any guest is free to make this call).
>>>> +         */
>>>> +        ret = -EINVAL;
>>>> +        if ( num > d->max_vcpus )
>>>> +            break;
>>>> +
>>>> +        d->arch.avail_vcpus = num;
>>>> +        ret = 0;
>>>> +        break;
>>>> +    }
>>> What do you actually mean by "avail_vcpus"? What happens if a vcpu
>>> higher than the new number is currently online and running? What
>>> happens to the already-existing vcpus-at-startup value?
>> It shouldn't happen: we set avail_vcpus at domain creation time to
>> vcpus-at-startup.
>>
>> The name is not great. It would have been better to have it online_vcpus
>> but that usually means that VPF_down is set (which can happen, for
>> example, when the guest offlines a VCPU).
> How about an availability bitmap instead, which always has max_vcpus
> bits in it? Xen should consult the bitmap before allowing a VM to
> online a new vcpu.

We could indeed use bitmap (and then it will actually be easier to
handle io request as we can just copy appropriate bytes of the map
instead of going bit-by-bit). This will still require the new domctl.

I am not convinced though that we can start enforcing this new VCPU
count, at least for PV guests. They expect to start all max VCPUs and
then offline them. This, of course, can be fixed but all non-updated
kernels will stop booting.

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 09/11/16 15:29, Boris Ostrovsky wrote:
> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>>
>>> Signed-off-by: Boris Ostrovsky
>>> Acked-by: Daniel De Graaf
>>> ---
>>> Changes in v2:
>>> * Added comment in arch_do_domctl() stating that the domctl is only
>>>   required for PVH guests
>> I am not happy with this change until we understand why it is needed.
>>
>> Are we genuinely saying that there is no current enforcement in the
>> PV-hotplug mechanism?
> That's right. Don't call setup_cpu_watcher() in Linux and you will be
> running with maxvcpus.

/sigh - Quality engineering there...

Yes - let's take the time to actually do this properly.

>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>> index 2a2fe04..b736d4c 100644
>>> --- a/xen/arch/x86/domctl.c
>>> +++ b/xen/arch/x86/domctl.c
>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>          }
>>>          break;
>>>
>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>> +    {
>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>> +
>>> +        /*
>>> +         * This is currently only needed by PVH guests (but
>>> +         * any guest is free to make this call).
>>> +         */
>>> +        ret = -EINVAL;
>>> +        if ( num > d->max_vcpus )
>>> +            break;
>>> +
>>> +        d->arch.avail_vcpus = num;
>>> +        ret = 0;
>>> +        break;
>>> +    }
>> What do you actually mean by "avail_vcpus"? What happens if a vcpu
>> higher than the new number is currently online and running? What
>> happens to the already-existing vcpus-at-startup value?
> It shouldn't happen: we set avail_vcpus at domain creation time to
> vcpus-at-startup.
>
> The name is not great. It would have been better to have it online_vcpus
> but that usually means that VPF_down is set (which can happen, for
> example, when the guest offlines a VCPU).

How about an availability bitmap instead, which always has max_vcpus
bits in it? Xen should consult the bitmap before allowing a VM to
online a new vcpu.

~Andrew
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 11/09/2016 10:04 AM, Andrew Cooper wrote:
> On 09/11/16 14:39, Boris Ostrovsky wrote:
>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>> 'xl vcpu-set'). While this currently is only intended to be needed by
>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>
>> Signed-off-by: Boris Ostrovsky
>> Acked-by: Daniel De Graaf
>> ---
>> Changes in v2:
>> * Added comment in arch_do_domctl() stating that the domctl is only
>>   required for PVH guests
> I am not happy with this change until we understand why it is needed.
>
> Are we genuinely saying that there is no current enforcement in the
> PV-hotplug mechanism?

That's right. Don't call setup_cpu_watcher() in Linux and you will be
running with maxvcpus.

>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>> index 2a2fe04..b736d4c 100644
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>          }
>>          break;
>>
>> +    case XEN_DOMCTL_set_avail_vcpus:
>> +    {
>> +        unsigned int num = domctl->u.avail_vcpus.num;
>> +
>> +        /*
>> +         * This is currently only needed by PVH guests (but
>> +         * any guest is free to make this call).
>> +         */
>> +        ret = -EINVAL;
>> +        if ( num > d->max_vcpus )
>> +            break;
>> +
>> +        d->arch.avail_vcpus = num;
>> +        ret = 0;
>> +        break;
>> +    }
> What do you actually mean by "avail_vcpus"? What happens if a vcpu
> higher than the new number is currently online and running? What
> happens to the already-existing vcpus-at-startup value?

It shouldn't happen: we set avail_vcpus at domain creation time to
vcpus-at-startup.

The name is not great. It would have been better to have it online_vcpus
but that usually means that VPF_down is set (which can happen, for
example, when the guest offlines a VCPU).

-boris
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus
On 09/11/16 14:39, Boris Ostrovsky wrote:
> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
> 'xl vcpu-set'). While this currently is only intended to be needed by
> PVH guests we will call this domctl for all (x86) guests for consistency.
>
> Signed-off-by: Boris Ostrovsky
> Acked-by: Daniel De Graaf
> ---
> Changes in v2:
> * Added comment in arch_do_domctl() stating that the domctl is only
>   required for PVH guests

I am not happy with this change until we understand why it is needed.

Are we genuinely saying that there is no current enforcement in the
PV-hotplug mechanism?

> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index 2a2fe04..b736d4c 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>          }
>          break;
>
> +    case XEN_DOMCTL_set_avail_vcpus:
> +    {
> +        unsigned int num = domctl->u.avail_vcpus.num;
> +
> +        /*
> +         * This is currently only needed by PVH guests (but
> +         * any guest is free to make this call).
> +         */
> +        ret = -EINVAL;
> +        if ( num > d->max_vcpus )
> +            break;
> +
> +        d->arch.avail_vcpus = num;
> +        ret = 0;
> +        break;
> +    }

What do you actually mean by "avail_vcpus"? What happens if a vcpu
higher than the new number is currently online and running? What
happens to the already-existing vcpus-at-startup value?

~Andrew