Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Andrew Cooper
On 15/11/16 17:04, Boris Ostrovsky wrote:
> On 11/15/2016 11:41 AM, Andrew Cooper wrote:
>> It also occurs to me that you necessarily need a get_avail_vcpus
>> hypercall to be able to use this interface sensibly from the toolstack.
> We could modify getdomaininfo but that would make set_avail_vcpus domctl
> non-symmetrical.
>
> And what would the use of this information be anyway?
 Well, for a start, this information needs to move in the migration
 stream, or by migrating a VM you will lose its current availability
 bitmap and reintroduce the problem we are trying to solve.
>>> Oh, right, of course.
>> Everyone forgets migrate.
> (Or nested virt.)

Or vNUMA, PoD, ballooning.  All examples of features which fail or blow
up spectacularly when combined with migration.

>
> As I was poking at this yesterday I realized that we may not need the
> get interface --- this information is available in xenstore via
> cpu/available node (well, it actually doesn't fully function for HVM
> guests but I think it should be fixed).

Xenstore data isn't propagated on migrate, nor should it start to be.
(This isn't quite true.  There is currently one piece of data which is
propagated, because of a Qemu hack, and which needs to be dropped.)

>
> But if you still feel we should have it I'd rather add another field to
> XEN_DOMCTL_getdomaininfo.

There are two points to consider.

1) Where should the information live in an ideal migration stream which
includes hypervisor migration-v2?

As this is being implemented by a new piece of emulation in Xen, there
should be a new Xen record in the stream.
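For illustration only, such a record could be as simple as the sketch
below (the record name, number and layout are made up for the example,
not something settled in this thread):

    #include <stdint.h>

    /* Hypothetical migration-v2 record carrying the vCPU availability map. */
    #define REC_TYPE_X86_AVAIL_VCPUS 0xfffe   /* example number only */

    struct rec_x86_avail_vcpus {
        uint32_t max_vcpus;   /* number of valid bits in the bitmap below */
        uint32_t _pad;
        uint64_t bitmap[];    /* one bit per vCPU, little-endian words */
    };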


2) How are utility functions like `xl vcpu-info` going to find this
information out?

This just requires the information being available somehow in a
hypercall.  Extending XEN_DOMCTL_getdomaininfo would be fine for this
purpose.
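As a sketch, the shape of such an extension is no more than the
following (the field name is an assumption, and a real change would
also need a XEN_DOMCTL_INTERFACE_VERSION bump):

    struct xen_domctl_getdomaininfo {
        /* ... existing fields unchanged ... */
        uint32_t avail_vcpus; /* vCPUs the toolstack currently makes available */
    };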

~Andrew



Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Boris Ostrovsky
On 11/15/2016 11:41 AM, Andrew Cooper wrote:
>
> It also occurs to me that you necessarily need a get_avail_vcpus
> hypercall to be able to use this interface sensibly from the toolstack.
 We could modify getdomaininfo but that would make set_avail_vcpus domctl
 non-symmetrical.

 And what would the use of this information be anyway?
>>> Well, for a start, this information needs to move in the migration
>>> stream, or by migrating a VM you will lose its current availability
>>> bitmap and reintroduce the problem we are trying to solve.
>> Oh, right, of course.
> Everyone forgets migrate.

(Or nested virt.)

As I was poking at this yesterday I realized that we may not need the
get interface --- this information is available in xenstore via
cpu/available node (well, it actually doesn't fully function for HVM
guests but I think it should be fixed).

But if you still feel we should have it I'd rather add another field to
XEN_DOMCTL_getdomaininfo.

-boris

>
> When we eventually get a checklist for new features, migration will
> definitely be on there somewhere.
>
> ~Andrew





Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Andrew Cooper
On 14/11/16 18:44, Boris Ostrovsky wrote:
> On 11/14/2016 01:19 PM, Andrew Cooper wrote:
>> On 14/11/16 17:48, Boris Ostrovsky wrote:
>>> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>> I am not convinced though that we can start enforcing this new VCPU
>> count, at least for PV guests. They expect to start all max VCPUs and
>> then offline them. This, of course, can be fixed but all non-updated
>> kernels will stop booting.
> How about we don't clear _VPF_down if the bit in the availability bitmap
> is not set?
 This is yet another PV mess.  We clearly need to quirk PV guests as the
 exception to sanity, given that they expect (and have been able to)
 online all cpus at start-of-day.

 To avoid race conditions, you necessarily need to be able to set a
 reduced permitted map before asking the VM to unplug.

 For HVM guests, we can set a proper permitted map at boot, and really
 should do so.

 For PV guests, we have to wait until it has completed its SMP bringup
 before reducing the permitted set.  Therefore, making the initial
 set_avail_vcpus call could be deferred until the first unplug request?
>>> I am not sure how we can determine in the hypervisor that a guest has
>>> completed the bringup: I don't think we can rely on the last VCPU (which
>>> is maxvcpus-1) doing VCPUOP_up. Just to mess with the hypervisor, the
>>> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
>>> can't assume a well-behaved guest.
>> I wasn't suggesting relying on the guest.  I was referring to the first
>> unplug request at the toolstack level.
> I don't think waiting for toolstack's (un)plug request is going to help
> much --- the request may never come and the guest will be able to use
> all maxvcpus.

How does this currently work for PV guests, assuming there is no
explicit user input beyond what was written in the domain
configuration file?

>
>
>>> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
>>> all up, when do we clamp them down to avail_vcpus? For the same reason,
>>> we can't assume that the guest will VCPUOP_down all extra VCPUs.
>> If at some point we observe all vcpus being up, then we could set the
>> restricted map then.  However, I can't think of a useful way of
>> identifying this point.
> Exactly.
>
> The question is then, if we can't do this for PV, should we still do it
> for HVM?

Absolutely.  It is embarrassing that there isn't enforcement for PV.

Getting extra vcpu power for a short while is something that you
typically buy from a cloud provider.

>
 It also occurs to me that you necessarily need a get_avail_vcpus
 hypercall to be able to use this interface sensibly from the toolstack.
>>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>>> non-symmetrical.
>>>
>>> And what would the use of this information be anyway?
>> Well, for a start, this information needs to move in the migration
>> stream, or by migrating a VM you will lose its current availability
>> bitmap and reintroduce the problem we are trying to solve.
> Oh, right, of course.

Everyone forgets migrate.

When we eventually get a checklist for new features, migration will
definitely be on there somewhere.

~Andrew



Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Boris Ostrovsky
On 11/15/2016 09:33 AM, Jan Beulich wrote:
 On 15.11.16 at 15:28,  wrote:
>> On 11/15/2016 03:34 AM, Jan Beulich wrote:
>> On 09.11.16 at 15:39,  wrote:
 This domctl is called when a VCPU is hot-(un)plugged to a guest (via
 'xl vcpu-set'). While this currently is only intended to be needed by
 PVH guests we will call this domctl for all (x86) guests for consistency.
>>> The discussion on the actual change seems to have pointed out all
>>> needs of change, but what I wasn't able to understand yet is why
>>> this is needed in the first place. From hypervisor pov, so far it's been
>>> up to the guest which CPUs get onlined/offlined, and the interface
>>> to request offlining (not an issue for onlining) was - afaict - a purely
>>> voluntary one. Why does this change with PVH? Any such rationale
>>> should be put in the commit message.
>> If the question is why we need to have a hypervisor interface for PVH
>> guests then it's because we need someone to send an SCI and set GPE
>> registers, and there is no one but the hypervisor to do that for PVH (I
>> will add it to the commit message).
> Yes, that was the primary question. And it took me until quite late
> in the series until I've seen the purpose, so I appreciate you
> extending the description, even if just slightly.
>
>> As for whether we want to enforce available VCPU count --- I think we
>> decided that we can't do this for PV and so the question is whether it's
>> worth doing only for some types of guests. And as you pointed out the
>> second question (or maybe the first) is whether enforcing it is the
>> right thing in the first place.
>>
>> (BTW, I am thinking to move the domctl from x86-specific to common code
>> since if we are no longer saying that it's PVH-only then ARM should have
>> it available too)
> Well, without us wanting to enforce anything, would we still need the
> domctl for anything?

While we can send an SCI directly from the toolstack, we need to set the
GPE registers for the guest, and that's what the domctl's primary use is (was).
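Roughly speaking, the hypervisor-side work the domctl triggers looks
like the sketch below; the helper names are placeholders for the
emulation being added in this series, not existing Xen functions:

    /* Conceptual sketch only: latch the CPU-hotplug event in the guest's
     * GPE0 status block and, if the guest has unmasked it, raise an SCI. */
    static void signal_vcpu_hotplug(struct domain *d)
    {
        set_gpe_status_bit(d, GPE_CPU_HOTPLUG_BIT);    /* placeholder */
        if ( gpe_bit_enabled(d, GPE_CPU_HOTPLUG_BIT) ) /* GPE0_EN check, placeholder */
            assert_sci(d);                             /* placeholder */
    }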

-boris




Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Jan Beulich
>>> On 15.11.16 at 15:28,  wrote:
> On 11/15/2016 03:34 AM, Jan Beulich wrote:
> On 09.11.16 at 15:39,  wrote:
>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>> The discussion on the actual change seems to have pointed out all
>> needs of change, but what I wasn't able to understand yet is why
>> this is needed in the first place. From hypervisor pov, so far it's been
>> up to the guest which CPUs get onlined/offlined, and the interface
>> to request offlining (not an issue for onlining) was - afaict - a purely
>> voluntary one. Why does this change with PVH? Any such rationale
>> should be put in the commit message.
> 
> If the question is why we need to have a hypervisor interface for PVH
> guests then it's because we need someone to send an SCI and set GPE
> registers, and there is no one but the hypervisor to do that for PVH (I
> will add it to the commit message).

Yes, that was the primary question. And it took me until quite late
in the series until I've seen the purpose, so I appreciate you
extending the description, even if just slightly.

> As for whether we want to enforce available VCPU count --- I think we
> decided that we can't do this for PV and so the question is whether it's
> worth doing only for some types of guests. And as you pointed out the
> second question (or maybe the first) is whether enforcing it is the
> right thing in the first place.
> 
> (BTW, I am thinking to move the domctl from x86-specific to common code
> since if we are no longer saying that it's PVH-only then ARM should have
> it available too)

Well, without us wanting to enforce anything, would we still need the
domctl for anything?

Jan




Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Boris Ostrovsky
On 11/15/2016 03:34 AM, Jan Beulich wrote:
 On 09.11.16 at 15:39,  wrote:
>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>> 'xl vcpu-set'). While this currently is only intended to be needed by
>> PVH guests we will call this domctl for all (x86) guests for consistency.
> The discussion on the actual change seems to have pointed out all
> needs of change, but what I wasn't able to understand yet is why
> this is needed in the first place. From hypervisor pov, so far it's been
> up to the guest which CPUs get onlined/offlined, and the interface
> to request offlining (not an issue for onlining) was - afaict - a purely
> voluntary one. Why does this change with PVH? Any such rationale
> should be put in the commit message.


If the question is why we need to have a hypervisor interface for PVH
guests then it's because we need someone to send an SCI and set GPE
registers, and there is no one but the hypervisor to do that for PVH (I
will add it to the commit message).

As for whether we want to enforce available VCPU count --- I think we
decided that we can't do this for PV and so the question is whether it's
worth doing only for some types of guests. And as you pointed out the
second question (or maybe the first) is whether enforcing it is the
right thing in the first place.

(BTW, I am thinking to move the domctl from x86-specific to common code
since if we are no longer saying that it's PVH-only then ARM should have
it available too)

-boris




Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-15 Thread Jan Beulich
>>> On 09.11.16 at 15:39,  wrote:
> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
> 'xl vcpu-set'). While this currently is only intended to be needed by
> PVH guests we will call this domctl for all (x86) guests for consistency.

The discussion on the actual change seems to have pointed out all
needs of change, but what I wasn't able to understand yet is why
this is needed in the first place. From hypervisor pov, so far it's been
up to the guest which CPUs get onlined/offlined, and the interface
to request offlining (not an issue for onlining) was - afaict - a purely
voluntary one. Why does this change with PVH? Any such rationale
should be put in the commit message.

Jan




Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-14 Thread Boris Ostrovsky
On 11/14/2016 01:19 PM, Andrew Cooper wrote:
> On 14/11/16 17:48, Boris Ostrovsky wrote:
>> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
> I am not convinced though that we can start enforcing this new VCPU
> count, at least for PV guests. They expect to start all max VCPUs and
> then offline them. This, of course, can be fixed but all non-updated
> kernels will stop booting.
 How about we don't clear _VPF_down if the bit in the availability bitmap
 is not set?
>>> This is yet another PV mess.  We clearly need to quirk PV guests as the
>>> exception to sanity, given that they expect (and have been able to)
>>> online all cpus at start-of-day.
>>>
>>> To avoid race conditions, you necessarily need to be able to set a
>>> reduced permitted map before asking the VM to unplug.
>>>
>>> For HVM guests, we can set a proper permitted map at boot, and really
>>> should do so.
>>>
>>> For PV guests, we have to wait until it has completed its SMP bringup
>>> before reducing the permitted set.  Therefore, making the initial
>>> set_avail_vcpus call could be deferred until the first unplug request?
>> I am not sure how we can determine in the hypervisor that a guest has
>> completed the bringup: I don't think we can rely on the last VCPU (which
>> is maxvcpus-1) doing VCPUOP_up. Just to mess with the hypervisor, the
>> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
>> can't assume a well-behaved guest.
> I wasn't suggesting relying on the guest.  I was referring to the first
> unplug request at the toolstack level.

I don't think waiting for toolstack's (un)plug request is going to help
much --- the request may never come and the guest will be able to use
all maxvcpus.


>
>> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
>> all up, when do we clamp them down to avail_vcpus? For the same reason,
>> we can't assume that the guest will VCPUOP_down all extra VCPUs.
> If at some point we observe all vcpus being up, then we could set the
> restricted map then.  However, I can't think of a useful way of
> identifying this point.

Exactly.

The question is then, if we can't do this for PV, should we still do it
for HVM?

>
>>> It also occurs to me that you necessarily need a get_avail_vcpus
>>> hypercall to be able to use this interface sensibly from the toolstack.
>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>> non-symmetrical.
>>
>> And what would the use of this information be anyway?
> Well, for a start, this information needs to move in the migration
> stream, or by migrating a VM you will lose its current availability
> bitmap and reintroduce the problem we are trying to solve.

Oh, right, of course.

-boris





Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-14 Thread Andrew Cooper
On 14/11/16 17:48, Boris Ostrovsky wrote:
> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
 I am not convinced though that we can start enforcing this new VCPU
 count, at least for PV guests. They expect to start all max VCPUs and
 then offline them. This, of course, can be fixed but all non-updated
 kernels will stop booting.
>>> How about we don't clear _VPF_down if the bit in the availability bitmap
>>> is not set?
>> This is yet another PV mess.  We clearly need to quirk PV guests as the
>> exception to sanity, given that they expect (and have been able to)
>> online all cpus at start-of-day.
>>
>> To avoid race conditions, you necessarily need to be able to set a
>> reduced permitted map before asking the VM to unplug.
>>
>> For HVM guests, we can set a proper permitted map at boot, and really
>> should do so.
>>
>> For PV guests, we have to wait until it has completed its SMP bringup
>> before reducing the permitted set.  Therefore, making the initial
>> set_avail_vcpus call could be deferred until the first unplug request?
> I am not sure how we can determine in the hypervisor that a guest has
> completed the bringup: I don't think we can rely on the last VCPU (which
> is maxvcpus-1) doing VCPUOP_up. Just to mess with the hypervisor, the
> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
> can't assume a well-behaved guest.

I wasn't suggesting relying on the guest.  I was referring to the first
unplug request at the toolstack level.

> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
> all up, when do we clamp them down to avail_vcpus? For the same reason,
> we can't assume that the guest will VCPUOP_down all extra VCPUs.

If at some point we observe all vcpus being up, then we could set the
restricted map then.  However, I can't think of a useful way of
identifying this point.

>> It also occurs to me that you necessarily need a get_avail_vcpus
>> hypercall to be able to use this interface sensibly from the toolstack.
> We could modify getdomaininfo but that would make set_avail_vcpus domctl
> non-symmetrical.
>
> And what would the use of this information be anyway?

Well, for a start, this information needs to move in the migration
stream, or by migrating a VM you will lose its current availability
bitmap and reintroduce the problem we are trying to solve.

>
> -boris
>
>




Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-14 Thread Boris Ostrovsky
On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>
>>> I am not convinced though that we can start enforcing this new VCPU
>>> count, at least for PV guests. They expect to start all max VCPUs and
>>> then offline them. This, of course, can be fixed but all non-updated
>>> kernels will stop booting.
>> How about we don't clear _VPF_down if the bit in the availability bitmap
>> is not set?
> This is yet another PV mess.  We clearly need to quirk PV guests as the
> exception to sanity, given that they expect (and have been able to)
> online all cpus at start-of-day.
>
> To avoid race conditions, you necessarily need to be able to set a
> reduced permitted map before asking the VM to unplug.
>
> For HVM guests, we can set a proper permitted map at boot, and really
> should do so.
>
> For PV guests, we have to wait until it has completed its SMP bringup
> before reducing the permitted set.  Therefore, making the initial
> set_avail_vcpus call could be deferred until the first unplug request?

I am not sure how we can determine in the hypervisor that a guest has
completed the bringup: I don't think we can rely on the last VCPU (which
is maxvcpus-1) doing VCPUOP_up. Just to mess with the hypervisor, the
guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
can't assume a well-behaved guest.

And then, even if we do determine the point when (maxvcpus-1) VCPUs are
all up, when do we clamp them down to avail_vcpus? For the same reason,
we can't assume that the guest will VCPUOP_down all extra VCPUs.


>
> It also occurs to me that you necessarily need a get_avail_vcpus
> hypercall to be able to use this interface sensibly from the toolstack.

We could modify getdomaininfo but that would make set_avail_vcpus domctl
non-symmetrical.

And what would the use of this information be anyway?

-boris





Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-14 Thread Andrew Cooper
On 14/11/16 14:59, Boris Ostrovsky wrote:
> On 11/09/2016 02:47 PM, Boris Ostrovsky wrote:
>> On 11/09/2016 02:23 PM, Andrew Cooper wrote:
>>> On 09/11/16 15:29, Boris Ostrovsky wrote:
 On 11/09/2016 10:04 AM, Andrew Cooper wrote:
> On 09/11/16 14:39, Boris Ostrovsky wrote:
>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>> 'xl vcpu-set'). While this currently is only intended to be needed by
>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>
>> Signed-off-by: Boris Ostrovsky 
>> Acked-by: Daniel De Graaf 
>> ---
>> Changes in v2:
>> * Added comment in arch_do_domctl() stating that the domctl is only required
>>   for PVH guests
> I am not happy with this change until we understand why it is needed.
>
> Are we genuinely saying that there is no current enforcement in the
> PV-hotplug mechanism?
 That's right. Don't call setup_cpu_watcher() in Linux and you will be
 running with maxvcpus.
>>> /sigh - Quality engineering there...
>>>
>>> Yes - let's take the time to actually do this properly.
>>>
>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>> index 2a2fe04..b736d4c 100644
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>          }
>>          break;
>>  
>> +    case XEN_DOMCTL_set_avail_vcpus:
>> +    {
>> +        unsigned int num = domctl->u.avail_vcpus.num;
>> +
>> +        /*
>> +         * This is currently only needed by PVH guests (but
>> +         * any guest is free to make this call).
>> +         */
>> +        ret = -EINVAL;
>> +        if ( num > d->max_vcpus )
>> +            break;
>> +
>> +        d->arch.avail_vcpus = num;
>> +        ret = 0;
>> +        break;
>> +    }
> What do you actually mean by "avail_vcpus"?  What happens if a vcpu
> higher than the new number is currently online and running?  What
> happens to the already-existing vcpus-at-startup value?
 It shouldn't happen: we set avail_vcpus at domain creation time to
 vcpus-at-startup.

 The name is not great. It would have been better to have it online_vcpus
 but that usually means that VPF_down is set (which can happen, for
 example, when the guest offlines a VCPU).
>>> How about an availability bitmap instead, which always has max_vcpus
>>> bits in it?  Xen should consult the bitmap before allowing a VM to
>>> online a new vcpu.
>> We could indeed use bitmap (and then it will actually be easier to
>> handle io request as we can just copy appropriate bytes of the map
>> instead of going bit-by-bit). This will still require the new domctl.

Given the lack of any enforcing mechanism, introducing one is clearly
needed.  Please mention so in the commit message.

>>
>> I am not convinced though that we can start enforcing this new VCPU
>> count, at least for PV guests. They expect to start all max VCPUs and
>> then offline them. This, of course, can be fixed but all non-updated
>> kernels will stop booting.
>
> How about we don't clear _VPF_down if the bit in the availability bitmap
> is not set?

This is yet another PV mess.  We clearly need to quirk PV guests as the
exception to sanity, given that they expect (and have been able to)
online all cpus at start-of-day.

To avoid race conditions, you necessarily need to be able to set a
reduced permitted map before asking the VM to unplug.
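In toolstack terms the ordering is roughly the sketch below; both helper
names are hypothetical stand-ins for the two steps, not existing
functions:

    /* Shrink the permitted set first, then ask the guest to unplug, so the
     * guest cannot re-online a vCPU in the window between the two steps. */
    static int vcpu_set(xc_interface *xch, uint32_t domid, unsigned int new_avail)
    {
        int rc = xc_domain_set_avail_vcpus(xch, domid, new_avail); /* hypothetical wrapper for the new domctl */

        if ( rc )
            return rc;

        return notify_guest_unplug(xch, domid, new_avail); /* placeholder: SCI/GPE or xenstore poke */
    }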

For HVM guests, we can set a proper permitted map at boot, and really
should do so.

For PV guests, we have to wait until it has completed its SMP bringup
before reducing the permitted set.  Therefore, making the initial
set_avail_vcpus call could be deferred until the first unplug request?

It also occurs to me that you necessarily need a get_avail_vcpus
hypercall to be able to use this interface sensibly from the toolstack.

~Andrew



Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-14 Thread Boris Ostrovsky
On 11/09/2016 02:47 PM, Boris Ostrovsky wrote:
> On 11/09/2016 02:23 PM, Andrew Cooper wrote:
>> On 09/11/16 15:29, Boris Ostrovsky wrote:
>>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
 On 09/11/16 14:39, Boris Ostrovsky wrote:
> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
> 'xl vcpu-set'). While this currently is only intended to be needed by
> PVH guests we will call this domctl for all (x86) guests for consistency.
>
> Signed-off-by: Boris Ostrovsky 
> Acked-by: Daniel De Graaf 
> ---
> Changes in v2:
> * Added comment in arch_do_domctl() stating that the domctl is only required
>   for PVH guests
 I am not happy with this change until we understand why it is needed.

 Are we genuinely saying that there is no current enforcement in the
 PV-hotplug mechanism?
>>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>>> running with maxvcpus.
>> /sigh - Quality engineering there...
>>
>> Yes - let's take the time to actually do this properly.
>>
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index 2a2fe04..b736d4c 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>          }
>          break;
>  
> +    case XEN_DOMCTL_set_avail_vcpus:
> +    {
> +        unsigned int num = domctl->u.avail_vcpus.num;
> +
> +        /*
> +         * This is currently only needed by PVH guests (but
> +         * any guest is free to make this call).
> +         */
> +        ret = -EINVAL;
> +        if ( num > d->max_vcpus )
> +            break;
> +
> +        d->arch.avail_vcpus = num;
> +        ret = 0;
> +        break;
> +    }
 What do you actually mean by "avail_vcpus"?  What happens if a vcpu
 higher than the new number is currently online and running?  What
 happens to the already-existing vcpus-at-startup value?
>>> It shouldn't happen: we set avail_vcpus at domain creation time to
>>> vcpus-at-startup.
>>>
>>> The name is not great. It would have been better to have it online_vcpus
>>> but that usually means that VPF_down is set (which can happen, for
>>> example, when the guest offlines a VCPU).
>> How about an availability bitmap instead, which always has max_vcpus
>> bits in it?  Xen should consult the bitmap before allowing a VM to
>> online a new vcpu.
> We could indeed use bitmap (and then it will actually be easier to
> handle io request as we can just copy appropriate bytes of the map
> instead of going bit-by-bit). This will still require the new domctl.
>
> I am not convinced though that we can start enforcing this new VCPU
> count, at least for PV guests. They expect to start all max VCPUs and
> then offline them. This, of course, can be fixed but all non-updated
> kernels will stop booting.


How about we don't clear _VPF_down if the bit in the availability bitmap
is not set?
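Something along these lines, as a sketch (avail_vcpus_map is the
hypothetical per-domain bitmap discussed earlier, not what the posted
patch adds):

    /* VCPUOP_up path, sketched: only clear _VPF_down (i.e. actually bring
     * the vCPU online) if the toolstack has marked it available. */
    static int vcpu_up_checked(struct vcpu *v)
    {
        struct domain *d = v->domain;

        if ( !test_bit(v->vcpu_id, d->arch.avail_vcpus_map) )
            return -EPERM;

        clear_bit(_VPF_down, &v->pause_flags);
        vcpu_wake(v);

        return 0;
    }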


-boris



Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-09 Thread Boris Ostrovsky
On 11/09/2016 02:23 PM, Andrew Cooper wrote:
> On 09/11/16 15:29, Boris Ostrovsky wrote:
>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>>> On 09/11/16 14:39, Boris Ostrovsky wrote:
 This domctl is called when a VCPU is hot-(un)plugged to a guest (via
 'xl vcpu-set'). While this currently is only intended to be needed by
 PVH guests we will call this domctl for all (x86) guests for consistency.

 Signed-off-by: Boris Ostrovsky 
 Acked-by: Daniel De Graaf 
 ---
 Changes in v2:
 * Added comment in arch_do_domctl() stating that the domctl is only required
   for PVH guests
>>> I am not happy with this change until we understand why it is needed.
>>>
>>> Are we genuinely saying that there is no current enforcement in the
>>> PV-hotplug mechanism?
>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>> running with maxvcpus.
> /sigh - Quality engineering there...
>
> Yes - let's take the time to actually do this properly.
>
 diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
 index 2a2fe04..b736d4c 100644
 --- a/xen/arch/x86/domctl.c
 +++ b/xen/arch/x86/domctl.c
 @@ -1430,6 +1430,23 @@ long arch_do_domctl(
          }
          break;
  
 +    case XEN_DOMCTL_set_avail_vcpus:
 +    {
 +        unsigned int num = domctl->u.avail_vcpus.num;
 +
 +        /*
 +         * This is currently only needed by PVH guests (but
 +         * any guest is free to make this call).
 +         */
 +        ret = -EINVAL;
 +        if ( num > d->max_vcpus )
 +            break;
 +
 +        d->arch.avail_vcpus = num;
 +        ret = 0;
 +        break;
 +    }
>>> What do you actually mean by "avail_vcpus"?  What happens if a vcpu
>>> higher than the new number is currently online and running?  What
>>> happens to the already-existing vcpus-at-startup value?
>> It shouldn't happen: we set avail_vcpus at domain creation time to
>> vcpus-at-startup.
>>
>> The name is not great. It would have been better to have it online_vcpus
>> but that usually means that VPF_down is set (which can happen, for
>> example, when the guest offlines a VCPU).
> How about an availability bitmap instead, which always has max_vcpus
> bits in it?  Xen should consult the bitmap before allowing a VM to
> online a new vcpu.

We could indeed use a bitmap (and then it will actually be easier to
handle I/O requests, as we can just copy the appropriate bytes of the map
instead of going bit-by-bit). This will still require the new domctl.
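As a sketch of that point (the register block layout and the bitmap
field are assumptions), a read of an emulated CPU-map register can then
be served with a plain byte copy out of the availability bitmap:

    static int cpu_map_read(struct domain *d, unsigned int offset,
                            unsigned int bytes, uint64_t *val)
    {
        /* avail_vcpus_map is the hypothetical per-domain bitmap. */
        const uint8_t *map = (const uint8_t *)d->arch.avail_vcpus_map;

        *val = 0;
        if ( offset + bytes <= (d->max_vcpus + 7) / 8 )
            memcpy(val, map + offset, bytes);  /* whole bytes, no bit loop */

        return X86EMUL_OKAY;
    }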

I am not convinced though that we can start enforcing this new VCPU
count, at least for PV guests. They expect to start all max VCPUs and
then offline them. This, of course, can be fixed but all non-updated
kernels will stop booting.
-boris




Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-09 Thread Andrew Cooper
On 09/11/16 15:29, Boris Ostrovsky wrote:
> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>>> 'xl vcpu-set'). While this currently is only intended to be needed by
>>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>>
>>> Signed-off-by: Boris Ostrovsky 
>>> Acked-by: Daniel De Graaf 
>>> ---
>>> Changes in v2:
>>> * Added comment in arch_do_domctl() stating that the domctl is only required
>>>   for PVH guests
>> I am not happy with this change until we understand why it is needed.
>>
>> Are we genuinely saying that there is no current enforcement in the
>> PV-hotplug mechanism?
> That's right. Don't call setup_cpu_watcher() in Linux and you will be
> running with maxvcpus.

/sigh - Quality engineering there...

Yes - let's take the time to actually do this properly.

>
>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>> index 2a2fe04..b736d4c 100644
>>> --- a/xen/arch/x86/domctl.c
>>> +++ b/xen/arch/x86/domctl.c
>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>          }
>>>          break;
>>>  
>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>> +    {
>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>> +
>>> +        /*
>>> +         * This is currently only needed by PVH guests (but
>>> +         * any guest is free to make this call).
>>> +         */
>>> +        ret = -EINVAL;
>>> +        if ( num > d->max_vcpus )
>>> +            break;
>>> +
>>> +        d->arch.avail_vcpus = num;
>>> +        ret = 0;
>>> +        break;
>>> +    }
>> What do you actually mean by "avail_vcpus"?  What happens if a vcpu
>> higher than the new number is currently online and running?  What
>> happens to the already-existing vcpus-at-startup value?
> It shouldn't happen: we set avail_vcpus at domain creation time to
> vcpus-at-startup.
>
> The name is not great. It would have been better to have it online_vcpus
> but that usually means that VPF_down is set (which can happen, for
> example, when the guest offlines a VCPU).

How about an availability bitmap instead, which always has max_vcpus
bits in it?  Xen should consult the bitmap before allowing a VM to
online a new vcpu.
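As a sketch, the domctl side of that looks much like the hunk quoted
above, but filling a per-domain bitmap (avail_vcpus_map is an assumed
field name) instead of storing a count:

    case XEN_DOMCTL_set_avail_vcpus:
    {
        unsigned int i, num = domctl->u.avail_vcpus.num;

        ret = -EINVAL;
        if ( num > d->max_vcpus )
            break;

        /* vCPUs 0 .. num-1 become available, the rest unavailable. */
        bitmap_zero(d->arch.avail_vcpus_map, d->max_vcpus);
        for ( i = 0; i < num; i++ )
            __set_bit(i, d->arch.avail_vcpus_map);

        ret = 0;
        break;
    }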

~Andrew



Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-09 Thread Boris Ostrovsky
On 11/09/2016 10:04 AM, Andrew Cooper wrote:
> On 09/11/16 14:39, Boris Ostrovsky wrote:
>> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
>> 'xl vcpu-set'). While this currently is only intended to be needed by
>> PVH guests we will call this domctl for all (x86) guests for consistency.
>>
>> Signed-off-by: Boris Ostrovsky 
>> Acked-by: Daniel De Graaf 
>> ---
>> Changes in v2:
>> * Added comment in arch_do_domctl() stating that the domctl is only required
>>   for PVH guests
> I am not happy with this change until we understand why it is needed.
>
> Are we genuinely saying that there is no current enforcement in the
> PV-hotplug mechanism?

That's right. Don't call setup_cpu_watcher() in Linux and you will be
running with maxvcpus.
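For reference, the guest-side mechanism in question is roughly the
following (paraphrased from the Linux PV hotplug code, not the literal
source): the kernel watches xenstore and onlines/offlines vCPUs
voluntarily, with nothing on the Xen side enforcing it:

    /* Simplified sketch of the xenstore-driven vCPU hotplug handler. */
    static void vcpu_hotplug(unsigned int cpu)
    {
        char dir[32], state[16];

        snprintf(dir, sizeof(dir), "cpu/%u", cpu);
        if (xenbus_scanf(XBT_NIL, dir, "availability", "%15s", state) != 1)
            return;

        if (!strcmp(state, "online"))
            cpu_up(cpu);       /* guest onlines the vCPU itself */
        else if (!strcmp(state, "offline"))
            cpu_down(cpu);     /* ...and offlines it itself, too */
    }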

>
>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>> index 2a2fe04..b736d4c 100644
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>          }
>>          break;
>>  
>> +    case XEN_DOMCTL_set_avail_vcpus:
>> +    {
>> +        unsigned int num = domctl->u.avail_vcpus.num;
>> +
>> +        /*
>> +         * This is currently only needed by PVH guests (but
>> +         * any guest is free to make this call).
>> +         */
>> +        ret = -EINVAL;
>> +        if ( num > d->max_vcpus )
>> +            break;
>> +
>> +        d->arch.avail_vcpus = num;
>> +        ret = 0;
>> +        break;
>> +    }
> What do you actually mean by "avail_vcpus"?  What happens if a vcpu
> higher than the new number is currently online and running?  What
> happens to the already-existing vcpus-at-startup value?

It shouldn't happen: we set avail_vcpus at domain creation time to
vcpus-at-startup.

The name is not great. It would have been better to have it online_vcpus
but that usually means that VPF_down is set (which can happen, for
example, when the guest offlines a VCPU).

-boris





Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus

2016-11-09 Thread Andrew Cooper
On 09/11/16 14:39, Boris Ostrovsky wrote:
> This domctl is called when a VCPU is hot-(un)plugged to a guest (via
> 'xl vcpu-set'). While this currently is only intended to be needed by
> PVH guests we will call this domctl for all (x86) guests for consistency.
>
> Signed-off-by: Boris Ostrovsky 
> Acked-by: Daniel De Graaf 
> ---
> Changes in v2:
> * Added comment in arch_do_domctl() stating that the domctl is only required
>   for PVH guests

I am not happy with this change until we understand why it is needed.

Are we genuinely saying that there is no current enforcement in the
PV-hotplug mechanism?

> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index 2a2fe04..b736d4c 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>          }
>          break;
>  
> +    case XEN_DOMCTL_set_avail_vcpus:
> +    {
> +        unsigned int num = domctl->u.avail_vcpus.num;
> +
> +        /*
> +         * This is currently only needed by PVH guests (but
> +         * any guest is free to make this call).
> +         */
> +        ret = -EINVAL;
> +        if ( num > d->max_vcpus )
> +            break;
> +
> +        d->arch.avail_vcpus = num;
> +        ret = 0;
> +        break;
> +    }

What do you actually mean by "avail_vcpus"?  What happens if a vcpu
higher than the new number is currently online and running?  What
happens to the already-existing vcpus-at-startup value?

~Andrew
