Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus

2019-07-16 Thread Dario Faggioli
On Tue, 2019-07-16 at 12:02 +, George Dunlap wrote:
> > On Jul 16, 2019, at 11:50 AM, Dario Faggioli 
> > On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote:
> > > On 8/25/18 1:21 AM, Dario Faggioli wrote:
> > > > 
> > > The other thing is, from a "developmental purity" point of view,
> > > I
> > > think
> > > this series technically has a regression in the middle: cpu
> > > offline /
> > > online stops working between patch 2 and patch 4.  But I'm
> > > inclined
> > > in
> > > this case not to worry too much. :-)
> > > 
> > Well, the point is that offlining/onlining does not work before
> > this
> > series. System does not crash, but behavior is wrong, as offline
> > vCPUs
> > stay assigned to pCPUs (keeping them idle) while online vCPUs are
> > "trapped" in the wait list, which is wrong.
> > 
> > So that's why I don't think there's much value in being consistent
> > with
> > such behavior throughout the series... which I guess is why you
> > said
> > you "won't worry too much in this case” ?
> 
> It’s definitely sub-optimal from a system point of view; but from a
> guest point of view, it does (or should) function.  Before this
> series, if a guest offline and then online vcpus, they should come
> back.
>
Well, yes, I guess they should. IAC, one of the main backing usecases
for these fixes is when null is used as the scheduler of the Xen PV-
SHIM. In that case, if I remember correctly, the L1 and L2 vCPUs are
created offline, and then onlined dynamically. And they don't come up.
:-)

Anyway...

> In the middle of this series, once a vcpu is offlined, it can’t be
> brought back up.  (That is, if I’m reading it right.)
>
...Yes, at that stage, things are working even less. But more
important...

> But I’m not expecting people to be doing bisections of that
> particular functionality in this particular scheduler very much.  I
> think the “benefit” of avoiding a complicated re-write is well worth
> the “cost” of having that particular bisection fail, on the off
> chance that anyone tries it.
> 
...As we fully agree on this, let's move on

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus

2019-07-16 Thread George Dunlap


> On Jul 16, 2019, at 11:50 AM, Dario Faggioli  wrote:
> 
> On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote:
>> On 8/25/18 1:21 AM, Dario Faggioli wrote:
>>> If a pCPU has been/is being offlined, we don't want it to be
>>> neither
>>> assigned to any pCPU, nor in the wait list.
>> 
>> I take it the first `pCPU` should be `vCPU`?
>> 
> Indeed. :-)
> 
>> Also, English grammar agreement is funny: `neither` needs to agree
>> with
>> `nor`, but the two do *not* agree with the original verb.  That is,
>> the
>> sentence should say:
>> 
>> "...we want it to be neither assigned to pCPU, nor in the wait list".
>> 
> Yep, now that I see it, this rings a bell back from my high-school
> English class! :-O
> 
> Sorry... and thanks. :-)
> 
>> Both here and in the comment.
>> 
> And in patch 3 changelog too, I'm afraid. :-P
> 
>> The other thing is, from a "developmental purity" point of view, I
>> think
>> this series technically has a regression in the middle: cpu offline /
>> online stops working between patch 2 and patch 4.  But I'm inclined
>> in
>> this case not to worry too much. :-)
>> 
> Well, the point is that offlining/onlining does not work before this
> series. System does not crash, but behavior is wrong, as offline vCPUs
> stay assigned to pCPUs (keeping them idle) while online vCPUs are
> "trapped" in the wait list, which is wrong.
> 
> So that's why I don't think there's much value in being consistent with
> such behavior throughout the series... which I guess is why you said
> you "won't worry too much in this case” ?

It’s definitely sub-optimal from a system point of view; but from a guest point 
of view, it does (or should) function.  Before this series, if a guest offline 
and then online vcpus, they should come back.  In the middle of this series, 
once a vcpu is offlined, it can’t be brought back up.  (That is, if I’m reading 
it right.)

But I’m not expecting people to be doing bisections of that particular 
functionality in this particular scheduler very much.  I think the “benefit” of 
avoiding a complicated re-write is well worth the “cost” of having that 
particular bisection fail, on the off chance that anyone tries it.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus

2019-07-16 Thread Dario Faggioli
On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote:
> On 8/25/18 1:21 AM, Dario Faggioli wrote:
> > If a pCPU has been/is being offlined, we don't want it to be
> > neither
> > assigned to any pCPU, nor in the wait list.
> 
> I take it the first `pCPU` should be `vCPU`?
> 
Indeed. :-)

> Also, English grammar agreement is funny: `neither` needs to agree
> with
> `nor`, but the two do *not* agree with the original verb.  That is,
> the
> sentence should say:
> 
> "...we want it to be neither assigned to pCPU, nor in the wait list".
> 
Yep, now that I see it, this rings a bell back from my high-school
English class! :-O

Sorry... and thanks. :-)

> Both here and in the comment.
>
And in patch 3 changelog too, I'm afraid. :-P

> The other thing is, from a "developmental purity" point of view, I
> think
> this series technically has a regression in the middle: cpu offline /
> online stops working between patch 2 and patch 4.  But I'm inclined
> in
> this case not to worry too much. :-)
> 
Well, the point is that offlining/onlining does not work before this
series. System does not crash, but behavior is wrong, as offline vCPUs
stay assigned to pCPUs (keeping them idle) while online vCPUs are
"trapped" in the wait list, which is wrong.

So that's why I don't think there's much value in being consistent with
such behavior throughout the series... which I guess is why you said
you "won't worry too much in this case" ?

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus

2019-07-15 Thread George Dunlap
On 8/25/18 1:21 AM, Dario Faggioli wrote:
> If a pCPU has been/is being offlined, we don't want it to be neither
> assigned to any pCPU, nor in the wait list.

I take it the first `pCPU` should be `vCPU`?

Also, English grammar agreement is funny: `neither` needs to agree with
`nor`, but the two do *not* agree with the original verb.  That is, the
sentence should say:

"...we want it to be neither assigned to pCPU, nor in the wait list".

Both here and in the comment.

The other thing is, from a "developmental purity" point of view, I think
this series technically has a regression in the middle: cpu offline /
online stops working between patch 2 and patch 4.  But I'm inclined in
this case not to worry too much. :-)

Everything else looks good.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus

2018-08-24 Thread Dario Faggioli
If a pCPU has been/is being offlined, we don't want it to be neither
assigned to any pCPU, nor in the wait list.

So, avoid doing that while inserting or migrating vcpus.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Stefano Stabellini 
Cc: Roger Pau Monne 
---
 xen/common/sched_null.c |   26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index f372172c32..1426124525 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -340,6 +340,8 @@ static unsigned int pick_cpu(struct null_private *prv, 
struct vcpu *v)
 static void vcpu_assign(struct null_private *prv, struct vcpu *v,
 unsigned int cpu)
 {
+ASSERT(is_vcpu_online(v));
+
 per_cpu(npc, cpu).vcpu = v;
 v->processor = cpu;
 cpumask_clear_cpu(cpu, >cpus_free);
@@ -454,8 +456,14 @@ static void null_vcpu_insert(const struct scheduler *ops, 
struct vcpu *v)
 ASSERT(!is_idle_vcpu(v));
 
 lock = vcpu_schedule_lock_irq(v);
- retry:
 
+if ( unlikely(!is_vcpu_online(v)) )
+{
+vcpu_schedule_unlock_irq(lock, v);
+return;
+}
+
+ retry:
 cpu = v->processor = pick_cpu(prv, v);
 
 spin_unlock(lock);
@@ -617,6 +625,21 @@ static void null_vcpu_migrate(const struct scheduler *ops, 
struct vcpu *v,
 
 SCHED_STAT_CRANK(migrated);
 
+/*
+ * If it was/it's going offline, we don't want it neither assigned to
+ * a vcpu, nor in the waitqueue.
+ *
+ * If it was on a cpu, we've removed it from there above. If it is in the
+ * waitqueue, we remove it from there now. And then we bail.
+ */
+if ( unlikely(!is_vcpu_online(v)) )
+{
+spin_lock(>waitq_lock);
+list_del_init(>waitq_elem);
+spin_unlock(>waitq_lock);
+goto out;
+}
+
 /*
  * Let's now consider new_cpu, which is where v is being sent. It can be
  * either free, or have a vCPU already assigned to it.
@@ -657,6 +680,7 @@ static void null_vcpu_migrate(const struct scheduler *ops, 
struct vcpu *v,
  * at least. In case of suspend, any temporary inconsistency caused
  * by this, will be fixed-up during resume.
  */
+ out:
 v->processor = new_cpu;
 }
 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel