Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
On Tue, 2019-07-16 at 12:02 +, George Dunlap wrote: > > On Jul 16, 2019, at 11:50 AM, Dario Faggioli > > On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote: > > > On 8/25/18 1:21 AM, Dario Faggioli wrote: > > > > > > > The other thing is, from a "developmental purity" point of view, > > > I > > > think > > > this series technically has a regression in the middle: cpu > > > offline / > > > online stops working between patch 2 and patch 4. But I'm > > > inclined > > > in > > > this case not to worry too much. :-) > > > > > Well, the point is that offlining/onlining does not work before > > this > > series. System does not crash, but behavior is wrong, as offline > > vCPUs > > stay assigned to pCPUs (keeping them idle) while online vCPUs are > > "trapped" in the wait list, which is wrong. > > > > So that's why I don't think there's much value in being consistent > > with > > such behavior throughout the series... which I guess is why you > > said > > you "won't worry too much in this case” ? > > It’s definitely sub-optimal from a system point of view; but from a > guest point of view, it does (or should) function. Before this > series, if a guest offline and then online vcpus, they should come > back. > Well, yes, I guess they should. IAC, one of the main backing usecases for these fixes is when null is used as the scheduler of the Xen PV- SHIM. In that case, if I remember correctly, the L1 and L2 vCPUs are created offline, and then onlined dynamically. And they don't come up. :-) Anyway... > In the middle of this series, once a vcpu is offlined, it can’t be > brought back up. (That is, if I’m reading it right.) > ...Yes, at that stage, things are working even less. But more important... > But I’m not expecting people to be doing bisections of that > particular functionality in this particular scheduler very much. I > think the “benefit” of avoiding a complicated re-write is well worth > the “cost” of having that particular bisection fail, on the off > chance that anyone tries it. > ...As we fully agree on this, let's move on Regards -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ --- <> (Raistlin Majere) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
> On Jul 16, 2019, at 11:50 AM, Dario Faggioli wrote: > > On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote: >> On 8/25/18 1:21 AM, Dario Faggioli wrote: >>> If a pCPU has been/is being offlined, we don't want it to be >>> neither >>> assigned to any pCPU, nor in the wait list. >> >> I take it the first `pCPU` should be `vCPU`? >> > Indeed. :-) > >> Also, English grammar agreement is funny: `neither` needs to agree >> with >> `nor`, but the two do *not* agree with the original verb. That is, >> the >> sentence should say: >> >> "...we want it to be neither assigned to pCPU, nor in the wait list". >> > Yep, now that I see it, this rings a bell back from my high-school > English class! :-O > > Sorry... and thanks. :-) > >> Both here and in the comment. >> > And in patch 3 changelog too, I'm afraid. :-P > >> The other thing is, from a "developmental purity" point of view, I >> think >> this series technically has a regression in the middle: cpu offline / >> online stops working between patch 2 and patch 4. But I'm inclined >> in >> this case not to worry too much. :-) >> > Well, the point is that offlining/onlining does not work before this > series. System does not crash, but behavior is wrong, as offline vCPUs > stay assigned to pCPUs (keeping them idle) while online vCPUs are > "trapped" in the wait list, which is wrong. > > So that's why I don't think there's much value in being consistent with > such behavior throughout the series... which I guess is why you said > you "won't worry too much in this case” ? It’s definitely sub-optimal from a system point of view; but from a guest point of view, it does (or should) function. Before this series, if a guest offline and then online vcpus, they should come back. In the middle of this series, once a vcpu is offlined, it can’t be brought back up. (That is, if I’m reading it right.) But I’m not expecting people to be doing bisections of that particular functionality in this particular scheduler very much. I think the “benefit” of avoiding a complicated re-write is well worth the “cost” of having that particular bisection fail, on the off chance that anyone tries it. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
On Mon, 2019-07-15 at 17:06 +0100, George Dunlap wrote: > On 8/25/18 1:21 AM, Dario Faggioli wrote: > > If a pCPU has been/is being offlined, we don't want it to be > > neither > > assigned to any pCPU, nor in the wait list. > > I take it the first `pCPU` should be `vCPU`? > Indeed. :-) > Also, English grammar agreement is funny: `neither` needs to agree > with > `nor`, but the two do *not* agree with the original verb. That is, > the > sentence should say: > > "...we want it to be neither assigned to pCPU, nor in the wait list". > Yep, now that I see it, this rings a bell back from my high-school English class! :-O Sorry... and thanks. :-) > Both here and in the comment. > And in patch 3 changelog too, I'm afraid. :-P > The other thing is, from a "developmental purity" point of view, I > think > this series technically has a regression in the middle: cpu offline / > online stops working between patch 2 and patch 4. But I'm inclined > in > this case not to worry too much. :-) > Well, the point is that offlining/onlining does not work before this series. System does not crash, but behavior is wrong, as offline vCPUs stay assigned to pCPUs (keeping them idle) while online vCPUs are "trapped" in the wait list, which is wrong. So that's why I don't think there's much value in being consistent with such behavior throughout the series... which I guess is why you said you "won't worry too much in this case" ? Thanks and Regards -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ --- <> (Raistlin Majere) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
On 8/25/18 1:21 AM, Dario Faggioli wrote: > If a pCPU has been/is being offlined, we don't want it to be neither > assigned to any pCPU, nor in the wait list. I take it the first `pCPU` should be `vCPU`? Also, English grammar agreement is funny: `neither` needs to agree with `nor`, but the two do *not* agree with the original verb. That is, the sentence should say: "...we want it to be neither assigned to pCPU, nor in the wait list". Both here and in the comment. The other thing is, from a "developmental purity" point of view, I think this series technically has a regression in the middle: cpu offline / online stops working between patch 2 and patch 4. But I'm inclined in this case not to worry too much. :-) Everything else looks good. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v1 2/5] xen: sched: null: don't assign down vcpus to pcpus
If a pCPU has been/is being offlined, we don't want it to be neither assigned to any pCPU, nor in the wait list. So, avoid doing that while inserting or migrating vcpus. Signed-off-by: Dario Faggioli --- Cc: George Dunlap Cc: Stefano Stabellini Cc: Roger Pau Monne --- xen/common/sched_null.c | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c index f372172c32..1426124525 100644 --- a/xen/common/sched_null.c +++ b/xen/common/sched_null.c @@ -340,6 +340,8 @@ static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v) static void vcpu_assign(struct null_private *prv, struct vcpu *v, unsigned int cpu) { +ASSERT(is_vcpu_online(v)); + per_cpu(npc, cpu).vcpu = v; v->processor = cpu; cpumask_clear_cpu(cpu, >cpus_free); @@ -454,8 +456,14 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v) ASSERT(!is_idle_vcpu(v)); lock = vcpu_schedule_lock_irq(v); - retry: +if ( unlikely(!is_vcpu_online(v)) ) +{ +vcpu_schedule_unlock_irq(lock, v); +return; +} + + retry: cpu = v->processor = pick_cpu(prv, v); spin_unlock(lock); @@ -617,6 +625,21 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v, SCHED_STAT_CRANK(migrated); +/* + * If it was/it's going offline, we don't want it neither assigned to + * a vcpu, nor in the waitqueue. + * + * If it was on a cpu, we've removed it from there above. If it is in the + * waitqueue, we remove it from there now. And then we bail. + */ +if ( unlikely(!is_vcpu_online(v)) ) +{ +spin_lock(>waitq_lock); +list_del_init(>waitq_elem); +spin_unlock(>waitq_lock); +goto out; +} + /* * Let's now consider new_cpu, which is where v is being sent. It can be * either free, or have a vCPU already assigned to it. @@ -657,6 +680,7 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v, * at least. In case of suspend, any temporary inconsistency caused * by this, will be fixed-up during resume. */ + out: v->processor = new_cpu; } ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel