Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-17 Thread Dario Faggioli
On Wed, 2018-04-11 at 14:45 +0200, Olaf Hering wrote:
> On Wed, Apr 11, Dario Faggioli wrote:
>
> > If you're interested in figuring out, I'd like to see:
> > - full output of `xl info -n'
> > - output of `xl debug-key u'
> > - xl vcpu-list
> > - xl list -n
>
> Logs for this .cfg attached:
>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread Dario Faggioli
On Fri, 2018-04-13 at 11:29 +, George Dunlap wrote:
> I think as far as backports go, my current RFC would be
> fine. Another possibility, though, would be to simply add a
> migrate() callback to remove the vcpu from the runqueue before
> switching v->processor, *without* removing any of the
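Editor's note: the ordering George describes (take the vcpu off the runqueue first, only then switch its processor) can be illustrated with a toy model. This is a minimal sketch, not Xen code: every type and function name below (toy_vcpu, toy_runq, toy_migrate, and so on) is hypothetical, the runqueue is a plain circular doubly linked list, and locking is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these types or names are Xen's. */
struct toy_vcpu {
    int processor;                /* models v->processor */
    struct toy_vcpu *prev, *next; /* runqueue links; both NULL when not queued */
};

struct toy_runq {
    struct toy_vcpu head;         /* sentinel of a circular doubly linked list */
};

void toy_runq_init(struct toy_runq *rq)
{
    rq->head.prev = rq->head.next = &rq->head;
}

int toy_on_runq(const struct toy_vcpu *v)
{
    return v->next != NULL;
}

/* Append v at the tail of the runqueue. */
void toy_runq_insert(struct toy_runq *rq, struct toy_vcpu *v)
{
    assert(!toy_on_runq(v));
    v->prev = rq->head.prev;
    v->next = &rq->head;
    rq->head.prev->next = v;
    rq->head.prev = v;
}

/* Unlink v from whichever runqueue it is on. */
void toy_runq_remove(struct toy_vcpu *v)
{
    v->prev->next = v->next;
    v->next->prev = v->prev;
    v->prev = v->next = NULL;
}

/* Hypothetical migrate hook: dequeue first, then switch processor. */
void toy_migrate(struct toy_vcpu *v, int new_cpu)
{
    if (toy_on_runq(v))
        toy_runq_remove(v);
    v->processor = new_cpu;
}
```

With this ordering, a vcpu is never linked into one CPU's queue while its processor field already names another CPU, which is the kind of mismatch the BUG_ON checks in this thread are looking for.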

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread George Dunlap
> On Apr 13, 2018, at 10:25 AM, Dario Faggioli wrote: > > On Fri, 2018-04-13 at 09:03 +, George Dunlap wrote: >>> On Apr 12, 2018, at 6:25 PM, Dario Faggioli >>> wrote: >>> >> I think the bottom line is, for this test to be valid, then at this >>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread Dario Faggioli
On Fri, 2018-04-13 at 09:03 +, George Dunlap wrote:
> > On Apr 12, 2018, at 6:25 PM, Dario Faggioli
> > wrote:
> >
> > On the "other CPU", we might be around here [**]:
> >
> > static void vcpu_migrate(struct vcpu *v)
> > {
> >     ...
> >     if ( v->is_running ||
> >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread Olaf Hering
On Fri, Apr 13, Dario Faggioli wrote:
> Yes. In fact, Olaf, I still think that doing a run with George's RFC
> applied, would be useful, if only as a data point.
First tests indicate that this series fixes the bug.
Olaf

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread Dario Faggioli
On Fri, 2018-04-13 at 09:03 +, George Dunlap wrote:
> > On Apr 12, 2018, at 6:25 PM, Dario Faggioli
> > wrote:
> >
> I think the bottom line is, for this test to be valid, then at this
> point test_bit(VPF_migrating) *must* imply !vcpu_on_runqueue(v), but
> at this point
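Editor's note: the implication quoted above ("migrating implies not on a runqueue") can be written as a small predicate. The sketch below is a toy model under stated assumptions, not the Xen implementation: TOY_VPF_MIGRATING, struct toy_vcpu_state, and toy_migrate_invariant_ok() are hypothetical names, and the on_runq field stands in for whatever vcpu_on_runqueue(v) would report.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are NOT the Xen definitions. */
#define TOY_VPF_MIGRATING (1UL << 0)

struct toy_vcpu_state {
    unsigned long pause_flags; /* bit 0 models the VPF_migrating flag */
    bool on_runq;              /* models vcpu_on_runqueue(v) */
};

/* Returns true when "migrating implies not on a runqueue" holds. */
bool toy_migrate_invariant_ok(const struct toy_vcpu_state *v)
{
    bool migrating = (v->pause_flags & TOY_VPF_MIGRATING) != 0;

    /* A implies B is the same as (!A || B). */
    return !migrating || !v->on_runq;
}
```

Only one of the four flag/queue combinations is rejected: a vcpu that is marked as migrating while still sitting on a runqueue.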

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread George Dunlap
> On Apr 12, 2018, at 6:25 PM, Dario Faggioli wrote: > > On Thu, 2018-04-12 at 17:38 +0200, Dario Faggioli wrote: >> On Thu, 2018-04-12 at 15:15 +0200, Dario Faggioli wrote: >>> On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote: dies after the first iteration.

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread Dario Faggioli
On Fri, 2018-04-13 at 08:23 +0200, Olaf Hering wrote:
> On Thu, 12 Apr 2018 19:25:43 +0200,
> Dario Faggioli wrote:
>
> > Olaf, new patch! :-)
>
> BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
>
Thanks!
> (XEN) CPU 36: d10v1 isr=0 runnbl=1 proc=36 pf=0 orq=0 csf=4
>
So,

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-13 Thread Olaf Hering
On Thu, 12 Apr 2018 19:25:43 +0200, Dario Faggioli wrote:
> Olaf, new patch! :-)
BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
(XEN) CPU 36: d10v1 isr=0 runnbl=1 proc=36 pf=0 orq=0 csf=4
(XEN) CPU 33: d10v2 isr=0 runnbl=0 proc=33 pf=1 orq=0 csf=4
(XEN) CPU 20: d10v2 isr=0

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread Dario Faggioli
On Thu, 2018-04-12 at 17:38 +0200, Dario Faggioli wrote: > On Thu, 2018-04-12 at 15:15 +0200, Dario Faggioli wrote: > > On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote: > > > > > > dies after the first iteration. > > > > > > BUG_ON(!test_bit(_VPF_migrating, >pause_flags)); > > > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread Dario Faggioli
On Thu, 2018-04-12 at 15:15 +0200, Dario Faggioli wrote: > On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote: > > > > dies after the first iteration. > > > > BUG_ON(!test_bit(_VPF_migrating, >pause_flags)); > > > Update. I replaced this: +BUG_ON(vcpu_runnable(prev)); +

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread Dario Faggioli
On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote:
> On Thu, 12 Apr 2018 12:16:34 +0200,
> Dario Faggioli wrote:
>
> > Olaf, new patch. Please, remove _everything_ and apply _only_ this
> > one.
>
> dies after the first iteration.
>
>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread Olaf Hering
On Thu, 12 Apr 2018 12:16:34 +0200, Dario Faggioli wrote:
> Olaf, new patch. Please, remove _everything_ and apply _only_ this one.
dies after the first iteration.
BUG_ON(!test_bit(_VPF_migrating, >pause_flags));
(XEN) Xen BUG at schedule.c:1570
(XEN) [

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread Dario Faggioli
On Thu, 2018-04-12 at 09:38 +, George Dunlap wrote: > > On Apr 11, 2018, at 10:31 PM, Dario Faggioli > > wrote: > > (XEN) Xen BUG at sched_credit.c:876 > > (XEN) [ Xen-4.11.20180410T125709.50f8ba84a5- > > 7.bug1087289_411 x86_64 debug=y Not tainted ] > > (XEN)

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread George Dunlap
> On Apr 11, 2018, at 10:31 PM, Dario Faggioli wrote:
>
> On Wed, 11 Apr 2018, 22:48 Olaf Hering wrote:
> On Wed, Apr 11, Dario Faggioli wrote:
>
> > It will crash, again, possibly with the same stack trace, but I think
> > it's worth a try.
>
>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-12 Thread Jan Beulich
>>> On 11.04.18 at 23:31, wrote:
> On Wed, 11 Apr 2018, 22:48 Olaf Hering wrote:
>
>> On Wed, Apr 11, Dario Faggioli wrote:
>>
>> > It will crash, again, possibly with the same stack trace, but I think
>> > it's worth a try.
>>
>>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 11 Apr 2018, 22:48 Olaf Hering wrote:
> On Wed, Apr 11, Dario Faggioli wrote:
>
> > It will crash, again, possibly with the same stack trace, but I think
> > it's worth a try.
>
> BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
>
> (XEN) Xen BUG at sched_credit.c:876
>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, Apr 11, Dario Faggioli wrote:
> It will crash, again, possibly with the same stack trace, but I think
> it's worth a try.
BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
(XEN) grant_table.c:1769:d15v18 Expanding d15 grant table from 12 to 13 frames
(XEN) grant_table.c:1769:d15v20 Expanding

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 17:27 +0200, Olaf Hering wrote: > On Wed, Apr 11, Olaf Hering wrote: > > > That was with sched=credit2, sorry for that. > > Now with just that second patch ... > > Still BUG in csched_load_balance. > > (XEN) Xen BUG at sched_credit.c:1694 > (XEN) [

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, 11 Apr 2018 09:38:59 -0600, "Jan Beulich" wrote:
> And till now I had assumed we've taken care of them with earlier
> fixes (all 4.7 reports were with old packages, like 4.7.2 based
> ones). Can you repro this with a debug hypervisor (so we can
> both trust the stack

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Jan Beulich
>>> On 11.04.18 at 17:03, wrote: > On Wed, Apr 11, Olaf Hering wrote: > >> On Wed, Apr 11, Dario Faggioli wrote: >> >> > Olaf, can you give it a try? It should be fine to run it on top of the >> > last debug patch (the one that produced this crash). >> >> Yes, with both changes

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, Apr 11, Olaf Hering wrote: > On Wed, Apr 11, Olaf Hering wrote: > > On Wed, Apr 11, Dario Faggioli wrote: > > > Olaf, can you give it a try? It should be fine to run it on top of the > > > last debug patch (the one that produced this crash). > > Yes, with both changes it did >4k

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, Apr 11, Olaf Hering wrote: > On Wed, Apr 11, Dario Faggioli wrote: > > > Olaf, can you give it a try? It should be fine to run it on top of the > > last debug patch (the one that produced this crash). > > Yes, with both changes it did >4k iterations already. Thanks. That was with

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, Apr 11, Dario Faggioli wrote:
> If you're interested in figuring out, I'd like to see:
> - full output of `xl info -n'
> - output of `xl debug-key u'
> - xl vcpu-list
> - xl list -n
Logs for this .cfg attached:
name='fv_sles12sp1.0'
vif=[ 'mac=00:18:3e:58:00:c1,bridge=br0' ]
memory=

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Jan Beulich
>>> On 11.04.18 at 13:02, wrote: > On 04/11/2018 11:17 AM, Dario Faggioli wrote: >> On Wed, 2018-04-11 at 12:00 +0200, Olaf Hering wrote: >>> On Wed, Apr 11, Dario Faggioli wrote: >>> Olaf, can you give it a try? It should be fine to run it on top of the

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread George Dunlap
On 04/11/2018 11:17 AM, Dario Faggioli wrote: > On Wed, 2018-04-11 at 12:00 +0200, Olaf Hering wrote: >> On Wed, Apr 11, Dario Faggioli wrote: >> >>> Olaf, can you give it a try? It should be fine to run it on top of >>> the >>> last debug patch (the one that produced this crash). >> >> Yes, with

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 11:37 +0100, George Dunlap wrote:
> On 04/10/2018 11:59 PM, Dario Faggioli wrote:
> >
> > So, basically, the race is between context_saved() and
> > vcpu_set_affinity(). Basically, vcpu_set_affinity() sets the
> > VPF_migrating pause flags on a vcpu in a runqueue, with the

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 12:00 +0200, Olaf Hering wrote: > On Wed, Apr 11, Dario Faggioli wrote: > > > Olaf, can you give it a try? It should be fine to run it on top of > > the > > last debug patch (the one that produced this crash). > > Yes, with both changes it did >4k iterations already.

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread George Dunlap
On 04/10/2018 11:59 PM, Dario Faggioli wrote: > [Adding Andrew, not because I expect anything, but just because we've > chatted about this issue on IRC :-) ] > > On Tue, 2018-04-10 at 22:37 +0200, Olaf Hering wrote: >> On Tue, Apr 10, Dario Faggioli wrote: >> >>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 10:48 +0200, Olaf Hering wrote: > On Wed, Apr 11, Dario Faggioli wrote: > > So, now, when you say 'does not work', do you mean 'domain creation > > is > > aborted with errors' or 'domain is created, but memory is not where > > it > > should be'. > > domU can not be created

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 10:48 +0200, Olaf Hering wrote: > On Wed, Apr 11, Dario Faggioli wrote: > > So, now, when you say 'does not work', do you mean 'domain creation > > is > > aborted with errors' or 'domain is created, but memory is not where > > it > > should be'. > > domU can not be created

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, Apr 11, Dario Faggioli wrote:
> Olaf, can you give it a try? It should be fine to run it on top of the
> last debug patch (the one that produced this crash).
Yes, with both changes it did >4k iterations already. Thanks.
Olaf

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Wed, Apr 11, Dario Faggioli wrote:
> So, now, when you say 'does not work', do you mean 'domain creation is
> aborted with errors' or 'domain is created, but memory is not where it
> should be'.
domU can not be created due to "libxl__set_vcpuaffinity: setting vcpu affinity: Invalid argument".

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 08:23 +0200, Olaf Hering wrote:
> It turned out that I had a typo all the time in my template, it used
> 'cpu=' rather than 'cpus='. On this system none of this works:
> #pus="node:${node}"
> cpus="nodes:${node}"
> #pus="nodes:${node},^node:0"
>

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 09:39 +0200, Juergen Gross wrote: > On 11/04/18 09:31, Dario Faggioli wrote: > > > On Tue, 2018-04-10 at 22:37 +0200, Olaf Hering wrote: > > > > On Tue, Apr 10, Dario Faggioli wrote: > > > > > > > > BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc))); > > > > > > > > ... patch

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Juergen Gross
On 11/04/18 09:31, Dario Faggioli wrote: > On Wed, 2018-04-11 at 00:59 +0200, Dario Faggioli wrote: >> [Adding Andrew, not because I expect anything, but just because >> we've chatted about this issue on IRC :-) ] >> > Except, I did not add it. :-P > > Anyway... > >> On Tue, 2018-04-10 at 22:37

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Dario Faggioli
On Wed, 2018-04-11 at 00:59 +0200, Dario Faggioli wrote: > [Adding Andrew, not because I expect anything, but just because > we've chatted about this issue on IRC :-) ] > Except, I did not add it. :-P Anyway... > On Tue, 2018-04-10 at 22:37 +0200, Olaf Hering wrote: > > On Tue, Apr 10, Dario

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-11 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:
> I remember specifically wanting for it to support not only "nodes:", but also
> "node:", because I thought that, e.g., "nodes:3" would have sounded weird to
> users.
It turned out that I had a typo all the time in my template, it used 'cpu=' rather than
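Editor's note: the affinity strings discussed in this thread use tokens such as "node:1", "nodes:3" and "^node:0", where a leading '^' marks an exclusion and both the "node:" and "nodes:" prefixes are meant to be accepted. The following is a minimal sketch of parsing one such token. It is not the libxl/xl parser: toy_parse_node_token() and struct toy_node_token are hypothetical, and the sketch accepts only a single node number, with no ranges or comma-separated lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; not part of any Xen library. */
struct toy_node_token {
    bool exclude; /* token started with '^' */
    long node;    /* NUMA node number */
};

/*
 * Parse one token such as "node:1", "nodes:3" or "^node:0".
 * Returns true on success and fills *out; returns false otherwise.
 */
bool toy_parse_node_token(const char *tok, struct toy_node_token *out)
{
    const char *p = tok;
    char *end;

    out->exclude = (*p == '^');
    if (out->exclude)
        p++;

    /* Accept both spellings of the prefix. */
    if (strncmp(p, "nodes:", 6) == 0)
        p += 6;
    else if (strncmp(p, "node:", 5) == 0)
        p += 5;
    else
        return false;

    /* Require at least one digit, and nothing after the number. */
    if (*p < '0' || *p > '9')
        return false;
    out->node = strtol(p, &end, 10);
    return *end == '\0';
}
```

Note that a token parsing correctly says nothing about the config key it appears under; the typo reported in this thread was in the key ('cpu=' instead of 'cpus='), not in the value.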

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
[Adding Andrew, not because I expect anything, but just because we've chatted about this issue on IRC :-) ] On Tue, 2018-04-10 at 22:37 +0200, Olaf Hering wrote: > On Tue, Apr 10, Dario Faggioli wrote: > > BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc))); > > (XEN) Xen BUG at sched_credit.c:876 >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 10 Apr 2018, 22:16 Olaf Hering wrote:
> On Tue, Apr 10, Olaf Hering wrote:
>
> > On Tue, Apr 10, Dario Faggioli wrote:
> >
> > > In the meanwhile --let me repeat myself-- just go ahead with "node:2",
> > > "node:3", etc. :-D
> >
> > I did, and that fails.
>
> I think

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:
> So, Olaf, if you fancy giving this a try anyway, well, go ahead.
BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
(XEN) Xen BUG at sched_credit.c:876
(XEN) [ Xen-4.11.20180410T125709.50f8ba84a5-3.bug1087289_411 x86_64 debug=y Not tainted ]

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Olaf Hering wrote:
> On Tue, Apr 10, Dario Faggioli wrote:
>
> > In the meanwhile --let me repeat myself-- just go ahead with "node:2",
> > "node:3", etc. :-D
>
> I did, and that fails.
I think the man page is not that clear, to me. If there is a difference between 'node' vs.

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:
> In the meanwhile --let me repeat myself-- just go ahead with "node:2",
> "node:3", etc. :-D
I did, and that fails.
Olaf

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 21:03 +0200, Olaf Hering wrote:
> On Tue, Apr 10, Dario Faggioli wrote:
>
> > As said, it's cpus= and cpus_soft=, and you probably just need
> > cpus="node:1"
> > cpus_soft="node:1"
> > Or, even just:
> > cpus="node:1"
> > as, if soft-affinity is set to be equal to hard, it

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:
> On Tue, 2018-04-10 at 17:59 +0200, Olaf Hering wrote:
> > memory=
> > vcpus=36
> > cpu="nodes:1,^node:0"
> > cpu_soft="nodes:1,^node:0"
> As said, it's cpus= and cpus_soft=, and you probably just need
> cpus="node:1"
> cpus_soft="node:1"
> Or, even just:

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 16:25 +0100, George Dunlap wrote: > On 04/10/2018 12:29 PM, Dario Faggioli wrote: > > > One thing we might consider doing is implementing the migrate() > callback > for the Credit scheduler, and just have it make a bunch of sanity > checks > (v->processor lock held, new_cpu

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 17:59 +0200, Olaf Hering wrote:
> On Tue, Apr 10, Olaf Hering wrote:
>
> > (XEN) Xen BUG at sched_credit.c:1694
>
> And another one with debug=y and this config:
>
Wow...
> memory=
> vcpus=36
> cpu="nodes:1,^node:0"
> cpu_soft="nodes:1,^node:0"
>
As said, it's cpus= and

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Olaf Hering wrote:
> (XEN) Xen BUG at sched_credit.c:1694
And another one with debug=y and this config:
memory=
vcpus=36
cpu="nodes:1,^node:0"
cpu_soft="nodes:1,^node:0"
(nodes=1 cycles between 1-3 for each following domU).
(XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >=

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 16:25 +0100, George Dunlap wrote: > On 04/10/2018 12:29 PM, Dario Faggioli wrote: > > > whenever that is. (Possibly at the end of the current call to > vcpu_migrate(), possibly at the end of a vcpu_migrate() triggered in > context_saved() due to VPF_migrating.) > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
On 04/10/2018 04:18 PM, Olaf Hering wrote: > On Tue, Apr 10, Olaf Hering wrote: > >> (XEN) Xen BUG at sched_credit.c:1694 > > Another variant: > > This time the domUs had just vcpus=36 and > cpus=nodes:N,node:^0/cpus_soft=nodes:N,node:^0 > > (XEN) Xen BUG at sched_credit.c:280 > (XEN) [

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
On 04/10/2018 12:29 PM, Dario Faggioli wrote: > On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote: >> On 04/10/2018 11:33 AM, Dario Faggioli wrote: >>> On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: Assuming the bug is this one: BUG_ON( cpu != snext->vcpu->processor );

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Olaf Hering wrote:
> (XEN) Xen BUG at sched_credit.c:1694
Another variant:
This time the domUs had just vcpus=36 and cpus=nodes:N,node:^0/cpus_soft=nodes:N,node:^0
(XEN) Xen BUG at sched_credit.c:280
(XEN) [ Xen-4.11.20180407T144959.e62e140daa-2.bug1087289_411 x86_64

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote: > On 04/10/2018 11:33 AM, Dario Faggioli wrote: > > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: > > > Assuming the bug is this one: > > > > > > BUG_ON( cpu != snext->vcpu->processor ); > > > > > > > Yes, it is that one. > > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote: > On 04/10/2018 11:33 AM, Dario Faggioli wrote: > > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: > > > Assuming the bug is this one: > > > > > > BUG_ON( cpu != snext->vcpu->processor ); > > > > > > > Yes, it is that one. > > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote: > On 04/10/2018 11:33 AM, Dario Faggioli wrote: > > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: > > > Assuming the bug is this one: > > > > > > BUG_ON( cpu != snext->vcpu->processor ); > > > > > > > Yes, it is that one. > > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote: > On 04/10/2018 11:33 AM, Dario Faggioli wrote: > > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: > > > Assuming the bug is this one: > > > > > > BUG_ON( cpu != snext->vcpu->processor ); > > > > > > > Yes, it is that one. > > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote: > On 04/10/2018 11:33 AM, Dario Faggioli wrote: > > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: > > > Assuming the bug is this one: > > > > > > BUG_ON( cpu != snext->vcpu->processor ); > > > > > > > Yes, it is that one. > > >

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
On 04/10/2018 11:33 AM, Dario Faggioli wrote: > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote: >> Assuming the bug is this one: >> >> BUG_ON( cpu != snext->vcpu->processor ); >> > Yes, it is that one. > > Another stack trace, this time from a debug=y built hypervisor, of what > we are

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote:
> Assuming the bug is this one:
>
> BUG_ON( cpu != snext->vcpu->processor );
>
Yes, it is that one.
Another stack trace, this time from a debug=y built hypervisor, of what we think is the same bug (although reproduced in a

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
> On Apr 10, 2018, at 9:57 AM, Olaf Hering wrote:
>
> While hunting some other bug we ran into the single BUG in
> sched_credit.c:csched_load_balance(). This happens with all versions
> since 4.7; staging is also affected. The test system is a Haswell model 63
> system with 4 NUMA