Re: [Xen-devel] BUG: sched=credit2 crashes system when using cpupools
On Thursday, 13 September 2018 1:11:20 AM AEST Dario Faggioli wrote:
> On Thu, 2018-08-30 at 18:49 +1000, Steven Haigh wrote:
> > On 2018-08-30 18:33, Jan Beulich wrote:
> > > Anyway - as Jürgen says, something for the scheduler
> > > maintainers to look into.
> >
> Ok, I'm back.
>
> > Yep - I just want to confirm that we tested this in BOTH NUMA
> > configurations - and credit2 crashed on both.
> >
> > I switched back to sched=credit, and it seems to work as expected:
> > # xl cpupool-list
> > Name               CPUs   Sched     Active   Domain count
> > Pool-node0           12    credit       y          3
> > Pool-node1           12    credit       y          0

Hi Dario,

I'll try to clarify below.

> Wait, in a previous message, you said: "A machine where we could get
> this working every time shows". Doesn't that mean creating a separate
> pool for node 1 works with both Credit and Credit2, if the node has
> memory?
>
> I mean, trying to clarify, my understanding is that you have two
> systems:
>
> system A: node 1 has *no* memory
> system B: both node 0 and node 1 have memory
>
> Creating a Credit pool with pcpus from node 1 always works on both
> systems.

Correct. With the credit scheduler, the pool split worked correctly on both systems.

> OTOH, when you try to create a Credit2 pool with pcpus from node 1,
> does it always crash on both systems, or does it work on system B and
> crash on system A?

Both systems crashed when using credit2. We originally thought this was due to the different memory layout between the two systems. That turned out not to matter, as both systems crashed in the same way.

> I do have a NUMA box with RAM in both nodes (so similar to system B).
> Last time I checked, what you're trying to do worked there, pretty much
> with any scheduler combination, but I'll recheck.
>
> I don't have a box similar to system A. I'll try to remove some of the
> RAM from that NUMA box, and check what happens.

In theory, if you reproduce what we did, it should crash anyway. The RAM layout shouldn't matter.

We set the scheduler on the grub line with 'sched=credit2', then did the split. I didn't try booting Dom0 with credit and making only the pools credit2.

--
Steven Haigh

net...@crc.id.au
https://www.crc.id.au
+61 (3) 9001 6090
0412 935 897

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
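For anyone repeating this comparison across schedulers, a small script can confirm which scheduler each pool actually ended up with. The following is only a sketch in Python and not part of xl: it assumes the whitespace-separated column layout of `xl cpupool-list` as quoted in this thread, and the function name `parse_cpupool_list` is made up for illustration.

```python
def parse_cpupool_list(text):
    """Parse `xl cpupool-list` output into a list of dicts.

    Assumes one header line followed by rows of five whitespace-separated
    fields: name, CPU count, scheduler, active flag, domain count.
    """
    pools = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore anything that is not a pool row
        name, cpus, sched, active, domains = fields
        pools.append({
            "name": name,
            "cpus": int(cpus),
            "sched": sched,
            "active": active == "y",
            "domains": int(domains),
        })
    return pools


# Sample input copied from the output quoted in this thread.
sample = """\
Name               CPUs   Sched     Active   Domain count
Pool-node0           12    credit       y          3
Pool-node1           12    credit       y          0
"""

pools = parse_cpupool_list(sample)
for pool in pools:
    print(pool["name"], pool["sched"], pool["cpus"])
```

On a live host the same function could be fed the captured output of `xl cpupool-list` instead of the sample string.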
Re: [Xen-devel] BUG: sched=credit2 crashes system when using cpupools
On Thu, 2018-08-30 at 18:49 +1000, Steven Haigh wrote:
> On 2018-08-30 18:33, Jan Beulich wrote:
> > Anyway - as Jürgen says, something for the scheduler
> > maintainers to look into.
>
Ok, I'm back.

> Yep - I just want to confirm that we tested this in BOTH NUMA
> configurations - and credit2 crashed on both.
>
> I switched back to sched=credit, and it seems to work as expected:
> # xl cpupool-list
> Name               CPUs   Sched     Active   Domain count
> Pool-node0           12    credit       y          3
> Pool-node1           12    credit       y          0
>
Wait, in a previous message, you said: "A machine where we could get
this working every time shows". Doesn't that mean creating a separate
pool for node 1 works with both Credit and Credit2, if the node has
memory?

I mean, trying to clarify, my understanding is that you have two
systems:

system A: node 1 has *no* memory
system B: both node 0 and node 1 have memory

Creating a Credit pool with pcpus from node 1 always works on both
systems.

OTOH, when you try to create a Credit2 pool with pcpus from node 1,
does it always crash on both systems, or does it work on system B and
crash on system A?

I do have a NUMA box with RAM in both nodes (so similar to system B).
Last time I checked, what you're trying to do worked there, pretty much
with any scheduler combination, but I'll recheck.

I don't have a box similar to system A. I'll try to remove some of the
RAM from that NUMA box, and check what happens.

Regards,
Dario
--
<> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
Re: [Xen-devel] BUG: sched=credit2 crashes system when using cpupools
On 2018-08-30 18:33, Jan Beulich wrote:
> On 30.08.18 at 06:01, wrote:
> > Managed to get the same crash log when adding CPUs to Pool-1 as follows:
> >
> > Create the pool:
> > (XEN) Initializing Credit2 scheduler
> > (XEN)  load_precision_shift: 18
> > (XEN)  load_window_shift: 30
> > (XEN)  underload_balance_tolerance: 0
> > (XEN)  overload_balance_tolerance: -3
> > (XEN)  runqueues arrangement: socket
> > (XEN)  cap enforcement granularity: 10ms
> > (XEN) load tracking window length 1073741824 ns
> >
> > Add the CPUs:
> > (XEN) Adding cpu 12 to runqueue 0
> > (XEN)  First cpu on runqueue, activating
> > (XEN) Removing cpu 12 from runqueue 0
> > (XEN) Adding cpu 13 to runqueue 0
> > (XEN) Removing cpu 13 from runqueue 0
> > (XEN) Adding cpu 14 to runqueue 0
> > (XEN) Removing cpu 14 from runqueue 0
> > (XEN) Xen BUG at sched_credit2.c:3452
>
> credit2 still not being the default - do things work if you don't
> override the default (of using credit1)?
>
> I guess the problem is connected to the "Removing cpu from
> runqueue 0", considering this
>
> BUG_ON(!cpumask_test_cpu(cpu, >active));
>
> is what triggers. Anyway - as Jürgen says, something for the scheduler
> maintainers to look into.

Yep - I just want to confirm that we tested this in BOTH NUMA
configurations - and credit2 crashed on both.

I switched back to sched=credit, and it seems to work as expected:
# xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-node0           12    credit       y          3
Pool-node1           12    credit       y          0

I've updated the subject - as this isn't a NUMA issue at all.

--
Steven Haigh

net...@crc.id.au
http://www.crc.id.au
+61 (3) 9001 6090
0412 935 897
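One way to read the crash log against Jan's guess is to replay the "Adding cpu" / "Removing cpu" messages and see which CPUs are left on each runqueue when the BUG fires. The Python below is only a rough sketch for that purpose: it models the console messages, not Credit2's real data structures, and the helper name `replay_runqueues` is invented here. Whether the scheduler's own active mask matches this bookkeeping is an assumption.

```python
import re
from collections import defaultdict

# Patterns match the "(XEN) Adding cpu N to runqueue R" and
# "(XEN) Removing cpu N from runqueue R" lines shown in the log.
ADD_RE = re.compile(r"Adding cpu (\d+) to runqueue (\d+)")
REMOVE_RE = re.compile(r"Removing cpu (\d+) from runqueue (\d+)")


def replay_runqueues(log_lines):
    """Return {runqueue id: set of CPUs still on it} after replaying the log."""
    active = defaultdict(set)
    for line in log_lines:
        m = ADD_RE.search(line)
        if m:
            active[int(m.group(2))].add(int(m.group(1)))
            continue
        m = REMOVE_RE.search(line)
        if m:
            active[int(m.group(2))].discard(int(m.group(1)))
    return dict(active)


# Log lines copied from the crash report in this thread.
log = """\
(XEN) Adding cpu 12 to runqueue 0
(XEN) First cpu on runqueue, activating
(XEN) Removing cpu 12 from runqueue 0
(XEN) Adding cpu 13 to runqueue 0
(XEN) Removing cpu 13 from runqueue 0
(XEN) Adding cpu 14 to runqueue 0
(XEN) Removing cpu 14 from runqueue 0
(XEN) Xen BUG at sched_credit2.c:3452
""".splitlines()

state = replay_runqueues(log)
print(state)  # {0: set()}
```

By this bookkeeping, runqueue 0 has no CPUs left by the time the BUG is reported, which is at least consistent with a check on the runqueue's active mask failing. It does not show why each CPU was removed right after being added.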