Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
Le 19/10/2020 à 06:55, Joel Stanley a écrit :
> On Sat, 17 Oct 2020 at 15:55, Christophe Leroy wrote:
>> When building mpc885_ads_defconfig with gcc 10.1,
>> the function get_order() appears 50 times in vmlinux:
>>
>> [linux]# ppc-linux-objdump -x vmlinux | grep get_order | wc -l
>> 50
>>
>> [linux]# size vmlinux
>>    text    data     bss     dec     hex filename
>> 3842620  675624  135160 4653404  47015c vmlinux
>>
>> In the old days, marking a function 'static inline' was forcing
>> GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
>> CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
>> a function.
>>
>> It looks like GCC 10 is taking poor decisions on this.
>>
>> get_order() compiles into the following tiny function,
>> occupying 20 bytes of text.
>>
>> 007c <get_order>:
>>   7c: 38 63 ff ff   addi    r3,r3,-1
>>   80: 54 63 a3 3e   rlwinm  r3,r3,20,12,31
>>   84: 7c 63 00 34   cntlzw  r3,r3
>>   88: 20 63 00 20   subfic  r3,r3,32
>>   8c: 4e 80 00 20   blr
>>
>> By forcing get_order() to be __always_inline, the size of text is
>> reduced by 1940 bytes, that is almost twice the space occupied by
>> 50 times get_order()
>>
>> [linux-powerpc]# size vmlinux
>>    text    data     bss     dec     hex filename
>> 3840680  675588  135176 4651444  46f9b4 vmlinux
>
> I see similar results with GCC 10.2 building for arm32. There are 143
> instances of get_order with aspeed_g5_defconfig.
>
> Before:
> 9071838 2630138  186468 11888444  b5673c vmlinux
> After:
> 9069886 2630126  186468 11886480  b55f90 vmlinux
>
> 1952 bytes smaller with your patch applied.
>
> Did you raise this with anyone from GCC?

Yes I did, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445

For the time being, it's at a standstill.

Christophe

> Reviewed-by: Joel Stanley

>> Signed-off-by: Christophe Leroy
>> ---
>>  include/asm-generic/getorder.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
>> index e9f20b813a69..f2979e3a96b6 100644
>> --- a/include/asm-generic/getorder.h
>> +++ b/include/asm-generic/getorder.h
>> @@ -26,7 +26,7 @@
>>   *
>>   * The result is undefined if the size is 0.
>>   */
>> -static inline __attribute_const__ int get_order(unsigned long size)
>> +static __always_inline __attribute_const__ int get_order(unsigned long size)
>>  {
>>  	if (__builtin_constant_p(size)) {
>>  		if (!size)
>> --
>> 2.25.0
Re: [PATCH v4 0/2] powerpc/mce: Fix mce handler and add selftest
On 10/16/20 5:02 PM, Michael Ellerman wrote:
> On Fri, 9 Oct 2020 12:10:03 +0530, Ganesh Goudar wrote:
>> This patch series fixes mce handling for pseries, adds an LKDTM test for
>> SLB multihit recovery and enables a selftest for the same, basically to
>> test MCE handling on pseries/powernv machines running in hash mmu mode.
>>
>> v4:
>> * Use radix_enabled() to check if its in Hash or Radix mode.
>> * Use FW_FEATURE_LPAR instead of machine_is_pseries().
>>
>> [...]
>
> Patch 1 applied to powerpc/fixes.
>
> [1/2] powerpc/mce: Avoid nmi_enter/exit in real mode on pseries hash
>       https://git.kernel.org/powerpc/c/8d0e2101274358d9b6b1f27232b40253ca48bab5
>
> cheers

Thank you. Any comments on patch 2?
Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
On Sat, 17 Oct 2020 at 15:55, Christophe Leroy wrote:
>
> When building mpc885_ads_defconfig with gcc 10.1,
> the function get_order() appears 50 times in vmlinux:
>
> [linux]# ppc-linux-objdump -x vmlinux | grep get_order | wc -l
> 50
>
> [linux]# size vmlinux
>    text    data     bss     dec     hex filename
> 3842620  675624  135160 4653404  47015c vmlinux
>
> In the old days, marking a function 'static inline' was forcing
> GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
> CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
> a function.
>
> It looks like GCC 10 is taking poor decisions on this.
>
> get_order() compiles into the following tiny function,
> occupying 20 bytes of text.
>
> 007c <get_order>:
>   7c: 38 63 ff ff   addi    r3,r3,-1
>   80: 54 63 a3 3e   rlwinm  r3,r3,20,12,31
>   84: 7c 63 00 34   cntlzw  r3,r3
>   88: 20 63 00 20   subfic  r3,r3,32
>   8c: 4e 80 00 20   blr
>
> By forcing get_order() to be __always_inline, the size of text is
> reduced by 1940 bytes, that is almost twice the space occupied by
> 50 times get_order()
>
> [linux-powerpc]# size vmlinux
>    text    data     bss     dec     hex filename
> 3840680  675588  135176 4651444  46f9b4 vmlinux

I see similar results with GCC 10.2 building for arm32. There are 143
instances of get_order with aspeed_g5_defconfig.

Before:
9071838 2630138  186468 11888444  b5673c vmlinux
After:
9069886 2630126  186468 11886480  b55f90 vmlinux

1952 bytes smaller with your patch applied.

Did you raise this with anyone from GCC?

Reviewed-by: Joel Stanley

> Signed-off-by: Christophe Leroy
> ---
>  include/asm-generic/getorder.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
> index e9f20b813a69..f2979e3a96b6 100644
> --- a/include/asm-generic/getorder.h
> +++ b/include/asm-generic/getorder.h
> @@ -26,7 +26,7 @@
>  *
>  * The result is undefined if the size is 0.
>  */
> -static inline __attribute_const__ int get_order(unsigned long size)
> +static __always_inline __attribute_const__ int get_order(unsigned long size)
>  {
>  	if (__builtin_constant_p(size)) {
>  		if (!size)
> --
> 2.25.0
Re: KVM on POWER8 host lock up since 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
Excerpts from Nicholas Piggin's message of October 19, 2020 11:00 am: > Excerpts from Michal Suchánek's message of October 17, 2020 6:14 am: >> On Mon, Sep 07, 2020 at 11:13:47PM +1000, Nicholas Piggin wrote: >>> Excerpts from Michael Ellerman's message of August 31, 2020 8:50 pm: >>> > Michal Suchánek writes: >>> >> On Mon, Aug 31, 2020 at 11:14:18AM +1000, Nicholas Piggin wrote: >>> >>> Excerpts from Michal Suchánek's message of August 31, 2020 6:11 am: >>> >>> > Hello, >>> >>> > >>> >>> > on POWER8 KVM hosts lock up since commit 10d91611f426 ("powerpc/64s: >>> >>> > Reimplement book3s idle code in C"). >>> >>> > >>> >>> > The symptom is host locking up completely after some hours of KVM >>> >>> > workload with messages like >>> >>> > >>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab >>> >>> > cpu 47 >>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab >>> >>> > cpu 71 >>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab >>> >>> > cpu 47 >>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab >>> >>> > cpu 71 >>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab >>> >>> > cpu 47 >>> >>> > >>> >>> > printed before the host locks up. >>> >>> > >>> >>> > The machines run sandboxed builds which is a mixed workload resulting >>> >>> > in >>> >>> > IO/single core/mutiple core load over time and there are periods of no >>> >>> > activity and no VMS runnig as well. The VMs are shortlived so VM >>> >>> > setup/terdown is somewhat excercised as well. >>> >>> > >>> >>> > POWER9 with the new guest entry fast path does not seem to be >>> >>> > affected. >>> >>> > >>> >>> > Reverted the patch and the followup idle fixes on top of 5.2.14 and >>> >>> > re-applied commit a3f3072db6ca ("powerpc/powernv/idle: Restore IAMR >>> >>> > after idle") which gives same idle code as 5.1.16 and the kernel seems >>> >>> > stable. >>> >>> > >>> >>> > Config is attached. 
>>> >>> > >>> >>> > I cannot easily revert this commit, especially if I want to use the >>> >>> > same >>> >>> > kernel on POWER8 and POWER9 - many of the POWER9 fixes are applicable >>> >>> > only to the new idle code. >>> >>> > >>> >>> > Any idea what can be the problem? >>> >>> >>> >>> So hwthread_state is never getting back to to HWTHREAD_IN_IDLE on >>> >>> those threads. I wonder what they are doing. POWER8 doesn't have a good >>> >>> NMI IPI and I don't know if it supports pdbg dumping registers from the >>> >>> BMC unfortunately. >>> >> >>> >> It may be possible to set up fadump with a later kernel version that >>> >> supports it on powernv and dump the whole kernel. >>> > >>> > Your firmware won't support it AFAIK. >>> > >>> > You could try kdump, but if we have CPUs stuck in KVM then there's a >>> > good chance it won't work :/ >>> >>> I haven't had any luck yet reproducing this still. Testing with sub >>> cores of various different combinations, etc. I'll keep trying though. >> >> Hello, >> >> I tried running some KVM guests to simulate the workload and what I get >> is guests failing to start with a rcu stall. Tried both 5.3 and 5.9 >> kernel and qemu 4.2.1 and 5.1.0 >> >> To start some guests I run >> >> for i in $(seq 0 9) ; do /opt/qemu/bin/qemu-system-ppc64 -m 2048 -accel kvm >> -smp 8 -kernel /boot/vmlinux -initrd /boot/initrd -nodefaults -nographic >> -serial mon:telnet::444$i,server,wait & done >> >> To simulate some workload I run >> >> xz -zc9T0 < /dev/zero > /dev/null & >> while true; do >> killall -STOP xz; sleep 1; killall -CONT xz; sleep 1; >> done & >> >> on the host and add a job that executes this to the ramdisk. However, most >> guests never get to the point where the job is executed. >> >> Any idea what might be the problem? > > I would say try without pv queued spin locks (but if the same thing is > happening with 5.3 then it must be something else I guess). > > I'll try to test a similar setup on a POWER8 here. 
Couldn't reproduce the guest hang; the guests seem to run fine even with
queued spinlocks. Might have a different .config.

I might have got a lockup in the host (although with different symptoms
than the original report). I'll look into that a bit further.

Thanks,
Nick
[PATCH 2/2] powerpc/smp: Use GFP_ATOMIC while allocating tmp mask
Qian Cai reported a regression where CPU Hotplug fails with the latest
powerpc/next:

BUG: sleeping function called from invalid context at mm/slab.h:494
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/88
no locks held by swapper/88/0.
irq event stamp: 18074448
hardirqs last enabled at (18074447): [] tick_nohz_idle_enter+0x9c/0x110
hardirqs last disabled at (18074448): [] do_idle+0x138/0x3b0
do_idle at kernel/sched/idle.c:253 (discriminator 1)
softirqs last enabled at (18074440): [] irq_enter_rcu+0x94/0xa0
softirqs last disabled at (18074439): [] irq_enter_rcu+0x70/0xa0
CPU: 88 PID: 0 Comm: swapper/88 Tainted: G W 5.9.0-rc8-next-20201007 #1
Call Trace:
[c0002a4bfcf0] [c0649e98] dump_stack+0xec/0x144 (unreliable)
[c0002a4bfd30] [c00f6c34] ___might_sleep+0x2f4/0x310
[c0002a4bfdb0] [c0354f94] slab_pre_alloc_hook.constprop.82+0x124/0x190
[c0002a4bfe00] [c035e9e8] __kmalloc_node+0x88/0x3a0
slab_alloc_node at mm/slub.c:2817
(inlined by) __kmalloc_node at mm/slub.c:4013
[c0002a4bfe80] [c06494d8] alloc_cpumask_var_node+0x38/0x80
kmalloc_node at include/linux/slab.h:577
(inlined by) alloc_cpumask_var_node at lib/cpumask.c:116
[c0002a4bfef0] [c003eedc] start_secondary+0x27c/0x800
update_mask_by_l2 at arch/powerpc/kernel/smp.c:1267
(inlined by) add_cpu_to_masks at arch/powerpc/kernel/smp.c:1387
(inlined by) start_secondary at arch/powerpc/kernel/smp.c:1420
[c0002a4bff90] [c000c468] start_secondary_resume+0x10/0x14

Allocating a temporary mask while performing a CPU Hotplug operation
with CONFIG_CPUMASK_OFFSTACK enabled leads to calling a sleepable
function from an atomic context. Fix this by allocating the temporary
mask with the GFP_ATOMIC flag. Also, instead of having to allocate
twice, allocate the mask in the caller so that we only have to allocate
once. If the allocation fails, assume the mask to be the same as the
sibling mask, which will make the scheduler drop this domain for this
CPU.
Fixes: 70a94089d7f7 ("powerpc/smp: Optimize update_coregroup_mask")
Fixes: 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2")
Reported-by: Qian Cai
Signed-off-by: Srikar Dronamraju
Cc: linuxppc-dev
Cc: LKML
Cc: Michael Ellerman
Cc: Nathan Lynch
Cc: Gautham R Shenoy
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Qian Cai
---
Changelog v1->v2:
https://lore.kernel.org/linuxppc-dev/20201008034240.34059-1-sri...@linux.vnet.ibm.com/t/#u
Updated 2nd patch based on comments from Michael Ellerman
- Remove the WARN_ON.
- Handle allocation failures in a more subtle fashion
- Allocate in the caller so that we allocate once.

 arch/powerpc/kernel/smp.c | 57 +--
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index a864b9b3228c..028479e9b66b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1257,38 +1257,33 @@ static struct device_node *cpu_to_l2cache(int cpu)
 	return cache;
 }
 
-static bool update_mask_by_l2(int cpu)
+static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
 {
 	struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
 	struct device_node *l2_cache, *np;
-	cpumask_var_t mask;
 	int i;
 
 	if (has_big_cores)
 		submask_fn = cpu_smallcore_mask;
 
 	l2_cache = cpu_to_l2cache(cpu);
-	if (!l2_cache) {
-		/*
-		 * If no l2cache for this CPU, assume all siblings to share
-		 * cache with this CPU.
-		 */
+	if (!l2_cache || !*mask) {
+		/* Assume only core siblings share cache with this CPU */
 		for_each_cpu(i, submask_fn(cpu))
 			set_cpus_related(cpu, i, cpu_l2_cache_mask);
 
 		return false;
 	}
 
-	alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
-	cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
+	cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
 
 	/* Update l2-cache mask with all the CPUs that are part of submask */
 	or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
 
 	/* Skip all CPUs already part of current CPU l2-cache mask */
-	cpumask_andnot(mask, mask, cpu_l2_cache_mask(cpu));
+	cpumask_andnot(*mask, *mask, cpu_l2_cache_mask(cpu));
 
-	for_each_cpu(i, mask) {
+	for_each_cpu(i, *mask) {
 		/*
 		 * when updating the marks the current CPU has not been marked
 		 * online, but we need to update the cache masks
@@ -1298,15 +1293,14 @@ static bool update_mask_by_l2(int cpu)
 		/* Skip all CPUs already part of current CPU l2-cache */
 		if (np == l2_cache) {
 			or_cpumasks_related(cpu, i, submask_fn, cpu_l2_cache_mask);
-			cpumask_andnot(mask, mask, submask_f
[PATCH 1/2] powerpc/smp: Remove unnecessary variable
Commit 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2")
introduced submask_fn in update_mask_by_l2 to track the right submask.
However commit f6606cfdfbcd ("powerpc/smp: Dont assume l2-cache to be
superset of sibling") introduced sibling_mask in update_mask_by_l2 to
track the same submask. Remove sibling_mask in favour of submask_fn.

Signed-off-by: Srikar Dronamraju
Cc: linuxppc-dev
Cc: LKML
Cc: Michael Ellerman
Cc: Nathan Lynch
Cc: Gautham R Shenoy
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Qian Cai
---
 arch/powerpc/kernel/smp.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 8d1c401f4617..a864b9b3228c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1264,18 +1264,16 @@ static bool update_mask_by_l2(int cpu)
 	cpumask_var_t mask;
 	int i;
 
+	if (has_big_cores)
+		submask_fn = cpu_smallcore_mask;
+
 	l2_cache = cpu_to_l2cache(cpu);
 	if (!l2_cache) {
-		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
-
 		/*
 		 * If no l2cache for this CPU, assume all siblings to share
 		 * cache with this CPU.
 		 */
-		if (has_big_cores)
-			sibling_mask = cpu_smallcore_mask;
-
-		for_each_cpu(i, sibling_mask(cpu))
+		for_each_cpu(i, submask_fn(cpu))
 			set_cpus_related(cpu, i, cpu_l2_cache_mask);
 
 		return false;
@@ -1284,9 +1282,6 @@ static bool update_mask_by_l2(int cpu)
 	alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
 	cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
 
-	if (has_big_cores)
-		submask_fn = cpu_smallcore_mask;
-
 	/* Update l2-cache mask with all the CPUs that are part of submask */
 	or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
--
2.18.2
[PATCH v2 0/2] Fixes for coregroup
These patches fix problems introduced by the coregroup patches. The
first patch removes a redundant variable. The second patch allows
booting with CONFIG_CPUMASK_OFFSTACK enabled.

Changelog v1->v2:
https://lore.kernel.org/linuxppc-dev/20201008034240.34059-1-sri...@linux.vnet.ibm.com/t/#u
1. 1st patch was not part of previous posting.
2. Updated 2nd patch based on comments from Michael Ellerman

Cc: linuxppc-dev
Cc: LKML
Cc: Michael Ellerman
Cc: Nathan Lynch
Cc: Gautham R Shenoy
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Qian Cai

Srikar Dronamraju (2):
  powerpc/smp: Remove unnecessary variable
  powerpc/smp: Use GFP_ATOMIC while allocating tmp mask

 arch/powerpc/kernel/smp.c | 70 +++
 1 file changed, 35 insertions(+), 35 deletions(-)

--
2.18.2
Re: KVM on POWER8 host lock up since 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
Excerpts from Michal Suchánek's message of October 17, 2020 6:14 am:
> On Mon, Sep 07, 2020 at 11:13:47PM +1000, Nicholas Piggin wrote:
>> Excerpts from Michael Ellerman's message of August 31, 2020 8:50 pm:
>> > Michal Suchánek writes:
>> >> On Mon, Aug 31, 2020 at 11:14:18AM +1000, Nicholas Piggin wrote:
>> >>> Excerpts from Michal Suchánek's message of August 31, 2020 6:11 am:
>> >>> > Hello,
>> >>> >
>> >>> > on POWER8 KVM hosts lock up since commit 10d91611f426 ("powerpc/64s:
>> >>> > Reimplement book3s idle code in C").
>> >>> >
>> >>> > The symptom is the host locking up completely after some hours of KVM
>> >>> > workload with messages like
>> >>> >
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>> >>> >
>> >>> > printed before the host locks up.
>> >>> >
>> >>> > The machines run sandboxed builds, which is a mixed workload resulting in
>> >>> > IO/single core/multiple core load over time, and there are periods of no
>> >>> > activity and no VMs running as well. The VMs are short-lived, so VM
>> >>> > setup/teardown is somewhat exercised as well.
>> >>> >
>> >>> > POWER9 with the new guest entry fast path does not seem to be affected.
>> >>> >
>> >>> > Reverted the patch and the followup idle fixes on top of 5.2.14 and
>> >>> > re-applied commit a3f3072db6ca ("powerpc/powernv/idle: Restore IAMR
>> >>> > after idle") which gives the same idle code as 5.1.16, and the kernel
>> >>> > seems stable.
>> >>> >
>> >>> > Config is attached.
>> >>> >
>> >>> > I cannot easily revert this commit, especially if I want to use the same
>> >>> > kernel on POWER8 and POWER9 - many of the POWER9 fixes are applicable
>> >>> > only to the new idle code.
>> >>> >
>> >>> > Any idea what can be the problem?
>> >>>
>> >>> So hwthread_state is never getting back to HWTHREAD_IN_IDLE on
>> >>> those threads. I wonder what they are doing. POWER8 doesn't have a good
>> >>> NMI IPI and I don't know if it supports pdbg dumping registers from the
>> >>> BMC unfortunately.
>> >>
>> >> It may be possible to set up fadump with a later kernel version that
>> >> supports it on powernv and dump the whole kernel.
>> >
>> > Your firmware won't support it AFAIK.
>> >
>> > You could try kdump, but if we have CPUs stuck in KVM then there's a
>> > good chance it won't work :/
>>
>> I haven't had any luck yet reproducing this still. Testing with sub
>> cores of various different combinations, etc. I'll keep trying though.
>
> Hello,
>
> I tried running some KVM guests to simulate the workload and what I get
> is guests failing to start with an rcu stall. Tried both 5.3 and 5.9
> kernels and qemu 4.2.1 and 5.1.0.
>
> To start some guests I run
>
> for i in $(seq 0 9) ; do /opt/qemu/bin/qemu-system-ppc64 -m 2048 -accel kvm -smp 8 -kernel /boot/vmlinux -initrd /boot/initrd -nodefaults -nographic -serial mon:telnet::444$i,server,wait & done
>
> To simulate some workload I run
>
> xz -zc9T0 < /dev/zero > /dev/null &
> while true; do
> 	killall -STOP xz; sleep 1; killall -CONT xz; sleep 1;
> done &
>
> on the host and add a job that executes this to the ramdisk. However, most
> guests never get to the point where the job is executed.
>
> Any idea what might be the problem?

I would say try without pv queued spin locks (but if the same thing is
happening with 5.3 then it must be something else I guess).

I'll try to test a similar setup on a POWER8 here.

Thanks,
Nick

> In the past I was able to boot guests quite reliably.
>
> This is boot log of one of the VMs
>
> Trying ::1...
> Connected to localhost.
> Escape character is '^]'.
>
> SLOF **
> QEMU Starting
>  Build Date = Jul 17 2020 11:15:24
>  FW Version = git-e18ddad8516ff2cf
>  Press "s" to enter Open Firmware.
>
> Populating /vdevice methods
> Populating /vdevice/vty@7100
> Populating /vdevice/nvram@7101
> Populating /pci@8002000
> No NVRAM common partition, re-initializing...
> Scanning USB
> Using default console: /vdevice/vty@7100
> Detected RAM kernel at 40 (27c8620 bytes)
>
> Welcome to Open Firmware
>
> Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
> This program and the accompanying materials are made available
> under the terms of the BSD License available at
> http://www.opensource.org/licenses/bsd-license.php
>
> Booting from memory...
> OF stdout device is: /vdevice/vty@7100
> Preparing to boot Linux version 5.9.0-1.g11733e1-default (geeko@buildhost) (gcc (SUSE Linux) 10.2.1 20200825 [re
[Bug 209733] Starting new KVM virtual machines on PPC64 starts to hang after box is up for a while
https://bugzilla.kernel.org/show_bug.cgi?id=209733

Cameron (c...@neo-zeon.de) changed:

           What    |Removed                    |Added
----------------------------------------------------------------
      Component|PPC-64                     |kvm
        Version|2.5                        |unspecified
        Product|Platform Specific/Hardware |Virtualization

--
You are receiving this mail because:
You are watching the assignee of the bug.
[Bug 209733] New: Starting new KVM virtual machines on PPC64 starts to hang after box is up for a while
https://bugzilla.kernel.org/show_bug.cgi?id=209733

            Bug ID: 209733
           Summary: Starting new KVM virtual machines on PPC64 starts to
                    hang after box is up for a while
           Product: Platform Specific/Hardware
           Version: 2.5
    Kernel Version: >=5.8
          Hardware: PPC-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: PPC-64
          Assignee: platform_ppc...@kernel-bugs.osdl.org
          Reporter: c...@neo-zeon.de
        Regression: No

Issue occurs with 5.8.14, 5.8.16, and 5.9.1. Does NOT occur with 5.7.x. I
suspect it occurs with all of 5.8, but I haven't confirmed this yet.

After the box has been up for a "while", starting new VM's fails. Completely
shutting down existing VM's and then starting them back up will also fail in
the same way. What is a while? Could be 2 days, might be 9. I'll update as
the pattern becomes more clear.

libvirt is generally used, but when running kvm manually with strace, kvm
always gets stuck here:

ioctl(11, KVM_PPC_ALLOCATE_HTAB, 0x7fffea0bade4

Maybe the kernel is trying to find the memory needed to allocate the Hashed
Page Table but is unable to do so? Maybe there's a memory leak?

Before this issue starts occurring, I have confirmed I am able to run the
exact same kvm command manually:

sudo -u libvirt-qemu qemu-system-ppc64 -enable-kvm -m 8192 -nographic -vga none -drive file=/var/lib/libvirt/images/test.qcow2,format=qcow2 -mem-prealloc -smp 4

Nothing in dmesg, nothing useful in the logs.

This box's configuration:
Debian 10 stable
2x 18 core POWER9 (144 threads)
512g physical memory
Raptor Talos II motherboard
radix MMU disabled

Unfortunately, I cannot test the affected box with the Radix MMU enabled
because I have some important VM's that won't run unless it is disabled.

--
You are receiving this mail because:
You are watching the assignee of the bug.
[PATCH AUTOSEL 4.4 24/33] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng

[ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ]

Fix to return error code PTR_ERR() from the error handling case instead
of 0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com
Acked-by: Tyrel Datwyler
Signed-off-by: Jing Xiangfeng
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 0526a47e30a3f..db80ab8335dfb 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4790,6 +4790,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	if (IS_ERR(vhost->work_thread)) {
 		dev_err(dev, "Couldn't create kernel thread: %ld\n",
 			PTR_ERR(vhost->work_thread));
+		rc = PTR_ERR(vhost->work_thread);
 		goto free_host_mem;
 	}
--
2.25.1
[PATCH AUTOSEL 4.9 32/41] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng [ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com Acked-by: Tyrel Datwyler Signed-off-by: Jing Xiangfeng Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvfc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 54dea767dfde9..04b3ac17531db 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4804,6 +4804,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } -- 2.25.1
[PATCH AUTOSEL 4.14 39/52] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng [ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com Acked-by: Tyrel Datwyler Signed-off-by: Jing Xiangfeng Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvfc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 34612add3829f..dbacd9830d3df 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4797,6 +4797,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } -- 2.25.1
[PATCH AUTOSEL 4.19 42/56] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng [ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com Acked-by: Tyrel Datwyler Signed-off-by: Jing Xiangfeng Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvfc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 71d53bb239e25..090ab377f65e5 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4795,6 +4795,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } -- 2.25.1
[PATCH AUTOSEL 5.4 55/80] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng [ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com Acked-by: Tyrel Datwyler Signed-off-by: Jing Xiangfeng Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvfc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index df897df5cafee..8a76284b59b08 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4788,6 +4788,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } -- 2.25.1
[PATCH AUTOSEL 5.8 071/101] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng [ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com Acked-by: Tyrel Datwyler Signed-off-by: Jing Xiangfeng Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvfc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 635f6f9cffc40..ef91f3d01f989 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4928,6 +4928,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } -- 2.25.1
[PATCH AUTOSEL 5.9 076/111] scsi: ibmvfc: Fix error return in ibmvfc_probe()
From: Jing Xiangfeng [ Upstream commit 5e48a084f4e824e1b624d3fd7ddcf53d2ba69e53 ] Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangf...@huawei.com Acked-by: Tyrel Datwyler Signed-off-by: Jing Xiangfeng Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvfc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index ea7c8930592dc..70daa0605082d 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4928,6 +4928,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } -- 2.25.1
[PATCH v2 2/2] powerpc/44x: Don't support 47x code and non 47x code at the same time
440/460 variants and 470 variants are not compatible; there is no need to build code supporting both and to select between them at runtime using MMU features. Just use CONFIG_PPC_47x to decide what to build.

Signed-off-by: Christophe Leroy
---
v2: Move a label "1:" used by 44x outside #ifdef CONFIG_PPC_47x
---
 arch/powerpc/kernel/entry_32.S   | 11 +++
 arch/powerpc/mm/nohash/tlb_low.S | 29 +++--
 2 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 8cdc8bcde703..a425360deabb 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -439,15 +439,13 @@ syscall_exit_cont:
 	andis.	r10,r0,DBCR0_IDM@h
 	bnel-	load_dbcr0
 #endif
-#ifdef CONFIG_44x
-BEGIN_MMU_FTR_SECTION
+#ifdef CONFIG_PPC_47x
 	lis	r4,icache_44x_need_flush@ha
 	lwz	r5,icache_44x_need_flush@l(r4)
 	cmplwi	cr0,r5,0
 	bne-	2f
+#endif /* CONFIG_PPC_47x */
 1:
-END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_47x)
-#endif /* CONFIG_44x */
 BEGIN_FTR_SECTION
 	lwarx	r7,0,r1
 END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX)
@@ -948,10 +946,7 @@ restore_kuap:

 	/* interrupts are hard-disabled at this point */
 restore:
-#ifdef CONFIG_44x
-BEGIN_MMU_FTR_SECTION
-	b	1f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
+#if defined(CONFIG_44x) && !defined(CONFIG_PPC_47x)
 	lis	r4,icache_44x_need_flush@ha
 	lwz	r5,icache_44x_need_flush@l(r4)
 	cmplwi	cr0,r5,0
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S
index eaeee402f96e..68797e072f55 100644
--- a/arch/powerpc/mm/nohash/tlb_low.S
+++ b/arch/powerpc/mm/nohash/tlb_low.S
@@ -92,36 +92,25 @@ _GLOBAL(__tlbil_va)
 	tlbsx.	r6,0,r3
 	bne	10f
 	sync
-BEGIN_MMU_FTR_SECTION
-	b	2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
+#ifndef CONFIG_PPC_47x
 	/* On 440 There are only 64 TLB entries, so r3 < 64, which means bit
 	 * 22, is clear.  Since 22 is the V bit in the TLB_PAGEID, loading this
 	 * value will invalidate the TLB entry.
 	 */
 	tlbwe	r6,r6,PPC44x_TLB_PAGEID
-	isync
-10:	wrtee	r10
-	blr
-2:
-#ifdef CONFIG_PPC_47x
+#else
 	oris	r7,r6,0x8000	/* specify way explicitly */
 	clrrwi	r4,r3,12	/* get an EPN for the hashing with V = 0 */
 	ori	r4,r4,PPC47x_TLBE_SIZE
 	tlbwe	r4,r7,0		/* write it */
+#endif /* !CONFIG_PPC_47x */
 	isync
-	wrtee	r10
+10:	wrtee	r10
 	blr
-#else /* CONFIG_PPC_47x */
-1:	trap
-	EMIT_BUG_ENTRY	1b,__FILE__,__LINE__,0;
-#endif /* !CONFIG_PPC_47x */

 _GLOBAL(_tlbil_all)
 _GLOBAL(_tlbil_pid)
-BEGIN_MMU_FTR_SECTION
-	b	2f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
+#ifndef CONFIG_PPC_47x
 	li	r3,0
 	sync
@@ -136,8 +125,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
 	isync
 	blr
-2:
-#ifdef CONFIG_PPC_47x
+#else
 	/* 476 variant. There's not simple way to do this, hopefully we'll
 	 * try to limit the amount of such full invalidates
 	 */
@@ -179,11 +167,8 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_47x)
 	b	1b		/* Then loop */
 1:	isync			/* Sync shadows */
 	wrtee	r11
-#else /* CONFIG_PPC_47x */
-1:	trap
-	EMIT_BUG_ENTRY	1b,__FILE__,__LINE__,0;
-#endif /* !CONFIG_PPC_47x */
 	blr
+#endif /* !CONFIG_PPC_47x */

 #ifdef CONFIG_PPC_47x
--
2.25.0
[PATCH v2 1/2] powerpc/44x: Don't support 440 when CONFIG_PPC_47x is set
As stated in platform/44x/Kconfig, CONFIG_PPC_47x is not compatible with 440 and 460 variants. This is confirmed in asm/cache.h, as L1_CACHE_SHIFT is different for 47x, meaning a kernel built for 47x will not run correctly on a 440.

In cputable, opt out all 440 and 460 variants when CONFIG_PPC_47x is set. Also add a default match dedicated to 470.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/include/asm/cputable.h |  9 +
 arch/powerpc/include/asm/mmu.h      |  7 +++
 arch/powerpc/kernel/cputable.c      | 29 +
 3 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index d88bcb79f16d..4a0ddf66bd4a 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -523,11 +523,10 @@ enum {
 #ifdef CONFIG_40x
 	    CPU_FTRS_40X |
 #endif
-#ifdef CONFIG_44x
-	    CPU_FTRS_44X | CPU_FTRS_440x6 |
-#endif
 #ifdef CONFIG_PPC_47x
 	    CPU_FTRS_47X | CPU_FTR_476_DD2 |
+#elif defined(CONFIG_44x)
+	    CPU_FTRS_44X | CPU_FTRS_440x6 |
 #endif
 #ifdef CONFIG_E200
 	    CPU_FTRS_E200 |
@@ -596,7 +595,9 @@ enum {
 #ifdef CONFIG_40x
 	    CPU_FTRS_40X &
 #endif
-#ifdef CONFIG_44x
+#ifdef CONFIG_PPC_47x
+	    CPU_FTRS_47X &
+#elif defined(CONFIG_44x)
 	    CPU_FTRS_44X & CPU_FTRS_440x6 &
 #endif
 #ifdef CONFIG_E200
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index a1769c0426f2..bf5d3b5291f1 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -162,15 +162,14 @@ enum {
 #ifdef CONFIG_40x
 	MMU_FTR_TYPE_40x |
 #endif
-#ifdef CONFIG_44x
+#ifdef CONFIG_PPC_47x
+	MMU_FTR_TYPE_47x | MMU_FTR_USE_TLBIVAX_BCAST | MMU_FTR_LOCK_BCAST_INVAL |
+#elif defined(CONFIG_44x)
 	MMU_FTR_TYPE_44x |
 #endif
 #if defined(CONFIG_E200) || defined(CONFIG_E500)
 	MMU_FTR_TYPE_FSL_E | MMU_FTR_BIG_PHYS | MMU_FTR_USE_TLBILX |
 #endif
-#ifdef CONFIG_PPC_47x
-	MMU_FTR_TYPE_47x | MMU_FTR_USE_TLBIVAX_BCAST | MMU_FTR_LOCK_BCAST_INVAL |
-#endif
 #ifdef CONFIG_PPC_BOOK3S_32
 	MMU_FTR_USE_HIGH_BATS |
 #endif
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 492c0b36aff6..cf80e6c8ed5e 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -1533,6 +1533,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
 #endif /* CONFIG_40x */
 #ifdef CONFIG_44x
+#ifndef CONFIG_PPC_47x
 	{
 		.pvr_mask		= 0xffff,
 		.pvr_value		= 0x4850,
@@ -1815,7 +1816,19 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_440A,
 		.platform		= "ppc440",
 	},
-#ifdef CONFIG_PPC_47x
+	{ /* default match */
+		.pvr_mask		= 0x,
+		.pvr_value		= 0x,
+		.cpu_name		= "(generic 44x PPC)",
+		.cpu_features		= CPU_FTRS_44X,
+		.cpu_user_features	= COMMON_USER_BOOKE,
+		.mmu_features		= MMU_FTR_TYPE_44x,
+		.icache_bsize		= 32,
+		.dcache_bsize		= 32,
+		.machine_check		= machine_check_4xx,
+		.platform		= "ppc440",
+	}
+#else /* CONFIG_PPC_47x */
 	{ /* 476 DD2 core */
 		.pvr_mask		= 0x,
 		.pvr_value		= 0x11a52080,
@@ -1872,19 +1885,19 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.machine_check		= machine_check_47x,
 		.platform		= "ppc470",
 	},
-#endif /* CONFIG_PPC_47x */
 	{ /* default match */
 		.pvr_mask		= 0x,
 		.pvr_value		= 0x,
-		.cpu_name		= "(generic 44x PPC)",
-		.cpu_features		= CPU_FTRS_44X,
+		.cpu_name		= "(generic 47x PPC)",
+		.cpu_features		= CPU_FTRS_47X,
 		.cpu_user_features	= COMMON_USER_BOOKE,
-		.mmu_features		= MMU_FTR_TYPE_44x,
+		.mmu_features		= MMU_FTR_TYPE_47x,
 		.icache_bsize		= 32,
-		.dcache_bsize		= 32,
-		.machine_check		= machine_check_4xx,
-		.platform		= "ppc440",
+		.dcache_bsize		= 128,
+		.machine_check		= machine_check_47x,
+		.platform		= "ppc470",
 	}
+#endif /* CONFIG_PPC_47x */
 #endif /* CONFIG_44x */
 #ifdef CONFIG_E200
 	{ /* e200z5 */
--
2.25.0