Re: [PATCH v1 1/1] soc: fsl: Replace kernel.h with the necessary inclusions
Le 29/10/2021 à 22:31, Andy Shevchenko a écrit : On Fri, Oct 29, 2021 at 10:04 PM LEROY Christophe wrote: Le 29/10/2021 à 17:55, Andy Shevchenko a écrit : On Wed, Oct 27, 2021 at 06:33:54PM +0300, Andy Shevchenko wrote: When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Seems nobody from PPC took this patch. Any idea who can take it? You have to check in MAINTAINERS file in the root directory of kernel sources: https://github.com/linuxppc/linux/blob/master/MAINTAINERS Actually for these files get_maintainer.pl showed nothing. I have chosen PPC maintainers manually. That's Michael who takes them. But you have to allow him enough time for it. Thanks! I wrote that message because I have got a notification from checkpatch that it should go somewhere else. That means that Michael considered it is not for him. And I think the reason is that in MAINTAINERS you have: FREESCALE QUICC ENGINE LIBRARY M: Qiang Zhao L: linuxppc-dev@lists.ozlabs.org S: Maintained F: drivers/soc/fsl/qe/ F: include/soc/fsl/*qe*.h F: include/soc/fsl/*ucc*.h FREESCALE SOC DRIVERS M: Li Yang L: linuxppc-dev@lists.ozlabs.org L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers) S: Maintained F: Documentation/devicetree/bindings/misc/fsl,dpaa2-console.yaml F: Documentation/devicetree/bindings/soc/fsl/ F: drivers/soc/fsl/ F: include/linux/fsl/ Sorry I overlooked your patch. Christophe
Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.15-6 tag
The pull request you sent on Sat, 30 Oct 2021 10:05:46 +1100: > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git > tags/powerpc-5.15-6 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/119c85055d867b9588263bca59794c872ef2a30e Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
[PATCH] powerpc: fadump: correct two typos in a comment
Fix typos of 'remaining' and 'those'. Signed-off-by: Randy Dunlap Suggested-by: Matthew Wilcox # 'remaining' Cc: Michael Ellerman Cc: linuxppc-dev@lists.ozlabs.org --- arch/powerpc/kernel/fadump.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- linux-next-20211029.orig/arch/powerpc/kernel/fadump.c +++ linux-next-20211029/arch/powerpc/kernel/fadump.c @@ -73,8 +73,8 @@ static struct cma *fadump_cma; * The total size of fadump reserved memory covers for boot memory size * + cpu data size + hpte size and metadata. * Initialize only the area equivalent to boot memory size for CMA use. - * The reamining portion of fadump reserved memory will be not given - * to CMA and pages for thoes will stay reserved. boot memory size is + * The remaining portion of fadump reserved memory will be not given + * to CMA and pages for those will stay reserved. boot memory size is * aligned per CMA requirement to satisy cma_init_reserved_mem() call. * But for some reason even if it fails we still have the memory reservation * with us and we can still continue doing fadump.
[Bug 214867] UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
https://bugzilla.kernel.org/show_bug.cgi?id=214867 --- Comment #3 from Frank Rowand (bugzilla.kernel@frowand.com) --- I forwarded my email notification of this bug to the mail lists. I prefer discussion to occur there: https://lore.kernel.org/all/c474a371-b524-1da8-4a67-e72cf8f2b...@gmail.com/ Thank you for the report. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.
On Fri, Oct 29, 2021 at 4:27 PM Eugene Bordenkircher wrote: > > Typing Greg's email correct this time. My apologies. > > Eugene > > -Original Message- > From: Eugene Bordenkircher > Sent: Friday, October 29, 2021 10:14 AM > To: linux-...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org > Cc: leoyang...@nxp.com; ba...@kernel.org; gre...@linuxfoundataion.org > Subject: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to > unrecoverable loop. > > Hello all, > > We've discovered a situation where the FSL udc driver > (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over the > request queue, but the queue has been corrupted at some point so it loops > infinitely. I believe we have narrowed into the offending code, but we are > in need of assistance trying to find an appropriate fix for the problem. The > identified code appears to be in all versions of the Linux kernel the driver > exists in. > > The problem appears to be when handling a USB_REQ_GET_STATUS request. The > driver gets this request and then calls the ch9getstatus() function. In this > function, it starts a request by "borrowing" the per device status_req, > filling it in, and then queuing it with a call to list_add_tail() to add the > request to the endpoint queue. Right before it exits the function however, > it's calling ep0_prime_status(), which is filling out that same status_req > structure and then queuing it with another call to list_add_tail() to add the > request to the endpoint queue. This adds two instances of the exact same > LIST_HEAD to the endpoint queue, which breaks the list since the prev and > next pointers end up pointing to the wrong things. This ends up causing a > hard loop the next time nuke() gets called, which happens on the next setup > IRQ. > I agree with you that this looks problematic. This is probably introduced by f79a60b8785 "usb: fsl_udc_core: prime status stage once data stage has primed" that it didn't consider that the status_req has been re-used for the DATA phase. I think the proper fix should be having a separate request allocated for the data phase after the above change. > I'm not sure what the appropriate fix to this problem is, mostly due to my > lack of expertise in USB and this driver stack. The code has been this way > in the kernel for a very long time, which suggests that it has been working, > unless USB_REQ_GET_STATUS requests are never made. This further suggests > that there is something else going on that I don't understand. Deleting the > call to ep0_prime_status() and the following ep0stall() call appears, on the > surface, to get the device working again, but may have side effects that I'm > not seeing. > > I'm hopeful someone in the community can help provide some information on > what I may be missing or help come up with a solution to the problem. A big > thank you to anyone who would like to help out. > > Eugene
[GIT PULL] Please pull powerpc/linux.git powerpc-5.15-6 tag
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Hi Linus, Please pull the final set of powerpc fixes for 5.15: The following changes since commit 787252a10d9422f3058df9a4821f389e5326c440: powerpc/smp: do not decrement idle task preempt count in CPU offline (2021-10-20 21:38:01 +1100) are available in the git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.15-6 for you to fetch changes up to d853adc7adf601d7d6823afe3ed396065a3e08d1: powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present (2021-10-25 11:41:15 +1100) - -- powerpc fixes for 5.15 #6 Three commits fixing some issues introduced with the recent IOMMU changes we merged. Thanks to: Alexey Kardashevskiy - -- Alexey Kardashevskiy (3): powerpc/pseries/iommu: Use correct vfree for it_map powerpc/pseries/iommu: Check if the default window in use before removing it powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present arch/powerpc/platforms/pseries/iommu.c | 27 ++-- 1 file changed, 14 insertions(+), 13 deletions(-) -BEGIN PGP SIGNATURE- iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmF8fhoACgkQUevqPMjh pYAoaRAAps3wmmCXKdVbFvqKTFzcRFiWoFa0r2c6SykG7hvo6y1r3avF5PhXU5ry OshoMcw+ZPFeH/Jc7VB/i7a9nQZSlf1k3Z9SaM+WVOgqFUhbE6OjC1r2VfRgo2lY 8QFmlLesNNx5dg+NXcunFD7Z7ydQopCR9QprlpWq2ZAxcIf9z7PP/SNlfxzCMo0d 0zYBfchkHAsg4C3/c6CjIr6lmbuPvlX3YoSyiOb9MBuAZB+fA6jNxqsW8GWbLYOA XNCFQ+1vqv5cwrjlo1nKCLQjYi/9MnF7/SLPeIHA/MYQBF7iuAeOCDo2ldgzKtAO uwSDrNiuGBya2QMU6ulnbHlropmg4NdtCp9i0jcztbDWRZl+dmJ88LqI5jE43JOF pgaf25jTw80yCrwxBFxfGwAesQPAxWMAV5SmqilArNu8ctCThRVeyYxIeFXpoZBA Gl54/3VX6lXGF0Myf1gHdu5Qqkj6W/PlOwmr/WcQLRthHhIaVnW/Y0VlWqQ1FA3e SsPf5XfP5VsqTXSos+t8FR9kpFaxOOC8C3Qo6bTbGYdd/dNx37AqXAK9B7vlgm3I ufLg5t6bx9DWLx8i+tNOqG7owY4PfwnBDgxLl9dsP41srWPdgP81/IsHnevSYis8 QrSBgPE3+elkr2V8tRR9Eco3bYwPQDBSdMqTksfnkMJ+t1jinz0= =l0Ac -END PGP SIGNATURE-
bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.
Hello all, We've discovered a situation where the FSL udc driver (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over the request queue, but the queue has been corrupted at some point so it loops infinitely. I believe we have narrowed into the offending code, but we are in need of assistance trying to find an appropriate fix for the problem. The identified code appears to be in all versions of the Linux kernel the driver exists in. The problem appears to be when handling a USB_REQ_GET_STATUS request. The driver gets this request and then calls the ch9getstatus() function. In this function, it starts a request by "borrowing" the per device status_req, filling it in, and then queuing it with a call to list_add_tail() to add the request to the endpoint queue. Right before it exits the function however, it's calling ep0_prime_status(), which is filling out that same status_req structure and then queuing it with another call to list_add_tail() to add the request to the endpoint queue. This adds two instances of the exact same LIST_HEAD to the endpoint queue, which breaks the list since the prev and next pointers end up pointing to the wrong things. This ends up causing a hard loop the next time nuke() gets called, which happens on the next setup IRQ. I'm not sure what the appropriate fix to this problem is, mostly due to my lack of expertise in USB and this driver stack. The code has been this way in the kernel for a very long time, which suggests that it has been working, unless USB_REQ_GET_STATUS requests are never made. This further suggests that there is something else going on that I don't understand. Deleting the call to ep0_prime_status() and the following ep0stall() call appears, on the surface, to get the device working again, but may have side effects that I'm not seeing. I'm hopeful someone in the community can help provide some information on what I may be missing or help come up with a solution to the problem. A big thank you to anyone who would like to help out. Eugene
RE: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.
Typing Greg's email correct this time. My apologies. Eugene -Original Message- From: Eugene Bordenkircher Sent: Friday, October 29, 2021 10:14 AM To: linux-...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org Cc: leoyang...@nxp.com; ba...@kernel.org; gre...@linuxfoundataion.org Subject: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop. Hello all, We've discovered a situation where the FSL udc driver (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over the request queue, but the queue has been corrupted at some point so it loops infinitely. I believe we have narrowed into the offending code, but we are in need of assistance trying to find an appropriate fix for the problem. The identified code appears to be in all versions of the Linux kernel the driver exists in. The problem appears to be when handling a USB_REQ_GET_STATUS request. The driver gets this request and then calls the ch9getstatus() function. In this function, it starts a request by "borrowing" the per device status_req, filling it in, and then queuing it with a call to list_add_tail() to add the request to the endpoint queue. Right before it exits the function however, it's calling ep0_prime_status(), which is filling out that same status_req structure and then queuing it with another call to list_add_tail() to add the request to the endpoint queue. This adds two instances of the exact same LIST_HEAD to the endpoint queue, which breaks the list since the prev and next pointers end up pointing to the wrong things. This ends up causing a hard loop the next time nuke() gets called, which happens on the next setup IRQ. I'm not sure what the appropriate fix to this problem is, mostly due to my lack of expertise in USB and this driver stack. The code has been this way in the kernel for a very long time, which suggests that it has been working, unless USB_REQ_GET_STATUS requests are never made. This further suggests that there is something else going on that I don't understand. Deleting the call to ep0_prime_status() and the following ep0stall() call appears, on the surface, to get the device working again, but may have side effects that I'm not seeing. I'm hopeful someone in the community can help provide some information on what I may be missing or help come up with a solution to the problem. A big thank you to anyone who would like to help out. Eugene
[PATCH] powerpc/fsl: fix the schema check errors for fsl, tmu-calibration
fsl,tmu-calibration is in u32-matrix format. Use matching property syntax. No functional changes. Fixes warnings as: $ make dtbs_check ... arch/arm64/boot/dts/freescale/imx8mq-librem5-r3.dt.yaml: tmu@3026: fsl,tmu-calibration:0: Additional items are not allowed (1, 41, 2, 47, 3, 53, 4, 61, 5, 67, 6, 75, 7, 81, 8, 87, 9, 95, 10, 103, 11, 111 , 65536, 27, 65537, 35, 65538, 43, 65539, 51, 65540, 59, 65541, 67, 65542, 75, 65543, 85, 65544, 93, 65545, 103, 65546, 112, 131072, 23, 131073, 35, 131074, 45, 131075, 55, 131076, 65, 131077, 75, 131078, 87, 13 1079, 99, 131080, 111, 196608, 21, 196609, 33, 196610, 45, 196611, 57, 196612, 69, 196613, 83, 196614, 95, 196615, 113 were unexpected) From schema: Documentation/devicetree/bindings/thermal/qoriq-thermal.yaml ... Signed-off-by: David Heidelberg --- arch/powerpc/boot/dts/fsl/t1023si-post.dtsi | 79 +++-- arch/powerpc/boot/dts/fsl/t1040si-post.dtsi | 71 +- 2 files changed, 76 insertions(+), 74 deletions(-) diff --git a/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi b/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi index d552044c5afc..aa5152ca8120 100644 --- a/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/t1023si-post.dtsi @@ -367,45 +367,46 @@ tmu: tmu@f { reg = <0xf 0x1000>; interrupts = <18 2 0 0>; fsl,tmu-range = <0xb 0xa0026 0x80048 0x30061>; - fsl,tmu-calibration = <0x 0x000f - 0x0001 0x0017 - 0x0002 0x001e - 0x0003 0x0026 - 0x0004 0x002e - 0x0005 0x0035 - 0x0006 0x003d - 0x0007 0x0044 - 0x0008 0x004c - 0x0009 0x0053 - 0x000a 0x005b - 0x000b 0x0064 - - 0x0001 0x0011 - 0x00010001 0x001c - 0x00010002 0x0024 - 0x00010003 0x002b - 0x00010004 0x0034 - 0x00010005 0x0039 - 0x00010006 0x0042 - 0x00010007 0x004c - 0x00010008 0x0051 - 0x00010009 0x005a - 0x0001000a 0x0063 - - 0x0002 0x0013 - 0x00020001 0x0019 - 0x00020002 0x0024 - 0x00020003 0x002c - 0x00020004 0x0035 - 0x00020005 0x003d - 0x00020006 0x0046 - 0x00020007 0x0050 - 0x00020008 0x0059 - - 0x0003 0x0002 - 0x00030001 0x000d - 0x00030002 0x0019 - 0x00030003 0x0024>; + fsl,tmu-calibration = + <0x 0x000f>, + <0x0001 0x0017>, + <0x0002 0x001e>, + <0x0003 0x0026>, + <0x0004 0x002e>, + <0x0005 0x0035>, + <0x0006 0x003d>, + <0x0007 0x0044>, + <0x0008 0x004c>, + <0x0009 0x0053>, + <0x000a 0x005b>, + <0x000b 0x0064>, + + <0x0001 0x0011>, + <0x00010001 0x001c>, + <0x00010002 0x0024>, + <0x00010003 0x002b>, + <0x00010004 0x0034>, + <0x00010005 0x0039>, + <0x00010006 0x0042>, + <0x00010007 0x004c>, + <0x00010008 0x0051>, + <0x00010009 0x005a>, + <0x0001000a
Re: [PATCH v1 1/1] soc: fsl: Replace kernel.h with the necessary inclusions
On Fri, Oct 29, 2021 at 10:04 PM LEROY Christophe wrote: > > > > Le 29/10/2021 à 17:55, Andy Shevchenko a écrit : > > On Wed, Oct 27, 2021 at 06:33:54PM +0300, Andy Shevchenko wrote: > >> When kernel.h is used in the headers it adds a lot into dependency hell, > >> especially when there are circular dependencies are involved. > >> > >> Replace kernel.h inclusion with the list of what is really being used. > > > > Seems nobody from PPC took this patch. > > Any idea who can take it? > > > > You have to check in MAINTAINERS file in the root directory of kernel > sources: https://github.com/linuxppc/linux/blob/master/MAINTAINERS Actually for these files get_maintainer.pl showed nothing. I have chosen PPC maintainers manually. > That's Michael who takes them. But you have to allow him enough time for it. Thanks! I wrote that message because I have got a notification from checkpatch that it should go somewhere else. -- With Best Regards, Andy Shevchenko
Re: [PATCH v1 1/1] soc: fsl: Replace kernel.h with the necessary inclusions
Le 29/10/2021 à 17:55, Andy Shevchenko a écrit : On Wed, Oct 27, 2021 at 06:33:54PM +0300, Andy Shevchenko wrote: When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Seems nobody from PPC took this patch. Any idea who can take it? You have to check in MAINTAINERS file in the root directory of kernel sources: https://github.com/linuxppc/linux/blob/master/MAINTAINERS That's Michael who takes them. But you have to allow him enough time for it. Christophe
Re: [PATCH v1 1/1] soc: fsl: Replace kernel.h with the necessary inclusions
Le 29/10/2021 à 17:55, Andy Shevchenko a écrit : > On Wed, Oct 27, 2021 at 06:33:54PM +0300, Andy Shevchenko wrote: >> When kernel.h is used in the headers it adds a lot into dependency hell, >> especially when there are circular dependencies are involved. >> >> Replace kernel.h inclusion with the list of what is really being used. > > Seems nobody from PPC took this patch. > Any idea who can take it? > You have to check in MAINTAINERS file in the root directory of kernel sources: https://github.com/linuxppc/linux/blob/master/MAINTAINERS That's Michael who takes them. But you have to allow him enough time for it. Christophe
Re: [PATCH v1 1/1] soc: fsl: Replace kernel.h with the necessary inclusions
On Wed, Oct 27, 2021 at 06:33:54PM +0300, Andy Shevchenko wrote: > When kernel.h is used in the headers it adds a lot into dependency hell, > especially when there are circular dependencies are involved. > > Replace kernel.h inclusion with the list of what is really being used. Seems nobody from PPC took this patch. Any idea who can take it? -- With Best Regards, Andy Shevchenko
[PATCH] powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
Until now, all tests involving CONFIG_STRICT_KERNEL_RWX were done with DEBUG_RODATA_TEST to check the result. But now that CONFIG_STRICT_KERNEL_RWX is selected by default, it came without CONFIG_DEBUG_RODATA_TEST and led to the following Oops [6.830908] Freeing unused kernel image (initmem) memory: 352K [6.840077] BUG: Unable to handle kernel data access on write at 0xc1285200 [6.846836] Faulting instruction address: 0xc0004b6c [6.851745] Oops: Kernel access of bad area, sig: 11 [#1] [6.857075] BE PAGE_SIZE=16K PREEMPT CMPC885 [6.861348] SAF3000 DIE NOTIFICATION [6.864830] CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0-rc5-s3k-dev-02255-g2747d7b7916f #451 [6.873429] NIP: c0004b6c LR: c0004b60 CTR: [6.878419] REGS: c902be60 TRAP: 0300 Not tainted (5.15.0-rc5-s3k-dev-02255-g2747d7b7916f) [6.886852] MSR: 9032 CR: 53000335 XER: 8000ff40 [6.893564] DAR: c1285200 DSISR: 8200 [6.893564] GPR00: 0c00 c902bf20 c20f4000 0800 0001 04001f00 c180 0035 [6.893564] GPR08: ff0001ff c128 0002 c0004b60 1000 c0004b1c [6.893564] GPR16: [6.893564] GPR24: c106 [6.932034] NIP [c0004b6c] kernel_init+0x50/0x138 [6.936682] LR [c0004b60] kernel_init+0x44/0x138 [6.941245] Call Trace: [6.943653] [c902bf20] [c0004b60] kernel_init+0x44/0x138 (unreliable) [6.950022] [c902bf30] [c001122c] ret_from_kernel_thread+0x5c/0x64 [6.956135] Instruction dump: [6.959060] 48ffc521 48045469 4800d8cd 3d20c086 89295fa0 2c09 41820058 480796c9 [6.966890] 4800e48d 3d20c128 3942 3fe0c106 <91495200> 3bff8000 4806fa1d 481f7d75 [6.974902] ---[ end trace 1e397bacba4aa610 ]--- 0xc1285200 corresponds to 'system_state' global var that the kernel is trying to set to SYSTEM_RUNNING. This var is above the RO/RW limit so it shouldn't Oops. It oopses because the dirty bit is missing. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 9bdb95f5694f..2d596881b70e 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -755,7 +755,7 @@ _GLOBAL(mmu_pin_tlb) cmplw r6, r9 bdnzt lt, 2b -4: LOAD_REG_IMMEDIATE(r8, 0xf0 | _PAGE_SPS | _PAGE_SH | _PAGE_PRESENT) +4: LOAD_REG_IMMEDIATE(r8, 0xf0 | _PAGE_DIRTY | _PAGE_SPS | _PAGE_SH | _PAGE_PRESENT) 2: ori r0, r6, MD_EVALID mtspr SPRN_MD_CTR, r5 mtspr SPRN_MD_EPN, r0 -- 2.31.1
Re: [PATCH 11/13] ps3vram: add error handling support for add_disk()
Hi Luis, On 10/15/21 4:52 PM, Luis Chamberlain wrote: > We never checked for errors on add_disk() as this function > returned void. Now that this is fixed, use the shiny new > error handling. I didn't yet test this ps3vram related change, but based on the ps3disk testing I think this change will be OK. Acked-by: Geoff Levand
Re: [PATCH 10/13] ps3disk: add error handling support for add_disk()
Hi Luis, On 10/15/21 4:52 PM, Luis Chamberlain wrote: > We never checked for errors on add_disk() as this function > returned void. Now that this is fixed, use the shiny new > error handling. > > Signed-off-by: Luis Chamberlain I tested your 20211011-for-axboe-add-disk-error-handling branch on PS3 and the ps3disk changes seem to be working OK. Tested-by: Geoff Levand
[Bug 214867] UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
https://bugzilla.kernel.org/show_bug.cgi?id=214867 Arnd Bergmann (a...@arndb.de) changed: What|Removed |Added CC||a...@arndb.de --- Comment #2 from Arnd Bergmann (a...@arndb.de) --- This is the function that triggers it: static void of_unittest_untrack_overlay(int id) { if (overlay_first_id < 0) return; id -= overlay_first_id; if (WARN_ON(id >= MAX_UNITTEST_OVERLAYS)) return; overlay_id_bits[BIT_WORD(id)] &= ~BIT_MASK(id); } My guess is that 'id' is negative here, which means it fails to tigger the WARN_ON() but ends up still being out of range. Can you try changing it to 'unsigned int id'? -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[Bug 214867] UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
https://bugzilla.kernel.org/show_bug.cgi?id=214867 --- Comment #1 from Erhard F. (erhar...@mailbox.org) --- Created attachment 299363 --> https://bugzilla.kernel.org/attachment.cgi?id=299363=edit kernel .config (kernel 5.15-rc7, Talos II) # lspci :00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) :01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Turks XT [Radeon HD 6670/7670] :01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Turks HDMI Audio [Radeon HD 6500/6600 / 6700M Series] 0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0001:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation Device 5008 (rev 01) 0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) 0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04) 0005:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41) 0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) 0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4) -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
https://bugzilla.kernel.org/show_bug.cgi?id=214867 Bug ID: 214867 Summary: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36 Product: Platform Specific/Hardware Version: 2.5 Kernel Version: 5.15-rc7 Hardware: PPC-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: PPC-64 Assignee: platform_ppc...@kernel-bugs.osdl.org Reporter: erhar...@mailbox.org CC: bugzilla.kernel@frowand.com Regression: No Created attachment 299361 --> https://bugzilla.kernel.org/attachment.cgi?id=299361=edit kernel dmesg (kernel 5.15-rc7, Talos II) UBSAN catches this at boot on my Talos II. [...] ### dt-test ### EXPECT / : GPIO line <> (line-C-input) hogged as input UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36 shift exponent -1 is negative CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII #1 Call Trace: [c4163700] [c08ffaa8] .dump_stack_lvl+0xa4/0x100 (unreliable) [c4163790] [c08fb46c] .ubsan_epilogue+0x10/0x70 [c4163800] [c08fb270] .__ubsan_handle_shift_out_of_bounds+0x1f0/0x34c [c4163910] [c0ad94a0] .of_unittest_untrack_overlay+0x6c/0xe0 [c41639a0] [c2098ff8] .of_unittest+0x4c50/0x59f8 [c4163b60] [c0011b5c] .do_one_initcall+0x7c/0x4f0 [c4163c50] [c200300c] .kernel_init_freeable+0x704/0x858 [c4163d90] [c0012730] .kernel_init+0x20/0x190 [c4163e10] [c000ce78] .ret_from_kernel_thread+0x58/0x60 ### dt-test ### EXPECT \ : OF: overlay: WARNING: memory leak will occur if overlay removed, property: /testcase-data-2/substation@100/status [...] -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Nicholas Piggin writes: > Excerpts from Athira Rajeev's message of October 29, 2021 1:05 pm: >> During Live Partition Migration (LPM), it is observed that perf >> counter values reports zero post migration completion. However >> 'perf stat' with workload continues to show counts post migration >> since PMU gets disabled/enabled during sched switches. But incase >> of system/cpu wide monitoring, zero counts were reported with 'perf >> stat' after migration completion. >> >> Example: >> ./perf stat -e r1001e -I 1000 >>time counts unit events >> 1.001010437 22,137,414 r1001e >> 2.002495447 15,455,821 r1001e >> <<>> As seen in next below logs, the counter values shows zero >> after migration is completed. >> <<>> >> 86.142535370129,392,333,440 r1001e >> 87.144714617 0 r1001e >> 88.146526636 0 r1001e >> 89.148085029 0 r1001e > > This is the output without the patch? After the patch it keeps counting > I suppose? And does the very large count go away too? > >> >> Here PMU is enabled during start of perf session and counter >> values are read at intervals. Counters are only disabled at the >> end of session. The powerpc mobility code presently does not handle >> disabling and enabling back of PMU counters during partition >> migration. Also since the PMU register values are not saved/restored >> during migration, PMU registers like Monitor Mode Control Register 0 >> (MMCR0), Monitor Mode Control Register 1 (MMCR1) will not contain >> the value it was programmed with. Hence PMU counters will not be >> enabled correctly post migration. >> >> Fix this in mobility code by handling disabling and enabling of >> PMU in all cpu's before and after migration. Patch introduces two >> functions 'mobility_pmu_disable' and 'mobility_pmu_enable'. >> mobility_pmu_disable() is called before the processor threads goes >> to suspend state so as to disable the PMU counters. And disable is >> done only if there are any active events running on that cpu. >> mobility_pmu_enable() is called after the migrate is done to enable >> back the PMU counters. >> >> Since the performance Monitor counters ( PMCs) are not >> saved/restored during LPM, results in PMC value being zero and the >> 'event->hw.prev_count' being non-zero value. This causes problem >> during updation of event->count since we always accumulate >> (event->hw.prev_count - PMC value) in event->count. If >> event->hw.prev_count is greater PMC value, event->count becomes >> negative. To fix this, 'prev_count' also needs to be re-initialised >> for all events while enabling back the events. Hence read the >> existing events and clear the PMC index (stored in event->hw.idx) >> for all events im mobility_pmu_disable. By this way, event count >> settings will get re-initialised correctly in power_pmu_enable. >> >> Signed-off-by: Athira Rajeev >> [ Fixed compilation error reported by kernel test robot ] >> Reported-by: kernel test robot >> --- >> Changelog: >> Change from v2 -> v3: >> Addressed review comments from Nicholas Piggin. >> - Removed the "migrate" field which was added in initial >>patch to address updation of event count settings correctly >>in power_pmu_enable. Instead read off existing events in >>mobility_pmu_disable before power_pmu_enable. >> - Moved the mobility_pmu_disable/enable declaration from >>rtas.h to perf event header file. >> >> Addressed review comments from Nathan. >> - Moved the mobility function calls from stop_machine >>context out to pseries_migrate_partition. Also now this >>is a per cpu invocation. >> >> Change from v1 -> v2: >> - Moved the mobility_pmu_enable and mobility_pmu_disable >>declarations under CONFIG_PPC_PERF_CTRS in rtas.h. >>Also included 'asm/rtas.h' in core-book3s to fix the >>compilation warning reported by kernel test robot. >> >> arch/powerpc/include/asm/perf_event.h | 8 + >> arch/powerpc/perf/core-book3s.c | 39 +++ >> arch/powerpc/platforms/pseries/mobility.c | 7 >> 3 files changed, 54 insertions(+) >> >> diff --git a/arch/powerpc/include/asm/perf_event.h >> b/arch/powerpc/include/asm/perf_event.h >> index 164e910bf654..88aab6cf840c 100644 >> --- a/arch/powerpc/include/asm/perf_event.h >> +++ b/arch/powerpc/include/asm/perf_event.h >> @@ -17,6 +17,14 @@ static inline bool is_sier_available(void) { return >> false; } >> static inline unsigned long get_pmcs_ext_regs(int idx) { return 0; } >> #endif >> >> +#ifdef CONFIG_PPC_PERF_CTRS >> +void mobility_pmu_disable(void *unused); >> +void mobility_pmu_enable(void *unused); >> +#else >> +static inline void mobility_pmu_disable(void *unused) { } >> +static inline void mobility_pmu_enable(void *unused) { } >> +#endif >> + >> #ifdef CONFIG_FSL_EMB_PERF_EVENT >> #include >> #endif >> diff --git
Re: Linux kernel: powerpc: KVM guest can trigger host crash on Power8
Hi Nicholas! On 10/29/21 02:41, Nicholas Piggin wrote: > Soft lockup should mean it's taking timer interrupts still, just not > scheduling. Do you have the hard lockup detector enabled as well? Is > there anything stuck spinning on another CPU? I haven't enabled it. But looking at the documentation [1] it seems we could use it to print a backtrace once the lockup occurs. > Do you have the full dmesg / kernel log for this boot? I do, uploaded the messages file here: https://people.debian.org/~glaubitz/messages-kvm-lockup.gz Also, I noticed there is actually a backtrace: Oct 25 17:02:31 watson kernel: [14104.902061] (detected by 80, t=5252 jiffies, g=49897, q=37) Oct 25 17:02:31 watson kernel: [14104.902072] Sending NMI from CPU 80 to CPUs 136: Oct 25 17:02:31 watson kernel: [14108.253972] Modules linked in: dm_mod(E) vhost_net(E) vhost(E) vhost_iotlb(E) tap(E) tun(E) kvm_hv(E) kvm_pr(E) kvm(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_counter(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) xfs(E) ecb(E) xts(E) sg(E) ctr(E) vmx_crypto(E) gf128mul(E) ipmi_powernv(E) powernv_rng(E) ipmi_devintf(E) rng_core(E) ipmi_msghandler(E) powernv_op_panel(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) iscsi_tcp(E) libiscsi_tcp(E) sunrpc(E) libiscsi(E) drm(E) scsi_transport_iscsi(E) fuse(E) drm_panel_orientation_quirks(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sr_mod(E) sd_mod(E) ses(E) cdrom(E) enclosure(E) t10_pi(E) crc_t10dif(E) scsi_transport_sas(E) crct10dif_generic(E) crct10dif_common(E) btrfs(E) blake2b_generic(E) zstd_compress(E) raid10(E) raid456(E) Oct 25 17:02:31 watson kernel: [14108.254101] async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) xhci_pci(E) xhci_hcd(E) e1000e(E) usbcore(E) ptp(E) pps_core(E) ipr(E) usb_common(E) Oct 25 17:02:31 watson kernel: [14108.254139] CPU: 104 PID: 175 Comm: migration/104 Tainted: GE 5.14.0-0.bpo.2-powerpc64le #1 Debian 5.14.9-2~bpo11+2 Oct 25 17:02:31 watson kernel: [14108.254146] Stopper: multi_cpu_stop+0x0/0x240 <- migrate_swap+0xf8/0x240 Oct 25 17:02:31 watson kernel: [14108.254160] NIP: c01f6a58 LR: c026b734 CTR: c026b5c0 Oct 25 17:02:31 watson kernel: [14108.254163] REGS: c01001237970 TRAP: 0900 Tainted: GE (5.14.0-0.bpo.2-powerpc64le Debian 5.14.9-2~bpo11+2) Oct 25 17:02:31 watson kernel: [14108.254168] MSR: 90009033 CR: 28002442 XER: 2000 Oct 25 17:02:31 watson kernel: [14108.254183] CFAR: c026b730 IRQMASK: 0 Oct 25 17:02:31 watson kernel: [14108.254183] GPR00: c026b32c c01001237c10 c166ce00 c0d02c30 Oct 25 17:02:31 watson kernel: [14108.254183] GPR04: c01806433198 c01806433198 5687ca06 Oct 25 17:02:31 watson kernel: [14108.254183] GPR08: c017fc8948a0 c017fc894780 0004 c0080a80e378 Oct 25 17:02:31 watson kernel: [14108.254183] GPR12: c0175a00 c0173ec8 c194c080 Oct 25 17:02:31 watson kernel: [14108.254183] GPR16: Oct 25 17:02:31 watson kernel: [14108.254183] GPR20: c01806433170 0001 Oct 25 17:02:31 watson kernel: [14108.254183] GPR24: 0002 0003 c0d02c30 Oct 25 17:02:31 watson kernel: [14108.254183] GPR28: 0001 c01806433170 c01806433194 0001 Oct 25 17:02:31 watson kernel: [14108.254240] NIP [c01f6a58] rcu_momentary_dyntick_idle+0x48/0x60 Oct 25 17:02:31 watson kernel: [14108.254245] LR [c026b734] multi_cpu_stop+0x174/0x240 Oct 25 17:02:31 watson kernel: [14108.254251] Call Trace: Oct 25 17:02:31 watson kernel: [14108.254253] [c01001237c10] [c01001237c80] 0xc01001237c80 (unreliable) Oct 25 17:02:31 watson kernel: [14108.254260] [c01001237c80] [c026b32c] cpu_stopper_thread+0x16c/0x280 Oct 25 17:02:31 watson kernel: [14108.254267] [c01001237d40] [c017ad4c] smpboot_thread_fn+0x1ec/0x260 Oct 25 17:02:31 watson kernel: [14108.254273] [c01001237da0] [c017403c] kthread+0x17c/0x190 Oct 25 17:02:31 watson kernel: [14108.254280] [c01001237e10] [c000cf64] ret_from_kernel_thread+0x5c/0x64 Oct 25 17:02:31 watson kernel: [14108.254287] Instruction dump: Oct 25 17:02:31 watson kernel: [14108.254289] 394a7aa4 39297980 7cc751ae e94d0030 7d295214 39090120 7c0004ac 3944 Oct 25 17:02:31 watson kernel: [14108.254301] 7ce04028 7cea3a14 7ce0412d 40c2fff4 <7c0004ac> 70e90002 4c820020 0fe0 Oct 25 17:02:31 watson
Re: [PATCH 2/2] powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi
Le 29/10/2021 à 10:39, Nicholas Piggin a écrit : There is a deadlock with the console_owner lock and the wd_smp_lock: CPU x takes the console_owner lock CPU y takes a watchdog timer interrupt and takes __wd_smp_lock CPU x takes a watchdog timer interrupt and spins on __wd_smp_lock CPU y detects a deadlock, tries to print something and spins on console_owner -> deadlock Change the watchdog locking scheme so wd_smp_lock protects the watchdog internal data, but "reporting" (printing, issuing NMI IPIs, taking any action outside of watchdog) uses a non-waiting exclusion. If a CPU detects a problem but can not take the reporting lock, it just returns because something else is already reporting. It will try again at some point. Typically hard lockup watchdog report usefulness is not impacted due to failure to spewing a large enough amount of data in as short a time as possible, but by messages getting garbled. Laurent debugged this and found the deadlock, and this patch is based on his general approach to avoid expensive operations while holding the lock. With the addition of the reporting exclusion. Signed-off-by: Laurent Dufour [np: rework to add reporting exclusion update changelog] Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 76 ++ 1 file changed, 59 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 4bb7c8e371a2..69a475aa0f44 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -85,10 +85,32 @@ static DEFINE_PER_CPU(u64, wd_timer_tb); /* SMP checker bits */ static unsigned long __wd_smp_lock; +static unsigned long __wd_reporting; static cpumask_t wd_smp_cpus_pending; static cpumask_t wd_smp_cpus_stuck; static u64 wd_smp_last_reset_tb; +/* + * Try to take the exclusive watchdog action / NMI IPI / printing lock. + * wd_smp_lock must be held. If this fails, we should return and wait + * for the watchdog to kick in again (or another CPU to trigger it). + */ +static bool wd_try_report(void) +{ + if (__wd_reporting) + return false; + __wd_reporting = 1; + return true; +} + +/* End printing after successful wd_try_report. wd_smp_lock not required. */ +static void wd_end_reporting(void) +{ + smp_mb(); /* End printing "critical section" */ + WARN_ON_ONCE(__wd_reporting == 0); + WRITE_ONCE(__wd_reporting, 0); +} + static inline void wd_smp_lock(unsigned long *flags) { /* @@ -131,10 +153,10 @@ static void wd_lockup_ipi(struct pt_regs *regs) /* Do not panic from here because that can recurse into NMI IPI layer */ } -static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) +static void set_cpu_stuck(int cpu, u64 tb) { - cpumask_or(_smp_cpus_stuck, _smp_cpus_stuck, cpumask); - cpumask_andnot(_smp_cpus_pending, _smp_cpus_pending, cpumask); + cpumask_set_cpu(cpu, _smp_cpus_stuck); + cpumask_clear_cpu(cpu, _smp_cpus_pending); if (cpumask_empty(_smp_cpus_pending)) { wd_smp_last_reset_tb = tb; cpumask_andnot(_smp_cpus_pending, @@ -142,10 +164,6 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) _smp_cpus_stuck); } } -static void set_cpu_stuck(int cpu, u64 tb) -{ - set_cpumask_stuck(cpumask_of(cpu), tb); -} static void watchdog_smp_panic(int cpu, u64 tb) { @@ -160,6 +178,9 @@ static void watchdog_smp_panic(int cpu, u64 tb) goto out; if (cpumask_weight(_smp_cpus_pending) == 0) goto out; + if (!wd_try_report()) + goto out; + wd_smp_unlock(); pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n", cpu, cpumask_pr_args(_smp_cpus_pending)); @@ -172,24 +193,32 @@ static void watchdog_smp_panic(int cpu, u64 tb) * Try to trigger the stuck CPUs, unless we are going to * get a backtrace on all of them anyway. */ - for_each_cpu(c, _smp_cpus_pending) { + for_each_online_cpu(c) { if (c == cpu) continue; + if (!cpumask_test_cpu(cpu, _smp_cpus_pending)) ^ c cpu is the reporting CPU, c is the target here. + continue; + wd_smp_lock(); + if (!cpumask_test_cpu(cpu, _smp_cpus_pending)) { ^ again c + wd_smp_unlock(); + continue; + } + /* Take the stuck CPU out of the watch group */ + set_cpu_stuck(cpu, tb); ^ c + wd_smp_unlock(); +
Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Hi Athira, Thank you for the patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on tip/perf/core v5.15-rc7 next-20211028] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Enable-PMU-counters-post-partition-migration-if-PMU-is-active/20211029-110804 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc64-randconfig-r003-20211028 (attached as .config) compiler: powerpc64-linux-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/caeb6bb8d2f36e6b15151587193c1e0a9e62ab16 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Athira-Rajeev/powerpc-perf-Enable-PMU-counters-post-partition-migration-if-PMU-is-active/20211029-110804 git checkout caeb6bb8d2f36e6b15151587193c1e0a9e62ab16 # save the attached .config to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=powerpc SHELL=/bin/bash arch/powerpc/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): arch/powerpc/platforms/pseries/mobility.c: In function 'pseries_migrate_partition': >> arch/powerpc/platforms/pseries/mobility.c:670:22: error: >> 'mobility_pmu_disable' undeclared (first use in this function) 670 | on_each_cpu(_pmu_disable, NULL, 0); | ^~~~ arch/powerpc/platforms/pseries/mobility.c:670:22: note: each undeclared identifier is reported only once for each function it appears in >> arch/powerpc/platforms/pseries/mobility.c:679:22: error: >> 'mobility_pmu_enable' undeclared (first use in this function) 679 | on_each_cpu(_pmu_enable, NULL, 0); | ^~~ vim +/mobility_pmu_disable +670 arch/powerpc/platforms/pseries/mobility.c 660 661 static int pseries_migrate_partition(u64 handle) 662 { 663 int ret; 664 665 ret = wait_for_vasi_session_suspending(handle); 666 if (ret) 667 return ret; 668 669 /* Disable PMU before suspend */ > 670 on_each_cpu(_pmu_disable, NULL, 0); 671 672 ret = pseries_suspend(handle); 673 if (ret == 0) 674 post_mobility_fixup(); 675 else 676 pseries_cancel_migration(handle, ret); 677 678 /* Enable PMU after resuming */ > 679 on_each_cpu(_pmu_enable, NULL, 0); 680 681 return ret; 682 } 683 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
[PATCH 2/2] powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi
There is a deadlock with the console_owner lock and the wd_smp_lock: CPU x takes the console_owner lock CPU y takes a watchdog timer interrupt and takes __wd_smp_lock CPU x takes a watchdog timer interrupt and spins on __wd_smp_lock CPU y detects a deadlock, tries to print something and spins on console_owner -> deadlock Change the watchdog locking scheme so wd_smp_lock protects the watchdog internal data, but "reporting" (printing, issuing NMI IPIs, taking any action outside of watchdog) uses a non-waiting exclusion. If a CPU detects a problem but can not take the reporting lock, it just returns because something else is already reporting. It will try again at some point. Typically hard lockup watchdog report usefulness is not impacted due to failure to spewing a large enough amount of data in as short a time as possible, but by messages getting garbled. Laurent debugged this and found the deadlock, and this patch is based on his general approach to avoid expensive operations while holding the lock. With the addition of the reporting exclusion. Signed-off-by: Laurent Dufour [np: rework to add reporting exclusion update changelog] Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 76 ++ 1 file changed, 59 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 4bb7c8e371a2..69a475aa0f44 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -85,10 +85,32 @@ static DEFINE_PER_CPU(u64, wd_timer_tb); /* SMP checker bits */ static unsigned long __wd_smp_lock; +static unsigned long __wd_reporting; static cpumask_t wd_smp_cpus_pending; static cpumask_t wd_smp_cpus_stuck; static u64 wd_smp_last_reset_tb; +/* + * Try to take the exclusive watchdog action / NMI IPI / printing lock. + * wd_smp_lock must be held. If this fails, we should return and wait + * for the watchdog to kick in again (or another CPU to trigger it). + */ +static bool wd_try_report(void) +{ + if (__wd_reporting) + return false; + __wd_reporting = 1; + return true; +} + +/* End printing after successful wd_try_report. wd_smp_lock not required. */ +static void wd_end_reporting(void) +{ + smp_mb(); /* End printing "critical section" */ + WARN_ON_ONCE(__wd_reporting == 0); + WRITE_ONCE(__wd_reporting, 0); +} + static inline void wd_smp_lock(unsigned long *flags) { /* @@ -131,10 +153,10 @@ static void wd_lockup_ipi(struct pt_regs *regs) /* Do not panic from here because that can recurse into NMI IPI layer */ } -static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) +static void set_cpu_stuck(int cpu, u64 tb) { - cpumask_or(_smp_cpus_stuck, _smp_cpus_stuck, cpumask); - cpumask_andnot(_smp_cpus_pending, _smp_cpus_pending, cpumask); + cpumask_set_cpu(cpu, _smp_cpus_stuck); + cpumask_clear_cpu(cpu, _smp_cpus_pending); if (cpumask_empty(_smp_cpus_pending)) { wd_smp_last_reset_tb = tb; cpumask_andnot(_smp_cpus_pending, @@ -142,10 +164,6 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) _smp_cpus_stuck); } } -static void set_cpu_stuck(int cpu, u64 tb) -{ - set_cpumask_stuck(cpumask_of(cpu), tb); -} static void watchdog_smp_panic(int cpu, u64 tb) { @@ -160,6 +178,9 @@ static void watchdog_smp_panic(int cpu, u64 tb) goto out; if (cpumask_weight(_smp_cpus_pending) == 0) goto out; + if (!wd_try_report()) + goto out; + wd_smp_unlock(); pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n", cpu, cpumask_pr_args(_smp_cpus_pending)); @@ -172,24 +193,32 @@ static void watchdog_smp_panic(int cpu, u64 tb) * Try to trigger the stuck CPUs, unless we are going to * get a backtrace on all of them anyway. */ - for_each_cpu(c, _smp_cpus_pending) { + for_each_online_cpu(c) { if (c == cpu) continue; + if (!cpumask_test_cpu(cpu, _smp_cpus_pending)) + continue; + wd_smp_lock(); + if (!cpumask_test_cpu(cpu, _smp_cpus_pending)) { + wd_smp_unlock(); + continue; + } + /* Take the stuck CPU out of the watch group */ + set_cpu_stuck(cpu, tb); + wd_smp_unlock(); + smp_send_nmi_ipi(c, wd_lockup_ipi, 100); } } - /* Take the stuck CPUs out of the watch group */ - set_cpumask_stuck(_smp_cpus_pending, tb); - - wd_smp_unlock(); - if (sysctl_hardlockup_all_cpu_backtrace)
[PATCH 1/2] powerpc/watchdog: Fix missed watchdog reset due to memory ordering race
It is possible for all CPUs to miss the pending cpumask becoming clear, and then nobody resetting it, which will cause the lockup detector to stop working. It will eventually expire, but watchdog_smp_panic will avoid doing anything if the pending mask is clear and it will never be reset. Order the cpumask clear vs the subsequent test to close this race. Add an extra check for an empty pending mask when the watchdog fires and finds its bit still clear, to try to catch any other possible races or bugs here and keep the watchdog working. The extra test in arch_touch_nmi_watchdog is required to prevent the new warning from firing off. Debugged-by: Laurent Dufour Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index f9ea0e5357f9..4bb7c8e371a2 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -215,13 +215,38 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) cpumask_clear_cpu(cpu, _smp_cpus_stuck); wd_smp_unlock(); + } else { + /* +* The last CPU to clear pending should have reset the +* watchdog, yet we find it empty here. This should not +* happen but we can try to recover and avoid a false +* positive if it does. +*/ + if (WARN_ON_ONCE(cpumask_empty(_smp_cpus_pending))) + goto none_pending; } return; } + cpumask_clear_cpu(cpu, _smp_cpus_pending); + /* +* Order the store to clear pending with the load(s) to check all +* words in the pending mask to check they are all empty. This orders +* with the same barrier on another CPU. This prevents two CPUs +* clearing the last 2 pending bits, but neither seeing the other's +* store when checking if the mask is empty, and missing an empty +* mask, which ends with a false positive. +*/ + smp_mb(); if (cpumask_empty(_smp_cpus_pending)) { unsigned long flags; +none_pending: + /* +* Double check under lock because more than one CPU could see +* a clear mask with the lockless check after clearing their +* pending bits. +*/ wd_smp_lock(); if (cpumask_empty(_smp_cpus_pending)) { wd_smp_last_reset_tb = tb; @@ -312,8 +337,12 @@ void arch_touch_nmi_watchdog(void) { unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000; int cpu = smp_processor_id(); - u64 tb = get_tb(); + u64 tb; + if (!cpumask_test_cpu(cpu, _cpumask)) + return; + + tb = get_tb(); if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) { per_cpu(wd_timer_tb, cpu) = tb; wd_smp_clear_cpu_pending(cpu, tb); -- 2.23.0
Re: [PATCH] mm/migrate.c: Remove MIGRATE_PFN_LOCKED
On Friday, 29 October 2021 2:33:31 AM AEDT Felix Kuehling wrote: > Am 2021-10-27 um 9:42 p.m. schrieb Alistair Popple: > > On Wednesday, 27 October 2021 3:09:57 AM AEDT Felix Kuehling wrote: > >> Am 2021-10-25 um 12:16 a.m. schrieb Alistair Popple: > >>> MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a > >>> source page was already locked during migrate_vma_collect(). If it > >>> wasn't then the a second attempt is made to lock the page. However if > >>> the first attempt failed it's unlikely a second attempt will succeed, > >>> and the retry adds complexity. So clean this up by removing the retry > >>> and MIGRATE_PFN_LOCKED flag. > >>> > >>> Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag > >>> set, but nothing actually checks that. > >>> > >>> Signed-off-by: Alistair Popple > >> It makes sense to me. Do you have any empirical data on how much more > >> likely migrations are going to fail with this change due to contested > >> page locks? > > Thanks Felix. I do not have any empirical data on this but I've mostly seen > > migrations fail due to the reference count check failing rather than > > failure to > > lock the page. Even then it's mostly been due to thrashing on the same > > page, so > > I would be surprised if this change made any noticeable difference. > > We have seen more page locking contention on NUMA systems that disappear > when we disable NUMA balancing. Probably NUMA balancing migrations > result in the page lock being more contended, which can cause HMM > migration of some pages to fail. Yeah, we've found NUMA balancing in general is pretty unhelpful for HMM based migrations and mappings so have been looking into ways of disabling it for HMM ranges. > Also, for migrations to system memory, multiple threads page faulting > concurrently can cause contention. I was just helping debug such an > issue. Having migrations to system memory be only partially successful > is problematic. We'll probably have to implement some retry logic in the > driver to handle this. Sounds similar to the problem I was referring to, except in my case I was seeing contention on the page reference checks due to lots of threads hitting __migration_entry_wait() at just the wrong time. I am working on a fix for that that avoids taking the reference at all, however I think retry logic will still be needed and suspect a driver is probably the best place to implement that. > Regards, > Felix > > > > > >> Either way, the patch is > >> > >> Acked-by: Felix Kuehling > >> > >> > >>> --- > >>> Documentation/vm/hmm.rst | 2 +- > >>> arch/powerpc/kvm/book3s_hv_uvmem.c | 4 +- > >>> drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 - > >>> drivers/gpu/drm/nouveau/nouveau_dmem.c | 4 +- > >>> include/linux/migrate.h | 1 - > >>> lib/test_hmm.c | 5 +- > >>> mm/migrate.c | 145 +-- > >>> 7 files changed, 35 insertions(+), 128 deletions(-) > >>> > >>> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst > >>> index a14c2938e7af..f2a59ed82ed3 100644 > >>> --- a/Documentation/vm/hmm.rst > >>> +++ b/Documentation/vm/hmm.rst > >>> @@ -360,7 +360,7 @@ between device driver specific code and shared common > >>> code: > >>> system memory page, locks the page with ``lock_page()``, and fills in > >>> the > >>> ``dst`` array entry with:: > >>> > >>> - dst[i] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; > >>> + dst[i] = migrate_pfn(page_to_pfn(dpage)); > >>> > >>> Now that the driver knows that this page is being migrated, it can > >>> invalidate device private MMU mappings and copy device private memory > >>> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c > >>> b/arch/powerpc/kvm/book3s_hv_uvmem.c > >>> index a7061ee3b157..28c436df9935 100644 > >>> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c > >>> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c > >>> @@ -560,7 +560,7 @@ static int __kvmppc_svm_page_out(struct > >>> vm_area_struct *vma, > >>> gpa, 0, page_shift); > >>> > >>> if (ret == U_SUCCESS) > >>> - *mig.dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED; > >>> + *mig.dst = migrate_pfn(pfn); > >>> else { > >>> unlock_page(dpage); > >>> __free_page(dpage); > >>> @@ -774,7 +774,7 @@ static int kvmppc_svm_page_in(struct vm_area_struct > >>> *vma, > >>> } > >>> } > >>> > >>> - *mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; > >>> + *mig.dst = migrate_pfn(page_to_pfn(dpage)); > >>> migrate_vma_pages(); > >>> out_finalize: > >>> migrate_vma_finalize(); > >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > >>> b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > >>> index 4a16e3c257b9..41d9417f182b 100644 > >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c > >>> @@
Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Excerpts from Athira Rajeev's message of October 29, 2021 1:05 pm: > During Live Partition Migration (LPM), it is observed that perf > counter values reports zero post migration completion. However > 'perf stat' with workload continues to show counts post migration > since PMU gets disabled/enabled during sched switches. But incase > of system/cpu wide monitoring, zero counts were reported with 'perf > stat' after migration completion. > > Example: > ./perf stat -e r1001e -I 1000 >time counts unit events > 1.001010437 22,137,414 r1001e > 2.002495447 15,455,821 r1001e > <<>> As seen in next below logs, the counter values shows zero > after migration is completed. > <<>> > 86.142535370129,392,333,440 r1001e > 87.144714617 0 r1001e > 88.146526636 0 r1001e > 89.148085029 0 r1001e This is the output without the patch? After the patch it keeps counting I suppose? And does the very large count go away too? > > Here PMU is enabled during start of perf session and counter > values are read at intervals. Counters are only disabled at the > end of session. The powerpc mobility code presently does not handle > disabling and enabling back of PMU counters during partition > migration. Also since the PMU register values are not saved/restored > during migration, PMU registers like Monitor Mode Control Register 0 > (MMCR0), Monitor Mode Control Register 1 (MMCR1) will not contain > the value it was programmed with. Hence PMU counters will not be > enabled correctly post migration. > > Fix this in mobility code by handling disabling and enabling of > PMU in all cpu's before and after migration. Patch introduces two > functions 'mobility_pmu_disable' and 'mobility_pmu_enable'. > mobility_pmu_disable() is called before the processor threads goes > to suspend state so as to disable the PMU counters. And disable is > done only if there are any active events running on that cpu. > mobility_pmu_enable() is called after the migrate is done to enable > back the PMU counters. > > Since the performance Monitor counters ( PMCs) are not > saved/restored during LPM, results in PMC value being zero and the > 'event->hw.prev_count' being non-zero value. This causes problem > during updation of event->count since we always accumulate > (event->hw.prev_count - PMC value) in event->count. If > event->hw.prev_count is greater PMC value, event->count becomes > negative. To fix this, 'prev_count' also needs to be re-initialised > for all events while enabling back the events. Hence read the > existing events and clear the PMC index (stored in event->hw.idx) > for all events im mobility_pmu_disable. By this way, event count > settings will get re-initialised correctly in power_pmu_enable. > > Signed-off-by: Athira Rajeev > [ Fixed compilation error reported by kernel test robot ] > Reported-by: kernel test robot > --- > Changelog: > Change from v2 -> v3: > Addressed review comments from Nicholas Piggin. > - Removed the "migrate" field which was added in initial >patch to address updation of event count settings correctly >in power_pmu_enable. Instead read off existing events in >mobility_pmu_disable before power_pmu_enable. > - Moved the mobility_pmu_disable/enable declaration from >rtas.h to perf event header file. > > Addressed review comments from Nathan. > - Moved the mobility function calls from stop_machine >context out to pseries_migrate_partition. Also now this >is a per cpu invocation. > > Change from v1 -> v2: > - Moved the mobility_pmu_enable and mobility_pmu_disable >declarations under CONFIG_PPC_PERF_CTRS in rtas.h. >Also included 'asm/rtas.h' in core-book3s to fix the >compilation warning reported by kernel test robot. > > arch/powerpc/include/asm/perf_event.h | 8 + > arch/powerpc/perf/core-book3s.c | 39 +++ > arch/powerpc/platforms/pseries/mobility.c | 7 > 3 files changed, 54 insertions(+) > > diff --git a/arch/powerpc/include/asm/perf_event.h > b/arch/powerpc/include/asm/perf_event.h > index 164e910bf654..88aab6cf840c 100644 > --- a/arch/powerpc/include/asm/perf_event.h > +++ b/arch/powerpc/include/asm/perf_event.h > @@ -17,6 +17,14 @@ static inline bool is_sier_available(void) { return false; > } > static inline unsigned long get_pmcs_ext_regs(int idx) { return 0; } > #endif > > +#ifdef CONFIG_PPC_PERF_CTRS > +void mobility_pmu_disable(void *unused); > +void mobility_pmu_enable(void *unused); > +#else > +static inline void mobility_pmu_disable(void *unused) { } > +static inline void mobility_pmu_enable(void *unused) { } > +#endif > + > #ifdef CONFIG_FSL_EMB_PERF_EVENT > #include > #endif > diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c > index 73e62e9b179b..2e8c4c668fa3 100644 > --- a/arch/powerpc/perf/core-book3s.c > +++