[PATCH v3] mm: pmd dirty emulation in page fault handler
Andreas reported [1] made a test in jemalloc hang in THP mode in arm64. http://lkml.kernel.org/r/mvmmvfy37g1@hawking.suse.de The problem is page fault handler supports only accessed flag emulation for THP page of SW-dirty/accessed architecture. This patch enables dirty-bit emulation for those architectures. Without it, MADV_FREE makes application hang by repeated fault forever. [1] b8d3c4c3009d, mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called Cc: Jason EvansCc: Kirill A. Shutemov Cc: Will Deacon Cc: Catalin Marinas Cc: linux-a...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: [4.5+] Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called") Reported-and-Tested-by: Andreas Schwab Acked-by: Kirill A. Shutemov Signed-off-by: Minchan Kim --- * from v2 * Add acked-by/tested-by * from v1 * Remove __handle_mm_fault part - Kirill mm/huge_memory.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 10eedbf..29ec8a4 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd) { pmd_t entry; unsigned long haddr; + bool write = vmf->flags & FAULT_FLAG_WRITE; vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd); if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) goto unlock; entry = pmd_mkyoung(orig_pmd); + if (write) + entry = pmd_mkdirty(entry); haddr = vmf->address & HPAGE_PMD_MASK; - if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, - vmf->flags & FAULT_FLAG_WRITE)) + if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write)) update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd); unlock: -- 2.7.4
[PATCH v3] mm: pmd dirty emulation in page fault handler
Andreas reported [1] made a test in jemalloc hang in THP mode in arm64. http://lkml.kernel.org/r/mvmmvfy37g1@hawking.suse.de The problem is page fault handler supports only accessed flag emulation for THP page of SW-dirty/accessed architecture. This patch enables dirty-bit emulation for those architectures. Without it, MADV_FREE makes application hang by repeated fault forever. [1] b8d3c4c3009d, mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called Cc: Jason Evans Cc: Kirill A. Shutemov Cc: Will Deacon Cc: Catalin Marinas Cc: linux-a...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: [4.5+] Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called") Reported-and-Tested-by: Andreas Schwab Acked-by: Kirill A. Shutemov Signed-off-by: Minchan Kim --- * from v2 * Add acked-by/tested-by * from v1 * Remove __handle_mm_fault part - Kirill mm/huge_memory.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 10eedbf..29ec8a4 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd) { pmd_t entry; unsigned long haddr; + bool write = vmf->flags & FAULT_FLAG_WRITE; vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd); if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) goto unlock; entry = pmd_mkyoung(orig_pmd); + if (write) + entry = pmd_mkdirty(entry); haddr = vmf->address & HPAGE_PMD_MASK; - if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, - vmf->flags & FAULT_FLAG_WRITE)) + if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write)) update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd); unlock: -- 2.7.4
Re: [RFC PATCH v2] crypto: Add IV generation algorithms
On Thu, Dec 22, 2016 at 04:25:12PM +0530, Binoy Jayan wrote: > > > It doesn't have to live outside of dm-crypt. You can register > > these IV generators from there if you really want. > > Sorry, but I didn't understand this part. What I mean is that moving the IV generators into the crypto API does not mean the dm-crypt team giving up control over them. You could continue to keep them within the dm-crypt code base and still register them through the normal crypto API mechanism. Cheers, -- Email: Herbert XuHome Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [RFC PATCH v2] crypto: Add IV generation algorithms
On Thu, Dec 22, 2016 at 04:25:12PM +0530, Binoy Jayan wrote: > > > It doesn't have to live outside of dm-crypt. You can register > > these IV generators from there if you really want. > > Sorry, but I didn't understand this part. What I mean is that moving the IV generators into the crypto API does not mean the dm-crypt team giving up control over them. You could continue to keep them within the dm-crypt code base and still register them through the normal crypto API mechanism. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [Intel-gfx] [PATCH] drm/i915: check if execlist_port is empty before using its content
On Fri, Dec 23, 2016 at 01:46:36PM +0800, changbin...@intel.com wrote: > From: "Du, Changbin"> > This patch fix a crash in function reset_common_ring. In this case, > the port[0].request is null when reset the render ring, so a null > dereference exception is raised. We need to check execlist_port status > first. No. The root cause is whatever got you into the illegal condition in the first place. -Chris -- Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: check if execlist_port is empty before using its content
On Fri, Dec 23, 2016 at 01:46:36PM +0800, changbin...@intel.com wrote: > From: "Du, Changbin" > > This patch fix a crash in function reset_common_ring. In this case, > the port[0].request is null when reset the render ring, so a null > dereference exception is raised. We need to check execlist_port status > first. No. The root cause is whatever got you into the illegal condition in the first place. -Chris -- Chris Wilson, Intel Open Source Technology Centre
Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0
On Thu, Dec 22, 2016 at 12:22:27PM -0800, Hugh Dickins wrote: > On Wed, 21 Dec 2016, Linus Torvalds wrote: > > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinnerwrote: > > > I unmounted the fs, mkfs'd it again, ran the > > > workload again and about a minute in this fired: > > > > > > [628867.607417] [ cut here ] > > > [628867.608603] WARNING: CPU: 2 PID: 16925 at mm/workingset.c:461 > > > shadow_lru_isolate+0x171/0x220 > > > > Well, part of the changes during the merge window were the shadow > > entry tracking changes that came in through Andrew's tree. Adding > > Johannes Weiner to the participants. > > > > > Now, this workload does not touch the page cache at all - it's > > > entirely an XFS metadata workload, so it should not really be > > > affecting the working set code. > > > > Well, I suspect that anything that creates memory pressure will end up > > triggering the working set code, so .. > > > > That said, obviously memory corruption could be involved and result in > > random issues too, but I wouldn't really expect that in this code. > > > > It would probably be really useful to get more data points - is the > > problem reliably in this area, or is it going to be random and all > > over the place. > > Data point: kswapd got WARNING on mm/workingset.c:457 in shadow_lru_isolate, > soon followed by NULL pointer deref in list_lru_isolate, one time when > I tried out Sunday's git tree. Not seen since, I haven't had time to > investigate, just set it aside as something to worry about if it happens > again. But it looks like shadow_lru_isolate() has issues beyond Dave's > case (I've no XFS and no iscsi), suspect unrelated to his other problems. This seems consistent with what Dave observed: we encounter regular pages in radix tree nodes on the shadow LRU that should only contain nodes full of exceptional shadow entries. It could be an issue in the new slot replacement code and the node tracking callback. Looking...
Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0
On Thu, Dec 22, 2016 at 12:22:27PM -0800, Hugh Dickins wrote: > On Wed, 21 Dec 2016, Linus Torvalds wrote: > > On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote: > > > I unmounted the fs, mkfs'd it again, ran the > > > workload again and about a minute in this fired: > > > > > > [628867.607417] [ cut here ] > > > [628867.608603] WARNING: CPU: 2 PID: 16925 at mm/workingset.c:461 > > > shadow_lru_isolate+0x171/0x220 > > > > Well, part of the changes during the merge window were the shadow > > entry tracking changes that came in through Andrew's tree. Adding > > Johannes Weiner to the participants. > > > > > Now, this workload does not touch the page cache at all - it's > > > entirely an XFS metadata workload, so it should not really be > > > affecting the working set code. > > > > Well, I suspect that anything that creates memory pressure will end up > > triggering the working set code, so .. > > > > That said, obviously memory corruption could be involved and result in > > random issues too, but I wouldn't really expect that in this code. > > > > It would probably be really useful to get more data points - is the > > problem reliably in this area, or is it going to be random and all > > over the place. > > Data point: kswapd got WARNING on mm/workingset.c:457 in shadow_lru_isolate, > soon followed by NULL pointer deref in list_lru_isolate, one time when > I tried out Sunday's git tree. Not seen since, I haven't had time to > investigate, just set it aside as something to worry about if it happens > again. But it looks like shadow_lru_isolate() has issues beyond Dave's > case (I've no XFS and no iscsi), suspect unrelated to his other problems. This seems consistent with what Dave observed: we encounter regular pages in radix tree nodes on the shadow LRU that should only contain nodes full of exceptional shadow entries. It could be an issue in the new slot replacement code and the node tracking callback. Looking...
Re: [PATCH V5 1/3] ARM64 LPC: Indirect ISA port IO introduced
On Fri, Dec 23, 2016 at 9:43 AM, zhichang.yuanwrote: > Hi,Ming, > > > On 2016/12/22 16:15, Ming Lei wrote: >> Hi Guys, >> >> On Tue, Nov 8, 2016 at 11:47 AM, zhichang.yuan >> wrote: >>> For arm64, there is no I/O space as other architectural platforms, such as >>> X86. Most I/O accesses are achieved based on MMIO. But for some arm64 SoCs, >>> such as Hip06, when accessing some legacy ISA devices connected to LPC, >>> those >>> known port addresses are used to control the corresponding target devices, >>> for >>> example, 0x2f8 is for UART, 0xe4 is for ipmi-bt. It is different from the >>> normal MMIO mode in using. >>> >>> To drive these devices, this patch introduces a method named indirect-IO. >>> In this method the in/out pair in arch/arm64/include/asm/io.h will be >>> redefined. When upper layer drivers call in/out with those known legacy port >>> addresses to access the peripherals, the hooking functions corrresponding to >>> those target peripherals will be called. Through this way, those upper layer >>> drivers which depend on in/out can run on Hip06 without any changes. >>> >>> Cc: Catalin Marinas >>> Cc: Will Deacon >>> Signed-off-by: zhichang.yuan >>> Signed-off-by: Gabriele Paoloni >>> --- >>> arch/arm64/Kconfig | 6 +++ >>> arch/arm64/include/asm/extio.h | 94 >>> ++ >>> arch/arm64/include/asm/io.h| 29 + >>> arch/arm64/kernel/Makefile | 1 + >>> arch/arm64/kernel/extio.c | 27 >>> 5 files changed, 157 insertions(+) >> >> When I applied these three patches against current linus tree and >> enable CONFIG_HISILICON_LPC, the following build failure[1] is >> triggered when running 'make modules'. >> > > Thanks for your report! > > This patch has compilation issue on some architectures, sorry for the > inconvenience caused by this! > The ongoing v6 will solve these issues. > I will trace this failure and provide a fix if you can not wait for the next > version. > > Could you send me your .config in private? I don't want to bother all the > hacker in the mail-list. > Sure, please see the config in attachment. > > Thanks, > Zhichang > >> >> Thanks, >> Ming >> >> [1] 'make modules' failure log >> >> Building modules, stage 2. >> MODPOST 2260 modules >> ERROR: "inb" [drivers/watchdog/wdt_pci.ko] undefined! >> ERROR: "outb" [drivers/watchdog/wdt_pci.ko] undefined! >> ERROR: "outb" [drivers/watchdog/pcwd_pci.ko] undefined! >> ERROR: "inb" [drivers/watchdog/pcwd_pci.ko] undefined! >> ERROR: "outw" [drivers/video/vgastate.ko] undefined! >> ERROR: "outb" [drivers/video/vgastate.ko] undefined! >> ERROR: "inb" [drivers/video/vgastate.ko] undefined! >> ERROR: "outw" [drivers/video/fbdev/vt8623fb.ko] undefined! >> ERROR: "inb" [drivers/video/fbdev/vt8623fb.ko] undefined! >> ERROR: "outb" [drivers/video/fbdev/vt8623fb.ko] undefined! >> ERROR: "outw" [drivers/video/fbdev/tridentfb.ko] undefined! >> ERROR: "inb" [drivers/video/fbdev/tridentfb.ko] undefined! >> ERROR: "outb" [drivers/video/fbdev/tridentfb.ko] undefined! >> ERROR: "inb" [drivers/video/fbdev/tdfxfb.ko] undefined! >> . >> ERROR: "inb" [drivers/ata/pata_cmd64x.ko] undefined! >> ERROR: "inb" [drivers/ata/pata_artop.ko] undefined! >> scripts/Makefile.modpost:91: recipe for target '__modpost' failed >> make[1]: *** [__modpost] Error 1 >> Makefile:1196: recipe for target 'modules' failed >> make: *** [modules] Error 2 >> >> >>> create mode 100644 arch/arm64/include/asm/extio.h >>> create mode 100644 arch/arm64/kernel/extio.c >>> >>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >>> index 969ef88..b44070b 100644 >>> --- a/arch/arm64/Kconfig >>> +++ b/arch/arm64/Kconfig >>> @@ -163,6 +163,12 @@ config ARCH_MMAP_RND_COMPAT_BITS_MIN >>> config ARCH_MMAP_RND_COMPAT_BITS_MAX >>> default 16 >>> >>> +config ARM64_INDIRECT_PIO >>> + bool "access peripherals with legacy I/O port" >>> + help >>> + Support special accessors for ISA I/O devices. This is needed for >>> + SoCs that do not support standard read/write for the ISA range. >>> + >>> config NO_IOPORT_MAP >>> def_bool y if !PCI >>> >>> diff --git a/arch/arm64/include/asm/extio.h b/arch/arm64/include/asm/extio.h >>> new file mode 100644 >>> index 000..6ae0787 >>> --- /dev/null >>> +++ b/arch/arm64/include/asm/extio.h >>> @@ -0,0 +1,94 @@ >>> +/* >>> + * Copyright (C) 2016 Hisilicon Limited, All Rights Reserved. >>> + * Author: Zhichang Yuan >>> + * >>> + * This program is free software; you can redistribute it and/or modify >>> + * it under the terms of the GNU General Public License version 2 as >>> + * published by the Free Software Foundation. >>> + * >>> + * This program is distributed in the hope that it will be useful, >>> + * but WITHOUT ANY
Re: [PATCH V5 1/3] ARM64 LPC: Indirect ISA port IO introduced
On Fri, Dec 23, 2016 at 9:43 AM, zhichang.yuan wrote: > Hi,Ming, > > > On 2016/12/22 16:15, Ming Lei wrote: >> Hi Guys, >> >> On Tue, Nov 8, 2016 at 11:47 AM, zhichang.yuan >> wrote: >>> For arm64, there is no I/O space as other architectural platforms, such as >>> X86. Most I/O accesses are achieved based on MMIO. But for some arm64 SoCs, >>> such as Hip06, when accessing some legacy ISA devices connected to LPC, >>> those >>> known port addresses are used to control the corresponding target devices, >>> for >>> example, 0x2f8 is for UART, 0xe4 is for ipmi-bt. It is different from the >>> normal MMIO mode in using. >>> >>> To drive these devices, this patch introduces a method named indirect-IO. >>> In this method the in/out pair in arch/arm64/include/asm/io.h will be >>> redefined. When upper layer drivers call in/out with those known legacy port >>> addresses to access the peripherals, the hooking functions corrresponding to >>> those target peripherals will be called. Through this way, those upper layer >>> drivers which depend on in/out can run on Hip06 without any changes. >>> >>> Cc: Catalin Marinas >>> Cc: Will Deacon >>> Signed-off-by: zhichang.yuan >>> Signed-off-by: Gabriele Paoloni >>> --- >>> arch/arm64/Kconfig | 6 +++ >>> arch/arm64/include/asm/extio.h | 94 >>> ++ >>> arch/arm64/include/asm/io.h| 29 + >>> arch/arm64/kernel/Makefile | 1 + >>> arch/arm64/kernel/extio.c | 27 >>> 5 files changed, 157 insertions(+) >> >> When I applied these three patches against current linus tree and >> enable CONFIG_HISILICON_LPC, the following build failure[1] is >> triggered when running 'make modules'. >> > > Thanks for your report! > > This patch has compilation issue on some architectures, sorry for the > inconvenience caused by this! > The ongoing v6 will solve these issues. > I will trace this failure and provide a fix if you can not wait for the next > version. > > Could you send me your .config in private? I don't want to bother all the > hacker in the mail-list. > Sure, please see the config in attachment. > > Thanks, > Zhichang > >> >> Thanks, >> Ming >> >> [1] 'make modules' failure log >> >> Building modules, stage 2. >> MODPOST 2260 modules >> ERROR: "inb" [drivers/watchdog/wdt_pci.ko] undefined! >> ERROR: "outb" [drivers/watchdog/wdt_pci.ko] undefined! >> ERROR: "outb" [drivers/watchdog/pcwd_pci.ko] undefined! >> ERROR: "inb" [drivers/watchdog/pcwd_pci.ko] undefined! >> ERROR: "outw" [drivers/video/vgastate.ko] undefined! >> ERROR: "outb" [drivers/video/vgastate.ko] undefined! >> ERROR: "inb" [drivers/video/vgastate.ko] undefined! >> ERROR: "outw" [drivers/video/fbdev/vt8623fb.ko] undefined! >> ERROR: "inb" [drivers/video/fbdev/vt8623fb.ko] undefined! >> ERROR: "outb" [drivers/video/fbdev/vt8623fb.ko] undefined! >> ERROR: "outw" [drivers/video/fbdev/tridentfb.ko] undefined! >> ERROR: "inb" [drivers/video/fbdev/tridentfb.ko] undefined! >> ERROR: "outb" [drivers/video/fbdev/tridentfb.ko] undefined! >> ERROR: "inb" [drivers/video/fbdev/tdfxfb.ko] undefined! >> . >> ERROR: "inb" [drivers/ata/pata_cmd64x.ko] undefined! >> ERROR: "inb" [drivers/ata/pata_artop.ko] undefined! >> scripts/Makefile.modpost:91: recipe for target '__modpost' failed >> make[1]: *** [__modpost] Error 1 >> Makefile:1196: recipe for target 'modules' failed >> make: *** [modules] Error 2 >> >> >>> create mode 100644 arch/arm64/include/asm/extio.h >>> create mode 100644 arch/arm64/kernel/extio.c >>> >>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >>> index 969ef88..b44070b 100644 >>> --- a/arch/arm64/Kconfig >>> +++ b/arch/arm64/Kconfig >>> @@ -163,6 +163,12 @@ config ARCH_MMAP_RND_COMPAT_BITS_MIN >>> config ARCH_MMAP_RND_COMPAT_BITS_MAX >>> default 16 >>> >>> +config ARM64_INDIRECT_PIO >>> + bool "access peripherals with legacy I/O port" >>> + help >>> + Support special accessors for ISA I/O devices. This is needed for >>> + SoCs that do not support standard read/write for the ISA range. >>> + >>> config NO_IOPORT_MAP >>> def_bool y if !PCI >>> >>> diff --git a/arch/arm64/include/asm/extio.h b/arch/arm64/include/asm/extio.h >>> new file mode 100644 >>> index 000..6ae0787 >>> --- /dev/null >>> +++ b/arch/arm64/include/asm/extio.h >>> @@ -0,0 +1,94 @@ >>> +/* >>> + * Copyright (C) 2016 Hisilicon Limited, All Rights Reserved. >>> + * Author: Zhichang Yuan >>> + * >>> + * This program is free software; you can redistribute it and/or modify >>> + * it under the terms of the GNU General Public License version 2 as >>> + * published by the Free Software Foundation. >>> + * >>> + * This program is distributed in the hope that it will be useful, >>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >>> + * GNU General Public License for more details. >>> + * >>> + * You
Re: [PATCH] serial: 8250: use initializer instead of memset to clear local struct
On Fri, Dec 23, 2016 at 12:21:48PM +0900, Masahiro Yamada wrote: > Leave the way of zero-out to the compiler's decision; the compiler > may know a more optimized way than calling memset(). But no, it doesn't, it will leave "blank" areas in the structure with bad data in it, which is why we do memset. See the tree-wide fixups we made about a year ago for this very issue. Are you sure none of these structures get copied to userspace? > It may end up with memset() for big structures like this after all, > but the code will be cleaner at least. Please leave it as-is, unless you see a measured speedup. thanks, greg k-h
Re: [PATCH] serial: 8250: use initializer instead of memset to clear local struct
On Fri, Dec 23, 2016 at 12:21:48PM +0900, Masahiro Yamada wrote: > Leave the way of zero-out to the compiler's decision; the compiler > may know a more optimized way than calling memset(). But no, it doesn't, it will leave "blank" areas in the structure with bad data in it, which is why we do memset. See the tree-wide fixups we made about a year ago for this very issue. Are you sure none of these structures get copied to userspace? > It may end up with memset() for big structures like this after all, > but the code will be cleaner at least. Please leave it as-is, unless you see a measured speedup. thanks, greg k-h
Re: [PATCH] tpm: Restore functionality to xen vtpm driver.
On Thu, Dec 22, 2016 at 02:41:52PM -0600, Dr. Greg Wettstein wrote: > Functionality of the xen-tpmfront driver was lost secondary to > the introduction of xenbus multi-page support in the following > commit: > > ccc9d90a9a8b5c4ad7e9708ec41f75ff9e98d61d > > xenbus_client: Extend interface to support multi-page ring > > In this commit a pointer to the shared page address was being > passed to the xenbus_grant_ring() function rather then the > address of the shared page itself. This resulted in a situation > where the driver would attach to the vtpm-stubdom but any attempt > to send a command to the stub domain would timeout. > > A diagnostic finding for this regression is the following error > message being generated when the xen-tpmfront driver probes for a > device: > > <3>vtpm vtpm-0: tpm_transmit: tpm_send: error -62 > > <3>vtpm vtpm-0: A TPM error (-62) occurred attempting to determine the > timeouts > > This fix is relevant to all kernels from 4.1 forward which is the > release in which multi-page xenbus support was introduced. > > Daniel De Graaf formulated the fix by code inspection after the > regression point was located. > > Signed-off-by: Dr. Greg WettsteinThis is not the correct way to submit patches for inclusion in the stable kernel tree. Please read Documentation/stable_kernel_rules.txt for how to do this properly.
Re: [PATCH] tpm: Restore functionality to xen vtpm driver.
On Thu, Dec 22, 2016 at 02:41:52PM -0600, Dr. Greg Wettstein wrote: > Functionality of the xen-tpmfront driver was lost secondary to > the introduction of xenbus multi-page support in the following > commit: > > ccc9d90a9a8b5c4ad7e9708ec41f75ff9e98d61d > > xenbus_client: Extend interface to support multi-page ring > > In this commit a pointer to the shared page address was being > passed to the xenbus_grant_ring() function rather then the > address of the shared page itself. This resulted in a situation > where the driver would attach to the vtpm-stubdom but any attempt > to send a command to the stub domain would timeout. > > A diagnostic finding for this regression is the following error > message being generated when the xen-tpmfront driver probes for a > device: > > <3>vtpm vtpm-0: tpm_transmit: tpm_send: error -62 > > <3>vtpm vtpm-0: A TPM error (-62) occurred attempting to determine the > timeouts > > This fix is relevant to all kernels from 4.1 forward which is the > release in which multi-page xenbus support was introduced. > > Daniel De Graaf formulated the fix by code inspection after the > regression point was located. > > Signed-off-by: Dr. Greg Wettstein This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read Documentation/stable_kernel_rules.txt for how to do this properly.
Re: [PATCH] drm/i915: check if execlist_port is empty before using its content
On Fri, 23 Dec 2016, changbin...@intel.com wrote: > From: "Du, Changbin"> > This patch fix a crash in function reset_common_ring. In this case, > the port[0].request is null when reset the render ring, so a null > dereference exception is raised. We need to check execlist_port status > first. > > [ 35.748034] BUG: unable to handle kernel NULL pointer dereference at > 0070 > [ 35.749567] IP: [] reset_common_ring+0xbe/0x150 > [ 35.749567] Call Trace: > [ 35.749567] [] i915_gem_reset+0x150/0x270 > [ 35.749567] [] i915_reset+0x8a/0xe0 > [ 35.749567] [] i915_reset_and_wakeup+0x131/0x160 > [ 35.749567] [] ? gen5_read8+0x110/0x110 > [ 35.749567] [] i915_handle_error+0xca/0x5a0 > [ 35.749567] [] ? scnprintf+0x3d/0x70 > [ 35.749567] [] i915_hangcheck_elapsed+0x213/0x510 > [ 35.749567] [] process_one_work+0x15b/0x470 > [ 35.749567] [] worker_thread+0x43/0x4d0 > [ 35.749567] [] ? process_one_work+0x470/0x470 > [ 35.749567] [] ? process_one_work+0x470/0x470 > [ 35.749567] [] ? > call_usermodehelper_exec_async+0x12e/0x130 > [ 35.749567] [] kthread+0xc5/0xe0 > [ 35.749567] [] ? kthread_park+0x60/0x60 > [ 35.749567] [] ? umh_complete+0x40/0x40 > [ 35.749567] [] ret_from_fork+0x22/0x30 > Fixes: ? i.e. which commit broke things? BR, Jani. > Signed-off-by: Changbin Du > --- > drivers/gpu/drm/i915/intel_lrc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c > b/drivers/gpu/drm/i915/intel_lrc.c > index 0a09024..81a9b0b 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs > *engine, > > /* Catch up with any missed context-switch interrupts */ > I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0x, 0)); > - if (request->ctx != port[0].request->ctx) { > + if (!execlists_elsp_idle(engine) && request->ctx != > port[0].request->ctx) { > i915_gem_request_put(port[0].request); > port[0] = port[1]; > memset([1], 0, sizeof(port[1])); -- Jani Nikula, Intel Open Source Technology Center
Re: [PATCH] drm/i915: check if execlist_port is empty before using its content
On Fri, 23 Dec 2016, changbin...@intel.com wrote: > From: "Du, Changbin" > > This patch fix a crash in function reset_common_ring. In this case, > the port[0].request is null when reset the render ring, so a null > dereference exception is raised. We need to check execlist_port status > first. > > [ 35.748034] BUG: unable to handle kernel NULL pointer dereference at > 0070 > [ 35.749567] IP: [] reset_common_ring+0xbe/0x150 > [ 35.749567] Call Trace: > [ 35.749567] [] i915_gem_reset+0x150/0x270 > [ 35.749567] [] i915_reset+0x8a/0xe0 > [ 35.749567] [] i915_reset_and_wakeup+0x131/0x160 > [ 35.749567] [] ? gen5_read8+0x110/0x110 > [ 35.749567] [] i915_handle_error+0xca/0x5a0 > [ 35.749567] [] ? scnprintf+0x3d/0x70 > [ 35.749567] [] i915_hangcheck_elapsed+0x213/0x510 > [ 35.749567] [] process_one_work+0x15b/0x470 > [ 35.749567] [] worker_thread+0x43/0x4d0 > [ 35.749567] [] ? process_one_work+0x470/0x470 > [ 35.749567] [] ? process_one_work+0x470/0x470 > [ 35.749567] [] ? > call_usermodehelper_exec_async+0x12e/0x130 > [ 35.749567] [] kthread+0xc5/0xe0 > [ 35.749567] [] ? kthread_park+0x60/0x60 > [ 35.749567] [] ? umh_complete+0x40/0x40 > [ 35.749567] [] ret_from_fork+0x22/0x30 > Fixes: ? i.e. which commit broke things? BR, Jani. > Signed-off-by: Changbin Du > --- > drivers/gpu/drm/i915/intel_lrc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c > b/drivers/gpu/drm/i915/intel_lrc.c > index 0a09024..81a9b0b 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs > *engine, > > /* Catch up with any missed context-switch interrupts */ > I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0x, 0)); > - if (request->ctx != port[0].request->ctx) { > + if (!execlists_elsp_idle(engine) && request->ctx != > port[0].request->ctx) { > i915_gem_request_put(port[0].request); > port[0] = port[1]; > memset([1], 0, sizeof(port[1])); -- Jani Nikula, Intel Open Source Technology Center
[PATCH 3/4] usb: xhci: add XHCI_MISS_CA_EVENT quirk bit
Writing the CMD_RING_ABORT bit in xhci command register should cause a command completion event with the command completion code set to command ring stopped. However on some hosts, the CMD_RING_RUNNING bit is correctly cleared but the completion event is never sent. Current xhci driver treats the behavior of "CMD_RING_RUNNING bit is correctly cleared but the completion event is missed" as a correct behavior for all hosts. This is different from that defined in xhci spec. Refer to 4.6.1.2 in xhci spec. This patch introduces a quirk bit for those quirky hardwares. For normal hosts, if the command completion for abort command ring misses, we shall assume that there are larger problems with the host and assert HCRST. Signed-off-by: Lu Baolu--- drivers/usb/host/xhci-ring.c | 17 + drivers/usb/host/xhci.h | 6 ++ 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 5935dce..6a23c37 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -345,25 +345,18 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) * larger problems with the xHC and assert HCRST. */ temp_64 = xhci_read_64(xhci, >op_regs->cmd_ring); - if (temp_64 & CMD_RING_RUNNING) { + if ((temp_64 & CMD_RING_RUNNING) || + !(xhci->quirks & XHCI_MISS_CA_EVENT)) { xhci_err(xhci, "Stop command ring failed, maybe the host is dead\n"); xhci->xhc_state |= XHCI_STATE_DYING; xhci_halt(xhci); return -ESHUTDOWN; } - - /* -* Writing the CMD_RING_ABORT bit should cause a cmd -* completion event, however on some hosts the bit is -* correctly cleared but the completion event is never -* sent. -*/ - xhci_warn(xhci, "No stop event for abort, ring start fail?\n"); - xhci_cleanup_command_queue(xhci); - } else { - xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci)); } + + xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci)); + return 0; } diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 09fe63f..c550bf0 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1656,6 +1656,12 @@ struct xhci_hcd { #define XHCI_SSIC_PORT_UNUSED (1 << 22) #define XHCI_NO_64BIT_SUPPORT (1 << 23) #define XHCI_MISSING_CAS (1 << 24) +/* + * Writing the CMD_RING_ABORT bit should cause a cmd completion event, + * however on some host hw the CMD_RING_RUNNING bit is correctly cleared + * but the completion event is never sent. + */ +#define XHCI_MISS_CA_EVENT (1 << 25) unsigned intnum_active_eps; unsigned intlimit_active_eps; /* There are two roothubs to keep track of bus suspend info for */ -- 2.1.4
[PATCH 3/4] usb: xhci: add XHCI_MISS_CA_EVENT quirk bit
Writing the CMD_RING_ABORT bit in xhci command register should cause a command completion event with the command completion code set to command ring stopped. However on some hosts, the CMD_RING_RUNNING bit is correctly cleared but the completion event is never sent. Current xhci driver treats the behavior of "CMD_RING_RUNNING bit is correctly cleared but the completion event is missed" as a correct behavior for all hosts. This is different from that defined in xhci spec. Refer to 4.6.1.2 in xhci spec. This patch introduces a quirk bit for those quirky hardwares. For normal hosts, if the command completion for abort command ring misses, we shall assume that there are larger problems with the host and assert HCRST. Signed-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 17 + drivers/usb/host/xhci.h | 6 ++ 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 5935dce..6a23c37 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -345,25 +345,18 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) * larger problems with the xHC and assert HCRST. */ temp_64 = xhci_read_64(xhci, >op_regs->cmd_ring); - if (temp_64 & CMD_RING_RUNNING) { + if ((temp_64 & CMD_RING_RUNNING) || + !(xhci->quirks & XHCI_MISS_CA_EVENT)) { xhci_err(xhci, "Stop command ring failed, maybe the host is dead\n"); xhci->xhc_state |= XHCI_STATE_DYING; xhci_halt(xhci); return -ESHUTDOWN; } - - /* -* Writing the CMD_RING_ABORT bit should cause a cmd -* completion event, however on some hosts the bit is -* correctly cleared but the completion event is never -* sent. -*/ - xhci_warn(xhci, "No stop event for abort, ring start fail?\n"); - xhci_cleanup_command_queue(xhci); - } else { - xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci)); } + + xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci)); + return 0; } diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 09fe63f..c550bf0 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1656,6 +1656,12 @@ struct xhci_hcd { #define XHCI_SSIC_PORT_UNUSED (1 << 22) #define XHCI_NO_64BIT_SUPPORT (1 << 23) #define XHCI_MISSING_CAS (1 << 24) +/* + * Writing the CMD_RING_ABORT bit should cause a cmd completion event, + * however on some host hw the CMD_RING_RUNNING bit is correctly cleared + * but the completion event is never sent. + */ +#define XHCI_MISS_CA_EVENT (1 << 25) unsigned intnum_active_eps; unsigned intlimit_active_eps; /* There are two roothubs to keep track of bus suspend info for */ -- 2.1.4
[PATCH 1/4] usb: xhci: remove unnecessary second abort try
The second try was a workaround for (what we thought was) command ring failing to stop in the first place. But this turns out to be due to the race that we have fixed(see "xhci: Fix race related to abort operation"). With that fix, it is time to remove the second try. Signed-off-by: Lu Baolu--- drivers/usb/host/xhci-ring.c | 18 +- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 2f4ea21..e3bcf6d 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -343,19 +343,11 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) ret = xhci_handshake(>op_regs->cmd_ring, CMD_RING_RUNNING, 0, 5 * 1000 * 1000); if (ret < 0) { - /* we are about to kill xhci, give it one more chance */ - xhci_write_64(xhci, temp_64 | CMD_RING_ABORT, - >op_regs->cmd_ring); - udelay(1000); - ret = xhci_handshake(>op_regs->cmd_ring, -CMD_RING_RUNNING, 0, 3 * 1000 * 1000); - if (ret < 0) { - xhci_err(xhci, "Stopped the command ring failed, " -"maybe the host is dead\n"); - xhci->xhc_state |= XHCI_STATE_DYING; - xhci_halt(xhci); - return -ESHUTDOWN; - } + xhci_err(xhci, +"Stop command ring failed, maybe the host is dead\n"); + xhci->xhc_state |= XHCI_STATE_DYING; + xhci_halt(xhci); + return -ESHUTDOWN; } /* * Writing the CMD_RING_ABORT bit should cause a cmd completion event, -- 2.1.4
[PATCH 1/4] usb: xhci: remove unnecessary second abort try
The second try was a workaround for (what we thought was) command ring failing to stop in the first place. But this turns out to be due to the race that we have fixed(see "xhci: Fix race related to abort operation"). With that fix, it is time to remove the second try. Signed-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 18 +- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 2f4ea21..e3bcf6d 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -343,19 +343,11 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) ret = xhci_handshake(>op_regs->cmd_ring, CMD_RING_RUNNING, 0, 5 * 1000 * 1000); if (ret < 0) { - /* we are about to kill xhci, give it one more chance */ - xhci_write_64(xhci, temp_64 | CMD_RING_ABORT, - >op_regs->cmd_ring); - udelay(1000); - ret = xhci_handshake(>op_regs->cmd_ring, -CMD_RING_RUNNING, 0, 3 * 1000 * 1000); - if (ret < 0) { - xhci_err(xhci, "Stopped the command ring failed, " -"maybe the host is dead\n"); - xhci->xhc_state |= XHCI_STATE_DYING; - xhci_halt(xhci); - return -ESHUTDOWN; - } + xhci_err(xhci, +"Stop command ring failed, maybe the host is dead\n"); + xhci->xhc_state |= XHCI_STATE_DYING; + xhci_halt(xhci); + return -ESHUTDOWN; } /* * Writing the CMD_RING_ABORT bit should cause a cmd completion event, -- 2.1.4
[PATCH 0/4] refactor command timeout handling
Hi Mathias, This patch series is for command timeout refactoring. Patches usb: xhci: remove unnecessary second abort try usb: xhci: remove CRR polling in xhci_abort_cmd_ring() are follow-ups of your comments of "remove unnecessary second abort try as a separate patch, send to usb-next" and "remove polling for the Command ring running (CRR), waiting for completion is enough, if completion times out then we can check CRR. for usb-next" in below discussion thread. https://lkml.org/lkml/2016/12/21/186 Patches usb: xhci: add XHCI_MISS_CA_EVENT quirk bit usb: xhci: warn on command timeout in stopped command ring are my proposals. They base on the top of your timeout_race_fixes branch. (git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes) Lu Baolu (4): usb: xhci: remove unnecessary second abort try usb: xhci: remove CRR polling in xhci_abort_cmd_ring() usb: xhci: add XHCI_MISS_CA_EVENT quirk bit usb: xhci: warn on command timeout in stopped command ring drivers/usb/host/xhci-ring.c | 64 ++-- drivers/usb/host/xhci.h | 6 + 2 files changed, 32 insertions(+), 38 deletions(-) -- 2.1.4
[PATCH 2/4] usb: xhci: remove CRR polling in xhci_abort_cmd_ring()
xHCI driver aborts the command ring by setting CA (Command Abort) bit in the command register. With this setting, host should abort all command executions and generate a command completion event with the completion code set to command ring stopped. Software should time the completion of command abort by checking the CRR (Command Ring Running) and waiting for the command ring stopped event. Current xhci_abort_cmd_ring() does this in the following way: setting CA bit; busy polling CRR bit until it negates; sleep and wait for the command ring stopped event. This is not an efficient way since cpu cycles are wasted on polling registers. This patch removes polling for CRR (Command Ring Running). Wait for completion, and check CRR if completion times out is enough. Suggested-by: Mathias NymanSigned-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 49 ++-- 1 file changed, 24 insertions(+), 25 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index e3bcf6d..5935dce 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -323,7 +323,6 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) { unsigned long timeleft; u64 temp_64; - int ret; xhci_dbg(xhci, "Abort command ring\n"); @@ -333,34 +332,34 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) xhci_write_64(xhci, temp_64 | CMD_RING_ABORT, >op_regs->cmd_ring); - /* Section 4.6.1.2 of xHCI 1.0 spec says software should -* time the completion od all xHCI commands, including -* the Command Abort operation. If software doesn't see -* CRR negated in a timely manner (e.g. longer than 5 -* seconds), then it should assume that the there are -* larger problems with the xHC and assert HCRST. -*/ - ret = xhci_handshake(>op_regs->cmd_ring, - CMD_RING_RUNNING, 0, 5 * 1000 * 1000); - if (ret < 0) { - xhci_err(xhci, -"Stop command ring failed, maybe the host is dead\n"); - xhci->xhc_state |= XHCI_STATE_DYING; - xhci_halt(xhci); - return -ESHUTDOWN; - } - /* -* Writing the CMD_RING_ABORT bit should cause a cmd completion event, -* however on some host hw the CMD_RING_RUNNING bit is correctly cleared -* but the completion event in never sent. Wait 2 secs (arbitrary -* number) to handle those cases after negation of CMD_RING_RUNNING. -*/ spin_unlock_irqrestore(>lock, *flags); timeleft = wait_for_completion_timeout(>cmd_ring_stop_completion, - 2 * HZ); + XHCI_CMD_DEFAULT_TIMEOUT); spin_lock_irqsave(>lock, *flags); if (!timeleft) { - xhci_dbg(xhci, "No stop event for abort, ring start fail?\n"); + /* Section 4.6.1.2 of xHCI 1.0 spec says software should +* time the completion of all xHCI commands, including +* the Command Abort operation. If software doesn't see +* CRR negated in a timely manner (e.g. longer than 5 +* seconds), then it should assume that the there are +* larger problems with the xHC and assert HCRST. +*/ + temp_64 = xhci_read_64(xhci, >op_regs->cmd_ring); + if (temp_64 & CMD_RING_RUNNING) { + xhci_err(xhci, +"Stop command ring failed, maybe the host is dead\n"); + xhci->xhc_state |= XHCI_STATE_DYING; + xhci_halt(xhci); + return -ESHUTDOWN; + } + + /* +* Writing the CMD_RING_ABORT bit should cause a cmd +* completion event, however on some hosts the bit is +* correctly cleared but the completion event is never +* sent. +*/ + xhci_warn(xhci, "No stop event for abort, ring start fail?\n"); xhci_cleanup_command_queue(xhci); } else { xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci)); -- 2.1.4
[PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring
If xhci host fails to response to a command, the command watchdog timer will be fired. The callback function will abort and clear current command and restart the command execution. If driver fails to restart command execution, it will assume there is a larger problem in host and report the situation to the upper layer. In rare cases, if the driver sees a command timeout in a stopped command ring, driver should let the user know it. Signed-off-by: Lu Baolu--- drivers/usb/host/xhci-ring.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 6a23c37..16baefc 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1308,8 +1308,12 @@ void xhci_handle_command_timeout(struct work_struct *work) goto time_out_completed; } - /* command timeout on stopped ring, ring can't be aborted */ - xhci_dbg(xhci, "Command timeout on stopped ring\n"); + /* +* We should never reach here until a command times out on a stopped +* ring and xhci driver has no idea about the reason why the command +* ring was stopped. +*/ + xhci_warn(xhci, "WARN command timeout on stopped ring\n"); xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd); time_out_completed: -- 2.1.4
[PATCH 0/4] refactor command timeout handling
Hi Mathias, This patch series is for command timeout refactoring. Patches usb: xhci: remove unnecessary second abort try usb: xhci: remove CRR polling in xhci_abort_cmd_ring() are follow-ups of your comments of "remove unnecessary second abort try as a separate patch, send to usb-next" and "remove polling for the Command ring running (CRR), waiting for completion is enough, if completion times out then we can check CRR. for usb-next" in below discussion thread. https://lkml.org/lkml/2016/12/21/186 Patches usb: xhci: add XHCI_MISS_CA_EVENT quirk bit usb: xhci: warn on command timeout in stopped command ring are my proposals. They base on the top of your timeout_race_fixes branch. (git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes) Lu Baolu (4): usb: xhci: remove unnecessary second abort try usb: xhci: remove CRR polling in xhci_abort_cmd_ring() usb: xhci: add XHCI_MISS_CA_EVENT quirk bit usb: xhci: warn on command timeout in stopped command ring drivers/usb/host/xhci-ring.c | 64 ++-- drivers/usb/host/xhci.h | 6 + 2 files changed, 32 insertions(+), 38 deletions(-) -- 2.1.4
[PATCH 2/4] usb: xhci: remove CRR polling in xhci_abort_cmd_ring()
xHCI driver aborts the command ring by setting CA (Command Abort) bit in the command register. With this setting, host should abort all command executions and generate a command completion event with the completion code set to command ring stopped. Software should time the completion of command abort by checking the CRR (Command Ring Running) and waiting for the command ring stopped event. Current xhci_abort_cmd_ring() does this in the following way: setting CA bit; busy polling CRR bit until it negates; sleep and wait for the command ring stopped event. This is not an efficient way since cpu cycles are wasted on polling registers. This patch removes polling for CRR (Command Ring Running). Wait for completion, and check CRR if completion times out is enough. Suggested-by: Mathias Nyman Signed-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 49 ++-- 1 file changed, 24 insertions(+), 25 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index e3bcf6d..5935dce 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -323,7 +323,6 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) { unsigned long timeleft; u64 temp_64; - int ret; xhci_dbg(xhci, "Abort command ring\n"); @@ -333,34 +332,34 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags) xhci_write_64(xhci, temp_64 | CMD_RING_ABORT, >op_regs->cmd_ring); - /* Section 4.6.1.2 of xHCI 1.0 spec says software should -* time the completion od all xHCI commands, including -* the Command Abort operation. If software doesn't see -* CRR negated in a timely manner (e.g. longer than 5 -* seconds), then it should assume that the there are -* larger problems with the xHC and assert HCRST. -*/ - ret = xhci_handshake(>op_regs->cmd_ring, - CMD_RING_RUNNING, 0, 5 * 1000 * 1000); - if (ret < 0) { - xhci_err(xhci, -"Stop command ring failed, maybe the host is dead\n"); - xhci->xhc_state |= XHCI_STATE_DYING; - xhci_halt(xhci); - return -ESHUTDOWN; - } - /* -* Writing the CMD_RING_ABORT bit should cause a cmd completion event, -* however on some host hw the CMD_RING_RUNNING bit is correctly cleared -* but the completion event in never sent. Wait 2 secs (arbitrary -* number) to handle those cases after negation of CMD_RING_RUNNING. -*/ spin_unlock_irqrestore(>lock, *flags); timeleft = wait_for_completion_timeout(>cmd_ring_stop_completion, - 2 * HZ); + XHCI_CMD_DEFAULT_TIMEOUT); spin_lock_irqsave(>lock, *flags); if (!timeleft) { - xhci_dbg(xhci, "No stop event for abort, ring start fail?\n"); + /* Section 4.6.1.2 of xHCI 1.0 spec says software should +* time the completion of all xHCI commands, including +* the Command Abort operation. If software doesn't see +* CRR negated in a timely manner (e.g. longer than 5 +* seconds), then it should assume that the there are +* larger problems with the xHC and assert HCRST. +*/ + temp_64 = xhci_read_64(xhci, >op_regs->cmd_ring); + if (temp_64 & CMD_RING_RUNNING) { + xhci_err(xhci, +"Stop command ring failed, maybe the host is dead\n"); + xhci->xhc_state |= XHCI_STATE_DYING; + xhci_halt(xhci); + return -ESHUTDOWN; + } + + /* +* Writing the CMD_RING_ABORT bit should cause a cmd +* completion event, however on some hosts the bit is +* correctly cleared but the completion event is never +* sent. +*/ + xhci_warn(xhci, "No stop event for abort, ring start fail?\n"); xhci_cleanup_command_queue(xhci); } else { xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci)); -- 2.1.4
[PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring
If xhci host fails to response to a command, the command watchdog timer will be fired. The callback function will abort and clear current command and restart the command execution. If driver fails to restart command execution, it will assume there is a larger problem in host and report the situation to the upper layer. In rare cases, if the driver sees a command timeout in a stopped command ring, driver should let the user know it. Signed-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 6a23c37..16baefc 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1308,8 +1308,12 @@ void xhci_handle_command_timeout(struct work_struct *work) goto time_out_completed; } - /* command timeout on stopped ring, ring can't be aborted */ - xhci_dbg(xhci, "Command timeout on stopped ring\n"); + /* +* We should never reach here until a command times out on a stopped +* ring and xhci driver has no idea about the reason why the command +* ring was stopped. +*/ + xhci_warn(xhci, "WARN command timeout on stopped ring\n"); xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd); time_out_completed: -- 2.1.4
[PATCH 0/1] xhci: Fix race related to abort operation
Hi Mathias, This is a follow-up patch for below comment "rebase OGAWA Hirofumi's changes on top of that, and send to usb-linus only" in below discussion thread. https://lkml.org/lkml/2016/12/21/186 I rebased the patch and added unlock-before and lock-after xhci->lock during waiting for the command stopped event. It sits at the same place in your timeout_race_fixes branch. (git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes) Best regards, Lu Baolu OGAWA Hirofumi (1): xhci: Fix race related to abort operation drivers/usb/host/xhci-mem.c | 1 + drivers/usb/host/xhci-ring.c | 171 ++- drivers/usb/host/xhci.h | 1 + 3 files changed, 91 insertions(+), 82 deletions(-) -- 2.1.4
[PATCH 1/1] xhci: Fix race related to abort operation
From: OGAWA HirofumiCurrent abort operation has race. xhci_handle_command_timeout() xhci_abort_cmd_ring() xhci_write_64(CMD_RING_ABORT) xhci_handshake(5s) do { check CMD_RING_RUNNING udelay(1) ... COMP_CMD_ABORT event COMP_CMD_STOP event xhci_handle_stopped_cmd_ring() restart cmd_ring CMD_RING_RUNNING become 1 again } while () return -ETIMEDOUT xhci_write_64(CMD_RING_ABORT) /* can abort random command */ To do abort operation correctly, we have to wait both of COMP_CMD_STOP event and negation of CMD_RING_RUNNING. But like above, while timeout handler is waiting negation of CMD_RING_RUNNING, event handler can restart cmd_ring. So timeout handler never be notice negation of CMD_RING_RUNNING, and retry of CMD_RING_ABORT can abort random command (BTW, I guess retry of CMD_RING_ABORT was workaround of this race). To fix this race, this moves xhci_handle_stopped_cmd_ring() to xhci_abort_cmd_ring(). And timeout handler waits COMP_CMD_STOP event. At this point, timeout handler is owner of cmd_ring, and safely restart cmd_ring by using xhci_handle_stopped_cmd_ring(). [FWIW, as bonus, this way would be easily extend to add CMD_RING_PAUSE operation] [baolu: release xhci->lock before wait and apply it back again after wait. This is the reason my signed-off-by is added.] CC: Signed-off-by: OGAWA Hirofumi Signed-off-by: Lu Baolu Signed-off-by: Mathias Nyman --- drivers/usb/host/xhci-mem.c | 1 + drivers/usb/host/xhci-ring.c | 171 ++- drivers/usb/host/xhci.h | 1 + 3 files changed, 91 insertions(+), 82 deletions(-) diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index 536f572..70c9204 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -2379,6 +2379,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags) /* init command timeout work */ INIT_DELAYED_WORK(>cmd_timer, xhci_handle_command_timeout); + init_completion(>cmd_ring_stop_completion); page_size = readl(>op_regs->page_size); xhci_dbg_trace(xhci, trace_xhci_dbg_init, diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 63af013..d5fde9a 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -265,23 +265,71 @@ static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci, unsigned long delay) return mod_delayed_work(system_wq, >cmd_timer, delay); } -static int xhci_abort_cmd_ring(struct xhci_hcd *xhci) +static struct xhci_command *xhci_next_queued_cmd(struct xhci_hcd *xhci) { + return list_first_entry_or_null(>cmd_list, struct xhci_command, + cmd_list); +} + +/* + * Turn all commands on command ring with status set to "aborted" to no-op trbs. + * If there are other commands waiting then restart the ring and kick the timer. + * This must be called with command ring stopped and xhci->lock held. + */ +static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci, +struct xhci_command *cur_cmd) +{ + struct xhci_command *i_cmd; + u32 cycle_state; + + /* Turn all aborted commands in list to no-ops, then restart */ + list_for_each_entry(i_cmd, >cmd_list, cmd_list) { + + if (i_cmd->status != COMP_CMD_ABORT) + continue; + + i_cmd->status = COMP_CMD_STOP; + + xhci_dbg(xhci, "Turn aborted command %p to no-op\n", +i_cmd->command_trb); + /* get cycle state from the original cmd trb */ + cycle_state = le32_to_cpu( + i_cmd->command_trb->generic.field[3]) & TRB_CYCLE; + /* modify the command trb to no-op command */ + i_cmd->command_trb->generic.field[0] = 0; + i_cmd->command_trb->generic.field[1] = 0; + i_cmd->command_trb->generic.field[2] = 0; + i_cmd->command_trb->generic.field[3] = cpu_to_le32( + TRB_TYPE(TRB_CMD_NOOP) | cycle_state); + + /* +* caller waiting for completion is called when command +* completion event is received for these no-op commands +*/ + } + + xhci->cmd_ring_state = CMD_RING_STATE_RUNNING; + + /* ring command ring doorbell to restart the command ring */ + if ((xhci->cmd_ring->dequeue != xhci->cmd_ring->enqueue) && + !(xhci->xhc_state &
[PATCH 0/1] xhci: Fix race related to abort operation
Hi Mathias, This is a follow-up patch for below comment "rebase OGAWA Hirofumi's changes on top of that, and send to usb-linus only" in below discussion thread. https://lkml.org/lkml/2016/12/21/186 I rebased the patch and added unlock-before and lock-after xhci->lock during waiting for the command stopped event. It sits at the same place in your timeout_race_fixes branch. (git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes) Best regards, Lu Baolu OGAWA Hirofumi (1): xhci: Fix race related to abort operation drivers/usb/host/xhci-mem.c | 1 + drivers/usb/host/xhci-ring.c | 171 ++- drivers/usb/host/xhci.h | 1 + 3 files changed, 91 insertions(+), 82 deletions(-) -- 2.1.4
[PATCH 1/1] xhci: Fix race related to abort operation
From: OGAWA Hirofumi Current abort operation has race. xhci_handle_command_timeout() xhci_abort_cmd_ring() xhci_write_64(CMD_RING_ABORT) xhci_handshake(5s) do { check CMD_RING_RUNNING udelay(1) ... COMP_CMD_ABORT event COMP_CMD_STOP event xhci_handle_stopped_cmd_ring() restart cmd_ring CMD_RING_RUNNING become 1 again } while () return -ETIMEDOUT xhci_write_64(CMD_RING_ABORT) /* can abort random command */ To do abort operation correctly, we have to wait both of COMP_CMD_STOP event and negation of CMD_RING_RUNNING. But like above, while timeout handler is waiting negation of CMD_RING_RUNNING, event handler can restart cmd_ring. So timeout handler never be notice negation of CMD_RING_RUNNING, and retry of CMD_RING_ABORT can abort random command (BTW, I guess retry of CMD_RING_ABORT was workaround of this race). To fix this race, this moves xhci_handle_stopped_cmd_ring() to xhci_abort_cmd_ring(). And timeout handler waits COMP_CMD_STOP event. At this point, timeout handler is owner of cmd_ring, and safely restart cmd_ring by using xhci_handle_stopped_cmd_ring(). [FWIW, as bonus, this way would be easily extend to add CMD_RING_PAUSE operation] [baolu: release xhci->lock before wait and apply it back again after wait. This is the reason my signed-off-by is added.] CC: Signed-off-by: OGAWA Hirofumi Signed-off-by: Lu Baolu Signed-off-by: Mathias Nyman --- drivers/usb/host/xhci-mem.c | 1 + drivers/usb/host/xhci-ring.c | 171 ++- drivers/usb/host/xhci.h | 1 + 3 files changed, 91 insertions(+), 82 deletions(-) diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index 536f572..70c9204 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -2379,6 +2379,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags) /* init command timeout work */ INIT_DELAYED_WORK(>cmd_timer, xhci_handle_command_timeout); + init_completion(>cmd_ring_stop_completion); page_size = readl(>op_regs->page_size); xhci_dbg_trace(xhci, trace_xhci_dbg_init, diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 63af013..d5fde9a 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -265,23 +265,71 @@ static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci, unsigned long delay) return mod_delayed_work(system_wq, >cmd_timer, delay); } -static int xhci_abort_cmd_ring(struct xhci_hcd *xhci) +static struct xhci_command *xhci_next_queued_cmd(struct xhci_hcd *xhci) { + return list_first_entry_or_null(>cmd_list, struct xhci_command, + cmd_list); +} + +/* + * Turn all commands on command ring with status set to "aborted" to no-op trbs. + * If there are other commands waiting then restart the ring and kick the timer. + * This must be called with command ring stopped and xhci->lock held. + */ +static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci, +struct xhci_command *cur_cmd) +{ + struct xhci_command *i_cmd; + u32 cycle_state; + + /* Turn all aborted commands in list to no-ops, then restart */ + list_for_each_entry(i_cmd, >cmd_list, cmd_list) { + + if (i_cmd->status != COMP_CMD_ABORT) + continue; + + i_cmd->status = COMP_CMD_STOP; + + xhci_dbg(xhci, "Turn aborted command %p to no-op\n", +i_cmd->command_trb); + /* get cycle state from the original cmd trb */ + cycle_state = le32_to_cpu( + i_cmd->command_trb->generic.field[3]) & TRB_CYCLE; + /* modify the command trb to no-op command */ + i_cmd->command_trb->generic.field[0] = 0; + i_cmd->command_trb->generic.field[1] = 0; + i_cmd->command_trb->generic.field[2] = 0; + i_cmd->command_trb->generic.field[3] = cpu_to_le32( + TRB_TYPE(TRB_CMD_NOOP) | cycle_state); + + /* +* caller waiting for completion is called when command +* completion event is received for these no-op commands +*/ + } + + xhci->cmd_ring_state = CMD_RING_STATE_RUNNING; + + /* ring command ring doorbell to restart the command ring */ + if ((xhci->cmd_ring->dequeue != xhci->cmd_ring->enqueue) && + !(xhci->xhc_state & XHCI_STATE_DYING)) { + xhci->current_cmd = cur_cmd; + xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); +
[PATCH 0/1] usb: xhci: hold lock over xhci_abort_cmd_ring()
Hi Mathias, This is a follow-up patch for below comment "fix the lock to cover abort+CRR check, and send it to usb-linus +stable" in below discussion thread. https://lkml.org/lkml/2016/12/21/186 It's based on v4.9. Best regards, Lu Baolu Lu Baolu (1): usb: xhci: hold lock over xhci_abort_cmd_ring() drivers/usb/host/xhci-ring.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) -- 2.1.4
[PATCH 1/1] usb: xhci: hold lock over xhci_abort_cmd_ring()
In command timer function, xhci_handle_command_timeout(), xhci->lock is unlocked before call into xhci_abort_cmd_ring(). This might cause race between the timer function and the event handler. The xhci_abort_cmd_ring() function sets the CMD_RING_ABORT bit in the command register and polling it until the setting takes effect. A stop command ring event might be handled between writing the abort bit and polling for it. The event handler will restart the command ring, which causes the failure of polling, and we ever believed that we failed to stop it. As a bonus, this also fixes some issues of calling functions without locking in xhci_handle_command_timeout(). Cc:# 3.7+ Signed-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 797137e..0f99078 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1283,29 +1283,34 @@ void xhci_handle_command_timeout(unsigned long data) hw_ring_state = xhci_read_64(xhci, >op_regs->cmd_ring); if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) && (hw_ring_state & CMD_RING_RUNNING)) { - spin_unlock_irqrestore(>lock, flags); xhci_dbg(xhci, "Command timeout\n"); ret = xhci_abort_cmd_ring(xhci); if (unlikely(ret == -ESHUTDOWN)) { xhci_err(xhci, "Abort command ring failed\n"); xhci_cleanup_command_queue(xhci); + spin_unlock_irqrestore(>lock, flags); usb_hc_died(xhci_to_hcd(xhci)->primary_hcd); xhci_dbg(xhci, "xHCI host controller is dead.\n"); + + return; } - return; + + goto time_out_completed; } /* command ring failed to restart, or host removed. Bail out */ if (second_timeout || xhci->xhc_state & XHCI_STATE_REMOVING) { - spin_unlock_irqrestore(>lock, flags); xhci_dbg(xhci, "command timed out twice, ring start fail?\n"); xhci_cleanup_command_queue(xhci); - return; + + goto time_out_completed; } /* command timeout on stopped ring, ring can't be aborted */ xhci_dbg(xhci, "Command timeout on stopped ring\n"); xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd); + +time_out_completed: spin_unlock_irqrestore(>lock, flags); return; } -- 2.1.4
[PATCH 0/1] usb: xhci: hold lock over xhci_abort_cmd_ring()
Hi Mathias, This is a follow-up patch for below comment "fix the lock to cover abort+CRR check, and send it to usb-linus +stable" in below discussion thread. https://lkml.org/lkml/2016/12/21/186 It's based on v4.9. Best regards, Lu Baolu Lu Baolu (1): usb: xhci: hold lock over xhci_abort_cmd_ring() drivers/usb/host/xhci-ring.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) -- 2.1.4
[PATCH 1/1] usb: xhci: hold lock over xhci_abort_cmd_ring()
In command timer function, xhci_handle_command_timeout(), xhci->lock is unlocked before call into xhci_abort_cmd_ring(). This might cause race between the timer function and the event handler. The xhci_abort_cmd_ring() function sets the CMD_RING_ABORT bit in the command register and polling it until the setting takes effect. A stop command ring event might be handled between writing the abort bit and polling for it. The event handler will restart the command ring, which causes the failure of polling, and we ever believed that we failed to stop it. As a bonus, this also fixes some issues of calling functions without locking in xhci_handle_command_timeout(). Cc: # 3.7+ Signed-off-by: Lu Baolu --- drivers/usb/host/xhci-ring.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 797137e..0f99078 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1283,29 +1283,34 @@ void xhci_handle_command_timeout(unsigned long data) hw_ring_state = xhci_read_64(xhci, >op_regs->cmd_ring); if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) && (hw_ring_state & CMD_RING_RUNNING)) { - spin_unlock_irqrestore(>lock, flags); xhci_dbg(xhci, "Command timeout\n"); ret = xhci_abort_cmd_ring(xhci); if (unlikely(ret == -ESHUTDOWN)) { xhci_err(xhci, "Abort command ring failed\n"); xhci_cleanup_command_queue(xhci); + spin_unlock_irqrestore(>lock, flags); usb_hc_died(xhci_to_hcd(xhci)->primary_hcd); xhci_dbg(xhci, "xHCI host controller is dead.\n"); + + return; } - return; + + goto time_out_completed; } /* command ring failed to restart, or host removed. Bail out */ if (second_timeout || xhci->xhc_state & XHCI_STATE_REMOVING) { - spin_unlock_irqrestore(>lock, flags); xhci_dbg(xhci, "command timed out twice, ring start fail?\n"); xhci_cleanup_command_queue(xhci); - return; + + goto time_out_completed; } /* command timeout on stopped ring, ring can't be aborted */ xhci_dbg(xhci, "Command timeout on stopped ring\n"); xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd); + +time_out_completed: spin_unlock_irqrestore(>lock, flags); return; } -- 2.1.4
Re: [PATCH 00/66] use permission-specific DEVICE_ATTR variants
> Hi Julia, > > I noticed that drivers/hwmon/gl518sm.c was not converted. Running your script > on it does not do anything. Any idea what might cause that ? I'll check. > I noticed that I get a warning if I specify a debug file: > > File "devattr-new.cocci", line 36, characters 5-6: > Warning 26: unused variable x. > > but that seems to have no impact and is always seen. I think it is caused > by the following. Yes, it has no impact. You can change x << d.x; to _x << d.x; > > ... > > @script:ocaml@ > x << d.x; < > show << o.show; > store << o.store; > @@ > > if (not(show = "NULL") && Hashtbl.mem taken show) || >(not(store = "NULL") && Hashtbl.mem taken store) > then Coccilib.include_match false > else (Hashtbl.add taken show (); Hashtbl.add taken store ()) > > ... > > How does one specify indentation preferences ? Actually, you are pretty much stuck with what Coccinelle provides you. What do you want to adjust? Currently, it shouldn't have any impact on indentation if the new name is the same as the old one. That function could be generalized to allow length changes of up to a few characters, for example, but I'm not sure that the result would always be satisfactory. When I made the patch, I had to adjust about 5 cases by hand, where it was sending all of the parameters past 80 characters, because the name of the function was long. > Also, how do I generate > an actual patch (instead of getting the output on screen) ? > Sorry for the maybe dumb questions ;-). The patch is on standard output. All other messages are on standard error. You can also use the argument --in-place to change your code directly. If you want to only work on the hwmon directory, but to get a patch that is relative to the root directory, you can do: --dir linux/drivers/hwmon --patch linux That is, the argument to --patch is the directory that you want the patch to be relative to. julia > Thanks, > Guenter > > -- > To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
Re: [PATCH 00/66] use permission-specific DEVICE_ATTR variants
> Hi Julia, > > I noticed that drivers/hwmon/gl518sm.c was not converted. Running your script > on it does not do anything. Any idea what might cause that ? I'll check. > I noticed that I get a warning if I specify a debug file: > > File "devattr-new.cocci", line 36, characters 5-6: > Warning 26: unused variable x. > > but that seems to have no impact and is always seen. I think it is caused > by the following. Yes, it has no impact. You can change x << d.x; to _x << d.x; > > ... > > @script:ocaml@ > x << d.x; < > show << o.show; > store << o.store; > @@ > > if (not(show = "NULL") && Hashtbl.mem taken show) || >(not(store = "NULL") && Hashtbl.mem taken store) > then Coccilib.include_match false > else (Hashtbl.add taken show (); Hashtbl.add taken store ()) > > ... > > How does one specify indentation preferences ? Actually, you are pretty much stuck with what Coccinelle provides you. What do you want to adjust? Currently, it shouldn't have any impact on indentation if the new name is the same as the old one. That function could be generalized to allow length changes of up to a few characters, for example, but I'm not sure that the result would always be satisfactory. When I made the patch, I had to adjust about 5 cases by hand, where it was sending all of the parameters past 80 characters, because the name of the function was long. > Also, how do I generate > an actual patch (instead of getting the output on screen) ? > Sorry for the maybe dumb questions ;-). The patch is on standard output. All other messages are on standard error. You can also use the argument --in-place to change your code directly. If you want to only work on the hwmon directory, but to get a patch that is relative to the root directory, you can do: --dir linux/drivers/hwmon --patch linux That is, the argument to --patch is the directory that you want the patch to be relative to. julia > Thanks, > Guenter > > -- > To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
Why do Zombie process' /proc entries have uid 0?
This is mostly out of curiosity, but I was surprised by the behavior, so I was hoping somebody might be able to explain why this behavior was chosen. In particular, consider any zombie process, e.g. $ cat /proc/77078/status Name: test State: Z (zombie) Tgid: 77078 Ngid: 0 Pid: 77078 PPid: 77077 TracerPid: 0 Uid: 1000 1000 1000 1000 Gid: 1000 1000 1000 1000 [...] now, this process has uid 1000, as does the /proc/ directory $ stat /proc/77078 File: '/proc/77078' Access: (0555/dr-xr-xr-x) Uid: ( 1000/keno) Gid: ( 1000/keno) but most files in /proc/ are owned by root: $ stat /proc/77078/status File: '/proc/77078/status' Access: (0444/-r--r--r--) Uid: (0/root) Gid: (0/root) Why is this? Why don't these files remain owned by the same uid as the process itself? Keno
Why do Zombie process' /proc entries have uid 0?
This is mostly out of curiosity, but I was surprised by the behavior, so I was hoping somebody might be able to explain why this behavior was chosen. In particular, consider any zombie process, e.g. $ cat /proc/77078/status Name: test State: Z (zombie) Tgid: 77078 Ngid: 0 Pid: 77078 PPid: 77077 TracerPid: 0 Uid: 1000 1000 1000 1000 Gid: 1000 1000 1000 1000 [...] now, this process has uid 1000, as does the /proc/ directory $ stat /proc/77078 File: '/proc/77078' Access: (0555/dr-xr-xr-x) Uid: ( 1000/keno) Gid: ( 1000/keno) but most files in /proc/ are owned by root: $ stat /proc/77078/status File: '/proc/77078/status' Access: (0444/-r--r--r--) Uid: (0/root) Gid: (0/root) Why is this? Why don't these files remain owned by the same uid as the process itself? Keno
Re: [PATCH V6 04/11] megaraid_sas: SAS3.5 Generic Megaraid Controllers Stream Detection and IO Coalescing
Hi Sasikumar, [auto build test WARNING on scsi/for-next] [also build test WARNING on next-20161223] [cannot apply to v4.9] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Sasikumar-Chandrasekaran/megaraid_sas-Updates-for-scsi-next/20161223-103256 base: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next config: x86_64-randconfig-ne0-12231250 (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All warnings (new ones prefixed by >>): In file included from include/linux/linkage.h:4:0, from include/linux/kernel.h:6, from drivers/scsi/megaraid/megaraid_sas_fusion.c:34: drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_stream_detect': include/linux/compiler.h:149:2: warning: this 'if' clause does not guard... [-Wmisleading-indentation] if (__builtin_constant_p(!!(cond)) ? !!(cond) : \ ^ include/linux/compiler.h:147:23: note: in expansion of macro '__trace_if' #define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) ) ^~ >> drivers/scsi/megaraid/megaraid_sas_fusion.c:1744:3: note: in expansion of >> macro 'if' if ((io_info->ldStartBlock != current_sd->next_seq_lba) ^~ drivers/scsi/megaraid/megaraid_sas_fusion.c:1751:4: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if' cmd->io_request->RaidContext.raid_context_g35.stream_detected = true; ^~~ vim +/if +1744 drivers/scsi/megaraid/megaraid_sas_fusion.c 1728 struct STREAM_DETECT *current_sd; 1729 /* find possible stream */ 1730 for (i = 0; i < MAX_STREAMS_TRACKED; ++i) { 1731 stream_num = 1732 (*track_stream >> (i * BITS_PER_INDEX_STREAM)) & 1733 STREAM_MASK; 1734 current_sd = _ld_sd->stream_track[stream_num]; 1735 /* if we found a stream, update the raid 1736 * context and also update the mruBitMap 1737 */ 1738 /* boundary condition */ 1739 if ((current_sd->next_seq_lba) && 1740 (io_info->ldStartBlock >= current_sd->next_seq_lba) && 1741 (io_info->ldStartBlock <= (current_sd->next_seq_lba+32)) && 1742 (current_sd->is_read == io_info->isRead)) { 1743 > 1744 if ((io_info->ldStartBlock != current_sd->next_seq_lba) 1745 && ((!io_info->isRead) || (!is_read_ahead))) 1746 /* 1747 * Once the API availible we need to change this. 1748 * At this point we are not allowing any gap 1749 */ 1750 continue; 1751 cmd->io_request->RaidContext.raid_context_g35.stream_detected = true; 1752 current_sd->next_seq_lba = --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH V6 04/11] megaraid_sas: SAS3.5 Generic Megaraid Controllers Stream Detection and IO Coalescing
Hi Sasikumar, [auto build test WARNING on scsi/for-next] [also build test WARNING on next-20161223] [cannot apply to v4.9] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Sasikumar-Chandrasekaran/megaraid_sas-Updates-for-scsi-next/20161223-103256 base: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next config: x86_64-randconfig-ne0-12231250 (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All warnings (new ones prefixed by >>): In file included from include/linux/linkage.h:4:0, from include/linux/kernel.h:6, from drivers/scsi/megaraid/megaraid_sas_fusion.c:34: drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_stream_detect': include/linux/compiler.h:149:2: warning: this 'if' clause does not guard... [-Wmisleading-indentation] if (__builtin_constant_p(!!(cond)) ? !!(cond) : \ ^ include/linux/compiler.h:147:23: note: in expansion of macro '__trace_if' #define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) ) ^~ >> drivers/scsi/megaraid/megaraid_sas_fusion.c:1744:3: note: in expansion of >> macro 'if' if ((io_info->ldStartBlock != current_sd->next_seq_lba) ^~ drivers/scsi/megaraid/megaraid_sas_fusion.c:1751:4: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if' cmd->io_request->RaidContext.raid_context_g35.stream_detected = true; ^~~ vim +/if +1744 drivers/scsi/megaraid/megaraid_sas_fusion.c 1728 struct STREAM_DETECT *current_sd; 1729 /* find possible stream */ 1730 for (i = 0; i < MAX_STREAMS_TRACKED; ++i) { 1731 stream_num = 1732 (*track_stream >> (i * BITS_PER_INDEX_STREAM)) & 1733 STREAM_MASK; 1734 current_sd = _ld_sd->stream_track[stream_num]; 1735 /* if we found a stream, update the raid 1736 * context and also update the mruBitMap 1737 */ 1738 /* boundary condition */ 1739 if ((current_sd->next_seq_lba) && 1740 (io_info->ldStartBlock >= current_sd->next_seq_lba) && 1741 (io_info->ldStartBlock <= (current_sd->next_seq_lba+32)) && 1742 (current_sd->is_read == io_info->isRead)) { 1743 > 1744 if ((io_info->ldStartBlock != current_sd->next_seq_lba) 1745 && ((!io_info->isRead) || (!is_read_ahead))) 1746 /* 1747 * Once the API availible we need to change this. 1748 * At this point we are not allowing any gap 1749 */ 1750 continue; 1751 cmd->io_request->RaidContext.raid_context_g35.stream_detected = true; 1752 current_sd->next_seq_lba = --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[PATCH] drm/i915: check if execlist_port is empty before using its content
From: "Du, Changbin"This patch fix a crash in function reset_common_ring. In this case, the port[0].request is null when reset the render ring, so a null dereference exception is raised. We need to check execlist_port status first. [ 35.748034] BUG: unable to handle kernel NULL pointer dereference at 0070 [ 35.749567] IP: [] reset_common_ring+0xbe/0x150 [ 35.749567] Call Trace: [ 35.749567] [] i915_gem_reset+0x150/0x270 [ 35.749567] [] i915_reset+0x8a/0xe0 [ 35.749567] [] i915_reset_and_wakeup+0x131/0x160 [ 35.749567] [] ? gen5_read8+0x110/0x110 [ 35.749567] [] i915_handle_error+0xca/0x5a0 [ 35.749567] [] ? scnprintf+0x3d/0x70 [ 35.749567] [] i915_hangcheck_elapsed+0x213/0x510 [ 35.749567] [] process_one_work+0x15b/0x470 [ 35.749567] [] worker_thread+0x43/0x4d0 [ 35.749567] [] ? process_one_work+0x470/0x470 [ 35.749567] [] ? process_one_work+0x470/0x470 [ 35.749567] [] ? call_usermodehelper_exec_async+0x12e/0x130 [ 35.749567] [] kthread+0xc5/0xe0 [ 35.749567] [] ? kthread_park+0x60/0x60 [ 35.749567] [] ? umh_complete+0x40/0x40 [ 35.749567] [] ret_from_fork+0x22/0x30 Signed-off-by: Changbin Du --- drivers/gpu/drm/i915/intel_lrc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0a09024..81a9b0b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, /* Catch up with any missed context-switch interrupts */ I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0x, 0)); - if (request->ctx != port[0].request->ctx) { + if (!execlists_elsp_idle(engine) && request->ctx != port[0].request->ctx) { i915_gem_request_put(port[0].request); port[0] = port[1]; memset([1], 0, sizeof(port[1])); -- 2.7.4
[PATCH] drm/i915: check if execlist_port is empty before using its content
From: "Du, Changbin" This patch fix a crash in function reset_common_ring. In this case, the port[0].request is null when reset the render ring, so a null dereference exception is raised. We need to check execlist_port status first. [ 35.748034] BUG: unable to handle kernel NULL pointer dereference at 0070 [ 35.749567] IP: [] reset_common_ring+0xbe/0x150 [ 35.749567] Call Trace: [ 35.749567] [] i915_gem_reset+0x150/0x270 [ 35.749567] [] i915_reset+0x8a/0xe0 [ 35.749567] [] i915_reset_and_wakeup+0x131/0x160 [ 35.749567] [] ? gen5_read8+0x110/0x110 [ 35.749567] [] i915_handle_error+0xca/0x5a0 [ 35.749567] [] ? scnprintf+0x3d/0x70 [ 35.749567] [] i915_hangcheck_elapsed+0x213/0x510 [ 35.749567] [] process_one_work+0x15b/0x470 [ 35.749567] [] worker_thread+0x43/0x4d0 [ 35.749567] [] ? process_one_work+0x470/0x470 [ 35.749567] [] ? process_one_work+0x470/0x470 [ 35.749567] [] ? call_usermodehelper_exec_async+0x12e/0x130 [ 35.749567] [] kthread+0xc5/0xe0 [ 35.749567] [] ? kthread_park+0x60/0x60 [ 35.749567] [] ? umh_complete+0x40/0x40 [ 35.749567] [] ret_from_fork+0x22/0x30 Signed-off-by: Changbin Du --- drivers/gpu/drm/i915/intel_lrc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0a09024..81a9b0b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, /* Catch up with any missed context-switch interrupts */ I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0x, 0)); - if (request->ctx != port[0].request->ctx) { + if (!execlists_elsp_idle(engine) && request->ctx != port[0].request->ctx) { i915_gem_request_put(port[0].request); port[0] = port[1]; memset([1], 0, sizeof(port[1])); -- 2.7.4
Re: [PATCH] MIPS: ralink: fix incorrect assignment on ralink_soc
On 23/12/2016 00:52, Colin King wrote: > From: Colin Ian King> > ralink_soc sould be assigned to RT3883_SOC, replace incorrect > comparision with assignment. > > Signed-off-by: Colin Ian King Acked-by: John Crispin i thought i had sent this fix upstream ages ago. luckily this bug never caused any error as none of the code checking ralink_soc checks for rt3883. its used for the rt3x5x family. John > --- > arch/mips/ralink/rt3883.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/mips/ralink/rt3883.c b/arch/mips/ralink/rt3883.c > index 141c597..f869052 100644 > --- a/arch/mips/ralink/rt3883.c > +++ b/arch/mips/ralink/rt3883.c > @@ -157,5 +157,5 @@ void prom_soc_init(struct ralink_soc_info *soc_info) > > rt2880_pinmux_data = rt3883_pinmux_data; > > - ralink_soc == RT3883_SOC; > + ralink_soc = RT3883_SOC; > } > his
Re: [PATCH] MIPS: ralink: fix incorrect assignment on ralink_soc
On 23/12/2016 00:52, Colin King wrote: > From: Colin Ian King > > ralink_soc sould be assigned to RT3883_SOC, replace incorrect > comparision with assignment. > > Signed-off-by: Colin Ian King Acked-by: John Crispin i thought i had sent this fix upstream ages ago. luckily this bug never caused any error as none of the code checking ralink_soc checks for rt3883. its used for the rt3x5x family. John > --- > arch/mips/ralink/rt3883.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/mips/ralink/rt3883.c b/arch/mips/ralink/rt3883.c > index 141c597..f869052 100644 > --- a/arch/mips/ralink/rt3883.c > +++ b/arch/mips/ralink/rt3883.c > @@ -157,5 +157,5 @@ void prom_soc_init(struct ralink_soc_info *soc_info) > > rt2880_pinmux_data = rt3883_pinmux_data; > > - ralink_soc == RT3883_SOC; > + ralink_soc = RT3883_SOC; > } > his
Re: [PATCH 1/2] drm/panel: Add support for S6E3HA2 panel driver on TM2 board
On 12/20/2016 03:55 AM, Rob Herring wrote: On Wed, Dec 14, 2016 at 03:04:04PM +0900, Hoegeun Kwon wrote: This patch add support for MIPI-DSI based S6E3HA2 AMOLED panel driver. This panel has 1440x2560 resolution in 5.7-inch physical panel in the TM2 device. Signed-off-by: Donghwa LeeSigned-off-by: Hyungwon Hwang Signed-off-by: Hoegeun Kwon --- .../bindings/display/panel/samsung,s6e3ha2.txt | 52 ++ drivers/gpu/drm/panel/Kconfig | 6 + drivers/gpu/drm/panel/Makefile | 1 + drivers/gpu/drm/panel/panel-samsung-s6e3ha2.c | 756 + 4 files changed, 815 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt create mode 100644 drivers/gpu/drm/panel/panel-samsung-s6e3ha2.c diff --git a/Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt b/Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt new file mode 100644 index 000..1f41f24 --- /dev/null +++ b/Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt @@ -0,0 +1,52 @@ +Samsung S6E3HA2 5.7" 1440x2560 AMOLED panel + +Required properties: + - compatible: "samsung,s6e3ha2" + - reg: the virtual channel number of a DSI peripheral + - vdd3-supply: core voltage supply + - vci-supply: voltage supply for analog circuits + - reset-gpios: a GPIO spec for the reset pin + - enable-gpios: a GPIO spec for the panel enable pin + - te-gpios: a GPIO spec for the tearing effect synchronization signal gpio pin Need to specify the GPIOs as active high or low. Hi Rob Herring, Thanks for point out. I will add "(active high or low)" after the description of the GPIOs. + +Optional properties: + - display-timings: timings for the connected panel as described by [1] + +The device node can contain one 'port' child node with one child +'endpoint' node, according to the bindings defined in [2]. This +node should describe panel's video bus. + +[1]: Documentation/devicetree/bindings/display/panel/display-timing.txt +[2]: Documentation/devicetree/bindings/media/video-interfaces.txt + +Example: + + panel@0 { + compatible = "samsung,s6e3ha2"; + reg = <0>; reg doesn't really work here unless this node is a child of the DSI controller node. But if it is a child node, then you don't need the OF graph. The reg value is used in virtual child panel node so I think the reg value is needed. Please refer to the document below. Documentation/devicetree/bindings/display/mipi-dsi-bus.txt Best Regards, Hoegeun Kwon + vdd3-supply = <_reg>; + vci-supply = <_reg>; + reset-gpios = < 0 GPIO_ACTIVE_HIGH>; + enable-gpios = < 5 GPIO_ACTIVE_HIGH>; + te-gpios = < 3 GPIO_ACTIVE_HIGH>; + + display-timings { + timing-0 { + clock-frequency = <0>; + hactive = <1440>; + vactive = <2560>; + hfront-porch = <1>; + hback-porch = <1>; + hsync-len = <1>; + vfront-porch = <1>; + vback-porch = <15>; + vsync-len = <1>; + }; + }; + + port { + dsi_in: endpoint { + remote-endpoint = <_out>; + }; + }; + }; ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/2] drm/panel: Add support for S6E3HA2 panel driver on TM2 board
On 12/20/2016 03:55 AM, Rob Herring wrote: On Wed, Dec 14, 2016 at 03:04:04PM +0900, Hoegeun Kwon wrote: This patch add support for MIPI-DSI based S6E3HA2 AMOLED panel driver. This panel has 1440x2560 resolution in 5.7-inch physical panel in the TM2 device. Signed-off-by: Donghwa Lee Signed-off-by: Hyungwon Hwang Signed-off-by: Hoegeun Kwon --- .../bindings/display/panel/samsung,s6e3ha2.txt | 52 ++ drivers/gpu/drm/panel/Kconfig | 6 + drivers/gpu/drm/panel/Makefile | 1 + drivers/gpu/drm/panel/panel-samsung-s6e3ha2.c | 756 + 4 files changed, 815 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt create mode 100644 drivers/gpu/drm/panel/panel-samsung-s6e3ha2.c diff --git a/Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt b/Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt new file mode 100644 index 000..1f41f24 --- /dev/null +++ b/Documentation/devicetree/bindings/display/panel/samsung,s6e3ha2.txt @@ -0,0 +1,52 @@ +Samsung S6E3HA2 5.7" 1440x2560 AMOLED panel + +Required properties: + - compatible: "samsung,s6e3ha2" + - reg: the virtual channel number of a DSI peripheral + - vdd3-supply: core voltage supply + - vci-supply: voltage supply for analog circuits + - reset-gpios: a GPIO spec for the reset pin + - enable-gpios: a GPIO spec for the panel enable pin + - te-gpios: a GPIO spec for the tearing effect synchronization signal gpio pin Need to specify the GPIOs as active high or low. Hi Rob Herring, Thanks for point out. I will add "(active high or low)" after the description of the GPIOs. + +Optional properties: + - display-timings: timings for the connected panel as described by [1] + +The device node can contain one 'port' child node with one child +'endpoint' node, according to the bindings defined in [2]. This +node should describe panel's video bus. + +[1]: Documentation/devicetree/bindings/display/panel/display-timing.txt +[2]: Documentation/devicetree/bindings/media/video-interfaces.txt + +Example: + + panel@0 { + compatible = "samsung,s6e3ha2"; + reg = <0>; reg doesn't really work here unless this node is a child of the DSI controller node. But if it is a child node, then you don't need the OF graph. The reg value is used in virtual child panel node so I think the reg value is needed. Please refer to the document below. Documentation/devicetree/bindings/display/mipi-dsi-bus.txt Best Regards, Hoegeun Kwon + vdd3-supply = <_reg>; + vci-supply = <_reg>; + reset-gpios = < 0 GPIO_ACTIVE_HIGH>; + enable-gpios = < 5 GPIO_ACTIVE_HIGH>; + te-gpios = < 3 GPIO_ACTIVE_HIGH>; + + display-timings { + timing-0 { + clock-frequency = <0>; + hactive = <1440>; + vactive = <2560>; + hfront-porch = <1>; + hback-porch = <1>; + hsync-len = <1>; + vfront-porch = <1>; + vback-porch = <15>; + vsync-len = <1>; + }; + }; + + port { + dsi_in: endpoint { + remote-endpoint = <_out>; + }; + }; + }; ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: RE: [PATCH] Input: Change msleep to usleep_range for small msecs
Dear Aniroop Mathur Please see the acknowledgement below. Best regards Albert (Xu) ZHANG BST/ESA3.1 Tel. +86(21)2218-1283 -Original Message- From: Aniroop Mathur [mailto:a.mat...@samsung.com] Sent: Thursday, December 01, 2016 6:34 PM To: ZHANG Xu (BST/ESA3.1); Dmitry Torokhov ; linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org Cc: Aniroop Mathur ; SAMUEL SEQUEIRA ; Rahul Mahale Subject: RE: RE: [PATCH] Input: Change msleep to usleep_range for small msecs Dear Mr. Albert Zhang, Thank you for your confirmation! Yes, I think usleep_range(2000, 2100) is better than usleep_range(2000, 2000) because delta time will allow the kernel to batch the processes who need to wake up around same time and generate single interrupt to wake up all of them. So this would be beneficial from power saving point of view. -- Best Regards, Aniroop Mathur - Original Message - Sender : ZHANG Xu (BST/ESA3.1) Date : 2016-12-01 11:19 (GMT+5:30) Title : RE: [PATCH] Input: Change msleep to usleep_range for small msecs Hello Aniroop Mathur Thank you for your mail. We have used the usleep_range() function in our new product's driver and the verification result is working. Your patch for bma150 is definitely working for sure. Just one question need your answer. To replace the msleep(2), is usleep_range(2000, 2100) better than usleep_range(2000, 2000) ? Best regards Albert (Xu) ZHANG BST/ESA3.1 -Original Message- From: mathur.anir...@gmail.com [mailto:mathur.anir...@gmail.com] On Behalf Of Aniroop Mathur Sent: Tuesday, November 29, 2016 12:36 AM To: ZHANG Xu (BST/ESA3.1) ; Dmitry Torokhov ; linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org Cc: Aniroop Mathur ; s.sam...@samsung.com; r.mah...@samsung.com; Aniroop Mathur Subject: Re: [PATCH] Input: Change msleep to usleep_range for small msecs Hello Mr. Albert Zhang, I am Aniroop Mathur from Samsung R Institute, India. I have submitted one patch as below for review to Linux Open Source. The problem is that we do not have the hardware available with us to test it and we would like to test it before actually applying it. As you are the author of this driver, so I would like to request you if you could help to test this patch or provide us the contact points of individuals who could support to get this patch tested? Thank you! BR, Aniroop Mathur On Thu, Nov 24, 2016 at 9:33 PM, Aniroop Mathur wrote: > msleep(1~20) may not do what the caller intends, and will often sleep longer. > (~20 ms actual sleep for any value given in the 1~20ms range) > This is not the desired behaviour for many cases like device resume time, > device suspend time, device enable time, etc. > Thus, change msleep to usleep_range for precise wakeups. > > Signed-off-by: Aniroop Mathur Acked by: Albert Zhang > --- > drivers/input/misc/bma150.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/input/misc/bma150.c b/drivers/input/misc/bma150.c > index 2124390..1fa8537 100644 > --- a/drivers/input/misc/bma150.c > +++ b/drivers/input/misc/bma150.c > @@ -207,7 +207,7 @@ static int bma150_set_mode(struct bma150_data *bma150, u8 >mode) > return error; > > if (mode == BMA150_MODE_NORMAL) > - msleep(2); > + usleep_range(2000, 2100); > > bma150->mode = mode; > return 0; > @@ -222,7 +222,7 @@ static int bma150_soft_reset(struct bma150_data *bma150) > if (error) > return error; > > - msleep(2); > + usleep_range(2000, 2100); > return 0; > } > > -- > 2.6.2 >
RE: RE: [PATCH] Input: Change msleep to usleep_range for small msecs
Dear Aniroop Mathur Please see the acknowledgement below. Best regards Albert (Xu) ZHANG BST/ESA3.1 Tel. +86(21)2218-1283 -Original Message- From: Aniroop Mathur [mailto:a.mat...@samsung.com] Sent: Thursday, December 01, 2016 6:34 PM To: ZHANG Xu (BST/ESA3.1) ; Dmitry Torokhov ; linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org Cc: Aniroop Mathur ; SAMUEL SEQUEIRA ; Rahul Mahale Subject: RE: RE: [PATCH] Input: Change msleep to usleep_range for small msecs Dear Mr. Albert Zhang, Thank you for your confirmation! Yes, I think usleep_range(2000, 2100) is better than usleep_range(2000, 2000) because delta time will allow the kernel to batch the processes who need to wake up around same time and generate single interrupt to wake up all of them. So this would be beneficial from power saving point of view. -- Best Regards, Aniroop Mathur - Original Message - Sender : ZHANG Xu (BST/ESA3.1) Date : 2016-12-01 11:19 (GMT+5:30) Title : RE: [PATCH] Input: Change msleep to usleep_range for small msecs Hello Aniroop Mathur Thank you for your mail. We have used the usleep_range() function in our new product's driver and the verification result is working. Your patch for bma150 is definitely working for sure. Just one question need your answer. To replace the msleep(2), is usleep_range(2000, 2100) better than usleep_range(2000, 2000) ? Best regards Albert (Xu) ZHANG BST/ESA3.1 -Original Message- From: mathur.anir...@gmail.com [mailto:mathur.anir...@gmail.com] On Behalf Of Aniroop Mathur Sent: Tuesday, November 29, 2016 12:36 AM To: ZHANG Xu (BST/ESA3.1) ; Dmitry Torokhov ; linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org Cc: Aniroop Mathur ; s.sam...@samsung.com; r.mah...@samsung.com; Aniroop Mathur Subject: Re: [PATCH] Input: Change msleep to usleep_range for small msecs Hello Mr. Albert Zhang, I am Aniroop Mathur from Samsung R Institute, India. I have submitted one patch as below for review to Linux Open Source. The problem is that we do not have the hardware available with us to test it and we would like to test it before actually applying it. As you are the author of this driver, so I would like to request you if you could help to test this patch or provide us the contact points of individuals who could support to get this patch tested? Thank you! BR, Aniroop Mathur On Thu, Nov 24, 2016 at 9:33 PM, Aniroop Mathur wrote: > msleep(1~20) may not do what the caller intends, and will often sleep longer. > (~20 ms actual sleep for any value given in the 1~20ms range) > This is not the desired behaviour for many cases like device resume time, > device suspend time, device enable time, etc. > Thus, change msleep to usleep_range for precise wakeups. > > Signed-off-by: Aniroop Mathur Acked by: Albert Zhang > --- > drivers/input/misc/bma150.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/input/misc/bma150.c b/drivers/input/misc/bma150.c > index 2124390..1fa8537 100644 > --- a/drivers/input/misc/bma150.c > +++ b/drivers/input/misc/bma150.c > @@ -207,7 +207,7 @@ static int bma150_set_mode(struct bma150_data *bma150, u8 >mode) > return error; > > if (mode == BMA150_MODE_NORMAL) > - msleep(2); > + usleep_range(2000, 2100); > > bma150->mode = mode; > return 0; > @@ -222,7 +222,7 @@ static int bma150_soft_reset(struct bma150_data *bma150) > if (error) > return error; > > - msleep(2); > + usleep_range(2000, 2100); > return 0; > } > > -- > 2.6.2 >
Re: [PATCH 65/66] hwmon: (fam15h_power) use permission-specific DEVICE_ATTR variants
On Thu, Dec 22, 2016 at 01:05:34PM +0100, Julia Lawall wrote: > Use DEVICE_ATTR_RO etc. for read only attributes etc. This simplifies the > source code, improves readbility, and reduces the chance of > inconsistencies. > > The semantic patch for the RO case, in the case where the show function > already has the expected name, is as follows: > (http://coccinelle.lip6.fr/) > > // > @ro@ > declarer name DEVICE_ATTR; > identifier x,x_show; > @@ > > DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > > @script:ocaml@ > x << ro.x; > x_show << ro.x_show; > @@ > > if not (x^"_show" = x_show) then Coccilib.include_match false > > @@ > declarer name DEVICE_ATTR_RO; > identifier ro.x,ro.x_show; > @@ > > - DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > + DEVICE_ATTR_RO(x); > // > > Signed-off-by: Julia Lawall> Acked-by: Huang Rui Thanks, Rui > --- > drivers/hwmon/fam15h_power.c | 34 -- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c > index 15aa49d..9545a34 100644 > --- a/drivers/hwmon/fam15h_power.c > +++ b/drivers/hwmon/fam15h_power.c > @@ -83,8 +83,8 @@ static bool is_carrizo_or_later(void) > return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; > } > > -static ssize_t show_power(struct device *dev, > - struct device_attribute *attr, char *buf) > +static ssize_t power1_input_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > u32 val, tdp_limit, running_avg_range; > s32 running_avg_capture; > @@ -136,16 +136,16 @@ static ssize_t show_power(struct device *dev, > curr_pwr_watts = (curr_pwr_watts * 15625) >> (10 + running_avg_range); > return sprintf(buf, "%u\n", (unsigned int) curr_pwr_watts); > } > -static DEVICE_ATTR(power1_input, S_IRUGO, show_power, NULL); > +static DEVICE_ATTR_RO(power1_input); > > -static ssize_t show_power_crit(struct device *dev, > -struct device_attribute *attr, char *buf) > +static ssize_t power1_crit_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%u\n", data->processor_pwr_watts); > } > -static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); > +static DEVICE_ATTR_RO(power1_crit); > > static void do_read_registers_on_cu(void *_data) > { > @@ -212,9 +212,8 @@ static int read_registers(struct fam15h_power_data *data) > return 0; > } > > -static ssize_t acc_show_power(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_show(struct device *dev, > +struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], > @@ -267,20 +266,20 @@ static ssize_t acc_show_power(struct device *dev, > > return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); > } > -static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); > +static DEVICE_ATTR_RO(power1_average); > > -static ssize_t acc_show_power_period(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_interval_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%lu\n", data->power_period); > } > > -static ssize_t acc_set_power_period(struct device *dev, > - struct device_attribute *attr, > - const char *buf, size_t count) > +static ssize_t power1_average_interval_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > unsigned long temp; > @@ -301,8 +300,7 @@ static ssize_t acc_set_power_period(struct device *dev, > > return count; > } > -static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR, > -acc_show_power_period, acc_set_power_period); > +static DEVICE_ATTR_RW(power1_average_interval); > > static int fam15h_power_init_attrs(struct pci_dev *pdev, > struct fam15h_power_data *data) >
Re: [PATCH 65/66] hwmon: (fam15h_power) use permission-specific DEVICE_ATTR variants
On Thu, Dec 22, 2016 at 01:05:34PM +0100, Julia Lawall wrote: > Use DEVICE_ATTR_RO etc. for read only attributes etc. This simplifies the > source code, improves readbility, and reduces the chance of > inconsistencies. > > The semantic patch for the RO case, in the case where the show function > already has the expected name, is as follows: > (http://coccinelle.lip6.fr/) > > // > @ro@ > declarer name DEVICE_ATTR; > identifier x,x_show; > @@ > > DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > > @script:ocaml@ > x << ro.x; > x_show << ro.x_show; > @@ > > if not (x^"_show" = x_show) then Coccilib.include_match false > > @@ > declarer name DEVICE_ATTR_RO; > identifier ro.x,ro.x_show; > @@ > > - DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > + DEVICE_ATTR_RO(x); > // > > Signed-off-by: Julia Lawall > Acked-by: Huang Rui Thanks, Rui > --- > drivers/hwmon/fam15h_power.c | 34 -- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c > index 15aa49d..9545a34 100644 > --- a/drivers/hwmon/fam15h_power.c > +++ b/drivers/hwmon/fam15h_power.c > @@ -83,8 +83,8 @@ static bool is_carrizo_or_later(void) > return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; > } > > -static ssize_t show_power(struct device *dev, > - struct device_attribute *attr, char *buf) > +static ssize_t power1_input_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > u32 val, tdp_limit, running_avg_range; > s32 running_avg_capture; > @@ -136,16 +136,16 @@ static ssize_t show_power(struct device *dev, > curr_pwr_watts = (curr_pwr_watts * 15625) >> (10 + running_avg_range); > return sprintf(buf, "%u\n", (unsigned int) curr_pwr_watts); > } > -static DEVICE_ATTR(power1_input, S_IRUGO, show_power, NULL); > +static DEVICE_ATTR_RO(power1_input); > > -static ssize_t show_power_crit(struct device *dev, > -struct device_attribute *attr, char *buf) > +static ssize_t power1_crit_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%u\n", data->processor_pwr_watts); > } > -static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); > +static DEVICE_ATTR_RO(power1_crit); > > static void do_read_registers_on_cu(void *_data) > { > @@ -212,9 +212,8 @@ static int read_registers(struct fam15h_power_data *data) > return 0; > } > > -static ssize_t acc_show_power(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_show(struct device *dev, > +struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], > @@ -267,20 +266,20 @@ static ssize_t acc_show_power(struct device *dev, > > return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); > } > -static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); > +static DEVICE_ATTR_RO(power1_average); > > -static ssize_t acc_show_power_period(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_interval_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%lu\n", data->power_period); > } > > -static ssize_t acc_set_power_period(struct device *dev, > - struct device_attribute *attr, > - const char *buf, size_t count) > +static ssize_t power1_average_interval_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > unsigned long temp; > @@ -301,8 +300,7 @@ static ssize_t acc_set_power_period(struct device *dev, > > return count; > } > -static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR, > -acc_show_power_period, acc_set_power_period); > +static DEVICE_ATTR_RW(power1_average_interval); > > static int fam15h_power_init_attrs(struct pci_dev *pdev, > struct fam15h_power_data *data) >
[RFC PATCHv2 2/4] clk: mdm9615: Add EBI2 clock
Add definition of EBI2 clock used by MDM9615 NAND controller. Cc: Andy GrossCc: David Brown Cc: Michael Turquette Cc: Stephen Boyd Cc: Rob Herring Cc: Mark Rutland Cc: Neil Armstrong Cc: linux-arm-...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: devicet...@vger.kernel.org Signed-off-by: Zoran Markovic --- drivers/clk/qcom/gcc-mdm9615.c | 30 ++ include/dt-bindings/clock/qcom,gcc-mdm9615.h |3 +++ 2 files changed, 33 insertions(+) diff --git a/drivers/clk/qcom/gcc-mdm9615.c b/drivers/clk/qcom/gcc-mdm9615.c index 581a17f..e9e98b1 100644 --- a/drivers/clk/qcom/gcc-mdm9615.c +++ b/drivers/clk/qcom/gcc-mdm9615.c @@ -1563,6 +1563,34 @@ enum { }, }; +static struct clk_branch ebi2_clk = { + .hwcg_reg = 0x2664, + .hwcg_bit = 6, + .halt_reg = 0x2fcc, + .halt_bit = 23, + .clkr = { + .enable_reg = 0x2664, + .enable_mask = BIT(6) | BIT(4), + .hw.init = &(struct clk_init_data){ + .name = "ebi2_clk", + .ops = _branch_ops, + }, + }, +}; + +static struct clk_branch ebi2_aon_clk = { + .halt_reg = 0x2fcc, + .halt_bit = 23, + .clkr = { + .enable_reg = 0x2664, + .enable_mask = BIT(8), + .hw.init = &(struct clk_init_data){ + .name = "ebi2_aon_clk", + .ops = _branch_ops, + }, + }, +}; + static struct clk_hw *gcc_mdm9615_hws[] = { , }; @@ -1637,6 +1665,8 @@ enum { [PMIC_ARB1_H_CLK] = _arb1_h_clk.clkr, [PMIC_SSBI2_CLK] = _ssbi2_clk.clkr, [RPM_MSG_RAM_H_CLK] = _msg_ram_h_clk.clkr, + [EBI2_CLK] = _clk.clkr, + [EBI2_AON_CLK] = _aon_clk.clkr, }; static const struct qcom_reset_map gcc_mdm9615_resets[] = { diff --git a/include/dt-bindings/clock/qcom,gcc-mdm9615.h b/include/dt-bindings/clock/qcom,gcc-mdm9615.h index 9ab2c40..57cdca6 100644 --- a/include/dt-bindings/clock/qcom,gcc-mdm9615.h +++ b/include/dt-bindings/clock/qcom,gcc-mdm9615.h @@ -323,5 +323,8 @@ #define CE3_H_CLK 305 #define USB_HS1_SYSTEM_CLK_SRC 306 #define USB_HS1_SYSTEM_CLK 307 +#define EBI2_CLK 308 +#define EBI2_AON_CLK 309 + #endif -- 1.7.9.5
[RFC PATCHv2 2/4] clk: mdm9615: Add EBI2 clock
Add definition of EBI2 clock used by MDM9615 NAND controller. Cc: Andy Gross Cc: David Brown Cc: Michael Turquette Cc: Stephen Boyd Cc: Rob Herring Cc: Mark Rutland Cc: Neil Armstrong Cc: linux-arm-...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: devicet...@vger.kernel.org Signed-off-by: Zoran Markovic --- drivers/clk/qcom/gcc-mdm9615.c | 30 ++ include/dt-bindings/clock/qcom,gcc-mdm9615.h |3 +++ 2 files changed, 33 insertions(+) diff --git a/drivers/clk/qcom/gcc-mdm9615.c b/drivers/clk/qcom/gcc-mdm9615.c index 581a17f..e9e98b1 100644 --- a/drivers/clk/qcom/gcc-mdm9615.c +++ b/drivers/clk/qcom/gcc-mdm9615.c @@ -1563,6 +1563,34 @@ enum { }, }; +static struct clk_branch ebi2_clk = { + .hwcg_reg = 0x2664, + .hwcg_bit = 6, + .halt_reg = 0x2fcc, + .halt_bit = 23, + .clkr = { + .enable_reg = 0x2664, + .enable_mask = BIT(6) | BIT(4), + .hw.init = &(struct clk_init_data){ + .name = "ebi2_clk", + .ops = _branch_ops, + }, + }, +}; + +static struct clk_branch ebi2_aon_clk = { + .halt_reg = 0x2fcc, + .halt_bit = 23, + .clkr = { + .enable_reg = 0x2664, + .enable_mask = BIT(8), + .hw.init = &(struct clk_init_data){ + .name = "ebi2_aon_clk", + .ops = _branch_ops, + }, + }, +}; + static struct clk_hw *gcc_mdm9615_hws[] = { , }; @@ -1637,6 +1665,8 @@ enum { [PMIC_ARB1_H_CLK] = _arb1_h_clk.clkr, [PMIC_SSBI2_CLK] = _ssbi2_clk.clkr, [RPM_MSG_RAM_H_CLK] = _msg_ram_h_clk.clkr, + [EBI2_CLK] = _clk.clkr, + [EBI2_AON_CLK] = _aon_clk.clkr, }; static const struct qcom_reset_map gcc_mdm9615_resets[] = { diff --git a/include/dt-bindings/clock/qcom,gcc-mdm9615.h b/include/dt-bindings/clock/qcom,gcc-mdm9615.h index 9ab2c40..57cdca6 100644 --- a/include/dt-bindings/clock/qcom,gcc-mdm9615.h +++ b/include/dt-bindings/clock/qcom,gcc-mdm9615.h @@ -323,5 +323,8 @@ #define CE3_H_CLK 305 #define USB_HS1_SYSTEM_CLK_SRC 306 #define USB_HS1_SYSTEM_CLK 307 +#define EBI2_CLK 308 +#define EBI2_AON_CLK 309 + #endif -- 1.7.9.5
Re: [PATCH v3 1/4] dt-bindings: phy: Add support for QUSB2 phy
Hi Rob, On Fri, Dec 23, 2016 at 2:46 AM, Rob Herringwrote: > On Tue, Dec 20, 2016 at 10:33:48PM +0530, Vivek Gautam wrote: >> Qualcomm chipsets have QUSB2 phy controller that provides >> HighSpeed functionality for DWC3 controller. >> Adding dt binding information for the same. >> >> Signed-off-by: Vivek Gautam >> --- >> >> Changes since v2: >> - Removed binding for "ref_clk_src" since we don't request this >>clock in the driver. >> - Addressed s/vdda-phy-dpdm/vdda-phy-dpdm-supply. >> - Addressed s/ref_clk/ref. Don't need to add '_clk' suffix to clock names. >> - Addressed s/tune2_hstx_trim_efuse/tune2_hstx_trim. Don't need to add >>'efuse' suffix to nvmem cell. >> - Addressed s/qusb2phy/phy for the node name. >> >> Changes since v1: >> - New patch, forked out of the original driver patch: >>"phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips" >> - Updated dt bindings to remove 'hstx-trim-bit-offset' and >>'hstx-trim-bit-len' bindings. >> >> .../devicetree/bindings/phy/qcom-qusb2-phy.txt | 53 >> ++ >> 1 file changed, 53 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> >> diff --git a/Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> b/Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> new file mode 100644 >> index ..594f2dcd12dd >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> @@ -0,0 +1,53 @@ >> +Qualcomm QUSB2 phy controller >> += >> + >> +QUSB2 controller supports LS/FS/HS usb connectivity on Qualcomm chipsets. >> + >> +Required properties: >> + - compatible: compatible list, contains "qcom,msm8996-qusb2-phy". >> + - reg: offset and length of the PHY register set. >> + - #phy-cells: must be 0. >> + >> + - clocks: a list of phandles and clock-specifier pairs, >> +one for each entry in clock-names. >> + - clock-names: must be "cfg_ahb" for phy config clock, >> + "ref" for 19.2 MHz ref clk, >> + "iface" for phy interface clock (Optional). >> + >> + - vdd-phy-supply: Phandle to a regulator supply to PHY core block. >> + - vdda-pll-supply: Phandle to 1.8V regulator supply to PHY refclk pll >> block. >> + - vdda-phy-dpdm-supply: Phandle to 3.1V regulator supply to Dp/Dm port >> signals. >> + >> + - resets: a list of phandles and reset controller specifier pairs, >> +one for each entry in reset-names. >> + - reset-names: must be "phy" for reset of phy block. > > -names is pointless when only one. Sure, will drop the -names property, and get the reset control by index. > >> + >> +Optional properties: >> + - nvmem-cells: a list of phandles to nvmem cells that contain fused >> + tuning parameters for qusb2 phy, one for each entry >> + in nvmem-cell-names. >> + - nvmem-cell-names: must be "tune2_hstx_trim" for cell containing >> + HS Tx trim value. > > ditto. nvmem doesn't allow, at this point, to get the cells by index. Its APIs take 'const char' cell id and get the cell. We should add this support to get the cell by index. Will create a patch for that, and drop the '-names' property from bindings. > > With those dropped, > > Acked-by: Rob Herring Thanks for the Ack. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v3 1/4] dt-bindings: phy: Add support for QUSB2 phy
Hi Rob, On Fri, Dec 23, 2016 at 2:46 AM, Rob Herring wrote: > On Tue, Dec 20, 2016 at 10:33:48PM +0530, Vivek Gautam wrote: >> Qualcomm chipsets have QUSB2 phy controller that provides >> HighSpeed functionality for DWC3 controller. >> Adding dt binding information for the same. >> >> Signed-off-by: Vivek Gautam >> --- >> >> Changes since v2: >> - Removed binding for "ref_clk_src" since we don't request this >>clock in the driver. >> - Addressed s/vdda-phy-dpdm/vdda-phy-dpdm-supply. >> - Addressed s/ref_clk/ref. Don't need to add '_clk' suffix to clock names. >> - Addressed s/tune2_hstx_trim_efuse/tune2_hstx_trim. Don't need to add >>'efuse' suffix to nvmem cell. >> - Addressed s/qusb2phy/phy for the node name. >> >> Changes since v1: >> - New patch, forked out of the original driver patch: >>"phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips" >> - Updated dt bindings to remove 'hstx-trim-bit-offset' and >>'hstx-trim-bit-len' bindings. >> >> .../devicetree/bindings/phy/qcom-qusb2-phy.txt | 53 >> ++ >> 1 file changed, 53 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> >> diff --git a/Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> b/Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> new file mode 100644 >> index ..594f2dcd12dd >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/phy/qcom-qusb2-phy.txt >> @@ -0,0 +1,53 @@ >> +Qualcomm QUSB2 phy controller >> += >> + >> +QUSB2 controller supports LS/FS/HS usb connectivity on Qualcomm chipsets. >> + >> +Required properties: >> + - compatible: compatible list, contains "qcom,msm8996-qusb2-phy". >> + - reg: offset and length of the PHY register set. >> + - #phy-cells: must be 0. >> + >> + - clocks: a list of phandles and clock-specifier pairs, >> +one for each entry in clock-names. >> + - clock-names: must be "cfg_ahb" for phy config clock, >> + "ref" for 19.2 MHz ref clk, >> + "iface" for phy interface clock (Optional). >> + >> + - vdd-phy-supply: Phandle to a regulator supply to PHY core block. >> + - vdda-pll-supply: Phandle to 1.8V regulator supply to PHY refclk pll >> block. >> + - vdda-phy-dpdm-supply: Phandle to 3.1V regulator supply to Dp/Dm port >> signals. >> + >> + - resets: a list of phandles and reset controller specifier pairs, >> +one for each entry in reset-names. >> + - reset-names: must be "phy" for reset of phy block. > > -names is pointless when only one. Sure, will drop the -names property, and get the reset control by index. > >> + >> +Optional properties: >> + - nvmem-cells: a list of phandles to nvmem cells that contain fused >> + tuning parameters for qusb2 phy, one for each entry >> + in nvmem-cell-names. >> + - nvmem-cell-names: must be "tune2_hstx_trim" for cell containing >> + HS Tx trim value. > > ditto. nvmem doesn't allow, at this point, to get the cells by index. Its APIs take 'const char' cell id and get the cell. We should add this support to get the cell by index. Will create a patch for that, and drop the '-names' property from bindings. > > With those dropped, > > Acked-by: Rob Herring Thanks for the Ack. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH] tools build: replace $(CC) -E with $(CPP) for pre-processing
The top-level Makefile defines: CPP = $(CC) -E So, $(CC) -E can be replaced with $(CPP) and this makes more sense for pre-processing. Signed-off-by: Masahiro Yamada--- tools/build/Makefile.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build index 99c0ccd..4a0ec5c 100644 --- a/tools/build/Makefile.build +++ b/tools/build/Makefile.build @@ -65,7 +65,7 @@ quiet_cmd_cxx_o_c = CXX $@ cmd_cxx_o_c = $(CXX) $(cxx_flags) -c -o $@ $< quiet_cmd_cpp_i_c = CPP $@ - cmd_cpp_i_c = $(CC) $(c_flags) -E -o $@ $< + cmd_cpp_i_c = $(CPP) $(c_flags) -o $@ $< quiet_cmd_cc_s_c = AS $@ cmd_cc_s_c = $(CC) $(c_flags) -S -o $@ $< -- 2.7.4
[PATCH] tools build: replace $(CC) -E with $(CPP) for pre-processing
The top-level Makefile defines: CPP = $(CC) -E So, $(CC) -E can be replaced with $(CPP) and this makes more sense for pre-processing. Signed-off-by: Masahiro Yamada --- tools/build/Makefile.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build index 99c0ccd..4a0ec5c 100644 --- a/tools/build/Makefile.build +++ b/tools/build/Makefile.build @@ -65,7 +65,7 @@ quiet_cmd_cxx_o_c = CXX $@ cmd_cxx_o_c = $(CXX) $(cxx_flags) -c -o $@ $< quiet_cmd_cpp_i_c = CPP $@ - cmd_cpp_i_c = $(CC) $(c_flags) -E -o $@ $< + cmd_cpp_i_c = $(CPP) $(c_flags) -o $@ $< quiet_cmd_cc_s_c = AS $@ cmd_cc_s_c = $(CC) $(c_flags) -S -o $@ $< -- 2.7.4
[PATCH] genirq: use simple work queue for polling spurious irq
Spurious irq will be disabled and then polled through mod_timer in interrupt context. While mod_timer has a common spin_lock operation thus will cause an error: BUG: sleeping function called from invalid context. Move the mod_timer into simple work queue to fix it. Signed-off-by: Zhang Xiao--- kernel/irq/spurious.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c index 903a69c..0ce2a38 100644 --- a/kernel/irq/spurious.c +++ b/kernel/irq/spurious.c @@ -24,6 +24,36 @@ static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs, 0, 0); static int irq_poll_cpu; static atomic_t irq_poll_active; +#ifdef CONFIG_PREEMPT_RT_FULL +#include + +static bool mod_timer_work_ready __read_mostly; +static struct swork_event mod_timer_work; + +static void __mod_timer_work(struct swork_event *event) +{ + mod_timer(_spurious_irq_timer, + jiffies + POLL_SPURIOUS_IRQ_INTERVAL); +} + +static int __init mod_timer_work_init(void) +{ + int err; + + err = swork_get(); + if(err) { + printk(KERN_ERR "mod_timer work init failed.\n"); + return err; + } + + INIT_SWORK(_timer_work, __mod_timer_work); + mod_timer_work_ready = true; + return 0; +} + +early_initcall(mod_timer_work_init); +#endif /* CONFIG_PREEMPT_RT_FULL */ + /* * We wait here for a poller to finish. * @@ -422,8 +452,17 @@ void note_interrupt(unsigned int irq, struct irq_desc *desc, desc->depth++; irq_disable(desc); +#ifdef CONFIG_PREEMPT_RT_FULL + /* +* Avoid calling mod_timer directly in interrupt +* contex here. Let simple workqueue do it. +*/ + if(mod_timer_work_ready) + swork_queue(_timer_work); +#else mod_timer(_spurious_irq_timer, jiffies + POLL_SPURIOUS_IRQ_INTERVAL); +#endif /* CONFIG_PREEMPT_RT_FULL */ } desc->irqs_unhandled = 0; } -- 1.9.1
[PATCH] genirq: use simple work queue for polling spurious irq
Spurious irq will be disabled and then polled through mod_timer in interrupt context. While mod_timer has a common spin_lock operation thus will cause an error: BUG: sleeping function called from invalid context. Move the mod_timer into simple work queue to fix it. Signed-off-by: Zhang Xiao --- kernel/irq/spurious.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c index 903a69c..0ce2a38 100644 --- a/kernel/irq/spurious.c +++ b/kernel/irq/spurious.c @@ -24,6 +24,36 @@ static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs, 0, 0); static int irq_poll_cpu; static atomic_t irq_poll_active; +#ifdef CONFIG_PREEMPT_RT_FULL +#include + +static bool mod_timer_work_ready __read_mostly; +static struct swork_event mod_timer_work; + +static void __mod_timer_work(struct swork_event *event) +{ + mod_timer(_spurious_irq_timer, + jiffies + POLL_SPURIOUS_IRQ_INTERVAL); +} + +static int __init mod_timer_work_init(void) +{ + int err; + + err = swork_get(); + if(err) { + printk(KERN_ERR "mod_timer work init failed.\n"); + return err; + } + + INIT_SWORK(_timer_work, __mod_timer_work); + mod_timer_work_ready = true; + return 0; +} + +early_initcall(mod_timer_work_init); +#endif /* CONFIG_PREEMPT_RT_FULL */ + /* * We wait here for a poller to finish. * @@ -422,8 +452,17 @@ void note_interrupt(unsigned int irq, struct irq_desc *desc, desc->depth++; irq_disable(desc); +#ifdef CONFIG_PREEMPT_RT_FULL + /* +* Avoid calling mod_timer directly in interrupt +* contex here. Let simple workqueue do it. +*/ + if(mod_timer_work_ready) + swork_queue(_timer_work); +#else mod_timer(_spurious_irq_timer, jiffies + POLL_SPURIOUS_IRQ_INTERVAL); +#endif /* CONFIG_PREEMPT_RT_FULL */ } desc->irqs_unhandled = 0; } -- 1.9.1
Re: [PATCH] cpufreq: powernv: Add boost files to export ultra-turbo frequencies
Hi Shilpa, On Fri, Dec 16, 2016 at 04:43:08PM +0530, Shilpasri G Bhat wrote: > In P8+, Workload Optimized Frequency(WOF) provides the capability to > boost the cpu frequency based on the utilization of the other cpus > running in the chip. The On-Chip-Controller(OCC) firmware will control > the achievability of these frequencies depending on the power headroom > available in the chip. Currently the ultra-turbo frequencies provided > by this feature are exported along with the turbo and sub-turbo > frequencies as scaling_available_frequencies. This patch will export > the ultra-turbo frequencies separately as scaling_boost_frequencies in > WOF enabled systems. This patch will add the boost sysfs file which > can be used to disable/enable ultra-turbo frequencies. > > Signed-off-by: Shilpasri G Bhat> --- > drivers/cpufreq/powernv-cpufreq.c | 48 > --- > 1 file changed, 45 insertions(+), 3 deletions(-) > > diff --git a/drivers/cpufreq/powernv-cpufreq.c > b/drivers/cpufreq/powernv-cpufreq.c > index 37671b5..56dfd91 100644 > --- a/drivers/cpufreq/powernv-cpufreq.c > +++ b/drivers/cpufreq/powernv-cpufreq.c > @@ -144,6 +144,7 @@ enum throttle_reason_type { > unsigned int max; > unsigned int nominal; > unsigned int nr_pstates; > + bool wof_enabled; > } powernv_pstate_info; > > /* Use following macros for conversions between pstate_id and index */ > @@ -203,6 +204,7 @@ static int init_powernv_pstates(void) > const __be32 *pstate_ids, *pstate_freqs; > u32 len_ids, len_freqs; > u32 pstate_min, pstate_max, pstate_nominal; > + u32 pstate_turbo, pstate_ultra_turbo; > > power_mgt = of_find_node_by_path("/ibm,opal/power-mgt"); > if (!power_mgt) { > @@ -225,6 +227,25 @@ static int init_powernv_pstates(void) > pr_warn("ibm,pstate-nominal not found\n"); > return -ENODEV; > } > + > + if (of_property_read_u32(power_mgt, "ibm,pstate-ultra-turbo", > + _ultra_turbo)) { > + powernv_pstate_info.wof_enabled = false; > + goto next; > + } > + > + if (of_property_read_u32(power_mgt, "ibm,pstate-turbo", > + _turbo)) { > + powernv_pstate_info.wof_enabled = false; > + goto next; > + } > + > + if (pstate_turbo == pstate_ultra_turbo) > + powernv_pstate_info.wof_enabled = false; > + else > + powernv_pstate_info.wof_enabled = true; > + > +next: > pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min, > pstate_nominal, pstate_max); Could you also print if ultra_turbo is enabled ? > > @@ -268,6 +289,13 @@ static int init_powernv_pstates(void) > powernv_pstate_info.nominal = i; > else if (id == pstate_min) > powernv_pstate_info.min = i; > + > + if (powernv_pstate_info.wof_enabled && id == pstate_turbo) { powernv_pstate_info.wof_enabled check is not required since we will bail out of the loop below as j < powernv_pstate_info.max in case when turbo = ultra-turbo or if ultra-turbo is not defined. That said, it makes the code more readable so let us keep it. > + int j; > + > + for (j = i - 1; j >= (int)powernv_pstate_info.max; j--) > + powernv_freqs[j].flags = CPUFREQ_BOOST_FREQ; > + } > > /* End of list marker entry */ > @@ -305,9 +333,12 @@ static ssize_t cpuinfo_nominal_freq_show(struct > cpufreq_policy *policy, > struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = > __ATTR_RO(cpuinfo_nominal_freq); > > +#define SCALING_BOOST_FREQS_ATTR_INDEX 2 > + > static struct freq_attr *powernv_cpu_freq_attr[] = { > _freq_attr_scaling_available_freqs, > _freq_attr_cpuinfo_nominal_freq, > + _freq_attr_scaling_boost_freqs, > NULL, > }; > > @@ -1013,11 +1044,22 @@ static int __init powernv_cpufreq_init(void) > register_reboot_notifier(_cpufreq_reboot_nb); > opal_message_notifier_register(OPAL_MSG_OCC, _cpufreq_opal_nb); > > + if (powernv_pstate_info.wof_enabled) > + powernv_cpufreq_driver.boost_enabled = true; > + else > + powernv_cpu_freq_attr[SCALING_BOOST_FREQS_ATTR_INDEX] = NULL; > + > rc = cpufreq_register_driver(_cpufreq_driver); > - if (!rc) > - return 0; > + if (rc) { > + pr_info("Failed to register the cpufreq driver (%d)\n", rc); > + goto clean_notifiers; cleanup_notifiers ? > + } > > - pr_info("Failed to register the cpufreq driver (%d)\n", rc); > + if (powernv_pstate_info.wof_enabled) > + cpufreq_enable_boost_support(); > + > + return 0; > +clean_notifiers: > unregister_all_notifiers(); > clean_chip_info(); > out: > -- > 1.8.3.1 > Looks good
Re: [PATCH] cpufreq: powernv: Add boost files to export ultra-turbo frequencies
Hi Shilpa, On Fri, Dec 16, 2016 at 04:43:08PM +0530, Shilpasri G Bhat wrote: > In P8+, Workload Optimized Frequency(WOF) provides the capability to > boost the cpu frequency based on the utilization of the other cpus > running in the chip. The On-Chip-Controller(OCC) firmware will control > the achievability of these frequencies depending on the power headroom > available in the chip. Currently the ultra-turbo frequencies provided > by this feature are exported along with the turbo and sub-turbo > frequencies as scaling_available_frequencies. This patch will export > the ultra-turbo frequencies separately as scaling_boost_frequencies in > WOF enabled systems. This patch will add the boost sysfs file which > can be used to disable/enable ultra-turbo frequencies. > > Signed-off-by: Shilpasri G Bhat > --- > drivers/cpufreq/powernv-cpufreq.c | 48 > --- > 1 file changed, 45 insertions(+), 3 deletions(-) > > diff --git a/drivers/cpufreq/powernv-cpufreq.c > b/drivers/cpufreq/powernv-cpufreq.c > index 37671b5..56dfd91 100644 > --- a/drivers/cpufreq/powernv-cpufreq.c > +++ b/drivers/cpufreq/powernv-cpufreq.c > @@ -144,6 +144,7 @@ enum throttle_reason_type { > unsigned int max; > unsigned int nominal; > unsigned int nr_pstates; > + bool wof_enabled; > } powernv_pstate_info; > > /* Use following macros for conversions between pstate_id and index */ > @@ -203,6 +204,7 @@ static int init_powernv_pstates(void) > const __be32 *pstate_ids, *pstate_freqs; > u32 len_ids, len_freqs; > u32 pstate_min, pstate_max, pstate_nominal; > + u32 pstate_turbo, pstate_ultra_turbo; > > power_mgt = of_find_node_by_path("/ibm,opal/power-mgt"); > if (!power_mgt) { > @@ -225,6 +227,25 @@ static int init_powernv_pstates(void) > pr_warn("ibm,pstate-nominal not found\n"); > return -ENODEV; > } > + > + if (of_property_read_u32(power_mgt, "ibm,pstate-ultra-turbo", > + _ultra_turbo)) { > + powernv_pstate_info.wof_enabled = false; > + goto next; > + } > + > + if (of_property_read_u32(power_mgt, "ibm,pstate-turbo", > + _turbo)) { > + powernv_pstate_info.wof_enabled = false; > + goto next; > + } > + > + if (pstate_turbo == pstate_ultra_turbo) > + powernv_pstate_info.wof_enabled = false; > + else > + powernv_pstate_info.wof_enabled = true; > + > +next: > pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min, > pstate_nominal, pstate_max); Could you also print if ultra_turbo is enabled ? > > @@ -268,6 +289,13 @@ static int init_powernv_pstates(void) > powernv_pstate_info.nominal = i; > else if (id == pstate_min) > powernv_pstate_info.min = i; > + > + if (powernv_pstate_info.wof_enabled && id == pstate_turbo) { powernv_pstate_info.wof_enabled check is not required since we will bail out of the loop below as j < powernv_pstate_info.max in case when turbo = ultra-turbo or if ultra-turbo is not defined. That said, it makes the code more readable so let us keep it. > + int j; > + > + for (j = i - 1; j >= (int)powernv_pstate_info.max; j--) > + powernv_freqs[j].flags = CPUFREQ_BOOST_FREQ; > + } > > /* End of list marker entry */ > @@ -305,9 +333,12 @@ static ssize_t cpuinfo_nominal_freq_show(struct > cpufreq_policy *policy, > struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = > __ATTR_RO(cpuinfo_nominal_freq); > > +#define SCALING_BOOST_FREQS_ATTR_INDEX 2 > + > static struct freq_attr *powernv_cpu_freq_attr[] = { > _freq_attr_scaling_available_freqs, > _freq_attr_cpuinfo_nominal_freq, > + _freq_attr_scaling_boost_freqs, > NULL, > }; > > @@ -1013,11 +1044,22 @@ static int __init powernv_cpufreq_init(void) > register_reboot_notifier(_cpufreq_reboot_nb); > opal_message_notifier_register(OPAL_MSG_OCC, _cpufreq_opal_nb); > > + if (powernv_pstate_info.wof_enabled) > + powernv_cpufreq_driver.boost_enabled = true; > + else > + powernv_cpu_freq_attr[SCALING_BOOST_FREQS_ATTR_INDEX] = NULL; > + > rc = cpufreq_register_driver(_cpufreq_driver); > - if (!rc) > - return 0; > + if (rc) { > + pr_info("Failed to register the cpufreq driver (%d)\n", rc); > + goto clean_notifiers; cleanup_notifiers ? > + } > > - pr_info("Failed to register the cpufreq driver (%d)\n", rc); > + if (powernv_pstate_info.wof_enabled) > + cpufreq_enable_boost_support(); > + > + return 0; > +clean_notifiers: > unregister_all_notifiers(); > clean_chip_info(); > out: > -- > 1.8.3.1 > Looks good otherwise. Reviewed-by: Gautham R.
Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0
On Fri, Dec 23, 2016 at 09:33:36AM +1100, Dave Chinner wrote: > On Fri, Dec 23, 2016 at 09:15:00AM +1100, Dave Chinner wrote: > > On Thu, Dec 22, 2016 at 01:10:19PM -0800, Linus Torvalds wrote: > > > Ok, so the numa issue was a red herring. With that fixed: > > > > > > On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinnerwrote: > > > > > > > > Better, but still bad. average files/s is not up to 200k files/s, > > > > so still a good 10-15% off where it should be. xfs_repair is back > > > > down to 10-15% off where it should be, too. bulkstat still fires off > > > > a bad page reference count warning, iscsi still panics immediately. > > > > > > Do you have CONFIG_BLK_WBT enabled, perhaps? > > > > Ok, yes, that's enabled. Let me go turn it off and see what happens. > > Numbers are still all over the place. > > FSUse%Count SizeFiles/sec App Overhead > . > 0 28800 228175.5 21928812 > 0 30400 167880.5 39628229 > 0 32000 124289.5 41420925 > 0 33600 150577.9 35382318 > 0 35200 216535.4 16072628 > 0 36800 233414.4 11846654 > 0 38400 213812.0 13356633 > 0 40000 175905.7 53012015 > 0 41600 157028.7 34700794 > 0 43200 138829.1 50282461 > > And the average is now back down to 185k files/s. repair runtime is > unchanged and still 10-15% off... > > I've got to run away for a few hours right now, but I'll retest the > 4.9 + xfs for-next branch when I get back to see if the problem is > my curent config or whether there really is a perf problem lurking > somewhere Well, I'm not sure now. Taking that config back to 4.9 gave results of 210k files/s. A bit faster, but still not the 230k files/s I'm expecting. So I'm missing something in the .config at this point, though I'm not sure what. FWIW, updating from 4.9 to to the 4.10 tree, this happened: $ $ make oldconfig; make -j 32 scripts/kconfig/conf --oldconfig Kconfig * * Restart config... * * * General setup * Yup, it definitely did something unexpected. And almost silently, I might add - I didn't notice this the first time around, and wouldn't hav enoticed it this time if I wasn't looking for something strange to happen. As iit is, I still haven't found what magic config option is taking away that 10-15% of performance. There's no unexpected debug options set, and it's nothing obvious in the fs/block layer config. I'll keep looking for the moment... Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0
On Fri, Dec 23, 2016 at 09:33:36AM +1100, Dave Chinner wrote: > On Fri, Dec 23, 2016 at 09:15:00AM +1100, Dave Chinner wrote: > > On Thu, Dec 22, 2016 at 01:10:19PM -0800, Linus Torvalds wrote: > > > Ok, so the numa issue was a red herring. With that fixed: > > > > > > On Thu, Dec 22, 2016 at 1:06 PM, Dave Chinner wrote: > > > > > > > > Better, but still bad. average files/s is not up to 200k files/s, > > > > so still a good 10-15% off where it should be. xfs_repair is back > > > > down to 10-15% off where it should be, too. bulkstat still fires off > > > > a bad page reference count warning, iscsi still panics immediately. > > > > > > Do you have CONFIG_BLK_WBT enabled, perhaps? > > > > Ok, yes, that's enabled. Let me go turn it off and see what happens. > > Numbers are still all over the place. > > FSUse%Count SizeFiles/sec App Overhead > . > 0 28800 228175.5 21928812 > 0 30400 167880.5 39628229 > 0 32000 124289.5 41420925 > 0 33600 150577.9 35382318 > 0 35200 216535.4 16072628 > 0 36800 233414.4 11846654 > 0 38400 213812.0 13356633 > 0 40000 175905.7 53012015 > 0 41600 157028.7 34700794 > 0 43200 138829.1 50282461 > > And the average is now back down to 185k files/s. repair runtime is > unchanged and still 10-15% off... > > I've got to run away for a few hours right now, but I'll retest the > 4.9 + xfs for-next branch when I get back to see if the problem is > my curent config or whether there really is a perf problem lurking > somewhere Well, I'm not sure now. Taking that config back to 4.9 gave results of 210k files/s. A bit faster, but still not the 230k files/s I'm expecting. So I'm missing something in the .config at this point, though I'm not sure what. FWIW, updating from 4.9 to to the 4.10 tree, this happened: $ $ make oldconfig; make -j 32 scripts/kconfig/conf --oldconfig Kconfig * * Restart config... * * * General setup * Yup, it definitely did something unexpected. And almost silently, I might add - I didn't notice this the first time around, and wouldn't hav enoticed it this time if I wasn't looking for something strange to happen. As iit is, I still haven't found what magic config option is taking away that 10-15% of performance. There's no unexpected debug options set, and it's nothing obvious in the fs/block layer config. I'll keep looking for the moment... Cheers, Dave. -- Dave Chinner da...@fromorbit.com
[PATCH v1 0/3] soc: rockchip: power-domain: support RK3328 SoC
Elaine Zhang (3): dt/bindings: power: add RK3328 SoCs header for idle-request dt-bindings: add binding for rk3328 power domains soc: rockchip: power-domain: Modify power domain driver for rk3328 .../bindings/soc/rockchip/power_domain.txt | 3 ++ drivers/soc/rockchip/pm_domains.c | 63 +++--- include/dt-bindings/power/rk3328-power.h | 18 +++ 3 files changed, 78 insertions(+), 6 deletions(-) create mode 100644 include/dt-bindings/power/rk3328-power.h -- 1.9.1
[PATCH v1 2/3] dt-bindings: add binding for rk3328 power domains
Add binding documentation for the power domains found on Rockchip RK3328 SoCs. But RK3328 SoC just support idle, not support pd. Signed-off-by: Elaine Zhang--- Documentation/devicetree/bindings/soc/rockchip/power_domain.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt index f909ce06afc4..01bfb6745fbd 100644 --- a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt +++ b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt @@ -6,6 +6,7 @@ powered up/down by software based on different application scenes to save power. Required properties for power domain controller: - compatible: Should be one of the following. "rockchip,rk3288-power-controller" - for RK3288 SoCs. + "rockchip,rk3328-power-controller" - for RK3328 SoCs. "rockchip,rk3368-power-controller" - for RK3368 SoCs. "rockchip,rk3399-power-controller" - for RK3399 SoCs. - #power-domain-cells: Number of cells in a power-domain specifier. @@ -16,6 +17,7 @@ Required properties for power domain controller: Required properties for power domain sub nodes: - reg: index of the power domain, should use macros in: "include/dt-bindings/power/rk3288-power.h" - for RK3288 type power domain. + "include/dt-bindings/power/rk3328-power.h" - for RK3328 type power domain. "include/dt-bindings/power/rk3368-power.h" - for RK3368 type power domain. "include/dt-bindings/power/rk3399-power.h" - for RK3399 type power domain. - clocks (optional): phandles to clocks which need to be enabled while power domain @@ -90,6 +92,7 @@ containing a phandle to the power device node and an index specifying which power domain to use. The index should use macros in: "include/dt-bindings/power/rk3288-power.h" - for rk3288 type power domain. + "include/dt-bindings/power/rk3328-power.h" - for rk3328 type power domain. "include/dt-bindings/power/rk3368-power.h" - for rk3368 type power domain. "include/dt-bindings/power/rk3399-power.h" - for rk3399 type power domain. -- 1.9.1
[PATCH v1 0/3] soc: rockchip: power-domain: support RK3328 SoC
Elaine Zhang (3): dt/bindings: power: add RK3328 SoCs header for idle-request dt-bindings: add binding for rk3328 power domains soc: rockchip: power-domain: Modify power domain driver for rk3328 .../bindings/soc/rockchip/power_domain.txt | 3 ++ drivers/soc/rockchip/pm_domains.c | 63 +++--- include/dt-bindings/power/rk3328-power.h | 18 +++ 3 files changed, 78 insertions(+), 6 deletions(-) create mode 100644 include/dt-bindings/power/rk3328-power.h -- 1.9.1
[PATCH v1 2/3] dt-bindings: add binding for rk3328 power domains
Add binding documentation for the power domains found on Rockchip RK3328 SoCs. But RK3328 SoC just support idle, not support pd. Signed-off-by: Elaine Zhang --- Documentation/devicetree/bindings/soc/rockchip/power_domain.txt | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt index f909ce06afc4..01bfb6745fbd 100644 --- a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt +++ b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt @@ -6,6 +6,7 @@ powered up/down by software based on different application scenes to save power. Required properties for power domain controller: - compatible: Should be one of the following. "rockchip,rk3288-power-controller" - for RK3288 SoCs. + "rockchip,rk3328-power-controller" - for RK3328 SoCs. "rockchip,rk3368-power-controller" - for RK3368 SoCs. "rockchip,rk3399-power-controller" - for RK3399 SoCs. - #power-domain-cells: Number of cells in a power-domain specifier. @@ -16,6 +17,7 @@ Required properties for power domain controller: Required properties for power domain sub nodes: - reg: index of the power domain, should use macros in: "include/dt-bindings/power/rk3288-power.h" - for RK3288 type power domain. + "include/dt-bindings/power/rk3328-power.h" - for RK3328 type power domain. "include/dt-bindings/power/rk3368-power.h" - for RK3368 type power domain. "include/dt-bindings/power/rk3399-power.h" - for RK3399 type power domain. - clocks (optional): phandles to clocks which need to be enabled while power domain @@ -90,6 +92,7 @@ containing a phandle to the power device node and an index specifying which power domain to use. The index should use macros in: "include/dt-bindings/power/rk3288-power.h" - for rk3288 type power domain. + "include/dt-bindings/power/rk3328-power.h" - for rk3328 type power domain. "include/dt-bindings/power/rk3368-power.h" - for rk3368 type power domain. "include/dt-bindings/power/rk3399-power.h" - for rk3399 type power domain. -- 1.9.1
[PATCH v1 3/3] soc: rockchip: power-domain: Modify power domain driver for rk3328
This driver is modified to support RK3328 SoC. RK3328 SoC is only support idle. add DOMAIN_M type, for support regs have write_enable bit. Signed-off-by: Elaine Zhang--- drivers/soc/rockchip/pm_domains.c | 63 +++ 1 file changed, 57 insertions(+), 6 deletions(-) diff --git a/drivers/soc/rockchip/pm_domains.c b/drivers/soc/rockchip/pm_domains.c index 1c78c42416c6..796c46a6cbe7 100644 --- a/drivers/soc/rockchip/pm_domains.c +++ b/drivers/soc/rockchip/pm_domains.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -29,6 +30,8 @@ struct rockchip_domain_info { int idle_mask; int ack_mask; bool active_wakeup; + int pwr_w_mask; + int req_w_mask; }; struct rockchip_pmu_info { @@ -87,9 +90,24 @@ struct rockchip_pmu { .active_wakeup = wakeup,\ } +#define DOMAIN_M(pwr, status, req, idle, ack, wakeup) \ +{ \ + .pwr_w_mask = (pwr >= 0) ? BIT(pwr + 16) : 0, \ + .pwr_mask = (pwr >= 0) ? BIT(pwr) : 0, \ + .status_mask = (status >= 0) ? BIT(status) : 0, \ + .req_w_mask = (req >= 0) ? BIT(req + 16) : 0, \ + .req_mask = (req >= 0) ? BIT(req) : 0, \ + .idle_mask = (idle >= 0) ? BIT(idle) : 0, \ + .ack_mask = (ack >= 0) ? BIT(ack) : 0, \ + .active_wakeup = wakeup,\ +} + #define DOMAIN_RK3288(pwr, status, req, wakeup)\ DOMAIN(pwr, status, req, req, (req) + 16, wakeup) +#define DOMAIN_RK3328(pwr, status, req, wakeup)\ + DOMAIN_M(pwr, pwr, req, (req) + 10, req, wakeup) + #define DOMAIN_RK3368(pwr, status, req, wakeup)\ DOMAIN(pwr, status, req, (req) + 16, req, wakeup) @@ -127,9 +145,13 @@ static int rockchip_pmu_set_idle_request(struct rockchip_pm_domain *pd, if (pd_info->req_mask == 0) return 0; - - regmap_update_bits(pmu->regmap, pmu->info->req_offset, - pd_info->req_mask, idle ? -1U : 0); + else if (pd_info->req_w_mask) + regmap_write(pmu->regmap, pmu->info->req_offset, +idle ? (pd_info->req_mask | pd_info->req_w_mask) : +pd_info->req_w_mask); + else + regmap_update_bits(pmu->regmap, pmu->info->req_offset, + pd_info->req_mask, idle ? -1U : 0); dsb(sy); @@ -230,9 +252,13 @@ static void rockchip_do_pmu_set_power_domain(struct rockchip_pm_domain *pd, if (pd->info->pwr_mask == 0) return; - - regmap_update_bits(pmu->regmap, pmu->info->pwr_offset, - pd->info->pwr_mask, on ? 0 : -1U); + else if (pd->info->pwr_w_mask) + regmap_write(pmu->regmap, pmu->info->pwr_offset, +on ? pd->info->pwr_mask : +(pd->info->pwr_mask | pd->info->pwr_w_mask)); + else + regmap_update_bits(pmu->regmap, pmu->info->pwr_offset, + pd->info->pwr_mask, on ? 0 : -1U); dsb(sy); @@ -692,6 +718,18 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev) [RK3288_PD_GPU] = DOMAIN_RK3288(9, 9, 2, false), }; +static const struct rockchip_domain_info rk3328_pm_domains[] = { + [RK3328_PD_CORE]= DOMAIN_RK3328(-1, 0, 0, false), + [RK3328_PD_GPU] = DOMAIN_RK3328(-1, 1, 1, false), + [RK3328_PD_BUS] = DOMAIN_RK3328(-1, 2, 2, true), + [RK3328_PD_MSCH]= DOMAIN_RK3328(-1, 3, 3, true), + [RK3328_PD_PERI]= DOMAIN_RK3328(-1, 4, 4, true), + [RK3328_PD_VIDEO] = DOMAIN_RK3328(-1, 5, 5, false), + [RK3328_PD_HEVC]= DOMAIN_RK3328(-1, 6, 6, false), + [RK3328_PD_VIO] = DOMAIN_RK3328(-1, 8, 8, false), + [RK3328_PD_VPU] = DOMAIN_RK3328(-1, 9, 9, false), +}; + static const struct rockchip_domain_info rk3368_pm_domains[] = { [RK3368_PD_PERI]= DOMAIN_RK3368(13, 12, 6, true), [RK3368_PD_VIO] = DOMAIN_RK3368(15, 14, 8, false), @@ -747,6 +785,15 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev) .domain_info = rk3288_pm_domains, }; +static const struct rockchip_pmu_info rk3328_pmu = { + .req_offset = 0x414, + .idle_offset = 0x484, + .ack_offset = 0x484, + + .num_domains = ARRAY_SIZE(rk3328_pm_domains), + .domain_info = rk3328_pm_domains, +}; + static const struct rockchip_pmu_info rk3368_pmu = { .pwr_offset = 0x0c, .status_offset = 0x10, @@ -783,6 +830,10 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev) .data = (void *)_pmu, }, { + .compatible =
[PATCH v1 1/3] dt/bindings: power: add RK3328 SoCs header for idle-request
According to a description from TRM, add all the idle request. Signed-off-by: Elaine Zhang--- include/dt-bindings/power/rk3328-power.h | 18 ++ 1 file changed, 18 insertions(+) create mode 100644 include/dt-bindings/power/rk3328-power.h diff --git a/include/dt-bindings/power/rk3328-power.h b/include/dt-bindings/power/rk3328-power.h new file mode 100644 index ..10c3c3715334 --- /dev/null +++ b/include/dt-bindings/power/rk3328-power.h @@ -0,0 +1,18 @@ +#ifndef __DT_BINDINGS_POWER_RK3328_POWER_H__ +#define __DT_BINDINGS_POWER_RK3328_POWER_H__ + +/** + * RK3328 idle id Summary. + */ +#define RK3328_PD_CORE 0 +#define RK3328_PD_GPU 1 +#define RK3328_PD_BUS 2 +#define RK3328_PD_MSCH 3 +#define RK3328_PD_PERI 4 +#define RK3328_PD_VIDEO5 +#define RK3328_PD_HEVC 6 +#define RK3328_PD_SYS 7 +#define RK3328_PD_VPU 8 +#define RK3328_PD_VIO 9 + +#endif -- 1.9.1
[PATCH v1 3/3] soc: rockchip: power-domain: Modify power domain driver for rk3328
This driver is modified to support RK3328 SoC. RK3328 SoC is only support idle. add DOMAIN_M type, for support regs have write_enable bit. Signed-off-by: Elaine Zhang --- drivers/soc/rockchip/pm_domains.c | 63 +++ 1 file changed, 57 insertions(+), 6 deletions(-) diff --git a/drivers/soc/rockchip/pm_domains.c b/drivers/soc/rockchip/pm_domains.c index 1c78c42416c6..796c46a6cbe7 100644 --- a/drivers/soc/rockchip/pm_domains.c +++ b/drivers/soc/rockchip/pm_domains.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -29,6 +30,8 @@ struct rockchip_domain_info { int idle_mask; int ack_mask; bool active_wakeup; + int pwr_w_mask; + int req_w_mask; }; struct rockchip_pmu_info { @@ -87,9 +90,24 @@ struct rockchip_pmu { .active_wakeup = wakeup,\ } +#define DOMAIN_M(pwr, status, req, idle, ack, wakeup) \ +{ \ + .pwr_w_mask = (pwr >= 0) ? BIT(pwr + 16) : 0, \ + .pwr_mask = (pwr >= 0) ? BIT(pwr) : 0, \ + .status_mask = (status >= 0) ? BIT(status) : 0, \ + .req_w_mask = (req >= 0) ? BIT(req + 16) : 0, \ + .req_mask = (req >= 0) ? BIT(req) : 0, \ + .idle_mask = (idle >= 0) ? BIT(idle) : 0, \ + .ack_mask = (ack >= 0) ? BIT(ack) : 0, \ + .active_wakeup = wakeup,\ +} + #define DOMAIN_RK3288(pwr, status, req, wakeup)\ DOMAIN(pwr, status, req, req, (req) + 16, wakeup) +#define DOMAIN_RK3328(pwr, status, req, wakeup)\ + DOMAIN_M(pwr, pwr, req, (req) + 10, req, wakeup) + #define DOMAIN_RK3368(pwr, status, req, wakeup)\ DOMAIN(pwr, status, req, (req) + 16, req, wakeup) @@ -127,9 +145,13 @@ static int rockchip_pmu_set_idle_request(struct rockchip_pm_domain *pd, if (pd_info->req_mask == 0) return 0; - - regmap_update_bits(pmu->regmap, pmu->info->req_offset, - pd_info->req_mask, idle ? -1U : 0); + else if (pd_info->req_w_mask) + regmap_write(pmu->regmap, pmu->info->req_offset, +idle ? (pd_info->req_mask | pd_info->req_w_mask) : +pd_info->req_w_mask); + else + regmap_update_bits(pmu->regmap, pmu->info->req_offset, + pd_info->req_mask, idle ? -1U : 0); dsb(sy); @@ -230,9 +252,13 @@ static void rockchip_do_pmu_set_power_domain(struct rockchip_pm_domain *pd, if (pd->info->pwr_mask == 0) return; - - regmap_update_bits(pmu->regmap, pmu->info->pwr_offset, - pd->info->pwr_mask, on ? 0 : -1U); + else if (pd->info->pwr_w_mask) + regmap_write(pmu->regmap, pmu->info->pwr_offset, +on ? pd->info->pwr_mask : +(pd->info->pwr_mask | pd->info->pwr_w_mask)); + else + regmap_update_bits(pmu->regmap, pmu->info->pwr_offset, + pd->info->pwr_mask, on ? 0 : -1U); dsb(sy); @@ -692,6 +718,18 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev) [RK3288_PD_GPU] = DOMAIN_RK3288(9, 9, 2, false), }; +static const struct rockchip_domain_info rk3328_pm_domains[] = { + [RK3328_PD_CORE]= DOMAIN_RK3328(-1, 0, 0, false), + [RK3328_PD_GPU] = DOMAIN_RK3328(-1, 1, 1, false), + [RK3328_PD_BUS] = DOMAIN_RK3328(-1, 2, 2, true), + [RK3328_PD_MSCH]= DOMAIN_RK3328(-1, 3, 3, true), + [RK3328_PD_PERI]= DOMAIN_RK3328(-1, 4, 4, true), + [RK3328_PD_VIDEO] = DOMAIN_RK3328(-1, 5, 5, false), + [RK3328_PD_HEVC]= DOMAIN_RK3328(-1, 6, 6, false), + [RK3328_PD_VIO] = DOMAIN_RK3328(-1, 8, 8, false), + [RK3328_PD_VPU] = DOMAIN_RK3328(-1, 9, 9, false), +}; + static const struct rockchip_domain_info rk3368_pm_domains[] = { [RK3368_PD_PERI]= DOMAIN_RK3368(13, 12, 6, true), [RK3368_PD_VIO] = DOMAIN_RK3368(15, 14, 8, false), @@ -747,6 +785,15 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev) .domain_info = rk3288_pm_domains, }; +static const struct rockchip_pmu_info rk3328_pmu = { + .req_offset = 0x414, + .idle_offset = 0x484, + .ack_offset = 0x484, + + .num_domains = ARRAY_SIZE(rk3328_pm_domains), + .domain_info = rk3328_pm_domains, +}; + static const struct rockchip_pmu_info rk3368_pmu = { .pwr_offset = 0x0c, .status_offset = 0x10, @@ -783,6 +830,10 @@ static int rockchip_pm_domain_probe(struct platform_device *pdev) .data = (void *)_pmu, }, { + .compatible = "rockchip,rk3328-power-controller", +
[PATCH v1 1/3] dt/bindings: power: add RK3328 SoCs header for idle-request
According to a description from TRM, add all the idle request. Signed-off-by: Elaine Zhang --- include/dt-bindings/power/rk3328-power.h | 18 ++ 1 file changed, 18 insertions(+) create mode 100644 include/dt-bindings/power/rk3328-power.h diff --git a/include/dt-bindings/power/rk3328-power.h b/include/dt-bindings/power/rk3328-power.h new file mode 100644 index ..10c3c3715334 --- /dev/null +++ b/include/dt-bindings/power/rk3328-power.h @@ -0,0 +1,18 @@ +#ifndef __DT_BINDINGS_POWER_RK3328_POWER_H__ +#define __DT_BINDINGS_POWER_RK3328_POWER_H__ + +/** + * RK3328 idle id Summary. + */ +#define RK3328_PD_CORE 0 +#define RK3328_PD_GPU 1 +#define RK3328_PD_BUS 2 +#define RK3328_PD_MSCH 3 +#define RK3328_PD_PERI 4 +#define RK3328_PD_VIDEO5 +#define RK3328_PD_HEVC 6 +#define RK3328_PD_SYS 7 +#define RK3328_PD_VPU 8 +#define RK3328_PD_VIO 9 + +#endif -- 1.9.1
RE: A small window for a race condition in mm/rmap.c:page_lock_anon_vma_read
On Fri, 23 Dec 2016, Dashi DS1 Cao wrote: > The kernel version is "RELEASE: 3.10.0-327.36.3.el7.x86_64". It was the > latest kernel release of CentOS 7.2 at that time, or maybe still now. Okay, thanks: so, basically a v3.10 kernel, with lots of added patches, but also lacking many more recent fixes. > I've tried to print the value of anon_vma from other three dumps, but the > content is not available in the dump. > "gdb: page excluded: kernel virtual address: 882b47ddadc0" > I guess it is not copied out because it has already been put into some unused > list. Useful info: that suggests that the anon_vma was rightly freed, and that it's the page->_mapcount that's wrong. The page isn't really mapped anywhere now, but appearing to be still page_mapped(), it has tricked page_lock_anon_vma_read() into thinking the stale anon_vma pointer is safe to access. That can happen if there's a race, and a page gets mapped with one pte on top of another: only one of them will be unmapped later. Incorrect handling of page table entries. But I cannot remember anywhere that was shown to happen - beyond a project of my own, which never reached the tree. If it's a file page, that usually ends up as BUG_ON(page_mapped(page)) in __delete_from_page_cache() (in v3.10, changed a little later on), when truncating or unlinking the file or unmounting the filesystem. Those have been seen in the past, on rare occasions, but I don't remember actually root-causing any of them. If it's an anon page, there is no equivalent place for such a BUG_ON. mremap move has a tricky job to do, and might cause such a problem if its locking were inadequate: but the only example I see since v3.10 was dd18dbc2d42a "mm, thp: close race between mremap() and split_huge_page()", and that used to crash in __split_huge_page(). Or see c0d73261f5c1 "mm/memory.c: use entry = ACCESS_ONCE(*pte) in handle_pte_fault()", which brings us back to Peter's topic of over-imaginative compilers; but none of us believed that change really made a difference in practice. Cc'ing Sasha Levin, long-time trinity-runner, just in case he might remember any time when a BUG_ON(page_mapped(page)) was really solved: if so, there's a chance the explanation might also apply to anonymous pages, and be responsible for your page_lock_anon_vma_read() crashes. Hugh
RE: A small window for a race condition in mm/rmap.c:page_lock_anon_vma_read
On Fri, 23 Dec 2016, Dashi DS1 Cao wrote: > The kernel version is "RELEASE: 3.10.0-327.36.3.el7.x86_64". It was the > latest kernel release of CentOS 7.2 at that time, or maybe still now. Okay, thanks: so, basically a v3.10 kernel, with lots of added patches, but also lacking many more recent fixes. > I've tried to print the value of anon_vma from other three dumps, but the > content is not available in the dump. > "gdb: page excluded: kernel virtual address: 882b47ddadc0" > I guess it is not copied out because it has already been put into some unused > list. Useful info: that suggests that the anon_vma was rightly freed, and that it's the page->_mapcount that's wrong. The page isn't really mapped anywhere now, but appearing to be still page_mapped(), it has tricked page_lock_anon_vma_read() into thinking the stale anon_vma pointer is safe to access. That can happen if there's a race, and a page gets mapped with one pte on top of another: only one of them will be unmapped later. Incorrect handling of page table entries. But I cannot remember anywhere that was shown to happen - beyond a project of my own, which never reached the tree. If it's a file page, that usually ends up as BUG_ON(page_mapped(page)) in __delete_from_page_cache() (in v3.10, changed a little later on), when truncating or unlinking the file or unmounting the filesystem. Those have been seen in the past, on rare occasions, but I don't remember actually root-causing any of them. If it's an anon page, there is no equivalent place for such a BUG_ON. mremap move has a tricky job to do, and might cause such a problem if its locking were inadequate: but the only example I see since v3.10 was dd18dbc2d42a "mm, thp: close race between mremap() and split_huge_page()", and that used to crash in __split_huge_page(). Or see c0d73261f5c1 "mm/memory.c: use entry = ACCESS_ONCE(*pte) in handle_pte_fault()", which brings us back to Peter's topic of over-imaginative compilers; but none of us believed that change really made a difference in practice. Cc'ing Sasha Levin, long-time trinity-runner, just in case he might remember any time when a BUG_ON(page_mapped(page)) was really solved: if so, there's a chance the explanation might also apply to anonymous pages, and be responsible for your page_lock_anon_vma_read() crashes. Hugh
[PATCH] serial: 8250: use initializer instead of memset to clear local struct
Leave the way of zero-out to the compiler's decision; the compiler may know a more optimized way than calling memset(). It may end up with memset() for big structures like this after all, but the code will be cleaner at least. Signed-off-by: Masahiro Yamada--- drivers/tty/serial/8250/8250_acorn.c| 3 +-- drivers/tty/serial/8250/8250_core.c | 4 +--- drivers/tty/serial/8250/8250_em.c | 3 +-- drivers/tty/serial/8250/8250_gsc.c | 3 +-- drivers/tty/serial/8250/8250_hp300.c| 11 +++ drivers/tty/serial/8250/8250_lpc18xx.c | 4 +--- drivers/tty/serial/8250/8250_lpss.c | 4 +--- drivers/tty/serial/8250/8250_mid.c | 4 +--- drivers/tty/serial/8250/8250_moxa.c | 4 +--- drivers/tty/serial/8250/8250_of.c | 4 ++-- drivers/tty/serial/8250/8250_omap.c | 3 +-- drivers/tty/serial/8250/8250_pci.c | 3 +-- drivers/tty/serial/8250/8250_pnp.c | 3 +-- drivers/tty/serial/8250/8250_uniphier.c | 4 +--- drivers/tty/serial/8250/serial_cs.c | 3 +-- 15 files changed, 18 insertions(+), 42 deletions(-) diff --git a/drivers/tty/serial/8250/8250_acorn.c b/drivers/tty/serial/8250/8250_acorn.c index 402dfdd..f49acafc 100644 --- a/drivers/tty/serial/8250/8250_acorn.c +++ b/drivers/tty/serial/8250/8250_acorn.c @@ -43,7 +43,7 @@ serial_card_probe(struct expansion_card *ec, const struct ecard_id *id) { struct serial_card_info *info; struct serial_card_type *type = id->data; - struct uart_8250_port uart; + struct uart_8250_port uart = {}; unsigned long bus_addr; unsigned int i; @@ -62,7 +62,6 @@ serial_card_probe(struct expansion_card *ec, const struct ecard_id *id) ecard_set_drvdata(ec, info); - memset(, 0, sizeof(struct uart_8250_port)); uart.port.irq = ec->irq; uart.port.flags = UPF_BOOT_AUTOCONF | UPF_SHARE_IRQ; uart.port.uartclk = type->uartclk; diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c index 61569a7..27c18c9 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ b/drivers/tty/serial/8250/8250_core.c @@ -804,11 +804,9 @@ EXPORT_SYMBOL(serial8250_resume_port); static int serial8250_probe(struct platform_device *dev) { struct plat_serial8250_port *p = dev_get_platdata(>dev); - struct uart_8250_port uart; + struct uart_8250_port uart = {}; int ret, i, irqflag = 0; - memset(, 0, sizeof(uart)); - if (share_irqs) irqflag = IRQF_SHARED; diff --git a/drivers/tty/serial/8250/8250_em.c b/drivers/tty/serial/8250/8250_em.c index 0b63812..5deabaf 100644 --- a/drivers/tty/serial/8250/8250_em.c +++ b/drivers/tty/serial/8250/8250_em.c @@ -92,7 +92,7 @@ static int serial8250_em_probe(struct platform_device *pdev) struct resource *regs = platform_get_resource(pdev, IORESOURCE_MEM, 0); struct resource *irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0); struct serial8250_em_priv *priv; - struct uart_8250_port up; + struct uart_8250_port up = {}; int ret; if (!regs || !irq) { @@ -110,7 +110,6 @@ static int serial8250_em_probe(struct platform_device *pdev) return PTR_ERR(priv->sclk); } - memset(, 0, sizeof(up)); up.port.mapbase = regs->start; up.port.irq = irq->start; up.port.type = PORT_UNKNOWN; diff --git a/drivers/tty/serial/8250/8250_gsc.c b/drivers/tty/serial/8250/8250_gsc.c index b1e6ae9..5366e97 100644 --- a/drivers/tty/serial/8250/8250_gsc.c +++ b/drivers/tty/serial/8250/8250_gsc.c @@ -26,7 +26,7 @@ static int __init serial_init_chip(struct parisc_device *dev) { - struct uart_8250_port uart; + struct uart_8250_port uart = {}; unsigned long address; int err; @@ -53,7 +53,6 @@ static int __init serial_init_chip(struct parisc_device *dev) if (dev->id.sversion != 0x8d) address += 0x800; - memset(, 0, sizeof(uart)); uart.port.iotype= UPIO_MEM; /* 7.272727MHz on Lasi. Assumed the same for Dino, Wax and Timi. */ uart.port.uartclk = (dev->id.sversion != 0xad) ? diff --git a/drivers/tty/serial/8250/8250_hp300.c b/drivers/tty/serial/8250/8250_hp300.c index 38166db..6fd6414 100644 --- a/drivers/tty/serial/8250/8250_hp300.c +++ b/drivers/tty/serial/8250/8250_hp300.c @@ -90,9 +90,7 @@ extern int hp300_uart_scode; int __init hp300_setup_serial_console(void) { int scode; - struct uart_port port; - - memset(, 0, sizeof(port)); + struct uart_port port = {}; if (hp300_uart_scode < 0 || hp300_uart_scode > DIO_SCMAX) return 0; @@ -156,7 +154,7 @@ int __init hp300_setup_serial_console(void) static int hpdca_init_one(struct dio_dev *d, const struct dio_device_id *ent) { - struct uart_8250_port uart; + struct uart_8250_port uart =
[PATCH] serial: 8250: use initializer instead of memset to clear local struct
Leave the way of zero-out to the compiler's decision; the compiler may know a more optimized way than calling memset(). It may end up with memset() for big structures like this after all, but the code will be cleaner at least. Signed-off-by: Masahiro Yamada --- drivers/tty/serial/8250/8250_acorn.c| 3 +-- drivers/tty/serial/8250/8250_core.c | 4 +--- drivers/tty/serial/8250/8250_em.c | 3 +-- drivers/tty/serial/8250/8250_gsc.c | 3 +-- drivers/tty/serial/8250/8250_hp300.c| 11 +++ drivers/tty/serial/8250/8250_lpc18xx.c | 4 +--- drivers/tty/serial/8250/8250_lpss.c | 4 +--- drivers/tty/serial/8250/8250_mid.c | 4 +--- drivers/tty/serial/8250/8250_moxa.c | 4 +--- drivers/tty/serial/8250/8250_of.c | 4 ++-- drivers/tty/serial/8250/8250_omap.c | 3 +-- drivers/tty/serial/8250/8250_pci.c | 3 +-- drivers/tty/serial/8250/8250_pnp.c | 3 +-- drivers/tty/serial/8250/8250_uniphier.c | 4 +--- drivers/tty/serial/8250/serial_cs.c | 3 +-- 15 files changed, 18 insertions(+), 42 deletions(-) diff --git a/drivers/tty/serial/8250/8250_acorn.c b/drivers/tty/serial/8250/8250_acorn.c index 402dfdd..f49acafc 100644 --- a/drivers/tty/serial/8250/8250_acorn.c +++ b/drivers/tty/serial/8250/8250_acorn.c @@ -43,7 +43,7 @@ serial_card_probe(struct expansion_card *ec, const struct ecard_id *id) { struct serial_card_info *info; struct serial_card_type *type = id->data; - struct uart_8250_port uart; + struct uart_8250_port uart = {}; unsigned long bus_addr; unsigned int i; @@ -62,7 +62,6 @@ serial_card_probe(struct expansion_card *ec, const struct ecard_id *id) ecard_set_drvdata(ec, info); - memset(, 0, sizeof(struct uart_8250_port)); uart.port.irq = ec->irq; uart.port.flags = UPF_BOOT_AUTOCONF | UPF_SHARE_IRQ; uart.port.uartclk = type->uartclk; diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c index 61569a7..27c18c9 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ b/drivers/tty/serial/8250/8250_core.c @@ -804,11 +804,9 @@ EXPORT_SYMBOL(serial8250_resume_port); static int serial8250_probe(struct platform_device *dev) { struct plat_serial8250_port *p = dev_get_platdata(>dev); - struct uart_8250_port uart; + struct uart_8250_port uart = {}; int ret, i, irqflag = 0; - memset(, 0, sizeof(uart)); - if (share_irqs) irqflag = IRQF_SHARED; diff --git a/drivers/tty/serial/8250/8250_em.c b/drivers/tty/serial/8250/8250_em.c index 0b63812..5deabaf 100644 --- a/drivers/tty/serial/8250/8250_em.c +++ b/drivers/tty/serial/8250/8250_em.c @@ -92,7 +92,7 @@ static int serial8250_em_probe(struct platform_device *pdev) struct resource *regs = platform_get_resource(pdev, IORESOURCE_MEM, 0); struct resource *irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0); struct serial8250_em_priv *priv; - struct uart_8250_port up; + struct uart_8250_port up = {}; int ret; if (!regs || !irq) { @@ -110,7 +110,6 @@ static int serial8250_em_probe(struct platform_device *pdev) return PTR_ERR(priv->sclk); } - memset(, 0, sizeof(up)); up.port.mapbase = regs->start; up.port.irq = irq->start; up.port.type = PORT_UNKNOWN; diff --git a/drivers/tty/serial/8250/8250_gsc.c b/drivers/tty/serial/8250/8250_gsc.c index b1e6ae9..5366e97 100644 --- a/drivers/tty/serial/8250/8250_gsc.c +++ b/drivers/tty/serial/8250/8250_gsc.c @@ -26,7 +26,7 @@ static int __init serial_init_chip(struct parisc_device *dev) { - struct uart_8250_port uart; + struct uart_8250_port uart = {}; unsigned long address; int err; @@ -53,7 +53,6 @@ static int __init serial_init_chip(struct parisc_device *dev) if (dev->id.sversion != 0x8d) address += 0x800; - memset(, 0, sizeof(uart)); uart.port.iotype= UPIO_MEM; /* 7.272727MHz on Lasi. Assumed the same for Dino, Wax and Timi. */ uart.port.uartclk = (dev->id.sversion != 0xad) ? diff --git a/drivers/tty/serial/8250/8250_hp300.c b/drivers/tty/serial/8250/8250_hp300.c index 38166db..6fd6414 100644 --- a/drivers/tty/serial/8250/8250_hp300.c +++ b/drivers/tty/serial/8250/8250_hp300.c @@ -90,9 +90,7 @@ extern int hp300_uart_scode; int __init hp300_setup_serial_console(void) { int scode; - struct uart_port port; - - memset(, 0, sizeof(port)); + struct uart_port port = {}; if (hp300_uart_scode < 0 || hp300_uart_scode > DIO_SCMAX) return 0; @@ -156,7 +154,7 @@ int __init hp300_setup_serial_console(void) static int hpdca_init_one(struct dio_dev *d, const struct dio_device_id *ent) { - struct uart_8250_port uart; + struct uart_8250_port uart = {}; int line;
[PATCH] arm64: dts: uniphier: add eMMC controller node for LD11/LD20
Add Cadence's eMMC controller node for LD11/LD20. Signed-off-by: Masahiro Yamada--- arch/arm/boot/dts/uniphier-pinctrl.dtsi | 5 + arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi | 12 arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi | 12 3 files changed, 29 insertions(+) diff --git a/arch/arm/boot/dts/uniphier-pinctrl.dtsi b/arch/arm/boot/dts/uniphier-pinctrl.dtsi index 10a7110..be353af15 100644 --- a/arch/arm/boot/dts/uniphier-pinctrl.dtsi +++ b/arch/arm/boot/dts/uniphier-pinctrl.dtsi @@ -43,6 +43,11 @@ */ { + pinctrl_emmc: emmc_grp { + groups = "emmc", "emmc_dat8"; + function = "emmc"; + }; + pinctrl_i2c0: i2c0_grp { groups = "i2c0"; function = "i2c0"; diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi index 43b6583..e1e45b4 100644 --- a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi +++ b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi @@ -300,6 +300,18 @@ }; }; + emmc: sdhc@5a00 { + compatible = "socionext,uniphier-sd4hc", "cdns,sd4hc"; + reg = <0x5a00 0x400>; + interrupts = <0 78 4>; + pinctrl-names = "default"; + pinctrl-0 = <_emmc>; + clocks = <_clk 4>; + bus-width = <8>; + mmc-ddr-1_8v; + mmc-hs200-1_8v; + }; + usb0: usb@5a800100 { compatible = "socionext,uniphier-ehci", "generic-ehci"; status = "disabled"; diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi index fcaecc6..1e61a04 100644 --- a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi +++ b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi @@ -374,6 +374,18 @@ }; }; + emmc: sdhc@5a00 { + compatible = "socionext,uniphier-sd4hc", "cdns,sd4hc"; + reg = <0x5a00 0x400>; + interrupts = <0 78 4>; + pinctrl-names = "default"; + pinctrl-0 = <_emmc>; + clocks = <_clk 4>; + bus-width = <8>; + mmc-ddr-1_8v; + mmc-hs200-1_8v; + }; + soc-glue@5f80 { compatible = "socionext,uniphier-ld20-soc-glue", "simple-mfd", "syscon"; -- 2.7.4
[PATCH] arm64: dts: uniphier: add eMMC controller node for LD11/LD20
Add Cadence's eMMC controller node for LD11/LD20. Signed-off-by: Masahiro Yamada --- arch/arm/boot/dts/uniphier-pinctrl.dtsi | 5 + arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi | 12 arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi | 12 3 files changed, 29 insertions(+) diff --git a/arch/arm/boot/dts/uniphier-pinctrl.dtsi b/arch/arm/boot/dts/uniphier-pinctrl.dtsi index 10a7110..be353af15 100644 --- a/arch/arm/boot/dts/uniphier-pinctrl.dtsi +++ b/arch/arm/boot/dts/uniphier-pinctrl.dtsi @@ -43,6 +43,11 @@ */ { + pinctrl_emmc: emmc_grp { + groups = "emmc", "emmc_dat8"; + function = "emmc"; + }; + pinctrl_i2c0: i2c0_grp { groups = "i2c0"; function = "i2c0"; diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi index 43b6583..e1e45b4 100644 --- a/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi +++ b/arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi @@ -300,6 +300,18 @@ }; }; + emmc: sdhc@5a00 { + compatible = "socionext,uniphier-sd4hc", "cdns,sd4hc"; + reg = <0x5a00 0x400>; + interrupts = <0 78 4>; + pinctrl-names = "default"; + pinctrl-0 = <_emmc>; + clocks = <_clk 4>; + bus-width = <8>; + mmc-ddr-1_8v; + mmc-hs200-1_8v; + }; + usb0: usb@5a800100 { compatible = "socionext,uniphier-ehci", "generic-ehci"; status = "disabled"; diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi index fcaecc6..1e61a04 100644 --- a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi +++ b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi @@ -374,6 +374,18 @@ }; }; + emmc: sdhc@5a00 { + compatible = "socionext,uniphier-sd4hc", "cdns,sd4hc"; + reg = <0x5a00 0x400>; + interrupts = <0 78 4>; + pinctrl-names = "default"; + pinctrl-0 = <_emmc>; + clocks = <_clk 4>; + bus-width = <8>; + mmc-ddr-1_8v; + mmc-hs200-1_8v; + }; + soc-glue@5f80 { compatible = "socionext,uniphier-ld20-soc-glue", "simple-mfd", "syscon"; -- 2.7.4
Re: [PATCH] ACPI: small formatting fixes
On Tue, Dec 13, 2016 at 08:00:01PM +0100, Pavel Machek wrote: > Actually.. the ACPI cleanup is fine. You did well :-). > Pavel Cool, so (forgive the naive question) what happens next? Maybe I'm too used to the immediate gratification from Github of having a Pull Request get merged. Is this something that you have cherry picked into your tree, then will ask Linus to pull from at the end of the merge window? Do you have a tree that public that I am able to view? On git.kernel.org, there seems to be one tree with your name on it but it seems to be related to the (Nokia?) n900 and it's latest updated branch is from 9 weeks ago. Or is there more approvals I have to get to get the patch merged? Thanks, ~Nick
Re: [PATCH] ACPI: small formatting fixes
On Tue, Dec 13, 2016 at 08:00:01PM +0100, Pavel Machek wrote: > Actually.. the ACPI cleanup is fine. You did well :-). > Pavel Cool, so (forgive the naive question) what happens next? Maybe I'm too used to the immediate gratification from Github of having a Pull Request get merged. Is this something that you have cherry picked into your tree, then will ask Linus to pull from at the end of the merge window? Do you have a tree that public that I am able to view? On git.kernel.org, there seems to be one tree with your name on it but it seems to be related to the (Nokia?) n900 and it's latest updated branch is from 9 weeks ago. Or is there more approvals I have to get to get the patch merged? Thanks, ~Nick
[PATCH] powerpc: Fix build warning on 32-bit PPC - bisected to commit 989cea5c14be
I am getting the following warning when I build kernel 4.9-git on my PowerBook G4 with a 32-bit PPC processor: AS arch/powerpc/kernel/misc_32.o arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef] This problem is evident after commit 989cea5c14be ("kbuild: prevent lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an error that has been in the code since 2005 when this source file was created. That was with commit 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S"). The offending line is way does not make a lot of sense. This error does not seem to cause any errors in the executable, thus I am not recommending that it be applied to any stable versions. Thanks to Nicholas Piggin for suggesting this solution. Fixes: 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S") Signed-off-by: Larry FingerCc: Nicholas Piggin Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: linuxppc-...@lists.ozlabs.org --- arch/powerpc/kernel/misc_32.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 1863324..84db14e 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -296,7 +296,7 @@ _GLOBAL(flush_instruction_cache) lis r3, KERNELBASE@h iccci 0,r3 #endif -#elif CONFIG_FSL_BOOKE +#elif defined(CONFIG_FSL_BOOKE) BEGIN_FTR_SECTION mfspr r3,SPRN_L1CSR0 ori r3,r3,L1CSR0_CFI|L1CSR0_CLFC -- 2.10.2
[PATCH] powerpc: Fix build warning on 32-bit PPC - bisected to commit 989cea5c14be
I am getting the following warning when I build kernel 4.9-git on my PowerBook G4 with a 32-bit PPC processor: AS arch/powerpc/kernel/misc_32.o arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef] This problem is evident after commit 989cea5c14be ("kbuild: prevent lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an error that has been in the code since 2005 when this source file was created. That was with commit 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S"). The offending line is way does not make a lot of sense. This error does not seem to cause any errors in the executable, thus I am not recommending that it be applied to any stable versions. Thanks to Nicholas Piggin for suggesting this solution. Fixes: 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S") Signed-off-by: Larry Finger Cc: Nicholas Piggin Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: linuxppc-...@lists.ozlabs.org --- arch/powerpc/kernel/misc_32.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 1863324..84db14e 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -296,7 +296,7 @@ _GLOBAL(flush_instruction_cache) lis r3, KERNELBASE@h iccci 0,r3 #endif -#elif CONFIG_FSL_BOOKE +#elif defined(CONFIG_FSL_BOOKE) BEGIN_FTR_SECTION mfspr r3,SPRN_L1CSR0 ori r3,r3,L1CSR0_CFI|L1CSR0_CLFC -- 2.10.2
Re: Still OOM problems with 4.9er kernels
On Fri, Dec 09, 2016 at 04:52:07PM +0100, Gerhard Wiesinger wrote: > On 09.12.2016 14:40, Michal Hocko wrote: > >On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote: > >>Hello, > >> > >>same with latest kernel rc, dnf still killed with OOM (but sometimes > >>better). > >> > >>./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS} > >>(does dnf clean all;dnf update) > >>Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 > >>17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux > >> > >>Updated bug report: > >>https://bugzilla.redhat.com/show_bug.cgi?id=1314697 > >Could you post your oom report please? > > E.g. a new one with more than one included, first one after boot ... > > Just setup a low mem VM under KVM and it is easily triggerable. > > Still enough virtual memory available ... > > 4.9.0-0.rc8.git2.1.fc26.x86_64 > > [ 624.862777] ksoftirqd/0: page allocation failure: order:0, > mode:0x2080020(GFP_ATOMIC) > [ 624.863319] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted > 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 > [ 624.863410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.9.3 > [ 624.863510] aa62c007f958 904774e3 90c7dd98 > > [ 624.863923] aa62c007f9e0 9020e6ea 020800200246 > 90c7dd98 > [ 624.864019] aa62c007f980 96b90010 aa62c007f9f0 > aa62c007f9a0 > [ 624.864998] Call Trace: > [ 624.865149] [] dump_stack+0x86/0xc3 > [ 624.865347] [] warn_alloc+0x13a/0x170 > [ 624.865432] [] __alloc_pages_slowpath+0x252/0xbb0 > [ 624.865563] [] __alloc_pages_nodemask+0x40d/0x4b0 > [ 624.865675] [] __alloc_page_frag+0x193/0x200 > [ 624.866024] [] __napi_alloc_skb+0x8e/0xf0 > [ 624.866113] [] page_to_skb.isra.28+0x5d/0x310 > [virtio_net] > [ 624.866201] [] virtnet_receive+0x2db/0x9a0 > [virtio_net] > [ 624.867378] [] virtnet_poll+0x1d/0x80 [virtio_net] > [ 624.867494] [] net_rx_action+0x23e/0x470 > [ 624.867612] [] __do_softirq+0xcd/0x4b9 > [ 624.867704] [] ? smpboot_thread_fn+0x34/0x1f0 > [ 624.867833] [] ? smpboot_thread_fn+0x12d/0x1f0 > [ 624.867924] [] run_ksoftirqd+0x25/0x80 > [ 624.868109] [] smpboot_thread_fn+0x128/0x1f0 > [ 624.868197] [] ? sort_range+0x30/0x30 > [ 624.868596] [] kthread+0x102/0x120 > [ 624.868679] [] ? wait_for_completion+0x110/0x140 > [ 624.868768] [] ? kthread_park+0x60/0x60 > [ 624.868850] [] ret_from_fork+0x2a/0x40 > [ 843.528656] httpd (2490) used greatest stack depth: 10304 bytes left > [ 878.077750] httpd (2976) used greatest stack depth: 10096 bytes left > [93918.861109] netstat (14579) used greatest stack depth: 9488 bytes left > [94050.874669] kworker/dying (6253) used greatest stack depth: 9008 bytes > left > [95895.765570] kworker/1:1H: page allocation failure: order:0, > mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK) > [95895.765819] CPU: 1 PID: 440 Comm: kworker/1:1H Not tainted > 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 > [95895.765911] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.9.3 > [95895.766060] Workqueue: kblockd blk_mq_run_work_fn > [95895.766143] aa62c0257628 904774e3 90c7dd98 > > [95895.766235] aa62c02576b0 9020e6ea 022800200046 > 90c7dd98 > [95895.766325] aa62c0257650 96b90010 aa62c02576c0 > aa62c0257670 > [95895.766417] Call Trace: > [95895.766502] [] dump_stack+0x86/0xc3 > [95895.766596] [] warn_alloc+0x13a/0x170 > [95895.766681] [] __alloc_pages_slowpath+0x252/0xbb0 > [95895.766767] [] __alloc_pages_nodemask+0x40d/0x4b0 > [95895.766866] [] alloc_pages_current+0xa1/0x1f0 > [95895.766971] [] ? _raw_spin_unlock+0x27/0x40 > [95895.767073] [] new_slab+0x316/0x7c0 > [95895.767160] [] ___slab_alloc+0x3fb/0x5c0 > [95895.772611] [] ? cpuacct_charge+0xf2/0x1f0 > [95895.773406] [] ? alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.774327] [] ? rcu_read_lock_sched_held+0x45/0x80 > [95895.775212] [] ? alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.776155] [] __slab_alloc+0x51/0x90 > [95895.777090] [] __kmalloc+0x251/0x320 > [95895.781502] [] ? alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.782309] [] alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.783334] [] virtqueue_add_sgs+0x1c3/0x4a0 > [virtio_ring] > [95895.784059] [] ? kvm_sched_clock_read+0x25/0x40 > [95895.784742] [] __virtblk_add_req+0xbc/0x220 > [virtio_blk] > [95895.785419] [] ? debug_lockdep_rcu_enabled+0x1d/0x20 > [95895.786086] [] ? virtio_queue_rq+0x105/0x290 > [virtio_blk] > [95895.786750] [] virtio_queue_rq+0x12d/0x290 > [virtio_blk] > [95895.787427] [] __blk_mq_run_hw_queue+0x26d/0x3b0 > [95895.788106] [] blk_mq_run_work_fn+0x12/0x20 > [95895.789065] [] process_one_work+0x23e/0x6f0 > [95895.789741] [] ? process_one_work+0x1ba/0x6f0 > [95895.790444] [] worker_thread+0x4e/0x490 > [95895.791178] [] ? process_one_work+0x6f0/0x6f0 > [95895.791911] [] ? process_one_work+0x6f0/0x6f0 >
Re: Still OOM problems with 4.9er kernels
On Fri, Dec 09, 2016 at 04:52:07PM +0100, Gerhard Wiesinger wrote: > On 09.12.2016 14:40, Michal Hocko wrote: > >On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote: > >>Hello, > >> > >>same with latest kernel rc, dnf still killed with OOM (but sometimes > >>better). > >> > >>./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS} > >>(does dnf clean all;dnf update) > >>Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 > >>17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux > >> > >>Updated bug report: > >>https://bugzilla.redhat.com/show_bug.cgi?id=1314697 > >Could you post your oom report please? > > E.g. a new one with more than one included, first one after boot ... > > Just setup a low mem VM under KVM and it is easily triggerable. > > Still enough virtual memory available ... > > 4.9.0-0.rc8.git2.1.fc26.x86_64 > > [ 624.862777] ksoftirqd/0: page allocation failure: order:0, > mode:0x2080020(GFP_ATOMIC) > [ 624.863319] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted > 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 > [ 624.863410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.9.3 > [ 624.863510] aa62c007f958 904774e3 90c7dd98 > > [ 624.863923] aa62c007f9e0 9020e6ea 020800200246 > 90c7dd98 > [ 624.864019] aa62c007f980 96b90010 aa62c007f9f0 > aa62c007f9a0 > [ 624.864998] Call Trace: > [ 624.865149] [] dump_stack+0x86/0xc3 > [ 624.865347] [] warn_alloc+0x13a/0x170 > [ 624.865432] [] __alloc_pages_slowpath+0x252/0xbb0 > [ 624.865563] [] __alloc_pages_nodemask+0x40d/0x4b0 > [ 624.865675] [] __alloc_page_frag+0x193/0x200 > [ 624.866024] [] __napi_alloc_skb+0x8e/0xf0 > [ 624.866113] [] page_to_skb.isra.28+0x5d/0x310 > [virtio_net] > [ 624.866201] [] virtnet_receive+0x2db/0x9a0 > [virtio_net] > [ 624.867378] [] virtnet_poll+0x1d/0x80 [virtio_net] > [ 624.867494] [] net_rx_action+0x23e/0x470 > [ 624.867612] [] __do_softirq+0xcd/0x4b9 > [ 624.867704] [] ? smpboot_thread_fn+0x34/0x1f0 > [ 624.867833] [] ? smpboot_thread_fn+0x12d/0x1f0 > [ 624.867924] [] run_ksoftirqd+0x25/0x80 > [ 624.868109] [] smpboot_thread_fn+0x128/0x1f0 > [ 624.868197] [] ? sort_range+0x30/0x30 > [ 624.868596] [] kthread+0x102/0x120 > [ 624.868679] [] ? wait_for_completion+0x110/0x140 > [ 624.868768] [] ? kthread_park+0x60/0x60 > [ 624.868850] [] ret_from_fork+0x2a/0x40 > [ 843.528656] httpd (2490) used greatest stack depth: 10304 bytes left > [ 878.077750] httpd (2976) used greatest stack depth: 10096 bytes left > [93918.861109] netstat (14579) used greatest stack depth: 9488 bytes left > [94050.874669] kworker/dying (6253) used greatest stack depth: 9008 bytes > left > [95895.765570] kworker/1:1H: page allocation failure: order:0, > mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK) > [95895.765819] CPU: 1 PID: 440 Comm: kworker/1:1H Not tainted > 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 > [95895.765911] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.9.3 > [95895.766060] Workqueue: kblockd blk_mq_run_work_fn > [95895.766143] aa62c0257628 904774e3 90c7dd98 > > [95895.766235] aa62c02576b0 9020e6ea 022800200046 > 90c7dd98 > [95895.766325] aa62c0257650 96b90010 aa62c02576c0 > aa62c0257670 > [95895.766417] Call Trace: > [95895.766502] [] dump_stack+0x86/0xc3 > [95895.766596] [] warn_alloc+0x13a/0x170 > [95895.766681] [] __alloc_pages_slowpath+0x252/0xbb0 > [95895.766767] [] __alloc_pages_nodemask+0x40d/0x4b0 > [95895.766866] [] alloc_pages_current+0xa1/0x1f0 > [95895.766971] [] ? _raw_spin_unlock+0x27/0x40 > [95895.767073] [] new_slab+0x316/0x7c0 > [95895.767160] [] ___slab_alloc+0x3fb/0x5c0 > [95895.772611] [] ? cpuacct_charge+0xf2/0x1f0 > [95895.773406] [] ? alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.774327] [] ? rcu_read_lock_sched_held+0x45/0x80 > [95895.775212] [] ? alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.776155] [] __slab_alloc+0x51/0x90 > [95895.777090] [] __kmalloc+0x251/0x320 > [95895.781502] [] ? alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.782309] [] alloc_indirect.isra.11+0x1d/0x50 > [virtio_ring] > [95895.783334] [] virtqueue_add_sgs+0x1c3/0x4a0 > [virtio_ring] > [95895.784059] [] ? kvm_sched_clock_read+0x25/0x40 > [95895.784742] [] __virtblk_add_req+0xbc/0x220 > [virtio_blk] > [95895.785419] [] ? debug_lockdep_rcu_enabled+0x1d/0x20 > [95895.786086] [] ? virtio_queue_rq+0x105/0x290 > [virtio_blk] > [95895.786750] [] virtio_queue_rq+0x12d/0x290 > [virtio_blk] > [95895.787427] [] __blk_mq_run_hw_queue+0x26d/0x3b0 > [95895.788106] [] blk_mq_run_work_fn+0x12/0x20 > [95895.789065] [] process_one_work+0x23e/0x6f0 > [95895.789741] [] ? process_one_work+0x1ba/0x6f0 > [95895.790444] [] worker_thread+0x4e/0x490 > [95895.791178] [] ? process_one_work+0x6f0/0x6f0 > [95895.791911] [] ? process_one_work+0x6f0/0x6f0 >
RE: [PATCH v3 2/3] USB3/DWC3: Add property "snps, incr-burst-type-adjustment" for INCR burst type
Hi, Rob, > -Original Message- > From: Rob Herring [mailto:r...@kernel.org] > Sent: Friday, December 23, 2016 2:45 AM > To: Jerry Huang> Cc: ba...@kernel.org; mark.rutl...@arm.com; catalin.mari...@arm.com; > will.dea...@arm.com; li...@armlinux.org.uk; devicet...@vger.kernel.org; > linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm- > ker...@lists.infradead.org > Subject: Re: [PATCH v3 2/3] USB3/DWC3: Add property "snps, incr-burst- > type-adjustment" for INCR burst type > > On Mon, Dec 19, 2016 at 05:25:53PM +0800, Changming Huang wrote: > > New property "snps,incr-burst-type-adjustment = , " for USB3.0 > DWC3. > > Field "x": 1/0 - undefined length INCR burst type enable or not; Field > > "y": INCR4/INCR8/INCR16/INCR32/INCR64/INCR128/INCR256 burst type. > > > > While enabling undefined length INCR burst type and INCR16 burst type, > > get better write performance on NXP Layerscape platform: > > around 3% improvement (from 364MB/s to 375MB/s). > > > > Signed-off-by: Changming Huang > > --- > > Changes in v3: > > - add new property for INCR burst in usb node. > > > > Documentation/devicetree/bindings/usb/dwc3.txt |5 + > > arch/arm/boot/dts/ls1021a.dtsi |1 + > > arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi |3 +++ > > arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi |2 ++ > > 4 files changed, 11 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt > > b/Documentation/devicetree/bindings/usb/dwc3.txt > > index e3e6983..8c405a3 100644 > > --- a/Documentation/devicetree/bindings/usb/dwc3.txt > > +++ b/Documentation/devicetree/bindings/usb/dwc3.txt > > @@ -55,6 +55,10 @@ Optional properties: > > fladj_30mhz_sdbnd signal is invalid or incorrect. > > > > - tx-fifo-resize: determines if the FIFO *has* to be > reallocated. > > + - snps,incr-burst-type-adjustment: Value for INCR burst type of > GSBUSCFG0 > > + register, undefined length INCR burst type enable and INCRx type. > > + First field is for undefined length INCR burst type enable or not. > > + Second field is for largest INCRx type enabled. > > Why do you need the first field? Is the 2nd field used if the 1st is 0? > If not, then just use the presence of the property to enable or not. The first field is one switch. When it is 1, means undefined length INCR burst type enabled, we can use any length less than or equal to the largest-enabled burst length of INCR4/8/16/32/64/128/256. When it is zero, means INCRx burst mode enabled, we can use one fixed burst length of 1/4/8/16/32/64/128/256 byte. So, the 2nd field is used if the 1st is 0, we need to select one largest burst length the USB controller can support. If we don't want to change the value of this register (use the default value), we don't need to add this property to usb node.
RE: [PATCH v3 2/3] USB3/DWC3: Add property "snps, incr-burst-type-adjustment" for INCR burst type
Hi, Rob, > -Original Message- > From: Rob Herring [mailto:r...@kernel.org] > Sent: Friday, December 23, 2016 2:45 AM > To: Jerry Huang > Cc: ba...@kernel.org; mark.rutl...@arm.com; catalin.mari...@arm.com; > will.dea...@arm.com; li...@armlinux.org.uk; devicet...@vger.kernel.org; > linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm- > ker...@lists.infradead.org > Subject: Re: [PATCH v3 2/3] USB3/DWC3: Add property "snps, incr-burst- > type-adjustment" for INCR burst type > > On Mon, Dec 19, 2016 at 05:25:53PM +0800, Changming Huang wrote: > > New property "snps,incr-burst-type-adjustment = , " for USB3.0 > DWC3. > > Field "x": 1/0 - undefined length INCR burst type enable or not; Field > > "y": INCR4/INCR8/INCR16/INCR32/INCR64/INCR128/INCR256 burst type. > > > > While enabling undefined length INCR burst type and INCR16 burst type, > > get better write performance on NXP Layerscape platform: > > around 3% improvement (from 364MB/s to 375MB/s). > > > > Signed-off-by: Changming Huang > > --- > > Changes in v3: > > - add new property for INCR burst in usb node. > > > > Documentation/devicetree/bindings/usb/dwc3.txt |5 + > > arch/arm/boot/dts/ls1021a.dtsi |1 + > > arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi |3 +++ > > arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi |2 ++ > > 4 files changed, 11 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt > > b/Documentation/devicetree/bindings/usb/dwc3.txt > > index e3e6983..8c405a3 100644 > > --- a/Documentation/devicetree/bindings/usb/dwc3.txt > > +++ b/Documentation/devicetree/bindings/usb/dwc3.txt > > @@ -55,6 +55,10 @@ Optional properties: > > fladj_30mhz_sdbnd signal is invalid or incorrect. > > > > - tx-fifo-resize: determines if the FIFO *has* to be > reallocated. > > + - snps,incr-burst-type-adjustment: Value for INCR burst type of > GSBUSCFG0 > > + register, undefined length INCR burst type enable and INCRx type. > > + First field is for undefined length INCR burst type enable or not. > > + Second field is for largest INCRx type enabled. > > Why do you need the first field? Is the 2nd field used if the 1st is 0? > If not, then just use the presence of the property to enable or not. The first field is one switch. When it is 1, means undefined length INCR burst type enabled, we can use any length less than or equal to the largest-enabled burst length of INCR4/8/16/32/64/128/256. When it is zero, means INCRx burst mode enabled, we can use one fixed burst length of 1/4/8/16/32/64/128/256 byte. So, the 2nd field is used if the 1st is 0, we need to select one largest burst length the USB controller can support. If we don't want to change the value of this register (use the default value), we don't need to add this property to usb node.
RE: A small window for a race condition in mm/rmap.c:page_lock_anon_vma_read
I'd expected that one or more tasks doing the free were the current task of other cpu cores, but only one of the four dumps has two swapd task that are concurrently at execution on different cpu. This is the task leading to the crash: PID: 247TASK: 881fcfad8000 CPU: 14 COMMAND: "kswapd1" #0 [881fcfad7978] machine_kexec at 81051e9b #1 [881fcfad79d8] crash_kexec at 810f27e2 #2 [881fcfad7aa8] oops_end at 8163f448 #3 [881fcfad7ad0] die at 8101859b #4 [881fcfad7b00] do_general_protection at 8163ed3e #5 [881fcfad7b30] general_protection at 8163e5e8 [exception RIP: down_read_trylock+9] RIP: 810aa9f9 RSP: 881fcfad7be0 RFLAGS: 00010286 RAX: RBX: 882b47ddadc0 RCX: RDX: RSI: RDI: 91550b2b32f5a3e8 RBP: 881fcfad7be0 R8: ea00ecc28860 R9: 883fcffeae28 R10: 81a691a0 R11: 0001 R12: 882b47ddadc1 R13: ea00ecc28840 R14: 91550b2b32f5a3e8 R15: ea00ecc28840 ORIG_RAX: CS: 0010 SS: #6 [881fcfad7be8] page_lock_anon_vma_read at 811a3365 #7 [881fcfad7c18] page_referenced at 811a35e7 #8 [881fcfad7c90] shrink_active_list at 8117e8cc #9 [881fcfad7d48] balance_pgdat at 81180288 #10 [881fcfad7e20] kswapd at 81180813 #11 [881fcfad7ec8] kthread at 810a5b8f #12 [881fcfad7f50] ret_from_fork at 81646a98 And this is the one at the same time: PID: 246TASK: 881fd27af300 CPU: 20 COMMAND: "kswapd0" #0 [881fffd05e70] crash_nmi_callback at 81045982 #1 [881fffd05e80] nmi_handle at 8163f5d9 #2 [881fffd05ec8] do_nmi at 8163f6f0 #3 [881fffd05ef0] end_repeat_nmi at 8163ea13 [exception RIP: free_pcppages_bulk+529] RIP: 81171ae1 RSP: 881fcfad38f0 RFLAGS: 0087 RAX: 002f002c RBX: ea007606ae40 RCX: RDX: ea007606ae00 RSI: 02b9 RDI: RBP: 881fcfad3978 R8: R9: 0001 R10: 88207ffda000 R11: 0002 R12: ea007606ae40 R13: 0002 R14: 88207ffda000 R15: 02b8 ORIG_RAX: CS: 0010 SS: 0018 --- --- #4 [881fcfad38f0] free_pcppages_bulk at 81171ae1 #5 [881fcfad3980] free_hot_cold_page at 81171f08 #6 [881fcfad39b8] free_hot_cold_page_list at 81171f76 #7 [881fcfad39f0] shrink_page_list at 8117d71e #8 [881fcfad3b28] shrink_inactive_list at 8117e37a #9 [881fcfad3bf0] shrink_lruvec at 8117ee45 #10 [881fcfad3cf0] shrink_zone at 8117f2a6 #11 [881fcfad3d48] balance_pgdat at 8118054c #12 [881fcfad3e20] kswapd at 81180813 #13 [881fcfad3ec8] kthread at 810a5b8f #14 [881fcfad3f50] ret_from_fork at 81646a98 I hope the information would be useful. Dashi Cao -Original Message- From: Hugh Dickins [mailto:hu...@google.com] Sent: Friday, December 23, 2016 6:27 AM To: Peter ZijlstraCc: Michal Hocko ; Dashi DS1 Cao ; linux...@kvack.org; linux-kernel@vger.kernel.org; Hugh Dickins Subject: Re: A small window for a race condition in mm/rmap.c:page_lock_anon_vma_read On Thu, 22 Dec 2016, Peter Zijlstra wrote: > On Wed, Dec 21, 2016 at 03:43:43PM +0100, Michal Hocko wrote: > > anon_vma locking is clever^Wsubtle as hell. CC Peter... > > > > On Tue 20-12-16 09:32:27, Dashi DS1 Cao wrote: > > > I've collected four crash dumps with similar backtrace. > > > > > > PID: 247TASK: 881fcfad8000 CPU: 14 COMMAND: "kswapd1" > > > #0 [881fcfad7978] machine_kexec at 81051e9b > > > #1 [881fcfad79d8] crash_kexec at 810f27e2 > > > #2 [881fcfad7aa8] oops_end at 8163f448 > > > #3 [881fcfad7ad0] die at 8101859b > > > #4 [881fcfad7b00] do_general_protection at 8163ed3e > > > #5 [881fcfad7b30] general_protection at 8163e5e8 > > > [exception RIP: down_read_trylock+9] > > > RIP: 810aa9f9 RSP: 881fcfad7be0 RFLAGS: 00010286 > > > RAX: RBX: 882b47ddadc0 RCX: > > > RDX: RSI: RDI: > > > 91550b2b32f5a3e8 > > > > rdi is obviously a mess - smells like a string. So either sombody > > has overwritten root_anon_vma or this is really a use after free... > > e8 - ??? > a3 - ??? > f5 - ??? > 32 - 2 > 2b - + > b - > > 55 - U > 91 - ??? > > Not a string.. > > > > RBP: 881fcfad7be0 R8: ea00ecc28860 R9: 883fcffeae28 > > > R10: 81a691a0 R11: 0001 R12: 882b47ddadc1 > > > R13: ea00ecc28840 R14: 91550b2b32f5a3e8
RE: A small window for a race condition in mm/rmap.c:page_lock_anon_vma_read
I'd expected that one or more tasks doing the free were the current task of other cpu cores, but only one of the four dumps has two swapd task that are concurrently at execution on different cpu. This is the task leading to the crash: PID: 247TASK: 881fcfad8000 CPU: 14 COMMAND: "kswapd1" #0 [881fcfad7978] machine_kexec at 81051e9b #1 [881fcfad79d8] crash_kexec at 810f27e2 #2 [881fcfad7aa8] oops_end at 8163f448 #3 [881fcfad7ad0] die at 8101859b #4 [881fcfad7b00] do_general_protection at 8163ed3e #5 [881fcfad7b30] general_protection at 8163e5e8 [exception RIP: down_read_trylock+9] RIP: 810aa9f9 RSP: 881fcfad7be0 RFLAGS: 00010286 RAX: RBX: 882b47ddadc0 RCX: RDX: RSI: RDI: 91550b2b32f5a3e8 RBP: 881fcfad7be0 R8: ea00ecc28860 R9: 883fcffeae28 R10: 81a691a0 R11: 0001 R12: 882b47ddadc1 R13: ea00ecc28840 R14: 91550b2b32f5a3e8 R15: ea00ecc28840 ORIG_RAX: CS: 0010 SS: #6 [881fcfad7be8] page_lock_anon_vma_read at 811a3365 #7 [881fcfad7c18] page_referenced at 811a35e7 #8 [881fcfad7c90] shrink_active_list at 8117e8cc #9 [881fcfad7d48] balance_pgdat at 81180288 #10 [881fcfad7e20] kswapd at 81180813 #11 [881fcfad7ec8] kthread at 810a5b8f #12 [881fcfad7f50] ret_from_fork at 81646a98 And this is the one at the same time: PID: 246TASK: 881fd27af300 CPU: 20 COMMAND: "kswapd0" #0 [881fffd05e70] crash_nmi_callback at 81045982 #1 [881fffd05e80] nmi_handle at 8163f5d9 #2 [881fffd05ec8] do_nmi at 8163f6f0 #3 [881fffd05ef0] end_repeat_nmi at 8163ea13 [exception RIP: free_pcppages_bulk+529] RIP: 81171ae1 RSP: 881fcfad38f0 RFLAGS: 0087 RAX: 002f002c RBX: ea007606ae40 RCX: RDX: ea007606ae00 RSI: 02b9 RDI: RBP: 881fcfad3978 R8: R9: 0001 R10: 88207ffda000 R11: 0002 R12: ea007606ae40 R13: 0002 R14: 88207ffda000 R15: 02b8 ORIG_RAX: CS: 0010 SS: 0018 --- --- #4 [881fcfad38f0] free_pcppages_bulk at 81171ae1 #5 [881fcfad3980] free_hot_cold_page at 81171f08 #6 [881fcfad39b8] free_hot_cold_page_list at 81171f76 #7 [881fcfad39f0] shrink_page_list at 8117d71e #8 [881fcfad3b28] shrink_inactive_list at 8117e37a #9 [881fcfad3bf0] shrink_lruvec at 8117ee45 #10 [881fcfad3cf0] shrink_zone at 8117f2a6 #11 [881fcfad3d48] balance_pgdat at 8118054c #12 [881fcfad3e20] kswapd at 81180813 #13 [881fcfad3ec8] kthread at 810a5b8f #14 [881fcfad3f50] ret_from_fork at 81646a98 I hope the information would be useful. Dashi Cao -Original Message- From: Hugh Dickins [mailto:hu...@google.com] Sent: Friday, December 23, 2016 6:27 AM To: Peter Zijlstra Cc: Michal Hocko ; Dashi DS1 Cao ; linux...@kvack.org; linux-kernel@vger.kernel.org; Hugh Dickins Subject: Re: A small window for a race condition in mm/rmap.c:page_lock_anon_vma_read On Thu, 22 Dec 2016, Peter Zijlstra wrote: > On Wed, Dec 21, 2016 at 03:43:43PM +0100, Michal Hocko wrote: > > anon_vma locking is clever^Wsubtle as hell. CC Peter... > > > > On Tue 20-12-16 09:32:27, Dashi DS1 Cao wrote: > > > I've collected four crash dumps with similar backtrace. > > > > > > PID: 247TASK: 881fcfad8000 CPU: 14 COMMAND: "kswapd1" > > > #0 [881fcfad7978] machine_kexec at 81051e9b > > > #1 [881fcfad79d8] crash_kexec at 810f27e2 > > > #2 [881fcfad7aa8] oops_end at 8163f448 > > > #3 [881fcfad7ad0] die at 8101859b > > > #4 [881fcfad7b00] do_general_protection at 8163ed3e > > > #5 [881fcfad7b30] general_protection at 8163e5e8 > > > [exception RIP: down_read_trylock+9] > > > RIP: 810aa9f9 RSP: 881fcfad7be0 RFLAGS: 00010286 > > > RAX: RBX: 882b47ddadc0 RCX: > > > RDX: RSI: RDI: > > > 91550b2b32f5a3e8 > > > > rdi is obviously a mess - smells like a string. So either sombody > > has overwritten root_anon_vma or this is really a use after free... > > e8 - ??? > a3 - ??? > f5 - ??? > 32 - 2 > 2b - + > b - > > 55 - U > 91 - ??? > > Not a string.. > > > > RBP: 881fcfad7be0 R8: ea00ecc28860 R9: 883fcffeae28 > > > R10: 81a691a0 R11: 0001 R12: 882b47ddadc1 > > > R13: ea00ecc28840 R14: 91550b2b32f5a3e8 R15: ea00ecc28840 > > > ORIG_RAX: CS: 0010 SS:
[PATCH v3 07/12] scsi: ufs: set default UFS power management level
UFS device and link can be put in multiple different low power modes hence UFS driver supports multiple different low power modes. This change sets the default UFS power management level which should put the link hibernate state and device in sleep state. This default power management level gives good power savings with relatively less enter/exit latencies. Reviewed-by: Yaniv GardiSigned-off-by: Subhash Jadavani --- Changes from v2 -> v3: - Simplified patch to just set the default power management level. Device tree capability to specify broken hardware will be taken up as separate patch in different patch series than current one. --- drivers/scsi/ufs/ufshcd.c | 28 1 file changed, 28 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 30ee04a..12a2250 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -191,6 +191,22 @@ enum { return ufs_pm_lvl_states[lvl].link_state; } +static inline enum ufs_pm_level +ufs_get_desired_pm_lvl_for_dev_link_state(enum ufs_dev_pwr_mode dev_state, + enum uic_link_state link_state) +{ + enum ufs_pm_level lvl; + + for (lvl = UFS_PM_LVL_0; lvl < UFS_PM_LVL_MAX; lvl++) { + if ((ufs_pm_lvl_states[lvl].dev_state == dev_state) && + (ufs_pm_lvl_states[lvl].link_state == link_state)) + return lvl; + } + + /* if no match found, return the level 0 */ + return UFS_PM_LVL_0; +} + static struct ufs_dev_fix ufs_fixups[] = { /* UFS cards deviations table */ UFS_FIX(UFS_VENDOR_SAMSUNG, UFS_ANY_MODEL, @@ -7264,6 +7280,18 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) ufshcd_clkscaling_init_sysfs(hba); } + /* +* Set the default power management level for runtime and system PM. +* Default power saving mode is to keep UFS link in Hibern8 state +* and UFS device in sleep state. +*/ + hba->rpm_lvl = ufs_get_desired_pm_lvl_for_dev_link_state( + UFS_SLEEP_PWR_MODE, + UIC_LINK_HIBERN8_STATE); + hba->spm_lvl = ufs_get_desired_pm_lvl_for_dev_link_state( + UFS_SLEEP_PWR_MODE, + UIC_LINK_HIBERN8_STATE); + /* Hold auto suspend until async scan completes */ pm_runtime_get_sync(dev); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 08/12] scsi: ufs: add capability to keep auto bkops always enabled
UFS device requires to perform bkops (back ground operations) periodically but host can control (via auto-bkops parameter of device) when device can perform bkops based on its performance requirements. In general, host would like to enable the device's auto-bkops only when it's not doing any regular data transfer but sometimes device may not behave properly if host keeps the auto-bkops disabled. This change adds the capability to let the device auto-bkops always enabled except suspend. Reviewed-by: Sahitya TummalaSigned-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 33 ++--- drivers/scsi/ufs/ufshcd.h | 13 + 2 files changed, 35 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 12a2250..a647bcf 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4134,18 +4134,25 @@ static int ufshcd_disable_auto_bkops(struct ufs_hba *hba) } /** - * ufshcd_force_reset_auto_bkops - force enable of auto bkops + * ufshcd_force_reset_auto_bkops - force reset auto bkops state * @hba: per adapter instance * * After a device reset the device may toggle the BKOPS_EN flag * to default value. The s/w tracking variables should be updated - * as well. Do this by forcing enable of auto bkops. + * as well. This function would change the auto-bkops state based on + * UFSHCD_CAP_KEEP_AUTO_BKOPS_ENABLED_EXCEPT_SUSPEND. */ -static void ufshcd_force_reset_auto_bkops(struct ufs_hba *hba) +static void ufshcd_force_reset_auto_bkops(struct ufs_hba *hba) { - hba->auto_bkops_enabled = false; - hba->ee_ctrl_mask |= MASK_EE_URGENT_BKOPS; - ufshcd_enable_auto_bkops(hba); + if (ufshcd_keep_autobkops_enabled_except_suspend(hba)) { + hba->auto_bkops_enabled = false; + hba->ee_ctrl_mask |= MASK_EE_URGENT_BKOPS; + ufshcd_enable_auto_bkops(hba); + } else { + hba->auto_bkops_enabled = true; + hba->ee_ctrl_mask &= ~MASK_EE_URGENT_BKOPS; + ufshcd_disable_auto_bkops(hba); + } } static inline int ufshcd_get_bkops_status(struct ufs_hba *hba, u32 *status) @@ -6564,11 +6571,15 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op) goto set_old_link_state; } - /* -* If BKOPs operations are urgently needed at this moment then -* keep auto-bkops enabled or else disable it. -*/ - ufshcd_urgent_bkops(hba); + if (ufshcd_keep_autobkops_enabled_except_suspend(hba)) + ufshcd_enable_auto_bkops(hba); + else + /* +* If BKOPs operations are urgently needed at this moment then +* keep auto-bkops enabled or else disable it. +*/ + ufshcd_urgent_bkops(hba); + hba->clk_gating.is_suspended = false; if (hba->clk_scaling.is_allowed) diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 787323b..f25d468 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -566,6 +566,14 @@ struct ufs_hba { * CAUTION: Enabling this might reduce overall UFS throughput. */ #define UFSHCD_CAP_INTR_AGGR (1 << 4) + /* +* This capability allows the device auto-bkops to be always enabled +* except during suspend (both runtime and suspend). +* Enabling this capability means that device will always be allowed +* to do background operation when it's active but it might degrade +* the performance of ongoing read/write operations. +*/ +#define UFSHCD_CAP_KEEP_AUTO_BKOPS_ENABLED_EXCEPT_SUSPEND (1 << 5) struct devfreq *devfreq; struct ufs_clk_scaling clk_scaling; @@ -663,6 +671,11 @@ static inline void *ufshcd_get_variant(struct ufs_hba *hba) BUG_ON(!hba); return hba->priv; } +static inline bool ufshcd_keep_autobkops_enabled_except_suspend( + struct ufs_hba *hba) +{ + return hba->caps & UFSHCD_CAP_KEEP_AUTO_BKOPS_ENABLED_EXCEPT_SUSPEND; +} extern int ufshcd_runtime_suspend(struct ufs_hba *hba); extern int ufshcd_runtime_resume(struct ufs_hba *hba); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 07/12] scsi: ufs: set default UFS power management level
UFS device and link can be put in multiple different low power modes hence UFS driver supports multiple different low power modes. This change sets the default UFS power management level which should put the link hibernate state and device in sleep state. This default power management level gives good power savings with relatively less enter/exit latencies. Reviewed-by: Yaniv Gardi Signed-off-by: Subhash Jadavani --- Changes from v2 -> v3: - Simplified patch to just set the default power management level. Device tree capability to specify broken hardware will be taken up as separate patch in different patch series than current one. --- drivers/scsi/ufs/ufshcd.c | 28 1 file changed, 28 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 30ee04a..12a2250 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -191,6 +191,22 @@ enum { return ufs_pm_lvl_states[lvl].link_state; } +static inline enum ufs_pm_level +ufs_get_desired_pm_lvl_for_dev_link_state(enum ufs_dev_pwr_mode dev_state, + enum uic_link_state link_state) +{ + enum ufs_pm_level lvl; + + for (lvl = UFS_PM_LVL_0; lvl < UFS_PM_LVL_MAX; lvl++) { + if ((ufs_pm_lvl_states[lvl].dev_state == dev_state) && + (ufs_pm_lvl_states[lvl].link_state == link_state)) + return lvl; + } + + /* if no match found, return the level 0 */ + return UFS_PM_LVL_0; +} + static struct ufs_dev_fix ufs_fixups[] = { /* UFS cards deviations table */ UFS_FIX(UFS_VENDOR_SAMSUNG, UFS_ANY_MODEL, @@ -7264,6 +7280,18 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) ufshcd_clkscaling_init_sysfs(hba); } + /* +* Set the default power management level for runtime and system PM. +* Default power saving mode is to keep UFS link in Hibern8 state +* and UFS device in sleep state. +*/ + hba->rpm_lvl = ufs_get_desired_pm_lvl_for_dev_link_state( + UFS_SLEEP_PWR_MODE, + UIC_LINK_HIBERN8_STATE); + hba->spm_lvl = ufs_get_desired_pm_lvl_for_dev_link_state( + UFS_SLEEP_PWR_MODE, + UIC_LINK_HIBERN8_STATE); + /* Hold auto suspend until async scan completes */ pm_runtime_get_sync(dev); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 08/12] scsi: ufs: add capability to keep auto bkops always enabled
UFS device requires to perform bkops (back ground operations) periodically but host can control (via auto-bkops parameter of device) when device can perform bkops based on its performance requirements. In general, host would like to enable the device's auto-bkops only when it's not doing any regular data transfer but sometimes device may not behave properly if host keeps the auto-bkops disabled. This change adds the capability to let the device auto-bkops always enabled except suspend. Reviewed-by: Sahitya Tummala Signed-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 33 ++--- drivers/scsi/ufs/ufshcd.h | 13 + 2 files changed, 35 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 12a2250..a647bcf 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4134,18 +4134,25 @@ static int ufshcd_disable_auto_bkops(struct ufs_hba *hba) } /** - * ufshcd_force_reset_auto_bkops - force enable of auto bkops + * ufshcd_force_reset_auto_bkops - force reset auto bkops state * @hba: per adapter instance * * After a device reset the device may toggle the BKOPS_EN flag * to default value. The s/w tracking variables should be updated - * as well. Do this by forcing enable of auto bkops. + * as well. This function would change the auto-bkops state based on + * UFSHCD_CAP_KEEP_AUTO_BKOPS_ENABLED_EXCEPT_SUSPEND. */ -static void ufshcd_force_reset_auto_bkops(struct ufs_hba *hba) +static void ufshcd_force_reset_auto_bkops(struct ufs_hba *hba) { - hba->auto_bkops_enabled = false; - hba->ee_ctrl_mask |= MASK_EE_URGENT_BKOPS; - ufshcd_enable_auto_bkops(hba); + if (ufshcd_keep_autobkops_enabled_except_suspend(hba)) { + hba->auto_bkops_enabled = false; + hba->ee_ctrl_mask |= MASK_EE_URGENT_BKOPS; + ufshcd_enable_auto_bkops(hba); + } else { + hba->auto_bkops_enabled = true; + hba->ee_ctrl_mask &= ~MASK_EE_URGENT_BKOPS; + ufshcd_disable_auto_bkops(hba); + } } static inline int ufshcd_get_bkops_status(struct ufs_hba *hba, u32 *status) @@ -6564,11 +6571,15 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op) goto set_old_link_state; } - /* -* If BKOPs operations are urgently needed at this moment then -* keep auto-bkops enabled or else disable it. -*/ - ufshcd_urgent_bkops(hba); + if (ufshcd_keep_autobkops_enabled_except_suspend(hba)) + ufshcd_enable_auto_bkops(hba); + else + /* +* If BKOPs operations are urgently needed at this moment then +* keep auto-bkops enabled or else disable it. +*/ + ufshcd_urgent_bkops(hba); + hba->clk_gating.is_suspended = false; if (hba->clk_scaling.is_allowed) diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 787323b..f25d468 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -566,6 +566,14 @@ struct ufs_hba { * CAUTION: Enabling this might reduce overall UFS throughput. */ #define UFSHCD_CAP_INTR_AGGR (1 << 4) + /* +* This capability allows the device auto-bkops to be always enabled +* except during suspend (both runtime and suspend). +* Enabling this capability means that device will always be allowed +* to do background operation when it's active but it might degrade +* the performance of ongoing read/write operations. +*/ +#define UFSHCD_CAP_KEEP_AUTO_BKOPS_ENABLED_EXCEPT_SUSPEND (1 << 5) struct devfreq *devfreq; struct ufs_clk_scaling clk_scaling; @@ -663,6 +671,11 @@ static inline void *ufshcd_get_variant(struct ufs_hba *hba) BUG_ON(!hba); return hba->priv; } +static inline bool ufshcd_keep_autobkops_enabled_except_suspend( + struct ufs_hba *hba) +{ + return hba->caps & UFSHCD_CAP_KEEP_AUTO_BKOPS_ENABLED_EXCEPT_SUSPEND; +} extern int ufshcd_runtime_suspend(struct ufs_hba *hba); extern int ufshcd_runtime_resume(struct ufs_hba *hba); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 12/12] scsi: ufs: Improve fatal error logs
From: Dolev RavivErrors such as UIC error, illegal OCS values, and others may require more information for debugging. Such information could be hibern8 events, events sequences, recoverable errors, error history, and more. This patch improves tracking of important errors and events in debug level to be enabled when debugging a such issues. It includes: * UIC error history * Successful hibern8 events * Successful command after hibern8 exit * Clk-freq info * Failed device command * Infrastructure for dumping host controller debug information Signed-off-by: Dolev Raviv Signed-off-by: Subhash Jadavani --- Changes from v1 -> v2: - Added explicit new line character at the end of the printk messages. --- drivers/scsi/ufs/ufshcd.c | 203 -- drivers/scsi/ufs/ufshcd.h | 47 +++ 2 files changed, 208 insertions(+), 42 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 918c597..c2b15a2 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -328,6 +328,37 @@ static void ufshcd_add_command_trace(struct ufs_hba *hba, doorbell, transfer_len, intr, lba, opcode); } +static void ufshcd_print_clk_freqs(struct ufs_hba *hba) +{ + struct ufs_clk_info *clki; + struct list_head *head = >clk_list_head; + + if (!head || list_empty(head)) + return; + + list_for_each_entry(clki, head, list) { + if (!IS_ERR_OR_NULL(clki->clk) && clki->min_freq && + clki->max_freq) + dev_err(hba->dev, "clk: %s, rate: %u\n", + clki->name, clki->curr_freq); + } +} + +static void ufshcd_print_uic_err_hist(struct ufs_hba *hba, + struct ufs_uic_err_reg_hist *err_hist, char *err_name) +{ + int i; + + for (i = 0; i < UIC_ERR_REG_HIST_LENGTH; i++) { + int p = (i + err_hist->pos - 1) % UIC_ERR_REG_HIST_LENGTH; + + if (err_hist->reg[p] == 0) + continue; + dev_err(hba->dev, "%s[%d] = 0x%x at %lld us\n", err_name, i, + err_hist->reg[p], ktime_to_us(err_hist->tstamp[p])); + } +} + static void ufshcd_print_host_regs(struct ufs_hba *hba) { /* @@ -344,6 +375,21 @@ static void ufshcd_print_host_regs(struct ufs_hba *hba) dev_err(hba->dev, "hba->outstanding_reqs = 0x%x, hba->outstanding_tasks = 0x%x\n", (u32)hba->outstanding_reqs, (u32)hba->outstanding_tasks); + dev_err(hba->dev, + "last_hibern8_exit_tstamp at %lld us, hibern8_exit_cnt = %d\n", + ktime_to_us(hba->ufs_stats.last_hibern8_exit_tstamp), + hba->ufs_stats.hibern8_exit_cnt); + + ufshcd_print_uic_err_hist(hba, >ufs_stats.pa_err, "pa_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.dl_err, "dl_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.nl_err, "nl_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.tl_err, "tl_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.dme_err, "dme_err"); + + ufshcd_print_clk_freqs(hba); + + if (hba->vops && hba->vops->dbg_register_dump) + hba->vops->dbg_register_dump(hba); } static @@ -355,22 +401,30 @@ void ufshcd_print_trs(struct ufs_hba *hba, unsigned long bitmap, bool pr_prdt) for_each_set_bit(tag, , hba->nutrs) { lrbp = >lrb[tag]; - dev_err(hba->dev, "UPIU[%d] - Transfer Request Descriptor\n", - tag); + dev_err(hba->dev, "UPIU[%d] - issue time %lld us\n", + tag, ktime_to_us(lrbp->issue_time_stamp)); + dev_err(hba->dev, + "UPIU[%d] - Transfer Request Descriptor phys@0x%llx\n", + tag, (u64)lrbp->utrd_dma_addr); + ufshcd_hex_dump("UPIU TRD: ", lrbp->utr_descriptor_ptr, sizeof(struct utp_transfer_req_desc)); - dev_err(hba->dev, "UPIU[%d] - Request UPIU\n", tag); + dev_err(hba->dev, "UPIU[%d] - Request UPIU phys@0x%llx\n", tag, + (u64)lrbp->ucd_req_dma_addr); ufshcd_hex_dump("UPIU REQ: ", lrbp->ucd_req_ptr, sizeof(struct utp_upiu_req)); - dev_err(hba->dev, "UPIU[%d] - Response UPIU\n", tag); + dev_err(hba->dev, "UPIU[%d] - Response UPIU phys@0x%llx\n", tag, + (u64)lrbp->ucd_rsp_dma_addr); ufshcd_hex_dump("UPIU RSP: ", lrbp->ucd_rsp_ptr, sizeof(struct utp_upiu_rsp)); if (pr_prdt) { int prdt_length = le16_to_cpu( lrbp->utr_descriptor_ptr->prd_table_length); -
[PATCH v3 09/12] scsi: ufs: fix setting init power mode
Immediately after successful UFS link startup, UFS link power mode would be in PWM-G1, 1-lane, SLOW-AUTO mode. But currently we are doing few of the DME local/peer attributes access before setting the "hba->pwr_info" to default power mode. If we are doing link startup as part of error recovery then old power mode might be set to FAST mode and doing DME peer access (after link startup but before updating "hba->pwr_info" to default power mode) unintentionally tries to switch from FAST to FAST_AUTO mode (if UFSHCD_QUIRK_DME_PEER_ACCESS_AUTO_MODE quirk is enabled). Above issue is fixed by setting the default power mode immediately after successful link startup. Reviewed-by: Sahitya TummalaSigned-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index a647bcf..2d3ca18 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3512,6 +3512,10 @@ static int ufshcd_link_startup(struct ufs_hba *hba) goto link_startup; } + /* Mark that link is up in PWM-G1, 1-lane, SLOW-AUTO mode */ + ufshcd_init_pwr_info(hba); + ufshcd_print_pwr_info(hba); + if (hba->quirks & UFSHCD_QUIRK_BROKEN_LCC) { ret = ufshcd_disable_device_tx_lcc(hba); if (ret) @@ -5547,9 +5551,6 @@ static int ufshcd_probe_hba(struct ufs_hba *hba) if (ret) goto out; - ufshcd_init_pwr_info(hba); - ufshcd_print_pwr_info(hba); - /* set the default level for urgent bkops */ hba->urgent_bkops_lvl = BKOPS_STATUS_PERF_IMPACT; hba->is_urgent_bkops_lvl_checked = false; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 10/12] scsi: ufs: add time profiling support
This patch adds the profiling support for some of the time critical operations like hibern8 enter/exit, clock gating & clock scaling. Reviewed-by: Venkat GopalakrishnanSigned-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 24 include/trace/events/ufs.h | 40 2 files changed, 64 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 2d3ca18..36d239d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3021,11 +3021,14 @@ static int __ufshcd_uic_hibern8_enter(struct ufs_hba *hba) { int ret; struct uic_command uic_cmd = {0}; + ktime_t start = ktime_get(); ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, PRE_CHANGE); uic_cmd.command = UIC_CMD_DME_HIBER_ENTER; ret = ufshcd_uic_pwr_ctrl(hba, _cmd); + trace_ufshcd_profile_hibern8(dev_name(hba->dev), "enter", +ktime_to_us(ktime_sub(ktime_get(), start)), ret); if (ret) { dev_err(hba->dev, "%s: hibern8 enter failed. ret = %d\n", @@ -3061,11 +3064,15 @@ static int ufshcd_uic_hibern8_exit(struct ufs_hba *hba) { struct uic_command uic_cmd = {0}; int ret; + ktime_t start = ktime_get(); ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, PRE_CHANGE); uic_cmd.command = UIC_CMD_DME_HIBER_EXIT; ret = ufshcd_uic_pwr_ctrl(hba, _cmd); + trace_ufshcd_profile_hibern8(dev_name(hba->dev), "exit", +ktime_to_us(ktime_sub(ktime_get(), start)), ret); + if (ret) { dev_err(hba->dev, "%s: hibern8 exit failed. ret = %d\n", __func__, ret); @@ -5950,6 +5957,8 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on, struct ufs_clk_info *clki; struct list_head *head = >clk_list_head; unsigned long flags; + ktime_t start = ktime_get(); + bool clk_state_changed = false; if (!head || list_empty(head)) goto out; @@ -5963,6 +5972,7 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on, if (skip_ref_clk && !strcmp(clki->name, "ref_clk")) continue; + clk_state_changed = on ^ clki->enabled; if (on && !clki->enabled) { ret = clk_prepare_enable(clki->clk); if (ret) { @@ -5997,6 +6007,10 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on, spin_unlock_irqrestore(hba->host->host_lock, flags); } + if (clk_state_changed) + trace_ufshcd_profile_clk_gating(dev_name(hba->dev), + (on ? "on" : "off"), + ktime_to_us(ktime_sub(ktime_get(), start)), ret); return ret; } @@ -6996,6 +7010,8 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) int ret = 0; struct ufs_clk_info *clki; struct list_head *head = >clk_list_head; + ktime_t start = ktime_get(); + bool clk_state_changed = false; if (!head || list_empty(head)) goto out; @@ -7009,6 +7025,8 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) if (scale_up && clki->max_freq) { if (clki->curr_freq == clki->max_freq) continue; + + clk_state_changed = true; ret = clk_set_rate(clki->clk, clki->max_freq); if (ret) { dev_err(hba->dev, "%s: %s clk set rate(%dHz) failed, %d\n", @@ -7026,6 +7044,8 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) } else if (!scale_up && clki->min_freq) { if (clki->curr_freq == clki->min_freq) continue; + + clk_state_changed = true; ret = clk_set_rate(clki->clk, clki->min_freq); if (ret) { dev_err(hba->dev, "%s: %s clk set rate(%dHz) failed, %d\n", @@ -7047,6 +7067,10 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) ret = ufshcd_vops_clk_scale_notify(hba, scale_up, POST_CHANGE); out: + if (clk_state_changed) + trace_ufshcd_profile_clk_scaling(dev_name(hba->dev), + (scale_up ? "up" : "down"), + ktime_to_us(ktime_sub(ktime_get(), start)), ret); return ret; } diff --git a/include/trace/events/ufs.h b/include/trace/events/ufs.h index 57743b6..e9634df 100644 ---
[PATCH v3 12/12] scsi: ufs: Improve fatal error logs
From: Dolev Raviv Errors such as UIC error, illegal OCS values, and others may require more information for debugging. Such information could be hibern8 events, events sequences, recoverable errors, error history, and more. This patch improves tracking of important errors and events in debug level to be enabled when debugging a such issues. It includes: * UIC error history * Successful hibern8 events * Successful command after hibern8 exit * Clk-freq info * Failed device command * Infrastructure for dumping host controller debug information Signed-off-by: Dolev Raviv Signed-off-by: Subhash Jadavani --- Changes from v1 -> v2: - Added explicit new line character at the end of the printk messages. --- drivers/scsi/ufs/ufshcd.c | 203 -- drivers/scsi/ufs/ufshcd.h | 47 +++ 2 files changed, 208 insertions(+), 42 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 918c597..c2b15a2 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -328,6 +328,37 @@ static void ufshcd_add_command_trace(struct ufs_hba *hba, doorbell, transfer_len, intr, lba, opcode); } +static void ufshcd_print_clk_freqs(struct ufs_hba *hba) +{ + struct ufs_clk_info *clki; + struct list_head *head = >clk_list_head; + + if (!head || list_empty(head)) + return; + + list_for_each_entry(clki, head, list) { + if (!IS_ERR_OR_NULL(clki->clk) && clki->min_freq && + clki->max_freq) + dev_err(hba->dev, "clk: %s, rate: %u\n", + clki->name, clki->curr_freq); + } +} + +static void ufshcd_print_uic_err_hist(struct ufs_hba *hba, + struct ufs_uic_err_reg_hist *err_hist, char *err_name) +{ + int i; + + for (i = 0; i < UIC_ERR_REG_HIST_LENGTH; i++) { + int p = (i + err_hist->pos - 1) % UIC_ERR_REG_HIST_LENGTH; + + if (err_hist->reg[p] == 0) + continue; + dev_err(hba->dev, "%s[%d] = 0x%x at %lld us\n", err_name, i, + err_hist->reg[p], ktime_to_us(err_hist->tstamp[p])); + } +} + static void ufshcd_print_host_regs(struct ufs_hba *hba) { /* @@ -344,6 +375,21 @@ static void ufshcd_print_host_regs(struct ufs_hba *hba) dev_err(hba->dev, "hba->outstanding_reqs = 0x%x, hba->outstanding_tasks = 0x%x\n", (u32)hba->outstanding_reqs, (u32)hba->outstanding_tasks); + dev_err(hba->dev, + "last_hibern8_exit_tstamp at %lld us, hibern8_exit_cnt = %d\n", + ktime_to_us(hba->ufs_stats.last_hibern8_exit_tstamp), + hba->ufs_stats.hibern8_exit_cnt); + + ufshcd_print_uic_err_hist(hba, >ufs_stats.pa_err, "pa_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.dl_err, "dl_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.nl_err, "nl_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.tl_err, "tl_err"); + ufshcd_print_uic_err_hist(hba, >ufs_stats.dme_err, "dme_err"); + + ufshcd_print_clk_freqs(hba); + + if (hba->vops && hba->vops->dbg_register_dump) + hba->vops->dbg_register_dump(hba); } static @@ -355,22 +401,30 @@ void ufshcd_print_trs(struct ufs_hba *hba, unsigned long bitmap, bool pr_prdt) for_each_set_bit(tag, , hba->nutrs) { lrbp = >lrb[tag]; - dev_err(hba->dev, "UPIU[%d] - Transfer Request Descriptor\n", - tag); + dev_err(hba->dev, "UPIU[%d] - issue time %lld us\n", + tag, ktime_to_us(lrbp->issue_time_stamp)); + dev_err(hba->dev, + "UPIU[%d] - Transfer Request Descriptor phys@0x%llx\n", + tag, (u64)lrbp->utrd_dma_addr); + ufshcd_hex_dump("UPIU TRD: ", lrbp->utr_descriptor_ptr, sizeof(struct utp_transfer_req_desc)); - dev_err(hba->dev, "UPIU[%d] - Request UPIU\n", tag); + dev_err(hba->dev, "UPIU[%d] - Request UPIU phys@0x%llx\n", tag, + (u64)lrbp->ucd_req_dma_addr); ufshcd_hex_dump("UPIU REQ: ", lrbp->ucd_req_ptr, sizeof(struct utp_upiu_req)); - dev_err(hba->dev, "UPIU[%d] - Response UPIU\n", tag); + dev_err(hba->dev, "UPIU[%d] - Response UPIU phys@0x%llx\n", tag, + (u64)lrbp->ucd_rsp_dma_addr); ufshcd_hex_dump("UPIU RSP: ", lrbp->ucd_rsp_ptr, sizeof(struct utp_upiu_rsp)); if (pr_prdt) { int prdt_length = le16_to_cpu( lrbp->utr_descriptor_ptr->prd_table_length); - dev_err(hba->dev, "UPIU[%d] - PRDT - %d entries\n",
[PATCH v3 09/12] scsi: ufs: fix setting init power mode
Immediately after successful UFS link startup, UFS link power mode would be in PWM-G1, 1-lane, SLOW-AUTO mode. But currently we are doing few of the DME local/peer attributes access before setting the "hba->pwr_info" to default power mode. If we are doing link startup as part of error recovery then old power mode might be set to FAST mode and doing DME peer access (after link startup but before updating "hba->pwr_info" to default power mode) unintentionally tries to switch from FAST to FAST_AUTO mode (if UFSHCD_QUIRK_DME_PEER_ACCESS_AUTO_MODE quirk is enabled). Above issue is fixed by setting the default power mode immediately after successful link startup. Reviewed-by: Sahitya Tummala Signed-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index a647bcf..2d3ca18 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3512,6 +3512,10 @@ static int ufshcd_link_startup(struct ufs_hba *hba) goto link_startup; } + /* Mark that link is up in PWM-G1, 1-lane, SLOW-AUTO mode */ + ufshcd_init_pwr_info(hba); + ufshcd_print_pwr_info(hba); + if (hba->quirks & UFSHCD_QUIRK_BROKEN_LCC) { ret = ufshcd_disable_device_tx_lcc(hba); if (ret) @@ -5547,9 +5551,6 @@ static int ufshcd_probe_hba(struct ufs_hba *hba) if (ret) goto out; - ufshcd_init_pwr_info(hba); - ufshcd_print_pwr_info(hba); - /* set the default level for urgent bkops */ hba->urgent_bkops_lvl = BKOPS_STATUS_PERF_IMPACT; hba->is_urgent_bkops_lvl_checked = false; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v3 10/12] scsi: ufs: add time profiling support
This patch adds the profiling support for some of the time critical operations like hibern8 enter/exit, clock gating & clock scaling. Reviewed-by: Venkat Gopalakrishnan Signed-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 24 include/trace/events/ufs.h | 40 2 files changed, 64 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 2d3ca18..36d239d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3021,11 +3021,14 @@ static int __ufshcd_uic_hibern8_enter(struct ufs_hba *hba) { int ret; struct uic_command uic_cmd = {0}; + ktime_t start = ktime_get(); ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, PRE_CHANGE); uic_cmd.command = UIC_CMD_DME_HIBER_ENTER; ret = ufshcd_uic_pwr_ctrl(hba, _cmd); + trace_ufshcd_profile_hibern8(dev_name(hba->dev), "enter", +ktime_to_us(ktime_sub(ktime_get(), start)), ret); if (ret) { dev_err(hba->dev, "%s: hibern8 enter failed. ret = %d\n", @@ -3061,11 +3064,15 @@ static int ufshcd_uic_hibern8_exit(struct ufs_hba *hba) { struct uic_command uic_cmd = {0}; int ret; + ktime_t start = ktime_get(); ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, PRE_CHANGE); uic_cmd.command = UIC_CMD_DME_HIBER_EXIT; ret = ufshcd_uic_pwr_ctrl(hba, _cmd); + trace_ufshcd_profile_hibern8(dev_name(hba->dev), "exit", +ktime_to_us(ktime_sub(ktime_get(), start)), ret); + if (ret) { dev_err(hba->dev, "%s: hibern8 exit failed. ret = %d\n", __func__, ret); @@ -5950,6 +5957,8 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on, struct ufs_clk_info *clki; struct list_head *head = >clk_list_head; unsigned long flags; + ktime_t start = ktime_get(); + bool clk_state_changed = false; if (!head || list_empty(head)) goto out; @@ -5963,6 +5972,7 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on, if (skip_ref_clk && !strcmp(clki->name, "ref_clk")) continue; + clk_state_changed = on ^ clki->enabled; if (on && !clki->enabled) { ret = clk_prepare_enable(clki->clk); if (ret) { @@ -5997,6 +6007,10 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on, spin_unlock_irqrestore(hba->host->host_lock, flags); } + if (clk_state_changed) + trace_ufshcd_profile_clk_gating(dev_name(hba->dev), + (on ? "on" : "off"), + ktime_to_us(ktime_sub(ktime_get(), start)), ret); return ret; } @@ -6996,6 +7010,8 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) int ret = 0; struct ufs_clk_info *clki; struct list_head *head = >clk_list_head; + ktime_t start = ktime_get(); + bool clk_state_changed = false; if (!head || list_empty(head)) goto out; @@ -7009,6 +7025,8 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) if (scale_up && clki->max_freq) { if (clki->curr_freq == clki->max_freq) continue; + + clk_state_changed = true; ret = clk_set_rate(clki->clk, clki->max_freq); if (ret) { dev_err(hba->dev, "%s: %s clk set rate(%dHz) failed, %d\n", @@ -7026,6 +7044,8 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) } else if (!scale_up && clki->min_freq) { if (clki->curr_freq == clki->min_freq) continue; + + clk_state_changed = true; ret = clk_set_rate(clki->clk, clki->min_freq); if (ret) { dev_err(hba->dev, "%s: %s clk set rate(%dHz) failed, %d\n", @@ -7047,6 +7067,10 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up) ret = ufshcd_vops_clk_scale_notify(hba, scale_up, POST_CHANGE); out: + if (clk_state_changed) + trace_ufshcd_profile_clk_scaling(dev_name(hba->dev), + (scale_up ? "up" : "down"), + ktime_to_us(ktime_sub(ktime_get(), start)), ret); return ret; } diff --git a/include/trace/events/ufs.h b/include/trace/events/ufs.h index 57743b6..e9634df 100644 --- a/include/trace/events/ufs.h +++ b/include/trace/events/ufs.h @@
[PATCH v3 11/12] scsi: ufs: add trace event for ufs commands
From: Lee SusmanUse the ftrace infrastructure to conditionally trace ufs command events. New trace event is created, which samples the following ufs command data: - device name - optional identification string - task tag - doorbell register - number of transfer bytes - interrupt status register - request start LBA - command opcode Currently we only fully trace read(10) and write(10) commands. All other commands which pass through ufshcd_send_command() will be printed with "-1" in the lba and transfer_len fields. Usage: echo 1 > /sys/kernel/debug/tracing/events/ufs/enable cat /sys/kernel/debug/tracing/trace_pipe Signed-off-by: Lee Susman Signed-off-by: Subhash Jadavani --- Changes from v2 -> v3: - Removed conditional compilation based on CONFIG_TRACEPOINTS and used trace_*_enabled() to skip unnecessary code path execution. --- drivers/scsi/ufs/ufshcd.c | 42 +- include/trace/events/ufs.h | 38 ++ 2 files changed, 79 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 36d239d..918c597 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -293,6 +293,41 @@ static inline void ufshcd_remove_non_printable(char *val) *val = ' '; } +static void ufshcd_add_command_trace(struct ufs_hba *hba, + unsigned int tag, const char *str) +{ + sector_t lba = -1; + u8 opcode = 0; + u32 intr, doorbell; + struct ufshcd_lrb *lrbp; + int transfer_len = -1; + + if (!trace_ufshcd_command_enabled()) + return; + + lrbp = >lrb[tag]; + + if (lrbp->cmd) { /* data phase exists */ + opcode = (u8)(*lrbp->cmd->cmnd); + if ((opcode == READ_10) || (opcode == WRITE_10)) { + /* +* Currently we only fully trace read(10) and write(10) +* commands +*/ + if (lrbp->cmd->request && lrbp->cmd->request->bio) + lba = + lrbp->cmd->request->bio->bi_iter.bi_sector; + transfer_len = be32_to_cpu( + lrbp->ucd_req_ptr->sc.exp_data_transfer_len); + } + } + + intr = ufshcd_readl(hba, REG_INTERRUPT_STATUS); + doorbell = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL); + trace_ufshcd_command(dev_name(hba->dev), str, tag, + doorbell, transfer_len, intr, lba, opcode); +} + static void ufshcd_print_host_regs(struct ufs_hba *hba) { /* @@ -1166,6 +1201,7 @@ void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag) ufshcd_writel(hba, 1 << task_tag, REG_UTP_TRANSFER_REQ_DOOR_BELL); /* Make sure that doorbell is committed immediately */ wmb(); + ufshcd_add_command_trace(hba, task_tag, "send"); } /** @@ -3955,6 +3991,7 @@ static void __ufshcd_transfer_req_compl(struct ufs_hba *hba, lrbp = >lrb[index]; cmd = lrbp->cmd; if (cmd) { + ufshcd_add_command_trace(hba, index, "complete"); result = ufshcd_transfer_rsp_status(hba, lrbp); scsi_dma_unmap(cmd); cmd->result = result; @@ -3966,8 +4003,11 @@ static void __ufshcd_transfer_req_compl(struct ufs_hba *hba, __ufshcd_release(hba); } else if (lrbp->command_type == UTP_CMD_TYPE_DEV_MANAGE || lrbp->command_type == UTP_CMD_TYPE_UFS_STORAGE) { - if (hba->dev_cmd.complete) + if (hba->dev_cmd.complete) { + ufshcd_add_command_trace(hba, index, + "dev_complete"); complete(hba->dev_cmd.complete); + } } } diff --git a/include/trace/events/ufs.h b/include/trace/events/ufs.h index e9634df..bf6f826 100644 --- a/include/trace/events/ufs.h +++ b/include/trace/events/ufs.h @@ -219,6 +219,44 @@ TP_PROTO(const char *dev_name, int err, s64 usecs, int dev_state, int link_state), TP_ARGS(dev_name, err, usecs, dev_state, link_state)); + +TRACE_EVENT(ufshcd_command, + TP_PROTO(const char *dev_name, const char *str, unsigned int tag, + u32 doorbell, int transfer_len, u32 intr, u64 lba, + u8 opcode), + + TP_ARGS(dev_name, str, tag, doorbell, transfer_len, intr, lba, opcode), + + TP_STRUCT__entry( + __string(dev_name, dev_name) + __string(str, str) + __field(unsigned int, tag) +
[PATCH v3 06/12] scsi: ufs: provide sysfs attribute to select the PM level
This patch provides the sysfs attribute to choose the power management level for UFS runtime and system suspend. Reviewed-by: Sujit Reddy ThummaSigned-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 144 ++ drivers/scsi/ufs/ufshcd.h | 2 + 2 files changed, 146 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 5932936..30ee04a 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -686,6 +686,28 @@ static inline int ufshcd_is_hba_active(struct ufs_hba *hba) return (ufshcd_readl(hba, REG_CONTROLLER_ENABLE) & 0x1) ? 0 : 1; } +static const char *ufschd_uic_link_state_to_string( + enum uic_link_state state) +{ + switch (state) { + case UIC_LINK_OFF_STATE:return "OFF"; + case UIC_LINK_ACTIVE_STATE: return "ACTIVE"; + case UIC_LINK_HIBERN8_STATE:return "HIBERN8"; + default:return "UNKNOWN"; + } +} + +static const char *ufschd_ufs_dev_pwr_mode_to_string( + enum ufs_dev_pwr_mode state) +{ + switch (state) { + case UFS_ACTIVE_PWR_MODE: return "ACTIVE"; + case UFS_SLEEP_PWR_MODE:return "SLEEP"; + case UFS_POWERDOWN_PWR_MODE:return "POWERDOWN"; + default:return "UNKNOWN"; + } +} + u32 ufshcd_get_local_unipro_ver(struct ufs_hba *hba) { /* HCI version 1.0 and 1.1 supports UniPro 1.41 */ @@ -6709,6 +6731,127 @@ int ufshcd_runtime_idle(struct ufs_hba *hba) } EXPORT_SYMBOL(ufshcd_runtime_idle); +static inline ssize_t ufshcd_pm_lvl_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count, + bool rpm) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + unsigned long flags, value; + + if (kstrtoul(buf, 0, )) + return -EINVAL; + + if ((value < UFS_PM_LVL_0) || (value >= UFS_PM_LVL_MAX)) + return -EINVAL; + + spin_lock_irqsave(hba->host->host_lock, flags); + if (rpm) + hba->rpm_lvl = value; + else + hba->spm_lvl = value; + spin_unlock_irqrestore(hba->host->host_lock, flags); + return count; +} + +static ssize_t ufshcd_rpm_lvl_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + int curr_len; + u8 lvl; + + curr_len = snprintf(buf, PAGE_SIZE, + "\nCurrent Runtime PM level [%d] => dev_state [%s] link_state [%s]\n", + hba->rpm_lvl, + ufschd_ufs_dev_pwr_mode_to_string( + ufs_pm_lvl_states[hba->rpm_lvl].dev_state), + ufschd_uic_link_state_to_string( + ufs_pm_lvl_states[hba->rpm_lvl].link_state)); + + curr_len += snprintf((buf + curr_len), (PAGE_SIZE - curr_len), +"\nAll available Runtime PM levels info:\n"); + for (lvl = UFS_PM_LVL_0; lvl < UFS_PM_LVL_MAX; lvl++) + curr_len += snprintf((buf + curr_len), (PAGE_SIZE - curr_len), +"\tRuntime PM level [%d] => dev_state [%s] link_state [%s]\n", + lvl, + ufschd_ufs_dev_pwr_mode_to_string( + ufs_pm_lvl_states[lvl].dev_state), + ufschd_uic_link_state_to_string( + ufs_pm_lvl_states[lvl].link_state)); + + return curr_len; +} + +static ssize_t ufshcd_rpm_lvl_store(struct device *dev, + struct device_attribute *attr, const char *buf, size_t count) +{ + return ufshcd_pm_lvl_store(dev, attr, buf, count, true); +} + +static void ufshcd_add_rpm_lvl_sysfs_nodes(struct ufs_hba *hba) +{ + hba->rpm_lvl_attr.show = ufshcd_rpm_lvl_show; + hba->rpm_lvl_attr.store = ufshcd_rpm_lvl_store; + sysfs_attr_init(>rpm_lvl_attr.attr); + hba->rpm_lvl_attr.attr.name = "rpm_lvl"; + hba->rpm_lvl_attr.attr.mode = 0644; + if (device_create_file(hba->dev, >rpm_lvl_attr)) + dev_err(hba->dev, "Failed to create sysfs for rpm_lvl\n"); +} + +static ssize_t ufshcd_spm_lvl_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + int curr_len; + u8 lvl; + + curr_len = snprintf(buf, PAGE_SIZE, + "\nCurrent System PM level [%d] => dev_state [%s] link_state [%s]\n", + hba->spm_lvl, +
[PATCH v3 11/12] scsi: ufs: add trace event for ufs commands
From: Lee Susman Use the ftrace infrastructure to conditionally trace ufs command events. New trace event is created, which samples the following ufs command data: - device name - optional identification string - task tag - doorbell register - number of transfer bytes - interrupt status register - request start LBA - command opcode Currently we only fully trace read(10) and write(10) commands. All other commands which pass through ufshcd_send_command() will be printed with "-1" in the lba and transfer_len fields. Usage: echo 1 > /sys/kernel/debug/tracing/events/ufs/enable cat /sys/kernel/debug/tracing/trace_pipe Signed-off-by: Lee Susman Signed-off-by: Subhash Jadavani --- Changes from v2 -> v3: - Removed conditional compilation based on CONFIG_TRACEPOINTS and used trace_*_enabled() to skip unnecessary code path execution. --- drivers/scsi/ufs/ufshcd.c | 42 +- include/trace/events/ufs.h | 38 ++ 2 files changed, 79 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 36d239d..918c597 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -293,6 +293,41 @@ static inline void ufshcd_remove_non_printable(char *val) *val = ' '; } +static void ufshcd_add_command_trace(struct ufs_hba *hba, + unsigned int tag, const char *str) +{ + sector_t lba = -1; + u8 opcode = 0; + u32 intr, doorbell; + struct ufshcd_lrb *lrbp; + int transfer_len = -1; + + if (!trace_ufshcd_command_enabled()) + return; + + lrbp = >lrb[tag]; + + if (lrbp->cmd) { /* data phase exists */ + opcode = (u8)(*lrbp->cmd->cmnd); + if ((opcode == READ_10) || (opcode == WRITE_10)) { + /* +* Currently we only fully trace read(10) and write(10) +* commands +*/ + if (lrbp->cmd->request && lrbp->cmd->request->bio) + lba = + lrbp->cmd->request->bio->bi_iter.bi_sector; + transfer_len = be32_to_cpu( + lrbp->ucd_req_ptr->sc.exp_data_transfer_len); + } + } + + intr = ufshcd_readl(hba, REG_INTERRUPT_STATUS); + doorbell = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL); + trace_ufshcd_command(dev_name(hba->dev), str, tag, + doorbell, transfer_len, intr, lba, opcode); +} + static void ufshcd_print_host_regs(struct ufs_hba *hba) { /* @@ -1166,6 +1201,7 @@ void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag) ufshcd_writel(hba, 1 << task_tag, REG_UTP_TRANSFER_REQ_DOOR_BELL); /* Make sure that doorbell is committed immediately */ wmb(); + ufshcd_add_command_trace(hba, task_tag, "send"); } /** @@ -3955,6 +3991,7 @@ static void __ufshcd_transfer_req_compl(struct ufs_hba *hba, lrbp = >lrb[index]; cmd = lrbp->cmd; if (cmd) { + ufshcd_add_command_trace(hba, index, "complete"); result = ufshcd_transfer_rsp_status(hba, lrbp); scsi_dma_unmap(cmd); cmd->result = result; @@ -3966,8 +4003,11 @@ static void __ufshcd_transfer_req_compl(struct ufs_hba *hba, __ufshcd_release(hba); } else if (lrbp->command_type == UTP_CMD_TYPE_DEV_MANAGE || lrbp->command_type == UTP_CMD_TYPE_UFS_STORAGE) { - if (hba->dev_cmd.complete) + if (hba->dev_cmd.complete) { + ufshcd_add_command_trace(hba, index, + "dev_complete"); complete(hba->dev_cmd.complete); + } } } diff --git a/include/trace/events/ufs.h b/include/trace/events/ufs.h index e9634df..bf6f826 100644 --- a/include/trace/events/ufs.h +++ b/include/trace/events/ufs.h @@ -219,6 +219,44 @@ TP_PROTO(const char *dev_name, int err, s64 usecs, int dev_state, int link_state), TP_ARGS(dev_name, err, usecs, dev_state, link_state)); + +TRACE_EVENT(ufshcd_command, + TP_PROTO(const char *dev_name, const char *str, unsigned int tag, + u32 doorbell, int transfer_len, u32 intr, u64 lba, + u8 opcode), + + TP_ARGS(dev_name, str, tag, doorbell, transfer_len, intr, lba, opcode), + + TP_STRUCT__entry( + __string(dev_name, dev_name) + __string(str, str) + __field(unsigned int, tag) + __field(u32, doorbell) + __field(int, transfer_len) +
[PATCH v3 06/12] scsi: ufs: provide sysfs attribute to select the PM level
This patch provides the sysfs attribute to choose the power management level for UFS runtime and system suspend. Reviewed-by: Sujit Reddy Thumma Signed-off-by: Subhash Jadavani --- drivers/scsi/ufs/ufshcd.c | 144 ++ drivers/scsi/ufs/ufshcd.h | 2 + 2 files changed, 146 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 5932936..30ee04a 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -686,6 +686,28 @@ static inline int ufshcd_is_hba_active(struct ufs_hba *hba) return (ufshcd_readl(hba, REG_CONTROLLER_ENABLE) & 0x1) ? 0 : 1; } +static const char *ufschd_uic_link_state_to_string( + enum uic_link_state state) +{ + switch (state) { + case UIC_LINK_OFF_STATE:return "OFF"; + case UIC_LINK_ACTIVE_STATE: return "ACTIVE"; + case UIC_LINK_HIBERN8_STATE:return "HIBERN8"; + default:return "UNKNOWN"; + } +} + +static const char *ufschd_ufs_dev_pwr_mode_to_string( + enum ufs_dev_pwr_mode state) +{ + switch (state) { + case UFS_ACTIVE_PWR_MODE: return "ACTIVE"; + case UFS_SLEEP_PWR_MODE:return "SLEEP"; + case UFS_POWERDOWN_PWR_MODE:return "POWERDOWN"; + default:return "UNKNOWN"; + } +} + u32 ufshcd_get_local_unipro_ver(struct ufs_hba *hba) { /* HCI version 1.0 and 1.1 supports UniPro 1.41 */ @@ -6709,6 +6731,127 @@ int ufshcd_runtime_idle(struct ufs_hba *hba) } EXPORT_SYMBOL(ufshcd_runtime_idle); +static inline ssize_t ufshcd_pm_lvl_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count, + bool rpm) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + unsigned long flags, value; + + if (kstrtoul(buf, 0, )) + return -EINVAL; + + if ((value < UFS_PM_LVL_0) || (value >= UFS_PM_LVL_MAX)) + return -EINVAL; + + spin_lock_irqsave(hba->host->host_lock, flags); + if (rpm) + hba->rpm_lvl = value; + else + hba->spm_lvl = value; + spin_unlock_irqrestore(hba->host->host_lock, flags); + return count; +} + +static ssize_t ufshcd_rpm_lvl_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + int curr_len; + u8 lvl; + + curr_len = snprintf(buf, PAGE_SIZE, + "\nCurrent Runtime PM level [%d] => dev_state [%s] link_state [%s]\n", + hba->rpm_lvl, + ufschd_ufs_dev_pwr_mode_to_string( + ufs_pm_lvl_states[hba->rpm_lvl].dev_state), + ufschd_uic_link_state_to_string( + ufs_pm_lvl_states[hba->rpm_lvl].link_state)); + + curr_len += snprintf((buf + curr_len), (PAGE_SIZE - curr_len), +"\nAll available Runtime PM levels info:\n"); + for (lvl = UFS_PM_LVL_0; lvl < UFS_PM_LVL_MAX; lvl++) + curr_len += snprintf((buf + curr_len), (PAGE_SIZE - curr_len), +"\tRuntime PM level [%d] => dev_state [%s] link_state [%s]\n", + lvl, + ufschd_ufs_dev_pwr_mode_to_string( + ufs_pm_lvl_states[lvl].dev_state), + ufschd_uic_link_state_to_string( + ufs_pm_lvl_states[lvl].link_state)); + + return curr_len; +} + +static ssize_t ufshcd_rpm_lvl_store(struct device *dev, + struct device_attribute *attr, const char *buf, size_t count) +{ + return ufshcd_pm_lvl_store(dev, attr, buf, count, true); +} + +static void ufshcd_add_rpm_lvl_sysfs_nodes(struct ufs_hba *hba) +{ + hba->rpm_lvl_attr.show = ufshcd_rpm_lvl_show; + hba->rpm_lvl_attr.store = ufshcd_rpm_lvl_store; + sysfs_attr_init(>rpm_lvl_attr.attr); + hba->rpm_lvl_attr.attr.name = "rpm_lvl"; + hba->rpm_lvl_attr.attr.mode = 0644; + if (device_create_file(hba->dev, >rpm_lvl_attr)) + dev_err(hba->dev, "Failed to create sysfs for rpm_lvl\n"); +} + +static ssize_t ufshcd_spm_lvl_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + int curr_len; + u8 lvl; + + curr_len = snprintf(buf, PAGE_SIZE, + "\nCurrent System PM level [%d] => dev_state [%s] link_state [%s]\n", + hba->spm_lvl, + ufschd_ufs_dev_pwr_mode_to_string( +