Re: [PATCH V6 4/7] sparc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
On 4/13/22 11:43, Christophe Leroy wrote:
>
> Le 13/04/2022 à 07:58, Anshuman Khandual a écrit :
>> This defines and exports a platform specific custom vm_get_page_prot() via
>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. It localizes arch_vm_get_page_prot()
>> as sparc_vm_get_page_prot() and moves near vm_get_page_prot().
>>
>> Cc: "David S. Miller"
>> Cc: Khalid Aziz
>> Cc: sparcli...@vger.kernel.org
>> Cc: linux-ker...@vger.kernel.org
>> Reviewed-by: Khalid Aziz
>> Signed-off-by: Anshuman Khandual
>> ---
>>  arch/sparc/Kconfig            |  1 +
>>  arch/sparc/include/asm/mman.h |  6 --
>>  arch/sparc/mm/init_64.c       | 13 +
>>  3 files changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
>> index 9200bc04701c..85b573643af6 100644
>> --- a/arch/sparc/Kconfig
>> +++ b/arch/sparc/Kconfig
>> @@ -84,6 +84,7 @@ config SPARC64
>>  	select PERF_USE_VMALLOC
>>  	select ARCH_HAVE_NMI_SAFE_CMPXCHG
>>  	select HAVE_C_RECORDMCOUNT
>> +	select ARCH_HAS_VM_GET_PAGE_PROT
>>  	select HAVE_ARCH_AUDITSYSCALL
>>  	select ARCH_SUPPORTS_ATOMIC_RMW
>>  	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
>> diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h
>> index 274217e7ed70..af9c10c83dc5 100644
>> --- a/arch/sparc/include/asm/mman.h
>> +++ b/arch/sparc/include/asm/mman.h
>> @@ -46,12 +46,6 @@ static inline unsigned long sparc_calc_vm_prot_bits(unsigned long prot)
>>  	}
>>  }
>>  
>> -#define arch_vm_get_page_prot(vm_flags) sparc_vm_get_page_prot(vm_flags)
>> -static inline pgprot_t sparc_vm_get_page_prot(unsigned long vm_flags)
>> -{
>> -	return (vm_flags & VM_SPARC_ADI) ? __pgprot(_PAGE_MCD_4V) : __pgprot(0);
>> -}
>> -
>>  #define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
>>  static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
>>  {
>> diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
>> index 8b1911591581..dcb17763c1f2 100644
>> --- a/arch/sparc/mm/init_64.c
>> +++ b/arch/sparc/mm/init_64.c
>> @@ -3184,3 +3184,16 @@ void copy_highpage(struct page *to, struct page *from)
>>  	}
>>  }
>>  EXPORT_SYMBOL(copy_highpage);
>> +
>> +static pgprot_t sparc_vm_get_page_prot(unsigned long vm_flags)
>> +{
>> +	return (vm_flags & VM_SPARC_ADI) ? __pgprot(_PAGE_MCD_4V) : __pgprot(0);
>> +}
>> +
>> +pgprot_t vm_get_page_prot(unsigned long vm_flags)
>> +{
>> +	return __pgprot(pgprot_val(protection_map[vm_flags &
>> +			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
>> +			pgprot_val(sparc_vm_get_page_prot(vm_flags)));
>> +}
>> +EXPORT_SYMBOL(vm_get_page_prot);
>
>
> sparc is now the only one with two functions. You can most likely do
> like you did for ARM and POWERPC: merge into a single function:

I was almost about to do this one as well, but as this patch had already
been reviewed with a tag, I just skipped it. I will respin the series once
more :) Khalid, could I keep your review tag after the following change?

> pgprot_t vm_get_page_prot(unsigned long vm_flags)
> {
> 	unsigned long prot = pgprot_val(protection_map[vm_flags &
> 			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
>
> 	if (vm_flags & VM_SPARC_ADI)
> 		prot |= _PAGE_MCD_4V;
>
> 	return __pgprot(prot);
> }
> EXPORT_SYMBOL(vm_get_page_prot);

- Anshuman
Re: cleanup swiotlb initialization v8
On Mon, Apr 04, 2022 at 07:05:44AM +0200, Christoph Hellwig wrote:
> Hi all,
>
> this series tries to clean up the swiotlb initialization, including
> that of swiotlb-xen. To get there it also removes the x86 iommu table
> infrastructure that massively obfuscates the initialization path.
>
> Git tree:
>
>     git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup
>
> Gitweb:
>
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-init-cleanup

I've updated the git tree above with the commit message nitpicks and
received reviews. I plan to pull the patches into the dma-mapping tree
after -rc3 is released, so if any involved maintainer is not happy with
the result, please speak up now.
Re: [PATCH V6 4/7] sparc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Le 13/04/2022 à 07:58, Anshuman Khandual a écrit :
> This defines and exports a platform specific custom vm_get_page_prot() via
> subscribing ARCH_HAS_VM_GET_PAGE_PROT. It localizes arch_vm_get_page_prot()
> as sparc_vm_get_page_prot() and moves near vm_get_page_prot().
>
> Cc: "David S. Miller"
> Cc: Khalid Aziz
> Cc: sparcli...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Reviewed-by: Khalid Aziz
> Signed-off-by: Anshuman Khandual
> ---
>  arch/sparc/Kconfig            |  1 +
>  arch/sparc/include/asm/mman.h |  6 --
>  arch/sparc/mm/init_64.c       | 13 +
>  3 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
> index 9200bc04701c..85b573643af6 100644
> --- a/arch/sparc/Kconfig
> +++ b/arch/sparc/Kconfig
> @@ -84,6 +84,7 @@ config SPARC64
>  	select PERF_USE_VMALLOC
>  	select ARCH_HAVE_NMI_SAFE_CMPXCHG
>  	select HAVE_C_RECORDMCOUNT
> +	select ARCH_HAS_VM_GET_PAGE_PROT
>  	select HAVE_ARCH_AUDITSYSCALL
>  	select ARCH_SUPPORTS_ATOMIC_RMW
>  	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
> diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h
> index 274217e7ed70..af9c10c83dc5 100644
> --- a/arch/sparc/include/asm/mman.h
> +++ b/arch/sparc/include/asm/mman.h
> @@ -46,12 +46,6 @@ static inline unsigned long sparc_calc_vm_prot_bits(unsigned long prot)
>  	}
>  }
>  
> -#define arch_vm_get_page_prot(vm_flags) sparc_vm_get_page_prot(vm_flags)
> -static inline pgprot_t sparc_vm_get_page_prot(unsigned long vm_flags)
> -{
> -	return (vm_flags & VM_SPARC_ADI) ? __pgprot(_PAGE_MCD_4V) : __pgprot(0);
> -}
> -
>  #define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
>  static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
>  {
> diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
> index 8b1911591581..dcb17763c1f2 100644
> --- a/arch/sparc/mm/init_64.c
> +++ b/arch/sparc/mm/init_64.c
> @@ -3184,3 +3184,16 @@ void copy_highpage(struct page *to, struct page *from)
>  	}
>  }
>  EXPORT_SYMBOL(copy_highpage);
> +
> +static pgprot_t sparc_vm_get_page_prot(unsigned long vm_flags)
> +{
> +	return (vm_flags & VM_SPARC_ADI) ? __pgprot(_PAGE_MCD_4V) : __pgprot(0);
> +}
> +
> +pgprot_t vm_get_page_prot(unsigned long vm_flags)
> +{
> +	return __pgprot(pgprot_val(protection_map[vm_flags &
> +			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
> +			pgprot_val(sparc_vm_get_page_prot(vm_flags)));
> +}
> +EXPORT_SYMBOL(vm_get_page_prot);

sparc is now the only one with two functions. You can most likely do
like you did for ARM and POWERPC: merge into a single function:

pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
	unsigned long prot = pgprot_val(protection_map[vm_flags &
			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);

	if (vm_flags & VM_SPARC_ADI)
		prot |= _PAGE_MCD_4V;

	return __pgprot(prot);
}
EXPORT_SYMBOL(vm_get_page_prot);
Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU)
Oops, fixed subject...

Excerpts from Nicholas Piggin's message of April 13, 2022 3:11 pm:
> +Daniel, Thomas, Viresh
>
> Subject: Re: rcu_sched self-detected stall on CPU
>
> Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am:
>> Michael Ellerman writes:
>>> "Paul E. McKenney" writes:
>>>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>>>>> Hi
>>>>>
>>>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>>>>> State University. Following is what I do:
>>>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz -o linux-5.18-rc1.tar.gz
>>>>> 2) tar zxf linux-5.18-rc1.tar.gz
>>>>> 3) cp config linux-5.18-rc1/.config
>>>>> 4) cd linux-5.18-rc1
>>>>> 5) make vmlinux -j 8
>>>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot -smp 2 (QEMU 4.2.1)
>>>>> 7) after 12 rounds, the bug got reproduced:
>>>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>>>>
>>>> Just to make sure, are you both seeing the same thing? Last I knew,
>>>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
>>>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set. Or did
>>>> I miss something?
>>>>
>>>> Miguel is instead seeing an RCU CPU stall warning where RCU's
>>>> grace-period kthread slept for three milliseconds, but did not wake up
>>>> for more than 20 seconds. This kthread would normally have awakened on
>>>> CPU 1, but CPU 1 looks to me to be very unhealthy, as can be seen in
>>>> your console output below (but maybe my idea of what is healthy for
>>>> powerpc systems is outdated). Please see also the inline annotations.
>>>>
>>>> Thoughts from the PPC guys?
>>>
>>> I haven't seen it in my testing. But using Miguel's config I can
>>> reproduce it seemingly on every boot.
>>>
>>> For me it bisects to:
>>>
>>>   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>>>
>>> Which seems plausible.
>>>
>>> Reverting that on mainline makes the bug go away.
>>>
>>> I don't see an obvious bug in the diff, but I could be wrong, or the old
>>> code was papering over an existing bug?
>>>
>>> I'll try and work out what it is about Miguel's config that exposes
>>> this vs our defconfig, that might give us a clue.
>>
>> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
>>
>> I can reproduce just with:
>>
>>   $ make ppc64le_guest_defconfig
>>   $ ./scripts/config -d HIGH_RES_TIMERS
>>
>> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
>> realise you could disable it TBH :)
>>
>> The Rust CI has it disabled because I copied that from the x86 defconfig
>> they were using back when I added the Rust support. I think that was
>> meant to be a stripped down fast config for CI, but the result is it's
>> just using a badly tested combination which is not helpful.
>>
>> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
>> can debug this further without blocking them.
>
> So we traced the problem down to possibly a misunderstanding between
> the decrementer clock event device and core code.
>
> The decrementer is only oneshot*ish*. It actually needs to either be
> reprogrammed or shut down, otherwise it just continues to cause
> interrupts.
>
> Before commit 35de589cb879, it was sort of two-shot. The initial
> interrupt at the programmed time would set its internal next_tb variable
> to ~0 and call the ->event_handler(). If that did not set_next_event or
> stop the timer, the interrupt will fire again immediately, notice
> next_tb is ~0, and only then stop the decrementer interrupt.
>
> So that was already kind of ugly, this patch just turned it into a hang.
>
> The problem happens when the tick is stopped with an event still
> pending, then tick_nohz_handler() is called, but it bails out because
> tick_stopped == 1, so the device never gets programmed again, and so it
> keeps firing.
>
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was
> really oneshot, but we would like to avoid doing that because it requires
> additional programming of the hardware on each timer interrupt. We have
> the ONESHOT_STOPPED state which seems to be just about what we want.
>
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).
>
> Thanks,
> Nick
>
> ---
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 2d76c91b85de..7e13a55b6b71 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1364,9 +1364,11 @@ static void tick_nohz_handler(struct clock_event_device *dev)
>  	tick_sched_do_timer(ts, now);
>  	tick_sched_handle(ts, regs);
>  
> -	/* No need to reprogram if we are running tickless */
> -	if (unlikely(ts->tick_stopped))
> +	if (unlikely(ts->tick_stopped)) {
> +		/* If we are tickless, change
Re: [PATCH V6 1/7] mm/mmap: Add new config ARCH_HAS_VM_GET_PAGE_PROT
Le 13/04/2022 à 07:58, Anshuman Khandual a écrit :
> Add a new config ARCH_HAS_VM_GET_PAGE_PROT, which when subscribed enables a
> given platform to define its own vm_get_page_prot() but still utilizing the
> generic protection_map[] array.
>
> Cc: Andrew Morton
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> Reviewed-by: Catalin Marinas
> Suggested-by: Christoph Hellwig

Reviewed-by: Christophe Leroy

> Signed-off-by: Anshuman Khandual
> ---
>  mm/Kconfig | 3 +++
>  mm/mmap.c  | 2 ++
>  2 files changed, 5 insertions(+)
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 034d87953600..b1f7624276f8 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -765,6 +765,9 @@ config ARCH_HAS_CURRENT_STACK_POINTER
>  config ARCH_HAS_FILTER_PGPROT
>  	bool
>  
> +config ARCH_HAS_VM_GET_PAGE_PROT
> +	bool
> +
>  config ARCH_HAS_PTE_DEVMAP
>  	bool
>  
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 3aa839f81e63..87cb2eaf7e1a 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -106,6 +106,7 @@ pgprot_t protection_map[16] __ro_after_init = {
>  	__S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
>  };
>  
> +#ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT
>  #ifndef CONFIG_ARCH_HAS_FILTER_PGPROT
>  static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
>  {
> @@ -122,6 +123,7 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
>  	return arch_filter_pgprot(ret);
>  }
>  EXPORT_SYMBOL(vm_get_page_prot);
> +#endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
>  
>  static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags)
>  {
Re: [PATCH V6 2/7] powerpc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Le 13/04/2022 à 07:58, Anshuman Khandual a écrit :
> This defines and exports a platform specific custom vm_get_page_prot() via
> subscribing ARCH_HAS_VM_GET_PAGE_PROT. While here, this also localizes
> arch_vm_get_page_prot() as __vm_get_page_prot() and moves it near
> vm_get_page_prot().
>
> Cc: Michael Ellerman
> Cc: Paul Mackerras
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual

Reviewed-by: Christophe Leroy

> ---
>  arch/powerpc/Kconfig               |  1 +
>  arch/powerpc/include/asm/mman.h    | 12
>  arch/powerpc/mm/book3s64/pgtable.c | 17 +
>  3 files changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 174edabb74fa..69e44358a235 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -140,6 +140,7 @@ config PPC
>  	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>  	select ARCH_HAS_UACCESS_FLUSHCACHE
>  	select ARCH_HAS_UBSAN_SANITIZE_ALL
> +	select ARCH_HAS_VM_GET_PAGE_PROT	if PPC_BOOK3S_64
>  	select ARCH_HAVE_NMI_SAFE_CMPXCHG
>  	select ARCH_KEEP_MEMBLOCK
>  	select ARCH_MIGHT_HAVE_PC_PARPORT
> diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
> index 7cb6d18f5cd6..1b024e64c8ec 100644
> --- a/arch/powerpc/include/asm/mman.h
> +++ b/arch/powerpc/include/asm/mman.h
> @@ -24,18 +24,6 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>  }
>  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>  
> -static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
> -{
> -#ifdef CONFIG_PPC_MEM_KEYS
> -	return (vm_flags & VM_SAO) ?
> -		__pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) :
> -		__pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags));
> -#else
> -	return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
> -#endif
> -}
> -#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
> -
>  static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
>  {
>  	if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO))
> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> index 052e6590f84f..8b474ab32f67 100644
> --- a/arch/powerpc/mm/book3s64/pgtable.c
> +++ b/arch/powerpc/mm/book3s64/pgtable.c
> @@ -7,6 +7,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  
> @@ -549,3 +550,19 @@ unsigned long memremap_compat_align(void)
>  }
>  EXPORT_SYMBOL_GPL(memremap_compat_align);
>  #endif
> +
> +pgprot_t vm_get_page_prot(unsigned long vm_flags)
> +{
> +	unsigned long prot = pgprot_val(protection_map[vm_flags &
> +			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
> +
> +	if (vm_flags & VM_SAO)
> +		prot |= _PAGE_SAO;
> +
> +#ifdef CONFIG_PPC_MEM_KEYS
> +	prot |= vmflag_to_pte_pkey_bits(vm_flags);
> +#endif
> +
> +	return __pgprot(prot);
> +}
> +EXPORT_SYMBOL(vm_get_page_prot);
Re: [PATCH V6 6/7] mm/mmap: Drop arch_filter_pgprot()
Le 13/04/2022 à 07:58, Anshuman Khandual a écrit :
> There are no platforms left which subscribe ARCH_HAS_FILTER_PGPROT. Hence
> drop generic arch_filter_pgprot() and also config ARCH_HAS_FILTER_PGPROT.
>
> Cc: Andrew Morton
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> Reviewed-by: Catalin Marinas

Reviewed-by: Christophe Leroy

> Signed-off-by: Anshuman Khandual
> ---
>  mm/Kconfig |  3 ---
>  mm/mmap.c  | 13 ++---
>  2 files changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index b1f7624276f8..3f7b6d7b69df 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -762,9 +762,6 @@ config ARCH_HAS_CURRENT_STACK_POINTER
>  	  register alias named "current_stack_pointer", this config can be
>  	  selected.
>  
> -config ARCH_HAS_FILTER_PGPROT
> -	bool
> -
>  config ARCH_HAS_VM_GET_PAGE_PROT
>  	bool
>  
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 87cb2eaf7e1a..b96e995f3733 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -107,20 +107,11 @@ pgprot_t protection_map[16] __ro_after_init = {
>  };
>  
>  #ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT
> -#ifndef CONFIG_ARCH_HAS_FILTER_PGPROT
> -static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
> -{
> -	return prot;
> -}
> -#endif
> -
>  pgprot_t vm_get_page_prot(unsigned long vm_flags)
>  {
> -	pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags &
> -			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
> +	return __pgprot(pgprot_val(protection_map[vm_flags &
> +			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
>  			pgprot_val(arch_vm_get_page_prot(vm_flags)));
> -
> -	return arch_filter_pgprot(ret);
>  }
>  EXPORT_SYMBOL(vm_get_page_prot);
>  #endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
Re: [PATCH V6 7/7] mm/mmap: Drop arch_vm_get_page_pgprot()
Le 13/04/2022 à 07:58, Anshuman Khandual a écrit :
> There are no platforms left which use arch_vm_get_page_prot(). Just drop
> generic arch_vm_get_page_prot().
>
> Cc: Andrew Morton
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> Reviewed-by: Catalin Marinas

Reviewed-by: Christophe Leroy

> Signed-off-by: Anshuman Khandual
> ---
>  include/linux/mman.h | 4
>  mm/mmap.c            | 4 +---
>  2 files changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/include/linux/mman.h b/include/linux/mman.h
> index b66e91b8176c..58b3abd457a3 100644
> --- a/include/linux/mman.h
> +++ b/include/linux/mman.h
> @@ -93,10 +93,6 @@ static inline void vm_unacct_memory(long pages)
>  #define arch_calc_vm_flag_bits(flags) 0
>  #endif
>  
> -#ifndef arch_vm_get_page_prot
> -#define arch_vm_get_page_prot(vm_flags) __pgprot(0)
> -#endif
> -
>  #ifndef arch_validate_prot
>  /*
>   * This is called from mprotect(). PROT_GROWSDOWN and PROT_GROWSUP have
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b96e995f3733..475b7c367032 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -109,9 +109,7 @@ pgprot_t protection_map[16] __ro_after_init = {
>  #ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT
>  pgprot_t vm_get_page_prot(unsigned long vm_flags)
>  {
> -	return __pgprot(pgprot_val(protection_map[vm_flags &
> -			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
> -			pgprot_val(arch_vm_get_page_prot(vm_flags)));
> +	return protection_map[vm_flags & (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)];
> }
> EXPORT_SYMBOL(vm_get_page_prot);
> #endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
[PATCH V6 7/7] mm/mmap: Drop arch_vm_get_page_pgprot()
There are no platforms left which use arch_vm_get_page_prot(). Just drop
generic arch_vm_get_page_prot().

Cc: Andrew Morton
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Catalin Marinas
Signed-off-by: Anshuman Khandual
---
 include/linux/mman.h | 4
 mm/mmap.c            | 4 +---
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index b66e91b8176c..58b3abd457a3 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -93,10 +93,6 @@ static inline void vm_unacct_memory(long pages)
 #define arch_calc_vm_flag_bits(flags) 0
 #endif
 
-#ifndef arch_vm_get_page_prot
-#define arch_vm_get_page_prot(vm_flags) __pgprot(0)
-#endif
-
 #ifndef arch_validate_prot
 /*
  * This is called from mprotect(). PROT_GROWSDOWN and PROT_GROWSUP have
diff --git a/mm/mmap.c b/mm/mmap.c
index b96e995f3733..475b7c367032 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -109,9 +109,7 @@ pgprot_t protection_map[16] __ro_after_init = {
 #ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT
 pgprot_t vm_get_page_prot(unsigned long vm_flags)
 {
-	return __pgprot(pgprot_val(protection_map[vm_flags &
-			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
-			pgprot_val(arch_vm_get_page_prot(vm_flags)));
+	return protection_map[vm_flags & (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)];
 }
 EXPORT_SYMBOL(vm_get_page_prot);
 #endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
-- 
2.25.1
[PATCH V6 6/7] mm/mmap: Drop arch_filter_pgprot()
There are no platforms left which subscribe ARCH_HAS_FILTER_PGPROT. Hence
drop generic arch_filter_pgprot() and also config ARCH_HAS_FILTER_PGPROT.

Cc: Andrew Morton
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Catalin Marinas
Signed-off-by: Anshuman Khandual
---
 mm/Kconfig |  3 ---
 mm/mmap.c  | 13 ++---
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index b1f7624276f8..3f7b6d7b69df 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -762,9 +762,6 @@ config ARCH_HAS_CURRENT_STACK_POINTER
 	  register alias named "current_stack_pointer", this config can be
 	  selected.
 
-config ARCH_HAS_FILTER_PGPROT
-	bool
-
 config ARCH_HAS_VM_GET_PAGE_PROT
 	bool
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 87cb2eaf7e1a..b96e995f3733 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -107,20 +107,11 @@ pgprot_t protection_map[16] __ro_after_init = {
 };
 
 #ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT
-#ifndef CONFIG_ARCH_HAS_FILTER_PGPROT
-static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
-{
-	return prot;
-}
-#endif
-
 pgprot_t vm_get_page_prot(unsigned long vm_flags)
 {
-	pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags &
-			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
+	return __pgprot(pgprot_val(protection_map[vm_flags &
+			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
 			pgprot_val(arch_vm_get_page_prot(vm_flags)));
-
-	return arch_filter_pgprot(ret);
 }
 EXPORT_SYMBOL(vm_get_page_prot);
 #endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
-- 
2.25.1
[PATCH V6 5/7] x86/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
From: Christoph Hellwig

This defines and exports a platform specific custom vm_get_page_prot() via
subscribing ARCH_HAS_VM_GET_PAGE_PROT. This also unsubscribes from config
ARCH_HAS_FILTER_PGPROT, after dropping off arch_filter_pgprot() and
arch_vm_get_page_prot().

Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Christoph Hellwig
Signed-off-by: Anshuman Khandual
---
 arch/x86/Kconfig                 |  2 +-
 arch/x86/include/asm/pgtable.h   |  5 -
 arch/x86/include/uapi/asm/mman.h | 14 -
 arch/x86/mm/Makefile             |  2 +-
 arch/x86/mm/pgprot.c             | 35
 5 files changed, 37 insertions(+), 21 deletions(-)
 create mode 100644 arch/x86/mm/pgprot.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b0142e01002e..c355c420150e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -76,7 +76,6 @@ config X86
 	select ARCH_HAS_EARLY_DEBUG		if KGDB
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FAST_MULTIPLIER
-	select ARCH_HAS_FILTER_PGPROT
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_KCOV			if X86_64
@@ -95,6 +94,7 @@ config X86
 	select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
 	select ARCH_HAS_SYSCALL_WRAPPER
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_HAS_VM_GET_PAGE_PROT
 	select ARCH_HAS_DEBUG_WX
 	select ARCH_HAS_ZONE_DMA_SET if EXPERT
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 62ab07e24aef..3563f4645fa1 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -649,11 +649,6 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
 
 #define canon_pgprot(p) __pgprot(massage_pgprot(p))
 
-static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
-{
-	return canon_pgprot(prot);
-}
-
 static inline int is_new_memtype_allowed(u64 paddr, unsigned long size,
 					 enum page_cache_mode pcm,
 					 enum page_cache_mode new_pcm)
diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h
index d4a8d0424bfb..775dbd3aff73 100644
--- a/arch/x86/include/uapi/asm/mman.h
+++ b/arch/x86/include/uapi/asm/mman.h
@@ -5,20 +5,6 @@
 #define MAP_32BIT	0x40		/* only give out 32bit addresses */
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-/*
- * Take the 4 protection key bits out of the vma->vm_flags
- * value and turn them in to the bits that we can put in
- * to a pte.
- *
- * Only override these if Protection Keys are available
- * (which is only on 64-bit).
- */
-#define arch_vm_get_page_prot(vm_flags)	__pgprot(	\
-		((vm_flags) & VM_PKEY_BIT0 ? _PAGE_PKEY_BIT0 : 0) |	\
-		((vm_flags) & VM_PKEY_BIT1 ? _PAGE_PKEY_BIT1 : 0) |	\
-		((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) |	\
-		((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0))
-
 #define arch_calc_vm_prot_bits(prot, key) (		\
 		((key) & 0x1 ? VM_PKEY_BIT0 : 0) |	\
 		((key) & 0x2 ? VM_PKEY_BIT1 : 0) |	\
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index fe3d3061fc11..fb6b41a48ae5 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -20,7 +20,7 @@ CFLAGS_REMOVE_mem_encrypt_identity.o = -pg
 endif
 
 obj-y	:= init.o init_$(BITS).o fault.o ioremap.o extable.o mmap.o \
-	pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o
+	pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o pgprot.o
 
 obj-y	+= pat/
 
diff --git a/arch/x86/mm/pgprot.c b/arch/x86/mm/pgprot.c
new file mode 100644
index ..763742782286
--- /dev/null
+++ b/arch/x86/mm/pgprot.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+
+pgprot_t vm_get_page_prot(unsigned long vm_flags)
+{
+	unsigned long val = pgprot_val(protection_map[vm_flags &
+			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
+
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+	/*
+	 * Take the 4 protection key bits out of the vma->vm_flags value and
+	 * turn them in to the bits that we can put in to a pte.
+	 *
+	 * Only override these if Protection Keys are available (which is only
+	 * on 64-bit).
+	 */
+	if (vm_flags & VM_PKEY_BIT0)
+		val |= _PAGE_PKEY_BIT0;
+	if (vm_flags & VM_PKEY_BIT1)
+		val |= _PAGE_PKEY_BIT1;
+	if (vm_flags & VM_PKEY_BIT2)
+		val |= _PAGE_PKEY_BIT2;
+	if (vm_flags & VM_PKEY_BIT3)
+		val |= _PAGE_PKEY_BIT3;
+#endif
+
+	val = __sm
[PATCH V6 4/7] sparc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
This defines and exports a platform specific custom vm_get_page_prot() via
subscribing ARCH_HAS_VM_GET_PAGE_PROT. It localizes arch_vm_get_page_prot()
as sparc_vm_get_page_prot() and moves near vm_get_page_prot().

Cc: "David S. Miller"
Cc: Khalid Aziz
Cc: sparcli...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Khalid Aziz
Signed-off-by: Anshuman Khandual
---
 arch/sparc/Kconfig            |  1 +
 arch/sparc/include/asm/mman.h |  6 --
 arch/sparc/mm/init_64.c       | 13 +
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 9200bc04701c..85b573643af6 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -84,6 +84,7 @@ config SPARC64
 	select PERF_USE_VMALLOC
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select HAVE_C_RECORDMCOUNT
+	select ARCH_HAS_VM_GET_PAGE_PROT
 	select HAVE_ARCH_AUDITSYSCALL
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h
index 274217e7ed70..af9c10c83dc5 100644
--- a/arch/sparc/include/asm/mman.h
+++ b/arch/sparc/include/asm/mman.h
@@ -46,12 +46,6 @@ static inline unsigned long sparc_calc_vm_prot_bits(unsigned long prot)
 	}
 }
 
-#define arch_vm_get_page_prot(vm_flags) sparc_vm_get_page_prot(vm_flags)
-static inline pgprot_t sparc_vm_get_page_prot(unsigned long vm_flags)
-{
-	return (vm_flags & VM_SPARC_ADI) ? __pgprot(_PAGE_MCD_4V) : __pgprot(0);
-}
-
 #define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
 static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
 {
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 8b1911591581..dcb17763c1f2 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -3184,3 +3184,16 @@ void copy_highpage(struct page *to, struct page *from)
 	}
 }
 EXPORT_SYMBOL(copy_highpage);
+
+static pgprot_t sparc_vm_get_page_prot(unsigned long vm_flags)
+{
+	return (vm_flags & VM_SPARC_ADI) ? __pgprot(_PAGE_MCD_4V) : __pgprot(0);
+}
+
+pgprot_t vm_get_page_prot(unsigned long vm_flags)
+{
+	return __pgprot(pgprot_val(protection_map[vm_flags &
+			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
+			pgprot_val(sparc_vm_get_page_prot(vm_flags)));
+}
+EXPORT_SYMBOL(vm_get_page_prot);
-- 
2.25.1
[PATCH V6 3/7] arm64/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
This defines and exports a platform specific custom vm_get_page_prot() via
subscribing ARCH_HAS_VM_GET_PAGE_PROT. It localizes arch_vm_get_page_prot()
and moves it near vm_get_page_prot().

Cc: Catalin Marinas
Cc: Will Deacon
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Catalin Marinas
Signed-off-by: Anshuman Khandual
---
 arch/arm64/Kconfig            |  1 +
 arch/arm64/include/asm/mman.h | 24
 arch/arm64/mm/mmap.c          | 25 +
 3 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 57c4c995965f..dd0b15162bb3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -45,6 +45,7 @@ config ARM64
 	select ARCH_HAS_SYSCALL_WRAPPER
 	select ARCH_HAS_TEARDOWN_DMA_OPS	if IOMMU_SUPPORT
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
+	select ARCH_HAS_VM_GET_PAGE_PROT
 	select ARCH_HAS_ZONE_DMA_SET if EXPERT
 	select ARCH_HAVE_ELF_PROT
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index e3e28f7daf62..5966ee4a6154 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -35,30 +35,6 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
 }
 #define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
 
-static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
-{
-	pteval_t prot = 0;
-
-	if (vm_flags & VM_ARM64_BTI)
-		prot |= PTE_GP;
-
-	/*
-	 * There are two conditions required for returning a Normal Tagged
-	 * memory type: (1) the user requested it via PROT_MTE passed to
-	 * mmap() or mprotect() and (2) the corresponding vma supports MTE. We
-	 * register (1) as VM_MTE in the vma->vm_flags and (2) as
-	 * VM_MTE_ALLOWED. Note that the latter can only be set during the
-	 * mmap() call since mprotect() does not accept MAP_* flags.
-	 * Checking for VM_MTE only is sufficient since arch_validate_flags()
-	 * does not permit (VM_MTE & !VM_MTE_ALLOWED).
-	 */
-	if (vm_flags & VM_MTE)
-		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
-
-	return __pgprot(prot);
-}
-#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
-
 static inline bool arch_validate_prot(unsigned long prot,
 	unsigned long addr __always_unused)
 {
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 77ada00280d9..78e9490f748d 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -55,3 +55,28 @@ static int __init adjust_protection_map(void)
 	return 0;
 }
 arch_initcall(adjust_protection_map);
+
+pgprot_t vm_get_page_prot(unsigned long vm_flags)
+{
+	pteval_t prot = pgprot_val(protection_map[vm_flags &
+			(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
+
+	if (vm_flags & VM_ARM64_BTI)
+		prot |= PTE_GP;
+
+	/*
+	 * There are two conditions required for returning a Normal Tagged
+	 * memory type: (1) the user requested it via PROT_MTE passed to
+	 * mmap() or mprotect() and (2) the corresponding vma supports MTE. We
+	 * register (1) as VM_MTE in the vma->vm_flags and (2) as
+	 * VM_MTE_ALLOWED. Note that the latter can only be set during the
+	 * mmap() call since mprotect() does not accept MAP_* flags.
+	 * Checking for VM_MTE only is sufficient since arch_validate_flags()
+	 * does not permit (VM_MTE & !VM_MTE_ALLOWED).
+	 */
+	if (vm_flags & VM_MTE)
+		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
+
+	return __pgprot(prot);
+}
+EXPORT_SYMBOL(vm_get_page_prot);
-- 
2.25.1
[PATCH V6 2/7] powerpc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
This defines and exports a platform specific custom vm_get_page_prot() via subscribing ARCH_HAS_VM_GET_PAGE_PROT. While here, this also localizes arch_vm_get_page_prot() as __vm_get_page_prot() and moves it near vm_get_page_prot(). Cc: Michael Ellerman Cc: Paul Mackerras Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/mman.h| 12 arch/powerpc/mm/book3s64/pgtable.c | 17 + 3 files changed, 18 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 174edabb74fa..69e44358a235 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -140,6 +140,7 @@ config PPC select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAS_UACCESS_FLUSHCACHE select ARCH_HAS_UBSAN_SANITIZE_ALL + select ARCH_HAS_VM_GET_PAGE_PROTif PPC_BOOK3S_64 select ARCH_HAVE_NMI_SAFE_CMPXCHG select ARCH_KEEP_MEMBLOCK select ARCH_MIGHT_HAVE_PC_PARPORT diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h index 7cb6d18f5cd6..1b024e64c8ec 100644 --- a/arch/powerpc/include/asm/mman.h +++ b/arch/powerpc/include/asm/mman.h @@ -24,18 +24,6 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot, } #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey) -static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags) -{ -#ifdef CONFIG_PPC_MEM_KEYS - return (vm_flags & VM_SAO) ? - __pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) : - __pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags)); -#else - return (vm_flags & VM_SAO) ? 
__pgprot(_PAGE_SAO) : __pgprot(0); -#endif -} -#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags) - static inline bool arch_validate_prot(unsigned long prot, unsigned long addr) { if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO)) diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index 052e6590f84f..8b474ab32f67 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include @@ -549,3 +550,19 @@ unsigned long memremap_compat_align(void) } EXPORT_SYMBOL_GPL(memremap_compat_align); #endif + +pgprot_t vm_get_page_prot(unsigned long vm_flags) +{ + unsigned long prot = pgprot_val(protection_map[vm_flags & + (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]); + + if (vm_flags & VM_SAO) + prot |= _PAGE_SAO; + +#ifdef CONFIG_PPC_MEM_KEYS + prot |= vmflag_to_pte_pkey_bits(vm_flags); +#endif + + return __pgprot(prot); +} +EXPORT_SYMBOL(vm_get_page_prot); -- 2.25.1
[PATCH V6 1/7] mm/mmap: Add new config ARCH_HAS_VM_GET_PAGE_PROT
Add a new config ARCH_HAS_VM_GET_PAGE_PROT, which when subscribed enables a given platform to define its own vm_get_page_prot() but still utilizing the generic protection_map[] array. Cc: Andrew Morton Cc: linux...@kvack.org Cc: linux-ker...@vger.kernel.org Reviewed-by: Catalin Marinas Suggested-by: Christoph Hellwig Signed-off-by: Anshuman Khandual --- mm/Kconfig | 3 +++ mm/mmap.c | 2 ++ 2 files changed, 5 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index 034d87953600..b1f7624276f8 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -765,6 +765,9 @@ config ARCH_HAS_CURRENT_STACK_POINTER config ARCH_HAS_FILTER_PGPROT bool +config ARCH_HAS_VM_GET_PAGE_PROT + bool + config ARCH_HAS_PTE_DEVMAP bool diff --git a/mm/mmap.c b/mm/mmap.c index 3aa839f81e63..87cb2eaf7e1a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -106,6 +106,7 @@ pgprot_t protection_map[16] __ro_after_init = { __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111 }; +#ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT #ifndef CONFIG_ARCH_HAS_FILTER_PGPROT static inline pgprot_t arch_filter_pgprot(pgprot_t prot) { @@ -122,6 +123,7 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags) return arch_filter_pgprot(ret); } EXPORT_SYMBOL(vm_get_page_prot); +#endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */ static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags) { -- 2.25.1
[PATCH V6 0/7] mm/mmap: Drop arch_vm_get_page_prot() and arch_filter_pgprot()
protection_map[] is an array-based construct that translates a given vm_flags combination into a page protection value. The array is populated by the platform via the exported [__S000 .. __S111] and [__P000 .. __P111] macros. The primary user of protection_map[] is vm_get_page_prot(), which determines the page protection value for a given vm_flags. The vm_get_page_prot() implementation could in turn call the platform overrides arch_vm_get_page_prot() and arch_filter_pgprot(). Some platforms also override protection_map[] entries, originally built from __SXXX/__PXXX, with different runtime values. Currently there are multiple layers of abstraction, i.e. the __SXXX/__PXXX macros, protection_map[], arch_vm_get_page_prot() and arch_filter_pgprot(), built between the platform and generic MM, finally defining vm_get_page_prot(). Hence this series proposes to drop the latter two abstraction levels and instead move the responsibility of defining vm_get_page_prot() to the platform itself (still utilizing the generic protection_map[] array), making it clean and simple. This first introduces ARCH_HAS_VM_GET_PAGE_PROT, which enables platforms to define a custom vm_get_page_prot(). It then converts the platforms that define the overrides arch_filter_pgprot() or arch_vm_get_page_prot(), which allows those constructs to be dropped completely. The series was inspired by an earlier discussion with Christoph Hellwig https://lore.kernel.org/all/1632712920-8171-1-git-send-email-anshuman.khand...@arm.com/ This series applies on 5.18-rc2. This series has been cross built for multiple platforms.
- Anshuman Cc: Christoph Hellwig Cc: Andrew Morton Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: sparcli...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-a...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Changes in V6: - Created single merged vm_get_page_prot() function per Christophe - Dropped local variable 'ret' in generic vm_get_page_prot() per Christophe - Dropped __pgprot(pgprot_val(x)) in generic vm_get_page_prot() per Christophe Changes in V5: https://lore.kernel.org/all/20220412043848.80464-2-anshuman.khand...@arm.com/ - Collected new tags on various patches in the series - Coalesced arm64_arch_vm_get_page_prot() into vm_get_page_prot() per Catalin - Modified powerpc's vm_get_page_prot() implementation per Christophe Changes in V4: https://lore.kernel.org/all/20220407103251.1209606-1-anshuman.khand...@arm.com/ - ARCH_HAS_VM_GET_PAGE_PROT now excludes generic protection_map[] - Changed platform's vm_get_page_prot() to use generic protection_map[] - Dropped all platform changes not enabling either arch_vm_get_page_prot() or arch_filter_pgprot() - Dropped all previous tags as code base has changed Changes in V3: https://lore.kernel.org/all/1646045273-9343-1-git-send-email-anshuman.khand...@arm.com/ - Dropped variable 'i' from sme_early_init() on x86 platform - Moved CONFIG_COLDFIRE vm_get_page_prot() inside arch/m68k/mm/mcfmmu.c - Moved CONFIG_SUN3 vm_get_page_prot() inside arch/m68k/mm/sun3mmu.c - Dropped cachebits for vm_get_page_prot() inside arch/m68k/mm/motorola.c - Dropped PAGE_XXX_C definitions from arch/m68k/include/asm/motorola_pgtable.h - Used PAGE_XXX instead for vm_get_page_prot() inside arch/m68k/mm/motorola.c - Dropped all references to protection_map[] in the tree - Replaced s/extensa/xtensa/ on the patch title - Moved left over comments from pgtable.h into init.c on nios2 platform Changes in V2: https://lore.kernel.org/all/1645425519-9034-1-git-send-email-anshuman.khand...@arm.com/ - Dropped the entire 
comment block in [PATCH 30/30] per Geert - Replaced __P010 (although commented) with __PAGE_COPY on arm platform - Replaced __P101 with PAGE_READONLY on um platform Changes in V1: https://lore.kernel.org/all/1644805853-21338-1-git-send-email-anshuman.khand...@arm.com/ - Add white spaces around the | operators - Moved powerpc_vm_get_page_prot() near vm_get_page_prot() on powerpc - Moved arm64_vm_get_page_prot() near vm_get_page_prot() on arm64 - Moved sparc_vm_get_page_prot() near vm_get_page_prot() on sparc - Compacted vm_get_page_prot() switch cases on all platforms - _PAGE_CACHE040 inclusion is dependent on CPU_IS_040_OR_060 - VM_SHARED case should return PAGE_NONE (not PAGE_COPY) on SH platform - Reorganized VM_SHARED, VM_EXEC, VM_WRITE, VM_READ - Dropped the last patch [RFC V1 31/31] which added macros for vm_flags combinations https://lore.kernel.org/all/1643029028-12710-32-git-send-email-anshuman.khand...@arm.com/ Changes in RFC: https://lore.kernel.org/all/1643029028-12710-1-git-send-email-anshuman.khand...@arm.com/ Anshuman Khandual (6): mm/mmap: Add new config ARCH_HAS_VM_GET_PAGE_PROT powerpc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT arm64/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT sparc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT mm/mmap: Drop arch_filter_pgprot() mm/mmap: Drop arch_vm_get_page_pgprot() Christoph Hellwig (1): x86/mm
Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
Excerpts from Laurent Dufour's message of April 2, 2022 12:06 am: > RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits > mode (MSR[SF] unset). > > The change in MSR is done in enter_rtas() in a relatively complex way, > since the MSR value could be hardcoded. > > Furthermore, a panic has been reported when hitting the watchdog interrupt > while running in RTAS, this leads to the following stack trace: > > [69244.027433][ C24] watchdog: CPU 24 Hard LOCKUP > [69244.027442][ C24] watchdog: CPU 24 TB:997512652051031, last heartbeat > TB:997504470175378 (15980ms ago) > [69244.027451][ C24] Modules linked in: chacha_generic(E) libchacha(E) > xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) > libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) > algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) > fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) > cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) > algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) > rpcsec_gss_krb5(E) auth_rpcgss(E) > nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) > udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) > netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) > fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) > crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) > fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) > sd_mod(E) t10_pi(E) > [69244.027555][ C24] ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) > gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) > raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) > dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) > [69244.027587][ C24] Supported: No, Unreleased kernel > [69244.027600][ C24] CPU: 24 PID: 87504 
Comm: drmgr Kdump: loaded Tainted: > GE X5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 > (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c > [69244.027609][ C24] NIP: 1fb41050 LR: 1fb4104c CTR: > > [69244.027612][ C24] REGS: cfc33d60 TRAP: 0100 Tainted: G >E X (5.14.21-150400.71.1.bz196362_2-default) > [69244.027615][ C24] MSR: 82981000 CR: 4882 > XER: 20040020 > [69244.027625][ C24] CFAR: 011c IRQMASK: 1 > [69244.027625][ C24] GPR00: 0003 > 0001 50dc > [69244.027625][ C24] GPR04: 1ffb6100 0020 > 0001 1fb09010 > [69244.027625][ C24] GPR08: 2000 > > [69244.027625][ C24] GPR12: 8004072a40a8 cff8b680 > 0007 0034 > [69244.027625][ C24] GPR16: 1fbf6e94 1fbf6d84 > 1fbd1db0 1fb3f008 > [69244.027625][ C24] GPR20: 1fb41018 > 017f f68f > [69244.027625][ C24] GPR24: 1fb18fe8 1fb3e000 > 1fb1adc0 1fb1cf40 > [69244.027625][ C24] GPR28: 1fb26000 1fb460f0 > 1fb17f18 1fb17000 > [69244.027663][ C24] NIP [1fb41050] 0x1fb41050 > [69244.027696][ C24] LR [1fb4104c] 0x1fb4104c > [69244.027699][ C24] Call Trace: > [69244.027701][ C24] Instruction dump: > [69244.027723][ C24] > > [69244.027728][ C24] > > [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1] > [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries > [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) > xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) > libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) > algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) > fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) > cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) > algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) > rpcsec_gss_krb5(E) auth_rpcgss(E) > nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) > udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) > netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) 
lockd(E) grace(E) sunrpc(E) > fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) > crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) > fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) > sd_mod(E) t10_pi(E) > [69244.028171][T87504] i
Re: rcu_sched self-detected stall on CPU
+Daniel, Thomas, Viresh Subject: Re: rcu_sched self-detected stall on CPU Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am: > Michael Ellerman writes: >> "Paul E. McKenney" writes: >>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote: Hi I can reproduce it in a ppc virtual cloud server provided by Oregon State University. Following is what I do: 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz -o linux-5.18-rc1.tar.gz 2) tar zxf linux-5.18-rc1.tar.gz 3) cp config linux-5.18-rc1/.config 4) cd linux-5.18-rc1 5) make vmlinux -j 8 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot -smp 2 (QEMU 4.2.1) 7) after 12 rounds, the bug got reproduced: (http://154.223.142.244/logs/20220406/qemu.log.txt) >>> >>> Just to make sure, are you both seeing the same thing? Last I knew, >>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels >>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set. Or did >>> I miss something? >>> >>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period >>> kthread slept for three milliseconds, but did not wake up for more than >>> 20 seconds. This kthread would normally have awakened on CPU 1, but >>> CPU 1 looks to me to be very unhealthy, as can be seen in your console >>> output below (but maybe my idea of what is healthy for powerpc systems >>> is outdated). Please see also the inline annotations. >>> >>> Thoughts from the PPC guys? >> >> I haven't seen it in my testing. But using Miguel's config I can >> reproduce it seemingly on every boot. >> >> For me it bisects to: >> >> 35de589cb879 ("powerpc/time: improve decrementer clockevent processing") >> >> Which seems plausible. >> >> Reverting that on mainline makes the bug go away. >> >> I don't see an obvious bug in the diff, but I could be wrong, or the old >> code was papering over an existing bug? 
>> >> I'll try and work out what it is about Miguel's config that exposes >> this vs our defconfig, that might give us a clue. > > It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall. > > I can reproduce just with: > > $ make ppc64le_guest_defconfig > $ ./scripts/config -d HIGH_RES_TIMERS > > We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even > realise you could disable it TBH :) > > The Rust CI has it disabled because I copied that from the x86 defconfig > they were using back when I added the Rust support. I think that was > meant to be a stripped down fast config for CI, but the result is it's > just using a badly tested combination which is not helpful. > > So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we > can debug this further without blocking them. So we traced the problem down to possibly a misunderstanding between decrementer clock event device and core code. The decrementer is only oneshot*ish*. It actually needs to either be reprogrammed or shut down otherwise it just continues to cause interrupts. Before commit 35de589cb879, it was sort of two-shot. The initial interrupt at the programmed time would set its internal next_tb variable to ~0 and call the ->event_handler(). If that did not set_next_event or stop the timer, the interrupt will fire again immediately, notice next_tb is ~0, and only then stop the decrementer interrupt. So that was already kind of ugly, this patch just turned it into a hang. The problem happens when the tick is stopped with an event still pending, then tick_nohz_handler() is called, but it bails out because tick_stopped == 1 so the device never gets programmed again, and so it keeps firing. How to fix it? Before commit a7cba02deced, powerpc's decrementer was really oneshot, but we would like to avoid doing that because it requires additional programming of the hardware on each timer interrupt. We have the ONESHOT_STOPPED state which seems to be just about what we want. 
Did the ONESHOT_STOPPED patch just miss this case, or is there a reason we don't stop it here? This patch seems to fix the hang (not heavily tested though). Thanks, Nick --- diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 2d76c91b85de..7e13a55b6b71 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -1364,9 +1364,11 @@ static void tick_nohz_handler(struct clock_event_device *dev) tick_sched_do_timer(ts, now); tick_sched_handle(ts, regs); - /* No need to reprogram if we are running tickless */ - if (unlikely(ts->tick_stopped)) + if (unlikely(ts->tick_stopped)) { + /* If we are tickless, change the clock event to stopped */ + tick_program_event(KTIME_MAX, 1); return; + } hrtimer_forward(&ts->sched_timer, now, TICK_NSEC); tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
Re: [PATCH V5 2/7] powerpc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
On 4/12/22 17:57, Christophe Leroy wrote: > > > Le 12/04/2022 à 06:38, Anshuman Khandual a écrit : >> This defines and exports a platform specific custom vm_get_page_prot() via >> subscribing ARCH_HAS_VM_GET_PAGE_PROT. While here, this also localizes >> arch_vm_get_page_prot() as __vm_get_page_prot() and moves it near >> vm_get_page_prot(). >> >> Cc: Michael Ellerman >> Cc: Paul Mackerras >> Cc: linuxppc-dev@lists.ozlabs.org >> Cc: linux-ker...@vger.kernel.org >> Signed-off-by: Anshuman Khandual >> --- >> arch/powerpc/Kconfig | 1 + >> arch/powerpc/include/asm/mman.h| 12 >> arch/powerpc/mm/book3s64/pgtable.c | 20 >> 3 files changed, 21 insertions(+), 12 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index 174edabb74fa..69e44358a235 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -140,6 +140,7 @@ config PPC >> select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST >> select ARCH_HAS_UACCESS_FLUSHCACHE >> select ARCH_HAS_UBSAN_SANITIZE_ALL >> +select ARCH_HAS_VM_GET_PAGE_PROTif PPC_BOOK3S_64 >> select ARCH_HAVE_NMI_SAFE_CMPXCHG >> select ARCH_KEEP_MEMBLOCK >> select ARCH_MIGHT_HAVE_PC_PARPORT >> diff --git a/arch/powerpc/include/asm/mman.h >> b/arch/powerpc/include/asm/mman.h >> index 7cb6d18f5cd6..1b024e64c8ec 100644 >> --- a/arch/powerpc/include/asm/mman.h >> +++ b/arch/powerpc/include/asm/mman.h >> @@ -24,18 +24,6 @@ static inline unsigned long >> arch_calc_vm_prot_bits(unsigned long prot, >> } >> #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, >> pkey) >> >> -static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags) >> -{ >> -#ifdef CONFIG_PPC_MEM_KEYS >> -return (vm_flags & VM_SAO) ? >> -__pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) : >> -__pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags)); >> -#else >> -return (vm_flags & VM_SAO) ? 
__pgprot(_PAGE_SAO) : __pgprot(0); >> -#endif >> -} >> -#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags) >> - >> static inline bool arch_validate_prot(unsigned long prot, unsigned long >> addr) >> { >> if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO)) >> diff --git a/arch/powerpc/mm/book3s64/pgtable.c >> b/arch/powerpc/mm/book3s64/pgtable.c >> index 052e6590f84f..d0319524e27f 100644 >> --- a/arch/powerpc/mm/book3s64/pgtable.c >> +++ b/arch/powerpc/mm/book3s64/pgtable.c >> @@ -7,6 +7,7 @@ >> #include >> #include >> #include >> +#include >> #include >> #include >> >> @@ -549,3 +550,22 @@ unsigned long memremap_compat_align(void) >> } >> EXPORT_SYMBOL_GPL(memremap_compat_align); >> #endif >> + >> +static pgprot_t __vm_get_page_prot(unsigned long vm_flags) >> +{ >> +#ifdef CONFIG_PPC_MEM_KEYS >> +return (vm_flags & VM_SAO) ? >> +__pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) : >> +__pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags)); >> +#else >> +return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0); >> +#endif >> +} >> + >> +pgprot_t vm_get_page_prot(unsigned long vm_flags) >> +{ >> +return __pgprot(pgprot_val(protection_map[vm_flags & >> +(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | >> +pgprot_val(__vm_get_page_prot(vm_flags))); >> +} >> +EXPORT_SYMBOL(vm_get_page_prot); > > This looks functionnaly OK, but I think we could go in the same > direction as ARM and try to integrate __vm_get_page_prot() inside > vm_get_page_prot() to get something simpler and cleaner: > > Something like below (untested): > > pgprot_t vm_get_page_prot(unsigned long vm_flags) > { > unsigned long prot = pgprot_val(protection_map[vm_flags & >(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]); > > if (vm_flags & VM_SAO) > prot |= _PAGE_SAO > > #ifdef CONFIG_PPC_MEM_KEYS > prot |= vmflag_to_pte_pkey_bits(vm_flags); > #endif > > return __pgprot(prot); > } Okay, will integrate these functions into vm_get_page_prot() as suggested.
Re: [PATCH V5 6/7] mm/mmap: Drop arch_filter_pgprot()
On 4/12/22 17:59, Christophe Leroy wrote: > > > Le 12/04/2022 à 06:38, Anshuman Khandual a écrit : >> There are no platforms left which subscribe ARCH_HAS_FILTER_PGPROT. Hence >> drop generic arch_filter_pgprot() and also config ARCH_HAS_FILTER_PGPROT. >> >> Cc: Andrew Morton >> Cc: linux...@kvack.org >> Cc: linux-ker...@vger.kernel.org >> Reviewed-by: Catalin Marinas >> Signed-off-by: Anshuman Khandual >> --- >> mm/Kconfig | 3 --- >> mm/mmap.c | 9 + >> 2 files changed, 1 insertion(+), 11 deletions(-) >> >> diff --git a/mm/Kconfig b/mm/Kconfig >> index b1f7624276f8..3f7b6d7b69df 100644 >> --- a/mm/Kconfig >> +++ b/mm/Kconfig >> @@ -762,9 +762,6 @@ config ARCH_HAS_CURRENT_STACK_POINTER >>register alias named "current_stack_pointer", this config can be >>selected. >> >> -config ARCH_HAS_FILTER_PGPROT >> -bool >> - >> config ARCH_HAS_VM_GET_PAGE_PROT >> bool >> >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 87cb2eaf7e1a..edf2a5e38f4d 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -107,20 +107,13 @@ pgprot_t protection_map[16] __ro_after_init = { >> }; >> >> #ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT >> -#ifndef CONFIG_ARCH_HAS_FILTER_PGPROT >> -static inline pgprot_t arch_filter_pgprot(pgprot_t prot) >> -{ >> -return prot; >> -} >> -#endif >> - >> pgprot_t vm_get_page_prot(unsigned long vm_flags) >> { >> pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags & >> (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | >> pgprot_val(arch_vm_get_page_prot(vm_flags))); >> >> -return arch_filter_pgprot(ret); >> +return ret; > > You can drop 'ret' and directly do: > > return __pgprot(pgprot_val(protection_map[vm_flags & > (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | > pgprot_val(arch_vm_get_page_prot(vm_flags))); Sure, will do. > > >> } >> EXPORT_SYMBOL(vm_get_page_prot); >> #endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
Re: [PATCH V5 7/7] mm/mmap: Drop arch_vm_get_page_pgprot()
On 4/12/22 18:00, Christophe Leroy wrote: > > > Le 12/04/2022 à 06:38, Anshuman Khandual a écrit : >> There are no platforms left which use arch_vm_get_page_prot(). Just drop >> generic arch_vm_get_page_prot(). >> >> Cc: Andrew Morton >> Cc: linux...@kvack.org >> Cc: linux-ker...@vger.kernel.org >> Reviewed-by: Catalin Marinas >> Signed-off-by: Anshuman Khandual >> --- >> include/linux/mman.h | 4 >> mm/mmap.c| 3 +-- >> 2 files changed, 1 insertion(+), 6 deletions(-) >> >> diff --git a/include/linux/mman.h b/include/linux/mman.h >> index b66e91b8176c..58b3abd457a3 100644 >> --- a/include/linux/mman.h >> +++ b/include/linux/mman.h >> @@ -93,10 +93,6 @@ static inline void vm_unacct_memory(long pages) >> #define arch_calc_vm_flag_bits(flags) 0 >> #endif >> >> -#ifndef arch_vm_get_page_prot >> -#define arch_vm_get_page_prot(vm_flags) __pgprot(0) >> -#endif >> - >> #ifndef arch_validate_prot >> /* >>* This is called from mprotect(). PROT_GROWSDOWN and PROT_GROWSUP have >> diff --git a/mm/mmap.c b/mm/mmap.c >> index edf2a5e38f4d..db7f33154206 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -110,8 +110,7 @@ pgprot_t protection_map[16] __ro_after_init = { >> pgprot_t vm_get_page_prot(unsigned long vm_flags) >> { >> pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags & >> -(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | >> -pgprot_val(arch_vm_get_page_prot(vm_flags))); >> +(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)])); >> >> return ret; >> } > > __pgprot(pgprot_val(x)) is a no-op. > > You can simply do: > > return protection_map[vm_flags & > (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED); Sure, will do.
Re: [PATCH] ASoC: fsl: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
On Tue, Apr 12, 2022 at 4:30 PM wrote: > From: Minghao Chi > > Using pm_runtime_resume_and_get is more appropriate > for simplifing code > > Reported-by: Zeal Robot > Signed-off-by: Minghao Chi > Acked-by: Shengjiu Wang Best regards Wang Shengjiu > --- > sound/soc/fsl/fsl_esai.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c > index ed444e8f1d6b..1a2bdf8e76f0 100644 > --- a/sound/soc/fsl/fsl_esai.c > +++ b/sound/soc/fsl/fsl_esai.c > @@ -1050,11 +1050,9 @@ static int fsl_esai_probe(struct platform_device > *pdev) > goto err_pm_disable; > } > > - ret = pm_runtime_get_sync(&pdev->dev); > - if (ret < 0) { > - pm_runtime_put_noidle(&pdev->dev); > + ret = pm_runtime_resume_and_get(&pdev->dev); > + if (ret < 0) > goto err_pm_get_sync; > - } > > ret = fsl_esai_hw_init(esai_priv); > if (ret) > -- > 2.25.1 > >
Re: [PATCH net-next v3 14/18] sfc: Remove usage of list iterator for list_add() after the loop body
On Tue, 12 Apr 2022 14:15:53 +0200 Jakob Koschel wrote: > - struct list_head *head = &efx->rss_context.list; > + struct list_head *head = *pos = &efx->rss_context.list; ENOTBUILT, please wait with the reposting. Since you posted two versions today I guess that's 2x 24h? :)
Re: [PATCH] ASoC: fsl: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
On Tue, 12 Apr 2022 08:30:00 +, cgel@gmail.com wrote: > From: Minghao Chi > > Using pm_runtime_resume_and_get is more appropriate > for simplifing code > > Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next Thanks! [1/1] ASoC: fsl: using pm_runtime_resume_and_get instead of pm_runtime_get_sync commit: c721905c54d913db0102973dbcdfb48d91146a2d All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: False positive kmemleak report for dtb properties names on powerpc
On Tue, Apr 12, 2022 at 04:47:47PM +1000, Michael Ellerman wrote: > Christophe Leroy writes: > > Hi Ariel > > > > Le 09/04/2022 à 15:47, Ariel Marcovitch a écrit : > >> Hi Christophe, did you get the chance to look at this? > > > > I tested something this morning, it works for me, see below > > > >> > >> On 23/03/2022 21:06, Mike Rapoport wrote: > >>> Hi Catalin, > >>> > >>> On Wed, Mar 23, 2022 at 05:22:38PM +, Catalin Marinas wrote: > Hi Ariel, > > On Fri, Feb 18, 2022 at 09:45:51PM +0200, Ariel Marcovitch wrote: > > I was running a powerpc 32bit kernel (built using > > qemu_ppc_mpc8544ds_defconfig > > buildroot config, with enabling DEBUGFS+KMEMLEAK+HIGHMEM in the kernel > > config) > > > > ... > > > > I don't suppose I can just shuffle the calls in setup_arch() around, > > so I > > wanted to hear your opinions first > I think it's better if we change the logic than shuffling the calls. > IIUC MEMBLOCK_ALLOC_ACCESSIBLE means that __va() works on the phys > address return by memblock, so something like below (untested): > >>> MEMBLOCK_ALLOC_ACCESSIBLE means "anywhere", see commit e63075a3c937 > >>> ("memblock: Introduce default allocation limit and use it to replace > >>> explicit ones"), so it won't help to detect high memory. > >>> > >>> If I remember correctly, ppc initializes memblock *very* early, so > >>> setting > >>> max_low_pfn along with lowmem_end_addr in > >>> arch/powerpc/mm/init_32::MMU_init() makes sense to me. > >>> > >>> Maybe ppc folks have other ideas... > >>> I've added Christophe who works on ppc32 these days. > > > > I think memblock is already available at the end of MMU_init() on PPC32 > > and at the end of early_setup() on PPC64. It means it is ready when we > > enter setup_arch(). > > > > I tested the change below, it works for me, I don't get any kmemleak > > report anymore. 
> > > > diff --git a/arch/powerpc/kernel/setup-common.c > > b/arch/powerpc/kernel/setup-common.c > > index 518ae5aa9410..9f4e50b176c9 100644 > > --- a/arch/powerpc/kernel/setup-common.c > > +++ b/arch/powerpc/kernel/setup-common.c > > @@ -840,6 +840,9 @@ void __init setup_arch(char **cmdline_p) > > /* Set a half-reasonable default so udelay does something sensible */ > > loops_per_jiffy = 5 / HZ; > > > > + /* Parse memory topology */ > > + mem_topology_setup(); > > + > > /* Unflatten the device-tree passed by prom_init or kexec */ > > unflatten_device_tree(); > > The 64-bit/NUMA version of mem_topology_setup() requires the device tree > to be unflattened, so I don't think that can work. > > Setting max_low_pfn etc in MMU_init() as Mike suggested seems more > likely to work. > > But we might need to set it again in mem_topology_setup() though, so > that things that change memblock_end_of_DRAM() are reflected, eg. memory > limit or crash dump? I don't think this can cause issues for kmemleak Ariel reported. The kmemleak checks if there is a linear mapping for a PFN or that PFN is only accessible via HIGHMEM. Memory limit or crash dump won't change the split, or am I missing something? > cheers -- Sincerely yours, Mike.
[PATCH V3 2/2] perf bench: Fix numa bench to fix usage of affinity for machines with #CPUs > 1K
The 'perf bench numa' testcase fails on systems with more than 1K CPUs. Testcase: perf bench numa mem -p 1 -t 3 -P 512 -s 100 -zZ0qcm --thp 1 Snippet of code: <<>> perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed. Aborted (core dumped) <<>> The bind_to_node function uses "sched_getaffinity" to save the original cpumask, and this call returns EINVAL (invalid argument). This happens because the default mask size in glibc is 1024. To overcome this 1024-CPU mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to allocate the cpumask and CPU_ALLOC_SIZE for its size. Apart from fixing this for "orig_mask", apply the same logic to "mask" as well, which is used for setaffinity, so that the mask size is large enough to represent the number of possible CPUs in the system. sched_getaffinity is used in one more place in perf numa bench, in the "bind_to_cpu" function; apply the same logic there also. Though no failure is currently reported from there, it is ideal to change getaffinity to work with system configurations having more CPUs than the default mask size supported by glibc. Also fix "sched_setaffinity" to use a mask size which is large enough to represent the number of possible CPUs in the system. Fixed all places where "bind_cpumask", which is part of "struct thread_data", is used, such that bind_cpumask works in all configurations. Reported-by: Disha Goel Signed-off-by: Athira Rajeev --- Changelog The first version was posted here: https://lore.kernel.org/all/20220406175113.87881-5-atraj...@linux.vnet.ibm.com/ This v3 version separates patch 3 and addresses a compilation failure in some distros. The change in V3 initializes cpu_set to NULL in main to fix a compilation warning for an uninitialized pointer. It also changes "bind_to_cpu" and "bind_to_node" to CPU_FREE masks properly.
tools/perf/bench/numa.c | 128 +--- 1 file changed, 95 insertions(+), 33 deletions(-) diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c index c838c248aa2c..ffa83744ef99 100644 --- a/tools/perf/bench/numa.c +++ b/tools/perf/bench/numa.c @@ -55,7 +55,7 @@ struct thread_data { int curr_cpu; - cpu_set_t bind_cpumask; + cpu_set_t *bind_cpumask; int bind_node; u8 *process_data; int process_nr; @@ -267,71 +267,115 @@ static bool node_has_cpus(int node) return ret; } -static cpu_set_t bind_to_cpu(int target_cpu) +static cpu_set_t *bind_to_cpu(int target_cpu) { - cpu_set_t orig_mask, mask; - int ret; + int nrcpus = numa_num_possible_cpus(); + cpu_set_t *orig_mask, *mask; + size_t size; - ret = sched_getaffinity(0, sizeof(orig_mask), &orig_mask); - BUG_ON(ret); + orig_mask = CPU_ALLOC(nrcpus); + BUG_ON(!orig_mask); + size = CPU_ALLOC_SIZE(nrcpus); + CPU_ZERO_S(size, orig_mask); + + if (sched_getaffinity(0, size, orig_mask)) + goto err_out; + + mask = CPU_ALLOC(nrcpus); + if (!mask) + goto err_out; - CPU_ZERO(&mask); + CPU_ZERO_S(size, mask); if (target_cpu == -1) { int cpu; for (cpu = 0; cpu < g->p.nr_cpus; cpu++) - CPU_SET(cpu, &mask); + CPU_SET_S(cpu, size, mask); } else { - BUG_ON(target_cpu < 0 || target_cpu >= g->p.nr_cpus); - CPU_SET(target_cpu, &mask); + if (target_cpu < 0 || target_cpu >= g->p.nr_cpus) + goto err; + + CPU_SET_S(target_cpu, size, mask); } - ret = sched_setaffinity(0, sizeof(mask), &mask); - BUG_ON(ret); + if (sched_setaffinity(0, size, mask)) + goto err; return orig_mask; + +err: + CPU_FREE(mask); +err_out: + CPU_FREE(orig_mask); + + /* BUG_ON due to failure in allocation of orig_mask/mask */ + BUG_ON(-1); } -static cpu_set_t bind_to_node(int target_node) +static cpu_set_t *bind_to_node(int target_node) { - cpu_set_t orig_mask, mask; + int nrcpus = numa_num_possible_cpus(); + size_t size; + cpu_set_t *orig_mask, *mask; int cpu; - int ret; - ret = sched_getaffinity(0, sizeof(orig_mask), &orig_mask); - BUG_ON(ret); + orig_mask = 
CPU_ALLOC(nrcpus); + BUG_ON(!orig_mask); + size = CPU_ALLOC_SIZE(nrcpus); + CPU_ZERO_S(size, orig_mask); + + if (sched_getaffinity(0, size, orig_mask)) + goto err_out; + + mask = CPU_ALLOC(nrcpus); + if (!mask) + goto err_out; - CPU_ZERO(&mask); + CPU_ZERO_S(size, mask); if (target_node == NUMA_NO_NODE) { for (cpu = 0; cpu < g->p.nr_cpus; cpu++) - CPU_SET(cpu, &mask)
[PATCH V3 1/2] tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online
Perf numa bench test fails with error: Testcase: ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk Failure snippet: <<>> Running 'numa/mem' benchmark: # Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk" perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed. <<>> The testcase uses CPUs 0 and 8. In the "parse_setup_cpu_list" function, there is a check to see if the cpu number is greater than the maximum number of CPUs possible in the system, i.e. via "if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {". But it could happen that the system has, say, 48 CPUs while only CPUs 0-7 are online and the others are offline. Since "g->p.nr_cpus" is 48, the function will go ahead and set the bit for CPU 8 as well in the cpumask (td->bind_cpumask). The bind_to_cpumask function is then called to set affinity using sched_setaffinity and this cpumask. Since CPU 8 is not online, sched_setaffinity fails with EINVAL. Fix this issue by adding a check to make sure that the CPUs provided in the input argument values are online before proceeding further, and skip the test otherwise. For this, include a new helper function "is_cpu_online" in "tools/perf/util/header.c". Since the "BIT(x)" definition will get included from header.h, remove it from bench/numa.c. Tested-by: Disha Goel Signed-off-by: Athira Rajeev Reported-by: Disha Goel --- Changelog The first version was posted here: https://lore.kernel.org/all/20220406175113.87881-5-atraj...@linux.vnet.ibm.com/ This v3 version separates patch 4 and addresses review comments from Arnaldo to use sysfs__read_str for reading the sysfs file.
tools/perf/bench/numa.c | 8 +-- tools/perf/util/header.c | 51 tools/perf/util/header.h | 1 + 3 files changed, 58 insertions(+), 2 deletions(-) diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c index f2640179ada9..c838c248aa2c 100644 --- a/tools/perf/bench/numa.c +++ b/tools/perf/bench/numa.c @@ -34,6 +34,7 @@ #include #include +#include "../util/header.h" #include #include @@ -585,6 +586,11 @@ static int parse_setup_cpu_list(void) return -1; } + if (is_cpu_online(bind_cpu_0) != 1 || is_cpu_online(bind_cpu_1) != 1) { + printf("\nTest not applicable, bind_cpu_0 or bind_cpu_1 is offline\n"); + return -1; + } + BUG_ON(bind_cpu_0 < 0 || bind_cpu_1 < 0); BUG_ON(bind_cpu_0 > bind_cpu_1); @@ -752,8 +758,6 @@ static int parse_nodes_opt(const struct option *opt __maybe_unused, return parse_node_list(arg); } -#define BIT(x) (1ul << x) - static inline uint32_t lfsr_32(uint32_t lfsr) { const uint32_t taps = BIT(1) | BIT(5) | BIT(6) | BIT(31); diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index d546ff724dbe..a27132e5a5ef 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -983,6 +983,57 @@ static int write_dir_format(struct feat_fd *ff, return do_write(ff, &data->dir.version, sizeof(data->dir.version)); } +/* + * Check whether a CPU is online + * + * Returns: + * 1 -> if CPU is online + * 0 -> if CPU is offline + *-1 -> error case + */ +int is_cpu_online(unsigned int cpu) +{ + char *str; + size_t strlen; + char buf[256]; + int status = -1; + struct stat statbuf; + + snprintf(buf, sizeof(buf), + "/sys/devices/system/cpu/cpu%d", cpu); + if (stat(buf, &statbuf) != 0) + return 0; + + /* +* Check if /sys/devices/system/cpu/cpux/online file +* exists. Some cases cpu0 won't have online file since +* it is not expected to be turned off generally. 
+* In kernels without CONFIG_HOTPLUG_CPU, this +* file won't exist +*/ + snprintf(buf, sizeof(buf), + "/sys/devices/system/cpu/cpu%d/online", cpu); + if (stat(buf, &statbuf) != 0) + return 1; + + /* +* Read online file using sysfs__read_str. +* If read or open fails, return -1. +* If read succeeds, return value from file +* which gets stored in "str" +*/ + snprintf(buf, sizeof(buf), + "devices/system/cpu/cpu%d/online", cpu); + + if (sysfs__read_str(buf, &str, &strlen) < 0) + return status; + + status = atoi(str); + + free(str); + return status; +} + #ifdef HAVE_LIBBPF_SUPPORT static int write_bpf_prog_info(struct feat_fd *ff, struct evlist *evlist __maybe_unused) diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h index c9e3265832d9..0eb4bc29a5a4 100644 --- a/tools/perf/util/header.h +++ b/tools/perf/util/header.h @@ -158,6 +158,7 @@ int do_write(struct feat_fd *fd, const void *buf, size_t size); int write_padded(struct feat_
[PATCH V3 0/2] Fix perf bench numa to work with machines having #CPUs > 1K
The perf benchmark for collections: numa hits failures in system configurations with more than 1024 CPUs. These benchmarks use "sched_getaffinity" and "sched_setaffinity" in the code to work with affinity. Example snippet from the numa benchmark: <<>> perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed. Aborted (core dumped) <<>> The bind_to_node function uses "sched_getaffinity" to save the cpumask. This fails with EINVAL because the default mask size in glibc is 1024. To overcome this 1024-CPU mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros, i.e., use CPU_ALLOC to allocate the cpumask, CPU_ALLOC_SIZE for its size, and CPU_SET_S to set a mask bit. Fix all the relevant places in the code to use a mask size large enough to represent the number of possible CPUs in the system. This patchset also addresses a fix for the parse_setup_cpu_list function in the numa bench to check whether an input CPU is online before binding a task to that CPU. This fixes failures where, though the CPU number is within the maximum, the CPU could be offline; sched_setaffinity will then fail when using a cpumask with that cpu bit set. Patch 1 addresses the fix for parse_setup_cpu_list to check if the CPU used to bind a task is online. Patch 2 has the fix for bench numa to work with machines having #CPUs > 1K. Athira Rajeev (2): tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online perf bench: Fix numa bench to fix usage of affinity for machines with #CPUs > 1K Changelog: v2 -> v3 Link to the v2 version : https://lore.kernel.org/all/20220406175113.87881-1-atraj...@linux.vnet.ibm.com/ - From the v2 version, patch 1 and patch 2 are now part of upstream.
- This v3 version separates patch 3 and patch 4 to address review comments from Arnaldo, which include using sysfs__read_str for reading the sysfs file and fixing the compilation issues observed in Debian. tools/perf/bench/numa.c | 136 +-- tools/perf/util/header.c | 51 +++ tools/perf/util/header.h | 1 + 3 files changed, 153 insertions(+), 35 deletions(-) -- 2.35.1
Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K
> On 09-Apr-2022, at 10:48 PM, Arnaldo Carvalho de Melo wrote: > > Em Sat, Apr 09, 2022 at 12:28:01PM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Wed, Apr 06, 2022 at 11:21:09PM +0530, Athira Rajeev escreveu: >>> The perf benchmark for collections: numa, futex and epoll >>> hits failure in system configuration with CPU's more than 1024. >>> These benchmarks uses "sched_getaffinity" and "sched_setaffinity" >>> in the code to work with affinity. >> >> Applied 1-3, 4 needs some reworking and can wait for v5.19, the first 3 >> are fixes, so can go now. > > Now trying to fix this: > > 26 7.89 debian:9 : FAIL gcc version 6.3.0 20170516 > (Debian 6.3.0-18+deb9u1) >bench/numa.c: In function 'alloc_data': >bench/numa.c:359:6: error: 'orig_mask' may be used uninitialized in this > function [-Werror=maybe-uninitialized] > ret = sched_setaffinity(0, size, mask); > ^~ >bench/numa.c:409:13: note: 'orig_mask' was declared here > cpu_set_t *orig_mask; > ^ >cc1: all warnings being treated as errors >/git/perf-5.18.0-rc1/tools/build/Makefile.build:139: recipe for target > 'bench' failed >make[3]: *** [bench] Error 2 > > > Happened in several distros. Hi Arnaldo, Thanks for pointing this out. I was able to recreate this compile error on Debian. The issue is that the variable orig_mask in the "alloc_data" function is used and initialised only within the if condition for "init_cpu0". We can fix it by initialising orig_mask to NULL, since it is accessed conditionally. I also made some changes to CPU_FREE the mask in other error paths. I will post a V3 with these changes. Athira > > - Arnaldo
[powerpc:merge] BUILD SUCCESS 83d8a0d166119de813cad27ae7d61f54f9aea707
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge branch HEAD: 83d8a0d166119de813cad27ae7d61f54f9aea707 Automatic merge of 'master' into merge (2022-04-11 23:56) elapsed time: 1324m configs tested: 125 configs skipped: 3 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm allmodconfig arm allyesconfig arm64 defconfig arm64allyesconfig i386 randconfig-c001-20220411 armlart_defconfig arc tb10x_defconfig arm gemini_defconfig i386defconfig sh alldefconfig sh kfr2r09-romimage_defconfig arm imx_v6_v7_defconfig arm pxa3xx_defconfig m68k m5475evb_defconfig xtensa iss_defconfig parisc64 alldefconfig armrealview_defconfig sh sdk7780_defconfig powerpc ep8248e_defconfig powerpc bamboo_defconfig powerpc stx_gp3_defconfig mips decstation_64_defconfig powerpc mpc83xx_defconfig xtensa cadence_csp_defconfig sh rsk7203_defconfig mips cobalt_defconfig nios2alldefconfig ia64 gensparse_defconfig powerpc mpc837x_rdb_defconfig arm jornada720_defconfig openrisc alldefconfig x86_64 randconfig-c001-20220411 arm randconfig-c002-20220411 ia64 allmodconfig ia64 allyesconfig ia64defconfig m68k allyesconfig m68k allmodconfig m68kdefconfig nios2 defconfig arc allyesconfig cskydefconfig nios2allyesconfig alpha defconfig alphaallyesconfig h8300allyesconfig xtensa allyesconfig arc defconfig sh allmodconfig s390defconfig s390 allmodconfig parisc defconfig parisc64defconfig parisc allyesconfig s390 allyesconfig sparc defconfig i386 allyesconfig sparcallyesconfig i386 debian-10.3-kselftests i386 debian-10.3 mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allnoconfig powerpc allmodconfig x86_64 randconfig-a016-20220411 x86_64 randconfig-a012-20220411 x86_64 randconfig-a013-20220411 x86_64 randconfig-a014-20220411 x86_64 randconfig-a015-20220411 x86_64 randconfig-a011-20220411 i386 randconfig-a015-20220411 i386 randconfig-a011-20220411 i386 randconfig-a016-20220411 i386 
randconfig-a012-20220411 i386 randconfig-a013-20220411 i386 randconfig-a014-20220411 riscvrandconfig-r042-20220411 arc randconfig-r043-20220411 s390 randconfig-r044-20220411 riscvallmodconfig riscvnommu_k210_defconfig riscvnommu_virt_defconfig riscv allnoconfig riscv rv32_defconfig riscvallyesconfig riscv defconfig um i386_defconfig um x86_64_defconfig x86_64 rhel-8.3-func x86_64 kexec x86_64 defconfig x86_64 allyesconfig x86_64rhel-8.3-kselftests x86_64 rhel-8.3-kunit x86_64 rhel-8.3 clang tested configs: powerpc randconfig-c003-20220411 arm randconfig-c002-20220411 x86_64 randconfig-c007-202
Re: rcu_sched self-detected stall on CPU
On Tue, Apr 12, 2022 at 04:53:06PM +1000, Michael Ellerman wrote: > "Paul E. McKenney" writes: > > On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote: > >> Zhouyi Zhou writes: > >> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney > >> > wrote: > >> >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote: > >> >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman > >> >> > wrote: > >> ... > >> >> > > I haven't seen it in my testing. But using Miguel's config I can > >> >> > > reproduce it seemingly on every boot. > >> >> > > > >> >> > > For me it bisects to: > >> >> > > > >> >> > > 35de589cb879 ("powerpc/time: improve decrementer clockevent > >> >> > > processing") > >> >> > > > >> >> > > Which seems plausible. > >> >> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer > >> >> > clockevent processing") > >> ... > >> >> > >> >> > > Reverting that on mainline makes the bug go away. > >> > >> >> > I also revert that on the mainline, and am currently doing a pressure > >> >> > test (by repeatedly invoking qemu and checking the console.log) on PPC > >> >> > VM in Oregon State University. > >> > >> > After 306 rounds of stress test on mainline without triggering the bug > >> > (last for 4 hours and 27 minutes), I think the bug is indeed caused by > >> > 35de589cb879 ("powerpc/time: improve decrementer clockevent > >> > processing") and stop the test for now. > >> > >> Thanks for testing, that's pretty conclusive. > >> > >> I'm not inclined to actually revert it yet. > >> > >> We need to understand if there's actually a bug in the patch, or if it's > >> just exposing some existing bug/bad behavior we have. The fact that it > >> only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious. > >> > >> Do we have some code that inadvertently relies on something enabled by > >> HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y > >> ? 
> > > > For whatever it is worth, moderate rcutorture runs to completion without > > errors with CONFIG_HIGH_RES_TIMERS=n on 64-bit x86. > > Thanks for testing that, I don't have any big x86 machines to test on :) > > > Also for whatever it is worth, I don't know of anything other than > > microcontrollers or the larger IoT devices that would want their kernels > > built with CONFIG_HIGH_RES_TIMERS=n. Which might be a failure of > > imagination on my part, but so it goes. > > Yeah I agree, like I said before I wasn't even aware you could turn it > off. So I think we'll definitely add a select HIGH_RES_TIMERS in future, > but first I need to work out why we are seeing stalls with it disabled. Good point, and fair enough! Thanx, Paul
Re: [PATCH net-next v2 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
On Tue, 2022-04-12 at 13:37 +0200, Jakob Koschel wrote: > > > On 12. Apr 2022, at 13:27, Russell King (Oracle) > > wrote: > > > > On Tue, Apr 12, 2022 at 12:58:17PM +0200, Jakob Koschel wrote: > > > We know that "dev > dst->last_switch" in the "else" block. > > > In other words, that "dev - dst->last_switch" is > 0. > > > > > > dsa_port_bridge_num_get(dp) can be 0, but the check > > > "if (bridge_num + dst->last_switch != dev) continue", rewritten as > > > "if (bridge_num != dev - dst->last_switch) continue", aka > > > "if (bridge_num != something which cannot be 0) continue", > > > makes it redundant to have the extra "if (!bridge_num) continue" logic, > > > since a bridge_num of zero would have been skipped anyway. > > > > > > Signed-off-by: Jakob Koschel > > > Signed-off-by: Vladimir Oltean > > > > Isn't this Vladimir's patch? > > > > If so, it should be commited as Vladimir as the author, and Vladimir's > > sign-off should appear before yours. When sending out such patches, > > there should be a From: line for Vladimir as the first line in the body > > of the patch email. > > yes, you are right. I wasn't sure on how to send those commits, but > I guess if I just set the commit author correctly it should be fine. > > regarding the order: I thought I did it correctly doing bottom up but > I confused the order, wasn't on purpose. Sorry about that. > > I'll resend after verifying all the authors and sign-offs are correct. whoops, too late... Please, do wait at least 24h before reposting, as pointed out in the documentation: https://elixir.bootlin.com/linux/v5.18-rc2/source/Documentation/process/maintainer-netdev.rst#L148 Thanks, Paolo
Re: [RFC][PATCH] net: fs_enet: fix tx error handling
Le 17/03/2022 à 16:38, Mans Rullgard a écrit : > In some cases, the TXE flag is apparently set without any error > indication in the buffer descriptor status. When this happens, tx > stalls until the tx_restart() function is called via the device > watchdog which can take a long time. Is there an erratum from NXP about this? Did you report the issue to them? What feedback did you get? > > To fix this, check for TXE in the napi poll function and trigger a > tx_restart() call as for errors reported in the buffer descriptor. I'm not sure I understand. Your change seems to do a lot more than that. In particular, it changes the location of error handling. Previously, errors were handled in the interrupt routine; now they are handled in the NAPI poll routine. You have to explain all that. > > This change makes the FCC based Ethernet controller on MPC82xx devices > usable. It probably breaks the other modes (FEC, SCC) which I have no way of testing. You should at least change mac-scc.h and mac-fec.h to match the changes you made in mac-fcc.h. This would allow me to at least test the FEC one.
> > Signed-off-by: Mans Rullgard > --- > .../ethernet/freescale/fs_enet/fs_enet-main.c | 47 +++ > .../net/ethernet/freescale/fs_enet/mac-fcc.c | 2 +- > 2 files changed, 19 insertions(+), 30 deletions(-) > > diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > index 78e008b81374..4276becd07cf 100644 > --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > @@ -94,14 +94,22 @@ static int fs_enet_napi(struct napi_struct *napi, int > budget) > int curidx; > int dirtyidx, do_wake, do_restart; > int tx_left = TX_RING_SIZE; > + u32 int_events; > > spin_lock(&fep->tx_lock); > bdp = fep->dirty_tx; > + do_wake = do_restart = 0; > + > + int_events = (*fep->ops->get_int_events)(dev); > + > + if (int_events & fep->ev_err) { > + (*fep->ops->ev_error)(dev, int_events); > + do_restart = 1; > + } > > /* clear status bits for napi*/ > (*fep->ops->napi_clear_event)(dev); > > - do_wake = do_restart = 0; > while (((sc = CBDR_SC(bdp)) & BD_ENET_TX_READY) == 0 && tx_left) { > dirtyidx = bdp - fep->tx_bd_base; > > @@ -318,43 +326,24 @@ fs_enet_interrupt(int irq, void *dev_id) > { > struct net_device *dev = dev_id; > struct fs_enet_private *fep; > - const struct fs_platform_info *fpi; > u32 int_events; > - u32 int_clr_events; > - int nr, napi_ok; > - int handled; > > fep = netdev_priv(dev); > - fpi = fep->fpi; > > - nr = 0; > - while ((int_events = (*fep->ops->get_int_events)(dev)) != 0) { > - nr++; > + int_events = (*fep->ops->get_int_events)(dev); > + if (!int_events) > + return IRQ_NONE; > > - int_clr_events = int_events; > - int_clr_events &= ~fep->ev_napi; > + int_events &= ~fep->ev_napi; > > - (*fep->ops->clear_int_events)(dev, int_clr_events); > - > - if (int_events & fep->ev_err) > - (*fep->ops->ev_error)(dev, int_events); > - > - if (int_events & fep->ev) { > - napi_ok = napi_schedule_prep(&fep->napi); > - > - (*fep->ops->napi_disable)(dev); 
> - (*fep->ops->clear_int_events)(dev, fep->ev_napi); > - > - /* NOTE: it is possible for FCCs in NAPI mode*/ > - /* to submit a spurious interrupt while in poll */ > - if (napi_ok) > - __napi_schedule(&fep->napi); > - } > + (*fep->ops->clear_int_events)(dev, int_events); > > + if (napi_schedule_prep(&fep->napi)) { > + (*fep->ops->napi_disable)(dev); > + __napi_schedule(&fep->napi); > } > > - handled = nr > 0; > - return IRQ_RETVAL(handled); > + return IRQ_HANDLED; > } > > void fs_init_bds(struct net_device *dev) > diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c > b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c > index b47490be872c..66c8f82a8333 100644 > --- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c > +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c > @@ -124,7 +124,7 @@ static int do_pd_setup(struct fs_enet_private *fep) > return ret; > } > > -#define FCC_NAPI_EVENT_MSK (FCC_ENET_RXF | FCC_ENET_RXB | FCC_ENET_TXB) > +#define FCC_NAPI_EVENT_MSK (FCC_ENET_RXF | FCC_ENET_RXB | FCC_ENET_TXB | > FCC_ENET_TXE) > #define FCC_EVENT (FCC_ENET_RXF | FCC_ENET_TXB) > #define FCC_ERR_EVENT_MSK (FCC_ENET_TXE) >
Re: [PATCH V5 7/7] mm/mmap: Drop arch_vm_get_page_pgprot()
Le 12/04/2022 à 06:38, Anshuman Khandual a écrit : > There are no platforms left which use arch_vm_get_page_prot(). Just drop > generic arch_vm_get_page_prot(). > > Cc: Andrew Morton > Cc: linux...@kvack.org > Cc: linux-ker...@vger.kernel.org > Reviewed-by: Catalin Marinas > Signed-off-by: Anshuman Khandual > --- > include/linux/mman.h | 4 > mm/mmap.c| 3 +-- > 2 files changed, 1 insertion(+), 6 deletions(-) > > diff --git a/include/linux/mman.h b/include/linux/mman.h > index b66e91b8176c..58b3abd457a3 100644 > --- a/include/linux/mman.h > +++ b/include/linux/mman.h > @@ -93,10 +93,6 @@ static inline void vm_unacct_memory(long pages) > #define arch_calc_vm_flag_bits(flags) 0 > #endif > > -#ifndef arch_vm_get_page_prot > -#define arch_vm_get_page_prot(vm_flags) __pgprot(0) > -#endif > - > #ifndef arch_validate_prot > /* >* This is called from mprotect(). PROT_GROWSDOWN and PROT_GROWSUP have > diff --git a/mm/mmap.c b/mm/mmap.c > index edf2a5e38f4d..db7f33154206 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -110,8 +110,7 @@ pgprot_t protection_map[16] __ro_after_init = { > pgprot_t vm_get_page_prot(unsigned long vm_flags) > { > pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags & > - (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | > - pgprot_val(arch_vm_get_page_prot(vm_flags))); > + (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)])); > > return ret; > } __pgprot(pgprot_val(x)) is a no-op. You can simply do: return protection_map[vm_flags & (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)];
Re: [PATCH V5 6/7] mm/mmap: Drop arch_filter_pgprot()
Le 12/04/2022 à 06:38, Anshuman Khandual a écrit : > There are no platforms left which subscribe ARCH_HAS_FILTER_PGPROT. Hence > drop generic arch_filter_pgprot() and also config ARCH_HAS_FILTER_PGPROT. > > Cc: Andrew Morton > Cc: linux...@kvack.org > Cc: linux-ker...@vger.kernel.org > Reviewed-by: Catalin Marinas > Signed-off-by: Anshuman Khandual > --- > mm/Kconfig | 3 --- > mm/mmap.c | 9 + > 2 files changed, 1 insertion(+), 11 deletions(-) > > diff --git a/mm/Kconfig b/mm/Kconfig > index b1f7624276f8..3f7b6d7b69df 100644 > --- a/mm/Kconfig > +++ b/mm/Kconfig > @@ -762,9 +762,6 @@ config ARCH_HAS_CURRENT_STACK_POINTER > register alias named "current_stack_pointer", this config can be > selected. > > -config ARCH_HAS_FILTER_PGPROT > - bool > - > config ARCH_HAS_VM_GET_PAGE_PROT > bool > > diff --git a/mm/mmap.c b/mm/mmap.c > index 87cb2eaf7e1a..edf2a5e38f4d 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -107,20 +107,13 @@ pgprot_t protection_map[16] __ro_after_init = { > }; > > #ifndef CONFIG_ARCH_HAS_VM_GET_PAGE_PROT > -#ifndef CONFIG_ARCH_HAS_FILTER_PGPROT > -static inline pgprot_t arch_filter_pgprot(pgprot_t prot) > -{ > - return prot; > -} > -#endif > - > pgprot_t vm_get_page_prot(unsigned long vm_flags) > { > pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags & > (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | > pgprot_val(arch_vm_get_page_prot(vm_flags))); > > - return arch_filter_pgprot(ret); > + return ret; You can drop 'ret' and directly do: return __pgprot(pgprot_val(protection_map[vm_flags & (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | pgprot_val(arch_vm_get_page_prot(vm_flags))); > } > EXPORT_SYMBOL(vm_get_page_prot); > #endif /* CONFIG_ARCH_HAS_VM_GET_PAGE_PROT */
Re: [PATCH V5 2/7] powerpc/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT
Le 12/04/2022 à 06:38, Anshuman Khandual a écrit : > This defines and exports a platform specific custom vm_get_page_prot() via > subscribing ARCH_HAS_VM_GET_PAGE_PROT. While here, this also localizes > arch_vm_get_page_prot() as __vm_get_page_prot() and moves it near > vm_get_page_prot(). > > Cc: Michael Ellerman > Cc: Paul Mackerras > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-ker...@vger.kernel.org > Signed-off-by: Anshuman Khandual > --- > arch/powerpc/Kconfig | 1 + > arch/powerpc/include/asm/mman.h| 12 > arch/powerpc/mm/book3s64/pgtable.c | 20 > 3 files changed, 21 insertions(+), 12 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 174edabb74fa..69e44358a235 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -140,6 +140,7 @@ config PPC > select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST > select ARCH_HAS_UACCESS_FLUSHCACHE > select ARCH_HAS_UBSAN_SANITIZE_ALL > + select ARCH_HAS_VM_GET_PAGE_PROTif PPC_BOOK3S_64 > select ARCH_HAVE_NMI_SAFE_CMPXCHG > select ARCH_KEEP_MEMBLOCK > select ARCH_MIGHT_HAVE_PC_PARPORT > diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h > index 7cb6d18f5cd6..1b024e64c8ec 100644 > --- a/arch/powerpc/include/asm/mman.h > +++ b/arch/powerpc/include/asm/mman.h > @@ -24,18 +24,6 @@ static inline unsigned long > arch_calc_vm_prot_bits(unsigned long prot, > } > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, > pkey) > > -static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags) > -{ > -#ifdef CONFIG_PPC_MEM_KEYS > - return (vm_flags & VM_SAO) ? > - __pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) : > - __pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags)); > -#else > - return (vm_flags & VM_SAO) ? 
__pgprot(_PAGE_SAO) : __pgprot(0); > -#endif > -} > -#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags) > - > static inline bool arch_validate_prot(unsigned long prot, unsigned long > addr) > { > if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO)) > diff --git a/arch/powerpc/mm/book3s64/pgtable.c > b/arch/powerpc/mm/book3s64/pgtable.c > index 052e6590f84f..d0319524e27f 100644 > --- a/arch/powerpc/mm/book3s64/pgtable.c > +++ b/arch/powerpc/mm/book3s64/pgtable.c > @@ -7,6 +7,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -549,3 +550,22 @@ unsigned long memremap_compat_align(void) > } > EXPORT_SYMBOL_GPL(memremap_compat_align); > #endif > + > +static pgprot_t __vm_get_page_prot(unsigned long vm_flags) > +{ > +#ifdef CONFIG_PPC_MEM_KEYS > + return (vm_flags & VM_SAO) ? > + __pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) : > + __pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags)); > +#else > + return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0); > +#endif > +} > + > +pgprot_t vm_get_page_prot(unsigned long vm_flags) > +{ > + return __pgprot(pgprot_val(protection_map[vm_flags & > + (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) | > + pgprot_val(__vm_get_page_prot(vm_flags))); > +} > +EXPORT_SYMBOL(vm_get_page_prot); This looks functionally OK, but I think we could go in the same direction as ARM and try to integrate __vm_get_page_prot() inside vm_get_page_prot() to get something simpler and cleaner: Something like below (untested): pgprot_t vm_get_page_prot(unsigned long vm_flags) { unsigned long prot = pgprot_val(protection_map[vm_flags & (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]); if (vm_flags & VM_SAO) prot |= _PAGE_SAO; #ifdef CONFIG_PPC_MEM_KEYS prot |= vmflag_to_pte_pkey_bits(vm_flags); #endif return __pgprot(prot); }
[PATCH net-next v3 18/18] team: Remove use of list iterator variable for list_for_each_entry_from()
In preparation to limit the scope of the list iterator variable to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or skip the iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/team/team.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index b07dde6f0abf..688c4393f099 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2425,17 +2425,17 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, int flags, team_nl_send_func_t *send_func, struct list_head *sel_opt_inst_list) { + struct team_option_inst *opt_inst, *tmp = NULL; struct nlattr *option_list; struct nlmsghdr *nlh; void *hdr; - struct team_option_inst *opt_inst; int err; struct sk_buff *skb = NULL; bool incomplete; int i; - opt_inst = list_first_entry(sel_opt_inst_list, - struct team_option_inst, tmp_list); + tmp = list_first_entry(sel_opt_inst_list, + struct team_option_inst, tmp_list); start_again: err = __send_and_alloc_skb(&skb, team, portid, send_func); @@ -2456,7 +2456,9 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, goto nla_put_failure; i = 0; + opt_inst = list_prepare_entry(tmp, sel_opt_inst_list, tmp_list); incomplete = false; + tmp = NULL; list_for_each_entry_from(opt_inst, sel_opt_inst_list, tmp_list) { err = team_nl_fill_one_option_get(skb, team, opt_inst); if (err) { @@ -2464,6 +2466,7 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, if (!i) goto errout; incomplete = true; + tmp = opt_inst; break; } goto errout; @@ 
-2707,14 +2710,14 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, struct nlattr *port_list; struct nlmsghdr *nlh; void *hdr; - struct team_port *port; + struct team_port *port, *tmp = NULL; int err; struct sk_buff *skb = NULL; bool incomplete; int i; - port = list_first_entry_or_null(&team->port_list, - struct team_port, list); + tmp = list_first_entry_or_null(&team->port_list, + struct team_port, list); start_again: err = __send_and_alloc_skb(&skb, team, portid, send_func); @@ -2744,7 +2747,9 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, err = team_nl_fill_one_port_get(skb, one_port); if (err) goto errout; - } else if (port) { + } else { + port = list_prepare_entry(tmp, &team->port_list, list); + tmp = NULL; list_for_each_entry_from(port, &team->port_list, list) { err = team_nl_fill_one_port_get(skb, port); if (err) { @@ -2752,6 +2757,7 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, if (!i) goto errout; incomplete = true; + tmp = port; break; } goto errout; -- 2.25.1
[PATCH net-next v3 17/18] ipvlan: Remove usage of list iterator variable for the loop body
In preparation to limit the scope of the list iterator variable to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or start a new iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ipvlan/ipvlan_main.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index 696e245f6d00..063d7c30e944 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -9,7 +9,7 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, struct netlink_ext_ack *extack) { - struct ipvl_dev *ipvlan; + struct ipvl_dev *ipvlan, *tmp = NULL; unsigned int flags; int err; @@ -26,8 +26,10 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, flags & ~IFF_NOARP, extack); } - if (unlikely(err)) + if (unlikely(err)) { + tmp = ipvlan; goto fail; + } } if (nval == IPVLAN_MODE_L3S) { /* New mode is L3S */ @@ -43,6 +45,7 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, return 0; fail: + ipvlan = list_prepare_entry(tmp, &port->ipvlans, pnode); /* Undo the flags changes that have been done so far. */ list_for_each_entry_continue_reverse(ipvlan, &port->ipvlans, pnode) { flags = ipvlan->dev->flags; -- 2.25.1
[PATCH net-next v3 16/18] ps3_gelic: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded that a separate iterator variable should be used instead of a found boolean [1]. This removes the need for a found variable: simply checking whether the dedicated variable was set determines whether the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- .../net/ethernet/toshiba/ps3_gelic_wireless.c | 30 +-- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c index dc14a66583ff..c8a016c902cd 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c +++ b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c @@ -1495,14 +1495,14 @@ static int gelic_wl_start_scan(struct gelic_wl_info *wl, int always_scan, */ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) { + struct gelic_wl_scan_info *target = NULL, *iter, *tmp; struct gelic_eurus_cmd *cmd = NULL; - struct gelic_wl_scan_info *target, *tmp; struct gelic_wl_scan_info *oldest = NULL; struct gelic_eurus_scan_info *scan_info; unsigned int scan_info_size; union iwreq_data data; unsigned long this_time = jiffies; - unsigned int data_len, i, found, r; + unsigned int data_len, i, r; void *buf; pr_debug("%s:start\n", __func__); @@ -1539,14 +1539,14 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) wl->scan_stat = GELIC_WL_SCAN_STAT_GOT_LIST; /* mark all entries are old */ - list_for_each_entry_safe(target, tmp, &wl->network_list, list) { - target->valid = 0; + list_for_each_entry_safe(iter, tmp, &wl->network_list, list) { + iter->valid = 0; /* expire too old entries */ - if (time_before(target->last_scanned + wl->scan_age, + if (time_before(iter->last_scanned
+ wl->scan_age, this_time)) { - kfree(target->hwinfo); - target->hwinfo = NULL; - list_move_tail(&target->list, &wl->network_free_list); + kfree(iter->hwinfo); + iter->hwinfo = NULL; + list_move_tail(&iter->list, &wl->network_free_list); } } @@ -1569,22 +1569,22 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) continue; } - found = 0; + target = NULL; oldest = NULL; - list_for_each_entry(target, &wl->network_list, list) { - if (ether_addr_equal(&target->hwinfo->bssid[2], + list_for_each_entry(iter, &wl->network_list, list) { + if (ether_addr_equal(&iter->hwinfo->bssid[2], &scan_info->bssid[2])) { - found = 1; + target = iter; pr_debug("%s: same BBS found scanned list\n", __func__); break; } if (!oldest || - (target->last_scanned < oldest->last_scanned)) - oldest = target; + (iter->last_scanned < oldest->last_scanned)) + oldest = iter; } - if (!found) { + if (!target) { /* not found in the list */ if (list_empty(&wl->network_free_list)) { /* expire oldest */ -- 2.25.1
[PATCH net-next v3 15/18] net: netcp: Remove usage of list iterator for list_add() after loop body
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer pointing to the location where the element should be inserted [1]. Before, the code implicitly used the head when no element was found, via &next->list. The new 'pos' variable is set to the list head by default and overwritten if the loop exits early, marking the insertion point for list_add(). Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/ti/netcp_core.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c index 16507bff652a..f25104b5a31b 100644 --- a/drivers/net/ethernet/ti/netcp_core.c +++ b/drivers/net/ethernet/ti/netcp_core.c @@ -471,6 +471,7 @@ struct netcp_hook_list { int netcp_register_txhook(struct netcp_intf *netcp_priv, int order, netcp_hook_rtn *hook_rtn, void *hook_data) { + struct list_head *pos = &netcp_priv->txhook_list_head; struct netcp_hook_list *entry; struct netcp_hook_list *next; unsigned long flags; @@ -485,10 +486,12 @@ int netcp_register_txhook(struct netcp_intf *netcp_priv, int order, spin_lock_irqsave(&netcp_priv->lock, flags); list_for_each_entry(next, &netcp_priv->txhook_list_head, list) { - if (next->order > order) + if (next->order > order) { + pos = &next->list; break; + } } - __list_add(&entry->list, next->list.prev, &next->list); + list_add_tail(&entry->list, pos); spin_unlock_irqrestore(&netcp_priv->lock, flags); return 0; @@ -520,6 +523,7 @@ EXPORT_SYMBOL_GPL(netcp_unregister_txhook); int netcp_register_rxhook(struct netcp_intf *netcp_priv, int order, netcp_hook_rtn *hook_rtn, void *hook_data) { + struct list_head *pos = &netcp_priv->rxhook_list_head; struct netcp_hook_list *entry; struct netcp_hook_list *next; unsigned long flags; @@ -534,10 +538,12 @@ int netcp_register_rxhook(struct netcp_intf *netcp_priv, int
order, spin_lock_irqsave(&netcp_priv->lock, flags); list_for_each_entry(next, &netcp_priv->rxhook_list_head, list) { - if (next->order > order) + if (next->order > order) { + pos = &next->list; break; + } } - __list_add(&entry->list, next->list.prev, &next->list); + list_add_tail(&entry->list, pos); spin_unlock_irqrestore(&netcp_priv->lock, flags); return 0; -- 2.25.1
[PATCH net-next v3 14/18] sfc: Remove usage of list iterator for list_add() after the loop body
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer pointing to the location where the element should be inserted [1]. Before, the code implicitly used the head when no element was found, via &new->list. The new 'pos' variable is set to the list head by default and overwritten if the loop exits early, marking the insertion point for list_add(). Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/sfc/rx_common.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c index 1b22c7be0088..80894a35ea79 100644 --- a/drivers/net/ethernet/sfc/rx_common.c +++ b/drivers/net/ethernet/sfc/rx_common.c @@ -555,7 +555,7 @@ efx_rx_packet_gro(struct efx_channel *channel, struct efx_rx_buffer *rx_buf, */ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) { - struct list_head *head = &efx->rss_context.list; + struct list_head *head = &efx->rss_context.list, *pos = head; struct efx_rss_context *ctx, *new; u32 id = 1; /* Don't use zero, that refers to the master RSS context */ @@ -563,8 +563,10 @@ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) /* Search for first gap in the numbering */ list_for_each_entry(ctx, head, list) { - if (ctx->user_id != id) + if (ctx->user_id != id) { + pos = &ctx->list; break; + } id++; /* Check for wrap. If this happens, we have nearly 2^32 * allocated RSS contexts, which seems unlikely. */ @@ -582,7 +584,7 @@ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) /* Insert the new entry into the gap */ new->user_id = id; - list_add_tail(&new->list, &ctx->list); + list_add_tail(&new->list, pos); return new; } -- 2.25.1
[PATCH net-next v3 13/18] net: qede: Remove check of list iterator against head past the loop body
When list_for_each_entry() completes the iteration over the whole list without breaking the loop, the iterator value will be a bogus pointer computed based on the head element. While it is safe to use the pointer to determine if it was computed based on the head element, either with list_entry_is_head() or &pos->member == head, using the iterator variable after the loop should be avoided. In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qede/qede_filter.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c index 3010833ddde3..3d167e37e654 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c @@ -829,18 +829,21 @@ int qede_configure_vlan_filters(struct qede_dev *edev) int qede_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid) { struct qede_dev *edev = netdev_priv(dev); - struct qede_vlan *vlan; + struct qede_vlan *vlan = NULL; + struct qede_vlan *iter; int rc = 0; DP_VERBOSE(edev, NETIF_MSG_IFDOWN, "Removing vlan 0x%04x\n", vid); /* Find whether entry exists */ __qede_lock(edev); - list_for_each_entry(vlan, &edev->vlan_list, list) - if (vlan->vid == vid) + list_for_each_entry(iter, &edev->vlan_list, list) + if (iter->vid == vid) { + vlan = iter; break; + } - if (list_entry_is_head(vlan, &edev->vlan_list, list)) { + if (!vlan) { DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN), "Vlan isn't configured\n"); goto out; -- 2.25.1
[PATCH net-next v3 12/18] net: qede: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded that a separate iterator variable should be used instead of a found boolean [1]. This removes the need for a found variable: simply checking whether the dedicated variable was set determines whether the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qede/qede_rdma.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_rdma.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c index 6304514a6f2c..2eb03ffe2484 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_rdma.c +++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c @@ -246,18 +246,17 @@ static void qede_rdma_change_mtu(struct qede_dev *edev) static struct qede_rdma_event_work * qede_rdma_get_free_event_node(struct qede_dev *edev) { - struct qede_rdma_event_work *event_node = NULL; - bool found = false; + struct qede_rdma_event_work *event_node = NULL, *iter; - list_for_each_entry(event_node, &edev->rdma_info.rdma_event_list, + list_for_each_entry(iter, &edev->rdma_info.rdma_event_list, list) { - if (!work_pending(&event_node->work)) { - found = true; + if (!work_pending(&iter->work)) { + event_node = iter; break; } } - if (!found) { + if (!event_node) { event_node = kzalloc(sizeof(*event_node), GFP_ATOMIC); if (!event_node) { DP_NOTICE(edev, -- 2.25.1
[PATCH net-next v3 11/18] qed: Remove usage of list iterator variable after the loop
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. Since "found" and "p_ent" need to be equal, "found" should be used consistently to limit the scope of "p_ent" to the list traversal in the future. Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_spq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c index d01b9245f811..cbaa2abed660 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_spq.c +++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c @@ -934,10 +934,10 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, u8 fw_return_code, union event_ring_data *p_data) { + struct qed_spq_entry*found = NULL; struct qed_spq *p_spq; - struct qed_spq_entry*p_ent = NULL; + struct qed_spq_entry*p_ent; struct qed_spq_entry*tmp; - struct qed_spq_entry*found = NULL; if (!p_hwfn) return -EINVAL; @@ -980,7 +980,7 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, DP_VERBOSE(p_hwfn, QED_MSG_SPQ, "Complete EQE [echo %04x]: func %p cookie %p)\n", le16_to_cpu(echo), - p_ent->comp_cb.function, p_ent->comp_cb.cookie); + found->comp_cb.function, found->comp_cb.cookie); if (found->comp_cb.function) found->comp_cb.function(p_hwfn, found->comp_cb.cookie, p_data, fw_return_code); -- 2.25.1
[PATCH net-next v3 10/18] qed: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded that a separate iterator variable should be used instead of a found boolean [1]. This removes the need for a found variable: simply checking whether the dedicated variable was set determines whether the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 26 ++--- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c index 1d1d4caad680..198c9321bf51 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -1630,38 +1630,36 @@ static struct qed_iwarp_listener * qed_iwarp_get_listener(struct qed_hwfn *p_hwfn, struct qed_iwarp_cm_info *cm_info) { - struct qed_iwarp_listener *listener = NULL; + struct qed_iwarp_listener *listener = NULL, *iter; static const u32 ip_zero[4] = { 0, 0, 0, 0 }; - bool found = false; - list_for_each_entry(listener, + list_for_each_entry(iter, &p_hwfn->p_rdma_info->iwarp.listen_list, list_entry) { - if (listener->port == cm_info->local_port) { - if (!memcmp(listener->ip_addr, + if (iter->port == cm_info->local_port) { + if (!memcmp(iter->ip_addr, ip_zero, sizeof(ip_zero))) { - found = true; + listener = iter; break; } - if (!memcmp(listener->ip_addr, + if (!memcmp(iter->ip_addr, cm_info->local_ip, sizeof(cm_info->local_ip)) && - (listener->vlan == cm_info->vlan)) { - found = true; + iter->vlan == cm_info->vlan) { + listener = iter; break; } } } - if (found) { + if (listener) DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener found = %p\n", listener); - return listener; - } + else + DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener not found\n"); -
DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener not found\n"); - return NULL; + return listener; } static int -- 2.25.1
[PATCH net-next v3 09/18] qed: Use dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_dev.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index 672480c9d195..e920e7dcf66a 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -174,7 +174,7 @@ int qed_db_recovery_add(struct qed_dev *cdev, int qed_db_recovery_del(struct qed_dev *cdev, void __iomem *db_addr, void *db_data) { - struct qed_db_recovery_entry *db_entry = NULL; + struct qed_db_recovery_entry *db_entry = NULL, *iter; struct qed_hwfn *p_hwfn; int rc = -EINVAL; @@ -190,12 +190,13 @@ int qed_db_recovery_del(struct qed_dev *cdev, /* Protect the list */ spin_lock_bh(&p_hwfn->db_recovery_info.lock); - list_for_each_entry(db_entry, + list_for_each_entry(iter, &p_hwfn->db_recovery_info.list, list_entry) { /* search according to db_data addr since db_addr is not unique (roce) */ - if (db_entry->db_data == db_data) { - qed_db_recovery_dp_entry(p_hwfn, db_entry, "Deleting"); - list_del(&db_entry->list_entry); + if (iter->db_data == db_data) { + qed_db_recovery_dp_entry(p_hwfn, iter, "Deleting"); + list_del(&iter->list_entry); + db_entry = iter; rc = 0; break; } -- 2.25.1
[PATCH net-next v3 08/18] net: sparx5: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded that a separate iterator variable should be used instead of a found boolean [1]. This removes the need for a found variable: simply checking whether the dedicated variable was set determines whether the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- .../microchip/sparx5/sparx5_mactable.c| 25 +-- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c index a5837dbe0c7e..bb8d9ce79ac2 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c @@ -362,8 +362,7 @@ static void sparx5_mact_handle_entry(struct sparx5 *sparx5, unsigned char mac[ETH_ALEN], u16 vid, u32 cfg2) { - struct sparx5_mact_entry *mact_entry; - bool found = false; + struct sparx5_mact_entry *mact_entry = NULL, *iter; u16 port; if (LRN_MAC_ACCESS_CFG_2_MAC_ENTRY_ADDR_TYPE_GET(cfg2) != @@ -378,28 +377,28 @@ static void sparx5_mact_handle_entry(struct sparx5 *sparx5, return; mutex_lock(&sparx5->mact_lock); - list_for_each_entry(mact_entry, &sparx5->mact_entries, list) { - if (mact_entry->vid == vid && - ether_addr_equal(mac, mact_entry->mac)) { - found = true; - mact_entry->flags |= MAC_ENT_ALIVE; - if (mact_entry->port != port) { + list_for_each_entry(iter, &sparx5->mact_entries, list) { + if (iter->vid == vid && + ether_addr_equal(mac, iter->mac)) { + iter->flags |= MAC_ENT_ALIVE; + if (iter->port != port) { dev_warn(sparx5->dev, "Entry move: %d -> %d\n", -mact_entry->port, port); - mact_entry->port = port; - mact_entry->flags |= MAC_ENT_MOVED; +iter->port, port); + iter->port = port; +
iter->flags |= MAC_ENT_MOVED; } /* Entry handled */ + mact_entry = iter; break; } } mutex_unlock(&sparx5->mact_lock); - if (found && !(mact_entry->flags & MAC_ENT_MOVED)) + if (mact_entry && !(mact_entry->flags & MAC_ENT_MOVED)) /* Present, not moved */ return; - if (!found) { + if (!mact_entry) { /* Entry not found - now add */ mact_entry = alloc_mact_entry(sparx5, mac, vid, port); if (!mact_entry) -- 2.25.1
[PATCH net-next v3 07/18] net: dsa: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded that a separate iterator variable should be used instead of a found boolean [1]. This removes the need for a found variable: simply checking whether the dedicated variable was set determines whether the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- net/dsa/dsa.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index 89c6c86e746f..645522c4dd4a 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -112,22 +112,21 @@ const struct dsa_device_ops *dsa_find_tagger_by_name(const char *buf) const struct dsa_device_ops *dsa_tag_driver_get(int tag_protocol) { - struct dsa_tag_driver *dsa_tag_driver; + struct dsa_tag_driver *dsa_tag_driver = NULL, *iter; const struct dsa_device_ops *ops; - bool found = false; request_module("%s%d", DSA_TAG_DRIVER_ALIAS, tag_protocol); mutex_lock(&dsa_tag_drivers_lock); - list_for_each_entry(dsa_tag_driver, &dsa_tag_drivers_list, list) { - ops = dsa_tag_driver->ops; + list_for_each_entry(iter, &dsa_tag_drivers_list, list) { + ops = iter->ops; if (ops->proto == tag_protocol) { - found = true; + dsa_tag_driver = iter; break; } } - if (found) { + if (dsa_tag_driver) { if (!try_module_get(dsa_tag_driver->owner)) ops = ERR_PTR(-ENOPROTOOPT); } else { -- 2.25.1
[PATCH net-next v3 06/18] net: dsa: mv88e6xxx: refactor mv88e6xxx_port_vlan()
From: Vladimir Oltean To avoid bugs and speculative execution exploits due to type-confused pointers at the end of a list_for_each_entry() loop, one measure is to restrict code to not use the iterator variable outside the loop block. In the case of mv88e6xxx_port_vlan(), this isn't a problem, as we never let the loops exit through "natural causes" anyway, by using a "found" variable and then using the last "dp" iterator prior to the break, which is a safe thing to do. Nonetheless, with the expected new syntax, this pattern will no longer be possible. Profit off of the occasion and break the two port finding methods into smaller sub-functions. Somehow, returning a copy of the iterator pointer is still accepted. This change makes it redundant to have a "bool found", since the "dp" from mv88e6xxx_port_vlan() now holds NULL if we haven't found what we were looking for. Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel --- drivers/net/dsa/mv88e6xxx/chip.c | 54 ++-- 1 file changed, 31 insertions(+), 23 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index b3aa0e5bc842..1f35e89053e6 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1378,42 +1378,50 @@ static int mv88e6xxx_set_mac_eee(struct dsa_switch *ds, int port, return 0; } +static struct dsa_port *mv88e6xxx_find_port(struct dsa_switch_tree *dst, + int sw_index, int port) +{ + struct dsa_port *dp; + + list_for_each_entry(dp, &dst->ports, list) + if (dp->ds->index == sw_index && dp->index == port) + return dp; + + return NULL; +} + +static struct dsa_port * +mv88e6xxx_find_port_by_bridge_num(struct dsa_switch_tree *dst, + unsigned int bridge_num) +{ + struct dsa_port *dp; + + list_for_each_entry(dp, &dst->ports, list) + if (dsa_port_bridge_num_get(dp) == bridge_num) + return dp; + + return NULL; +} + /* Mask of the local ports allowed to receive frames from a given fabric port */ static u16 mv88e6xxx_port_vlan(struct 
mv88e6xxx_chip *chip, int dev, int port) { struct dsa_switch *ds = chip->ds; struct dsa_switch_tree *dst = ds->dst; struct dsa_port *dp, *other_dp; - bool found = false; u16 pvlan; - /* dev is a physical switch */ if (dev <= dst->last_switch) { - list_for_each_entry(dp, &dst->ports, list) { - if (dp->ds->index == dev && dp->index == port) { - /* dp might be a DSA link or a user port, so it -* might or might not have a bridge. -* Use the "found" variable for both cases. -*/ - found = true; - break; - } - } - /* dev is a virtual bridge */ + /* dev is a physical switch */ + dp = mv88e6xxx_find_port(dst, dev, port); } else { - list_for_each_entry(dp, &dst->ports, list) { - unsigned int bridge_num = dsa_port_bridge_num_get(dp); - - if (bridge_num + dst->last_switch != dev) - continue; - - found = true; - break; - } + /* dev is a virtual bridge */ + dp = mv88e6xxx_find_port_by_bridge_num(dst, + dev - dst->last_switch); } /* Prevent frames from unknown switch or virtual bridge */ - if (!found) + if (!dp) return 0; /* Frames from DSA links and CPU ports can egress any local port */ -- 2.25.1
[PATCH net-next v3 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
From: Vladimir Oltean We know that "dev > dst->last_switch" in the "else" block. In other words, that "dev - dst->last_switch" is > 0. dsa_port_bridge_num_get(dp) can be 0, but the check "if (bridge_num + dst->last_switch != dev) continue", rewritten as "if (bridge_num != dev - dst->last_switch) continue", aka "if (bridge_num != something which cannot be 0) continue", makes it redundant to have the extra "if (!bridge_num) continue" logic, since a bridge_num of zero would have been skipped anyway. Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel --- drivers/net/dsa/mv88e6xxx/chip.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 64f4fdd02902..b3aa0e5bc842 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1404,9 +1404,6 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port) list_for_each_entry(dp, &dst->ports, list) { unsigned int bridge_num = dsa_port_bridge_num_get(dp); - if (!bridge_num) - continue; - if (bridge_num + dst->last_switch != dev) continue; -- 2.25.1
[PATCH net-next v3 03/18] net: dsa: sja1105: reorder sja1105_first_entry_longer_than with memory allocation
From: Vladimir Oltean sja1105_first_entry_longer_than() does not make use of the full struct sja1105_gate_entry *e, just of e->interval which is set from the passed entry_time. This means that if there is a gate conflict, we have allocated e for nothing, just to free it later. Reorder the memory allocation and the function call, to avoid that and simplify the error unwind path. Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel --- drivers/net/dsa/sja1105/sja1105_vl.c | 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c index 369be2ac3587..e5ea8eb9ec4e 100644 --- a/drivers/net/dsa/sja1105/sja1105_vl.c +++ b/drivers/net/dsa/sja1105/sja1105_vl.c @@ -36,7 +36,11 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, { struct sja1105_gate_entry *e; struct list_head *pos; - int rc; + + pos = sja1105_first_entry_longer_than(&gating_cfg->entries, + entry_time, extack); + if (IS_ERR(pos)) + return PTR_ERR(pos); e = kzalloc(sizeof(*e), GFP_KERNEL); if (!e) @@ -45,22 +49,11 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, e->rule = rule; e->gate_state = gate_state; e->interval = entry_time; - - pos = sja1105_first_entry_longer_than(&gating_cfg->entries, - e->interval, extack); - if (IS_ERR(pos)) { - rc = PTR_ERR(pos); - goto err; - } - list_add(&e->list, pos->prev); gating_cfg->num_entries++; return 0; -err: - kfree(e); - return rc; } /* The gate entries contain absolute times in their e->interval field. Convert -- 2.25.1
[PATCH net-next v3 00/18] Remove use of list iterator after loop body
When the list iterator loop does not exit early, the list iterator variable contains a type-confused pointer to a 'bogus' list element computed based on the head [1]. Often a 'found' variable is used to ensure the list iterator variable is only accessed after the loop body if the loop did exit early (using a break or goto). In other cases the list iterator variable is used to access the list member, which reverses the invocation of container_of() and brings back a "safe" pointer to the head of the list. Since quite a few bugs caused by this code pattern were discovered [2], Linus concluded that the rule should be to never use the list iterator after the loop and introduce a dedicated pointer for that [3]. With the new gnu11 standard, it will now be possible to limit the scope of the list iterator variable to the traversal loop itself by defining the variable within the for loop. This, however, requires removing all uses of the list iterator after the loop. Based on input from Paolo Abeni [4], Vinicius Costa Gomes [5], and Jakub Kicinski [6], I've split all the net-next related changes into two patch sets, where this is part 1. v2->v3: - fix commit authors and signed-off order regarding Vladimir's patches (Sorry about that, wasn't intentional.)
v1->v2: - Fixed commit message for PATCH 14/18 and used dedicated variable pointing to the position (Edward Cree) - Removed redundant check in mv88e6xxx_port_vlan() (Vladimir Oltean) - Refactor mv88e6xxx_port_vlan() using separate list iterator functions (Vladimir Oltean) - Refactor sja1105_insert_gate_entry() to use separate list iterator functions (Vladimir Oltean) - Allow early return in sja1105_insert_gate_entry() if sja1105_first_entry_longer_than() didn't find any element (Vladimir Oltean) - Use list_add_tail() instead of list_add() in sja1105_insert_gate_entry() (Jakub Kicinski) - net: netcp: also use separate 'pos' variable instead of duplicating list_add() Link: https://lwn.net/Articles/887097/ [1] Link: https://lore.kernel.org/linux-kernel/20220217184829.1991035-4-jakobkosc...@gmail.com/ [2] Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [3] Link: https://lore.kernel.org/linux-kernel/7393b673c626fd75f2b4f8509faa5459254fb87c.ca...@redhat.com/ [4] Link: https://lore.kernel.org/linux-kernel/877d8a3sww@intel.com/ [5] Link: https://lore.kernel.org/linux-kernel/20220403205502.1b344...@kernel.org/ [6]
[PATCH net-next v3 04/18] net: dsa: sja1105: use list_add_tail(pos) instead of list_add(pos->prev)
From: Vladimir Oltean When passed a non-head list element, list_add_tail() actually adds the new element to its left, which is what we want. Despite the slightly confusing name, use the dedicated function which does the same thing as the open-coded list_add(pos->prev). Suggested-by: Jakub Kicinski Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel --- drivers/net/dsa/sja1105/sja1105_vl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c index e5ea8eb9ec4e..7fe9b18f1cbd 100644 --- a/drivers/net/dsa/sja1105/sja1105_vl.c +++ b/drivers/net/dsa/sja1105/sja1105_vl.c @@ -49,7 +49,7 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, e->rule = rule; e->gate_state = gate_state; e->interval = entry_time; - list_add(&e->list, pos->prev); + list_add_tail(&e->list, pos); gating_cfg->num_entries++; -- 2.25.1
[PATCH net-next v3 01/18] connector: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need for a found variable; simply checking whether the dedicated variable was set determines if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/connector/cn_queue.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c index 996f025eb63c..ed77599b0b25 100644 --- a/drivers/connector/cn_queue.c +++ b/drivers/connector/cn_queue.c @@ -92,20 +92,19 @@ int cn_queue_add_callback(struct cn_queue_dev *dev, const char *name, void cn_queue_del_callback(struct cn_queue_dev *dev, const struct cb_id *id) { - struct cn_callback_entry *cbq, *n; - int found = 0; + struct cn_callback_entry *cbq = NULL, *iter, *n; spin_lock_bh(&dev->queue_lock); - list_for_each_entry_safe(cbq, n, &dev->queue_list, callback_entry) { - if (cn_cb_equal(&cbq->id.id, id)) { - list_del(&cbq->callback_entry); - found = 1; + list_for_each_entry_safe(iter, n, &dev->queue_list, callback_entry) { + if (cn_cb_equal(&iter->id.id, id)) { + list_del(&iter->callback_entry); + cbq = iter; break; } } spin_unlock_bh(&dev->queue_lock); - if (found) + if (cbq) cn_queue_release_callback(cbq); } -- 2.25.1
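The before/after shape of this conversion can be sketched in plain C. Everything below is a minimal userspace model, with a hypothetical cb_entry type standing in for cn_callback_entry:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cb_entry { int id; struct list_head link; };

/* Same shape as the reworked cn_queue_del_callback(): walk with a
 * dedicated iterator, publish the match through a pointer that stays
 * NULL when nothing matched, replacing the old 'int found' flag. */
static struct cb_entry *unlink_by_id(struct list_head *head, int id)
{
	struct cb_entry *found = NULL;
	struct list_head *p, *n;

	for (p = head->next, n = p->next; p != head; p = n, n = p->next) {
		struct cb_entry *iter = container_of(p, struct cb_entry, link);

		if (iter->id == id) {
			list_del(&iter->link);
			found = iter;
			break;
		}
	}
	return found;
}
```

A NULL return now carries the information the found flag used to, so the caller tests the pointer it actually needs.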
[PATCH net-next v3 02/18] net: dsa: sja1105: remove use of iterator after list_for_each_entry() loop
From: Vladimir Oltean The link below explains that there is a desire to syntactically change list_for_each_entry() and list_for_each() such that it becomes impossible to use the iterator variable outside the scope of the loop. Although sja1105_insert_gate_entry() makes legitimate use of the iterator pointer when it breaks out, the pattern it uses may become illegal, so it needs to change. It is deemed acceptable to use a copy of the loop iterator, and sja1105_insert_gate_entry() only needs to know the list_head element before which the list insertion should be made. So let's profit from the occasion and refactor the list iteration to a dedicated function. An additional benefit is given by the fact that with the helper function in place, we no longer need to special-case the empty list, since it is equivalent to not having found any gating entry larger than the specified interval in the list. We just need to insert at the tail of that list (list_add vs list_add_tail on an empty list does the same thing). 
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220407102900.3086255-3-jakobkosc...@gmail.com/#24810127 Signed-off-by: Vladimir Oltean Signed-off-by: Jakob Koschel --- drivers/net/dsa/sja1105/sja1105_vl.c | 46 ++-- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c index b7e95d60a6e4..369be2ac3587 100644 --- a/drivers/net/dsa/sja1105/sja1105_vl.c +++ b/drivers/net/dsa/sja1105/sja1105_vl.c @@ -7,6 +7,27 @@ #define SJA1105_SIZE_VL_STATUS 8 +static struct list_head * +sja1105_first_entry_longer_than(struct list_head *entries, + s64 interval, + struct netlink_ext_ack *extack) +{ + struct sja1105_gate_entry *p; + + list_for_each_entry(p, entries, list) { + if (p->interval == interval) { + NL_SET_ERR_MSG_MOD(extack, "Gate conflict"); + return ERR_PTR(-EBUSY); + } + + if (interval < p->interval) + return &p->list; + } + + /* Empty list, or specified interval is largest within the list */ + return entries; +} + /* Insert into the global gate list, sorted by gate action time. 
*/ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, struct sja1105_rule *rule, @@ -14,6 +35,7 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, struct netlink_ext_ack *extack) { struct sja1105_gate_entry *e; + struct list_head *pos; int rc; e = kzalloc(sizeof(*e), GFP_KERNEL); @@ -24,25 +46,15 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg, e->gate_state = gate_state; e->interval = entry_time; - if (list_empty(&gating_cfg->entries)) { - list_add(&e->list, &gating_cfg->entries); - } else { - struct sja1105_gate_entry *p; - - list_for_each_entry(p, &gating_cfg->entries, list) { - if (p->interval == e->interval) { - NL_SET_ERR_MSG_MOD(extack, - "Gate conflict"); - rc = -EBUSY; - goto err; - } - - if (e->interval < p->interval) - break; - } - list_add(&e->list, p->list.prev); + pos = sja1105_first_entry_longer_than(&gating_cfg->entries, + e->interval, extack); + if (IS_ERR(pos)) { + rc = PTR_ERR(pos); + goto err; } + list_add(&e->list, pos->prev); + gating_cfg->num_entries++; return 0; -- 2.25.1
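The helper-plus-insert split above can be modelled outside the kernel as follows; the list implementation is a minimal stand-in, 'gate' is an illustrative type, and NULL takes the place of ERR_PTR(-EBUSY) to keep the sketch self-contained:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct gate { long interval; struct list_head list; };

/* Analogue of sja1105_first_entry_longer_than(): return the node before
 * which the new entry belongs.  Returning the head covers both the
 * empty list and the largest-so-far case. */
static struct list_head *first_entry_longer_than(struct list_head *head,
						 long interval)
{
	struct list_head *p;

	for (p = head->next; p != head; p = p->next) {
		struct gate *g = container_of(p, struct gate, list);

		if (g->interval == interval)
			return NULL;		/* gate conflict */
		if (interval < g->interval)
			return p;
	}
	return head;
}

/* Sorted insert built on the helper, as in sja1105_insert_gate_entry(). */
static int insert_sorted(struct list_head *head, struct gate *g)
{
	struct list_head *pos = first_entry_longer_than(head, g->interval);

	if (!pos)
		return -1;
	list_add_tail(&g->list, pos);	/* insert before 'pos' */
	return 0;
}
```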
Re: [PATCH net-next v2 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
> On 12. Apr 2022, at 13:27, Russell King (Oracle) > wrote: > > On Tue, Apr 12, 2022 at 12:58:17PM +0200, Jakob Koschel wrote: >> We know that "dev > dst->last_switch" in the "else" block. >> In other words, that "dev - dst->last_switch" is > 0. >> >> dsa_port_bridge_num_get(dp) can be 0, but the check >> "if (bridge_num + dst->last_switch != dev) continue", rewritten as >> "if (bridge_num != dev - dst->last_switch) continue", aka >> "if (bridge_num != something which cannot be 0) continue", >> makes it redundant to have the extra "if (!bridge_num) continue" logic, >> since a bridge_num of zero would have been skipped anyway. >> >> Signed-off-by: Jakob Koschel >> Signed-off-by: Vladimir Oltean > > Isn't this Vladimir's patch? > > If so, it should be committed with Vladimir as the author, and Vladimir's > sign-off should appear before yours. When sending out such patches, > there should be a From: line for Vladimir as the first line in the body > of the patch email. Yes, you are right. I wasn't sure how to send those commits, but I guess if I just set the commit author correctly it should be fine. Regarding the order: I thought I did it correctly, going bottom up, but I confused the order; it wasn't on purpose. Sorry about that. I'll resend after verifying all the authors and sign-offs are correct. > > The same goes for the next mv88e6xxx patch in this series - I think > both of these are the patches Vladimir sent in his email: > > https://lore.kernel.org/r/20220408123101.p33jpynhqo67hebe@skbuf > > Thanks. > > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! Thanks for pointing it out! Jakob
Re: [PATCH net-next v2 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
On Tue, Apr 12, 2022 at 12:58:17PM +0200, Jakob Koschel wrote: > We know that "dev > dst->last_switch" in the "else" block. > In other words, that "dev - dst->last_switch" is > 0. > > dsa_port_bridge_num_get(dp) can be 0, but the check > "if (bridge_num + dst->last_switch != dev) continue", rewritten as > "if (bridge_num != dev - dst->last_switch) continue", aka > "if (bridge_num != something which cannot be 0) continue", > makes it redundant to have the extra "if (!bridge_num) continue" logic, > since a bridge_num of zero would have been skipped anyway. > > Signed-off-by: Jakob Koschel > Signed-off-by: Vladimir Oltean Isn't this Vladimir's patch? If so, it should be committed with Vladimir as the author, and Vladimir's sign-off should appear before yours. When sending out such patches, there should be a From: line for Vladimir as the first line in the body of the patch email. The same goes for the next mv88e6xxx patch in this series - I think both of these are the patches Vladimir sent in his email: https://lore.kernel.org/r/20220408123101.p33jpynhqo67hebe@skbuf Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
[PATCH net-next v2 18/18] team: Remove use of list iterator variable for list_for_each_entry_from()
In preparation to limit the scope of the list iterator variable to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or skip the iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/team/team.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index b07dde6f0abf..688c4393f099 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2425,17 +2425,17 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, int flags, team_nl_send_func_t *send_func, struct list_head *sel_opt_inst_list) { + struct team_option_inst *opt_inst, *tmp = NULL; struct nlattr *option_list; struct nlmsghdr *nlh; void *hdr; - struct team_option_inst *opt_inst; int err; struct sk_buff *skb = NULL; bool incomplete; int i; - opt_inst = list_first_entry(sel_opt_inst_list, - struct team_option_inst, tmp_list); + tmp = list_first_entry(sel_opt_inst_list, + struct team_option_inst, tmp_list); start_again: err = __send_and_alloc_skb(&skb, team, portid, send_func); @@ -2456,7 +2456,9 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, goto nla_put_failure; i = 0; + opt_inst = list_prepare_entry(tmp, sel_opt_inst_list, tmp_list); incomplete = false; + tmp = NULL; list_for_each_entry_from(opt_inst, sel_opt_inst_list, tmp_list) { err = team_nl_fill_one_option_get(skb, team, opt_inst); if (err) { @@ -2464,6 +2466,7 @@ static int team_nl_send_options_get(struct team *team, u32 portid, u32 seq, if (!i) goto errout; incomplete = true; + tmp = opt_inst; break; } goto errout; @@ 
-2707,14 +2710,14 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, struct nlattr *port_list; struct nlmsghdr *nlh; void *hdr; - struct team_port *port; + struct team_port *port, *tmp = NULL; int err; struct sk_buff *skb = NULL; bool incomplete; int i; - port = list_first_entry_or_null(&team->port_list, - struct team_port, list); + tmp = list_first_entry_or_null(&team->port_list, + struct team_port, list); start_again: err = __send_and_alloc_skb(&skb, team, portid, send_func); @@ -2744,7 +2747,9 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, err = team_nl_fill_one_port_get(skb, one_port); if (err) goto errout; - } else if (port) { + } else { + port = list_prepare_entry(tmp, &team->port_list, list); + tmp = NULL; list_for_each_entry_from(port, &team->port_list, list) { err = team_nl_fill_one_port_get(skb, port); if (err) { @@ -2752,6 +2757,7 @@ static int team_nl_send_port_list_get(struct team *team, u32 portid, u32 seq, if (!i) goto errout; incomplete = true; + tmp = port; break; } goto errout; -- 2.25.1
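The remember-and-resume idea used in this patch can be sketched in userspace. Names and the chunking logic below are illustrative; the real code resumes with list_prepare_entry() plus list_for_each_entry_from(), which operate on entry types rather than raw list_head pointers:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct opt { int val; struct list_head list; };

/* Consume at most 'budget' entries per call; '*cursor' remembers the
 * last entry fully handled (NULL means start from the beginning), the
 * way 'tmp' does in team_nl_send_options_get(). */
static int sum_some(struct list_head *head, struct list_head **cursor,
		    int budget, int *sum)
{
	struct list_head *p = *cursor ? *cursor : head;

	for (p = p->next; p != head; p = p->next) {
		if (budget-- == 0) {
			*cursor = p->prev;	/* resume point */
			return 1;		/* incomplete, call again */
		}
		*sum += container_of(p, struct opt, list)->val;
	}
	*cursor = NULL;
	return 0;				/* done */
}
```

The iterator itself never escapes the function; only the opaque cursor does, which is the property the patch is after.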
[PATCH net-next v2 17/18] ipvlan: Remove usage of list iterator variable for the loop body
In preparation to limit the scope of the list iterator variable to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or start a new iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ipvlan/ipvlan_main.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index 696e245f6d00..063d7c30e944 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -9,7 +9,7 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, struct netlink_ext_ack *extack) { - struct ipvl_dev *ipvlan; + struct ipvl_dev *ipvlan, *tmp = NULL; unsigned int flags; int err; @@ -26,8 +26,10 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, flags & ~IFF_NOARP, extack); } - if (unlikely(err)) + if (unlikely(err)) { + tmp = ipvlan; goto fail; + } } if (nval == IPVLAN_MODE_L3S) { /* New mode is L3S */ @@ -43,6 +45,7 @@ static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval, return 0; fail: + ipvlan = list_prepare_entry(tmp, &port->ipvlans, pnode); /* Undo the flags changes that have been done so far. */ list_for_each_entry_continue_reverse(ipvlan, &port->ipvlans, pnode) { flags = ipvlan->dev->flags; -- 2.25.1
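The unwind pattern here, apply a change to each element until one fails and then roll back in reverse, can be sketched as follows (minimal list model; 'pdev' and the injected failure are illustrative stand-ins for ipvl_dev and a dev_change_flags() error):

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct pdev { int flags; struct list_head list; };

/* Apply a flag to every node; on a simulated failure, remember where we
 * stopped and undo in reverse from just before that node, mirroring the
 * saved position plus list_for_each_entry_continue_reverse() unwind. */
static int set_all_or_none(struct list_head *head, struct pdev *fail_at)
{
	struct list_head *p, *stop = NULL;

	for (p = head->next; p != head; p = p->next) {
		struct pdev *d = container_of(p, struct pdev, list);

		if (d == fail_at) {	/* injected mid-list error */
			stop = p;
			break;
		}
		d->flags = 1;
	}
	if (!stop)
		return 0;

	for (p = stop->prev; p != head; p = p->prev)	/* reverse unwind */
		container_of(p, struct pdev, list)->flags = 0;
	return -1;
}
```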
[PATCH net-next v2 16/18] ps3_gelic: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need for a found variable; simply checking whether the dedicated variable was set determines if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- .../net/ethernet/toshiba/ps3_gelic_wireless.c | 30 +-- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c index dc14a66583ff..c8a016c902cd 100644 --- a/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c +++ b/drivers/net/ethernet/toshiba/ps3_gelic_wireless.c @@ -1495,14 +1495,14 @@ static int gelic_wl_start_scan(struct gelic_wl_info *wl, int always_scan, */ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) { + struct gelic_wl_scan_info *target = NULL, *iter, *tmp; struct gelic_eurus_cmd *cmd = NULL; - struct gelic_wl_scan_info *target, *tmp; struct gelic_wl_scan_info *oldest = NULL; struct gelic_eurus_scan_info *scan_info; unsigned int scan_info_size; union iwreq_data data; unsigned long this_time = jiffies; - unsigned int data_len, i, found, r; + unsigned int data_len, i, r; void *buf; pr_debug("%s:start\n", __func__); @@ -1539,14 +1539,14 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) wl->scan_stat = GELIC_WL_SCAN_STAT_GOT_LIST; /* mark all entries are old */ - list_for_each_entry_safe(target, tmp, &wl->network_list, list) { - target->valid = 0; + list_for_each_entry_safe(iter, tmp, &wl->network_list, list) { + iter->valid = 0; /* expire too old entries */ - if (time_before(target->last_scanned + wl->scan_age, + if (time_before(iter->last_scanned
+ wl->scan_age, this_time)) { - kfree(target->hwinfo); - target->hwinfo = NULL; - list_move_tail(&target->list, &wl->network_free_list); + kfree(iter->hwinfo); + iter->hwinfo = NULL; + list_move_tail(&iter->list, &wl->network_free_list); } } @@ -1569,22 +1569,22 @@ static void gelic_wl_scan_complete_event(struct gelic_wl_info *wl) continue; } - found = 0; + target = NULL; oldest = NULL; - list_for_each_entry(target, &wl->network_list, list) { - if (ether_addr_equal(&target->hwinfo->bssid[2], + list_for_each_entry(iter, &wl->network_list, list) { + if (ether_addr_equal(&iter->hwinfo->bssid[2], &scan_info->bssid[2])) { - found = 1; + target = iter; pr_debug("%s: same BBS found scanned list\n", __func__); break; } if (!oldest || - (target->last_scanned < oldest->last_scanned)) - oldest = target; + (iter->last_scanned < oldest->last_scanned)) + oldest = iter; } - if (!found) { + if (!target) { /* not found in the list */ if (list_empty(&wl->network_free_list)) { /* expire oldest */ -- 2.25.1
[PATCH net-next v2 15/18] net: netcp: Remove usage of list iterator for list_add() after loop body
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer pointing to the location where the element should be inserted [1]. Before, the code implicitly used the head when no element was found when using &next->list. The new 'pos' variable is set to the list head by default and overwritten if the list exits early, marking the insertion point for list_add(). Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/ti/netcp_core.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c index 16507bff652a..f25104b5a31b 100644 --- a/drivers/net/ethernet/ti/netcp_core.c +++ b/drivers/net/ethernet/ti/netcp_core.c @@ -471,6 +471,7 @@ struct netcp_hook_list { int netcp_register_txhook(struct netcp_intf *netcp_priv, int order, netcp_hook_rtn *hook_rtn, void *hook_data) { + struct list_head *pos = &netcp_priv->txhook_list_head; struct netcp_hook_list *entry; struct netcp_hook_list *next; unsigned long flags; @@ -485,10 +486,12 @@ int netcp_register_txhook(struct netcp_intf *netcp_priv, int order, spin_lock_irqsave(&netcp_priv->lock, flags); list_for_each_entry(next, &netcp_priv->txhook_list_head, list) { - if (next->order > order) + if (next->order > order) { + pos = &next->list; break; + } } - __list_add(&entry->list, next->list.prev, &next->list); + list_add_tail(&entry->list, pos); spin_unlock_irqrestore(&netcp_priv->lock, flags); return 0; @@ -520,6 +523,7 @@ EXPORT_SYMBOL_GPL(netcp_unregister_txhook); int netcp_register_rxhook(struct netcp_intf *netcp_priv, int order, netcp_hook_rtn *hook_rtn, void *hook_data) { + struct list_head *pos = &netcp_priv->rxhook_list_head; struct netcp_hook_list *entry; struct netcp_hook_list *next; unsigned long flags; @@ -534,10 +538,12 @@ int netcp_register_rxhook(struct netcp_intf *netcp_priv, int 
order, spin_lock_irqsave(&netcp_priv->lock, flags); list_for_each_entry(next, &netcp_priv->rxhook_list_head, list) { - if (next->order > order) + if (next->order > order) { + pos = &next->list; break; + } } - __list_add(&entry->list, next->list.prev, &next->list); + list_add_tail(&entry->list, pos); spin_unlock_irqrestore(&netcp_priv->lock, flags); return 0; -- 2.25.1
[PATCH net-next v2 14/18] sfc: Remove usage of list iterator for list_add() after the loop body
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer pointing to the location where the element should be inserted [1]. Before, the code implicitly used the head when no element was found when using &new->list. The new 'pos' variable is set to the list head by default and overwritten if the list exits early, marking the insertion point for list_add(). Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/sfc/rx_common.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c index 1b22c7be0088..80894a35ea79 100644 --- a/drivers/net/ethernet/sfc/rx_common.c +++ b/drivers/net/ethernet/sfc/rx_common.c @@ -555,7 +555,7 @@ efx_rx_packet_gro(struct efx_channel *channel, struct efx_rx_buffer *rx_buf, */ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) { - struct list_head *head = &efx->rss_context.list; + struct list_head *head = &efx->rss_context.list, *pos = head; struct efx_rss_context *ctx, *new; u32 id = 1; /* Don't use zero, that refers to the master RSS context */ @@ -563,8 +563,10 @@ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) /* Search for first gap in the numbering */ list_for_each_entry(ctx, head, list) { - if (ctx->user_id != id) + if (ctx->user_id != id) { + pos = &ctx->list; break; + } id++; /* Check for wrap. If this happens, we have nearly 2^32 * allocated RSS contexts, which seems unlikely. */ @@ -582,7 +584,7 @@ struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx) /* Insert the new entry into the gap */ new->user_id = id; - list_add_tail(&new->list, &ctx->list); + list_add_tail(&new->list, pos); return new; } -- 2.25.1
[PATCH net-next v2 13/18] net: qede: Remove check of list iterator against head past the loop body
When list_for_each_entry() completes the iteration over the whole list without breaking the loop, the iterator value will be a bogus pointer computed based on the head element. While it is safe to use the pointer to determine if it was computed based on the head element, either with list_entry_is_head() or &pos->member == head, using the iterator variable after the loop should be avoided. In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qede/qede_filter.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c index 3010833ddde3..3d167e37e654 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c @@ -829,18 +829,21 @@ int qede_configure_vlan_filters(struct qede_dev *edev) int qede_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid) { struct qede_dev *edev = netdev_priv(dev); - struct qede_vlan *vlan; + struct qede_vlan *vlan = NULL; + struct qede_vlan *iter; int rc = 0; DP_VERBOSE(edev, NETIF_MSG_IFDOWN, "Removing vlan 0x%04x\n", vid); /* Find whether entry exists */ __qede_lock(edev); - list_for_each_entry(vlan, &edev->vlan_list, list) - if (vlan->vid == vid) + list_for_each_entry(iter, &edev->vlan_list, list) + if (iter->vid == vid) { + vlan = iter; break; + } - if (list_entry_is_head(vlan, &edev->vlan_list, list)) { + if (!vlan) { DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN), "Vlan isn't configured\n"); goto out; -- 2.25.1
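Why a dedicated pointer is preferable to the list_entry_is_head() check can be seen in a small userspace model; the 'vlan' type below is a hypothetical stand-in for qede_vlan:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct vlan { int vid; struct list_head list; };

/* Dedicated found pointer: NULL cleanly signals "ran off the end",
 * instead of comparing a head-derived pointer after the loop.
 * container_of(head, struct vlan, list) would point just before the
 * head node, at memory that is not a struct vlan at all. */
static struct vlan *find_vid(struct list_head *head, int vid)
{
	struct vlan *found = NULL;
	struct list_head *p;

	for (p = head->next; p != head; p = p->next) {
		struct vlan *iter = container_of(p, struct vlan, list);

		if (iter->vid == vid) {
			found = iter;
			break;
		}
	}
	return found;
}
```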
[PATCH net-next v2 12/18] net: qede: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need for a found variable; simply checking whether the dedicated variable was set determines if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qede/qede_rdma.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qede/qede_rdma.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c index 6304514a6f2c..2eb03ffe2484 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_rdma.c +++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c @@ -246,18 +246,17 @@ static void qede_rdma_change_mtu(struct qede_dev *edev) static struct qede_rdma_event_work * qede_rdma_get_free_event_node(struct qede_dev *edev) { - struct qede_rdma_event_work *event_node = NULL; - bool found = false; + struct qede_rdma_event_work *event_node = NULL, *iter; - list_for_each_entry(event_node, &edev->rdma_info.rdma_event_list, + list_for_each_entry(iter, &edev->rdma_info.rdma_event_list, list) { - if (!work_pending(&event_node->work)) { - found = true; + if (!work_pending(&iter->work)) { + event_node = iter; break; } } - if (!found) { + if (!event_node) { event_node = kzalloc(sizeof(*event_node), GFP_ATOMIC); if (!event_node) { DP_NOTICE(edev, -- 2.25.1
[PATCH net-next v2 11/18] qed: Remove usage of list iterator variable after the loop
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. Since "found" and "p_ent" need to be equal, "found" should be used consistently to limit the scope of "p_ent" to the list traversal in the future. Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_spq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c index d01b9245f811..cbaa2abed660 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_spq.c +++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c @@ -934,10 +934,10 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, u8 fw_return_code, union event_ring_data *p_data) { + struct qed_spq_entry*found = NULL; struct qed_spq *p_spq; - struct qed_spq_entry*p_ent = NULL; + struct qed_spq_entry*p_ent; struct qed_spq_entry*tmp; - struct qed_spq_entry*found = NULL; if (!p_hwfn) return -EINVAL; @@ -980,7 +980,7 @@ int qed_spq_completion(struct qed_hwfn *p_hwfn, DP_VERBOSE(p_hwfn, QED_MSG_SPQ, "Complete EQE [echo %04x]: func %p cookie %p)\n", le16_to_cpu(echo), - p_ent->comp_cb.function, p_ent->comp_cb.cookie); + found->comp_cb.function, found->comp_cb.cookie); if (found->comp_cb.function) found->comp_cb.function(p_hwfn, found->comp_cb.cookie, p_data, fw_return_code); -- 2.25.1
[PATCH net-next v2 10/18] qed: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need for a found variable; simply checking whether the dedicated variable was set determines if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 26 ++--- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c index 1d1d4caad680..198c9321bf51 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -1630,38 +1630,36 @@ static struct qed_iwarp_listener * qed_iwarp_get_listener(struct qed_hwfn *p_hwfn, struct qed_iwarp_cm_info *cm_info) { - struct qed_iwarp_listener *listener = NULL; + struct qed_iwarp_listener *listener = NULL, *iter; static const u32 ip_zero[4] = { 0, 0, 0, 0 }; - bool found = false; - list_for_each_entry(listener, + list_for_each_entry(iter, &p_hwfn->p_rdma_info->iwarp.listen_list, list_entry) { - if (listener->port == cm_info->local_port) { - if (!memcmp(listener->ip_addr, + if (iter->port == cm_info->local_port) { + if (!memcmp(iter->ip_addr, ip_zero, sizeof(ip_zero))) { - found = true; + listener = iter; break; } - if (!memcmp(listener->ip_addr, + if (!memcmp(iter->ip_addr, cm_info->local_ip, sizeof(cm_info->local_ip)) && - (listener->vlan == cm_info->vlan)) { - found = true; + iter->vlan == cm_info->vlan) { + listener = iter; break; } } } - if (found) { + if (listener) DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener found = %p\n", listener); - return listener; - } + else + DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener not found\n"); -
DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "listener not found\n"); - return NULL; + return listener; } static int -- 2.25.1
[PATCH net-next v2 09/18] qed: Use dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- drivers/net/ethernet/qlogic/qed/qed_dev.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index 672480c9d195..e920e7dcf66a 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -174,7 +174,7 @@ int qed_db_recovery_add(struct qed_dev *cdev, int qed_db_recovery_del(struct qed_dev *cdev, void __iomem *db_addr, void *db_data) { - struct qed_db_recovery_entry *db_entry = NULL; + struct qed_db_recovery_entry *db_entry = NULL, *iter; struct qed_hwfn *p_hwfn; int rc = -EINVAL; @@ -190,12 +190,13 @@ int qed_db_recovery_del(struct qed_dev *cdev, /* Protect the list */ spin_lock_bh(&p_hwfn->db_recovery_info.lock); - list_for_each_entry(db_entry, + list_for_each_entry(iter, &p_hwfn->db_recovery_info.list, list_entry) { /* search according to db_data addr since db_addr is not unique (roce) */ - if (db_entry->db_data == db_data) { - qed_db_recovery_dp_entry(p_hwfn, db_entry, "Deleting"); - list_del(&db_entry->list_entry); + if (iter->db_data == db_data) { + qed_db_recovery_dp_entry(p_hwfn, iter, "Deleting"); + list_del(&iter->list_entry); + db_entry = iter; rc = 0; break; } -- 2.25.1
[PATCH net-next v2 08/18] net: sparx5: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need for a found variable; simply checking whether the dedicated variable was set determines if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel --- .../microchip/sparx5/sparx5_mactable.c| 25 +-- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c index a5837dbe0c7e..bb8d9ce79ac2 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c @@ -362,8 +362,7 @@ static void sparx5_mact_handle_entry(struct sparx5 *sparx5, unsigned char mac[ETH_ALEN], u16 vid, u32 cfg2) { - struct sparx5_mact_entry *mact_entry; - bool found = false; + struct sparx5_mact_entry *mact_entry = NULL, *iter; u16 port; if (LRN_MAC_ACCESS_CFG_2_MAC_ENTRY_ADDR_TYPE_GET(cfg2) != @@ -378,28 +377,28 @@ static void sparx5_mact_handle_entry(struct sparx5 *sparx5, return; mutex_lock(&sparx5->mact_lock); - list_for_each_entry(mact_entry, &sparx5->mact_entries, list) { - if (mact_entry->vid == vid && - ether_addr_equal(mac, mact_entry->mac)) { - found = true; - mact_entry->flags |= MAC_ENT_ALIVE; - if (mact_entry->port != port) { + list_for_each_entry(iter, &sparx5->mact_entries, list) { + if (iter->vid == vid && + ether_addr_equal(mac, iter->mac)) { + iter->flags |= MAC_ENT_ALIVE; + if (iter->port != port) { dev_warn(sparx5->dev, "Entry move: %d -> %d\n", -mact_entry->port, port); - mact_entry->port = port; - mact_entry->flags |= MAC_ENT_MOVED; +iter->port, port); + iter->port = port; + 
iter->flags |= MAC_ENT_MOVED; } /* Entry handled */ + mact_entry = iter; break; } } mutex_unlock(&sparx5->mact_lock); - if (found && !(mact_entry->flags & MAC_ENT_MOVED)) + if (mact_entry && !(mact_entry->flags & MAC_ENT_MOVED)) /* Present, not moved */ return; - if (!found) { + if (!mact_entry) { /* Entry not found - now add */ mact_entry = alloc_mact_entry(sparx5, mac, vid, port); if (!mact_entry) -- 2.25.1
[PATCH net-next v2 07/18] net: dsa: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*()
macros in the future, use of the list iterator variable after the loop
body should be avoided. To *never* use the list iterator variable after
the loop, it was concluded to use a separate iterator variable instead
of a 'found' boolean [1]. Whether the break/goto was hit can then be
determined by simply checking whether that variable was set, removing
the need for a separate 'found' variable.

Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel
---
 net/dsa/dsa.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 89c6c86e746f..645522c4dd4a 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -112,22 +112,21 @@ const struct dsa_device_ops *dsa_find_tagger_by_name(const char *buf)
 
 const struct dsa_device_ops *dsa_tag_driver_get(int tag_protocol)
 {
-	struct dsa_tag_driver *dsa_tag_driver;
+	struct dsa_tag_driver *dsa_tag_driver = NULL, *iter;
 	const struct dsa_device_ops *ops;
-	bool found = false;
 
 	request_module("%s%d", DSA_TAG_DRIVER_ALIAS, tag_protocol);
 
 	mutex_lock(&dsa_tag_drivers_lock);
-	list_for_each_entry(dsa_tag_driver, &dsa_tag_drivers_list, list) {
-		ops = dsa_tag_driver->ops;
+	list_for_each_entry(iter, &dsa_tag_drivers_list, list) {
+		ops = iter->ops;
 		if (ops->proto == tag_protocol) {
-			found = true;
+			dsa_tag_driver = iter;
 			break;
 		}
 	}
 
-	if (found) {
+	if (dsa_tag_driver) {
 		if (!try_module_get(dsa_tag_driver->owner))
 			ops = ERR_PTR(-ENOPROTOOPT);
 	} else {
--
2.25.1
[PATCH net-next v2 06/18] net: dsa: mv88e6xxx: refactor mv88e6xxx_port_vlan()
To avoid bugs and speculative execution exploits due to type-confused
pointers at the end of a list_for_each_entry() loop, one measure is to
restrict code to not use the iterator variable outside the loop block.

In the case of mv88e6xxx_port_vlan() this isn't a problem, as we never
let the loops exit through "natural causes" anyway, by using a "found"
variable and then using the last "dp" iterator prior to the break, which
is a safe thing to do. Nonetheless, with the expected new syntax, this
pattern will no longer be possible.

Profit off of the occasion and break the two port finding methods into
smaller sub-functions. Somehow, returning a copy of the iterator pointer
is still accepted. This change makes it redundant to have a "bool found",
since the "dp" from mv88e6xxx_port_vlan() now holds NULL if we haven't
found what we were looking for.

Signed-off-by: Jakob Koschel
Signed-off-by: Vladimir Oltean
---
 drivers/net/dsa/mv88e6xxx/chip.c | 54 ++--
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index b3aa0e5bc842..1f35e89053e6 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1378,42 +1378,50 @@ static int mv88e6xxx_set_mac_eee(struct dsa_switch *ds, int port,
 	return 0;
 }
 
+static struct dsa_port *mv88e6xxx_find_port(struct dsa_switch_tree *dst,
+					    int sw_index, int port)
+{
+	struct dsa_port *dp;
+
+	list_for_each_entry(dp, &dst->ports, list)
+		if (dp->ds->index == sw_index && dp->index == port)
+			return dp;
+
+	return NULL;
+}
+
+static struct dsa_port *
+mv88e6xxx_find_port_by_bridge_num(struct dsa_switch_tree *dst,
+				  unsigned int bridge_num)
+{
+	struct dsa_port *dp;
+
+	list_for_each_entry(dp, &dst->ports, list)
+		if (dsa_port_bridge_num_get(dp) == bridge_num)
+			return dp;
+
+	return NULL;
+}
+
 /* Mask of the local ports allowed to receive frames from a given fabric port */
 static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev,
 			       int port)
 {
 	struct dsa_switch *ds = chip->ds;
 	struct dsa_switch_tree *dst = ds->dst;
 	struct dsa_port *dp, *other_dp;
-	bool found = false;
 	u16 pvlan;
 
-	/* dev is a physical switch */
 	if (dev <= dst->last_switch) {
-		list_for_each_entry(dp, &dst->ports, list) {
-			if (dp->ds->index == dev && dp->index == port) {
-				/* dp might be a DSA link or a user port, so it
-				 * might or might not have a bridge.
-				 * Use the "found" variable for both cases.
-				 */
-				found = true;
-				break;
-			}
-		}
-	/* dev is a virtual bridge */
+		/* dev is a physical switch */
+		dp = mv88e6xxx_find_port(dst, dev, port);
 	} else {
-		list_for_each_entry(dp, &dst->ports, list) {
-			unsigned int bridge_num = dsa_port_bridge_num_get(dp);
-
-			if (bridge_num + dst->last_switch != dev)
-				continue;
-
-			found = true;
-			break;
-		}
+		/* dev is a virtual bridge */
+		dp = mv88e6xxx_find_port_by_bridge_num(dst,
+						       dev - dst->last_switch);
 	}
 
 	/* Prevent frames from unknown switch or virtual bridge */
-	if (!found)
+	if (!dp)
 		return 0;
 
 	/* Frames from DSA links and CPU ports can egress any local port */
--
2.25.1
[PATCH net-next v2 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
We know that "dev > dst->last_switch" in the "else" block. In other
words, "dev - dst->last_switch" is > 0.

dsa_port_bridge_num_get(dp) can be 0, but the check
"if (bridge_num + dst->last_switch != dev) continue", rewritten as
"if (bridge_num != dev - dst->last_switch) continue", aka
"if (bridge_num != something which cannot be 0) continue", makes it
redundant to have the extra "if (!bridge_num) continue" logic, since a
bridge_num of zero would have been skipped anyway.

Signed-off-by: Jakob Koschel
Signed-off-by: Vladimir Oltean
---
 drivers/net/dsa/mv88e6xxx/chip.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 64f4fdd02902..b3aa0e5bc842 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1404,9 +1404,6 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
 		list_for_each_entry(dp, &dst->ports, list) {
 			unsigned int bridge_num = dsa_port_bridge_num_get(dp);
 
-			if (!bridge_num)
-				continue;
-
 			if (bridge_num + dst->last_switch != dev)
 				continue;
 
--
2.25.1
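The argument is purely arithmetic and can be checked exhaustively in a few lines. In this standalone sketch, `last_switch` and `bridge_num` are plain integers standing in for the DSA fields, and the two helper names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Original test, with the now-redundant !bridge_num early-continue */
static bool matches_with_check(unsigned int bridge_num, int last_switch, int dev)
{
	if (!bridge_num)
		return false;
	return (int)bridge_num + last_switch == dev;
}

/* Simplified test: in the "else" block dev > last_switch, so
 * dev - last_switch > 0 and bridge_num == 0 can never satisfy the
 * equality anyway.
 */
static bool matches_without_check(unsigned int bridge_num, int last_switch, int dev)
{
	return (int)bridge_num + last_switch == dev;
}
```

Iterating both functions over all small inputs with `dev > last_switch` shows they never disagree, which is exactly the claim the patch makes.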
[PATCH net-next v2 04/18] net: dsa: sja1105: use list_add_tail(pos) instead of list_add(pos->prev)
When passed a non-head list element, list_add_tail() actually adds the
new element to its left, which is what we want. Despite the slightly
confusing name, use the dedicated function which does the same thing as
the open-coded list_add(pos->prev).

Suggested-by: Jakub Kicinski
Signed-off-by: Jakob Koschel
Signed-off-by: Vladimir Oltean
---
 drivers/net/dsa/sja1105/sja1105_vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c
index e5ea8eb9ec4e..7fe9b18f1cbd 100644
--- a/drivers/net/dsa/sja1105/sja1105_vl.c
+++ b/drivers/net/dsa/sja1105/sja1105_vl.c
@@ -49,7 +49,7 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg,
 	e->rule = rule;
 	e->gate_state = gate_state;
 	e->interval = entry_time;
-	list_add(&e->list, pos->prev);
+	list_add_tail(&e->list, pos);
 
 	gating_cfg->num_entries++;
 
--
2.25.1
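The claimed equivalence follows directly from the definitions of the two helpers. A minimal userspace re-implementation of the kernel's doubly linked list primitives (same logic as include/linux/list.h, reproduced here only for the sketch) makes it easy to see that `list_add_tail(new, pos)` and `list_add(new, pos->prev)` splice `new` into the same spot, immediately before `pos`:

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void __list_add(struct list_head *new,
		       struct list_head *prev, struct list_head *next)
{
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/* Insert 'new' right after 'head' */
static void list_add(struct list_head *new, struct list_head *head)
{
	__list_add(new, head, head->next);
}

/* Insert 'new' right before 'head' */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
	__list_add(new, head->prev, head);
}
```

Both spellings expand to `__list_add(new, pos->prev, pos)`, so the resulting links are identical; the patch only picks the clearer name.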
[PATCH net-next v2 00/18] Remove use of list iterator after loop body
When the list iterator loop does not exit early, the list iterator
variable contains a type-confused pointer to a 'bogus' list element
computed based on the head [1]. Often a 'found' variable is used to
ensure the list iterator variable is only accessed after the loop body
if the loop did exit early (using a break or goto). In other cases the
list iterator variable is used in combination with list_entry() to
access the list member, which reverses the invocation of container_of()
and brings back a "safe" pointer to the head of the list.

Since quite a few bugs were discovered due to this code pattern [2],
Linus concluded that the rule should be to never use the list iterator
after the loop and to introduce a dedicated pointer for that [3].

With the new gnu11 standard it will now be possible to limit the scope
of the list iterator variable to the traversal loop itself, by defining
the variable within the for loop. This, however, requires removing all
uses of the list iterator after the loop.

Based on input from Paolo Abeni [4], Vinicius Costa Gomes [5], and
Jakub Kicinski [6], I've split all the net-next related changes into
two patch sets, where this is part 1.

v1->v2:
- Fixed commit message for PATCH 14/18 and used a dedicated variable
  pointing to the position (Edward Cree)
- Removed redundant check in mv88e6xxx_port_vlan() (Vladimir Oltean)
- Refactored mv88e6xxx_port_vlan() using separate list iterator
  functions (Vladimir Oltean)
- Refactored sja1105_insert_gate_entry() to use separate list iterator
  functions (Vladimir Oltean)
- Allow early return in sja1105_insert_gate_entry() if
  sja1105_first_entry_longer_than() didn't find any element
  (Vladimir Oltean)
- Use list_add_tail() instead of list_add() in
  sja1105_insert_gate_entry() (Jakub Kicinski)
- net: netcp: also use a separate 'pos' variable instead of duplicating
  list_add()

Link: https://lwn.net/Articles/887097/ [1]
Link: https://lore.kernel.org/linux-kernel/20220217184829.1991035-4-jakobkosc...@gmail.com/ [2]
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [3]
Link: https://lore.kernel.org/linux-kernel/7393b673c626fd75f2b4f8509faa5459254fb87c.ca...@redhat.com/ [4]
Link: https://lore.kernel.org/linux-kernel/877d8a3sww@intel.com/ [5]
Link: https://lore.kernel.org/linux-kernel/20220403205502.1b344...@kernel.org/ [6]

Jakob Koschel (18):
  connector: Replace usage of found with dedicated list iterator variable
  net: dsa: sja1105: remove use of iterator after list_for_each_entry() loop
  net: dsa: sja1105: reorder sja1105_first_entry_longer_than with memory allocation
  net: dsa: sja1105: use list_add_tail(pos) instead of list_add(pos->prev)
  net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()
  net: dsa: mv88e6xxx: refactor mv88e6xxx_port_vlan()
  net: dsa: Replace usage of found with dedicated list iterator variable
  net: sparx5: Replace usage of found with dedicated list iterator variable
  qed: Use dedicated list iterator variable
  qed: Replace usage of found with dedicated list iterator variable
  qed: Remove usage of list iterator variable after the loop
  net: qede: Replace usage of found with dedicated list iterator variable
  net: qede: Remove check of list iterator against head past the loop body
  sfc: Remove usage of list iterator for list_add() after the loop body
  net: netcp: Remove usage of list iterator for list_add() after loop body
  ps3_gelic: Replace usage of found with dedicated list iterator variable
  ipvlan: Remove usage of list iterator variable for the loop body
  team: Remove use of list iterator variable for list_for_each_entry_from()

 drivers/connector/cn_queue.c                  | 13 ++---
 drivers/net/dsa/mv88e6xxx/chip.c              | 57 ++-
 drivers/net/dsa/sja1105/sja1105_vl.c          | 51 +
 .../microchip/sparx5/sparx5_mactable.c        | 25 
 drivers/net/ethernet/qlogic/qed/qed_dev.c     | 11 ++--
 drivers/net/ethernet/qlogic/qed/qed_iwarp.c   | 26 -
 drivers/net/ethernet/qlogic/qed/qed_spq.c     |  6 +-
 .../net/ethernet/qlogic/qede/qede_filter.c    | 11 ++--
 drivers/net/ethernet/qlogic/qede/qede_rdma.c  | 11 ++--
 drivers/net/ethernet/sfc/rx_common.c          |  8 ++-
 drivers/net/ethernet/ti/netcp_core.c          | 14 +++--
 .../net/ethernet/toshiba/ps3_gelic_wireless.c | 30 +-
 drivers/net/ipvlan/ipvlan_main.c              |  7 ++-
 drivers/net/team/team.c                       | 20 ---
 net/dsa/dsa.c                                 | 11 ++--
 15 files changed, 163 insertions(+), 138 deletions(-)

base-commit: 3e732ebf7316ac83e8562db7e64cc68aec390a18
--
2.25.1
[PATCH net-next v2 02/18] net: dsa: sja1105: remove use of iterator after list_for_each_entry() loop
The link below explains that there is a desire to syntactically change
list_for_each_entry() and list_for_each() such that it becomes
impossible to use the iterator variable outside the scope of the loop.

Although sja1105_insert_gate_entry() makes legitimate use of the
iterator pointer when it breaks out, the pattern it uses may become
illegal, so it needs to change. It is deemed acceptable to use a copy
of the loop iterator, and sja1105_insert_gate_entry() only needs to
know the list_head element before which the list insertion should be
made. So let's profit from the occasion and refactor the list iteration
to a dedicated function.

An additional benefit is given by the fact that with the helper
function in place, we no longer need to special-case the empty list,
since it is equivalent to not having found any gating entry larger than
the specified interval in the list. We just need to insert at the tail
of that list (list_add vs list_add_tail on an empty list does the same
thing).

Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220407102900.3086255-3-jakobkosc...@gmail.com/#24810127
Signed-off-by: Jakob Koschel
Signed-off-by: Vladimir Oltean
---
 drivers/net/dsa/sja1105/sja1105_vl.c | 46 ++--
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c
index b7e95d60a6e4..369be2ac3587 100644
--- a/drivers/net/dsa/sja1105/sja1105_vl.c
+++ b/drivers/net/dsa/sja1105/sja1105_vl.c
@@ -7,6 +7,27 @@
 
 #define SJA1105_SIZE_VL_STATUS			8
 
+static struct list_head *
+sja1105_first_entry_longer_than(struct list_head *entries,
+				s64 interval,
+				struct netlink_ext_ack *extack)
+{
+	struct sja1105_gate_entry *p;
+
+	list_for_each_entry(p, entries, list) {
+		if (p->interval == interval) {
+			NL_SET_ERR_MSG_MOD(extack, "Gate conflict");
+			return ERR_PTR(-EBUSY);
+		}
+
+		if (interval < p->interval)
+			return &p->list;
+	}
+
+	/* Empty list, or specified interval is largest within the list */
+	return entries;
+}
+
 /* Insert into the global gate list, sorted by gate action time. */
 static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg,
 				     struct sja1105_rule *rule,
@@ -14,6 +35,7 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg,
 				     struct netlink_ext_ack *extack)
 {
 	struct sja1105_gate_entry *e;
+	struct list_head *pos;
 	int rc;
 
 	e = kzalloc(sizeof(*e), GFP_KERNEL);
@@ -24,25 +46,15 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg,
 	e->gate_state = gate_state;
 	e->interval = entry_time;
 
-	if (list_empty(&gating_cfg->entries)) {
-		list_add(&e->list, &gating_cfg->entries);
-	} else {
-		struct sja1105_gate_entry *p;
-
-		list_for_each_entry(p, &gating_cfg->entries, list) {
-			if (p->interval == e->interval) {
-				NL_SET_ERR_MSG_MOD(extack,
-						   "Gate conflict");
-				rc = -EBUSY;
-				goto err;
-			}
-
-			if (e->interval < p->interval)
-				break;
-		}
-		list_add(&e->list, p->list.prev);
+	pos = sja1105_first_entry_longer_than(&gating_cfg->entries,
+					      e->interval, extack);
+	if (IS_ERR(pos)) {
+		rc = PTR_ERR(pos);
+		goto err;
 	}
 
+	list_add(&e->list, pos->prev);
+
 	gating_cfg->num_entries++;
 
 	return 0;
--
2.25.1
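The helper's contract is easy to exercise in isolation: return the position before which to insert, signal a conflict for an exact duplicate, and return "one past the end" when the new interval is the largest (which also covers the empty list). This standalone sketch models that contract on a sorted array instead of a kernel list, with an index playing the role of the returned `list_head *` and a negative errno playing the role of `ERR_PTR()`:

```c
#include <assert.h>
#include <errno.h>

/* Userspace model of sja1105_first_entry_longer_than()'s contract on a
 * sorted array: return the index before which to insert, or -EBUSY if
 * 'interval' is already present ("gate conflict"). Returning n (one
 * past the end) plays the role of returning the list head itself:
 * append at the tail.
 */
static int first_entry_longer_than(const long *intervals, int n, long interval)
{
	for (int i = 0; i < n; i++) {
		if (intervals[i] == interval)
			return -EBUSY;
		if (interval < intervals[i])
			return i;
	}
	return n; /* empty array, or interval is the largest */
}
```

Because the empty case falls out of the loop naturally, the caller needs no `list_empty()` special case, mirroring the simplification in the patch.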
[PATCH net-next v2 03/18] net: dsa: sja1105: reorder sja1105_first_entry_longer_than with memory allocation
sja1105_first_entry_longer_than() does not make use of the full
struct sja1105_gate_entry *e, just of e->interval, which is set from
the passed entry_time. This means that if there is a gate conflict, we
have allocated e for nothing, just to free it later. Reorder the memory
allocation and the function call, to avoid that and simplify the error
unwind path.

Signed-off-by: Jakob Koschel
Signed-off-by: Vladimir Oltean
---
 drivers/net/dsa/sja1105/sja1105_vl.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_vl.c b/drivers/net/dsa/sja1105/sja1105_vl.c
index 369be2ac3587..e5ea8eb9ec4e 100644
--- a/drivers/net/dsa/sja1105/sja1105_vl.c
+++ b/drivers/net/dsa/sja1105/sja1105_vl.c
@@ -36,7 +36,11 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg,
 {
 	struct sja1105_gate_entry *e;
 	struct list_head *pos;
-	int rc;
+
+	pos = sja1105_first_entry_longer_than(&gating_cfg->entries,
+					      entry_time, extack);
+	if (IS_ERR(pos))
+		return PTR_ERR(pos);
 
 	e = kzalloc(sizeof(*e), GFP_KERNEL);
 	if (!e)
@@ -45,22 +49,11 @@ static int sja1105_insert_gate_entry(struct sja1105_gating_config *gating_cfg,
 	e->rule = rule;
 	e->gate_state = gate_state;
 	e->interval = entry_time;
-
-	pos = sja1105_first_entry_longer_than(&gating_cfg->entries,
-					      e->interval, extack);
-	if (IS_ERR(pos)) {
-		rc = PTR_ERR(pos);
-		goto err;
-	}
-
 	list_add(&e->list, pos->prev);
 
 	gating_cfg->num_entries++;
 
 	return 0;
-
-err:
-	kfree(e);
-	return rc;
 }
 
 /* The gate entries contain absolute times in their e->interval field. Convert
--
2.25.1
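The shape of the simplification generalizes: validating before allocating means the error path never owns any memory, so the `err:` label and its `kfree()` disappear. A standalone sketch of the pattern (all names hypothetical, not from the driver):

```c
#include <assert.h>
#include <stdlib.h>
#include <errno.h>

/* Stand-in for the position lookup: reject negative values */
static int validate(int value)
{
	return value >= 0 ? 0 : -EINVAL;
}

/* After the reorder: the only failure past the validation is the
 * allocation itself, so there is nothing to unwind and no 'err:' label.
 */
static int insert_entry(int value, int **out)
{
	int rc = validate(value);
	if (rc)
		return rc;

	int *e = malloc(sizeof(*e));
	if (!e)
		return -ENOMEM;

	*e = value;
	*out = e;
	return 0;
}
```

With the old ordering, the conflict path would have to free the freshly allocated entry; with this ordering, every early return is side-effect free.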
[PATCH net-next v2 01/18] connector: Replace usage of found with dedicated list iterator variable
To move the list iterator variable into the list_for_each_entry_*()
macros in the future, use of the list iterator variable after the loop
body should be avoided. To *never* use the list iterator variable after
the loop, it was concluded to use a separate iterator variable instead
of a 'found' boolean [1]. Whether the break/goto was hit can then be
determined by simply checking whether that variable was set, removing
the need for a separate 'found' variable.

Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=ehreask5sqxpwr9y7k9sa6cwx...@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel
---
 drivers/connector/cn_queue.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c
index 996f025eb63c..ed77599b0b25 100644
--- a/drivers/connector/cn_queue.c
+++ b/drivers/connector/cn_queue.c
@@ -92,20 +92,19 @@ int cn_queue_add_callback(struct cn_queue_dev *dev, const char *name,
 
 void cn_queue_del_callback(struct cn_queue_dev *dev, const struct cb_id *id)
 {
-	struct cn_callback_entry *cbq, *n;
-	int found = 0;
+	struct cn_callback_entry *cbq = NULL, *iter, *n;
 
 	spin_lock_bh(&dev->queue_lock);
-	list_for_each_entry_safe(cbq, n, &dev->queue_list, callback_entry) {
-		if (cn_cb_equal(&cbq->id.id, id)) {
-			list_del(&cbq->callback_entry);
-			found = 1;
+	list_for_each_entry_safe(iter, n, &dev->queue_list, callback_entry) {
+		if (cn_cb_equal(&iter->id.id, id)) {
+			list_del(&iter->callback_entry);
+			cbq = iter;
 			break;
 		}
 	}
 	spin_unlock_bh(&dev->queue_lock);
 
-	if (found)
+	if (cbq)
 		cn_queue_release_callback(cbq);
 }
--
2.25.1
[PATCH] ASoC: fsl: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
From: Minghao Chi

Using pm_runtime_resume_and_get() is more appropriate for simplifying
code.

Reported-by: Zeal Robot
Signed-off-by: Minghao Chi
---
 sound/soc/fsl/fsl_esai.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index ed444e8f1d6b..1a2bdf8e76f0 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -1050,11 +1050,9 @@ static int fsl_esai_probe(struct platform_device *pdev)
 		goto err_pm_disable;
 	}
 
-	ret = pm_runtime_get_sync(&pdev->dev);
-	if (ret < 0) {
-		pm_runtime_put_noidle(&pdev->dev);
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret < 0)
 		goto err_pm_get_sync;
-	}
 
 	ret = fsl_esai_hw_init(esai_priv);
 	if (ret)
--
2.25.1
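The equivalence this patch relies on: pm_runtime_get_sync() increments the device's usage count even when the resume fails, which is why the old code had to call pm_runtime_put_noidle() on the error path; pm_runtime_resume_and_get() drops the count itself on failure. A userspace mock of that contract (the struct and helper names are stand-ins, not the kernel API):

```c
#include <assert.h>

struct fake_dev {
	int usage_count;
	int resume_err;	/* 0 = resume succeeds, else a negative errno */
};

/* Models pm_runtime_get_sync(): the count is bumped even on failure */
static int fake_get_sync(struct fake_dev *dev)
{
	dev->usage_count++;
	return dev->resume_err;
}

/* Models pm_runtime_put_noidle(): drop the reference, no suspend */
static void fake_put_noidle(struct fake_dev *dev)
{
	dev->usage_count--;
}

/* Models pm_runtime_resume_and_get(): balanced on failure, so the
 * caller no longer needs its own put_noidle() in the error path.
 */
static int fake_resume_and_get(struct fake_dev *dev)
{
	int ret = fake_get_sync(dev);

	if (ret < 0)
		fake_put_noidle(dev);
	return ret;
}
```

Under this model, the old `get_sync` + `put_noidle` pair and the new single call leave the usage count in the same state on both paths, which is why the diff can drop the braces and the extra call.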
[Bug 215803] ppc64le(P9): BUG: Kernel NULL pointer dereference on read at 0x00000060 NIP: do_remove_conflicting_framebuffers+0x184/0x1d0
https://bugzilla.kernel.org/show_bug.cgi?id=215803

Michael Ellerman (mich...@ellerman.id.au) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |mich...@ellerman.id.au
         Resolution|---                         |CODE_FIX

--- Comment #5 from Michael Ellerman (mich...@ellerman.id.au) ---
This was reported to the patch author here:

  https://lore.kernel.org/all/YkHXO6LGHAN0p1pq@debian/

And there is a fix here:

  https://patchwork.freedesktop.org/patch/480648/

--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.