Re: [PATCH 1/2] mm: Fix struct page layout on 32-bit systems

2021-04-20 Thread Arnd Bergmann
On Tue, Apr 20, 2021 at 11:14 PM Vineet Gupta
 wrote:
> On 4/20/21 12:07 AM, Arnd Bergmann wrote:

> >
> > which means that half the 32-bit architectures do this. This may
> > cause more problems when arc and/or microblaze want to support
> > 64-bit kernels and compat mode in the future on their latest hardware,
> > as that means duplicating the x86 specific hacks we have for compat.
> >
> > What is alignof(u64) on 64-bit arc?
>
> $ echo 'int a = __alignof__(long long);' | arc64-linux-gnu-gcc -xc -
> -Wall -S -o - | grep -A1 a: | tail -n 1 | cut -f 3
> 8

Ok, good.

> Yeah ARCv2 alignment of 4 for 64-bit data was a bit of surprise finding
> for me as well. When 64-bit load/stores were initially targeted by the
> internal Metaware compiler (llvm based) they decided to keep alignment
> to 4 still (granted hardware allowed this) and then gcc guys decided to
> follow the same ABI. I only found this by accident :-)
>
> Can you point me to some specifics on the compat issue. For better of
> worse, arc64 does''t have a compat 32-bit mode, so everything is
> 64-on-64 or 32-on-32 (ARC32 flavor of ARCv3)

In that case, there should be no problem for you.

The main issue is with system calls and ioctls that contain a misaligned
struct member like

struct s {
   u32 a;
   u64 b;
};

Passing this structure by reference from a 32-bit user space application
to a 64-bit kernel with different alignment constraints means that the
kernel has to convert the structure layout. See
compat_ioctl_preallocate() in fs/ioctl.c for one such example.

   Arnd


Re: [PATCH 2/3] nios2: Cleanup deprecated function strlen_user

2021-04-20 Thread Arnd Bergmann
On Tue, Apr 20, 2021 at 3:37 PM  wrote:
>
> From: Guo Ren 
>
> $ grep strlen_user * -r
> arch/csky/include/asm/uaccess.h:#define strlen_user(str) strnlen_user(str, 
> 32767)
> arch/csky/lib/usercopy.c: * strlen_user: - Get the size of a string in user 
> space.
> arch/ia64/lib/strlen.S: // Please note that in the case of strlen() as 
> opposed to strlen_user()
> arch/mips/lib/strnlen_user.S: *  make strlen_user and strnlen_user access the 
> first few KSEG0
> arch/nds32/include/asm/uaccess.h:extern __must_check long strlen_user(const 
> char __user * str);
> arch/nios2/include/asm/uaccess.h:extern __must_check long strlen_user(const 
> char __user *str);
> arch/riscv/include/asm/uaccess.h:extern long __must_check strlen_user(const 
> char __user *str);
> kernel/trace/trace_probe_tmpl.h:static nokprobe_inline int 
> fetch_store_strlen_user(unsigned long addr);
> kernel/trace/trace_probe_tmpl.h:ret += 
> fetch_store_strlen_user(val + code->offset);
> kernel/trace/trace_uprobe.c:fetch_store_strlen_user(unsigned long addr)
> kernel/trace/trace_kprobe.c:fetch_store_strlen_user(unsigned long addr)
> kernel/trace/trace_kprobe.c:return fetch_store_strlen_user(addr);

I would suggest using "grep strlen_user * -rw", to let the whole-word match
filter out the irrelevant ones for the changelog.

> See grep result, nobody uses it.
>
> Signed-off-by: Guo Ren 
> Cc: Arnd Bergmann 

All three patches

Reviewed-by: Arnd Bergmann 

Do you want me to pick them up in the asm-generic tree?


Re: [RFC v1 PATCH 3/3] driver: update all the code that use soc_device_match

2021-04-20 Thread Arnd Bergmann
On Tue, Apr 20, 2021 at 1:44 AM Dominique MARTINET
 wrote:
> Arnd Bergmann wrote on Mon, Apr 19, 2021 at 02:16:36PM +0200:
> > For built-in drivers, load order depends on the initcall level and
> > link order (how things are lined listed in the Makefile hierarchy).
> >
> > For loadable modules, this is up to user space in the end.
> >
> > Which of the drivers in this scenario are loadable modules?
>
> All the drivers involved in my case are built-in (nvmem, soc and final
> soc_device_match consumer e.g. caam_jr that crashes the kernel if soc is
> not identified properly).

Ok, in that case you may have a chance to just adapt the initcall
levels. This is somewhat fragile if someone else already relies
on a particular order, but it's an easy one-line change to change
a driver e.g. from module_init() or device_initcall() to arch_initcall().

> I frankly don't like the idea of moving nvmem/ above soc/ in
> drivers/Makefile as a "solution" to this (especially as there is one
> that seems to care about what soc they run on...), so I'll have a look
> at links first, hopefully that will work out.

Right, that would be way more fragile.

I think the main problem in this case is the caam driver that really
should not look into the particular SoC type or even machine
compatible string. This is something we can do as a last resort
for compatibility with busted devicetree files, but it appears that
this driver does it as the primary method for identifying different
hardware revisions. I would suggest fixing the binding so that
each SoC that includes one of these devices has a soc specific
compatible string associated with the device that the driver can
use as the primary way of identifying the device.

We probably need to keep the old logic around for old dtb files,
but there can at least be a comment next to that table that
discourages people from adding more entries there.

  Arnd


Re: [RFC v1 PATCH 3/3] driver: update all the code that use soc_device_match

2021-04-20 Thread Arnd Bergmann
On Tue, Apr 20, 2021 at 1:44 AM Dominique MARTINET
 wrote:
> Arnd Bergmann wrote on Mon, Apr 19, 2021 at 02:16:36PM +0200:
> > For built-in drivers, load order depends on the initcall level and
> > link order (how things are lined listed in the Makefile hierarchy).
> >
> > For loadable modules, this is up to user space in the end.
> >
> > Which of the drivers in this scenario are loadable modules?
>
> All the drivers involved in my case are built-in (nvmem, soc and final
> soc_device_match consumer e.g. caam_jr that crashes the kernel if soc is
> not identified properly).

Ok, in that case you may have a chance to just adapt the initcall
levels. This is somewhat fragile if someone else already relies
on a particular order, but it's an easy one-line change to change
a driver e.g. from module_init() or device_initcall() to arch_initcall().

> I frankly don't like the idea of moving nvmem/ above soc/ in
> drivers/Makefile as a "solution" to this (especially as there is one
> that seems to care about what soc they run on...), so I'll have a look
> at links first, hopefully that will work out.

Right, that would be way more fragile.

I think the main problem in this case is the caam driver that really
should not look into the particular SoC type or even machine
compatible string. This is something we can do as a last resort
for compatibility with busted devicetree files, but it appears that
this driver does it as the primary method for identifying different
hardware revisions. I would suggest fixing the binding so that
each SoC that includes one of these devices has a soc specific
compatible string associated with the device that the driver can
use as the primary way of identifying the device.

We probably need to keep the old logic around for old dtb files,
but there can at least be a comment next to that table that
discourages people from adding more entries there.

  Arnd


Re: [PATCH 1/2] mm: Fix struct page layout on 32-bit systems

2021-04-20 Thread Arnd Bergmann
On Tue, Apr 20, 2021 at 5:10 AM Matthew Wilcox  wrote:
>
> On Tue, Apr 20, 2021 at 02:48:17AM +, Vineet Gupta wrote:
> > > 32-bit architectures which expect 8-byte alignment for 8-byte integers
> > > and need 64-bit DMA addresses (arc, arm, mips, ppc) had their struct
> > > page inadvertently expanded in 2019.
> >
> > FWIW, ARC doesn't require 8 byte alignment for 8 byte integers. This is
> > only needed for 8-byte atomics due to the requirements of LLOCKD/SCOND
> > instructions.
>
> Ah, like x86?  OK, great, I'll drop your arch from the list of
> affected.  Thanks!

I mistakenly assumed that i386 and m68k were the only supported
architectures with 32-bit alignment on u64. I checked it now and found

$ for i in /home/arnd/cross/x86_64/gcc-10.1.0-nolibc/*/bin/*-gcc ; do
echo `echo 'int a = __alignof__(long long);' | $i -xc - -Wall -S -o- |
grep -A1 a: | tail -n 1 | cut -f 3 -d\   `
${i#/home/arnd/cross/x86_64/gcc-10.1.0-nolibc/*/bin/} ; done
8 aarch64-linux-gcc
8 alpha-linux-gcc
4 arc-linux-gcc
8 arm-linux-gnueabi-gcc
8 c6x-elf-gcc
4 csky-linux-gcc
4 h8300-linux-gcc
8 hppa-linux-gcc
8 hppa64-linux-gcc
8 i386-linux-gcc
8 ia64-linux-gcc
2 m68k-linux-gcc
4 microblaze-linux-gcc
8 mips-linux-gcc
8 mips64-linux-gcc
8 nds32le-linux-gcc
4 nios2-linux-gcc
4 or1k-linux-gcc
8 powerpc-linux-gcc
8 powerpc64-linux-gcc
8 riscv32-linux-gcc
8 riscv64-linux-gcc
8 s390-linux-gcc
4 sh2-linux-gcc
4 sh4-linux-gcc
8 sparc-linux-gcc
8 sparc64-linux-gcc
8 x86_64-linux-gcc
8 xtensa-linux-gcc

which means that half the 32-bit architectures do this. This may
cause more problems when arc and/or microblaze want to support
64-bit kernels and compat mode in the future on their latest hardware,
as that means duplicating the x86 specific hacks we have for compat.

What is alignof(u64) on 64-bit arc?

  Arnd


Re: [PATCH][next] wlcore: Fix buffer overrun by snprintf due to incorrect buffer size Content-Type: text/plain; charset="utf-8"

2021-04-19 Thread Arnd Bergmann
On Mon, Apr 19, 2021 at 4:01 PM Colin King  wrote:
>
> From: Colin Ian King 
>
> The size of the buffer than can be written to is currently incorrect, it is
> always the size of the entire buffer even though the snprintf is writing
> as position pos into the buffer. Fix this by setting the buffer size to be
> the number of bytes left in the buffer, namely sizeof(buf) - pos.
>
> Addresses-Coverity: ("Out-of-bounds access")
> Fixes: 7b0e2c4f6be3 ("wlcore: fix overlapping snprintf arguments in debugfs")
> Signed-off-by: Colin Ian King 

Acked-by: Arnd Bergmann 


Re: [PATCH 37/57] staging: rtl8188eu: os_dep: ioctl_linux: Move 2 large data buffers into the heap

2021-04-19 Thread Arnd Bergmann
On Thu, Apr 15, 2021 at 7:29 AM Dan Carpenter  wrote:
>
> On Thu, Apr 15, 2021 at 08:20:16AM +0300, Dan Carpenter wrote:
> > On Wed, Apr 14, 2021 at 07:11:09PM +0100, Lee Jones wrote:
> > > ---
> > >  drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 12 +++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c 
> > > b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
> > > index c95ae4d6a3b6b..cc14f00947781 100644
> > > --- a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
> > > +++ b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
> > > @@ -224,7 +224,7 @@ static char *translate_scan(struct adapter *padapter,
> > > /* parsing WPA/WPA2 IE */
> > > {
> > > u8 *buf;
> > > -   u8 wpa_ie[255], rsn_ie[255];
> > > +   u8 *wpa_ie, *rsn_ie;
> > > u16 wpa_len = 0, rsn_len = 0;
> > > u8 *p;
> > >
> > > @@ -232,6 +232,14 @@ static char *translate_scan(struct adapter *padapter,
> > > if (!buf)
> > > return start;
>
> Arnd, added this return...  I don't understand why we aren't returning
> -ENOMEM here.

I don't remember my patch, but I see that the function returns a pointer
that gets dereferenced afterwards. Changing this is probably a good idea
(the caller does return an error code), but it requires a few extra changes.

If there is a larger rework, I'd suggest using a single kzalloc to get all
three arrays at once to get simpler error handling.

   Arnd


Re: [PATCH 12/13] ARM: dts: stm32: fix DSI port node on STM32MP15

2021-04-19 Thread Arnd Bergmann
On Thu, Apr 15, 2021 at 2:23 PM Alexandre TORGUE
 wrote:
> On 4/15/21 12:43 PM, Ahmad Fatoum wrote:
> > On 15.04.21 12:10, Alexandre Torgue wrote:
> >> Running "make dtbs_check W=1", some warnings are reported concerning
> >> DSI. This patch reorder DSI nodes to avoid:
> >>
> >> soc/dsi@5a00: unnecessary #address-cells/#size-cells without
> >> "ranges" or child "reg" property
> >
> > This reverts parts of commit 9c32f980d9 ("ARM: dts: stm32: preset
> > stm32mp15x video #address- and #size-cells"):
> >
> >  The cell count for address and size is defined by the binding and not
> >  something a board would change. Avoid each board adding this
> >  boilerplate by having the cell size specification in the SoC DTSI.
> >
> >
> > The DSI can have child nodes with a unit address (e.g. a panel) and ones
> > without (ports { } container). ports is described in the dtsi, panels are
> > described in the dts if available.
> >
> > Apparently, the checker is fine with
> > ports {
> >   #address-cells = <1>;
> >   #size-cells = <0>;
> > };
> >
> > I think my rationale for the patch above was sound, so I think the checker
> > taking offense at the DSI cells here should be considered a false positive.
>
> If it's a "false positive" warning then we need to find a way to not
> print it out. Else, it'll be difficult to distinguish which warnings are
> "normal" and which are not. This question could also be applied to patch[3].
>
> Arnd, Rob what is your feeling about this case ?

I don't have a strong opinion on this either way, but I would just
not apply this one for 5.13 in this case. Rob, Alexandre, please
let me know if I should apply the other patches before the
merge window, I usually don't mind taking bugfixes late before the
merge window, but I still want some level of confidence that they
are actually correct.

Ahmad, if you feel strongly about this particular issue, would you like
to suggest a patch for the checker?

Arnd


Re: [RFC v1 PATCH 3/3] driver: update all the code that use soc_device_match

2021-04-19 Thread Arnd Bergmann
On Mon, Apr 19, 2021 at 11:33 AM Dominique MARTINET
 wrote:
> Geert Uytterhoeven wrote on Mon, Apr 19, 2021 at 11:03:24AM +0200:
>
> > soc_device_match() should only be used as a last resort, to identify
> > systems that cannot be identified otherwise.  Typically this is used for
> > quirks, which should only be enabled on a very specific subset of
> > systems.  IMHO such systems should make sure soc_device_match()
> > is available early, by registering their SoC device early.
>
> I definitely agree there, my suggestion to defer was only because I know
> of no other way to influence the ordering of drivers loading reliably
> and gave up on soc being init'd early.

In some cases, you can use the device_link infrastructure to deal
with dependencies between devices. Not sure if this would help
in your case, but have a look at device_link_add() etc in drivers/base/core.c

> In this particular case the problem is that since 7d981405d0fd ("soc:
> imx8m: change to use platform driver") the soc probe tries to use the
> nvmem driver for ocotp fuses for imx8m devices, which isn't ready yet.
> So soc loading gets pushed back to the end of the list because it gets
> defered and other drivers relying on soc_device_match get confused
> because they wrongly think a device doesn't match a quirk when it
> actually does.
>
> If there is a way to ensure the nvmem driver gets loaded before the soc,
> that would also solve the problem nicely, and avoid the need to mess
> with all the ~50 drivers which use it.
>
> Is there a way to control in what order drivers get loaded? Something in
> the dtb perhaps?

For built-in drivers, load order depends on the initcall level and
link order (how things are lined listed in the Makefile hierarchy).

For loadable modules, this is up to user space in the end.

Which of the drivers in this scenario are loadable modules?

Arnd


Re: [PATCH v2 (RESEND) 2/2] riscv: atomic: Using ARCH_ATOMIC in asm/atomic.h

2021-04-19 Thread Arnd Bergmann
On Sat, Apr 17, 2021 at 6:45 AM  wrote:
> +#define arch_atomic_read(v)__READ_ONCE((v)->counter)
> +#define arch_atomic_set(v, i)  __WRITE_ONCE(((v)->counter), 
> (i))

> +#define ATOMIC64_INIT  ATOMIC_INIT
> +#define arch_atomic64_read arch_atomic_read
> +#define arch_atomic64_set  arch_atomic_set
>  #endif

I think it's a bit confusing to define arch_atomic64_read() etc in terms
of arch_atomic_read(), given that they operate on different types.

IMHO the clearest would be to define both in terms of the open-coded
version you have for the 32-bit atomics.

Also, given that all three architectures (x86, arm64, riscv) use the same
definitions for the six macros above, maybe those can just get moved
into a common file with a possible override?

x86 uses an inline function here instead of the macro. This would also
be my preference, but it may add complexity to avoid circular header
dependencies.

The rest of this patch looks good to me.

Arnd


Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-17 Thread Arnd Bergmann
On Sat, Apr 17, 2021 at 3:58 PM Matthew Wilcox  wrote:
> I wouldn't like to make that assumption.  I've come across IOMMUs (maybe
> on parisc?  powerpc?) that like to encode fun information in the top
> few bits.  So we could get it down to 52 bits, but I don't think we can
> get all the way down to 32 bits.  Also, we need to keep the bottom bit
> clear for PageTail, so that further constrains us.

I'd be surprised to find such an IOMMU on a 32-bit machine, given that
the main reason for using an IOMMU on these is to avoid the 32-bit
address limit in DMA masters.

I see that parisc32 does not enable 64-bit dma_addr_t, while powerpc32
does not support any IOMMU, so it wouldn't be either of those two.

I do remember some powerpc systems that encode additional flags
(transaction ordering, caching, ...) into the high bits of the physical
address in the IOTLB, but not the virtual address used for looking
them up.

> Anyway, I like the "two unsigned longs" approach I posted yesterday,
> but thanks for the suggestion.

Ok, fair enough. As long as there are enough bits in this branch of
'struct page', I suppose it is the safe choice.

Arnd


Re: [PATCH 1/1] mm: Fix struct page layout on 32-bit systems

2021-04-17 Thread Arnd Bergmann
On Fri, Apr 16, 2021 at 5:27 PM Matthew Wilcox  wrote:
> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> index b5b195305346..db7c7020746a 100644
> --- a/include/net/page_pool.h
> +++ b/include/net/page_pool.h
> @@ -198,7 +198,17 @@ static inline void page_pool_recycle_direct(struct 
> page_pool *pool,
>
>  static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
>  {
> -   return page->dma_addr;
> +   dma_addr_t ret = page->dma_addr[0];
> +   if (sizeof(dma_addr_t) > sizeof(unsigned long))
> +   ret |= (dma_addr_t)page->dma_addr[1] << 32;
> +   return ret;
> +}

Have you considered using a PFN type address here? I suspect you
can prove that shifting the DMA address by PAGE_BITS would
make it fit into an 'unsigned long' on all 32-bit architectures with
64-bit dma_addr_t. This requires that page->dma_addr to be
page aligned, as well as fit into 44 bits. I recently went through the
maximum address space per architecture to define a
MAX_POSSIBLE_PHYSMEM_BITS, and none of them have more than
40 here, presumably the same is true for dma address space.

Arnd


Re: Bogus struct page layout on 32-bit

2021-04-16 Thread Arnd Bergmann
On Fri, Apr 16, 2021 at 11:27 AM 'Grygorii Strashko' via Clang Built
Linux  wrote:
> On 10/04/2021 11:52, Ilias Apalodimas wrote:
> > +CC Grygorii for the cpsw part as Ivan's email is not valid anymore
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA 
> even in case of LPAE (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected 
> for the LPAE case
> on TI platforms and the fact that it became set is the result of 
> multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
>
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config 
> symbol in lib/Kconfig").

I completely missed this change in the past, and I don't really agree
with it either.

Most 32-bit Arm platforms are in fact limited to 32-bit DMA, even when they have
MMIO or RAM areas above the 4GB boundary that require LPAE.

> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y 
> by using
> things like (__force u32) for example.
>
> Honestly, I've done sanity check of CPSW with LPAE=y 
> (ARCH_DMA_ADDR_T_64BIT=y) very long time ago.

This is of course a good idea, drivers should work with any
combination of 32-bit
or 64-bit phys_addr_t and dma_addr_t.

Arnd


Re: [PATCH net-next] net: Space: remove hp100 probe

2021-04-13 Thread Arnd Bergmann
On Wed, Apr 14, 2021, 00:42 Stephen Hemminger
 wrote:
>
> On Tue, 13 Apr 2021 16:16:17 +0200 Arnd Bergmann  wrote:
>
> >   */
> >  static struct devprobe2 isa_probes[] __initdata = {
> > -#if defined(CONFIG_HP100) && defined(CONFIG_ISA) /* ISA, EISA */
> > - {hp100_probe, 0},
> > -#endif
> >  #ifdef CONFIG_3C515
> >   {tc515_probe, 0},
> >  #endif
>
> Thanks, do we even need to have the static initialization anymore?

I actually did some more cleanups after I sent the above patch when
I found out that this code still exists. It turned out that above half of
the static initializations are completely pointless because the
drivers never rely on the netdev= command line arguments and
can simply be changed to always using module_init() instead of
relying on net_olddevs_init() for the built-in case.

The remaining ones are all ISA drivers: 3c515, Ultra, WD80x3,
NE2000, Lance, SMC9194, CS89x0, NI65 and COPS.

With my cleanups, I move the netdev_boot_setup infrastructure
into drivers/net/Space.c and only compile it when at least one of
these eight drivers is enabled.

All these drivers also support being built as loadable modules, but
in that configuration they only support a single device (back in the
day you could copy the module and just load it twice to support
more than one instance, not sure we still want to support that).

None of these drivers have a maintainer listed, but I suppose
there are still some PC/104 machines with NE2000 network
cards that could theoretically run a modern kernel.

Arnd


[PATCH net-next] net: Space: remove hp100 probe

2021-04-13 Thread Arnd Bergmann
From: Arnd Bergmann 

The driver was removed last year, but the static initialization got left
behind by accident.

Fixes: a10079c66290 ("staging: remove hp100 driver")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/Space.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/Space.c b/drivers/net/Space.c
index 7bb699d7c422..a61cc7b26a87 100644
--- a/drivers/net/Space.c
+++ b/drivers/net/Space.c
@@ -59,9 +59,6 @@ static int __init probe_list2(int unit, struct devprobe2 *p, 
int autoprobe)
  * look for EISA/PCI cards in addition to ISA cards).
  */
 static struct devprobe2 isa_probes[] __initdata = {
-#if defined(CONFIG_HP100) && defined(CONFIG_ISA)   /* ISA, EISA */
-   {hp100_probe, 0},
-#endif
 #ifdef CONFIG_3C515
{tc515_probe, 0},
 #endif
-- 
2.29.2



Re: [PATCH] asm-generic/io.h: Silence -Wnull-pointer-arithmetic warning on PCI_IOBASE

2021-04-13 Thread Arnd Bergmann
On Tue, Apr 13, 2021 at 3:06 PM David Laight  wrote:
>
> From: Arnd Bergmann
> > Sent: 13 April 2021 13:58
> ...
> > The remaining ones (csky, m68k, sparc32) need to be inspected
> > manually to see if they currently support PCI I/O space but in
> > fact use address zero as the base (with large resources) or they
> > should also turn the operations into a NOP.
>
> I'd expect sparc32 to use an ASI to access PCI IO space.
> I can't quite remember whether IO space was supported at all.

I see this bit in arch/sparc/kernel/leon_pci.c

 * PCI Memory and Prefetchable Memory is direct-mapped. However I/O Space is
 * accessed through a Window which is translated to low 64KB in PCI space, the
 * first 4KB is not used so 60KB is available.
...
pci_add_resource_offset(, >io_space,
info->io_space.start - 0x1000);

which means that there is I/O space, which gets accessed through whichever
method readb() uses. Having the offset equal to the resource means that
the '(void *)0' start is correct.

As this leaves only two others, I checked those as well:

csky does not actually have a PCI host bridge driver at the moment, so
we don't care about breaking port access on it it, and I would suggest
leaving I/O port access disabled. (Added Guo Ren to Cc for confirmation).

m68k only supports PCI on coldfire M54xx, and this variant does set
a PCI_IOBASE after all. The normal MMU based m68k have no PCI
and do define their out inb/outb/..., so nothing changes for them.

To summarize: only sparc32 needs to set PCI_IOBASE to zero, everyone
else should just WARN_ONCE() or return 0xff/0x/0x.

Arnd


Re: [PATCH] asm-generic/io.h: Silence -Wnull-pointer-arithmetic warning on PCI_IOBASE

2021-04-13 Thread Arnd Bergmann
On Tue, Apr 13, 2021 at 2:38 PM Niklas Schnelle  wrote:
> On Tue, 2021-04-13 at 14:26 +0200, Arnd Bergmann wrote:
> > I think the real underlying problem is that '0' is a particularly bad
> > default value,
> > we should not have used this one in asm-generic, or maybe have left it as
> > mandatory to be defined by an architecture to a sane value. Note that
> > architectures that don't actually support I/O ports will cause a NULL
> > pointer dereference when loading a legacy driver, which is exactly what 
> > clang
> > is warning about here. Architectures that to support I/O ports in PCI 
> > typically
> > map them at a fixed location in the virtual address space and should put 
> > that
> > location here, rather than adding the pointer value to the port resources.
> >
> > What we might do instead here would be move the definition into those
> > architectures that actually define the base to be at address zero, with a
> > comment explaining why this is a bad location, and then changing the
> > inb/outb style helpers to either an empty function or a WARN_ONCE().
> >
> > On which architectures do you see the problem? How is the I/O port
> > actually mapped there?
>
> I'm seeing this on s390 which indeed has no I/O port support at all.
> I'm not sure how many others there are but I feel like us having to
> define these functions as empty is also kind of awkward. Maybe we could
> put them into the asm-generic/io.h for the case that PCI_IOBASE is not
> defined? Then at least every platform not supporting I/O ports would
> share them.

Yes, I meant doing this in the asm-generic version, something like

#if !defined(inb) && !defined(_inb)
#define _inb _inb
static inline u8 _inb(unsigned long addr)
{
#ifdef PCI_IOBASE
u8 val;

__io_pbr();
val = __raw_readb(PCI_IOBASE + addr);
__io_par(val);
return val;
#else
WARN_ONCE(1, "No I/O port support");
return 0xff;
#endif
}
#endif

And follow up with a separate patch that moves the
"#define PCI_IOBASE ((void __iomem *)0)" into the architectures
that would currently see it, i.e. those that include asm-generic/io.h
but define neither inb/_inb nor PCI_IOBASE:

$ git grep -l asm-generic/io.h | xargs grep -wL inb | xargs grep -L PCI_IOBASE
arch/arc/include/asm/io.h
arch/csky/include/asm/io.h
arch/h8300/include/asm/io.h
arch/m68k/include/asm/io.h
arch/nds32/include/asm/io.h
arch/nios2/include/asm/io.h
arch/openrisc/include/asm/io.h
arch/s390/include/asm/io.h
arch/sparc/include/asm/io_32.h
arch/um/include/asm/io.h

Out of these, I see that arc, h8300, nds32, nios2, openrisc, and um
never set CONFIG_HAVE_PCI, so these would all be better off
leaving PCI_IOBASE undefined and getting the WARN_ONCE.

The remaining ones (csky, m68k, sparc32) need to be inspected
manually to see if they currently support PCI I/O space but in
fact use address zero as the base (with large resources) or they
should also turn the operations into a NOP.

 Arnd


Re: [PATCH] asm-generic/io.h: Silence -Wnull-pointer-arithmetic warning on PCI_IOBASE

2021-04-13 Thread Arnd Bergmann
On Tue, Apr 13, 2021 at 1:54 PM Niklas Schnelle  wrote:
>
> When PCI_IOBASE is not defined, it is set to 0 such that it is ignored
> in calls to the readX/writeX primitives. While mathematically obvious
> this triggers clang's -Wnull-pointer-arithmetic warning.

Indeed, this is an annoying warning.

> An additional complication is that PCI_IOBASE is explicitly typed as
> "void __iomem *" which causes the type conversion that converts the
> "unsigned long" port/addr parameters to the appropriate pointer type.
> As non pointer types are used by drivers at the callsite since these are
> dealing with I/O port numbers, changing the parameter type would cause
> further warnings in drivers. Instead use "uintptr_t" for PCI_IOBASE
> 0 and explicitly cast to "void __iomem *" when calling readX/writeX.
>
> Signed-off-by: Niklas Schnelle 
> ---
>  include/asm-generic/io.h | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
> index c6af40ce03be..8eb00bdef7ad 100644
> --- a/include/asm-generic/io.h
> +++ b/include/asm-generic/io.h
> @@ -441,7 +441,7 @@ static inline void writesq(volatile void __iomem *addr, 
> const void *buffer,
>  #endif /* CONFIG_64BIT */
>
>  #ifndef PCI_IOBASE
> -#define PCI_IOBASE ((void __iomem *)0)
> +#define PCI_IOBASE ((uintptr_t)0)
>  #endif
>
>  #ifndef IO_SPACE_LIMIT

Your patch looks wrong in the way it changes the type of one of the definitions,
but not the type of any of the architecture specific ones. It's also
awkward since
'void __iomem *' is really the correct type, while 'uintptr_t' is not!

I think the real underlying problem is that '0' is a particularly bad
default value,
we should not have used this one in asm-generic, or maybe have left it as
mandatory to be defined by an architecture to a sane value. Note that
architectures that don't actually support I/O ports will cause a NULL
pointer dereference when loading a legacy driver, which is exactly what clang
is warning about here. Architectures that to support I/O ports in PCI typically
map them at a fixed location in the virtual address space and should put that
location here, rather than adding the pointer value to the port resources.

What we might do instead here would be move the definition into those
architectures that actually define the base to be at address zero, with a
comment explaining why this is a bad location, and then changing the
inb/outb style helpers to either an empty function or a WARN_ONCE().

On which architectures do you see the problem? How is the I/O port
actually mapped there?

  Arnd


Re: [PATCH v2 16/21] ipmi: kcs_bmc: Add a "raw" character device interface

2021-04-13 Thread Arnd Bergmann
On Tue, Apr 13, 2021 at 1:45 AM Andrew Jeffery  wrote:
> On Mon, 12 Apr 2021, at 18:18, Arnd Bergmann wrote:
> > On Mon, Apr 12, 2021 at 3:33 AM Andrew Jeffery  wrote:
> > > On Fri, 9 Apr 2021, at 17:25, Arnd Bergmann wrote:
> > > > On Fri, Mar 19, 2021 at 7:31 AM Andrew Jeffery  wrote:
> > > > >
> > > > > The existing IPMI chardev encodes IPMI behaviours as the name 
> > > > > suggests.
> > > > > However, KCS devices are useful beyond IPMI (or keyboards), as they
> > > > > provide a means to generate IRQs and exchange arbitrary data between a
> > > > > BMC and its host system.
> > > >
> > > > I only noticed the series after Joel asked about the DT changes on the 
> > > > arm
> > > > side. One question though:
> > > >
> > > > How does this related to the drivers/input/serio/ framework that also 
> > > > talks
> > > > to the keyboard controller for things that are not keyboards?
> > >
> > > I've taken a brief look and I feel they're somewhat closely related.
> > >
> > > It's plausible that we could wrangle the code so the Aspeed and Nuvoton
> > > KCS drivers move under drivers/input/serio. If you squint, the i8042
> > > serio device driver has similarities with what the Aspeed and Nuvoton
> > > device drivers are providing to the KCS IPMI stack.
> >
> > After looking some more into it, I finally understood that the two are
> > rather complementary. While the  drivers/char/ipmi/kcs_bmc.c
> > is the other (bmc) end of drivers/char/ipmi/ipmi_kcs_sm.c, it seems
> > that the proposed kcs_bmc_cdev_raw.c interface would be
> > what corresponds to the other side of
> > drivers/input/serio/i8042.c+userio.c.
>
> Right. I guess the question is should we be splitting kernel subsystems
> along host/bmc lines? Doesn't feel intuitive, it's all Linux, but maybe
> we can consolidate in the future if it makes sense?

We actually have a number of subsystems with somewhat overlapping
functionality. I brought up serio, because it has an abstraction for multiple
things that communicate over the keyboard controller and I thought
the problem you were trying to solve was also related to the keyboard
controller.
It is also one of multiple abstractions that allow you to connect a device
to a uart (along with serdev and tty_ldisc, probably at least one more that
you can nest above or below these).

Consolidating the kcs_bmc.c interface into something that already
exists would obviously be best, but it's not clear which of these that
should be, that depends on the fundamental properties of the hardware
interface.

> > Then again, these are also on
> > separate ports (0x60 for the keyboard controller, 0xca2 for the BMC
> > KCS), so they would never actually talk to one another.
>
> Well, sort of I guess. On Power systems we don't use the keyboard
> controller for IPMI or keyboards, so we're just kinda exploiting the
> hardware for our own purposes.

Can you describe in an abstract form what the hardware interface
can do here and what you want from it? I wonder if it could be
part of a higher-level interface such as drivers/mailbox/ instead.

 Arnd


Re: [PATCH][next] habanalabs/gaudi: Fix uninitialized return code rc when read size is zero

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 6:11 PM Colin King  wrote:
>
> From: Colin Ian King 
>
> In the case where size is zero the while loop never assigns rc and the
> return value is uninitialized. Fix this by initializing rc to zero.
>
> Addresses-Coverity: ("Uninitialized scalar variable")
> Fixes: 639781dcab82 ("habanalabs/gaudi: add debugfs to DMA from the device")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/misc/habanalabs/gaudi/gaudi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c 
> b/drivers/misc/habanalabs/gaudi/gaudi.c
> index 8730b691ec61..b751652f80a8 100644
> --- a/drivers/misc/habanalabs/gaudi/gaudi.c
> +++ b/drivers/misc/habanalabs/gaudi/gaudi.c
> @@ -6252,7 +6252,7 @@ static int gaudi_debugfs_read_dma(struct hl_device 
> *hdev, u64 addr, u32 size,
> dma_addr_t dma_addr;
> void *kernel_addr;
> bool is_eng_idle;
> -   int rc, dma_id;
> +   int rc = 0, dma_id;
>
> kernel_addr = hdev->asic_funcs->asic_dma_alloc_coherent(
> hdev, SZ_2M,


In general, I don't like adding initializations during the declaration as that
tends to hide warnings for the cases where a later initialization is
missing. In this case it looks correct though.

Acked-by: Arnd Bergmann 


Re: [Linux-stm32] [RESEND PATCH 0/3] MAINTAINERS: update STMicroelectronics email

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 12:19 PM Patrice CHOTARD
 wrote:
>
> Hi
>
> I think this series has been forgotten, any chance to see it merged into 
> v5.13 ?

It's in -rc7, but it appears that my email reply went missing when I merged it.

  Arnd


Re: [PATCH 5/5] compat: consolidate the compat_flock{,64} definition

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 12:54 PM David Laight  wrote:
> From: David Laight > Sent: 12 April 2021 10:37
> ...
> > I'm guessing that compat_pid_t is 16 bits?
> > So the native 32bit version has an unnamed 2 byte structure pad.
> > The 'packed' removes this pad from the compat structure.
> >
> > AFAICT (apart from mips) the __ARCH_COMPAT_FLOCK_PAD is just
> > adding an explicit pad for the implicit pad the compiler
> > would generate because compat_pid_t is 16 bits.
>
> I've just looked at the header.
> compat_pid_t is 32 bits.
> So Linux must have gained 32bit pids at some earlier time.
> (Historically Unix pids were 16 bit - even on 32bit systems.)
>
> Which makes the explicit pad in 'sparc' rather 'interesting'.

I saw it was there since the sparc kernel support got merged in
linux-1.3, possibly copied from an older sunos version.

> oh - compat_loff_t is only used in a couple of other places.
> neither care in any way about the alignment.
> (Provided get_user() doesn't fault on a 8n+4 aligned address.)

Ah right, I also see that after this series it's only used in to other
places:  compat_resume_swap_area, which could also lose the
__packed annotation, and in the declaration of
compat_sys_sendfile64, where it makes no difference.

  Arnd


Re: consolidate the flock uapi definitions

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 12:22 PM David Laight  wrote:
>
> From: Arnd Bergmann
> > Sent: 12 April 2021 11:04
> >
> > On Mon, Apr 12, 2021 at 10:55 AM Christoph Hellwig  wrote:
> > >
> > > Hi all,
> > >
> > > currently we deal with the slight differents in the various architecture
> > > variants of the flock and flock64 stuctures in a very cruft way.  This
> > > series switches to just use small arch hooks and define the rest in
> > > asm-generic and linux/compat.h instead.
> >
> > Nice cleanup. I can merge it through the asm-generic tree if you like,
> > though it's a little late just ahead of the merge window.
> >
> > I would not want to change the compat_loff_t definition to compat_s64
> > to avoid the padding at this time, though that might be a useful cleanup
> > for a future cycle.
>
> Is x86 the only architecture that has 32bit and 64bit variants where
> the 32bit variant aligns 64bit items on 32bit boundaries?

Yes.

> ISTM that fixing compat_loff_t shouldn't have any fallout.

That is my assumption as well, but I still wouldn't take the
risk one week before the merge window.

   Arnd


Re: consolidate the flock uapi definitions

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 10:55 AM Christoph Hellwig  wrote:
>
> Hi all,
>
> currently we deal with the slight differents in the various architecture
> variants of the flock and flock64 stuctures in a very cruft way.  This
> series switches to just use small arch hooks and define the rest in
> asm-generic and linux/compat.h instead.

Nice cleanup. I can merge it through the asm-generic tree if you like,
though it's a little late just ahead of the merge window.

I would not want to change the compat_loff_t definition to compat_s64
to avoid the padding at this time, though that might be a useful cleanup
for a future cycle.

Arnd


Re: [PATCH 1/5] uapi: remove the unused HAVE_ARCH_STRUCT_FLOCK64 define

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 10:55 AM Christoph Hellwig  wrote:
>
> Signed-off-by: Christoph Hellwig 

The patch looks good, but I'd like to see a description for each one.
How about:

| The check was added when Stephen Rothwell created the file, but
| no architecture ever defined it.

Arnd


Re: [PATCH v2 16/21] ipmi: kcs_bmc: Add a "raw" character device interface

2021-04-12 Thread Arnd Bergmann
On Mon, Apr 12, 2021 at 3:33 AM Andrew Jeffery  wrote:
> On Fri, 9 Apr 2021, at 17:25, Arnd Bergmann wrote:
> > On Fri, Mar 19, 2021 at 7:31 AM Andrew Jeffery  wrote:
> > >
> > > The existing IPMI chardev encodes IPMI behaviours as the name suggests.
> > > However, KCS devices are useful beyond IPMI (or keyboards), as they
> > > provide a means to generate IRQs and exchange arbitrary data between a
> > > BMC and its host system.
> >
> > I only noticed the series after Joel asked about the DT changes on the arm
> > side. One question though:
> >
> > How does this related to the drivers/input/serio/ framework that also talks
> > to the keyboard controller for things that are not keyboards?
>
> I've taken a brief look and I feel they're somewhat closely related.
>
> It's plausible that we could wrangle the code so the Aspeed and Nuvoton
> KCS drivers move under drivers/input/serio. If you squint, the i8042
> serio device driver has similarities with what the Aspeed and Nuvoton
> device drivers are providing to the KCS IPMI stack.

After looking some more into it, I finally understood that the two are
rather complementary. While the  drivers/char/ipmi/kcs_bmc.c
is the other (bmc) end of drivers/char/ipmi/ipmi_kcs_sm.c, it seems
that the proposed kcs_bmc_cdev_raw.c interface would be
what corresponds to the other side of
drivers/input/serio/i8042.c+userio.c. Then again, these are also on
separate ports (0x60 for the keyboard controller, 0xca2 for the BMC
KCS), so they would never actually talk to one another.

> Both the KCS IPMI and raw chardev I've implemented in this patch need
> both read and write access to the status register (STR). serio could
> potentially expose its value through serio_interrupt() using the
> SERIO_OOB_DATA flag, but I haven't put any thought into it beyond this
> sentence. We'd need some extra support for writing STR via the serio
> API. I'm not sure that fits into the abstraction (unless we make
> serio_write() take a flags argument?).
>
> In that vein, the serio_raw interface is close to the functionality
> that the raw chardev provides in this patch, though again serio_raw
> lacks userspace access to STR. Flags are ignored in the ->interrupt()
> callback so all values received via ->interrupt() are exposed as data.
> The result is there's no way to take care of SERIO_OOB_DATA in the
> read() path. Given that, I think we'd have to expose an ioctl() to
> access the STR value after taking care of SERIO_OOB_DATA in
> ->interrupt().
>
> I'm not sure where that lands us.

Based on what I looked up, I think you can just forget about my original
question. We have two separate interfaces that use an Intel 8042-style
protocol, but they don't really interact.

  Arnd


Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Arnd Bergmann
On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox  wrote:
> +   dma_addr_t dma_addr __packed;
> };
> struct {/* slab, slob and slub */
> union {
>
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
>
> We could also do:
>
> +   dma_addr_t dma_addr __packed __aligned(sizeof(void 
> *));
>
> and I see pahole, at least sees this correctly:
>
> struct {
> long unsigned int _page_pool_pad; /* 4 4 */
> dma_addr_t dma_addr __attribute__((__aligned__(4))); 
> /* 8 8 */
> } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3

struct page {
short a;
struct {
short b;
long long c __attribute__((packed, aligned(2)));
} __attribute__((packed));
} __attribute__((aligned(8)));

In this structure, 'c' is clearly aligned to eight bytes, and gcc does
realize that
it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
struct members with less than 4 byte alignment. However, it also complains
that passing a pointer to 'c' into a function that expects a 'long long' is not
allowed because alignof(c) is only '2' here.

(I used 'short' here because I having a 64-bit member misaligned by four
bytes wouldn't make a difference to the instructions on Arm, or any other
32-bit architecture I can think of, regardless of the ABI requirements).

  Arnd


[PATCH] ext4: fix debug format string warning

2021-04-09 Thread Arnd Bergmann
From: Arnd Bergmann 

Using no_printk() for jbd_debug() revealed two warnings:

fs/jbd2/recovery.c: In function 'fc_do_one_pass':
fs/jbd2/recovery.c:256:30: error: format '%d' expects a matching 'int' argument 
[-Werror=format=]
  256 | jbd_debug(3, "Processing fast commit blk with seq %d");
  |  ^~~~
fs/ext4/fast_commit.c: In function 'ext4_fc_replay_add_range':
fs/ext4/fast_commit.c:1732:30: error: format '%d' expects argument of type 
'int', but argument 2 has type 'long unsigned int' [-Werror=format=]
 1732 | jbd_debug(1, "Converting from %d to %d %lld",

The first one was added incorrectly, and was also missing a few newlines
in debug output, and the second one happened when the type of an
argument changed.

Reported-by: kernel test robot 
Fixes: d556435156b7 ("jbd2: avoid -Wempty-body warnings")
Fixes: 6db074618969 ("ext4: use BIT() macro for BH_** state bits")
Fixes: 5b849b5f96b4 ("jbd2: fast commit recovery path")
Signed-off-by: Arnd Bergmann 
---
 fs/ext4/fast_commit.c | 2 +-
 fs/jbd2/recovery.c| 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 6c4f19b0a556..feec2f3f13e9 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1729,7 +1729,7 @@ static int ext4_fc_replay_add_range(struct super_block 
*sb,
}
 
/* Range is mapped and needs a state change */
-   jbd_debug(1, "Converting from %d to %d %lld",
+   jbd_debug(1, "Converting from %ld to %d %lld",
map.m_flags & EXT4_MAP_UNWRITTEN,
ext4_ext_is_unwritten(ex), map.m_pblk);
ret = ext4_ext_replay_update_ex(inode, cur, map.m_len,
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 69f18fe20923..60601c5779f1 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -245,15 +245,15 @@ static int fc_do_one_pass(journal_t *journal,
return 0;
 
while (next_fc_block <= journal->j_fc_last) {
-   jbd_debug(3, "Fast commit replay: next block %ld",
+   jbd_debug(3, "Fast commit replay: next block %ld\n",
  next_fc_block);
err = jread(, journal, next_fc_block);
if (err) {
-   jbd_debug(3, "Fast commit replay: read error");
+   jbd_debug(3, "Fast commit replay: read error\n");
break;
}
 
-   jbd_debug(3, "Processing fast commit blk with seq %d");
+   jbd_debug(3, "Processing fast commit blk with seq\n");
err = journal->j_fc_replay_callback(journal, bh, pass,
next_fc_block - journal->j_fc_first,
expected_commit_id);
-- 
2.29.2



Re: [PATCH] pwm: raspberrypi-poe: Fix mailbox message initialization

2021-04-09 Thread Arnd Bergmann
On Fri, Apr 9, 2021 at 11:08 AM Nicolas Saenz Julienne
 wrote:
>
> For testing purposes this driver might be built on big-endian
> architectures. So make sure we take that into account when populating
> structures that are to be passed to RPi4's mailbox.
>
> Reported-by: kernel test robot 
> Fixes: 79caa362eab6 ("pwm: Add Raspberry Pi Firmware based PWM bus")
> Signed-off-by: Nicolas Saenz Julienne 
> ---
>
> @arndb: This was just meged into the arm-soc tree some days ago. Should I
> prepare a second PR once it's been reviewed?

Yes, that would be good. If you have no other driver patches that I should
pick up, just forward the patch (with the Ack) to s...@kernel.org and I
can just apply it.

Arnd


Re: [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format

2021-04-09 Thread Arnd Bergmann
On Fri, Apr 9, 2021 at 6:56 PM Sven Peter  wrote:
> On Wed, Apr 7, 2021, at 12:44, Will Deacon wrote:
> > On Sun, Mar 28, 2021 at 09:40:07AM +0200, Sven Peter wrote:
> >
> > > +   cfg->pgsize_bitmap &= SZ_16K;
> > > +   if (!cfg->pgsize_bitmap)
> > > +   return NULL;
> >
> > This is worrying (and again, I don't think this belongs here). How is this
> > thing supposed to work if the CPU is using 4k pages?
>
> This SoC is just full of fun surprises!
> I didn't even think about that case since I've always been using 16k pages so 
> far.
>
> I've checked again and wasn't able to find any way to configure the pagesize
> of the IOMMU. There seem to be variants of this IP in older iPhones which
> support a 4k pagesize but to the best of my knowledge this is hard wired
> and not configurable in software.
>
> When booting with 4k pages I hit the BUG_ON in iova.c that ensures that the
> iommu pagesize has to be <= the cpu page size.
>
> I see two options here and I'm not sure I like either of them:
>
> 1) Just don't support 4k CPU pages together with IOMMU translations and only
>allow full bypass mode there.
>This would however mean that PCIe (i.e. ethernet, usb ports on the Mac
>mini) and possibly Thunderbolt support would not be possible since these
>devices don't seem to like iommu bypass mode at all.

It should be possible to do a fake bypass mode by just programming a
static page table for as much address space as you can, and then
use swiotlb to address any memory beyond that. This won't perform
well because it requires bounce buffers for any high memory, but it
can be a last resort if a dart instance cannot do normal bypass mode.

> 2) I've had a brief discussion on IRC with Arnd about this [1] and he pointed
>out that the dma_map_sg API doesn't make any guarantees about the returned
>iovas and that it might be possible to make this work at least for devices
>that go through the normal DMA API.
>
>I've then replaced the page size check with a WARN_ON in iova.c just to see
>what happens. At least normal devices that go through the DMA API seem to
>work with my configuration. iommu_dma_alloc took the iommu_dma_alloc_remap
>path which was called with the cpu page size but then used
>domain->pgsize_bitmap to increase that to 16k. So this kinda works out, but
>there are other functions in dma-iommu.c that I believe rely on the fact 
> that
>the iommu can map single cpu pages. This feels very fragile right now and
>would probably require some rather invasive changes.

The other second-to-last resort here would be to duplicate the code from
the dma-iommu code and implement the dma-mapping API directly on
top of the dart hardware instead of the iommu layer. This would probably
be much faster than the swiotlb on top of a bypass or a linear map,
but it's a really awful abstraction that would require adding special cases
into a lot of generic code.

>Any driver that tries to use the iommu API directly could be trouble
>as well if they make similar assumptions.

I think pretty much all drivers using the iommu API directly already
depends on having a matching page size.  I don't see any way to use
e.g. PCI device assignment using vfio, or a GPU driver with per-process
contexts when the iotlb page size is larger than the CPU's.

>Is this something you would even want to support in the iommu subsytem
>and is it even possible to do this in a sane way?

I don't know how hard it is to do adjust the dma-iommu implementation
to allow this, but as long as we can work out the DT binding to support
both normal dma-iommu mode with 16KB pages and some kind of
passthrough mode (emulated or not) with 4KB pages, it can be left
as a possible optimization for later.

Arnd


Re: [PATCH v3] drivers: introduce and use WANT_DMA_CMA for soft dependencies on DMA_CMA

2021-04-09 Thread Arnd Bergmann
On Fri, Apr 9, 2021 at 1:21 PM David Hildenbrand  wrote:
>
> Random drivers should not override a user configuration of core knobs
> (e.g., CONFIG_DMA_CMA=n). Applicable drivers would like to use DMA_CMA,
> which depends on CMA, if possible; however, these drivers also have to
> tolerate if DMA_CMA is not available/functioning, for example, if no CMA
> area for DMA_CMA use has been setup via "cma=X". In the worst case, the
> driver cannot do it's job properly in some configurations.
>
> For example, commit 63f5677544b3 ("drm/etnaviv: select CMA and DMA_CMA if
> available") documents
> While this is no build dependency, etnaviv will only work correctly
> on most systems if CMA and DMA_CMA are enabled. Select both options
> if available to avoid users ending up with a non-working GPU due to
> a lacking kernel config.
> So etnaviv really wants to have DMA_CMA, however, can deal with some cases
> where it is not available.
>
> Let's introduce WANT_DMA_CMA and use it in most cases where drivers
> select CMA/DMA_CMA, or depend on DMA_CMA (in a wrong way via CMA because
> of recursive dependency issues).
>
> We'll assume that any driver that selects DRM_GEM_CMA_HELPER or
> DRM_KMS_CMA_HELPER would like to use DMA_CMA if possible.
>
> With this change, distributions can disable CONFIG_CMA or
> CONFIG_DMA_CMA, without it silently getting enabled again by random
> drivers. Also, we'll now automatically try to enabled both, CONFIG_CMA
> and CONFIG_DMA_CMA if they are unspecified and any driver is around that
> selects WANT_DMA_CMA -- also implicitly via DRM_GEM_CMA_HELPER or
> DRM_KMS_CMA_HELPER.
>
> For example, if any driver selects WANT_DMA_CMA and we do a
> "make olddefconfig":
>
> 1. With "# CONFIG_CMA is not set" and no specification of
>"CONFIG_DMA_CMA"
>
> -> CONFIG_DMA_CMA won't be part of .config
>
> 2. With no specification of CONFIG_CMA or CONFIG_DMA_CMA
>
> Contiguous Memory Allocator (CMA) [Y/n/?] (NEW)
> DMA Contiguous Memory Allocator (DMA_CMA) [Y/n/?] (NEW)
>
> 3. With "# CONFIG_CMA is not set" and "# CONFIG_DMA_CMA is not set"
>
> -> CONFIG_DMA_CMA will be removed from .config
>
> Note: drivers/remoteproc seems to be special; commit c51e882cd711
> ("remoteproc/davinci: Update Kconfig to depend on DMA_CMA") explains that
> there is a real dependency to DMA_CMA for it to work; leave that dependency
> in place and don't convert it to a soft dependency.

I don't think this dependency is fundamentally different from the others,
though davinci machines tend to have less memory than a lot of the
other machines, so it's more likely to fail without CMA.

Regardless of this,

Reviewed-by: Arnd Bergmann 


Re: [PATCH v2 00/21] ipmi: Allow raw access to KCS devices

2021-04-09 Thread Arnd Bergmann
On Fri, Apr 9, 2021 at 6:09 AM Joel Stanley  wrote:
> On Thu, 8 Apr 2021 at 23:47, Andrew Jeffery  wrote:
> > On Thu, 8 Apr 2021, at 21:44, Corey Minyard wrote:
> > > On Thu, Apr 08, 2021 at 10:27:46AM +0930, Andrew Jeffery wrote:
> > > There were some minor concerns that were unanswered, and there really
> > > was no review by others for many of the patches.
> >
> > Right; I was planning to clean up the minor concerns once I'd received
> > some more feedback. I could have done a better job of communicating
> > that :)
>
> I'll merge the first five through the aspeed tree this coming merge
> window. We have acks from the relevant maintainers.
>
> Arnd: would you prefer that this come as it's own pull request, or as
> part of the device tree branch?

When you are unsure, it's almost never wrong to go for a separate
branch, which gives you a chance to have a concise description
of the contents in the tag. This would be particularly helpful if there
are incompatible changes to the DT binding that require a justification.

If you are only adding a few DT nodes to existing files, then merging
these through the regular branch is probably easier.

   Arnd


Re: [PATCH v2 00/10] Initial support for Nuvoton WPCM450 BMC SoC

2021-04-09 Thread Arnd Bergmann
On Fri, Apr 9, 2021 at 9:58 AM Jonathan Neuschäfer
 wrote:
> On Fri, Apr 09, 2021 at 04:37:34AM +, Joel Stanley wrote:
> > On Tue, 6 Apr 2021 at 21:59, Jonathan Neuschäfer  
> > wrote:
> > I've had a look at the series and it looks good to me:
> >
> > Reviewed-by: Joel Stanley 
> >
> > Nice work Jonathan.
> >
> > I'll put this in it's own branch along with the bindings change it
> > depends on and send a pull request to Arnd for v5.13.
>
> Thanks a bunch!
>
> A few patches are going through other branches (IRQ bindings+driver;
> watchdog bindings+driver probably, I guess). That leaves the following
> patches to go into your branch, AFAIUI:
>
> [PATCH v2 01/10] dt-bindings: vendor-prefixes: Add Supermicro
> [PATCH v2 02/10] dt-bindings: arm: npcm: Add nuvoton,wpcm450 compatible string
> [PATCH v2 05/10] ARM: npcm: Introduce Nuvoton WPCM450 SoC
> [PATCH v2 08/10] ARM: dts: Add devicetree for Nuvoton WPCM450 BMC chip
> [PATCH v2 09/10] ARM: dts: Add devicetree for Supermicro X9SCi-LN4F based on 
> WPCM450
> [PATCH v2 10/10] MAINTAINERS: Add entry for Nuvoton WPCM450

Actually for an initial merge, we sometimes just put all the patches into one
branch in the soc tree to avoid conflicts. Unfortunately we already have a
(trivial) conflict now anyway since I merged the irqchip driver for the Apple
M1 SoC through the soc tree but the wpcm irqchip through the irqchip tree.

You did nothing wrong here, this would have just been a way to make the
initial merge a bit easier, and have a tree that is more easily bisectible
when one branch in the merge window contains all the code that is
needed for booting.

Arnd


Re: [PATCH v2 16/21] ipmi: kcs_bmc: Add a "raw" character device interface

2021-04-09 Thread Arnd Bergmann
On Fri, Mar 19, 2021 at 7:31 AM Andrew Jeffery  wrote:
>
> The existing IPMI chardev encodes IPMI behaviours as the name suggests.
> However, KCS devices are useful beyond IPMI (or keyboards), as they
> provide a means to generate IRQs and exchange arbitrary data between a
> BMC and its host system.

I only noticed the series after Joel asked about the DT changes on the arm
side. One question though:

How does this related to the drivers/input/serio/ framework that also talks
to the keyboard controller for things that are not keyboards? Are these
separate communication channels on adjacent I/O ports, or does there
need to be some arbitration?

   Arnd


[PATCH] ARM: dts: clps711x: fix missing interrupt parent

2021-04-09 Thread Arnd Bergmann
From: Arnd Bergmann 

The kernelci.org bot reports a build time regression:

arch/arm/boot/dts/ep7209.dtsi:187.17-192.4: Warning (interrupts_property): 
/keypad: Missing interrupt-parent

There is only one interrupt controller in this SoC, so I assume this
is the parent.

Fixes: 2bd86203acf3 ("ARM: dts: clps711x: Add keypad node")
Reported-by: kernelci.org bot 
Signed-off-by: Arnd Bergmann 
---
I applied this fixup on top of the arm/dt branch, so it will be part of
linux-next and the v5.13 pull requests.

 arch/arm/boot/dts/ep7209.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/ep7209.dtsi b/arch/arm/boot/dts/ep7209.dtsi
index 40a277370fd0..57bdad2c1994 100644
--- a/arch/arm/boot/dts/ep7209.dtsi
+++ b/arch/arm/boot/dts/ep7209.dtsi
@@ -186,6 +186,7 @@ syscon3: syscon@80002200 {
 
keypad: keypad {
compatible = "cirrus,ep7209-keypad";
+   interrupt-parent = <>;
interrupts = <16>;
syscon = <>;
status = "disabled";
-- 
2.29.2



[PATCH] ARM: dts: mvebu: fix SPI device node

2021-04-09 Thread Arnd Bergmann
From: Arnd Bergmann 

dtc warns about a mismatched address:

arch/arm/boot/dts/armada-385-atl-x530.dts:171.14-199.4: Warning (spi_bus_reg): 
/soc/spi@10680/spi-flash@0: SPI bus unit address format error, expected "1"

I assume the "reg" property is correct here, so adjust the unit address
accordingly.

Fixes: c6dfc019c239 ("ARM: dts: mvebu: Add device tree for ATL-x530 Board")
Reported-by: Stephen Rothwell 
Reported-by: kernelci.org bot 
Signed-off-by: Arnd Bergmann 
---
I applied this fixup to the arm/dt branch on top of the new file, this
will be part of linux-next and the v5.13 pull request

 arch/arm/boot/dts/armada-385-atl-x530.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/armada-385-atl-x530.dts 
b/arch/arm/boot/dts/armada-385-atl-x530.dts
index 2041bf09c578..ed3f41c7df71 100644
--- a/arch/arm/boot/dts/armada-385-atl-x530.dts
+++ b/arch/arm/boot/dts/armada-385-atl-x530.dts
@@ -168,7 +168,7 @@  {
pinctrl-0 = <_pins>;
status = "okay";
 
-   spi-flash@0 {
+   spi-flash@1 {
#address-cells = <1>;
#size-cells = <1>;
compatible = "jedec,spi-nor";
-- 
2.29.2



Re: [PATCH v2] clk: socfpga: fix iomem pointer cast on 64-bit

2021-04-09 Thread Arnd Bergmann
From: Arnd Bergmann 

On Sun, 14 Mar 2021 12:07:09 +0100, Krzysztof Kozlowski wrote:
> Pointers should be cast with uintptr_t instead of integer.  This fixes
> warning when compile testing on ARM64:
> 
>   drivers/clk/socfpga/clk-gate.c: In function ‘socfpga_clk_recalc_rate’:
>   drivers/clk/socfpga/clk-gate.c:102:7: warning: cast from pointer to integer 
> of different size [-Wpointer-to-int-cast]

I decided to pull that into the arm/drivers branch as well, to avoid the
build regression. Since the same fix is in the clk tree, there will now
be a duplicated commit in the git history, but I prefer that over introducing
warnings.

[1/1] clk: socfpga: fix iomem pointer cast on 64-bit
  commit: 36841008059caec9667459a7e126efac6379676b

   Arnd


Re: [PATCH 2/2] pm: allow drivers to drop #ifdef and __maybe_unused from pm callbacks

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 11:00 PM Masahiro Yamada  wrote:
>
> Drivers typically surround suspend and resume callbacks with #ifdef
> CONFIG_PM(_SLEEP) or mark them as __maybe_unused in order to avoid
> -Wunused-const-variable warnings.
>
> With this commit, drivers will be able to remove #ifdef CONFIG_PM(_SLEEP)
> and __maybe_unsed because unused functions are dropped by the compiler
> instead of the preprocessor.
>
> Signed-off-by: Masahiro Yamada 

I tried this before and could not get it to work right.

>
> -#ifdef CONFIG_PM_SLEEP
> +#define pm_ptr(_ptr)   PTR_IF(IS_ENABLED(CONFIG_PM), _ptr)
> +#define pm_sleep_ptr(_ptr) PTR_IF(IS_ENABLED(CONFIG_PM_SLEEP), _ptr)
> +
>  #define SET_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn) \
> -   .suspend = suspend_fn, \
> -   .resume = resume_fn, \
> -   .freeze = suspend_fn, \
> -   .thaw = resume_fn, \
> -   .poweroff = suspend_fn, \
> -   .restore = resume_fn,
> -#else
> -#define SET_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn)
> -#endif
> +   .suspend  = pm_sleep_ptr(suspend_fn), \
> +   .resume   = pm_sleep_ptr(resume_fn), \
> +   .freeze   = pm_sleep_ptr(suspend_fn), \
> +   .thaw = pm_sleep_ptr(resume_fn), \
> +   .poweroff = pm_sleep_ptr(suspend_fn), \
> +   .restore  = pm_sleep_ptr(resume_fn),

The problem that I think you inevitably hit is that you run into a missing
declaration for any driver that still uses an #ifdef around a static
function.

The only way I can see us doing this is to create a new set of
macros that behave like the version you propose here but leave
the old macros in place until the last such #ifdef has been removed.

   Arnd


Re: [PATCH v1 2/2] drivers/gpu/drm: don't select DMA_CMA or CMA from aspeed or etnaviv

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 6:45 PM David Hildenbrand  wrote:
> On 08.04.21 14:49, Linus Walleij wrote:
> > On Thu, Apr 8, 2021 at 2:01 PM David Hildenbrand  wrote:
> >
> >>> This is something you could do using a hidden helper symbol like
> >>>
> >>> config DRMA_ASPEED_GFX
> >>>  bool "Aspeed display driver"
> >>>  select DRM_WANT_CMA
> >>>
> >>> config DRM_WANT_CMA
> >>>  bool
> >>>  help
> >>> Select this from any driver that benefits from CMA being 
> >>> enabled
> >>>
> >>> config DMA_CMA
> >>>  bool "Use CMA helpers for DRM"
> >>>  default DRM_WANT_CMA
> >>>
> >>>Arnd
> >>>
> >>
> >> That's precisely what I had first, with an additional "WANT_CMA" --  but
> >> looking at the number of such existing options (I was able to spot 1 !)
> >
> > If you do this it probably makes sense to fix a few other drivers
> > Kconfig in the process. It's not just a problem with your driver.
> > "my" drivers:
> >
>
> :) I actually wanted to convert them to "depends on DMA_CMA" but ran
> into recursive dependencies ...
>
> > drivers/gpu/drm/mcde/Kconfig
> > drivers/gpu/drm/pl111/Kconfig
> > drivers/gpu/drm/tve200/Kconfig

Right, this is the main problem caused by using 'select' to
force-enable symbols that other drivers depend on.

Usually, the answer is to be consistent about the use of 'select'
and 'depends on', using the former only to enable symbols that
are hidden, while using 'depends on' for anything that is an
actual build time dependency.

> I was assuming these are "real" dependencies. Will it also work without
> DMA_CMA?

I think in this case, it is fairly likely to work without DMA_CMA when the
probe function gets called during a fresh boot, but fairly likely to fail if
it gets called after the system has run for long enough to fragment the
free memory.

The point of DMA_CMA is to make it work reliably.

  Arnd


Re: [PATCH v1 2/2] drivers/gpu/drm: don't select DMA_CMA or CMA from aspeed or etnaviv

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 2:50 PM Linus Walleij  wrote:
>
> On Thu, Apr 8, 2021 at 2:01 PM David Hildenbrand  wrote:
>
> > > This is something you could do using a hidden helper symbol like
> > >
> > > config DRMA_ASPEED_GFX
> > > bool "Aspeed display driver"
> > > select DRM_WANT_CMA
> > >
> > > config DRM_WANT_CMA
> > > bool
> > > help
> > >Select this from any driver that benefits from CMA being 
> > > enabled
> > >
> > > config DMA_CMA
> > > bool "Use CMA helpers for DRM"
> > > default DRM_WANT_CMA
> > >
> > >   Arnd
> > >
> >
> > That's precisely what I had first, with an additional "WANT_CMA" --  but
> > looking at the number of such existing options (I was able to spot 1 !)
>
> If you do this it probably makes sense to fix a few other drivers
> Kconfig in the process. It's not just a problem with your driver.
> "my" drivers:
>
> drivers/gpu/drm/mcde/Kconfig
> drivers/gpu/drm/pl111/Kconfig
> drivers/gpu/drm/tve200/Kconfig
>
> certainly needs this as well, and pretty much anything that is
> selecting DRM_KMS_CMA_HELPER or
> DRM_GEM_CMA_HELPER "wants" DMA_CMA.

Are there any that don't select either of the helpers and
still want CMA? If not, it would be easy to just add

   default  DRM_KMS_CMA_HELPER || DRM_GEM_CMA_HELPER

and skipt the extra symbol.

Arnd


Re: [PATCH v1 2/2] drivers/gpu/drm: don't select DMA_CMA or CMA from aspeed or etnaviv

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 2:00 PM David Hildenbrand  wrote:
>
> On 08.04.21 13:44, Arnd Bergmann wrote:
> > On Thu, Apr 8, 2021 at 1:00 PM David Hildenbrand  wrote:
> >>>
> >>> It is a somewhat awkward way to say "prevent this symbol from
> >>> being =y if the dependency is =m".
> >>
> >> What would be the right thing to do in the case here then to achieve the
> >> "if DRMA_ASPEED_GFX is enabled, also enable DMA_CMA id possible"?
> >>
> >> One approach could be to have for DMA_CMA
> >>
> >> default y if DRMA_ASPEED_GFX
> >>
> >> but it feels like the wrong way to tackle this.
> >
> > I'm still not sure what you are trying to achieve. Is the idea only to 
> > provide
> > a useful default for DMA_CMA depending on which drivers are enabled?
>
> "Random drivers should not override a user configuration of core knobs
> (e.g., CONFIG_DMA_CMA=n)."
>
> Let's assume I'm a distribution and want to set CONFIG_CMA=n or want to
> set CONFIG_DMA_CMA=n with CONFIG_CMA=y; there is no way to do that with
> e.g., DRMA_ASPEED_GFX=y because it will always override my (user!)
> setting -- even though it doesn't really always need it. Using "select"
> is the problem here.

I agree on the part of removing the 'select' if we don't need it. The
part I couldn't figure out was what the 'imply' is supposed to help with.
Most other users that added imply tried (and failed) to fix a build problem.

> > This is something you could do using a hidden helper symbol like
> >
> > config DRMA_ASPEED_GFX
> > bool "Aspeed display driver"
> > select DRM_WANT_CMA
> >
> > config DRM_WANT_CMA
> > bool
> > help
> >Select this from any driver that benefits from CMA being enabled
> >
> > config DMA_CMA
> > bool "Use CMA helpers for DRM"
> > default DRM_WANT_CMA
> >
> >   Arnd
> >
>
> That's precisely what I had first, with an additional "WANT_CMA" --  but
> looking at the number of such existing options (I was able to spot 1 !)
> I wondered if there is a better approach to achieve the same; "imply"
> sounded like a good candidate.

I can probably find a couple more, but regardless of how many others
exist, this would be a much clearer way of doing it than 'imply' since it
has none of the ambiguity and misuse problems.

I think the reason we don't see more is that generally speaking, those
defaults are widely ignored anyway. You almost always start out with
a defconfig file that contains everything you know you need, and then
you add bits to that. Having the default in any form only helps to
make that defconfig file one line shorter, while requiring other users
to add another line to turn it off when they do not want it.

 Arnd


Re: [PATCH v1 2/2] drivers/gpu/drm: don't select DMA_CMA or CMA from aspeed or etnaviv

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 1:00 PM David Hildenbrand  wrote:
> >
> > It is a somewhat awkward way to say "prevent this symbol from
> > being =y if the dependency is =m".
>
> What would be the right thing to do in the case here then to achieve the
> "if DRMA_ASPEED_GFX is enabled, also enable DMA_CMA id possible"?
>
> One approach could be to have for DMA_CMA
>
> default y if DRMA_ASPEED_GFX
>
> but it feels like the wrong way to tackle this.

I'm still not sure what you are trying to achieve. Is the idea only to provide
a useful default for DMA_CMA depending on which drivers are enabled?

This is something you could do using a hidden helper symbol like

config DRMA_ASPEED_GFX
   bool "Aspeed display driver"
   select DRM_WANT_CMA

config DRM_WANT_CMA
   bool
   help
  Select this from any driver that benefits from CMA being enabled

config DMA_CMA
   bool "Use CMA helpers for DRM"
   default DRM_WANT_CMA

 Arnd


Re: [PATCH v1 2/2] drivers/gpu/drm: don't select DMA_CMA or CMA from aspeed or etnaviv

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 12:29 PM David Hildenbrand  wrote:
> On 08.04.21 12:20, Arnd Bergmann wrote:
> > On Thu, Apr 8, 2021 at 11:22 AM David Hildenbrand  wrote:
> >>
> >> Random drivers should not override a user configuration of core knobs
> >> (e.g., CONFIG_DMA_CMA=n). Use "imply" instead, to still respect
> >> dependencies and manual overrides.
> >>
> >> "This is similar to "select" as it enforces a lower limit on another
> >>   symbol except that the "implied" symbol's value may still be set to n
> >>   from a direct dependency or with a visible prompt."
> >>
> >> Implying DRM_CMA should be sufficient, as that depends on CMA.
> >>
> >> Note: If this is a real dependency, we should use "depends on DMA_CMA"
> >> instead -  but I assume the driver can work without CMA just fine --
> >> esp. when we wouldn't have HAVE_DMA_CONTIGUOUS right now.
> >
> > 'imply' is almost never the right solution, and it tends to cause more
> > problems than it solves.
>
> I thought that was the case with "select" :)

Yes, but that's a different set of problems

> >
> > In particular, it does not prevent a configuration with 'DRM_CMA=m'
>
> I assume you meant "DRM_CMA=n" ? DRM_CMA cannot be built as a module.

Ok, at least that makes it easier.

> > and 'DRMA_ASPEED_GFX=y', or any build failures from such
> > a configuration.
>
> I don't follow. "DRM_CMA=n" and 'DRMA_ASPEED_GFX=y' is supposed to work
> just fine (e.g., without HAVE_DMA_CONTIGUOUS) or what am I missing?

I thought you were trying to solve the problem where DRMA_ASPEED_GFX
can optionally link against CMA but would fail to build when the CMA code
is in a loadable module.

If the problem you are trying to solve is a different one, you need a different
solution, not what I posted above.

> > If you want this kind of soft dependency, you need
> > 'depends on DRM_CMA || !DRM_CMA'.
>
> Seriously? I think the point of imply is "please enable if possible and
> not prevented by someone else".

That used to be the meaning, but it changed a few years ago. Now
it means "when a used manually turns on this symbol, turn on the
implied one as well, but let them turn it off again if they choose".

This is pretty much a NOP.

> Your example looks more like a NOP - no?
> Or will it have the same effect?

The example I gave is only meaningful if both are tristate, which is
not the case here as you explain.

It is a somewhat awkward way to say "prevent this symbol from
being =y if the dependency is =m".

  Arnd


Re: [PATCH v1 2/2] drivers/gpu/drm: don't select DMA_CMA or CMA from aspeed or etnaviv

2021-04-08 Thread Arnd Bergmann
On Thu, Apr 8, 2021 at 11:22 AM David Hildenbrand  wrote:
>
> Random drivers should not override a user configuration of core knobs
> (e.g., CONFIG_DMA_CMA=n). Use "imply" instead, to still respect
> dependencies and manual overrides.
>
> "This is similar to "select" as it enforces a lower limit on another
>  symbol except that the "implied" symbol's value may still be set to n
>  from a direct dependency or with a visible prompt."
>
> Implying DRM_CMA should be sufficient, as that depends on CMA.
>
> Note: If this is a real dependency, we should use "depends on DMA_CMA"
> instead -  but I assume the driver can work without CMA just fine --
> esp. when we wouldn't have HAVE_DMA_CONTIGUOUS right now.

'imply' is almost never the right solution, and it tends to cause more
problems than it solves.

In particular, it does not prevent a configuration with 'DRM_CMA=m'
and 'DRMA_ASPEED_GFX=y', or any build failures from such
a configuration.

If you want this kind of soft dependency, you need
'depends on DRM_CMA || !DRM_CMA'.

 Arnd


Re: arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'

2021-04-07 Thread Arnd Bergmann
On Wed, Apr 7, 2021 at 8:07 PM Nick Desaulniers  wrote:
>
> On Wed, Apr 7, 2021 at 3:52 AM kernel test robot  wrote:
> >
> > Hi Kees,
> >
> > FYI, the error/warning still remains.
> >
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> > master
> > head:   2d743660786ec51f5c1fefd5782bbdee7b227db0
> > commit: 5a17850e251a55bba6d65aefbfeacfa9888cd2cd arm/build: Warn on orphan 
> > section placement
> > date:   7 months ago
> > config: arm-randconfig-r033-20210407 (attached as .config)
> > compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> > wget 
> > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> > ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5a17850e251a55bba6d65aefbfeacfa9888cd2cd
> > git remote add linus 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > git fetch --no-tags linus master
> > git checkout 5a17850e251a55bba6d65aefbfeacfa9888cd2cd
> > # save the attached .config to linux build tree
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> > ARCH=arm
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> >
> > All warnings (new ones prefixed by >>):
> >
> > >> arm-linux-gnueabi-ld: warning: orphan section 
> > >> `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' 
> > >> being placed in section `__cpuidle_method_of_table'
> > >> arm-linux-gnueabi-ld: warning: orphan section 
> > >> `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' 
> > >> being placed in section `__cpuidle_method_of_table'
> > >> arm-linux-gnueabi-ld: warning: orphan section 
> > >> `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' 
> > >> being placed in section `__cpuidle_method_of_table'
>
> Looks like arch/arm/include/asm/cpuidle.h defines
> `CPUIDLE_METHOD_OF_DECLARE` to create a static struct in such a
> section.  Only arch/arm/mach-omap2/pm33xx-core.c uses that macro.

Nick, Kees,

Should I resend my patch, or are you taking care of it?

https://lore.kernel.org/lkml/20201230155506.1085689-1-a...@kernel.org/T/#u

 Arnd


[irqchip: irq/irqchip-next] irqchip/gic-v3: Fix OF_BAD_ADDR error handling

2021-04-07 Thread irqchip-bot for Arnd Bergmann
The following commit has been merged into the irq/irqchip-next branch of 
irqchip:

Commit-ID: 8e13d96670a4c050d4883e6743a9e9858e5cfe10
Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/8e13d96670a4c050d4883e6743a9e9858e5cfe10
Author:Arnd Bergmann 
AuthorDate:Tue, 23 Mar 2021 14:18:35 +01:00
Committer: Marc Zyngier 
CommitterDate: Wed, 07 Apr 2021 13:25:52 +01:00

irqchip/gic-v3: Fix OF_BAD_ADDR error handling

When building with extra warnings enabled, clang points out a
mistake in the error handling:

drivers/irqchip/irq-gic-v3-mbi.c:306:21: error: result of comparison of 
constant 18446744073709551615 with expression of type 'phys_addr_t' (aka 
'unsigned int') is always false 
[-Werror,-Wtautological-constant-out-of-range-compare]
if (mbi_phys_base == OF_BAD_ADDR) {

Truncate the constant to the same type as the variable it gets compared
to, to shut make the check work and void the warning.

Fixes: 505287525c24 ("irqchip/gic-v3: Add support for Message Based Interrupts 
as an MSI controller")
Signed-off-by: Arnd Bergmann 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210323131842.2773094-1-a...@kernel.org
---
 drivers/irqchip/irq-gic-v3-mbi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index 563a9b3..e81e89a 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -303,7 +303,7 @@ int __init mbi_init(struct fwnode_handle *fwnode, struct 
irq_domain *parent)
reg = of_get_property(np, "mbi-alias", NULL);
if (reg) {
mbi_phys_base = of_translate_address(np, reg);
-   if (mbi_phys_base == OF_BAD_ADDR) {
+   if (mbi_phys_base == (phys_addr_t)OF_BAD_ADDR) {
ret = -ENXIO;
goto err_free_mbi;
}


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-04-07 Thread Arnd Bergmann
On Wed, Apr 7, 2021 at 1:36 PM Peter Zijlstra  wrote:
>
> On Wed, Apr 07, 2021 at 10:42:50AM +0200, Arnd Bergmann wrote:
> > Since there are really only a handful of instances in the kernel
> > that use the cmpxchg() or xchg() on u8/u16 variables, it would seem
> > best to just disallow those completely
>
> Not going to happen. xchg16 is optimal for qspinlock and if we replace
> that with a cmpxchg loop on x86 we're regressing.

I did not mean to remove the possibility of doing a 16-bit compare-exchange
operation altogether, just removing it from the regular macros.

We already do this for the 64-bit xchg64()/cmpxchg64(), which only some
of the 32-bit architectures provide, so I think having an explicit
xchg8()/xchg16()/cmpxchg8()/cmpxchg16() interface while tightening the
type checking would be more consistent here.

On any 32-bit architecture, cmpxchg()/xchg() macros then only have
to deal with word-sized operations.

> > Interestingly, the s390 version using __sync_val_compare_and_swap()
> > seems to produce nice output on all architectures that have atomic
> > instructions, with any supported compiler, to the point where I think
> > we could just use that to replace most of the inline-asm versions except
> > for arm64:
> >
> > #define cmpxchg(ptr, o, n)  \
> > ({  \
> > __typeof__(*(ptr)) __o = (o);   \
> > __typeof__(*(ptr)) __n = (n);   \
> > (__typeof__(*(ptr))) __sync_val_compare_and_swap((ptr),__o,__n);\
> > })
>
> It generates the LL/SC loop, but doesn't do sensible optimizations when
> it's again used in a loop itself. That is, it generates a loop of a
> loop, just like what you'd expect, which is sub-optimal for LL/SC.

One thing that it does though is to generate an ll/sc loop for xchg16(),
while mips, openrisc, xtensa and sparc64 currently open-code the
nested loop in their respective xchg16() wrappers. I have not seen
any case in which it produces object code that is worse than the
architecture specific code does today, except for those that rely on
runtime patching (i486+smp, arm64+lse).

> > Not how gcc's acquire/release behavior of __sync_val_compare_and_swap()
> > relates to what the kernel wants here.
> >
> > The gcc documentation also recommends using the standard
> > __atomic_compare_exchange_n() builtin instead, which would allow
> > constructing release/acquire/relaxed versions as well, but I could not
> > get it to produce equally good output. (possibly I was using it wrong)
>
> I'm scared to death of the C11 crap, the compiler will 'optimize' them
> when it feels like it and use the C11 memory model rules for it, which
> are not compatible with the kernel rules.
>
> But the same thing applies, it won't do the right thing for composites.

Makes sense. As I said, I could not even get it to produce optimal code
for the simple case.

 Arnd


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-04-07 Thread Arnd Bergmann
On Tue, Apr 6, 2021 at 10:56 AM Stafford Horne  wrote:
> On Tue, Apr 06, 2021 at 11:50:38AM +0800, Guo Ren wrote:
> > On Wed, Mar 31, 2021 at 3:23 PM Arnd Bergmann  wrote:
> > > On Wed, Mar 31, 2021 at 12:35 AM Stafford Horne  wrote:
> >
> > We shouldn't export xchg16/cmpxchg16(emulated by lr.w/sc.w) in riscv,
> > We should forbid these sub-word atomic primitive and lets the
> > programmers consider their atomic design.
>
> Fair enough, having the generic sub-word emulation would be something that
> an architecture can select to use/export.

I still have the feeling that we should generalize and unify the exact behavior
across architectures as much as possible here, while possibly also trying to
simplify the interface a little.

Looking through the various xchg()/cmpxchg() implementations, I find eight
distinct ways to do 8-bit and 16-bit atomics:

Full support:
  ia64, m68k (Atari only), x86, arm32 (v6k+), arm64

gcc/clang __sync_{val,bool}_compare_and_swap:
 s390

Emulated through ll/sc:
  alpha, powerpc

Emulated through cmpxchg loop:
  mips, openrisc, xtensa (xchg but not cmpxchg), sparc64 (cmpxchg_u8,
  xchg_u16 but not cmpxchg_u16 and xchg_u8!)

Emulated through local_irq_save (non SMP only):
h8300, m68k (most), microblaze, mips, nds32, nios2

Emulated through hashed spinlock:
parisc (8-bit only added in 2020, 16-bit still missing)

Forced compile-time error:
   arm32 (v4/v5/v6 non-SMP), arc, csky, riscv, parisc (16 bit), sparc32,
   sparc64, xtensa (cmpxchg)

Silently broken:
hexagon

Since there are really only a handful of instances in the kernel
that use the cmpxchg() or xchg() on u8/u16 variables, it would seem
best to just disallow those completely and have a separate set of
functions here, with only 64-bit architectures using any variable-type
wrapper to handle both 32-bit and 64-bit arguments.

Interestingly, the s390 version using __sync_val_compare_and_swap()
seems to produce nice output on all architectures that have atomic
instructions, with any supported compiler, to the point where I think
we could just use that to replace most of the inline-asm versions except
for arm64:

#define cmpxchg(ptr, o, n)  \
({  \
__typeof__(*(ptr)) __o = (o);   \
__typeof__(*(ptr)) __n = (n);   \
(__typeof__(*(ptr))) __sync_val_compare_and_swap((ptr),__o,__n);\
})

Not how gcc's acquire/release behavior of __sync_val_compare_and_swap()
relates to what the kernel wants here.

The gcc documentation also recommends using the standard
__atomic_compare_exchange_n() builtin instead, which would allow
constructing release/acquire/relaxed versions as well, but I could not
get it to produce equally good output. (possibly I was using it wrong)

   Arnd


Re: linux-next: Signed-off-by missing for commit in the arm-soc tree

2021-04-06 Thread Arnd Bergmann
On Tue, Apr 6, 2021 at 12:11 AM Stephen Rothwell  wrote:
>
> Hi all,
>
> On Tue, 6 Apr 2021 08:11:00 +1000 Stephen Rothwell  
> wrote:
> >
> > Hi all,
> >
> > Commit
> >
> >   3b493ac0ac04 ("arm64: dts: allwinner: h6: Switch to macros for RSB 
> > clock/reset indices")
> >
> > is missing a Signed-off-by from its committer.
>
> Sorry, that commit is in the arm-soc-fixes tree.

Thanks for the report, I've temporarily removed the sunx-fixes branch merge
from the arm/fixes branch and will send the pull request without it.

Maxime, can you fix it up and resend the pull request?
Feel free to add any other fixes that have come up since then.

   Arnd


Re: [PATCH v2 00/10] Initial support for Nuvoton WPCM450 BMC SoC

2021-04-06 Thread Arnd Bergmann
On Tue, Apr 6, 2021 at 2:09 PM Jonathan Neuschäfer
 wrote:
>
> This series adds basic support for the Nuvoton WPCM450 BMC SoC. It's an older
> SoC but still commonly found on eBay, mostly in Supermicro X9 server boards.
>
> Third-party documentation is available at: 
> https://github.com/neuschaefer/wpcm450/wiki
>
> Patches 1-4 add devicetree bindings for the WPCM450 SoC and its various parts.
> Patches 5-7 add arch and driver support. Patches 8 and 9 add a devicetree for
> the SoC and a board based on it. Patch 10 finally updates the MAINTAINERS 
> file.
>
> Patch 2 requires "dt-bindings: arm: Convert nuvoton,npcm750 binding to YAML"
> (https://lore.kernel.org/lkml/20210320164023.614059-1-j.neuschae...@gmx.net/)

Hi Jonathan,

It appears these patches are doing roughly the right thing, and we may still
be able to get them into v5.13, but I'm not sure what your plan for maintaining
them is. The two options are that you either send your patches to be picked up
by Joel, or you send everything directly to s...@kernel.org once it's fully
reviewed.

I only noticed your series when patch 9/10 made it into the s...@kernel.org
patchwork because of the Cc, but none of the other ones did.

If you end up with the second option, we can go through what this involves
off-list.

Regarding the Cc:s...@kernel.org, please add that only for patches that
are already reviewed and ready to be picked up, ideally with a cover letter
that describes what the plan is for merging. If you need me to review the
platform code, use my a...@arndb.de or a...@kernel.org addresses.

  Arnd


Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-06 Thread Arnd Bergmann
On Tue, Apr 6, 2021 at 3:31 PM Andy Shevchenko
 wrote:
>
> kernel.h is being used as a dump for all kinds of stuff for a long time.
> Here is the attempt to start cleaning it up by splitting out panic and
> oops helpers.
>
> At the same time convert users in header and lib folder to use new header.
> Though for time being include new header back to kernel.h to avoid twisted
> indirected includes for existing users.
>
> Signed-off-by: Andy Shevchenko 

Nice!

Acked-by: Arnd Bergmann 


Re:

2021-04-06 Thread Arnd Bergmann
On Mon, Apr 5, 2021 at 2:03 AM Mitali Borkar  wrote:
>
> outreachy-ker...@googlegroups.com, mitaliborkar...@gmail.com
> Bcc:
> Subject: [PATCH] staging: qlge:remove else after break
> Reply-To:
>
> Fixed Warning:- else is not needed after break
> break terminates the loop if encountered. else is unnecessary and
> increases indenatation
>
> Signed-off-by: Mitali Borkar 
> ---
>  drivers/staging/qlge/qlge_mpi.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/staging/qlge/qlge_mpi.c b/drivers/staging/qlge/qlge_mpi.c
> index 2630ebf50341..3a49f187203b 100644
> --- a/drivers/staging/qlge/qlge_mpi.c
> +++ b/drivers/staging/qlge/qlge_mpi.c
> @@ -935,13 +935,11 @@ static int qlge_idc_wait(struct qlge_adapter *qdev)
> netif_err(qdev, drv, qdev->ndev, "IDC Success.\n");
> status = 0;
> break;
> -   } else {
> -   netif_err(qdev, drv, qdev->ndev,
> +   }   netif_err(qdev, drv, qdev->ndev,
>   "IDC: Invalid State 0x%.04x.\n",
>   mbcp->mbox_out[0]);
> status = -EIO;
> break;
> -   }
> }

It looks like you got this one wrong in multiple ways:

- This is not an equivalent transformation, since the errror is now
  printed in the first part of the 'if()' block as well.

- The indentation is wrong now, with the netif_err() starting in the
  same line as the '}'.

- The description mentions a change in indentation, but you did not
   actually change it.

- The changelog text appears mangled.

Arnd


Re: [PATCH 1/8] dt-bindings: clk: mstar msc313 cpupll binding description

2021-04-01 Thread Arnd Bergmann
On Fri, Feb 26, 2021 at 12:31 PM Daniel Palmer  wrote:
>
> Hi Rob's bot
>
> On Wed, 24 Feb 2021 at 04:34, Rob Herring  wrote:
> > dtschema/dtc warnings/errors:
> > Documentation/devicetree/bindings/clock/mstar,msc313-cpupll.example.dts:19:18:
> >  fatal error: dt-bindings/clock/mstar-msc313-mpll.h: No such file or 
> > directory
> >19 | #include 
> >   |  ^~~
> > compilation terminated.
> > make[1]: *** [scripts/Makefile.lib:344: 
> > Documentation/devicetree/bindings/clock/mstar,msc313-cpupll.example.dt.yaml]
> >  Error 1
> > make[1]: *** Waiting for unfinished jobs
> > make: *** [Makefile:1370: dt_binding_check] Error 2
>
> Looks like I sent this too early. I will try again later.

I found this is still in patchwork as not merged, and I have not
seen a replacement. Marking all eight patches as 'changes requested' now,
please resend.

 Arnd


Re: [PATCH v6 00/12] lib/find_bit: fast path for small bitmaps

2021-04-01 Thread Arnd Bergmann
On Thu, Apr 1, 2021 at 11:16 AM Andy Shevchenko
 wrote:
>
> On Thu, Apr 1, 2021 at 3:36 AM Yury Norov  wrote:
> >
> > Bitmap operations are much simpler and faster in case of small bitmaps
> > which fit into a single word. In linux/bitmap.c we have a machinery that
> > allows compiler to replace actual function call with a few instructions
> > if bitmaps passed into the function are small and their size is known at
> > compile time.
> >
> > find_*_bit() API lacks this functionality; but users will benefit from it
> > a lot. One important example is cpumask subsystem when
> > NR_CPUS <= BITS_PER_LONG.
>
> Cool, thanks!
>
> I guess it's assumed to go via Andrew's tree.
>
> But after that since you are about to be a maintainer of this, I think
> it would make sense to send PRs directly to Linus. I would recommend
> creating an official tree (followed by an update in the MAINTAINERS)
> and connecting it to Linux next (usually done by email to Stephen).

It depends on how often we expect to see updates to this. I have not
followed the changes as closely as I should have, but I can also
merge them through the asm-generic tree for this time so Andrew
has to carry fewer patches for this.

I normally don't have a lot of material for asm-generic either, half
the time there are no pull requests at all for a given release. I would
expect future changes to the bitmap implementation to only need
an occasional bugfix, which could go through either the asm-generic
tree or through mm and doesn't need another separate pull request.

If it turns out to be a tree that needs regular updates every time,
then having a top level repository in linux-next would be appropriate.

Arnd


Re: [PATCH] arm64: Add support for cisco craw64 ARMv8 SoCs

2021-03-31 Thread Arnd Bergmann
On Wed, Mar 31, 2021 at 7:57 PM Daniel Walker  wrote:
>
> On Wed, Mar 31, 2021 at 09:04:15AM +0200, Arnd Bergmann wrote:
> > On Wed, Mar 31, 2021 at 3:46 AM Daniel Walker  wrote:
> > > From: Ofer Licht 
> >
> > Thanks for the submission, it's always nice to see a new platform
>
>
> > > Define craw64 config, dts and Makefile for Cisco
> > > SoCs known as Craw.
> >
> > I'd like some more information about the platform, e.g. the target
> > market and maybe a link to the product information.
>
> Our SoC is produced as an internal product. So SoC specifications aren't
> widely available.
>
> Here is an example of a Cisco product which uses this SoC,
>
> https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-9200-series-switches/nb-06-cat9200-ser-data-sheet-cte-en.html
>
> I suspect that's not really what your looking for tho.

No, that's pretty much exactly what I was looking for. Just put this one
sentence and the link into the patch description and it will be fine.

> >
> > The memory size is usually filled by the boot loader, just put an
> > empty node into the .dtsi file
>
> Arnd, I must regretfully inform you that Cisco has a deep dark addiction to
> bootloaders which, are, um, how do I say this diplomatically, um , brain dead.
>
> You have some other comments below related to moving things into the 
> bootloader,
> and I can look into it, but bootloader inflexibility is wide spread inside
> Cisco.

Ok, no worries. If the bootloader can do it right, you should do it, but
this part is not essential.

Arnd


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread Arnd Bergmann
On Wed, Mar 31, 2021 at 12:35 AM Stafford Horne  wrote:
>
> I just want to chime in here, there may be a better spot in the thread to
> mention this but, for OpenRISC I did implement some generic 8/16-bit xchg code
> which I have on my todo list somwhere to replace the other generic
> implementations like that in mips.
>
>   arch/openrisc/include/asm/cmpxchg.h
>
> The idea would be that architectures just implement these methods:
>
>   long cmpxchg_u32(*ptr,old,new)
>   long xchg_u32(*ptr,val)
>
> Then the rest of the generic header would implement cmpxchg.

I like the idea of generalizing it a little further. I'd suggest staying a
little closer to the existing naming here though, as we already have
cmpxchg() for the type-agnostic version, and cmpxchg64() for the
fixed-length 64-bit version.

I think a nice interface between architecture-specific and architecture
independent code would be to have architectures provide
arch_cmpxchg32()/arch_xchg32() as the most basic version, as well
as arch_cmpxchg8()/arch_cmpxchg16()/arch_xchg8()/arch_xchg16()
if they have instructions for those.

The common code can then build cmpxchg16()/xchg16() on top of
either the 16-bit or the 32-bit primitives, and build the cmpxchg()/xchg()
wrapper around those (or alternatively we can decide to have them
only deal with fixed-32-bit and long/pointer sized atomics).

  Arnd


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread Arnd Bergmann
On Wed, Mar 31, 2021 at 8:44 AM Guo Ren  wrote:
> On Wed, Mar 31, 2021 at 12:18 PM Guo Ren  wrote:
> > On Tue, Mar 30, 2021 at 3:12 PM Arnd Bergmann  wrote:
> > > On Tue, Mar 30, 2021 at 4:26 AM Guo Ren  wrote:

> > > As I understand, this example must not cause a deadlock on
> > > a compliant hardware implementation when the underlying memory
> > > has RsrvEventual behavior, but could deadlock in case of
> > > RsrvNonEventual
> > Thx for the nice explanation:
> >  - RsrvNonEventual - depends on software fall-back mechanisms, and
> > just I'm worried about.
> >  - RsrvEventual - HW would provide the eventual success guarantee.
> In riscv-spec 8.3 Eventual Success of Store-Conditional Instructions
>
> I found:
> "As a consequence of the eventuality guarantee, if some harts in an
> execution environment are
> executing constrained LR/SC loops, and no other harts or devices in
> the execution environment
> execute an unconditional store or AMO to that reservation set, then at
> least one hart will
> eventually exit its constrained LR/SC loop. *** By contrast, if other
> harts or devices continue to
> write to that reservation set, it ***is not guaranteed*** that any
> hart will exit its LR/SC loop.*** "
>
> Seems RsrvEventual couldn't solve the code's problem I've mentioned.

Ok, got it.

Arnd


Re: [PATCH] arm64: Add support for cisco craw64 ARMv8 SoCs

2021-03-31 Thread Arnd Bergmann
On Wed, Mar 31, 2021 at 3:46 AM Daniel Walker  wrote:
> From: Ofer Licht 

Thanks for the submission, it's always nice to see a new platform

> Define craw64 config, dts and Makefile for Cisco
> SoCs known as Craw.

I'd like some more information about the platform, e.g. the target
market and maybe a link to the product information.

> Cc: xe-linux-exter...@cisco.com
> Signed-off-by: Ofer Licht 
> Signed-off-by: Daniel Walker 
> ---
>  .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
>  arch/arm64/Kconfig.platforms  |   5 +
>  arch/arm64/boot/dts/Makefile  |   1 +
>  arch/arm64/boot/dts/cisco/Makefile|   5 +
>  .../arm64/boot/dts/cisco/craw64-dopplerg2.dts | 239 +++
>  arch/arm64/boot/dts/cisco/craw64.dtsi | 392 ++
>  arch/arm64/configs/defconfig  |   1 +

We have separate branches for dt, defconfig, and the rest, so it would be
good to split this patch up a little more.

There should also be an entry in the top-level MAINTAINERS file.

> diff --git a/arch/arm64/boot/dts/cisco/craw64-dopplerg2.dts 
> b/arch/arm64/boot/dts/cisco/craw64-dopplerg2.dts
> new file mode 100644
> index ..20ecc57b4e5c
> --- /dev/null
> +++ b/arch/arm64/boot/dts/cisco/craw64-dopplerg2.dts
> @@ -0,0 +1,239 @@
> +/dts-v1/;
> +
> +#include "craw64.dtsi"
> +
> +/ {
> +   model = "Cisco Craw64 on DopplerG 2.0";
> +   compatible = "cisco,craw64-dopplerg2", "cisco,craw64";
> +
> +   memory {
> +   device_type = "memory";
> +   reg = <0x0 0x8000 0x0 0x8000>;
> +   };

The memory size is usually filled by the boot loader, just put an
empty node into the .dtsi file

> +
> +   soc: soc {
> +   uart0: serial@23f8 {
> +   clock-frequency = <25000>;
> +   status = "ok";
> +   };
> +
> +   uart1: serial@23fc {
> +   clock-frequency = <25000>;
> +   status = "ok";
> +   };
> +
> +   spiclk: spiclk {
> +   clock-frequency = <25000>;
> +   };

The clock frequencies can also normally go into the .dtsi, as those
tend to be SoC specific rather than board specific.

> +   spi: spi@2400 {
> +   status="ok";
> +   flash: flash@0 {
> +   compatible = "micron,n25q128a13", 
> "jedec,spi-nor";
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   spi-max-frequency = <800>;
> +   reg = <0>;
> +   partition@0 {
> +   label = "unused0";
> +   reg = <0x0 0x1>;
> +   read-only;
> +   };
> +   partition@1 {
> +   label = "brom";
> +   reg = <0x1 0x1>;
> +   };

The partitions in turn normally go into the bootloader, which
needs to know about them anyway, but might pick different
settings.

> +
> +   stmmaceth: stmmaceth {
> +   clock-frequency = <25000>;
> +   };
> +
> +   eth0: dwmac@282c {
> +   status = "ok";
> +   mdio-channel = <0>;
> +   };

There is no point in defining labels in the board specific file, as
nothing will reference them. If you define labels in the .dtsi file,
you might find it easier to refer to the nodes by those labels
rather than by the full path.

Both of the node names look wrong here. The name for an ethernet
device should be 'ethernet@282c000'.

> +   wd@28500200 {
> +   compatible = "cisco,craw-smgmt-wdt";
> +   reg = <0x28500200 0x140>;
> +   };

Similarly, the watchdog should be watchdog@28500200. I don't
see a binding document for a cisco,craw-smgmt-wdt, so please
drop this node until the binding has been accepted. You can
submit the binding together with the driver or the dts file (as
a separate patch), but I don't want to merge dt nodes without a
reviewed binding, as that has the risk of requiring incompatible
changes later.

> +
> +   doppler_i2c: bsp_i2c {
> +   compatible = "cisco,dplr-i2c";
> +   reg = <0x23f71000 0x10>;
> +   };
> +   };
> +   doppler {
> +   intrpt {
> +   compatible = "cisco,dplr_intrpt";

Same for both of these.

> +   reg = <0x0 0x2000 0x0 0x0FFF>;
> +   interrupts = ,
> +   ,
> +  

Re: Compiling kernel-3.4.xxx with gcc-9.x. Need some help.

2021-03-30 Thread Arnd Bergmann
On Tue, Mar 30, 2021 at 3:23 PM Fawad Lateef  wrote:
> On Mon, 29 Mar 2021 at 22:06, Arnd Bergmann  wrote:
> >
> > On Mon, Mar 29, 2021 at 9:23 PM Fawad Lateef  wrote:
> > >
> > > On Mon, 29 Mar 2021 at 06:57, Greg KH  wrote:
> > > >
> > > > On Sun, Mar 28, 2021 at 10:20:50PM +0200, Fawad Lateef wrote:
> > > > > Hi
> > > > >
> > > > > I am using an Olimex A20 SOM with NAND and due to some binary blob for
> > > > > NAND driver, I am stuck with the sunxi kernel 3.4.xxx version. (Repo
> > > > > here: https://github.com/linux-sunxi/linux-sunxi)
> > > >
> > > > Please work with the vendor that is forcing you to use this obsolete and
> > > > insecure kernel version.  You are paying for that support, and they are
> > > > the only ones that can support you.
> > > >
> > >
> > > The problem is vendor Olimex now have eMMC based SOM which is
> > > supported by mainline kernel _but_ they still selling NAND SOM and
> > > only supporting 3.4 kernel (as this is the only latest version from
> > > sunxi with NAND support, after that sunxi is now moved away from NAND
> > > too).
> >
> > From a very quick look at the git history, I can tell that A20 NAND driver
> > support was added in linux-4.8. Have you actually tried a modern kernel?
> >
>
> I tried compiling and booting kernel v4.4 (from sunxi repo on github)
> but unable to make it boot. Stuck at "Starting kernel". By the way I
> am booting from RAM (using the sunxi usbboot/usb-otg option).
>
> You mentioned that it's supported from linux-4.8 (from your quick look
> at git history); can you tell me which driver/files? As I was able to
> see some sort of sunxi NAND driver even in v4.4 kernel
> (https://lxr.missinglinkelectronics.com/linux+v4.4/drivers/mtd/nand/sunxi_nand.c).

The nand flash controller node for a20 was added in commit
b2a83ad217b8 ("ARM: dts: sun7i: Add NFC node to Allwinner A20 SoC")
The driver was merged first but not hooked up in the dt.
For the specific board file, you need to still need to either enable the
device and add a partition map in the dts (as described in the howto),
or have a boot loader that fills out that information.

> When I tried to compare with the sunxi-3.4 kernel code, I found that
> the kernel has code for "nfc" but no reference to "nfd" (I don't know
> what they are referring to).
> The sunxi 3.4 kernel NAND driver at quick look seems quite different.
> (https://github.com/linux-sunxi/linux-sunxi/tree/sunxi-3.4/drivers/block/sunxi_nand)
>
> I will try the v4.8 or above kernel.

I think the custom nand driver in that tree was trying to avoid the
patents for nand flash by using an independently developed implementation,
but that of course leads to copyright issues with incompatible licenses.

The new driver is a regular mtd driver. I don't know whether the layout
of the data is compatible at all, so it's possible that you have to backup
all the data using the old kernel and write back the file system using a
new kernel.

   Arnd


Re: [PATCH 06/11] cgroup: fix -Wzero-length-bounds warnings

2021-03-30 Thread Arnd Bergmann
On Tue, Mar 30, 2021 at 10:41 AM Michal Koutný  wrote:
>
> On Mon, Mar 22, 2021 at 05:02:44PM +0100, Arnd Bergmann  
> wrote:
> > I'm not sure what is expected to happen for such a configuration,
> > presumably these functions are never calls in that case.
> Yes, the functions you patched would only be called from subsystems or
> there should be no way to obtain a struct cgroup_subsys reference
> anyway (hence it's ok to always branch as if ss==NULL).
>
> I'd prefer a variant that wouldn't compile the affected codepaths when
> there are no subsystems registered, however, I couldn't come up with a
> way how to do it without some preprocessor ugliness.

Would it be possible to enclose most or all of kernel/cgroup/cgroup.c
in an #ifdef CGROUP_SUBSYS_COUNT block? I didn't try that
myself, but this might be a way to guarantee that there cannot
be any callers (it would cause a link error).

> Reviewed-by: Michal Koutný 

Thanks

Arnd


Re: [PATCH v2 1/3] hpsa: use __packed on individual structs, not header-wide

2021-03-30 Thread Arnd Bergmann
On Tue, Mar 30, 2021 at 9:30 AM Arnd Bergmann  wrote:
> On Tue, Mar 30, 2021 at 9:22 AM Sergei Trofimovich  wrote:
>
> > @@ -451,7 +452,7 @@ struct CommandList {
> > bool retry_pending;
> > struct hpsa_scsi_dev_t *device;
> > atomic_t refcount; /* Must be last to avoid memset in 
> > hpsa_cmd_init() */
> > -} __aligned(COMMANDLIST_ALIGNMENT);
> > +} __packed __aligned(COMMANDLIST_ALIGNMENT);
>
> You are still marking CommandList as __packed here, which is
> what caused the original problem. Please don't mark this one
> as __packed at all. If there are individual members that you want
> to be misaligned inside of the structure, you could mark those
> explicitly.

Nevermind, I just got patch 2/3, splitting up the patches like this seems
fine to me.

Whole series

Reviewed-by: Arnd Bergmann 


Re: [PATCH v2 3/3] hpsa: add an assert to prevent from __packed reintroduction

2021-03-30 Thread Arnd Bergmann
On Tue, Mar 30, 2021 at 9:23 AM Sergei Trofimovich  wrote:

> +/*
> + * Make sure our embedded atomic variable is aligned. Otherwise we break 
> atomic
> + * operations on architectures that don't support unaligned atomics like 
> IA64.
> + *
> + * The assert guards against reintroductin against unwanted __packed to
> + * the struct CommandList.
> + */
> +static_assert(offsetof(struct CommandList, refcount) % __alignof__(atomic_t) 
> == 0);
> +

There are a few other members that need to be aligned: the work_struct
has another
atomic_t inside it, and there are a few pointers that might rely on
being written to
atomically.

While you could add a static_assert for each member, the easier solution is to
just not ask for the members to be misaligned in the first place.

   Arnd


Re: [PATCH v2 1/3] hpsa: use __packed on individual structs, not header-wide

2021-03-30 Thread Arnd Bergmann
On Tue, Mar 30, 2021 at 9:22 AM Sergei Trofimovich  wrote:
>
> Some of the structs contain `atomic_t` values and are not intended to be
> sent to IO controller as is.
>
> The change adds __packed to every struct and union in the file.
> Follow-up commits will fix `atomic_t` problems.
>
> The commit is a no-op at least on ia64:
> $ diff -u <(objdump -d -r old.o) <(objdump -d -r new.o)

This looks better to me, but I think it still has the same potential bug in
the CommandList definition. Moving from #pragma to annotating the
misaligned structures as __packed is more of a cleanup that could
be done separately from the bugfix, but it does make it a little more
robust.

>  #define HPSA_INQUIRY 0x12
>  struct InquiryData {
> u8 data_byte[36];
> -};
> +} __packed;

Marking this one and a few others as __packed is a bit silly, but
also obviously harmless, and closer to the original version, so that's
ok.

> @@ -451,7 +452,7 @@ struct CommandList {
> bool retry_pending;
> struct hpsa_scsi_dev_t *device;
> atomic_t refcount; /* Must be last to avoid memset in hpsa_cmd_init() 
> */
> -} __aligned(COMMANDLIST_ALIGNMENT);
> +} __packed __aligned(COMMANDLIST_ALIGNMENT);

You are still marking CommandList as __packed here, which is
what caused the original problem. Please don't mark this one
as __packed at all. If there are individual members that you want
to be misaligned inside of the structure, you could mark those
explicitly.

  Arnd


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-30 Thread Arnd Bergmann
On Tue, Mar 30, 2021 at 4:26 AM Guo Ren  wrote:
> On Mon, Mar 29, 2021 at 9:56 PM Arnd Bergmann  wrote:
> > On Mon, Mar 29, 2021 at 2:52 PM Guo Ren  wrote:
> > > On Mon, Mar 29, 2021 at 7:31 PM Peter Zijlstra  
> > > wrote:
> > > >
> > > > What's the architectural guarantee on LL/SC progress for RISC-V ?
> >
> >"When LR/SC is used for memory locations marked RsrvNonEventual,
> >  software should provide alternative fall-back mechanisms used when
> >  lack of progress is detected."
> >
> > My reading of this is that if the example you tried stalls, then either
> > the PMA is not RsrvEventual, and it is wrong to rely on ll/sc on this,
> > or that the PMA is marked RsrvEventual but the implementation is
> > buggy.
>
> Yes, PMA just defines physical memory region attributes, But in our
> processor, when MMU is enabled (satp's value register > 2) in s-mode,
> it will look at our custom PTE's attributes BIT(63) ref [1]:
>
>PTE format:
>| 63 | 62 | 61 | 60 | 59 | 58-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
>  SO   CBSH   SERSW   D   A   G   U   X   W   R   V
>  ^^^^^
>BIT(63): SO - Strong Order
>BIT(62): C  - Cacheable
>BIT(61): B  - Bufferable
>BIT(60): SH - Shareable
>BIT(59): SE - Security
>
> So the memory also could be RsrvNone/RsrvEventual.

I was not talking about RsrvNone, which would clearly mean that
you cannot use lr/sc at all (trap would trap, right?), but "RsrvNonEventual",
which would explain the behavior you described in an earlier reply:

| u32 a = 0x55aa66bb;
| u16 *ptr = 
|
| CPU0   CPU1
| = =
| xchg16(ptr, new) while(1)
| WRITE_ONCE(*(ptr + 1), x);
|
| When we use lr.w/sc.w implement xchg16, it'll cause CPU0 deadlock.

As I understand, this example must not cause a deadlock on
a compliant hardware implementation when the underlying memory
has RsrvEventual behavior, but could deadlock in case of
RsrvNonEventual

> [1] 
> https://github.com/c-sky/csky-linux/commit/e837aad23148542771794d8a2fcc52afd0fcbf88
>
> >
> > It also seems that the current "amoswap" based implementation
> > would be reliable independent of RsrvEventual/RsrvNonEventual.
>
> Yes, the hardware implementation of AMO could be different from LR/SC.
> AMO could use ACE snoop holding to lock the bus in hw coherency
> design, but LR/SC uses an exclusive monitor without locking the bus.
>
> RISC-V hasn't CAS instructions, and it uses LR/SC for cmpxchg. I don't
> think LR/SC would be slower than CAS, and CAS is just good for code
> size.

What I meant here is that the current spinlock uses a simple amoswap,
which presumably does not suffer from the lack of forward process you
described.

Arnd


Re: Compiling kernel-3.4.xxx with gcc-9.x. Need some help.

2021-03-29 Thread Arnd Bergmann
On Mon, Mar 29, 2021 at 9:23 PM Fawad Lateef  wrote:
>
> On Mon, 29 Mar 2021 at 06:57, Greg KH  wrote:
> >
> > On Sun, Mar 28, 2021 at 10:20:50PM +0200, Fawad Lateef wrote:
> > > Hi
> > >
> > > I am using an Olimex A20 SOM with NAND and due to some binary blob for
> > > NAND driver, I am stuck with the sunxi kernel 3.4.xxx version. (Repo
> > > here: https://github.com/linux-sunxi/linux-sunxi)
> >
> > Please work with the vendor that is forcing you to use this obsolete and
> > insecure kernel version.  You are paying for that support, and they are
> > the only ones that can support you.
> >
>
> The problem is vendor Olimex now have eMMC based SOM which is
> supported by mainline kernel _but_ they still selling NAND SOM and
> only supporting 3.4 kernel (as this is the only latest version from
> sunxi with NAND support, after that sunxi is now moved away from NAND
> too).

>From a very quick look at the git history, I can tell that A20 NAND driver
support was added in linux-4.8. Have you actually tried a modern kernel?

There is also a howto document at
https://linux-sunxi.org/Mainline_NAND_Howto

The olimex board specific dts files seem to be missing the entry for
the nand controller. If you have trouble figuring out how to enable that
from the howto above, Olimex should be able to prove a small patch
for it.

> > > I am currently using buildroot-2016 and gcc-5.5 for building the
> > > kernel and every other package needed.
> > >
> > > Now the requirement is to move to the latest version of gcc-9.x, so
> > > that we can have glibc++ provided by the gcc-9.1 toolchain.
> > >
> > > Main problem for moving to later versions of buildroot is the kernel
> > > 3.4 which we couldn't to work with gcc-6 a few years ago _but_ now the
> > > gcc-9.1 requirement is mandatory so now have to look into compiling
> > > linux-3.4 with gcc-9.1 or above.

There is no need to compile user space and kernel with the same compiler.

Arnd


Re: [RFC 1/2] arm64: PCI: Allow use arch-specific pci sysdata

2021-03-29 Thread Arnd Bergmann
On Mon, Mar 29, 2021 at 4:32 PM Boqun Feng  wrote:
>
> Hi Arnd,
>
> On Sat, Mar 20, 2021 at 05:09:10PM +0100, Arnd Bergmann wrote:
> > On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann  wrote:
> > >  I actually still have a (not really tested) patch series to clean up
> > > the pci host bridge registration, and this should make this a lot easier
> > > to add on top.
> > >
> > > I should dig that out of my backlog and post for review.
> >
> > I've uploaded my series to
> > https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
> > pci-probe-rework-20210320
> >
> > The purpose of this series is mostly to simplify what variations of
> > host probe methods exist, towards using pci_host_probe() as the
> > only method. It does provide some simplifications based on that
> > that, including a way to universally have access to the pci_host_bridge
> > pointer during the probe function.
> >
>
> Thanks for the suggestion and code. I spend some time to catch up. Yes,
> Bjorn and you are correct, the better way is having a 'domain_nr' in the
> 'pci_host_bridge' and making sure every driver fill that correctly
> before probe. I definitly will use this approach.
>
> However, I may start small: I plan to introduce 'domain_nr' and only
> fill the field at probe time for PCI_DOMAINS_GENERIC=y archs, and leave
> other archs and driver alone. (honestly, I was shocked by the number of
> pci_scan_root_bus_bridge() and pci_host_probe() that I need to adjust if
> I really want to unify the 'domain_nr' handling for every arch and
> driver ;-)). This will fulfil my requirement for Hyper-V PCI controller
> on ARM64. And later on, we can switch each arch to this approach one by
> one and keep the rest still working.
>
> Thoughts?

That sounds reasonable to me, yes. I would also suggest you look
at my hyperv patch from the branch I mentioned [1] and try to integrate
that first. I suspect this makes it easier to do the domain rework and
possibly other changes on top.

   Arnd

https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/?h=pci-probe-rework-20210320=44db8df9d729d


Re: [PATCH] hpsa: fix boot on ia64 (atomic_t alignment)

2021-03-29 Thread Arnd Bergmann
On Mon, Mar 29, 2021 at 1:28 PM John Paul Adrian Glaubitz
 wrote:
> On 3/24/21 7:37 PM, don.br...@microchip.com wrote:
> >
> > Acked-by: Don Brace 
> >
> > Thanks for your patch and extra effort.
>
> Apologies for being so persistent, but has this patch been queued anywhere?
>
> This should be included for 5.12 if possible as it unbreaks the kernel on alot
> of Itanium servers (and potentially other machines with the HPSA controller).
>
> If no one wants to pick the patch up, it could go through Andrew Morton's 
> tree (-mm).
>

I think Martin is still waiting for a fixed version of the patch, as
the proposed patch from
March 12 only solves the immediate symptom, but not the underlying problem
of the CommandList structure being marked as unaligned. If it gets fixed, the
new version should work on all architectures.

 Arnd


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-29 Thread Arnd Bergmann
On Mon, Mar 29, 2021 at 2:52 PM Guo Ren  wrote:
>
> On Mon, Mar 29, 2021 at 7:31 PM Peter Zijlstra  wrote:
> >
> > On Mon, Mar 29, 2021 at 01:16:53PM +0200, Peter Zijlstra wrote:
> > > Anyway, an additional 'funny' is that I suspect you cannot prove fwd
> > > progress of the entire primitive with any of this on. But who cares
> > > about details anyway.. :/
> >
> > What's the architectural guarantee on LL/SC progress for RISC-V ?
>
> funct5| aq | rl   | rs2 |  rs1  | funct3 | rd | opcode
>  5  11  5   5 35  7
> LR.W/D  ordering  0 addrwidth   destAMO
> SC.W/D  ordering  src  addrwidth   destAMO
>
> LR.W loads a word from the address in rs1, places the sign-extended
> value in rd, and registers a reservation set—a set of bytes that
> subsumes the bytes in the addressed word. SC.W conditionally writes a
> word in rs2 to the address in rs1: the SC.W succeeds only if the
> reservation is still valid and the reservation set contains the bytes
> being written. If the SC.W succeeds, the instruction writes the word
> in rs2 to memory, and it writes zero to rd. If the SC.W fails, the
> instruction does not write to memory, and it writes a nonzero value to
> rd. Regardless of success or failure, executing an SC.W instruction
> *invalidates any reservation held by this hart*.
>
> More details, ref:
> https://github.com/riscv/riscv-isa-manual

I think section "3.5.3.2 Reservability PMA" [1] would be a more relevant
link, as this defines memory areas that either do or do not have
forward progress guarantees, including this part:

   "When LR/SC is used for memory locations marked RsrvNonEventual,
 software should provide alternative fall-back mechanisms used when
 lack of progress is detected."

My reading of this is that if the example you tried stalls, then either
the PMA is not RsrvEventual, and it is wrong to rely on ll/sc on this,
or that the PMA is marked RsrvEventual but the implementation is
buggy.

It also seems that the current "amoswap" based implementation
would be reliable independent of RsrvEventual/RsrvNonEventual.
arm64 is already in the situation of having to choose between
two cmpxchg() implementation at runtime to allow falling back to
a slower but more general version, but it's best to avoid that if you
can.

 Arnd

[1] 
http://www.five-embeddev.com/riscv-isa-manual/latest/machine.html#atomicity-pmas


Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-29 Thread Arnd Bergmann
On Mon, Mar 29, 2021 at 9:52 AM Peter Zijlstra  wrote:
> On Sat, Mar 27, 2021 at 06:06:38PM +, guo...@kernel.org wrote:
> > From: Guo Ren 
> >
> > Some architectures don't have sub-word swap atomic instruction,
> > they only have the full word's one.
> >
> > The sub-word swap only improve the performance when:
> > NR_CPUS < 16K
> >  *  0- 7: locked byte
> >  * 8: pending
> >  *  9-15: not used
> >  * 16-17: tail index
> >  * 18-31: tail cpu (+1)
> >
> > The 9-15 bits are wasted to use xchg16 in xchg_tail.
> >
> > Please let architecture select xchg16/xchg32 to implement
> > xchg_tail.
>
> So I really don't like this, this pushes complexity into the generic
> code for something that's really not needed.
>
> Lots of RISC already implement sub-word atomics using word ll/sc.
> Obviously they're not sharing code like they should be :/ See for
> example arch/mips/kernel/cmpxchg.c.

That is what the previous version of the patch set did, right?

I think this v4 is nicer because the code is already there in
qspinlock.c and just gets moved around, and the implementation
is likely more efficient this way. The mips version could be made
more generic, but it is also less efficient than a simple xchg
since it requires an indirect function call plus nesting a pair of
loops instead in place of the single single ll/sc loop in the 32-bit
xchg.

I think the weakly typed xchg/cmpxchg implementation causes
more problems than it solves, and we'd be better off using
a stronger version in general, with the 8-bit and 16-bit exchanges
using separate helpers in the same way that the fixed-length
cmpxchg64 is separate already, there are only a couple of instances
for each of these in the kernel.

Unfortunately, there is roughly a 50:50 split between fixed 32-bit
and long/pointer-sized xchg/cmpxchg users in the kernel, so
making the interface completely fixed-type would add a ton of
churn. I created an experimental patch for this, but it's probably
not worth it.

> Also, I really do think doing ticket locks first is a far more sensible
> step.

I think this is the important point to look at first. From what I can
tell, most users of ticket spinlocks have moved to qspinlock over
time, but I'm not sure about the exact tradeoff, in particular on
large machines without a 16-bit xchg operation.

FWIW, the implementations in the other SMP capable
architectures are

alphasimple
arc  simple+custom
arm  ticket
arm64qspinlock (formerly ticket)
csky ticket/qrwlock (formerly simple+ticket/qrwlock)
hexagon  simple
ia64 ticket
mips qspinlock (formerly ticket)
openrisc qspinlock
parisc   custom
powerpc  simple+qspinlock (formerly simple)
riscvsimple
s390 custom-queue
sh   simple
sparcqspinlock
x86  qspinlock (formerly ticket+qspinlock)
xtensa   qspinlock

32-bit Arm is the only relevant user of ticket locks these days, but its
hardware has a practical limit of 16 CPUs and four nodes, with most
implementations having a single CPU cluster of at most four cores.

We had the same discussion about xchg16() when Sebastian submitted
the arm32 qspinlock implementation:
https://lore.kernel.org/linux-arm-kernel/20191007214439.27891-1-sebast...@breakpoint.cc/

Arnd


Re: [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings

2021-03-28 Thread Arnd Bergmann
On Sun, Mar 28, 2021 at 9:40 AM Sven Peter  wrote:

I noticed only one detail here:

> +  - |+
> +dart2a: dart2a@82f0 {
> +  compatible = "apple,t8103-dart";
> +  reg = <0x82f0 0x4000>;
> +  interrupts = <1 781 4>;
> +  #iommu-cells = <1>;
> +};

The name of the iommu should be iommu@82f0, not dart2a@82f0.

   Arnd


Re: [PATCH v4 2/4] riscv: cmpxchg.h: Merge macros

2021-03-27 Thread Arnd Bergmann
On Sat, Mar 27, 2021 at 7:06 PM  wrote:
>
> From: Guo Ren 
>
> To reduce assembly codes, let's merge duplicate codes into one
> (xchg_acquire, xchg_release, cmpxchg_release).
>
> Signed-off-by: Guo Ren 

This is a nice cleanup, but I wonder if you can go even further by using
the definitions from atomic-arch-fallback.h like arm64 and x86 do.

Arnd


Re: [PATCH 0/3] Apple M1 DART IOMMU driver

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 6:28 PM Mark Kettenis  wrote:

> I haven't figured out how the bypass stuff really works.  Corellium
> added support for it in their codebase when they added support for
> Thunderbolt, and some of the DARTs that seem to be related to
> Thunderbolt do indeed have a "bypass" property.  But it is unclear to
> me how the different puzzle pieces fit together for Thunderbolt.

As a general observation, bypass mode for Thunderbolt is what enabled
the http://thunderclap.io/ attack. This is extremely useful for debugging
a running kernel from another machine, but it's also something that
should never be done in a production kernel.

 Arnd


Re: [PATCH 0/3] Apple M1 DART IOMMU driver

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 6:51 PM Sven Peter  wrote:
> On Fri, Mar 26, 2021, at 18:34, Robin Murphy wrote:
> > On 2021-03-26 17:26, Mark Kettenis wrote:
> > >
> > > Anyway, from my viewpoint having the information about the IOVA
> > > address space sit on the devices makes little sense.  This information
> > > is needed by the DART driver, and there is no direct cnnection from
> > > the DART to the individual devices in the devicetree.  The "iommus"
> > > property makes a connection in the opposite direction.
> >
> > What still seems unclear is whether these addressing limitations are a
> > property of the DART input interface, the device output interface, or
> > the interconnect between them. Although the observable end result
> > appears more or less the same either way, they are conceptually
> > different things which we have different abstractions to deal with.
>
> I'm not really sure if there is any way for us to figure out where these
> limitation comes from though.

My first guess was that this is done to partition the available address
address space in a way that allows one physical IOTLB to handle
multiple devices that each have their own page table for a subset
of the address space, as was done on old PowerPC IOMMUs.
However, the ranges you list don't really support that model.

> I've done some more experiments and looked at all DART nodes in Apple's Device
> Tree though. It seems that most (if not all) masters only connect 32 address
> lines even though the iommu can handle a much larger address space. I'll 
> therefore
> remove the code to handle the full space for v2 since it's essentially dead
> code that can't be tested anyway.
>
>
> There are some exceptions though:
>
> There are the PCIe DARTs which have a different limitation which could be
> encoded as 'dma-ranges' in the pci bus node:
>
>name base  size
>  dart-apcie1: 0010  3fe0
>  dart-apcie2: 0010  3fe0
>  dart-apcie0: 0010  3fe0
> dart-apciec0: 4000  7fffc000
> dart-apciec1: 8000  7fffc000

This looks like they are reserving some address space in the beginning
and/or the end, and for apciec0, the address space is partitioned into
two equal-sized regions.

> Then there are also these display controller DARTs. If we wanted to use 
> dma-ranges
> we could just put them in a single sub bus:
>
>   name base  size
>   dart-disp0:  fc00
> dart-dcp:  fc00
>dart-dispext0:  fc00
>  dart-dcpext:  fc00
>
>
> And finally we have these strange ones which might eventually each require
> another awkward sub-bus if we want to stick to the dma-ranges property.
>
> name base  size
>   dart-aop: 0003  ("always-on processor")
>   dart-pmp:  bff0 (no idea yet)

Here I also see a "pio-vm-size" property:

dart-pmp {
  pio-vm-base = <0xc000>;
  pio-vm-size = <0x4000>;
  vm-size = <0xbff0>;
  ...
   };

Which seems to give 3GB of address space to the normal iotlb,
plus the last 1GB to something else. The vm-base property is also
missing rather than zero, but that could just be part of their syntax
instead of a real difference.

Could it be that there are

>   dart-sio: 0021c000 fbde4000 (at least their Secure Enclave/TPM co-processor)

Same here:
dart-sio {
   vm-base = <0x0>;
   vm-size = <0xfc00>;
   pio-vm-base = <0xfd00>;
  pio-vm-size = <0x200>;
  pio-granularity = <0x100>;
   }

There are clearly two distinct ranges that split up the 4GB space again,
with a small hole of 16MB (==pio-granularity) at the end of each range.

The "pio" name might indicate that this is a range of addresses that
can be programmed to point at I/O registers in another device, rather
than pointing to RAM.

   Arnd


Re: [PATCH 2/2] ASoC: amd: fix acpi dependency kernel warning

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 5:44 PM Vijendar Mukunda
 wrote:
>
> Fix ACPI dependency kernel warning produced by powerpc
> allyesconfig.
>
> sound/soc/amd/acp-da7219-max98357a.c:684:28: warning:
> 'cz_rt5682_card' defined but not used [-Wunused-variable]
>
> sound/soc/amd/acp-da7219-max98357a.c:671:28: warning: 'cz_card'
> defined but not used [-Wunused-variable]

I would suggest simply dropping the unnecessary #ifdef and
ACPI_PTR() guard.

It might be helpful to hide the Kconfig submenu under
'depends on X86 || COMPILE_TEST'.

   Arnd


Re: [PATCH 0/3] Apple M1 DART IOMMU driver

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 5:10 PM Sven Peter  wrote:
> On Fri, Mar 26, 2021, at 16:59, Mark Kettenis wrote:
> > Some of the DARTs provide a bypass facility.  That code make using the
> > standard "dma-ranges" property tricky.  That property would need to
> > contain the bypass address range.  But that would mean that if the
> > DART driver needs to look at that property to figure out the address
> > range that supports translation it will need to be able to distinguish
> > between the translatable address range and the bypass address range.
>
> Do we understand if and why we even need to bypass certain streams?

My guess is that this is a performance optimization.

There are generally three reasons to want an iommu in the first place:
 - Pass a device down to a guest or user process without giving
   access to all of memory
 - Avoid problems with limitations in the device, typically when it
only supports
   32-bit bus addressing, but the installed memory is larger than 4GB
 - Protect kernel memory from broken drivers

If you care about none of the above, but you do care about data transfer
speed, you are better off just leaving the IOMMU in bypass mode.
I don't think we have to support it if the IOMMU works reliably, but it's
something that users might want.

Arnd


Re: [PATCH 0/3] Apple M1 DART IOMMU driver

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 4:59 PM Mark Kettenis  wrote:
>
> > From: Arnd Bergmann 
> > Date: Thu, 25 Mar 2021 22:41:09 +0100
> >
> > On Thu, Mar 25, 2021 at 8:53 AM Sven Peter  wrote:
> > > On Tue, Mar 23, 2021, at 21:53, Rob Herring wrote:
> > >
> > > I'm probably just confused or maybe the documentation is outdated but I 
> > > don't
> > > see how I could specify "this device can only use DMA addresses from
> > > 0x0010...0x3ff0 but can map these via the iommu to any physical
> > > address" using 'dma-ranges'.
> >
> > It sounds like this is a holdover from the original powerpc iommu,
> > which also had a limited set of virtual addresses in the iommu.
> >
> > I would think it's sufficient to describe it in the iommu itself,
> > since the limitation is more "addresses coming into the iommu must
> > be this range" than "this device must use that address range for
> > talking to the iommu".
> >
> > If the addresses are allocated by the iommu driver, and each iommu
> > only has one DMA master attached to it, having a simple range
> > property in the iommu node should do the trick here. If there might
> > be multiple devices on the same iommu but with different address
> > ranges (which I don't think is the case), then it could be part of
> > the reference to the iommu.
>
> The ADT has properties on the iommu node that describe the adresses it
> accepts for translation ("vm-base" and "vm-size").  So I think we can
> safely assume that the same limits apply to all DMA masters that are
> attached to it.  We don't know if the range limit is baked into the
> silicon or whether it is related to how the firmware sets things up.
> Having the properties on the iommu node makes it easy for m1n1 to
> update the properties with the right values if necessary.

Ok.

> Some of the DARTs provide a bypass facility.  That code make using the
> standard "dma-ranges" property tricky.  That property would need to
> contain the bypass address range.  But that would mean that if the
> DART driver needs to look at that property to figure out the address
> range that supports translation it will need to be able to distinguish
> between the translatable address range and the bypass address range.

I'm not following here. Do you mean the bus address used for bypass is
different from the upstream address that gets programmed into the iommu
when it is active?

  Arnd


Re: [PATCH 4/4] exec: move the call to getname_flags into do_execveat

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 3:38 PM Christoph Hellwig  wrote:
>
> Remove the duplicated copying of the pathname into the common helper.
>
> Signed-off-by: Christoph Hellwig 

Looks correct, but

> -static int do_execveat(int fd, struct filename *filename,
> +static int do_execveat(int fd, const char __user *pathname,
> const char __user *const __user *argv,
> const char __user *const __user *envp, int flags)

Maybe rename this to ksys_execveat() for consistency now? I think that
is the current trend for functions that are essentially just the syscall.

With or without that change

Reviewed-by: Arnd Bergmann 


Re: [PATCH 1/4] exec: remove do_execve

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 3:38 PM Christoph Hellwig  wrote:
>
> Just call do_execveat instead.
>
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Arnd Bergmann 


Re: [PATCH 3/4] exec: simplify the compat syscall handling

2021-03-26 Thread Arnd Bergmann
On Fri, Mar 26, 2021 at 3:38 PM Christoph Hellwig  wrote:
>
> The only differenence betweeen the compat exec* syscalls and their
> native versions is the compat_ptr sign extension, and the fact that
> the pointer arithmetics for the two dimensional arrays needs to use
> the compat pointer size.  Instead of the compat wrappers and the
> struct user_arg_ptr machinery just use in_compat_syscall() to do the
> right thing for the compat case deep inside get_user_arg_ptr().
>
> Signed-off-by: Christoph Hellwig 

Nice cleanup!

Reviewed-by: Arnd Bergmann 


Re: [PATCH 0/3] Apple M1 DART IOMMU driver

2021-03-25 Thread Arnd Bergmann
On Thu, Mar 25, 2021 at 8:53 AM Sven Peter  wrote:
> On Tue, Mar 23, 2021, at 21:53, Rob Herring wrote:
>
> I'm probably just confused or maybe the documentation is outdated but I don't
> see how I could specify "this device can only use DMA addresses from
> 0x0010...0x3ff0 but can map these via the iommu to any physical
> address" using 'dma-ranges'.

It sounds like this is a holdover from the original powerpc iommu, which also
had a limited set of virtual addresses in the iommu.

I would think it's sufficient to describe it in the iommu itself,
since the limitation
is more "addresses coming into the iommu must be this range" than "this device
must use that address range for talking to the iommu".

If the addresses are allocated by the iommu driver, and each iommu only has
one DMA master attached to it, having a simple range property in the iommu
node should do the trick here. If there might be multiple devices on the same
iommu but with different address ranges (which I don't think is the case), then
it could be part of the reference to the iommu.

 Arnd


Re: [PATCH 11/11] [RFC] drm/i915/dp: fix array overflow warning

2021-03-25 Thread Arnd Bergmann
On Thu, Mar 25, 2021 at 9:05 AM Jani Nikula  wrote:
> > Clearly something is wrong here, but I can't quite figure out what.
> > Changing the array size to 16 bytes avoids the warning, but is
> > probably the wrong solution here.
>
> Ugh. drm_dp_channel_eq_ok() does not actually require more than
> DP_LINK_STATUS_SIZE - 2 elements in the link_status. It's some other
> related functions that do, and in most cases it's convenient to read all
> those DP_LINK_STATUS_SIZE bytes.
>
> However, here the case is slightly different for DP MST, and the change
> causes reserved DPCD addresses to be read. Not sure it matters, but
> really I think the problem is what drm_dp_channel_eq_ok() advertizes.
>
> I also don't like the array notation with sizes in function parameters
> in general, because I think it's misleading. Would gcc-11 warn if a
> function actually accesses the memory out of bounds of the size?

Yes, that is the point of the warning. Using an explicit length in an
array argument type tells gcc that the function will never access
beyond the end of that bound, and that passing a short array
is a bug.

I don't know if this /only/ means triggering a warning, or if gcc
is also able to make optimizations after classifying this as undefined
behavior that it would not make for an unspecified length.

> Anyway. I don't think we're going to get rid of the array notation
> anytime soon, if ever, no matter how much I dislike it, so I think the
> right fix would be to at least state the correct required size in
> drm_dp_channel_eq_ok().

Ok. Just to confirm: Changing the declaration to an unspecified length
avoids the warnings, as does the patch below:

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index eedbb48815b7..6ebeec3d88a7 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -46,12 +46,12 @@
  */

 /* Helpers for DP link training */
-static u8 dp_link_status(const u8 link_status[DP_LINK_STATUS_SIZE], int r)
+static u8 dp_link_status(const u8 link_status[DP_LINK_STATUS_SIZE - 2], int r)
 {
return link_status[r - DP_LANE0_1_STATUS];
 }

-static u8 dp_get_lane_status(const u8 link_status[DP_LINK_STATUS_SIZE],
+static u8 dp_get_lane_status(const u8 link_status[DP_LINK_STATUS_SIZE - 2],
 int lane)
 {
int i = DP_LANE0_1_STATUS + (lane >> 1);
@@ -61,7 +61,7 @@ static u8 dp_get_lane_status(const u8
link_status[DP_LINK_STATUS_SIZE],
return (l >> s) & 0xf;
 }

-bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE],
+bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE - 2],
  int lane_count)
 {
u8 lane_align;
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index edffd1dcca3e..160f7fd127b1 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1456,7 +1456,7 @@ enum drm_dp_phy {

 #define DP_LINK_CONSTANT_N_VALUE 0x8000
 #define DP_LINK_STATUS_SIZE   6
-bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE],
+bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE - 2],
  int lane_count);
 bool drm_dp_clock_recovery_ok(const u8 link_status[DP_LINK_STATUS_SIZE],
  int lane_count);


This obviously needs a good explanation in the code and the changelog text,
which I don't have, but if the above is what you had in mind, please take that
and add Reported-by/Tested-by: Arnd Bergmann .

   Arnd


Re: [PATCH net-next 5/5] vxge: avoid -Wemtpy-body warnings

2021-03-24 Thread Arnd Bergmann
On Mon, Mar 22, 2021 at 11:43 AM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> There are a few warnings about empty debug macros in this driver:
>
> drivers/net/ethernet/neterion/vxge/vxge-main.c: In function 'vxge_probe':
> drivers/net/ethernet/neterion/vxge/vxge-main.c:4480:76: error: suggest braces 
> around empty body in an 'if' statement [-Werror=empty-body]
>  4480 | "Failed in enabling SRIOV mode: 
> %d\n", ret);
>
> Change them to proper 'do { } while (0)' expressions to make the
> code a little more robust and avoid the warnings.
>
> Signed-off-by: Arnd Bergmann 

Please disregard this patch, I was accidentally building without -Wformat and
failed to notice that this introduces a regression. I'll send a new
version after
more testing.

   Arnd


Re: [RFT PATCH v3 08/27] asm-generic/io.h: Add a non-posted variant of ioremap()

2021-03-24 Thread Arnd Bergmann
On Wed, Mar 24, 2021 at 7:12 PM Will Deacon  wrote:
>
> > +/*
> > + * ioremap_np needs an explicit architecture implementation, as it
> > + * requests stronger semantics than regular ioremap(). Portable drivers
> > + * should instead use one of the higher-level abstractions, like
> > + * devm_ioremap_resource(), to choose the correct variant for any given
> > + * device and bus. Portable drivers with a good reason to want non-posted
> > + * write semantics should always provide an ioremap() fallback in case
> > + * ioremap_np() is not available.
> > + */
> > +#ifndef ioremap_np
> > +#define ioremap_np ioremap_np
> > +static inline void __iomem *ioremap_np(phys_addr_t offset, size_t size)
> > +{
> > + return NULL;
> > +}
> > +#endif
>
> Can we implement the generic pci_remap_cfgspace() in terms of ioremap_np()
> if it is supported by the architecture? That way, we could avoid defining
> both on arm64.

Good idea. It needs a fallback in case the ioremap_np() fails on most
architectures, but that sounds easy enough.

Since pci_remap_cfgspace() only has custom implementations, it sounds like
we can actually make the generic implementation unconditional in the end,
but that requires adding ioremap_np() on 32-bit as well, and I would keep
that separate from this series.

Arnd


[PATCH] [v3] drm/imx: imx-ldb: fix out of bounds array access warning

2021-03-24 Thread Arnd Bergmann
From: Arnd Bergmann 

When CONFIG_OF is disabled, building with 'make W=1' produces warnings
about out of bounds array access:

drivers/gpu/drm/imx/imx-ldb.c: In function 'imx_ldb_set_clock.constprop':
drivers/gpu/drm/imx/imx-ldb.c:186:8: error: array subscript -22 is below array 
bounds of 'struct clk *[4]' [-Werror=array-bounds]

Add an error check before the index is used, which helps with the
warning, as well as any possible other error condition that may be
triggered at runtime.

The warning could be fixed by adding a Kconfig depedency on CONFIG_OF,
but Liu Ying points out that the driver may hit the out-of-bounds
problem at runtime anyway.

Signed-off-by: Arnd Bergmann 
---
v3: fix build regression from v2
v2: fix subject line
expand patch description
print mux number
check upper bound as well
---
 drivers/gpu/drm/imx/imx-ldb.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
index dbfe39e2f7f6..565482e2b816 100644
--- a/drivers/gpu/drm/imx/imx-ldb.c
+++ b/drivers/gpu/drm/imx/imx-ldb.c
@@ -197,6 +197,11 @@ static void imx_ldb_encoder_enable(struct drm_encoder 
*encoder)
int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN;
int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
 
+   if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
+   dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux);
+   return;
+   }
+
drm_panel_prepare(imx_ldb_ch->panel);
 
if (dual) {
@@ -255,6 +260,11 @@ imx_ldb_encoder_atomic_mode_set(struct drm_encoder 
*encoder,
int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
u32 bus_format = imx_ldb_ch->bus_format;
 
+   if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
+   dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux);
+   return;
+   }
+
if (mode->clock > 17) {
dev_warn(ldb->dev,
 "%s: mode exceeds 170 MHz pixel clock\n", __func__);
-- 
2.29.2



Re: [PATCH] [v2] drm/imx: imx-ldb: fix out of bounds array access warning

2021-03-24 Thread Arnd Bergmann
On Wed, Mar 24, 2021 at 3:20 PM Joe Perches  wrote:
>
> On Wed, 2021-03-24 at 13:17 +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann 
> >
> > When CONFIG_OF is disabled, building with 'make W=1' produces warnings
> > about out of bounds array access:
> >
> > drivers/gpu/drm/imx/imx-ldb.c: In function 'imx_ldb_set_clock.constprop':
> > drivers/gpu/drm/imx/imx-ldb.c:186:8: error: array subscript -22 is below 
> > array bounds of 'struct clk *[4]' [-Werror=array-bounds]
> >
> > Add an error check before the index is used, which helps with the
> > warning, as well as any possible other error condition that may be
> > triggered at runtime.
> >
> > The warning could be fixed by adding a Kconfig depedency on CONFIG_OF,
> > but Liu Ying points out that the driver may hit the out-of-bounds
> > problem at runtime anyway.
> >
> > Signed-off-by: Arnd Bergmann 
> > ---
> > v2: fix subject line
> > expand patch description
> > print mux number
> > check upper bound as well
> []
> > diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
> []
> > @@ -197,6 +197,12 @@ static void imx_ldb_encoder_enable(struct drm_encoder 
> > *encoder)
> >   int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN;
> >   int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
> >
> > + if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
> > + dev_warn(ldb->dev, "%s: invalid mux %d\n",
> > +  __func__, ERR_PTR(mux));
>
> This does not compile without warnings.
>
> drivers/gpu/drm/imx/imx-ldb.c: In function ‘imx_ldb_encoder_enable’:
> drivers/gpu/drm/imx/imx-ldb.c:201:22: warning: format ‘%d’ expects argument 
> of type ‘int’, but argument 4 has type ‘void *’ [-Wformat=]
>   201 |   dev_warn(ldb->dev, "%s: invalid mux %d\n",
>   |  ^~
>
> If you want to use ERR_PTR, the %d should be %pe as ERR_PTR
> is converting an int a void * to decode the error type and
> emit it as a string.

Sorry about that.

I decided against using ERR_PTR() in order to also check for
positive array overflow, but the version I tested was different from
the version I sent.

v3 coming.

 Arnd


Re: [PATCH] [v2] hinic: avoid gcc -Wrestrict warning

2021-03-24 Thread Arnd Bergmann
On Wed, Mar 24, 2021 at 2:29 PM Rasmus Villemoes
 wrote:
> On 24/03/2021 14.07, Arnd Bergmann wrote:
 >
> >   if (set_settings & HILINK_LINK_SET_SPEED) {
> >   speed_level = hinic_ethtool_to_hw_speed_level(speed);
> >   err = snprintf(set_link_str, SET_LINK_STR_MAX_LEN,
> > -"%sspeed %d ", set_link_str, speed);
> > - if (err <= 0 || err >= SET_LINK_STR_MAX_LEN) {
> > +"speed %d ", speed);
> > + if (err >= SET_LINK_STR_MAX_LEN) {
> >   netif_err(nic_dev, drv, netdev, "Failed to snprintf 
> > link speed, function return(%d) and dest_len(%d)\n",
> > err, SET_LINK_STR_MAX_LEN);
> >   return -EFAULT;
>
> It's not your invention of course, but this both seems needlessly harsh
> and EFAULT is a weird error to return. It's just a printk() message that
> might be truncated, and now that the format string only has a %d
> specifier, it can actually be verified statically that overflow will
> never happen (though I don't know or think gcc can do that, perhaps
> there's some locale nonsense in the standard that allows using
> utf16-encoded sanskrit runes). So probably that test should just be
> dropped, but that's a separate thing.

I thought about fixing it, but this seemed to be a rabbit hole I didn't
want to get into, as there are other harmless issues in the driver
that could be improved.

I'm fairly sure gcc can indeed warn about the overflow with
-Wformat-truncation, but the warning is disabled at the moment
because it has a ton of false positives:

kernel/workqueue.c: In function 'create_worker':
kernel/workqueue.c:1933:55: error: '%d' directive output may be
truncated writing between 1 and 10 bytes into a region of size between
3 and 13 [-Werror=format-truncation=]
 1933 | snprintf(id_buf, sizeof(id_buf), "u%d:%d",
pool->id, id);

Arnd


[PATCH] amdgpu: securedisplay: simplify i2c hexdump output

2021-03-24 Thread Arnd Bergmann
From: Arnd Bergmann 

A previous fix I did left a rather complicated loop in
amdgpu_securedisplay_debugfs_write() for what could be expressed in a
simple sprintf, as Rasmus pointed out.

This drops the leading 0x for each byte, but is otherwise
much nicer.

Suggested-by: Rasmus Villemoes 
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
index 69d7f6bff5d4..fc3ddd7aa6f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
@@ -92,9 +92,7 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct file 
*f, const char __u
struct drm_device *dev = adev_to_drm(adev);
uint32_t phy_id;
uint32_t op;
-   int i;
char str[64];
-   char i2c_output[256];
int ret;
 
if (*pos || size > sizeof(str) - 1)
@@ -136,12 +134,9 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
file *f, const char __u
ret = psp_securedisplay_invoke(psp, 
TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
if (!ret) {
if (securedisplay_cmd->status == 
TA_SECUREDISPLAY_STATUS__SUCCESS) {
-   int pos = 0;
-   memset(i2c_output,  0, sizeof(i2c_output));
-   for (i = 0; i < 
TA_SECUREDISPLAY_I2C_BUFFER_SIZE; i++)
-   pos += sprintf(i2c_output + pos, " 
0x%X",
-   
securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
-   dev_info(adev->dev, "SECUREDISPLAY: I2C buffer 
out put is :%s\n", i2c_output);
+   dev_info(adev->dev, "SECUREDISPLAY: I2C buffer 
out put is: %*ph\n",
+TA_SECUREDISPLAY_I2C_BUFFER_SIZE,
+
securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf);
} else {
psp_securedisplay_parse_resp_status(psp, 
securedisplay_cmd->status);
}
-- 
2.29.2



Re: [PATCH] amdgpu: fix gcc -Wrestrict warning

2021-03-24 Thread Arnd Bergmann
On Tue, Mar 23, 2021 at 4:57 PM Rasmus Villemoes
 wrote:
> On 23/03/2021 14.04, Arnd Bergmann wrote:
> >   if (securedisplay_cmd->status == 
> > TA_SECUREDISPLAY_STATUS__SUCCESS) {
> > + int pos = 0;
> >   memset(i2c_output,  0, sizeof(i2c_output));
> >   for (i = 0; i < 
> > TA_SECUREDISPLAY_I2C_BUFFER_SIZE; i++)
> > - sprintf(i2c_output, "%s 0x%X", 
> > i2c_output,
> > + pos += sprintf(i2c_output + pos, " 
> > 0x%X",
> >   
> > securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
> >   dev_info(adev->dev, "SECUREDISPLAY: I2C 
> > buffer out put is :%s\n", i2c_output);
>
> Eh, why not get rid of the 256 byte stack allocation and just replace
> all of this by
>
>   dev_info(adev->dev, ""SECUREDISPLAY: I2C buffer out put is: %*ph\n",
> TA_SECUREDISPLAY_I2C_BUFFER_SIZE,
> securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf);
>
> That's much less code (both in #LOC and .text), and avoids adding yet
> another place that will be audited over and over for "hm, yeah, that
> sprintf() is actually not gonna overflow".
>
> Yeah, it'll lose the 0x prefixes for each byte and use lowercase hex chars.

Ah, I didn't know the kernel's sprintf could do that, that's really nice.

I'll send a follow-up patch, as Alex has already applied the first one.

Alex, feel free to merge the two into one, or just keep as they are.

  Arnd


[PATCH] [v2] Input: analog - fix invalid snprintf() call

2021-03-24 Thread Arnd Bergmann
From: Arnd Bergmann 

Overlapping input and output arguments to snprintf() are
undefined behavior in C99:

drivers/input/joystick/analog.c: In function 'analog_name':
drivers/input/joystick/analog.c:428:3: error: 'snprintf' argument 4 overlaps 
destination object 'analog' [-Werror=restrict]
  428 |   snprintf(analog->name, sizeof(analog->name), "%s %d-hat",
  |   ^
  429 | analog->name, hweight16(analog->mask & ANALOG_HATS_ALL));
  | 
drivers/input/joystick/analog.c:420:40: note: destination object referenced by 
'restrict'-qualified argument 1 was declared here
  420 | static void analog_name(struct analog *analog)
  | ~~~^~

Change this function to use the simpler seq_buf interface instead.

Cc: Rasmus Villemoes 
Signed-off-by: Arnd Bergmann 
---
v2: use seq_buf instead of rolling my own
---
 drivers/input/joystick/analog.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c
index f798922a4598..087b65ae7585 100644
--- a/drivers/input/joystick/analog.c
+++ b/drivers/input/joystick/analog.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -419,23 +420,24 @@ static void analog_calibrate_timer(struct analog_port 
*port)
 
 static void analog_name(struct analog *analog)
 {
-   snprintf(analog->name, sizeof(analog->name), "Analog %d-axis %d-button",
+   struct seq_buf s;
+
+   seq_buf_init(, analog->name, sizeof(analog->name));
+   seq_buf_printf(, "Analog %d-axis %d-button",
 hweight8(analog->mask & ANALOG_AXES_STD),
 hweight8(analog->mask & ANALOG_BTNS_STD) + !!(analog->mask & 
ANALOG_BTNS_CHF) * 2 +
 hweight16(analog->mask & ANALOG_BTNS_GAMEPAD) + 
!!(analog->mask & ANALOG_HBTN_CHF) * 4);
 
if (analog->mask & ANALOG_HATS_ALL)
-   snprintf(analog->name, sizeof(analog->name), "%s %d-hat",
-analog->name, hweight16(analog->mask & 
ANALOG_HATS_ALL));
+   seq_buf_printf(, " %d-hat",
+  hweight16(analog->mask & ANALOG_HATS_ALL));
 
if (analog->mask & ANALOG_HAT_FCS)
-   strlcat(analog->name, " FCS", sizeof(analog->name));
+   seq_buf_printf(, " FCS");
if (analog->mask & ANALOG_ANY_CHF)
-   strlcat(analog->name, (analog->mask & ANALOG_SAITEK) ? " 
Saitek" : " CHF",
-   sizeof(analog->name));
+   seq_buf_printf(, (analog->mask & ANALOG_SAITEK) ? " Saitek" : 
" CHF");
 
-   strlcat(analog->name, (analog->mask & ANALOG_GAMEPAD) ? " gamepad": " 
joystick",
-   sizeof(analog->name));
+   seq_buf_printf(, (analog->mask & ANALOG_GAMEPAD) ? " gamepad": " 
joystick");
 }
 
 /*
-- 
2.29.2



[PATCH] [v2] hinic: avoid gcc -Wrestrict warning

2021-03-24 Thread Arnd Bergmann
From: Arnd Bergmann 

With extra warnings enabled, gcc complains that snprintf should not
take the same buffer as source and destination:

drivers/net/ethernet/huawei/hinic/hinic_ethtool.c: In function 
'hinic_set_settings_to_hw':
drivers/net/ethernet/huawei/hinic/hinic_ethtool.c:480:9: error: 'snprintf' 
argument 4 overlaps destination object 'set_link_str' [-Werror=restrict]
  480 |   err = snprintf(set_link_str, SET_LINK_STR_MAX_LEN,
  | ^~~~
  481 |   "%sspeed %d ", set_link_str, speed);
  |   ~~~
drivers/net/ethernet/huawei/hinic/hinic_ethtool.c:464:7: note: destination 
object referenced by 'restrict'-qualified argument 1 was declared here
  464 |  char set_link_str[SET_LINK_STR_MAX_LEN] = {0};

Rewrite this to avoid the nested sprintf and instead use separate
buffers, which is simpler.

Cc: Rasmus Villemoes 
Signed-off-by: Arnd Bergmann 
---
v2: rework according to feedback from Rasmus. This one could
easily avoid most of the pitfalls
---
 .../net/ethernet/huawei/hinic/hinic_ethtool.c | 25 ---
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c 
b/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c
index c340d9acba80..d7e20dab6e48 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c
@@ -34,7 +34,7 @@
 #include "hinic_rx.h"
 #include "hinic_dev.h"
 
-#define SET_LINK_STR_MAX_LEN   128
+#define SET_LINK_STR_MAX_LEN   16
 
 #define GET_SUPPORTED_MODE 0
 #define GET_ADVERTISED_MODE1
@@ -462,24 +462,19 @@ static int hinic_set_settings_to_hw(struct hinic_dev 
*nic_dev,
 {
struct hinic_link_ksettings_info settings = {0};
char set_link_str[SET_LINK_STR_MAX_LEN] = {0};
+   const char *autoneg_str;
struct net_device *netdev = nic_dev->netdev;
enum nic_speed_level speed_level = 0;
int err;
 
-   err = snprintf(set_link_str, SET_LINK_STR_MAX_LEN, "%s",
-  (set_settings & HILINK_LINK_SET_AUTONEG) ?
-  (autoneg ? "autong enable " : "autong disable ") : "");
-   if (err < 0 || err >= SET_LINK_STR_MAX_LEN) {
-   netif_err(nic_dev, drv, netdev, "Failed to snprintf link state, 
function return(%d) and dest_len(%d)\n",
- err, SET_LINK_STR_MAX_LEN);
-   return -EFAULT;
-   }
+   autoneg_str = (set_settings & HILINK_LINK_SET_AUTONEG) ?
+ (autoneg ? "autong enable " : "autong disable ") : "";
 
if (set_settings & HILINK_LINK_SET_SPEED) {
speed_level = hinic_ethtool_to_hw_speed_level(speed);
err = snprintf(set_link_str, SET_LINK_STR_MAX_LEN,
-  "%sspeed %d ", set_link_str, speed);
-   if (err <= 0 || err >= SET_LINK_STR_MAX_LEN) {
+  "speed %d ", speed);
+   if (err >= SET_LINK_STR_MAX_LEN) {
netif_err(nic_dev, drv, netdev, "Failed to snprintf 
link speed, function return(%d) and dest_len(%d)\n",
  err, SET_LINK_STR_MAX_LEN);
return -EFAULT;
@@ -494,11 +489,11 @@ static int hinic_set_settings_to_hw(struct hinic_dev 
*nic_dev,
err = hinic_set_link_settings(nic_dev->hwdev, );
if (err != HINIC_MGMT_CMD_UNSUPPORTED) {
if (err)
-   netif_err(nic_dev, drv, netdev, "Set %s failed\n",
- set_link_str);
+   netif_err(nic_dev, drv, netdev, "Set %s%sfailed\n",
+ autoneg_str, set_link_str);
else
-   netif_info(nic_dev, drv, netdev, "Set %s 
successfully\n",
-  set_link_str);
+   netif_info(nic_dev, drv, netdev, "Set 
%s%ssuccessfully\n",
+  autoneg_str, set_link_str);
 
return err;
}
-- 
2.29.2



[PATCH] [v2] drm/imx: imx-ldb: fix out of bounds array access warning

2021-03-24 Thread Arnd Bergmann
From: Arnd Bergmann 

When CONFIG_OF is disabled, building with 'make W=1' produces warnings
about out of bounds array access:

drivers/gpu/drm/imx/imx-ldb.c: In function 'imx_ldb_set_clock.constprop':
drivers/gpu/drm/imx/imx-ldb.c:186:8: error: array subscript -22 is below array 
bounds of 'struct clk *[4]' [-Werror=array-bounds]

Add an error check before the index is used, which helps with the
warning, as well as any possible other error condition that may be
triggered at runtime.

The warning could be fixed by adding a Kconfig depedency on CONFIG_OF,
but Liu Ying points out that the driver may hit the out-of-bounds
problem at runtime anyway.

Signed-off-by: Arnd Bergmann 
---
v2: fix subject line
expand patch description
print mux number
check upper bound as well
---
 drivers/gpu/drm/imx/imx-ldb.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
index dbfe39e2f7f6..40310327fa76 100644
--- a/drivers/gpu/drm/imx/imx-ldb.c
+++ b/drivers/gpu/drm/imx/imx-ldb.c
@@ -197,6 +197,12 @@ static void imx_ldb_encoder_enable(struct drm_encoder 
*encoder)
int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN;
int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
 
+   if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
+   dev_warn(ldb->dev, "%s: invalid mux %d\n",
+__func__, ERR_PTR(mux));
+   return;
+   }
+
drm_panel_prepare(imx_ldb_ch->panel);
 
if (dual) {
@@ -255,6 +261,12 @@ imx_ldb_encoder_atomic_mode_set(struct drm_encoder 
*encoder,
int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
u32 bus_format = imx_ldb_ch->bus_format;
 
+   if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
+   dev_warn(ldb->dev, "%s: invalid mux %d\n",
+__func__, ERR_PTR(mux));
+   return;
+   }
+
if (mode->clock > 17) {
dev_warn(ldb->dev,
 "%s: mode exceeds 170 MHz pixel clock\n", __func__);
-- 
2.29.2



Re: [PATCH -next] applicom: fix some err codes returned by ac_ioctl

2021-03-24 Thread Arnd Bergmann
On Wed, Mar 24, 2021 at 8:20 AM Xu Jia  wrote:
>
> When cmd > 6 or copy_to_user() fail, The variable 'ret' would not be
> returned back. Fix the 'ret' set but not used.
>
> Signed-off-by: Xu Jia 

Reviewed-by: Arnd Bergmann 

> diff --git a/drivers/char/applicom.c b/drivers/char/applicom.c
> index 14b2d8034c51..0ab765143354 100644
> --- a/drivers/char/applicom.c
> +++ b/drivers/char/applicom.c
> @@ -839,7 +839,7 @@ static long ac_ioctl(struct file *file, unsigned int cmd, 
> unsigned long arg)
> Dummy = readb(apbs[IndexCard].RamIO + VERS);
> kfree(adgl);
> mutex_unlock(_mutex);
> -   return 0;
> +   return ret;
>

Apparently this has been broken since the driver was first merged in
linux-2.3.16. I could find no indication of anyone using the driver
and reporting any problems in the git history and it clearly still has
the style of drivers writting in the 1990s. On the other hand, this is
(was) used in some very long-lived systems and you can still
buy old applicom cards from artisan[1].

Is there any chance this driver is still used anywhere with modern
kernels? I suspect we could move it to staging to find out.

  Arnd

[1] https://www.artisantg.com/Mfgr/MolexWoodheadApplicom


Re: [PATCH] irqchip/gic-v3: fix OF_BAD_ADDR error handling

2021-03-24 Thread Arnd Bergmann
On Wed, Mar 24, 2021 at 11:14 AM Marc Zyngier  wrote:
>
> On Tue, 23 Mar 2021 22:06:22 +,
> Nick Desaulniers  wrote:
> >
> > On Tue, Mar 23, 2021 at 6:18 AM Arnd Bergmann  wrote:
> > >
> > > From: Arnd Bergmann 
> > >
> > > When building with extra warnings enabled, clang points out a
> > > mistake in the error handling:
> > >
> > > drivers/irqchip/irq-gic-v3-mbi.c:306:21: error: result of comparison of 
> > > constant 18446744073709551615 with expression of type 'phys_addr_t' (aka 
> > > 'unsigned int') is always false 
> > > [-Werror,-Wtautological-constant-out-of-range-compare]
> >
> > Looks like based on CONFIG_PHYS_ADDR_T_64BIT, phys_addr_t can be u64
> > or u32, but of_translate_address always returns a u64.  This is fine
> > for the current value of OF_BAD_ADDR, but I think there's a risk of
> > losing the top 32b of the return value of of_translate_address() here?
>
> If the DT describes a 64bit physical address, and that the (32bit)
> kernel isn't built to grok these addresses, then I'd say that the
> kernel cannot run on this HW, and that we don't need to worry much
> about this case.
>
> In general, CONFIG_PHYS_ADDR_T_64BIT must be selected by the arch code
> if anything above 32bit can be described in the PA space.

Right, this generally works fine, the OF_BAD_ADDR is just a little
hard to get right. Fortunately there are very few comparisons to
OF_BAD_ADDR in other drivers, so I don't think we have to do
anything about it. I looked through every other user of this and found
no problems there, either they use 64-bit temporaries, or they correctly
cast the results.

 Arnd


[PATCH] rsxx: remove extraneous 'const' qualifier

2021-03-23 Thread Arnd Bergmann
From: Arnd Bergmann 

The returned string from rsxx_card_state_to_str is 'const',
but the other qualifier doesn't change anything here except
causing a warning with 'clang -Wextra':

drivers/block/rsxx/core.c:393:21: warning: 'const' type qualifier on return 
type has no effect [-Wignored-qualifiers]
static const char * const rsxx_card_state_to_str(unsigned int state)

Fixes: f37912039eb0 ("block: IBM RamSan 70/80 trivial changes.")
Reviewed-by: Nick Desaulniers 
Signed-off-by: Arnd Bergmann 
---
 drivers/block/rsxx/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/rsxx/core.c b/drivers/block/rsxx/core.c
index 5ac1881396af..813b0a554d4a 100644
--- a/drivers/block/rsxx/core.c
+++ b/drivers/block/rsxx/core.c
@@ -392,7 +392,7 @@ static irqreturn_t rsxx_isr(int irq, void *pdata)
 }
 
 /*- Card Event Handler ---*/
-static const char * const rsxx_card_state_to_str(unsigned int state)
+static const char *rsxx_card_state_to_str(unsigned int state)
 {
static const char * const state_strings[] = {
"Unknown", "Shutdown", "Starting", "Formatting",
-- 
2.29.2



[PATCH net-next] [RESEND] ch_ktls: fix enum-conversion warning

2021-03-23 Thread Arnd Bergmann
From: Arnd Bergmann 

gcc points out an incorrect enum assignment:

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c: In function 
'chcr_ktls_cpl_set_tcb_rpl':
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:684:22: warning: 
implicit conversion from 'enum ' to 'enum ch_ktls_open_state' 
[-Wenum-conversion]

This appears harmless, and should apparently use 'CH_KTLS_OPEN_SUCCESS'
instead of 'false', with the same value '0'.

Fixes: efca3878a5fb ("ch_ktls: Issue if connection offload fails")
Reviewed-by: Andrew Lunn 
Signed-off-by: Arnd Bergmann 
---
I sent this last October, but it never made it in, resending
---
 drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c 
b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index 46a809f2aeca..7c599195e256 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -722,7 +722,7 @@ static int chcr_ktls_cpl_set_tcb_rpl(struct adapter *adap, 
unsigned char *input)
kvfree(tx_info);
return 0;
}
-   tx_info->open_state = false;
+   tx_info->open_state = CH_KTLS_OPEN_SUCCESS;
spin_unlock(_info->lock);
 
complete(_info->completion);
-- 
2.29.2



Re: [RFC net] net: skbuff: fix stack variable out of bounds access

2021-03-23 Thread Arnd Bergmann
On Tue, Mar 23, 2021 at 3:42 PM Willem de Bruijn
 wrote:
>
> On Tue, Mar 23, 2021 at 8:52 AM Arnd Bergmann  wrote:
> >>
> A similar fix already landed in 5.12-rc3: commit b228c9b05876 ("net:
> expand textsearch ts_state to fit skb_seq_state"). That fix landed in
> 5.12-rc3.

Ah nice, even the same BUILD_BUG_ON() ;-)

Too bad it had to be found through runtime testing when it could have been
found by the compiler warning.

   Arnd


[tip: x86/urgent] x86/build: Turn off -fcf-protection for realmode targets

2021-03-23 Thread tip-bot2 for Arnd Bergmann
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 9fcb51c14da2953de585c5c6e50697b8a6e91a7b
Gitweb:
https://git.kernel.org/tip/9fcb51c14da2953de585c5c6e50697b8a6e91a7b
Author:Arnd Bergmann 
AuthorDate:Tue, 23 Mar 2021 13:48:36 +01:00
Committer: Ingo Molnar 
CommitterDate: Tue, 23 Mar 2021 16:36:01 +01:00

x86/build: Turn off -fcf-protection for realmode targets

The new Ubuntu GCC packages turn on -fcf-protection globally,
which causes a build failure in the x86 realmode code:

  cc1: error: ‘-fcf-protection’ is not compatible with this target

Turn it off explicitly on compilers that understand this option.

Signed-off-by: Arnd Bergmann 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210323124846.1584944-1-a...@kernel.org
---
 arch/x86/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 2d6d5a2..9a85eae 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -27,7 +27,7 @@ endif
 REALMODE_CFLAGS:= -m16 -g -Os -DDISABLE_BRANCH_PROFILING \
   -Wall -Wstrict-prototypes -march=i386 -mregparm=3 \
   -fno-strict-aliasing -fomit-frame-pointer -fno-pic \
-  -mno-mmx -mno-sse
+  -mno-mmx -mno-sse $(call cc-option,-fcf-protection=none)
 
 REALMODE_CFLAGS += -ffreestanding
 REALMODE_CFLAGS += -fno-stack-protector


[PATCH] [v2] btrfs: zoned: bail out in btrfs_alloc_chunk for bad input

2021-03-23 Thread Arnd Bergmann
From: Arnd Bergmann 

gcc complains that the ctl->max_chunk_size member might be used
uninitialized when none of the three conditions for initializing it in
init_alloc_chunk_ctl_policy_zoned() are true:

In function ‘init_alloc_chunk_ctl_policy_zoned’,
inlined from ‘init_alloc_chunk_ctl’ at fs/btrfs/volumes.c:5023:3,
inlined from ‘btrfs_alloc_chunk’ at fs/btrfs/volumes.c:5340:2:
include/linux/compiler-gcc.h:48:45: error: ‘ctl.max_chunk_size’ may be used 
uninitialized [-Werror=maybe-uninitialized]
 4998 | ctl->max_chunk_size = min(limit, ctl->max_chunk_size);
  |   ^~~
fs/btrfs/volumes.c: In function ‘btrfs_alloc_chunk’:
fs/btrfs/volumes.c:5316:32: note: ‘ctl’ declared here
 5316 | struct alloc_chunk_ctl ctl;
  |^~~

If we ever get into this condition, something is seriously
wrong, so the same logic as in init_alloc_chunk_ctl_policy_regular()
and a few other places should be applied. This avoids both further
data corruption, and the compile-time warning.

Fixes: 1cd6121f2a38 ("btrfs: zoned: implement zoned chunk allocator")
Link: https://lore.kernel.org/lkml/20210323132343.gf7...@twin.jikos.cz/
Suggested-by: David Sterba 
Signed-off-by: Arnd Bergmann 
---
Note that the -Wmaybe-unintialized warning is globally disabled
by default. For some reason I got this warning anyway when building
this specific file with gcc-11.
---
 fs/btrfs/volumes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bc3b33efddc5..d2994305ed77 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4989,6 +4989,8 @@ static void init_alloc_chunk_ctl_policy_zoned(
ctl->max_chunk_size = 2 * ctl->max_stripe_size;
ctl->devs_max = min_t(int, ctl->devs_max,
  BTRFS_MAX_DEVS_SYS_CHUNK);
+   } else {
+   BUG();
}
 
/* We don't want a chunk larger than 10% of writable space */
-- 
2.29.2



[PATCH] ARM: delay: avoid clang -Wtautological-constant warning

2021-03-23 Thread Arnd Bergmann
From: Arnd Bergmann 

Passing an 8-bit constant into delay() triggers a warning when building
with 'make W=1' using clang:

drivers/clk/actions/owl-pll.c:182:2: error: result of comparison of constant 
2000 with expression of type 'u8' (aka 'unsigned char') is always false 
[-Werror,-Wtautological-constant-out-of-range-compare]
udelay(pll_hw->delay);
^
arch/arm/include/asm/delay.h:84:9: note: expanded from macro 'udelay'
  ((n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() :  \
   ~~~ ^ ~~
arch/arm/mach-omap2/wd_timer.c:89:3: error: result of comparison of constant 
2000 with expression of type 'u8' (aka 'unsigned char') is always false 
[-Werror,-Wtautological-constant-out-of-range-compare]
udelay(oh->class->sysc->srst_udelay);
^~~~

Shut up the warning by adding a cast to a 64-bit number. A cast to 'int'
would usually be sufficient, but would fail to cause a link-time error
for large 64-bit constants.

Signed-off-by: Arnd Bergmann 
---
 arch/arm/include/asm/delay.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/delay.h b/arch/arm/include/asm/delay.h
index 4f80b72372b4..1bb6417a3a83 100644
--- a/arch/arm/include/asm/delay.h
+++ b/arch/arm/include/asm/delay.h
@@ -81,7 +81,7 @@ extern void __bad_udelay(void);
 
 #define udelay(n)  \
(__builtin_constant_p(n) ?  \
- ((n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() :  \
+ ((u64)(n) > (MAX_UDELAY_MS * 1000) ? __bad_udelay() : \
__const_udelay((n) * UDELAY_MULT)) :\
  __udelay(n))
 
-- 
2.29.2



[PATCH] irqchip/gic-v3: fix OF_BAD_ADDR error handling

2021-03-23 Thread Arnd Bergmann
From: Arnd Bergmann 

When building with extra warnings enabled, clang points out a
mistake in the error handling:

drivers/irqchip/irq-gic-v3-mbi.c:306:21: error: result of comparison of 
constant 18446744073709551615 with expression of type 'phys_addr_t' (aka 
'unsigned int') is always false 
[-Werror,-Wtautological-constant-out-of-range-compare]
if (mbi_phys_base == OF_BAD_ADDR) {

Truncate the constant to the same type as the variable it gets compared
to, to shut make the check work and void the warning.

Fixes: 505287525c24 ("irqchip/gic-v3: Add support for Message Based Interrupts 
as an MSI controller")
Signed-off-by: Arnd Bergmann 
---
 drivers/irqchip/irq-gic-v3-mbi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index 563a9b366294..e81e89a81cb5 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -303,7 +303,7 @@ int __init mbi_init(struct fwnode_handle *fwnode, struct 
irq_domain *parent)
reg = of_get_property(np, "mbi-alias", NULL);
if (reg) {
mbi_phys_base = of_translate_address(np, reg);
-   if (mbi_phys_base == OF_BAD_ADDR) {
+   if (mbi_phys_base == (phys_addr_t)OF_BAD_ADDR) {
ret = -ENXIO;
goto err_free_mbi;
}
-- 
2.29.2



  1   2   3   4   5   6   7   8   9   10   >