Re: [PATCH v7 00/24] x86: Implement an HPET-based hardlockup detector

2023-04-12 Thread Ricardo Neri
On Wed, Mar 01, 2023 at 03:47:29PM -0800, Ricardo Neri wrote:
> Hi x86 trusted reviewers,
> 
> This is the seventh version of this patchset. I acknowledge that it took me
> a long time to post a new version. Sorry! I will commit time to continue
> working on this series with high priority and I will post a new series soon
> after receiving your new feedback.
> 
> Although this series touches several subsystems, I plan to send it to the
> x86 maintainers because a) the series does not make much sense if split
> into subsystems, b) Thomas Gleixner has reviewed previous versions, and c)
> he has contributed to all the subsystems I modify.
> 
> Tony Luck has kindly reviewed previous versions of the series and I carried
> his Reviewed-by tags. This version, however, has new patches that also need
> review.
> 
> I seek to collect the Reviewed-by tags from the x86 trusted reviewers for
> the following patches:
>+ arch/x86: 4, 5
>+ Intel IOMMU: 6,
>+ AMD IOMMU: 9, 10, 11,
>+ NMI watchdog: 23 and 24.

Hello, checking if there is any feedback on these patches that I plan to
send to the x86 maintainer.

I am still seeking to collect the Reviewed-by: tags from the x86 trusted
reviewers.

Thanks in advance!

BR,
Ricardo


Re: [PATCH] powerpc/bpf: populate extable entries only during the last pass

2023-04-12 Thread Hari Bathini

Hello Christophe,

Thanks for the review.

On 07/04/23 11:31 am, Christophe Leroy wrote:



Le 06/04/2023 à 09:35, Hari Bathini a écrit :

Since commit 85e031154c7c ("powerpc/bpf: Perform complete extra passes
to update addresses"), two additional passes are performed to avoid
space and CPU time wastage on powerpc. But these extra passes led to
WARN_ON_ONCE() hits in bpf_add_extable_entry(). Fix it by not adding
extable entries during the extra pass.


Are you sure this change is correct ?


Actually, I was in two minds about that owing to commit 04c04205bc35
("bpf powerpc: Remove extra_pass from bpf_jit_build_body()").


During the extra pass the code can get shrinked or expanded (within the
limits of the size of the preliminary pass). Shouldn't extable entries
be populated during the last pass ?


Unlikely, but the intention there was to eliminate a regression in case
extra_pass ends up being 'false' always in any subsequent change.

- Hari



Fixes: 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update 
addresses")
Signed-off-by: Hari Bathini 
---
   arch/powerpc/net/bpf_jit_comp32.c | 2 +-
   arch/powerpc/net/bpf_jit_comp64.c | 2 +-
   2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c 
b/arch/powerpc/net/bpf_jit_comp32.c
index 7f91ea064c08..e788b1fbeee6 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -977,7 +977,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, 
struct codegen_context *
if (size != BPF_DW && !fp->aux->verifier_zext)
EMIT(PPC_RAW_LI(dst_reg_h, 0));
   
-			if (BPF_MODE(code) == BPF_PROBE_MEM) {

+   if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) {
int insn_idx = ctx->idx - 1;
int jmp_off = 4;
   
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c

index 8dd3cabaa83a..1cc2777ec846 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -921,7 +921,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, 
struct codegen_context *
if (size != BPF_DW && insn_is_zext([i + 1]))
addrs[++i] = ctx->idx * 4;
   
-			if (BPF_MODE(code) == BPF_PROBE_MEM) {

+   if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) {
ret = bpf_add_extable_entry(fp, image, pass, ctx, 
ctx->idx - 1,
4, dst_reg);
if (ret)


[PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()

2023-04-12 Thread Deming Wang
memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().

As a pointer is passed into posix_memalign(), initialize *s to NULL
to silence a warning about the function's return value being used as
uninitialized (which is not valid anyway because the error is properly
checked before s is returned).

Signed-off-by: Deming Wang 
---
 tools/testing/selftests/powerpc/stringloops/strlen.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c 
b/tools/testing/selftests/powerpc/stringloops/strlen.c
index 9055ebc484d0..f9c1f9cc2d32 100644
--- a/tools/testing/selftests/powerpc/stringloops/strlen.c
+++ b/tools/testing/selftests/powerpc/stringloops/strlen.c
@@ -1,5 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0
-#include 
 #include 
 #include 
 #include 
@@ -51,10 +50,11 @@ static void bench_test(char *s)
 static int testcase(void)
 {
char *s;
+   int ret;
unsigned long i;
 
-   s = memalign(128, SIZE);
-   if (!s) {
+   ret = posix_memalign((void **), 128, SIZE);
+   if (ret < 0) {
perror("memalign");
exit(1);
}
-- 
2.27.0



Re: [PATCH] powerpc/mpc52xx_lpbfifo: Drop unused functions

2023-04-12 Thread Michael Ellerman
Uwe Kleine-König  writes:
> On Wed, Dec 28, 2022 at 03:51:29PM +0100, Uwe Kleine-König wrote:
>> The four exported functions mpc52xx_lpbfifo_submit(),
>> mpc52xx_lpbfifo_abort(), mpc52xx_lpbfifo_poll(), and
>> mpc52xx_lpbfifo_start_xfer() are not used. So they can be dropped and the
>> definitions needed to call them can be moved into the driver file.
>> 
>> Signed-off-by: Uwe Kleine-König 
>
> I never got feedback about this driver and it has not appeared in next
> up to now. Did it fell through the cracks?

Yeah. I was hoping someone would explain what's going on with the
driver.

Presumably there are some out-of-tree drivers that use the routines
provided by this driver?

I think rather than merging this patch, which keeps the code but makes
it completely useless, do you mind sending a patch to remove the whole
driver? Maybe that will get someone's attention.

cheers


Re: [PATCH 2/2] start_kernel: omit prevent_tail_call_optimization for newer toolchains

2023-04-12 Thread Nathan Chancellor
On Wed, Apr 12, 2023 at 11:32:13AM -0700, ndesaulni...@google.com wrote:
> prevent_tail_call_optimization was added in
> commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")
> to work around stack canaries getting inserted into functions that would
> initialize the stack canary in the first place.
> 
> Now that we have no_stack_protector function attribute (gcc-11+,
> clang-7+) and use it on start_kernel, remove the call to
> prevent_tail_call_optimization such that we may one day remove it
> outright.
> 
> Signed-off-by: Nick Desaulniers 

Reviewed-by: Nathan Chancellor 

> ---
>  init/main.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/init/main.c b/init/main.c
> index 213baf7b8cb1..c8503d02dfa6 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1152,7 +1152,12 @@ void start_kernel(void)
>   /* Do the rest non-__init'ed, we're now alive */
>   arch_call_rest_init();
>  
> + /* Avoid stack canaries in callers of boot_init_stack_canary for gcc-10
> +  * and older.
> +  */
> +#if !__has_attribute(__no_stack_protector__)
>   prevent_tail_call_optimization();
> +#endif
>  }
>  
>  /* Call all constructor functions linked into the kernel. */
> 
> -- 
> 2.40.0.577.gac1e443424-goog
> 


Re: [PATCH 1/2] start_kernel: add no_stack_protector fn attr

2023-04-12 Thread Nathan Chancellor
On Wed, Apr 12, 2023 at 11:32:12AM -0700, ndesaulni...@google.com wrote:
> Back during the discussion of
> commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")
> we discussed the need for a function attribute to control the omission
> of stack protectors on a per-function basis; at the time Clang had
> support for no_stack_protector but GCC did not. This was fixed in
> gcc-11. Now that the function attribute is available, let's start using
> it.
> 
> Callers of boot_init_stack_canary need to use this function attribute
> unless they're compiled with -fno-stack-protector, otherwise the canary
> stored in the stack slot of the caller will differ upon the call to
> boot_init_stack_canary. This will lead to a call to __stack_chk_fail
> then panic.
> 
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722
> Link: 
> https://lore.kernel.org/all/20200316130414.gc12...@hirez.programming.kicks-ass.net/
> Signed-off-by: Nick Desaulniers 

I applied this in front of Josh's series and defconfig no longer panics
on boot :)

Tested-by: Nathan Chancellor 

> ---
>  arch/powerpc/kernel/smp.c   |  1 +
>  include/linux/compiler_attributes.h | 12 
>  init/main.c |  3 ++-
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 6b90f10a6c81..7d4c12b1abb7 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1603,6 +1603,7 @@ static void add_cpu_to_masks(int cpu)
>  }
>  
>  /* Activate a secondary processor. */
> +__no_stack_protector
>  void start_secondary(void *unused)
>  {
>   unsigned int cpu = raw_smp_processor_id();
> diff --git a/include/linux/compiler_attributes.h 
> b/include/linux/compiler_attributes.h
> index e659cb6fded3..84864767a56a 100644
> --- a/include/linux/compiler_attributes.h
> +++ b/include/linux/compiler_attributes.h
> @@ -255,6 +255,18 @@
>   */
>  #define __noreturn  __attribute__((__noreturn__))
>  
> +/*
> + * Optional: only supported since GCC >= 11.1, clang >= 7.0.
> + *
> + *   gcc: 
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fstack_005fprotector-function-attribute
> + *   clang: 
> https://clang.llvm.org/docs/AttributeReference.html#no-stack-protector-safebuffers
> + */
> +#if __has_attribute(__no_stack_protector__)
> +# define __no_stack_protector
> __attribute__((__no_stack_protector__))
> +#else
> +# define __no_stack_protector
> +#endif
> +
>  /*
>   * Optional: not supported by gcc.
>   *
> diff --git a/init/main.c b/init/main.c
> index bb87b789c543..213baf7b8cb1 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -941,7 +941,8 @@ static void __init print_unknown_bootoptions(void)
>   memblock_free(unknown_options, len);
>  }
>  
> -asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
> +asmlinkage __visible __init __no_sanitize_address __no_stack_protector
> +void start_kernel(void)
>  {
>   char *command_line;
>   char *after_dashes;
> 
> -- 
> 2.40.0.577.gac1e443424-goog
> 
> 


Re: [PATCH v3 5/6] PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler

2023-04-12 Thread Bjorn Helgaas
On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote:
> From: Robert Richter 
> 
> In Restricted CXL Device (RCD) mode a CXL device is exposed as an
> RCiEP, but CXL downstream and upstream ports are not enumerated and
> not visible in the PCIe hierarchy. Protocol and link errors are sent
> to an RCEC.
> 
> Restricted CXL host (RCH) downstream port-detected errors are signaled
> as internal AER errors, either Uncorrectable Internal Error (UIE) or
> Corrected Internal Errors (CIE). The error source is the id of the
> RCEC. A CXL handler must then inspect the error status in various CXL
> registers residing in the dport's component register space (CXL RAS
> cap) or the dport's RCRB (AER ext cap). [1]
> 
> Errors showing up in the RCEC's error handler must be handled and
> connected to the CXL subsystem. Implement this by forwarding the error
> to all CXL devices below the RCEC. Since the entire CXL device is
> controlled only using PCIe Configuration Space of device 0, Function
> 0,

Capitalize "device" and "Function" the same way (also appears in
comment below).

> only pass it there [2]. These devices have the Memory Device class
> code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver
> can implement the handler. In addition to errors directed to the CXL
> endpoint device, the handler must also inspect the CXL downstream
> port's CXL RAS and PCIe AER external capabilities that is connected to

"AER external capabilities" --  is that referring to the "AER
*Extended* capability"?  If so, we usually don't bother including the
"extended" part because it's usually not relevant.  But if you intended
"external", I don't know what it means.

> the device.
> 
> Since CXL downstream port errors are signaled using internal errors,
> the handler requires those errors to be unmasked. This is subject of a
> follow-on patch.
> 
> The reason for choosing this implementation is that a CXL RCEC device
> is bound to the AER port driver, but the driver does not allow it to
> register a custom specific handler to support CXL. Connecting the RCEC
> hard-wired with a CXL handler does not work, as the CXL subsystem
> might not be present all the time. The alternative to add an
> implementation to the portdrv to allow the registration of a custom
> RCEC error handler isn't worth doing it as CXL would be its only user.
> Instead, just check for an CXL RCEC and pass it down to the connected
> CXL device's error handler. With this approach the code can entirely
> be implemented in the PCIe AER driver and is independent of the CXL
> subsystem. The CXL driver only provides the handler.

Can you make this more concrete with an example topology so we can
work through how this all works?  Correct me when I go off the rails
here:

The current code uses pcie_walk_rcec() in this path, which basically
searches below a Root Port or RCEC for devices that have an AER error
status bit set, add them to the e_info[] list, and call
handle_error_source() for each one:

  aer_isr_one_error
# get e_src from aer_fifo
find_source_device(e_src)
  pcie_walk_rcec(find_device_iter)
find_device_iter
  is_error_source
# read PCI_ERR_COR_STATUS or PCI_ERR_UNCOR_STATUS
  if (error-source)
add_error_device
  # add device to e_info[] list
# now call handle_error_source for everything in e_info[]
aer_process_err_devices
  for (i = 0; i < e_info->err_dev_num; i++)
handle_error_source

IIUC, this patch basically says that an RCEC should have an AER error
status bit (UIE or CIE) set, but the devices "below" the RCEC will
not, so they won't get added to e_info[].

So we insert cxl_handle_error() in handle_error_source(), where it
gets called for the RCEC, and then it uses pcie_walk_rcec() again to
forcibly call handle_error_source() for *every* device "below" the
RCEC (even though they don't have AER error status bits set).

Then handle_error_source() ultimately calls the CXL driver err_handler
entry points (.cor_error_detected(), .error_detected(), etc), which
can look at the CXL-specific error status in the CXL RAS or RCRB or
whatever.

So this basically looks like a workaround for the fact that the AER
code only calls handle_error_source() when it finds AER error status,
and CXL doesn't *set* that AER error status.  There's not that much
code here, but it seems like a quite a bit of complexity in an area
that is already pretty complicated.

Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically
receives a packet of error status from firmware and queues it for
recovery via pcie_do_recovery().  What if you had a CXL module that
knew how to look for the CXL error status, package it up similarly,
and queue it via aer_recover_queue()?

> [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
> [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
> 
> Co-developed-by: Terry Bowman 
> Signed-off-by: Robert Richter 
> Signed-off-by: Terry 

RE: [PATCH 0/6] bus: fsl-mc: Make remove function return void

2023-04-12 Thread Leo Li



> -Original Message-
> From: Uwe Kleine-König 
> Sent: Wednesday, April 12, 2023 12:11 PM
> To: Stuart Yoder ; Laurentiu Tudor
> ; Roy Pledge ; Leo Li
> ; Horia Geanta ; Pankaj
> Gupta ; Gaurav Jain ;
> Herbert Xu ; David S. Miller
> ; Vinod Koul ; Ioana Ciornei
> ; Eric Dumazet ; Jakub
> Kicinski ; Paolo Abeni ; Y.B. Lu
> ; Diana Madalina Craciun (OSS)
> ; Alex Williamson
> ; Richard Cochran
> 
> Cc: k...@vger.kernel.org; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; linux-cry...@vger.kernel.org;
> ker...@pengutronix.de; dmaeng...@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org; linux-arm-ker...@lists.infradead.org
> Subject: Re: [PATCH 0/6] bus: fsl-mc: Make remove function return void
> 
> Hello,
> 
> On Fri, Mar 10, 2023 at 11:41:22PM +0100, Uwe Kleine-König wrote:
> > Hello,
> >
> > many bus remove functions return an integer which is a historic
> > misdesign that makes driver authors assume that there is some kind of
> > error handling in the upper layers. This is wrong however and
> > returning and error code only yields an error message.
> >
> > This series improves the fsl-mc bus by changing the remove callback to
> > return no value instead. As a preparation all drivers are changed to
> > return zero before so that they don't trigger the error message.
> 
> Who is supposed to pick up this patch series (or point out a good reason for
> not taking it)?

Previously Greg KH picked up MC bus patches.

If no one is picking up them this time, I probably can take it through the fsl 
soc tree.

Regards,
Leo


Re: [PATCH v3 6/6] PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling

2023-04-12 Thread Bjorn Helgaas
On Tue, Apr 11, 2023 at 01:03:02PM -0500, Terry Bowman wrote:
> From: Robert Richter 
> 
> RCEC AER corrected and uncorrectable internal errors (CIE/UIE) are
> disabled by default.

"Disabled by default" just means "the power-up state of CIE/UIC is
that they are masked", right?  It doesn't mean that Linux normally
masks them.

> [1][2] Enable them to receive CXL downstream port
> errors of a Restricted CXL Host (RCH).
> 
> [1] CXL 3.0 Spec, 12.2.1.1 - RCH Downstream Port Detected Errors
> [2] PCIe Base Spec 6.0, 7.8.4.3 Uncorrectable Error Mask Register,
> 7.8.4.6 Correctable Error Mask Register
> 
> Co-developed-by: Terry Bowman 
> Signed-off-by: Robert Richter 
> Signed-off-by: Terry Bowman 
> Cc: "Oliver O'Halloran" 
> Cc: Bjorn Helgaas 
> Cc: Mahesh J Salgaonkar 
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/pci/pcie/aer.c | 73 ++
>  1 file changed, 73 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 171a08fd8ebd..3973c731e11d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1000,7 +1000,79 @@ static void cxl_handle_error(struct pci_dev *dev, 
> struct aer_err_info *info)
>   pcie_walk_rcec(dev, cxl_handle_error_iter, info);
>  }
>  
> +static bool cxl_error_is_native(struct pci_dev *dev)
> +{
> + struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
> +
> + if (pcie_ports_native)
> + return true;
> +
> + return host->native_aer && host->native_cxl_error;
> +}
> +
> +static int handles_cxl_error_iter(struct pci_dev *dev, void *data)
> +{
> + int *handles_cxl = data;
> +
> + *handles_cxl = is_cxl_mem_dev(dev) && cxl_error_is_native(dev);
> +
> + return *handles_cxl;
> +}
> +
> +static bool handles_cxl_errors(struct pci_dev *rcec)
> +{
> + int handles_cxl = 0;
> +
> + if (!rcec->aer_cap)
> + return false;
> +
> + if (pci_pcie_type(rcec) == PCI_EXP_TYPE_RC_EC)
> + pcie_walk_rcec(rcec, handles_cxl_error_iter, _cxl);
> +
> + return !!handles_cxl;
> +}
> +
> +static int __cxl_unmask_internal_errors(struct pci_dev *rcec)
> +{
> + int aer, rc;
> + u32 mask;
> +
> + /*
> +  * Internal errors are masked by default, unmask RCEC's here
> +  * PCI6.0 7.8.4.3 Uncorrectable Error Mask Register (Offset 08h)
> +  * PCI6.0 7.8.4.6 Correctable Error Mask Register (Offset 14h)
> +  */

Unmasking internal errors doesn't have anything specific to do with
CXL, so I don't think it should have "cxl" in the function name.
Maybe something like "pci_aer_unmask_internal_errors()".

This also has nothing special to do with RCECs, so I think we should
refer to the device as "dev" as is typical in this file.

I think this needs to check pcie_aer_is_native() as is done by
pci_aer_clear_nonfatal_status() and other functions that write the AER
Capability.

With the exception of this function, this patch looks like all CXL
code that maybe could be with other CXL code.  Would require making
pcie_walk_rcec() available outside drivers/pci, I guess.

> + aer = rcec->aer_cap;
> + rc = pci_read_config_dword(rcec, aer + PCI_ERR_UNCOR_MASK, );
> + if (rc)
> + return rc;
> + mask &= ~PCI_ERR_UNC_INTN;
> + rc = pci_write_config_dword(rcec, aer + PCI_ERR_UNCOR_MASK, mask);
> + if (rc)
> + return rc;
> +
> + rc = pci_read_config_dword(rcec, aer + PCI_ERR_COR_MASK, );
> + if (rc)
> + return rc;
> + mask &= ~PCI_ERR_COR_INTERNAL;
> + rc = pci_write_config_dword(rcec, aer + PCI_ERR_COR_MASK, mask);
> +
> + return rc;
> +}
> +
> +static void cxl_unmask_internal_errors(struct pci_dev *rcec)
> +{
> + if (!handles_cxl_errors(rcec))
> + return;
> +
> + if (__cxl_unmask_internal_errors(rcec))
> + dev_err(>dev, "cxl: Failed to unmask internal errors");
> + else
> + dev_dbg(>dev, "cxl: Internal errors unmasked");
> +}
> +
>  #else
> +static inline void cxl_unmask_internal_errors(struct pci_dev *dev) { }
>  static inline void cxl_handle_error(struct pci_dev *dev,
>   struct aer_err_info *info) { }
>  #endif
> @@ -1397,6 +1469,7 @@ static int aer_probe(struct pcie_device *dev)
>   return status;
>   }
>  
> + cxl_unmask_internal_errors(port);
>   aer_enable_rootport(rpc);
>   pci_info(port, "enabled with IRQ %d\n", dev->irq);
>   return 0;
> -- 
> 2.34.1
> 


Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver

2023-04-12 Thread Sean Anderson

Hi Vinod,

On 4/12/23 12:10, Vinod Koul wrote:

On 11-04-23, 14:43, Sean Anderson wrote:

This adds support for the Lynx 10G "SerDes" devices found on various NXP
QorIQ SoCs. There may be up to four SerDes devices on each SoC, each
supporting up to eight lanes. Protocol support for each SerDes is highly
heterogeneous, with each SoC typically having a totally different
selection of supported protocols for each lane. Additionally, the SerDes
devices on each SoC also have differing support. One SerDes will
typically support Ethernet on most lanes, while the other will typically
support PCIe on most lanes.

There is wide hardware support for this SerDes. It is present on QorIQ
T-Series and Layerscape processors. Because each SoC typically has
specific instructions and exceptions for its SerDes, I have limited the
initial scope of this module to just the LS1046A and LS1088A.
Additionally, I have only added support for Ethernet protocols. There is
not a great need for dynamic reconfiguration for other protocols (except
perhaps for M.2 cards), so support for them may never be added.

Nevertheless, I have tried to provide an obvious path for adding support
for other SoCs as well as other protocols. SATA just needs support for
configuring LNmSSCR0. PCIe may need to configure the equalization
registers. It also uses multiple lanes. I have tried to write the driver
with multi-lane support in mind, so there should not need to be any
large changes. Although there are 6 protocols supported, I have only
tested SGMII and XFI. The rest have been implemented as described in
the datasheet. Most of these protocols should work "as-is", but
10GBASE-KR will need PCS support for link training.

Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all
configured for PCIe, this driver uses one phy configured to use 4 lanes.
This is because while the individual lanes may be configured
individually, the protocol selection acts on all lanes at once.
Additionally, the order which lanes should be configured in is specified
by the datasheet. To coordinate this, lanes are reserved in phy_init,
and released in phy_exit.

This driver was written with reference to the LS1046A reference manual.
However, it was informed by reference manuals for all processors with
mEMACs, especially the T4240 (which appears to have a "maxed-out"
configuration). The earlier P-series processors appear to be similar, but
have a different overall register layout (using "banks" instead of
separate SerDes). Perhaps this those use a "5G Lynx SerDes."

Signed-off-by: Sean Anderson 
---

(no changes since v10)

Changes in v10:
- Fix debugging print with incorrect error variable

Changes in v9:
- Split off clock "driver" into its own patch to allow for better
   review.
- Add ability to defer lane initialization to phy_init. This allows
   for easier transitioning between firmware-managed serdes and Linux-
   managed serdes, as the consumer (such as dpaa2, which knows what the
   firmware is doing) has the last say on who gets control.
- phy-type -> fsl,phy

Changes in v8:
- Remove unused variable from lynx_ls_mode_init

Changes in v7:
- Break out call order into generic documentation
- Refuse to switch "major" protocols
- Update Kconfig to reflect restrictions
- Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
   anything.

Changes in v6:
- Update MAINTAINERS to include new files
- Include bitfield.h and slab.h to allow compilation on non-arm64
   arches.
- Depend on COMMON_CLK and either layerscape/ppc

Changes in v5:
- Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
   series to be applied directly to linux/master.
- Add fsl,lynx-10g.h to MAINTAINERS

Changes in v4:
- Rework all debug statements to remove use of __func__. Additional
   information has been provided as necessary.
- Consider alternative parent rates in round_rate and not in set_rate.
   Trying to modify out parent's rate in set_rate will deadlock.
- Explicitly perform a stop/reset sequence in set_rate. This way we
   always ensure that the PLL is properly stopped.
- Set the power-down bit when disabling the PLL. We can do this now that
   enable/disable aren't abused during the set rate sequence.
- Fix typos in QSGMII_OFFSET and XFI_OFFSET
- Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better
   reflect its function (adding post-cursor equalization).
- Use of_clk_hw_onecell_get instead of a custom function.
- Return struct clks from lynx_clks_init instead of embedding lynx_clk
   in lynx_priv.
- Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs
   primarily in the layout and offset of the PCCRs. This will help bring a
   cleaner abstraction layer. The caps have been removed, since this handles the
   only current usage.
- Convert to use new binding format. As a result of this, we no longer need to
   have protocols for PCIe or SATA. Additionally, modes now live in lynx_group
   instead of lynx_priv.
- Remove teq from 

Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver

2023-04-12 Thread Sean Anderson

Hi Vinod,

On 4/12/23 12:10, Vinod Koul wrote:

On 11-04-23, 14:43, Sean Anderson wrote:

This adds support for the Lynx 10G "SerDes" devices found on various NXP
QorIQ SoCs. There may be up to four SerDes devices on each SoC, each
supporting up to eight lanes. Protocol support for each SerDes is highly
heterogeneous, with each SoC typically having a totally different
selection of supported protocols for each lane. Additionally, the SerDes
devices on each SoC also have differing support. One SerDes will
typically support Ethernet on most lanes, while the other will typically
support PCIe on most lanes.

There is wide hardware support for this SerDes. It is present on QorIQ
T-Series and Layerscape processors. Because each SoC typically has
specific instructions and exceptions for its SerDes, I have limited the
initial scope of this module to just the LS1046A and LS1088A.
Additionally, I have only added support for Ethernet protocols. There is
not a great need for dynamic reconfiguration for other protocols (except
perhaps for M.2 cards), so support for them may never be added.

Nevertheless, I have tried to provide an obvious path for adding support
for other SoCs as well as other protocols. SATA just needs support for
configuring LNmSSCR0. PCIe may need to configure the equalization
registers. It also uses multiple lanes. I have tried to write the driver
with multi-lane support in mind, so there should not need to be any
large changes. Although there are 6 protocols supported, I have only
tested SGMII and XFI. The rest have been implemented as described in
the datasheet. Most of these protocols should work "as-is", but
10GBASE-KR will need PCS support for link training.

Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all
configured for PCIe, this driver uses one phy configured to use 4 lanes.
This is because while the individual lanes may be configured
individually, the protocol selection acts on all lanes at once.
Additionally, the order which lanes should be configured in is specified
by the datasheet. To coordinate this, lanes are reserved in phy_init,
and released in phy_exit.

This driver was written with reference to the LS1046A reference manual.
However, it was informed by reference manuals for all processors with
mEMACs, especially the T4240 (which appears to have a "maxed-out"
configuration). The earlier P-series processors appear to be similar, but
have a different overall register layout (using "banks" instead of
separate SerDes). Perhaps this those use a "5G Lynx SerDes."

Signed-off-by: Sean Anderson 
---

(no changes since v10)

Changes in v10:
- Fix debugging print with incorrect error variable

Changes in v9:
- Split off clock "driver" into its own patch to allow for better
   review.
- Add ability to defer lane initialization to phy_init. This allows
   for easier transitioning between firmware-managed serdes and Linux-
   managed serdes, as the consumer (such as dpaa2, which knows what the
   firmware is doing) has the last say on who gets control.
- phy-type -> fsl,phy

Changes in v8:
- Remove unused variable from lynx_ls_mode_init

Changes in v7:
- Break out call order into generic documentation
- Refuse to switch "major" protocols
- Update Kconfig to reflect restrictions
- Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
   anything.

Changes in v6:
- Update MAINTAINERS to include new files
- Include bitfield.h and slab.h to allow compilation on non-arm64
   arches.
- Depend on COMMON_CLK and either layerscape/ppc

Changes in v5:
- Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
   series to be applied directly to linux/master.
- Add fsl,lynx-10g.h to MAINTAINERS

Changes in v4:
- Rework all debug statements to remove use of __func__. Additional
   information has been provided as necessary.
- Consider alternative parent rates in round_rate and not in set_rate.
   Trying to modify out parent's rate in set_rate will deadlock.
- Explicitly perform a stop/reset sequence in set_rate. This way we
   always ensure that the PLL is properly stopped.
- Set the power-down bit when disabling the PLL. We can do this now that
   enable/disable aren't abused during the set rate sequence.
- Fix typos in QSGMII_OFFSET and XFI_OFFSET
- Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better
   reflect its function (adding post-cursor equalization).
- Use of_clk_hw_onecell_get instead of a custom function.
- Return struct clks from lynx_clks_init instead of embedding lynx_clk
   in lynx_priv.
- Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs
   primarily in the layout and offset of the PCCRs. This will help bring a
   cleaner abstraction layer. The caps have been removed, since this handles the
   only current usage.
- Convert to use new binding format. As a result of this, we no longer need to
   have protocols for PCIe or SATA. Additionally, modes now live in lynx_group
   instead of lynx_priv.
- Remove teq from 

Re: [PATCH 1/2] start_kernel: add no_stack_protector fn attr

2023-04-12 Thread Miguel Ojeda
On Wed, Apr 12, 2023 at 8:32 PM  wrote:
>
>  include/linux/compiler_attributes.h | 12 

Acked-by: Miguel Ojeda 

Cheers,
Miguel


Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver

2023-04-12 Thread Sean Anderson

Hi Vinod,

On 4/12/23 12:10, Vinod Koul wrote:

On 11-04-23, 14:43, Sean Anderson wrote:

This adds support for the Lynx 10G "SerDes" devices found on various NXP
QorIQ SoCs. There may be up to four SerDes devices on each SoC, each
supporting up to eight lanes. Protocol support for each SerDes is highly
heterogeneous, with each SoC typically having a totally different
selection of supported protocols for each lane. Additionally, the SerDes
devices on each SoC also have differing support. One SerDes will
typically support Ethernet on most lanes, while the other will typically
support PCIe on most lanes.

There is wide hardware support for this SerDes. It is present on QorIQ
T-Series and Layerscape processors. Because each SoC typically has
specific instructions and exceptions for its SerDes, I have limited the
initial scope of this module to just the LS1046A and LS1088A.
Additionally, I have only added support for Ethernet protocols. There is
not a great need for dynamic reconfiguration for other protocols (except
perhaps for M.2 cards), so support for them may never be added.

Nevertheless, I have tried to provide an obvious path for adding support
for other SoCs as well as other protocols. SATA just needs support for
configuring LNmSSCR0. PCIe may need to configure the equalization
registers. It also uses multiple lanes. I have tried to write the driver
with multi-lane support in mind, so there should not need to be any
large changes. Although there are 6 protocols supported, I have only
tested SGMII and XFI. The rest have been implemented as described in
the datasheet. Most of these protocols should work "as-is", but
10GBASE-KR will need PCS support for link training.

Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all
configured for PCIe, this driver uses one phy configured to use 4 lanes.
This is because while the individual lanes may be configured
individually, the protocol selection acts on all lanes at once.
Additionally, the order which lanes should be configured in is specified
by the datasheet. To coordinate this, lanes are reserved in phy_init,
and released in phy_exit.

This driver was written with reference to the LS1046A reference manual.
However, it was informed by reference manuals for all processors with
mEMACs, especially the T4240 (which appears to have a "maxed-out"
configuration). The earlier P-series processors appear to be similar, but
have a different overall register layout (using "banks" instead of
separate SerDes). Perhaps this those use a "5G Lynx SerDes."

Signed-off-by: Sean Anderson 
---

(no changes since v10)

Changes in v10:
- Fix debugging print with incorrect error variable

Changes in v9:
- Split off clock "driver" into its own patch to allow for better
   review.
- Add ability to defer lane initialization to phy_init. This allows
   for easier transitioning between firmware-managed serdes and Linux-
   managed serdes, as the consumer (such as dpaa2, which knows what the
   firmware is doing) has the last say on who gets control.
- phy-type -> fsl,phy

Changes in v8:
- Remove unused variable from lynx_ls_mode_init

Changes in v7:
- Break out call order into generic documentation
- Refuse to switch "major" protocols
- Update Kconfig to reflect restrictions
- Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
   anything.

Changes in v6:
- Update MAINTAINERS to include new files
- Include bitfield.h and slab.h to allow compilation on non-arm64
   arches.
- Depend on COMMON_CLK and either layerscape/ppc

Changes in v5:
- Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
   series to be applied directly to linux/master.
- Add fsl,lynx-10g.h to MAINTAINERS

Changes in v4:
- Rework all debug statements to remove use of __func__. Additional
   information has been provided as necessary.
- Consider alternative parent rates in round_rate and not in set_rate.
   Trying to modify out parent's rate in set_rate will deadlock.
- Explicitly perform a stop/reset sequence in set_rate. This way we
   always ensure that the PLL is properly stopped.
- Set the power-down bit when disabling the PLL. We can do this now that
   enable/disable aren't abused during the set rate sequence.
- Fix typos in QSGMII_OFFSET and XFI_OFFSET
- Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better
   reflect its function (adding post-cursor equalization).
- Use of_clk_hw_onecell_get instead of a custom function.
- Return struct clks from lynx_clks_init instead of embedding lynx_clk
   in lynx_priv.
- Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs
   primarily in the layout and offset of the PCCRs. This will help bring a
   cleaner abstraction layer. The caps have been removed, since this handles the
   only current usage.
- Convert to use new binding format. As a result of this, we no longer need to
   have protocols for PCIe or SATA. Additionally, modes now live in lynx_group
   instead of lynx_priv.
- Remove teq from 

[PATCH 1/2] start_kernel: add no_stack_protector fn attr

2023-04-12 Thread ndesaulniers
Back during the discussion of
commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")
we discussed the need for a function attribute to control the omission
of stack protectors on a per-function basis; at the time Clang had
support for no_stack_protector but GCC did not. This was fixed in
gcc-11. Now that the function attribute is available, let's start using
it.

Callers of boot_init_stack_canary need to use this function attribute
unless they're compiled with -fno-stack-protector, otherwise the canary
stored in the stack slot of the caller will differ upon the call to
boot_init_stack_canary. This will lead to a call to __stack_chk_fail
then panic.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722
Link: 
https://lore.kernel.org/all/20200316130414.gc12...@hirez.programming.kicks-ass.net/
Signed-off-by: Nick Desaulniers 
---
 arch/powerpc/kernel/smp.c   |  1 +
 include/linux/compiler_attributes.h | 12 
 init/main.c |  3 ++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6b90f10a6c81..7d4c12b1abb7 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1603,6 +1603,7 @@ static void add_cpu_to_masks(int cpu)
 }
 
 /* Activate a secondary processor. */
+__no_stack_protector
 void start_secondary(void *unused)
 {
unsigned int cpu = raw_smp_processor_id();
diff --git a/include/linux/compiler_attributes.h 
b/include/linux/compiler_attributes.h
index e659cb6fded3..84864767a56a 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -255,6 +255,18 @@
  */
 #define __noreturn  __attribute__((__noreturn__))
 
+/*
+ * Optional: only supported since GCC >= 11.1, clang >= 7.0.
+ *
+ *   gcc: 
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fstack_005fprotector-function-attribute
+ *   clang: 
https://clang.llvm.org/docs/AttributeReference.html#no-stack-protector-safebuffers
+ */
+#if __has_attribute(__no_stack_protector__)
+# define __no_stack_protector  __attribute__((__no_stack_protector__))
+#else
+# define __no_stack_protector
+#endif
+
 /*
  * Optional: not supported by gcc.
  *
diff --git a/init/main.c b/init/main.c
index bb87b789c543..213baf7b8cb1 100644
--- a/init/main.c
+++ b/init/main.c
@@ -941,7 +941,8 @@ static void __init print_unknown_bootoptions(void)
memblock_free(unknown_options, len);
 }
 
-asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
+asmlinkage __visible __init __no_sanitize_address __no_stack_protector
+void start_kernel(void)
 {
char *command_line;
char *after_dashes;

-- 
2.40.0.577.gac1e443424-goog



[PATCH 0/2] start_kernel: omit stack canary

2023-04-12 Thread ndesaulniers
A security research paper was recently published detailing Catch Handler
Oriented Programming (CHOP) attacks.
https://download.vusec.net/papers/chop_ndss23.pdf
The TL;DR being that C++ structured exception handling runtimes are
attractive gadgets for Jump Oriented Programming (JOP) attacks.

In response to this, a mitigation was developed under embargo in
clang-16 to check the stack canary before calling noreturn functions.
https://bugs.chromium.org/p/llvm/issues/detail?id=30

This started causing boot failures in Android kernel trees downstream of
stable linux-4.14.y that had proto-LTO support, as reported by Nathan
Chancellor.
https://github.com/ClangBuiltLinux/linux/issues/1815

Josh Poimboeuf recently sent a series to explicitly annotate more
functions as noreturn. Nathan noticed the series, and tested it finding
that it now caused boot failures with clang-16+ on mainline (raising the
visibility and urgency of the issue).
https://lore.kernel.org/cover.1680912057.git.jpoim...@kernel.org/

Once the embargo was lifted, I asked questions such as "what does C++
structured exception handling have to do with C code" and "surely GCC
didn't ship the same mitigation for C code (narrator: 'They did not:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a25982ada523689c8745d7fb4b1b93c8f5dab2e7')?"

I now have a patch out for LLVM to undo this mess (or at least limit it
to C++ functions that may throw, similar to GCC's mitigation); it hasn't
landed yet but we're close to consensus and I expect it to land
imminently.
https://reviews.llvm.org/D147975

Remember this thread?  (Pepperidge farms remembers...)
https://lore.kernel.org/all/20200314164451.346497-1-sly...@gentoo.org/

That reminded me that years ago we discussed a function attribute for
no_stack_protector.
https://lore.kernel.org/all/20200316130414.gc12...@hirez.programming.kicks-ass.net/

GCC didn't have one at the time, it now does. In addition to the LLVM
fix, I'd like to introduce this in the kernel so that we might start
using it in additional places:
* https://lore.kernel.org/linux-pm/20200915172658.1432732-1-r...@google.com/
* 
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolva...@google.com/
And eventually remove the final macro expansion site of
prevent_tail_call_optimization.

With the LLVM fix, this series isn't required, but I'd like to start
paving the way to use these function attributes since I think they are a
sweet spot in terms of granularity (as opposed to trying to move
start_kernel to its own TU compiled with -fno-stack-protector).

(This is my first time using b4 to send a series, as per
https://people.kernel.org/monsieuricon/sending-a-kernel-patch-with-b4-part-1.
Here goes nothing!)

Signed-off-by: Nick Desaulniers 
---
Nick Desaulniers (2):
  start_kernel: add no_stack_protector fn attr
  start_kernel: omit prevent_tail_call_optimization for newer toolchains

 arch/powerpc/kernel/smp.c   |  1 +
 include/linux/compiler_attributes.h | 12 
 init/main.c |  8 +++-
 3 files changed, 20 insertions(+), 1 deletion(-)
---
base-commit: 0bcc4025550403ae28d2984bddacafbca0a2f112
change-id: 20230412-no_stackp-a98168a2bb0a

Best regards,
-- 
Nick Desaulniers 



[PATCH 2/2] start_kernel: omit prevent_tail_call_optimization for newer toolchains

2023-04-12 Thread ndesaulniers
prevent_tail_call_optimization was added in
commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")
to work around stack canaries getting inserted into functions that would
initialize the stack canary in the first place.

Now that we have no_stack_protector function attribute (gcc-11+,
clang-7+) and use it on start_kernel, remove the call to
prevent_tail_call_optimization such that we may one day remove it
outright.

Signed-off-by: Nick Desaulniers 
---
 init/main.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/init/main.c b/init/main.c
index 213baf7b8cb1..c8503d02dfa6 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1152,7 +1152,12 @@ void start_kernel(void)
/* Do the rest non-__init'ed, we're now alive */
arch_call_rest_init();
 
+   /* Avoid stack canaries in callers of boot_init_stack_canary for gcc-10
+* and older.
+*/
+#if !__has_attribute(__no_stack_protector__)
prevent_tail_call_optimization();
+#endif
 }
 
 /* Call all constructor functions linked into the kernel. */

-- 
2.40.0.577.gac1e443424-goog



[linux-next:master] BUILD REGRESSION 7d8214bba44c1aa6a75921a09a691945d26a8d43

2023-04-12 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 7d8214bba44c1aa6a75921a09a691945d26a8d43  Add linux-next specific 
files for 20230412

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202303161521.jbgbafjj-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202304061839.hi01vpl1-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202304102354.q4voxgte-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202304112044.8nzkpvxm-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202304121606.ltrfyuij-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202304121702.bav49hfn-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

Error: failed to load BTF from vmlinux: No data available
arch/arm/vfp/entry.S:27: undefined reference to `vfp_entry'
arch/csky/abiv2/cacheflush.c:15:9: error: implicit declaration of function 
'flush_tlb_page'; did you mean 'flush_anon_page'? 
[-Werror=implicit-function-declaration]
diff: tools/arch/s390/include/uapi/asm/ptrace.h: No such file or directory
drivers/bluetooth/hci_qca.c:1894:37: warning: unused variable 
'qca_soc_data_wcn6855' [-Wunused-const-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_validation.c:351:13: 
warning: variable 'bw_needed' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_validation.c:352:25: 
warning: variable 'link' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:62: warning: wrong kernel-doc 
identifier on line:
drivers/gpu/drm/i915/i915_pmu.h:41: warning: This comment starts with '/**', 
but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
drivers/gpu/drm/i915/i915_request.h:176: warning: This comment starts with 
'/**', but isn't a kernel-doc comment. Refer 
Documentation/doc-guide/kernel-doc.rst
drivers/gpu/drm/i915/i915_vma.h:145: warning: expecting prototype for 
i915_vma_offset(). Prototype was for i915_vma_size() instead
drivers/net/wireless/legacy/ray_cs.c:628:17: warning: 'strncpy' specified bound 
32 equals destination size [-Wstringop-truncation]
drivers/tty/serial/samsung_tty.c:2034:10: error: implicit declaration of 
function 'of_device_get_match_data' is invalid in C99 
[-Werror,-Wimplicit-function-declaration]
drivers/tty/serial/samsung_tty.c:2034:10: warning: incompatible integer to 
pointer conversion returning 'int' from a function with result type 'const 
struct s3c24xx_serial_drv_data *' [-Wint-conversion]
drivers/tty/serial/samsung_tty.c:2034:24: error: implicit declaration of 
function 'of_device_get_match_data'; did you mean 'device_get_match_data'? 
[-Werror=implicit-function-declaration]
drivers/tty/serial/samsung_tty.c:2034:24: warning: returning 'int' from a 
function with return type 'const struct s3c24xx_serial_drv_data *' makes 
pointer from integer without a cast [-Wint-conversion]
make[3]: *** No rule to make target 'zip.h', needed by 
'/tools/build/libbpf/sharedobjs/libbpf.o'.
make[3]: *** No rule to make target 'zip.h', needed by 
'/tools/build/libbpf/staticobjs/libbpf.o'.
make[4]: *** No rule to make target 'zip.h', needed by 
'kselftest/net/tools/build/libbpf/sharedobjs/libbpf.o'.
make[4]: *** No rule to make target 'zip.h', needed by 
'kselftest/net/tools/build/libbpf/staticobjs/libbpf.o'.
membarrier_test_impl.h:27:30: error: 'MEMBARRIER_CMD_GET_REGISTRATIONS' 
undeclared (first use in this function)
mount_setattr_test.c:107:8: error: redefinition of 'struct mount_attr'
thermal_nl.h:6:10: fatal error: netlink/netlink.h: No such file or directory
thermometer.c:21:10: fatal error: libconfig.h: No such file or directory

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/acpi/property.c:985 acpi_data_prop_read_single() error: potentially 
dereferencing uninitialized 'obj'.
drivers/crypto/intel/qat/qat_common/adf_cfg.c:262 adf_cfg_add_key_value_param() 
warn: argument 4 to %lx specifier is cast from pointer
drivers/hwmon/pmbus/pmbus_core.c:3164:7-32: WARNING: Threaded IRQ with no 
primary handler requested without IRQF_ONESHOT (unless it is nested IRQ)
drivers/soc/fsl/qe/tsa.c:140:26: sparse: sparse: incorrect type in argument 2 
(different address spaces)
drivers/soc/fsl/qe/tsa.c:150:27: sparse: sparse: incorrect type in argument 1 
(different address spaces)

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-link-link_validation.c:warning:variable-bw_needed-set-but-not-used
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-link-link_validation.c:warning:variable-link-set-but-not-used
|   `-- 
drivers-net-wireless-legacy-ray_cs.c:warning:strncpy-specified-bound-equals-destination-size
|-- alpha-randconfig-c023-20230409
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-link-link_validation.c:warning:variable-bw_needed-set-but-not-used
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc

Re: [PATCH v3 02/14] arm64: drop ranges in definition of ARCH_FORCE_MAX_ORDER

2023-04-12 Thread Catalin Marinas
On Tue, Apr 04, 2023 at 06:50:01AM -0500, Justin Forbes wrote:
> On Tue, Apr 4, 2023 at 2:22 AM Mike Rapoport  wrote:
> > On Wed, Mar 29, 2023 at 10:55:37AM -0500, Justin Forbes wrote:
> > > On Sat, Mar 25, 2023 at 1:09 AM Mike Rapoport  wrote:
> > > >
> > > > From: "Mike Rapoport (IBM)" 
> > > >
> > > > It is not a good idea to change fundamental parameters of core memory
> > > > management. Having predefined ranges suggests that the values within
> > > > those ranges are sensible, but one has to *really* understand
> > > > implications of changing MAX_ORDER before actually amending it and
> > > > ranges don't help here.
> > > >
> > > > Drop ranges in definition of ARCH_FORCE_MAX_ORDER and make its prompt
> > > > visible only if EXPERT=y
> > >
> > > I do not like suddenly hiding this behind EXPERT for a couple of
> > > reasons.  Most importantly, it will silently change the config for
> > > users building with an old kernel config.  If a user has for instance
> > > "13" set and building with 4K pages, as is the current configuration
> > > for Fedora and RHEL aarch64 builds, an oldconfig build will now set it
> > > to 10 with no indication that it is doing so.  And while I think that
> > > 10 is a fine default for many aarch64 users, there are valid reasons
> > > for choosing other values. Putting this behind expert makes it much
> > > less obvious that this is an option.
> >
> > That's the idea of EXPERT, no?
> >
> > This option was intended to allow allocation of huge pages for
> > architectures that had PMD_ORDER > MAX_ORDER and not to allow user to
> > select size of maximal physically contiguous allocation.
> >
> > Changes to MAX_ORDER fundamentally change the behaviour of core mm and
> > unless users *really* know what they are doing there is no reason to choose
> > non-default values so hiding this option behind EXPERT seems totally
> > appropriate to me.
> 
> It sounds nice in theory. In practice. EXPERT hides too much. When you
> flip expert, you expose over a 175ish new config options which are
> hidden behind EXPERT.  You don't have to know what you are doing just
> with the MAX_ORDER, but a whole bunch more as well.  If everyone were
> already running 10, this might be less of a problem. At least Fedora
> and RHEL are running 13 for 4K pages on aarch64. This was not some
> accidental choice, we had to carry a patch to even allow it for a
> while.  If this does go in as is, we will likely just carry a patch to
> remove the "if EXPERT", but that is a bit of a disservice to users who
> might be trying to debug something else upstream, bisecting upstream
> kernels or testing a patch.  In those cases, people tend to use
> pristine upstream sources without distro patches to verify, and they
> tend to use their existing configs. With this change, their MAX_ORDER
> will drop to 10 from 13 silently.   That can look like a different
> issue enough to ruin a bisect or have them give bad feedback on a
> patch because it introduces a "regression" which is not a regression
> at all, but a config change they couldn't see.

If we remove EXPERT (as prior to this patch), I'd rather keep the ranges
and avoid having to explain to people why some random MAX_ORDER doesn't
build (keeping the range would also make sense for randconfig, not sure
we got to any conclusion there).

-- 
Catalin


Re: [PATCH 0/6] bus: fsl-mc: Make remove function return void

2023-04-12 Thread Uwe Kleine-König
Hello,

On Fri, Mar 10, 2023 at 11:41:22PM +0100, Uwe Kleine-König wrote:
> Hello,
> 
> many bus remove functions return an integer which is a historic
> misdesign that makes driver authors assume that there is some kind of
> error handling in the upper layers. This is wrong however and returning
> and error code only yields an error message.
> 
> This series improves the fsl-mc bus by changing the remove callback to
> return no value instead. As a preparation all drivers are changed to
> return zero before so that they don't trigger the error message.

Who is supposed to pick up this patch series (or point out a good reason
for not taking it)?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |


signature.asc
Description: PGP signature


Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver

2023-04-12 Thread Vinod Koul
On 11-04-23, 14:43, Sean Anderson wrote:
> This adds support for the Lynx 10G "SerDes" devices found on various NXP
> QorIQ SoCs. There may be up to four SerDes devices on each SoC, each
> supporting up to eight lanes. Protocol support for each SerDes is highly
> heterogeneous, with each SoC typically having a totally different
> selection of supported protocols for each lane. Additionally, the SerDes
> devices on each SoC also have differing support. One SerDes will
> typically support Ethernet on most lanes, while the other will typically
> support PCIe on most lanes.
> 
> There is wide hardware support for this SerDes. It is present on QorIQ
> T-Series and Layerscape processors. Because each SoC typically has
> specific instructions and exceptions for its SerDes, I have limited the
> initial scope of this module to just the LS1046A and LS1088A.
> Additionally, I have only added support for Ethernet protocols. There is
> not a great need for dynamic reconfiguration for other protocols (except
> perhaps for M.2 cards), so support for them may never be added.
> 
> Nevertheless, I have tried to provide an obvious path for adding support
> for other SoCs as well as other protocols. SATA just needs support for
> configuring LNmSSCR0. PCIe may need to configure the equalization
> registers. It also uses multiple lanes. I have tried to write the driver
> with multi-lane support in mind, so there should not need to be any
> large changes. Although there are 6 protocols supported, I have only
> tested SGMII and XFI. The rest have been implemented as described in
> the datasheet. Most of these protocols should work "as-is", but
> 10GBASE-KR will need PCS support for link training.
> 
> Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all
> configured for PCIe, this driver uses one phy configured to use 4 lanes.
> This is because while the individual lanes may be configured
> individually, the protocol selection acts on all lanes at once.
> Additionally, the order which lanes should be configured in is specified
> by the datasheet. To coordinate this, lanes are reserved in phy_init,
> and released in phy_exit.
> 
> This driver was written with reference to the LS1046A reference manual.
> However, it was informed by reference manuals for all processors with
> mEMACs, especially the T4240 (which appears to have a "maxed-out"
> configuration). The earlier P-series processors appear to be similar, but
> have a different overall register layout (using "banks" instead of
> separate SerDes). Perhaps this those use a "5G Lynx SerDes."
> 
> Signed-off-by: Sean Anderson 
> ---
> 
> (no changes since v10)
> 
> Changes in v10:
> - Fix debugging print with incorrect error variable
> 
> Changes in v9:
> - Split off clock "driver" into its own patch to allow for better
>   review.
> - Add ability to defer lane initialization to phy_init. This allows
>   for easier transitioning between firmware-managed serdes and Linux-
>   managed serdes, as the consumer (such as dpaa2, which knows what the
>   firmware is doing) has the last say on who gets control.
> - phy-type -> fsl,phy
> 
> Changes in v8:
> - Remove unused variable from lynx_ls_mode_init
> 
> Changes in v7:
> - Break out call order into generic documentation
> - Refuse to switch "major" protocols
> - Update Kconfig to reflect restrictions
> - Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
>   anything.
> 
> Changes in v6:
> - Update MAINTAINERS to include new files
> - Include bitfield.h and slab.h to allow compilation on non-arm64
>   arches.
> - Depend on COMMON_CLK and either layerscape/ppc
> 
> Changes in v5:
> - Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
>   series to be applied directly to linux/master.
> - Add fsl,lynx-10g.h to MAINTAINERS
> 
> Changes in v4:
> - Rework all debug statements to remove use of __func__. Additional
>   information has been provided as necessary.
> - Consider alternative parent rates in round_rate and not in set_rate.
>   Trying to modify out parent's rate in set_rate will deadlock.
> - Explicitly perform a stop/reset sequence in set_rate. This way we
>   always ensure that the PLL is properly stopped.
> - Set the power-down bit when disabling the PLL. We can do this now that
>   enable/disable aren't abused during the set rate sequence.
> - Fix typos in QSGMII_OFFSET and XFI_OFFSET
> - Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better
>   reflect its function (adding post-cursor equalization).
> - Use of_clk_hw_onecell_get instead of a custom function.
> - Return struct clks from lynx_clks_init instead of embedding lynx_clk
>   in lynx_priv.
> - Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs
>   primarily in the layout and offset of the PCCRs. This will help bring a
>   cleaner abstraction layer. The caps have been removed, since this handles 
> the
>   only current usage.
> - Convert to use new binding format. As a result of this, we no 

Re: [PATCH] powerpc/mpc52xx_lpbfifo: Drop unused functions

2023-04-12 Thread Uwe Kleine-König
On Wed, Dec 28, 2022 at 03:51:29PM +0100, Uwe Kleine-König wrote:
> The four exported functions mpc52xx_lpbfifo_submit(),
> mpc52xx_lpbfifo_abort(), mpc52xx_lpbfifo_poll(), and
> mpc52xx_lpbfifo_start_xfer() are not used. So they can be dropped and the
> definitions needed to call them can be moved into the driver file.
> 
> Signed-off-by: Uwe Kleine-König 

I never got feedback about this driver and it has not appeared in next
up to now. Did it fell through the cracks?

Best regards
Uwe

> ---
>  arch/powerpc/include/asm/mpc52xx.h|  41 --
>  arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c | 134 +-
>  2 files changed, 33 insertions(+), 142 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mpc52xx.h 
> b/arch/powerpc/include/asm/mpc52xx.h
> index 5ea16a71c2f0..01ae6c351e50 100644
> --- a/arch/powerpc/include/asm/mpc52xx.h
> +++ b/arch/powerpc/include/asm/mpc52xx.h
> @@ -285,47 +285,6 @@ extern int mpc52xx_gpt_start_timer(struct 
> mpc52xx_gpt_priv *gpt, u64 period,
>  extern u64 mpc52xx_gpt_timer_period(struct mpc52xx_gpt_priv *gpt);
>  extern int mpc52xx_gpt_stop_timer(struct mpc52xx_gpt_priv *gpt);
>  
> -/* mpc52xx_lpbfifo.c */
> -#define MPC52XX_LPBFIFO_FLAG_READ(0)
> -#define MPC52XX_LPBFIFO_FLAG_WRITE   (1<<0)
> -#define MPC52XX_LPBFIFO_FLAG_NO_INCREMENT(1<<1)
> -#define MPC52XX_LPBFIFO_FLAG_NO_DMA  (1<<2)
> -#define MPC52XX_LPBFIFO_FLAG_POLL_DMA(1<<3)
> -
> -struct mpc52xx_lpbfifo_request {
> - struct list_head list;
> -
> - /* localplus bus address */
> - unsigned int cs;
> - size_t offset;
> -
> - /* Memory address */
> - void *data;
> - phys_addr_t data_phys;
> -
> - /* Details of transfer */
> - size_t size;
> - size_t pos; /* current position of transfer */
> - int flags;
> - int defer_xfer_start;
> -
> - /* What to do when finished */
> - void (*callback)(struct mpc52xx_lpbfifo_request *);
> -
> - void *priv; /* Driver private data */
> -
> - /* statistics */
> - int irq_count;
> - int irq_ticks;
> - u8 last_byte;
> - int buffer_not_done_cnt;
> -};
> -
> -extern int mpc52xx_lpbfifo_submit(struct mpc52xx_lpbfifo_request *req);
> -extern void mpc52xx_lpbfifo_abort(struct mpc52xx_lpbfifo_request *req);
> -extern void mpc52xx_lpbfifo_poll(void);
> -extern int mpc52xx_lpbfifo_start_xfer(struct mpc52xx_lpbfifo_request *req);
> -
>  /* mpc52xx_pic.c */
>  extern void mpc52xx_init_irq(void);
>  extern unsigned int mpc52xx_get_irq(void);
> diff --git a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c 
> b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
> index 6d1dd6e87478..32fd1345ffeb 100644
> --- a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
> +++ b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
> @@ -38,6 +38,39 @@ MODULE_LICENSE("GPL");
>  #define LPBFIFO_REG_FIFO_CONTROL (0x48)
>  #define LPBFIFO_REG_FIFO_ALARM   (0x4C)
>  
> +#define MPC52XX_LPBFIFO_FLAG_WRITE   (1<<0)
> +#define MPC52XX_LPBFIFO_FLAG_NO_DMA  (1<<2)
> +#define MPC52XX_LPBFIFO_FLAG_POLL_DMA(1<<3)
> +
> +struct mpc52xx_lpbfifo_request {
> + struct list_head list;
> +
> + /* localplus bus address */
> + unsigned int cs;
> + size_t offset;
> +
> + /* Memory address */
> + void *data;
> + phys_addr_t data_phys;
> +
> + /* Details of transfer */
> + size_t size;
> + size_t pos; /* current position of transfer */
> + int flags;
> + int defer_xfer_start;
> +
> + /* What to do when finished */
> + void (*callback)(struct mpc52xx_lpbfifo_request *);
> +
> + void *priv; /* Driver private data */
> +
> + /* statistics */
> + int irq_count;
> + int irq_ticks;
> + u8 last_byte;
> + int buffer_not_done_cnt;
> +};
> +
>  struct mpc52xx_lpbfifo {
>   struct device *dev;
>   phys_addr_t regs_phys;
> @@ -381,107 +414,6 @@ static irqreturn_t mpc52xx_lpbfifo_bcom_irq(int irq, 
> void *dev_id)
>   return IRQ_HANDLED;
>  }
>  
> -/**
> - * mpc52xx_lpbfifo_poll - Poll for DMA completion
> - */
> -void mpc52xx_lpbfifo_poll(void)
> -{
> - struct mpc52xx_lpbfifo_request *req = lpbfifo.req;
> - int dma = !(req->flags & MPC52XX_LPBFIFO_FLAG_NO_DMA);
> - int write = req->flags & MPC52XX_LPBFIFO_FLAG_WRITE;
> -
> - /*
> -  * For more information, see comments on the "Fat Lady" 
> -  */
> - if (dma && write)
> - mpc52xx_lpbfifo_irq(0, NULL);
> - else 
> - mpc52xx_lpbfifo_bcom_irq(0, NULL);
> -}
> -EXPORT_SYMBOL(mpc52xx_lpbfifo_poll);
> -
> -/**
> - * mpc52xx_lpbfifo_submit - Submit an LPB FIFO transfer request.
> - * @req: Pointer to request structure
> - *
> - * Return: %0 on success, -errno code on error
> - */
> -int mpc52xx_lpbfifo_submit(struct mpc52xx_lpbfifo_request *req)
> -{
> - unsigned long flags;
> -
> - if (!lpbfifo.regs)
> - return -ENODEV;
> -
> 

Re: [PATCH v2 3/5] locking/arch: Wire up local_try_cmpxchg

2023-04-12 Thread Peter Zijlstra
On Wed, Apr 12, 2023 at 03:37:50PM +0200, Uros Bizjak wrote:
> On Wed, Apr 12, 2023 at 1:33 PM Peter Zijlstra  wrote:
> >
> > On Wed, Apr 05, 2023 at 04:17:08PM +0200, Uros Bizjak wrote:
> > > diff --git a/arch/powerpc/include/asm/local.h 
> > > b/arch/powerpc/include/asm/local.h
> > > index bc4bd19b7fc2..45492fb5bf22 100644
> > > --- a/arch/powerpc/include/asm/local.h
> > > +++ b/arch/powerpc/include/asm/local.h
> > > @@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long 
> > > o, long n)
> > >   return t;
> > >  }
> > >
> > > +static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n)
> > > +{
> > > + long o = *po, r;
> > > +
> > > + r = local_cmpxchg(l, o, n);
> > > + if (unlikely(r != o))
> > > + *po = r;
> > > +
> > > + return likely(r == o);
> > > +}
> > > +
> >
> > Why is the ppc one different from the rest? Why can't it use the
> > try_cmpxchg_local() fallback and needs to have it open-coded?
> 
> Please note that ppc directly defines local_cmpxchg that bypasses
> cmpxchg_local/arch_cmpxchg_local machinery. The patch takes the same
> approach for local_try_cmpxchg, because fallbacks are using
> arch_cmpxchg_local definitions.
> 
> PPC should be converted to use arch_cmpxchg_local (to also enable
> instrumentation), but this is not the scope of the proposed patchset.

Ah indeed. Thanks!


Re: [PATCH v2 3/5] locking/arch: Wire up local_try_cmpxchg

2023-04-12 Thread Uros Bizjak
On Wed, Apr 12, 2023 at 1:33 PM Peter Zijlstra  wrote:
>
> On Wed, Apr 05, 2023 at 04:17:08PM +0200, Uros Bizjak wrote:
> > diff --git a/arch/powerpc/include/asm/local.h 
> > b/arch/powerpc/include/asm/local.h
> > index bc4bd19b7fc2..45492fb5bf22 100644
> > --- a/arch/powerpc/include/asm/local.h
> > +++ b/arch/powerpc/include/asm/local.h
> > @@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long o, 
> > long n)
> >   return t;
> >  }
> >
> > +static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n)
> > +{
> > + long o = *po, r;
> > +
> > + r = local_cmpxchg(l, o, n);
> > + if (unlikely(r != o))
> > + *po = r;
> > +
> > + return likely(r == o);
> > +}
> > +
>
> Why is the ppc one different from the rest? Why can't it use the
> try_cmpxchg_local() fallback and needs to have it open-coded?

Please note that ppc directly defines local_cmpxchg that bypasses
cmpxchg_local/arch_cmpxchg_local machinery. The patch takes the same
approach for local_try_cmpxchg, because fallbacks are using
arch_cmpxchg_local definitions.

PPC should be converted to use arch_cmpxchg_local (to also enable
instrumentation), but this is not the scope of the proposed patchset.

Uros.


Re: [PATCH] powerpc/64: Always build with 128-bit long double

2023-04-12 Thread Michael Ellerman
Hamza Mahfooz  writes:
> On 4/4/23 06:28, Michael Ellerman wrote:
>> The amdgpu driver builds some of its code with hard-float enabled,
>> whereas the rest of the kernel is built with soft-float.
>> 
>> When building with 64-bit long double, if soft-float and hard-float
>> objects are linked together, the build fails due to incompatible ABI
>> tags.
>> 
>> In the past there have been build errors in the amdgpu driver caused by
>> this, some of those were due to bad intermingling of soft & hard-float
>> code, but those issues have now all been fixed since commit c92b7fe0d92a
>> ("drm/amd/display: move remaining FPU code to dml folder").
>> 
>> However it's still possible for soft & hard-float objects to end up
>> linked together, if the amdgpu driver is built-in to the kernel along
>> with the test_emulate_step.c code, which uses soft-float. That happens
>> in an allyesconfig build.
>> 
>> Currently those build errors are avoided because the amdgpu driver is
>> gated on 128-bit long double being enabled. But that's not a detail the
>> amdgpu driver should need to be aware of, and if another driver starts
>> using hard-float the same problem would occur.
>> 
>> All versions of the 64-bit ABI specify that long-double is 128-bits.
>> However some compilers, notably the kernel.org ones, are built to use
>> 64-bit long double by default.
>> 
>> Apart from this issue of soft vs hard-float, the kernel doesn't care
>> what size long double is. In particular the kernel using 128-bit long
>> double doesn't impact userspace's ability to use 64-bit long double, as
>> musl does.
>> 
>> So always build the 64-bit kernel with 128-bit long double. That should
>> avoid any build errors due to the incompatible ABI tags. Excluding the
>> code that uses soft/hard-float, the vmlinux is identical with/without
>> the flag.
>> 
>> It does mean any code which is incorrectly intermingling soft &
>> hard-float code will build without error, so those bugs will need to be
>> caught by testing rather than at build time.
>> 
>> For more background see:
>>- commit d11219ad53dc ("amdgpu: disable powerpc support for the newer 
>> display engine")
>>- commit c653c591789b ("drm/amdgpu: Re-enable DCN for 64-bit powerpc")
>>- 
>> https://lore.kernel.org/r/dab9cbd8-2626-4b99-8098-31fe76397...@app.fastmail.com
>> 
>> Signed-off-by: Michael Ellerman 
>
> Reviewed-by: Hamza Mahfooz 

Thanks.

> If you'd prefer to have this go through the amdgpu branch, please let
> me know.

I think it makes more sense to go via the powerpc tree, it will get more
testing on powerpc that way.

cheers


Re: [PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()

2023-04-12 Thread Michael Ellerman
Deming Wang  writes:
> memalign() is obsolete according to its manpage.
>
> Replace memalign() with posix_memalign() and remove malloc.h include
> that was there for memalign().
>
> As a pointer is passed into posix_memalign(), initialize *s to NULL
> to silence a warning about the function's return value being used as
> uninitialized (which is not valid anyway because the error is properly
> checked before p is returned).

The patch doesn't do that. There is no p?

I think you've copied the change log for a whole bunch of commits but
not updated them to be accurate for each change?

cheers

> diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c 
> b/tools/testing/selftests/powerpc/stringloops/strlen.c
> index 9055ebc484d0..f9c1f9cc2d32 100644
> --- a/tools/testing/selftests/powerpc/stringloops/strlen.c
> +++ b/tools/testing/selftests/powerpc/stringloops/strlen.c
> @@ -1,5 +1,4 @@
>  // SPDX-License-Identifier: GPL-2.0
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -51,10 +50,11 @@ static void bench_test(char *s)
>  static int testcase(void)
>  {
>   char *s;
> + int ret;
>   unsigned long i;
>  
> - s = memalign(128, SIZE);
> - if (!s) {
> + ret = posix_memalign((void **), 128, SIZE);
> + if (ret < 0) {
>   perror("memalign");
>   exit(1);
>   }
> -- 
> 2.27.0


Re: [PATCH v2 3/5] locking/arch: Wire up local_try_cmpxchg

2023-04-12 Thread Peter Zijlstra
On Wed, Apr 05, 2023 at 04:17:08PM +0200, Uros Bizjak wrote:
> diff --git a/arch/powerpc/include/asm/local.h 
> b/arch/powerpc/include/asm/local.h
> index bc4bd19b7fc2..45492fb5bf22 100644
> --- a/arch/powerpc/include/asm/local.h
> +++ b/arch/powerpc/include/asm/local.h
> @@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long o, 
> long n)
>   return t;
>  }
>  
> +static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n)
> +{
> + long o = *po, r;
> +
> + r = local_cmpxchg(l, o, n);
> + if (unlikely(r != o))
> + *po = r;
> +
> + return likely(r == o);
> +}
> +

Why is the ppc one different from the rest? Why can't it use the
try_cmpxchg_local() fallback and needs to have it open-coded?


[PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()

2023-04-12 Thread Deming Wang
memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().

As a pointer is passed into posix_memalign(), initialize *s to NULL
to silence a warning about the function's return value being used as
uninitialized (which is not valid anyway because the error is properly
checked before p is returned).

Signed-off-by: Deming Wang 
---
 tools/testing/selftests/powerpc/stringloops/strlen.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c 
b/tools/testing/selftests/powerpc/stringloops/strlen.c
index 9055ebc484d0..f9c1f9cc2d32 100644
--- a/tools/testing/selftests/powerpc/stringloops/strlen.c
+++ b/tools/testing/selftests/powerpc/stringloops/strlen.c
@@ -1,5 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0
-#include 
 #include 
 #include 
 #include 
@@ -51,10 +50,11 @@ static void bench_test(char *s)
 static int testcase(void)
 {
char *s;
+   int ret;
unsigned long i;
 
-   s = memalign(128, SIZE);
-   if (!s) {
+   ret = posix_memalign((void **), 128, SIZE);
+   if (ret < 0) {
perror("memalign");
exit(1);
}
-- 
2.27.0



Re: [PATCH v13 03/15] dt-bindings: Convert gpio-mmio to yaml

2023-04-12 Thread Krzysztof Kozlowski
On 11/04/2023 20:43, Sean Anderson wrote:
> This is a generic binding for simple MMIO GPIO controllers. Although we
> have a single driver for these controllers, they were previously spread
> over several files. Consolidate them. The register descriptions are
> adapted from the comments in the source. There is no set order for the
> registers, and some registers may be omitted. Because of this, reg-names
> is mandatory, and no order is specified.
> 
> Rename brcm,bcm6345-gpio to brcm,bcm63xx-gpio to reflect that bcm6345
> has moved.
> 
> Signed-off-by: Sean Anderson 
> Reviewed-by: Linus Walleij 
> ---
> Linus or Bartosz, feel free to pick this up as the rest of this series
> may not be merged any time soon.
> 
> Changes in v13:
> - Fix references to brcm,bcm63xx-gpio.yaml (neé brcm,bcm6345-gpio)

You got some of the same errors as last time.

Test your patches before sending.

Best regards,
Krzysztof



Re: [PATCH] KVM: PPC: BOOK3S: book3s_hv_nested.c: improve branch prediction for k.alloc

2023-04-12 Thread Kautuk Consul
Hi,

On 2023-04-11 16:35:10, Michael Ellerman wrote:
> Kautuk Consul  writes:
> > On 2023-04-07 09:01:29, Sean Christopherson wrote:
> >> On Fri, Apr 07, 2023, Bagas Sanjaya wrote:
> >> > On Fri, Apr 07, 2023 at 05:31:47AM -0400, Kautuk Consul wrote:
> >> > > I used the unlikely() macro on the return values of the k.alloc
> >> > > calls and found that it changes the code generation a bit.
> >> > > Optimize all return paths of k.alloc calls by improving
> >> > > branch prediction on return value of k.alloc.
> >> 
> >> Nit, this is improving code generation, not branch prediction.
> > Sorry my mistake.
> >> 
> >> > What about below?
> >> > 
> >> > "Improve branch prediction on kmalloc() and kzalloc() call by using
> >> > unlikely() macro to optimize their return paths."
> >> 
> >> Another nit, using unlikely() doesn't necessarily provide a measurable 
> >> optimization.
> >> As above, it does often improve code generation for the happy path, but 
> >> that doesn't
> >> always equate to improved performance, e.g. if the CPU can easily predict 
> >> the branch
> >> and/or there is no impact on the cache footprint.
> 
> > I see. I will submit a v2 of the patch with a better and more accurate
> > description. Does anyone else have any comments before I do so ?
>  
> In general I think unlikely should be saved for cases where either the
> compiler is generating terrible code, or the likelyness of the condition
> might be surprising to a human reader.
> 
> eg. if you had some code that does a NULL check and it's *expected* that
> the value is NULL, then wrapping that check in likely() actually adds
> information for a human reader.
> 
> Also please don't use unlikely in init paths or other cold paths, it
> clutters the code (only slightly but a little) and that's not worth the
> possible tiny benefit for code that only runs once or infrequently.
> 
> I would expect the compilers to do the right thing in all
> these cases without the unlikely. But if you can demonstrate that they
> meaningfully improve the code generation with a before/after
> dissassembly then I'd be interested.
Just FYI, the last email by kautuk.consul...@gmail.com was by me.
That last email contains a diff file attachment which compares 2 files:
before my changes and after my changes.
This diff file shows a lot of changes in code generation. Im assuming
all those changes are made by the compiler towards optimizing all return
paths to k.alloc calls.
Kindly review and comment.
> cheers


Re: [PATCH v2 15/19] arch/powerpc: Implement with generic helpers

2023-04-12 Thread Michael Ellerman
Thomas Zimmermann  writes:
> Replace the architecture's fb_is_primary_device() with the generic
> one from . No functional changes.
>
> Signed-off-by: Thomas Zimmermann 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/fb.h | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)

Looks fine.

Acked-by: Michael Ellerman  (powerpc)

cheers

> diff --git a/arch/powerpc/include/asm/fb.h b/arch/powerpc/include/asm/fb.h
> index 6541ab77c5b9..5f1a2e5f7654 100644
> --- a/arch/powerpc/include/asm/fb.h
> +++ b/arch/powerpc/include/asm/fb.h
> @@ -2,8 +2,8 @@
>  #ifndef _ASM_FB_H_
>  #define _ASM_FB_H_
>  
> -#include 
>  #include 
> +
>  #include 
>  
>  static inline void fb_pgprotect(struct file *file, struct vm_area_struct 
> *vma,
> @@ -13,10 +13,8 @@ static inline void fb_pgprotect(struct file *file, struct 
> vm_area_struct *vma,
>vma->vm_end - vma->vm_start,
>vma->vm_page_prot);
>  }
> +#define fb_pgprotect fb_pgprotect
>  
> -static inline int fb_is_primary_device(struct fb_info *info)
> -{
> - return 0;
> -}
> +#include 
>  
>  #endif /* _ASM_FB_H_ */
> -- 
> 2.40.0