Re: [PATCH v7 00/24] x86: Implement an HPET-based hardlockup detector
On Wed, Mar 01, 2023 at 03:47:29PM -0800, Ricardo Neri wrote: > Hi x86 trusted reviewers, > > This is the seventh version of this patchset. I acknowledge that it took me > a long time to post a new version. Sorry! I will commit time to continue > working on this series with high priority and I will post a new series soon > after receiving your new feedback. > > Although this series touches several subsystems, I plan to send it to the > x86 maintainers because a) the series does not make much sense if split > into subsystems, b) Thomas Gleixner has reviewed previous versions, and c) > he has contributed to all the subsystems I modify. > > Tony Luck has kindly reviewed previous versions of the series and I carried > his Reviewed-by tags. This version, however, has new patches that also need > review. > > I seek to collect the Reviewed-by tags from the x86 trusted reviewers for > the following patches: >+ arch/x86: 4, 5 >+ Intel IOMMU: 6, >+ AMD IOMMU: 9, 10, 11, >+ NMI watchdog: 23 and 24. Hello, checking if there is any feedback on these patches that I plan to send to the x86 maintainer. I am still seeking to collect the Reviewed-by: tags from the x86 trusted reviewers. Thanks in advance! BR, Ricardo
Re: [PATCH] powerpc/bpf: populate extable entries only during the last pass
Hello Christophe, Thanks for the review. On 07/04/23 11:31 am, Christophe Leroy wrote: Le 06/04/2023 à 09:35, Hari Bathini a écrit : Since commit 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update addresses"), two additional passes are performed to avoid space and CPU time wastage on powerpc. But these extra passes led to WARN_ON_ONCE() hits in bpf_add_extable_entry(). Fix it by not adding extable entries during the extra pass. Are you sure this change is correct ? Actually, I was in two minds about that owing to commit 04c04205bc35 ("bpf powerpc: Remove extra_pass from bpf_jit_build_body()"). During the extra pass the code can get shrinked or expanded (within the limits of the size of the preliminary pass). Shouldn't extable entries be populated during the last pass ? Unlikely, but the intention there was to eliminate a regression in case extra_pass ends up being 'false' always in any subsequent change. - Hari Fixes: 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update addresses") Signed-off-by: Hari Bathini --- arch/powerpc/net/bpf_jit_comp32.c | 2 +- arch/powerpc/net/bpf_jit_comp64.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c index 7f91ea064c08..e788b1fbeee6 100644 --- a/arch/powerpc/net/bpf_jit_comp32.c +++ b/arch/powerpc/net/bpf_jit_comp32.c @@ -977,7 +977,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * if (size != BPF_DW && !fp->aux->verifier_zext) EMIT(PPC_RAW_LI(dst_reg_h, 0)); - if (BPF_MODE(code) == BPF_PROBE_MEM) { + if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) { int insn_idx = ctx->idx - 1; int jmp_off = 4; diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c index 8dd3cabaa83a..1cc2777ec846 100644 --- a/arch/powerpc/net/bpf_jit_comp64.c +++ b/arch/powerpc/net/bpf_jit_comp64.c @@ -921,7 +921,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context * if (size != BPF_DW && insn_is_zext([i + 1])) addrs[++i] = ctx->idx * 4; - if (BPF_MODE(code) == BPF_PROBE_MEM) { + if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) { ret = bpf_add_extable_entry(fp, image, pass, ctx, ctx->idx - 1, 4, dst_reg); if (ret)
[PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()
memalign() is obsolete according to its manpage. Replace memalign() with posix_memalign() and remove malloc.h include that was there for memalign(). As a pointer is passed into posix_memalign(), initialize *s to NULL to silence a warning about the function's return value being used as uninitialized (which is not valid anyway because the error is properly checked before s is returned). Signed-off-by: Deming Wang --- tools/testing/selftests/powerpc/stringloops/strlen.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c b/tools/testing/selftests/powerpc/stringloops/strlen.c index 9055ebc484d0..f9c1f9cc2d32 100644 --- a/tools/testing/selftests/powerpc/stringloops/strlen.c +++ b/tools/testing/selftests/powerpc/stringloops/strlen.c @@ -1,5 +1,4 @@ // SPDX-License-Identifier: GPL-2.0 -#include #include #include #include @@ -51,10 +50,11 @@ static void bench_test(char *s) static int testcase(void) { char *s; + int ret; unsigned long i; - s = memalign(128, SIZE); - if (!s) { + ret = posix_memalign((void **), 128, SIZE); + if (ret < 0) { perror("memalign"); exit(1); } -- 2.27.0
Re: [PATCH] powerpc/mpc52xx_lpbfifo: Drop unused functions
Uwe Kleine-König writes: > On Wed, Dec 28, 2022 at 03:51:29PM +0100, Uwe Kleine-König wrote: >> The four exported functions mpc52xx_lpbfifo_submit(), >> mpc52xx_lpbfifo_abort(), mpc52xx_lpbfifo_poll(), and >> mpc52xx_lpbfifo_start_xfer() are not used. So they can be dropped and the >> definitions needed to call them can be moved into the driver file. >> >> Signed-off-by: Uwe Kleine-König > > I never got feedback about this driver and it has not appeared in next > up to now. Did it fell through the cracks? Yeah. I was hoping someone would explain what's going on with the driver. Presumably there are some out-of-tree drivers that use the routines provided by this driver? I think rather than merging this patch, which keeps the code but makes it completely useless, do you mind sending a patch to remove the whole driver? Maybe that will get someone's attention. cheers
Re: [PATCH 2/2] start_kernel: omit prevent_tail_call_optimization for newer toolchains
On Wed, Apr 12, 2023 at 11:32:13AM -0700, ndesaulni...@google.com wrote: > prevent_tail_call_optimization was added in > commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try") > to work around stack canaries getting inserted into functions that would > initialize the stack canary in the first place. > > Now that we have no_stack_protector function attribute (gcc-11+, > clang-7+) and use it on start_kernel, remove the call to > prevent_tail_call_optimization such that we may one day remove it > outright. > > Signed-off-by: Nick Desaulniers Reviewed-by: Nathan Chancellor > --- > init/main.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/init/main.c b/init/main.c > index 213baf7b8cb1..c8503d02dfa6 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -1152,7 +1152,12 @@ void start_kernel(void) > /* Do the rest non-__init'ed, we're now alive */ > arch_call_rest_init(); > > + /* Avoid stack canaries in callers of boot_init_stack_canary for gcc-10 > + * and older. > + */ > +#if !__has_attribute(__no_stack_protector__) > prevent_tail_call_optimization(); > +#endif > } > > /* Call all constructor functions linked into the kernel. */ > > -- > 2.40.0.577.gac1e443424-goog >
Re: [PATCH 1/2] start_kernel: add no_stack_protector fn attr
On Wed, Apr 12, 2023 at 11:32:12AM -0700, ndesaulni...@google.com wrote: > Back during the discussion of > commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try") > we discussed the need for a function attribute to control the omission > of stack protectors on a per-function basis; at the time Clang had > support for no_stack_protector but GCC did not. This was fixed in > gcc-11. Now that the function attribute is available, let's start using > it. > > Callers of boot_init_stack_canary need to use this function attribute > unless they're compiled with -fno-stack-protector, otherwise the canary > stored in the stack slot of the caller will differ upon the call to > boot_init_stack_canary. This will lead to a call to __stack_chk_fail > then panic. > > Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722 > Link: > https://lore.kernel.org/all/20200316130414.gc12...@hirez.programming.kicks-ass.net/ > Signed-off-by: Nick Desaulniers I applied this in front of Josh's series and defconfig no longer panics on boot :) Tested-by: Nathan Chancellor > --- > arch/powerpc/kernel/smp.c | 1 + > include/linux/compiler_attributes.h | 12 > init/main.c | 3 ++- > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c > index 6b90f10a6c81..7d4c12b1abb7 100644 > --- a/arch/powerpc/kernel/smp.c > +++ b/arch/powerpc/kernel/smp.c > @@ -1603,6 +1603,7 @@ static void add_cpu_to_masks(int cpu) > } > > /* Activate a secondary processor. */ > +__no_stack_protector > void start_secondary(void *unused) > { > unsigned int cpu = raw_smp_processor_id(); > diff --git a/include/linux/compiler_attributes.h > b/include/linux/compiler_attributes.h > index e659cb6fded3..84864767a56a 100644 > --- a/include/linux/compiler_attributes.h > +++ b/include/linux/compiler_attributes.h > @@ -255,6 +255,18 @@ > */ > #define __noreturn __attribute__((__noreturn__)) > > +/* > + * Optional: only supported since GCC >= 11.1, clang >= 7.0. > + * > + * gcc: > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fstack_005fprotector-function-attribute > + * clang: > https://clang.llvm.org/docs/AttributeReference.html#no-stack-protector-safebuffers > + */ > +#if __has_attribute(__no_stack_protector__) > +# define __no_stack_protector > __attribute__((__no_stack_protector__)) > +#else > +# define __no_stack_protector > +#endif > + > /* > * Optional: not supported by gcc. > * > diff --git a/init/main.c b/init/main.c > index bb87b789c543..213baf7b8cb1 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -941,7 +941,8 @@ static void __init print_unknown_bootoptions(void) > memblock_free(unknown_options, len); > } > > -asmlinkage __visible void __init __no_sanitize_address start_kernel(void) > +asmlinkage __visible __init __no_sanitize_address __no_stack_protector > +void start_kernel(void) > { > char *command_line; > char *after_dashes; > > -- > 2.40.0.577.gac1e443424-goog > >
Re: [PATCH v3 5/6] PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler
On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote: > From: Robert Richter > > In Restricted CXL Device (RCD) mode a CXL device is exposed as an > RCiEP, but CXL downstream and upstream ports are not enumerated and > not visible in the PCIe hierarchy. Protocol and link errors are sent > to an RCEC. > > Restricted CXL host (RCH) downstream port-detected errors are signaled > as internal AER errors, either Uncorrectable Internal Error (UIE) or > Corrected Internal Errors (CIE). The error source is the id of the > RCEC. A CXL handler must then inspect the error status in various CXL > registers residing in the dport's component register space (CXL RAS > cap) or the dport's RCRB (AER ext cap). [1] > > Errors showing up in the RCEC's error handler must be handled and > connected to the CXL subsystem. Implement this by forwarding the error > to all CXL devices below the RCEC. Since the entire CXL device is > controlled only using PCIe Configuration Space of device 0, Function > 0, Capitalize "device" and "Function" the same way (also appears in comment below). > only pass it there [2]. These devices have the Memory Device class > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver > can implement the handler. In addition to errors directed to the CXL > endpoint device, the handler must also inspect the CXL downstream > port's CXL RAS and PCIe AER external capabilities that is connected to "AER external capabilities" -- is that referring to the "AER *Extended* capability"? If so, we usually don't bother including the "extended" part because it's usually not relevant. But if you intended "external", I don't know what it means. > the device. > > Since CXL downstream port errors are signaled using internal errors, > the handler requires those errors to be unmasked. This is subject of a > follow-on patch. > > The reason for choosing this implementation is that a CXL RCEC device > is bound to the AER port driver, but the driver does not allow it to > register a custom specific handler to support CXL. Connecting the RCEC > hard-wired with a CXL handler does not work, as the CXL subsystem > might not be present all the time. The alternative to add an > implementation to the portdrv to allow the registration of a custom > RCEC error handler isn't worth doing it as CXL would be its only user. > Instead, just check for an CXL RCEC and pass it down to the connected > CXL device's error handler. With this approach the code can entirely > be implemented in the PCIe AER driver and is independent of the CXL > subsystem. The CXL driver only provides the handler. Can you make this more concrete with an example topology so we can work through how this all works? Correct me when I go off the rails here: The current code uses pcie_walk_rcec() in this path, which basically searches below a Root Port or RCEC for devices that have an AER error status bit set, add them to the e_info[] list, and call handle_error_source() for each one: aer_isr_one_error # get e_src from aer_fifo find_source_device(e_src) pcie_walk_rcec(find_device_iter) find_device_iter is_error_source # read PCI_ERR_COR_STATUS or PCI_ERR_UNCOR_STATUS if (error-source) add_error_device # add device to e_info[] list # now call handle_error_source for everything in e_info[] aer_process_err_devices for (i = 0; i < e_info->err_dev_num; i++) handle_error_source IIUC, this patch basically says that an RCEC should have an AER error status bit (UIE or CIE) set, but the devices "below" the RCEC will not, so they won't get added to e_info[]. So we insert cxl_handle_error() in handle_error_source(), where it gets called for the RCEC, and then it uses pcie_walk_rcec() again to forcibly call handle_error_source() for *every* device "below" the RCEC (even though they don't have AER error status bits set). Then handle_error_source() ultimately calls the CXL driver err_handler entry points (.cor_error_detected(), .error_detected(), etc), which can look at the CXL-specific error status in the CXL RAS or RCRB or whatever. So this basically looks like a workaround for the fact that the AER code only calls handle_error_source() when it finds AER error status, and CXL doesn't *set* that AER error status. There's not that much code here, but it seems like a quite a bit of complexity in an area that is already pretty complicated. Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically receives a packet of error status from firmware and queues it for recovery via pcie_do_recovery(). What if you had a CXL module that knew how to look for the CXL error status, package it up similarly, and queue it via aer_recover_queue()? > [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors > [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices > > Co-developed-by: Terry Bowman > Signed-off-by: Robert Richter > Signed-off-by: Terry
RE: [PATCH 0/6] bus: fsl-mc: Make remove function return void
> -Original Message- > From: Uwe Kleine-König > Sent: Wednesday, April 12, 2023 12:11 PM > To: Stuart Yoder ; Laurentiu Tudor > ; Roy Pledge ; Leo Li > ; Horia Geanta ; Pankaj > Gupta ; Gaurav Jain ; > Herbert Xu ; David S. Miller > ; Vinod Koul ; Ioana Ciornei > ; Eric Dumazet ; Jakub > Kicinski ; Paolo Abeni ; Y.B. Lu > ; Diana Madalina Craciun (OSS) > ; Alex Williamson > ; Richard Cochran > > Cc: k...@vger.kernel.org; net...@vger.kernel.org; linux- > ker...@vger.kernel.org; linux-cry...@vger.kernel.org; > ker...@pengutronix.de; dmaeng...@vger.kernel.org; linuxppc- > d...@lists.ozlabs.org; linux-arm-ker...@lists.infradead.org > Subject: Re: [PATCH 0/6] bus: fsl-mc: Make remove function return void > > Hello, > > On Fri, Mar 10, 2023 at 11:41:22PM +0100, Uwe Kleine-König wrote: > > Hello, > > > > many bus remove functions return an integer which is a historic > > misdesign that makes driver authors assume that there is some kind of > > error handling in the upper layers. This is wrong however and > > returning and error code only yields an error message. > > > > This series improves the fsl-mc bus by changing the remove callback to > > return no value instead. As a preparation all drivers are changed to > > return zero before so that they don't trigger the error message. > > Who is supposed to pick up this patch series (or point out a good reason for > not taking it)? Previously Greg KH picked up MC bus patches. If no one is picking up them this time, I probably can take it through the fsl soc tree. Regards, Leo
Re: [PATCH v3 6/6] PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling
On Tue, Apr 11, 2023 at 01:03:02PM -0500, Terry Bowman wrote: > From: Robert Richter > > RCEC AER corrected and uncorrectable internal errors (CIE/UIE) are > disabled by default. "Disabled by default" just means "the power-up state of CIE/UIC is that they are masked", right? It doesn't mean that Linux normally masks them. > [1][2] Enable them to receive CXL downstream port > errors of a Restricted CXL Host (RCH). > > [1] CXL 3.0 Spec, 12.2.1.1 - RCH Downstream Port Detected Errors > [2] PCIe Base Spec 6.0, 7.8.4.3 Uncorrectable Error Mask Register, > 7.8.4.6 Correctable Error Mask Register > > Co-developed-by: Terry Bowman > Signed-off-by: Robert Richter > Signed-off-by: Terry Bowman > Cc: "Oliver O'Halloran" > Cc: Bjorn Helgaas > Cc: Mahesh J Salgaonkar > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-...@vger.kernel.org > --- > drivers/pci/pcie/aer.c | 73 ++ > 1 file changed, 73 insertions(+) > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index 171a08fd8ebd..3973c731e11d 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1000,7 +1000,79 @@ static void cxl_handle_error(struct pci_dev *dev, > struct aer_err_info *info) > pcie_walk_rcec(dev, cxl_handle_error_iter, info); > } > > +static bool cxl_error_is_native(struct pci_dev *dev) > +{ > + struct pci_host_bridge *host = pci_find_host_bridge(dev->bus); > + > + if (pcie_ports_native) > + return true; > + > + return host->native_aer && host->native_cxl_error; > +} > + > +static int handles_cxl_error_iter(struct pci_dev *dev, void *data) > +{ > + int *handles_cxl = data; > + > + *handles_cxl = is_cxl_mem_dev(dev) && cxl_error_is_native(dev); > + > + return *handles_cxl; > +} > + > +static bool handles_cxl_errors(struct pci_dev *rcec) > +{ > + int handles_cxl = 0; > + > + if (!rcec->aer_cap) > + return false; > + > + if (pci_pcie_type(rcec) == PCI_EXP_TYPE_RC_EC) > + pcie_walk_rcec(rcec, handles_cxl_error_iter, _cxl); > + > + return !!handles_cxl; > +} > + > +static int __cxl_unmask_internal_errors(struct pci_dev *rcec) > +{ > + int aer, rc; > + u32 mask; > + > + /* > + * Internal errors are masked by default, unmask RCEC's here > + * PCI6.0 7.8.4.3 Uncorrectable Error Mask Register (Offset 08h) > + * PCI6.0 7.8.4.6 Correctable Error Mask Register (Offset 14h) > + */ Unmasking internal errors doesn't have anything specific to do with CXL, so I don't think it should have "cxl" in the function name. Maybe something like "pci_aer_unmask_internal_errors()". This also has nothing special to do with RCECs, so I think we should refer to the device as "dev" as is typical in this file. I think this needs to check pcie_aer_is_native() as is done by pci_aer_clear_nonfatal_status() and other functions that write the AER Capability. With the exception of this function, this patch looks like all CXL code that maybe could be with other CXL code. Would require making pcie_walk_rcec() available outside drivers/pci, I guess. > + aer = rcec->aer_cap; > + rc = pci_read_config_dword(rcec, aer + PCI_ERR_UNCOR_MASK, ); > + if (rc) > + return rc; > + mask &= ~PCI_ERR_UNC_INTN; > + rc = pci_write_config_dword(rcec, aer + PCI_ERR_UNCOR_MASK, mask); > + if (rc) > + return rc; > + > + rc = pci_read_config_dword(rcec, aer + PCI_ERR_COR_MASK, ); > + if (rc) > + return rc; > + mask &= ~PCI_ERR_COR_INTERNAL; > + rc = pci_write_config_dword(rcec, aer + PCI_ERR_COR_MASK, mask); > + > + return rc; > +} > + > +static void cxl_unmask_internal_errors(struct pci_dev *rcec) > +{ > + if (!handles_cxl_errors(rcec)) > + return; > + > + if (__cxl_unmask_internal_errors(rcec)) > + dev_err(>dev, "cxl: Failed to unmask internal errors"); > + else > + dev_dbg(>dev, "cxl: Internal errors unmasked"); > +} > + > #else > +static inline void cxl_unmask_internal_errors(struct pci_dev *dev) { } > static inline void cxl_handle_error(struct pci_dev *dev, > struct aer_err_info *info) { } > #endif > @@ -1397,6 +1469,7 @@ static int aer_probe(struct pcie_device *dev) > return status; > } > > + cxl_unmask_internal_errors(port); > aer_enable_rootport(rpc); > pci_info(port, "enabled with IRQ %d\n", dev->irq); > return 0; > -- > 2.34.1 >
Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver
Hi Vinod, On 4/12/23 12:10, Vinod Koul wrote: On 11-04-23, 14:43, Sean Anderson wrote: This adds support for the Lynx 10G "SerDes" devices found on various NXP QorIQ SoCs. There may be up to four SerDes devices on each SoC, each supporting up to eight lanes. Protocol support for each SerDes is highly heterogeneous, with each SoC typically having a totally different selection of supported protocols for each lane. Additionally, the SerDes devices on each SoC also have differing support. One SerDes will typically support Ethernet on most lanes, while the other will typically support PCIe on most lanes. There is wide hardware support for this SerDes. It is present on QorIQ T-Series and Layerscape processors. Because each SoC typically has specific instructions and exceptions for its SerDes, I have limited the initial scope of this module to just the LS1046A and LS1088A. Additionally, I have only added support for Ethernet protocols. There is not a great need for dynamic reconfiguration for other protocols (except perhaps for M.2 cards), so support for them may never be added. Nevertheless, I have tried to provide an obvious path for adding support for other SoCs as well as other protocols. SATA just needs support for configuring LNmSSCR0. PCIe may need to configure the equalization registers. It also uses multiple lanes. I have tried to write the driver with multi-lane support in mind, so there should not need to be any large changes. Although there are 6 protocols supported, I have only tested SGMII and XFI. The rest have been implemented as described in the datasheet. Most of these protocols should work "as-is", but 10GBASE-KR will need PCS support for link training. Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all configured for PCIe, this driver uses one phy configured to use 4 lanes. This is because while the individual lanes may be configured individually, the protocol selection acts on all lanes at once. Additionally, the order which lanes should be configured in is specified by the datasheet. To coordinate this, lanes are reserved in phy_init, and released in phy_exit. This driver was written with reference to the LS1046A reference manual. However, it was informed by reference manuals for all processors with mEMACs, especially the T4240 (which appears to have a "maxed-out" configuration). The earlier P-series processors appear to be similar, but have a different overall register layout (using "banks" instead of separate SerDes). Perhaps this those use a "5G Lynx SerDes." Signed-off-by: Sean Anderson --- (no changes since v10) Changes in v10: - Fix debugging print with incorrect error variable Changes in v9: - Split off clock "driver" into its own patch to allow for better review. - Add ability to defer lane initialization to phy_init. This allows for easier transitioning between firmware-managed serdes and Linux- managed serdes, as the consumer (such as dpaa2, which knows what the firmware is doing) has the last say on who gets control. - phy-type -> fsl,phy Changes in v8: - Remove unused variable from lynx_ls_mode_init Changes in v7: - Break out call order into generic documentation - Refuse to switch "major" protocols - Update Kconfig to reflect restrictions - Remove set/clear of "pcs reset" bit, since it doesn't seem to fix anything. Changes in v6: - Update MAINTAINERS to include new files - Include bitfield.h and slab.h to allow compilation on non-arm64 arches. - Depend on COMMON_CLK and either layerscape/ppc Changes in v5: - Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this series to be applied directly to linux/master. - Add fsl,lynx-10g.h to MAINTAINERS Changes in v4: - Rework all debug statements to remove use of __func__. Additional information has been provided as necessary. - Consider alternative parent rates in round_rate and not in set_rate. Trying to modify out parent's rate in set_rate will deadlock. - Explicitly perform a stop/reset sequence in set_rate. This way we always ensure that the PLL is properly stopped. - Set the power-down bit when disabling the PLL. We can do this now that enable/disable aren't abused during the set rate sequence. - Fix typos in QSGMII_OFFSET and XFI_OFFSET - Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better reflect its function (adding post-cursor equalization). - Use of_clk_hw_onecell_get instead of a custom function. - Return struct clks from lynx_clks_init instead of embedding lynx_clk in lynx_priv. - Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs primarily in the layout and offset of the PCCRs. This will help bring a cleaner abstraction layer. The caps have been removed, since this handles the only current usage. - Convert to use new binding format. As a result of this, we no longer need to have protocols for PCIe or SATA. Additionally, modes now live in lynx_group instead of lynx_priv. - Remove teq from
Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver
Hi Vinod, On 4/12/23 12:10, Vinod Koul wrote: On 11-04-23, 14:43, Sean Anderson wrote: This adds support for the Lynx 10G "SerDes" devices found on various NXP QorIQ SoCs. There may be up to four SerDes devices on each SoC, each supporting up to eight lanes. Protocol support for each SerDes is highly heterogeneous, with each SoC typically having a totally different selection of supported protocols for each lane. Additionally, the SerDes devices on each SoC also have differing support. One SerDes will typically support Ethernet on most lanes, while the other will typically support PCIe on most lanes. There is wide hardware support for this SerDes. It is present on QorIQ T-Series and Layerscape processors. Because each SoC typically has specific instructions and exceptions for its SerDes, I have limited the initial scope of this module to just the LS1046A and LS1088A. Additionally, I have only added support for Ethernet protocols. There is not a great need for dynamic reconfiguration for other protocols (except perhaps for M.2 cards), so support for them may never be added. Nevertheless, I have tried to provide an obvious path for adding support for other SoCs as well as other protocols. SATA just needs support for configuring LNmSSCR0. PCIe may need to configure the equalization registers. It also uses multiple lanes. I have tried to write the driver with multi-lane support in mind, so there should not need to be any large changes. Although there are 6 protocols supported, I have only tested SGMII and XFI. The rest have been implemented as described in the datasheet. Most of these protocols should work "as-is", but 10GBASE-KR will need PCS support for link training. Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all configured for PCIe, this driver uses one phy configured to use 4 lanes. This is because while the individual lanes may be configured individually, the protocol selection acts on all lanes at once. Additionally, the order which lanes should be configured in is specified by the datasheet. To coordinate this, lanes are reserved in phy_init, and released in phy_exit. This driver was written with reference to the LS1046A reference manual. However, it was informed by reference manuals for all processors with mEMACs, especially the T4240 (which appears to have a "maxed-out" configuration). The earlier P-series processors appear to be similar, but have a different overall register layout (using "banks" instead of separate SerDes). Perhaps this those use a "5G Lynx SerDes." Signed-off-by: Sean Anderson --- (no changes since v10) Changes in v10: - Fix debugging print with incorrect error variable Changes in v9: - Split off clock "driver" into its own patch to allow for better review. - Add ability to defer lane initialization to phy_init. This allows for easier transitioning between firmware-managed serdes and Linux- managed serdes, as the consumer (such as dpaa2, which knows what the firmware is doing) has the last say on who gets control. - phy-type -> fsl,phy Changes in v8: - Remove unused variable from lynx_ls_mode_init Changes in v7: - Break out call order into generic documentation - Refuse to switch "major" protocols - Update Kconfig to reflect restrictions - Remove set/clear of "pcs reset" bit, since it doesn't seem to fix anything. Changes in v6: - Update MAINTAINERS to include new files - Include bitfield.h and slab.h to allow compilation on non-arm64 arches. - Depend on COMMON_CLK and either layerscape/ppc Changes in v5: - Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this series to be applied directly to linux/master. - Add fsl,lynx-10g.h to MAINTAINERS Changes in v4: - Rework all debug statements to remove use of __func__. Additional information has been provided as necessary. - Consider alternative parent rates in round_rate and not in set_rate. Trying to modify out parent's rate in set_rate will deadlock. - Explicitly perform a stop/reset sequence in set_rate. This way we always ensure that the PLL is properly stopped. - Set the power-down bit when disabling the PLL. We can do this now that enable/disable aren't abused during the set rate sequence. - Fix typos in QSGMII_OFFSET and XFI_OFFSET - Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better reflect its function (adding post-cursor equalization). - Use of_clk_hw_onecell_get instead of a custom function. - Return struct clks from lynx_clks_init instead of embedding lynx_clk in lynx_priv. - Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs primarily in the layout and offset of the PCCRs. This will help bring a cleaner abstraction layer. The caps have been removed, since this handles the only current usage. - Convert to use new binding format. As a result of this, we no longer need to have protocols for PCIe or SATA. Additionally, modes now live in lynx_group instead of lynx_priv. - Remove teq from
Re: [PATCH 1/2] start_kernel: add no_stack_protector fn attr
On Wed, Apr 12, 2023 at 8:32 PM wrote: > > include/linux/compiler_attributes.h | 12 Acked-by: Miguel Ojeda Cheers, Miguel
Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver
Hi Vinod, On 4/12/23 12:10, Vinod Koul wrote: On 11-04-23, 14:43, Sean Anderson wrote: This adds support for the Lynx 10G "SerDes" devices found on various NXP QorIQ SoCs. There may be up to four SerDes devices on each SoC, each supporting up to eight lanes. Protocol support for each SerDes is highly heterogeneous, with each SoC typically having a totally different selection of supported protocols for each lane. Additionally, the SerDes devices on each SoC also have differing support. One SerDes will typically support Ethernet on most lanes, while the other will typically support PCIe on most lanes. There is wide hardware support for this SerDes. It is present on QorIQ T-Series and Layerscape processors. Because each SoC typically has specific instructions and exceptions for its SerDes, I have limited the initial scope of this module to just the LS1046A and LS1088A. Additionally, I have only added support for Ethernet protocols. There is not a great need for dynamic reconfiguration for other protocols (except perhaps for M.2 cards), so support for them may never be added. Nevertheless, I have tried to provide an obvious path for adding support for other SoCs as well as other protocols. SATA just needs support for configuring LNmSSCR0. PCIe may need to configure the equalization registers. It also uses multiple lanes. I have tried to write the driver with multi-lane support in mind, so there should not need to be any large changes. Although there are 6 protocols supported, I have only tested SGMII and XFI. The rest have been implemented as described in the datasheet. Most of these protocols should work "as-is", but 10GBASE-KR will need PCS support for link training. Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all configured for PCIe, this driver uses one phy configured to use 4 lanes. This is because while the individual lanes may be configured individually, the protocol selection acts on all lanes at once. Additionally, the order which lanes should be configured in is specified by the datasheet. To coordinate this, lanes are reserved in phy_init, and released in phy_exit. This driver was written with reference to the LS1046A reference manual. However, it was informed by reference manuals for all processors with mEMACs, especially the T4240 (which appears to have a "maxed-out" configuration). The earlier P-series processors appear to be similar, but have a different overall register layout (using "banks" instead of separate SerDes). Perhaps this those use a "5G Lynx SerDes." Signed-off-by: Sean Anderson --- (no changes since v10) Changes in v10: - Fix debugging print with incorrect error variable Changes in v9: - Split off clock "driver" into its own patch to allow for better review. - Add ability to defer lane initialization to phy_init. This allows for easier transitioning between firmware-managed serdes and Linux- managed serdes, as the consumer (such as dpaa2, which knows what the firmware is doing) has the last say on who gets control. - phy-type -> fsl,phy Changes in v8: - Remove unused variable from lynx_ls_mode_init Changes in v7: - Break out call order into generic documentation - Refuse to switch "major" protocols - Update Kconfig to reflect restrictions - Remove set/clear of "pcs reset" bit, since it doesn't seem to fix anything. Changes in v6: - Update MAINTAINERS to include new files - Include bitfield.h and slab.h to allow compilation on non-arm64 arches. - Depend on COMMON_CLK and either layerscape/ppc Changes in v5: - Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this series to be applied directly to linux/master. - Add fsl,lynx-10g.h to MAINTAINERS Changes in v4: - Rework all debug statements to remove use of __func__. Additional information has been provided as necessary. - Consider alternative parent rates in round_rate and not in set_rate. Trying to modify out parent's rate in set_rate will deadlock. - Explicitly perform a stop/reset sequence in set_rate. This way we always ensure that the PLL is properly stopped. - Set the power-down bit when disabling the PLL. We can do this now that enable/disable aren't abused during the set rate sequence. - Fix typos in QSGMII_OFFSET and XFI_OFFSET - Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better reflect its function (adding post-cursor equalization). - Use of_clk_hw_onecell_get instead of a custom function. - Return struct clks from lynx_clks_init instead of embedding lynx_clk in lynx_priv. - Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs primarily in the layout and offset of the PCCRs. This will help bring a cleaner abstraction layer. The caps have been removed, since this handles the only current usage. - Convert to use new binding format. As a result of this, we no longer need to have protocols for PCIe or SATA. Additionally, modes now live in lynx_group instead of lynx_priv. - Remove teq from
[PATCH 1/2] start_kernel: add no_stack_protector fn attr
Back during the discussion of commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try") we discussed the need for a function attribute to control the omission of stack protectors on a per-function basis; at the time Clang had support for no_stack_protector but GCC did not. This was fixed in gcc-11. Now that the function attribute is available, let's start using it. Callers of boot_init_stack_canary need to use this function attribute unless they're compiled with -fno-stack-protector, otherwise the canary stored in the stack slot of the caller will differ upon the call to boot_init_stack_canary. This will lead to a call to __stack_chk_fail then panic. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722 Link: https://lore.kernel.org/all/20200316130414.gc12...@hirez.programming.kicks-ass.net/ Signed-off-by: Nick Desaulniers --- arch/powerpc/kernel/smp.c | 1 + include/linux/compiler_attributes.h | 12 init/main.c | 3 ++- 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index 6b90f10a6c81..7d4c12b1abb7 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -1603,6 +1603,7 @@ static void add_cpu_to_masks(int cpu) } /* Activate a secondary processor. */ +__no_stack_protector void start_secondary(void *unused) { unsigned int cpu = raw_smp_processor_id(); diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h index e659cb6fded3..84864767a56a 100644 --- a/include/linux/compiler_attributes.h +++ b/include/linux/compiler_attributes.h @@ -255,6 +255,18 @@ */ #define __noreturn __attribute__((__noreturn__)) +/* + * Optional: only supported since GCC >= 11.1, clang >= 7.0. + * + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fstack_005fprotector-function-attribute + * clang: https://clang.llvm.org/docs/AttributeReference.html#no-stack-protector-safebuffers + */ +#if __has_attribute(__no_stack_protector__) +# define __no_stack_protector __attribute__((__no_stack_protector__)) +#else +# define __no_stack_protector +#endif + /* * Optional: not supported by gcc. * diff --git a/init/main.c b/init/main.c index bb87b789c543..213baf7b8cb1 100644 --- a/init/main.c +++ b/init/main.c @@ -941,7 +941,8 @@ static void __init print_unknown_bootoptions(void) memblock_free(unknown_options, len); } -asmlinkage __visible void __init __no_sanitize_address start_kernel(void) +asmlinkage __visible __init __no_sanitize_address __no_stack_protector +void start_kernel(void) { char *command_line; char *after_dashes; -- 2.40.0.577.gac1e443424-goog
[PATCH 0/2] start_kernel: omit stack canary
A security research paper was recently published detailing Catch Handler Oriented Programming (CHOP) attacks. https://download.vusec.net/papers/chop_ndss23.pdf The TL;DR being that C++ structured exception handling runtimes are attractive gadgets for Jump Oriented Programming (JOP) attacks. In response to this, a mitigation was developed under embargo in clang-16 to check the stack canary before calling noreturn functions. https://bugs.chromium.org/p/llvm/issues/detail?id=30 This started causing boot failures in Android kernel trees downstream of stable linux-4.14.y that had proto-LTO support, as reported by Nathan Chancellor. https://github.com/ClangBuiltLinux/linux/issues/1815 Josh Poimboeuf recently sent a series to explicitly annotate more functions as noreturn. Nathan noticed the series, and tested it finding that it now caused boot failures with clang-16+ on mainline (raising the visibility and urgency of the issue). https://lore.kernel.org/cover.1680912057.git.jpoim...@kernel.org/ Once the embargo was lifted, I asked questions such as "what does C++ structured exception handling have to do with C code" and "surely GCC didn't ship the same mitigation for C code (narrator: 'They did not: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a25982ada523689c8745d7fb4b1b93c8f5dab2e7')?" I now have a patch out for LLVM to undo this mess (or at least limit it to C++ functions that may throw, similar to GCC's mitigation); it hasn't landed yet but we're close to consensus and I expect it to land imminently. https://reviews.llvm.org/D147975 Remember this thread? (Pepperidge farms remembers...) https://lore.kernel.org/all/20200314164451.346497-1-sly...@gentoo.org/ That reminded me that years ago we discussed a function attribute for no_stack_protector. https://lore.kernel.org/all/20200316130414.gc12...@hirez.programming.kicks-ass.net/ GCC didn't have one at the time, it now does. In addition to the LLVM fix, I'd like to introduce this in the kernel so that we might start using it in additional places: * https://lore.kernel.org/linux-pm/20200915172658.1432732-1-r...@google.com/ * https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolva...@google.com/ And eventually remove the final macro expansion site of prevent_tail_call_optimization. With the LLVM fix, this series isn't required, but I'd like to start paving the way to use these function attributes since I think they are a sweet spot in terms of granularity (as opposed to trying to move start_kernel to its own TU compiled with -fno-stack-protector). (This is my first time using b4 to send a series, as per https://people.kernel.org/monsieuricon/sending-a-kernel-patch-with-b4-part-1. Here goes nothing!) Signed-off-by: Nick Desaulniers --- Nick Desaulniers (2): start_kernel: add no_stack_protector fn attr start_kernel: omit prevent_tail_call_optimization for newer toolchains arch/powerpc/kernel/smp.c | 1 + include/linux/compiler_attributes.h | 12 init/main.c | 8 +++- 3 files changed, 20 insertions(+), 1 deletion(-) --- base-commit: 0bcc4025550403ae28d2984bddacafbca0a2f112 change-id: 20230412-no_stackp-a98168a2bb0a Best regards, -- Nick Desaulniers
[PATCH 2/2] start_kernel: omit prevent_tail_call_optimization for newer toolchains
prevent_tail_call_optimization was added in commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try") to work around stack canaries getting inserted into functions that would initialize the stack canary in the first place. Now that we have no_stack_protector function attribute (gcc-11+, clang-7+) and use it on start_kernel, remove the call to prevent_tail_call_optimization such that we may one day remove it outright. Signed-off-by: Nick Desaulniers --- init/main.c | 5 + 1 file changed, 5 insertions(+) diff --git a/init/main.c b/init/main.c index 213baf7b8cb1..c8503d02dfa6 100644 --- a/init/main.c +++ b/init/main.c @@ -1152,7 +1152,12 @@ void start_kernel(void) /* Do the rest non-__init'ed, we're now alive */ arch_call_rest_init(); + /* Avoid stack canaries in callers of boot_init_stack_canary for gcc-10 +* and older. +*/ +#if !__has_attribute(__no_stack_protector__) prevent_tail_call_optimization(); +#endif } /* Call all constructor functions linked into the kernel. */ -- 2.40.0.577.gac1e443424-goog
[linux-next:master] BUILD REGRESSION 7d8214bba44c1aa6a75921a09a691945d26a8d43
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 7d8214bba44c1aa6a75921a09a691945d26a8d43 Add linux-next specific files for 20230412 Error/Warning reports: https://lore.kernel.org/oe-kbuild-all/202303161521.jbgbafjj-...@intel.com https://lore.kernel.org/oe-kbuild-all/202304061839.hi01vpl1-...@intel.com https://lore.kernel.org/oe-kbuild-all/202304102354.q4voxgte-...@intel.com https://lore.kernel.org/oe-kbuild-all/202304112044.8nzkpvxm-...@intel.com https://lore.kernel.org/oe-kbuild-all/202304121606.ltrfyuij-...@intel.com https://lore.kernel.org/oe-kbuild-all/202304121702.bav49hfn-...@intel.com Error/Warning: (recently discovered and may have been fixed) Error: failed to load BTF from vmlinux: No data available arch/arm/vfp/entry.S:27: undefined reference to `vfp_entry' arch/csky/abiv2/cacheflush.c:15:9: error: implicit declaration of function 'flush_tlb_page'; did you mean 'flush_anon_page'? [-Werror=implicit-function-declaration] diff: tools/arch/s390/include/uapi/asm/ptrace.h: No such file or directory drivers/bluetooth/hci_qca.c:1894:37: warning: unused variable 'qca_soc_data_wcn6855' [-Wunused-const-variable] drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_validation.c:351:13: warning: variable 'bw_needed' set but not used [-Wunused-but-set-variable] drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_validation.c:352:25: warning: variable 'link' set but not used [-Wunused-but-set-variable] drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:62: warning: wrong kernel-doc identifier on line: drivers/gpu/drm/i915/i915_pmu.h:41: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/gpu/drm/i915/i915_request.h:176: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/gpu/drm/i915/i915_vma.h:145: warning: expecting prototype for i915_vma_offset(). Prototype was for i915_vma_size() instead drivers/net/wireless/legacy/ray_cs.c:628:17: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation] drivers/tty/serial/samsung_tty.c:2034:10: error: implicit declaration of function 'of_device_get_match_data' is invalid in C99 [-Werror,-Wimplicit-function-declaration] drivers/tty/serial/samsung_tty.c:2034:10: warning: incompatible integer to pointer conversion returning 'int' from a function with result type 'const struct s3c24xx_serial_drv_data *' [-Wint-conversion] drivers/tty/serial/samsung_tty.c:2034:24: error: implicit declaration of function 'of_device_get_match_data'; did you mean 'device_get_match_data'? [-Werror=implicit-function-declaration] drivers/tty/serial/samsung_tty.c:2034:24: warning: returning 'int' from a function with return type 'const struct s3c24xx_serial_drv_data *' makes pointer from integer without a cast [-Wint-conversion] make[3]: *** No rule to make target 'zip.h', needed by '/tools/build/libbpf/sharedobjs/libbpf.o'. make[3]: *** No rule to make target 'zip.h', needed by '/tools/build/libbpf/staticobjs/libbpf.o'. make[4]: *** No rule to make target 'zip.h', needed by 'kselftest/net/tools/build/libbpf/sharedobjs/libbpf.o'. make[4]: *** No rule to make target 'zip.h', needed by 'kselftest/net/tools/build/libbpf/staticobjs/libbpf.o'. membarrier_test_impl.h:27:30: error: 'MEMBARRIER_CMD_GET_REGISTRATIONS' undeclared (first use in this function) mount_setattr_test.c:107:8: error: redefinition of 'struct mount_attr' thermal_nl.h:6:10: fatal error: netlink/netlink.h: No such file or directory thermometer.c:21:10: fatal error: libconfig.h: No such file or directory Unverified Error/Warning (likely false positive, please contact us if interested): drivers/acpi/property.c:985 acpi_data_prop_read_single() error: potentially dereferencing uninitialized 'obj'. drivers/crypto/intel/qat/qat_common/adf_cfg.c:262 adf_cfg_add_key_value_param() warn: argument 4 to %lx specifier is cast from pointer drivers/hwmon/pmbus/pmbus_core.c:3164:7-32: WARNING: Threaded IRQ with no primary handler requested without IRQF_ONESHOT (unless it is nested IRQ) drivers/soc/fsl/qe/tsa.c:140:26: sparse: sparse: incorrect type in argument 2 (different address spaces) drivers/soc/fsl/qe/tsa.c:150:27: sparse: sparse: incorrect type in argument 1 (different address spaces) Error/Warning ids grouped by kconfigs: gcc_recent_errors |-- alpha-allyesconfig | |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-link-link_validation.c:warning:variable-bw_needed-set-but-not-used | |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-link-link_validation.c:warning:variable-link-set-but-not-used | `-- drivers-net-wireless-legacy-ray_cs.c:warning:strncpy-specified-bound-equals-destination-size |-- alpha-randconfig-c023-20230409 | |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-link-link_validation.c:warning:variable-bw_needed-set-but-not-used | `-- drivers-gpu-drm-amd-amdgpu-..-display-dc
Re: [PATCH v3 02/14] arm64: drop ranges in definition of ARCH_FORCE_MAX_ORDER
On Tue, Apr 04, 2023 at 06:50:01AM -0500, Justin Forbes wrote: > On Tue, Apr 4, 2023 at 2:22 AM Mike Rapoport wrote: > > On Wed, Mar 29, 2023 at 10:55:37AM -0500, Justin Forbes wrote: > > > On Sat, Mar 25, 2023 at 1:09 AM Mike Rapoport wrote: > > > > > > > > From: "Mike Rapoport (IBM)" > > > > > > > > It is not a good idea to change fundamental parameters of core memory > > > > management. Having predefined ranges suggests that the values within > > > > those ranges are sensible, but one has to *really* understand > > > > implications of changing MAX_ORDER before actually amending it and > > > > ranges don't help here. > > > > > > > > Drop ranges in definition of ARCH_FORCE_MAX_ORDER and make its prompt > > > > visible only if EXPERT=y > > > > > > I do not like suddenly hiding this behind EXPERT for a couple of > > > reasons. Most importantly, it will silently change the config for > > > users building with an old kernel config. If a user has for instance > > > "13" set and building with 4K pages, as is the current configuration > > > for Fedora and RHEL aarch64 builds, an oldconfig build will now set it > > > to 10 with no indication that it is doing so. And while I think that > > > 10 is a fine default for many aarch64 users, there are valid reasons > > > for choosing other values. Putting this behind expert makes it much > > > less obvious that this is an option. > > > > That's the idea of EXPERT, no? > > > > This option was intended to allow allocation of huge pages for > > architectures that had PMD_ORDER > MAX_ORDER and not to allow user to > > select size of maximal physically contiguous allocation. > > > > Changes to MAX_ORDER fundamentally change the behaviour of core mm and > > unless users *really* know what they are doing there is no reason to choose > > non-default values so hiding this option behind EXPERT seems totally > > appropriate to me. > > It sounds nice in theory. In practice. EXPERT hides too much. When you > flip expert, you expose over a 175ish new config options which are > hidden behind EXPERT. You don't have to know what you are doing just > with the MAX_ORDER, but a whole bunch more as well. If everyone were > already running 10, this might be less of a problem. At least Fedora > and RHEL are running 13 for 4K pages on aarch64. This was not some > accidental choice, we had to carry a patch to even allow it for a > while. If this does go in as is, we will likely just carry a patch to > remove the "if EXPERT", but that is a bit of a disservice to users who > might be trying to debug something else upstream, bisecting upstream > kernels or testing a patch. In those cases, people tend to use > pristine upstream sources without distro patches to verify, and they > tend to use their existing configs. With this change, their MAX_ORDER > will drop to 10 from 13 silently. That can look like a different > issue enough to ruin a bisect or have them give bad feedback on a > patch because it introduces a "regression" which is not a regression > at all, but a config change they couldn't see. If we remove EXPERT (as prior to this patch), I'd rather keep the ranges and avoid having to explain to people why some random MAX_ORDER doesn't build (keeping the range would also make sense for randconfig, not sure we got to any conclusion there). -- Catalin
Re: [PATCH 0/6] bus: fsl-mc: Make remove function return void
Hello, On Fri, Mar 10, 2023 at 11:41:22PM +0100, Uwe Kleine-König wrote: > Hello, > > many bus remove functions return an integer which is a historic > misdesign that makes driver authors assume that there is some kind of > error handling in the upper layers. This is wrong however and returning > and error code only yields an error message. > > This series improves the fsl-mc bus by changing the remove callback to > return no value instead. As a preparation all drivers are changed to > return zero before so that they don't trigger the error message. Who is supposed to pick up this patch series (or point out a good reason for not taking it)? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | https://www.pengutronix.de/ | signature.asc Description: PGP signature
Re: [PATCH v13 07/15] phy: fsl: Add Lynx 10G SerDes driver
On 11-04-23, 14:43, Sean Anderson wrote: > This adds support for the Lynx 10G "SerDes" devices found on various NXP > QorIQ SoCs. There may be up to four SerDes devices on each SoC, each > supporting up to eight lanes. Protocol support for each SerDes is highly > heterogeneous, with each SoC typically having a totally different > selection of supported protocols for each lane. Additionally, the SerDes > devices on each SoC also have differing support. One SerDes will > typically support Ethernet on most lanes, while the other will typically > support PCIe on most lanes. > > There is wide hardware support for this SerDes. It is present on QorIQ > T-Series and Layerscape processors. Because each SoC typically has > specific instructions and exceptions for its SerDes, I have limited the > initial scope of this module to just the LS1046A and LS1088A. > Additionally, I have only added support for Ethernet protocols. There is > not a great need for dynamic reconfiguration for other protocols (except > perhaps for M.2 cards), so support for them may never be added. > > Nevertheless, I have tried to provide an obvious path for adding support > for other SoCs as well as other protocols. SATA just needs support for > configuring LNmSSCR0. PCIe may need to configure the equalization > registers. It also uses multiple lanes. I have tried to write the driver > with multi-lane support in mind, so there should not need to be any > large changes. Although there are 6 protocols supported, I have only > tested SGMII and XFI. The rest have been implemented as described in > the datasheet. Most of these protocols should work "as-is", but > 10GBASE-KR will need PCS support for link training. > > Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all > configured for PCIe, this driver uses one phy configured to use 4 lanes. > This is because while the individual lanes may be configured > individually, the protocol selection acts on all lanes at once. > Additionally, the order which lanes should be configured in is specified > by the datasheet. To coordinate this, lanes are reserved in phy_init, > and released in phy_exit. > > This driver was written with reference to the LS1046A reference manual. > However, it was informed by reference manuals for all processors with > mEMACs, especially the T4240 (which appears to have a "maxed-out" > configuration). The earlier P-series processors appear to be similar, but > have a different overall register layout (using "banks" instead of > separate SerDes). Perhaps this those use a "5G Lynx SerDes." > > Signed-off-by: Sean Anderson > --- > > (no changes since v10) > > Changes in v10: > - Fix debugging print with incorrect error variable > > Changes in v9: > - Split off clock "driver" into its own patch to allow for better > review. > - Add ability to defer lane initialization to phy_init. This allows > for easier transitioning between firmware-managed serdes and Linux- > managed serdes, as the consumer (such as dpaa2, which knows what the > firmware is doing) has the last say on who gets control. > - phy-type -> fsl,phy > > Changes in v8: > - Remove unused variable from lynx_ls_mode_init > > Changes in v7: > - Break out call order into generic documentation > - Refuse to switch "major" protocols > - Update Kconfig to reflect restrictions > - Remove set/clear of "pcs reset" bit, since it doesn't seem to fix > anything. > > Changes in v6: > - Update MAINTAINERS to include new files > - Include bitfield.h and slab.h to allow compilation on non-arm64 > arches. > - Depend on COMMON_CLK and either layerscape/ppc > > Changes in v5: > - Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this > series to be applied directly to linux/master. > - Add fsl,lynx-10g.h to MAINTAINERS > > Changes in v4: > - Rework all debug statements to remove use of __func__. Additional > information has been provided as necessary. > - Consider alternative parent rates in round_rate and not in set_rate. > Trying to modify out parent's rate in set_rate will deadlock. > - Explicitly perform a stop/reset sequence in set_rate. This way we > always ensure that the PLL is properly stopped. > - Set the power-down bit when disabling the PLL. We can do this now that > enable/disable aren't abused during the set rate sequence. > - Fix typos in QSGMII_OFFSET and XFI_OFFSET > - Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better > reflect its function (adding post-cursor equalization). > - Use of_clk_hw_onecell_get instead of a custom function. > - Return struct clks from lynx_clks_init instead of embedding lynx_clk > in lynx_priv. > - Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs > primarily in the layout and offset of the PCCRs. This will help bring a > cleaner abstraction layer. The caps have been removed, since this handles > the > only current usage. > - Convert to use new binding format. As a result of this, we no
Re: [PATCH] powerpc/mpc52xx_lpbfifo: Drop unused functions
On Wed, Dec 28, 2022 at 03:51:29PM +0100, Uwe Kleine-König wrote: > The four exported functions mpc52xx_lpbfifo_submit(), > mpc52xx_lpbfifo_abort(), mpc52xx_lpbfifo_poll(), and > mpc52xx_lpbfifo_start_xfer() are not used. So they can be dropped and the > definitions needed to call them can be moved into the driver file. > > Signed-off-by: Uwe Kleine-König I never got feedback about this driver and it has not appeared in next up to now. Did it fell through the cracks? Best regards Uwe > --- > arch/powerpc/include/asm/mpc52xx.h| 41 -- > arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c | 134 +- > 2 files changed, 33 insertions(+), 142 deletions(-) > > diff --git a/arch/powerpc/include/asm/mpc52xx.h > b/arch/powerpc/include/asm/mpc52xx.h > index 5ea16a71c2f0..01ae6c351e50 100644 > --- a/arch/powerpc/include/asm/mpc52xx.h > +++ b/arch/powerpc/include/asm/mpc52xx.h > @@ -285,47 +285,6 @@ extern int mpc52xx_gpt_start_timer(struct > mpc52xx_gpt_priv *gpt, u64 period, > extern u64 mpc52xx_gpt_timer_period(struct mpc52xx_gpt_priv *gpt); > extern int mpc52xx_gpt_stop_timer(struct mpc52xx_gpt_priv *gpt); > > -/* mpc52xx_lpbfifo.c */ > -#define MPC52XX_LPBFIFO_FLAG_READ(0) > -#define MPC52XX_LPBFIFO_FLAG_WRITE (1<<0) > -#define MPC52XX_LPBFIFO_FLAG_NO_INCREMENT(1<<1) > -#define MPC52XX_LPBFIFO_FLAG_NO_DMA (1<<2) > -#define MPC52XX_LPBFIFO_FLAG_POLL_DMA(1<<3) > - > -struct mpc52xx_lpbfifo_request { > - struct list_head list; > - > - /* localplus bus address */ > - unsigned int cs; > - size_t offset; > - > - /* Memory address */ > - void *data; > - phys_addr_t data_phys; > - > - /* Details of transfer */ > - size_t size; > - size_t pos; /* current position of transfer */ > - int flags; > - int defer_xfer_start; > - > - /* What to do when finished */ > - void (*callback)(struct mpc52xx_lpbfifo_request *); > - > - void *priv; /* Driver private data */ > - > - /* statistics */ > - int irq_count; > - int irq_ticks; > - u8 last_byte; > - int buffer_not_done_cnt; > -}; > - > -extern int mpc52xx_lpbfifo_submit(struct mpc52xx_lpbfifo_request *req); > -extern void mpc52xx_lpbfifo_abort(struct mpc52xx_lpbfifo_request *req); > -extern void mpc52xx_lpbfifo_poll(void); > -extern int mpc52xx_lpbfifo_start_xfer(struct mpc52xx_lpbfifo_request *req); > - > /* mpc52xx_pic.c */ > extern void mpc52xx_init_irq(void); > extern unsigned int mpc52xx_get_irq(void); > diff --git a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c > b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c > index 6d1dd6e87478..32fd1345ffeb 100644 > --- a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c > +++ b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c > @@ -38,6 +38,39 @@ MODULE_LICENSE("GPL"); > #define LPBFIFO_REG_FIFO_CONTROL (0x48) > #define LPBFIFO_REG_FIFO_ALARM (0x4C) > > +#define MPC52XX_LPBFIFO_FLAG_WRITE (1<<0) > +#define MPC52XX_LPBFIFO_FLAG_NO_DMA (1<<2) > +#define MPC52XX_LPBFIFO_FLAG_POLL_DMA(1<<3) > + > +struct mpc52xx_lpbfifo_request { > + struct list_head list; > + > + /* localplus bus address */ > + unsigned int cs; > + size_t offset; > + > + /* Memory address */ > + void *data; > + phys_addr_t data_phys; > + > + /* Details of transfer */ > + size_t size; > + size_t pos; /* current position of transfer */ > + int flags; > + int defer_xfer_start; > + > + /* What to do when finished */ > + void (*callback)(struct mpc52xx_lpbfifo_request *); > + > + void *priv; /* Driver private data */ > + > + /* statistics */ > + int irq_count; > + int irq_ticks; > + u8 last_byte; > + int buffer_not_done_cnt; > +}; > + > struct mpc52xx_lpbfifo { > struct device *dev; > phys_addr_t regs_phys; > @@ -381,107 +414,6 @@ static irqreturn_t mpc52xx_lpbfifo_bcom_irq(int irq, > void *dev_id) > return IRQ_HANDLED; > } > > -/** > - * mpc52xx_lpbfifo_poll - Poll for DMA completion > - */ > -void mpc52xx_lpbfifo_poll(void) > -{ > - struct mpc52xx_lpbfifo_request *req = lpbfifo.req; > - int dma = !(req->flags & MPC52XX_LPBFIFO_FLAG_NO_DMA); > - int write = req->flags & MPC52XX_LPBFIFO_FLAG_WRITE; > - > - /* > - * For more information, see comments on the "Fat Lady" > - */ > - if (dma && write) > - mpc52xx_lpbfifo_irq(0, NULL); > - else > - mpc52xx_lpbfifo_bcom_irq(0, NULL); > -} > -EXPORT_SYMBOL(mpc52xx_lpbfifo_poll); > - > -/** > - * mpc52xx_lpbfifo_submit - Submit an LPB FIFO transfer request. > - * @req: Pointer to request structure > - * > - * Return: %0 on success, -errno code on error > - */ > -int mpc52xx_lpbfifo_submit(struct mpc52xx_lpbfifo_request *req) > -{ > - unsigned long flags; > - > - if (!lpbfifo.regs) > - return -ENODEV; > - >
Re: [PATCH v2 3/5] locking/arch: Wire up local_try_cmpxchg
On Wed, Apr 12, 2023 at 03:37:50PM +0200, Uros Bizjak wrote: > On Wed, Apr 12, 2023 at 1:33 PM Peter Zijlstra wrote: > > > > On Wed, Apr 05, 2023 at 04:17:08PM +0200, Uros Bizjak wrote: > > > diff --git a/arch/powerpc/include/asm/local.h > > > b/arch/powerpc/include/asm/local.h > > > index bc4bd19b7fc2..45492fb5bf22 100644 > > > --- a/arch/powerpc/include/asm/local.h > > > +++ b/arch/powerpc/include/asm/local.h > > > @@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long > > > o, long n) > > > return t; > > > } > > > > > > +static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n) > > > +{ > > > + long o = *po, r; > > > + > > > + r = local_cmpxchg(l, o, n); > > > + if (unlikely(r != o)) > > > + *po = r; > > > + > > > + return likely(r == o); > > > +} > > > + > > > > Why is the ppc one different from the rest? Why can't it use the > > try_cmpxchg_local() fallback and needs to have it open-coded? > > Please note that ppc directly defines local_cmpxchg that bypasses > cmpxchg_local/arch_cmpxchg_local machinery. The patch takes the same > approach for local_try_cmpxchg, because fallbacks are using > arch_cmpxchg_local definitions. > > PPC should be converted to use arch_cmpxchg_local (to also enable > instrumentation), but this is not the scope of the proposed patchset. Ah indeed. Thanks!
Re: [PATCH v2 3/5] locking/arch: Wire up local_try_cmpxchg
On Wed, Apr 12, 2023 at 1:33 PM Peter Zijlstra wrote: > > On Wed, Apr 05, 2023 at 04:17:08PM +0200, Uros Bizjak wrote: > > diff --git a/arch/powerpc/include/asm/local.h > > b/arch/powerpc/include/asm/local.h > > index bc4bd19b7fc2..45492fb5bf22 100644 > > --- a/arch/powerpc/include/asm/local.h > > +++ b/arch/powerpc/include/asm/local.h > > @@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long o, > > long n) > > return t; > > } > > > > +static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n) > > +{ > > + long o = *po, r; > > + > > + r = local_cmpxchg(l, o, n); > > + if (unlikely(r != o)) > > + *po = r; > > + > > + return likely(r == o); > > +} > > + > > Why is the ppc one different from the rest? Why can't it use the > try_cmpxchg_local() fallback and needs to have it open-coded? Please note that ppc directly defines local_cmpxchg that bypasses cmpxchg_local/arch_cmpxchg_local machinery. The patch takes the same approach for local_try_cmpxchg, because fallbacks are using arch_cmpxchg_local definitions. PPC should be converted to use arch_cmpxchg_local (to also enable instrumentation), but this is not the scope of the proposed patchset. Uros.
Re: [PATCH] powerpc/64: Always build with 128-bit long double
Hamza Mahfooz writes: > On 4/4/23 06:28, Michael Ellerman wrote: >> The amdgpu driver builds some of its code with hard-float enabled, >> whereas the rest of the kernel is built with soft-float. >> >> When building with 64-bit long double, if soft-float and hard-float >> objects are linked together, the build fails due to incompatible ABI >> tags. >> >> In the past there have been build errors in the amdgpu driver caused by >> this, some of those were due to bad intermingling of soft & hard-float >> code, but those issues have now all been fixed since commit c92b7fe0d92a >> ("drm/amd/display: move remaining FPU code to dml folder"). >> >> However it's still possible for soft & hard-float objects to end up >> linked together, if the amdgpu driver is built-in to the kernel along >> with the test_emulate_step.c code, which uses soft-float. That happens >> in an allyesconfig build. >> >> Currently those build errors are avoided because the amdgpu driver is >> gated on 128-bit long double being enabled. But that's not a detail the >> amdgpu driver should need to be aware of, and if another driver starts >> using hard-float the same problem would occur. >> >> All versions of the 64-bit ABI specify that long-double is 128-bits. >> However some compilers, notably the kernel.org ones, are built to use >> 64-bit long double by default. >> >> Apart from this issue of soft vs hard-float, the kernel doesn't care >> what size long double is. In particular the kernel using 128-bit long >> double doesn't impact userspace's ability to use 64-bit long double, as >> musl does. >> >> So always build the 64-bit kernel with 128-bit long double. That should >> avoid any build errors due to the incompatible ABI tags. Excluding the >> code that uses soft/hard-float, the vmlinux is identical with/without >> the flag. >> >> It does mean any code which is incorrectly intermingling soft & >> hard-float code will build without error, so those bugs will need to be >> caught by testing rather than at build time. >> >> For more background see: >>- commit d11219ad53dc ("amdgpu: disable powerpc support for the newer >> display engine") >>- commit c653c591789b ("drm/amdgpu: Re-enable DCN for 64-bit powerpc") >>- >> https://lore.kernel.org/r/dab9cbd8-2626-4b99-8098-31fe76397...@app.fastmail.com >> >> Signed-off-by: Michael Ellerman > > Reviewed-by: Hamza Mahfooz Thanks. > If you'd prefer to have this go through the amdgpu branch, please let > me know. I think it makes more sense to go via the powerpc tree, it will get more testing on powerpc that way. cheers
Re: [PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()
Deming Wang writes: > memalign() is obsolete according to its manpage. > > Replace memalign() with posix_memalign() and remove malloc.h include > that was there for memalign(). > > As a pointer is passed into posix_memalign(), initialize *s to NULL > to silence a warning about the function's return value being used as > uninitialized (which is not valid anyway because the error is properly > checked before p is returned). The patch doesn't do that. There is no p? I think you've copied the change log for a whole bunch of commits but not updated them to be accurate for each change? cheers > diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c > b/tools/testing/selftests/powerpc/stringloops/strlen.c > index 9055ebc484d0..f9c1f9cc2d32 100644 > --- a/tools/testing/selftests/powerpc/stringloops/strlen.c > +++ b/tools/testing/selftests/powerpc/stringloops/strlen.c > @@ -1,5 +1,4 @@ > // SPDX-License-Identifier: GPL-2.0 > -#include > #include > #include > #include > @@ -51,10 +50,11 @@ static void bench_test(char *s) > static int testcase(void) > { > char *s; > + int ret; > unsigned long i; > > - s = memalign(128, SIZE); > - if (!s) { > + ret = posix_memalign((void **), 128, SIZE); > + if (ret < 0) { > perror("memalign"); > exit(1); > } > -- > 2.27.0
Re: [PATCH v2 3/5] locking/arch: Wire up local_try_cmpxchg
On Wed, Apr 05, 2023 at 04:17:08PM +0200, Uros Bizjak wrote: > diff --git a/arch/powerpc/include/asm/local.h > b/arch/powerpc/include/asm/local.h > index bc4bd19b7fc2..45492fb5bf22 100644 > --- a/arch/powerpc/include/asm/local.h > +++ b/arch/powerpc/include/asm/local.h > @@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long o, > long n) > return t; > } > > +static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n) > +{ > + long o = *po, r; > + > + r = local_cmpxchg(l, o, n); > + if (unlikely(r != o)) > + *po = r; > + > + return likely(r == o); > +} > + Why is the ppc one different from the rest? Why can't it use the try_cmpxchg_local() fallback and needs to have it open-coded?
[PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()
memalign() is obsolete according to its manpage. Replace memalign() with posix_memalign() and remove malloc.h include that was there for memalign(). As a pointer is passed into posix_memalign(), initialize *s to NULL to silence a warning about the function's return value being used as uninitialized (which is not valid anyway because the error is properly checked before p is returned). Signed-off-by: Deming Wang --- tools/testing/selftests/powerpc/stringloops/strlen.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c b/tools/testing/selftests/powerpc/stringloops/strlen.c index 9055ebc484d0..f9c1f9cc2d32 100644 --- a/tools/testing/selftests/powerpc/stringloops/strlen.c +++ b/tools/testing/selftests/powerpc/stringloops/strlen.c @@ -1,5 +1,4 @@ // SPDX-License-Identifier: GPL-2.0 -#include #include #include #include @@ -51,10 +50,11 @@ static void bench_test(char *s) static int testcase(void) { char *s; + int ret; unsigned long i; - s = memalign(128, SIZE); - if (!s) { + ret = posix_memalign((void **), 128, SIZE); + if (ret < 0) { perror("memalign"); exit(1); } -- 2.27.0
Re: [PATCH v13 03/15] dt-bindings: Convert gpio-mmio to yaml
On 11/04/2023 20:43, Sean Anderson wrote: > This is a generic binding for simple MMIO GPIO controllers. Although we > have a single driver for these controllers, they were previously spread > over several files. Consolidate them. The register descriptions are > adapted from the comments in the source. There is no set order for the > registers, and some registers may be omitted. Because of this, reg-names > is mandatory, and no order is specified. > > Rename brcm,bcm6345-gpio to brcm,bcm63xx-gpio to reflect that bcm6345 > has moved. > > Signed-off-by: Sean Anderson > Reviewed-by: Linus Walleij > --- > Linus or Bartosz, feel free to pick this up as the rest of this series > may not be merged any time soon. > > Changes in v13: > - Fix references to brcm,bcm63xx-gpio.yaml (neé brcm,bcm6345-gpio) You got some of the same errors as last time. Test your patches before sending. Best regards, Krzysztof
Re: [PATCH] KVM: PPC: BOOK3S: book3s_hv_nested.c: improve branch prediction for k.alloc
Hi, On 2023-04-11 16:35:10, Michael Ellerman wrote: > Kautuk Consul writes: > > On 2023-04-07 09:01:29, Sean Christopherson wrote: > >> On Fri, Apr 07, 2023, Bagas Sanjaya wrote: > >> > On Fri, Apr 07, 2023 at 05:31:47AM -0400, Kautuk Consul wrote: > >> > > I used the unlikely() macro on the return values of the k.alloc > >> > > calls and found that it changes the code generation a bit. > >> > > Optimize all return paths of k.alloc calls by improving > >> > > branch prediction on return value of k.alloc. > >> > >> Nit, this is improving code generation, not branch prediction. > > Sorry my mistake. > >> > >> > What about below? > >> > > >> > "Improve branch prediction on kmalloc() and kzalloc() call by using > >> > unlikely() macro to optimize their return paths." > >> > >> Another nit, using unlikely() doesn't necessarily provide a measurable > >> optimization. > >> As above, it does often improve code generation for the happy path, but > >> that doesn't > >> always equate to improved performance, e.g. if the CPU can easily predict > >> the branch > >> and/or there is no impact on the cache footprint. > > > I see. I will submit a v2 of the patch with a better and more accurate > > description. Does anyone else have any comments before I do so ? > > In general I think unlikely should be saved for cases where either the > compiler is generating terrible code, or the likelyness of the condition > might be surprising to a human reader. > > eg. if you had some code that does a NULL check and it's *expected* that > the value is NULL, then wrapping that check in likely() actually adds > information for a human reader. > > Also please don't use unlikely in init paths or other cold paths, it > clutters the code (only slightly but a little) and that's not worth the > possible tiny benefit for code that only runs once or infrequently. > > I would expect the compilers to do the right thing in all > these cases without the unlikely. But if you can demonstrate that they > meaningfully improve the code generation with a before/after > dissassembly then I'd be interested. Just FYI, the last email by kautuk.consul...@gmail.com was by me. That last email contains a diff file attachment which compares 2 files: before my changes and after my changes. This diff file shows a lot of changes in code generation. Im assuming all those changes are made by the compiler towards optimizing all return paths to k.alloc calls. Kindly review and comment. > cheers
Re: [PATCH v2 15/19] arch/powerpc: Implement with generic helpers
Thomas Zimmermann writes: > Replace the architecture's fb_is_primary_device() with the generic > one from . No functional changes. > > Signed-off-by: Thomas Zimmermann > Cc: Michael Ellerman > Cc: Nicholas Piggin > Cc: Christophe Leroy > --- > arch/powerpc/include/asm/fb.h | 8 +++- > 1 file changed, 3 insertions(+), 5 deletions(-) Looks fine. Acked-by: Michael Ellerman (powerpc) cheers > diff --git a/arch/powerpc/include/asm/fb.h b/arch/powerpc/include/asm/fb.h > index 6541ab77c5b9..5f1a2e5f7654 100644 > --- a/arch/powerpc/include/asm/fb.h > +++ b/arch/powerpc/include/asm/fb.h > @@ -2,8 +2,8 @@ > #ifndef _ASM_FB_H_ > #define _ASM_FB_H_ > > -#include > #include > + > #include > > static inline void fb_pgprotect(struct file *file, struct vm_area_struct > *vma, > @@ -13,10 +13,8 @@ static inline void fb_pgprotect(struct file *file, struct > vm_area_struct *vma, >vma->vm_end - vma->vm_start, >vma->vm_page_prot); > } > +#define fb_pgprotect fb_pgprotect > > -static inline int fb_is_primary_device(struct fb_info *info) > -{ > - return 0; > -} > +#include > > #endif /* _ASM_FB_H_ */ > -- > 2.40.0