RE: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent
From: Christophe Leroy > Sent: 11 December 2020 15:22 > > Le 11/12/2020 à 09:52, Xu Wang a écrit : > > Remove casting the values returned by dma_alloc_coherent. > > Can you explain more in the commit log ? > > As far as I can see, dma_alloc_coherent() doesn't return __iomem, and > ring_base member is __iomem Which is probably wrong - that is the kernel address of kernel memory. So it shouldn't have the __iomem marker. I wonder what else is wrong David > > Christophe > > > > > Signed-off-by: Xu Wang > > --- > > drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > > index 99fe2c210d0f..3ae345676e50 100644 > > --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > > +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > > @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev) > > struct fs_enet_private *fep = netdev_priv(dev); > > const struct fs_platform_info *fpi = fep->fpi; > > > > - fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev, > > + fep->ring_base = dma_alloc_coherent(fep->dev, > > (fpi->tx_ring + fpi->rx_ring) * > > sizeof(cbd_t), &fep->ring_mem_addr, > > GFP_KERNEL); > > - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
RE: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent
From: Christophe Leroy > Sent: 11 December 2020 16:43 > > Le 11/12/2020 à 17:07, David Laight a écrit : > > From: Christophe Leroy > >> Sent: 11 December 2020 15:22 > >> > >> Le 11/12/2020 à 09:52, Xu Wang a écrit : > >>> Remove casting the values returned by dma_alloc_coherent. > >> > >> Can you explain more in the commit log ? > >> > >> As far as I can see, dma_alloc_coherent() doesn't return __iomem, and > >> ring_base member is __iomem > > > > Which is probably wrong - that is the kernel address of kernel memory. > > So it shouldn't have the __iomem marker. > > That's where the buffer descriptors are, the driver accesses to the content > of the buffer > descriptors using the IO accessors in_be16()/out_be16(). Is it not correct ? I've just been looking at the crap in there. My understanding is that IO accessors are for IO devices (eg addresses from io_remap() etc). Buffers allocated by dma_alloc_coherent() are normal kernel memory and don't need any accessors. Now you might need some barriers - mostly because an ethernet chip can typically read a ring entry without being prodded. IIRC there is a barrier in writel() to ensure the dma master will 'see' all memory writes done before the IO write that kicks it into doing some processing. The fact that the driver contains so many __iomem casts (eg in tx_restart) is an indication that something is badly awry. __iomem exists to check you are using the correct type of pointer. Any __iomem casts are dubious. David > > Christophe > > > > > I wonder what else is wrong > > > > David > > > >> > >> Christophe > >> > >>> > >>> Signed-off-by: Xu Wang > >>> --- > >>>drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +- > >>>1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > >> b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > >>> index 99fe2c210d0f..3ae345676e50 100644 > >>> --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > >>> +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c > >>> @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev) > >>> struct fs_enet_private *fep = netdev_priv(dev); > >>> const struct fs_platform_info *fpi = fep->fpi; > >>> > >>> - fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev, > >>> + fep->ring_base = dma_alloc_coherent(fep->dev, > >>> (fpi->tx_ring + > >>> fpi->rx_ring) * > >>> sizeof(cbd_t), > >>> &fep->ring_mem_addr, > >>> GFP_KERNEL); > >>> > > > > - > > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 > > 1PT, UK > > Registration No: 1397386 (Wales) > > - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K
On Fri, Dec 11, 2020 at 08:37:40AM -0600, Rob Herring wrote: > On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson > wrote: > > > > I have been chasing down a problem enumerating an NVMe drive on a > > Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate > > successfully if the we are emitting lots of console messages via a UART. > > If the system is booted with `quiet` set then enumeration fails. > > We really need to get away from work-arounds for device X on host Y. I > suspect they are not limited to that combination only... No objection on that. This patch was essentially sharing the result of an investigation where I got stuck at the "now fix it properly" stage! > How exactly does it fail to enumerate? There's a (racy) linkup check > on config accesses, is it reporting link down and skipping config > accesses? In dmesg terms it looked like this: --- nvme_borked_gpu_working.linted.dmesg2020-11-18 15:28:35.544118213 + +++ nvme_working_gpu_working.linted.dmesg 2020-11-18 15:28:56.180076946 + @@ -241,11 +241,19 @@ pci :00:00.0: reg 0x38: [mem 0x904800-0x9048ff pref] pci :00:00.0: supports D1 D2 pci :00:00.0: PME# supported from D0 D1 D2 D3hot +pci :01:00.0: [15b7:5009] type 00 class 0x010802 +pci :01:00.0: reg 0x10: [mem 0x904900-0x9049003fff 64bit] +pci :01:00.0: reg 0x20: [mem 0x9049004000-0x90490040ff 64bit] pci :00:00.0: BAR 1: assigned [mem 0x904000-0x9043ff] pci :00:00.0: BAR 0: assigned [mem 0x904400-0x9044ff] pci :00:00.0: BAR 6: assigned [mem 0x904500-0x9045ff pref] +pci :00:00.0: BAR 14: assigned [mem 0x904600-0x90460f] +pci :01:00.0: BAR 0: assigned [mem 0x904600-0x9046003fff 64bit] +pci :01:00.0: BAR 4: assigned [mem 0x9046004000-0x90460040ff 64bit] pci :00:00.0: PCI bridge to [bus 01-ff] +pci :00:00.0: bridge window [mem 0x904600-0x90460f] pci :00:00.0: Max Payload Size set to 256/ 256 (was 128), Max Read Rq 256 +pci :01:00.0: Max Payload Size set to 256/ 512 (was 128), Max Read Rq 256 layerscape-pcie 380.pcie: host bridge /soc/pcie@380 ranges: layerscape-pcie 380.pcie: MEM 0xa04000..0xa07fff -> 0x004000 layerscape-pcie 380.pcie: PCI host bridge to bus 0001:00 ... and be aware that the last three lines here are another PCIe controller coming up just fine and it is about to detect the GPU (which like the NVMe is also gen3) just fine whether or not we add a short delay. > > I guessed this would be due to the timing impact of printk-to-UART and > > tried to find out where a delay could be added to provoke a successful > > enumeration. > > > > This patch contains the results. The delay length (1ms) was selected by > > binary searching downwards until the delay was not effective for these > > devices (Honeycomb LX2K and a Western Digital WD Blue SN550). > > > > I have also included the workaround twice (conditionally compiled). The > > first change is the *latest* possible code path that we can deploy a > > delay whilst the second is the earliest place I could find. > > > > The summary is that the critical window were we are currently relying on > > a call to the console UART code can "mend" the driver runs from calling > > dw_pcie_setup_rc() in host init to just before we read the state in the > > link up callback. > > > > Signed-off-by: Daniel Thompson > > --- > > > > Notes: > > This patch is RFC (and HACK) because I don't have much clue *why* this > > patch works... merely that this is the smallest possible change I need > > to replicate the current accidental printk() workaround. Perhaps one > > could argue that RFC here stands for request-for-clue. All my > > observations and changes here are empirical and I don't know how best to > > turn them into something that is not a hack! > > > > BTW I noticed many other pcie-designware drivers take advantage > > of a function called dw_pcie_wait_for_link() in their init paths... > > but my naive attempts to add it to the layerscape driver results > > in non-booting systems so I haven't embarrassed myself by including > > that in the patch! > > You need to look at what's pending for v5.11, because I reworked this > to be more unified. The ordering of init is also possibly changed. The > sequence is now like this: > > dw_pcie_setup_rc(pp); > dw_pcie_msi_init(pp); > > if (!dw_pcie_link_up(pci) && pci->ops->start_link) { > ret = pci->ops->start_link(pci); > if (ret) > goto err_free_msi; > } > > /* Ignore errors, the link may come up later */ > dw_pcie_wait_for_link(pci); Thanks. That looks likely to fix it since IIUC dw_pcie_wait_for_link() will end up waiting somewhat like the double check I added to ls_pcie_link_up(). I'll take a look at let you know. Daniel.
Re: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent
Le 11/12/2020 à 17:07, David Laight a écrit : From: Christophe Leroy Sent: 11 December 2020 15:22 Le 11/12/2020 à 09:52, Xu Wang a écrit : Remove casting the values returned by dma_alloc_coherent. Can you explain more in the commit log ? As far as I can see, dma_alloc_coherent() doesn't return __iomem, and ring_base member is __iomem Which is probably wrong - that is the kernel address of kernel memory. So it shouldn't have the __iomem marker. That's where the buffer descriptors are, the driver accesses to the content of the buffer descriptors using the IO accessors in_be16()/out_be16(). Is it not correct ? Christophe I wonder what else is wrong David Christophe Signed-off-by: Xu Wang --- drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c index 99fe2c210d0f..3ae345676e50 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev) struct fs_enet_private *fep = netdev_priv(dev); const struct fs_platform_info *fpi = fep->fpi; - fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev, + fep->ring_base = dma_alloc_coherent(fep->dev, (fpi->tx_ring + fpi->rx_ring) * sizeof(cbd_t), &fep->ring_mem_addr, GFP_KERNEL); - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH] powerpc/mm: Refactor the floor/ceiling check in hugetlb range freeing functions
ange() after range_is_outside_limits(), but after this patch, "start" is not longer has the bitmask, i.e., "no &=". Anyway, reverting this commit from today's linux-next fixed a crash on POWE9 NV. # runltp -f hugetlb [ 7703.114640][T58070] LTP: starting hugemmap05_1 (hugemmap05 -m) [ 7703.157792][ C99] [ cut here ] [ 7703.158279][ C99] kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:387! [ 7703.158306][ C99] Oops: Exception in kernel mode, sig: 5 [#1] [ 7703.158330][ C99] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 NUMA PowerNV [ 7703.158343][ C99] Modules linked in: vfio_pci vfio_virqfd vfio_iommu_spapr_tce vfio vfio_spapr_eeh loop kvm_hv kvm ip_tables x_tables sd_mod ahci libahci tg3 libata firmware_class libphy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_del_mod] [ 7703.158435][ C99] CPU: 99 PID: 308 Comm: ksoftirqd/99 Tainted: G O 5.10.0-rc7-next-20201211 #1 [ 7703.158464][ C99] NIP: c005dbec LR: c03352f4 CTR: [ 7703.158489][ C99] REGS: c0002bb6f830 TRAP: 0700 Tainted: G O (5.10.0-rc7-next-20201211) [ 7703.158528][ C99] MSR: 9282b033 CR: 24002284 XER: 2004 [ 7703.158570][ C99] GPR00: c03352f4 c0002bb6fad0 c7f70b00 c0002000385b3ff0 [ 7703.158570][ C99] GPR04: 0003 c0002bb6f8b4 0001 [ 7703.158570][ C99] GPR08: 0001 0009 0008 0002 [ 7703.158570][ C99] GPR12: 24002488 c000201fff649c00 c7f2a20c [ 7703.158570][ C99] GPR16: 0007 c0194d10 c0194d10 [ 7703.158570][ C99] GPR24: 0014 0015 c000201cc6e72398 c7fac4b4 [ 7703.158570][ C99] GPR28: c7f2bf80 c7fac2f8 0008 c00020003387 [ 7703.158766][ C99] NIP [c005dbec] __tlb_remove_table+0x1dc/0x1e0 pgtable_free at arch/powerpc/mm/book3s64/pgtable.c:387 (inlined by) __tlb_remove_table at arch/powerpc/mm/book3s64/pgtable.c:405 [ 7703.158805][ C99] LR [c03352f4] tlb_remove_table_rcu+0x54/0xa0 [ 7703.158853][ C99] Call Trace: [ 7703.158872][ C99] [c0002bb6fad0] [c005db4c] __tlb_remove_table+0x13c/0x1e0 (unreliable) [ 7703.158890][ C99] [c0002bb6fb00] [c03352f4] tlb_remove_table_rcu+0x54/0xa0 __tlb_remove_table_free at mm/mmu_gather.c:101 (inlined by) tlb_remove_table_rcu at mm/mmu_gather.c:156 [ 7703.158927][ C99] [c0002bb6fb30] [c0194d7c] rcu_core+0x35c/0xbb0 rcu_do_batch at kernel/rcu/tree.c:2502 (inlined by) rcu_core at kernel/rcu/tree.c:2737 [ 7703.158966][ C99] [c0002bb6fbf0] [c095a3d0] __do_softirq+0x480/0x704 [ 7703.159006][ C99] [c0002bb6fd10] [c00cc1f4] run_ksoftirqd+0x74/0xd0 run_ksoftirqd at kernel/softirq.c:651 (inlined by) run_ksoftirqd at kernel/softirq.c:642 [ 7703.159046][ C99] [c0002bb6fd30] [c01040c8] smpboot_thread_fn+0x278/0x320 [ 7703.159096][ C99] [c0002bb6fda0] [c00fc8a4] kthread+0x1c4/0x1d0 [ 7703.159145][ C99] [c0002bb6fe10] [c000d9fc] ret_from_kernel_thread+0x5c/0x80 [ 7703.159183][ C99] Instruction dump: [ 7703.159204][ C99] 6000 7c0802a6 3c82f8b4 7fe3fb78 38847470 f8010040 482b4fc5 6000 [ 7703.159248][ C99] 0fe0 7c0802a6 fbe10028 f8010040 <0fe0> 3c4c07f1 38422f10 7c0802a6 [ 7703.159293][ C99] ---[ end trace 1d92a5231ba6a0d5 ]---
[RFC PATCH v1] powerpc/net: Implement eBPF on PPC32
This first draft of eBPF for PPC32. This is still dirty WIP. Don't pay too much attention on comments, most of them are copied as is from PPC64 implementation. Test result: test_bpf: Summary: 378 PASSED, 0 FAILED, [332/366 JIT'ed] Registers mapping: [BPF_REG_0] = r21-r22 /* function arguments */ [BPF_REG_1] = r3-r4 [BPF_REG_2] = r5-r6 [BPF_REG_3] = r7-r8 [BPF_REG_4] = r9-r10 [BPF_REG_5] = r11-r12 /* non volatile registers */ [BPF_REG_6] = r23-r24 [BPF_REG_7] = r25-r26 [BPF_REG_8] = r27-r28 [BPF_REG_9] = r29-r30 /* frame pointer aka BPF_REG_10 */ [BPF_REG_FP] = r31 /* eBPF jit internal registers */ [BPF_REG_AX] = r19-r20 [TMP_REG] = r18 r0 is also used as temporary register as much as possible. It is referenced directly in the code in order to avoid misuse of it, as some instructions interpret it as value 0 instead of register r0. The following operations are not (or only partially) supported for the time being: case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */ case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */ case BPF_ALU64 | BPF_MOD | BPF_K: /* dst %= imm */ case BPF_ALU64 | BPF_DIV | BPF_K: /* dst /= imm */ case BPF_ALU64 | BPF_LSH | BPF_X: /* dst <<= src; */ case BPF_ALU64 | BPF_LSH | BPF_K: /* dst <<== imm */ case BPF_ALU64 | BPF_RSH | BPF_X: /* dst >>= src */ case BPF_ALU64 | BPF_RSH | BPF_K: /* dst >>= imm */ case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */ case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */ case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */ Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig |3 +- arch/powerpc/include/asm/ppc-opcode.h | 11 + arch/powerpc/net/Makefile |6 +- arch/powerpc/net/bpf_jit32.h | 173 ++-- arch/powerpc/net/bpf_jit_asm.S| 226 - arch/powerpc/net/bpf_jit_comp.c | 683 -- arch/powerpc/net/bpf_jit_comp32.c | 1177 + 7 files changed, 1249 insertions(+), 1030 deletions(-) delete mode 100644 arch/powerpc/net/bpf_jit_asm.S delete mode 100644 arch/powerpc/net/bpf_jit_comp.c create mode 100644 arch/powerpc/net/bpf_jit_comp32.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 9e679ba0811c..dd6ccd550230 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -190,7 +190,6 @@ config PPC select HAVE_ARCH_TRACEHOOK select HAVE_ASM_MODVERSIONS select HAVE_C_RECORDMCOUNT - select HAVE_CBPF_JITif !PPC64 select HAVE_STACKPROTECTOR if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13) select HAVE_STACKPROTECTOR if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2) select HAVE_CONTEXT_TRACKINGif PPC64 @@ -199,7 +198,7 @@ config PPC select HAVE_DEBUG_STACKOVERFLOW select HAVE_DYNAMIC_FTRACE select HAVE_DYNAMIC_FTRACE_WITH_REGSif MPROFILE_KERNEL - select HAVE_EBPF_JITif PPC64 + select HAVE_EBPF_JIT select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_FAST_GUP select HAVE_FTRACE_MCOUNT_RECORD diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index a6e3700c4566..0cefa1da9a1f 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -417,6 +417,7 @@ #define PPC_RAW_LD(r, base, i) (PPC_INST_LD | ___PPC_RT(r) | ___PPC_RA(base) | IMM_DS(i)) #define PPC_RAW_LWZ(r, base, i)(0x8000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i)) #define PPC_RAW_LWZX(t, a, b) (0x7c2e | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b)) +#define PPC_RAW_LMW(r, base, i)(0xb800 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i)) #define PPC_RAW_STD(r, base, i)(PPC_INST_STD | ___PPC_RS(r) | ___PPC_RA(base) | IMM_DS(i)) #define PPC_RAW_STDCX(s, a, b) (0x7c0001ad | ___PPC_RS(s) | ___PPC_RA(a) | ___PPC_RB(b)) #define PPC_RAW_LFSX(t, a, b) (0x7c00042e | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b)) @@ -425,6 +426,9 @@ #define PPC_RAW_STFDX(s, a, b) (0x7c0005ae | ___PPC_RS(s) | ___PPC_RA(a) | ___PPC_RB(b)) #define PPC_RAW_LVX(t, a, b) (0x7cce | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b)) #define PPC_RAW_STVX(s, a, b) (0x7c0001ce | ___PPC_RS(s) | ___PPC_RA(a) | ___PPC_RB(b)) +#define PPC_RAW_ADDE(t, a, b) (0x7c000114 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b)) +#define PPC_RAW_ADDZE(t, a)(0x7c000194
Re: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent
Le 11/12/2020 à 09:52, Xu Wang a écrit : Remove casting the values returned by dma_alloc_coherent. Can you explain more in the commit log ? As far as I can see, dma_alloc_coherent() doesn't return __iomem, and ring_base member is __iomem Christophe Signed-off-by: Xu Wang --- drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c index 99fe2c210d0f..3ae345676e50 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev) struct fs_enet_private *fep = netdev_priv(dev); const struct fs_platform_info *fpi = fep->fpi; - fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev, + fep->ring_base = dma_alloc_coherent(fep->dev, (fpi->tx_ring + fpi->rx_ring) * sizeof(cbd_t), &fep->ring_mem_addr, GFP_KERNEL);
[PATCH] powerpc/memhotplug: quieting some DLPAR operations
When attempting to remove by index a set of LMB a lot of messages are displayed on the console, even when everything goes fine: pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 802d Offlined Pages 4096 pseries-hotplug-mem: Memory at 2d000 was hot-removed The 2 messages prefixed by "pseries-hotplug-mem" are not really helpful for the end user, they should be debug outputs. In case of error, because some of the LMB's pages couldn't be offlined, the following is displayed on the console: pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 803e pseries-hotplug-mem: Failed to hot-remove memory at 3e000 dlpar: Could not handle DLPAR request "memory remove index 0x803e" Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless, and the generic DLPAR prefixed message should be enough. Turning the 2 firts at DEBUG level. These 2 first changes are mainly triggered by the changes introduced in drmgr: https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ Also, when adding a bunch of LMBs, a message is displayed in the console per LMB like these ones: pseries-hotplug-mem: Memory at 7e000 (drc index 807e) was hot-added pseries-hotplug-mem: Memory at 7f000 (drc index 807f) was hot-added pseries-hotplug-mem: Memory at 8 (drc index 8080) was hot-added pseries-hotplug-mem: Memory at 81000 (drc index 8081) was hot-added When adding 1TB of memory and LMB size is 256MB, this leads to 4096 messages to be displayed on the console. These messages are not really helpful for the end user, so moving them to the DEBUG level. Signed-off-by: Laurent Dufour --- arch/powerpc/platforms/pseries/hotplug-memory.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 7efe6ec5d14a..8377f1f7c78e 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -479,7 +479,7 @@ static int dlpar_memory_remove_by_index(u32 drc_index) int lmb_found; int rc; - pr_info("Attempting to hot-remove LMB, drc index %x\n", drc_index); + pr_debug("Attempting to hot-remove LMB, drc index %x\n", drc_index); lmb_found = 0; for_each_drmem_lmb(lmb) { @@ -497,10 +497,10 @@ static int dlpar_memory_remove_by_index(u32 drc_index) rc = -EINVAL; if (rc) - pr_info("Failed to hot-remove memory at %llx\n", - lmb->base_addr); + pr_debug("Failed to hot-remove memory at %llx\n", +lmb->base_addr); else - pr_info("Memory at %llx was hot-removed\n", lmb->base_addr); + pr_debug("Memory at %llx was hot-removed\n", lmb->base_addr); return rc; } @@ -717,8 +717,8 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add) if (!drmem_lmb_reserved(lmb)) continue; - pr_info("Memory at %llx (drc index %x) was hot-added\n", - lmb->base_addr, lmb->drc_index); + pr_debug("Memory at %llx (drc index %x) was hot-added\n", +lmb->base_addr, lmb->drc_index); drmem_remove_lmb_reservation(lmb); } rc = 0; -- 2.29.2
Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K
On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson wrote: > > I have been chasing down a problem enumerating an NVMe drive on a > Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate > successfully if the we are emitting lots of console messages via a UART. > If the system is booted with `quiet` set then enumeration fails. We really need to get away from work-arounds for device X on host Y. I suspect they are not limited to that combination only... How exactly does it fail to enumerate? There's a (racy) linkup check on config accesses, is it reporting link down and skipping config accesses? > I guessed this would be due to the timing impact of printk-to-UART and > tried to find out where a delay could be added to provoke a successful > enumeration. > > This patch contains the results. The delay length (1ms) was selected by > binary searching downwards until the delay was not effective for these > devices (Honeycomb LX2K and a Western Digital WD Blue SN550). > > I have also included the workaround twice (conditionally compiled). The > first change is the *latest* possible code path that we can deploy a > delay whilst the second is the earliest place I could find. > > The summary is that the critical window were we are currently relying on > a call to the console UART code can "mend" the driver runs from calling > dw_pcie_setup_rc() in host init to just before we read the state in the > link up callback. > > Signed-off-by: Daniel Thompson > --- > > Notes: > This patch is RFC (and HACK) because I don't have much clue *why* this > patch works... merely that this is the smallest possible change I need > to replicate the current accidental printk() workaround. Perhaps one > could argue that RFC here stands for request-for-clue. All my > observations and changes here are empirical and I don't know how best to > turn them into something that is not a hack! > > BTW I noticed many other pcie-designware drivers take advantage > of a function called dw_pcie_wait_for_link() in their init paths... > but my naive attempts to add it to the layerscape driver results > in non-booting systems so I haven't embarrassed myself by including > that in the patch! You need to look at what's pending for v5.11, because I reworked this to be more unified. The ordering of init is also possibly changed. The sequence is now like this: dw_pcie_setup_rc(pp); dw_pcie_msi_init(pp); if (!dw_pcie_link_up(pci) && pci->ops->start_link) { ret = pci->ops->start_link(pci); if (ret) goto err_free_msi; } /* Ignore errors, the link may come up later */ dw_pcie_wait_for_link(pci); > drivers/pci/controller/dwc/pci-layerscape.c | 35 + > 1 file changed, 35 insertions(+) > > diff --git a/drivers/pci/controller/dwc/pci-layerscape.c > b/drivers/pci/controller/dwc/pci-layerscape.c > index f24f79a70d9a..c354904b90ef 100644 > --- a/drivers/pci/controller/dwc/pci-layerscape.c > +++ b/drivers/pci/controller/dwc/pci-layerscape.c > @@ -22,6 +22,8 @@ > > #include "pcie-designware.h" > > +#define WORKAROUND_LATEST_POSSIBLE > + > /* PEX1/2 Misc Ports Status Register */ > #define SCFG_PEXMSCPORTSR(pex_idx) (0x94 + (pex_idx) * 4) > #define LTSSM_STATE_SHIFT 20 > @@ -113,10 +115,31 @@ static int ls_pcie_link_up(struct dw_pcie *pci) > struct ls_pcie *pcie = to_ls_pcie(pci); > u32 state; > > + /* > +* Strictly speaking *this* (before the ioread32) is the latest > +* point a simple delay can be effective. If we move the delay > +* after the ioread32 then the NVMe does not enumerate. > +* > +* However this function appears to be frequently called so an > +* unconditional delay here causes noticeable delay at boot > +* time. Hence we implement the workaround by retrying the read > +* after a short delay if we think we might need to return false. > +*/ > + > state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >> > pcie->drvdata->ltssm_shift) & > LTSSM_STATE_MASK; > > +#ifdef WORKAROUND_LATEST_POSSIBLE > + if (state < LTSSM_PCIE_L0) { > + /* see comment above */ > + mdelay(1); > + state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >> > +pcie->drvdata->ltssm_shift) & > +LTSSM_STATE_MASK; Side note, I really wonder about the variety of ways in DWC drivers we have to check link state. The default is a debug register... Are the standard PCIe registers for this really broken in these cases? > + } > +#endif > + > if (state < LTSSM_PCIE_L0) > return 0; > > @@ -152,6 +175,18 @@ static int ls_pcie_host_init(struct pcie_port *pp) > > dw_pcie_setup_rc(pp); Might be related to the PORT_LOGIC_SPEED_CHANGE we
[RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K
I have been chasing down a problem enumerating an NVMe drive on a Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate successfully if the we are emitting lots of console messages via a UART. If the system is booted with `quiet` set then enumeration fails. I guessed this would be due to the timing impact of printk-to-UART and tried to find out where a delay could be added to provoke a successful enumeration. This patch contains the results. The delay length (1ms) was selected by binary searching downwards until the delay was not effective for these devices (Honeycomb LX2K and a Western Digital WD Blue SN550). I have also included the workaround twice (conditionally compiled). The first change is the *latest* possible code path that we can deploy a delay whilst the second is the earliest place I could find. The summary is that the critical window were we are currently relying on a call to the console UART code can "mend" the driver runs from calling dw_pcie_setup_rc() in host init to just before we read the state in the link up callback. Signed-off-by: Daniel Thompson --- Notes: This patch is RFC (and HACK) because I don't have much clue *why* this patch works... merely that this is the smallest possible change I need to replicate the current accidental printk() workaround. Perhaps one could argue that RFC here stands for request-for-clue. All my observations and changes here are empirical and I don't know how best to turn them into something that is not a hack! BTW I noticed many other pcie-designware drivers take advantage of a function called dw_pcie_wait_for_link() in their init paths... but my naive attempts to add it to the layerscape driver results in non-booting systems so I haven't embarrassed myself by including that in the patch! drivers/pci/controller/dwc/pci-layerscape.c | 35 + 1 file changed, 35 insertions(+) diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c index f24f79a70d9a..c354904b90ef 100644 --- a/drivers/pci/controller/dwc/pci-layerscape.c +++ b/drivers/pci/controller/dwc/pci-layerscape.c @@ -22,6 +22,8 @@ #include "pcie-designware.h" +#define WORKAROUND_LATEST_POSSIBLE + /* PEX1/2 Misc Ports Status Register */ #define SCFG_PEXMSCPORTSR(pex_idx) (0x94 + (pex_idx) * 4) #define LTSSM_STATE_SHIFT 20 @@ -113,10 +115,31 @@ static int ls_pcie_link_up(struct dw_pcie *pci) struct ls_pcie *pcie = to_ls_pcie(pci); u32 state; + /* +* Strictly speaking *this* (before the ioread32) is the latest +* point a simple delay can be effective. If we move the delay +* after the ioread32 then the NVMe does not enumerate. +* +* However this function appears to be frequently called so an +* unconditional delay here causes noticeable delay at boot +* time. Hence we implement the workaround by retrying the read +* after a short delay if we think we might need to return false. +*/ + state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >> pcie->drvdata->ltssm_shift) & LTSSM_STATE_MASK; +#ifdef WORKAROUND_LATEST_POSSIBLE + if (state < LTSSM_PCIE_L0) { + /* see comment above */ + mdelay(1); + state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >> +pcie->drvdata->ltssm_shift) & +LTSSM_STATE_MASK; + } +#endif + if (state < LTSSM_PCIE_L0) return 0; @@ -152,6 +175,18 @@ static int ls_pcie_host_init(struct pcie_port *pp) dw_pcie_setup_rc(pp); +#ifdef WORKAROUND_EARLIEST_POSSIBLE + /* +* This is the earliest point the delay is effective. +* If we move it before dw_pcie_setup_rc() then the +* NVMe does not enumerate. +* +* 500us is too short to reliably work around the issue +* hence adopting 1000us here. +*/ + mdelay(1); +#endif + return 0; } base-commit: 0477e92881850d44910a7e94fc2c46f96faa131f -- 2.29.2
[PATCH] ASoC: fsl: imx-hdmi: Fix an uninitialized return in probe
The "ret" variable should be set to a negative error code but currently it returns uninitialized stack data. Fixes: 6a5f850aa83a ("ASoC: fsl: Add imx-hdmi machine driver") Signed-off-by: Dan Carpenter --- sound/soc/fsl/imx-hdmi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/soc/fsl/imx-hdmi.c b/sound/soc/fsl/imx-hdmi.c index 2c2a76a71940..ede4a9ad1054 100644 --- a/sound/soc/fsl/imx-hdmi.c +++ b/sound/soc/fsl/imx-hdmi.c @@ -164,6 +164,7 @@ static int imx_hdmi_probe(struct platform_device *pdev) if ((hdmi_out && hdmi_in) || (!hdmi_out && !hdmi_in)) { dev_err(&pdev->dev, "Invalid HDMI DAI link\n"); + ret = -EINVAL; goto fail; } -- 2.29.2
Re: [PATCH v1 1/2] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE (nested case only)
On Mon, Oct 19, 2020 at 04:56:41PM +0530, Bharata B Rao wrote: > Implements H_RPT_INVALIDATE hcall and supports only nested case > currently. > > A KVM capability KVM_CAP_RPT_INVALIDATE is added to indicate the > support for this hcall. As Paul mentioned in the thread, this hcall does both process scoped invalidations and partition scoped invalidations for L2 guest. I am adding KVM_CAP_RPT_INVALIDATE capability with only partition scoped invalidations (nested case) implemented in the hcall as we don't see the need for KVM to implement process scoped invalidation function as KVM may never run with LPCR[GTSE]=0. I am wondering if enabling the capability with only partial implementation of the hcall is the correct thing to do. In future if we ever want process scoped invalidations support in this hcall, we may not be able to differentiate the availability of two functions cleanly from QEMU. So does it make sense to implement the process scoped invalidation function also now itself even if it is not going to be used in KVM? Regards, Bharata.
[powerpc/merge] System crash during cpu offline/online operation
I am observing system crash during a cpu offline/online operation with latest merge branch code running in a PowerVM LPAR (P8 onwards) # uname -r 5.10.0-rc7-01792-g244569c777ca # ppc64_cpu --smt=1 [ 244.205194] cpu 1 (hwid 1) Ready to die… ……… ……... [ 247.015113] cpu 30 (hwid 30) Ready to die... [ 247.104973] cpu 31 (hwid 31) Ready to die… # ppc64_cpu --smt=8 At this point the LPAR reboots instantly without any trace message. IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM ……….. ……….. IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM 1 = SMS Menu 5 = Default Boot List 8 = Open Firmware Prompt 6 = Stored Boot List 9 = Restricted Open Firmware Prompt ……. ……. merge commit 9acd775e45 (based on rc6) was good. Mainline 5.10.0-rc7 also does not exhibit this problem. Thanks -Sachin
[PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent
Remove casting the values returned by dma_alloc_coherent. Signed-off-by: Xu Wang --- drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c index 99fe2c210d0f..3ae345676e50 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev) struct fs_enet_private *fep = netdev_priv(dev); const struct fs_platform_info *fpi = fep->fpi; - fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev, + fep->ring_base = dma_alloc_coherent(fep->dev, (fpi->tx_ring + fpi->rx_ring) * sizeof(cbd_t), &fep->ring_mem_addr, GFP_KERNEL); -- 2.17.1
[PATCH] net: fs_enet: remove casting dma_alloc_coherent
Remove casting the values returned by dma_alloc_coherent. Signed-off-by: Xu Wang --- drivers/net/ethernet/freescale/fs_enet/mac-fcc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c index b47490be872c..17f757c0bb85 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c @@ -147,7 +147,7 @@ static int allocate_bd(struct net_device *dev) struct fs_enet_private *fep = netdev_priv(dev); const struct fs_platform_info *fpi = fep->fpi; - fep->ring_base = (void __iomem __force *)dma_alloc_coherent(fep->dev, + fep->ring_base = dma_alloc_coherent(fep->dev, (fpi->tx_ring + fpi->rx_ring) * sizeof(cbd_t), &fep->ring_mem_addr, GFP_KERNEL); -- 2.17.1
Re: [PATCH v2 07/13] powerpc: Increase NR_IRQS range to support more KVM guests
On 12/11/20 12:51 AM, Michael Ellerman wrote: > Cédric Le Goater writes: >> PowerNV systems can handle up to 4K guests and 1M interrupt numbers >> per chip. Increase the range of allowed interrupts to support a larger >> number of guests. >> >> Reviewed-by: Greg Kurz >> Signed-off-by: Cédric Le Goater >> --- >> arch/powerpc/Kconfig | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index 5181872f9452..c250fbd430d1 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -66,7 +66,7 @@ config NEED_PER_CPU_PAGE_FIRST_CHUNK >> >> config NR_IRQS >> int "Number of virtual interrupt numbers" >> -range 32 32768 >> +range 32 1048576 >> default "512" >> help >>This defines the number of virtual interrupt numbers the kernel > > We should really do what other arches do, and size this appropriately > based on the config, rather than asking users to guess what size they > need. > > But I guess I'll take this for now, and we can do something fancier > later. I was thinking on adding a property to OPAL to size the HW interrupt number space. Is that it ? That would be good because it's increasing from 20bits on P9 to 24bits on P10. I am checking other arches. Thanks, C.