RE: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent

2020-12-11 Thread David Laight
From: Christophe Leroy
> Sent: 11 December 2020 15:22
> 
> Le 11/12/2020 à 09:52, Xu Wang a écrit :
> > Remove casting the values returned by dma_alloc_coherent.
> 
> Can you explain more in the commit log ?
> 
> As far as I can see, dma_alloc_coherent() doesn't return __iomem, and 
> ring_base member is __iomem

Which is probably wrong - that is the kernel address of kernel memory.
So it shouldn't have the __iomem marker.

I wonder what else is wrong

David

> 
> Christophe
> 
> >
> > Signed-off-by: Xu Wang 
> > ---
> >   drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> > index 99fe2c210d0f..3ae345676e50 100644
> > --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> > +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> > @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev)
> > struct fs_enet_private *fep = netdev_priv(dev);
> > const struct fs_platform_info *fpi = fep->fpi;
> >
> > -   fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev,
> > +   fep->ring_base = dma_alloc_coherent(fep->dev,
> > (fpi->tx_ring + fpi->rx_ring) *
> > sizeof(cbd_t), &fep->ring_mem_addr,
> > GFP_KERNEL);
> >

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent

2020-12-11 Thread David Laight
From: Christophe Leroy
> Sent: 11 December 2020 16:43
> 
> Le 11/12/2020 à 17:07, David Laight a écrit :
> > From: Christophe Leroy
> >> Sent: 11 December 2020 15:22
> >>
> >> Le 11/12/2020 à 09:52, Xu Wang a écrit :
> >>> Remove casting the values returned by dma_alloc_coherent.
> >>
> >> Can you explain more in the commit log ?
> >>
> >> As far as I can see, dma_alloc_coherent() doesn't return __iomem, and 
> >> ring_base member is __iomem
> >
> > Which is probably wrong - that is the kernel address of kernel memory.
> > So it shouldn't have the __iomem marker.
> 
> That's where the buffer descriptors are, the driver accesses to the content 
> of the buffer
> descriptors using the IO accessors in_be16()/out_be16(). Is it not correct ?

I've just been looking at the crap in there.
My understanding is that IO accessors are for IO devices (eg addresses
from io_remap() etc).

Buffers allocated by dma_alloc_coherent() are normal kernel memory
and don't need any accessors.
Now you might need some barriers - mostly because an ethernet chip
can typically read a ring entry without being prodded.
IIRC there is a barrier in writel() to ensure the dma master will
'see' all memory writes done before the IO write that kicks it into
doing some processing.

The fact that the driver contains so many __iomem casts (eg in
tx_restart) is an indication that something is badly awry.
__iomem exists to check you are using the correct type of pointer.
Any __iomem casts are dubious.

David

> 
> Christophe
> 
> >
> > I wonder what else is wrong
> >
> > David
> >
> >>
> >> Christophe
> >>
> >>>
> >>> Signed-off-by: Xu Wang 
> >>> ---
> >>>drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
> >>>1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> >> b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> >>> index 99fe2c210d0f..3ae345676e50 100644
> >>> --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> >>> +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
> >>> @@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev)
> >>>   struct fs_enet_private *fep = netdev_priv(dev);
> >>>   const struct fs_platform_info *fpi = fep->fpi;
> >>>
> >>> - fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev,
> >>> + fep->ring_base = dma_alloc_coherent(fep->dev,
> >>>   (fpi->tx_ring + 
> >>> fpi->rx_ring) *
> >>>   sizeof(cbd_t), 
> >>> &fep->ring_mem_addr,
> >>>   GFP_KERNEL);
> >>>
> >
> > -
> > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> > 1PT, UK
> > Registration No: 1397386 (Wales)
> >

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-11 Thread Daniel Thompson
On Fri, Dec 11, 2020 at 08:37:40AM -0600, Rob Herring wrote:
> On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson
>  wrote:
> >
> > I have been chasing down a problem enumerating an NVMe drive on a
> > Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate
> > successfully if the we are emitting lots of console messages via a UART.
> > If the system is booted with `quiet` set then enumeration fails.
> 
> We really need to get away from work-arounds for device X on host Y. I
> suspect they are not limited to that combination only...

No objection on that. This patch was essentially sharing the result of
an investigation where I got stuck at the "now fix it properly" stage!


> How exactly does it fail to enumerate? There's a (racy) linkup check
> on config accesses, is it reporting link down and skipping config
> accesses?

In dmesg terms it looked like this:

--- nvme_borked_gpu_working.linted.dmesg2020-11-18 15:28:35.544118213 
+
+++ nvme_working_gpu_working.linted.dmesg   2020-11-18 15:28:56.180076946 
+
@@ -241,11 +241,19 @@
 pci :00:00.0: reg 0x38: [mem 0x904800-0x9048ff pref]
 pci :00:00.0: supports D1 D2
 pci :00:00.0: PME# supported from D0 D1 D2 D3hot
+pci :01:00.0: [15b7:5009] type 00 class 0x010802
+pci :01:00.0: reg 0x10: [mem 0x904900-0x9049003fff 64bit]
+pci :01:00.0: reg 0x20: [mem 0x9049004000-0x90490040ff 64bit]
 pci :00:00.0: BAR 1: assigned [mem 0x904000-0x9043ff]
 pci :00:00.0: BAR 0: assigned [mem 0x904400-0x9044ff]
 pci :00:00.0: BAR 6: assigned [mem 0x904500-0x9045ff pref]
+pci :00:00.0: BAR 14: assigned [mem 0x904600-0x90460f]
+pci :01:00.0: BAR 0: assigned [mem 0x904600-0x9046003fff 64bit]
+pci :01:00.0: BAR 4: assigned [mem 0x9046004000-0x90460040ff 64bit]
 pci :00:00.0: PCI bridge to [bus 01-ff]
+pci :00:00.0:   bridge window [mem 0x904600-0x90460f]
 pci :00:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  
256
+pci :01:00.0: Max Payload Size set to  256/ 512 (was  128), Max Read Rq  
256
 layerscape-pcie 380.pcie: host bridge /soc/pcie@380 ranges:
 layerscape-pcie 380.pcie:  MEM 0xa04000..0xa07fff -> 
0x004000
 layerscape-pcie 380.pcie: PCI host bridge to bus 0001:00

... and be aware that the last three lines here are another PCIe
controller coming up just fine and it is about to detect the GPU
(which like the NVMe is also gen3) just fine whether or not we
add a short delay.


> > I guessed this would be due to the timing impact of printk-to-UART and
> > tried to find out where a delay could be added to provoke a successful
> > enumeration.
> >
> > This patch contains the results. The delay length (1ms) was selected by
> > binary searching downwards until the delay was not effective for these
> > devices (Honeycomb LX2K and a Western Digital WD Blue SN550).
> >
> > I have also included the workaround twice (conditionally compiled). The
> > first change is the *latest* possible code path that we can deploy a
> > delay whilst the second is the earliest place I could find.
> >
> > The summary is that the critical window were we are currently relying on
> > a call to the console UART code can "mend" the driver runs from calling
> > dw_pcie_setup_rc() in host init to just before we read the state in the
> > link up callback.
> >
> > Signed-off-by: Daniel Thompson 
> > ---
> >
> > Notes:
> > This patch is RFC (and HACK) because I don't have much clue *why* this
> > patch works... merely that this is the smallest possible change I need
> > to replicate the current accidental printk() workaround.  Perhaps one
> > could argue that RFC here stands for request-for-clue.  All my
> > observations and changes here are empirical and I don't know how best to
> > turn them into something that is not a hack!
> >
> > BTW I noticed many other pcie-designware drivers take advantage
> > of a function called dw_pcie_wait_for_link() in their init paths...
> > but my naive attempts to add it to the layerscape driver results
> > in non-booting systems so I haven't embarrassed myself by including
> > that in the patch!
> 
> You need to look at what's pending for v5.11, because I reworked this
> to be more unified. The ordering of init is also possibly changed. The
> sequence is now like this:
> 
> dw_pcie_setup_rc(pp);
> dw_pcie_msi_init(pp);
> 
> if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
> ret = pci->ops->start_link(pci);
> if (ret)
> goto err_free_msi;
> }
> 
> /* Ignore errors, the link may come up later */
> dw_pcie_wait_for_link(pci);

Thanks. That looks likely to fix it since IIUC dw_pcie_wait_for_link()
will end up waiting somewhat like the double check I added to
ls_pcie_link_up().

I'll take a look at let you know.


Daniel.


Re: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent

2020-12-11 Thread Christophe Leroy




Le 11/12/2020 à 17:07, David Laight a écrit :

From: Christophe Leroy

Sent: 11 December 2020 15:22

Le 11/12/2020 à 09:52, Xu Wang a écrit :

Remove casting the values returned by dma_alloc_coherent.


Can you explain more in the commit log ?

As far as I can see, dma_alloc_coherent() doesn't return __iomem, and ring_base 
member is __iomem


Which is probably wrong - that is the kernel address of kernel memory.
So it shouldn't have the __iomem marker.


That's where the buffer descriptors are, the driver accesses to the content of the buffer 
descriptors using the IO accessors in_be16()/out_be16(). Is it not correct ?


Christophe



I wonder what else is wrong

David



Christophe



Signed-off-by: Xu Wang 
---
   drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c

b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c

index 99fe2c210d0f..3ae345676e50 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
@@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev)
struct fs_enet_private *fep = netdev_priv(dev);
const struct fs_platform_info *fpi = fep->fpi;

-   fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev,
+   fep->ring_base = dma_alloc_coherent(fep->dev,
(fpi->tx_ring + fpi->rx_ring) *
sizeof(cbd_t), &fep->ring_mem_addr,
GFP_KERNEL);



-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [PATCH] powerpc/mm: Refactor the floor/ceiling check in hugetlb range freeing functions

2020-12-11 Thread Qian Cai
ange() after range_is_outside_limits(), but after this patch,
"start" is not longer has the bitmask, i.e., "no &=".

Anyway, reverting this commit from today's linux-next fixed a crash on POWE9 NV.

# runltp -f hugetlb
[ 7703.114640][T58070] LTP: starting hugemmap05_1 (hugemmap05 -m)
[ 7703.157792][   C99] [ cut here ]
[ 7703.158279][   C99] kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:387!
[ 7703.158306][   C99] Oops: Exception in kernel mode, sig: 5 [#1]
[ 7703.158330][   C99] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 NUMA PowerNV
[ 7703.158343][   C99] Modules linked in: vfio_pci vfio_virqfd 
vfio_iommu_spapr_tce vfio vfio_spapr_eeh loop kvm_hv kvm ip_tables x_tables 
sd_mod ahci libahci tg3 libata firmware_class libphy dm_mirror dm_region_hash 
dm_log dm_mod [last unloaded: dummy_del_mod]
[ 7703.158435][   C99] CPU: 99 PID: 308 Comm: ksoftirqd/99 Tainted: G   
O  5.10.0-rc7-next-20201211 #1
[ 7703.158464][   C99] NIP:  c005dbec LR: c03352f4 CTR: 

[ 7703.158489][   C99] REGS: c0002bb6f830 TRAP: 0700   Tainted: G   
O   (5.10.0-rc7-next-20201211)
[ 7703.158528][   C99] MSR:  9282b033 
  CR: 24002284  XER: 2004
[ 7703.158570][   C99] GPR00: c03352f4 c0002bb6fad0 
c7f70b00 c0002000385b3ff0 
[ 7703.158570][   C99] GPR04:  0003 
c0002bb6f8b4 0001 
[ 7703.158570][   C99] GPR08: 0001 0009 
0008 0002 
[ 7703.158570][   C99] GPR12: 24002488 c000201fff649c00 
c7f2a20c  
[ 7703.158570][   C99] GPR16: 0007  
c0194d10 c0194d10 
[ 7703.158570][   C99] GPR24: 0014 0015 
c000201cc6e72398 c7fac4b4 
[ 7703.158570][   C99] GPR28: c7f2bf80 c7fac2f8 
0008 c00020003387 
[ 7703.158766][   C99] NIP [c005dbec] __tlb_remove_table+0x1dc/0x1e0
pgtable_free at arch/powerpc/mm/book3s64/pgtable.c:387
(inlined by) __tlb_remove_table at arch/powerpc/mm/book3s64/pgtable.c:405
[ 7703.158805][   C99] LR [c03352f4] tlb_remove_table_rcu+0x54/0xa0
[ 7703.158853][   C99] Call Trace:
[ 7703.158872][   C99] [c0002bb6fad0] [c005db4c] 
__tlb_remove_table+0x13c/0x1e0 (unreliable)
[ 7703.158890][   C99] [c0002bb6fb00] [c03352f4] 
tlb_remove_table_rcu+0x54/0xa0
__tlb_remove_table_free at mm/mmu_gather.c:101
(inlined by) tlb_remove_table_rcu at mm/mmu_gather.c:156
[ 7703.158927][   C99] [c0002bb6fb30] [c0194d7c] 
rcu_core+0x35c/0xbb0
rcu_do_batch at kernel/rcu/tree.c:2502
(inlined by) rcu_core at kernel/rcu/tree.c:2737
[ 7703.158966][   C99] [c0002bb6fbf0] [c095a3d0] 
__do_softirq+0x480/0x704
[ 7703.159006][   C99] [c0002bb6fd10] [c00cc1f4] 
run_ksoftirqd+0x74/0xd0
run_ksoftirqd at kernel/softirq.c:651
(inlined by) run_ksoftirqd at kernel/softirq.c:642
[ 7703.159046][   C99] [c0002bb6fd30] [c01040c8] 
smpboot_thread_fn+0x278/0x320
[ 7703.159096][   C99] [c0002bb6fda0] [c00fc8a4] kthread+0x1c4/0x1d0
[ 7703.159145][   C99] [c0002bb6fe10] [c000d9fc] 
ret_from_kernel_thread+0x5c/0x80
[ 7703.159183][   C99] Instruction dump:
[ 7703.159204][   C99] 6000 7c0802a6 3c82f8b4 7fe3fb78 38847470 f8010040 
482b4fc5 6000 
[ 7703.159248][   C99] 0fe0 7c0802a6 fbe10028 f8010040 <0fe0> 3c4c07f1 
38422f10 7c0802a6 
[ 7703.159293][   C99] ---[ end trace 1d92a5231ba6a0d5 ]---



[RFC PATCH v1] powerpc/net: Implement eBPF on PPC32

2020-12-11 Thread Christophe Leroy
This first draft of eBPF for PPC32. This is still dirty WIP.

Don't pay too much attention on comments, most of them are copied
as is from PPC64 implementation.

Test result:

test_bpf: Summary: 378 PASSED, 0 FAILED, [332/366 JIT'ed]

Registers mapping:

[BPF_REG_0] = r21-r22
/* function arguments */
[BPF_REG_1] = r3-r4
[BPF_REG_2] = r5-r6
[BPF_REG_3] = r7-r8
[BPF_REG_4] = r9-r10
[BPF_REG_5] = r11-r12
/* non volatile registers */
[BPF_REG_6] = r23-r24
[BPF_REG_7] = r25-r26
[BPF_REG_8] = r27-r28
[BPF_REG_9] = r29-r30
/* frame pointer aka BPF_REG_10 */
[BPF_REG_FP] = r31
/* eBPF jit internal registers */
[BPF_REG_AX] = r19-r20
[TMP_REG] = r18

r0 is also used as temporary register as much as possible. It is
referenced directly in the code in order to avoid misuse of it, as
some instructions interpret it as value 0 instead of register r0.

The following operations are not (or only partially) supported
for the time being:

case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */
case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */
case BPF_ALU64 | BPF_MOD | BPF_K: /* dst %= imm */
case BPF_ALU64 | BPF_DIV | BPF_K: /* dst /= imm */
case BPF_ALU64 | BPF_LSH | BPF_X: /* dst <<= src; */
case BPF_ALU64 | BPF_LSH | BPF_K: /* dst <<== imm */
case BPF_ALU64 | BPF_RSH | BPF_X: /* dst >>= src */
case BPF_ALU64 | BPF_RSH | BPF_K: /* dst >>= imm */
case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src 
*/

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig  |3 +-
 arch/powerpc/include/asm/ppc-opcode.h |   11 +
 arch/powerpc/net/Makefile |6 +-
 arch/powerpc/net/bpf_jit32.h  |  173 ++--
 arch/powerpc/net/bpf_jit_asm.S|  226 -
 arch/powerpc/net/bpf_jit_comp.c   |  683 --
 arch/powerpc/net/bpf_jit_comp32.c | 1177 +
 7 files changed, 1249 insertions(+), 1030 deletions(-)
 delete mode 100644 arch/powerpc/net/bpf_jit_asm.S
 delete mode 100644 arch/powerpc/net/bpf_jit_comp.c
 create mode 100644 arch/powerpc/net/bpf_jit_comp32.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9e679ba0811c..dd6ccd550230 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -190,7 +190,6 @@ config PPC
select HAVE_ARCH_TRACEHOOK
select HAVE_ASM_MODVERSIONS
select HAVE_C_RECORDMCOUNT
-   select HAVE_CBPF_JITif !PPC64
select HAVE_STACKPROTECTOR  if PPC64 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
select HAVE_STACKPROTECTOR  if PPC32 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
select HAVE_CONTEXT_TRACKINGif PPC64
@@ -199,7 +198,7 @@ config PPC
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGSif MPROFILE_KERNEL
-   select HAVE_EBPF_JITif PPC64
+   select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS  if !(CPU_LITTLE_ENDIAN && 
POWER7_CPU)
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..0cefa1da9a1f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -417,6 +417,7 @@
 #define PPC_RAW_LD(r, base, i) (PPC_INST_LD | ___PPC_RT(r) | 
___PPC_RA(base) | IMM_DS(i))
 #define PPC_RAW_LWZ(r, base, i)(0x8000 | ___PPC_RT(r) | 
___PPC_RA(base) | IMM_L(i))
 #define PPC_RAW_LWZX(t, a, b)  (0x7c2e | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_LMW(r, base, i)(0xb800 | ___PPC_RT(r) | 
___PPC_RA(base) | IMM_L(i))
 #define PPC_RAW_STD(r, base, i)(PPC_INST_STD | ___PPC_RS(r) | 
___PPC_RA(base) | IMM_DS(i))
 #define PPC_RAW_STDCX(s, a, b) (0x7c0001ad | ___PPC_RS(s) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_LFSX(t, a, b)  (0x7c00042e | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
@@ -425,6 +426,9 @@
 #define PPC_RAW_STFDX(s, a, b) (0x7c0005ae | ___PPC_RS(s) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_LVX(t, a, b)   (0x7cce | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_STVX(s, a, b)  (0x7c0001ce | ___PPC_RS(s) | 
___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_ADDE(t, a, b)  (0x7c000114 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_ADDZE(t, a)(0x7c000194 

Re: [PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent

2020-12-11 Thread Christophe Leroy




Le 11/12/2020 à 09:52, Xu Wang a écrit :

Remove casting the values returned by dma_alloc_coherent.


Can you explain more in the commit log ?

As far as I can see, dma_alloc_coherent() doesn't return __iomem, and ring_base 
member is __iomem

Christophe



Signed-off-by: Xu Wang 
---
  drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c 
b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
index 99fe2c210d0f..3ae345676e50 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
@@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev)
struct fs_enet_private *fep = netdev_priv(dev);
const struct fs_platform_info *fpi = fep->fpi;
  
-	fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev,

+   fep->ring_base = dma_alloc_coherent(fep->dev,
(fpi->tx_ring + fpi->rx_ring) *
sizeof(cbd_t), &fep->ring_mem_addr,
GFP_KERNEL);



[PATCH] powerpc/memhotplug: quieting some DLPAR operations

2020-12-11 Thread Laurent Dufour
When attempting to remove by index a set of LMB a lot of messages are
displayed on the console, even when everything goes fine:

 pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 802d
 Offlined Pages 4096
 pseries-hotplug-mem: Memory at 2d000 was hot-removed

The 2 messages prefixed by "pseries-hotplug-mem" are not really helpful for
the end user, they should be debug outputs.

In case of error, because some of the LMB's pages couldn't be offlined, the
following is displayed on the console:

 pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 803e
 pseries-hotplug-mem: Failed to hot-remove memory at 3e000
 dlpar: Could not handle DLPAR request "memory remove index 0x803e"

Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless, and the
generic DLPAR prefixed message should be enough. Turning the 2 firts at
DEBUG level.

These 2 first changes are mainly triggered by the changes introduced in
drmgr:
https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ

Also, when adding a bunch of LMBs, a message is displayed in the console per LMB
like these ones:
 pseries-hotplug-mem: Memory at 7e000 (drc index 807e) was hot-added
 pseries-hotplug-mem: Memory at 7f000 (drc index 807f) was hot-added
 pseries-hotplug-mem: Memory at 8 (drc index 8080) was hot-added
 pseries-hotplug-mem: Memory at 81000 (drc index 8081) was hot-added

When adding 1TB of memory and LMB size is 256MB, this leads to 4096
messages to be displayed on the console. These messages are not really
helpful for the end user, so moving them to the DEBUG level.

Signed-off-by: Laurent Dufour 
---
 arch/powerpc/platforms/pseries/hotplug-memory.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 7efe6ec5d14a..8377f1f7c78e 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -479,7 +479,7 @@ static int dlpar_memory_remove_by_index(u32 drc_index)
int lmb_found;
int rc;
 
-   pr_info("Attempting to hot-remove LMB, drc index %x\n", drc_index);
+   pr_debug("Attempting to hot-remove LMB, drc index %x\n", drc_index);
 
lmb_found = 0;
for_each_drmem_lmb(lmb) {
@@ -497,10 +497,10 @@ static int dlpar_memory_remove_by_index(u32 drc_index)
rc = -EINVAL;
 
if (rc)
-   pr_info("Failed to hot-remove memory at %llx\n",
-   lmb->base_addr);
+   pr_debug("Failed to hot-remove memory at %llx\n",
+lmb->base_addr);
else
-   pr_info("Memory at %llx was hot-removed\n", lmb->base_addr);
+   pr_debug("Memory at %llx was hot-removed\n", lmb->base_addr);
 
return rc;
 }
@@ -717,8 +717,8 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add)
if (!drmem_lmb_reserved(lmb))
continue;
 
-   pr_info("Memory at %llx (drc index %x) was hot-added\n",
-   lmb->base_addr, lmb->drc_index);
+   pr_debug("Memory at %llx (drc index %x) was 
hot-added\n",
+lmb->base_addr, lmb->drc_index);
drmem_remove_lmb_reservation(lmb);
}
rc = 0;
-- 
2.29.2



Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-11 Thread Rob Herring
On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson
 wrote:
>
> I have been chasing down a problem enumerating an NVMe drive on a
> Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate
> successfully if the we are emitting lots of console messages via a UART.
> If the system is booted with `quiet` set then enumeration fails.

We really need to get away from work-arounds for device X on host Y. I
suspect they are not limited to that combination only...

How exactly does it fail to enumerate? There's a (racy) linkup check
on config accesses, is it reporting link down and skipping config
accesses?

> I guessed this would be due to the timing impact of printk-to-UART and
> tried to find out where a delay could be added to provoke a successful
> enumeration.
>
> This patch contains the results. The delay length (1ms) was selected by
> binary searching downwards until the delay was not effective for these
> devices (Honeycomb LX2K and a Western Digital WD Blue SN550).
>
> I have also included the workaround twice (conditionally compiled). The
> first change is the *latest* possible code path that we can deploy a
> delay whilst the second is the earliest place I could find.
>
> The summary is that the critical window were we are currently relying on
> a call to the console UART code can "mend" the driver runs from calling
> dw_pcie_setup_rc() in host init to just before we read the state in the
> link up callback.
>
> Signed-off-by: Daniel Thompson 
> ---
>
> Notes:
> This patch is RFC (and HACK) because I don't have much clue *why* this
> patch works... merely that this is the smallest possible change I need
> to replicate the current accidental printk() workaround.  Perhaps one
> could argue that RFC here stands for request-for-clue.  All my
> observations and changes here are empirical and I don't know how best to
> turn them into something that is not a hack!
>
> BTW I noticed many other pcie-designware drivers take advantage
> of a function called dw_pcie_wait_for_link() in their init paths...
> but my naive attempts to add it to the layerscape driver results
> in non-booting systems so I haven't embarrassed myself by including
> that in the patch!

You need to look at what's pending for v5.11, because I reworked this
to be more unified. The ordering of init is also possibly changed. The
sequence is now like this:

dw_pcie_setup_rc(pp);
dw_pcie_msi_init(pp);

if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
ret = pci->ops->start_link(pci);
if (ret)
goto err_free_msi;
}

/* Ignore errors, the link may come up later */
dw_pcie_wait_for_link(pci);


>  drivers/pci/controller/dwc/pci-layerscape.c | 35 +
>  1 file changed, 35 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pci-layerscape.c 
> b/drivers/pci/controller/dwc/pci-layerscape.c
> index f24f79a70d9a..c354904b90ef 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape.c
> @@ -22,6 +22,8 @@
>
>  #include "pcie-designware.h"
>
> +#define WORKAROUND_LATEST_POSSIBLE
> +
>  /* PEX1/2 Misc Ports Status Register */
>  #define SCFG_PEXMSCPORTSR(pex_idx) (0x94 + (pex_idx) * 4)
>  #define LTSSM_STATE_SHIFT  20
> @@ -113,10 +115,31 @@ static int ls_pcie_link_up(struct dw_pcie *pci)
> struct ls_pcie *pcie = to_ls_pcie(pci);
> u32 state;
>
> +   /*
> +* Strictly speaking *this* (before the ioread32) is the latest
> +* point a simple delay can be effective. If we move the delay
> +* after the ioread32 then the NVMe does not enumerate.
> +*
> +* However this function appears to be frequently called so an
> +* unconditional delay here causes noticeable delay at boot
> +* time. Hence we implement the workaround by retrying the read
> +* after a short delay if we think we might need to return false.
> +*/
> +
> state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >>
>  pcie->drvdata->ltssm_shift) &
>  LTSSM_STATE_MASK;
>
> +#ifdef WORKAROUND_LATEST_POSSIBLE
> +   if (state < LTSSM_PCIE_L0) {
> +   /* see comment above */
> +   mdelay(1);
> +   state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >>
> +pcie->drvdata->ltssm_shift) &
> +LTSSM_STATE_MASK;

Side note, I really wonder about the variety of ways in DWC drivers we
have to check link state. The default is a debug register... Are the
standard PCIe registers for this really broken in these cases?

> +   }
> +#endif
> +
> if (state < LTSSM_PCIE_L0)
> return 0;
>
> @@ -152,6 +175,18 @@ static int ls_pcie_host_init(struct pcie_port *pp)
>
> dw_pcie_setup_rc(pp);

Might be related to the PORT_LOGIC_SPEED_CHANGE we 

[RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-11 Thread Daniel Thompson
I have been chasing down a problem enumerating an NVMe drive on a
Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate
successfully if the we are emitting lots of console messages via a UART.
If the system is booted with `quiet` set then enumeration fails.

I guessed this would be due to the timing impact of printk-to-UART and
tried to find out where a delay could be added to provoke a successful
enumeration.

This patch contains the results. The delay length (1ms) was selected by
binary searching downwards until the delay was not effective for these
devices (Honeycomb LX2K and a Western Digital WD Blue SN550).

I have also included the workaround twice (conditionally compiled). The
first change is the *latest* possible code path that we can deploy a
delay whilst the second is the earliest place I could find.

The summary is that the critical window were we are currently relying on
a call to the console UART code can "mend" the driver runs from calling
dw_pcie_setup_rc() in host init to just before we read the state in the
link up callback.

Signed-off-by: Daniel Thompson 
---

Notes:
This patch is RFC (and HACK) because I don't have much clue *why* this
patch works... merely that this is the smallest possible change I need
to replicate the current accidental printk() workaround.  Perhaps one
could argue that RFC here stands for request-for-clue.  All my
observations and changes here are empirical and I don't know how best to
turn them into something that is not a hack!

BTW I noticed many other pcie-designware drivers take advantage
of a function called dw_pcie_wait_for_link() in their init paths...
but my naive attempts to add it to the layerscape driver results
in non-booting systems so I haven't embarrassed myself by including
that in the patch!

 drivers/pci/controller/dwc/pci-layerscape.c | 35 +
 1 file changed, 35 insertions(+)

diff --git a/drivers/pci/controller/dwc/pci-layerscape.c 
b/drivers/pci/controller/dwc/pci-layerscape.c
index f24f79a70d9a..c354904b90ef 100644
--- a/drivers/pci/controller/dwc/pci-layerscape.c
+++ b/drivers/pci/controller/dwc/pci-layerscape.c
@@ -22,6 +22,8 @@

 #include "pcie-designware.h"

+#define WORKAROUND_LATEST_POSSIBLE
+
 /* PEX1/2 Misc Ports Status Register */
 #define SCFG_PEXMSCPORTSR(pex_idx) (0x94 + (pex_idx) * 4)
 #define LTSSM_STATE_SHIFT  20
@@ -113,10 +115,31 @@ static int ls_pcie_link_up(struct dw_pcie *pci)
struct ls_pcie *pcie = to_ls_pcie(pci);
u32 state;

+   /*
+* Strictly speaking *this* (before the ioread32) is the latest
+* point a simple delay can be effective. If we move the delay
+* after the ioread32 then the NVMe does not enumerate.
+*
+* However this function appears to be frequently called so an
+* unconditional delay here causes noticeable delay at boot
+* time. Hence we implement the workaround by retrying the read
+* after a short delay if we think we might need to return false.
+*/
+
state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >>
 pcie->drvdata->ltssm_shift) &
 LTSSM_STATE_MASK;

+#ifdef WORKAROUND_LATEST_POSSIBLE
+   if (state < LTSSM_PCIE_L0) {
+   /* see comment above */
+   mdelay(1);
+   state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >>
+pcie->drvdata->ltssm_shift) &
+LTSSM_STATE_MASK;
+   }
+#endif
+
if (state < LTSSM_PCIE_L0)
return 0;

@@ -152,6 +175,18 @@ static int ls_pcie_host_init(struct pcie_port *pp)

dw_pcie_setup_rc(pp);

+#ifdef WORKAROUND_EARLIEST_POSSIBLE
+   /*
+* This is the earliest point the delay is effective.
+* If we move it before dw_pcie_setup_rc() then the
+* NVMe does not enumerate.
+*
+* 500us is too short to reliably work around the issue
+* hence adopting 1000us here.
+*/
+   mdelay(1);
+#endif
+
return 0;
 }


base-commit: 0477e92881850d44910a7e94fc2c46f96faa131f
--
2.29.2



[PATCH] ASoC: fsl: imx-hdmi: Fix an uninitialized return in probe

2020-12-11 Thread Dan Carpenter
The "ret" variable should be set to a negative error code but
currently it returns uninitialized stack data.

Fixes: 6a5f850aa83a ("ASoC: fsl: Add imx-hdmi machine driver")
Signed-off-by: Dan Carpenter 
---
 sound/soc/fsl/imx-hdmi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/imx-hdmi.c b/sound/soc/fsl/imx-hdmi.c
index 2c2a76a71940..ede4a9ad1054 100644
--- a/sound/soc/fsl/imx-hdmi.c
+++ b/sound/soc/fsl/imx-hdmi.c
@@ -164,6 +164,7 @@ static int imx_hdmi_probe(struct platform_device *pdev)
 
if ((hdmi_out && hdmi_in) || (!hdmi_out && !hdmi_in)) {
dev_err(&pdev->dev, "Invalid HDMI DAI link\n");
+   ret = -EINVAL;
goto fail;
}
 
-- 
2.29.2



Re: [PATCH v1 1/2] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE (nested case only)

2020-12-11 Thread Bharata B Rao
On Mon, Oct 19, 2020 at 04:56:41PM +0530, Bharata B Rao wrote:
> Implements H_RPT_INVALIDATE hcall and supports only nested case
> currently.
> 
> A KVM capability KVM_CAP_RPT_INVALIDATE is added to indicate the
> support for this hcall.

As Paul mentioned in the thread, this hcall does both process scoped
invalidations and partition scoped invalidations for L2 guest.
I am adding KVM_CAP_RPT_INVALIDATE capability with only partition
scoped invalidations (nested case) implemented in the hcall as we
don't see the need for KVM to implement process scoped invalidation
function as KVM may never run with LPCR[GTSE]=0.

I am wondering if enabling the capability with only partial
implementation of the hcall is the correct thing to do. In future
if we ever want process scoped invalidations support in this hcall,
we may not be able to differentiate the availability of two functions
cleanly from QEMU.

So does it make sense to implement the process scoped invalidation
function also now itself even if it is not going to be used in
KVM?

Regards,
Bharata.


[powerpc/merge] System crash during cpu offline/online operation

2020-12-11 Thread Sachin Sant
I am observing system crash during a cpu offline/online operation
with latest merge branch code running in a PowerVM LPAR (P8 onwards)

# uname -r
5.10.0-rc7-01792-g244569c777ca
# ppc64_cpu --smt=1
[  244.205194] cpu 1 (hwid 1) Ready to die…
………
……...
[  247.015113] cpu 30 (hwid 30) Ready to die...
[  247.104973] cpu 31 (hwid 31) Ready to die…
# ppc64_cpu --smt=8

At this point the LPAR reboots instantly without any trace message.

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM 
………..
………..
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM 

  1 = SMS Menu  5 = Default Boot List
  8 = Open Firmware Prompt  6 = Stored Boot List
  9 = Restricted Open Firmware Prompt

…….
…….

merge commit 9acd775e45 (based on rc6) was good. Mainline 5.10.0-rc7 also does
not exhibit this problem. 

Thanks
-Sachin
 

[PATCH] net: ethernet: fs-enet: remove casting dma_alloc_coherent

2020-12-11 Thread Xu Wang
Remove casting the values returned by dma_alloc_coherent.

Signed-off-by: Xu Wang 
---
 drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c 
b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
index 99fe2c210d0f..3ae345676e50 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c
@@ -131,7 +131,7 @@ static int allocate_bd(struct net_device *dev)
struct fs_enet_private *fep = netdev_priv(dev);
const struct fs_platform_info *fpi = fep->fpi;
 
-   fep->ring_base = (void __force __iomem *)dma_alloc_coherent(fep->dev,
+   fep->ring_base = dma_alloc_coherent(fep->dev,
(fpi->tx_ring + fpi->rx_ring) *
sizeof(cbd_t), &fep->ring_mem_addr,
GFP_KERNEL);
-- 
2.17.1



[PATCH] net: fs_enet: remove casting dma_alloc_coherent

2020-12-11 Thread Xu Wang
Remove casting the values returned by dma_alloc_coherent.

Signed-off-by: Xu Wang 
---
 drivers/net/ethernet/freescale/fs_enet/mac-fcc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c 
b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
index b47490be872c..17f757c0bb85 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
@@ -147,7 +147,7 @@ static int allocate_bd(struct net_device *dev)
struct fs_enet_private *fep = netdev_priv(dev);
const struct fs_platform_info *fpi = fep->fpi;
 
-   fep->ring_base = (void __iomem __force *)dma_alloc_coherent(fep->dev,
+   fep->ring_base = dma_alloc_coherent(fep->dev,
(fpi->tx_ring + fpi->rx_ring) *
sizeof(cbd_t), &fep->ring_mem_addr,
GFP_KERNEL);
-- 
2.17.1



Re: [PATCH v2 07/13] powerpc: Increase NR_IRQS range to support more KVM guests

2020-12-11 Thread Cédric Le Goater
On 12/11/20 12:51 AM, Michael Ellerman wrote:
> Cédric Le Goater  writes:
>> PowerNV systems can handle up to 4K guests and 1M interrupt numbers
>> per chip. Increase the range of allowed interrupts to support a larger
>> number of guests.
>>
>> Reviewed-by: Greg Kurz 
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  arch/powerpc/Kconfig | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 5181872f9452..c250fbd430d1 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -66,7 +66,7 @@ config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>  
>>  config NR_IRQS
>>  int "Number of virtual interrupt numbers"
>> -range 32 32768
>> +range 32 1048576
>>  default "512"
>>  help
>>This defines the number of virtual interrupt numbers the kernel
> 
> We should really do what other arches do, and size this appropriately
> based on the config, rather than asking users to guess what size they
> need.
> 
> But I guess I'll take this for now, and we can do something fancier
> later.

I was thinking on adding a property to OPAL to size the HW interrupt 
number space. Is that it ?  That would be good because it's increasing
from 20bits on P9 to 24bits on P10.

I am checking other arches.

Thanks,

C.