date:20190612

Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-12 Thread Oded Gabbay

On Wed, Jun 12, 2019 at 1:53 AM Benjamin Herrenschmidt wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
Hmm... right now they are being added to customers' data centers, but what do I know ;)

> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.
Thanks :)
>
> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).
MSI-X interrupts are working. Setting bit 59 doesn't apply to MSI-X
transactions (AFAICS from the PCIe controller spec we have).
>
> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
Understood. In the specific system we are integrated into, that is the
case - we have less than 48 bits. But, as you pointed out, it is not a
generic solution; with my H/W I can't give a generic fit-all
solution for POWER9. I'll settle for the best that I can do.

>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.
Understood and taken care of.

>
> Cheers,
> Ben.
>
>
>
>
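
The addressing constraint discussed in this message can be sketched in
user-space C. This is only an illustration under stated assumptions, not
driver code: dma_bit_mask() mirrors the shape of the kernel's DMA_BIT_MASK()
macro, while fits_in_mask(), reachable_with_fixed_bit() and FIXED_HIGH_BIT
are hypothetical names, with the fixed bit taken from the "1 << 59" above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): all ones in the low n bits. */
static uint64_t dma_bit_mask(unsigned int n)
{
	return n >= 64 ? ~0ULL : (1ULL << n) - 1;
}

/* True if every set bit of addr lies inside an n-bit mask. */
static bool fits_in_mask(uint64_t addr, unsigned int n)
{
	return (addr & ~dma_bit_mask(n)) == 0;
}

/* Hypothetical fixed high bit, taken from the "1 << 59" in the thread. */
#define FIXED_HIGH_BIT (1ULL << 59)

/*
 * True if bus_addr is reachable when the device itself generates only
 * dev_bits of address and a controller register supplies the fixed bit.
 */
static bool reachable_with_fixed_bit(uint64_t bus_addr, unsigned int dev_bits)
{
	if (!(bus_addr & FIXED_HIGH_BIT))
		return false;	/* not in the window at the fixed high address */
	return fits_in_mask(bus_addr & ~FIXED_HIGH_BIT, dev_bits);
}
```

With these helpers, an address such as (1ULL << 59) | 0x1000 does not fit a
48-bit or 50-bit mask on its own, but it is reachable once the fixed bit is
supplied separately, as long as the remaining offset stays within the
device's 50 bits.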

Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-12 Thread Christoph Hellwig

On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> Setting a 48 bit DMA mask doesn't work today because we only allocate
> IOMMU tables to cover the 0..2GB range of PCI bus addresses.

I don't think that is true upstream, and if it is we need to fix the bug
in the powerpc code.  powerpc should at least fall back to treating a 48-bit
dma mask like a 32-bit one, that is, use dynamic iommu mappings
instead of using the direct mapping.  And from my reading of
arch/powerpc/kernel/dma-iommu.c that is exactly what it does.

[PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Michael Neuling

In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
option") I screwed up some assembler and corrupted a pointer in
r3. This resulted in crashes like the below from Cédric:

  [   44.374746] BUG: Kernel NULL pointer dereference at 0x13bf
  [   44.374848] Faulting instruction address: 0xc010b044
  [   44.374906] Oops: Kernel access of bad area, sig: 11 [#1]
  [   44.374951] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
pSeries
  [   44.375018] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter vmx_crypto crct10dif_vpmsum crc32c_vpmsum kvm_hv kvm 
sch_fq_codel ip_tables x_tables autofs4 virtio_net net_failover virtio_scsi 
failover
  [   44.375401] CPU: 8 PID: 1771 Comm: qemu-system-ppc Kdump: loaded Not 
tainted 5.2.0-rc4+ #3
  [   44.375500] NIP:  c010b044 LR: c008089dacf4 CTR: 
c010aff4
  [   44.375604] REGS: c0179b397710 TRAP: 0300   Not tainted  (5.2.0-rc4+)
  [   44.375691] MSR:  8280b033   CR: 
42244842  XER: 
  [   44.375815] CFAR: c010aff8 DAR: 13bf DSISR: 4200 
IRQMASK: 0
  [   44.375815] GPR00: c008089dd6bc c0179b3979a0 c00808a04300 

  [   44.375815] GPR04:  0003 2444b05d 
c017f11c45d0
  [   44.375815] GPR08: 07803e018dfe 0028 0001 
0075
  [   44.375815] GPR12: c010aff4 c7ff6300  

  [   44.375815] GPR16:  c017f11d  
c017f11ca7a8
  [   44.375815] GPR20: c017f11c42ec   
000a
  [   44.375815] GPR24: fffc  c017f11c 
c1a77ed8
  [   44.375815] GPR28: c0179af7 fffc c008089ff170 
c0179ae88540
  [   44.376673] NIP [c010b044] kvmppc_h_set_dabr+0x50/0x68
  [   44.376754] LR [c008089dacf4] kvmppc_pseries_do_hcall+0xa3c/0xeb0 
[kvm_hv]
  [   44.376849] Call Trace:
  [   44.376886] [c0179b3979a0] [c017f11c] 0xc017f11c 
(unreliable)
  [   44.376982] [c0179b397a10] [c008089dd6bc] 
kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
  [   44.377084] [c0179b397ae0] [c008093f8bcc] 
kvmppc_vcpu_run+0x34/0x48 [kvm]
  [   44.377185] [c0179b397b00] [c008093f522c] 
kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
  [   44.377286] [c0179b397b90] [c008093e3618] 
kvm_vcpu_ioctl+0x460/0x850 [kvm]
  [   44.377384] [c0179b397d00] [c04ba6c4] do_vfs_ioctl+0xe4/0xb40
  [   44.377464] [c0179b397db0] [c04bb1e4] ksys_ioctl+0xc4/0x110
  [   44.377547] [c0179b397e00] [c04bb258] sys_ioctl+0x28/0x80
  [   44.377628] [c0179b397e20] [c000b888] system_call+0x5c/0x70
  [   44.377712] Instruction dump:
  [   44.377765] 4082fff4 4c00012c 3860 4e800020 e96280c0 896b 2c2b 
3860
  [   44.377862] 4d820020 50852e74 508516f6 78840724  f8a313c8 
7c942ba6 7cbc2ba6

This fixes the problem by only changing r3 when we are returning
immediately.

Signed-off-by: Michael Neuling 
Reported-by: Cédric Le Goater 
--
mpe: This is for 5.2 fixes
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 139027c62d..f781ee1458 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2519,8 +2519,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
LOAD_REG_ADDR(r11, dawr_force_enable)
lbz r11, 0(r11)
cmpdi   r11, 0
+   bne 3f
li  r3, H_HARDWARE
-   beqlr
+   blr
+3:
/* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
rlwimi  r5, r4, 2, DAWRX_WT
-- 
2.21.0
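
The control-flow change in this patch can be modelled in plain C. The sketch
below is only an illustration, not kernel code: struct regs_model,
before_fix() and after_fix() are hypothetical names, r3 stands for the
register holding the pointer that the later emulation code still needs, and
H_HARDWARE is assumed to be -1 as in the kernel's hvcall.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define H_HARDWARE (-1L)	/* assumed value, as in the kernel's hvcall.h */

/* Hypothetical model: r3 plus a flag saying whether the handler returned. */
struct regs_model {
	int64_t r3;
	bool returned;
};

/* Old sequence: "li r3, H_HARDWARE" runs first, then "beqlr". */
static void before_fix(struct regs_model *r, int dawr_force_enable)
{
	r->r3 = H_HARDWARE;			/* always clobbers r3 */
	r->returned = (dawr_force_enable == 0);	/* returns only if zero */
}

/* New sequence: "bne 3f" skips the load, so r3 changes only on return. */
static void after_fix(struct regs_model *r, int dawr_force_enable)
{
	if (dawr_force_enable != 0) {		/* bne 3f */
		r->returned = false;		/* fall through to emulation */
		return;
	}
	r->r3 = H_HARDWARE;			/* li r3, H_HARDWARE */
	r->returned = true;			/* blr */
}
```

In this model, calling before_fix() with dawr_force_enable set leaves r3
holding H_HARDWARE while execution continues, which matches the "corrupted a
pointer in r3" description; after_fix() leaves r3 untouched on that path.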

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Cédric Le Goater

On 12/06/2019 09:22, Michael Neuling wrote:
> In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
> option") I screwed up some assembler and corrupted a pointer in
> r3. This resulted in crashes like the below from Cédric:
> 
>   [   44.374746] BUG: Kernel NULL pointer dereference at 0x13bf
>   [   44.374848] Faulting instruction address: 0xc010b044
>   [   44.374906] Oops: Kernel access of bad area, sig: 11 [#1]
>   [   44.374951] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
> pSeries
>   [   44.375018] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
> iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
> nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
> bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
> iptable_filter bpfilter vmx_crypto crct10dif_vpmsum crc32c_vpmsum kvm_hv kvm 
> sch_fq_codel ip_tables x_tables autofs4 virtio_net net_failover virtio_scsi 
> failover
>   [   44.375401] CPU: 8 PID: 1771 Comm: qemu-system-ppc Kdump: loaded Not 
> tainted 5.2.0-rc4+ #3
>   [   44.375500] NIP:  c010b044 LR: c008089dacf4 CTR: 
> c010aff4
>   [   44.375604] REGS: c0179b397710 TRAP: 0300   Not tainted  (5.2.0-rc4+)
>   [   44.375691] MSR:  8280b033   
> CR: 42244842  XER: 
>   [   44.375815] CFAR: c010aff8 DAR: 13bf DSISR: 4200 
> IRQMASK: 0
>   [   44.375815] GPR00: c008089dd6bc c0179b3979a0 c00808a04300 
> 
>   [   44.375815] GPR04:  0003 2444b05d 
> c017f11c45d0
>   [   44.375815] GPR08: 07803e018dfe 0028 0001 
> 0075
>   [   44.375815] GPR12: c010aff4 c7ff6300  
> 
>   [   44.375815] GPR16:  c017f11d  
> c017f11ca7a8
>   [   44.375815] GPR20: c017f11c42ec   
> 000a
>   [   44.375815] GPR24: fffc  c017f11c 
> c1a77ed8
>   [   44.375815] GPR28: c0179af7 fffc c008089ff170 
> c0179ae88540
>   [   44.376673] NIP [c010b044] kvmppc_h_set_dabr+0x50/0x68
>   [   44.376754] LR [c008089dacf4] kvmppc_pseries_do_hcall+0xa3c/0xeb0 
> [kvm_hv]
>   [   44.376849] Call Trace:
>   [   44.376886] [c0179b3979a0] [c017f11c] 0xc017f11c 
> (unreliable)
>   [   44.376982] [c0179b397a10] [c008089dd6bc] 
> kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
>   [   44.377084] [c0179b397ae0] [c008093f8bcc] 
> kvmppc_vcpu_run+0x34/0x48 [kvm]
>   [   44.377185] [c0179b397b00] [c008093f522c] 
> kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
>   [   44.377286] [c0179b397b90] [c008093e3618] 
> kvm_vcpu_ioctl+0x460/0x850 [kvm]
>   [   44.377384] [c0179b397d00] [c04ba6c4] do_vfs_ioctl+0xe4/0xb40
>   [   44.377464] [c0179b397db0] [c04bb1e4] ksys_ioctl+0xc4/0x110
>   [   44.377547] [c0179b397e00] [c04bb258] sys_ioctl+0x28/0x80
>   [   44.377628] [c0179b397e20] [c000b888] system_call+0x5c/0x70
>   [   44.377712] Instruction dump:
>   [   44.377765] 4082fff4 4c00012c 3860 4e800020 e96280c0 896b 
> 2c2b 3860
>   [   44.377862] 4d820020 50852e74 508516f6 78840724  f8a313c8 
> 7c942ba6 7cbc2ba6
> 
> This fixes the problem by only changing r3 when we are returning
> immediately.
> 
> Signed-off-by: Michael Neuling 
> Reported-by: Cédric Le Goater 

On nested, I still see : 

[   94.609274] Oops: Exception in kernel mode, sig: 4 [#1]
[   94.609432] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[   94.609596] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter vmx_crypto kvm_hv crct10dif_vpmsum crc32c_vpmsum kvm 
sch_fq_codel ip_tables x_tables autofs4 virtio_net virtio_scsi net_failover 
failover
[   94.610179] CPU: 12 PID: 2026 Comm: qemu-system-ppc Kdump: loaded Not 
tainted 5.2.0-rc4+ #6
[   94.610290] NIP:  c010b050 LR: c00808bbacf4 CTR: c010aff4
[   94.610400] REGS: c017913d7710 TRAP: 0700   Not tainted  (5.2.0-rc4+)
[   94.610493] MSR:  8284b033   CR: 
42224842  XER: 
[   94.610671] CFAR: c010b030 IRQMASK: 0 
[   94.610671] GPR00: c00808bbd6bc c017913d79a0 c00808be4300 
c01791376220 
[   94.610671] GPR04:  0003 f679892e 
c017911045d0 
[   94.610671] GPR08: 07803e018dfe 0028 0001 
0075 
[   94.610671] GPR12: c010aff4 c7ff1300  
 
[   94.610671] GPR16:  c0179111

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Christophe Leroy





On 12/06/2019 at 11:23, Paul Mackerras wrote:
> On Wed, Jun 12, 2019 at 09:42:52AM +0200, Christophe Leroy wrote:
>>
>> On 12/06/2019 at 09:22, Michael Neuling wrote:
>>> In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
>>> option") I screwed up some assembler and corrupted a pointer in
>>> r3. This resulted in crashes like the below from Cédric:
>>
>> Iaw Documentation/process/submitting-patches.rst:
>>
>> Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
>> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
>> to do frotz", as if you are giving orders to the codebase to change
>> its behaviour.
>>
>> So you could rephrase as follows for instance:
>>
>> Commit  ("") screwed up some assembler 
>
> That advice in submitting-patches.rst is certainly appropriate when
> talking about the actual change that the patch makes.  However, it is
> also appropriate to give descriptive background material that helps
> the reader to understand why the change is necessary -- in this case,
> where and how the bug was introduced.  So I'm going to support Mikey
> as regards his first few paragraphs.

Does it really matter knowing that it is Mikey who screwed up the
assembler? For me what's important is to know which commit introduced
the error, not who made the error, isn't it?

Christophe

> I agree that the last paragraph that says "This fixes the bug by ..."
> could be reworded as "Fix the bug by ...".
>
> Paul.

Re: [PATCH kernel v3 0/3] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-06-12 Thread Oliver O'Halloran

On Wed, Jun 12, 2019 at 3:06 PM Shawn Anastasio  wrote:
>
> On 6/5/19 11:11 PM, Shawn Anastasio wrote:
> > On 5/30/19 2:03 AM, Alexey Kardashevskiy wrote:
> >> This is an attempt to allow DMA masks between 32..59 which are not large
> >> enough to use either a PHB3 bypass mode or a sketchy bypass. Depending
> >> on the max order, up to 40 is usually available.
> >>
> >>
> >> This is based on v5.2-rc2.
> >>
> >> Please comment. Thanks.
> >
> > I have tested this patch set with an AMD GPU that's limited to <64bit
> > DMA (I believe it's 40 or 42 bit). It successfully allows the card to
> > operate without falling back to 32-bit DMA mode as it does without
> > the patches.
> >
> > Relevant kernel log message:
> > ```
> > [0.311211] pci 0033:01 : [PE# 00] Enabling 64-bit DMA bypass
> > ```
> >
> > Tested-by: Shawn Anastasio 
>
> After a few days of further testing, I've started to run into stability
> issues with the patch applied and used with an AMD GPU. Specifically,
> the system sometimes spontaneously crashes. Not just EEH errors either,
> the whole system shuts down in what looks like a checkstop.

Any specific workload? Checkstops are harder to debug without a system
in the failed state so we'd need to replicate that locally to get a
decent idea what's up.

> Perhaps some subtle corruption is occurring?

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Christophe Leroy





On 12/06/2019 at 09:22, Michael Neuling wrote:

In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
option") I screwed up some assembler and corrupted a pointer in
r3. This resulted in crashes like the below from Cédric:


Iaw Documentation/process/submitting-patches.rst:

Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour.

So you could rephrase as follows for instance:

Commit  ("") screwed up some assembler 



   [   44.374746] BUG: Kernel NULL pointer dereference at 0x13bf
   [   44.374848] Faulting instruction address: 0xc010b044
   [   44.374906] Oops: Kernel access of bad area, sig: 11 [#1]
   [   44.374951] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
pSeries
   [   44.375018] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter vmx_crypto crct10dif_vpmsum crc32c_vpmsum kvm_hv kvm 
sch_fq_codel ip_tables x_tables autofs4 virtio_net net_failover virtio_scsi 
failover
   [   44.375401] CPU: 8 PID: 1771 Comm: qemu-system-ppc Kdump: loaded Not 
tainted 5.2.0-rc4+ #3
   [   44.375500] NIP:  c010b044 LR: c008089dacf4 CTR: 
c010aff4
   [   44.375604] REGS: c0179b397710 TRAP: 0300   Not tainted  (5.2.0-rc4+)
   [   44.375691] MSR:  8280b033   CR: 
42244842  XER: 
   [   44.375815] CFAR: c010aff8 DAR: 13bf DSISR: 4200 
IRQMASK: 0
   [   44.375815] GPR00: c008089dd6bc c0179b3979a0 c00808a04300 

   [   44.375815] GPR04:  0003 2444b05d 
c017f11c45d0
   [   44.375815] GPR08: 07803e018dfe 0028 0001 
0075
   [   44.375815] GPR12: c010aff4 c7ff6300  

   [   44.375815] GPR16:  c017f11d  
c017f11ca7a8
   [   44.375815] GPR20: c017f11c42ec   
000a
   [   44.375815] GPR24: fffc  c017f11c 
c1a77ed8
   [   44.375815] GPR28: c0179af7 fffc c008089ff170 
c0179ae88540
   [   44.376673] NIP [c010b044] kvmppc_h_set_dabr+0x50/0x68
   [   44.376754] LR [c008089dacf4] kvmppc_pseries_do_hcall+0xa3c/0xeb0 
[kvm_hv]
   [   44.376849] Call Trace:
   [   44.376886] [c0179b3979a0] [c017f11c] 0xc017f11c 
(unreliable)
   [   44.376982] [c0179b397a10] [c008089dd6bc] 
kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
   [   44.377084] [c0179b397ae0] [c008093f8bcc] 
kvmppc_vcpu_run+0x34/0x48 [kvm]
   [   44.377185] [c0179b397b00] [c008093f522c] 
kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
   [   44.377286] [c0179b397b90] [c008093e3618] 
kvm_vcpu_ioctl+0x460/0x850 [kvm]
   [   44.377384] [c0179b397d00] [c04ba6c4] do_vfs_ioctl+0xe4/0xb40
   [   44.377464] [c0179b397db0] [c04bb1e4] ksys_ioctl+0xc4/0x110
   [   44.377547] [c0179b397e00] [c04bb258] sys_ioctl+0x28/0x80
   [   44.377628] [c0179b397e20] [c000b888] system_call+0x5c/0x70
   [   44.377712] Instruction dump:
   [   44.377765] 4082fff4 4c00012c 3860 4e800020 e96280c0 896b 
2c2b 3860
   [   44.377862] 4d820020 50852e74 508516f6 78840724  f8a313c8 
7c942ba6 7cbc2ba6

This fixes the problem by only changing r3 when we are returning
immediately.

Signed-off-by: Michael Neuling 
Reported-by: Cédric Le Goater 
--
mpe: This is for 5.2 fixes


Then your commit log should include the following:

Fixes: c1fe190c0672 ("powerpc: Add force enable of DAWR on P9 option")

Christophe


---
  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 139027c62d..f781ee1458 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2519,8 +2519,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
LOAD_REG_ADDR(r11, dawr_force_enable)
lbz r11, 0(r11)
cmpdi   r11, 0
+   bne 3f
li  r3, H_HARDWARE
-   beqlr
+   blr
+3:
/* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
rlwimi  r5, r4, 2, DAWRX_WT

Re: [PATCH v8 2/7] x86/dma: use IS_ENABLED() to simplify the code

2019-06-12 Thread Leizhen (ThunderTown)




On 2019/6/12 13:16, Borislav Petkov wrote:
> On Thu, May 30, 2019 at 11:48:26AM +0800, Zhen Lei wrote:
>> This patch removes the ifdefs around CONFIG_IOMMU_DEFAULT_PASSTHROUGH to
>> improve readability.
> 
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.

OK, thanks.

> 
> Also, do
> 
> $ git grep 'This patch' Documentation/process
> 
> for more details.
> 
>> Signed-off-by: Zhen Lei 
>> ---
>>  arch/x86/kernel/pci-dma.c | 7 ++-
>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index dcd272dbd0a9330..9f2b19c35a060df 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -43,11 +43,8 @@
>>   * It is also possible to disable by default in kernel config, and enable 
>> with
>>   * iommu=nopt at boot time.
>>   */
>> -#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
>> -int iommu_pass_through __read_mostly = 1;
>> -#else
>> -int iommu_pass_through __read_mostly;
>> -#endif
>> +int iommu_pass_through __read_mostly =
>> +IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);
> 
> Let that line stick out.

OK, I will merge them on the same line.

> 
> Thx.
>
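
How an IS_ENABLED()-style macro can turn "defined to 1" or "not defined"
into a constant 1 or 0 can be sketched in user-space C. This is a simplified
illustration modelled on the preprocessor trick in the kernel's
include/linux/kconfig.h, not a copy of it: all SK_* names and the
CONFIG_DEMO_* options are hypothetical, only the built-in case is handled
(the real macro also checks the _MODULE variant), and an extra trailing 0 is
passed so the variadic argument is never empty.

```c
/* If the option expands to 1, this pastes into "0," and adds an argument. */
#define SK_PLACEHOLDER_1 0,
#define SK_TAKE_SECOND(ignored, val, ...) val

/* Expand the option first, then paste it onto the placeholder prefix. */
#define SK_IS_SET(x) SK_IS_SET_(x)
#define SK_IS_SET_(val) SK_IS_SET__(SK_PLACEHOLDER_##val)

/*
 * Defined option:   SK_TAKE_SECOND(0, 1, 0, 0)    -> 1
 * Undefined option: SK_TAKE_SECOND(junk 1, 0, 0)  -> 0
 */
#define SK_IS_SET__(arg1_or_junk) SK_TAKE_SECOND(arg1_or_junk 1, 0, 0)

#define CONFIG_DEMO_PASSTHROUGH 1
/* CONFIG_DEMO_OTHER is intentionally left undefined. */

/* One initializer replaces the #ifdef / #else / #endif pair. */
int demo_pass_through = SK_IS_SET(CONFIG_DEMO_PASSTHROUGH);
int demo_other = SK_IS_SET(CONFIG_DEMO_OTHER);
```

Because the macro expands to an integer constant, it can be used directly in
a static initializer, which is what lets the patch drop the ifdef block.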

Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-12 Thread Oliver O'Halloran

On Wed, Jun 12, 2019 at 3:25 AM Oded Gabbay  wrote:
>
> On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay  wrote:
> >
> > On Tue, Jun 11, 2019 at 6:26 PM Greg KH  wrote:
> > > *snip*
> >
> > Now, when I tried to integrate Goya into a POWER9 machine, I got a
> > reject from the call to pci_set_dma_mask(pdev, 48). The standard code,
> > as I wrote above, is to call the same function with 32-bits. That
> > works BUT it is not practical, as our applications require much more
> > memory mapped than 32-bits.

Setting a 48 bit DMA mask doesn't work today because we only allocate
IOMMU tables to cover the 0..2GB range of PCI bus addresses. Alexey
has some patches to expand that range so we can support devices that
can't hit the 64 bit bypass window. You need:

This fix: http://patchwork.ozlabs.org/patch/1113506/
This series: 
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110810

Give that a try and see if the IOMMU overhead is tolerable.

> >In addition, once you add more cards which
> > are all mapped to the same range, it is simply not usable at all.

Each IOMMU group should have a separate bus address space and separate
cards shouldn't be in the same IOMMU group. If they are then there's
something up.

Oliver

[PATCH net-next] defconfigs: remove obsolete CONFIG_INET_XFRM_MODE_* and CONFIG_INET6_XFRM_MODE_*

2019-06-12 Thread YueHaibing

These Kconfig options have been removed in
commit 4c145dce2601 ("xfrm: make xfrm modes builtin"),
so there is no point in keeping them in defconfigs any longer.

Signed-off-by: YueHaibing 
---
 arch/arc/configs/axs101_defconfig   | 3 ---
 arch/arc/configs/axs103_defconfig   | 3 ---
 arch/arc/configs/axs103_smp_defconfig   | 3 ---
 arch/arc/configs/haps_hs_defconfig  | 3 ---
 arch/arc/configs/haps_hs_smp_defconfig  | 3 ---
 arch/arc/configs/nps_defconfig  | 3 ---
 arch/arc/configs/nsimosci_hs_smp_defconfig  | 3 ---
 arch/arc/configs/tb10x_defconfig| 3 ---
 arch/arm/configs/acs5k_tiny_defconfig   | 3 ---
 arch/arm/configs/am200epdkit_defconfig  | 3 ---
 arch/arm/configs/aspeed_g4_defconfig| 6 --
 arch/arm/configs/aspeed_g5_defconfig| 6 --
 arch/arm/configs/at91_dt_defconfig  | 6 --
 arch/arm/configs/cm_x300_defconfig  | 3 ---
 arch/arm/configs/efm32_defconfig| 3 ---
 arch/arm/configs/ep93xx_defconfig   | 3 ---
 arch/arm/configs/ezx_defconfig  | 3 ---
 arch/arm/configs/h5000_defconfig| 3 ---
 arch/arm/configs/imote2_defconfig   | 3 ---
 arch/arm/configs/imx_v4_v5_defconfig| 3 ---
 arch/arm/configs/imx_v6_v7_defconfig| 3 ---
 arch/arm/configs/iop13xx_defconfig  | 3 ---
 arch/arm/configs/iop32x_defconfig   | 3 ---
 arch/arm/configs/iop33x_defconfig   | 3 ---
 arch/arm/configs/keystone_defconfig | 3 ---
 arch/arm/configs/lpc18xx_defconfig  | 3 ---
 arch/arm/configs/lpc32xx_defconfig  | 3 ---
 arch/arm/configs/lpd270_defconfig   | 3 ---
 arch/arm/configs/magician_defconfig | 3 ---
 arch/arm/configs/mini2440_defconfig | 3 ---
 arch/arm/configs/moxart_defconfig   | 3 ---
 arch/arm/configs/mps2_defconfig | 3 ---
 arch/arm/configs/mxs_defconfig  | 3 ---
 arch/arm/configs/omap1_defconfig| 3 ---
 arch/arm/configs/palmz72_defconfig  | 3 ---
 arch/arm/configs/pcm027_defconfig   | 3 ---
 arch/arm/configs/pxa3xx_defconfig   | 3 ---
 arch/arm/configs/qcom_defconfig | 3 ---
 arch/arm/configs/rpc_defconfig  | 6 --
 arch/arm/configs/s3c2410_defconfig  | 1 -
 arch/arm/configs/sama5_defconfig| 6 --
 arch/arm/configs/sunxi_defconfig| 3 ---
 arch/arm/configs/tango4_defconfig   | 3 ---
 arch/arm/configs/tegra_defconfig| 2 --
 arch/arm/configs/xcep_defconfig | 3 ---
 arch/hexagon/configs/comet_defconfig| 3 ---
 arch/m68k/configs/amcore_defconfig  | 3 ---
 arch/m68k/configs/m5208evb_defconfig| 3 ---
 arch/m68k/configs/m5249evb_defconfig| 3 ---
 arch/m68k/configs/m5272c3_defconfig | 3 ---
 arch/m68k/configs/m5275evb_defconfig| 3 ---
 arch/m68k/configs/m5307c3_defconfig | 3 ---
 arch/m68k/configs/m5407c3_defconfig | 3 ---
 arch/mips/configs/ar7_defconfig | 3 ---
 arch/mips/configs/ath25_defconfig   | 3 ---
 arch/mips/configs/ath79_defconfig   | 3 ---
 arch/mips/configs/bcm63xx_defconfig | 3 ---
 arch/mips/configs/bigsur_defconfig  | 3 ---
 arch/mips/configs/bmips_be_defconfig| 3 ---
 arch/mips/configs/bmips_stb_defconfig   | 3 ---
 arch/mips/configs/capcella_defconfig| 3 ---
 arch/mips/configs/ci20_defconfig| 3 ---
 arch/mips/configs/db1xxx_defconfig  | 1 -
 arch/mips/configs/decstation_64_defconfig   | 4 
 arch/mips/configs/decstation_defconfig  | 4 
 arch/mips/configs/decstation_r4k_defconfig  | 4 
 arch/mips/configs/fuloong2e_defconfig   | 2 --
 arch/mips/configs/gpr_defconfig | 3 ---
 arch/mips/configs/ip22_defconfig| 4 
 arch/mips/configs/ip27_defconfig| 7 ---
 arch/mips/configs/ip28_defconfig| 3 ---
 arch/mips/configs/jazz_defconfig| 2 --
 arch/mips/configs/jmr3927_defconfig | 3 ---
 arch/mips/configs/lasat_defconfig   | 3 ---
 arch/mips/configs/lemote2f_defconfig| 3 ---
 arch/mips/configs/loongson1b_defconfig  | 3 ---
 arch/mips/configs/loongson1c_defconfig  | 3 ---
 arch/mips/configs/malta_defconfig   | 2 --
 arch/mips/configs/malta_kvm_defconfig   | 2 --
 arch/mips/configs/malta_kvm_guest_defconfig | 2 --
 arch/mips/configs/maltaup_xpa_defconfig | 2 --
 arch/mips/configs/markeins_defconfig| 4 
 arch/mips/configs/mpc30x_defconfig  | 3 ---
 arch/mips/configs/mtx1_defconfig| 4

Re: [PATCH net-next] defconfigs: remove obsolete CONFIG_INET_XFRM_MODE_* and CONFIG_INET6_XFRM_MODE_*

2019-06-12 Thread Yuehaibing

Pls ignore this, will fix and resend.

On 2019/6/12 15:06, YueHaibing wrote:
> These Kconfig options have been removed in
> commit 4c145dce2601 ("xfrm: make xfrm modes builtin"),
> so there is no point in keeping them in defconfigs any longer.
> 
> Signed-off-by: YueHaibing 
> ---

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Paul Mackerras

On Wed, Jun 12, 2019 at 09:42:52AM +0200, Christophe Leroy wrote:
> 
> 
> On 12/06/2019 at 09:22, Michael Neuling wrote:
> >In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
> >option") I screwed up some assembler and corrupted a pointer in
> >r3. This resulted in crashes like the below from Cédric:
> 
> Iaw Documentation/process/submitting-patches.rst:
> 
> Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
> to do frotz", as if you are giving orders to the codebase to change
> its behaviour.
> 
> So you could rephrase as follows for instance:
> 
> Commit  ("") screwed up some assembler 

That advice in submitting-patches.rst is certainly appropriate when
talking about the actual change that the patch makes.  However, it is
also appropriate to give descriptive background material that helps
the reader to understand why the change is necessary -- in this case,
where and how the bug was introduced.  So I'm going to support Mikey
as regards his first few paragraphs.

I agree that the last paragraph that says "This fixes the bug by ..."
could be reworded as "Fix the bug by ...".

Paul.

Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-12 Thread Benjamin Herrenschmidt

On Wed, 2019-06-12 at 15:45 +1000, Oliver O'Halloran wrote:
> 
> Also, are you sure about the MSI thing? The IODA3 spec says the only
> important bits for a 64bit MSI are bits 61:60 (to hit the window) and
> the lower bits that determine what IVE to use. Everything in between
> is ignored so ORing in bit 59 shouldn't break anything.

On IODA3... could be different on another system. My point is you can't
just have a fixed setting for all top bits for DMA & MSIs.

> > This will only work as long as all of the system memory can be
> > addressed at an offset from that fixed address that itself fits your
> > device addressing capabilities (50 bits in this case). It may or may
> > not be the case but there's no way to check since the DMA mask logic
> > won't really apply.
> > 
> > You might want to consider fixing your HW in the next iteration... This
> > is going to bite you when x86 increases the max physical memory for
> > example, or on other architectures.
> 
> Yes, do this. The easiest way to avoid this sort of weird hack is to
> just design the PCIe interface to the spec in the first place.

Ben.

Re: sys_exit: NR -1

2019-06-12 Thread Naveen N. Rao


Paul Clarke wrote:

What are the circumstances in which raw_syscalls:sys_exit reports "-1" for the 
syscall ID?

perf  5375 [007] 59632.478528:   raw_syscalls:sys_enter: NR 1 (3, 9fb888, 8, 2d83740, 1, 7)
perf  5375 [007] 59632.478532:   raw_syscalls:sys_exit: NR 1 = 8
perf  5375 [007] 59632.478538:   raw_syscalls:sys_enter: NR 15 (11, 7ca734b0, 7ca73380, 2d83740, 1, 7)
perf  5375 [007] 59632.478539:   raw_syscalls:sys_exit: NR -1 = 8
perf  5375 [007] 59632.478543:   raw_syscalls:sys_enter: NR 16 (4, 2401, 0, 2d83740, 1, 0)
perf  5375 [007] 59632.478551:   raw_syscalls:sys_exit: NR 16 = 0


Which architecture?
For powerpc, see:

static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
{
	/*
	 * Note that we are returning an int here. That means 0xffffffff, ie.
	 * 32-bit negative 1, will be interpreted as -1 on a 64-bit kernel.
	 * This is important for seccomp so that compat tasks can set r0 = -1
	 * to reject the syscall.
	 */
	return TRAP(regs) == 0xc00 ? regs->gpr[0] : -1;
}


- Naveen
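
The point made in the function's comment can be checked in userspace. The following is a minimal sketch, not kernel code: nr_from_reg is a hypothetical name, and it assumes a two's-complement target where narrowing to int32_t wraps (which holds for GCC and Clang).

```c
#include <stdint.h>

/*
 * Userspace illustration only: model a 64-bit register value being handed
 * back through a function whose return type is int.
 */
static int nr_from_reg(uint64_t gpr0)
{
	/* Keep the low 32 bits, then reinterpret them as a signed value. */
	return (int32_t)(uint32_t)gpr0;
}
```

Under those assumptions nr_from_reg(0xffffffff) yields -1, while an ordinary syscall number such as 15 passes through unchanged.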

Re: [PATCH v3 1/3] powerpc/powernv: Add OPAL API interface to get secureboot state

2019-06-12 Thread Daniel Axtens

Nayna Jain  writes:

> From: Claudio Carvalho 
>
> The X.509 certificates trusted by the platform and other information
> required to secure boot the OS kernel are wrapped in secure variables,
> which are controlled by OPAL.
>
> This patch adds support to read OPAL secure variables through
> OPAL_SECVAR_GET call. It returns the metadata and data for a given secure
> variable based on the unique key.
>
> Since OPAL can support different types of backend which can vary in the
> variable interpretation, a new OPAL API call named OPAL_SECVAR_BACKEND, is
> added to retrieve the supported backend version. This helps the consumer
> to know how to interpret the variable.
>

(Firstly, apologies that I haven't got around to asking about this yet!)

Are pluggable/versioned backends a good idea?

There are a few things that worry me about the idea:

 - It adds complexity in crypto (or crypto-adjacent) code, and that
   increases the likelihood that we'll accidentally add a bug with bad
   consequences.

 - Under what circumstances would we change the kernel-visible
   behaviour of skiboot? Are we expecting to change the behaviour,
   content or names of the variables in future? Otherwise the only
   relevant change I can think of is a change to hardware platforms, and
   I'm not sure how a change in hardware would lead to change in
   behaviour in the kernel. Wouldn't Skiboot hide h/w differences?

 - If we are worried about a long-term-future change to how secure-boot
   works, would it be better to just add more get/set calls to opal at
   the point at which we actually implement the new system?
   
 - UEFI added EFI_VARIABLE_AUTHENTICATION_3 in a way that - as far
   as I know - didn't break backwards compatibility. Is there a reason
   we cannot add features that way instead? (It also dropped v1 of the
   authentication header.)
   
 - What is the correct fallback behaviour if a kernel receives a result
   that it does not expect? If a kernel expecting BackendV1 is instead
   informed that it is running on BackendV2, then it cannot access the
   secure variable at all, so it cannot load keys that are potentially
   required to successfully boot (e.g. to validate the module for
   network card or graphics!)

Kind regards,
Daniel
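
The fallback question in the last point can be made concrete with a small sketch. It is an assumption-laden illustration, not the patch's behaviour: the enum values are copied from the opal-secvar.h hunk quoted further down, while secvar_backend_check and the choice of error codes are hypothetical.

```c
#include <errno.h>

/* Values as they appear in the quoted opal-secvar.h. */
enum {
	BACKEND_NONE = 0,
	BACKEND_TC_COMPAT_V1,
};

/*
 * Hypothetical consumer-side check: return 0 if the kernel knows how to
 * interpret variables from this backend, or a negative errno otherwise.
 */
static int secvar_backend_check(unsigned long backend)
{
	switch (backend) {
	case BACKEND_TC_COMPAT_V1:
		return 0;		/* known format, safe to parse */
	case BACKEND_NONE:
		return -ENODEV;		/* no secure variable backend */
	default:
		return -EOPNOTSUPP;	/* newer backend, format unknown */
	}
}
```

In this sketch a kernel that only knows BACKEND_TC_COMPAT_V1 has no option but to refuse any later version, which is exactly the situation where keys needed for boot become unreadable.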

> This support can be enabled using CONFIG_OPAL_SECVAR
>
> Signed-off-by: Claudio Carvalho 
> Signed-off-by: Nayna Jain 
> ---
> This patch depends on a new OPAL call that is being added to skiboot.
> The patch set that implements the new call has been posted to
> https://patchwork.ozlabs.org/project/skiboot/list/?series=112868
>
>  arch/powerpc/include/asm/opal-api.h  |  4 +-
>  arch/powerpc/include/asm/opal-secvar.h   | 23 ++
>  arch/powerpc/include/asm/opal.h  |  6 ++
>  arch/powerpc/platforms/powernv/Kconfig   |  6 ++
>  arch/powerpc/platforms/powernv/Makefile  |  1 +
>  arch/powerpc/platforms/powernv/opal-call.c   |  2 +
>  arch/powerpc/platforms/powernv/opal-secvar.c | 85 
>  7 files changed, 126 insertions(+), 1 deletion(-)
>  create mode 100644 arch/powerpc/include/asm/opal-secvar.h
>  create mode 100644 arch/powerpc/platforms/powernv/opal-secvar.c
>
> diff --git a/arch/powerpc/include/asm/opal-api.h 
> b/arch/powerpc/include/asm/opal-api.h
> index e1577cfa7186..a505e669b4b6 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -212,7 +212,9 @@
>  #define OPAL_HANDLE_HMI2 166
>  #define  OPAL_NX_COPROC_INIT 167
>  #define OPAL_XIVE_GET_VP_STATE   170
> -#define OPAL_LAST170
> +#define OPAL_SECVAR_GET 173
> +#define OPAL_SECVAR_BACKEND 177
> +#define OPAL_LAST177
>  
>  #define QUIESCE_HOLD 1 /* Spin all calls at entry */
>  #define QUIESCE_REJECT   2 /* Fail all calls with 
> OPAL_BUSY */
> diff --git a/arch/powerpc/include/asm/opal-secvar.h 
> b/arch/powerpc/include/asm/opal-secvar.h
> new file mode 100644
> index ..b677171a0368
> --- /dev/null
> +++ b/arch/powerpc/include/asm/opal-secvar.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * PowerNV definitions for secure variables OPAL API.
> + *
> + * Copyright (C) 2019 IBM Corporation
> + * Author: Claudio Carvalho 
> + *
> + */
> +#ifndef OPAL_SECVAR_H
> +#define OPAL_SECVAR_H
> +
> +enum {
> + BACKEND_NONE = 0,
> + BACKEND_TC_COMPAT_V1,
> +};
> +
> +extern int opal_get_variable(u8 *key, unsigned long ksize,
> +  u8 *metadata, unsigned long *mdsize,
> +  u8 *data, unsigned long *dsize);
> +
> +extern int opal_variable_version(unsigned long *backend);
> +
> +#endif
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 4cc37e708bc7..57d2c2356eda 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++

Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-12 Thread Benjamin Herrenschmidt

On Wed, 2019-06-12 at 09:25 +0300, Oded Gabbay wrote:
> 
> > You can't. Your device is broken. Devices that don't support DMAing to
> > the full 64-bit deserve to be added to the trash pile.
> > 
> 
> Hmm... right now they are added to customers' data centers but what do I know ;)

Well, some customers don't know they are being sold a lemon :)

> > As a result, getting it to work will require hacks. Some GPUs have
> > similar issues and require similar hacks, it's unfortunate.
> > 
> > Added a couple of guys on CC who might be able to help get those hacks
> > right.
> 
> Thanks :)
> > 
> > It's still very fishy .. the idea is to detect the case where setting a
> > 64-bit mask will give your system memory mapped at a fixed high address
> > (1 << 59 in our case) and program that in your chip in the "Fixed high
> > bits" register that you seem to have (also make sure it doesn't affect
> > MSIs or it will break them).
> 
> MSI-X are working. The set of bit 59 doesn't apply to MSI-X
> transactions (AFAICS from the PCIe controller spec we have).

Ok.

> > This will only work as long as all of the system memory can be
> > addressed at an offset from that fixed address that itself fits your
> > device addressing capabilities (50 bits in this case). It may or may
> > not be the case but there's no way to check since the DMA mask logic
> > won't really apply.
> 
> Understood. In the specific system we are integrated to, that is the
> case - we have less than 48 bits. But, as you pointed out, it is not a
> generic solution but with my H/W I can't give a generic fit-all
> solution for POWER9. I'll settle for the best that I can do.
> 
> > 
> > You might want to consider fixing your HW in the next iteration... This
> > is going to bite you when x86 increases the max physical memory for
> > example, or on other architectures.
> 
> Understood and taken care of.

Cheers,
Ben.

> > 
> > Cheers,
> > Ben.
> > 
> > 
> > 
> >

Re: [PATCH] cxl: no need to check return value of debugfs_create functions

2019-06-12 Thread Arnd Bergmann

On Tue, Jun 11, 2019 at 8:13 PM Greg Kroah-Hartman
 wrote:

> @@ -64,8 +64,6 @@ int cxl_debugfs_adapter_add(struct cxl *adapter)
>
> snprintf(buf, 32, "card%i", adapter->adapter_num);
> dir = debugfs_create_dir(buf, cxl_debugfs);
> -   if (IS_ERR(dir))
> -   return PTR_ERR(dir);
> adapter->debugfs = dir;
>

Should the check for 'cxl_debugfs' get removed here as well?
If that is null, we might put the subdir in the wrong place in the
tree, but that would otherwise be harmless as well, and the
same thing happens if 'dir' is NULL above and we add the
files in the debugfs root later (losing the ability to clean up
afterwards).

int cxl_debugfs_adapter_add(struct cxl *adapter)
{
	struct dentry *dir;
	char buf[32];

	if (!cxl_debugfs)
		return -ENODEV;

It's still a bit odd to return an error, since the caller then just
ignores the return code anyway:

/* Don't care if this one fails: */
cxl_debugfs_adapter_add(adapter);

It would seem best to change the return type to 'void' here for
consistency.

 Arnd

Re: [PATCH] powerpc/64s: Fix misleading SPR and timebase information

2019-06-12 Thread Zhangshaokun

Hi Michael,

A gentle ping.

On 2019/5/29 17:21, Shaokun Zhang wrote:
> pr_info shows SPR and timebase as a decimal value with a '0x'
> prefix, which is somewhat misleading.
> 
> Fix it to print hexadecimal, as was intended.
> 
> Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Signed-off-by: Shaokun Zhang 
> ---
>  arch/powerpc/platforms/powernv/idle.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/idle.c 
> b/arch/powerpc/platforms/powernv/idle.c
> index c9133f7908ca..77f2e0a4ee37 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -1159,10 +1159,10 @@ static void __init pnv_power9_idle_init(void)
>   pnv_deepest_stop_psscr_mask);
>   }
>  
> - pr_info("cpuidle-powernv: First stop level that may lose SPRs = 
> 0x%lld\n",
> + pr_info("cpuidle-powernv: First stop level that may lose SPRs = 
> 0x%llx\n",
>   pnv_first_spr_loss_level);
>  
> - pr_info("cpuidle-powernv: First stop level that may lose timebase = 
> 0x%lld\n",
> + pr_info("cpuidle-powernv: First stop level that may lose timebase = 
> 0x%llx\n",
>   pnv_first_tb_loss_level);
>  }
>  
>
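
The difference between the two format strings is easy to reproduce in userspace. A minimal sketch, assuming an illustrative value of 16; format_level is a hypothetical helper and is not part of the patch.

```c
#include <stdio.h>

/*
 * Format `level` with the old specifier ("0x%lld", decimal digits behind a
 * hex prefix) and with the corrected one ("0x%llx").
 */
static void format_level(unsigned long long level, char *old_buf,
			 char *new_buf, size_t len)
{
	snprintf(old_buf, len, "0x%lld", (long long)level);
	snprintf(new_buf, len, "0x%llx", level);
}
```

For level = 16 the old string reads "0x16", which a reader would take to mean 22, while the corrected string reads "0x10".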

Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-12 Thread Oliver O'Halloran

On Wed, Jun 12, 2019 at 4:53 PM Christoph Hellwig  wrote:
>
> On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> > Setting a 48 bit DMA mask doesn't work today because we only allocate
> > IOMMU tables to cover the 0..2GB range of PCI bus addresses.
>
> I don't think that is true upstream, and if it is we need to fix the bug
> in the powerpc code.  powerpc should be falling back to treating a 48-bit
> dma mask like a 32-bit one at least, that is, use dynamic iommu mappings
> instead of using the direct mapping.  And from my reading of
> arch/powerpc/kernel/dma-iommu.c that is exactly what it does.

This is more or less what Alexey's patches fix. The IOMMU table
allocated for the 32bit DMA window is only sized for 2GB in the
platform code, see pnv_pci_ioda2_setup_default_config().

Re: [PATCH] cxl: no need to check return value of debugfs_create functions

2019-06-12 Thread Greg Kroah-Hartman

On Wed, Jun 12, 2019 at 11:51:21AM +0200, Arnd Bergmann wrote:
> On Tue, Jun 11, 2019 at 8:13 PM Greg Kroah-Hartman
>  wrote:
> 
> > @@ -64,8 +64,6 @@ int cxl_debugfs_adapter_add(struct cxl *adapter)
> >
> > snprintf(buf, 32, "card%i", adapter->adapter_num);
> > dir = debugfs_create_dir(buf, cxl_debugfs);
> > -   if (IS_ERR(dir))
> > -   return PTR_ERR(dir);
> > adapter->debugfs = dir;
> >
> 
> Should the check for 'cxl_debugfs' get removed here as well?

Maybe, I could not determine from the logic whether those functions could
be called before cxl_debugfs was ever set.

And debugfs_create_dir() will not return a NULL value if an error
happens, so no need to worry about files being created in the wrong
place.

> If that is null, we might put the subdir in the wrong place in the
> tree, but that would otherwise be harmless as well, and the
> same thing happens if 'dir' is NULL above and we add the
> files in the debugfs root later (losing the ability to clean up
> afterwards).
> 
> int cxl_debugfs_adapter_add(struct cxl *adapter)
> {
> 	struct dentry *dir;
> 	char buf[32];
>
> 	if (!cxl_debugfs)
> 		return -ENODEV;
> 
> It's still a bit odd to return an error, since the caller then just
> ignores the return code anyway:

Then let's just return nothing.

> /* Don't care if this one fails: */
> cxl_debugfs_adapter_add(adapter);
> 
> It would seem best to change the return type to 'void' here for
> consistency.

I agree, let me go do that.

thanks,

greg k-h

Re: [PATCH] powerpc/pseries: Switch to GFP_ATOMIC allocations in hotplug interrupt handler

2019-06-12 Thread Nathan Lynch

Bharata B Rao  writes:

> queue_hotplug_event() gets called from interrupt handler code. Use
> GFP_ATOMIC allocations instead of GFP_KERNEL.

https://patchwork.ozlabs.org/patch/1106626/

(That version also adds a missing check for the result of the first
kmalloc.)

Re: [PATCH] cxl: no need to check return value of debugfs_create functions

2019-06-12 Thread Greg Kroah-Hartman

On Wed, Jun 12, 2019 at 11:51:21AM +0200, Arnd Bergmann wrote:
> On Tue, Jun 11, 2019 at 8:13 PM Greg Kroah-Hartman
>  wrote:
> 
> > @@ -64,8 +64,6 @@ int cxl_debugfs_adapter_add(struct cxl *adapter)
> >
> > snprintf(buf, 32, "card%i", adapter->adapter_num);
> > dir = debugfs_create_dir(buf, cxl_debugfs);
> > -   if (IS_ERR(dir))
> > -   return PTR_ERR(dir);
> > adapter->debugfs = dir;
> >
> 
> Should the check for 'cxl_debugfs' get removed here as well?
> If that is null, we might put the subdir in the wrong place in the
> tree, but that would otherwise be harmless as well, and the
> same thing happens if 'dir' is NULL above and we add the
> files in the debugfs root later (losing the ability to clean up
> afterwards).

dir can only be NULL if no one has initialized it, debugfs_create_dir()
will never return a null value.  I don't really know the ordering of the
calls here, so I'll keep this as-is for now in case someone is trying to
add a "device" before a directory is initialized.

thanks,

greg k-h
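
The reason an error return from debugfs_create_dir() is never NULL follows from how error pointers are encoded. Below is a userspace sketch of that encoding, modelled on the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers; the lowercase names are stand-ins written for this illustration, not the kernel macros themselves.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* same bound the kernel uses for error pointers */

/* Encode a negative errno as a pointer value near the top of the address space. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointer values in the last MAX_ERRNO bytes of the address space. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

With this encoding err_ptr(-ENODEV) is a non-NULL value for which is_err() is true, while NULL itself is not an error pointer. A NULL in adapter->debugfs therefore means the field was never initialized, not that directory creation failed.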

[PATCH v2] cxl: no need to check return value of debugfs_create functions

2019-06-12 Thread Greg Kroah-Hartman

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Because there's no need to check, also make the return value of the
local debugfs_create_io_x64() call void, as no one ever did anything
with the return value (as they did not need to.)

And make the cxl_debugfs_* calls return void as no one was even checking
their return value at all.

Cc: Frederic Barrat 
Cc: Andrew Donnellan 
Cc: Arnd Bergmann 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
v2: make the return value of all of the cxl_debugfs_* calls void as no
one was checking the return values of them.

 drivers/misc/cxl/cxl.h | 15 ++-
 drivers/misc/cxl/debugfs.c | 36 +++-
 2 files changed, 17 insertions(+), 34 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index a73c9e669d78..5dc0f6093f9d 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -908,11 +908,11 @@ void cxl_update_dedicated_ivtes_psl8(struct cxl_context 
*ctx);
 
 #ifdef CONFIG_DEBUG_FS
 
-int cxl_debugfs_init(void);
+void cxl_debugfs_init(void);
 void cxl_debugfs_exit(void);
-int cxl_debugfs_adapter_add(struct cxl *adapter);
+void cxl_debugfs_adapter_add(struct cxl *adapter);
 void cxl_debugfs_adapter_remove(struct cxl *adapter);
-int cxl_debugfs_afu_add(struct cxl_afu *afu);
+void cxl_debugfs_afu_add(struct cxl_afu *afu);
 void cxl_debugfs_afu_remove(struct cxl_afu *afu);
 void cxl_debugfs_add_adapter_regs_psl9(struct cxl *adapter, struct dentry 
*dir);
 void cxl_debugfs_add_adapter_regs_psl8(struct cxl *adapter, struct dentry 
*dir);
@@ -921,27 +921,24 @@ void cxl_debugfs_add_afu_regs_psl8(struct cxl_afu *afu, 
struct dentry *dir);
 
 #else /* CONFIG_DEBUG_FS */
 
-static inline int __init cxl_debugfs_init(void)
+static inline void __init cxl_debugfs_init(void)
 {
-   return 0;
 }
 
 static inline void cxl_debugfs_exit(void)
 {
 }
 
-static inline int cxl_debugfs_adapter_add(struct cxl *adapter)
+static inline void cxl_debugfs_adapter_add(struct cxl *adapter)
 {
-   return 0;
 }
 
 static inline void cxl_debugfs_adapter_remove(struct cxl *adapter)
 {
 }
 
-static inline int cxl_debugfs_afu_add(struct cxl_afu *afu)
+static inline void cxl_debugfs_afu_add(struct cxl_afu *afu)
 {
-   return 0;
 }
 
 static inline void cxl_debugfs_afu_remove(struct cxl_afu *afu)
diff --git a/drivers/misc/cxl/debugfs.c b/drivers/misc/cxl/debugfs.c
index 1fda22c24c93..7b987bf498b5 100644
--- a/drivers/misc/cxl/debugfs.c
+++ b/drivers/misc/cxl/debugfs.c
@@ -26,11 +26,11 @@ static int debugfs_io_u64_set(void *data, u64 val)
 DEFINE_DEBUGFS_ATTRIBUTE(fops_io_x64, debugfs_io_u64_get, debugfs_io_u64_set,
 "0x%016llx\n");
 
-static struct dentry *debugfs_create_io_x64(const char *name, umode_t mode,
-   struct dentry *parent, u64 __iomem 
*value)
+static void debugfs_create_io_x64(const char *name, umode_t mode,
+ struct dentry *parent, u64 __iomem *value)
 {
-   return debugfs_create_file_unsafe(name, mode, parent,
- (void __force *)value, &fops_io_x64);
+   debugfs_create_file_unsafe(name, mode, parent, (void __force *)value,
+  &fops_io_x64);
 }
 
 void cxl_debugfs_add_adapter_regs_psl9(struct cxl *adapter, struct dentry *dir)
@@ -54,25 +54,22 @@ void cxl_debugfs_add_adapter_regs_psl8(struct cxl *adapter, 
struct dentry *dir)
debugfs_create_io_x64("trace", S_IRUSR | S_IWUSR, dir, 
_cxl_p1_addr(adapter, CXL_PSL_TRACE));
 }
 
-int cxl_debugfs_adapter_add(struct cxl *adapter)
+void cxl_debugfs_adapter_add(struct cxl *adapter)
 {
struct dentry *dir;
char buf[32];
 
if (!cxl_debugfs)
-   return -ENODEV;
+   return;
 
snprintf(buf, 32, "card%i", adapter->adapter_num);
dir = debugfs_create_dir(buf, cxl_debugfs);
-   if (IS_ERR(dir))
-   return PTR_ERR(dir);
adapter->debugfs = dir;
 
debugfs_create_io_x64("err_ivte", S_IRUSR, dir, _cxl_p1_addr(adapter, 
CXL_PSL_ErrIVTE));
 
if (adapter->native->sl_ops->debugfs_add_adapter_regs)
adapter->native->sl_ops->debugfs_add_adapter_regs(adapter, dir);
-   return 0;
 }
 
 void cxl_debugfs_adapter_remove(struct cxl *adapter)
@@ -96,18 +93,16 @@ void cxl_debugfs_add_afu_regs_psl8(struct cxl_afu *afu, 
struct dentry *dir)
debugfs_create_io_x64("trace", S_IRUSR | S_IWUSR, dir, 
_cxl_p1n_addr(afu, CXL_PSL_SLICE_TRACE));
 }
 
-int cxl_debugfs_afu_add(struct cxl_afu *afu)
+void cxl_debugfs_afu_add(struct cxl_afu *afu)
 {
struct dentry *dir;
char buf[32];
 
if (!afu->adapter->debugfs)
-   return -ENODEV;
+   return;
 
snprintf(buf, 32, "psl%i.%i", afu->adapter->adapter_num,

[PATCH v4 19/28] docs: powerpc: convert docs to ReST and rename to *.rst

2019-06-12 Thread Mauro Carvalho Chehab

Convert docs to ReST and add them to the arch-specific
book.

The conversion here was trivial, as almost every file there
was already using an elegant format close to ReST standard.

The changes were mostly to mark literal blocks and add a few
missing section title identifiers.

One note with regards to "--": on Sphinx, this can't be used
to identify a list, as it will format it badly. This can be
used, however, to identify a long hyphen - and "---" is an
even longer one.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab 
Acked-by: Andrew Donnellan  # cxl
---
 Documentation/PCI/pci-error-recovery.rst  |  23 ++-
 .../{bootwrapper.txt => bootwrapper.rst}  |  28 +++-
 .../{cpu_families.txt => cpu_families.rst}|  23 +--
 .../{cpu_features.txt => cpu_features.rst}|   6 +-
 Documentation/powerpc/{cxl.txt => cxl.rst}|  46 --
 .../powerpc/{cxlflash.txt => cxlflash.rst}|  10 +-
 .../{DAWR-POWER9.txt => dawr-power9.rst}  |  15 +-
 Documentation/powerpc/{dscr.txt => dscr.rst}  |  18 +-
 ...ecovery.txt => eeh-pci-error-recovery.rst} | 108 ++--
 ...ed-dump.txt => firmware-assisted-dump.rst} | 117 +++--
 Documentation/powerpc/{hvcs.txt => hvcs.rst}  | 108 ++--
 Documentation/powerpc/index.rst   |  34 
 Documentation/powerpc/isa-versions.rst|  15 +-
 .../powerpc/{mpc52xx.txt => mpc52xx.rst}  |  12 +-
 ...nv.txt => pci_iov_resource_on_powernv.rst} |  15 +-
 .../powerpc/{pmu-ebb.txt => pmu-ebb.rst}  |   1 +
 Documentation/powerpc/ptrace.rst  | 156 ++
 Documentation/powerpc/ptrace.txt  | 151 -
 .../{qe_firmware.txt => qe_firmware.rst}  |  37 +++--
 .../{syscall64-abi.txt => syscall64-abi.rst}  |  29 ++--
 ...al_memory.txt => transactional_memory.rst} |  45 ++---
 MAINTAINERS   |   6 +-
 arch/powerpc/kernel/exceptions-64s.S  |   2 +-
 drivers/soc/fsl/qe/qe.c   |   2 +-
 drivers/tty/hvc/hvcs.c|   2 +-
 include/soc/fsl/qe/qe.h   |   2 +-
 26 files changed, 584 insertions(+), 427 deletions(-)
 rename Documentation/powerpc/{bootwrapper.txt => bootwrapper.rst} (93%)
 rename Documentation/powerpc/{cpu_families.txt => cpu_families.rst} (95%)
 rename Documentation/powerpc/{cpu_features.txt => cpu_features.rst} (97%)
 rename Documentation/powerpc/{cxl.txt => cxl.rst} (95%)
 rename Documentation/powerpc/{cxlflash.txt => cxlflash.rst} (98%)
 rename Documentation/powerpc/{DAWR-POWER9.txt => dawr-power9.rst} (95%)
 rename Documentation/powerpc/{dscr.txt => dscr.rst} (91%)
 rename Documentation/powerpc/{eeh-pci-error-recovery.txt => 
eeh-pci-error-recovery.rst} (82%)
 rename Documentation/powerpc/{firmware-assisted-dump.txt => 
firmware-assisted-dump.rst} (80%)
 rename Documentation/powerpc/{hvcs.txt => hvcs.rst} (91%)
 create mode 100644 Documentation/powerpc/index.rst
 rename Documentation/powerpc/{mpc52xx.txt => mpc52xx.rst} (91%)
 rename Documentation/powerpc/{pci_iov_resource_on_powernv.txt => 
pci_iov_resource_on_powernv.rst} (97%)
 rename Documentation/powerpc/{pmu-ebb.txt => pmu-ebb.rst} (99%)
 create mode 100644 Documentation/powerpc/ptrace.rst
 delete mode 100644 Documentation/powerpc/ptrace.txt
 rename Documentation/powerpc/{qe_firmware.txt => qe_firmware.rst} (95%)
 rename Documentation/powerpc/{syscall64-abi.txt => syscall64-abi.rst} (82%)
 rename Documentation/powerpc/{transactional_memory.txt => 
transactional_memory.rst} (93%)

diff --git a/Documentation/PCI/pci-error-recovery.rst 
b/Documentation/PCI/pci-error-recovery.rst
index 83db42092935..acc21ecca322 100644
--- a/Documentation/PCI/pci-error-recovery.rst
+++ b/Documentation/PCI/pci-error-recovery.rst
@@ -403,7 +403,7 @@ That is, the recovery API only requires that:
 .. note::
 
Implementation details for the powerpc platform are discussed in
-   the file Documentation/powerpc/eeh-pci-error-recovery.txt
+   the file Documentation/powerpc/eeh-pci-error-recovery.rst
 
As of this writing, there is a growing list of device drivers with
patches implementing error recovery. Not all of these patches are in
@@ -422,3 +422,24 @@ That is, the recovery API only requires that:
- drivers/net/cxgb3
- drivers/net/s2io.c
- drivers/net/qlge
+
+>>> As of this writing, there is a growing list of device drivers with
+>>> patches implementing error recovery. Not all of these patches are in
+>>> mainline yet. These may be used as "examples":
+>>>
+>>> drivers/scsi/ipr
+>>> drivers/scsi/sym53c8xx_2
+>>> drivers/scsi/qla2xxx
+>>> drivers/scsi/lpfc
+>>> drivers/next/bnx2.c
+>>> drivers/next/e100.c
+>>> drivers/net/e1000
+>>> drivers/net/e1000e
+>>> drivers/net/ixgb
+>>> drivers/net/ixgbe
+>>> drivers/net/cxgb3
+>>> drivers/net/s2io.c
+>>> drivers/net/qlge
+
+The End
+---
diff --git

[PATCH v4 13/28] docs: kdump: convert docs to ReST and rename to *.rst

2019-06-12 Thread Mauro Carvalho Chehab

Convert kdump documentation to ReST and add it to the
user-facing manual, as the documents are mainly focused on
sysadmins that would be enabling kdump.

Note: the vmcoreinfo.rst has one very long title on one of its
sub-sections:


PG_lru|PG_private|PG_swapcache|PG_swapbacked|PG_slab|PG_hwpoision|PG_head_mask|PAGE_BUDDY_MAPCOUNT_VALUE(~PG_buddy)|PAGE_OFFLINE_MAPCOUNT_VALUE(~PG_offline)

I opted to break this one, into two entries with the same content,
in order to make it easier to display after being parsed in html and PDF.

The conversion is actually:
  - add blank lines and indentation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/admin-guide/bug-hunting.rst |   2 +-
 .../admin-guide/kernel-parameters.txt |   6 +-
 Documentation/kdump/index.rst |  21 +++
 Documentation/kdump/{kdump.txt => kdump.rst}  | 131 +++---
 .../kdump/{vmcoreinfo.txt => vmcoreinfo.rst}  |  59 
 .../powerpc/firmware-assisted-dump.txt|   2 +-
 .../translations/zh_CN/oops-tracing.txt   |   2 +-
 Documentation/watchdog/hpwdt.txt  |   2 +-
 arch/arm/Kconfig  |   2 +-
 arch/arm64/Kconfig|   2 +-
 arch/sh/Kconfig   |   2 +-
 arch/x86/Kconfig  |   4 +-
 12 files changed, 137 insertions(+), 98 deletions(-)
 create mode 100644 Documentation/kdump/index.rst
 rename Documentation/kdump/{kdump.txt => kdump.rst} (91%)
 rename Documentation/kdump/{vmcoreinfo.txt => vmcoreinfo.rst} (95%)

diff --git a/Documentation/admin-guide/bug-hunting.rst 
b/Documentation/admin-guide/bug-hunting.rst
index f278b289e260..b761aa2a51d2 100644
--- a/Documentation/admin-guide/bug-hunting.rst
+++ b/Documentation/admin-guide/bug-hunting.rst
@@ -90,7 +90,7 @@ the disk is not available then you have three options:
 run a null modem to a second machine and capture the output there
 using your favourite communication program.  Minicom works well.
 
-(3) Use Kdump (see Documentation/kdump/kdump.txt),
+(3) Use Kdump (see Documentation/kdump/kdump.rst),
 extract the kernel ring buffer from old memory with using dmesg
 gdbmacro in Documentation/kdump/gdbmacros.txt.
 
diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index affed5d447de..c31373f39240 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -708,14 +708,14 @@
[KNL, x86_64] select a region under 4G first, and
fall back to reserve region above 4G when '@offset'
hasn't been specified.
-   See Documentation/kdump/kdump.txt for further details.
+   See Documentation/kdump/kdump.rst for further details.
 
crashkernel=range1:size1[,range2:size2,...][@offset]
[KNL] Same as above, but depends on the memory
in the running system. The syntax of range is
start-[end] where start and end are both
a memory unit (amount[KMG]). See also
-   Documentation/kdump/kdump.txt for an example.
+   Documentation/kdump/kdump.rst for an example.
 
crashkernel=size[KMG],high
[KNL, x86_64] range could be above 4G. Allow kernel
@@ -1207,7 +1207,7 @@
Specifies physical address of start of kernel core
image elf header and optionally the size. Generally
kexec loader will pass this option to capture kernel.
-   See Documentation/kdump/kdump.txt for details.
+   See Documentation/kdump/kdump.rst for details.
 
enable_mtrr_cleanup [X86]
The kernel tries to adjust MTRR layout from continuous
diff --git a/Documentation/kdump/index.rst b/Documentation/kdump/index.rst
new file mode 100644
index ..2b17fcf6867a
--- /dev/null
+++ b/Documentation/kdump/index.rst
@@ -0,0 +1,21 @@
+:orphan:
+
+
+Documentation for Kdump - The kexec-based Crash Dumping Solution
+
+
+This document includes overview, setup and installation, and analysis
+information.
+
+.. toctree::
+:maxdepth: 1
+
+kdump
+vmcoreinfo
+
+.. only::  subproject and html
+
+   Indices
+   ===
+
+   * :ref:`genindex`
diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.rst
similarity index 91%
rename from

Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-12 Thread Horia Geanta

On 6/12/2019 8:52 AM, Christophe Leroy wrote:
> 
> 
> Le 11/06/2019 à 18:30, Horia Geanta a écrit :
>> On 6/11/2019 6:40 PM, Christophe Leroy wrote:
>>>
>>>
>>> Le 11/06/2019 à 17:37, Horia Geanta a écrit :
>>>> On 6/11/2019 5:39 PM, Christophe Leroy wrote:
>>>>> This series is the last set of fixes for the Talitos driver.
>>>>>
>>>>> We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
>>>>> SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:
>>>>>
>>>> I am getting below failures on a sec 3.3.2 (p1020rdb) for hmac(sha384) and
>>>> hmac(sha512):
>>>
>>> Is that new with this series or did you already have it before ?
>>>
>> Looks like this happens with or without this series.
> 
> Found the issue, that's in 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=b8fbdc2bc4e71b62646031d5df5f08aafe15d5ad
> 
> CONFIG_CRYPTO_DEV_TALITOS_SEC2 should be CONFIG_CRYPTO_DEV_TALITOS2 instead.
> 
> Just sent a patch to fix it.
> 
Thanks, I've tested it and the hmac failures go away.

However, testing gets stuck.
Seems there is another issue lurking in the driver.

Used cryptodev-2.6/master with the following on top:
crypto: testmgr - add some more preemption points
https://patchwork.kernel.org/patch/10972337/
crypto: talitos - fix max key size for sha384 and sha512
https://patchwork.kernel.org/patch/10988473/

[...]
alg: skcipher: skipping comparison tests for ecb-3des-talitos because 
ecb(des3_ede-generic) is unavailable
INFO: task cryptomgr_test:314 blocked for more than 120 seconds.
  Not tainted 5.2.0-rc1-g905bfd415e8a #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cryptomgr_test  D0   314  2 0x0800
Call Trace:
[e78337e0] [0004] 0x4 (unreliable)
[e78338a8] [c08a6e5c] __schedule+0x20c/0x4d4
[e78338f8] [c08a7158] schedule+0x34/0xc8
[e7833908] [c08aa5ec] schedule_timeout+0x1d4/0x350
[e7833958] [c08a7be4] wait_for_common+0xa0/0x164
[e7833998] [c03a7b14] do_ahash_op+0xa4/0xc4
[e78339b8] [c03aba00] test_ahash_vec_cfg+0x188/0x5e4
[e7833aa8] [c03ac1c8] test_hash_vs_generic_impl+0x1b0/0x2b4
[e7833de8] [c03ac498] __alg_test_hash+0x1cc/0x2d0
[e7833e28] [c03a9fb4] alg_test.part.37+0x8c/0x3ac
[e7833ef8] [c03a54d0] cryptomgr_test+0x4c/0x54
[e7833f08] [c006c410] kthread+0xf8/0x124
[e7833f38] [c001227c] ret_from_kernel_thread+0x14/0x1c

addr2line on c03aba00 points to crypto/testmgr.c:1335

   1327)  if (cfg->finalization_type == FINALIZATION_TYPE_DIGEST ||
   1328)  vec->digest_error) {
   1329)  /* Just using digest() */
   1330)  ahash_request_set_callback(req, req_flags, crypto_req_done,
   1331) &wait);
   1332)  ahash_request_set_crypt(req, tsgl->sgl, result, vec->psize);
   1333)  err = do_ahash_op(crypto_ahash_digest, req, &wait, cfg->nosimd);
   1334)  if (err) {
-> 1335)  if (err == vec->digest_error)
   1336)  return 0;
   1337)  pr_err("alg: ahash: %s digest() failed on test vector 
%s; expected_error=%d, actual_error=%d, cfg=\"%s\"\n",
   1338) driver, vec_name, vec->digest_error, err,
   1339) cfg->name);
   1340)  return err;
   1341)  }
   1342)  if (vec->digest_error) {
   1343)  pr_err("alg: ahash: %s digest() unexpectedly 
succeeded on test vector %s; expected_error=%d, cfg=\"%s\"\n",
   1344) driver, vec_name, vec->digest_error, 
cfg->name);
   1345)  return -EINVAL;
   1346)  }
   1347)  goto result_ready;
   1348)  }

It seems that, for some reason, the driver does not receive the interrupt
from the HW, so the completion callback never runs.

Tried with or without current patch series, no change in behaviour.

If you cannot reproduce and don't have any idea, I'll try the hard way
(git bisect).

Thanks,
Horia

Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-12 Thread Christoph Hellwig

On Tue, Jun 11, 2019 at 05:20:12PM -0500, Larry Finger wrote:
> Your first patch did not work as the configuration does not have 
> CONFIG_ZONE_DMA. As a result, the initial value of min_mask always starts 
> at 32 bits and is taken down to 31 with the maximum pfn minimization. When 
> I forced the initial value of min_mask to 30 bits, the device worked.

Ooops, yes.  But I think we could just enable ZONE_DMA on 32-bit
powerpc.  Crude enablement hack below:

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636308c8..1dd71a98b70c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -372,7 +372,7 @@ config PPC_ADV_DEBUG_DAC_RANGE
 
 config ZONE_DMA
 	bool
-	default y if PPC_BOOK3E_64
+	default y
 
 config PGTABLE_LEVELS
 	int

Re: [PATCH kernel v3 0/3] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-06-12 Thread Alexey Kardashevskiy




On 12/06/2019 15:05, Shawn Anastasio wrote:
> On 6/5/19 11:11 PM, Shawn Anastasio wrote:
>> On 5/30/19 2:03 AM, Alexey Kardashevskiy wrote:
>>> This is an attempt to allow DMA masks between 32..59 which are not large
>>> enough to use either a PHB3 bypass mode or a sketchy bypass. Depending
>>> on the max order, up to 40 is usually available.
>>>
>>>
>>> This is based on v5.2-rc2.
>>>
>>> Please comment. Thanks.
>>
>> I have tested this patch set with an AMD GPU that's limited to <64bit
>> DMA (I believe it's 40 or 42 bit). It successfully allows the card to
>> operate without falling back to 32-bit DMA mode as it does without
>> the patches.
>>
>> Relevant kernel log message:
>> ```
>> [    0.311211] pci 0033:01 : [PE# 00] Enabling 64-bit DMA bypass
>> ```
>>
>> Tested-by: Shawn Anastasio 
> 
> After a few days of further testing, I've started to run into stability
> issues with the patch applied and used with an AMD GPU. Specifically,
> the system sometimes spontaneously crashes. Not just EEH errors either,
> the whole system shuts down in what looks like a checkstop.
> 
> Perhaps some subtle corruption is occurring?

Have you tried this?

https://patchwork.ozlabs.org/patch/1113506/



-- 
Alexey
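
For readers following the DMA mask discussion, here is a minimal userspace sketch of what an N-bit DMA mask means for a device. It is only an illustration: dma_bit_mask() uses the same arithmetic as the kernel's DMA_BIT_MASK() macro, while dma_addr_fits() is a hypothetical helper that does not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the kernel's DMA_BIT_MASK(n); valid for 1 <= n <= 64. */
uint64_t dma_bit_mask(unsigned int n)
{
	return n == 64 ? ~0ULL : (1ULL << n) - 1;
}

/* Hypothetical helper: true if a device limited to n address bits can
 * generate the bus address addr, i.e. no bit above the mask is set. */
bool dma_addr_fits(uint64_t addr, unsigned int n)
{
	return (addr & ~dma_bit_mask(n)) == 0;
}
```

With a 40-bit mask, an address that has bit 59 set is out of reach, whereas a full 64-bit mask accepts it. That is the gap the "masks between 32 and 59" series tries to cover.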

[PATCH net-next] defconfigs: remove obsolete CONFIG_INET_XFRM_MODE_* and CONFIG_INET6_XFRM_MODE_*

2019-06-12 Thread YueHaibing

These Kconfig options have been removed in
commit 4c145dce2601 ("xfrm: make xfrm modes builtin"),
so there is no point in keeping them in defconfigs any longer.

Signed-off-by: YueHaibing 
---
 arch/arc/configs/axs101_defconfig   | 3 ---
 arch/arc/configs/axs103_defconfig   | 3 ---
 arch/arc/configs/axs103_smp_defconfig   | 3 ---
 arch/arc/configs/haps_hs_defconfig  | 3 ---
 arch/arc/configs/haps_hs_smp_defconfig  | 3 ---
 arch/arc/configs/nps_defconfig  | 3 ---
 arch/arc/configs/nsimosci_hs_smp_defconfig  | 3 ---
 arch/arc/configs/tb10x_defconfig| 3 ---
 arch/arm/configs/acs5k_tiny_defconfig   | 3 ---
 arch/arm/configs/am200epdkit_defconfig  | 3 ---
 arch/arm/configs/aspeed_g4_defconfig| 6 --
 arch/arm/configs/aspeed_g5_defconfig| 6 --
 arch/arm/configs/at91_dt_defconfig  | 6 --
 arch/arm/configs/cm_x300_defconfig  | 3 ---
 arch/arm/configs/efm32_defconfig| 3 ---
 arch/arm/configs/ep93xx_defconfig   | 3 ---
 arch/arm/configs/ezx_defconfig  | 3 ---
 arch/arm/configs/h5000_defconfig| 3 ---
 arch/arm/configs/imote2_defconfig   | 3 ---
 arch/arm/configs/imx_v4_v5_defconfig| 3 ---
 arch/arm/configs/imx_v6_v7_defconfig| 3 ---
 arch/arm/configs/iop13xx_defconfig  | 3 ---
 arch/arm/configs/iop32x_defconfig   | 3 ---
 arch/arm/configs/iop33x_defconfig   | 3 ---
 arch/arm/configs/keystone_defconfig | 3 ---
 arch/arm/configs/lpc18xx_defconfig  | 3 ---
 arch/arm/configs/lpc32xx_defconfig  | 3 ---
 arch/arm/configs/lpd270_defconfig   | 3 ---
 arch/arm/configs/magician_defconfig | 3 ---
 arch/arm/configs/mini2440_defconfig | 3 ---
 arch/arm/configs/moxart_defconfig   | 3 ---
 arch/arm/configs/mps2_defconfig | 3 ---
 arch/arm/configs/mxs_defconfig  | 3 ---
 arch/arm/configs/omap1_defconfig| 3 ---
 arch/arm/configs/palmz72_defconfig  | 3 ---
 arch/arm/configs/pcm027_defconfig   | 3 ---
 arch/arm/configs/pxa3xx_defconfig   | 3 ---
 arch/arm/configs/qcom_defconfig | 3 ---
 arch/arm/configs/rpc_defconfig  | 6 --
 arch/arm/configs/s3c2410_defconfig  | 1 -
 arch/arm/configs/sama5_defconfig| 6 --
 arch/arm/configs/sunxi_defconfig| 3 ---
 arch/arm/configs/tango4_defconfig   | 3 ---
 arch/arm/configs/tegra_defconfig| 2 --
 arch/arm/configs/xcep_defconfig | 3 ---
 arch/hexagon/configs/comet_defconfig| 3 ---
 arch/m68k/configs/amcore_defconfig  | 3 ---
 arch/m68k/configs/m5208evb_defconfig| 3 ---
 arch/m68k/configs/m5249evb_defconfig| 3 ---
 arch/m68k/configs/m5272c3_defconfig | 3 ---
 arch/m68k/configs/m5275evb_defconfig| 3 ---
 arch/m68k/configs/m5307c3_defconfig | 3 ---
 arch/m68k/configs/m5407c3_defconfig | 3 ---
 arch/mips/configs/ar7_defconfig | 3 ---
 arch/mips/configs/ath25_defconfig   | 3 ---
 arch/mips/configs/ath79_defconfig   | 3 ---
 arch/mips/configs/bcm63xx_defconfig | 3 ---
 arch/mips/configs/bigsur_defconfig  | 3 ---
 arch/mips/configs/bmips_be_defconfig| 3 ---
 arch/mips/configs/bmips_stb_defconfig   | 3 ---
 arch/mips/configs/capcella_defconfig| 3 ---
 arch/mips/configs/ci20_defconfig| 3 ---
 arch/mips/configs/db1xxx_defconfig  | 1 -
 arch/mips/configs/decstation_64_defconfig   | 4 
 arch/mips/configs/decstation_defconfig  | 4 
 arch/mips/configs/decstation_r4k_defconfig  | 4 
 arch/mips/configs/fuloong2e_defconfig   | 2 --
 arch/mips/configs/gpr_defconfig | 3 ---
 arch/mips/configs/ip22_defconfig| 4 
 arch/mips/configs/ip27_defconfig| 7 ---
 arch/mips/configs/ip28_defconfig| 3 ---
 arch/mips/configs/jazz_defconfig| 2 --
 arch/mips/configs/jmr3927_defconfig | 3 ---
 arch/mips/configs/lasat_defconfig   | 3 ---
 arch/mips/configs/lemote2f_defconfig| 3 ---
 arch/mips/configs/loongson1b_defconfig  | 3 ---
 arch/mips/configs/loongson1c_defconfig  | 3 ---
 arch/mips/configs/malta_defconfig   | 2 --
 arch/mips/configs/malta_kvm_defconfig   | 2 --
 arch/mips/configs/malta_kvm_guest_defconfig | 2 --
 arch/mips/configs/maltaup_xpa_defconfig | 2 --
 arch/mips/configs/markeins_defconfig| 4 
 arch/mips/configs/mpc30x_defconfig  | 3 ---
 arch/mips/configs/mtx1_defconfig| 4

Re: [PATCH] crypto: talitos - fix max key size for sha384 and sha512

2019-06-12 Thread Horia Geanta

On 6/12/2019 8:49 AM, Christophe Leroy wrote:
> Below commit came with a typo in the CONFIG_ symbol, leading
> to a permanently reduced max key size regardless of the driver
> capabilities.
> 
> Reported-by: Horia Geantă 
> Fixes: b8fbdc2bc4e7 ("crypto: talitos - reduce max key size for SEC1")
> Signed-off-by: Christophe Leroy 
Reviewed-by: Horia Geantă 

Thanks,
Horia

Re: [PATCH] cxl: no need to check return value of debugfs_create functions

2019-06-12 Thread Frederic Barrat





On 12/06/2019 at 12:02, Greg Kroah-Hartman wrote:
> On Wed, Jun 12, 2019 at 11:51:21AM +0200, Arnd Bergmann wrote:
>> On Tue, Jun 11, 2019 at 8:13 PM Greg Kroah-Hartman
>>  wrote:
>>>
>>> @@ -64,8 +64,6 @@ int cxl_debugfs_adapter_add(struct cxl *adapter)
>>>
>>>  	snprintf(buf, 32, "card%i", adapter->adapter_num);
>>>  	dir = debugfs_create_dir(buf, cxl_debugfs);
>>> -	if (IS_ERR(dir))
>>> -		return PTR_ERR(dir);
>>>  	adapter->debugfs = dir;
>>
>> Should the check for 'cxl_debugfs' get removed here as well?
>
> Maybe, I could not determine the logic if those functions could be
> called before cxl_debugfs was ever set.
>
> And debugfs_create_dir() will not return a NULL value if an error
> happens, so no need to worry about files being created in the wrong
> place.
>
>> If that is null, we might put the subdir in the wrong place in the
>> tree, but that would otherwise be harmless as well, and the
>> same thing happens if 'dir' is NULL above and we add the
>> files in the debugfs root later (losing the ability to clean up
>> afterwards).
>>
>> int cxl_debugfs_adapter_add(struct cxl *adapter)
>> {
>> 	struct dentry *dir;
>> 	char buf[32];
>>
>> 	if (!cxl_debugfs)
>> 		return -ENODEV;
>>
>> It's still a bit odd to return an error, since the caller then just
>> ignores the return code anyway:
>
> Then let's just return nothing.
>
>> 	/* Don't care if this one fails: */
>> 	cxl_debugfs_adapter_add(adapter);
>>
>> It would seem best to change the return type to 'void' here for
>> consistency.
>
> I agree, let me go do that.

I don't see any problems with turning all those function return types to
'void'. Thanks for pointing it out and the clean up!

  Fred

> thanks,
>
> greg k-h

[PATCH] powerpc/pseries: Switch to GFP_ATOMIC allocations in hotplug interrupt handler

2019-06-12 Thread Bharata B Rao

queue_hotplug_event() gets called from interrupt handler code. Use
GFP_ATOMIC allocations instead of GFP_KERNEL.

Signed-off-by: Bharata B Rao 
---
 arch/powerpc/platforms/pseries/dlpar.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 17958043e7f7..79b36d91be28 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -387,10 +387,10 @@ void queue_hotplug_event(struct pseries_hp_errorlog *hp_errlog)
 	struct pseries_hp_errorlog *hp_errlog_copy;
 
 	hp_errlog_copy = kmalloc(sizeof(struct pseries_hp_errorlog),
-				 GFP_KERNEL);
+				 GFP_ATOMIC);
 	memcpy(hp_errlog_copy, hp_errlog, sizeof(struct pseries_hp_errorlog));
 
-	work = kmalloc(sizeof(struct pseries_hp_work), GFP_KERNEL);
+	work = kmalloc(sizeof(struct pseries_hp_work), GFP_ATOMIC);
 	if (work) {
 		INIT_WORK((struct work_struct *)work, pseries_hp_work_fn);
 		work->errlog = hp_errlog_copy;
-- 
2.17.1

[PATCH] powerpc/64: allow compiler to cache 'current'

2019-06-12 Thread Nicholas Piggin

current may be cached by the compiler, so remove the volatile asm
restriction. This results in better generated code: besides being
smaller and having fewer dependent loads, it can avoid store-hit-load
flushes like this one, which shows up in irq_exit():

preempt_count_sub(HARDIRQ_OFFSET);
if (!in_interrupt() && ...)

Which ends up as:

((struct thread_info *)current)->preempt_count -= HARDIRQ_OFFSET;
if (((struct thread_info *)current)->preempt_count ...

Evaluating current twice presently means it has to be loaded twice, and
here gcc happens to pick a different register each time, then
preempt_count is accessed via that base register:

1058:   ld  r10,2392(r13) <-- current
105c:   lwz r9,0(r10) <-- preempt_count
1060:   addis   r9,r9,-1
1064:   stw r9,0(r10) <-- preempt_count
1068:   ld  r9,2392(r13)  <-- current
106c:   lwz r9,0(r9)  <-- preempt_count
1070:   rlwinm. r9,r9,0,11,23
1074:   bne 1090 

This can frustrate store-hit-load detection heuristics and cause
flushes. Allowing the compiler to cache current in a register with this
patch results in the same base register being used for all accesses,
which is more likely to be detected as an alias:

1058:   ld  r31,2392(r13)
...
1070:   lwz r9,0(r31)
1074:   addis   r9,r9,-1
1078:   stw r9,0(r31)
107c:   lwz r9,0(r31)
1080:   rlwinm. r9,r9,0,11,23
1084:   bne 10a0 

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/current.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/current.h b/arch/powerpc/include/asm/current.h
index 297827b76169..bbfb94800415 100644
--- a/arch/powerpc/include/asm/current.h
+++ b/arch/powerpc/include/asm/current.h
@@ -16,7 +16,8 @@ static inline struct task_struct *get_current(void)
 {
 	struct task_struct *task;
 
-	__asm__ __volatile__("ld %0,%1(13)"
+	/* get_current can be cached by the compiler, so no volatile */
+	asm ("ld %0,%1(13)"
 	: "=r" (task)
 	: "i" (offsetof(struct paca_struct, __current)));
 
-- 
2.20.1

[PATCH 1/2] bpf: fix div64 overflow tests to properly detect errors

2019-06-12 Thread Naveen N. Rao

If the result of the division is LLONG_MIN, current tests do not detect
the error since the return value is truncated to a 32-bit value and ends
up being 0.

Signed-off-by: Naveen N. Rao 
---
 .../testing/selftests/bpf/verifier/div_overflow.c  | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/verifier/div_overflow.c b/tools/testing/selftests/bpf/verifier/div_overflow.c
index bd3f38dbe796..acab4f00819f 100644
--- a/tools/testing/selftests/bpf/verifier/div_overflow.c
+++ b/tools/testing/selftests/bpf/verifier/div_overflow.c
@@ -29,8 +29,11 @@
 	"DIV64 overflow, check 1",
 	.insns = {
 	BPF_MOV64_IMM(BPF_REG_1, -1),
-	BPF_LD_IMM64(BPF_REG_0, LLONG_MIN),
-	BPF_ALU64_REG(BPF_DIV, BPF_REG_0, BPF_REG_1),
+	BPF_LD_IMM64(BPF_REG_2, LLONG_MIN),
+	BPF_ALU64_REG(BPF_DIV, BPF_REG_2, BPF_REG_1),
+	BPF_MOV32_IMM(BPF_REG_0, 0),
+	BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_2, 1),
+	BPF_MOV32_IMM(BPF_REG_0, 1),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
@@ -40,8 +43,11 @@
 {
 	"DIV64 overflow, check 2",
 	.insns = {
-	BPF_LD_IMM64(BPF_REG_0, LLONG_MIN),
-	BPF_ALU64_IMM(BPF_DIV, BPF_REG_0, -1),
+	BPF_LD_IMM64(BPF_REG_1, LLONG_MIN),
+	BPF_ALU64_IMM(BPF_DIV, BPF_REG_1, -1),
+	BPF_MOV32_IMM(BPF_REG_0, 0),
+	BPF_JMP_REG(BPF_JEQ, BPF_REG_0, BPF_REG_1, 1),
+	BPF_MOV32_IMM(BPF_REG_0, 1),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
-- 
2.21.0
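
A note on why the old test could not fail: the sketch below is a minimal userspace model, assuming (as the commit message says) that only the low 32 bits of R0 are compared against the expected result. The function names are hypothetical and are not part of the selftests.

```c
#include <limits.h>
#include <stdint.h>

/* Models the old test: the 64-bit quotient is left in R0 and only its
 * low 32 bits survive as the program's return value. */
uint32_t old_test_retval(uint64_t quotient)
{
	return (uint32_t)quotient;
}

/* Models the new test: R0 = 0, and R0 = 1 is executed unless the full
 * 64-bit quotient equals R0 (the BPF_JEQ skips the second MOV32). */
uint32_t new_test_retval(uint64_t quotient)
{
	return quotient == 0 ? 0 : 1;
}
```

A wrong quotient of LLONG_MIN (0x8000000000000000) has all-zero low 32 bits, so the old test returns 0 for it, exactly as it does for the correct quotient 0. The new test returns 1 for the wrong quotient and 0 for the correct one.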

[PATCH 2/2] powerpc/bpf: use unsigned division instruction for 64-bit operations

2019-06-12 Thread Naveen N. Rao

BPF_ALU64 div/mod operations currently use signed division, unlike
BPF_ALU32 operations. Switch them to unsigned division. DIV64 and MOD64
overflow tests pass with this fix.

Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Cc: sta...@vger.kernel.org # v4.8+
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/ppc-opcode.h | 1 +
 arch/powerpc/net/bpf_jit.h| 2 +-
 arch/powerpc/net/bpf_jit_comp64.c | 8 
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 23f7ed796f38..49d65cd08ee0 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,7 @@
 #define PPC_INST_MADDLD			0x10000033
 #define PPC_INST_DIVWU			0x7c000396
 #define PPC_INST_DIVD			0x7c0003d2
+#define PPC_INST_DIVDU			0x7c000392
 #define PPC_INST_RLWINM			0x54000000
 #define PPC_INST_RLWINM_DOT		0x54000001
 #define PPC_INST_RLWIMI			0x50000000
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index dcac37745b05..1e932898d430 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -116,7 +116,7 @@
 ___PPC_RA(a) | IMM_L(i))
 #define PPC_DIVWU(d, a, b) EMIT(PPC_INST_DIVWU | ___PPC_RT(d) |  \
 ___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_DIVD(d, a, b)  EMIT(PPC_INST_DIVD | ___PPC_RT(d) |   \
+#define PPC_DIVDU(d, a, b) EMIT(PPC_INST_DIVDU | ___PPC_RT(d) |  \
 ___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_AND(d, a, b)   EMIT(PPC_INST_AND | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(b))
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 0ebd946f178b..b0fa4723d6fb 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -399,12 +399,12 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 		case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */
 		case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */
 			if (BPF_OP(code) == BPF_MOD) {
-				PPC_DIVD(b2p[TMP_REG_1], dst_reg, src_reg);
+				PPC_DIVDU(b2p[TMP_REG_1], dst_reg, src_reg);
 				PPC_MULD(b2p[TMP_REG_1], src_reg,
 						b2p[TMP_REG_1]);
 				PPC_SUB(dst_reg, dst_reg, b2p[TMP_REG_1]);
 			} else
-				PPC_DIVD(dst_reg, dst_reg, src_reg);
+				PPC_DIVDU(dst_reg, dst_reg, src_reg);
 			break;
 		case BPF_ALU | BPF_MOD | BPF_K: /* (u32) dst %= (u32) imm */
 		case BPF_ALU | BPF_DIV | BPF_K: /* (u32) dst /= (u32) imm */
@@ -432,7 +432,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 				break;
 			case BPF_ALU64:
 				if (BPF_OP(code) == BPF_MOD) {
-					PPC_DIVD(b2p[TMP_REG_2], dst_reg,
+					PPC_DIVDU(b2p[TMP_REG_2], dst_reg,
 							b2p[TMP_REG_1]);
 					PPC_MULD(b2p[TMP_REG_1],
 							b2p[TMP_REG_1],
@@ -440,7 +440,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 					PPC_SUB(dst_reg, dst_reg,
 							b2p[TMP_REG_1]);
 				} else
-					PPC_DIVD(dst_reg, dst_reg,
+					PPC_DIVDU(dst_reg, dst_reg,
 							b2p[TMP_REG_1]);
 				break;
 			}
-- 
2.21.0
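
To illustrate what the PPC_DIVD to PPC_DIVDU switch changes in the emitted instruction word, here is a small userspace sketch. The two opcode constants are taken from the patch; ppc_encode_div() is a hypothetical stand-in for the EMIT() macros, and the field shifts (RT at bit 21, RA at bit 16, RB at bit 11) are my understanding of what ___PPC_RT/___PPC_RA/___PPC_RB do, so treat them as an assumption.

```c
#include <stdint.h>

#define PPC_INST_DIVD	0x7c0003d2u	/* signed 64-bit divide */
#define PPC_INST_DIVDU	0x7c000392u	/* unsigned 64-bit divide */

/* Hypothetical helper: OR the three 5-bit register fields into the
 * base opcode, as the PPC_DIVD()/PPC_DIVDU() macros are assumed to do. */
uint32_t ppc_encode_div(uint32_t opcode, unsigned int rt,
			unsigned int ra, unsigned int rb)
{
	return opcode | ((rt & 0x1f) << 21) | ((ra & 0x1f) << 16) |
	       ((rb & 0x1f) << 11);
}
```

For example, "divdu r3,r4,r5" would encode as 0x7c642b92 under these assumptions, and the signed and unsigned variants differ only in one bit (0x40) of the extended opcode.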

[PATCH 0/2] powerpc/bpf: DIV64 instruction fix

2019-06-12 Thread Naveen N. Rao

The first patch updates DIV64 overflow tests to properly detect error 
conditions. The second patch fixes powerpc64 JIT to generate the proper 
unsigned division instruction for BPF_ALU64.

- Naveen

Naveen N. Rao (2):
  bpf: fix div64 overflow tests to properly detect errors
  powerpc/bpf: use unsigned division instruction for 64-bit operations

 arch/powerpc/include/asm/ppc-opcode.h  |  1 +
 arch/powerpc/net/bpf_jit.h |  2 +-
 arch/powerpc/net/bpf_jit_comp64.c  |  8 
 .../testing/selftests/bpf/verifier/div_overflow.c  | 14 ++
 4 files changed, 16 insertions(+), 9 deletions(-)

-- 
2.21.0
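
For reference, a minimal userspace model of the unsigned semantics the fixed JIT is expected to produce for BPF_ALU64 div/mod. The function names are hypothetical, and the sketch assumes a non-zero divisor (it does not model the JIT's divide-by-zero handling). The modulo follows the same divide, multiply, subtract sequence as the patched code (divdu, mulld, sub).

```c
#include <stdint.h>

/* dst /= src with unsigned 64-bit semantics; src must be non-zero. */
uint64_t bpf_alu64_div(uint64_t dst, uint64_t src)
{
	return dst / src;
}

/* dst %= src, computed as dst - (dst / src) * src; src must be non-zero. */
uint64_t bpf_alu64_mod(uint64_t dst, uint64_t src)
{
	uint64_t quotient = dst / src;

	return dst - quotient * src;
}
```

With unsigned division, the LLONG_MIN / -1 case from the selftests is simply 0x8000000000000000 / 0xffffffffffffffff, which is 0 with remainder 0x8000000000000000, so there is no overflow to speak of.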

Re: [PATCH kernel v3 0/3] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-06-12 Thread Shawn Anastasio


On 6/12/19 2:07 AM, Alexey Kardashevskiy wrote:
> On 12/06/2019 15:05, Shawn Anastasio wrote:
>> On 6/5/19 11:11 PM, Shawn Anastasio wrote:
>>> On 5/30/19 2:03 AM, Alexey Kardashevskiy wrote:
>>>> This is an attempt to allow DMA masks between 32..59 which are not large
>>>> enough to use either a PHB3 bypass mode or a sketchy bypass. Depending
>>>> on the max order, up to 40 is usually available.
>>>>
>>>>
>>>> This is based on v5.2-rc2.
>>>>
>>>> Please comment. Thanks.
>>>
>>> I have tested this patch set with an AMD GPU that's limited to <64bit
>>> DMA (I believe it's 40 or 42 bit). It successfully allows the card to
>>> operate without falling back to 32-bit DMA mode as it does without
>>> the patches.
>>>
>>> Relevant kernel log message:
>>> ```
>>> [    0.311211] pci 0033:01 : [PE# 00] Enabling 64-bit DMA bypass
>>> ```
>>>
>>> Tested-by: Shawn Anastasio 
>>
>> After a few days of further testing, I've started to run into stability
>> issues with the patch applied and used with an AMD GPU. Specifically,
>> the system sometimes spontaneously crashes. Not just EEH errors either,
>> the whole system shuts down in what looks like a checkstop.
>>
>> Perhaps some subtle corruption is occurring?
>
> Have you tried this?
>
> https://patchwork.ozlabs.org/patch/1113506/

I have not. I'll give it a shot and try it out for a few days to see
if I'm able to reproduce the crashes.

Re: [PATCH v2] cxl: no need to check return value of debugfs_create functions

2019-06-12 Thread Arnd Bergmann

On Wed, Jun 12, 2019 at 5:54 PM Greg Kroah-Hartman
 wrote:
>
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
>
> Because there's no need to check, also make the return value of the
> local debugfs_create_io_x64() call void, as no one ever did anything
> with the return value (as they did not need to.)
>
> And make the cxl_debugfs_* calls return void as no one was even checking
> their return value at all.
>
> Cc: Frederic Barrat 
> Cc: Andrew Donnellan 
> Cc: Arnd Bergmann 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Greg Kroah-Hartman 

Reviewed-by: Arnd Bergmann

Re: [PATCH kernel v3 0/3] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-06-12 Thread Shawn Anastasio


On 6/12/19 1:16 AM, Oliver O'Halloran wrote:
> On Wed, Jun 12, 2019 at 3:06 PM Shawn Anastasio  wrote:
>>
>> On 6/5/19 11:11 PM, Shawn Anastasio wrote:
>>> On 5/30/19 2:03 AM, Alexey Kardashevskiy wrote:
>>>> This is an attempt to allow DMA masks between 32..59 which are not large
>>>> enough to use either a PHB3 bypass mode or a sketchy bypass. Depending
>>>> on the max order, up to 40 is usually available.
>>>>
>>>>
>>>> This is based on v5.2-rc2.
>>>>
>>>> Please comment. Thanks.
>>>
>>> I have tested this patch set with an AMD GPU that's limited to <64bit
>>> DMA (I believe it's 40 or 42 bit). It successfully allows the card to
>>> operate without falling back to 32-bit DMA mode as it does without
>>> the patches.
>>>
>>> Relevant kernel log message:
>>> ```
>>> [0.311211] pci 0033:01 : [PE# 00] Enabling 64-bit DMA bypass
>>> ```
>>>
>>> Tested-by: Shawn Anastasio 
>>
>> After a few days of further testing, I've started to run into stability
>> issues with the patch applied and used with an AMD GPU. Specifically,
>> the system sometimes spontaneously crashes. Not just EEH errors either,
>> the whole system shuts down in what looks like a checkstop.
>
> Any specific workload? Checkstops are harder to debug without a system
> in the failed state so we'd need to replicate that locally to get a
> decent idea what's up.

I haven't been able to pinpoint the exact cause. The first time it
happened was after about 4 days of uptime while playing a 1080p
video in mpv. The second time was about 5 minutes after booting up
while restoring a firefox session.

Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-12 Thread Larry Finger


On 6/12/19 1:55 AM, Christoph Hellwig wrote:
>
> Ooops, yes.  But I think we could just enable ZONE_DMA on 32-bit
> powerpc.  Crude enablement hack below:
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 8c1c636308c8..1dd71a98b70c 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -372,7 +372,7 @@ config PPC_ADV_DEBUG_DAC_RANGE
>
>   config ZONE_DMA
>   	bool
> -	default y if PPC_BOOK3E_64
> +	default y
>
>   config PGTABLE_LEVELS
>   	int
>

With the patch for Kconfig above, and the original patch setting 
ARCH_ZONE_DMA_BITS to 30, everything works.

Do you have any ideas on what should trigger the change in ARCH_ZONE_BITS? 
Should it be CONFIG_PPC32 defined, or perhaps CONFIG_G4_CPU defined?

Larry

Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-12 Thread Benjamin Herrenschmidt

On Wed, 2019-06-12 at 14:41 -0500, Larry Finger wrote:
> On 6/12/19 1:55 AM, Christoph Hellwig wrote:
> > 
> > Ooops, yes.  But I think we could just enable ZONE_DMA on 32-bit
> > powerpc.  Crude enablement hack below:
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 8c1c636308c8..1dd71a98b70c 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -372,7 +372,7 @@ config PPC_ADV_DEBUG_DAC_RANGE
> >
> >config ZONE_DMA
> >bool
> > - default y if PPC_BOOK3E_64
> > + default y
> >
> >config PGTABLE_LEVELS
> >int
> > 
> 
> With the patch for Kconfig above, and the original patch setting 
> ARCH_ZONE_DMA_BITS to 30, everything works.
> 
> Do you have any ideas on what should trigger the change in ARCH_ZONE_BITS? 
> Should it be CONFIG_PPC32 defined, or perhaps CONFIG_G4_CPU defined?

I think CONFIG_PPC32 is fine

Ben.

Re: [PATCH v3 1/3] powerpc/powernv: Add OPAL API interface to get secureboot state

2019-06-12 Thread Nayna




On 06/12/2019 02:17 AM, Daniel Axtens wrote:
> Nayna Jain  writes:
>
>> From: Claudio Carvalho 
>>
>> The X.509 certificates trusted by the platform and other information
>> required to secure boot the OS kernel are wrapped in secure variables,
>> which are controlled by OPAL.
>>
>> This patch adds support to read OPAL secure variables through
>> OPAL_SECVAR_GET call. It returns the metadata and data for a given secure
>> variable based on the unique key.
>>
>> Since OPAL can support different types of backend which can vary in the
>> variable interpretation, a new OPAL API call named OPAL_SECVAR_BACKEND, is
>> added to retrieve the supported backend version. This helps the consumer
>> to know how to interpret the variable.
>>
> (Firstly, apologies that I haven't got around to asking about this yet!)
>
> Are pluggable/versioned backend a good idea?
>
> There are a few things that worry me about the idea:
>
>   - It adds complexity in crypto (or crypto-adjacent) code, and that
>     increases the likelihood that we'll accidentally add a bug with bad
>     consequences.

Sorry, I think I am not clear on what exactly you mean here. Can you 
please elaborate or give specifics ?

>   - Under what circumstances would would we change the kernel-visible
>     behaviour of skiboot? Are we expecting to change the behaviour,
>     content or names of the variables in future? Otherwise the only
>     relevant change I can think of is a change to hardware platforms, and
>     I'm not sure how a change in hardware would lead to change in
>     behaviour in the kernel. Wouldn't Skiboot hide h/w differences?

Backends are intended to be an agreement for firmware, kernel and 
userspace on what the format of variables are, what variables should be 
expected, how they should be signed, etc. Though we don't expect it to 
happen very often, we want to anticipate possible changes in the 
firmware which may affect the kernel such as new features, support of 
new authentication mechanisms, addition of new variables. Corresponding 
skiboot patches are on - 
https://lists.ozlabs.org/pipermail/skiboot/2019-June/014641.html

>   - If we are worried about a long-term-future change to how secure-boot
>     works, would it be better to just add more get/set calls to opal at
>     the point at which we actually implement the new system?

The intention is to avoid to re-implement the key/value interface for 
each scheme. Do you mean to deprecate the old APIs and add new APIs with 
every scheme ?

>   - UEFI added EFI_VARIABLE_AUTHENTICATION_3 in a way that - as far
>     as I know - didn't break backwards compatibility. Is there a reason
>     we cannot add features that way instead? (It also dropped v1 of the
>     authentication header.)
>
>   - What is the correct fallback behaviour if a kernel receives a result
>     that it does not expect? If a kernel expecting BackendV1 is instead
>     informed that it is running on BackendV2, then the cannot access the
>     secure variable at all, so it cannot load keys that are potentially
>     required to successfully boot (e.g. to validate the module for
>     network card or graphics!)

The backend is declared by the firmware, and is set at compile-time. The 
kernel queries firmware on which backend is in use, and the backend will 
not change at runtime. If the backend in use by the firmware is not 
supported by the kernel (e.g. kernel is too old), the kernel does not 
attempt to read any secure variables, as it won't understand what the 
format is. This is a secure boot failure condition, as we cannot verify 
the next kernel. With addition of new backends in the skiboot, the 
support will be added to the kernel. Note: skiboot and skiroot should 
always be in sync with backend support.

Thanks & Regards,
    - Nayna

Re: [PATCH v3 1/3] powerpc/powernv: Add OPAL API interface to get secureboot state

2019-06-12 Thread Daniel Axtens

Hi Nayna,

>>> Since OPAL can support different types of backend which can vary in the
>>> variable interpretation, a new OPAL API call named OPAL_SECVAR_BACKEND, is
>>> added to retrieve the supported backend version. This helps the consumer
>>> to know how to interpret the variable.
>>>
>> (Firstly, apologies that I haven't got around to asking about this yet!)
>>
>> Are pluggable/versioned backend a good idea?
>>
>> There are a few things that worry me about the idea:
>>
>>   - It adds complexity in crypto (or crypto-adjacent) code, and that
>> increases the likelihood that we'll accidentally add a bug with bad
>> consequences.
>
> Sorry, I think I am not clear on what exactly you mean here. Can you 
> please elaborate or give specifics ?

Cryptosystems with greater flexibility can have new kinds of
vulnerabilities arise from the greater complexity. The first sort of
thing that comes to mind is a downgrade attack like from TLS. I think
you're protected from this because the mode cannot be negotiated at run
time, but in general it's security sensitive code so I'd like it to be
as simple as possible.

>>   - If we are worried about a long-term-future change to how secure-boot
>> works, would it be better to just add more get/set calls to opal at
>> the point at which we actually implement the new system?
>
> The intention is to avoid to re-implement the key/value interface for 
> each scheme. Do you mean to deprecate the old APIs and add new APIs with 
> every scheme ?

Yes, because I expect the scheme would change very, very rarely.

>>   - Under what circumstances would would we change the kernel-visible
>> behaviour of skiboot? Are we expecting to change the behaviour,
>> content or names of the variables in future? Otherwise the only
>> relevant change I can think of is a change to hardware platforms, and
>> I'm not sure how a change in hardware would lead to change in
>> behaviour in the kernel. Wouldn't Skiboot hide h/w differences?
>
> Backends are intended to be an agreement for firmware, kernel and 
> userspace on what the format of variables are, what variables should be 
> expected, how they should be signed, etc. Though we don't expect it to 
> happen very often, we want to anticipate possible changes in the 
> firmware which may affect the kernel such as new features, support of 
> new authentication mechanisms, addition of new variables. Corresponding 
> skiboot patches are on - 
> https://lists.ozlabs.org/pipermail/skiboot/2019-June/014641.html

I still feel like this is holding onto ongoing complexity for very
little gain, but perhaps this is because I can't picture a specific
change that would actually require a wholesale change to the scheme.

You mention new features, support for new authentication mechanisms, and
addition of new variables.

 - New features is a bit too generic to answer specifically. In general
   I accept that there exists some new feature that would be
   sufficiently backwards-incompatible as to require a new version. I
   just can't think of one off the top of my head and so I'm not
   convinced it's worth the complexity. Did you have something in mind?

 - By support for new authentication mechanisms, I assume you mean new
   mechanisms for authenticating variable updates? This is communicated
   in edk2 via the attributes field. Looking at patch 5 from the skiboot
   series:

+ * When the attribute EFI_VARIABLE_TIME_BASED_AUTHENTICATED_WRITE_ACCESS is set,
+ * then the Data buffer shall begin with an instance of a complete (and
+ * serialized) EFI_VARIABLE_AUTHENTICATION_2 descriptor.

   Could a new authentication scheme be communicated by setting a
   different attribute value? Or are we not carrying attributes in the
   metadata blob?

 - For addition of new variables, I'm confused as to why this would
   require a new API - wouldn't it just be exposed in the normal way via
   opal_secvar_get(_next)?

I guess I also somewhat object to calling it a 'backend' if we're using
it as a version scheme. I think the skiboot storage backends are true
backends - they provide different implementations of the same
functionality with the same API, but this seems like you're using it to
indicate different functionality. It seems like we're using it as if it
were called OPAL_SECVAR_VERSION.

>>   - What is the correct fallback behaviour if a kernel receives a result
>> that it does not expect? If a kernel expecting BackendV1 is instead
>> informed that it is running on BackendV2, then the cannot access the
>> secure variable at all, so it cannot load keys that are potentially
>> required to successfully boot (e.g. to validate the module for
>> network card or graphics!)
>
> The backend is declared by the firmware, and is set at compile-time. The 
> kernel queries firmware on which backend is in use, and the backend will 
> not change at runtime. If the backend in use by the firmware is not 
> supported by the kernel (e.g.

[PATCH v4 12/28] docs: kbuild: convert docs to ReST and rename to *.rst

2019-06-12 Thread Mauro Carvalho Chehab

The kbuild documentation clearly shows that the documents
there were written at different times: some use markdown,
some use their own peculiar logic to split sections.

Convert everything to ReST without affecting too much
the author's style and avoiding adding unneeded markups.

The conversion is actually:
  - add blank lines and indentation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/admin-guide/README.rst  |   2 +-
 ...eaders_install.txt => headers_install.rst} |   5 +-
 Documentation/kbuild/index.rst|  27 +
 Documentation/kbuild/issues.rst   |  11 +
 .../kbuild/{kbuild.txt => kbuild.rst} | 119 ++--
 ...nfig-language.txt => kconfig-language.rst} | 232 
 ...anguage.txt => kconfig-macro-language.rst} |  37 +-
 .../kbuild/{kconfig.txt => kconfig.rst}   | 136 +++--
 .../kbuild/{makefiles.txt => makefiles.rst}   | 530 +++---
 .../kbuild/{modules.txt => modules.rst}   | 168 +++---
 Documentation/kernel-hacking/hacking.rst  |   4 +-
 Documentation/process/coding-style.rst|   2 +-
 Documentation/process/submit-checklist.rst|   2 +-
 .../it_IT/kernel-hacking/hacking.rst  |   4 +-
 .../it_IT/process/coding-style.rst|   2 +-
 .../it_IT/process/submit-checklist.rst|   2 +-
 .../zh_CN/process/coding-style.rst|   2 +-
 .../zh_CN/process/submit-checklist.rst|   2 +-
 Kconfig   |   2 +-
 arch/arc/plat-eznps/Kconfig   |   2 +-
 arch/c6x/Kconfig  |   2 +-
 arch/microblaze/Kconfig.debug |   2 +-
 arch/microblaze/Kconfig.platform  |   2 +-
 arch/nds32/Kconfig|   2 +-
 arch/openrisc/Kconfig |   2 +-
 arch/powerpc/sysdev/Kconfig   |   2 +-
 arch/riscv/Kconfig|   2 +-
 drivers/auxdisplay/Kconfig|   2 +-
 drivers/firmware/Kconfig  |   2 +-
 drivers/mtd/devices/Kconfig   |   2 +-
 drivers/net/ethernet/smsc/Kconfig |   6 +-
 drivers/net/wireless/intel/iwlegacy/Kconfig   |   4 +-
 drivers/net/wireless/intel/iwlwifi/Kconfig|   2 +-
 drivers/parport/Kconfig   |   2 +-
 drivers/scsi/Kconfig  |   4 +-
 drivers/staging/sm750fb/Kconfig   |   2 +-
 drivers/usb/misc/Kconfig  |   4 +-
 drivers/video/fbdev/Kconfig   |  14 +-
 net/bridge/netfilter/Kconfig  |   2 +-
 net/ipv4/netfilter/Kconfig|   2 +-
 net/ipv6/netfilter/Kconfig|   2 +-
 net/netfilter/Kconfig |  16 +-
 net/tipc/Kconfig  |   2 +-
 scripts/Kbuild.include|   4 +-
 scripts/Makefile.host |   2 +-
 scripts/kconfig/symbol.c  |   2 +-
 .../tests/err_recursive_dep/expected_stderr   |  14 +-
 sound/oss/dmasound/Kconfig|   6 +-
 48 files changed, 840 insertions(+), 561 deletions(-)
 rename Documentation/kbuild/{headers_install.txt => headers_install.rst} (96%)
 create mode 100644 Documentation/kbuild/index.rst
 create mode 100644 Documentation/kbuild/issues.rst
 rename Documentation/kbuild/{kbuild.txt => kbuild.rst} (72%)
 rename Documentation/kbuild/{kconfig-language.txt => kconfig-language.rst} 
(85%)
 rename Documentation/kbuild/{kconfig-macro-language.txt => 
kconfig-macro-language.rst} (94%)
 rename Documentation/kbuild/{kconfig.txt => kconfig.rst} (80%)
 rename Documentation/kbuild/{makefiles.txt => makefiles.rst} (83%)
 rename Documentation/kbuild/{modules.txt => modules.rst} (84%)

diff --git a/Documentation/admin-guide/README.rst 
b/Documentation/admin-guide/README.rst
index a582c780c3bd..cc6151fc0845 100644
--- a/Documentation/admin-guide/README.rst
+++ b/Documentation/admin-guide/README.rst
@@ -227,7 +227,7 @@ Configuring the kernel
  "make tinyconfig"  Configure the tiniest possible kernel.
 
You can find more information on using the Linux kernel config tools
-   in Documentation/kbuild/kconfig.txt.
+   in Documentation/kbuild/kconfig.rst.
 
  - NOTES on ``make config``:
 
diff --git a/Documentation/kbuild/headers_install.txt 
b/Documentation/kbuild/headers_install.rst
similarity index 96%
rename from Documentation/kbuild/headers_install.txt
rename to Documentation/kbuild/headers_install.rst
index f0153adb95e2..1ab7294e41ac 100644
--- a/Documentation/kbuild/headers_install.txt
+++ b/Documentation/kbuild/headers_install.rst
@@ -1,3 +1,4 @@
+=
 Exporting kernel

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Michael Neuling

On Wed, 2019-06-12 at 09:43 +0200, Cédric Le Goater wrote:
> On 12/06/2019 09:22, Michael Neuling wrote:
> > In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
> > option") I screwed up some assembler and corrupted a pointer in
> > r3. This resulted in crashes like the below from Cédric:
> > 
> >   [   44.374746] BUG: Kernel NULL pointer dereference at 0x13bf
> >   [   44.374848] Faulting instruction address: 0xc010b044
> >   [   44.374906] Oops: Kernel access of bad area, sig: 11 [#1]
> >   [   44.374951] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
> > pSeries
> >   [   44.375018] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
> > iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
> > nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
> > bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
> > iptable_filter bpfilter vmx_crypto crct10dif_vpmsum crc32c_vpmsum kvm_hv 
> > kvm sch_fq_codel ip_tables x_tables autofs4 virtio_net net_failover 
> > virtio_scsi failover
> >   [   44.375401] CPU: 8 PID: 1771 Comm: qemu-system-ppc Kdump: loaded Not 
> > tainted 5.2.0-rc4+ #3
> >   [   44.375500] NIP:  c010b044 LR: c008089dacf4 CTR: 
> > c010aff4
> >   [   44.375604] REGS: c0179b397710 TRAP: 0300   Not tainted  
> > (5.2.0-rc4+)
> >   [   44.375691] MSR:  8280b033   
> > CR: 42244842  XER: 
> >   [   44.375815] CFAR: c010aff8 DAR: 13bf DSISR: 
> > 4200 IRQMASK: 0
> >   [   44.375815] GPR00: c008089dd6bc c0179b3979a0 c00808a04300 
> > 
> >   [   44.375815] GPR04:  0003 2444b05d 
> > c017f11c45d0
> >   [   44.375815] GPR08: 07803e018dfe 0028 0001 
> > 0075
> >   [   44.375815] GPR12: c010aff4 c7ff6300  
> > 
> >   [   44.375815] GPR16:  c017f11d  
> > c017f11ca7a8
> >   [   44.375815] GPR20: c017f11c42ec   
> > 000a
> >   [   44.375815] GPR24: fffc  c017f11c 
> > c1a77ed8
> >   [   44.375815] GPR28: c0179af7 fffc c008089ff170 
> > c0179ae88540
> >   [   44.376673] NIP [c010b044] kvmppc_h_set_dabr+0x50/0x68
> >   [   44.376754] LR [c008089dacf4] kvmppc_pseries_do_hcall+0xa3c/0xeb0 
> > [kvm_hv]
> >   [   44.376849] Call Trace:
> >   [   44.376886] [c0179b3979a0] [c017f11c] 0xc017f11c 
> > (unreliable)
> >   [   44.376982] [c0179b397a10] [c008089dd6bc] 
> > kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
> >   [   44.377084] [c0179b397ae0] [c008093f8bcc] 
> > kvmppc_vcpu_run+0x34/0x48 [kvm]
> >   [   44.377185] [c0179b397b00] [c008093f522c] 
> > kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
> >   [   44.377286] [c0179b397b90] [c008093e3618] 
> > kvm_vcpu_ioctl+0x460/0x850 [kvm]
> >   [   44.377384] [c0179b397d00] [c04ba6c4] 
> > do_vfs_ioctl+0xe4/0xb40
> >   [   44.377464] [c0179b397db0] [c04bb1e4] ksys_ioctl+0xc4/0x110
> >   [   44.377547] [c0179b397e00] [c04bb258] sys_ioctl+0x28/0x80
> >   [   44.377628] [c0179b397e20] [c000b888] system_call+0x5c/0x70
> >   [   44.377712] Instruction dump:
> >   [   44.377765] 4082fff4 4c00012c 3860 4e800020 e96280c0 896b 
> > 2c2b 3860
> >   [   44.377862] 4d820020 50852e74 508516f6 78840724  f8a313c8 
> > 7c942ba6 7cbc2ba6
> > 
> > This fixes the problem by only changing r3 when we are returning
> > immediately.
> > 
> > Signed-off-by: Michael Neuling 
> > Reported-by: Cédric Le Goater 
> 
> On nested, I still see : 
> 
> [   94.609274] Oops: Exception in kernel mode, sig: 4 [#1]
> [   94.609432] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
> pSeries
> [   94.609596] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
> iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
> nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
> bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
> iptable_filter bpfilter vmx_crypto kvm_hv crct10dif_vpmsum crc32c_vpmsum kvm 
> sch_fq_codel ip_tables x_tables autofs4 virtio_net virtio_scsi net_failover 
> failover
> [   94.610179] CPU: 12 PID: 2026 Comm: qemu-system-ppc Kdump: loaded Not 
> tainted 5.2.0-rc4+ #6
> [   94.610290] NIP:  c010b050 LR: c00808bbacf4 CTR: 
> c010aff4
> [   94.610400] REGS: c017913d7710 TRAP: 0700   Not tainted  (5.2.0-rc4+)
> [   94.610493] MSR:  8284b033   CR: 
> 42224842  XER: 
> [   94.610671] CFAR: c010b030 IRQMASK: 0 
> [   94.610671] GPR00: c00808bbd6bc c017913d79a0 c00808be4300 
> c01791376220 
> [   94.610671] GPR04:  0003 f679892e 
>

[PATCH v2] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Michael Neuling

Commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
option") screwed up some assembler and corrupted a pointer in
r3. This resulted in crashes like the below:

  [   44.374746] BUG: Kernel NULL pointer dereference at 0x13bf
  [   44.374848] Faulting instruction address: 0xc010b044
  [   44.374906] Oops: Kernel access of bad area, sig: 11 [#1]
  [   44.374951] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA 
pSeries
  [   44.375018] Modules linked in: vhost_net vhost tap xt_CHECKSUM 
iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter vmx_crypto crct10dif_vpmsum crc32c_vpmsum kvm_hv kvm 
sch_fq_codel ip_tables x_tables autofs4 virtio_net net_failover virtio_scsi 
failover
  [   44.375401] CPU: 8 PID: 1771 Comm: qemu-system-ppc Kdump: loaded Not 
tainted 5.2.0-rc4+ #3
  [   44.375500] NIP:  c010b044 LR: c008089dacf4 CTR: 
c010aff4
  [   44.375604] REGS: c0179b397710 TRAP: 0300   Not tainted  (5.2.0-rc4+)
  [   44.375691] MSR:  8280b033   CR: 
42244842  XER: 
  [   44.375815] CFAR: c010aff8 DAR: 13bf DSISR: 4200 
IRQMASK: 0
  [   44.375815] GPR00: c008089dd6bc c0179b3979a0 c00808a04300 

  [   44.375815] GPR04:  0003 2444b05d 
c017f11c45d0
  [   44.375815] GPR08: 07803e018dfe 0028 0001 
0075
  [   44.375815] GPR12: c010aff4 c7ff6300  

  [   44.375815] GPR16:  c017f11d  
c017f11ca7a8
  [   44.375815] GPR20: c017f11c42ec   
000a
  [   44.375815] GPR24: fffc  c017f11c 
c1a77ed8
  [   44.375815] GPR28: c0179af7 fffc c008089ff170 
c0179ae88540
  [   44.376673] NIP [c010b044] kvmppc_h_set_dabr+0x50/0x68
  [   44.376754] LR [c008089dacf4] kvmppc_pseries_do_hcall+0xa3c/0xeb0 
[kvm_hv]
  [   44.376849] Call Trace:
  [   44.376886] [c0179b3979a0] [c017f11c] 0xc017f11c 
(unreliable)
  [   44.376982] [c0179b397a10] [c008089dd6bc] 
kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
  [   44.377084] [c0179b397ae0] [c008093f8bcc] 
kvmppc_vcpu_run+0x34/0x48 [kvm]
  [   44.377185] [c0179b397b00] [c008093f522c] 
kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
  [   44.377286] [c0179b397b90] [c008093e3618] 
kvm_vcpu_ioctl+0x460/0x850 [kvm]
  [   44.377384] [c0179b397d00] [c04ba6c4] do_vfs_ioctl+0xe4/0xb40
  [   44.377464] [c0179b397db0] [c04bb1e4] ksys_ioctl+0xc4/0x110
  [   44.377547] [c0179b397e00] [c04bb258] sys_ioctl+0x28/0x80
  [   44.377628] [c0179b397e20] [c000b888] system_call+0x5c/0x70
  [   44.377712] Instruction dump:
  [   44.377765] 4082fff4 4c00012c 3860 4e800020 e96280c0 896b 2c2b 
3860
  [   44.377862] 4d820020 50852e74 508516f6 78840724  f8a313c8 
7c942ba6 7cbc2ba6

Fix the bug by only changing r3 when we are returning immediately.

Fixes: c1fe190c0672 ("powerpc: Add force enable of DAWR on P9 option")
Signed-off-by: Michael Neuling 
Reported-by: Cédric Le Goater 
--
mpe: This is for 5.2 fixes

v2: Review from Christophe Leroy
  - De-Mikey/Cedric-ify commit message
  - Add "Fixes:"
  - Other trivial commit messages changes
  - No code change
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 139027c62d..f781ee1458 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2519,8 +2519,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
LOAD_REG_ADDR(r11, dawr_force_enable)
lbz r11, 0(r11)
cmpdi   r11, 0
+   bne 3f
li  r3, H_HARDWARE
-   beqlr
+   blr
+3:
/* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
rlwimi  r5, r4, 2, DAWRX_WT
-- 
2.21.0
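
As a rough cross-check of the oops in the commit message: the faulting address 0x13bf is what you get when the store displacement is added to H_HARDWARE instead of to the vcpu pointer. The sketch below is only a plain C model of the control flow before and after the patch, not the real handler. Assumptions: H_HARDWARE is -1 (as in asm/hvcall.h), the displacement 5056 (0x13c0) is taken from the "std r4,5056(r3)" disassembly quoted elsewhere in this thread and is specific to that build, and the function names are made up for illustration.

```c
#include <stdint.h>

#define H_HARDWARE       INT64_C(-1)   /* hcall error code; -1 in asm/hvcall.h */
#define VCPU_DAWR_OFFSET INT64_C(5056) /* assumed: from "std r4,5056(r3)", build-specific */

/*
 * Model of the flow before the fix: r3 is overwritten with H_HARDWARE
 * ahead of the conditional return, so the fall-through path uses the
 * error code as the vcpu pointer. Returns the address the first store
 * would target, or 0 if the function returned early.
 */
static uint64_t store_addr_before_fix(uint64_t vcpu, int dawr_force_enable)
{
	int64_t r3 = (int64_t)vcpu;               /* r3 = vcpu on entry */

	r3 = H_HARDWARE;                          /* li    r3, H_HARDWARE */
	if (dawr_force_enable == 0)               /* beqlr */
		return 0;
	return (uint64_t)(r3 + VCPU_DAWR_OFFSET); /* std   r4, VCPU_DAWR(r3) */
}

/* Model of the flow after the fix: r3 is only changed on the early return. */
static uint64_t store_addr_after_fix(uint64_t vcpu, int dawr_force_enable)
{
	int64_t r3 = (int64_t)vcpu;               /* r3 = vcpu on entry */

	if (dawr_force_enable == 0) {             /* bne 3f not taken */
		r3 = H_HARDWARE;                  /* li    r3, H_HARDWARE */
		return 0;                         /* blr; r3 carries the error code */
	}
	return (uint64_t)(r3 + VCPU_DAWR_OFFSET); /* 3: ... std r4, VCPU_DAWR(r3) */
}
```

With dawr_force_enable set, the first function yields -1 + 0x13c0 = 0x13bf regardless of the vcpu pointer, while the second yields vcpu + 0x13c0.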

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Michael Neuling



> > > 3:
> > > /* Emulate H_SET_DABR/X on P8 for the sake of compat mode
> > > guests */
> > > rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
> > > c010b03c:   74 2e 85 50 rlwimi  r5,r4,5,25,26
> > > rlwimi  r5, r4, 2, DAWRX_WT
> > > c010b040:   f6 16 85 50 rlwimi  r5,r4,2,27,27
> > > clrrdi  r4, r4, 3
> > > c010b044:   24 07 84 78 rldicr  r4,r4,0,60
> > > std r4, VCPU_DAWR(r3)
> > > c010b048:   c0 13 83 f8 std r4,5056(r3)
> > > std r5, VCPU_DAWRX(r3)
> > > c010b04c:   c8 13 a3 f8 std r5,5064(r3)
> > > mtspr   SPRN_DAWR, r4
> > > c010b050:   a6 2b 94 7c mtspr   180,r4
> > > mtspr   SPRN_DAWRX, r5
> > > c010b054:   a6 2b bc 7c mtspr   188,r5
> > > li  r3, 0
> > > c010b058:   00 00 60 38 li  r3,0
> > > blr
> > > c010b05c:   20 00 80 4e blr
> > 
> > It's the `mtspr   SPRN_DAWR, r4` as you're HV=0.  I'm not sure how
> > nested works in that regard. Is the level above supposed to trap and
> > emulate that?
> > 
> 
> Yeah so as a nested hypervisor we need to avoid that call to mtspr
> SPRN_DAWR since it's HV privileged and we run with HV = 0.
> 
> The fix will be to check kvmhv_on_pseries() before doing the write. In
> fact we should avoid the write any time we call the function from _not_
> real mode.
> 
> I'll submit a fix for the KVM side. Doesn't look like this is anything
> to do with Mikey's patch, was always broken as far as I can tell.

Thanks Suraj.

Mikey
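
For readers following along, the shape of the fix Suraj describes could be sketched roughly as below. This is a user-space C model under stated assumptions, not the KVM code: the structs and set_dawr_model() are hypothetical, and the nested_hv flag merely stands in for a kvmhv_on_pseries() style check. The idea is that the values are always recorded in the vcpu state, but the HV-privileged SPR write is skipped when running as a nested hypervisor (HV = 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved guest state (VCPU_DAWR/VCPU_DAWRX). */
struct vcpu_model {
	uint64_t dawr;
	uint64_t dawrx;
};

/* Hypothetical stand-in for the hardware SPRs, counting writes to them. */
struct hw_model {
	uint64_t spr_dawr;
	uint64_t spr_dawrx;
	int spr_writes;
};

static void set_dawr_model(struct vcpu_model *vcpu, struct hw_model *hw,
			   uint64_t dawr, uint64_t dawrx, bool nested_hv)
{
	/* Always remember the requested values in the vcpu state. */
	vcpu->dawr = dawr;
	vcpu->dawrx = dawrx;

	/* mtspr SPRN_DAWR/DAWRX is HV privileged: skip it when HV = 0. */
	if (nested_hv)
		return;

	hw->spr_dawr = dawr;
	hw->spr_dawrx = dawrx;
	hw->spr_writes += 2;
}
```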

Re: [PATCH] powerpc: Enable kernel XZ compression option on PPC_85xx

2019-06-12 Thread Daniel Axtens

Pawel Dembicki  writes:

> Enable kernel XZ compression option on PPC_85xx. Tested with
> simpleImage on TP-Link TL-WDR4900 (Freescale P1014 processor).
>
> Suggested-by: Christian Lamparter 
> Signed-off-by: Pawel Dembicki 
> ---
>  arch/powerpc/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 8c1c636308c8..daf4cb968922 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -196,7 +196,7 @@ config PPC
>   select HAVE_IOREMAP_PROT
>   select HAVE_IRQ_EXIT_ON_IRQ_STACK
>   select HAVE_KERNEL_GZIP
> - select HAVE_KERNEL_XZ   if PPC_BOOK3S || 44x
> + select HAVE_KERNEL_XZ   if PPC_BOOK3S || 44x || PPC_85xx

(I'm not super well versed in the compression stuff, so apologies if
this is a dumb question.) If it's this simple, is there any reason we
can't turn it on generally, or convert it to a blacklist of platforms
known not to work?

Regards,
Daniel

>   select HAVE_KPROBES
>   select HAVE_KPROBES_ON_FTRACE
>   select HAVE_KRETPROBES
> -- 
> 2.20.1

Re: [RFC/RFT PATCH v2] ASoC: fsl_esai: Revert "ETDR and TX0~5 registers are non volatile"

2019-06-12 Thread Nicolin Chen

Hi Shengjiu,

On Thu, Jun 13, 2019 at 03:00:58AM +, S.j. Wang wrote:
> > Commit 8973112aa41b ("ASoC: fsl_esai: ETDR and TX0~5 registers are non
> > volatile") removed TX data registers from the volatile_reg list and appended
> > default values for them. However, being data registers of TX, they should
> > not have been removed from the list because they should not be cached --
> > see the following reason.
> > 
> > When doing regcache_sync(), this operation might accidentally write some
> > dirty data to these registers, in case that cached data happen to be
> > different from the default ones, which might also result in a channel shift or
> > swap situation, since the number of write-via-sync operations at ETDR
> > would very unlikely match the channel number.
> > 
> > So this patch reverts the original commit to keep TX data registers in
> > volatile_reg list in order to prevent them from being written by
> > regcache_sync().
> > 
> > Note: this revert is not a complete revert as it keeps those macros of
> > registers remaining in the default value list while the original commit also
> > changed other entries in the list. And this patch isn't very necessary to Cc
> > stable tree since there has been always a FIFO reset operation around the
> > regcache_sync() call, even prior to this reverted commit.
> > 
> > Signed-off-by: Nicolin Chen 
> > Cc: Shengjiu Wang 
> > ---
> > Hi Mark,
> > In case there's no objection against the patch, I'd still like to wait for a
> > Tested-by from NXP folks before submitting it. Thanks!
> 
> bool regmap_volatile(struct regmap *map, unsigned int reg)
> {
> if (!map->format.format_write && !regmap_readable(map, reg))
> return false;
> 
> 
> Actually with this patch, the regcache_sync will write the 0 to ETDR, even
> It is declared volatile, the reason is that in regmap_volatile(), the first
> condition
> 
> (!map->format.format_write && !regmap_readable(map, reg))  is true.
> 
> So the regmap_volatile will return false.

Interesting finding. So a write-only register will not be treated
as a volatile register (to avoid regcache_sync) at all.

> And in regcache_reg_needs_sync(), because there is no default value
> It will return true, then the ETDR need be synced, and be written 0.

Looks like keeping them in or out of the volatile_reg list might give
the same result either way, with data being written to them, while our
current code at least would not force a write of 0.

So I think having a FIFO reset won't be a bad idea at all. And since
our suspend/resume() functions are already doing regcache_sync() with
a FIFO reset, we can just reuse that code for your reset routine.

Thanks a lot
Nicolin

RE: [RFC/RFT PATCH v2] ASoC: fsl_esai: Revert "ETDR and TX0~5 registers are non volatile"

2019-06-12 Thread S.j. Wang

Hi
> 
> Commit 8973112aa41b ("ASoC: fsl_esai: ETDR and TX0~5 registers are non
> volatile") removed TX data registers from the volatile_reg list and appended
> default values for them. However, being data registers of TX, they should
> not have been removed from the list because they should not be cached --
> see the following reason.
> 
> When doing regcache_sync(), this operation might accidentally write some
> dirty data to these registers, in case that cached data happen to be
> different from the default ones, which might also result in a channel shift or
> swap situation, since the number of write-via-sync operations at ETDR
> would very unlikely match the channel number.
> 
> So this patch reverts the original commit to keep TX data registers in
> volatile_reg list in order to prevent them from being written by
> regcache_sync().
> 
> Note: this revert is not a complete revert as it keeps those macros of
> registers remaining in the default value list while the original commit also
> changed other entries in the list. And this patch isn't very necessary to Cc
> stable tree since there has been always a FIFO reset operation around the
> regcache_sync() call, even prior to this reverted commit.
> 
> Signed-off-by: Nicolin Chen 
> Cc: Shengjiu Wang 
> ---
> Hi Mark,
> In case there's no objection against the patch, I'd still like to wait for a
> Tested-by from NXP folks before submitting it. Thanks!

bool regmap_volatile(struct regmap *map, unsigned int reg)
{
	if (!map->format.format_write && !regmap_readable(map, reg))
		return false;


Actually, with this patch regcache_sync() will still write 0 to ETDR even
though it is declared volatile. The reason is that in regmap_volatile(), the
first condition

(!map->format.format_write && !regmap_readable(map, reg))  is true.

So regmap_volatile() will return false.

And in regcache_reg_needs_sync(), because there is no default value,
it will return true, so ETDR needs to be synced and is written with 0.

Here is the code for regcache_default_sync()

static int regcache_default_sync(struct regmap *map, unsigned int min,
				 unsigned int max)
{
	unsigned int reg;

	for (reg = min; reg <= max; reg += map->reg_stride) {
		unsigned int val;
		int ret;

		if (regmap_volatile(map, reg) ||
		    !regmap_writeable(map, reg))
			continue;

		ret = regcache_read(map, reg, &val);
		if (ret)
			return ret;

		if (!regcache_reg_needs_sync(map, reg, val))
			continue;

		map->cache_bypass = true;
		ret = _regmap_write(map, reg, val);
		map->cache_bypass = false;

Best regards
Wang shengjiu
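
The decision chain described above can be condensed into a small stand-alone model, in case it helps the discussion. This is a simplified sketch, not the regmap implementation: struct reg_model and the model_* helpers are hypothetical, only the first check of regmap_volatile() quoted above is reproduced faithfully, and the needs-sync step is reduced to "a cached value equal to a known default is skipped".

```c
#include <stdbool.h>

/* Hypothetical per-register description, for illustration only. */
struct reg_model {
	bool readable;         /* regmap_readable() result */
	bool writeable;        /* regmap_writeable() result */
	bool listed_volatile;  /* what the driver's volatile_reg callback says */
	bool has_default;      /* register appears in the default value list */
	unsigned int def;      /* its default, if has_default */
};

/* Mirrors the quoted check: a non-readable register is never volatile
 * when the map has no format_write. */
static bool model_volatile(const struct reg_model *r, bool has_format_write)
{
	if (!has_format_write && !r->readable)
		return false;
	return r->listed_volatile;
}

/* Simplified: only a cached value matching a known default is skipped. */
static bool model_needs_sync(const struct reg_model *r, unsigned int cached)
{
	if (r->has_default && r->def == cached)
		return false;
	return true;
}

/* True if a sync pass in the style of regcache_default_sync() would
 * write this register. */
static bool model_sync_writes(const struct reg_model *r, bool has_format_write,
			      unsigned int cached)
{
	if (model_volatile(r, has_format_write) || !r->writeable)
		return false;
	return model_needs_sync(r, cached);
}
```

For a write-only register such as ETDR (not readable, writeable, listed as volatile, no default) the model says the sync pass still writes it, whereas a readable register listed as volatile is skipped.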

Re: [PATCH] KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()

2019-06-12 Thread Suraj Jitindar Singh

On Thu, 2019-06-13 at 10:16 +1000, Michael Neuling wrote:
> On Wed, 2019-06-12 at 09:43 +0200, Cédric Le Goater wrote:
> > On 12/06/2019 09:22, Michael Neuling wrote:
> > > In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
> > > option") I screwed up some assembler and corrupted a pointer in
> > > r3. This resulted in crashes like the below from Cédric:
> > > 
> > >   [   44.374746] BUG: Kernel NULL pointer dereference at
> > > 0x13bf
> > >   [   44.374848] Faulting instruction address: 0xc010b044
> > >   [   44.374906] Oops: Kernel access of bad area, sig: 11 [#1]
> > >   [   44.374951] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP
> > > NR_CPUS=2048 NUMA pSeries
> > >   [   44.375018] Modules linked in: vhost_net vhost tap
> > > xt_CHECKSUM iptable_mangle xt_MASQUERADE iptable_nat nf_nat
> > > xt_conntrack nf_conntrack nf_defrag_ipv6 libcrc32c nf_defrag_ipv4
> > > ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
> > > ebtables ip6table_filter ip6_tables iptable_filter bpfilter
> > > vmx_crypto crct10dif_vpmsum crc32c_vpmsum kvm_hv kvm sch_fq_codel
> > > ip_tables x_tables autofs4 virtio_net net_failover virtio_scsi
> > > failover
> > >   [   44.375401] CPU: 8 PID: 1771 Comm: qemu-system-ppc Kdump:
> > > loaded Not tainted 5.2.0-rc4+ #3
> > >   [   44.375500] NIP:  c010b044 LR: c008089dacf4 CTR:
> > > c010aff4
> > >   [   44.375604] REGS: c0179b397710 TRAP: 0300   Not
> > > tainted  (5.2.0-rc4+)
> > >   [   44.375691] MSR:  8280b033
> > >   CR: 42244842  XER: 
> > >   [   44.375815] CFAR: c010aff8 DAR: 13bf
> > > DSISR: 4200 IRQMASK: 0
> > >   [   44.375815] GPR00: c008089dd6bc c0179b3979a0
> > > c00808a04300 
> > >   [   44.375815] GPR04:  0003
> > > 2444b05d c017f11c45d0
> > >   [   44.375815] GPR08: 07803e018dfe 0028
> > > 0001 0075
> > >   [   44.375815] GPR12: c010aff4 c7ff6300
> > >  
> > >   [   44.375815] GPR16:  c017f11d
> > >  c017f11ca7a8
> > >   [   44.375815] GPR20: c017f11c42ec 
> > >  000a
> > >   [   44.375815] GPR24: fffc 
> > > c017f11c c1a77ed8
> > >   [   44.375815] GPR28: c0179af7 fffc
> > > c008089ff170 c0179ae88540
> > >   [   44.376673] NIP [c010b044]
> > > kvmppc_h_set_dabr+0x50/0x68
> > >   [   44.376754] LR [c008089dacf4]
> > > kvmppc_pseries_do_hcall+0xa3c/0xeb0 [kvm_hv]
> > >   [   44.376849] Call Trace:
> > >   [   44.376886] [c0179b3979a0] [c017f11c]
> > > 0xc017f11c (unreliable)
> > >   [   44.376982] [c0179b397a10] [c008089dd6bc]
> > > kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
> > >   [   44.377084] [c0179b397ae0] [c008093f8bcc]
> > > kvmppc_vcpu_run+0x34/0x48 [kvm]
> > >   [   44.377185] [c0179b397b00] [c008093f522c]
> > > kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
> > >   [   44.377286] [c0179b397b90] [c008093e3618]
> > > kvm_vcpu_ioctl+0x460/0x850 [kvm]
> > >   [   44.377384] [c0179b397d00] [c04ba6c4]
> > > do_vfs_ioctl+0xe4/0xb40
> > >   [   44.377464] [c0179b397db0] [c04bb1e4]
> > > ksys_ioctl+0xc4/0x110
> > >   [   44.377547] [c0179b397e00] [c04bb258]
> > > sys_ioctl+0x28/0x80
> > >   [   44.377628] [c0179b397e20] [c000b888]
> > > system_call+0x5c/0x70
> > >   [   44.377712] Instruction dump:
> > >   [   44.377765] 4082fff4 4c00012c 3860 4e800020 e96280c0
> > > 896b 2c2b 3860
> > >   [   44.377862] 4d820020 50852e74 508516f6 78840724 
> > > f8a313c8 7c942ba6 7cbc2ba6
> > > 
> > > This fixes the problem by only changing r3 when we are returning
> > > immediately.
> > > 
> > > Signed-off-by: Michael Neuling 
> > > Reported-by: Cédric Le Goater 
> > 
> > On nested, I still see : 
> > 
> > [   94.609274] Oops: Exception in kernel mode, sig: 4 [#1]
> > [   94.609432] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048
> > NUMA pSeries
> > [   94.609596] Modules linked in: vhost_net vhost tap xt_CHECKSUM
> > iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack
> > nf_conntrack nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT
> > nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables
> > ip6table_filter ip6_tables iptable_filter bpfilter vmx_crypto
> > kvm_hv crct10dif_vpmsum crc32c_vpmsum kvm sch_fq_codel ip_tables
> > x_tables autofs4 virtio_net virtio_scsi net_failover failover
> > [   94.610179] CPU: 12 PID: 2026 Comm: qemu-system-ppc Kdump:
> > loaded Not tainted 5.2.0-rc4+ #6
> > [   94.610290] NIP:  c010b050 LR: c00808bbacf4 CTR:
> > c010aff4
> > [   94.610400] REGS: c017913d7710 TRAP: 0700   Not
> > tainted  (5.2.0-rc4+)
> > [   94.610493] MSR:  8284b033
> >   CR: 42224842

[PATCH v2] Powerpc/Watchpoint: Restore nvgprs while returning from exception

2019-06-12 Thread Ravi Bangoria

Powerpc hw triggers the watchpoint before executing the instruction. To
provide trigger-after-execute behavior, the kernel emulates the instruction.
If the instruction is 'load something into a non-volatile register', the
exception handler should restore the emulated register state when
returning, otherwise the register state will be corrupted.
E.g., adding a watchpoint on a list can corrupt the list:

  # cat /proc/kallsyms | grep kthread_create_list
  c121c8b8 d kthread_create_list

Add watchpoint on kthread_create_list->prev:

  # perf record -e mem:0xc121c8c0

Run some workload such that new kthread gets invoked. Ex, I just
logged out from console:

  list_add corruption. next->prev should be prev (c1214e00), \
but was c121c8b8. (next=c121c8b8).
  WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
  CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
  ...
  NIP __list_add_valid+0xb4/0xc0
  LR __list_add_valid+0xb0/0xc0
  Call Trace:
  __list_add_valid+0xb0/0xc0 (unreliable)
  __kthread_create_on_node+0xe0/0x260
  kthread_create_on_node+0x34/0x50
  create_worker+0xe8/0x260
  worker_thread+0x444/0x560
  kthread+0x160/0x1a0
  ret_from_kernel_thread+0x5c/0x70

List corruption happened because it uses 'load into non-volatile
register' instruction:

Snippet from __kthread_create_on_node:

  c0136be8: addis   r29,r2,-19
  c0136bec: ld  r29,31424(r29)
if (!__list_add_valid(new, prev, next))
  c0136bf0: mr  r3,r30
  c0136bf4: mr  r5,r28
  c0136bf8: mr  r4,r29
  c0136bfc: bl  c059a2f8 <__list_add_valid+0x8>

Register state from WARN_ON():

  GPR00: c059a3a0 c07ff23afb50 c1344e00 0075
  GPR04:   001852af8bc1 
  GPR08: 0001 0007 0006 04aa
  GPR12:  c07eb080 c0137038 c05ff62aaa00
  GPR16:   c07fffbe7600 c07fffbe7370
  GPR20: c07fffbe7320 c07fffbe7300 c1373a00 
  GPR24: fef7 c012e320 c07ff23afcb0 c0cb8628
  GPR28: c121c8b8 c1214e00 c07fef5b17e8 c07fef5b17c0

Watchpoint hit at 0xc0136bec.

  addis   r29,r2,-19
   => r29 = 0xc1344e00 + (-19 << 16)
   => r29 = 0xc1214e00

  ld  r29,31424(r29)
   => r29 = *(0xc1214e00 + 31424)
   => r29 = *(0xc121c8c0)

0xc121c8c0 is where we placed a watchpoint and thus this
instruction was emulated by emulate_step. But because handle_dabr_fault
did not restore the emulated register state, r29 still contains the stale
value in the above register state.

Fixes: 5aae8a5370802 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 
64-bit server processors")
Signed-off-by: Ravi Bangoria 
Cc: sta...@vger.kernel.org # 2.6.36+
---
v1: https://lkml.org/lkml/2019/6/10/1058
v1->v2:
Successful do_page_fault returns using ret_from_except_lite at
the same place where handle_dabr_fault also returns. v1 messed
up with do_page_fault return path. Fix that in v2.

 arch/powerpc/kernel/exceptions-64s.S | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 6b86055..2546427 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1746,7 +1746,7 @@ handle_page_fault:
addir3,r1,STACK_FRAME_OVERHEAD
bl  do_page_fault
cmpdi   r3,0
-   beq+12f
+   beq+ret_from_except_lite
bl  save_nvgprs
mr  r5,r3
addir3,r1,STACK_FRAME_OVERHEAD
@@ -1761,7 +1761,12 @@ handle_dabr_fault:
ld  r5,_DSISR(r1)
addir3,r1,STACK_FRAME_OVERHEAD
bl  do_break
-12:b   ret_from_except_lite
+   /*
+* do_break may have changed the nv-gprs while handling
+* breakpoint. If so, we need to restore them with their
+* updated values. Don't use ret_from_except_lite here.
+*/
+   b   ret_from_except
 
 
 #ifdef CONFIG_PPC_BOOK3S_64
-- 
1.8.3.1
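
The address arithmetic in the commit message can be re-checked with a few lines of C. This is only a worked example under assumptions: the helper names are invented, and the operands use just the low-order digits of the addresses as they appear in the message (r2 ending in 1344e00, watchpoint ending in 121c8c0), which is enough because the additions do not carry into the upper digits.

```c
#include <stdint.h>

/* addis rt,ra,si: rt = ra + (si << 16). The multiply avoids shifting a
 * negative value, which is undefined in C. */
static uint64_t addis_model(uint64_t ra, int16_t si)
{
	return ra + (uint64_t)((int64_t)si * 65536);
}

/* Effective address of "ld rt,d(ra)": ra + d. */
static uint64_t ld_ea_model(uint64_t ra, int16_t d)
{
	return ra + (uint64_t)(int64_t)d;
}
```

addis_model(0x1344e00, -19) gives 0x1214e00, the stale value seen in r29, and ld_ea_model(0x1214e00, 31424) gives 0x121c8c0, the watched address.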

Re: [PATCH 0/2] powerpc/bpf: DIV64 instruction fix

2019-06-12 Thread Sandipan Das



On 13/06/19 12:21 AM, Naveen N. Rao wrote:
> The first patch updates DIV64 overflow tests to properly detect error 
> conditions. The second patch fixes powerpc64 JIT to generate the proper 
> unsigned division instruction for BPF_ALU64.
> 
> - Naveen
> 
> Naveen N. Rao (2):
>   bpf: fix div64 overflow tests to properly detect errors
>   powerpc/bpf: use unsigned division instruction for 64-bit operations
> 
>  arch/powerpc/include/asm/ppc-opcode.h  |  1 +
>  arch/powerpc/net/bpf_jit.h |  2 +-
>  arch/powerpc/net/bpf_jit_comp64.c  |  8 
>  .../testing/selftests/bpf/verifier/div_overflow.c  | 14 ++
>  4 files changed, 16 insertions(+), 9 deletions(-)
> 

For the series

Acked-by: Sandipan Das
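
To illustrate why the choice of division instruction matters for BPF_ALU64 (BPF_DIV is an unsigned operation), here is a tiny C comparison. It is a sketch with made-up helper names, not JIT code: it only shows that the same 64-bit operands give different quotients under signed and unsigned division. Callers are assumed to pass a non-zero divisor and to avoid the INT64_MIN / -1 case for the signed variant.

```c
#include <stdint.h>

/* What BPF_ALU64 | BPF_DIV is expected to compute: unsigned division. */
static uint64_t div64_unsigned(uint64_t dst, uint64_t src)
{
	return dst / src;
}

/* What a signed divide computes on the same bit patterns. */
static uint64_t div64_signed(uint64_t dst, uint64_t src)
{
	return (uint64_t)((int64_t)dst / (int64_t)src);
}
```

For dst = 0xffffffffffffffff and src = 2, the unsigned result is 0x7fffffffffffffff, while the signed one treats dst as -1 and yields 0.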

Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-12 Thread Christophe Leroy





On 12/06/2019 at 15:59, Horia Geanta wrote:

On 6/12/2019 8:52 AM, Christophe Leroy wrote:

On 11/06/2019 at 18:30, Horia Geanta wrote:

On 6/11/2019 6:40 PM, Christophe Leroy wrote:

On 11/06/2019 at 17:37, Horia Geanta wrote:

On 6/11/2019 5:39 PM, Christophe Leroy wrote:

This series is the last set of fixes for the Talitos driver.

We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:


I am getting below failures on a sec 3.3.2 (p1020rdb) for hmac(sha384) and
hmac(sha512):


Is that new with this series or did you already have it before ?


Looks like this happens with or without this series.


Found the issue, that's in
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=b8fbdc2bc4e71b62646031d5df5f08aafe15d5ad

CONFIG_CRYPTO_DEV_TALITOS_SEC2 should be CONFIG_CRYPTO_DEV_TALITOS2 instead.

Just sent a patch to fix it.
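
As a side note on why this kind of typo goes unnoticed at build time: the preprocessor treats an undefined symbol in #ifdef as simply absent, so the wrong branch is compiled without any warning. The snippet below is a generic illustration only. How the symbol is actually used in the referenced commit is not shown here, and the MAX_KEY_* names and the values 128 and 64 are hypothetical.

```c
/* Simulate a config where only the correctly spelled symbol is set. */
#define CONFIG_CRYPTO_DEV_TALITOS2 1

/* Misspelled symbol: never defined, so the #else branch is used silently. */
#ifdef CONFIG_CRYPTO_DEV_TALITOS_SEC2
#define MAX_KEY_TYPO 128   /* hypothetical value */
#else
#define MAX_KEY_TYPO 64    /* hypothetical value */
#endif

/* Correct symbol: the intended branch is used. */
#ifdef CONFIG_CRYPTO_DEV_TALITOS2
#define MAX_KEY_OK 128     /* hypothetical value */
#else
#define MAX_KEY_OK 64      /* hypothetical value */
#endif

static int max_key_with_typo(void)
{
	return MAX_KEY_TYPO;
}

static int max_key_with_fix(void)
{
	return MAX_KEY_OK;
}
```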


Thanks, I've tested it and the hmac failures go away.

However, testing gets stuck.
Seems there is another issue lurking in the driver.

Used cryptodev-2.6/master with the following on top:
crypto: testmgr - add some more preemption points
https://patchwork.kernel.org/patch/10972337/
crypto: talitos - fix max key size for sha384 and sha512
https://patchwork.kernel.org/patch/10988473/

[...]
alg: skcipher: skipping comparison tests for ecb-3des-talitos because 
ecb(des3_ede-generic) is unavailable
INFO: task cryptomgr_test:314 blocked for more than 120 seconds.
   Not tainted 5.2.0-rc1-g905bfd415e8a #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cryptomgr_test  D0   314  2 0x0800
Call Trace:
[e78337e0] [0004] 0x4 (unreliable)
[e78338a8] [c08a6e5c] __schedule+0x20c/0x4d4
[e78338f8] [c08a7158] schedule+0x34/0xc8
[e7833908] [c08aa5ec] schedule_timeout+0x1d4/0x350
[e7833958] [c08a7be4] wait_for_common+0xa0/0x164
[e7833998] [c03a7b14] do_ahash_op+0xa4/0xc4
[e78339b8] [c03aba00] test_ahash_vec_cfg+0x188/0x5e4
[e7833aa8] [c03ac1c8] test_hash_vs_generic_impl+0x1b0/0x2b4
[e7833de8] [c03ac498] __alg_test_hash+0x1cc/0x2d0
[e7833e28] [c03a9fb4] alg_test.part.37+0x8c/0x3ac
[e7833ef8] [c03a54d0] cryptomgr_test+0x4c/0x54
[e7833f08] [c006c410] kthread+0xf8/0x124
[e7833f38] [c001227c] ret_from_kernel_thread+0x14/0x1c

addr2line on c03aba00 points to crypto/testmgr.c:1335

   1327)  if (cfg->finalization_type == FINALIZATION_TYPE_DIGEST ||
   1328)      vec->digest_error) {
   1329)          /* Just using digest() */
   1330)          ahash_request_set_callback(req, req_flags, crypto_req_done,
   1331)                                     &wait);
   1332)          ahash_request_set_crypt(req, tsgl->sgl, result, vec->psize);
   1333)          err = do_ahash_op(crypto_ahash_digest, req, &wait, cfg->nosimd);
   1334)          if (err) {
-> 1335)                  if (err == vec->digest_error)
   1336)                          return 0;
   1337)                  pr_err("alg: ahash: %s digest() failed on test vector %s; expected_error=%d, actual_error=%d, cfg=\"%s\"\n",
   1338)                         driver, vec_name, vec->digest_error, err,
   1339)                         cfg->name);
   1340)                  return err;
   1341)          }
   1342)          if (vec->digest_error) {
   1343)                  pr_err("alg: ahash: %s digest() unexpectedly succeeded on test vector %s; expected_error=%d, cfg=\"%s\"\n",
   1344)                         driver, vec_name, vec->digest_error, cfg->name);
   1345)                  return -EINVAL;
   1346)          }
   1347)          goto result_ready;
   1348)  }

Seems that for some reason driver does not receive the interrupt from HW,
thus completion callback does not run.

Tried with or without current patch series, no change in behaviour.

If you cannot reproduce and don't have any idea, I'll try the hard way
(git bisect).


I cannot reproduce it; both mpc885 and mpc8321e boot fine, and I don't have 
any idea at first glance.


I know the SEC1 behaves that way when you submit zero-length data.

Christophe



Thanks,
Horia
