Re: [RFC PATCH for 4.18 09/16] powerpc: Add syscall detection for restartable sequences

2018-06-04 Thread Michael Ellerman
Mathieu Desnoyers  writes:
> From: Boqun Feng 
>
> Syscalls are not allowed inside restartable sequences, so add a call to
> rseq_syscall() at the very beginning of the system call exit path for
> CONFIG_DEBUG_RSEQ=y kernels. This helps detect whether a syscall is
> issued inside a restartable sequence.
>
> [ Tested on 64-bit powerpc kernel by Mathieu Desnoyers. Still needs to
>   be tested on 32-bit powerpc kernel. ]
>
> Signed-off-by: Boqun Feng 
> Signed-off-by: Mathieu Desnoyers 
> CC: Benjamin Herrenschmidt 
> CC: Paul Mackerras 
> CC: Michael Ellerman 
> CC: Peter Zijlstra 
> CC: "Paul E. McKenney" 
> CC: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/kernel/entry_32.S | 7 +++
>  arch/powerpc/kernel/entry_64.S | 8 
>  2 files changed, 15 insertions(+)

I don't _love_ the #ifdefs in here, but they look correct and there's
not really a better option until we rewrite the syscall handler in C.

The rseq selftests passed for me with this applied and enabled. So if
you like, here are some tags:

Tested-by: Michael Ellerman 
Acked-by: Michael Ellerman 

cheers


Re: [RFC PATCH for 4.18 10/16] powerpc: Wire up restartable sequences system call

2018-06-04 Thread Michael Ellerman
Mathieu Desnoyers  writes:

> From: Boqun Feng 
>
> Wire up the rseq system call on powerpc.
>
> This provides an ABI improving the speed of a user-space getcpu
> operation on powerpc by skipping the getcpu system call on the fast
> path, as well as improving the speed of user-space operations on per-cpu
> data compared to using load-reservation/store-conditional atomics.
>
> Signed-off-by: Boqun Feng 
> Signed-off-by: Mathieu Desnoyers 
> CC: Benjamin Herrenschmidt 
> CC: Paul Mackerras 
> CC: Michael Ellerman 
> CC: Peter Zijlstra 
> CC: "Paul E. McKenney" 
> CC: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/include/asm/systbl.h  | 1 +
>  arch/powerpc/include/asm/unistd.h  | 2 +-
>  arch/powerpc/include/uapi/asm/unistd.h | 1 +
>  3 files changed, 3 insertions(+), 1 deletion(-)

Looks fine to me.

I don't have any other new syscalls in my next, so this should not
conflict with anything for 4.18.

Acked-by: Michael Ellerman  (powerpc)


cheers


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Christoph Hellwig
On Tue, Jun 05, 2018 at 09:26:56AM +1000, Benjamin Herrenschmidt wrote:
> Sorry Michael, that doesn't click. Yes of course virtio is implemented
> in qemu, but the problem we are trying to solve is *not* a qemu problem
> (the fact that the Linux drivers bypass the DMA API is wrong, needs
> fixing, and isn't a qemu problem). The fact that the secure guests need
> bounce buffering is not a qemu problem either.
> 
> Whether qemu chose to use an iommu or not is, and should remain an
> orthogonal problem.

Agreed.  We have a problem with qemu (old qemu only?) punching a hole
into the VM abstraction by deciding that even if firmware tables
claim use of an IOMMU for a PCI bus it expects virtio to use physical
addresses.  So far so bad.  The answer to that should have been to
quirk the affected qemu versions and move on.  Instead we now have
virtio not using the DMA API by default, which creates a worse problem.

Let's fix this issue ASAP and quirk the buggy implementations instead
of letting everyone else suffer.  

> The DMA API itself isn't the one that needs to learn "per-device
> quirks", it's just plumbing into arch backends. The "quirk" is at the
> point of establishing the backend for a given device.
> 
> We can go a good way down that path simply by having virtio in Linux
> start with putting *itself* its own direct ops in there when
> VIRTIO_F_IOMMU_PLATFORM is not set, and removing all the special casing
> in the rest of the driver.

Yes.  And we have all the infrastructure for that now.  A few RDMA
drivers quirk to virt_dma_ops, and virtio could quirk to dma_direct_ops
anytime now.  In fact given how much time we are spending arguing here
I'm going to give it a spin today.

> Once that's done, we have a single point of establishing the dma ops,
> we can quirk in there if needed, that's rather nicely contained, or put
> an arch hook, or whatever is necessary.

Yes.


Re: [PATCH v7 0/5] powerpc/64: memcmp() optimization

2018-06-04 Thread Simon Guo
Hi Michael,
On Tue, Jun 05, 2018 at 12:16:22PM +1000, Michael Ellerman wrote:
> Hi Simon,
> 
> wei.guo.si...@gmail.com writes:
> > From: Simon Guo 
> >
> > There is some room to optimize memcmp() in the powerpc 64-bit version for
> > the following 2 cases:
> > (1) Even when src/dst addresses are not 8-byte aligned at the beginning,
> > memcmp() can align them and use the .Llong comparison mode without
> > falling back to the .Lshort comparison mode, which compares the buffer
> > byte by byte.
> > (2) VMX instructions can be used to speed up large size comparisons;
> > currently the threshold is set at 4K bytes. Note that the VMX
> > instructions incur a VMX register save/load penalty. This patch set
> > includes a patch to add a 32-byte pre-check to minimize the penalty.
> >
> > It does something similar to glibc commit dec4a7105e (powerpc: Improve
> > memcmp performance for POWER8). Thanks to Cyril Bur for the information.
> > This patch set also updates the memcmp selftest case to make it compile
> > and incorporate a large size comparison case.
> 
> I'm seeing a few crashes with this applied, I haven't had time to look
> into what is happening yet, sorry.
Sorry I didn't catch this in my testing. I will check the root cause
and update later.

Thanks,
- Simon

> 
> [ 2471.300595] kselftest: Running tests in user
> [ 2471.302785] calling  test_user_copy_init+0x0/0xd14 [test_user_copy] @ 44883
> [ 2471.302892] Unable to handle kernel paging request for data at address 
> 0xc00818553005
> [ 2471.303014] Faulting instruction address: 0xc001f29c
> [ 2471.303119] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 2471.303193] LE SMP NR_CPUS=2048 NUMA PowerNV


> [ 2471.303256] Modules linked in: test_user_copy(+) vxlan ip6_udp_tunnel 
> udp_tunnel 8021q bridge stp llc dummy test_printf test_firmware vmx_crypto 
> crct10dif_vpmsum crct10dif_common crc32c_vpmsum veth [last unloaded: 
> test_static_key_base]
> [ 2471.303532] CPU: 4 PID: 44883 Comm: modprobe Tainted: GW 
> 4.17.0-rc3-gcc7x-g7204012 #1
> [ 2471.303644] NIP:  c001f29c LR: c001f6e4 CTR: 
> 
> [ 2471.303754] REGS: c01fddc2b560 TRAP: 0300   Tainted: GW
>   (4.17.0-rc3-gcc7x-g7204012)
> [ 2471.303873] MSR:  92009033   CR: 
> 24222844  XER: 
> [ 2471.303996] CFAR: c001f6e0 DAR: c00818553005 DSISR: 4000 
> IRQMASK: 0 
> [ 2471.303996] GPR00: c001f6e4 c01fddc2b7e0 c00818529900 
> 0200 
> [ 2471.303996] GPR04: c01fe4b90020 ffe0  
> 03fe01b48000 
> [ 2471.303996] GPR08: 8000 c00818553005 c01fddc28000 
> c00818520df0 
> [ 2471.303996] GPR12: c009c430 c01fbc00 2000 
>  
> [ 2471.303996] GPR16: c01fddc2bc20 0030 c01f7ba0 
> 0001 
> [ 2471.303996] GPR20:  c0c772b0 c10b4018 
>  
> [ 2471.303996] GPR24:  c00818521c98  
> c01fe4b9 
> [ 2471.303996] GPR28: fff4 0200 92009033 
> 92009033 
> [ 2471.304930] NIP [c001f29c] msr_check_and_set+0x3c/0xc0
> [ 2471.305008] LR [c001f6e4] enable_kernel_altivec+0x44/0x100
> [ 2471.305084] Call Trace:
> [ 2471.305122] [c01fddc2b7e0] [c009baa8] 
> __copy_tofrom_user_base+0x9c/0x574 (unreliable)
> [ 2471.305240] [c01fddc2b860] [c001f6e4] 
> enable_kernel_altivec+0x44/0x100
> [ 2471.305336] [c01fddc2b890] [c009ce40] enter_vmx_ops+0x50/0x70
> [ 2471.305418] [c01fddc2b8b0] [c009c768] memcmp+0x338/0x680
> [ 2471.305501] [c01fddc2b9b0] [c00818520190] 
> test_user_copy_init+0x188/0xd14 [test_user_copy]
> [ 2471.305617] [c01fddc2ba60] [c000de20] 
> do_one_initcall+0x90/0x560
> [ 2471.305710] [c01fddc2bb30] [c0200630] do_init_module+0x90/0x260
> [ 2471.305795] [c01fddc2bbc0] [c01fec88] load_module+0x1a28/0x1ce0
> [ 2471.305875] [c01fddc2bd70] [c01ff1e8] 
> sys_finit_module+0xc8/0x110
> [ 2471.305983] [c01fddc2be30] [c000b528] system_call+0x58/0x6c
> [ 2471.306066] Instruction dump:
> [ 2471.306112] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7d1b78 6000 
> 6000 
> [ 2471.306216] 7fe000a6 3d220003 39299705 7ffeeb78 <8929> 2f89 
> 419e0044 6000 
> [ 2471.306326] ---[ end trace daf8d409e65b9841 ]---
> 
> And:
> 
> [   19.096709] test_bpf: test_skb_segment: success in skb_segment!
> [   19.096799] initcall test_bpf_init+0x0/0xae0 [test_bpf] returned 0 after 
> 591217 usecs
> [   19.115869] calling  test_user_copy_init+0x0/0xd14 [test_user_copy] @ 3159
> [   19.116165] Unable to handle kernel paging request for data at address 
> 0xd3852805
> [   19.116352] Faulting instruction address: 0xc001f44c
> [   19.116483] Oops: Kernel access of bad area, sig: 11 [#1]
> [   19.116583] LE SMP NR_CPUS=2048 NUMA pSeries
> 

Re: [PATCH] cpuidle:powernv: Make the snooze timeout dynamic.

2018-06-04 Thread Stewart Smith
Michael Ellerman  writes:
> "Gautham R. Shenoy"  writes:
>
>> From: "Gautham R. Shenoy" 
>>
>> The commit 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of
>> snooze to deeper idle state") introduced a timeout for the snooze idle
>> state so that it could eventually be promoted to a deeper idle
>> state. The snooze timeout value is static and set to the target
>> residency of the next idle state, which would train the cpuidle
>> governor to pick the next idle state eventually.
>>
>> The unfortunate side-effect of this is that if the next idle state(s)
>> is disabled, the CPU will forever remain in snooze, despite the fact
>> that the system is completely idle, and other deeper idle states are
>> available.
>
> That sounds like a bug, I'll add?
>
> Fixes: 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of snooze to 
> deeper idle state")
> Cc: sta...@vger.kernel.org # v4.2+

Yes, it's a bug - we had a customer-reported bug because we lacked this,
which meant we had to make firmware changes rather than just tweaking
which stop states were used.

-- 
Stewart Smith
OPAL Architect, IBM.



RE: [PATCH 09/10] dpaa_eth: add support for hardware timestamping

2018-06-04 Thread Y.b. Lu
Hi Richard,

> -Original Message-
> From: Richard Cochran [mailto:richardcoch...@gmail.com]
> Sent: Monday, June 4, 2018 9:49 PM
> To: Y.b. Lu 
> Cc: net...@vger.kernel.org; Madalin-cristian Bucur
> ; Rob Herring ; Shawn Guo
> ; David S . Miller ;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH 09/10] dpaa_eth: add support for hardware timestamping
> 
> On Mon, Jun 04, 2018 at 03:08:36PM +0800, Yangbo Lu wrote:
> 
> > +if FSL_DPAA_ETH
> > +config FSL_DPAA_ETH_TS
> > +   bool "DPAA hardware timestamping support"
> > +   select PTP_1588_CLOCK_QORIQ
> > +   default n
> > +   help
> > + Enable DPAA hardware timestamping support.
> > + This option is useful for applications to get
> > + hardware time stamps on the Ethernet packets
> > + using the SO_TIMESTAMPING API.
> > +endif
> 
> You should drop this #ifdef.  In general, if a MAC supports time stamping and
> PHC, then the driver support should simply be compiled in.
> 
> [ When time stamping incurs a large run time performance penalty to
>   non-PTP users, then it might make sense to have a Kconfig option to
>   disable it, but that doesn't appear to be the case here. ]

[Y.b. Lu] Actually this timestamping code affected DPAA networking 
performance in our previous performance tests.
That's why we guarded it with an ifdef.

> 
> > @@ -1615,6 +1635,24 @@ static int dpaa_eth_refill_bpools(struct
> dpaa_priv *priv)
> > skbh = (struct sk_buff **)phys_to_virt(addr);
> > skb = *skbh;
> >
> > +#ifdef CONFIG_FSL_DPAA_ETH_TS
> > +   if (priv->tx_tstamp &&
> > +   skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
> 
> This condition fits on one line easily.

[Y.b. Lu] Right. I will use one line in next version.

> 
> > +   struct skb_shared_hwtstamps shhwtstamps;
> > +   u64 ns;
> 
> Local variables belong at the top of the function.

[Y.b. Lu] Ok, will move them to the top in the next version.

> 
> > +   memset(, 0, sizeof(shhwtstamps));
> > +
> > +   if (!dpaa_get_tstamp_ns(priv->net_dev, ,
> > +   priv->mac_dev->port[TX],
> > +   (void *)skbh)) {
> > +   shhwtstamps.hwtstamp = ns_to_ktime(ns);
> > +   skb_tstamp_tx(skb, );
> > +   } else {
> > +   dev_warn(dev, "dpaa_get_tstamp_ns failed!\n");
> > +   }
> > +   }
> > +#endif
> > if (unlikely(qm_fd_get_format(fd) == qm_fd_sg)) {
> > nr_frags = skb_shinfo(skb)->nr_frags;
> > dma_unmap_single(dev, addr, qm_fd_get_offset(fd) + @@ -2086,6
> > +2124,14 @@ static int dpaa_start_xmit(struct sk_buff *skb, struct
> net_device *net_dev)
> > if (unlikely(err < 0))
> > goto skb_to_fd_failed;
> >
> > +#ifdef CONFIG_FSL_DPAA_ETH_TS
> > +   if (priv->tx_tstamp &&
> > +   skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
> 
> One line please.

[Y.b. Lu] No problem.

> 
> > +   fd.cmd |= FM_FD_CMD_UPD;
> > +   skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> > +   }
> > +#endif
> > +
> > if (likely(dpaa_xmit(priv, percpu_stats, queue_mapping, ) == 0))
> > return NETDEV_TX_OK;
> >
> 
> Thanks,
> Richard


Re: [PATCH v7 0/5] powerpc/64: memcmp() optimization

2018-06-04 Thread Michael Ellerman
Hi Simon,

wei.guo.si...@gmail.com writes:
> From: Simon Guo 
>
> There is some room to optimize memcmp() in the powerpc 64-bit version for
> the following 2 cases:
> (1) Even when src/dst addresses are not 8-byte aligned at the beginning,
> memcmp() can align them and use the .Llong comparison mode without
> falling back to the .Lshort comparison mode, which compares the buffer
> byte by byte.
> (2) VMX instructions can be used to speed up large size comparisons;
> currently the threshold is set at 4K bytes. Note that the VMX
> instructions incur a VMX register save/load penalty. This patch set
> includes a patch to add a 32-byte pre-check to minimize the penalty.
>
> It does something similar to glibc commit dec4a7105e (powerpc: Improve
> memcmp performance for POWER8). Thanks to Cyril Bur for the information.
> This patch set also updates the memcmp selftest case to make it compile
> and incorporate a large size comparison case.

I'm seeing a few crashes with this applied, I haven't had time to look
into what is happening yet, sorry.

[ 2471.300595] kselftest: Running tests in user
[ 2471.302785] calling  test_user_copy_init+0x0/0xd14 [test_user_copy] @ 44883
[ 2471.302892] Unable to handle kernel paging request for data at address 
0xc00818553005
[ 2471.303014] Faulting instruction address: 0xc001f29c
[ 2471.303119] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2471.303193] LE SMP NR_CPUS=2048 NUMA PowerNV
[ 2471.303256] Modules linked in: test_user_copy(+) vxlan ip6_udp_tunnel 
udp_tunnel 8021q bridge stp llc dummy test_printf test_firmware vmx_crypto 
crct10dif_vpmsum crct10dif_common crc32c_vpmsum veth [last unloaded: 
test_static_key_base]
[ 2471.303532] CPU: 4 PID: 44883 Comm: modprobe Tainted: GW 
4.17.0-rc3-gcc7x-g7204012 #1
[ 2471.303644] NIP:  c001f29c LR: c001f6e4 CTR: 
[ 2471.303754] REGS: c01fddc2b560 TRAP: 0300   Tainted: GW  
(4.17.0-rc3-gcc7x-g7204012)
[ 2471.303873] MSR:  92009033   CR: 
24222844  XER: 
[ 2471.303996] CFAR: c001f6e0 DAR: c00818553005 DSISR: 4000 
IRQMASK: 0 
[ 2471.303996] GPR00: c001f6e4 c01fddc2b7e0 c00818529900 
0200 
[ 2471.303996] GPR04: c01fe4b90020 ffe0  
03fe01b48000 
[ 2471.303996] GPR08: 8000 c00818553005 c01fddc28000 
c00818520df0 
[ 2471.303996] GPR12: c009c430 c01fbc00 2000 
 
[ 2471.303996] GPR16: c01fddc2bc20 0030 c01f7ba0 
0001 
[ 2471.303996] GPR20:  c0c772b0 c10b4018 
 
[ 2471.303996] GPR24:  c00818521c98  
c01fe4b9 
[ 2471.303996] GPR28: fff4 0200 92009033 
92009033 
[ 2471.304930] NIP [c001f29c] msr_check_and_set+0x3c/0xc0
[ 2471.305008] LR [c001f6e4] enable_kernel_altivec+0x44/0x100
[ 2471.305084] Call Trace:
[ 2471.305122] [c01fddc2b7e0] [c009baa8] 
__copy_tofrom_user_base+0x9c/0x574 (unreliable)
[ 2471.305240] [c01fddc2b860] [c001f6e4] 
enable_kernel_altivec+0x44/0x100
[ 2471.305336] [c01fddc2b890] [c009ce40] enter_vmx_ops+0x50/0x70
[ 2471.305418] [c01fddc2b8b0] [c009c768] memcmp+0x338/0x680
[ 2471.305501] [c01fddc2b9b0] [c00818520190] 
test_user_copy_init+0x188/0xd14 [test_user_copy]
[ 2471.305617] [c01fddc2ba60] [c000de20] do_one_initcall+0x90/0x560
[ 2471.305710] [c01fddc2bb30] [c0200630] do_init_module+0x90/0x260
[ 2471.305795] [c01fddc2bbc0] [c01fec88] load_module+0x1a28/0x1ce0
[ 2471.305875] [c01fddc2bd70] [c01ff1e8] sys_finit_module+0xc8/0x110
[ 2471.305983] [c01fddc2be30] [c000b528] system_call+0x58/0x6c
[ 2471.306066] Instruction dump:
[ 2471.306112] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7d1b78 6000 
6000 
[ 2471.306216] 7fe000a6 3d220003 39299705 7ffeeb78 <8929> 2f89 419e0044 
6000 
[ 2471.306326] ---[ end trace daf8d409e65b9841 ]---

And:

[   19.096709] test_bpf: test_skb_segment: success in skb_segment!
[   19.096799] initcall test_bpf_init+0x0/0xae0 [test_bpf] returned 0 after 
591217 usecs
[   19.115869] calling  test_user_copy_init+0x0/0xd14 [test_user_copy] @ 3159
[   19.116165] Unable to handle kernel paging request for data at address 
0xd3852805
[   19.116352] Faulting instruction address: 0xc001f44c
[   19.116483] Oops: Kernel access of bad area, sig: 11 [#1]
[   19.116583] LE SMP NR_CPUS=2048 NUMA pSeries
[   19.116684] Modules linked in: test_user_copy(+) lzo_compress crc_itu_t 
zstd_compress zstd_decompress test_bpf test_static_keys test_static_key_base 
xxhash test_firmware af_key cls_bpf act_bpf bridge nf_nat_irc xt_NFLOG 
nfnetlink_log xt_policy nf_conntrack_netlink nfnetlink xt_nat nf_conntrack_irc 
xt_mark xt_tcpudp nf_nat_sip xt_TCPMSS xt_LOG nf_nat_ftp nf_conntrack_ftp 

[PATCH 5/5] powerpc/pkeys: make protection key 0 less special

2018-06-04 Thread Ram Pai
Applications need the ability to associate an address range with some
key and later revert it to its initial default key. Pkey-0 comes close to
providing this function but falls short, because the current
implementation disallows applications from explicitly associating pkey-0
with an address range.

Let's make pkey-0 less special and treat it almost like any other key.
Thus it can be explicitly associated with any address range, and it can
be freed. This gives the application more flexibility and power.  The
ability to free pkey-0 must be used responsibly, since pkey-0 is
associated with almost all address ranges by default.

Even with this change pkey-0 continues to be slightly more special
from the following point of view.
(a) it is implicitly allocated.
(b) it is the default key assigned to any address-range.
(c) its permissions cannot be modified by userspace.

NOTE: (c) is specific to powerpc only. pkey-0 is associated by default
with all pages including kernel pages, and pkeys are also active in
kernel mode. If any permission is denied on pkey-0, the kernel running
in the context of the application will be unable to operate.

Tested on powerpc.

cc: Thomas Gleixner 
cc: Dave Hansen 
cc: Michael Ellerman 
cc: Ingo Molnar 
cc: Andrew Morton 
cc: Thiago Jung Bauermann 
cc: Michal Suchánek 
---
History:
v4: . introduced PKEY_0 macro.  No bug fixes. Code
re-arrangement to save a few cycles.

v3: . Corrected a comment in arch_set_user_pkey_access().  .
Clarified the header, to capture the notion that pkey-0
permissions cannot be modified by userspace on powerpc.
-- comment from Thiago

v2: . mm_pkey_is_allocated() continued to treat pkey-0 special.
fixed it.
---
 arch/powerpc/include/asm/pkeys.h |   29 +++--
 arch/powerpc/mm/pkeys.c  |   13 ++---
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0409c80..d349e22 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -13,7 +13,10 @@
 
 DECLARE_STATIC_KEY_TRUE(pkey_disabled);
 extern int pkeys_total; /* total pkeys as per device tree */
-extern u32 initial_allocation_mask; /* bits set for reserved keys */
+extern u32 initial_allocation_mask; /*  bits set for the initially allocated 
keys */
+extern u32 reserved_allocation_mask; /* bits set for reserved keys */
+
+#define PKEY_0 0
 
 /*
  * Define these here temporarily so we're not dependent on patching linux/mm.h.
@@ -96,15 +99,19 @@ static inline u16 pte_to_pkey_bits(u64 pteflags)
 #define __mm_pkey_is_allocated(mm, pkey)   \
(mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
 
-#define __mm_pkey_is_reserved(pkey) (initial_allocation_mask & \
+#define __mm_pkey_is_reserved(pkey) (reserved_allocation_mask & \
   pkey_alloc_mask(pkey))
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
-   /* A reserved key is never considered as 'explicitly allocated' */
-   return ((pkey < arch_max_pkey()) &&
-   !__mm_pkey_is_reserved(pkey) &&
-   __mm_pkey_is_allocated(mm, pkey));
+   if (pkey < 0 || pkey >= arch_max_pkey())
+   return false;
+
+   /* Reserved keys are never allocated. */
+   if (__mm_pkey_is_reserved(pkey))
+   return false;
+
+   return __mm_pkey_is_allocated(mm, pkey);
 }
 
 extern void __arch_activate_pkey(int pkey);
@@ -200,6 +207,16 @@ static inline int arch_set_user_pkey_access(struct 
task_struct *tsk, int pkey,
 {
if (static_branch_likely(_disabled))
return -EINVAL;
+
+   /*
+* userspace should not change pkey-0 permissions.
+* pkey-0 is associated with every page in the kernel.
+* If userspace denies any permission on pkey-0, the
+* kernel cannot operate.
+*/
+   if (pkey == PKEY_0)
+   return init_val ? -EINVAL : 0;
+
return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 0b98db6..1ebb21b 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -14,7 +14,8 @@
 bool pkey_execute_disable_supported;
 int  pkeys_total;  /* Total pkeys as per device tree */
 bool pkeys_devtree_defined;/* pkey property exported by device tree */
-u32  initial_allocation_mask;  /* Bits set for reserved keys */
+u32  initial_allocation_mask;   /* Bits set for the initially allocated keys */
+u32  reserved_allocation_mask;  /* Bits set for reserved keys */
 u64  pkey_amr_mask;/* Bits in AMR not to be touched */
 u64  pkey_iamr_mask;   /* Bits in AMR not to be touched */
 u64  pkey_uamor_mask;  /* Bits in UMOR not to be touched */
@@ -121,8 +122,9 

[PATCH 4/5] powerpc/pkeys: Preallocate execute-only key

2018-06-04 Thread Ram Pai
The execute-only key is allocated dynamically. This is a problem: when a
thread implicitly creates an execute-only key and resets UAMOR for that
key, the UAMOR value does not percolate to all the other threads. Any
other thread may unknowingly change the permissions on the key, which can
cause the key to no longer be execute-only for that thread.

Preallocate the execute-only key and ensure that no thread can change
the permission of the key, by resetting the corresponding bit in UAMOR.

CC: Andy Lutomirski 
CC: Florian Weimer 
CC: Thiago Jung Bauermann 
CC: Michael Ellerman 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/pkeys.c |   53 +++---
 1 files changed, 8 insertions(+), 45 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 90ab793..0b98db6 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -25,6 +25,7 @@
 #define IAMR_EX_BIT 0x1UL
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
+#define EXECUTE_ONLY_KEY 2
 
 static void scan_pkey_feature(void)
 {
@@ -120,7 +121,8 @@ int pkey_initialize(void)
 #else
os_reserved = 0;
 #endif
-   initial_allocation_mask  = (0x1 << 0) | (0x1 << 1);
+   initial_allocation_mask  = (0x1 << 0) | (0x1 << 1) |
+   (0x1 << EXECUTE_ONLY_KEY);
 
/* register mask is in BE format */
pkey_amr_mask = ~0x0ul;
@@ -130,6 +132,7 @@ int pkey_initialize(void)
pkey_amr_mask &= ~(0x3ul << pkeyshift(i));
pkey_iamr_mask &= ~(0x1ul << pkeyshift(i));
}
+   pkey_amr_mask |= (AMR_RD_BIT|AMR_WR_BIT) << pkeyshift(EXECUTE_ONLY_KEY);
 
pkey_uamor_mask = ~0x0ul;
pkey_uamor_mask &= ~(0x3ul << pkeyshift(0));
@@ -140,6 +143,8 @@ int pkey_initialize(void)
 * pseries kernel running on powerVM.
 */
pkey_uamor_mask &= ~(0x3ul << pkeyshift(1));
+   pkey_uamor_mask &= ~(0x3ul << pkeyshift(EXECUTE_ONLY_KEY));
+
for (i = (pkeys_total - os_reserved); i < pkeys_total; i++)
pkey_uamor_mask &= ~(0x3ul << pkeyshift(i));
 
@@ -153,8 +158,7 @@ void pkey_mm_init(struct mm_struct *mm)
if (static_branch_likely(_disabled))
return;
mm_pkey_allocation_map(mm) = initial_allocation_mask;
-   /* -1 means unallocated or invalid */
-   mm->context.execute_only_pkey = -1;
+   mm->context.execute_only_pkey = EXECUTE_ONLY_KEY;
 }
 
 static inline u64 read_amr(void)
@@ -333,48 +337,7 @@ static inline bool pkey_allows_readwrite(int pkey)
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
-   bool need_to_set_mm_pkey = false;
-   int execute_only_pkey = mm->context.execute_only_pkey;
-   int ret;
-
-   /* Do we need to assign a pkey for mm's execute-only maps? */
-   if (execute_only_pkey == -1) {
-   /* Go allocate one to use, which might fail */
-   execute_only_pkey = mm_pkey_alloc(mm);
-   if (execute_only_pkey < 0)
-   return -1;
-   need_to_set_mm_pkey = true;
-   }
-
-   /*
-* We do not want to go through the relatively costly dance to set AMR
-* if we do not need to. Check it first and assume that if the
-* execute-only pkey is readwrite-disabled than we do not have to set it
-* ourselves.
-*/
-   if (!need_to_set_mm_pkey && !pkey_allows_readwrite(execute_only_pkey))
-   return execute_only_pkey;
-
-   /*
-* Set up AMR so that it denies access for everything other than
-* execution.
-*/
-   ret = __arch_set_user_pkey_access(current, execute_only_pkey,
- PKEY_DISABLE_ACCESS |
- PKEY_DISABLE_WRITE);
-   /*
-* If the AMR-set operation failed somehow, just return 0 and
-* effectively disable execute-only support.
-*/
-   if (ret) {
-   mm_pkey_free(mm, execute_only_pkey);
-   return -1;
-   }
-
-   /* We got one, store it and use it from here on out */
-   if (need_to_set_mm_pkey)
-   mm->context.execute_only_pkey = execute_only_pkey;
-   return execute_only_pkey;
+   return mm->context.execute_only_pkey;
 }
 
 static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
-- 
1.7.1



[PATCH 3/5] powerpc/pkeys: fix calculation of total pkeys.

2018-06-04 Thread Ram Pai
The calculation of the total number of pkeys is off by one. Fix it.

CC: Florian Weimer 
CC: Michael Ellerman 
CC: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/pkeys.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 6fc56f4..90ab793 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -92,7 +92,7 @@ int pkey_initialize(void)
 * arch-neutral code.
 */
pkeys_total = min_t(int, pkeys_total,
-   (ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT));
+   ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)+1));
 
if (!pkey_mmu_enabled() || radix_enabled() || !pkeys_total)
static_branch_enable(_disabled);
-- 
1.7.1



[PATCH 1/5] powerpc/pkeys: Enable all user-allocatable pkeys at init.

2018-06-04 Thread Ram Pai
In a multithreaded application, a key allocated by one thread must be
active and usable on all threads.

Currently this is not the case, because the UAMOR bits for all keys are
disabled by default. When a new key is allocated in one thread, the
corresponding UAMOR bits for that thread get enabled, but the UAMOR bits
for all other existing threads remain disabled. Those threads have no
way to set permissions on the key, effectively making the key useless.

Enable the UAMOR bits for all keys, at process creation. Since the
contents of UAMOR are inherited at fork, all threads are capable of
modifying the permissions on any key.

Note: changing the permissions on unallocated keys has no effect as long
as those keys are not associated with any PTEs; the kernel disallows
associating unallocated keys with PTEs anyway.

CC: Andy Lutomirski 
CC: Florian Weimer 
CC: Thiago Jung Bauermann 
CC: Michael Ellerman 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/pkeys.c |   47 +--
 1 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 0eafdf0..6fc56f4 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -15,8 +15,9 @@
 int  pkeys_total;  /* Total pkeys as per device tree */
 bool pkeys_devtree_defined;/* pkey property exported by device tree */
 u32  initial_allocation_mask;  /* Bits set for reserved keys */
-u64  pkey_amr_uamor_mask;  /* Bits in AMR/UMOR not to be touched */
+u64  pkey_amr_mask;/* Bits in AMR not to be touched */
 u64  pkey_iamr_mask;   /* Bits in AMR not to be touched */
+u64  pkey_uamor_mask;  /* Bits in UMOR not to be touched */
 
 #define AMR_BITS_PER_PKEY 2
 #define AMR_RD_BIT 0x1UL
@@ -119,20 +120,29 @@ int pkey_initialize(void)
 #else
os_reserved = 0;
 #endif
-   initial_allocation_mask = ~0x0;
-   pkey_amr_uamor_mask = ~0x0ul;
+   initial_allocation_mask  = (0x1 << 0) | (0x1 << 1);
+
+   /* register mask is in BE format */
+   pkey_amr_mask = ~0x0ul;
pkey_iamr_mask = ~0x0ul;
-   /*
-* key 0, 1 are reserved.
-* key 0 is the default key, which allows read/write/execute.
-* key 1 is recommended not to be used. PowerISA(3.0) page 1015,
-* programming note.
-*/
-   for (i = 2; i < (pkeys_total - os_reserved); i++) {
-   initial_allocation_mask &= ~(0x1 << i);
-   pkey_amr_uamor_mask &= ~(0x3ul << pkeyshift(i));
+
+   for (i = 0; i < (pkeys_total - os_reserved); i++) {
+   pkey_amr_mask &= ~(0x3ul << pkeyshift(i));
pkey_iamr_mask &= ~(0x1ul << pkeyshift(i));
}
+
+   pkey_uamor_mask = ~0x0ul;
+   pkey_uamor_mask &= ~(0x3ul << pkeyshift(0));
+   /*
+* key 1 is recommended not to be used.
+* PowerISA(3.0) page 1015,
+* @TODO: Revisit this. This is only applicable on
+* pseries kernel running on powerVM.
+*/
+   pkey_uamor_mask &= ~(0x3ul << pkeyshift(1));
+   for (i = (pkeys_total - os_reserved); i < pkeys_total; i++)
+   pkey_uamor_mask &= ~(0x3ul << pkeyshift(i));
+
return 0;
 }
 
@@ -289,9 +299,6 @@ void thread_pkey_regs_restore(struct thread_struct 
*new_thread,
if (static_branch_likely(_disabled))
return;
 
-   /*
-* TODO: Just set UAMOR to zero if @new_thread hasn't used any keys yet.
-*/
if (old_thread->amr != new_thread->amr)
write_amr(new_thread->amr);
if (old_thread->iamr != new_thread->iamr)
@@ -305,9 +312,13 @@ void thread_pkey_regs_init(struct thread_struct *thread)
if (static_branch_likely(_disabled))
return;
 
-   thread->amr = read_amr() & pkey_amr_uamor_mask;
-   thread->iamr = read_iamr() & pkey_iamr_mask;
-   thread->uamor = read_uamor() & pkey_amr_uamor_mask;
+   thread->amr = pkey_amr_mask;
+   thread->iamr = pkey_iamr_mask;
+   thread->uamor = pkey_uamor_mask;
+
+   write_uamor(pkey_uamor_mask);
+   write_amr(pkey_amr_mask);
+   write_iamr(pkey_iamr_mask);
 }
 
 static inline bool pkey_allows_readwrite(int pkey)
-- 
1.7.1



[PATCH 2/5] powerpc/pkeys: Save the pkey registers before fork

2018-06-04 Thread Ram Pai
When a thread forks, the contents of the AMR, IAMR and UAMOR registers
are not inherited by the newly forked thread.

Save the registers before forking, so that their contents are
automatically copied into the new thread.

CC: Michael Ellerman 
CC: Florian Weimer 
CC: Andy Lutomirski 
CC: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/kernel/process.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1237f13..999dd08 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -582,6 +582,7 @@ static void save_all(struct task_struct *tsk)
__giveup_spe(tsk);
 
msr_check_and_clear(msr_all_available);
+   thread_pkey_regs_save(>thread);
 }
 
 void flush_all_to_thread(struct task_struct *tsk)
-- 
1.7.1



[PATCH 0/5] powerpc/pkeys: fixes to pkeys

2018-06-04 Thread Ram Pai
An assortment of pkey fixes.

Patch 1  makes pkeys usable in multithreaded applications.

Patch 2  fixes fork behavior to inherit the key attributes.

Patch 3  fixes an off-by-one bug that made one key unusable.

Patch 4  preallocates the execute-only key.

Patch 5  makes pkey-0 less special.

Ram Pai (5):
  powerpc/pkeys: Enable all user-allocatable pkeys at init.
  powerpc/pkeys: Save the pkey registers before fork
  powerpc/pkeys: fix calculation of total pkeys.
  powerpc/pkeys: Preallocate execute-only key
  powerpc/pkeys: make protection key 0 less special

 arch/powerpc/include/asm/pkeys.h |   29 --
 arch/powerpc/kernel/process.c|1 +
 arch/powerpc/mm/pkeys.c  |  107 ++
 3 files changed, 64 insertions(+), 73 deletions(-)



Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread David Gibson
On Mon, Jun 04, 2018 at 07:48:54PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-06-04 at 18:57 +1000, David Gibson wrote:
> > 
> > > - First qemu doesn't know that the guest will switch to "secure mode"
> > > in advance. There is no difference between a normal and a secure
> > > partition until the partition does the magic UV call to "enter secure
> > > mode" and qemu doesn't see any of it. So who can set the flag here ?
> > 
> > This seems weird to me.  As a rule HV calls should go through qemu -
> > or be allowed to go directly to KVM *by* qemu.
> 
> It's not an HV call, it's a UV call, qemu won't see it, qemu isn't
> trusted. Now the UV *will* reflect that to the HV via some synthetized
> HV calls, and we *could* have those do a pass by qemu, however, so far,
> our entire design doesn't rely on *any* qemu knowledge whatsoever and
> it would be sad to add it just for that purpose.
> 
> Additionally, this is rather orthogonal, see my other email, the
> problem we are trying to solve is *not* a qemu problem and it doesn't
> make sense to leak that into qemu.
> 
> >   We generally reserve
> > the latter for hot path things.  Since this isn't a hot path, having
> > the call handled directly by the kernel seems wrong.
> >
> > Unless a "UV call" is something different I don't know about.
> 
> Yes, a UV call goes to the Ultravisor, not the Hypervisor. The
> Hypervisor isn't trusted.

Ah, right.  Is that implemented in the host kernel, or in something
further above?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Michael S. Tsirkin
On Tue, Jun 05, 2018 at 09:26:56AM +1000, Benjamin Herrenschmidt wrote:
> I would like to keep however the ability to bypass the iommu for
> performance reasons

So that's easy, clear the IOMMU flag and this means "bypass the IOMMU".

-- 
MST


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Benjamin Herrenschmidt
On Mon, 2018-06-04 at 19:21 +0300, Michael S. Tsirkin wrote:
> 
> > > > - First qemu doesn't know that the guest will switch to "secure mode"
> > > > in advance. There is no difference between a normal and a secure
> > > > partition until the partition does the magic UV call to "enter secure
> > > > mode" and qemu doesn't see any of it. So who can set the flag here ?
> > > 
> > > The user should set it. You just tell user "to be able to use with
> > > feature X, enable IOMMU".
> > 
> > That's completely backwards. The user has no idea what that stuff is.
> > And it would have to percolate all the way up the management stack,
> > libvirt, kimchi, whatever else ... that's just nonsense.
> > 
> > Especially since, as I explained in my other email, this is *not* a
> > qemu problem and thus the solution shouldn't be messing around with
> > qemu.
> 
> virtio is implemented in qemu though. If you prefer to stick
> all your code in either guest or the UV that's your decision
> but it looks like qemu could be helpful here.

Sorry Michael, that doesn't click. Yes, of course virtio is implemented
in qemu, but the problem we are trying to solve is *not* a qemu problem
(the fact that the Linux drivers bypass the DMA API is wrong, needs
fixing, and isn't a qemu problem). The fact that the secure guests need
bounce buffering is not a qemu problem either.

Whether qemu chooses to use an iommu or not is, and should remain, an
orthogonal problem.

Forcing qemu to use the iommu to work around a Linux-side lack of
proper use of the DMA API is not only papering over the problem, it's
also forcing changes up 3 or 4 levels of the SW stack to create a new
option that no user will understand the meaning of and that would
otherwise be unnecessary.

> For example what if you have a guest that passes physical addresses
> to qemu bypassing swiotlb? Don't you want to detect
> that and fail gracefully rather than crash the guest?

A guest bug then ? Well it wouldn't so much crash as force the pages to
become encrypted and cause horrible ping/pong between qemu and the
guest (the secure pages aren't accessible to qemu directly).

> That's what VIRTIO_F_IOMMU_PLATFORM will do for you.

Again this is orthogonal. Using an iommu will indeed provide a modicum
of protection against buggy drivers, like it does on HW PCI platforms,
whether those guests are secure or not.

Note however that in practice, we tend to disable the iommu even on
real HW whenever we want performance (of course we can't for guests but
for bare metal systems we do, the added RAS isn't worth the performance
lost for very fast networking for example).

> Still that's hypervisor's decision. What isn't up to the hypervisor is
> the way we structure code. We made an early decision to merge a hack
> with xen, among discussion about how with time DMA API will learn to
> support per-device quirks and we'll be able to switch to that.
> So let's do that now?

The DMA API itself isn't the one that needs to learn "per-device
quirks", it's just plumbing into arch backends. The "quirk" is at the
point of establishing the backend for a given device.

We can go a good way down that path simply by having virtio in Linux
start with putting *itself* its own direct ops in there when
VIRTIO_F_IOMMU_PLATFORM is not set, and removing all the special casing
in the rest of the driver.

Once that's done, we have a single point of establishing the dma ops,
we can quirk in there if needed, that's rather nicely contained, or put
an arch hook, or whatever is necessary.

I would like to keep however the ability to bypass the iommu for
performance reasons, and also because it's qemu default mode of
operation and my secure guest has no clean way to force qemu to turn
the iommu on. The hypervisor *could* return something to qemu when the
guest switch to secure as we do know that, and qemu could walk all of
it's virtio devices as a result and "switch" them over but that's
almost grosser from a qemu perspective.

.../...

> > The point is that requiring specific qemu command line arguments isn't
> > going to fly. We have additional problems due to the fact that our
> > firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
> > though those can be fixed.
> > 
> > Overall, however, this seems to be the most convoluted way of achieving
> > things, require user interventions where none should be needed etc...
> > 
> > Again, what's wrong with a 2 lines hook instead that solves it all and
> > completely avoids involving qemu ?
> > 
> > Ben.
> 
> That each platform wants to add hacks in this data path function.

Sure, then add a single platform hook and the platforms can do what
they want here.

But as I said, it should all be done at initialization time rather than
in the data path, this we absolutely agree. We should just chose the
right set of dma_ops, and have the data path always use the DMA API.

Cheers,
Ben.



Re: [PATCH net-next] wan/fsl_ucc_hdlc: use dma_zalloc_coherent instead of allocator/memset

2018-06-04 Thread David Miller
From: YueHaibing 
Date: Mon, 4 Jun 2018 21:07:59 +0800

> Use dma_zalloc_coherent instead of dma_alloc_coherent
> followed by memset 0.
> 
> Signed-off-by: YueHaibing 

Applied.


Re: pkeys on POWER: Access rights not reset on execve

2018-06-04 Thread Florian Weimer

On 06/04/2018 09:02 PM, Ram Pai wrote:

On Mon, Jun 04, 2018 at 07:57:46PM +0200, Florian Weimer wrote:

On 06/04/2018 04:01 PM, Ram Pai wrote:

On Mon, Jun 04, 2018 at 12:12:07PM +0200, Florian Weimer wrote:

On 06/03/2018 10:18 PM, Ram Pai wrote:

On Mon, May 21, 2018 at 01:29:11PM +0200, Florian Weimer wrote:

On 05/20/2018 09:11 PM, Ram Pai wrote:

Florian,

Does the following patch fix the problem for you?  Just like x86
I am enabling all keys in the UAMOR register during
initialization itself. Hence any key created by any thread at
any time, will get activated on all threads. So any thread
can change the permission on that key. Smoke tested it
with your test program.


I think this goes in the right direction, but the AMR value after
fork is still strange:

AMR (PID 34912): 0x
AMR after fork (PID 34913): 0x
AMR (PID 34913): 0x
Allocated key in subprocess (PID 34913): 2
Allocated key (PID 34912): 2
Setting AMR: 0x
New AMR value (PID 34912): 0x0fff
About to call execl (PID 34912) ...
AMR (PID 34912): 0x0fff
AMR after fork (PID 34914): 0x0003
AMR (PID 34914): 0x0003
Allocated key in subprocess (PID 34914): 2
Allocated key (PID 34912): 2
Setting AMR: 0x
New AMR value (PID 34912): 0x0fff

I mean this line:

AMR after fork (PID 34914): 0x0003

Shouldn't it be the same as in the parent process?


Fixed it. Please try this patch. If it all works to your satisfaction, I
will clean it up further and send it to Michael Ellerman (ppc maintainer).


commit 51f4208ed5baeab1edb9b0f8b68d719b3527
Author: Ram Pai 
Date:   Sun Jun 3 14:44:32 2018 -0500

 Fix for the fork bug.
 Signed-off-by: Ram Pai 


Is this on top of the previous patch, or a separate fix?


top of previous patch.


Thanks.  With this patch, I get this on an LPAR:

AMR (PID 1876): 0x0003
AMR after fork (PID 1877): 0x0003
AMR (PID 1877): 0x0003
Allocated key in subprocess (PID 1877): 2
Allocated key (PID 1876): 2
Setting AMR: 0x
New AMR value (PID 1876): 0x0fff
About to call execl (PID 1876) ...
AMR (PID 1876): 0x0003
AMR after fork (PID 1878): 0x0003
AMR (PID 1878): 0x0003
Allocated key in subprocess (PID 1878): 2
Allocated key (PID 1876): 2
Setting AMR: 0x
New AMR value (PID 1876): 0x0fff

Test program is still this one:



So the process starts out with a different AMR value for some
reason. That could be a pre-existing bug that was just hidden by the
reset-to-zero on fork, or it could be intentional.  But the kernel


yes it is a bug, a patch for which is lined up for submission.

The fix is


commit eaf5b2ac002ad2f5bca118d7ce075ce28311aa8e
Author: Ram Pai 
Date:   Mon Jun 4 10:58:44 2018 -0500

 powerpc/pkeys: fix total pkeys calculation
 
 Total number of pkeys calculation is off by 1. Fix it.
 
 Signed-off-by: Ram Pai 


Looks good to me now.  Initial AMR value is zero, as is currently intended.

So the remaining question at this point is whether the Intel behavior 
(default-deny instead of default-allow) is preferable.


But if you can get the existing fixes into 4.18 and perhaps the relevant 
stable kernels, that would already be a great help for my glibc work.


Thanks,
Florian


Re: [1/5] powerpc/embedded6xx: Remove C2K board support

2018-06-04 Thread Mark Greer
On Tue, Jun 05, 2018 at 12:10:31AM +1000, Michael Ellerman wrote:
> On Fri, 2018-04-06 at 01:17:16 UTC, Mark Greer wrote:
> > The C2K platform appears to be orphaned so remove code supporting it.
> > 
> > CC: Remi Machet 
> > Signed-off-by: Mark Greer 
> > Acked-by: Remi Machet 
> > Signed-off-by: Mark Greer 
> 
> Series applied to powerpc next, thanks.
> 
> https://git.kernel.org/powerpc/c/92c8c16f345759e87c5d5b771d438f

Thanks Michael.

Mark
--


Re: pkeys on POWER: Access rights not reset on execve

2018-06-04 Thread Ram Pai
On Mon, Jun 04, 2018 at 07:57:46PM +0200, Florian Weimer wrote:
> On 06/04/2018 04:01 PM, Ram Pai wrote:
> >On Mon, Jun 04, 2018 at 12:12:07PM +0200, Florian Weimer wrote:
> >>On 06/03/2018 10:18 PM, Ram Pai wrote:
> >>>On Mon, May 21, 2018 at 01:29:11PM +0200, Florian Weimer wrote:
> On 05/20/2018 09:11 PM, Ram Pai wrote:
> >Florian,
> >
> > Does the following patch fix the problem for you?  Just like x86
> > I am enabling all keys in the UAMOR register during
> > initialization itself. Hence any key created by any thread at
> > any time, will get activated on all threads. So any thread
> > can change the permission on that key. Smoke tested it
> > with your test program.
> 
> I think this goes in the right direction, but the AMR value after
> fork is still strange:
> 
> AMR (PID 34912): 0x
> AMR after fork (PID 34913): 0x
> AMR (PID 34913): 0x
> Allocated key in subprocess (PID 34913): 2
> Allocated key (PID 34912): 2
> Setting AMR: 0x
> New AMR value (PID 34912): 0x0fff
> About to call execl (PID 34912) ...
> AMR (PID 34912): 0x0fff
> AMR after fork (PID 34914): 0x0003
> AMR (PID 34914): 0x0003
> Allocated key in subprocess (PID 34914): 2
> Allocated key (PID 34912): 2
> Setting AMR: 0x
> New AMR value (PID 34912): 0x0fff
> 
> I mean this line:
> 
> AMR after fork (PID 34914): 0x0003
> 
> Shouldn't it be the same as in the parent process?
> >>>
> >>>Fixed it. Please try this patch. If it all works to your satisfaction, I
> >>>will clean it up further and send to Michael Ellermen(ppc maintainer).
> >>>
> >>>
> >>>commit 51f4208ed5baeab1edb9b0f8b68d719b3527
> >>>Author: Ram Pai 
> >>>Date:   Sun Jun 3 14:44:32 2018 -0500
> >>>
> >>> Fix for the fork bug.
> >>> Signed-off-by: Ram Pai 
> >>
> >>Is this on top of the previous patch, or a separate fix?
> >
> >top of previous patch.
> 
> Thanks.  With this patch, I get this on an LPAR:
> 
> AMR (PID 1876): 0x0003
> AMR after fork (PID 1877): 0x0003
> AMR (PID 1877): 0x0003
> Allocated key in subprocess (PID 1877): 2
> Allocated key (PID 1876): 2
> Setting AMR: 0x
> New AMR value (PID 1876): 0x0fff
> About to call execl (PID 1876) ...
> AMR (PID 1876): 0x0003
> AMR after fork (PID 1878): 0x0003
> AMR (PID 1878): 0x0003
> Allocated key in subprocess (PID 1878): 2
> Allocated key (PID 1876): 2
> Setting AMR: 0x
> New AMR value (PID 1876): 0x0fff
> 
> Test program is still this one:
> 
> 
> 
> So the process starts out with a different AMR value for some
> reason. That could be a pre-existing bug that was just hidden by the
> reset-to-zero on fork, or it could be intentional.  But the kernel

yes it is a bug, a patch for which is lined up for submission.

The fix is


commit eaf5b2ac002ad2f5bca118d7ce075ce28311aa8e
Author: Ram Pai 
Date:   Mon Jun 4 10:58:44 2018 -0500

powerpc/pkeys: fix total pkeys calculation

Total number of pkeys calculation is off by 1. Fix it.

Signed-off-by: Ram Pai 

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 4530cdf..3384c4e 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -93,7 +93,7 @@ int pkey_initialize(void)
 * arch-neutral code.
 */
pkeys_total = min_t(int, pkeys_total,
-   (ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT));
+   ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)+1));
 
if (!pkey_mmu_enabled() || radix_enabled() || !pkeys_total)
static_branch_enable(_disabled);



Re: pkeys on POWER: Access rights not reset on execve

2018-06-04 Thread Florian Weimer

On 06/04/2018 04:01 PM, Ram Pai wrote:

On Mon, Jun 04, 2018 at 12:12:07PM +0200, Florian Weimer wrote:

On 06/03/2018 10:18 PM, Ram Pai wrote:

On Mon, May 21, 2018 at 01:29:11PM +0200, Florian Weimer wrote:

On 05/20/2018 09:11 PM, Ram Pai wrote:

Florian,

Does the following patch fix the problem for you?  Just like x86
I am enabling all keys in the UAMOR register during
initialization itself. Hence any key created by any thread at
any time, will get activated on all threads. So any thread
can change the permission on that key. Smoke tested it
with your test program.


I think this goes in the right direction, but the AMR value after
fork is still strange:

AMR (PID 34912): 0x
AMR after fork (PID 34913): 0x
AMR (PID 34913): 0x
Allocated key in subprocess (PID 34913): 2
Allocated key (PID 34912): 2
Setting AMR: 0x
New AMR value (PID 34912): 0x0fff
About to call execl (PID 34912) ...
AMR (PID 34912): 0x0fff
AMR after fork (PID 34914): 0x0003
AMR (PID 34914): 0x0003
Allocated key in subprocess (PID 34914): 2
Allocated key (PID 34912): 2
Setting AMR: 0x
New AMR value (PID 34912): 0x0fff

I mean this line:

AMR after fork (PID 34914): 0x0003

Shouldn't it be the same as in the parent process?


Fixed it. Please try this patch. If it all works to your satisfaction, I
will clean it up further and send it to Michael Ellerman (ppc maintainer).


commit 51f4208ed5baeab1edb9b0f8b68d719b3527
Author: Ram Pai 
Date:   Sun Jun 3 14:44:32 2018 -0500

 Fix for the fork bug.
 Signed-off-by: Ram Pai 


Is this on top of the previous patch, or a separate fix?


top of previous patch.


Thanks.  With this patch, I get this on an LPAR:

AMR (PID 1876): 0x0003
AMR after fork (PID 1877): 0x0003
AMR (PID 1877): 0x0003
Allocated key in subprocess (PID 1877): 2
Allocated key (PID 1876): 2
Setting AMR: 0x
New AMR value (PID 1876): 0x0fff
About to call execl (PID 1876) ...
AMR (PID 1876): 0x0003
AMR after fork (PID 1878): 0x0003
AMR (PID 1878): 0x0003
Allocated key in subprocess (PID 1878): 2
Allocated key (PID 1876): 2
Setting AMR: 0x
New AMR value (PID 1876): 0x0fff

Test program is still this one:



So the process starts out with a different AMR value for some reason. 
That could be a pre-existing bug that was just hidden by the 
reset-to-zero on fork, or it could be intentional.  But the kernel code 
does not indicate that key 63 is reserved (POWER numbers keys from the 
MSB to the LSB).


But it looks like we are finally getting somewhere. 8-)

Thanks,
Florian


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Michael S. Tsirkin
On Mon, Jun 04, 2018 at 11:14:36PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-06-04 at 05:55 -0700, Christoph Hellwig wrote:
> > On Mon, Jun 04, 2018 at 03:43:09PM +0300, Michael S. Tsirkin wrote:
> > > Another is that given the basic functionality is in there, optimizations
> > > can possibly wait until per-device quirks in DMA API are supported.
> > 
> > We have had per-device dma_ops for quite a while.
> 
> I've asked Ansuman to start with a patch that converts virtio to use
> DMA ops always, along with an init quirk to hookup "direct" ops when
> the IOMMU flag isn't set.
> 
> This will at least remove that horrid duplication of code path we have
> in there.
> 
> Then we can just involve the arch in that init quirk so we can chose an
> alternate set of ops when running a secure VM.
> 
> This is completely orthogonal to whether an iommu exist qemu side or
> not, and should be entirely solved on the Linux side.
> 
> Cheers,
> Ben.

Sounds good to me.

-- 
MST


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Michael S. Tsirkin
On Mon, Jun 04, 2018 at 11:11:52PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-06-04 at 15:43 +0300, Michael S. Tsirkin wrote:
> > On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > > 
> > > > I re-read that discussion and I'm still unclear on the
> > > > original question, since I got several apparently
> > > > conflicting answers.
> > > > 
> > > > I asked:
> > > > 
> > > > Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > > > hypervisor side sufficient?
> > > 
> > > I thought I had replied to this...
> > > 
> > > There are a couple of reasons:
> > > 
> > > - First qemu doesn't know that the guest will switch to "secure mode"
> > > in advance. There is no difference between a normal and a secure
> > > partition until the partition does the magic UV call to "enter secure
> > > mode" and qemu doesn't see any of it. So who can set the flag here ?
> > 
> > The user should set it. You just tell user "to be able to use with
> > feature X, enable IOMMU".
> 
> That's completely backwards. The user has no idea what that stuff is.
> And it would have to percolate all the way up the management stack,
> libvirt, kimchi, whatever else ... that's just nonsense.
> 
> Especially since, as I explained in my other email, this is *not* a
> qemu problem and thus the solution shouldn't be messing around with
> qemu.

virtio is implemented in qemu though. If you prefer to stick
all your code in either guest or the UV that's your decision
but it looks like qemu could be helpful here.

For example what if you have a guest that passes physical addresses
to qemu bypassing swiotlb? Don't you want to detect
that and fail gracefully rather than crash the guest?
That's what VIRTIO_F_IOMMU_PLATFORM will do for you.

Still that's hypervisor's decision. What isn't up to the hypervisor is
the way we structure code. We made an early decision to merge a hack
with xen, among discussion about how with time DMA API will learn to
support per-device quirks and we'll be able to switch to that.
So let's do that now?

> > 
> > > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > > vhost) go through the emulated MMIO for every access to the guest,
> > > which adds additional overhead.
> > > 
> > > Cheers,
> > > Ben.
> > 
> > There are several answers to this.  One is that we are working hard to
> > make overhead small when the mappings are static (which they would be if
> > there's no actual IOMMU). So maybe especially given you are using
> > a bounce buffer on top it's not so bad - did you try to
> > benchmark?
> > 
> > Another is that given the basic functionality is in there, optimizations
> > can possibly wait until per-device quirks in DMA API are supported.
> 
> The point is that requiring specific qemu command line arguments isn't
> going to fly. We have additional problems due to the fact that our
> firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
> though those can be fixed.
> 
> Overall, however, this seems to be the most convoluted way of achieving
> things, require user interventions where none should be needed etc...
> 
> Again, what's wrong with a 2 lines hook instead that solves it all and
> completely avoids involving qemu ?
> 
> Ben.

That each platform wants to add hacks in this data path function.

> > 
> > > > 
> > > > 
> > > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++
> > > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++
> > > > >  drivers/virtio/virtio_ring.c   | 10 ++
> > > > >  3 files changed, 27 insertions(+)
> > > > > 
> > > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > > > index 8fa3945..056e578 100644
> > > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device *dev);
> > > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > > >  
> > > > >  #endif /* __KERNEL__ */
> > > > > +
> > > > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > > > +
> > > > > +struct virtio_device;
> > > > > +
> > > > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > > > >  #endif   /* _ASM_DMA_MAPPING_H */
> > > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > > > index 06f0296..a2ec15a 100644
> > > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > > @@ -38,6 +38,7 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > +#include 
> > > > >  #include 
> > > > >  #include 
> > > > >  #include 
> > > > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > > > >  __setup("multitce=", disable_multitce);
> > > > >  
> > > > >  

[RFC PATCH -tip v5 24/27] bpf: error-inject: kprobes: Clear current_kprobe and enable preempt in kprobe

2018-06-04 Thread Masami Hiramatsu
Clear current_kprobe and enable preemption in kprobe handlers
even if the pre_handler returns !0.

This simplifies function override using kprobes.

Jprobes used to require keeping preemption disabled and keeping
current_kprobe set until execution returned to the original
function entry. For this reason, kprobe_int3_handler() and
similar arch-dependent kprobe handlers check the pre_handler
result and exit without enabling preemption if the result is !0.

After the removal of jprobes, kprobes no longer needs to keep
preemption disabled when a user handler returns !0.

But since the function override handlers in error-inject and
bpf also return !0 when they override a function, they currently
enable preemption and reset current_kprobe themselves in order
to balance the preempt count.

That is a fragile, bug-prone design. Fix the unbalanced
preempt count and current_kprobe handling in kprobes, bpf and
error-inject instead.

Note: for powerpc and x86, this removes all preempt_disable
calls from kprobe_ftrace_handler, because ftrace callbacks are
called with preemption already disabled.

Signed-off-by: Masami Hiramatsu 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Ralf Baechle 
Cc: James Hogan 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: "David S. Miller" 
Cc: "Naveen N. Rao" 
Cc: Josef Bacik 
Cc: Alexei Starovoitov 
Cc: x...@kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
---
 Changes in v5:
  - Fix kprobe_ftrace_handler in arch/powerpc too.
---
 arch/arc/kernel/kprobes.c|5 +++--
 arch/arm/probes/kprobes/core.c   |   10 +-
 arch/arm64/kernel/probes/kprobes.c   |   10 +-
 arch/ia64/kernel/kprobes.c   |   13 -
 arch/mips/kernel/kprobes.c   |4 ++--
 arch/powerpc/kernel/kprobes-ftrace.c |   15 ++-
 arch/powerpc/kernel/kprobes.c|7 +--
 arch/s390/kernel/kprobes.c   |7 ---
 arch/sh/kernel/kprobes.c |7 ---
 arch/sparc/kernel/kprobes.c  |7 ---
 arch/x86/kernel/kprobes/core.c   |4 
 arch/x86/kernel/kprobes/ftrace.c |   15 ---
 kernel/fail_function.c   |3 ---
 kernel/trace/trace_kprobe.c  |   11 +++
 14 files changed, 57 insertions(+), 61 deletions(-)

diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c
index 465365696c91..df35d4c0b0b8 100644
--- a/arch/arc/kernel/kprobes.c
+++ b/arch/arc/kernel/kprobes.c
@@ -231,6 +231,9 @@ int __kprobes arc_kprobe_handler(unsigned long addr, struct pt_regs *regs)
if (!p->pre_handler || !p->pre_handler(p, regs)) {
setup_singlestep(p, regs);
kcb->kprobe_status = KPROBE_HIT_SS;
+   } else {
+   reset_current_kprobe();
+   preempt_enable_no_resched();
}
 
return 1;
@@ -442,9 +445,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
kretprobe_assert(ri, orig_ret_address, trampoline_address);
regs->ret = orig_ret_address;
 
-   reset_current_kprobe();
kretprobe_hash_unlock(current, );
-   preempt_enable_no_resched();
 
hlist_for_each_entry_safe(ri, tmp, _rp, hlist) {
hlist_del(>hlist);
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 3192350f389d..8d37601fdb20 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -300,10 +300,10 @@ void __kprobes kprobe_handler(struct pt_regs *regs)
 
/*
 * If we have no pre-handler or it returned 0, we
-* continue with normal processing.  If we have a
-* pre-handler and it returned non-zero, it prepped
-* for calling the break_handler below on re-entry,
-* so get out doing nothing more here.
+* continue with normal processing. If we have a
+* pre-handler and it returned non-zero, it will
+* modify the execution path and no need to single
+* stepping. Let's just reset current kprobe and exit.
 */
if (!p->pre_handler || !p->pre_handler(p, regs)) {
kcb->kprobe_status = KPROBE_HIT_SS;
@@ -312,8 +312,8 @@ void __kprobes kprobe_handler(struct pt_regs *regs)
kcb->kprobe_status = KPROBE_HIT_SSDONE;

[RFC PATCH -tip v5 18/27] powerpc/kprobes: Don't call the ->break_handler() in arm kprobes code

2018-06-04 Thread Masami Hiramatsu
Don't call the ->break_handler() from the powerpc kprobes code,
because it was only used by jprobes, which have been removed.

This also makes skip_singlestep() a static function, since
only kprobes-ftrace.c uses it.

Signed-off-by: Masami Hiramatsu 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/include/asm/kprobes.h   |   10 --
 arch/powerpc/kernel/kprobes-ftrace.c |   16 +++-
 arch/powerpc/kernel/kprobes.c|   31 +++
 3 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 674036db558b..785c464b6588 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -102,16 +102,6 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_handler(struct pt_regs *regs);
 extern int kprobe_post_handler(struct pt_regs *regs);
-#ifdef CONFIG_KPROBES_ON_FTRACE
-extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
-  struct kprobe_ctlblk *kcb);
-#else
-static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb)
-{
-   return 0;
-}
-#endif
 #else
 static inline int kprobe_handler(struct pt_regs *regs) { return 0; }
 static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; }
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 1b316331c2d9..3869b0e5d5c7 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -26,8 +26,8 @@
 #include 
 
 static nokprobe_inline
-int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
- struct kprobe_ctlblk *kcb, unsigned long orig_nip)
+int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
+   struct kprobe_ctlblk *kcb, unsigned long orig_nip)
 {
/*
 * Emulate singlestep (and also recover regs->nip)
@@ -44,16 +44,6 @@ int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
return 1;
 }
 
-int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
-   struct kprobe_ctlblk *kcb)
-{
-   if (kprobe_ftrace(p))
-   return __skip_singlestep(p, regs, kcb, 0);
-   else
-   return 0;
-}
-NOKPROBE_SYMBOL(skip_singlestep);
-
 /* Ftrace callback handler for kprobes */
 void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
   struct ftrace_ops *ops, struct pt_regs *regs)
@@ -82,7 +72,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
__this_cpu_write(current_kprobe, p);
kcb->kprobe_status = KPROBE_HIT_ACTIVE;
if (!p->pre_handler || !p->pre_handler(p, regs))
-   __skip_singlestep(p, regs, kcb, orig_nip);
+   skip_singlestep(p, regs, kcb, orig_nip);
else {
/*
 * If pre_handler returns !0, it sets regs->nip and
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 600678fce0a8..f06747e2e70d 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -317,25 +317,17 @@ int kprobe_handler(struct pt_regs *regs)
}
prepare_singlestep(p, regs);
return 1;
-   } else {
-   if (*addr != BREAKPOINT_INSTRUCTION) {
-   /* If trap variant, then it belongs not to us */
-   kprobe_opcode_t cur_insn = *addr;
-   if (is_trap(cur_insn))
-   goto no_kprobe;
-   /* The breakpoint instruction was removed by
-* another cpu right after we hit, no further
-* handling of this interrupt is appropriate
-*/
-   ret = 1;
+   } else if (*addr != BREAKPOINT_INSTRUCTION) {
+   /* If trap variant, then it belongs not to us */
+   kprobe_opcode_t cur_insn = *addr;
+
+   if (is_trap(cur_insn))
goto no_kprobe;
-   }
-   p = __this_cpu_read(current_kprobe);
-   if (p->break_handler && p->break_handler(p, regs)) {
-   if (!skip_singlestep(p, regs, kcb))
-   goto ss_probe;
-   ret = 1;
-   }
+   /* The breakpoint 

[RFC PATCH -tip v5 07/27] powerpc/kprobes: Remove jprobe powerpc implementation

2018-06-04 Thread Masami Hiramatsu
Remove arch dependent setjump/longjump functions
and unused fields in kprobe_ctlblk for jprobes
from arch/powerpc. This also reverts commits
related __is_active_jprobe() function.

Signed-off-by: Masami Hiramatsu 

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: "Naveen N. Rao" 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/include/asm/kprobes.h |2 -
 arch/powerpc/kernel/kprobes-ftrace.c   |   15 ---
 arch/powerpc/kernel/kprobes.c  |   54 
 arch/powerpc/kernel/trace/ftrace_64_mprofile.S |   39 ++---
 4 files changed, 5 insertions(+), 105 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 9f3be5c8a4a3..674036db558b 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -88,7 +88,6 @@ struct prev_kprobe {
 struct kprobe_ctlblk {
unsigned long kprobe_status;
unsigned long kprobe_saved_msr;
-   struct pt_regs jprobe_saved_regs;
struct prev_kprobe prev_kprobe;
 };
 
@@ -104,7 +103,6 @@ extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_handler(struct pt_regs *regs);
 extern int kprobe_post_handler(struct pt_regs *regs);
 #ifdef CONFIG_KPROBES_ON_FTRACE
-extern int __is_active_jprobe(unsigned long addr);
 extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
   struct kprobe_ctlblk *kcb);
 #else
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 7a1f99f1b47f..1b316331c2d9 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -25,21 +25,6 @@
 #include 
 #include 
 
-/*
- * This is called from ftrace code after invoking registered handlers to
- * disambiguate regs->nip changes done by jprobes and livepatch. We check if
- * there is an active jprobe at the provided address (mcount location).
- */
-int __is_active_jprobe(unsigned long addr)
-{
-   if (!preemptible()) {
-   struct kprobe *p = raw_cpu_read(current_kprobe);
-   return (p && (unsigned long)p->addr == addr) ? 1 : 0;
-   }
-
-   return 0;
-}
-
 static nokprobe_inline
 int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
  struct kprobe_ctlblk *kcb, unsigned long orig_nip)
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index e4c5bf33970b..600678fce0a8 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -611,60 +611,6 @@ unsigned long arch_deref_entry_point(void *entry)
 }
 NOKPROBE_SYMBOL(arch_deref_entry_point);
 
-int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
-{
-   struct jprobe *jp = container_of(p, struct jprobe, kp);
-   struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
-
-   memcpy(&kcb->jprobe_saved_regs, regs, sizeof(struct pt_regs));
-
-   /* setup return addr to the jprobe handler routine */
-   regs->nip = arch_deref_entry_point(jp->entry);
-#ifdef PPC64_ELF_ABI_v2
-   regs->gpr[12] = (unsigned long)jp->entry;
-#elif defined(PPC64_ELF_ABI_v1)
-   regs->gpr[2] = (unsigned long)(((func_descr_t *)jp->entry)->toc);
-#endif
-
-   /*
-* jprobes use jprobe_return() which skips the normal return
-* path of the function, and this messes up the accounting of the
-* function graph tracer.
-*
-* Pause function graph tracing while performing the jprobe function.
-*/
-   pause_graph_tracing();
-
-   return 1;
-}
-NOKPROBE_SYMBOL(setjmp_pre_handler);
-
-void __used jprobe_return(void)
-{
-   asm volatile("jprobe_return_trap:\n"
-"trap\n"
-::: "memory");
-}
-NOKPROBE_SYMBOL(jprobe_return);
-
-int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
-{
-   struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
-
-   if (regs->nip != ppc_kallsyms_lookup_name("jprobe_return_trap")) {
-   pr_debug("longjmp_break_handler NIP (0x%lx) does not match jprobe_return_trap (0x%lx)\n",
-   regs->nip, ppc_kallsyms_lookup_name("jprobe_return_trap"));
-   return 0;
-   }
-
-   memcpy(regs, &kcb->jprobe_saved_regs, sizeof(struct pt_regs));
-   /* It's OK to start function graph tracing again */
-   unpause_graph_tracing();
-   preempt_enable_no_resched();
-   return 1;
-}
-NOKPROBE_SYMBOL(longjmp_break_handler);
-
 static struct kprobe trampoline_p = {
	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
.pre_handler = trampoline_probe_handler
diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index 3f3e81852422..4e84a713e80a 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -99,39 +99,13 @@ ftrace_call:
bl  

Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices

2018-06-04 Thread Benjamin Herrenschmidt
On Mon, 2018-06-04 at 05:55 -0700, Christoph Hellwig wrote:
> On Mon, Jun 04, 2018 at 03:43:09PM +0300, Michael S. Tsirkin wrote:
> > Another is that given the basic functionality is in there, optimizations
> > can possibly wait until per-device quirks in DMA API are supported.
> 
> We have had per-device dma_ops for quite a while.

I've asked Anshuman to start with a patch that converts virtio to use
DMA ops always, along with an init quirk to hookup "direct" ops when
the IOMMU flag isn't set.

This will at least remove that horrid duplication of code path we have
in there.

Then we can just involve the arch in that init quirk so we can choose an
alternate set of ops when running a secure VM.

This is completely orthogonal to whether an IOMMU exists on the qemu side or
not, and should be entirely solved on the Linux side.

Cheers,
Ben.


Re: [v4, 1/7] powerpc/64s/radix: do not flush TLB when relaxing access

2018-06-04 Thread Michael Ellerman
On Fri, 2018-06-01 at 10:01:15 UTC, Nicholas Piggin wrote:
> Radix flushes the TLB when updating ptes to increase permissiveness
> of protection (increase access authority). Book3S does not require
> TLB flushing in this case, and it is not done on hash. This patch
> avoids the flush for radix.
> 
> From Power ISA v3.0B, p.1090:
> 
> Setting a Reference or Change Bit or Upgrading Access Authority
> (PTE Subject to Atomic Hardware Updates)
> 
> If the only change being made to a valid PTE that is subject to
> atomic hardware updates is to set the Reference or Change bit to 1
> or to add access authorities, a simpler sequence suffices because
> the translation hardware will refetch the PTE if an access is
> attempted for which the only problems were reference and/or change
> bits needing to be set or insufficient access authority.
> 
> The nest MMU on POWER9 does not re-fetch the PTE after such an access
> attempt before faulting, so address spaces with a coprocessor
> attached will continue to flush in these cases.
> 
> This reduces tlbies for a kernel compile workload from 1.28M to 0.95M,
> tlbiels from 20.17M to 19.68M.
> 
> fork --fork --exec benchmark improved 2.77% (12000->12300).
> 
> Reviewed-by: Aneesh Kumar K.V 
> Signed-off-by: Nicholas Piggin 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e5f7cb58c2b77a0249c2028b6d1ec4

cheers


Re: powerpc/mm/hugetlb: Update hugetlb related locks

2018-06-04 Thread Michael Ellerman
On Fri, 2018-06-01 at 08:24:24 UTC, "Aneesh Kumar K.V" wrote:
> With split pmd page table lock enabled, we don't use mm->page_table_lock when
> updating pmd entries. This patch updates the hugetlb path to use the right lock
> when inserting huge page directory entries into page table.
> 
> ex: if we are using hugepd and inserting hugepd entry at the pmd level, we
> use pmd_lockptr, which based on config can be split pmd lock.
> 
> For updating huge page directory entries themselves we use mm->page_table_lock. We
> do have a helper huge_pte_lockptr() for that.
> 
> Fixes: 675d99529 ("powerpc/book3s64: Enable split pmd ptlock")
> Signed-off-by: Aneesh Kumar K.V 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ed515b6898c36775ddd99ff9ffeda4

cheers


Re: powerpc/mm/hash: hard disable irq in the SLB insert path

2018-06-04 Thread Michael Ellerman
On Fri, 2018-06-01 at 08:24:02 UTC, "Aneesh Kumar K.V" wrote:
> When inserting SLB entries for EA above 512TB, we need to hard disable irq.
> This will make sure we don't take a PMU interrupt that can possibly touch
> user space address via a stack dump. To prevent this, we need to hard disable
> the interrupt.
> 
> Also add a comment explaining why we don't need context synchronizing isync
> with slbmte.
> 
> Fixes: f384796c4 ("powerpc/mm: Add support for handling > 512TB address in 
> SLB miss")
> Signed-off-by: Aneesh Kumar K.V 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/a5db5060e0b2e27605df272224bfd4

cheers


Re: [v2,20/21] powerpc/xmon: use match_string() helper

2018-06-04 Thread Michael Ellerman
On Thu, 2018-05-31 at 11:11:25 UTC, Yisheng Xie wrote:
> match_string() returns the index of an array for a matching string,
> which can be used instead of open coded variant.
> 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Yisheng Xie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/0abbf2bfdc9dec32e9832aa8d4522a

cheers


Re: powerpc/mm/hash: Add missing update in slb update sequence.

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 13:18:04 UTC, "Aneesh Kumar K.V" wrote:
> From ISA
> 
> "For data accesses, the context synchronizing instruction before the slbie,
> slbieg, slbia, slbmte, tlbie, or tlbiel instruction ensures that all preceding
> instructions that access data storage have completed to a point at which they
> have reported all exceptions they will cause."
> 
> Add the missing isync when updating Kernel stack slb entry.
> 
> Signed-off-by: Aneesh Kumar K.V 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/91d06971881f71d945910de1286580

cheers


Re: [v5, 2/4] powerpc/kbuild: remove CROSS32 defines from top level powerpc Makefile

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 12:19:20 UTC, Nicholas Piggin wrote:
> Switch VDSO32 build over to use CROSS32_COMPILE directly, and have
> it pass in -m32 after the standard c_flags. This allows endianness
> overrides to be removed and the endian and bitness flags moved into
> standard flags variables.
> 
> Signed-off-by: Nicholas Piggin 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/af3901cbbd3de182aafb8ee553c825

cheers


Re: [v5, 3/4] powerpc/kbuild: Use flags variables rather than overriding LD/CC/AS

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 12:19:21 UTC, Nicholas Piggin wrote:
> The powerpc toolchain can compile combinations of 32/64 bit and
> big/little endian, so it's convenient to consider, e.g.,
> 
>   `CC -m64 -mbig-endian`
> 
> To be the C compiler for the purpose of invoking it to build target
> artifacts. So overriding the CC variable to include these flags
> works for this purpose.
> 
> Unfortunately that is not compatible with the way the proposed new
> Kconfig macro language will work.
> 
> After previous patches in this series, these flags can be carefully
> passed in using flags instead.
> 
> Signed-off-by: Nicholas Piggin 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1421dc6d48296a9e91702743b31458

cheers


Re: [kernel] powerpc/powernv/ioda2: Remove redundand free of TCE pages

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 09:22:50 UTC, Alexey Kardashevskiy wrote:
> When IODA2 creates a PE, it creates an IOMMU table with it_ops::free
> set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages().
> 
> Since iommu_tce_table_put() calls it_ops::free when the last reference
> to the table is released, explicit call to pnv_pci_ioda2_table_free_pages()
> is not needed so let's remove it.
> 
> This should fix a double free in the case of PCI hotplug, as
> pnv_pci_ioda2_table_free_pages() resets neither
> iommu_table::it_base nor ::it_size.
> 
> This was not exposed by SRIOV as it uses different code path via
> pnv_pcibios_sriov_disable().
> 
> IODA1 does not initialize it_ops::free so it does not have this issue.
> 
> Fixes: c5f7700bb "powerpc/powernv: Dynamically release PE"
> Signed-off-by: Alexey Kardashevskiy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/98fd72fe82527fd26618062b60cfd3

cheers


Re: powerpc/64s: Fix compiler store ordering to SLB shadow area

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 10:31:22 UTC, Nicholas Piggin wrote:
> The stores to update the SLB shadow area must be made as they appear
> in the C code, so that the hypervisor does not see an entry with
> mismatched vsid and esid. Use WRITE_ONCE for this.
> 
> GCC has been observed to elide the first store to esid in the update,
> which means that if the hypervisor interrupts the guest after storing
> to vsid, it could see an entry with old esid and new vsid, which may
> possibly result in memory corruption.
> 
> Signed-off-by: Nicholas Piggin 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/926bc2f100c24d4842b3064b5af44a

cheers


Re: [v5, 1/4] powerpc/kbuild: set default generic machine type for 32-bit compile

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 12:19:19 UTC, Nicholas Piggin wrote:
> Some 64-bit toolchains use the wrong ISA variant for compiling 32-bit
> kernels, even with -m32. Debian's powerpc64le is one such case, and
> that is because it is built with --with-cpu=power8.
> 
> So when cross compiling a 32-bit kernel with a 64-bit toolchain, set
> -mcpu=powerpc initially, which is the generic 32-bit powerpc machine
> type and scheduling model. CPU and platform code can override this
> with subsequent -mcpu flags if necessary.
> 
> This is not done for 32-bit toolchains otherwise it would override
> their defaults, which are presumably set appropriately for the
> environment (moreso than a 64-bit cross compiler).
> 
> This fixes a lot of build failures due to incompatible assembly when
> compiling a 32-bit kernel with the Debian powerpc64le 64-bit toolchain.
> 
> Cc: Segher Boessenkool 
> Signed-off-by: Nicholas Piggin 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/4bf4f42a2febb449a5cc5d79e7c58e

cheers


Re: [v3] powerpc: fix build failure by disabling attribute-alias warning

2018-06-04 Thread Michael Ellerman
On Tue, 2018-05-29 at 16:06:41 UTC, Christophe Leroy wrote:
> Latest GCC version emit the following warnings
> 
> As arch/powerpc code is built with -Werror, this breaks build with
> GCC 8.1
> 
> This patch inhibits those warnings
> 
>   CC  arch/powerpc/kernel/syscalls.o
> In file included from arch/powerpc/kernel/syscalls.c:24:
> ./include/linux/syscalls.h:233:18: error: 'sys_mmap2' alias between functions 
> of incompatible types 'long int(long unsigned int,  size_t,  long unsigned 
> int,  long unsigned int,  long unsigned int,  long unsigned int)' {aka 'long 
> int(long unsigned int,  long unsigned int,  long unsigned int,  long unsigned 
> int,  long unsigned int,  long unsigned int)'} and 'long int(long int,  long 
> int,  long int,  long int,  long int,  long int)' [-Werror=attribute-alias]
>   asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
>   ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro 
> '__SYSCALL_DEFINEx'
>   __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>   ^
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 
> 'SYSCALL_DEFINEx'
>  #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~
> arch/powerpc/kernel/syscalls.c:65:1: note: in expansion of macro 
> 'SYSCALL_DEFINE6'
>  SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
>  ^~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
>   asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>   ^~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro 
> '__SYSCALL_DEFINEx'
>   __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>   ^
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 
> 'SYSCALL_DEFINEx'
>  #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~
> arch/powerpc/kernel/syscalls.c:65:1: note: in expansion of macro 
> 'SYSCALL_DEFINE6'
>  SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
>  ^~~
> ./include/linux/syscalls.h:233:18: error: 'sys_mmap' alias between functions 
> of incompatible types 'long int(long unsigned int,  size_t,  long unsigned 
> int,  long unsigned int,  long unsigned int,  off_t)' {aka 'long int(long 
> unsigned int,  long unsigned int,  long unsigned int,  long unsigned int,  
> long unsigned int,  long int)'} and 'long int(long int,  long int,  long int, 
>  long int,  long int,  long int)' [-Werror=attribute-alias]
>   asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
>   ^~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro 
> '__SYSCALL_DEFINEx'
>   __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>   ^
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 
> 'SYSCALL_DEFINEx'
>  #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~
> arch/powerpc/kernel/syscalls.c:72:1: note: in expansion of macro 
> 'SYSCALL_DEFINE6'
>  SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
>  ^~~
> ./include/linux/syscalls.h:238:18: note: aliased declaration here
>   asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>   ^~~~
> ./include/linux/syscalls.h:222:2: note: in expansion of macro 
> '__SYSCALL_DEFINEx'
>   __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>   ^
> ./include/linux/syscalls.h:216:36: note: in expansion of macro 
> 'SYSCALL_DEFINEx'
>  #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
> ^~~
> arch/powerpc/kernel/syscalls.c:72:1: note: in expansion of macro 
> 'SYSCALL_DEFINE6'
>  SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
>  ^~~
>   CC  arch/powerpc/kernel/signal_32.o
> In file included from arch/powerpc/kernel/signal_32.c:31:
> ./include/linux/compat.h:74:18: error: 'compat_sys_swapcontext' alias between 
> functions of incompatible types 'long int(struct ucontext32 *, struct 
> ucontext32 *, int)' and 'long int(long int,  long int,  long int)' 
> [-Werror=attribute-alias]
>   asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
>   ^~
> ./include/linux/compat.h:58:2: note: in expansion of macro 
> 'COMPAT_SYSCALL_DEFINEx'
>   COMPAT_SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
>   ^~
> arch/powerpc/kernel/signal_32.c:1041:1: note: in expansion of macro 
> 'COMPAT_SYSCALL_DEFINE3'
>  COMPAT_SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>  ^~
> ./include/linux/compat.h:79:18: note: aliased declaration here
>   asmlinkage long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>   ^~~
> ./include/linux/compat.h:58:2: note: in expansion of macro 

Re: [v3,1/3] powerpc/time: inline arch_vtime_task_switch()

2018-06-04 Thread Michael Ellerman
On Tue, 2018-05-29 at 16:19:14 UTC, Christophe Leroy wrote:
> arch_vtime_task_switch() is a small function which is called
> only from vtime_common_task_switch(), so it is worth inlining
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/60f1d2893ee6de65cdea609c84950b

cheers


Re: [v6,1/2] powerpc/lib: optimise 32 bits __clear_user()

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-30 at 07:06:13 UTC, Christophe Leroy wrote:
> Rewrite clear_user() on the same principle as memset(0), making use
> of dcbz to clear complete cache lines.
> 
> This code is a copy/paste of memset(), with some modifications
> in order to retrieve remaining number of bytes to be cleared,
> as it needs to be returned in case of error.
> 
> On the same way as done on PPC64 in commit 17968fbbd19f1
> ("powerpc: 64bit optimised __clear_user"), the patch moves
> __clear_user() into a dedicated file string_32.S
> 
> On a MPC885, throughput is almost doubled:
> 
> Before:
> ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 18.990779 seconds, 52.7MB/s
> 
> After:
> ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 9.611468 seconds, 104.0MB/s
> 
> On a MPC8321, throughput is multiplied by 2.12:
> 
> Before:
> root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 6.844352 seconds, 146.1MB/s
> 
> After:
> root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
> 1048576000 bytes (1000.0MB) copied, 3.218854 seconds, 310.7MB/s
> 
> Signed-off-by: Christophe Leroy 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f36bbf21e8b911b3c629fd36d4d217

cheers


Re: powerpc/ptrace: Use copy_{from, to}_user() rather than open-coding

2018-06-04 Thread Michael Ellerman
On Tue, 2018-05-29 at 12:57:38 UTC, Michael Ellerman wrote:
> From: Al Viro 
> 
> In PPC_PTRACE_GETHWDBGINFO and PPC_PTRACE_SETHWDEBUG we do an
> access_ok() check and then __copy_{from,to}_user().
> 
> Instead we should just use copy_{from,to}_user() which does all that
> for us and is less error prone.
> 
> Signed-off-by: Al Viro 
> Signed-off-by: Michael Ellerman 
> Reviewed-by: Samuel Mendoza-Jonas 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/6bcdd2972b9f6ebda9ae5c7075e2d5

cheers


Re: [V2, 1/4] powerpc/mm/hugetlb: Update huge_ptep_set_access_flags to call __ptep_set_access_flags directly

2018-06-04 Thread Michael Ellerman
On Tue, 2018-05-29 at 14:28:38 UTC, "Aneesh Kumar K.V" wrote:
> In a later patch, we want to update __ptep_set_access_flags to take a page size
> arg. This makes ptep_set_access_flags only work with mmu_virtual_psize.
> To simplify the code make huge_ptep_set_access_flags directly call
> __ptep_set_access_flags so that we can compute the hugetlb page size in
> hugetlb function.
> 
> Now that ptep_set_access_flags won't be called for hugetlb remove
> the is_vm_hugetlb_page() check and add the assert of pte lock
> unconditionally.
> 
> Signed-off-by: Aneesh Kumar K.V 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f069ff396d657ac7bdb5de866c3ec2

cheers


Re: powerpc/64s: Enhance the information in cpu_show_spectre_v1()

2018-06-04 Thread Michael Ellerman
On Mon, 2018-05-28 at 13:19:14 UTC, Michal Suchanek wrote:
> We now have barrier_nospec as mitigation so print it in
> cpu_show_spectre_v1 when enabled.
> 
> Signed-off-by: Michal Suchanek 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/a377514519b9a20fa1ea9adddbb412

cheers


Re: [v2] powerpc/64: Fix build failure with GCC 8.1

2018-06-04 Thread Michael Ellerman
On Tue, 2018-05-29 at 06:03:53 UTC, Christophe Leroy wrote:
> CC  arch/powerpc/kernel/nvram_64.o
> arch/powerpc/kernel/nvram_64.c: In function 'nvram_create_partition':
> arch/powerpc/kernel/nvram_64.c:1042:2: error: 'strncpy' specified bound 12 
> equals destination size [-Werror=stringop-truncation]
>   strncpy(new_part->header.name, name, 12);
>   ^~~~
> 
>   CC  arch/powerpc/kernel/trace/ftrace.o
> In function 'make_field',
> inlined from 'ps3_repository_read_boot_dat_address' at 
> arch/powerpc/platforms/ps3/repository.c:900:9:
> arch/powerpc/platforms/ps3/repository.c:106:2: error: 'strncpy' output 
> truncated before terminating nul copying 8 bytes from a string of the same 
> length [-Werror=stringop-truncation]
> strncpy((char *)&n, text, 8);
>   ^~~~
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c95998811807d897ca112ea62d6671

cheers
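The inhibition itself is typically a one-line Makefile change of this shape. This is a sketch only — the exact file and flags variable the applied patch touches are an assumption here:

```make
# Disable GCC 8's -Wattribute-alias, which fires on the intentional
# SYSCALL_DEFINEx()/__se_sys##name alias pattern used by syscall stubs.
KBUILD_CFLAGS += $(call cc-disable-warning, attribute-alias)
```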


Re: [v2] selftests/powerpc: Add perf breakpoint test

2018-06-04 Thread Michael Ellerman
On Mon, 2018-05-28 at 23:22:38 UTC, Michael Neuling wrote:
> This tests perf hardware breakpoints (ie PERF_TYPE_BREAKPOINT) on
> powerpc.
> 
> Signed-off-by: Michael Neuling 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9c2d72d497a32788bf90f05610319a

cheers


Re: powerpc/Makefile: set -mcpu=860 flag for the 8xx

2018-06-04 Thread Michael Ellerman
On Mon, 2018-05-28 at 06:08:34 UTC, Christophe Leroy wrote:
> When compiled with GCC 8.1, vmlinux is significantly bigger than
> with GCC 4.8.
> 
> When looking at the generated code with objdump, we notice that
> all functions and loops get a 16 bytes alignment. This significantly
> increases the size of the kernel. It is pointless and even
> counterproductive as on the 8xx 'nop' also consumes one clock cycle.
> 
> Size of vmlinux with GCC 4.8:
>text  data bss dec hex filename
> 5801948   1626076  457796 7885820  7853fc vmlinux
> 
> Size of vmlinux with GCC 8.1:
>text  data bss dec hex filename
> 6764592   1630652  456476 8851720  871108 vmlinux
> 
> Size of vmlinux with GCC 8.1 and this patch:
>text  data bss dec hex filename
> 6331544   1631756  456476 8419776  8079c0 vmlinux
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1c38976334c0efce1b285369a6037f

cheers
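The flag change itself is a small Makefile fragment along these lines. The config symbol spelling below is an assumption for illustration, not copied from the applied patch:

```make
# Select the 860/8xx scheduling model instead of the toolchain default,
# avoiding GCC 8's 16-byte function/loop alignment padding on the 8xx.
CFLAGS-$(CONFIG_PPC_8xx) += -mcpu=860
```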


Re: [v4] powerpc: Implement csum_ipv6_magic in assembly

2018-06-04 Thread Michael Ellerman
On Thu, 2018-05-24 at 11:33:18 UTC, Christophe Leroy wrote:
> The generic csum_ipv6_magic() generates a pretty bad result
> 
>  : (PPC32)
>0: 81 23 00 00 lwz r9,0(r3)
>4: 81 03 00 04 lwz r8,4(r3)
>8: 7c e7 4a 14 add r7,r7,r9
>c: 7d 29 38 10 subfc   r9,r9,r7
>   10: 7d 4a 51 10 subfe   r10,r10,r10
>   14: 7d 27 42 14 add r9,r7,r8
>   18: 7d 2a 48 50 subfr9,r10,r9
>   1c: 80 e3 00 08 lwz r7,8(r3)
>   20: 7d 08 48 10 subfc   r8,r8,r9
>   24: 7d 4a 51 10 subfe   r10,r10,r10
>   28: 7d 29 3a 14 add r9,r9,r7
>   2c: 81 03 00 0c lwz r8,12(r3)
>   30: 7d 2a 48 50 subfr9,r10,r9
>   34: 7c e7 48 10 subfc   r7,r7,r9
>   38: 7d 4a 51 10 subfe   r10,r10,r10
>   3c: 7d 29 42 14 add r9,r9,r8
>   40: 7d 2a 48 50 subfr9,r10,r9
>   44: 80 e4 00 00 lwz r7,0(r4)
>   48: 7d 08 48 10 subfc   r8,r8,r9
>   4c: 7d 4a 51 10 subfe   r10,r10,r10
>   50: 7d 29 3a 14 add r9,r9,r7
>   54: 7d 2a 48 50 subfr9,r10,r9
>   58: 81 04 00 04 lwz r8,4(r4)
>   5c: 7c e7 48 10 subfc   r7,r7,r9
>   60: 7d 4a 51 10 subfe   r10,r10,r10
>   64: 7d 29 42 14 add r9,r9,r8
>   68: 7d 2a 48 50 subfr9,r10,r9
>   6c: 80 e4 00 08 lwz r7,8(r4)
>   70: 7d 08 48 10 subfc   r8,r8,r9
>   74: 7d 4a 51 10 subfe   r10,r10,r10
>   78: 7d 29 3a 14 add r9,r9,r7
>   7c: 7d 2a 48 50 subfr9,r10,r9
>   80: 81 04 00 0c lwz r8,12(r4)
>   84: 7c e7 48 10 subfc   r7,r7,r9
>   88: 7d 4a 51 10 subfe   r10,r10,r10
>   8c: 7d 29 42 14 add r9,r9,r8
>   90: 7d 2a 48 50 subfr9,r10,r9
>   94: 7d 08 48 10 subfc   r8,r8,r9
>   98: 7d 4a 51 10 subfe   r10,r10,r10
>   9c: 7d 29 2a 14 add r9,r9,r5
>   a0: 7d 2a 48 50 subfr9,r10,r9
>   a4: 7c a5 48 10 subfc   r5,r5,r9
>   a8: 7c 63 19 10 subfe   r3,r3,r3
>   ac: 7d 29 32 14 add r9,r9,r6
>   b0: 7d 23 48 50 subfr9,r3,r9
>   b4: 7c c6 48 10 subfc   r6,r6,r9
>   b8: 7c 63 19 10 subfe   r3,r3,r3
>   bc: 7c 63 48 50 subfr3,r3,r9
>   c0: 54 6a 80 3e rotlwi  r10,r3,16
>   c4: 7c 63 52 14 add r3,r3,r10
>   c8: 7c 63 18 f8 not r3,r3
>   cc: 54 63 84 3e rlwinm  r3,r3,16,16,31
>   d0: 4e 80 00 20 blr
> 
>  <.csum_ipv6_magic>: (PPC64)
>0: 81 23 00 00 lwz r9,0(r3)
>4: 80 03 00 04 lwz r0,4(r3)
>8: 81 63 00 08 lwz r11,8(r3)
>c: 7c e7 4a 14 add r7,r7,r9
>   10: 7f 89 38 40 cmplw   cr7,r9,r7
>   14: 7d 47 02 14 add r10,r7,r0
>   18: 7d 30 10 26 mfocrf  r9,1
>   1c: 55 29 f7 fe rlwinm  r9,r9,30,31,31
>   20: 7d 4a 4a 14 add r10,r10,r9
>   24: 7f 80 50 40 cmplw   cr7,r0,r10
>   28: 7d 2a 5a 14 add r9,r10,r11
>   2c: 80 03 00 0c lwz r0,12(r3)
>   30: 81 44 00 00 lwz r10,0(r4)
>   34: 7d 10 10 26 mfocrf  r8,1
>   38: 55 08 f7 fe rlwinm  r8,r8,30,31,31
>   3c: 7d 29 42 14 add r9,r9,r8
>   40: 81 04 00 04 lwz r8,4(r4)
>   44: 7f 8b 48 40 cmplw   cr7,r11,r9
>   48: 7d 29 02 14 add r9,r9,r0
>   4c: 7d 70 10 26 mfocrf  r11,1
>   50: 55 6b f7 fe rlwinm  r11,r11,30,31,31
>   54: 7d 29 5a 14 add r9,r9,r11
>   58: 7f 80 48 40 cmplw   cr7,r0,r9
>   5c: 7d 29 52 14 add r9,r9,r10
>   60: 7c 10 10 26 mfocrf  r0,1
>   64: 54 00 f7 fe rlwinm  r0,r0,30,31,31
>   68: 7d 69 02 14 add r11,r9,r0
>   6c: 7f 8a 58 40 cmplw   cr7,r10,r11
>   70: 7c 0b 42 14 add r0,r11,r8
>   74: 81 44 00 08 lwz r10,8(r4)
>   78: 7c f0 10 26 mfocrf  r7,1
>   7c: 54 e7 f7 fe rlwinm  r7,r7,30,31,31
>   80: 7c 00 3a 14 add r0,r0,r7
>   84: 7f 88 00 40 cmplw   cr7,r8,r0
>   88: 7d 20 52 14 add r9,r0,r10
>   8c: 80 04 00 0c lwz r0,12(r4)
>   90: 7d 70 10 26 mfocrf  r11,1
>   94: 55 6b f7 fe rlwinm  r11,r11,30,31,31
>   98: 7d 29 5a 14 add r9,r9,r11
>   9c: 7f 8a 48 40 cmplw   cr7,r10,r9
>   a0: 7d 29 02 14 add r9,r9,r0
>   a4: 7d 70 10 26 mfocrf  r11,1
>   a8: 55 6b f7 fe rlwinm  r11,r11,30,31,31
>   ac: 7d 29 5a 14 add r9,r9,r11
>   b0: 7f 80 48 40 cmplw   cr7,r0,r9
>   b4: 7d 29 2a 14 add r9,r9,r5
>   b8: 7c 10 10 26 mfocrf  r0,1
>   bc: 54 00 f7 fe rlwinm  r0,r0,30,31,31
>   c0: 7d 29 02 14 add r9,r9,r0
>   c4: 7f 85 48 40 cmplw   cr7,r5,r9
>   c8: 7c 09 32 14 add r0,r9,r6
>   cc: 7d 50 10 26 mfocrf  r10,1
>   d0: 55 4a f7 fe rlwinm  r10,r10,30,31,31
>   d4: 7c 00 52 14 add r0,r0,r10
>   d8: 7f 80 30 40 cmplw   cr7,r0,r6
>   dc: 7d 30 10 26 mfocrf  r9,1
>   e0: 55 29 ef fe rlwinm  r9,r9,29,31,31
>   e4: 7c 09 02 14 add r0,r9,r0
>   e8: 54 03 80 3e rotlwi  r3,r0,16
>   ec: 7c 03 02 14 add r0,r3,r0
>   f0: 7c 03 00 f8 not r3,r0
>   f4: 78 63 84 22 rldicl  

Re: [v2, 01/13] powerpc/eeh: Add eeh_max_freezes to initial EEH log line

2018-06-04 Thread Michael Ellerman
On Fri, 2018-05-25 at 03:11:28 UTC, Sam Bobroff wrote:
> The current failure message includes the number of failures that have
> occurred in the last hour (for a device) but it does not indicate
> how many failures will be tolerated before the device is permanently
> disabled.
> 
> Include the limit (eeh_max_freezes) to make this less surprising when
> it happens.
> 
> Also remove the embedded newline from the existing message to make it
> easier to grep for.
> 
> Signed-off-by: Sam Bobroff 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/796b9f5b317a46d1b744f661c38a62

cheers


Re: [v2] powerpc/lib: Adjust .balign inside string functions for PPC32

2018-06-04 Thread Michael Ellerman
On Fri, 2018-05-18 at 13:01:16 UTC, Christophe Leroy wrote:
> commit 87a156fb18fe1 ("Align hot loops of some string functions")
> degraded the performance of string functions by adding useless
> nops
> 
> A simple benchmark on an 8xx calling 10x a memchr() that
> matches the first byte runs in 41668 TB ticks before this patch
> and in 35986 TB ticks after this patch. So this gives an
> improvement of approx 10%
> 
> Another benchmark doing the same with a memchr() matching the 128th
> byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks
> after this patch, so regardless on the number of loops, removing
> those useless nops improves the test by 5683 TB ticks.
> 
> Fixes: 87a156fb18fe1 ("Align hot loops of some string functions")
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1128bb7813a896bd608fb622eee3c2

cheers


Re: powerpc/32: Optimise __csum_partial()

2018-06-04 Thread Michael Ellerman
On Thu, 2018-05-24 at 11:22:27 UTC, Christophe Leroy wrote:
> Improve __csum_partial by interleaving loads and adds.
> 
> On a 8xx, it brings neither improvement nor degradation.
> On a 83xx, it brings a 25% improvement.
> 
> Signed-off-by: Christophe Leroy 
> Reviewed-by: Segher Boessenkool 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/373e098e1e788d7b89ec0f31765a6c

cheers


Re: [1/3] powerpc/sstep: Introduce GETTYPE macro

2018-06-04 Thread Michael Ellerman
On Mon, 2018-05-21 at 04:21:06 UTC, Ravi Bangoria wrote:
> Replace 'op->type & INSTR_TYPE_MASK' expression with GETTYPE(op->type)
> macro.
> 
> Signed-off-by: Ravi Bangoria 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e6684d07e4308430b9b6497265781a

cheers


Re: [v2,1/4] powerpc/perf: Rearrange memory freeing in imc init

2018-06-04 Thread Michael Ellerman
On Tue, 2018-05-22 at 09:12:34 UTC, Anju T Sudhakar wrote:
> When any of the IMC (In-Memory Collection counter) devices fails
> to initialize, imc_common_mem_free() frees a set of memory. In doing so,
> the pmu_ptr pointer is also freed. But pmu_ptr is used in the subsequent
> function (imc_common_cpuhp_mem_free()), which is wrong. Reorder the
> code to avoid such access.
> 
> Also free the memory which is dynamically allocated during imc
> initialization, wherever required.
> 
> Signed-off-by: Anju T Sudhakar 
> Reviewed-by: Madhavan Srinivasan 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/cb094fa5af7c9623084aa4c3cf529b

cheers


Re: [1/2] powerpc: Rename thread_struct.fs to addr_limit

2018-06-04 Thread Michael Ellerman
On Mon, 2018-05-14 at 13:03:15 UTC, Michael Ellerman wrote:
> It's called 'fs' for historical reasons, it's named after the x86 'FS'
> register. But we don't have to use that name for the member of
> thread_struct, and in fact arch/x86 doesn't even call it 'fs' anymore.
> 
> So rename it to 'addr_limit', which better reflects what it's used
> for, and is also the name used on other arches.
> 
> Signed-off-by: Michael Ellerman 

Series applied to powerpc next.

https://git.kernel.org/powerpc/c/ba0635fcbe8c1ce83523c1ec797538

cheers


Re: powerpc/xive: Remove (almost) unused macros

2018-06-04 Thread Michael Ellerman
On Fri, 2018-05-11 at 08:03:13 UTC, Russell Currey wrote:
> The GETFIELD and SETFIELD macros in xive-regs.h aren't used except for a
> single instance of GETFIELD, so replace that and remove them.
> 
> These macros are also defined in vas.h, so either those should be
> eventually replaced or the macros moved into bitops.h.
> 
> Signed-off-by: Russell Currey 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/8a792262f320245de0174e6bcb5513

cheers


Re: [v5,1/7] powerpc: Add TIDR CPU feature for POWER9

2018-06-04 Thread Michael Ellerman
On Fri, 2018-05-11 at 06:12:57 UTC, "Alastair D'Silva" wrote:
> From: Alastair D'Silva 
> 
> This patch adds a CPU feature bit to show whether the CPU has
> the TIDR register available, enabling as_notify/wait in userspace.
> 
> Signed-off-by: Alastair D'Silva 
> Reviewed-by: Frederic Barrat 
> Reviewed-by: Andrew Donnellan 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/819844285ef2b5d15466f5b5062514

cheers


Re: powerpc/powernv: process all OPAL event interrupts with kopald

2018-06-04 Thread Michael Ellerman
On Thu, 2018-05-10 at 17:20:05 UTC, Nicholas Piggin wrote:
> Using irq_work for processing OPAL event interrupts is not necessary.
> irq_work is typically used to schedule work from NMI context; a
> softirq may be more appropriate. However OPAL events are not
> particularly performance or latency critical, so they can all be
> invoked by kopald.
> 
> This patch removes the irq_work queueing, and instead wakes up
> kopald when there is an event to be processed. kopald processes
> interrupts individually, enabling irqs and calling cond_resched
> between each one to minimise latencies.
> 
> Event handlers themselves should still use threaded handlers,
> workqueues, etc. as necessary to avoid high interrupts-off latencies
> within any single interrupt.
> 
> Signed-off-by: Nicholas Piggin 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/56c0b48b1e443efa5d6f4d60513302

cheers


Re: powerpc/powernv: call OPAL_QUIESCE before OPAL_SIGNAL_SYSTEM_RESET

2018-06-04 Thread Michael Ellerman
On Thu, 2018-05-10 at 12:21:48 UTC, Nicholas Piggin wrote:
> Although it is often possible to recover a CPU that was interrupted
> from OPAL with a system reset NMI, it's undesirable to interrupt them
> for a few reasons. Firstly because dump/debug code itself needs to
> call firmware, so it could hang on a lock or possibly corrupt a
> per-cpu data structure if it or another CPU was interrupted from
> OPAL. Secondly, the kexec crash dump code will not return from
> interrupt to unwind the OPAL call.
> 
> Call OPAL_QUIESCE with QUIESCE_HOLD before sending an NMI IPI to
> another CPU, which waits for it to leave firmware (or time out) to
> avoid this problem in normal conditions. Firmware bugs may still
> result in a timeout and interrupting OPAL, but that is the best
> option (stops the CPU, and possibly allows firmware to be debugged).
> 
> Signed-off-by: Nicholas Piggin 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ee03b9b4479d1302d01cebedda3518

cheers


Re: [1/2] powerpc/pmu/fsl: fix is_nmi test for irq mask change

2018-06-04 Thread Michael Ellerman
On Thu, 2018-05-10 at 01:04:23 UTC, Nicholas Piggin wrote:
> When soft enabled was changed to irq disabled mask, this test missed
> being converted (although the equivalent book3s test was converted).
> 
> The PMU drivers consider it an NMI when they take a PMI while general
> interrupts are disabled. This change restores that behaviour.
> 
> Fixes: 01417c6cc7 ("powerpc/64: Change soft_enabled from flag to bitmask")
> Signed-off-by: Nicholas Piggin 
> Reviewed-by: Madhavan Srinivasan 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/81ea11d3af3aba0bea475dd1dd2f79

cheers


Re: powerpc: cpm_gpio: Remove owner assignment from platform_driver

2018-06-04 Thread Michael Ellerman
On Sat, 2018-05-05 at 03:01:25 UTC, Fabio Estevam wrote:
> From: Fabio Estevam 
> 
> Structure platform_driver does not need to set the owner field, as this
> will be populated by the driver core.
> 
> Generated by scripts/coccinelle/api/platform_no_drv_owner.cocci.
> 
> Signed-off-by: Fabio Estevam 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c5cbde2df3951c59dc099dd2e452b5

cheers


Re: [RFC, 1/4] powerpc/64: Save stack pointer when we hard disable interrupts

2018-06-04 Thread Michael Ellerman
On Wed, 2018-05-02 at 13:07:26 UTC, Michael Ellerman wrote:
> A CPU that gets stuck with interrupts hard disabled can be difficult to
> debug, as on some platforms we have no way to interrupt the CPU to
> find out what it's doing.
> 
> A stop-gap is to have the CPU save its stack pointer (r1) in its paca
> when it hard disables interrupts. That way if we can't interrupt it,
> we can at least trace the stack based on where it last disabled
> interrupts.
> 
> In some cases that will be total junk, but the stack trace code should
> handle that. In the simple case of a CPU that disables interrupts and
> then gets stuck in a loop, the stack trace should be informative.
> 
> We could clear the saved stack pointer when we enable interrupts, but
> that loses information which could be useful if we have nothing else
> to go on.
> 
> Signed-off-by: Michael Ellerman 

Series applied to powerpc next.

https://git.kernel.org/powerpc/c/7b08729cb272b4cd5c657cd5ac0ddd

cheers


Re: [01/11] powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled

2018-06-04 Thread Michael Ellerman
On Fri, 2018-05-04 at 17:19:25 UTC, Nicholas Piggin wrote:
> irq_work_raise should not cause a decrementer exception unless it is
> called from NMI context. Doing so often just results in an immediate
> masked decrementer interrupt:
> 
><...>-55090d...4us : update_curr_rt <-dequeue_task_rt
><...>-55090d...5us : dbs_update_util_handler <-update_curr_rt
><...>-55090d...6us : arch_irq_work_raise <-irq_work_queue
><...>-55090d...7us : soft_nmi_interrupt <-soft_nmi_common
><...>-55090d...7us : printk_nmi_enter <-soft_nmi_interrupt
><...>-55090d.Z.8us : rcu_nmi_enter <-soft_nmi_interrupt
><...>-55090d.Z.9us : rcu_nmi_exit <-soft_nmi_interrupt
><...>-55090d...9us : printk_nmi_exit <-soft_nmi_interrupt
><...>-55090d...   10us : cpuacct_charge <-update_curr_rt
> 
> The soft_nmi_interrupt here is the call into the watchdog, due to the
> decrementer interrupt firing with irqs soft-disabled. This is
> harmless, but sub-optimal.
> 
> When it's not called from NMI context or with interrupts enabled, mark
> the decrementer pending in the irq_happened mask directly, rather than
> having the masked decrementer interrupt handler do it. This will be
> replayed at the next local_irq_enable. See the comment for details.
> 
> Signed-off-by: Nicholas Piggin 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ebb37cf3ffd39fdb6ec5b07111f8bb

cheers


Re: powerpc/xics: add missing of_node_put() in error path

2018-06-04 Thread Michael Ellerman
On Wed, 2018-04-25 at 11:27:07 UTC, YueHaibing wrote:
> The device node obtained with of_find_compatible_node() should be
> released by calling of_node_put().  But it was not released when
> of_get_property() failed.
> 
> Signed-off-by: YueHaibing 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/589b1f7e4b0db4c31cef3b55f75148

cheers


Re: [1/6] powerpc/64s: Add barrier_nospec

2018-06-04 Thread Michael Ellerman
On Tue, 2018-04-24 at 04:15:54 UTC, Michael Ellerman wrote:
> From: Michal Suchanek 
> 
> A no-op form of ori (or immediate of 0 into r31 and the result stored
> in r31) has been re-tasked as a speculation barrier. The instruction
> only acts as a barrier on newer machines with appropriate firmware
> support. On older CPUs it remains a harmless no-op.
> 
> Implement barrier_nospec using this instruction.
> 
> mpe: The semantics of the instruction are believed to be that it
> prevents execution of subsequent instructions until preceding branches
> have been fully resolved and are no longer executing speculatively.
> There is no further documentation available at this time.
> 
> Signed-off-by: Michal Suchanek 
> Signed-off-by: Michael Ellerman 

Series applied to powerpc next.

https://git.kernel.org/powerpc/c/a6b3964ad71a61bb7c61d80a60bea7

cheers


Re: [v2] powerpc/signal32: Use fault_in_pages_readable() to prefault user context

2018-06-04 Thread Michael Ellerman
On Tue, 2018-04-24 at 16:04:25 UTC, Christophe Leroy wrote:
> Use fault_in_pages_readable() to prefault user context
> instead of open coding
> 
> Signed-off-by: Christophe Leroy 
> Reviewed-by: Mathieu Malaterre 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/56b04d568f880a48d892e840cfaf4e

cheers


Re: powerpc/8xx: Remove RTC clock on 88x

2018-06-04 Thread Michael Ellerman
On Tue, 2018-04-17 at 12:47:35 UTC, Christophe Leroy wrote:
> The 885 family processors don't have the Real Time Clock
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d04f11d2713e736a3740361f8b5bb4

cheers


Re: [1/5] powerpc: always enable RTC_LIB

2018-06-04 Thread Michael Ellerman
On Mon, 2018-04-23 at 08:36:38 UTC, Arnd Bergmann wrote:
> In order to use the rtc_tm_to_time64() and rtc_time64_to_tm()
> helper functions in later patches, we have to ensure that
> CONFIG_RTC_LIB is always built-in.
> 
> Note that this symbol only controls a couple of helper functions,
> not the actual RTC subsystem, which remains optional and is
> enabled with CONFIG_RTC_CLASS.
> 
> Signed-off-by: Arnd Bergmann 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6e8cef384a41882b2d4ec6992dd0d7

cheers


Re: powerpc/boot: remove unused variable in mpc8xx

2018-06-04 Thread Michael Ellerman
On Tue, 2018-04-17 at 12:36:45 UTC, Christophe Leroy wrote:
> Variable div is set but never used. Remove it.
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/169f438a7e369226a452e0ac4a54db

cheers


Re: powerpc/misc: merge reloc_offset() and add_reloc_offset()

2018-06-04 Thread Michael Ellerman
On Tue, 2018-04-17 at 11:23:10 UTC, Christophe Leroy wrote:
> reloc_offset() is the same as add_reloc_offset(0)
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/0cc377d16e565b90b43b7550cdf5b3

cheers


Re: powerpc/64: optimises from64to32()

2018-06-04 Thread Michael Ellerman
On Tue, 2018-04-10 at 06:34:35 UTC, Christophe Leroy wrote:
> The current implementation of from64to32() gives a poor result:
> 
> 0270 <.from64to32>:
>  270: 38 00 ff ff li  r0,-1
>  274: 78 69 00 22 rldicl  r9,r3,32,32
>  278: 78 00 00 20 clrldi  r0,r0,32
>  27c: 7c 60 00 38 and r0,r3,r0
>  280: 7c 09 02 14 add r0,r9,r0
>  284: 78 09 00 22 rldicl  r9,r0,32,32
>  288: 7c 00 4a 14 add r0,r0,r9
>  28c: 78 03 00 20 clrldi  r3,r0,32
>  290: 4e 80 00 20 blr
> 
> This patch modifies from64to32() to operate in the same
> spirit as csum_fold()
> 
> It swaps the two 32-bit halves of sum then it adds it with the
> unswapped sum. If there is a carry from adding the two 32-bit halves,
> it will carry from the lower half into the upper half, giving us the
> correct sum in the upper half.
> 
> The resulting code is:
> 
> 0260 <.from64to32>:
>  260: 78 60 00 02 rotldi  r0,r3,32
>  264: 7c 60 1a 14 add r3,r0,r3
>  268: 78 63 00 22 rldicl  r3,r3,32,32
>  26c: 4e 80 00 20 blr
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/55a0edf083022e402042255a0afb03

cheers


Re: [1/5] powerpc/embedded6xx: Remove C2K board support

2018-06-04 Thread Michael Ellerman
On Fri, 2018-04-06 at 01:17:16 UTC, Mark Greer wrote:
> The C2K platform appears to be orphaned so remove code supporting it.
> 
> CC: Remi Machet 
> Signed-off-by: Mark Greer 
> Acked-by: Remi Machet 
> Signed-off-by: Mark Greer 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/92c8c16f345759e87c5d5b771d438f

cheers


Re: hvc_opal: don't set tb_ticks_per_usec in udbg_init_opal_common()

2018-06-04 Thread Michael Ellerman
On Thu, 2018-03-29 at 06:02:46 UTC, Stewart Smith wrote:
> time_init() will set up tb_ticks_per_usec based on reality.
> time_init() is called *after* udbg_init_opal_common() during boot.
> 
> from arch/powerpc/kernel/time.c:
>   unsigned long tb_ticks_per_usec = 100; /* sane default */
> 
> Currently, all powernv systems have a timebase frequency of 512mhz
> (512000000/1000000 == 0x200) - although there's nothing written
> down anywhere that I can find saying that we couldn't make that
> different based on the requirements in the ISA.
> 
> So, we've been (accidentally) thwacking the (currently) correct
> (for powernv at least) value for tb_ticks_per_usec earlier than
> we otherwise would have.
> 
> The "sane default" seems to be adequate for our purposes between
> udbg_init_opal_common() and time_init() being called, and if it isn't,
> then we should probably be setting it somewhere that isn't hvc_opal.c!
> 
> Signed-off-by: Stewart Smith 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/447808bf500a7cc92173266a59f8a4

cheers


Re: [1/4] powerpc/mm: constify FIRST_CONTEXT in mmu_context_nohash

2018-06-04 Thread Michael Ellerman
On Wed, 2018-03-21 at 14:07:47 UTC, Christophe Leroy wrote:
> First context is now 1 for all supported platforms, so it
> can be made a constant.
> 
> Signed-off-by: Christophe Leroy 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/8820a44738308a8a1644c6c3d04b47

cheers


Re: powerpc/dma: remove unnecessary BUG()

2018-06-04 Thread Michael Ellerman
On Wed, 2018-02-28 at 18:21:45 UTC, Christophe Leroy wrote:
> Direction is already checked in all calling functions in
> include/linux/dma-mapping.h and also in called function __dma_sync()
> 
> So really no need to check it once more here.
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9887334b804892f10262fa7f805998

cheers


Re: SB600 for the Nemo board has non-zero devices on non-root bus

2018-06-04 Thread Michael Ellerman
On Wed, 2017-12-06 at 11:03:52 UTC, Christian Zigotzky wrote:
> On 06 December 2017 at 09:37AM, Christian Zigotzky wrote:
>  > On 03 December 2017 at 10:43AM, Christian Zigotzky wrote:
>  > >
>  > > On 3. Dec 2017, at 00:02, Olof Johansson  wrote:
>  > >>
>  > >> Typo, should be ';', not ':'. I obviously didn't even try 
> compiling this. :)
>  > >>
>  > >>
>  > >> -Olof
>  > >
>  > > Hi Olof,
>  > >
>  > > Thanks a lot for your patch! I will test it on Wednesday.
>  > >
>  > > Cheers,
>  > > Christian
>  >
>  >
>  > Hi Olof,
>  >
>  > I tested your patch today. Unfortunately the kernel 4.15-rc2 doesn't 
> compile with your patch.
>  >
>  > Error messages:
>  >
>  >                        ^
>  > arch/powerpc/platforms/pasemi/pci.c: In function ‘pas_pci_init’:
>  > arch/powerpc/platforms/pasemi/pci.c:298:2: error: implicit 
> declaration of function ‘pci_set_flag’ 
> [-Werror=implicit-function-declaration]
>  >   pci_set_flag(PCI_SCAN_ALL_PCIE_DEVS);
>  >   ^~~~
>  > cc1: some warnings being treated as errors
>  >
>  > ---
>  >
>  > I figured out that we need 'pci_set_flags' instead of 'pci_set_flag'. 
> I modified your patch and after that the kernel compiles. Please find 
> attached the new patch.
>  >
>  > Cheers,
>  > Christian
> 
> Hi Olof,
> 
> Many thanks for your patch! :-) The RC2 of kernel 4.15 boots without any 
> problems on my P.A. Semi Nemo board (A-EON AmigaOne X1000). I don’t need 
> the additional boot argument 'pci=pcie_scan_all' anymore.
> 
> Is it possible to merge it via the powerpc tree?
> 
> Thanks,
> Christian
> 
> arch/powerpc/platforms/pasemi/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pasemi/pci.c 
> b/arch/powerpc/platforms/pasemi/pci.c
> index 5ff6108..ea54ed2 100644
> --- a/arch/powerpc/platforms/pasemi/pci.c
> +++ b/arch/powerpc/platforms/pasemi/pci.c
> @@ -224,6 +224,8 @@ void __init pas_pci_init(void)
>   return;
>   }
>  
> + pci_set_flags(PCI_SCAN_ALL_PCIE_DEVS);
> +
>   for (np = NULL; (np = of_get_next_child(root, np)) != NULL;)
>   if (np->name && !strcmp(np->name, "pxp") && !pas_add_bridge(np))
>   of_node_get(np);

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/eff06ef0891d200eb0ddd156c6e96c

cheers


Re: pkeys on POWER: Access rights not reset on execve

2018-06-04 Thread Ram Pai
On Mon, Jun 04, 2018 at 12:12:07PM +0200, Florian Weimer wrote:
> On 06/03/2018 10:18 PM, Ram Pai wrote:
> >On Mon, May 21, 2018 at 01:29:11PM +0200, Florian Weimer wrote:
> >>On 05/20/2018 09:11 PM, Ram Pai wrote:
> >>>Florian,
> >>>
> >>>   Does the following patch fix the problem for you?  Just like x86
> >>>   I am enabling all keys in the UAMOR register during
> >>>   initialization itself. Hence any key created by any thread at
> >>>   any time, will get activated on all threads. So any thread
> >>>   can change the permission on that key. Smoke tested it
> >>>   with your test program.
> >>
> >>I think this goes in the right direction, but the AMR value after
> >>fork is still strange:
> >>
> >>AMR (PID 34912): 0x
> >>AMR after fork (PID 34913): 0x
> >>AMR (PID 34913): 0x
> >>Allocated key in subprocess (PID 34913): 2
> >>Allocated key (PID 34912): 2
> >>Setting AMR: 0x
> >>New AMR value (PID 34912): 0x0fff
> >>About to call execl (PID 34912) ...
> >>AMR (PID 34912): 0x0fff
> >>AMR after fork (PID 34914): 0x0003
> >>AMR (PID 34914): 0x0003
> >>Allocated key in subprocess (PID 34914): 2
> >>Allocated key (PID 34912): 2
> >>Setting AMR: 0x
> >>New AMR value (PID 34912): 0x0fff
> >>
> >>I mean this line:
> >>
> >>AMR after fork (PID 34914): 0x0003
> >>
> >>Shouldn't it be the same as in the parent process?
> >
> >Fixed it. Please try this patch. If it all works to your satisfaction, I
> >will clean it up further and send to Michael Ellermen(ppc maintainer).
> >
> >
> >commit 51f4208ed5baeab1edb9b0f8b68d719b3527
> >Author: Ram Pai 
> >Date:   Sun Jun 3 14:44:32 2018 -0500
> >
> > Fix for the fork bug.
> > Signed-off-by: Ram Pai 
> 
> Is this on top of the previous patch, or a separate fix?

top of previous patch.
RP



Re: [PATCH 09/10] dpaa_eth: add support for hardware timestamping

2018-06-04 Thread Richard Cochran
On Mon, Jun 04, 2018 at 03:08:36PM +0800, Yangbo Lu wrote:

> +if FSL_DPAA_ETH
> +config FSL_DPAA_ETH_TS
> + bool "DPAA hardware timestamping support"
> + select PTP_1588_CLOCK_QORIQ
> + default n
> + help
> +   Enable DPAA hardware timestamping support.
> +   This option is useful for applications to get
> +   hardware time stamps on the Ethernet packets
> +   using the SO_TIMESTAMPING API.
> +endif

You should drop this #ifdef.  In general, if a MAC supports time
stamping and PHC, then the driver support should simply be compiled
in.

[ When time stamping incurs a large run time performance penalty to
  non-PTP users, then it might make sense to have a Kconfig option to
  disable it, but that doesn't appear to be the case here. ]

> @@ -1615,6 +1635,24 @@ static int dpaa_eth_refill_bpools(struct dpaa_priv 
> *priv)
>   skbh = (struct sk_buff **)phys_to_virt(addr);
>   skb = *skbh;
>  
> +#ifdef CONFIG_FSL_DPAA_ETH_TS
> + if (priv->tx_tstamp &&
> + skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {

This condition fits on one line easily.

> + struct skb_shared_hwtstamps shhwtstamps;
> + u64 ns;

Local variables belong at the top of the function.

> + memset(&shhwtstamps, 0, sizeof(shhwtstamps));
> +
> + if (!dpaa_get_tstamp_ns(priv->net_dev, &ns,
> + priv->mac_dev->port[TX],
> + (void *)skbh)) {
> + shhwtstamps.hwtstamp = ns_to_ktime(ns);
> + skb_tstamp_tx(skb, &shhwtstamps);
> + } else {
> + dev_warn(dev, "dpaa_get_tstamp_ns failed!\n");
> + }
> + }
> +#endif
>   if (unlikely(qm_fd_get_format(fd) == qm_fd_sg)) {
>   nr_frags = skb_shinfo(skb)->nr_frags;
>   dma_unmap_single(dev, addr, qm_fd_get_offset(fd) +
> @@ -2086,6 +2124,14 @@ static int dpaa_start_xmit(struct sk_buff *skb, struct 
> net_device *net_dev)
>   if (unlikely(err < 0))
>   goto skb_to_fd_failed;
>  
> +#ifdef CONFIG_FSL_DPAA_ETH_TS
> + if (priv->tx_tstamp &&
> + skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {

One line please.

> + fd.cmd |= FM_FD_CMD_UPD;
> + skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> + }
> +#endif
> +
>   if (likely(dpaa_xmit(priv, percpu_stats, queue_mapping, &fd) == 0))
>   return NETDEV_TX_OK;
>  

Thanks,
Richard


Re: [PATCH] fsl/qe: ucc: copy and paste bug in ucc_get_tdm_sync_shift()

2018-06-04 Thread Julia Lawall



On Mon, 4 Jun 2018, Dan Carpenter wrote:

> On Mon, Jun 04, 2018 at 10:25:14PM +0900, Julia Lawall wrote:
> >
> >
> > On Mon, 4 Jun 2018, Dan Carpenter wrote:
> >
> > > There is a copy and paste bug so we accidentally use the RX_ shift when
> > > we're in TX_ mode.
> > >
> > > Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
> > > Signed-off-by: Dan Carpenter 
> > > ---
> > > Static analysis work.  Not tested.  This affects the success path, so
> > > we should probably test it.
> >
> > Maybe this is another one?  I don't have time to look into it at the
> > moment...
> >
> > drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
> >
> > /* For strict priority entries defines the number of consecutive
> >  * slots for the highest priority.
> >  */
> > REG_WR(bp, (port) ? NIG_REG_P1_TX_ARB_NUM_STRICT_ARB_SLOTS :
> >NIG_REG_P1_TX_ARB_NUM_STRICT_ARB_SLOTS, 0x100);
> > /* Mapping between the CREDIT_WEIGHT registers and actual client
> >  * numbers
> >  */
> >
> > I find some others that choose between constants, such as ... ? 0 : 0.
>
> I feel like it should warn about all of those because people shouldn't
> be submitting unfinished written code to the kernel.  Coccinelle is a
> lot better for this than Smatch is because it's pre-processor stuff.

OK, maybe I can report these in the next few days.

thanks,
julia


Re: [PATCH] fsl/qe: ucc: copy and paste bug in ucc_get_tdm_sync_shift()

2018-06-04 Thread Dan Carpenter
On Mon, Jun 04, 2018 at 10:25:14PM +0900, Julia Lawall wrote:
> 
> 
> On Mon, 4 Jun 2018, Dan Carpenter wrote:
> 
> > There is a copy and paste bug so we accidentally use the RX_ shift when
> > we're in TX_ mode.
> >
> > Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
> > Signed-off-by: Dan Carpenter 
> > ---
> > Static analysis work.  Not tested.  This affects the success path, so
> > we should probably test it.
> 
> Maybe this is another one?  I don't have time to look into it at the
> moment...
> 
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
> 
>   /* For strict priority entries defines the number of consecutive
>* slots for the highest priority.
>*/
>   REG_WR(bp, (port) ? NIG_REG_P1_TX_ARB_NUM_STRICT_ARB_SLOTS :
>  NIG_REG_P1_TX_ARB_NUM_STRICT_ARB_SLOTS, 0x100);
>   /* Mapping between the CREDIT_WEIGHT registers and actual client
>* numbers
>*/
> 
> I find some others that choose between constants, such as ... ? 0 : 0.

I feel like it should warn about all of those because people shouldn't
be submitting unfinished written code to the kernel.  Coccinelle is a
lot better for this than Smatch is because it's pre-processor stuff.

regards,
dan carpenter



Re: [PATCH] fsl/qe: ucc: copy and paste bug in ucc_get_tdm_sync_shift()

2018-06-04 Thread Julia Lawall



On Mon, 4 Jun 2018, Dan Carpenter wrote:

> There is a copy and paste bug so we accidentally use the RX_ shift when
> we're in TX_ mode.
>
> Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
> Signed-off-by: Dan Carpenter 
> ---
> Static analysis work.  Not tested.  This affects the success path, so
> we should probably test it.

Maybe this is another one?  I don't have time to look into it at the
moment...

drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c

/* For strict priority entries defines the number of consecutive
 * slots for the highest priority.
 */
REG_WR(bp, (port) ? NIG_REG_P1_TX_ARB_NUM_STRICT_ARB_SLOTS :
   NIG_REG_P1_TX_ARB_NUM_STRICT_ARB_SLOTS, 0x100);
/* Mapping between the CREDIT_WEIGHT registers and actual client
 * numbers
 */

I find some others that choose between constants, such as ... ? 0 : 0.

julia


>
> diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c
> index c646d8713861..681f7d4b7724 100644
> --- a/drivers/soc/fsl/qe/ucc.c
> +++ b/drivers/soc/fsl/qe/ucc.c
> @@ -626,7 +626,7 @@ static u32 ucc_get_tdm_sync_shift(enum comm_dir mode, u32 
> tdm_num)
>  {
>   u32 shift;
>
> - shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : RX_SYNC_SHIFT_BASE;
> + shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : TX_SYNC_SHIFT_BASE;
>   shift -= tdm_num * 2;
>
>   return shift;
> --
> To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


[PATCH net-next] wan/fsl_ucc_hdlc: use dma_zalloc_coherent instead of allocator/memset

2018-06-04 Thread YueHaibing
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing 
---
 drivers/net/wan/fsl_ucc_hdlc.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index 33df764..4205dfd 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -270,10 +270,10 @@ static int uhdlc_init(struct ucc_hdlc_private *priv)
iowrite16be(DEFAULT_HDLC_ADDR, &priv->ucc_pram->haddr4);
 
/* Get BD buffer */
-   bd_buffer = dma_alloc_coherent(priv->dev,
-  (RX_BD_RING_LEN + TX_BD_RING_LEN) *
-  MAX_RX_BUF_LENGTH,
-  &bd_dma_addr, GFP_KERNEL);
+   bd_buffer = dma_zalloc_coherent(priv->dev,
+   (RX_BD_RING_LEN + TX_BD_RING_LEN) *
+   MAX_RX_BUF_LENGTH,
+   &bd_dma_addr, GFP_KERNEL);
 
if (!bd_buffer) {
dev_err(priv->dev, "Could not allocate buffer descriptors\n");
@@ -281,9 +281,6 @@ static int uhdlc_init(struct ucc_hdlc_private *priv)
goto free_tiptr;
}
 
-   memset(bd_buffer, 0, (RX_BD_RING_LEN + TX_BD_RING_LEN)
-   * MAX_RX_BUF_LENGTH);
-
priv->rx_buffer = bd_buffer;
priv->tx_buffer = bd_buffer + RX_BD_RING_LEN * MAX_RX_BUF_LENGTH;
 
-- 
2.7.0




Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices

2018-06-04 Thread Christoph Hellwig
On Mon, Jun 04, 2018 at 03:43:09PM +0300, Michael S. Tsirkin wrote:
> Another is that given the basic functionality is in there, optimizations
> can possibly wait until per-device quirks in DMA API are supported.

We have had per-device dma_ops for quite a while.


Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices

2018-06-04 Thread Benjamin Herrenschmidt
On Mon, 2018-06-04 at 15:43 +0300, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > 
> > > I re-read that discussion and I'm still unclear on the
> > > original question, since I got several apparently
> > > conflicting answers.
> > > 
> > > I asked:
> > > 
> > >   Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > >   hypervisor side sufficient?
> > 
> > I thought I had replied to this...
> > 
> > There are a couple of reasons:
> > 
> > - First qemu doesn't know that the guest will switch to "secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter secure
> > mode" and qemu doesn't see any of it. So who can set the flag here ?
> 
> The user should set it. You just tell user "to be able to use with
> feature X, enable IOMMU".

That's completely backwards. The user has no idea what that stuff is.
And it would have to percolate all the way up the management stack,
libvirt, kimchi, whatever else ... that's just nonsense.

Especially since, as I explained in my other email, this is *not* a
qemu problem and thus the solution shouldn't be messing around with
qemu.

> 
> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> > 
> > Cheers,
> > Ben.
> 
> There are several answers to this.  One is that we are working hard to
> make overhead small when the mappings are static (which they would be if
> there's no actual IOMMU). So maybe especially given you are using
> a bounce buffer on top it's not so bad - did you try to
> benchmark?
> 
> Another is that given the basic functionality is in there, optimizations
> can possibly wait until per-device quirks in DMA API are supported.

The point is that requiring specific qemu command line arguments isn't
going to fly. We have additional problems due to the fact that our
firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
though those can be fixed.

Overall, however, this seems to be the most convoluted way of achieving
things, requiring user interventions where none should be needed etc...

Again, what's wrong with a 2 lines hook instead that solves it all and
completely avoids involving qemu ?

Ben.

> 
> > > 
> > > 
> > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++
> > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++
> > > >  drivers/virtio/virtio_ring.c   | 10 ++
> > > >  3 files changed, 27 insertions(+)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h 
> > > > b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 8fa3945..056e578 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device 
> > > > *dev);
> > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > >  
> > > >  #endif /* __KERNEL__ */
> > > > +
> > > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > > +
> > > > +struct virtio_device;
> > > > +
> > > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > > >  #endif /* _ASM_DMA_MAPPING_H */
> > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c 
> > > > b/arch/powerpc/platforms/pseries/iommu.c
> > > > index 06f0296..a2ec15a 100644
> > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > @@ -38,6 +38,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > > >  __setup("multitce=", disable_multitce);
> > > >  
> > > >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > > > +
> > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +   /*
> > > > +* On protected guest platforms, force virtio core to use DMA
> > > > +* MAP API for all virtio devices. But there can also be some
> > > > +* exceptions for individual devices like virtio balloon.
> > > > +*/
> > > > +   return (of_find_compatible_node(NULL, NULL, "ibm,ultravisor") != NULL);
> > > > +}
> > > 
> > > Isn't this kind of slow?  vring_use_dma_api is on
> > > data path and supposed to be very fast.
> > > 
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 21d464a..47ea6c3 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > >   * unconditionally on data path.
> > > >   */
> > > >  
> > > > +#ifndef platform_forces_virtio_dma
> > > > +static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +   return false;
> > > > +}
> > > > +#endif

Re: [RFC PATCH v2 00/14] Remove unneccessary included headers

2018-06-04 Thread Michael Ellerman
Christophe Leroy  writes:

> The purpose of this series is to limit the number of includes to
> only the necessary ones in order to reduce the number of files
> recompiled every time a header file is modified.
>
> This is the start of the work, please provide feedback if any so
> that I don't go in the wrong direction.
>
> Handled inclusion changes more carefully after Michael's feedback.
>
> Started splitting some headers in order to reduce their coverage.
>
> Christophe Leroy (14):
>   powerpc: remove kdump.h from page.h
>   powerpc: remove unneeded inclusions of cpu_has_feature.h
>   powerpc/405: move PPC405_ERR77 in asm-405.h
>   powerpc: move ASM_CONST and stringify_in_c() into asm-const.h
>   powerpc: clean the inclusion of stringify.h
>   powerpc: clean inclusions of asm/feature-fixups.h
>   powerpc: remove superflous inclusions of asm/fixmap.h
>   powerpc: declare set_breakpoint() static
>   powerpc/book3s: Remove PPC_PIN_SIZE
>   powerpc: fix includes in asm/processor.h
>   powerpc/nohash: fix hash related comments in pgtable.h
>   powerpc/44x: remove page.h from mmu-44x.h
>   powerpc: split reg.h in two parts
>   powerpc: Split synch.h in two parts

Still some problems :)

  http://kisskb.ellerman.id.au/kisskb/head/14047/

cheers


arch/powerpc/include/asm/reg.h:1286:31: error: expected ':' or ')' before 'ASM_FTR_IFCLR':
  allmodconfig+64K_PAGES powerpc
  allmodconfig+64K_PAGES powerpc-5.3
  allmodconfig+ppc64le ppc64le
  powernv_defconfig+NO_NUMA ppc64le
  powernv_defconfig+NO_PERF ppc64le
  powernv_defconfig+NO_RADIX ppc64le
  powernv_defconfig+STRICT_RWX ppc64le
  powernv_defconfig+THIN ppc64le
  powerpc-allmodconfig powerpc
  powerpc-allmodconfig powerpc-5.3
  powerpc-allyesconfig powerpc
  powerpc-allyesconfig powerpc-5.3
  ppc64_defconfig powerpc
  ppc64_defconfig powerpc-5.3
  ppc64_defconfig+NO_ALTIVEC powerpc
  ppc64_defconfig+NO_ALTIVEC powerpc-5.3
  ppc64_defconfig+NO_HUGETLB powerpc
  ppc64_defconfig+NO_HUGETLB powerpc-5.3
  ppc64_defconfig+NO_KVM powerpc
  ppc64_defconfig+NO_KVM powerpc-5.3
  ppc64_defconfig+NO_RADIX powerpc
  ppc64_defconfig+NO_TM powerpc
  ppc64_defconfig+NO_TM powerpc-5.3
  ppc64_defconfig+UP powerpc
  ppc64_defconfig+UP powerpc-5.3
  ppc64e_defconfig powerpc
  ppc64e_defconfig powerpc-5.3
  ppc64e_defconfig+KEXEC powerpc
  ppc64e_defconfig+KEXEC powerpc-5.3
  ppc64e_defconfig+UP powerpc
  ppc64e_defconfig+UP powerpc-5.3
  ppc64le_defconfig ppc64le
  ppc64le_defconfig+NO_KPROBES ppc64le
  ppc64le_defconfig+NO_KVM ppc64le
  ppc6xx_defconfig powerpc
  ppc6xx_defconfig powerpc-5.3
  pseries_defconfig powerpc
  pseries_defconfig powerpc-5.3
  pseries_defconfig+FA_DUMP powerpc
  pseries_defconfig+FA_DUMP powerpc-5.3
  pseries_defconfig+NO_MEMORY_HOTPLUG powerpc
  pseries_defconfig+NO_MEMORY_HOTPLUG powerpc-5.3
  pseries_defconfig+NO_MEMORY_HOTREMOVE powerpc
  pseries_defconfig+NO_SPLPAR powerpc
  pseries_defconfig+NO_SPLPAR powerpc-5.3
  pseries_le_defconfig ppc64le
  pseries_le_defconfig+NO_NUMA ppc64le
  pseries_le_defconfig+NO_SPLPAR ppc64le
  skiroot_defconfig ppc64le


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Michael S. Tsirkin
On Mon, Jun 04, 2018 at 07:48:54PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-06-04 at 18:57 +1000, David Gibson wrote:
> > 
> > > - First qemu doesn't know that the guest will switch to "secure mode"
> > > in advance. There is no difference between a normal and a secure
> > > partition until the partition does the magic UV call to "enter secure
> > > mode" and qemu doesn't see any of it. So who can set the flag here ?
> > 
> > This seems weird to me.  As a rule HV calls should go through qemu -
> > or be allowed to go directly to KVM *by* qemu.
> 
> It's not an HV call, it's a UV call, qemu won't see it, qemu isn't
trusted. Now the UV *will* reflect that to the HV via some synthesized
> HV calls, and we *could* have those do a pass by qemu, however, so far,
> our entire design doesn't rely on *any* qemu knowledge whatsoever and
> it would be sad to add it just for that purpose.

It's a temporary work-around. I think that the long-term fix is to
support per-device quirks and have the DMA API DTRT for virtio.

> Additionally, this is rather orthogonal, see my other email, the
> problem we are trying to solve is *not* a qemu problem and it doesn't
> make sense to leak that into qemu.
> 
> >   We generally reserve
> > the latter for hot path things.  Since this isn't a hot path, having
> > the call handled directly by the kernel seems wrong.
> >
> > Unless a "UV call" is something different I don't know about.
> 
> Yes, a UV call goes to the Ultravisor, not the Hypervisor. The
> Hypervisor isn't trusted.
> 
> > > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > > vhost) go through the emulated MMIO for every access to the guest,
> > > which adds additional overhead.
> > 
> Ben.


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Michael S. Tsirkin
On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> 
> > I re-read that discussion and I'm still unclear on the
> > original question, since I got several apparently
> > conflicting answers.
> > 
> > I asked:
> > 
> > Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > hypervisor side sufficient?
> 
> I thought I had replied to this...
> 
> There are a couple of reasons:
> 
> - First qemu doesn't know that the guest will switch to "secure mode"
> in advance. There is no difference between a normal and a secure
> partition until the partition does the magic UV call to "enter secure
> mode" and qemu doesn't see any of it. So who can set the flag here ?

The user should set it. You just tell user "to be able to use with
feature X, enable IOMMU".

> - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> vhost) go through the emulated MMIO for every access to the guest,
> which adds additional overhead.
> 
> Cheers,
> Ben.

There are several answers to this.  One is that we are working hard to
make overhead small when the mappings are static (which they would be if
there's no actual IOMMU). So maybe especially given you are using
a bounce buffer on top it's not so bad - did you try to
benchmark?

Another is that given the basic functionality is in there, optimizations
can possibly wait until per-device quirks in DMA API are supported.


> > 
> > 
> > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++
> > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++
> > >  drivers/virtio/virtio_ring.c   | 10 ++
> > >  3 files changed, 27 insertions(+)
> > > 
> > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > index 8fa3945..056e578 100644
> > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device *dev);
> > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > >  
> > >  #endif /* __KERNEL__ */
> > > +
> > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > +
> > > +struct virtio_device;
> > > +
> > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > >  #endif   /* _ASM_DMA_MAPPING_H */
> > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > index 06f0296..a2ec15a 100644
> > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > @@ -38,6 +38,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > >  __setup("multitce=", disable_multitce);
> > >  
> > >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > > +
> > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > +{
> > > + /*
> > > +  * On protected guest platforms, force virtio core to use DMA
> > > +  * MAP API for all virtio devices. But there can also be some
> > > +  * exceptions for individual devices like virtio balloon.
> > > +  */
> > > + return (of_find_compatible_node(NULL, NULL, "ibm,ultravisor") != NULL);
> > > +}
> > 
> > Isn't this kind of slow?  vring_use_dma_api is on
> > data path and supposed to be very fast.
> > 
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 21d464a..47ea6c3 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > >   * unconditionally on data path.
> > >   */
> > >  
> > > +#ifndef platform_forces_virtio_dma
> > > +static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > +{
> > > + return false;
> > > +}
> > > +#endif
> > > +
> > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  {
> > > + if (platform_forces_virtio_dma(vdev))
> > > + return true;
> > > +
> > >   if (!virtio_has_iommu_quirk(vdev))
> > >   return true;
> > >  
> > > -- 
> > > 2.9.3
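The `#ifndef`/`static inline` fallback in the quoted patch is a standard kernel idiom: an arch header that wants to override the hook defines a macro with the same name as the function, and generic code supplies a no-op default when the macro is absent. A minimal user-space sketch of the idiom — `struct virtio_device` and the trailing feature check here are simplified stand-ins, not the real virtio definitions:

```c
#include <assert.h>
#include <stdbool.h>

struct virtio_device { int features; };  /* stand-in for the real struct */

/*
 * An architecture that wants to force the DMA API would provide, in its
 * dma-mapping.h:
 *     #define platform_forces_virtio_dma platform_forces_virtio_dma
 *     extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
 * When the macro is absent, generic code falls back to this default.
 */
#ifndef platform_forces_virtio_dma
static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
{
	(void)vdev;
	return false;  /* default: the platform imposes no requirement */
}
#endif

static bool vring_use_dma_api(struct virtio_device *vdev)
{
	if (platform_forces_virtio_dma(vdev))
		return true;
	/* stand-in for the VIRTIO_F_IOMMU_PLATFORM feature-bit check */
	return vdev->features != 0;
}
```

Having the macro share the function's name lets the override be detected at preprocessing time, so the default costs nothing on architectures that don't provide the hook.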


Re: [PATCH] fsl/qe: ucc: copy and paste bug in ucc_get_tdm_sync_shift()

2018-06-04 Thread Mathieu Malaterre
Where did the original go?

https://patchwork.ozlabs.org/patch/868158/


On Mon, Jun 4, 2018 at 2:02 PM Dan Carpenter  wrote:
>
> There is a copy and paste bug so we accidentally use the RX_ shift when
> we're in TX_ mode.
>
> Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
> Signed-off-by: Dan Carpenter 
> ---
> Static analysis work.  Not tested.  This affects the success path, so
> we should probably test it.
>
> diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c
> index c646d8713861..681f7d4b7724 100644
> --- a/drivers/soc/fsl/qe/ucc.c
> +++ b/drivers/soc/fsl/qe/ucc.c
> @@ -626,7 +626,7 @@ static u32 ucc_get_tdm_sync_shift(enum comm_dir mode, u32 tdm_num)
>  {
> u32 shift;
>
> -   shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : RX_SYNC_SHIFT_BASE;
> +   shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : TX_SYNC_SHIFT_BASE;
> shift -= tdm_num * 2;
>
> return shift;


Re: [PATCH 10/11] macintosh/via-pmu: Clean up interrupt statistics

2018-06-04 Thread Geert Uytterhoeven
Hi Finn,

On Sat, Jun 2, 2018 at 5:27 AM, Finn Thain  wrote:
> Replace an open-coded ffs() with the function call.
> Simplify an if-else cascade using a switch statement.
> Correct a typo and an indentation issue.
>
> Tested-by: Stan Johnson 
> Signed-off-by: Finn Thain 

Thanks for your patch!

Reviewed-by: Geert Uytterhoeven 

A few minor nits below...

> --- a/drivers/macintosh/via-pmu.c
> +++ b/drivers/macintosh/via-pmu.c

> @@ -1470,25 +1470,25 @@ pmu_handle_data(unsigned char *data, int len)
> adb_input(data+1, len-1, 1);
>  #endif /* CONFIG_ADB */
> }
> -   }
> +   break;
> /* Sound/brightness button pressed */
> -   else if ((1 << pirq) & PMU_INT_SNDBRT) {
> +   case PMU_INT_SNDBRT:
>  #ifdef CONFIG_PMAC_BACKLIGHT
> if (len == 3)
> pmac_backlight_set_legacy_brightness_pmu(data[1] >> 4);
>  #endif
> -   }
> +   break;

Please add a blank line after each "break" statement.

> /* Tick interrupt */
> -   else if ((1 << pirq) & PMU_INT_TICK) {
> -   /* Environement or tick interrupt, query batteries */
> +   case PMU_INT_TICK:

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
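The two cleanups Geert is reviewing — replacing an open-coded lowest-set-bit scan with `ffs()`, and turning the if-else cascade into a `switch` on the decoded interrupt bit — can be sketched together like this. The bit values and the classification strings are illustrative, not the actual via-pmu definitions:

```c
#include <assert.h>
#include <string.h>
#include <strings.h>  /* ffs() */

/* Illustrative interrupt bits, modeled on the PMU_INT_* flags. */
#define PMU_INT_TICK    0x04
#define PMU_INT_SNDBRT  0x08
#define PMU_INT_ADB     0x10

static const char *pmu_classify(int intr)
{
	int pirq = ffs(intr) - 1;  /* index of the lowest set bit, -1 if none */

	if (pirq < 0)
		return "none";

	switch (1 << pirq) {
	case PMU_INT_ADB:
		return "adb";

	case PMU_INT_SNDBRT:
		return "sound/brightness";

	case PMU_INT_TICK:
		return "tick";

	default:
		return "unknown";
	}
}
```

Switching on `1 << pirq` keeps the case labels readable as mask names, which mirrors the structure of the patch under review.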


[PATCH] fsl/qe: ucc: copy and paste bug in ucc_get_tdm_sync_shift()

2018-06-04 Thread Dan Carpenter
There is a copy and paste bug so we accidentally use the RX_ shift when
we're in TX_ mode.

Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
Signed-off-by: Dan Carpenter 
---
Static analysis work.  Not tested.  This affects the success path, so
we should probably test it.

diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c
index c646d8713861..681f7d4b7724 100644
--- a/drivers/soc/fsl/qe/ucc.c
+++ b/drivers/soc/fsl/qe/ucc.c
@@ -626,7 +626,7 @@ static u32 ucc_get_tdm_sync_shift(enum comm_dir mode, u32 tdm_num)
 {
u32 shift;
 
-   shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : RX_SYNC_SHIFT_BASE;
+   shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : TX_SYNC_SHIFT_BASE;
shift -= tdm_num * 2;
 
return shift;
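The one-character fix is easier to see out of context: the ternary must select a different base per direction, and the buggy version used `RX_SYNC_SHIFT_BASE` on both sides. A standalone sketch of the corrected function — the base values 30 and 14 are assumptions for illustration, not the constants from the QE headers:

```c
#include <assert.h>
#include <stdint.h>

enum comm_dir { COMM_DIR_NONE, COMM_DIR_RX, COMM_DIR_TX, COMM_DIR_RX_AND_TX };

#define RX_SYNC_SHIFT_BASE 30  /* illustrative value, not from the QE headers */
#define TX_SYNC_SHIFT_BASE 14  /* illustrative value, not from the QE headers */

static uint32_t ucc_get_tdm_sync_shift(enum comm_dir mode, uint32_t tdm_num)
{
	uint32_t shift;

	/* The bug used RX_SYNC_SHIFT_BASE on both sides of this ternary. */
	shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : TX_SYNC_SHIFT_BASE;
	shift -= tdm_num * 2;  /* each TDM instance consumes two bits */

	return shift;
}
```

With identical bases, RX and TX modes produced the same shift, which is why only the success path of TX configuration was affected.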


Re: [PATCH 08/11] macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver

2018-06-04 Thread Geert Uytterhoeven
Hi Finn,

On Sat, Jun 2, 2018 at 5:27 AM, Finn Thain  wrote:
> Now that the PowerMac via-pmu driver supports m68k PowerBooks,
> switch over to that driver and remove the via-pmu68k driver.

Thanks!

> Don't call pmu_shutdown() or pmu_restart() on early PowerBooks:
> the PMU device found in these PowerBooks isn't supported.

Shouldn't that be a separate patch?

> --- a/arch/m68k/mac/misc.c
> +++ b/arch/m68k/mac/misc.c

> @@ -477,9 +445,8 @@ void mac_poweroff(void)
>macintosh_config->adb_type == MAC_ADB_CUDA) {
> cuda_shutdown();
>  #endif
> -#ifdef CONFIG_ADB_PMU68K
> -   } else if (macintosh_config->adb_type == MAC_ADB_PB1
> -   || macintosh_config->adb_type == MAC_ADB_PB2) {
> +#ifdef CONFIG_ADB_PMU
> +   } else if (macintosh_config->adb_type == MAC_ADB_PB2) {
> pmu_shutdown();
>  #endif
> }
> @@ -519,9 +486,8 @@ void mac_reset(void)
>macintosh_config->adb_type == MAC_ADB_CUDA) {
> cuda_restart();
>  #endif
> -#ifdef CONFIG_ADB_PMU68K
> -   } else if (macintosh_config->adb_type == MAC_ADB_PB1
> -   || macintosh_config->adb_type == MAC_ADB_PB2) {
> +#ifdef CONFIG_ADB_PMU
> +   } else if (macintosh_config->adb_type == MAC_ADB_PB2) {
> pmu_restart();
>  #endif
> } else if (CPU_IS_030) {

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 03/11] macintosh/via-pmu: Don't clear shift register interrupt flag twice

2018-06-04 Thread Geert Uytterhoeven
On Sat, Jun 2, 2018 at 5:27 AM, Finn Thain  wrote:
> Clearing the interrupt flag twice in succession creates a theoretical
> race condition. Fix this.

I would add that the caller of pmu_sr_intr() has already cleared the flag,
so the casual reviewer doesn't have to hunt for it.

> Tested-by: Stan Johnson 
> Signed-off-by: Finn Thain 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 02/11] macintosh/via-pmu: Add missing mmio accessors

2018-06-04 Thread Geert Uytterhoeven
On Sat, Jun 2, 2018 at 5:27 AM, Finn Thain  wrote:
> Add missing in_8() accessors to init_pmu() and pmu_sr_intr().
>
> This fixes several sparse warnings:
> drivers/macintosh/via-pmu.c:536:29: warning: dereference of noderef expression
> drivers/macintosh/via-pmu.c:537:33: warning: dereference of noderef expression
> drivers/macintosh/via-pmu.c:1455:17: warning: dereference of noderef expression
> drivers/macintosh/via-pmu.c:1456:69: warning: dereference of noderef expression
>
> Tested-by: Stan Johnson 
> Signed-off-by: Finn Thain 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2] cpuidle/powernv : Add Description for cpuidle state

2018-06-04 Thread Akshay Adiga
On Mon, Jun 04, 2018 at 07:04:14PM +1000, Benjamin Herrenschmidt wrote:
> Is this a new property ? I'm not fan of adding yet another of those
> silly arrays.
> 
> I would say this is the right time now to switch over to a node per
> state instead, as we discussed with Vaidy.

I posted  the node based device tree here :
skiboot patch :  https://patchwork.ozlabs.org/patch/923120/
kernel patch : https://lkml.org/lkml/2018/5/30/1146

Do you have any input on this design?

> 
> Additionally, while doing that, we can provide the versioning mechanism
> I proposed so we can deal with state specific issues and erratas.
> 
> Cheers,
> Ben.
> 



Re: [PATCH 01/11] macintosh/via-pmu: Fix section mismatch warning

2018-06-04 Thread Geert Uytterhoeven
Hi Finn,

On Sat, Jun 2, 2018 at 5:27 AM, Finn Thain  wrote:
> The pmu_init() function has the __init qualifier, but the ops struct
> that holds a pointer to it does not. This causes a build warning.
> The driver works fine because the pointer is only dereferenced early.
>
> The function is so small that there's negligible benefit from using
> the __init qualifier. Remove it to fix the warning, consistent with
> the other ADB drivers.

Some other ADB subdriver .init() and .probe() functions aren't that small.
But with the current scheme using adb_drivers_list[], they cannot be __init.
Probably the long term fix is to change the ADB subsystem from the
centralized approach of letting adb_init() call all subdrivers, to making the
subdrivers platform drivers registering with the ADB core.

> Tested-by: Stan Johnson 
> Signed-off-by: Finn Thain 

Anyway:
Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH] cpuidle:powernv: Make the snooze timeout dynamic.

2018-06-04 Thread Michael Ellerman
"Gautham R. Shenoy"  writes:

> From: "Gautham R. Shenoy" 
>
> The commit 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of
> snooze to deeper idle state") introduced a timeout for the snooze idle
> state so that it could be eventually be promoted to a deeper idle
> state. The snooze timeout value is static and set to the target
> residency of the next idle state, which would train the cpuidle
> governor to pick the next idle state eventually.
>
> The unfortunate side-effect of this is that if the next idle state(s)
> is disabled, the CPU will forever remain in snooze, despite the fact
> that the system is completely idle, and other deeper idle states are
> available.

That sounds like a bug, I'll add?

Fixes: 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")
Cc: sta...@vger.kernel.org # v4.2+

cheers
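A dynamic timeout along the lines the patch describes would be derived from the shallowest *enabled* deeper state, rather than statically from state 1. A simplified sketch — the state-table fields are stand-ins for the real cpuidle structures, not the kernel's types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct idle_state {
	bool disabled;
	uint64_t target_residency_us;
};

/*
 * Snooze timeout: the target residency of the shallowest enabled
 * non-snooze state.  Returns 0 when every deeper state is disabled,
 * i.e. there is nothing to promote to.
 */
static uint64_t snooze_timeout_us(const struct idle_state *states, int n)
{
	for (int i = 1; i < n; i++)  /* index 0 is snooze itself */
		if (!states[i].disabled)
			return states[i].target_residency_us;
	return 0;
}
```

Recomputing the timeout this way avoids the bug described above, where disabling the next state left the CPU spinning in snooze even though deeper states were still available.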


Re: linux-next: Signed-off-by missing for commit in the powerpc tree

2018-06-04 Thread Michael Ellerman
Stephen Rothwell  writes:

> Hi all,
>
> Commit
>
>   cb3d6759a93c ("powerpc/64s: Enable barrier_nospec based on firmware settings")
>
> is missing a Signed-off-by from its author.

Sorry my fault.

cheers


Re: pkeys on POWER: Access rights not reset on execve

2018-06-04 Thread Florian Weimer

On 06/03/2018 10:18 PM, Ram Pai wrote:

On Mon, May 21, 2018 at 01:29:11PM +0200, Florian Weimer wrote:

On 05/20/2018 09:11 PM, Ram Pai wrote:

Florian,

Does the following patch fix the problem for you?  Just like x86
I am enabling all keys in the UAMOR register during
initialization itself. Hence any key created by any thread at
any time, will get activated on all threads. So any thread
can change the permission on that key. Smoke tested it
with your test program.


I think this goes in the right direction, but the AMR value after
fork is still strange:

AMR (PID 34912): 0x
AMR after fork (PID 34913): 0x
AMR (PID 34913): 0x
Allocated key in subprocess (PID 34913): 2
Allocated key (PID 34912): 2
Setting AMR: 0x
New AMR value (PID 34912): 0x0fff
About to call execl (PID 34912) ...
AMR (PID 34912): 0x0fff
AMR after fork (PID 34914): 0x0003
AMR (PID 34914): 0x0003
Allocated key in subprocess (PID 34914): 2
Allocated key (PID 34912): 2
Setting AMR: 0x
New AMR value (PID 34912): 0x0fff

I mean this line:

AMR after fork (PID 34914): 0x0003

Shouldn't it be the same as in the parent process?


Fixed it. Please try this patch. If it all works to your satisfaction, I
will clean it up further and send it to Michael Ellerman (ppc maintainer).


commit 51f4208ed5baeab1edb9b0f8b68d719b3527
Author: Ram Pai 
Date:   Sun Jun 3 14:44:32 2018 -0500

 Fix for the fork bug.
 
 Signed-off-by: Ram Pai 


Is this on top of the previous patch, or a separate fix?

Thanks,
Florian


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread Benjamin Herrenschmidt
On Mon, 2018-06-04 at 18:57 +1000, David Gibson wrote:
> 
> > - First qemu doesn't know that the guest will switch to "secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter secure
> > mode" and qemu doesn't see any of it. So who can set the flag here ?
> 
> This seems weird to me.  As a rule HV calls should go through qemu -
> or be allowed to go directly to KVM *by* qemu.

It's not an HV call, it's a UV call, qemu won't see it, qemu isn't
trusted. Now the UV *will* reflect that to the HV via some synthetized
HV calls, and we *could* have those do a pass by qemu, however, so far,
our entire design doesn't rely on *any* qemu knowledge whatsoever and
it would be sad to add it just for that purpose.

Additionally, this is rather orthogonal, see my other email, the
problem we are trying to solve is *not* a qemu problem and it doesn't
make sense to leak that into qemu.

>   We generally reserve
> the latter for hot path things.  Since this isn't a hot path, having
> the call handled directly by the kernel seems wrong.
>
> Unless a "UV call" is something different I don't know about.

Yes, a UV call goes to the Ultravisor, not the Hypervisor. The
Hypervisor isn't trusted.

> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> 
Ben.



Re: [PATCH v2 07/13] powerpc/eeh: Clean up pci_ers_result handling

2018-06-04 Thread Michael Ellerman
Sam Bobroff  writes:

> On Sat, Jun 02, 2018 at 01:40:46AM +1000, Michael Ellerman wrote:
>> Sam Bobroff  writes:
>> 
>> > As EEH event handling progresses, a cumulative result of type
>> > pci_ers_result is built up by (some of) the eeh_report_*() functions
>> > using either:
>> >if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
>> >if (*res == PCI_ERS_RESULT_NONE) *res = rc;
>> > or:
>> >if ((*res == PCI_ERS_RESULT_NONE) ||
>> >(*res == PCI_ERS_RESULT_RECOVERED)) *res = rc;
>> >if (*res == PCI_ERS_RESULT_DISCONNECT &&
>> >rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
>> > (Where *res is the accumulator.)
>> >
>> > However, the intent is not immediately clear and the result in some
>> > situations is order dependent.
>> >
>> > Address this by assigning a priority to each result value, and always
>> > merging to the highest priority. This renders the intent clear, and
>> > provides a stable value for all orderings.
>> >
>> > Signed-off-by: Sam Bobroff 
>> > ---
>> > == v1 -> v2: ==
>> >
>> > * Added the value, and missing newline, to some WARN()s.
>> > * Improved name of merge_result() to pci_ers_merge_result().
>> > * Adjusted the result priorities so that unknown doesn't overlap with _NONE.
>> 
>> These === markers seem to have confused patchwork, they ended up in the
>> patch, and then git put them in the changelog.
>> 
>> http://patchwork.ozlabs.org/patch/920194/
>> 
>> The usual format is just something like:
>> 
>> v2 - Added the value, and missing newline, to some WARN()s.
>>- Improved name of merge_result() to pci_ers_merge_result().
>>- Adjusted the result priorities so that unknown doesn't overlap with _NONE.
>> 
>> cheers
>
> Oh! I'll change it!

Thanks.

cheers
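The priority-merge approach described in the patch can be sketched as: map each result value to a rank and always keep the higher-ranked one, which makes the accumulation order-independent. The enum ordering and the ranks below are illustrative, not the values chosen in Sam's patch:

```c
#include <assert.h>

enum pci_ers_result {
	PCI_ERS_RESULT_NONE,
	PCI_ERS_RESULT_CAN_RECOVER,
	PCI_ERS_RESULT_RECOVERED,
	PCI_ERS_RESULT_NEED_RESET,
	PCI_ERS_RESULT_DISCONNECT,
};

/* Higher rank = more severe outcome = wins the merge. */
static int ers_rank(enum pci_ers_result r)
{
	switch (r) {
	case PCI_ERS_RESULT_NONE:        return 0;
	case PCI_ERS_RESULT_RECOVERED:   return 1;
	case PCI_ERS_RESULT_CAN_RECOVER: return 2;
	case PCI_ERS_RESULT_NEED_RESET:  return 3;
	case PCI_ERS_RESULT_DISCONNECT:  return 4;
	}
	return 5;  /* unknown values dominate everything */
}

static enum pci_ers_result ers_merge(enum pci_ers_result acc,
				     enum pci_ers_result rc)
{
	return ers_rank(rc) > ers_rank(acc) ? rc : acc;
}
```

Because the merge is a max over ranks, `ers_merge(a, ers_merge(b, c))` equals `ers_merge(ers_merge(a, b), c)` for any ordering of reports — exactly the stability property the changelog asks for.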


Re: [RFC V2] virtio: Add platform specific DMA API translation for virtio devices

2018-06-04 Thread David Gibson
On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> 
> > I re-read that discussion and I'm still unclear on the
> > original question, since I got several apparently
> > conflicting answers.
> > 
> > I asked:
> > 
> > Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > hypervisor side sufficient?
> 
> I thought I had replied to this...
> 
> There are a couple of reasons:
> 
> - First qemu doesn't know that the guest will switch to "secure mode"
> in advance. There is no difference between a normal and a secure
> partition until the partition does the magic UV call to "enter secure
> mode" and qemu doesn't see any of it. So who can set the flag here ?

This seems weird to me.  As a rule HV calls should go through qemu -
or be allowed to go directly to KVM *by* qemu.  We generally reserve
the latter for hot path things.  Since this isn't a hot path, having
the call handled directly by the kernel seems wrong.

Unless a "UV call" is something different I don't know about.

> - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> vhost) go through the emulated MMIO for every access to the guest,
> which adds additional overhead.
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: powerpc/powernv: copy/paste - Mask XERS0 bit in CR

2018-06-04 Thread Michael Ellerman
Haren Myneni  writes:
> On 06/03/2018 03:48 AM, Michael Ellerman wrote:
>> Haren Myneni  writes:
>>> NX can set 3rd bit in CR register for XER[SO] (Summary Overflow)
>>> which is not related to paste request. The current paste function
>>> returns failure for the successful request when this bit is set.
>>> So mask this bit and check the proper return status.
>>>
>>> Fixes: 2392c8c8c045 ("powerpc/powernv/vas: Define copy/paste interfaces")
>>> Cc: sta...@vger.kernel.org # v4.14+
>>> Signed-off-by: Haren Myneni 
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
>>> index c9a5036..82392e3 100644
>>> --- a/arch/powerpc/platforms/powernv/copy-paste.h
>>> +++ b/arch/powerpc/platforms/powernv/copy-paste.h
>>> @@ -9,7 +9,8 @@
>>>  #include 
>>>  
>>>  #define CR0_SHIFT  28
>>> -#define CR0_MASK   0xF
>>> +#define CR0_MASK   0xE /* 3rd bit undefined or set for XER[SO] */
>>> +
>>>  /*
>>>   * Copy/paste instructions:
>>>   *
>> 
>> Unfortunately this no longer applies to my next branch, because those
>> macros have been moved out of this header as part of an unrelated patch.
>> 
>> The following patch should work instead, can you please confirm by
>> testing it?
>> 
>> diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
>> index 3fa62de96d9c..c46a326776cf 100644
>> --- a/arch/powerpc/platforms/powernv/copy-paste.h
>> +++ b/arch/powerpc/platforms/powernv/copy-paste.h
>> @@ -41,5 +41,7 @@ static inline int vas_paste(void *paste_address, int offset)
>>  : "b" (offset), "b" (paste_address)
>>  : "memory", "cr0");
>> 
>> -return (cr >> CR0_SHIFT) & CR0_MASK;
>> +
>> +/* We mask with 0xE to ignore SO */
>> +return (cr >> CR0_SHIFT) & 0xE;
>>  }
>
> Tested with this patch and it works.

Thanks. I sent a new version to the list and will apply that.

cheers
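The fix masks off the low bit of the extracted CR0 field, which mirrors XER[SO] after a `paste.` and says nothing about whether the paste completed. In isolation, the extraction looks like this — a plain C model of the CR handling, not the inline-asm wrapper itself:

```c
#include <assert.h>
#include <stdint.h>

#define CR0_SHIFT 28
#define CR0_MASK  0xE  /* low bit of CR0 mirrors XER[SO]; ignore it */

/*
 * CR0 occupies bits 31..28 of the condition register.  NX may leave the
 * XER[SO] mirror set for reasons unrelated to the paste request, so only
 * the upper three bits of the field carry the completion status.
 */
static int paste_status(uint32_t cr)
{
	return (cr >> CR0_SHIFT) & CR0_MASK;
}
```

With the old mask of 0xF, a stale SO bit turned a successful completion code into an odd value the caller treated as failure; masking with 0xE makes the status independent of SO.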

