Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Christoph Hellwig
On Tue, Jun 11, 2019 at 05:20:12PM -0500, Larry Finger wrote:
> Your first patch did not work as the configuration does not have 
> CONFIG_ZONE_DMA. As a result, the initial value of min_mask always starts 
> at 32 bits and is taken down to 31 with the maximum pfn minimization. When 
> I forced the initial value of min_mask to 30 bits, the device worked.

Ooops, yes.  But I think we could just enable ZONE_DMA on 32-bit
powerpc.  Crude enablement hack below:

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636308c8..1dd71a98b70c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -372,7 +372,7 @@ config PPC_ADV_DEBUG_DAC_RANGE
 
 config ZONE_DMA
bool
-   default y if PPC_BOOK3E_64
+   default y
 
 config PGTABLE_LEVELS
int
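
For context, the logic this hack feeds into is the min_mask selection in
kernel/dma/direct.c (quoted verbatim further down this digest):

	if (IS_ENABLED(CONFIG_ZONE_DMA))
		min_mask = DMA_BIT_MASK(ARCH_ZONE_DMA_BITS);
	else
		min_mask = DMA_BIT_MASK(32);

With ZONE_DMA enabled, dma_direct_supported() starts from the
architecture's DMA zone limit rather than a flat 32-bit mask, which is
what the 30-bit b43legacy device here needs.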


Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-11 Thread Christoph Hellwig
On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> Setting a 48 bit DMA mask doesn't work today because we only allocate
> IOMMU tables to cover the 0..2GB range of PCI bus addresses.

I don't think that is true upstream, and if it is we need to fix the
bug in the powerpc code.  powerpc should at least fall back to treating
a 48-bit dma mask like a 32-bit one, that is, use dynamic iommu
mappings instead of the direct mapping.  And from my reading of
arch/powerpc/kernel/dma-iommu.c that is exactly what it does.
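
For illustration, the driver-side pattern that relies on this fallback is
roughly the following sketch (not code from this thread; the 48-bit limit
is a device-specific assumption):

	#include <linux/dma-mapping.h>
	#include <linux/pci.h>

	/* Try the widest mask the device can generate, then narrow it if
	 * the platform rejects it; the platform then decides between a
	 * bypass window and dynamic IOMMU mappings. */
	static int example_setup_dma(struct pci_dev *pdev)
	{
		int rc;

		rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
		if (rc)
			rc = dma_set_mask_and_coherent(&pdev->dev,
						       DMA_BIT_MASK(32));
		return rc;
	}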


Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-11 Thread Oliver O'Halloran
On Wed, Jun 12, 2019 at 3:25 AM Oded Gabbay  wrote:
>
> On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay  wrote:
> >
> > On Tue, Jun 11, 2019 at 6:26 PM Greg KH  wrote:
> > > *snip*
> >
> > Now, when I tried to integrate Goya into a POWER9 machine, I got a
> > reject from the call to pci_set_dma_mask(pdev, 48). The standard code,
> > as I wrote above, is to call the same function with 32-bits. That
> > works BUT it is not practical, as our applications require much more
> > memory mapped than 32 bits.

Setting a 48 bit DMA mask doesn't work today because we only allocate
IOMMU tables to cover the 0..2GB range of PCI bus addresses. Alexey
has some patches to expand that range so we can support devices that
can't hit the 64 bit bypass window. You need:

This fix: http://patchwork.ozlabs.org/patch/1113506/
This series: 
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110810

Give that a try and see if the IOMMU overhead is tolerable.

> > In addition, once you add more cards which
> > are all mapped to the same range, it is simply not usable at all.

Each IOMMU group should have a separate bus address space and separate
cards shouldn't be in the same IOMMU group. If they are, then there's
something up.
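
For reference, a driver can sanity-check which group a device landed in
through the IOMMU core API; a minimal sketch, assuming an ordinary PCI
function:

	#include <linux/iommu.h>

	/* Separate cards should report distinct group IDs here. */
	struct iommu_group *grp = iommu_group_get(&pdev->dev);
	if (grp) {
		dev_info(&pdev->dev, "iommu group %d\n", iommu_group_id(grp));
		iommu_group_put(grp);
	}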

Oliver


Re: sys_exit: NR -1

2019-06-11 Thread Naveen N. Rao

Paul Clarke wrote:

What are the circumstances in which raw_syscalls:sys_exit reports "-1" for the 
syscall ID?

perf  5375 [007] 59632.478528:   raw_syscalls:sys_enter: NR 1 (3, 9fb888, 8, 2d83740, 1, 7)
perf  5375 [007] 59632.478532:    raw_syscalls:sys_exit: NR 1 = 8
perf  5375 [007] 59632.478538:   raw_syscalls:sys_enter: NR 15 (11, 7ca734b0, 7ca73380, 2d83740, 1, 7)
perf  5375 [007] 59632.478539:    raw_syscalls:sys_exit: NR -1 = 8
perf  5375 [007] 59632.478543:   raw_syscalls:sys_enter: NR 16 (4, 2401, 0, 2d83740, 1, 0)
perf  5375 [007] 59632.478551:    raw_syscalls:sys_exit: NR 16 = 0


Which architecture?
For powerpc, see:

static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
{
/*
 * Note that we are returning an int here. That means 0xffffffff, ie.
 * 32-bit negative 1, will be interpreted as -1 on a 64-bit kernel.
 * This is important for seccomp so that compat tasks can set r0 = -1
 * to reject the syscall.
 */
return TRAP(regs) == 0xc00 ? regs->gpr[0] : -1;
}


- Naveen




Re: [PATCH v8 2/7] x86/dma: use IS_ENABLED() to simplify the code

2019-06-11 Thread Leizhen (ThunderTown)



On 2019/6/12 13:16, Borislav Petkov wrote:
> On Thu, May 30, 2019 at 11:48:26AM +0800, Zhen Lei wrote:
>> This patch removes the ifdefs around CONFIG_IOMMU_DEFAULT_PASSTHROUGH to
>> improve readability.
> 
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.

OK, thanks.

> 
> Also, do
> 
> $ git grep 'This patch' Documentation/process
> 
> for more details.
> 
>> Signed-off-by: Zhen Lei 
>> ---
>>  arch/x86/kernel/pci-dma.c | 7 ++-
>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index dcd272dbd0a9330..9f2b19c35a060df 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -43,11 +43,8 @@
>>   * It is also possible to disable by default in kernel config, and enable 
>> with
>>   * iommu=nopt at boot time.
>>   */
>> -#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
>> -int iommu_pass_through __read_mostly = 1;
>> -#else
>> -int iommu_pass_through __read_mostly;
>> -#endif
>> +int iommu_pass_through __read_mostly =
>> +	IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);
> 
> Let that line stick out.

OK, I will merge them on the same line.
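
For reference, the merged one-line form would look like:

	/* IS_ENABLED() expands to 1 or 0 at compile time, so no #ifdef
	 * is needed and the initializer fits on one line. */
	int iommu_pass_through __read_mostly = IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);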

> 
> Thx.
> 



Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-11 Thread Oded Gabbay
On Wed, Jun 12, 2019 at 1:53 AM Benjamin Herrenschmidt
 wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
Hmm... right now they are being added to customers' data centers, but what do I know ;)

> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.
Thanks :)
>
> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).
MSI-X is working. Setting bit 59 doesn't apply to MSI-X
transactions (AFAICS from the PCIe controller spec we have).
>
> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
Understood. In the specific system we are integrated into, that is the
case - we have less than 48 bits. But, as you pointed out, it is not a
generic solution; with my H/W I can't give a generic fit-all
solution for POWER9. I'll settle for the best that I can do.

>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.
Understood and taken care of.

>
> Cheers,
> Ben.
>
>
>
>


Re: [PATCH v3 1/3] powerpc/powernv: Add OPAL API interface to get secureboot state

2019-06-11 Thread Daniel Axtens
Nayna Jain  writes:

> From: Claudio Carvalho 
>
> The X.509 certificates trusted by the platform and other information
> required to secure boot the OS kernel are wrapped in secure variables,
> which are controlled by OPAL.
>
> This patch adds support to read OPAL secure variables through
> OPAL_SECVAR_GET call. It returns the metadata and data for a given secure
> variable based on the unique key.
>
> Since OPAL can support different types of backend which can vary in the
> variable interpretation, a new OPAL API call named OPAL_SECVAR_BACKEND, is
> added to retrieve the supported backend version. This helps the consumer
> to know how to interpret the variable.
>

(Firstly, apologies that I haven't got around to asking about this yet!)

Are pluggable/versioned backends a good idea?

There are a few things that worry me about the idea:

 - It adds complexity in crypto (or crypto-adjacent) code, and that
   increases the likelihood that we'll accidentally add a bug with bad
   consequences.

 - Under what circumstances would we change the kernel-visible
   behaviour of skiboot? Are we expecting to change the behaviour,
   content or names of the variables in future? Otherwise the only
   relevant change I can think of is a change to hardware platforms, and
   I'm not sure how a change in hardware would lead to change in
   behaviour in the kernel. Wouldn't Skiboot hide h/w differences?

 - If we are worried about a long-term-future change to how secure-boot
   works, would it be better to just add more get/set calls to opal at
   the point at which we actually implement the new system?
   
 - UEFI added EFI_VARIABLE_AUTHENTICATION_3 in a way that - as far
   as I know - didn't break backwards compatibility. Is there a reason
   we cannot add features that way instead? (It also dropped v1 of the
   authentication header.)
   
 - What is the correct fallback behaviour if a kernel receives a result
   that it does not expect? If a kernel expecting BackendV1 is instead
   informed that it is running on BackendV2, then it cannot access the
   secure variables at all, so it cannot load keys that are potentially
   required to successfully boot (e.g. to validate the module for a
   network card or graphics!)

Kind regards,
Daniel

> This support can be enabled using CONFIG_OPAL_SECVAR
>
> Signed-off-by: Claudio Carvalho 
> Signed-off-by: Nayna Jain 
> ---
> This patch depends on a new OPAL call that is being added to skiboot.
> The patch set that implements the new call has been posted to
> https://patchwork.ozlabs.org/project/skiboot/list/?series=112868
>
>  arch/powerpc/include/asm/opal-api.h  |  4 +-
>  arch/powerpc/include/asm/opal-secvar.h   | 23 ++
>  arch/powerpc/include/asm/opal.h  |  6 ++
>  arch/powerpc/platforms/powernv/Kconfig   |  6 ++
>  arch/powerpc/platforms/powernv/Makefile  |  1 +
>  arch/powerpc/platforms/powernv/opal-call.c   |  2 +
>  arch/powerpc/platforms/powernv/opal-secvar.c | 85 
>  7 files changed, 126 insertions(+), 1 deletion(-)
>  create mode 100644 arch/powerpc/include/asm/opal-secvar.h
>  create mode 100644 arch/powerpc/platforms/powernv/opal-secvar.c
>
> diff --git a/arch/powerpc/include/asm/opal-api.h 
> b/arch/powerpc/include/asm/opal-api.h
> index e1577cfa7186..a505e669b4b6 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -212,7 +212,9 @@
>  #define OPAL_HANDLE_HMI2 166
>  #define  OPAL_NX_COPROC_INIT 167
>  #define OPAL_XIVE_GET_VP_STATE   170
> -#define OPAL_LAST170
> +#define OPAL_SECVAR_GET 173
> +#define OPAL_SECVAR_BACKEND 177
> +#define OPAL_LAST177
>  
>  #define QUIESCE_HOLD 1 /* Spin all calls at entry */
>  #define QUIESCE_REJECT   2 /* Fail all calls with 
> OPAL_BUSY */
> diff --git a/arch/powerpc/include/asm/opal-secvar.h 
> b/arch/powerpc/include/asm/opal-secvar.h
> new file mode 100644
> index ..b677171a0368
> --- /dev/null
> +++ b/arch/powerpc/include/asm/opal-secvar.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * PowerNV definitions for secure variables OPAL API.
> + *
> + * Copyright (C) 2019 IBM Corporation
> + * Author: Claudio Carvalho 
> + *
> + */
> +#ifndef OPAL_SECVAR_H
> +#define OPAL_SECVAR_H
> +
> +enum {
> + BACKEND_NONE = 0,
> + BACKEND_TC_COMPAT_V1,
> +};
> +
> +extern int opal_get_variable(u8 *key, unsigned long ksize,
> +  u8 *metadata, unsigned long *mdsize,
> +  u8 *data, unsigned long *dsize);
> +
> +extern int opal_variable_version(unsigned long *backend);
> +
> +#endif
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 4cc37e708bc7..57d2c2356eda 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/

Re: [PATCH kernel v3 0/3] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-06-11 Thread Oliver O'Halloran
On Wed, Jun 12, 2019 at 3:06 PM Shawn Anastasio  wrote:
>
> On 6/5/19 11:11 PM, Shawn Anastasio wrote:
> > On 5/30/19 2:03 AM, Alexey Kardashevskiy wrote:
> >> This is an attempt to allow DMA masks between 32..59 which are not large
> >> enough to use either a PHB3 bypass mode or a sketchy bypass. Depending
> >> on the max order, up to 40 is usually available.
> >>
> >>
> >> This is based on v5.2-rc2.
> >>
> >> Please comment. Thanks.
> >
> > I have tested this patch set with an AMD GPU that's limited to <64bit
> > DMA (I believe it's 40 or 42 bit). It successfully allows the card to
> > operate without falling back to 32-bit DMA mode as it does without
> > the patches.
> >
> > Relevant kernel log message:
> > ```
> > [0.311211] pci 0033:01 : [PE# 00] Enabling 64-bit DMA bypass
> > ```
> >
> > Tested-by: Shawn Anastasio 
>
> After a few days of further testing, I've started to run into stability
> issues with the patch applied and used with an AMD GPU. Specifically,
> the system sometimes spontaneously crashes. Not just EEH errors either,
> the whole system shuts down in what looks like a checkstop.

Any specific workload? Checkstops are harder to debug without a system
in the failed state so we'd need to replicate that locally to get a
decent idea what's up.

> Perhaps some subtle corruption is occurring?


Re: [PATCH v2] powerpc/perf: Use cpumask_last() to determine the designated cpu for nest/core units.

2019-06-11 Thread Anju T Sudhakar

Hi Leonardo,

On 6/11/19 12:17 AM, Leonardo Bras wrote:

On Mon, 2019-06-10 at 12:02 +0530, Anju T Sudhakar wrote:

Nest and core imc (In-memory Collection counters) assign a particular
cpu as the designated target for counter data collection.
During system boot, the first online cpu in a chip gets assigned as
the designated cpu for that chip(for nest-imc) and the first online cpu
in a core gets assigned as the designated cpu for that core(for core-imc).

If the designated cpu goes offline, the next online cpu from the same
chip(for nest-imc)/core(for core-imc) is assigned as the next target,
and the event context is migrated to the target cpu.
Currently, cpumask_any_but() function is used to find the target cpu.
Though this function is expected to return a `random` cpu, this always
returns the next online cpu.

If all cpus in a chip/core are offlined in a sequential manner, starting
from the first cpu, the event migration has to happen for all the cpus
which go offline. Since the migration process involves a grace period,
the total time taken to offline all the cpus will be significantly high.

Seems like very interesting work.
Out of curiosity, have you used 'chcpu -d' to create your benchmark?


Here I did not use chcpu to disable the cpu.

I used a script which offlines cpus 88-175 by echoing `0` to

/sys/devices/system/cpu/cpu*/online.
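
For context, the core of the proposed change is roughly the following
sketch (the mask variable and surrounding logic are assumptions, not the
literal patch):

	/* Instead of cpumask_any_but(l_cpumask, cpu), which in practice
	 * returns the *next* online cpu and forces a re-migration on
	 * every sequential offline, pick the last cpu in the mask. */
	target = cpumask_last(l_cpumask);
	if (target >= nr_cpu_ids)	/* mask is now empty */
		return 0;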


Regards,

Anju




Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-11 Thread Christophe Leroy




Le 11/06/2019 à 18:30, Horia Geanta a écrit :

On 6/11/2019 6:40 PM, Christophe Leroy wrote:



Le 11/06/2019 à 17:37, Horia Geanta a écrit :

On 6/11/2019 5:39 PM, Christophe Leroy wrote:

This series is the last set of fixes for the Talitos driver.

We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:


I am getting the failures below on a sec 3.3.2 (p1020rdb) for hmac(sha384)
and hmac(sha512):


Is that new with this series or did you already have it before ?


Looks like this happens with or without this series.


Found the issue, that's in 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=b8fbdc2bc4e71b62646031d5df5f08aafe15d5ad


CONFIG_CRYPTO_DEV_TALITOS_SEC2 should be CONFIG_CRYPTO_DEV_TALITOS2 instead.

Just sent a patch to fix it.

Thanks
Christophe



I haven't checked the state of this driver for quite some time.
Since I've noticed increased activity, I thought it would be worth
actually testing the changes.

Are the changes in patch 2/4 ("crypto: talitos - fix hash on SEC1.")
strictly for sec 1.x, or do they affect all revisions?


What do you mean by "fuzz testing" enabled ? Is that
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS or something else ?


Yes, it's this config symbol.

Horia



[PATCH] crypto: talitos - fix max key size for sha384 and sha512

2019-06-11 Thread Christophe Leroy
The commit below came with a typo in the CONFIG_ symbol, leading
to a permanently reduced max key size regardless of the driver
capabilities.

Reported-by: Horia Geantă 
Fixes: b8fbdc2bc4e7 ("crypto: talitos - reduce max key size for SEC1")
Signed-off-by: Christophe Leroy 
---
 drivers/crypto/talitos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 03b7a5d28fb0..b4c8a013f302 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -832,7 +832,7 @@ static void talitos_unregister_rng(struct device *dev)
  * HMAC_SNOOP_NO_AFEA (HSNA) instead of type IPSEC_ESP
  */
 #define TALITOS_CRA_PRIORITY_AEAD_HSNA (TALITOS_CRA_PRIORITY - 1)
-#ifdef CONFIG_CRYPTO_DEV_TALITOS_SEC2
+#ifdef CONFIG_CRYPTO_DEV_TALITOS2
 #define TALITOS_MAX_KEY_SIZE   (AES_MAX_KEY_SIZE + SHA512_BLOCK_SIZE)
 #else
 #define TALITOS_MAX_KEY_SIZE   (AES_MAX_KEY_SIZE + SHA256_BLOCK_SIZE)
-- 
2.13.3



Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-11 Thread Oliver O'Halloran
On Wed, Jun 12, 2019 at 8:54 AM Benjamin Herrenschmidt
 wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.

> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).

Judging from the patch (https://lkml.org/lkml/2019/6/11/59) this is
what they're doing.

Also, are you sure about the MSI thing? The IODA3 spec says the only
important bits for a 64bit MSI are bits 61:60 (to hit the window) and
the lower bits that determine what IVE to use. Everything in between
is ignored so ORing in bit 59 shouldn't break anything.

> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.

Yes, do this. The easiest way to avoid this sort of weird hack is to
just design the PCIe interface to the spec in the first place.


Re: [PATCH v8 2/7] x86/dma: use IS_ENABLED() to simplify the code

2019-06-11 Thread Borislav Petkov
On Thu, May 30, 2019 at 11:48:26AM +0800, Zhen Lei wrote:
> This patch removes the ifdefs around CONFIG_IOMMU_DEFAULT_PASSTHROUGH to
> improve readability.

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

> Signed-off-by: Zhen Lei 
> ---
>  arch/x86/kernel/pci-dma.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index dcd272dbd0a9330..9f2b19c35a060df 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -43,11 +43,8 @@
>   * It is also possible to disable by default in kernel config, and enable 
> with
>   * iommu=nopt at boot time.
>   */
> -#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
> -int iommu_pass_through __read_mostly = 1;
> -#else
> -int iommu_pass_through __read_mostly;
> -#endif
> +int iommu_pass_through __read_mostly =
> + IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);

Let that line stick out.

Thx.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH kernel v3 0/3] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-06-11 Thread Shawn Anastasio

On 6/5/19 11:11 PM, Shawn Anastasio wrote:

On 5/30/19 2:03 AM, Alexey Kardashevskiy wrote:

This is an attempt to allow DMA masks between 32..59 which are not large
enough to use either a PHB3 bypass mode or a sketchy bypass. Depending
on the max order, up to 40 is usually available.


This is based on v5.2-rc2.

Please comment. Thanks.


I have tested this patch set with an AMD GPU that's limited to <64bit
DMA (I believe it's 40 or 42 bit). It successfully allows the card to
operate without falling back to 32-bit DMA mode as it does without
the patches.

Relevant kernel log message:
```
[    0.311211] pci 0033:01 : [PE# 00] Enabling 64-bit DMA bypass
```

Tested-by: Shawn Anastasio 


After a few days of further testing, I've started to run into stability
issues with the patch applied and used with an AMD GPU. Specifically,
the system sometimes spontaneously crashes. Not just EEH errors either,
the whole system shuts down in what looks like a checkstop.

Perhaps some subtle corruption is occurring?


Re: [PATCH 2/2] powerpc/64s: __find_linux_pte synchronization vs pmdp_invalidate

2019-06-11 Thread Michael Ellerman
On Fri, 2019-06-07 at 03:56:36 UTC, Nicholas Piggin wrote:
> The change to pmdp_invalidate to mark the pmd with _PAGE_INVALID broke
> the synchronisation against lock-free lookups: __find_linux_pte's
> pmd_none check no longer returns true for such cases.
> 
> Fix this by adding a check for this condition as well.
> 
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at 
> _PAGE_PRESENT bit")
> Cc: Christophe Leroy 
> Suggested-by: Aneesh Kumar K.V 
> Signed-off-by: Nicholas Piggin 
> Reviewed-by: Aneesh Kumar K.V 
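
For readers following along, the guarded condition is roughly this
conceptual sketch (helper and macro names are simplified, not the
literal patch):

	pmd_t pmd = READ_ONCE(*pmdp);

	if (pmd_none(pmd))
		return NULL;	/* hugepage collapse in progress */

	/* A PMD hit by pmdp_invalidate() has _PAGE_PRESENT cleared but
	 * _PAGE_INVALID set: it is not none, yet not safe to walk. */
	if (!(pmd_val(pmd) & _PAGE_PRESENT) && (pmd_val(pmd) & _PAGE_INVALID))
		return NULL;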

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/a00196a272161338d4b1d66ec69e3d57

cheers


Re: [PATCH 1/2] powerpc/64s: Fix THP PMD collapse serialisation

2019-06-11 Thread Michael Ellerman
On Fri, 2019-06-07 at 03:56:35 UTC, Nicholas Piggin wrote:
> Commit 1b2443a547f9 ("powerpc/book3s64: Avoid multiple endian conversion
> in pte helpers") changed the actual bitwise tests in pte_access_permitted
> by using pte_write() and pte_present() helpers rather than raw bitwise
> testing _PAGE_WRITE and _PAGE_PRESENT bits.
> 
> The pte_present change now returns true for ptes which are !_PAGE_PRESENT
> and _PAGE_INVALID, which is the combination used by pmdp_invalidate to
> synchronize access from lock-free lookups. pte_access_permitted is used by
> pmd_access_permitted, so allowing GUP lock free access to proceed with
> such PTEs breaks this synchronisation.
> 
> This bug has been observed on HPT host, with random crashes and corruption
> in guests, usually together with bad PMD messages in the host.
> 
> Fix this by adding an explicit check in pmd_access_permitted, and
> documenting the condition explicitly.
> 
> The pte_write() change should be okay, and would prevent GUP from falling
> back to the slow path when encountering savedwrite ptes, which matches
> what x86 (that does not implement savedwrite) does.
> 
> Fixes: 1b2443a547f9 ("powerpc/book3s64: Avoid multiple endian conversion in 
> pte helpers")
> Cc: Aneesh Kumar K.V 
> Cc: Christophe Leroy 
> Signed-off-by: Nicholas Piggin 
> Reviewed-by: Aneesh Kumar K.V 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/33258a1db165cf43a9e6382587ad06e9

cheers


Re: [PATCH] powerpc/32s: fix booting with CONFIG_PPC_EARLY_DEBUG_BOOTX

2019-06-11 Thread Michael Ellerman
On Mon, 2019-06-03 at 13:00:51 UTC, Christophe Leroy wrote:
> When booting through OF, setup_disp_bat() does nothing because
> disp_BAT are not set. By change, it used to work because BOOTX
> buffer is mapped 1:1 at address 0x8100 by the bootloader, and
> btext_setup_display() sets virt addr same as phys addr.
> 
> But since commit 215b823707ce ("powerpc/32s: set up an early static
> hash table for KASAN."), a temporary page table overrides the
> bootloader mapping.
> 
> This 0x8100 is also problematic with the newly implemented
> Kernel Userspace Access Protection (KUAP) because it is within user
> address space.
> 
> This patch fixes those issues by properly setting disp_BAT through
> a call to btext_prepare_BAT(), allowing setup_disp_bat() to
> properly setup BAT3 for early bootx screen buffer access.
> 
> Reported-by: Mathieu Malaterre 
> Fixes: 215b823707ce ("powerpc/32s: set up an early static hash table for 
> KASAN.")
> Signed-off-by: Christophe Leroy 
> Tested-by: Mathieu Malaterre 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/c21f5a9ed85ca3e914ca11f421677ae9

cheers


Re: [PATCH v3] powerpc: fix kexec failure on book3s/32

2019-06-11 Thread Michael Ellerman
On Mon, 2019-06-03 at 08:20:28 UTC, Christophe Leroy wrote:
> In the old days, _PAGE_EXEC didn't exist on 6xx aka book3s/32.
> Therefore, although __mapin_ram_chunk() was already mapping kernel
> text with PAGE_KERNEL_TEXT and the rest with PAGE_KERNEL, the entire
> memory was executable. Part of the memory (first 512kbytes) was
> mapped with BATs instead of page table, but it was also entirely
> mapped as executable.
> 
> In commit 385e89d5b20f ("powerpc/mm: add exec protection on
> powerpc 603"), we started adding exec protection to some 6xx, namely
> the 603, for pages mapped via pagetables.
> 
> Then, in commit 63b2bc619565 ("powerpc/mm/32s: Use BATs for
> STRICT_KERNEL_RWX"), the exec protection was extended to BAT mapped
> memory, so that really only the kernel text could be executed.
> 
> The problem here is that kexec is based on copying some code into
> upper part of memory then executing it from there in order to install
> a fresh new kernel at its definitive location.
> 
> However, the code is position independent and the first part of it is
> just there to deactivate the MMU and jump to the second part. So it
> is possible to run this first part in place instead of running the
> copy. Once the MMU is off, there is no protection anymore and the
> second part of the code will just run as before.
> 
> Reported-by: Aaro Koskinen 
> Fixes: 63b2bc619565 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Christophe Leroy 
> Tested-by: Aaro Koskinen 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/6c284228eb356a1ec62a704b4d232971

cheers


[PATCH 0/3] live partition migration vs cacheinfo

2019-06-11 Thread Nathan Lynch
Partition migration often results in the platform telling the OS to
replace all the cache nodes in the device tree. The cacheinfo code has
no knowledge of this, and continues to maintain references to the
deleted/detached nodes, causing subsequent CPU online/offline
operations to get warnings and oopses. This series addresses this
longstanding issue by providing an interface to the cacheinfo layer
that the migration code uses to rebuild the cacheinfo data structures
at a safe time after migration, with appropriate serialization vs CPU
hotplug.


Nathan Lynch (3):
  powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild
  powerpc/pseries/mobility: prevent cpu hotplug during DT update
  powerpc/pseries/mobility: rebuild cacheinfo hierarchy post-migration

 arch/powerpc/kernel/cacheinfo.c   | 21 +
 arch/powerpc/kernel/cacheinfo.h   |  4 
 arch/powerpc/platforms/pseries/mobility.c | 19 +++
 3 files changed, 44 insertions(+)

-- 
2.20.1



[PATCH 2/3] powerpc/pseries/mobility: prevent cpu hotplug during DT update

2019-06-11 Thread Nathan Lynch
CPU online/offline code paths are sensitive to parts of the device
tree (various cpu node properties, cache nodes) that can be changed as
a result of a migration.

Prevent CPU hotplug while the device tree is potentially inconsistent.

Fixes: 410bccf97881 ("powerpc/pseries: Partition migration in the kernel")
Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/mobility.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 88925f8ca8a0..edc1ec408589 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -9,6 +9,7 @@
  * 2 as published by the Free Software Foundation.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -338,11 +339,19 @@ void post_mobility_fixup(void)
if (rc)
printk(KERN_ERR "Post-mobility activate-fw failed: %d\n", rc);
 
+   /*
+* We don't want CPUs to go online/offline while the device
+* tree is being updated.
+*/
+   cpus_read_lock();
+
rc = pseries_devicetree_update(MIGRATION_SCOPE);
if (rc)
printk(KERN_ERR "Post-mobility device tree update "
"failed: %d\n", rc);
 
+   cpus_read_unlock();
+
/* Possibly switch to a new RFI flush type */
pseries_setup_rfi_flush();
 
-- 
2.20.1



[PATCH 3/3] powerpc/pseries/mobility: rebuild cacheinfo hierarchy post-migration

2019-06-11 Thread Nathan Lynch
It's common for the platform to replace the cache device nodes after a
migration. Since the cacheinfo code is never informed about this, it
never drops its references to the source system's cache nodes, causing
it to wind up in an inconsistent state resulting in warnings and oopses
as soon as CPU online/offline occurs after the migration, e.g.

cache for /cpus/l3-cache@3113(Unified) refers to cache for 
/cpus/l2-cache@200d(Unified)
WARNING: CPU: 15 PID: 86 at arch/powerpc/kernel/cacheinfo.c:176 
release_cache+0x1bc/0x1d0
[...]
NIP [c002d9bc] release_cache+0x1bc/0x1d0
LR [c002d9b8] release_cache+0x1b8/0x1d0
Call Trace:
[c001fc99fa70] [c002d9b8] release_cache+0x1b8/0x1d0 (unreliable)
[c001fc99fb10] [c002ebf4] cacheinfo_cpu_offline+0x1c4/0x2c0
[c001fc99fbe0] [c002ae58] unregister_cpu_online+0x1b8/0x260
[c001fc99fc40] [c0165a64] cpuhp_invoke_callback+0x114/0xf40
[c001fc99fcd0] [c0167450] cpuhp_thread_fun+0x270/0x310
[c001fc99fd40] [c01a8bb8] smpboot_thread_fn+0x2c8/0x390
[c001fc99fdb0] [c01a1cd8] kthread+0x1b8/0x1c0
[c001fc99fe20] [c000c2d4] ret_from_kernel_thread+0x5c/0x68

Using device tree notifiers won't work since we want to rebuild the
hierarchy only after all the removals and additions have occurred and
the device tree is in a consistent state. Call cacheinfo_teardown()
before processing device tree updates, and rebuild the hierarchy
afterward.

Fixes: 410bccf97881 ("powerpc/pseries: Partition migration in the kernel")
Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/mobility.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index edc1ec408589..b8c8096907d4 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include "pseries.h"
+#include "../../kernel/cacheinfo.h"
 
 static struct kobject *mobility_kobj;
 
@@ -345,11 +346,20 @@ void post_mobility_fixup(void)
 */
cpus_read_lock();
 
+   /*
+* It's common for the destination firmware to replace cache
+* nodes.  Release all of the cacheinfo hierarchy's references
+* before updating the device tree.
+*/
+   cacheinfo_teardown();
+
rc = pseries_devicetree_update(MIGRATION_SCOPE);
if (rc)
printk(KERN_ERR "Post-mobility device tree update "
"failed: %d\n", rc);
 
+   cacheinfo_rebuild();
+
cpus_read_unlock();
 
/* Possibly switch to a new RFI flush type */
-- 
2.20.1



[PATCH 1/3] powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild

2019-06-11 Thread Nathan Lynch
Allow external callers to force the cacheinfo code to release all its
references to cache nodes, e.g. before processing device tree updates
post-migration, and to rebuild the hierarchy afterward.

CPU online/offline must be blocked by callers; enforce this.

Fixes: 410bccf97881 ("powerpc/pseries: Partition migration in the kernel")
Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/cacheinfo.c | 21 +
 arch/powerpc/kernel/cacheinfo.h |  4 
 2 files changed, 25 insertions(+)

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index 862e2890bd3d..42c559efe060 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -896,4 +896,25 @@ void cacheinfo_cpu_offline(unsigned int cpu_id)
if (cache)
cache_cpu_clear(cache, cpu_id);
 }
+
+void cacheinfo_teardown(void)
+{
+   unsigned int cpu;
+
+   lockdep_assert_cpus_held();
+
+   for_each_online_cpu(cpu)
+   cacheinfo_cpu_offline(cpu);
+}
+
+void cacheinfo_rebuild(void)
+{
+   unsigned int cpu;
+
+   lockdep_assert_cpus_held();
+
+   for_each_online_cpu(cpu)
+   cacheinfo_cpu_online(cpu);
+}
+
 #endif /* (CONFIG_PPC_PSERIES && CONFIG_SUSPEND) || CONFIG_HOTPLUG_CPU */
diff --git a/arch/powerpc/kernel/cacheinfo.h b/arch/powerpc/kernel/cacheinfo.h
index 955f5e999f1b..52bd3fc6642d 100644
--- a/arch/powerpc/kernel/cacheinfo.h
+++ b/arch/powerpc/kernel/cacheinfo.h
@@ -6,4 +6,8 @@
 extern void cacheinfo_cpu_online(unsigned int cpu_id);
 extern void cacheinfo_cpu_offline(unsigned int cpu_id);
 
+/* Allow migration/suspend to tear down and rebuild the hierarchy. */
+extern void cacheinfo_teardown(void);
+extern void cacheinfo_rebuild(void);
+
 #endif /* _PPC_CACHEINFO_H */
-- 
2.20.1



Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Benjamin Herrenschmidt
On Tue, 2019-06-11 at 20:52 -0500, Larry Finger wrote:
> On 6/11/19 5:46 PM, Benjamin Herrenschmidt wrote:
> > On Tue, 2019-06-11 at 17:20 -0500, Larry Finger wrote:
> > > b43-pci-bridge 0001:11:00.0: dma_direct_supported: failed (mask = 0x3fff, min_mask = 0x5000/0x5000, dma bits = 0x1f
> > 
> > Ugh ? A mask with holes in it ? That's very wrong... That min_mask is
> > bogus.
> 
> I agree, but that is not likely serious as most systems will have enough
> memory that the max_pfn term will be much larger than the initial min_mask,
> and min_mask will be unchanged by the min function.

Well no... it's too much memory that is the problem. If min_mask is
bogus though, it will cause problems later too, so one should look into
it.

> In addition, min_mask is not used beyond this routine, and then only to
> decide if direct dma is supported. The following patch generates masks
> with no holes, but I cannot see that it is needed.

The right fix is to round up max_pfn to a power of 2, something like

min_mask = min_t(u64, min_mask,
		 (roundup_pow_of_two(max_pfn - 1)) << PAGE_SHIFT);

> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 2c2772e9702a..e3edd4f29e80 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -384,7 +384,8 @@ int dma_direct_supported(struct device *dev, u64 mask)
>  else
>  min_mask = DMA_BIT_MASK(32);
> 
> -   min_mask = min_t(u64, min_mask, (max_pfn - 1) << PAGE_SHIFT);
> +   min_mask = min_t(u64, min_mask, ((max_pfn - 1) << PAGE_SHIFT) |
> +                    DMA_BIT_MASK(PAGE_SHIFT));
> 
>  /*
>   * This check needs to be against the actual bit mask value, so
> 
> 
> Larry



Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Larry Finger

On 6/11/19 5:46 PM, Aaro Koskinen wrote:

Hi,

On Tue, Jun 11, 2019 at 05:20:12PM -0500, Larry Finger wrote:

It is obvious that the case of a mask smaller than min_mask should be
handled by the IOMMU. In my system, CONFIG_IOMMU_SUPPORT is selected. All
other CONFIG variables containing IOMMU are not selected. When
dma_direct_supported() fails, should the system not try for an IOMMU
solution? Is the driver asking for the wrong type of memory? It is doing a
dma_set_mask_and_coherent() call.


I don't think we have IOMMU on G4. On G5 it should work (I remember fixing
b43 issue on G5, see 4c374af5fdee, unfortunately all my G5 Macs with b43
are dead and waiting for re-capping).


You are right. My configuration has CONFIG_IOMMU_SUPPORT=y, but there is no 
mention of an IOMMU in the log.


Larry



Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Larry Finger

On 6/11/19 5:46 PM, Benjamin Herrenschmidt wrote:

On Tue, 2019-06-11 at 17:20 -0500, Larry Finger wrote:

b43-pci-bridge 0001:11:00.0: dma_direct_supported: failed (mask = 0x3fff, min_mask = 0x5000/0x5000, dma bits = 0x1f


Ugh ? A mask with holes in it ? That's very wrong... That min_mask is
bogus.


I agree, but that is not likely serious as most systems will have enough memory 
that the max_pfn term will be much larger than the initial min_mask, and 
min_mask will be unchanged by the min function. In addition, min_mask is not 
used beyond this routine, and then only to decide if direct dma is supported. 
The following patch generates masks with no holes, but I cannot see that it is 
needed.


diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 2c2772e9702a..e3edd4f29e80 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -384,7 +384,8 @@ int dma_direct_supported(struct device *dev, u64 mask)
else
min_mask = DMA_BIT_MASK(32);

-   min_mask = min_t(u64, min_mask, (max_pfn - 1) << PAGE_SHIFT);
+   min_mask = min_t(u64, min_mask, ((max_pfn - 1) << PAGE_SHIFT) |
+                    DMA_BIT_MASK(PAGE_SHIFT));

/*
 * This check needs to be against the actual bit mask value, so


Larry


Re: [PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure

2019-06-11 Thread Nadav Amit
> On Jun 11, 2019, at 5:52 PM, Nicholas Piggin  wrote:
> 
> Christoph Hellwig's on June 12, 2019 12:41 am:
>> Instead of passing a set of always repeated arguments down the
>> get_user_pages_fast iterators, create a struct gup_args to hold them and
>> pass that by reference.  This leads to an over 100 byte .text size
>> reduction for x86-64.
> 
> What does this do for performance? I've found this pattern can be
> bad for store aliasing detection.

Note that sometimes such an optimization can also have adverse effect due to
stack protector code that gcc emits when you use such structs.

Matthew Wilcox encountered such a case:
https://patchwork.kernel.org/patch/10702741/
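
For reference, the pattern under discussion is roughly the following
sketch (the field names are inferred from the commit message, not taken
from the literal patch):

	/* Hoist the arguments every gup iterator repeats into a single
	 * struct that is passed down by reference. */
	struct gup_args {
		unsigned long addr;
		unsigned long end;
		unsigned int flags;
		struct page **pages;
		int nr;
	};

	static int gup_level_range(struct gup_args *args)
	{
		/* ... walk one level, bumping args->nr ... */
		return 1;
	}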


Re: [PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure

2019-06-11 Thread Linus Torvalds
On Tue, Jun 11, 2019 at 2:55 PM Nicholas Piggin  wrote:
>
> What does this do for performance? I've found this pattern can be
> bad for store aliasing detection.

I wouldn't expect it to be noticeable, and the lack of argument
reloading etc should make up for it. Plus inlining makes it a
non-issue when that happens.

But I guess we could also at least look at using "restrict", if that
ends up helping. Unlike the completely bogus type-based aliasing rules
(that we disable because I think the C people were on some bad bad
drugs when they came up with them), restricted pointers are a real
thing that makes sense.

That said, we haven't traditionally used it, and I don't know how much
it helps gcc. Maybe gcc ignores it entirely?
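
For illustration, restricted pointers look like this in plain C (a
generic example, not kernel code):

	/* 'restrict' promises the compiler that dst and src never
	 * alias, so loads and stores may be reordered freely. */
	void scale(float *restrict dst, const float *restrict src, int n)
	{
		for (int i = 0; i < n; i++)
			dst[i] = src[i] * 2.0f;
	}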

   Linus


Re: [PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure

2019-06-11 Thread Nicholas Piggin
Christoph Hellwig's on June 12, 2019 12:41 am:
> Instead of passing a set of always repeated arguments down the
> get_user_pages_fast iterators, create a struct gup_args to hold them and
> pass that by reference.  This leads to an over 100 byte .text size
> reduction for x86-64.

What does this do for performance? I've found this pattern can be
bad for store aliasing detection.

Thanks,
Nick


Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-11 Thread Benjamin Herrenschmidt
On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> 
> > So, to summarize:
> > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > call it again with 32, which makes our device pretty much unusable.
> > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > 59 will be set and the host won't like it (I checked it). In addition,
> > I might get addresses above 50 bits, which my device can't generate.
> > 
> > I hope this makes things more clear. Now, please explain to me how I
> > can call pci_set_dma_mask without any regard to whether I run on
> > x86-64 or POWER9, considering what I wrote above ?
> > 
> > Thanks,
> > Oded
> 
> Adding ppc mailing list.

You can't. Your device is broken. Devices that don't support DMAing to
the full 64-bit deserve to be added to the trash pile.

As a result, getting it to work will require hacks. Some GPUs have
similar issues and require similar hacks, it's unfortunate.

Added a couple of guys on CC who might be able to help get those hacks
right.

It's still very fishy .. the idea is to detect the case where setting a
64-bit mask will give your system memory mapped at a fixed high address
(1 << 59 in our case) and program that in your chip in the "Fixed high
bits" register that you seem to have (also make sure it doesn't affect
MSIs or it will break them).

This will only work as long as all of the system memory can be
addressed at an offset from that fixed address that itself fits your
device addressing capabilities (50 bits in this case). It may or may
not be the case but there's no way to check since the DMA mask logic
won't really apply.
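
Spelled out, the constraint is roughly the following sketch (the macro
names are illustrative; the 50-bit limit and bit 59 come from this
thread, not from any real driver):

	#define DEV_ADDR_BITS	50		/* device's real reach */
	#define FIXED_HIGH_BIT	(1ULL << 59)	/* POWER9 bypass offset */

	/* Every physical address must still fit below the device limit
	 * once the fixed high bit is programmed into the chip. */
	static bool memory_fits(u64 max_phys_addr)
	{
		return max_phys_addr < (1ULL << DEV_ADDR_BITS);
	}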

You might want to consider fixing your HW in the next iteration... This
is going to bite you when x86 increases the max physical memory for
example, or on other architectures.

Cheers,
Ben.






Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Benjamin Herrenschmidt
On Tue, 2019-06-11 at 17:20 -0500, Larry Finger wrote:
> b43-pci-bridge 0001:11:00.0: dma_direct_supported: failed (mask = 0x3fff, min_mask = 0x5000/0x5000, dma bits = 0x1f

Ugh ? A mask with holes in it ? That's very wrong... That min_mask is
bogus.

Ben.




Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Aaro Koskinen
Hi,

On Tue, Jun 11, 2019 at 05:20:12PM -0500, Larry Finger wrote:
> It is obvious that the case of a mask smaller than min_mask should be
> handled by the IOMMU. In my system, CONFIG_IOMMU_SUPPORT is selected. All
> other CONFIG variables containing IOMMU are not selected. When
> dma_direct_supported() fails, should the system not try for an IOMMU
> solution? Is the driver asking for the wrong type of memory? It is doing a
> dma_set_mask_and_coherent() call.

I don't think we have IOMMU on G4. On G5 it should work (I remember fixing
b43 issue on G5, see 4c374af5fdee, unfortunately all my G5 Macs with b43
are dead and waiting for re-capping).

A.


Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Larry Finger

On 6/11/19 1:05 AM, Christoph Hellwig wrote:

On Mon, Jun 10, 2019 at 11:09:47AM -0500, Larry Finger wrote:

What might be confusing in your output is that dev->dma_mask is a pointer,
and we are setting it in dma_set_mask.  That is before we only check
if the pointer is set, and later we override it.  Of course this doesn't
actually explain the failure.  But what is even more strange to me
is that you get a return value from dma_supported() that isn't 0 or 1,
as that function is supposed to return a boolean, and I really can't see
how mask >= __phys_to_dma(dev, min_mask), would return anything but 0
or 1.  Does the output change if you use the correct printk specifiers?

i.e. with a debug patch like this:


diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 2c2772e9702a..9e5b30b12b10 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -378,6 +378,7 @@ EXPORT_SYMBOL(dma_direct_map_resource);
  int dma_direct_supported(struct device *dev, u64 mask)
  {
u64 min_mask;
+   bool ret;
  
  	if (IS_ENABLED(CONFIG_ZONE_DMA))

min_mask = DMA_BIT_MASK(ARCH_ZONE_DMA_BITS);
@@ -391,7 +392,12 @@ int dma_direct_supported(struct device *dev, u64 mask)
 * use __phys_to_dma() here so that the SME encryption mask isn't
 * part of the check.
 */
-   return mask >= __phys_to_dma(dev, min_mask);
+   ret = (mask >= __phys_to_dma(dev, min_mask));
+   if (!ret)
+   dev_info(dev,
+   "%s: failed (mask = 0x%llx, min_mask = 0x%llx/0x%llx, dma 
bits = %d\n",
+   __func__, mask, min_mask, __phys_to_dma(dev, min_mask), 
ARCH_ZONE_DMA_BITS);
+   return ret;
  }
  
  size_t dma_direct_max_mapping_size(struct device *dev)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index f7afdadb6770..6c57ccdee2ae 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -317,8 +317,14 @@ void arch_dma_set_mask(struct device *dev, u64 mask);
  
  int dma_set_mask(struct device *dev, u64 mask)

  {
-   if (!dev->dma_mask || !dma_supported(dev, mask))
+   if (!dev->dma_mask) {
+   dev_info(dev, "no DMA mask set!\n");
return -EIO;
+   }
+   if (!dma_supported(dev, mask)) {
+   printk("DMA not supported\n");
+   return -EIO;
+   }
  
  	arch_dma_set_mask(dev, mask);

dma_check_mask(dev, mask);



After I got the correct formatting, the output with this patch only gives the 
following in dmesg:


b43-pci-bridge 0001:11:00.0: dma_direct_supported: failed (mask = 0x3fff, min_mask = 0x5000/0x5000, dma bits = 0x1f

DMA not supported
b43legacy-phy0 ERROR: The machine/kernel does not support the required 30-bit 
DMA mask


Your first patch did not work as the configuration does not have 
CONFIG_ZONE_DMA. As a result, the initial value of min_mask always starts at 32 
bits and is taken down to 31 with the maximum pfn minimization. When I forced 
the initial value of min_mask to 30 bits, the device worked.


It is obvious that the case of a mask smaller than min_mask should be handled by 
the IOMMU. In my system, CONFIG_IOMMU_SUPPORT is selected. All other CONFIG 
variables containing IOMMU are not selected. When dma_direct_supported() fails, 
should the system not try for an IOMMU solution? Is the driver asking for the 
wrong type of memory? It is doing a dma_set_mask_and_coherent() call.


Larry


[Bug 203837] Booting kernel under KVM immediately freezes host

2019-06-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203837

--- Comment #4 from Shawn Anastasio (sh...@anastas.io) ---
I have applied Nick's patchset to 5.1.7 but the issue still occurs.

As for using pdbg, I'm aware of the tool's existence but I'm not sure how
I would effectively use it to diagnose this issue. If anybody has some
pointers, it'd be appreciated.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH 10/16] mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP

2019-06-11 Thread Khalid Aziz
On 6/11/19 8:40 AM, Christoph Hellwig wrote:
> We only support the generic GUP now, so rename the config option to
> be more clear, and always use the mm/Kconfig definition of the
> symbol and select it from the arch Kconfigs.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm/Kconfig | 5 +
>  arch/arm64/Kconfig   | 4 +---
>  arch/mips/Kconfig| 2 +-
>  arch/powerpc/Kconfig | 2 +-
>  arch/s390/Kconfig| 2 +-
>  arch/sh/Kconfig  | 2 +-
>  arch/sparc/Kconfig   | 2 +-
>  arch/x86/Kconfig | 4 +---
>  mm/Kconfig   | 2 +-
>  mm/gup.c | 4 ++--
>  10 files changed, 11 insertions(+), 18 deletions(-)
> 

Looks good.

Reviewed-by: Khalid Aziz 




Re: [PATCH 09/16] sparc64: use the generic get_user_pages_fast code

2019-06-11 Thread Khalid Aziz
On 6/11/19 8:40 AM, Christoph Hellwig wrote:
> The sparc64 code is mostly equivalent to the generic one, minus various
> bugfixes and two arch overrides that this patch adds to pgtable.h.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/sparc/Kconfig  |   1 +
>  arch/sparc/include/asm/pgtable_64.h |  18 ++
>  arch/sparc/mm/Makefile  |   2 +-
>  arch/sparc/mm/gup.c | 340 
>  4 files changed, 20 insertions(+), 341 deletions(-)
>  delete mode 100644 arch/sparc/mm/gup.c
> 

Reviewed-by: Khalid Aziz 




Re: [PATCH 08/16] sparc64: define untagged_addr()

2019-06-11 Thread Khalid Aziz
On 6/11/19 8:40 AM, Christoph Hellwig wrote:
> Add a helper to untag a user pointer.  This is needed for ADI support
> in get_user_pages_fast.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/sparc/include/asm/pgtable_64.h | 22 ++
>  1 file changed, 22 insertions(+)

Looks good to me.

Reviewed-by: Khalid Aziz 

> 
> diff --git a/arch/sparc/include/asm/pgtable_64.h 
> b/arch/sparc/include/asm/pgtable_64.h
> index f0dcf991d27f..1904782dcd39 100644
> --- a/arch/sparc/include/asm/pgtable_64.h
> +++ b/arch/sparc/include/asm/pgtable_64.h
> @@ -1076,6 +1076,28 @@ static inline int io_remap_pfn_range(struct 
> vm_area_struct *vma,
>  }
>  #define io_remap_pfn_range io_remap_pfn_range 
>  
> +static inline unsigned long untagged_addr(unsigned long start)
> +{
> + if (adi_capable()) {
> + long addr = start;
> +
> + /* If userspace has passed a versioned address, kernel
> +  * will not find it in the VMAs since it does not store
> +  * the version tags in the list of VMAs. Storing version
> +  * tags in list of VMAs is impractical since they can be
> +  * changed any time from userspace without dropping into
> +  * kernel. Any address search in VMAs will be done with
> +  * non-versioned addresses. Ensure the ADI version bits
> +  * are dropped here by sign extending the last bit before
> +  * ADI bits. IOMMU does not implement version tags.
> +  */
> + return (addr << (long)adi_nbits()) >> (long)adi_nbits();
> + }
> +
> + return start;
> +}
> +#define untagged_addr untagged_addr
> +
>  #include 
>  #include 
>  
> 




Re: [PATCH 01/16] mm: use untagged_addr() for get_user_pages_fast addresses

2019-06-11 Thread Khalid Aziz
On 6/11/19 8:40 AM, Christoph Hellwig wrote:
> This will allow sparc64 to override its ADI tags for
> get_user_pages and get_user_pages_fast.
> 
> Signed-off-by: Christoph Hellwig 
> ---

Commit message is sparc64 specific but the goal here is to allow any
architecture with memory tagging to use this. So I would suggest
rewording the commit log. Other than that:

Reviewed-by: Khalid Aziz 

>  mm/gup.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index ddde097cf9e4..6bb521db67ec 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2146,7 +2146,7 @@ int __get_user_pages_fast(unsigned long start, int 
> nr_pages, int write,
>   unsigned long flags;
>   int nr = 0;
>  
> - start &= PAGE_MASK;
> + start = untagged_addr(start) & PAGE_MASK;
>   len = (unsigned long) nr_pages << PAGE_SHIFT;
>   end = start + len;
>  
> @@ -2219,7 +2219,7 @@ int get_user_pages_fast(unsigned long start, int 
> nr_pages,
>   unsigned long addr, len, end;
>   int nr = 0, ret = 0;
>  
> - start &= PAGE_MASK;
> + start = untagged_addr(start) & PAGE_MASK;
>   addr = start;
>   len = (unsigned long) nr_pages << PAGE_SHIFT;
>   end = start + len;
> 




Re: [PATCH v3 06/20] docs: mark orphan documents as such

2019-06-11 Thread Andy Shevchenko
On Tue, Jun 11, 2019 at 8:05 PM Mauro Carvalho Chehab
 wrote:
>
> On Tue, 11 Jun 2019 19:52:04 +0300
> Andy Shevchenko  wrote:
>
> > On Fri, Jun 7, 2019 at 10:04 PM Mauro Carvalho Chehab
> >  wrote:
> > > Sphinx doesn't like orphan documents:
> >
> > > Documentation/laptops/lg-laptop.rst: WARNING: document isn't included 
> > > in any toctree
> >
> > >  Documentation/laptops/lg-laptop.rst | 2 ++
> >
> > > diff --git a/Documentation/laptops/lg-laptop.rst 
> > > b/Documentation/laptops/lg-laptop.rst
> > > index aa503ee9b3bc..f2c2ffe31101 100644
> > > --- a/Documentation/laptops/lg-laptop.rst
> > > +++ b/Documentation/laptops/lg-laptop.rst
> > > @@ -1,5 +1,7 @@
> > >  .. SPDX-License-Identifier: GPL-2.0+
> > >
> > > +:orphan:
> > > +
> > >  LG Gram laptop extra features
> > >  =============================
> > >
> >
> > Can we rather create a toc tree there?
> > It was a first document in reST format in that folder.
>
> Sure, but:
>
> 1) I have a patch converting the other files on this dir to rst:
>
> 
> https://git.linuxtv.org/mchehab/experimental.git/commit/?h=convert_rst_renames_v4.1&id=abc13233035fdfdbc5ef2f2fbd3d127a1ab15530
>
> 2) It probably makes sense to move the entire dir to
> Documentation/admin-guide.
>
> So, I would prefer to have the :orphan: here while (1) is not merged.

Fine by me, as long as you drop it as part of the mentioned effort.

-- 
With Best Regards,
Andy Shevchenko


[PATCH] cxl: no need to check return value of debugfs_create functions

2019-06-11 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Because there's no need to check, also make the return value of the
local debugfs_create_io_x64() call void, as no one ever did anything
with the return value (as they did not need to.)

Cc: Frederic Barrat 
Cc: Andrew Donnellan 
Cc: Arnd Bergmann 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/misc/cxl/debugfs.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/cxl/debugfs.c b/drivers/misc/cxl/debugfs.c
index 1fda22c24c93..27f3bcb7d939 100644
--- a/drivers/misc/cxl/debugfs.c
+++ b/drivers/misc/cxl/debugfs.c
@@ -26,11 +26,11 @@ static int debugfs_io_u64_set(void *data, u64 val)
 DEFINE_DEBUGFS_ATTRIBUTE(fops_io_x64, debugfs_io_u64_get, debugfs_io_u64_set,
 "0x%016llx\n");
 
-static struct dentry *debugfs_create_io_x64(const char *name, umode_t mode,
-   struct dentry *parent, u64 __iomem 
*value)
+static void debugfs_create_io_x64(const char *name, umode_t mode,
+ struct dentry *parent, u64 __iomem *value)
 {
-   return debugfs_create_file_unsafe(name, mode, parent,
- (void __force *)value, &fops_io_x64);
+   debugfs_create_file_unsafe(name, mode, parent, (void __force *)value,
+  &fops_io_x64);
 }
 
 void cxl_debugfs_add_adapter_regs_psl9(struct cxl *adapter, struct dentry *dir)
@@ -64,8 +64,6 @@ int cxl_debugfs_adapter_add(struct cxl *adapter)
 
snprintf(buf, 32, "card%i", adapter->adapter_num);
dir = debugfs_create_dir(buf, cxl_debugfs);
-   if (IS_ERR(dir))
-   return PTR_ERR(dir);
adapter->debugfs = dir;
 
debugfs_create_io_x64("err_ivte", S_IRUSR, dir, _cxl_p1_addr(adapter, 
CXL_PSL_ErrIVTE));
@@ -106,8 +104,6 @@ int cxl_debugfs_afu_add(struct cxl_afu *afu)
 
snprintf(buf, 32, "psl%i.%i", afu->adapter->adapter_num, afu->slice);
dir = debugfs_create_dir(buf, afu->adapter->debugfs);
-   if (IS_ERR(dir))
-   return PTR_ERR(dir);
afu->debugfs = dir;
 
debugfs_create_io_x64("sr", S_IRUSR, dir, _cxl_p1n_addr(afu, 
CXL_PSL_SR_An));
@@ -129,15 +125,10 @@ void cxl_debugfs_afu_remove(struct cxl_afu *afu)
 
 int __init cxl_debugfs_init(void)
 {
-   struct dentry *ent;
-
if (!cpu_has_feature(CPU_FTR_HVMODE))
return 0;
 
-   ent = debugfs_create_dir("cxl", NULL);
-   if (IS_ERR(ent))
-   return PTR_ERR(ent);
-   cxl_debugfs = ent;
+   cxl_debugfs = debugfs_create_dir("cxl", NULL);
 
return 0;
 }
-- 
2.22.0
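
To illustrate the pattern this commit message relies on, here is a minimal
sketch of debugfs usage without return-value checks (a hypothetical "mydrv"
driver, not the cxl code; if debugfs is disabled or a call fails, passing the
resulting pointer as a parent simply turns the child calls into no-ops):

	#include <linux/debugfs.h>

	static struct dentry *mydrv_dir;
	static u32 mydrv_counter;

	static void mydrv_debugfs_init(void)
	{
		/* No IS_ERR()/NULL checks anywhere: debugfs failures must
		 * never change the driver's behavior. */
		mydrv_dir = debugfs_create_dir("mydrv", NULL);
		debugfs_create_u32("counter", 0444, mydrv_dir, &mydrv_counter);
	}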



Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook

2019-06-11 Thread Andreas Schwab
On Jun 10 2019, Larry Finger  wrote:

> I do not understand why the if statement returns true as neither of the
> values is zero.

That's because the format string does not make any sense.  You are
printing garbage.

> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index f7afdad..ba2489d 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -317,9 +317,12 @@ int dma_supported(struct device *dev, u64 mask)
>
>  int dma_set_mask(struct device *dev, u64 mask)
>  {
> +   pr_info("mask 0x%llx, dma_mask 0x%llx, dma_supported 0x%llx\n",
> mask, dev->dma_mask,
> +   dma_supported(dev, mask));

None of the format directives match the type of the arguments.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [RFC V3] mm: Generalize and rename notify_page_fault() as kprobe_page_fault()

2019-06-11 Thread Leonardo Bras
On Tue, 2019-06-11 at 10:44 +0530, Anshuman Khandual wrote:
> 
> On 06/10/2019 08:57 PM, Leonardo Bras wrote:
> > On Mon, 2019-06-10 at 08:09 +0530, Anshuman Khandual wrote:
> > > > > +/*
> > > > > + * To be potentially processing a kprobe fault and to be allowed
> > > > > + * to call kprobe_running(), we have to be non-preemptible.
> > > > > + */
> > > > > +if (kprobes_built_in() && !preemptible() && !user_mode(regs)) {
> > > > > +if (kprobe_running() && kprobe_fault_handler(regs, trap))
> > > > 
> > > > don't need an 'if A if B', can do 'if A && B'
> > > 
> > > Which will make it a very lengthy condition check.
> > 
> > Well, is there any problem line-breaking the if condition?
> > 
> > if (A && B && C &&
> > D && E )
> > 
> > Also, if it's used only to decide the return value, maybe it would be
> > fine to do something like this:
> > 
> > return (A && B && C &&
> > D && E ); 
> 
> Got it. But as Dave and Matthew had pointed out earlier, the current x86
> implementation has better readability. Hence will probably stick with it.
> 
Sure, I agree with them. It's way more readable.
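
For reference, the consolidated helper under discussion would look roughly
like this (a sketch modeled on the x86-style check quoted above; the exact
name and signature in the final patch may differ):

	static nokprobe_inline bool kprobe_page_fault(struct pt_regs *regs,
						      unsigned int trap)
	{
		bool ret = false;

		/*
		 * To be potentially processing a kprobe fault and to be allowed
		 * to call kprobe_running(), we have to be non-preemptible.
		 */
		if (kprobes_built_in() && !preemptible() && !user_mode(regs)) {
			if (kprobe_running() && kprobe_fault_handler(regs, trap))
				ret = true;
		}
		return ret;
	}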




Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9

2019-06-11 Thread Oded Gabbay
On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay  wrote:
>
> On Tue, Jun 11, 2019 at 6:26 PM Greg KH  wrote:
> >
> > On Tue, Jun 11, 2019 at 08:17:53AM -0700, Christoph Hellwig wrote:
> > > On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> > > > That feels like a big hack.  ppc doesn't have any "what arch am I
> > > > running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> > > > ok to take this for now, but odds are you need a better fix for this
> > > > sometime...
> > >
> > > That isn't the worst part of it.  The whole idea of checking what I'm
> > > running to set a dma mask just doesn't make any sense at all.
> >
> > Oded, I thought I asked if there was a dma call you should be making to
> > keep this type of check from being needed.  What happened to that?  As
> > Christoph points out, none of this should be needed, which is what I
> > thought I originally said :)
> >
> > thanks,
> >
> > greg k-h
>
> I'm sorry, but it seems I can't explain what's my problem because you
> and Christoph keep mentioning the pci_set_dma_mask() but it doesn't
> help me.
> I'll try again to explain.
>
> The main problem specifically for Goya device, is that I can't call
> this function with *the same parameter* for POWER9 and x86-64, because
> x86-64 supports dma mask of 48-bits while POWER9 supports only 32-bits
> or 64-bits.
>
> The main limitation in my Goya device is that it can generate PCI
> outbound transactions with addresses from 0 to (2^50 - 1).
> That's why when we first integrated it in x86-64, we used a DMA mask
> of 48-bits, by calling pci_set_dma_mask(pdev, 48). That way, the
> kernel ensures me that all the DMA addresses are from 0 to (2^48 - 1),
> and that address range is accessible by my device.
>
> If for some reason, the x86-64 machine doesn't support 48-bits, the
> standard fallback code in ALL the drivers I have seen is to set the
> DMA mask to 32-bits. And that's how my current driver's code is
> written.
>
> Now, when I tried to integrate Goya into a POWER9 machine, I got a
> reject from the call to pci_set_dma_mask(pdev, 48). The standard code,
> as I wrote above, is to call the same function with 32-bits. That
> works BUT it is not practical, as our applications require much more
> memory mapped than 32-bits. In addition, once you add more cards which
> are all mapped to the same range, it is simply not usable at all.
>
> Therefore, I consulted with POWER people and they told me I can call
> to pci_set_dma_mask with the mask as 64, but I must make sure that ALL
> outbound transactions from Goya will be with bit 59 set in the
> address.
> I can achieve that with a dedicated configuration I make in Goya's
> PCIe controller. That's what I did and that works.
>
> So, to summarize:
> If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> in runtime, I don't know if its POWER9 or not, so upon failure I will
> call it again with 32, which makes our device pretty much unusable.
> If I call pci_set_dma_mask with 64, and do the dedicated configuration
> in Goya's PCIe controller, then it won't work on x86-64, because bit
> 59 will be set and the host won't like it (I checked it). In addition,
> I might get addresses above 50 bits, which my device can't generate.
>
> I hope this makes things more clear. Now, please explain to me how I
> can call pci_set_dma_mask without any regard to whether I run on
> x86-64 or POWER9, considering what I wrote above ?
>
> Thanks,
> Oded

Adding ppc mailing list.

Oded
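
In rough code form, the 48-bit probe with 32-bit fallback described above is
(a sketch of the standard pattern, not the actual habanalabs driver code):

	rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
	if (rc) {
		/* Standard fallback - impractically small for this device. */
		rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
		if (rc)
			return rc;
	}

On POWER9 the first call is rejected, so this pattern strands the device at
32 bits - which is exactly the problem described above.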


Re: Question - check in runtime which architecture am I running on

2019-06-11 Thread Oded Gabbay
On Tue, Jun 11, 2019 at 5:07 PM Christoph Hellwig  wrote:
>
> On Tue, Jun 11, 2019 at 03:30:08PM +0300, Oded Gabbay wrote:
> > Hello POWER developers,
> >
> > I'm trying to find out if there is an internal kernel API so that a
> > PCI driver can call it to check if its PCI device is running inside a
> > POWER9 machine. Alternatively, if that's not available, if it is
> > running on a machine with powerpc architecture.
>
> Your driver has absolutely no business knowing this.
>
> >
> > I need this information as my device (Goya AI accelerator)
> > unfortunately needs a slightly different configuration of its PCIe
> > controller in case of POWER9 (need to set bit 59 to be 1 in all
> > outbound transactions).
>
> No, it doesn't.  You can query the output from dma_get_required_mask
> to optimize for the DMA addresses you get, and otherwise you simply
> set the maximum dma mask you support.  That is about the control you
> get, and nothing else is a driver's business.

I don't want to conduct two discussions as I saw you answered on my patch.
I'll add the ppc mailing list to my patch.
Oded
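
A minimal sketch of the approach Christoph describes - set the widest mask
the hardware supports and only optimize based on what the platform reports
(use_64bit_descriptors() is a hypothetical helper for illustration):

	/* Advertise the device's real capability, here 50 bits. */
	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(50));
	if (rc)
		return rc;

	/* Adapt to the addresses the platform will actually hand out. */
	if (dma_get_required_mask(&pdev->dev) > DMA_BIT_MASK(32))
		use_64bit_descriptors();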


Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-11 Thread Christophe Leroy




Le 11/06/2019 à 18:30, Horia Geanta a écrit :
> On 6/11/2019 6:40 PM, Christophe Leroy wrote:
>> Le 11/06/2019 à 17:37, Horia Geanta a écrit :
>>> On 6/11/2019 5:39 PM, Christophe Leroy wrote:
>>>> This series is the last set of fixes for the Talitos driver.
>>>>
>>>> We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
>>>> SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:
>>>>
>>> I am getting below failures on a sec 3.3.2 (p1020rdb) for hmac(sha384) and
>>> hmac(sha512):
>>>
>> Is that new with this series or did you already have it before ?
>>
> Looks like this happens with or without this series.
>
> I haven't checked the state of this driver for quite some time.
> Since I've noticed increased activity, I thought it would be worth
> actually testing the changes.
>
> Are changes in patch 2/4 ("crypto: talitos - fix hash on SEC1.")
> strictly for sec 1.x, or do they affect all revisions?

They are strictly for sec 1.x.

>> What do you mean by "fuzz testing" enabled ? Is that
>> CONFIG_CRYPTO_MANAGER_EXTRA_TESTS or something else ?
>>
> Yes, it's this config symbol.

Indeed SEC 2.2 only supports up to SHA-256.

Christophe

> Horia



Re: [PATCH v3 3/3] powerpc: Add support to initialize ima policy rules

2019-06-11 Thread Nayna




On 06/11/2019 01:19 AM, Satheesh Rajendran wrote:
> On Mon, Jun 10, 2019 at 04:33:57PM -0400, Nayna Jain wrote:
>> PowerNV secure boot relies on the kernel IMA security subsystem to
>> perform the OS kernel image signature verification. Since each secure
>> boot mode has different IMA policy requirements, dynamic definition of
>> the policy rules based on the runtime secure boot mode of the system is
>> required. On systems that support secure boot, but have it disabled,
>> only measurement policy rules of the kernel image and modules are
>> defined.
>>
>> This patch defines the arch-specific implementation to retrieve the
>> secure boot mode of the system and accordingly configures the IMA policy
>> rules.
>>
>> This patch provides arch-specific IMA policies if PPC_SECURE_BOOT
>> config is enabled.
>>
>> Signed-off-by: Nayna Jain 
>> ---
>>   arch/powerpc/Kconfig   | 14 +
>>   arch/powerpc/kernel/Makefile   |  1 +
>>   arch/powerpc/kernel/ima_arch.c | 54 ++
>>   include/linux/ima.h|  3 +-
>>   4 files changed, 71 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/powerpc/kernel/ima_arch.c
>
> Hi,
>
> This series failed to build against linuxppc/merge tree with
> `ppc64le_defconfig`,
>
> arch/powerpc/platforms/powernv/secboot.c:14:6: error: redefinition of
> 'get_powerpc_sb_mode'
> 14 | bool get_powerpc_sb_mode(void)
>    |  ^~~
> In file included from arch/powerpc/platforms/powernv/secboot.c:11:
> ./arch/powerpc/include/asm/secboot.h:15:20: note: previous definition of
> 'get_powerpc_sb_mode' was here
> 15 | static inline bool get_powerpc_sb_mode(void)
>    |^~~
> make[3]: *** [scripts/Makefile.build:278:
> arch/powerpc/platforms/powernv/secboot.o] Error 1
> make[3]: *** Waiting for unfinished jobs
> make[2]: *** [scripts/Makefile.build:489: arch/powerpc/platforms/powernv] Error 2
> make[1]: *** [scripts/Makefile.build:489: arch/powerpc/platforms] Error 2
> make: *** [Makefile:1071: arch/powerpc] Error 2
> make: *** Waiting for unfinished jobs



Thanks for reporting. I have fixed it and reposted as v4.

Please retry.

Thanks & Regards,
 - Nayna




[PATCH v4 2/3] powerpc/powernv: detect the secure boot mode of the system

2019-06-11 Thread Nayna Jain
PowerNV secure boot defines different IMA policies based on the secure
boot state of the system.

This patch defines a function to detect the secure boot state of the
system.

Signed-off-by: Nayna Jain 
---
 arch/powerpc/include/asm/secboot.h   | 21 
 arch/powerpc/platforms/powernv/Makefile  |  2 +-
 arch/powerpc/platforms/powernv/secboot.c | 61 
 3 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/secboot.h
 create mode 100644 arch/powerpc/platforms/powernv/secboot.c

diff --git a/arch/powerpc/include/asm/secboot.h 
b/arch/powerpc/include/asm/secboot.h
new file mode 100644
index ..1904fb4a3352
--- /dev/null
+++ b/arch/powerpc/include/asm/secboot.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * PowerPC secure boot definitions
+ *
+ * Copyright (C) 2019 IBM Corporation
+ * Author: Nayna Jain 
+ *
+ */
+#ifndef POWERPC_SECBOOT_H
+#define POWERPC_SECBOOT_H
+
+#if defined(CONFIG_OPAL_SECVAR)
+extern bool get_powerpc_sb_mode(void);
+#else
+static inline bool get_powerpc_sb_mode(void)
+{
+   return false;
+}
+#endif
+
+#endif
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index 6651c742e530..6f4af607a915 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -16,4 +16,4 @@ obj-$(CONFIG_PERF_EVENTS) += opal-imc.o
 obj-$(CONFIG_PPC_MEMTRACE) += memtrace.o
 obj-$(CONFIG_PPC_VAS)  += vas.o vas-window.o vas-debug.o
 obj-$(CONFIG_OCXL_BASE)+= ocxl.o
-obj-$(CONFIG_OPAL_SECVAR) += opal-secvar.o
+obj-$(CONFIG_OPAL_SECVAR) += opal-secvar.o secboot.o
diff --git a/arch/powerpc/platforms/powernv/secboot.c 
b/arch/powerpc/platforms/powernv/secboot.c
new file mode 100644
index ..9199e520ebed
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/secboot.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 IBM Corporation
+ * Author: Nayna Jain 
+ *
+ * secboot.c
+ *  - util function to get powerpc secboot state
+ */
+#include 
+#include 
+#include 
+#include 
+
+bool get_powerpc_sb_mode(void)
+{
+   u8 secure_boot_name[] = "SecureBoot";
+   u8 setup_mode_name[] = "SetupMode";
+   u8 secboot, setupmode;
+   unsigned long size = sizeof(secboot);
+   int status;
+   unsigned long version;
+
+   status = opal_variable_version(&version);
+   if ((status != OPAL_SUCCESS) || (version != BACKEND_TC_COMPAT_V1)) {
+   pr_info("secboot: error retrieving compatible backend\n");
+   return false;
+   }
+
+   status = opal_get_variable(secure_boot_name, sizeof(secure_boot_name),
+  NULL, NULL, &secboot, &size);
+
+   /*
+* For now assume all failures reading the SecureBoot variable implies
+* secure boot is not enabled. Later differentiate failure types.
+*/
+   if (status != OPAL_SUCCESS) {
+   secboot = 0;
+   setupmode = 0;
+   goto out;
+   }
+
+   size = sizeof(setupmode);
+   status = opal_get_variable(setup_mode_name, sizeof(setup_mode_name),
+  NULL, NULL,  &setupmode, &size);
+
+   /*
+* Failure to read the SetupMode variable does not prevent
+* secure boot mode
+*/
+   if (status != OPAL_SUCCESS)
+   setupmode = 0;
+
+out:
+   if ((secboot == 0) || (setupmode == 1)) {
+   pr_info("secboot: secureboot mode disabled\n");
+   return false;
+   }
+
+   pr_info("secboot: secureboot mode enabled\n");
+   return true;
+}
-- 
2.20.1



[PATCH v4 3/3] powerpc: Add support to initialize ima policy rules

2019-06-11 Thread Nayna Jain
PowerNV secure boot relies on the kernel IMA security subsystem to
perform the OS kernel image signature verification. Since each secure
boot mode has different IMA policy requirements, dynamic definition of
the policy rules based on the runtime secure boot mode of the system is
required. On systems that support secure boot, but have it disabled,
only measurement policy rules of the kernel image and modules are
defined.

This patch defines the arch-specific implementation to retrieve the
secure boot mode of the system and accordingly configures the IMA policy
rules.

This patch provides arch-specific IMA policies if PPC_SECURE_BOOT
config is enabled.

Signed-off-by: Nayna Jain 
---
 arch/powerpc/Kconfig   | 14 +
 arch/powerpc/kernel/Makefile   |  1 +
 arch/powerpc/kernel/ima_arch.c | 54 ++
 include/linux/ima.h|  3 +-
 4 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/kernel/ima_arch.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636308c8..9de77bb14f54 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -902,6 +902,20 @@ config PPC_MEM_KEYS
 
  If unsure, say y.
 
+config PPC_SECURE_BOOT
+   prompt "Enable PowerPC Secure Boot"
+   bool
+   default n
+   depends on PPC64
+   depends on OPAL_SECVAR
+   depends on IMA
+   depends on IMA_ARCH_POLICY
+   help
+ Linux on POWER with firmware secure boot enabled needs to define
+	 security policies to extend secure boot to the OS. This config
+	 allows the user to enable OS Secure Boot on PowerPC systems that
+ have firmware secure boot support.
+
 endmenu
 
 config ISA_DMA_API
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 0ea6c4aa3a20..75c929b41341 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -131,6 +131,7 @@ ifdef CONFIG_IMA
 obj-y  += ima_kexec.o
 endif
 endif
+obj-$(CONFIG_PPC_SECURE_BOOT)  += ima_arch.o
 
 obj-$(CONFIG_AUDIT)+= audit.o
 obj64-$(CONFIG_AUDIT)  += compat_audit.o
diff --git a/arch/powerpc/kernel/ima_arch.c b/arch/powerpc/kernel/ima_arch.c
new file mode 100644
index ..1767bf6e6550
--- /dev/null
+++ b/arch/powerpc/kernel/ima_arch.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 IBM Corporation
+ * Author: Nayna Jain 
+ *
+ * ima_arch.c
+ *  - initialize ima policies for PowerPC Secure Boot
+ */
+
+#include 
+#include 
+
+bool arch_ima_get_secureboot(void)
+{
+   bool sb_mode;
+
+   sb_mode = get_powerpc_sb_mode();
+   if (sb_mode)
+   return true;
+   else
+   return false;
+}
+
+/*
+ * File signature verification is not needed, include only measurements
+ */
+static const char *const default_arch_rules[] = {
+   "measure func=KEXEC_KERNEL_CHECK template=ima-modsig",
+   "measure func=MODULE_CHECK template=ima-modsig",
+   NULL
+};
+
+/* Both file signature verification and measurements are needed */
+static const char *const sb_arch_rules[] = {
+   "measure func=KEXEC_KERNEL_CHECK template=ima-modsig",
+   "measure func=MODULE_CHECK template=ima-modsig",
+   "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig|modsig 
template=ima-modsig",
+#if !IS_ENABLED(CONFIG_MODULE_SIG)
+   "appraise func=MODULE_CHECK appraise_type=imasig|modsig 
template=ima-modsig",
+#endif
+   NULL
+};
+
+/*
+ * On PowerPC, file measurements are to be added to the IMA measurement list
+ * irrespective of the secure boot state of the system. Signature verification
+ * is conditionally enabled based on the secure boot state.
+ */
+const char *const *arch_get_ima_policy(void)
+{
+   if (IS_ENABLED(CONFIG_IMA_ARCH_POLICY) && arch_ima_get_secureboot())
+   return sb_arch_rules;
+   return default_arch_rules;
+}
diff --git a/include/linux/ima.h b/include/linux/ima.h
index fd9f7cf4cdf5..a01df076ecae 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -31,7 +31,8 @@ extern void ima_post_path_mknod(struct dentry *dentry);
 extern void ima_add_kexec_buffer(struct kimage *image);
 #endif
 
-#if (defined(CONFIG_X86) && defined(CONFIG_EFI)) || defined(CONFIG_S390)
+#if (defined(CONFIG_X86) && defined(CONFIG_EFI)) || defined(CONFIG_S390) \
+   || defined(CONFIG_PPC_SECURE_BOOT)
 extern bool arch_ima_get_secureboot(void);
 extern const char * const *arch_get_ima_policy(void);
 #else
-- 
2.20.1
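
The arch_get_ima_policy() contract is a NULL-terminated array of policy
strings; a rough sketch of how a consumer walks it (a simplified
illustration, not the actual security/integrity/ima code):

	const char *const *rules = arch_get_ima_policy();

	while (rules && *rules) {
		/* Each entry is handled like a line written to the
		 * IMA policy interface. */
		pr_debug("arch IMA rule: %s\n", *rules);
		rules++;
	}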



[PATCH v4 1/3] powerpc/powernv: Add OPAL API interface to get secureboot state

2019-06-11 Thread Nayna Jain
From: Claudio Carvalho 

The X.509 certificates trusted by the platform and other information
required to secure boot the OS kernel are wrapped in secure variables,
which are controlled by OPAL.

This patch adds support to read OPAL secure variables through
OPAL_SECVAR_GET call. It returns the metadata and data for a given secure
variable based on the unique key.

Since OPAL can support different types of backend which can vary in the
variable interpretation, a new OPAL API call named OPAL_SECVAR_BACKEND, is
added to retrieve the supported backend version. This helps the consumer
to know how to interpret the variable.

This support can be enabled using CONFIG_OPAL_SECVAR

Signed-off-by: Claudio Carvalho 
Signed-off-by: Nayna Jain 
---
This patch depends on a new OPAL call that is being added to skiboot.
The patch set that implements the new call has been posted to
https://patchwork.ozlabs.org/project/skiboot/list/?series=112868

 arch/powerpc/include/asm/opal-api.h  |  4 +-
 arch/powerpc/include/asm/opal-secvar.h   | 23 ++
 arch/powerpc/include/asm/opal.h  |  6 ++
 arch/powerpc/platforms/powernv/Kconfig   |  6 ++
 arch/powerpc/platforms/powernv/Makefile  |  1 +
 arch/powerpc/platforms/powernv/opal-call.c   |  2 +
 arch/powerpc/platforms/powernv/opal-secvar.c | 85 
 7 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/opal-secvar.h
 create mode 100644 arch/powerpc/platforms/powernv/opal-secvar.c

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index e1577cfa7186..a505e669b4b6 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -212,7 +212,9 @@
 #define OPAL_HANDLE_HMI2   166
 #defineOPAL_NX_COPROC_INIT 167
 #define OPAL_XIVE_GET_VP_STATE 170
-#define OPAL_LAST  170
+#define OPAL_SECVAR_GET 173
+#define OPAL_SECVAR_BACKEND 177
+#define OPAL_LAST  177
 
 #define QUIESCE_HOLD   1 /* Spin all calls at entry */
 #define QUIESCE_REJECT 2 /* Fail all calls with OPAL_BUSY */
diff --git a/arch/powerpc/include/asm/opal-secvar.h 
b/arch/powerpc/include/asm/opal-secvar.h
new file mode 100644
index ..b677171a0368
--- /dev/null
+++ b/arch/powerpc/include/asm/opal-secvar.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * PowerNV definitions for secure variables OPAL API.
+ *
+ * Copyright (C) 2019 IBM Corporation
+ * Author: Claudio Carvalho 
+ *
+ */
+#ifndef OPAL_SECVAR_H
+#define OPAL_SECVAR_H
+
+enum {
+   BACKEND_NONE = 0,
+   BACKEND_TC_COMPAT_V1,
+};
+
+extern int opal_get_variable(u8 *key, unsigned long ksize,
+u8 *metadata, unsigned long *mdsize,
+u8 *data, unsigned long *dsize);
+
+extern int opal_variable_version(unsigned long *backend);
+
+#endif
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 4cc37e708bc7..57d2c2356eda 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -394,6 +394,12 @@ void opal_powercap_init(void);
 void opal_psr_init(void);
 void opal_sensor_groups_init(void);
 
+extern int opal_secvar_get(uint64_t k_key, uint64_t k_key_len,
+  uint64_t k_metadata, uint64_t k_metadata_size,
+  uint64_t k_data, uint64_t k_data_size);
+
+extern int opal_secvar_backend(uint64_t k_backend);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_OPAL_H */
diff --git a/arch/powerpc/platforms/powernv/Kconfig 
b/arch/powerpc/platforms/powernv/Kconfig
index 850eee860cf2..65b060539b5c 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -47,3 +47,9 @@ config PPC_VAS
  VAS adapters are found in POWER9 based systems.
 
  If unsure, say N.
+
+config OPAL_SECVAR
+   bool "OPAL Secure Variables"
+   depends on PPC_POWERNV
+   help
+ This enables the kernel to access OPAL secure variables.
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index da2e99efbd04..6651c742e530 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -16,3 +16,4 @@ obj-$(CONFIG_PERF_EVENTS) += opal-imc.o
 obj-$(CONFIG_PPC_MEMTRACE) += memtrace.o
 obj-$(CONFIG_PPC_VAS)  += vas.o vas-window.o vas-debug.o
 obj-$(CONFIG_OCXL_BASE)+= ocxl.o
+obj-$(CONFIG_OPAL_SECVAR) += opal-secvar.o
diff --git a/arch/powerpc/platforms/powernv/opal-call.c 
b/arch/powerpc/platforms/powernv/opal-call.c
index 36c8fa3647a2..0445980f294f 100644
--- a/arch/powerpc/platforms/powernv/opal-call.c
+++ b/arch/powerpc/platforms/powernv/opal-call.c
@@ -288,3 +288,5 @@ OPAL_CALL(opal_pci_set_pbcq_tunnel_bar, 

[PATCH v4 0/3] powerpc: Enabling IMA arch specific secure boot policies

2019-06-11 Thread Nayna Jain
This patch set, previously named "powerpc: Enabling secure boot on powernv
systems - Part 1", is part of a series that implements secure boot on
PowerNV systems.

In order to verify the OS kernel on PowerNV, secure boot requires X.509
certificates trusted by the platform, the secure boot modes, and several
other pieces of information. These are stored in secure variables
controlled by OPAL, also known as OPAL secure variables.

The IMA architecture specific policy support on POWER is dependent on OPAL
runtime services to access secure variables. OPAL APIs in skiboot are
modified to define generic interface compatible to any backend. This
patchset is consequently updated to be compatible with new OPAL API
interface. This has cleaned up any EFIsms in the arch specific code.
Further, the ima arch specific policies are updated to be able to support
appended signatures. They also now use per policy template.

Exposing the OPAL secure variables to userspace will be posted as a
separate patch set, allowing the IMA architecture specific policy on POWER
to be upstreamed independently.

This patch set adds the following features:

1. Add support for OPAL Runtime API to access secure variables controlled
by OPAL.
2. Define IMA arch-specific policies based on the secure boot state and
mode of the system. On secure boot enabled PowerNV systems, the OS kernel
signature will be verified by IMA appraisal.

Pre-requisites for this patchset are:
1. OPAL APIs in Skiboot[1]
2. Appended signature support in IMA [2]
3. Per policy template support in IMA [3]

[1] https://patchwork.ozlabs.org/project/skiboot/list/?series=112868 
[2] https://patchwork.ozlabs.org/cover/1087361/. Updated version will be
posted soon
[3] Repo: 
https://kernel.googlesource.com/pub/scm/linux/kernel/git/zohar/linux-integrity
Branch: next-queued-testing. Commit: f241bb1f42aa95

--

Original Cover Letter:

This patch set is part of a series that implements secure boot on PowerNV
systems.

In order to verify the OS kernel on PowerNV, secure boot requires X.509
certificates trusted by the platform, the secure boot modes, and several
other pieces of information. These are stored in secure variables
controlled by OPAL, also known as OPAL secure variables.

The IMA architecture specific policy support on Power is dependent on OPAL
runtime services to access secure variables. Instead of directly accessing
the OPAL runtime services, version 3 of this patch set relied upon the
EFI hooks. This version drops that dependency and calls the OPAL runtime
services directly. Skiboot OPAL APIs are due to be posted soon.

Exposing the OPAL secure variables to userspace will be posted as a
separate patch set, allowing the IMA architecture specific policy on Power
to be upstreamed independently.

This patch set adds the following features:

1. Add support for OPAL Runtime API to access secure variables controlled
by OPAL.
2. Define IMA arch-specific policies based on the secure boot state and
mode of the system. On secure boot enabled powernv systems, the OS kernel
signature will be verified by IMA appraisal.

[1] https://patchwork.kernel.org/cover/10882149/

Changelog:

v4:
* Fixed the build issue as reported by Satheesh Rajendran.

v3:
* OPAL APIs in Patch 1 are updated to provide generic interface based on
key/keylen. This patchset updates kernel OPAL APIs to be compatible with
generic interface.
* Patch 2 is cleaned up to use new OPAL APIs. 
* Since OPAL can support different types of backend which can vary in the
variable interpretation, the Patch 2 is updated to add a check for the
backend version
* OPAL API now expects consumer to first check the supported backend version
before calling other secvar OPAL APIs. This check is now added in patch 2.
* IMA policies in Patch 3 is updated to specify appended signature and
per policy template.
* The patches now are free of any EFIisms.

v2:

* Removed Patch 1: powerpc/include: Override unneeded early ioremap
functions
* Updated Subject line and patch description of the Patch 1 of this series
* Removed dependency of OPAL_SECVAR on EFI, CPU_BIG_ENDIAN and UCS2_STRING
* Changed OPAL APIs from static to non-static. Added opal-secvar.h for the
same
* Removed EFI hooks from opal_secvar.c
* Removed opal_secvar_get_next(), opal_secvar_enqueue() and
opal_query_variable_info() function
* get_powerpc_sb_mode() in secboot.c now directly calls OPAL Runtime API
rather than via EFI hooks.
* Fixed log messages in get_powerpc_sb_mode() function.
* Added dependency for PPC_SECURE_BOOT on configs PPC64 and OPAL_SECVAR
* Replaced obj-$(CONFIG_IMA) with obj-$(CONFIG_PPC_SECURE_BOOT) in
arch/powerpc/kernel/Makefile

Claudio Carvalho (1):
  powerpc/powernv: Add OPAL API interface to get secureboot state

Nayna Jain (2):
  powerpc/powernv: detect the secure boot mode of the system
  powerpc: Add support to initialize ima policy rules

 arch/powerpc/Kconfig

Re: [PATCH v3 06/20] docs: mark orphan documents as such

2019-06-11 Thread Mauro Carvalho Chehab
Em Tue, 11 Jun 2019 19:52:04 +0300
Andy Shevchenko  escreveu:

> On Fri, Jun 7, 2019 at 10:04 PM Mauro Carvalho Chehab
>  wrote:
> > Sphinx doesn't like orphan documents:  
> 
> > Documentation/laptops/lg-laptop.rst: WARNING: document isn't included 
> > in any toctree  
> 
> >  Documentation/laptops/lg-laptop.rst | 2 ++  
> 
> > diff --git a/Documentation/laptops/lg-laptop.rst 
> > b/Documentation/laptops/lg-laptop.rst
> > index aa503ee9b3bc..f2c2ffe31101 100644
> > --- a/Documentation/laptops/lg-laptop.rst
> > +++ b/Documentation/laptops/lg-laptop.rst
> > @@ -1,5 +1,7 @@
> >  .. SPDX-License-Identifier: GPL-2.0+
> >
> > +:orphan:
> > +
> >  LG Gram laptop extra features
> >  =
> >  
> 
> Can we rather create a toc tree there?
> It was the first document in reST format in that folder.

Sure, but:

1) I have a patch converting the other files on this dir to rst:


https://git.linuxtv.org/mchehab/experimental.git/commit/?h=convert_rst_renames_v4.1&id=abc13233035fdfdbc5ef2f2fbd3d127a1ab15530

2) It probably makes sense to move the entire dir to
Documentation/admin-guide.

So, I would prefer to have the :orphan: here while (1) is not merged.

Thanks,
Mauro


Re: [PATCH v3 06/20] docs: mark orphan documents as such

2019-06-11 Thread Andy Shevchenko
On Fri, Jun 7, 2019 at 10:04 PM Mauro Carvalho Chehab
 wrote:
> Sphinx doesn't like orphan documents:

> Documentation/laptops/lg-laptop.rst: WARNING: document isn't included in 
> any toctree

>  Documentation/laptops/lg-laptop.rst | 2 ++

> diff --git a/Documentation/laptops/lg-laptop.rst 
> b/Documentation/laptops/lg-laptop.rst
> index aa503ee9b3bc..f2c2ffe31101 100644
> --- a/Documentation/laptops/lg-laptop.rst
> +++ b/Documentation/laptops/lg-laptop.rst
> @@ -1,5 +1,7 @@
>  .. SPDX-License-Identifier: GPL-2.0+
>
> +:orphan:
> +
>  LG Gram laptop extra features
>  =
>

Can we rather create a toc tree there?
It was the first document in reST format in that folder.

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH] powerpc/32s: fix initial setup of segment registers on secondary CPU

2019-06-11 Thread Christophe Leroy




Le 11/06/2019 à 17:47, Christophe Leroy a écrit :
> The patch referenced below moved the loading of segment registers
> out of load_up_mmu() in order to do it earlier in the boot sequence.
> However, the secondary CPU still needs it to be done when loading up
> the MMU.
>
> Reported-by: Erhard F. 
> Fixes: 215b823707ce ("powerpc/32s: set up an early static hash table for KASAN")

Cc: sta...@vger.kernel.org

> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/head_32.S | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
> index 1d5f1bd0dacd..f255e22184b4 100644
> --- a/arch/powerpc/kernel/head_32.S
> +++ b/arch/powerpc/kernel/head_32.S
> @@ -752,6 +752,7 @@ __secondary_start:
> 	stw	r0,0(r3)
>
> 	/* load up the MMU */
> +	bl	load_segment_registers
> 	bl	load_up_mmu
>
> 	/* ptr to phys current thread */




Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-11 Thread Horia Geanta
On 6/11/2019 6:40 PM, Christophe Leroy wrote:
> 
> 
> Le 11/06/2019 à 17:37, Horia Geanta a écrit :
>> On 6/11/2019 5:39 PM, Christophe Leroy wrote:
>>> This series is the last set of fixes for the Talitos driver.
>>>
>>> We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
>>> SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:
>>>
>> I am getting below failures on a sec 3.3.2 (p1020rdb) for hmac(sha384) and
>> hmac(sha512):
> 
> Is that new with this series or did you already have it before ?
> 
Looks like this happens with or without this series.

I haven't checked the state of this driver for quite some time.
Since I've noticed increased activity, I thought it would be worth
actually testing the changes.

Are changes in patch 2/4 ("crypto: talitos - fix hash on SEC1.")
strictly for sec 1.x, or do they affect all revisions?

> What do you mean by "fuzz testing" enabled ? Is that 
> CONFIG_CRYPTO_MANAGER_EXTRA_TESTS or something else ?
> 
Yes, it's this config symbol.

Horia


[PATCH] powerpc/32s: fix initial setup of segment registers on secondary CPU

2019-06-11 Thread Christophe Leroy
The patch referenced below moved the loading of segment registers
out of load_up_mmu() in order to do it earlier in the boot sequence.
However, the secondary CPU still needs it to be done when loading up
the MMU.

Reported-by: Erhard F. 
Fixes: 215b823707ce ("powerpc/32s: set up an early static hash table for KASAN")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 1d5f1bd0dacd..f255e22184b4 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -752,6 +752,7 @@ __secondary_start:
stw r0,0(r3)
 
/* load up the MMU */
+   bl  load_segment_registers
bl  load_up_mmu
 
/* ptr to phys current thread */
-- 
2.13.3



Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-11 Thread Christophe Leroy




Le 11/06/2019 à 17:37, Horia Geanta a écrit :
> On 6/11/2019 5:39 PM, Christophe Leroy wrote:
>> This series is the last set of fixes for the Talitos driver.
>>
>> We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
>> SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:
>>
> I am getting below failures on a sec 3.3.2 (p1020rdb) for hmac(sha384) and
> hmac(sha512):

Is that new with this series or did you already have it before ?

What do you mean by "fuzz testing" enabled ? Is that
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS or something else ?

Christophe

> alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector
> "random: psize=2497 ksize=124", cfg="random: inplace use_finup nosimd
> src_divs=[76.49%@+4002, 23.51%@alignmask+26] iv_offset=4"
> alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector
> "random: psize=27 ksize=121", cfg="random: inplace may_sleep use_digest
> src_divs=[100.0%@+10] iv_offset=9"
>
> Reproducibility rate is 100% so far; here are a few more runs - they might help
> find a pattern:
>
> 1.
> alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector
> "random: psize=184 ksize=121", cfg="random: use_finup
> src_divs=[100.0%@+3988] dst_divs=[100.0%@+547] iv_offset=44"
> alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector
> "random: psize=7 ksize=122", cfg="random: may_sleep use_digest
> src_divs=[100.0%@+3968] dst_divs=[100.0%@+20]"
>
> 2.
> alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector
> "random: psize=6481 ksize=120", cfg="random: use_final
> src_divs=[100.0%@+6] dst_divs=[43.84%@alignmask+6, 56.16%@+22]"
> alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector
> "random: psize=635 ksize=128", cfg="random: may_sleep use_finup
> src_divs=[100.0%@+4062] dst_divs=[20.47%@+2509, 72.36%@alignmask+2,
> 7.17%@alignmask+3990]"
>
> 3.
> alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector
> "random: psize=2428 ksize=127", cfg="random: may_sleep use_finup
> src_divs=[35.19%@+18, 64.81%@+1755] dst_divs=[100.0%@+111]
> iv_offset=5"
> alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector
> "random: psize=4345 ksize=128", cfg="random: may_sleep use_digest
> src_divs=[100.0%@+2820] iv_offset=59"
>
> If you run several times with fuzz testing enabled on your sec2.2,
> are you able to see similar failures?
>
> Thanks,
> Horia



Re: [PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-11 Thread Horia Geanta
On 6/11/2019 5:39 PM, Christophe Leroy wrote:
> This series is the last set of fixes for the Talitos driver.
> 
> We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
> SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:
> 
I am getting below failures on a sec 3.3.2 (p1020rdb) for hmac(sha384) and
hmac(sha512):

alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector 
"random: psize=2497 ksize=124", cfg="random: inplace use_finup nosimd 
src_divs=[76.49%@+4002, 23.51%@alignmask+26] iv_offset=4"
alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector 
"random: psize=27 ksize=121", cfg="random: inplace may_sleep use_digest 
src_divs=[100.0%@+10] iv_offset=9"

Reproducibility rate is 100% so far; here are a few more runs - they might help
find a pattern:

1.
alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector 
"random: psize=184 ksize=121", cfg="random: use_finup 
src_divs=[100.0%@+3988] dst_divs=[100.0%@+547] iv_offset=44"
alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector 
"random: psize=7 ksize=122", cfg="random: may_sleep use_digest 
src_divs=[100.0%@+3968] dst_divs=[100.0%@+20]"

2.
alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector 
"random: psize=6481 ksize=120", cfg="random: use_final 
src_divs=[100.0%@+6] dst_divs=[43.84%@alignmask+6, 56.16%@+22]"
alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector 
"random: psize=635 ksize=128", cfg="random: may_sleep use_finup 
src_divs=[100.0%@+4062] dst_divs=[20.47%@+2509, 72.36%@alignmask+2, 
7.17%@alignmask+3990]"

3.
alg: ahash: hmac-sha384-talitos test failed (wrong result) on test vector 
"random: psize=2428 ksize=127", cfg="random: may_sleep use_finup 
src_divs=[35.19%@+18, 64.81%@+1755] dst_divs=[100.0%@+111] 
iv_offset=5"
alg: ahash: hmac-sha512-talitos test failed (wrong result) on test vector 
"random: psize=4345 ksize=128", cfg="random: may_sleep use_digest 
src_divs=[100.0%@+2820] iv_offset=59"

If you run several times with fuzz testing enabled on your sec2.2,
are you able to see similar failures?

Thanks,
Horia


[PATCH 16/16] mm: pass get_user_pages_fast iterator arguments in a structure

2019-06-11 Thread Christoph Hellwig
Instead of passing a set of always repeated arguments down the
get_user_pages_fast iterators, create a struct gup_args to hold them and
pass that by reference.  This leads to an over 100 byte .text size
reduction for x86-64.

Signed-off-by: Christoph Hellwig 
---
 mm/gup.c | 338 ++-
 1 file changed, 158 insertions(+), 180 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 8bcc042f933a..419a565fc998 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -24,6 +24,13 @@
 
 #include "internal.h"
 
+struct gup_args {
+   unsigned long   addr;
+   unsigned intflags;
+   struct page **pages;
+   unsigned intnr;
+};
+
 struct follow_page_context {
struct dev_pagemap *pgmap;
unsigned int page_mask;
@@ -1786,10 +1793,10 @@ static inline pte_t gup_get_pte(pte_t *ptep)
 }
 #endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */
 
-static void undo_dev_pagemap(int *nr, int nr_start, struct page **pages)
+static void undo_dev_pagemap(struct gup_args *args, int nr_start)
 {
-   while ((*nr) - nr_start) {
-   struct page *page = pages[--(*nr)];
+   while (args->nr - nr_start) {
+   struct page *page = args->pages[--args->nr];
 
ClearPageReferenced(page);
put_page(page);
@@ -1811,14 +1818,13 @@ static inline struct page *try_get_compound_head(struct 
page *page, int refs)
 }
 
 #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-unsigned int flags, struct page **pages, int *nr)
+static int gup_pte_range(struct gup_args *args, pmd_t pmd, unsigned long end)
 {
struct dev_pagemap *pgmap = NULL;
-   int nr_start = *nr, ret = 0;
+   int nr_start = args->nr, ret = 0;
pte_t *ptep, *ptem;
 
-   ptem = ptep = pte_offset_map(&pmd, addr);
+   ptem = ptep = pte_offset_map(&pmd, args->addr);
do {
pte_t pte = gup_get_pte(ptep);
struct page *head, *page;
@@ -1830,16 +1836,16 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
if (pte_protnone(pte))
goto pte_unmap;
 
-   if (!pte_access_permitted(pte, flags & FOLL_WRITE))
+   if (!pte_access_permitted(pte, args->flags & FOLL_WRITE))
goto pte_unmap;
 
if (pte_devmap(pte)) {
-   if (unlikely(flags & FOLL_LONGTERM))
+   if (unlikely(args->flags & FOLL_LONGTERM))
goto pte_unmap;
 
pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
if (unlikely(!pgmap)) {
-   undo_dev_pagemap(nr, nr_start, pages);
+   undo_dev_pagemap(args, nr_start);
goto pte_unmap;
}
} else if (pte_special(pte))
@@ -1860,10 +1866,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
SetPageReferenced(page);
-   pages[*nr] = page;
-   (*nr)++;
-
-   } while (ptep++, addr += PAGE_SIZE, addr != end);
+   args->pages[args->nr++] = page;
+   } while (ptep++, args->addr += PAGE_SIZE, args->addr != end);
 
ret = 1;
 
@@ -1884,18 +1888,17 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
  * __get_user_pages_fast implementation that can pin pages. Thus it's still
  * useful to have gup_huge_pmd even if we can't operate on ptes.
  */
-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-unsigned int flags, struct page **pages, int *nr)
+static int gup_pte_range(struct gup_args *args, pmd_t pmd, unsigned long end)
 {
return 0;
 }
 #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
 
 #if defined(__HAVE_ARCH_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
-static int __gup_device_huge(unsigned long pfn, unsigned long addr,
-   unsigned long end, struct page **pages, int *nr)
+static int __gup_device_huge(struct gup_args *args, unsigned long pfn,
+   unsigned long end)
 {
-   int nr_start = *nr;
+   int nr_start = args->nr;
struct dev_pagemap *pgmap = NULL;
 
do {
@@ -1903,64 +1906,63 @@ static int __gup_device_huge(unsigned long pfn, 
unsigned long addr,
 
pgmap = get_dev_pagemap(pfn, pgmap);
if (unlikely(!pgmap)) {
-   undo_dev_pagemap(nr, nr_start, pages);
+   undo_dev_pagemap(args, nr_start);
return 0;
}
SetPageReferenced(page);
-   pages[*nr] = page;
+   args->pages[args->nr++] = page;
get_page(page);

[PATCH 15/16] mm: mark the page referenced in gup_hugepte

2019-06-11 Thread Christoph Hellwig
All other get_user_pages_fast cases mark the page referenced, so do
this here as well.

Signed-off-by: Christoph Hellwig 
---
 mm/gup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/gup.c b/mm/gup.c
index 0733674b539d..8bcc042f933a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2021,6 +2021,7 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, 
unsigned long addr,
return 0;
}
 
+   SetPageReferenced(head);
return 1;
 }
 
-- 
2.20.1



[PATCH 13/16] mm: move the powerpc hugepd code to mm/gup.c

2019-06-11 Thread Christoph Hellwig
While only powerpc supports the hugepd case, the code is pretty
generic and I'd like to keep all GUP internals in one place.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/mm/hugetlbpage.c | 72 --
 include/linux/hugetlb.h   | 18 
 mm/Kconfig| 10 +
 mm/gup.c  | 82 +++
 5 files changed, 93 insertions(+), 90 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 992a04796e56..4f1b00979cde 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -125,6 +125,7 @@ config PPC
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV
+   select ARCH_HAS_HUGEPD  if HUGETLB_PAGE
select ARCH_HAS_MMIOWB  if PPC64
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_PMEM_APIif PPC64
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index b5d92dc32844..51716c11d0fb 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -511,13 +511,6 @@ struct page *follow_huge_pd(struct vm_area_struct *vma,
return page;
 }
 
-static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
- unsigned long sz)
-{
-   unsigned long __boundary = (addr + sz) & ~(sz-1);
-   return (__boundary - 1 < end - 1) ? __boundary : end;
-}
-
 #ifdef CONFIG_PPC_MM_SLICES
 unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
unsigned long len, unsigned long pgoff,
@@ -665,68 +658,3 @@ void flush_dcache_icache_hugepage(struct page *page)
}
}
 }
-
-static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
-  unsigned long end, int write, struct page **pages, int 
*nr)
-{
-   unsigned long pte_end;
-   struct page *head, *page;
-   pte_t pte;
-   int refs;
-
-   pte_end = (addr + sz) & ~(sz-1);
-   if (pte_end < end)
-   end = pte_end;
-
-   pte = READ_ONCE(*ptep);
-
-   if (!pte_access_permitted(pte, write))
-   return 0;
-
-   /* hugepages are never "special" */
-   VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-
-   refs = 0;
-   head = pte_page(pte);
-
-   page = head + ((addr & (sz-1)) >> PAGE_SHIFT);
-   do {
-   VM_BUG_ON(compound_head(page) != head);
-   pages[*nr] = page;
-   (*nr)++;
-   page++;
-   refs++;
-   } while (addr += PAGE_SIZE, addr != end);
-
-   if (!page_cache_add_speculative(head, refs)) {
-   *nr -= refs;
-   return 0;
-   }
-
-   if (unlikely(pte_val(pte) != pte_val(*ptep))) {
-   /* Could be optimized better */
-   *nr -= refs;
-   while (refs--)
-   put_page(head);
-   return 0;
-   }
-
-   return 1;
-}
-
-int gup_huge_pd(hugepd_t hugepd, unsigned long addr, unsigned int pdshift,
-   unsigned long end, int write, struct page **pages, int *nr)
-{
-   pte_t *ptep;
-   unsigned long sz = 1UL << hugepd_shift(hugepd);
-   unsigned long next;
-
-   ptep = hugepte_offset(hugepd, addr, pdshift);
-   do {
-   next = hugepte_addr_end(addr, end, sz);
-   if (!gup_hugepte(ptep, sz, addr, end, write, pages, nr))
-   return 0;
-   } while (ptep++, addr = next, addr != end);
-
-   return 1;
-}
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index edf476c8cfb9..0f91761e2c53 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -16,29 +16,11 @@ struct user_struct;
 struct mmu_gather;
 
 #ifndef is_hugepd
-/*
- * Some architectures requires a hugepage directory format that is
- * required to support multiple hugepage sizes. For example
- * a4fe3ce76 "powerpc/mm: Allow more flexible layouts for hugepage pagetables"
- * introduced the same on powerpc. This allows for a more flexible hugepage
- * pagetable layout.
- */
 typedef struct { unsigned long pd; } hugepd_t;
 #define is_hugepd(hugepd) (0)
 #define __hugepd(x) ((hugepd_t) { (x) })
-static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
- unsigned pdshift, unsigned long end,
- int write, struct page **pages, int *nr)
-{
-   return 0;
-}
-#else
-extern int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
-  unsigned pdshift, unsigned long end,
-  int write, struct page **pages, int *nr);
 #endif
 
-
 #ifdef CONFIG_HUGETLB_PAGE
 
 #include 
diff --git a/mm/Kconfig b/mm/Kconfig
index 5c41409557da..44be3f01a2b2 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -769,4 +769,14 @@ config GUP_GET_PTE_L

[PATCH 14/16] mm: switch gup_hugepte to use try_get_compound_head

2019-06-11 Thread Christoph Hellwig
This applies the overflow fixes from 8fde12ca79aff
("mm: prevent get_user_pages() from overflowing page refcount")
to the powerpc hugepd code and brings it back in sync with the
other GUP cases.

Signed-off-by: Christoph Hellwig 
---
 mm/gup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index 494aa4c3a55e..0733674b539d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2007,7 +2007,8 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, 
unsigned long addr,
refs++;
} while (addr += PAGE_SIZE, addr != end);
 
-   if (!page_cache_add_speculative(head, refs)) {
+   head = try_get_compound_head(head, refs);
+   if (!head) {
*nr -= refs;
return 0;
}
-- 
2.20.1
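
For context, the helper this patch switches to looks roughly like this in
the tree the series targets (a sketch of the hardening added by commit
8fde12ca79aff, quoted from memory):

	static inline struct page *try_get_compound_head(struct page *page,
							 int refs)
	{
		struct page *head = compound_head(page);

		if (WARN_ON_ONCE(page_ref_count(head) < 0))
			return NULL;
		if (unlikely(!page_cache_add_speculative(head, refs)))
			return NULL;
		return head;
	}

It refuses to take references on a page whose refcount has already gone
negative, so gup_hugepte() now gets the same overflow protection as the
other GUP paths.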



[PATCH 11/16] mm: consolidate the get_user_pages* implementations

2019-06-11 Thread Christoph Hellwig
Always build mm/gup.c. Move the nommu versions there and replace the
separate stubs for various functions with the default ones: the _fast
variants now always fall back to the slow path, because gup_fast_permitted
returns false when HAVE_FAST_GUP is not set, and the nommu version of
__get_user_pages is used while all the wrappers stay common.

This also ensures the new put_user_pages* helpers are available for
nommu, as those are currently missing, which would create a problem as
soon as we actually grew users for it.

Signed-off-by: Christoph Hellwig 
---
 mm/Kconfig  |   1 +
 mm/Makefile |   4 +-
 mm/gup.c| 476 +---
 mm/nommu.c  |  88 --
 mm/util.c   |  47 --
 5 files changed, 269 insertions(+), 347 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 98dffb0f2447..5c41409557da 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -133,6 +133,7 @@ config HAVE_MEMBLOCK_PHYS_MAP
bool
 
 config HAVE_FAST_GUP
+   depends on MMU
bool
 
 config ARCH_KEEP_MEMBLOCK
diff --git a/mm/Makefile b/mm/Makefile
index ac5e5ba78874..dc0746ca1109 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -22,7 +22,7 @@ KCOV_INSTRUMENT_mmzone.o := n
 KCOV_INSTRUMENT_vmstat.o := n
 
 mmu-y  := nommu.o
-mmu-$(CONFIG_MMU)  := gup.o highmem.o memory.o mincore.o \
+mmu-$(CONFIG_MMU)  := highmem.o memory.o mincore.o \
   mlock.o mmap.o mmu_gather.o mprotect.o mremap.o \
   msync.o page_vma_mapped.o pagewalk.o \
   pgtable-generic.o rmap.o vmalloc.o
@@ -39,7 +39,7 @@ obj-y := filemap.o mempool.o oom_kill.o 
fadvise.o \
   mm_init.o mmu_context.o percpu.o slab_common.o \
   compaction.o vmacache.o \
   interval_tree.o list_lru.o workingset.o \
-  debug.o $(mmu-y)
+  debug.o gup.o $(mmu-y)
 
 # Give 'page_alloc' its own module-parameter namespace
 page-alloc-y := page_alloc.o
diff --git a/mm/gup.c b/mm/gup.c
index 7328890ad8d3..fe4f205651fd 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -134,6 +134,7 @@ void put_user_pages(struct page **pages, unsigned long 
npages)
 }
 EXPORT_SYMBOL(put_user_pages);
 
+#ifdef CONFIG_MMU
 static struct page *no_page_table(struct vm_area_struct *vma,
unsigned int flags)
 {
@@ -1100,86 +1101,6 @@ static __always_inline long 
__get_user_pages_locked(struct task_struct *tsk,
return pages_done;
 }
 
-/*
- * We can leverage the VM_FAULT_RETRY functionality in the page fault
- * paths better by using either get_user_pages_locked() or
- * get_user_pages_unlocked().
- *
- * get_user_pages_locked() is suitable to replace the form:
- *
- *  down_read(&mm->mmap_sem);
- *  do_something()
- *  get_user_pages(tsk, mm, ..., pages, NULL);
- *  up_read(&mm->mmap_sem);
- *
- *  to:
- *
- *  int locked = 1;
- *  down_read(&mm->mmap_sem);
- *  do_something()
- *  get_user_pages_locked(tsk, mm, ..., pages, &locked);
- *  if (locked)
- *  up_read(&mm->mmap_sem);
- */
-long get_user_pages_locked(unsigned long start, unsigned long nr_pages,
-  unsigned int gup_flags, struct page **pages,
-  int *locked)
-{
-   /*
-* FIXME: Current FOLL_LONGTERM behavior is incompatible with
-* FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on
-* vmas.  As there are no users of this flag in this call we simply
-* disallow this option for now.
-*/
-   if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM))
-   return -EINVAL;
-
-   return __get_user_pages_locked(current, current->mm, start, nr_pages,
-  pages, NULL, locked,
-  gup_flags | FOLL_TOUCH);
-}
-EXPORT_SYMBOL(get_user_pages_locked);
-
-/*
- * get_user_pages_unlocked() is suitable to replace the form:
- *
- *  down_read(&mm->mmap_sem);
- *  get_user_pages(tsk, mm, ..., pages, NULL);
- *  up_read(&mm->mmap_sem);
- *
- *  with:
- *
- *  get_user_pages_unlocked(tsk, mm, ..., pages);
- *
- * It is functionally equivalent to get_user_pages_fast so
- * get_user_pages_fast should be used instead if specific gup_flags
- * (e.g. FOLL_FORCE) are not required.
- */
-long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
-struct page **pages, unsigned int gup_flags)
-{
-   struct mm_struct *mm = current->mm;
-   int locked = 1;
-   long ret;
-
-   /*
-* FIXME: Current FOLL_LONGTERM behavior is incompatible with
-* FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on
-* vmas.  As there are no users of this flag in this call we simply
-* disallow this option for now.
-*/
-   if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM))

[PATCH 12/16] mm: validate get_user_pages_fast flags

2019-06-11 Thread Christoph Hellwig
We can only deal with FOLL_WRITE and/or FOLL_LONGTERM in
get_user_pages_fast, so reject all other flags.

Signed-off-by: Christoph Hellwig 
---
 mm/gup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index fe4f205651fd..78dc1871b3d4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2317,6 +2317,9 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
unsigned long addr, len, end;
int nr = 0, ret = 0;
 
+   if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
+   return -EINVAL;
+
start = untagged_addr(start) & PAGE_MASK;
addr = start;
len = (unsigned long) nr_pages << PAGE_SHIFT;
-- 
2.20.1
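
To illustrate the new contract (hypothetical caller, for illustration only):
after this patch, fast GUP accepts only FOLL_WRITE and/or FOLL_LONGTERM.

	/* Fine: */
	ret = get_user_pages_fast(start, nr_pages,
				  FOLL_WRITE | FOLL_LONGTERM, pages);

	/* Now rejected with -EINVAL (plus a one-time warning): */
	ret = get_user_pages_fast(start, nr_pages, FOLL_FORCE, pages);

Callers that need other flags must use the regular get_user_pages*()
family instead.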



[PATCH 10/16] mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP

2019-06-11 Thread Christoph Hellwig
We only support the generic GUP now, so rename the config option to
be more clear, and always use the mm/Kconfig definition of the
symbol and select it from the arch Kconfigs.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/Kconfig | 5 +
 arch/arm64/Kconfig   | 4 +---
 arch/mips/Kconfig| 2 +-
 arch/powerpc/Kconfig | 2 +-
 arch/s390/Kconfig| 2 +-
 arch/sh/Kconfig  | 2 +-
 arch/sparc/Kconfig   | 2 +-
 arch/x86/Kconfig | 4 +---
 mm/Kconfig   | 2 +-
 mm/gup.c | 4 ++--
 10 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 8869742a85df..3879a3e2c511 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -73,6 +73,7 @@ config ARM
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) 
&& MMU
select HAVE_EXIT_THREAD
+   select HAVE_FAST_GUP if ARM_LPAE
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG
select HAVE_FUNCTION_TRACER if !XIP_KERNEL
@@ -1596,10 +1597,6 @@ config ARCH_SELECT_MEMORY_MODEL
 config HAVE_ARCH_PFN_VALID
def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
 
-config HAVE_GENERIC_GUP
-   def_bool y
-   depends on ARM_LPAE
-
 config HIGHMEM
bool "High Memory Support"
depends on MMU
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 697ea0510729..4a6ee3e92757 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -140,6 +140,7 @@ config ARM64
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS
+   select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_GRAPH_TRACER
@@ -262,9 +263,6 @@ config GENERIC_CALIBRATE_DELAY
 config ZONE_DMA32
def_bool y
 
-config HAVE_GENERIC_GUP
-   def_bool y
-
 config ARCH_ENABLE_MEMORY_HOTPLUG
def_bool y
 
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 64108a2a16d4..b1e42f0e4ed0 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -54,10 +54,10 @@ config MIPS
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_EXIT_THREAD
+   select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
-   select HAVE_GENERIC_GUP
select HAVE_IDE
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636308c8..992a04796e56 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -185,12 +185,12 @@ config PPC
select HAVE_DYNAMIC_FTRACE_WITH_REGSif MPROFILE_KERNEL
select HAVE_EBPF_JITif PPC64
select HAVE_EFFICIENT_UNALIGNED_ACCESS  if !(CPU_LITTLE_ENDIAN && 
POWER7_CPU)
+   select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS if GCC_VERSION >= 50200   # 
plugin support on gcc <= 5.1 is buggy on PPC
-   select HAVE_GENERIC_GUP
select HAVE_HW_BREAKPOINT   if PERF_EVENTS && (PPC_BOOK3S 
|| PPC_8xx)
select HAVE_IDE
select HAVE_IOREMAP_PROT
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 109243fdb6ec..aaff0376bf53 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -137,6 +137,7 @@ config S390
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
+   select HAVE_FAST_GUP
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FENTRY
select HAVE_FTRACE_MCOUNT_RECORD
@@ -144,7 +145,6 @@ config S390
select HAVE_FUNCTION_TRACER
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_GCC_PLUGINS
-   select HAVE_GENERIC_GUP
select HAVE_KERNEL_BZIP2
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZ4
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 6fddfc3c9710..56712f3c9838 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -14,7 +14,7 @@ config SUPERH
select HAVE_ARCH_TRACEHOOK
select HAVE_PERF_EVENTS
select HAVE_DEBUG_BUGVERBOSE
-   select HAVE_GENERIC_GUP
+   select HAVE_FAST_GUP
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A)
select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 22435471f942..659232b760e1 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -28,7 +28,7 @@ config SPARC
select RTC_DRV_M48T59
select RTC_SYSTOHC
select HAVE_ARCH_JUMP_LABEL if SPARC64
-   select HAVE_GENERIC_GUP if

[PATCH 09/16] sparc64: use the generic get_user_pages_fast code

2019-06-11 Thread Christoph Hellwig
The sparc64 code is mostly equivalent to the generic one, minus various
bugfixes; this patch also adds the two arch overrides it needs to pgtable.h.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/Kconfig  |   1 +
 arch/sparc/include/asm/pgtable_64.h |  18 ++
 arch/sparc/mm/Makefile  |   2 +-
 arch/sparc/mm/gup.c | 340 
 4 files changed, 20 insertions(+), 341 deletions(-)
 delete mode 100644 arch/sparc/mm/gup.c

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 26ab6f5bbaaf..22435471f942 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -28,6 +28,7 @@ config SPARC
select RTC_DRV_M48T59
select RTC_SYSTOHC
select HAVE_ARCH_JUMP_LABEL if SPARC64
+   select HAVE_GENERIC_GUP if SPARC64
select GENERIC_IRQ_SHOW
select ARCH_WANT_IPC_PARSE_VERSION
select GENERIC_PCI_IOMAP
diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index 1904782dcd39..547ff96fb228 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -1098,6 +1098,24 @@ static inline unsigned long untagged_addr(unsigned long 
start)
 }
 #define untagged_addr untagged_addr
 
+static inline bool pte_access_permitted(pte_t pte, bool write)
+{
+   u64 prot;
+
+   if (tlb_type == hypervisor) {
+   prot = _PAGE_PRESENT_4V | _PAGE_P_4V;
+   if (write)
+   prot |= _PAGE_WRITE_4V;
+   } else {
+   prot = _PAGE_PRESENT_4U | _PAGE_P_4U;
+   if (write)
+   prot |= _PAGE_WRITE_4U;
+   }
+
+   return (pte_val(pte) & (prot | _PAGE_SPECIAL)) == prot;
+}
+#define pte_access_permitted pte_access_permitted
+
 #include 
 #include 
 
diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile
index d39075b1e3b7..b078205b70e0 100644
--- a/arch/sparc/mm/Makefile
+++ b/arch/sparc/mm/Makefile
@@ -5,7 +5,7 @@
 asflags-y := -ansi
 ccflags-y := -Werror
 
-obj-$(CONFIG_SPARC64)   += ultra.o tlb.o tsb.o gup.o
+obj-$(CONFIG_SPARC64)   += ultra.o tlb.o tsb.o
 obj-y   += fault_$(BITS).o
 obj-y   += init_$(BITS).o
 obj-$(CONFIG_SPARC32)   += extable.o srmmu.o iommu.o io-unit.o
diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c
deleted file mode 100644
index 1e770a517d4a..
--- a/arch/sparc/mm/gup.c
+++ /dev/null
@@ -1,340 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Lockless get_user_pages_fast for sparc, cribbed from powerpc
- *
- * Copyright (C) 2008 Nick Piggin
- * Copyright (C) 2008 Novell Inc.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-/*
- * The performance critical leaf functions are made noinline otherwise gcc
- * inlines everything into a single function which results in too much
- * register pressure.
- */
-static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
-   unsigned long end, int write, struct page **pages, int *nr)
-{
-   unsigned long mask, result;
-   pte_t *ptep;
-
-   if (tlb_type == hypervisor) {
-   result = _PAGE_PRESENT_4V|_PAGE_P_4V;
-   if (write)
-   result |= _PAGE_WRITE_4V;
-   } else {
-   result = _PAGE_PRESENT_4U|_PAGE_P_4U;
-   if (write)
-   result |= _PAGE_WRITE_4U;
-   }
-   mask = result | _PAGE_SPECIAL;
-
-   ptep = pte_offset_kernel(&pmd, addr);
-   do {
-   struct page *page, *head;
-   pte_t pte = *ptep;
-
-   if ((pte_val(pte) & mask) != result)
-   return 0;
-   VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-
-   /* The hugepage case is simplified on sparc64 because
-* we encode the sub-page pfn offsets into the
-* hugepage PTEs.  We could optimize this in the future
-* use page_cache_add_speculative() for the hugepage case.
-*/
-   page = pte_page(pte);
-   head = compound_head(page);
-   if (!page_cache_get_speculative(head))
-   return 0;
-   if (unlikely(pte_val(pte) != pte_val(*ptep))) {
-   put_page(head);
-   return 0;
-   }
-
-   pages[*nr] = page;
-   (*nr)++;
-   } while (ptep++, addr += PAGE_SIZE, addr != end);
-
-   return 1;
-}
-
-static int gup_huge_pmd(pmd_t *pmdp, pmd_t pmd, unsigned long addr,
-   unsigned long end, int write, struct page **pages,
-   int *nr)
-{
-   struct page *head, *page;
-   int refs;
-
-   if (!(pmd_val(pmd) & _PAGE_VALID))
-   return 0;
-
-   if (write && !pmd_write(pmd))
-   return 0;
-
-   refs = 0;
-   page = pmd_page(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-   head = compound

[PATCH 08/16] sparc64: define untagged_addr()

2019-06-11 Thread Christoph Hellwig
Add a helper to untag a user pointer.  This is needed for ADI support
in get_user_pages_fast.
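
To illustrate the shift trick in the helper below: a minimal userspace
sketch, assuming a 64-bit long and a made-up 4-bit tag width (the real
width comes from adi_nbits()):

#include <stdio.h>

#define TAG_BITS 4	/* hypothetical tag width, for illustration only */

static unsigned long untag(unsigned long start)
{
	long addr = (long)start;

	/* Shift the tag out the top and back in on a *signed* value:
	 * the arithmetic right shift sign-extends the last bit below
	 * the tag, just as the kernel code relies on. */
	return (unsigned long)((addr << TAG_BITS) >> TAG_BITS);
}

int main(void)
{
	unsigned long tagged = (0xaUL << 60) | 0x5000dead0000UL; /* fake tag */

	printf("tagged:   %#lx\n", tagged);		/* 0xa0005000dead0000 */
	printf("untagged: %#lx\n", untag(tagged));	/* 0x5000dead0000 */
	return 0;
}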

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/include/asm/pgtable_64.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index f0dcf991d27f..1904782dcd39 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -1076,6 +1076,28 @@ static inline int io_remap_pfn_range(struct 
vm_area_struct *vma,
 }
 #define io_remap_pfn_range io_remap_pfn_range 
 
+static inline unsigned long untagged_addr(unsigned long start)
+{
+   if (adi_capable()) {
+   long addr = start;
+
+   /* If userspace has passed a versioned address, kernel
+* will not find it in the VMAs since it does not store
+* the version tags in the list of VMAs. Storing version
+* tags in list of VMAs is impractical since they can be
+* changed any time from userspace without dropping into
+* kernel. Any address search in VMAs will be done with
+* non-versioned addresses. Ensure the ADI version bits
+* are dropped here by sign extending the last bit before
+* ADI bits. IOMMU does not implement version tags.
+*/
+   return (addr << (long)adi_nbits()) >> (long)adi_nbits();
+   }
+
+   return start;
+}
+#define untagged_addr untagged_addr
+
 #include 
 #include 
 
-- 
2.20.1



[PATCH 07/16] sparc64: add the missing pgd_page definition

2019-06-11 Thread Christoph Hellwig
sparc64 only had pgd_page_vaddr, but not pgd_page.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/include/asm/pgtable_64.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index 22500c3be7a9..f0dcf991d27f 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -861,6 +861,7 @@ static inline unsigned long pud_page_vaddr(pud_t pud)
 #define pud_clear(pudp)(pud_val(*(pudp)) = 0UL)
 #define pgd_page_vaddr(pgd)\
((unsigned long) __va(pgd_val(pgd)))
+#define pgd_page(pgd)  pfn_to_page(pgd_pfn(pgd))
 #define pgd_present(pgd)   (pgd_val(pgd) != 0U)
 #define pgd_clear(pgdp)(pgd_val(*(pgdp)) = 0UL)
 
-- 
2.20.1



[PATCH 06/16] sh: use the generic get_user_pages_fast code

2019-06-11 Thread Christoph Hellwig
The sh code is mostly equivalent to the generic one, minus various
bugfixes; this patch also adds the two arch overrides it needs to pgtable.h.

Signed-off-by: Christoph Hellwig 
---
 arch/sh/Kconfig   |   2 +
 arch/sh/include/asm/pgtable.h |  37 +
 arch/sh/mm/Makefile   |   2 +-
 arch/sh/mm/gup.c  | 277 --
 4 files changed, 40 insertions(+), 278 deletions(-)
 delete mode 100644 arch/sh/mm/gup.c

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index b77f512bb176..6fddfc3c9710 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -14,6 +14,7 @@ config SUPERH
select HAVE_ARCH_TRACEHOOK
select HAVE_PERF_EVENTS
select HAVE_DEBUG_BUGVERBOSE
+   select HAVE_GENERIC_GUP
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A)
select ARCH_HAS_GCOV_PROFILE_ALL
@@ -63,6 +64,7 @@ config SUPERH
 config SUPERH32
def_bool "$(ARCH)" = "sh"
select ARCH_32BIT_OFF_T
+   select GUP_GET_PTE_LOW_HIGH if X2TLB
select HAVE_KPROBES
select HAVE_KRETPROBES
select HAVE_IOREMAP_PROT if MMU && !X2TLB
diff --git a/arch/sh/include/asm/pgtable.h b/arch/sh/include/asm/pgtable.h
index 3587103afe59..9085d1142fa3 100644
--- a/arch/sh/include/asm/pgtable.h
+++ b/arch/sh/include/asm/pgtable.h
@@ -149,6 +149,43 @@ extern void paging_init(void);
 extern void page_table_range_init(unsigned long start, unsigned long end,
  pgd_t *pgd);
 
+static inline bool __pte_access_permitted(pte_t pte, u64 prot)
+{
+   return (pte_val(pte) & (prot | _PAGE_SPECIAL)) == prot;
+}
+
+#ifdef CONFIG_X2TLB
+static inline bool pte_access_permitted(pte_t pte, bool write)
+{
+   u64 prot = _PAGE_PRESENT;
+
+   prot |= _PAGE_EXT(_PAGE_EXT_KERN_READ | _PAGE_EXT_USER_READ);
+   if (write)
+   prot |= _PAGE_EXT(_PAGE_EXT_KERN_WRITE | _PAGE_EXT_USER_WRITE);
+   return __pte_access_permitted(pte, prot);
+}
+#elif defined(CONFIG_SUPERH64)
+static inline bool pte_access_permitted(pte_t pte, bool write)
+{
+   u64 prot = _PAGE_PRESENT | _PAGE_USER | _PAGE_READ;
+
+   if (write)
+   prot |= _PAGE_WRITE;
+   return __pte_access_permitted(pte, prot);
+}
+#else
+static inline bool pte_access_permitted(pte_t pte, bool write)
+{
+   u64 prot = _PAGE_PRESENT | _PAGE_USER;
+
+   if (write)
+   prot |= _PAGE_RW;
+   return __pte_access_permitted(pte, prot);
+}
+#endif
+
+#define pte_access_permitted pte_access_permitted
+
 /* arch/sh/mm/mmap.c */
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
diff --git a/arch/sh/mm/Makefile b/arch/sh/mm/Makefile
index fbe5e79751b3..5051b38fd5b6 100644
--- a/arch/sh/mm/Makefile
+++ b/arch/sh/mm/Makefile
@@ -17,7 +17,7 @@ cacheops-$(CONFIG_CPU_SHX3)   += cache-shx3.o
 obj-y  += $(cacheops-y)
 
 mmu-y  := nommu.o extable_32.o
-mmu-$(CONFIG_MMU)  := extable_$(BITS).o fault.o gup.o ioremap.o kmap.o \
+mmu-$(CONFIG_MMU)  := extable_$(BITS).o fault.o ioremap.o kmap.o \
   pgtable.o tlbex_$(BITS).o tlbflush_$(BITS).o
 
 obj-y  += $(mmu-y)
diff --git a/arch/sh/mm/gup.c b/arch/sh/mm/gup.c
deleted file mode 100644
index 277c882f7489..
--- a/arch/sh/mm/gup.c
+++ /dev/null
@@ -1,277 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Lockless get_user_pages_fast for SuperH
- *
- * Copyright (C) 2009 - 2010  Paul Mundt
- *
- * Cloned from the x86 and PowerPC versions, by:
- *
- * Copyright (C) 2008 Nick Piggin
- * Copyright (C) 2008 Novell Inc.
- */
-#include 
-#include 
-#include 
-#include 
-#include 
-
-static inline pte_t gup_get_pte(pte_t *ptep)
-{
-#ifndef CONFIG_X2TLB
-   return READ_ONCE(*ptep);
-#else
-   /*
-* With get_user_pages_fast, we walk down the pagetables without
-* taking any locks.  For this we would like to load the pointers
-* atomically, but that is not possible with 64-bit PTEs.  What
-* we do have is the guarantee that a pte will only either go
-* from not present to present, or present to not present or both
-* -- it will not switch to a completely different present page
-* without a TLB flush in between; something that we are blocking
-* by holding interrupts off.
-*
-* Setting ptes from not present to present goes:
-* ptep->pte_high = h;
-* smp_wmb();
-* ptep->pte_low = l;
-*
-* And present to not present goes:
-* ptep->pte_low = 0;
-* smp_wmb();
-* ptep->pte_high = 0;
-*
-* We must ensure here that the load of pte_low sees l iff pte_high
-* sees h. We load pte_high *after* loading pte_low, which ensures we
-* don't see an older value of pte_high.  *Then* we recheck pte_low,
-* which ensures that we haven't picked up a chang

[PATCH 05/16] sh: add the missing pud_page definition

2019-06-11 Thread Christoph Hellwig
sh only had pud_page_vaddr, but not pud_page.

Signed-off-by: Christoph Hellwig 
---
 arch/sh/include/asm/pgtable-3level.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sh/include/asm/pgtable-3level.h 
b/arch/sh/include/asm/pgtable-3level.h
index 7d8587eb65ff..3c7ff20f3f94 100644
--- a/arch/sh/include/asm/pgtable-3level.h
+++ b/arch/sh/include/asm/pgtable-3level.h
@@ -37,6 +37,7 @@ static inline unsigned long pud_page_vaddr(pud_t pud)
 {
return pud_val(pud);
 }
+#define pud_page(pud)  pfn_to_page(pud_pfn(pud))
 
 #define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
 static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
-- 
2.20.1



[PATCH 03/16] mm: lift the x86_32 PAE version of gup_get_pte to common code

2019-06-11 Thread Christoph Hellwig
The split low/high access is the only non-READ_ONCE version of
gup_get_pte that showed up in the various arch implementations.
Lift it to common code and drop the ifdef-based arch override.
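
The lifted helper keeps the same load/recheck loop as the x86 PAE
version removed below; since the new mm/gup.c hunk is truncated in this
archive, here is roughly how it reads (a sketch, mirroring the removed
x86 code):

static inline pte_t gup_get_pte(pte_t *ptep)
{
	pte_t pte;

	/* Load the two halves with a barrier between them, then retry
	 * until pte_low is stable, so we never mix the halves of two
	 * different PTEs. */
	do {
		pte.pte_low = ptep->pte_low;
		smp_rmb();
		pte.pte_high = ptep->pte_high;
		smp_rmb();
	} while (unlikely(pte.pte_low != ptep->pte_low));

	return pte;
}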

Signed-off-by: Christoph Hellwig 
---
 arch/x86/Kconfig  |  1 +
 arch/x86/include/asm/pgtable-3level.h | 47 
 arch/x86/kvm/mmu.c|  2 +-
 mm/Kconfig|  3 ++
 mm/gup.c  | 51 ---
 5 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2bbbd4d1ba31..7cd53cc59f0f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -121,6 +121,7 @@ config X86
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
+   select GUP_GET_PTE_LOW_HIGH if X86_PAE
select HARDLOCKUP_CHECK_TIMESTAMP   if X86_64
select HAVE_ACPI_APEI   if ACPI
select HAVE_ACPI_APEI_NMI   if ACPI
diff --git a/arch/x86/include/asm/pgtable-3level.h 
b/arch/x86/include/asm/pgtable-3level.h
index f8b1ad2c3828..e3633795fb22 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -285,53 +285,6 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp)
 #define __pte_to_swp_entry(pte)(__swp_entry(__pteval_swp_type(pte), \
 __pteval_swp_offset(pte)))
 
-#define gup_get_pte gup_get_pte
-/*
- * WARNING: only to be used in the get_user_pages_fast() implementation.
- *
- * With get_user_pages_fast(), we walk down the pagetables without taking
- * any locks.  For this we would like to load the pointers atomically,
- * but that is not possible (without expensive cmpxchg8b) on PAE.  What
- * we do have is the guarantee that a PTE will only either go from not
- * present to present, or present to not present or both -- it will not
- * switch to a completely different present page without a TLB flush in
- * between; something that we are blocking by holding interrupts off.
- *
- * Setting ptes from not present to present goes:
- *
- *   ptep->pte_high = h;
- *   smp_wmb();
- *   ptep->pte_low = l;
- *
- * And present to not present goes:
- *
- *   ptep->pte_low = 0;
- *   smp_wmb();
- *   ptep->pte_high = 0;
- *
- * We must ensure here that the load of pte_low sees 'l' iff pte_high
- * sees 'h'. We load pte_high *after* loading pte_low, which ensures we
- * don't see an older value of pte_high.  *Then* we recheck pte_low,
- * which ensures that we haven't picked up a changed pte high. We might
- * have gotten rubbish values from pte_low and pte_high, but we are
- * guaranteed that pte_low will not have the present bit set *unless*
- * it is 'l'. Because get_user_pages_fast() only operates on present ptes
- * we're safe.
- */
-static inline pte_t gup_get_pte(pte_t *ptep)
-{
-   pte_t pte;
-
-   do {
-   pte.pte_low = ptep->pte_low;
-   smp_rmb();
-   pte.pte_high = ptep->pte_high;
-   smp_rmb();
-   } while (unlikely(pte.pte_low != ptep->pte_low));
-
-   return pte;
-}
-
 #include 
 
 #endif /* _ASM_X86_PGTABLE_3LEVEL_H */
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1e9ba81accba..3f7cd11168f9 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -653,7 +653,7 @@ static u64 __update_clear_spte_slow(u64 *sptep, u64 spte)
 
 /*
  * The idea using the light way get the spte on x86_32 guest is from
- * gup_get_pte(arch/x86/mm/gup.c).
+ * gup_get_pte (mm/gup.c).
  *
  * An spte tlb flush may be pending, because kvm_set_pte_rmapp
  * coalesces them and we are running out of the MMU lock.  Therefore
diff --git a/mm/Kconfig b/mm/Kconfig
index f0c76ba47695..fe51f104a9e0 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -762,6 +762,9 @@ config GUP_BENCHMARK
 
  See tools/testing/selftests/vm/gup_benchmark.c
 
+config GUP_GET_PTE_LOW_HIGH
+   bool
+
 config ARCH_HAS_PTE_SPECIAL
bool
 
diff --git a/mm/gup.c b/mm/gup.c
index 3237f33792e6..9b72f2ea3471 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1684,17 +1684,60 @@ struct page *get_dump_page(unsigned long addr)
  * This code is based heavily on the PowerPC implementation by Nick Piggin.
  */
 #ifdef CONFIG_HAVE_GENERIC_GUP
+#ifdef CONFIG_GUP_GET_PTE_LOW_HIGH
+/*
+ * WARNING: only to be used in the get_user_pages_fast() implementation.
+ *
+ * With get_user_pages_fast(), we walk down the pagetables without taking any
+ * locks.  For this we would like to load the pointers atomically, but 
sometimes
+ * that is not possible (e.g. without expensive cmpxchg8b on x86_32 PAE).  What
+ * we do have is the guarantee that a PTE will only either go from not present
+ * to present, or present to not present or both -- it will not switch to a
+ * completely different present page without a TLB flush in between; something
+ * that we are blocking by h

[PATCH 04/16] MIPS: use the generic get_user_pages_fast code

2019-06-11 Thread Christoph Hellwig
The mips code is mostly equivalent to the generic one, minus various
bugfixes; this patch also adds an arch override for gup_fast_permitted.

Note that this defines ARCH_HAS_PTE_SPECIAL for mips as mips has
pte_special and pte_mkspecial implemented and used in the existing
gup code.  They are no-op stubs, though, which makes me a little unsure
if this is really the right thing to do.
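
For context, "no-op stubs" means something like the following
(a hypothetical sketch, not the exact mips definitions):

/* pte_special() can never report true and pte_mkspecial() returns the
 * pte unchanged, so the generic gup code's special-page bailout never
 * fires on mips. */
static inline int pte_special(pte_t pte)
{
	return 0;
}

static inline pte_t pte_mkspecial(pte_t pte)
{
	return pte;
}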

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig   |   3 +
 arch/mips/include/asm/pgtable.h |   3 +
 arch/mips/mm/Makefile   |   1 -
 arch/mips/mm/gup.c  | 303 
 4 files changed, 6 insertions(+), 304 deletions(-)
 delete mode 100644 arch/mips/mm/gup.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 70d3200476bf..64108a2a16d4 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -6,6 +6,7 @@ config MIPS
select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_ELF_RANDOMIZE
+   select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_SUPPORTS_UPROBES
@@ -34,6 +35,7 @@ config MIPS
select GENERIC_SCHED_CLOCK if !CAVIUM_OCTEON_SOC
select GENERIC_SMP_IDLE_THREAD
select GENERIC_TIME_VSYSCALL
+   select GUP_GET_PTE_LOW_HIGH if CPU_MIPS32 && PHYS_ADDR_T_64BIT
select HANDLE_DOMAIN_IRQ
select HAVE_ARCH_COMPILER_H
select HAVE_ARCH_JUMP_LABEL
@@ -55,6 +57,7 @@ config MIPS
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
+   select HAVE_GENERIC_GUP
select HAVE_IDE
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 4ccb465ef3f2..7d27194e3b45 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct mm_struct;
 struct vm_area_struct;
@@ -626,6 +627,8 @@ static inline pmd_t pmdp_huge_get_and_clear(struct 
mm_struct *mm,
 
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+#define gup_fast_permitted(start, end) (!cpu_has_dc_aliases)
+
 #include 
 
 /*
diff --git a/arch/mips/mm/Makefile b/arch/mips/mm/Makefile
index f34d7ff5eb60..1e8d335025d7 100644
--- a/arch/mips/mm/Makefile
+++ b/arch/mips/mm/Makefile
@@ -7,7 +7,6 @@ obj-y   += cache.o
 obj-y  += context.o
 obj-y  += extable.o
 obj-y  += fault.o
-obj-y  += gup.o
 obj-y  += init.o
 obj-y  += mmap.o
 obj-y  += page.o
diff --git a/arch/mips/mm/gup.c b/arch/mips/mm/gup.c
deleted file mode 100644
index 4c2b4483683c..
--- a/arch/mips/mm/gup.c
+++ /dev/null
@@ -1,303 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Lockless get_user_pages_fast for MIPS
- *
- * Copyright (C) 2008 Nick Piggin
- * Copyright (C) 2008 Novell Inc.
- * Copyright (C) 2011 Ralf Baechle
- */
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-
-static inline pte_t gup_get_pte(pte_t *ptep)
-{
-#if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
-   pte_t pte;
-
-retry:
-   pte.pte_low = ptep->pte_low;
-   smp_rmb();
-   pte.pte_high = ptep->pte_high;
-   smp_rmb();
-   if (unlikely(pte.pte_low != ptep->pte_low))
-   goto retry;
-
-   return pte;
-#else
-   return READ_ONCE(*ptep);
-#endif
-}
-
-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-   int write, struct page **pages, int *nr)
-{
-   pte_t *ptep = pte_offset_map(&pmd, addr);
-   do {
-   pte_t pte = gup_get_pte(ptep);
-   struct page *page;
-
-   if (!pte_present(pte) ||
-   pte_special(pte) || (write && !pte_write(pte))) {
-   pte_unmap(ptep);
-   return 0;
-   }
-   VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-   page = pte_page(pte);
-   get_page(page);
-   SetPageReferenced(page);
-   pages[*nr] = page;
-   (*nr)++;
-
-   } while (ptep++, addr += PAGE_SIZE, addr != end);
-
-   pte_unmap(ptep - 1);
-   return 1;
-}
-
-static inline void get_head_page_multiple(struct page *page, int nr)
-{
-   VM_BUG_ON(page != compound_head(page));
-   VM_BUG_ON(page_count(page) == 0);
-   page_ref_add(page, nr);
-   SetPageReferenced(page);
-}
-
-static int gup_huge_pmd(pmd_t pmd, unsigned long addr, unsigned long end,
-   int write, struct page **pages, int *nr)
-{
-   pte_t pte = *(pte_t *)&pmd;
-   struct page *head, *page;
-   int refs;
-
-

[PATCH 02/16] mm: simplify gup_fast_permitted

2019-06-11 Thread Christoph Hellwig
Pass in the already calculated end value instead of recomputing it, and
leave the end > start check in the callers instead of duplicating it
in the arch code.
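
As a standalone illustration of the caller-side check (made-up values;
PAGE_SHIFT assumed 12): the single end > start comparison rejects both
zero-length requests and ranges that wrap past the top of the address
space.

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SHIFT 12

static bool range_ok(unsigned long start, int nr_pages)
{
	unsigned long len = (unsigned long)nr_pages << PAGE_SHIFT;
	unsigned long end = start + len;

	/* Mirrors the "if (end <= start) return 0" in the callers. */
	return end > start;
}

int main(void)
{
	printf("%d\n", range_ok(0x1000, 16));	/* 1: normal range */
	printf("%d\n", range_ok(0x1000, 0));	/* 0: zero length */
	printf("%d\n", range_ok(-0x1000UL, 4));	/* 0: wraps around */
	return 0;
}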

Signed-off-by: Christoph Hellwig 
---
 arch/s390/include/asm/pgtable.h   |  8 +---
 arch/x86/include/asm/pgtable_64.h |  8 +---
 mm/gup.c  | 17 +++--
 3 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 9f0195d5fa16..9b274fcaacb6 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1270,14 +1270,8 @@ static inline pte_t *pte_offset(pmd_t *pmd, unsigned 
long address)
 #define pte_offset_map(pmd, address) pte_offset_kernel(pmd, address)
 #define pte_unmap(pte) do { } while (0)
 
-static inline bool gup_fast_permitted(unsigned long start, int nr_pages)
+static inline bool gup_fast_permitted(unsigned long start, unsigned long end)
 {
-   unsigned long len, end;
-
-   len = (unsigned long) nr_pages << PAGE_SHIFT;
-   end = start + len;
-   if (end < start)
-   return false;
return end <= current->mm->context.asce_limit;
 }
 #define gup_fast_permitted gup_fast_permitted
diff --git a/arch/x86/include/asm/pgtable_64.h 
b/arch/x86/include/asm/pgtable_64.h
index 0bb566315621..4990d26dfc73 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -259,14 +259,8 @@ extern void init_extra_mapping_uc(unsigned long phys, 
unsigned long size);
 extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
 
 #define gup_fast_permitted gup_fast_permitted
-static inline bool gup_fast_permitted(unsigned long start, int nr_pages)
+static inline bool gup_fast_permitted(unsigned long start, unsigned long end)
 {
-   unsigned long len, end;
-
-   len = (unsigned long)nr_pages << PAGE_SHIFT;
-   end = start + len;
-   if (end < start)
-   return false;
if (end >> __VIRTUAL_MASK_SHIFT)
return false;
return true;
diff --git a/mm/gup.c b/mm/gup.c
index 6bb521db67ec..3237f33792e6 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2123,13 +2123,9 @@ static void gup_pgd_range(unsigned long addr, unsigned 
long end,
  * Check if it's allowed to use __get_user_pages_fast() for the range, or
  * we need to fall back to the slow version:
  */
-bool gup_fast_permitted(unsigned long start, int nr_pages)
+static bool gup_fast_permitted(unsigned long start, unsigned long end)
 {
-   unsigned long len, end;
-
-   len = (unsigned long) nr_pages << PAGE_SHIFT;
-   end = start + len;
-   return end >= start;
+   return true;
 }
 #endif
 
@@ -2150,6 +2146,8 @@ int __get_user_pages_fast(unsigned long start, int 
nr_pages, int write,
len = (unsigned long) nr_pages << PAGE_SHIFT;
end = start + len;
 
+   if (end <= start)
+   return 0;
if (unlikely(!access_ok((void __user *)start, len)))
return 0;
 
@@ -2165,7 +2163,7 @@ int __get_user_pages_fast(unsigned long start, int 
nr_pages, int write,
 * block IPIs that come from THPs splitting.
 */
 
-   if (gup_fast_permitted(start, nr_pages)) {
+   if (gup_fast_permitted(start, end)) {
local_irq_save(flags);
gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr);
local_irq_restore(flags);
@@ -2224,13 +,12 @@ int get_user_pages_fast(unsigned long start, int 
nr_pages,
len = (unsigned long) nr_pages << PAGE_SHIFT;
end = start + len;
 
-   if (nr_pages <= 0)
+   if (end <= start)
return 0;
-
if (unlikely(!access_ok((void __user *)start, len)))
return -EFAULT;
 
-   if (gup_fast_permitted(start, nr_pages)) {
+   if (gup_fast_permitted(start, end)) {
local_irq_disable();
gup_pgd_range(addr, end, gup_flags, pages, &nr);
local_irq_enable();
-- 
2.20.1



switch the remaining architectures to use generic GUP v3

2019-06-11 Thread Christoph Hellwig
Hi Linus and maintainers,

below is a series to switch mips, sh and sparc64 to use the generic
GUP code so that we only have one codebase to touch for further
improvements to this code.  I don't have hardware for any of these
architectures, and generally no clue about their page table
management, so handle with care.

Changes since v2:
 - rebase to mainline to pick up the untagged_addr definition
 - fix the gup range check to require start < end, catching the 0 length case
 - use pfn based version for the missing pud_page/pgd_page definitions
 - fix a wrong check in the sparc64 version of pte_access_permitted

Changes since v1:
 - fix various issues found by the build bot
 - cherry pick and use the untagged_addr helper from Andrey
 - add various refactoring patches to share more code over architectures
 - move the powerpc hugepd code to mm/gup.c and sync it with the generic
   gup semantics


[PATCH 01/16] mm: use untagged_addr() for get_user_pages_fast addresses

2019-06-11 Thread Christoph Hellwig
This will allow sparc64 to override its ADI tags for
get_user_pages and get_user_pages_fast.
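
For reference, the generic fallback these call sites rely on is an
identity macro, roughly like this (a sketch; the exact linux/mm.h
wording may differ), so the change is a no-op on every architecture
that does not provide its own untagged_addr():

/* Generic fallback: drop no bits; sparc64 overrides this (patch 08)
 * to strip its ADI version tags. */
#ifndef untagged_addr
#define untagged_addr(addr) (addr)
#endif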

Signed-off-by: Christoph Hellwig 
---
 mm/gup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index ddde097cf9e4..6bb521db67ec 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2146,7 +2146,7 @@ int __get_user_pages_fast(unsigned long start, int 
nr_pages, int write,
unsigned long flags;
int nr = 0;
 
-   start &= PAGE_MASK;
+   start = untagged_addr(start) & PAGE_MASK;
len = (unsigned long) nr_pages << PAGE_SHIFT;
end = start + len;
 
@@ -2219,7 +2219,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
unsigned long addr, len, end;
int nr = 0, ret = 0;
 
-   start &= PAGE_MASK;
+   start = untagged_addr(start) & PAGE_MASK;
addr = start;
len = (unsigned long) nr_pages << PAGE_SHIFT;
end = start + len;
-- 
2.20.1



[PATCH v2 3/4] crypto: talitos - eliminate unneeded 'done' functions at build time

2019-06-11 Thread Christophe Leroy
When building for SEC1 only, talitos2_done functions are unneeded
and should go away.

For this, use has_ftr_sec1(), which will always return true when only
SEC1 support is being built, allowing GCC to drop the TALITOS2 functions.
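
The pattern at work, sketched with hypothetical names and a made-up
config switch (the real driver keys this off its SEC1/SEC2 Kconfig
options): once the feature predicate constant-folds, the compiler can
prove the other branch dead and discard the functions it references.

#include <stdbool.h>
#include <stdio.h>

#define ONLY_SEC1 1	/* hypothetical: pretend a SEC1-only build */

struct priv { unsigned long features; };

static inline bool has_ftr_sec1(struct priv *p)
{
#if ONLY_SEC1
	(void)p;
	return true;		/* constant: the else branch below is dead */
#else
	return p->features & 1;	/* runtime test when both are built in */
#endif
}

static void sec1_done(void) { puts("sec1 done"); }
static void sec2_done(void) { puts("sec2 done"); }

int main(void)
{
	struct priv p = { .features = 1 };

	if (has_ftr_sec1(&p))
		sec1_done();
	else
		sec2_done();	/* provably unreachable: gcc drops the call
				 * and, with it, sec2_done() itself */
	return 0;
}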

Signed-off-by: Christophe Leroy 
---
 drivers/crypto/talitos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 4f03baef952b..b2de931de623 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -3401,7 +3401,7 @@ static int talitos_probe(struct platform_device *ofdev)
if (err)
goto err_out;
 
-   if (of_device_is_compatible(np, "fsl,sec1.0")) {
+   if (has_ftr_sec1(priv)) {
if (priv->num_channels == 1)
tasklet_init(&priv->done_task[0], talitos1_done_ch0,
 (unsigned long)dev);
-- 
2.13.3



[PATCH v2 4/4] crypto: talitos - drop icv_ool

2019-06-11 Thread Christophe Leroy
icv_ool is not used anymore, drop it.

Fixes: 9cc87bc3613b ("crypto: talitos - fix AEAD processing")
Signed-off-by: Christophe Leroy 
---
 drivers/crypto/talitos.c | 3 ---
 drivers/crypto/talitos.h | 2 --
 2 files changed, 5 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index b2de931de623..03b7a5d28fb0 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -1278,9 +1278,6 @@ static int ipsec_esp(struct talitos_edesc *edesc, struct 
aead_request *areq,
 is_ipsec_esp && !encrypt);
tbl_off += ret;
 
-   /* ICV data */
-   edesc->icv_ool = !encrypt;
-
if (!encrypt && is_ipsec_esp) {
struct talitos_ptr *tbl_ptr = &edesc->link_tbl[tbl_off];
 
diff --git a/drivers/crypto/talitos.h b/drivers/crypto/talitos.h
index 95f78c6d9206..1469b956948a 100644
--- a/drivers/crypto/talitos.h
+++ b/drivers/crypto/talitos.h
@@ -46,7 +46,6 @@ struct talitos_desc {
  * talitos_edesc - s/w-extended descriptor
  * @src_nents: number of segments in input scatterlist
  * @dst_nents: number of segments in output scatterlist
- * @icv_ool: whether ICV is out-of-line
  * @iv_dma: dma address of iv for checking continuity and link table
  * @dma_len: length of dma mapped link_tbl space
  * @dma_link_tbl: bus physical address of link_tbl/buf
@@ -61,7 +60,6 @@ struct talitos_desc {
 struct talitos_edesc {
int src_nents;
int dst_nents;
-   bool icv_ool;
dma_addr_t iv_dma;
int dma_len;
dma_addr_t dma_link_tbl;
-- 
2.13.3



[PATCH v2 2/4] crypto: talitos - fix hash on SEC1.

2019-06-11 Thread Christophe Leroy
On SEC1, hash provides a wrong result when performing hashing in several
steps and the input data SG list has more than one element. This was
detected with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:

[   44.185947] alg: hash: md5-talitos test failed (wrong result) on test vector 
6, cfg="random: may_sleep use_finup src_divs=[25.88%@+8063, 
24.19%@+9588, 28.63%@+16333, 4.60%@+6756, 16.70%@+16281] 
dst_divs=[71.61%@alignmask+16361, 14.36%@+7756, 14.3%@+"
[   44.325122] alg: hash: sha1-talitos test failed (wrong result) on test 
vector 3, cfg="random: inplace use_final src_divs=[16.56%@+16378, 
52.0%@+16329, 21.42%@alignmask+16380, 10.2%@alignmask+16380] 
iv_offset=39"
[   44.493500] alg: hash: sha224-talitos test failed (wrong result) on test 
vector 4, cfg="random: use_final nosimd src_divs=[52.27%@+7401, 
17.34%@+16285, 17.71%@+26, 12.68%@+10644] iv_offset=43"
[   44.673262] alg: hash: sha256-talitos test failed (wrong result) on test 
vector 4, cfg="random: may_sleep use_finup src_divs=[60.6%@+12790, 
17.86%@+1329, 12.64%@alignmask+16300, 8.29%@+15, 0.40%@+13506, 
0.51%@+16322, 0.24%@+16339] dst_divs"

This is due to two issues:
- We have an overlap between the buffer used for copying the input
data (SEC1 doesn't do scatter/gather) and the chained descriptor.
- Data copy is wrong when the previous hash left less than one
blocksize of data to hash, requiring the previous block to be
completed with a few bytes from the new request.

This patch fixes it by:
- Moving the second descriptor after the buffer, as moving the buffer
after the descriptor would make it more complex for other cipher
operations (AEAD, ABLKCIPHER)
- Rebuilding a new data SG list without the bytes taken from the new
request to complete the previous one.

Fixes: 37b5e8897eb5 ("crypto: talitos - chain in buffered data for ahash on 
SEC1")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 drivers/crypto/talitos.c | 63 ++--
 1 file changed, 40 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 5b401aec6c84..4f03baef952b 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -336,15 +336,18 @@ static void flush_channel(struct device *dev, int ch, int 
error, int reset_ch)
tail = priv->chan[ch].tail;
while (priv->chan[ch].fifo[tail].desc) {
__be32 hdr;
+   struct talitos_edesc *edesc;
 
request = &priv->chan[ch].fifo[tail];
+   edesc = container_of(request->desc, struct talitos_edesc, desc);
 
/* descriptors with their done bits set don't get the error */
rmb();
if (!is_sec1)
hdr = request->desc->hdr;
else if (request->desc->next_desc)
-   hdr = (request->desc + 1)->hdr1;
+   hdr = ((struct talitos_desc *)
+  (edesc->buf + edesc->dma_len))->hdr1;
else
hdr = request->desc->hdr1;
 
@@ -476,8 +479,14 @@ static u32 current_desc_hdr(struct device *dev, int ch)
}
}
 
-   if (priv->chan[ch].fifo[iter].desc->next_desc == cur_desc)
-   return (priv->chan[ch].fifo[iter].desc + 1)->hdr;
+   if (priv->chan[ch].fifo[iter].desc->next_desc == cur_desc) {
+   struct talitos_edesc *edesc;
+
+   edesc = container_of(priv->chan[ch].fifo[iter].desc,
+struct talitos_edesc, desc);
+   return ((struct talitos_desc *)
+   (edesc->buf + edesc->dma_len))->hdr;
+   }
 
return priv->chan[ch].fifo[iter].desc->hdr;
 }
@@ -1402,15 +1411,11 @@ static struct talitos_edesc *talitos_edesc_alloc(struct 
device *dev,
edesc->dst_nents = dst_nents;
edesc->iv_dma = iv_dma;
edesc->dma_len = dma_len;
-   if (dma_len) {
-   void *addr = &edesc->link_tbl[0];
-
-   if (is_sec1 && !dst)
-   addr += sizeof(struct talitos_desc);
-   edesc->dma_link_tbl = dma_map_single(dev, addr,
+   if (dma_len)
+   edesc->dma_link_tbl = dma_map_single(dev, &edesc->link_tbl[0],
 edesc->dma_len,
 DMA_BIDIRECTIONAL);
-   }
+
return edesc;
 }
 
@@ -1722,14 +1727,16 @@ static void common_nonsnoop_hash_unmap(struct device 
*dev,
struct talitos_private *priv = dev_get_drvdata(dev);
bool is_sec1 = has_ftr_sec1(priv);
struct talitos_desc *desc = &edesc->desc;
-   struct talitos_desc *desc2 = desc + 1;
+   struct talitos_desc *desc2 = (struct talitos_desc *)
+(edesc->buf + edesc->dma_len);
 
unmap_single_talitos_ptr(dev, &edesc->desc.ptr[5], DMA_FROM_DEVICE);
if (desc->next_desc &&
desc->pt

[PATCH v2 1/4] crypto: talitos - move struct talitos_edesc into talitos.h

2019-06-11 Thread Christophe Leroy
The next patch will require struct talitos_edesc to be defined
earlier in talitos.c.

This patch moves it into talitos.h so that it can be used
from any place in talitos.c

Fixes: 37b5e8897eb5 ("crypto: talitos - chain in buffered data for ahash on 
SEC1")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 drivers/crypto/talitos.c | 30 --
 drivers/crypto/talitos.h | 30 ++
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 3b3e99f1cddb..5b401aec6c84 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -951,36 +951,6 @@ static int aead_des3_setkey(struct crypto_aead *authenc,
goto out;
 }
 
-/*
- * talitos_edesc - s/w-extended descriptor
- * @src_nents: number of segments in input scatterlist
- * @dst_nents: number of segments in output scatterlist
- * @icv_ool: whether ICV is out-of-line
- * @iv_dma: dma address of iv for checking continuity and link table
- * @dma_len: length of dma mapped link_tbl space
- * @dma_link_tbl: bus physical address of link_tbl/buf
- * @desc: h/w descriptor
- * @link_tbl: input and output h/w link tables (if {src,dst}_nents > 1) (SEC2)
- * @buf: input and output buffeur (if {src,dst}_nents > 1) (SEC1)
- *
- * if decrypting (with authcheck), or either one of src_nents or dst_nents
- * is greater than 1, an integrity check value is concatenated to the end
- * of link_tbl data
- */
-struct talitos_edesc {
-   int src_nents;
-   int dst_nents;
-   bool icv_ool;
-   dma_addr_t iv_dma;
-   int dma_len;
-   dma_addr_t dma_link_tbl;
-   struct talitos_desc desc;
-   union {
-   struct talitos_ptr link_tbl[0];
-   u8 buf[0];
-   };
-};
-
 static void talitos_sg_unmap(struct device *dev,
 struct talitos_edesc *edesc,
 struct scatterlist *src,
diff --git a/drivers/crypto/talitos.h b/drivers/crypto/talitos.h
index 32ad4fc679ed..95f78c6d9206 100644
--- a/drivers/crypto/talitos.h
+++ b/drivers/crypto/talitos.h
@@ -42,6 +42,36 @@ struct talitos_desc {
 
 #define TALITOS_DESC_SIZE  (sizeof(struct talitos_desc) - sizeof(__be32))
 
+/*
+ * talitos_edesc - s/w-extended descriptor
+ * @src_nents: number of segments in input scatterlist
+ * @dst_nents: number of segments in output scatterlist
+ * @icv_ool: whether ICV is out-of-line
+ * @iv_dma: dma address of iv for checking continuity and link table
+ * @dma_len: length of dma mapped link_tbl space
+ * @dma_link_tbl: bus physical address of link_tbl/buf
+ * @desc: h/w descriptor
+ * @link_tbl: input and output h/w link tables (if {src,dst}_nents > 1) (SEC2)
+ * @buf: input and output buffeur (if {src,dst}_nents > 1) (SEC1)
+ *
+ * if decrypting (with authcheck), or either one of src_nents or dst_nents
+ * is greater than 1, an integrity check value is concatenated to the end
+ * of link_tbl data
+ */
+struct talitos_edesc {
+   int src_nents;
+   int dst_nents;
+   bool icv_ool;
+   dma_addr_t iv_dma;
+   int dma_len;
+   dma_addr_t dma_link_tbl;
+   struct talitos_desc desc;
+   union {
+   struct talitos_ptr link_tbl[0];
+   u8 buf[0];
+   };
+};
+
 /**
  * talitos_request - descriptor submission request
  * @desc: descriptor pointer (kernel virtual)
-- 
2.13.3



[PATCH v2 0/4] Additional fixes on Talitos driver

2019-06-11 Thread Christophe Leroy
This series is the last set of fixes for the Talitos driver.

We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:

[3.385197] bus: 'platform': really_probe: probing driver talitos with 
device ff02.crypto
[3.450982] random: fast init done
[   12.252548] alg: No test for authenc(hmac(md5),cbc(aes)) 
(authenc-hmac-md5-cbc-aes-talitos-hsna)
[   12.262226] alg: No test for authenc(hmac(md5),cbc(des3_ede)) 
(authenc-hmac-md5-cbc-3des-talitos-hsna)
[   43.310737] Bug in SEC1, padding ourself
[   45.603318] random: crng init done
[   54.612333] talitos ff02.crypto: fsl,sec1.2 algorithms registered in 
/proc/crypto
[   54.620232] driver: 'talitos': driver_bound: bound to device 
'ff02.crypto'

[1.193721] bus: 'platform': really_probe: probing driver talitos with 
device b003.crypto
[1.229197] random: fast init done
[2.714920] alg: No test for authenc(hmac(sha224),cbc(aes)) 
(authenc-hmac-sha224-cbc-aes-talitos)
[2.724312] alg: No test for authenc(hmac(sha224),cbc(aes)) 
(authenc-hmac-sha224-cbc-aes-talitos-hsna)
[4.482045] alg: No test for authenc(hmac(md5),cbc(aes)) 
(authenc-hmac-md5-cbc-aes-talitos)
[4.490940] alg: No test for authenc(hmac(md5),cbc(aes)) 
(authenc-hmac-md5-cbc-aes-talitos-hsna)
[4.500280] alg: No test for authenc(hmac(md5),cbc(des3_ede)) 
(authenc-hmac-md5-cbc-3des-talitos)
[4.509727] alg: No test for authenc(hmac(md5),cbc(des3_ede)) 
(authenc-hmac-md5-cbc-3des-talitos-hsna)
[6.631781] random: crng init done
[   11.521795] talitos b003.crypto: fsl,sec2.2 algorithms registered in 
/proc/crypto
[   11.529803] driver: 'talitos': driver_bound: bound to device 
'b003.crypto'

v2: dropped patch 1 which was irrelevant due to a rebase weirdness. Added Cc to 
stable on the 2 first patches.

Christophe Leroy (4):
  crypto: talitos - move struct talitos_edesc into talitos.h
  crypto: talitos - fix hash on SEC1.
  crypto: talitos - eliminate unneeded 'done' functions at build time
  crypto: talitos - drop icv_ool

 drivers/crypto/talitos.c | 98 
 drivers/crypto/talitos.h | 28 ++
 2 files changed, 69 insertions(+), 57 deletions(-)

-- 
2.13.3



[PATCH 28/28] powerpc/64s/exception: avoid SPR RAW scoreboard stall in real mode entry

2019-06-11 Thread Nicholas Piggin
Move SPR reads ahead of writes. Real mode entry that is not a KVM
guest is rare these days, but bad practice propagates.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/exceptions-64s.S | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index d9e531a00319..df9c3126fe08 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -183,19 +183,19 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
.endif
.if \hsrr
mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
+   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
+   mtspr   SPRN_HSRR1,r10
.else
mfspr   r11,SPRN_SRR0   /* save SRR0 */
+   mfspr   r12,SPRN_SRR1   /* and SRR1 */
+   mtspr   SPRN_SRR1,r10
.endif
-   LOAD_HANDLER(r12, \label\())
+   LOAD_HANDLER(r10, \label\())
.if \hsrr
-   mtspr   SPRN_HSRR0,r12
-   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
-   mtspr   SPRN_HSRR1,r10
+   mtspr   SPRN_HSRR0,r10
HRFI_TO_KERNEL
.else
-   mtspr   SPRN_SRR0,r12
-   mfspr   r12,SPRN_SRR1   /* and SRR1 */
-   mtspr   SPRN_SRR1,r10
+   mtspr   SPRN_SRR0,r10
RFI_TO_KERNEL
.endif
b   .   /* prevent speculative execution */
-- 
2.20.1



[PATCH 27/28] powerpc/64s/exception: clean up system call entry

2019-06-11 Thread Nicholas Piggin
syscall / hcall entry unnecessarily differs between KVM and non-KVM
builds. Move the SMT priority instruction to the same location
(after INTERRUPT_TO_KERNEL).

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/exceptions-64s.S | 25 +++--
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index c1075bbe4677..d9e531a00319 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1643,10 +1643,8 @@ EXC_COMMON(trap_0b_common, 0xb00, unknown_exception)
std r10,PACA_EXGEN+EX_R10(r13)
INTERRUPT_TO_KERNEL
KVMTEST EXC_STD 0xc00 /* uses r10, branch to do_kvm_0xc00_system_call */
-   HMT_MEDIUM
mfctr   r9
 #else
-   HMT_MEDIUM
mr  r9,r13
GET_PACA(r13)
INTERRUPT_TO_KERNEL
@@ -1658,11 +1656,13 @@ BEGIN_FTR_SECTION
beq-1f
 END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 #endif
-   /* We reach here with PACA in r13, r13 in r9, and HMT_MEDIUM. */
-
-   .if \real
+   /* We reach here with PACA in r13, r13 in r9. */
mfspr   r11,SPRN_SRR0
mfspr   r12,SPRN_SRR1
+
+   HMT_MEDIUM
+
+   .if \real
__LOAD_HANDLER(r10, system_call_common)
mtspr   SPRN_SRR0,r10
ld  r10,PACAKMSR(r13)
@@ -1670,24 +1670,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
RFI_TO_KERNEL
b   .   /* prevent speculative execution */
.else
+   li  r10,MSR_RI
+   mtmsrd  r10,1   /* Set RI (EE=0) */
 #ifdef CONFIG_RELOCATABLE
-   /*
-* We can't branch directly so we do it via the CTR which
-* is volatile across system calls.
-*/
__LOAD_HANDLER(r10, system_call_common)
mtctr   r10
-   mfspr   r11,SPRN_SRR0
-   mfspr   r12,SPRN_SRR1
-   li  r10,MSR_RI
-   mtmsrd  r10,1
bctr
 #else
-   /* We can branch directly */
-   mfspr   r11,SPRN_SRR0
-   mfspr   r12,SPRN_SRR1
-   li  r10,MSR_RI
-   mtmsrd  r10,1   /* Set RI (EE=0) */
b   system_call_common
 #endif
.endif
-- 
2.20.1



[PATCH 26/28] powerpc/64s/exception: move paca save area offsets into exception-64s.S

2019-06-11 Thread Nicholas Piggin
No generated code change. The only file change is in bug table line numbers.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 17 +++--
 arch/powerpc/kernel/exceptions-64s.S | 22 ++
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 79e5ac87c029..33f4f72eb035 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -32,22 +32,11 @@
  */
 #include 
 
-/* PACA save area offsets (exgen, exmc, etc) */
-#define EX_R9  0
-#define EX_R10 8
-#define EX_R11 16
-#define EX_R12 24
-#define EX_R13 32
-#define EX_DAR 40
-#define EX_DSISR   48
-#define EX_CCR 52
-#define EX_CFAR56
-#define EX_PPR 64
+/* PACA save area size in u64 units (exgen, exmc, etc) */
 #if defined(CONFIG_RELOCATABLE)
-#define EX_CTR 72
-#define EX_SIZE10  /* size in u64 units */
+#define EX_SIZE10
 #else
-#define EX_SIZE9   /* size in u64 units */
+#define EX_SIZE9
 #endif
 
 /*
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 8b571a2b3d76..c1075bbe4677 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -21,6 +21,28 @@
 #include 
 #include 
 
+/* PACA save area offsets (exgen, exmc, etc) */
+#define EX_R9  0
+#define EX_R10 8
+#define EX_R11 16
+#define EX_R12 24
+#define EX_R13 32
+#define EX_DAR 40
+#define EX_DSISR   48
+#define EX_CCR 52
+#define EX_CFAR56
+#define EX_PPR 64
+#if defined(CONFIG_RELOCATABLE)
+#define EX_CTR 72
+.if EX_SIZE != 10
+   .error "EX_SIZE is wrong"
+.endif
+#else
+.if EX_SIZE != 9
+   .error "EX_SIZE is wrong"
+.endif
+#endif
+
 /*
  * We're short on space and time in the exception prolog, so we can't
  * use the normal LOAD_REG_IMMEDIATE macro to load the address of label.
-- 
2.20.1



[PATCH 25/28] powerpc/64s/exception: remove pointless EXCEPTION_PROLOG macro indirection

2019-06-11 Thread Nicholas Piggin
No generated code change. The only file change is in bug table line numbers.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/exceptions-64s.S | 97 +---
 1 file changed, 45 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index b402a006cd48..8b571a2b3d76 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -334,34 +334,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948)
std r0,GPR0(r1);/* save r0 in stackframe*/ \
std r10,GPR1(r1);   /* save r1 in stackframe*/ \
 
-
-/*
- * The common exception prolog is used for all except a few exceptions
- * such as a segment miss on a kernel address.  We have to be prepared
- * to take another exception from the point where we first touch the
- * kernel stack onwards.
- *
- * On entry r13 points to the paca, r9-r13 are saved in the paca,
- * r9 contains the saved CR, r11 and r12 contain the saved SRR0 and
- * SRR1, and relocation is on.
- */
-#define EXCEPTION_PROLOG_COMMON(n, area)  \
-   andi.   r10,r12,MSR_PR; /* See if coming from user  */ \
-   mr  r10,r1; /* Save r1  */ \
-   subir1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack  */ \
-   beq-1f;\
-   ld  r1,PACAKSAVE(r13);  /* kernel stack to use  */ \
-1: tdgei   r1,-INT_FRAME_SIZE; /* trap if r1 is in userspace   */ \
-   EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,0; \
-3: EXCEPTION_PROLOG_COMMON_1();   \
-   kuap_save_amr_and_lock r9, r10, cr1, cr0;  \
-   beq 4f; /* if from kernel mode  */ \
-   ACCOUNT_CPU_USER_ENTRY(r13, r9, r10);  \
-   SAVE_PPR(area, r9);\
-4: EXCEPTION_PROLOG_COMMON_2(area)\
-   EXCEPTION_PROLOG_COMMON_3(n)   \
-   ACCOUNT_STOLEN_TIME
-
 /* Save original regs values from save area to stack frame. */
 #define EXCEPTION_PROLOG_COMMON_2(area)
   \
ld  r9,area+EX_R9(r13); /* move r9, r10 to stackframe   */ \
@@ -381,7 +353,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66); 
   \
GET_CTR(r10, area);\
std r10,_CTR(r1);
 
-#define EXCEPTION_PROLOG_COMMON_3(n)  \
+#define EXCEPTION_PROLOG_COMMON_3(trap)
   \
std r2,GPR2(r1);/* save r2 in stackframe*/ \
SAVE_4GPRS(3, r1);  /* save r3 - r6 in stackframe   */ \
SAVE_2GPRS(7, r1);  /* save r7, r8 in stackframe*/ \
@@ -392,26 +364,38 @@ END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66);   
   \
mfspr   r11,SPRN_XER;   /* save XER in stackframe   */ \
std r10,SOFTE(r1); \
std r11,_XER(r1);  \
-   li  r9,(n)+1;  \
+   li  r9,(trap)+1;   \
std r9,_TRAP(r1);   /* set trap number  */ \
li  r10,0; \
ld  r11,exception_marker@toc(r2);  \
std r10,RESULT(r1); /* clear regs->result   */ \
std r11,STACK_FRAME_OVERHEAD-16(r1); /* mark the frame  */
 
-#define RUNLATCH_ON\
-BEGIN_FTR_SECTION  \
-   ld  r3, PACA_THREAD_INFO(r13);  \
-   ld  r4,TI_LOCAL_FLAGS(r3);  \
-   andi.   r0,r4,_TLF_RUNLATCH;\
-   beqlppc64_runlatch_on_trampoline;   \
-END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
-
-#define EXCEPTION_COMMON(area, trap)   \
-   EXCEPTION_PROLOG_COMMON(trap, area);\
+/*
+ * On entry r13 points to the paca, r9-r13 are saved in the paca,
+ * r9 contains the saved CR, r11 and r12 contain the saved SRR0 and
+ * SRR1, and relocation is on.
+ */
+#define EXCEPTION_COMMON(area, trap)  \
+   andi.   r10,r12,MSR_PR; /* See if coming from user  */ \
+   mr  r10,r1; /* Save r1  */ \
+   subir1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack  */ \
+   beq-1f;   

[PATCH 24/28] powerpc/64s/exception: remove bad stack branch

2019-06-11 Thread Nicholas Piggin
The bad stack test in interrupt handlers has a few problems. For
performance, it is executed in the common case, which costs a fetch
bubble and wastes i-cache.

For code development and maintenance, it requires yet another stack
frame setup routine, and that constrains all exception handlers to
follow the same register save pattern, which inhibits future
optimisation.

Remove the test/branch and replace it with a trap. Teach the program
check handler to use the emergency stack for this case.

This does not result in quite so nice a message, however the SRR0 and
SRR1 of the crashed interrupt can be seen in r11 and r12, as is the
original r1 (adjusted by INT_FRAME_SIZE). These are the most important
parts to debugging the issue.

The original r9-r12 and cr0 are lost, which is the main downside.

  kernel BUG at linux/arch/powerpc/kernel/exceptions-64s.S:847!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted
  NIP:  c0009108 LR: c0cadbcc CTR: c00090f0
  REGS: c000fffcbd70 TRAP: 0700   Not tainted
  MSR:  90021032   CR: 28222448  XER: 2004
  CFAR: c0009100 IRQMASK: 0
  GPR00: 003d fd00 c18cfb00 c000f02b3166
  GPR04: fffd 0007 fffb 0030
  GPR08: 0037 28222448  c0ca8de0
  GPR12: 92009032 c1ae c0010a00 
  GPR16:    
  GPR20: c000f00322c0 c0f85200 0004 
  GPR24: fffe   000a
  GPR28:   c000f02b391c c000f02b3167
  NIP [c0009108] decrementer_common+0x18/0x160
  LR [c0cadbcc] .vsnprintf+0x3ec/0x4f0
  Call Trace:
  Instruction dump:
  996d098a 994d098b 38610070 480246ed 48005518 6000 3820 718a4000
  7c2a0b78 3821fd00 41c20008 e82d0970 <0981fd00> f92101a0 f9610170 f9810178

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h |  7 --
 arch/powerpc/include/asm/paca.h  |  2 +
 arch/powerpc/kernel/asm-offsets.c|  2 +
 arch/powerpc/kernel/exceptions-64s.S | 95 
 arch/powerpc/xmon/xmon.c |  2 +
 5 files changed, 22 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index dc6a5ccac965..79e5ac87c029 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -55,13 +55,6 @@
  */
 #define MAX_MCE_DEPTH  4
 
-/*
- * EX_R3 is only used by the bad_stack handler. bad_stack reloads and
- * saves DAR from SPRN_DAR, and EX_DAR is not used. So EX_R3 can overlap
- * with EX_DAR.
- */
-#define EX_R3  EX_DAR
-
 #ifdef __ASSEMBLY__
 
 #define STF_ENTRY_BARRIER_SLOT \
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 9bd2326bef6f..e3cc9eb9204d 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -166,7 +166,9 @@ struct paca_struct {
u64 kstack; /* Saved Kernel stack addr */
u64 saved_r1;   /* r1 save for RTAS calls or PM or EE=0 
*/
u64 saved_msr;  /* MSR saved here by enter_rtas */
+#ifdef CONFIG_PPC_BOOK3E
u16 trap_save;  /* Used when bad stack is encountered */
+#endif
u8 irq_soft_mask;   /* mask for irq soft masking */
u8 irq_happened;/* irq happened while soft-disabled */
u8 irq_work_pending;/* IRQ_WORK interrupt while 
soft-disable */
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 31dc7e64cbfc..4ccb6b3a7fbd 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -266,7 +266,9 @@ int main(void)
OFFSET(ACCOUNT_STARTTIME_USER, paca_struct, accounting.starttime_user);
OFFSET(ACCOUNT_USER_TIME, paca_struct, accounting.utime);
OFFSET(ACCOUNT_SYSTEM_TIME, paca_struct, accounting.stime);
+#ifdef CONFIG_PPC_BOOK3E
OFFSET(PACA_TRAP_SAVE, paca_struct, trap_save);
+#endif
OFFSET(PACA_SPRG_VDSO, paca_struct, sprg_vdso);
 #else /* CONFIG_PPC64 */
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index ce7aad9d3840..b402a006cd48 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -351,14 +351,8 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948)
subir1,r1,INT_FRAME_SIZE;   /* alloc frame on kernel stack  */ \
beq-1f;\

[PATCH 23/28] powerpc/64s/exception: generate regs clear instructions using .rept

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/exceptions-64s.S | 29 +++-
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index a0721c3fc097..ce7aad9d3840 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -2018,12 +2018,11 @@ BEGIN_FTR_SECTION
mtmsrd  r10
sync
 
-#define FMR2(n)  fmr (n), (n) ; fmr n+1, n+1
-#define FMR4(n)  FMR2(n) ; FMR2(n+2)
-#define FMR8(n)  FMR4(n) ; FMR4(n+4)
-#define FMR16(n) FMR8(n) ; FMR8(n+8)
-#define FMR32(n) FMR16(n) ; FMR16(n+16)
-   FMR32(0)
+   .Lreg=0
+   .rept 32
+   fmr .Lreg,.Lreg
+   .Lreg=.Lreg+1
+   .endr
 
 FTR_SECTION_ELSE
 /*
@@ -2035,12 +2034,11 @@ FTR_SECTION_ELSE
mtmsrd  r10
sync
 
-#define XVCPSGNDP2(n) XVCPSGNDP(n,n,n) ; XVCPSGNDP(n+1,n+1,n+1)
-#define XVCPSGNDP4(n) XVCPSGNDP2(n) ; XVCPSGNDP2(n+2)
-#define XVCPSGNDP8(n) XVCPSGNDP4(n) ; XVCPSGNDP4(n+4)
-#define XVCPSGNDP16(n) XVCPSGNDP8(n) ; XVCPSGNDP8(n+8)
-#define XVCPSGNDP32(n) XVCPSGNDP16(n) ; XVCPSGNDP16(n+16)
-   XVCPSGNDP32(0)
+   .Lreg=0
+   .rept 32
+   XVCPSGNDP(.Lreg,.Lreg,.Lreg)
+   .Lreg=.Lreg+1
+   .endr
 
 ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
 
@@ -2051,7 +2049,12 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
  * To denormalise we need to move a copy of the register to itself.
  * For POWER8 we need to do that for all 64 VSX registers
  */
-   XVCPSGNDP32(32)
+   .Lreg=32
+   .rept 32
+   XVCPSGNDP(.Lreg,.Lreg,.Lreg)
+   .Lreg=.Lreg+1
+   .endr
+
 denorm_done:
mfspr   r11,SPRN_HSRR0
subir11,r11,4
-- 
2.20.1



[PATCH 22/28] powerpc/64s/exception: fix indenting irregularities

2019-06-11 Thread Nicholas Piggin
Generally, macros that result in instructions being expanded are
indented by a tab, and those that don't have no indent. Fix the
obvious cases that go contrary to style.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/exceptions-64s.S | 92 ++--
 1 file changed, 46 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 1c11a7330856..a0721c3fc097 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -269,16 +269,16 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
cmpwi   r10,KVM_GUEST_MODE_SKIP
beq 89f
.else
-   BEGIN_FTR_SECTION_NESTED(947)
+BEGIN_FTR_SECTION_NESTED(947)
ld  r10,\area+EX_CFAR(r13)
std r10,HSTATE_CFAR(r13)
-   END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947)
+END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947)
.endif
 
-   BEGIN_FTR_SECTION_NESTED(948)
+BEGIN_FTR_SECTION_NESTED(948)
ld  r10,\area+EX_PPR(r13)
std r10,HSTATE_PPR(r13)
-   END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948)
+END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948)
ld  r10,\area+EX_R10(r13)
std r12,HSTATE_SCRATCH0(r13)
sldir12,r9,32
@@ -380,10 +380,10 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
std r9,GPR11(r1);  \
std r10,GPR12(r1); \
std r11,GPR13(r1); \
-   BEGIN_FTR_SECTION_NESTED(66);  \
+BEGIN_FTR_SECTION_NESTED(66); \
ld  r10,area+EX_CFAR(r13); \
std r10,ORIG_GPR3(r1); \
-   END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66);\
+END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66);
   \
GET_CTR(r10, area);\
std r10,_CTR(r1);
 
@@ -802,7 +802,7 @@ EXC_REAL_BEGIN(system_reset, 0x100, 0x100)
 * but we branch to the 0xc000... address so we can turn on relocation
 * with mtmsr.
 */
-   BEGIN_FTR_SECTION
+BEGIN_FTR_SECTION
mfspr   r10,SPRN_SRR1
rlwinm. r10,r10,47-31,30,31
beq-1f
@@ -811,7 +811,7 @@ EXC_REAL_BEGIN(system_reset, 0x100, 0x100)
bltlr   cr1 /* no state loss, return to idle caller */
BRANCH_TO_C000(r10, system_reset_idle_common)
 1:
-   END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
+END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
 #endif
 
KVMTEST EXC_STD 0x100
@@ -1159,10 +1159,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
 *
 * Go back to nap/sleep/winkle mode again if (b) is true.
 */
-   BEGIN_FTR_SECTION
+BEGIN_FTR_SECTION
rlwinm. r11,r12,47-31,30,31
bne machine_check_idle_common
-   END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
+END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
 #endif
 
/*
@@ -1269,13 +1269,13 @@ EXC_COMMON_BEGIN(mce_return)
b   .
 
 EXC_REAL_BEGIN(data_access, 0x300, 0x80)
-SET_SCRATCH0(r13)  /* save r13 */
-EXCEPTION_PROLOG_0 PACA_EXGEN
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0 PACA_EXGEN
b   tramp_real_data_access
 EXC_REAL_END(data_access, 0x300, 0x80)
 
 TRAMP_REAL_BEGIN(tramp_real_data_access)
-EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, 0x300, 0
+   EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, 0x300, 0
/*
 * DAR/DSISR must be read before setting MSR[RI], because
 * a d-side MCE will clobber those registers so is not
@@ -1288,9 +1288,9 @@ EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, 0x300, 0
 EXCEPTION_PROLOG_2_REAL data_access_common, EXC_STD, 1
 
 EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
-SET_SCRATCH0(r13)  /* save r13 */
-EXCEPTION_PROLOG_0 PACA_EXGEN
-EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 0, 0x300, 0
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0 PACA_EXGEN
+   EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 0, 0x300, 0
mfspr   r10,SPRN_DAR
mfspr   r11,SPRN_DSISR
std r10,PACA_EXGEN+EX_DAR(r13)
@@ -1323,24 +1323,24 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 
 EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
-SET_SCRATCH0(r13)  /* save r13 */
-EXCEPTION_PROLOG_0 PACA_EXSLB
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0 PACA_EXSLB
b   tramp_real_data_access_slb
 EXC_REAL_END(data_access_slb, 0x380, 0x80)
 
 TRAMP_REAL_BEGIN(tramp_real_data_access_slb)
-EXCEPTION_PROLOG_1 EXC_STD, PACA_EXSL

[PATCH 21/28] powerpc/64s/exception: use a gas macro for system call handler code

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
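The conversion leans on gas conditional assembly: one macro body, with
.if \real selecting the return path when the macro is expanded. A
self-contained sketch of the pattern (hypothetical macro name, not from
the patch):

.macro RETURN_PATH real
	.if \real
	rfid			/* real mode: return through SRR0/SRR1 */
	.else
	bctr			/* virt/relocatable: branch through CTR */
	.endif
.endm

	RETURN_PATH 1		/* expands to just the rfid */
	RETURN_PATH 0		/* expands to just the bctr */
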
 arch/powerpc/kernel/exceptions-64s.S | 127 ---
 1 file changed, 55 insertions(+), 72 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 8a65ae64ed54..1c11a7330856 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1615,6 +1615,7 @@ EXC_COMMON(trap_0b_common, 0xb00, unknown_exception)
  * without saving, though xer is not a good idea to use, as hardware may
  * interpret some bits so it may be costly to change them.
  */
+.macro SYSTEM_CALL real
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
/*
 * There is a little bit of juggling to get syscall and hcall
@@ -1624,95 +1625,77 @@ EXC_COMMON(trap_0b_common, 0xb00, unknown_exception)
 * Userspace syscalls have already saved the PPR, hcalls must save
 * it before setting HMT_MEDIUM.
 */
-#define SYSCALL_KVMTEST						\
-   mtctr   r13;\
-   GET_PACA(r13);  \
-   std r10,PACA_EXGEN+EX_R10(r13); \
-   INTERRUPT_TO_KERNEL;\
-	KVMTEST EXC_STD 0xc00 ; /* uses r10, branch to do_kvm_0xc00_system_call */ \
-   HMT_MEDIUM; \
-   mfctr   r9;
-
+   mtctr   r13
+   GET_PACA(r13)
+   std r10,PACA_EXGEN+EX_R10(r13)
+   INTERRUPT_TO_KERNEL
+   KVMTEST EXC_STD 0xc00 /* uses r10, branch to do_kvm_0xc00_system_call */
+   HMT_MEDIUM
+   mfctr   r9
 #else
-#define SYSCALL_KVMTEST						\
-   HMT_MEDIUM; \
-   mr  r9,r13; \
-   GET_PACA(r13);  \
-   INTERRUPT_TO_KERNEL;
+   HMT_MEDIUM
+   mr  r9,r13
+   GET_PACA(r13)
+   INTERRUPT_TO_KERNEL
 #endif
-   
-#define LOAD_SYSCALL_HANDLER(reg)  \
-   __LOAD_HANDLER(reg, system_call_common)
-
-/*
- * After SYSCALL_KVMTEST, we reach here with PACA in r13, r13 in r9,
- * and HMT_MEDIUM.
- */
-#define SYSCALL_REAL   \
-   mfspr   r11,SPRN_SRR0 ; \
-   mfspr   r12,SPRN_SRR1 ; \
-   LOAD_SYSCALL_HANDLER(r10) ; \
-   mtspr   SPRN_SRR0,r10 ; \
-   ld  r10,PACAKMSR(r13) ; \
-   mtspr   SPRN_SRR1,r10 ; \
-   RFI_TO_KERNEL ; \
-   b   . ; /* prevent speculative execution */
 
 #ifdef CONFIG_PPC_FAST_ENDIAN_SWITCH
-#define SYSCALL_FASTENDIAN_TEST\
-BEGIN_FTR_SECTION  \
-   cmpdi   r0,0x1ebe ; \
-   beq-1f ;\
-END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE) \
-
-#define SYSCALL_FASTENDIAN \
-   /* Fast LE/BE switch system call */ \
-1: mfspr   r12,SPRN_SRR1 ; \
-   xorir12,r12,MSR_LE ;\
-   mtspr   SPRN_SRR1,r12 ; \
-   mr  r13,r9 ;\
-   RFI_TO_USER ;   /* return to userspace */   \
-   b   . ; /* prevent speculative execution */
-#else
-#define SYSCALL_FASTENDIAN_TEST
-#define SYSCALL_FASTENDIAN
-#endif /* CONFIG_PPC_FAST_ENDIAN_SWITCH */
+BEGIN_FTR_SECTION
+   cmpdi   r0,0x1ebe
+   beq-1f
+END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
+#endif
+   /* We reach here with PACA in r13, r13 in r9, and HMT_MEDIUM. */
 
-#if defined(CONFIG_RELOCATABLE)
+   .if \real
+   mfspr   r11,SPRN_SRR0
+   mfspr   r12,SPRN_SRR1
+   __LOAD_HANDLER(r10, system_call_common)
+   mtspr   SPRN_SRR0,r10
+   ld  r10,PACAKMSR(r13)
+   mtspr   SPRN_SRR1,r10
+   RFI_TO_KERNEL
+   b   .   /* prevent speculative execution */
+   .else
+#ifdef CONFIG_RELOCATABLE
/*
 * We can't branch directly so we do it via the CTR which
 * is volatile across system calls.
 */
-#define SYSCALL_VIRT   \
-   LOAD_SYSCALL_HANDLER(r10) ; \
-   mtctr   r10 ;   \
-   mfspr   r11,SPRN_SRR0 ; \
- 

[PATCH 20/28] powerpc/64s/exception: remove __BRANCH_TO_KVM

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
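For reference, the far-branch idiom being inlined builds a full absolute
address and branches through CTR, because a plain `b` cannot reach
targets outside the head section. A sketch under assumed offsets (KBASE
and SCRATCH are hypothetical paca slots, not the kernel's names):

	.set	KBASE, 16
	.set	SCRATCH, 24

	mfctr	r9		/* the caller expects CTR preserved */
	std	r9,SCRATCH(r13)
	ld	r9,KBASE(r13)	/* kernel base address */
	ori	r9,r9,target@l	/* OR in the low 16 bits of the offset */
	addis	r9,r9,target@h	/* add the high 16 bits */
	mtctr	r9
	bctr			/* indirect branch reaches anywhere */
target:
	nop
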
 arch/powerpc/kernel/exceptions-64s.S | 43 
 1 file changed, 18 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 013abf3ea6f6..8a65ae64ed54 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -243,29 +243,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 #endif
 
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
-
-#ifdef CONFIG_RELOCATABLE
-/*
- * KVM requires __LOAD_FAR_HANDLER.
- *
- * __BRANCH_TO_KVM_EXIT branches are also a special case because they
- * explicitly use r9 then reload it from PACA before branching. Hence
- * the double-underscore.
- */
-#define __BRANCH_TO_KVM_EXIT(area, label)  \
-   mfctr   r9; \
-   std r9,HSTATE_SCRATCH1(r13);\
-   __LOAD_FAR_HANDLER(r9, label);  \
-   mtctr   r9; \
-   ld  r9,area+EX_R9(r13); \
-   bctr
-
-#else
-#define __BRANCH_TO_KVM_EXIT(area, label)  \
-   ld  r9,area+EX_R9(r13); \
-   b   label
-#endif
-
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 /*
  * If hv is possible, interrupts come into to the hv version
@@ -311,8 +288,24 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
.else
ori r12,r12,(\n)
.endif
-   /* This reloads r9 before branching to kvmppc_interrupt */
-   __BRANCH_TO_KVM_EXIT(\area, kvmppc_interrupt)
+
+#ifdef CONFIG_RELOCATABLE
+   /*
+	 * KVM requires __LOAD_FAR_HANDLER because kvmppc_interrupt lives
+* outside the head section. CONFIG_RELOCATABLE KVM expects CTR
+* to be saved in HSTATE_SCRATCH1.
+*/
+   mfctr   r9
+   std r9,HSTATE_SCRATCH1(r13)
+   __LOAD_FAR_HANDLER(r9, kvmppc_interrupt)
+   mtctr   r9
+   ld  r9,\area+EX_R9(r13)
+   bctr
+#else
+   ld  r9,\area+EX_R9(r13)
+   b   kvmppc_interrupt
+#endif
+
 
.if \skip
 89:mtocrf  0x80,r9
-- 
2.20.1



[PATCH 19/28] powerpc/64s/exception: move head-64.h code to exception-64s.S where it is used

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h |   1 -
 arch/powerpc/include/asm/head-64.h   | 252 ---
 arch/powerpc/kernel/exceptions-64s.S | 251 ++
 3 files changed, 251 insertions(+), 253 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 9e6712099f7a..dc6a5ccac965 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -30,7 +30,6 @@
  * exception handlers (including pSeries LPAR) and iSeries LPAR
  * implementations as possible.
  */
-#include 
 #include 
 
 /* PACA save area offsets (exgen, exmc, etc) */
diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index dc1940c94a86..a466765709a9 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -169,53 +169,6 @@ end_##sname:
 
 #define ABS_ADDR(label) (label - fs_label + fs_start)
 
-/*
- * Following are the BOOK3S exception handler helper macros.
- * Handlers come in a number of types, and each type has a number of varieties.
- *
- * EXC_REAL_* - real, unrelocated exception vectors
- * EXC_VIRT_* - virt (AIL), unrelocated exception vectors
- * TRAMP_REAL_*   - real, unrelocated helpers (virt can call these)
- * TRAMP_VIRT_*   - virt, unreloc helpers (in practice, real can use)
- * TRAMP_KVM  - KVM handlers that get put into real, unrelocated
- * EXC_COMMON - virt, relocated common handlers
- *
- * The EXC handlers are given a name, and branch to name_common, or the
- * appropriate KVM or masking function. Vector handler verieties are as
- * follows:
- *
- * EXC_{REAL|VIRT}_BEGIN/END - used to open-code the exception
- *
- * EXC_{REAL|VIRT}  - standard exception
- *
- * EXC_{REAL|VIRT}_suffix
- * where _suffix is:
- *   - _MASKABLE   - maskable exception
- *   - _OOL- out of line with trampoline to common handler
- *   - _HV - HV exception
- *
- * There can be combinations, e.g., EXC_VIRT_OOL_MASKABLE_HV
- *
- * The one unusual case is __EXC_REAL_OOL_HV_DIRECT, which is
- * an OOL vector that branches to a specified handler rather than the usual
- * trampoline that goes to common. It, and other underscore macros, should
- * be used with care.
- *
- * KVM handlers come in the following verieties:
- * TRAMP_KVM
- * TRAMP_KVM_SKIP
- * TRAMP_KVM_HV
- * TRAMP_KVM_HV_SKIP
- *
- * COMMON handlers come in the following verieties:
- * EXC_COMMON_BEGIN/END - used to open-code the handler
- * EXC_COMMON
- * EXC_COMMON_ASYNC
- *
- * TRAMP_REAL and TRAMP_VIRT can be used with BEGIN/END. KVM
- * and OOL handlers are implemented as types of TRAMP and TRAMP_VIRT handlers.
- */
-
 #define EXC_REAL_BEGIN(name, start, size)  \
	FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##name, start, size)
 
@@ -257,211 +210,6 @@ end_##sname:
	FIXED_SECTION_ENTRY_BEGIN_LOCATION(virt_vectors, exc_virt_##start##_##unused, start, size); \
	FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##unused, start, size)
 
-
-#define __EXC_REAL(name, start, size, area)\
-   EXC_REAL_BEGIN(name, start, size);  \
-   SET_SCRATCH0(r13);  /* save r13 */  \
-   EXCEPTION_PROLOG_0 area ;   \
-   EXCEPTION_PROLOG_1 EXC_STD, area, 1, start, 0 ; \
-   EXCEPTION_PROLOG_2_REAL name##_common, EXC_STD, 1 ; \
-   EXC_REAL_END(name, start, size)
-
-#define EXC_REAL(name, start, size)\
-   __EXC_REAL(name, start, size, PACA_EXGEN)
-
-#define __EXC_VIRT(name, start, size, realvec, area)   \
-   EXC_VIRT_BEGIN(name, start, size);  \
-   SET_SCRATCH0(r13);/* save r13 */\
-   EXCEPTION_PROLOG_0 area ;   \
-   EXCEPTION_PROLOG_1 EXC_STD, area, 0, realvec, 0;\
-   EXCEPTION_PROLOG_2_VIRT name##_common, EXC_STD ;\
-   EXC_VIRT_END(name, start, size)
-
-#define EXC_VIRT(name, start, size, realvec)   \
-   __EXC_VIRT(name, start, size, realvec, PACA_EXGEN)
-
-#define EXC_REAL_MASKABLE(name, start, size, bitmask)  \
-   EXC_REAL_BEGIN(name, start, size);  \
-   SET_SCRATCH0(r13);/* save r13 */\
-   EXCEPTION_PROLOG_0 PACA_EXGEN ; \
-   EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, start, bitmask ; \
-   EXCEPTION_PROLOG_2_REAL name##_common, EXC_STD, 1 ; \
-   EXC_REAL_END(name, start, size)
-
-#define EXC_VIRT_MASKABLE(name, start, size, real

[PATCH 18/28] powerpc/64s/exception: move exception-64s.h code to exception-64s.S where it is used

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
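The LOAD_HANDLER comment being moved deserves a worked example: ori can
only supply a 16-bit immediate, so the handler must sit within 64KB of a
64K-aligned kernelbase. With assumed numbers, kernelbase
0xc000000000000000 and a label at offset 0x4f00 (KBASE is a hypothetical
paca slot):

	.set	KBASE, 16
	ld	r12,KBASE(r13)		/* r12 = 0xc000000000000000 */
	ori	r12,r12,0x4f00		/* r12 = 0xc000000000004f00 */

An offset of, say, 0x14f00 cannot be expressed this way, which is why
__LOAD_FAR_HANDLER adds the addis step.
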
 arch/powerpc/include/asm/exception-64s.h | 430 --
 arch/powerpc/kernel/exceptions-64s.S | 431 +++
 2 files changed, 431 insertions(+), 430 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index e996ffe68cf3..9e6712099f7a 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -146,436 +146,6 @@
hrfid;  \
b   hrfi_flush_fallback
 
-/*
- * We're short on space and time in the exception prolog, so we can't
- * use the normal LOAD_REG_IMMEDIATE macro to load the address of label.
- * Instead we get the base of the kernel from paca->kernelbase and or in the low
- * part of label. This requires that the label be within 64KB of kernelbase, and
- * that kernelbase be 64K aligned.
- */
-#define LOAD_HANDLER(reg, label)   \
-   ld  reg,PACAKBASE(r13); /* get high part of &label */   \
-   ori reg,reg,FIXED_SYMBOL_ABS_ADDR(label)
-
-#define __LOAD_HANDLER(reg, label) \
-   ld  reg,PACAKBASE(r13); \
-   ori reg,reg,(ABS_ADDR(label))@l
-
-/*
- * Branches from unrelocated code (e.g., interrupts) to labels outside
- * head-y require >64K offsets.
- */
-#define __LOAD_FAR_HANDLER(reg, label) \
-   ld  reg,PACAKBASE(r13); \
-   ori reg,reg,(ABS_ADDR(label))@l;\
-   addis   reg,reg,(ABS_ADDR(label))@h
-
-/* Exception register prefixes */
-#define EXC_HV 1
-#define EXC_STD0
-
-#if defined(CONFIG_RELOCATABLE)
-/*
- * If we support interrupts with relocation on AND we're a relocatable kernel,
- * we need to use CTR to get to the 2nd level handler.  So, save/restore it
- * when required.
- */
-#define SAVE_CTR(reg, area)mfctr   reg ;   std reg,area+EX_CTR(r13)
-#define GET_CTR(reg, area) ld  reg,area+EX_CTR(r13)
-#define RESTORE_CTR(reg, area) ld  reg,area+EX_CTR(r13) ; mtctr reg
-#else
-/* ...else CTR is unused and in register. */
-#define SAVE_CTR(reg, area)
-#define GET_CTR(reg, area) mfctr   reg
-#define RESTORE_CTR(reg, area)
-#endif
-
-/*
- * PPR save/restore macros used in exceptions_64s.S  
- * Used for P7 or later processors
- */
-#define SAVE_PPR(area, ra) \
-BEGIN_FTR_SECTION_NESTED(940)  \
-   ld  ra,area+EX_PPR(r13);/* Read PPR from paca */\
-   std ra,_PPR(r1);\
-END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)
-
-#define RESTORE_PPR_PACA(area, ra) \
-BEGIN_FTR_SECTION_NESTED(941)  \
-   ld  ra,area+EX_PPR(r13);\
-   mtspr   SPRN_PPR,ra;\
-END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,941)
-
-/*
- * Get an SPR into a register if the CPU has the given feature
- */
-#define OPT_GET_SPR(ra, spr, ftr)  \
-BEGIN_FTR_SECTION_NESTED(943)  \
-   mfspr   ra,spr; \
-END_FTR_SECTION_NESTED(ftr,ftr,943)
-
-/*
- * Set an SPR from a register if the CPU has the given feature
- */
-#define OPT_SET_SPR(ra, spr, ftr)  \
-BEGIN_FTR_SECTION_NESTED(943)  \
-   mtspr   spr,ra; \
-END_FTR_SECTION_NESTED(ftr,ftr,943)
-
-/*
- * Save a register to the PACA if the CPU has the given feature
- */
-#define OPT_SAVE_REG_TO_PACA(offset, ra, ftr)  \
-BEGIN_FTR_SECTION_NESTED(943)  \
-   std ra,offset(r13); \
-END_FTR_SECTION_NESTED(ftr,ftr,943)
-
-.macro EXCEPTION_PROLOG_0 area
-   GET_PACA(r13)
-   std r9,\area\()+EX_R9(r13)  /* save r9 */
-   OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR)
-   HMT_MEDIUM
-   std r10,\area\()+EX_R10(r13)/* save r10 - r12 */
-   OPT_GET_SPR(r10, SPRN_CFAR, CPU_FTR_CFAR)
-.endm
-
-.macro EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, bitmask
-   OPT_SAVE_REG_TO_PACA(\area\()+EX_PPR, r9, CPU_FTR_HAS_PPR)
-   OPT_SAVE_REG_TO_PACA(\area\()+EX_CFAR, r10, CPU_FTR_CFAR)
-   INTERRUPT_TO_KERNEL
-   SAVE_CTR(r10, \area\())
-   mfcrr9
-   .if \kvm
-   KVMTEST \hsrr \vec
-   .endif
-   .if \bitmask
-   lbz r10,PACAIRQ

[PATCH 17/28] powerpc/64s/exception: move KVM related code together

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 40 +---
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 73705421f423..e996ffe68cf3 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -335,18 +335,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 #endif
 .endm
 
-
-#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
-/*
- * If hv is possible, interrupts come into to the hv version
- * of the kvmppc_interrupt code, which then jumps to the PR handler,
- * kvmppc_interrupt_pr, if the guest is a PR guest.
- */
-#define kvmppc_interrupt kvmppc_interrupt_hv
-#else
-#define kvmppc_interrupt kvmppc_interrupt_pr
-#endif
-
 /*
  * Branch to label using its 0xC000 address. This results in instruction
  * address suitable for MSR[IR]=0 or 1, which allows relocation to be turned
@@ -371,6 +359,17 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
mtctr   r12;\
bctrl
 
+#else
+#define BRANCH_TO_COMMON(reg, label)   \
+   b   label
+
+#define BRANCH_LINK_TO_FAR(label)  \
+   bl  label
+#endif
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+
+#ifdef CONFIG_RELOCATABLE
 /*
  * KVM requires __LOAD_FAR_HANDLER.
  *
@@ -387,19 +386,22 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
bctr
 
 #else
-#define BRANCH_TO_COMMON(reg, label)   \
-   b   label
-
-#define BRANCH_LINK_TO_FAR(label)  \
-   bl  label
-
 #define __BRANCH_TO_KVM_EXIT(area, label)  \
ld  r9,area+EX_R9(r13); \
b   label
+#endif
 
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+/*
+ * If hv is possible, interrupts come into to the hv version
+ * of the kvmppc_interrupt code, which then jumps to the PR handler,
+ * kvmppc_interrupt_pr, if the guest is a PR guest.
+ */
+#define kvmppc_interrupt kvmppc_interrupt_hv
+#else
+#define kvmppc_interrupt kvmppc_interrupt_pr
 #endif
 
-#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
 .macro KVMTEST hsrr, n
lbz r10,HSTATE_IN_GUEST(r13)
cmpwi   r10,0
-- 
2.20.1



[PATCH 16/28] powerpc/64s/exception: remove STD_EXCEPTION_COMMON variants

2019-06-11 Thread Nicholas Piggin
These are only called in one place each.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 22 --
 arch/powerpc/include/asm/head-64.h   | 19 +--
 2 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 6de3c393ddf7..73705421f423 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -555,28 +555,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
EXCEPTION_PROLOG_COMMON_2(area);\
EXCEPTION_PROLOG_COMMON_3(trap)
 
-#define STD_EXCEPTION_COMMON(trap, hdlr)   \
-   EXCEPTION_COMMON(PACA_EXGEN, trap); \
-   bl  save_nvgprs;\
-   RECONCILE_IRQ_STATE(r10, r11);  \
-   addir3,r1,STACK_FRAME_OVERHEAD; \
-   bl  hdlr;   \
-   b   ret_from_except
-
-/*
- * Like STD_EXCEPTION_COMMON, but for exceptions that can occur
- * in the idle task and therefore need the special idle handling
- * (finish nap and runlatch)
- */
-#define STD_EXCEPTION_COMMON_ASYNC(trap, hdlr) \
-   EXCEPTION_COMMON(PACA_EXGEN, trap); \
-   FINISH_NAP; \
-   RECONCILE_IRQ_STATE(r10, r11);  \
-   RUNLATCH_ON;\
-   addir3,r1,STACK_FRAME_OVERHEAD; \
-   bl  hdlr;   \
-   b   ret_from_except_lite
-
 /*
  * When the idle code in power4_idle puts the CPU into NAP mode,
  * it has to do so in a loop, and relies on the external interrupt
diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index 54db05afb80f..dc1940c94a86 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -441,11 +441,26 @@ end_##sname:
 
 #define EXC_COMMON(name, realvec, hdlr)					\
EXC_COMMON_BEGIN(name); \
-   STD_EXCEPTION_COMMON(realvec, hdlr)
+   EXCEPTION_COMMON(PACA_EXGEN, realvec);  \
+   bl  save_nvgprs;\
+   RECONCILE_IRQ_STATE(r10, r11);  \
+   addir3,r1,STACK_FRAME_OVERHEAD; \
+   bl  hdlr;   \
+   b   ret_from_except
 
+/*
+ * Like EXC_COMMON, but for exceptions that can occur in the idle task and
+ * therefore need the special idle handling (finish nap and runlatch)
+ */
 #define EXC_COMMON_ASYNC(name, realvec, hdlr)  \
EXC_COMMON_BEGIN(name); \
-   STD_EXCEPTION_COMMON_ASYNC(realvec, hdlr)
+   EXCEPTION_COMMON(PACA_EXGEN, realvec);  \
+   FINISH_NAP; \
+   RECONCILE_IRQ_STATE(r10, r11);  \
+   RUNLATCH_ON;\
+   addir3,r1,STACK_FRAME_OVERHEAD; \
+   bl  hdlr;   \
+   b   ret_from_except_lite
 
 #endif /* __ASSEMBLY__ */
 
-- 
2.20.1



[PATCH 15/28] powerpc/64s/exception: move EXCEPTION_PROLOG_2* to a more logical place

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 113 ---
 1 file changed, 57 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 0bb0310b794f..6de3c393ddf7 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -170,62 +170,6 @@
ori reg,reg,(ABS_ADDR(label))@l;\
addis   reg,reg,(ABS_ADDR(label))@h
 
-.macro EXCEPTION_PROLOG_2_REAL label, hsrr, set_ri
-   ld  r10,PACAKMSR(r13)   /* get MSR value for kernel */
-   .if ! \set_ri
-   xorir10,r10,MSR_RI  /* Clear MSR_RI */
-   .endif
-   .if \hsrr
-   mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
-   .else
-   mfspr   r11,SPRN_SRR0   /* save SRR0 */
-   .endif
-   LOAD_HANDLER(r12, \label\())
-   .if \hsrr
-   mtspr   SPRN_HSRR0,r12
-   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
-   mtspr   SPRN_HSRR1,r10
-   HRFI_TO_KERNEL
-   .else
-   mtspr   SPRN_SRR0,r12
-   mfspr   r12,SPRN_SRR1   /* and SRR1 */
-   mtspr   SPRN_SRR1,r10
-   RFI_TO_KERNEL
-   .endif
-   b   .   /* prevent speculative execution */
-.endm
-
-.macro EXCEPTION_PROLOG_2_VIRT label, hsrr
-#ifdef CONFIG_RELOCATABLE
-   .if \hsrr
-   mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
-   .else
-   mfspr   r11,SPRN_SRR0   /* save SRR0 */
-   .endif
-   LOAD_HANDLER(r12, \label\())
-   mtctr   r12
-   .if \hsrr
-   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
-   .else
-   mfspr   r12,SPRN_SRR1   /* and HSRR1 */
-   .endif
-   li  r10,MSR_RI
-   mtmsrd  r10,1   /* Set RI (EE=0) */
-   bctr
-#else
-   .if \hsrr
-   mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
-   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
-   .else
-   mfspr   r11,SPRN_SRR0   /* save SRR0 */
-   mfspr   r12,SPRN_SRR1   /* and SRR1 */
-   .endif
-   li  r10,MSR_RI
-   mtmsrd  r10,1   /* Set RI (EE=0) */
-   b   \label
-#endif
-.endm
-
 /* Exception register prefixes */
 #define EXC_HV 1
 #define EXC_STD0
@@ -335,6 +279,63 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
std r10,\area\()+EX_R13(r13)
 .endm
 
+.macro EXCEPTION_PROLOG_2_REAL label, hsrr, set_ri
+   ld  r10,PACAKMSR(r13)   /* get MSR value for kernel */
+   .if ! \set_ri
+   xorir10,r10,MSR_RI  /* Clear MSR_RI */
+   .endif
+   .if \hsrr
+   mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
+   .else
+   mfspr   r11,SPRN_SRR0   /* save SRR0 */
+   .endif
+   LOAD_HANDLER(r12, \label\())
+   .if \hsrr
+   mtspr   SPRN_HSRR0,r12
+   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
+   mtspr   SPRN_HSRR1,r10
+   HRFI_TO_KERNEL
+   .else
+   mtspr   SPRN_SRR0,r12
+   mfspr   r12,SPRN_SRR1   /* and SRR1 */
+   mtspr   SPRN_SRR1,r10
+   RFI_TO_KERNEL
+   .endif
+   b   .   /* prevent speculative execution */
+.endm
+
+.macro EXCEPTION_PROLOG_2_VIRT label, hsrr
+#ifdef CONFIG_RELOCATABLE
+   .if \hsrr
+   mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
+   .else
+   mfspr   r11,SPRN_SRR0   /* save SRR0 */
+   .endif
+   LOAD_HANDLER(r12, \label\())
+   mtctr   r12
+   .if \hsrr
+   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
+   .else
+   mfspr   r12,SPRN_SRR1   /* and HSRR1 */
+   .endif
+   li  r10,MSR_RI
+   mtmsrd  r10,1   /* Set RI (EE=0) */
+   bctr
+#else
+   .if \hsrr
+   mfspr   r11,SPRN_HSRR0  /* save HSRR0 */
+   mfspr   r12,SPRN_HSRR1  /* and HSRR1 */
+   .else
+   mfspr   r11,SPRN_SRR0   /* save SRR0 */
+   mfspr   r12,SPRN_SRR1   /* and SRR1 */
+   .endif
+   li  r10,MSR_RI
+   mtmsrd  r10,1   /* Set RI (EE=0) */
+   b   \label
+#endif
+.endm
+
+
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 /*
  * If hv is possible, interrupts come into to the hv version
-- 
2.20.1



[PATCH 14/28] powerpc/64s/exception: improve 0x500 handler code

2019-06-11 Thread Nicholas Piggin
After the previous cleanup, it becomes possible to consolidate some
common code outside the runtime alternative-patched feature sections.
Also remove unused labels.

This changes the generated code slightly, but the runtime instruction
sequence is unchanged.

Signed-off-by: Nicholas Piggin 
---
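The transform hoists instructions that were identical in both
feature-section arms to before BEGIN_FTR_SECTION, so the runtime-patched
alternatives shrink. Schematically (dummy macros so the sketch stands
alone; the real kernel macros emit patch-site records):

.macro BEGIN_FTR_SECTION
.endm
.macro FTR_SECTION_ELSE
.endm
.macro ALT_FTR_SECTION_END_IFSET mask
.endm

	mr	r13,r13		/* common prolog, now outside the */
	mr	r9,r9		/* patched region (placeholder ops) */
BEGIN_FTR_SECTION
	nop			/* HV-only remainder */
FTR_SECTION_ELSE
	nop			/* non-HV remainder */
ALT_FTR_SECTION_END_IFSET 0x100
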
 arch/powerpc/kernel/exceptions-64s.S | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index b8dba3fffeeb..c95dfc618a52 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -746,32 +746,24 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 
 EXC_REAL_BEGIN(hardware_interrupt, 0x500, 0x100)
-   .globl hardware_interrupt_hv
-hardware_interrupt_hv:
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0 PACA_EXGEN
BEGIN_FTR_SECTION
-   SET_SCRATCH0(r13)   /* save r13 */
-   EXCEPTION_PROLOG_0 PACA_EXGEN
EXCEPTION_PROLOG_1 EXC_HV, PACA_EXGEN, 1, 0x500, IRQS_DISABLED
EXCEPTION_PROLOG_2_REAL hardware_interrupt_common, EXC_HV, 1
FTR_SECTION_ELSE
-   SET_SCRATCH0(r13)   /* save r13 */
-   EXCEPTION_PROLOG_0 PACA_EXGEN
EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, 0x500, IRQS_DISABLED
EXCEPTION_PROLOG_2_REAL hardware_interrupt_common, EXC_STD, 1
ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
 EXC_REAL_END(hardware_interrupt, 0x500, 0x100)
 
 EXC_VIRT_BEGIN(hardware_interrupt, 0x4500, 0x100)
-   .globl hardware_interrupt_relon_hv
-hardware_interrupt_relon_hv:
+   SET_SCRATCH0(r13)   /* save r13 */
+   EXCEPTION_PROLOG_0 PACA_EXGEN
BEGIN_FTR_SECTION
-   SET_SCRATCH0(r13)   /* save r13 */
-   EXCEPTION_PROLOG_0 PACA_EXGEN
EXCEPTION_PROLOG_1 EXC_HV, PACA_EXGEN, 1, 0x500, IRQS_DISABLED
EXCEPTION_PROLOG_2_VIRT hardware_interrupt_common, EXC_HV
FTR_SECTION_ELSE
-   SET_SCRATCH0(r13)   /* save r13 */
-   EXCEPTION_PROLOG_0 PACA_EXGEN
EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, 0x500, IRQS_DISABLED
EXCEPTION_PROLOG_2_VIRT hardware_interrupt_common, EXC_STD
ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
-- 
2.20.1



[PATCH 13/28] powerpc/64s/exception: unwind exception-64s.h macros

2019-06-11 Thread Nicholas Piggin
Many of these macros just specify 1-4 lines of code and are only called
a few times each at most, often just once. Remove this indirection.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 101 ---
 arch/powerpc/include/asm/head-64.h   |  76 -
 arch/powerpc/kernel/exceptions-64s.S |  44 +-
 3 files changed, 82 insertions(+), 139 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 24fc0104c9d3..0bb0310b794f 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -226,17 +226,6 @@
 #endif
 .endm
 
-/*
- * As EXCEPTION_PROLOG(), except we've already got relocation on so no need to
- * rfid. Save CTR in case we're CONFIG_RELOCATABLE, in which case
- * EXCEPTION_PROLOG_2_VIRT will be using CTR.
- */
-#define EXCEPTION_RELON_PROLOG(area, label, hsrr, kvm, vec)\
-   SET_SCRATCH0(r13);  /* save r13 */  \
-   EXCEPTION_PROLOG_0 area ;   \
-   EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
-   EXCEPTION_PROLOG_2_VIRT label, hsrr
-
 /* Exception register prefixes */
 #define EXC_HV 1
 #define EXC_STD0
@@ -346,12 +335,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
std r10,\area\()+EX_R13(r13)
 .endm
 
-#define EXCEPTION_PROLOG(area, label, hsrr, kvm, vec)  \
-   SET_SCRATCH0(r13);  /* save r13 */  \
-   EXCEPTION_PROLOG_0 area ;   \
-   EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
-   EXCEPTION_PROLOG_2_REAL label, hsrr, 1
-
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 /*
  * If hv is possible, interrupts come into to the hv version
@@ -415,12 +398,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #endif
 
-/* Do not enable RI */
-#define EXCEPTION_PROLOG_NORI(area, label, hsrr, kvm, vec) \
-   EXCEPTION_PROLOG_0 area ;   \
-   EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
-   EXCEPTION_PROLOG_2_REAL label, hsrr, 0
-
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
 .macro KVMTEST hsrr, n
lbz r10,HSTATE_IN_GUEST(r13)
@@ -557,84 +534,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
std r10,RESULT(r1); /* clear regs->result   */ \
std r11,STACK_FRAME_OVERHEAD-16(r1); /* mark the frame  */
 
-/*
- * Exception vectors.
- */
-#define STD_EXCEPTION(vec, label)  \
-   EXCEPTION_PROLOG(PACA_EXGEN, label, EXC_STD, 1, vec);
-
-/* Version of above for when we have to branch out-of-line */
-#define __OOL_EXCEPTION(vec, label, hdlr)  \
-   SET_SCRATCH0(r13);  \
-   EXCEPTION_PROLOG_0 PACA_EXGEN ; \
-   b hdlr
-
-#define STD_EXCEPTION_OOL(vec, label)  \
-   EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, vec, 0 ; \
-   EXCEPTION_PROLOG_2_REAL label, EXC_STD, 1
-
-#define STD_EXCEPTION_HV(loc, vec, label)  \
-   EXCEPTION_PROLOG(PACA_EXGEN, label, EXC_HV, 1, vec)
-
-#define STD_EXCEPTION_HV_OOL(vec, label)   \
-   EXCEPTION_PROLOG_1 EXC_HV, PACA_EXGEN, 1, vec, 0 ;  \
-   EXCEPTION_PROLOG_2_REAL label, EXC_HV, 1
-
-#define STD_RELON_EXCEPTION(loc, vec, label)   \
-   /* No guest interrupts come through here */ \
-   EXCEPTION_RELON_PROLOG(PACA_EXGEN, label, EXC_STD, 0, vec)
-
-#define STD_RELON_EXCEPTION_OOL(vec, label)\
-   EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 0, vec, 0 ; \
-   EXCEPTION_PROLOG_2_VIRT label, EXC_STD
-
-#define STD_RELON_EXCEPTION_HV(loc, vec, label)\
-   EXCEPTION_RELON_PROLOG(PACA_EXGEN, label, EXC_HV, 1, vec)
-
-#define STD_RELON_EXCEPTION_HV_OOL(vec, label) \
-   EXCEPTION_PROLOG_1 EXC_HV, PACA_EXGEN, 1, vec, 0 ;  \
-   EXCEPTION_PROLOG_2_VIRT label, EXC_HV
-
-#define __MASKABLE_EXCEPTION(vec, label, hsrr, kvm, bitmask)   \
-   SET_SCRATCH0(r13);/* save r13 */\
-   EXCEPTION_PROLOG_0 PACA_EXGEN ; \
-   EXCEPTION_PROLOG_1 hsrr, PACA_EXGEN, kvm, vec, bitmask ;\
-   EXCEPTION_PROLOG_2_REAL label, hsrr, 1
-
-#define MASKABLE_EXCEPTION(vec, label, bitmask)				\
-   __MASKABLE_EXCEPTION(vec, label, EXC_STD, 1, bitmask)
-
-#define MASKABLE_EXCEPTION_OOL(vec, label, bitmask)\
-   EXCEPTION_PROLOG_1 EXC_STD, PACA_EXGEN, 1, vec, bitmask ;   \
-   EXCEPTION_PROLOG_2_REAL label, EXC_STD, 1
-
-#define MASKABLE_EXCEPTION_HV(vec, label, bitmask) \
-   __MASK

[PATCH 11/28] powerpc/64s/exception: Move EXCEPTION_COMMON handler and return branches into callers

2019-06-11 Thread Nicholas Piggin
The aim is to reduce the amount of indirection it takes to get through
the exception handler macros, particularly where it provides little
code sharing.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 26 
 arch/powerpc/kernel/exceptions-64s.S | 21 +++
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index f19c2391cc36..cc65e87cff2f 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -658,31 +658,28 @@ BEGIN_FTR_SECTION \
beqlppc64_runlatch_on_trampoline;   \
 END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 
-#define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
+#define EXCEPTION_COMMON(area, trap, label, additions) \
EXCEPTION_PROLOG_COMMON(trap, area);\
/* Volatile regs are potentially clobbered here */  \
-   additions;  \
-   addir3,r1,STACK_FRAME_OVERHEAD; \
-   bl  hdlr;   \
-   b   ret
+   additions
 
 /*
  * Exception where stack is already set in r1, r1 is saved in r10, and it
  * continues rather than returns.
  */
-#define EXCEPTION_COMMON_NORET_STACK(area, trap, label, hdlr, additions) \
+#define EXCEPTION_COMMON_NORET_STACK(area, trap, label, additions) \
EXCEPTION_PROLOG_COMMON_1();\
kuap_save_amr_and_lock r9, r10, cr1;\
EXCEPTION_PROLOG_COMMON_2(area);\
EXCEPTION_PROLOG_COMMON_3(trap);\
/* Volatile regs are potentially clobbered here */  \
-   additions;  \
-   addir3,r1,STACK_FRAME_OVERHEAD; \
-   bl  hdlr
+   additions
 
 #define STD_EXCEPTION_COMMON(trap, label, hdlr)\
-   EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr, \
-   ret_from_except, ADD_NVGPRS;ADD_RECONCILE)
+   EXCEPTION_COMMON(PACA_EXGEN, trap, label, ADD_NVGPRS;ADD_RECONCILE); \
+   addir3,r1,STACK_FRAME_OVERHEAD; \
+   bl  hdlr;   \
+   b   ret_from_except
 
 /*
  * Like STD_EXCEPTION_COMMON, but for exceptions that can occur
@@ -690,8 +687,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
  * (finish nap and runlatch)
  */
 #define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr)  \
-   EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr, \
-   ret_from_except_lite, FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON)
+   EXCEPTION_COMMON(PACA_EXGEN, trap, label,   \
+   FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON);  \
+   addir3,r1,STACK_FRAME_OVERHEAD; \
+   bl  hdlr;   \
+   b   ret_from_except_lite
 
 /*
  * When the idle code in power4_idle puts the CPU into NAP mode,
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 88f892167a64..63b161c23e9e 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -195,9 +195,10 @@ EXC_COMMON_BEGIN(system_reset_common)
mr  r10,r1
ld  r1,PACA_NMI_EMERG_SP(r13)
subir1,r1,INT_FRAME_SIZE
-   EXCEPTION_COMMON_NORET_STACK(PACA_EXNMI, 0x100,
-   system_reset, system_reset_exception,
-   ADD_NVGPRS;ADD_RECONCILE_NMI)
+   EXCEPTION_COMMON_NORET_STACK(PACA_EXNMI, 0x100, system_reset,
+   ADD_NVGPRS;ADD_RECONCILE_NMI)
+   addir3,r1,STACK_FRAME_OVERHEAD
+   bl  system_reset_exception
 
/* This (and MCE) can be simplified with mtmsrd L=1 */
/* Clear MSR_RI before setting SRR0 and SRR1. */
@@ -1171,8 +1172,11 @@ hmi_exception_after_realmode:
b   tramp_real_hmi_exception
 
 EXC_COMMON_BEGIN(hmi_exception_common)
-EXCEPTION_COMMON(PACA_EXGEN, 0xe60, hmi_exception_common, handle_hmi_exception,
-ret_from_except, FINISH_NAP;ADD_NVGPRS;ADD_RECONCILE;RUNLATCH_ON)
+EXCEPTION_COMMON(PACA_EXGEN, 0xe60, hmi_exception_common,
+   FINISH_NAP;ADD_NVGPRS;ADD_RECONCILE;RUNLATCH_ON)
+   addir3,r1,STACK_FRAME_OVERHEAD
+   bl  handle_hmi_exception
+   b   ret_from_except
 
 EXC_REAL_OOL_MASKABLE_HV(h_doorbell, 0xe80, 0x20, IRQS_DISABLED)
 EXC_VIRT_OOL_MASKABLE_HV(h_doorbell, 0x4e80, 0x20, 0xe80, IRQS_DISABLED)
@@ -1467,9 +1471,10 @@ EXC_COMMON_BEGIN(soft_nmi_common)
mr  r10,r1
ld  r1,PACAEMERGSP(r13)
subir1,r1,INT_FRAME_SIZE
-   EXCEPTION_CO

[PATCH 12/28] powerpc/64s/exception: Move EXCEPTION_COMMON additions into callers

2019-06-11 Thread Nicholas Piggin
More cases of code insertion via macros that do not add a great
deal. All the additions have to be specified in the macro arguments,
so they can just as well go after the macro.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/exception-64s.h | 42 +++---
 arch/powerpc/include/asm/head-64.h   |  4 +--
 arch/powerpc/kernel/exceptions-64s.S | 45 +---
 3 files changed, 39 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index cc65e87cff2f..24fc0104c9d3 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -635,21 +635,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
EXCEPTION_PROLOG_1 EXC_HV, PACA_EXGEN, 1, vec, bitmask ;\
EXCEPTION_PROLOG_2_VIRT label, EXC_HV
 
-/*
- * Our exception common code can be passed various "additions"
- * to specify the behaviour of interrupts, whether to kick the
- * runlatch, etc...
- */
-
-/*
- * This addition reconciles our actual IRQ state with the various software
- * flags that track it. This may call C code.
- */
-#define ADD_RECONCILE  RECONCILE_IRQ_STATE(r10,r11)
-
-#define ADD_NVGPRS \
-   bl  save_nvgprs
-
 #define RUNLATCH_ON\
 BEGIN_FTR_SECTION  \
ld  r3, PACA_THREAD_INFO(r13);  \
@@ -658,25 +643,22 @@ BEGIN_FTR_SECTION \
beqlppc64_runlatch_on_trampoline;   \
 END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 
-#define EXCEPTION_COMMON(area, trap, label, additions) \
+#define EXCEPTION_COMMON(area, trap)   \
EXCEPTION_PROLOG_COMMON(trap, area);\
-   /* Volatile regs are potentially clobbered here */  \
-   additions
 
 /*
- * Exception where stack is already set in r1, r1 is saved in r10, and it
- * continues rather than returns.
+ * Exception where stack is already set in r1, r1 is saved in r10
  */
-#define EXCEPTION_COMMON_NORET_STACK(area, trap, label, additions) \
+#define EXCEPTION_COMMON_STACK(area, trap) \
EXCEPTION_PROLOG_COMMON_1();\
kuap_save_amr_and_lock r9, r10, cr1;\
EXCEPTION_PROLOG_COMMON_2(area);\
-   EXCEPTION_PROLOG_COMMON_3(trap);\
-   /* Volatile regs are potentially clobbered here */  \
-   additions
+   EXCEPTION_PROLOG_COMMON_3(trap)
 
-#define STD_EXCEPTION_COMMON(trap, label, hdlr)\
-   EXCEPTION_COMMON(PACA_EXGEN, trap, label, ADD_NVGPRS;ADD_RECONCILE); \
+#define STD_EXCEPTION_COMMON(trap, hdlr)   \
+   EXCEPTION_COMMON(PACA_EXGEN, trap); \
+   bl  save_nvgprs;\
+   RECONCILE_IRQ_STATE(r10, r11);  \
addir3,r1,STACK_FRAME_OVERHEAD; \
bl  hdlr;   \
b   ret_from_except
@@ -686,9 +668,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
  * in the idle task and therefore need the special idle handling
  * (finish nap and runlatch)
  */
-#define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr)  \
-   EXCEPTION_COMMON(PACA_EXGEN, trap, label,   \
-   FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON);  \
+#define STD_EXCEPTION_COMMON_ASYNC(trap, hdlr) \
+   EXCEPTION_COMMON(PACA_EXGEN, trap); \
+   FINISH_NAP; \
+   RECONCILE_IRQ_STATE(r10, r11);  \
+   RUNLATCH_ON;\
addir3,r1,STACK_FRAME_OVERHEAD; \
bl  hdlr;   \
b   ret_from_except_lite
diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index bdd67a26e959..acd94fcf9f40 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -403,11 +403,11 @@ end_##sname:
 
 #define EXC_COMMON(name, realvec, hdlr)					\
EXC_COMMON_BEGIN(name); \
-   STD_EXCEPTION_COMMON(realvec, name, hdlr)
+   STD_EXCEPTION_COMMON(realvec, hdlr)
 
 #define EXC_COMMON_ASYNC(name, realvec, hdlr)  \
EXC_COMMON_BEGIN(name); \
-   STD_EXCEPTION_COMMON_ASYNC(realvec, name, hdlr)
+   STD_EXCEPTION_COMMON_ASYNC(realvec, hdlr)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 63b161c23e9e..935019529f16 100644
--- a/arch/powerpc/ke

[PATCH 10/28] powerpc/64s/exception: Make EXCEPTION_PROLOG_0 a gas macro for consistency with others

2019-06-11 Thread Nicholas Piggin
No generated code change.

Signed-off-by: Nicholas Piggin 
---
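For readers new to gas macros: \area\() uses the \() separator to mark
the end of the argument name, so the text that follows is not taken as
part of it. A tiny self-contained sketch (names hypothetical):

.macro SAVE_R9 area
	std	r9,\area\()+8(r13)	/* -> std r9,MY_AREA+8(r13) */
.endm

	.set	MY_AREA, 0x80		/* assumed save-area offset */
	SAVE_R9 MY_AREA
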
 arch/powerpc/include/asm/exception-64s.h | 25 
 arch/powerpc/kernel/exceptions-64s.S | 24 +++
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 1d8fc085e845..f19c2391cc36 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -233,7 +233,7 @@
  */
 #define EXCEPTION_RELON_PROLOG(area, label, hsrr, kvm, vec)\
SET_SCRATCH0(r13);  /* save r13 */  \
-   EXCEPTION_PROLOG_0(area);   \
+   EXCEPTION_PROLOG_0 area ;   \
EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
EXCEPTION_PROLOG_2_VIRT label, hsrr
 
@@ -297,13 +297,14 @@ BEGIN_FTR_SECTION_NESTED(943)				\
std ra,offset(r13); \
 END_FTR_SECTION_NESTED(ftr,ftr,943)
 
-#define EXCEPTION_PROLOG_0(area)   \
-   GET_PACA(r13);  \
-   std r9,area+EX_R9(r13); /* save r9 */   \
-   OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR); \
-   HMT_MEDIUM; \
-   std r10,area+EX_R10(r13);   /* save r10 - r12 */\
+.macro EXCEPTION_PROLOG_0 area
+   GET_PACA(r13)
+   std r9,\area\()+EX_R9(r13)  /* save r9 */
+   OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR)
+   HMT_MEDIUM
+   std r10,\area\()+EX_R10(r13)/* save r10 - r12 */
OPT_GET_SPR(r10, SPRN_CFAR, CPU_FTR_CFAR)
+.endm
 
 .macro EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, bitmask
OPT_SAVE_REG_TO_PACA(\area\()+EX_PPR, r9, CPU_FTR_HAS_PPR)
@@ -347,7 +348,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #define EXCEPTION_PROLOG(area, label, hsrr, kvm, vec)  \
SET_SCRATCH0(r13);  /* save r13 */  \
-   EXCEPTION_PROLOG_0(area);   \
+   EXCEPTION_PROLOG_0 area ;   \
EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
EXCEPTION_PROLOG_2_REAL label, hsrr, 1
 
@@ -416,7 +417,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 /* Do not enable RI */
 #define EXCEPTION_PROLOG_NORI(area, label, hsrr, kvm, vec) \
-   EXCEPTION_PROLOG_0(area);   \
+   EXCEPTION_PROLOG_0 area ;   \
EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
EXCEPTION_PROLOG_2_REAL label, hsrr, 0
 
@@ -565,7 +566,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 /* Version of above for when we have to branch out-of-line */
 #define __OOL_EXCEPTION(vec, label, hdlr)  \
SET_SCRATCH0(r13);  \
-   EXCEPTION_PROLOG_0(PACA_EXGEN); \
+   EXCEPTION_PROLOG_0 PACA_EXGEN ; \
b hdlr
 
 #define STD_EXCEPTION_OOL(vec, label)  \
@@ -596,7 +597,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #define __MASKABLE_EXCEPTION(vec, label, hsrr, kvm, bitmask)   \
SET_SCRATCH0(r13);/* save r13 */\
-   EXCEPTION_PROLOG_0(PACA_EXGEN); \
+   EXCEPTION_PROLOG_0 PACA_EXGEN ; \
EXCEPTION_PROLOG_1 hsrr, PACA_EXGEN, kvm, vec, bitmask ;\
EXCEPTION_PROLOG_2_REAL label, hsrr, 1
 
@@ -616,7 +617,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #define __MASKABLE_RELON_EXCEPTION(vec, label, hsrr, kvm, bitmask) \
SET_SCRATCH0(r13);/* save r13 */\
-   EXCEPTION_PROLOG_0(PACA_EXGEN); \
+   EXCEPTION_PROLOG_0 PACA_EXGEN ; \
EXCEPTION_PROLOG_1 hsrr, PACA_EXGEN, kvm, vec, bitmask ;\
EXCEPTION_PROLOG_2_VIRT label, hsrr
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 8680cd7da550..88f892167a64 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -109,7 +109,7 @@ EXC_VIRT_NONE(0x4000, 0x100)
 
 EXC_REAL_BEGIN(system_reset, 0x100, 0x100)
SET_SCRATCH0(r13)
-   EXCEPTION_PROLOG_0(PACA_EXNMI)
+   EXCEPTION_PROLOG_0 PACA_EXNMI
 
/* This is EXCEPTION_PROLOG_1 with the idle feature section added */
OPT_SAVE_REG_TO_PACA(PACA_EXNMI+EX_PPR, r9, CPU_FTR_HAS_PPR)
@@ -266,7 +266,7 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
 * vector

[PATCH 09/28] powerpc/64s/exception: KVM handler can set the HSRR trap bit

2019-06-11 Thread Nicholas Piggin
Move the KVM trap HSRR bit into the KVM handler, where it can be
conditionally applied when the hsrr parameter is set.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
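On the 0x2 convention: HV interrupts are delivered through HSRR0/1, and
the kernel records that in the low bits of the trap number. A minimal
sketch of the conditional encoding now done inside the macro (simplified
to li; the real code ors into r12 after a sldi):

.macro SET_TRAP n, hsrr
	.if \hsrr
	li	r12,(\n + 0x2)	/* HSRR variant: trap | 0x2 */
	.else
	li	r12,(\n)
	.endif
.endm

	SET_TRAP 0xe60, 1	/* expands to li r12,0xe62 */
	SET_TRAP 0x900, 0	/* expands to li r12,0x900 */
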
 arch/powerpc/include/asm/exception-64s.h | 5 +
 arch/powerpc/include/asm/head-64.h   | 7 ++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 737c37d1df4b..1d8fc085e845 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -449,7 +449,12 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
ld  r10,\area+EX_R10(r13)
std r12,HSTATE_SCRATCH0(r13)
sldir12,r9,32
+   /* HSRR variants have the 0x2 bit added to their trap number */
+   .if \hsrr
+   ori r12,r12,(\n + 0x2)
+   .else
ori r12,r12,(\n)
+   .endif
/* This reloads r9 before branching to kvmppc_interrupt */
__BRANCH_TO_KVM_EXIT(\area, kvmppc_interrupt)
 
diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index 518d9758b41e..bdd67a26e959 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -393,16 +393,13 @@ end_##sname:
TRAMP_KVM_BEGIN(do_kvm_##n);\
KVM_HANDLER area, EXC_STD, n, 1
 
-/*
- * HV variant exceptions get the 0x2 bit added to their trap number.
- */
 #define TRAMP_KVM_HV(area, n)  \
TRAMP_KVM_BEGIN(do_kvm_H##n);   \
-   KVM_HANDLER area, EXC_HV, n + 0x2, 0
+   KVM_HANDLER area, EXC_HV, n, 0
 
 #define TRAMP_KVM_HV_SKIP(area, n) \
TRAMP_KVM_BEGIN(do_kvm_H##n);   \
-   KVM_HANDLER area, EXC_HV, n + 0x2, 1
+   KVM_HANDLER area, EXC_HV, n, 1
 
 #define EXC_COMMON(name, realvec, hdlr)					\
EXC_COMMON_BEGIN(name); \
-- 
2.20.1



[PATCH 08/28] powerpc/64s/exception: merge KVM handler and skip variants

2019-06-11 Thread Nicholas Piggin
Conditionally expand the skip case if it is specified.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
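The merge works because .if \skip can guard two separate regions of one
macro body: the early test and the 89: restore path. A self-contained
sketch of the shape (placeholder instructions only):

.macro HANDLER skip
	.if \skip
	cmpwi	r10,1		/* placeholder for the skip test */
	beq	89f
	.endif
	nop			/* common handler body */
	.if \skip
89:	blr			/* skip path: restore and return */
	.endif
.endm

	HANDLER 0		/* body only */
	HANDLER 1		/* test, body, and 89: path */
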
 arch/powerpc/include/asm/exception-64s.h | 28 +---
 arch/powerpc/include/asm/head-64.h   |  8 +++
 arch/powerpc/kernel/exceptions-64s.S |  2 +-
 3 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 74ddcb37156c..737c37d1df4b 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -431,26 +431,17 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
.endif
 .endm
 
-.macro KVM_HANDLER area, hsrr, n
+.macro KVM_HANDLER area, hsrr, n, skip
+   .if \skip
+   cmpwi   r10,KVM_GUEST_MODE_SKIP
+   beq 89f
+   .else
BEGIN_FTR_SECTION_NESTED(947)
ld  r10,\area+EX_CFAR(r13)
std r10,HSTATE_CFAR(r13)
END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947)
-   BEGIN_FTR_SECTION_NESTED(948)
-   ld  r10,\area+EX_PPR(r13)
-   std r10,HSTATE_PPR(r13)
-   END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948)
-   ld  r10,\area+EX_R10(r13)
-   std r12,HSTATE_SCRATCH0(r13)
-   sldir12,r9,32
-   ori r12,r12,(\n)
-   /* This reloads r9 before branching to kvmppc_interrupt */
-   __BRANCH_TO_KVM_EXIT(\area, kvmppc_interrupt)
-.endm
+   .endif
 
-.macro KVM_HANDLER_SKIP area, hsrr, n
-   cmpwi   r10,KVM_GUEST_MODE_SKIP
-   beq 89f
BEGIN_FTR_SECTION_NESTED(948)
ld  r10,\area+EX_PPR(r13)
std r10,HSTATE_PPR(r13)
@@ -461,6 +452,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
ori r12,r12,(\n)
/* This reloads r9 before branching to kvmppc_interrupt */
__BRANCH_TO_KVM_EXIT(\area, kvmppc_interrupt)
+
+   .if \skip
 89:mtocrf  0x80,r9
ld  r9,\area+EX_R9(r13)
ld  r10,\area+EX_R10(r13)
@@ -469,14 +462,13 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
.else
b   kvmppc_skip_interrupt
.endif
+   .endif
 .endm
 
 #else
 .macro KVMTEST hsrr, n
 .endm
-.macro KVM_HANDLER area, hsrr, n
-.endm
-.macro KVM_HANDLER_SKIP area, hsrr, n
+.macro KVM_HANDLER area, hsrr, n, skip
 .endm
 #endif
 
diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h
index 4767d6c7b8fa..518d9758b41e 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -387,22 +387,22 @@ end_##sname:
 
 #define TRAMP_KVM(area, n) \
TRAMP_KVM_BEGIN(do_kvm_##n);\
-   KVM_HANDLER area, EXC_STD, n
+   KVM_HANDLER area, EXC_STD, n, 0
 
 #define TRAMP_KVM_SKIP(area, n)						\
TRAMP_KVM_BEGIN(do_kvm_##n);\
-   KVM_HANDLER_SKIP area, EXC_STD, n
+   KVM_HANDLER area, EXC_STD, n, 1
 
 /*
  * HV variant exceptions get the 0x2 bit added to their trap number.
  */
 #define TRAMP_KVM_HV(area, n)  \
TRAMP_KVM_BEGIN(do_kvm_H##n);   \
-   KVM_HANDLER area, EXC_HV, n + 0x2
+   KVM_HANDLER area, EXC_HV, n + 0x2, 0
 
 #define TRAMP_KVM_HV_SKIP(area, n) \
TRAMP_KVM_BEGIN(do_kvm_H##n);   \
-   KVM_HANDLER_SKIP area, EXC_HV, n + 0x2
+   KVM_HANDLER area, EXC_HV, n + 0x2, 1
 
 #define EXC_COMMON(name, realvec, hdlr)					\
EXC_COMMON_BEGIN(name); \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 91350b3dedde..8680cd7da550 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1063,7 +1063,7 @@ TRAMP_KVM_BEGIN(do_kvm_0xc00)
SET_SCRATCH0(r10)
std r9,PACA_EXGEN+EX_R9(r13)
mfcrr9
-   KVM_HANDLER PACA_EXGEN, EXC_STD, 0xc00
+   KVM_HANDLER PACA_EXGEN, EXC_STD, 0xc00, 0
 #endif
 
 
-- 
2.20.1



[PATCH 07/28] powerpc/64s/exception: consolidate maskable and non-maskable prologs

2019-06-11 Thread Nicholas Piggin
Conditionally expand the soft-masking test if a mask is passed in.

No generated code change.

Signed-off-by: Nicholas Piggin 
---
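A side benefit of the .elseif chain is that an unexpected vector is
caught at assembly time by .abort rather than at runtime. A sketch of
the pattern with made-up bit values (not the kernel's PACA_IRQ_*
constants):

.macro IRQ_BIT vec
	.if \vec == 0x500
	li	r10,1		/* stands in for PACA_IRQ_EE */
	.elseif \vec == 0x900
	li	r10,8		/* stands in for PACA_IRQ_DEC */
	.else
	.abort			/* assembly stops here for bad vectors */
	.endif
.endm

	IRQ_BIT 0x500
	IRQ_BIT 0x900
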
 arch/powerpc/include/asm/exception-64s.h | 113 +--
 arch/powerpc/kernel/exceptions-64s.S |  20 ++--
 2 files changed, 55 insertions(+), 78 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index e1b449e2c9ea..74ddcb37156c 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -234,7 +234,7 @@
 #define EXCEPTION_RELON_PROLOG(area, label, hsrr, kvm, vec)\
SET_SCRATCH0(r13);  /* save r13 */  \
EXCEPTION_PROLOG_0(area);   \
-   EXCEPTION_PROLOG_1 hsrr, area, kvm, vec ;   \
+   EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, 0 ;\
EXCEPTION_PROLOG_2_VIRT label, hsrr
 
 /* Exception register prefixes */
@@ -305,73 +305,50 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
std r10,area+EX_R10(r13);   /* save r10 - r12 */\
OPT_GET_SPR(r10, SPRN_CFAR, CPU_FTR_CFAR)
 
-#define __EXCEPTION_PROLOG_1_PRE(area) \
-   OPT_SAVE_REG_TO_PACA(area+EX_PPR, r9, CPU_FTR_HAS_PPR); \
-   OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);  \
-   INTERRUPT_TO_KERNEL;\
-   SAVE_CTR(r10, area);\
+.macro EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, bitmask
+   OPT_SAVE_REG_TO_PACA(\area\()+EX_PPR, r9, CPU_FTR_HAS_PPR)
+   OPT_SAVE_REG_TO_PACA(\area\()+EX_CFAR, r10, CPU_FTR_CFAR)
+   INTERRUPT_TO_KERNEL
+   SAVE_CTR(r10, \area\())
mfcrr9
-
-#define __EXCEPTION_PROLOG_1_POST(area)					\
-   std r11,area+EX_R11(r13);   \
-   std r12,area+EX_R12(r13);   \
-   GET_SCRATCH0(r10);  \
-   std r10,area+EX_R13(r13)
-
-/*
- * This version of the EXCEPTION_PROLOG_1 will carry
- * addition parameter called "bitmask" to support
- * checking of the interrupt maskable level.
- * Intended to be used in MASKABLE_EXCPETION_* macros.
- */
-.macro MASKABLE_EXCEPTION_PROLOG_1 hsrr, area, kvm, vec, bitmask
-   __EXCEPTION_PROLOG_1_PRE(\area\())
.if \kvm
KVMTEST \hsrr \vec
.endif
-
-   lbz r10,PACAIRQSOFTMASK(r13)
-   andi.   r10,r10,\bitmask
-   /* This associates vector numbers with bits in paca->irq_happened */
-   .if \vec == 0x500 || \vec == 0xea0
-   li  r10,PACA_IRQ_EE
-   .elseif \vec == 0x900 || \vec == 0xea0
-   li  r10,PACA_IRQ_DEC
-   .elseif \vec == 0xa00 || \vec == 0xe80
-   li  r10,PACA_IRQ_DBELL
-   .elseif \vec == 0xe60
-   li  r10,PACA_IRQ_HMI
-   .elseif \vec == 0xf00
-   li  r10,PACA_IRQ_PMI
-   .else
-   .abort "Bad maskable vector"
+   .if \bitmask
+   lbz r10,PACAIRQSOFTMASK(r13)
+   andi.   r10,r10,\bitmask
+   /* Associate vector numbers with bits in paca->irq_happened */
+   .if \vec == 0x500 || \vec == 0xea0
+   li  r10,PACA_IRQ_EE
+   .elseif \vec == 0x900 || \vec == 0xea0
+   li  r10,PACA_IRQ_DEC
+   .elseif \vec == 0xa00 || \vec == 0xe80
+   li  r10,PACA_IRQ_DBELL
+   .elseif \vec == 0xe60
+   li  r10,PACA_IRQ_HMI
+   .elseif \vec == 0xf00
+   li  r10,PACA_IRQ_PMI
+   .else
+   .abort "Bad maskable vector"
+   .endif
+
+   .if \hsrr
+   bne masked_Hinterrupt
+   .else
+   bne masked_interrupt
+   .endif
.endif
 
-   .if \hsrr
-   bne masked_Hinterrupt
-   .else
-   bne masked_interrupt
-   .endif
-
-   __EXCEPTION_PROLOG_1_POST(\area\())
-.endm
-
-/*
- * This version of the EXCEPTION_PROLOG_1 is intended
- * to be used in STD_EXCEPTION* macros
- */
-.macro EXCEPTION_PROLOG_1 hsrr, area, kvm, vec
-   __EXCEPTION_PROLOG_1_PRE(\area\())
-   .if \kvm
-   KVMTEST \hsrr \vec
-   .endif
-   __EXCEPTION_PROLOG_1_POST(\area\())
+   std r11,\area\()+EX_R11(r13)
+   std r12,\area\()+EX_R12(r13)
+   GET_SCRATCH0(r10)
+   std r10,\area\()+EX_R13(r13)
 .endm
 
 #define EXCEPTION_PROLOG(area, label, hsrr, kvm, vec)  \
SET_SCRATCH0(r13);  /* save r13 */  \
EXCEPTION_PROLOG_0(area);   \
-   EXCEPTION_PROLOG_1 hsrr, area, kvm, vec ;   \
+   EXCEPTION_PROLOG_1 hsrr, a
