Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

2023-04-24 Thread Sathyanarayanan Kuppuswamy
Hi,

On 4/24/23 10:55 PM, Kai-Heng Feng wrote:
> On Tue, Apr 25, 2023 at 7:47 AM Sathyanarayanan Kuppuswamy
>  wrote:
>>
>>
>>
>> On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
>>> A PCIe service that shares an IRQ with PME may cause spurious wakeups on
>>> system suspend.
>>>
>>> PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
>>> that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
>>> (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
>>> much by disabling AER during system suspend.
>>>
>>> This is very similar to previous attempts to suspend AER and DPC [1],
>>> but with a different reason.
>>>
>>> [1] 
>>> https://lore.kernel.org/linux-pci/20220408153159.106741-1-kai.heng.f...@canonical.com/
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
>>>
>>> Reviewed-by: Mika Westerberg 
>>> Signed-off-by: Kai-Heng Feng 
>>> ---
>>
>> IIUC, you encounter AER errors during the suspend/resume process, which
>> results in an AER IRQ. Because AER and PME share an IRQ, it is regarded as a
>> spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
>> right?
> 
> Yes. That's exactly what happened.
> 
>>
>> It looks like it is harmless to disable AER during the suspend/resume
>> path. But I am wondering why we get these errors. Did you check which errors
>> you get during the suspend/resume path? Are these errors valid?
> 
> I really don't know. I think it's similar to the reasoning in commit
> b07461a8e45b ("PCI/AER: Clear error status registers during
> enumeration and restore"): "AER errors might be recorded when
> powering-on devices. These errors can be ignored, ...".
> In this case, it happens when powering off the device (D3cold) by
> turning off its power resources.

Got it.

Reviewed-by: Kuppuswamy Sathyanarayanan 


> 
> Kai-Heng
> 
>>
>>
>>>  drivers/pci/pcie/aer.c | 22 ++
>>>  1 file changed, 22 insertions(+)
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index 1420e1f27105..9c07fdbeb52d 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
>>>   return 0;
>>>  }
>>>
>>> +static int aer_suspend(struct pcie_device *dev)
>>> +{
>>> + struct aer_rpc *rpc = get_service_data(dev);
>>> + struct pci_dev *pdev = rpc->rpd;
>>> +
>>> + aer_disable_irq(pdev);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int aer_resume(struct pcie_device *dev)
>>> +{
>>> + struct aer_rpc *rpc = get_service_data(dev);
>>> + struct pci_dev *pdev = rpc->rpd;
>>> +
>>> + aer_enable_irq(pdev);
>>> +
>>> + return 0;
>>> +}
>>> +
>>>  /**
>>>   * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
>>>   * @dev: pointer to Root Port, RCEC, or RCiEP
>>> @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
>>>   .service= PCIE_PORT_SERVICE_AER,
>>>
>>>   .probe  = aer_probe,
>>> + .suspend= aer_suspend,
>>> + .resume = aer_resume,
>>>   .remove = aer_remove,
>>>  };
>>>
>>
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

2023-04-24 Thread Kai-Heng Feng
On Tue, Apr 25, 2023 at 7:47 AM Sathyanarayanan Kuppuswamy
 wrote:
>
>
>
> On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
> > A PCIe service that shares an IRQ with PME may cause spurious wakeups on
> > system suspend.
> >
> > PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> > that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> > (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> > much by disabling AER during system suspend.
> >
> > This is very similar to previous attempts to suspend AER and DPC [1],
> > but with a different reason.
> >
> > [1] 
> > https://lore.kernel.org/linux-pci/20220408153159.106741-1-kai.heng.f...@canonical.com/
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
> >
> > Reviewed-by: Mika Westerberg 
> > Signed-off-by: Kai-Heng Feng 
> > ---
>
> IIUC, you encounter AER errors during the suspend/resume process, which
> results in an AER IRQ. Because AER and PME share an IRQ, it is regarded as a
> spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
> right?

Yes. That's exactly what happened.

>
> It looks like it is harmless to disable AER during the suspend/resume
> path. But I am wondering why we get these errors. Did you check which errors
> you get during the suspend/resume path? Are these errors valid?

I really don't know. I think it's similar to the reasoning in commit
b07461a8e45b ("PCI/AER: Clear error status registers during
enumeration and restore"): "AER errors might be recorded when
powering-on devices. These errors can be ignored, ...".
In this case, it happens when powering off the device (D3cold) by
turning off its power resources.

Kai-Heng

>
>
> >  drivers/pci/pcie/aer.c | 22 ++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 1420e1f27105..9c07fdbeb52d 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
> >   return 0;
> >  }
> >
> > +static int aer_suspend(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_disable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> > +static int aer_resume(struct pcie_device *dev)
> > +{
> > + struct aer_rpc *rpc = get_service_data(dev);
> > + struct pci_dev *pdev = rpc->rpd;
> > +
> > + aer_enable_irq(pdev);
> > +
> > + return 0;
> > +}
> > +
> >  /**
> >   * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> >   * @dev: pointer to Root Port, RCEC, or RCiEP
> > @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
> >   .service= PCIE_PORT_SERVICE_AER,
> >
> >   .probe  = aer_probe,
> > + .suspend= aer_suspend,
> > + .resume = aer_resume,
> >   .remove = aer_remove,
> >  };
> >
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer


Re: [PATCH 5/5] Update Kconfig and Makefile.

2023-04-24 Thread Herbert Xu
On Mon, Apr 24, 2023 at 02:47:26PM -0400, Danny Tsen wrote:
>
> +config CRYPTO_CHACHA20_P10
> + tristate "Ciphers: ChaCha20, XChaCha20, XChaCha12 (P10 or later)"
> + depends on PPC64 && CPU_LITTLE_ENDIAN
> + select CRYPTO_SKCIPHER

I thought your IS_REACHABLE test was so that you could build this
without the Crypto API? Colour me confused.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 4/5] Glue code for optimized Poly1305 implementation for ppc64le.

2023-04-24 Thread Herbert Xu
On Mon, Apr 24, 2023 at 02:47:25PM -0400, Danny Tsen wrote:
>
> + if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
> + bytes = round_down(srclen, POLY1305_BLOCK_SIZE);
> + used = crypto_poly1305_setdctxkey(dctx, src, bytes);
> + if (likely(used)) {
> + srclen -= used;
> + src += used;
> + }
> + if (srclen >= POLY1305_BLOCK_SIZE*4) {
> + vsx_begin();

Your chacha code has a SIMD fallback; how come this one doesn't?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 2/5] Glue code for optimized Chacha20 implementation for ppc64le.

2023-04-24 Thread Herbert Xu
On Mon, Apr 24, 2023 at 02:47:23PM -0400, Danny Tsen wrote:
>
> +static int chacha_p10_stream_xor(struct skcipher_request *req,
> +  const struct chacha_ctx *ctx, const u8 *iv)
> +{
> + struct skcipher_walk walk;
> + u32 state[16];
> + int err;
> +
> + err = skcipher_walk_virt(&walk, req, false);
> + if (err)
> + return err;
> +
> + chacha_init_generic(state, ctx->key, iv);
> +
> + while (walk.nbytes > 0) {
> + unsigned int nbytes = walk.nbytes;
> +
> + if (nbytes < walk.total)
> + nbytes = rounddown(nbytes, walk.stride);
> +
> + if (!static_branch_likely(&have_p10) ||

You don't need the static branch in the Crypto API code since
the registration is already conditional.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 2/5] Glue code for optimized Chacha20 implementation for ppc64le.

2023-04-24 Thread Herbert Xu
On Tue, Apr 25, 2023 at 01:37:22PM +0800, Herbert Xu wrote:
> On Mon, Apr 24, 2023 at 02:47:23PM -0400, Danny Tsen wrote:
> >
> > +static int __init chacha_p10_init(void)
> > +{
> > +   static_branch_enable(&have_p10);
> > +
> > +   return IS_REACHABLE(CONFIG_CRYPTO_SKCIPHER) ?
> > +   crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0;
> 
> What is this for? The usual way is to select CRYPTO_SKCIPHER
> rather than have a mysterious failure at run-time.

Nevermind, I see that you also have non-Crypto API code in there.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 2/5] Glue code for optimized Chacha20 implementation for ppc64le.

2023-04-24 Thread Herbert Xu
On Mon, Apr 24, 2023 at 02:47:23PM -0400, Danny Tsen wrote:
>
> +static int __init chacha_p10_init(void)
> +{
> + static_branch_enable(&have_p10);
> +
> + return IS_REACHABLE(CONFIG_CRYPTO_SKCIPHER) ?
> + crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0;

What is this for? The usual way is to select CRYPTO_SKCIPHER
rather than have a mysterious failure at run-time.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v4 2/3] PCI/AER: Disable AER interrupt on suspend

2023-04-24 Thread Sathyanarayanan Kuppuswamy



On 4/23/23 10:52 PM, Kai-Heng Feng wrote:
> A PCIe service that shares an IRQ with PME may cause spurious wakeups on
> system suspend.
> 
> PCIe Base Spec 5.0, section 5.2 "Link State Power Management" states
> that TLP and DLLP transmission is disabled for a Link in L2/L3 Ready
> (D3hot), L2 (D3cold with aux power) and L3 (D3cold), so we don't lose
> much by disabling AER during system suspend.
> 
> This is very similar to previous attempts to suspend AER and DPC [1],
> but with a different reason.
> 
> [1] 
> https://lore.kernel.org/linux-pci/20220408153159.106741-1-kai.heng.f...@canonical.com/
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295
> 
> Reviewed-by: Mika Westerberg 
> Signed-off-by: Kai-Heng Feng 
> ---

IIUC, you encounter AER errors during the suspend/resume process, which
results in an AER IRQ. Because AER and PME share an IRQ, it is regarded as a
spurious wake-up IRQ. So to fix it, you want to disable AER reporting,
right?

It looks like it is harmless to disable AER during the suspend/resume
path. But I am wondering why we get these errors. Did you check which errors
you get during the suspend/resume path? Are these errors valid?


>  drivers/pci/pcie/aer.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 1420e1f27105..9c07fdbeb52d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1356,6 +1356,26 @@ static int aer_probe(struct pcie_device *dev)
>   return 0;
>  }
>  
> +static int aer_suspend(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> + struct pci_dev *pdev = rpc->rpd;
> +
> + aer_disable_irq(pdev);
> +
> + return 0;
> +}
> +
> +static int aer_resume(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> + struct pci_dev *pdev = rpc->rpd;
> +
> + aer_enable_irq(pdev);
> +
> + return 0;
> +}
> +
>  /**
>   * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
>   * @dev: pointer to Root Port, RCEC, or RCiEP
> @@ -1420,6 +1440,8 @@ static struct pcie_port_service_driver aerdriver = {
>   .service= PCIE_PORT_SERVICE_AER,
>  
>   .probe  = aer_probe,
> + .suspend= aer_suspend,
> + .resume = aer_resume,
>   .remove = aer_remove,
>  };
>  

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Re: [PATCH 1/5] An optimized Chacha20 implementation with 8-way unrolling for ppc64le.

2023-04-24 Thread Danny Tsen

This is the recommended template to use for the IBM copyright.

Thanks.

-Danny

On 4/24/23 3:40 PM, Elliott, Robert (Servers) wrote:

> +# Copyright 2023- IBM Inc. All rights reserved
>
> I don't think any such entity exists - you probably mean IBM Corporation.


Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Zhouyi Zhou
On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman  wrote:
>
> Zhouyi Zhou  writes:
> > Dear PowerPC and RCU developers:
> > During the RCU torture test on mainline (on the VM of Opensource Lab
> > of Oregon State University), SRCU-P failed with __stack_chk_fail:
> ...
> > by debugging, I see the r10 is assigned with r13 on c0226eb4,
> > but if there is a context-switch before c0226edc, a false
> > positive will be reported.
> >
> > [1] http://154.220.3.115/logs/0422/configformainline.txt
>
> Says:
>
> CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc-10 (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0"
>
> Do you see the same issue with a newer GCC?
I would like to try that with the newest GCC ;-) Please give me about a
day, because I am going to compile GCC myself ;-)
>
> There's 12.2.0 here:
>   https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/
>   
> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/
>
> Or if you can build in a Fedora 38 system or container, it has GCC 13.
OK, I will try that or something similar.

This has been a great learning experience for me, thank you all ;-)

Cheers
>
> cheers


Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Michael Ellerman
Zhouyi Zhou  writes:
> Dear PowerPC and RCU developers:
> During the RCU torture test on mainline (on the VM of Opensource Lab
> of Oregon State University), SRCU-P failed with __stack_chk_fail:
...
> by debugging, I see the r10 is assigned with r13 on c0226eb4,
> but if there is a context-switch before c0226edc, a false
> positive will be reported.
>
> [1] http://154.220.3.115/logs/0422/configformainline.txt

Says:

CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc-10 (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0"

Do you see the same issue with a newer GCC?

There's 12.2.0 here:
  https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/
  https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/

Or if you can build in a Fedora 38 system or container, it has GCC 13.

cheers


RE: [PATCH 1/5] An optimized Chacha20 implementation with 8-way unrolling for ppc64le.

2023-04-24 Thread Elliott, Robert (Servers)
> +# Copyright 2023- IBM Inc. All rights reserved

I don't think any such entity exists - you probably mean IBM Corporation.


Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Boqun Feng
On Mon, Apr 24, 2023 at 12:29:00PM -0500, Segher Boessenkool wrote:
> On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote:
> > On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> > > At what points can r13 change?  Only when some particular functions are
> > > called?
> > 
> > r13 is the local paca:
> > 
> > register struct paca_struct *local_paca asm("r13");
> > 
> > , which is a pointer to percpu data.
> 
> Yes, it is a global register variable.
> 
> > So if a task is scheduled from one CPU to another CPU, the value gets
> > changed.
> 
> But the compiler does not see that something else changes local_paca (or

It's more like this; note that in this case r13 itself is not changed:

	CPU 0                           CPU 1
	{r13 = 0x00}                    {r13 = 0x04}

	<thread 1 running>
	  ...
	  _switch():
	    <save state of thread 1>
	    <pick another thread>
	    ...
	                                _switch():
	                                  <restore state of thread 1>
	                                <thread 1 running>
	                                  ...


As you can see, thread 1 is scheduled from CPU 0 to CPU 1. Neither CPU
changes its r13, but from the point of view of thread 1, its r13 changes.

> r13 some other way, via assembler code perhaps)?  Or is there a compiler
> bug?
> 

This looks to me like a compiler bug, but I'm not 100% sure.

Regards,
Boqun


> If the latter is true:
> 
> Can you make a reproducer and open a GCC PR? See the GCC bug reporting
> instructions for how to get started doing that. We need *exact* code that shows the
> problem, together with a compiler command line.  So that we can
> reproduce the problem.  That is step 0 in figuring out what is going on,
> and then maybe fixing the problem :-)
> 
> 
> Segher


Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Joel Fernandes
This is amazing debugging, Boqun, like a boss! One comment below:

> > > Or something simple I haven't thought of? :)
> >
> > At what points can r13 change?  Only when some particular functions are
> > called?
> >
>
> r13 is the local paca:
>
> register struct paca_struct *local_paca asm("r13");
>
> , which is a pointer to percpu data.
>
> So if a task is scheduled from one CPU to another CPU, the value gets
> changed.

It appears the whole issue, per your analysis, is that the stack
checking code in gcc should not cache or alias r13, and must read its
most up-to-date value during stack checking, as its value may have
changed during a migration to a new CPU.

Did I get that right?

IMO, even without a reproducer, gcc on PPC should just not do that;
that feels terribly broken for the kernel. I wonder what clang does,
I'll go poke around with Compiler Explorer after lunch.

Adding +Peter Zijlstra as well to join the party as I have a feeling
he'll be interested. ;-)

thanks,

 - Joel


[PATCH 5/5] Update Kconfig and Makefile.

2023-04-24 Thread Danny Tsen
Defined CRYPTO_CHACHA20_P10 and CRYPTO_POLY1305_P10 in Kconfig to
support the optimized implementations for Power10 or later CPUs.

Added new module drivers chacha-p10-crypto and poly1305-p10-crypto.

Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/Kconfig  | 26 ++
 arch/powerpc/crypto/Makefile |  4 
 2 files changed, 30 insertions(+)

diff --git a/arch/powerpc/crypto/Kconfig b/arch/powerpc/crypto/Kconfig
index 7113f9355165..f74d9dd6574b 100644
--- a/arch/powerpc/crypto/Kconfig
+++ b/arch/powerpc/crypto/Kconfig
@@ -111,4 +111,30 @@ config CRYPTO_AES_GCM_P10
  Support for cryptographic acceleration instructions on Power10 or
  later CPU. This module supports stitched acceleration for AES/GCM.
 
+config CRYPTO_CHACHA20_P10
+   tristate "Ciphers: ChaCha20, XChaCha20, XChaCha12 (P10 or later)"
+   depends on PPC64 && CPU_LITTLE_ENDIAN
+   select CRYPTO_SKCIPHER
+   select CRYPTO_LIB_CHACHA_GENERIC
+   select CRYPTO_ARCH_HAVE_LIB_CHACHA
+   help
+ Length-preserving ciphers: ChaCha20, XChaCha20, and XChaCha12
+ stream cipher algorithms
+
+ Architecture: PowerPC64
+ - Power10 or later
+ - Little-endian
+
+config CRYPTO_POLY1305_P10
+   tristate "Hash functions: Poly1305 (P10 or later)"
+   depends on PPC64 && CPU_LITTLE_ENDIAN
+   select CRYPTO_HASH
+   select CRYPTO_LIB_POLY1305_GENERIC
+   help
+ Poly1305 authenticator algorithm (RFC7539)
+
+ Architecture: PowerPC64
+ - Power10 or later
+ - Little-endian
+
 endmenu
diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile
index 05c7486f42c5..cd5282eff451 100644
--- a/arch/powerpc/crypto/Makefile
+++ b/arch/powerpc/crypto/Makefile
@@ -14,6 +14,8 @@ obj-$(CONFIG_CRYPTO_CRC32C_VPMSUM) += crc32c-vpmsum.o
 obj-$(CONFIG_CRYPTO_CRCT10DIF_VPMSUM) += crct10dif-vpmsum.o
 obj-$(CONFIG_CRYPTO_VPMSUM_TESTER) += crc-vpmsum_test.o
 obj-$(CONFIG_CRYPTO_AES_GCM_P10) += aes-gcm-p10-crypto.o
+obj-$(CONFIG_CRYPTO_CHACHA20_P10) += chacha-p10-crypto.o
+obj-$(CONFIG_CRYPTO_POLY1305_P10) += poly1305-p10-crypto.o
 
aes-ppc-spe-y := aes-spe-core.o aes-spe-keys.o aes-tab-4k.o aes-spe-modes.o aes-spe-glue.o
 md5-ppc-y := md5-asm.o md5-glue.o
@@ -23,6 +25,8 @@ sha256-ppc-spe-y := sha256-spe-asm.o sha256-spe-glue.o
 crc32c-vpmsum-y := crc32c-vpmsum_asm.o crc32c-vpmsum_glue.o
 crct10dif-vpmsum-y := crct10dif-vpmsum_asm.o crct10dif-vpmsum_glue.o
aes-gcm-p10-crypto-y := aes-gcm-p10-glue.o aes-gcm-p10.o ghashp8-ppc.o aesp8-ppc.o
+chacha-p10-crypto-y := chacha-p10-glue.o chacha-p10le-8x.o
+poly1305-p10-crypto-y := poly1305-p10-glue.o poly1305-p10le_64.o
 
quiet_cmd_perl = PERL    $@
      cmd_perl = $(PERL) $< $(if $(CONFIG_CPU_LITTLE_ENDIAN), linux-ppc64le, linux-ppc64) > $@
-- 
2.31.1



[PATCH 4/5] Glue code for optimized Poly1305 implementation for ppc64le.

2023-04-24 Thread Danny Tsen
Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/poly1305-p10-glue.c | 186 
 1 file changed, 186 insertions(+)
 create mode 100644 arch/powerpc/crypto/poly1305-p10-glue.c

diff --git a/arch/powerpc/crypto/poly1305-p10-glue.c 
b/arch/powerpc/crypto/poly1305-p10-glue.c
new file mode 100644
index ..b1800f7b6af8
--- /dev/null
+++ b/arch/powerpc/crypto/poly1305-p10-glue.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Poly1305 authenticator algorithm, RFC7539.
+ *
+ * Copyright 2023- IBM Inc. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+asmlinkage void poly1305_p10le_4blocks(void *h, const u8 *m, u32 mlen);
+asmlinkage void poly1305_64s(void *h, const u8 *m, u32 mlen, int highbit);
+asmlinkage void poly1305_emit_64(void *h, void *s, u8 *dst);
+
+static void vsx_begin(void)
+{
+   preempt_disable();
+   enable_kernel_vsx();
+}
+
+static void vsx_end(void)
+{
+   disable_kernel_vsx();
+   preempt_enable();
+}
+
+static int crypto_poly1305_p10_init(struct shash_desc *desc)
+{
+   struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+
+   poly1305_core_init(&dctx->h);
+   dctx->buflen = 0;
+   dctx->rset = 0;
+   dctx->sset = false;
+
+   return 0;
+}
+
+static unsigned int crypto_poly1305_setdctxkey(struct poly1305_desc_ctx *dctx,
+  const u8 *inp, unsigned int len)
+{
+   unsigned int acc = 0;
+
+   if (unlikely(!dctx->sset)) {
+   if (!dctx->rset && len >= POLY1305_BLOCK_SIZE) {
+   struct poly1305_core_key *key = &dctx->core_r;
+
+   key->key.r64[0] = get_unaligned_le64(&inp[0]);
+   key->key.r64[1] = get_unaligned_le64(&inp[8]);
+   inp += POLY1305_BLOCK_SIZE;
+   len -= POLY1305_BLOCK_SIZE;
+   acc += POLY1305_BLOCK_SIZE;
+   dctx->rset = 1;
+   }
+   if (len >= POLY1305_BLOCK_SIZE) {
+   dctx->s[0] = get_unaligned_le32(&inp[0]);
+   dctx->s[1] = get_unaligned_le32(&inp[4]);
+   dctx->s[2] = get_unaligned_le32(&inp[8]);
+   dctx->s[3] = get_unaligned_le32(&inp[12]);
+   acc += POLY1305_BLOCK_SIZE;
+   dctx->sset = true;
+   }
+   }
+   return acc;
+}
+
+static int crypto_poly1305_p10_update(struct shash_desc *desc,
+ const u8 *src, unsigned int srclen)
+{
+   struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+   unsigned int bytes, used;
+
+   if (unlikely(dctx->buflen)) {
+   bytes = min(srclen, POLY1305_BLOCK_SIZE - dctx->buflen);
+   memcpy(dctx->buf + dctx->buflen, src, bytes);
+   src += bytes;
+   srclen -= bytes;
+   dctx->buflen += bytes;
+
+   if (dctx->buflen == POLY1305_BLOCK_SIZE) {
+   if (likely(!crypto_poly1305_setdctxkey(dctx, dctx->buf,
+  POLY1305_BLOCK_SIZE))) {
+   vsx_begin();
+   poly1305_64s(&dctx->h, dctx->buf,
+ POLY1305_BLOCK_SIZE, 1);
+   vsx_end();
+   }
+   dctx->buflen = 0;
+   }
+   }
+
+   if (likely(srclen >= POLY1305_BLOCK_SIZE)) {
+   bytes = round_down(srclen, POLY1305_BLOCK_SIZE);
+   used = crypto_poly1305_setdctxkey(dctx, src, bytes);
+   if (likely(used)) {
+   srclen -= used;
+   src += used;
+   }
+   if (srclen >= POLY1305_BLOCK_SIZE*4) {
+   vsx_begin();
+   poly1305_p10le_4blocks(&dctx->h, src, srclen);
+   vsx_end();
+   src += srclen - (srclen % (POLY1305_BLOCK_SIZE * 4));
+   srclen %= POLY1305_BLOCK_SIZE * 4;
+   }
+   while (srclen >= POLY1305_BLOCK_SIZE) {
+   vsx_begin();
+   poly1305_64s(&dctx->h, src, POLY1305_BLOCK_SIZE, 1);
+   vsx_end();
+   srclen -= POLY1305_BLOCK_SIZE;
+   src += POLY1305_BLOCK_SIZE;
+   }
+   }
+
+   if (unlikely(srclen)) {
+   dctx->buflen = srclen;
+   memcpy(dctx->buf, src, srclen);
+   }
+
+   return 0;
+}
+
+static int crypto_poly1305_p10_final(struct shash_desc *desc, u8 *dst)
+{
+   struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc);
+
+   if (unlikely(!dctx->sset))
+   return -ENOKEY;
+
+   if ((dctx->buflen)) {
+   

[PATCH 2/5] Glue code for optimized Chacha20 implementation for ppc64le.

2023-04-24 Thread Danny Tsen
Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/chacha-p10-glue.c | 223 ++
 1 file changed, 223 insertions(+)
 create mode 100644 arch/powerpc/crypto/chacha-p10-glue.c

diff --git a/arch/powerpc/crypto/chacha-p10-glue.c 
b/arch/powerpc/crypto/chacha-p10-glue.c
new file mode 100644
index ..cefb150e7b3c
--- /dev/null
+++ b/arch/powerpc/crypto/chacha-p10-glue.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * PowerPC P10 (ppc64le) accelerated ChaCha and XChaCha stream ciphers,
+ * including ChaCha20 (RFC7539)
+ *
+ * Copyright 2023- IBM Inc. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+asmlinkage void chacha_p10le_8x(u32 *state, u8 *dst, const u8 *src,
+   unsigned int len, int nrounds);
+
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_p10);
+
+static void vsx_begin(void)
+{
+   preempt_disable();
+   enable_kernel_vsx();
+}
+
+static void vsx_end(void)
+{
+   disable_kernel_vsx();
+   preempt_enable();
+}
+
+static void chacha_p10_do_8x(u32 *state, u8 *dst, const u8 *src,
+unsigned int bytes, int nrounds)
+{
+   unsigned int l = bytes & ~0x0FF;
+
+   if (l > 0) {
+   chacha_p10le_8x(state, dst, src, l, nrounds);
+   bytes -= l;
+   src += l;
+   dst += l;
+   state[12] += l / CHACHA_BLOCK_SIZE;
+   }
+
+   if (bytes > 0)
+   chacha_crypt_generic(state, dst, src, bytes, nrounds);
+}
+
+void hchacha_block_arch(const u32 *state, u32 *stream, int nrounds)
+{
+   hchacha_block_generic(state, stream, nrounds);
+}
+EXPORT_SYMBOL(hchacha_block_arch);
+
+void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv)
+{
+   chacha_init_generic(state, key, iv);
+}
+EXPORT_SYMBOL(chacha_init_arch);
+
+void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
+  int nrounds)
+{
+   if (!static_branch_likely(&have_p10) || bytes <= CHACHA_BLOCK_SIZE ||
+   !crypto_simd_usable())
+   return chacha_crypt_generic(state, dst, src, bytes, nrounds);
+
+   do {
+   unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+
+   vsx_begin();
+   chacha_p10_do_8x(state, dst, src, todo, nrounds);
+   vsx_end();
+
+   bytes -= todo;
+   src += todo;
+   dst += todo;
+   } while (bytes);
+}
+EXPORT_SYMBOL(chacha_crypt_arch);
+
+static int chacha_p10_stream_xor(struct skcipher_request *req,
+const struct chacha_ctx *ctx, const u8 *iv)
+{
+   struct skcipher_walk walk;
+   u32 state[16];
+   int err;
+
+   err = skcipher_walk_virt(&walk, req, false);
+   if (err)
+   return err;
+
+   chacha_init_generic(state, ctx->key, iv);
+
+   while (walk.nbytes > 0) {
+   unsigned int nbytes = walk.nbytes;
+
+   if (nbytes < walk.total)
+   nbytes = rounddown(nbytes, walk.stride);
+
+   if (!static_branch_likely(&have_p10) ||
+   !crypto_simd_usable()) {
+   chacha_crypt_generic(state, walk.dst.virt.addr,
+walk.src.virt.addr, nbytes,
+ctx->nrounds);
+   } else {
+   vsx_begin();
+   chacha_p10_do_8x(state, walk.dst.virt.addr,
+ walk.src.virt.addr, nbytes, ctx->nrounds);
+   vsx_end();
+   }
+   err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+   if (err)
+   break;
+   }
+
+   return err;
+}
+
+static int chacha_p10(struct skcipher_request *req)
+{
+   struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+   struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+   return chacha_p10_stream_xor(req, ctx, req->iv);
+}
+
+static int xchacha_p10(struct skcipher_request *req)
+{
+   struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+   struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm);
+   struct chacha_ctx subctx;
+   u32 state[16];
+   u8 real_iv[16];
+
+   chacha_init_generic(state, ctx->key, req->iv);
+   hchacha_block_arch(state, subctx.key, ctx->nrounds);
+   subctx.nrounds = ctx->nrounds;
+
+   memcpy(&real_iv[0], req->iv + 24, 8);
+   memcpy(&real_iv[8], req->iv + 16, 8);
+   return chacha_p10_stream_xor(req, &subctx, real_iv);
+}
+
+static struct skcipher_alg algs[] = {
+   {
+   .base.cra_name  = "chacha20",
+   .base.cra_driver_name   = "chacha20-p10",
+   .base.cra_priority  = 300,
+   .base.cra_blocksize = 1,
+   

[PATCH 3/5] An optimized Poly1305 implementation with 4-way unrolling for ppc64le.

2023-04-24 Thread Danny Tsen
Improve overall performance of Poly1305 for Power10 or later CPU.

Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/poly1305-p10le_64.S | 1075 +++
 1 file changed, 1075 insertions(+)
 create mode 100644 arch/powerpc/crypto/poly1305-p10le_64.S

diff --git a/arch/powerpc/crypto/poly1305-p10le_64.S 
b/arch/powerpc/crypto/poly1305-p10le_64.S
new file mode 100644
index ..22fd255af87e
--- /dev/null
+++ b/arch/powerpc/crypto/poly1305-p10le_64.S
@@ -0,0 +1,1075 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#
+# Accelerated poly1305 implementation for ppc64le.
+#
+# Copyright 2023- IBM Inc. All rights reserved
+#
+#===
+# Written by Danny Tsen 
+#
+# Poly1305 - this version mainly using vector/VSX/Scalar
+#  - 26 bits limbs
+#  - Handles multiple 64-byte blocks.
+#
+# Block size 16 bytes
+# key = (r, s)
+# clamp r &= 0x0FFFFFFC0FFFFFFC 0x0FFFFFFC0FFFFFFF
+# p = 2^130 - 5
+# a += m
+# a = (r + a) % p
+# a += s
+#
+# Improve performance by breaking the polynomial down into a sum of products:
+# h4 = m1 * r⁴ + m2 * r³ + m3 * r² + m4 * r
+#
+#  07/22/21 - this revision is based on the above sum of products.  Setup r^4,
+# r^3, r^2, r and s3, s2, s1, s0 to 9 vectors for multiplications.
+#
+# setup r^4, r^3, r^2, r vectors
+#vs[r^1, r^3, r^2, r^4]
+#vs0 = [r0,.]
+#vs1 = [r1,.]
+#vs2 = [r2,.]
+#vs3 = [r3,.]
+#vs4 = [r4,.]
+#vs5 = [r1*5,...]
+#vs6 = [r2*5,...]
+#vs7 = [r3*5,...]
+#vs8 = [r4*5,...]
+#
+#  Each word in a vector consists of a member of "r/s" in [a * r/s].
+#
+# r0, r4*5, r3*5, r2*5, r1*5;
+# r1, r0,   r4*5, r3*5, r2*5;
+# r2, r1,   r0,   r4*5, r3*5;
+# r3, r2,   r1,   r0,   r4*5;
+# r4, r3,   r2,   r1,   r0  ;
+#
+#
+# poly1305_p10le_4blocks( uint8_t *k, uint32_t mlen, uint8_t *m)
+#  k = 32 bytes key
+#  r3 = k (r, s)
+#  r4 = mlen
+#  r5 = m
+#
+#include 
+#include 
+#include 
+#include 
+
+.machine "any"
+
+.text
+
+.macro SAVE_GPR GPR OFFSET FRAME
+   std \GPR,\OFFSET(\FRAME)
+.endm
+
+.macro SAVE_VRS VRS OFFSET FRAME
+   li  16, \OFFSET
+   stvx\VRS, 16, \FRAME
+.endm
+
+.macro SAVE_VSX VSX OFFSET FRAME
+   li  16, \OFFSET
+   stxvx   \VSX, 16, \FRAME
+.endm
+
+.macro RESTORE_GPR GPR OFFSET FRAME
+   ld  \GPR,\OFFSET(\FRAME)
+.endm
+
+.macro RESTORE_VRS VRS OFFSET FRAME
+   li  16, \OFFSET
+   lvx \VRS, 16, \FRAME
+.endm
+
+.macro RESTORE_VSX VSX OFFSET FRAME
+   li  16, \OFFSET
+   lxvx\VSX, 16, \FRAME
+.endm
+
+.macro SAVE_REGS
+   mflr 0
+   std 0, 16(1)
+   stdu 1,-752(1)
+
+   SAVE_GPR 14, 112, 1
+   SAVE_GPR 15, 120, 1
+   SAVE_GPR 16, 128, 1
+   SAVE_GPR 17, 136, 1
+   SAVE_GPR 18, 144, 1
+   SAVE_GPR 19, 152, 1
+   SAVE_GPR 20, 160, 1
+   SAVE_GPR 21, 168, 1
+   SAVE_GPR 22, 176, 1
+   SAVE_GPR 23, 184, 1
+   SAVE_GPR 24, 192, 1
+   SAVE_GPR 25, 200, 1
+   SAVE_GPR 26, 208, 1
+   SAVE_GPR 27, 216, 1
+   SAVE_GPR 28, 224, 1
+   SAVE_GPR 29, 232, 1
+   SAVE_GPR 30, 240, 1
+   SAVE_GPR 31, 248, 1
+
+   addi 9, 1, 256
+   SAVE_VRS 20, 0, 9
+   SAVE_VRS 21, 16, 9
+   SAVE_VRS 22, 32, 9
+   SAVE_VRS 23, 48, 9
+   SAVE_VRS 24, 64, 9
+   SAVE_VRS 25, 80, 9
+   SAVE_VRS 26, 96, 9
+   SAVE_VRS 27, 112, 9
+   SAVE_VRS 28, 128, 9
+   SAVE_VRS 29, 144, 9
+   SAVE_VRS 30, 160, 9
+   SAVE_VRS 31, 176, 9
+
+   SAVE_VSX 14, 192, 9
+   SAVE_VSX 15, 208, 9
+   SAVE_VSX 16, 224, 9
+   SAVE_VSX 17, 240, 9
+   SAVE_VSX 18, 256, 9
+   SAVE_VSX 19, 272, 9
+   SAVE_VSX 20, 288, 9
+   SAVE_VSX 21, 304, 9
+   SAVE_VSX 22, 320, 9
+   SAVE_VSX 23, 336, 9
+   SAVE_VSX 24, 352, 9
+   SAVE_VSX 25, 368, 9
+   SAVE_VSX 26, 384, 9
+   SAVE_VSX 27, 400, 9
+   SAVE_VSX 28, 416, 9
+   SAVE_VSX 29, 432, 9
+   SAVE_VSX 30, 448, 9
+   SAVE_VSX 31, 464, 9
+.endm # SAVE_REGS
+
+.macro RESTORE_REGS
+   addi9, 1, 256
+   RESTORE_VRS 20, 0, 9
+   RESTORE_VRS 21, 16, 9
+   RESTORE_VRS 22, 32, 9
+   RESTORE_VRS 23, 48, 9
+   RESTORE_VRS 24, 64, 9
+   RESTORE_VRS 25, 80, 9
+   RESTORE_VRS 26, 96, 9
+   RESTORE_VRS 27, 112, 9
+   RESTORE_VRS 28, 128, 9
+   RESTORE_VRS 29, 144, 9
+   RESTORE_VRS 30, 160, 9
+   RESTORE_VRS 31, 176, 9
+
+   RESTORE_VSX 14, 192, 9
+   RESTORE_VSX 15, 208, 9
+   RESTORE_VSX 16, 224, 9
+   RESTORE_VSX 17, 240, 9
+   RESTORE_VSX 18, 256, 9
+   RESTORE_VSX 19, 272, 9
+   RESTORE_VSX 20, 288, 9
+   RESTORE_VSX 21, 304, 9
+   RESTORE_VSX 22, 320, 9
+   RESTORE_VSX 23, 336, 9
+   RESTORE_VSX 24, 352, 9
+   RESTORE_VSX 25, 368, 9
+   RESTORE_VSX 26, 384, 9
+   RESTORE_VSX 27, 400, 9
+   RESTORE_VSX 28, 416, 9
+   RESTORE_VSX 29, 

[PATCH 1/5] An optimized Chacha20 implementation with 8-way unrolling for ppc64le.

2023-04-24 Thread Danny Tsen
Improve the overall performance of chacha20 encrypt and decrypt operations
for Power10 or later CPUs.

Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/chacha-p10le-8x.S | 842 ++
 1 file changed, 842 insertions(+)
 create mode 100644 arch/powerpc/crypto/chacha-p10le-8x.S

diff --git a/arch/powerpc/crypto/chacha-p10le-8x.S 
b/arch/powerpc/crypto/chacha-p10le-8x.S
new file mode 100644
index ..7c15d17101d7
--- /dev/null
+++ b/arch/powerpc/crypto/chacha-p10le-8x.S
@@ -0,0 +1,842 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#
+# Accelerated chacha20 implementation for ppc64le.
+#
+# Copyright 2023- IBM Inc. All rights reserved
+#
+#===
+# Written by Danny Tsen 
+#
+# chacha_p10le_8x(u32 *state, byte *dst, const byte *src,
+#   size_t len, int nrounds);
+#
+# do rounds,  8 quarter rounds
+# 1.  a += b; d ^= a; d <<<= 16;
+# 2.  c += d; b ^= c; b <<<= 12;
+# 3.  a += b; d ^= a; d <<<= 8;
+# 4.  c += d; b ^= c; b <<<= 7
+#
+# row1 = (row1 + row2),  row4 = row1 xor row4,  row4 rotate each word by 16
+# row3 = (row3 + row4),  row2 = row3 xor row2,  row2 rotate each word by 12
+# row1 = (row1 + row2), row4 = row1 xor row4,  row4 rotate each word by 8
+# row3 = (row3 + row4), row2 = row3 xor row2,  row2 rotate each word by 7
+#
+# 4 blocks (a b c d)
+#
+# a0 b0 c0 d0
+# a1 b1 c1 d1
+# ...
+# a4 b4 c4 d4
+# ...
+# a8 b8 c8 d8
+# ...
+# a12 b12 c12 d12
+# a13 ...
+# a14 ...
+# a15 b15 c15 d15
+#
+# Column round (v0, v4,  v8, v12, v1, v5,  v9, v13, v2, v6, v10, v14, v3, v7, v11, v15)
+# Diagonal round (v0, v5, v10, v15, v1, v6, v11, v12, v2, v7,  v8, v13, v3, v4,  v9, v14)
+#
+
+#include 
+#include 
+#include 
+#include 
+
+.machine   "any"
+.text
+
+.macro SAVE_GPR GPR OFFSET FRAME
+   std \GPR,\OFFSET(\FRAME)
+.endm
+
+.macro SAVE_VRS VRS OFFSET FRAME
+   li  16, \OFFSET
+   stvx\VRS, 16, \FRAME
+.endm
+
+.macro SAVE_VSX VSX OFFSET FRAME
+   li  16, \OFFSET
+   stxvx   \VSX, 16, \FRAME
+.endm
+
+.macro RESTORE_GPR GPR OFFSET FRAME
+   ld  \GPR,\OFFSET(\FRAME)
+.endm
+
+.macro RESTORE_VRS VRS OFFSET FRAME
+   li  16, \OFFSET
+   lvx \VRS, 16, \FRAME
+.endm
+
+.macro RESTORE_VSX VSX OFFSET FRAME
+   li  16, \OFFSET
+   lxvx\VSX, 16, \FRAME
+.endm
+
+.macro SAVE_REGS
+   mflr 0
+   std 0, 16(1)
+   stdu 1,-752(1)
+
+   SAVE_GPR 14, 112, 1
+   SAVE_GPR 15, 120, 1
+   SAVE_GPR 16, 128, 1
+   SAVE_GPR 17, 136, 1
+   SAVE_GPR 18, 144, 1
+   SAVE_GPR 19, 152, 1
+   SAVE_GPR 20, 160, 1
+   SAVE_GPR 21, 168, 1
+   SAVE_GPR 22, 176, 1
+   SAVE_GPR 23, 184, 1
+   SAVE_GPR 24, 192, 1
+   SAVE_GPR 25, 200, 1
+   SAVE_GPR 26, 208, 1
+   SAVE_GPR 27, 216, 1
+   SAVE_GPR 28, 224, 1
+   SAVE_GPR 29, 232, 1
+   SAVE_GPR 30, 240, 1
+   SAVE_GPR 31, 248, 1
+
+   addi9, 1, 256
+   SAVE_VRS 20, 0, 9
+   SAVE_VRS 21, 16, 9
+   SAVE_VRS 22, 32, 9
+   SAVE_VRS 23, 48, 9
+   SAVE_VRS 24, 64, 9
+   SAVE_VRS 25, 80, 9
+   SAVE_VRS 26, 96, 9
+   SAVE_VRS 27, 112, 9
+   SAVE_VRS 28, 128, 9
+   SAVE_VRS 29, 144, 9
+   SAVE_VRS 30, 160, 9
+   SAVE_VRS 31, 176, 9
+
+   SAVE_VSX 14, 192, 9
+   SAVE_VSX 15, 208, 9
+   SAVE_VSX 16, 224, 9
+   SAVE_VSX 17, 240, 9
+   SAVE_VSX 18, 256, 9
+   SAVE_VSX 19, 272, 9
+   SAVE_VSX 20, 288, 9
+   SAVE_VSX 21, 304, 9
+   SAVE_VSX 22, 320, 9
+   SAVE_VSX 23, 336, 9
+   SAVE_VSX 24, 352, 9
+   SAVE_VSX 25, 368, 9
+   SAVE_VSX 26, 384, 9
+   SAVE_VSX 27, 400, 9
+   SAVE_VSX 28, 416, 9
+   SAVE_VSX 29, 432, 9
+   SAVE_VSX 30, 448, 9
+   SAVE_VSX 31, 464, 9
+.endm # SAVE_REGS
+
+.macro RESTORE_REGS
+   addi9, 1, 256
+   RESTORE_VRS 20, 0, 9
+   RESTORE_VRS 21, 16, 9
+   RESTORE_VRS 22, 32, 9
+   RESTORE_VRS 23, 48, 9
+   RESTORE_VRS 24, 64, 9
+   RESTORE_VRS 25, 80, 9
+   RESTORE_VRS 26, 96, 9
+   RESTORE_VRS 27, 112, 9
+   RESTORE_VRS 28, 128, 9
+   RESTORE_VRS 29, 144, 9
+   RESTORE_VRS 30, 160, 9
+   RESTORE_VRS 31, 176, 9
+
+   RESTORE_VSX 14, 192, 9
+   RESTORE_VSX 15, 208, 9
+   RESTORE_VSX 16, 224, 9
+   RESTORE_VSX 17, 240, 9
+   RESTORE_VSX 18, 256, 9
+   RESTORE_VSX 19, 272, 9
+   RESTORE_VSX 20, 288, 9
+   RESTORE_VSX 21, 304, 9
+   RESTORE_VSX 22, 320, 9
+   RESTORE_VSX 23, 336, 9
+   RESTORE_VSX 24, 352, 9
+   RESTORE_VSX 25, 368, 9
+   RESTORE_VSX 26, 384, 9
+   RESTORE_VSX 27, 400, 9
+   RESTORE_VSX 28, 416, 9
+   RESTORE_VSX 29, 432, 9
+   RESTORE_VSX 30, 448, 9
+   RESTORE_VSX 31, 464, 9
+
+   RESTORE_GPR 14, 112, 1
+   RESTORE_GPR 15, 120, 1
+   RESTORE_GPR 16, 128, 1
+   RESTORE_GPR 17, 136, 1
+   RESTORE_GPR 18, 144, 1
+ 

[PATCH 0/5] crypto: Accelerated Chacha20/Poly1305 implementation

2023-04-24 Thread Danny Tsen
This patch series provides an accelerated/optimized Chacha20 and Poly1305
implementation for Power10 or later CPUs (ppc64le).  This module
implements the algorithm specified in RFC7539.  The implementation
provides 3.5X better performance than the baseline for Chacha20 and
Poly1305 individually and 1.5X improvement for Chacha20/Poly1305
operation.

This patch has been tested with the kernel crypto module tcrypt.ko and
has passed the selftest.  The patch is also tested with
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS enabled.


Danny Tsen (5):
  An optimized Chacha20 implementation with 8-way unrolling for ppc64le.
  Glue code for optimized Chacha20 implementation for ppc64le.
  An optimized Poly1305 implementation with 4-way unrolling for ppc64le.
  Glue code for optimized Poly1305 implementation for ppc64le.
  Update Kconfig and Makefile.

 arch/powerpc/crypto/Kconfig |   26 +
 arch/powerpc/crypto/Makefile|4 +
 arch/powerpc/crypto/chacha-p10-glue.c   |  223 +
 arch/powerpc/crypto/chacha-p10le-8x.S   |  842 ++
 arch/powerpc/crypto/poly1305-p10-glue.c |  186 
 arch/powerpc/crypto/poly1305-p10le_64.S | 1075 +++
 6 files changed, 2356 insertions(+)
 create mode 100644 arch/powerpc/crypto/chacha-p10-glue.c
 create mode 100644 arch/powerpc/crypto/chacha-p10le-8x.S
 create mode 100644 arch/powerpc/crypto/poly1305-p10-glue.c
 create mode 100644 arch/powerpc/crypto/poly1305-p10le_64.S

-- 
2.31.1



Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Segher Boessenkool
On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote:
> On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> > At what points can r13 change?  Only when some particular functions are
> > called?
> 
> r13 is the local paca:
> 
>   register struct paca_struct *local_paca asm("r13");
> 
> , which is a pointer to percpu data.

Yes, it is a global register variable.

> So if a task is scheduled from one CPU to another CPU, the value gets
> changed.

But the compiler does not see that something else changes local_paca (or
r13 some other way, via assembler code perhaps)?  Or is there a compiler
bug?

If the latter is true:

Can you make a reproducer and open a GCC PR?  
for how to get started doing that.  We need *exact* code that shows the
problem, together with a compiler command line.  So that we can
reproduce the problem.  That is step 0 in figuring out what is going on,
and then maybe fixing the problem :-)


Segher


Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Boqun Feng
On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote:
> > Boqun Feng  writes:
> > > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> > >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou  wrote:
> > >> > by debugging, I see the r10 is assigned with r13 on c0226eb4,
> > >> > but if there is a context-switch before c0226edc, a false
> > >> > positive will be reported.
> 
> > I've never understood why the compiler wants to make a copy of a
> > register variable into another register!? >:#
> 
> It is usually because a) you told it to (maybe via an earlyclobber), or
> b) it looked cheaper.  I don't see either here :-(
> 
> > > here I think that the compiler is using r10 as an alias to r13, since
> > > for userspace program, it's safe to assume the TLS pointer doesn't
> > > change. However this is not true for kernel percpu pointer.
> 
> r13 is a "fixed" register, but that means it has a fixed purpose (so not
> available for allocation), it does not mean "unchanging".
> 
> > > The real intention here is to compare 40(r1) vs 3192(r13) for stack
> > > guard checking, however since r13 is the percpu pointer in kernel, so
> > > the value of r13 can be changed if the thread gets scheduled to a
> > > different CPU after reading r13 for r10.
> > 
> > Yeah that's not good.
> 
> The GCC pattern here makes the four machine insns all stay together.
> That is to make sure to not leak any secret value, which is impossible
> to guarantee otherwise.
> 
> What tells GCC r13 can randomly change behind its back?  And, what then
> makes GCC ignore that fact?
> 
> > >   +   asm volatile("" : : : "r13", "memory");
> 
> Any asm without output is always volatile.
> 
> > > Needless to say, the correct fix is to make ppc stack protector aware of
> > > r13 is volatile.
> > 
> > I suspect the compiler developers will tell us to go jump :)
> 
> Why would r13 change over the course of *this* function / this macro,
> why can this not happen anywhere else?
> 
> > The problem of the compiler caching r13 has come up in the past, but I
> > only remember it being "a worry" rather than causing an actual bug.
> 
> In most cases the compiler is smart enough to use r13 directly, instead
> of copying it to another reg and then using that one.  But not here for
> some strange reason.  That of course is a very minor generated machine
> code quality bug and nothing more :-(
> 
> > We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
> > least some comfort that if the compiler is caching r13, it shouldn't be
> > doing it in preemptable regions.
> > 
> > But obviously that doesn't help at all with the stack protector check.
> > 
> > I don't see an easy fix.
> > 
> > Adding "volatile" to the definition of local_paca seems to reduce but
> not eliminate the caching of r13, and the GCC docs explicitly say *not* to
> > use volatile. It also triggers lots of warnings about volatile being
> > discarded.
> 
> The point here is to say some code clobbers r13, not the asm volatile?
> 
> > Or something simple I haven't thought of? :)
> 
> At what points can r13 change?  Only when some particular functions are
> called?
> 

r13 is the local paca:

register struct paca_struct *local_paca asm("r13");

, which is a pointer to percpu data.

So if a task is scheduled from one CPU to another CPU, the value gets
changed.

Regards,
Boqun


> 
> Segher


Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Segher Boessenkool
Hi!

On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote:
> Boqun Feng  writes:
> > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
> >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou  wrote:
> >> > by debugging, I see the r10 is assigned with r13 on c0226eb4,
> >> > but if there is a context-switch before c0226edc, a false
> >> > positive will be reported.

> I've never understood why the compiler wants to make a copy of a
> register variable into another register!? >:#

It is usually because a) you told it to (maybe via an earlyclobber), or
b) it looked cheaper.  I don't see either here :-(

> > here I think that the compiler is using r10 as an alias to r13, since
> > for userspace program, it's safe to assume the TLS pointer doesn't
> > change. However this is not true for kernel percpu pointer.

r13 is a "fixed" register, but that means it has a fixed purpose (so not
available for allocation), it does not mean "unchanging".

> > The real intention here is to compare 40(r1) vs 3192(r13) for stack
> > guard checking, however since r13 is the percpu pointer in kernel, so
> > the value of r13 can be changed if the thread gets scheduled to a
> > different CPU after reading r13 for r10.
> 
> Yeah that's not good.

The GCC pattern here makes the four machine insns all stay together.
That is to make sure to not leak any secret value, which is impossible
to guarantee otherwise.

What tells GCC r13 can randomly change behind its back?  And, what then
makes GCC ignore that fact?

> > +   asm volatile("" : : : "r13", "memory");

Any asm without output is always volatile.

> > Needless to say, the correct fix is to make ppc stack protector aware of
> > r13 is volatile.
> 
> I suspect the compiler developers will tell us to go jump :)

Why would r13 change over the course of *this* function / this macro,
why can this not happen anywhere else?

> The problem of the compiler caching r13 has come up in the past, but I
> only remember it being "a worry" rather than causing an actual bug.

In most cases the compiler is smart enough to use r13 directly, instead
of copying it to another reg and then using that one.  But not here for
some strange reason.  That of course is a very minor generated machine
code quality bug and nothing more :-(

> We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at
> least some comfort that if the compiler is caching r13, it shouldn't be
> doing it in preemptable regions.
> 
> But obviously that doesn't help at all with the stack protector check.
> 
> I don't see an easy fix.
> 
> Adding "volatile" to the definition of local_paca seems to reduce but
not eliminate the caching of r13, and the GCC docs explicitly say *not* to
> use volatile. It also triggers lots of warnings about volatile being
> discarded.

The point here is to say some code clobbers r13, not the asm volatile?

> Or something simple I haven't thought of? :)

At what points can r13 change?  Only when some particular functions are
called?


Segher


Re: [PATCH v10 0/5] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel

2023-04-24 Thread Sourabh Jain



On 24/04/23 19:35, Eric DeVolder wrote:



On 4/23/23 05:52, Sourabh Jain wrote:

The Problem:

Post CPU/Memory hot plug/unplug and online/offline events the kernel
holds stale information about the system. Dump collection with stale
kdump kernel might end up in dump capture failure or an inaccurate dump
collection.

Existing solution:
==
The existing solution keeps the kdump kernel up-to-date by monitoring
CPU/Memory hotplug/online/offline events via a udev rule and triggering a
full kdump kernel reload for every hotplug event.

Shortcomings:

- Leaves a window where kernel crash might not lead to a successful dump
   collection.
- Reloading all kexec components for each hotplug is inefficient.
- udev rules are prone to races if hotplug events are frequent.

More about issues with an existing solution is posted here:
  - https://lkml.org/lkml/2020/12/14/532
  - 
https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html


Proposed Solution:
==
Instead of reloading all kexec segments on CPU/Memory hotplug/online/offline
event, this patch series focuses on updating only the relevant kexec segment.

Once the kexec segments are loaded in the kernel reserved area then an
arch-specific hotplug handler will update the relevant kexec segment based on
hotplug event type.

Series Dependencies

This patch series implements the crash hotplug handler on PowerPC. The generic
crash hotplug handler is introduced by https://lkml.org/lkml/2023/4/4/1136 patch
series.

Git tree for testing:
=
The below git tree has this patch series applied on top of dependent patch
series.
https://github.com/sourabhjains/linux/tree/e21-s10

To realise the feature the kdump udev rule must be updated to avoid a full
kdump reload on CPU/Memory hotplug/online/offline events.

   RHEL: /usr/lib/udev/rules.d/98-kexec.rules

-SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_reload_cpu"
-SUBSYSTEM=="memory", ACTION=="online", GOTO="kdump_reload_mem"
-SUBSYSTEM=="memory", ACTION=="offline", GOTO="kdump_reload_mem"
+SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", 
GOTO="kdump_reload_end"
+SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", 
GOTO="kdump_reload_end"




I didn't see in the patch series where you would have the equivalent to the
following (needed for the sysfs crash_hotplug entries):


#ifdef CONFIG_HOTPLUG_CPU
static inline int crash_hotplug_cpu_support(void) { return 1; }
#define crash_hotplug_cpu_support crash_hotplug_cpu_support
#endif

#ifdef CONFIG_MEMORY_HOTPLUG
static inline int crash_hotplug_memory_support(void) { return 1; }
#define crash_hotplug_memory_support crash_hotplug_memory_support
#endif


I missed the above diff in my testing environment. Thank you for bringing it
to my attention. I will fix this in the next version.

- Sourabh Jain


Re: [PATCH v10 2/5] powerpc/crash: introduce a new config option CRASH_HOTPLUG

2023-04-24 Thread Sourabh Jain



On 24/04/23 15:27, Laurent Dufour wrote:

On 23/04/2023 12:52:10, Sourabh Jain wrote:

Due to CPU/Memory hot plug/unplug or online/offline events the system
resources changes. A similar change should reflect in the loaded kdump
kernel kexec segments that describes the state of the CPU and memory of
the running kernel.

If the kdump kernel kexec segments are not updated after the CPU/Memory
hot plug/unplug or online/offline events and kdump kernel tries to
collect the dump with the stale system resource data then this might
lead to dump collection failure or an inaccurate dump collection.

The current method to keep the kdump kernel kexec segments up to date is
by reloading the complete kdump kernel whenever a CPU/Memory hot
plug/unplug or online/offline event is observed in userspace. Reloading
the kdump kernel for every CPU/Memory hot plug/unplug or online/offline
event is inefficient and creates a large window where the kdump service
is not available. It can be improved by doing in-kernel updates to only
necessary kdump kernel kexec segments which describe CPU and Memory
resources of the running kernel to the kdump kernel.

The kernel changes related to in-kernel updates to the kdump kernel
kexec segments are kept under the CRASH_HOTPLUG config option.

Later in the series, a powerpc crash hotplug handler is introduced to
update the kdump kernel kexec segments on CPU/Memory hotplug events.
This arch-specific handler is triggered from a generic crash handler
that registers with the CPU/Memory add/remove notifiers.

The CRASH_HOTPLUG config option is enabled by default.

Signed-off-by: Sourabh Jain 
Reviewed-by: Laurent Dufour 

I can't remember having sent a review-by on that patch earlier.

Anyway, I can't find any issue with that one, so replace with:
Reviewed-by: Laurent Dufour 


I apologize for mistakenly applying the "Reviewed-by" tag to the entire
patch series, when it was intended for only one patch. I will remove the
tag in the next version.

- Sourabh Jain



Re: [PATCH v10 4/5] crash: forward memory_notify args to arch crash hotplug handler

2023-04-24 Thread Eric DeVolder




On 4/23/23 05:52, Sourabh Jain wrote:

On PowePC memblock regions are used to prepare elfcorehdr which

s/PowePC/PowerPC/


describes the memory regions of the running kernel to the kdump kernel.
Since the notifier used for the memory hotplug crash handler gets
initiated before the update of the memblock region happens (as depicted
below) the newly prepared elfcorehdr still holds the old memory regions.
If the elfcorehdr is prepared with stale memblock regions then the newly
prepared elfcorehdr will still be holding stale memory regions. And dump
collection with stale elfcorehdr will lead to dump collection failure or
incomplete dump collection.

The sequence of actions done on PowerPC when an LMB memory hot removed:

  Initiate memory hot remove
   |
   v
  offline pages
   |
   v
  initiate memory notify call
  chain for MEM_OFFLINE event  <---> Prepare new elfcorehdr and replace
                                     the old one with it
   |
   v
  update memblock regions

Such challenges only exist for memory remove case. For the memory add
case the memory regions are updated first and then memory notify calls
the arch crash hotplug handler to update the elfcorehdr.

This patch passes additional information about the hot removed LMB to
the arch crash hotplug handler in the form of memory_notify object.

How will passing memory_notify to the arch crash hotplug handler help?

memory_notify holds the start PFN and page count of the hot-removed
memory. With that, the base address and the size of the hot-removed memory
can be calculated, and these can be used to avoid adding the hot-removed
memory region to the elfcorehdr.

Signed-off-by: Sourabh Jain 
Reviewed-by: Laurent Dufour 
---
  arch/powerpc/include/asm/kexec.h |  2 +-
  arch/powerpc/kexec/core_64.c |  3 ++-
  arch/x86/include/asm/kexec.h |  2 +-
  arch/x86/kernel/crash.c  |  3 ++-
  include/linux/kexec.h|  2 +-
  kernel/crash_core.c  | 14 +++---
  6 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index f01ba767af56e..7e811bad5ec92 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -104,7 +104,7 @@ struct crash_mem;
  int update_cpus_node(void *fdt);
  int get_crash_memory_ranges(struct crash_mem **mem_ranges);
  #if defined(CONFIG_CRASH_HOTPLUG)
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  #endif
  #endif
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 611b89bcea2be..147ea6288a526 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -551,10 +551,11 @@ int update_cpus_node(void *fdt)
   * arch_crash_hotplug_handler() - Handle crash CPU/Memory hotplug events to update the
   *necessary kexec segments based on the hotplug event.

s/arch_crash_hotplug_handler/arch_crash_handle_hotplug_event/


   * @image: the active struct kimage
+ * @arg: struct memory_notify handler for memory add/remove case and NULL for CPU case.
   *
   * Update FDT segment to include newly added CPU. No action for CPU remove case.
   */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
  {
void *fdt, *ptr;
unsigned long mem;
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 1bc852ce347d4..70c3b23b468b6 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -213,7 +213,7 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
  extern void kdump_nmi_shootdown_cpus(void);
  
  #ifdef CONFIG_CRASH_HOTPLUG

-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
  
  #ifdef CONFIG_HOTPLUG_CPU

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index ead602636f3e0..b45d13193b579 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -445,11 +445,12 @@ int crash_load_segments(struct kimage *image)
  /**
   * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
   * @image: the active struct kimage
+ * @arg: struct memory_notify handler for memory add/remove case and NULL for CPU case.
   *
   * The new elfcorehdr is prepared in a kernel buffer, and then it is
   * written on top of the existing/old elfcorehdr.
   */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
  {
void *elfbuf = NULL, *old_elfcorehdr;
unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h 

Re: [PATCH v10 0/5] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel

2023-04-24 Thread Eric DeVolder




On 4/23/23 05:52, Sourabh Jain wrote:

The Problem:

Post CPU/Memory hot plug/unplug and online/offline events the kernel
holds stale information about the system. Dump collection with stale
kdump kernel might end up in dump capture failure or an inaccurate dump
collection.

Existing solution:
==
The existing solution keeps the kdump kernel up-to-date by monitoring
CPU/Memory hotplug/online/offline events via a udev rule and triggering a full
kdump kernel reload for every hotplug event.

Shortcomings:

- Leaves a window where kernel crash might not lead to a successful dump
   collection.
- Reloading all kexec components for each hotplug is inefficient.
- udev rules are prone to races if hotplug events are frequent.

More about issues with an existing solution is posted here:
  - https://lkml.org/lkml/2020/12/14/532
  - https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html

Proposed Solution:
==
Instead of reloading all kexec segments on CPU/Memory hotplug/online/offline
event, this patch series focuses on updating only the relevant kexec segment.
Once the kexec segments are loaded in the kernel reserved area then an
arch-specific hotplug handler will update the relevant kexec segment based on
hotplug event type.

Series Dependencies

This patch series implements the crash hotplug handler on PowerPC. The generic
crash hotplug handler is introduced by https://lkml.org/lkml/2023/4/4/1136 patch
series.

Git tree for testing:
=
The below git tree has this patch series applied on top of dependent patch
series.
https://github.com/sourabhjains/linux/tree/e21-s10

To realise the feature the kdump udev rule must be updated to avoid a full
kdump reload on CPU/Memory hotplug/online/offline events.

   RHEL: /usr/lib/udev/rules.d/98-kexec.rules

-SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_reload_cpu"
-SUBSYSTEM=="memory", ACTION=="online", GOTO="kdump_reload_mem"
-SUBSYSTEM=="memory", ACTION=="offline", GOTO="kdump_reload_mem"
+SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
+SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"



I didn't see in the patch series where you would have the equivalent to the following (needed for 
the sysfs crash_hotplug entries):


#ifdef CONFIG_HOTPLUG_CPU
static inline int crash_hotplug_cpu_support(void) { return 1; }
#define crash_hotplug_cpu_support crash_hotplug_cpu_support
#endif

#ifdef CONFIG_MEMORY_HOTPLUG
static inline int crash_hotplug_memory_support(void) { return 1; }
#define crash_hotplug_memory_support crash_hotplug_memory_support
#endif


Note: only the kexec_file_load syscall will work. For kexec_load, minor changes
are required in the kexec tool.

---
Changelog:

v10:
   - Drop the patch that adds fdt_index attribute to struct kimage_arch
 Find the fdt segment index when needed.
   - Added more details into commits messages.
   - Rebased onto 6.3.0-rc5

v9:
   - Removed patch to prepare elfcorehdr crash notes for possible CPUs.
 The patch is moved to generic patch series that introduces generic
 infrastructure for in kernel crash update.
   - Removed patch to pass the hotplug action type to the arch crash
 hotplug handler function. The generic patch series has introduced
 the hotplug action type in kimage struct.
   - Add detail commit message for better understanding.

v8:
   - Restrict fdt_index initialization to machine_kexec_post_load so
 it works for both kexec_load and kexec_file_load. [3/8] Laurent Dufour

   - Updated the logic to find the number of offline core. [6/8]

   - Changed the logic to find the elfcore program header to accommodate
 future memory ranges due to memory hotplug events. [8/8]

v7
   - added a new config to configure this feature
   - pass hotplug action type to arch specific handler

v6
   - Added crash memory hotplug support

v5:
   - Replace CONFIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
   - Move fdt segment identification for kexec_load case to load path
 instead of crash hotplug handler
   - Keep new attribute defined under kimage_arch to track FDT segment
 under CONFIG_HOTPLUG_CPU config.

v4:
   - Update the logic to find the additional space needed for hotadd CPUs post
 kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash hotplug
 support for kexec_file_load" patch to know more about the change.
   - Fix a couple of typos.
   - Replace pr_err with pr_info_once to warn the user about memory hotplug
 support.
   - In the crash hotplug handler, exit the for loop if the FDT segment is found.

v3
   - Move fdt_index and fdt_index_valid variables to kimage_arch struct.
   - Rebase patches on top of https://lkml.org/lkml/2022/3/3/674 [v5]
   - Fixed warnings reported by the checkpatch script

v2:
   - Use generic hotplug handler introduced by https://lkml.org/lkml/2022/2/9/1406, 

Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

2023-04-24 Thread Michael Ellerman
Hi Boqun,

Thanks for debugging this ...

Boqun Feng  writes:
> On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote:
>> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou  wrote:
>> >
>> > Dear PowerPC and RCU developers:
>> > During the RCU torture test on mainline (on the VM of Opensource Lab
>> > of Oregon State University), SRCU-P failed with __stack_chk_fail:
>> > [  264.381952][   T99] [c6c7bab0] [c10c67c0]
>> > dump_stack_lvl+0x94/0xd8 (unreliable)
>> > [  264.383786][   T99] [c6c7bae0] [c014fc94] 
>> > panic+0x19c/0x468
>> > [  264.385128][   T99] [c6c7bb80] [c10fca24]
>> > __stack_chk_fail+0x24/0x30
>> > [  264.386610][   T99] [c6c7bbe0] [c02293b4]
>> > srcu_gp_start_if_needed+0x5c4/0x5d0
>> > [  264.388188][   T99] [c6c7bc70] [c022f7f4]
>> > srcu_torture_call+0x34/0x50
>> > [  264.389611][   T99] [c6c7bc90] [c022b5e8]
>> > rcu_torture_fwd_prog+0x8c8/0xa60
>> > [  264.391439][   T99] [c6c7be00] [c018e37c] 
>> > kthread+0x15c/0x170
>> > [  264.392792][   T99] [c6c7be50] [c000df94]
>> > ret_from_kernel_thread+0x5c/0x64
>> > The kernel config file can be found in [1].
>> > And I write a bash script to accelerate the bug reproducing [2].
>> > After a week's debugging, I found the cause of the bug is because the
>> > register r10 used to judge for stack overflow is not constant between
>> > context switches.
>> > The assembly code for srcu_gp_start_if_needed is located at [3]:
>> > c0226eb4:   78 6b aa 7d mr  r10,r13
>> > c0226eb8:   14 42 29 7d add r9,r9,r8
>> > c0226ebc:   ac 04 00 7c hwsync
>> > c0226ec0:   10 00 7b 3b addir27,r27,16
>> > c0226ec4:   14 da 29 7d add r9,r9,r27
>> > c0226ec8:   a8 48 00 7d ldarx   r8,0,r9
>> > c0226ecc:   01 00 08 31 addic   r8,r8,1
>> > c0226ed0:   ad 49 00 7d stdcx.  r8,0,r9
>> > c0226ed4:   f4 ff c2 40 bne-c0226ec8
>> > 
>> > c0226ed8:   28 00 21 e9 ld  r9,40(r1)
>> > c0226edc:   78 0c 4a e9 ld  r10,3192(r10)
>> > c0226ee0:   79 52 29 7d xor.r9,r9,r10
>> > c0226ee4:   00 00 40 39 li  r10,0
>> > c0226ee8:   b8 03 82 40 bne c02272a0
>> > 
>> > by debugging, I see the r10 is assigned with r13 on c0226eb4,
>> > but if there is a context-switch before c0226edc, a false
>> > positive will be reported.
>> >
>> > [1] http://154.220.3.115/logs/0422/configformainline.txt
>> > [2] 154.220.3.115/logs/0422/whilebash.sh
>> > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt
>> >
>> > My analysis and debugging may not be correct, but the bug is easily
>> > reproducible.
>> 
>> If this is a bug in the stack-smashing protection, as you seem to hint,
>> I wonder if you see the issue with a specific gcc version and it is a
>> compiler-specific issue. It's hard to say, but considering this I
>
> Very likely. More asm code from Zhouyi's link:
>
> This is __srcu_read_unlock_nmisafe(), since "hwsync" is
> smp_mb__{after,before}_atomic(), and the following code is first the
> barrier and then the atomic, so it's the unlock.
>
>   c0226eb4:   78 6b aa 7d mr  r10,r13
>
> ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also
> the pointer to TLS data for userspace code.

I've never understood why the compiler wants to make a copy of a
register variable into another register!? >:#

>   c0226eb8:   14 42 29 7d add r9,r9,r8
>   c0226ebc:   ac 04 00 7c hwsync
>   c0226ec0:   10 00 7b 3b addir27,r27,16
>   c0226ec4:   14 da 29 7d add r9,r9,r27
>   c0226ec8:   a8 48 00 7d ldarx   r8,0,r9
>   c0226ecc:   01 00 08 31 addic   r8,r8,1
>   c0226ed0:   ad 49 00 7d stdcx.  r8,0,r9
>   c0226ed4:   f4 ff c2 40 bne-c0226ec8 
> 
>   c0226ed8:   28 00 21 e9 ld  r9,40(r1)
>   c0226edc:   78 0c 4a e9 ld  r10,3192(r10)
>
> here I think the compiler is using r10 as an alias for r13, since for
> a userspace program it's safe to assume the TLS pointer doesn't
> change. However, this is not true for the kernel per-CPU pointer.
>
> The real intention here is to compare 40(r1) vs 3192(r13) for the stack
> guard check; however, since r13 is the per-CPU pointer in the kernel,
> the value of r13 can change if the thread gets scheduled to a
> different CPU after r13 has been read into r10.

Yeah that's not good.

> If I'm correct, the following should be a workaround:
>
>   diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>   index ab4ee58af84b..f5ae3be3d04d 100644
>   --- a/kernel/rcu/srcutree.c
>   +++ b/kernel/rcu/srcutree.c
>   @@ -747,6 +747,7 @@ void 

Re: [PATCH] powerpc/bpf: populate extable entries only during the last pass

2023-04-24 Thread Naveen N. Rao

Hari Bathini wrote:

Hello Christophe,

Thanks for the review.

On 07/04/23 11:31 am, Christophe Leroy wrote:



On 06/04/2023 09:35, Hari Bathini wrote:

Since commit 85e031154c7c ("powerpc/bpf: Perform complete extra passes
to update addresses"), two additional passes are performed to avoid
space and CPU time wastage on powerpc. But these extra passes led to
WARN_ON_ONCE() hits in bpf_add_extable_entry(). Fix it by not adding
extable entries during the extra pass.


Are you sure this change is correct ?


Actually, I was in two minds about that owing to commit 04c04205bc35
("bpf powerpc: Remove extra_pass from bpf_jit_build_body()").


Right, but Christophe's series adding complete passes during the 
extra_pass phase added the 'extra_pass' parameter back to 
bpf_jit_build_body().





During the extra pass the code can get shrunk or expanded (within the
limits of the size of the preliminary pass). Shouldn't extable entries
be populated during the last pass?


Unlikely, but the intention there was to eliminate a regression in case
extra_pass always ends up being 'false' in any subsequent change.


But, the current approach risks generating incorrect offsets in the 
extable. The main motivation for the extra pass is to generate more 
compact code, so there is a good chance that offsets are going to change 
(especially with bpf subprogs).




- Hari



Fixes: 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update 
addresses")
Signed-off-by: Hari Bathini 
---
   arch/powerpc/net/bpf_jit_comp32.c | 2 +-
   arch/powerpc/net/bpf_jit_comp64.c | 2 +-
   2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c 
b/arch/powerpc/net/bpf_jit_comp32.c
index 7f91ea064c08..e788b1fbeee6 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -977,7 +977,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, 
struct codegen_context *
if (size != BPF_DW && !fp->aux->verifier_zext)
EMIT(PPC_RAW_LI(dst_reg_h, 0));
   
-			if (BPF_MODE(code) == BPF_PROBE_MEM) {
+			if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) {

It is probably better to pass 'extra_pass' into bpf_add_extable_entry() 
to keep all those checks together.



- Naveen



Re: In-flight collision: DRM_AMD_DC_DCN renaming

2023-04-24 Thread Michael Ellerman
Hi Lukas,

Lukas Bulwahn  writes:
> Dear Michael, dear Harry, dear Alex,
>
> The commit 4652ae7a51b7 ("drm/amd/display: Rename DCN config to FP")
> renames config DRM_AMD_DC_DCN to config DRM_AMD_DC_FP. The concurrent
> commit 78f0929884d4 ("powerpc/64: Always build with 128-bit long
> double") overrides the renaming change for the select in config
> DRM_AMD_DC, and this leads to selecting the non-existent
> DRM_AMD_DC_DCN.

The powerpc commit doesn't override the name change: in the powerpc tree
where it's applied, the name change hasn't happened yet. See the diff:

diff --git a/drivers/gpu/drm/amd/display/Kconfig 
b/drivers/gpu/drm/amd/display/Kconfig
index 0c9bd0a53e60..e36261d546af 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,7 @@ config DRM_AMD_DC
depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
-   select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128 || (ARM64 && 
KERNEL_MODE_NEON && !CC_IS_CLANG))
+   select DRM_AMD_DC_DCN if (X86 || (PPC64 && ALTIVEC) || (ARM64 && 
KERNEL_MODE_NEON && !CC_IS_CLANG))
help
  Choose this option if you want to use the new display engine
  support for AMDGPU. This adds required support for Vega and


The problem is that the resolution of the merge conflict in linux-next
is incorrect, it takes the powerpc change without taking into account
the rename from the amdgpu commit.

The correct resolution is:

diff --cc drivers/gpu/drm/amd/display/Kconfig
index e36261d546af,06b438217c61..
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@@ -8,7 -8,7 +8,7 @@@ config DRM_AMD_D
depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
-   select DRM_AMD_DC_DCN if (X86 || (PPC64 && ALTIVEC) || (ARM64 && 
KERNEL_MODE_NEON && !CC_IS_CLANG))
 -  select DRM_AMD_DC_FP if (X86 || PPC64 || (ARM64 && KERNEL_MODE_NEON && 
!CC_IS_CLANG))
++  select DRM_AMD_DC_FP if (X86 || (PPC64 && ALTIVEC) || (ARM64 && 
KERNEL_MODE_NEON && !CC_IS_CLANG))
help
  Choose this option if you want to use the new display engine
  support for AMDGPU. This adds required support for Vega and


(Note that 4652ae7a51b7 incorrectly changed PPC_LONG_DOUBLE_128 to plain
 PPC64, which is why PPC_LONG_DOUBLE_128 doesn't appear in the diff above.)

Possibly the merge resolution can be fixed in linux-next.

And ultimately the fix is for Linus to do the proper merge resolution
when he eventually merges the two trees.

cheers


Re: [PATCH v10 4/5] crash: forward memory_notify args to arch crash hotplug handler

2023-04-24 Thread Laurent Dufour
On 23/04/2023 12:52:12, Sourabh Jain wrote:
> On PowerPC, memblock regions are used to prepare the elfcorehdr, which
> describes the memory regions of the running kernel to the kdump kernel.
> Since the notifier used for the memory hotplug crash handler gets
> initiated before the update of the memblock regions happens (as depicted
> below), the newly prepared elfcorehdr still holds the old, stale memory
> regions. Dump collection with a stale elfcorehdr will lead to dump
> collection failure or incomplete dump collection.
> 
> The sequence of actions done on PowerPC when an LMB of memory is hot removed:
> 
>  Initiate memory hot remove
>   |
>   v
>  offline pages
>   |
>   v
>  initiate memory notify call
>  chain for MEM_OFFLINE event  <---> Prepare new elfcorehdr and replace
>   it with old one
>   |
>   v
>  update memblock regions
> 
> Such challenges only exist for the memory remove case. For the memory
> add case, the memory regions are updated first and then the memory
> notifier calls the arch crash hotplug handler to update the elfcorehdr.
> 
> This patch passes additional information about the hot-removed LMB to
> the arch crash hotplug handler in the form of a memory_notify object.
> 
> How will passing memory_notify to the arch crash hotplug handler help?
> 
> memory_notify holds the start PFN and page count of the hot-removed
> memory. From these, the base address and size of the hot-removed
> memory can be calculated and used to avoid adding the hot-removed
> memory region to the elfcorehdr.
> 
> Signed-off-by: Sourabh Jain 
> Reviewed-by: Laurent Dufour 

I don't remember sending a Reviewed-by on this patch earlier, do you?

> ---
>  arch/powerpc/include/asm/kexec.h |  2 +-
>  arch/powerpc/kexec/core_64.c |  3 ++-
>  arch/x86/include/asm/kexec.h |  2 +-
>  arch/x86/kernel/crash.c  |  3 ++-
>  include/linux/kexec.h|  2 +-
>  kernel/crash_core.c  | 14 +++---
>  6 files changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h 
> b/arch/powerpc/include/asm/kexec.h
> index f01ba767af56e..7e811bad5ec92 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -104,7 +104,7 @@ struct crash_mem;
>  int update_cpus_node(void *fdt);
>  int get_crash_memory_ranges(struct crash_mem **mem_ranges);
>  #if defined(CONFIG_CRASH_HOTPLUG)
> -void arch_crash_handle_hotplug_event(struct kimage *image);
> +void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
>  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
>  #endif
>  #endif
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index 611b89bcea2be..147ea6288a526 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -551,10 +551,11 @@ int update_cpus_node(void *fdt)
>   * arch_crash_hotplug_handler() - Handle crash CPU/Memory hotplug events to 
> update the
>   *necessary kexec segments based on the 
> hotplug event.
>   * @image: the active struct kimage
> + * @arg: struct memory_notify handler for memory add/remove case and NULL 
> for CPU case.
>   *
>   * Update FDT segment to include newly added CPU. No action for CPU remove 
> case.
>   */
> -void arch_crash_handle_hotplug_event(struct kimage *image)
> +void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
>  {
>   void *fdt, *ptr;
>   unsigned long mem;
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index 1bc852ce347d4..70c3b23b468b6 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -213,7 +213,7 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
>  extern void kdump_nmi_shootdown_cpus(void);
>  
>  #ifdef CONFIG_CRASH_HOTPLUG
> -void arch_crash_handle_hotplug_event(struct kimage *image);
> +void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
>  #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
>  
>  #ifdef CONFIG_HOTPLUG_CPU
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index ead602636f3e0..b45d13193b579 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -445,11 +445,12 @@ int crash_load_segments(struct kimage *image)
>  /**
>   * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
>   * @image: the active struct kimage
> + * @arg: struct memory_notify handler for memory add/remove case and NULL 
> for CPU case.
>   *
>   * The new elfcorehdr is prepared in a kernel buffer, and then it is
>   * written on top of the existing/old elfcorehdr.
>   */
> -void arch_crash_handle_hotplug_event(struct kimage *image)
> +void arch_crash_handle_hotplug_event(struct 

Re: [PATCH v10 5/5] powerpc/kexec: add crash memory hotplug support

2023-04-24 Thread Laurent Dufour
On 23/04/2023 12:52:13, Sourabh Jain wrote:
> Extend the PowerPC arch crash hotplug handler to support memory hotplug
> events. Since the elfcorehdr is used to exchange the memory info between
> the kernels, it needs to be recreated to reflect the changes due to
> memory hotplug events.
> 
> Because of the way memory hotplug events are handled on PowerPC and the
> notifier call chain used in generic code to trigger the arch crash
> handler, the process to recreate the elfcorehdr is different for the
> memory add and remove cases.
> 
> For the memory remove case, the memory change notifier call chain is
> triggered first and then the memblock regions are updated, whereas for
> the memory hot add case, the memblock regions are updated before the
> memory change notifier call chain is invoked.
> 
> On PowerPC, the memblock regions list is used to prepare the elfcorehdr.
> In the case of memory hot remove, the memblock regions are updated after
> the arch crash hotplug handler is triggered, so an additional step is
> taken to ensure that the memory ranges used to prepare the elfcorehdr do
> not include the hot-removed memory.
> 
> When memory is hot removed, it is possible that the memory region count
> may increase. So to accommodate a growing number of memory regions, the
> elfcorehdr kexec segment is built with additional buffer space.
> 
> The changes done here will also work for the kexec_load system call,
> given that the kexec tool builds the elfcorehdr with additional space to
> accommodate future memory regions, as is done for the kexec_file_load
> system call in the kernel.
> 
> Signed-off-by: Sourabh Jain 
> Reviewed-by: Laurent Dufour 

I don't remember sending a Reviewed-by on this patch earlier, do you?

> ---
>  arch/powerpc/include/asm/kexec_ranges.h |  1 +
>  arch/powerpc/kexec/core_64.c| 77 +-
>  arch/powerpc/kexec/file_load_64.c   | 36 ++-
>  arch/powerpc/kexec/ranges.c | 85 +
>  4 files changed, 195 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec_ranges.h 
> b/arch/powerpc/include/asm/kexec_ranges.h
> index f83866a19e870..802abf580cf0f 100644
> --- a/arch/powerpc/include/asm/kexec_ranges.h
> +++ b/arch/powerpc/include/asm/kexec_ranges.h
> @@ -7,6 +7,7 @@
>  void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
>  struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
>  int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
> +int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
>  int add_tce_mem_ranges(struct crash_mem **mem_ranges);
>  int add_initrd_mem_range(struct crash_mem **mem_ranges);
>  #ifdef CONFIG_PPC_64S_HASH_MMU
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index 147ea6288a526..01a764b1c9b07 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -547,6 +548,76 @@ int update_cpus_node(void *fdt)
>  #undef pr_fmt
>  #define pr_fmt(fmt) "crash hp: " fmt
>  
> +/**
> + * update_crash_elfcorehdr() - Recreate the elfcorehdr and replace it with 
> old
> + *  elfcorehdr in the kexec segment array.
> + * @image: the active struct kimage
> + * @arg: struct memory_notify data handler
> + */
> +static void update_crash_elfcorehdr(struct kimage *image, struct 
> memory_notify *mn)
> +{
> + int ret;
> + struct crash_mem *cmem = NULL;
> + struct kexec_segment *ksegment;
> + void *ptr, *mem, *elfbuf = NULL;
> + unsigned long elfsz, memsz, base_addr, size;
> +
> + ksegment = >segment[image->elfcorehdr_index];
> + mem = (void *) ksegment->mem;
> + memsz = ksegment->memsz;
> +
> + ret = get_crash_memory_ranges();
> + if (ret) {
> + pr_err("Failed to get crash mem range\n");
> + return;
> + }
> +
> + /*
> +  * The hot unplugged memory is not yet removed from crash memory
> +  * ranges, remove it here.
> +  */
> + if (image->hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY) {
> + base_addr = PFN_PHYS(mn->start_pfn);
> + size = mn->nr_pages * PAGE_SIZE;
> + ret = remove_mem_range(, base_addr, size);
> + if (ret) {
> + pr_err("Failed to remove hot-unplugged from crash 
> memory ranges.\n");
> + return;
> + }
> + }
> +
> + ret = crash_prepare_elf64_headers(cmem, false, , );
> + if (ret) {
> + pr_err("Failed to prepare elf header\n");
> + return;
> + }
> +
> + /*
> +  * It is unlikely that the kernel hits this because the elfcorehdr kexec
> +  * segment (memsz) is built with additional space to accommodate a growing
> +  * number of crash memory ranges while loading the kdump kernel. This is
> +  * just to avoid any unforeseen case.
> +  */
> + if (elfsz > memsz) {
> + 

Re: [PATCH v10 3/5] powerpc/crash: add crash CPU hotplug support

2023-04-24 Thread Laurent Dufour
On 23/04/2023 12:52:11, Sourabh Jain wrote:
> Introduce powerpc crash hotplug handler to update the necessary kexec
> segments in the kernel on CPU/Memory hotplug events. Currently, these
> updates are done by monitoring CPU/Memory hotplug events in userspace.
> 
> A common crash hotplug handler is triggered from generic infrastructure
> for both CPU/Memory hotplug events. But in this patch, crash updates are
> handled only for CPU hotplug events. Support for the crash update on
> memory hotplug events is added in upcoming patches.
> 
> The elfcorehdr segment is used to exchange the CPU and other
> dump-related information between the kernels. Ideally, the elfcorehdr
> segment needs to be recreated on CPU hotplug events to reflect the
> changes. But on powerpc, the elfcorehdr is built with all possible CPUs,
> hence there is no need to update/recreate the elfcorehdr on CPU hotplug
> events.
> 
> In addition to elfcorehdr, there is another kexec segment that holds CPU
> data on powerpc is FDT (Flattened Device Tree). During the kdump kernel
> boot, it is expected that the crashing CPU must be present in FDT, else
> kdump kernel boot fails.
> 
> Now the only action needed on powerpc to handle the crash CPU hotplug
> event is to add hot-added CPUs to the kdump FDT segment to avoid kdump
> kernel boot failure. So for the CPU hot-add event, the FDT segment is
> updated with the hot-added CPU, and since there is no need to remove
> hot-unplugged CPUs from the FDT segment, no action is taken for the CPU
> hot-remove event.
> 
> To accommodate a growing number of CPUs, FDT is built with additional
> buffer space to ensure that it can hold possible CPU nodes.
> 
> The changes done here will also work for the kexec_load system call
> given that the kexec tool builds the FDT segment with additional space
> to accommodate possible CPU nodes.
> 
> Since memory crash hotplug support is not there yet, the crash hotplug
> handler simply warns the user and returns.
> 
> Signed-off-by: Sourabh Jain 
> Reviewed-by: Laurent Dufour 
> ---
>  arch/powerpc/include/asm/kexec.h  |  4 ++
>  arch/powerpc/kexec/core_64.c  | 61 +++
>  arch/powerpc/kexec/elf_64.c   | 12 +-
>  arch/powerpc/kexec/file_load_64.c | 14 +++
>  4 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h 
> b/arch/powerpc/include/asm/kexec.h
> index 8090ad7d97d9d..f01ba767af56e 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -103,6 +103,10 @@ void kexec_copy_flush(struct kimage *image);
>  struct crash_mem;
>  int update_cpus_node(void *fdt);
>  int get_crash_memory_ranges(struct crash_mem **mem_ranges);
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +void arch_crash_handle_hotplug_event(struct kimage *image);
> +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
> +#endif
>  #endif
>  
>  #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index 0b292f93a74cc..611b89bcea2be 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -543,6 +543,67 @@ int update_cpus_node(void *fdt)
>   return ret;
>  }
>  
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +#undef pr_fmt
> +#define pr_fmt(fmt) "crash hp: " fmt
> +
> +/**
> + * arch_crash_hotplug_handler() - Handle crash CPU/Memory hotplug events to 
> update the
> + *necessary kexec segments based on the 
> hotplug event.
> + * @image: the active struct kimage
> + *
> + * Update FDT segment to include newly added CPU. No action for CPU remove 
> case.
> + */
> +void arch_crash_handle_hotplug_event(struct kimage *image)
> +{
> + void *fdt, *ptr;
> + unsigned long mem;
> + int i, fdt_index = -1;
> + unsigned int hp_action = image->hp_action;
> +
> + /*
> +  * Since the hot-unplugged CPU is already part of crash FDT,
> +  * no action is needed for CPU remove case.
> +  */
> + if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
> + return;
> +
> + /* crash update on memory hotplug events is not supported yet */
> + if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == 
> KEXEC_CRASH_HP_ADD_MEMORY) {
> + pr_info_once("Crash update is not supported for memory 
> hotplug\n");
> + return;
> + }
> +
> + /* Find the FDT segment index in kexec segment array. */
> + for (i = 0; i < image->nr_segments; i++) {
> + mem = image->segment[i].mem;
> + ptr = __va(mem);
> +
> + if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
> + fdt_index = i;
> + break;
> + }
> + }
> +
> + if (fdt_index < 0) {
> + pr_err("Unable to locate FDT segment.\n");
> + return;
> + }
> +
> + fdt = __va((void *)image->segment[fdt_index].mem);
> +
> + /* Temporarily 

Re: [PATCH v10 2/5] powerpc/crash: introduce a new config option CRASH_HOTPLUG

2023-04-24 Thread Laurent Dufour
On 23/04/2023 12:52:10, Sourabh Jain wrote:
> Due to CPU/Memory hot plug/unplug or online/offline events, the system
> resources change. A similar change should be reflected in the loaded
> kdump kernel kexec segments, which describe the state of the CPU and
> memory of the running kernel.
> 
> If the kdump kernel kexec segments are not updated after the CPU/Memory
> hot plug/unplug or online/offline events and kdump kernel tries to
> collect the dump with the stale system resource data then this might
> lead to dump collection failure or an inaccurate dump collection.
> 
> The current method to keep the kdump kernel kexec segments up to date is
> by reloading the complete kdump kernel whenever a CPU/Memory hot
> plug/unplug or online/offline event is observed in userspace. Reloading
> the kdump kernel for every CPU/Memory hot plug/unplug or online/offline
> event is inefficient and creates a large window where the kdump service
> is not available. It can be improved by doing in-kernel updates to only
> necessary kdump kernel kexec segments which describe CPU and Memory
> resources of the running kernel to the kdump kernel.
> 
> The kernel changes related to in-kernel updates to the kdump kernel
> kexec segments are kept under the CRASH_HOTPLUG config option.
> 
> Later in the series, a powerpc crash hotplug handler is introduced to
> update the kdump kernel kexec segments on CPU/Memory hotplug events.
> This arch-specific handler is triggered from a generic crash handler
> that registers with the CPU/Memory add/remove notifiers.
> 
> The CRASH_HOTPLUG config option is enabled by default.
> 
> Signed-off-by: Sourabh Jain 
> Reviewed-by: Laurent Dufour 

I can't remember having sent a Reviewed-by on that patch earlier.

Anyway, I can't find any issue with that one, so replace with:
Reviewed-by: Laurent Dufour 

> ---
>  arch/powerpc/Kconfig | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index a6c4407d3ec83..ac0dc0ffe89b4 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -681,6 +681,18 @@ config CRASH_DUMP
> The same kernel binary can be used as production kernel and dump
> capture kernel.
>  
> +config CRASH_HOTPLUG
> + bool "In-kernel update to kdump kernel on system configuration changes"
> + default y
> + depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG)
> + help
> +   Quick and efficient mechanism to update the kdump kernel in the
> +   event of CPU/Memory hot plug/unplug or online/offline events. This
> +   approach does an in-kernel update of only the necessary kexec
> +   segments instead of unloading and reloading the entire kdump kernel
> +   from userspace.
> +
> +   If unsure, say Y.
> +
>  config FA_DUMP
>   bool "Firmware-assisted dump"
>   depends on PPC64 && (PPC_RTAS || PPC_POWERNV)



In-flight collision: DRM_AMD_DC_DCN renaming

2023-04-24 Thread Lukas Bulwahn
Dear Michael, dear Harry, dear Alex,

The commit 4652ae7a51b7 ("drm/amd/display: Rename DCN config to FP")
renames config DRM_AMD_DC_DCN to config DRM_AMD_DC_FP. The concurrent
commit 78f0929884d4 ("powerpc/64: Always build with 128-bit long
double") overrides the renaming change for the select in config
DRM_AMD_DC, and this leads to selecting the non-existent
DRM_AMD_DC_DCN.

It is easy to fix: adjust this one line to the renamed 'select
DRM_AMD_DC_FP' once both commits are merged. For now, I am just
reporting to let you know; it is probably best to get a quick fix-up
patch into -rc2, once all the changes have landed in -rc1.


Best regards,

Lukas


Re: [PATCH 0/3] fbdev: Set missing owner fields in fb_ops

2023-04-24 Thread Helge Deller

On 4/24/23 10:58, Thomas Zimmermann wrote:

Set the owner field of various drivers' fb_ops instance. The
setting is required by fbcon, which acquires a reference on the
fbdev driver's module. Otherwise, users could attempt to unload
the module while it's still in use.

Thomas Zimmermann (3):
   fbdev/68328fb: Init owner field of struct fb_ops
   fbdev/ps3fb: Init owner field of struct fb_ops
   fbdev/vfb: Init owner field of struct fb_ops

  drivers/video/fbdev/68328fb.c | 1 +
  drivers/video/fbdev/ps3fb.c   | 1 +
  drivers/video/fbdev/vfb.c | 1 +
  3 files changed, 3 insertions(+)



Series applied.

Thank you!
Helge


[PATCH 1/3] fbdev/68328fb: Init owner field of struct fb_ops

2023-04-24 Thread Thomas Zimmermann
Initialize the owner field of struct fb_ops. Required to prevent
module unloading while the driver is in use.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/fbdev/68328fb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/video/fbdev/68328fb.c b/drivers/video/fbdev/68328fb.c
index ef49cdfb7dd2..07d6e8dc686b 100644
--- a/drivers/video/fbdev/68328fb.c
+++ b/drivers/video/fbdev/68328fb.c
@@ -94,6 +94,7 @@ static int mc68x328fb_pan_display(struct fb_var_screeninfo 
*var,
 static int mc68x328fb_mmap(struct fb_info *info, struct vm_area_struct *vma);
 
 static const struct fb_ops mc68x328fb_ops = {
+   .owner  = THIS_MODULE,
.fb_check_var   = mc68x328fb_check_var,
.fb_set_par = mc68x328fb_set_par,
.fb_setcolreg   = mc68x328fb_setcolreg,
-- 
2.40.0



[PATCH 3/3] fbdev/vfb: Init owner field of struct fb_ops

2023-04-24 Thread Thomas Zimmermann
Initialize the owner field of struct fb_ops. Required to prevent
module unloading while the driver is in use.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/fbdev/vfb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/video/fbdev/vfb.c b/drivers/video/fbdev/vfb.c
index 680c88267ef4..a0a2009e003e 100644
--- a/drivers/video/fbdev/vfb.c
+++ b/drivers/video/fbdev/vfb.c
@@ -79,6 +79,7 @@ static int vfb_mmap(struct fb_info *info,
struct vm_area_struct *vma);
 
 static const struct fb_ops vfb_ops = {
+   .owner  = THIS_MODULE,
.fb_read= fb_sys_read,
.fb_write   = fb_sys_write,
.fb_check_var   = vfb_check_var,
-- 
2.40.0



[PATCH 2/3] fbdev/ps3fb: Init owner field of struct fb_ops

2023-04-24 Thread Thomas Zimmermann
Initialize the owner field of struct fb_ops. Required to prevent
module unloading while the driver is in use.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/fbdev/ps3fb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/video/fbdev/ps3fb.c b/drivers/video/fbdev/ps3fb.c
index 2fe08b67eda7..98aaa330a193 100644
--- a/drivers/video/fbdev/ps3fb.c
+++ b/drivers/video/fbdev/ps3fb.c
@@ -936,6 +936,7 @@ static irqreturn_t ps3fb_vsync_interrupt(int irq, void *ptr)
 
 
 static const struct fb_ops ps3fb_ops = {
+   .owner  = THIS_MODULE,
.fb_open= ps3fb_open,
.fb_release = ps3fb_release,
.fb_read= fb_sys_read,
-- 
2.40.0



[PATCH 0/3] fbdev: Set missing owner fields in fb_ops

2023-04-24 Thread Thomas Zimmermann
Set the owner field of various drivers' fb_ops instance. The
setting is required by fbcon, which acquires a reference on the
fbdev driver's module. Otherwise, users could attempt to unload
the module while it's still in use.

Thomas Zimmermann (3):
  fbdev/68328fb: Init owner field of struct fb_ops
  fbdev/ps3fb: Init owner field of struct fb_ops
  fbdev/vfb: Init owner field of struct fb_ops

 drivers/video/fbdev/68328fb.c | 1 +
 drivers/video/fbdev/ps3fb.c   | 1 +
 drivers/video/fbdev/vfb.c | 1 +
 3 files changed, 3 insertions(+)

-- 
2.40.0