Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On 04/04/2018 06:28 AM, James Morse wrote:
> We ended up with the check-summing code because its the default behaviour of
> kexec-tools on other architectures.
>
> One alternative is to rip it out for arm64.

Or add arm64 support to kexec-lite:

  https://github.com/antonblanchard/kexec-lite

Or accept my bypass patch:

  http://lists.infradead.org/pipermail/kexec/2015-October/014573.html

-Geoff

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On 04.04.18 16:28, James Morse wrote: Hi Kostiantyn, On 04/04/18 13:45, Kostiantyn Iarmak wrote: From: Pratyush Anand Date: Fri, Jun 2, 2017 at 5:42 PM Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory To: James Morse Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org, ho...@verge.net.au, dyo...@redhat.com, linux-arm-ker...@lists.infradead.org On Friday 02 June 2017 01:53 PM, James Morse wrote: On 23/05/17 06:02, Pratyush Anand wrote: It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image is around 13MB and initramfs is around 30MB. It takes more than 20 second even when we have -O2 optimization enabled. However, if dcache is enabled during purgatory execution then, it takes just a second in SHA verification. Therefore, these patches adds support for dcache enabling facility during purgatory execution. I'm still not convinced we need this. Moving the SHA verification to happen before the dcache+mmu are disabled would also solve the delay problem, Humm..I am not sure, if we can do that. When we leave kernel (and enter into purgatory), icache+dcache+mmu are already disabled. I think, that would be possible when we will be in a position to use in-kernel purgatory. and we can print an error message or fail the syscall. For kexec we don't expect memory corruption, what are we testing for? I can see the use for kdump, but the kdump-kernel is unmapped so the kernel can't accidentally write over it. (we discussed all this last time, but it fizzled-out. If you and the kexec-tools maintainer think its necessary, fine by me!) Yes, there had already been discussion and MAINTAINERs have discouraged none-purgatory implementation. I have some comments on making this code easier to maintain.. Thanks. 
I have implemented your review comments and have archived the code in https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache I will be posting the next version only when someone complains about ARM64 kdump behavior that it is not as fast as x86. On our ARM64-based platform we have very long main kernel-secondary kernel switch time. This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools version), we can get ~25x speedup, with this patch secondary kernel boots in ~3 seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes about 75 seconds. This is slow because its generating a checksum of the kernel without the benefit of the caches. This series generated page tables so that it could enable the MMU and caches. But, the purgatory code also needs to be a simple as possible because its practically impossible to debug. The purgatory code does this checksum-ing because its worried the panic() was because the kernel cause some memory corruption, and that memory corruption may have affected the kdump kernel too. This can't happen on arm64 as we unmap kdump's crash region, so not even the kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash dump kernel memory") has all the details. (we also needed to do this to avoid the risk of mismatched memory attributes if kdump boots and some CPUs are still stuck in the old kernel) When do you plan merge this patch? We ended up with the check-summing code because its the default behaviour of kexec-tools on other architectures. One alternative is to rip it out for arm64. 
Untested:
%<
diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abea..f10c148 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 	-Werror-implicit-function-declaration \
 	-Wdeclaration-after-statement \
 	-Werror=implicit-int \
-	-Werror=strict-prototypes
+	-Werror=strict-prototypes \
+	-DNO_SHA_IN_PURGATORY
 
 arm64_PURGATORY_SRCS += \
 	purgatory/arch/arm64/entry.S \
diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
index 3bbcc09..44e792a 100644
--- a/purgatory/purgatory.c
+++ b/purgatory/purgatory.c
@@ -9,6 +9,8 @@
 struct sha256_region sha256_regions[SHA256_REGIONS] = {};
 sha256_digest_t sha256_digest = { };
 
+#ifndef NO_SHA_IN_PURGATORY
+
 int verify_sha256_digest(void)
 {
 	struct sha256_region *ptr, *end;
@@ -39,14 +41,18 @@ int verify_sha256_digest(void)
 	return 0;
 }
 
+#endif /* NO_SHA_IN_PURGATORY */
+
 void purgatory(void)
 {
 	printf("I'm in purgatory\n");
 	setup_arch();
+#ifndef NO_SHA_IN_PURGATORY
 	if (verify_sha256_digest()) {
 		for(;;) {
 			/* loop forever */
 		}
 	}
+#endif /* NO_SHA_IN_PURGATORY */
 	post_verification_setup_arch();
 }
%<

Thank you, I've tested this patch (no issue
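For readers skimming the thread, the compile-time gating that the untested diff in this message applies can be shown in isolation. This is only a sketch under stated assumptions: the macro name NO_SHA_IN_PURGATORY comes from the diff, but `struct region`, `toy_checksum()` and `verify_regions()` are hypothetical stand-ins, and the additive checksum is not SHA-256 and not the kexec-tools code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sha256_region: one span of a loaded image. */
struct region {
	const uint8_t *start;
	size_t len;
};

#ifndef NO_SHA_IN_PURGATORY
/* Toy additive checksum; NOT SHA-256, used only to keep the sketch short. */
static uint32_t toy_checksum(const struct region *r, size_t count)
{
	uint32_t sum = 0;
	size_t i, j;

	for (i = 0; i < count; i++)
		for (j = 0; j < r[i].len; j++)
			sum += r[i].start[j];
	return sum;
}
#endif /* NO_SHA_IN_PURGATORY */

/*
 * Returns 0 when the regions match 'expected', or unconditionally when the
 * check has been compiled out with -DNO_SHA_IN_PURGATORY.
 */
int verify_regions(const struct region *r, size_t count, uint32_t expected)
{
#ifndef NO_SHA_IN_PURGATORY
	if (toy_checksum(r, count) != expected)
		return 1;
#else
	(void)r;
	(void)count;
	(void)expected;
#endif
	return 0;
}
```

Built without the flag, a mismatch makes verify_regions() return 1; built with -DNO_SHA_IN_PURGATORY, neither the checksum routine nor the comparison is compiled in, which is the effect the diff aims for on arm64.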
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On Wed, Apr 04, 2018 at 02:28:52PM +0100, James Morse wrote: > Hi Kostiantyn, > > On 04/04/18 13:45, Kostiantyn Iarmak wrote: > > From: Pratyush Anand > >> Date: Fri, Jun 2, 2017 at 5:42 PM > >> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory > >> To: James Morse > >> Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org, > >> ho...@verge.net.au, dyo...@redhat.com, > >> linux-arm-ker...@lists.infradead.org > >> > >> On Friday 02 June 2017 01:53 PM, James Morse wrote: > >>> On 23/05/17 06:02, Pratyush Anand wrote: > >>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz > >>>> image > >>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second > >>>> even when we have -O2 optimization enabled. However, if dcache is enabled > >>>> during purgatory execution then, it takes just a second in SHA > >>>> verification. > >>>> > >>>> Therefore, these patches adds support for dcache enabling facility during > >>>> purgatory execution. > > >>> I'm still not convinced we need this. Moving the SHA verification to > >>> happen > >>> before the dcache+mmu are disabled would also solve the delay problem, > >> > >> Humm..I am not sure, if we can do that. > > >> When we leave kernel (and enter into purgatory), icache+dcache+mmu are > >> already disabled. I think, that would be possible when we will be in a > >> position to use in-kernel purgatory. > >> > >>> and we > >>> can print an error message or fail the syscall. > >>> > >>> For kexec we don't expect memory corruption, what are we testing for? > >>> I can see the use for kdump, but the kdump-kernel is unmapped so the > >>> kernel > >>> can't accidentally write over it. > >>> > >>> (we discussed all this last time, but it fizzled-out. If you and the > >>> kexec-tools maintainer think its necessary, fine by me!) > > >> Yes, there had already been discussion and MAINTAINERs have > >> discouraged none-purgatory implementation. 
I don't remember the discussion like this quite well, but anyhow ... > >> > >>> I have some comments on making this code easier to maintain.. > >>> > >> Thanks. > >> > >> I have implemented your review comments and have archived the code in > >> > >> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache > >> > >> I will be posting the next version only when someone complains about > >> ARM64 kdump behavior that it is not as fast as x86. > > > On our ARM64-based platform we have very long main kernel-secondary kernel > > switch time. > > > > This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 > > kexec-tools > > version), we can get ~25x speedup, with this patch secondary kernel boots > > in ~3 > > seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes > > about > > 75 seconds. > > This is slow because its generating a checksum of the kernel without the > benefit > of the caches. This series generated page tables so that it could enable the > MMU > and caches. But, the purgatory code also needs to be a simple as possible > because its practically impossible to debug. Not impossible, but I admit that I occasionally had hard time in debugging. > The purgatory code does this checksum-ing because its worried the panic() was > because the kernel cause some memory corruption, and that memory corruption > may > have affected the kdump kernel too. > > This can't happen on arm64 as we unmap kdump's crash region, so not even the > kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect > crash > dump kernel memory") has all the details. > > (we also needed to do this to avoid the risk of mismatched memory attributes > if > kdump boots and some CPUs are still stuck in the old kernel) > > > > When do you plan merge this patch? > > We ended up with the check-summing code because its the default behaviour of > kexec-tools on other architectures. > > One alternative is to rip it out for arm64. 
Untested:

Thanks for the patch. This eventually eliminates the "raison d'être" of purgatory on arm64, as I do in my kexec_file patch, although it would require a small rework.

-Takahiro AKASHI

> %<
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On 4 April 2018 at 15:28, James Morse wrote: > Hi Kostiantyn, > > On 04/04/18 13:45, Kostiantyn Iarmak wrote: >> From: Pratyush Anand >>> Date: Fri, Jun 2, 2017 at 5:42 PM >>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory >>> To: James Morse >>> Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org, >>> ho...@verge.net.au, dyo...@redhat.com, >>> linux-arm-ker...@lists.infradead.org >>> >>> On Friday 02 June 2017 01:53 PM, James Morse wrote: >>>> On 23/05/17 06:02, Pratyush Anand wrote: >>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image >>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second >>>>> even when we have -O2 optimization enabled. However, if dcache is enabled >>>>> during purgatory execution then, it takes just a second in SHA >>>>> verification. >>>>> >>>>> Therefore, these patches adds support for dcache enabling facility during >>>>> purgatory execution. > >>>> I'm still not convinced we need this. Moving the SHA verification to happen >>>> before the dcache+mmu are disabled would also solve the delay problem, >>> >>> Humm..I am not sure, if we can do that. > >>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are >>> already disabled. I think, that would be possible when we will be in a >>> position to use in-kernel purgatory. >>> >>>> and we >>>> can print an error message or fail the syscall. >>>> >>>> For kexec we don't expect memory corruption, what are we testing for? >>>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel >>>> can't accidentally write over it. >>>> >>>> (we discussed all this last time, but it fizzled-out. If you and the >>>> kexec-tools maintainer think its necessary, fine by me!) > >>> Yes, there had already been discussion and MAINTAINERs have >>> discouraged none-purgatory implementation. >>> >>>> I have some comments on making this code easier to maintain.. >>>> >>> Thanks. 
>>> >>> I have implemented your review comments and have archived the code in >>> >>> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache >>> >>> I will be posting the next version only when someone complains about >>> ARM64 kdump behavior that it is not as fast as x86. > >> On our ARM64-based platform we have very long main kernel-secondary kernel >> switch time. >> >> This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 >> kexec-tools >> version), we can get ~25x speedup, with this patch secondary kernel boots in >> ~3 >> seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes >> about >> 75 seconds. > > This is slow because its generating a checksum of the kernel without the > benefit > of the caches. This series generated page tables so that it could enable the > MMU > and caches. But, the purgatory code also needs to be a simple as possible > because its practically impossible to debug. > > The purgatory code does this checksum-ing because its worried the panic() was > because the kernel cause some memory corruption, and that memory corruption > may > have affected the kdump kernel too. > If this is the only reason, there is no need to use a strong cryptographic hash, and we should be able to recover some performance by switching to CRC32 instead, preferably using the special arm64 instructions (if implemented). But I agree that skipping the checksum calculation altogether is probably the best approach here. > This can't happen on arm64 as we unmap kdump's crash region, so not even the > kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect > crash > dump kernel memory") has all the details. > > (we also needed to do this to avoid the risk of mismatched memory attributes > if > kdump boots and some CPUs are still stuck in the old kernel) > > >> When do you plan merge this patch? 
> > We ended up with the check-summing code because its the default behaviour of > kexec-tools on other architectures. > > One alternative is to rip it out for arm64. Untested: > %< > diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile > index 636abea..f10c148 100644 > --- a/purgatory/arch/arm
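As a rough illustration of the CRC32 alternative raised in this message, here is a minimal table-free CRC-32 in portable C. It is a sketch, not code from kexec-tools: the function name `crc32_bitwise` is hypothetical, and the choice of the reflected polynomial 0xEDB88320 (the zlib/Ethernet CRC-32) is an assumption about which CRC would be used. On arm64 parts that implement the optional CRC extension, the ACLE intrinsics in `<arm_acle.h>` could presumably replace the inner loop, but that is not shown or tested here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
 * Slow, but it needs no lookup table, so the only memory it reads
 * is the input buffer itself. Pass crc = 0 for a fresh checksum, or
 * the previous return value to continue over a further region.
 */
uint32_t crc32_bitwise(uint32_t crc, const uint8_t *buf, size_t len)
{
	size_t i;
	int bit;

	crc = ~crc;
	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++) {
			/* Mask is all ones when the low bit is set, else zero. */
			uint32_t mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (0xEDB88320u & mask);
		}
	}
	return ~crc;
}
```

A CRC of this kind detects accidental corruption only; it gives none of the tamper resistance that the secure-boot argument elsewhere in the thread depends on.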
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi Kostiantyn, On 04/04/18 13:45, Kostiantyn Iarmak wrote: > From: Pratyush Anand >> Date: Fri, Jun 2, 2017 at 5:42 PM >> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory >> To: James Morse >> Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org, >> ho...@verge.net.au, dyo...@redhat.com, >> linux-arm-ker...@lists.infradead.org >> >> On Friday 02 June 2017 01:53 PM, James Morse wrote: >>> On 23/05/17 06:02, Pratyush Anand wrote: >>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image >>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second >>>> even when we have -O2 optimization enabled. However, if dcache is enabled >>>> during purgatory execution then, it takes just a second in SHA >>>> verification. >>>> >>>> Therefore, these patches adds support for dcache enabling facility during >>>> purgatory execution. >>> I'm still not convinced we need this. Moving the SHA verification to happen >>> before the dcache+mmu are disabled would also solve the delay problem, >> >> Humm..I am not sure, if we can do that. >> When we leave kernel (and enter into purgatory), icache+dcache+mmu are >> already disabled. I think, that would be possible when we will be in a >> position to use in-kernel purgatory. >> >>> and we >>> can print an error message or fail the syscall. >>> >>> For kexec we don't expect memory corruption, what are we testing for? >>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel >>> can't accidentally write over it. >>> >>> (we discussed all this last time, but it fizzled-out. If you and the >>> kexec-tools maintainer think its necessary, fine by me!) >> Yes, there had already been discussion and MAINTAINERs have >> discouraged none-purgatory implementation. >> >>> I have some comments on making this code easier to maintain.. >>> >> Thanks. 
>> >> I have implemented your review comments and have archived the code in >> >> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache >> >> I will be posting the next version only when someone complains about >> ARM64 kdump behavior that it is not as fast as x86. > On our ARM64-based platform we have very long main kernel-secondary kernel > switch time. > > This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools > version), we can get ~25x speedup, with this patch secondary kernel boots in > ~3 > seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes > about > 75 seconds. This is slow because its generating a checksum of the kernel without the benefit of the caches. This series generated page tables so that it could enable the MMU and caches. But, the purgatory code also needs to be a simple as possible because its practically impossible to debug. The purgatory code does this checksum-ing because its worried the panic() was because the kernel cause some memory corruption, and that memory corruption may have affected the kdump kernel too. This can't happen on arm64 as we unmap kdump's crash region, so not even the kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash dump kernel memory") has all the details. (we also needed to do this to avoid the risk of mismatched memory attributes if kdump boots and some CPUs are still stuck in the old kernel) > When do you plan merge this patch? We ended up with the check-summing code because its the default behaviour of kexec-tools on other architectures. One alternative is to rip it out for arm64. 
Untested:
%<
diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abea..f10c148 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 	-Werror-implicit-function-declaration \
 	-Wdeclaration-after-statement \
 	-Werror=implicit-int \
-	-Werror=strict-prototypes
+	-Werror=strict-prototypes \
+	-DNO_SHA_IN_PURGATORY
 
 arm64_PURGATORY_SRCS += \
 	purgatory/arch/arm64/entry.S \
diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
index 3bbcc09..44e792a 100644
--- a/purgatory/purgatory.c
+++ b/purgatory/purgatory.c
@@ -9,6 +9,8 @@
 struct sha256_region sha256_regions[SHA256_REGIONS] = {};
 sha256_digest_t sha256_digest = { };
 
+#ifndef NO_SHA_IN_PURGATORY
+
 int verify_sha256_digest(void)
 {
 	struct sha256_region *ptr, *end;
@@ -39,14 +41,18 @@ int verify_sha256_digest(void)
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Unfortunately got delivery failure notification for Pratyush Anand's address (Unknown address), who can help with merging this patch set? On 04.04.18 15:45, Kostiantyn Iarmak wrote: Hi Pratyush, From: Pratyush Anand Date: Fri, Jun 2, 2017 at 5:42 PM Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory To: James Morse Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org, ho...@verge.net.au, dyo...@redhat.com, linux-arm-ker...@lists.infradead.org Hi James, On Friday 02 June 2017 01:53 PM, James Morse wrote: Hi Pratyush, On 23/05/17 06:02, Pratyush Anand wrote: It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image is around 13MB and initramfs is around 30MB. It takes more than 20 second even when we have -O2 optimization enabled. However, if dcache is enabled during purgatory execution then, it takes just a second in SHA verification. Therefore, these patches adds support for dcache enabling facility during purgatory execution. I'm still not convinced we need this. Moving the SHA verification to happen before the dcache+mmu are disabled would also solve the delay problem, Humm..I am not sure, if we can do that. When we leave kernel (and enter into purgatory), icache+dcache+mmu are already disabled. I think, that would be possible when we will be in a position to use in-kernel purgatory. and we can print an error message or fail the syscall. For kexec we don't expect memory corruption, what are we testing for? I can see the use for kdump, but the kdump-kernel is unmapped so the kernel can't accidentally write over it. (we discussed all this last time, but it fizzled-out. If you and the kexec-tools maintainer think its necessary, fine by me!) Yes, there had already been discussion and MAINTAINERs have discouraged none-purgatory implementation. I have some comments on making this code easier to maintain.. Thanks. 
I have implemented your review comments and have archived the code in https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache I will be posting the next version only when someone complains about ARM64 kdump behavior that it is not as fast as x86. On our ARM64-based platform we have very long main kernel-secondary kernel switch time. This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools version), we can get ~25x speedup, with this patch secondary kernel boots in ~3 seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes about 75 seconds. When do you plan merge this patch? I can help you with testing on our platform. Thanks for all your time on this series. That really helped me to understand the arm64 page table in a better way. ~Pratyush ___ linux-arm-kernel mailing list linux-arm-ker...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel -- Best Regards, Kostiantyn (Kostia) Iarmak. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi Pratyush, From: Pratyush Anand Date: Fri, Jun 2, 2017 at 5:42 PM Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory To: James Morse Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org, ho...@verge.net.au, dyo...@redhat.com, linux-arm-ker...@lists.infradead.org Hi James, On Friday 02 June 2017 01:53 PM, James Morse wrote: Hi Pratyush, On 23/05/17 06:02, Pratyush Anand wrote: It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image is around 13MB and initramfs is around 30MB. It takes more than 20 second even when we have -O2 optimization enabled. However, if dcache is enabled during purgatory execution then, it takes just a second in SHA verification. Therefore, these patches adds support for dcache enabling facility during purgatory execution. I'm still not convinced we need this. Moving the SHA verification to happen before the dcache+mmu are disabled would also solve the delay problem, Humm..I am not sure, if we can do that. When we leave kernel (and enter into purgatory), icache+dcache+mmu are already disabled. I think, that would be possible when we will be in a position to use in-kernel purgatory. and we can print an error message or fail the syscall. For kexec we don't expect memory corruption, what are we testing for? I can see the use for kdump, but the kdump-kernel is unmapped so the kernel can't accidentally write over it. (we discussed all this last time, but it fizzled-out. If you and the kexec-tools maintainer think its necessary, fine by me!) Yes, there had already been discussion and MAINTAINERs have discouraged none-purgatory implementation. I have some comments on making this code easier to maintain.. Thanks. I have implemented your review comments and have archived the code in https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache I will be posting the next version only when someone complains about ARM64 kdump behavior that it is not as fast as x86. 
On our ARM64-based platform, switching from the main kernel to the secondary kernel takes a very long time.

This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools 2.0.13): we get a ~25x speedup. With this patch set the secondary kernel boots in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without it the switch takes about 75 seconds.

When do you plan to merge this patch set? I can help you with testing on our platform.

Thanks for all your time on this series. That really helped me to understand the arm64 page table in a better way.

~Pratyush

--
Best Regards,
Kostiantyn Iarmak.
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi Bhupesh,

On 02/06/17 12:15, Bhupesh SHARMA wrote:
> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel wrote:
>> On 2 June 2017 at 08:23, James Morse wrote:
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution then, it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches adds support for dcache enabling facility during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>
>> This is a very good question. SHA-256 is quite a heavy hammer if all
>> you need is CRC style error detection.

Thanks for the history links.

We don't (yet) support KEXEC_FILE or KEXEC_VERIFY_SIG, and arm64 doesn't have an in-kernel purgatory (which looks to be required for kexec_file under secure-boot).

> AFAICR the sha-256 implementation was proposed to boot a signed
> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
> restrictions (see [1]).

The beginning of the kexec-tools git history is 'kexec-tools-1.101' in 2006, it had util_lib/sha256.c. It looks like SecureBoot arrived in 2011 with v2.3.1 of UEFI.

I can see how x86 picked up on this checksum for secure-boot as kexec-tools already did this work, (some of the files under arch/x86/purgatory note their kexec-tools origin), my question is why did it do it in the first place?

If the reason is accidental writes, we mitigate this on arm64 by unmapping the kdump region instead of just marking it read-only.
> As Matthew Garret rightly noted (see[2]), secure Boot, if enabled, is
> explicitly designed to stop you booting modified kernels unless you've
> added your own keys.

> So, CRC wouldn't possibly fulfil the functionality we are trying to
> achieve with SHA-256 in the purgatory.

Is this still true for a purgatory provided by user-space?

Thanks,

James
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi James, On Friday 02 June 2017 01:53 PM, James Morse wrote: Hi Pratyush, On 23/05/17 06:02, Pratyush Anand wrote: It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image is around 13MB and initramfs is around 30MB. It takes more than 20 second even when we have -O2 optimization enabled. However, if dcache is enabled during purgatory execution then, it takes just a second in SHA verification. Therefore, these patches adds support for dcache enabling facility during purgatory execution. I'm still not convinced we need this. Moving the SHA verification to happen before the dcache+mmu are disabled would also solve the delay problem, Humm..I am not sure, if we can do that. When we leave kernel (and enter into purgatory), icache+dcache+mmu are already disabled. I think, that would be possible when we will be in a position to use in-kernel purgatory. and we can print an error message or fail the syscall. For kexec we don't expect memory corruption, what are we testing for? I can see the use for kdump, but the kdump-kernel is unmapped so the kernel can't accidentally write over it. (we discussed all this last time, but it fizzled-out. If you and the kexec-tools maintainer think its necessary, fine by me!) Yes, there had already been discussion and MAINTAINERs have discouraged none-purgatory implementation. I have some comments on making this code easier to maintain.. Thanks. I have implemented your review comments and have archived the code in https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache I will be posting the next version only when someone complains about ARM64 kdump behavior that it is not as fast as x86. Thanks for all your time on this series. That really helped me to understand the arm64 page table in a better way. ~Pratyush ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi Ard,

Thanks for all the inputs.

On Friday 02 June 2017 05:14 PM, Ard Biesheuvel wrote:
>> Alternatively, a SHA-256 implementation that uses movz/movk sequences
>> instead of ldr instructions to load the round constants would already
>> be 5x faster, given that we don't need page tables to enable the
>> I-cache.
>
> Actually, looking at the C code and the objdump of the kernel's
> sha256_generic driver, it is likely that it is already doing this, and
> none of the points I made actually make a lot of sense ...
>
> Pratyush: I assume you are already enabling the I-cache in the purgatory?

It's not enabled yet. I had first tried enabling only the I-cache, but that did not make much difference in execution time.

~Pratyush
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On 2 June 2017 at 11:36, Ard Biesheuvel wrote: > On 2 June 2017 at 11:15, Bhupesh SHARMA wrote: >> Hi Ard, James >> >> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel >> wrote: >>> On 2 June 2017 at 08:23, James Morse wrote: Hi Pratyush, On 23/05/17 06:02, Pratyush Anand wrote: > It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image > is around 13MB and initramfs is around 30MB. It takes more than 20 second > even when we have -O2 optimization enabled. However, if dcache is enabled > during purgatory execution then, it takes just a second in SHA > verification. > > Therefore, these patches adds support for dcache enabling facility during > purgatory execution. I'm still not convinced we need this. Moving the SHA verification to happen before the dcache+mmu are disabled would also solve the delay problem, and we can print an error message or fail the syscall. For kexec we don't expect memory corruption, what are we testing for? >>> >>> This is a very good question. SHA-256 is quite a heavy hammer if all >>> you need is CRC style error detection. Note that SHA-256 uses 256 >>> bytes of round keys, which means that in the absence of a cache, each >>> 64 byte chunk of data processed involves (re)reading 320 bytes from >>> DRAM. That also means you could write a SHA-256 implementation for >>> AArch64 that keeps the round keys in NEON registers instead, and it >>> would probably be a lot faster. >> >> AFAICR the sha-256 implementation was proposed to boot a signed >> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot >> restrictions (see [1]). >> >> As Matthew Garret rightly noted (see[2]), secure Boot, if enabled, is >> explicitly designed to stop you booting modified kernels unless you've >> added your own keys. 
>> >> But if you boot a signed Linux distribution with kexec enabled without >> using the SHA like feature in the purgatory (like, say, Ubuntu) then >> you're able to boot a modified Windows kernel that will still believe >> it was booted securely. >> >> So, CRC wouldn't possibly fulfil the functionality we are trying to >> achieve with SHA-256 in the purgatory. >> > > OK. But it appears that kexec_load_file() generates the hashes, and > the purgatory just double checks them? That means there is wiggle room > in terms of hash implementation, even though a non-cryptographic hash > may be out of the question. > >> However, having seen the power of using the inbuilt CRC instructions >> from the ARM64 ISA on a SoC which supports it, I can vouch that the >> native ISA implementations are much faster than other approaches. >> >> However, using the SHA-256 implementation (as you rightly noted) would >> employ NEON registers and can be faster, however I remember some SoC >> vendors disabling co-processor extensions in their ARM implementations >> in the past, so I am not sure we can assume that NEON extensions would >> be available in all ARMv8 implementations by default. >> > > Alternatively, a SHA-256 implementation that uses movz/movk sequences > instead of ldr instructions to load the round constants would already > be 5x faster, given that we don't need page tables to enable the > I-cache. Actually, looking at the C code and the objdump of the kernel's sha256_generic driver, it is likely that it is already doing this, and none of the points I made actually make a lot of sense ... Pratyush: I assume you are already enabling the I-cache in the purgatory? ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
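The "256 bytes of round keys" and "320 bytes per 64 byte chunk" figures quoted in this message can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope model: the function and constant names are hypothetical, it counts data loads alone (no message schedule, stack traffic or instruction fetch), and it assumes every round constant is re-read from memory for each block, which the follow-up in this same message suggests the compiled code may not actually do.

```c
#include <stddef.h>

enum {
	SHA256_ROUNDS      = 64, /* rounds per block, one 32-bit constant each */
	SHA256_K_BYTES     = 4,  /* size of one round constant in bytes */
	SHA256_BLOCK_BYTES = 64  /* input consumed per compression call */
};

/* Bytes of round constants read per block if none stay in registers or cache. */
size_t sha256_constant_bytes(void)
{
	return (size_t)SHA256_ROUNDS * SHA256_K_BYTES;
}

/* Bytes read per block under this model: constants plus the data itself. */
size_t sha256_uncached_bytes_per_block(void)
{
	return sha256_constant_bytes() + SHA256_BLOCK_BYTES;
}

/* Modelled reads for 'image_bytes' of input (whole blocks only, padding ignored). */
unsigned long long sha256_uncached_bytes_total(unsigned long long image_bytes)
{
	return (image_bytes / SHA256_BLOCK_BYTES) *
	       sha256_uncached_bytes_per_block();
}
```

Under these assumptions the hash reads five times the image size, so the roughly 43MB of kernel plus initramfs mentioned at the start of the thread would cost about 215MB of uncached reads.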
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On 2 June 2017 at 11:15, Bhupesh SHARMA wrote:
> Hi Ard, James
>
> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel wrote:
>> On 2 June 2017 at 08:23, James Morse wrote:
>>> Hi Pratyush,
>>>
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution then, it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches add support for dcache enabling facility during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem, and
>>> we can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>
>> This is a very good question. SHA-256 is quite a heavy hammer if all
>> you need is CRC-style error detection. Note that SHA-256 uses 256
>> bytes of round keys, which means that in the absence of a cache, each
>> 64-byte chunk of data processed involves (re)reading 320 bytes from
>> DRAM. That also means you could write a SHA-256 implementation for
>> AArch64 that keeps the round keys in NEON registers instead, and it
>> would probably be a lot faster.
>
> AFAICR the SHA-256 implementation was proposed to boot a signed
> kexec/kdump kernel, to prevent kexec from violating UEFI Secure Boot
> restrictions (see [1]).
>
> As Matthew Garrett rightly noted (see [2]), Secure Boot, if enabled, is
> explicitly designed to stop you booting modified kernels unless you've
> added your own keys.
>
> But if you boot a signed Linux distribution with kexec enabled without
> using the SHA-like feature in the purgatory (like, say, Ubuntu) then
> you're able to boot a modified Windows kernel that will still believe
> it was booted securely.
>
> So, a CRC couldn't fulfil the functionality we are trying to
> achieve with SHA-256 in the purgatory.
>

OK. But it appears that kexec_load_file() generates the hashes, and
the purgatory just double-checks them? That means there is wiggle room
in terms of hash implementation, even though a non-cryptographic hash
may be out of the question.

> That said, having seen the power of using the inbuilt CRC instructions
> from the ARM64 ISA on a SoC which supports them, I can vouch that the
> native ISA implementations are much faster than other approaches.
>
> A SHA-256 implementation that employs NEON registers (as you rightly
> noted) can be faster. However, I remember some SoC vendors disabling
> co-processor extensions in their ARM implementations in the past, so
> I am not sure we can assume that NEON extensions would be available
> in all ARMv8 implementations by default.
>

Alternatively, a SHA-256 implementation that uses movz/movk sequences
instead of ldr instructions to load the round constants would already
be 5x faster, given that we don't need page tables to enable the
I-cache.
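To make the movz/movk suggestion concrete: a 32-bit round constant can be
built from two 16-bit immediates carried in the instruction stream, so no
data-side load is needed. The following is only a sketch of that encoding
idea, not code from kexec-tools or the kernel; the helper names
(split_imm16, join_imm16, format_movz_movk) are hypothetical, and the
constant used in the note below is the first SHA-256 round constant.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Low and high 16-bit halves of a 32-bit constant. */
struct imm_pair {
    uint16_t lo;
    uint16_t hi;
};

/* Split a constant into the two immediates a movz/movk pair would carry. */
static struct imm_pair split_imm16(uint32_t k)
{
    struct imm_pair p = { (uint16_t)(k & 0xffffu), (uint16_t)(k >> 16) };
    return p;
}

/* Rebuild the constant the way the register would end up after both moves. */
static uint32_t join_imm16(struct imm_pair p)
{
    return (uint32_t)p.lo | ((uint32_t)p.hi << 16);
}

/* Write the two-instruction A64 sequence for Wreg into buf.
 * Returns the snprintf() result (characters needed, excluding the NUL). */
static int format_movz_movk(char *buf, size_t len, unsigned reg, uint32_t k)
{
    struct imm_pair p = split_imm16(k);

    assert(buf != NULL && reg <= 30);
    return snprintf(buf, len,
                    "movz w%u, #0x%04x\n"
                    "movk w%u, #0x%04x, lsl #16\n",
                    reg, (unsigned)p.lo, reg, (unsigned)p.hi);
}
```

For 0x428a2f98 this yields the halves 0x2f98 and 0x428a. Whether a given
compiler already emits such sequences for an unrolled SHA-256 is something
to confirm with objdump rather than assume.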
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi Ard, James

On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel wrote:
> On 2 June 2017 at 08:23, James Morse wrote:
>> Hi Pratyush,
>>
>> On 23/05/17 06:02, Pratyush Anand wrote:
>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>> during purgatory execution then, it takes just a second in SHA
>>> verification.
>>>
>>> Therefore, these patches add support for dcache enabling facility during
>>> purgatory execution.
>>
>> I'm still not convinced we need this. Moving the SHA verification to happen
>> before the dcache+mmu are disabled would also solve the delay problem, and we
>> can print an error message or fail the syscall.
>>
>> For kexec we don't expect memory corruption, what are we testing for?
>
> This is a very good question. SHA-256 is quite a heavy hammer if all
> you need is CRC-style error detection. Note that SHA-256 uses 256
> bytes of round keys, which means that in the absence of a cache, each
> 64-byte chunk of data processed involves (re)reading 320 bytes from
> DRAM. That also means you could write a SHA-256 implementation for
> AArch64 that keeps the round keys in NEON registers instead, and it
> would probably be a lot faster.

AFAICR the SHA-256 implementation was proposed to boot a signed
kexec/kdump kernel, to prevent kexec from violating UEFI Secure Boot
restrictions (see [1]).

As Matthew Garrett rightly noted (see [2]), Secure Boot, if enabled, is
explicitly designed to stop you booting modified kernels unless you've
added your own keys.

But if you boot a signed Linux distribution with kexec enabled without
using the SHA-like feature in the purgatory (like, say, Ubuntu) then
you're able to boot a modified Windows kernel that will still believe
it was booted securely.

So, a CRC couldn't fulfil the functionality we are trying to
achieve with SHA-256 in the purgatory.

That said, having seen the power of using the inbuilt CRC instructions
from the ARM64 ISA on a SoC which supports them, I can vouch that the
native ISA implementations are much faster than other approaches.

A SHA-256 implementation that employs NEON registers (as you rightly
noted) can be faster. However, I remember some SoC vendors disabling
co-processor extensions in their ARM implementations in the past, so
I am not sure we can assume that NEON extensions would be available
in all ARMv8 implementations by default.

[1] https://lwn.net/Articles/603116/
[2] http://mjg59.dreamwidth.org/28746.html

Regards,
Bhupesh

>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>> can't accidentally write over it.
>>
>> (we discussed all this last time, but it fizzled out. If you and the
>> kexec-tools maintainer think it's necessary, fine by me!)
>>
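One way to see why a CRC detects accidental corruption but cannot stand in
for a cryptographic hash: CRC-32 is affine over XOR, so for equal-length
buffers the checksum of a ^ b ^ c equals the XOR of the three checksums.
That structure is what lets data be adjusted to hit a chosen CRC. The
sketch below uses a plain bitwise software CRC-32 (the common reflected
polynomial 0xEDB88320), not the ARMv8 CRC instructions mentioned above;
the function names are hypothetical and the 64-byte scratch limit is an
arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320, init and final XOR ~0. */
static uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 if crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c) for three
 * buffers of the same length (at most 64 bytes in this sketch). */
static int crc32_xor_property(const uint8_t *a, const uint8_t *b,
                              const uint8_t *c, size_t len)
{
    uint8_t x[64];

    assert(len <= sizeof x);
    for (size_t i = 0; i < len; i++)
        x[i] = a[i] ^ b[i] ^ c[i];

    return crc32_bitwise(x, len) ==
           (crc32_bitwise(a, len) ^ crc32_bitwise(b, len) ^
            crc32_bitwise(c, len));
}
```

The property holds for any three equal-length inputs, which is exactly the
kind of predictable relationship SHA-256 is designed not to have.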
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
On 2 June 2017 at 08:23, James Morse wrote:
> Hi Pratyush,
>
> On 23/05/17 06:02, Pratyush Anand wrote:
>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA
>> verification.
>>
>> Therefore, these patches add support for dcache enabling facility during
>> purgatory execution.
>
> I'm still not convinced we need this. Moving the SHA verification to happen
> before the dcache+mmu are disabled would also solve the delay problem, and we
> can print an error message or fail the syscall.
>
> For kexec we don't expect memory corruption, what are we testing for?

This is a very good question. SHA-256 is quite a heavy hammer if all
you need is CRC-style error detection. Note that SHA-256 uses 256
bytes of round keys, which means that in the absence of a cache, each
64-byte chunk of data processed involves (re)reading 320 bytes from
DRAM. That also means you could write a SHA-256 implementation for
AArch64 that keeps the round keys in NEON registers instead, and it
would probably be a lot faster.

> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
> can't accidentally write over it.
>
> (we discussed all this last time, but it fizzled out. If you and the
> kexec-tools maintainer think it's necessary, fine by me!)
>
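The 320-byte figure follows from simple arithmetic: SHA-256 has 64 round
constants of 4 bytes each (256 bytes), and each compression step consumes
one 64-byte block of input. A minimal sketch of that estimate in C,
assuming the constant table is re-read from DRAM for every block; the
function names are hypothetical, and padding of the final block is only
approximated by rounding up.

```c
#include <assert.h>
#include <stdint.h>

#define SHA256_ROUNDS      64u  /* one 32-bit round constant per round */
#define SHA256_BLOCK_BYTES 64u  /* input consumed per compression step */

/* Bytes read per block when nothing is cached: data block + constant table. */
static uint64_t uncached_bytes_per_block(void)
{
    uint64_t table_bytes = SHA256_ROUNDS * sizeof(uint32_t); /* 256 */

    return SHA256_BLOCK_BYTES + table_bytes;                 /* 320 */
}

/* Rough DRAM read traffic for hashing payload_bytes with no D-cache. */
static uint64_t uncached_bytes_total(uint64_t payload_bytes)
{
    uint64_t blocks =
        (payload_bytes + SHA256_BLOCK_BYTES - 1) / SHA256_BLOCK_BYTES;

    /* Guard against overflow in the multiplication below. */
    assert(blocks <= UINT64_MAX / uncached_bytes_per_block());
    return blocks * uncached_bytes_per_block();
}
```

Treating the quoted image sizes as 13 MiB plus 30 MiB (an assumption about
the units), the 43 MiB payload is 704512 blocks and about 225443840 bytes
read, five times the payload size. This counts only the input and the
constant table, not stack traffic for the message schedule.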
Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:
> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
> even when we have -O2 optimization enabled. However, if dcache is enabled
> during purgatory execution then, it takes just a second in SHA
> verification.
>
> Therefore, these patches add support for dcache enabling facility during
> purgatory execution.

I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem, and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled out. If you and the
kexec-tools maintainer think it's necessary, fine by me!)

I have some comments on making this code easier to maintain..

Thanks,

James