Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-11 Thread Geoff Levand
On 04/04/2018 06:28 AM, James Morse wrote:
> We ended up with the check-summing code because it's the default behaviour of
> kexec-tools on other architectures.
> 
> One alternative is to rip it out for arm64.

Or add arm64 support to kexec-lite:

  https://github.com/antonblanchard/kexec-lite

Or accept my bypass patch:

  http://lists.infradead.org/pipermail/kexec/2015-October/014573.html

-Geoff

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-06 Thread Kostiantyn Iarmak


On 04.04.18 16:28, James Morse wrote:

Hi Kostiantyn,

On 04/04/18 13:45, Kostiantyn Iarmak wrote:

From: Pratyush Anand 

Date: Fri, Jun 2, 2017 at 5:42 PM
Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
To: James Morse 
Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org,
ho...@verge.net.au, dyo...@redhat.com,
linux-arm-ker...@lists.infradead.org

On Friday 02 June 2017 01:53 PM, James Morse wrote:

On 23/05/17 06:02, Pratyush Anand wrote:

It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
even when we have -O2 optimization enabled. However, if dcache is enabled
during purgatory execution, then it takes just a second in SHA
verification.

Therefore, these patches add support for a dcache enabling facility during
purgatory execution.

I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem,

Hmm.. I am not sure if we can do that.
When we leave kernel (and enter into purgatory), icache+dcache+mmu are
already disabled. I think, that would be possible when we will be in a
position to use in-kernel purgatory.


and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled-out. If you and the
   kexec-tools maintainer think it's necessary, fine by me!)

Yes, there had already been discussion and MAINTAINERs have
discouraged a non-purgatory implementation.


I have some comments on making this code easier to maintain..


Thanks.

I have implemented your review comments and have archived the code in

https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache

I will be posting the next version only when someone complains about
ARM64 kdump behavior that it is not as fast as x86.

On our ARM64-based platform, switching from the main kernel to the secondary
kernel takes a very long time.

This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools
2.0.13) and gives a ~25x speedup: with this patch the secondary kernel boots
in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without this patch the
switch takes about 75 seconds.

This is slow because it's generating a checksum of the kernel without the benefit
of the caches. This series generated page tables so that it could enable the MMU
and caches. But, the purgatory code also needs to be as simple as possible
because it's practically impossible to debug.

The purgatory code does this checksum-ing because it's worried the panic() was
caused by the kernel corrupting some memory, and that memory corruption may
have affected the kdump kernel too.

This can't happen on arm64 as we unmap kdump's crash region, so not even the
kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash
dump kernel memory") has all the details.

(we also needed to do this to avoid the risk of mismatched memory attributes if
kdump boots and some CPUs are still stuck in the old kernel)



When do you plan to merge this patch?

We ended up with the check-summing code because it's the default behaviour of
kexec-tools on other architectures.

One alternative is to rip it out for arm64. Untested:
%<
diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abea..f10c148 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 	-Werror-implicit-function-declaration \
 	-Wdeclaration-after-statement \
 	-Werror=implicit-int \
-	-Werror=strict-prototypes
+	-Werror=strict-prototypes \
+	-DNO_SHA_IN_PURGATORY
 
 arm64_PURGATORY_SRCS += \
 	purgatory/arch/arm64/entry.S \
diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
index 3bbcc09..44e792a 100644
--- a/purgatory/purgatory.c
+++ b/purgatory/purgatory.c
@@ -9,6 +9,8 @@
 struct sha256_region sha256_regions[SHA256_REGIONS] = {};
 sha256_digest_t sha256_digest = { };
 
+#ifndef NO_SHA_IN_PURGATORY
+
 int verify_sha256_digest(void)
 {
 	struct sha256_region *ptr, *end;
@@ -39,14 +41,18 @@ int verify_sha256_digest(void)
 	return 0;
 }
 
+#endif /* NO_SHA_IN_PURGATORY */
+
 void purgatory(void)
 {
 	printf("I'm in purgatory\n");
 	setup_arch();
+#ifndef NO_SHA_IN_PURGATORY
 	if (verify_sha256_digest()) {
 		for(;;) {
 			/* loop forever */
 		}
 	}
+#endif /* NO_SHA_IN_PURGATORY */
 	post_verification_setup_arch();
 }
%<


Thank you, I've tested this patch (no issue

Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-04 Thread AKASHI Takahiro
On Wed, Apr 04, 2018 at 02:28:52PM +0100, James Morse wrote:
> Hi Kostiantyn,
> 
> On 04/04/18 13:45, Kostiantyn Iarmak wrote:
> > From: Pratyush Anand 
> >> Date: Fri, Jun 2, 2017 at 5:42 PM
> >> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
> >> To: James Morse 
> >> Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org,
> >> ho...@verge.net.au, dyo...@redhat.com,
> >> linux-arm-ker...@lists.infradead.org
> >>
> >> On Friday 02 June 2017 01:53 PM, James Morse wrote:
> >>> On 23/05/17 06:02, Pratyush Anand wrote:
> >>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
> >>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
> >>>> even when we have -O2 optimization enabled. However, if dcache is enabled
> >>>> during purgatory execution, then it takes just a second in SHA
> >>>> verification.
> >>>>
> >>>> Therefore, these patches add support for a dcache enabling facility during
> >>>> purgatory execution.
> 
> >>> I'm still not convinced we need this. Moving the SHA verification to happen
> >>> before the dcache+mmu are disabled would also solve the delay problem,
> >>
> >> Hmm.. I am not sure if we can do that.
> 
> >> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
> >> already disabled. I think, that would be possible when we will be in a
> >> position to use in-kernel purgatory.
> >>
> >>> and we
> >>> can print an error message or fail the syscall.
> >>>
> >>> For kexec we don't expect memory corruption, what are we testing for?
> >>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
> >>> can't accidentally write over it.
> >>>
> >>> (we discussed all this last time, but it fizzled-out. If you and the
> >>>   kexec-tools maintainer think it's necessary, fine by me!)
> 
> >> Yes, there had already been discussion and MAINTAINERs have
> >> discouraged a non-purgatory implementation.

I don't remember this discussion very well, but anyhow ...

> >>
> >>> I have some comments on making this code easier to maintain..
> >>>
> >> Thanks.
> >>
> >> I have implemented your review comments and have archived the code in
> >>
> >> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
> >>
> >> I will be posting the next version only when someone complains about
> >> ARM64 kdump behavior that it is not as fast as x86.
> 
> > On our ARM64-based platform, switching from the main kernel to the secondary
> > kernel takes a very long time.
> > 
> > This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools
> > 2.0.13) and gives a ~25x speedup: with this patch the secondary kernel boots
> > in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without this patch the
> > switch takes about 75 seconds.
> 
> This is slow because it's generating a checksum of the kernel without the
> benefit of the caches. This series generated page tables so that it could
> enable the MMU and caches. But, the purgatory code also needs to be as simple
> as possible because it's practically impossible to debug.

Not impossible, but I admit that I occasionally had a hard time debugging it.

> The purgatory code does this checksum-ing because it's worried the panic() was
> caused by the kernel corrupting some memory, and that memory corruption may
> have affected the kdump kernel too.
> 
> This can't happen on arm64 as we unmap kdump's crash region, so not even the
> kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect
> crash dump kernel memory") has all the details.
> 
> (we also needed to do this to avoid the risk of mismatched memory attributes
> if kdump boots and some CPUs are still stuck in the old kernel)
> 
> 
> > When do you plan to merge this patch?
> 
> We ended up with the check-summing code because it's the default behaviour of
> kexec-tools on other architectures.
> 
> One alternative is to rip it out for arm64. Untested:

Thanks for the patch. This eventually eliminates the "raison d'être" of
purgatory on arm64, as I do in my kexec_file patch, although it would
require a small re-work.

-Takahiro AKASHI


> %<

Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-04 Thread Ard Biesheuvel
On 4 April 2018 at 15:28, James Morse  wrote:
> Hi Kostiantyn,
>
> On 04/04/18 13:45, Kostiantyn Iarmak wrote:
>> From: Pratyush Anand 
>>> Date: Fri, Jun 2, 2017 at 5:42 PM
>>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
>>> To: James Morse 
>>> Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org,
>>> ho...@verge.net.au, dyo...@redhat.com,
>>> linux-arm-ker...@lists.infradead.org
>>>
>>> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>>> during purgatory execution, then it takes just a second in SHA
>>>>> verification.
>>>>>
>>>>> Therefore, these patches add support for a dcache enabling facility during
>>>>> purgatory execution.
>
>>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>>> before the dcache+mmu are disabled would also solve the delay problem,
>>>
>>> Hmm.. I am not sure if we can do that.
>
>>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
>>> already disabled. I think, that would be possible when we will be in a
>>> position to use in-kernel purgatory.
>>>
>>>> and we
>>>> can print an error message or fail the syscall.
>>>>
>>>> For kexec we don't expect memory corruption, what are we testing for?
>>>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>>>> can't accidentally write over it.
>>>>
>>>> (we discussed all this last time, but it fizzled-out. If you and the
>>>>   kexec-tools maintainer think it's necessary, fine by me!)
>
>>> Yes, there had already been discussion and MAINTAINERs have
>>> discouraged a non-purgatory implementation.
>>>
>>>> I have some comments on making this code easier to maintain..
>>>>
>>> Thanks.
>>>
>>> I have implemented your review comments and have archived the code in
>>>
>>> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
>>>
>>> I will be posting the next version only when someone complains about
>>> ARM64 kdump behavior that it is not as fast as x86.
>
>> On our ARM64-based platform, switching from the main kernel to the secondary
>> kernel takes a very long time.
>>
>> This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools
>> 2.0.13) and gives a ~25x speedup: with this patch the secondary kernel boots
>> in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without this patch the
>> switch takes about 75 seconds.
>
> This is slow because it's generating a checksum of the kernel without the
> benefit of the caches. This series generated page tables so that it could
> enable the MMU and caches. But, the purgatory code also needs to be as simple
> as possible because it's practically impossible to debug.
>
> The purgatory code does this checksum-ing because it's worried the panic() was
> caused by the kernel corrupting some memory, and that memory corruption may
> have affected the kdump kernel too.
>

If this is the only reason, there is no need to use a strong
cryptographic hash, and we should be able to recover some performance
by switching to CRC32 instead, preferably using the special arm64
instructions (if implemented).

But I agree that skipping the checksum calculation altogether is
probably the best approach here.
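
For reference, a minimal portable sketch of the kind of CRC32 check suggested above. This is not code from kexec-tools: crc32_update() is a hypothetical helper name, and the sketch uses the plain bitwise algorithm (reflected polynomial 0xEDB88320, the same CRC-32 variant zlib computes) rather than the optional arm64 CRC32 instructions, so it says nothing about the speed those instructions would give.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected polynomial 0xEDB88320).
 * Hypothetical helper, not part of kexec-tools.
 * Pass crc = 0 for the first buffer; pass the previous return value
 * to continue the checksum over a further buffer.
 */
static uint32_t crc32_update(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++) {
			/* Shift out one bit; fold in the polynomial if it was set. */
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
		}
	}
	return ~crc;
}
```

Because the state is carried through the return value, several memory regions can be checksummed one after another, which is the access pattern the purgatory check would need. A CRC only detects accidental corruption; it offers no protection against deliberate modification.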


> This can't happen on arm64 as we unmap kdump's crash region, so not even the
> kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect
> crash dump kernel memory") has all the details.
>
> (we also needed to do this to avoid the risk of mismatched memory attributes
> if kdump boots and some CPUs are still stuck in the old kernel)
>
>
>> When do you plan to merge this patch?
>
> We ended up with the check-summing code because it's the default behaviour of
> kexec-tools on other architectures.
>
> One alternative is to rip it out for arm64. Untested:
> %<
> diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
> index 636abea..f10c148 100644
> --- a/purgatory/arch/arm

Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-04 Thread James Morse
Hi Kostiantyn,

On 04/04/18 13:45, Kostiantyn Iarmak wrote:
> From: Pratyush Anand 
>> Date: Fri, Jun 2, 2017 at 5:42 PM
>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
>> To: James Morse 
>> Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org,
>> ho...@verge.net.au, dyo...@redhat.com,
>> linux-arm-ker...@lists.infradead.org
>>
>> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution, then it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches add support for a dcache enabling facility during
>>>> purgatory execution.

>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem,
>>
>> Hmm.. I am not sure if we can do that.

>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
>> already disabled. I think, that would be possible when we will be in a
>> position to use in-kernel purgatory.
>>
>>> and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>>> can't accidentally write over it.
>>>
>>> (we discussed all this last time, but it fizzled-out. If you and the
>>>   kexec-tools maintainer think it's necessary, fine by me!)

>> Yes, there had already been discussion and MAINTAINERs have
>> discouraged a non-purgatory implementation.
>>
>>> I have some comments on making this code easier to maintain..
>>>
>> Thanks.
>>
>> I have implemented your review comments and have archived the code in
>>
>> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
>>
>> I will be posting the next version only when someone complains about
>> ARM64 kdump behavior that it is not as fast as x86.

> On our ARM64-based platform, switching from the main kernel to the secondary
> kernel takes a very long time.
> 
> This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools
> 2.0.13) and gives a ~25x speedup: with this patch the secondary kernel boots
> in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without this patch the
> switch takes about 75 seconds.

This is slow because it's generating a checksum of the kernel without the benefit
of the caches. This series generated page tables so that it could enable the MMU
and caches. But, the purgatory code also needs to be as simple as possible
because it's practically impossible to debug.

The purgatory code does this checksum-ing because it's worried the panic() was
caused by the kernel corrupting some memory, and that memory corruption may
have affected the kdump kernel too.

This can't happen on arm64 as we unmap kdump's crash region, so not even the
kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash
dump kernel memory") has all the details.

(we also needed to do this to avoid the risk of mismatched memory attributes if
kdump boots and some CPUs are still stuck in the old kernel)
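
To make the check being discussed concrete, here is a rough sketch of its general shape: walk a table of loaded regions, hash them, and compare the result with a digest recorded when the image was loaded. All names here (struct region, fnv1a_update, digest_regions, verify_regions) are hypothetical, and FNV-1a is only a toy stand-in for SHA-256 to keep the sketch short. The real purgatory works on physical addresses and runs with the caches off, which is where the time goes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a loaded segment (kernel, initramfs, ...). */
struct region {
	const unsigned char *start;
	size_t len;
};

#define FNV1A_INIT  0xcbf29ce484222325ULL
#define FNV1A_PRIME 0x100000001b3ULL

/* FNV-1a, 64-bit: toy replacement for SHA-256, for illustration only. */
static uint64_t fnv1a_update(uint64_t h, const unsigned char *p, size_t len)
{
	while (len--) {
		h ^= *p++;
		h *= FNV1A_PRIME;
	}
	return h;
}

/* Hash every region in order, as if they were one contiguous stream. */
static uint64_t digest_regions(const struct region *r, size_t n)
{
	uint64_t h = FNV1A_INIT;

	for (size_t i = 0; i < n; i++)
		h = fnv1a_update(h, r[i].start, r[i].len);
	return h;
}

/*
 * Returns 0 when the regions still match the digest recorded at load
 * time and 1 otherwise, mirroring the "non-zero means stop" convention
 * used by the if (verify_sha256_digest()) test in the patch below.
 */
static int verify_regions(const struct region *r, size_t n, uint64_t expected)
{
	return digest_regions(r, n) != expected;
}
```

The digest would be computed once at load time and stored next to the region table; the verification pass then has to touch every byte of every region again, so its cost scales with image size.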


> When do you plan to merge this patch?

We ended up with the check-summing code because it's the default behaviour of
kexec-tools on other architectures.

One alternative is to rip it out for arm64. Untested:
%<
diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abea..f10c148 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 	-Werror-implicit-function-declaration \
 	-Wdeclaration-after-statement \
 	-Werror=implicit-int \
-	-Werror=strict-prototypes
+	-Werror=strict-prototypes \
+	-DNO_SHA_IN_PURGATORY
 
 arm64_PURGATORY_SRCS += \
 	purgatory/arch/arm64/entry.S \
diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
index 3bbcc09..44e792a 100644
--- a/purgatory/purgatory.c
+++ b/purgatory/purgatory.c
@@ -9,6 +9,8 @@
 struct sha256_region sha256_regions[SHA256_REGIONS] = {};
 sha256_digest_t sha256_digest = { };
 
+#ifndef NO_SHA_IN_PURGATORY
+
 int verify_sha256_digest(void)
 {
 	struct sha256_region *ptr, *end;
@@ -39,14 +41,18 @@ int verify_sha256_digest(void)

Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-04 Thread Kostiantyn Iarmak
Unfortunately I got a delivery failure notification for Pratyush Anand's
address (unknown address).

Who can help with merging this patch set?


On 04.04.18 15:45, Kostiantyn Iarmak wrote:

Hi Pratyush,

From: Pratyush Anand 

Date: Fri, Jun 2, 2017 at 5:42 PM
Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

To: James Morse 
Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org,
ho...@verge.net.au, dyo...@redhat.com,
linux-arm-ker...@lists.infradead.org


Hi James,

On Friday 02 June 2017 01:53 PM, James Morse wrote:

Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:
It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
even when we have -O2 optimization enabled. However, if dcache is enabled
during purgatory execution, then it takes just a second in SHA
verification.

Therefore, these patches add support for a dcache enabling facility during
purgatory execution.


I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem,


Hmm.. I am not sure if we can do that.

When we leave kernel (and enter into purgatory), icache+dcache+mmu are
already disabled. I think, that would be possible when we will be in a
position to use in-kernel purgatory.


and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled-out. If you and the
  kexec-tools maintainer think it's necessary, fine by me!)


Yes, there had already been discussion and MAINTAINERs have
discouraged a non-purgatory implementation.


I have some comments on making this code easier to maintain..


Thanks.

I have implemented your review comments and have archived the code in

https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache


I will be posting the next version only when someone complains about
ARM64 kdump behavior that it is not as fast as x86.

On our ARM64-based platform, switching from the main kernel to the secondary
kernel takes a very long time.

This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools
2.0.13) and gives a ~25x speedup: with this patch the secondary kernel boots
in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without this patch the
switch takes about 75 seconds.


When do you plan to merge this patch?

I can help you with testing on our platform.


Thanks for all your time on this series. That really helped me to
understand the arm64 page table in a better way.

~Pratyush




___
linux-arm-kernel mailing list
linux-arm-ker...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel




--
Best Regards,

  Kostiantyn (Kostia) Iarmak.




Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2018-04-04 Thread Kostiantyn Iarmak

Hi Pratyush,

From: Pratyush Anand 

Date: Fri, Jun 2, 2017 at 5:42 PM
Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
To: James Morse 
Cc: mark.rutl...@arm.com, b...@redhat.com, kexec@lists.infradead.org,
ho...@verge.net.au, dyo...@redhat.com,
linux-arm-ker...@lists.infradead.org


Hi James,

On Friday 02 June 2017 01:53 PM, James Morse wrote:

Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:

It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
even when we have -O2 optimization enabled. However, if dcache is enabled
during purgatory execution, then it takes just a second in SHA
verification.

Therefore, these patches add support for a dcache enabling facility during
purgatory execution.


I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem,


Hmm.. I am not sure if we can do that.

When we leave kernel (and enter into purgatory), icache+dcache+mmu are
already disabled. I think, that would be possible when we will be in a
position to use in-kernel purgatory.


and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled-out. If you and the
  kexec-tools maintainer think it's necessary, fine by me!)


Yes, there had already been discussion and MAINTAINERs have
discouraged a non-purgatory implementation.


I have some comments on making this code easier to maintain..


Thanks.

I have implemented your review comments and have archived the code in

https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache

I will be posting the next version only when someone complains about
ARM64 kdump behavior that it is not as fast as x86.

On our ARM64-based platform, switching from the main kernel to the secondary
kernel takes a very long time.

This patch set fixes the issue (we are using a 4.4 kernel and kexec-tools
2.0.13) and gives a ~25x speedup: with this patch the secondary kernel boots
in ~3 seconds, while with kexec-tools 2.0.13-2.0.16 without this patch the
switch takes about 75 seconds.


When do you plan to merge this patch?

I can help you with testing on our platform.


Thanks for all your time on this series. That really helped me to
understand the arm64 page table in a better way.

~Pratyush






--
Best Regards,

  Kostiantyn Iarmak.




Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread James Morse
Hi Bhupesh,

On 02/06/17 12:15, Bhupesh SHARMA wrote:
> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
>  wrote:
>> On 2 June 2017 at 08:23, James Morse  wrote:
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution, then it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches add support for a dcache enabling facility during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>
>> This is a very good question. SHA-256 is quite a heavy hammer if all
>> you need is CRC style error detection.

Thanks for the history links.

We don't (yet) support KEXEC_FILE or KEXEC_VERIFY_SIG, and arm64 doesn't have an
in-kernel purgatory (which looks to be required for kexec_file under secure-boot).


> AFAICR the sha-256 implementation was proposed to boot a signed
> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
> restrictions (see [1]).

The beginning of the kexec-tools git history is 'kexec-tools-1.101' in 2006, and
it already had util_lib/sha256.c. It looks like SecureBoot arrived in 2011 with
v2.3.1 of UEFI.
I can see how x86 picked up on this checksum for secure-boot, as kexec-tools
already did this work (some of the files under arch/x86/purgatory note their
kexec-tools origin). My question is: why did it do it in the first place?
If the reason is accidental writes, we mitigate this on arm64 by unmapping the
kdump region instead of just marking it read-only.


> As Matthew Garrett rightly noted (see [2]), Secure Boot, if enabled, is
> explicitly designed to stop you booting modified kernels unless you've
> added your own keys.

> So, CRC wouldn't possibly fulfil the functionality we are trying to
> achieve with SHA-256 in the purgatory.

Is this still true for a purgatory provided by user-space?


Thanks,

James




Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread Pratyush Anand

Hi James,

On Friday 02 June 2017 01:53 PM, James Morse wrote:

Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:

It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
even when we have -O2 optimization enabled. However, if dcache is enabled
during purgatory execution, then it takes just a second in SHA
verification.

Therefore, these patches add support for a dcache enabling facility during
purgatory execution.


I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem,


Hmm.. I am not sure if we can do that.

When we leave the kernel (and enter purgatory), icache+dcache+mmu are already
disabled. I think that would be possible once we are in a position to use an
in-kernel purgatory.



and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled-out. If you and the
 kexec-tools maintainer think it's necessary, fine by me!)


Yes, there had already been discussion and MAINTAINERs have discouraged
a non-purgatory implementation.




I have some comments on making this code easier to maintain..



Thanks.

I have implemented your review comments and have archived the code in

https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache

I will be posting the next version only when someone complains about ARM64 
kdump behavior that it is not as fast as x86.


Thanks for all your time on this series. That really helped me to understand 
the arm64 page table in a better way.


~Pratyush






Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread Pratyush Anand

Hi Ard,

Thanks for all the inputs.

On Friday 02 June 2017 05:14 PM, Ard Biesheuvel wrote:

Alternatively, a SHA-256 implementation that uses movz/movk sequences
instead of ldr instructions to load the round constants would already
be 5x faster, given that we don't need page tables to enable the
I-cache.

Actually, looking at the C code and the objdump of the kernel's
sha256_generic driver, it is likely that it is already doing this, and
none of the points I made actually make a lot of sense ...

Pratyush: I assume you are already enabling the I-cache in the purgatory?


It's not enabled yet. I first tried enabling only the I-cache, but that did
not make much difference in execution time.


~Pratyush





Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread Ard Biesheuvel
On 2 June 2017 at 11:36, Ard Biesheuvel  wrote:
> On 2 June 2017 at 11:15, Bhupesh SHARMA  wrote:
>> Hi Ard, James
>>
>> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
>>  wrote:
>>> On 2 June 2017 at 08:23, James Morse  wrote:
>>>> Hi Pratyush,
>>>>
>>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>>> during purgatory execution, then it takes just a second in SHA
>>>>> verification.
>>>>>
>>>>> Therefore, these patches add support for a dcache enabling facility during
>>>>> purgatory execution.
>>>>
>>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>>> can print an error message or fail the syscall.
>>>>
>>>> For kexec we don't expect memory corruption, what are we testing for?
>>>
>>> This is a very good question. SHA-256 is quite a heavy hammer if all
>>> you need is CRC style error detection. Note that SHA-256 uses 256
>>> bytes of round keys, which means that in the absence of a cache, each
>>> 64 byte chunk of data processed involves (re)reading 320 bytes from
>>> DRAM. That also means you could write a SHA-256 implementation for
>>> AArch64 that keeps the round keys in NEON registers instead, and it
>>> would probably be a lot faster.
>>
>> AFAICR the sha-256 implementation was proposed to boot a signed
>> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
>> restrictions (see [1]).
>>
>> As Matthew Garrett rightly noted (see [2]), Secure Boot, if enabled, is
>> explicitly designed to stop you booting modified kernels unless you've
>> added your own keys.
>>
>> But if you boot a signed Linux distribution with kexec enabled without
>> using the SHA like feature in the purgatory (like, say, Ubuntu) then
>> you're able to boot a modified Windows kernel that will still believe
>> it was booted securely.
>>
>> So, CRC wouldn't possibly fulfil the functionality we are trying to
>> achieve with SHA-256 in the purgatory.
>>
>
> OK. But it appears that kexec_load_file() generates the hashes, and
> the purgatory just double checks them? That means there is wiggle room
> in terms of hash implementation, even though a non-cryptographic hash
> may be out of the question.
>
>> However, having seen the power of using the inbuilt CRC instructions
>> from the ARM64 ISA on a SoC which supports it, I can vouch that the
>> native ISA implementations are much faster than other approaches.
>>
>> However, using the SHA-256 implementation (as you rightly noted) would
>> employ NEON registers and can be faster, however I remember some SoC
>> vendors disabling co-processor extensions in their ARM implementations
>> in the past, so I am not sure we can assume that NEON extensions would
>> be available in all ARMv8 implementations by default.
>>
>
> Alternatively, a SHA-256 implementation that uses movz/movk sequences
> instead of ldr instructions to load the round constants would already
> be 5x faster, given that we don't need page tables to enable the
> I-cache.

Actually, looking at the C code and the objdump of the kernel's
sha256_generic driver, it is likely that it is already doing this, and
none of the points I made actually make a lot of sense ...

Pratyush: I assume you are already enabling the I-cache in the purgatory?



Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread Ard Biesheuvel
On 2 June 2017 at 11:15, Bhupesh SHARMA  wrote:
> Hi Ard, James
>
> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
>  wrote:
>> On 2 June 2017 at 08:23, James Morse  wrote:
>>> Hi Pratyush,
>>>
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution, then it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches add support for enabling the dcache during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>
>> This is a very good question. SHA-256 is quite a heavy hammer if all
>> you need is CRC style error detection. Note that SHA-256 uses 256
>> bytes of round keys, which means that in the absence of a cache, each
>> 64 byte chunk of data processed involves (re)reading 320 bytes from
>> DRAM. That also means you could write a SHA-256 implementation for
>> AArch64 that keeps the round keys in NEON registers instead, and it
>> would probably be a lot faster.
>
> AFAICR the sha-256 implementation was proposed to boot a signed
> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
> restrictions (see [1]).
>
> As Matthew Garrett rightly noted (see [2]), Secure Boot, if enabled, is
> explicitly designed to stop you booting modified kernels unless you've
> added your own keys.
>
> But if you boot a signed Linux distribution with kexec enabled without
> using the SHA like feature in the purgatory (like, say, Ubuntu) then
> you're able to boot a modified Windows kernel that will still believe
> it was booted securely.
>
> So, CRC wouldn't possibly fulfil the functionality we are trying to
> achieve with SHA-256 in the purgatory.
>

OK. But it appears that kexec_load_file() generates the hashes, and
the purgatory just double checks them? That means there is wiggle room
in terms of hash implementation, even though a non-cryptographic hash
may be out of the question.

> However, having seen the power of using the inbuilt CRC instructions
> from the ARM64 ISA on a SoC which supports it, I can vouch that the
> native ISA implementations are much faster than other approaches.
>
> However, using the SHA-256 implementation (as you rightly noted) would
> employ NEON registers and can be faster, however I remember some SoC
> vendors disabling co-processor extensions in their ARM implementations
> in the past, so I am not sure we can assume that NEON extensions would
> be available in all ARMv8 implementations by default.
>

Alternatively, a SHA-256 implementation that uses movz/movk sequences
instead of ldr instructions to load the round constants would already
be 5x faster, given that we don't need page tables to enable the
I-cache.
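
As a rough illustration of the movz/movk idea above, the following C sketch splits a 32-bit round constant into the two 16-bit immediates such a sequence would carry. Only the constant value (K[0] from FIPS 180-4) is taken from the standard; the helper names `imm_lo`, `imm_hi` and `format_mov_pair` are made up for this example and are not part of kexec-tools or the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* First SHA-256 round constant K[0], as listed in FIPS 180-4. */
#define SHA256_K0 0x428a2f98u

/* The two 16-bit halves must recombine to the original constant. */
static_assert(((0x428au << 16) | 0x2f98u) == SHA256_K0,
	      "halves recombine to K[0]");

/* Low 16 bits: the immediate a movz would load (hypothetical helper). */
static uint16_t imm_lo(uint32_t k) { return (uint16_t)(k & 0xffffu); }

/* High 16 bits: the immediate a movk ..., lsl #16 would insert. */
static uint16_t imm_hi(uint32_t k) { return (uint16_t)(k >> 16); }

/*
 * Write the two-instruction AArch64 sequence that materialises k in
 * register w<reg>. Returns what snprintf() returns.
 */
static int format_mov_pair(char *buf, size_t len, unsigned reg, uint32_t k)
{
	return snprintf(buf, len,
			"movz w%u, #0x%04x\nmovk w%u, #0x%04x, lsl #16",
			reg, (unsigned)imm_lo(k), reg, (unsigned)imm_hi(k));
}
```

Calling `format_mov_pair(buf, sizeof(buf), 0, SHA256_K0)` fills `buf` with a movz of `#0x2f98` followed by a movk of `#0x428a, lsl #16`. The constant then arrives through the instruction stream (which the I-cache can serve) rather than through a data load.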



Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread Bhupesh SHARMA
Hi Ard, James

On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
 wrote:
> On 2 June 2017 at 08:23, James Morse  wrote:
>> Hi Pratyush,
>>
>> On 23/05/17 06:02, Pratyush Anand wrote:
>>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>> during purgatory execution, then it takes just a second in SHA
>>> verification.
>>>
>>> Therefore, these patches add support for enabling the dcache during
>>> purgatory execution.
>>
>> I'm still not convinced we need this. Moving the SHA verification to happen
>> before the dcache+mmu are disabled would also solve the delay problem, and we
>> can print an error message or fail the syscall.
>>
>> For kexec we don't expect memory corruption, what are we testing for?
>
> This is a very good question. SHA-256 is quite a heavy hammer if all
> you need is CRC style error detection. Note that SHA-256 uses 256
> bytes of round keys, which means that in the absence of a cache, each
> 64 byte chunk of data processed involves (re)reading 320 bytes from
> DRAM. That also means you could write a SHA-256 implementation for
> AArch64 that keeps the round keys in NEON registers instead, and it
> would probably be a lot faster.

AFAICR the SHA-256 implementation was proposed so that a signed
kexec/kdump kernel can be booted without kexec violating UEFI Secure Boot
restrictions (see [1]).

As Matthew Garrett rightly noted (see [2]), Secure Boot, if enabled, is
explicitly designed to stop you booting modified kernels unless you've
added your own keys.

But if you boot a signed Linux distribution with kexec enabled without
using the SHA-like feature in the purgatory (like, say, Ubuntu) then
you're able to boot a modified Windows kernel that will still believe
it was booted securely.

So a CRC could not fulfil the functionality we are trying to
achieve with SHA-256 in the purgatory.

However, having seen the power of using the inbuilt CRC instructions
from the ARM64 ISA on a SoC which supports it, I can vouch that the
native ISA implementations are much faster than other approaches.
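
For reference, this is a minimal software sketch of the kind of CRC-style check being compared against SHA-256 here. It is a plain bitwise CRC-32 using the reflected polynomial 0xEDB88320 (the zlib/Ethernet convention) and does not use the ARMv8 CRC instructions; the function names are assumptions made for this example. Such a check catches accidental corruption but, unlike SHA-256, is not designed to resist deliberate tampering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32: initial value 0xffffffff, reflected polynomial
 * 0xEDB88320, final inversion. Slow but easy to follow; a hardware
 * CRC instruction or a table-driven version would replace the inner loop.
 */
static uint32_t crc32_bitwise(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xffffffffu;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++) {
			/* If the low bit is set, shift and fold in the polynomial. */
			uint32_t mask = 0u - (crc & 1u);
			crc = (crc >> 1) ^ (0xEDB88320u & mask);
		}
	}
	return ~crc;
}

/* Self-check against the standard CRC-32 check value for "123456789". */
static void crc32_selftest(void)
{
	assert(crc32_bitwise("123456789", 9) == 0xCBF43926u);
}
```

A caller would compute `crc32_bitwise(segment, segment_len)` at load time and again before jumping to the new kernel, then compare the two values.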

Using a SHA-256 implementation that keeps the round keys in NEON
registers (as you rightly noted) can be faster. However, I remember some SoC
vendors disabling co-processor extensions in their ARM implementations
in the past, so I am not sure we can assume that NEON extensions would
be available in all ARMv8 implementations by default.

[1] https://lwn.net/Articles/603116/
[2] http://mjg59.dreamwidth.org/28746.html

Regards,
Bhupesh

>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>> can't accidentally write over it.
>>
>> (we discussed all this last time, but it fizzled-out. If you and the
>>  kexec-tools maintainer think its necessary, fine by me!)
>>
>


Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread Ard Biesheuvel
On 2 June 2017 at 08:23, James Morse  wrote:
> Hi Pratyush,
>
> On 23/05/17 06:02, Pratyush Anand wrote:
>> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution, then it takes just a second in SHA
>> verification.
>>
>> Therefore, these patches add support for enabling the dcache during
>> purgatory execution.
>
> I'm still not convinced we need this. Moving the SHA verification to happen
> before the dcache+mmu are disabled would also solve the delay problem, and we
> can print an error message or fail the syscall.
>
> For kexec we don't expect memory corruption, what are we testing for?

This is a very good question. SHA-256 is quite a heavy hammer if all
you need is CRC style error detection. Note that SHA-256 uses 256
bytes of round keys, which means that in the absence of a cache, each
64 byte chunk of data processed involves (re)reading 320 bytes from
DRAM. That also means you could write a SHA-256 implementation for
AArch64 that keeps the round keys in NEON registers instead, and it
would probably be a lot faster.
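
The arithmetic behind those figures can be written out as a small C sketch. The block and round counts are SHA-256's own (FIPS 180-4); the function names and the simple traffic model are assumptions for illustration. The model counts only the input block plus one full pass over the round-constant table per block, and ignores padding and any stack traffic from the message schedule.

```c
#include <assert.h>
#include <stdint.h>

/* SHA-256 parameters from FIPS 180-4. */
enum {
	SHA256_ROUNDS      = 64, /* rounds per block, one 32-bit constant each */
	SHA256_BLOCK_BYTES = 64, /* input bytes consumed per compression call */
};

/* 64 constants of 4 bytes each give the 256 bytes of round keys. */
static_assert(SHA256_ROUNDS * sizeof(uint32_t) == 256,
	      "round constants occupy 256 bytes");

/* Size of the round-constant table in bytes. */
static uint64_t round_const_bytes(void)
{
	return SHA256_ROUNDS * sizeof(uint32_t);
}

/* Bytes fetched per block when nothing is cached: 64 + 256 = 320. */
static uint64_t uncached_bytes_per_block(void)
{
	return SHA256_BLOCK_BYTES + round_const_bytes();
}

/* Total bytes fetched for an image, rounding up to whole blocks. */
static uint64_t uncached_bytes_total(uint64_t image_bytes)
{
	uint64_t blocks = (image_bytes + SHA256_BLOCK_BYTES - 1) /
			  SHA256_BLOCK_BYTES;

	return blocks * uncached_bytes_per_block();
}
```

Under this model, the roughly 43MB of kernel plus initramfs mentioned in the cover letter turns into about five times that amount of uncached reads, since every 64-byte block costs 320 bytes of traffic.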

> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
> can't accidentally write over it.
>
> (we discussed all this last time, but it fizzled-out. If you and the
>  kexec-tools maintainer think its necessary, fine by me!)
>



Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory

2017-06-02 Thread James Morse
Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:
> It takes more than 2 minutes to verify SHA in purgatory when vmlinuz image
> is around 13MB and initramfs is around 30MB. It takes more than 20 seconds
> even when we have -O2 optimization enabled. However, if dcache is enabled
> during purgatory execution, then it takes just a second in SHA
> verification.
>
> Therefore, these patches add support for enabling the dcache during
> purgatory execution.

I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem, and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled out. If you and the
 kexec-tools maintainer think it's necessary, fine by me!)

I have some comments on making this code easier to maintain.


Thanks,

James
