Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Ard Biesheuvel
On Wed, 21 Aug 2019 at 11:29, Mike Rapoport  wrote:
>
> On Wed, Aug 21, 2019 at 10:29:37AM +0300, Ard Biesheuvel wrote:
> > On Wed, 21 Aug 2019 at 10:11, Mike Rapoport  wrote:
> > >
...
> > > I think the only missing part here is to ensure that non-reserved memory in
> > > bank 0 starts from a PMD-aligned address. I believe this could be done in the
> > > EFI stub, but I'm not really familiar with it so this is just a semi-educated
> > > guess :)
> > >
> >
> > Given that it is the ARM arch code that imposes this requirement, how
> > about adding something like this to adjust_lowmem_bounds():
> >
> > if (memblock_start_of_DRAM() % PMD_SIZE)
> > memblock_mark_nomap(memblock_start_of_DRAM(),
> > PMD_SIZE - (memblock_start_of_DRAM() % PMD_SIZE));
>
> memblock_start_of_DRAM() won't work here, as it returns the actual start of
> the DRAM including NOMAP regions. Moreover, as we cannot mark a region
> NOMAP inside for_each_memblock() this should be done beforehand.
>
> I think something like this could work:
>
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index 2f0f07e..f2b635b 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -1178,6 +1178,19 @@ void __init adjust_lowmem_bounds(void)
>  */
> vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET;
>
> +   /*
> +* The first usable region must be PMD aligned. Mark its start
> +* as MEMBLOCK_NOMAP if it isn't
> +*/
> +   for_each_memblock(memory, reg) {
> +   if (!memblock_is_nomap(reg) && (reg->base % PMD_SIZE)) {
> +   phys_addr_t size = PMD_SIZE - (reg->base % PMD_SIZE);
> +
> +   memblock_mark_nomap(reg->base, size);
> +   break;

We should break on the first !NOMAP memblock, even if it is already
PMD aligned, but beyond that, this looks ok to me.
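
For illustration only, here is a user-space sketch of the "stop at the first !NOMAP region" behaviour described above. It is not kernel code: struct sketch_region, sketch_unaligned_head() and SKETCH_PMD_SIZE are hypothetical names, and the 2 MiB section size is an assumption standing in for the real PMD_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: 2 MiB sections; the real PMD_SIZE comes from the arch headers. */
#define SKETCH_PMD_SIZE 0x200000ULL

/* Hypothetical stand-in for a memblock region; not the kernel's struct. */
struct sketch_region {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

/*
 * Return how many bytes at the start of the first mappable region would
 * need to be marked NOMAP to make it PMD aligned. The scan stops at the
 * first mappable region whether or not it is aligned, so later regions
 * are never touched. Returns 0 if nothing needs marking.
 */
static uint64_t sketch_unaligned_head(const struct sketch_region *regs, size_t n)
{
	assert(regs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t misalign;

		if (regs[i].nomap)
			continue;

		misalign = regs[i].base % SKETCH_PMD_SIZE;
		return misalign ? SKETCH_PMD_SIZE - misalign : 0;
	}
	return 0;
}
```

With the condition folded into the if() as in the quoted patch, an aligned first usable region would not terminate the loop, and a later unaligned region could get marked instead; returning from the first mappable region avoids that.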


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Mike Rapoport
On Wed, Aug 21, 2019 at 10:29:37AM +0300, Ard Biesheuvel wrote:
> On Wed, 21 Aug 2019 at 10:11, Mike Rapoport  wrote:
> >
> > On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> > > On Wed, 21 Aug 2019 at 09:11, Chester Lin  wrote:
> > > >
> > > > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > > > >  wrote:
> > > > > >
> > > > > > On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > > > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > > > --- a/arch/arm/mm/mmu.c
> > > > > > > +++ b/arch/arm/mm/mmu.c
> > > > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > > > >   phys_addr_t block_start = reg->base;
> > > > > > >   phys_addr_t block_end = reg->base + reg->size;
> > > > > > >
> > > > > > > + if (memblock_is_nomap(reg))
> > > > > > > + continue;
> > > > > > > +
> > > > > > >   if (reg->base < vmalloc_limit) {
> > > > > > >   if (block_end > lowmem_limit)
> > > > > > >   /*
> > > > > >
> > > > > > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > > > > > available for the kernel's use, so as far as calculating where the
> > > > > > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > > > > > skipped.
> > > > > >
> > > > >
> > > > > I agree.
> > > > >
> > > > > Chester, could you explain what you need beyond this change (and my
> > > > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > > > RPi2?
> > > > >
> > > >
> > > > Hi Ard,
> > > >
> > > > In fact I am working with Guillaume to try booting zImage kernel and openSUSE
> > > > from grub2.04 + arm32-efistub so that's why we get this issue on RPi2, which is
> > > > one of the test machines we have. However we want a better solution for all
> > > > cases but not just RPi2 since we don't want to affect other platforms as well.
> > > >
> > >
> > > Thanks Chester, but that doesn't answer my question.
> > >
> > > Your fix is a single patch that changes various things that are only
> > > vaguely related. We have already identified that we need to take
> > > TEXT_OFFSET (minus some space used by the swapper page tables) into
> > > account into the EFI stub if we want to ensure compatibility with many
> > > different platforms, and as it turns out, this applies not only to
> > > RPi2 but to other platforms as well, most notably the ones that
> > > require a TEXT_OFFSET of 0x208000, since they also have reserved
> > > regions at the base of RAM.
> > >
> > > My question was what else we need beyond:
> > > - the EFI stub TEXT_OFFSET fix [0]
> > > - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> > > - what else???
> >
> > > I think the only missing part here is to ensure that non-reserved memory in
> > > bank 0 starts from a PMD-aligned address. I believe this could be done in the
> > > EFI stub, but I'm not really familiar with it so this is just a semi-educated
> > > guess :)
> >
> 
> Given that it is the ARM arch code that imposes this requirement, how
> about adding something like this to adjust_lowmem_bounds():
> 
> if (memblock_start_of_DRAM() % PMD_SIZE)
> memblock_mark_nomap(memblock_start_of_DRAM(),
> PMD_SIZE - (memblock_start_of_DRAM() % PMD_SIZE));

memblock_start_of_DRAM() won't work here, as it returns the actual start of
the DRAM including NOMAP regions. Moreover, as we cannot mark a region
NOMAP inside for_each_memblock() this should be done beforehand.

I think something like this could work:

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 2f0f07e..f2b635b 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1178,6 +1178,19 @@ void __init adjust_lowmem_bounds(void)
 	 */
 	vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET;
 
+	/*
+	 * The first usable region must be PMD aligned. Mark its start
+	 * as MEMBLOCK_NOMAP if it isn't
+	 */
+	for_each_memblock(memory, reg) {
+		if (!memblock_is_nomap(reg) && (reg->base % PMD_SIZE)) {
+			phys_addr_t size = PMD_SIZE - (reg->base % PMD_SIZE);
+
+			memblock_mark_nomap(reg->base, size);
+			break;
+		}
+	}
+
 	for_each_memblock(memory, reg) {
 		phys_addr_t block_start = reg->base;
 		phys_addr_t block_end = reg->base + reg->size;
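
To make the memblock_start_of_DRAM() point concrete, here is a small user-space sketch (not kernel code). struct sketch_bank and both helpers are hypothetical names; sketch_start_of_dram() only models the behaviour described above (lowest base, NOMAP included), and the 2 MiB value is an assumption standing in for PMD_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: 2 MiB sections; stands in for the kernel's PMD_SIZE. */
#define SKETCH_PMD_SIZE 0x200000ULL

/* Hypothetical stand-in for a memblock region; not the kernel's struct. */
struct sketch_bank {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

/* Models "start of DRAM including NOMAP": the lowest base in a sorted list. */
static uint64_t sketch_start_of_dram(const struct sketch_bank *banks, size_t n)
{
	assert(banks != NULL && n > 0);
	return banks[0].base;
}

/* Stores the base of the first bank not marked NOMAP; false if there is none. */
static bool sketch_first_mappable_base(const struct sketch_bank *banks, size_t n,
				       uint64_t *base)
{
	for (size_t i = 0; i < n; i++) {
		if (!banks[i].nomap) {
			*base = banks[i].base;
			return true;
		}
	}
	return false;
}
```

For a map with a NOMAP page at 0x0 followed by usable memory at 0x1000, the first helper returns 0x0, which looks PMD aligned, while the second reports 0x1000, which is not. That is why the alignment check has to look at the first mappable region.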



 
> (and introduce the nomap check into the loop)

-- 
Sincerely yours,
Mike.



Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Ard Biesheuvel
On Wed, 21 Aug 2019 at 10:11, Mike Rapoport  wrote:
>
> On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> > On Wed, 21 Aug 2019 at 09:11, Chester Lin  wrote:
> > >
> > > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > > >  wrote:
> > > > >
> > > > > On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > > --- a/arch/arm/mm/mmu.c
> > > > > > +++ b/arch/arm/mm/mmu.c
> > > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > > >   phys_addr_t block_start = reg->base;
> > > > > >   phys_addr_t block_end = reg->base + reg->size;
> > > > > >
> > > > > > + if (memblock_is_nomap(reg))
> > > > > > + continue;
> > > > > > +
> > > > > >   if (reg->base < vmalloc_limit) {
> > > > > >   if (block_end > lowmem_limit)
> > > > > >   /*
> > > > >
> > > > > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > > > available for the kernel's use, so as far as calculating where the
> > > > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > > > skipped.
> > > > >
> > > >
> > > > I agree.
> > > >
> > > > Chester, could you explain what you need beyond this change (and my
> > > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > > RPi2?
> > > >
> > >
> > > Hi Ard,
> > >
> > > > In fact I am working with Guillaume to try booting zImage kernel and openSUSE
> > > > from grub2.04 + arm32-efistub so that's why we get this issue on RPi2, which is
> > > > one of the test machines we have. However we want a better solution for all
> > > > cases but not just RPi2 since we don't want to affect other platforms as well.
> > >
> >
> > Thanks Chester, but that doesn't answer my question.
> >
> > Your fix is a single patch that changes various things that are only
> > vaguely related. We have already identified that we need to take
> > TEXT_OFFSET (minus some space used by the swapper page tables) into
> > account into the EFI stub if we want to ensure compatibility with many
> > different platforms, and as it turns out, this applies not only to
> > RPi2 but to other platforms as well, most notably the ones that
> > require a TEXT_OFFSET of 0x208000, since they also have reserved
> > regions at the base of RAM.
> >
> > My question was what else we need beyond:
> > - the EFI stub TEXT_OFFSET fix [0]
> > - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> > - what else???
>
> I think the only missing part here is to ensure that non-reserved memory in
> bank 0 starts from a PMD-aligned address. I believe this could be done in the
> EFI stub, but I'm not really familiar with it so this is just a semi-educated
> guess :)
>

Given that it is the ARM arch code that imposes this requirement, how
about adding something like this to adjust_lowmem_bounds():

if (memblock_start_of_DRAM() % PMD_SIZE)
memblock_mark_nomap(memblock_start_of_DRAM(),
PMD_SIZE - (memblock_start_of_DRAM() % PMD_SIZE));

(and introduce the nomap check into the loop)


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Chester Lin
On Wed, Aug 21, 2019 at 10:11:01AM +0300, Mike Rapoport wrote:
> On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> > On Wed, 21 Aug 2019 at 09:11, Chester Lin  wrote:
> > >
> > > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > > >  wrote:
> > > > >
> > > > > On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > > --- a/arch/arm/mm/mmu.c
> > > > > > +++ b/arch/arm/mm/mmu.c
> > > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > > >   phys_addr_t block_start = reg->base;
> > > > > >   phys_addr_t block_end = reg->base + reg->size;
> > > > > >
> > > > > > + if (memblock_is_nomap(reg))
> > > > > > + continue;
> > > > > > +
> > > > > >   if (reg->base < vmalloc_limit) {
> > > > > >   if (block_end > lowmem_limit)
> > > > > >   /*
> > > > >
> > > > > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > > > available for the kernel's use, so as far as calculating where the
> > > > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > > > skipped.
> > > > >
> > > >
> > > > I agree.
> > > >
> > > > Chester, could you explain what you need beyond this change (and my
> > > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > > RPi2?
> > > >
> > >
> > > Hi Ard,
> > >
> > > > In fact I am working with Guillaume to try booting zImage kernel and openSUSE
> > > > from grub2.04 + arm32-efistub so that's why we get this issue on RPi2, which is
> > > > one of the test machines we have. However we want a better solution for all
> > > > cases but not just RPi2 since we don't want to affect other platforms as well.
> > >
> > 
> > Thanks Chester, but that doesn't answer my question.
> > 
> > Your fix is a single patch that changes various things that are only
> > vaguely related. We have already identified that we need to take
> > TEXT_OFFSET (minus some space used by the swapper page tables) into
> > account into the EFI stub if we want to ensure compatibility with many
> > different platforms, and as it turns out, this applies not only to
> > RPi2 but to other platforms as well, most notably the ones that
> > require a TEXT_OFFSET of 0x208000, since they also have reserved
> > regions at the base of RAM.
> > 
> > My question was what else we need beyond:
> > - the EFI stub TEXT_OFFSET fix [0]
> > - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> > - what else???
> 
> I think the only missing part here is to ensure that non-reserved memory in
> bank 0 starts from a PMD-aligned address. I believe this could be done in the
> EFI stub, but I'm not really familiar with it so this is just a semi-educated
> guess :)
>  
> > [0] 
> > https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git/commit/?h=next=0eb7bad595e52666b642a02862ad996a0f9bfcc0
>

Hi Ard and Mike,

Sorry for my misunderstanding; I agree with Mike. We could still hit the
memblock_limit issue if the non-reserved memory in bank 0 starts from an
unaligned address.
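
As a rough user-space model of why skipping NOMAP regions alone may not be enough (not kernel code): the sketch below follows the in-loop logic visible in the hunks quoted in this thread. The assignment in the else-if branch is not shown in those hunks, so using lowmem_limit there is an assumption, as are the names struct sketch_bank and sketch_memblock_limit() and the 2 MiB section size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: 2 MiB sections; stands in for the kernel's PMD_SIZE. */
#define SKETCH_PMD_SIZE 0x200000ULL

/* Hypothetical stand-in for a memblock region; not the kernel's struct. */
struct sketch_bank {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

/*
 * Simplified model of the memblock_limit selection inside the loop of
 * adjust_lowmem_bounds(). Any post-processing done after the loop is
 * deliberately left out.
 */
static uint64_t sketch_memblock_limit(const struct sketch_bank *banks, size_t n,
				      uint64_t vmalloc_limit, bool skip_nomap)
{
	uint64_t lowmem_limit = 0;
	uint64_t memblock_limit = 0;

	assert(banks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t start = banks[i].base;
		uint64_t end = banks[i].base + banks[i].size;

		if (skip_nomap && banks[i].nomap)
			continue;
		if (start >= vmalloc_limit)
			continue;

		if (end > lowmem_limit)
			lowmem_limit = end < vmalloc_limit ? end : vmalloc_limit;

		/* Only the first qualifying bank decides the limit. */
		if (!memblock_limit) {
			if (start % SKETCH_PMD_SIZE)
				memblock_limit = start;
			else if (end % SKETCH_PMD_SIZE)
				memblock_limit = lowmem_limit; /* assumed branch body */
		}
	}
	return memblock_limit;
}
```

Under these assumptions, the RPi2 map from this thread yields 0x1000 both with and without the NOMAP skip, because the first usable bank itself starts at the unaligned address 0x1000. Only when the usable bank starts on a PMD boundary does the limit move up to the end of that bank.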

Regards,
Chester


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Mike Rapoport
On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> On Wed, 21 Aug 2019 at 09:11, Chester Lin  wrote:
> >
> > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > >  wrote:
> > > >
> > > > On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > --- a/arch/arm/mm/mmu.c
> > > > > +++ b/arch/arm/mm/mmu.c
> > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > >   phys_addr_t block_start = reg->base;
> > > > >   phys_addr_t block_end = reg->base + reg->size;
> > > > >
> > > > > + if (memblock_is_nomap(reg))
> > > > > + continue;
> > > > > +
> > > > >   if (reg->base < vmalloc_limit) {
> > > > >   if (block_end > lowmem_limit)
> > > > >   /*
> > > >
> > > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > > available for the kernel's use, so as far as calculating where the
> > > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > > skipped.
> > > >
> > >
> > > I agree.
> > >
> > > Chester, could you explain what you need beyond this change (and my
> > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > RPi2?
> > >
> >
> > Hi Ard,
> >
> > In fact I am working with Guillaume to try booting zImage kernel and openSUSE
> > from grub2.04 + arm32-efistub so that's why we get this issue on RPi2, which is
> > one of the test machines we have. However we want a better solution for all
> > cases but not just RPi2 since we don't want to affect other platforms as well.
> >
> 
> Thanks Chester, but that doesn't answer my question.
> 
> Your fix is a single patch that changes various things that are only
> vaguely related. We have already identified that we need to take
> TEXT_OFFSET (minus some space used by the swapper page tables) into
> account into the EFI stub if we want to ensure compatibility with many
> different platforms, and as it turns out, this applies not only to
> RPi2 but to other platforms as well, most notably the ones that
> require a TEXT_OFFSET of 0x208000, since they also have reserved
> regions at the base of RAM.
> 
> My question was what else we need beyond:
> - the EFI stub TEXT_OFFSET fix [0]
> - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> - what else???

I think the only missing part here is to ensure that non-reserved memory in
bank 0 starts from a PMD-aligned address. I believe this could be done in the
EFI stub, but I'm not really familiar with it so this is just a semi-educated
guess :)
 
> [0] 
> https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git/commit/?h=next=0eb7bad595e52666b642a02862ad996a0f9bfcc0

-- 
Sincerely yours,
Mike.



Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Ard Biesheuvel
On Wed, 21 Aug 2019 at 09:11, Chester Lin  wrote:
>
> On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> >  wrote:
> > >
> > > On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > index f3ce34113f89..909b11ba48d8 100644
> > > > --- a/arch/arm/mm/mmu.c
> > > > +++ b/arch/arm/mm/mmu.c
> > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > >   phys_addr_t block_start = reg->base;
> > > >   phys_addr_t block_end = reg->base + reg->size;
> > > >
> > > > + if (memblock_is_nomap(reg))
> > > > + continue;
> > > > +
> > > >   if (reg->base < vmalloc_limit) {
> > > >   if (block_end > lowmem_limit)
> > > >   /*
> > >
> > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > available for the kernel's use, so as far as calculating where the
> > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > skipped.
> > >
> >
> > I agree.
> >
> > Chester, could you explain what you need beyond this change (and my
> > EFI stub change involving TEXT_OFFSET) to make things work on the
> > RPi2?
> >
>
> Hi Ard,
>
> In fact I am working with Guillaume to try booting zImage kernel and openSUSE
> from grub2.04 + arm32-efistub so that's why we get this issue on RPi2, which is
> one of the test machines we have. However we want a better solution for all
> cases but not just RPi2 since we don't want to affect other platforms as well.
>

Thanks Chester, but that doesn't answer my question.

Your fix is a single patch that changes various things that are only
vaguely related. We have already identified that we need to take
TEXT_OFFSET (minus some space used by the swapper page tables) into
account in the EFI stub if we want to ensure compatibility with many
different platforms, and as it turns out, this applies not only to
RPi2 but to other platforms as well, most notably the ones that
require a TEXT_OFFSET of 0x208000, since they also have reserved
regions at the base of RAM.

My question was what else we need beyond:
- the EFI stub TEXT_OFFSET fix [0]
- the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
- what else???


[0] 
https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git/commit/?h=next=0eb7bad595e52666b642a02862ad996a0f9bfcc0


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-21 Thread Chester Lin
On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
>  wrote:
> >
> > On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > index f3ce34113f89..909b11ba48d8 100644
> > > --- a/arch/arm/mm/mmu.c
> > > +++ b/arch/arm/mm/mmu.c
> > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > >   phys_addr_t block_start = reg->base;
> > >   phys_addr_t block_end = reg->base + reg->size;
> > >
> > > + if (memblock_is_nomap(reg))
> > > + continue;
> > > +
> > >   if (reg->base < vmalloc_limit) {
> > >   if (block_end > lowmem_limit)
> > >   /*
> >
> > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > available for the kernel's use, so as far as calculating where the
> > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > skipped.
> >
> 
> I agree.
> 
> Chester, could you explain what you need beyond this change (and my
> EFI stub change involving TEXT_OFFSET) to make things work on the
> RPi2?
>

Hi Ard,

In fact, I am working with Guillaume on booting a zImage kernel and openSUSE
from grub 2.04 + arm32-efistub, which is why we hit this issue on RPi2, one of
our test machines. However, we want a solution that covers all cases, not just
RPi2, since we don't want to affect other platforms either.

Regards,
Chester


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Ard Biesheuvel
On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
 wrote:
>
> On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > index f3ce34113f89..909b11ba48d8 100644
> > --- a/arch/arm/mm/mmu.c
> > +++ b/arch/arm/mm/mmu.c
> > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> >   phys_addr_t block_start = reg->base;
> >   phys_addr_t block_end = reg->base + reg->size;
> >
> > + if (memblock_is_nomap(reg))
> > + continue;
> > +
> >   if (reg->base < vmalloc_limit) {
> >   if (block_end > lowmem_limit)
> >   /*
>
> I think this hunk is sane - if the memory is marked nomap, then it isn't
> available for the kernel's use, so as far as calculating where the
> lowmem/highmem boundary is, it effectively doesn't exist and should be
> skipped.
>

I agree.

Chester, could you explain what you need beyond this change (and my
EFI stub change involving TEXT_OFFSET) to make things work on the
RPi2?


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Ard Biesheuvel
On Tue, 20 Aug 2019 at 14:54, Russell King - ARM Linux admin
 wrote:
>
> On Sun, Aug 04, 2019 at 10:57:00AM +0300, Ard Biesheuvel wrote:
> > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > why putting a reserved region of 4 KB works at the moment, but
> > this is fragile).
>
> That is not correct for 32-bit ARM.  The swapper page table is still
> located 16kiB below the kernel.
>

Right. So that means we can only permit reserved regions in the first
(TEXT_OFFSET - 16 kiB) bytes starting at the first 128 MiB aligned
address covered by system RAM, if we want to ensure that the
decompressor or the early kernel don't trample on it. (or TEXT_OFFSET
- 20 kiB for LPAE kernels)
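
As a worked example of the arithmetic above (a sketch, not kernel code): sketch_reserved_window() is a hypothetical helper that returns how many bytes from the 128 MiB aligned base may hold reserved regions, given TEXT_OFFSET and the 16 KiB (or 20 KiB for LPAE) page table sizes quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Page table sizes below the kernel, as quoted in this thread. */
#define SKETCH_PGTABLE_CLASSIC (16u * 1024u)
#define SKETCH_PGTABLE_LPAE    (20u * 1024u)

/*
 * Size in bytes of the window, measured from the 128 MiB aligned base,
 * in which reserved regions would not collide with the swapper page
 * tables or the kernel image placed at base + text_offset.
 */
static uint32_t sketch_reserved_window(uint32_t text_offset, bool lpae)
{
	uint32_t pgtable = lpae ? SKETCH_PGTABLE_LPAE : SKETCH_PGTABLE_CLASSIC;

	assert(text_offset >= pgtable);
	return text_offset - pgtable;
}
```

For a TEXT_OFFSET of 0x208000 this leaves 0x204000 bytes without LPAE. For the common 0x8000 value (not mentioned in this thread, used here only as an example) it leaves 0x4000 bytes, or 0x3000 with LPAE.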


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Russell King - ARM Linux admin
On Fri, Aug 02, 2019 at 05:38:54AM +, Chester Lin wrote:
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index f3ce34113f89..909b11ba48d8 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
>   phys_addr_t block_start = reg->base;
>   phys_addr_t block_end = reg->base + reg->size;
>  
> + if (memblock_is_nomap(reg))
> + continue;
> +
>   if (reg->base < vmalloc_limit) {
>   if (block_end > lowmem_limit)
>   /*

I think this hunk is sane - if the memory is marked nomap, then it isn't
available for the kernel's use, so as far as calculating where the
lowmem/highmem boundary is, it effectively doesn't exist and should be
skipped.
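
A minimal user-space sketch of that reasoning (not the kernel function): sketch_lowmem_limit() and struct sketch_mem are hypothetical names, and the loop keeps only the part of adjust_lowmem_bounds() that tracks the highest mappable end below vmalloc_limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memblock region; not the kernel's struct. */
struct sketch_mem {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

/*
 * Highest end address of any mappable region that starts below
 * vmalloc_limit, clamped to vmalloc_limit. NOMAP regions are treated
 * as if they did not exist.
 */
static uint64_t sketch_lowmem_limit(const struct sketch_mem *regs, size_t n,
				    uint64_t vmalloc_limit)
{
	uint64_t lowmem_limit = 0;

	assert(regs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t end = regs[i].base + regs[i].size;

		if (regs[i].nomap)
			continue;

		if (regs[i].base < vmalloc_limit && end > lowmem_limit)
			lowmem_limit = end < vmalloc_limit ? end : vmalloc_limit;
	}
	return lowmem_limit;
}
```

A NOMAP region sitting just above the last usable region therefore does not push the lowmem/highmem boundary upwards.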

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Russell King - ARM Linux admin
On Sun, Aug 04, 2019 at 10:57:00AM +0300, Ard Biesheuvel wrote:
> (The first TEXT_OFFSET bytes are no longer used in practice, which is
> why putting a reserved region of 4 KB works at the moment, but
> this is fragile).

That is not correct for 32-bit ARM.  The swapper page table is still
located 16kiB below the kernel.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Ard Biesheuvel
On Tue, 20 Aug 2019 at 10:49, Mike Rapoport  wrote:
>
> On Mon, Aug 19, 2019 at 05:56:51PM +0300, Ard Biesheuvel wrote:
> > On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
> > >
> > > Hi Mike and Ard,
> > >
> > > On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > > > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > > > (adding Mike)
> > > > >
>
> ...
>
> > > > > > > In this case the kernel failed to reserve cma, which should hit the issue of
> > > > > > > memblock_limit=0x1000 as I had mentioned in my patch description. The first
> > > > > > > block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not align
> > > > > > > with PMD_SIZE so the cma reservation failed because the memblock.current_limit
> > > > > > > was extremely low. That's why I expand the first reservation from 1 PAGESIZE to
> > > > > > > 1 PMD_SIZE in my patch in order to avoid this issue. Please kindly let me know
> > > > > > > if any suggestion, thank you.
> > > >
> > > >
> > > > > This looks like it is a separate issue. The memblock/cma code should
> > > > > not choke on a reserved page of memory at 0x0.
> > > > >
> > > > > Perhaps Russell or Mike (cc'ed) have an idea how to address this?
> > > >
> > > > > Presuming that the last memblock dump comes from the end of
> > > > > arm_memblock_init() with this memory map
> > > > >
> > > > > memory[0x0] [0x-0x0fff], 0x1000 bytes flags: 0x4
> > > > > memory[0x1] [0x1000-0x07ef5fff], 0x07ef5000 bytes flags: 0x0
> > > > > memory[0x2] [0x07ef6000-0x07f09fff], 0x00014000 bytes flags: 0x4
> > > > > memory[0x3] [0x07f0a000-0x3cb3efff], 0x34c35000 bytes flags: 0x0
> > > >
> > > > > adjust_lowmem_bounds() will set the memblock_limit (and respectively global
> > > > > memblock.current_limit) to 0x1000 and any further memblock_alloc*() will
> > > > > happily fail.
> > > > >
> > > > > I believe that the assumption for memblock_limit calculations was that the
> > > > > first bank has several megs at least.
> > > >
> > > > I wonder if this hack would help:
> > > >
> > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > index d9a0038..948e5b9 100644
> > > > --- a/arch/arm/mm/mmu.c
> > > > +++ b/arch/arm/mm/mmu.c
> > > > @@ -1206,7 +1206,7 @@ void __init adjust_lowmem_bounds(void)
> > > > >* allocated when mapping the start of bank 0, which
> > > >* occurs before any free memory is mapped.
> > > >*/
> > > > - if (!memblock_limit) {
> > > > + if (memblock_limit < PMD_SIZE) {
> > > >   if (!IS_ALIGNED(block_start, PMD_SIZE))
> > > >   memblock_limit = block_start;
> > > >   else if (!IS_ALIGNED(block_end, PMD_SIZE))
> > > >
> > >
> > > I applied this patch as well and it works well on rpi-2 model B.
> > >
> >
> > Thanks, Chester, that is good to know.
> >
> > However, afaict, this only affects systems where physical memory
> > starts at address 0x0, so I think we need a better fix.
>
> This hack can be easily extended to handle systems with arbitrary start
> address, but it's still a hack...
>
> > I know Mike has been looking into the NOMAP stuff lately, and your
> > original patch contains a hunk that makes this code (?) disregard
> > nomap memblocks. That might be a better approach.
>
> I was actually looking how to replace NOMAP with something else to make
> memblock.memory consistent with actual physical memory banks. But this work
> is stashed for now.
>
> I'm not sure that skipping NOMAP regions would be good enough.
> If I understand correctly, with Chester's original patch the reservation
> of PMD aligned chunk of 32M for the kernel made the first conv-mem region
> PMD aligned and then memblock_limit will be set to the end of this region.
>
> Is there a reason for marking EFI_RESERVED_TYPE as NOMAP rather than simply
> reserve them with memblock_reserve()?
>

Yes.

On ARM systems, reserved memory regions should never be mapped by
default, since the cacheable mappings we use in the linear region may
conflict with the mapping attributes used by the firmware or driver
components that are using this memory.

In this particular case, we are talking about things like spin tables
and pens for secondaries that boot up with their caches disabled, and
having a cacheable mapping on the primary CPU might cause a loss of
coherency.


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Chester Lin
On Tue, Aug 20, 2019 at 10:49:30AM +0300, Mike Rapoport wrote:
> On Mon, Aug 19, 2019 at 05:56:51PM +0300, Ard Biesheuvel wrote:
> > On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
> > >
> > > Hi Mike and Ard,
> > >
> > > On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > > > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > > > (adding Mike)
> > > > >
> 
> ...
> 
> > > > > > > In this case the kernel failed to reserve cma, which should hit the issue of
> > > > > > > memblock_limit=0x1000 as I had mentioned in my patch description. The first
> > > > > > > block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not align
> > > > > > > with PMD_SIZE so the cma reservation failed because the memblock.current_limit
> > > > > > > was extremely low. That's why I expand the first reservation from 1 PAGESIZE to
> > > > > > > 1 PMD_SIZE in my patch in order to avoid this issue. Please kindly let me know
> > > > > > > if any suggestion, thank you.
> > > >
> > > >
> > > > > This looks like it is a separate issue. The memblock/cma code should
> > > > > not choke on a reserved page of memory at 0x0.
> > > > >
> > > > > Perhaps Russell or Mike (cc'ed) have an idea how to address this?
> > > >
> > > > > Presuming that the last memblock dump comes from the end of
> > > > > arm_memblock_init() with this memory map
> > > > >
> > > > > memory[0x0] [0x-0x0fff], 0x1000 bytes flags: 0x4
> > > > > memory[0x1] [0x1000-0x07ef5fff], 0x07ef5000 bytes flags: 0x0
> > > > > memory[0x2] [0x07ef6000-0x07f09fff], 0x00014000 bytes flags: 0x4
> > > > > memory[0x3] [0x07f0a000-0x3cb3efff], 0x34c35000 bytes flags: 0x0
> > > >
> > > > > adjust_lowmem_bounds() will set the memblock_limit (and respectively global
> > > > > memblock.current_limit) to 0x1000 and any further memblock_alloc*() will
> > > > > happily fail.
> > > > >
> > > > > I believe that the assumption for memblock_limit calculations was that the
> > > > > first bank has several megs at least.
> > > >
> > > > I wonder if this hack would help:
> > > >
> > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > index d9a0038..948e5b9 100644
> > > > --- a/arch/arm/mm/mmu.c
> > > > +++ b/arch/arm/mm/mmu.c
> > > > @@ -1206,7 +1206,7 @@ void __init adjust_lowmem_bounds(void)
> > > > >* allocated when mapping the start of bank 0, which
> > > >* occurs before any free memory is mapped.
> > > >*/
> > > > - if (!memblock_limit) {
> > > > + if (memblock_limit < PMD_SIZE) {
> > > >   if (!IS_ALIGNED(block_start, PMD_SIZE))
> > > >   memblock_limit = block_start;
> > > >   else if (!IS_ALIGNED(block_end, PMD_SIZE))
> > > >
> > >
> > > I applied this patch as well and it works well on rpi-2 model B.
> > >
> > 
> > Thanks, Chester, that is good to know.
> > 
> > However, afaict, this only affects systems where physical memory
> > starts at address 0x0, so I think we need a better fix.
> 
> This hack can be easily extended to handle systems with arbitrary start
> address, but it's still a hack...
> 
> > I know Mike has been looking into the NOMAP stuff lately, and your
> > original patch contains a hunk that makes this code (?) disregard
> > nomap memblocks. That might be a better approach.
> 
> I was actually looking how to replace NOMAP with something else to make
> memblock.memory consistent with actual physical memory banks. But this work
> is stashed for now.
> 
> I'm not sure that skipping NOMAP regions would be good enough.
> If I understand correctly, with Chester's original patch the reservation
> of PMD aligned chunk of 32M for the kernel made the first conv-mem region
> PMD aligned and then memblock_limit will be set to the end of this region.
> 
> Is there a reason for marking EFI_RESERVED_TYPE as NOMAP rather than simply
> reserve them with memblock_reserve()?
> 

Hi Mike,

I made this change in the efistub, so I am not sure whether memblock_reserve()
can be linked there. I tried using efi_mem_reserve() but got an undefined
reference error from the linker. Is there a better place to call
memblock_reserve() after the efistub has run?

Thanks,
Chester


Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Mike Rapoport
On Mon, Aug 19, 2019 at 05:56:51PM +0300, Ard Biesheuvel wrote:
> On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
> >
> > Hi Mike and Ard,
> >
> > On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > > (adding Mike)
> > > >

...

> > > > > > In this case the kernel failed to reserve cma, which should hit the issue of
> > > > > > memblock_limit=0x1000 as I had mentioned in my patch description. The first
> > > > > > block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not align
> > > > > > with PMD_SIZE so the cma reservation failed because the memblock.current_limit
> > > > > > was extremely low. That's why I expand the first reservation from 1 PAGESIZE to
> > > > > > 1 PMD_SIZE in my patch in order to avoid this issue. Please kindly let me know
> > > > > > if any suggestion, thank you.
> > >
> > >
> > > > This looks like it is a separate issue. The memblock/cma code should
> > > > not choke on a reserved page of memory at 0x0.
> > > >
> > > > Perhaps Russell or Mike (cc'ed) have an idea how to address this?
> > >
> > > Presuming that the last memblock dump comes from the end of
> > > > arm_memblock_init() with this memory map
> > >
> > > > memory[0x0] [0x-0x0fff], 0x1000 bytes flags: 0x4
> > > > memory[0x1] [0x1000-0x07ef5fff], 0x07ef5000 bytes flags: 0x0
> > > > memory[0x2] [0x07ef6000-0x07f09fff], 0x00014000 bytes flags: 0x4
> > > > memory[0x3] [0x07f0a000-0x3cb3efff], 0x34c35000 bytes flags: 0x0
> > >
> > > adjust_lowmem_bounds() will set the memblock_limit (and respectively 
> > > global
> > > memblock.current_limit) to 0x1000 and any further memblock_alloc*() will
> > > happily fail.
> > >
> > > I believe that the assumption for memblock_limit calculations was that the
> > > first bank has several megs at least.
> > >
> > > I wonder if this hack would help:
> > >
> > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > index d9a0038..948e5b9 100644
> > > --- a/arch/arm/mm/mmu.c
> > > +++ b/arch/arm/mm/mmu.c
> > > @@ -1206,7 +1206,7 @@ void __init adjust_lowmem_bounds(void)
> > >* allocated when mapping the start of bank 0, which
> > >* occurs before any free memory is mapped.
> > >*/
> > > - if (!memblock_limit) {
> > > + if (memblock_limit < PMD_SIZE) {
> > >   if (!IS_ALIGNED(block_start, PMD_SIZE))
> > >   memblock_limit = block_start;
> > >   else if (!IS_ALIGNED(block_end, PMD_SIZE))
> > >
> >
> > I applied this patch as well and it works well on rpi-2 model B.
> >
> 
> Thanks, Chester, that is good to know.
> 
> However, afaict, this only affects systems where physical memory
> starts at address 0x0, so I think we need a better fix.

This hack can be easily extended to handle systems with arbitrary start
address, but it's still a hack...
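
To make the difference between the two guards concrete, here is a minimal
userspace sketch of the limit selection, not kernel code. It assumes PMD_SIZE
is 2 MiB (LPAE), that the first region in the dump starts at 0x0, that region
flags are ignored, and that everything lies below the vmalloc limit; the
helper pick_memblock_limit() and the rpi2_map table are hypothetical names
made up for this illustration. It returns the value before the kernel rounds
it down to a PMD boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PMD_SIZE 0x200000ULL /* assumption: 2 MiB sections (LPAE) */

struct region {
	uint64_t base;
	uint64_t size;
};

static bool is_aligned(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/*
 * Model of the memblock_limit selection discussed in this thread.
 * use_hack == false: guard is `!memblock_limit` (current code).
 * use_hack == true:  guard is `memblock_limit < PMD_SIZE` (proposed hack).
 * The later round-down to PMD_SIZE is deliberately left out.
 */
static uint64_t pick_memblock_limit(const struct region *r, size_t n,
				    bool use_hack)
{
	uint64_t lowmem_limit = 0;
	uint64_t memblock_limit = 0;

	assert(r != NULL && n > 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t block_start = r[i].base;
		uint64_t block_end = r[i].base + r[i].size;
		bool undecided = use_hack ? memblock_limit < PMD_SIZE
					  : memblock_limit == 0;

		if (block_end > lowmem_limit)
			lowmem_limit = block_end;

		if (undecided) {
			if (!is_aligned(block_start, PMD_SIZE))
				memblock_limit = block_start;
			else if (!is_aligned(block_end, PMD_SIZE))
				memblock_limit = lowmem_limit;
		}
	}

	/* No unaligned boundary found: fall back to the end of lowmem. */
	if (!memblock_limit)
		memblock_limit = lowmem_limit;

	return memblock_limit;
}

/* Memory map from the dump above; the base of region 0 is assumed to be 0x0. */
static const struct region rpi2_map[] = {
	{ 0x00000000ULL, 0x00001000ULL },
	{ 0x00001000ULL, 0x07ef5000ULL },
	{ 0x07ef6000ULL, 0x00014000ULL },
	{ 0x07f0a000ULL, 0x34c35000ULL },
};
```

Calling pick_memblock_limit(rpi2_map, 4, false) models the current guard and
pick_memblock_limit(rpi2_map, 4, true) models the hack. In this model the hack
only helps because the loop keeps overwriting the limit until it sees a
boundary at or above PMD_SIZE, which is why it depends on memory starting near
address 0x0.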

> I know Mike has been looking into the NOMAP stuff lately, and your
> original patch contains a hunk that makes this code (?) disregard
> nomap memblocks. That might be a better approach.

I was actually looking how to replace NOMAP with something else to make
memblock.memory consistent with actual physical memory banks. But this work
is stashed for now.

I'm not sure that skipping NOMAP regions would be good enough.
If I understand correctly, with Chester's original patch the reservation
of PMD aligned chunk of 32M for the kernel made the first conv-mem region
PMD aligned and then memblock_limit will be set to the end of this region.

Is there a reason for marking EFI_RESERVED_TYPE as NOMAP rather than simply
reserve them with memblock_reserve()?

-- 
Sincerely yours,
Mike.



Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-20 Thread Chester Lin
On Mon, Aug 19, 2019 at 05:56:51PM +0300, Ard Biesheuvel wrote:
> On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
> >
> > Hi Mike and Ard,
> >
> > On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > > (adding Mike)
> > > >
> > > > On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
> > > > >
> > > > > Hi Ard,
> > > > >
> > > > > On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > > > > > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel 
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hello Chester,
> > > > > > >
> > > > > > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > > > > > >
> > > > > > > > In some cases the arm32 efistub could fail to allocate memory 
> > > > > > > > for
> > > > > > > > uncompressed kernel. For example, we got the following error 
> > > > > > > > message when
> > > > > > > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] 
> > > > > > > > :
> > > > > > > >
> > > > > > > >   EFI stub: Booting Linux Kernel...
> > > > > > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed 
> > > > > > > > kernel.
> > > > > > > >   EFI stub: ERROR: Failed to relocate kernel
> > > > > > > >
> > > > > > > > After checking the EFI memory map we found that the first page 
> > > > > > > > [0 - 0xfff]
> > > > > > > > had been reserved by Raspberry Pi-2's firmware, and the efistub 
> > > > > > > > tried to
> > > > > > > > set the dram base at 0, which was actually in a reserved region.
> > > > > > > >
> > > > > > >
> > > > > > > This by itself is a violation of the Linux boot protocol for 
> > > > > > > 32-bit
> > > > > > > ARM when using the decompressor. The decompressor rounds down its 
> > > > > > > own
> > > > > > > base address to a multiple of 128 MB, and assumes the whole area 
> > > > > > > is
> > > > > > > available for the decompressed kernel and related data structures.
> > > > > > > (The first TEXT_OFFSET bytes are no longer used in practice, 
> > > > > > > which is
> > > > > > > why putting a reserved region of 4 KB bytes works at the moment, 
> > > > > > > but
> > > > > > > this is fragile). Note that the decompressor does not look at any 
> > > > > > > DT
> > > > > > > or EFI provided memory maps *at all*.
> > > > > > >
> > > > > > > So unfortunately, this is not something we can fix in the kernel, 
> > > > > > > but
> > > > > > > we should fix it in the bootloader or in GRUB, so it does not put 
> > > > > > > any
> > > > > > > reserved regions in the first 128 MB of memory,
> > > > > > >
> > > > > >
> > > > > > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > > > > > ARM boot protocol docs are unclear about whether this memory should 
> > > > > > be
> > > > > > used or not, but it is no longer used for its original purpose (page
> > > > > > tables), and the RPi loader already keeps data there.
> > > > > >
> > > > > > Can you check whether the following patch works for you?
> > > > > >
> > > > > > diff --git a/drivers/firmware/efi/libstub/Makefile
> > > > > > b/drivers/firmware/efi/libstub/Makefile
> > > > > > index 0460c7581220..ee0661ddb25b 100644
> > > > > > --- a/drivers/firmware/efi/libstub/Makefile
> > > > > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > > > > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o
> > > > > > string.o random.o \
> > > > > >
> > > > > >  lib-$(CONFIG_ARM)  += arm32-stub.o
> > > > > >  lib-$(CONFIG_ARM64)+= arm64-stub.o
> > > > > > +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > > > >  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > > > >
> > > > > >  #
> > > > > > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > > b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > > index e8f7aefb6813..66ff0c8ec269 100644
> > > > > > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > > @@ -204,7 +204,7 @@ efi_status_t
> > > > > > handle_kernel_image(efi_system_table_t *sys_table,
> > > > > >  * loaded. These assumptions are made by the decompressor,
> > > > > >  * before any memory map is available.
> > > > > >  */
> > > > > > -   dram_base = round_up(dram_base, SZ_128M);
> > > > > > +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> > > > > >
> > > > > > status = reserve_kernel_base(sys_table, dram_base, 
> > > > > > reserve_addr,
> > > > > >  reserve_size);
> > > > > >
> > > > >
> > > > > I tried your patch on rpi2 and got the following panic. Just a 
> > > > > reminder that I
> > > > > have replaced some log messages with ".." since it might be too 
> > > > > long to
> > > > > post all.
> > > > >
> > > >
> > > > OK. Good to know that this change helps you to get past the EFI stub 
> > > > boot issue.
> > > >
> > > > > In this case the kernel failed to 

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-19 Thread Ard Biesheuvel
On Mon, 19 Aug 2019 at 11:01, Chester Lin  wrote:
>
> Hi Mike and Ard,
>
> On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> > On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > > (adding Mike)
> > >
> > > On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
> > > >
> > > > Hi Ard,
> > > >
> > > > On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > > > > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel 
> > > > >  wrote:
> > > > > >
> > > > > > Hello Chester,
> > > > > >
> > > > > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > > > > >
> > > > > > > In some cases the arm32 efistub could fail to allocate memory for
> > > > > > > uncompressed kernel. For example, we got the following error 
> > > > > > > message when
> > > > > > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :
> > > > > > >
> > > > > > >   EFI stub: Booting Linux Kernel...
> > > > > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed 
> > > > > > > kernel.
> > > > > > >   EFI stub: ERROR: Failed to relocate kernel
> > > > > > >
> > > > > > > After checking the EFI memory map we found that the first page [0 
> > > > > > > - 0xfff]
> > > > > > > had been reserved by Raspberry Pi-2's firmware, and the efistub 
> > > > > > > tried to
> > > > > > > set the dram base at 0, which was actually in a reserved region.
> > > > > > >
> > > > > >
> > > > > > This by itself is a violation of the Linux boot protocol for 32-bit
> > > > > > ARM when using the decompressor. The decompressor rounds down its 
> > > > > > own
> > > > > > base address to a multiple of 128 MB, and assumes the whole area is
> > > > > > available for the decompressed kernel and related data structures.
> > > > > > (The first TEXT_OFFSET bytes are no longer used in practice, which 
> > > > > > is
> > > > > > why putting a reserved region of 4 KB bytes works at the moment, but
> > > > > > this is fragile). Note that the decompressor does not look at any DT
> > > > > > or EFI provided memory maps *at all*.
> > > > > >
> > > > > > So unfortunately, this is not something we can fix in the kernel, 
> > > > > > but
> > > > > > we should fix it in the bootloader or in GRUB, so it does not put 
> > > > > > any
> > > > > > reserved regions in the first 128 MB of memory,
> > > > > >
> > > > >
> > > > > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > > > > ARM boot protocol docs are unclear about whether this memory should be
> > > > > used or not, but it is no longer used for its original purpose (page
> > > > > tables), and the RPi loader already keeps data there.
> > > > >
> > > > > Can you check whether the following patch works for you?
> > > > >
> > > > > diff --git a/drivers/firmware/efi/libstub/Makefile
> > > > > b/drivers/firmware/efi/libstub/Makefile
> > > > > index 0460c7581220..ee0661ddb25b 100644
> > > > > --- a/drivers/firmware/efi/libstub/Makefile
> > > > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > > > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o
> > > > > string.o random.o \
> > > > >
> > > > >  lib-$(CONFIG_ARM)  += arm32-stub.o
> > > > >  lib-$(CONFIG_ARM64)+= arm64-stub.o
> > > > > +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > > >  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > > >
> > > > >  #
> > > > > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > index e8f7aefb6813..66ff0c8ec269 100644
> > > > > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > > @@ -204,7 +204,7 @@ efi_status_t
> > > > > handle_kernel_image(efi_system_table_t *sys_table,
> > > > >  * loaded. These assumptions are made by the decompressor,
> > > > >  * before any memory map is available.
> > > > >  */
> > > > > -   dram_base = round_up(dram_base, SZ_128M);
> > > > > +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> > > > >
> > > > > status = reserve_kernel_base(sys_table, dram_base, 
> > > > > reserve_addr,
> > > > >  reserve_size);
> > > > >
> > > >
> > > > I tried your patch on rpi2 and got the following panic. Just a reminder 
> > > > that I
> > > > have replaced some log messages with ".." since it might be too 
> > > > long to
> > > > post all.
> > > >
> > >
> > > OK. Good to know that this change helps you to get past the EFI stub boot 
> > > issue.
> > >
> > > > In this case the kernel failed to reserve cma, which should hit the 
> > > > issue of
> > > > memblock_limit=0x1000 as I had mentioned in my patch description. The 
> > > > first
> > > > block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not 
> > > > align
> > > > with PMD_SIZE so the cma reservation failed because the 
> > > > memblock.current_limit
> > > > was extremely low. That's why I expand 

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-19 Thread Chester Lin
Hi Mike and Ard,

On Thu, Aug 15, 2019 at 04:37:39PM +0300, Mike Rapoport wrote:
> On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> > (adding Mike)
> > 
> > On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
> > >
> > > Hi Ard,
> > >
> > > On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > > > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel  
> > > > wrote:
> > > > >
> > > > > Hello Chester,
> > > > >
> > > > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > > > >
> > > > > > In some cases the arm32 efistub could fail to allocate memory for
> > > > > > uncompressed kernel. For example, we got the following error 
> > > > > > message when
> > > > > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :
> > > > > >
> > > > > >   EFI stub: Booting Linux Kernel...
> > > > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed 
> > > > > > kernel.
> > > > > >   EFI stub: ERROR: Failed to relocate kernel
> > > > > >
> > > > > > After checking the EFI memory map we found that the first page [0 - 
> > > > > > 0xfff]
> > > > > > had been reserved by Raspberry Pi-2's firmware, and the efistub 
> > > > > > tried to
> > > > > > set the dram base at 0, which was actually in a reserved region.
> > > > > >
> > > > >
> > > > > This by itself is a violation of the Linux boot protocol for 32-bit
> > > > > ARM when using the decompressor. The decompressor rounds down its own
> > > > > base address to a multiple of 128 MB, and assumes the whole area is
> > > > > available for the decompressed kernel and related data structures.
> > > > > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > > > > why putting a reserved region of 4 KB bytes works at the moment, but
> > > > > this is fragile). Note that the decompressor does not look at any DT
> > > > > or EFI provided memory maps *at all*.
> > > > >
> > > > > So unfortunately, this is not something we can fix in the kernel, but
> > > > > we should fix it in the bootloader or in GRUB, so it does not put any
> > > > > reserved regions in the first 128 MB of memory,
> > > > >
> > > >
> > > > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > > > ARM boot protocol docs are unclear about whether this memory should be
> > > > used or not, but it is no longer used for its original purpose (page
> > > > tables), and the RPi loader already keeps data there.
> > > >
> > > > Can you check whether the following patch works for you?
> > > >
> > > > diff --git a/drivers/firmware/efi/libstub/Makefile
> > > > b/drivers/firmware/efi/libstub/Makefile
> > > > index 0460c7581220..ee0661ddb25b 100644
> > > > --- a/drivers/firmware/efi/libstub/Makefile
> > > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o
> > > > string.o random.o \
> > > >
> > > >  lib-$(CONFIG_ARM)  += arm32-stub.o
> > > >  lib-$(CONFIG_ARM64)+= arm64-stub.o
> > > > +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > >  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > > >
> > > >  #
> > > > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > index e8f7aefb6813..66ff0c8ec269 100644
> > > > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > > > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > > > @@ -204,7 +204,7 @@ efi_status_t
> > > > handle_kernel_image(efi_system_table_t *sys_table,
> > > >  * loaded. These assumptions are made by the decompressor,
> > > >  * before any memory map is available.
> > > >  */
> > > > -   dram_base = round_up(dram_base, SZ_128M);
> > > > +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> > > >
> > > > status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
> > > >  reserve_size);
> > > >
> > >
> > > I tried your patch on rpi2 and got the following panic. Just a reminder 
> > > that I
> > > have replaced some log messages with ".." since it might be too long 
> > > to
> > > post all.
> > >
> > 
> > OK. Good to know that this change helps you to get past the EFI stub boot 
> > issue.
> > 
> > > In this case the kernel failed to reserve cma, which should hit the issue 
> > > of
> > > memblock_limit=0x1000 as I had mentioned in my patch description. The 
> > > first
> > > block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not 
> > > align
> > > with PMD_SIZE so the cma reservation failed because the 
> > > memblock.current_limit
> > > was extremely low. That's why I expand the first reservation from 1 
> > > PAGESIZE to
> > > 1 PMD_SIZE in my patch in order to avoid this issue. Please kindly let me 
> > > know
> > > if any suggestion, thank you.
> 
>  
> > This looks like it is a separate issue. The memblock/cma code should
> > not choke on a reserved page of memory at 0x0.
> > 
> > Perhaps 

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-15 Thread Mike Rapoport
On Thu, Aug 15, 2019 at 02:32:50PM +0300, Ard Biesheuvel wrote:
> (adding Mike)
> 
> On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
> >
> > Hi Ard,
> >
> > On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel  
> > > wrote:
> > > >
> > > > Hello Chester,
> > > >
> > > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > > >
> > > > > In some cases the arm32 efistub could fail to allocate memory for
> > > > > uncompressed kernel. For example, we got the following error message 
> > > > > when
> > > > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :
> > > > >
> > > > >   EFI stub: Booting Linux Kernel...
> > > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> > > > >   EFI stub: ERROR: Failed to relocate kernel
> > > > >
> > > > > After checking the EFI memory map we found that the first page [0 - 
> > > > > 0xfff]
> > > > > had been reserved by Raspberry Pi-2's firmware, and the efistub tried 
> > > > > to
> > > > > set the dram base at 0, which was actually in a reserved region.
> > > > >
> > > >
> > > > This by itself is a violation of the Linux boot protocol for 32-bit
> > > > ARM when using the decompressor. The decompressor rounds down its own
> > > > base address to a multiple of 128 MB, and assumes the whole area is
> > > > available for the decompressed kernel and related data structures.
> > > > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > > > why putting a reserved region of 4 KB bytes works at the moment, but
> > > > this is fragile). Note that the decompressor does not look at any DT
> > > > or EFI provided memory maps *at all*.
> > > >
> > > > So unfortunately, this is not something we can fix in the kernel, but
> > > > we should fix it in the bootloader or in GRUB, so it does not put any
> > > > reserved regions in the first 128 MB of memory,
> > > >
> > >
> > > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > > ARM boot protocol docs are unclear about whether this memory should be
> > > used or not, but it is no longer used for its original purpose (page
> > > tables), and the RPi loader already keeps data there.
> > >
> > > Can you check whether the following patch works for you?
> > >
> > > diff --git a/drivers/firmware/efi/libstub/Makefile
> > > b/drivers/firmware/efi/libstub/Makefile
> > > index 0460c7581220..ee0661ddb25b 100644
> > > --- a/drivers/firmware/efi/libstub/Makefile
> > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o
> > > string.o random.o \
> > >
> > >  lib-$(CONFIG_ARM)  += arm32-stub.o
> > >  lib-$(CONFIG_ARM64)+= arm64-stub.o
> > > +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > >  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> > >
> > >  #
> > > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c
> > > b/drivers/firmware/efi/libstub/arm32-stub.c
> > > index e8f7aefb6813..66ff0c8ec269 100644
> > > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > > @@ -204,7 +204,7 @@ efi_status_t
> > > handle_kernel_image(efi_system_table_t *sys_table,
> > >  * loaded. These assumptions are made by the decompressor,
> > >  * before any memory map is available.
> > >  */
> > > -   dram_base = round_up(dram_base, SZ_128M);
> > > +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> > >
> > > status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
> > >  reserve_size);
> > >
> >
> > I tried your patch on rpi2 and got the following panic. Just a reminder 
> > that I
> > have replaced some log messages with ".." since it might be too long to
> > post all.
> >
> 
> OK. Good to know that this change helps you to get past the EFI stub boot 
> issue.
> 
> > In this case the kernel failed to reserve cma, which should hit the issue of
> > memblock_limit=0x1000 as I had mentioned in my patch description. The first
> > block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not align
> > with PMD_SIZE so the cma reservation failed because the 
> > memblock.current_limit
> > was extremely low. That's why I expand the first reservation from 1 
> > PAGESIZE to
> > 1 PMD_SIZE in my patch in order to avoid this issue. Please kindly let me 
> > know
> > if any suggestion, thank you.

 
> This looks like it is a separate issue. The memblock/cma code should
> not choke on a reserved page of memory at 0x0.
> 
> Perhaps Russell or Mike (cc'ed) have an idea how to address this?

Presuming that the last memblock dump comes from the end of
arm_memblock_init() with this memory map
 
memory[0x0] [0x-0x0fff], 0x1000 bytes flags: 0x4
memory[0x1] [0x1000-0x07ef5fff], 0x07ef5000 bytes 

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-15 Thread Chester Lin
Hi Ard,

On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel  wrote:
> >
> > Hello Chester,
> >
> > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > >
> > > In some cases the arm32 efistub could fail to allocate memory for
> > > uncompressed kernel. For example, we got the following error message when
> > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :
> > >
> > >   EFI stub: Booting Linux Kernel...
> > >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> > >   EFI stub: ERROR: Failed to relocate kernel
> > >
> > > After checking the EFI memory map we found that the first page [0 - 0xfff]
> > > had been reserved by Raspberry Pi-2's firmware, and the efistub tried to
> > > set the dram base at 0, which was actually in a reserved region.
> > >
> >
> > This by itself is a violation of the Linux boot protocol for 32-bit
> > ARM when using the decompressor. The decompressor rounds down its own
> > base address to a multiple of 128 MB, and assumes the whole area is
> > available for the decompressed kernel and related data structures.
> > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > why putting a reserved region of 4 KB bytes works at the moment, but
> > this is fragile). Note that the decompressor does not look at any DT
> > or EFI provided memory maps *at all*.
> >
> > So unfortunately, this is not something we can fix in the kernel, but
> > we should fix it in the bootloader or in GRUB, so it does not put any
> > reserved regions in the first 128 MB of memory,
> >
> 
> OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> ARM boot protocol docs are unclear about whether this memory should be
> used or not, but it is no longer used for its original purpose (page
> tables), and the RPi loader already keeps data there.
> 
> Can you check whether the following patch works for you?
> 
> diff --git a/drivers/firmware/efi/libstub/Makefile
> b/drivers/firmware/efi/libstub/Makefile
> index 0460c7581220..ee0661ddb25b 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o
> string.o random.o \
> 
>  lib-$(CONFIG_ARM)  += arm32-stub.o
>  lib-$(CONFIG_ARM64)+= arm64-stub.o
> +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
>  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> 
>  #
> diff --git a/drivers/firmware/efi/libstub/arm32-stub.c
> b/drivers/firmware/efi/libstub/arm32-stub.c
> index e8f7aefb6813..66ff0c8ec269 100644
> --- a/drivers/firmware/efi/libstub/arm32-stub.c
> +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> @@ -204,7 +204,7 @@ efi_status_t
> handle_kernel_image(efi_system_table_t *sys_table,
>  * loaded. These assumptions are made by the decompressor,
>  * before any memory map is available.
>  */
> -   dram_base = round_up(dram_base, SZ_128M);
> +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> 
> status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
>  reserve_size);
> 

I tried your patch on rpi2 and got the following panic. Just a reminder that I
have replaced some log messages with ".." since it might be too long to
post all.

In this case the kernel failed to reserve cma, which should hit the issue of
memblock_limit=0x1000 as I had mentioned in my patch description. The first
block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it did not align
with PMD_SIZE so the cma reservation failed because the memblock.current_limit
was extremely low. That's why I expanded the first reservation from 1 PAGE_SIZE
to 1 PMD_SIZE in my patch in order to avoid this issue. Please let me know if
you have any suggestions, thank you.

boot-log:


Loading Linux test ...
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
Uncompressing Linux... done, booting the kernel.
[0.00] Booting Linux on physical CPU 0xf00
[0.00] Linux version 5.2.1-lpae (chester@linux-8mug) (..)
[0.00] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=30c5387d
[0.00] CPU: div instructions available: patching division code
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[0.00] OF: fdt: Machine model: Raspberry Pi 2 Model B Rev 1.1
[0.00] printk: bootconsole [earlycon0] enabled
[0.00] Memory policy: Data cache writealloc
[0.00] efi: Getting EFI parameters from FDT:
[0.00] efi:   System Table: 0x3df757c0
[0.00] efi:   MemMap Address: 0x2c1c5040
[0.00] efi:   MemMap Size: 0x03c0
[0.00] efi:   MemMap Desc. Size: 0x0028
[0.00] efi:   MemMap Desc. Version: 0x0001

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-15 Thread Ard Biesheuvel
(adding Mike)

On Thu, 15 Aug 2019 at 14:28, Chester Lin  wrote:
>
> Hi Ard,
>
> On Thu, Aug 15, 2019 at 10:59:43AM +0300, Ard Biesheuvel wrote:
> > On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel  
> > wrote:
> > >
> > > Hello Chester,
> > >
> > > On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> > > >
> > > > In some cases the arm32 efistub could fail to allocate memory for
> > > > uncompressed kernel. For example, we got the following error message 
> > > > when
> > > > verifying EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04] :
> > > >
> > > >   EFI stub: Booting Linux Kernel...
> > > >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> > > >   EFI stub: ERROR: Failed to relocate kernel
> > > >
> > > > After checking the EFI memory map we found that the first page [0 - 
> > > > 0xfff]
> > > > had been reserved by Raspberry Pi-2's firmware, and the efistub tried to
> > > > set the dram base at 0, which was actually in a reserved region.
> > > >
> > >
> > > This by itself is a violation of the Linux boot protocol for 32-bit
> > > ARM when using the decompressor. The decompressor rounds down its own
> > > base address to a multiple of 128 MB, and assumes the whole area is
> > > available for the decompressed kernel and related data structures.
> > > (The first TEXT_OFFSET bytes are no longer used in practice, which is
> > > why putting a reserved region of 4 KB bytes works at the moment, but
> > > this is fragile). Note that the decompressor does not look at any DT
> > > or EFI provided memory maps *at all*.
> > >
> > > So unfortunately, this is not something we can fix in the kernel, but
> > > we should fix it in the bootloader or in GRUB, so it does not put any
> > > reserved regions in the first 128 MB of memory,
> > >
> >
> > OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
> > ARM boot protocol docs are unclear about whether this memory should be
> > used or not, but it is no longer used for its original purpose (page
> > tables), and the RPi loader already keeps data there.
> >
> > Can you check whether the following patch works for you?
> >
> > diff --git a/drivers/firmware/efi/libstub/Makefile
> > b/drivers/firmware/efi/libstub/Makefile
> > index 0460c7581220..ee0661ddb25b 100644
> > --- a/drivers/firmware/efi/libstub/Makefile
> > +++ b/drivers/firmware/efi/libstub/Makefile
> > @@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o
> > string.o random.o \
> >
> >  lib-$(CONFIG_ARM)  += arm32-stub.o
> >  lib-$(CONFIG_ARM64)+= arm64-stub.o
> > +CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> >  CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
> >
> >  #
> > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c
> > b/drivers/firmware/efi/libstub/arm32-stub.c
> > index e8f7aefb6813..66ff0c8ec269 100644
> > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > @@ -204,7 +204,7 @@ efi_status_t
> > handle_kernel_image(efi_system_table_t *sys_table,
> >  * loaded. These assumptions are made by the decompressor,
> >  * before any memory map is available.
> >  */
> > -   dram_base = round_up(dram_base, SZ_128M);
> > +   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;
> >
> > status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
> >  reserve_size);
> >
>
> > I tried your patch on rpi2 and got the following panic. Note that I have
> > replaced some log messages with ".." because the full log is too long to
> > post.
>

OK. Good to know that this change helps you to get past the EFI stub boot issue.

> In this case the kernel failed to reserve CMA, which appears to be the
> memblock_limit=0x1000 issue I mentioned in my patch description. The first
> block [0-0xfff] was scanned in adjust_lowmem_bounds(), but it is not aligned
> to PMD_SIZE, so the CMA reservation failed because memblock.current_limit
> was extremely low. That is why my patch expands the first reservation from
> one page to one PMD_SIZE, in order to avoid this issue. Please let me know
> if you have any suggestions. Thank you.
>

This looks like it is a separate issue. The memblock/cma code should
not choke on a reserved page of memory at 0x0.

Perhaps Russell or Mike (cc'ed) have an idea how to address this?



> boot-log:
> 
>
> Loading Linux test ...
> EFI stub: Booting Linux Kernel...
> EFI stub: Using DTB from configuration table
> EFI stub: Exiting boot services and installing virtual address map...
> Uncompressing Linux... done, booting the kernel.
> [    0.000000] Booting Linux on physical CPU 0xf00
> [    0.000000] Linux version 5.2.1-lpae (chester@linux-8mug) (..)
> [    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=30c5387d
> [    0.000000] CPU: div instructions available: patching division code
> [    0.000000] CPU: PIPT / VIPT 

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-15 Thread Ard Biesheuvel
On Sun, 4 Aug 2019 at 10:57, Ard Biesheuvel  wrote:
>
> Hello Chester,
>
> On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> >
> > In some cases the arm32 efistub could fail to allocate memory for the
> > uncompressed kernel. For example, we got the following error message when
> > verifying the EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04]:
> >
> >   EFI stub: Booting Linux Kernel...
> >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> >   EFI stub: ERROR: Failed to relocate kernel
> >
> > After checking the EFI memory map we found that the first page [0 - 0xfff]
> > had been reserved by Raspberry Pi-2's firmware, and the efistub tried to
> > set the dram base at 0, which was actually in a reserved region.
> >
>
> This by itself is a violation of the Linux boot protocol for 32-bit
> ARM when using the decompressor. The decompressor rounds down its own
> base address to a multiple of 128 MB, and assumes the whole area is
> available for the decompressed kernel and related data structures.
> (The first TEXT_OFFSET bytes are no longer used in practice, which is
> why putting a reserved region of 4 KB works at the moment, but
> this is fragile). Note that the decompressor does not look at any DT
> or EFI provided memory maps *at all*.
>
> So unfortunately, this is not something we can fix in the kernel, but
> we should fix it in the bootloader or in GRUB, so it does not put any
> reserved regions in the first 128 MB of memory.
>

OK, perhaps we can fix this by taking TEXT_OFFSET into account. The
ARM boot protocol docs are unclear about whether this memory should be
used or not, but it is no longer used for its original purpose (page
tables), and the RPi loader already keeps data there.

Can you check whether the following patch works for you?

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 0460c7581220..ee0661ddb25b 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -52,6 +52,7 @@ lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o random.o \

 lib-$(CONFIG_ARM)  += arm32-stub.o
 lib-$(CONFIG_ARM64)+= arm64-stub.o
+CFLAGS_arm32-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 CFLAGS_arm64-stub.o:= -DTEXT_OFFSET=$(TEXT_OFFSET)

 #
diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
index e8f7aefb6813..66ff0c8ec269 100644
--- a/drivers/firmware/efi/libstub/arm32-stub.c
+++ b/drivers/firmware/efi/libstub/arm32-stub.c
@@ -204,7 +204,7 @@ efi_status_t handle_kernel_image(efi_system_table_t *sys_table,
 * loaded. These assumptions are made by the decompressor,
 * before any memory map is available.
 */
-   dram_base = round_up(dram_base, SZ_128M);
+   dram_base = round_up(dram_base, SZ_128M) + TEXT_OFFSET;

status = reserve_kernel_base(sys_table, dram_base, reserve_addr,
 reserve_size);
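
To make the arithmetic concrete, here is a minimal standalone sketch of what
the changed line computes. It is not the stub code itself: round_up_pow2()
and stub_kernel_base() are hypothetical helpers, and TEXT_OFFSET is assumed
to be 0x8000, which in reality depends on the platform configuration.

```c
#include <stdint.h>

#define SZ_128M     0x08000000u  /* 128 MiB */
#define TEXT_OFFSET 0x00008000u  /* assumed value, platform dependent */

/* Round x up to the next multiple of align (align must be a power of two). */
static uint32_t round_up_pow2(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: the address the stub would try to reserve with
 * the change above, i.e. the 128 MiB aligned DRAM base plus TEXT_OFFSET.
 */
static uint32_t stub_kernel_base(uint32_t dram_base)
{
	return round_up_pow2(dram_base, SZ_128M) + TEXT_OFFSET;
}
```

Under these assumptions a DRAM base of 0 yields 0x8000, which lies above the
reserved page [0 - 0xfff] in the memory map quoted below.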

>
> >   grub> lsefimmap
> >   Type      Physical start  - end #Pages     Size Attributes
> >   reserved  00000000-00000fff 00000001      4KiB WB
> >   conv-mem  00001000-07ef5fff 00007ef5 130004KiB WB
> >   RT-data   07ef6000-07f09fff 00000014     80KiB RT WB
> >   conv-mem  07f0a000-2d871fff 00025968 615840KiB WB
> >   .
> >
> > To avoid a reserved address, we have to ignore the memory regions that are
> > marked as EFI_RESERVED_TYPE, so that only conventional memory regions can
> > be chosen. If the region before the kernel base is unaligned, it is
> > marked as EFI_RESERVED_TYPE so that the kernel ignores it and
> > memblock_limit does not get stuck at a very low address such as 0x1000.
> >

This is a separate issue, so it should be handled in a separate patch.
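
For reference, the alignment arithmetic described in the quoted paragraph
could be sketched roughly as follows. This is a standalone illustration, not
the code from the patch: place_kernel() and struct placement are hypothetical
names, and PMD_SIZE is assumed to be 2 MiB.

```c
#include <stdint.h>

#define PMD_SIZE 0x00200000ULL  /* assumed: 2 MiB */

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Hypothetical result type: where the kernel goes and what is left in front. */
struct placement {
	uint64_t kernel_base;  /* first PMD-aligned address in the region */
	uint64_t spare;        /* unaligned bytes between region start and base */
};

/*
 * Given the start of a conventional memory region, pick a PMD-aligned
 * kernel base and report the size of the unaligned gap before it. That
 * gap is the part the commit message proposes to mark as reserved.
 */
static struct placement place_kernel(uint64_t region_start)
{
	struct placement p;

	p.kernel_base = round_up_pow2(region_start, PMD_SIZE);
	p.spare = p.kernel_base - region_start;
	return p;
}
```

With the conv-mem region starting at 0x1000, this gives a kernel base of
0x200000 and a spare area of 0x1ff000 bytes in front of it.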

> > Signed-off-by: Chester Lin 
> > ---
> >  arch/arm/mm/mmu.c |  3 ++
> >  drivers/firmware/efi/libstub/arm32-stub.c | 43 ++-
> >  2 files changed, 37 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > index f3ce34113f89..909b11ba48d8 100644
> > --- a/arch/arm/mm/mmu.c
> > +++ b/arch/arm/mm/mmu.c
> > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > phys_addr_t block_start = reg->base;
> > phys_addr_t block_end = reg->base + reg->size;
> >
> > +   if (memblock_is_nomap(reg))
> > +   continue;
> > +
> > if (reg->base < vmalloc_limit) {
> > if (block_end > lowmem_limit)
> > /*
> > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> > index e8f7aefb6813..10d33d36df00 100644
> > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > @@ -128,7 +128,7 @@ static 

RE: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-14 Thread Guillaume Gardet


> -----Original Message-----
> From: Ard Biesheuvel 
> Sent: 04 August 2019 09:57
> To: Chester Lin ; li...@armlinux.org.uk
> Cc: a...@linux-foundation.org; r...@linux.ibm.com; ren_...@c-sky.com;
> Juergen Gross ; ge...@linux-m68k.org; mi...@kernel.org;
> linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org; linux-
> e...@vger.kernel.org; Guillaume Gardet ; Joey Lee
> ; Gary Lin 
> Subject: Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel
> base
>
> Hello Chester,
>
> On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
> >
> > In some cases the arm32 efistub could fail to allocate memory for the
> > uncompressed kernel. For example, we got the following error message
> > when verifying the EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04]:
> >
> >   EFI stub: Booting Linux Kernel...
> >   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
> >   EFI stub: ERROR: Failed to relocate kernel
> >
> > After checking the EFI memory map we found that the first page [0 -
> > 0xfff] had been reserved by Raspberry Pi-2's firmware, and the efistub
> > tried to set the dram base at 0, which was actually in a reserved region.
> >
>
> This by itself is a violation of the Linux boot protocol for 32-bit ARM when
> using the decompressor. The decompressor rounds down its own base address to
> a multiple of 128 MB, and assumes the whole area is available for the
> decompressed kernel and related data structures.
> (The first TEXT_OFFSET bytes are no longer used in practice, which is why
> putting a reserved region of 4 KB works at the moment, but this is fragile).
> Note that the decompressor does not look at any DT or EFI provided memory
> maps *at all*.
>
> So unfortunately, this is not something we can fix in the kernel, but we
> should fix it in the bootloader or in GRUB, so it does not put any reserved
> regions in the first 128 MB of memory.

FYI, this is in Raspberry Pi firmware: 
https://github.com/raspberrypi/firmware/issues/1199


>
>
> >   grub> lsefimmap
> >   Type      Physical start  - end #Pages     Size Attributes
> >   reserved  00000000-00000fff 00000001      4KiB WB
> >   conv-mem  00001000-07ef5fff 00007ef5 130004KiB WB
> >   RT-data   07ef6000-07f09fff 00000014     80KiB RT WB
> >   conv-mem  07f0a000-2d871fff 00025968 615840KiB WB
> >   .
> >
> > To avoid a reserved address, we have to ignore the memory regions
> > that are marked as EFI_RESERVED_TYPE, so that only conventional memory
> > regions can be chosen. If the region before the kernel base is
> > unaligned, it is marked as EFI_RESERVED_TYPE so that the kernel
> > ignores it and memblock_limit does not get stuck at a very low
> > address such as 0x1000.
> >
> > Signed-off-by: Chester Lin 
> > ---
> >  arch/arm/mm/mmu.c |  3 ++
> >  drivers/firmware/efi/libstub/arm32-stub.c | 43 ++-
> >  2 files changed, 37 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > index f3ce34113f89..909b11ba48d8 100644
> > --- a/arch/arm/mm/mmu.c
> > +++ b/arch/arm/mm/mmu.c
> > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > phys_addr_t block_start = reg->base;
> > phys_addr_t block_end = reg->base + reg->size;
> >
> > +   if (memblock_is_nomap(reg))
> > +   continue;
> > +
> > if (reg->base < vmalloc_limit) {
> > if (block_end > lowmem_limit)
> > /*
> > diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> > index e8f7aefb6813..10d33d36df00 100644
> > --- a/drivers/firmware/efi/libstub/arm32-stub.c
> > +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> > @@ -128,7 +128,7 @@ static efi_status_t reserve_kernel_base(efi_system_table_t *sys_table_arg,
> >
> > for (l = 0; l < map_size; l += desc_size) {
> > efi_memory_desc_t *desc;
> > -   u64 start, end;
> > +   u64 start, end, spare, kernel_base;
> >
> > desc = (void *)memory_map + l;
> > start = desc->phys_addr;
> > @@ -144,27 +144,52 @@ static efi_status_t reserve_kernel_base(efi_system_table_t *sys_table_arg,
> > case EFI_BOOT_SERVICES_DATA:
> >

Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

2019-08-04 Thread Ard Biesheuvel
Hello Chester,

On Fri, 2 Aug 2019 at 08:40, Chester Lin  wrote:
>
> In some cases the arm32 efistub could fail to allocate memory for the
> uncompressed kernel. For example, we got the following error message when
> verifying the EFI stub on Raspberry Pi-2 [kernel-5.2.1 + grub-2.04]:
>
>   EFI stub: Booting Linux Kernel...
>   EFI stub: ERROR: Unable to allocate memory for uncompressed kernel.
>   EFI stub: ERROR: Failed to relocate kernel
>
> After checking the EFI memory map we found that the first page [0 - 0xfff]
> had been reserved by Raspberry Pi-2's firmware, and the efistub tried to
> set the dram base at 0, which was actually in a reserved region.
>

This by itself is a violation of the Linux boot protocol for 32-bit
ARM when using the decompressor. The decompressor rounds down its own
base address to a multiple of 128 MB, and assumes the whole area is
available for the decompressed kernel and related data structures.
(The first TEXT_OFFSET bytes are no longer used in practice, which is
why putting a reserved region of 4 KB works at the moment, but
this is fragile). Note that the decompressor does not look at any DT
or EFI provided memory maps *at all*.

So unfortunately, this is not something we can fix in the kernel, but
we should fix it in the bootloader or in GRUB, so it does not put any
reserved regions in the first 128 MB of memory.
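
As a rough illustration of that assumption (this is not the decompressor's
actual code): window_base() and overlaps_assumed_free() below are
hypothetical helpers, TEXT_OFFSET is assumed to be 0x8000, and the
decompressor address used in the example is made up.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_128M     0x08000000u  /* 128 MiB */
#define TEXT_OFFSET 0x00008000u  /* assumed value, platform dependent */

/* Round the decompressor's own address down to a multiple of 128 MiB. */
static uint32_t window_base(uint32_t decompressor_addr)
{
	return decompressor_addr & ~(SZ_128M - 1);
}

/*
 * Return true if the inclusive range [start, end] intersects the part of
 * the 128 MiB window that is assumed to be free, i.e. everything from
 * base + TEXT_OFFSET up to the end of the window.
 */
static bool overlaps_assumed_free(uint32_t start, uint32_t end,
				  uint32_t decompressor_addr)
{
	uint64_t base = window_base(decompressor_addr);
	uint64_t lo = base + TEXT_OFFSET;  /* first byte assumed free */
	uint64_t hi = base + SZ_128M;      /* one past the window */

	return start < hi && end >= lo;
}
```

For a decompressor running at, say, 0x07f0a000, the window starts at 0, so a
reserved page at [0 - 0xfff] sits entirely inside the first TEXT_OFFSET bytes
and is not touched, whereas a reservation at or above 0x8000 would be.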


>   grub> lsefimmap
>   Type      Physical start  - end #Pages     Size Attributes
>   reserved  00000000-00000fff 00000001      4KiB WB
>   conv-mem  00001000-07ef5fff 00007ef5 130004KiB WB
>   RT-data   07ef6000-07f09fff 00000014     80KiB RT WB
>   conv-mem  07f0a000-2d871fff 00025968 615840KiB WB
>   .
>
> To avoid a reserved address, we have to ignore the memory regions that are
> marked as EFI_RESERVED_TYPE, so that only conventional memory regions can
> be chosen. If the region before the kernel base is unaligned, it is
> marked as EFI_RESERVED_TYPE so that the kernel ignores it and
> memblock_limit does not get stuck at a very low address such as 0x1000.
>
> Signed-off-by: Chester Lin 
> ---
>  arch/arm/mm/mmu.c |  3 ++
>  drivers/firmware/efi/libstub/arm32-stub.c | 43 ++-
>  2 files changed, 37 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index f3ce34113f89..909b11ba48d8 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> phys_addr_t block_start = reg->base;
> phys_addr_t block_end = reg->base + reg->size;
>
> +   if (memblock_is_nomap(reg))
> +   continue;
> +
> if (reg->base < vmalloc_limit) {
> if (block_end > lowmem_limit)
> /*
> diff --git a/drivers/firmware/efi/libstub/arm32-stub.c b/drivers/firmware/efi/libstub/arm32-stub.c
> index e8f7aefb6813..10d33d36df00 100644
> --- a/drivers/firmware/efi/libstub/arm32-stub.c
> +++ b/drivers/firmware/efi/libstub/arm32-stub.c
> @@ -128,7 +128,7 @@ static efi_status_t reserve_kernel_base(efi_system_table_t *sys_table_arg,
>
> for (l = 0; l < map_size; l += desc_size) {
> efi_memory_desc_t *desc;
> -   u64 start, end;
> +   u64 start, end, spare, kernel_base;
>
> desc = (void *)memory_map + l;
> start = desc->phys_addr;
> @@ -144,27 +144,52 @@ static efi_status_t reserve_kernel_base(efi_system_table_t *sys_table_arg,
> case EFI_BOOT_SERVICES_DATA:
> /* Ignore types that are released to the OS anyway */
> continue;
> -
> +   case EFI_RESERVED_TYPE:
> +   /* Ignore reserved regions */
> +   continue;
> case EFI_CONVENTIONAL_MEMORY:
> /*
>  * Reserve the intersection between this entry and the
>  * region.
>  */
> start = max(start, (u64)dram_base);
> -   end = min(end, (u64)dram_base + MAX_UNCOMP_KERNEL_SIZE);
> +   kernel_base = round_up(start, PMD_SIZE);
> +   spare = kernel_base - start;
> +   end = min(end, kernel_base + MAX_UNCOMP_KERNEL_SIZE);
> +
> +   status = efi_call_early(allocate_pages,
> +   EFI_ALLOCATE_ADDRESS,
> +   EFI_LOADER_DATA,
> +   MAX_UNCOMP_KERNEL_SIZE / EFI_PAGE_SIZE,
> +   &kernel_base);
> +   if (status != EFI_SUCCESS) {
> +   pr_efi_err(sys_table_arg,
> +