Re: (overlapping) lmb allocations during boot

2021-10-31 Thread Michael Walle

Am 2021-10-31 18:36, schrieb Mark Kettenis:

Date: Sun, 31 Oct 2021 16:19:56 +0100
From: Michael Walle 

..


>> > Unless CONFIG_EFI_LOADER is defined.  Then it relocates the spin table
>> > to memory allocated using efi_allocate_pages().  But that function
>> > only looks at the EFI memory map to figure out what memory is
>> > available.  So I suspect that it might hand out the same memory as
>> > lmb_alloc().  It all looks a bit broken to me...
>>
>> Yes, that is actually my code ;) The kontron_sl28 is the only
>> board which uses spin tables as far as I know. It doesn't support
>> PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP
>> it uses spin tables. The relocation code work arounds a problem
>> with the reserved EFI code, see [1].
>>
>> And yes, it actually is broken. But so might be every code which
>> is using the efi_allocate_pages(), no? LMB isn't global, but is
>> just initialized at different places. Like before a linux kernel
>> is booted or when you load a file (?). And everytime the whole
>> memory is added, and then different regions are carved out (see
>> above).
>
> Right.  So it mostly works because U-Boot either ends up using a code
> path where it uses LMB or a path where it ends up using the EFI memory
> map, but never both.  Except if it hits your layerscape code.
>
> I think the right way to solve this is to allocate the memory for the
> spin table through some other means and use efi_add_memory_map() to
> reserve that memory like what is done in board/raspberrypi/rpi/rpi.c.

Mh. I don't think this will work, just because lmb is quite stupid
and has just the three steps to carve out memory:
  (1) arch_lmb_reserve()
  (2) board_lmb_reserve()
  (3) boot_fdt_add_mem_rsv_regions()

(1) will carve out u-boot code and data (2) is usually not used and
(3) doesn't really work IMHO, because the reserved regions are added
much later in the bootm flow.

So what would "other means" be? I could allocate the memory somehow,
but how can I communicate that to lmb, so it would be excluded there.


It could be a simple:

  memalign(PAGE_SIZE, PAGE_SIZE);

This would allocate memory from the heap which should be excluded from
what lmb hands out.


Thanks, that was easy ;) Turns out the heap is between the stack and the
uboot code which is reserved by lmb (arch_lmb_reserve()).

-michael


Re: (overlapping) lmb allocations during boot

2021-10-31 Thread Mark Kettenis
> Date: Sun, 31 Oct 2021 16:19:56 +0100
> From: Michael Walle 
> 
> Am 2021-10-31 13:51, schrieb Mark Kettenis:
> >> Date: Sun, 31 Oct 2021 13:04:40 +0100
> >> From: Michael Walle 
> >> 
> >> Am 2021-10-31 12:44, schrieb Mark Kettenis:
> >> >> Date: Sun, 31 Oct 2021 11:43:38 +0100
> >> >> From: Michael Walle 
> >> >>
> >> >> Hi,
> >> >>
> >> >> I sometimes see a corrupted initrd during kernel boot on my board
> >> >> (kontron_sl28). Debugging showed that in this case the spin table
> >> >> for the secondary CPUs overlaps with the lmb initrd allocations
> >> >> (initrd_high is set).
> >> >>
> >> >> I had a look at how the fdt and initrd are relocated (if enabled). In
> >> >> summary, they use lmb_alloc() which will first add all available
> >> >> memory
> >> >> and then carve out reserved regions. There are calls for arch
> >> >> (arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The
> >> >> interesting thing is that, there is also code to carve out any
> >> >> reserved regions which were added to the fdt earlier
> >> >> (boot_fdt_add_mem_rsv_regions()). The problem here is, that the DT
> >> >> fixups,
> >> >> which might add the reserved regions are called just before jumping to
> >> >> linux. Thus both the allocation for the fdt (if fdt_high is set) and
> >> >> the ramdisk (if initrd_high is set) will ignore any reserved memory
> >> >> regions.
> >> >>
> >> >> Unfortunately, I don't see any good way to fix this. You'd need all
> >> >> the
> >> >> DT fixups before we can initialize the lmb. Also, I don't know if this
> >> >> will affect any other areas; probably I'm the only one, who reserves
> >> >> an
> >> >> area which is outside of the u-boot code and data segment.
> >> >>
> >> >> A hackish way would be to carve out the spin_table code in
> >> >> board_lmb_reserve(). But meh..
> >> >
> >> > The spin table is embedded in the u-boot binary itself isn't it?  But
> >> > the memory occupied by the u-boot should already be reserved...
> >> 
> >> Yes it is. As long as it doen't overlap with the 64k EFI code page.
> >> See below.
> > 
> > Yes, I saw the comment.  Which confuses me.  The overlap itself
> > shouldn't be a problem since both the spin table and the EFI runtime
> > code page should be preserved.  But I guess that the overlap causes
> > problems later on because attempting to reserve overlapping areas
> > fails?
> 
> TBH I don't remember the specifics anymore, my first try to fix this
> problem was here [1]. But the conclusion was that reserving 64k for
> the EFI alone would increase the binary size a lot. Thus Heinrich
> suggested to relocate the spin table code.

Ah right, the overlap is somewhat problematic since the EFI memory
types don't match.  Now that in itself isn't an issue as long as the
memory page attributes are the same.  As far as I can tell, both the
spin table and the EFI runtime services code would use EFI_MEMORY_WB,
so that would be fine.

> >> > Unless CONFIG_EFI_LOADER is defined.  Then it relocates the spin table
> >> > to memory allocated using efi_allocate_pages().  But that function
> >> > only looks at the EFI memory map to figure out what memory is
> >> > available.  So I suspect that it might hand out the same memory as
> >> > lmb_alloc().  It all looks a bit broken to me...
> >> 
> >> Yes, that is actually my code ;) The kontron_sl28 is the only
> >> board which uses spin tables as far as I know. It doesn't support
> >> PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP
> >> it uses spin tables. The relocation code work arounds a problem
> >> with the reserved EFI code, see [1].
> >> 
> >> And yes, it actually is broken. But so might be every code which
> >> is using the efi_allocate_pages(), no? LMB isn't global, but is
> >> just initialized at different places. Like before a linux kernel
> >> is booted or when you load a file (?). And everytime the whole
> >> memory is added, and then different regions are carved out (see
> >> above).
> > 
> > Right.  So it mostly works because U-Boot either ends up using a code
> > path where it uses LMB or a path where it ends up using the EFI memory
> > map, but never both.  Except if it hits your layerscape code.
> > 
> > I think the right way to solve this is to allocate the memory for the
> > spin table through some other means and use efi_add_memory_map() to
> > reserve that memory like what is done in board/raspberrypi/rpi/rpi.c.
> 
> Mh. I don't think this will work, just because lmb is quite stupid
> and has just the three steps to carve out memory:
>   (1) arch_lmb_reserve()
>   (2) board_lmb_reserve()
>   (3) boot_fdt_add_mem_rsv_regions()
> 
> (1) will carve out u-boot code and data (2) is usually not used and
> (3) doesn't really work IMHO, because the reserved regions are added
> much later in the bootm flow.
> 
> So what would "other means" be? I could allocate the memory somehow,
> but how can I communicate that to lmb, so it would be excluded there.

It could be a simple:

  

Re: (overlapping) lmb allocations during boot

2021-10-31 Thread Michael Walle

Am 2021-10-31 13:51, schrieb Mark Kettenis:

Date: Sun, 31 Oct 2021 13:04:40 +0100
From: Michael Walle 

Am 2021-10-31 12:44, schrieb Mark Kettenis:
>> Date: Sun, 31 Oct 2021 11:43:38 +0100
>> From: Michael Walle 
>>
>> Hi,
>>
>> I sometimes see a corrupted initrd during kernel boot on my board
>> (kontron_sl28). Debugging showed that in this case the spin table
>> for the secondary CPUs overlaps with the lmb initrd allocations
>> (initrd_high is set).
>>
>> I had a look at how the fdt and initrd are relocated (if enabled). In
>> summary, they use lmb_alloc() which will first add all available
>> memory
>> and then carve out reserved regions. There are calls for arch
>> (arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The
>> interesting thing is that, there is also code to carve out any
>> reserved regions which were added to the fdt earlier
>> (boot_fdt_add_mem_rsv_regions()). The problem here is, that the DT
>> fixups,
>> which might add the reserved regions are called just before jumping to
>> linux. Thus both the allocation for the fdt (if fdt_high is set) and
>> the ramdisk (if initrd_high is set) will ignore any reserved memory
>> regions.
>>
>> Unfortunately, I don't see any good way to fix this. You'd need all
>> the
>> DT fixups before we can initialize the lmb. Also, I don't know if this
>> will affect any other areas; probably I'm the only one, who reserves
>> an
>> area which is outside of the u-boot code and data segment.
>>
>> A hackish way would be to carve out the spin_table code in
>> board_lmb_reserve(). But meh..
>
> The spin table is embedded in the u-boot binary itself isn't it?  But
> the memory occupied by the u-boot should already be reserved...

Yes it is. As long as it doen't overlap with the 64k EFI code page.
See below.


Yes, I saw the comment.  Which confuses me.  The overlap itself
shouldn't be a problem since both the spin table and the EFI runtime
code page should be preserved.  But I guess that the overlap causes
problems later on because attempting to reserve overlapping areas
fails?


TBH I don't remember the specifics anymore, my first try to fix this
problem was here [1]. But the conclusion was that reserving 64k for
the EFI alone would increase the binary size a lot. Thus Heinrich
suggested to relocate the spin table code.


> Unless CONFIG_EFI_LOADER is defined.  Then it relocates the spin table
> to memory allocated using efi_allocate_pages().  But that function
> only looks at the EFI memory map to figure out what memory is
> available.  So I suspect that it might hand out the same memory as
> lmb_alloc().  It all looks a bit broken to me...

Yes, that is actually my code ;) The kontron_sl28 is the only
board which uses spin tables as far as I know. It doesn't support
PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP
it uses spin tables. The relocation code work arounds a problem
with the reserved EFI code, see [1].

And yes, it actually is broken. But so might be every code which
is using the efi_allocate_pages(), no? LMB isn't global, but is
just initialized at different places. Like before a linux kernel
is booted or when you load a file (?). And everytime the whole
memory is added, and then different regions are carved out (see
above).


Right.  So it mostly works because U-Boot either ends up using a code
path where it uses LMB or a path where it ends up using the EFI memory
map, but never both.  Except if it hits your layerscape code.

I think the right way to solve this is to allocate the memory for the
spin table through some other means and use efi_add_memory_map() to
reserve that memory like what is done in board/raspberrypi/rpi/rpi.c.


Mh. I don't think this will work, just because lmb is quite stupid
and has just the three steps to carve out memory:
 (1) arch_lmb_reserve()
 (2) board_lmb_reserve()
 (3) boot_fdt_add_mem_rsv_regions()

(1) will carve out u-boot code and data (2) is usually not used and
(3) doesn't really work IMHO, because the reserved regions are added
much later in the bootm flow.

So what would "other means" be? I could allocate the memory somehow,
but how can I communicate that to lmb, so it would be excluded there.


Or don't relocate the spin table and fix the issue with overlapping
reserves.


Which hadn't much appeal ;) And doesn't fix (fundamental) lmb problem.

-michael

[1] 
https://lore.kernel.org/u-boot/20200514123831.30157-2-mich...@walle.cc/


Re: (overlapping) lmb allocations during boot

2021-10-31 Thread Mark Kettenis
> Date: Sun, 31 Oct 2021 13:04:40 +0100
> From: Michael Walle 
> 
> Am 2021-10-31 12:44, schrieb Mark Kettenis:
> >> Date: Sun, 31 Oct 2021 11:43:38 +0100
> >> From: Michael Walle 
> >> 
> >> Hi,
> >> 
> >> I sometimes see a corrupted initrd during kernel boot on my board
> >> (kontron_sl28). Debugging showed that in this case the spin table
> >> for the secondary CPUs overlaps with the lmb initrd allocations
> >> (initrd_high is set).
> >> 
> >> I had a look at how the fdt and initrd are relocated (if enabled). In
> >> summary, they use lmb_alloc() which will first add all available 
> >> memory
> >> and then carve out reserved regions. There are calls for arch
> >> (arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The
> >> interesting thing is that, there is also code to carve out any
> >> reserved regions which were added to the fdt earlier
> >> (boot_fdt_add_mem_rsv_regions()). The problem here is, that the DT
> >> fixups,
> >> which might add the reserved regions are called just before jumping to
> >> linux. Thus both the allocation for the fdt (if fdt_high is set) and
> >> the ramdisk (if initrd_high is set) will ignore any reserved memory
> >> regions.
> >> 
> >> Unfortunately, I don't see any good way to fix this. You'd need all 
> >> the
> >> DT fixups before we can initialize the lmb. Also, I don't know if this
> >> will affect any other areas; probably I'm the only one, who reserves 
> >> an
> >> area which is outside of the u-boot code and data segment.
> >> 
> >> A hackish way would be to carve out the spin_table code in
> >> board_lmb_reserve(). But meh..
> > 
> > The spin table is embedded in the u-boot binary itself isn't it?  But
> > the memory occupied by the u-boot should already be reserved...
> 
> Yes it is. As long as it doen't overlap with the 64k EFI code page.
> See below.

Yes, I saw the comment.  Which confuses me.  The overlap itself
shouldn't be a problem since both the spin table and the EFI runtime
code page should be preserved.  But I guess that the overlap causes
problems later on because attempting to reserve overlapping areas
fails?

> > Unless CONFIG_EFI_LOADER is defined.  Then it relocates the spin table
> > to memory allocated using efi_allocate_pages().  But that function
> > only looks at the EFI memory map to figure out what memory is
> > available.  So I suspect that it might hand out the same memory as
> > lmb_alloc().  It all looks a bit broken to me...
> 
> Yes, that is actually my code ;) The kontron_sl28 is the only
> board which uses spin tables as far as I know. It doesn't support
> PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP
> it uses spin tables. The relocation code work arounds a problem
> with the reserved EFI code, see [1].
> 
> And yes, it actually is broken. But so might be every code which
> is using the efi_allocate_pages(), no? LMB isn't global, but is
> just initialized at different places. Like before a linux kernel
> is booted or when you load a file (?). And everytime the whole
> memory is added, and then different regions are carved out (see
> above).

Right.  So it mostly works because U-Boot either ends up using a code
path where it uses LMB or a path where it ends up using the EFI memory
map, but never both.  Except if it hits your layerscape code.

I think the right way to solve this is to allocate the memory for the
spin table through some other means and use efi_add_memory_map() to
reserve that memory like what is done in board/raspberrypi/rpi/rpi.c.

Or don't relocate the spin table and fix the issue with overlapping
reserves.

Cheers,

Mark


Re: (overlapping) lmb allocations during boot

2021-10-31 Thread Michael Walle

Am 2021-10-31 12:44, schrieb Mark Kettenis:

Date: Sun, 31 Oct 2021 11:43:38 +0100
From: Michael Walle 

Hi,

I sometimes see a corrupted initrd during kernel boot on my board
(kontron_sl28). Debugging showed that in this case the spin table
for the secondary CPUs overlaps with the lmb initrd allocations
(initrd_high is set).

I had a look at how the fdt and initrd are relocated (if enabled). In
summary, they use lmb_alloc() which will first add all available 
memory

and then carve out reserved regions. There are calls for arch
(arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The
interesting thing is that, there is also code to carve out any
reserved regions which were added to the fdt earlier
(boot_fdt_add_mem_rsv_regions()). The problem here is, that the DT
fixups,
which might add the reserved regions are called just before jumping to
linux. Thus both the allocation for the fdt (if fdt_high is set) and
the ramdisk (if initrd_high is set) will ignore any reserved memory
regions.

Unfortunately, I don't see any good way to fix this. You'd need all 
the

DT fixups before we can initialize the lmb. Also, I don't know if this
will affect any other areas; probably I'm the only one, who reserves 
an

area which is outside of the u-boot code and data segment.

A hackish way would be to carve out the spin_table code in
board_lmb_reserve(). But meh..


The spin table is embedded in the u-boot binary itself isn't it?  But
the memory occupied by the u-boot should already be reserved...


Yes it is. As long as it doen't overlap with the 64k EFI code page.
See below.


Unless CONFIG_EFI_LOADER is defined.  Then it relocates the spin table
to memory allocated using efi_allocate_pages().  But that function
only looks at the EFI memory map to figure out what memory is
available.  So I suspect that it might hand out the same memory as
lmb_alloc().  It all looks a bit broken to me...


Yes, that is actually my code ;) The kontron_sl28 is the only
board which uses spin tables as far as I know. It doesn't support
PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP
it uses spin tables. The relocation code work arounds a problem
with the reserved EFI code, see [1].

And yes, it actually is broken. But so might be every code which
is using the efi_allocate_pages(), no? LMB isn't global, but is
just initialized at different places. Like before a linux kernel
is booted or when you load a file (?). And everytime the whole
memory is added, and then different regions are carved out (see
above).


Does your target end up with CONFIG_EFI_LOADER defined?


Yes ;)

-michael

[1] 
https://lore.kernel.org/u-boot/20200601195336.3237-1-mich...@walle.cc/


Re: (overlapping) lmb allocations during boot

2021-10-31 Thread Mark Kettenis
> Date: Sun, 31 Oct 2021 11:43:38 +0100
> From: Michael Walle 
> 
> Hi,
> 
> I sometimes see a corrupted initrd during kernel boot on my board
> (kontron_sl28). Debugging showed that in this case the spin table
> for the secondary CPUs overlaps with the lmb initrd allocations
> (initrd_high is set).
> 
> I had a look at how the fdt and initrd are relocated (if enabled). In
> summary, they use lmb_alloc() which will first add all available memory
> and then carve out reserved regions. There are calls for arch
> (arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The
> interesting thing is that, there is also code to carve out any
> reserved regions which were added to the fdt earlier
> (boot_fdt_add_mem_rsv_regions()). The problem here is, that the DT 
> fixups,
> which might add the reserved regions are called just before jumping to
> linux. Thus both the allocation for the fdt (if fdt_high is set) and
> the ramdisk (if initrd_high is set) will ignore any reserved memory
> regions.
> 
> Unfortunately, I don't see any good way to fix this. You'd need all the
> DT fixups before we can initialize the lmb. Also, I don't know if this
> will affect any other areas; probably I'm the only one, who reserves an
> area which is outside of the u-boot code and data segment.
> 
> A hackish way would be to carve out the spin_table code in
> board_lmb_reserve(). But meh..

The spin table is embedded in the u-boot binary itself isn't it?  But
the memory occupied by the u-boot should already be reserved...

Unless CONFIG_EFI_LOADER is defined.  Then it relocates the spin table
to memory allocated using efi_allocate_pages().  But that function
only looks at the EFI memory map to figure out what memory is
available.  So I suspect that it might hand out the same memory as
lmb_alloc().  It all looks a bit broken to me...

Does your target end up with CONFIG_EFI_LOADER defined?