It was <2020-06-01 pon 21:41>, when Russell King - ARM Linux admin wrote:
 > On Mon, Jun 01, 2020 at 10:27:45PM +0200, Lukasz Stelmach wrote:
>> It was <2020-06-01 pon 19:25>, when Russell King - ARM Linux admin wrote:
>>> On Mon, Jun 01, 2020 at 06:19:52PM +0200, Lukasz Stelmach wrote:
>>>> It was <2020-06-01 pon 15:55>, when Russell King - ARM Linux admin wrote:
>>>>> On Mon, Jun 01, 2020 at 04:27:52PM +0200, Łukasz Stelmach wrote:
>>>>>> Add DCSZ tag which holds dynamic memory (stack, bss, malloc pool)
>>>>>> requirements of the decompressor code.
>>>>>
>>>>> Why do we need to know the stack and BSS size, when the userspace
>>>>> kexec tool doesn't need to know this to perform the same function.
>>>> 
>>>> 
>>>> kexec-tools load zImage as low in DRAM as possible and rely on two
>>>> assumptions:
>>>> 
>>>>     + the zImage will copy itself to make enough room for the kernel,
>>>>     + sizeof(zImage+mem) < sizeof(kernel+mem), which is true because
>>>>       of compression.
>>>> 
>>>>        DRAM start
>>>>        + 0x8000
>>>> 
>>>> zImage    |-----------|-----|-------|
>>>>             text+data   bss   stack 
>>>> 
>>>>                  text+data           bss   
>>>> kernel    |---------------------|-------------------|
>>>> 
>>>> 
>>>> initrd                                              |-initrd-|-dtb-|
>>>
>>> This is actually incorrect, because the decompressor will self-
>>> relocate itself to avoid the area that it is going to decompress
>>> into.
>> 
>> I described the state right after kexec(8) invocation.
>
> Actually, you haven't, because this is _not_ how kexec(8) lays it
> out, as I attempted to detail further down in my reply.


Let me try to describe how I understand the code in kexec-tools
(commit 74c7c369).

--8<---------------cut here---------------start------------->8---
int zImage_arm_load(…, const char *buf, off_t len, …)
// buf - zImage
// len - size of zImage
        
unsigned int extra_size = 0x8000; /* TEXT_OFFSET */

kernel_mem_size = len + 4;

// locate a hole to fit zImage + 32kB as low as possible,
base = locate_hole(info, len + extra_size, 0, 0, ULONG_MAX, INT_MAX);

kernel_base = base + extra_size;

add_segment(info, buf, len, kernel_base, kernel_mem_size);
--8<---------------cut here---------------end--------------->8---

Therefore, zImage is loaded low and always requires relocation.
        
ram     |--------------------------------------------------------------
zImage     |----k_m_s----|
           ^
           |
           kernel_base — TEXT_OFFSET or higher

Next goes initrd

--8<---------------cut here---------------start------------->8---
kexec_arm_image_size = len * 5; // or passed on command line

// if the tag exists
kexec_arm_image_size = max(edata_size + bss_size,
                           edata_size + len); // len - zImage size + 64 kB 

initrd_base = kernel_base + _ALIGN(kexec_arm_image_size, page_size);

add_segment(info, ramdisk_buf, initrd_size, initrd_base, initrd_size);
--8<---------------cut here---------------end--------------->8---

above whatever is bigger (kernel + kernel bss) OR (kernel + zImage + zImage 
mem).


ram     |---------------------------------------------------------------
zImage     |----k_m_s----|                   Where kexec loads zImage 
@kernel_base

           |.............len * 5....................| Fallback
kernel     |.....edata.....|...bss...|       These are just calculations
zImage                     |.....len+....|   zImage will copy itself here WHEN 
it runs

initrd                                   |--initrd_size--|
dtb                                      ^               |---|
                                         |
                                 initrd_base

DTB, of course, goes next

    dtb_offset = initrd_base + initrd_size + page_size;


Stuff marked with "-" is actually loaded, "." are just calculations to
establish initrd_base.

>>> So, while the decompressor runs, in the above situation it
>>> ends up as:
>>>
>>>
>>> ram    |------------------------------------------------------...
>>>                  text+data           bss   
>>> kernel    |---------------------|-------------------|
>>> zImage                          |-----------|-----|-------|
>>>                                   text+data   bss   stack+malloc
>
> Note here - if the initrd was placed as _you_ describe at the end
> of where the zImage ends up including its BSS, it would be
> corrupted by the stack and malloc space of the decompressor while
> running.  Ergo, your description of how kexec(8) lays stuff out
> is incorrect.

Is my analysis above accurate now? Do I understand this?

As you noted, my intention is to load zImage after edata (dotted len+
above).

>>>>>> +struct arm_zimage_tag_dc {
>>>>>> +        struct tag_header hdr;
>>>>>> +        union {
>>>>>> +#define ZIMAGE_TAG_DECOMP_SIZE ARM_ZIMAGE_MAGIC4
>>>>>> +                struct zimage_decomp_size {
>>>>>> +                        __le32 bss_size;
>>>>>> +                        __le32 stack_size;
>>>>>> +                        __le32 malloc_size;
>>>>>> +                } decomp_size;
>>>
>>> You certainly don't need to know all this.  All you need to know is
>>> how much space the decompressor requires after the end of the image,
>>> encompassing the BSS size, stack size and malloc size, which is one
>>> value.
>> 
>> I agree. However, since we are not fighting here for every single byte,
>> I'd rather add them as separate values and make the tag, even if only
>> slightly, more future-proof.
>
> It doesn't make it more future-proof.  What happens if we add something
> else, do we need to list it separately, and then change the kernel to
> accept the new value and maybe also kexec(8)?  Or do we just say "the
> decompressor needs X many bytes after the image" and be done with it?
> The latter sounds way more future-proof to me.

You are right. I changed it to a single value. Done.

-- 
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics

Attachment: signature.asc
Description: PGP signature

Reply via email to