Hi,
On 2/5/26 11:41 PM, Marco Felsch wrote:
> On 26-02-05, Michael Tretter wrote:
>> On Wed, 04 Feb 2026 21:01:25 +0100, Marco Felsch wrote:
>>> + fdt = buf;
>>> + error = fdt_fixup_mem(fdt, &mem_base, &mem_sz, 1);
>>
>> On rk3588, I noticed that fdt_fixup_mem() adds a significant delay,
>> which depends on the size of the device tree, to boot time. Did you
>> notice something similar in your tests?
>
> I noticed a delay as well but haven't debugged the root cause yet, since
> doing a lot of string parsing doesn't come for free. Thanks for the
> heads up, that this function is problematic. On the other hand, we need
> to do the fixup to keep the DRAM information only encoded within the
> lpddr config file.
I actually have code lying around somewhere that takes a barebox ELF and
extracts the memory size by parsing the i.MX8M memory structs offline :D
I wouldn't recommend actually using that though..
What might help is adding dummy memory entries to the DT for all
memory banks, each with a size of 0, and then patching the final size
in place.
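To illustrate why that would be cheap: if the placeholder entry already
exists, the fixup is just an overwrite of the size cells, and nothing in
the blob has to move. A rough sketch, assuming #address-cells = <2> and
#size-cells = <2>; patch_bank_size() and the byte helpers are
hypothetical names, not existing barebox code. With libfdt, the raw
buffer here would correspond to the value of the "reg" property, and
fdt_setprop_inplace() could play the same role:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value big-endian, i.e. as two consecutive FDT cells. */
static void put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read back a big-endian 64-bit value. */
static uint64_t get_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Hypothetical helper: overwrite the size of memory bank `bank` inside a
 * "reg" property value laid out as <base size> pairs of 64 bits each.
 * The property length never changes, so no other part of the blob moves.
 * Returns 0 on success, -1 if the bank has no placeholder entry.
 */
static int patch_bank_size(uint8_t *reg, size_t reg_len,
			   unsigned int bank, uint64_t size)
{
	const size_t entry = 16; /* 8 bytes base + 8 bytes size */

	assert(reg_len % entry == 0);

	if ((size_t)bank >= reg_len / entry)
		return -1;

	put_be64(reg + bank * entry + 8, size);
	return 0;
}
```

The DT would carry one zero-sized entry per possible bank at build time,
and the PBL would only call something like patch_bank_size() once the
DRAM size is known.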
>> Furthermore, if OP-TEE receives a large device tree, it also seems to
>> initialize much slower and add further delay to the boot time.
>
> Good to know! I didn't spend too much time on performance analysis,
> unfortunately.
>
>> I wonder if it may be better to build a custom device tree during the
>> initialization that only contains the necessary nodes rather than
>> passing the full device tree with some fixups.
>
> But in that case you need to know which DT nodes are important for
> $platform. E.g. for i.MX CAAM devices, the CAAM node is interesting
> too, since OP-TEE disables the job ring nodes used by the secure OS. I
> don't know if this scales for all SoCs.
>
> That being said, currently we pass the DT only to OP-TEE but it could
> be used for TF-A use-cases as well.
>
> I also don't know if building a custom FDT during runtime is less
> effort than using the built-in unmodified DT.
>
> It would be great if we shift "creating the minimal DT" to build time.
> U-Boot has this "bootph-all" and CONFIG_OF_SPL_REMOVE_PROPS for the SPL.
> But U-Boot uses the custom binman tool to heavily manipulate the DTB
> after the DTC.
fdtgrep can filter device trees according to the bootph- properties.
It's not part of upstream dtc, but we vendor in dtc anyway, so we could
also include that tool if needed.
> We could also do something fancy like having a minimal DT for the PBL
> (no MMU) and later on barebox-proper loads the rest as overlay (MMU on).
We should rather enable the MMU earlier. Waiting until just before
decompression in EL2 is much later than need be.
> This would require that we split the DT into a DT and DTO. A quick
> search showed that we could use '/omit-if-no-ref/' feature to build a
> minimal pbl DTB. For PBL DTB builds the build-process needs to handle
> adding the property accordingly, so a minimal DT is produced and the
> user-impact is minimal. I could imagine something like:
>
> / {
> barebox,pbl-devices {
> caam = <&phandle_to_caam>;
> emmc = <&sdhci1>;
> eeprom = <&i2c_eeprom>;
> };
> };
>
> The build process could:
> 1) add the /omit-if-no-ref/ and later on
> 2) follow the phandles listed in barebox,pbl-devices and remove the
> property from each parent recursively.
>
> This way we could implement something similar to the U-Boot "bootph-all"
> marking but IMHO more user friendly and only relying on DTC.
I think we can do it even simpler, let's say we had these three phases
(we can think about mapping them to bootph- later):
  barebox,bootph-pbl:  The parts of the DT needed in PBL
  barebox,bootph-core: The parts of the DT needed very early
                       in barebox proper and prior to and including
                       board driver probe
  barebox,bootph-all:  Everything after the board driver probe.
                       Unmarked nodes are here as well
Now we would filter the barebox DT by each of these properties and get
three device trees, which we just concatenate and use as barebox DT.
  PBL:                  would take only the first DT and ignore the
                        concatenated DTs, but still pass all of them
                        along to barebox proper
  early barebox proper: would just unflatten the first two DTs one
                        after another into the same live tree
  late barebox proper:  would unflatten the last DT and fix it up into
                        the live tree
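Finding the individual trees in such a concatenation should be cheap,
because every FDT header starts with the magic 0xd00dfeed followed by
the blob's totalsize, both stored big-endian. A minimal sketch of the
lookup; nth_fdt() is a hypothetical helper, and it assumes that the
build pads each blob to an 8-byte boundary so that the next header
stays aligned:

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC	0xd00dfeedU
#define FDT_HEADER_SIZE	40	/* ten 32-bit fields in the header */

/* Read a big-endian 32-bit value, as used throughout the FDT header. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: return a pointer to the n-th (0-based) blob in a
 * buffer of back-to-back FDTs, or NULL if there is no such blob or a
 * header looks invalid. Only magic and totalsize are inspected; the
 * trees themselves are never parsed.
 */
static const uint8_t *nth_fdt(const uint8_t *buf, size_t len, unsigned int n)
{
	size_t off = 0;

	while (off <= len && len - off >= FDT_HEADER_SIZE) {
		uint32_t total;

		if (get_be32(buf + off) != FDT_MAGIC)
			return NULL;

		total = get_be32(buf + off + 4);
		if (total < FDT_HEADER_SIZE || total > len - off)
			return NULL;

		if (n == 0)
			return buf + off;
		n--;

		/* Assumption: blobs are padded to the next 8-byte boundary. */
		off += ((size_t)total + 7) & ~(size_t)7;
	}

	return NULL;
}
```

In this picture the PBL would only ever use nth_fdt(buf, len, 0), while
barebox proper would ask for indices 1 and 2 when it reaches the
corresponding phase.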
Benefits I see:
- We can have a minimal DT just for the CAAM and memory nodes, for
  example
- We have no extra runtime overhead, as unflattening a DT split
  into three will take as much time as doing it all in one pass
- We can actually switch out the DTs before the unflattening steps,
  e.g. we could have a smaller generic DT used for the first two
  stages and then a bigger, board-specific DT used for the last
  stage.
I don't know when we'll get around to this, though ^^.
> However, thanks for your reply :) Of course there is room for
> improvements, but this shouldn't stop us from merging this since all
> current mainline users are not affected (no boot-time regression).
Agreed. I hope we will recover some speed when we start enabling the MMU
earlier in the future.
Cheers,
Ahmad
>
> Regards,
> Marco
>
>>
>> Michael
>>
>>> + if (error) {
>>> + pr_warn("Failed to fixup FDT memory node, continue without
>>> FDT\n");
>>> + bl31_via_bl_params(bl31, bl32, bl33, NULL);
>>> + }
>>> +
>>> + bl31_via_bl_params(bl31, bl32, bl33, fdt);
>>> +}
>>> +
>>> /**
>>> * imx8m_tfa_start_bl31 - Load TF-A BL31 blob and transfer control to it
>>> *
>>> @@ -122,7 +186,21 @@ imx8m_tfa_start_bl31(const void *tfa_bin, size_t
>>> tfa_size, void *tfa_dest,
>>> asm volatile("msr sp_el2, %0" : :
>>> "r" (tfa_dest - 16) :
>>> "cc");
>>> - bl31();
>>> +
>>> + /*
>>> + * If enabled the bl_params are passed via x0 to the TF-A, except for
>>> + * the i.MX8MQ which doesn't support bl_params yet.
>>> + * Passing the bl_params must be explicit enabled to be backward
>>> + * compatible with downstream TF-A versions, which may have problems
>>> + * with the bl_params.
>>> + */
>>> + if (!IS_ENABLED(CONFIG_ARCH_IMX_ATF_PASS_BL_PARAMS) || cpu_is_mx8mq()) {
>>> + pr_debug("Jump to BL31 without bl-params\n");
>>> + bl31();
>>> + } else {
>>> + start_bl31_via_bl_params(bl31, bl32, bl33, fdt);
>>> + }
>>> +
>>> __builtin_unreachable();
>>> }
>>
>
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |