Hello Jonathan,
Hello Rob,

Thanks for bringing this topic to discussion. I think the status quo is
an obvious shortcoming that needs to be addressed.

On 02.05.24 16:00, Rob Herring wrote:
> On Wed, May 1, 2024 at 4:18 PM Humphreys, Jonathan <j-humphr...@ti.com> wrote:
>> Problem statement:
>> ==================
>>
>> Device trees are in theory a pure description of the hardware, and since the
>> hardware doesn't change, the device tree describing the hardware likewise
>> never changes.
>> With this, a device tree could then be burned into the hardware's ROM to be
>> queried by software for hardware discovery. In practice, though, device trees
>> evolve over time. They evolve for many reasons, including
>> - support for previously unsupported hardware
>> - device driver improvements that require additional hardware information
>> - bug fixes
> 
> I really would like specific cases of these where compatibility is
> broken highlighted.

Screening for backwards-compatibility of new kernels (or their bindings)
with old DTs is not enough. When an A/B system fails to boot and does
a fallback, you can run into the inverse situation, namely:
An old kernel is presented with a new device tree as bootloader updates
are often not rolled back.

This seems unavoidable and the solution we have for that is to ship device
trees along with kernel updates and load both together.
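
To make the fallback case concrete, here is a minimal sketch of a slot layout
where the kernel and its device tree travel together. The Slot type, its field
names and the file names are made up for illustration and don't correspond to
any existing bootloader interface:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One A/B slot: a kernel and the device tree it was shipped with."""
    name: str
    kernel: str
    dtb: str
    bootable: bool = True


def select_boot_pair(slots):
    """Return (kernel, dtb) from the first bootable slot.

    Kernel and DTB are always taken from the same slot, so a fallback
    never pairs an old kernel with a newer device tree.
    """
    for slot in slots:
        if slot.bootable:
            return slot.kernel, slot.dtb
    raise RuntimeError("no bootable slot left")


# Hypothetical layout: slot A received the update but failed to boot.
slots = [
    Slot("A", "kernel-new", "board-new.dtb", bootable=False),
    Slot("B", "kernel-old", "board-old.dtb"),
]
print(select_boot_pair(slots))
```

The point is only that the fallback decision is made per slot and never per
file, so the old kernel is never handed the newer device tree.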

> The tooling and reviewing to identify these cases
> has gotten much better.

barebox has been pulling in kernel device trees for many years and it's
a frequent cause of regressions. Here are some recent fixes
found with $(git log --grep="^Fixes:.*dts: update"):

* "aiodev: imx_thermal: fix breakage after device tree sync"
  https://github.com/barebox/barebox/commit/451c25b60e

* "pinctrl: stm32: Remove check for pins-are-numbered"
  https://github.com/barebox/barebox/commit/38ff8dad11

* "ARM: dts: i.MX8MP: snps,dis-u2-freeclk-exists-quirk"
  https://github.com/barebox/barebox/commit/db01bf84cf

* "clk: imx8mp: add USB suspend clock"
  https://github.com/barebox/barebox/commit/d86bbaed71

* "ARM: i.MX8MN: assume USBOTG power domains to be powered"
  https://github.com/barebox/barebox/commit/7b62fbc632


All of these bugs would have broken a newer Linux kernel being booted with an
old device tree. In practice, they didn't, because normally barebox-built
device trees are used for barebox and Linux-built device trees are shipped
along with Linux, even if they might have been identical at some point.

> I've been prototyping a tool which will
> compare 2 versions of binding schemas and spit out incompatible
> changes for example. Those aren't the only types of changes as you
> point out, but if we can eliminate a whole class of issues I think the
> situation would be much better.

I look forward to this. Would your tooling have detected any of the above
regressions?
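
For the sake of discussion, this is roughly the kind of check I have in mind.
It is a toy sketch over plain dicts, not your prototype and not the dt-schema
API. The property names are invented and only two classes of change are
considered (newly required properties and removed properties):

```python
def incompatible_changes(old, new):
    """List changes in `new` that can break users of the `old` binding."""
    problems = []

    # A property that became required is missing from old device trees.
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))
    for prop in sorted(new_required - old_required):
        problems.append(f"newly required property: {prop}")

    # A property that was dropped may still be expected by old software.
    old_props = set(old.get("properties", {}))
    new_props = set(new.get("properties", {}))
    for prop in sorted(old_props - new_props):
        problems.append(f"removed property: {prop}")

    return problems


# Hypothetical binding revisions, for illustration only.
old_schema = {
    "properties": {"reg": {}, "clocks": {}, "vendor,legacy-flag": {}},
    "required": ["reg"],
}
new_schema = {
    "properties": {"reg": {}, "clocks": {}},
    "required": ["reg", "clocks"],
}
for problem in incompatible_changes(old_schema, new_schema):
    print(problem)
```

A removed property is exactly the direction that bites an old kernel or
bootloader when it is given a new device tree, so I'd hope a tool reports
both directions.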

Fortunately, most of these issues are caught before a barebox release
(features, unlike bug fixes, sit in master a month before making it into
a monthly release), but some slip through and it introduces a lot of churn.

>> Linux's device tree source is maintained with the kernel source, and kernel
>> builds include building the device trees too. This ensures that the device
>> tree matching the kernel's usage is always kept in sync. Often, embedded
>> distros will include the matching device tree blobs.
>>
>> The EBBR mandates that the device tree blob is provided by the firmware.
>>
>> Thus it is likely that the device tree provided by the firmware and given to
>> the operating system is not the matching device tree blob for that kernel.
>> This can cause hardware to be missing, buggy, or non-functional.

Yes. My first experience with EBBR was AFAIR a system that didn't boot, because
an up-to-date Debian kernel failed to handle the old device tree provided by
the firmware. At least updating the EFI firmware with a USB stick worked well.

>> This proposal then has the firmware choose the device tree by name, or some
>> other identifier that can be used to match the device tree for the board
>> [1]. It has the OS-provided OS loader select the location of the matching
>> versions of DTBs for it.
>
> The firmware would pass the device tree filename/id to the OS loader, instead
> of the DTB itself.
> If the firmware can't know which version of DTB, how can it know
> whether to pass a DTB vs. an identifier? The OS might be perfectly
> fine with firmware's DTB.

I think it's a fair assumption that if the kernel ships with a matching DTB,
it would be fine booting with it instead of the firmware-provided DTB.

If we had a way to express this "shipped-with" relationship, we could thus
have the EFI firmware just select the matching device tree and pass it along
exactly the way it's done now.

Some ways to describe this "shipped-with" relationship:

   - a section in the image as UKIs do, see Jan's mail
   - a fixed naming scheme in the EFI partition, e.g. \EFI\Debian\BOOTAA64.EFI
     -> \EFI\Debian\DTS-BOOTAA64.EFI/
   - an EFI variable or protocol?
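
As a rough illustration of the second option, the lookup on the firmware side
could be as simple as the sketch below. The "DTS-" prefix is just the example
from the list above and the function name is made up; nothing here is an
existing convention:

```python
def dtb_dir_for_loader(loader_path, prefix="DTS-"):
    """Map an OS loader path to the directory holding its matching DTBs.

    Only the last path component is changed, so the DTB directory sits
    next to the loader it was shipped with.
    """
    directory, sep, name = loader_path.rpartition("\\")
    if not name:
        raise ValueError(f"not a file path: {loader_path!r}")
    return f"{directory}{sep}{prefix}{name}"


print(dtb_dir_for_loader(r"\EFI\Debian\BOOTAA64.EFI"))
```

How the firmware then picks a file inside that directory is a separate
question (file name vs. root node compatible, see below).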

>> This proposal should be in addition to supporting the standard way of
>> passing in a firmware-provided DT, in cases where the OS doesn't provide or
>> have a need to provide a matching DT.
> 
> Agreed, but that contradicts what you said above unless you mean we
> define 2 ways to operate with some platforms working one standard way
> and other platforms working the other standard way.

I agree that an OS-provided DT should be an alternative, not a replacement
for the firmware-provided DT.

> We discussed this a while back on this list (or u-boot?). To
> summarize, both using the filename or root node compatible were
> proposed. Several folks (myself included) don't like making the
> filename an ABI. However, there are some cases where the filename is
> more unique than the root node compatible. We should fix those root
> node compatibles in that case IMO.

Agreed.
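
A small sketch of what matching on the root node compatible could look like,
and why non-unique compatibles are a problem. The board names and the rule
"match on the most specific compatible only" are my assumptions for
illustration, not something the thread has settled on:

```python
def pick_dtb(board_compatibles, shipped):
    """Pick the shipped DTB whose most specific root compatible matches.

    board_compatibles: root node compatible list of the running board,
                       most specific entry first.
    shipped:           mapping of DTB file name -> root compatible list.
    Returns the file name, or None if nothing matches.
    """
    for compat in board_compatibles:
        matches = sorted(
            name for name, compats in shipped.items()
            if compats and compats[0] == compat
        )
        if len(matches) == 1:
            return matches[0]
        if len(matches) > 1:
            # Two DTBs claim the same board: the compatible is not unique.
            raise LookupError(f"{compat!r} is ambiguous: {matches}")
    return None


# Hypothetical set of DTBs shipped alongside a kernel.
shipped = {
    "board-a.dtb": ["vendor,board-a", "vendor,soc"],
    "board-b.dtb": ["vendor,board-b", "vendor,soc"],
}
print(pick_dtb(["vendor,board-a", "vendor,soc"], shipped))
```

The LookupError branch is the case where the file name is currently more
unique than the compatible, which is what fixing the root node compatibles
would get rid of.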

Cheers,
Ahmad


-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
_______________________________________________
boot-architecture mailing list -- boot-architecture@lists.linaro.org
To unsubscribe send an email to boot-architecture-le...@lists.linaro.org