Hi,

On Wed, Dec 3, 2025 at 2:37 PM Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Tue, Dec 2, 2025 at 3:16 PM Rob Herring <[email protected]> wrote:
> >
> > > as a set of examples. I don't see a clear way to generate these from
> > > a fragmented scheme. There's a similar problem with the board-level
> > > compatible:
> > >
> > >         compatible = "solidrun,cubox-i/dl", "fsl,imx6dl";
> > >         compatible = "solidrun,hummingboard2/dl", "fsl,imx6dl";
> > >         compatible = "solidrun,hummingboard/dl", "fsl,imx6dl";
> > >         compatible = "solidrun,hummingboard2/q", "fsl,imx6q";
> > >
> > > These don't include the SoM information.
> >
> > So we're back to needing to merge compatible even though Doug was
> > willing to drop it. Or to put it another way, there's usecases for the
> > base to be different SoC revisions and variants. So I don't think we
> > should give up on solving that.
>
> I'm willing to take a crack at it. Before doing so, though, I think we
> need to agree upon a definition of what the top-level "compatible" is
> supposed to be. Otherwise, what exactly is our goal in trying to merge
> "compatible" strings? We should have a goal of updating the official
> documentation with whatever we decide.
>
>
> I guess first, we should see what the spec says. The Devicetree
> Specification v0.4 says this about the top-level compatible (which, it
> should be noted, is documented _separately_ from section 2.3.1 since
> all of section 2.3 is only about device nodes):
>
> > Specifies a list of platform architectures with which this platform is 
> > compatible. This property can be used by operating systems in selecting 
> > platform specific code. The recommended form of the property value is: 
> > "manufacturer,model" For example: compatible = "fsl,mpc8572ds"
>
> That's not very detailed, but I guess we can start out looking at what
> it _doesn't_ say.
>
> a) The spec doesn't say anything about the top-level compatible
> uniquely identifying a specific hardware configuration. Nothing there
> says "look at one of the strings in the list and you can tell exactly
> what product you have in front of you".
>
> b) The spec doesn't specifically mention that one should include any
> strings for a SoC / SoM / reference board. Indeed, the example given
> shows an example "compatible" with just one string: "fsl,mpc8572ds".
> Searching the interwebs, I find that this example "compatible"
> probably refers to a Freescale "MPC8572 Development System", which is
> a dev board with a "MPC8572" chip. Notably, the example "compatible"
> didn't include the "MPC8572" chip.
>
> c) The spec doesn't seem to include a firm definition of what they
> mean by the word "platform". In my mind, one could interpret the SoC
> as a "platform". One could also interpret a SoM or a reference board
> as a "platform". It's not necessarily clear. Since AI is the answer to
> all things these days, I asked Gemini. I asked what "platform" meant
> in the context of the DT spec and it (confidently) told me that "the
> platform is the physical machine." ...but when I asked if one could
> also consider the SoC the "platform", it told me that was "an
> excellent clarifying question" and went on to say the SoC "is often
> referred to as the base platform or the SoC-level platform." :-P
>
>
> How does that help us? I guess I'd summarize that, from reading the
> spec and more loosely interpreting the word "platform":
>
> a) The compatible string doesn't _need_ to include strings
> representing the SoC, SoM, or baseboard, but it can.
>
> b) The compatible string is primarily there for use by the operating
> system to select platform (board, reference board, SoM, or SoC)
> specific code.
>
>
> That still doesn't really tell us when we should / shouldn't include a
> SoC / SoM / baseboard in the top-level "compatible". It also doesn't
> tell us if we should include even more detailed levels. ...and by
> "more detailed levels", I would perhaps say that each of these could
> also be considered a "platform":
> * google,trogdor-lazor-rev6-sku6 - An exact model of board.
> * google,trogdor-lazor-rev6 - A platform that has several SKUs.
> * google,trogdor-lazor - A platform that has several revisions and SKUs.
> * google,trogdor - A reference platform that has several boards.
>
>
> Perhaps we should lean into the statement "This property can be used
> by operating systems in selecting platform specific code" to give us
> guidance? The problem is that we somehow need to not just look at
> current operating systems but, if we want to strive towards the goal
> of shipping binary device trees, we need to consider future operating
> system code that hasn't yet been written. That sounds impossible and
> makes one think you should cram as much info into the compatible
> string as possible, but...
>
> ...actually, we only need to put information into the compatible
> string if there's not an easy way for the operating system to get the
> information elsewhere, right? If the information is found elsewhere in
> the device tree or if the operating system can probe the information
> itself, then there's really no _need_ to put it in the top-level
> "compatible" and we'll never end up painting ourselves into a corner.
> We could still put the information there just to make it convenient,
> but it's not really needed. Does this make sense?
>
> I would further argue that, in order to be useful, any given
> "platform" should document its expectations and we need to be
> consistent across anyone using that platform. To make it concrete, if
> the Qualcomm SC7180 platform documents that "qcom,sc7180" belongs in
> the top-level compatible string then all device trees including sc7180
> should have that string. This _doesn't_ mean that on some future
> platform (like qcom,sc9999) we couldn't make a different decision.
> Maybe on "qcom,sc9999" we've decided to put SoC details as some
> properties under the "soc@0" node. Now the operating system can find
> the details about which SoC is present from the "soc@0" node and
> therefore we don't need to represent it in the top-level compatible
> string.
>
>
> Assuming that all makes sense, maybe this is the way to document the
> top-level compatible string:
>
> --
>
> Specifies a list of "platform architectures" with which this platform
> is compatible. A "platform architecture" can be at any level, from the
> specific board to the class of board to the reference platform to the
> SoM to the SoC. A given "platform architecture" should always be
> consistently included or not-included by all final device trees using
> it. If the "qcom,sc7180" SoC platform is defined to be included, it
> should be consistently included by any device trees with this SoC. The
> criteria for whether to represent a "platform architecture" in the
> top-level compatible string is the difficulty of the operating system
> obtaining the information in some other way (including from other DT
> properties or from probing). In general, the top-level "compatible" is
> used by operating systems in selecting platform specific code. The
> recommended form of the property value is: "manufacturer,model"
>
> Examples:
>
> compatible = "fsl,mpc8572ds";
> - Select code related to the Freescale MPC8572 Development System
>
> No platform is included for the CPU since "fsl,mpc8572" isn't
> consistently listed as a platform.
>
> compatible = "google,snow-rev4", "google,snow", "samsung,exynos5250",
> "samsung,exynos5"
> - Select code related to google,snow-rev4.
> - Select code related to google,snow.
> - Select code related to samsung,exynos5250.
> - Select code related to samsung,exynos5.
>
> In this example, the idea is that all exynos5 boards would have
> "samsung,exynos5" so code that needed to run on "exynos5" could
> consistently test for that "compatible" string. Similarly, all
> exynos5250 boards would have "samsung,exynos5250" and all snow boards
> would have "google,snow"
>
> --
>
> What do folks think?
>
> Note that the current Chromebook stuff [1] we used on sc7180-trogdor
> boards doesn't fit amazingly well into that definition, but it can
> kinda squeeze in there. Essentially the sc7180-trogdor stuff is
> designed around making it easy for the bootloader to find the right
> device tree but doesn't provide anything terribly useful to the OS in
> the top-level "compatible" string. At this point, I don't think I
> would encourage others to adopt something similar.
>
>
> If folks agree with the above interpretation, I think I'd end up back
> to arguing _against_ the need to merge compatible strings. If we don't
> need to put detailed SoC information into the top-level compatible
> string then we don't need to merge. I think the most
> flexible/futureproof approach would be to just define that for the SoC inside
> Pixel 10 (and presumably all future Google Silicon) we'll put SoC
> information under the "soc@0" node and thus there's no need to include
> it in the top-level "compatible". That leaves us without a
> "compatible" to put in the base "dtb", but maybe we can just put
> compatible = "incomplete" or something like that?
>
> I suspect that even for Russell's purposes the information can either
> be probed by the OS or put in places other than the top-level
> compatible string. We might not want to change his existing
> devicetrees in case some OS is relying on the existing compatible
> strings, but for work going forward it feels like it would be a
> solution...
>
>
> [1] https://docs.kernel.org/arch/arm/google/chromebook-boot-flow.html

It's me again. The pest.

Adding a few people who piped up when I mentioned this at Plumbers
(namely Bjorn and Geert)...


Bjorn mentioned that, in general, it's hard to know what device /
devicetree people are using when they report bugs. Presumably if we
made the top-level compatible less representative of the overall
system, this problem would be made worse?

While this is true, to me it isn't necessarily a blocker (though feel
free to object). Specifically:

* The device tree doesn't fully describe all hardware anyway. While we
might use a "SKU" variant to choose between one MIPI panel or another,
Chromebooks _don't_ use SKU variants to choose between one eDP panel
or another because eDP panels can be probed. We also might use a "SKU"
variant to choose between two MIPI webcams but not two USB webcams for
the same reason.

* We've already accepted the idea of "hardware probers" that can run
at boot anyway and those don't adjust SKU numbers. grep the source for
"fail-needs-probe".

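As a rough illustration of the prober idea (a toy model, not the actual
kernel code), the sketch below treats second-source components as nodes
marked "fail-needs-probe" and enables the first one that answers. The
node names, addresses, and the responds() callback are made up for the
example. The point is that two units with different parts end up with
the same top-level compatible; only the probed node status differs.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical second-source touchscreens sharing one I2C bus. Both start
# out as "fail-needs-probe" so the OS ignores them until a prober decides.
candidates: List[Dict[str, object]] = [
    {"name": "touchscreen@10", "addr": 0x10, "status": "fail-needs-probe"},
    {"name": "touchscreen@5d", "addr": 0x5D, "status": "fail-needs-probe"},
]


def probe_and_enable(
    nodes: List[Dict[str, object]],
    responds: Callable[[int], bool],
) -> Optional[str]:
    """Mark the first responding candidate "okay" and return its name.

    Nodes that do not respond keep their "fail-needs-probe" status.
    Returns None if nothing responded.
    """
    for node in nodes:
        if node["status"] != "fail-needs-probe":
            continue  # Not a probe candidate; leave it alone.
        if responds(node["addr"]):
            node["status"] = "okay"
            return str(node["name"])
    return None


# Assumption for the example: only the part at 0x5d is stuffed on this unit.
enabled = probe_and_enable(candidates, lambda addr: addr == 0x5D)
print(enabled)
```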
Someone pointed out that if you really need the device tree it could
be captured in bug reports. This seems reasonable to me. I also really
liked the idea of keeping some sort of log somewhere in the device
tree every time an overlay is applied, though I tend to agree with
others that filenames of device tree files shouldn't be ABI.


Geert talked about the top-level compatible as being the "last resort"
to fix any issue. That matches my understanding above from reading the
docs and seeing how it was used. Geert: I would be curious what you
thought about my arguments above.
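
For reference, the kind of "last resort" check I have in mind looks
roughly like the sketch below: a quirk keyed on whether a string appears
anywhere in the top-level compatible list. This is a toy model loosely
following the idea behind Linux's of_machine_is_compatible(), not a real
API; the helper names and the quirk table are hypothetical, and the
compatible list is the snow example quoted above.

```python
from typing import Dict, List


def machine_is_compatible(root_compatible: List[str], wanted: str) -> bool:
    """Return True if `wanted` appears in the top-level compatible list."""
    return wanted in root_compatible


def select_quirks(
    root_compatible: List[str], quirk_table: Dict[str, str]
) -> List[str]:
    """Collect quirks for every listed platform, most specific first.

    The compatible list is ordered from most specific to most generic,
    so the returned quirks follow the same order.
    """
    return [quirk_table[c] for c in root_compatible if c in quirk_table]


snow = [
    "google,snow-rev4",
    "google,snow",
    "samsung,exynos5250",
    "samsung,exynos5",
]

# Hypothetical quirk table: only works if every snow board consistently
# lists "google,snow" and every exynos5 board lists "samsung,exynos5".
quirks = {
    "google,snow": "snow-workaround",
    "samsung,exynos5": "exynos5-workaround",
}

selected = select_quirks(snow, quirks)
print(selected)
```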


In general, I'm still hoping to figure out next steps. I believe this
problem is important enough that we shouldn't just drop it due to
silence, so I'll continue being my usual noisy self and keep
pestering.

-Doug
_______________________________________________
boot-architecture mailing list -- [email protected]
To unsubscribe send an email to [email protected]
