On Thu Nov 6, 2025 at 10:27 AM JST, John Hubbard wrote:
> NVIDIA GPUs are moving away from using NV_PMC_BOOT_0 to contain
> architecture and revision details, and will instead use NV_PMC_BOOT_42
> in the future. NV_PMC_BOOT_0 will contain a specific set of values
> that will mean "go read NV_PMC_BOOT_42 instead".
>
> Change the selection logic in Nova so that it will claim Turing and
> later GPUs. This will work for the foreseeable future, without any
> further code changes here, because all NVIDIA GPUs are considered, from
> the oldest supported on Linux (NV04), through the future GPUs.
>
> Add some comment documentation to explain, chronologically, how boot0
> and boot42 change with the GPU eras, and how that affects the selection
> logic.
>
> Cc: Alexandre Courbot <[email protected]>
> Cc: Danilo Krummrich <[email protected]>
> Cc: Timur Tabi <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>
> ---
> drivers/gpu/nova-core/gpu.rs | 38 ++++++++++++++++++++++++++++++++++-
> drivers/gpu/nova-core/regs.rs | 33 ++++++++++++++++++++++++++++++
> 2 files changed, 70 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 27b8926977da..8d2bad0e27d1 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -154,6 +154,17 @@ fn try_from(boot0: regs::NV_PMC_BOOT_0) -> Result<Self> {
> }
> }
>
> +impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
> + type Error = Error;
> +
> + fn try_from(boot42: regs::NV_PMC_BOOT_42) -> Result<Self> {
> + Ok(Self {
> + chipset: boot42.chipset()?,
> + revision: boot42.revision(),
> + })
> + }
> +}
> +
> impl fmt::Display for Revision {
> fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> write!(f, "{:x}.{:x}", self.major, self.minor)
> @@ -169,9 +180,34 @@ pub(crate) struct Spec {
>
> impl Spec {
> fn new(bar: &Bar0) -> Result<Spec> {
> + // Some brief notes about boot0 and boot42, in chronological order:
> + //
> + // NV04 through Volta:
> + //
> + // Not supported by Nova. boot0 is necessary and sufficient to identify these GPUs.
> + // boot42 may not even exist on some of these GPUs.
> + //
> + // Turing through Blackwell:
> + //
> + // Supported by both Nouveau and Nova. boot0 is still necessary and sufficient to
> + // identify these GPUs. boot42 exists on these GPUs but we don't need to use it.
> + //
> + // Rubin:
> + //
> + // Only supported by Nova. Need to use boot42 to fully identify these GPUs.
> + //
> + // "Future" (after Rubin) GPUs:
> + //
> + // Only supported by Nova. NV_PMC_BOOT's ARCH_0 (bits 28:24) will be zeroed out, and
> + // ARCH_1 (bit 8:8) will be set to 1, which will mean, "refer to NV_PMC_BOOT_42".

From the code it looks like Rubin and "Future" GPUs are handled exactly
the same - do we need two paragraphs to describe them, or can we just
have one for "Rubin and future GPUs"?

> +
> let boot0 = regs::NV_PMC_BOOT_0::read(bar);
>
> - Spec::try_from(boot0)
> + if boot0.use_boot42_instead() {
> + Spec::try_from(regs::NV_PMC_BOOT_42::read(bar))
> + } else {
> + Spec::try_from(boot0)
> + }
> }
> }
>
> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
> index 207b865335af..8b5ff3858210 100644
> --- a/drivers/gpu/nova-core/regs.rs
> +++ b/drivers/gpu/nova-core/regs.rs
> @@ -25,6 +25,13 @@
> });
>
> impl NV_PMC_BOOT_0 {
> + pub(crate) fn use_boot42_instead(self) -> bool {
> + // "Future" GPUs (some time after Rubin) will set `architecture_0`
> + // to 0, and `architecture_1` to 1, and put the architecture details in
> + // boot42 instead.

If this is "some time after Rubin", how do we infer that we must use
boot42 for Rubin, as the previous comment suggests?
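
To make the question concrete, here is a minimal standalone sketch of the
check as I read it from the comments above. It assumes only what the quoted
comment states: ARCH_0 is bits 28:24 and ARCH_1 is bit 8 of boot0. The raw
`u32` argument and the constant names are hypothetical stand-ins for the
generated register accessors, not the driver's actual API.

```rust
// Field layout taken from the quoted comment; constant names are hypothetical.
const ARCH_0_SHIFT: u32 = 24; // ARCH_0 occupies bits 28:24
const ARCH_0_MASK: u32 = 0x1f;
const ARCH_1_SHIFT: u32 = 8; // ARCH_1 occupies bit 8
const ARCH_1_MASK: u32 = 0x1;

/// Returns true when boot0 carries the "go read boot42 instead" encoding,
/// i.e. ARCH_0 is zeroed out and ARCH_1 is set.
fn use_boot42_instead(boot0: u32) -> bool {
    let arch_0 = (boot0 >> ARCH_0_SHIFT) & ARCH_0_MASK;
    let arch_1 = (boot0 >> ARCH_1_SHIFT) & ARCH_1_MASK;

    arch_0 == 0 && arch_1 == 1
}
```

With a check of this shape, a GPU is routed to boot42 only if its boot0
already uses that encoding. So either Rubin sets these bits as well, or the
"Need to use boot42" note for Rubin is not reflected in the code.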