Hi John,

On 11/11/2025 11:30 PM, John Hubbard wrote:
> NVIDIA GPUs are moving away from using NV_PMC_BOOT_0 to contain
> architecture and revision details, and will instead use NV_PMC_BOOT_42
> in the future. NV_PMC_BOOT_0 will contain a specific set of values
> that will mean "go read NV_PMC_BOOT_42 instead".
> 
> Change the selection logic in Nova so that it will claim Turing and
> later GPUs. This will work for the foreseeable future, without any
> further code changes here, because all NVIDIA GPUs are considered, from
> the oldest supported on Linux (NV04), through the future GPUs.

[...]

> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index cd58040b681b..8c5f46f6aaac 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -175,19 +175,41 @@ pub(crate) struct Spec {
>  
>  impl Spec {
>      fn new(bar: &Bar0) -> Result<Spec> {
> +        // Some brief notes about boot0 and boot42, in chronological order:
> +        //
> +        // NV04 through NV50:
> +        //
> +        //    Not supported by Nova. boot0 is necessary and sufficient to identify these GPUs.
> +        //    boot42 may not even exist on some of these GPUs.
> +        //
> +        // Fermi through Volta:
> +        //
> +        //     Not supported by Nova. boot0 is still sufficient to identify these GPUs, but boot42
> +        //     is also guaranteed to be both present and accurate.
> +        //
> +        // Turing and later:
> +        //
> +        //     Supported by Nova. Identified by first checking boot0 to ensure that the GPU is not
> +        //     from an earlier (pre-Fermi) era, and then using boot42 to precisely identify the GPU.
> +        //     Somewhere in the Rubin timeframe, boot0 will no longer have space to add new GPU IDs.
> +
>          let boot0 = regs::NV_PMC_BOOT_0::read(bar);
>  
> -        Spec::try_from(boot0)
> +        if boot0.is_older_than_fermi() {
> +            return Err(ENOTSUPP);
> +        }
> +
> +        Spec::try_from(regs::NV_PMC_BOOT_42::read(bar))

The error returns here are inconsistent. For NV04 through NV50, this returns
-ENOTSUPP. For Fermi through Volta, it reads boot42 but returns -ENODEV,
because `Spec::try_from()` -> `boot42.chipset()` will return -ENODEV. I am OK
with either error code, but it would be good to make them consistent.

There also does not seem to be a diagnostic when the chipset is not supported.
It would be good to report that the chipset did not match. Right now this
returns -ENODEV, which could also mean that the device does not exist.
-ENOTSUPP is better, but an actual dmesg error message would be nice.
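
To illustrate both points, here is a rough standalone sketch, not kernel code.
`Boot42`, `chipset_id`, `spec_new()` and the chipset IDs are hypothetical
stand-ins for the nova-core types, and `eprintln!` stands in for whatever
dmesg-level print (e.g. `dev_err!`) fits best. The only idea it is meant to
show is that both unsupported paths return the same error code and each one
logs why:

```rust
// Standalone model only: these types are simplified, hypothetical stand-ins
// for the nova-core ones, so that the example builds outside the kernel.

/// Kernel-internal errno value for "operation is not supported".
const ENOTSUPP: i32 = 524;

#[derive(Debug, PartialEq)]
enum Chipset {
    TU102,
    GA102,
}

/// Stand-in for the decoded boot42 register. `chipset_id` is an assumed
/// field, and the IDs matched below are illustrative values.
struct Boot42 {
    chipset_id: u32,
}

impl Boot42 {
    /// Returns the chipset if it is one that this model "supports".
    fn chipset(&self) -> Option<Chipset> {
        match self.chipset_id {
            0x162 => Some(Chipset::TU102),
            0x172 => Some(Chipset::GA102),
            _ => None,
        }
    }
}

/// Models `Spec::new()`: both unsupported cases return ENOTSUPP, and each
/// one prints a diagnostic before returning.
fn spec_new(older_than_fermi: bool, boot42: Boot42) -> Result<Chipset, i32> {
    if older_than_fermi {
        eprintln!("nova: pre-Fermi GPU is not supported");
        return Err(ENOTSUPP);
    }

    boot42.chipset().ok_or_else(|| {
        eprintln!(
            "nova: unsupported chipset (boot42 id = {:#x})",
            boot42.chipset_id
        );
        ENOTSUPP
    })
}

fn main() {
    // A supported ID, then an unsupported (Fermi through Volta era) ID.
    println!("{:?}", spec_new(false, Boot42 { chipset_id: 0x162 }));
    println!("{:?}", spec_new(false, Boot42 { chipset_id: 0x140 }));
}
```

Whether the common error ends up being -ENOTSUPP or -ENODEV matters less to me
than having a single code plus a message.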

With these,

Reviewed-by: Joel Fernandes <[email protected]>

Thanks.

