On 11/5/25 11:24 PM, Alexandre Courbot wrote:
> On Thu Nov 6, 2025 at 10:27 AM JST, John Hubbard wrote:
>> NVIDIA GPUs are moving away from using NV_PMC_BOOT_0 to contain
>> architecture and revision details, and will instead use NV_PMC_BOOT_42
>> in the future. NV_PMC_BOOT_0 will contain a specific set of values
>> that will mean "go read NV_PMC_BOOT_42 instead".
>>
>> Change the selection logic in Nova so that it will claim Turing and
>> later GPUs. This will work for the foreseeable future, without any
>> further code changes here, because all NVIDIA GPUs are considered, from
>> the oldest supported on Linux (NV04), through the future GPUs.
>>
>> Add some comment documentation to explain, chronologically, how boot0
>> and boot42 change with the GPU eras, and how that affects the selection
>> logic.
>>
>> Cc: Alexandre Courbot <[email protected]>
>> Cc: Danilo Krummrich <[email protected]>
>> Cc: Timur Tabi <[email protected]>
>> Signed-off-by: John Hubbard <[email protected]>
>> ---
>>  drivers/gpu/nova-core/gpu.rs  | 38 ++++++++++++++++++++++++++++++++++-
>>  drivers/gpu/nova-core/regs.rs | 33 ++++++++++++++++++++++++++++++
>>  2 files changed, 70 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 27b8926977da..8d2bad0e27d1 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -154,6 +154,17 @@ fn try_from(boot0: regs::NV_PMC_BOOT_0) -> Result<Self> {
>>      }
>>  }
>>  
>> +impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
>> +    type Error = Error;
>> +
>> +    fn try_from(boot42: regs::NV_PMC_BOOT_42) -> Result<Self> {
>> +        Ok(Self {
>> +            chipset: boot42.chipset()?,
>> +            revision: boot42.revision(),
>> +        })
>> +    }
>> +}
>> +
>>  impl fmt::Display for Revision {
>>      fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>>          write!(f, "{:x}.{:x}", self.major, self.minor)
>> @@ -169,9 +180,34 @@ pub(crate) struct Spec {
>>  
>>  impl Spec {
>>      fn new(bar: &Bar0) -> Result<Spec> {
>> +        // Some brief notes about boot0 and boot42, in chronological order:
>> +        //
>> +        // NV04 through Volta:
>> +        //
>> +        //    Not supported by Nova. boot0 is necessary and sufficient to identify these GPUs.
>> +        //    boot42 may not even exist on some of these GPUs.
>> +        //
>> +        // Turing through Blackwell:
>> +        //
>> +        //     Supported by both Nouveau and Nova. boot0 is still necessary and sufficient to
>> +        //     identify these GPUs. boot42 exists on these GPUs but we don't need to use it.
>> +        //
>> +        // Rubin:
>> +        //
>> +        //     Only supported by Nova. Need to use boot42 to fully identify these GPUs.

Ohh, I scrambled the comment when I added Rubin to the discussion. Actually,
the first sentence is correct, but the second is dead wrong. :)

Rubin will still key off of boot0.

I'll fix this up.

>> +        //
>> +        // "Future" (after Rubin) GPUs:
>> +        //
>> +        //    Only supported by Nova. NV_PMC_BOOT's ARCH_0 (bits 28:24) will be zeroed out, and
>> +        //    ARCH_1 (bit 8:8) will be set to 1, which will mean, "refer to NV_PMC_BOOT_42".
> 
> From the code it looks like Rubin and "Future" GPUs are handled exactly
> the same - do we need two paragraphs to describe them, or can we just
> have one for "Rubin and future GPUs"?

They are not handled the same: Rubin still keys off of boot0, and only
the later GPUs use boot42. The code is correct, but the comment is wrong.

> 
>> +
>>          let boot0 = regs::NV_PMC_BOOT_0::read(bar);
>>  
>> -        Spec::try_from(boot0)
>> +        if boot0.use_boot42_instead() {
>> +            Spec::try_from(regs::NV_PMC_BOOT_42::read(bar))
>> +        } else {
>> +            Spec::try_from(boot0)
>> +        }
>>      }
>>  }
>>  
>> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
>> index 207b865335af..8b5ff3858210 100644
>> --- a/drivers/gpu/nova-core/regs.rs
>> +++ b/drivers/gpu/nova-core/regs.rs
>> @@ -25,6 +25,13 @@
>>  });
>>  
>>  impl NV_PMC_BOOT_0 {
>> +    pub(crate) fn use_boot42_instead(self) -> bool {
>> +        // "Future" GPUs (some time after Rubin) will set `architecture_0`
>> +        // to 0, and `architecture_1` to 1, and put the architecture details in
>> +        // boot42 instead.
> 
> If this is "some time after Rubin", how do we infer that we must use
> boot42 for Rubin, as the previous comment suggests?

Right, we will actually use boot0 for Rubin. Thanks for catching
the inconsistency! 
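
To make the intended rule concrete, here is a rough stand-alone sketch of
the check, working on a raw u32 rather than on the nova-core register
types. The bit positions come from the comment in the patch (ARCH_0 in
bits 28:24, ARCH_1 in bit 8). The `IdSource` enum, the constants, and
`id_source()` are hypothetical names made up for this illustration and are
not part of the driver.

```rust
/// Which register identifies the GPU (hypothetical type, illustration only).
#[derive(Debug, PartialEq, Eq)]
enum IdSource {
    Boot0,
    Boot42,
}

// Assumed field layout, taken from the comment in the patch.
const ARCH_0_SHIFT: u32 = 24; // ARCH_0 occupies bits 28:24
const ARCH_0_MASK: u32 = 0x1f;
const ARCH_1_SHIFT: u32 = 8; // ARCH_1 occupies bit 8
const ARCH_1_MASK: u32 = 0x1;

/// Returns true when boot0 carries the "go read boot42" marker:
/// ARCH_0 zeroed out and ARCH_1 set to 1.
fn use_boot42_instead(boot0: u32) -> bool {
    let arch_0 = (boot0 >> ARCH_0_SHIFT) & ARCH_0_MASK;
    let arch_1 = (boot0 >> ARCH_1_SHIFT) & ARCH_1_MASK;
    arch_0 == 0 && arch_1 == 1
}

/// Picks the register to decode. Everything through Rubin has a non-zero
/// ARCH_0 and therefore stays on boot0.
fn id_source(boot0: u32) -> IdSource {
    if use_boot42_instead(boot0) {
        IdSource::Boot42
    } else {
        IdSource::Boot0
    }
}
```

With this sketch, a made-up raw value of 0x0000_0100 (only bit 8 set)
selects boot42, while any value with a non-zero ARCH_0 field selects boot0.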

thanks,
-- 
John Hubbard
