On Thu Aug 28, 2025 at 1:01 PM JST, John Hubbard wrote: > On 8/25/25 9:07 PM, Alexandre Courbot wrote: >> The GSP firmware is a binary blob that is verified, loaded, and run by >> the GSP bootloader. Its presentation is a bit peculiar as the GSP >> bootloader expects to be given a DMA address to a 3-levels page table >> mapping the GSP firmware at address 0 of its own address space. >> >> Prepare such a structure containing the DMA-mapped firmware as well as >> the DMA-mapped page tables, and a way to obtain the DMA handle of the >> level 0 page table. >> >> As we are performing the required ELF section parsing and radix3 page >> table building, remove these items from the TODO file. >> >> Signed-off-by: Alexandre Courbot <acour...@nvidia.com> >> --- >> Documentation/gpu/nova/core/todo.rst | 17 ----- >> drivers/gpu/nova-core/firmware.rs | 110 +++++++++++++++++++++++++++++++- >> drivers/gpu/nova-core/firmware/gsp.rs | 117 >> ++++++++++++++++++++++++++++++++++ >> drivers/gpu/nova-core/gsp.rs | 4 ++ >> drivers/gpu/nova-core/nova_core.rs | 1 + >> 5 files changed, 229 insertions(+), 20 deletions(-) > > Code looks good. Or more accurately, it's working on my machine, and > I think I understand it, aside from the SG Table internals. > > The documentation on the whole "radix 3" aspect is too light though, so > I've created some that you can add in if you agree with it. > > ... >> diff --git a/drivers/gpu/nova-core/firmware/gsp.rs >> b/drivers/gpu/nova-core/firmware/gsp.rs > ... >> +/// A device-mapped firmware with a set of (also device-mapped) pages >> tables mapping the firmware >> +/// to the start of their own address space, also known as a `Radix3` >> firmware. > > I'd like to replace the above two lines with something like this: > > /// GSP firmware with 3-level radix page tables for the GSP bootloader. > /// > /// The bootloader expects firmware to be mapped starting at address 0 in > GSP's virtual address > /// space: > /// > /// ```text > /// Level 0: 1 page, 1 entry -> points to first level 1 page > /// Level 1: Multiple pages/entries -> each entry points to a level 2 page > /// Level 2: Multiple pages/entries -> each entry points to a firmware page > /// ``` > /// > /// Each page is 4KB, each entry is 8 bytes (64-bit DMA address). > /// Also known as "Radix3" firmware.
Thanks, this is perfect! > > >> +#[pin_data] >> +pub(crate) struct GspFirmware { > > And then a slightly higher-level set of inline comments will help > developers, I think: > >> + /// The GSP firmware inside a [`VVec`], device-mapped via a SG table. >> + #[pin] >> + fw: SGTable<Owned<VVec<u8>>>, >> + /// The level 2 page table, mapping [`Self::fw`] at its beginning. > > Instead, how about: > > /// Level 2 page table(s) whose entries contain DMA addresses of > firmware pages. > >> + #[pin] >> + lvl2: SGTable<Owned<VVec<u8>>>, >> + /// The level 1 page table, mapping [`Self::lvl2`] at its beginning. > > /// Level 1 page table(s) whose entries contain DMA addresses of level > 2 pages. Looking good. But isn't it singular "table"? We have one table, with potentialy several pages, but each page is not a table in itself IIUC. > >> + #[pin] >> + lvl1: SGTable<Owned<VVec<u8>>>, >> + /// The level 0 page table, mapping [`Self::lvl1`] at its beginning. > > /// Level 0 page table (single 4KB page) with one entry: DMA address > of first level 1 page. > >> + lvl0: DmaObject, >> + /// Size in bytes of the firmware contained in [`Self::fw`]. >> + pub size: usize, >> +} >> + >> +impl GspFirmware { >> + /// Maps the GSP firmware image `fw` into `dev`'s address-space, and >> creates the page tables >> + /// expected by the GSP bootloader to load it. >> + pub(crate) fn new<'a>( >> + dev: &'a device::Device<device::Bound>, >> + fw: &'a [u8], >> + ) -> impl PinInit<Self, Error> + 'a { >> + try_pin_init!(&this in Self { >> + fw <- { >> + // Move the firmware into a vmalloc'd vector and map it >> into the device address >> + // space. >> + VVec::with_capacity(fw.len(), GFP_KERNEL) >> + .and_then(|mut v| { >> + v.extend_from_slice(fw, GFP_KERNEL)?; >> + Ok(v) >> + }) >> + .map_err(|_| ENOMEM) >> + .map(|v| SGTable::new(dev, v, DataDirection::ToDevice, >> GFP_KERNEL))? >> + }, >> + lvl2 <- { > > Why must we use a strange vowel-removal algorithm for these vrbl nms? I'll > let you have > a few extra characters and you can spell out "level2"... Yeah, let me spell these fully. > >> + // Allocate the level 2 page table, map the firmware onto >> it, and map it into the >> + // device address space. >> + // SAFETY: `this` is a valid pointer, and `fw` has been >> initialized. >> + let fw_sg_table = unsafe { &(*this.as_ptr()).fw }; >> + VVec::<u8>::with_capacity( >> + fw_sg_table.iter().count() * >> core::mem::size_of::<u64>(), >> + GFP_KERNEL, >> + ) >> + .map_err(|_| ENOMEM) >> + .and_then(|lvl2| map_into_lvl(fw_sg_table, lvl2)) >> + .map(|lvl2| SGTable::new(dev, lvl2, >> DataDirection::ToDevice, GFP_KERNEL))? >> + }, >> + lvl1 <- { >> + // Allocate the level 1 page table, map the level 2 page >> table onto it, and map it >> + // into the device address space. >> + // SAFETY: `this` is a valid pointer, and `lvl2` has been >> initialized. >> + let lvl2_sg_table = unsafe { &(*this.as_ptr()).lvl2 }; >> + VVec::<u8>::with_capacity( >> + lvl2_sg_table.iter().count() * >> core::mem::size_of::<u64>(), >> + GFP_KERNEL, >> + ) >> + .map_err(|_| ENOMEM) >> + .and_then(|lvl1| map_into_lvl(lvl2_sg_table, lvl1)) >> + .map(|lvl1| SGTable::new(dev, lvl1, >> DataDirection::ToDevice, GFP_KERNEL))? >> + }, >> + lvl0: { >> + // Allocate the level 0 page table as a device-visible DMA >> object, and map the >> + // level 1 page table onto it. >> + // SAFETY: `this` is a valid pointer, and `lvl1` has been >> initialized. >> + let lvl1_sg_table = unsafe { &(*this.as_ptr()).lvl1 }; >> + let mut lvl0 = DmaObject::new(dev, GSP_PAGE_SIZE)?; >> + // SAFETY: we are the only owner of this newly-created >> object, making races >> + // impossible. >> + let lvl0_slice = unsafe { lvl0.as_slice_mut(0, >> GSP_PAGE_SIZE) }?; >> + lvl0_slice[0..core::mem::size_of::<u64>()].copy_from_slice( >> + #[allow(clippy::useless_conversion)] >> + >> &(u64::from(lvl1_sg_table.iter().next().unwrap().dma_address())).to_le_bytes(), >> + ); >> + >> + lvl0 >> + }, >> + size: fw.len(), >> + }) >> + } >> + >> + #[expect(unused)] >> + /// Returns the DMA handle of the level 0 page table. >> + pub(crate) fn lvl0_dma_handle(&self) -> DmaAddress { >> + self.lvl0.dma_handle() >> + } >> +} >> + >> +/// Create a linear mapping the device mapping of the buffer described by >> `sg_table` into `dst`. > > How about this: > > /// Build a page table from a scatter-gather list. > /// > /// Takes each DMA-mapped region from `sg_table` and writes page table entries > /// for all 4KB pages within that region. For example, a 16KB SG entry becomes > /// 4 consecutive page table entries. Much better. And I mixed some words in the original message which didn't even make sense to begin with.