On Wed, May 21, 2025 at 03:45:09PM +0900, Alexandre Courbot wrote: > Add the common Falcon code and HAL for Ampere GPUs, and instantiate the > GSP and SEC2 Falcons that will be required to boot the GSP.
Maybe add a few more words about the architectural approach taken here? > +/// Valid values for the `size` field of the > [`crate::regs::NV_PFALCON_FALCON_DMATRFCMD`] register. > +#[repr(u8)] > +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)] > +pub(crate) enum DmaTrfCmdSize { > + /// 256 bytes transfer. > + #[default] > + Size256B = 0x6, Can we use a constant from `regs` to assign this value? Or is *this* meant to be the corresponding constant? > +} I wonder what's the correct thing to do for enum variants that do *not* have an arbitrary value, but match a specific register value in general. Should those be part of the `regs` module? > + /// Wait for memory scrubbing to complete. > + fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result { > + util::wait_on(Duration::from_millis(20), || { I general, I think there can be quite a lot of parameters such timeouts can depend on, e.g. chipset, firmware version, etc. I think it could make sense to establish a rule for the project that for such timeouts we require a dedicated `// TIMEOUT: ` comment that mentions the worst case scenario, which we derived this timeout value from. > + /// Perform a DMA write according to `load_offsets` from `dma_handle` > into the falcon's > + /// `target_mem`. > + /// > + /// `sec` is set if the loaded firmware is expected to run in secure > mode. > + fn dma_wr( > + &self, > + bar: &Bar0, > + dma_handle: bindings::dma_addr_t, > + target_mem: FalconMem, > + load_offsets: FalconLoadTarget, > + sec: bool, > + ) -> Result { > + const DMA_LEN: u32 = 256; > + > + // For IMEM, we want to use the start offset as a virtual address > tag for each page, since > + // code addresses in the firmware (and the boot vector) are virtual. > + // > + // For DMEM we can fold the start offset into the DMA handle. > + let (src_start, dma_start) = match target_mem { > + FalconMem::Imem => (load_offsets.src_start, dma_handle), > + FalconMem::Dmem => ( > + 0, > + dma_handle + load_offsets.src_start as bindings::dma_addr_t, We should make this a method of CoherentAllocation, such that we can get a boundary check on the offset calculation. For this purpose dma_rw() should also have the `F: FalconFirmware<Target = E>` generic I think. (No worries about the dependencies; I can create a shared tag for the DMA patches and merge it into the nova tree, such that it doesn't block this series.) > + // Wait for the transfer to complete. > + util::wait_on(Duration::from_millis(2000), || { Yeah, I really think some timeout justification would be nice. > +/// Hardware Abstraction Layer for Falcon cores. > +/// > +/// Implements chipset-specific low-level operations. The trait is generic > against [`FalconEngine`] > +/// so its `BASE` parameter can be used in order to avoid runtime bound > checks when accessing > +/// registers. > +pub(crate) trait FalconHal<E: FalconEngine>: Sync { > + // Activates the Falcon core if the engine is a risvc/falcon dual engine. > + fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result<()> { > + Ok(()) > + } > + > + /// Returns the fused version of the signature to use in order to run a > HS firmware on this > + /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from > the firmware header. > + fn get_signature_reg_fuse_version( Unless the method increases a reference count, please don't use the 'get' prefix. > + &self, > + falcon: &Falcon<E>, > + bar: &Bar0, > + engine_id_mask: u16, > + ucode_id: u8, > + ) -> Result<u32>; > + > + // Program the boot ROM registers prior to starting a secure firmware. > + fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: > &FalconBromParams) > + -> Result<()>; > +} > + > +impl Chipset { > + /// Returns a boxed falcon HAL adequate for this chipset. > + /// > + /// We use a heap-allocated trait object instead of a statically defined > one because the > + /// generic `FalconEngine` argument makes it difficult to define all the > combinations > + /// statically. > + /// > + /// TODO: replace the return type with `KBox` once it gains the ability > to host trait objects. I think we can do this for v5. :-)