On Thu, Mar 19, 2026 at 12:29 PM Alex Williamson <[email protected]> wrote:
>
> On Thu, 19 Mar 2026 19:04:37 +0000
> David Matlack <[email protected]> wrote:
>
> > On 2026-03-17 02:42 PM, Rubin Du wrote:
> > > Add a new VFIO PCI driver for NVIDIA GPUs that enables DMA testing
> > > via the Falcon (Fast Logic Controller) microcontrollers. This driver
> > > extracts and adapts the DMA test functionality from the NVIDIA
> > > gpu-admin-tools project and integrates it into the existing VFIO
> > > selftest framework.
> > >
> > > The Falcon is a general-purpose microcontroller present on NVIDIA GPUs
> > > that can perform DMA operations between system memory and device memory.
> > > By leveraging Falcon DMA, this driver allows NVIDIA GPUs to be tested
> > > alongside Intel IOAT and DSA devices using the same selftest 
> > > infrastructure.
> > >
> > > Supported GPUs:
> > > - Kepler: K520, GTX660, K4000, K80, GT635
> > > - Maxwell Gen1: GTX750, GTX745
> > > - Maxwell Gen2: M60
> > > - Pascal: P100, P4, P40
> > > - Volta: V100
> > > - Turing: T4
> > > - Ampere: A16, A100, A10
> > > - Ada: L4, L40S
> > > - Hopper: H100
> > >
> > > The PMU falcon on Kepler and Maxwell Gen1 GPUs uses legacy FBIF register
> > > offsets and requires enabling via PMC_ENABLE with the HUB bit set.
> > >
> > > Limitations and tradeoffs:
> > >
> > > 1. Architecture support:
> > >    Blackwell and newer architectures may require additional work
> > >    due to firmware.
> > >
> > > 2. Synchronous DMA operations:
> > >    Each transfer blocks until completion because the reference
> > >    implementation does not expose command queuing - only one
> > >    DMA operation can be in flight at a time.
> >
> > Asynchronous DMA will be important for testing Live Update:
> >
> >   https://lore.kernel.org/kvm/[email protected]/
> >
> > That is why I split memcpy_start() and memcpy_wait() from the beginning.
> >
> > Would it be possible to add support for it here even though it is not in
> > the reference implementation?
>
> I'll leave the can-we questions to Rubin, but do you see either the MSI
> or asynchronous issues as blockers?

No, I don't consider either to be hard blockers.

> Currently our driver tests are
> limited to a very narrow range of Intel server platforms, whereas this
> is a pluggable endpoint we can install anywhere.  I'd think that's
> sufficiently valuable in expanding the test base to make some
> compromises.  Thanks,

Yeah, we can compromise if both issues cannot be resolved.

I hope to limit differences between the selftests drivers as much as
possible, so that tests don't have to care. But I also recognize some
divergence will be inevitable if we want to support a broad set of
devices while also supporting more advanced features like asynchronous
memcpy.
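
For illustration only, here is a minimal sketch of how an engine without
command queuing could still sit behind a start/wait split, so tests would
not have to special-case it. Every name below (fake_dma_dev,
fake_memcpy_start(), fake_memcpy_wait()) is hypothetical and is not the
selftest framework's API or code from the proposed driver, and memcpy()
merely stands in for programming and polling the DMA engine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device state; not the selftest framework's real struct. */
struct fake_dma_dev {
	int in_flight;	/* 1 while a started copy has not been waited on */
	int status;	/* result of the most recent copy, 0 on success */
};

/*
 * "Start" for an engine without command queuing: the copy runs to
 * completion here, and only its result is deferred to the wait step.
 * memcpy() stands in for programming and polling the DMA engine.
 */
static int fake_memcpy_start(struct fake_dma_dev *dev, void *dst,
			     const void *src, size_t size)
{
	if (dev->in_flight)
		return -1;	/* only one operation may be outstanding */

	memcpy(dst, src, size);
	dev->status = 0;
	dev->in_flight = 1;
	return 0;
}

/* "Wait" only reports the result recorded by the start step. */
static int fake_memcpy_wait(struct fake_dma_dev *dev)
{
	assert(dev->in_flight);	/* calling wait without start is a test bug */
	dev->in_flight = 0;
	return dev->status;
}
```

A shim like this keeps the calling convention uniform, but the transfer
has already finished by the time the start call returns, so it would not
give tests a DMA that is still in flight between start and wait.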
