On Wed, Nov 19, 2025 at 3:14 AM Pierre-Eric Pelloux-Prayer
<[email protected]> wrote:
>
>
>
> Le 18/11/2025 à 23:06, Alex Deucher a écrit :
> > We need to call amdgpu_vm_handle_fault() on page fault
> > on all gfx9 and newer parts to properly update the
> > page tables, not just for recoverable page faults.
> >
> > Signed-off-by: Alex Deucher <[email protected]>
> > ---
> > drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 27 ++++++++++++++++++++++++++
> > 1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > index 7bc389d9f5c48..25cdcb850416c 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > @@ -103,12 +103,39 @@ static int gmc_v11_0_process_interrupt(struct amdgpu_device *adev,
> > uint32_t vmhub_index = entry->client_id == SOC21_IH_CLIENTID_VMC ?
> > AMDGPU_MMHUB0(0) : AMDGPU_GFXHUB(0);
> > struct amdgpu_vmhub *hub = &adev->vmhub[vmhub_index];
> > + bool retry_fault = !!(entry->src_data[1] & 0x80);
> > + bool write_fault = !!(entry->src_data[1] & 0x20);
> > uint32_t status = 0;
> > u64 addr;
> >
> > addr = (u64)entry->src_data[0] << 12;
> > addr |= ((u64)entry->src_data[1] & 0xf) << 44;
> >
> > + if (retry_fault) {
> > + /* Returning 1 here also prevents sending the IV to the KFD */
> > +
> > + /* Process it onyl if it's the first fault for this address */
>
> typo: onyl -> only (same for patch 2/3)
Fixed locally. Thanks!

Alex
>
> Pierre-Eric
>
>
>
> > + if (entry->ih != &adev->irq.ih_soft &&
> > + amdgpu_gmc_filter_faults(adev, entry->ih, addr, entry->pasid,
> > + entry->timestamp))
> > + return 1;
> > +
> > + /* Delegate it to a different ring if the hardware hasn't
> > + * already done it.
> > + */
> > + if (entry->ih == &adev->irq.ih) {
> > + amdgpu_irq_delegate(adev, entry, 8);
> > + return 1;
> > + }
> > +
> > + /* Try to handle the recoverable page faults by filling page
> > + * tables
> > + */
> > + if (amdgpu_vm_handle_fault(adev, entry->pasid, 0, 0, addr,
> > + entry->timestamp, write_fault))
> > + return 1;
> > + }
> > +
> > if (!amdgpu_sriov_vf(adev)) {
> > /*
> > * Issue a dummy read to wait for the status register to