On Monday, June 15, 2026 4:44:22 PM Central European Summer Time Tvrtko
Ursulin wrote:
> On 25/05/2026 12:45, Timur Kristóf wrote:
> > Instead of writing the doorbell in amdgpu_gmc_handle_retry_fault()
> > directly, add an IH function pointer which can be defined in
> > a different way for different IH versions.
> >
> > This is to allow implementing the filter CAM without a doorbell.
> >
> > Signed-off-by: Timur Kristóf <[email protected]>
> > ---
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 +
> > drivers/gpu/drm/amd/amdgpu/ih_v7_0.c | 6 ++++++
> > drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 8 +++++++-
> > 4 files changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index
> > 52258f1341c2..d790b7619ccd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -565,7 +565,7 @@ int amdgpu_gmc_handle_retry_fault(struct amdgpu_device
> > *adev,>
> > ret = amdgpu_vm_handle_fault(adev, entry->pasid,
entry->vmid, node_id,
> >
> > addr, entry-
>timestamp, write_fault);
> >
> > - WDOORBELL32(adev->irq.retry_cam_doorbell_index,
cam_index);
> > + adev->irq.ih_funcs->retry_cam_ack(adev, cam_index);
>
> How does not map which IP generations can end up calling it? Presumably
> your selection of ih_v7_0 and vega20_ih.c is an insightful one, but for
> me I see amdgpu_gmc_handle_retry_fault() is called from
> gmc_v9_0_process_interrupt, gmc_v10_0_process_interrupt,
> gmc_v11_0_process_interrupt and gmc_v12_0_process_interrupt(). Is there
> a map somewhere which shows which GMC versions go with which IH blocks?
The hardware writes interrupt data into a so-called IH ring (interrupt handler
ring):
- Old GPUs only have one IH ring
- Newer dedicated GPUs have two IH rings
- APUs only have one IH ring
- Additionally there is a soft IH ring, which is implemented entirely in the
driver and is used to make the processing more reliable.
Specifically for retry faults, it is beneficial to configure page fault
interrupts on the second IH ring (when available) to avoid them competing with
other interrupts for space in the ring. In the upstream code this is only done
on Vega dGPUs. My series implements that for GFX11 and a subsequent series
also for GFX12 dGPUs.
To answer your actual question, it all depends on the IH IP block for any
given generation. For some GPUs (but not all), the IH code configures the
hardware to use the second IH ring for page fault interrupts. For everything
else, the first IH ring is used. On GFX12 it seems the interrupts are handled
on the first IH ring even though the second ring is configured; it's unclear if
that's a bug or missing code in the kernel.
Hope this helps,
Timur
>
> > if (ret)
> >
> > return 1;
> >
> > } else {
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 444437c30088..e6e34f6e86f4
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > @@ -97,6 +97,7 @@ struct amdgpu_ih_funcs {
> >
> > const char *(*node_id_to_die_name)(struct amdgpu_device *adev,
> >
> > unsigned int
node_id,
> > char *buf, size_t
size);
> >
> > + void (*retry_cam_ack)(struct amdgpu_device *adev, u32 cam_index);
> >
> > };
> >
> > #define amdgpu_ih_get_wptr(adev, ih)
> > (adev)->irq.ih_funcs->get_wptr((adev), (ih))>
> > diff --git a/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c
> > b/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c index 6de9e87e04e1..c2431f4c2671
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c
> > @@ -289,6 +289,11 @@ static uint32_t ih_v7_0_setup_retry_doorbell(u32
> > doorbell_index)>
> > return val;
> >
> > }
> >
> > +static void ih_v7_0_retry_cam_ack(struct amdgpu_device *adev, u32
> > cam_index) +{
> > + WDOORBELL32(adev->irq.retry_cam_doorbell_index, cam_index);
> > +}
> > +
> >
> > #define regIH_RING1_CLIENT_CFG_INDEX_V7_1 0x122
> > #define regIH_RING1_CLIENT_CFG_INDEX_V7_1_BASE_IDX 0
> > #define regIH_RING1_CLIENT_CFG_DATA_V7_1 0x123
> >
> > @@ -858,6 +863,7 @@ static const struct amdgpu_ih_funcs ih_v7_0_funcs = {
> >
> > .decode_iv_ts = amdgpu_ih_decode_iv_ts_helper,
> > .set_rptr = ih_v7_0_set_rptr,
> > .node_id_to_die_name = ih_v7_0_node_id_to_die_name,
> >
> > + .retry_cam_ack = ih_v7_0_retry_cam_ack,
> >
> > };
> >
> > static void ih_v7_0_set_interrupt_funcs(struct amdgpu_device *adev)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c index 85846fd08ce4..30a82fff3ff7
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > @@ -293,6 +293,11 @@ static uint32_t vega20_setup_retry_doorbell(u32
> > doorbell_index)>
> > return val;
> >
> > }
> >
> > +static void vega20_retry_cam_ack(struct amdgpu_device *adev, u32
> > cam_index) +{
> > + WDOORBELL32(adev->irq.retry_cam_doorbell_index, cam_index);
> > +}
> > +
> >
> > /**
> >
> > * vega20_ih_irq_init - init and enable the interrupt ring
> > *
> >
> > @@ -738,7 +743,8 @@ static const struct amdgpu_ih_funcs vega20_ih_funcs =
> > {
> >
> > .get_wptr = vega20_ih_get_wptr,
> > .decode_iv = amdgpu_ih_decode_iv_helper,
> > .decode_iv_ts = amdgpu_ih_decode_iv_ts_helper,
> >
> > - .set_rptr = vega20_ih_set_rptr
> > + .set_rptr = vega20_ih_set_rptr,
> > + .retry_cam_ack = vega20_retry_cam_ack,
> >
> > };
> >
> > static void vega20_ih_set_interrupt_funcs(struct amdgpu_device *adev)