amdkfd: Add support for non-4K page size systems

Christian König Fri, 12 Dec 2025 05:01:37 -0800

On 12/12/25 11:45, Ritesh Harjani (IBM) wrote:
> Christian König <[email protected]> writes:
>>> Setup details:
>>> ============
>>> System details: Power10 LPAR using 64K pagesize.
>>> AMD GPU:
>>>   Name:                    gfx90a
>>>   Marketing Name:          AMD Instinct MI210
>>>
>>> Queries:
>>> =======
>>> 1. We currently ran rocr-debug agent tests [1] and rccl unit tests [2] to
>>>    test these changes. Is there anything else that you would suggest us to
>>>    run to shake out any other page size related issues w.r.t the kernel
>>>    driver?
>>
>> The ROCm team needs to answer that.
>>
> 
> Is there any separate mailing list or list of people whom we can cc
> then?


With Felix on CC you already got the right person, but he's on vacation and 
will not be back before the end of the year.

I can check on Monday whether some people are still around who could answer a 
couple of questions, but in general don't expect a quick response.

>>> 2. Patch 1/8: We have a query regarding the EOP buffer size. Is this EOP
>>>    ring buffer size HW dependent? Should it be made PAGE_SIZE?
>>
>> Yes and no.
>>
> 
> Could you please elaborate on this? I am assuming you would
> anyway respond with more context / details on Patch-1 itself. If yes,
> that would be great!

Well, in general the EOP (End of Pipe) buffer is a ring buffer containing all 
the events and actions the CP should execute when shaders and cache flushes 
finish.

The size depends on the HW generation, the configuration of the GPU, etc., but 
don't ask me for details on how it is calculated.

The point is that the size is completely unrelated to the CPU, so using 
PAGE_SIZE is clearly incorrect.
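
As a rough illustration of that point (a sketch, not code from the patch
series): the ring size the GPU works with stays the same whatever the CPU page
size is, and only the CPU-side backing allocation would be rounded up to the
page size. HW_EOP_SIZE and both helper functions are hypothetical names, and
the value 4096 is an assumption made up for the example.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


# Hypothetical value for illustration only; the real size depends on the
# GPU generation and configuration.
HW_EOP_SIZE = 4096


def eop_sizes(cpu_page_size):
    """Return (ring size seen by the GPU, size of the CPU-side allocation)."""
    ring_size = HW_EOP_SIZE                             # independent of the CPU
    alloc_size = align_up(HW_EOP_SIZE, cpu_page_size)   # CPU allocation granularity
    return ring_size, alloc_size


for page_size in (4096, 65536):
    print(page_size, eop_sizes(page_size))
```

Under these assumptions a 64K page system allocates more memory, but the size
handed to the hardware does not change.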

>>>
>>> 3. Patch 5/8: We also have a query w.r.t the error paths when the system
>>>    page size is > 4K. Do we need to lift this restriction and add MMIO
>>>    remap support for systems with non-4K page sizes?
>>
>> The problem is the HW can't do this.
>>
> 
> We aren't that familiar with the HW / SW stack here. Wanted to understand
> what functionality will be unsupported due to this HW limitation then?

The problem is that the CPU must map some of the registers/resources of the GPU 
into the address space of the application and you run into security issues when 
you map more than 4k at a time.
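
The arithmetic behind this can be sketched as follows (an illustration only;
the register window offset and length are made-up values, and mapped_span and
extra_bytes are hypothetical helpers). The CPU can only map whole pages, so a
4K register window mapped with 64K pages drags the neighbouring 60K of the BAR
into the application's address space as well.

```python
def mapped_span(offset, length, page_size):
    """Page-aligned [start, end) range the CPU must map to cover the window."""
    start = offset & ~(page_size - 1)
    end = (offset + length + page_size - 1) & ~(page_size - 1)
    return start, end


def extra_bytes(offset, length, page_size):
    """Bytes exposed to the application beyond the intended window."""
    start, end = mapped_span(offset, length, page_size)
    return (end - start) - length


# Hypothetical 4K register window at offset 0x3000 inside a BAR.
WINDOW_OFFSET = 0x3000
WINDOW_LENGTH = 0x1000

for page_size in (4096, 65536):
    print(page_size,
          mapped_span(WINDOW_OFFSET, WINDOW_LENGTH, page_size),
          extra_bytes(WINDOW_OFFSET, WINDOW_LENGTH, page_size))
```

With 4K pages nothing beyond the window is exposed. With 64K pages everything
else that lives in the same 64K of the BAR becomes accessible too.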

>>>
>>> [1] ROCr debug agent tests: https://github.com/ROCm/rocr_debug_agent
>>> [2] RCCL tests: https://github.com/ROCm/rccl/tree/develop/test
>>>
>>>
>>> Please note that the changes in this series are on a best effort basis
>>> from our end. Therefore, requesting the amd-gfx community (who have deeper
>>> knowledge of the HW & SW stack) to kindly help with the review and provide
>>> feedback / comments on these patches. The idea here is, to also have
>>> non-4K pagesize (e.g. 64K) well supported with amd gpu kernel driver.
>>
>> Well this is generally nice to have, but there are unfortunately some HW 
>> limitations which make ROCm pretty much unusable on non-4K page size 
>> systems.
> 
> That's a bummer :( 
> - Do we have some HW documentation around what are these limitations around 
> non-4K pagesize? Any links to such please?

You already mentioned MMIO remap which obviously has that problem, but if I'm 
not completely mistaken the PCIe doorbell BAR and some global seq counter 
resources will also cause problems here.

This can all be worked around by delegating those MMIO accesses into the 
kernel, but that means tons of extra IOCTL overhead.

Especially the cache flushes which are necessary to avoid corruption are really 
bad for performance in such an approach.

> - Are there any newer AMD GPU versions which lift such restrictions?

Not that I know of.

>> What we can do is to support graphics and MM, but that should already work 
>> out of the box.
>>
> 
> - Maybe we should also document, what will work and what won't work due to 
> these HW limitations.

Well, pretty much everything. I need to double check how ROCm does HDP 
flushing/invalidating when the MMIO remap isn't available.

Could be that there is already a fallback path and that's the reason why this 
approach actually works at all.

>> What we can do is to support graphics and MM, but that should already work 
>> out of the box.
> 
> So these patches helped us resolve most of the issues like SDMA hangs
> and GPU kernel page faults which we saw with rocr and rccl tests with
> 64K pagesize. Meaning, we didn't see this working out of box perhaps
> due to 64K pagesize.

Yeah, but this is all for ROCm and not the graphics side.

To be honest I'm not sure how ROCm even works when you have 64k pages at the 
moment. I would expect many more issues lurking in the kernel driver.

> AFAIU, some of these patches may require re-work based on reviews, but
> at least with these changes, we were able to see all the tests passing.
> 
>> I need to talk with Alex and the ROCm team about whether workarounds can be 
>> implemented for those issues.
>>
> 
> Thanks a lot! That would be super helpful!
> 
> 
>> Regards,
>> Christian.
>>
> 
> Thanks again for the quick response on the patch series.

You are welcome, but since it's so near to the end of the year not all people 
are available any more.

Regards,
Christian.

> 
> -ritesh

Re: [RFC PATCH v1 0/8] amdgpu/amdkfd: Add support for non-4K page size systems