On 09/02/2026 15:35, Boris Brezillon wrote:
> On Mon, 9 Feb 2026 15:22:09 +0000
> Liviu Dudau <[email protected]> wrote:
>
>>>> Ultimately the role of this RFC is to start a discussion and to figure out a path
>>>> forward for CSF GPUs where we want now to tighen a bit the formats we support and
>>>> add PBHA and in the future we want to add support for v15+ page formats.
>>>
>>> PBHA is definitely an area for discussion. AIUI there are out-of-tree
>>> patches floating about for CPU support, but it hasn't been upstreamed. I
>>> don't know if any serious attempt has been made to push it upstream, but
>>> it's tricky because the architecture basically just says "IMPLEMENTATION
>>> DEFINED" which means you are no longer coding to the architecture but a
>>> specific implementation - and there's remarkably little documentation
>>> about what PBHA is used for in practice.
>>>
>>> I haven't looked into the GPU situation with PBHA - again it would be
>>> good to have more details on how the bits would be set.
>>
>> I have a patch series that adds support in Panthor to apply some PBHA bits defined
>> in the DT based on an ID also defined in the DT and passed along as a VM_BIND parameter
>> if you want to play with it. However I have no direct knowledge on which PBHA values
>> would make a difference on the supported platforms (RK3xxx for example).

So we need something better than a DT entry saying e.g. "ID 3 is bit
pattern 0100". We need something that describes the actual behaviour of
a PBHA value. Otherwise user space will end up needing to know the exact
hardware platform it's running on to know what ID values mean.

> I don't know if that's what it's going be used for, but one very
> specific use case I'd like to see this PBHA extension backed by is
> "read-zero/write-discard" behavior that's needed for sparse bindings.
> Unfortunately, I've not heard on any HW-support for that in older
> gens...

*This* is a good example of something useful that could be exposed. If
the DT can describe that the hardware supports "read-zero/write-discard"
behaviour with a specific bit pattern, then we can advertise that to
user space and provide a flag for VM_BIND which gives that behaviour.
And user space can make good use of it.

But from what I've heard the implementations tend to have something more
like a hint mechanism, where the bits affect the behaviour of the caches
but have no functional effect. This makes it much harder to expose to
user space in a meaningful way because it's highly platform dependent
what "don't allocate in the system level cache" actually means in terms
of performance effects. But it's possible we could describe more of a
usage-based flag - i.e. "PBHA bits good for a tiler heap".

Thanks,
Steve
