On Mon, 11 May 2026 15:53:31 +0200 Nicolas Frattaroli <[email protected]> wrote:
> On Monday, 11 May 2026 13:56:41 Central European Summer Time Boris Brezillon > wrote: > > Hi Nicolas, > > > > On Fri, 08 May 2026 20:00:54 +0200 > > Nicolas Frattaroli <[email protected]> wrote: > > > > > In Commit a8f5738779a9 ("drm/panthor: Pass an iomem pointer to GPU > > > register access helpers"), the gpu register access helpers were changed > > > from taking a pointer to a struct panthor_device in their first > > > argument, to taking a void pointer. > > > > > > This can cause problems, as patches based on panthor before this change > > > will still compile fine after it. struct panthor_device * implicitly > > > casts to a void pointer, resulting in completely wrong semantics. > > > > > > Prevent this problem by wrapping the affected functions with macros that > > > specifically check for and reject the struct panthor_device * type as > > > the first argument. > > > > > > Signed-off-by: Nicolas Frattaroli <[email protected]> > > > --- > > > drivers/gpu/drm/panthor/panthor_device.h | 68 > > > +++++++++++++++++++++++++------- > > > 1 file changed, 53 insertions(+), 15 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/panthor/panthor_device.h > > > b/drivers/gpu/drm/panthor/panthor_device.h > > > index 4e4607bca7cc..91e9f499bf69 100644 > > > --- a/drivers/gpu/drm/panthor/panthor_device.h > > > +++ b/drivers/gpu/drm/panthor/panthor_device.h > > > @@ -630,49 +630,87 @@ static inline void panthor_ ## __name ## > > > _irq_disable_events(struct panthor_irq > > > > > > extern struct workqueue_struct *panthor_cleanup_wq; > > > > > > -static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data) > > > +static inline void _gpu_write(void __iomem *iomem, u32 reg, u32 data) > > > { > > > writel(data, iomem + reg); > > > } > > > > > > -static inline u32 gpu_read(void __iomem *iomem, u32 reg) > > > +static inline u32 _gpu_read(void __iomem *iomem, u32 reg) > > > { > > > return readl(iomem + reg); > > > } > > > > > > -static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg) > > > +static inline u32 _gpu_read_relaxed(void __iomem *iomem, u32 reg) > > > > First off, I'm not a huge fan of these _ prefixes to convey the "internal, > > don't use directly" information. If we're going to have macros around these > > helpers (meaning these helpers won't be used directly), I'd rather have a > > fully descriptive name like gpu_{write,read}_iomem[_relaxed]() and a comment > > stating that they should never be used. > > Agreed. > > > > > > { > > > return readl_relaxed(iomem + reg); > > > } > > > > > > -static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data) > > > +/* > > > + * The function signature of gpu_read/gpu_write/gpu_read_relaxed/... > > > used to > > > + * take a &struct panthor_device* as the first parameter. During the > > > split of > > > + * iomem ranges into individual sub-components, this was changed to take > > > a > > > + * void __iomem* instead. These wrappers exists Tto avoid situations > > > wherein > > > + * pre-refactor patches are applied in error, as they'd compile fine. > > > That's > > > + * because the old calling convention's first parameter implicitly casts > > > to a > > > + * void pointer. > > > + */ > > > + > > > +#define gpu_write(iomem, reg, data) ({ > > > \ > > > + static_assert(!__same_type((iomem), struct panthor_device *)); \ > > > > Hm, this only covers ptdev being passed as an iomem pointer. I know it's > > the only case we had so far, but if we're going to add type enforcement, > > I think I'd prefer if we were covered for more than just ptdev. > > > > One way of doing that would be to wrap the `void __iomem *iomem` in an > > explicit type like: > > > > struct panthor_reg_bank { > > void __iomem *iomem; > > }; > > > > which then gets passed to gpu_{read,write} helpers (see the diff below). > > Hm, okay, the diff below is smaller than I feared. Though it doesn't get > us type checking for someone, say, trying to read GPU_STATUS with the > iomem of panthor_fw. But neither does my proposal below. > > > > > The other way would be to pass the component, and have the macro > > do the <component>->iomem deref, but there's a few places where reg banks > > are accessed outside of the components that own them (panthor_hw.c). > > Yeah, I prototyped going down something along that route by having > the register accessors be generics that are implemented by each > component, and it's a bit messy. Either you expose the struct > definitions of individual components so that this header has visibility > into them (not great), or you add boilerplate "do this accessor > operation for this component" helpers for every component, which is both > verbose and possibly causes the inlining to no longer work, though I have > yet to verify that. > > If we do want to go down this route (though I'm not sure, since your > reg bank solution seems to get us the same guarantees but without bringing > generics into this), then the following may be an okay idea: > > I think having just the iomem deref genericised may be a good middle > ground. If instead of making it a deref, we make it return the pointer > to the member into the component that it can then deref, then the > component-specific part can be pure (since offset of the iomem member > is constant so for a particular pointer to a component, the pointer to > the iomem member only depends on the passed-in pointer to component.) > > This should make sure that when the compiler gets > > panthor_gpu_write(ptdev->gpu, foo, bar); > val = panthor_gpu_read(ptdev->gpu, baz); > > it can optimise the expanded > > iomem = *panthor_get_iomem_ptr(ptdev->gpu); > panthor_actual_write(iomem, foo, bar); > iomem = *panthor_get_iomem_ptr(ptdev->gpu); > val = panthor_actual_read(iomem, baz); > > to the simplified > > iomem = *panthor_get_iomem_ptr(ptdev->gpu); > panthor_actual_write(iomem, foo, bar); > val = panthor_actual_read(iomem, baz); > > because panthor_get_iomem_ptr will be known to return the same value > when called with the same input param. > > Anway, I think it's probably best if I abandon this and you just send > your patch to the list with a real base. I only have one comment on it, > which I've included inline. > > > > > Regards, > > > > Boris > > > > --->8--- > > [... snip ...] > > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c > > b/drivers/gpu/drm/panthor/panthor_mmu.c > > index 9d4500850561..d4cc105afe24 100644 > > --- a/drivers/gpu/drm/panthor/panthor_mmu.c > > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c > > @@ -55,8 +55,8 @@ struct panthor_as_slot { > > * struct panthor_mmu - MMU related data > > */ > > struct panthor_mmu { > > - /** @iomem: CPU mapping of MMU_AS_CONTROL iomem region */ > > - void __iomem *iomem; > > + /** @regs: CPU mapping of MMU_AS_CONTROL iomem region */ > > + struct panthor_reg_bank regs; > > > > /** @irq: The MMU irq. */ > > struct panthor_irq irq; > > @@ -527,7 +527,7 @@ static int wait_ready(struct panthor_device *ptdev, u32 > > as_nr) > > /* Wait for the MMU status to indicate there is no active command, > > in > > * case one is pending. > > */ > > - ret = gpu_read_relaxed_poll_timeout_atomic(mmu->iomem, > > AS_STATUS(as_nr), val, > > + ret = gpu_read_relaxed_poll_timeout_atomic(mmu->regs, > > AS_STATUS(as_nr), val, > > !(val & > > AS_STATUS_AS_ACTIVE), 10, 100000); > > > > if (ret) { > > @@ -545,7 +545,7 @@ static int as_send_cmd_and_wait(struct panthor_device > > *ptdev, u32 as_nr, u32 cmd > > /* write AS_COMMAND when MMU is ready to accept another command */ > > status = wait_ready(ptdev, as_nr); > > if (!status) { > > - gpu_write(ptdev->mmu->iomem, AS_COMMAND(as_nr), cmd); > > + gpu_write(ptdev->mmu->regs, AS_COMMAND(as_nr), cmd); > > status = wait_ready(ptdev, as_nr); > > } > > > > @@ -598,9 +598,9 @@ static int panthor_mmu_as_enable(struct panthor_device > > *ptdev, u32 as_nr, > > panthor_mmu_irq_enable_events(&ptdev->mmu->irq, > > panthor_mmu_as_fault_mask(ptdev, > > as_nr)); > > > > - gpu_write64(mmu->iomem, AS_TRANSTAB(as_nr), transtab); > > - gpu_write64(mmu->iomem, AS_MEMATTR(as_nr), memattr); > > - gpu_write64(mmu->iomem, AS_TRANSCFG(as_nr), transcfg); > > + gpu_write64(mmu->regs, AS_TRANSTAB(as_nr), transtab); > > + gpu_write64(mmu->regs, AS_MEMATTR(as_nr), memattr); > > + gpu_write64(mmu->regs, AS_TRANSCFG(as_nr), transcfg); > > > > return as_send_cmd_and_wait(ptdev, as_nr, AS_COMMAND_UPDATE); > > } > > @@ -636,9 +636,9 @@ static int panthor_mmu_as_disable(struct panthor_device > > *ptdev, u32 as_nr, > > if (recycle_slot) > > return 0; > > > > - gpu_write64(mmu->iomem, AS_TRANSTAB(as_nr), 0); > > - gpu_write64(mmu->iomem, AS_MEMATTR(as_nr), 0); > > - gpu_write64(mmu->iomem, AS_TRANSCFG(as_nr), > > AS_TRANSCFG_ADRMODE_UNMAPPED); > > + gpu_write64(mmu->regs, AS_TRANSTAB(as_nr), 0); > > + gpu_write64(mmu->regs, AS_MEMATTR(as_nr), 0); > > + gpu_write64(mmu->regs, AS_TRANSCFG(as_nr), > > AS_TRANSCFG_ADRMODE_UNMAPPED); > > > > return as_send_cmd_and_wait(ptdev, as_nr, AS_COMMAND_UPDATE); > > } > > @@ -791,7 +791,7 @@ int panthor_vm_active(struct panthor_vm *vm) > > */ > > fault_mask = panthor_mmu_as_fault_mask(ptdev, as); > > if (ptdev->mmu->as.faulty_mask & fault_mask) { > > - gpu_write(ptdev->mmu->irq.iomem, INT_CLEAR, fault_mask); > > + gpu_write(ptdev->mmu->irq.regs, INT_CLEAR, fault_mask); > > ptdev->mmu->as.faulty_mask &= ~fault_mask; > > } > > > > @@ -1738,7 +1738,7 @@ static int panthor_vm_lock_region(struct panthor_vm > > *vm, u64 start, u64 size) > > mutex_lock(&ptdev->mmu->as.slots_lock); > > if (vm->as.id >= 0 && size) { > > /* Lock the region that needs to be updated */ > > - gpu_write64(ptdev->mmu->iomem, AS_LOCKADDR(vm->as.id), > > + gpu_write64(ptdev->mmu->regs, AS_LOCKADDR(vm->as.id), > > pack_region_range(ptdev, &start, &size)); > > > > /* If the lock succeeded, update the locked_region info. */ > > @@ -1800,8 +1800,8 @@ static void panthor_mmu_irq_handler(struct > > panthor_device *ptdev, u32 status) > > u32 access_type; > > u32 source_id; > > > > - fault_status = gpu_read(mmu->iomem, AS_FAULTSTATUS(as)); > > - addr = gpu_read64(mmu->iomem, AS_FAULTADDRESS(as)); > > + fault_status = gpu_read(mmu->regs, AS_FAULTSTATUS(as)); > > + addr = gpu_read64(mmu->regs, AS_FAULTADDRESS(as)); > > > > /* decode the fault status */ > > exception_type = fault_status & 0xFF; > > @@ -1832,7 +1832,7 @@ static void panthor_mmu_irq_handler(struct > > panthor_device *ptdev, u32 status) > > * Note that COMPLETED irqs are never cleared, but this is > > fine > > * because they are always masked. > > */ > > - gpu_write(mmu->irq.iomem, INT_CLEAR, mask); > > + gpu_write(mmu->irq.regs, INT_CLEAR, mask); > > > > if (ptdev->mmu->as.slots[as].vm) > > ptdev->mmu->as.slots[as].vm->unhandled_fault = true; > > @@ -3255,7 +3255,7 @@ int panthor_mmu_init(struct panthor_device *ptdev) > > if (ret) > > return ret; > > > > - mmu->iomem = ptdev->iomem + MMU_AS_BASE; > > + mmu->regs.iomem = ptdev->iomem + MMU_AS_BASE; > > ptdev->mmu = mmu; > > > > irq = platform_get_irq_byname(to_platform_device(ptdev->base.dev), > > "mmu"); > > @@ -3264,7 +3264,7 @@ int panthor_mmu_init(struct panthor_device *ptdev) > > > > ret = panthor_request_mmu_irq(ptdev, &mmu->irq, irq, > > panthor_mmu_fault_mask(ptdev, ~0), > > - ptdev->iomem + MMU_INT_BASE); > > + mmu->regs.iomem + MMU_INT_BASE); > > This is wrong. Before, we used an effective whole-device offset of 0x2000. > Now it's 0x4400. Uh, right. I hate the fact sometimes xxx_INT_BASE is some offset into the the xxx_BASE regbank (GPU_INT_BASE), and other times ({MMU,JOB}_INT_BASE), it's an absolute offset from the start of the whole IO region.
