On Mon, 15 Dec 2025 17:14:47 +0000 Lukas Zapolskas <[email protected]> wrote:
> This patch extends the DEV_QUERY ioctl to return information about the
> performance counter setup for userspace, and introduces the new
> ioctl DRM_PANTHOR_PERF_CONTROL in order to allow for the sampling of
> performance counters.
>
> The new design is inspired by the perf aux ringbuffer [0], with the
> insert and extract indices being mapped to userspace, allowing
> multiple samples to be exposed at any given time. To avoid pointer
> chasing, the sample metadata and block metadata are inline with
> the elements they describe.
>
> Userspace is responsible for passing in resources for samples to be
> exposed, including the event file descriptor for notification of new
> sample availability, the ringbuffer BO to store samples, and the
> control BO along with the offset for mapping the insert and extract
> indices. Though these indices are only a total of 8 bytes, userspace
> can then reuse the same physical page for tracking the state of
> multiple buffers by giving different offsets from the BO start to
> map them.
>
> [0]: https://docs.kernel.org/userspace-api/perf_ring_buffer.html
>
> Co-developed-by: Mihail Atanassov <[email protected]>
> Signed-off-by: Mihail Atanassov <[email protected]>
> Signed-off-by: Lukas Zapolskas <[email protected]>
> Reviewed-by: Adrián Larumbe <[email protected]>

A couple things pointed out by Adrian have not been fixed, I think (see
below).

> ---
>  include/uapi/drm/panthor_drm.h | 565 +++++++++++++++++++++++++++++++++
>  1 file changed, 565 insertions(+)
>
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index e238c6264fa1..d1a92172e878 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h

[...]

> +/**
> + * struct drm_panthor_perf_info - Performance counter interface information
> + *
> + * Structure grouping all queryable information relating to the performance counter
> + * interfaces.
> + */
> +struct drm_panthor_perf_info {
> +	/**
> +	 * @counters_per_block: The number of 8-byte counters available in a block.
> +	 */
> +	__u32 counters_per_block;
> +
> +	/**
> +	 * @sample_header_size: The size of the header struct available at the beginning
> +	 *                      of every sample.
> +	 */
> +	__u32 sample_header_size;
> +
> +	/**
> +	 * @block_header_size: The size of the header struct inline with the counters for a
> +	 *                     single block.
> +	 */
> +	__u32 block_header_size;
> +
> +	/**
> +	 * @sample_size: The size of a fully annotated sample, starting with a sample header
> +	 *               of size @sample_header_size bytes, and all available blocks for the current
> +	 *               configuration, each comprised of @counters_per_block 64-bit counters and
> +	 *               a block header of @block_header_size bytes.
> +	 *
> +	 *               The user must use this field to allocate size for the ring buffer. In
> +	 *               the case of new blocks being added, an old userspace can always use
> +	 *               this field and ignore any blocks it does not know about.
> +	 */
> +	__u32 sample_size;
> +
> +	/** @flags: Combination of drm_panthor_perf_feat_flags flags. */
> +	__u32 flags;
> +
> +	/**
> +	 * @supported_clocks: Bitmask of the clocks supported by the GPU.
> +	 *
> +	 * Each bit represents a variant of the enum drm_panthor_perf_clock.
> +	 *
> +	 * For the same GPU, different implementers may have different clocks for the same hardware
> +	 * block. At the moment, up to three clocks are supported, and any clocks that are present
> +	 * will be reported here.
> +	 */
> +	__u32 supported_clocks;
> +
> +	/** @fw_blocks: Number of FW blocks available. */
> +	__u32 fw_blocks;
> +
> +	/** @cshw_blocks: Number of CSHW blocks available. */
> +	__u32 cshw_blocks;
> +
> +	/** @tiler_blocks: Number of tiler blocks available. */
> +	__u32 tiler_blocks;
> +
> +	/** @memsys_blocks: Number of memsys blocks available. */
> +	__u32 memsys_blocks;
> +
> +	/** @shader_blocks: Number of shader core blocks available. */
> +	__u32 shader_blocks;

You need an extra '__u32 pad;' field here so the struct size is a
multiple of 8 bytes.

> +};
> +

[...]

> +
> +/**
> + * struct drm_panthor_perf_ringbuf_control - Struct used to map in the ring buffer control indices
> + *                                           into memory shared between user and kernel.
> + *
> + */
> +struct drm_panthor_perf_ringbuf_control {
> +	/**
> +	 * @extract_idx: The index of the latest sample that was processed by userspace. Only
> +	 *               modifiable by userspace.
> +	 */
> +	__u64 extract_idx;
> +
> +	/**
> +	 * @insert_idx: The index of the latest sample emitted by the kernel. Only modifiable by
> +	 *              modifiable by the kernel.

"modifiable by" is repeated here.

> +	 */
> +	__u64 insert_idx;
> +};
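
To illustrate the padding comment on struct drm_panthor_perf_info above,
here is a rough userspace sketch of the size arithmetic. It is not the
actual uAPI header: the struct names perf_info_as_posted and
perf_info_padded are made up for this example, uint32_t stands in for
__u32, and the field name "pad" is only a suggestion. The struct as
posted has eleven 32-bit fields (44 bytes); one trailing 32-bit pad
brings it to 48 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct as posted: eleven 32-bit fields. */
struct perf_info_as_posted {
	uint32_t counters_per_block;
	uint32_t sample_header_size;
	uint32_t block_header_size;
	uint32_t sample_size;
	uint32_t flags;
	uint32_t supported_clocks;
	uint32_t fw_blocks;
	uint32_t cshw_blocks;
	uint32_t tiler_blocks;
	uint32_t memsys_blocks;
	uint32_t shader_blocks;
};

/* Same fields plus a trailing pad; the name "pad" is an assumption. */
struct perf_info_padded {
	uint32_t counters_per_block;
	uint32_t sample_header_size;
	uint32_t block_header_size;
	uint32_t sample_size;
	uint32_t flags;
	uint32_t supported_clocks;
	uint32_t fw_blocks;
	uint32_t cshw_blocks;
	uint32_t tiler_blocks;
	uint32_t memsys_blocks;
	uint32_t shader_blocks;
	uint32_t pad; /* explicit padding, expected to be zero */
};

/* 11 * 4 = 44 bytes, which is not a multiple of 8. */
static_assert(sizeof(struct perf_info_as_posted) == 44,
	      "unpadded struct is 44 bytes");

/* 12 * 4 = 48 bytes, which is a multiple of 8. */
static_assert(sizeof(struct perf_info_padded) % 8 == 0,
	      "padded struct size is a multiple of 8 bytes");
```

Both checks are evaluated at compile time, so building the file is
enough to confirm the sizes.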
