On Fri, May 03, 2024 at 06:06:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 03/05/2024 16:58, Alex Deucher wrote:
> > On Fri, May 3, 2024 at 11:33 AM Daniel Vetter <dan...@ffwll.ch> wrote:
> > > 
> > > On Fri, May 03, 2024 at 01:58:38PM +0100, Tvrtko Ursulin wrote:
> > > > 
> > > > [And I forgot dri-devel.. doing well!]
> > > > 
> > > > On 03/05/2024 13:40, Tvrtko Ursulin wrote:
> > > > > 
> > > > > [Correcting Christian's email]
> > > > > 
> > > > > On 03/05/2024 13:36, Tvrtko Ursulin wrote:
> > > > > > From: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
> > > > > > 
> > > > > > Currently it is not well defined what is drm-memory- compared to 
> > > > > > other
> > > > > > categories.
> > > > > > 
> > > > > > In practice the only driver which emits these keys is amdgpu and in 
> > > > > > them
> > > > > > exposes the total memory use (including shared).
> > > > > > 
> > > > > > Document that drm-memory- and drm-total-memory- are aliases to
> > > > > > prevent any
> > > > > > confusion in the future.
> > > > > > 
> > > > > > While at it also clarify that the reserved sub-string 'memory' 
> > > > > > refers to
> > > > > > the memory region component.
> > > > > > 
> > > > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
> > > > > > Cc: Alex Deucher <alexander.deuc...@amd.com>
> > > > > > Cc: Christian König <christian.keo...@amd.com>
> > > > > 
> > > > > Mea culpa, I copied the mistake from
> > > > > 77d17c4cd0bf52eacfad88e63e8932eb45d643c5. :)
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > Tvrtko
> > > > > 
> > > > > > Cc: Rob Clark <robdcl...@chromium.org>
> > > > > > ---
> > > > > >    Documentation/gpu/drm-usage-stats.rst | 10 +++++++++-
> > > > > >    1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/Documentation/gpu/drm-usage-stats.rst
> > > > > > b/Documentation/gpu/drm-usage-stats.rst
> > > > > > index 6dc299343b48..ef5c0a0aa477 100644
> > > > > > --- a/Documentation/gpu/drm-usage-stats.rst
> > > > > > +++ b/Documentation/gpu/drm-usage-stats.rst
> > > > > > @@ -128,7 +128,9 @@ Memory
> > > > > >    Each possible memory type which can be used to store buffer
> > > > > > objects by the
> > > > > >    GPU in question shall be given a stable and unique name to be
> > > > > > returned as the
> > > > > > -string here.  The name "memory" is reserved to refer to normal
> > > > > > system memory.
> > > > > > +string here.
> > > > > > +
> > > > > > +The region name "memory" is reserved to refer to normal system 
> > > > > > memory.
> > > > > >    Value shall reflect the amount of storage currently consumed by
> > > > > > the buffer
> > > > > >    objects belong to this client, in the respective memory region.
> > > > > > @@ -136,6 +138,9 @@ objects belong to this client, in the respective
> > > > > > memory region.
> > > > > >    Default unit shall be bytes with optional unit specifiers of 
> > > > > > 'KiB'
> > > > > > or 'MiB'
> > > > > >    indicating kibi- or mebi-bytes.
> > > > > > +This is an alias for drm-total-<region> and only one of the two
> > > > > > should be
> > > > > > +present.
> > > 
> > > This feels a bit awkward and seems to needlessly complicate fdinfo uapi.
> > > 
> > > - Could we just patch amdgpu to follow everyone else, and avoid the
> > >    special case? If there's no tool that relies on the special amdgpu
> > >    prefix then that would be a lot easier.
> > > 
> > > - If that's not on the table, could we make everyone (with a suitable
> > >    helper or something) just print both variants, so that we again have
> > >    consisent fdinfo output? Or breaks that a different set of existing
> > >    tools.
> > > 
> > > - Finally maybe could we get away with fixing amd by adding the common
> > >    format there, deprecating the old, fixing the tools that would break 
> > > and
> > >    then maybe if we're lucky, remove the old one from amdgpu in a year or
> > >    so?
> > 
> > I'm not really understanding what amdgpu is doing wrong.  It seems to
> > be following the documentation.  Is the idea that we would like to
> > deprecate drm-memory-<region> in favor of drm-total-<region>?
> > If that's the case, I think the 3rd option is probably the best.  We
> > have a lot of tools and customers using this.  It would have also been
> > nice to have "memory" in the string for the newer ones to avoid
> > conflicts with other things that might be a total or shared in the
> > future, but I guess that ship has sailed.  We should also note that
> > drm-memory-<region> is deprecated.  While we are here, maybe we should
> > clarify the semantics of resident, purgeable, and active.  For
> > example, isn't resident just a duplicate of total?  If the memory was
> > not resident, it would be in a different region.
> 
> Amdgpu isn't doing anything wrong. It just appears when the format was
> discussed no one noticed (me included) that the two keys are not clearly
> described. And it looks there also wasn't a plan to handle the uncelar
> duality in the future.

Yeah I didnt want to imply that amdgpu did anything wrong, just that if we
have a spec where everyone does one thing, except one driver, that's a
really unfortunate situation that will cause endless amounts of pains to
userspace people.

Like entirely different example, but vmwgfx started out as a driver not
using gem buffer ids for it's per-fd buffer objects. And after a decade
they switched because aside from their own vmwgfx specific userspace just
about no-one got the memo. Despite that it was all documented and designed
to allow that case, and we tried to tilt that windmill for years
educating userspace.

Anyway I think you have I plan, I'm out :-)
-Sima

> For me deprecating sounds fine, the 3rd option. I understand we would only
> make amdgpu emit both sets of keys and then remove drm-memory- in due time.
> 
> With regards to key naming, yeah, memory in the name would have been nice.
> We had a lot of discussion on this topic but ship has indeed sailed. It is
> probably workarble for anything new that might come to add their prefix. As
> long as it does not clash with the memory categories is should be fine.
> 
> In terms of resident semantics, think of it as VIRT vs RES in top(1). It is
> for drivers which allocate backing store lazily, on first use.
> 
> Purgeable is for drivers which have a form of MADV_DONTNEED ie. currently
> have backing store but userspace has indicated it can be dropped without
> preserving the content on memory pressure.
> 
> Active is when reservation object says there is activity on the buffer.
> 
> Regards,
> 
> Tvrtko
> 
> > 
> > Alex
> > 
> > > 
> > > Uapi that's "either do $foo or on this one driver, do $bar" is just
> > > guaranteed to fragement the ecosystem, so imo that should be the absolute
> > > last resort.
> > > -Sima
> > > 
> > > > > > +
> > > > > >    - drm-shared-<region>: <uint> [KiB|MiB]
> > > > > >    The total size of buffers that are shared with another file 
> > > > > > (e.g.,
> > > > > > have more
> > > > > > @@ -145,6 +150,9 @@ than a single handle).
> > > > > >    The total size of buffers that including shared and private 
> > > > > > memory.
> > > > > > +This is an alias for drm-memory-<region> and only one of the two
> > > > > > should be
> > > > > > +present.
> > > > > > +
> > > > > >    - drm-resident-<region>: <uint> [KiB|MiB]
> > > > > >    The total size of buffers that are resident in the specified 
> > > > > > region.
> > > 
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Reply via email to