Hi, Tvrtko,

On Sat, 2025-10-11 at 09:00 +0100, Tvrtko Ursulin wrote:
> 
> On 10/10/2025 15:11, Thomas Hellström wrote:
> > On Thu, 2025-10-09 at 09:53 +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 08/10/2025 15:39, Thomas Hellström wrote:
> > > > On Wed, 2025-10-08 at 16:02 +0200, Christian König wrote:
> > > > > On 08.10.25 15:50, Tvrtko Ursulin wrote:
> > > > > > 
> > > > > > On 08/10/2025 13:35, Christian König wrote:
> > > > > > > On 08.10.25 13:53, Tvrtko Ursulin wrote:
> > > > > > > > Disclaimer:
> > > > > > > > Please note that as this series includes a patch which
> > > > > > > > touches
> > > > > > > > a good number of
> > > > > > > > drivers I will only copy everyone in the cover letter
> > > > > > > > and
> > > > > > > > the
> > > > > > > > respective patch.
> > > > > > > > Assumption is people are subscribed to dri-devel so can
> > > > > > > > look at
> > > > > > > > the whole series
> > > > > > > > there. I know someone is bound to complain for both the
> > > > > > > > case
> > > > > > > > when everyone is
> > > > > > > > copied on everything for getting too much email, and
> > > > > > > > also
> > > > > > > > for
> > > > > > > > this other case.
> > > > > > > > So please be flexible.
> > > > > > > > 
> > > > > > > > Description:
> > > > > > > > 
> > > > > > > > All drivers which use the TTM pool allocator end up
> > > > > > > > requesting
> > > > > > > > large order
> > > > > > > > allocations when allocating large buffers. Those can be
> > > > > > > > slow
> > > > > > > > due memory pressure
> > > > > > > > and so add latency to buffer creation. But there is
> > > > > > > > often
> > > > > > > > also
> > > > > > > > a size limit
> > > > > > > > above which contiguous blocks do not bring any
> > > > > > > > performance
> > > > > > > > benefits. This series
> > > > > > > > allows drivers to say when it is okay for the TTM to
> > > > > > > > try a
> > > > > > > > bit
> > > > > > > > less hard.
> > > > > > > > 
> > > > > > > > We do this by allowing drivers to specify this cut off
> > > > > > > > point
> > > > > > > > when creating the
> > > > > > > > TTM device and pools. Allocations above this size will
> > > > > > > > skip
> > > > > > > > direct reclaim so
> > > > > > > > under memory pressure worst case latency will improve.
> > > > > > > > Background reclaim is
> > > > > > > > still kicked off and both before and after the memory
> > > > > > > > pressure
> > > > > > > > all the TTM pool
> > > > > > > > buckets remain to be used as they are today.
> > > > > > > > 
> > > > > > > > This is especially interesting if someone has
> > > > > > > > configured
> > > > > > > > MAX_PAGE_ORDER to
> > > > > > > > higher than the default. And even with the default,
> > > > > > > > with
> > > > > > > > amdgpu
> > > > > > > > for example,
> > > > > > > > the last patch in the series makes use of the new
> > > > > > > > feature
> > > > > > > > by
> > > > > > > > telling TTM that
> > > > > > > > above 2MiB we do not expect performance benefits. Which
> > > > > > > > makes
> > > > > > > > TTM not try direct
> > > > > > > > reclaim for the top bucket (4MiB).
> > > > > > > > 
> > > > > > > > End result is TTM drivers become a tiny bit nicer mm
> > > > > > > > citizens
> > > > > > > > and users benefit
> > > > > > > > from better worst case buffer creation latencies. As a
> > > > > > > > side
> > > > > > > > benefit we get rid
> > > > > > > > of two instances of those often very unreadable
> > > > > > > > mutliple
> > > > > > > > nameless booleans
> > > > > > > > function signatures.
> > > > > > > > 
> > > > > > > > If this sounds interesting and gets merge the invidual
> > > > > > > > drivers
> > > > > > > > can follow up
> > > > > > > > with patches configuring their thresholds.
> > > > > > > > 
> > > > > > > > v2:
> > > > > > > >     * Christian suggested to pass in the new data by
> > > > > > > > changing the
> > > > > > > > function signatures.
> > > > > > > > 
> > > > > > > > v3:
> > > > > > > >     * Moved ttm pool helpers into new
> > > > > > > > ttm_pool_internal.h.
> > > > > > > > (Christian)
> > > > > > > 
> > > > > > > Patch #3 is Acked-by: Christian König
> > > > > > > <[email protected]>.
> > > > > > > 
> > > > > > > The rest is Reviewed-by: Christian König
> > > > > > > <[email protected]>
> > > > > > 
> > > > > > Thank you!
> > > > > > 
> > > > > > So I think now I need acks to merge via drm-misc for all
> > > > > > the
> > > > > > drivers which have their own trees. Which seems to be just
> > > > > > xe.
> > > > > 
> > > > > I think you should ping the XE guys for their opinion, but
> > > > > since
> > > > > there shouldn't be any functional change for them you can
> > > > > probably go
> > > > > ahead and merge the patches to drm-misc-next when there is no
> > > > > reply
> > > > > in time.
> > > > 
> > > > I will try to do a review tonight. One thing that comes up
> > > > though,
> > > > is
> > > > the change to ttm_device_init() where you add pool_flags. I had
> > > > another
> > > > patch series a number of months ago that added a struct with
> > > > flags
> > > > there instead to select the return value given when OOM. Now
> > > > that
> > > > we're
> > > > adding an argument, should we try to use a struct instead so
> > > > that
> > > > we
> > > > can use it for more that pool behavior?
> > > > 
> > > > 
> > > > I'll be able to find a pointer to that series later today.
> > > 
> > > Found it:
> > > https://lore.kernel.org/dri-devel/[email protected]/
> > > 
> > > Glad to see in that thread it isn't just me permanently slowed
> > > down
> > > by
> > > "false, false" and similar. :)
> > > 
> > > I considered using a struct too and I guess there wasn't too much
> > > of
> > > a
> > > sway that I went with flags. I thought not to overcomplicate with
> > > the
> > > on
> > > stack struct which is mostly not needed for something so low
> > > level,
> > > and
> > > to stick with the old school C visual patterns.
> > > 
> > > Since you only needed a single boolean in your series I suppose
> > > you
> > > could just follow up on my series if you find it acceptable. Or I
> > > can
> > > go
> > > with yours, no problem either.
> > 
> > It seems yours has the most momentum ATM. I can follow up on yours.
> > It
> > would be great if we could perhaps change the naming of
> > "pool_flags" to
> > something more generic.
> 
> Do you have a name in mind? For ttm_device_init pool_flags made sense
> to 
> signify they relate only to the poll.

Well, what I had in mind would have been "flags" or
"device_init_flags".

Really one could change this once flags starts to have other meanings
as well, like the return value change I was proposing.
But the reason I was suggesting to do this now is to avoid yet another
added parameter to ttm_device_init, since obtaining an ack from all TTM
driver maintainers is typically time-consuming if at all possible.

When adding functionality to allocation functions, for example the use
of the ttm_allocation_ctx has proven easier to use since it's easily
extendible typically without changes to drivers.

Thanks,
Thomas



> 
> I need to respin anyway since I forgot to include the new header from
> unit tests.
> 
> Regards,
> 
> Tvrtko
> 

Reply via email to