On Wed, 2025-10-08 at 16:02 +0200, Christian König wrote:
> On 08.10.25 15:50, Tvrtko Ursulin wrote:
> > 
> > On 08/10/2025 13:35, Christian König wrote:
> > > On 08.10.25 13:53, Tvrtko Ursulin wrote:
> > > > Disclaimer:
> > > > Please note that as this series includes a patch which touches a good
> > > > number of drivers, I will only copy everyone on the cover letter and
> > > > the respective patch. The assumption is that people are subscribed to
> > > > dri-devel and so can look at the whole series there. I know someone is
> > > > bound to complain whichever way it is done, either because being
> > > > copied on everything means too much email, or because of the opposite.
> > > > So please be flexible.
> > > > 
> > > > Description:
> > > > 
> > > > All drivers which use the TTM pool allocator end up requesting large
> > > > order allocations when allocating large buffers. Those can be slow due
> > > > to memory pressure and so add latency to buffer creation. But there is
> > > > often also a size limit above which contiguous blocks do not bring any
> > > > performance benefit. This series allows drivers to say when it is okay
> > > > for TTM to try a bit less hard.
> > > > 
> > > > We do this by allowing drivers to specify this cut-off point when
> > > > creating the TTM device and pools. Allocations above this size will
> > > > skip direct reclaim, so worst case latency under memory pressure will
> > > > improve. Background reclaim is still kicked off, and both before and
> > > > after memory pressure all the TTM pool buckets remain in use as they
> > > > are today.
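> > > > 
> > > > Roughly, the idea looks like this (a sketch only, the field and helper
> > > > names below are made up and not necessarily what the patches use):
> > > > orders above the driver-provided threshold drop __GFP_DIRECT_RECLAIM,
> > > > while __GFP_KSWAPD_RECLAIM, which is part of GFP_KERNEL, keeps
> > > > background reclaim running.
> > > > 
> > > > static gfp_t ttm_pool_alloc_gfp(struct ttm_pool *pool, unsigned int order)
> > > > {
> > > > 	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
> > > > 
> > > > 	/* Hypothetical field: highest order where contiguity still helps. */
> > > > 	if (order > pool->max_beneficial_order)
> > > > 		gfp &= ~__GFP_DIRECT_RECLAIM; /* kswapd is still woken */
> > > > 
> > > > 	return gfp;
> > > > }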
> > > > 
> > > > This is especially interesting if someone has configured MAX_PAGE_ORDER
> > > > to a value higher than the default. And even with the default, with
> > > > amdgpu for example, the last patch in the series makes use of the new
> > > > feature by telling TTM that above 2MiB we do not expect performance
> > > > benefits, which makes TTM not try direct reclaim for the top bucket
> > > > (4MiB).
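> > > > 
> > > > For illustration (again the macro name is made up, not literally what
> > > > the amdgpu patch adds): with 4KiB pages a 2MiB cut-off corresponds to
> > > > order 9, so only the order 10 (4MiB) bucket is affected.
> > > > 
> > > > /* Contiguity above this size brings no measurable benefit. */
> > > > #define AMDGPU_MAX_CONTIG_BENEFIT_SIZE	SZ_2M
> > > > 
> > > > /* get_order(SZ_2M) == 9 with 4KiB pages; only higher orders, i.e. the
> > > >  * 4MiB bucket, would then skip direct reclaim. */
> > > > unsigned int threshold_order = get_order(AMDGPU_MAX_CONTIG_BENEFIT_SIZE);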
> > > > 
> > > > End result is TTM drivers become a tiny bit nicer mm citizens and users
> > > > benefit from better worst case buffer creation latencies. As a side
> > > > benefit we get rid of two instances of those often very unreadable
> > > > function signatures built from multiple nameless booleans.
> > > > 
> > > > If this sounds interesting and gets merged, the individual drivers can
> > > > follow up with patches configuring their thresholds.
> > > > 
> > > > v2:
> > > >   * Christian suggested passing in the new data by changing the
> > > >     function signatures.
> > > > 
> > > > v3:
> > > >   * Moved ttm pool helpers into new ttm_pool_internal.h. (Christian)
> > > 
> > > Patch #3 is Acked-by: Christian König <[email protected]>.
> > > 
> > > The rest is Reviewed-by: Christian König
> > > <[email protected]>
> > 
> > Thank you!
> > 
> > So I think now I need acks to merge via drm-misc for all the drivers
> > that have their own trees, which seems to be just xe.
> 
> I think you should ping the XE guys for their opinion, but since
> there shouldn't be any functional change for them you can probably go
> ahead and merge the patches to drm-misc-next if there is no reply in
> time.

I will try to do a review tonight. One thing that comes up, though, is
the change to ttm_device_init() where you add pool_flags. I had another
patch series a number of months ago that instead added a struct with
flags there, to select the return value given when OOM. Now that we're
adding an argument, should we try to use a struct instead, so that we
can use it for more than pool behavior?
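
Roughly what I have in mind, with purely illustrative field names that
are not from either series:

struct ttm_device_init_args {
	/* pool behavior, replacing the old use_dma_alloc / use_dma32 bools */
	u32 pool_flags;
	/* e.g. which errno userspace sees when an allocation hits OOM */
	int oom_errno;
};

int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *funcs,
		    struct device *dev, struct address_space *mapping,
		    struct drm_vma_offset_manager *vma_manager,
		    const struct ttm_device_init_args *args);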


I'll be able to find a pointer to that series later today.

/Thomas
