On Mon, 14 Apr 2025 09:03:55 -0400 Alyssa Rosenzweig <aly...@rosenzweig.io> wrote:
> > Actually, CSF stands in the way of re-allocating memory to other
> > contexts, because once we've allocated memory to a tiler heap, the FW
> > manages this pool of chunks and recycles them. Mesa can intercept
> > the "returned chunks" and collect those chunks instead of re-assigning
> > them to the tiler heap through a CS instruction (which goes through
> > the FW internally), but that involves extra collaboration between the
> > UMD, KMD and FW which we don't have at the moment. Not saying never,
> > but I'd rather fix things gradually (first the blocking alloc in the
> > fence-signalling path, then the optimization to share the extra mem
> > reservation cost among contexts by returning the chunks to the global
> > kernel pool rather than directly to the heap).
> >
> > This approach should work fine with JM GPUs where the tiler heap is
> > entirely managed by the KMD, though.
>
> I really think CSF should be relying on the simple heuristics with
> incremental-rendering, unless you can prove that's actually a
> performance issue in practice. (On Imagination/Apple parts, it almost
> never is and we rely entirely on this approach. It's ok - it really is.
> For simple 2D workloads, the initial heap allocation is fine. For 3D
> scenes, we need very few frames to get the right size. This doesn't
> cause stutters in practice.)

Yep, I agree, hence the "let's try the simple thing first and let's see
if we actually need the more complex stuff later". My hope is that we'll
never need it, but I hate to make definitive statements, because it
usually bites me back when I do :P.

> For JM .. yes, this discussion remains relevant of course.

I'm still trying to see if we can emulate/have incremental rendering on
JM hardware, so it really becomes a Lima-only issue. According to Erik,
predicting how much heap is needed is much easier on Utgard (no indirect
draws, a simpler binning hierarchy, and other details he mentioned which
I forgot).