There is an issue with tegra-drm where some buffers are created and then freed, but the underlying DMA buffer is never released. Once the leaked allocations fill up CMA, display controller memory allocations start failing.
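For anyone who wants to tally such a leak from kernel debug output, here is a minimal sketch of how allocation and free events can be paired up. The log line format (`alloc addr=... size=...` and `free addr=...`) and the function names are assumptions made up for illustration. They are not the format of the logs attached to the issue, so the regular expressions would need adjusting to whatever the debug patch actually prints.

```python
import re
from typing import Dict, Iterable

# Hypothetical log formats, assumed for illustration only:
#   "... alloc addr=0x<hex> size=<bytes>"
#   "... free addr=0x<hex>"
ALLOC_RE = re.compile(r"alloc addr=(0x[0-9a-fA-F]+) size=(\d+)")
FREE_RE = re.compile(r"free addr=(0x[0-9a-fA-F]+)")


def outstanding_allocations(lines: Iterable[str]) -> Dict[str, int]:
    """Return {address: size} for allocations that were never freed."""
    live: Dict[str, int] = {}
    for line in lines:
        match = ALLOC_RE.search(line)
        if match:
            # Record (or overwrite) the allocation at this address.
            live[match.group(1).lower()] = int(match.group(2))
            continue
        match = FREE_RE.search(line)
        if match:
            # Drop the allocation; ignore frees for unknown addresses.
            live.pop(match.group(1).lower(), None)
    return live


def leaked_bytes(lines: Iterable[str]) -> int:
    """Sum the sizes of all allocations still outstanding."""
    return sum(outstanding_allocations(lines).values())
```

Feeding the captured log lines to `leaked_bytes()` gives the total that is still held, which can then be compared against the configured CMA size.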
I created an issue on the freedesktop issue tracker [0] with a patch containing the debug logs I added, along with a log from Android that contains their output. CMA is set to 512MB, and when allocations start to fail, the unfreed allocations add up to just shy of 500MB, at which point it is reasonable to expect that an 8MB contiguous region is no longer available. The log was generated on a Jetson TX2 NX, but I have seen this leak on other archs as well. It also does not appear to be limited to SoCs with nvdisplay.

This does not appear to be a userspace issue. The graphics allocator works as expected for other SoC vendors, and as the logs show, the delete dumb buffer ioctl is called but is not always followed by the DMA buffer getting freed. I have also observed this issue with a gralloc that uses the tegra gem create ioctl and related calls, so it is not unique to dumb buffers. That was just the last log I had when I decided to post the issue to lkml.

What I primarily intend to ask here is how to debug this issue further. I'm not finding any direct path between the delete dumb ioctl handling and the gem release or tegra bo free. Can someone point me to the pieces in the middle that I'm missing, where the logic decides whether a buffer should be freed?

Aaron

[0] https://gitlab.freedesktop.org/drm/tegra/-/work_items/9
