Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
On 6/22/23 17:19, Boris Brezillon wrote: Hi Danilo, On Thu, 22 Jun 2023 15:58:23 +0200 Danilo Krummrich wrote: Hi Boris, On 6/22/23 15:01, Boris Brezillon wrote: Hi Danilo, On Tue, 20 Jun 2023 14:46:07 +0200 Danilo Krummrich wrote: The only thing I'm worried about is the 'sync mapping requests have to go through the async path and wait for all previous async requests to be processed' problem I mentioned in one of your previous submission, but I'm happy leave that for later. Yes, I'm aware of this limitation. Let me quickly try to explain where this limitation comes from and how I intend to address it. In order to be able to allocate the required page tables for a mapping request and in order to free corresponding page tables once the (async) job finished I need to know the corresponding sequence of operations (drm_gpuva_ops) to fulfill the mapping request. This requires me to update the GPUVA space in the ioctl() rather than in the async stage, because otherwise I would need to wait for previous jobs to finish before being able to submit subsequent jobs to the job queue, since I need an up to date view of the GPUVA space in order to calculate the sequence of operations to fulfill a mapping request. As a consequence all jobs need to be processed in the order they were submitted, including synchronous jobs. @Matt: I think you will have the same limitation with synchronous jobs as your implementation in XE should be similar? In order to address it I want to switch to using callbacks rather than 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the asynchronous stage. This would allow me to 'fit' synchronous jobs between jobs waiting in the async job queue. However, to do this I have to re-work how the page table handling in Nouveau is implemented, since this would require me to be able to manage the page tables without knowing the exact sequence of operations to fulfill a mapping request. Ok, so I think that's more or less what we're trying to do right now in PowerVR. - First, we make sure we reserve enough MMU page tables for a given map operation to succeed no matter the VM state in the VM_BIND job submission path (our VM_BIND ioctl). That means we're always over-provisioning and returning unused memory back when the operation is done if we end up using less memory. - We pre-allocate for the mapple-tree insertions. - Then we map using drm_gpuva_sm_map() and the callbacks we provided in the drm_sched::run_job() path. We guarantee that no memory is allocated in that path thanks to the pre-allocation/reservation we've done at VM_BIND job submission time. The problem I see with this v5 is that: 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(), which, in our case, is called in the async drm_sched::run_job() path, and we don't hold the lock in that path (it's been released just after the job submission). My solution to this, as by now, is to - in the same way we pre-allocate - to just pre-link and pre-unlink. And then fix things up in the cleanup path. However, depending on the driver, this might require you to set a flag in the driver specific structure (embedding struct drm_gpuva) whether the gpuva is actually mapped (as in has active page table entries). Maybe we could also just add such a flag to struct drm_gpuva. But yeah, doesn't sound too nice to be honest... 2/ I'm worried that Liam's plan to only reserve what's actually needed based on the mapple tree state is going to play against us, because the mapple-tree is only modified at job exec time, and we might have several unmaps happening between the moment we created and queued the jobs, and the moment they actually get executed, meaning the mapple-tree reservation might no longer fit the bill. Yes, I'm aware and I explained to Liam in detail why we need the mas_preallocate_worst_case() way of doing it. See this mail: https://lore.kernel.org/nouveau/68cd25de-e767-725e-2e7b-703217230...@redhat.com/T/#ma326e200b1de1e3c9df4e9fcb3bf243061fee8b5 He hasn't answered yet, but I hope we can just get (or actually keep) such a function (hopefully with better naming), since it shouldn't interfere with anything else. My bad, I started reading your reply and got interrupted. Never got back to it, which I should definitely have done before posting my questions. Anyway, glad to hear we're on the same page regarding the mas_preallocate_worst_case() thing. No worries, I should probably also reply to Liams patch introducing the change. I will do that in a minute. For issue #1, it shouldn't be to problematic if we use a regular lock to insert to/remove from the GEM gpuva list. Yes, that's why I had a separate GEM gpuva list lock in the first place. However, this doesn't really work when generating ops rather than using the callback interface. Have a look at drm_gpuva_gem_unmap_ops_create() requested by Matt for XE. This
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
Hi Boris, On 6/22/23 15:01, Boris Brezillon wrote: Hi Danilo, On Tue, 20 Jun 2023 14:46:07 +0200 Danilo Krummrich wrote: The only thing I'm worried about is the 'sync mapping requests have to go through the async path and wait for all previous async requests to be processed' problem I mentioned in one of your previous submission, but I'm happy leave that for later. Yes, I'm aware of this limitation. Let me quickly try to explain where this limitation comes from and how I intend to address it. In order to be able to allocate the required page tables for a mapping request and in order to free corresponding page tables once the (async) job finished I need to know the corresponding sequence of operations (drm_gpuva_ops) to fulfill the mapping request. This requires me to update the GPUVA space in the ioctl() rather than in the async stage, because otherwise I would need to wait for previous jobs to finish before being able to submit subsequent jobs to the job queue, since I need an up to date view of the GPUVA space in order to calculate the sequence of operations to fulfill a mapping request. As a consequence all jobs need to be processed in the order they were submitted, including synchronous jobs. @Matt: I think you will have the same limitation with synchronous jobs as your implementation in XE should be similar? In order to address it I want to switch to using callbacks rather than 'pre-allocated' drm_gpuva_ops and update the GPUVA space within the asynchronous stage. This would allow me to 'fit' synchronous jobs between jobs waiting in the async job queue. However, to do this I have to re-work how the page table handling in Nouveau is implemented, since this would require me to be able to manage the page tables without knowing the exact sequence of operations to fulfill a mapping request. Ok, so I think that's more or less what we're trying to do right now in PowerVR. - First, we make sure we reserve enough MMU page tables for a given map operation to succeed no matter the VM state in the VM_BIND job submission path (our VM_BIND ioctl). That means we're always over-provisioning and returning unused memory back when the operation is done if we end up using less memory. - We pre-allocate for the mapple-tree insertions. - Then we map using drm_gpuva_sm_map() and the callbacks we provided in the drm_sched::run_job() path. We guarantee that no memory is allocated in that path thanks to the pre-allocation/reservation we've done at VM_BIND job submission time. The problem I see with this v5 is that: 1/ We now have a dma_resv_lock_held() in drm_gpuva_{link,unlink}(), which, in our case, is called in the async drm_sched::run_job() path, and we don't hold the lock in that path (it's been released just after the job submission). My solution to this, as by now, is to - in the same way we pre-allocate - to just pre-link and pre-unlink. And then fix things up in the cleanup path. However, depending on the driver, this might require you to set a flag in the driver specific structure (embedding struct drm_gpuva) whether the gpuva is actually mapped (as in has active page table entries). Maybe we could also just add such a flag to struct drm_gpuva. But yeah, doesn't sound too nice to be honest... 2/ I'm worried that Liam's plan to only reserve what's actually needed based on the mapple tree state is going to play against us, because the mapple-tree is only modified at job exec time, and we might have several unmaps happening between the moment we created and queued the jobs, and the moment they actually get executed, meaning the mapple-tree reservation might no longer fit the bill. Yes, I'm aware and I explained to Liam in detail why we need the mas_preallocate_worst_case() way of doing it. See this mail: https://lore.kernel.org/nouveau/68cd25de-e767-725e-2e7b-703217230...@redhat.com/T/#ma326e200b1de1e3c9df4e9fcb3bf243061fee8b5 He hasn't answered yet, but I hope we can just get (or actually keep) such a function (hopefully with better naming), since it shouldn't interfere with anything else. For issue #1, it shouldn't be to problematic if we use a regular lock to insert to/remove from the GEM gpuva list. Yes, that's why I had a separate GEM gpuva list lock in the first place. However, this doesn't really work when generating ops rather than using the callback interface. Have a look at drm_gpuva_gem_unmap_ops_create() requested by Matt for XE. This function generates drm_gpuva_ops to unmap all mappings of a given GEM. In order to do that the function must iterate the GEM's gpuva list and allocate operations for each mapping. As a consequence the gpuva list lock wouldn't be allowed to be taken in the fence signalling path (run_job()) any longer. Hence, we can just protect the list with the GEM's dma-resv lock. However, I can understand that it might be inconvenient for the callback interface and admittedly my
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
On Tue, Jun 20, 2023 at 10:13 AM Dave Airlie wrote: > > On Tue, 20 Jun 2023 at 17:06, Oded Gabbay wrote: > > > > On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie wrote: > > > > > > Since this is feature is nouveau only currently and doesn't disturb > > > the current nouveau code paths, I'd like to try and get this work in > > > tree so other drivers can work from it. > > > > > > If there are any major objections to this, I'm happy to pull it back > > > out again, but I'd like to get some acks/rb in the next couple of days > > > in order to land some of it. > > > > > > Dave. > > > > > > > > > > > > > > forgot to add your email address to the patch series - sorry about that. > > > > > > > > This series (v5) contains the Documentation changes you requested. > > > > > > > > - Danilo > > > > > > > > On 6/20/23 02:42, Danilo Krummrich wrote: > > > > > This patch series provides a new UAPI for the Nouveau driver in order > > > > > to > > > > > support Vulkan features, such as sparse bindings and sparse residency. > > > > > > > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core > > > > > feature to > > > > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > > > > > > > The DRM GPUVA manager is indented to help drivers implement > > > > > userspace-manageable > > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve > > > > > this goal it > > > > > serves the following purposes in this context. > > > > > > > > > > 1) Provide infrastructure to track GPU VA allocations and > > > > > mappings, > > > > > making use of the maple_tree. > > > > > > > > > > 2) Generically connect GPU VA mappings to their backing buffers, > > > > > in > > > > > particular DRM GEM objects. > > Will this manager be able to connect GPU VA mappings to host memory > > allocations (aka user pointers) ? > > > > I only skimmed over the uapi definitions, but from that quick glance I > > saw you can only pass a (gem) handle to the vm bind uapi. > > > > I think it is an important feature because you don't want to have two > > GPU VA managers running in your driver (if that's even possible). > > Maybe we should at least try to make sure the uapi is/will be > > compatible with such an extension. > > > > I think that would have to be a new uAPI entry point anyways, since > managing user ptrs is extra, but the uAPI is nouveau specific and > nouveau has no hostptr support as of now. > > The gpuva manager is kernel internal, I think adding host ptr tracking > is useful, but I don't think it's a blocker right now. > > One of the reasons I'd like to get this in the tree is to add things > like that instead of overloading this initial patchset with feature > creep. > > Dave. ok, that makes sense. Thanks, Oded
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie wrote: > > Since this is feature is nouveau only currently and doesn't disturb > the current nouveau code paths, I'd like to try and get this work in > tree so other drivers can work from it. > > If there are any major objections to this, I'm happy to pull it back > out again, but I'd like to get some acks/rb in the next couple of days > in order to land some of it. > > Dave. > > > > > > forgot to add your email address to the patch series - sorry about that. > > > > This series (v5) contains the Documentation changes you requested. > > > > - Danilo > > > > On 6/20/23 02:42, Danilo Krummrich wrote: > > > This patch series provides a new UAPI for the Nouveau driver in order to > > > support Vulkan features, such as sparse bindings and sparse residency. > > > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core > > > feature to > > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > > > The DRM GPUVA manager is indented to help drivers implement > > > userspace-manageable > > > GPU VA spaces in reference to the Vulkan API. In order to achieve this > > > goal it > > > serves the following purposes in this context. > > > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > > making use of the maple_tree. > > > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > > particular DRM GEM objects. Will this manager be able to connect GPU VA mappings to host memory allocations (aka user pointers) ? I only skimmed over the uapi definitions, but from that quick glance I saw you can only pass a (gem) handle to the vm bind uapi. I think it is an important feature because you don't want to have two GPU VA managers running in your driver (if that's even possible). Maybe we should at least try to make sure the uapi is/will be compatible with such an extension. Oded Oded > > > > > > 3) Provide a common implementation to perform more complex mapping > > > operations on the GPU VA space. In particular splitting and > > > merging > > > of GPU VA mappings, e.g. for intersecting mapping requests or > > > partial > > > unmap requests. > > > > > > The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself > > > providing the following new interfaces. > > > > > > 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT > > > ioctl > > > for UMDs to specify the portion of VA space managed by the kernel > > > and > > > userspace, respectively. > > > > > > 2) Allocate and free a VA space region as well as bind and unbind > > > memory > > > to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. > > > > > > 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. > > > > > > Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of > > > the DRM > > > scheduler to queue jobs and support asynchronous processing with DRM > > > syncobjs > > > as synchronization mechanism. > > > > > > By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, > > > DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. > > > > > > The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution > > > context > > > for GEM buffers) by Christian König. Since the patch implementing > > > drm_exec was > > > not yet merged into drm-next it is part of this series, as well as a > > > small fix > > > for this patch, which was found while testing this series. > > > > > > This patch series is also available at [1]. > > > > > > There is a Mesa NVK merge request by Dave Airlie [2] implementing the > > > corresponding userspace parts for this series. > > > > > > The Vulkan CTS test suite passes the sparse binding and sparse residency > > > test > > > cases for the new UAPI together with Dave's Mesa work. > > > > > > There are also some test cases in the igt-gpu-tools project [3] for the > > > new UAPI > > > and hence the DRM GPU VA manager. However, most of them are testing the > > > DRM GPU > > > VA manager's logic through Nouveau's new UAPI and should be considered > > > just as > > > helper for implementation. > > > > > > However, I absolutely intend to change those test cases to proper kunit > > > test > > > cases for the DRM GPUVA manager, once and if we agree on it's usefulness > > > and > > > design. > > > > > > [1] > > > https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / > > > https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 > > > [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ > > > [3] > > > https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind > > > > > > Changes in V2: > > > == > > >Nouveau: > > > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in > > > fence > > >signalling critical
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
Hi Boris, On 6/20/23 11:25, Boris Brezillon wrote: Hi Danilo, On Tue, 20 Jun 2023 02:42:03 +0200 Danilo Krummrich wrote: This patch series provides a new UAPI for the Nouveau driver in order to support Vulkan features, such as sparse bindings and sparse residency. Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to keep track of GPU virtual address (VA) mappings in a more generic way. The DRM GPUVA manager is indented to help drivers implement userspace-manageable GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it serves the following purposes in this context. 1) Provide infrastructure to track GPU VA allocations and mappings, making use of the maple_tree. 2) Generically connect GPU VA mappings to their backing buffers, in particular DRM GEM objects. 3) Provide a common implementation to perform more complex mapping operations on the GPU VA space. In particular splitting and merging of GPU VA mappings, e.g. for intersecting mapping requests or partial unmap requests. The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself providing the following new interfaces. 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl for UMDs to specify the portion of VA space managed by the kernel and userspace, respectively. 2) Allocate and free a VA space region as well as bind and unbind memory to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM scheduler to queue jobs and support asynchronous processing with DRM syncobjs as synchronization mechanism. By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context for GEM buffers) by Christian König. Since the patch implementing drm_exec was not yet merged into drm-next it is part of this series, as well as a small fix for this patch, which was found while testing this series. This patch series is also available at [1]. There is a Mesa NVK merge request by Dave Airlie [2] implementing the corresponding userspace parts for this series. The Vulkan CTS test suite passes the sparse binding and sparse residency test cases for the new UAPI together with Dave's Mesa work. There are also some test cases in the igt-gpu-tools project [3] for the new UAPI and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU VA manager's logic through Nouveau's new UAPI and should be considered just as helper for implementation. However, I absolutely intend to change those test cases to proper kunit test cases for the DRM GPUVA manager, once and if we agree on it's usefulness and design. [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind Changes in V2: == Nouveau: - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence signalling critical sections. Updates to the VA space are split up in three separate stages, where only the 2. stage executes in a fence signalling critical section: 1. update the VA space, allocate new structures and page tables 2. (un-)map the requested memory bindings 3. free structures and page tables - Separated generic job scheduler code from specific job implementations. - Separated the EXEC and VM_BIND implementation of the UAPI. - Reworked the locking parts of the nvkm/vmm RAW interface, such that (un-)map operations can be executed in fence signalling critical sections. GPUVA Manager: - made drm_gpuva_regions optional for users of the GPUVA manager - allow NULL GEMs for drm_gpuva entries - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region entries - provide callbacks for users to allocate custom drm_gpuva_op structures to allow inheritance - added user bits to drm_gpuva_flags - added a prefetch operation type in order to support generating prefetch operations in the same way other operations generated - hand the responsibility for mutual exclusion for a GEM's drm_gpuva list to the user; simplified corresponding (un-)link functions Maple Tree: - I added two maple tree patches to the series, one to support custom tree walk macros and one to hand the locking responsibility to the user of the GPUVA manager without pre-defined lockdep checks. Changes in V3: ==
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
On Tue, 20 Jun 2023 at 17:06, Oded Gabbay wrote: > > On Tue, Jun 20, 2023 at 7:05 AM Dave Airlie wrote: > > > > Since this is feature is nouveau only currently and doesn't disturb > > the current nouveau code paths, I'd like to try and get this work in > > tree so other drivers can work from it. > > > > If there are any major objections to this, I'm happy to pull it back > > out again, but I'd like to get some acks/rb in the next couple of days > > in order to land some of it. > > > > Dave. > > > > > > > > > > forgot to add your email address to the patch series - sorry about that. > > > > > > This series (v5) contains the Documentation changes you requested. > > > > > > - Danilo > > > > > > On 6/20/23 02:42, Danilo Krummrich wrote: > > > > This patch series provides a new UAPI for the Nouveau driver in order to > > > > support Vulkan features, such as sparse bindings and sparse residency. > > > > > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core > > > > feature to > > > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > > > > > The DRM GPUVA manager is indented to help drivers implement > > > > userspace-manageable > > > > GPU VA spaces in reference to the Vulkan API. In order to achieve this > > > > goal it > > > > serves the following purposes in this context. > > > > > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > > > making use of the maple_tree. > > > > > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > > > particular DRM GEM objects. > Will this manager be able to connect GPU VA mappings to host memory > allocations (aka user pointers) ? > > I only skimmed over the uapi definitions, but from that quick glance I > saw you can only pass a (gem) handle to the vm bind uapi. > > I think it is an important feature because you don't want to have two > GPU VA managers running in your driver (if that's even possible). > Maybe we should at least try to make sure the uapi is/will be > compatible with such an extension. > I think that would have to be a new uAPI entry point anyways, since managing user ptrs is extra, but the uAPI is nouveau specific and nouveau has no hostptr support as of now. The gpuva manager is kernel internal, I think adding host ptr tracking is useful, but I don't think it's a blocker right now. One of the reasons I'd like to get this in the tree is to add things like that instead of overloading this initial patchset with feature creep. Dave.
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
Since this is feature is nouveau only currently and doesn't disturb the current nouveau code paths, I'd like to try and get this work in tree so other drivers can work from it. If there are any major objections to this, I'm happy to pull it back out again, but I'd like to get some acks/rb in the next couple of days in order to land some of it. Dave. > > forgot to add your email address to the patch series - sorry about that. > > This series (v5) contains the Documentation changes you requested. > > - Danilo > > On 6/20/23 02:42, Danilo Krummrich wrote: > > This patch series provides a new UAPI for the Nouveau driver in order to > > support Vulkan features, such as sparse bindings and sparse residency. > > > > Furthermore, with the DRM GPUVA manager it provides a new DRM core feature > > to > > keep track of GPU virtual address (VA) mappings in a more generic way. > > > > The DRM GPUVA manager is indented to help drivers implement > > userspace-manageable > > GPU VA spaces in reference to the Vulkan API. In order to achieve this goal > > it > > serves the following purposes in this context. > > > > 1) Provide infrastructure to track GPU VA allocations and mappings, > > making use of the maple_tree. > > > > 2) Generically connect GPU VA mappings to their backing buffers, in > > particular DRM GEM objects. > > > > 3) Provide a common implementation to perform more complex mapping > > operations on the GPU VA space. In particular splitting and merging > > of GPU VA mappings, e.g. for intersecting mapping requests or > > partial > > unmap requests. > > > > The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself > > providing the following new interfaces. > > > > 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT > > ioctl > > for UMDs to specify the portion of VA space managed by the kernel > > and > > userspace, respectively. > > > > 2) Allocate and free a VA space region as well as bind and unbind > > memory > > to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. > > > > 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. > > > > Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the > > DRM > > scheduler to queue jobs and support asynchronous processing with DRM > > syncobjs > > as synchronization mechanism. > > > > By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, > > DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. > > > > The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution > > context > > for GEM buffers) by Christian König. Since the patch implementing drm_exec > > was > > not yet merged into drm-next it is part of this series, as well as a small > > fix > > for this patch, which was found while testing this series. > > > > This patch series is also available at [1]. > > > > There is a Mesa NVK merge request by Dave Airlie [2] implementing the > > corresponding userspace parts for this series. > > > > The Vulkan CTS test suite passes the sparse binding and sparse residency > > test > > cases for the new UAPI together with Dave's Mesa work. > > > > There are also some test cases in the igt-gpu-tools project [3] for the new > > UAPI > > and hence the DRM GPU VA manager. However, most of them are testing the DRM > > GPU > > VA manager's logic through Nouveau's new UAPI and should be considered just > > as > > helper for implementation. > > > > However, I absolutely intend to change those test cases to proper kunit test > > cases for the DRM GPUVA manager, once and if we agree on it's usefulness and > > design. > > > > [1] > > https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / > > https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 > > [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ > > [3] > > https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind > > > > Changes in V2: > > == > >Nouveau: > > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in > > fence > >signalling critical sections. Updates to the VA space are split up > > in three > >separate stages, where only the 2. stage executes in a fence > > signalling > >critical section: > > > > 1. update the VA space, allocate new structures and page tables > > 2. (un-)map the requested memory bindings > > 3. free structures and page tables > > > > - Separated generic job scheduler code from specific job > > implementations. > > - Separated the EXEC and VM_BIND implementation of the UAPI. > > - Reworked the locking parts of the nvkm/vmm RAW interface, such that > >(un-)map operations can be executed in fence signalling critical > > sections. > > > >GPUVA Manager: > > - made drm_gpuva_regions optional for
Re: [Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
Hi Donald, forgot to add your email address to the patch series - sorry about that. This series (v5) contains the Documentation changes you requested. - Danilo On 6/20/23 02:42, Danilo Krummrich wrote: This patch series provides a new UAPI for the Nouveau driver in order to support Vulkan features, such as sparse bindings and sparse residency. Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to keep track of GPU virtual address (VA) mappings in a more generic way. The DRM GPUVA manager is indented to help drivers implement userspace-manageable GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it serves the following purposes in this context. 1) Provide infrastructure to track GPU VA allocations and mappings, making use of the maple_tree. 2) Generically connect GPU VA mappings to their backing buffers, in particular DRM GEM objects. 3) Provide a common implementation to perform more complex mapping operations on the GPU VA space. In particular splitting and merging of GPU VA mappings, e.g. for intersecting mapping requests or partial unmap requests. The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself providing the following new interfaces. 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl for UMDs to specify the portion of VA space managed by the kernel and userspace, respectively. 2) Allocate and free a VA space region as well as bind and unbind memory to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM scheduler to queue jobs and support asynchronous processing with DRM syncobjs as synchronization mechanism. By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context for GEM buffers) by Christian König. Since the patch implementing drm_exec was not yet merged into drm-next it is part of this series, as well as a small fix for this patch, which was found while testing this series. This patch series is also available at [1]. There is a Mesa NVK merge request by Dave Airlie [2] implementing the corresponding userspace parts for this series. The Vulkan CTS test suite passes the sparse binding and sparse residency test cases for the new UAPI together with Dave's Mesa work. There are also some test cases in the igt-gpu-tools project [3] for the new UAPI and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU VA manager's logic through Nouveau's new UAPI and should be considered just as helper for implementation. However, I absolutely intend to change those test cases to proper kunit test cases for the DRM GPUVA manager, once and if we agree on it's usefulness and design. [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind Changes in V2: == Nouveau: - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence signalling critical sections. Updates to the VA space are split up in three separate stages, where only the 2. stage executes in a fence signalling critical section: 1. update the VA space, allocate new structures and page tables 2. (un-)map the requested memory bindings 3. free structures and page tables - Separated generic job scheduler code from specific job implementations. - Separated the EXEC and VM_BIND implementation of the UAPI. - Reworked the locking parts of the nvkm/vmm RAW interface, such that (un-)map operations can be executed in fence signalling critical sections. GPUVA Manager: - made drm_gpuva_regions optional for users of the GPUVA manager - allow NULL GEMs for drm_gpuva entries - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region entries - provide callbacks for users to allocate custom drm_gpuva_op structures to allow inheritance - added user bits to drm_gpuva_flags - added a prefetch operation type in order to support generating prefetch operations in the same way other operations generated - hand the responsibility for mutual exclusion for a GEM's drm_gpuva list to the user; simplified corresponding (un-)link functions Maple Tree: - I added two maple tree patches to the series, one to support custom tree walk macros and one to hand the locking responsibility to the user of the GPUVA
[Nouveau] [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
This patch series provides a new UAPI for the Nouveau driver in order to support Vulkan features, such as sparse bindings and sparse residency. Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to keep track of GPU virtual address (VA) mappings in a more generic way. The DRM GPUVA manager is indented to help drivers implement userspace-manageable GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it serves the following purposes in this context. 1) Provide infrastructure to track GPU VA allocations and mappings, making use of the maple_tree. 2) Generically connect GPU VA mappings to their backing buffers, in particular DRM GEM objects. 3) Provide a common implementation to perform more complex mapping operations on the GPU VA space. In particular splitting and merging of GPU VA mappings, e.g. for intersecting mapping requests or partial unmap requests. The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager, itself providing the following new interfaces. 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl for UMDs to specify the portion of VA space managed by the kernel and userspace, respectively. 2) Allocate and free a VA space region as well as bind and unbind memory to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl. 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use of the DRM scheduler to queue jobs and support asynchronous processing with DRM syncobjs as synchronization mechanism. By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only. The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context for GEM buffers) by Christian König. Since the patch implementing drm_exec was not yet merged into drm-next it is part of this series, as well as a small fix for this patch, which was found while testing this series. This patch series is also available at [1]. There is a Mesa NVK merge request by Dave Airlie [2] implementing the corresponding userspace parts for this series. The Vulkan CTS test suite passes the sparse binding and sparse residency test cases for the new UAPI together with Dave's Mesa work. There are also some test cases in the igt-gpu-tools project [3] for the new UAPI and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU VA manager's logic through Nouveau's new UAPI and should be considered just as helper for implementation. However, I absolutely intend to change those test cases to proper kunit test cases for the DRM GPUVA manager, once and if we agree on it's usefulness and design. [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next / https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1 [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/ [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind Changes in V2: == Nouveau: - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence signalling critical sections. Updates to the VA space are split up in three separate stages, where only the 2. stage executes in a fence signalling critical section: 1. update the VA space, allocate new structures and page tables 2. (un-)map the requested memory bindings 3. free structures and page tables - Separated generic job scheduler code from specific job implementations. - Separated the EXEC and VM_BIND implementation of the UAPI. - Reworked the locking parts of the nvkm/vmm RAW interface, such that (un-)map operations can be executed in fence signalling critical sections. GPUVA Manager: - made drm_gpuva_regions optional for users of the GPUVA manager - allow NULL GEMs for drm_gpuva entries - swichted from drm_mm to maple_tree for track drm_gpuva / drm_gpuva_region entries - provide callbacks for users to allocate custom drm_gpuva_op structures to allow inheritance - added user bits to drm_gpuva_flags - added a prefetch operation type in order to support generating prefetch operations in the same way other operations generated - hand the responsibility for mutual exclusion for a GEM's drm_gpuva list to the user; simplified corresponding (un-)link functions Maple Tree: - I added two maple tree patches to the series, one to support custom tree walk macros and one to hand the locking responsibility to the user of the GPUVA manager without pre-defined lockdep checks. Changes in V3: == Nouveau: - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page table cleanup) within a workqueue rather than the job_free() callback of the