On 6/22/20 1:50 PM, Daniel Vetter wrote:
On Mon, Jun 22, 2020 at 7:45 PM Christian König
<christian.koe...@amd.com> wrote:
On 22.06.20 at 16:32, Andrey Grodzovsky wrote:
On 6/22/20 9:18 AM, Christian König wrote:
On 21.06.20 at 08:03, Andrey Grodzovsky wrote:
Will be used to reroute page faults of CPU-mapped BOs once the
device is removed.

Signed-off-by: Andrey Grodzovsky <andrey.grodzov...@amd.com>
---
   drivers/gpu/drm/drm_file.c  |  8 ++++++++
   drivers/gpu/drm/drm_prime.c | 10 ++++++++++
   include/drm/drm_file.h      |  2 ++
   include/drm/drm_gem.h       |  2 ++
   4 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index c4c704e..67c0770 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -188,6 +188,12 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
 		goto out_prime_destroy;
 	}
 
+	file->dummy_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!file->dummy_page) {
+		ret = -ENOMEM;
+		goto out_prime_destroy;
+	}
+
 	return file;
 
 out_prime_destroy:
@@ -284,6 +290,8 @@ void drm_file_free(struct drm_file *file)
 	if (dev->driver->postclose)
 		dev->driver->postclose(dev, file);
 
+	__free_page(file->dummy_page);
+
 	drm_prime_destroy_file_private(&file->prime);
 
 	WARN_ON(!list_empty(&file->event_list));
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 1de2cde..c482e9c 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -335,6 +335,13 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 
 	ret = drm_prime_add_buf_handle(&file_priv->prime,
 			dma_buf, *handle);
+
+	if (!ret) {
+		obj->dummy_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!obj->dummy_page)
+			ret = -ENOMEM;
+	}
+
While the per-file case still looks acceptable, this is a clear NAK
since it will massively increase the memory needed for a
prime-exported object.

I think this is quite overkill in the first place; for the
hot-unplug case we can just use the global dummy page as well.

Christian.

A global dummy page is good for read access, but what do you do on
write access? My first approach was indeed to map the global dummy
page as read-only at first and clear VM_SHARED from vma->vm_flags,
assuming this would trigger the copy-on-write (COW) flow in core mm
(https://elixir.bootlin.com/linux/v5.7-rc7/source/mm/memory.c#L3977)
on the next page fault to the same address, triggered by a write access.
But then I realized a new COW page will be allocated for each such mapping,
and this is much more wasteful than having a dedicated page per GEM
object.
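
[Editor's note: for illustration only, a minimal sketch of the COW idea
described above; it is not part of the posted patch. The handler name
drm_dummy_fault and the variable global_dummy_page are hypothetical, and
the sketch assumes a page-backed (non-VM_PFNMAP) VMA that is already
private; whether VM_SHARED can be cleared on an established VMA is
exactly what Daniel questions below.

#include <linux/mm.h>

static struct page *global_dummy_page;	/* zero-filled, allocated at init */

/*
 * Fault handler installed after the device is unplugged.  Returning the
 * dummy page in vmf->page with a reference taken lets core mm map it;
 * because the VMA is private (!VM_SHARED), a later write access goes
 * through the COW path and gets a fresh anonymous page instead of
 * scribbling on the shared dummy page.
 */
static vm_fault_t drm_dummy_fault(struct vm_fault *vmf)
{
	get_page(global_dummy_page);
	vmf->page = global_dummy_page;
	return 0;
}

That fresh anonymous page per mapping is the per-mapping waste Andrey
points out above.]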
Yeah, but this is only for a very, very small corner case. What we need
to prevent is increasing the memory usage too much during normal operation.

Using memory during the unplug is completely unproblematic because we
just released quite a bunch of it by releasing all those system memory
buffers.

And I'm pretty sure that COWed pages are correctly accounted towards the
used memory of a process.

So I think if that approach works as intended, and the COW pages are
released again on unmapping, it would be the perfect solution to the problem.

Daniel what do you think?
If COW works, sure, sounds reasonable. And only if we can make sure we
managed to drop all the system allocations (otherwise suddenly 2x
memory usage, worst case). But I have no idea whether we can
retroactively shoehorn that into an established vma; you might have fun
stuff like a mkwrite handler there (which I thought is the COW handler
thing, but really no idea).


Can you clarify your concern here? I see no DRM driver besides vmwgfx
that installs a handler for vm_operations_struct.page_mkwrite, and in any
case, since I will be turning off the VM_SHARED flag for the faulting
vm_area_struct, making it a COW mapping, page_mkwrite will not be called
on any subsequent vm fault.

Andrey



If we need to massively change stuff, then I think an rw dummy page,
allocated on the first fault after hotunplug (maybe just make it one per
object, that's simplest), seems like the much safer option. Much less
code that can go wrong.
-Daniel

Regards,
Christian.

We can indeed optimize by allocating this dummy page on the first page
fault after device disconnect instead of on GEM object creation.

Andrey
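
[Editor's note: a minimal sketch of the lazy per-object allocation that
Daniel and Andrey describe above, not part of the posted patch. The
handler name is hypothetical, it relies on the dummy_page field the patch
adds to struct drm_gem_object, and it ignores the race between two
threads faulting concurrently, which a real implementation would need a
lock or cmpxchg to close.

#include <linux/mm.h>
#include <drm/drm_gem.h>

static vm_fault_t drm_gem_dummy_fault(struct vm_fault *vmf)
{
	/* DRM sets vm_private_data to the GEM object in drm_gem_mmap_obj() */
	struct drm_gem_object *obj = vmf->vma->vm_private_data;

	/* First fault after unplug: allocate the per-object dummy page.
	 * Unplugged objects that are never touched again cost nothing. */
	if (!obj->dummy_page) {
		obj->dummy_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
		if (!obj->dummy_page)
			return VM_FAULT_OOM;
	}

	get_page(obj->dummy_page);
	vmf->page = obj->dummy_page;
	return 0;
}

The object's free path would then call __free_page() only if dummy_page
was ever allocated.]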


 	mutex_unlock(&file_priv->prime.lock);
 	if (ret)
 		goto fail;
@@ -1006,6 +1013,9 @@ void drm_prime_gem_destroy(struct drm_gem_object *obj, struct sg_table *sg)
 		dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);
 	dma_buf = attach->dmabuf;
 	dma_buf_detach(attach->dmabuf, attach);
+
+	__free_page(obj->dummy_page);
+
 	/* remove the reference */
 	dma_buf_put(dma_buf);
 }
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 19df802..349a658 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -335,6 +335,8 @@ struct drm_file {
 	 */
 	struct drm_prime_file_private prime;
 
+	struct page *dummy_page;
+
 	/* private: */
 #if IS_ENABLED(CONFIG_DRM_LEGACY)
 	unsigned long lock_count; /* DRI1 legacy lock count */
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 0b37506..47460d1 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -310,6 +310,8 @@ struct drm_gem_object {
 	 *
 	 */
 	const struct drm_gem_object_funcs *funcs;
+
+	struct page *dummy_page;
 };
 
 /**
