On 29.11.2017 17:52, Li, Samuel wrote:
To explain the operations a little bit: the tests include repeated testing of
the following sequences,

const int sequences[4][4] = {
	{ STEP_MMAP, STEP_FAULT, STEP_FLIP, STEP_DRAW },
	{ STEP_MMAP, STEP_FLIP, STEP_DRAW, STEP_SKIP },
	{ STEP_MMAP, STEP_DRAW, STEP_FLIP, STEP_SKIP },
	{ STEP_FLIP, STEP_MMAP, STEP_DRAW, STEP_SKIP },
};
Here STEP_MMAP includes prime_mmap() and begin_cpu_access(). STEP_DRAW includes
repeated CPU reads/writes.
It looks to me like the dma_buf has to be pinned to GTT here, to prevent it
from being pinned back to VRAM by STEP_FLIP before drawing.
If we pin the BO to GTT while it is mapped, STEP_FLIP would fail when it
needs the BO in VRAM, and that is not something we can do.
What we can do is migrate the BO into GTT when we mmap it, and migrate it
back to VRAM on demand when it needs to be pinned there.
When scanning out from GTT is implemented the flip will then
automatically pin it to GTT instead of VRAM.
I sent out a WIP branch for this a few weeks ago, but haven't worked on
it in a while. BTW, it is only supported on Carrizo and Raven.
Can you show me the branch? Looks like we need to get it finished.
The amdgpu part was already done and is also already committed IIRC
(that was only a bunch of bugs fixes anyway).
What's missing is the DC part; they need to do some extra bandwidth
calculation when scanning out from GTT is enabled, and that was
problematic to always activate.
You need to ping Harry about the status there.
Christian.
Sam
-----Original Message-----
From: Koenig, Christian
Sent: Wednesday, November 29, 2017 10:01 AM
To: Li, Samuel <[email protected]>; [email protected]
Subject: Re: [PATCH 1/1] drm/amdgpu: Pin to gtt before cpu accesses dma buf.
What is the concern for scanning out from GTT on APUs?
It's simply not implemented yet. You need quite a bit of additional setup in
DC for this to work.
I sent out a WIP branch for this a few weeks ago, but haven't worked on
it in a while. BTW, it is only supported on Carrizo and Raven.
But even then pinning a BO to GTT for this would still be a no-go. We could
just avoid the copy on scanout when the BO is already inside GTT because of
the CPU access.
In general we should rather work on this as Michel described and avoid
creating the BO in VRAM in the first place if possible.
Regards,
Christian.
On 29.11.2017 15:56, Li, Samuel wrote:
One major purpose of the ChromeOS mmap_test is to avoid buffer copying.
What is the concern for scanning out from GTT on APUs?
Sam
-----Original Message-----
From: Koenig, Christian
Sent: Wednesday, November 29, 2017 9:54 AM
To: Li, Samuel <[email protected]>; [email protected]
Subject: Re: [PATCH 1/1] drm/amdgpu: Pin to gtt before cpu accesses dma buf.
And that's exactly the reason why it is a no-go.
Scanning out from GTT isn't supported at the moment.
Christian.
On 29.11.2017 15:48, Li, Samuel wrote:
Actually it needs to be pinned here, since otherwise page flip will pin it
into VRAM.
Sam
-----Original Message-----
From: Christian König [mailto:[email protected]]
Sent: Wednesday, November 29, 2017 4:39 AM
To: Li, Samuel <[email protected]>; [email protected]
Subject: Re: [PATCH 1/1] drm/amdgpu: Pin to gtt before cpu accesses dma buf.
On 29.11.2017 01:20, Samuel Li wrote:
To improve CPU read performance. This is currently implemented for APUs.
And once more a NAK for this approach.
What we could do is migrate the BO to GTT during mmap, but pinning
it is out of the question.
Regards,
Christian.
Change-Id: I300a7daed8f2b0ba6be71a43196a6b8617ddc62e
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c       |  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c           |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c         | 108 ++++++++++++++++++++++
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   8 +-
 5 files changed, 126 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f8657c3..193db70 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -417,6 +417,8 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
					struct drm_gem_object *gobj,
					int flags);
+struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
+					       struct dma_buf *dma_buf);
 int amdgpu_gem_prime_pin(struct drm_gem_object *obj);
 void amdgpu_gem_prime_unpin(struct drm_gem_object *obj);
 struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object *);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index d704a45..b5483e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -147,6 +147,7 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
 	struct drm_device *dev = crtc->dev;
 	struct amdgpu_device *adev = dev->dev_private;
 	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+	bool gtt_scannable = (adev->asic_type >= CHIP_CARRIZO && adev->flags & AMD_IS_APU);
 	struct amdgpu_framebuffer *old_amdgpu_fb;
 	struct amdgpu_framebuffer *new_amdgpu_fb;
 	struct drm_gem_object *obj;
@@ -190,8 +191,13 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
 	r = amdgpu_bo_pin(new_abo, AMDGPU_GEM_DOMAIN_VRAM, &base);
 	if (unlikely(r != 0)) {
-		DRM_ERROR("failed to pin new abo buffer before flip\n");
-		goto unreserve;
+		/* latest APUs support gtt scan out */
+		if (gtt_scannable)
+			r = amdgpu_bo_pin(new_abo, AMDGPU_GEM_DOMAIN_GTT, &base);
+		if (unlikely(r != 0)) {
+			DRM_ERROR("failed to pin new abo buffer before flip\n");
+			goto unreserve;
+		}
 	}

 	r = reservation_object_get_fences_rcu(new_abo->tbo.resv, &work->excl,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 31383e0..df30b08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -868,7 +868,7 @@ static struct drm_driver kms_driver = {
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export = amdgpu_gem_prime_export,
-	.gem_prime_import = drm_gem_prime_import,
+	.gem_prime_import = amdgpu_gem_prime_import,
 	.gem_prime_pin = amdgpu_gem_prime_pin,
 	.gem_prime_unpin = amdgpu_gem_prime_unpin,
 	.gem_prime_res_obj = amdgpu_gem_prime_res_obj,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index ae9c106..9e1424d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -164,6 +164,82 @@ struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object *obj)
 	return bo->tbo.resv;
 }

+static int amdgpu_gem_begin_cpu_access(struct dma_buf *dma_buf,
+				       enum dma_data_direction direction)
+{
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(dma_buf->priv);
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	long i, ret = 0;
+	unsigned old_count;
+	bool reads = (direction == DMA_BIDIRECTIONAL || direction == DMA_FROM_DEVICE);
+	bool gtt_scannable = (adev->asic_type >= CHIP_CARRIZO && adev->flags & AMD_IS_APU);
+	u32 domain;
+
+	if (!reads || !gtt_scannable)
+		return 0;
+
+	ret = amdgpu_bo_reserve(bo, false);
+	if (unlikely(ret != 0))
+		return ret;
+
+	/*
+	 * Wait for all shared fences to complete before we switch to future
+	 * use of exclusive fence on this prime shared bo.
+	 */
+	ret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
+						  MAX_SCHEDULE_TIMEOUT);
+	if (unlikely(ret < 0)) {
+		DRM_DEBUG_PRIME("Fence wait failed: %li\n", ret);
+		amdgpu_bo_unreserve(bo);
+		return ret;
+	}
+
+	ret = 0;
+	/* Pin to gtt */
+	domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
+	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
+		old_count = bo->pin_count;
+		for (i = 0; i < old_count; i++)
+			amdgpu_bo_unpin(bo);
+		for (i = 0; i < old_count; i++) {
+			ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
+			if (unlikely(ret != 0))
+				break;
+		}
+	}
+	if (ret == 0)
+		ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
+
+	amdgpu_bo_unreserve(bo);
+	return ret;
+}
+
+static int amdgpu_gem_end_cpu_access(struct dma_buf *dma_buf,
+				     enum dma_data_direction direction)
+{
+	struct amdgpu_bo *bo = gem_to_amdgpu_bo(dma_buf->priv);
+	int ret = 0;
+	bool reads = (direction == DMA_BIDIRECTIONAL || direction == DMA_FROM_DEVICE);
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	bool gtt_scannable = (adev->asic_type >= CHIP_CARRIZO && adev->flags & AMD_IS_APU);
+
+	if (!reads || !gtt_scannable)
+		return 0;
+
+	mb();
+	ret = amdgpu_bo_reserve(bo, true);
+	if (unlikely(ret != 0))
+		return ret;
+
+	amdgpu_bo_unpin(bo);
+
+	amdgpu_bo_unreserve(bo);
+
+	return 0;
+}
+
+static struct dma_buf_ops amdgpu_dmabuf_ops;
+static atomic_t aops_lock;
+
 struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
					struct drm_gem_object *gobj,
					int flags)
@@ -178,5 +254,37 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
 	buf = drm_gem_prime_export(dev, gobj, flags);
 	if (!IS_ERR(buf))
 		buf->file->f_mapping = dev->anon_inode->i_mapping;
+
+	while (amdgpu_dmabuf_ops.begin_cpu_access != amdgpu_gem_begin_cpu_access ||
+	       amdgpu_dmabuf_ops.end_cpu_access != amdgpu_gem_end_cpu_access) {
+		if (!atomic_cmpxchg(&aops_lock, 0, 1)) {
+			amdgpu_dmabuf_ops = *(buf->ops);
+			amdgpu_dmabuf_ops.begin_cpu_access = amdgpu_gem_begin_cpu_access;
+			amdgpu_dmabuf_ops.end_cpu_access = amdgpu_gem_end_cpu_access;
+		}
+	}
+	buf->ops = &amdgpu_dmabuf_ops;
+
 	return buf;
 }
+
+struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
+					       struct dma_buf *dma_buf)
+{
+	struct drm_gem_object *obj;
+
+	if (dma_buf->ops == &amdgpu_dmabuf_ops) {
+		obj = dma_buf->priv;
+		if (obj->dev == dev) {
+			/*
+			 * Importing a dmabuf exported from our own gem increases
+			 * the refcount on the gem itself instead of the f_count
+			 * of the dmabuf.
+			 */
+			drm_gem_object_get(obj);
+			return obj;
+		}
+	}
+
+	return drm_gem_prime_import(dev, dma_buf);
+}
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a3bf021..f25b830 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2987,6 +2987,8 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane,
 	int r;
 	struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
 	unsigned int awidth;
+	struct amdgpu_device *adev = plane->dev->dev_private;
+	bool gtt_scannable = (adev->asic_type >= CHIP_CARRIZO && adev->flags & AMD_IS_APU);

 	dm_plane_state_old = to_dm_plane_state(plane->state);
 	dm_plane_state_new = to_dm_plane_state(new_state);
@@ -3005,7 +3007,11 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane,
 		return r;

 	r = amdgpu_bo_pin(rbo, AMDGPU_GEM_DOMAIN_VRAM, &afb->address);
-
+	if (unlikely(r != 0)) {
+		/* latest APUs support gtt scan out */
+		if (gtt_scannable)
+			r = amdgpu_bo_pin(rbo, AMDGPU_GEM_DOMAIN_GTT, &afb->address);
+	}
 	amdgpu_bo_unreserve(rbo);
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx