Re: [Intel-gfx] [RFC PATCH 0/5] cgroup support for GPU devices

2019-05-05 Thread Kenny Ho
(sent again.  Not sure why my previous email was just a reply instead
of reply-all.)

On Sun, May 5, 2019 at 12:05 PM Leon Romanovsky  wrote:
> We are talking about two different access patterns for this device
> memory (DM). One is to use this device memory, and the second is to
> configure/limit it.
> Usually those actions will be performed by different groups.
>
> The first group (programmers) uses a special API [1] through libibverbs [2]
> without any notion of cgroups or any limitations. The second group (sysadmins)
> is less interested in application specifics; for them, "device memory" means
> "memory" and not "rdma, NIC-specific, internal memory".
Um... I am not sure that answered it, especially in the context of
cgroups (this is just for my curiosity, btw; I don't know much about
rdma.)  You said sysadmins are less interested in application
specifics, but then how would they make the judgement call on how much
"device memory" is provisioned to one application/container over
another (let's say you have 5 cgroups sharing an rdma device)?  What are
the consequences of under-provisioning "device memory" to an
application?  And if it is all just memory, can a sysadmin provision
more system memory in place of device memory (i.e., are they
interchangeable)?  I guess I am confused because if device memory is
just memory (not rdma- or nic-specific) to sysadmins, how would they
know to set the right amount?

Regards,
Kenny

> [1] ibv_alloc_dm()
> http://man7.org/linux/man-pages/man3/ibv_alloc_dm.3.html
> https://www.openfabrics.org/images/2018workshop/presentations/304_LLiss_OnDeviceMemory.pdf
> [2] https://github.com/linux-rdma/rdma-core/blob/master/libibverbs/

Re: [Intel-gfx] [RFC PATCH 0/5] cgroup support for GPU devices

2019-05-05 Thread Kenny Ho
On Sun, May 5, 2019 at 3:14 AM Leon Romanovsky  wrote:
> > > Doesn't RDMA already have a separate cgroup?  Why not implement it there?
> > >
> >
> > Hi Kenny, I can't answer for Leon, but I'm hopeful he agrees with the rationale
> > I gave in the cover letter.  Namely, implementing this in the rdma controller
> > would mean duplicating existing memcg controls there.
>
> Exactly. I didn't feel comfortable adding the notion of "device memory"
> to the RDMA cgroup and postponed that decision to a later point in time.
> RDMA operates on verbs objects and all of our user-space API is based around
> that concept. In the end, a system administrator would have a hard time
> understanding the difference between memcg and RDMA memory.
Interesting.  I actually don't understand this part (I worked on the
devops/sysadmin side of things but never with rdma.)  Don't
applications that use rdma require some awareness of rdma (I mean, you
mentioned verbs and objects... or do they just use regular malloc for
buffer allocation and then send it through some function?)  As a user,
I would have this question: why do I need to configure some rdma
resources under the rdma cgroup while other rdma resources sit in a
different, seemingly unrelated cgroup?

I think we need to be careful about drawing the line between
duplication and over-coupling between subsystems.  I have other
thoughts and concerns, and I will try to organize them into a response
in the next few days.

Regards,
Kenny


> >
> > Is AMD interested in collaborating to help shape this framework?
> > It is intended to be device-neutral, so it could be leveraged by various
> > types of devices.
> > If you have an alternative solution well underway, then maybe
> > we can work together to merge our efforts into one.
> > In the end, the DRM community is best served by a common solution.

Re: [Intel-gfx] [RFC PATCH 0/5] cgroup support for GPU devices

2019-05-02 Thread Kenny Ho
> Count us (Mellanox) in too; our RDMA devices expose special,
> limited-size device memory to users, and we would like to provide
> an option to use cgroups to control its exposure.
Doesn't RDMA already have a separate cgroup?  Why not implement it there?


> > and with future work, we could extend to:
> > *  track and control share of GPU time (reuse of cpu/cpuacct)
> > *  apply mask of allowed execution engines (reuse of cpusets)
> >
> > Instead of introducing a new cgroup subsystem for GPU devices, a new
> > framework is proposed to allow devices to register with existing cgroup
> > controllers, which creates per-device cgroup_subsys_state within the
> > cgroup.  This gives device drivers their own private cgroup controls
> > (such as memory limits or other parameters) to be applied to device
> > resources instead of host system resources.
> > Device drivers (GPU or other) are then able to reuse the existing cgroup
> > controls, instead of inventing similar ones.
> >
> > Per-device controls would be exposed in cgroup filesystem as:
> > mount//.devices//
> > such as (for example):
> > mount//memory.devices//memory.max
> > mount//memory.devices//memory.current
> > mount//cpu.devices//cpu.stat
> > mount//cpu.devices//cpu.weight
> >
> > The drm/i915 patch in this series is based on top of other RFC work [1]
> > for i915 device memory support.
> >
> > AMD [2] and Intel [3] have proposed related work in this area within the
> > last few years, listed below as reference.  This new RFC reuses existing
> > cgroup controllers and takes a different approach than prior work.
> >
> > Finally, some potential discussion points for this series:
> > * merge proposed .devices into a single devices directory?
> > * allow devices to have multiple registrations for subsets of resources?
> > * document a 'common charging policy' for device drivers to follow?
> >
> > [1] https://patchwork.freedesktop.org/series/56683/
> > [2] 
> > https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
> > [3] 
> > https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> >
> >
> > Brian Welty (5):
> >   cgroup: Add cgroup_subsys per-device registration framework
> >   cgroup: Change kernfs_node for directories to store
> > cgroup_subsys_state
> >   memcg: Add per-device support to memory cgroup subsystem
> >   drm: Add memory cgroup registration and DRIVER_CGROUPS feature bit
> >   drm/i915: Use memory cgroup for enforcing device memory limit
> >
> >  drivers/gpu/drm/drm_drv.c  |  12 +
> >  drivers/gpu/drm/drm_gem.c  |   7 +
> >  drivers/gpu/drm/i915/i915_drv.c|   2 +-
> >  drivers/gpu/drm/i915/intel_memory_region.c |  24 +-
> >  include/drm/drm_device.h   |   3 +
> >  include/drm/drm_drv.h  |   8 +
> >  include/drm/drm_gem.h  |  11 +
> >  include/linux/cgroup-defs.h|  28 ++
> >  include/linux/cgroup.h |   3 +
> >  include/linux/memcontrol.h |  10 +
> >  kernel/cgroup/cgroup-v1.c  |  10 +-
> >  kernel/cgroup/cgroup.c | 310 ++---
> >  mm/memcontrol.c| 183 +++-
> >  13 files changed, 552 insertions(+), 59 deletions(-)
> >
> > --
> > 2.21.0
> >
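
To make the per-device control layout quoted above a bit more concrete, here is
a minimal userspace sketch that sets and then reads back a per-device memory
limit.  It is illustrative only: the cgroup name ("gpu-job0"), the device
directory ("card0") and the mount point are assumptions, not names defined by
the RFC.

#include <stdio.h>

int main(void)
{
	/* Assumed paths following the mount/<group>/memory.devices/<device>/
	 * layout sketched in the cover letter; purely illustrative. */
	const char *limit_path =
		"/sys/fs/cgroup/gpu-job0/memory.devices/card0/memory.max";
	const char *usage_path =
		"/sys/fs/cgroup/gpu-job0/memory.devices/card0/memory.current";
	char buf[64];
	FILE *f;

	/* Cap this cgroup's memory on the device at 512 MiB (in bytes). */
	f = fopen(limit_path, "w");
	if (!f)
		return 1;
	fprintf(f, "536870912\n");
	fclose(f);

	/* Read back the current per-device usage for monitoring. */
	f = fopen(usage_path, "r");
	if (f && fgets(buf, sizeof(buf), f))
		printf("device memory in use: %s", buf);
	if (f)
		fclose(f);
	return 0;
}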

[Intel-gfx] [PATCH RFC 1/5] cgroup: Introduce cgroup for drm subsystem

2018-11-20 Thread Kenny Ho
Change-Id: I6830d3990f63f0c13abeba29b1d330cf28882831
Signed-off-by: Kenny Ho 
---
 include/linux/cgroup_drm.h| 32 
 include/linux/cgroup_subsys.h |  4 +++
 init/Kconfig  |  5 
 kernel/cgroup/Makefile|  1 +
 kernel/cgroup/drm.c   | 46 +++
 5 files changed, 88 insertions(+)
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
new file mode 100644
index ..79ab38b0f46d
--- /dev/null
+++ b/include/linux/cgroup_drm.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#ifndef _CGROUP_DRM_H
+#define _CGROUP_DRM_H
+
+#ifdef CONFIG_CGROUP_DRM
+
+#include 
+
+struct drmcgrp {
+   struct cgroup_subsys_state  css;
+};
+
+static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
+{
+   return css ? container_of(css, struct drmcgrp, css) : NULL;
+}
+
+static inline struct drmcgrp *get_drmcgrp(struct task_struct *task)
+{
+   return css_drmcgrp(task_get_css(task, drm_cgrp_id));
+}
+
+
+static inline struct drmcgrp *parent_drmcgrp(struct drmcgrp *cg)
+{
+   return css_drmcgrp(cg->css.parent);
+}
+
+#endif /* CONFIG_CGROUP_DRM */
+#endif /* _CGROUP_DRM_H */
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index acb77dcff3b4..ddedad809e8b 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -61,6 +61,10 @@ SUBSYS(pids)
 SUBSYS(rdma)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_DRM)
+SUBSYS(drm)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index a4112e95724a..bee1e164443a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -836,6 +836,11 @@ config CGROUP_RDMA
  Attaching processes with active RDMA resources to the cgroup
  hierarchy is allowed even if can cross the hierarchy's limit.
 
+config CGROUP_DRM
+   bool "DRM controller (EXPERIMENTAL)"
+   help
+ Provides accounting and enforcement of resources in the DRM subsystem.
+
 config CGROUP_FREEZER
bool "Freezer controller"
help
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index bfcdae896122..6af14bd93050 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -4,5 +4,6 @@ obj-y := cgroup.o rstat.o namespace.o cgroup-v1.o
 obj-$(CONFIG_CGROUP_FREEZER) += freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
+obj-$(CONFIG_CGROUP_DRM) += drm.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
new file mode 100644
index ..d9e194b9aead
--- /dev/null
+++ b/kernel/cgroup/drm.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: MIT
+// Copyright 2018 Advanced Micro Devices, Inc.
+#include 
+#include 
+#include 
+
+static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
+   struct cftype *cft)
+{
+   return 88;
+}
+
+static void drmcgrp_css_free(struct cgroup_subsys_state *css)
+{
+   struct drmcgrp *drmcgrp = css_drmcgrp(css);
+
+   kfree(css_drmcgrp(css));
+}
+
+static struct cgroup_subsys_state *
+drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
+{
+   struct drmcgrp *drmcgrp;
+
+   drmcgrp = kzalloc(sizeof(struct drmcgrp), GFP_KERNEL);
+   if (!drmcgrp)
+   return ERR_PTR(-ENOMEM);
+
+   return &drmcgrp->css;
+}
+
+struct cftype files[] = {
+   {
+   .name = "drm_test",
+   .read_u64 = drmcgrp_test_read,
+   },
+   { } /* terminate */
+};
+
+struct cgroup_subsys drm_cgrp_subsys = {
+   .css_alloc  = drmcgrp_css_alloc,
+   .css_free   = drmcgrp_css_free,
+   .early_init = false,
+   .legacy_cftypes = files,
+   .dfl_cftypes= files,
+};
-- 
2.19.1
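
As a reading aid (not part of the patch), here is a minimal sketch of how a DRM
driver could use the helpers introduced above to resolve the calling task's drm
cgroup and walk up the hierarchy; this is the pattern the later amdgpu patches
follow.  charge_one_level() is a hypothetical accounting hook used only for
illustration.

/* Hypothetical per-level accounting hook, for illustration only. */
static void charge_one_level(struct drmcgrp *grp)
{
	/* e.g. bump a counter in grp under a suitable lock */
}

static void drmcgrp_charge_hierarchy(struct task_struct *task)
{
	struct drmcgrp *grp = get_drmcgrp(task);
	struct drmcgrp *p;

	if (grp == NULL)
		return;

	/* Charge the task's cgroup and every ancestor up to the root. */
	for (p = grp; p != NULL; p = parent_drmcgrp(p))
		charge_one_level(p);

	/* get_drmcgrp() acquired a css reference via task_get_css(). */
	css_put(&grp->css);
}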



[Intel-gfx] [PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup

2018-11-20 Thread Kenny Ho
Account for the number of commands submitted to amdgpu, by ring type, on a
per-cgroup basis, for the purpose of profiling/monitoring applications.

The x prefix in the control file name x.cmd_submitted.amd.stat signifies
that it is experimental.

Change-Id: Ibc22e5bda600f54fe820fe0af5400ca348691550
Signed-off-by: Kenny Ho 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 54 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c| 15 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h|  5 +-
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 663043c8f0f5..b448160aed89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -33,6 +33,7 @@
 #include "amdgpu_trace.h"
 #include "amdgpu_gmc.h"
 #include "amdgpu_gem.h"
+#include "amdgpu_drmcgrp.h"
 
 static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
  struct drm_amdgpu_cs_chunk_fence *data,
@@ -1275,6 +1276,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
union drm_amdgpu_cs *cs = data;
struct amdgpu_cs_parser parser = {};
bool reserved_buffers = false;
+   struct amdgpu_ring *ring;
int i, r;
 
if (!adev->accel_working)
@@ -1317,6 +1319,9 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r)
goto out;
 
+   ring = to_amdgpu_ring(parser.entity->rq->sched);
+   amdgpu_drmcgrp_count_cs(current, dev, ring->funcs->type);
+
	r = amdgpu_cs_submit(&parser, cs);
 
 out:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
index ed8aac17769c..853b77532428 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
@@ -1,11 +1,65 @@
 // SPDX-License-Identifier: MIT
 // Copyright 2018 Advanced Micro Devices, Inc.
 #include 
+#include 
 #include 
 #include 
+#include "amdgpu_ring.h"
 #include "amdgpu_drmcgrp.h"
 
+void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
+   enum amdgpu_ring_type r_type)
+{
+   struct drmcgrp *drmcgrp = get_drmcgrp(task);
+   struct drmcgrp_device_resource *ddr;
+   struct drmcgrp *p;
+   struct amd_drmcgrp_dev_resource *a_ddr;
+
+   if (drmcgrp == NULL)
+   return;
+
+   ddr = drmcgrp->dev_resources[dev->primary->index];
+
+   mutex_lock(&ddr->ddev->mutex);
+   for (p = drmcgrp; p != NULL; p = parent_drmcgrp(p)) {
+   a_ddr = ddr_amdddr(p->dev_resources[dev->primary->index]);
+
+   a_ddr->cs_count[r_type]++;
+   }
+   mutex_unlock(&ddr->ddev->mutex);
+}
+
+int amd_drmcgrp_cmd_submit_accounting_read(struct seq_file *sf, void *v)
+{
+   struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
+   struct drmcgrp_device_resource *ddr = NULL;
+   struct amd_drmcgrp_dev_resource *a_ddr = NULL;
+   int i, j;
+
+   seq_puts(sf, "---\n");
+   for (i = 0; i < MAX_DRM_DEV; i++) {
+   ddr = drmcgrp->dev_resources[i];
+
+   if (ddr == NULL || ddr->ddev->vid != amd_drmcgrp_vendor_id)
+   continue;
+
+   a_ddr = ddr_amdddr(ddr);
+
+   seq_printf(sf, "card%d:\n", i);
+   for (j = 0; j < __MAX_AMDGPU_RING_TYPE; j++)
+   seq_printf(sf, "  %s: %llu\n", amdgpu_ring_names[j], 
a_ddr->cs_count[j]);
+   }
+
+   return 0;
+}
+
+
 struct cftype files[] = {
+   {
+   .name = "x.cmd_submitted.amd.stat",
+   .seq_show = amd_drmcgrp_cmd_submit_accounting_read,
+   .flags = CFTYPE_NOT_ON_ROOT,
+   },
{ } /* terminate */
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
index e2934b7a49f5..f894a9a1059f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
@@ -5,12 +5,17 @@
 #define _AMDGPU_DRMCGRP_H
 
 #include 
+#include "amdgpu_ring.h"
 
 /* for AMD specific DRM resources */
 struct amd_drmcgrp_dev_resource {
struct drmcgrp_device_resource ddr;
+   u64 cs_count[__MAX_AMDGPU_RING_TYPE];
 };
 
+void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
+   enum amdgpu_ring_type r_type);
+
 static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct 
drmcgrp_device_resource *ddr)
 {
return ddr ? container_of(ddr, struct amd_drmcgrp_dev_resource, ddr) : 
NULL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers
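
For context (not part of the patch), here is a small userspace sketch of how
the per-cgroup counters exposed by x.cmd_submitted.amd.stat might be read.  The
control file name comes from the patch; the mount point, the group name
("gpu-job0") and the "drm." prefix that the cgroup core would normally add for
a controller named drm are assumptions.

#include <stdio.h>

int main(void)
{
	/* Assumed path: <drm cgroup mount>/<group>/drm.x.cmd_submitted.amd.stat */
	FILE *f = fopen("/sys/fs/cgroup/drm/gpu-job0/drm.x.cmd_submitted.amd.stat",
			"r");
	char line[256];

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* Expected shape, per the seq_printf() calls in the patch:
	 *   card0:
	 *     <ring type name>: <count>
	 *     ...
	 */
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}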

[Intel-gfx] [PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices

2018-11-20 Thread Kenny Ho
Change-Id: Ib66c44ac1b1c367659e362a2fc05b6fbb3805876
Signed-off-by: Kenny Ho 
---
 drivers/gpu/drm/amd/amdgpu/Makefile |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  7 
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 37 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h | 19 +++
 include/drm/drmcgrp_vendors.h   |  1 +
 5 files changed, 67 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 138cb787d27e..5cf8048f2d75 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -186,4 +186,7 @@ amdgpu-y += $(AMD_DISPLAY_FILES)
 
 endif
 
+#DRM cgroup controller
+amdgpu-y += amdgpu_drmcgrp.o
+
 obj-$(CONFIG_DRM_AMDGPU)+= amdgpu.o
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 30bc345d6fdf..ad0373f83ed3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2645,6 +2646,12 @@ int amdgpu_device_init(struct amdgpu_device *adev,
goto failed;
}
 
+   /* TODO:docs */
+   if (drmcgrp_vendors[amd_drmcgrp_vendor_id] == NULL)
+   drmcgrp_register_vendor(&amd_drmcgrp_vendor, amd_drmcgrp_vendor_id);
+
+   drmcgrp_register_device(adev->ddev, amd_drmcgrp_vendor_id);
+
return 0;
 
 failed:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
new file mode 100644
index ..ed8aac17769c
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: MIT
+// Copyright 2018 Advanced Micro Devices, Inc.
+#include 
+#include 
+#include 
+#include "amdgpu_drmcgrp.h"
+
+struct cftype files[] = {
+   { } /* terminate */
+};
+
+struct cftype *drmcgrp_amd_get_cftypes(void)
+{
+   return files;
+}
+
+struct drmcgrp_device_resource *amd_drmcgrp_alloc_dev_resource(void)
+{
+   struct amd_drmcgrp_dev_resource *a_ddr;
+
+   a_ddr = kzalloc(sizeof(struct amd_drmcgrp_dev_resource), GFP_KERNEL);
+   if (!a_ddr)
+   return ERR_PTR(-ENOMEM);
+
+   return &a_ddr->ddr;
+}
+
+void amd_drmcgrp_free_dev_resource(struct drmcgrp_device_resource *ddr)
+{
+   kfree(ddr_amdddr(ddr));
+}
+
+struct drmcgrp_vendor amd_drmcgrp_vendor = {
+   .get_cftypes = drmcgrp_amd_get_cftypes,
+   .alloc_dev_resource = amd_drmcgrp_alloc_dev_resource,
+   .free_dev_resource = amd_drmcgrp_free_dev_resource,
+};
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
new file mode 100644
index ..e2934b7a49f5
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#ifndef _AMDGPU_DRMCGRP_H
+#define _AMDGPU_DRMCGRP_H
+
+#include 
+
+/* for AMD specific DRM resources */
+struct amd_drmcgrp_dev_resource {
+   struct drmcgrp_device_resource ddr;
+};
+
+static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct 
drmcgrp_device_resource *ddr)
+{
+   return ddr ? container_of(ddr, struct amd_drmcgrp_dev_resource, ddr) : 
NULL;
+}
+
+#endif /* _AMDGPU_DRMCGRP_H */
diff --git a/include/drm/drmcgrp_vendors.h b/include/drm/drmcgrp_vendors.h
index b04d8649851b..6cfbf1825344 100644
--- a/include/drm/drmcgrp_vendors.h
+++ b/include/drm/drmcgrp_vendors.h
@@ -3,5 +3,6 @@
  */
 #if IS_ENABLED(CONFIG_CGROUP_DRM)
 
+DRMCGRP_VENDOR(amd)
 
 #endif
-- 
2.19.1



[Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices

2018-11-20 Thread Kenny Ho
Since many parts of the DRM subsystem have vendor-specific
implementations, we introduce mechanisms for vendors to register their
specific resources and control files with the DRM cgroup subsystem.  A
vendor registers itself with the DRM cgroup subsystem first, before
registering individual DRM devices with the cgroup subsystem.

In addition to the cgroup_subsys_state that is common to all DRM
devices, a device-specific state is introduced and it is allocated
according to the vendor of the device.

Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65
Signed-off-by: Kenny Ho 
---
 include/drm/drm_cgroup.h  | 39 
 include/drm/drmcgrp_vendors.h |  7 +++
 include/linux/cgroup_drm.h| 26 +++
 kernel/cgroup/drm.c   | 84 +++
 4 files changed, 156 insertions(+)
 create mode 100644 include/drm/drm_cgroup.h
 create mode 100644 include/drm/drmcgrp_vendors.h

diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
new file mode 100644
index ..26cbea7059a6
--- /dev/null
+++ b/include/drm/drm_cgroup.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#ifndef __DRM_CGROUP_H__
+#define __DRM_CGROUP_H__
+
+#define DRMCGRP_VENDOR(_x) _x ## _drmcgrp_vendor_id,
+enum drmcgrp_vendor_id {
+#include 
+   DRMCGRP_VENDOR_COUNT,
+};
+#undef DRMCGRP_VENDOR
+
+#define DRMCGRP_VENDOR(_x) extern struct drmcgrp_vendor _x ## _drmcgrp_vendor;
+#include 
+#undef DRMCGRP_VENDOR
+
+
+
+#ifdef CONFIG_CGROUP_DRM
+
+extern struct drmcgrp_vendor *drmcgrp_vendors[];
+
+int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum 
drmcgrp_vendor_id id);
+int drmcgrp_register_device(struct drm_device *device, enum drmcgrp_vendor_id 
id);
+
+#else
+static inline int drmcgrp_register_vendor(struct drmcgrp_vendor *vendor, enum 
drmcgrp_vendor_id id)
+{
+   return 0;
+}
+
+static inline int drmcgrp_register_device(struct drm_device *device, enum 
drmcgrp_vendor_id id)
+{
+   return 0;
+}
+
+#endif /* CONFIG_CGROUP_DRM */
+#endif /* __DRM_CGROUP_H__ */
diff --git a/include/drm/drmcgrp_vendors.h b/include/drm/drmcgrp_vendors.h
new file mode 100644
index ..b04d8649851b
--- /dev/null
+++ b/include/drm/drmcgrp_vendors.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ */
+#if IS_ENABLED(CONFIG_CGROUP_DRM)
+
+
+#endif
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 79ab38b0f46d..a776662d9593 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -6,10 +6,36 @@
 
 #ifdef CONFIG_CGROUP_DRM
 
+#include 
 #include 
+#include 
+#include 
+
+/* limit defined per the way drm_minor_alloc operates */
+#define MAX_DRM_DEV (64 * DRM_MINOR_RENDER)
+
+struct drmcgrp_device {
+   enum drmcgrp_vendor_id  vid;
+   struct drm_device   *dev;
+   struct mutexmutex;
+};
+
+/* vendor-common resource counting goes here */
+/* this struct should be included in the vendor specific resource */
+struct drmcgrp_device_resource {
+   struct drmcgrp_device   *ddev;
+};
+
+struct drmcgrp_vendor {
+   struct cftype *(*get_cftypes)(void);
+   struct drmcgrp_device_resource *(*alloc_dev_resource)(void);
+   void (*free_dev_resource)(struct drmcgrp_device_resource *dev_resource);
+};
+
 
 struct drmcgrp {
struct cgroup_subsys_state  css;
+   struct drmcgrp_device_resource  *dev_resources[MAX_DRM_DEV];
 };
 
 static inline struct drmcgrp *css_drmcgrp(struct cgroup_subsys_state *css)
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index d9e194b9aead..f9630cc389bc 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -1,8 +1,30 @@
 // SPDX-License-Identifier: MIT
 // Copyright 2018 Advanced Micro Devices, Inc.
+#include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
+#include 
+#include 
+
+/* generate an array of drm cgroup vendor pointers */
+#define DRMCGRP_VENDOR(_x) [_x ## _drmcgrp_vendor_id] = NULL,
+struct drmcgrp_vendor *drmcgrp_vendors[] = {
+#include 
+};
+#undef DRMCGRP_VENDOR
+EXPORT_SYMBOL(drmcgrp_vendors);
+
+static DEFINE_MUTEX(drmcgrp_mutex);
+
+/* indexed by drm_minor for access speed */
+static struct drmcgrp_device   *known_drmcgrp_devs[MAX_DRM_DEV];
+
+static int max_minor;
+
 
 static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
struct cftype *cft)
@@ -13,6 +35,12 @@ static u64 drmcgrp_test_read(struct cgroup_subsys_state *css,
 static void drmcgrp_css_free(struct cgroup_subsys_state *css)
 {
struct drmcgrp *drmcgrp = css_drmcgrp(css);
+   int i;
+
+   for (i = 0; i <= max_minor; i++) {
+   if (drmcgrp->dev_resources[i] != NULL)
+   
drmcgrp_vendors[known_drmcgrp_devs[i]->vid]->free_dev_resource(drmcgrp->dev_resources[i]);
+   }
 
kfree(css_drmcgrp(css));
 }
@@ -21,11 +49,27 @@ static struct cgro
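
As a reading aid (not part of the patch), here is a small standalone demo of
the x-macro pattern used above: drmcgrp_vendors.h is included several times
under different DRMCGRP_VENDOR definitions to build the vendor-id enum, the
extern declarations and the vendor pointer table.  The names below are
shortened stand-ins, not the series' identifiers.

#include <stdio.h>

/* stand-in for include/drm/drmcgrp_vendors.h: one line per vendor */
#define VENDOR_LIST \
	VENDOR(amd)

/* expansion 1: build the vendor-id enum (cf. enum drmcgrp_vendor_id) */
#define VENDOR(x) x##_vendor_id,
enum vendor_id { VENDOR_LIST VENDOR_COUNT };
#undef VENDOR

/* expansion 2: build a table indexed by those ids
 * (cf. struct drmcgrp_vendor *drmcgrp_vendors[])
 */
#define VENDOR(x) [x##_vendor_id] = #x,
static const char *vendor_names[] = { VENDOR_LIST };
#undef VENDOR

int main(void)
{
	for (int i = 0; i < VENDOR_COUNT; i++)
		printf("vendor %d: %s\n", i, vendor_names[i]);
	return 0;
}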

[Intel-gfx] [PATCH RFC 0/5] DRM cgroup controller

2018-11-20 Thread Kenny Ho
The purpose of this patch series is to start a discussion for a generic cgroup
controller for the drm subsystem.  The design proposed here is a very early one.
We are hoping to engage the community as we develop the idea.


Background
==========
Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
tasks, and all their future children, into hierarchical groups with specialized
behaviour, such as accounting/limiting the resources which processes in a cgroup
can access [1].  Weights, limits, protections and allocations are the main resource
distribution models.  Existing cgroup controllers include cpu, memory, io,
rdma, and more.  cgroup is one of the foundational technologies that enable
popular container-based application deployment and management methods.

Direct Rendering Manager/drm contains code intended to support the needs of
complex graphics devices. Graphics drivers in the kernel may make use of DRM
functions to make tasks like memory management, interrupt handling and DMA
easier, and provide a uniform interface to applications.  The DRM has also
developed beyond traditional graphics applications to support compute/GPGPU
applications.


Motivations
===========
As GPUs grow beyond the realm of desktop/workstation graphics into areas like
data center clusters and IoT, there is an increasing need to monitor and regulate
GPUs as a resource, like cpu, memory and io.

Matt Roper from Intel began working on a similar idea in early 2018 [2] for the
purpose of managing GPU priority using the cgroup hierarchy.  While that
particular use case may not warrant a standalone drm cgroup controller, there
are other use cases where having one can be useful [3].  Monitoring GPU
resources such as VRAM and buffers, CUs (compute units [AMD's nomenclature])/EUs
(execution units [Intel's nomenclature]), and GPU job scheduling [4] can help
sysadmins get a better understanding of an application's usage profile.  Further
regulation of the aforementioned resources can also help sysadmins
optimize workload deployment on limited GPU resources.

With the increased importance of machine learning, data science and other
cloud-based applications, GPUs are already in production use in data centers
today [5,6,7].  Existing GPU resource management is very coarse-grained, however,
as sysadmins are only able to distribute workloads on a per-GPU basis [8].  An
alternative is to use GPU virtualization (with or without SR-IOV), but it
generally acts on the entire GPU instead of the specific resources in a GPU.
With a drm cgroup controller, we can enable alternate, fine-grained, sub-GPU
resource management (in addition to what may be available via GPU
virtualization).

In addition to production use, the DRM cgroup can also help with testing
graphics application robustness by providing a means to artificially limit the
DRM resources available to applications.

Challenges
==========
While there is common infrastructure in DRM that is shared across many vendors
(the scheduler [4], for example), there are also aspects of DRM that are
vendor-specific.  To accommodate this, we borrowed the mechanism the cgroup core
uses to handle different kinds of cgroup controllers.

Resources for DRM are also often device (GPU) specific instead of system
specific, and a system may contain more than one GPU.  For this, we borrowed some
of the ideas from the RDMA cgroup controller.

Approach
========
To experiment with the idea of a DRM cgroup, we would like to start with basic
accounting and statistics, then continue to iterate and add regulating
mechanisms into the driver.

[1] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
[2] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
[3] https://www.spinics.net/lists/cgroups/msg20720.html
[4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler
[5] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
[6] 
https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/
[7] https://github.com/RadeonOpenCompute/k8s-device-plugin
[8] https://github.com/kubernetes/kubernetes/issues/52757


Kenny Ho (5):
  cgroup: Introduce cgroup for drm subsystem
  cgroup: Add mechanism to register vendor specific DRM devices
  drm/amdgpu: Add DRM cgroup support for AMD devices
  drm/amdgpu: Add accounting of command submission via DRM cgroup
  drm/amdgpu: Add accounting of buffer object creation request via DRM
cgroup

 drivers/gpu/drm/amd/amdgpu/Makefile |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 147 
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  27 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  13 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c|  15 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h|   5 +-
 include/drm/drm_cgroup.h|  39 ++
 include/drm

[Intel-gfx] [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup

2018-11-20 Thread Kenny Ho
Account for the total size of buffer objects requested from amdgpu, by
buffer type (memory domain), on a per-cgroup basis.

The x prefix in the control file name x.bo_requested.amd.stat signifies
that it is experimental.

Change-Id: Ifb680c4bcf3652879a7a659510e25680c2465cf6
Signed-off-by: Kenny Ho 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 56 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 13 +
 include/uapi/drm/amdgpu_drm.h   | 24 ++---
 4 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
index 853b77532428..e3d98ed01b79 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c
@@ -7,6 +7,57 @@
 #include "amdgpu_ring.h"
 #include "amdgpu_drmcgrp.h"
 
+void amdgpu_drmcgrp_count_bo_req(struct task_struct *task, struct drm_device 
*dev,
+   u32 domain, unsigned long size)
+{
+   struct drmcgrp *drmcgrp = get_drmcgrp(task);
+   struct drmcgrp_device_resource *ddr;
+   struct drmcgrp *p;
+   struct amd_drmcgrp_dev_resource *a_ddr;
+   int i;
+
+   if (drmcgrp == NULL)
+   return;
+
+   ddr = drmcgrp->dev_resources[dev->primary->index];
+
+   mutex_lock(&ddr->ddev->mutex);
+   for (p = drmcgrp; p != NULL; p = parent_drmcgrp(p)) {
+   a_ddr = ddr_amdddr(p->dev_resources[dev->primary->index]);
+
+   for (i = 0; i < __MAX_AMDGPU_MEM_DOMAIN; i++)
+   if ((1 << i) & domain)
+   a_ddr->bo_req_count[i] += size;
+   }
+   mutex_unlock(&ddr->ddev->mutex);
+}
+
+int amd_drmcgrp_bo_req_stat_read(struct seq_file *sf, void *v)
+{
+   struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
+   struct drmcgrp_device_resource *ddr = NULL;
+   struct amd_drmcgrp_dev_resource *a_ddr = NULL;
+   int i, j;
+
+   seq_puts(sf, "---\n");
+   for (i = 0; i < MAX_DRM_DEV; i++) {
+   ddr = drmcgrp->dev_resources[i];
+
+   if (ddr == NULL || ddr->ddev->vid != amd_drmcgrp_vendor_id)
+   continue;
+
+   a_ddr = ddr_amdddr(ddr);
+
+   seq_printf(sf, "card%d:\n", i);
+   for (j = 0; j < __MAX_AMDGPU_MEM_DOMAIN; j++)
+   seq_printf(sf, "  %s: %llu\n", 
amdgpu_mem_domain_names[j], a_ddr->bo_req_count[j]);
+   }
+
+   return 0;
+}
+
+
+
 void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
enum amdgpu_ring_type r_type)
 {
@@ -55,6 +106,11 @@ int amd_drmcgrp_cmd_submit_accounting_read(struct seq_file 
*sf, void *v)
 
 
 struct cftype files[] = {
+   {
+   .name = "x.bo_requested.amd.stat",
+   .seq_show = amd_drmcgrp_bo_req_stat_read,
+   .flags = CFTYPE_NOT_ON_ROOT,
+   },
{
.name = "x.cmd_submitted.amd.stat",
.seq_show = amd_drmcgrp_cmd_submit_accounting_read,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
index f894a9a1059f..8b9d61e47dde 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.h
@@ -11,10 +11,13 @@
 struct amd_drmcgrp_dev_resource {
struct drmcgrp_device_resource ddr;
u64 cs_count[__MAX_AMDGPU_RING_TYPE];
+   u64 bo_req_count[__MAX_AMDGPU_MEM_DOMAIN];
 };
 
 void amdgpu_drmcgrp_count_cs(struct task_struct *task, struct drm_device *dev,
enum amdgpu_ring_type r_type);
+void amdgpu_drmcgrp_count_bo_req(struct task_struct *task, struct drm_device 
*dev,
+   u32 domain, unsigned long size);
 
 static inline struct amd_drmcgrp_dev_resource *ddr_amdddr(struct 
drmcgrp_device_resource *ddr)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 7b3d1ebda9df..339e1d3edad8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -31,6 +31,17 @@
 #include 
 #include "amdgpu.h"
 #include "amdgpu_display.h"
+#include "amdgpu_drmcgrp.h"
+
+char const *amdgpu_mem_domain_names[] = {
+   [AMDGPU_MEM_DOMAIN_CPU] = "cpu",
+   [AMDGPU_MEM_DOMAIN_GTT] = "gtt",
+   [AMDGPU_MEM_DOMAIN_VRAM]= "vram",
+   [AMDGPU_MEM_DOMAIN_GDS] = "gds",
+   [AMDGPU_MEM_DOMAIN_GWS] = "gws",
+   [AMDGPU_MEM_DOMAIN_OA]  = "oa",
+   [__MAX_AMDGPU_MEM_DOMAIN]   = "_max"
+};
 
 void amdgpu_gem_object_free(struct drm_gem_object *gobj)
 {
@@ -52,6 +63,8 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
unsigne
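
For illustration (not part of the patch), here is a standalone sketch of the
domain-bitmask accounting that amdgpu_drmcgrp_count_bo_req() performs above: a
buffer-object request whose domain mask names several placements (e.g. GTT and
VRAM) is charged to the counter of every domain in the mask.  The domain values
and names here are local stand-ins mirroring amdgpu_mem_domain_names[], not the
uapi definitions.

#include <stdio.h>

/* Local stand-ins for the memory domains; not the uapi enum. */
enum { DOM_CPU, DOM_GTT, DOM_VRAM, DOM_MAX };
static const char *dom_names[DOM_MAX] = { "cpu", "gtt", "vram" };

static unsigned long long bo_req_count[DOM_MAX];

/* Mirrors the per-domain loop in amdgpu_drmcgrp_count_bo_req(). */
static void count_bo_req(unsigned int domain_mask, unsigned long size)
{
	for (int i = 0; i < DOM_MAX; i++)
		if ((1u << i) & domain_mask)
			bo_req_count[i] += size;
}

int main(void)
{
	/* A 4 MiB request allowed to live in either GTT or VRAM is
	 * charged to both domain counters.
	 */
	count_bo_req((1u << DOM_GTT) | (1u << DOM_VRAM), 4UL << 20);

	for (int i = 0; i < DOM_MAX; i++)
		printf("%s: %llu\n", dom_names[i], bo_req_count[i]);
	return 0;
}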