[PATCH RFC 1/4] cgroup, perf: Add ability to connect to perf cgroup from other cgroup controller

2021-11-18 Thread Kenny Ho
This provides the ability to allocate a cgroup-specific perf_event from bpf-cgroup in a later patch. Change-Id: I13aa7f3dfc2883ba3663c0b94744a6169504bbd8 Signed-off-by: Kenny Ho --- include/linux/cgroup.h | 2 ++ include/linux/perf_event.h | 2 ++ kernel/cgroup/cgroup.c | 4 ++-- kernel

[PATCH RFC 0/4] Add ability to attach bpf programs to a tracepoint inside a cgroup

2021-11-18 Thread Kenny Ho
3c452 [2] https://lwn.net/Articles/679074/ [3] https://www.linuxplumbersconf.org/event/4/contributions/291/attachments/313/528/Linux_Plumbers_Conference_2019.pdf [4] https://linuxplumbersconf.org/event/11/contributions/899/ Kenny Ho (4): cgroup, perf: Add ability to connect to perf cgroup fr

[PATCH RFC 2/4] bpf, perf: add ability to attach complete array of bpf prog to perf event

2021-11-18 Thread Kenny Ho
Change-Id: Ie2580c3a71e2a5116551879358cb5304b04d3838 Signed-off-by: Kenny Ho --- include/linux/trace_events.h | 9 + kernel/trace/bpf_trace.c | 28 2 files changed, 37 insertions(+) diff --git a/include/linux/trace_events.h b/include/linux

Re: [PATCH RFC 4/4] bpf,cgroup,perf: extend bpf-cgroup to support tracepoint attachment

2021-11-18 Thread Kenny Ho
On Thu, Nov 18, 2021 at 11:33 PM Alexei Starovoitov wrote: > > On Thu, Nov 18, 2021 at 03:28:40PM -0500, Kenny Ho wrote: > > + for_each_possible_cpu(cpu) { > > + /* allocate first, connect the cgroup later */ > > + events[i] = perf_event_create_

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-06 Thread Kenny Ho
Sorry for the late reply (I have been working on other stuff.) On Fri, Feb 5, 2021 at 8:49 AM Daniel Vetter wrote: > > So I agree that on one side CU mask can be used for low-level quality > of service guarantees (like the CLOS cache stuff on intel cpus as an > example), and that's going to be ra

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-07 Thread Kenny Ho
On Fri, May 7, 2021 at 4:59 AM Daniel Vetter wrote: > > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the easier gpu > cgroups controller to get started, since it's much closer to other cgroups > that control bandwidth of some kind. Whether it's i/o bandwidth or compute > bandwidth is

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-07 Thread Kenny Ho
On Fri, May 7, 2021 at 12:54 PM Daniel Vetter wrote: > > SRIOV is kinda by design vendor specific. You set up the VF endpoint, it > shows up, it's all hw+fw magic. Nothing for cgroups to manage here at all. Right, so in theory you just use the device cgroup with the VF endpoints. > All I meant is

[RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2020-10-07 Thread Kenny Ho
more useful for specific device Signed-off-by: Kenny Ho --- fs/ioctl.c | 5 +++ include/linux/bpf-cgroup.h | 14 include/linux/bpf_types.h | 2 ++ include/uapi/linux/bpf.h | 8 + kernel/bpf/cgroup.c| 66 ++ kerne

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-02-01 Thread Kenny Ho
aniel, this is quick *draft* to get a conversation going. Bpf was actually a path suggested by Tejun back in 2018 so I think you are mischaracterizing this quite a bit. "2018-11-20 Kenny Ho: To put the questions in more concrete terms, let say a user wants to expose certain part of a gpu to a par

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-02-01 Thread Kenny Ho
hat. No Daniel, this is quick *draft* to get a conversation going. Bpf was actually a path suggested by Tejun back in 2018 so I think you are mischaracterizing this quite a bit. "2018-11-20 Kenny Ho: To put the questions in more concrete terms, let say a user wants to expose certain

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-02-03 Thread Kenny Ho
n, Feb 01, 2021 at 11:51:07AM -0500, Kenny Ho wrote: > > On Mon, Feb 1, 2021 at 9:49 AM Daniel Vetter wrote: > > > - there's been a pile of cgroups proposal to manage gpus at the drm > > > subsystem level, some by Kenny, and frankly this at least looks a bit > >

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2020-11-02 Thread Kenny Ho
Adding a few more emails from get_maintainer.pl and bumping this thread since there hasn't been any comments so far. Is this too crazy? Am I missing something fundamental? Regards, Kenny On Wed, Oct 7, 2020 at 11:24 AM Kenny Ho wrote: > > This is a skeleton implementation to invi

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2020-11-02 Thread Kenny Ho
wrote: > > On Mon, Nov 02, 2020 at 02:23:02PM -0500, Kenny Ho wrote: > > Adding a few more emails from get_maintainer.pl and bumping this > > thread since there hasn't been any comments so far. Is this too > > crazy? Am I missing something fundamental? > > sorry

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2020-11-03 Thread Kenny Ho
On Tue, Nov 3, 2020 at 12:43 AM Alexei Starovoitov wrote: > On Mon, Nov 2, 2020 at 9:39 PM Kenny Ho wrote: > pls don't top post. My apology. > > Cgroup awareness is desired because the intent > > is to use this for resource management as well (potentially along with >

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2020-11-03 Thread Kenny Ho
On Tue, Nov 3, 2020 at 4:04 PM Alexei Starovoitov wrote: > > On Tue, Nov 03, 2020 at 02:19:22PM -0500, Kenny Ho wrote: > > On Tue, Nov 3, 2020 at 12:43 AM Alexei Starovoitov > > wrote: > > > On Mon, Nov 2, 2020 at 9:39 PM Kenny Ho wrote: > > Sounds like either

Re: [PATCH] drm/amdgpu: fix amdgpu_ras_block_late_init error handler

2022-02-22 Thread Kenny Ho
On Thu, Feb 17, 2022 at 2:06 PM Alex Deucher wrote: > > On Thu, Feb 17, 2022 at 2:04 PM Nick Desaulniers > wrote: > > > > > > Alex, > > Has AMD been able to set up clang builds, yet? > > No. I think some individual teams do, but it's never been integrated > into our larger CI systems as of yet a

Re: [PATCH RFC v4 16/16] drm/amdgpu: Integrate with DRM cgroup

2019-11-28 Thread Kenny Ho
Reducing audience since this is AMD specific. On Tue, Oct 8, 2019 at 3:11 PM Kuehling, Felix wrote: > > On 2019-08-29 2:05 a.m., Kenny Ho wrote: > > The number of logical gpu (lgpu) is defined to be the number of compute > > unit (CU) for a device. The lgpu allocation lim

Re: [PATCH RFC v4 02/16] cgroup: Introduce cgroup for drm subsystem

2019-11-28 Thread Kenny Ho
On Tue, Oct 1, 2019 at 10:31 AM Michal Koutný wrote: > On Thu, Aug 29, 2019 at 02:05:19AM -0400, Kenny Ho wrote: > > +struct cgroup_subsys drm_cgrp_subsys = { > > + .css_alloc = drmcg_css_alloc, > > + .css_free = drmcg_css_free, > > +

Re: [PATCH RFC v4 07/16] drm, cgroup: Add total GEM buffer allocation limit

2019-11-28 Thread Kenny Ho
On Tue, Oct 1, 2019 at 10:30 AM Michal Koutný wrote: > On Thu, Aug 29, 2019 at 02:05:24AM -0400, Kenny Ho wrote: > > drm.buffer.default > > A read-only flat-keyed file which exists on the root cgroup. > > Each entry is keyed by the drm

Re: [PATCH RFC v4 16/16] drm/amdgpu: Integrate with DRM cgroup

2019-12-03 Thread Kenny Ho
obvious to debug. (I want to write this down so I don't forget also... :) I should probably have some dmesg for situation like this.) Thanks! Regards, Kenny On Mon, Dec 2, 2019 at 5:05 PM Greathouse, Joseph wrote: > > > -----Original Message----- > > From: Kenny Ho > > S

[PATCH RFC 0/5] DRM cgroup controller

2018-11-20 Thread Kenny Ho
docs/tasks/manage-gpus/scheduling-gpus/ [6] https://blog.openshift.com/gpu-accelerated-sql-queries-with-postgresql-pg-strom-in-openshift-3-10/ [7] https://github.com/RadeonOpenCompute/k8s-device-plugin [8] https://github.com/kubernetes/kubernetes/issues/52757 Kenny Ho (5): cgroup: Introduce cgro

[PATCH RFC 3/5] drm/amdgpu: Add DRM cgroup support for AMD devices

2018-11-20 Thread Kenny Ho
Change-Id: Ib66c44ac1b1c367659e362a2fc05b6fbb3805876 Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/Makefile | 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 drivers/gpu/drm/amd/amdgpu/amdgpu_drmcgrp.c | 37 + drivers/gpu/drm/amd/amdgpu

[PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup

2018-11-20 Thread Kenny Ho
Account for the total size of buffer objects requested from amdgpu, by buffer type, on a per-cgroup basis. The x prefix in the control file name x.bo_requested.amd.stat signifies that it is experimental. Change-Id: Ifb680c4bcf3652879a7a659510e25680c2465cf6 Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu

[PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices

2018-11-20 Thread Kenny Ho
devices to the cgroup subsystem. In addition to the cgroup_subsys_state that is common to all DRM devices, a device-specific state is introduced and it is allocated according to the vendor of the device. Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65 Signed-off-by: Kenny Ho --- include/drm

[PATCH RFC 4/5] drm/amdgpu: Add accounting of command submission via DRM cgroup

2018-11-20 Thread Kenny Ho
Account for the number of commands submitted to amdgpu, by type, on a per-cgroup basis, for the purpose of profiling/monitoring applications. The x prefix in the control file name x.cmd_submitted.amd.stat signifies that it is experimental. Change-Id: Ibc22e5bda600f54fe820fe0af5400ca348691550 Signed-off-by: Kenny Ho

[PATCH RFC 1/5] cgroup: Introduce cgroup for drm subsystem

2018-11-20 Thread Kenny Ho
Change-Id: I6830d3990f63f0c13abeba29b1d330cf28882831 Signed-off-by: Kenny Ho --- include/linux/cgroup_drm.h| 32 include/linux/cgroup_subsys.h | 4 +++ init/Kconfig | 5 kernel/cgroup/Makefile| 1 + kernel/cgroup/drm.c | 46

Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup

2018-11-27 Thread Kenny Ho
Hey Christian, Sorry for the late reply, I missed this for some reason. On Wed, Nov 21, 2018 at 5:00 AM Christian König wrote: > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > > index 370e9a5536ef..531726443104 100644 > > --- a/include/uapi/drm/amdgpu_drm.h > > ++

Re: [PATCH RFC 5/5] drm/amdgpu: Add accounting of buffer object creation request via DRM cgroup

2018-11-27 Thread Kenny Ho
Ah I see. Thank you for the clarification. Regards, Kenny On Tue, Nov 27, 2018 at 3:31 PM Christian König wrote: > > Am 27.11.18 um 19:15 schrieb Kenny Ho: > > Hey Christian, > > > > Sorry for the late reply, I missed this for some reason. > > > > On Wed, Nov

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-03-17 Thread Kenny Ho
Hi Tejun, What's your thoughts on this latest series? Regards, Kenny On Wed, Feb 26, 2020 at 2:02 PM Kenny Ho wrote: > > This is a submission for the introduction of a new cgroup controller for the > drm subsystem follow a series of RFCs [v1, v2, v3, v4] > > Changes fr

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-03-24 Thread Kenny Ho
Hi Tejun, Can you elaborate more on what are the missing pieces? Regards, Kenny On Tue, Mar 24, 2020 at 2:46 PM Tejun Heo wrote: > > On Tue, Mar 17, 2020 at 12:03:20PM -0400, Kenny Ho wrote: > > What's your thoughts on this latest series? > > My overall impression is th

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-04-13 Thread Kenny Ho
work-conserving implementation first, especially when we have users asking for such functionality? Regards, Kenny On Mon, Apr 13, 2020 at 3:11 PM Tejun Heo wrote: > > Hello, Kenny. > > On Tue, Mar 24, 2020 at 02:49:27PM -0400, Kenny Ho wrote: > > Can you elaborate more on what are the mi

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-04-13 Thread Kenny Ho
Hi, On Mon, Apr 13, 2020 at 4:54 PM Tejun Heo wrote: > > Allocations definitely are acceptable and it's not a pre-requisite to have > work-conserving control first either. Here, given the lack of consensus in > terms of what even constitute resource units, I don't think it'd be a good > idea to c

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-04-14 Thread Kenny Ho
Hi Daniel, On Tue, Apr 14, 2020 at 8:20 AM Daniel Vetter wrote: > My understanding from talking with a few other folks is that > the cpumask-style CU-weight thing is not something any other gpu can > reasonably support (and we have about 6+ of those in-tree) How does Intel plan to support the Su

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-04-14 Thread Kenny Ho
your switching cost is zero.) As a drm co-maintainer, are you suggesting GPU has no place in the HPC use case? Regards, Kenny On Tue, Apr 14, 2020 at 8:52 AM Daniel Vetter wrote: > > On Tue, Apr 14, 2020 at 2:47 PM Kenny Ho wrote: > > On Tue, Apr 14, 2020 at 8:20 AM Daniel Vetter wr

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-04-14 Thread Kenny Ho
gestion, if not...question 2.) 2) If spatial sharing is required to support GPU HPC use cases, what would you implement if you have the hardware support today? Regards, Kenny On Tue, Apr 14, 2020 at 9:26 AM Daniel Vetter wrote: > > On Tue, Apr 14, 2020 at 3:14 PM Kenny Ho wrote: > >

Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-04-14 Thread Kenny Ho
support today? How would you support low-jitter/low-latency sharing of a single GPU if you have whatever hardware support you need today? Regards, Kenny > > On Tue, Apr 14, 2020 at 9:26 AM Daniel Vetter wrote: > > > > > > On Tue, Apr 14, 2020 at 3:14 PM Kenny Ho wrote: >

[RFC PATCH v2 2/5] cgroup: Add mechanism to register DRM devices

2019-05-09 Thread Kenny Ho
Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65 Signed-off-by: Kenny Ho --- include/drm/drm_cgroup.h | 24 include/linux/cgroup_drm.h | 10 kernel/cgroup/drm.c| 118 - 3 files changed, 151 insertions(+), 1 deletion(-) create

[RFC PATCH v2 3/5] drm/amdgpu: Register AMD devices for DRM cgroup

2019-05-09 Thread Kenny Ho
Change-Id: I3750fc657b956b52750a36cb303c54fa6a265b44 Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index da7b4fe8ade3..2568fd730161

[RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-09 Thread Kenny Ho
r card0 to 512MB sed -i '1s/.*/536870912/' /sys/fs/cgroup//drm.buffer.total.max Change-Id: I4c249d06d45ec709d6481d4cbe87c5168545c5d0 Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 + drivers/gpu/drm/drm_gem.c | 7 + drivers/gpu/drm/dr

[RFC PATCH v2 1/5] cgroup: Introduce cgroup for drm subsystem

2019-05-09 Thread Kenny Ho
Change-Id: I6830d3990f63f0c13abeba29b1d330cf28882831 Signed-off-by: Kenny Ho --- include/linux/cgroup_drm.h| 32 ++ include/linux/cgroup_subsys.h | 4 init/Kconfig | 5 + kernel/cgroup/Makefile| 1 + kernel/cgroup/drm.c

[RFC PATCH v2 5/5] drm, cgroup: Add peak GEM buffer allocation limit

2019-05-09 Thread Kenny Ho
This new drmcgrp resource limits the largest GEM buffer that can be allocated in a cgroup. Change-Id: I0830d56775568e1cf215b56cc892d5e7945e9f25 Signed-off-by: Kenny Ho --- include/linux/cgroup_drm.h | 2 ++ kernel/cgroup/drm.c| 59 ++ 2 files changed

[RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem

2019-05-09 Thread Kenny Ho
m-in-openshift-3-10/ [7] https://github.com/RadeonOpenCompute/k8s-device-plugin [8] https://github.com/kubernetes/kubernetes/issues/52757 Kenny Ho (5): cgroup: Introduce cgroup for drm subsystem cgroup: Add mechanism to register DRM devices drm/amdgpu: Register AMD devices for DRM cgrou

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-10 Thread Kenny Ho
On Fri, May 10, 2019 at 8:28 AM Christian König wrote: > > Am 09.05.19 um 23:04 schrieb Kenny Ho: > > + /* only allow bo from the same cgroup or its ancestor to be imported > > */ > > + if (drmcgrp != NULL && > > + !drmcgrp_is_

Re: [RFC PATCH v2 0/5] new cgroup controller for gpu/drm subsystem

2019-05-10 Thread Kenny Ho
o implement? I would be happy to dig into those. Regards, Kenny > The only major issue I can see is on patch #4, see there for further > details. > > Christian. > > Am 09.05.19 um 23:04 schrieb Kenny Ho: > > This is a follow up to the RFC I made last november to introduce a

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-10 Thread Kenny Ho
On Fri, May 10, 2019 at 11:08 AM Koenig, Christian wrote: > Am 10.05.19 um 16:57 schrieb Kenny Ho: > > On Fri, May 10, 2019 at 8:28 AM Christian König > > wrote: > >> Am 09.05.19 um 23:04 schrieb Kenny Ho: > So the drm cgroup container is separate to other cgroup cont

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-10 Thread Kenny Ho
On Fri, May 10, 2019 at 1:48 PM Koenig, Christian wrote: > Well another question is why do we want to prevent that in the first place? > > I mean the worst thing that can happen is that we account a BO multiple > times. That's one of the problems. The other one is the BO outliving the lifetime of

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-15 Thread Kenny Ho
On Wed, May 15, 2019 at 5:26 PM Welty, Brian wrote: > On 5/9/2019 2:04 PM, Kenny Ho wrote: > > There are four control file types, > > stats (ro) - display current measured values for a resource > > max (rw) - limits for a resource > > default (ro, root cgroup only) - de

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-16 Thread Kenny Ho
On Thu, May 16, 2019 at 3:25 AM Christian König wrote: > Am 16.05.19 um 09:16 schrieb Koenig, Christian: > > Am 16.05.19 um 04:29 schrieb Kenny Ho: > >> On Wed, May 15, 2019 at 5:26 PM Welty, Brian wrote: > >>> On 5/9/2019 2:04 PM, Kenny Ho wrote: > >>>

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-16 Thread Kenny Ho
On Thu, May 16, 2019 at 10:12 AM Christian König wrote: > Am 16.05.19 um 16:03 schrieb Kenny Ho: > > On Thu, May 16, 2019 at 3:25 AM Christian König > > wrote: > >> Am 16.05.19 um 09:16 schrieb Koenig, Christian: > >> We need something like the Linux sy

Re: [RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

2019-05-16 Thread Kenny Ho
On Thu, May 16, 2019 at 10:10 AM Tejun Heo wrote: > I haven't gone through the patchset yet but some quick comments. > > On Wed, May 15, 2019 at 10:29:21PM -0400, Kenny Ho wrote: > > Given this controller is specific to the drm kernel subsystem which > > uses minor to id

[PATCH 03/11] drm, cgroup: Initialize drmcg properties

2020-02-14 Thread Kenny Ho
applies to the root cgroup since it can be created before DRM devices are available. The drmcg controller will go through all existing drm cgroups and initialize them with the new device accordingly. Change-Id: I64e421d8dfcc22ee8282cc1305960e20c2704db7 Signed-off-by: Kenny Ho --- drivers/gpu/drm

[PATCH 02/11] drm, cgroup: Bind drm and cgroup subsystem

2020-02-14 Thread Kenny Ho
Since the drm subsystem can be compiled as a module and drm devices can be added and removed during run time, add several functions to bind the drm subsystem as well as drm devices with drmcg. Two pairs of functions: drmcg_bind/drmcg_unbind - used to bind/unbind the drm subsystem to the cgroup sub

[PATCH 08/11] drm, cgroup: Add peak GEM buffer allocation limit

2020-02-14 Thread Kenny Ho
y memparse (such as k, m, g) can be used. Set largest allocation for /dev/dri/card1 to 4MB echo "226:1 4m" > drm.buffer.peak.max Change-Id: I5ab3fb4a442b6cbd5db346be595897c90217da69 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 18 +
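
The limit syntax in the snippet above ("226:1 4m", with memparse-style suffixes) can be sketched in userspace as follows. This is only an approximation for illustration: the function names parse_size and parse_limit_entry are hypothetical, the sketch accepts plain decimal numbers only, and the kernel's memparse() remains the authority on what is actually accepted.

```python
# Userspace sketch only; parse_size/parse_limit_entry are hypothetical names.
# Assumption: suffixes are binary (1024-based), as in the kernel's memparse().
_SHIFT = {"k": 10, "m": 20, "g": 30, "t": 40, "p": 50, "e": 60}


def parse_size(text):
    """Convert a size such as '4m' or '512' to a byte count."""
    text = text.strip().lower()
    if text and text[-1] in _SHIFT:
        return int(text[:-1]) << _SHIFT[text[-1]]
    return int(text)


def parse_limit_entry(entry):
    """Split 'MAJOR:MINOR SIZE' into (major, minor, size_in_bytes)."""
    device, size = entry.split()
    major, minor = device.split(":")
    return int(major), int(minor), parse_size(size)


print(parse_limit_entry("226:1 4m"))  # (226, 1, 4194304)
```

Under the same assumption, "512m" works out to 536870912 bytes, which matches the raw value written with sed in the RFC v2 4/5 snippet.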

[PATCH 04/11] drm, cgroup: Add total GEM buffer allocation stats

2020-02-14 Thread Kenny Ho
is keyed by the drm device's major:minor. Total GEM buffer allocation in bytes. Change-Id: Ibc1f646ca7dbc588e2d11802b156b524696a23e7 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 50 +- drivers/gpu/drm/drm_gem.c | 9 ++ incl

[PATCH 01/11] cgroup: Introduce cgroup for drm subsystem

2020-02-14 Thread Kenny Ho
virtualization.) Change-Id: Ia90aed8c4cb89ff20d8216a903a765655b44fc9a Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 18 - Documentation/cgroup-v1/drm.rst | 1 + include/linux/cgroup_drm.h | 92 + include/linux/cgroup_subsys.h

[PATCH 05/11] drm, cgroup: Add peak GEM buffer allocation stats

2020-02-14 Thread Kenny Ho
drm.buffer.peak.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Largest (high water mark) GEM buffer allocated in bytes. Change-Id: I40fe4c13c1cea8613b3e04b802f3e1f19eaab4fc Signed-off-by: Ken

[PATCH 11/11] drm/amdgpu: Integrate with DRM cgroup

2020-02-14 Thread Kenny Ho
the drmcg the kfd process belongs to. Change-Id: I2930e76ef9ac6d36d0feb81f604c89a4208e6614 Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 4 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 29 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 + drivers

[PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

2020-02-14 Thread Kenny Ho
llocation after considering the relationship between the cgroups and their configurations in drm.lgpu. Change-Id: Idde0ef9a331fd67bb9c7eb8ef9978439e6452488 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 80 ++ include/drm/drm_cgroup.h|

[PATCH 07/11] drm, cgroup: Add total GEM buffer allocation limit

2020-02-14 Thread Kenny Ho
Set allocation limit for /dev/dri/card1 to 1GB echo "226:1 1g" > drm.buffer.total.max Set allocation limit for /dev/dri/card0 to 512MB echo "226:0 512m" > drm.buffer.total.max Change-Id: Id3265bbd0fafe84a16b59617df79bd32196160be Signed-off-by: Kenn

[PATCH 06/11] drm, cgroup: Add GEM buffer allocation count stats

2020-02-14 Thread Kenny Ho
drm.buffer.count.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Total number of GEM buffers allocated. Change-Id: Iad29bdf44390dbcee07b1e72ea0ff811aa3b9dcd Signed-off-by: Kenny Ho --- Document

[PATCH 10/11] drm, cgroup: add update trigger after limit change

2020-02-14 Thread Kenny Ho
type for the migrated task. Change-Id: I0ce7c4e5a04c31bd0f8d9853a383575d4bc9a3fa Signed-off-by: Kenny Ho --- include/drm/drm_drv.h | 10 kernel/cgroup/drm.c | 59 ++- 2 files changed, 68 insertions(+), 1 deletion(-) diff --git a/include/drm

[PATCH 00/11] new cgroup controller for gpu/drm subsystem

2020-02-14 Thread Kenny Ho
3-10/ [7] https://github.com/RadeonOpenCompute/k8s-device-plugin [8] https://github.com/kubernetes/kubernetes/issues/52757 Kenny Ho (11): cgroup: Introduce cgroup for drm subsystem drm, cgroup: Bind drm and cgroup subsystem drm, cgroup: Initialize drmcg properties drm, cgroup: Add total GEM bu

Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

2020-02-14 Thread Kenny Ho
ignored." What Intel does with the user expressed configuration of "5 out of 100" is entirely up to Intel (time slice if you like, change to specific EUs later if you like, or make it driver configurable to support both if you like.) Regards, Kenny > > On Fri, Feb 14, 2020

Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

2020-02-14 Thread Kenny Ho
a cgroup, they > > would set count=5. Per the documentation in this patch: "Some DRM > > devices may only support lgpu as anonymous resources. In such case, > > the significance of the position of the set bits in list will be > > ignored." What Intel

Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

2020-02-14 Thread Kenny Ho
Hi Tejun, On Fri, Feb 14, 2020 at 2:17 PM Tejun Heo wrote: > > I have to agree with Daniel here. My apologies if I weren't clear > enough. Here's one interface I can think of: > > * compute weight: The same format as io.weight. Proportional control >of gpu compute. > > * memory low: Please

Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

2020-02-19 Thread Kenny Ho
On Wed, Feb 19, 2020 at 11:18 AM Johannes Weiner wrote: > > Yes, I'd go with absolute units when it comes to memory, because it's > not a renewable resource like CPU and IO, and so we do have cliff > behavior around the edge where you transition from ok to not-enough. > > memory.low is a bit in fl

Re: [PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

2020-02-20 Thread Kenny Ho
Thanks, I will take a look. Regards, Kenny On Wed, Feb 19, 2020 at 1:38 PM Johannes Weiner wrote: > > On Wed, Feb 19, 2020 at 11:28:48AM -0500, Kenny Ho wrote: > > On Wed, Feb 19, 2020 at 11:18 AM Johannes Weiner wrote: > > > > > > Yes, I'd go with abs

[PATCH v2 02/11] drm, cgroup: Bind drm and cgroup subsystem

2020-02-26 Thread Kenny Ho
Since the drm subsystem can be compiled as a module and drm devices can be added and removed during run time, add several functions to bind the drm subsystem as well as drm devices with drmcg. Two pairs of functions: drmcg_bind/drmcg_unbind - used to bind/unbind the drm subsystem to the cgroup sub

[PATCH v2 04/11] drm, cgroup: Add total GEM buffer allocation stats

2020-02-26 Thread Kenny Ho
is keyed by the drm device's major:minor. Total GEM buffer allocation in bytes. Change-Id: Ibc1f646ca7dbc588e2d11802b156b524696a23e7 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 50 +- drivers/gpu/drm/drm_gem.c | 9 ++ incl

[PATCH v2 01/11] cgroup: Introduce cgroup for drm subsystem

2020-02-26 Thread Kenny Ho
virtualization.) Change-Id: Ia90aed8c4cb89ff20d8216a903a765655b44fc9a Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 18 - Documentation/cgroup-v1/drm.rst | 1 + include/linux/cgroup_drm.h | 92 + include/linux/cgroup_subsys.h

[PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-02-26 Thread Kenny Ho
evice-plugin [8] https://github.com/kubernetes/kubernetes/issues/52757 Kenny Ho (11): cgroup: Introduce cgroup for drm subsystem drm, cgroup: Bind drm and cgroup subsystem drm, cgroup: Initialize drmcg properties drm, cgroup: Add total GEM buffer allocation stats drm, cgroup: Add

[PATCH v2 10/11] drm, cgroup: add update trigger after limit change

2020-02-26 Thread Kenny Ho
type for the migrated task. Change-Id: I0ce7c4e5a04c31bd0f8d9853a383575d4bc9a3fa Signed-off-by: Kenny Ho --- include/drm/drm_drv.h | 10 kernel/cgroup/drm.c | 58 +++ 2 files changed, 68 insertions(+) diff --git a/include/drm/drm_drv.h b/include

[PATCH v2 09/11] drm, cgroup: Add compute as gpu cgroup resource

2020-02-26 Thread Kenny Ho
list Enumeration of the subdevices = == Change-Id: Idde0ef9a331fd67bb9c7eb8ef9978439e6452488 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 21 +++ include/drm/drm_cgroup.h| 3 + include/linux/cg

[PATCH v2 03/11] drm, cgroup: Initialize drmcg properties

2020-02-26 Thread Kenny Ho
applies to the root cgroup since it can be created before DRM devices are available. The drmcg controller will go through all existing drm cgroups and initialize them with the new device accordingly. Change-Id: I64e421d8dfcc22ee8282cc1305960e20c2704db7 Signed-off-by: Kenny Ho --- drivers/gpu/drm

[PATCH v2 08/11] drm, cgroup: Add peak GEM buffer allocation limit

2020-02-26 Thread Kenny Ho
y memparse (such as k, m, g) can be used. Set largest allocation for /dev/dri/card1 to 4MB echo "226:1 4m" > gpu.buffer.peak.max Change-Id: I5ab3fb4a442b6cbd5db346be595897c90217da69 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 18 +

[PATCH v2 11/11] drm/amdgpu: Integrate with DRM cgroup

2020-02-26 Thread Kenny Ho
defined by the drmcg the kfd process belongs to. Change-Id: I2930e76ef9ac6d36d0feb81f604c89a4208e6614 Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 4 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 29 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7

[PATCH v2 06/11] drm, cgroup: Add GEM buffer allocation count stats

2020-02-26 Thread Kenny Ho
gpu.buffer.count.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Total number of GEM buffers allocated. Change-Id: Iad29bdf44390dbcee07b1e72ea0ff811aa3b9dcd Signed-off-by: Kenny Ho --- Document

[PATCH v2 07/11] drm, cgroup: Add total GEM buffer allocation limit

2020-02-26 Thread Kenny Ho
Set allocation limit for /dev/dri/card1 to 1GB echo "226:1 1g" > gpu.buffer.total.max Set allocation limit for /dev/dri/card0 to 512MB echo "226:0 512m" > gpu.buffer.total.max Change-Id: Id3265bbd0fafe84a16b59617df79bd32196160be Signed-off-by: Kenn

[PATCH v2 05/11] drm, cgroup: Add peak GEM buffer allocation stats

2020-02-26 Thread Kenny Ho
gpu.buffer.peak.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Largest (high water mark) GEM buffer allocated in bytes. Change-Id: I40fe4c13c1cea8613b3e04b802f3e1f19eaab4fc Signed-off-by: Ken

[PATCH RFC v4 01/16] drm: Add drm_minor_for_each

2019-08-28 Thread Kenny Ho
: I7c4b67ce6b31f06d1037b03435386ff5b8144ca5 Signed-off-by: Kenny Ho --- drivers/gpu/drm/drm_drv.c | 19 +++ drivers/gpu/drm/drm_internal.h | 4 include/drm/drm_drv.h | 4 3 files changed, 23 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index

[PATCH RFC v4 07/16] drm, cgroup: Add total GEM buffer allocation limit

2019-08-28 Thread Kenny Ho
Set allocation limit for /dev/dri/card1 to 1GB echo "226:1 1g" > drm.buffer.total.max Set allocation limit for /dev/dri/card0 to 512MB echo "226:0 512m" > drm.buffer.total.max Change-Id: I96e0b7add4d331ed8bb267b3c9243d360c6e9903 Signed-off-by: Kenn

[PATCH RFC v4 00/16] new cgroup controller for gpu/drm subsystem

2019-08-28 Thread Kenny Ho
github.com/kubernetes/kubernetes/issues/52757 Kenny Ho (16): drm: Add drm_minor_for_each cgroup: Introduce cgroup for drm subsystem drm, cgroup: Initialize drmcg properties drm, cgroup: Add total GEM buffer allocation stats drm, cgroup: Add peak GEM buffer allocation stats drm, cgroup: Add

[PATCH RFC v4 02/16] cgroup: Introduce cgroup for drm subsystem

2019-08-28 Thread Kenny Ho
virtualization.) Change-Id: I6830d3990f63f0c13abeba29b1d330cf28882831 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 18 - Documentation/cgroup-v1/drm.rst | 1 + include/linux/cgroup_drm.h | 92 + include/linux/cgroup_subsys.h

[PATCH RFC v4 10/16] drm, cgroup: Add TTM buffer peak usage stats

2019-08-28 Thread Kenny Ho
usage == == Reading returns the following:: 226:0 system=0 tt=0 vram=0 priv=0 226:1 system=0 tt=9035776 vram=17768448 priv=16809984 226:2 system=0 tt=9035776 vram=17768448 priv=16809984 Change-Id: I986e44533848f66411465bdd52105e78105a709a Signed-off-by: Kenny Ho --- in
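
The per-device output quoted in the snippet above (one line per major:minor, followed by key=value pairs) could be read from userspace roughly as below. This is a sketch under assumptions: parse_nested_keyed is a hypothetical helper name, the sample text is copied from the snippet rather than read from a real control file, and all values are assumed to be integers.

```python
# Hypothetical helper for illustration; not part of the patch series.
def parse_nested_keyed(text):
    """Map 'MAJOR:MINOR k1=v1 k2=v2 ...' lines to {device: {key: int}}."""
    stats = {}
    for line in text.strip().splitlines():
        device, *pairs = line.split()
        stats[device] = {
            key: int(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return stats


# Sample taken from the snippet above, not from a live system.
sample = """226:0 system=0 tt=0 vram=0 priv=0
226:1 system=0 tt=9035776 vram=17768448 priv=16809984"""

print(parse_nested_keyed(sample)["226:1"]["vram"])  # 17768448
```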

[PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-28 Thread Kenny Ho
Allow DRM TTM memory manager to register a work_struct, such that, when a drmcgrp is under memory pressure, memory reclaiming can be triggered immediately. Change-Id: I25ac04e2db9c19ff12652b88ebff18b44b2706d8 Signed-off-by: Kenny Ho --- drivers/gpu/drm/ttm/ttm_bo.c| 49

[PATCH RFC v4 03/16] drm, cgroup: Initialize drmcg properties

2019-08-28 Thread Kenny Ho
to the root cgroup since it can be created before DRM devices are available. The drmcg controller will go through all existing drm cgroups and initialize them with the new device accordingly. Change-Id: I908ee6975ea0585e4c30eafde4599f87094d8c65 Signed-off-by: Kenny Ho --- drivers/gpu/drm

[PATCH RFC v4 06/16] drm, cgroup: Add GEM buffer allocation count stats

2019-08-28 Thread Kenny Ho
drm.buffer.count.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Total number of GEM buffers allocated. Change-Id: Id3e1809d5fee8562e47a7d2b961688956d844ec6 Signed-off-by: Kenny Ho --- Document

[PATCH RFC v4 12/16] drm, cgroup: Add soft VRAM limit

2019-08-28 Thread Kenny Ho
Change-Id: I7988e28a453b53140b40a28c176239acbc81d491 Signed-off-by: Kenny Ho --- drivers/gpu/drm/ttm/ttm_bo.c | 7 ++ include/drm/drm_cgroup.h | 17 + include/linux/cgroup_drm.h | 2 + kernel/cgroup/drm.c | 135 +++ 4 files changed, 161

[PATCH RFC v4 16/16] drm/amdgpu: Integrate with DRM cgroup

2019-08-28 Thread Kenny Ho
the drmcg the kfd process belongs to. Change-Id: I69a57452c549173a1cd623c30dc57195b3b6563e Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 4 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++ drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 + drivers/gpu

[PATCH RFC v4 09/16] drm, cgroup: Add TTM buffer allocation stats

2019-08-28 Thread Kenny Ho
stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Total number of evictions. Change-Id: Ice2c4cc845051229549bebeb6aa2d7d6153bdf6a Signed-off-by: Kenny Ho --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 +- d

[PATCH RFC v4 05/16] drm, cgroup: Add peak GEM buffer allocation stats

2019-08-28 Thread Kenny Ho
drm.buffer.peak.stats A read-only flat-keyed file which exists on all cgroups. Each entry is keyed by the drm device's major:minor. Largest (high water mark) GEM buffer allocated in bytes. Change-Id: I79e56222151a3d33a76a61ba0097fe93ebb3449f Signed-off-by: Ken

[PATCH RFC v4 15/16] drm, cgroup: add update trigger after limit change

2019-08-28 Thread Kenny Ho
type for the migrated task. Change-Id: I68187a72818b855b5f295aefcb241cda8ab63b00 Signed-off-by: Kenny Ho --- include/drm/drm_drv.h | 10 kernel/cgroup/drm.c | 57 +++ 2 files changed, 67 insertions(+) diff --git a/include/drm/drm_drv.h b/include

[PATCH RFC v4 08/16] drm, cgroup: Add peak GEM buffer allocation limit

2019-08-28 Thread Kenny Ho
y memparse (such as k, m, g) can be used. Set largest allocation for /dev/dri/card1 to 4MB echo "226:1 4m" > drm.buffer.peak.max Change-Id: I0830d56775568e1cf215b56cc892d5e7945e9f25 Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 18
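To make the suffix handling in the snippet above concrete, here is a small userspace sketch of how a value such as `4m` expands to bytes, and of the line that the `echo` example writes. This is an illustration only: `parse_size` and `format_limit` are hypothetical helpers, the sketch accepts decimal numbers only, and it covers just the k, m and g suffixes named in the snippet, treated as powers of 1024.

```python
# Shift amounts for the suffixes named in the snippet (powers of 1024).
_SUFFIX_SHIFT = {"k": 10, "m": 20, "g": 30}


def parse_size(text):
    """Convert '4m' style strings to a byte count (decimal input only)."""
    text = text.strip()
    shift = _SUFFIX_SHIFT.get(text[-1:].lower(), 0)
    digits = text[:-1] if shift else text
    return int(digits) << shift


def format_limit(major, minor, size):
    """Build the 'major:minor size' line written to the limit file."""
    return f"{major}:{minor} {size}"


print(parse_size("4m"))           # size in bytes for the 4MB example
print(format_limit(226, 1, "4m")) # the line the echo example writes
```

The sketch stops at building the string so that it can run on a machine without the drm cgroup controller; actually writing it would target `drm.buffer.peak.max` inside the cgroup's directory, as in the `echo` example.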

[PATCH RFC v4 04/16] drm, cgroup: Add total GEM buffer allocation stats

2019-08-28 Thread Kenny Ho
is keyed by the drm device's major:minor. Total GEM buffer allocation in bytes. Change-Id: I9d662ec50d64bb40a37dbf47f018b2f3a1c033ad Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 50 +- drivers/gpu/drm/drm_gem.c | 9 ++ incl

[PATCH RFC v4 14/16] drm, cgroup: Introduce lgpu as DRM cgroup resource

2019-08-28 Thread Kenny Ho
set bits in list will be ignored. This lgpu resource supports the 'allocation' resource distribution model. Change-Id: I1afcacf356770930c7f925df043e51ad06ceb98e Signed-off-by: Kenny Ho --- Documentation/admin-guide/cgroup-v2.rst | 46 includ

[PATCH RFC v4 11/16] drm, cgroup: Add per cgroup bw measure and control

2019-08-28 Thread Kenny Ho
226:2 bytes_in_period=9223372036854775807 avg_bytes_per_us=65536 Change-Id: Ie573491325ccc16535bb943e7857f43bd0962add Signed-off-by: Kenny Ho --- drivers/gpu/drm/ttm/ttm_bo.c | 7 + include/drm/drm_cgroup.h | 19 +++ include/linux/cgroup_drm.h | 16 ++ kernel/cgroup/drm.c | 319 +

Re: [PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Kenny Ho
straightforward as far as I understand it currently.) Regards, Kenny On Thu, Aug 29, 2019 at 3:08 AM Koenig, Christian wrote: > On 29.08.19 at 08:05, Kenny Ho wrote: > > Allow DRM TTM memory manager to register a work_struct, such that, when > > a drmcgrp is under memory pressure, memory re

Re: [PATCH RFC v4 13/16] drm, cgroup: Allow more aggressive memory reclaim

2019-08-29 Thread Kenny Ho
't have a distinction which domain you need to evict stuff from. > > Regards, > Christian. > > On 29.08.19 at 16:07, Kenny Ho wrote: > > Thanks for the feedback Christian. I am still digging into this one. Daniel > suggested leveraging the Shrinker API for the functio

Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict

2019-09-02 Thread Kenny Ho
Hey Christian, Can you go into details a bit more on the how and why this doesn't work well anymore? (such as its relationship with per VM BOs?) I am curious to learn more because I was reading into this chunk of code earlier. Is this something that the Shrinker API can help with? Regards, Ken

Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict

2019-09-02 Thread Kenny Ho
point, so this patch set > here switched from a dynamic approach to just assuming the worst and > reserving some memory for page tables. > > Regards, > Christian. > > On 02.09.19 at 16:07, Kenny Ho wrote: > > Hey Christian, > > > > Can you go into details a
