[Bug 110616] vce module in h264 encode
https://bugs.freedesktop.org/show_bug.cgi?id=110616 Bug ID: 110616 Summary: vce module in h264 encode Product: DRI Version: DRI git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: DRM/Radeon Assignee: dri-devel@lists.freedesktop.org Reporter: baopeng88_...@163.com We have a playback problem with the VCE module of the AMD graphics card. When using libva for H.264 encoding, does the pic_order_cnt_type field in the SPS support options other than 0, such as 1 or 2? pic_order_cnt_type specifies how the POC (picture order count) is coded, and the POC identifies the display order of pictures. The POC can either be derived from frame_num through a mapping relationship, or it can be transmitted explicitly by the encoder. When pic_order_cnt_type = 2, the POC calculation relies on the fewest extra parameters (saving bits in the slice header), and the order of the top and bottom fields can be obtained from frame_num alone. We need this mode for fast playback. -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
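For reference, the POC derivation the reporter describes for pic_order_cnt_type == 2 can be sketched as follows. This is a simplified reading of clause 8.2.1.3 of the H.264 spec (frame coding only, field pairs sharing one count); MAX_FRAME_NUM and the function shape are illustrative assumptions, not libva or VCE API:

```c
#include <assert.h>
#include <stdbool.h>

/* 2**(log2_max_frame_num_minus4 + 4); an assumed SPS-derived value. */
#define MAX_FRAME_NUM 16

/*
 * Derive the picture order count for pic_order_cnt_type == 2 (H.264
 * clause 8.2.1.3, simplified to frame coding): the POC follows directly
 * from frame_num, so no extra POC syntax is carried in the slice header.
 * *frame_num_offset is running decoder state, updated in place.
 */
static int poc_type2(int frame_num, bool is_idr, bool is_ref,
                     int prev_frame_num, int *frame_num_offset)
{
    if (is_idr)
        *frame_num_offset = 0;
    else if (prev_frame_num > frame_num)      /* frame_num wrapped around */
        *frame_num_offset += MAX_FRAME_NUM;

    if (is_idr)
        return 0;
    if (!is_ref)      /* non-reference pictures land between references */
        return 2 * (*frame_num_offset + frame_num) - 1;
    return 2 * (*frame_num_offset + frame_num);
}
```

With an all-reference stream (IDR, then frames 1, 2, ...) this yields POCs 0, 2, 4, ..., i.e. display order equals decode order, which is why this mode needs the least signalling.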
[Bug 110615] starting X on AMDGPU takes up to a minute
https://bugs.freedesktop.org/show_bug.cgi?id=110615 Bug ID: 110615 Summary: starting X on AMDGPU takes up to a minute Product: DRI Version: XOrg git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: DRM/AMDgpu Assignee: dri-devel@lists.freedesktop.org Reporter: yury.tarasiev...@gmail.com Created attachment 144168 --> https://bugs.freedesktop.org/attachment.cgi?id=144168&action=edit dmesg utility output Starting X on AMDGPU takes up to a minute (about 40 s, I think). I have two video adapters, integrated NVIDIA and external R7 240. It's only the first X startup after boot that takes that long (and it's exactly the Xorg executable which produces the delay). All subsequent X startups are 'normal' (fast). So I'd expect some system configuration option finally gets set after the 1st X startup; maybe I could set it manually beforehand? However, there are those dumps from AMDGPU in the dmesg log. Didn't find anything useful on the net or in this BZ.
[Bug 110615] starting X on AMDGPU takes up to a minute
https://bugs.freedesktop.org/show_bug.cgi?id=110615 --- Comment #1 from Yury --- Created attachment 144169 --> https://bugs.freedesktop.org/attachment.cgi?id=144169&action=edit Xorg log, FWIW
[Bug 109925] Age of Wonders 3 - Vertical lines on Main Menu (Linux native via Steam)
https://bugs.freedesktop.org/show_bug.cgi?id=109925 Timothy Arceri changed:

           What       |Removed                         |Added
   Assignee           |dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org
   Component          |Drivers/Gallium/radeonsi        |Mesa core
   QA Contact         |dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org

--- Comment #5 from Timothy Arceri --- Moving to Mesa core as this happens on i965 also. I was sure this was a regression but could only run the game as far back as the 17.3 branch point on i965 and it was still present there.
[Bug 100239] Incorrect rendering in CS:GO
https://bugs.freedesktop.org/show_bug.cgi?id=100239 --- Comment #21 from Timothy Arceri --- (In reply to network723 from comment #20) > (In reply to Timothy Arceri from comment #19) > > Does that fix the issue for you? > > Yes, it does fix scope rendering for me. > Is any negative performance impact to be expected with that flag? Potentially but it's unlikely to be noticeable. > Also, is it doing effectively same thing as patch from comment #9? Yes.
Re: [PATCH v3 02/10] drm: Add drm_atomic_crtc_state_for_encoder helper
On Fri, May 03, 2019 at 04:06:28PM +0200, Daniel Vetter wrote: > On Fri, May 3, 2019 at 2:47 PM Sean Paul wrote: > > On Fri, May 03, 2019 at 10:18:51AM +0200, Daniel Vetter wrote: > > > On Thu, May 02, 2019 at 03:49:44PM -0400, Sean Paul wrote: > > > > From: Sean Paul > > > > > > > > This patch adds a helper to tease out the currently connected crtc for > > > > an encoder, along with its state. This follows the same pattern as the > > > > drm_atomic_crtc_*_for_* macros in the atomic helpers. Since the > > > > relationship of crtc:encoder is 1:n, we don't need a loop since there is > > > > only one crtc per encoder. > > > > > > No idea which macros you mean, couldn't find them. > > > > No longer relevant with the changes below, but for completeness, I was > > trying to > > refer to drm_atomic_crtc_state_for_each_plane and friends. I see now that I > > wasn't terribly clear :) > > > > > > > > > > > > Instead of splitting this into 3 functions which all do the same thing, > > > > this is presented as one function. Perhaps that's too ugly and it should > > > > be split to: > > > > struct drm_crtc *drm_atomic_crtc_for_encoder(state, encoder); > > > > struct drm_crtc_state *drm_atomic_new_crtc_state_for_encoder(state, > > > > encoder); > > > > struct drm_crtc_state *drm_atomic_old_crtc_state_for_encoder(state, > > > > encoder); > > > > > > > > Suggestions welcome. 
> > > > > > > > Changes in v3: > > > > - Added to the set > > > > > > > > Cc: Daniel Vetter > > > > Cc: Ville Syrjälä > > > > Signed-off-by: Sean Paul > > > > --- > > > > drivers/gpu/drm/drm_atomic_helper.c | 48 + > > > > include/drm/drm_atomic_helper.h | 6 > > > > 2 files changed, 54 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > > > > b/drivers/gpu/drm/drm_atomic_helper.c > > > > index 71cc7d6b0644..1f81ca8daad7 100644 > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c > > > > @@ -3591,3 +3591,51 @@ int drm_atomic_helper_legacy_gamma_set(struct > > > > drm_crtc *crtc, > > > > return ret; > > > > } > > > > EXPORT_SYMBOL(drm_atomic_helper_legacy_gamma_set); > > > > + > > > > +/** > > > > + * drm_atomic_crtc_state_for_encoder - Get crtc and new/old state for > > > > an encoder > > > > + * @state: Atomic state > > > > + * @encoder: The encoder to fetch the crtc information for > > > > + * @crtc: If not NULL, receives the currently connected crtc > > > > + * @old_crtc_state: If not NULL, receives the crtc's old state > > > > + * @new_crtc_state: If not NULL, receives the crtc's new state > > > > + * > > > > + * This function finds the crtc which is currently connected to > > > > @encoder and > > > > + * returns it as well as its old and new state. If there is no crtc > > > > currently > > > > + * connected, the function will clear @crtc, @old_crtc_state, > > > > @new_crtc_state. > > > > + * > > > > + * All of @crtc, @old_crtc_state, and @new_crtc_state are optional. 
> > > > + */ > > > > +void drm_atomic_crtc_state_for_encoder(struct drm_atomic_state *state, > > > > + struct drm_encoder *encoder, > > > > + struct drm_crtc **crtc, > > > > + struct drm_crtc_state > > > > **old_crtc_state, > > > > + struct drm_crtc_state > > > > **new_crtc_state) > > > > +{ > > > > + struct drm_crtc *tmp_crtc; > > > > + struct drm_crtc_state *tmp_new_crtc_state, *tmp_old_crtc_state; > > > > + u32 enc_mask = drm_encoder_mask(encoder); > > > > + int i; > > > > + > > > > + for_each_oldnew_crtc_in_state(state, tmp_crtc, tmp_old_crtc_state, > > > > + tmp_new_crtc_state, i) { > > > > > > So there's two ways to do this: > > > > > > - Using encoder_mask, which is a helper thing. In that case I'd rename > > > this to drm_atomic_helper_crtc_for_encoder. > > > > > > - By looping over the connectors, and looking at ->best_encoder and > > > ->crtc, see drm_encoder_get_crtc in drm_encoder.c. That's the core way > > > of doing things. In that case call it drm_atomic_crtc_for_encoder, and > > > put it into drm_atomic.c. > > > > > > There's two ways of doing the 2nd one: looping over connectors in a > > > drm_atomic_state, or the connector list overall. First requires that the > > > encoder is already in drm_atomic_state (which I think makes sense). > > > > Yeah, I wasn't particularly interested in encoders not in state. I had > > considered going the connector route, but since you can have multiple > > connectors > > per encoder, going through crtc seemed a bit more direct. > > You can have multiple possible connectors for a given encoder, and > multiple possible encoders for a given connector. In both cases the > driver picks for you. But for active encoders and connectors the > relationship is 1:1. That's what the helpers exploit by looping over > connectors to get
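The core of the helper discussed above is a bitmask match: `drm_encoder_mask(encoder)` is tested against each crtc state's `encoder_mask`, and because an active encoder drives exactly one crtc, the first hit is the answer. A self-contained sketch of that lookup logic, using toy stand-ins for the drm structs rather than the real kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the drm objects (not the kernel structs): each
 * encoder has an index, and a crtc's new state records the mask of
 * encoders currently driving it. */
struct enc { unsigned int index; };
struct crtc_state { uint32_t encoder_mask; };
struct crtc { struct crtc_state *new_state; };

/* Mirrors drm_encoder_mask(): one bit per encoder index. */
static uint32_t enc_mask(const struct enc *e)
{
    return 1u << e->index;
}

/* Find the single crtc whose new state lists @e in its encoder_mask --
 * the same match the helper's for_each_oldnew_crtc_in_state loop makes.
 * Returns NULL when the encoder is not currently connected. */
static struct crtc *crtc_for_encoder(struct crtc *crtcs, size_t n,
                                     const struct enc *e)
{
    uint32_t mask = enc_mask(e);

    for (size_t i = 0; i < n; i++)
        if (crtcs[i].new_state->encoder_mask & mask)
            return &crtcs[i];
    return NULL;
}
```

Since the crtc:encoder relationship for *active* links is 1:1, the loop can return on the first match; the review thread's open question is only whether this belongs in the helpers (encoder_mask) or the core (walking connectors).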
[Bug 108893] Slow redrawing of menu in Gothic 2 under wine
https://bugs.freedesktop.org/show_bug.cgi?id=108893 --- Comment #15 from supercoolem...@seznam.cz --- (In reply to andrew.m.mcmahon from comment #14) > Or I can use softpipe which is even worse plus I get the laggy main menu: > https://imgur.com/a/JwCnegN Well, menu with softpipe is waay slower than RX 580 (which is slower than llvmpipe - ~5 seconds per menu move vs. what feels instant; ~30 seconds with softpipe). > I'm sure you've already enabled 32 bit support in your distro: Yes, and other 32-bit games work ok (except Sacred; and Fallout new Vegas is not totally ok, but close enough, probably due to crappy engine). > And installed all x64 and x86 Mesa related packages and firmware i.e: Indeed. It should be good, because other 32-bit games work well (except Sacred). > Maybe you've exported some funny settings that you might have forgotten Checked and my /etc/environment contains just distro-provided comments (nothing else). I also don't have any relevant environment variables set (checked with "set" command). I don't set environment variables globally unless it's absolutely necessary, instead, I write scripts that set envvar using "env" and that's it.
[Bug 110371] HP Dreamcolor display *Error* No EDID read
https://bugs.freedesktop.org/show_bug.cgi?id=110371 --- Comment #9 from babblebo...@gmail.com --- I've found the exact commit! https://lists.freedesktop.org/archives/amd-gfx/2018-July/023920.html Fixes the issue against a few affected kernels, but my problem is that the code base has since been modified so heavily, while retaining the same behavior, that I can't apply this to kernel 5.2 linux-stable git. I can't even discern where to manually edit the related files to change the behavior. It may be necessary to include another fix from that list of related patches to fix the behavior for my connector/panel. Not a programmer myself so I'm not sure what's supposed to happen here.
[Bug 110371] HP Dreamcolor display *Error* No EDID read
https://bugs.freedesktop.org/show_bug.cgi?id=110371 --- Comment #8 from babblebo...@gmail.com --- I've been going down the rabbithole looking for the commit that soured my display. https://cgit.freedesktop.org/~airlied/linux/commit/?id=5c0e0b45c4936295d6333dd7961d0b89b15b070d Or This branch https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-4.17 Is the closest I can get so far, if I go one back, the kernel version works with my display. The commit directly before that https://cgit.freedesktop.org/~airlied/linux/commit/?id=44ef02c241e7c99af77b408d52af708aa159e968 Works just fine. Is there anything inside this merge that would cause this? I don't know much of what I'm looking at or looking for in these commits but I'll continue dissecting.
Re: [RFC PATCH 0/5] cgroup support for GPU devices
On Sun, May 05, 2019 at 12:34:16PM -0400, Kenny Ho wrote: > (sent again. Not sure why my previous email was just a reply instead > of reply-all.) > > On Sun, May 5, 2019 at 12:05 PM Leon Romanovsky wrote: > > We are talking about two different access patterns for this device > > memory (DM). One is to use this device memory (DM) and second to > > configure/limit. > > Usually those actions will be performed by different groups. > > > > First group (programmers) is using special API [1] through libibverbs [2] > > without any notion of cgroups or any limitations. Second group (sysadmins) > > is less interested in application specifics and for them "device memory" > > means > > "memory" and not "rdma, nic specific, internal memory". > Um... I am not sure that answered it, especially in the context of > cgroup (this is just for my curiosity btw, I don't know much about > rdma.) You said sysadmins are less interested in application > specifics but then how would they make the judgement call on how much > "device memory" is provisioned to one application/container over > another (let say you have 5 cgroup sharing an rdma device)? What are > the consequences of under provisioning "device memory" to an > application? And if they are all just memory, can a sysadmin > provision more system memory in place of device memory (like, are they > interchangeable)? I guess I am confused because if device memory is > just memory (not rdma, nic specific) to sysadmins how would they know > to set the right amount? One of the immediate usages of this DM that come to my mind is very fast spinlocks for MPI applications. In such case, the amount of DM will be property of network topology in given MPI cluster. In this scenario, precise amount of memory will ensure that all jobs will continue to give maximal performance despite any programmer's error in DM allocation. 
For under provisioning scenario and if application is written correctly, users will experience more latency and less performance, due to the PCI accesses. Slide 3 in Liran's presentation gives brief overview about motivation. Thanks > > Regards, > Kenny > > > [1] ibv_alloc_dm() > > http://man7.org/linux/man-pages/man3/ibv_alloc_dm.3.html > > https://www.openfabrics.org/images/2018workshop/presentations/304_LLiss_OnDeviceMemory.pdf > > [2] https://github.com/linux-rdma/rdma-core/blob/master/libibverbs/ > > > > > > > > I think we need to be careful about drawing the line between > > > duplication and over couplings between subsystems. I have other > > > thoughts and concerns and I will try to organize them into a response > > > in the next few days. > > > > > > Regards, > > > Kenny > > > > > > > > > > > > > > > > Is AMD interested in collaborating to help shape this framework? > > > > > It is intended to be device-neutral, so could be leveraged by various > > > > > types of devices. > > > > > If you have an alternative solution well underway, then maybe > > > > > we can work together to merge our efforts into one. > > > > > In the end, the DRM community is best served with common solution. > > > > > > > > > > > > > > > > > > > > > >>> and with future work, we could extend to: > > > > > >>> * track and control share of GPU time (reuse of cpu/cpuacct) > > > > > >>> * apply mask of allowed execution engines (reuse of cpusets) > > > > > >>> > > > > > >>> Instead of introducing a new cgroup subsystem for GPU devices, a > > > > > >>> new > > > > > >>> framework is proposed to allow devices to register with existing > > > > > >>> cgroup > > > > > >>> controllers, which creates per-device cgroup_subsys_state within > > > > > >>> the > > > > > >>> cgroup. 
This gives device drivers their own private cgroup > > > > > >>> controls > > > > > >>> (such as memory limits or other parameters) to be applied to > > > > > >>> device > > > > > >>> resources instead of host system resources. > > > > > >>> Device drivers (GPU or other) are then able to reuse the existing > > > > > >>> cgroup > > > > > >>> controls, instead of inventing similar ones. > > > > > >>> > > > > > >>> Per-device controls would be exposed in cgroup filesystem as: > > > > > >>> > > > > > >>> mount//.devices// > > > > > >>> such as (for example): > > > > > >>> mount//memory.devices//memory.max > > > > > >>> mount//memory.devices//memory.current > > > > > >>> mount//cpu.devices//cpu.stat > > > > > >>> mount//cpu.devices//cpu.weight > > > > > >>> > > > > > >>> The drm/i915 patch in this series is based on top of other RFC > > > > > >>> work [1] > > > > > >>> for i915 device memory support. > > > > > >>> > > > > > >>> AMD [2] and Intel [3] have proposed related work in this area > > > > > >>> within the > > > > > >>> last few years, listed below as reference. This new RFC reuses > > > > > >>> existing > > > > > >>> cgroup controllers and takes a different approach than prior work. > > > > > >>> > > > > > >>> Finally, some potential discussion points for this series: > > > > >
[Bug 110614] [Regression] Freeze at desktop manager startup
https://bugs.freedesktop.org/show_bug.cgi?id=110614 --- Comment #1 from raffa...@zoho.com --- Created attachment 144166 --> https://bugs.freedesktop.org/attachment.cgi?id=144166&action=edit backtrace from journal Debugoptimized build, trace from journalctl.
[Bug 201847] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 000000000a721000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] on channel 4 [00ff85c000 X[3819]]
https://bugzilla.kernel.org/show_bug.cgi?id=201847 --- Comment #1 from Marc B. (kernel@marc.ngoe.de) --- It would be so cool if anyone would actually read this bug report and maybe try to fix it. I will assist in testing patches until this is resolved. And: I am willing to offer $100 for fixing this annoying bug! Keeps freezing my 4.19.39 kernel out of nowhere. Some things I would like to get into discussion: a) it might have something to do with memory pressure _and_ b) high CPU load _or_ a high number of context switches. For the latter I'm not sure. The bug actually always occurs when I e.g. compile two kernels at -j24 and have some other work besides this, say a YT video. The bug is, however, definitely triggered by a graphics event, e.g. resizing/creating a window, scrolling a web page or watching a video. [2019-05-04 15:43:24] err kern 03 kernel : [ 523.906459] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT] [2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906467] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery [2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906473] nouveau 0000:01:00.0: fifo: channel 2: killed [2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906479] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery [2019-05-04 15:43:24] warning kern 04 kernel : [ 523.906789] nouveau 0000:01:00.0: X[8006]: channel 2 killed!
[2019-05-04 15:44:24] info kern 06 kernel : [ 584.121331] sysrq: SysRq : Keyboard mode set to system default
[Bug 110614] [Regression] Freeze at desktop manager startup
https://bugs.freedesktop.org/show_bug.cgi?id=110614 Bug ID: 110614 Summary: [Regression] Freeze at desktop manager startup Product: Mesa Version: unspecified Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Drivers/Gallium/radeonsi Assignee: dri-devel@lists.freedesktop.org Reporter: raffa...@zoho.com QA Contact: dri-devel@lists.freedesktop.org Freeze at desktop manager startup (tested with sddm and lightdm). Bisected as:

commit 1cec049d4db1c4dcd121bad17df4a77273dd9bb1
Author: Julien Isorce
Date: Tue Apr 23 14:28:48 2019 -0700

    radeonsi: implement resource_get_info

    Re-use existing si_texture_get_offset.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
    Signed-off-by: Julien Isorce
    Reviewed-by: Marek Olšák

commit a3c202de0a963c0562796cf75e3a9b3eedf1afad
Author: Julien Isorce
Date: Tue Apr 23 14:26:33 2019 -0700

    gallium: add resource_get_info to pipe_screen

    Generic plumbing.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
    Signed-off-by: Julien Isorce
    Reviewed-by: Marek Olšák

GPU: RX 480 OS: OpenSuse Tumbleweed
Re: [RFC PATCH 0/5] cgroup support for GPU devices
On Sun, May 5, 2019 at 3:14 AM Leon Romanovsky wrote: > > > Doesn't RDMA already has a separate cgroup? Why not implement it there? > > > > > > > Hi Kenny, I can't answer for Leon, but I'm hopeful he agrees with rationale > > I gave in the cover letter. Namely, to implement in rdma controller, would > > mean duplicating existing memcg controls there. > > Exactly, I didn't feel comfortable to add notion of "device memory" > to RDMA cgroup and postponed that decision to later point of time. > RDMA operates with verbs objects and all our user space API is based around > that concept. At the end, system administrator will have hard time to > understand the differences between memcg and RDMA memory. Interesting. I actually don't understand this part (I worked in devops/sysadmin side of things but never with rdma.) Don't applications that use rdma require some awareness of rdma (I mean, you mentioned verbs and objects... or do they just use regular malloc for buffer allocation and then send it through some function?) As a user, I would have this question: why do I need to configure some part of rdma resources under rdma cgroup while other part of rdma resources in a different, seemingly unrelated cgroups. I think we need to be careful about drawing the line between duplication and over couplings between subsystems. I have other thoughts and concerns and I will try to organize them into a response in the next few days. Regards, Kenny > > > > Is AMD interested in collaborating to help shape this framework? > > It is intended to be device-neutral, so could be leveraged by various > > types of devices. > > If you have an alternative solution well underway, then maybe > > we can work together to merge our efforts into one. > > In the end, the DRM community is best served with common solution. 
[Bug 109345] drm-next-2018-12-14 -Linux PPC
https://bugs.freedesktop.org/show_bug.cgi?id=109345

--- Comment #31 from Christian Zigotzky ---

Hi All,

Allan has tested the fifth test kernel. He wrote: "Christian DRM5 boots to Firepro! ace"

This step has been marked as good.

Next step: git bisect good

Output:

Bisecting: 22 revisions left to test after this (roughly 5 steps)
[48197bc564c7a1888c86024a1ba4f956e0ec2300] drm: add syncobj timeline support v9

make CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc oldconfig
make CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc uImage

Download: http://www.xenosoft.de/uImage-drm6

@Allan (acefnq/ace) Please test it.

Thanks,
Christian
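As an aside, the "roughly 5 steps" figure git prints here is just the binary-search depth of the remaining revision range. A small sketch (illustration only, not part of the bug thread) reproduces the estimates quoted in this bisect session:

```python
import math

def bisect_steps_remaining(revisions_left: int) -> int:
    """Estimate how many more 'git bisect' rounds are needed: each
    test roughly halves the candidate range, so the step count is the
    binary-search depth of the remaining revisions."""
    return math.ceil(math.log2(revisions_left + 1))

# Figures from this bisect session:
print(bisect_steps_remaining(22))  # 22 revisions left -> roughly 5 steps
print(bisect_steps_remaining(45))  # 45 revisions left -> roughly 6 steps
```

This matches both estimates in the thread: 22 revisions give roughly 5 steps, and 45 revisions (in the previous round) give roughly 6.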
[PATCH v2 0/2] add dsi pwm0 node for mt8183
Changes since v1:
 - remove "mediatek,mt8173-dsi" from dsi node.

This patch is based on v5.1-rc1 and these patches:
http://lists.infradead.org/pipermail/linux-mediatek/2019-March/017963.html
https://patchwork.kernel.org/patch/10856987/
https://patchwork.kernel.org/cover/10879001/
https://patchwork.kernel.org/cover/10846677/
https://patchwork.kernel.org/patch/10893519/

Jitao Shi (2):
  arm64: dts: mt8183: add dsi node
  arm64: dts: mt8183: add pwm0 node

 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 35
 1 file changed, 35 insertions(+)

--
2.21.0
[PATCH 2/2] arm64: dts: mt8183: add pwm0 node
Add pwm0 node to the mt8183.

Signed-off-by: Jitao Shi
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 84f465fa4fac..b0dda57a7e23 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -323,6 +323,17 @@
 		status = "disabled";
 	};

+	pwm0: pwm@1100e000 {
+		compatible = "mediatek,mt8183-disp-pwm";
+		reg = <0 0x1100e000 0 0x1000>;
+		interrupts = ;
+		power-domains = < MT8183_POWER_DOMAIN_DISP>;
+		#pwm-cells = <2>;
+		clocks = < CLK_TOP_MUX_DISP_PWM>,
+			 < CLK_INFRA_DISP_PWM>;
+		clock-names = "main", "mm";
+	};
+
 	audiosys: syscon@1122 {
 		compatible = "mediatek,mt8183-audiosys", "syscon";
 		reg = <0 0x1122 0 0x1000>;
--
2.21.0
[PATCH 1/2] arm64: dts: mt8183: add dsi node
Add dsi and mipitx nodes to the mt8183.

Signed-off-by: Jitao Shi
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 24
 1 file changed, 24 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index b36e37fcdfe3..84f465fa4fac 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -353,6 +353,16 @@
 		status = "disabled";
 	};

+	mipi_tx0: mipi-dphy@11e5 {
+		compatible = "mediatek,mt8183-mipi-tx";
+		reg = <0 0x11e5 0 0x1000>;
+		clocks = < CLK_APMIXED_MIPID0_26M>;
+		clock-names = "ref_clk";
+		#clock-cells = <0>;
+		#phy-cells = <0>;
+		clock-output-names = "mipi_tx0_pll";
+	};
+
 	mfgcfg: syscon@1300 {
 		compatible = "mediatek,mt8183-mfgcfg", "syscon";
 		reg = <0 0x1300 0 0x1000>;
@@ -365,6 +375,20 @@
 		#clock-cells = <1>;
 	};

+	dsi0: dsi@14014000 {
+		compatible = "mediatek,mt8183-dsi";
+		reg = <0 0x14014000 0 0x1000>;
+		interrupts = ;
+		power-domains = < MT8183_POWER_DOMAIN_DISP>;
+		mediatek,syscon-dsi = < 0x140>;
+		clocks = < CLK_MM_DSI0_MM>,
+			 < CLK_MM_DSI0_IF>,
+			 <&mipi_tx0>;
+		clock-names = "engine", "digital", "hs";
+		phys = <&mipi_tx0>;
+		phy-names = "dphy";
+	};
+
 	smi_common: smi@14019000 {
 		compatible = "mediatek,mt8183-smi-common", "syscon";
 		reg = <0 0x14019000 0 0x1000>;
--
2.21.0
[Bug 109345] drm-next-2018-12-14 -Linux PPC
https://bugs.freedesktop.org/show_bug.cgi?id=109345

--- Comment #30 from Christian Zigotzky ---

Hi All,

Allan has tested the fourth test kernel. He wrote: "Christian DRM4 boots to SI card. Cheers ace"

This step has been marked as bad because the fourth test kernel doesn't boot to the FirePro.

Next step: git bisect bad

Output:

Bisecting: 45 revisions left to test after this (roughly 6 steps)
[a2b50babc74394c99638a37a3d48adeb03ddc248] drm/sti: Use drm_atomic_helper_shutdown

make CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc oldconfig
make CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc uImage

Download: http://www.xenosoft.de/uImage-drm5

@Allan (acefnq/ace) Please test it.

Thanks,
Christian
Re: [PATCH v4 1/1] drm/fb-helper: Avoid race with DRM userspace
Den 04.05.2019 14.34, skrev Noralf Trønnes:
>
> Den 25.04.2019 10.31, skrev Noralf Trønnes:
>> drm_fb_helper_is_bound() is used to check if DRM userspace is in control.
>> This is done by looking at the fb on the primary plane. By the time
>> fb-helper gets around to committing, it's possible that the facts have
>> changed.
>>
>> Avoid this race by holding the drm_device->master_mutex lock while
>> committing. When DRM userspace does its first open, it will now wait
>> until fb-helper is done. The helper will stay away if there's a master.
>>
>> Locking rule: Always take the fb-helper lock first.
>>
>> v2:
>> - Remove drm_fb_helper_is_bound() (Daniel Vetter)
>> - No need to check fb_helper->dev->master in
>>   drm_fb_helper_single_fb_probe(), restore_fbdev_mode() has the check.
>>
>> Suggested-by: Daniel Vetter
>> Signed-off-by: Noralf Trønnes
>> Reviewed-by: Daniel Vetter
>> ---
>>  drivers/gpu/drm/drm_auth.c      | 20
>>  drivers/gpu/drm/drm_fb_helper.c | 90 -
>>  drivers/gpu/drm/drm_internal.h  |  2 +
>>  3 files changed, 67 insertions(+), 45 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
>> index 1669c42c40ed..db199807b7dc 100644
>> --- a/drivers/gpu/drm/drm_auth.c
>> +++ b/drivers/gpu/drm/drm_auth.c
>> @@ -368,3 +368,23 @@ void drm_master_put(struct drm_master **master)
>>  	*master = NULL;
>>  }
>>  EXPORT_SYMBOL(drm_master_put);
>> +
>> +/* Used by drm_client and drm_fb_helper */
>> +bool drm_master_internal_acquire(struct drm_device *dev)
>> +{
>> +	mutex_lock(&dev->master_mutex);
>> +	if (dev->master) {
>> +		mutex_unlock(&dev->master_mutex);
>> +		return false;
>> +	}
>> +
>> +	return true;
>> +}
>> +EXPORT_SYMBOL(drm_master_internal_acquire);
>> +
>> +/* Used by drm_client and drm_fb_helper */
>> +void drm_master_internal_release(struct drm_device *dev)
>> +{
>> +	mutex_unlock(&dev->master_mutex);
>> +}
>> +EXPORT_SYMBOL(drm_master_internal_release);
>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>> index 2339f0f8f5a8..578428461391 100644
>> --- a/drivers/gpu/drm/drm_fb_helper.c
>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>> @@ -44,6 +44,7 @@
>>
>>  #include "drm_crtc_internal.h"
>>  #include "drm_crtc_helper_internal.h"
>> +#include "drm_internal.h"
>>
>>  static bool drm_fbdev_emulation = true;
>>  module_param_named(fbdev_emulation, drm_fbdev_emulation, bool, 0600);
>> @@ -509,7 +510,7 @@ static int restore_fbdev_mode_legacy(struct drm_fb_helper *fb_helper)
>>  	return ret;
>>  }
>>
>> -static int restore_fbdev_mode(struct drm_fb_helper *fb_helper)
>> +static int restore_fbdev_mode_force(struct drm_fb_helper *fb_helper)
>>  {
>>  	struct drm_device *dev = fb_helper->dev;
>>
>> @@ -519,6 +520,21 @@ static int restore_fbdev_mode(struct drm_fb_helper *fb_helper)
>>  	return restore_fbdev_mode_legacy(fb_helper);
>>  }
>>
>> +static int restore_fbdev_mode(struct drm_fb_helper *fb_helper)
>> +{
>> +	struct drm_device *dev = fb_helper->dev;
>> +	int ret;
>> +
>> +	if (!drm_master_internal_acquire(dev))
>> +		return -EBUSY;
>> +
>> +	ret = restore_fbdev_mode_force(fb_helper);
>> +
>> +	drm_master_internal_release(dev);
>> +
>> +	return ret;
>> +}
>> +
>>  /**
>>   * drm_fb_helper_restore_fbdev_mode_unlocked - restore fbdev configuration
>>   * @fb_helper: driver-allocated fbdev helper, can be NULL
>
> The Intel CI doesn't like this patch. AFAICT the reason is that the
> igt@kms_fbcon_fbt@psr-suspend test expects fbcon to work while it has an
> open fd that is master. This doesn't match the new rule of bailing out
> if there's a master.
>
> Adding this debug output:
>
> @@ -558,6 +558,17 @@ int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper)
>  		return 0;
>
>  	mutex_lock(&fb_helper->lock);
> +	if (READ_ONCE(fb_helper->dev->master)) {
> +		int level = default_message_loglevel;
> +
> +		default_message_loglevel = LOGLEVEL_DEBUG;
> +		printk("\n");
> +		printk("\n");
> +		printk("%s\n", __func__);
> +		printk("THERE IS A MASTER\n");
> +		dump_stack();
> +		default_message_loglevel = level;
> +	}
>  	ret = restore_fbdev_mode_force(fb_helper);
>
>  	do_delayed = fb_helper->delayed_hotplug;
>
> Gives these log entries:
>
> <7> [1857.940072] drm_fb_helper_restore_fbdev_mode_unlocked
> <7> [1857.940074] THERE IS A MASTER
> <7> [1857.940079] CPU: 4 PID: 8209 Comm: kms_fbcon_fbt Tainted: G U 5.1.0-rc7-CI-Trybot_4252+ #1
> <7> [1857.940081] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3121.A00.1903190527 03/19/2019
> <7> [1857.940083] Call Trace:
> <7> [1857.940091] dump_stack+0x67/0x9b
> <7> [1857.940099] drm_fb_helper_restore_fbdev_mode_unlocked+0xda/0xf0
> <7> [1857.940104] drm_fb_helper_set_par+0x24/0x50
> <7> [1857.940188]
[Bug 110214] radeonsi: xterm scrollback buffer disappears while Shift+PgUp and Shift+PgDn
https://bugs.freedesktop.org/show_bug.cgi?id=110214

--- Comment #88 from Diego Viola ---

(In reply to komqinxit from comment #87)
> Another similar bug.
> xfce4-terminal leaves a large black area at the bottom when it renders
> 'dmesg' or 'cat /etc/passwd'.
>
> AMD Ryzen 3 2200G.
> Arch Linux.
> Mesa 19.0.3-1

I can reproduce this; it happens for me with xfce4-terminal and also
gnome-terminal and most other VTE-based terminals. I also noticed that
setting R600_DEBUG=nodpbb and MESA_EXTENSION_OVERRIDE="-GL_NV_texture_barrier"
does not seem to help.
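For anyone wanting to retry the overrides mentioned in this comment, they are plain environment variables set for the process being tested. A small sketch of launching a program with them (the terminal name is just an example; neither override resolved this particular bug):

```python
import os
import subprocess

def override_env(base=None):
    """Environment with the radeonsi workarounds tried in this comment:
    R600_DEBUG=nodpbb disables DPBB, and MESA_EXTENSION_OVERRIDE hides
    GL_NV_texture_barrier from the application."""
    env = dict(base if base is not None else os.environ)
    env["R600_DEBUG"] = "nodpbb"
    env["MESA_EXTENSION_OVERRIDE"] = "-GL_NV_texture_barrier"
    return env

# Example usage (program name is hypothetical):
# subprocess.run(["xfce4-terminal"], env=override_env())
print(override_env({})["R600_DEBUG"])  # nodpbb
```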
Re: [RFC PATCH 0/5] cgroup support for GPU devices
On Fri, May 03, 2019 at 02:14:33PM -0700, Welty, Brian wrote:
>
> On 5/2/2019 3:48 PM, Kenny Ho wrote:
> > On 5/2/2019 1:34 AM, Leon Romanovsky wrote:
> >> Count us (Mellanox) too, our RDMA devices are exposing special and
> >> limited in size device memory to the users and we would like to provide
> >> an option to use cgroup to control its exposure.
>
> Hi Leon, great to hear and happy to work with you and the RDMA community
> to shape this framework for use by RDMA devices as well. The intent
> was to support more than GPU devices.
>
> Incidentally, I also wanted to ask about the rdma cgroup controller
> and whether there is interest in updating the device registration
> implemented in that controller. It could use the cgroup_device_register()
> that is proposed here. But this is perhaps future work, so we can discuss
> it separately.

I'll try to take a look later this week.

> >
> > Doesn't RDMA already have a separate cgroup? Why not implement it there?
> >
>
> Hi Kenny, I can't answer for Leon, but I'm hopeful he agrees with the
> rationale I gave in the cover letter. Namely, implementing this in the
> rdma controller would mean duplicating existing memcg controls there.

Exactly. I didn't feel comfortable adding the notion of "device memory"
to the RDMA cgroup and postponed that decision to a later point in time.
RDMA operates with verbs objects and all of our user space API is based
around that concept. In the end, a system administrator would have a hard
time understanding the differences between memcg and RDMA memory.

> Is AMD interested in collaborating to help shape this framework?
> It is intended to be device-neutral, so it could be leveraged by various
> types of devices.
> If you have an alternative solution well underway, then maybe
> we can work together to merge our efforts into one.
> In the end, the DRM community is best served with a common solution.
> >>> and with future work, we could extend to:
> >>> * track and control share of GPU time (reuse of cpu/cpuacct)
> >>> * apply mask of allowed execution engines (reuse of cpusets)
> >>>
> >>> Instead of introducing a new cgroup subsystem for GPU devices, a new
> >>> framework is proposed to allow devices to register with existing cgroup
> >>> controllers, which creates per-device cgroup_subsys_state within the
> >>> cgroup. This gives device drivers their own private cgroup controls
> >>> (such as memory limits or other parameters) to be applied to device
> >>> resources instead of host system resources.
> >>> Device drivers (GPU or other) are then able to reuse the existing cgroup
> >>> controls, instead of inventing similar ones.
> >>>
> >>> Per-device controls would be exposed in the cgroup filesystem as:
> >>> mount//.devices//
> >>> such as (for example):
> >>> mount//memory.devices//memory.max
> >>> mount//memory.devices//memory.current
> >>> mount//cpu.devices//cpu.stat
> >>> mount//cpu.devices//cpu.weight
> >>>
> >>> The drm/i915 patch in this series is based on top of other RFC work [1]
> >>> for i915 device memory support.
> >>>
> >>> AMD [2] and Intel [3] have proposed related work in this area within the
> >>> last few years, listed below as reference. This new RFC reuses existing
> >>> cgroup controllers and takes a different approach than prior work.
> >>>
> >>> Finally, some potential discussion points for this series:
> >>> * merge proposed .devices into a single devices directory?
> >>> * allow devices to have multiple registrations for subsets of resources?
> >>> * document a 'common charging policy' for device drivers to follow?
> >>>
> >>> [1] https://patchwork.freedesktop.org/series/56683/
> >>> [2] https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
> >>> [3] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> >>>
> >>>
> >>> Brian Welty (5):
> >>>   cgroup: Add cgroup_subsys per-device registration framework
> >>>   cgroup: Change kernfs_node for directories to store cgroup_subsys_state
> >>>   memcg: Add per-device support to memory cgroup subsystem
> >>>   drm: Add memory cgroup registration and DRIVER_CGROUPS feature bit
> >>>   drm/i915: Use memory cgroup for enforcing device memory limit
> >>>
> >>>  drivers/gpu/drm/drm_drv.c                  |  12 +
> >>>  drivers/gpu/drm/drm_gem.c                  |   7 +
> >>>  drivers/gpu/drm/i915/i915_drv.c            |   2 +-
> >>>  drivers/gpu/drm/i915/intel_memory_region.c |  24 +-
> >>>  include/drm/drm_device.h                   |   3 +
> >>>  include/drm/drm_drv.h                      |   8 +
> >>>  include/drm/drm_gem.h                      |  11 +
> >>>  include/linux/cgroup-defs.h                |  28 ++
> >>>  include/linux/cgroup.h                     |   3 +
> >>>  include/linux/memcontrol.h                 |  10 +
> >>>  kernel/cgroup/cgroup-v1.c                  |  10 +-
> >>>  kernel/cgroup/cgroup.c                     | 310 ++---
> >>>  mm/memcontrol.c                            |
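The cover letter's per-device control layout can be sketched as a path-building helper. This is purely an illustration of the proposed filesystem layout; the mount point, cgroup name, and device name below (gpu-jobs, card0) are hypothetical examples, not anything defined by the RFC:

```python
from pathlib import PurePosixPath

def per_device_control(mount: str, cgroup: str, subsys: str,
                       device: str, control: str) -> PurePosixPath:
    """Build a per-device control-file path in the layout the RFC
    proposes: the per-device files for a controller live under a
    '<subsys>.devices/<device>/' directory inside each cgroup."""
    return PurePosixPath(mount, cgroup, f"{subsys}.devices", device, control)

# Hypothetical example: the memory.max limit for device card0
# inside a cgroup named gpu-jobs.
p = per_device_control("/sys/fs/cgroup", "gpu-jobs",
                       "memory", "card0", "memory.max")
print(p)  # /sys/fs/cgroup/gpu-jobs/memory.devices/card0/memory.max
```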