[PATCH] drm/amd/display: remove need of modeset flag for overlay planes (V2)
This patch continues "843e3c7 drm/amd/display: defer modeset check in
dm_update_planes_state", where we started to eliminate the dependency on
user space setting DRM_MODE_ATOMIC_ALLOW_MODESET, which is not mandatory.

After deferring the check, this patch eliminates the dependency on the flag
for overlay planes.

This has to be done in stages because it is pretty complex and requires
thorough testing before we also free primary planes from the dependency on
the modeset flag.

V2: Simplified the plane type check.

Signed-off-by: Shirish S
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1a63c04..045e5df 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 			}
 			spin_unlock_irqrestore(>dev->event_lock, flags);
 
-			if (!pflip_needed) {
+			if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {
 				WARN_ON(!dm_new_plane_state->dc_state);
 
 				plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;
@@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
 
 		/* Remove any changed/removed planes */
 		if (!enable) {
-			if (pflip_needed)
+			if (pflip_needed &&
+			    plane->type != DRM_PLANE_TYPE_OVERLAY)
 				continue;
 
 			if (!old_plane_crtc)
@@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
 			if (!dm_new_crtc_state->stream)
 				continue;
 
-			if (pflip_needed)
+			if (pflip_needed &&
+			    plane->type != DRM_PLANE_TYPE_OVERLAY)
 				continue;
 
 			WARN_ON(dm_new_plane_state->dc_state);
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
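The effect of the three changed conditions can be sketched as two standalone predicates. This is only an illustration under stated assumptions: the enum and the function names below are stand-ins invented here so the snippet compiles outside the kernel; only `pflip_needed`, `DRM_PLANE_TYPE_OVERLAY` and the boolean expressions themselves come from the diff.

```c
#include <stdbool.h>

/* Stand-in for the DRM plane types (hypothetical names, not the kernel enum). */
enum plane_type_sketch {
	PLANE_TYPE_OVERLAY_SKETCH,
	PLANE_TYPE_PRIMARY_SKETCH,
	PLANE_TYPE_CURSOR_SKETCH,
};

/*
 * Mirrors the condition in the amdgpu_dm_commit_planes() hunk:
 * an overlay plane is handled even when pflip_needed is set.
 */
bool commit_programs_plane(bool pflip_needed, enum plane_type_sketch type)
{
	return !pflip_needed || type == PLANE_TYPE_OVERLAY_SKETCH;
}

/*
 * Mirrors the condition in both dm_update_planes_state() hunks:
 * the "continue" is taken only for non-overlay planes.
 */
bool check_skips_plane(bool pflip_needed, enum plane_type_sketch type)
{
	return pflip_needed && type != PLANE_TYPE_OVERLAY_SKETCH;
}
```

The two predicates are logical complements of each other, so a plane that is no longer skipped in the check path is also the one that gets handled in the commit path.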
Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes
On 5/2/2018 12:53 AM, Stéphane Marchesin wrote:
> On Fri, Apr 27, 2018 at 3:27 AM Shirish S wrote:
>> This patch is in continuation to the
>> "843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state"
>> where we started to eliminate the dependency on
>> DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
>> which as such is not mandatory.
>>
>> After deferring, this patch eliminates the dependency on the flag
>> for overlay planes.
>>
>> This has to be done in stages as its a pretty complex and requires thorough
>> testing before we free primary planes as well from dependency on modeset
>> flag.
>>
>> Signed-off-by: Shirish S
>> ---
>>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index 1a63c04..87b661d 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>>  			}
>>  			spin_unlock_irqrestore(>dev->event_lock, flags);
>>
>> -			if (!pflip_needed) {
>> +			if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {
>>  				WARN_ON(!dm_new_plane_state->dc_state);
>>
>>  				plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;
>> @@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
>>
>>  		/* Remove any changed/removed planes */
>>  		if (!enable) {
>> -			if (pflip_needed)
>> +			if (pflip_needed &&
>> +			    plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
>
> nit: I don't think we need to check that plane is non-NULL

Agree, was a bit over cautious.
Have removed it in V2.
Thanks.

Regards,
Shirish S

> Stéphane
>
>>  				continue;
>>
>>  			if (!old_plane_crtc)
>> @@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
>>  			if (!dm_new_crtc_state->stream)
>>  				continue;
>>
>> -			if (pflip_needed)
>> +			if (pflip_needed &&
>> +			    plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
>>  				continue;
>>
>>  			WARN_ON(dm_new_plane_state->dc_state);
>> --
>> 2.7.4
Re: Tracking: radeon 0000:00:10.0: ring 0 stalled for more than 10240msec
Hi,

If you are sure that the HW worked fine before, I think you should:

1. Make sure that the HW still works fine now.
2. Roll the driver back to the point where it worked well, and then replace the components one by one to confirm which component causes the issue.
3. Try updating to the latest VBIOS to suit the new driver.

Thanks
JimQu

From: amd-gfx on behalf of Christian König
Sent: April 30, 2018 1:16:14
To: Mathieu Malaterre; Deucher, Alexander
Cc: David Airlie; Zhou, David(ChunMing); dri-devel; amd-gfx@lists.freedesktop.org; LKML
Subject: Re: Tracking: radeon 0000:00:10.0: ring 0 stalled for more than 10240msec

On 23.04.2018 at 20:50, Mathieu Malaterre wrote:
> Hi there,
>
> I am pretty sure I was able to run kodi on an old Mac Mini G4 (big
> endian) with AMD RV280. Today it is failing to start with:

Well, that is rather old hardware. I suggest to make sure first that the
hw isn't broken in some way.

> How should I go and debug this (other than plain git-bisect) ?

You first need to figure out what's the failing component. Either Mesa,
DDX or the Kernel are possible candidates.

Another possibility is that you updated kodi and kodi is now doing
something the hw doesn't like.

Regards,
Christian.
Re: vcn regression on raven1
Hi Tom,

Ha, got your meaning.
Please check it with the latest drm-next from gerrit tomorrow.

Jerry

On 05/02/2018 09:41 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> Like I said it's (now well) past EOD (meaning my workstation is powered off)
> so I'll have to check tomorrow. But I do pull from gerrit daily and build
> from that.
>
> I'll take a look in the morning.
>
> Cheers,
> Tom
Re: vcn regression on raven1
Hi Jerry,

Like I said it's (now well) past EOD (meaning my workstation is powered off) so I'll have to check tomorrow. But I do pull from gerrit daily and build from that.

I'll take a look in the morning.

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:39
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

Do you mean you cannot find the patch from gerrit/amd-staging-dkms-next either?
I do find it.

the tip of gerrit/amd-staging-drm-next is
* bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait to recover from ring hang.

while the tip of freedesktop is
* a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for VEGAM

Jerry
Re: vcn regression on raven1
Hi Tom,

Do you mean you cannot find the patch from gerrit/amd-staging-dkms-next either?
I do find it.

the tip of gerrit/amd-staging-drm-next is
* bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait to recover from ring hang.

while the tip of freedesktop is
* a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for VEGAM

Jerry

On 05/02/2018 09:29 AM, StDenis, Tom wrote:
> I pull from gerrit. I'm just pointing out that it's not on drm-next upstream
> either.
>
> It may have been missed in a rebase or something.
>
> Tom
Re: vcn regression on raven1
I pull from gerrit. I'm just pointing out that it's not on drm-next upstream either.

It may have been missed in a rebase or something.

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:07
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

Sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately freedesktop seems to lag in syncing the code from internal drm-next.
That gap is what showed up as the issue in the test.

Hi Alex,

Is that an issue with code syncing between freedesktop and internal drm-next?
Or is the sync delay a known issue?

Jerry
Re: vcn regression on raven1
Hi Tom,

Sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately freedesktop seems to lag in syncing the code from internal drm-next.
That gap is what showed up as the issue in the test.

Hi Alex,

Is that an issue with code syncing between freedesktop and internal drm-next?
Or is the sync delay a known issue?

Jerry

On 05/02/2018 08:57 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> It's well past EOD for me; I'll pick this up in the morning.
>
> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next
> as of my pull this morning though.
>
> If it's in there and I missed it somehow I apologize; otherwise it'd be nice
> to make sure it's in there.
>
> Based on the public copy of the tree it's not there
>
> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>
> Cheers,
> Tom
Re: vcn regression on raven1
Hi Jerry,

It's well past EOD for me; I'll pick this up in the morning.

I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as of my pull this morning though.

If it's in there and I missed it somehow I apologize; otherwise it'd be nice to make sure it's in there.

Based on the public copy of the tree it's not there

https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:52
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It landed in the latest drm-next, as
* 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included?
If not, please try the latest drm-next.
From the log, they look like the same issue.

Jerry
Re: vcn regression on raven1
Hi Tom,

It landed in the latest drm-next as:

* 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included? If not, please try the latest drm-next.
From the log, they look like the same issue.

Jerry

On 05/02/2018 08:47 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> So far as I know this wasn't included on the tip of drm-next. I hit this
> this morning in my semi-regular pull/build/test cycle.
>
> Was this missed in a recent rebase?
>
> Tom
Re: vcn regression on raven1
Hi Jerry,

So far as I know this wasn't included on the tip of drm-next. I hit this
this morning in my semi-regular pull/build/test cycle.

Was this missed in a recent rebase?

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:43
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

On 05/01/2018 09:34 PM, Tom St Denis wrote:
> Hi all,
>
> I've noticed that on the tip of drm-next vcn playback of video is broken (see
> dmesg below). I've bisected it to this commit

It may be fixed here as a common issue.
* https://patchwork.freedesktop.org/patch/218909/

Jerry
Re: vcn regression on raven1
On 05/01/2018 09:34 PM, Tom St Denis wrote:
> Hi all,
>
> I've noticed that on the tip of drm-next vcn playback of video is broken (see
> dmesg below). I've bisected it to this commit

It may be fixed here as a common issue.
* https://patchwork.freedesktop.org/patch/218909/

Jerry

> [root@raven linux]# git bisect good
> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
> commit 701372349fd55b5396b335580e979ac4dde3dd02
> Author: Alex Deucher
> Date:   Tue Mar 27 17:10:56 2018 -0500
>
>     drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
>
>     Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
>     it provides a write and wait in a single packet which avoids a missed
>     ack if a world switch happens between the request and waiting for the
>     ack.
>
>     Reviewed-by: Huang Rui
>     Reviewed-by: Christian König
>     Signed-off-by: Alex Deucher
>
> :04 04 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers
>
> Which is odd because the commit before this is the vcn change and it works
> fine (playing BBB right now).
>
> Here's the dmesg:
>
> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
> [ 2925.640113] IP: (null)
> [ 2925.640116] PGD 0 P4D 0
> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash gpu_sched ttm ax88179_178a usbnet
> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 3803 01/22/2018
> [ 2925.640146] RIP: 0010: (null)
> [ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
> [ 2925.640153] RAX: RBX: 8801d8b38420 RCX: 007c0080
> [ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 8801d8b38420
> [ 2925.640159] RBP: 0001a6fa R08: 0080 R09: ed003aa9eef9
> [ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 8801d8b3277c
> [ 2925.640164] R13: 8801d8b3001c R14: 0005 R15:
> [ 2925.640168] FS: () GS:8801dcf0() knlGS:
> [ 2925.640171] CS: 0010 DS: ES: CR0: 80050033
> [ 2925.640174] CR2: CR3: 0001d9712000 CR4: 003406e0
> [ 2925.640176] Call Trace:
> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
> [ 2925.641328] ? save_stack+0x89/0xb0
> [ 2925.641332] ? wait_woken+0x110/0x110
> [ 2925.641337] ? ret_from_fork+0x22/0x40
> [ 2925.641343] ? __schedule+0xd30/0xd30
> [ 2925.641346] ? remove_wait_queue+0x150/0x150
> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
> [ 2925.641359] ? __lock_text_start+0x8/0x8
> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
> [ 2925.641371] ? kthread+0x19b/0x1c0
> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
> [ 2925.641382] ? ret_from_fork+0x22/0x40
> [ 2925.641387] Code: Bad RIP value.
> [ 2925.641397] RIP: (null) RSP: 8801d54f7790
> [ 2925.641400] CR2:
> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>
> Note that regular compute/gfx workflows work fine on the tip of drm-next;
> only vcn playback triggers this (haven't tried encode yet...).
>
> Cheers,
> Tom
[PATCH 04/12] drm/amdkfd: use %px to print user space address instead of %p
From: Philip Yang

Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c   | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1a4d8dc..4ced5e9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -233,7 +233,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 	pr_debug("Queue Size: 0x%llX, %u\n",
 			q_properties->queue_size, args->ring_size);
 
-	pr_debug("Queue r/w Pointers: %p, %p\n",
+	pr_debug("Queue r/w Pointers: %px, %px\n",
 			q_properties->read_ptr,
 			q_properties->write_ptr);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index a5315d4..6dcd621 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -36,8 +36,8 @@ void print_queue_properties(struct queue_properties *q)
 	pr_debug("Queue Address: 0x%llX\n", q->queue_address);
 	pr_debug("Queue Id: %u\n", q->queue_id);
 	pr_debug("Queue Process Vmid: %u\n", q->vmid);
-	pr_debug("Queue Read Pointer: 0x%p\n", q->read_ptr);
-	pr_debug("Queue Write Pointer: 0x%p\n", q->write_ptr);
+	pr_debug("Queue Read Pointer: 0x%px\n", q->read_ptr);
+	pr_debug("Queue Write Pointer: 0x%px\n", q->write_ptr);
 	pr_debug("Queue Doorbell Pointer: 0x%p\n", q->doorbell_ptr);
 	pr_debug("Queue Doorbell Offset: %u\n", q->doorbell_off);
 }
@@ -53,8 +53,8 @@ void print_queue(struct queue *q)
 	pr_debug("Queue Address: 0x%llX\n", q->properties.queue_address);
 	pr_debug("Queue Id: %u\n", q->properties.queue_id);
 	pr_debug("Queue Process Vmid: %u\n", q->properties.vmid);
-	pr_debug("Queue Read Pointer: 0x%p\n", q->properties.read_ptr);
-	pr_debug("Queue Write Pointer: 0x%p\n", q->properties.write_ptr);
+	pr_debug("Queue Read Pointer: 0x%px\n", q->properties.read_ptr);
+	pr_debug("Queue Write Pointer: 0x%px\n", q->properties.write_ptr);
 	pr_debug("Queue Doorbell Pointer: 0x%p\n", q->properties.doorbell_ptr);
 	pr_debug("Queue Doorbell Offset: %u\n", q->properties.doorbell_off);
 	pr_debug("Queue MQD Address: 0x%p\n", q->mqd);
-- 
2.7.4
[PATCH 08/12] drm/amdkfd: Fix signal handling performance again
It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements the difference is
estimated to be about a factor 64. That means using idr_for_each_entry
is only worth it with very few allocated events.

Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index bccf2f7..7862fcf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -496,7 +496,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 		pr_debug_ratelimited("Partial ID invalid: %u (%u valid bits)\n",
 				     partial_id, valid_id_bits);
 
-		if (p->signal_event_count < KFD_SIGNAL_EVENT_LIMIT/2) {
+		if (p->signal_event_count < KFD_SIGNAL_EVENT_LIMIT/64) {
 			/* With relatively few events, it's faster to
 			 * iterate over the event IDR
 			 */
-- 
2.7.4
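The reasoning behind the new threshold can be sketched as a small user-space cost model. This is only an illustration, not driver code: the limit of 4096 is an assumed value, and IDR_COST_FACTOR, idr_walk_cost(), slot_scan_cost() and use_idr_walk() are hypothetical names. The factor of 64 is the estimate quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration; the real limit is defined by the driver. */
#define SIGNAL_EVENT_LIMIT 4096u
/* Estimate from the commit message: one IDR step costs ~64 slot checks. */
#define IDR_COST_FACTOR 64u

/* Rough cost of visiting every allocated event through the IDR. */
static unsigned int idr_walk_cost(unsigned int allocated_events)
{
	assert(allocated_events <= SIGNAL_EVENT_LIMIT);
	return allocated_events * IDR_COST_FACTOR;
}

/* Rough cost of scanning every signal slot directly. */
static unsigned int slot_scan_cost(void)
{
	return SIGNAL_EVENT_LIMIT;
}

/* Same shape as the patched condition: count < LIMIT / 64. */
static bool use_idr_walk(unsigned int allocated_events)
{
	return allocated_events < SIGNAL_EVENT_LIMIT / IDR_COST_FACTOR;
}
```

Under this model the IDR walk is chosen only while its estimated cost stays below a full slot scan, which is the break-even point the LIMIT/64 condition encodes; the old LIMIT/2 condition would have picked the IDR walk far past that point.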
[PATCH 05/12] drm/amdkfd: Remove redundant include of amd-iommu.h
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index fb4a72d..17de4ac 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -20,9 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-#include
-#endif
 #include
 #include
 #include
-- 
2.7.4
[PATCH 11/12] drm/amdkfd: Remove queue node when destroy queue failed
From: Shaoyun Liu

HWS may hang in the middle of destroying a queue. In that case, remove
the queue from the process queue list so it won't be freed again later.

Signed-off-by: Shaoyun Liu
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 3045aeb..d65ce04 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -241,7 +241,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 	}
 
 	if (retval != 0) {
-		pr_err("DQM create queue failed\n");
+		pr_err("Pasid %d DQM create queue %d failed. ret %d\n",
+			pqm->process->pasid, type, retval);
 		goto err_create_queue;
 	}
 
@@ -319,8 +320,11 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 		dqm = pqn->q->device->dqm;
 		retval = dqm->ops.destroy_queue(dqm, >qpd, pqn->q);
 		if (retval) {
-			pr_debug("Destroy queue failed, returned %d\n", retval);
-			goto err_destroy_queue;
+			pr_err("Pasid %d destroy queue %d failed, ret %d\n",
+				pqm->process->pasid,
+				pqn->q->properties.queue_id, retval);
+			if (retval != -ETIME)
+				goto err_destroy_queue;
 		}
 		uninit_queue(pqn->q);
 	}
-- 
2.7.4
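The control flow this patch introduces can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the driver's code: struct toy_queue_node, toy_destroy_queue() and the destroy_* callbacks are hypothetical stand-ins, and the callback replaces the dqm->ops.destroy_queue() call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process queue node. */
struct toy_queue_node {
	bool on_list;	/* still linked in the process queue list */
	bool resources;	/* queue resources still allocated */
};

/* Hypothetical stand-in for dqm->ops.destroy_queue(): 0 or -errno. */
typedef int (*destroy_fn)(void);

static int toy_destroy_queue(struct toy_queue_node *pqn, destroy_fn destroy)
{
	int retval;

	assert(pqn != NULL && pqn->on_list);

	retval = destroy();
	/* Any failure other than a timeout bails out and keeps the node. */
	if (retval && retval != -ETIME)
		return retval;

	/* Success or timeout: release and unlink, so no later path frees it again. */
	pqn->resources = false;
	pqn->on_list = false;
	return retval;
}

static int destroy_ok(void)      { return 0; }
static int destroy_timeout(void) { return -ETIME; }
static int destroy_busy(void)    { return -EBUSY; }
```

With destroy_timeout the node is still unlinked and the caller still sees -ETIME, which matches the hunk above: the error is logged and returned, but cleanup proceeds. With destroy_busy the node stays on the list.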
[PATCH 01/12] drm/amdkfd: Dump HQD of HIQ
From: Oak Zeng

Signed-off-by: Oak Zeng
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 9af94b1..668ad07 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1713,6 +1713,18 @@ int dqm_debugfs_hqds(struct seq_file *m, void *data)
 	int pipe, queue;
 	int r = 0;
 
+	r = dqm->dev->kfd2kgd->hqd_dump(dqm->dev->kgd,
+		KFD_CIK_HIQ_PIPE, KFD_CIK_HIQ_QUEUE, &dump, &n_regs);
+	if (!r) {
+		seq_printf(m, " HIQ on MEC %d Pipe %d Queue %d\n",
+				KFD_CIK_HIQ_PIPE/get_pipes_per_mec(dqm)+1,
+				KFD_CIK_HIQ_PIPE%get_pipes_per_mec(dqm),
+				KFD_CIK_HIQ_QUEUE);
+		seq_reg_dump(m, dump, n_regs);
+
+		kfree(dump);
+	}
+
 	for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
 		int pipe_offset = pipe * get_queues_per_pipe(dqm);
 
-- 
2.7.4
[PATCH 03/12] drm/amdkfd: Use volatile MTYPE in default/alternate apertures
From: Jay Cornwall

MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from those
loaded through the private aperture.

Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the compiler
to use the BUFFER_WBINVL1_VOL instruction and is a precursor to
automatic per-dispatch scalar/vector L1 volatile invalidation.

Signed-off-by: Jay Cornwall
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/cik_regs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cik_regs.h b/drivers/gpu/drm/amd/amdkfd/cik_regs.h
index 48769d1..37ce6dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_regs.h
+++ b/drivers/gpu/drm/amd/amdkfd/cik_regs.h
@@ -33,7 +33,8 @@
 #define	APE1_MTYPE(x)	((x) << 7)
 
 /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
-#define	MTYPE_CACHED	0
+#define	MTYPE_CACHED_NV	0
+#define	MTYPE_CACHED	1
 #define	MTYPE_NONCACHED	3
 
 #define	DEFAULT_CP_HQD_PERSISTENT_STATE	(0x33U << 8)
-- 
2.7.4
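As a rough illustration of what the renumbering changes, the sketch below packs the MTYPE values into aperture fields. APE1_MTYPE and the three MTYPE values come from the hunk above; the DEFAULT_MTYPE shift of 4 is an assumption (it is not shown in this patch), and aperture_mtype_bits() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Field position from the patch context: APE1_MTYPE(x) ((x) << 7). */
#define APE1_MTYPE(x)		((uint32_t)(x) << 7)
/* Assumed position of the default aperture field; not shown in the patch. */
#define DEFAULT_MTYPE(x)	((uint32_t)(x) << 4)

/* Values as they appear in the patched header. */
#define MTYPE_CACHED_NV		0u	/* cached, non-volatile */
#define MTYPE_CACHED		1u	/* cached, volatile */
#define MTYPE_NONCACHED		3u

/* Hypothetical helper: combine both aperture MTYPE fields into one word. */
static uint32_t aperture_mtype_bits(uint32_t default_mtype, uint32_t ape1_mtype)
{
	/* The three values defined above are all at most 3. */
	assert(default_mtype <= 3u && ape1_mtype <= 3u);
	return DEFAULT_MTYPE(default_mtype) | APE1_MTYPE(ape1_mtype);
}
```

Code that keeps using the name MTYPE_CACHED now encodes 1 in each field where it used to encode 0, which is how existing users pick up the volatile type without further edits.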
[PATCH 07/12] drm/amdkfd: Fix CP soft hang on APUs
From: Yong Zhao

The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits of
IB_STS.

The bug is not relevant to VEGA10 until we enable demand paging.

Signed-off-by: Jay Cornwall
Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h        | 4 ++--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 3 +--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 3 +--
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index a546a21..f68aef0 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -253,7 +253,6 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 	0x0072, 0x80728472,
 	0xc0211b7c, 0x0072,
 	0x80728472, 0xbf8c007f,
-	0x8671ff71, 0x,
 	0xbefc0073, 0xbefe006e,
 	0xbeff006f, 0x867375ff,
 	0x03ff, 0xb9734803,
@@ -267,6 +266,7 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 	0x8e738f73, 0x87767376,
 	0x8673ff74, 0x0080,
 	0x8f739773, 0xb976f807,
+	0x8671ff71, 0x,
 	0x86fe7e7e, 0x86ea6a6a,
 	0xb974f802, 0xbf8a,
 	0x95807370, 0xbf81,
@@ -530,7 +530,6 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
 	0x0078, 0x80788478,
 	0xc0211cfa, 0x0078,
 	0x80788478, 0xbf8cc07f,
-	0x866dff6d, 0x,
 	0xbefc006f, 0xbefe007a,
 	0xbeff007b, 0x866f71ff,
 	0x03ff, 0xb96f4803,
@@ -554,6 +553,7 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
 	0x8e6f8f6f, 0x876e6f6e,
 	0x866fff70, 0x0080,
 	0x8f6f976f, 0xb96ef807,
+	0x866dff6d, 0x,
 	0x86fe7e7e, 0x86ea6a6a,
 	0xb970f802, 0xbf8a,
 	0x95806f6c, 0xbf81,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index 658a4c6..a2a04bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -1015,8 +1015,6 @@ end
 
     s_waitcnt lgkmcnt(0) //from now on, it is safe to restore STATUS and IB_STS
 
-    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x //pc[47:32] //Do it here in order not to affect STATUS
-
     //for normal save & restore, the saved PC points to the next inst to execute, no adjustment needs to be made, otherwise:
     if ((EMU_RUN_HACK) && (!EMU_RUN_HACK_RESTORE_NORMAL))
         s_add_u32 s_restore_pc_lo, s_restore_pc_lo, 8 //pc[31:0]+8 //two back-to-back s_trap are used (first for save and second for restore)
@@ -1052,6 +1050,7 @@ end
     s_lshr_b32 s_restore_m0, s_restore_m0, SQ_WAVE_STATUS_INST_ATC_SHIFT
     s_setreg_b32 hwreg(HW_REG_IB_STS), s_restore_tmp
 
+    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x //pc[47:32] //Do it here in order not to affect STATUS
     s_and_b64 exec, exec, exec // Restore STATUS.EXECZ, not writable by s_setreg_b32
     s_and_b64 vcc, vcc, vcc // Restore STATUS.VCCZ, not writable by s_setreg_b32
     s_setreg_b32 hwreg(HW_REG_STATUS), s_restore_status // SCC is included, which is changed by previous salu
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 065f55a..998be96 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -1067,8 +1067,6 @@ end
 
     s_waitcnt lgkmcnt(0) //from now on, it is safe to restore STATUS and IB_STS
 
-    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x //pc[47:32] //Do it here in order not to affect STATUS
-
     //for normal save & restore, the saved PC points to the next inst to execute, no adjustment needs to be made, otherwise:
     if ((EMU_RUN_HACK) && (!EMU_RUN_HACK_RESTORE_NORMAL))
         s_add_u32 s_restore_pc_lo, s_restore_pc_lo, 8 //pc[31:0]+8 //two back-to-back s_trap are used (first for save and second for restore)
@@ -1119,6 +1117,7 @@ end
     s_lshr_b32 s_restore_m0, s_restore_m0, SQ_WAVE_STATUS_INST_ATC_SHIFT
     s_setreg_b32 hwreg(HW_REG_IB_STS), s_restore_tmp
 
+    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x //pc[47:32] //Do it here in order not to affect STATUS
     s_and_b64 exec, exec, exec // Restore STATUS.EXECZ, not writable by s_setreg_b32
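The ordering problem fixed by this patch can be reproduced with plain integer arithmetic. The sketch below is only an illustration: the low 16 bits holding pc[47:32] follow the comment in the assembly, but the position of the saved replay-count field (bits 31:28 here) is an assumption, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Low half holds pc[47:32]; the upper-half field position is assumed. */
#define PC_HI_PC_MASK		0x0000FFFFu
#define PC_HI_RCNT_SHIFT	28
#define PC_HI_RCNT_MASK		0xF0000000u

/* Save side: stash an IB_STS field in the unused upper bits of pc_hi. */
static uint32_t pack_pc_hi(uint32_t pc_47_32, uint32_t rcnt)
{
	assert(pc_47_32 <= 0xFFFFu && rcnt <= 0xFu);
	return pc_47_32 | (rcnt << PC_HI_RCNT_SHIFT);
}

/* Buggy order: clear the upper bits first, then try to extract the field. */
static uint32_t restore_rcnt_buggy(uint32_t pc_hi)
{
	pc_hi &= PC_HI_PC_MASK;
	return (pc_hi & PC_HI_RCNT_MASK) >> PC_HI_RCNT_SHIFT;
}

/* Fixed order: extract the field first, mask pc_hi afterwards. */
static uint32_t restore_rcnt_fixed(uint32_t *pc_hi)
{
	uint32_t rcnt = (*pc_hi & PC_HI_RCNT_MASK) >> PC_HI_RCNT_SHIFT;

	*pc_hi &= PC_HI_PC_MASK;
	return rcnt;
}
```

In the buggy order the extracted field is always zero, regardless of what was saved. Moving the mask after the IB_STS restore, as the patch does with the s_and_b32 line, recovers the field and still leaves pc_hi holding only pc[47:32].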
[PATCH 06/12] drm/amdkfd: Separate trap handler assembly code and its hex values
From: Yong Zhao

Since the assembly code is inside "#if 0", it is ineffective. Despite
that, during debugging, we need to change the assembly code, extract it
into a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines
starting with "#", so that sp3 can successfully compile the new file.

With this change, all the above chore is no longer needed, and
cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its
hex values.

Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 560 +
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 267 +-
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 300 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 3 +-
 4 files changed, 575 insertions(+), 555 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
new file mode 100644
index 000..a546a21
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -0,0 +1,560 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +static const uint32_t cwsr_trap_gfx8_hex[] = { + 0xbf820001, 0xbf820125, + 0xb8f4f802, 0x89748674, + 0xb8f5f803, 0x8675ff75, + 0x0400, 0xbf850011, + 0xc00a1e37, 0x, + 0xbf8c007f, 0x8978, + 0xbf840002, 0xb974f802, + 0xbe801d78, 0xb8f5f803, + 0x8675ff75, 0x01ff, + 0xbf850002, 0x80708470, + 0x82718071, 0x8671ff71, + 0x, 0xb974f802, + 0xbe801f70, 0xb8f5f803, + 0x8675ff75, 0x0100, + 0xbf840006, 0xbefa0080, + 0xb97a0203, 0x8671ff71, + 0x, 0x80f08870, + 0x82f18071, 0xbefa0080, + 0xb97a0283, 0xbef60068, + 0xbef70069, 0xb8fa1c07, + 0x8e7a9c7a, 0x87717a71, + 0xb8fa03c7, 0x8e7a9b7a, + 0x87717a71, 0xb8faf807, + 0x867aff7a, 0x7fff, + 0xb97af807, 0xbef2007e, + 0xbef3007f, 0xbefe0180, + 0xbf94, 0x877a8474, + 0xb97af802, 0xbf8e0002, + 0xbf88fffe, 0xbef8007e, + 0x8679ff7f, 0x, + 0x8779ff79, 0x0004, + 0xbefa0080, 0xbefb00ff, + 0x00807fac, 0x867aff7f, + 0x0800, 0x8f7a837a, + 0x877b7a7b, 0x867aff7f, + 0x7000, 0x8f7a817a, + 0x877b7a7b, 0xbeef007c, + 0xbeee0080, 0xb8ee2a05, + 0x806e816e, 0x8e6e8a6e, + 0xb8fa1605, 0x807a817a, + 0x8e7a867a, 0x806e7a6e, + 0xbefa0084, 0xbefa00ff, + 0x0100, 0xbefe007c, + 0xbefc006e, 0xc0611bfc, + 0x007c, 0x806e846e, + 0xbefc007e, 0xbefe007c, + 0xbefc006e, 0xc0611c3c, + 0x007c, 0x806e846e, + 0xbefc007e, 0xbefe007c, + 0xbefc006e, 0xc0611c7c, + 0x007c, 0x806e846e, + 0xbefc007e, 0xbefe007c, + 0xbefc006e, 0xc0611cbc, + 0x007c, 0x806e846e, + 0xbefc007e, 0xbefe007c, + 0xbefc006e, 0xc0611cfc, + 0x007c, 0x806e846e, + 0xbefc007e, 0xbefe007c, + 0xbefc006e, 0xc0611d3c, + 
0x007c, 0x806e846e, + 0xbefc007e, 0xb8f5f803, + 0xbefe007c, 0xbefc006e, + 0xc0611d7c, 0x007c, + 0x806e846e, 0xbefc007e, + 0xbefe007c, 0xbefc006e, + 0xc0611dbc, 0x007c, + 0x806e846e, 0xbefc007e, + 0xbefe007c, 0xbefc006e, + 0xc0611dfc, 0x007c, + 0x806e846e, 0xbefc007e, + 0xb8eff801, 0xbefe007c, + 0xbefc006e, 0xc0611bfc, + 0x007c, 0x806e846e, + 0xbefc007e, 0xbefe007c, + 0xbefc006e, 0xc0611b3c, + 0x007c, 0x806e846e, + 0xbefc007e,
[PATCH 00/12] Assorted KFD fixes
These are some random patches I noticed when comparing amdkfd-next
against amd-kfd-staging.

Ben Goz (1):
  drm/amdkfd: Locking PM mutex while allocating IB buffer

Felix Kuehling (4):
  drm/amdkfd: Remove redundant include of amd-iommu.h
  drm/amdkfd: Fix signal handling performance again
  drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
  drm/amdkfd: Add sanity checks in IRQ handlers

Jay Cornwall (2):
  drm/amdkfd: Reduce priority of context-saving waves before spin-wait
  drm/amdkfd: Use volatile MTYPE in default/alternate apertures

Oak Zeng (1):
  drm/amdkfd: Dump HQD of HIQ

Philip Yang (1):
  drm/amdkfd: use %px to print user space address instead of %p

Shaoyun Liu (1):
  drm/amdkfd: Remove queue node when destroy queue failed

Yong Zhao (2):
  drm/amdkfd: Separate trap handler assembly code and its hex values
  drm/amdkfd: Fix CP soft hang on APUs

 drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   | 20 +-
 drivers/gpu/drm/amd/amdkfd/cik_regs.h              | 3 +-
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 560 +
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 274 +-
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 307 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 6 +-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 12 +
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c    | 40 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 4 -
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 7 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c             | 8 +-
 14 files changed, 659 insertions(+), 596 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h

-- 
2.7.4
[PATCH 10/12] drm/amdkfd: Locking PM mutex while allocating IB buffer
From: Ben Goz

Signed-off-by: Ben Goz
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 91f0350..c317feb4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -94,12 +94,14 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
+	mutex_lock(&pm->lock);
+
 	retval = kfd_gtt_sa_allocate(pm->dqm->dev, *rl_buffer_size,
 					&pm->ib_buffer_obj);
 
 	if (retval) {
 		pr_err("Failed to allocate runlist IB\n");
-		return retval;
+		goto out;
 	}
 
 	*(void **)rl_buffer = pm->ib_buffer_obj->cpu_ptr;
@@ -107,6 +109,9 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
 	memset(*rl_buffer, 0, *rl_buffer_size);
 	pm->allocated = true;
+
+out:
+	mutex_unlock(&pm->lock);
 	return retval;
 }
 
-- 
2.7.4
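The shape of this change, taking the lock before the allocation and routing the error path through a single unlock label, can be sketched in user-space C. Everything here is a hypothetical model: struct toy_lock only tracks whether it is held (it is not a real mutex), calloc() stands in for the GTT sub-allocator, and -1 stands in for the allocator's error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy lock that only tracks whether it is held; stands in for pm->lock. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical stand-in for the packet manager state touched by the patch. */
struct toy_pm {
	struct toy_lock lock;
	void *ib_buffer;
	bool allocated;
};

/* Allocate with the lock held; every exit path goes through "out". */
static int toy_allocate_runlist_ib(struct toy_pm *pm, size_t size)
{
	int retval = 0;

	toy_lock_acquire(&pm->lock);

	/* calloc() also zeroes the buffer, like the memset() in the driver. */
	pm->ib_buffer = size ? calloc(1, size) : NULL;
	if (!pm->ib_buffer) {
		retval = -1;	/* placeholder for the allocator's error code */
		goto out;
	}
	pm->allocated = true;

out:
	toy_lock_release(&pm->lock);
	return retval;
}
```

Replacing the early return with goto out is what keeps the lock balanced on failure; with the original return statement left in place, the error path would exit with the lock still held.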
[PATCH 12/12] drm/amdkfd: Add sanity checks in IRQ handlers
Only accept interrupts from KFD VMIDs. Just checking for a PASID may not be enough because amdgpu started using PASIDs to map VM faults to processes. Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug). Suggested-by: Shaoyun LiuSuggested-by: Oak Zeng Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 20 +--- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 40 ++-- 2 files changed, 39 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c index 3d5ccb3..49df6c7 100644 --- a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c +++ b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c @@ -27,18 +27,28 @@ static bool cik_event_interrupt_isr(struct kfd_dev *dev, const uint32_t *ih_ring_entry) { - unsigned int pasid; const struct cik_ih_ring_entry *ihre = (const struct cik_ih_ring_entry *)ih_ring_entry; + unsigned int vmid, pasid; + + /* Only handle interrupts from KFD VMIDs */ + vmid = (ihre->ring_id & 0xff00) >> 8; + if (vmid < dev->vm_info.first_vmid_kfd || + vmid > dev->vm_info.last_vmid_kfd) + return 0; + /* If there is no valid PASID, it's likely a firmware bug */ pasid = (ihre->ring_id & 0x) >> 16; + if (WARN_ONCE(pasid == 0, "FW bug: No PASID in KFD interrupt")) + return 0; - /* Do not process in ISR, just request it to be forwarded to WQ. */ - return (pasid != 0) && - (ihre->source_id == CIK_INTSRC_CP_END_OF_PIPE || + /* Interrupt types we care about: various signals and faults. +* They will be forwarded to a work queue (see below). 
+*/ + return ihre->source_id == CIK_INTSRC_CP_END_OF_PIPE || ihre->source_id == CIK_INTSRC_SDMA_TRAP || ihre->source_id == CIK_INTSRC_SQ_INTERRUPT_MSG || - ihre->source_id == CIK_INTSRC_CP_BAD_OPCODE); + ihre->source_id == CIK_INTSRC_CP_BAD_OPCODE; } static void cik_event_interrupt_wq(struct kfd_dev *dev, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c index 39d4115..37029ba 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c @@ -29,27 +29,35 @@ static bool event_interrupt_isr_v9(struct kfd_dev *dev, const uint32_t *ih_ring_entry) { uint16_t source_id, client_id, pasid, vmid; + const uint32_t *data = ih_ring_entry; - source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry); - client_id = SOC15_CLIENT_ID_FROM_IH_ENTRY(ih_ring_entry); - pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry); + /* Only handle interrupts from KFD VMIDs */ vmid = SOC15_VMID_FROM_IH_ENTRY(ih_ring_entry); + if (vmid < dev->vm_info.first_vmid_kfd || + vmid > dev->vm_info.last_vmid_kfd) + return 0; + + /* If there is no valid PASID, it's likely a firmware bug */ + pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry); + if (WARN_ONCE(pasid == 0, "FW bug: No PASID in KFD interrupt")) + return 0; - if (pasid) { - const uint32_t *data = ih_ring_entry; + source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry); + client_id = SOC15_CLIENT_ID_FROM_IH_ENTRY(ih_ring_entry); - pr_debug("client id 0x%x, source id %d, pasid 0x%x. raw data:\n", -client_id, source_id, pasid); - pr_debug("%8X, %8X, %8X, %8X, %8X, %8X, %8X, %8X.\n", -data[0], data[1], data[2], data[3], -data[4], data[5], data[6], data[7]); - } + pr_debug("client id 0x%x, source id %d, pasid 0x%x. 
raw data:\n", +client_id, source_id, pasid); + pr_debug("%8X, %8X, %8X, %8X, %8X, %8X, %8X, %8X.\n", +data[0], data[1], data[2], data[3], +data[4], data[5], data[6], data[7]); - return (pasid != 0) && - (source_id == SOC15_INTSRC_CP_END_OF_PIPE || -source_id == SOC15_INTSRC_SDMA_TRAP || -source_id == SOC15_INTSRC_SQ_INTERRUPT_MSG || -source_id == SOC15_INTSRC_CP_BAD_OPCODE); + /* Interrupt types we care about: various signals and faults. +* They will be forwarded to a work queue (see below). +*/ + return source_id == SOC15_INTSRC_CP_END_OF_PIPE || + source_id == SOC15_INTSRC_SDMA_TRAP || + source_id == SOC15_INTSRC_SQ_INTERRUPT_MSG || + source_id == SOC15_INTSRC_CP_BAD_OPCODE; } static void event_interrupt_wq_v9(struct kfd_dev
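A small user-space sketch of the filtering order this patch introduces on CIK: check the VMID window first, then require a non-zero PASID. The names below (demo_vm_info, demo_accept_irq) are hypothetical, and the bit layout is an assumption for illustration: VMID in bits 8..15 as in the patch, PASID in bits 16..31 (the PASID mask is truncated in the archived copy, so 0xffff0000 is my guess, not a quote).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VMID window; in the patch it comes from dev->vm_info. */
struct demo_vm_info {
	uint32_t first_vmid_kfd;
	uint32_t last_vmid_kfd;
};

/* Assumed layout: VMID in bits 8..15 of ring_id. */
static uint32_t demo_vmid(uint32_t ring_id)
{
	return (ring_id & 0x0000ff00u) >> 8;
}

/* Assumed layout: PASID in bits 16..31 of ring_id. */
static uint32_t demo_pasid(uint32_t ring_id)
{
	return (ring_id & 0xffff0000u) >> 16;
}

/* Return true only for interrupts from a KFD VMID that carry a PASID. */
static bool demo_accept_irq(const struct demo_vm_info *vm, uint32_t ring_id)
{
	uint32_t vmid = demo_vmid(ring_id);

	if (vmid < vm->first_vmid_kfd || vmid > vm->last_vmid_kfd)
		return false;	/* not a KFD VMID: ignore silently */

	if (demo_pasid(ring_id) == 0)
		return false;	/* the patch also emits WARN_ONCE here */

	return true;
}
```

The source_id comparison that follows in the real handlers is left out; the sketch only covers the two new early exits.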
[PATCH 02/12] drm/amdkfd: Reduce priority of context-saving waves before spin-wait
From: Jay Cornwall

Synchronization between context-saving wavefronts is achieved by sending a SAVEWAVE message to the SPI and then spin-waiting for a response. These spin-waiting wavefronts may inhibit the progress of other wavefronts in the context save handler, leading to the synchronization condition never being achieved. Before spin-waiting, reduce the priority of each wavefront to guarantee forward progress in the others.

Signed-off-by: Jay Cornwall
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
--- drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 10 -- drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 8 +++- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm index 997a383d..34eabcd 100644 --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm @@ -98,6 +98,7 @@ var SWIZZLE_EN = 0 //whether we use swi /**/ var SQ_WAVE_STATUS_INST_ATC_SHIFT = 23 var SQ_WAVE_STATUS_INST_ATC_MASK = 0x0080 +var SQ_WAVE_STATUS_SPI_PRIO_SHIFT = 1 var SQ_WAVE_STATUS_SPI_PRIO_MASK = 0x0006 var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT= 12 @@ -319,6 +320,10 @@ end s_sendmsg sendmsg(MSG_SAVEWAVE) //send SPI a message and wait for SPI's write to EXEC end +// Set SPI_PRIO=2 to avoid starving instruction fetch in the waves we're waiting for.
+s_or_b32 s_save_tmp, s_save_status, (2 << SQ_WAVE_STATUS_SPI_PRIO_SHIFT) +s_setreg_b32 hwreg(HW_REG_STATUS), s_save_tmp + L_SLEEP: s_sleep 0x2// sleep 1 (64clk) is not enough for 8 waves per SIMD, which will cause SQ hang, since the 7,8th wave could not get arbit to exec inst, while other waves are stuck into the sleep-loop and waiting for wrexec!=0 @@ -1132,7 +1137,7 @@ end #endif static const uint32_t cwsr_trap_gfx8_hex[] = { - 0xbf820001, 0xbf820123, + 0xbf820001, 0xbf820125, 0xb8f4f802, 0x89748674, 0xb8f5f803, 0x8675ff75, 0x0400, 0xbf850011, @@ -1158,7 +1163,8 @@ static const uint32_t cwsr_trap_gfx8_hex[] = { 0x867aff7a, 0x7fff, 0xb97af807, 0xbef2007e, 0xbef3007f, 0xbefe0180, - 0xbf94, 0xbf8e0002, + 0xbf94, 0x877a8474, + 0xb97af802, 0xbf8e0002, 0xbf88fffe, 0xbef8007e, 0x8679ff7f, 0x, 0x8779ff79, 0x0004, diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm index da09794..8fc3698 100644 --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm @@ -97,6 +97,7 @@ var ACK_SQC_STORE = 1 //workaround for suspected SQC store bug causing /**/ var SQ_WAVE_STATUS_INST_ATC_SHIFT = 23 var SQ_WAVE_STATUS_INST_ATC_MASK = 0x0080 +var SQ_WAVE_STATUS_SPI_PRIO_SHIFT = 1 var SQ_WAVE_STATUS_SPI_PRIO_MASK = 0x0006 var SQ_WAVE_STATUS_HALT_MASK = 0x2000 @@ -362,6 +363,10 @@ end s_sendmsg sendmsg(MSG_SAVEWAVE) //send SPI a message and wait for SPI's write to EXEC end +// Set SPI_PRIO=2 to avoid starving instruction fetch in the waves we're waiting for. 
+s_or_b32 s_save_tmp, s_save_status, (2 << SQ_WAVE_STATUS_SPI_PRIO_SHIFT) +s_setreg_b32 hwreg(HW_REG_STATUS), s_save_tmp + L_SLEEP: s_sleep 0x2 // sleep 1 (64clk) is not enough for 8 waves per SIMD, which will cause SQ hang, since the 7,8th wave could not get arbit to exec inst, while other waves are stuck into the sleep-loop and waiting for wrexec!=0 @@ -1210,7 +1215,7 @@ end #endif static const uint32_t cwsr_trap_gfx9_hex[] = { - 0xbf820001, 0xbf820158, + 0xbf820001, 0xbf82015a, 0xb8f8f802, 0x89788678, 0xb8f1f803, 0x866eff71, 0x0400, 0xbf850034, @@ -1249,6 +1254,7 @@ static const uint32_t cwsr_trap_gfx9_hex[] = { 0x7fff, 0xb970f807, 0xbeee007e, 0xbeef007f, 0xbefe0180, 0xbf94, + 0x87708478, 0xb970f802, 0xbf8e0002, 0xbf88fffe, 0xb8f02a05, 0x80708170, 0x8e708a70, 0xb8f11605, -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
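The two added trap-handler instructions boil down to OR-ing a shifted constant into the saved STATUS value before writing it back. A rough C model of that arithmetic follows; the demo_* names are hypothetical, the shift of 1 is taken from the patch, and the mask value 0x6 is what the truncated 0x0006 in the archived copy evaluates to.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SPI_PRIO_SHIFT 1u		/* SQ_WAVE_STATUS_SPI_PRIO_SHIFT in the patch */
#define DEMO_SPI_PRIO_MASK  0x6u	/* two-bit field at bits 1..2 */

/* Model of: s_or_b32 tmp, status, (prio << SPI_PRIO_SHIFT) */
static uint32_t demo_or_spi_prio(uint32_t status, uint32_t prio)
{
	return status | (prio << DEMO_SPI_PRIO_SHIFT);
}

/* Extract the SPI_PRIO field from a STATUS value. */
static uint32_t demo_get_spi_prio(uint32_t status)
{
	return (status & DEMO_SPI_PRIO_MASK) >> DEMO_SPI_PRIO_SHIFT;
}
```

One thing the model makes visible: because this is an OR and not a clear-then-set, any SPI_PRIO bits already set in the saved status stay set. Whether that can happen in practice depends on how waves reach the handler, which the patch does not say.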
[PATCH 09/12] drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
The initialization is not necessary. amd-kfd-staging and ROCm releases
have worked without it for two years.

Signed-off-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 2bc49c6..06eaa21 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -79,10 +79,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 	m->cp_mqd_base_addr_lo = lower_32_bits(addr);
 	m->cp_mqd_base_addr_hi = upper_32_bits(addr);
 
-	m->cp_hqd_ib_control = DEFAULT_MIN_IB_AVAIL_SIZE | IB_ATC_EN;
-	/* Although WinKFD writes this, I suspect it should not be necessary */
-	m->cp_hqd_ib_control = IB_ATC_EN | DEFAULT_MIN_IB_AVAIL_SIZE;
-
 	m->cp_hqd_quantum = QUANTUM_EN | QUANTUM_SCALE_1MS |
 				QUANTUM_DURATION(10);
 
-- 
2.7.4
Re: [PATCH] drm/amd/amdgpu: vcn10 Add callback for emit_reg_write_reg_wait
Reviewed-by: Andrey Grodzovsky

Andrey

On 05/01/2018 10:18 AM, Tom St Denis wrote:
> The callback .emit_reg_write_reg_wait was missing for vcn decode
> which resulted in a kernel oops.
>
> Signed-off-by: Tom St Denis
> ---
>  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> index d9a15338db7e..0501746b6c2c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> @@ -1109,6 +1109,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
>  	.end_use = amdgpu_vcn_ring_end_use,
>  	.emit_wreg = vcn_v1_0_dec_ring_emit_wreg,
>  	.emit_reg_wait = vcn_v1_0_dec_ring_emit_reg_wait,
> +	.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
>  };
>
>  static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = {
Re: New SPDX-License-Identifier requirement
I believe it should be : SPDX-License-Identifier: GPL-2.0 OR MIT But John probably knows best about this Oded On Tue, May 1, 2018 at 11:14 PM, Felix Kuehlingwrote: > Hi, > > I'm getting a checkpatch warning with the latest amdkfd-next branch > (4.17-rc2) when adding a new file: > > WARNING: Missing or malformed SPDX-License-Identifier tag in line 1 > #34: FILE: drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h:1: > > I've read Documentation/process/license-rules.rst but I'm unsure what > would be the correct license identifier to go with the license header we > use in most of our source files. I think it would be one of these: > > // SPDX-License-Identifier: GPL-2.0 OR MIT > // SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause > > Can someone confirm. > > Once that's clarified, we should probably add the appropriate license > identifier to all our source files. > > Thanks, > Felix > > -- > F e l i x K u e h l i n g > PMTS Software Development Engineer | Vertical Workstation/Compute > 1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada > (O) +1(289)695-1597 >_ _ _ _ _ > / \ | \ / | | _ \ \ _ | > / A \ | \M/ | | |D) ) /|_| | > /_/ \_\ |_| |_| |_/ |__/ \| facebook.com/AMD | amd.com > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
New SPDX-License-Identifier requirement
Hi, I'm getting a checkpatch warning with the latest amdkfd-next branch (4.17-rc2) when adding a new file: WARNING: Missing or malformed SPDX-License-Identifier tag in line 1 #34: FILE: drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h:1: I've read Documentation/process/license-rules.rst but I'm unsure what would be the correct license identifier to go with the license header we use in most of our source files. I think it would be one of these: // SPDX-License-Identifier: GPL-2.0 OR MIT // SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause Can someone confirm. Once that's clarified, we should probably add the appropriate license identifier to all our source files. Thanks, Felix -- F e l i x K u e h l i n g PMTS Software Development Engineer | Vertical Workstation/Compute 1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada (O) +1(289)695-1597 _ _ _ _ _ / \ | \ / | | _ \ \ _ | / A \ | \M/ | | |D) ) /|_| | /_/ \_\ |_| |_| |_/ |__/ \| facebook.com/AMD | amd.com ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes
On Fri, Apr 27, 2018 at 3:27 AM Shirish Swrote: > This patch is in continuation to the > "843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state" > where we started to eliminate the dependency on > DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space, > which as such is not mandatory. > After deferring, this patch eliminates the dependency on the flag > for overlay planes. > This has to be done in stages as its a pretty complex and requires thorough > testing before we free primary planes as well from dependency on modeset > flag. > Signed-off-by: Shirish S > --- > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 1a63c04..87b661d 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, > } > spin_unlock_irqrestore(>dev->event_lock, flags); > - if (!pflip_needed) { > + if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) { > WARN_ON(!dm_new_plane_state->dc_state); > plane_states_constructed[planes_count] = dm_new_plane_state->dc_state; > @@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc, > /* Remove any changed/removed planes */ > if (!enable) { > - if (pflip_needed) > + if (pflip_needed && > + plane && plane->type != DRM_PLANE_TYPE_OVERLAY) nit: I don't think we need to check that plane is non-NULL Stéphane > continue; > if (!old_plane_crtc) > @@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc, > if (!dm_new_crtc_state->stream) > continue; > - if (pflip_needed) > + if (pflip_needed && > + plane && plane->type != DRM_PLANE_TYPE_OVERLAY) > continue; > WARN_ON(dm_new_plane_state->dc_state); > -- > 2.7.4 > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > 
https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/amdgpu: vcn10 Add callback for emit_reg_write_reg_wait
The callback .emit_reg_write_reg_wait was missing for vcn decode
which resulted in a kernel oops.

Signed-off-by: Tom St Denis
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index d9a15338db7e..0501746b6c2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1109,6 +1109,7 @@ static const struct amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
 	.end_use = amdgpu_vcn_ring_end_use,
 	.emit_wreg = vcn_v1_0_dec_ring_emit_wreg,
 	.emit_reg_wait = vcn_v1_0_dec_ring_emit_reg_wait,
+	.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
 };
 
 static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = {
-- 
2.14.3
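As background on why a one-line omission in a funcs table turns into an oops: any member left out of a designated initializer is implicitly NULL, and calling through it jumps to address 0. The sketch below models that in user space. All demo_* names are hypothetical, and the NULL guard in demo_flush() exists only so the example can run safely; it is not what the kernel code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_ring;

/* Hypothetical, much reduced stand-in for struct amdgpu_ring_funcs. */
struct demo_ring_funcs {
	void (*emit_wreg)(struct demo_ring *ring, uint32_t reg, uint32_t val);
	void (*emit_reg_write_reg_wait)(struct demo_ring *ring,
					uint32_t reg0, uint32_t reg1,
					uint32_t ref, uint32_t mask);
};

struct demo_ring {
	const struct demo_ring_funcs *funcs;
	unsigned int writes;	/* counts emitted register writes */
};

static void demo_emit_wreg(struct demo_ring *ring, uint32_t reg, uint32_t val)
{
	(void)reg;
	(void)val;
	ring->writes++;
}

/* Generic fallback: implement write-then-wait as a plain write. */
static void demo_write_reg_wait_helper(struct demo_ring *ring,
				       uint32_t reg0, uint32_t reg1,
				       uint32_t ref, uint32_t mask)
{
	(void)reg1;
	(void)mask;
	ring->funcs->emit_wreg(ring, reg0, ref);
}

/* Table as before the fix: the new callback is omitted, hence NULL. */
static const struct demo_ring_funcs demo_incomplete_funcs = {
	.emit_wreg = demo_emit_wreg,
};

/* Table as after the fix: the callback points at the generic helper. */
static const struct demo_ring_funcs demo_complete_funcs = {
	.emit_wreg = demo_emit_wreg,
	.emit_reg_write_reg_wait = demo_write_reg_wait_helper,
};

/* Returns 0 on success, -1 if the table lacks the callback. */
static int demo_flush(struct demo_ring *ring)
{
	if (!ring->funcs->emit_reg_write_reg_wait)
		return -1;	/* an unguarded call here would jump to NULL */

	ring->funcs->emit_reg_write_reg_wait(ring, 0, 1, 0, 0);
	return 0;
}
```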
Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE
On 2018-05-01 01:15 AM, Dave Airlie wrote: >> >> >> Yes, I fixed the original false positive messages myself with the swiotlb >> maintainer and I was CCed in fixing the recent fallout from Chris changes as >> well. > > So do we have a good summary of where this at now? > > I'm getting reports on 4.16.4 still displaying these, what hammer do I > need to hit things with to get 4.16.x+1 to not do this? > > Is there still outstanding issues upstream. There are, https://patchwork.freedesktop.org/patch/219765/ should hopefully fix the last of it. > [...] I've no idea if the swiotlb things people report are the false > positive, or some new thing. The issues I've seen reported with 4.16 are false positives from TTM's perspective, which uses DMA_ATTR_NO_WARN to suppress these warnings, due to multiple regressions introduced by commit 0176adb004065d6815a8e67946752df4cd947c5b "swiotlb: refactor coherent buffer allocation" in 4.16-rc1. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
vcn regression on raven1
Hi all, I've noticed that on the tip of drm-next vcn playback of video is broken (see dmesg below). I've bisected it to this commit [root@raven linux]# git bisect good 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit commit 701372349fd55b5396b335580e979ac4dde3dd02 Author: Alex DeucherDate: Tue Mar 27 17:10:56 2018 -0500 drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it, it provides a write and wait in a single packet which avoids a missed ack if a world switch happens between the request and waiting for the ack. Reviewed-by: Huang Rui Reviewed-by: Christian König Signed-off-by: Alex Deucher :04 04 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M drivers Which is odd because the commit before this is the vcn change and it works fine (playing BBB right now). Here's the dmesg: [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at [ 2925.640113] IP: (null) [ 2925.640116] PGD 0 P4D 0 [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash gpu_sched ttm ax88179_178a usbnet [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20 [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 3803 01/22/2018 [ 2925.640146] RIP: 0010: (null) [ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206 [ 2925.640153] RAX: RBX: 8801d8b38420 RCX: 007c0080 [ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 8801d8b38420 [ 2925.640159] RBP: 0001a6fa R08: 0080 R09: ed003aa9eef9 [ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 8801d8b3277c [ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: [ 2925.640168] FS: () GS:8801dcf0() knlGS: [ 2925.640171] CS: 0010 DS: ES: CR0: 80050033 [ 2925.640174] CR2: CR3: 0001d9712000 CR4: 003406e0 [ 2925.640176] Call Trace: [ 2925.640272] ? 
gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu] [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu] [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu] [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu] [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu] [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu] [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu] [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu] [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu] [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu] [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360 [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu] [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu] [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0 [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40 [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0 [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched] [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched] [ 2925.641328] ? save_stack+0x89/0xb0 [ 2925.641332] ? wait_woken+0x110/0x110 [ 2925.641337] ? ret_from_fork+0x22/0x40 [ 2925.641343] ? __schedule+0xd30/0xd30 [ 2925.641346] ? remove_wait_queue+0x150/0x150 [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0 [ 2925.641359] ? __lock_text_start+0x8/0x8 [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched] [ 2925.641371] ? kthread+0x19b/0x1c0 [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 2925.641382] ? ret_from_fork+0x22/0x40 [ 2925.641387] Code: Bad RIP value. [ 2925.641397] RIP: (null) RSP: 8801d54f7790 [ 2925.641400] CR2: [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]--- Note that regular compute/gfx workflows work fine on the tip of drm-next only vcn playback triggeers this (haven't tried encode yet...). Cheers, Tom ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm] amdgpu: Deinitialize vamgr_high{,_32}
Reviewed-by: Andrey GrodzovskyAndrey On 05/01/2018 04:03 AM, Michel Dänzer wrote: On 2018-04-27 04:44 PM, Michel Dänzer wrote: From: Michel Dänzer Fixes memory leaks. Signed-off-by: Michel Dänzer --- amdgpu/amdgpu_device.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c index d81efcf8..983b19ab 100644 --- a/amdgpu/amdgpu_device.c +++ b/amdgpu/amdgpu_device.c @@ -128,6 +128,8 @@ static void amdgpu_device_free_internal(amdgpu_device_handle dev) { amdgpu_vamgr_deinit(>vamgr_32); amdgpu_vamgr_deinit(>vamgr); + amdgpu_vamgr_deinit(>vamgr_high_32); + amdgpu_vamgr_deinit(>vamgr_high); util_hash_table_destroy(dev->bo_flink_names); util_hash_table_destroy(dev->bo_handles); pthread_mutex_destroy(>bo_table_mutex); Any reviews? Without negative feedback, I'll push this tomorrow. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm] amdgpu: Deinitialize vamgr_high{,_32}
On 2018-04-27 04:44 PM, Michel Dänzer wrote:
> From: Michel Dänzer
>
> Fixes memory leaks.
>
> Signed-off-by: Michel Dänzer
> ---
>  amdgpu/amdgpu_device.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
> index d81efcf8..983b19ab 100644
> --- a/amdgpu/amdgpu_device.c
> +++ b/amdgpu/amdgpu_device.c
> @@ -128,6 +128,8 @@ static void amdgpu_device_free_internal(amdgpu_device_handle dev)
>  {
>  	amdgpu_vamgr_deinit(&dev->vamgr_32);
>  	amdgpu_vamgr_deinit(&dev->vamgr);
> +	amdgpu_vamgr_deinit(&dev->vamgr_high_32);
> +	amdgpu_vamgr_deinit(&dev->vamgr_high);
>  	util_hash_table_destroy(dev->bo_flink_names);
>  	util_hash_table_destroy(dev->bo_handles);
>  	pthread_mutex_destroy(&dev->bo_table_mutex);
>

Any reviews? Without negative feedback, I'll push this tomorrow.

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
[PATCH v3 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
The amdgpu driver doesn't appear to directly use the scatterlist mapped
by amdgpu_ttm_tt_pin_userptr(); it merely hands it off to
drm_prime_sg_to_page_addr_arrays() to generate the dma_address array,
which is what it actually cares about. Now that the latter can cope with
dma_map_sg() coalescing dma-contiguous segments such that it returns
0 < count < nents, we can relax the current count == nents check to only
consider genuine failure, as other drivers do.

Reported-by: Sinan Kaya
Reviewed-by: Christian König
Signed-off-by: Robin Murphy
---

v3: No change

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 205da3ff9cd0..f81e96a4242f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -813,7 +813,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 	r = -ENOMEM;
 	nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-	if (nents != ttm->sg->nents)
+	if (nents == 0)
 		goto release_sg;
 
 	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.17.0.dirty
[PATCH v3 3/3] drm/radeon: Allow dma_map_sg() coalescing
Much like amdgpu, the radeon driver doesn't appear to directly use the
scatterlist mapped by radeon_ttm_tt_pin_userptr(); it merely hands it off
to drm_prime_sg_to_page_addr_arrays() to generate the dma_address array,
which is what it actually cares about. Now that the latter can cope with
dma_map_sg() coalescing dma-contiguous segments such that it returns
0 < count < nents, we can relax the current count == nents check to only
consider genuine failure, as other drivers do.

Suggested-by: Christian König
Reviewed-by: Christian König
Signed-off-by: Robin Murphy
---

v3: No change

 drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 8689fcca051c..7c099192c7fa 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -585,7 +585,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 	r = -ENOMEM;
 	nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-	if (nents != ttm->sg->nents)
+	if (nents == 0)
 		goto release_sg;
 
 	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.17.0.dirty
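To make the "0 < count < nents" case concrete, here is a toy user-space model of a mapping step that merges address-contiguous segments and reports how many it ended up with. It is not the DMA API: demo_seg, demo_map_coalesce() and demo_pin() are hypothetical names, and real implementations decide about merging on their own terms. It only shows why "count != nents" is the wrong failure test once merging is allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment: start address and length in bytes. */
struct demo_seg {
	uint64_t addr;
	uint64_t len;
};

/*
 * Merge adjacent segments that are address-contiguous, in place.
 * Returns the number of segments after merging (0 if nents is 0).
 */
static size_t demo_map_coalesce(struct demo_seg *segs, size_t nents)
{
	size_t out = 0;

	for (size_t i = 0; i < nents; i++) {
		if (out > 0 &&
		    segs[out - 1].addr + segs[out - 1].len == segs[i].addr)
			segs[out - 1].len += segs[i].len;	/* extend previous */
		else
			segs[out++] = segs[i];			/* start a new one */
	}
	return out;
}

/* Relaxed check from the patch: only a count of 0 is a failure. */
static int demo_pin(struct demo_seg *segs, size_t nents)
{
	size_t mapped = demo_map_coalesce(segs, nents);

	if (mapped == 0)
		return -1;
	return 0;	/* mapped < nents is fine: segments were merged */
}
```

With three input segments of which the first two are contiguous, the model returns 2; the old "count != nents" test would have treated that as an error.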
Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
On 04/30, Christian König wrote: > > Well when the process is killed we don't care about correctness any more, we > just want to get rid of it as quickly as possible (OOM situation etc...). OK, > But it is perfectly possible that a process submits some render commands and > then calls exit() or terminates because of a SIGTERM, SIGINT etc.. This doesn't differ from SIGKILL. I mean, any unhandled fatal signal translates to SIGKILL and I think this is fine. but this doesn't really matter, > So what we essentially need is to distinct between a SIGKILL (which means > stop processing as soon as possible) and any other reason because then we > don't want to annoy the user with garbage on the screen (even if it's just > for a few milliseconds). For what? OK, I see another email from Andrey, I'll reply to that email... Oleg. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
On 27/04/18 20:42, Sinan Kaya wrote: On 4/27/2018 11:54 AM, Robin Murphy wrote: ubuntu@ubuntu:~/amdgpu$_./vectoradd_hip.exe [ 834.002206] create_process:620 [ 837.413021] Unable to handle kernel NULL pointer dereference at virtual address 0018 £5 says that's sg_dma_len(NULL), which implies either that something's gone horribly wrong with the scatterlist DMA mapping such that the lengths don't match, or much more likely that ttm.dma_address is NULL and I've missed the tiny subtlety below. Does that fix matters? Turned out to be a null pointer problem after sg_next(). The following helped. Ugh, right, the whole thing's in the wrong place such that when addrs is valid we can dereference junk on the way out of the loop (entirely needlessly)... v3 coming up. Robin. + if (addrs && (dma_len == 0)) { dma_sg = sg_next(dma_sg); - dma_len = sg_dma_len(dma_sg); - addr = sg_dma_address(dma_sg); + if (dma_sg) { + dma_len = sg_dma_len(dma_sg); + addr = sg_dma_address(dma_sg); + } } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
Christian Königwrites: > Hi Eric, > > sorry for the late response, was on vacation last week. > > Am 26.04.2018 um 02:01 schrieb Eric W. Biederman: >> Andrey Grodzovsky writes: >> >>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote: On 04/25, Andrey Grodzovsky wrote: > here (drm_sched_entity_fini) is also a bad idea, but we still want to be > able to exit immediately > and not wait for GPU jobs completion when the reason for reaching this > code > is because of KILL > signal to the user process who opened the device file. Can you hook f_op->flush method? > > THANKS! That sounds like a really good idea to me and we haven't investigated > into that direction yet. For the backwards compatibility concerns you cite below the flush method seems a much better place to introduce the wait. You at least really will be in a process context for that. Still might be in exit but at least you will be legitimately be in a process. >>> But this one is called for each task releasing a reference to the the file, >>> so >>> not sure I see how this solves the problem. >> The big question is why do you need to wait during the final closing a >> file? > > As always it's because of historical reasons. Initially user space pushed > commands directly to a hardware queue and when a processes finished we didn't > need to wait for anything. > > Then the GPU scheduler was introduced which delayed pushing the jobs to the > hardware queue to a later point in time. > > This wait was then added to maintain backward compability and not break > userspace (but see below). That make sense. >> The wait can be terminated so the wait does not appear to be simply a >> matter of correctness. > > Well when the process is killed we don't care about correctness any more, we > just want to get rid of it as quickly as possible (OOM situation etc...). > > But it is perfectly possible that a process submits some render commands and > then calls exit() or terminates because of a SIGTERM, SIGINT etc.. 
In this > case > we need to wait here to make sure that all rendering is pushed to the hardware > because the scheduler might need resources/settings from the file > descriptor. > > For example if you just remove that wait you could close firefox and get > garbage > on the screen for a millisecond because the remaining rendering commands where > not executed. > > So what we essentially need is to distinct between a SIGKILL (which means stop > processing as soon as possible) and any other reason because then we don't > want > to annoy the user with garbage on the screen (even if it's just for a few > milliseconds). I see a couple of issues. - Running the code in release rather than in flush. Using flush will catch every close so it should be more backwards compatible. f_op->flush always runs in process context so looking at current makes sense. - Distinguishing between death by SIGKILL and other process exit deaths. In f_op->flush the code can test "((tsk->flags & PF_EXITING) && (tsk->code == SIGKILL))" to see if it was SIGKILL that terminated the process. - Dealing with stuck queues (where this patchset came in). For stuck queues you are going to need a timeout instead of the current indefinite wait after PF_EXITING is set. From what you have described a few milliseconds should be enough. If PF_EXITING is not set you can still just make the wait killable and skip the timeout if that will give a better backwards compatible user experience. What can't be done is try and catch SIGKILL after a process has called do_exit. A dead process is a dead process. Eric ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
On 04/30, Andrey Grodzovsky wrote: > > What about changing PF_SIGNALED to PF_EXITING in > drm_sched_entity_do_release > > - if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL) > + if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL) let me repeat, please don't use task->exit_code. And in fact this check is racy. But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL, or do something else, but I fail to understand what are you trying to do. Suppose that the check above is correct in that it is true iff the task is exiting and it was killed by SIGKILL. What about the "else" branch which does r = wait_event_killable(sched->job_scheduled, ...) ? Once again, fatal_signal_pending() (or even signal_pending()) is not well defined after the exiting task passes exit_signals(). So wait_event_killable() can fail because fatal_signal_pending() is true; and this can happen even if it was not killed. Or it can block and SIGKILL won't be able to wake it up. > If SIGINT was sent then it's SIGINT, Yes, but see above. in this case fatal_signal_pending() will be likely true so wait_event_killable() will fail unless condition is already true. Oleg. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v3 1/3] drm/prime: Iterate SG DMA addresses separately
On 4/30/2018 9:54 AM, Robin Murphy wrote:
> For dma_map_sg(), DMA API implementations are free to merge consecutive
> segments into a single DMA mapping if conditions are suitable, thus the
> resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
> iterates over may be packed into fewer entries than sgt->nents implies.
>
> The current implementation does not account for this, meaning that its
> callers either have to reject the 0 < count < nents case or risk getting
> bogus DMA addresses beyond the first segment. Fortunately this is quite
> easy to handle without having to rejig structures to also store the
> mapped count, since the total DMA length should still be equal to the
> total buffer length. All we need is a second scatterlist cursor to
> iterate through the DMA addresses independently of the page addresses.
>
> Reviewed-by: Christian König
> Signed-off-by: Robin Murphy
> ---

Much better

Tested-by: Sinan Kaya

for the first two patches. (1/3 and 2/3)

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[REGRESSION] drm/amd/dc: Add dc display driver (v2)
Hi Harry,

A kernel bug report was opened against Ubuntu [0]. After a kernel
bisect, it was found that the following commit introduced the bug:

commit 4562236b3bc0a28aeb6ee93b2d8a849a4c4e1c7c
Author: Harry Wentland
Date:   Tue Sep 12 15:58:20 2017 -0400

    drm/amd/dc: Add dc display driver (v2)

The regression was introduced as of v4.15-rc1 and still exists in
current mainline. The commit does not need to be reverted to resolve
the bug. Disabling the CONFIG_DRM_AMD_DC_PRE_VEGA option makes the bug
go away.

I was hoping to get your feedback, since you are the patch author. Do
you think gathering any additional data will help diagnose this issue?

Thanks,

Joe

[0] http://pad.lv/1761751
[PATCH v3 1/3] drm/prime: Iterate SG DMA addresses separately
For dma_map_sg(), DMA API implementations are free to merge consecutive
segments into a single DMA mapping if conditions are suitable, thus the
resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
iterates over may be packed into fewer entries than sgt->nents implies.

The current implementation does not account for this, meaning that its
callers either have to reject the 0 < count < nents case or risk getting
bogus DMA addresses beyond the first segment. Fortunately this is quite
easy to handle without having to rejig structures to also store the
mapped count, since the total DMA length should still be equal to the
total buffer length. All we need is a second scatterlist cursor to
iterate through the DMA addresses independently of the page addresses.

Reviewed-by: Christian König
Signed-off-by: Robin Murphy
---

v3: Move dma_len == 0 logic earlier to avoid iterating dma_sg too far

 drivers/gpu/drm/drm_prime.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..3e74c84d0baf 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -933,16 +933,24 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 				     dma_addr_t *addrs, int max_entries)
 {
 	unsigned count;
-	struct scatterlist *sg;
+	struct scatterlist *sg, *dma_sg;
 	struct page *page;
-	u32 len, index;
+	u32 len, dma_len, index;
 	dma_addr_t addr;
 
 	index = 0;
+	dma_sg = sgt->sgl;
+	dma_len = sg_dma_len(dma_sg);
+	addr = sg_dma_address(dma_sg);
 	for_each_sg(sgt->sgl, sg, sgt->nents, count) {
 		len = sg->length;
 		page = sg_page(sg);
-		addr = sg_dma_address(sg);
+
+		if (addrs && dma_len == 0) {
+			dma_sg = sg_next(dma_sg);
+			dma_len = sg_dma_len(dma_sg);
+			addr = sg_dma_address(dma_sg);
+		}
 
 		while (len > 0) {
 			if (WARN_ON(index >= max_entries))
@@ -955,6 +963,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 			page++;
 			addr += PAGE_SIZE;
 			len -= PAGE_SIZE;
+			dma_len -= PAGE_SIZE;
 			index++;
 		}
 	}
-- 
2.17.0.dirty
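
For readers who want to see the two-cursor walk outside the kernel, here is a minimal user-space sketch of the same idea. It is an illustration under stated assumptions, not the driver code: `struct seg_model`, `MODEL_PAGE_SIZE` and `fill_dma_addrs()` are hypothetical names, all lengths are assumed to be multiples of the page size, and each merged DMA segment is assumed to end on a CPU segment boundary (which is what the `dma_len == 0` check at the top of each outer iteration relies on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size for the model */

/* Hypothetical stand-in for one scatterlist entry. */
struct seg_model {
    uint32_t length;      /* CPU-side length, like sg->length */
    uint64_t dma_address; /* like sg_dma_address() */
    uint32_t dma_length;  /* like sg_dma_len(); 0 if unused after merging */
};

/*
 * Write one DMA address per page into addrs[].
 * Returns the number of entries written, or -1 if max_entries is too small.
 * Requires nents >= 1.
 */
static int fill_dma_addrs(const struct seg_model *segs, size_t nents,
                          uint64_t *addrs, int max_entries)
{
    size_t dma_idx = 0;                    /* second cursor: DMA segments */
    uint32_t dma_len = segs[0].dma_length;
    uint64_t addr = segs[0].dma_address;
    int index = 0;

    for (size_t i = 0; i < nents; i++) {   /* first cursor: CPU segments */
        uint32_t len = segs[i].length;

        /* Current DMA segment exhausted: advance the DMA cursor only. */
        if (dma_len == 0) {
            dma_idx++;
            assert(dma_idx < nents);
            dma_len = segs[dma_idx].dma_length;
            addr = segs[dma_idx].dma_address;
        }

        while (len > 0) {
            if (index >= max_entries)
                return -1;
            addrs[index++] = addr;
            addr += MODEL_PAGE_SIZE;
            len -= MODEL_PAGE_SIZE;
            dma_len -= MODEL_PAGE_SIZE;
        }
    }
    return index;
}
```

With three CPU segments of one, two and one pages, where the mapping merged the first two into a single three-page DMA segment, the DMA cursor advances only once while the CPU cursor visits all three entries. The old single-cursor loop would instead have read the DMA address of every entry, including the trailing unused one.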