[PATCH] drm/amd/display: remove need of modeset flag for overlay planes (V2)

2018-05-01 Thread Shirish S
This patch continues the work started in
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state",
which began removing the dependency on user space setting
DRM_MODE_ATOMIC_ALLOW_MODESET, a flag that user space is not
required to set.

With the check deferred, this patch removes the dependency on the flag
for overlay planes.

This has to be done in stages, as the change is fairly complex and requires
thorough testing before we also free primary planes from the dependency on
the modeset flag.

V2: Simplified the plane type check.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1a63c04..045e5df 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
}
spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
 
-   if (!pflip_needed) {
+   if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {
WARN_ON(!dm_new_plane_state->dc_state);
 
plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;
@@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
 
/* Remove any changed/removed planes */
if (!enable) {
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane->type != DRM_PLANE_TYPE_OVERLAY)
continue;
 
if (!old_plane_crtc)
@@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
if (!dm_new_crtc_state->stream)
continue;
 
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane->type != DRM_PLANE_TYPE_OVERLAY)
continue;
 
WARN_ON(dm_new_plane_state->dc_state);
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
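
For readers following the patch above, the condition it changes can be modelled in isolation. This is only a minimal sketch, not the driver code: `enum plane_type`, `skip_plane_update_old()` and `skip_plane_update_new()` are hypothetical names standing in for `plane->type` and the `continue` checks in `dm_update_planes_state()`, and `pflip_needed` is assumed to be true for a commit that arrives without DRM_MODE_ATOMIC_ALLOW_MODESET.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the DRM plane types referenced by the patch. */
enum plane_type {
	PLANE_TYPE_OVERLAY,
	PLANE_TYPE_PRIMARY,
	PLANE_TYPE_CURSOR,
};

/* Before the patch: any plane is skipped when only a page flip is needed. */
bool skip_plane_update_old(bool pflip_needed, enum plane_type type)
{
	(void)type;
	return pflip_needed;
}

/* After the patch: overlay planes are never skipped on pflip_needed alone. */
bool skip_plane_update_new(bool pflip_needed, enum plane_type type)
{
	return pflip_needed && type != PLANE_TYPE_OVERLAY;
}
```

With `pflip_needed` set, the old predicate skips an overlay plane while the new one lets it through; primary planes behave the same in both, which matches the staged approach described in the commit message.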


Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes

2018-05-01 Thread S, Shirish



On 5/2/2018 12:53 AM, Stéphane Marchesin wrote:

On Fri, Apr 27, 2018 at 3:27 AM Shirish S  wrote:


This patch continues the work started in
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state",
which began removing the dependency on user space setting
DRM_MODE_ATOMIC_ALLOW_MODESET, a flag that user space is not required to set.
With the check deferred, this patch removes the dependency on the flag
for overlay planes.
This has to be done in stages, as the change is fairly complex and requires
thorough testing before we also free primary planes from the dependency on
the modeset flag.
Signed-off-by: Shirish S 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++++---
   1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 1a63c04..87b661d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
  }
  spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
-   if (!pflip_needed) {
+   if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {
  WARN_ON(!dm_new_plane_state->dc_state);
  plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;

@@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
  /* Remove any changed/removed planes */
  if (!enable) {
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane && plane->type != DRM_PLANE_TYPE_OVERLAY)

nit: I don't think we need to check that plane is non-NULL

Agreed, I was a bit over-cautious.
I have removed it in V2.
Thanks.
Regards,
Shirish S

Stéphane


  continue;
  if (!old_plane_crtc)
@@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
  if (!dm_new_crtc_state->stream)
  continue;
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane && plane->type != DRM_PLANE_TYPE_OVERLAY)

  continue;
  WARN_ON(dm_new_plane_state->dc_state);
--
2.7.4


Re: Tracking: radeon 0000:00:10.0: ring 0 stalled for more than 10240msec

2018-05-01 Thread Qu, Jim
Hi,

If you are sure that the HW worked fine before, I think you should:

1. Make sure that the HW still works fine now.
2. Roll the driver stack back to the point where it worked, then update the
components one by one to find the one that causes the issue.
3. Try updating to the latest VBIOS to match the new driver.

Thanks
JimQu
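
Step 2 above amounts to searching an ordered list of versions for the first one that fails. When the candidates can be ordered and the failure persists once introduced, halving the range needs fewer test runs than replacing components one by one. The following is only a sketch under those assumptions; `first_bad_version()`, `works_fn` and `works_below_threshold()` are hypothetical names, and the predicate stands in for a manual "boot and try kodi" test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical test hook: returns true if version `idx` works. */
typedef bool (*works_fn)(size_t idx, void *ctx);

/*
 * Return the index of the first failing version in (good, bad].
 * Assumes version `good` works, version `bad` fails, and every version
 * after the first failure also fails.
 */
size_t first_bad_version(size_t good, size_t bad, works_fn works, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (works(mid, ctx))
			good = mid;	/* the breakage is after mid */
		else
			bad = mid;	/* the breakage is at or before mid */
	}
	return bad;
}

/* Example predicate: versions below the threshold stored in *ctx work. */
bool works_below_threshold(size_t idx, void *ctx)
{
	return idx < *(const size_t *)ctx;
}
```

For a range of 100 versions this takes about seven test runs instead of up to 100. It only helps once the failing component (Mesa, DDX or kernel) has been narrowed down, since each component has its own history.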


From: amd-gfx on behalf of Christian König

Sent: April 30, 2018 1:16:14
To: Mathieu Malaterre; Deucher, Alexander
Cc: David Airlie; Zhou, David(ChunMing); dri-devel; amd-gfx@lists.freedesktop.org; LKML
Subject: Re: Tracking: radeon 0000:00:10.0: ring 0 stalled for more than 10240msec

Am 23.04.2018 um 20:50 schrieb Mathieu Malaterre:
> Hi there,
>
> I am pretty sure I was able to run kodi on an old Mac Mini G4 (big
> endian) with AMD RV280. Today it is failing to start with:

Well, that is rather old hardware. I suggest first making sure that the
hw isn't broken in some way.

> How should I go and debug this (other than plain git-bisect) ?

You first need to figure out which component is failing. Mesa, the
DDX and the kernel are all possible candidates.

Another possibility is that you updated kodi and kodi is now doing
something the hw doesn't like.

Regards,
Christian.


Re: vcn regression on raven1

2018-05-01 Thread Zhang, Jerry (Junwei)

Hi Tom,

Ah, I see what you mean.
Please check it against the latest drm-next from gerrit tomorrow.

Jerry

On 05/02/2018 09:41 AM, StDenis, Tom wrote:

Hi Jerry,

Like I said it's (now well) past EOD (meaning my workstation is powered off) so 
I'll have to check tomorrow.  But I do pull from gerrit daily and build from 
that.

I'll take a look in the morning.

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:39
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

Do you mean you cannot find the patch in gerrit/amd-staging-dkms-next either?

I can find it there.

The tip of gerrit/amd-staging-drm-next is
* bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait to recover from ring hang.

while the tip of freedesktop is
* a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for VEGAM

Jerry

On 05/02/2018 09:29 AM, StDenis, Tom wrote:

I pull from gerrit.  I'm just pointing out that it's not on drm-next upstream 
either.

It may have been missed in a rebase or something.

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:07
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately, freedesktop appears to lag behind in syncing code from the internal drm-next.
That gap is what caused the issue in your test.

Hi Alex,

Is there a problem with code syncing between freedesktop and the internal drm-next?
Or is the sync delay a known issue?

Jerry

On 05/02/2018 08:57 AM, StDenis, Tom wrote:

Hi Jerry,

It's well past EOD for me I'll pick this up in the morning.

I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as 
of my pull this morning though.

If it's in there and I missed it somehow I apologize otherwise it'd be nice to 
make sure it's in there.

Based on the public copy of the tree it's not there

https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:52
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It landed in the latest drm-next as
  * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included?
If not, please try the latest drm-next.
Judging from the log, they look like the same issue.

Jerry

On 05/02/2018 08:47 AM, StDenis, Tom wrote:

Hi Jerry,

So far as I know this wasn't included on the tip of drm-next.  I hit this this 
morning in my semi-regular pull/build/test cycle.

Was this missed in a recent rebase?

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:43
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

On 05/01/2018 09:34 PM, Tom St Denis wrote:

Hi all,

I've noticed that on the tip of drm-next vcn playback of video is broken (see
dmesg below).  I've bisected it to this commit


It may be fixed here as a common issue.

   * https://patchwork.freedesktop.org/patch/218909/

Jerry



[root@raven linux]# git bisect good
701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
commit 701372349fd55b5396b335580e979ac4dde3dd02
Author: Alex Deucher 
Date:   Tue Mar 27 17:10:56 2018 -0500

 drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb 
flush

 Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
 it provides a write and wait in a single packet which avoids a missed
 ack if a world switch happens between the request and waiting for the
 ack.

 Reviewed-by: Huang Rui 
 Reviewed-by: Christian König 
 Signed-off-by: Alex Deucher 

:040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers

Which is odd because the commit before this is the vcn change and it works fine
(playing BBB right now).

Here's the dmesg:

[ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at

[ 2925.640113] IP:   (null)
[ 2925.640116] PGD 0 P4D 0
[ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
[ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
gpu_sched ttm ax88179_178a usbnet
[ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
[ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
B350M-PLUS GAMING, BIOS 3803 01/22/2018
[ 

Re: vcn regression on raven1

2018-05-01 Thread StDenis, Tom
Hi Jerry,

Like I said it's (now well) past EOD (meaning my workstation is powered off) so 
I'll have to check tomorrow.  But I do pull from gerrit daily and build from 
that.

I'll take a look in the morning.

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:39
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

Do you mean you cannot find the patch in gerrit/amd-staging-dkms-next either?

I can find it there.

The tip of gerrit/amd-staging-drm-next is
   * bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait to recover from ring hang.

while the tip of freedesktop is
   * a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for VEGAM

Jerry

On 05/02/2018 09:29 AM, StDenis, Tom wrote:
> I pull from gerrit.  I'm just pointing out that it's not on drm-next upstream 
> either.
>
> It may have been missed in a rebase or something.
>
> Tom
> 
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 21:07
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> Hi Tom,
>
> It sounds like you got the code from freedesktop rather than the internal drm-next.
> Unfortunately, freedesktop appears to lag behind in syncing code from the internal drm-next.
> That gap is what caused the issue in your test.
>
> Hi Alex,
>
> Is there a problem with code syncing between freedesktop and the internal drm-next?
> Or is the sync delay a known issue?
>
> Jerry
>
> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> It's well past EOD for me I'll pick this up in the morning.
>>
>> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next 
>> as of my pull this morning though.
>>
>> If it's in there and I missed it somehow I apologize otherwise it'd be nice 
>> to make sure it's in there.
>>
>> Based on the public copy of the tree it's not there
>>
>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>
>> Cheers,
>> Tom
>> 
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 20:52
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> Hi Tom,
>>
>> It landed in the latest drm-next as
>>  * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback
>>
>> Did you test with that included?
>> If not, please try the latest drm-next.
>> Judging from the log, they look like the same issue.
>>
>> Jerry
>>
>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>> Hi Jerry,
>>>
>>> So far as I know this wasn't included on the tip of drm-next.  I hit this 
>>> this morning in my semi-regular pull/build/test cycle.
>>>
>>> Was this missed in a recent rebase?
>>>
>>> Tom
>>> 
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 20:43
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
 Hi all,

 I've noticed that on the tip of drm-next vcn playback of video is broken 
 (see
 dmesg below).  I've bisected it to this commit
>>>
>>> It may be fixed here as a common issue.
>>>
>>>   * https://patchwork.freedesktop.org/patch/218909/
>>>
>>> Jerry
>>>

 [root@raven linux]# git bisect good
 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
 commit 701372349fd55b5396b335580e979ac4dde3dd02
 Author: Alex Deucher 
 Date:   Tue Mar 27 17:10:56 2018 -0500

 drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu 
 tlb flush

 Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support 
 it,
 it provides a write and wait in a single packet which avoids a 
 missed
 ack if a world switch happens between the request and waiting for 
 the
 ack.

 Reviewed-by: Huang Rui 
 Reviewed-by: Christian König 
 Signed-off-by: Alex Deucher 

 :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers

 Which is odd because the commit before this is the vcn change and it works 
 fine
 (playing BBB right now).

 Here's the dmesg:

 [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
 
 [ 2925.640113] IP:   (null)
 [ 2925.640116] PGD 0 P4D 0
 [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
 [ 2925.640126] Modules linked in: tun 

Re: vcn regression on raven1

2018-05-01 Thread Zhang, Jerry (Junwei)

Hi Tom,

Do you mean you cannot find the patch in gerrit/amd-staging-dkms-next either?

I can find it there.

The tip of gerrit/amd-staging-drm-next is
  * bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait to recover from ring hang.

while the tip of freedesktop is
  * a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for VEGAM


Jerry

On 05/02/2018 09:29 AM, StDenis, Tom wrote:

I pull from gerrit.  I'm just pointing out that it's not on drm-next upstream 
either.

It may have been missed in a rebase or something.

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:07
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately, freedesktop appears to lag behind in syncing code from the internal drm-next.
That gap is what caused the issue in your test.

Hi Alex,

Is there a problem with code syncing between freedesktop and the internal drm-next?
Or is the sync delay a known issue?

Jerry

On 05/02/2018 08:57 AM, StDenis, Tom wrote:

Hi Jerry,

It's well past EOD for me I'll pick this up in the morning.

I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as 
of my pull this morning though.

If it's in there and I missed it somehow I apologize otherwise it'd be nice to 
make sure it's in there.

Based on the public copy of the tree it's not there

https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:52
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It landed in the latest drm-next as
 * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included?
If not, please try the latest drm-next.
Judging from the log, they look like the same issue.

Jerry

On 05/02/2018 08:47 AM, StDenis, Tom wrote:

Hi Jerry,

So far as I know this wasn't included on the tip of drm-next.  I hit this this 
morning in my semi-regular pull/build/test cycle.

Was this missed in a recent rebase?

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:43
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

On 05/01/2018 09:34 PM, Tom St Denis wrote:

Hi all,

I've noticed that on the tip of drm-next vcn playback of video is broken (see
dmesg below).  I've bisected it to this commit


It may be fixed here as a common issue.

  * https://patchwork.freedesktop.org/patch/218909/

Jerry



[root@raven linux]# git bisect good
701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
commit 701372349fd55b5396b335580e979ac4dde3dd02
Author: Alex Deucher 
Date:   Tue Mar 27 17:10:56 2018 -0500

drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb 
flush

Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
it provides a write and wait in a single packet which avoids a missed
ack if a world switch happens between the request and waiting for the
ack.

Reviewed-by: Huang Rui 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 

:040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers

Which is odd because the commit before this is the vcn change and it works fine
(playing BBB right now).

Here's the dmesg:

[ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at

[ 2925.640113] IP:   (null)
[ 2925.640116] PGD 0 P4D 0
[ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
[ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
gpu_sched ttm ax88179_178a usbnet
[ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
[ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
B350M-PLUS GAMING, BIOS 3803 01/22/2018
[ 2925.640146] RIP: 0010:  (null)
[ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
[ 2925.640153] RAX:  RBX: 8801d8b38420 RCX: 007c0080
[ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 8801d8b38420
[ 2925.640159] RBP: 0001a6fa R08: 0080 R09: ed003aa9eef9
[ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 8801d8b3277c
[ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: 
[ 2925.640168] FS:  () GS:8801dcf0()
knlGS:
[ 2925.640171] 

Re: vcn regression on raven1

2018-05-01 Thread StDenis, Tom
I pull from gerrit.  I'm just pointing out that it's not on drm-next upstream 
either.

It may have been missed in a rebase or something.

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 21:07
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately, freedesktop appears to lag behind in syncing code from the internal drm-next.
That gap is what caused the issue in your test.

Hi Alex,

Is there a problem with code syncing between freedesktop and the internal drm-next?
Or is the sync delay a known issue?

Jerry

On 05/02/2018 08:57 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> It's well past EOD for me I'll pick this up in the morning.
>
> I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next 
> as of my pull this morning though.
>
> If it's in there and I missed it somehow I apologize otherwise it'd be nice 
> to make sure it's in there.
>
> Based on the public copy of the tree it's not there
>
> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>
> Cheers,
> Tom
> 
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 20:52
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> Hi Tom,
>
> It landed in the latest drm-next as
> * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback
>
> Did you test with that included?
> If not, please try the latest drm-next.
> Judging from the log, they look like the same issue.
>
> Jerry
>
> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> So far as I know this wasn't included on the tip of drm-next.  I hit this 
>> this morning in my semi-regular pull/build/test cycle.
>>
>> Was this missed in a recent rebase?
>>
>> Tom
>> 
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 20:43
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>> Hi all,
>>>
>>> I've noticed that on the tip of drm-next vcn playback of video is broken 
>>> (see
>>> dmesg below).  I've bisected it to this commit
>>
>> It may be fixed here as a common issue.
>>
>>  * https://patchwork.freedesktop.org/patch/218909/
>>
>> Jerry
>>
>>>
>>> [root@raven linux]# git bisect good
>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>> Author: Alex Deucher 
>>> Date:   Tue Mar 27 17:10:56 2018 -0500
>>>
>>>drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb 
>>> flush
>>>
>>>Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
>>>it provides a write and wait in a single packet which avoids a missed
>>>ack if a world switch happens between the request and waiting for the
>>>ack.
>>>
>>>Reviewed-by: Huang Rui 
>>>Reviewed-by: Christian König 
>>>Signed-off-by: Alex Deucher 
>>>
>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers
>>>
>>> Which is odd because the commit before this is the vcn change and it works 
>>> fine
>>> (playing BBB right now).
>>>
>>> Here's the dmesg:
>>>
>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>>> 
>>> [ 2925.640113] IP:   (null)
>>> [ 2925.640116] PGD 0 P4D 0
>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>>> gpu_sched ttm ax88179_178a usbnet
>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>> [ 2925.640146] RIP: 0010:  (null)
>>> [ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
>>> [ 2925.640153] RAX:  RBX: 8801d8b38420 RCX: 
>>> 007c0080
>>> [ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 
>>> 8801d8b38420
>>> [ 2925.640159] RBP: 0001a6fa R08: 0080 R09: 
>>> ed003aa9eef9
>>> [ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 
>>> 8801d8b3277c
>>> [ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: 
>>> 
>>> [ 2925.640168] FS:  () GS:8801dcf0()
>>> knlGS:
>>> [ 2925.640171] CS:  0010 DS:  ES:  CR0: 80050033
>>> [ 2925.640174] CR2:  

Re: vcn regression on raven1

2018-05-01 Thread Zhang, Jerry (Junwei)

Hi Tom,

It sounds like you got the code from freedesktop rather than the internal drm-next.
Unfortunately, freedesktop appears to lag behind in syncing code from the internal drm-next.
That gap is what caused the issue in your test.

Hi Alex,

Is there a problem with code syncing between freedesktop and the internal drm-next?
Or is the sync delay a known issue?

Jerry

On 05/02/2018 08:57 AM, StDenis, Tom wrote:

Hi Jerry,

It's well past EOD for me I'll pick this up in the morning.

I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as 
of my pull this morning though.

If it's in there and I missed it somehow I apologize otherwise it'd be nice to 
make sure it's in there.

Based on the public copy of the tree it's not there

https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:52
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It landed in the latest drm-next as
* 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included?
If not, please try the latest drm-next.
Judging from the log, they look like the same issue.

Jerry

On 05/02/2018 08:47 AM, StDenis, Tom wrote:

Hi Jerry,

So far as I know this wasn't included on the tip of drm-next.  I hit this this 
morning in my semi-regular pull/build/test cycle.

Was this missed in a recent rebase?

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:43
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

On 05/01/2018 09:34 PM, Tom St Denis wrote:

Hi all,

I've noticed that on the tip of drm-next vcn playback of video is broken (see
dmesg below).  I've bisected it to this commit


It may be fixed here as a common issue.

 * https://patchwork.freedesktop.org/patch/218909/

Jerry



[root@raven linux]# git bisect good
701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
commit 701372349fd55b5396b335580e979ac4dde3dd02
Author: Alex Deucher 
Date:   Tue Mar 27 17:10:56 2018 -0500

   drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush

   Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
   it provides a write and wait in a single packet which avoids a missed
   ack if a world switch happens between the request and waiting for the
   ack.

   Reviewed-by: Huang Rui 
   Reviewed-by: Christian König 
   Signed-off-by: Alex Deucher 

:040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers

Which is odd because the commit before this is the vcn change and it works fine
(playing BBB right now).

Here's the dmesg:

[ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at

[ 2925.640113] IP:   (null)
[ 2925.640116] PGD 0 P4D 0
[ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
[ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
gpu_sched ttm ax88179_178a usbnet
[ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
[ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
B350M-PLUS GAMING, BIOS 3803 01/22/2018
[ 2925.640146] RIP: 0010:  (null)
[ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
[ 2925.640153] RAX:  RBX: 8801d8b38420 RCX: 007c0080
[ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 8801d8b38420
[ 2925.640159] RBP: 0001a6fa R08: 0080 R09: ed003aa9eef9
[ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 8801d8b3277c
[ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: 
[ 2925.640168] FS:  () GS:8801dcf0()
knlGS:
[ 2925.640171] CS:  0010 DS:  ES:  CR0: 80050033
[ 2925.640174] CR2:  CR3: 0001d9712000 CR4: 003406e0
[ 2925.640176] Call Trace:
[ 2925.640272]  ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
[ 2925.640368]  ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
[ 2925.640459]  ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
[ 2925.640545]  ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
[ 2925.640640]  ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
[ 2925.640725]  ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
[ 2925.640810]  ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
[ 2925.640897]  ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
[ 2925.641003]  ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
[ 2925.641095]  ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
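
A note on reading the trace above: the value after "Oops:" is the x86 page-fault error code in hexadecimal. The sketch below decodes it, assuming the usual x86 bit layout; `struct pf_info` and `decode_pf_error_code()` are hypothetical helper names, not kernel APIs. For 0x0010 only the instruction-fetch bit is set, which together with `IP: (null)` suggests the CPU tried to execute from address 0, as would happen on a call through a NULL function pointer.

```c
#include <stdbool.h>

/* x86 page-fault error code bits (assumed standard layout). */
#define PF_PROT		(1u << 0)	/* 0: page not present, 1: protection fault */
#define PF_WRITE	(1u << 1)	/* 0: read access, 1: write access */
#define PF_USER		(1u << 2)	/* 0: kernel mode, 1: user mode */
#define PF_RSVD		(1u << 3)	/* reserved bit set in a paging entry */
#define PF_INSTR	(1u << 4)	/* fault on an instruction fetch */

struct pf_info {
	bool protection_fault;
	bool write;
	bool user_mode;
	bool reserved_bit;
	bool instruction_fetch;
};

/* Split a page-fault error code into its individual flags. */
struct pf_info decode_pf_error_code(unsigned long code)
{
	struct pf_info info = {
		.protection_fault = (code & PF_PROT) != 0,
		.write = (code & PF_WRITE) != 0,
		.user_mode = (code & PF_USER) != 0,
		.reserved_bit = (code & PF_RSVD) != 0,
		.instruction_fetch = (code & PF_INSTR) != 0,
	};

	return info;
}
```

Decoding `0x0010` this way gives a kernel-mode instruction fetch from a page that is not present.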

Re: vcn regression on raven1

2018-05-01 Thread StDenis, Tom
Hi Jerry,

It's well past EOD for me, so I'll pick this up in the morning.

I'm fairly certain I wrote my patch against the tip of amd-staging-drm-next as
of my pull this morning, though.

If it's in there and I missed it somehow, I apologize; otherwise it'd be nice to
make sure it's in there.

Based on the public copy of the tree, it's not there:

https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110

Cheers,
Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:52
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

Hi Tom,

It landed in the latest drm-next as
   * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included?
If not, please try the latest drm-next.
Judging from the log, they look like the same issue.

Jerry

On 05/02/2018 08:47 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> So far as I know this wasn't included on the tip of drm-next.  I hit this 
> this morning in my semi-regular pull/build/test cycle.
>
> Was this missed in a recent rebase?
>
> Tom
> 
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 20:43
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>> Hi all,
>>
>> I've noticed that on the tip of drm-next vcn playback of video is broken (see
>> dmesg below).  I've bisected it to this commit
>
> It may be fixed here as a common issue.
>
> * https://patchwork.freedesktop.org/patch/218909/
>
> Jerry
>
>>
>> [root@raven linux]# git bisect good
>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>> Author: Alex Deucher 
>> Date:   Tue Mar 27 17:10:56 2018 -0500
>>
>>   drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb 
>> flush
>>
>>   Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
>>   it provides a write and wait in a single packet which avoids a missed
>>   ack if a world switch happens between the request and waiting for the
>>   ack.
>>
>>   Reviewed-by: Huang Rui 
>>   Reviewed-by: Christian König 
>>   Signed-off-by: Alex Deucher 
>>
>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers
>>
>> Which is odd because the commit before this is the vcn change and it works 
>> fine
>> (playing BBB right now).
>>
>> Here's the dmesg:
>>
>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at
>> 
>> [ 2925.640113] IP:   (null)
>> [ 2925.640116] PGD 0 P4D 0
>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
>> gpu_sched ttm ax88179_178a usbnet
>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
>> B350M-PLUS GAMING, BIOS 3803 01/22/2018
>> [ 2925.640146] RIP: 0010:  (null)
>> [ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
>> [ 2925.640153] RAX:  RBX: 8801d8b38420 RCX: 
>> 007c0080
>> [ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 
>> 8801d8b38420
>> [ 2925.640159] RBP: 0001a6fa R08: 0080 R09: 
>> ed003aa9eef9
>> [ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 
>> 8801d8b3277c
>> [ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: 
>> 
>> [ 2925.640168] FS:  () GS:8801dcf0()
>> knlGS:
>> [ 2925.640171] CS:  0010 DS:  ES:  CR0: 80050033
>> [ 2925.640174] CR2:  CR3: 0001d9712000 CR4: 
>> 003406e0
>> [ 2925.640176] Call Trace:
>> [ 2925.640272]  ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>> [ 2925.640368]  ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>> [ 2925.640459]  ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>> [ 2925.640545]  ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>> [ 2925.640640]  ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>> [ 2925.640725]  ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>> [ 2925.640810]  ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>> [ 2925.640897]  ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>> [ 2925.641003]  ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>> [ 2925.641095]  ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>> [ 2925.641102]  ? dma_fence_add_callback+0x15f/0x360
>> [ 2925.641201]  ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>> [ 2925.641297]  ? 

Re: vcn regression on raven1

2018-05-01 Thread Zhang, Jerry (Junwei)

Hi Tom,

It landed in the latest drm-next as:
  * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback

Did you test with that included?
If not, please try the latest drm-next.
From the log, they look like the same issue.

Jerry

On 05/02/2018 08:47 AM, StDenis, Tom wrote:
> Hi Jerry,
>
> So far as I know this wasn't included on the tip of drm-next.  I hit this this
> morning in my semi-regular pull/build/test cycle.
>
> Was this missed in a recent rebase?
>
> Tom
>
> From: Zhang, Jerry
> Sent: Tuesday, May 1, 2018 20:43
> To: StDenis, Tom; Deucher, Alexander
> Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
> Subject: Re: vcn regression on raven1
>
> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>> Hi all,
>>
>> I've noticed that on the tip of drm-next vcn playback of video is broken (see
>> dmesg below).  I've bisected it to this commit
>
> It may be fixed here as a common issue.
>
> * https://patchwork.freedesktop.org/patch/218909/
>
> Jerry




Re: vcn regression on raven1

2018-05-01 Thread StDenis, Tom
Hi Jerry,

So far as I know this wasn't included on the tip of drm-next.  I hit this this 
morning in my semi-regular pull/build/test cycle.

Was this missed in a recent rebase?

Tom

From: Zhang, Jerry
Sent: Tuesday, May 1, 2018 20:43
To: StDenis, Tom; Deucher, Alexander
Cc: Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: Re: vcn regression on raven1

On 05/01/2018 09:34 PM, Tom St Denis wrote:
> Hi all,
>
> I've noticed that on the tip of drm-next vcn playback of video is broken (see
> dmesg below).  I've bisected it to this commit

It may be fixed here as a common issue.

   * https://patchwork.freedesktop.org/patch/218909/

Jerry


Re: vcn regression on raven1

2018-05-01 Thread Zhang, Jerry (Junwei)

On 05/01/2018 09:34 PM, Tom St Denis wrote:

Hi all,

I've noticed that on the tip of drm-next vcn playback of video is broken (see
dmesg below).  I've bisected it to this commit


It may be fixed here as a common issue.

  * https://patchwork.freedesktop.org/patch/218909/

Jerry



[root@raven linux]# git bisect good
701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
commit 701372349fd55b5396b335580e979ac4dde3dd02
Author: Alex Deucher 
Date:   Tue Mar 27 17:10:56 2018 -0500

 drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush

 Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
 it provides a write and wait in a single packet which avoids a missed
 ack if a world switch happens between the request and waiting for the
 ack.

 Reviewed-by: Huang Rui 
 Reviewed-by: Christian König 
 Signed-off-by: Alex Deucher 

:04 04 4e4312de03f4b34abd65f4bb12dba4c7093055ba
ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers

Which is odd because the commit before this is the vcn change and it works fine
(playing BBB right now).

Here's the dmesg:

[ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at

[ 2925.640113] IP:   (null)
[ 2925.640116] PGD 0 P4D 0
[ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
[ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash
gpu_sched ttm ax88179_178a usbnet
[ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
[ 2925.640142] Hardware name: System manufacturer System Product Name/TUF
B350M-PLUS GAMING, BIOS 3803 01/22/2018
[ 2925.640146] RIP: 0010:  (null)
[ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
[ 2925.640153] RAX:  RBX: 8801d8b38420 RCX: 007c0080
[ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 8801d8b38420
[ 2925.640159] RBP: 0001a6fa R08: 0080 R09: ed003aa9eef9
[ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 8801d8b3277c
[ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: 
[ 2925.640168] FS:  () GS:8801dcf0()
knlGS:
[ 2925.640171] CS:  0010 DS:  ES:  CR0: 80050033
[ 2925.640174] CR2:  CR3: 0001d9712000 CR4: 003406e0
[ 2925.640176] Call Trace:
[ 2925.640272]  ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
[ 2925.640368]  ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
[ 2925.640459]  ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
[ 2925.640545]  ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
[ 2925.640640]  ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
[ 2925.640725]  ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
[ 2925.640810]  ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
[ 2925.640897]  ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
[ 2925.641003]  ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
[ 2925.641095]  ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
[ 2925.641102]  ? dma_fence_add_callback+0x15f/0x360
[ 2925.641201]  ? amdgpu_job_run+0x32f/0x370 [amdgpu]
[ 2925.641297]  ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
[ 2925.641302]  ? __queue_delayed_work+0x144/0x1d0
[ 2925.641306]  ? delayed_work_timer_fn+0x40/0x40
[ 2925.641312]  ? prepare_to_wait_exclusive+0x1d0/0x1d0
[ 2925.641318]  ? drm_sched_main+0x68c/0x940 [gpu_sched]
[ 2925.641323]  ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
[ 2925.641328]  ? save_stack+0x89/0xb0
[ 2925.641332]  ? wait_woken+0x110/0x110
[ 2925.641337]  ? ret_from_fork+0x22/0x40
[ 2925.641343]  ? __schedule+0xd30/0xd30
[ 2925.641346]  ? remove_wait_queue+0x150/0x150
[ 2925.641353]  ? rcu_note_context_switch+0x2a0/0x2a0
[ 2925.641359]  ? __lock_text_start+0x8/0x8
[ 2925.641367]  ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
[ 2925.641371]  ? kthread+0x19b/0x1c0
[ 2925.641376]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 2925.641382]  ? ret_from_fork+0x22/0x40
[ 2925.641387] Code:  Bad RIP value.
[ 2925.641397] RIP:   (null) RSP: 8801d54f7790
[ 2925.641400] CR2: 
[ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---


Note that regular compute/gfx workflows work fine on the tip of drm-next; only
vcn playback triggers this (haven't tried encode yet...).

Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
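
A note on the oops in this thread: RIP is (null), which is what a call through an unset function pointer looks like, and the fix Jerry points to adds an emit_reg_write_reg_wait ring callback. The sketch below models that failure mode and one defensive pattern in plain C. It is only an illustration under assumptions: struct ring, struct ring_funcs, ring_write_then_wait() and the demo_* helpers are hypothetical stand-ins, not the amdgpu definitions, and the fallback shown is not claimed to be what the referenced patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ring;

/* Hypothetical per-ring callback table (not the real amdgpu structure). */
struct ring_funcs {
	void (*emit_wreg)(struct ring *ring, uint32_t reg, uint32_t val);
	void (*emit_reg_wait)(struct ring *ring, uint32_t reg,
			      uint32_t val, uint32_t mask);
	/* Optional: write reg0 and wait on reg1 in one packet. May be NULL. */
	void (*emit_reg_write_reg_wait)(struct ring *ring, uint32_t reg0,
					uint32_t reg1, uint32_t ref,
					uint32_t mask);
};

struct ring {
	const struct ring_funcs *funcs;
	unsigned int packets;	/* packets emitted so far */
};

static void demo_wreg(struct ring *ring, uint32_t reg, uint32_t val)
{
	(void)reg; (void)val;
	ring->packets++;
}

static void demo_wait(struct ring *ring, uint32_t reg,
		      uint32_t val, uint32_t mask)
{
	(void)reg; (void)val; (void)mask;
	ring->packets++;
}

static void demo_write_wait(struct ring *ring, uint32_t reg0, uint32_t reg1,
			    uint32_t ref, uint32_t mask)
{
	(void)reg0; (void)reg1; (void)ref; (void)mask;
	ring->packets++;
}

/* An engine that only implements the two separate packets. */
static const struct ring_funcs split_only_funcs = {
	.emit_wreg = demo_wreg,
	.emit_reg_wait = demo_wait,
	/* .emit_reg_write_reg_wait is left NULL on purpose */
};

/* An engine that also implements the combined packet. */
static const struct ring_funcs combined_funcs = {
	.emit_wreg = demo_wreg,
	.emit_reg_wait = demo_wait,
	.emit_reg_write_reg_wait = demo_write_wait,
};

void ring_write_then_wait(struct ring *ring, uint32_t reg0, uint32_t reg1,
			  uint32_t ref, uint32_t mask)
{
	assert(ring != NULL && ring->funcs != NULL);

	if (ring->funcs->emit_reg_write_reg_wait) {
		ring->funcs->emit_reg_write_reg_wait(ring, reg0, reg1,
						     ref, mask);
		return;
	}

	/* Calling the NULL member unconditionally would jump to address 0.
	 * Fall back to the two separate packets instead. */
	assert(ring->funcs->emit_wreg && ring->funcs->emit_reg_wait);
	ring->funcs->emit_wreg(ring, reg0, ref);
	ring->funcs->emit_reg_wait(ring, reg1, mask, mask);
}
```

With split_only_funcs the call emits two packets; with combined_funcs it emits one. Either way the caller never dereferences a missing callback.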



[PATCH 04/12] drm/amdkfd: use %px to print user space address instead of %p

2018-05-01 Thread Felix Kuehling
From: Philip Yang 

Signed-off-by: Philip Yang 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c   | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1a4d8dc..4ced5e9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -233,7 +233,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
pr_debug("Queue Size: 0x%llX, %u\n",
q_properties->queue_size, args->ring_size);
 
-   pr_debug("Queue r/w Pointers: %p, %p\n",
+   pr_debug("Queue r/w Pointers: %px, %px\n",
q_properties->read_ptr,
q_properties->write_ptr);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index a5315d4..6dcd621 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -36,8 +36,8 @@ void print_queue_properties(struct queue_properties *q)
pr_debug("Queue Address: 0x%llX\n", q->queue_address);
pr_debug("Queue Id: %u\n", q->queue_id);
pr_debug("Queue Process Vmid: %u\n", q->vmid);
-   pr_debug("Queue Read Pointer: 0x%p\n", q->read_ptr);
-   pr_debug("Queue Write Pointer: 0x%p\n", q->write_ptr);
+   pr_debug("Queue Read Pointer: 0x%px\n", q->read_ptr);
+   pr_debug("Queue Write Pointer: 0x%px\n", q->write_ptr);
pr_debug("Queue Doorbell Pointer: 0x%p\n", q->doorbell_ptr);
pr_debug("Queue Doorbell Offset: %u\n", q->doorbell_off);
 }
@@ -53,8 +53,8 @@ void print_queue(struct queue *q)
pr_debug("Queue Address: 0x%llX\n", q->properties.queue_address);
pr_debug("Queue Id: %u\n", q->properties.queue_id);
pr_debug("Queue Process Vmid: %u\n", q->properties.vmid);
-   pr_debug("Queue Read Pointer: 0x%p\n", q->properties.read_ptr);
-   pr_debug("Queue Write Pointer: 0x%p\n", q->properties.write_ptr);
+   pr_debug("Queue Read Pointer: 0x%px\n", q->properties.read_ptr);
+   pr_debug("Queue Write Pointer: 0x%px\n", q->properties.write_ptr);
pr_debug("Queue Doorbell Pointer: 0x%p\n", q->properties.doorbell_ptr);
pr_debug("Queue Doorbell Offset: %u\n", q->properties.doorbell_off);
pr_debug("Queue MQD Address: 0x%p\n", q->mqd);
-- 
2.7.4



[PATCH 08/12] drm/amdkfd: Fix signal handling performance again

2018-05-01 Thread Felix Kuehling
It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements, the difference is
estimated to be about a factor of 64. That means using idr_for_each_entry
is only worth it with very few allocated events.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index bccf2f7..7862fcf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -496,7 +496,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
pr_debug_ratelimited("Partial ID invalid: %u (%u valid bits)\n",
 partial_id, valid_id_bits);
 
-   if (p->signal_event_count < KFD_SIGNAL_EVENT_LIMIT/2) {
+   if (p->signal_event_count < KFD_SIGNAL_EVENT_LIMIT/64) {
/* With relatively few events, it's faster to
 * iterate over the event IDR
 */
-- 
2.7.4
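
The new threshold follows from simple arithmetic: if visiting one event through the IDR costs roughly 64 times as much as checking one signal slot, then walking n allocated events through the IDR only beats scanning every slot while 64 * n stays below the slot count. A minimal sketch of that decision, assuming a slot limit of 4096 (the value of KFD_SIGNAL_EVENT_LIMIT is not shown in this patch) and using hypothetical names throughout:

```c
#include <assert.h>

#define DEMO_EVENT_LIMIT  4096u	/* assumed slot count, for illustration only */
#define DEMO_IDR_COST     64u	/* cost factor quoted in the commit message */

enum scan_strategy { SCAN_IDR, SCAN_SLOTS };

/* Estimated cost of walking the IDR, in units of one slot check. */
unsigned int idr_scan_cost(unsigned int allocated_events)
{
	return allocated_events * DEMO_IDR_COST;
}

/* Estimated cost of checking every slot, in the same units. */
unsigned int slot_scan_cost(void)
{
	return DEMO_EVENT_LIMIT;
}

/* Mirrors the shape of the patched condition: use the IDR only when
 * fewer than LIMIT/64 events are allocated. */
enum scan_strategy pick_scan_strategy(unsigned int allocated_events)
{
	assert(allocated_events <= DEMO_EVENT_LIMIT);
	return allocated_events < DEMO_EVENT_LIMIT / DEMO_IDR_COST ?
		SCAN_IDR : SCAN_SLOTS;
}
```

Under this cost model, the old LIMIT/2 threshold would still send a process with 2000 events down the IDR path, at an estimated 128000 slot-check equivalents against 4096 for the plain slot scan.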



[PATCH 05/12] drm/amdkfd: Remove redundant include of amd-iommu.h

2018-05-01 Thread Felix Kuehling
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index fb4a72d..17de4ac 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -20,9 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-#include 
-#endif
 #include 
 #include 
 #include 
-- 
2.7.4



[PATCH 11/12] drm/amdkfd: Remove queue node when destroy queue failed

2018-05-01 Thread Felix Kuehling
From: Shaoyun Liu 

HWS may hang in the middle of destroying a queue. In that case, remove the
queue from the process queue list so it won't be freed again later.

Signed-off-by: Shaoyun Liu 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 3045aeb..d65ce04 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -241,7 +241,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
}
 
if (retval != 0) {
-   pr_err("DQM create queue failed\n");
+   pr_err("Pasid %d DQM create queue %d failed. ret %d\n",
+   pqm->process->pasid, type, retval);
goto err_create_queue;
}
 
@@ -319,8 +320,11 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
dqm = pqn->q->device->dqm;
retval = dqm->ops.destroy_queue(dqm, >qpd, pqn->q);
if (retval) {
-   pr_debug("Destroy queue failed, returned %d\n", retval);
-   goto err_destroy_queue;
+   pr_err("Pasid %d destroy queue %d failed, ret %d\n",
+   pqm->process->pasid,
+   pqn->q->properties.queue_id, retval);
+   if (retval != -ETIME)
+   goto err_destroy_queue;
}
uninit_queue(pqn->q);
}
-- 
2.7.4
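
The control flow this patch introduces can be sketched as follows: a timeout from the scheduler is reported but no longer aborts the teardown, so the node still leaves the process queue list, while any other error keeps the old early exit. This is a stand-alone model with hypothetical names (demo_queue_node, demo_destroy_queue, hw_retval), not the driver code, and it assumes that the code following the hunk is what unlinks and frees the node.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process queue node. */
struct demo_queue_node {
	bool on_process_list;		/* still linked in the process queue list */
	bool resources_released;	/* queue memory has been released */
};

/*
 * hw_retval models the result of the hardware-side destroy:
 * 0 on success, -ETIME when the scheduler timed out, or another
 * negative errno. The return value is passed through unchanged.
 */
int demo_destroy_queue(struct demo_queue_node *node, int hw_retval)
{
	assert(node != NULL && node->on_process_list);

	/* Any error other than a timeout: bail out and keep the node. */
	if (hw_retval != 0 && hw_retval != -ETIME)
		return hw_retval;

	/* Success or timeout: release and unlink exactly once, so a
	 * later process teardown cannot free the same queue again. */
	node->resources_released = true;
	node->on_process_list = false;
	return hw_retval;
}
```

The caller still sees -ETIME, so the failure is not hidden; only the bookkeeping differs from the other error cases.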



[PATCH 01/12] drm/amdkfd: Dump HQD of HIQ

2018-05-01 Thread Felix Kuehling
From: Oak Zeng 

Signed-off-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 9af94b1..668ad07 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1713,6 +1713,18 @@ int dqm_debugfs_hqds(struct seq_file *m, void *data)
int pipe, queue;
int r = 0;
 
+   r = dqm->dev->kfd2kgd->hqd_dump(dqm->dev->kgd,
+   KFD_CIK_HIQ_PIPE, KFD_CIK_HIQ_QUEUE, , _regs);
+   if (!r) {
+   seq_printf(m, "  HIQ on MEC %d Pipe %d Queue %d\n",
+   KFD_CIK_HIQ_PIPE/get_pipes_per_mec(dqm)+1,
+   KFD_CIK_HIQ_PIPE%get_pipes_per_mec(dqm),
+   KFD_CIK_HIQ_QUEUE);
+   seq_reg_dump(m, dump, n_regs);
+
+   kfree(dump);
+   }
+
for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
int pipe_offset = pipe * get_queues_per_pipe(dqm);
 
-- 
2.7.4
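
The seq_printf() arguments in this patch turn a global pipe index into a 1-based MEC number and a pipe index within that MEC. The same arithmetic as a stand-alone sketch; struct pipe_location and locate_pipe() are hypothetical names, and the values in the note below are illustrative inputs rather than the actual KFD_CIK_HIQ_PIPE or get_pipes_per_mec() values:

```c
#include <assert.h>

struct pipe_location {
	unsigned int mec;	/* 1-based, as printed in the debugfs line */
	unsigned int pipe;	/* 0-based index within that MEC */
};

struct pipe_location locate_pipe(unsigned int global_pipe,
				 unsigned int pipes_per_mec)
{
	struct pipe_location loc;

	assert(pipes_per_mec > 0);
	loc.mec = global_pipe / pipes_per_mec + 1;
	loc.pipe = global_pipe % pipes_per_mec;
	return loc;
}
```

For example, with 4 pipes per MEC, global pipe 4 maps to MEC 2, pipe 0, and global pipe 3 maps to MEC 1, pipe 3.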



[PATCH 03/12] drm/amdkfd: Use volatile MTYPE in default/alternate apertures

2018-05-01 Thread Felix Kuehling
From: Jay Cornwall 

MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from
those loaded through the private aperture.

Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the
compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor
to automatic per-dispatch scalar/vector L1 volatile invalidation.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cik_regs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cik_regs.h b/drivers/gpu/drm/amd/amdkfd/cik_regs.h
index 48769d1..37ce6dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_regs.h
+++ b/drivers/gpu/drm/amd/amdkfd/cik_regs.h
@@ -33,7 +33,8 @@
 #define	APE1_MTYPE(x)	((x) << 7)
 
 /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
-#define	MTYPE_CACHED	0
+#define	MTYPE_CACHED_NV	0
+#define	MTYPE_CACHED	1
 #define	MTYPE_NONCACHED	3
 
 #define	DEFAULT_CP_HQD_PERSISTENT_STATE	(0x33U << 8)
-- 
2.7.4
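
To make the effect of the renumbering concrete, the sketch below encodes each MTYPE value into the APE1 field with the APE1_MTYPE() macro from the hunk context. set_ape1_mtype() is a hypothetical helper, and the two-bit field width is an assumption inferred from the largest value (3); neither comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define APE1_MTYPE(x)	((x) << 7)	/* as in the hunk context */

/* Values after this patch */
#define MTYPE_CACHED_NV	0
#define MTYPE_CACHED	1
#define MTYPE_NONCACHED	3

/* Replace the APE1 MTYPE field in reg, assuming the field is two bits wide. */
uint32_t set_ape1_mtype(uint32_t reg, uint32_t mtype)
{
	assert(mtype <= 3);
	reg &= ~(uint32_t)APE1_MTYPE(3u);
	return reg | APE1_MTYPE(mtype);
}
```

Code that keeps using MTYPE_CACHED therefore writes 0x80 into the field after this patch, where it previously wrote 0.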



[PATCH 07/12] drm/amdkfd: Fix CP soft hang on APUs

2018-05-01 Thread Felix Kuehling
From: Yong Zhao 

The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits
of IB_STS.

The bug is not relevant to VEGA10 until we enable demand paging.

Signed-off-by: Jay Cornwall 
Signed-off-by: Yong Zhao 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 4 ++--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 3 +--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 3 +--
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index a546a21..f68aef0 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -253,7 +253,6 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
0x0072, 0x80728472,
0xc0211b7c, 0x0072,
0x80728472, 0xbf8c007f,
-   0x8671ff71, 0x,
0xbefc0073, 0xbefe006e,
0xbeff006f, 0x867375ff,
0x03ff, 0xb9734803,
@@ -267,6 +266,7 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
0x8e738f73, 0x87767376,
0x8673ff74, 0x0080,
0x8f739773, 0xb976f807,
+   0x8671ff71, 0x,
0x86fe7e7e, 0x86ea6a6a,
0xb974f802, 0xbf8a,
0x95807370, 0xbf81,
@@ -530,7 +530,6 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0x0078, 0x80788478,
0xc0211cfa, 0x0078,
0x80788478, 0xbf8cc07f,
-   0x866dff6d, 0x,
0xbefc006f, 0xbefe007a,
0xbeff007b, 0x866f71ff,
0x03ff, 0xb96f4803,
@@ -554,6 +553,7 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0x8e6f8f6f, 0x876e6f6e,
0x866fff70, 0x0080,
0x8f6f976f, 0xb96ef807,
+   0x866dff6d, 0x,
0x86fe7e7e, 0x86ea6a6a,
0xb970f802, 0xbf8a,
0x95806f6c, 0xbf81,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index 658a4c6..a2a04bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -1015,8 +1015,6 @@ end
 
 s_waitcnt   lgkmcnt(0) 
 //from now on, it is safe to restore 
STATUS and IB_STS
 
-s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x  //pc[47:32]
//Do it here in order not to affect STATUS
-
 //for normal save & restore, the saved PC points to the next inst to 
execute, no adjustment needs to be made, otherwise:
 if ((EMU_RUN_HACK) && (!EMU_RUN_HACK_RESTORE_NORMAL))
 s_add_u32 s_restore_pc_lo, s_restore_pc_lo, 8//pc[31:0]+8  
   //two back-to-back s_trap are used (first for save and second for restore)
@@ -1052,6 +1050,7 @@ end
 s_lshr_b32  s_restore_m0, s_restore_m0, SQ_WAVE_STATUS_INST_ATC_SHIFT
 s_setreg_b32hwreg(HW_REG_IB_STS),   s_restore_tmp
 
+s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x  //pc[47:32]
//Do it here in order not to affect STATUS
 s_and_b64exec, exec, exec  // Restore STATUS.EXECZ, not writable by 
s_setreg_b32
 s_and_b64vcc, vcc, vcc  // Restore STATUS.VCCZ, not writable by 
s_setreg_b32
 s_setreg_b32hwreg(HW_REG_STATUS),   s_restore_status // SCC is 
included, which is changed by previous salu
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 065f55a..998be96 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -1067,8 +1067,6 @@ end
 
 s_waitcnt  lgkmcnt(0)  
//from now on, it is safe to restore STATUS 
and IB_STS
 
-s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x //pc[47:32] 
   //Do it here in order not to affect STATUS
-
 //for normal save & restore, the saved PC points to the next inst to 
execute, no adjustment needs to be made, otherwise:
 if ((EMU_RUN_HACK) && (!EMU_RUN_HACK_RESTORE_NORMAL))
s_add_u32 s_restore_pc_lo, s_restore_pc_lo, 8//pc[31:0]+8   
  //two back-to-back s_trap are used (first for save and second for restore)
@@ -1119,6 +1117,7 @@ end
 s_lshr_b32 s_restore_m0, s_restore_m0, SQ_WAVE_STATUS_INST_ATC_SHIFT
 s_setreg_b32hwreg(HW_REG_IB_STS),   s_restore_tmp
 
+s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x //pc[47:32] 
   //Do it here in order not to affect STATUS
 s_and_b64   exec, exec, exec  // Restore STATUS.EXECZ, not writable by 
s_setreg_b32
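
The reordering in this patch matters because a mask that keeps only pc[47:32] also discards anything stashed in the upper bits of the same register. The C model below shows the two orderings side by side. It is a sketch under assumptions: the 0x0000FFFF mask is inferred from the pc[47:32] comment, and the position and width of the stashed field (DEMO_STASH_SHIFT, DEMO_STASH_MASK) are hypothetical, not the real IB_STS layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PC_HI_MASK		0x0000FFFFu	/* assumed: keeps pc[47:32] */
#define DEMO_STASH_SHIFT	28		/* hypothetical field position */
#define DEMO_STASH_MASK		0xFu		/* hypothetical field width */

/* Order after the patch: read the stashed bits, then clean up PC_HI. */
uint32_t extract_then_mask(uint32_t *pc_hi)
{
	uint32_t stashed;

	assert(pc_hi != NULL);
	stashed = (*pc_hi >> DEMO_STASH_SHIFT) & DEMO_STASH_MASK;
	*pc_hi &= DEMO_PC_HI_MASK;
	return stashed;
}

/* Order before the patch: the mask runs first and wipes the stashed bits. */
uint32_t mask_then_extract(uint32_t *pc_hi)
{
	assert(pc_hi != NULL);
	*pc_hi &= DEMO_PC_HI_MASK;
	return (*pc_hi >> DEMO_STASH_SHIFT) & DEMO_STASH_MASK;
}
```

Both orderings leave the same PC_HI value behind; only the first one still sees the stashed field.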
 

[PATCH 06/12] drm/amdkfd: Separate trap handler assembly code and its hex values

2018-05-01 Thread Felix Kuehling
From: Yong Zhao 

Since the assembly code is inside "#if 0", it is ineffective. Despite that,
during debugging, we need to change the assembly code, extract it into
a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines starting
with "#", so that sp3 can successfully compile the new file.

With this change, none of the above chores are needed any more, and
cwsr_trap_handler_gfx*.asm can be used directly by sp3 to generate its
hex values.

Signed-off-by: Yong Zhao 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 560 +
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 267 +-
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 300 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c|   3 +-
 4 files changed, 575 insertions(+), 555 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
new file mode 100644
index 000..a546a21
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -0,0 +1,560 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+static const uint32_t cwsr_trap_gfx8_hex[] = {
+   0xbf820001, 0xbf820125,
+   0xb8f4f802, 0x89748674,
+   0xb8f5f803, 0x8675ff75,
+   0x0400, 0xbf850011,
+   0xc00a1e37, 0x,
+   0xbf8c007f, 0x8978,
+   0xbf840002, 0xb974f802,
+   0xbe801d78, 0xb8f5f803,
+   0x8675ff75, 0x01ff,
+   0xbf850002, 0x80708470,
+   0x82718071, 0x8671ff71,
+   0x, 0xb974f802,
+   0xbe801f70, 0xb8f5f803,
+   0x8675ff75, 0x0100,
+   0xbf840006, 0xbefa0080,
+   0xb97a0203, 0x8671ff71,
+   0x, 0x80f08870,
+   0x82f18071, 0xbefa0080,
+   0xb97a0283, 0xbef60068,
+   0xbef70069, 0xb8fa1c07,
+   0x8e7a9c7a, 0x87717a71,
+   0xb8fa03c7, 0x8e7a9b7a,
+   0x87717a71, 0xb8faf807,
+   0x867aff7a, 0x7fff,
+   0xb97af807, 0xbef2007e,
+   0xbef3007f, 0xbefe0180,
+   0xbf94, 0x877a8474,
+   0xb97af802, 0xbf8e0002,
+   0xbf88fffe, 0xbef8007e,
+   0x8679ff7f, 0x,
+   0x8779ff79, 0x0004,
+   0xbefa0080, 0xbefb00ff,
+   0x00807fac, 0x867aff7f,
+   0x0800, 0x8f7a837a,
+   0x877b7a7b, 0x867aff7f,
+   0x7000, 0x8f7a817a,
+   0x877b7a7b, 0xbeef007c,
+   0xbeee0080, 0xb8ee2a05,
+   0x806e816e, 0x8e6e8a6e,
+   0xb8fa1605, 0x807a817a,
+   0x8e7a867a, 0x806e7a6e,
+   0xbefa0084, 0xbefa00ff,
+   0x0100, 0xbefe007c,
+   0xbefc006e, 0xc0611bfc,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xbefe007c,
+   0xbefc006e, 0xc0611c3c,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xbefe007c,
+   0xbefc006e, 0xc0611c7c,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xbefe007c,
+   0xbefc006e, 0xc0611cbc,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xbefe007c,
+   0xbefc006e, 0xc0611cfc,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xbefe007c,
+   0xbefc006e, 0xc0611d3c,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xb8f5f803,
+   0xbefe007c, 0xbefc006e,
+   0xc0611d7c, 0x007c,
+   0x806e846e, 0xbefc007e,
+   0xbefe007c, 0xbefc006e,
+   0xc0611dbc, 0x007c,
+   0x806e846e, 0xbefc007e,
+   0xbefe007c, 0xbefc006e,
+   0xc0611dfc, 0x007c,
+   0x806e846e, 0xbefc007e,
+   0xb8eff801, 0xbefe007c,
+   0xbefc006e, 0xc0611bfc,
+   0x007c, 0x806e846e,
+   0xbefc007e, 0xbefe007c,
+   0xbefc006e, 0xc0611b3c,
+   0x007c, 0x806e846e,
+   0xbefc007e, 

[PATCH 00/12] Assorted KFD fixes

2018-05-01 Thread Felix Kuehling
These are some random patches I noticed when comparing amdkfd-next against
amd-kfd-staging.

Ben Goz (1):
  drm/amdkfd: Locking PM mutex while allocating IB buffer

Felix Kuehling (4):
  drm/amdkfd: Remove redundant include of amd-iommu.h
  drm/amdkfd: Fix signal handling performance again
  drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
  drm/amdkfd: Add sanity checks in IRQ handlers

Jay Cornwall (2):
  drm/amdkfd: Reduce priority of context-saving waves before spin-wait
  drm/amdkfd: Use volatile MTYPE in default/alternate apertures

Oak Zeng (1):
  drm/amdkfd: Dump HQD of HIQ

Philip Yang (1):
  drm/amdkfd: use %px to print user space address instead of %p

Shaoyun Liu (1):
  drm/amdkfd: Remove queue node when destroy queue failed

Yong Zhao (2):
  drm/amdkfd: Separate trap handler assembly code and its hex values
  drm/amdkfd: Fix CP soft hang on APUs

 drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   |  20 +-
 drivers/gpu/drm/amd/amdkfd/cik_regs.h  |   3 +-
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 560 +
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 274 +-
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 307 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c|   6 +-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  12 +
 drivers/gpu/drm/amd/amdkfd/kfd_events.c|   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c|  40 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |   4 -
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c|   7 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c |   8 +-
 14 files changed, 659 insertions(+), 596 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h

-- 
2.7.4



[PATCH 10/12] drm/amdkfd: Locking PM mutex while allocating IB buffer

2018-05-01 Thread Felix Kuehling
From: Ben Goz 

Signed-off-by: Ben Goz 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 91f0350..c317feb4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -94,12 +94,14 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
+   mutex_lock(>lock);
+
retval = kfd_gtt_sa_allocate(pm->dqm->dev, *rl_buffer_size,
>ib_buffer_obj);
 
if (retval) {
pr_err("Failed to allocate runlist IB\n");
-   return retval;
+   goto out;
}
 
*(void **)rl_buffer = pm->ib_buffer_obj->cpu_ptr;
@@ -107,6 +109,9 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
memset(*rl_buffer, 0, *rl_buffer_size);
pm->allocated = true;
+
+out:
+   mutex_unlock(>lock);
return retval;
 }
 
-- 
2.7.4
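
The patch above has no description, so here is a minimal sketch of the pattern it applies: take the lock before the allocation and route every exit through a single unlock label. Everything below is hypothetical (the `toy_*` names, and the toy lock that only records whether it is held); it is not the kernel mutex API and not the KFD packet manager code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a mutex: it only records whether it is held. */
struct toy_lock {
    int held;
};

void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical manager object, loosely shaped like the one in the diff. */
struct toy_manager {
    struct toy_lock lock;
    void *buffer;
    int allocated;
};

/*
 * Allocate and zero a buffer while holding the manager lock.
 * A size of 0 is treated as an allocation failure so that the
 * error path is easy to exercise.
 */
int toy_allocate_buffer(struct toy_manager *m, size_t size)
{
    int retval = 0;

    toy_lock_acquire(&m->lock);

    m->buffer = size ? malloc(size) : NULL;
    if (!m->buffer) {
        retval = -1;
        goto out;   /* do not return directly: the lock is still held */
    }

    memset(m->buffer, 0, size);
    m->allocated = 1;

out:
    toy_lock_release(&m->lock);
    return retval;
}
```

The point of the single `out:` label is that both the success path and the failure path release the lock exactly once, which is what replacing `return retval;` with `goto out;` achieves in the diff.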



[PATCH 12/12] drm/amdkfd: Add sanity checks in IRQ handlers

2018-05-01 Thread Felix Kuehling
Only accept interrupts from KFD VMIDs. Just checking for a PASID may
not be enough because amdgpu started using PASIDs to map VM faults
to processes.

Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug).
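
A minimal sketch of the two checks described above, written as a standalone filter. The field layout (VMID in bits 8..15 and PASID in bits 16..31 of a 32-bit ring ID) and all `toy_*` names are assumptions made for illustration; the real handlers use the driver's own structures and macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical copy of the VMID range reserved for KFD. */
struct toy_vm_info {
    unsigned int first_vmid_kfd;
    unsigned int last_vmid_kfd;
};

/* Assumed layout: VMID in bits 8..15 of the ring ID. */
unsigned int toy_ring_id_vmid(uint32_t ring_id)
{
    return (ring_id >> 8) & 0xffu;
}

/* Assumed layout: PASID in bits 16..31 of the ring ID. */
unsigned int toy_ring_id_pasid(uint32_t ring_id)
{
    return (ring_id >> 16) & 0xffffu;
}

/*
 * Return true only if the interrupt comes from a KFD VMID and
 * carries a non-zero PASID. The VMID test comes first so that
 * interrupts belonging to other clients are dropped silently;
 * only a KFD interrupt without a PASID is worth a warning.
 */
bool toy_irq_is_for_kfd(const struct toy_vm_info *vm, uint32_t ring_id)
{
    unsigned int vmid = toy_ring_id_vmid(ring_id);

    assert(vm->first_vmid_kfd <= vm->last_vmid_kfd);

    if (vmid < vm->first_vmid_kfd || vmid > vm->last_vmid_kfd)
        return false;

    if (toy_ring_id_pasid(ring_id) == 0)
        return false;   /* the patch warns once here */

    return true;
}
```

With a KFD range of 8..15, a ring ID carrying VMID 8 and a non-zero PASID passes, while VMID 3 is rejected regardless of its PASID.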

Suggested-by: Shaoyun Liu 
Suggested-by: Oak Zeng 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 20 +---
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c  | 40 ++--
 2 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c 
b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
index 3d5ccb3..49df6c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
@@ -27,18 +27,28 @@
 static bool cik_event_interrupt_isr(struct kfd_dev *dev,
const uint32_t *ih_ring_entry)
 {
-   unsigned int pasid;
const struct cik_ih_ring_entry *ihre =
(const struct cik_ih_ring_entry *)ih_ring_entry;
+   unsigned int vmid, pasid;
+
+   /* Only handle interrupts from KFD VMIDs */
+   vmid  = (ihre->ring_id & 0xff00) >> 8;
+   if (vmid < dev->vm_info.first_vmid_kfd ||
+   vmid > dev->vm_info.last_vmid_kfd)
+   return 0;
 
+   /* If there is no valid PASID, it's likely a firmware bug */
pasid = (ihre->ring_id & 0x) >> 16;
+   if (WARN_ONCE(pasid == 0, "FW bug: No PASID in KFD interrupt"))
+   return 0;
 
-   /* Do not process in ISR, just request it to be forwarded to WQ. */
-   return (pasid != 0) &&
-   (ihre->source_id == CIK_INTSRC_CP_END_OF_PIPE ||
+   /* Interrupt types we care about: various signals and faults.
+* They will be forwarded to a work queue (see below).
+*/
+   return ihre->source_id == CIK_INTSRC_CP_END_OF_PIPE ||
ihre->source_id == CIK_INTSRC_SDMA_TRAP ||
ihre->source_id == CIK_INTSRC_SQ_INTERRUPT_MSG ||
-   ihre->source_id == CIK_INTSRC_CP_BAD_OPCODE);
+   ihre->source_id == CIK_INTSRC_CP_BAD_OPCODE;
 }
 
 static void cik_event_interrupt_wq(struct kfd_dev *dev,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
index 39d4115..37029ba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
@@ -29,27 +29,35 @@ static bool event_interrupt_isr_v9(struct kfd_dev *dev,
const uint32_t *ih_ring_entry)
 {
uint16_t source_id, client_id, pasid, vmid;
+   const uint32_t *data = ih_ring_entry;
 
-   source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry);
-   client_id = SOC15_CLIENT_ID_FROM_IH_ENTRY(ih_ring_entry);
-   pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
+   /* Only handle interrupts from KFD VMIDs */
vmid = SOC15_VMID_FROM_IH_ENTRY(ih_ring_entry);
+   if (vmid < dev->vm_info.first_vmid_kfd ||
+   vmid > dev->vm_info.last_vmid_kfd)
+   return 0;
+
+   /* If there is no valid PASID, it's likely a firmware bug */
+   pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
+   if (WARN_ONCE(pasid == 0, "FW bug: No PASID in KFD interrupt"))
+   return 0;
 
-   if (pasid) {
-   const uint32_t *data = ih_ring_entry;
+   source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry);
+   client_id = SOC15_CLIENT_ID_FROM_IH_ENTRY(ih_ring_entry);
 
-   pr_debug("client id 0x%x, source id %d, pasid 0x%x. raw 
data:\n",
-client_id, source_id, pasid);
-   pr_debug("%8X, %8X, %8X, %8X, %8X, %8X, %8X, %8X.\n",
-data[0], data[1], data[2], data[3],
-data[4], data[5], data[6], data[7]);
-   }
+   pr_debug("client id 0x%x, source id %d, pasid 0x%x. raw data:\n",
+client_id, source_id, pasid);
+   pr_debug("%8X, %8X, %8X, %8X, %8X, %8X, %8X, %8X.\n",
+data[0], data[1], data[2], data[3],
+data[4], data[5], data[6], data[7]);
 
-   return (pasid != 0) &&
-   (source_id == SOC15_INTSRC_CP_END_OF_PIPE ||
-source_id == SOC15_INTSRC_SDMA_TRAP ||
-source_id == SOC15_INTSRC_SQ_INTERRUPT_MSG ||
-source_id == SOC15_INTSRC_CP_BAD_OPCODE);
+   /* Interrupt types we care about: various signals and faults.
+* They will be forwarded to a work queue (see below).
+*/
+   return source_id == SOC15_INTSRC_CP_END_OF_PIPE ||
+   source_id == SOC15_INTSRC_SDMA_TRAP ||
+   source_id == SOC15_INTSRC_SQ_INTERRUPT_MSG ||
+   source_id == SOC15_INTSRC_CP_BAD_OPCODE;
 }
 
 static void event_interrupt_wq_v9(struct kfd_dev 

[PATCH 02/12] drm/amdkfd: Reduce priority of context-saving waves before spin-wait

2018-05-01 Thread Felix Kuehling
From: Jay Cornwall 

Synchronization between context-saving wavefronts is achieved by
sending a SAVEWAVE message to the SPI and then spin-waiting for a
response. These spin-waiting wavefronts may inhibit the progress
of other wavefronts in the context save handler, leading to the
synchronization condition never being achieved.

Before spin-waiting, reduce the priority of each wavefront to
guarantee forward progress in the others.
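
The register update added by this patch is a plain OR of a shifted priority value into the saved status word. A minimal C sketch of that arithmetic follows; the shift of 1 comes from the patch, while the two-bit field width and the function names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SPI_PRIO_SHIFT 1u
/* Assumed two-bit field starting at the shift above. */
#define TOY_SPI_PRIO_MASK  (0x3u << TOY_SPI_PRIO_SHIFT)

/* OR a priority value into the status word, as s_or_b32 does. */
uint32_t toy_status_or_spi_prio(uint32_t status, uint32_t prio)
{
    assert(prio <= 3u);
    return status | (prio << TOY_SPI_PRIO_SHIFT);
}

/* Read the priority field back out of a status word. */
uint32_t toy_status_get_spi_prio(uint32_t status)
{
    return (status & TOY_SPI_PRIO_MASK) >> TOY_SPI_PRIO_SHIFT;
}
```

Because the update is an OR rather than a field replacement, the result is exactly 2 only when the field held 0 or 2 beforehand; a field that already held 1 would read back as 3.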

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 10 --
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm |  8 +++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index 997a383d..34eabcd 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -98,6 +98,7 @@ var SWIZZLE_EN  =   0   
//whether we use swi
 /**/
 var SQ_WAVE_STATUS_INST_ATC_SHIFT  = 23
 var SQ_WAVE_STATUS_INST_ATC_MASK   = 0x0080
+var SQ_WAVE_STATUS_SPI_PRIO_SHIFT  = 1
 var SQ_WAVE_STATUS_SPI_PRIO_MASK   = 0x0006
 
 var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT= 12
@@ -319,6 +320,10 @@ end
 s_sendmsg   sendmsg(MSG_SAVEWAVE)  //send SPI a message and wait for 
SPI's write to EXEC
 end
 
+// Set SPI_PRIO=2 to avoid starving instruction fetch in the waves we're waiting for.
+s_or_b32 s_save_tmp, s_save_status, (2 << SQ_WAVE_STATUS_SPI_PRIO_SHIFT)
+s_setreg_b32 hwreg(HW_REG_STATUS), s_save_tmp
+
   L_SLEEP:
 s_sleep 0x2// sleep 1 (64clk) is not enough for 8 waves 
per SIMD, which will cause SQ hang, since the 7,8th wave could not get arbit to 
exec inst, while other waves are stuck into the sleep-loop and waiting for 
wrexec!=0
 
@@ -1132,7 +1137,7 @@ end
 #endif
 
 static const uint32_t cwsr_trap_gfx8_hex[] = {
-   0xbf820001, 0xbf820123,
+   0xbf820001, 0xbf820125,
0xb8f4f802, 0x89748674,
0xb8f5f803, 0x8675ff75,
0x0400, 0xbf850011,
@@ -1158,7 +1163,8 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
0x867aff7a, 0x7fff,
0xb97af807, 0xbef2007e,
0xbef3007f, 0xbefe0180,
-   0xbf94, 0xbf8e0002,
+   0xbf94, 0x877a8474,
+   0xb97af802, 0xbf8e0002,
0xbf88fffe, 0xbef8007e,
0x8679ff7f, 0x,
0x8779ff79, 0x0004,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index da09794..8fc3698 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -97,6 +97,7 @@ var ACK_SQC_STORE =   1   
//workaround for suspected SQC store bug causing
 /**/
 var SQ_WAVE_STATUS_INST_ATC_SHIFT  = 23
 var SQ_WAVE_STATUS_INST_ATC_MASK   = 0x0080
+var SQ_WAVE_STATUS_SPI_PRIO_SHIFT  = 1
 var SQ_WAVE_STATUS_SPI_PRIO_MASK   = 0x0006
 var SQ_WAVE_STATUS_HALT_MASK   = 0x2000
 
@@ -362,6 +363,10 @@ end
s_sendmsg   sendmsg(MSG_SAVEWAVE)  //send SPI a message and wait for 
SPI's write to EXEC
 end
 
+// Set SPI_PRIO=2 to avoid starving instruction fetch in the waves we're waiting for.
+s_or_b32 s_save_tmp, s_save_status, (2 << SQ_WAVE_STATUS_SPI_PRIO_SHIFT)
+s_setreg_b32 hwreg(HW_REG_STATUS), s_save_tmp
+
   L_SLEEP:
 s_sleep 0x2   // sleep 1 (64clk) is not enough for 8 
waves per SIMD, which will cause SQ hang, since the 7,8th wave could not get 
arbit to exec inst, while other waves are stuck into the sleep-loop and waiting 
for wrexec!=0
 
@@ -1210,7 +1215,7 @@ end
 #endif
 
 static const uint32_t cwsr_trap_gfx9_hex[] = {
-   0xbf820001, 0xbf820158,
+   0xbf820001, 0xbf82015a,
0xb8f8f802, 0x89788678,
0xb8f1f803, 0x866eff71,
0x0400, 0xbf850034,
@@ -1249,6 +1254,7 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
0x7fff, 0xb970f807,
0xbeee007e, 0xbeef007f,
0xbefe0180, 0xbf94,
+   0x87708478, 0xb970f802,
0xbf8e0002, 0xbf88fffe,
0xb8f02a05, 0x80708170,
0x8e708a70, 0xb8f11605,
-- 
2.7.4



[PATCH 09/12] drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK

2018-05-01 Thread Felix Kuehling
The initialization is not necessary. amd-kfd-staging and ROCm
releases have worked without it for two years.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 2bc49c6..06eaa21 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -79,10 +79,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
m->cp_mqd_base_addr_lo= lower_32_bits(addr);
m->cp_mqd_base_addr_hi= upper_32_bits(addr);
 
-   m->cp_hqd_ib_control = DEFAULT_MIN_IB_AVAIL_SIZE | IB_ATC_EN;
-   /* Although WinKFD writes this, I suspect it should not be necessary */
-   m->cp_hqd_ib_control = IB_ATC_EN | DEFAULT_MIN_IB_AVAIL_SIZE;
-
m->cp_hqd_quantum = QUANTUM_EN | QUANTUM_SCALE_1MS |
QUANTUM_DURATION(10);
 
-- 
2.7.4



Re: [PATCH] drm/amd/amdgpu: vcn10 Add callback for emit_reg_write_reg_wait

2018-05-01 Thread Andrey Grodzovsky

Reviewed-by: Andrey Grodzovsky 

Andrey


On 05/01/2018 10:18 AM, Tom St Denis wrote:

The callback .emit_reg_write_reg_wait was missing for vcn decode
which resulted in a kernel oops.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index d9a15338db7e..0501746b6c2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1109,6 +1109,7 @@ static const struct amdgpu_ring_funcs 
vcn_v1_0_dec_ring_vm_funcs = {
.end_use = amdgpu_vcn_ring_end_use,
.emit_wreg = vcn_v1_0_dec_ring_emit_wreg,
.emit_reg_wait = vcn_v1_0_dec_ring_emit_reg_wait,
+   .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
  };
  
  static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = {




Re: New SPDX-License-Identifier requirement

2018-05-01 Thread Oded Gabbay
I believe it should be :
SPDX-License-Identifier: GPL-2.0 OR MIT

But John probably knows best about this
Oded

On Tue, May 1, 2018 at 11:14 PM, Felix Kuehling  wrote:
> Hi,
>
> I'm getting a checkpatch warning with the latest amdkfd-next branch
> (4.17-rc2) when adding a new file:
>
> WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
> #34: FILE: drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h:1:
>
> I've read Documentation/process/license-rules.rst but I'm unsure what
> would be the correct license identifier to go with the license header we
> use in most of our source files. I think it would be one of these:
>
>   // SPDX-License-Identifier: GPL-2.0 OR MIT
>   // SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
>
> Can someone confirm?
>
> Once that's clarified, we should probably add the appropriate license
> identifier to all our source files.
>
> Thanks,
>   Felix
>
> --
> F e l i x   K u e h l i n g
> PMTS Software Development Engineer | Vertical Workstation/Compute
> 1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
> (O) +1(289)695-1597
>_ _   _   _   _
>   / \   | \ / | |  _  \  \ _  |
>  / A \  | \M/ | | |D) )  /|_| |
> /_/ \_\ |_| |_| |_/ |__/ \|   facebook.com/AMD | amd.com
>


New SPDX-License-Identifier requirement

2018-05-01 Thread Felix Kuehling
Hi,

I'm getting a checkpatch warning with the latest amdkfd-next branch
(4.17-rc2) when adding a new file:

WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#34: FILE: drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h:1:

I've read Documentation/process/license-rules.rst but I'm unsure what
would be the correct license identifier to go with the license header we
use in most of our source files. I think it would be one of these:

  // SPDX-License-Identifier: GPL-2.0 OR MIT
  // SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause

Can someone confirm?

Once that's clarified, we should probably add the appropriate license
identifier to all our source files.

Thanks,
  Felix

-- 
F e l i x   K u e h l i n g
PMTS Software Development Engineer | Vertical Workstation/Compute
1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
(O) +1(289)695-1597
   _ _   _   _   _
  / \   | \ / | |  _  \  \ _  |
 / A \  | \M/ | | |D) )  /|_| |
/_/ \_\ |_| |_| |_/ |__/ \|   facebook.com/AMD | amd.com



Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes

2018-05-01 Thread Stéphane Marchesin
On Fri, Apr 27, 2018 at 3:27 AM Shirish S  wrote:

> This patch is in continuation to the
> "843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state"
> where we started to eliminate the dependency on
> DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
> which as such is not mandatory.

> After deferring, this patch eliminates the dependency on the flag
> for overlay planes.

> This has to be done in stages as it's pretty complex and requires thorough
> testing before we free primary planes as well from dependency on modeset
> flag.

> Signed-off-by: Shirish S 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +---
>   1 file changed, 5 insertions(+), 3 deletions(-)

> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 1a63c04..87b661d 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>  }
>  spin_unlock_irqrestore(>dev->event_lock, flags);

> -   if (!pflip_needed) {
> +   if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {
>  WARN_ON(!dm_new_plane_state->dc_state);

>  plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;
> @@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,

>  /* Remove any changed/removed planes */
>  if (!enable) {
> -   if (pflip_needed)
> +   if (pflip_needed &&
> +   plane && plane->type != DRM_PLANE_TYPE_OVERLAY)

nit: I don't think we need to check that plane is non-NULL

Stéphane

>  continue;

>  if (!old_plane_crtc)
> @@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
>  if (!dm_new_crtc_state->stream)
>  continue;

> -   if (pflip_needed)
> +   if (pflip_needed &&
> +   plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
>  continue;

>  WARN_ON(dm_new_plane_state->dc_state);
> --
> 2.7.4



[PATCH] drm/amd/amdgpu: vcn10 Add callback for emit_reg_write_reg_wait

2018-05-01 Thread Tom St Denis
The callback .emit_reg_write_reg_wait was missing for vcn decode
which resulted in a kernel oops.
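
To illustrate the failure mode: a member left out of a designated initializer for a table of function pointers is NULL, so a caller that invokes it unconditionally jumps to address zero. The sketch below uses a hypothetical `toy_ring` with a guarded call and a fallback helper that issues a separate write and wait; all names are invented, and what the real helper does internally is assumed here, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_ring;

/* Hypothetical per-ring callback table. */
struct toy_ring_funcs {
    void (*emit_wreg)(struct toy_ring *ring, uint32_t reg, uint32_t val);
    void (*emit_reg_wait)(struct toy_ring *ring, uint32_t reg,
                          uint32_t val, uint32_t mask);
    void (*emit_reg_write_reg_wait)(struct toy_ring *ring,
                                    uint32_t reg0, uint32_t reg1,
                                    uint32_t ref, uint32_t mask);
};

struct toy_ring {
    const struct toy_ring_funcs *funcs;
    unsigned int writes;   /* number of register writes emitted */
    unsigned int waits;    /* number of register waits emitted */
};

void toy_emit_wreg(struct toy_ring *ring, uint32_t reg, uint32_t val)
{
    (void)reg;
    (void)val;
    ring->writes++;
}

void toy_emit_reg_wait(struct toy_ring *ring, uint32_t reg,
                       uint32_t val, uint32_t mask)
{
    (void)reg;
    (void)val;
    (void)mask;
    ring->waits++;
}

/* Generic fallback: a separate write followed by a separate wait. */
void toy_write_wait_helper(struct toy_ring *ring,
                           uint32_t reg0, uint32_t reg1,
                           uint32_t ref, uint32_t mask)
{
    ring->funcs->emit_wreg(ring, reg0, ref);
    ring->funcs->emit_reg_wait(ring, reg1, mask, mask);
}

/*
 * Guarded caller: report an error instead of calling through a
 * NULL pointer when the callback was never installed.
 */
int toy_flush(struct toy_ring *ring)
{
    assert(ring && ring->funcs);

    if (!ring->funcs->emit_reg_write_reg_wait)
        return -1;

    ring->funcs->emit_reg_write_reg_wait(ring, 0x10, 0x14, 0x1, 0x1);
    return 0;
}
```

The patch takes the other route: rather than guarding the caller, it installs a callback in the decode ring's table so the pointer is never NULL.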

Signed-off-by: Tom St Denis 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index d9a15338db7e..0501746b6c2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1109,6 +1109,7 @@ static const struct amdgpu_ring_funcs 
vcn_v1_0_dec_ring_vm_funcs = {
.end_use = amdgpu_vcn_ring_end_use,
.emit_wreg = vcn_v1_0_dec_ring_emit_wreg,
.emit_reg_wait = vcn_v1_0_dec_ring_emit_reg_wait,
+   .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
 };
 
 static const struct amdgpu_ring_funcs vcn_v1_0_enc_ring_vm_funcs = {
-- 
2.14.3

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-05-01 Thread Michel Dänzer
On 2018-05-01 01:15 AM, Dave Airlie wrote:
>>
>>
>> Yes, I fixed the original false positive messages myself with the swiotlb
>> maintainer and I was CCed in fixing the recent fallout from Chris' changes as
>> well.
> 
> So do we have a good summary of where this is at now?
> 
> I'm getting reports on 4.16.4 still displaying these, what hammer do I
> need to hit things with to get 4.16.x+1 to not do this?
> 
> Are there still outstanding issues upstream?

There are, https://patchwork.freedesktop.org/patch/219765/ should
hopefully fix the last of it.


> [...] I've no idea if the swiotlb things people report are the false
> positive, or some new thing.

The issues I've seen reported with 4.16 are false positives from TTM's
perspective, which uses DMA_ATTR_NO_WARN to suppress these warnings, due
to multiple regressions introduced by commit
0176adb004065d6815a8e67946752df4cd947c5b "swiotlb: refactor
 coherent buffer allocation" in 4.16-rc1.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


vcn regression on raven1

2018-05-01 Thread Tom St Denis

Hi all,

I've noticed that on the tip of drm-next vcn playback of video is broken 
(see dmesg below).  I've bisected it to this commit:


[root@raven linux]# git bisect good
701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
commit 701372349fd55b5396b335580e979ac4dde3dd02
Author: Alex Deucher 
Date:   Tue Mar 27 17:10:56 2018 -0500

drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb 
flush


Use amdgpu_ring_emit_reg_write_reg_wait.  On engines that support it,
it provides a write and wait in a single packet which avoids a missed
ack if a world switch happens between the request and waiting for the
ack.

Reviewed-by: Huang Rui 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 

:04 04 4e4312de03f4b34abd65f4bb12dba4c7093055ba 
ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M  drivers


Which is odd because the commit before this is the vcn change and it 
works fine (playing BBB right now).


Here's the dmesg:

[ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at 


[ 2925.640113] IP:   (null)
[ 2925.640116] PGD 0 P4D 0
[ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
[ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash 
gpu_sched ttm ax88179_178a usbnet

[ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
[ 2925.640142] Hardware name: System manufacturer System Product 
Name/TUF B350M-PLUS GAMING, BIOS 3803 01/22/2018

[ 2925.640146] RIP: 0010:  (null)
[ 2925.640148] RSP: 0018:8801d54f7790 EFLAGS: 00010206
[ 2925.640153] RAX:  RBX: 8801d8b38420 RCX: 
007c0080
[ 2925.640156] RDX: 0001a6fa RSI: 0001a6e8 RDI: 
8801d8b38420
[ 2925.640159] RBP: 0001a6fa R08: 0080 R09: 
ed003aa9eef9
[ 2925.640162] R10: 09c74f08 R11: fbfff0f5d1e7 R12: 
8801d8b3277c
[ 2925.640164] R13: 8801d8b3001c R14: 0005 R15: 

[ 2925.640168] FS:  () GS:8801dcf0() 
knlGS:

[ 2925.640171] CS:  0010 DS:  ES:  CR0: 80050033
[ 2925.640174] CR2:  CR3: 0001d9712000 CR4: 
003406e0

[ 2925.640176] Call Trace:
[ 2925.640272]  ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
[ 2925.640368]  ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
[ 2925.640459]  ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
[ 2925.640545]  ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
[ 2925.640640]  ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
[ 2925.640725]  ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
[ 2925.640810]  ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
[ 2925.640897]  ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
[ 2925.641003]  ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
[ 2925.641095]  ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
[ 2925.641102]  ? dma_fence_add_callback+0x15f/0x360
[ 2925.641201]  ? amdgpu_job_run+0x32f/0x370 [amdgpu]
[ 2925.641297]  ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
[ 2925.641302]  ? __queue_delayed_work+0x144/0x1d0
[ 2925.641306]  ? delayed_work_timer_fn+0x40/0x40
[ 2925.641312]  ? prepare_to_wait_exclusive+0x1d0/0x1d0
[ 2925.641318]  ? drm_sched_main+0x68c/0x940 [gpu_sched]
[ 2925.641323]  ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
[ 2925.641328]  ? save_stack+0x89/0xb0
[ 2925.641332]  ? wait_woken+0x110/0x110
[ 2925.641337]  ? ret_from_fork+0x22/0x40
[ 2925.641343]  ? __schedule+0xd30/0xd30
[ 2925.641346]  ? remove_wait_queue+0x150/0x150
[ 2925.641353]  ? rcu_note_context_switch+0x2a0/0x2a0
[ 2925.641359]  ? __lock_text_start+0x8/0x8
[ 2925.641367]  ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
[ 2925.641371]  ? kthread+0x19b/0x1c0
[ 2925.641376]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 2925.641382]  ? ret_from_fork+0x22/0x40
[ 2925.641387] Code:  Bad RIP value.
[ 2925.641397] RIP:   (null) RSP: 8801d54f7790
[ 2925.641400] CR2: 
[ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---


Note that regular compute/gfx workflows work fine on the tip of drm-next; 
only vcn playback triggers this (haven't tried encode yet...).


Cheers,
Tom


Re: [PATCH libdrm] amdgpu: Deinitialize vamgr_high{,_32}

2018-05-01 Thread Andrey Grodzovsky

Reviewed-by: Andrey Grodzovsky 

Andrey


On 05/01/2018 04:03 AM, Michel Dänzer wrote:

On 2018-04-27 04:44 PM, Michel Dänzer wrote:

From: Michel Dänzer 

Fixes memory leaks.

Signed-off-by: Michel Dänzer 
---
  amdgpu/amdgpu_device.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index d81efcf8..983b19ab 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -128,6 +128,8 @@ static void 
amdgpu_device_free_internal(amdgpu_device_handle dev)
  {
amdgpu_vamgr_deinit(>vamgr_32);
amdgpu_vamgr_deinit(>vamgr);
+   amdgpu_vamgr_deinit(>vamgr_high_32);
+   amdgpu_vamgr_deinit(>vamgr_high);
util_hash_table_destroy(dev->bo_flink_names);
util_hash_table_destroy(dev->bo_handles);
pthread_mutex_destroy(>bo_table_mutex);


Any reviews? Without negative feedback, I'll push this tomorrow.






Re: [PATCH libdrm] amdgpu: Deinitialize vamgr_high{,_32}

2018-05-01 Thread Michel Dänzer
On 2018-04-27 04:44 PM, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> Fixes memory leaks.
> 
> Signed-off-by: Michel Dänzer 
> ---
>  amdgpu/amdgpu_device.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
> index d81efcf8..983b19ab 100644
> --- a/amdgpu/amdgpu_device.c
> +++ b/amdgpu/amdgpu_device.c
> @@ -128,6 +128,8 @@ static void 
> amdgpu_device_free_internal(amdgpu_device_handle dev)
>  {
>   amdgpu_vamgr_deinit(>vamgr_32);
>   amdgpu_vamgr_deinit(>vamgr);
> + amdgpu_vamgr_deinit(>vamgr_high_32);
> + amdgpu_vamgr_deinit(>vamgr_high);
>   util_hash_table_destroy(dev->bo_flink_names);
>   util_hash_table_destroy(dev->bo_handles);
>   pthread_mutex_destroy(>bo_table_mutex);
> 

Any reviews? Without negative feedback, I'll push this tomorrow.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH v3 2/3] drm/amdgpu: Allow dma_map_sg() coalescing

2018-05-01 Thread Robin Murphy
The amdgpu driver doesn't appear to directly use the scatterlist mapped
by amdgpu_ttm_tt_pin_userptr(), it merely hands it off to
drm_prime_sg_to_page_addr_arrays() to generate the dma_address array
which it actually cares about. Now that the latter can cope with
dma_map_sg() coalescing dma-contiguous segments such that it returns
0 < count < nents, we can relax the current count == nents check to
only consider genuine failure as other drivers do.
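
A minimal sketch of why `count == nents` is too strict a success test: a mapping step that merges address-contiguous segments can legitimately return fewer entries than it was given, and only a return of 0 means failure. The `toy_*` functions below are a hypothetical model of that behaviour, not the kernel DMA API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_seg {
    uint64_t addr;
    uint64_t len;
};

/*
 * Merge address-contiguous neighbours in place and return the
 * number of segments that remain. Returns 0 on failure (modelled
 * here as a missing or empty list).
 */
size_t toy_map_segments(struct toy_seg *segs, size_t nents)
{
    size_t out = 0;

    if (!segs || nents == 0)
        return 0;

    for (size_t i = 1; i < nents; i++) {
        if (segs[out].addr + segs[out].len == segs[i].addr)
            segs[out].len += segs[i].len;   /* coalesce */
        else
            segs[++out] = segs[i];          /* start a new segment */
    }
    return out + 1;
}

/* Caller with the relaxed check: only 0 is treated as an error. */
int toy_pin(struct toy_seg *segs, size_t nents, size_t *mapped)
{
    size_t count = toy_map_segments(segs, nents);

    if (count == 0)
        return -1;

    *mapped = count;
    return 0;
}
```

With three input segments of which the first two are contiguous, the mapping yields two segments; the old `count != nents` test would have rejected that result even though nothing went wrong.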

Reported-by: Sinan Kaya 
Reviewed-by: Christian König 
Signed-off-by: Robin Murphy 
---

v3: No change

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 205da3ff9cd0..f81e96a4242f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -813,7 +813,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
r = -ENOMEM;
nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents != ttm->sg->nents)
+   if (nents == 0)
goto release_sg;
 
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.17.0.dirty



[PATCH v3 3/3] drm/radeon: Allow dma_map_sg() coalescing

2018-05-01 Thread Robin Murphy
Much like amdgpu, the radeon driver doesn't appear to directly use the
scatterlist mapped by radeon_ttm_tt_pin_userptr(), it merely hands it
off to drm_prime_sg_to_page_addr_arrays() to generate the dma_address
array which it actually cares about. Now that the latter can cope with
dma_map_sg() coalescing dma-contiguous segments such that it returns
0 < count < nents, we can relax the current count == nents check to
only consider genuine failure as other drivers do.

Suggested-by: Christian König 
Reviewed-by: Christian König 
Signed-off-by: Robin Murphy 
---

v3: No change

 drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 8689fcca051c..7c099192c7fa 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -585,7 +585,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
r = -ENOMEM;
nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents != ttm->sg->nents)
+   if (nents == 0)
goto release_sg;
 
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.17.0.dirty



Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-05-01 Thread Oleg Nesterov
On 04/30, Christian König wrote:
>
> Well when the process is killed we don't care about correctness any more, we
> just want to get rid of it as quickly as possible (OOM situation etc...).

OK,

> But it is perfectly possible that a process submits some render commands and
> then calls exit() or terminates because of a SIGTERM, SIGINT etc..

This doesn't differ from SIGKILL. I mean, any unhandled fatal signal translates
to SIGKILL and I think this is fine.

but this doesn't really matter,

> So what we essentially need is to distinguish between a SIGKILL (which means
> stop processing as soon as possible) and any other reason because then we
> don't want to annoy the user with garbage on the screen (even if it's just
> for a few milliseconds).

For what?

OK, I see another email from Andrey, I'll reply to that email...

Oleg.



Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing

2018-05-01 Thread Robin Murphy

On 27/04/18 20:42, Sinan Kaya wrote:
> On 4/27/2018 11:54 AM, Robin Murphy wrote:
>>> ubuntu@ubuntu:~/amdgpu$ ./vectoradd_hip.exe
>>> [  834.002206] create_process:620
>>> [  837.413021] Unable to handle kernel NULL pointer dereference at virtual address 0018
>>
>> £5 says that's sg_dma_len(NULL), which implies either that something's gone
>> horribly wrong with the scatterlist DMA mapping such that the lengths don't
>> match, or much more likely that ttm.dma_address is NULL and I've missed the
>> tiny subtlety below. Does that fix matters?
>
> Turned out to be a null pointer problem after sg_next(). The following helped.

Ugh, right, the whole thing's in the wrong place such that when addrs is
valid we can dereference junk on the way out of the loop (entirely
needlessly)... v3 coming up.

Robin.

> +	if (addrs && (dma_len == 0)) {
>  		dma_sg = sg_next(dma_sg);
> -		dma_len = sg_dma_len(dma_sg);
> -		addr = sg_dma_address(dma_sg);
> +		if (dma_sg) {
> +			dma_len = sg_dma_len(dma_sg);
> +			addr = sg_dma_address(dma_sg);
> +		}
>  	}




Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-05-01 Thread Eric W. Biederman
Christian König  writes:

> Hi Eric,
>
> sorry for the late response, was on vacation last week.
>
> Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
>> Andrey Grodzovsky  writes:
>>
>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>>> able to exit immediately
>>>>> and not wait for GPU jobs completion when the reason for reaching this code
>>>>> is because of KILL
>>>>> signal to the user process who opened the device file.
>>>> Can you hook f_op->flush method?
>
> THANKS! That sounds like a really good idea to me and we haven't investigated
> into that direction yet.

For the backwards compatibility concerns you cite below the flush method
seems a much better place to introduce the wait.  You at least really
will be in a process context for that.  Still might be in exit but at
least you will legitimately be in a process.

>>> But this one is called for each task releasing a reference to the file, so
>>> not sure I see how this solves the problem.
>> The big question is why do you need to wait during the final closing a
>> file?
>
> As always it's because of historical reasons. Initially user space pushed
> commands directly to a hardware queue and when a process finished we didn't
> need to wait for anything.
>
> Then the GPU scheduler was introduced which delayed pushing the jobs to the
> hardware queue to a later point in time.
>
> This wait was then added to maintain backward compatibility and not break
> userspace (but see below).

That makes sense.

>> The wait can be terminated so the wait does not appear to be simply a
>> matter of correctness.
>
> Well when the process is killed we don't care about correctness any more, we
> just want to get rid of it as quickly as possible (OOM situation etc...).
>
> But it is perfectly possible that a process submits some render commands and
> then calls exit() or terminates because of a SIGTERM, SIGINT etc.. In this case
> we need to wait here to make sure that all rendering is pushed to the hardware
> because the scheduler might need resources/settings from the file
> descriptor.
>
> For example if you just remove that wait you could close firefox and get garbage
> on the screen for a millisecond because the remaining rendering commands were
> not executed.
>
> So what we essentially need is to distinguish between a SIGKILL (which means stop
> processing as soon as possible) and any other reason because then we don't want
> to annoy the user with garbage on the screen (even if it's just for a few
> milliseconds).

I see a couple of issues.

- Running the code in release rather than in flush.

Using flush will catch every close so it should be more backwards
compatible.  f_op->flush always runs in process context so looking at
current makes sense.

- Distinguishing between death by SIGKILL and other process exit deaths.

In f_op->flush the code can test "((tsk->flags & PF_EXITING) &&
(tsk->exit_code == SIGKILL))" to see if it was SIGKILL that terminated
the process.

- Dealing with stuck queues (where this patchset came in).

For stuck queues you are going to need a timeout instead of the current
indefinite wait after PF_EXITING is set.  From what you have described a
few milliseconds should be enough.  If PF_EXITING is not set you can
still just make the wait killable and skip the timeout if that will give
a better backwards compatible user experience.

What can't be done is try and catch SIGKILL after a process has called
do_exit.  A dead process is a dead process.

Eric
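
Note: the three points above amount to a small decision table for the wait in f_op->flush. The following is a minimal userspace sketch of that table, not kernel code. struct toy_task, enum toy_wait_policy and toy_flush_policy() are hypothetical names, and the fatal_sig field is an assumption that sidesteps the question of how the kernel should detect "killed by SIGKILL"; elsewhere in this thread Oleg objects to reading task->exit_code for that purpose.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical outcomes for the wait on queued GPU jobs. */
enum toy_wait_policy {
	TOY_WAIT_NONE,     /* killed by SIGKILL: do not wait at all */
	TOY_WAIT_TIMEOUT,  /* exiting for another reason: bounded wait */
	TOY_WAIT_KILLABLE, /* ordinary close(): killable wait, no timeout */
};

/* Hypothetical model of the task state consulted in flush. */
struct toy_task {
	bool exiting;   /* stands in for PF_EXITING */
	int  fatal_sig; /* signal that ended the task, 0 if none (assumption) */
};

static enum toy_wait_policy toy_flush_policy(const struct toy_task *tsk)
{
	assert(tsk);

	/* Not exiting: signals can still be delivered, so a killable wait works. */
	if (!tsk->exiting)
		return TOY_WAIT_KILLABLE;

	/* SIGKILL means stop as soon as possible. */
	if (tsk->fatal_sig == SIGKILL)
		return TOY_WAIT_NONE;

	/* Any other exit: wait, but with a timeout so a stuck queue cannot hang. */
	return TOY_WAIT_TIMEOUT;
}
```

A task that merely closes the file gets TOY_WAIT_KILLABLE, one that exits after SIGTERM gets TOY_WAIT_TIMEOUT, and one killed by SIGKILL gets TOY_WAIT_NONE.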


Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.

2018-05-01 Thread Oleg Nesterov
On 04/30, Andrey Grodzovsky wrote:
>
> What about changing PF_SIGNALED to  PF_EXITING in
> drm_sched_entity_do_release
>
> -   if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +  if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)

let me repeat, please don't use task->exit_code. And in fact this check is racy.

But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL,
or do something else, but I fail to understand what you are trying to do. Suppose
that the check above is correct in that it is true iff the task is exiting and
it was killed by SIGKILL. What about the "else" branch which does

r = wait_event_killable(sched->job_scheduled, ...)

?

Once again, fatal_signal_pending() (or even signal_pending()) is not well defined
after the exiting task passes exit_signals().

So wait_event_killable() can fail because fatal_signal_pending() is true; and this
can happen even if it was not killed.

Or it can block and SIGKILL won't be able to wake it up.

> If SIGINT was sent then it's SIGINT,

Yes, but see above. In this case fatal_signal_pending() will likely be true, so
wait_event_killable() will fail unless the condition is already true.

Oleg.



Re: [PATCH v3 1/3] drm/prime: Iterate SG DMA addresses separately

2018-05-01 Thread Sinan Kaya
On 4/30/2018 9:54 AM, Robin Murphy wrote:
> For dma_map_sg(), DMA API implementations are free to merge consecutive
> segments into a single DMA mapping if conditions are suitable, thus the
> resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
> iterates over may be packed into fewer entries than sgt->nents implies.
> 
> The current implementation does not account for this, meaning that its
> callers either have to reject the 0 < count < nents case or risk getting
> bogus DMA addresses beyond the first segment. Fortunately this is quite
> easy to handle without having to rejig structures to also store the
> mapped count, since the total DMA length should still be equal to the
> total buffer length. All we need is a second scatterlist cursor to
> iterate through the DMA addresses independently of the page addresses.
> 
> Reviewed-by: Christian König 
> Signed-off-by: Robin Murphy 
> ---

Much better 

Tested-by: Sinan Kaya 

for the first two patches. (1/3 and 2/3)

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


[REGRESSION] drm/amd/dc: Add dc display driver (v2)

2018-05-01 Thread Joseph Salisbury
Hi Harry,

A kernel bug report was opened against Ubuntu [0].  After a kernel
bisect, it was found the following commit introduced the bug:


commit 4562236b3bc0a28aeb6ee93b2d8a849a4c4e1c7c
Author: Harry Wentland 
Date:   Tue Sep 12 15:58:20 2017 -0400

    drm/amd/dc: Add dc display driver (v2)


The regression was introduced as of v4.15-rc1 and still exists in
current mainline.  The commit does not need to be reverted to resolve
the bug.  Disabling the CONFIG_DRM_AMD_DC_PRE_VEGA option makes the bug
go away.
   
I was hoping to get your feedback, since you are the patch author.  Do
you think gathering any additional data will help diagnose this issue?
   

Thanks,

Joe


[0] http://pad.lv/1761751



[PATCH v3 1/3] drm/prime: Iterate SG DMA addresses separately

2018-05-01 Thread Robin Murphy
For dma_map_sg(), DMA API implementations are free to merge consecutive
segments into a single DMA mapping if conditions are suitable, thus the
resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
iterates over may be packed into fewer entries than sgt->nents implies.

The current implementation does not account for this, meaning that its
callers either have to reject the 0 < count < nents case or risk getting
bogus DMA addresses beyond the first segment. Fortunately this is quite
easy to handle without having to rejig structures to also store the
mapped count, since the total DMA length should still be equal to the
total buffer length. All we need is a second scatterlist cursor to
iterate through the DMA addresses independently of the page addresses.

Reviewed-by: Christian König 
Signed-off-by: Robin Murphy 
---

v3: Move dma_len == 0 logic earlier to avoid iterating dma_sg too far

 drivers/gpu/drm/drm_prime.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..3e74c84d0baf 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -933,16 +933,24 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 				     dma_addr_t *addrs, int max_entries)
 {
 	unsigned count;
-	struct scatterlist *sg;
+	struct scatterlist *sg, *dma_sg;
 	struct page *page;
-	u32 len, index;
+	u32 len, dma_len, index;
 	dma_addr_t addr;
 
 	index = 0;
+	dma_sg = sgt->sgl;
+	dma_len = sg_dma_len(dma_sg);
+	addr = sg_dma_address(dma_sg);
 	for_each_sg(sgt->sgl, sg, sgt->nents, count) {
 		len = sg->length;
 		page = sg_page(sg);
-		addr = sg_dma_address(sg);
+
+		if (addrs && dma_len == 0) {
+			dma_sg = sg_next(dma_sg);
+			dma_len = sg_dma_len(dma_sg);
+			addr = sg_dma_address(dma_sg);
+		}
 
 		while (len > 0) {
 			if (WARN_ON(index >= max_entries))
@@ -955,6 +963,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 			page++;
 			addr += PAGE_SIZE;
 			len -= PAGE_SIZE;
+			dma_len -= PAGE_SIZE;
 			index++;
 		}
 	}
-- 
2.17.0.dirty
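
Note: the second-cursor idea from the commit message can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel implementation. struct toy_sg, TOY_PAGE_SIZE and toy_sg_to_addr_array() are hypothetical names, the list is a plain array rather than a chained scatterlist, and every length is assumed to be a multiple of the page size.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for a mapped scatterlist entry. */
struct toy_sg {
	unsigned int  length;      /* CPU-side segment length */
	unsigned long dma_address; /* valid only where dma_length != 0 */
	unsigned int  dma_length;  /* 0 marks a slot left unused by merging */
};

/*
 * Writes one DMA address per page into addrs[] and returns the number of
 * pages written. The index i walks the CPU-side segments; dma_i is the
 * second cursor that walks the (possibly fewer) DMA segments.
 */
static size_t toy_sg_to_addr_array(const struct toy_sg *sgl, size_t nents,
				   unsigned long *addrs, size_t max_entries)
{
	size_t i, dma_i = 0, index = 0;
	unsigned int len, dma_len;
	unsigned long addr;

	assert(sgl && addrs && nents > 0);

	dma_len = sgl[0].dma_length;
	addr = sgl[0].dma_address;

	for (i = 0; i < nents; i++) {
		len = sgl[i].length;

		/*
		 * Step the DMA cursor before consuming a CPU segment, so it
		 * is never advanced past the last DMA segment at the end.
		 */
		if (dma_len == 0) {
			dma_i++;
			assert(dma_i < nents);
			dma_len = sgl[dma_i].dma_length;
			addr = sgl[dma_i].dma_address;
		}

		while (len > 0) {
			if (index >= max_entries)
				return index;
			addrs[index++] = addr;
			addr += TOY_PAGE_SIZE;
			len -= TOY_PAGE_SIZE;
			dma_len -= TOY_PAGE_SIZE;
		}
	}
	return index;
}
```

For a list whose CPU segments are 1, 2, 1 and 1 pages long but whose mapping was merged into a 4-page DMA segment followed by a 1-page one, the function yields four consecutive addresses from the first DMA segment and then the start of the second.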
