RE: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

2022-01-28 Thread Zhou1, Tao
[AMD Official Use Only]

As a quick workaround, I agree with the solution. But regarding the root cause,
the list is still messed up.
Can we make ras_list a global variable shared across all cards, and add a
list-empty check (or a flag that records the registration status of each ras
block) before the list add, to avoid redundant registration?
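Roughly, such a guard could look like the sketch below (illustrative only:
the ->registered flag and the helper name are hypothetical, not existing
amdgpu API):

/* One global list shared by all cards; each RAS block instance is
 * registered exactly once, no matter how many devices are probed. */
static LIST_HEAD(ras_list);

void amdgpu_ras_register_ras_block(struct amdgpu_ras_block_object *obj)
{
	/* A flag (or a list_empty(&obj->node) check) prevents re-adding
	 * the same node, which would corrupt previously built lists. */
	if (obj->registered)
		return;

	list_add_tail(&obj->node, &ras_list);
	obj->registered = true;
}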

Regards,
Tao

> -Original Message-
> From: Chai, Thomas 
> Sent: Saturday, January 29, 2022 11:53 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chai, Thomas ; Zhang, Hawking
> ; Zhou1, Tao ; Clements,
> John ; Chai, Thomas 
> Subject: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop
> 
> 1. The infinite loop causing a soft lockup occurs on systems with multiple
>    amdgpu cards that support the RAS feature.
> 2. This is a workaround patch. It is valid for multiple amdgpu cards of the
>    same type.
> 3. The root cause is that each GPU device has its own .ras_list list head,
>    but each RAS block has only one instance and one list node. Every time a
>    device is initialized, every RAS instance adds its list node to that
>    device's list again. As a result, only the .ras_list of the last
>    initialized device is completely correct. The .ras_list->prev and
>    .ras_list->next of a device initialized earlier can still point to the
>    correct RAS instances, but the prev and next pointers of those instances
>    point into the last initialized device's .ras_list instead of back to
>    the original .ras_list. When list_for_each_entry_safe searches for a
>    non-existent RAS node on any device other than the last one, the next
>    pointer of the last RAS instance never equals that device's .ras_list
>    sentinel, so the loop cannot terminate and the program enters an
>    infinite loop.
>    Note: since the data and initialization flow of every card are the same,
>    the links between RAS instances themselves are not destroyed by each
>    device initialization.
> 4. The soft lockup log is as follows:
> [  262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G   OE 5.13.0-27-generic #29~20.04.1-Ubuntu
> [  262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
> [  262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
> [  262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
> [  262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
> [  262.166243] RSP: 0018:ac908fa87d80 EFLAGS: 0202
> [  262.166247] RAX: c1394248 RBX: 91e4ab8d6e20 RCX: c1394248
> [  262.166249] RDX: 91e4aa356e20 RSI: 000e RDI: 91e4ab8c
> [  262.166252] RBP: ac908fa87da8 R08: 0007 R09: 0001
> [  262.166254] R10: 91e4930b64ec R11:  R12: 000e
> [  262.166256] R13: 91e4aa356df8 R14: c1394320 R15: 0003
> [  262.166258] FS:  () GS:92238fb4() knlGS:
> [  262.166261] CS:  0010 DS:  ES:  CR0: 80050033
> [  262.166264] CR2: 0001004865d0 CR3: 00406d796000 CR4: 00350ee0
> [  262.166267] Call Trace:
> [  262.166272]  amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
> [  262.166529]  ? psi_task_switch+0xd2/0x250
> [  262.166537]  ? __switch_to+0x11d/0x460
> [  262.166542]  ? __switch_to_asm+0x36/0x70
> [  262.166549]  process_one_work+0x220/0x3c0
> [  262.166556]  worker_thread+0x4d/0x3f0
> [  262.166560]  ? process_one_work+0x3c0/0x3c0
> [  262.166563]  kthread+0x12b/0x150
> [  262.166568]  ? set_kthread_struct+0x40/0x40
> [  262.166571]  ret_from_fork+0x22/0x30
> 
> Signed-off-by: yipechai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index d4e07d0acb66..3d533ef0783d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -884,6 +884,7 @@ static int amdgpu_ras_block_match_default(struct amdgpu_ras_block_object *block_
>  static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_device *adev,
>  					enum amdgpu_ras_block block, uint32_t sub_block_index)
>  {
> +	int loop_cnt = 0;
>  	struct amdgpu_ras_block_object *obj, *tmp;
> 
>  	if (block >= AMDGPU_RAS_BLOCK__LAST)
> @@ -900,6 +901,9 @@ static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_de
>  			if (amdgpu_ras_block_match_default(obj, block) == 0)
>  				return obj;
>  		}
> +
> +		if (++loop_cnt >= AMDGPU_RAS_BLOCK__LAST)
> +			break;
>  	}
> 
>   return NULL;
> --
> 2.25.1
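To make the corruption described above concrete, here is a minimal
stand-alone C demonstration (a simplified circular list in place of the
kernel's struct list_head; illustrative only):

#include <stdio.h>

struct node { struct node *prev, *next; };

static void list_init(struct node *h) { h->prev = h->next = h; }

static void list_add(struct node *n, struct node *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

int main(void)
{
	struct node dev0_list, dev1_list, ras_block;
	struct node *p;
	int steps = 0;

	list_init(&dev0_list);
	list_init(&dev1_list);

	/* The single RAS block node is added to dev0's list, then re-added
	 * to dev1's list, just as each device init re-registers it. */
	list_add(&ras_block, &dev0_list);
	list_add(&ras_block, &dev1_list);

	/* dev0_list.next still points at ras_block, but ras_block.next now
	 * points at dev1_list, so a walk of dev0's list never reaches its
	 * own sentinel again (bounded here only for demonstration). */
	for (p = dev0_list.next; p != &dev0_list; p = p->next)
		if (++steps > 8) {
			puts("dev0_list sentinel never reached");
			break;
		}
	return 0;
}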


Re: [PATCH] drm/amdgpu: fix a potential GPU hang on cyan skillfish

2022-01-28 Thread Huang Rui
On Fri, Jan 28, 2022 at 06:43:23PM +0800, Yu, Lang wrote:
> We observed a GPU hang when querying the GMC CG state (i.e.,
> cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
> skillfish doesn't support any CG features.
> 
> Just prevent cyan skillfish from accessing GMC CG registers.
> 
> Signed-off-by: Lang Yu 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index 73ab0eebe4e2..bddaf2417344 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void 
> *handle, u32 *flags)
>  {
>   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>  
> + if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))
> + return;
> +
>   adev->mmhub.funcs->get_clockgating(adev, flags);
>  
>   if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
> -- 
> 2.25.1
> 
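For reference, the guard works because amdgpu packs an IP block's
major/minor/revision into one comparable integer. A sketch of the idea
(this mirrors amdgpu's IP_VERSION macro; shown here only for illustration):

/* Pack maj.min.rev so discrete IP versions compare with ==, >=, etc. */
#define IP_VERSION(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

/* GC 10.1.3 is cyan skillfish; it exposes no clockgating features, so
 * gmc_v10_0_get_clockgating_state() returns early instead of touching
 * (and hanging on) the GMC CG registers. */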


RE: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

2022-01-28 Thread Chai, Thomas
OK

-Original Message-
From: Chen, Guchun  
Sent: Saturday, January 29, 2022 12:02 PM
To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
Cc: Zhou1, Tao ; Zhang, Hawking ; 
Clements, John ; Chai, Thomas ; 
Chai, Thomas 
Subject: RE: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

[Public]

Please add a Fixes tag, as it should fix a regression from a former patch.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of yipechai
Sent: Saturday, January 29, 2022 11:53 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhou1, Tao ; Zhang, Hawking ; 
Clements, John ; Chai, Thomas ; 
Chai, Thomas 
Subject: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

1. The infinite loop causing a soft lockup occurs on systems with multiple
   amdgpu cards that support the RAS feature.
2. This is a workaround patch. It is valid for multiple amdgpu cards of the
   same type.
3. The root cause is that each GPU device has its own .ras_list list head,
   but each RAS block has only one instance and one list node. Every time a
   device is initialized, every RAS instance adds its list node to that
   device's list again. As a result, only the .ras_list of the last
   initialized device is completely correct. The .ras_list->prev and
   .ras_list->next of a device initialized earlier can still point to the
   correct RAS instances, but the prev and next pointers of those instances
   point into the last initialized device's .ras_list instead of back to
   the original .ras_list. When list_for_each_entry_safe searches for a
   non-existent RAS node on any device other than the last one, the next
   pointer of the last RAS instance never equals that device's .ras_list
   sentinel, so the loop cannot terminate and the program enters an
   infinite loop.
   Note: since the data and initialization flow of every card are the same,
   the links between RAS instances themselves are not destroyed by each
   device initialization.
4. The soft lockup log is as follows:
[  262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G   OE 5.13.0-27-generic #29~20.04.1-Ubuntu
[  262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
[  262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
[  262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
[  262.166243] RSP: 0018:ac908fa87d80 EFLAGS: 0202
[  262.166247] RAX: c1394248 RBX: 91e4ab8d6e20 RCX: c1394248
[  262.166249] RDX: 91e4aa356e20 RSI: 000e RDI: 91e4ab8c
[  262.166252] RBP: ac908fa87da8 R08: 0007 R09: 0001
[  262.166254] R10: 91e4930b64ec R11:  R12: 000e
[  262.166256] R13: 91e4aa356df8 R14: c1394320 R15: 0003
[  262.166258] FS:  () GS:92238fb4() knlGS:
[  262.166261] CS:  0010 DS:  ES:  CR0: 80050033
[  262.166264] CR2: 0001004865d0 CR3: 00406d796000 CR4: 00350ee0
[  262.166267] Call Trace:
[  262.166272]  amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
[  262.166529]  ? psi_task_switch+0xd2/0x250
[  262.166537]  ? __switch_to+0x11d/0x460
[  262.166542]  ? __switch_to_asm+0x36/0x70
[  262.166549]  process_one_work+0x220/0x3c0
[  262.166556]  worker_thread+0x4d/0x3f0
[  262.166560]  ? process_one_work+0x3c0/0x3c0
[  262.166563]  kthread+0x12b/0x150
[  262.166568]  ? set_kthread_struct+0x40/0x40
[  262.166571]  ret_from_fork+0x22/0x30

Signed-off-by: yipechai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index d4e07d0acb66..3d533ef0783d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -884,6 +884,7 @@ static int amdgpu_ras_block_match_default(struct amdgpu_ras_block_object *block_
 static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_device *adev,
 					enum amdgpu_ras_block block, uint32_t sub_block_index)
 {
+	int loop_cnt = 0;
 	struct amdgpu_ras_block_object *obj, *tmp;
 
 	if (block >= AMDGPU_RAS_BLOCK__LAST)
@@ -900,6 +901,9 @@ static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_de
 			if (amdgpu_ras_block_match_default(obj, block) == 0)
 				return obj;
 		}
+
+		if (++loop_cnt >= AMDGPU_RAS_BLOCK__LAST)
+			break;
 	}
 
return NULL;
--
2.25.1


RE: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

2022-01-28 Thread Clements, John
[AMD Official Use Only]

Reviewed-by: John Clements 

-Original Message-
From: Chai, Thomas  
Sent: Saturday, January 29, 2022 11:53 AM
To: amd-gfx@lists.freedesktop.org
Cc: Chai, Thomas ; Zhang, Hawking ; 
Zhou1, Tao ; Clements, John ; Chai, 
Thomas 
Subject: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

1. The infinite loop causing a soft lockup occurs on systems with multiple
   amdgpu cards that support the RAS feature.
2. This is a workaround patch. It is valid for multiple amdgpu cards of the
   same type.
3. The root cause is that each GPU device has its own .ras_list list head,
   but each RAS block has only one instance and one list node. Every time a
   device is initialized, every RAS instance adds its list node to that
   device's list again. As a result, only the .ras_list of the last
   initialized device is completely correct. The .ras_list->prev and
   .ras_list->next of a device initialized earlier can still point to the
   correct RAS instances, but the prev and next pointers of those instances
   point into the last initialized device's .ras_list instead of back to
   the original .ras_list. When list_for_each_entry_safe searches for a
   non-existent RAS node on any device other than the last one, the next
   pointer of the last RAS instance never equals that device's .ras_list
   sentinel, so the loop cannot terminate and the program enters an
   infinite loop.
   Note: since the data and initialization flow of every card are the same,
   the links between RAS instances themselves are not destroyed by each
   device initialization.
4. The soft lockup log is as follows:
[  262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G   OE 5.13.0-27-generic #29~20.04.1-Ubuntu
[  262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
[  262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
[  262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
[  262.166243] RSP: 0018:ac908fa87d80 EFLAGS: 0202
[  262.166247] RAX: c1394248 RBX: 91e4ab8d6e20 RCX: c1394248
[  262.166249] RDX: 91e4aa356e20 RSI: 000e RDI: 91e4ab8c
[  262.166252] RBP: ac908fa87da8 R08: 0007 R09: 0001
[  262.166254] R10: 91e4930b64ec R11:  R12: 000e
[  262.166256] R13: 91e4aa356df8 R14: c1394320 R15: 0003
[  262.166258] FS:  () GS:92238fb4() knlGS:
[  262.166261] CS:  0010 DS:  ES:  CR0: 80050033
[  262.166264] CR2: 0001004865d0 CR3: 00406d796000 CR4: 00350ee0
[  262.166267] Call Trace:
[  262.166272]  amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
[  262.166529]  ? psi_task_switch+0xd2/0x250
[  262.166537]  ? __switch_to+0x11d/0x460
[  262.166542]  ? __switch_to_asm+0x36/0x70
[  262.166549]  process_one_work+0x220/0x3c0
[  262.166556]  worker_thread+0x4d/0x3f0
[  262.166560]  ? process_one_work+0x3c0/0x3c0
[  262.166563]  kthread+0x12b/0x150
[  262.166568]  ? set_kthread_struct+0x40/0x40
[  262.166571]  ret_from_fork+0x22/0x30

Signed-off-by: yipechai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index d4e07d0acb66..3d533ef0783d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -884,6 +884,7 @@ static int amdgpu_ras_block_match_default(struct amdgpu_ras_block_object *block_
 static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_device *adev,
 					enum amdgpu_ras_block block, uint32_t sub_block_index)
 {
+	int loop_cnt = 0;
 	struct amdgpu_ras_block_object *obj, *tmp;
 
 	if (block >= AMDGPU_RAS_BLOCK__LAST)
@@ -900,6 +901,9 @@ static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_de
 			if (amdgpu_ras_block_match_default(obj, block) == 0)
 				return obj;
 		}
+
+		if (++loop_cnt >= AMDGPU_RAS_BLOCK__LAST)
+			break;
 	}
 
return NULL;
--
2.25.1


RE: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

2022-01-28 Thread Chen, Guchun
[Public]

Please add a Fixes tag, as it should fix a regression from a former patch.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of yipechai
Sent: Saturday, January 29, 2022 11:53 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhou1, Tao ; Zhang, Hawking ; 
Clements, John ; Chai, Thomas ; 
Chai, Thomas 
Subject: [PATCH] drm/amdgpu: Add judgement to avoid infinite loop

1. The infinite loop causing a soft lockup occurs on systems with multiple
   amdgpu cards that support the RAS feature.
2. This is a workaround patch. It is valid for multiple amdgpu cards of the
   same type.
3. The root cause is that each GPU device has its own .ras_list list head,
   but each RAS block has only one instance and one list node. Every time a
   device is initialized, every RAS instance adds its list node to that
   device's list again. As a result, only the .ras_list of the last
   initialized device is completely correct. The .ras_list->prev and
   .ras_list->next of a device initialized earlier can still point to the
   correct RAS instances, but the prev and next pointers of those instances
   point into the last initialized device's .ras_list instead of back to
   the original .ras_list. When list_for_each_entry_safe searches for a
   non-existent RAS node on any device other than the last one, the next
   pointer of the last RAS instance never equals that device's .ras_list
   sentinel, so the loop cannot terminate and the program enters an
   infinite loop.
   Note: since the data and initialization flow of every card are the same,
   the links between RAS instances themselves are not destroyed by each
   device initialization.
4. The soft lockup log is as follows:
[  262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G   OE 5.13.0-27-generic #29~20.04.1-Ubuntu
[  262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
[  262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
[  262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
[  262.166243] RSP: 0018:ac908fa87d80 EFLAGS: 0202
[  262.166247] RAX: c1394248 RBX: 91e4ab8d6e20 RCX: c1394248
[  262.166249] RDX: 91e4aa356e20 RSI: 000e RDI: 91e4ab8c
[  262.166252] RBP: ac908fa87da8 R08: 0007 R09: 0001
[  262.166254] R10: 91e4930b64ec R11:  R12: 000e
[  262.166256] R13: 91e4aa356df8 R14: c1394320 R15: 0003
[  262.166258] FS:  () GS:92238fb4() knlGS:
[  262.166261] CS:  0010 DS:  ES:  CR0: 80050033
[  262.166264] CR2: 0001004865d0 CR3: 00406d796000 CR4: 00350ee0
[  262.166267] Call Trace:
[  262.166272]  amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
[  262.166529]  ? psi_task_switch+0xd2/0x250
[  262.166537]  ? __switch_to+0x11d/0x460
[  262.166542]  ? __switch_to_asm+0x36/0x70
[  262.166549]  process_one_work+0x220/0x3c0
[  262.166556]  worker_thread+0x4d/0x3f0
[  262.166560]  ? process_one_work+0x3c0/0x3c0
[  262.166563]  kthread+0x12b/0x150
[  262.166568]  ? set_kthread_struct+0x40/0x40
[  262.166571]  ret_from_fork+0x22/0x30

Signed-off-by: yipechai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index d4e07d0acb66..3d533ef0783d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -884,6 +884,7 @@ static int amdgpu_ras_block_match_default(struct amdgpu_ras_block_object *block_
 static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_device *adev,
 					enum amdgpu_ras_block block, uint32_t sub_block_index)
 {
+	int loop_cnt = 0;
 	struct amdgpu_ras_block_object *obj, *tmp;
 
 	if (block >= AMDGPU_RAS_BLOCK__LAST)
@@ -900,6 +901,9 @@ static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_de
 			if (amdgpu_ras_block_match_default(obj, block) == 0)
 				return obj;
 		}
+
+		if (++loop_cnt >= AMDGPU_RAS_BLOCK__LAST)
+			break;
 	}
 
return NULL;
--
2.25.1


[PATCH] drm/amdgpu: Add judgement to avoid infinite loop

2022-01-28 Thread yipechai
1. The infinite loop causing a soft lockup occurs on systems with multiple
   amdgpu cards that support the RAS feature.
2. This is a workaround patch. It is valid for multiple amdgpu cards of the
   same type.
3. The root cause is that each GPU device has its own .ras_list list head,
   but each RAS block has only one instance and one list node. Every time a
   device is initialized, every RAS instance adds its list node to that
   device's list again. As a result, only the .ras_list of the last
   initialized device is completely correct. The .ras_list->prev and
   .ras_list->next of a device initialized earlier can still point to the
   correct RAS instances, but the prev and next pointers of those instances
   point into the last initialized device's .ras_list instead of back to
   the original .ras_list. When list_for_each_entry_safe searches for a
   non-existent RAS node on any device other than the last one, the next
   pointer of the last RAS instance never equals that device's .ras_list
   sentinel, so the loop cannot terminate and the program enters an
   infinite loop.
   Note: since the data and initialization flow of every card are the same,
   the links between RAS instances themselves are not destroyed by each
   device initialization.
4. The soft lockup log is as follows:
[  262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G   OE 
5.13.0-27-generic #29~20.04.1-Ubuntu
[  262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS 
T20200717143848 07/17/2020
[  262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
[  262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 
32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 
28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
[  262.166243] RSP: 0018:ac908fa87d80 EFLAGS: 0202
[  262.166247] RAX: c1394248 RBX: 91e4ab8d6e20 RCX: c1394248
[  262.166249] RDX: 91e4aa356e20 RSI: 000e RDI: 91e4ab8c
[  262.166252] RBP: ac908fa87da8 R08: 0007 R09: 0001
[  262.166254] R10: 91e4930b64ec R11:  R12: 000e
[  262.166256] R13: 91e4aa356df8 R14: c1394320 R15: 0003
[  262.166258] FS:  () GS:92238fb4() 
knlGS:
[  262.166261] CS:  0010 DS:  ES:  CR0: 80050033
[  262.166264] CR2: 0001004865d0 CR3: 00406d796000 CR4: 00350ee0
[  262.166267] Call Trace:
[  262.166272]  amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
[  262.166529]  ? psi_task_switch+0xd2/0x250
[  262.166537]  ? __switch_to+0x11d/0x460
[  262.166542]  ? __switch_to_asm+0x36/0x70
[  262.166549]  process_one_work+0x220/0x3c0
[  262.166556]  worker_thread+0x4d/0x3f0
[  262.166560]  ? process_one_work+0x3c0/0x3c0
[  262.166563]  kthread+0x12b/0x150
[  262.166568]  ? set_kthread_struct+0x40/0x40
[  262.166571]  ret_from_fork+0x22/0x30

Signed-off-by: yipechai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index d4e07d0acb66..3d533ef0783d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -884,6 +884,7 @@ static int amdgpu_ras_block_match_default(struct amdgpu_ras_block_object *block_
 static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_device *adev,
 					enum amdgpu_ras_block block, uint32_t sub_block_index)
 {
+	int loop_cnt = 0;
 	struct amdgpu_ras_block_object *obj, *tmp;
 
 	if (block >= AMDGPU_RAS_BLOCK__LAST)
@@ -900,6 +901,9 @@ static struct amdgpu_ras_block_object *amdgpu_ras_get_ras_block(struct amdgpu_de
 			if (amdgpu_ras_block_match_default(obj, block) == 0)
 				return obj;
 		}
+
+		if (++loop_cnt >= AMDGPU_RAS_BLOCK__LAST)
+			break;
 	}
 
return NULL;
-- 
2.25.1
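As a stand-alone illustration of the workaround's effect, a bounded walk
guarantees termination even on a corrupted list. A sketch (simplified, not
the kernel code itself: struct ras_obj, block_id and MAX_RAS_BLOCKS are
illustrative stand-ins, and <linux/list.h> is assumed):

struct ras_obj *find_block(struct list_head *head, int block_id)
{
	struct ras_obj *obj, *tmp;
	int loop_cnt = 0;

	list_for_each_entry_safe(obj, tmp, head, node) {
		if (obj->block_id == block_id)
			return obj;
		/* More iterations than blocks can legitimately exist
		 * proves the list is corrupted: give up instead of
		 * spinning into a soft lockup. */
		if (++loop_cnt >= MAX_RAS_BLOCKS)
			break;
	}
	return NULL;
}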



RE: [PATCH] drm/amd/display: Update watermark values for DCN301

2022-01-28 Thread Liu, Zhan
[Public]

> -Original Message-
> From: amd-gfx  On Behalf Of Agustin
> Gutierrez
> Sent: 2022/January/28, Friday 6:07 PM
> To: amd-gfx@lists.freedesktop.org; Gutierrez, Agustin
> 
> Cc: Gutierrez, Agustin 
> Subject: [PATCH] drm/amd/display: Update watermark values for DCN301
>
> [Why]
> There is underflow / visual corruption on DCN301 for high
> bandwidth MST DSC configurations such as 2x1440p144 or 2x4k60.
>
> [How]
> Use up-to-date watermark values for DCN301.
>
> Signed-off-by: Agustin Gutierrez 

Looks good to me.

Reviewed-by: Zhan Liu 

> ---
>  .../amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c   | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> index 48005def1164..bc4ddc36fe58 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
> @@ -570,32 +570,32 @@ static struct wm_table lpddr5_wm_table = {
>   .wm_inst = WM_A,
>   .wm_type = WM_TYPE_PSTATE_CHG,
>   .pstate_latency_us = 11.65333,
> - .sr_exit_time_us = 7.95,
> - .sr_enter_plus_exit_time_us = 9,
> + .sr_exit_time_us = 13.5,
> + .sr_enter_plus_exit_time_us = 16.5,
>   .valid = true,
>   },
>   {
>   .wm_inst = WM_B,
>   .wm_type = WM_TYPE_PSTATE_CHG,
>   .pstate_latency_us = 11.65333,
> - .sr_exit_time_us = 9.82,
> - .sr_enter_plus_exit_time_us = 11.196,
> + .sr_exit_time_us = 13.5,
> + .sr_enter_plus_exit_time_us = 16.5,
>   .valid = true,
>   },
>   {
>   .wm_inst = WM_C,
>   .wm_type = WM_TYPE_PSTATE_CHG,
>   .pstate_latency_us = 11.65333,
> - .sr_exit_time_us = 9.89,
> - .sr_enter_plus_exit_time_us = 11.24,
> + .sr_exit_time_us = 13.5,
> + .sr_enter_plus_exit_time_us = 16.5,
>   .valid = true,
>   },
>   {
>   .wm_inst = WM_D,
>   .wm_type = WM_TYPE_PSTATE_CHG,
>   .pstate_latency_us = 11.65333,
> - .sr_exit_time_us = 9.748,
> - .sr_enter_plus_exit_time_us = 11.102,
> + .sr_exit_time_us = 13.5,
> + .sr_enter_plus_exit_time_us = 16.5,
>   .valid = true,
>   },
>   }
> --
> 2.25.1



[PATCH] drm/amd/display: Update watermark values for DCN301

2022-01-28 Thread Agustin Gutierrez
[Why]
There is underflow / visual corruption on DCN301 for high
bandwidth MST DSC configurations such as 2x1440p144 or 2x4k60.

[How]
Use up-to-date watermark values for DCN301.

Signed-off-by: Agustin Gutierrez 
---
 .../amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c   | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
index 48005def1164..bc4ddc36fe58 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c
@@ -570,32 +570,32 @@ static struct wm_table lpddr5_wm_table = {
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 7.95,
-   .sr_enter_plus_exit_time_us = 9,
+   .sr_exit_time_us = 13.5,
+   .sr_enter_plus_exit_time_us = 16.5,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 9.82,
-   .sr_enter_plus_exit_time_us = 11.196,
+   .sr_exit_time_us = 13.5,
+   .sr_enter_plus_exit_time_us = 16.5,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 9.89,
-   .sr_enter_plus_exit_time_us = 11.24,
+   .sr_exit_time_us = 13.5,
+   .sr_enter_plus_exit_time_us = 16.5,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 9.748,
-   .sr_enter_plus_exit_time_us = 11.102,
+   .sr_exit_time_us = 13.5,
+   .sr_enter_plus_exit_time_us = 16.5,
.valid = true,
},
}
-- 
2.25.1



Re: [RFC PATCH v6 0/3] Add support modifiers for drivers whose planes only support linear layout

2022-01-28 Thread Daniel Vetter
On Fri, Jan 28, 2022 at 03:08:33PM +0900, Tomohito Esaki wrote:
> Some drivers whose planes only support linear-layout framebuffers do not
> support format modifiers.
> These drivers should support modifiers; however, the DRM core should
> handle this rather than open-coding it in every driver.
> 
> In this patch series, these drivers expose format modifiers based on the
> following suggestion[1].
> 
> On Thu, Nov 18, 2021 at 01:02:11PM +, Daniel Stone wrote:
> > I think the best way forward here is:
> >   - add a new mode_config.cannot_support_modifiers flag, and enable
> > this in radeon (plus any other drivers in the same boat)
> >   - change drm_universal_plane_init() to advertise the LINEAR modifier
> > when NULL is passed as the modifier list (including installing a
> > default .format_mod_supported hook)
> >   - remove the mode_config.allow_fb_modifiers hook and always
> > advertise modifier support, unless
> > mode_config.cannot_support_modifiers is set
> 
> 
> [1] 
> https://patchwork.kernel.org/project/linux-renesas-soc/patch/20190509054518.10781-1-e...@igel.co.jp/#24602575
> 
> v6:
> * add Reviewed-by and Acked-by
> * add a changelog per-patch

Thanks for resending with all that added, makes my life so much easier!

All applied, thanks a bunch.

Cheers, Daniel

> 
> v5: https://www.spinics.net/lists/dri-devel/msg330860.html
> * rebase to the latest master branch (5.17-rc1+)
>   + "drm/plane: Make format_mod_supported truly optional" patch [2]
>   [2] https://patchwork.freedesktop.org/patch/467940/?series=98255=3
> 
> * change default_modifiers array from non-static to static
> * remove terminator in default_modifiers array
> * use ARRAY_SIZE to get the format_modifier_count
> * keep a sanity check in plane init func
> * modify several kerneldocs
> 
> v4: https://www.spinics.net/lists/dri-devel/msg329508.html
> * modify documentation for fb_modifiers_not_supported flag in kerneldoc
> 
> v3: https://www.spinics.net/lists/dri-devel/msg329102.html
> * change the order as follows:
>1. add fb_modifiers_not_supported flag
>2. add default modifiers
>3. remove allow_fb_modifiers flag
> * add a conditional disable in amdgpu_dm_plane_init()
> 
> v2: https://www.spinics.net/lists/dri-devel/msg328939.html
> * rebase to the latest master branch (5.16.0+)
>   + "drm/plane: Make format_mod_supported truly optional" patch [2]
> 
> v1: https://www.spinics.net/lists/dri-devel/msg327352.html
> * The initial patch set
> 
> Tomohito Esaki (3):
>   drm: introduce fb_modifiers_not_supported flag in mode_config
>   drm: add support modifiers for drivers whose planes only support
> linear layout
>   drm: remove allow_fb_modifiers
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  6 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c|  2 ++
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c|  2 ++
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c |  1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c |  2 ++
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  3 +++
>  drivers/gpu/drm/drm_framebuffer.c |  6 ++---
>  drivers/gpu/drm/drm_ioctl.c   |  2 +-
>  drivers/gpu/drm/drm_plane.c   | 23 +++
>  drivers/gpu/drm/nouveau/nouveau_display.c |  6 +++--
>  drivers/gpu/drm/radeon/radeon_display.c   |  2 ++
>  .../gpu/drm/selftests/test-drm_framebuffer.c  |  1 -
>  include/drm/drm_mode_config.h | 18 +--
>  include/drm/drm_plane.h   |  3 +++
>  14 files changed, 45 insertions(+), 32 deletions(-)
> 
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
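For context, a sketch of the approach Daniel Stone outlined and this series
implements (names follow the cover letter; treat this as an illustration
rather than the exact merged hunk):

/* In the DRM core: if a driver passes NULL modifiers to
 * drm_universal_plane_init(), advertise LINEAR on its behalf, unless the
 * driver opted out via fb_modifiers_not_supported. */
static const uint64_t default_modifiers[] = {
	DRM_FORMAT_MOD_LINEAR,
};

	/* inside __drm_universal_plane_init(...): */
	if (!format_modifiers && !dev->mode_config.fb_modifiers_not_supported) {
		format_modifiers = default_modifiers;
		format_modifier_count = ARRAY_SIZE(default_modifiers);
	}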


[PATCH v5 09/10] tools: update hmm-test to support device coherent type

2022-01-28 Thread Alex Sierra
Test cases such as migrate_fault and migrate_multiple were modified to
migrate explicitly from device to system memory, without the need for
page faults, when using the device coherent type.

The snapshot test case was updated to read the memory device type first
and, based on that, check for the proper returned results. The
migrate_ping_pong test case was added to test explicit migration from
device to system memory for both private and coherent zone types.

Helpers to migrate from device to system memory and vice versa
were also added.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
---
v2:
Set FIXTURE_VARIANT to add multiple device types to the FIXTURE. This
will run all the tests for each device type (private and coherent) in
case both existed when the hmm-test driver was probed.
v4:
Check for the number of pages successfully migrated from coherent
device to system in the migrate_multiple test.
---
 tools/testing/selftests/vm/hmm-tests.c | 123 -
 1 file changed, 102 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 203323967b50..84ec8c4a1dc7 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -44,6 +44,14 @@ struct hmm_buffer {
int fd;
uint64_tcpages;
uint64_tfaults;
+   int zone_device_type;
+};
+
+enum {
+   HMM_PRIVATE_DEVICE_ONE,
+   HMM_PRIVATE_DEVICE_TWO,
+   HMM_COHERENCE_DEVICE_ONE,
+   HMM_COHERENCE_DEVICE_TWO,
 };
 
 #define TWOMEG (1 << 21)
@@ -60,6 +68,21 @@ FIXTURE(hmm)
unsigned intpage_shift;
 };
 
+FIXTURE_VARIANT(hmm)
+{
+   int device_number;
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_private)
+{
+   .device_number = HMM_PRIVATE_DEVICE_ONE,
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_coherent)
+{
+   .device_number = HMM_COHERENCE_DEVICE_ONE,
+};
+
 FIXTURE(hmm2)
 {
int fd0;
@@ -68,6 +91,24 @@ FIXTURE(hmm2)
unsigned intpage_shift;
 };
 
+FIXTURE_VARIANT(hmm2)
+{
+   int device_number0;
+   int device_number1;
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_private)
+{
+   .device_number0 = HMM_PRIVATE_DEVICE_ONE,
+   .device_number1 = HMM_PRIVATE_DEVICE_TWO,
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_coherent)
+{
+   .device_number0 = HMM_COHERENCE_DEVICE_ONE,
+   .device_number1 = HMM_COHERENCE_DEVICE_TWO,
+};
+
 static int hmm_open(int unit)
 {
char pathname[HMM_PATH_MAX];
@@ -81,12 +122,19 @@ static int hmm_open(int unit)
return fd;
 }
 
+static bool hmm_is_coherent_type(int dev_num)
+{
+   return (dev_num >= HMM_COHERENCE_DEVICE_ONE);
+}
+
 FIXTURE_SETUP(hmm)
 {
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
 
-   self->fd = hmm_open(0);
+   self->fd = hmm_open(variant->device_number);
+   if (self->fd < 0 && hmm_is_coherent_type(variant->device_number))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd, 0);
 }
 
@@ -95,9 +143,11 @@ FIXTURE_SETUP(hmm2)
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
 
-   self->fd0 = hmm_open(0);
+   self->fd0 = hmm_open(variant->device_number0);
+   if (self->fd0 < 0 && hmm_is_coherent_type(variant->device_number0))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd0, 0);
-   self->fd1 = hmm_open(1);
+   self->fd1 = hmm_open(variant->device_number1);
ASSERT_GE(self->fd1, 0);
 }
 
@@ -144,6 +194,7 @@ static int hmm_dmirror_cmd(int fd,
}
buffer->cpages = cmd.cpages;
buffer->faults = cmd.faults;
+   buffer->zone_device_type = cmd.zone_device_type;
 
return 0;
 }
@@ -211,6 +262,20 @@ static void hmm_nanosleep(unsigned int n)
nanosleep(, NULL);
 }
 
+static int hmm_migrate_sys_to_dev(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_DEV, buffer, npages);
+}
+
+static int hmm_migrate_dev_to_sys(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_SYS, buffer, npages);
+}
+
 /*
  * Simple NULL test of device open/close.
  */
@@ -875,7 +940,7 @@ TEST_F(hmm, migrate)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -923,7 +988,7 @@ TEST_F(hmm, migrate_fault)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, 
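For readers unfamiliar with the kselftest harness, a minimal sketch of the
FIXTURE_VARIANT mechanism the patch relies on (stand-alone example, not
part of the patch; the include path assumes a test living under
tools/testing/selftests/vm):

#include "../kselftest_harness.h"

FIXTURE(fx) { int fd; };

/* Every TEST_F(fx, ...) runs once per FIXTURE_VARIANT_ADD, with the
 * variant's fields visible through the `variant` pointer -- this is how
 * the hmm tests run against both private and coherent devices. */
FIXTURE_VARIANT(fx) { int device_number; };
FIXTURE_VARIANT_ADD(fx, first)  { .device_number = 0, };
FIXTURE_VARIANT_ADD(fx, second) { .device_number = 2, };

FIXTURE_SETUP(fx)    { self->fd = variant->device_number; }
FIXTURE_TEARDOWN(fx) { }

TEST_F(fx, runs_once_per_variant)
{
	ASSERT_GE(self->fd, 0); /* executed for `first` and `second` */
}

TEST_HARNESS_MAIN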

[PATCH v5 10/10] tools: update test_hmm script to support SP config

2022-01-28 Thread Alex Sierra
Add two more parameters to set the spm_addr_dev0 & spm_addr_dev1
addresses. These two parameters configure the start SP
addresses for each device in the test_hmm driver.
Consequently, this configures the zone device type as coherent.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---
v2:
Add more mknods for the device coherent type. These are represented under
/dev/hmm_mirror2 and /dev/hmm_mirror3, but only if they were created
when the hmm-test driver was probed.
---
 tools/testing/selftests/vm/test_hmm.sh | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/test_hmm.sh 
b/tools/testing/selftests/vm/test_hmm.sh
index 0647b525a625..539c9371e592 100755
--- a/tools/testing/selftests/vm/test_hmm.sh
+++ b/tools/testing/selftests/vm/test_hmm.sh
@@ -40,11 +40,26 @@ check_test_requirements()
 
 load_driver()
 {
-   modprobe $DRIVER > /dev/null 2>&1
+   if [ $# -eq 0 ]; then
+   modprobe $DRIVER > /dev/null 2>&1
+   else
+   if [ $# -eq 2 ]; then
+   modprobe $DRIVER spm_addr_dev0=$1 spm_addr_dev1=$2
+   > /dev/null 2>&1
+   else
+   echo "Missing module parameters. Make sure pass"\
+   "spm_addr_dev0 and spm_addr_dev1"
+   usage
+   fi
+   fi
if [ $? == 0 ]; then
major=$(awk "\$2==\"HMM_DMIRROR\" {print \$1}" /proc/devices)
mknod /dev/hmm_dmirror0 c $major 0
mknod /dev/hmm_dmirror1 c $major 1
+   if [ $# -eq 2 ]; then
+   mknod /dev/hmm_dmirror2 c $major 2
+   mknod /dev/hmm_dmirror3 c $major 3
+   fi
fi
 }
 
@@ -58,7 +73,7 @@ run_smoke()
 {
echo "Running smoke test. Note, this test provides basic coverage."
 
-   load_driver
+   load_driver $1 $2
$(dirname "${BASH_SOURCE[0]}")/hmm-tests
unload_driver
 }
@@ -75,6 +90,9 @@ usage()
echo "# Smoke testing"
echo "./${TEST_NAME}.sh smoke"
echo
+   echo "# Smoke testing with SPM enabled"
+   echo "./${TEST_NAME}.sh smoke  "
+   echo
exit 0
 }
 
@@ -84,7 +102,7 @@ function run_test()
usage
else
if [ "$1" = "smoke" ]; then
-   run_smoke
+   run_smoke $2 $3
else
usage
fi
-- 
2.32.0



[PATCH v5 05/10] drm/amdkfd: coherent type as sys mem on migration to ram

2022-01-28 Thread Alex Sierra
On VRAM-to-RAM migration, coherent device type memory has similar access
from the CPU as system RAM. The migrate flags select the source pages;
in the coherent type case, this should be set to
MIGRATE_VMA_SELECT_DEVICE_COHERENT.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 5e8d944d359e..846ba55723fb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -659,9 +659,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.vma = vma;
migrate.start = start;
migrate.end = end;
-   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
 
+   if (adev->gmc.xgmi.connected_to_cpu)
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+   else
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
size = 2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t);
size *= npages;
buf = kvmalloc(size, GFP_KERNEL | __GFP_ZERO);
-- 
2.32.0



[PATCH v5 08/10] lib: add support for device coherent type in test_hmm

2022-01-28 Thread Alex Sierra
Device coherent type uses device memory that is coherently accessible by
the CPU. This can show up as an SP (special purpose) memory range
in the BIOS-e820 memory enumeration. If no SP memory is supported by the
system, it can be faked by setting CONFIG_EFI_FAKE_MEMMAP.

Currently, test_hmm only supports two different SP ranges of at least
256MB size. These can be specified in the kernel parameter variable
efi_fake_mem. Ex. Two SP ranges of 1GB starting at the 0x100000000 &
0x140000000 physical addresses. Ex.
efi_fake_mem=1G@0x100000000:0x40000,1G@0x140000000:0x40000

Private and coherent device mirror instances can be created in the same
probe. This is done by passing the module parameters spm_addr_dev0 &
spm_addr_dev1. In this case, it will create four instances of
device_mirror. The first two correspond to the private device type, the
last two to the coherent type. They can then be accessed from user
space through /dev/hmm_mirror<num_device>. Usually num_device 0 and 1
are for private, and 2 and 3 for coherent types. If no module
parameters are passed, only two instances of private type device_mirror
will be created.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---
v4:
Return number of coherent device pages successfully migrated to system.
This is returned at cmd->cpages.
---
 lib/test_hmm.c  | 260 +---
 lib/test_hmm_uapi.h |  15 ++-
 2 files changed, 205 insertions(+), 70 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index c7f8d00e7b95..dedce7908ac6 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -29,11 +29,22 @@
 
 #include "test_hmm_uapi.h"
 
-#define DMIRROR_NDEVICES   2
+#define DMIRROR_NDEVICES   4
 #define DMIRROR_RANGE_FAULT_TIMEOUT1000
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+/*
+ * For device_private pages, dpage is just a dummy struct page
+ * representing a piece of device memory. dmirror_devmem_alloc_page
+ * allocates a real system memory page as backing storage to fake a
+ * real device. zone_device_data points to that backing page. But
+ * for device_coherent memory, the struct page represents real
+ * physical CPU-accessible memory that we can use directly.
+ */
+#define BACKING_PAGE(page) (is_device_private_page((page)) ? \
+  (page)->zone_device_data : (page))
+
 static unsigned long spm_addr_dev0;
 module_param(spm_addr_dev0, long, 0644);
 MODULE_PARM_DESC(spm_addr_dev0,
@@ -122,6 +133,21 @@ static int dmirror_bounce_init(struct dmirror_bounce 
*bounce,
return 0;
 }
 
+static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
+{
+   return (mdevice->zone_device_type ==
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ? true : false;
+}
+
+static enum migrate_vma_direction
+   dmirror_select_device(struct dmirror *dmirror)
+{
+   return (dmirror->mdevice->zone_device_type ==
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ?
+   MIGRATE_VMA_SELECT_DEVICE_PRIVATE :
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+}
+
 static void dmirror_bounce_fini(struct dmirror_bounce *bounce)
 {
vfree(bounce->ptr);
@@ -572,16 +598,19 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
 {
struct page *dpage = NULL;
-   struct page *rpage;
+   struct page *rpage = NULL;
 
/*
-* This is a fake device so we alloc real system memory to store
-* our device memory.
+* For ZONE_DEVICE private type, this is a fake device so we alloc real
+* system memory to store our device memory.
+* For ZONE_DEVICE coherent type we use the actual dpage to store the 
data
+* and ignore rpage.
 */
-   rpage = alloc_page(GFP_HIGHUSER);
-   if (!rpage)
-   return NULL;
-
+   if (dmirror_is_private_zone(mdevice)) {
+   rpage = alloc_page(GFP_HIGHUSER);
+   if (!rpage)
+   return NULL;
+   }
spin_lock(>lock);
 
if (mdevice->free_pages) {
@@ -601,7 +630,8 @@ static struct page *dmirror_devmem_alloc_page(struct 
dmirror_device *mdevice)
return dpage;
 
 error:
-   __free_page(rpage);
+   if (rpage)
+   __free_page(rpage);
return NULL;
 }
 
@@ -627,12 +657,16 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
 * unallocated pte_none() or read-only zero page.
 */
spage = migrate_pfn_to_page(*src);
+   if (WARN(spage && is_zone_device_page(spage),
+"page already in device spage pfn: 0x%lx\n",
+page_to_pfn(spage)))
+   continue;
 
dpage = dmirror_devmem_alloc_page(mdevice);
if (!dpage)

[PATCH v5 04/10] drm/amdkfd: add SPM support for SVM

2022-01-28 Thread Alex Sierra
When the CPU is connected through XGMI, it has coherent
access to the VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which can
be DEVICE_PRIVATE or DEVICE_COHERENT, to create the device
page map region.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index ed5385137f48..5e8d944d359e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -933,7 +933,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
 {
struct kfd_dev *kfddev = adev->kfd.dev;
struct dev_pagemap *pgmap;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long size;
void *r;
 
@@ -948,28 +948,34 @@ int svm_migrate_init(struct amdgpu_device *adev)
 * should remove reserved size
 */
size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-   res = devm_request_free_mem_region(adev->dev, _resource, size);
-   if (IS_ERR(res))
-   return -ENOMEM;
+   if (adev->gmc.xgmi.connected_to_cpu) {
+   pgmap->range.start = adev->gmc.aper_base;
+   pgmap->range.end = adev->gmc.aper_base + adev->gmc.aper_size - 
1;
+   pgmap->type = MEMORY_DEVICE_COHERENT;
+   } else {
+   res = devm_request_free_mem_region(adev->dev, _resource, 
size);
+   if (IS_ERR(res))
+   return -ENOMEM;
+   pgmap->range.start = res->start;
+   pgmap->range.end = res->end;
+   pgmap->type = MEMORY_DEVICE_PRIVATE;
+   }
 
-   pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_range = 1;
-   pgmap->range.start = res->start;
-   pgmap->range.end = res->end;
pgmap->ops = _migrate_pgmap_ops;
pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-   pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
-
+   pgmap->flags = 0;
/* Device manager releases device-specific resources, memory region and
 * pgmap when driver disconnects from device.
 */
r = devm_memremap_pages(adev->dev, pgmap);
if (IS_ERR(r)) {
pr_err("failed to register HMM device memory\n");
-
/* Disable SVM support capability */
pgmap->type = 0;
-   devm_release_mem_region(adev->dev, res->start, 
resource_size(res));
+   if (pgmap->type == MEMORY_DEVICE_PRIVATE)
+   devm_release_mem_region(adev->dev, res->start,
+   res->end - res->start + 1);
return PTR_ERR(r);
}
 
@@ -982,3 +988,4 @@ int svm_migrate_init(struct amdgpu_device *adev)
 
return 0;
 }
+
-- 
2.32.0



[PATCH v5 06/10] lib: test_hmm add ioctl to get zone device type

2022-01-28 Thread Alex Sierra
A new ioctl command is added to query the zone device type. This will be
used once test_hmm adds the zone device coherent type.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---
 lib/test_hmm.c      | 23 +++++++++++++++++++++--
 lib/test_hmm_uapi.h |  8 ++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 767538089a62..556bd80ce22e 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -84,6 +84,7 @@ struct dmirror_chunk {
 struct dmirror_device {
struct cdev cdevice;
struct hmm_devmem   *devmem;
+   unsigned intzone_device_type;
 
unsigned intdevmem_capacity;
unsigned intdevmem_count;
@@ -1024,6 +1025,15 @@ static int dmirror_snapshot(struct dmirror *dmirror,
return ret;
 }
 
+static int dmirror_get_device_type(struct dmirror *dmirror,
+   struct hmm_dmirror_cmd *cmd)
+{
+   mutex_lock(>mutex);
+   cmd->zone_device_type = dmirror->mdevice->zone_device_type;
+   mutex_unlock(>mutex);
+
+   return 0;
+}
 static long dmirror_fops_unlocked_ioctl(struct file *filp,
unsigned int command,
unsigned long arg)
@@ -1074,6 +1084,9 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
ret = dmirror_snapshot(dmirror, );
break;
 
+   case HMM_DMIRROR_GET_MEM_DEV_TYPE:
+   ret = dmirror_get_device_type(dmirror, );
+   break;
default:
return -EINVAL;
}
@@ -1258,14 +1271,20 @@ static void dmirror_device_remove(struct dmirror_device 
*mdevice)
 static int __init hmm_dmirror_init(void)
 {
int ret;
-   int id;
+   int id = 0;
+   int ndevices = 0;
 
ret = alloc_chrdev_region(_dev, 0, DMIRROR_NDEVICES,
  "HMM_DMIRROR");
if (ret)
goto err_unreg;
 
-   for (id = 0; id < DMIRROR_NDEVICES; id++) {
+   memset(dmirror_devices, 0, DMIRROR_NDEVICES * 
sizeof(dmirror_devices[0]));
+   dmirror_devices[ndevices++].zone_device_type =
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   dmirror_devices[ndevices++].zone_device_type =
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   for (id = 0; id < ndevices; id++) {
ret = dmirror_device_init(dmirror_devices + id, id);
if (ret)
goto err_chrdev;
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index f14dea5dcd06..17f842f1aa02 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -19,6 +19,7 @@
  * @npages: (in) number of pages to read/write
  * @cpages: (out) number of pages copied
  * @faults: (out) number of device page faults seen
+ * @zone_device_type: (out) zone device memory type
  */
 struct hmm_dmirror_cmd {
__u64   addr;
@@ -26,6 +27,7 @@ struct hmm_dmirror_cmd {
__u64   npages;
__u64   cpages;
__u64   faults;
+   __u64   zone_device_type;
 };
 
 /* Expose the address space of the calling process through hmm device file */
@@ -35,6 +37,7 @@ struct hmm_dmirror_cmd {
 #define HMM_DMIRROR_SNAPSHOT   _IOWR('H', 0x03, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_EXCLUSIVE  _IOWR('H', 0x04, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_CHECK_EXCLUSIVE_IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_GET_MEM_DEV_TYPE   _IOWR('H', 0x06, struct hmm_dmirror_cmd)
 
 /*
  * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
@@ -62,4 +65,9 @@ enum {
HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
 };
 
+enum {
+   /* 0 is reserved to catch uninitialized type fields */
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
+};
+
 #endif /* _LIB_TEST_HMM_UAPI_H */
-- 
2.32.0
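A sketch of how user space would exercise the new command (error handling
trimmed; the device path matches the nodes created by test_hmm.sh, and
test_hmm_uapi.h is assumed to be copied alongside):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include "test_hmm_uapi.h"

int main(void)
{
	struct hmm_dmirror_cmd cmd;
	int fd = open("/dev/hmm_dmirror0", O_RDWR);

	if (fd < 0)
		return 1;
	memset(&cmd, 0, sizeof(cmd));
	cmd.npages = 1;	/* the ioctl entry validates addr/npages first */
	if (ioctl(fd, HMM_DMIRROR_GET_MEM_DEV_TYPE, &cmd) == 0)
		printf("zone device type: %llu\n",
		       (unsigned long long)cmd.zone_device_type);
	close(fd);
	return 0;
}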



[PATCH v5 07/10] lib: test_hmm add module param for zone device type

2022-01-28 Thread Alex Sierra
In order to configure device coherent in test_hmm, two module parameters
should be passed, which correspond to the SP start address of each of the
two devices: spm_addr_dev0 & spm_addr_dev1. If no parameters are passed,
the private device type is configured.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---
 lib/test_hmm.c  | 73 -
 lib/test_hmm_uapi.h |  1 +
 2 files changed, 53 insertions(+), 21 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 556bd80ce22e..c7f8d00e7b95 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -34,6 +34,16 @@
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+static unsigned long spm_addr_dev0;
+module_param(spm_addr_dev0, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev0,
+   "Specify start address for SPM (special purpose memory) used 
for device 0. By setting this Coherent device type will be used. Make sure 
spm_addr_dev1 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
+
+static unsigned long spm_addr_dev1;
+module_param(spm_addr_dev1, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev1,
+   "Specify start address for SPM (special purpose memory) used 
for device 1. By setting this Coherent device type will be used. Make sure 
spm_addr_dev0 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
+
 static const struct dev_pagemap_ops dmirror_devmem_ops;
 static const struct mmu_interval_notifier_ops dmirror_min_ops;
 static dev_t dmirror_dev;
@@ -452,28 +462,44 @@ static int dmirror_write(struct dmirror *dmirror, struct 
hmm_dmirror_cmd *cmd)
return ret;
 }
 
-static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
+static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
   struct page **ppage)
 {
struct dmirror_chunk *devmem;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long pfn;
unsigned long pfn_first;
unsigned long pfn_last;
void *ptr;
+   int ret = -ENOMEM;
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
-   return false;
+   return ret;
 
-   res = request_free_mem_region(_resource, DEVMEM_CHUNK_SIZE,
- "hmm_dmirror");
-   if (IS_ERR(res))
+   switch (mdevice->zone_device_type) {
+   case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE:
+   res = request_free_mem_region(_resource, 
DEVMEM_CHUNK_SIZE,
+ "hmm_dmirror");
+   if (IS_ERR_OR_NULL(res))
+   goto err_devmem;
+   devmem->pagemap.range.start = res->start;
+   devmem->pagemap.range.end = res->end;
+   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+   break;
+   case HMM_DMIRROR_MEMORY_DEVICE_COHERENT:
+   devmem->pagemap.range.start = (MINOR(mdevice->cdevice.dev) - 2) 
?
+   spm_addr_dev0 :
+   spm_addr_dev1;
+   devmem->pagemap.range.end = devmem->pagemap.range.start +
+   DEVMEM_CHUNK_SIZE - 1;
+   devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
+   break;
+   default:
+   ret = -EINVAL;
goto err_devmem;
+   }
 
-   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
-   devmem->pagemap.range.start = res->start;
-   devmem->pagemap.range.end = res->end;
devmem->pagemap.nr_range = 1;
devmem->pagemap.ops = _devmem_ops;
devmem->pagemap.owner = mdevice;
@@ -494,10 +520,14 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
mdevice->devmem_capacity = new_capacity;
mdevice->devmem_chunks = new_chunks;
}
-
ptr = memremap_pages(>pagemap, numa_node_id());
-   if (IS_ERR(ptr))
+   if (IS_ERR_OR_NULL(ptr)) {
+   if (ptr)
+   ret = PTR_ERR(ptr);
+   else
+   ret = -EFAULT;
goto err_release;
+   }
 
devmem->mdevice = mdevice;
pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
@@ -526,15 +556,17 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
}
spin_unlock(>lock);
 
-   return true;
+   return 0;
 
 err_release:
mutex_unlock(>devmem_lock);
-   release_mem_region(devmem->pagemap.range.start, 
range_len(>pagemap.range));
+   if (res && devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
+   release_mem_region(devmem->pagemap.range.start,
+  range_len(>pagemap.range));
 err_devmem:
kfree(devmem);
 
-   return false;
+   return ret;
 }
 
 static 

[PATCH v5 01/10] mm: add zone device coherent type memory support

2022-01-28 Thread Alex Sierra
Device memory that is cache coherent from device and CPU point of view.
This is used on platforms that have an advanced system bus (like CAPI
or CXL). Any page of a process can be migrated to such memory. However,
no one should be allowed to pin such memory so that it can always be
evicted.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
---
v4:
- use the same system entry path for coherent device pages at
migrate_vma_insert_page.

- Add coherent device type support for try_to_migrate /
try_to_migrate_one.
---
 include/linux/memremap.h |  8 +++
 include/linux/mm.h   | 16 ++
 mm/memcontrol.c  |  6 +++---
 mm/memory-failure.c  |  8 +--
 mm/memremap.c| 14 -
 mm/migrate.c | 45 
 mm/rmap.c|  5 +++--
 7 files changed, 71 insertions(+), 31 deletions(-)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 1fafcc38acba..727b8c789193 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -39,6 +39,13 @@ struct vmem_altmap {
  * A more complete discussion of unaddressable memory may be found in
  * include/linux/hmm.h and Documentation/vm/hmm.rst.
  *
+ * MEMORY_DEVICE_COHERENT:
+ * Device memory that is cache coherent from device and CPU point of view. This
+ * is used on platforms that have an advanced system bus (like CAPI or CXL). A
+ * driver can hotplug the device memory using ZONE_DEVICE and with that memory
+ * type. Any page of a process can be migrated to such memory. However no one
+ * should be allowed to pin such memory so that it can always be evicted.
+ *
  * MEMORY_DEVICE_FS_DAX:
  * Host memory that has similar access semantics as System RAM i.e. DMA
  * coherent and supports page pinning. In support of coordinating page
@@ -59,6 +66,7 @@ struct vmem_altmap {
 enum memory_type {
/* 0 is reserved to catch uninitialized type fields */
MEMORY_DEVICE_PRIVATE = 1,
+   MEMORY_DEVICE_COHERENT,
MEMORY_DEVICE_FS_DAX,
MEMORY_DEVICE_GENERIC,
MEMORY_DEVICE_PCI_P2PDMA,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e1a84b1e6787..0c61bf40edef 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1106,6 +1106,7 @@ static inline bool page_is_devmap_managed(struct page 
*page)
return false;
switch (page->pgmap->type) {
case MEMORY_DEVICE_PRIVATE:
+   case MEMORY_DEVICE_COHERENT:
case MEMORY_DEVICE_FS_DAX:
return true;
default:
@@ -1135,6 +1136,21 @@ static inline bool is_device_private_page(const struct 
page *page)
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool is_device_coherent_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   page->pgmap->type == MEMORY_DEVICE_COHERENT;
+}
+
+static inline bool is_dev_private_or_coherent_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   (page->pgmap->type == MEMORY_DEVICE_PRIVATE ||
+   page->pgmap->type == MEMORY_DEVICE_COHERENT);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 09d342c7cbd0..0882b5b2a857 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5691,8 +5691,8 @@ static int mem_cgroup_move_account(struct page *page,
  *   2(MC_TARGET_SWAP): if the swap entry corresponding to this pte is a
  * target for charge migration. if @target is not NULL, the entry is stored
  * in target->ent.
- *   3(MC_TARGET_DEVICE): like MC_TARGET_PAGE  but page is 
MEMORY_DEVICE_PRIVATE
- * (so ZONE_DEVICE page and thus not on the lru).
+ *   3(MC_TARGET_DEVICE): like MC_TARGET_PAGE  but page is device memory and
+ *   thus not on the lru.
  * For now we such page is charge like a regular page would be as for all
  * intent and purposes it is just special memory taking the place of a
  * regular page.
@@ -5726,7 +5726,7 @@ static enum mc_target_type get_mctgt_type(struct 
vm_area_struct *vma,
 */
if (page_memcg(page) == mc.from) {
ret = MC_TARGET_PAGE;
-   if (is_device_private_page(page))
+   if (is_dev_private_or_coherent_page(page))
ret = MC_TARGET_DEVICE;
if (target)
target->page = page;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 14ae5c18e776..e83740f7f05e 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1611,12 +1611,16 @@ static int memory_failure_dev_pagemap(unsigned long 
pfn, int flags,
goto unlock;
}
 
-   if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
+ 

[PATCH v5 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-28 Thread Alex Sierra
Avoid long-term pinning for Coherent device type pages. This could
interfere with their own device memory manager. For now, we just
return an error for PIN_LONGTERM Coherent device type pages. Eventually,
these types of pages will get migrated to system memory, once device
page migration support is added.
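
For illustration, the new failure mode looks roughly like this from a
caller's perspective (a minimal sketch; addr is assumed to map a
MEMORY_DEVICE_COHERENT page, and the flags are only an example):

	struct page *pages[1];
	long ret;

	ret = pin_user_pages(addr, 1, FOLL_WRITE | FOLL_LONGTERM, pages, NULL);
	/* ret is -EFAULT for a coherent device page instead of a pin */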

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---
 mm/gup.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index f0af462ac1e2..f596b932d7d7 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1864,6 +1864,12 @@ static long check_and_migrate_movable_pages(unsigned 
long nr_pages,
 * If we get a movable page, since we are going to be pinning
 * these entries, try to move them out if possible.
 */
+   if (is_dev_private_or_coherent_page(head)) {
+   WARN_ON_ONCE(is_device_private_page(head));
+   ret = -EFAULT;
+   goto unpin_pages;
+   }
+
if (!is_pinnable_page(head)) {
if (PageHuge(head)) {
if (!isolate_huge_page(head, 
&movable_page_list))
@@ -1894,6 +1900,7 @@ static long check_and_migrate_movable_pages(unsigned long 
nr_pages,
	if (list_empty(&movable_page_list) && !isolation_error_count)
return nr_pages;
 
+unpin_pages:
if (gup_flags & FOLL_PIN) {
unpin_user_pages(pages, nr_pages);
} else {
-- 
2.32.0



[PATCH v5 02/10] mm: add device coherent vma selection for memory migration

2022-01-28 Thread Alex Sierra
This case is used to migrate pages from device memory, back to system
memory. Device coherent type memory is cache coherent from device and CPU
point of view.
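
For reference, a driver would opt in to collecting coherent device pages
roughly as in the sketch below (vma, range, pfn arrays and owner are
placeholders, not part of this patch):

	struct migrate_vma args = {
		.vma		= vma,
		.start		= start,
		.end		= end,
		.src		= src_pfns,
		.dst		= dst_pfns,
		.pgmap_owner	= my_owner,	/* hypothetical owner cookie */
		.flags		= MIGRATE_VMA_SELECT_DEVICE_COHERENT,
	};

	ret = migrate_vma_setup(&args);
	if (!ret) {
		/* allocate system pages and copy the data, then: */
		migrate_vma_pages(&args);
		migrate_vma_finalize(&args);
	}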

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
---
v2:
condition added for migrations from device coherent pages.
---
 include/linux/migrate.h |  1 +
 mm/migrate.c| 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index db96e10eb8da..66a34eae8cb6 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -130,6 +130,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 enum migrate_vma_direction {
MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
 };
 
 struct migrate_vma {
diff --git a/mm/migrate.c b/mm/migrate.c
index cd137aedcfe5..d3cc3589e1e8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2264,7 +2264,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (is_writable_device_private_entry(entry))
mpfn |= MIGRATE_PFN_WRITE;
} else {
-   if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
+   if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM) &&
+   !(migrate->flags & 
MIGRATE_VMA_SELECT_DEVICE_COHERENT))
goto next;
pfn = pte_pfn(pte);
if (is_zero_pfn(pfn)) {
@@ -2273,6 +2274,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
goto next;
}
page = vm_normal_page(migrate->vma, addr, pte);
+   if (page && !is_zone_device_page(page) &&
+   !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
+   goto next;
+   if (page && is_device_coherent_page(page) &&
+   (!(migrate->flags & 
MIGRATE_VMA_SELECT_DEVICE_COHERENT) ||
+page->pgmap->owner != migrate->pgmap_owner))
+   goto next;
mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
}
-- 
2.32.0



[PATCH v5 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-28 Thread Alex Sierra
This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like
MEMORY_DEVICE_PRIVATE.

Christoph, the suggestion to incorporate Ralph Campbell’s refcount
cleanup patch into our hardware page migration patchset originally came
from you, but it proved impractical to do things in that order because
the refcount cleanup introduced a bug with wide ranging structural
implications. Instead, we amended Ralph’s patch so that it could be
applied after merging the migration work. As we saw from the recent
discussion, merging the refcount work is going to take some time and
cooperation between multiple development groups, while the migration
work is ready now and is needed now. So we propose to merge this
patchset first and continue to work with Ralph and others to merge the
refcount cleanup separately, when it is ready.

This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.

System stability and performance are not affected according to our
ongoing testing, including xfstests.

How it works: The system BIOS advertises the GPU device memory
(aka VRAM) as SPM (special purpose memory) in the UEFI system address
map.

The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
this hardware page migration capability is the Frontier supercomputer
project. This functionality is not AMD-specific. We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.
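
For context, the registration boils down to something like the sketch
below (resource, ops and device names are placeholders; the actual amdgpu
code differs in detail):

	struct dev_pagemap *pgmap = &mydev->pgmap;	/* hypothetical */

	pgmap->type = MEMORY_DEVICE_COHERENT;
	pgmap->range.start = res->start;	/* SPM range from the UEFI map */
	pgmap->range.end = res->end;
	pgmap->nr_range = 1;
	pgmap->ops = &my_pgmap_ops;	/* provides the page_free callback */
	pgmap->owner = mydev;

	vram = devm_memremap_pages(dev, pgmap);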

Our test nodes in the lab are similar to the Frontier configuration,
with 0.5 TB of system memory plus 256 GB of device memory split across
4 GPUs, all in a single coherent address space. Page migration is
expected to improve application efficiency significantly. We will
report empirical results as they become available.

We extended hmm_test to cover migration of MEMORY_DEVICE_COHERENT. This
patch set builds on HMM and our SVM memory manager already merged in
5.15.

v2:
- test_hmm is now able to create private and coherent device mirror
instances in the same driver probe. This adds more usability to the hmm
test by not having to remove the kernel module for each device type
test (private/coherent type). This is done by passing the module
parameters spm_addr_dev0 & spm_addr_dev1. In this case, it will create
four instances of device_mirror. The first two correspond to private
device type, the last two to coherent type. Then, they can be easily
accessed from user space through /dev/hmm_mirror. Usually
num_device 0 and 1 are for private, and 2 and 3 for coherent types.

- Coherent device type pages at gup are now migrated back to system
memory if they have been long term pinned (FOLL_LONGTERM). The reason
is that these pages could eventually interfere with their own device
memory manager. A new hmm_gup_test has been added to the hmm-test to test
this functionality. It makes use of the gup_test module to long term pin
user pages that have been migrated to device memory first.

- Other patch corrections made by Felix, Alistair and Christoph.

v3:
- Based on the last v2 feedback we got from Alistair, we've decided to
remove the migration logic for FOLL_LONGTERM coherent device type pages at
gup for now. Ideally, this should be done through the kernel mm,
instead of calling the device driver to do it. Currently, there's no
support for migrating device pages based on pfn, mainly because
migrate_pages() relies on pages being LRU pages. Alistair mentioned he
has started to work on adding this migrate device pages logic. For now,
we fail the get_user_pages call with FOLL_LONGTERM for DEVICE_COHERENT
pages.

- Also, hmm_gup_test has been removed from hmm-test. We plan to include
it again after this migration work is ready.

- Addressed Liam Howlett's feedback changes.

v4:
- Addressed Alistair Popple's last v3 feedback.

- Use the same system entry path for coherent device pages at
migrate_vma_insert_page.

- Add coherent device type support for try_to_migrate /
try_to_migrate_one.

- Include number of coherent device pages successfully migrated back to
system at test_hmm. Made the proper changes to hmm-test to read/check
this number.

v5:
- Rebase on 5.17-rc1.
- Addressed Alistair Popple's last v4 feedback.

Alex Sierra (10):
  mm: add zone device coherent type memory support
  mm: add device coherent vma selection for memory migration
  mm/gup: fail get_user_pages for LONGTERM dev coherent type
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: coherent type as sys mem on migration to ram
  lib: test_hmm add ioctl to get zone device type
  lib: test_hmm add module param for zone device type
  lib: add support for device coherent type in test_hmm
  tools: update hmm-test to support device coherent type
  tools: update test_hmm script to support SP config

 

Re: [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu

2022-01-28 Thread Andrey Grodzovsky
Just a gentle ping in case people have more comments on this patch set,
especially the last 5 patches,

as the first 7 are exactly the same as V2 and we already went over them mostly.

Andrey

On 2022-01-25 17:37, Andrey Grodzovsky wrote:

This patchset is based on earlier work by Boris[1] that allowed to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. On top of that I also serialized
any GPU reset we trigger from within amdgpu code to also go through the same
ordered wq and in this way somewhat simplify our GPU reset code, so we don't need
to protect from concurrency by multiple GPU reset triggers such as TDR on one
hand and sysfs trigger or RAS trigger on the other hand.

As advised by Christian and Daniel I defined a reset_domain struct such that
all the entities that go through reset together will be serialized one against
another.

A TDR triggered by multiple entities within the same domain for the same
reason will not be repeated, as the first such reset will cancel all the
pending resets. This is relevant only to TDR timers and not to triggered
resets coming from RAS or SYSFS; those will still happen after the
in-flight resets finish.

v2:
Add handling on SRIOV configuration, the reset notify coming from host
and driver already trigger a work queue to handle the reset so drop this
intermediate wq and send directly to timeout wq. (Shaoyun)

v3:
Lijo suggested putting 'adev->in_gpu_reset' in the amdgpu_reset_domain struct.
I followed his advice and also moved adev->reset_sem into the same place. This
in turn required some follow-up refactoring of the original patches,
where I decoupled the amdgpu_reset_domain life cycle from the XGMI hive, because
the hive is destroyed and reconstructed for the case of resetting the devices
in the XGMI hive during probe for SRIOV (see [2]), while we need the reset sem
and gpu_reset flag to always be present. This was attained by adding a refcount
to amdgpu_reset_domain so each device can safely point to it as long as it
needs it.
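
To make the refcounting concrete, the shape of such a structure and its
get helper could look like the sketch below (field names are inferred from
this description, not copied from the patches):

	struct amdgpu_reset_domain {
		struct kref refcount;
		struct workqueue_struct *wq;	/* ordered reset workqueue */
		atomic_t in_gpu_reset;
		struct rw_semaphore sem;
	};

	static inline struct amdgpu_reset_domain *
	amdgpu_reset_domain_get(struct amdgpu_reset_domain *domain)
	{
		kref_get(&domain->refcount);
		return domain;
	}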


[1] 
https://patchwork.kernel.org/project/dri-devel/patch/20210629073510.2764391-3-boris.brezil...@collabora.com/
[2] https://www.spinics.net/lists/amd-gfx/msg58836.html

P.S. Going through drm-misc-next and not amd-staging-drm-next, as Boris'
work hasn't landed there yet.

P.P.S Patches 8-12 are the refactor on top of the original V2 patchset.

P.P.P.S. I wasn't yet able to test the reworked code on an XGMI SRIOV system
because drm-misc-next fails to load there.
Would appreciate it if jingwech could try it on his system like he tested V2.

Andrey Grodzovsky (12):
   drm/amdgpu: Introduce reset domain
   drm/amdgpu: Move scheduler init to after XGMI is ready
   drm/amdgpu: Fix crash on modprobe
   drm/amdgpu: Serialize non TDR gpu recovery with TDRs
   drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.
   drm/amdgpu: Drop hive->in_reset
   drm/amdgpu: Drop concurrent GPU reset protection for device
   drm/amdgpu: Rework reset domain to be refcounted.
   drm/amdgpu: Move reset sem into reset_domain
   drm/amdgpu: Move in_gpu_reset into reset_domain
   drm/amdgpu: Rework amdgpu_device_lock_adev
   Revert 'drm/amdgpu: annotate a false positive recursive locking'

  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  15 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 275 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |  43 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   2 +-
  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c|  18 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c |  39 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h |  12 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c  |  24 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h  |   3 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|   6 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  14 +-
  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c |  19 +-
  drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c |  19 +-
  drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c |  11 +-
  16 files changed, 313 insertions(+), 199 deletions(-)



Re: [PATCH][next] drm/amd/display: fix spelling mistake: synatpics -> synaptics

2022-01-28 Thread Alex Deucher
Applied.  Thanks!

Alex

On Fri, Jan 28, 2022 at 12:59 PM Harry Wentland  wrote:
>
>
>
> On 2022-01-28 12:35, Colin Ian King wrote:
> > There are quite a few spelling mistakes in various function names
> > and error messages. Fix these.
> >
> > Signed-off-by: Colin Ian King 
>
> Reviewed-by: Harry Wentland 
>
> Harry
>
> > ---
> >  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 32 +--
> >  1 file changed, 16 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> > index 75b5299b3576..db4ab01267e4 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> > @@ -539,7 +539,7 @@ bool dm_helpers_submit_i2c(
> >  }
> >
> >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > -static bool execute_synatpics_rc_command(struct drm_dp_aux *aux,
> > +static bool execute_synaptics_rc_command(struct drm_dp_aux *aux,
> >   bool is_write_cmd,
> >   unsigned char cmd,
> >   unsigned int length,
> > @@ -578,7 +578,7 @@ static bool execute_synatpics_rc_command(struct 
> > drm_dp_aux *aux,
> >   ret = drm_dp_dpcd_write(aux, SYNAPTICS_RC_COMMAND, &rc_cmd, 
> > sizeof(rc_cmd));
> >
> >   if (ret < 0) {
> > - DRM_ERROR(" execute_synatpics_rc_command - write cmd ..., 
> > err = %d\n", ret);
> > + DRM_ERROR(" execute_synaptics_rc_command - write cmd ..., 
> > err = %d\n", ret);
> >   return false;
> >   }
> >
> > @@ -600,7 +600,7 @@ static bool execute_synatpics_rc_command(struct 
> > drm_dp_aux *aux,
> >   drm_dp_dpcd_read(aux, SYNAPTICS_RC_DATA, data, length);
> >   }
> >
> > - DC_LOG_DC(" execute_synatpics_rc_command - success = %d\n", 
> > success);
> > + DC_LOG_DC(" execute_synaptics_rc_command - success = %d\n", 
> > success);
> >
> >   return success;
> >  }
> > @@ -618,54 +618,54 @@ static void apply_synaptics_fifo_reset_wa(struct 
> > drm_dp_aux *aux)
> >   data[3] = 'U';
> >   data[4] = 'S';
> >
> > - if (!execute_synatpics_rc_command(aux, true, 0x01, 5, 0, data))
> > + if (!execute_synaptics_rc_command(aux, true, 0x01, 5, 0, data))
> >   return;
> >
> >   // Step 3 and 4
> > - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220998, 
> > data))
> > + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220998, 
> > data))
> >   return;
> >
> >   data[0] &= (~(1 << 1)); // set bit 1 to 0
> > - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220998, data))
> > + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220998, data))
> >   return;
> >
> > - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220D98, 
> > data))
> > + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220D98, 
> > data))
> >   return;
> >
> >   data[0] &= (~(1 << 1)); // set bit 1 to 0
> > - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220D98, data))
> > + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220D98, data))
> >   return;
> >
> > - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x221198, 
> > data))
> > + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x221198, 
> > data))
> >   return;
> >
> >   data[0] &= (~(1 << 1)); // set bit 1 to 0
> > - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x221198, data))
> > + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x221198, data))
> >   return;
> >
> >   // Step 3 and 5
> > - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220998, 
> > data))
> > + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220998, 
> > data))
> >   return;
> >
> >   data[0] |= (1 << 1); // set bit 1 to 1
> > - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220998, data))
> > + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220998, data))
> >   return;
> >
> > - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220D98, 
> > data))
> > + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220D98, 
> > data))
> >   return;
> >
> >   data[0] |= (1 << 1); // set bit 1 to 1
> >   return;
> >
> > - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x221198, 
> > data))
> > + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x221198, 
> > data))
> >   return;
> >
> >   data[0] |= (1 << 1); // set bit 1 to 1
> > - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x221198, data))
> > + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x221198, data))
> >   return;
> >
> >   // Step 6
> > - if (!execute_synatpics_rc_command(aux, true, 0x02, 0, 0, NULL))

Re: [PATCH] drm/amd/pm: remove duplicate include in 'arcturus_ppt.c'

2022-01-28 Thread Alex Deucher
Applied.  thanks!

Alex

On Fri, Jan 28, 2022 at 2:19 AM  wrote:
>
> From: Changcheng Deng 
>
> 'amdgpu_dpm.h' included in 'arcturus_ppt.c' is duplicated.
>
> Reported-by: Zeal Robot 
> Signed-off-by: Changcheng Deng 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> index ee296441c5bc..709c32063ef7 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> @@ -46,7 +46,6 @@
>  #include 
>  #include "amdgpu_ras.h"
>  #include "smu_cmn.h"
> -#include "amdgpu_dpm.h"
>
>  /*
>   * DO NOT use these for err/warn/info/debug messages.
> --
> 2.25.1
>


[PATCH][next] drm/amd/display: fix spelling mistake: synatpics -> synaptics

2022-01-28 Thread Colin Ian King
There are quite a few spelling mistakes in various function names
and error messages. Fix these.

Signed-off-by: Colin Ian King 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 32 +--
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 75b5299b3576..db4ab01267e4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -539,7 +539,7 @@ bool dm_helpers_submit_i2c(
 }
 
 #if defined(CONFIG_DRM_AMD_DC_DCN)
-static bool execute_synatpics_rc_command(struct drm_dp_aux *aux,
+static bool execute_synaptics_rc_command(struct drm_dp_aux *aux,
bool is_write_cmd,
unsigned char cmd,
unsigned int length,
@@ -578,7 +578,7 @@ static bool execute_synatpics_rc_command(struct drm_dp_aux 
*aux,
	ret = drm_dp_dpcd_write(aux, SYNAPTICS_RC_COMMAND, &rc_cmd, 
sizeof(rc_cmd));
 
if (ret < 0) {
-   DRM_ERROR(" execute_synatpics_rc_command - write cmd ..., 
err = %d\n", ret);
+   DRM_ERROR(" execute_synaptics_rc_command - write cmd ..., 
err = %d\n", ret);
return false;
}
 
@@ -600,7 +600,7 @@ static bool execute_synatpics_rc_command(struct drm_dp_aux 
*aux,
drm_dp_dpcd_read(aux, SYNAPTICS_RC_DATA, data, length);
}
 
-   DC_LOG_DC(" execute_synatpics_rc_command - success = %d\n", 
success);
+   DC_LOG_DC(" execute_synaptics_rc_command - success = %d\n", 
success);
 
return success;
 }
@@ -618,54 +618,54 @@ static void apply_synaptics_fifo_reset_wa(struct 
drm_dp_aux *aux)
data[3] = 'U';
data[4] = 'S';
 
-   if (!execute_synatpics_rc_command(aux, true, 0x01, 5, 0, data))
+   if (!execute_synaptics_rc_command(aux, true, 0x01, 5, 0, data))
return;
 
// Step 3 and 4
-   if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220998, data))
+   if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220998, data))
return;
 
data[0] &= (~(1 << 1)); // set bit 1 to 0
-   if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220998, data))
+   if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220998, data))
return;
 
-   if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
+   if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
return;
 
data[0] &= (~(1 << 1)); // set bit 1 to 0
-   if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220D98, data))
+   if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220D98, data))
return;
 
-   if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x221198, data))
+   if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x221198, data))
return;
 
data[0] &= (~(1 << 1)); // set bit 1 to 0
-   if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x221198, data))
+   if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x221198, data))
return;
 
// Step 3 and 5
-   if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220998, data))
+   if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220998, data))
return;
 
data[0] |= (1 << 1); // set bit 1 to 1
-   if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220998, data))
+   if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220998, data))
return;
 
-   if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
+   if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
return;
 
data[0] |= (1 << 1); // set bit 1 to 1
return;
 
-   if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x221198, data))
+   if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x221198, data))
return;
 
data[0] |= (1 << 1); // set bit 1 to 1
-   if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x221198, data))
+   if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x221198, data))
return;
 
// Step 6
-   if (!execute_synatpics_rc_command(aux, true, 0x02, 0, 0, NULL))
+   if (!execute_synaptics_rc_command(aux, true, 0x02, 0, 0, NULL))
return;
 
DC_LOG_DC("Done apply_synaptics_fifo_reset_wa\n");
-- 
2.34.1



Re: [PATCH] drm/amdgpu: remove duplicate include in 'amdgpu_device.c'

2022-01-28 Thread Alex Deucher
Applied.  Thanks!

Alex

On Fri, Jan 28, 2022 at 2:05 AM  wrote:
>
> From: Changcheng Deng 
>
> 'linux/pci.h' included in 'amdgpu_device.c' is duplicated.
>
> Reported-by: Zeal Robot 
> Signed-off-by: Changcheng Deng 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index dd5979098e63..289c5c626324 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -56,7 +56,6 @@
>  #include "soc15.h"
>  #include "nv.h"
>  #include "bif/bif_4_1_d.h"
> -#include 
>  #include 
>  #include "amdgpu_vf_error.h"
>
> --
> 2.25.1
>


Re: [PATCH][next] drm/amd/display: fix spelling mistake: synatpics -> synaptics

2022-01-28 Thread Harry Wentland



On 2022-01-28 12:35, Colin Ian King wrote:
> There are quite a few spelling mistakes in various function names
> and error messages. Fix these.
> 
> Signed-off-by: Colin Ian King 

Reviewed-by: Harry Wentland 

Harry

> ---
>  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 32 +--
>  1 file changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> index 75b5299b3576..db4ab01267e4 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> @@ -539,7 +539,7 @@ bool dm_helpers_submit_i2c(
>  }
>  
>  #if defined(CONFIG_DRM_AMD_DC_DCN)
> -static bool execute_synatpics_rc_command(struct drm_dp_aux *aux,
> +static bool execute_synaptics_rc_command(struct drm_dp_aux *aux,
>   bool is_write_cmd,
>   unsigned char cmd,
>   unsigned int length,
> @@ -578,7 +578,7 @@ static bool execute_synatpics_rc_command(struct 
> drm_dp_aux *aux,
>   ret = drm_dp_dpcd_write(aux, SYNAPTICS_RC_COMMAND, &rc_cmd, 
> sizeof(rc_cmd));
>  
>   if (ret < 0) {
> - DRM_ERROR(" execute_synatpics_rc_command - write cmd ..., 
> err = %d\n", ret);
> + DRM_ERROR(" execute_synaptics_rc_command - write cmd ..., 
> err = %d\n", ret);
>   return false;
>   }
>  
> @@ -600,7 +600,7 @@ static bool execute_synatpics_rc_command(struct 
> drm_dp_aux *aux,
>   drm_dp_dpcd_read(aux, SYNAPTICS_RC_DATA, data, length);
>   }
>  
> - DC_LOG_DC(" execute_synatpics_rc_command - success = %d\n", 
> success);
> + DC_LOG_DC(" execute_synaptics_rc_command - success = %d\n", 
> success);
>  
>   return success;
>  }
> @@ -618,54 +618,54 @@ static void apply_synaptics_fifo_reset_wa(struct 
> drm_dp_aux *aux)
>   data[3] = 'U';
>   data[4] = 'S';
>  
> - if (!execute_synatpics_rc_command(aux, true, 0x01, 5, 0, data))
> + if (!execute_synaptics_rc_command(aux, true, 0x01, 5, 0, data))
>   return;
>  
>   // Step 3 and 4
> - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220998, data))
> + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220998, data))
>   return;
>  
>   data[0] &= (~(1 << 1)); // set bit 1 to 0
> - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220998, data))
> + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220998, data))
>   return;
>  
> - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
> + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
>   return;
>  
>   data[0] &= (~(1 << 1)); // set bit 1 to 0
> - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220D98, data))
> + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220D98, data))
>   return;
>  
> - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x221198, data))
> + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x221198, data))
>   return;
>  
>   data[0] &= (~(1 << 1)); // set bit 1 to 0
> - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x221198, data))
> + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x221198, data))
>   return;
>  
>   // Step 3 and 5
> - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220998, data))
> + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220998, data))
>   return;
>  
>   data[0] |= (1 << 1); // set bit 1 to 1
> - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x220998, data))
> + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x220998, data))
>   return;
>  
> - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
> + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x220D98, data))
>   return;
>  
>   data[0] |= (1 << 1); // set bit 1 to 1
>   return;
>  
> - if (!execute_synatpics_rc_command(aux, false, 0x31, 4, 0x221198, data))
> + if (!execute_synaptics_rc_command(aux, false, 0x31, 4, 0x221198, data))
>   return;
>  
>   data[0] |= (1 << 1); // set bit 1 to 1
> - if (!execute_synatpics_rc_command(aux, true, 0x21, 4, 0x221198, data))
> + if (!execute_synaptics_rc_command(aux, true, 0x21, 4, 0x221198, data))
>   return;
>  
>   // Step 6
> - if (!execute_synatpics_rc_command(aux, true, 0x02, 0, 0, NULL))
> + if (!execute_synaptics_rc_command(aux, true, 0x02, 0, 0, NULL))
>   return;
>  
>   DC_LOG_DC("Done apply_synaptics_fifo_reset_wa\n");



Re: [PATCH v4 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-28 Thread Felix Kuehling
Thank you, Alex, for your persistence with this patch series. Feel free to 
add my Acked-by to all the patches that don't already have my R-b. I 
have done pretty thorough reviews of previous versions of those patches, 
but obviously missed a lot of issues pointed out by real MM experts.


Thank you Alistair for your reviews, feedback and collaboration!

Regards,
  Felix


On 2022-01-27 at 18:20, Sierra Guiza, Alejandro (Alex) wrote:

Andrew,
We're somewhat new to this procedure. Are you referring to rebasing this
patch series onto
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
<5.17-rc1 tag>?


Regards,
Alex Sierra

Alex Deucher,
Just a quick heads up. This patch series contains changes to the 
amdgpu driver which we're

planning to merge through Andrew's tree, If that's ok with you.

Regards,
Alex Sierra

On 1/27/2022 4:32 PM, Andrew Morton wrote:
On Wed, 26 Jan 2022 21:09:39 -0600 Alex Sierra  
wrote:



This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like
MEMORY_DEVICE_PRIVATE.

Some more reviewer input appears to be desirable here.

I was going to tentatively add it to -mm and -next, but problems.
5.17-rc1's mm/migrate.c:migrate_vma_check_page() is rather different
from the tree you patched.  Please redo, refresh and resend?



Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-28 Thread Grodzovsky, Andrey
Just a gentle ping.

Andrey

From: Grodzovsky, Andrey
Sent: 26 January 2022 10:52
To: Christian König ; Koenig, Christian 
; Lazar, Lijo ; 
dri-de...@lists.freedesktop.org ; 
amd-gfx@lists.freedesktop.org ; Chen, JingWen 

Cc: Chen, Horace ; Liu, Monk 
Subject: Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs


JingWen - could you maybe give those patches a try on an SRIOV XGMI system? If
you see issues, maybe you could let me connect and debug. My SRIOV XGMI system,
which Shayun kindly arranged for me, is not loading the driver with my
drm-misc-next branch even without my patches.

Andrey

On 2022-01-17 14:21, Andrey Grodzovsky wrote:


On 2022-01-17 2:17 p.m., Christian König wrote:
Am 17.01.22 um 20:14 schrieb Andrey Grodzovsky:

Ping on the question

Oh, my! That was already more than a week ago and is completely swapped out of 
my head again.


Andrey

On 2022-01-05 1:11 p.m., Andrey Grodzovsky wrote:
Also, what about having the reset_active or in_reset flag in the reset_domain 
itself?

Offhand that sounds like a good idea.


What then about the adev->reset_sem semaphore? Should we also move this to
reset_domain? Both of the moves have functional
implications only for the XGMI case, because there will be contention over
accessing those single instance variables from multiple devices,
while now each device has its own copy.

Since this is an rw semaphore, that should be unproblematic, I think. It could 
just be that the cache line of the lock then plays ping/pong between the CPU 
cores.


What benefit does the centralization into reset_domain give? Is it, for example,
to prevent one device in a hive from trying to access another one's
VRAM (shared FB memory) through MMIO while the other one goes through reset?

I think that this is the killer argument for a centralized lock, yes.


np, I will add a patch centralizing both flags into the reset domain and resend.

Andrey


Christian.


Andrey



Re: [PATCH 2/2] drm/amdgpu: restructure amdgpu_fill_buffer

2022-01-28 Thread Felix Kuehling



On 2022-01-28 at 10:16, Christian König wrote:

We ran into the problem that clearing a really large buffer (60GiB) caused an
SDMA timeout.

Restructure the function to use the dst window instead of mapping the whole
buffer into the GART, and fill only 2MiB chunks at a time.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 200 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |   2 +
  2 files changed, 114 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 2b0e83e9fa8a..8671ba32fb52 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -296,9 +296,6 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
   struct dma_resv *resv,
   struct dma_fence **f)
  {
-   const uint32_t GTT_MAX_BYTES = (AMDGPU_GTT_MAX_TRANSFER_SIZE *
-   AMDGPU_GPU_PAGE_SIZE);
-
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
struct amdgpu_res_cursor src_mm, dst_mm;
struct dma_fence *fence = NULL;
@@ -320,12 +317,15 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
uint32_t cur_size;
uint64_t from, to;
  
-		/* Copy size cannot exceed GTT_MAX_BYTES. So if src or dst

-* begins at an offset, then adjust the size accordingly
+   /*
+* Copy size cannot exceed AMDGPU_GTT_MAX_TRANSFER_BYTES. So if
+* src or dst begins at an offset, then adjust the size
+* accordingly
 */
cur_size = max(src_page_offset, dst_page_offset);
cur_size = min(min3(src_mm.size, dst_mm.size, size),
-  (uint64_t)(GTT_MAX_BYTES - cur_size));
+  (uint64_t)(AMDGPU_GTT_MAX_TRANSFER_BYTES -
+ cur_size));
  
  		/* Map src to window 0 and dst to window 1. */

	r = amdgpu_ttm_map_buffer(src->bo, src->mem, &src_mm,
@@ -395,8 +395,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) {
struct dma_fence *wipe_fence = NULL;
  
-		r = amdgpu_fill_buffer(ttm_to_amdgpu_bo(bo), AMDGPU_POISON,

-  NULL, &wipe_fence);
+   r = amdgpu_fill_buffer(abo, AMDGPU_POISON, NULL, &wipe_fence);
if (r) {
goto error;
} else if (wipe_fence) {
@@ -1922,19 +1921,51 @@ void amdgpu_ttm_set_buffer_funcs_status(struct 
amdgpu_device *adev, bool enable)
adev->mman.buffer_funcs_enabled = enable;
  }
  
+static int amdgpu_ttm_prepare_job(struct amdgpu_device *adev,

+ bool direct_submit,
+ unsigned int num_dw,
+ struct dma_resv *resv,
+ bool vm_needs_flush,
+ struct amdgpu_job **job)
+{
+   enum amdgpu_ib_pool_type pool = direct_submit ?
+   AMDGPU_IB_POOL_DIRECT :
+   AMDGPU_IB_POOL_DELAYED;
+   int r;
+
+   r = amdgpu_job_alloc_with_ib(adev, num_dw * 4, pool, job);
+   if (r)
+   return r;
+
+   if (vm_needs_flush) {
+   (*job)->vm_pd_addr = amdgpu_gmc_pd_addr(adev->gmc.pdb0_bo ?
+   adev->gmc.pdb0_bo :
+   adev->gart.bo);
+   (*job)->vm_needs_flush = true;
+   }
+   if (resv) {
+   r = amdgpu_sync_resv(adev, &(*job)->sync, resv,
+AMDGPU_SYNC_ALWAYS,
+AMDGPU_FENCE_OWNER_UNDEFINED);
+   if (r) {
+   DRM_ERROR("sync failed (%d).\n", r);
+   amdgpu_job_free(*job);
+   return r;
+   }
+   }
+   return 0;
+}
+
  int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,
   uint64_t dst_offset, uint32_t byte_count,
   struct dma_resv *resv,
   struct dma_fence **fence, bool direct_submit,
   bool vm_needs_flush, bool tmz)
  {
-   enum amdgpu_ib_pool_type pool = direct_submit ? AMDGPU_IB_POOL_DIRECT :
-   AMDGPU_IB_POOL_DELAYED;
struct amdgpu_device *adev = ring->adev;
+   unsigned num_loops, num_dw;
struct amdgpu_job *job;
-
uint32_t max_bytes;
-   unsigned num_loops, num_dw;
unsigned i;
int r;
  
@@ -1946,26 +1977,11 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,

max_bytes = adev->mman.buffer_funcs->copy_max_bytes;

Re: [PATCH] drm/amdgpu: drop flood print in rlcg reg access function

2022-01-28 Thread Deucher, Alexander
[Public]

Acked-by: Alex Deucher 

From: Chen, Guchun 
Sent: Friday, January 28, 2022 10:19 AM
To: amd-gfx@lists.freedesktop.org ; Deucher, 
Alexander ; Koenig, Christian 
; Pan, Xinhui ; Zhang, Hawking 

Cc: Chen, Guchun 
Subject: [PATCH] drm/amdgpu: drop flood print in rlcg reg access function

A lot of the below messages are output in the SRIOV case.
amdgpu: indirect registers access through rlcg is not supported

Also drop the redundant ret assignment, as it's already initialized to false.

Fixes: d4cd09ca9bce ("drm/amdgpu: add helper to query rlcg reg access flag")
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 80c25176c993..b56cafb26f4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -849,9 +849,6 @@ static bool amdgpu_virt_get_rlcg_reg_access_flag(struct 
amdgpu_device *adev,
 }
 break;
 default:
-   dev_err(adev->dev,
-   "indirect registers access through rlcg is not 
supported\n");
-   ret = false;
 break;
 }
 return ret;
--
2.17.1



[PATCH] drm/amdgpu: drop flood print in rlcg reg access function

2022-01-28 Thread Guchun Chen
A lot of the below messages are output in the SRIOV case.
amdgpu: indirect registers access through rlcg is not supported

Also drop the redundant ret assignment, as it's already initialized to false.

Fixes: d4cd09ca9bce ("drm/amdgpu: add helper to query rlcg reg access flag")
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 80c25176c993..b56cafb26f4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -849,9 +849,6 @@ static bool amdgpu_virt_get_rlcg_reg_access_flag(struct 
amdgpu_device *adev,
}
break;
default:
-   dev_err(adev->dev,
-   "indirect registers access through rlcg is not 
supported\n");
-   ret = false;
break;
}
return ret;
-- 
2.17.1



[PATCH 1/2] drm/amdgpu: fix logic inversion in check

2022-01-28 Thread Christian König
We probably never trigger this, but the logic inside the check is
inverted.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3d8a20956b74..2b0e83e9fa8a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1938,7 +1938,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t 
src_offset,
unsigned i;
int r;
 
-   if (direct_submit && !ring->sched.ready) {
+   if (!direct_submit && !ring->sched.ready) {
DRM_ERROR("Trying to move memory with ring turned off.\n");
return -EINVAL;
}
-- 
2.25.1



[PATCH 2/2] drm/amdgpu: restructure amdgpu_fill_buffer

2022-01-28 Thread Christian König
We ran into the problem that clearing a really large buffer (60GiB) caused an
SDMA timeout.

Restructure the function to use the dst window instead of mapping the whole
buffer into the GART, and fill only 2MiB chunks at a time.
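
Conceptually the new fill path walks the destination resource one GART
window at a time, along the lines of the sketch below (map_window() and
sdma_fill() are stand-in names, not the literal helpers in the patch):

	struct amdgpu_res_cursor dst;
	uint64_t cur_size, gpu_addr;

	amdgpu_res_first(bo->tbo.resource, 0, size, &dst);
	while (dst.remaining) {
		/* clamp each step to one window, i.e. ~2MiB */
		cur_size = min(dst.size, (uint64_t)MAX_WINDOW_BYTES);
		map_window(&dst, cur_size, &gpu_addr);
		sdma_fill(ring, gpu_addr, src_data, cur_size);
		amdgpu_res_next(&dst, cur_size);
	}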

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 200 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |   2 +
 2 files changed, 114 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 2b0e83e9fa8a..8671ba32fb52 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -296,9 +296,6 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
   struct dma_resv *resv,
   struct dma_fence **f)
 {
-   const uint32_t GTT_MAX_BYTES = (AMDGPU_GTT_MAX_TRANSFER_SIZE *
-   AMDGPU_GPU_PAGE_SIZE);
-
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
struct amdgpu_res_cursor src_mm, dst_mm;
struct dma_fence *fence = NULL;
@@ -320,12 +317,15 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
uint32_t cur_size;
uint64_t from, to;
 
-   /* Copy size cannot exceed GTT_MAX_BYTES. So if src or dst
-* begins at an offset, then adjust the size accordingly
+   /*
+* Copy size cannot exceed AMDGPU_GTT_MAX_TRANSFER_BYTES. So if
+* src or dst begins at an offset, then adjust the size
+* accordingly
 */
cur_size = max(src_page_offset, dst_page_offset);
cur_size = min(min3(src_mm.size, dst_mm.size, size),
-  (uint64_t)(GTT_MAX_BYTES - cur_size));
+  (uint64_t)(AMDGPU_GTT_MAX_TRANSFER_BYTES -
+ cur_size));
 
/* Map src to window 0 and dst to window 1. */
	r = amdgpu_ttm_map_buffer(src->bo, src->mem, &src_mm,
@@ -395,8 +395,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) {
struct dma_fence *wipe_fence = NULL;
 
-   r = amdgpu_fill_buffer(ttm_to_amdgpu_bo(bo), AMDGPU_POISON,
-  NULL, &wipe_fence);
+   r = amdgpu_fill_buffer(abo, AMDGPU_POISON, NULL, &wipe_fence);
if (r) {
goto error;
} else if (wipe_fence) {
@@ -1922,19 +1921,51 @@ void amdgpu_ttm_set_buffer_funcs_status(struct 
amdgpu_device *adev, bool enable)
adev->mman.buffer_funcs_enabled = enable;
 }
 
+static int amdgpu_ttm_prepare_job(struct amdgpu_device *adev,
+ bool direct_submit,
+ unsigned int num_dw,
+ struct dma_resv *resv,
+ bool vm_needs_flush,
+ struct amdgpu_job **job)
+{
+   enum amdgpu_ib_pool_type pool = direct_submit ?
+   AMDGPU_IB_POOL_DIRECT :
+   AMDGPU_IB_POOL_DELAYED;
+   int r;
+
+   r = amdgpu_job_alloc_with_ib(adev, num_dw * 4, pool, job);
+   if (r)
+   return r;
+
+   if (vm_needs_flush) {
+   (*job)->vm_pd_addr = amdgpu_gmc_pd_addr(adev->gmc.pdb0_bo ?
+   adev->gmc.pdb0_bo :
+   adev->gart.bo);
+   (*job)->vm_needs_flush = true;
+   }
+   if (resv) {
+   r = amdgpu_sync_resv(adev, &(*job)->sync, resv,
+AMDGPU_SYNC_ALWAYS,
+AMDGPU_FENCE_OWNER_UNDEFINED);
+   if (r) {
+   DRM_ERROR("sync failed (%d).\n", r);
+   amdgpu_job_free(*job);
+   return r;
+   }
+   }
+   return 0;
+}
+
 int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,
   uint64_t dst_offset, uint32_t byte_count,
   struct dma_resv *resv,
   struct dma_fence **fence, bool direct_submit,
   bool vm_needs_flush, bool tmz)
 {
-   enum amdgpu_ib_pool_type pool = direct_submit ? AMDGPU_IB_POOL_DIRECT :
-   AMDGPU_IB_POOL_DELAYED;
struct amdgpu_device *adev = ring->adev;
+   unsigned num_loops, num_dw;
struct amdgpu_job *job;
-
uint32_t max_bytes;
-   unsigned num_loops, num_dw;
unsigned i;
int r;
 
@@ -1946,26 +1977,11 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, 
uint64_t src_offset,
max_bytes = adev->mman.buffer_funcs->copy_max_bytes;
num_loops = 

Re: [PATCH RESEND] drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina panels

2022-01-28 Thread Harry Wentland




On 1/28/22 08:06, Aditya Garg wrote:


Hi Alex


On 27-Jan-2022, at 11:06 PM, Alex Deucher  wrote:

C style comments please.

Shall be fixed in v2

  I'll let one of the display guys comment on
the rest of the patch.  Seems reasonable, we have a similar quirk for
the Apple MBP 2017 15" Retina panel later in this function.  Could you
move this next to the other quirk?

I guess moving it next to the other quirk may break the functionality of this
quirk, because the MBP 2018 one involves stuff regarding the firmware revision as
well. The original patch applies the quirk after the following lines of
code:


core_link_read_dpcd(
link,
DP_SINK_HW_REVISION_START,
(uint8_t *)&dp_hw_fw_revision,
sizeof(dp_hw_fw_revision));

link->dpcd_caps.sink_hw_revision =
dp_hw_fw_revision.ieee_hw_rev;

memmove(
link->dpcd_caps.sink_fw_revision,
dp_hw_fw_revision.ieee_fw_rev,
sizeof(dp_hw_fw_revision.ieee_fw_rev));

These seem to be related to the firmware stuff. Moving it along with the 2017
quirk doesn't sound right to me, as that would move the quirk BEFORE these
lines of code instead. Maybe the author also knowingly added the quirk after
these lines of code?

As a workaround, could we move the 2017 quirk later, instead of moving the 2018
quirk earlier? This sounds more logical to me.



I think either leaving the 2017 quirk in its original place or moving it 
down works. I don't have a strong preference.


With the comment style addressed this patch is
Reviewed-by: Harry Wentland 

Harry


Regards
Aditya


RE: [PATCH v4 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-28 Thread Deucher, Alexander
[Public]

> -Original Message-
> From: Sierra Guiza, Alejandro (Alex) 
> Sent: Thursday, January 27, 2022 6:21 PM
> To: Andrew Morton 
> Cc: Kuehling, Felix ; linux...@kvack.org;
> rcampb...@nvidia.com; linux-e...@vger.kernel.org; linux-
> x...@vger.kernel.org; amd-gfx@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; h...@lst.de; j...@nvidia.com;
> jgli...@redhat.com; apop...@nvidia.com; wi...@infradead.org; Deucher,
> Alexander 
> Subject: Re: [PATCH v4 00/10] Add MEMORY_DEVICE_COHERENT for
> coherent device memory mapping
> 
> Andrew,
> We're somehow new on this procedure. Are you referring to rebase this
> patch series to git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-
> next.git
> <5.17-rc1 tag>?
> 
> Regards,
> Alex Sierra
> 
> Alex Deucher,
> Just a quick heads up. This patch series contains changes to the amdgpu
> driver which we're planning to merge through Andrew's tree, If that's ok with
> you.

No problem.

Thanks!

Alex

> 
> Regards,
> Alex Sierra
> 
> On 1/27/2022 4:32 PM, Andrew Morton wrote:
> > On Wed, 26 Jan 2022 21:09:39 -0600 Alex Sierra 
> wrote:
> >
> >> This patch series introduces MEMORY_DEVICE_COHERENT, a type of
> memory
> >> owned by a device that can be mapped into CPU page tables like
> >> MEMORY_DEVICE_GENERIC and can also be migrated like
> >> MEMORY_DEVICE_PRIVATE.
> > Some more reviewer input appears to be desirable here.
> >
> > I was going to tentatively add it to -mm and -next, but problems.
> > 5.17-rc1's mm/migrate.c:migrate_vma_check_page() is rather different
> > from the tree you patched.  Please redo, refresh and resend?
> >


Re: [PATCH V3 7/7] drm/amd/pm: revise the implementation of smu_cmn_disable_all_features_with_exception

2022-01-28 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: Quan, Evan 
Sent: Friday, January 28, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Lazar, Lijo 
; Quan, Evan 
Subject: [PATCH V3 7/7] drm/amd/pm: revise the implementation of 
smu_cmn_disable_all_features_with_exception

As there is no internal cache for enabled ppfeatures now, the 2nd
parameter is not needed any more.

Signed-off-by: Evan Quan 
Change-Id: I0c1811f216c55d6ddfabdc9e099dc214c21bdf2e
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 9 ++---
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 -
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 7 ---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 1 -
 drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   | 2 +-
 5 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 59be1c822b2c..1c9c11a92d42 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -1360,9 +1360,7 @@ static int smu_disable_dpms(struct smu_context *smu)
 case IP_VERSION(11, 5, 0):
 case IP_VERSION(11, 0, 12):
 case IP_VERSION(11, 0, 13):
-   return smu_disable_all_features_with_exception(smu,
-  true,
-  
SMU_FEATURE_COUNT);
+   return 0;
 default:
 break;
 }
@@ -1378,9 +1376,7 @@ static int smu_disable_dpms(struct smu_context *smu)
 case IP_VERSION(11, 0, 0):
 case IP_VERSION(11, 0, 5):
 case IP_VERSION(11, 0, 9):
-   return smu_disable_all_features_with_exception(smu,
-  true,
-  
SMU_FEATURE_BACO_BIT);
+   return 0;
 default:
 break;
 }
@@ -1392,7 +1388,6 @@ static int smu_disable_dpms(struct smu_context *smu)
  */
 if (use_baco && smu_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT)) {
 ret = smu_disable_all_features_with_exception(smu,
- false,
   
SMU_FEATURE_BACO_BIT);
 if (ret)
 dev_err(adev->dev, "Failed to disable smu features 
except BACO.\n");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 721b4080d3e6..55b24988455d 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -990,7 +990,6 @@ struct pptable_funcs {
  *   exception to those in 
  */
 int (*disable_all_features_with_exception)(struct smu_context *smu,
-  bool no_hw_disablement,
enum smu_feature_mask mask);

 /**
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index acb9f0ca191b..2a6b752a6996 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -767,9 +767,6 @@ int smu_cmn_set_pp_feature_mask(struct smu_context *smu,
  *   @mask
  *
  * @smu:   smu_context pointer
- * @no_hw_disablement: whether real dpm disablement should be performed
- * true: update the cache(about dpm enablement state) only
- * false: real dpm disablement plus cache update
  * @mask:  the dpm feature which should not be disabled
  * SMU_FEATURE_COUNT: no exception, all dpm features
  * to disable
@@ -778,7 +775,6 @@ int smu_cmn_set_pp_feature_mask(struct smu_context *smu,
  * 0 on success or a negative error code on failure.
  */
 int smu_cmn_disable_all_features_with_exception(struct smu_context *smu,
-   bool no_hw_disablement,
 enum smu_feature_mask mask)
 {
 uint64_t features_to_disable = U64_MAX;
@@ -794,9 +790,6 @@ int smu_cmn_disable_all_features_with_exception(struct 
smu_context *smu,
 features_to_disable &= ~(1ULL << skipped_feature_id);
 }

-   if (no_hw_disablement)
-   return 0;
-
 return smu_cmn_feature_update_enable_state(smu,
features_to_disable,
0);

Re: [PATCH V3 6/7] drm/amd/pm: avoid consecutive retrieving for enabled ppfeatures

2022-01-28 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: Quan, Evan 
Sent: Friday, January 28, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Lazar, Lijo 
; Quan, Evan 
Subject: [PATCH V3 6/7] drm/amd/pm: avoid consecutive retrieving for enabled 
ppfeatures

As the enabled ppfeatures have just been retrieved ahead of this, we can use
that result directly instead of retrieving it again and again.

Signed-off-by: Evan Quan 
Change-Id: I08827437fcbbc52084418c8ca6a90cfa503306a9
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 3d263b27b6c2..acb9f0ca191b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -680,6 +680,7 @@ size_t smu_cmn_get_pp_feature_mask(struct smu_context *smu,
 int8_t sort_feature[SMU_FEATURE_COUNT];
 size_t size = 0;
 int ret = 0, i;
+   int feature_id;

 ret = smu_cmn_get_enabled_mask(smu,
&feature_mask);
@@ -708,11 +709,18 @@ size_t smu_cmn_get_pp_feature_mask(struct smu_context 
*smu,
 if (sort_feature[i] < 0)
 continue;

+   /* convert to asic specific feature ID */
+   feature_id = smu_cmn_to_asic_specific_index(smu,
+   
CMN2ASIC_MAPPING_FEATURE,
+   sort_feature[i]);
+   if (feature_id < 0)
+   continue;
+
 size += sysfs_emit_at(buf, size, "%02d. %-20s (%2d) : %s\n",
 count++,
 smu_get_feature_name(smu, sort_feature[i]),
 i,
-   !!smu_cmn_feature_is_enabled(smu, 
sort_feature[i]) ?
+   !!test_bit(feature_id, (unsigned long 
*)&feature_mask) ?
 "enabled" : "disabled");
 }

--
2.29.0



Re: [PATCH V3 5/7] drm/amd/pm: drop the cache for enabled ppfeatures

2022-01-28 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: Quan, Evan 
Sent: Friday, January 28, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Lazar, Lijo 
; Quan, Evan 
Subject: [PATCH V3 5/7] drm/amd/pm: drop the cache for enabled ppfeatures

The following scenarios make the driver cache for enabled ppfeatures
outdated and invalid:
  - Other tools interact with PMFW to change the enabled ppfeatures.
  - PMFW may enable/disable some features behind the driver's back. E.g.
for sienna_cichlid, on gfxoff entering, PMFW will disable gfx
related DPM features. All of this is performed without the driver's
notice.
Also, considering the driver does not actually interact with PMFW that
frequently, the benefit brought by such a cache is very limited.
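
With the cache gone, a feature-enabled check simply queries PMFW each
time, roughly as below (a sketch mirroring the helpers this series
touches):

	uint64_t feature_mask;

	if (smu_cmn_get_enabled_mask(smu, &feature_mask))
		return false;

	return test_bit(feature_id, (unsigned long *)&feature_mask);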

Signed-off-by: Evan Quan 
Change-Id: I20ed58ab216e930c7a5d223be1eb99146889f2b3
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  1 -
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 23 +-
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 16 +--
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 23 +-
 .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  | 16 +--
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 46 +--
 7 files changed, 17 insertions(+), 109 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 803068cb5079..59be1c822b2c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -950,7 +950,6 @@ static int smu_sw_init(void *handle)
 smu->pool_size = adev->pm.smu_prv_buffer_size;
 smu->smu_feature.feature_num = SMU_FEATURE_MAX;
 bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
-   bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
 bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);

	mutex_init(&smu->message_lock);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 8cd1c3bb595a..721b4080d3e6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -390,7 +390,6 @@ struct smu_feature
 uint32_t feature_num;
 DECLARE_BITMAP(supported, SMU_FEATURE_MAX);
 DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
-   DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
 };

 struct smu_clocks {
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
index d36b64371492..d71155a66f97 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
@@ -798,27 +798,8 @@ int smu_v11_0_set_allowed_mask(struct smu_context *smu)
 int smu_v11_0_system_features_control(struct smu_context *smu,
  bool en)
 {
-   struct smu_feature *feature = &smu->smu_feature;
-   uint64_t feature_mask;
-   int ret = 0;
-
-   ret = smu_cmn_send_smc_msg(smu, (en ? SMU_MSG_EnableAllSmuFeatures :
-SMU_MSG_DisableAllSmuFeatures), NULL);
-   if (ret)
-   return ret;
-
-   bitmap_zero(feature->enabled, feature->feature_num);
-
-   if (en) {
-   ret = smu_cmn_get_enabled_mask(smu, &feature_mask);
-   if (ret)
-   return ret;
-
-   bitmap_copy(feature->enabled, (unsigned long *)&feature_mask,
-   feature->feature_num);
-   }
-
-   return ret;
+   return smu_cmn_send_smc_msg(smu, (en ? SMU_MSG_EnableAllSmuFeatures :
+ SMU_MSG_DisableAllSmuFeatures), NULL);
 }

 int smu_v11_0_notify_display_change(struct smu_context *smu)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 478151e72889..96a5b31f708d 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -1947,27 +1947,13 @@ static int vangogh_get_dpm_clock_table(struct 
smu_context *smu, struct dpm_clock
 static int vangogh_system_features_control(struct smu_context *smu, bool en)
 {
 struct amdgpu_device *adev = smu->adev;
-   struct smu_feature *feature = &smu->smu_feature;
-   uint64_t feature_mask;
 int ret = 0;

 if (adev->pm.fw_version >= 0x43f1700 && !en)
 ret = smu_cmn_send_smc_msg_with_param(smu, 
SMU_MSG_RlcPowerNotify,
   RLC_STATUS_OFF, NULL);

-   bitmap_zero(feature->enabled, feature->feature_num);
-
-   if (!en)
-   return ret;
-
-   ret = smu_cmn_get_enabled_mask(smu, &feature_mask);
-   if (ret)
-   return ret;
-
-   bitmap_copy(feature->enabled, (unsigned long *)&feature_mask,
-   feature->feature_num);
-
-   return 0;

Re: [PATCH V3 4/7] drm/amd/pm: correct the usage for 'supported' member of smu_feature structure

2022-01-28 Thread Deucher, Alexander
[AMD Official Use Only]

Reviewed-by: Alex Deucher 

From: Quan, Evan 
Sent: Friday, January 28, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Lazar, Lijo 
; Quan, Evan 
Subject: [PATCH V3 4/7] drm/amd/pm: correct the usage for 'supported' member of 
smu_feature structure

The supported features should be retrieved just after the EnableAllDpmFeatures
message completes. And the check (whether some dpm feature is supported) is only
needed when we decide to enable or disable it.

Signed-off-by: Evan Quan 
Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 11 +++
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 
 .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c|  3 ---
 drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  3 ---
 drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
 7 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index ae48cc5aa567..803068cb5079 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -1057,8 +1057,10 @@ static int smu_get_thermal_temperature_range(struct 
smu_context *smu)

 static int smu_smc_hw_setup(struct smu_context *smu)
 {
+   struct smu_feature *feature = &smu->smu_feature;
 struct amdgpu_device *adev = smu->adev;
 uint32_t pcie_gen = 0, pcie_width = 0;
+   uint64_t features_supported;
 int ret = 0;

 if (adev->in_suspend && smu_is_dpm_running(smu)) {
@@ -1138,6 +1140,15 @@ static int smu_smc_hw_setup(struct smu_context *smu)
 return ret;
 }

+   ret = smu_feature_get_enabled_mask(smu, &features_supported);
+   if (ret) {
+   dev_err(adev->dev, "Failed to retrieve supported dpm 
features!\n");
+   return ret;
+   }
+   bitmap_copy(feature->supported,
+   (unsigned long *)&features_supported,
+   feature->feature_num);
+
 if (!smu_is_dpm_running(smu))
 dev_info(adev->dev, "dpm has been disabled\n");

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 84cbde3f913d..f55ead5f9aba 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -1624,8 +1624,8 @@ static int navi10_display_config_changed(struct 
smu_context *smu)
 int ret = 0;

 if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
 ret = smu_cmn_send_smc_msg_with_param(smu, 
SMU_MSG_NumOfDisplays,
   
smu->display_config->num_display,
   NULL);
@@ -1860,13 +1860,13 @@ static int navi10_notify_smc_display_config(struct 
smu_context *smu)
 min_clocks.dcef_clock_in_sr = 
smu->display_config->min_dcef_deep_sleep_set_clk;
 min_clocks.memory_clock = smu->display_config->min_mem_set_clock;

-   if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
+   if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
 clock_req.clock_type = amd_pp_dcef_clock;
 clock_req.clock_freq_in_khz = min_clocks.dcef_clock * 10;

 ret = smu_v11_0_display_clock_voltage_request(smu, _req);
 if (!ret) {
-   if (smu_cmn_feature_is_supported(smu, 
SMU_FEATURE_DS_DCEFCLK_BIT)) {
+   if (smu_cmn_feature_is_enabled(smu, 
SMU_FEATURE_DS_DCEFCLK_BIT)) {
 ret = smu_cmn_send_smc_msg_with_param(smu,
   
SMU_MSG_SetMinDeepSleepDcefclk,
   
min_clocks.dcef_clock_in_sr/100,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index b6759f8b5167..804e1c98238d 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -1280,8 +1280,8 @@ static int sienna_cichlid_display_config_changed(struct 
smu_context *smu)
 int ret = 0;

 if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
-   

Re: [PATCH V3 2/7] drm/amd/pm: unify the interface for retrieving enabled ppfeatures

2022-01-28 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: Quan, Evan 
Sent: Friday, January 28, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Lazar, Lijo 
; Quan, Evan 
Subject: [PATCH V3 2/7] drm/amd/pm: unify the interface for retrieving enabled 
ppfeatures

Instead of having two interfaces which do the same thing.
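
As a sketch of why one interface suffices (illustrative C, not the
driver code): the unified getter keeps the two-u32 signature, and a
caller that wants a 64-bit view can simply combine the two words, as
yellow_carp_is_dpm_running() does below.

#include <stdint.h>

/* stand-in for the unified smu_cmn_get_enabled_mask() */
static int get_enabled_mask(uint32_t *mask, uint32_t num)
{
	if (num < 2)
		return -1;
	mask[0] = 0x0000beef;	/* low word from firmware */
	mask[1] = 0x0000dead;	/* high word from firmware */
	return 0;
}

static uint64_t caller_wanting_u64(void)
{
	uint32_t mask[2] = {0, 0};

	if (get_enabled_mask(mask, 2))
		return 0;
	return ((uint64_t)mask[1] << 32) | mask[0];
}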

Signed-off-by: Evan Quan 
Change-Id: I6302c9b5abdb999c4b7c83a0d1852181208b1c1f
--
v1->v2:
  - use SMU IP version check rather than an asic type check(Alex)
---
 .../amd/pm/swsmu/smu11/cyan_skillfish_ppt.c   |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  6 +-
 .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  6 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 95 ---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  4 -
 5 files changed, 46 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
index 2f57333e6071..cc080a0075ee 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
@@ -357,7 +357,7 @@ static bool cyan_skillfish_is_dpm_running(struct 
smu_context *smu)
 if (adev->in_suspend)
 return false;

-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
 if (ret)
 return false;

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 721027917f81..b4a3c9b8b54e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -507,7 +507,7 @@ static bool vangogh_is_dpm_running(struct smu_context *smu)
 if (adev->in_suspend)
 return false;

-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);

 if (ret)
 return false;
@@ -1965,7 +1965,7 @@ static int vangogh_system_features_control(struct 
smu_context *smu, bool en)
 if (!en)
 return ret;

-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
 if (ret)
 return ret;

@@ -2182,7 +2182,7 @@ static const struct pptable_funcs vangogh_ppt_funcs = {
 .dpm_set_jpeg_enable = vangogh_dpm_set_jpeg_enable,
 .is_dpm_running = vangogh_is_dpm_running,
 .read_sensor = vangogh_read_sensor,
-   .get_enabled_mask = smu_cmn_get_enabled_32_bits_mask,
+   .get_enabled_mask = smu_cmn_get_enabled_mask,
 .get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
 .set_watermarks_table = vangogh_set_watermarks_table,
 .set_driver_table_location = smu_v11_0_set_driver_table_location,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
index bd24a2632214..f425827e2361 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
@@ -209,7 +209,7 @@ static int yellow_carp_system_features_control(struct 
smu_context *smu, bool en)
 if (!en)
 return ret;

-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
 if (ret)
 return ret;

@@ -258,7 +258,7 @@ static bool yellow_carp_is_dpm_running(struct smu_context 
*smu)
 uint32_t feature_mask[2];
 uint64_t feature_enabled;

-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);

 if (ret)
 return false;
@@ -1174,7 +1174,7 @@ static const struct pptable_funcs yellow_carp_ppt_funcs = 
{
 .is_dpm_running = yellow_carp_is_dpm_running,
 .set_watermarks_table = yellow_carp_set_watermarks_table,
 .get_gpu_metrics = yellow_carp_get_gpu_metrics,
-   .get_enabled_mask = smu_cmn_get_enabled_32_bits_mask,
+   .get_enabled_mask = smu_cmn_get_enabled_mask,
 .get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
 .set_driver_table_location = smu_v13_0_set_driver_table_location,
 .gfx_off_control = smu_v13_0_gfx_off_control,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c3c679bf9d9f..c2e6c8b603da 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -545,67 +545,59 @@ int smu_cmn_get_enabled_mask(struct smu_context *smu,
  uint32_t *feature_mask,
  uint32_t num)
 {
-   uint32_t feature_mask_high = 0, feature_mask_low = 0;
	struct smu_feature *feature = &smu->smu_feature;
+   struct 

Re: [PATCH] drm/amdgpu: Fix an error message in rmmod

2022-01-28 Thread Felix Kuehling
I see, thanks for clarifying. So this is happening because we unmap the 
HIQ with direct MMIO register writes instead of using the KIQ.



I'm OK with this patch as a workaround, but as a proper fix, we should 
probably add a hiq_hqd_destroy function that uses KIQ, similar to how we 
have hiq_mqd_load functions that use KIQ to map the HIQ.
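
Something like the following shape, purely as a sketch (every helper and
type here is hypothetical, not the real amdgpu API): build an
UNMAP_QUEUES packet for the HIQ and submit it on the KIQ ring, so the
doorbell write forces gfxoff exit before the CP sees the request.

struct mock_queue { int queue_id; };
struct mock_ring { int doorbell_index; };
struct mock_dev { struct mock_ring *kiq_ring; };

/* hypothetical: emit an UNMAP_QUEUES PM4 packet targeting @q */
static int kiq_emit_unmap_queues(struct mock_ring *kiq, struct mock_queue *q)
{
	(void)kiq; (void)q;
	return 0;
}

/* hypothetical: commit the ring contents and ring its doorbell */
static int kiq_commit_and_ring_doorbell(struct mock_ring *kiq)
{
	(void)kiq;
	return 0;
}

static int hiq_hqd_destroy_via_kiq(struct mock_dev *dev, struct mock_queue *hiq)
{
	int ret = kiq_emit_unmap_queues(dev->kiq_ring, hiq);

	if (ret)
		return ret;
	return kiq_commit_and_ring_doorbell(dev->kiq_ring);
}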



Regards,
  Felix



Am 2022-01-27 um 21:34 schrieb Yin, Tianci (Rico):


[AMD Official Use Only]


The error message is from the HIQ dequeue procedure, not from an HCQ, so
there is no doorbell write.


Jan 25 16:10:58 lnx-ci-node kernel: [18161.477067] Call Trace:
Jan 25 16:10:58 lnx-ci-node kernel: [18161.477072]  dump_stack+0x7d/0x9c
Jan 25 16:10:58 lnx-ci-node kernel: [18161.477651] 
 hqd_destroy_v10_3+0x58/0x254 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.48] 
 destroy_mqd+0x1e/0x30 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.477884] 
 kernel_queue_uninit+0xcf/0x100 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.477985] 
 pm_uninit+0x1a/0x30 [amdgpu] #kernel_queue_uninit(pm->priv_queue, 
hanging); this priv_queue == HIQ
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478127] 
 stop_cpsch+0x98/0x100 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478242] 
 kgd2kfd_suspend.part.0+0x32/0x50 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478338] 
 kgd2kfd_suspend+0x1b/0x20 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478433] 
 amdgpu_amdkfd_suspend+0x1e/0x30 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478529] 
 amdgpu_device_fini_hw+0x182/0x335 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478655] 
 amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478732] 
 amdgpu_pci_remove+0x27/0x40 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478806] 
 pci_device_remove+0x3e/0xb0
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478809] 
 device_release_driver_internal+0x103/0x1d0
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478813] 
 driver_detach+0x4c/0x90
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478814] 
 bus_remove_driver+0x5c/0xd0
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478815] 
 driver_unregister+0x31/0x50
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478817] 
 pci_unregister_driver+0x40/0x90
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478818] 
 amdgpu_exit+0x15/0x2d1 [amdgpu]
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478942] 
 __x64_sys_delete_module+0x147/0x260
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478944]  ? 
exit_to_user_mode_prepare+0x41/0x1d0

Jan 25 16:10:58 lnx-ci-node kernel: [18161.478946]  ? ksys_write+0x67/0xe0
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478948] 
 do_syscall_64+0x40/0xb0
Jan 25 16:10:58 lnx-ci-node kernel: [18161.478951] 
 entry_SYSCALL_64_after_hwframe+0x44/0xae


Regards,
Rico

*From:* Kuehling, Felix 
*Sent:* Thursday, January 27, 2022 23:28
*To:* Yin, Tianci (Rico) ; Wang, Yang(Kevin) 
; amd-gfx@lists.freedesktop.org 

*Cc:* Grodzovsky, Andrey ; Chen, Guchun 


*Subject:* Re: [PATCH] drm/amdgpu: Fix an error message in rmmod
The hang you're seeing is the result of a command submission of an
UNMAP_QUEUES and QUERY_STATUS command to the HIQ. This is done using a
doorbell. KFD writes commands to the HIQ and rings a doorbell to wake up
the HWS (see kq_submit_packet in kfd_kernel_queue.c). Why does this
doorbell not trigger gfxoff exit during rmmod?


Regards,
   Felix



Am 2022-01-26 um 22:38 schrieb Yin, Tianci (Rico):
>
> [AMD Official Use Only]
>
>
> The rmmod op has the prerequisites of the multi-user target and
> blacklisting amdgpu, which are IGT requirements so that IGT can make
> itself DRM master to test KMS.
> igt-gpu-tools/build/tests/amdgpu/amd_module_load --run-subtest reload
>
> From my understanding, the KFD process follows the regular gfxoff exit
> path, where a doorbell write triggers gfxoff exit. For example, KFD
> maps HCQs through commands on the HIQ or KIQ ring, or UMD commits jobs
> on HCQs; both trigger doorbell writes (please refer to
> gfx_v10_0_ring_set_wptr_compute()).
>
> As to the IGT reload test, the dequeue request is not issued through a
> command on a ring; it directly writes CP registers, so the GFX core
> remains in gfxoff.
>
> Thanks,
> Rico
>
> 
> *From:* Kuehling, Felix 
> *Sent:* Wednesday, January 26, 2022 23:08
> *To:* Yin, Tianci (Rico) ; Wang, Yang(Kevin)
> ; amd-gfx@lists.freedesktop.org
> 
> *Cc:* Grodzovsky, Andrey ; Chen, Guchun
> 
> *Subject:* Re: [PATCH] drm/amdgpu: Fix an error message in rmmod
> My question is, why is this problem only seen during module unload? Why
> aren't we seeing HWS hangs due to GFX_OFF all the time in normal
> operations? For example when the GPU is idle and a new KFD process is
> started, creating a new runlist. Are we just getting lucky because the
> process first has to allocate some memory, which maybe makes some HW
> access (flushing TLBs etc.) that wakes up the GPU?
>
>
> Regards,
>    

Re: [PATCH] drm/amdgpu: Fix uninitialized variable use warning

2022-01-28 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: Lazar, Lijo 
Sent: Friday, January 28, 2022 1:40 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Zhang, Hawking ; Deucher, Alexander 
; kernel test robot 
Subject: [PATCH] drm/amdgpu: Fix uninitialized variable use warning

Fix uninitialized variable use
warning: variable 'reg_access_ctrl' is uninitialized when used here 
[-Wuninitialized]
 scratch_reg0 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg0;

Fixes: 51263163eb3f ("drm/amdgpu: add helper for rlcg indirect reg access")

Reported-by: kernel test robot 
Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 80c25176c993..c13765218919 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -875,6 +875,7 @@ static u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device 
*adev, u32 offset, u32 v
 return 0;
 }

+   reg_access_ctrl = &adev->gfx.rlc.reg_access_ctrl;
 scratch_reg0 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg0;
 scratch_reg1 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg1;
 scratch_reg2 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg2;
--
2.25.1



RE: [PATCH 1/1] drm/amdkfd: Fix variable set but not used warning

2022-01-28 Thread Kasiviswanathan, Harish
[AMD Official Use Only]

Reviewed-By: Harish Kasiviswanathan 

-Original Message-
From: amd-gfx  On Behalf Of Philip Yang
Sent: Friday, January 28, 2022 9:39 AM
To: amd-gfx@lists.freedesktop.org
Cc: Yang, Philip 
Subject: [PATCH 1/1] drm/amdkfd: Fix variable set but not used warning

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c: In function
'svm_range_deferred_list_work':
>> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2067:22: warning:
variable 'p' set but not used [-Wunused-but-set-variable]
2067 |  struct kfd_process *p;
 |

Fixes: 8b633bdc86671 ("drm/amdkfd: Ensure mm remain valid in svm deferred_list work")

Reported-by: kernel test robot 
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 649c1d2b9607..9a509ec8c327 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2079,13 +2079,10 @@ static void svm_range_deferred_list_work(struct 
work_struct *work)
struct svm_range_list *svms;
struct svm_range *prange;
struct mm_struct *mm;
-   struct kfd_process *p;
 
svms = container_of(work, struct svm_range_list, deferred_list_work);
pr_debug("enter svms 0x%p\n", svms);
 
-   p = container_of(svms, struct kfd_process, svms);
-
	spin_lock(&svms->deferred_list_lock);
	while (!list_empty(&svms->deferred_range_list)) {
		prange = list_first_entry(&svms->deferred_range_list,
--
2.17.1


Re: [PATCH v2 1/28] drm/amdgpu: fix the issue that the number of the crtc of the 3250c is not correct

2022-01-28 Thread Deucher, Alexander
[Public]

Reviewed-by: Alex Deucher 

From: RyanLin 
Sent: Thursday, January 27, 2022 10:47 PM
To: Wentland, Harry ; Li, Sun peng (Leo) 
; Deucher, Alexander ; Koenig, 
Christian ; david1.z...@amd.com 
; airl...@linux.ie ; dan...@ffwll.ch 
; seanp...@chromium.org ; 
b...@basnieuwenhuizen.nl ; Kazlauskas, Nicholas 
; sas...@kernel.org ; 
markyac...@google.com ; victorchengchi...@amd.com 
; ching-shih...@amd.corp-partner.google.com 
; Siqueira, Rodrigo 
; ddavenp...@chromium.org ; 
amd-gfx@lists.freedesktop.org ; 
dri-de...@lists.freedesktop.org ; 
linux-ker...@vger.kernel.org ; Li, Leon 

Cc: Lin, Tsung-hua (Ryan) 
Subject: [PATCH v2 1/28] drm/amdgpu: fix the issue that the number of the crtc 
of the 3250c is not correct

v2:
  - remove unnecessary comments and Id

[Why]
External displays take priority over internal display when there are fewer
display controllers than displays.

[How]
The root cause is that the number of CRTCs is not correct.
The number of CRTCs on the 3250c is 3, but on the 3500c it is 4.
From the source code, we can see that the number of CRTCs has been fixed at 4.
We need to set num_crtc to 3 for the 3250c platform.

Signed-off-by: RyanLin 

---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 40c91b448f7d..455a2c45e8cd 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2738,9 +2738,15 @@ static int dm_early_init(void *handle)
 break;
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
 case CHIP_RAVEN:
-   adev->mode_info.num_crtc = 4;
-   adev->mode_info.num_hpd = 4;
-   adev->mode_info.num_dig = 4;
+   if (adev->rev_id >= 8) {
+   adev->mode_info.num_crtc = 3;
+   adev->mode_info.num_hpd = 3;
+   adev->mode_info.num_dig = 3;
+   } else {
+   adev->mode_info.num_crtc = 4;
+   adev->mode_info.num_hpd = 4;
+   adev->mode_info.num_dig = 4;
+   }
 break;
 #endif
 #if defined(CONFIG_DRM_AMD_DC_DCN2_0)
--
2.25.1



[PATCH 1/1] drm/amdkfd: Fix variable set but not used warning

2022-01-28 Thread Philip Yang
All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c: In function
'svm_range_deferred_list_work':
>> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2067:22: warning:
variable 'p' set but not used [-Wunused-but-set-variable]
2067 |  struct kfd_process *p;
 |

Fixes: 8b633bdc86671 ("drm/amdkfd: Ensure mm remain valid in svm deferred_list work")

Reported-by: kernel test robot 
Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 649c1d2b9607..9a509ec8c327 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2079,13 +2079,10 @@ static void svm_range_deferred_list_work(struct 
work_struct *work)
struct svm_range_list *svms;
struct svm_range *prange;
struct mm_struct *mm;
-   struct kfd_process *p;
 
svms = container_of(work, struct svm_range_list, deferred_list_work);
pr_debug("enter svms 0x%p\n", svms);
 
-   p = container_of(svms, struct kfd_process, svms);
-
	spin_lock(&svms->deferred_list_lock);
	while (!list_empty(&svms->deferred_range_list)) {
		prange = list_first_entry(&svms->deferred_range_list,
-- 
2.17.1



Re: [PATCH v11 5/5] drm/amdgpu: add drm buddy support to amdgpu

2022-01-28 Thread Matthew Auld
On Thu, 27 Jan 2022 at 14:11, Arunpravin
 wrote:
>
> - Remove drm_mm references and replace with drm buddy functionalities
> - Add res cursor support for drm buddy
>
> v2(Matthew Auld):
>   - replace spinlock with mutex as we call kmem_cache_zalloc
> (..., GFP_KERNEL) in drm_buddy_alloc() function
>
>   - lock drm_buddy_block_trim() function as it calls
> mark_free/mark_split are all globally visible
>
> v3(Matthew Auld):
>   - remove trim method error handling as we address the failure case
> at drm_buddy_block_trim() function
>
> v4:
>   - fix warnings reported by kernel test robot 
>
> v5:
>   - fix merge conflict issue
>
> v6:
>   - fix warnings reported by kernel test robot 
>
> Signed-off-by: Arunpravin 
> ---
>  drivers/gpu/drm/Kconfig   |   1 +
>  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   7 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 259 ++
>  4 files changed, 231 insertions(+), 133 deletions(-)



>
> -/**
> - * amdgpu_vram_mgr_virt_start - update virtual start address
> - *
> - * @mem: ttm_resource to update
> - * @node: just allocated node
> - *
> - * Calculate a virtual BO start address to easily check if everything is CPU
> - * accessible.
> - */
> -static void amdgpu_vram_mgr_virt_start(struct ttm_resource *mem,
> -  struct drm_mm_node *node)
> -{
> -   unsigned long start;
> -
> -   start = node->start + node->size;
> -   if (start > mem->num_pages)
> -   start -= mem->num_pages;
> -   else
> -   start = 0;
> -   mem->start = max(mem->start, start);
> -}
> -
>  /**
>   * amdgpu_vram_mgr_new - allocate new ranges
>   *
> @@ -366,13 +357,13 @@ static int amdgpu_vram_mgr_new(struct 
> ttm_resource_manager *man,
>const struct ttm_place *place,
>struct ttm_resource **res)
>  {
> -   unsigned long lpfn, num_nodes, pages_per_node, pages_left, pages;
> +   unsigned long lpfn, pages_per_node, pages_left, pages, n_pages;
> +   u64 vis_usage = 0, mem_bytes, max_bytes, min_page_size;
> struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
> struct amdgpu_device *adev = to_amdgpu_device(mgr);
> -   uint64_t vis_usage = 0, mem_bytes, max_bytes;
> -   struct ttm_range_mgr_node *node;
> -   struct drm_mm *mm = &mgr->mm;
> -   enum drm_mm_insert_mode mode;
> +   struct amdgpu_vram_mgr_node *node;
> +   struct drm_buddy *mm = &mgr->mm;
> +   struct drm_buddy_block *block;
> unsigned i;
> int r;
>
> @@ -391,10 +382,9 @@ static int amdgpu_vram_mgr_new(struct 
> ttm_resource_manager *man,
> goto error_sub;
> }
>
> -   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
> +   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
> pages_per_node = ~0ul;
> -   num_nodes = 1;
> -   } else {
> +   else {
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> pages_per_node = HPAGE_PMD_NR;
>  #else
> @@ -403,11 +393,9 @@ static int amdgpu_vram_mgr_new(struct 
> ttm_resource_manager *man,
>  #endif
> pages_per_node = max_t(uint32_t, pages_per_node,
>tbo->page_alignment);
> -   num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), 
> pages_per_node);
> }
>
> -   node = kvmalloc(struct_size(node, mm_nodes, num_nodes),
> -   GFP_KERNEL | __GFP_ZERO);
> +   node = kzalloc(sizeof(*node), GFP_KERNEL);
> if (!node) {
> r = -ENOMEM;
> goto error_sub;
> @@ -415,9 +403,17 @@ static int amdgpu_vram_mgr_new(struct 
> ttm_resource_manager *man,
>
> ttm_resource_init(tbo, place, &node->base);
>
> -   mode = DRM_MM_INSERT_BEST;
> +   INIT_LIST_HEAD(&node->blocks);
> +
> if (place->flags & TTM_PL_FLAG_TOPDOWN)
> -   mode = DRM_MM_INSERT_HIGH;
> +   node->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
> +
> +   if (place->fpfn || lpfn != man->size)
> +   /* Allocate blocks in desired range */
> +   node->flags |= DRM_BUDDY_RANGE_ALLOCATION;
> +
> +   min_page_size = mgr->default_page_size;
> +   BUG_ON(min_page_size < mm->chunk_size);
>
> pages_left = node->base.num_pages;
>
> @@ -425,36 +421,61 @@ static int amdgpu_vram_mgr_new(struct 
> ttm_resource_manager *man,
> pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));
>
> i = 0;
> -   spin_lock(&mgr->lock);
> while (pages_left) {
> -   uint32_t alignment = tbo->page_alignment;
> -
> if (pages >= pages_per_node)
> -   alignment = pages_per_node;
> -
> -   r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages,
> -   alignment, 0, place->fpfn,
> - 

[PATCH 15/17] drm/amd/display: 3.2.171

2022-01-28 Thread Stylon Wang
From: Aric Cyr 

This version brings along following fixes:
- DC refactor and bug fixes for DP links
- Bug fixes for DP2
- Fix regressions causing display not light up
- Improved debug trace
- Improved DP AUX transfer
- Updated watermark latencies to fix underflows in some modes

Acked-by: Stylon Wang 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 4f9dacd09856..69d264dd69a7 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -47,7 +47,7 @@ struct aux_payload;
 struct set_config_cmd_payload;
 struct dmub_notification;
 
-#define DC_VER "3.2.170"
+#define DC_VER "3.2.171"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.34.1



[PATCH 17/17] drm/amd/display: Add Missing HPO Stream Encoder Function Hook

2022-01-28 Thread Stylon Wang
From: Fangzhi Zuo 

[Why]
configure_dp_hpo_throttled_vcp_size() was missing its promotion before,
but this was masked because the old interface,
hpo_dp_link_encoder->funcs, never called the missing function hook.

A recent refactor replaced it with the new caller
link_hwss->set_throttled_vcp_size, which needs that hook; the missing
hook causes a NULL pointer hang.
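
The call sites in this series already guard optional hooks; a sketch of
that defensive pattern (simplified mock types, not the exact DC code):

struct mock_link_hwss_ext {
	void (*set_throttled_vcp_size)(void *pipe_ctx, int vcp_size);
};

/* a missing hook degrades into a no-op instead of a NULL dereference,
 * but the stream then never gets its throttled VCP size programmed,
 * which is why the hook itself must be populated */
static void call_set_throttled_vcp_size(const struct mock_link_hwss_ext *ext,
					void *pipe_ctx, int vcp_size)
{
	if (ext->set_throttled_vcp_size)
		ext->set_throttled_vcp_size(pipe_ctx, vcp_size);
}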

Signed-off-by: Fangzhi Zuo 
Acked-by: Stylon Wang 
---
 .../display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c| 11 +++
 .../display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h|  9 ++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
index 5065904c7833..23621ff08c90 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
@@ -710,6 +710,16 @@ static void dcn31_hpo_dp_stream_enc_read_state(
}
 }
 
+static void dcn31_set_hblank_min_symbol_width(
+   struct hpo_dp_stream_encoder *enc,
+   uint16_t width)
+{
+   struct dcn31_hpo_dp_stream_encoder *enc3 = 
DCN3_1_HPO_DP_STREAM_ENC_FROM_HPO_STREAM_ENC(enc);
+
+   REG_SET(DP_SYM32_ENC_HBLANK_CONTROL, 0,
+   HBLANK_MINIMUM_SYMBOL_WIDTH, width);
+}
+
 static const struct hpo_dp_stream_encoder_funcs dcn30_str_enc_funcs = {
.enable_stream = dcn31_hpo_dp_stream_enc_enable_stream,
.dp_unblank = dcn31_hpo_dp_stream_enc_dp_unblank,
@@ -725,6 +735,7 @@ static const struct hpo_dp_stream_encoder_funcs 
dcn30_str_enc_funcs = {
.dp_audio_enable = dcn31_hpo_dp_stream_enc_audio_enable,
.dp_audio_disable = dcn31_hpo_dp_stream_enc_audio_disable,
.read_state = dcn31_hpo_dp_stream_enc_read_state,
+   .set_hblank_min_symbol_width = dcn31_set_hblank_min_symbol_width,
 };
 
 void dcn31_hpo_dp_stream_encoder_construct(
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h
index 70b94fc25304..7c77c71591a0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h
@@ -80,7 +80,8 @@
SRI(DP_SYM32_ENC_SDP_GSP_CONTROL11, DP_SYM32_ENC, id),\
SRI(DP_SYM32_ENC_SDP_METADATA_PACKET_CONTROL, DP_SYM32_ENC, id),\
SRI(DP_SYM32_ENC_SDP_AUDIO_CONTROL0, DP_SYM32_ENC, id),\
-   SRI(DP_SYM32_ENC_VID_CRC_CONTROL, DP_SYM32_ENC, id)
+   SRI(DP_SYM32_ENC_VID_CRC_CONTROL, DP_SYM32_ENC, id), \
+   SRI(DP_SYM32_ENC_HBLANK_CONTROL, DP_SYM32_ENC, id)
 
 #define DCN3_1_HPO_DP_STREAM_ENC_REGS \
uint32_t DP_STREAM_MAPPER_CONTROL0;\
@@ -116,7 +117,8 @@
uint32_t DP_SYM32_ENC_SDP_GSP_CONTROL11;\
uint32_t DP_SYM32_ENC_SDP_METADATA_PACKET_CONTROL;\
uint32_t DP_SYM32_ENC_SDP_AUDIO_CONTROL0;\
-   uint32_t DP_SYM32_ENC_VID_CRC_CONTROL
+   uint32_t DP_SYM32_ENC_VID_CRC_CONTROL;\
+   uint32_t DP_SYM32_ENC_HBLANK_CONTROL
 
 
 #define DCN3_1_HPO_DP_STREAM_ENC_MASK_SH_LIST(mask_sh)\
@@ -202,7 +204,8 @@
type GSP_SOF_REFERENCE;\
type METADATA_PACKET_ENABLE;\
type CRC_ENABLE;\
-   type CRC_CONT_MODE_ENABLE
+   type CRC_CONT_MODE_ENABLE;\
+   type HBLANK_MINIMUM_SYMBOL_WIDTH
 
 
 struct dcn31_hpo_dp_stream_encoder_registers {
-- 
2.34.1



[PATCH 16/17] drm/amd/display: Trigger DP2 Sequence With Uncertified Cable

2022-01-28 Thread Stylon Wang
From: Fangzhi Zuo 

The DP2 sequence is triggered only if a VESA certified cable is detected.

Force the DP2 sequence with an uncertified cable for testing purposes.

Reviewed-by: Wenjing Liu 
Acked-by: Stylon Wang 
Signed-off-by: Fangzhi Zuo 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 26 +++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 306a16b7be75..d7611c81fca8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -3431,6 +3431,30 @@ static int dp_force_sst_get(void *data, u64 *val)
 }
 DEFINE_DEBUGFS_ATTRIBUTE(dp_set_mst_en_for_sst_ops, dp_force_sst_get,
 dp_force_sst_set, "%llu\n");
+
+/*
+ * Force DP2 sequence without VESA certified cable.
+ * Example usage: echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_dp_ignore_cable_id
+ */
+static int dp_ignore_cable_id_set(void *data, u64 val)
+{
+   struct amdgpu_device *adev = data;
+
+   adev->dm.dc->debug.ignore_cable_id = val;
+
+   return 0;
+}
+
+static int dp_ignore_cable_id_get(void *data, u64 *val)
+{
+   struct amdgpu_device *adev = data;
+
+   *val = adev->dm.dc->debug.ignore_cable_id;
+
+   return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(dp_ignore_cable_id_ops, dp_ignore_cable_id_get,
+dp_ignore_cable_id_set, "%llu\n");
 #endif
 
 /*
@@ -3549,6 +3573,8 @@ void dtn_debugfs_init(struct amdgpu_device *adev)
 #if defined(CONFIG_DRM_AMD_DC_DCN)
debugfs_create_file("amdgpu_dm_dp_set_mst_en_for_sst", 0644, root, adev,
			    &dp_set_mst_en_for_sst_ops);
+   debugfs_create_file("amdgpu_dm_dp_ignore_cable_id", 0644, root, adev,
+			    &dp_ignore_cable_id_ops);
 #endif
 
debugfs_create_file_unsafe("amdgpu_dm_visual_confirm", 0644, root, adev,
-- 
2.34.1



[PATCH 14/17] drm/amd/display: [FW Promotion] Release 0.0.102.0

2022-01-28 Thread Stylon Wang
From: Anthony Koo 

 - Correct number of reserved bits in cmd_lock_hw
 - Extend bits of hw_lock_client to allow for more clients
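
The invariant behind this change, sketched as standalone C (field names
mirror the diff below, but the widths of the fields elided from the hunk
are assumptions): when hw_lock_client grows from 1 to 2 bits, reserved
must shrink from 8 to 7 so the bitfields still overlay one uint32_t.

#include <stdint.h>
#include <assert.h>

union mock_cmd_lock_hw {
	struct {
		uint32_t command_code: 8;
		uint32_t hw_lock_client: 2;	/* was 1 */
		uint32_t otg_inst: 3;
		uint32_t opp_inst: 4;		/* assumed width */
		uint32_t dig_inst: 3;		/* assumed width */
		uint32_t lock_pipe: 1;
		uint32_t lock_cursor: 1;
		uint32_t lock_dig: 1;
		uint32_t lock: 1;
		uint32_t should_release: 1;
		uint32_t reserved: 7;		/* was 8 */
	} bits;
	uint32_t all;
};

/* overflowing 32 bits would grow the union past 4 bytes */
static_assert(sizeof(union mock_cmd_lock_hw) == sizeof(uint32_t),
	      "cmd_lock_hw bitfields must pack into one 32-bit word");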

Acked-by: Stylon Wang 
Signed-off-by: Anthony Koo 
---
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 9f609829955d..a01814631911 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -47,10 +47,10 @@
 
 /* Firmware versioning. */
 #ifdef DMUB_EXPOSE_VERSION
-#define DMUB_FW_VERSION_GIT_HASH 0x1288a7b7
+#define DMUB_FW_VERSION_GIT_HASH 0xab0ae3c8
 #define DMUB_FW_VERSION_MAJOR 0
 #define DMUB_FW_VERSION_MINOR 0
-#define DMUB_FW_VERSION_REVISION 101
+#define DMUB_FW_VERSION_REVISION 102
 #define DMUB_FW_VERSION_TEST 0
 #define DMUB_FW_VERSION_VBIOS 0
 #define DMUB_FW_VERSION_HOTFIX 0
@@ -525,7 +525,7 @@ union dmub_inbox0_cmd_lock_hw {
uint32_t command_code: 8;
 
/* NOTE: Must be have enough bits to match: enum hw_lock_client 
*/
-   uint32_t hw_lock_client: 1;
+   uint32_t hw_lock_client: 2;
 
/* NOTE: Below fields must match with: struct 
dmub_hw_lock_inst_flags */
uint32_t otg_inst: 3;
@@ -540,7 +540,7 @@ union dmub_inbox0_cmd_lock_hw {
 
uint32_t lock: 1;   /**< Lock */
uint32_t should_release: 1; /**< Release */
-   uint32_t reserved: 8;   /**< Reserved for 
extending more clients, HW, etc. */
+   uint32_t reserved: 7;   /**< Reserved for 
extending more clients, HW, etc. */
} bits;
uint32_t all;
 };
-- 
2.34.1



[PATCH 13/17] drm/amd/display: move link_hwss to link folder and break down to files

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Move link_hwss to its own folder as part of the DC LIB and break it down
into separate files, one for each type of backend, for code isolation.

Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/Makefile   |   4 +-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |   1 +
 .../gpu/drm/amd/display/dc/core/dc_resource.c |   4 +
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   1 +
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|  18 --
 drivers/gpu/drm/amd/display/dc/link/Makefile  |  30 +++
 .../drm/amd/display/dc/link/link_hwss_dio.c   | 137 +++
 .../drm/amd/display/dc/link/link_hwss_dio.h   |  53 +
 .../drm/amd/display/dc/link/link_hwss_dpia.c  |  51 +
 .../drm/amd/display/dc/link/link_hwss_dpia.h  |  34 +++
 .../link_hwss_hpo_dp.c}   | 213 ++
 .../amd/display/dc/link/link_hwss_hpo_dp.h|  35 +++
 .../amd/display/dc/link/link_hwss_hpo_frl.c   |  43 
 .../amd/display/dc/link/link_hwss_hpo_frl.h   |  34 +++
 .../gpu/drm/amd/display/dc/virtual/Makefile   |   2 +-
 .../display/dc/virtual/virtual_link_hwss.c|  43 
 .../display/dc/virtual/virtual_link_hwss.h|  34 +++
 17 files changed, 528 insertions(+), 209 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/Makefile
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dio.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dio.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dpia.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dpia.h
 rename drivers/gpu/drm/amd/display/dc/{core/dc_link_hwss.c => 
link/link_hwss_hpo_dp.c} (54%)
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_dp.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_frl.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_frl.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/virtual/virtual_link_hwss.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/virtual/virtual_link_hwss.h

diff --git a/drivers/gpu/drm/amd/display/dc/Makefile 
b/drivers/gpu/drm/amd/display/dc/Makefile
index a4ef8f314307..0aaf394b73ff 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -23,7 +23,7 @@
 # Makefile for Display Core (dc) component.
 #
 
-DC_LIBS = basics bios clk_mgr dce dml gpio irq virtual
+DC_LIBS = basics bios dml clk_mgr dce gpio irq link virtual
 
 ifdef CONFIG_DRM_AMD_DC_DCN
 DC_LIBS += dcn20
@@ -58,7 +58,7 @@ AMD_DC = $(addsuffix /Makefile, $(addprefix 
$(FULL_AMD_DISPLAY_PATH)/dc/,$(DC_LI
 include $(AMD_DC)
 
 DISPLAY_CORE = dc.o  dc_stat.o dc_link.o dc_resource.o dc_hw_sequencer.o 
dc_sink.o \
-dc_surface.o dc_link_hwss.o dc_link_dp.o dc_link_ddc.o dc_debug.o dc_stream.o \
+dc_surface.o dc_link_dp.o dc_link_ddc.o dc_debug.o dc_stream.o \
 dc_link_enc_cfg.o dc_link_dpia.o dc_link_dpcd.o
 
 ifdef CONFIG_DRM_AMD_DC_DCN
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 51347e1d3d95..65ebfbcf3019 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -6842,6 +6842,7 @@ bool edp_receiver_ready_T9(struct dc_link *link)
unsigned char sinkstatus = 0;
unsigned char edpRev = 0;
enum dc_status result = DC_OK;
+
	result = core_link_read_dpcd(link, DP_EDP_DPCD_REV, &edpRev,
				     sizeof(edpRev));
 
/* start from eDP version 1.2, SINK_STAUS indicate the sink is ready.*/
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 19e06331169d..e82aa0559bdf 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -43,6 +43,10 @@
 #include "dpcd_defs.h"
 #include "link_enc_cfg.h"
 #include "dc_link_dp.h"
+#include "virtual/virtual_link_hwss.h"
+#include "link/link_hwss_dio.h"
+#include "link/link_hwss_dpia.h"
+#include "link/link_hwss_hpo_dp.h"
 
 #if defined(CONFIG_DRM_AMD_DC_SI)
 #include "dce60/dce60_resource.h"
diff --git a/drivers/gpu/drm/amd/display/dc/inc/core_types.h 
b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
index e90123b0ee0e..951c9b60917d 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/core_types.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
@@ -54,6 +54,7 @@ void enable_surface_flip_reporting(struct dc_plane_state 
*plane_state,
 #ifdef CONFIG_DRM_AMD_DC_HDCP
 #include "dm_cp_psp.h"
 #endif
+#include "link_hwss.h"
 
 / link */
 struct link_init_data {
diff --git a/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h 
b/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h
index fd4bfa22eda8..3b3090e3d327 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h
@@ -27,7 +27,6 @@
 #define 

[PATCH 12/17] drm/amd/display: move get_link_hwss to dc_resource

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Isolate the way link_hwss is obtained from the actual implementation of
link_hwss, so the caller can use link_hwss without knowing its
implementation details.
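
The selection logic being centralized looks roughly like this (a sketch
using the can_use_*/get_*_link_hwss helpers from the diff below; the
virtual fallback name and the exact ordering are assumptions, not the
verbatim dc_resource code):

const struct link_hwss *get_link_hwss(const struct dc_link *link,
		const struct link_resource *link_res)
{
	if (can_use_hpo_dp_link_hwss(link, link_res))
		return get_hpo_dp_link_hwss();
	else if (can_use_dpia_link_hwss(link, link_res))
		return get_dpia_link_hwss();
	else if (can_use_dio_link_hwss(link, link_res))
		return get_dio_link_hwss();
	else
		return get_virtual_link_hwss();	/* assumed fallback */
}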

Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 .../drm/amd/display/dc/core/dc_link_hwss.c| 51 ---
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 33 
 .../gpu/drm/amd/display/dc/inc/link_hwss.h| 17 ++-
 drivers/gpu/drm/amd/display/dc/inc/resource.h |  3 ++
 4 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
index 96414f99c671..dab532cf52b9 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
@@ -29,12 +29,6 @@ static void virtual_setup_stream_encoder(struct pipe_ctx 
*pipe_ctx);
 static void virtual_reset_stream_encoder(struct pipe_ctx *pipe_ctx);
 
 /* below goes to dio_link_hwss 
/
-static bool can_use_dio_link_hwss(const struct dc_link *link,
-   const struct link_resource *link_res)
-{
-   return link->link_enc != NULL;
-}
-
 static void set_dio_throttled_vcp_size(struct pipe_ctx *pipe_ctx,
struct fixed31_32 throttled_vcp_size)
 {
@@ -135,14 +129,19 @@ static const struct link_hwss dio_link_hwss = {
},
 };
 
-/*** below goes to hpo_dp_link_hwss 
***/
-static bool can_use_dp_hpo_link_hwss(const struct dc_link *link,
+bool can_use_dio_link_hwss(const struct dc_link *link,
const struct link_resource *link_res)
 {
-   return link_res->hpo_dp_link_enc != NULL;
+   return link->link_enc != NULL;
 }
 
-static void set_dp_hpo_throttled_vcp_size(struct pipe_ctx *pipe_ctx,
+const struct link_hwss *get_dio_link_hwss(void)
+{
+   return &dio_link_hwss;
+}
+
+/*** below goes to hpo_dp_link_hwss 
***/
+static void set_hpo_dp_throttled_vcp_size(struct pipe_ctx *pipe_ctx,
struct fixed31_32 throttled_vcp_size)
 {
struct hpo_dp_stream_encoder *hpo_dp_stream_encoder =
@@ -155,7 +154,7 @@ static void set_dp_hpo_throttled_vcp_size(struct pipe_ctx 
*pipe_ctx,
throttled_vcp_size);
 }
 
-static void set_dp_hpo_hblank_min_symbol_width(struct pipe_ctx *pipe_ctx,
+static void set_hpo_dp_hblank_min_symbol_width(struct pipe_ctx *pipe_ctx,
const struct dc_link_settings *link_settings,
struct fixed31_32 throttled_vcp_size)
 {
@@ -328,22 +327,27 @@ static const struct link_hwss hpo_dp_link_hwss = {
.setup_stream_encoder = setup_hpo_dp_stream_encoder,
.reset_stream_encoder = reset_hpo_dp_stream_encoder,
.ext = {
-   .set_throttled_vcp_size = set_dp_hpo_throttled_vcp_size,
-   .set_hblank_min_symbol_width = 
set_dp_hpo_hblank_min_symbol_width,
+   .set_throttled_vcp_size = set_hpo_dp_throttled_vcp_size,
+   .set_hblank_min_symbol_width = 
set_hpo_dp_hblank_min_symbol_width,
.enable_dp_link_output = enable_hpo_dp_link_output,
.disable_dp_link_output = disable_hpo_dp_link_output,
.set_dp_link_test_pattern  = set_hpo_dp_link_test_pattern,
.set_dp_lane_settings = set_hpo_dp_lane_settings,
},
 };
-/*** below goes to dpia_link_hwss 
*/
-static bool can_use_dpia_link_hwss(const struct dc_link *link,
+
+bool can_use_hpo_dp_link_hwss(const struct dc_link *link,
const struct link_resource *link_res)
 {
-   return link->is_dig_mapping_flexible &&
-   link->dc->res_pool->funcs->link_encs_assign;
+   return link_res->hpo_dp_link_enc != NULL;
 }
 
+const struct link_hwss *get_hpo_dp_link_hwss(void)
+{
+   return &hpo_dp_link_hwss;
+}
+
+/*** below goes to dpia_link_hwss 
*/
 static const struct link_hwss dpia_link_hwss = {
.setup_stream_encoder = setup_dio_stream_encoder,
.reset_stream_encoder = reset_dio_stream_encoder,
@@ -356,7 +360,18 @@ static const struct link_hwss dpia_link_hwss = {
},
 };
 
-/*** below goes to link_hwss 
**/
+bool can_use_dpia_link_hwss(const struct dc_link *link,
+   const struct link_resource *link_res)
+{
+   return link->is_dig_mapping_flexible &&
+   link->dc->res_pool->funcs->link_encs_assign;
+}
+
+const struct link_hwss *get_dpia_link_hwss(void)
+{
+   return &dpia_link_hwss;
+}
+/*** below goes to virtual_link_hwss 
**/
 static void virtual_setup_stream_encoder(struct pipe_ctx *pipe_ctx)
 {
 }
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 

[PATCH 11/17] drm/amd/display: temporarily move non link_hwss code to dc_link_dp

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Clean up the dc_link_hwss file in preparation for breaking it down into
one file per encoder type. We temporarily move the original dp link
functions in link_hwss back to dc_link_dp. We will break dc_link_dp down
after link_hwss is in good shape.

Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 613 
 .../drm/amd/display/dc/core/dc_link_hwss.c| 653 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   4 +-
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|   1 +
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |  40 ++
 .../gpu/drm/amd/display/dc/inc/link_dpcd.h|   2 +-
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|  56 +-
 7 files changed, 682 insertions(+), 687 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index abec79e80eed..51347e1d3d95 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -27,6 +27,7 @@
 #include "dm_helpers.h"
 #include "opp.h"
 #include "dsc.h"
+#include "clk_mgr.h"
 #include "resource.h"
 
 #include "inc/core_types.h"
@@ -6713,3 +6714,615 @@ void dc_link_dp_clear_rx_status(struct dc_link *link)
 {
memset(>dprx_status, 0, sizeof(link->dprx_status));
 }
+
+void dp_receiver_power_ctrl(struct dc_link *link, bool on)
+{
+   uint8_t state;
+
+   state = on ? DP_POWER_STATE_D0 : DP_POWER_STATE_D3;
+
+   if (link->sync_lt_in_progress)
+   return;
+
+	core_link_write_dpcd(link, DP_SET_POWER, &state, sizeof(state));
+
+}
+
+void dp_source_sequence_trace(struct dc_link *link, uint8_t dp_test_mode)
+{
+   if (link != NULL && link->dc->debug.enable_driver_sequence_debug)
+   core_link_write_dpcd(link, DP_SOURCE_SEQUENCE,
+			&dp_test_mode, sizeof(dp_test_mode));
+}
+
+
+static uint8_t convert_to_count(uint8_t lttpr_repeater_count)
+{
+   switch (lttpr_repeater_count) {
+   case 0x80: // 1 lttpr repeater
+   return 1;
+   case 0x40: // 2 lttpr repeaters
+   return 2;
+   case 0x20: // 3 lttpr repeaters
+   return 3;
+   case 0x10: // 4 lttpr repeaters
+   return 4;
+   case 0x08: // 5 lttpr repeaters
+   return 5;
+   case 0x04: // 6 lttpr repeaters
+   return 6;
+   case 0x02: // 7 lttpr repeaters
+   return 7;
+   case 0x01: // 8 lttpr repeaters
+   return 8;
+   default:
+   break;
+   }
+   return 0; // invalid value
+}
+
+static inline bool is_immediate_downstream(struct dc_link *link, uint32_t 
offset)
+{
+   return (convert_to_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt) 
== offset);
+}
+
+void dp_enable_link_phy(
+   struct dc_link *link,
+   const struct link_resource *link_res,
+   enum signal_type signal,
+   enum clock_source_id clock_source,
+   const struct dc_link_settings *link_settings)
+{
+   struct dc  *dc = link->ctx->dc;
+   struct dmcu *dmcu = dc->res_pool->dmcu;
+   struct pipe_ctx *pipes =
+   link->dc->current_state->res_ctx.pipe_ctx;
+   struct clock_source *dp_cs =
+   link->dc->res_pool->dp_clock_source;
+   const struct link_hwss *link_hwss = get_link_hwss(link, link_res);
+   unsigned int i;
+
+   if (link->connector_signal == SIGNAL_TYPE_EDP) {
+   link->dc->hwss.edp_power_control(link, true);
+   link->dc->hwss.edp_wait_for_hpd_ready(link, true);
+   }
+
+   /* If the current pixel clock source is not DTO(happens after
+* switching from HDMI passive dongle to DP on the same connector),
+* switch the pixel clock source to DTO.
+*/
+   for (i = 0; i < MAX_PIPES; i++) {
+   if (pipes[i].stream != NULL &&
+   pipes[i].stream->link == link) {
+   if (pipes[i].clock_source != NULL &&
+   pipes[i].clock_source->id != 
CLOCK_SOURCE_ID_DP_DTO) {
+   pipes[i].clock_source = dp_cs;
+   
pipes[i].stream_res.pix_clk_params.requested_pix_clk_100hz =
+   
pipes[i].stream->timing.pix_clk_100hz;
+   pipes[i].clock_source->funcs->program_pix_clk(
+   pipes[i].clock_source,
+				&pipes[i].stream_res.pix_clk_params,
+				&pipes[i].pll_settings);
+   }
+   }
+   }
+
+   link->cur_link_settings = *link_settings;
+
+   if (dp_get_link_encoding_format(link_settings) == 

[PATCH 10/17] drm/amd/display: add set dp lane settings to link_hwss

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Factor the set dp lane settings call into link_hwss.

Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 .../drm/amd/display/dc/core/dc_link_hwss.c| 40 ++-
 .../drm/amd/display/dc/dce/dce_link_encoder.c | 17 
 .../drm/amd/display/dc/dce/dce_link_encoder.h |  3 +-
 .../amd/display/dc/dcn10/dcn10_link_encoder.c | 17 
 .../amd/display/dc/dcn10/dcn10_link_encoder.h |  3 +-
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |  3 +-
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|  4 ++
 .../display/dc/virtual/virtual_link_encoder.c |  3 +-
 8 files changed, 59 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
index d5670d3b1a4b..3b7ab2ca34c6 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
@@ -295,22 +295,16 @@ void dp_set_hw_lane_settings(
const struct link_training_settings *link_settings,
uint32_t offset)
 {
-   struct link_encoder *encoder = link->link_enc;
+   const struct link_hwss *link_hwss = get_link_hwss(link, link_res);
 
if ((link->lttpr_mode == LTTPR_MODE_NON_TRANSPARENT) && 
!is_immediate_downstream(link, offset))
return;
 
-   /* call Encoder to set lane settings */
-   if (dp_get_link_encoding_format(&link_settings->link_settings) ==
-   DP_128b_132b_ENCODING) {
-   link_res->hpo_dp_link_enc->funcs->set_ffe(
-   link_res->hpo_dp_link_enc,
+   if (link_hwss->ext.set_dp_lane_settings)
+   link_hwss->ext.set_dp_lane_settings(link, link_res,
+			&link_settings->link_settings,
-   link_settings->lane_settings[0].FFE_PRESET.raw);
-   } else if (dp_get_link_encoding_format(&link_settings->link_settings)
-   == DP_8b_10b_ENCODING) {
-   encoder->funcs->dp_set_lane_settings(encoder, link_settings);
-   }
+   link_settings->hw_lane_settings);
+
memmove(link->cur_lane_setting,
link_settings->lane_settings,
sizeof(link->cur_lane_setting));
@@ -748,6 +742,16 @@ static void set_dio_dp_link_test_pattern(struct dc_link 
*link,
dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_SET_SOURCE_PATTERN);
 }
 
+static void set_dio_dp_lane_settings(struct dc_link *link,
+   const struct link_resource *link_res,
+   const struct dc_link_settings *link_settings,
+   const struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
+{
+   struct link_encoder *link_enc = link_enc_cfg_get_link_enc(link);
+
+   link_enc->funcs->dp_set_lane_settings(link_enc, link_settings, 
lane_settings);
+}
+
 static const struct link_hwss dio_link_hwss = {
.setup_stream_encoder = setup_dio_stream_encoder,
.reset_stream_encoder = reset_dio_stream_encoder,
@@ -756,6 +760,7 @@ static const struct link_hwss dio_link_hwss = {
.enable_dp_link_output = enable_dio_dp_link_output,
.disable_dp_link_output = disable_dio_dp_link_output,
.set_dp_link_test_pattern = set_dio_dp_link_test_pattern,
+   .set_dp_lane_settings = set_dio_dp_lane_settings,
},
 };
 
@@ -931,6 +936,17 @@ static void set_hpo_dp_link_test_pattern(struct dc_link 
*link,
dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_SET_SOURCE_PATTERN);
 }
 
+static void set_hpo_dp_lane_settings(struct dc_link *link,
+   const struct link_resource *link_res,
+   const struct dc_link_settings *link_settings,
+   const struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
+{
+   link_res->hpo_dp_link_enc->funcs->set_ffe(
+   link_res->hpo_dp_link_enc,
+   link_settings,
+   lane_settings[0].FFE_PRESET.raw);
+}
+
 static const struct link_hwss hpo_dp_link_hwss = {
.setup_stream_encoder = setup_hpo_dp_stream_encoder,
.reset_stream_encoder = reset_hpo_dp_stream_encoder,
@@ -940,6 +956,7 @@ static const struct link_hwss hpo_dp_link_hwss = {
.enable_dp_link_output = enable_hpo_dp_link_output,
.disable_dp_link_output = disable_hpo_dp_link_output,
.set_dp_link_test_pattern  = set_hpo_dp_link_test_pattern,
+   .set_dp_lane_settings = set_hpo_dp_lane_settings,
},
 };
 /*** below goes to dpia_link_hwss 
*/
@@ -958,6 +975,7 @@ static const struct link_hwss dpia_link_hwss = {
.enable_dp_link_output = enable_dio_dp_link_output,
.disable_dp_link_output = disable_dio_dp_link_output,
.set_dp_link_test_pattern = set_dio_dp_link_test_pattern,
+   

[PATCH 09/17] drm/amd/display: add set dp link test pattern to link_hwss

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Factor the set dp link test pattern call into link_hwss.

Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
Signed-off-by: Wenjing Liu 
---
 .../drm/amd/display/dc/core/dc_link_hwss.c| 46 +++
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|  3 ++
 2 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
index 364fa77b85f0..d5670d3b1a4b 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
@@ -323,33 +323,16 @@ void dp_set_hw_test_pattern(
uint8_t *custom_pattern,
uint32_t custom_pattern_size)
 {
+   const struct link_hwss *link_hwss = get_link_hwss(link, link_res);
struct encoder_set_dp_phy_pattern_param pattern_param = {0};
-   struct link_encoder *encoder;
-   enum dp_link_encoding link_encoding_format = dp_get_link_encoding_format(&link->cur_link_settings);
-
-   encoder = link_enc_cfg_get_link_enc(link);
-   ASSERT(encoder);
 
pattern_param.dp_phy_pattern = test_pattern;
pattern_param.custom_pattern = custom_pattern;
pattern_param.custom_pattern_size = custom_pattern_size;
pattern_param.dp_panel_mode = dp_get_panel_mode(link);
 
-   switch (link_encoding_format) {
-   case DP_128b_132b_ENCODING:
-   link_res->hpo_dp_link_enc->funcs->set_link_test_pattern(
-   link_res->hpo_dp_link_enc, &pattern_param);
-   break;
-   case DP_8b_10b_ENCODING:
-   ASSERT(encoder);
-   encoder->funcs->dp_set_phy_pattern(encoder, &pattern_param);
-   break;
-   default:
-   DC_LOG_ERROR("%s: Unknown link encoding format.", __func__);
-   break;
-   }
-
-   dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_SET_SOURCE_PATTERN);
+   if (link_hwss->ext.set_dp_link_test_pattern)
+		link_hwss->ext.set_dp_link_test_pattern(link, link_res, &pattern_param);
 }
 #undef DC_LOGGER
 
@@ -754,6 +737,17 @@ static void disable_dio_dp_link_output(struct dc_link 
*link,
dp_source_sequence_trace(link, DPCD_SOURCE_SEQ_AFTER_DISABLE_LINK_PHY);
 }
 
+static void set_dio_dp_link_test_pattern(struct dc_link *link,
+   const struct link_resource *link_res,
+   struct encoder_set_dp_phy_pattern_param *tp_params)
+{
+   struct link_encoder *link_enc = link_enc_cfg_get_link_enc(link);
+
+   ASSERT(link_enc);
+   link_enc->funcs->dp_set_phy_pattern(link_enc, tp_params);
+   dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_SET_SOURCE_PATTERN);
+}
+
 static const struct link_hwss dio_link_hwss = {
.setup_stream_encoder = setup_dio_stream_encoder,
.reset_stream_encoder = reset_dio_stream_encoder,
@@ -761,6 +755,7 @@ static const struct link_hwss dio_link_hwss = {
.set_throttled_vcp_size = set_dio_throttled_vcp_size,
.enable_dp_link_output = enable_dio_dp_link_output,
.disable_dp_link_output = disable_dio_dp_link_output,
+   .set_dp_link_test_pattern = set_dio_dp_link_test_pattern,
},
 };
 
@@ -927,6 +922,15 @@ static void disable_hpo_dp_link_output(struct dc_link 
*link,
}
 }
 
+static void set_hpo_dp_link_test_pattern(struct dc_link *link,
+   const struct link_resource *link_res,
+   struct encoder_set_dp_phy_pattern_param *tp_params)
+{
+   link_res->hpo_dp_link_enc->funcs->set_link_test_pattern(
+   link_res->hpo_dp_link_enc, tp_params);
+   dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_SET_SOURCE_PATTERN);
+}
+
 static const struct link_hwss hpo_dp_link_hwss = {
.setup_stream_encoder = setup_hpo_dp_stream_encoder,
.reset_stream_encoder = reset_hpo_dp_stream_encoder,
@@ -935,6 +939,7 @@ static const struct link_hwss hpo_dp_link_hwss = {
.set_hblank_min_symbol_width = 
set_dp_hpo_hblank_min_symbol_width,
.enable_dp_link_output = enable_hpo_dp_link_output,
.disable_dp_link_output = disable_hpo_dp_link_output,
+   .set_dp_link_test_pattern  = set_hpo_dp_link_test_pattern,
},
 };
 /*** below goes to dpia_link_hwss 
*/
@@ -952,6 +957,7 @@ static const struct link_hwss dpia_link_hwss = {
.set_throttled_vcp_size = set_dio_throttled_vcp_size,
.enable_dp_link_output = enable_dio_dp_link_output,
.disable_dp_link_output = disable_dio_dp_link_output,
+   .set_dp_link_test_pattern = set_dio_dp_link_test_pattern,
},
 };
 
diff --git a/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h 
b/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h
index 8fe20ee02d9e..ce9762aa58c9 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/link_hwss.h
+++ 

[PATCH 08/17] drm/amd/display: add enable/disable dp link output to link_hwss

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Factor enable/disable dp link output into link_hwss.

Acked-by: Wayne Lin 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |   6 +-
 .../drm/amd/display/dc/core/dc_link_hwss.c| 256 +-
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|   8 +
 3 files changed, 139 insertions(+), 131 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index c99e06afc769..cb6c91cd6e83 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -4016,8 +4016,10 @@ static void fpga_dp_hpo_enable_link_and_stream(struct 
dc_state *state, struct pi
	decide_link_settings(stream, &link_settings);
stream->link->cur_link_settings = link_settings;
 
-   /*  Enable clock, Configure lane count, and Enable Link Encoder*/
-	enable_dp_hpo_output(stream->link, &pipe_ctx->link_res, &stream->link->cur_link_settings);
+   if (link_hwss->ext.enable_dp_link_output)
+		link_hwss->ext.enable_dp_link_output(stream->link, &pipe_ctx->link_res,
+   stream->signal, pipe_ctx->clock_source->id,
+			&link_settings);
 
 #ifdef DIAGS_BUILD
/* Workaround for FPGA HPO capture DP link data:
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
index c26df0a78664..364fa77b85f0 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
@@ -76,19 +76,15 @@ void dp_enable_link_phy(
enum clock_source_id clock_source,
const struct dc_link_settings *link_settings)
 {
-   struct link_encoder *link_enc;
struct dc  *dc = link->ctx->dc;
struct dmcu *dmcu = dc->res_pool->dmcu;
-
struct pipe_ctx *pipes =
link->dc->current_state->res_ctx.pipe_ctx;
struct clock_source *dp_cs =
link->dc->res_pool->dp_clock_source;
+   const struct link_hwss *link_hwss = get_link_hwss(link, link_res);
unsigned int i;
 
-   link_enc = link_enc_cfg_get_link_enc(link);
-   ASSERT(link_enc);
-
if (link->connector_signal == SIGNAL_TYPE_EDP) {
link->dc->hwss.edp_power_control(link, true);
link->dc->hwss.edp_wait_for_hpd_ready(link, true);
@@ -126,21 +122,9 @@ void dp_enable_link_phy(
if (dmcu != NULL && dmcu->funcs->lock_phy)
dmcu->funcs->lock_phy(dmcu);
 
-   if (dp_get_link_encoding_format(link_settings) == 
DP_128b_132b_ENCODING) {
-   enable_dp_hpo_output(link, link_res, link_settings);
-   } else if (dp_get_link_encoding_format(link_settings) == 
DP_8b_10b_ENCODING) {
-   if (dc_is_dp_sst_signal(signal)) {
-   link_enc->funcs->enable_dp_output(
-   link_enc,
-   link_settings,
-   clock_source);
-   } else {
-   link_enc->funcs->enable_dp_mst_output(
-   link_enc,
-   link_settings,
-   clock_source);
-   }
-   }
+   if (link_hwss->ext.enable_dp_link_output)
+   link_hwss->ext.enable_dp_link_output(link, link_res, signal,
+   clock_source, link_settings);
 
if (dmcu != NULL && dmcu->funcs->unlock_phy)
dmcu->funcs->unlock_phy(dmcu);
@@ -221,11 +205,7 @@ void dp_disable_link_phy(struct dc_link *link, const 
struct link_resource *link_
 {
struct dc  *dc = link->ctx->dc;
struct dmcu *dmcu = dc->res_pool->dmcu;
-   struct hpo_dp_link_encoder *hpo_link_enc = link_res->hpo_dp_link_enc;
-   struct link_encoder *link_enc;
-
-   link_enc = link_enc_cfg_get_link_enc(link);
-   ASSERT(link_enc);
+   const struct link_hwss *link_hwss = get_link_hwss(link, link_res);
 
if (!link->wa_flags.dp_keep_receiver_powered)
dp_receiver_power_ctrl(link, false);
@@ -234,20 +214,15 @@ void dp_disable_link_phy(struct dc_link *link, const 
struct link_resource *link_
if (link->dc->hwss.edp_backlight_control)
link->dc->hwss.edp_backlight_control(link, false);
 
-   if (dp_get_link_encoding_format(&link->cur_link_settings) == 
DP_128b_132b_ENCODING)
-   disable_dp_hpo_output(link, link_res, signal);
-   else
-   link_enc->funcs->disable_output(link_enc, signal);
+   if (link_hwss->ext.disable_dp_link_output)
+   link_hwss->ext.disable_dp_link_output(link, link_res, 
signal);
link->dc->hwss.edp_power_control(link, false);
} else {
if (dmcu != NULL 

[PATCH 07/17] drm/amd/display: refactor destructive verify link cap sequence

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[how]
1. move decide det link training link resource before each link training.
2. move disable link for handling vbios case into set all streams
dpms off for link sequence.
3. extract usbc hotplug workaround into its own wa function.
4. Minor syntax changes to improve code readability.
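
In outline, the resulting destructive-verify call flow looks like this (a
sketch with stubbed helpers summarizing steps 1 and 2 above; the names are
abbreviations of the functions in the hunks below, not the actual code):

#include <stdbool.h>
#include <stdio.h>

static bool should_prepare_phy_clocks(void) { return true; }
static void prepare_phy_clocks(void) { puts("prepare PHY clocks"); }

/* Step 2: disabling a possibly vbios-lit link PHY now lives here. */
static void set_all_streams_dpms_off_for_link(void)
{
        puts("dpms off for all streams on link");
        puts("dp_disable_link_phy (covers links enabled only by vbios)");
}

/* Step 1: the det link training resource is decided in here now,
 * before each training attempt, so callers no longer pass link_res. */
static void dp_verify_link_cap_with_retries(void)
{
        puts("decide det lt link resource, then train and verify");
}

int main(void)
{
        if (should_prepare_phy_clocks())
                prepare_phy_clocks();
        set_all_streams_dpms_off_for_link();
        dp_verify_link_cap_with_retries();
        return 0;
}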

Acked-by: Wayne Lin 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  17 +--
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 134 +++---
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  22 ++-
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   7 -
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   7 +-
 5 files changed, 83 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 2c5d67abad3e..c99e06afc769 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -893,6 +893,7 @@ static void set_all_streams_dpms_off_for_link(struct 
dc_link *link)
struct pipe_ctx *pipe_ctx;
struct dc_stream_update stream_update;
bool dpms_off = true;
+   struct link_resource link_res = {0};
 
	memset(&stream_update, 0, sizeof(stream_update));
	stream_update.dpms_off = &dpms_off;
@@ -907,33 +908,29 @@ static void set_all_streams_dpms_off_for_link(struct 
dc_link *link)
link->ctx->dc->current_state);
}
}
+
+   /* link can be also enabled by vbios. In this case it is not recorded
+* in pipe_ctx. Disable link phy here to make sure it is completely off
+*/
+   dp_disable_link_phy(link, &link_res, link->connector_signal);
 }
 
 static void verify_link_capability_destructive(struct dc_link *link,
struct dc_sink *sink,
enum dc_detect_reason reason)
 {
-   struct link_resource link_res = { 0 };
bool should_prepare_phy_clocks =

should_prepare_phy_clocks_for_link_verification(link->dc, reason);
 
if (should_prepare_phy_clocks)
prepare_phy_clocks_for_destructive_link_verification(link->dc);
 
-
if (dc_is_dp_signal(link->local_sink->sink_signal)) {
struct dc_link_settings known_limit_link_setting =
dp_get_max_link_cap(link);
-
set_all_streams_dpms_off_for_link(link);
-   if (dp_get_link_encoding_format(&known_limit_link_setting) ==
-   DP_128b_132b_ENCODING)
-   link_res.hpo_dp_link_enc = 
resource_get_hpo_dp_link_enc_for_det_lt(
-   &link->dc->current_state->res_ctx,
-   link->dc->res_pool,
-   link);
dp_verify_link_cap_with_retries(
-   link, &link_res, &known_limit_link_setting,
+   link, &known_limit_link_setting,
LINK_TRAINING_MAX_VERIFY_RETRY);
} else {
ASSERT(0);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index f1082674bcbf..abec79e80eed 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -97,6 +97,12 @@ static const struct dp_lt_fallback_entry dp_lt_fallbacks[] = 
{
{LANE_COUNT_ONE, LINK_RATE_LOW},
 };
 
+static const struct dc_link_settings fail_safe_link_settings = {
+   .lane_count = LANE_COUNT_ONE,
+   .link_rate = LINK_RATE_LOW,
+   .link_spread = LINK_SPREAD_DISABLED,
+};
+
 static bool decide_fallback_link_setting(
struct dc_link *link,
struct dc_link_settings initial_link_settings,
@@ -3182,25 +3188,22 @@ bool hpd_rx_irq_check_link_loss_status(
return return_code;
 }
 
-bool dp_verify_link_cap(
+static bool dp_verify_link_cap(
struct dc_link *link,
-   const struct link_resource *link_res,
struct dc_link_settings *known_limit_link_setting,
int *fail_count)
 {
-   struct dc_link_settings cur_link_setting = {0};
-   struct dc_link_settings *cur = &cur_link_setting;
+   struct dc_link_settings cur_link_settings = {0};
struct dc_link_settings initial_link_settings = 
*known_limit_link_setting;
-   bool success;
-   bool skip_link_training;
+   bool success = false;
bool skip_video_pattern;
-   enum clock_source_id dp_cs_id = CLOCK_SOURCE_ID_EXTERNAL;
+   enum clock_source_id dp_cs_id = get_clock_source_id(link);
enum link_training_result status;
union hpd_irq_data irq_data;
+   struct link_resource link_res;
 
	memset(&irq_data, 0, sizeof(irq_data));
-   success = false;
-   skip_link_training = false;
+   cur_link_settings = initial_link_settings;
 
/* Grant 

[PATCH 06/17] drm/amd/display: add setup/reset stream encoder to link_hwss

2022-01-28 Thread Stylon Wang
From: Wenjing Liu 

[why]
Factor setup/reset stream encoder to link hwss.

Acked-by: Wayne Lin 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  64 +++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  23 +--
 .../drm/amd/display/dc/core/dc_link_hwss.c| 170 +++---
 .../display/dc/dce110/dce110_hw_sequencer.c   |  44 ++---
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|  40 +
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|  22 ++-
 6 files changed, 180 insertions(+), 183 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 1e596f1ea494..2c5d67abad3e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3441,7 +3441,7 @@ static enum dc_status dc_link_update_sst_payload(struct 
pipe_ctx *pipe_ctx,
struct link_mst_stream_allocation_table proposed_table = {0};
struct fixed31_32 avg_time_slots_per_mtp;
const struct dc_link_settings empty_link_settings = {0};
-   const struct link_hwss *link_hwss = dc_link_hwss_get(link, &pipe_ctx->link_res);
+   const struct link_hwss *link_hwss = get_link_hwss(link, &pipe_ctx->link_res);
DC_LOGGER_INIT(link->ctx->logger);
 
/* slot X.Y for SST payload deallocate */
@@ -3450,9 +3450,11 @@ static enum dc_status dc_link_update_sst_payload(struct 
pipe_ctx *pipe_ctx,
 
dc_log_vcp_x_y(link, avg_time_slots_per_mtp);
 
-   link_hwss->set_throttled_vcp_size(pipe_ctx, 
avg_time_slots_per_mtp);
-   if (link_hwss->set_hblank_min_symbol_width)
-   link_hwss->set_hblank_min_symbol_width(pipe_ctx,
+   if (link_hwss->ext.set_throttled_vcp_size)
+   link_hwss->ext.set_throttled_vcp_size(pipe_ctx,
+   avg_time_slots_per_mtp);
+   if (link_hwss->ext.set_hblank_min_symbol_width)
+   link_hwss->ext.set_hblank_min_symbol_width(pipe_ctx,
&empty_link_settings,
avg_time_slots_per_mtp);
}
@@ -3498,9 +3500,11 @@ static enum dc_status dc_link_update_sst_payload(struct 
pipe_ctx *pipe_ctx,
 
dc_log_vcp_x_y(link, avg_time_slots_per_mtp);
 
-   link_hwss->set_throttled_vcp_size(pipe_ctx, 
avg_time_slots_per_mtp);
-   if (link_hwss->set_hblank_min_symbol_width)
-   link_hwss->set_hblank_min_symbol_width(pipe_ctx,
+   if (link_hwss->ext.set_throttled_vcp_size)
+   link_hwss->ext.set_throttled_vcp_size(pipe_ctx,
+   avg_time_slots_per_mtp);
+   if (link_hwss->ext.set_hblank_min_symbol_width)
+   link_hwss->ext.set_hblank_min_symbol_width(pipe_ctx,
&link->cur_link_settings,
avg_time_slots_per_mtp);
}
@@ -3526,7 +3530,7 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
struct fixed31_32 pbn_per_slot;
int i;
enum act_return_status ret;
-   const struct link_hwss *link_hwss = dc_link_hwss_get(link, &pipe_ctx->link_res);
+   const struct link_hwss *link_hwss = get_link_hwss(link, &pipe_ctx->link_res);
DC_LOGGER_INIT(link->ctx->logger);
 
/* Link encoder may have been dynamically assigned to non-physical 
display endpoint. */
@@ -3634,9 +3638,10 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
 
dc_log_vcp_x_y(link, avg_time_slots_per_mtp);
 
-   link_hwss->set_throttled_vcp_size(pipe_ctx, avg_time_slots_per_mtp);
-   if (link_hwss->set_hblank_min_symbol_width)
-   link_hwss->set_hblank_min_symbol_width(pipe_ctx,
+   if (link_hwss->ext.set_throttled_vcp_size)
+   link_hwss->ext.set_throttled_vcp_size(pipe_ctx, 
avg_time_slots_per_mtp);
+   if (link_hwss->ext.set_hblank_min_symbol_width)
+   link_hwss->ext.set_hblank_min_symbol_width(pipe_ctx,
&link->cur_link_settings,
avg_time_slots_per_mtp);
 
@@ -3655,7 +3660,7 @@ enum dc_status dc_link_reduce_mst_payload(struct pipe_ctx 
*pipe_ctx, uint32_t bw
struct dp_mst_stream_allocation_table proposed_table = {0};
uint8_t i;
enum act_return_status ret;
-   const struct link_hwss *link_hwss = dc_link_hwss_get(link, &pipe_ctx->link_res);
+   const struct link_hwss *link_hwss = get_link_hwss(link, &pipe_ctx->link_res);
DC_LOGGER_INIT(link->ctx->logger);
 
/* decrease throttled vcp size */
@@ -3663,9 +3668,10 @@ enum dc_status dc_link_reduce_mst_payload(struct 
pipe_ctx *pipe_ctx, uint32_t bw
pbn = get_pbn_from_bw_in_kbps(bw_in_kbps);
avg_time_slots_per_mtp = dc_fixpt_div(pbn, pbn_per_slot);
 
-   

[PATCH 05/17] drm/amd/display: revert "Reset fifo after enable otg"

2022-01-28 Thread Stylon Wang
From: Zhan Liu 

[Why]
This change causes a regression that prevents some systems
from lighting up internal displays.

[How]
Revert this patch until a new solution is ready.

Reviewed-by: Charlene Liu 
Acked-by: Stylon Wang 
Signed-off-by: Zhan Liu 
---
 .../amd/display/dc/dce110/dce110_hw_sequencer.c   |  5 -
 .../amd/display/dc/dcn10/dcn10_stream_encoder.c   | 15 ---
 .../amd/display/dc/dcn10/dcn10_stream_encoder.h   |  3 ---
 .../amd/display/dc/dcn20/dcn20_stream_encoder.c   |  2 --
 .../display/dc/dcn30/dcn30_dio_stream_encoder.c   |  2 --
 .../drm/amd/display/dc/inc/hw/stream_encoder.h|  4 
 6 files changed, 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 72dd41e7a7d6..f28d6c15a4e2 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1567,11 +1567,6 @@ static enum dc_status apply_single_controller_ctx_to_hw(
pipe_ctx->stream_res.stream_enc,
pipe_ctx->stream_res.tg->inst);
 
-   if (dc_is_embedded_signal(pipe_ctx->stream->signal) &&
-   pipe_ctx->stream_res.stream_enc->funcs->reset_fifo)
-   pipe_ctx->stream_res.stream_enc->funcs->reset_fifo(
-   pipe_ctx->stream_res.stream_enc);
-
if (dc_is_dp_signal(pipe_ctx->stream->signal))
dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_CONNECT_DIG_FE_OTG);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c
index bf4436d7aaab..b0c08ee6bc2c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c
@@ -902,19 +902,6 @@ void enc1_stream_encoder_stop_dp_info_packets(
 
 }
 
-void enc1_stream_encoder_reset_fifo(
-   struct stream_encoder *enc)
-{
-   struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
-
-   /* set DIG_START to 0x1 to reset FIFO */
-   REG_UPDATE(DIG_FE_CNTL, DIG_START, 1);
-   udelay(100);
-
-   /* write 0 to take the FIFO out of reset */
-   REG_UPDATE(DIG_FE_CNTL, DIG_START, 0);
-}
-
 void enc1_stream_encoder_dp_blank(
struct dc_link *link,
struct stream_encoder *enc)
@@ -1600,8 +1587,6 @@ static const struct stream_encoder_funcs 
dcn10_str_enc_funcs = {
enc1_stream_encoder_send_immediate_sdp_message,
.stop_dp_info_packets =
enc1_stream_encoder_stop_dp_info_packets,
-   .reset_fifo =
-   enc1_stream_encoder_reset_fifo,
.dp_blank =
enc1_stream_encoder_dp_blank,
.dp_unblank =
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
index a146a41f68e9..687d7e4bf7ca 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
@@ -626,9 +626,6 @@ void enc1_stream_encoder_send_immediate_sdp_message(
 void enc1_stream_encoder_stop_dp_info_packets(
struct stream_encoder *enc);
 
-void enc1_stream_encoder_reset_fifo(
-   struct stream_encoder *enc);
-
 void enc1_stream_encoder_dp_blank(
struct dc_link *link,
struct stream_encoder *enc);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_stream_encoder.c
index 8a70f92795c2..aab25ca8343a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_stream_encoder.c
@@ -593,8 +593,6 @@ static const struct stream_encoder_funcs 
dcn20_str_enc_funcs = {
enc1_stream_encoder_send_immediate_sdp_message,
.stop_dp_info_packets =
enc1_stream_encoder_stop_dp_info_packets,
-   .reset_fifo =
-   enc1_stream_encoder_reset_fifo,
.dp_blank =
enc1_stream_encoder_dp_blank,
.dp_unblank =
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_stream_encoder.c
index 8daa12730bc1..a04ca4a98392 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_stream_encoder.c
@@ -789,8 +789,6 @@ static const struct stream_encoder_funcs 
dcn30_str_enc_funcs = {
enc3_stream_encoder_update_dp_info_packets,
.stop_dp_info_packets =
enc1_stream_encoder_stop_dp_info_packets,
-   .reset_fifo =
-   enc1_stream_encoder_reset_fifo,
.dp_blank =
enc1_stream_encoder_dp_blank,
.dp_unblank =
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h 

[PATCH 04/17] drm/amd/display: add infoframe update sequence debug trace

2022-01-28 Thread Stylon Wang
From: "Leo (Hanghong) Ma" 

[Why]
We found that some of the driver sequence debug traces for
infoframe updates are missing, so add them.

[How]
Add the missing sequence debug trace for infoframe update.

Reviewed-by: Martin Leung 
Acked-by: Stylon Wang 
Signed-off-by: Leo (Hanghong) Ma 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 417c31f51562..1d9404ff29ed 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2720,6 +2720,9 @@ static void commit_planes_do_stream_update(struct dc *dc,
stream_update->vsp_infopacket) {
resource_build_info_frame(pipe_ctx);
dc->hwss.update_info_frame(pipe_ctx);
+
+   if (dc_is_dp_signal(pipe_ctx->stream->signal))
+   
dp_source_sequence_trace(pipe_ctx->stream->link, 
DPCD_SOURCE_SEQ_AFTER_UPDATE_INFO_FRAME);
}
 
if (stream_update->hdr_static_metadata &&
-- 
2.34.1



[PATCH 03/17] drm/amd/display: watermark latencies are not enough on DCN31

2022-01-28 Thread Stylon Wang
From: Paul Hsieh 

[Why]
The original latencies were causing underflow in some modes.
Resolution: 2880x1620@60p when HDR is enabled

[How]
1. Replace with the up-to-date watermark values based on new measurements
2. Correct the ddr_wm_table name to DDR5 on DCN31

Reviewed-by: Aric Cyr 
Acked-by: Stylon Wang 
Signed-off-by: Paul Hsieh 
---
 .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  | 20 +--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
index 66bd0261ead6..e17c9938cee5 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
@@ -329,38 +329,38 @@ static struct clk_bw_params dcn31_bw_params = {
 
 };
 
-static struct wm_table ddr4_wm_table = {
+static struct wm_table ddr5_wm_table = {
.entries = {
{
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 6.09,
-   .sr_enter_plus_exit_time_us = 7.14,
+   .sr_exit_time_us = 9,
+   .sr_enter_plus_exit_time_us = 11,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 10.12,
-   .sr_enter_plus_exit_time_us = 11.48,
+   .sr_exit_time_us = 9,
+   .sr_enter_plus_exit_time_us = 11,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 10.12,
-   .sr_enter_plus_exit_time_us = 11.48,
+   .sr_exit_time_us = 9,
+   .sr_enter_plus_exit_time_us = 11,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 10.12,
-   .sr_enter_plus_exit_time_us = 11.48,
+   .sr_exit_time_us = 9,
+   .sr_enter_plus_exit_time_us = 11,
.valid = true,
},
}
@@ -687,7 +687,7 @@ void dcn31_clk_mgr_construct(
if (ctx->dc_bios->integrated_info->memory_type == 
LpDdr5MemType) {
dcn31_bw_params.wm_table = lpddr5_wm_table;
} else {
-   dcn31_bw_params.wm_table = ddr4_wm_table;
+   dcn31_bw_params.wm_table = ddr5_wm_table;
}
/* Saved clocks configured at boot for debug purposes */
 dcn31_dump_clk_registers(&clk_mgr->base.base.boot_snapshot, &clk_mgr->base.base, &log_info);
-- 
2.34.1



[PATCH 02/17] drm/amd/display: Improve dce_aux_transfer_with_retries logging

2022-01-28 Thread Stylon Wang
From: Wyatt Wood 

[Why + How]
The payload reply value is unknown and not handled in the switch statement; log it to aid debugging.

Reviewed-by: Anthony Koo 
Acked-by: Stylon Wang 
Signed-off-by: Wyatt Wood 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
index 6d42a9cc9916..74b05b3aef08 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
@@ -878,7 +878,7 @@ bool dce_aux_transfer_with_retries(struct ddc_service *ddc,
default:
DC_TRACE_LEVEL_MESSAGE(DAL_TRACE_LEVEL_ERROR,
LOG_FLAG_Error_I2cAux,
-   
"dce_aux_transfer_with_retries: AUX_RET_SUCCESS: FAILURE: 
AUX_TRANSACTION_REPLY_* unknown, default case.");
+   
"dce_aux_transfer_with_retries: AUX_RET_SUCCESS: FAILURE: 
AUX_TRANSACTION_REPLY_* unknown, default case. Reply: %d", *payload->reply);
goto fail;
}
break;
-- 
2.34.1



[PATCH 01/17] drm/amd/display: Add link enc null ptr check for cable ID (#2597)

2022-01-28 Thread Stylon Wang
From: "Shen, George" 

[Why]
Certain configurations result in the link encoder not
being assigned to the link at the time we apply the
cable ID logic. We should skip it in those cases.

[How]
Check if link_enc is not null before applying
cable ID.

Reviewed-by: Wenjing Liu 
Acked-by: Stylon Wang 
Signed-off-by: George Shen 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 8cfc9a8197df..117183b5ab44 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -6327,7 +6327,12 @@ void dpcd_set_source_specific_data(struct dc_link *link)
 
 void dpcd_update_cable_id(struct dc_link *link)
 {
-   if (!link->link_enc->features.flags.bits.IS_UHBR10_CAPABLE ||
+   struct link_encoder *link_enc = NULL;
+
+   link_enc = link_enc_cfg_get_link_enc(link);
+
+   if (!link_enc ||
+   !link_enc->features.flags.bits.IS_UHBR10_CAPABLE ||
link->dprx_status.cable_id_updated)
return;
 
-- 
2.34.1



[PATCH 00/17] DC Patches Jan 31, 2022

2022-01-28 Thread Stylon Wang
This DC patchset brings improvements in multiple areas. In summary, we have:

- DC refactor and bug fixes for DP links
- Bug fixes for DP2
- Fix regressions causing displays to not light up
- Improved debug trace
- Improved DP AUX transfer
- Updated watermark latencies to fix underflows in some modes

Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.102.0

Aric Cyr (1):
  drm/amd/display: 3.2.171

Fangzhi Zuo (2):
  drm/amd/display: Trigger DP2 Sequence With Uncertified Cable
  drm/amd/display: Add Missing HPO Stream Encoder Function Hook

Leo (Hanghong) Ma (1):
  drm/amd/display: add infoframe update sequence debug trace

Paul Hsieh (1):
  drm/amd/display: watermark latencies are not enough on DCN31

Shen, George (1):
  drm/amd/display: Add link enc null ptr check for cable ID (#2597)

Wenjing Liu (8):
  drm/amd/display: add setup/reset stream encoder to link_hwss
  drm/amd/display: refactor destructive verify link cap sequence
  drm/amd/display: add enable/disable dp link output to link_hwss
  drm/amd/display: add set dp link test pattern to link_hwss
  drm/amd/display: add set dp lane settings to link_hwss
  drm/amd/display: temporarily move non link_hwss code to dc_link_dp
  drm/amd/display: move get_link_hwss to dc_resource
  drm/amd/display: move link_hwss to link folder and break down to files

Wyatt Wood (1):
  drm/amd/display: Improve dce_aux_transfer_with_retries logging

Zhan Liu (1):
  drm/amd/display: revert "Reset fifo after enable otg"

 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |  26 +
 drivers/gpu/drm/amd/display/dc/Makefile   |   4 +-
 .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  |  20 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c  |   3 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  87 +-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 778 --
 .../drm/amd/display/dc/core/dc_link_hwss.c| 959 --
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  59 +-
 drivers/gpu/drm/amd/display/dc/dc.h   |   2 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   4 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c  |   2 +-
 .../drm/amd/display/dc/dce/dce_link_encoder.c |  17 +-
 .../drm/amd/display/dc/dce/dce_link_encoder.h |   3 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   |  49 +-
 .../amd/display/dc/dcn10/dcn10_link_encoder.c |  17 +-
 .../amd/display/dc/dcn10/dcn10_link_encoder.h |   3 +-
 .../display/dc/dcn10/dcn10_stream_encoder.c   |  15 -
 .../display/dc/dcn10/dcn10_stream_encoder.h   |   3 -
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|  41 +-
 .../display/dc/dcn20/dcn20_stream_encoder.c   |   2 -
 .../dc/dcn30/dcn30_dio_stream_encoder.c   |   2 -
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c|  11 +
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.h|   9 +-
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   1 +
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |  47 +-
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |   3 +-
 .../amd/display/dc/inc/hw/stream_encoder.h|   4 -
 .../gpu/drm/amd/display/dc/inc/link_dpcd.h|   2 +-
 .../gpu/drm/amd/display/dc/inc/link_hwss.h|  90 +-
 drivers/gpu/drm/amd/display/dc/inc/resource.h |  10 +-
 drivers/gpu/drm/amd/display/dc/link/Makefile  |  30 +
 .../drm/amd/display/dc/link/link_hwss_dio.c   | 137 +++
 .../drm/amd/display/dc/link/link_hwss_dio.h   |  53 +
 .../drm/amd/display/dc/link/link_hwss_dpia.c  |  51 +
 .../drm/amd/display/dc/link/link_hwss_dpia.h  |  34 +
 .../amd/display/dc/link/link_hwss_hpo_dp.c| 254 +
 .../amd/display/dc/link/link_hwss_hpo_dp.h|  35 +
 .../amd/display/dc/link/link_hwss_hpo_frl.c   |  43 +
 .../amd/display/dc/link/link_hwss_hpo_frl.h   |  34 +
 .../gpu/drm/amd/display/dc/virtual/Makefile   |   2 +-
 .../display/dc/virtual/virtual_link_encoder.c |   3 +-
 .../display/dc/virtual/virtual_link_hwss.c|  43 +
 .../display/dc/virtual/virtual_link_hwss.h|  34 +
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   |   8 +-
 44 files changed, 1726 insertions(+), 1308 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/Makefile
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dio.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dio.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dpia.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_dpia.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_dp.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_dp.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_frl.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/link/link_hwss_hpo_frl.h
 create mode 100644 drivers/gpu/drm/amd/display/dc/virtual/virtual_link_hwss.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/virtual/virtual_link_hwss.h

-- 
2.34.1



Re: [PATCH] drm/amd/display: Fix a NULL pointer dereference in amdgpu_dm_connector_add_common_modes()

2022-01-28 Thread Greg KH
On Tue, Jan 25, 2022 at 12:57:29AM +0800, Zhou Qingyang wrote:
> In amdgpu_dm_connector_add_common_modes(), amdgpu_dm_create_common_mode()
> is assigned to mode and is passed to drm_mode_probed_add() directly after
> that. drm_mode_probed_add() passes &mode->head to list_add_tail(), and
> there is a dereference of it in list_add_tail() without any recovery, which
> could lead to NULL pointer dereference on failure of
> amdgpu_dm_create_common_mode().
> 
> Fix this by adding a NULL check of mode.
> 
> This bug was found by a static analyzer.
> 
> Builds with 'make allyesconfig' show no new warnings,
> and our static analyzer no longer warns about this code.
> 
> Fixes: e7b07ceef2a6 ("drm/amd/display: Merge amdgpu_dm_types and amdgpu_dm")
> Signed-off-by: Zhou Qingyang 
> ---
> The analysis employs differential checking to identify inconsistent 
> security operations (e.g., checks or kfrees) between two code paths 
> and confirms that the inconsistent operations are not recovered in the
> current function or the callers, so they constitute bugs. 
> 
> Note that, as a bug found by static analysis, it can be a false
> positive or hard to trigger. Multiple researchers have cross-reviewed
> the bug.
> 
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 7f9773f8dab6..9ad94186b146 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -8143,6 +8143,9 @@ static void amdgpu_dm_connector_add_common_modes(struct 
> drm_encoder *encoder,
>   mode = amdgpu_dm_create_common_mode(encoder,
>   common_modes[i].name, common_modes[i].w,
>   common_modes[i].h);
> + if (!mode)
> + continue;
> +
>   drm_mode_probed_add(connector, mode);
>   amdgpu_dm_connector->num_modes++;
>   }
> -- 
> 2.25.1
> 

As stated before, umn.edu is still not allowed to contribute to the
Linux kernel.  Please work with your administration to resolve this
issue.
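
For context on the failure mode the patch describes: with a NULL mode,
&mode->head is a bogus pointer a few bytes past address zero, and
list_add_tail() writes through it immediately. A standalone model
(simplified list type, not the DRM or kernel list code):

#include <stddef.h>
#include <stdio.h>

struct list_head { struct list_head *prev, *next; };

struct drm_display_mode {
        int clock;                      /* placeholder member */
        struct list_head head;
};

/* Same shape as the kernel helper: it writes through 'new' right away. */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
        new->prev = head->prev;         /* faults when 'new' is bogus */
        new->next = head;
        head->prev->next = new;
        head->prev = new;
}

int main(void)
{
        struct list_head probed = { &probed, &probed };
        struct drm_display_mode *mode = NULL;   /* e.g. allocation failed */

        printf("&mode->head would sit %zu bytes past NULL\n",
               offsetof(struct drm_display_mode, head));

        if (mode)       /* the patch's added check, modeled here */
                list_add_tail(&mode->head, &probed);
        return 0;
}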



Re: [PATCH] drm/amd/display/dc/calcs/dce_calcs: Fix a memleak in calculate_bandwidth()

2022-01-28 Thread Greg KH
On Tue, Jan 25, 2022 at 12:55:51AM +0800, Zhou Qingyang wrote:
> In calculate_bandwidth(), the labels free_sclk and free_yclk are reversed,
> which could lead to a memory leak of yclk.
> 
> Fix this bug by changing the location of free_sclk and free_yclk.
> 
> This bug was found by a static analyzer.
> 
> Builds with 'make allyesconfig' show no new warnings,
> and our static analyzer no longer warns about this code.
> 
> Fixes: 2be8989d0fc2 ("drm/amd/display/dc/calcs/dce_calcs: Move some large 
> variables from the stack to the heap")
> Signed-off-by: Zhou Qingyang 
> ---
> The analysis employs differential checking to identify inconsistent 
> security operations (e.g., checks or kfrees) between two code paths 
> and confirms that the inconsistent operations are not recovered in the
> current function or the callers, so they constitute bugs. 
> 
> Note that, as a bug found by static analysis, it can be a false
> positive or hard to trigger. Multiple researchers have cross-reviewed
> the bug.
> 
>  drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c 
> b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
> index ff5bb152ef49..e6ef36de0825 100644
> --- a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
> +++ b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
> @@ -2033,10 +2033,10 @@ static void calculate_bandwidth(
>   kfree(surface_type);
>  free_tiling_mode:
>   kfree(tiling_mode);
> -free_yclk:
> - kfree(yclk);
>  free_sclk:
>   kfree(sclk);
> +free_yclk:
> + kfree(yclk);
>  }
>  
>  
> /***
> -- 
> 2.25.1
> 

As stated before, umn.edu is still not allowed to contribute to the
Linux kernel.  Please work with your administration to resolve this
issue.
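
For reference, the fix restores the kernel's reverse-order goto-unwind
convention, where cleanup labels free resources newest-to-oldest so each
jump target releases exactly what was allocated before the failure. A
minimal standalone sketch of the pattern (hypothetical buffers, not the
dce_calcs code):

#include <stdlib.h>

static int do_work(void)
{
        int *a, *b, *c;
        int ret = -1;

        a = malloc(sizeof(*a));
        if (!a)
                goto out;
        b = malloc(sizeof(*b));
        if (!b)
                goto free_a;
        c = malloc(sizeof(*c));
        if (!c)
                goto free_b;

        ret = 0;        /* ... use a, b and c here ... */

        free(c);
free_b:                 /* labels must run newest-to-oldest; swapping   */
        free(b);        /* free_a and free_b would leak the older buffer */
free_a:
        free(a);
out:
        return ret;
}

int main(void)
{
        return do_work() ? 1 : 0;
}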



Re: [PATCH RESEND] drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina panels

2022-01-28 Thread Aditya Garg


Hi Alex

> On 27-Jan-2022, at 11:06 PM, Alex Deucher  wrote:
> 
> C style comments please.
Shall be fixed in v2
>  I'll let one of the display guys comment on
> the rest of the patch.  Seems reasonable, we have a similar quirk for
> the Apple MBP 2017 15" Retina panel later in this function.  Could you
> move this next to the other quirk?
I guess moving it next to the other quirk may break the functionality of this
quirk, because the MBP 2018 one also depends on the firmware revision. The
original patch applies the quirk after the following lines of code:


core_link_read_dpcd(
link,
DP_SINK_HW_REVISION_START,
(uint8_t *)&dp_hw_fw_revision,
sizeof(dp_hw_fw_revision));

link->dpcd_caps.sink_hw_revision =
dp_hw_fw_revision.ieee_hw_rev;

memmove(
link->dpcd_caps.sink_fw_revision,
dp_hw_fw_revision.ieee_fw_rev,
sizeof(dp_hw_fw_revision.ieee_fw_rev));

These seem to be related to the firmware revision. Moving it along with the 2017
quirk doesn't sound right to me, as this would move the quirk BEFORE these
lines of code instead. Maybe the author knowingly added the quirk after
these lines of code?

As a workaround, could we move the 2017 quirk later, instead of moving the 2018
quirk earlier? This sounds more logical to me.

Regards
Aditya


Re: [PATCH RESEND] drm/amdgpu: Remove the vega10 from ras support list

2022-01-28 Thread Ma, Jun


On 1/27/2022 10:36 PM, Chen, Guchun wrote:
> [AMD Official Use Only]
> 
> Hi Jun,
> 
> In RAS code, we have this special handling for Vega10. Can you elaborate it 
> please? Any problem you have observed?

Ok, thanks for review. I'll confirm this.

> 
> Regards,
> Guchun
> 
> -Original Message-
> From: Ma, Jun  
> Sent: Thursday, January 27, 2022 7:47 PM
> To: amd-gfx@lists.freedesktop.org; brahma_sw_dev 
> Cc: Zhang, Hawking ; Zhou1, Tao ; 
> Ma, Jun 
> Subject: [PATCH RESEND] drm/amdgpu: Remove the vega10 from ras support list
> 
> Remove vega10 from the ras support check function.
Based on this change, the ras init function is optimized.
> 
> Signed-off-by: majun 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 38 +
>  1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 37e9b7e82993..aa1de974e07e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2129,8 +2129,7 @@ static int amdgpu_ras_recovery_fini(struct 
> amdgpu_device *adev)
>  
>  static bool amdgpu_ras_asic_supported(struct amdgpu_device *adev)  {
> - return adev->asic_type == CHIP_VEGA10 ||
> - adev->asic_type == CHIP_VEGA20 ||
> + return adev->asic_type == CHIP_VEGA20 ||
>   adev->asic_type == CHIP_ARCTURUS ||
>   adev->asic_type == CHIP_ALDEBARAN ||
>   adev->asic_type == CHIP_SIENNA_CICHLID; @@ -2164,13 +2163,13 @@ 
> static void amdgpu_ras_get_quirks(struct amdgpu_device *adev)
>   * we have to initialize ras as normal. but need check if operation is
>   * allowed or not in each function.
>   */
> -static void amdgpu_ras_check_supported(struct amdgpu_device *adev)
> +static bool amdgpu_ras_check_supported(struct amdgpu_device *adev)
>  {
>   adev->ras_hw_enabled = adev->ras_enabled = 0;
>  
>   if (amdgpu_sriov_vf(adev) || !adev->is_atom_fw ||
>   !amdgpu_ras_asic_supported(adev))
> - return;
> + return false;
>  
>   if (!adev->gmc.xgmi.connected_to_cpu) {
>   if (amdgpu_atomfirmware_mem_ecc_supported(adev)) { @@ -2203,6 
> +2202,8 @@ static void amdgpu_ras_check_supported(struct amdgpu_device *adev)
>  
>   adev->ras_enabled = amdgpu_ras_enable == 0 ? 0 :
>   adev->ras_hw_enabled & amdgpu_ras_mask;
> +
> + return true;
>  }
>  
>  static void amdgpu_ras_counte_dw(struct work_struct *work) @@ -2236,6 
> +2237,9 @@ int amdgpu_ras_init(struct amdgpu_device *adev)
>   int r;
>   bool df_poison, umc_poison;
>  
> + if (!amdgpu_ras_check_supported(adev))
> + return -EINVAL;
> +
>   if (con)
>   return 0;
>  
> @@ -2250,28 +2254,24 @@ int amdgpu_ras_init(struct amdgpu_device *adev)
>   INIT_DELAYED_WORK(&con->ras_counte_delay_work, amdgpu_ras_counte_dw);
>   atomic_set(&con->ras_ce_count, 0);
>   atomic_set(&con->ras_ue_count, 0);
> -
>   con->objs = (struct ras_manager *)(con + 1);
> + con->features = 0;
>  
>   amdgpu_ras_set_context(adev, con);
>  
> - amdgpu_ras_check_supported(adev);
> -
> - if (!adev->ras_enabled || adev->asic_type == CHIP_VEGA10) {
> - /* set gfx block ras context feature for VEGA20 Gaming
> -  * send ras disable cmd to ras ta during ras late init.
> -  */
> - if (!adev->ras_enabled && adev->asic_type == CHIP_VEGA20) {
> + if (!adev->ras_enabled) {
> + /* set gfx block ras context feature for VEGA20 Gaming
> +  * send ras disable cmd to ras ta during ras late init.
> +  */
> + if (adev->asic_type == CHIP_VEGA20) {
>   con->features |= BIT(AMDGPU_RAS_BLOCK__GFX);
> -
>   return 0;
> + } else {
> + r = 0;
> + goto release_con;
>   }
> -
> - r = 0;
> - goto release_con;
>   }
>  
> - con->features = 0;
>   INIT_LIST_HEAD(&con->head);
>   /* Might need get this flag from vbios. */
>   con->flags = RAS_DEFAULT_FLAGS;
> @@ -2545,7 +2545,9 @@ int amdgpu_ras_fini(struct amdgpu_device *adev)
>  
>  void amdgpu_ras_global_ras_isr(struct amdgpu_device *adev)  {
> - amdgpu_ras_check_supported(adev);
> + if (!amdgpu_ras_check_supported(adev))
> + return;
> +
>   if (!adev->ras_hw_enabled)
>   return;
>  
> --
> 2.25.1


[PATCH][next] drm/amdgpu: Fix a couple of spelling mistakes

2022-01-28 Thread Colin Ian King
There are two spelling mistakes in dev_err messages. Fix them.

Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 80c25176c993..06d3336a1c84 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -919,14 +919,14 @@ static u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device 
*adev, u32 offset, u32 v
"wrong operation type, rlcg 
failed to program reg: 0x%05x\n", offset);
} else if (tmp & AMDGPU_RLCG_REG_NOT_IN_RANGE) {
dev_err(adev->dev,
-   "regiser is not in range, rlcg 
failed to program reg: 0x%05x\n", offset);
+   "register is not in range, rlcg 
failed to program reg: 0x%05x\n", offset);
} else {
dev_err(adev->dev,
"unknown error type, rlcg 
failed to program reg: 0x%05x\n", offset);
}
} else {
dev_err(adev->dev,
-   "timeout: rlcg faled to program reg: 
0x%05x\n", offset);
+   "timeout: rlcg failed to program reg: 
0x%05x\n", offset);
}
}
}
-- 
2.34.1



Re: [PATCH] drm/amdgpu: fix a potential GPU hang on cyan skillfish

2022-01-28 Thread Lazar, Lijo




On 1/28/2022 4:13 PM, Lang Yu wrote:

We observed a GPU hang when querying GMC CG state(i.e.,
cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
skillfish doesn't support any CG features.

Just prevent cyan skillfish from accessing GMC CG registers.

Signed-off-by: Lang Yu 


Reviewed-by: Lijo Lazar 

Thanks,
Lijo


---
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..bddaf2417344 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+	if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))

+   return;
+
adev->mmhub.funcs->get_clockgating(adev, flags);
  
  	if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))




[PATCH] drm/amdgpu: fix a potential GPU hang on cyan skillfish

2022-01-28 Thread Lang Yu
We observed a GPU hang when querying GMC CG state(i.e.,
cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
skillfish doesn't support any CG features.

Just prevent cyan skillfish from accessing GMC CG registers.

Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..bddaf2417344 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))
+   return;
+
adev->mmhub.funcs->get_clockgating(adev, flags);
 
if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
-- 
2.25.1



Re: [PATCH v2] drm/amdgpu: add safeguards for querying GMC CG state

2022-01-28 Thread Lazar, Lijo




On 1/28/2022 2:46 PM, Lang Yu wrote:

On 01/28/ , Lazar, Lijo wrote:



On 1/28/2022 2:22 PM, Lang Yu wrote:

On 01/28/ , Lazar, Lijo wrote:



On 1/28/2022 12:24 PM, Lang Yu wrote:

We observed a GPU hang when querying GMC CG state(i.e.,
cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
skillfish doesn't support any CG features.

Only allow ASICs which support GMC CG features accessing
related registers. As some ASICs support GMC CG but cg_flags
are not set. Use GC IP version instead of cg_flags to
determine whether GMC CG is supported or not.

v2:
- Use a function to encapsulate more functionality. (Christian)
- Use IP version to determine whether CG is supported or not. (Lijo)

Signed-off-by: Lang Yu 
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 10 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  3 +++
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  3 +++
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  3 +++
5 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index d426de48d299..be1f03b02af6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -876,3 +876,13 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
return 0;
}
+
+bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev)
+{
+   switch (adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(10, 1, 3):
+   return false;
+   default:
+   return true;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 93505bb0a36c..b916e73c7de1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -338,4 +338,5 @@ uint64_t amdgpu_gmc_vram_mc2pa(struct amdgpu_device *adev, 
uint64_t mc_addr);
uint64_t amdgpu_gmc_vram_pa(struct amdgpu_device *adev, struct amdgpu_bo 
*bo);
uint64_t amdgpu_gmc_vram_cpu_pa(struct amdgpu_device *adev, struct 
amdgpu_bo *bo);
int amdgpu_gmc_vram_checking(struct amdgpu_device *adev);
+bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev);
#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..4e46f618d6c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   if (!amdgpu_gmc_cg_enabled(adev))
+   return;
+


I think Christian suggested amdgpu_gmc_cg_enabled function assuming it's a
common logic for all ASICs based on flags. Now that assumption has changed.
Now the logic is a specific IP version doesn't enable CG which is known
beforehand. So we could maintain the check in the specific IP version block
itself (gmc 10 in this example). No need to call another common function
which checks IP version again.


Thanks. You mean just like this?

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..bddaf2417344 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
   {
  struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))

*flags = 0;

Yes, add the above line also.


It may clear the CG mask of other IP blocks. Does that make sense? Thanks!


Ah! right. No need to clear the flags.

Thanks,
Lijo


Regards,
Lang


Thanks,
Lijo

+   return;
+
  adev->mmhub.funcs->get_clockgating(adev, flags);

  if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))

Regards,
Lang


Thanks,
Lijo


adev->mmhub.funcs->get_clockgating(adev, flags);
if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index ca9841d5669f..ff9dff2a6cf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1695,6 +1695,9 @@ static void gmc_v8_0_get_clockgating_state(void *handle, 
u32 *flags)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
int data;
+   if (!amdgpu_gmc_cg_enabled(adev))
+   return;
+
if (amdgpu_sriov_vf(adev))
*flags = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 4595027a8c63..faf017609dfe 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1952,6 +1952,9 @@ static void gmc_v9_0_get_clockgating_state(void *handle, 
u32 *flags)
{
struct 

Re: [PATCH v2] drm/amdgpu: add safeguards for querying GMC CG state

2022-01-28 Thread Lang Yu
On 01/28/ , Lazar, Lijo wrote:
> 
> 
> On 1/28/2022 2:22 PM, Lang Yu wrote:
> > On 01/28/ , Lazar, Lijo wrote:
> > > 
> > > 
> > > On 1/28/2022 12:24 PM, Lang Yu wrote:
> > > > We observed a GPU hang when querying GMC CG state(i.e.,
> > > > cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
> > > > skillfish doesn't support any CG features.
> > > > 
> > > > Only allow ASICs which support GMC CG features accessing
> > > > related registers. As some ASICs support GMC CG but cg_flags
> > > > are not set. Use GC IP version instead of cg_flags to
> > > > determine whether GMC CG is supported or not.
> > > > 
> > > > v2:
> > > >- Use a function to encapsulate more functionality. (Christian)
> > > >- Use IP version to determine whether CG is supported or not. (Lijo)
> > > > 
> > > > Signed-off-by: Lang Yu 
> > > > ---
> > > >drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 10 ++
> > > >drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
> > > >drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  3 +++
> > > >drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  3 +++
> > > >drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  3 +++
> > > >5 files changed, 20 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > > index d426de48d299..be1f03b02af6 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > > @@ -876,3 +876,13 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device 
> > > > *adev)
> > > > return 0;
> > > >}
> > > > +
> > > > +bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev)
> > > > +{
> > > > +   switch (adev->ip_versions[GC_HWIP][0]) {
> > > > +   case IP_VERSION(10, 1, 3):
> > > > +   return false;
> > > > +   default:
> > > > +   return true;
> > > > +   }
> > > > +}
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> > > > index 93505bb0a36c..b916e73c7de1 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> > > > @@ -338,4 +338,5 @@ uint64_t amdgpu_gmc_vram_mc2pa(struct amdgpu_device 
> > > > *adev, uint64_t mc_addr);
> > > >uint64_t amdgpu_gmc_vram_pa(struct amdgpu_device *adev, struct 
> > > > amdgpu_bo *bo);
> > > >uint64_t amdgpu_gmc_vram_cpu_pa(struct amdgpu_device *adev, struct 
> > > > amdgpu_bo *bo);
> > > >int amdgpu_gmc_vram_checking(struct amdgpu_device *adev);
> > > > +bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev);
> > > >#endif
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > > > index 73ab0eebe4e2..4e46f618d6c1 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > > > @@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void 
> > > > *handle, u32 *flags)
> > > >{
> > > > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > > > +   if (!amdgpu_gmc_cg_enabled(adev))
> > > > +   return;
> > > > +
> > > 
> > > I think Christian suggested amdgpu_gmc_cg_enabled function assuming it's a
> > > common logic for all ASICs based on flags. Now that assumption has 
> > > changed.
> > > Now the logic is a specific IP version doesn't enable CG which is known
> > > beforehand. So we could maintain the check in the specific IP version 
> > > block
> > > itself (gmc 10 in this example). No need to call another common function
> > > which checks IP version again.
> > 
> > Thanks. You mean just like this?
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > index 73ab0eebe4e2..bddaf2417344 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > @@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void 
> > *handle, u32 *flags)
> >   {
> >  struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > 
> > +   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))
>   *flags = 0;
> 
> Yes, add the above line also.

It may clear the CG mask of other IP blocks. Does that make sense? Thanks!

Regards,
Lang
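
Lang's concern can be seen with a small model: every IP block's
get_clockgating hook ORs its own bits into the one *flags word the caller
passes around, so zeroing it in one hook would discard what earlier hooks
reported. A standalone sketch (invented flag bits, purely illustrative):

#include <stdio.h>

#define CG_MMHUB (1u << 0)      /* invented bits, for illustration only */
#define CG_ATHUB (1u << 1)
#define CG_GMC   (1u << 2)

static void mmhub_get_clockgating(unsigned int *flags) { *flags |= CG_MMHUB; }
static void athub_get_clockgating(unsigned int *flags) { *flags |= CG_ATHUB; }

static void gmc_get_clockgating(unsigned int *flags, int cg_supported)
{
        if (!cg_supported)
                return;         /* bail out; never write *flags = 0 here */
        *flags |= CG_GMC;
}

int main(void)
{
        unsigned int flags = 0;

        mmhub_get_clockgating(&flags);
        athub_get_clockgating(&flags);
        gmc_get_clockgating(&flags, 0); /* unsupported ASIC: early return */
        printf("flags = 0x%x\n", flags); /* still 0x3, nothing was lost */
        return 0;
}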

> Thanks,
> Lijo
> > +   return;
> > +
> >  adev->mmhub.funcs->get_clockgating(adev, flags);
> > 
> >  if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
> > 
> > Regards,
> > Lang
> > 
> > > Thanks,
> > > Lijo
> > > 
> > > > adev->mmhub.funcs->get_clockgating(adev, flags);
> > > > if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > > > index ca9841d5669f..ff9dff2a6cf1 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > > > +++ 

Re: [PATCH v2] drm/amdgpu: add safeguards for querying GMC CG state

2022-01-28 Thread Lazar, Lijo




On 1/28/2022 2:22 PM, Lang Yu wrote:

On 01/28/ , Lazar, Lijo wrote:



On 1/28/2022 12:24 PM, Lang Yu wrote:

We observed a GPU hang when querying GMC CG state(i.e.,
cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
skillfish doesn't support any CG features.

Only allow ASICs which support GMC CG features accessing
related registers. As some ASICs support GMC CG but cg_flags
are not set. Use GC IP version instead of cg_flags to
determine whether GMC CG is supported or not.

v2:
   - Use a function to encapsulate more functionality. (Christian)
   - Use IP version to determine whether CG is supported or not. (Lijo)

Signed-off-by: Lang Yu 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 10 ++
   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  3 +++
   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  3 +++
   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  3 +++
   5 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index d426de48d299..be1f03b02af6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -876,3 +876,13 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
return 0;
   }
+
+bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev)
+{
+   switch (adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(10, 1, 3):
+   return false;
+   default:
+   return true;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 93505bb0a36c..b916e73c7de1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -338,4 +338,5 @@ uint64_t amdgpu_gmc_vram_mc2pa(struct amdgpu_device *adev, 
uint64_t mc_addr);
   uint64_t amdgpu_gmc_vram_pa(struct amdgpu_device *adev, struct amdgpu_bo 
*bo);
   uint64_t amdgpu_gmc_vram_cpu_pa(struct amdgpu_device *adev, struct amdgpu_bo 
*bo);
   int amdgpu_gmc_vram_checking(struct amdgpu_device *adev);
+bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev);
   #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..4e46f618d6c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
   {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   if (!amdgpu_gmc_cg_enabled(adev))
+   return;
+


I think Christian suggested amdgpu_gmc_cg_enabled function assuming it's a
common logic for all ASICs based on flags. Now that assumption has changed.
Now the logic is a specific IP version doesn't enable CG which is known
beforehand. So we could maintain the check in the specific IP version block
itself (gmc 10 in this example). No need to call another common function
which checks IP version again.


Thanks. You mean just like this?

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..bddaf2417344 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
  {
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))

*flags = 0;

Yes, add the above line also.

Thanks,
Lijo

+   return;
+
 adev->mmhub.funcs->get_clockgating(adev, flags);

 if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))

Regards,
Lang


Thanks,
Lijo


adev->mmhub.funcs->get_clockgating(adev, flags);
if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index ca9841d5669f..ff9dff2a6cf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1695,6 +1695,9 @@ static void gmc_v8_0_get_clockgating_state(void *handle, 
u32 *flags)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
int data;
+   if (!amdgpu_gmc_cg_enabled(adev))
+   return;
+
if (amdgpu_sriov_vf(adev))
*flags = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 4595027a8c63..faf017609dfe 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1952,6 +1952,9 @@ static void gmc_v9_0_get_clockgating_state(void *handle, 
u32 *flags)
   {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   if (!amdgpu_gmc_cg_enabled(adev))
+   return;
+
adev->mmhub.funcs->get_clockgating(adev, flags);
athub_v1_0_get_clockgating(adev, flags);



Re: [PATCH v4 02/10] mm: add device coherent vma selection for memory migration

2022-01-28 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:41 PM AEDT Alex Sierra wrote:

[...]

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 277562cd4cf5..2b3375e165b1 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2340,8 +2340,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   if (is_writable_device_private_entry(entry))
>   mpfn |= MIGRATE_PFN_WRITE;
>   } else {
> - if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
> - goto next;

This isn't correct as it allows zero pfn pages to be selected for migration
when they shouldn't be (ie. because MIGRATE_VMA_SELECT_SYSTEM isn't specified).

>   pfn = pte_pfn(pte);
>   if (is_zero_pfn(pfn)) {
>   mpfn = MIGRATE_PFN_MIGRATE;
> @@ -2349,6 +2347,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   goto next;
>   }
>   page = vm_normal_page(migrate->vma, addr, pte);
> + if (page && !is_zone_device_page(page) &&
> + !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
> + goto next;
> + if (page && is_device_coherent_page(page) &&
> + (!(migrate->flags & 
> MIGRATE_VMA_SELECT_DEVICE_COHERENT) ||
> +  page->pgmap->owner != migrate->pgmap_owner))
> + goto next;
>   mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
>   mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
>   }
> 
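
One way to restore the gating described above is to make the zero-pfn fast
path conditional on MIGRATE_VMA_SELECT_SYSTEM as well, so only the device
checks differ per page kind. A standalone model of the resulting selection
predicate (the flag values and page classification are simplified
assumptions; an untested sketch, not the actual kernel fix):

#include <stdbool.h>
#include <stdio.h>

#define MIGRATE_VMA_SELECT_SYSTEM          (1u << 0)    /* assumed values */
#define MIGRATE_VMA_SELECT_DEVICE_COHERENT (1u << 2)

enum page_kind { PAGE_ZERO_PFN, PAGE_SYSTEM, PAGE_DEVICE_COHERENT };

/* Should this pte be selected for migration under the given flags? */
static bool select_for_migration(enum page_kind kind, unsigned int flags,
                                 bool owner_matches)
{
        switch (kind) {
        case PAGE_ZERO_PFN:     /* gate the zero-pfn path on SYSTEM too */
        case PAGE_SYSTEM:
                return flags & MIGRATE_VMA_SELECT_SYSTEM;
        case PAGE_DEVICE_COHERENT:
                return (flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) &&
                       owner_matches;
        }
        return false;
}

int main(void)
{
        /* Without SELECT_SYSTEM, a zero-pfn page must not be selected. */
        printf("%d\n", select_for_migration(PAGE_ZERO_PFN,
                MIGRATE_VMA_SELECT_DEVICE_COHERENT, true));     /* 0 */
        printf("%d\n", select_for_migration(PAGE_ZERO_PFN,
                MIGRATE_VMA_SELECT_SYSTEM, false));             /* 1 */
        return 0;
}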






Re: [PATCH v4 06/10] lib: test_hmm add ioctl to get zone device type

2022-01-28 Thread Alistair Popple
Reviewed-by: Alistair Popple 

On Thursday, 27 January 2022 2:09:45 PM AEDT Alex Sierra wrote:
> A new ioctl command is added to query the zone device type. This will be
> used once test_hmm adds the zone device coherent type.
> 
> Signed-off-by: Alex Sierra 
> ---
>  lib/test_hmm.c  | 23 +--
>  lib/test_hmm_uapi.h |  8 
>  2 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index c259842f6d44..fb1fa7c6fa98 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -84,6 +84,7 @@ struct dmirror_chunk {
>  struct dmirror_device {
>   struct cdev cdevice;
>   struct hmm_devmem   *devmem;
> + unsigned intzone_device_type;
>  
>   unsigned intdevmem_capacity;
>   unsigned intdevmem_count;
> @@ -1025,6 +1026,15 @@ static int dmirror_snapshot(struct dmirror *dmirror,
>   return ret;
>  }
>  
> +static int dmirror_get_device_type(struct dmirror *dmirror,
> + struct hmm_dmirror_cmd *cmd)
> +{
> + mutex_lock(&dmirror->mutex);
> + cmd->zone_device_type = dmirror->mdevice->zone_device_type;
> + mutex_unlock(&dmirror->mutex);
> +
> + return 0;
> +}
>  static long dmirror_fops_unlocked_ioctl(struct file *filp,
>   unsigned int command,
>   unsigned long arg)
> @@ -1075,6 +1085,9 @@ static long dmirror_fops_unlocked_ioctl(struct file 
> *filp,
>   ret = dmirror_snapshot(dmirror, &cmd);
>   break;
>  
> + case HMM_DMIRROR_GET_MEM_DEV_TYPE:
> + ret = dmirror_get_device_type(dmirror, &cmd);
> + break;
>   default:
>   return -EINVAL;
>   }
> @@ -1235,14 +1248,20 @@ static void dmirror_device_remove(struct 
> dmirror_device *mdevice)
>  static int __init hmm_dmirror_init(void)
>  {
>   int ret;
> - int id;
> + int id = 0;
> + int ndevices = 0;
>  
>   ret = alloc_chrdev_region(&dmirror_dev, 0, DMIRROR_NDEVICES,
> "HMM_DMIRROR");
>   if (ret)
>   goto err_unreg;
>  
> - for (id = 0; id < DMIRROR_NDEVICES; id++) {
> + memset(dmirror_devices, 0, DMIRROR_NDEVICES * 
> sizeof(dmirror_devices[0]));
> + dmirror_devices[ndevices++].zone_device_type =
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
> + dmirror_devices[ndevices++].zone_device_type =
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
> + for (id = 0; id < ndevices; id++) {
>   ret = dmirror_device_init(dmirror_devices + id, id);
>   if (ret)
>   goto err_chrdev;
> diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
> index f14dea5dcd06..17f842f1aa02 100644
> --- a/lib/test_hmm_uapi.h
> +++ b/lib/test_hmm_uapi.h
> @@ -19,6 +19,7 @@
>   * @npages: (in) number of pages to read/write
>   * @cpages: (out) number of pages copied
>   * @faults: (out) number of device page faults seen
> + * @zone_device_type: (out) zone device memory type
>   */
>  struct hmm_dmirror_cmd {
>   __u64   addr;
> @@ -26,6 +27,7 @@ struct hmm_dmirror_cmd {
>   __u64   npages;
>   __u64   cpages;
>   __u64   faults;
> + __u64   zone_device_type;
>  };
>  
>  /* Expose the address space of the calling process through hmm device file */
> @@ -35,6 +37,7 @@ struct hmm_dmirror_cmd {
>  #define HMM_DMIRROR_SNAPSHOT _IOWR('H', 0x03, struct hmm_dmirror_cmd)
>  #define HMM_DMIRROR_EXCLUSIVE	_IOWR('H', 0x04, struct hmm_dmirror_cmd)
>  #define HMM_DMIRROR_CHECK_EXCLUSIVE  _IOWR('H', 0x05, struct hmm_dmirror_cmd)
> +#define HMM_DMIRROR_GET_MEM_DEV_TYPE _IOWR('H', 0x06, struct hmm_dmirror_cmd)
>  
>  /*
>   * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
> @@ -62,4 +65,9 @@ enum {
>   HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
>  };
>  
> +enum {
> + /* 0 is reserved to catch uninitialized type fields */
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
> +};
> +
>  #endif /* _LIB_TEST_HMM_UAPI_H */
> 
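
As context, userspace would exercise the new ioctl roughly as in the sketch
below. The /dev/hmm_dmirror0 node name is an assumption (it matches what the
HMM selftests use), and npages is set to 1 only so the driver's generic
addr/npages validation passes for this query:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include "test_hmm_uapi.h"

	int main(void)
	{
		struct hmm_dmirror_cmd cmd = { 0 };
		int fd = open("/dev/hmm_dmirror0", O_RDWR);	/* assumed node name */

		cmd.npages = 1;	/* satisfies the common addr/npages sanity check */
		if (fd < 0 || ioctl(fd, HMM_DMIRROR_GET_MEM_DEV_TYPE, &cmd) < 0)
			return 1;
		printf("zone device type: %llu\n",
		       (unsigned long long)cmd.zone_device_type);
		return 0;
	}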


Re: [PATCH v4 07/10] lib: test_hmm add module param for zone device type

2022-01-28 Thread Alistair Popple
Thanks for the updates, looks good now.

Reviewed-by: Alistair Popple 

On Thursday, 27 January 2022 2:09:46 PM AEDT Alex Sierra wrote:
> In order to configure device coherent in test_hmm, two module parameters
> should be passed, which correspond to the SP start address of each of the
> two devices: spm_addr_dev0 & spm_addr_dev1. If no parameters are passed,
> the private device type is configured.
> 
> Signed-off-by: Alex Sierra 
> ---
>  lib/test_hmm.c  | 73 -
>  lib/test_hmm_uapi.h |  1 +
>  2 files changed, 53 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index fb1fa7c6fa98..6f068f7c4ee3 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -34,6 +34,16 @@
>  #define DEVMEM_CHUNK_SIZE	(256 * 1024 * 1024U)
>  #define DEVMEM_CHUNKS_RESERVE	16
>  
> +static unsigned long spm_addr_dev0;
> +module_param(spm_addr_dev0, long, 0644);
> +MODULE_PARM_DESC(spm_addr_dev0,
> + "Specify start address for SPM (special purpose memory) used 
> for device 0. By setting this Coherent device type will be used. Make sure 
> spm_addr_dev1 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
> +
> +static unsigned long spm_addr_dev1;
> +module_param(spm_addr_dev1, long, 0644);
> +MODULE_PARM_DESC(spm_addr_dev1,
> + "Specify start address for SPM (special purpose memory) used 
> for device 1. By setting this Coherent device type will be used. Make sure 
> spm_addr_dev0 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
> +
>  static const struct dev_pagemap_ops dmirror_devmem_ops;
>  static const struct mmu_interval_notifier_ops dmirror_min_ops;
>  static dev_t dmirror_dev;
> @@ -452,28 +462,44 @@ static int dmirror_write(struct dmirror *dmirror, 
> struct hmm_dmirror_cmd *cmd)
>   return ret;
>  }
>  
> -static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
> +static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
>  struct page **ppage)
>  {
>   struct dmirror_chunk *devmem;
> - struct resource *res;
> + struct resource *res = NULL;
>   unsigned long pfn;
>   unsigned long pfn_first;
>   unsigned long pfn_last;
>   void *ptr;
> + int ret = -ENOMEM;
>  
>   devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
>   if (!devmem)
> - return false;
> + return ret;
>  
> - res = request_free_mem_region(_resource, DEVMEM_CHUNK_SIZE,
> -   "hmm_dmirror");
> - if (IS_ERR(res))
> + switch (mdevice->zone_device_type) {
> + case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE:
> + res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
> +   "hmm_dmirror");
> + if (IS_ERR_OR_NULL(res))
> + goto err_devmem;
> + devmem->pagemap.range.start = res->start;
> + devmem->pagemap.range.end = res->end;
> + devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
> + break;
> + case HMM_DMIRROR_MEMORY_DEVICE_COHERENT:
> + devmem->pagemap.range.start = (MINOR(mdevice->cdevice.dev) - 2) ?
> + spm_addr_dev0 :
> + spm_addr_dev1;
> + devmem->pagemap.range.end = devmem->pagemap.range.start +
> + DEVMEM_CHUNK_SIZE - 1;
> + devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
> + break;
> + default:
> + ret = -EINVAL;
>   goto err_devmem;
> + }
>  
> - devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
> - devmem->pagemap.range.start = res->start;
> - devmem->pagemap.range.end = res->end;
>   devmem->pagemap.nr_range = 1;
>   devmem->pagemap.ops = _devmem_ops;
>   devmem->pagemap.owner = mdevice;
> @@ -494,10 +520,14 @@ static bool dmirror_allocate_chunk(struct 
> dmirror_device *mdevice,
>   mdevice->devmem_capacity = new_capacity;
>   mdevice->devmem_chunks = new_chunks;
>   }
> -
>   ptr = memremap_pages(&devmem->pagemap, numa_node_id());
> - if (IS_ERR(ptr))
> + if (IS_ERR_OR_NULL(ptr)) {
> + if (ptr)
> + ret = PTR_ERR(ptr);
> + else
> + ret = -EFAULT;
>   goto err_release;
> + }
>  
>   devmem->mdevice = mdevice;
>   pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
> @@ -526,15 +556,17 @@ static bool dmirror_allocate_chunk(struct 
> dmirror_device *mdevice,
>   }
>   spin_unlock(&mdevice->lock);
>  
> - return true;
> + return 0;
>  
>  err_release:
>   mutex_unlock(&mdevice->devmem_lock);
> - release_mem_region(devmem->pagemap.range.start, range_len(&devmem->pagemap.range));
> + if (res && devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
> + 	release_mem_region(devmem->pagemap.range.start, range_len(&devmem->pagemap.range));

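For reference, the coherent configuration described in this patch would be
selected by loading the module with both SP start addresses set, along the
lines of the following (addresses illustrative, matching the efi_fake_mem
example used later in this series):

	modprobe test_hmm spm_addr_dev0=0x100000000 spm_addr_dev1=0x140000000

Without these parameters the module keeps the existing device-private
behaviour.
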
Re: [PATCH v4 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-28 Thread Andrew Morton
On Thu, 27 Jan 2022 17:20:40 -0600 "Sierra Guiza, Alejandro (Alex)" 
 wrote:

> Andrew,
> We're somewhat new to this procedure. Are you referring to rebasing this
> patch series to
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git 
> <5.17-rc1 tag>?

No, against current Linus mainline, please.


Re: [PATCH v4 01/10] mm: add zone device coherent type memory support

2022-01-28 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:40 PM AEDT Alex Sierra wrote:

[...]

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 1852d787e6ab..277562cd4cf5 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -362,7 +362,7 @@ static int expected_page_refs(struct address_space *mapping, struct page *page)
>  * Device private pages have an extra refcount as they are
>  * ZONE_DEVICE pages.
>  */
> 
> -   expected_count += is_device_private_page(page);
> +   expected_count += is_dev_private_or_coherent_page(page);
> 
> if (mapping)
> 
> expected_count += thp_nr_pages(page) +
> page_has_private(page);
> 
> @@ -2503,7 +2503,7 @@ static bool migrate_vma_check_page(struct page *page)
> 
>  * FIXME proper solution is to rework migration_entry_wait()
>  so
>  * it does not need to take a reference on page.
>  */
> 
> -   return is_device_private_page(page);
> +   return is_dev_private_or_coherent_page(page);

As Andrew points out this no longer applies due to changes here. I think you
can just drop this hunk though.

[...]

> diff --git a/mm/rmap.c b/mm/rmap.c
> index 6aebd1747251..32dae6839403 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1823,10 +1823,17 @@ static bool try_to_migrate_one(struct page *page, 
> struct vm_area_struct *vma,
>* pteval maps a zone device page and is therefore
>* a swap pte.
>*/
> - if (pte_swp_soft_dirty(pteval))
> - swp_pte = pte_swp_mksoft_dirty(swp_pte);
> - if (pte_swp_uffd_wp(pteval))
> - swp_pte = pte_swp_mkuffd_wp(swp_pte);
> + if (is_device_coherent_page(page)) {
> + if (pte_soft_dirty(pteval))
> + swp_pte = pte_swp_mksoft_dirty(swp_pte);
> + if (pte_uffd_wp(pteval))
> + swp_pte = pte_swp_mkuffd_wp(swp_pte);
> + } else {
> + if (pte_swp_soft_dirty(pteval))
> + swp_pte = pte_swp_mksoft_dirty(swp_pte);
> + if (pte_swp_uffd_wp(pteval))
> + swp_pte = pte_swp_mkuffd_wp(swp_pte);
> + }

As I understand things, ptes for device coherent pages don't need special
treatment, so rather than special-casing here it should just fall
through to the same path as normal pages. For that I think all you need is
something like:

-if (is_zone_device_page(page)) {
+if (is_device_private_page(page)) {

Noting that device private pages are the only zone device pages that could
have been encountered here anyway.
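
In other words, the resulting branch would look roughly like the sketch
below. This is illustrative of the suggested control flow only, not the
actual patch, and the surrounding details of try_to_migrate_one() are
omitted:

	if (is_device_private_page(page)) {
		/* pteval is a device-private swap pte, so the pte_swp_*
		 * helpers apply when carrying the bits over.
		 */
		if (pte_swp_soft_dirty(pteval))
			swp_pte = pte_swp_mksoft_dirty(swp_pte);
		if (pte_swp_uffd_wp(pteval))
			swp_pte = pte_swp_mkuffd_wp(swp_pte);
		set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
	}
	/* Device coherent pages fall through to the normal-page path,
	 * where pteval is a present pte and the pte_soft_dirty()/
	 * pte_uffd_wp() helpers are used instead.
	 */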

>   set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
>   /*
>* No need to invalidate here it will synchronize on
> @@ -1837,7 +1844,7 @@ static bool try_to_migrate_one(struct page *page, 
> struct vm_area_struct *vma,
>* Since only PAGE_SIZE pages can currently be
>* migrated, just set it to page. This will need to be
>* changed when hugepage migrations to device private
> -  * memory are supported.
> +  * or coherent memory are supported.
>*/
>   subpage = page;
>   } else if (PageHWPoison(page)) {
> @@ -1943,7 +1950,8 @@ void try_to_migrate(struct page *page, enum ttu_flags 
> flags)
>   TTU_SYNC)))
>   return;
>  
> - if (is_zone_device_page(page) && !is_device_private_page(page))
> + if (is_zone_device_page(page) &&
> + !is_dev_private_or_coherent_page(page))
>   return;
>  
>   /*
> 


Re: [PATCH v4 04/10] drm/amdkfd: add SPM support for SVM

2022-01-28 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:43 PM AEDT Alex Sierra wrote:

[...]

> @@ -984,3 +990,4 @@ int svm_migrate_init(struct amdgpu_device *adev)
>  
>   return 0;
>  }
> +
> 

git-am complained about this when I applied the series. Given you have to
rebase anyway it would be worth fixing this.





Re: [PATCH v4 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-28 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:42 PM AEDT Alex Sierra wrote:
> Avoid long term pinning for Coherent device type pages. This could
> interfere with their own device memory manager. For now, we are just
> returning an error for PIN_LONGTERM Coherent device type pages. Eventually,
> these types of pages will get migrated to system memory, once device
> page migration support is added.
> 
> Signed-off-by: Alex Sierra 
> ---
>  mm/gup.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 886d6148d3d0..5291d7221826 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1720,6 +1720,12 @@ static long check_and_migrate_movable_pages(unsigned 
> long nr_pages,
>* If we get a movable page, since we are going to be pinning
>* these entries, try to move them out if possible.
>*/
> + if (is_dev_private_or_coherent_page(head)) {
> + WARN_ON_ONCE(is_device_private_page(head));
> + ret = -EFAULT;
> + goto unpin_pages;
> + }
> +
>   if (!is_pinnable_page(head)) {
>   if (PageHuge(head)) {
>   if (!isolate_huge_page(head, &movable_page_list))
> @@ -1750,6 +1756,7 @@ static long check_and_migrate_movable_pages(unsigned 
> long nr_pages,
>   if (list_empty(_page_list) && !isolation_error_count)
>   return nr_pages;
>  
> +unpin_pages:
>   if (gup_flags & FOLL_PIN) {
>   unpin_user_pages(pages, nr_pages);
>   } else {
 
If there is a mix of ZONE_MOVABLE and device pages the return value (ret) will
be subsequently lost here:

if (!list_empty(&movable_page_list)) {
	ret = migrate_pages(&movable_page_list, alloc_migration_target,
			    NULL, (unsigned long)&mtc, MIGRATE_SYNC,
			    MR_LONGTERM_PIN, NULL);
	if (ret && !list_empty(&movable_page_list))
		putback_movable_pages(&movable_page_list);
}

Which won't actually cause any problems, but it will lead to the GUP getting
retried unnecessarily. I do still intend to address this with a series to
migrate pages instead though, so I think this is ok for now as it's an unlikely
corner case anyway. Therefore feel free to add the below when you repost:

Reviewed-by: Alistair Popple 
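
(For completeness, one illustrative way a later cleanup could keep the
earlier -EFAULT from being clobbered by a successful migrate_pages() call;
this is not part of the patch under review:)

	if (!list_empty(&movable_page_list)) {
		int migrate_ret = migrate_pages(&movable_page_list,
						alloc_migration_target, NULL,
						(unsigned long)&mtc, MIGRATE_SYNC,
						MR_LONGTERM_PIN, NULL);

		if (migrate_ret && !list_empty(&movable_page_list))
			putback_movable_pages(&movable_page_list);
		/* don't let a successful migration discard the earlier error */
		if (!ret)
			ret = migrate_ret;
	}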





[PATCH] drm/amdgpu: remove duplicate include in 'amdgpu_device.c'

2022-01-28 Thread cgel . zte
From: Changcheng Deng 

'linux/pci.h' included in 'amdgpu_device.c' is duplicated.

Reported-by: Zeal Robot 
Signed-off-by: Changcheng Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index dd5979098e63..289c5c626324 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -56,7 +56,6 @@
 #include "soc15.h"
 #include "nv.h"
 #include "bif/bif_4_1_d.h"
-#include <linux/pci.h>
 #include 
 #include "amdgpu_vf_error.h"

--
2.25.1



Re: [PATCH v4 08/10] lib: add support for device coherent type in test_hmm

2022-01-28 Thread Alistair Popple
I haven't tested the change which checks that pages migrated back to sysmem,
but it looks ok so:

Reviewed-by: Alistair Popple 

On Thursday, 27 January 2022 2:09:47 PM AEDT Alex Sierra wrote:
> Device Coherent type uses device memory that is coherently accessible by
> the CPU. This could be shown as an SP (special purpose) memory range
> in the BIOS-e820 memory enumeration. If no SP memory is supported in the
> system, this could be faked by setting CONFIG_EFI_FAKE_MEMMAP.
> 
> Currently, test_hmm only supports two different SP ranges of at least
> 256MB size. This could be specified in the kernel parameter variable
> efi_fake_mem. Ex. Two SP ranges of 1GB starting at 0x100000000 &
> 0x140000000 physical address. Ex.
> efi_fake_mem=1G@0x100000000:0x40000,1G@0x140000000:0x40000
> 
> Private and coherent device mirror instances can be created in the same
> probe. This is done by passing the module parameters spm_addr_dev0 &
> spm_addr_dev1. In this case, it will create four instances of
> device_mirror. The first two correspond to private device type, the
> last two to coherent type. Then, they can be easily accessed from user
> space through /dev/hmm_mirror<num_device>. Usually num_device 0 and 1
> are for private, and 2 and 3 for coherent types. If no module
> parameters are passed, only two instances of private type device_mirror
> will be created.
> 
> Signed-off-by: Alex Sierra 
> ---
> v4:
> Return number of coherent device pages successfully migrated to system.
> This is returned at cmd->cpages.
> ---
>  lib/test_hmm.c  | 260 +---
>  lib/test_hmm_uapi.h |  15 ++-
>  2 files changed, 205 insertions(+), 70 deletions(-)
> 
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index 6f068f7c4ee3..850d5331e370 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -29,11 +29,22 @@
>  
>  #include "test_hmm_uapi.h"
>  
> -#define DMIRROR_NDEVICES 2
> +#define DMIRROR_NDEVICES 4
>  #define DMIRROR_RANGE_FAULT_TIMEOUT  1000
>  #define DEVMEM_CHUNK_SIZE	(256 * 1024 * 1024U)
>  #define DEVMEM_CHUNKS_RESERVE	16
>  
> +/*
> + * For device_private pages, dpage is just a dummy struct page
> + * representing a piece of device memory. dmirror_devmem_alloc_page
> + * allocates a real system memory page as backing storage to fake a
> + * real device. zone_device_data points to that backing page. But
> + * for device_coherent memory, the struct page represents real
> + * physical CPU-accessible memory that we can use directly.
> + */
> +#define BACKING_PAGE(page) (is_device_private_page((page)) ? \
> +(page)->zone_device_data : (page))
> +
>  static unsigned long spm_addr_dev0;
>  module_param(spm_addr_dev0, long, 0644);
>  MODULE_PARM_DESC(spm_addr_dev0,
> @@ -122,6 +133,21 @@ static int dmirror_bounce_init(struct dmirror_bounce 
> *bounce,
>   return 0;
>  }
>  
> +static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
> +{
> + return (mdevice->zone_device_type ==
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ? true : false;
> +}
> +
> +static enum migrate_vma_direction
> + dmirror_select_device(struct dmirror *dmirror)
> +{
> + return (dmirror->mdevice->zone_device_type ==
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ?
> + MIGRATE_VMA_SELECT_DEVICE_PRIVATE :
> + MIGRATE_VMA_SELECT_DEVICE_COHERENT;
> +}
> +
>  static void dmirror_bounce_fini(struct dmirror_bounce *bounce)
>  {
>   vfree(bounce->ptr);
> @@ -572,16 +598,19 @@ static int dmirror_allocate_chunk(struct dmirror_device 
> *mdevice,
>  static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
>  {
>   struct page *dpage = NULL;
> - struct page *rpage;
> + struct page *rpage = NULL;
>  
>   /*
> -  * This is a fake device so we alloc real system memory to store
> -  * our device memory.
> +  * For ZONE_DEVICE private type, this is a fake device so we alloc real
> +  * system memory to store our device memory.
> +  * For ZONE_DEVICE coherent type we use the actual dpage to store the data
> +  * and ignore rpage.
>*/
> - rpage = alloc_page(GFP_HIGHUSER);
> - if (!rpage)
> - return NULL;
> -
> + if (dmirror_is_private_zone(mdevice)) {
> + rpage = alloc_page(GFP_HIGHUSER);
> + if (!rpage)
> + return NULL;
> + }
>   spin_lock(&mdevice->lock);
>  
>   if (mdevice->free_pages) {
> @@ -601,7 +630,8 @@ static struct page *dmirror_devmem_alloc_page(struct 
> dmirror_device *mdevice)
>   return dpage;
>  
>  error:
> - __free_page(rpage);
> + if (rpage)
> + __free_page(rpage);
>   return NULL;
>  }
>  
> @@ -627,12 +657,16 @@ static void dmirror_migrate_alloc_and_copy(struct 
> migrate_vma *args,
>* unallocated pte_none() or read-only zero page.
>*/
>   spage = migrate_pfn_to_page(*src);
> +   

[PATCH] drm/amd/pm: remove duplicate include in 'arcturus_ppt.c'

2022-01-28 Thread cgel . zte
From: Changcheng Deng 

'amdgpu_dpm.h' included in 'arcturus_ppt.c' is duplicated.

Reported-by: Zeal Robot 
Signed-off-by: Changcheng Deng 
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index ee296441c5bc..709c32063ef7 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -46,7 +46,6 @@
 #include 
 #include "amdgpu_ras.h"
 #include "smu_cmn.h"
-#include "amdgpu_dpm.h"

 /*
  * DO NOT use these for err/warn/info/debug messages.
--
2.25.1



Re: [PATCH v2] drm/amdgpu: add safeguards for querying GMC CG state

2022-01-28 Thread Lang Yu
On 01/28/2022, Lazar, Lijo wrote:
> 
> 
> On 1/28/2022 12:24 PM, Lang Yu wrote:
> > We observed a GPU hang when querying GMC CG state (i.e.,
> > cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
> > skillfish doesn't support any CG features.
> > 
> > Only allow ASICs which support GMC CG features to access the
> > related registers. Some ASICs support GMC CG but do not set
> > cg_flags, so use the GC IP version instead of cg_flags to
> > determine whether GMC CG is supported.
> > 
> > v2:
> >   - Use a function to encapsulate more functionality. (Christian)
> >   - Use the IP version to determine whether CG is supported or not. (Lijo)
> > 
> > Signed-off-by: Lang Yu 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 10 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
> >   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  3 +++
> >   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  3 +++
> >   5 files changed, 20 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index d426de48d299..be1f03b02af6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -876,3 +876,13 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device 
> > *adev)
> > return 0;
> >   }
> > +
> > +bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev)
> > +{
> > +   switch (adev->ip_versions[GC_HWIP][0]) {
> > +   case IP_VERSION(10, 1, 3):
> > +   return false;
> > +   default:
> > +   return true;
> > +   }
> > +}
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> > index 93505bb0a36c..b916e73c7de1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> > @@ -338,4 +338,5 @@ uint64_t amdgpu_gmc_vram_mc2pa(struct amdgpu_device 
> > *adev, uint64_t mc_addr);
> >   uint64_t amdgpu_gmc_vram_pa(struct amdgpu_device *adev, struct amdgpu_bo 
> > *bo);
> >   uint64_t amdgpu_gmc_vram_cpu_pa(struct amdgpu_device *adev, struct 
> > amdgpu_bo *bo);
> >   int amdgpu_gmc_vram_checking(struct amdgpu_device *adev);
> > +bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev);
> >   #endif
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > index 73ab0eebe4e2..4e46f618d6c1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > @@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void 
> > *handle, u32 *flags)
> >   {
> > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > +   if (!amdgpu_gmc_cg_enabled(adev))
> > +   return;
> > +
> 
> I think Christian suggested the amdgpu_gmc_cg_enabled function assuming it's
> common logic for all ASICs based on flags. That assumption has changed: now
> the logic is that a specific IP version doesn't enable CG, which is known
> beforehand. So we could keep the check in the specific IP version block
> itself (gmc 10 in this example). No need to call another common function
> which checks the IP version again.

Thanks. You mean just like this?

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..bddaf2417344 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(10, 1, 3))
+   return;
+
adev->mmhub.funcs->get_clockgating(adev, flags);

if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))

Regards,
Lang

> Thanks,
> Lijo
> 
> > adev->mmhub.funcs->get_clockgating(adev, flags);
> > if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > index ca9841d5669f..ff9dff2a6cf1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > @@ -1695,6 +1695,9 @@ static void gmc_v8_0_get_clockgating_state(void 
> > *handle, u32 *flags)
> > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > int data;
> > +   if (!amdgpu_gmc_cg_enabled(adev))
> > +   return;
> > +
> > if (amdgpu_sriov_vf(adev))
> > *flags = 0;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > index 4595027a8c63..faf017609dfe 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > @@ -1952,6 +1952,9 @@ static void gmc_v9_0_get_clockgating_state(void 
> > *handle, u32 *flags)
> >   {
> > struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > +   

Re: [PATCH v2] drm/amdgpu: add safeguards for querying GMC CG state

2022-01-28 Thread Lazar, Lijo




On 1/28/2022 12:24 PM, Lang Yu wrote:

We observed a GPU hang when querying GMC CG state (i.e.,
cat amdgpu_pm_info) on cyan skillfish. Actually, cyan
skillfish doesn't support any CG features.

Only allow ASICs which support GMC CG features to access the
related registers. Some ASICs support GMC CG but do not set
cg_flags, so use the GC IP version instead of cg_flags to
determine whether GMC CG is supported.

v2:
  - Use a function to encapsulate more functionality. (Christian)
  - Use the IP version to determine whether CG is supported or not. (Lijo)

Signed-off-by: Lang Yu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 10 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  3 +++
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  3 +++
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  3 +++
  5 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index d426de48d299..be1f03b02af6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -876,3 +876,13 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
 
 	return 0;
 }
+
+bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev)
+{
+   switch (adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(10, 1, 3):
+   return false;
+   default:
+   return true;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 93505bb0a36c..b916e73c7de1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -338,4 +338,5 @@ uint64_t amdgpu_gmc_vram_mc2pa(struct amdgpu_device *adev, 
uint64_t mc_addr);
  uint64_t amdgpu_gmc_vram_pa(struct amdgpu_device *adev, struct amdgpu_bo *bo);
  uint64_t amdgpu_gmc_vram_cpu_pa(struct amdgpu_device *adev, struct amdgpu_bo 
*bo);
  int amdgpu_gmc_vram_checking(struct amdgpu_device *adev);
+bool amdgpu_gmc_cg_enabled(struct amdgpu_device *adev);
  #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 73ab0eebe4e2..4e46f618d6c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -1156,6 +1156,9 @@ static void gmc_v10_0_get_clockgating_state(void *handle, 
u32 *flags)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+	if (!amdgpu_gmc_cg_enabled(adev))
+		return;
+


I think Christian suggested the amdgpu_gmc_cg_enabled function assuming it's 
common logic for all ASICs based on flags. That assumption has changed: now 
the logic is that a specific IP version doesn't enable CG, which is known 
beforehand. So we could keep the check in the specific IP version block 
itself (gmc 10 in this example). No need to call another common function 
which checks the IP version again.


Thanks,
Lijo


 	adev->mmhub.funcs->get_clockgating(adev, flags);
 
 	if (adev->ip_versions[ATHUB_HWIP][0] >= IP_VERSION(2, 1, 0))

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index ca9841d5669f..ff9dff2a6cf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1695,6 +1695,9 @@ static void gmc_v8_0_get_clockgating_state(void *handle, 
u32 *flags)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
int data;
  
+	if (!amdgpu_gmc_cg_enabled(adev))
+		return;
+
if (amdgpu_sriov_vf(adev))
*flags = 0;
  
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c

index 4595027a8c63..faf017609dfe 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1952,6 +1952,9 @@ static void gmc_v9_0_get_clockgating_state(void *handle, 
u32 *flags)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+	if (!amdgpu_gmc_cg_enabled(adev))
+		return;
+
 	adev->mmhub.funcs->get_clockgating(adev, flags);
  
  	athub_v1_0_get_clockgating(adev, flags);