RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-13 Thread Tao, Yintian
Hi  Monk

Thanks for your review. I have changed the code according to your comments. 
Please help have a review again

Best Regards
Yintian Tao

-Original Message-
From: Liu, Monk  
Sent: 2020年4月11日 16:59
To: Tao, Yintian ; Alex Deucher 
Cc: Deucher, Alexander ; Koenig, Christian 
; amd-gfx list 
Subject: RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

Hi Yintian

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c0f9a651dc06..4f9780aabf5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);

You patch looks simply bail out if you are not under "debug" condition, but 
that looks weird: you are forbidding KIQ to do the debugfs access during 
non-debug condition , which is an overkill to me .

The ideal logic is :
1) When we are under "tdr_deubg" mode, we allow debugfs to handled by MMIO 
access, just like bare-metal
2) When we are not under "tdr_debug" mode (e.g.: no tdr triggered ) we shall 
still allow debugfs to work, but all the register access can go with KIQ way .

Looks you are dropping 2 totally  ... 

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Tao, Yintian
Sent: Thursday, April 9, 2020 11:23 PM
To: Alex Deucher 
Cc: Deucher, Alexander ; Koenig, Christian 
; amd-gfx list 
Subject: RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

Hi  Alex

Many thanks for your review.



-Original Message-
From: Alex Deucher 
Sent: 2020年4月9日 23:21
To: Tao, Yintian 
Cc: Koenig, Christian ; Deucher, Alexander 
; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

On Thu, Apr 9, 2020 at 10:54 AM Yintian Tao  wrote:
>
> Under bare metal, there is no more else to take care of the GPU 
> register access through MMIO.
> Under Virtualization, to access GPU register is implemented through 
> KIQ during run-time due to world-switch.
>
> Therefore, under SR-IOV user can only access debugfs to r/w GPU 
> registers when meets all three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> v2: merge amdgpu_virt_can_access_debugfs() into
> amdgpu_virt_enable_access_debugfs()
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>  4 files changed, 108 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..1a4894fa3693 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> if (use_bank) {
> if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
> (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return -EINVAL;
> }
> mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_re

RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-11 Thread Liu, Monk
Hi Yintian

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c0f9a651dc06..4f9780aabf5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);

You patch looks simply bail out if you are not under "debug" condition, but 
that looks weird: you are forbidding KIQ to do the debugfs access during 
non-debug condition , which is an overkill to me .

The ideal logic is :
1) When we are under "tdr_deubg" mode, we allow debugfs to handled by MMIO 
access, just like bare-metal
2) When we are not under "tdr_debug" mode (e.g.: no tdr triggered ) we shall 
still allow debugfs to work, but all the register access can go with KIQ way .

Looks you are dropping 2 totally  ... 

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Tao, Yintian
Sent: Thursday, April 9, 2020 11:23 PM
To: Alex Deucher 
Cc: Deucher, Alexander ; Koenig, Christian 
; amd-gfx list 
Subject: RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

Hi  Alex

Many thanks for your review.



-Original Message-
From: Alex Deucher 
Sent: 2020年4月9日 23:21
To: Tao, Yintian 
Cc: Koenig, Christian ; Deucher, Alexander 
; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

On Thu, Apr 9, 2020 at 10:54 AM Yintian Tao  wrote:
>
> Under bare metal, there is no more else to take care of the GPU 
> register access through MMIO.
> Under Virtualization, to access GPU register is implemented through 
> KIQ during run-time due to world-switch.
>
> Therefore, under SR-IOV user can only access debugfs to r/w GPU 
> registers when meets all three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> v2: merge amdgpu_virt_can_access_debugfs() into
> amdgpu_virt_enable_access_debugfs()
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>  4 files changed, 108 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..1a4894fa3693 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> if (use_bank) {
> if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
> (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return -EINVAL;
> }
> mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
>  

RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Tao, Yintian
Hi  Alex

Many thanks for your review.



-Original Message-
From: Alex Deucher  
Sent: 2020年4月9日 23:21
To: Tao, Yintian 
Cc: Koenig, Christian ; Deucher, Alexander 
; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

On Thu, Apr 9, 2020 at 10:54 AM Yintian Tao  wrote:
>
> Under bare metal, there is no more else to take care of the GPU 
> register access through MMIO.
> Under Virtualization, to access GPU register is implemented through 
> KIQ during run-time due to world-switch.
>
> Therefore, under SR-IOV user can only access debugfs to r/w GPU 
> registers when meets all three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> v2: merge amdgpu_virt_can_access_debugfs() into
> amdgpu_virt_enable_access_debugfs()
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>  4 files changed, 108 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..1a4894fa3693 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> if (use_bank) {
> if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
> (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return -EINVAL;
> }
> mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> @@ -275,6 +286,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -304,6 +316,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct 
> file *f, const char __user
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -311,6 +327,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> @@ -325,6 +342,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r

Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Alex Deucher
On Thu, Apr 9, 2020 at 10:54 AM Yintian Tao  wrote:
>
> Under bare metal, there is no more else to take
> care of the GPU register access through MMIO.
> Under Virtualization, to access GPU register is
> implemented through KIQ during run-time due to
> world-switch.
>
> Therefore, under SR-IOV user can only access
> debugfs to r/w GPU registers when meets all
> three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> v2: merge amdgpu_virt_can_access_debugfs() into
> amdgpu_virt_enable_access_debugfs()
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>  4 files changed, 108 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..1a4894fa3693 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> if (use_bank) {
> if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
> (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return -EINVAL;
> }
> mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> @@ -275,6 +286,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -304,6 +316,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct 
> file *f, const char __user
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -311,6 +327,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> @@ -325,6 +342,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -354,6 +372,10 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -362,6 +384,7 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
> *f, char __user *buf,
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> 

[PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Yintian Tao
Under bare metal, there is no more else to take
care of the GPU register access through MMIO.
Under Virtualization, to access GPU register is
implemented through KIQ during run-time due to
world-switch.

Therefore, under SR-IOV user can only access
debugfs to r/w GPU registers when meets all
three conditions below.
- amdgpu_gpu_recovery=0
- TDR happened
- in_gpu_reset=0

v2: merge amdgpu_virt_can_access_debugfs() into
amdgpu_virt_enable_access_debugfs()

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
 4 files changed, 108 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c0f9a651dc06..1a4894fa3693 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
if (r < 0)
return r;
 
+   r = amdgpu_virt_enable_access_debugfs(adev);
+   if (r < 0)
+   return r;
+
if (use_bank) {
if ((sh_bank != 0x && sh_bank >= 
adev->gfx.config.max_sh_per_se) ||
(se_bank != 0x && se_bank >= 
adev->gfx.config.max_shader_engines)) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return -EINVAL;
}
mutex_lock(>grbm_idx_mutex);
@@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, struct 
file *f,
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
if (r < 0)
return r;
 
+   r = amdgpu_virt_enable_access_debugfs(adev);
+   if (r < 0)
+   return r;
+
while (size) {
uint32_t value;
 
@@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
 
@@ -275,6 +286,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -304,6 +316,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
if (r < 0)
return r;
 
+   r = amdgpu_virt_enable_access_debugfs(adev);
+   if (r < 0)
+   return r;
+
while (size) {
uint32_t value;
 
@@ -311,6 +327,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
 
@@ -325,6 +342,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -354,6 +372,10 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (r < 0)
return r;
 
+   r = amdgpu_virt_enable_access_debugfs(adev);
+   if (r < 0)
+   return r;
+
while (size) {
uint32_t value;
 
@@ -362,6 +384,7 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
 
@@ -374,6 +397,7 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -403,6 

RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Tao, Yintian
Hi  Christian


Many thanks for your review. I will submit one new patch according to your 
suggestion.


Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: 2020年4月9日 20:42
To: Tao, Yintian ; Deucher, Alexander 
; Deng, Emily 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

Am 09.04.20 um 08:01 schrieb Yintian Tao:
> Under bare metal, there is no more else to take care of the GPU 
> register access through MMIO.
> Under Virtualization, to access GPU register is implemented through 
> KIQ during run-time due to world-switch.
>
> Therefore, under SR-IOV user can only access debugfs to r/w GPU 
> registers when meets all three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 83 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  7 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 23 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>   4 files changed, 114 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..4f9780aabf5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
>   if (r < 0)
>   return r;
>   
> + if (!amdgpu_virt_can_access_debugfs(adev))
> + return -EINVAL;
> + else
> + amdgpu_virt_enable_access_debugfs(adev);
> +

It would be better to merge these two functions together.

E.g. that amdgpu_virt_enable_access_debugfs() returns an error if we can't 
allow this.

And -EINVAL is maybe not the right thing here, since this is not caused by an 
invalid value.

Maybe use -EPERM instead.

Regards,
Christian.

>   if (use_bank) {
>   if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
>   (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
> + amdgpu_virt_disable_access_debugfs(adev);
>   return -EINVAL;
>   }
>   mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +213,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
>   
> + amdgpu_virt_disable_access_debugfs(adev);
>   return result;
>   }
>   
> @@ -255,6 +262,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
>   if (r < 0)
>   return r;
>   
> + if (!amdgpu_virt_can_access_debugfs(adev))
> + return -EINVAL;
> + else
> + amdgpu_virt_enable_access_debugfs(adev);
> +
>   while (size) {
>   uint32_t value;
>   
> @@ -263,6 +275,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
>   if (r) {
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
> + amdgpu_virt_disable_access_debugfs(adev);
>   return r;
>   }
>   
> @@ -275,6 +288,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
>   
> + amdgpu_virt_disable_access_debugfs(adev);
>   return result;
>   }
>   
> @@ -304,6 +318,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct 
> file *f, const char __user
>   if (r < 0)
>   return r;
>   
> + if (!amdgpu_virt_can_access_debugfs(adev))
> + return -EINVAL;
> + else
> + amdgpu_virt_enable_access_debugfs(adev);
> +
>   while (size) {
>   uint32_t value;
>   
> @@ -311,6 +330,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
>   if (r) {
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
> + amdgpu_virt_disable_acc

Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Christian König

Am 09.04.20 um 08:01 schrieb Yintian Tao:

Under bare metal, there is no more else to take
care of the GPU register access through MMIO.
Under Virtualization, to access GPU register is
implemented through KIQ during run-time due to
world-switch.

Therefore, under SR-IOV user can only access
debugfs to r/w GPU registers when meets all
three conditions below.
- amdgpu_gpu_recovery=0
- TDR happened
- in_gpu_reset=0

Signed-off-by: Yintian Tao 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 83 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  7 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 23 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
  4 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c0f9a651dc06..4f9780aabf5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
if (r < 0)
return r;
  
+	if (!amdgpu_virt_can_access_debugfs(adev))

+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+


It would be better to merge these two functions together.

E.g. that amdgpu_virt_enable_access_debugfs() returns an error if we 
can't allow this.


And -EINVAL is maybe not the right thing here, since this is not caused 
by an invalid value.


Maybe use -EPERM instead.

Regards,
Christian.


if (use_bank) {
if ((sh_bank != 0x && sh_bank >= 
adev->gfx.config.max_sh_per_se) ||
(se_bank != 0x && se_bank >= 
adev->gfx.config.max_shader_engines)) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return -EINVAL;
}
mutex_lock(>grbm_idx_mutex);
@@ -207,6 +213,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, struct 
file *f,
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
  
+	amdgpu_virt_disable_access_debugfs(adev);

return result;
  }
  
@@ -255,6 +262,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file *f, char __user *buf,

if (r < 0)
return r;
  
+	if (!amdgpu_virt_can_access_debugfs(adev))

+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
while (size) {
uint32_t value;
  
@@ -263,6 +275,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file *f, char __user *buf,

if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
  
@@ -275,6 +288,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file *f, char __user *buf,

pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
  
+	amdgpu_virt_disable_access_debugfs(adev);

return result;
  }
  
@@ -304,6 +318,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file *f, const char __user

if (r < 0)
return r;
  
+	if (!amdgpu_virt_can_access_debugfs(adev))

+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
while (size) {
uint32_t value;
  
@@ -311,6 +330,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file *f, const char __user

if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
  
@@ -325,6 +345,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file *f, const char __user

pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
  
+	amdgpu_virt_disable_access_debugfs(adev);

return result;
  }
  
@@ -354,6 +375,11 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file *f, char __user *buf,

if (r < 0)
return r;
  
+	if (!amdgpu_virt_can_access_debugfs(adev))

+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
while (size) {
uint32_t value;
  
@@ -362,6 +388,7 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file *f, char __user *buf,

if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);

[PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Yintian Tao
Under bare metal, there is no more else to take
care of the GPU register access through MMIO.
Under Virtualization, to access GPU register is
implemented through KIQ during run-time due to
world-switch.

Therefore, under SR-IOV user can only access
debugfs to r/w GPU registers when meets all
three conditions below.
- amdgpu_gpu_recovery=0
- TDR happened
- in_gpu_reset=0

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 83 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 23 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
 4 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c0f9a651dc06..4f9780aabf5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
if (use_bank) {
if ((sh_bank != 0x && sh_bank >= 
adev->gfx.config.max_sh_per_se) ||
(se_bank != 0x && se_bank >= 
adev->gfx.config.max_shader_engines)) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return -EINVAL;
}
mutex_lock(>grbm_idx_mutex);
@@ -207,6 +213,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, struct 
file *f,
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -255,6 +262,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
while (size) {
uint32_t value;
 
@@ -263,6 +275,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
 
@@ -275,6 +288,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -304,6 +318,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
while (size) {
uint32_t value;
 
@@ -311,6 +330,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
 
@@ -325,6 +345,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
 
+   amdgpu_virt_disable_access_debugfs(adev);
return result;
 }
 
@@ -354,6 +375,11 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);
+
while (size) {
uint32_t value;
 
@@ -362,6 +388,7 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (r) {
pm_runtime_mark_last_busy(adev->ddev->dev);
pm_runtime_put_autosuspend(adev->ddev->dev);
+   amdgpu_virt_disable_access_debugfs(adev);
return r;
}
 
@@ -374,6 +401,7 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
pm_runtime_mark_last_busy(adev->ddev->dev);