[PATCH] drm/amdgpu: Make CPX mode auto default in NPS4

2024-05-22 Thread Rajneesh Bhardwaj
On GFXIP 9.4.3, make CPX mode the default compute mode if the node is
set up in NPS4 memory partition mode. This change applies only to
dGPUs; APUs continue to use TPX mode.
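The decision this patch changes can be modeled as a small standalone function. This is a sketch mirroring the hunk in this patch; the enum names are abbreviated stand-ins for the AMDGPU_*_PARTITION_MODE values, and the SPX fallback is an assumption for cases the hunk does not show.

```c
/* Sketch of __aqua_vanjaram_get_auto_mode() after this patch: in NPS4
 * (memory partitions == num_xcc / 2) a dGPU now picks CPX where it
 * previously picked QPX; APUs keep TPX. */
enum partition_mode { SPX_MODE, DPX_MODE, TPX_MODE, QPX_MODE, CPX_MODE };

static enum partition_mode auto_compute_mode(int num_mem_partitions,
					     int num_xcc, int is_apu)
{
	if (num_mem_partitions == num_xcc / 2)
		return is_apu ? TPX_MODE : CPX_MODE;
	if (num_mem_partitions == 2 && !is_apu)
		return DPX_MODE;
	return SPX_MODE; /* assumed fallback for unpartitioned nodes */
}
```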

Cc: Lijo Lazar 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c 
b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
index d62cfa4e2d2b..2c9a0aa41e2d 100644
--- a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
+++ b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
@@ -422,7 +422,7 @@ __aqua_vanjaram_get_auto_mode(struct amdgpu_xcp_mgr *xcp_mgr)
 
if (adev->gmc.num_mem_partitions == num_xcc / 2)
return (adev->flags & AMD_IS_APU) ? AMDGPU_TPX_PARTITION_MODE :
-   AMDGPU_QPX_PARTITION_MODE;
+   AMDGPU_CPX_PARTITION_MODE;
 
if (adev->gmc.num_mem_partitions == 2 && !(adev->flags & AMD_IS_APU))
return AMDGPU_DPX_PARTITION_MODE;
-- 
2.34.1



[PATCH] drm/amdgpu: Update CGCG settings for GFXIP 9.4.3

2024-04-21 Thread Rajneesh Bhardwaj
Tune the coarse-grain clock gating idle threshold and the RLC idle
timeout to achieve better kernel launch latency.
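As a sanity check on the new threshold: 0x2710 is 10000 decimal, and if the RLC counts this field in ticks of a 25 MHz reference clock (an assumption, not stated in the patch), that matches the "400us" hysteresis in the new comment.

```c
/* Convert the CGCG_GFX_IDLE_THRESHOLD tick count to microseconds,
 * assuming (not confirmed by the patch) a 25 MHz reference clock:
 * 10000 ticks / 25 ticks-per-us = 400 us. */
static unsigned int cgcg_hysteresis_us(unsigned int threshold_ticks,
				       unsigned int refclk_mhz)
{
	return threshold_ticks / refclk_mhz;
}
```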

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index 835004187a58..813528fb4f2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
@@ -2404,10 +2404,10 @@ gfx_v9_4_3_xcc_update_coarse_grain_clock_gating(struct amdgpu_device *adev,
	if (def != data)
		WREG32_SOC15(GC, GET_INST(GC, xcc_id), regRLC_CGTT_MGCG_OVERRIDE, data);

-	/* enable cgcg FSM(0x363F) */
+	/* CGCG Hysteresis: 400us */
	def = RREG32_SOC15(GC, GET_INST(GC, xcc_id), regRLC_CGCG_CGLS_CTRL);

-	data = (0x36
+	data = (0x2710
		<< RLC_CGCG_CGLS_CTRL__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
		   RLC_CGCG_CGLS_CTRL__CGCG_EN_MASK;
if (adev->cg_flags & AMD_CG_SUPPORT_GFX_CGLS)
@@ -2416,10 +2416,10 @@ gfx_v9_4_3_xcc_update_coarse_grain_clock_gating(struct amdgpu_device *adev,
	if (def != data)
		WREG32_SOC15(GC, GET_INST(GC, xcc_id), regRLC_CGCG_CGLS_CTRL, data);

-	/* set IDLE_POLL_COUNT(0x00900100) */
+	/* set IDLE_POLL_COUNT(0x33450100) */
	def = RREG32_SOC15(GC, GET_INST(GC, xcc_id), regCP_RB_WPTR_POLL_CNTL);
	data = (0x0100 << CP_RB_WPTR_POLL_CNTL__POLL_FREQUENCY__SHIFT) |
-		(0x0090 << CP_RB_WPTR_POLL_CNTL__IDLE_POLL_COUNT__SHIFT);
+		(0x3345 << CP_RB_WPTR_POLL_CNTL__IDLE_POLL_COUNT__SHIFT);
	if (def != data)
		WREG32_SOC15(GC, GET_INST(GC, xcc_id), regCP_RB_WPTR_POLL_CNTL, data);
} else {
-- 
2.34.1



[PATCH] drm/ttm: Make ttm shrinkers NUMA aware

2024-04-08 Thread Rajneesh Bhardwaj
Otherwise the nid is always passed as 0 during memory reclaim, so make
the TTM shrinkers NUMA aware.
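The effect of the flag can be sketched in isolation: for a NUMA-aware shrinker the reclaim core passes the node being reclaimed, while an unaware shrinker only ever sees node 0. The constant value and helper below are illustrative, not the kernel's implementation.

```c
/* Illustrative model of SHRINKER_NUMA_AWARE: without it, the reclaim
 * core hands every scan request to the shrinker with nid == 0, so a
 * per-node pool would only ever shrink node 0's caches. */
#define SHRINKER_NUMA_AWARE 0x1u /* illustrative value */

static int shrink_ctl_nid(unsigned int shrinker_flags, int reclaim_nid)
{
	return (shrinker_flags & SHRINKER_NUMA_AWARE) ? reclaim_nid : 0;
}
```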

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/ttm/ttm_pool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index dbc96984d331..514261f44b78 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -794,7 +794,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
			    &ttm_pool_debugfs_shrink_fops);
 #endif
 
-   mm_shrinker = shrinker_alloc(0, "drm-ttm_pool");
+   mm_shrinker = shrinker_alloc(SHRINKER_NUMA_AWARE, "drm-ttm_pool");
if (!mm_shrinker)
return -ENOMEM;
 
-- 
2.34.1



[PATCH] drm/ttm: Implement strict NUMA pool allocations

2024-03-22 Thread Rajneesh Bhardwaj
This change allows TTM to honor NUMA-localized allocations, which can
result in significant performance improvements on multi-socket NUMA
systems. On GFXIP 9.4.3 based AMD APUs, we see manifold benefits from
this change, resulting not only in a ~10% performance improvement in
certain benchmarks but also in more consistent and less sporadic
results, especially when NUMA balancing is not explicitly disabled. In
certain scenarios, workloads show run-to-run variability, e.g. HPL
would show a ~10x performance drop after running back to back 4-5
times and would recover on a subsequent run. This is seen with other
memory-intensive workloads too. When the CPU caches were dropped, e.g.
via sudo sysctl -w vm.drop_caches=1, the variability was reduced but
performance was still well below that of a good run.

Use of the __GFP_THISNODE flag ensures that during memory allocation
the kernel prioritizes allocations from the local or closest NUMA
node, thereby reducing memory access latency. When memory is allocated
with __GFP_THISNODE, allocations will predominantly be satisfied from
the local node; consequently, the shrinkers may prioritize reclaiming
memory from caches associated with the local node to maintain memory
locality and minimize latency, thereby providing better shrinker
targeting.

Reduced memory pressure on remote nodes can also indirectly influence
shrinker behavior by potentially reducing the frequency and intensity
of memory reclamation operations on remote nodes, and could improve
overall system performance.

While this change could be beneficial in general, i.e. without the use
of a module parameter, in the absence of widespread testing limit
it to the AMD GFXIP 9.4.3 based ttm pool initializations only.
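The allocation-flag decision described above can be sketched as follows. This is a model under stated assumptions: the macro values are stand-ins (not the kernel's gfp.h values), and the helper name is hypothetical; it only illustrates that strict mode adds __GFP_THISNODE, which forbids falling back to remote nodes.

```c
/* Illustrative model of strict NUMA pool allocation flags. */
#define MODEL_GFP_KERNEL   0x1u /* stand-in value */
#define MODEL_GFP_THISNODE 0x2u /* stand-in for __GFP_THISNODE */

static unsigned int pool_alloc_gfp(int use_strict_numa_alloc)
{
	unsigned int gfp = MODEL_GFP_KERNEL;

	/* With __GFP_THISNODE the allocation either succeeds on the
	 * pool's node or fails; no silent remote-node fallback. */
	if (use_strict_numa_alloc)
		gfp |= MODEL_GFP_THISNODE;
	return gfp;
}
```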


Cc: Joe Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  7 ++-
 drivers/gpu/drm/ttm/tests/ttm_pool_test.c | 10 +-
 drivers/gpu/drm/ttm/ttm_device.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_pool.c|  7 ++-
 include/drm/ttm/ttm_pool.h|  4 +++-
 7 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 9c62552bec34..96532cfc6230 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -253,6 +253,7 @@ extern int amdgpu_user_partt_mode;
 extern int amdgpu_agp;
 
 extern int amdgpu_wbrf;
+extern bool strict_numa_alloc;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 80b9642f2bc4..a183a6b4493d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -781,6 +781,14 @@ int queue_preemption_timeout_ms = 9000;
 module_param(queue_preemption_timeout_ms, int, 0644);
 MODULE_PARM_DESC(queue_preemption_timeout_ms, "queue preemption timeout in ms 
(1 = Minimum, 9000 = default)");
 
+/**
+ * DOC: strict_numa_alloc(bool)
+ * Policy to force NUMA allocation requests from the proximity NUMA domain only.
+ */
+bool strict_numa_alloc;
+module_param(strict_numa_alloc, bool, 0444);
+MODULE_PARM_DESC(strict_numa_alloc, "Force NUMA allocation requests to be satisfied from the closest node only (false = default)");
+
 /**
  * DOC: debug_evictions(bool)
  * Enable extra debug messages to help determine the cause of evictions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index b0ed10f4de60..a9f78f85e28c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1768,6 +1768,7 @@ static int amdgpu_ttm_reserve_tmr(struct amdgpu_device *adev)
 
 static int amdgpu_ttm_pools_init(struct amdgpu_device *adev)
 {
+   bool policy = true;
int i;
 
if (!adev->gmc.is_app_apu || !adev->gmc.num_mem_partitions)
@@ -1779,11 +1780,15 @@ static int amdgpu_ttm_pools_init(struct amdgpu_device *adev)
if (!adev->mman.ttm_pools)
return -ENOMEM;
 
+	/* The policy depends not only on the module param but also on
+	 * the ASIC-specific use_strict_numa_alloc setting.
+	 */
	for (i = 0; i < adev->gmc.num_mem_partitions; i++) {
		ttm_pool_init(&adev->mman.ttm_pools[i], adev->dev,
			      adev->gmc.mem_partitions[i].numa.node,
-			      false, false);
+			      false, false, policy && strict_numa_alloc);
}
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/ttm/tests/ttm_pool_test.c 
b/drivers/gpu/drm/ttm/tests/ttm_pool_test.c
index 2d9cae8cd984..6ff47aac570a 100644
--- a/drivers/gpu/drm/ttm/t

[Patch v2 1/2] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-13 Thread Rajneesh Bhardwaj
In certain cooperative group dispatch scenarios the default SPI resource
allocation may cause reduced per-CU workgroup occupancy. Set
COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang
scenarios.

Suggested-by: Joseph Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
* Change the enum bitfield value to 4 to avoid an ORing collision with
  the previous member flags.
* Incorporate review feedback from Felix from
  https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg102840.html
  and split one of the suggested gfx11 changes into a separate patch.
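The bitfield point in the changelog can be demonstrated directly: mqd_update_flag values are used as bits, and 3 equals DBG_WA_ENABLE | DBG_WA_DISABLE, so a value of 3 (as in the v1 of this patch) cannot be distinguished from the two existing flags OR-ed together; 4 is the next free bit. The enum below is copied from the patch itself.

```c
/* Flag values from this series; 4 occupies its own bit. */
enum mqd_update_flag {
	UPDATE_FLAG_DBG_WA_ENABLE = 1,
	UPDATE_FLAG_DBG_WA_DISABLE = 2,
	UPDATE_FLAG_IS_GWS = 4, /* quirk for gfx9 IP */
};
```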


 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c| 9 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 +++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 42d881809dc7..697b6d530d12 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -303,6 +303,15 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
update_cu_mask(mm, mqd, minfo, 0);
set_priority(m, q);
 
+   if (minfo && KFD_GC_VERSION(mm->dev) >= IP_VERSION(9, 4, 2)) {
+   if (minfo->update_flag & UPDATE_FLAG_IS_GWS)
+   m->compute_resource_limits |=
+   COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK;
+   else
+   m->compute_resource_limits &=
+   ~COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK;
+   }
+
q->is_active = QUEUE_IS_ACTIVE(*q);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 677281c0793e..80320b8603fc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -532,6 +532,7 @@ struct queue_properties {
 enum mqd_update_flag {
UPDATE_FLAG_DBG_WA_ENABLE = 1,
UPDATE_FLAG_DBG_WA_DISABLE = 2,
+   UPDATE_FLAG_IS_GWS = 4, /* quirk for gfx9 IP */
 };
 
 struct mqd_update_info {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 43eff221eae5..4858112f9a53 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -95,6 +95,7 @@ void kfd_process_dequeue_from_device(struct kfd_process_device *pdd)
 int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
		void *gws)
 {
+	struct mqd_update_info minfo = {0};
	struct kfd_node *dev = NULL;
	struct process_queue_node *pqn;
	struct kfd_process_device *pdd;
@@ -146,9 +147,10 @@ int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
	}

	pdd->qpd.num_gws = gws ? dev->adev->gds.gws_size : 0;
+	minfo.update_flag = gws ? UPDATE_FLAG_IS_GWS : 0;

	return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm,
			pqn->q, &minfo);
 }
 
 void kfd_process_dequeue_from_all_devices(struct kfd_process *p)
-- 
2.34.1



[Patch v2 2/2] drm/amdgpu: Fix implicit assumption in gfx11 debug flags

2024-02-13 Thread Rajneesh Bhardwaj
The gfx11 debug flags mask is currently set with an implicit assumption
that no other mqd update flags exist. This needs to be fixed for the
newly introduced UPDATE_FLAG_IS_GWS flag from the previous patch.
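The failure mode is easy to show in isolation: once UPDATE_FLAG_IS_GWS can be OR-ed into update_flag, an equality test against UPDATE_FLAG_DBG_WA_ENABLE silently stops matching, while a bitwise test keeps working. The flag values are taken from this series; the two helpers are illustrative.

```c
enum mqd_update_flag {
	UPDATE_FLAG_DBG_WA_ENABLE = 1,
	UPDATE_FLAG_DBG_WA_DISABLE = 2,
	UPDATE_FLAG_IS_GWS = 4,
};

/* Old check: breaks as soon as any other flag is OR-ed in. */
static int wa_enabled_eq(unsigned int update_flag)
{
	return update_flag == UPDATE_FLAG_DBG_WA_ENABLE;
}

/* Fixed check: tests only the bit of interest. */
static int wa_enabled_and(unsigned int update_flag)
{
	return (update_flag & UPDATE_FLAG_DBG_WA_ENABLE) != 0;
}
```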

Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
index d722cbd31783..826bc4f6c8a7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
@@ -55,8 +55,8 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m = get_mqd(mqd);
 
if (has_wa_flag) {
-   uint32_t wa_mask = minfo->update_flag == 
UPDATE_FLAG_DBG_WA_ENABLE ?
-   0x : 0x;
+   uint32_t wa_mask =
+   (minfo->update_flag & UPDATE_FLAG_DBG_WA_ENABLE) ? 
0x : 0x;
 
m->compute_static_thread_mgmt_se0 = wa_mask;
m->compute_static_thread_mgmt_se1 = wa_mask;
-- 
2.34.1



[PATCH 2/2] drm/amdgpu: Fix implicit assumption in gfx11 debug flags

2024-02-09 Thread Rajneesh Bhardwaj
The gfx11 debug flags mask is currently set with an implicit assumption
that no other mqd update flags exist. This needs to be fixed for the
newly introduced UPDATE_FLAG_IS_GWS flag from the previous patch.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
index d722cbd31783..826bc4f6c8a7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
@@ -55,8 +55,8 @@ static void update_cu_mask(struct mqd_manager *mm, void *mqd,
m = get_mqd(mqd);
 
if (has_wa_flag) {
-   uint32_t wa_mask = minfo->update_flag == 
UPDATE_FLAG_DBG_WA_ENABLE ?
-   0x : 0x;
+   uint32_t wa_mask =
+   (minfo->update_flag & UPDATE_FLAG_DBG_WA_ENABLE) ? 
0x : 0x;
 
m->compute_static_thread_mgmt_se0 = wa_mask;
m->compute_static_thread_mgmt_se1 = wa_mask;
-- 
2.34.1



[PATCH 1/2] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-09 Thread Rajneesh Bhardwaj
In certain cooperative group dispatch scenarios the default SPI resource
allocation may cause reduced per-CU workgroup occupancy. Set
COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang
scenarios.

Suggested-by: Joseph Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
* Incorporate review feedback from Felix from
  https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg102840.html
  and split one of the suggested gfx11 changes into a separate patch.


 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c| 9 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 +++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 42d881809dc7..697b6d530d12 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -303,6 +303,15 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
update_cu_mask(mm, mqd, minfo, 0);
set_priority(m, q);
 
+   if (minfo && KFD_GC_VERSION(mm->dev) >= IP_VERSION(9, 4, 2)) {
+   if (minfo->update_flag & UPDATE_FLAG_IS_GWS)
+   m->compute_resource_limits |=
+   COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK;
+   else
+   m->compute_resource_limits &=
+   ~COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK;
+   }
+
q->is_active = QUEUE_IS_ACTIVE(*q);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 677281c0793e..65b504813576 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -532,6 +532,7 @@ struct queue_properties {
 enum mqd_update_flag {
UPDATE_FLAG_DBG_WA_ENABLE = 1,
UPDATE_FLAG_DBG_WA_DISABLE = 2,
+   UPDATE_FLAG_IS_GWS = 3, /* quirk for gfx9 IP */
 };
 
 struct mqd_update_info {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 43eff221eae5..4858112f9a53 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -95,6 +95,7 @@ void kfd_process_dequeue_from_device(struct kfd_process_device *pdd)
 int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
		void *gws)
 {
+	struct mqd_update_info minfo = {0};
	struct kfd_node *dev = NULL;
	struct process_queue_node *pqn;
	struct kfd_process_device *pdd;
@@ -146,9 +147,10 @@ int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
	}

	pdd->qpd.num_gws = gws ? dev->adev->gds.gws_size : 0;
+	minfo.update_flag = gws ? UPDATE_FLAG_IS_GWS : 0;

	return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm,
			pqn->q, &minfo);
 }
 
 void kfd_process_dequeue_from_all_devices(struct kfd_process *p)
-- 
2.34.1



[Patch v2] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-07 Thread Rajneesh Bhardwaj
In certain cooperative group dispatch scenarios the default SPI resource
allocation may cause reduced per-CU workgroup occupancy. Set
COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang
scenarios.

Suggested-by: Joseph Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
* Found a bug in the previously reviewed version
  https://lists.freedesktop.org/archives/amd-gfx/2024-February/104101.html
  since q->is_gws is left unset to keep the GWS count.
* Updated pqm_set_gws to pass minfo holding the GWS state for the active
  queues and use that to apply FORCE_SIMD_DIST_MASK.

 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c| 4 
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 +++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 42d881809dc7..0b71db4c96b5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -303,6 +303,10 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
update_cu_mask(mm, mqd, minfo, 0);
set_priority(m, q);
 
+   if (minfo && KFD_GC_VERSION(mm->dev) >= IP_VERSION(9, 4, 2))
+   m->compute_resource_limits = minfo->gws ?
+   COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK : 0;
+
q->is_active = QUEUE_IS_ACTIVE(*q);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 677281c0793e..f4b327a2d4a8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -542,6 +542,7 @@ struct mqd_update_info {
} cu_mask;
};
enum mqd_update_flag update_flag;
+   bool gws;
 };
 
 /**
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 43eff221eae5..5416a110ced9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -95,6 +95,7 @@ void kfd_process_dequeue_from_device(struct kfd_process_device *pdd)
 int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
		void *gws)
 {
+	struct mqd_update_info minfo = {0};
	struct kfd_node *dev = NULL;
	struct process_queue_node *pqn;
	struct kfd_process_device *pdd;
@@ -146,9 +147,10 @@ int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid,
	}

	pdd->qpd.num_gws = gws ? dev->adev->gds.gws_size : 0;
+	minfo.gws = !!gws;

	return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm,
			pqn->q, &minfo);
 }
 
 void kfd_process_dequeue_from_all_devices(struct kfd_process *p)
-- 
2.34.1



[PATCH] drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards

2024-02-01 Thread Rajneesh Bhardwaj
In certain cooperative group dispatch scenarios the default SPI resource
allocation may cause reduced per-CU workgroup occupancy. Set
COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang
scenarios.

Suggested-by: Joseph Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 42d881809dc7..4b28e7dcb62f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -303,6 +303,10 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
update_cu_mask(mm, mqd, minfo, 0);
set_priority(m, q);
 
+   if (KFD_GC_VERSION(mm->dev) >= IP_VERSION(9, 4, 2))
+   m->compute_resource_limits = q->is_gws ?
+   COMPUTE_RESOURCE_LIMITS__FORCE_SIMD_DIST_MASK : 0;
+
q->is_active = QUEUE_IS_ACTIVE(*q);
 }
 
-- 
2.34.1



[PATCH] drm/ttm: set max_active to recommened default

2023-11-11 Thread Rajneesh Bhardwaj
To maximize per-CPU execution contexts for the work items, use the
recommended setting, i.e. WQ_DFL_ACTIVE (256). There is no apparent
reason to throttle to 16 during process teardown.
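The hunk below passes 0 rather than 256 because alloc_workqueue() treats a max_active of 0 as "use the default", which is WQ_DFL_ACTIVE (256). A sketch of that convention:

```c
/* alloc_workqueue() convention: max_active == 0 selects the
 * recommended default, WQ_DFL_ACTIVE (256). */
#define WQ_DFL_ACTIVE 256

static int effective_max_active(int max_active)
{
	return max_active ? max_active : WQ_DFL_ACTIVE;
}
```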

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/ttm/ttm_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index bc97e3dd40f0..5443c0f19213 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -205,7 +205,7 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,
		return ret;

	bdev->wq = alloc_workqueue("ttm",
-				   WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 16);
+				   WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
if (!bdev->wq) {
ttm_global_release();
return -ENOMEM;
-- 
2.34.1



[Patch v3] drm/ttm: Schedule delayed_delete worker closer

2023-11-11 Thread Rajneesh Bhardwaj
Try to allocate system memory on the NUMA node the device is closest to
and try to run delayed_delete workers on a CPU of this node as well.

To optimize the memory clearing operation when a TTM BO gets freed by
the delayed_delete worker, scheduling it closer to a NUMA node where the
memory was initially allocated helps avoid the cases where the worker
gets randomly scheduled on the CPU cores that are across interconnect
boundaries such as xGMI, PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.
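The scheduling choice can be sketched as a tiny decision function: queue_work_node() prefers a CPU on the given node and falls back to the normal any-CPU path when the node has no usable CPUs, with NUMA_NO_NODE meaning no preference. The helper below is a model, not the kernel's implementation.

```c
/* Illustrative model of where delayed_delete work ends up running. */
#define NUMA_NO_NODE (-1)

static int choose_exec_node(int pool_nid, int node_has_cpus)
{
	if (pool_nid == NUMA_NO_NODE || !node_has_cpus)
		return NUMA_NO_NODE; /* any CPU system-wide */
	return pool_nid;             /* keep the clear near the memory */
}
```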

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
Changes in v3:
 * Use WQ_UNBOUND to address the warning reported by CI pipeline.

 drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
 drivers/gpu/drm/ttm/ttm_device.c | 6 --
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..6f28a77a565b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,13 @@ static void ttm_bo_release(struct kref *kref)
	spin_unlock(&bo->bdev->lru_lock);

	INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-	queue_work(bdev->wq, &bo->delayed_delete);
+
+	/* Schedule the worker on the closest NUMA node. This
+	 * improves performance since system memory might be
+	 * cleared on free and that is best done on a CPU core
+	 * close to it.
+	 */
+	queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
return;
}
 
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 43e27ab77f95..bc97e3dd40f0 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -204,7 +204,8 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,
	if (ret)
		return ret;

-	bdev->wq = alloc_workqueue("ttm", WQ_MEM_RECLAIM | WQ_HIGHPRI, 16);
+	bdev->wq = alloc_workqueue("ttm",
+				   WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 16);
if (!bdev->wq) {
ttm_global_release();
return -ENOMEM;
@@ -213,7 +214,8 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,
	bdev->funcs = funcs;

	ttm_sys_man_init(bdev);
-	ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);
+
+	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);

	bdev->vma_manager = vma_manager;
	spin_lock_init(&bdev->lru_lock);
-- 
2.34.1



[Patch v2] drm/ttm: Schedule delayed_delete worker closer

2023-11-08 Thread Rajneesh Bhardwaj
Try to allocate system memory on the NUMA node the device is closest to
and try to run delayed_delete workers on a CPU of this node as well.

To optimize the memory clearing operation when a TTM BO gets freed by
the delayed_delete worker, scheduling it closer to a NUMA node where the
memory was initially allocated helps avoid the cases where the worker
gets randomly scheduled on the CPU cores that are across interconnect
boundaries such as xGMI, PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---

Changes in v2:
 - Absorbed the feedback provided by Christian in the commit message and
   the comment.

 drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
 drivers/gpu/drm/ttm/ttm_device.c | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..6f28a77a565b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,13 @@ static void ttm_bo_release(struct kref *kref)
	spin_unlock(&bo->bdev->lru_lock);

	INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-	queue_work(bdev->wq, &bo->delayed_delete);
+
+	/* Schedule the worker on the closest NUMA node. This
+	 * improves performance since system memory might be
+	 * cleared on free and that is best done on a CPU core
+	 * close to it.
+	 */
+	queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
return;
}
 
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 43e27ab77f95..72b81a2ee6c7 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -213,7 +214,8 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,
	bdev->funcs = funcs;

	ttm_sys_man_init(bdev);
-	ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);
+
+	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);

	bdev->vma_manager = vma_manager;
	spin_lock_init(&bdev->lru_lock);
-- 
2.34.1



[PATCH] drm/ttm: Schedule delayed_delete worker closer

2023-11-07 Thread Rajneesh Bhardwaj
When a TTM BO is being freed, optimize the clearing operation on the
workqueue by scheduling it closer to the NUMA node where the memory was
allocated. This avoids cases where ttm_bo_delayed_delete gets scheduled
on CPU cores that are across interconnect boundaries such as xGMI,
PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 10 +-
 drivers/gpu/drm/ttm/ttm_device.c |  3 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..0d608441a112 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,15 @@ static void ttm_bo_release(struct kref *kref)
	spin_unlock(&bo->bdev->lru_lock);

	INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-	queue_work(bdev->wq, &bo->delayed_delete);
+	/* Schedule the worker on the closest NUMA node; if no
+	 * CPUs are available there, this falls back to any CPU
+	 * core available system-wide. This helps avoid the
+	 * bottleneck of clearing memory when the worker is
+	 * scheduled on a CPU that is remote to the node where
+	 * the memory is being freed.
+	 */
+
+	queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
return;
}
 
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 43e27ab77f95..72b81a2ee6c7 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -213,7 +213,8 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,
	bdev->funcs = funcs;

	ttm_sys_man_init(bdev);
-	ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);
+
+	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);

	bdev->vma_manager = vma_manager;
	spin_lock_init(&bdev->lru_lock);
spin_lock_init(>lru_lock);
-- 
2.34.1



[Patch v2 1/2] drm/amdgpu: Rework KFD memory max limits

2023-10-02 Thread Rajneesh Bhardwaj
To allow bigger allocations, especially on systems such as GFXIP 9.4.3
that use GTT memory for VRAM allocations, relax the limits to
maximize ROCm allocations.

Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index b5b940485059..c55907ff7dcf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -42,6 +42,7 @@
  * changes to accumulate
  */
 #define AMDGPU_USERPTR_RESTORE_DELAY_MS 1
+#define AMDGPU_RESERVE_MEM_LIMIT   (3UL << 29)
 
 /*
  * Align VRAM availability to 2MB to avoid fragmentation caused by 4K 
allocations in the tail 2MB
@@ -115,11 +116,16 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
return;
 
	si_meminfo(&si);
-   mem = si.freeram - si.freehigh;
+   mem = si.totalram - si.totalhigh;
mem *= si.mem_unit;
 
	spin_lock_init(&kfd_mem_limit.mem_limit_lock);
-   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 4);
+   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 6);
+   if (kfd_mem_limit.max_system_mem_limit < 2 * AMDGPU_RESERVE_MEM_LIMIT)
+   kfd_mem_limit.max_system_mem_limit >>= 1;
+   else
+   kfd_mem_limit.max_system_mem_limit -= AMDGPU_RESERVE_MEM_LIMIT;
+
kfd_mem_limit.max_ttm_mem_limit = ttm_tt_pages_limit() << PAGE_SHIFT;
pr_debug("Kernel memory limit %lluM, TTM limit %lluM\n",
(kfd_mem_limit.max_system_mem_limit >> 20),
-- 
2.34.1
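The new limit arithmetic in the patch above can be checked in isolation. This is a sketch mirroring the hunk (totalram stands for si.totalram * si.mem_unit, high memory ignored): reserve 1/64th of RAM for the kernel, then a further fixed 1.5 GiB (3UL << 29), or halve the limit on small-memory systems where the limit is below twice that reserve.

```c
#include <stdint.h>

#define AMDGPU_RESERVE_MEM_LIMIT (3ULL << 29) /* 1.5 GiB, from the patch */

/* Model of the new kfd_mem_limit.max_system_mem_limit computation. */
static uint64_t kfd_system_mem_limit(uint64_t totalram_bytes)
{
	uint64_t limit = totalram_bytes - (totalram_bytes >> 6);

	if (limit < 2 * AMDGPU_RESERVE_MEM_LIMIT)
		limit >>= 1;
	else
		limit -= AMDGPU_RESERVE_MEM_LIMIT;
	return limit;
}
```

For example, a 16 GiB system keeps roughly 14.25 GiB as the limit, while a 2 GiB system falls into the halving branch.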



[Patch v2 2/2] drm/amdgpu: Use ttm_pages_limit to override vram reporting

2023-10-02 Thread Rajneesh Bhardwaj
On GFXIP 9.4.3 APUs, report memory as per the TTM pages limit in NPS1
mode.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 17 -
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  9 +
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 38b5457baded..131e150d8a93 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -28,6 +28,7 @@
 #include "amdgpu.h"
 #include "amdgpu_gfx.h"
 #include "amdgpu_dma_buf.h"
+#include 
 #include 
 #include 
 #include "amdgpu_xgmi.h"
@@ -806,10 +807,24 @@ void amdgpu_amdkfd_unlock_kfd(struct amdgpu_device *adev)
 u64 amdgpu_amdkfd_xcp_memory_size(struct amdgpu_device *adev, int xcp_id)
 {
u64 tmp;
+   int num_nodes;
s8 mem_id = KFD_XCP_MEM_ID(adev, xcp_id);
 
if (adev->gmc.num_mem_partitions && xcp_id >= 0 && mem_id >= 0) {
-   tmp = adev->gmc.mem_partitions[mem_id].size;
+   if (adev->gmc.is_app_apu && adev->gmc.num_mem_partitions == 1) {
+   num_nodes = num_online_nodes();
+			/* In NPS1 mode, we should restrict the vram reporting
+			 * tied to the ttm_pages_limit, which is 1/2 of the
+			 * system memory. For other partition modes, the HBM is
+			 * already uniformly divided per reported NUMA node. If
+			 * users want to go beyond the default ttm limit and
+			 * maximize the ROCm allocations, they can go up to the
+			 * max ttm and sysmem limits.
+			 */
+
+			tmp = (ttm_tt_pages_limit() << PAGE_SHIFT) / num_nodes;
+		} else {
+   tmp = adev->gmc.mem_partitions[mem_id].size;
+   }
do_div(tmp, adev->xcp_mgr->num_xcp_per_mem_partition);
return ALIGN_DOWN(tmp, PAGE_SIZE);
} else {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 268ee533e7c1..b090cd42f81f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1896,15 +1896,14 @@ static void
 gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
  struct amdgpu_mem_partition_info *mem_ranges)
 {
-   int num_ranges = 0, ret, mem_groups;
struct amdgpu_numa_info numa_info;
int node_ids[MAX_MEM_RANGES];
+   int num_ranges = 0, ret;
int num_xcc, xcc_id;
uint32_t xcc_mask;
 
num_xcc = NUM_XCC(adev->gfx.xcc_mask);
xcc_mask = (1U << num_xcc) - 1;
-   mem_groups = hweight32(adev->aid_mask);
 
for_each_inst(xcc_id, xcc_mask) {
	ret = amdgpu_acpi_get_mem_info(adev, xcc_id, &numa_info);
@@ -1929,12 +1928,6 @@ gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
}
 
adev->gmc.num_mem_partitions = num_ranges;
-
-   /* If there is only partition, don't use entire size */
-   if (adev->gmc.num_mem_partitions == 1) {
-   mem_ranges[0].size = mem_ranges[0].size * (mem_groups - 1);
-   do_div(mem_ranges[0].size, mem_groups);
-   }
 }
 
 static void
-- 
2.34.1



[PATCH 3/3] drm/amdgpu: Use ttm_pages_limit to override vram reporting

2023-09-29 Thread Rajneesh Bhardwaj
On GFXIP 9.4.3 APUs, report the available memory as per the TTM pages
limit in NPS1 mode.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 005ea719d2fd..fadc4d2ed071 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 
 #include "amdgpu.h"
 #include "gmc_v9_0.h"
@@ -1896,7 +1897,8 @@ static void
 gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
  struct amdgpu_mem_partition_info *mem_ranges)
 {
-   int num_ranges = 0, ret, mem_groups;
+   int num_ranges = 0, ret, num_nodes;
+   uint64_t node_size_ttm_override = 0;
struct amdgpu_numa_info numa_info;
int node_ids[MAX_MEM_RANGES];
int num_xcc, xcc_id;
@@ -1904,7 +1906,8 @@ gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
 
num_xcc = NUM_XCC(adev->gfx.xcc_mask);
xcc_mask = (1U << num_xcc) - 1;
-   mem_groups = hweight32(adev->aid_mask);
+   num_nodes = num_online_nodes();
+   node_size_ttm_override = (ttm_tt_pages_limit() << PAGE_SHIFT) / num_nodes;
 
for_each_inst(xcc_id, xcc_mask) {
	ret = amdgpu_acpi_get_mem_info(adev, xcc_id, &numa_info);
@@ -1912,7 +1915,6 @@ gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
continue;
 
if (numa_info.nid == NUMA_NO_NODE) {
-   mem_ranges[0].size = numa_info.size;
mem_ranges[0].numa.node = numa_info.nid;
num_ranges = 1;
break;
@@ -1930,11 +1932,16 @@ gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
 
adev->gmc.num_mem_partitions = num_ranges;
 
-   /* If there is only partition, don't use entire size */
-   if (adev->gmc.num_mem_partitions == 1) {
-   mem_ranges[0].size = mem_ranges[0].size * (mem_groups - 1);
-   do_div(mem_ranges[0].size, mem_groups);
-   }
+   /* In NPS1 mode, we should restrict the vram reporting
+* tied to the ttm_pages_limit which is 1/2 of the
+* system memory. For other partition modes, the HBM is
+* uniformly divided already per numa node reported. If
+* user wants to go beyond the default ttm limit and
+* maximize the ROCm allocations, they can go up to max
+* ttm and sysmem limits.
+*/
+   if (adev->gmc.num_mem_partitions == 1)
+   mem_ranges[0].size = node_size_ttm_override;
 }
 
 static void
-- 
2.34.1



[PATCH 2/3] drm/amdgpu: Initialize acpi mem ranges after TTM

2023-09-29 Thread Rajneesh Bhardwaj
Move ttm init before acpi mem range init so we can use ttm_pages_limit
to override vram size for GFXIP 9.4.3. The vram size override change
will be introduced in a future commit.

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 268ee533e7c1..005ea719d2fd 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -2190,17 +2190,17 @@ static int gmc_v9_0_sw_init(void *handle)
 
amdgpu_gmc_get_vbios_allocations(adev);
 
+   /* Memory manager */
+   r = amdgpu_bo_init(adev);
+   if (r)
+   return r;
+
if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3)) {
r = gmc_v9_0_init_mem_ranges(adev);
if (r)
return r;
}
 
-   /* Memory manager */
-   r = amdgpu_bo_init(adev);
-   if (r)
-   return r;
-
r = gmc_v9_0_gart_init(adev);
if (r)
return r;
-- 
2.34.1



[PATCH 1/3] drm/amdgpu: Rework KFD memory max limits

2023-09-29 Thread Rajneesh Bhardwaj
To allow bigger allocations, especially on systems such as GFXIP 9.4.3
that use GTT memory for VRAM allocations, relax the limits to
maximize ROCm allocations.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index b5b940485059..b1c4e9c0e036 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -42,6 +42,7 @@
  * changes to accumulate
  */
 #define AMDGPU_USERPTR_RESTORE_DELAY_MS 1
+#define AMDGPU_RESERVE_MEM_LIMIT   (1UL << 30)
 
 /*
 * Align VRAM availability to 2MB to avoid fragmentation caused by 4K allocations in the tail 2MB
@@ -115,11 +116,16 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
return;
 
	si_meminfo(&si);
-   mem = si.freeram - si.freehigh;
+   mem = si.totalram - si.totalhigh;
mem *= si.mem_unit;
 
	spin_lock_init(&kfd_mem_limit.mem_limit_lock);
-   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 4);
+   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 6);
+   if (kfd_mem_limit.max_system_mem_limit < 2 * AMDGPU_RESERVE_MEM_LIMIT)
+   kfd_mem_limit.max_system_mem_limit >>= 1;
+   else
+   kfd_mem_limit.max_system_mem_limit -= AMDGPU_RESERVE_MEM_LIMIT;
+
kfd_mem_limit.max_ttm_mem_limit = ttm_tt_pages_limit() << PAGE_SHIFT;
pr_debug("Kernel memory limit %lluM, TTM limit %lluM\n",
(kfd_mem_limit.max_system_mem_limit >> 20),
-- 
2.34.1



[PATCH] drm/amdgpu: Hide xcp partition sysfs under SRIOV

2023-08-24 Thread Rajneesh Bhardwaj
XCP partitions should not be visible to the VF for GFXIP 9.4.3.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index b4fdb269f856..b1ca3014a9e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
@@ -863,11 +863,15 @@ static int gfx_v9_4_3_sw_init(void *handle)
if (r)
return r;
 
-   r = amdgpu_gfx_sysfs_init(adev);
+   r = amdgpu_gfx_ras_sw_init(adev);
if (r)
return r;
 
-   return amdgpu_gfx_ras_sw_init(adev);
+
+   if (!amdgpu_sriov_vf(adev))
+   r = amdgpu_gfx_sysfs_init(adev);
+
+   return r;
 }
 
 static int gfx_v9_4_3_sw_fini(void *handle)
@@ -888,7 +892,8 @@ static int gfx_v9_4_3_sw_fini(void *handle)
gfx_v9_4_3_mec_fini(adev);
	amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
gfx_v9_4_3_free_microcode(adev);
-   amdgpu_gfx_sysfs_fini(adev);
+   if (!amdgpu_sriov_vf(adev))
+   amdgpu_gfx_sysfs_fini(adev);
 
return 0;
 }
-- 
2.17.1



[PATCH] drm/amdgpu: Rework memory limits to allow big allocations

2023-08-21 Thread Rajneesh Bhardwaj
Rework the KFD max system memory and TTM limits to allow bigger
system memory allocations, up to 63/64 of the available memory, which is
controlled by the ttm module params pages_limit and page_pool_size. Also,
for NPS1 mode, report the max TTM limit as the available VRAM size. For the
max system memory limit, leave 1GB exclusively outside ROCm allocations,
i.e. on a 16GB system, >14GB can be used by ROCm while still leaving some
memory for other system applications, and on a 128GB system (e.g. GFXIP
9.4.3 APU in NPS1 mode) more than 120GB can be used by ROCm.

Signed-off-by: Rajneesh Bhardwaj 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  5 ++--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 25 +--
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9e18fe5eb190..3387dcdf1bc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -44,6 +44,7 @@
  * changes to accumulate
  */
 #define AMDGPU_USERPTR_RESTORE_DELAY_MS 1
+#define ONE_GB (1UL << 30)
 
 /*
 * Align VRAM availability to 2MB to avoid fragmentation caused by 4K allocations in the tail 2MB
@@ -117,11 +118,11 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
return;
 
	si_meminfo(&si);
-   mem = si.freeram - si.freehigh;
+   mem = si.totalram - si.totalhigh;
mem *= si.mem_unit;
 
	spin_lock_init(&kfd_mem_limit.mem_limit_lock);
-   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 4);
+   kfd_mem_limit.max_system_mem_limit = mem - (mem >> 6) - (ONE_GB);
kfd_mem_limit.max_ttm_mem_limit = ttm_tt_pages_limit() << PAGE_SHIFT;
pr_debug("Kernel memory limit %lluM, TTM limit %lluM\n",
(kfd_mem_limit.max_system_mem_limit >> 20),
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 8447fcada8bb..4962e35df617 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 
 #include "amdgpu.h"
 #include "gmc_v9_0.h"
@@ -1877,6 +1878,7 @@ static void
 gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
  struct amdgpu_mem_partition_info *mem_ranges)
 {
+   uint64_t max_ttm_size = ttm_tt_pages_limit() << PAGE_SHIFT;
int num_ranges = 0, ret, mem_groups;
struct amdgpu_numa_info numa_info;
int node_ids[MAX_MEM_RANGES];
@@ -1913,8 +1915,17 @@ gmc_v9_0_init_acpi_mem_ranges(struct amdgpu_device *adev,
 
/* If there is only partition, don't use entire size */
if (adev->gmc.num_mem_partitions == 1) {
-   mem_ranges[0].size = mem_ranges[0].size * (mem_groups - 1);
-   do_div(mem_ranges[0].size, mem_groups);
+   if (max_ttm_size > mem_ranges[0].size || max_ttm_size <= 0) {
+   /* Report VRAM as 3/4th of available numa memory */
+   mem_ranges[0].size = mem_ranges[0].size * (mem_groups - 1);
+   do_div(mem_ranges[0].size, mem_groups);
+   } else {
+   /* Report VRAM as set by ttm.pages_limit or default ttm
+* limit which is 1/2 of system memory
+*/
+   mem_ranges[0].size = max_ttm_size;
+   }
+   pr_debug("NPS1 mode, setting VRAM size = %llu\n", mem_ranges[0].size);
}
 }
 
@@ -2159,6 +2170,11 @@ static int gmc_v9_0_sw_init(void *handle)
 
amdgpu_gmc_get_vbios_allocations(adev);
 
+   /* Memory manager */
+   r = amdgpu_bo_init(adev);
+   if (r)
+   return r;
+
 #ifdef HAVE_ACPI_DEV_GET_FIRST_MATCH_DEV
if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 3)) {
r = gmc_v9_0_init_mem_ranges(adev);
@@ -2167,11 +2183,6 @@ static int gmc_v9_0_sw_init(void *handle)
}
 #endif
 
-   /* Memory manager */
-   r = amdgpu_bo_init(adev);
-   if (r)
-   return r;
-
r = gmc_v9_0_gart_init(adev);
if (r)
return r;
-- 
2.17.1



[Patch v2] drm/ttm: Use init_on_free to delay release TTM BOs

2023-07-07 Thread Rajneesh Bhardwaj
Delay the release of TTM BOs when the kernel default setting is
init_on_free. This offloads the overhead of clearing the system memory
to the work item and potentially a different CPU. This can be very
beneficial when the application does a lot of malloc/free style
allocations of system memory.

Reviewed-by: Christian König 
Signed-off-by: Rajneesh Bhardwaj 
---
Changes in v2:
- Updated commit message as per Christian's feedback

 drivers/gpu/drm/ttm/ttm_bo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 326a3d13a829..bd2e7e4f497a 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -347,6 +347,7 @@ static void ttm_bo_release(struct kref *kref)
 
if (!dma_resv_test_signaled(bo->base.resv,
DMA_RESV_USAGE_BOOKKEEP) ||
+   (want_init_on_free() && (bo->ttm != NULL)) ||
!dma_resv_trylock(bo->base.resv)) {
		/* The BO is not idle, resurrect it for delayed destroy */
ttm_bo_flush_all_fences(bo);
-- 
2.17.1



[PATCH] drm/ttm: Use init_on_free to early release TTM BOs

2023-07-05 Thread Rajneesh Bhardwaj
Release TTM BOs early when the kernel default setting is init_on_free,
to wipe out and reinitialize system memory chunks. This could
potentially optimize performance when an application does a lot of
malloc/free style allocations with unified system memory.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 326a3d13a829..bd2e7e4f497a 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -347,6 +347,7 @@ static void ttm_bo_release(struct kref *kref)
 
if (!dma_resv_test_signaled(bo->base.resv,
DMA_RESV_USAGE_BOOKKEEP) ||
+   (want_init_on_free() && (bo->ttm != NULL)) ||
!dma_resv_trylock(bo->base.resv)) {
		/* The BO is not idle, resurrect it for delayed destroy */
ttm_bo_flush_all_fences(bo);
-- 
2.17.1



[PATCH] drm/amdgpu: Fix the kerneldoc description

2022-11-05 Thread Rajneesh Bhardwaj
amdgpu_ttm_tt_set_userptr() is also called by the KFD, as part of
initializing the user pages for userptr BOs and also while initializing
the GPUVM for a KFD process, so update the function description.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index ada57a37a020..bcad26a3b009 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1409,8 +1409,9 @@ int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
  * @addr:  The address in the current tasks VM space to use
  * @flags: Requirements of userptr object.
  *
- * Called by amdgpu_gem_userptr_ioctl() to bind userptr pages
- * to current task
+ * Called by amdgpu_gem_userptr_ioctl() and kfd_ioctl_alloc_memory_of_gpu() to
+ * bind userptr pages to current task and by kfd_ioctl_acquire_vm() to
+ * initialize GPU VM for a KFD process.
  */
 int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo,
  uint64_t addr, uint32_t flags)
-- 
2.17.1



[Patch v2] drm/amdkfd: Fix CRIU restore op due to doorbell offset

2022-09-07 Thread Rajneesh Bhardwaj
The recently introduced change to allocate doorbells only when the first
queue is created or mapped for CPU / GPU access did not completely
consider the Checkpoint/Restore scenario. This fix enables the CRIU
restore operation by extending the doorbell optimization to the CRIU
restore scenario.

Fixes: 15bcfbc55b57 ("drm/amdkfd: Allocate doorbells only when needed")

Signed-off-by: Rajneesh Bhardwaj 
---

Changes in v2:

* Addressed review feedback from Felix

 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 6 ++
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c  | 3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 7 +++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 84da1a9ce37c..56f7307c21d2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2153,6 +2153,12 @@ static int criu_restore_devices(struct kfd_process *p,
ret = PTR_ERR(pdd);
goto exit;
}
+
+   if (!pdd->doorbell_index &&
+       kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
+   ret = -ENOMEM;
+   goto exit;
+   }
}
 
/*
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index b33798f89ef0..cd4e61bf0493 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -303,6 +303,9 @@ int kfd_alloc_process_doorbells(struct kfd_dev *kfd, unsigned int *doorbell_index)
if (r > 0)
*doorbell_index = r;
 
+   if (r < 0)
+   pr_err("Failed to allocate process doorbells\n");
+
return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 6e3e7f54381b..5137476ec18e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -857,6 +857,13 @@ int kfd_criu_restore_queue(struct kfd_process *p,
ret = -EINVAL;
goto exit;
}
+
+   if (!pdd->doorbell_index &&
+       kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
/* data stored in this order: mqd, ctl_stack */
mqd = q_extra_data;
ctl_stack = mqd + q_data->mqd_size;
-- 
2.17.1



[PATCH] drm/amdkfd: Fix CRIU restore op due to doorbell offset

2022-09-07 Thread Rajneesh Bhardwaj
The recently introduced change to allocate doorbells only when the first
queue is created or mapped for CPU / GPU access did not completely
consider the Checkpoint/Restore scenario. This fix enables the CRIU
restore operation by extending the doorbell optimization to the CRIU
restore scenario.

Fixes: 15bcfbc55b57 ("drm/amdkfd: Allocate doorbells only when needed")

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 8 
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c  | 4 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 9 +
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 84da1a9ce37c..c476993e3927 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2153,6 +2153,13 @@ static int criu_restore_devices(struct kfd_process *p,
ret = PTR_ERR(pdd);
goto exit;
}
+
+   if (!pdd->doorbell_index &&
+       kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
+   pr_err("Failed to allocate process doorbells\n");
+   ret = -ENOMEM;
+   goto err_alloc_doorbells;
+   }
}
 
/*
@@ -2161,6 +2168,7 @@ static int criu_restore_devices(struct kfd_process *p,
 */
*priv_offset += args->num_devices * sizeof(*device_privs);
 
+err_alloc_doorbells:
 exit:
kfree(device_buckets);
return ret;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index b33798f89ef0..7690514c4eb3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -157,8 +157,10 @@ int kfd_doorbell_mmap(struct kfd_dev *dev, struct kfd_process *process,
 
/* Calculate physical address of doorbell */
address = kfd_get_process_doorbells(pdd);
-   if (!address)
+   if (!address) {
+       pr_err("Failed to get physical address of process doorbell\n");
return -ENOMEM;
+   }
vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE |
VM_DONTDUMP | VM_PFNMAP;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 6e3e7f54381b..9f05f64c5af8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -857,6 +857,14 @@ int kfd_criu_restore_queue(struct kfd_process *p,
ret = -EINVAL;
goto exit;
}
+
+   if (!pdd->doorbell_index &&
+       kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
+   pr_err("Failed to alloc process doorbells\n");
+   ret = -ENOMEM;
+   goto err_alloc_doorbells;
+   }
+
/* data stored in this order: mqd, ctl_stack */
mqd = q_extra_data;
ctl_stack = mqd + q_data->mqd_size;
@@ -876,6 +884,7 @@ int kfd_criu_restore_queue(struct kfd_process *p,
if (q_data->gws)
		ret = pqm_set_gws(&pdd->pqm, q_data->q_id, pdd->dev->gws);
 
+err_alloc_doorbells:
 exit:
if (ret)
pr_err("Failed to restore queue (%d)\n", ret);
-- 
2.17.1



[Patch v2] drm/amdgpu: Avoid direct cast to amdgpu_ttm_tt

2022-07-27 Thread Rajneesh Bhardwaj
For typesafety, use container_of() instead of implicit cast from struct
ttm_tt to struct amdgpu_ttm_tt.

Cc: Christian König 
Signed-off-by: Rajneesh Bhardwaj 
---
Changes in v2:
 * Fixed a bug that Felix pointed out in V1 by updating the macro
   definition

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 34 +
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index be0efaae79a9..8a6c8db31c00 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -637,6 +637,8 @@ struct amdgpu_ttm_tt {
 #endif
 };
 
+#define ttm_to_amdgpu_ttm_tt(ptr)  container_of(ptr, struct amdgpu_ttm_tt, ttm)
+
 #ifdef CONFIG_DRM_AMDGPU_USERPTR
 /*
  * amdgpu_ttm_tt_get_user_pages - get device accessible pages that back user
@@ -648,7 +650,7 @@ struct amdgpu_ttm_tt {
 int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
 {
struct ttm_tt *ttm = bo->tbo.ttm;
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
unsigned long start = gtt->userptr;
struct vm_area_struct *vma;
struct mm_struct *mm;
@@ -702,7 +704,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
  */
 bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
 {
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
bool r = false;
 
if (!gtt || !gtt->userptr)
@@ -751,7 +753,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_device *bdev,
 struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
enum dma_data_direction direction = write ?
DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
@@ -788,7 +790,7 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_device *bdev,
struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
enum dma_data_direction direction = write ?
DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
@@ -822,7 +824,7 @@ static void amdgpu_ttm_gart_bind(struct amdgpu_device *adev,
 {
struct amdgpu_bo *abo = ttm_to_amdgpu_bo(tbo);
struct ttm_tt *ttm = tbo->ttm;
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
 
if (amdgpu_bo_encrypted(abo))
flags |= AMDGPU_PTE_TMZ;
@@ -860,7 +862,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_device *bdev,
   struct ttm_resource *bo_mem)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void*)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
uint64_t flags;
int r;
 
@@ -927,7 +929,7 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
struct ttm_operation_ctx ctx = { false, false };
-   struct amdgpu_ttm_tt *gtt = (void *)bo->ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(bo->ttm);
struct ttm_placement placement;
struct ttm_place placements;
struct ttm_resource *tmp;
@@ -998,7 +1000,7 @@ static void amdgpu_ttm_backend_unbind(struct ttm_device *bdev,
  struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
 
/* if the pages have userptr pinning then clear that first */
if (gtt->userptr) {
@@ -1025,7 +1027,7 @@ static void amdgpu_ttm_backend_destroy(struct ttm_device *bdev,
 static void amdgpu_ttm_backend_destroy(struct ttm_device *bdev,
   struct ttm_tt *ttm)
 {
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
 
if (gtt->usertask)
put_task_struct(gtt->usertask);
@@ -1079,7 +1081,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
  struct ttm_operation_ctx *ctx)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
pgoff_t i;
int ret;
 
@@ -1113,7 +1115,7 @@ static int am

[PATCH] drm/amdgpu: Avoid direct cast to amdgpu_ttm_tt

2022-07-26 Thread Rajneesh Bhardwaj
For typesafety, use container_of() instead of implicit cast from struct
ttm_tt to struct amdgpu_ttm_tt.

Cc: Christian König 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 44 ++---
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index be0efaae79a9..cd6aa206a59e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -637,6 +637,8 @@ struct amdgpu_ttm_tt {
 #endif
 };
 
+#define ttm_to_amdgpu_ttm_tt(ttm)  container_of(ttm, struct amdgpu_ttm_tt, ttm)
+
 #ifdef CONFIG_DRM_AMDGPU_USERPTR
 /*
  * amdgpu_ttm_tt_get_user_pages - get device accessible pages that back user
@@ -648,7 +650,7 @@ struct amdgpu_ttm_tt {
 int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
 {
struct ttm_tt *ttm = bo->tbo.ttm;
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
unsigned long start = gtt->userptr;
struct vm_area_struct *vma;
struct mm_struct *mm;
@@ -702,7 +704,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
  */
 bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
 {
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
bool r = false;
 
if (!gtt || !gtt->userptr)
@@ -751,7 +753,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_device *bdev,
 struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
enum dma_data_direction direction = write ?
DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
@@ -788,7 +790,7 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_device *bdev,
struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
enum dma_data_direction direction = write ?
DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
@@ -822,7 +824,7 @@ static void amdgpu_ttm_gart_bind(struct amdgpu_device *adev,
 {
struct amdgpu_bo *abo = ttm_to_amdgpu_bo(tbo);
struct ttm_tt *ttm = tbo->ttm;
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
 
if (amdgpu_bo_encrypted(abo))
flags |= AMDGPU_PTE_TMZ;
@@ -860,7 +862,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_device *bdev,
   struct ttm_resource *bo_mem)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void*)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
uint64_t flags;
int r;
 
@@ -927,7 +929,8 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
struct ttm_operation_ctx ctx = { false, false };
-   struct amdgpu_ttm_tt *gtt = (void *)bo->ttm;
+   struct ttm_tt *ttm = bo->ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
struct ttm_placement placement;
struct ttm_place placements;
struct ttm_resource *tmp;
@@ -998,7 +1001,7 @@ static void amdgpu_ttm_backend_unbind(struct ttm_device *bdev,
  struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
 
/* if the pages have userptr pinning then clear that first */
if (gtt->userptr) {
@@ -1025,7 +1028,7 @@ static void amdgpu_ttm_backend_destroy(struct ttm_device *bdev,
 static void amdgpu_ttm_backend_destroy(struct ttm_device *bdev,
   struct ttm_tt *ttm)
 {
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
 
if (gtt->usertask)
put_task_struct(gtt->usertask);
@@ -1079,7 +1082,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
  struct ttm_operation_ctx *ctx)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
-   struct amdgpu_ttm_tt *gtt = (void *)ttm;
+   struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm);
pgoff_t i;
int ret;
 
@@ -1113,7 +1116,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
 static void amdg

[PATCH] drm/amdgpu: Refactor code to handle non-coherent and uncached

2022-07-18 Thread Rajneesh Bhardwaj
This simplifies the existing coherence handling for Arcturus and Aldebaran
to account for !coherent && uncached scenarios.

Cc: Joseph Greathouse 
Cc: Alex Deucher 
Signed-off-by: Rajneesh Bhardwaj 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 53 +--
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index d1657de5f875..0fdfd79f69ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -471,45 +471,44 @@ static uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
 
switch (adev->asic_type) {
case CHIP_ARCTURUS:
-   if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
-   if (bo_adev == adev)
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW;
-   else
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
-   } else {
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
-   }
-   break;
case CHIP_ALDEBARAN:
-   if (coherent && uncached) {
-   if (adev->gmc.xgmi.connected_to_cpu ||
-			    !(mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM))
-   snoop = true;
-   mapping_flags |= AMDGPU_VM_MTYPE_UC;
-   } else if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
if (bo_adev == adev) {
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW;
-   if (adev->gmc.xgmi.connected_to_cpu)
+   if (uncached)
+   mapping_flags |= AMDGPU_VM_MTYPE_UC;
+   else if (coherent)
+   mapping_flags |= AMDGPU_VM_MTYPE_CC;
+   else
+   mapping_flags |= AMDGPU_VM_MTYPE_RW;
+   if (adev->asic_type == CHIP_ALDEBARAN &&
+   adev->gmc.xgmi.connected_to_cpu)
snoop = true;
} else {
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
+   if (uncached || coherent)
+   mapping_flags |= AMDGPU_VM_MTYPE_UC;
+   else
+   mapping_flags |= AMDGPU_VM_MTYPE_NC;
if (amdgpu_xgmi_same_hive(adev, bo_adev))
snoop = true;
}
} else {
+   if (uncached || coherent)
+   mapping_flags |= AMDGPU_VM_MTYPE_UC;
+   else
+   mapping_flags |= AMDGPU_VM_MTYPE_NC;
snoop = true;
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
}
break;
default:
-   mapping_flags |= coherent ?
-   AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
+   if (uncached || coherent)
+   mapping_flags |= AMDGPU_VM_MTYPE_UC;
+   else
+   mapping_flags |= AMDGPU_VM_MTYPE_NC;
+
+   if (!(mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM))
+   snoop = true;
+
+
}
 
pte_flags = amdgpu_gem_va_map_flags(adev, mapping_flags);
-- 
2.17.1



[PATCH 2/4] drm/amdkfd: update SPDX license header

2022-02-11 Thread Rajneesh Bhardwaj
Update the SPDX License header for all the KFD files.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.h | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v10.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_events.h   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.h| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.h | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_aldebaran.h| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_diq.h  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_opcodes.h  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c| 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 3 ++-
 46 files changed, 92 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 54d997f304b5..33bd13a75bb5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1,5 +1,6 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
- * Copyright 2014 Advanced Micro Devices, Inc.
+ * Copyright 2014-2022 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index bb6e49661d13..093305534d04 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1,5 +1,6 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
- * Copyright 2015-2017 Advanced Micro Devices, Inc.
+ * Copyright 2015-2022 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
index d54ceebd346b..015ada3a0d54 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
@@ -1,5 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
 /*
- * Copyright 2014 Advanced Micro Devices, Inc.
+ * Copyright 2014-2022 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c 
b/drivers/gpu/drm/amd/amdkfd/kf

[PATCH 3/4] drm/amdkfd: Fix leftover errors and warnings

2022-02-11 Thread Rajneesh Bhardwaj
A bunch of errors and warnings have been left over in KFD over the years;
attempt to fix the errors and most of the warnings reported by the
checkpatch tool. A few warnings remain that may be false positives, so
ignore them for now.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c| 11 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.h   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c |  5 +++--
 .../drm/amd/amdkfd/kfd_device_queue_manager.c   | 17 -
 .../drm/amd/amdkfd/kfd_device_queue_manager.h   |  8 +++-
 .../amd/amdkfd/kfd_device_queue_manager_v10.c   |  9 -
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c|  1 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c   |  6 +++---
 17 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 33bd13a75bb5..965af2a08bc0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -37,7 +37,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
 #include "kfd_svm.h"
@@ -1133,11 +1133,12 @@ static int kfd_ioctl_free_memory_of_gpu(struct file 
*filep,
return ret;
 }
 
-static bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev) {
+static bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev)
+{
return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) ||
-  (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1) &&
-   dev->adev->sdma.instance[0].fw_version >= 18) ||
-  KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 0);
+   (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1) &&
+   dev->adev->sdma.instance[0].fw_version >= 18) ||
+   KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 0);
 }
 
 static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 093305534d04..24898238b024 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1382,7 +1382,7 @@ static int kfd_fill_gpu_cache_info(struct kfd_dev *kdev,
num_of_cache_types = ARRAY_SIZE(vegam_cache_info);
break;
default:
-   switch(KFD_GC_VERSION(kdev)) {
+   switch (KFD_GC_VERSION(kdev)) {
case IP_VERSION(9, 0, 1):
pcache_info = vega10_cache_info;
num_of_cache_types = ARRAY_SIZE(vega10_cache_info);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
index 015ada3a0d54..482ba84a728d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.h
@@ -233,7 +233,7 @@ struct crat_subtype_ccompute {
 #define CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT(1 << 2)
 #define CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT(1 << 3)
 #define CRAT_IOLINK_FLAGS_NO_PEER_TO_PEER_DMA  (1 << 4)
-#define CRAT_IOLINK_FLAGS_BI_DIRECTIONAL   (1 << 31)
+#define CRAT_IOLINK_FLAGS_BI_DIRECTIONAL   (1 << 31)
 #define CRAT_IOLINK_FLAGS_RESERVED_MASK0x7fe0
 
 /*
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
index 36371fb03778..581c3a30fee1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
@@ -36,7 +36,7 @@ static int kfd_debugfs_open(struct inode *inode, struct file 
*file)
 }
 static int kfd_debugfs_hang_hws_read(struct seq_file *m, void *data)
 {
-   seq_printf(m, "echo gpu_id > hang_hws\n");
+   seq_puts(m, "echo gpu_id > hang_hws\n");
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 8e3efb8518fb..339e12c94cff 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -439,7 +439,8 @@ static int kfd_gws_init(struct kfd_dev *kfd)
return ret;
 }
 
-static void kfd_smi_init(struct kfd_dev *dev) {
+static void kfd_smi_init(struct kfd_dev *dev)
+{
	INIT_LIST_HEAD(&dev->smi_clients);
	spin_lock_init(&dev->smi_lock);
 }
@@ -572,7 +573,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
svm_migrate_init(kfd-

[PATCH 4/4] drm/amdgpu: Fix a kerneldoc warning

2022-02-11 Thread Rajneesh Bhardwaj
Add missing parameters to fix a kerneldoc warning

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7931132ce6e3..a74a1b74a172 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -554,7 +554,11 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
 /**
  * amdgpu_mm_wreg_mmio_rlc -  write register either with direct/indirect mmio 
or with RLC path if in range
  *
- * this function is invoked only the debugfs register access
+ * @adev: amdgpu_device pointer
+ * @reg: mmio/rlc register
+ * @v: value to write
+ *
+ * this function is invoked only for the debugfs register access
  */
 void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
 uint32_t reg, uint32_t v)
-- 
2.17.1



[PATCH 1/4] drm/amdgpu: Fix some kerneldoc warnings

2022-02-11 Thread Rajneesh Bhardwaj
Fix a few kerneldoc warnings and one typo.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 53895a41932e..81e3b528bbc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -715,7 +715,7 @@ static void get_wave_count(struct amdgpu_device *adev, int 
queue_idx,
  * process whose pasid is provided as a parameter. The process could have ZERO
  * or more queues running and submitting waves to compute units.
  *
- * @kgd: Handle of device from which to get number of waves in flight
+ * @adev: Handle of device from which to get number of waves in flight
  * @pasid: Identifies the process for which this query call is invoked
  * @pasid_wave_cnt: Output parameter updated with number of waves in flight 
that
  * belong to process with given pasid
@@ -724,7 +724,7 @@ static void get_wave_count(struct amdgpu_device *adev, int 
queue_idx,
  *
  * Note: It's possible that the device has too many queues (oversubscription)
  * in which case a VMID could be remapped to a different PASID. This could lead
- * to an iaccurate wave count. Following is a high-level sequence:
+ * to an inaccurate wave count. Following is a high-level sequence:
  *Time T1: vmid = getVmid(); vmid is associated with Pasid P1
  *Time T2: passId = getPasId(vmid); vmid is associated with Pasid P2
  * In the sequence above wave count obtained from time T1 will be incorrectly
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 2e00c3fb4bd3..cd89d2e46852 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -121,7 +121,7 @@ static size_t amdgpu_amdkfd_acc_size(uint64_t size)
 }
 
 /**
- * @amdgpu_amdkfd_reserve_mem_limit() - Decrease available memory by size
+ * amdgpu_amdkfd_reserve_mem_limit() - Decrease available memory by size
  * of buffer including any reserved for control structures
  *
  * @adev: Device to which allocated BO belongs to
-- 
2.17.1



[PATCH 0/4] General cleanup and SPDX update

2022-02-11 Thread Rajneesh Bhardwaj
This cleanup series addresses a bunch of checkpatch warnings and
kerneldoc style issues, and also updates the SPDX license header for all
the KFD files within the amdgpu driver.


Rajneesh Bhardwaj (4):
  drm/amdgpu: Fix some kerneldoc warnings
  drm/amdkfd: update SPDX license header
  drm/amdkfd: Fix leftover errors and warnings
  drm/amdgpu: Fix a kerneldoc warning

 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  4 ++--
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|  6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 14 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c |  5 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_crat.h |  5 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c  |  5 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  8 +---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 20 +--
 .../drm/amd/amdkfd/kfd_device_queue_manager.h | 11 +-
 .../amd/amdkfd/kfd_device_queue_manager_cik.c |  3 ++-
 .../amd/amdkfd/kfd_device_queue_manager_v10.c | 12 ++-
 .../amd/amdkfd/kfd_device_queue_manager_v9.c  |  3 ++-
 .../amd/amdkfd/kfd_device_queue_manager_vi.c  |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_events.h   |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c  |  5 +++--
 .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c   |  6 --
 drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.h|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.h |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c   |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  |  4 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  5 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  4 +++-
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |  3 ++-
 .../drm/amd/amdkfd/kfd_packet_manager_v9.c|  3 ++-
 .../drm/amd/amdkfd/kfd_packet_manager_vi.c|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h  |  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h   |  3 ++-
 .../amd/amdkfd/kfd_pm4_headers_aldebaran.h|  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_diq.h  |  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h   |  7 ---
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_opcodes.h  |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  7 ---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  3 ++-
 .../amd/amdkfd/kfd_process_queue_manager.c|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c|  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  9 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  3 ++-
 50 files changed, 137 insertions(+), 94 deletions(-)

-- 
2.17.1



[PATCH] drm/amdkfd: Fix prototype warning for get_process_num_bos

2022-02-10 Thread Rajneesh Bhardwaj
Fix the warning: no previous prototype for 'get_process_num_bos'
[-Wmissing-prototypes]

Reported-by: kernel test robot 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index d5699aa79578..54d997f304b5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1648,7 +1648,7 @@ static int criu_checkpoint_devices(struct kfd_process *p,
return ret;
 }
 
-uint32_t get_process_num_bos(struct kfd_process *p)
+static uint32_t get_process_num_bos(struct kfd_process *p)
 {
uint32_t num_of_bos = 0;
int i;
-- 
2.17.1



[PATCH] drm/amdkfd: CRIU fix extra whitespace and block comment warnings

2022-02-10 Thread Rajneesh Bhardwaj
Fix checkpatch reported warning for a quoted line and block line
comments.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 783826640da9..b71d47afd243 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3514,7 +3514,7 @@ int kfd_criu_resume_svm(struct kfd_process *p)
 i, criu_svm_md->data.start_addr, 
criu_svm_md->data.size);
 
for (j = 0; j < num_attrs; j++) {
-   pr_debug("\ncriu_svm_md[%d]->attrs[%d].type : 0x%x 
\ncriu_svm_md[%d]->attrs[%d].value : 0x%x\n",
+   pr_debug("\ncriu_svm_md[%d]->attrs[%d].type : 
0x%x\ncriu_svm_md[%d]->attrs[%d].value : 0x%x\n",
 i, j, criu_svm_md->data.attrs[j].type,
 i, j, criu_svm_md->data.attrs[j].value);
switch (criu_svm_md->data.attrs[j].type) {
@@ -3601,7 +3601,8 @@ int kfd_criu_restore_svm(struct kfd_process *p,
num_devices = p->n_pdds;
/* Handle one SVM range object at a time, also the number of gpus are
 * assumed to be same on the restore node, checking must be done while
-* evaluating the topology earlier */
+* evaluating the topology earlier
+*/
 
svm_attrs_size = sizeof(struct kfd_ioctl_svm_attribute) *
(nattr_common + nattr_accessibility * num_devices);
-- 
2.17.1



[PATCH] drm/amdgpu: Fix recursive locking warning

2022-02-03 Thread Rajneesh Bhardwaj
Noticed the below warning while running a PyTorch workload on Vega10
GPUs. Switch to trylock to avoid conflicts with reservation locks that
are already held.

[  +0.03] WARNING: possible recursive locking detected
[  +0.03] 5.13.0-kfd-rajneesh #1030 Not tainted
[  +0.04] 
[  +0.02] python/4822 is trying to acquire lock:
[  +0.04] 932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000203]
  but task is already holding lock:
[  +0.03] 932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.17]
  other info that might help us debug this:
[  +0.02]  Possible unsafe locking scenario:

[  +0.03]CPU0
[  +0.02]
[  +0.02]   lock(reservation_ww_class_mutex);
[  +0.04]   lock(reservation_ww_class_mutex);
[  +0.03]
   *** DEADLOCK ***

[  +0.02]  May be due to missing lock nesting notation

[  +0.03] 7 locks held by python/4822:
[  +0.03]  #0: 932c4ac028d0 (>mutex){+.+.}-{3:3}, at:
kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu]
[  +0.000232]  #1: 932c55e830a8 (>lock#2){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu]
[  +0.000241]  #2: 932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu]
[  +0.000236]  #3: b2b35606fd28
(reservation_ww_class_acquire){+.+.}-{0:0}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu]
[  +0.000235]  #4: 932cbb7181f8
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.15]  #5: c045f700 (*(sspp++)){}-{0:0}, at:
drm_dev_enter+0x5/0xa0 [drm]
[  +0.38]  #6: 932c52da7078 (>eviction_lock){+.+.}-{3:3},
at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu]
[  +0.000195]
  stack backtrace:
[  +0.03] CPU: 11 PID: 4822 Comm: python Not tainted
5.13.0-kfd-rajneesh #1030
[  +0.05] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02
08/29/2018
[  +0.03] Call Trace:
[  +0.03]  dump_stack+0x6d/0x89
[  +0.10]  __lock_acquire+0xb93/0x1a90
[  +0.09]  lock_acquire+0x25d/0x2d0
[  +0.05]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  ? lock_is_held_type+0xa2/0x110
[  +0.06]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  __ww_mutex_lock.constprop.17+0xca/0x1060
[  +0.07]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  ? lock_release+0x13f/0x270
[  +0.05]  ? lock_is_held_type+0xa2/0x110
[  +0.06]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000185]  ttm_bo_release+0x4c6/0x580 [ttm]
[  +0.10]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[  +0.000183]  amdgpu_vm_free_table+0x76/0xa0 [amdgpu]
[  +0.000189]  amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu]
[  +0.000189]  amdgpu_vm_update_ptes+0x411/0x770 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update+0x251/0x610 [amdgpu]
[  +0.000191]  update_gpuvm_pte+0xcc/0x290 [amdgpu]
[  +0.000229]  ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu]
[  +0.000190]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60
[amdgpu]
[  +0.000234]  kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu]
[  +0.000218]  kfd_ioctl+0x2b9/0x600 [amdgpu]
[  +0.000216]  ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu]
[  +0.000216]  ? lock_release+0x13f/0x270
[  +0.06]  ? __fget_files+0x107/0x1e0
[  +0.07]  __x64_sys_ioctl+0x8b/0xd0
[  +0.07]  do_syscall_64+0x36/0x70
[  +0.04]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.07] RIP: 0033:0x7fbff90a7317
[  +0.04] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[  +0.05] RSP: 002b:7fbe301fe648 EFLAGS: 0246 ORIG_RAX:
0010
[  +0.06] RAX: ffda RBX: 7fbcc402d820 RCX:
7fbff90a7317
[  +0.03] RDX: 7fbe301fe690 RSI: c0184b18 RDI:
0004
[  +0.03] RBP: 7fbe301fe690 R08:  R09:
7fbcc402d880
[  +0.03] R10: 02001000 R11: 0246 R12:
c0184b18
[  +0.03] R13: 0004 R14: 7fbf689593a0 R15:
7fbcc402d820

Cc: Christian König 
Cc: Felix Kuehling 
Cc: Alex Deucher 

Fixes: 627b92ef9d7c ("drm/amdgpu: Wipe all VRAM on free when RAS is
enabled")
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 36bb41b027ec..6ccd2be685f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

[Patch v5 23/24] drm/amdkfd: CRIU resume shared virtual memory ranges

2022-02-03 Thread Rajneesh Bhardwaj
In CRIU resume stage, resume all the shared virtual memory ranges from
the data stored inside the resuming kfd process during CRIU restore
phase. Also setup xnack mode and free up the resources.

KFD_IOCTL_SVM_ATTR_CLR_FLAGS is not available for querying via get_attr
interface but we must clear the flags during restore as there might be
some default flags set when the prange is created. Also handle the
invalid PREFETCH attribute values saved during checkpoint by replacing
them with another dummy KFD_IOCTL_SVM_ATTR_SET_FLAGS attribute.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  10 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 102 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h |   6 ++
 3 files changed, 118 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index c143f242a84d..64e3b4e3a712 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2766,7 +2766,17 @@ static int criu_resume(struct file *filep,
}
 
	mutex_lock(&target->mutex);
+   ret = kfd_criu_resume_svm(target);
+   if (ret) {
+   pr_err("kfd_criu_resume_svm failed for %i\n", args->pid);
+   goto exit;
+   }
+
ret =  amdgpu_amdkfd_criu_resume(target->kgd_process_info);
+   if (ret)
+   pr_err("amdgpu_amdkfd_criu_resume failed for %i\n", args->pid);
+
+exit:
	mutex_unlock(&target->mutex);
 
kfd_unref_process(target);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 41ac049b3316..30ae21953da5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3487,6 +3487,108 @@ svm_range_get_attr(struct kfd_process *p, struct 
mm_struct *mm,
return 0;
 }
 
+int kfd_criu_resume_svm(struct kfd_process *p)
+{
+   struct kfd_ioctl_svm_attribute *set_attr = NULL;
+   int nattr_common = 4, nattr_accessibility = 1;
+   struct criu_svm_metadata *criu_svm_md = NULL;
+   struct svm_range_list *svms = &p->svms;
+   struct criu_svm_metadata *next = NULL;
+   uint32_t set_flags = 0xffffffff;
+   int i, j, num_attrs, ret = 0;
+   uint64_t set_attr_size;
+   struct mm_struct *mm;
+
+   if (list_empty(&svms->criu_svm_metadata_list)) {
+   pr_debug("No SVM data from CRIU restore stage 2\n");
+   return ret;
+   }
+
+   mm = get_task_mm(p->lead_thread);
+   if (!mm) {
+   pr_err("failed to get mm for the target process\n");
+   return -ESRCH;
+   }
+
+   num_attrs = nattr_common + (nattr_accessibility * p->n_pdds);
+
+   i = j = 0;
+   list_for_each_entry(criu_svm_md, &svms->criu_svm_metadata_list, list) {
+   pr_debug("criu_svm_md[%d]\n\tstart: 0x%llx size: 0x%llx 
(npages)\n",
+i, criu_svm_md->data.start_addr, 
criu_svm_md->data.size);
+
+   for (j = 0; j < num_attrs; j++) {
+   pr_debug("\ncriu_svm_md[%d]->attrs[%d].type : 0x%x 
\ncriu_svm_md[%d]->attrs[%d].value : 0x%x\n",
+i,j, criu_svm_md->data.attrs[j].type,
+i,j, criu_svm_md->data.attrs[j].value);
+   switch (criu_svm_md->data.attrs[j].type) {
+   /* During Checkpoint operation, the query for
+* KFD_IOCTL_SVM_ATTR_PREFETCH_LOC attribute might
+* return KFD_IOCTL_SVM_LOCATION_UNDEFINED if they were
+* not used by the range which was checkpointed. Care
+* must be taken to not restore with an invalid value
+* otherwise the gpuidx value will be invalid and
+* set_attr would eventually fail so just replace those
+* with another dummy attribute such as
+* KFD_IOCTL_SVM_ATTR_SET_FLAGS.
+*/
+   case KFD_IOCTL_SVM_ATTR_PREFETCH_LOC:
+   if (criu_svm_md->data.attrs[j].value ==
+   KFD_IOCTL_SVM_LOCATION_UNDEFINED) {
+   criu_svm_md->data.attrs[j].type =
+   KFD_IOCTL_SVM_ATTR_SET_FLAGS;
+   criu_svm_md->data.attrs[j].value = 0;
+   }
+   break;
+   case KFD_IOCTL_SVM_ATTR_SET_FLAGS:
+   set_flags = criu_svm_md->data.attrs[j].value;
+   break;
+   default:
+   break;
+   }
+   }
+
+   

[Patch v5 20/24] drm/amdkfd: CRIU Discover svm ranges

2022-02-03 Thread Rajneesh Bhardwaj
A KFD process may contain a number of virtual address ranges for shared
virtual memory management and each such range can have many SVM
attributes spanning across various nodes within the process boundary.
This change reports the total number of such SVM ranges and
their total private data size by extending the PROCESS_INFO op of the
CRIU IOCTL to discover the SVM ranges in the target process. Future
patches bring in the required support for checkpoint and restore of
SVM ranges.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 59 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 11 +
 4 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 3ec44f71307d..a755ea68a428 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2099,10 +2099,9 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
uint32_t *num_objects,
uint64_t *objs_priv_size)
 {
-   int ret;
-   uint64_t priv_size;
+   uint64_t queues_priv_data_size, svm_priv_data_size, priv_size;
uint32_t num_queues, num_events, num_svm_ranges;
-   uint64_t queues_priv_data_size;
+   int ret;
 
*num_devices = p->n_pdds;
*num_bos = get_process_num_bos(p);
@@ -2112,7 +2111,10 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
return ret;
 
num_events = kfd_get_num_events(p);
-   num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */
+
+   ret = svm_range_get_info(p, &num_svm_ranges, &svm_priv_data_size);
+   if (ret)
+   return ret;
 
*num_objects = num_queues + num_events + num_svm_ranges;
 
@@ -2122,7 +2124,7 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
priv_size += queues_priv_data_size;
priv_size += num_events * sizeof(struct 
kfd_criu_event_priv_data);
-   /* TODO: Add SVM ranges priv size */
+   priv_size += svm_priv_data_size;
*objs_priv_size = priv_size;
}
return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 903ad4a263f0..715dd0d4fac5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1082,7 +1082,10 @@ enum kfd_criu_object_type {
 
 struct kfd_criu_svm_range_priv_data {
uint32_t object_type;
-   uint32_t reserved;
+   uint64_t start_addr;
+   uint64_t size;
+   /* Variable length array of attributes */
+   struct kfd_ioctl_svm_attribute attrs[0];
 };
 
 struct kfd_criu_queue_priv_data {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index d34508f5e88b..64cd7712c098 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3481,6 +3481,65 @@ svm_range_get_attr(struct kfd_process *p, struct 
mm_struct *mm,
return 0;
 }
 
+int svm_range_get_info(struct kfd_process *p, uint32_t *num_svm_ranges,
+  uint64_t *svm_priv_data_size)
+{
+   uint64_t total_size, accessibility_size, common_attr_size;
+   int nattr_common = 4, nattr_accessibility = 1;
+   int num_devices = p->n_pdds;
+   struct svm_range_list *svms;
+   struct svm_range *prange;
+   uint32_t count = 0;
+
+   *svm_priv_data_size = 0;
+
+   svms = &p->svms;
+   if (!svms)
+   return -EINVAL;
+
+   mutex_lock(&svms->lock);
+   list_for_each_entry(prange, &svms->list, list) {
+   pr_debug("prange: 0x%p start: 0x%lx\t npages: 0x%llx\t end: 
0x%llx\n",
+prange, prange->start, prange->npages,
+prange->start + prange->npages - 1);
+   count++;
+   }
+   mutex_unlock(&svms->lock);
+
+   *num_svm_ranges = count;
+   /* Only the accessibility attributes need to be queried for all the gpus
+* individually, remaining ones are spanned across the entire process
+* regardless of the various gpu nodes. Of the remaining attributes,
+* KFD_IOCTL_SVM_ATTR_CLR_FLAGS need not be saved.
+*
+* KFD_IOCTL_SVM_ATTR_PREFERRED_LOC
+* KFD_IOCTL_SVM_ATTR_PREFETCH_LOC
+* KFD_IOCTL_SVM_ATTR_SET_FLAGS
+* KFD_IOCTL_SVM_ATTR_GRANULARITY
+*
+* ** ACCESSIBILITY ATTRIBUTES **
+* (Considered as one, type is altered during query, value is gpuid)
+* KFD_IOCTL_SVM_ATTR_ACCESS
+* KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE
+* KFD_IOCTL_SVM_ATTR_NO_ACCESS

[Patch v5 13/24] drm/amdkfd: CRIU checkpoint and restore queue control stack

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Checkpoint contents of queue control stacks on CRIU dump and restore them
during CRIU restore.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 22 ---
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  9 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  | 11 +++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 13 ++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 14 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 29 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 22 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 +-
 .../amd/amdkfd/kfd_process_queue_manager.c| 62 +--
 11 files changed, 138 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 999672602252..608214ea634d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -311,7 +311,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
p->pasid,
dev->id);
 
-   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL,
+   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL, NULL,
			       &doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 3a5303ebcabf..8eca9ed3ab36 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-   , , NULL, NULL, NULL);
+   , , NULL, NULL, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 42933610d4e1..63b3c7af681b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -323,7 +323,7 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
struct queue *q,
struct qcm_process_device *qpd,
const struct kfd_criu_queue_priv_data *qd,
-   const void *restore_mqd)
+   const void *restore_mqd, const void 
*restore_ctl_stack)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -385,7 +385,8 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
 
if (qd)
mqd_mgr->restore_mqd(mqd_mgr, >mqd, q->mqd_mem_obj, 
>gart_mqd_addr,
->properties, restore_mqd);
+>properties, restore_mqd, 
restore_ctl_stack,
+qd->ctl_stack_size);
else
mqd_mgr->init_mqd(mqd_mgr, >mqd, q->mqd_mem_obj,
>gart_mqd_addr, >properties);
@@ -1342,7 +1343,7 @@ static void destroy_kernel_queue_cpsch(struct 
device_queue_manager *dqm,
 static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue 
*q,
struct qcm_process_device *qpd,
const struct kfd_criu_queue_priv_data *qd,
-   const void *restore_mqd)
+   const void *restore_mqd, const void *restore_ctl_stack)
 {
int retval;
struct mqd_manager *mqd_mgr;
@@ -1391,7 +1392,8 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
 
if (qd)
mqd_mgr->restore_mqd(mqd_mgr, >mqd, q->mqd_mem_obj, 
>gart_mqd_addr,
->properties, restore_mqd);
+>properties, restore_mqd, 
restore_ctl_stack,
+qd->ctl_stack_size);
else
mqd_mgr->init_mqd(mqd_mgr, >mqd, q->mqd_mem_obj,
>gart_mqd_addr, >properties);
@@ -1799,7 +1801,8 @@ static int get_wave_state(struct device_queue_manager 
*dqm,
 
 static void get_queue_checkpoint_info(struct device_queue_manager *dqm,
const struct queue *q,
-   u32 *mqd_size)
+   u32 *mqd_size,
+   u32 *ctl_stack_size)
 {
struct mqd_manager *mqd_mgr

[Patch v5 18/24] drm/amdkfd: CRIU allow external mm for svm ranges

2022-02-03 Thread Rajneesh Bhardwaj
Both the svm_range_get_attr and svm_range_set_attr helpers use the mm
struct from current, but for a Checkpoint or Restore operation
current->mm fetches the mm of the CRIU master process instead. So modify
these helpers to accept the task mm of the target kfd process to support
Checkpoint Restore.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index ffec25e642e2..d34508f5e88b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3203,10 +3203,10 @@ static void svm_range_evict_svm_bo_worker(struct 
work_struct *work)
 }
 
 static int
-svm_range_set_attr(struct kfd_process *p, uint64_t start, uint64_t size,
-  uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
+  uint64_t start, uint64_t size, uint32_t nattr,
+  struct kfd_ioctl_svm_attribute *attrs)
 {
-   struct mm_struct *mm = current->mm;
struct list_head update_list;
struct list_head insert_list;
struct list_head remove_list;
@@ -3305,8 +3305,9 @@ svm_range_set_attr(struct kfd_process *p, uint64_t start, 
uint64_t size,
 }
 
 static int
-svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
-  uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm,
+  uint64_t start, uint64_t size, uint32_t nattr,
+  struct kfd_ioctl_svm_attribute *attrs)
 {
DECLARE_BITMAP(bitmap_access, MAX_GPU_INSTANCE);
DECLARE_BITMAP(bitmap_aip, MAX_GPU_INSTANCE);
@@ -3316,7 +3317,6 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, 
uint64_t size,
bool get_accessible = false;
bool get_flags = false;
uint64_t last = start + size - 1UL;
-   struct mm_struct *mm = current->mm;
uint8_t granularity = 0xff;
struct interval_tree_node *node;
struct svm_range_list *svms;
@@ -3485,6 +3485,7 @@ int
 svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op op, uint64_t start,
  uint64_t size, uint32_t nattrs, struct kfd_ioctl_svm_attribute *attrs)
 {
+   struct mm_struct *mm = current->mm;
int r;
 
start >>= PAGE_SHIFT;
@@ -3492,10 +3493,10 @@ svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op 
op, uint64_t start,
 
switch (op) {
case KFD_IOCTL_SVM_OP_SET_ATTR:
-   r = svm_range_set_attr(p, start, size, nattrs, attrs);
+   r = svm_range_set_attr(p, mm, start, size, nattrs, attrs);
break;
case KFD_IOCTL_SVM_OP_GET_ATTR:
-   r = svm_range_get_attr(p, start, size, nattrs, attrs);
+   r = svm_range_get_attr(p, mm, start, size, nattrs, attrs);
break;
default:
r = EINVAL;
-- 
2.17.1



[Patch v5 15/24] drm/amdkfd: CRIU implement gpu_id remapping

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When doing a restore on a different node, the gpu_ids on the restore
node may be different. But the user space application will still use the
original gpu_ids in its ioctl calls. Add code to create a gpu_id mapping
so that kfd can determine the actual gpu_id during the user ioctls.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 468 --
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  45 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  11 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  32 ++
 .../amd/amdkfd/kfd_process_queue_manager.c|  18 +-
 5 files changed, 414 insertions(+), 160 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index a4be758647f9..69edeaf3893e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -293,14 +293,17 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
return err;
 
pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev) {
-   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
-   return -EINVAL;
-   }
 
mutex_lock(>mutex);
 
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+   err = -EINVAL;
+   goto err_pdd;
+   }
+   dev = pdd->dev;
+
pdd = kfd_bind_process_to_device(dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
@@ -345,6 +348,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
 
 err_create_queue:
 err_bind_process:
+err_pdd:
mutex_unlock(>mutex);
return err;
 }
@@ -491,7 +495,6 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
struct kfd_process *p, void *data)
 {
struct kfd_ioctl_set_memory_policy_args *args = data;
-   struct kfd_dev *dev;
int err = 0;
struct kfd_process_device *pdd;
enum cache_policy default_policy, alternate_policy;
@@ -506,13 +509,15 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
return -EINVAL;
}
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
-
mutex_lock(>mutex);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+   err = -EINVAL;
+   goto err_pdd;
+   }
 
-   pdd = kfd_bind_process_to_device(dev, p);
+   pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
goto out;
@@ -525,7 +530,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
(args->alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
   ? cache_policy_coherent : cache_policy_noncoherent;
 
-   if (!dev->dqm->ops.set_cache_memory_policy(dev->dqm,
+   if (!pdd->dev->dqm->ops.set_cache_memory_policy(pdd->dev->dqm,
>qpd,
default_policy,
alternate_policy,
@@ -534,6 +539,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
err = -EINVAL;
 
 out:
+err_pdd:
mutex_unlock(>mutex);
 
return err;
@@ -543,17 +549,18 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
struct kfd_process *p, void *data)
 {
struct kfd_ioctl_set_trap_handler_args *args = data;
-   struct kfd_dev *dev;
int err = 0;
struct kfd_process_device *pdd;
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
-
mutex_lock(>mutex);
 
-   pdd = kfd_bind_process_to_device(dev, p);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   err = -EINVAL;
+   goto err_pdd;
+   }
+
+   pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
goto out;
@@ -562,6 +569,7 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
kfd_process_set_trap_handler(>qpd, args->tba_addr, args->tma_addr);
 
 out:
+err_pdd:
mutex_unlock(>mutex);
 
return err;
@@ -577,16 +585,20 @@ static int kfd_ioctl_dbg_register(struct file *filep,
bool create_ok;
long status = 0;
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -E

[Patch v5 19/24] drm/amdkfd: use user_gpu_id for svm ranges

2022-02-03 Thread Rajneesh Bhardwaj
Currently the SVM ranges use actual_gpu_id, but with Checkpoint Restore
support it is possible that the SVM ranges are resumed on another node
where the actual_gpu_id may not be the same as the original gpu id
(user_gpu_id). So modify the svm code to use user_gpu_id.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 06e6e9180fbc..8e2780d2f735 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1797,7 +1797,7 @@ int kfd_process_gpuidx_from_gpuid(struct kfd_process *p, 
uint32_t gpu_id)
int i;
 
for (i = 0; i < p->n_pdds; i++)
-   if (p->pdds[i] && gpu_id == p->pdds[i]->dev->id)
+   if (p->pdds[i] && gpu_id == p->pdds[i]->user_gpu_id)
return i;
return -EINVAL;
 }
@@ -1810,7 +1810,7 @@ kfd_process_gpuid_from_adev(struct kfd_process *p, struct 
amdgpu_device *adev,
 
for (i = 0; i < p->n_pdds; i++)
if (p->pdds[i] && p->pdds[i]->dev->adev == adev) {
-   *gpuid = p->pdds[i]->dev->id;
+   *gpuid = p->pdds[i]->user_gpu_id;
*gpuidx = i;
return 0;
}
-- 
2.17.1



[Patch v5 14/24] drm/amdkfd: CRIU checkpoint and restore events

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Add support to the existing CRIU ioctls to save and restore events
during CRIU checkpoint and restore.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  70 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 272 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  27 ++-
 3 files changed, 280 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 608214ea634d..a4be758647f9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1008,57 +1008,11 @@ static int kfd_ioctl_create_event(struct file *filp, 
struct kfd_process *p,
 * through the event_page_offset field.
 */
if (args->event_page_offset) {
-   struct kfd_dev *kfd;
-   struct kfd_process_device *pdd;
-   void *mem, *kern_addr;
-   uint64_t size;
-
-   kfd = kfd_device_by_id(GET_GPU_ID(args->event_page_offset));
-   if (!kfd) {
-   pr_err("Getting device by id failed in %s\n", __func__);
-   return -EINVAL;
-   }
-
mutex_lock(>mutex);
-
-   if (p->signal_page) {
-   pr_err("Event page is already set\n");
-   err = -EINVAL;
-   goto out_unlock;
-   }
-
-   pdd = kfd_bind_process_to_device(kfd, p);
-   if (IS_ERR(pdd)) {
-   err = PTR_ERR(pdd);
-   goto out_unlock;
-   }
-
-   mem = kfd_process_device_translate_handle(pdd,
-   GET_IDR_HANDLE(args->event_page_offset));
-   if (!mem) {
-   pr_err("Can't find BO, offset is 0x%llx\n",
-  args->event_page_offset);
-   err = -EINVAL;
-   goto out_unlock;
-   }
-
-   err = amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(kfd->adev,
-   mem, _addr, );
-   if (err) {
-   pr_err("Failed to map event page to kernel\n");
-   goto out_unlock;
-   }
-
-   err = kfd_event_page_set(p, kern_addr, size);
-   if (err) {
-   pr_err("Failed to set event page\n");
-   amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(kfd->adev, 
mem);
-   goto out_unlock;
-   }
-
-   p->signal_handle = args->event_page_offset;
-
+   err = kfd_kmap_event_page(p, args->event_page_offset);
mutex_unlock(>mutex);
+   if (err)
+   return err;
}
 
err = kfd_event_create(filp, p, args->event_type,
@@ -1067,10 +1021,7 @@ static int kfd_ioctl_create_event(struct file *filp, 
struct kfd_process *p,
>event_page_offset,
>event_slot_index);
 
-   return err;
-
-out_unlock:
-   mutex_unlock(>mutex);
+   pr_debug("Created event (id:0x%08x) (%s)\n", args->event_id, __func__);
return err;
 }
 
@@ -2031,7 +1982,7 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
if (ret)
return ret;
 
-   num_events = 0; /* TODO: Implement Events */
+   num_events = kfd_get_num_events(p);
num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */
 
*num_objects = num_queues + num_events + num_svm_ranges;
@@ -2040,7 +1991,7 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
priv_size = sizeof(struct kfd_criu_process_priv_data);
priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
priv_size += queues_priv_data_size;
-   /* TODO: Add Events priv size */
+   priv_size += num_events * sizeof(struct 
kfd_criu_event_priv_data);
/* TODO: Add SVM ranges priv size */
*objs_priv_size = priv_size;
}
@@ -2102,7 +2053,10 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto exit_unlock;
 
-   /* TODO: Dump Events */
+   ret = kfd_criu_checkpoint_events(p, (uint8_t __user 
*)args->priv_data,
+_offset);
+   if (ret)
+   goto exit_unlock;
 
/* TODO: Dump SVM-Ranges */
}
@@ -2410,8 +2364,8 @@ static int criu_restore_objects(struct file *filep,
goto exit;
break;
case KFD_CRIU_OBJE

[Patch v5 24/24] drm/amdkfd: Bump up KFD API version for CRIU

2022-02-03 Thread Rajneesh Bhardwaj
 - Change KFD minor version to 7 for CRIU

Proposed userspace changes:
https://github.com/RadeonOpenCompute/criu

Signed-off-by: Rajneesh Bhardwaj 
---
 include/uapi/linux/kfd_ioctl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 49429a6c42fc..e6a56c146920 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -32,9 +32,10 @@
  * - 1.4 - Indicate new SRAM EDC bit in device properties
  * - 1.5 - Add SVM API
  * - 1.6 - Query clear flags in SVM get_attr API
+ * - 1.7 - Checkpoint Restore (CRIU) API
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 6
+#define KFD_IOCTL_MINOR_VERSION 7
 
 struct kfd_ioctl_get_version_args {
__u32 major_version;/* from KFD */
-- 
2.17.1



[Patch v5 16/24] drm/amdkfd: CRIU export BOs as prime dmabuf objects

2022-02-03 Thread Rajneesh Bhardwaj
KFD buffer objects do not have a GEM handle associated with them, so
they cannot directly be used with libdrm to initiate a system DMA (sDMA)
operation to speed up checkpoint and restore. Export them as dmabuf
objects instead and use them with the libdrm helper (amdgpu_bo_import)
to further process the sDMA command submissions.

With sDMA, we see a huge improvement in checkpoint and restore
operations compared to the generic PCI based access via the host data
path.

Suggested-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 71 +++-
 1 file changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 69edeaf3893e..ab5107a3fe36 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
@@ -42,6 +43,7 @@
 #include "kfd_svm.h"
 #include "amdgpu_amdkfd.h"
 #include "kfd_smi_events.h"
+#include "amdgpu_dma_buf.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -1936,6 +1938,33 @@ uint32_t get_process_num_bos(struct kfd_process *p)
return num_of_bos;
 }
 
+static int criu_get_prime_handle(struct drm_gem_object *gobj, int flags,
+ u32 *shared_fd)
+{
+   struct dma_buf *dmabuf;
+   int ret;
+
+   dmabuf = amdgpu_gem_prime_export(gobj, flags);
+   if (IS_ERR(dmabuf)) {
+   ret = PTR_ERR(dmabuf);
+   pr_err("dmabuf export failed for the BO\n");
+   return ret;
+   }
+
+   ret = dma_buf_fd(dmabuf, flags);
+   if (ret < 0) {
+   pr_err("dmabuf create fd failed, ret:%d\n", ret);
+   goto out_free_dmabuf;
+   }
+
+   *shared_fd = ret;
+   return 0;
+
+out_free_dmabuf:
+   dma_buf_put(dmabuf);
+   return ret;
+}
+
 static int criu_checkpoint_bos(struct kfd_process *p,
   uint32_t num_bos,
   uint8_t __user *user_bos,
@@ -1997,6 +2026,14 @@ static int criu_checkpoint_bos(struct kfd_process *p,
goto exit;
}
}
+   if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   ret = 
criu_get_prime_handle(_bo->tbo.base,
+   bo_bucket->alloc_flags &
+   
KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? DRM_RDWR : 0,
+   _bucket->dmabuf_fd);
+   if (ret)
+   goto exit;
+   }
if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL)
bo_bucket->offset = KFD_MMAP_TYPE_DOORBELL |
KFD_MMAP_GPU_ID(pdd->dev->id);
@@ -2041,6 +2078,10 @@ static int criu_checkpoint_bos(struct kfd_process *p,
*priv_offset += num_bos * sizeof(*bo_privs);
 
 exit:
+   while (ret && bo_index--) {
+   if (bo_buckets[bo_index].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   close_fd(bo_buckets[bo_index].dmabuf_fd);
+   }
 
kvfree(bo_buckets);
kvfree(bo_privs);
@@ -2141,16 +2182,28 @@ static int criu_checkpoint(struct file *filep,
ret = kfd_criu_checkpoint_queues(p, (uint8_t __user 
*)args->priv_data,
 _offset);
if (ret)
-   goto exit_unlock;
+   goto close_bo_fds;
 
ret = kfd_criu_checkpoint_events(p, (uint8_t __user 
*)args->priv_data,
 _offset);
if (ret)
-   goto exit_unlock;
+   goto close_bo_fds;
 
/* TODO: Dump SVM-Ranges */
}
 
+close_bo_fds:
+   if (ret) {
+   /* If IOCTL returns err, user assumes all FDs opened in 
criu_dump_bos are closed */
+   uint32_t i;
+   struct kfd_criu_bo_bucket *bo_buckets = (struct 
kfd_criu_bo_bucket *) args->bos;
+
+   for (i = 0; i < num_bos; i++) {
+   if (bo_buckets[i].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   close_fd(bo_buckets[i].dmabuf_fd);
+   }
+   }
+
 exit_unlock:
mutex_unlock(>mutex);
if (ret)
@@ -2345,6 +2398,7 @@ static int criu

[Patch v5 12/24] drm/amdkfd: CRIU checkpoint and restore queue mqds

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Checkpoint the contents of queue MQDs on CRIU dump and restore them
during CRIU restore.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |   2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  73 +++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  12 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   7 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  70 
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  71 
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  71 
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  72 
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   5 +
 .../amd/amdkfd/kfd_process_queue_manager.c| 157 --
 11 files changed, 516 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index d35911550792..999672602252 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -311,7 +311,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
p->pasid,
dev->id);
 
-   err = pqm_create_queue(>pqm, dev, filep, _properties, _id, 
NULL,
+   err = pqm_create_queue(>pqm, dev, filep, _properties, _id, 
NULL, NULL,
_offset_in_process);
if (err != 0)
goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 0c50e67e2b51..3a5303ebcabf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-   , , NULL, NULL);
+   , , NULL, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 13317d2c8959..42933610d4e1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -322,7 +322,8 @@ static void deallocate_vmid(struct device_queue_manager 
*dqm,
 static int create_queue_nocpsch(struct device_queue_manager *dqm,
struct queue *q,
struct qcm_process_device *qpd,
-   const struct kfd_criu_queue_priv_data *qd)
+   const struct kfd_criu_queue_priv_data *qd,
+   const void *restore_mqd)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -381,8 +382,14 @@ static int create_queue_nocpsch(struct 
device_queue_manager *dqm,
retval = -ENOMEM;
goto out_deallocate_doorbell;
}
-   mqd_mgr->init_mqd(mqd_mgr, >mqd, q->mqd_mem_obj,
-   >gart_mqd_addr, >properties);
+
+   if (qd)
+   mqd_mgr->restore_mqd(mqd_mgr, >mqd, q->mqd_mem_obj, 
>gart_mqd_addr,
+>properties, restore_mqd);
+   else
+   mqd_mgr->init_mqd(mqd_mgr, >mqd, q->mqd_mem_obj,
+   >gart_mqd_addr, >properties);
+
if (q->properties.is_active) {
if (!dqm->sched_running) {
WARN_ONCE(1, "Load non-HWS mqd while stopped\n");
@@ -1334,7 +1341,8 @@ static void destroy_kernel_queue_cpsch(struct 
device_queue_manager *dqm,
 
 static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue 
*q,
struct qcm_process_device *qpd,
-   const struct kfd_criu_queue_priv_data *qd)
+   const struct kfd_criu_queue_priv_data *qd,
+   const void *restore_mqd)
 {
int retval;
struct mqd_manager *mqd_mgr;
@@ -1380,8 +1388,13 @@ static int create_queue_cpsch(struct 
device_queue_manager *dqm, struct queue *q,
 * updates the is_evicted flag but is a no-op otherwise.
 */
q->properties.is_evicted = !!qpd->evicted;
-   mqd_mgr->init_mqd(mqd_mgr, >mqd, q->mqd_mem_obj,
-   >gart_mqd_addr, >properties);
+
+   if (qd)
+   mqd_mgr->restore_mqd(mqd_mgr, >mqd, q->mqd_mem_obj, 
>gart_mqd_addr,
+>properties, restore_mqd);
+   else
+   mqd_mgr->init_mqd(mqd_mgr, >mqd, q->mqd_mem_obj,
+   >gart_mqd_

[Patch v5 22/24] drm/amdkfd: CRIU prepare for svm resume

2022-02-03 Thread Rajneesh Bhardwaj
During the CRIU restore phase, the VMAs for the virtual address ranges
are not at their final location yet, so at this stage only cache the
data required to successfully resume the svm ranges during the imminent
CRIU resume phase.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 58 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 12 +
 4 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 721c86ceba22..c143f242a84d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2643,8 +2643,8 @@ static int criu_restore_objects(struct file *filep,
goto exit;
break;
case KFD_CRIU_OBJECT_TYPE_SVM_RANGE:
-   /* TODO: Implement SVM range */
-   *priv_offset += sizeof(struct 
kfd_criu_svm_range_priv_data);
+   ret = kfd_criu_restore_svm(p, (uint8_t __user 
*)args->priv_data,
+priv_offset, 
max_priv_data_size);
if (ret)
goto exit;
break;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 715dd0d4fac5..74ff4132a163 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -790,6 +790,7 @@ struct svm_range_list {
struct list_headlist;
struct work_struct  deferred_list_work;
struct list_headdeferred_range_list;
+   struct list_headcriu_svm_metadata_list;
spinlock_t  deferred_list_lock;
atomic_tevicted_ranges;
atomic_tdrain_pagefaults;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 7cf63995c079..41ac049b3316 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -45,6 +45,11 @@
  */
 #define AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING   2000
 
+struct criu_svm_metadata {
+   struct list_head list;
+   struct kfd_criu_svm_range_priv_data data;
+};
+
 static void svm_range_evict_svm_bo_worker(struct work_struct *work);
 static bool
 svm_range_cpu_invalidate_pagetables(struct mmu_interval_notifier *mni,
@@ -2875,6 +2880,7 @@ int svm_range_list_init(struct kfd_process *p)
INIT_DELAYED_WORK(>restore_work, svm_range_restore_work);
INIT_WORK(>deferred_list_work, svm_range_deferred_list_work);
INIT_LIST_HEAD(>deferred_range_list);
+   INIT_LIST_HEAD(>criu_svm_metadata_list);
spin_lock_init(>deferred_list_lock);
 
for (i = 0; i < p->n_pdds; i++)
@@ -3481,6 +3487,58 @@ svm_range_get_attr(struct kfd_process *p, struct 
mm_struct *mm,
return 0;
 }
 
+int kfd_criu_restore_svm(struct kfd_process *p,
+uint8_t __user *user_priv_ptr,
+uint64_t *priv_data_offset,
+uint64_t max_priv_data_size)
+{
+   uint64_t svm_priv_data_size, svm_object_md_size, svm_attrs_size;
+   int nattr_common = 4, nattr_accessibility = 1;
+   struct criu_svm_metadata *criu_svm_md = NULL;
+   struct svm_range_list *svms = >svms;
+   uint32_t num_devices;
+   int ret = 0;
+
+   num_devices = p->n_pdds;
+   /* Handle one SVM range object at a time, also the number of gpus are
+* assumed to be same on the restore node, checking must be done while
+* evaluating the topology earlier */
+
+   svm_attrs_size = sizeof(struct kfd_ioctl_svm_attribute) *
+   (nattr_common + nattr_accessibility * num_devices);
+   svm_object_md_size = sizeof(struct criu_svm_metadata) + svm_attrs_size;
+
+   svm_priv_data_size = sizeof(struct kfd_criu_svm_range_priv_data) +
+   svm_attrs_size;
+
+   criu_svm_md = kzalloc(svm_object_md_size, GFP_KERNEL);
+   if (!criu_svm_md) {
+   pr_err("failed to allocate memory to store svm metadata\n");
+   return -ENOMEM;
+   }
+   if (*priv_data_offset + svm_priv_data_size > max_priv_data_size) {
+   ret = -EINVAL;
+   goto exit;
+   }
+
+   ret = copy_from_user(_svm_md->data, user_priv_ptr + 
*priv_data_offset,
+svm_priv_data_size);
+   if (ret) {
+   ret = -EFAULT;
+   goto exit;
+   }
+   *priv_data_offset += svm_priv_data_size;
+
+   list_add_tail(_svm_md->list, >criu_svm_metadata_list);
+
+   return 0;
+

[Patch v5 21/24] drm/amdkfd: CRIU Save Shared Virtual Memory ranges

2022-02-03 Thread Rajneesh Bhardwaj
During the checkpoint stage, save the shared virtual memory ranges and
attributes for the target process. A process may contain a number of svm
ranges, and each range might contain a number of attributes. While not
all attributes may be applicable for a given prange, during checkpoint
we store all possible values for the maximum possible attribute types.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 95 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 10 +++
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index a755ea68a428..721c86ceba22 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2196,7 +2196,9 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto close_bo_fds;
 
-   /* TODO: Dump SVM-Ranges */
+   ret = kfd_criu_checkpoint_svm(p, (uint8_t __user 
*)args->priv_data, _offset);
+   if (ret)
+   goto close_bo_fds;
}
 
 close_bo_fds:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 64cd7712c098..7cf63995c079 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3540,6 +3540,101 @@ int svm_range_get_info(struct kfd_process *p, uint32_t 
*num_svm_ranges,
return 0;
 }
 
+int kfd_criu_checkpoint_svm(struct kfd_process *p,
+   uint8_t __user *user_priv_data,
+   uint64_t *priv_data_offset)
+{
+   struct kfd_criu_svm_range_priv_data *svm_priv = NULL;
+   struct kfd_ioctl_svm_attribute *query_attr = NULL;
+   uint64_t svm_priv_data_size, query_attr_size = 0;
+   int index, nattr_common = 4, ret = 0;
+   struct svm_range_list *svms;
+   int num_devices = p->n_pdds;
+   struct svm_range *prange;
+   struct mm_struct *mm;
+
+   svms = >svms;
+   if (!svms)
+   return -EINVAL;
+
+   mm = get_task_mm(p->lead_thread);
+   if (!mm) {
+   pr_err("failed to get mm for the target process\n");
+   return -ESRCH;
+   }
+
+   query_attr_size = sizeof(struct kfd_ioctl_svm_attribute) *
+   (nattr_common + num_devices);
+
+   query_attr = kzalloc(query_attr_size, GFP_KERNEL);
+   if (!query_attr) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   query_attr[0].type = KFD_IOCTL_SVM_ATTR_PREFERRED_LOC;
+   query_attr[1].type = KFD_IOCTL_SVM_ATTR_PREFETCH_LOC;
+   query_attr[2].type = KFD_IOCTL_SVM_ATTR_SET_FLAGS;
+   query_attr[3].type = KFD_IOCTL_SVM_ATTR_GRANULARITY;
+
+   for (index = 0; index < num_devices; index++) {
+   struct kfd_process_device *pdd = p->pdds[index];
+
+   query_attr[index + nattr_common].type =
+   KFD_IOCTL_SVM_ATTR_ACCESS;
+   query_attr[index + nattr_common].value = pdd->user_gpu_id;
+   }
+
+   svm_priv_data_size = sizeof(*svm_priv) + query_attr_size;
+
+   svm_priv = kzalloc(svm_priv_data_size, GFP_KERNEL);
+   if (!svm_priv) {
+   ret = -ENOMEM;
+   goto exit_query;
+   }
+
+   index = 0;
+   list_for_each_entry(prange, >list, list) {
+
+   svm_priv->object_type = KFD_CRIU_OBJECT_TYPE_SVM_RANGE;
+   svm_priv->start_addr = prange->start;
+   svm_priv->size = prange->npages;
+   memcpy(_priv->attrs, query_attr, query_attr_size);
+   pr_debug("CRIU: prange: 0x%p start: 0x%lx\t npages: 0x%llx end: 
0x%llx\t size: 0x%llx\n",
+prange, prange->start, prange->npages,
+prange->start + prange->npages - 1,
+prange->npages * PAGE_SIZE);
+
+   ret = svm_range_get_attr(p, mm, svm_priv->start_addr,
+svm_priv->size,
+(nattr_common + num_devices),
+svm_priv->attrs);
+   if (ret) {
+   pr_err("CRIU: failed to obtain range attributes\n");
+   goto exit_priv;
+   }
+
+   ret = copy_to_user(user_priv_data + *priv_data_offset,
+  svm_priv, svm_priv_data_size);
+   if (ret) {
+   pr_err("Failed to copy svm priv to user\n");
+   goto exit_priv;
+   }
+
+   *priv_data_offset += svm_priv_data_size;
+
+   }
+
+
+exit_priv:
+   kfree(svm_priv);
+exit_query:
+   kfree(query_attr);
+exit

[Patch v5 17/24] drm/amdkfd: CRIU checkpoint and restore xnack mode

2022-02-03 Thread Rajneesh Bhardwaj
Recoverable device page faults are represented by the xnack mode setting
inside a kfd process. For CR, we don't consider negative values, which
are typically used for querying the current xnack mode without modifying
it.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 15 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  1 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index ab5107a3fe36..3ec44f71307d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1848,6 +1848,11 @@ static int criu_checkpoint_process(struct kfd_process *p,
memset(_priv, 0, sizeof(process_priv));
 
process_priv.version = KFD_CRIU_PRIV_VERSION;
+   /* For CR, we don't consider negative xnack mode, which is only used
+* for querying without changing it. Here 0 means disabled and 1
+* means enabled, i.e. retry to find a valid PTE on a fault.
+*/
+   process_priv.xnack_mode = p->xnack_enabled ? 1 : 0;
 
ret = copy_to_user(user_priv_data + *priv_offset,
&process_priv, sizeof(process_priv));
@@ -2241,6 +2246,16 @@ static int criu_restore_process(struct kfd_process *p,
return -EINVAL;
}
 
+   pr_debug("Setting XNACK mode\n");
+   if (process_priv.xnack_mode && !kfd_process_xnack_mode(p, true)) {
+   pr_err("xnack mode cannot be set\n");
+   ret = -EPERM;
+   goto exit;
+   } else {
+   pr_debug("set xnack mode: %d\n", process_priv.xnack_mode);
+   p->xnack_enabled = process_priv.xnack_mode;
+   }
+
 exit:
return ret;
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index df68c4274bd9..903ad4a263f0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1056,6 +1056,7 @@ void kfd_process_set_trap_handler(struct 
qcm_process_device *qpd,
 
 struct kfd_criu_process_priv_data {
uint32_t version;
+   uint32_t xnack_mode;
 };
 
 struct kfd_criu_device_priv_data {
-- 
2.17.1



[Patch v5 11/24] drm/amdkfd: CRIU restore queue doorbell id

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When re-creating queues during CRIU restore, restore the queue with the
same doorbell id value used during CRIU dump.
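The restore-or-allocate pattern allocate_doorbell() implements below can be sketched with a plain bitmap (simplified; the real code uses the kernel bitmap API and per-queue-type rules, and toy_alloc_doorbell is an illustrative name):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_DOORBELLS 64

/* On restore the checkpointed id must still be free; otherwise take
 * the first free bit, as on a normal queue create. */
static int toy_alloc_doorbell(uint64_t *bitmap, const unsigned int *restore_id,
                              unsigned int *out_id)
{
    unsigned int i;

    if (restore_id) {
        if (*restore_id >= TOY_MAX_DOORBELLS ||
            (*bitmap & (1ULL << *restore_id)))
            return -EINVAL;           /* invalid or already in use */
        *bitmap |= 1ULL << *restore_id;
        *out_id = *restore_id;
        return 0;
    }
    for (i = 0; i < TOY_MAX_DOORBELLS; i++) {
        if (!(*bitmap & (1ULL << i))) {
            *bitmap |= 1ULL << i;
            *out_id = i;
            return 0;
        }
    }
    return -EBUSY;                    /* no doorbells available */
}
```

Failing hard with -EINVAL on a taken id is the right call for restore: the user-mode process baked the doorbell offset into its mapped pages at checkpoint time, so a different id would not be transparent.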

Signed-off-by: David Yat Sin 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 60 +--
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 15fa2dc6dcba..13317d2c8959 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -144,7 +144,13 @@ static void decrement_queue_count(struct 
device_queue_manager *dqm,
dqm->active_cp_queue_count--;
 }
 
-static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
+/*
+ * Allocate a doorbell ID to this queue.
+ * If doorbell_id is passed in, make sure requested ID is valid then allocate it.
+ */
+static int allocate_doorbell(struct qcm_process_device *qpd,
+struct queue *q,
+uint32_t const *restore_id)
 {
struct kfd_dev *dev = qpd->dqm->dev;
 
@@ -152,6 +158,10 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd, struct queue *q)
/* On pre-SOC15 chips we need to use the queue ID to
 * preserve the user mode ABI.
 */
+
+   if (restore_id && *restore_id != q->properties.queue_id)
+   return -EINVAL;
+
q->doorbell_id = q->properties.queue_id;
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
@@ -160,25 +170,37 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd, struct queue *q)
 * The doorbell index distance between RLC (2*i) and (2*i+1)
 * for a SDMA engine is 512.
 */
-   uint32_t *idx_offset =
-   dev->shared_resources.sdma_doorbell_idx;
 
-   q->doorbell_id = idx_offset[q->properties.sdma_engine_id]
-   + (q->properties.sdma_queue_id & 1)
-   * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
-   + (q->properties.sdma_queue_id >> 1);
+   uint32_t *idx_offset = dev->shared_resources.sdma_doorbell_idx;
+   uint32_t valid_id = idx_offset[q->properties.sdma_engine_id]
+   + (q->properties.sdma_queue_id & 1)
+   * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
+   + (q->properties.sdma_queue_id >> 1);
+
+   if (restore_id && *restore_id != valid_id)
+   return -EINVAL;
+   q->doorbell_id = valid_id;
} else {
-   /* For CP queues on SOC15 reserve a free doorbell ID */
-   unsigned int found;
-
-   found = find_first_zero_bit(qpd->doorbell_bitmap,
-   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
-   if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
-   pr_debug("No doorbells available");
-   return -EBUSY;
+   /* For CP queues on SOC15 */
+   if (restore_id) {
+   /* make sure that ID is free  */
+   if (__test_and_set_bit(*restore_id, qpd->doorbell_bitmap))
+   return -EINVAL;
+
+   q->doorbell_id = *restore_id;
+   } else {
+   /* or reserve a free doorbell ID */
+   unsigned int found;
+
+   found = find_first_zero_bit(qpd->doorbell_bitmap,
+   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
+   if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
+   pr_debug("No doorbells available");
+   return -EBUSY;
+   }
+   set_bit(found, qpd->doorbell_bitmap);
+   q->doorbell_id = found;
}
-   set_bit(found, qpd->doorbell_bitmap);
-   q->doorbell_id = found;
}
 
q->properties.doorbell_off =
@@ -346,7 +368,7 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
}
 
-   retval = allocate_doorbell(qpd, q);
-   retval = allocate_doorbell(qpd, q, qd ? &qd->doorbell_id : NULL);
if (retval)
goto out_deallocate_hqd;
 
@@ -1333,7 +1355,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
goto out;
}
 
-   retval = allocate_doorbell(qpd, q);
retval = allocate_doorbell(qpd, q, qd ? &qd->doorbell_id :

[Patch v5 08/24] drm/amdkfd: CRIU add queues support

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Add support to the existing CRIU ioctls to save the number of queues and
the queue properties for each queue during checkpoint, and to re-create
the queues on restore.
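The private-data sizing logic introduced below (criu_get_process_object_info) is additive across object types. A rough model, with illustrative record sizes that are not the real ABI:

```c
#include <assert.h>
#include <stdint.h>

struct toy_counts {
    uint32_t num_bos, num_queues, num_events, num_svm_ranges;
};

/* Fixed process header + per-BO records + the pre-summed queue payload.
 * Events and SVM ranges are still TODO in this patch, so they add 0. */
static uint64_t toy_priv_size(const struct toy_counts *c,
                              uint64_t process_priv, uint64_t bo_priv,
                              uint64_t queues_priv_total)
{
    uint64_t size = process_priv;

    size += (uint64_t)c->num_bos * bo_priv;
    size += queues_priv_total;
    return size;
}
```

Userspace calls PROCESS_INFO first to learn this size, allocates a buffer, and the later CHECKPOINT op rejects the request if the recomputed size no longer matches.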

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 110 -
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  43 +++-
 .../amd/amdkfd/kfd_process_queue_manager.c| 208 ++
 3 files changed, 353 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6af6deeda523..d049f9cbbc79 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2015,19 +2015,36 @@ static int criu_checkpoint_bos(struct kfd_process *p,
return ret;
 }
 
-static void criu_get_process_object_info(struct kfd_process *p,
-uint32_t *num_bos,
-uint64_t *objs_priv_size)
+static int criu_get_process_object_info(struct kfd_process *p,
+   uint32_t *num_bos,
+   uint32_t *num_objects,
+   uint64_t *objs_priv_size)
 {
+   int ret;
uint64_t priv_size;
+   uint32_t num_queues, num_events, num_svm_ranges;
+   uint64_t queues_priv_data_size;
 
*num_bos = get_process_num_bos(p);
 
+   ret = kfd_process_get_queue_info(p, &num_queues, &queues_priv_data_size);
+   if (ret)
+   return ret;
+
+   num_events = 0; /* TODO: Implement Events */
+   num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */
+
+   *num_objects = num_queues + num_events + num_svm_ranges;
+
if (objs_priv_size) {
priv_size = sizeof(struct kfd_criu_process_priv_data);
priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
+   priv_size += queues_priv_data_size;
+   /* TODO: Add Events priv size */
+   /* TODO: Add SVM ranges priv size */
*objs_priv_size = priv_size;
}
+   return 0;
 }
 
 static int criu_checkpoint(struct file *filep,
@@ -2035,7 +2052,7 @@ static int criu_checkpoint(struct file *filep,
   struct kfd_ioctl_criu_args *args)
 {
int ret;
-   uint32_t num_bos;
+   uint32_t num_bos, num_objects;
uint64_t priv_size, priv_offset = 0;
 
if (!args->bos || !args->priv_data)
@@ -2057,9 +2074,12 @@ static int criu_checkpoint(struct file *filep,
goto exit_unlock;
}
 
-   criu_get_process_object_info(p, &num_bos, &priv_size);
+   ret = criu_get_process_object_info(p, &num_bos, &num_objects, &priv_size);
+   if (ret)
+   goto exit_unlock;
 
if (num_bos != args->num_bos ||
+   num_objects != args->num_objects ||
priv_size != args->priv_data_size) {
 
ret = -EINVAL;
@@ -2076,6 +2096,17 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto exit_unlock;
 
+   if (num_objects) {
+   ret = kfd_criu_checkpoint_queues(p, (uint8_t __user *)args->priv_data,
+&priv_offset);
+   if (ret)
+   goto exit_unlock;
+
+   /* TODO: Dump Events */
+
+   /* TODO: Dump SVM-Ranges */
+   }
+
 exit_unlock:
mutex_unlock(&p->mutex);
if (ret)
@@ -2344,6 +2375,62 @@ static int criu_restore_bos(struct kfd_process *p,
return ret;
 }
 
+static int criu_restore_objects(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args,
+   uint64_t *priv_offset,
+   uint64_t max_priv_data_size)
+{
+   int ret = 0;
+   uint32_t i;
+
+   BUILD_BUG_ON(offsetof(struct kfd_criu_queue_priv_data, object_type));
+   BUILD_BUG_ON(offsetof(struct kfd_criu_event_priv_data, object_type));
+   BUILD_BUG_ON(offsetof(struct kfd_criu_svm_range_priv_data, object_type));
+
+   for (i = 0; i < args->num_objects; i++) {
+   uint32_t object_type;
+
+   if (*priv_offset + sizeof(object_type) > max_priv_data_size) {
+   pr_err("Invalid private data size\n");
+   return -EINVAL;
+   }
+
+   ret = get_user(object_type, (uint32_t __user *)(args->priv_data + *priv_offset));
+   if (ret) {
+   pr_err("Failed to copy private information from user\n");
+   goto exit;
+   }
+
+   switch (object_type) {
+   case KFD_CRIU_OBJECT_TYPE_QUEUE:
+   ret = kfd_criu_restore_queue(p, (uint8_t __user 
*)args->priv_data,
+   

[Patch v5 10/24] drm/amdkfd: CRIU restore sdma id for queues

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When re-creating queues during CRIU restore, restore the queue with the
same sdma id value used during CRIU dump.
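The sdma_bitmap handling in allocate_sdma_queue() below treats set bits as free ids. A minimal sketch of the restore-vs-allocate split (toy names; __builtin_ctzll stands in for the kernel's __ffs64()):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bits set in free_mask are FREE sdma ids. On restore, the
 * checkpointed id must still be free; otherwise take the lowest. */
static int toy_alloc_sdma_id(uint64_t *free_mask, const uint32_t *restore_id,
                             uint32_t *out_id)
{
    if (!*free_mask)
        return -ENOMEM;               /* no more SDMA queues */

    if (restore_id) {
        if (*restore_id >= 64 || !(*free_mask & (1ULL << *restore_id)))
            return -EBUSY;            /* SDMA queue already in use */
        *free_mask &= ~(1ULL << *restore_id);
        *out_id = *restore_id;
    } else {
        uint32_t bit = (uint32_t)__builtin_ctzll(*free_mask);
        *free_mask &= ~(1ULL << bit);
        *out_id = bit;
    }
    return 0;
}
```

Restoring the exact sdma_id matters because sdma_engine_id and sdma_queue_id are derived from it, and those feed the doorbell layout the checkpointed process already depends on.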

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  4 +-
 3 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 4b6814949aad..15fa2dc6dcba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -58,7 +58,7 @@ static inline void deallocate_hqd(struct device_queue_manager 
*dqm,
struct queue *q);
 static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q);
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   struct queue *q);
+   struct queue *q, const uint32_t *restore_sdma_id);
 static void kfd_process_hw_exception(struct work_struct *work);
 
 static inline
@@ -299,7 +299,8 @@ static void deallocate_vmid(struct device_queue_manager 
*dqm,
 
 static int create_queue_nocpsch(struct device_queue_manager *dqm,
struct queue *q,
-   struct qcm_process_device *qpd)
+   struct qcm_process_device *qpd,
+   const struct kfd_criu_queue_priv_data *qd)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -339,7 +340,7 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
q->pipe, q->queue);
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
-   retval = allocate_sdma_queue(dqm, q);
-   retval = allocate_sdma_queue(dqm, q, qd ? &qd->sdma_id : NULL);
if (retval)
goto deallocate_vmid;
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
@@ -1034,7 +1035,7 @@ static void pre_reset(struct device_queue_manager *dqm)
 }
 
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   struct queue *q)
+   struct queue *q, const uint32_t *restore_sdma_id)
 {
int bit;
 
@@ -1044,9 +1045,21 @@ static int allocate_sdma_queue(struct 
device_queue_manager *dqm,
return -ENOMEM;
}
 
-   bit = __ffs64(dqm->sdma_bitmap);
-   dqm->sdma_bitmap &= ~(1ULL << bit);
-   q->sdma_id = bit;
+   if (restore_sdma_id) {
+   /* Re-use existing sdma_id */
+   if (!(dqm->sdma_bitmap & (1ULL << *restore_sdma_id))) {
+   pr_err("SDMA queue already in use\n");
+   return -EBUSY;
+   }
+   dqm->sdma_bitmap &= ~(1ULL << *restore_sdma_id);
+   q->sdma_id = *restore_sdma_id;
+   } else {
+   /* Find first available sdma_id */
+   bit = __ffs64(dqm->sdma_bitmap);
+   dqm->sdma_bitmap &= ~(1ULL << bit);
+   q->sdma_id = bit;
+   }
+
q->properties.sdma_engine_id = q->sdma_id %
kfd_get_num_sdma_engines(dqm->dev);
q->properties.sdma_queue_id = q->sdma_id /
@@ -1056,9 +1069,19 @@ static int allocate_sdma_queue(struct 
device_queue_manager *dqm,
pr_err("No more XGMI SDMA queue to allocate\n");
return -ENOMEM;
}
-   bit = __ffs64(dqm->xgmi_sdma_bitmap);
-   dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
-   q->sdma_id = bit;
+   if (restore_sdma_id) {
+   /* Re-use existing sdma_id */
+   if (!(dqm->xgmi_sdma_bitmap & (1ULL << *restore_sdma_id))) {
+   pr_err("SDMA queue already in use\n");
+   return -EBUSY;
+   }
+   dqm->xgmi_sdma_bitmap &= ~(1ULL << *restore_sdma_id);
+   q->sdma_id = *restore_sdma_id;
+   } else {
+   bit = __ffs64(dqm->xgmi_sdma_bitmap);
+   dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
+   q->sdma_id = bit;
+   }
/* sdma_engine_id is sdma id including
 * both 

[Patch v5 07/24] drm/amdkfd: CRIU Implement KFD unpause operation

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Introduce the UNPAUSE op. After the CRIU amdgpu plugin performs a
PROCESS_INFO op, the queues stay in an evicted state. Once the plugin is
done draining BO contents, it is safe to perform an UNPAUSE op for the
queues to resume.
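The ordering this enforces (PROCESS_INFO evicts, CHECKPOINT requires the evicted state, UNPAUSE resumes) can be modeled as a tiny state machine around the queues_paused flag the patch adds. Names are illustrative; the real transitions live in criu_process_info(), criu_checkpoint() and criu_unpause():

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_proc { bool queues_paused; };

static int toy_process_info(struct toy_proc *p)
{
    p->queues_paused = true;          /* queues evicted in the driver */
    return 0;
}

static int toy_checkpoint(struct toy_proc *p)
{
    /* Cannot dump unless queues were evicted by PROCESS_INFO first. */
    return p->queues_paused ? 0 : -EINVAL;
}

static int toy_unpause(struct toy_proc *p)
{
    if (!p->queues_paused)
        return -EINVAL;               /* UNPAUSE without PROCESS_INFO */
    p->queues_paused = false;         /* queues restored and resumed */
    return 0;
}
```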

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 37 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  1 +
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 95fc5668195c..6af6deeda523 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2049,6 +2049,14 @@ static int criu_checkpoint(struct file *filep,
goto exit_unlock;
}
 
+   /* Confirm all process queues are evicted */
+   if (!p->queues_paused) {
+   pr_err("Cannot dump process when queues are not in evicted state\n");
+   /* CRIU plugin did not call op PROCESS_INFO before checkpointing */
+   ret = -EINVAL;
+   goto exit_unlock;
+   }
+
criu_get_process_object_info(p, &num_bos, &priv_size);
 
if (num_bos != args->num_bos ||
@@ -2388,7 +2396,24 @@ static int criu_unpause(struct file *filep,
struct kfd_process *p,
struct kfd_ioctl_criu_args *args)
 {
-   return 0;
+   int ret;
+
+   mutex_lock(&p->mutex);
+
+   if (!p->queues_paused) {
+   mutex_unlock(&p->mutex);
+   return -EINVAL;
+   }
+
+   ret = kfd_process_restore_queues(p);
+   if (ret)
+   pr_err("Failed to unpause queues ret:%d\n", ret);
+   else
+   p->queues_paused = false;
+
+   mutex_unlock(&p->mutex);
+
+   return ret;
 }
 
 static int criu_resume(struct file *filep,
@@ -2440,6 +2465,12 @@ static int criu_process_info(struct file *filep,
goto err_unlock;
}
 
+   ret = kfd_process_evict_queues(p);
+   if (ret)
+   goto err_unlock;
+
+   p->queues_paused = true;
+
args->pid = task_pid_nr_ns(p->lead_thread,
task_active_pid_ns(p->lead_thread));
 
@@ -2447,6 +2478,10 @@ static int criu_process_info(struct file *filep,
 
dev_dbg(kfd_device, "Num of bos:%u\n", args->num_bos);
 err_unlock:
+   if (ret) {
+   kfd_process_restore_queues(p);
+   p->queues_paused = false;
+   }
mutex_unlock(&p->mutex);
return ret;
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 9b347247055c..677f21447112 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -877,6 +877,8 @@ struct kfd_process {
bool xnack_enabled;
 
atomic_t poison;
+   /* Queues are in paused state because we are in the process of doing a CRIU checkpoint */
+   bool queues_paused;
 };
 
 #define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index b3198e186622..0649064b8e95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1384,6 +1384,7 @@ static struct kfd_process *create_process(const struct 
task_struct *thread)
process->mm = thread->mm;
process->lead_thread = thread->group_leader;
process->n_pdds = 0;
+   process->queues_paused = false;
INIT_DELAYED_WORK(>eviction_work, evict_process_worker);
INIT_DELAYED_WORK(>restore_work, restore_process_worker);
process->last_restore_timestamp = get_jiffies_64();
-- 
2.17.1



[Patch v5 05/24] drm/amdkfd: CRIU Implement KFD restore ioctl

2022-02-03 Thread Rajneesh Bhardwaj
This implements the KFD CRIU Restore ioctl that lays the basic
foundation for the CRIU restore operation. It provides support to
create the buffer objects corresponding to the checkpointed image.
This ioctl creates various types of buffer objects, such as VRAM,
MMIO, Doorbell and GTT, based on the data sent from the userspace
plugin. The data mostly contains the previously checkpointed KFD
images from some KFD process.

While restoring a CRIU process, attach old IDR values to newly
created BOs. This also adds minimal GPU mapping support for the
single-GPU checkpoint-restore use case.
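The restore path below consumes one packed private-data blob, with each reader advancing a shared offset after a bounds check (see criu_restore_process and criu_restore_bos). The core pattern, with an illustrative name and memcpy standing in for copy_from_user():

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Bounds-check against the total blob size before consuming, so a
 * truncated or corrupt checkpoint image fails cleanly with -EINVAL. */
static int toy_read_priv(const uint8_t *blob, uint64_t blob_size,
                         uint64_t *offset, void *dst, uint64_t len)
{
    if (*offset + len > blob_size)
        return -EINVAL;
    memcpy(dst, blob + *offset, len);
    *offset += len;
    return 0;
}
```

Keeping the offset in one place means the process header, BO records and per-object payloads can be written and read back in the same order without per-record framing.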

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 298 ++-
 1 file changed, 297 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 17a937b7139f..342fc56b1940 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2078,11 +2078,307 @@ static int criu_checkpoint(struct file *filep,
return ret;
 }
 
+static int criu_restore_process(struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args,
+   uint64_t *priv_offset,
+   uint64_t max_priv_data_size)
+{
+   int ret = 0;
+   struct kfd_criu_process_priv_data process_priv;
+
+   if (*priv_offset + sizeof(process_priv) > max_priv_data_size)
+   return -EINVAL;
+
+   ret = copy_from_user(&process_priv,
+   (void __user *)(args->priv_data + *priv_offset),
+   sizeof(process_priv));
+   if (ret) {
+   pr_err("Failed to copy process private information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+   *priv_offset += sizeof(process_priv);
+
+   if (process_priv.version != KFD_CRIU_PRIV_VERSION) {
+   pr_err("Invalid CRIU API version (checkpointed:%d current:%d)\n",
+   process_priv.version, KFD_CRIU_PRIV_VERSION);
+   return -EINVAL;
+   }
+
+exit:
+   return ret;
+}
+
+static int criu_restore_bos(struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args,
+   uint64_t *priv_offset,
+   uint64_t max_priv_data_size)
+{
+   struct kfd_criu_bo_bucket *bo_buckets;
+   struct kfd_criu_bo_priv_data *bo_privs;
+   bool flush_tlbs = false;
+   int ret = 0, j = 0;
+   uint32_t i;
+
+   if (*priv_offset + (args->num_bos * sizeof(*bo_privs)) > max_priv_data_size)
+   return -EINVAL;
+
+   bo_buckets = kvmalloc_array(args->num_bos, sizeof(*bo_buckets), GFP_KERNEL);
+   if (!bo_buckets)
+   return -ENOMEM;
+
+   ret = copy_from_user(bo_buckets, (void __user *)args->bos,
+args->num_bos * sizeof(*bo_buckets));
+   if (ret) {
+   pr_err("Failed to copy BOs information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+
+   bo_privs = kvmalloc_array(args->num_bos, sizeof(*bo_privs), GFP_KERNEL);
+   if (!bo_privs) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   ret = copy_from_user(bo_privs, (void __user *)args->priv_data + *priv_offset,
+args->num_bos * sizeof(*bo_privs));
+   if (ret) {
+   pr_err("Failed to copy BOs information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+   *priv_offset += args->num_bos * sizeof(*bo_privs);
+
+   /* Create and map new BOs */
+   for (i = 0; i < args->num_bos; i++) {
+   struct kfd_criu_bo_bucket *bo_bucket;
+   struct kfd_criu_bo_priv_data *bo_priv;
+   struct kfd_dev *dev;
+   struct kfd_process_device *pdd;
+   void *mem;
+   u64 offset;
+   int idr_handle;
+
+   bo_bucket = &bo_buckets[i];
+   bo_priv = &bo_privs[i];
+
+   dev = kfd_device_by_id(bo_bucket->gpu_id);
+   if (!dev) {
+   ret = -EINVAL;
+   pr_err("Failed to get pdd\n");
+   goto exit;
+   }
+   pdd = kfd_get_process_device_data(dev, p);
+   if (!pdd) {
+   ret = -EINVAL;
+   pr_err("Failed to get pdd\n");
+   goto exit;
+   }
+
+   pr_debug("kfd restore ioctl - bo_bucket[%d]:\n", i);
+   pr_debug("size = 0x%llx, bo_addr = 0x%llx bo_offset = 0x%llx\n"
+   "gpu_id = 0x%x alloc_flags = 0x%x\n"

[Patch v5 09/24] drm/amdkfd: CRIU restore queue ids

2022-02-03 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When re-creating queues during CRIU restore, restore the queue with the
same queue id value used during CRIU dump.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +
 .../amd/amdkfd/kfd_process_queue_manager.c| 37 +++
 4 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index d049f9cbbc79..d35911550792 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -311,7 +311,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
p->pasid,
dev->id);
 
-   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id,
+   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL,
&doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 1e30717b5253..0c50e67e2b51 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-   &properties, &qid, NULL);
+   &properties, &qid, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 41aa7b150a96..59125d8f16a7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -461,6 +461,7 @@ enum KFD_QUEUE_PRIORITY {
  * it's user mode or kernel mode queue.
  *
  */
+
 struct queue_properties {
enum kfd_queue_type type;
enum kfd_queue_format format;
@@ -1156,6 +1157,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
struct file *f,
struct queue_properties *properties,
unsigned int *qid,
+   const struct kfd_criu_queue_priv_data *q_data,
uint32_t *p_doorbell_offset_in_process);
 int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
 int pqm_update_queue_properties(struct process_queue_manager *pqm, unsigned 
int qid,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 38d3217f0f67..75bad4381421 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -42,6 +42,20 @@ static inline struct process_queue_node *get_queue_by_qid(
return NULL;
 }
 
+static int assign_queue_slot_by_qid(struct process_queue_manager *pqm,
+   unsigned int qid)
+{
+   if (qid >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
+   return -EINVAL;
+
+   if (__test_and_set_bit(qid, pqm->queue_slot_bitmap)) {
+   pr_err("Cannot create new queue because requested qid(%u) is in use\n", qid);
+   return -ENOSPC;
+   }
+
+   return 0;
+}
+
 static int find_available_queue_slot(struct process_queue_manager *pqm,
unsigned int *qid)
 {
@@ -193,6 +207,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
struct file *f,
struct queue_properties *properties,
unsigned int *qid,
+   const struct kfd_criu_queue_priv_data *q_data,
uint32_t *p_doorbell_offset_in_process)
 {
int retval;
@@ -224,7 +239,12 @@ int pqm_create_queue(struct process_queue_manager *pqm,
if (pdd->qpd.queue_count >= max_queues)
return -ENOSPC;
 
-   retval = find_available_queue_slot(pqm, qid);
+   if (q_data) {
+   retval = assign_queue_slot_by_qid(pqm, q_data->q_id);
+   *qid = q_data->q_id;
+   } else
+   retval = find_available_queue_slot(pqm, qid);
+
if (retval != 0)
return retval;
 
@@ -527,7 +547,7 @@ int kfd_process_get_queue_info(struct kfd_process *p,
return 0;
 }
 
-static void criu_dump_queue(struct kfd_process_device *pdd,
+static void criu_checkpoint_queue(struct kfd_process_device *pdd,
   struct queue *q,
   struct kfd_criu_queue_priv_data *q_data)
 {
@@ -559,7 +579,7 @@ static void criu_dump_queue(struct kfd_process

[Patch v5 02/24] drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs

2022-02-03 Thread Rajneesh Bhardwaj
Checkpoint-Restore In Userspace (CRIU) is a powerful tool that can
snapshot a running process and later restore it on the same or a remote
machine, but it expects processes that have a device file (e.g. a GPU)
associated with them to provide the necessary driver support to assist
CRIU and its extensible plugin interface. Thus, in order to support
checkpoint-restore of any ROCm process, the AMD Radeon Open Compute
kernel driver needs to provide a set of new APIs that supply the
necessary VRAM metadata and its contents to a userspace component (the
CRIU plugin), which can store it in the form of image files.

This introduces new ioctls which will be used to checkpoint-restore any
KFD-bound user process. KFD only allows ioctl calls from the same
process that opened the KFD file descriptor. Since these ioctls are
expected to be called from a KFD CRIU plugin, which has elevated
ptrace-attach privileges and the CAP_CHECKPOINT_RESTORE capability
attached to its file descriptors, modify KFD to allow such calls.
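The resulting access rule (same process, or a checkpoint/restore-privileged caller hitting an ioctl flagged for it) can be expressed as a small predicate. This is an illustrative model only; the flag name mirrors KFD_IOC_FLAG_CHECKPOINT_RESTORE from the diff below, and the capability check in the real driver uses the kernel's capability API:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_IOC_FLAG_CHECKPOINT_RESTORE 0x1u

/* A foreign caller is only allowed on ioctls explicitly marked as
 * checkpoint/restore entry points, and only when it holds the
 * checkpoint-restore capability (with ptrace attach in the real rule). */
static bool toy_ioctl_allowed(bool same_process, bool cap_checkpoint_restore,
                              unsigned int ioctl_flags)
{
    if (same_process)
        return true;
    return (ioctl_flags & TOY_IOC_FLAG_CHECKPOINT_RESTORE) &&
           cap_checkpoint_restore;
}
```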

(API redesigned by David Yat Sin)
Suggested-by: Felix Kuehling 
Reviewed-by: Felix Kuehling 
Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 98 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 65 +++-
 include/uapi/linux/kfd_ioctl.h   | 81 +++-
 3 files changed, 241 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 214a2c67fba4..90e6d9e335a5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "kfd_priv.h"
@@ -1859,6 +1860,75 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+static int criu_checkpoint(struct file *filep,
+  struct kfd_process *p,
+  struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_restore(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_unpause(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_resume(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_process_info(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void 
*data)
+{
+   struct kfd_ioctl_criu_args *args = data;
+   int ret;
+
+   dev_dbg(kfd_device, "CRIU operation: %d\n", args->op);
+   switch (args->op) {
+   case KFD_CRIU_OP_PROCESS_INFO:
+   ret = criu_process_info(filep, p, args);
+   break;
+   case KFD_CRIU_OP_CHECKPOINT:
+   ret = criu_checkpoint(filep, p, args);
+   break;
+   case KFD_CRIU_OP_UNPAUSE:
+   ret = criu_unpause(filep, p, args);
+   break;
+   case KFD_CRIU_OP_RESTORE:
+   ret = criu_restore(filep, p, args);
+   break;
+   case KFD_CRIU_OP_RESUME:
+   ret = criu_resume(filep, p, args);
+   break;
+   default:
+   dev_dbg(kfd_device, "Unsupported CRIU operation:%d\n", args->op);
+   ret = -EINVAL;
+   break;
+   }
+
+   if (ret)
+   dev_dbg(kfd_device, "CRIU operation:%d err:%d\n", args->op, ret);
+
+   return ret;
+}
+
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
.cmd_drv = 0, .name = #ioctl}
@@ -1962,6 +2032,10 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 
AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_XNACK_MODE,
kfd_ioctl_set_xnack_mode, 0),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_OP,
+   kfd_ioctl_criu, KFD_IOC_FLAG_CHECKPOINT_RESTORE),
+
 };
 
 #define AMDKFD_CORE_IOCTL_COUNTARRAY_SIZE(amdkfd_ioctls)
@@ -1976,6 +2050,7 @@ static long kfd_ioctl(struct file *filep, unsigned int 
cmd, unsigned long arg)
char *kdata = NULL;
unsigned int usize, asize;
int retcode = -EINVAL;
+   bool ptrace_attached = false;
 
if (nr >= AMDKFD_CORE_IOCTL_COUNT)
goto err_i1;
@@ -2001,7 +2076,15 @@ static long kfd_ioctl(struct file *filep, unsigned int 
cmd, unsigned long arg)
 * processes need to create their own KFD device con

[Patch v5 04/24] drm/amdkfd: CRIU Implement KFD checkpoint ioctl

2022-02-03 Thread Rajneesh Bhardwaj
This adds support to discover the buffer objects that belong to a
process being checkpointed. The data corresponding to these buffer
objects is returned to the userspace plugin running under the CRIU
master context, which then stores this info to recreate these buffer
objects during a restore operation.
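The checkpoint flow is a two-step handshake: the plugin first asks for counts and sizes (PROCESS_INFO), allocates buffers, and the CHECKPOINT op then re-derives the same numbers and rejects the request if anything changed in between. A minimal sketch with made-up sizes:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct toy_info { uint32_t num_bos; uint64_t priv_size; };

/* Stand-in for the PROCESS_INFO op: report current counts/sizes. */
static void toy_query(struct toy_info *out)
{
    out->num_bos = 4;
    out->priv_size = 4 * 32 + 16;     /* illustrative, not the real ABI */
}

/* Stand-in for the CHECKPOINT op: recompute and compare with what the
 * caller claimed, so a process that changed state gets -EINVAL. */
static int toy_checkpoint_op(const struct toy_info *claimed)
{
    struct toy_info now;

    toy_query(&now);
    if (now.num_bos != claimed->num_bos ||
        now.priv_size != claimed->priv_size)
        return -EINVAL;
    return 0;
}
```

This mirrors the `num_bos != args->num_bos || priv_size != args->priv_data_size` check in criu_checkpoint() above.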

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  11 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  20 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 177 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   4 +-
 6 files changed, 213 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index ac841ae8f5cc..395ba9566afe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -297,6 +297,7 @@ int amdgpu_amdkfd_get_tile_config(struct amdgpu_device 
*adev,
struct tile_config *config);
 void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev,
bool reset);
+bool amdgpu_amdkfd_bo_mapped_to_dev(struct amdgpu_device *adev, struct kgd_mem 
*mem);
 #if IS_ENABLED(CONFIG_HSA_AMD)
 void amdgpu_amdkfd_gpuvm_init_mem_limits(void);
 void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 5df387c4d7fb..3485ef856860 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2629,3 +2629,14 @@ int amdgpu_amdkfd_get_tile_config(struct amdgpu_device 
*adev,
 
return 0;
 }
+
+bool amdgpu_amdkfd_bo_mapped_to_dev(struct amdgpu_device *adev, struct kgd_mem 
*mem)
+{
+   struct kfd_mem_attachment *entry;
+
+   list_for_each_entry(entry, >attachments, list) {
+   if (entry->is_mapped && entry->adev == adev)
+   return true;
+   }
+   return false;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index b9637d1cf147..5a32ee66d8c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1127,6 +1127,26 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device 
*bdev,
return ttm_pool_free(>mman.bdev.pool, ttm);
 }
 
+/**
+ * amdgpu_ttm_tt_get_userptr - Return the userptr GTT ttm_tt for the current
+ * task
+ *
+ * @tbo: The ttm_buffer_object that contains the userptr
+ * @user_addr: Output pointer that receives the userptr address
+ */
+int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
+ uint64_t *user_addr)
+{
+   struct amdgpu_ttm_tt *gtt;
+
+   if (!tbo->ttm)
+   return -EINVAL;
+
+   gtt = (void *)tbo->ttm;
+   *user_addr = gtt->userptr;
+   return 0;
+}
+
 /**
  * amdgpu_ttm_tt_set_userptr - Initialize userptr GTT ttm_tt for the current
  * task
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index d9691f262f16..39d966e7185d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -181,6 +181,8 @@ static inline bool amdgpu_ttm_tt_get_user_pages_done(struct 
ttm_tt *ttm)
 #endif
 
 void amdgpu_ttm_tt_set_user_pages(struct ttm_tt *ttm, struct page **pages);
+int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
+ uint64_t *user_addr);
 int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo,
  uint64_t addr, uint32_t flags);
 bool amdgpu_ttm_tt_has_userptr(struct ttm_tt *ttm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 29443419bbf0..17a937b7139f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1860,6 +1860,29 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+static int criu_checkpoint_process(struct kfd_process *p,
+uint8_t __user *user_priv_data,
+uint64_t *priv_offset)
+{
+   struct kfd_criu_process_priv_data process_priv;
+   int ret;
+
+	memset(&process_priv, 0, sizeof(process_priv));
+
+   process_priv.version = KFD_CRIU_PRIV_VERSION;
+
+	ret = copy_to_user(user_priv_data + *priv_offset,
+			   &process_priv, sizeof(process_priv));
+
+   if (ret) {
+   pr_err("Failed to copy process information to user\n");
+   ret = -EFAULT;
+   }
+
+   *priv_offset += sizeof(process_priv);
+   return ret;
+}
+
 uint32_t get_process_num_bos(struct kfd_pro

[Patch v5 06/24] drm/amdkfd: CRIU Implement KFD resume ioctl

2022-02-03 Thread Rajneesh Bhardwaj
This adds support to create userptr BOs on restore and introduces a new
ioctl op to restart memory notifiers for the restored userptr BOs.
When doing a CRIU restore, MMU notifications can happen anytime after we
call amdgpu_mn_register. Prevent MMU notifications until we reach stage-4
of the restore process, i.e. until the criu_resume ioctl op is received
and the process is ready to be resumed. This ioctl is different from the
other KFD CRIU ioctls since it is called by the CRIU master restore
process for all the target processes being resumed by CRIU.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  6 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 53 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 41 --
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 35 ++--
 5 files changed, 122 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 395ba9566afe..4cb14c2fe53f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -131,6 +131,7 @@ struct amdkfd_process_info {
atomic_t evicted_bos;
struct delayed_work restore_userptr_work;
struct pid *pid;
+   bool block_mmu_notifications;
 };
 
 int amdgpu_amdkfd_init(void);
@@ -268,7 +269,7 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void 
*drm_priv);
 int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
struct amdgpu_device *adev, uint64_t va, uint64_t size,
void *drm_priv, struct kgd_mem **mem,
-   uint64_t *offset, uint32_t flags);
+   uint64_t *offset, uint32_t flags, bool criu_resume);
 int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
struct amdgpu_device *adev, struct kgd_mem *mem, void *drm_priv,
uint64_t *size);
@@ -298,6 +299,9 @@ int amdgpu_amdkfd_get_tile_config(struct amdgpu_device 
*adev,
 void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev,
bool reset);
bool amdgpu_amdkfd_bo_mapped_to_dev(struct amdgpu_device *adev, struct kgd_mem *mem);
+void amdgpu_amdkfd_block_mmu_notifications(void *p);
+int amdgpu_amdkfd_criu_resume(void *p);
+
 #if IS_ENABLED(CONFIG_HSA_AMD)
 void amdgpu_amdkfd_gpuvm_init_mem_limits(void);
 void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 3485ef856860..69dc9e4d841c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -842,7 +842,8 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem 
*mem,
  *
  * Returns 0 for success, negative errno for errors.
  */
-static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr)
+static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr,
+  bool criu_resume)
 {
struct amdkfd_process_info *process_info = mem->process_info;
struct amdgpu_bo *bo = mem->bo;
@@ -864,6 +865,18 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr)
goto out;
}
 
+   if (criu_resume) {
+   /*
+* During a CRIU restore operation, the userptr buffer objects
+* will be validated in the restore_userptr_work worker at a
+* later stage when it is scheduled by another ioctl called by
+* CRIU master process for the target pid for restore.
+*/
+		atomic_inc(&mem->invalid);
+		mutex_unlock(&process_info->lock);
+   return 0;
+   }
+
ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages);
if (ret) {
pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
@@ -1452,10 +1465,39 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void 
*drm_priv)
return avm->pd_phys_addr;
 }
 
+void amdgpu_amdkfd_block_mmu_notifications(void *p)
+{
+   struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p;
+
+	mutex_lock(&pinfo->lock);
+	WRITE_ONCE(pinfo->block_mmu_notifications, true);
+	mutex_unlock(&pinfo->lock);
+}
+
+int amdgpu_amdkfd_criu_resume(void *p)
+{
+   int ret = 0;
+   struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p;
+
+	mutex_lock(&pinfo->lock);
+	pr_debug("scheduling work\n");
+	atomic_inc(&pinfo->evicted_bos);
+	if (!READ_ONCE(pinfo->block_mmu_notifications)) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+	WRITE_ONCE(pinfo->block_mmu_notifications, false);
+	schedule_delayed_work(&pinfo->restore_userptr_work, 0);
+
+out_u

[Patch v5 03/24] drm/amdkfd: CRIU Implement KFD process_info ioctl

2022-02-03 Thread Rajneesh Bhardwaj
This IOCTL op is expected to be called as a precursor to the actual
Checkpoint operation. It does the basic discovery of the target process
seized by CRIU and relays the information to userspace, which uses it to
start the Checkpoint operation via another dedicated IOCTL op.

The process_info IOCTL op determines the number of GPUs and the buffer
objects that are associated with the target process, and reports the
process id in the caller's namespace, since the /proc/pid/mem interface
may be used to drain the contents of the discovered buffer objects in
userspace, where getpid returns the pid of the CRIU dumper process. Also,
the pid of a process inside a container might differ from its global pid,
so return the ns pid.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 56 +++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 90e6d9e335a5..29443419bbf0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1860,6 +1860,42 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+uint32_t get_process_num_bos(struct kfd_process *p)
+{
+   uint32_t num_of_bos = 0;
+   int i;
+
+   /* Run over all PDDs of the process */
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+   void *mem;
+   int id;
+
+		idr_for_each_entry(&pdd->alloc_idr, mem, id) {
+   struct kgd_mem *kgd_mem = (struct kgd_mem *)mem;
+
+   if ((uint64_t)kgd_mem->va > pdd->gpuvm_base)
+   num_of_bos++;
+   }
+   }
+   return num_of_bos;
+}
+
+static void criu_get_process_object_info(struct kfd_process *p,
+uint32_t *num_bos,
+uint64_t *objs_priv_size)
+{
+   uint64_t priv_size;
+
+   *num_bos = get_process_num_bos(p);
+
+   if (objs_priv_size) {
+   priv_size = sizeof(struct kfd_criu_process_priv_data);
+   priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
+   *objs_priv_size = priv_size;
+   }
+}
+
 static int criu_checkpoint(struct file *filep,
   struct kfd_process *p,
   struct kfd_ioctl_criu_args *args)
@@ -1892,7 +1928,25 @@ static int criu_process_info(struct file *filep,
struct kfd_process *p,
struct kfd_ioctl_criu_args *args)
 {
-   return 0;
+   int ret = 0;
+
+	mutex_lock(&p->mutex);
+
+   if (!p->n_pdds) {
+   pr_err("No pdd for given process\n");
+   ret = -ENODEV;
+   goto err_unlock;
+   }
+
+   args->pid = task_pid_nr_ns(p->lead_thread,
+   task_active_pid_ns(p->lead_thread));
+
+	criu_get_process_object_info(p, &args->num_bos, &args->priv_data_size);
+
+   dev_dbg(kfd_device, "Num of bos:%u\n", args->num_bos);
+err_unlock:
	mutex_unlock(&p->mutex);
+   return ret;
 }
 
 static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void 
*data)
-- 
2.17.1



[Patch v5 01/24] x86/configs: CRIU update debug rock defconfig

2022-02-03 Thread Rajneesh Bhardwaj
 - Update debug config for Checkpoint-Restore (CR) support
 - Also include necessary options for CR with docker containers.

Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 arch/x86/configs/rock-dbg_defconfig | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/arch/x86/configs/rock-dbg_defconfig 
b/arch/x86/configs/rock-dbg_defconfig
index 4877da183599..bc2a34666c1d 100644
--- a/arch/x86/configs/rock-dbg_defconfig
+++ b/arch/x86/configs/rock-dbg_defconfig
@@ -249,6 +249,7 @@ CONFIG_KALLSYMS_ALL=y
 CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
 CONFIG_KALLSYMS_BASE_RELATIVE=y
 # CONFIG_USERFAULTFD is not set
+CONFIG_USERFAULTFD=y
 CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
 CONFIG_KCMP=y
 CONFIG_RSEQ=y
@@ -1015,6 +1016,11 @@ CONFIG_PACKET_DIAG=y
 CONFIG_UNIX=y
 CONFIG_UNIX_SCM=y
 CONFIG_UNIX_DIAG=y
+CONFIG_SMC_DIAG=y
+CONFIG_XDP_SOCKETS_DIAG=y
+CONFIG_INET_MPTCP_DIAG=y
+CONFIG_TIPC_DIAG=y
+CONFIG_VSOCKETS_DIAG=y
 # CONFIG_TLS is not set
 CONFIG_XFRM=y
 CONFIG_XFRM_ALGO=y
@@ -1052,15 +1058,17 @@ CONFIG_SYN_COOKIES=y
 # CONFIG_NET_IPVTI is not set
 # CONFIG_NET_FOU is not set
 # CONFIG_NET_FOU_IP_TUNNELS is not set
-# CONFIG_INET_AH is not set
-# CONFIG_INET_ESP is not set
-# CONFIG_INET_IPCOMP is not set
-CONFIG_INET_TUNNEL=y
-CONFIG_INET_DIAG=y
-CONFIG_INET_TCP_DIAG=y
-# CONFIG_INET_UDP_DIAG is not set
-# CONFIG_INET_RAW_DIAG is not set
-# CONFIG_INET_DIAG_DESTROY is not set
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_ESP_OFFLOAD=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+CONFIG_INET_RAW_DIAG=m
+CONFIG_INET_DIAG_DESTROY=y
 CONFIG_TCP_CONG_ADVANCED=y
 # CONFIG_TCP_CONG_BIC is not set
 CONFIG_TCP_CONG_CUBIC=y
@@ -1085,12 +1093,14 @@ CONFIG_TCP_MD5SIG=y
 CONFIG_IPV6=y
 # CONFIG_IPV6_ROUTER_PREF is not set
 # CONFIG_IPV6_OPTIMISTIC_DAD is not set
-CONFIG_INET6_AH=y
-CONFIG_INET6_ESP=y
-# CONFIG_INET6_ESP_OFFLOAD is not set
-# CONFIG_INET6_ESPINTCP is not set
-# CONFIG_INET6_IPCOMP is not set
-# CONFIG_IPV6_MIP6 is not set
+CONFIG_INET6_AH=m
+CONFIG_INET6_ESP=m
+CONFIG_INET6_ESP_OFFLOAD=m
+CONFIG_INET6_IPCOMP=m
+CONFIG_IPV6_MIP6=m
+CONFIG_INET6_XFRM_TUNNEL=m
+CONFIG_INET_DCCP_DIAG=m
+CONFIG_INET_SCTP_DIAG=m
 # CONFIG_IPV6_ILA is not set
 # CONFIG_IPV6_VTI is not set
 CONFIG_IPV6_SIT=y
@@ -1146,8 +1156,13 @@ CONFIG_NF_CT_PROTO_UDPLITE=y
 # CONFIG_NF_CONNTRACK_SANE is not set
 # CONFIG_NF_CONNTRACK_SIP is not set
 # CONFIG_NF_CONNTRACK_TFTP is not set
-# CONFIG_NF_CT_NETLINK is not set
-# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
+CONFIG_COMPAT_NETLINK_MESSAGES=y
+CONFIG_NF_CT_NETLINK=m
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_SCSI_NETLINK=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
 CONFIG_NF_NAT=m
 CONFIG_NF_NAT_REDIRECT=y
 CONFIG_NF_NAT_MASQUERADE=y
@@ -1992,7 +2007,7 @@ CONFIG_NETCONSOLE_DYNAMIC=y
 CONFIG_NETPOLL=y
 CONFIG_NET_POLL_CONTROLLER=y
 # CONFIG_RIONET is not set
-# CONFIG_TUN is not set
+CONFIG_TUN=y
 # CONFIG_TUN_VNET_CROSS_LE is not set
 CONFIG_VETH=y
 # CONFIG_NLMON is not set
@@ -3990,7 +4005,7 @@ CONFIG_MANDATORY_FILE_LOCKING=y
 CONFIG_FSNOTIFY=y
 CONFIG_DNOTIFY=y
 CONFIG_INOTIFY_USER=y
-# CONFIG_FANOTIFY is not set
+CONFIG_FANOTIFY=y
 CONFIG_QUOTA=y
 CONFIG_QUOTA_NETLINK_INTERFACE=y
 # CONFIG_PRINT_QUOTA_WARNING is not set
-- 
2.17.1



[Patch v5 00/24] CHECKPOINT RESTORE WITH ROCm

2022-02-03 Thread Rajneesh Bhardwaj
V5: Proposed IOCTL APIs for CRIU with consolidated feedback

CRIU is a user space tool which is very popular for container live
migration in datacentres. It can checkpoint a running application, save
its complete state, memory contents and all system resources to images
on disk which can be migrated to another machine and restored later.
More information on CRIU can be found at https://criu.org/Main_Page

CRIU currently does not support Checkpoint / Restore with applications
that have devices files open so it cannot perform checkpoint and restore
on GPU devices which are very complex and have their own VRAM managed
privately. CRIU, however can support external devices by using a plugin
architecture. We feel that we are getting close to finalizing our IOCTL
APIs which were again changed since V3 for an improved modular design.

Our changes to CRIU user space can be obtained from here:
https://github.com/RadeonOpenCompute/criu/tree/amdgpu_rfc-211222

We have tested the following scenarios:
 - Checkpoint / Restore of a Pytorch (BERT) workload
 - kfdtests with queues and events
 - Gfx9 and Gfx10 based multi GPU test systems 
 - On baremetal and inside a docker container
 - Restoring on a different system

V1: Initial
V2: Addressed review comments
V3: Rebased on latest amd-staging-drm-next (5.15 based)
v4: New API design and basic support for SVM; however, there is an
outstanding issue with SVM restore which is currently under debug, and
hopefully that won't impact the ioctl APIs, as SVMs are treated as
private data hidden from user space, like queues and events, with the
new approach.
V5: Fix the SVM related issues and finalize the APIs.

David Yat Sin (9):
  drm/amdkfd: CRIU Implement KFD unpause operation
  drm/amdkfd: CRIU add queues support
  drm/amdkfd: CRIU restore queue ids
  drm/amdkfd: CRIU restore sdma id for queues
  drm/amdkfd: CRIU restore queue doorbell id
  drm/amdkfd: CRIU checkpoint and restore queue mqds
  drm/amdkfd: CRIU checkpoint and restore queue control stack
  drm/amdkfd: CRIU checkpoint and restore events
  drm/amdkfd: CRIU implement gpu_id remapping

Rajneesh Bhardwaj (15):
  x86/configs: CRIU update debug rock defconfig
  drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs
  drm/amdkfd: CRIU Implement KFD process_info ioctl
  drm/amdkfd: CRIU Implement KFD checkpoint ioctl
  drm/amdkfd: CRIU Implement KFD restore ioctl
  drm/amdkfd: CRIU Implement KFD resume ioctl
  drm/amdkfd: CRIU export BOs as prime dmabuf objects
  drm/amdkfd: CRIU checkpoint and restore xnack mode
  drm/amdkfd: CRIU allow external mm for svm ranges
  drm/amdkfd: use user_gpu_id for svm ranges
  drm/amdkfd: CRIU Discover svm ranges
  drm/amdkfd: CRIU Save Shared Virtual Memory ranges
  drm/amdkfd: CRIU prepare for svm resume
  drm/amdkfd: CRIU resume shared virtual memory ranges
  drm/amdkfd: Bump up KFD API version for CRIU

 arch/x86/configs/rock-dbg_defconfig   |   53 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|7 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   64 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   20 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 1471 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  185 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |   16 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  313 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   14 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |   75 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |   77 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |   92 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |   84 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  160 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   72 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  372 -
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  331 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h  |   39 +
 include/uapi/linux/kfd_ioctl.h|   84 +-
 21 files changed, 3193 insertions(+), 340 deletions(-)

-- 
2.17.1



[Patch v4 24/24] drm/amdkfd: CRIU resume shared virtual memory ranges

2021-12-22 Thread Rajneesh Bhardwaj
In CRIU resume stage, resume all the shared virtual memory ranges from
the data stored inside the resuming kfd process during CRIU restore
phase. Also setup xnack mode and free up the resources.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 10 +
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 55 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h |  6 +++
 3 files changed, 71 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f7aa15b18f95..6191e37656dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2759,7 +2759,17 @@ static int criu_resume(struct file *filep,
}
 
	mutex_lock(&target->mutex);
+   ret = kfd_criu_resume_svm(target);
+   if (ret) {
+   pr_err("kfd_criu_resume_svm failed for %i\n", args->pid);
+   goto exit;
+   }
+
ret =  amdgpu_amdkfd_criu_resume(target->kgd_process_info);
+   if (ret)
+   pr_err("amdgpu_amdkfd_criu_resume failed for %i\n", args->pid);
+
+exit:
	mutex_unlock(&target->mutex);
 
kfd_unref_process(target);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e9f6c63c2a26..bd2dce37f345 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3427,6 +3427,61 @@ svm_range_get_attr(struct kfd_process *p, struct 
mm_struct *mm,
return 0;
 }
 
+int kfd_criu_resume_svm(struct kfd_process *p)
+{
+   int nattr_common = 4, nattr_accessibility = 1;
+   struct criu_svm_metadata *criu_svm_md = NULL;
+   struct criu_svm_metadata *next = NULL;
+	struct svm_range_list *svms = &p->svms;
+   int i, j, num_attrs, ret = 0;
+   struct mm_struct *mm;
+
+	if (list_empty(&svms->criu_svm_metadata_list)) {
+   pr_debug("No SVM data from CRIU restore stage 2\n");
+   return ret;
+   }
+
+   mm = get_task_mm(p->lead_thread);
+   if (!mm) {
+   pr_err("failed to get mm for the target process\n");
+   return -ESRCH;
+   }
+
+   num_attrs = nattr_common + (nattr_accessibility * p->n_pdds);
+
+   i = j = 0;
+	list_for_each_entry(criu_svm_md, &svms->criu_svm_metadata_list, list) {
+		pr_debug("criu_svm_md[%d]\n\tstart: 0x%llx size: 0x%llx (npages)\n",
+			 i, criu_svm_md->start_addr, criu_svm_md->size);
+		for (j = 0; j < num_attrs; j++) {
+			pr_debug("\ncriu_svm_md[%d]->attrs[%d].type : 0x%x \ncriu_svm_md[%d]->attrs[%d].value : 0x%x\n",
+				 i, j, criu_svm_md->attrs[j].type,
+				 i, j, criu_svm_md->attrs[j].value);
+		}
+   }
+
+   ret = svm_range_set_attr(p, mm, criu_svm_md->start_addr,
+criu_svm_md->size, num_attrs,
+criu_svm_md->attrs);
+   if (ret) {
+   pr_err("CRIU: failed to set range attributes\n");
+   goto exit;
+   }
+
+   i++;
+   }
+
+exit:
+	list_for_each_entry_safe(criu_svm_md, next,
+				 &svms->criu_svm_metadata_list, list) {
+   pr_debug("freeing criu_svm_md[]\n\tstart: 0x%llx\n",
+   criu_svm_md->start_addr);
+   kfree(criu_svm_md);
+   }
+
+   mmput(mm);
+   return ret;
+
+}
+
 int svm_criu_prepare_for_resume(struct kfd_process *p,
struct kfd_criu_svm_range_priv_data *svm_priv)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
index e0c0853f085c..3b5bcb52723c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -195,6 +195,7 @@ int kfd_criu_restore_svm(struct kfd_process *p,
 uint8_t __user *user_priv_ptr,
 uint64_t *priv_data_offset,
 uint64_t max_priv_data_size);
+int kfd_criu_resume_svm(struct kfd_process *p);
 struct kfd_process_device *
 svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device 
*adev);
 void svm_range_list_lock_and_flush_work(struct svm_range_list *svms, struct 
mm_struct *mm);
@@ -256,6 +257,11 @@ static inline int kfd_criu_restore_svm(struct kfd_process 
*p,
return -EINVAL;
 }
 
+static inline int kfd_criu_resume_svm(struct kfd_process *p)
+{
+   return 0;
+}
+
 #define KFD_IS_SVM_API_SUPPORTED(dev) false
 
 #endif /* IS_ENABLED(CONFIG_HSA_AMD_SVM) */
-- 
2.17.1



[Patch v4 02/24] x86/configs: Add rock-rel_defconfig for amd-feature-criu branch

2021-12-22 Thread Rajneesh Bhardwaj
 - Add rock-rel_defconfig for release builds.

Signed-off-by: Rajneesh Bhardwaj 
---
 arch/x86/configs/rock-rel_defconfig | 4927 +++
 1 file changed, 4927 insertions(+)
 create mode 100644 arch/x86/configs/rock-rel_defconfig

diff --git a/arch/x86/configs/rock-rel_defconfig 
b/arch/x86/configs/rock-rel_defconfig
new file mode 100644
index ..f038ce7a0d06
--- /dev/null
+++ b/arch/x86/configs/rock-rel_defconfig
@@ -0,0 +1,4927 @@
+#
+# Automatically generated file; DO NOT EDIT.
+# Linux/x86 5.13.0 Kernel Configuration
+#
+CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
+CONFIG_CC_IS_GCC=y
+CONFIG_GCC_VERSION=70500
+CONFIG_CLANG_VERSION=0
+CONFIG_AS_IS_GNU=y
+CONFIG_AS_VERSION=23000
+CONFIG_LD_IS_BFD=y
+CONFIG_LD_VERSION=23000
+CONFIG_LLD_VERSION=0
+CONFIG_CC_CAN_LINK=y
+CONFIG_CC_CAN_LINK_STATIC=y
+CONFIG_CC_HAS_ASM_GOTO=y
+CONFIG_CC_HAS_ASM_INLINE=y
+CONFIG_IRQ_WORK=y
+CONFIG_BUILDTIME_TABLE_SORT=y
+CONFIG_THREAD_INFO_IN_TASK=y
+
+#
+# General setup
+#
+CONFIG_INIT_ENV_ARG_LIMIT=32
+# CONFIG_COMPILE_TEST is not set
+CONFIG_LOCALVERSION="-kfd"
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_BUILD_SALT=""
+CONFIG_HAVE_KERNEL_GZIP=y
+CONFIG_HAVE_KERNEL_BZIP2=y
+CONFIG_HAVE_KERNEL_LZMA=y
+CONFIG_HAVE_KERNEL_XZ=y
+CONFIG_HAVE_KERNEL_LZO=y
+CONFIG_HAVE_KERNEL_LZ4=y
+CONFIG_HAVE_KERNEL_ZSTD=y
+CONFIG_KERNEL_GZIP=y
+# CONFIG_KERNEL_BZIP2 is not set
+# CONFIG_KERNEL_LZMA is not set
+# CONFIG_KERNEL_XZ is not set
+# CONFIG_KERNEL_LZO is not set
+# CONFIG_KERNEL_LZ4 is not set
+# CONFIG_KERNEL_ZSTD is not set
+CONFIG_DEFAULT_INIT=""
+CONFIG_DEFAULT_HOSTNAME="(none)"
+CONFIG_SWAP=y
+CONFIG_SYSVIPC=y
+CONFIG_SYSVIPC_SYSCTL=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_POSIX_MQUEUE_SYSCTL=y
+# CONFIG_WATCH_QUEUE is not set
+CONFIG_CROSS_MEMORY_ATTACH=y
+CONFIG_USELIB=y
+CONFIG_AUDIT=y
+CONFIG_HAVE_ARCH_AUDITSYSCALL=y
+CONFIG_AUDITSYSCALL=y
+
+#
+# IRQ subsystem
+#
+CONFIG_GENERIC_IRQ_PROBE=y
+CONFIG_GENERIC_IRQ_SHOW=y
+CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
+CONFIG_GENERIC_PENDING_IRQ=y
+CONFIG_GENERIC_IRQ_MIGRATION=y
+CONFIG_HARDIRQS_SW_RESEND=y
+CONFIG_IRQ_DOMAIN=y
+CONFIG_IRQ_DOMAIN_HIERARCHY=y
+CONFIG_GENERIC_MSI_IRQ=y
+CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
+CONFIG_IRQ_MSI_IOMMU=y
+CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
+CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
+CONFIG_IRQ_FORCED_THREADING=y
+CONFIG_SPARSE_IRQ=y
+# CONFIG_GENERIC_IRQ_DEBUGFS is not set
+# end of IRQ subsystem
+
+CONFIG_CLOCKSOURCE_WATCHDOG=y
+CONFIG_ARCH_CLOCKSOURCE_INIT=y
+CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
+CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
+CONFIG_GENERIC_CMOS_UPDATE=y
+CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
+CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
+
+#
+# Timers subsystem
+#
+CONFIG_TICK_ONESHOT=y
+CONFIG_NO_HZ_COMMON=y
+# CONFIG_HZ_PERIODIC is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_NO_HZ_FULL is not set
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+# end of Timers subsystem
+
+CONFIG_BPF=y
+CONFIG_HAVE_EBPF_JIT=y
+CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
+
+#
+# BPF subsystem
+#
+CONFIG_BPF_SYSCALL=y
+# CONFIG_BPF_JIT is not set
+# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set
+# CONFIG_BPF_PRELOAD is not set
+# end of BPF subsystem
+
+# CONFIG_PREEMPT_NONE is not set
+CONFIG_PREEMPT_VOLUNTARY=y
+# CONFIG_PREEMPT is not set
+CONFIG_PREEMPT_COUNT=y
+
+#
+# CPU/Task time and stats accounting
+#
+CONFIG_TICK_CPU_ACCOUNTING=y
+# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+# CONFIG_PSI is not set
+# end of CPU/Task time and stats accounting
+
+# CONFIG_CPU_ISOLATION is not set
+
+#
+# RCU Subsystem
+#
+CONFIG_TREE_RCU=y
+# CONFIG_RCU_EXPERT is not set
+CONFIG_SRCU=y
+CONFIG_TREE_SRCU=y
+CONFIG_TASKS_RCU_GENERIC=y
+CONFIG_TASKS_RUDE_RCU=y
+CONFIG_TASKS_TRACE_RCU=y
+CONFIG_RCU_STALL_COMMON=y
+CONFIG_RCU_NEED_SEGCBLIST=y
+# end of RCU Subsystem
+
+CONFIG_BUILD_BIN2C=y
+# CONFIG_IKCONFIG is not set
+# CONFIG_IKHEADERS is not set
+CONFIG_LOG_BUF_SHIFT=18
+CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
+CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
+
+#
+# Scheduler features
+#
+# CONFIG_UCLAMP_TASK is not set
+# end of Scheduler features
+
+CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
+CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
+CONFIG_CC_HAS_INT128=y
+CONFIG_ARCH_SUPPORTS_INT128=y
+CONFIG_NUMA_BALANCING=y
+CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
+CONFIG_CGROUPS=y
+CONFIG_PAGE_COUNTER=y
+CONFIG_MEMCG=y
+CONFIG_MEMCG_SWAP=y
+CONFIG_MEMCG_KMEM=y
+CONFIG_BLK_CGROUP=y
+CONFIG_CGROUP_WRITEBACK=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_FAIR_GROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+# CONFIG_RT_GROUP_SCHED is not set
+CONFIG_CGROUP_PIDS=y
+# CONFIG_CGROUP_RDMA is not set
+CONFIG_CGROUP_FR

[Patch v4 14/24] drm/amdkfd: CRIU checkpoint and restore queue control stack

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Checkpoint contents of queue control stacks on CRIU dump and restore them
during CRIU restore.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 23 ---
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  9 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  | 11 +++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 13 ++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 14 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 29 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 22 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 +-
 .../amd/amdkfd/kfd_process_queue_manager.c| 62 +--
 11 files changed, 139 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 146879cd3f2b..582b4a393f95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
p->pasid,
dev->id);
 
-	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL,
+	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL, NULL,
 			&doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 3a5303ebcabf..8eca9ed3ab36 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-			&properties, &qid, NULL, NULL, NULL);
+			&properties, &qid, NULL, NULL, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index a92274f9f1f7..248e69c7960b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -332,7 +332,7 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
struct queue *q,
struct qcm_process_device *qpd,
const struct kfd_criu_queue_priv_data *qd,
-				const void *restore_mqd)
+				const void *restore_mqd, const void *restore_ctl_stack)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -394,7 +394,8 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
 
	if (qd)
		mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr,
-				     &q->properties, restore_mqd);
+				     &q->properties, restore_mqd, restore_ctl_stack,
+				     qd->ctl_stack_size);
	else
		mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
				  &q->gart_mqd_addr, &q->properties);
@@ -1347,7 +1348,7 @@ static void destroy_kernel_queue_cpsch(struct 
device_queue_manager *dqm,
 static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue 
*q,
struct qcm_process_device *qpd,
const struct kfd_criu_queue_priv_data *qd,
-   const void *restore_mqd)
+   const void *restore_mqd, const void *restore_ctl_stack)
 {
int retval;
struct mqd_manager *mqd_mgr;
@@ -1393,9 +1394,11 @@ static int create_queue_cpsch(struct 
device_queue_manager *dqm, struct queue *q,
 * updates the is_evicted flag but is a no-op otherwise.
 */
q->properties.is_evicted = !!qpd->evicted;
+
	if (qd)
		mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr,
-				     &q->properties, restore_mqd);
+				     &q->properties, restore_mqd, restore_ctl_stack,
+				     qd->ctl_stack_size);
	else
		mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
				  &q->gart_mqd_addr, &q->properties);
@@ -1788,7 +1791,8 @@ static int get_wave_state(struct device_queue_manager 
*dqm,
 
 static void get_queue_checkpoint_info(struct device_queue_manager *dqm,
const struct queue *q,
-  

[Patch v4 20/24] drm/amdkfd: use user_gpu_id for svm ranges

2021-12-22 Thread Rajneesh Bhardwaj
Currently the SVM ranges use actual_gpu_id, but with Checkpoint Restore
support it is possible that the SVM ranges will be resumed on another
node where the actual_gpu_id may not be the same as the original
(user_gpu_id) gpu id. So modify the svm code to use user_gpu_id.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 67e2432098d1..0769dc655e15 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1813,7 +1813,7 @@ int kfd_process_gpuidx_from_gpuid(struct kfd_process *p, 
uint32_t gpu_id)
int i;
 
for (i = 0; i < p->n_pdds; i++)
-   if (p->pdds[i] && gpu_id == p->pdds[i]->dev->id)
+   if (p->pdds[i] && gpu_id == p->pdds[i]->user_gpu_id)
return i;
return -EINVAL;
 }
@@ -1826,7 +1826,7 @@ kfd_process_gpuid_from_adev(struct kfd_process *p, struct 
amdgpu_device *adev,
 
for (i = 0; i < p->n_pdds; i++)
if (p->pdds[i] && p->pdds[i]->dev->adev == adev) {
-   *gpuid = p->pdds[i]->dev->id;
+   *gpuid = p->pdds[i]->user_gpu_id;
*gpuidx = i;
return 0;
}
-- 
2.17.1



[Patch v4 23/24] drm/amdkfd: CRIU prepare for svm resume

2021-12-22 Thread Rajneesh Bhardwaj
During the CRIU restore phase, the VMAs for the virtual address ranges
are not at their final locations yet, so at this stage only cache the
data required to successfully resume the svm ranges during the imminent
CRIU resume phase.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  5 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 99 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 12 +++
 4 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 916b8d000317..f7aa15b18f95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2638,8 +2638,8 @@ static int criu_restore_objects(struct file *filep,
goto exit;
break;
case KFD_CRIU_OBJECT_TYPE_SVM_RANGE:
-   /* TODO: Implement SVM range */
-			*priv_offset += sizeof(struct kfd_criu_svm_range_priv_data);
+			ret = kfd_criu_restore_svm(p, (uint8_t __user *)args->priv_data,
+						   priv_offset, max_priv_data_size);
if (ret)
goto exit;
break;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 87eb6739a78e..92191c541c29 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -790,6 +790,7 @@ struct svm_range_list {
struct list_headlist;
struct work_struct  deferred_list_work;
struct list_headdeferred_range_list;
+   struct list_headcriu_svm_metadata_list;
spinlock_t  deferred_list_lock;
atomic_tevicted_ranges;
booldrain_pagefaults;
@@ -1148,6 +1149,10 @@ int kfd_criu_restore_event(struct file *devkfd,
   uint8_t __user *user_priv_data,
   uint64_t *priv_data_offset,
   uint64_t max_priv_data_size);
+int kfd_criu_restore_svm(struct kfd_process *p,
+uint8_t __user *user_priv_data,
+uint64_t *priv_data_offset,
+uint64_t max_priv_data_size);
 /* CRIU - End */
 
 /* Queue Context Management */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 6d59f1bedcf2..e9f6c63c2a26 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -45,6 +45,14 @@
  */
 #define AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING   2000
 
+struct criu_svm_metadata {
+   struct list_head list;
+   __u64 start_addr;
+   __u64 size;
+   /* Variable length array of attributes */
+   struct kfd_ioctl_svm_attribute attrs[0];
+};
+
 static void svm_range_evict_svm_bo_worker(struct work_struct *work);
 static bool
 svm_range_cpu_invalidate_pagetables(struct mmu_interval_notifier *mni,
@@ -2753,6 +2761,7 @@ int svm_range_list_init(struct kfd_process *p)
INIT_DELAYED_WORK(&svms->restore_work, svm_range_restore_work);
INIT_WORK(&svms->deferred_list_work, svm_range_deferred_list_work);
INIT_LIST_HEAD(&svms->deferred_range_list);
+   INIT_LIST_HEAD(&svms->criu_svm_metadata_list);
spin_lock_init(&svms->deferred_list_lock);
 
for (i = 0; i < p->n_pdds; i++)
@@ -3418,6 +3427,96 @@ svm_range_get_attr(struct kfd_process *p, struct 
mm_struct *mm,
return 0;
 }
 
+int svm_criu_prepare_for_resume(struct kfd_process *p,
+   struct kfd_criu_svm_range_priv_data *svm_priv)
+{
+   int nattr_common = 4, nattr_accessibility = 1;
+   struct criu_svm_metadata *criu_svm_md = NULL;
+   uint64_t svm_attrs_size, svm_object_md_size;
+   struct svm_range_list *svms = &p->svms;
+   int num_devices = p->n_pdds;
+   int i, ret = 0;
+
+   svm_attrs_size = sizeof(struct kfd_ioctl_svm_attribute) *
+   (nattr_common + nattr_accessibility * num_devices);
+   svm_object_md_size = sizeof(struct criu_svm_metadata) + svm_attrs_size;
+
+   criu_svm_md = kzalloc(svm_object_md_size, GFP_KERNEL);
+   if (!criu_svm_md) {
+   pr_err("failed to allocate memory to store svm metadata\n");
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   criu_svm_md->start_addr = svm_priv->start_addr;
+   criu_svm_md->size = svm_priv->size;
+   for (i = 0; i < nattr_common + nattr_accessibility * num_devices; i++) {
+   criu_svm_md->attrs[i].type = svm_priv->attrs[i].type;
+   criu_svm_md->attrs[i].value = svm_priv->attrs[i].value;
+   }
+
+   list_add_tail

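The caching step above allocates one block holding both the fixed metadata header and a variable-length attribute list. A minimal userspace sketch of that sizing and fill pattern, with names simplified from the patch (the kernel struct uses the older `attrs[0]` idiom for the flexible array):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct attr { uint32_t type; uint64_t value; };

/* Simplified criu_svm_metadata: fixed header plus a flexible attribute
 * array, sized and filled in one allocation as in
 * svm_criu_prepare_for_resume(). */
struct svm_meta {
    uint64_t start_addr;
    uint64_t size;
    struct attr attrs[]; /* flexible array member */
};

static struct svm_meta *meta_cache(uint64_t start, uint64_t size,
                                   const struct attr *src, int num_attrs)
{
    /* one allocation covers header + num_attrs attributes */
    struct svm_meta *m = calloc(1, sizeof(*m) + num_attrs * sizeof(*src));
    if (!m)
        return NULL;
    m->start_addr = start;
    m->size = size;
    /* copy attribute by attribute, as the restore-prepare loop does */
    for (int i = 0; i < num_attrs; i++)
        m->attrs[i] = src[i];
    return m;
}
```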
[Patch v4 22/24] drm/amdkfd: CRIU Save Shared Virtual Memory ranges

2021-12-22 Thread Rajneesh Bhardwaj
During the checkpoint stage, save the shared virtual memory ranges and
attributes for the target process. A process may contain a number of SVM
ranges, and each range might carry a number of attributes. While not every
attribute may be applicable to a given prange, during checkpoint we store
values for the maximum possible set of attribute types.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 95 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 10 +++
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1c25d5e9067c..916b8d000317 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2186,7 +2186,9 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto close_bo_fds;
 
-   /* TODO: Dump SVM-Ranges */
+   ret = kfd_criu_checkpoint_svm(p, (uint8_t __user *)args->priv_data, &priv_offset);
+   if (ret)
+   goto close_bo_fds;
}
 
 close_bo_fds:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 49e05fb5c898..6d59f1bedcf2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3478,6 +3478,101 @@ int svm_range_get_info(struct kfd_process *p, uint32_t 
*num_svm_ranges,
return 0;
 }
 
+int kfd_criu_checkpoint_svm(struct kfd_process *p,
+   uint8_t __user *user_priv_data,
+   uint64_t *priv_data_offset)
+{
+   struct kfd_criu_svm_range_priv_data *svm_priv = NULL;
+   struct kfd_ioctl_svm_attribute *query_attr = NULL;
+   uint64_t svm_priv_data_size, query_attr_size = 0;
+   int index, nattr_common = 4, ret = 0;
+   struct svm_range_list *svms;
+   int num_devices = p->n_pdds;
+   struct svm_range *prange;
+   struct mm_struct *mm;
+
+   svms = &p->svms;
+   if (!svms)
+   return -EINVAL;
+
+   mm = get_task_mm(p->lead_thread);
+   if (!mm) {
+   pr_err("failed to get mm for the target process\n");
+   return -ESRCH;
+   }
+
+   query_attr_size = sizeof(struct kfd_ioctl_svm_attribute) *
+   (nattr_common + num_devices);
+
+   query_attr = kzalloc(query_attr_size, GFP_KERNEL);
+   if (!query_attr) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   query_attr[0].type = KFD_IOCTL_SVM_ATTR_PREFERRED_LOC;
+   query_attr[1].type = KFD_IOCTL_SVM_ATTR_PREFETCH_LOC;
+   query_attr[2].type = KFD_IOCTL_SVM_ATTR_SET_FLAGS;
+   query_attr[3].type = KFD_IOCTL_SVM_ATTR_GRANULARITY;
+
+   for (index = 0; index < num_devices; index++) {
+   struct kfd_process_device *pdd = p->pdds[index];
+
+   query_attr[index + nattr_common].type =
+   KFD_IOCTL_SVM_ATTR_ACCESS;
+   query_attr[index + nattr_common].value = pdd->user_gpu_id;
+   }
+
+   svm_priv_data_size = sizeof(*svm_priv) + query_attr_size;
+
+   svm_priv = kzalloc(svm_priv_data_size, GFP_KERNEL);
+   if (!svm_priv) {
+   ret = -ENOMEM;
+   goto exit_query;
+   }
+
+   index = 0;
+   list_for_each_entry(prange, &svms->list, list) {
+
+   svm_priv->object_type = KFD_CRIU_OBJECT_TYPE_SVM_RANGE;
+   svm_priv->start_addr = prange->start;
+   svm_priv->size = prange->npages;
+   memcpy(&svm_priv->attrs, query_attr, query_attr_size);
+   pr_debug("CRIU: prange: 0x%p start: 0x%lx\t npages: 0x%llx end: 0x%llx\t size: 0x%llx\n",
+prange, prange->start, prange->npages,
+prange->start + prange->npages - 1,
+prange->npages * PAGE_SIZE);
+
+   ret = svm_range_get_attr(p, mm, svm_priv->start_addr,
+svm_priv->size,
+(nattr_common + num_devices),
+svm_priv->attrs);
+   if (ret) {
+   pr_err("CRIU: failed to obtain range attributes\n");
+   goto exit_priv;
+   }
+
+   ret = copy_to_user(user_priv_data + *priv_data_offset,
+  svm_priv, svm_priv_data_size);
+   if (ret) {
+   pr_err("Failed to copy svm priv to user\n");
+   goto exit_priv;
+   }
+
+   *priv_data_offset += svm_priv_data_size;
+
+   }
+
+
+exit_priv:
+   kfree(svm_priv);
+exit_query:
+   kfree(query_attr);
+exit

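The checkpoint loop above serializes one variable-length record per range into the user buffer, advancing `*priv_data_offset` after each `copy_to_user()`. A minimal userspace sketch of that append-at-offset pattern (struct names simplified from `kfd_criu_svm_range_priv_data`):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct attr { uint32_t type; uint64_t value; };

/* One serialized SVM-range record: fixed header followed by the queried
 * attributes, mirroring kfd_criu_svm_range_priv_data. */
struct range_rec {
    uint64_t start;
    uint64_t npages;
    struct attr attrs[];
};

/* Append one record at *offset and advance it, as the checkpoint loop does
 * with copy_to_user() and *priv_data_offset. */
static void emit_record(uint8_t *buf, uint64_t *offset, uint64_t start,
                        uint64_t npages, const struct attr *attrs, int nattrs)
{
    uint64_t rec_size = sizeof(struct range_rec) + nattrs * sizeof(*attrs);
    struct range_rec rec = { .start = start, .npages = npages };

    memcpy(buf + *offset, &rec, sizeof(rec));
    memcpy(buf + *offset + sizeof(rec), attrs, nattrs * sizeof(*attrs));
    *offset += rec_size;
}
```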
[Patch v4 11/24] drm/amdkfd: CRIU restore sdma id for queues

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When re-creating queues during CRIU restore, restore the queue with the
same sdma id value used during CRIU dump.

Signed-off-by: David Yat Sin 

---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  4 +-
 3 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 62fe28244a80..7e49f70b81b9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -58,7 +58,7 @@ static inline void deallocate_hqd(struct device_queue_manager 
*dqm,
struct queue *q);
 static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q);
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   struct queue *q);
+   struct queue *q, const uint32_t 
*restore_sdma_id);
 static void kfd_process_hw_exception(struct work_struct *work);
 
 static inline
@@ -308,7 +308,8 @@ static void deallocate_vmid(struct device_queue_manager 
*dqm,
 
 static int create_queue_nocpsch(struct device_queue_manager *dqm,
struct queue *q,
-   struct qcm_process_device *qpd)
+   struct qcm_process_device *qpd,
+   const struct kfd_criu_queue_priv_data *qd)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -348,7 +349,7 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
q->pipe, q->queue);
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
-   retval = allocate_sdma_queue(dqm, q);
+   retval = allocate_sdma_queue(dqm, q, qd ? &qd->sdma_id : NULL);
if (retval)
goto deallocate_vmid;
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
@@ -1040,7 +1041,7 @@ static void pre_reset(struct device_queue_manager *dqm)
 }
 
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   struct queue *q)
+   struct queue *q, const uint32_t 
*restore_sdma_id)
 {
int bit;
 
@@ -1050,9 +1051,21 @@ static int allocate_sdma_queue(struct 
device_queue_manager *dqm,
return -ENOMEM;
}
 
-   bit = __ffs64(dqm->sdma_bitmap);
-   dqm->sdma_bitmap &= ~(1ULL << bit);
-   q->sdma_id = bit;
+   if (restore_sdma_id) {
+   /* Re-use existing sdma_id */
+   if (!(dqm->sdma_bitmap & (1ULL << *restore_sdma_id))) {
+   pr_err("SDMA queue already in use\n");
+   return -EBUSY;
+   }
+   dqm->sdma_bitmap &= ~(1ULL << *restore_sdma_id);
+   q->sdma_id = *restore_sdma_id;
+   } else {
+   /* Find first available sdma_id */
+   bit = __ffs64(dqm->sdma_bitmap);
+   dqm->sdma_bitmap &= ~(1ULL << bit);
+   q->sdma_id = bit;
+   }
+
q->properties.sdma_engine_id = q->sdma_id %
get_num_sdma_engines(dqm);
q->properties.sdma_queue_id = q->sdma_id /
@@ -1062,9 +1075,19 @@ static int allocate_sdma_queue(struct 
device_queue_manager *dqm,
pr_err("No more XGMI SDMA queue to allocate\n");
return -ENOMEM;
}
-   bit = __ffs64(dqm->xgmi_sdma_bitmap);
-   dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
-   q->sdma_id = bit;
+   if (restore_sdma_id) {
+   /* Re-use existing sdma_id */
+   if (!(dqm->xgmi_sdma_bitmap & (1ULL << 
*restore_sdma_id))) {
+   pr_err("SDMA queue already in use\n");
+   return -EBUSY;
+   }
+   dqm->xgmi_sdma_bitmap &= ~(1ULL << *restore_sdma_id);
+   q->sdma_id = *restore_sdma_id;
+   } else {
+   bit = __ffs64(dqm->xgmi_sdma_bitmap);
+   dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
+   q->sdma_id = bit;
+   }
/* sdma_engine_id is sdma id including
 * both PCIe-optimized SDMAs and XGMI-
 * optimized SDMAs. The calculation below
@@ -1293,7 +1316,8 @@ static void destroy_kernel_queue_cpsch(struct 
device_queue_manager *dqm,
 }
 
 static int create_queue_cpsch(struct 

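The restore-or-allocate logic above can be condensed into one helper: a set bit in the bitmap means a free queue, and during CRIU restore the exact saved id must still be free. A minimal userspace sketch (error codes simplified; `__builtin_ctzll` plays the role of `__ffs64`):

```c
#include <assert.h>
#include <stdint.h>

/* Model of allocate_sdma_queue(): bits set in the bitmap are free queues.
 * With a restore id (CRIU restore), claim exactly that id or fail;
 * otherwise take the first free bit, like __ffs64(). */
static int sdma_alloc(uint64_t *bitmap, const uint32_t *restore_id,
                      uint32_t *out_id)
{
    if (!*bitmap)
        return -1; /* stands in for -ENOMEM: no free queues */
    if (restore_id) {
        if (!(*bitmap & (1ULL << *restore_id)))
            return -2; /* stands in for -EBUSY: id already in use */
        *out_id = *restore_id;
    } else {
        *out_id = (uint32_t)__builtin_ctzll(*bitmap); /* first set bit */
    }
    *bitmap &= ~(1ULL << *out_id);
    return 0;
}
```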
[Patch v4 18/24] drm/amdkfd: CRIU checkpoint and restore xnack mode

2021-12-22 Thread Rajneesh Bhardwaj
Recoverable device page faults are represented by the XNACK mode setting
inside a KFD process. For checkpoint/restore, we don't consider negative
values, which are typically used only to query the current XNACK mode
without modifying it.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 15 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  1 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 178b0ccfb286..446eb9310915 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1845,6 +1845,11 @@ static int criu_checkpoint_process(struct kfd_process *p,
memset(&process_priv, 0, sizeof(process_priv));
 
process_priv.version = KFD_CRIU_PRIV_VERSION;
+   /* For CR, we don't consider negative xnack mode which is used for
+* querying without changing it, here 0 simply means disabled and 1
+* means enabled so retry for finding a valid PTE.
+*/
+   process_priv.xnack_mode = p->xnack_enabled ? 1 : 0;
 
ret = copy_to_user(user_priv_data + *priv_offset,
&process_priv, sizeof(process_priv));
@@ -2231,6 +2236,16 @@ static int criu_restore_process(struct kfd_process *p,
return -EINVAL;
}
 
+   pr_debug("Setting XNACK mode\n");
+   if (process_priv.xnack_mode && !kfd_process_xnack_mode(p, true)) {
+   pr_err("xnack mode cannot be set\n");
+   ret = -EPERM;
+   goto exit;
+   } else {
+   pr_debug("set xnack mode: %d\n", process_priv.xnack_mode);
+   p->xnack_enabled = process_priv.xnack_mode;
+   }
+
 exit:
return ret;
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 855c162b85ea..d72dda84c18c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1057,6 +1057,7 @@ void kfd_process_set_trap_handler(struct 
qcm_process_device *qpd,
 
 struct kfd_criu_process_priv_data {
uint32_t version;
+   uint32_t xnack_mode;
 };
 
 struct kfd_criu_device_priv_data {
-- 
2.17.1



[Patch v4 19/24] drm/amdkfd: CRIU allow external mm for svm ranges

2021-12-22 Thread Rajneesh Bhardwaj
Both the svm_range_get_attr and svm_range_set_attr helpers take the mm
struct from current, but during a checkpoint or restore operation
current->mm yields the mm of the CRIU master process. Modify these helpers
to accept the task mm of the target KFD process to support checkpoint and
restore.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 88360f23eb61..7c92116153fe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3134,11 +3134,11 @@ static void svm_range_evict_svm_bo_worker(struct 
work_struct *work)
 }
 
 static int
-svm_range_set_attr(struct kfd_process *p, uint64_t start, uint64_t size,
-  uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
+  uint64_t start, uint64_t size, uint32_t nattr,
+  struct kfd_ioctl_svm_attribute *attrs)
 {
struct amdkfd_process_info *process_info = p->kgd_process_info;
-   struct mm_struct *mm = current->mm;
struct list_head update_list;
struct list_head insert_list;
struct list_head remove_list;
@@ -3242,8 +3242,9 @@ svm_range_set_attr(struct kfd_process *p, uint64_t start, 
uint64_t size,
 }
 
 static int
-svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
-  uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm,
+  uint64_t start, uint64_t size, uint32_t nattr,
+  struct kfd_ioctl_svm_attribute *attrs)
 {
DECLARE_BITMAP(bitmap_access, MAX_GPU_INSTANCE);
DECLARE_BITMAP(bitmap_aip, MAX_GPU_INSTANCE);
@@ -3253,7 +3254,6 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, 
uint64_t size,
bool get_accessible = false;
bool get_flags = false;
uint64_t last = start + size - 1UL;
-   struct mm_struct *mm = current->mm;
uint8_t granularity = 0xff;
struct interval_tree_node *node;
struct svm_range_list *svms;
@@ -3422,6 +3422,7 @@ int
 svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op op, uint64_t start,
  uint64_t size, uint32_t nattrs, struct kfd_ioctl_svm_attribute *attrs)
 {
+   struct mm_struct *mm = current->mm;
int r;
 
start >>= PAGE_SHIFT;
@@ -3429,10 +3430,10 @@ svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op 
op, uint64_t start,
 
switch (op) {
case KFD_IOCTL_SVM_OP_SET_ATTR:
-   r = svm_range_set_attr(p, start, size, nattrs, attrs);
+   r = svm_range_set_attr(p, mm, start, size, nattrs, attrs);
break;
case KFD_IOCTL_SVM_OP_GET_ATTR:
-   r = svm_range_get_attr(p, start, size, nattrs, attrs);
+   r = svm_range_get_attr(p, mm, start, size, nattrs, attrs);
break;
default:
r = -EINVAL;
-- 
2.17.1



[Patch v4 12/24] drm/amdkfd: CRIU restore queue doorbell id

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When re-creating queues during CRIU restore, restore the queue with the
same doorbell id value used during CRIU dump.

Signed-off-by: David Yat Sin 

---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 60 +--
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 7e49f70b81b9..a0f5b8533a03 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -153,7 +153,13 @@ static void decrement_queue_count(struct 
device_queue_manager *dqm,
dqm->active_cp_queue_count--;
 }
 
-static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
+/*
+ * Allocate a doorbell ID to this queue.
+ * If doorbell_id is passed in, make sure requested ID is valid then allocate 
it.
+ */
+static int allocate_doorbell(struct qcm_process_device *qpd,
+struct queue *q,
+uint32_t const *restore_id)
 {
struct kfd_dev *dev = qpd->dqm->dev;
 
@@ -161,6 +167,10 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd, struct queue *q)
/* On pre-SOC15 chips we need to use the queue ID to
 * preserve the user mode ABI.
 */
+
+   if (restore_id && *restore_id != q->properties.queue_id)
+   return -EINVAL;
+
q->doorbell_id = q->properties.queue_id;
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
@@ -169,25 +179,37 @@ static int allocate_doorbell(struct qcm_process_device 
*qpd, struct queue *q)
 * The doobell index distance between RLC (2*i) and (2*i+1)
 * for a SDMA engine is 512.
 */
-   uint32_t *idx_offset =
-   dev->shared_resources.sdma_doorbell_idx;
 
-   q->doorbell_id = idx_offset[q->properties.sdma_engine_id]
-   + (q->properties.sdma_queue_id & 1)
-   * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
-   + (q->properties.sdma_queue_id >> 1);
+   uint32_t *idx_offset = dev->shared_resources.sdma_doorbell_idx;
+   uint32_t valid_id = idx_offset[q->properties.sdma_engine_id]
+   + (q->properties.sdma_queue_id & 1)
+   * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
+   + (q->properties.sdma_queue_id >> 1);
+
+   if (restore_id && *restore_id != valid_id)
+   return -EINVAL;
+   q->doorbell_id = valid_id;
} else {
-   /* For CP queues on SOC15 reserve a free doorbell ID */
-   unsigned int found;
-
-   found = find_first_zero_bit(qpd->doorbell_bitmap,
-   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
-   if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
-   pr_debug("No doorbells available");
-   return -EBUSY;
+   /* For CP queues on SOC15 */
+   if (restore_id) {
+   /* make sure that ID is free  */
+   if (__test_and_set_bit(*restore_id, qpd->doorbell_bitmap))
+   return -EINVAL;
+
+   q->doorbell_id = *restore_id;
+   } else {
+   /* or reserve a free doorbell ID */
+   unsigned int found;
+
+   found = find_first_zero_bit(qpd->doorbell_bitmap,
+   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
+   if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
+   pr_debug("No doorbells available");
+   return -EBUSY;
+   }
+   set_bit(found, qpd->doorbell_bitmap);
+   q->doorbell_id = found;
}
-   set_bit(found, qpd->doorbell_bitmap);
-   q->doorbell_id = found;
}
 
q->properties.doorbell_off =
@@ -355,7 +377,7 @@ static int create_queue_nocpsch(struct device_queue_manager 
*dqm,
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
}
 
-   retval = allocate_doorbell(qpd, q);
+   retval = allocate_doorbell(qpd, q, qd ? &qd->doorbell_id : NULL);
if (retval)
goto out_deallocate_hqd;
 
@@ -1338,7 +1360,7 @@ static int create_queue_cpsch(struct device_queue_manager 
*dqm, struct queue *q,
goto out;
}
 
-   retval = allocate_doorbell(qpd, q);
+   retval = allocate_doorbell(qpd, q, qd ? &qd->doorbell_id : 

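The SDMA branch above recomputes the doorbell index rather than allocating one, so during restore the saved id can only be accepted if it equals the recomputed value. The index arithmetic from the comment ("the doorbell index distance between RLC (2*i) and (2*i+1) for a SDMA engine is 512") can be sketched and checked in isolation:

```c
#include <assert.h>
#include <stdint.h>

#define KFD_QUEUE_DOORBELL_MIRROR_OFFSET 512

/* The SDMA doorbell computation from allocate_doorbell(): doorbells 2*i
 * and 2*i+1 of one engine sit 512 indices apart, so SDMA queue q of an
 * engine maps to base + (q & 1) * 512 + (q >> 1). During CRIU restore,
 * the saved id must equal this recomputed value or the restore fails. */
static uint32_t sdma_doorbell_id(const uint32_t *idx_offset,
                                 uint32_t engine_id, uint32_t queue_id)
{
    return idx_offset[engine_id]
         + (queue_id & 1) * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
         + (queue_id >> 1);
}
```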
[Patch v4 21/24] drm/amdkfd: CRIU Discover svm ranges

2021-12-22 Thread Rajneesh Bhardwaj
A KFD process may contain a number of virtual address ranges for shared
virtual memory management, and each such range can have many SVM attributes
spanning various nodes within the process boundary. This change reports the
total number of such SVM ranges and their total private data size by
extending the PROCESS_INFO op of the CRIU IOCTL to discover the SVM ranges
in the target process; future patches bring in the required support for
checkpoint and restore of SVM ranges.


Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 60 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 11 +
 4 files changed, 82 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 446eb9310915..1c25d5e9067c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2089,10 +2089,9 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
uint32_t *num_objects,
uint64_t *objs_priv_size)
 {
-   int ret;
-   uint64_t priv_size;
+   uint64_t queues_priv_data_size, svm_priv_data_size, priv_size;
uint32_t num_queues, num_events, num_svm_ranges;
-   uint64_t queues_priv_data_size;
+   int ret;
 
*num_devices = p->n_pdds;
*num_bos = get_process_num_bos(p);
@@ -2102,7 +2101,10 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
return ret;
 
num_events = kfd_get_num_events(p);
-   num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */
+
+   ret = svm_range_get_info(p, &num_svm_ranges, &svm_priv_data_size);
+   if (ret)
+   return ret;
 
*num_objects = num_queues + num_events + num_svm_ranges;
 
@@ -2112,7 +2114,7 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
priv_size += queues_priv_data_size;
priv_size += num_events * sizeof(struct 
kfd_criu_event_priv_data);
-   /* TODO: Add SVM ranges priv size */
+   priv_size += svm_priv_data_size;
*objs_priv_size = priv_size;
}
return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index d72dda84c18c..87eb6739a78e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1082,7 +1082,10 @@ enum kfd_criu_object_type {
 
 struct kfd_criu_svm_range_priv_data {
uint32_t object_type;
-   uint64_t reserved;
+   uint64_t start_addr;
+   uint64_t size;
+   /* Variable length array of attributes */
+   struct kfd_ioctl_svm_attribute attrs[0];
 };
 
 struct kfd_criu_queue_priv_data {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 7c92116153fe..49e05fb5c898 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3418,6 +3418,66 @@ svm_range_get_attr(struct kfd_process *p, struct 
mm_struct *mm,
return 0;
 }
 
+int svm_range_get_info(struct kfd_process *p, uint32_t *num_svm_ranges,
+  uint64_t *svm_priv_data_size)
+{
+   uint64_t total_size, accessibility_size, common_attr_size;
+   int nattr_common = 4, naatr_accessibility = 1;
+   int num_devices = p->n_pdds;
+   struct svm_range_list *svms;
+   struct svm_range *prange;
+   uint32_t count = 0;
+
+   *svm_priv_data_size = 0;
+
+   svms = &p->svms;
+   if (!svms)
+   return -EINVAL;
+
+   mutex_lock(&svms->lock);
+   list_for_each_entry(prange, &svms->list, list) {
+   pr_debug("prange: 0x%p start: 0x%lx\t npages: 0x%llx\t end: 0x%llx\n",
+prange, prange->start, prange->npages,
+prange->start + prange->npages - 1);
+   count++;
+   }
+   mutex_unlock(&svms->lock);
+
+   *num_svm_ranges = count;
+   /* Only the accessbility attributes need to be queried for all the gpus
+* individually, remaining ones are spanned across the entire process
+* regardless of the various gpu nodes. Of the remaining attributes,
+* KFD_IOCTL_SVM_ATTR_CLR_FLAGS need not be saved.
+*
+* KFD_IOCTL_SVM_ATTR_PREFERRED_LOC
+* KFD_IOCTL_SVM_ATTR_PREFETCH_LOC
+* KFD_IOCTL_SVM_ATTR_SET_FLAGS
+* KFD_IOCTL_SVM_ATTR_GRANULARITY
+*
+* ** ACCESSBILITY ATTRIBUTES **
+* (Considered as one, type is altered during query, value is gpuid)
+* KFD_IOCTL_SVM_ATTR_ACCESS
+* KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE
+* KFD_IOCTL_SVM_ATTR_NO_ACCESS

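The attribute accounting described in the comment above determines the private-data budget: four process-wide attributes per range plus one accessibility attribute per GPU, after a fixed per-range header. A minimal sketch of the sizing arithmetic (parameter names are illustrative, not from the kernel):

```c
#include <assert.h>
#include <stdint.h>

/* Private-data sizing from svm_range_get_info(): each range stores 4
 * process-wide attributes plus one accessibility attribute per GPU,
 * after a fixed per-range header. */
static uint64_t svm_priv_size(uint32_t num_ranges, uint32_t num_devices,
                              uint64_t header_size, uint64_t attr_size)
{
    uint64_t nattr_common = 4, nattr_accessibility = 1;
    uint64_t nattrs = nattr_common + nattr_accessibility * num_devices;

    return (uint64_t)num_ranges * (header_size + nattrs * attr_size);
}
```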
[Patch v4 08/24] drm/amdkfd: CRIU Implement KFD unpause operation

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Introduce the UNPAUSE op. After the CRIU amdgpu plugin performs a
PROCESS_INFO op, the queues will stay in an evicted state. Once the plugin
is done draining BO contents, it is safe to perform an UNPAUSE op so the
queues can resume.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 37 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  3 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  1 +
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 87b9f019e96e..db2bb302a8d4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2040,6 +2040,14 @@ static int criu_checkpoint(struct file *filep,
goto exit_unlock;
}
 
+   /* Confirm all process queues are evicted */
+   if (!p->queues_paused) {
+   pr_err("Cannot dump process when queues are not in evicted 
state\n");
+   /* CRIU plugin did not call op PROCESS_INFO before 
checkpointing */
+   ret = -EINVAL;
+   goto exit_unlock;
+   }
+
criu_get_process_object_info(p, &num_bos, &priv_size);
 
if (num_bos != args->num_bos ||
@@ -2382,7 +2390,24 @@ static int criu_unpause(struct file *filep,
struct kfd_process *p,
struct kfd_ioctl_criu_args *args)
 {
-   return 0;
+   int ret;
+
+   mutex_lock(&p->mutex);
+
+   if (!p->queues_paused) {
+   mutex_unlock(&p->mutex);
+   return -EINVAL;
+   }
+
+   ret = kfd_process_restore_queues(p);
+   if (ret)
+   pr_err("Failed to unpause queues ret:%d\n", ret);
+   else
+   p->queues_paused = false;
+
+   mutex_unlock(&p->mutex);
+
+   return ret;
 }
 
 static int criu_resume(struct file *filep,
@@ -2434,6 +2459,12 @@ static int criu_process_info(struct file *filep,
goto err_unlock;
}
 
+   ret = kfd_process_evict_queues(p);
+   if (ret)
+   goto err_unlock;
+
+   p->queues_paused = true;
+
args->pid = task_pid_nr_ns(p->lead_thread,
task_active_pid_ns(p->lead_thread));
 
@@ -2441,6 +2472,10 @@ static int criu_process_info(struct file *filep,
 
dev_dbg(kfd_device, "Num of bos:%u\n", args->num_bos);
 err_unlock:
+   if (ret) {
+   kfd_process_restore_queues(p);
+   p->queues_paused = false;
+   }
mutex_unlock(&p->mutex);
return ret;
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index cd72541a8f4f..f3a9f3de34e4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -875,6 +875,9 @@ struct kfd_process {
struct svm_range_list svms;
 
bool xnack_enabled;
+
+   /* Queues are in paused state because we are in the process of doing a CRIU checkpoint */
+   bool queues_paused;
 };
 
 #define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d2fcdc5e581f..e20fbb7ba9bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1364,6 +1364,7 @@ static struct kfd_process *create_process(const struct 
task_struct *thread)
process->mm = thread->mm;
process->lead_thread = thread->group_leader;
process->n_pdds = 0;
+   process->queues_paused = false;
INIT_DELAYED_WORK(&process->eviction_work, evict_process_worker);
INIT_DELAYED_WORK(&process->restore_work, restore_process_worker);
process->last_restore_timestamp = get_jiffies_64();
-- 
2.17.1

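The `queues_paused` flag enforces a strict ordering of the CRIU ops: PROCESS_INFO evicts the queues, CHECKPOINT refuses to run unless they are evicted, and UNPAUSE resumes them exactly once. A minimal userspace model of that state machine (`-1` stands in for `-EINVAL`):

```c
#include <assert.h>
#include <stdbool.h>

struct proc { bool queues_paused; };

static int op_process_info(struct proc *p)
{
    p->queues_paused = true; /* kfd_process_evict_queues() succeeded */
    return 0;
}

static int op_checkpoint(const struct proc *p)
{
    return p->queues_paused ? 0 : -1; /* -EINVAL: queues not evicted yet */
}

static int op_unpause(struct proc *p)
{
    if (!p->queues_paused)
        return -1; /* -EINVAL: nothing to resume */
    p->queues_paused = false; /* kfd_process_restore_queues() */
    return 0;
}
```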


[Patch v4 17/24] drm/amdkfd: CRIU export BOs as prime dmabuf objects

2021-12-22 Thread Rajneesh Bhardwaj
KFD buffer objects do not have a GEM handle associated with them, so they
cannot directly be used with libdrm to initiate a system DMA (sDMA)
operation that speeds up checkpoint and restore. Export them as dmabuf
objects instead and use them with the libdrm helper (amdgpu_bo_import) to
further process the sDMA command submissions.

With sDMA, we see a huge improvement in checkpoint and restore operations
compared to generic PCI-based access via the host data path.

Suggested-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 71 +++-
 1 file changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 20652d488cde..178b0ccfb286 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
@@ -43,6 +44,7 @@
 #include "amdgpu_amdkfd.h"
 #include "kfd_smi_events.h"
 #include "amdgpu_object.h"
+#include "amdgpu_dma_buf.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -1932,6 +1934,33 @@ uint64_t get_process_num_bos(struct kfd_process *p)
return num_of_bos;
 }
 
+static int criu_get_prime_handle(struct drm_gem_object *gobj, int flags,
+ u32 *shared_fd)
+{
+   struct dma_buf *dmabuf;
+   int ret;
+
+   dmabuf = amdgpu_gem_prime_export(gobj, flags);
+   if (IS_ERR(dmabuf)) {
+   ret = PTR_ERR(dmabuf);
+   pr_err("dmabuf export failed for the BO\n");
+   return ret;
+   }
+
+   ret = dma_buf_fd(dmabuf, flags);
+   if (ret < 0) {
+   pr_err("dmabuf create fd failed, ret:%d\n", ret);
+   goto out_free_dmabuf;
+   }
+
+   *shared_fd = ret;
+   return 0;
+
+out_free_dmabuf:
+   dma_buf_put(dmabuf);
+   return ret;
+}
+
 static int criu_checkpoint_bos(struct kfd_process *p,
   uint32_t num_bos,
   uint8_t __user *user_bos,
@@ -1992,6 +2021,14 @@ static int criu_checkpoint_bos(struct kfd_process *p,
goto exit;
}
}
+   if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   ret = criu_get_prime_handle(&dumper_bo->tbo.base,
+   bo_bucket->alloc_flags &
+   KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? DRM_RDWR : 0,
+   &bo_bucket->dmabuf_fd);
+   if (ret)
+   goto exit;
+   }
if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL)
bo_bucket->offset = KFD_MMAP_TYPE_DOORBELL |
KFD_MMAP_GPU_ID(pdd->dev->id);
@@ -2031,6 +2068,10 @@ static int criu_checkpoint_bos(struct kfd_process *p,
*priv_offset += num_bos * sizeof(*bo_privs);
 
 exit:
+   while (ret && bo_index--) {
+   if (bo_buckets[bo_index].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   close_fd(bo_buckets[bo_index].dmabuf_fd);
+   }
 
kvfree(bo_buckets);
kvfree(bo_privs);
@@ -2131,16 +2172,28 @@ static int criu_checkpoint(struct file *filep,
ret = kfd_criu_checkpoint_queues(p, (uint8_t __user *)args->priv_data,
 &priv_offset);
if (ret)
-   goto exit_unlock;
+   goto close_bo_fds;
 
ret = kfd_criu_checkpoint_events(p, (uint8_t __user *)args->priv_data,
 &priv_offset);
if (ret)
-   goto exit_unlock;
+   goto close_bo_fds;
 
/* TODO: Dump SVM-Ranges */
}
 
+close_bo_fds:
+   if (ret) {
+   /* If IOCTL returns err, user assumes all FDs opened in 
criu_dump_bos are closed */
+   uint32_t i;
+   struct kfd_criu_bo_bucket *bo_buckets = (struct 
kfd_criu_bo_bucket *) args->bos;
+
+   for (i = 0; i < num_bos; i++) {
+   if (bo_buckets[i].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   close_fd(bo_buckets[i].dmabuf_fd);
+   }
+   }
+
 exit_unlock:
mutex_unlock(>mutex);
if (ret)
@@ -2335,6 +2388,7 @@ static int criu

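The error paths above share one invariant: if the IOCTL fails, every dmabuf fd already handed out must be closed, walking the already-exported BOs backwards (`while (ret && bo_index--)`). A minimal userspace sketch of that unwind pattern, with open fds simulated by a counter:

```c
#include <assert.h>

/* Error-path pattern from criu_checkpoint_bos(): if exporting BO i fails,
 * close the dmabuf fds already handed out for BOs 0..i-1, walking the
 * array backwards. Open fds are simulated by a counter. */
static int export_all(int num_bos, int fail_at, int *open_fds)
{
    int i, ret = 0;

    *open_fds = 0;
    for (i = 0; i < num_bos; i++) {
        if (i == fail_at) {          /* criu_get_prime_handle() failed */
            ret = -1;
            break;
        }
        (*open_fds)++;               /* one exported dmabuf fd */
    }
    while (ret && i--)
        (*open_fds)--;               /* close_fd() on already-exported BOs */
    return ret;
}
```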
[Patch v4 15/24] drm/amdkfd: CRIU checkpoint and restore events

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Add support to the existing CRIU ioctls to save and restore events during
CRIU checkpoint and restore.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  70 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 272 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  27 ++-
 3 files changed, 280 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 582b4a393f95..08467fa2f514 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1009,57 +1009,11 @@ static int kfd_ioctl_create_event(struct file *filp, 
struct kfd_process *p,
 * through the event_page_offset field.
 */
if (args->event_page_offset) {
-   struct kfd_dev *kfd;
-   struct kfd_process_device *pdd;
-   void *mem, *kern_addr;
-   uint64_t size;
-
-   kfd = kfd_device_by_id(GET_GPU_ID(args->event_page_offset));
-   if (!kfd) {
-   pr_err("Getting device by id failed in %s\n", __func__);
-   return -EINVAL;
-   }
-
mutex_lock(&p->mutex);
-
-   if (p->signal_page) {
-   pr_err("Event page is already set\n");
-   err = -EINVAL;
-   goto out_unlock;
-   }
-
-   pdd = kfd_bind_process_to_device(kfd, p);
-   if (IS_ERR(pdd)) {
-   err = PTR_ERR(pdd);
-   goto out_unlock;
-   }
-
-   mem = kfd_process_device_translate_handle(pdd,
-   GET_IDR_HANDLE(args->event_page_offset));
-   if (!mem) {
-   pr_err("Can't find BO, offset is 0x%llx\n",
-  args->event_page_offset);
-   err = -EINVAL;
-   goto out_unlock;
-   }
-
-   err = amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(kfd->adev,
-   mem, &kern_addr, &size);
-   if (err) {
-   pr_err("Failed to map event page to kernel\n");
-   goto out_unlock;
-   }
-
-   err = kfd_event_page_set(p, kern_addr, size);
-   if (err) {
-   pr_err("Failed to set event page\n");
-   amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(kfd->adev, 
mem);
-   goto out_unlock;
-   }
-
-   p->signal_handle = args->event_page_offset;
-
+   err = kfd_kmap_event_page(p, args->event_page_offset);
mutex_unlock(&p->mutex);
+   if (err)
+   return err;
}
 
err = kfd_event_create(filp, p, args->event_type,
@@ -1068,10 +1022,7 @@ static int kfd_ioctl_create_event(struct file *filp, 
struct kfd_process *p,
>event_page_offset,
>event_slot_index);
 
-   return err;
-
-out_unlock:
-   mutex_unlock(&p->mutex);
+   pr_debug("Created event (id:0x%08x) (%s)\n", args->event_id, __func__);
return err;
 }
 
@@ -2022,7 +1973,7 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
if (ret)
return ret;
 
-   num_events = 0; /* TODO: Implement Events */
+   num_events = kfd_get_num_events(p);
num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */
 
*num_objects = num_queues + num_events + num_svm_ranges;
@@ -2031,7 +1982,7 @@ static int criu_get_process_object_info(struct 
kfd_process *p,
priv_size = sizeof(struct kfd_criu_process_priv_data);
priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
priv_size += queues_priv_data_size;
-   /* TODO: Add Events priv size */
+   priv_size += num_events * sizeof(struct 
kfd_criu_event_priv_data);
/* TODO: Add SVM ranges priv size */
*objs_priv_size = priv_size;
}
@@ -2093,7 +2044,10 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto exit_unlock;
 
-   /* TODO: Dump Events */
+   ret = kfd_criu_checkpoint_events(p, (uint8_t __user 
*)args->priv_data,
+    &priv_offset);
+   if (ret)
+   goto exit_unlock;
 
/* TODO: Dump SVM-Ranges */
}
@@ -2406,8 +2360,8 @@ static int criu_restore_objects(struct file *filep,
goto exit;
break;
case KFD_CRIU_OBJECT_TYPE_EVENT:
-   /* TODO: Implement Events */
-   *priv_offset += sizeof(struct kfd_criu_event_priv_data);
+  

[Patch v4 06/24] drm/amdkfd: CRIU Implement KFD restore ioctl

2021-12-22 Thread Rajneesh Bhardwaj
This implements the KFD CRIU restore ioctl that lays the basic
foundation for the CRIU restore operation. It provides support to
create the buffer objects corresponding to non-paged system memory
mapped for GPU and/or CPU access, and lays the basic foundation for the
userptr buffer objects which will be added in a separate patch.
This ioctl creates various types of buffer objects such as VRAM,
MMIO, doorbell and GTT based on the data sent from the userspace plugin.
The data mostly contains the previously checkpointed KFD images from
some KFD process.

While restoring a CRIU process, attach the old IDR values to newly
created BOs. This also adds minimal GPU mapping support for a
single-GPU checkpoint-restore use case.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 298 ++-
 1 file changed, 297 insertions(+), 1 deletion(-)
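The restore path above repeatedly copies a fixed-size record out of the user-supplied private-data blob at `*priv_offset`, bounds-checks it, advances the offset, and validates the version. A minimal userspace sketch of that consume-and-advance pattern (struct and helper names are illustrative, not the kernel's; `memcpy` stands in for `copy_from_user`):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of kfd_criu_process_priv_data: only the version
 * field matters for this sketch. */
struct priv_header { uint32_t version; };

#define SKETCH_PRIV_VERSION 1

/* Consume one header from the private-data blob at *offset, bounds- and
 * version-checked the same way criu_restore_process() does. */
static int restore_process_header(const uint8_t *blob, uint64_t blob_size,
                                  uint64_t *offset)
{
    struct priv_header hdr;

    if (*offset + sizeof(hdr) > blob_size)
        return -1;              /* -EINVAL in the kernel */

    memcpy(&hdr, blob + *offset, sizeof(hdr));  /* copy_from_user() stand-in */
    *offset += sizeof(hdr);

    if (hdr.version != SKETCH_PRIV_VERSION)
        return -1;              /* checkpoint/restore version mismatch */
    return 0;
}
```

Advancing the offset before the version check, as the kernel code does, keeps the blob cursor consistent even when one record is rejected.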

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index cdbb92972338..c93f74ad073f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2069,11 +2069,307 @@ static int criu_checkpoint(struct file *filep,
return ret;
 }
 
+static int criu_restore_process(struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args,
+   uint64_t *priv_offset,
+   uint64_t max_priv_data_size)
+{
+   int ret = 0;
+   struct kfd_criu_process_priv_data process_priv;
+
+   if (*priv_offset + sizeof(process_priv) > max_priv_data_size)
+   return -EINVAL;
+
+   ret = copy_from_user(&process_priv,
+   (void __user *)(args->priv_data + *priv_offset),
+   sizeof(process_priv));
+   if (ret) {
+   pr_err("Failed to copy process private information from 
user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+   *priv_offset += sizeof(process_priv);
+
+   if (process_priv.version != KFD_CRIU_PRIV_VERSION) {
+   pr_err("Invalid CRIU API version (checkpointed:%d 
current:%d)\n",
+   process_priv.version, KFD_CRIU_PRIV_VERSION);
+   return -EINVAL;
+   }
+
+exit:
+   return ret;
+}
+
+static int criu_restore_bos(struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args,
+   uint64_t *priv_offset,
+   uint64_t max_priv_data_size)
+{
+   struct kfd_criu_bo_bucket *bo_buckets;
+   struct kfd_criu_bo_priv_data *bo_privs;
+   bool flush_tlbs = false;
+   int ret = 0, j = 0;
+   uint32_t i;
+
+   if (*priv_offset + (args->num_bos * sizeof(*bo_privs)) > 
max_priv_data_size)
+   return -EINVAL;
+
+   bo_buckets = kvmalloc_array(args->num_bos, sizeof(*bo_buckets), 
GFP_KERNEL);
+   if (!bo_buckets)
+   return -ENOMEM;
+
+   ret = copy_from_user(bo_buckets, (void __user *)args->bos,
+args->num_bos * sizeof(*bo_buckets));
+   if (ret) {
+   pr_err("Failed to copy BOs information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+
+   bo_privs = kvmalloc_array(args->num_bos, sizeof(*bo_privs), GFP_KERNEL);
+   if (!bo_privs) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   ret = copy_from_user(bo_privs, (void __user *)args->priv_data + 
*priv_offset,
+args->num_bos * sizeof(*bo_privs));
+   if (ret) {
+   pr_err("Failed to copy BOs information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+   *priv_offset += args->num_bos * sizeof(*bo_privs);
+
+   /* Create and map new BOs */
+   for (i = 0; i < args->num_bos; i++) {
+   struct kfd_criu_bo_bucket *bo_bucket;
+   struct kfd_criu_bo_priv_data *bo_priv;
+   struct kfd_dev *dev;
+   struct kfd_process_device *pdd;
+   void *mem;
+   u64 offset;
+   int idr_handle;
+
+   bo_bucket = &bo_buckets[i];
+   bo_priv = &bo_privs[i];
+
+   dev = kfd_device_by_id(bo_bucket->gpu_id);
+   if (!dev) {
+   ret = -EINVAL;
+   pr_err("Failed to get pdd\n");
+   goto exit;
+   }
+   pdd = kfd_get_process_device_data(dev, p);
+   if (!pdd) {
+   ret = -EINVAL;
+   pr_err("Failed to get pdd\n");
+   goto exit;
+   }
+
+   pr_debug("kfd restore ioctl - bo_bucket[%d]:\n", i);
+   pr_debug(&q

[Patch v4 16/24] drm/amdkfd: CRIU implement gpu_id remapping

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When doing a restore on a different node, the gpu_ids on the restore
node may be different. But the user space application will still use
the original gpu_ids in its ioctl calls. Add code to create a gpu_id
mapping so that KFD can determine the actual gpu_id during the user
ioctls.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 465 --
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  45 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  11 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  32 ++
 .../amd/amdkfd/kfd_process_queue_manager.c|  18 +-
 5 files changed, 412 insertions(+), 159 deletions(-)
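The remapping the patch describes boils down to resolving the application's checkpoint-time gpu_id to the per-process device data on the restore node, instead of looking the device up globally. A userspace sketch of that lookup (field and function names are illustrative; the kernel's `kfd_process_device_data_by_id()` scans `p->pdds[]`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process device entry: user_gpu_id is the id the
 * application saw at checkpoint time; actual_gpu_id is the id on the
 * restore node. */
struct pdd_sketch {
    uint32_t user_gpu_id;
    uint32_t actual_gpu_id;
};

/* Linear scan over the process's devices, resolving the id the user
 * passed in an ioctl to the device data it maps to on this node. */
static struct pdd_sketch *device_data_by_user_id(struct pdd_sketch *pdds,
                                                 size_t n, uint32_t gpu_id)
{
    for (size_t i = 0; i < n; i++)
        if (pdds[i].user_gpu_id == gpu_id)
            return &pdds[i];
    return NULL;   /* ioctl paths turn this into -EINVAL */
}
```

This is why the diffs below replace `kfd_device_by_id(args->gpu_id)` calls with a per-process lookup taken under the process mutex.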

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 08467fa2f514..20652d488cde 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -294,18 +294,20 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
return err;
 
pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev) {
-   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
-   return -EINVAL;
-   }
 
mutex_lock(&p->mutex);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+   err = -EINVAL;
+   goto err_unlock;
+   }
+   dev = pdd->dev;
 
pdd = kfd_bind_process_to_device(dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
-   goto err_bind_process;
+   goto err_unlock;
}
 
pr_debug("Creating queue for PASID 0x%x on gpu 0x%x\n",
@@ -315,7 +317,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id,
NULL, NULL, NULL,
&doorbell_offset_in_process);
if (err != 0)
-   goto err_create_queue;
+   goto err_unlock;
 
args->queue_id = queue_id;
 
@@ -344,8 +346,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
 
return 0;
 
-err_create_queue:
-err_bind_process:
+err_unlock:
mutex_unlock(&p->mutex);
return err;
 }
@@ -492,7 +493,6 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
struct kfd_process *p, void *data)
 {
struct kfd_ioctl_set_memory_policy_args *args = data;
-   struct kfd_dev *dev;
int err = 0;
struct kfd_process_device *pdd;
enum cache_policy default_policy, alternate_policy;
@@ -507,13 +507,15 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
return -EINVAL;
}
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
-
mutex_lock(&p->mutex);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+   err = -EINVAL;
+   goto out;
+   }
 
-   pdd = kfd_bind_process_to_device(dev, p);
+   pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
goto out;
@@ -526,7 +528,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
(args->alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
   ? cache_policy_coherent : cache_policy_noncoherent;
 
-   if (!dev->dqm->ops.set_cache_memory_policy(dev->dqm,
+   if (!pdd->dev->dqm->ops.set_cache_memory_policy(pdd->dev->dqm,
&pdd->qpd,
default_policy,
alternate_policy,
@@ -544,17 +546,18 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
struct kfd_process *p, void *data)
 {
struct kfd_ioctl_set_trap_handler_args *args = data;
-   struct kfd_dev *dev;
int err = 0;
struct kfd_process_device *pdd;
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
-
mutex_lock(&p->mutex);
 
-   pdd = kfd_bind_process_to_device(dev, p);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   err = -EINVAL;
+   goto out;
+   }
+
+   pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
goto out;
@@ -578,16 +581,20 @@ static int kfd_ioctl_dbg_register(struct file *filep,

[Patch v4 05/24] drm/amdkfd: CRIU Implement KFD checkpoint ioctl

2021-12-22 Thread Rajneesh Bhardwaj
This adds support to discover the buffer objects that belong to a
process being checkpointed. The data corresponding to these buffer
objects is returned to the user space plugin running under the CRIU
master context, which then stores this info to recreate these buffer
objects during a restore operation.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  |  20 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h  |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 172 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|   3 +-
 4 files changed, 195 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 56c5c4464829..4fd36bd9dcfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1173,6 +1173,26 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device 
*bdev,
return ttm_pool_free(>mman.bdev.pool, ttm);
 }
 
+/**
+ * amdgpu_ttm_tt_get_userptr - Return the userptr GTT ttm_tt for the current
+ * task
+ *
+ * @tbo: The ttm_buffer_object that contains the userptr
+ * @user_addr:  The returned value
+ */
+int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
+ uint64_t *user_addr)
+{
+   struct amdgpu_ttm_tt *gtt;
+
+   if (!tbo->ttm)
+   return -EINVAL;
+
+   gtt = (void *)tbo->ttm;
+   *user_addr = gtt->userptr;
+   return 0;
+}
+
 /**
  * amdgpu_ttm_tt_set_userptr - Initialize userptr GTT ttm_tt for the current
  * task
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 7346ecff4438..6e6d67ec43f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -177,6 +177,8 @@ static inline bool amdgpu_ttm_tt_get_user_pages_done(struct 
ttm_tt *ttm)
 #endif
 
 void amdgpu_ttm_tt_set_user_pages(struct ttm_tt *ttm, struct page **pages);
+int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
+ uint64_t *user_addr);
 int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo,
  uint64_t addr, uint32_t flags);
 bool amdgpu_ttm_tt_has_userptr(struct ttm_tt *ttm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 53d7a20e3c06..cdbb92972338 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -42,6 +42,7 @@
 #include "kfd_svm.h"
 #include "amdgpu_amdkfd.h"
 #include "kfd_smi_events.h"
+#include "amdgpu_object.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -1857,6 +1858,29 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+static int criu_checkpoint_process(struct kfd_process *p,
+uint8_t __user *user_priv_data,
+uint64_t *priv_offset)
+{
+   struct kfd_criu_process_priv_data process_priv;
+   int ret;
+
+   memset(_priv, 0, sizeof(process_priv));
+
+   process_priv.version = KFD_CRIU_PRIV_VERSION;
+
+   ret = copy_to_user(user_priv_data + *priv_offset,
&process_priv, sizeof(process_priv));
+
+   if (ret) {
+   pr_err("Failed to copy process information to user\n");
+   ret = -EFAULT;
+   }
+
+   *priv_offset += sizeof(process_priv);
+   return ret;
+}
+
 uint64_t get_process_num_bos(struct kfd_process *p)
 {
uint64_t num_of_bos = 0, i;
@@ -1877,6 +1901,111 @@ uint64_t get_process_num_bos(struct kfd_process *p)
return num_of_bos;
 }
 
+static int criu_checkpoint_bos(struct kfd_process *p,
+  uint32_t num_bos,
+  uint8_t __user *user_bos,
+  uint8_t __user *user_priv_data,
+  uint64_t *priv_offset)
+{
+   struct kfd_criu_bo_bucket *bo_buckets;
+   struct kfd_criu_bo_priv_data *bo_privs;
+   int ret = 0, pdd_index, bo_index = 0, id;
+   void *mem;
+
+   bo_buckets = kvzalloc(num_bos * sizeof(*bo_buckets), GFP_KERNEL);
+   if (!bo_buckets) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   bo_privs = kvzalloc(num_bos * sizeof(*bo_privs), GFP_KERNEL);
+   if (!bo_privs) {
+   ret = -ENOMEM;
+   goto exit;
+   }
+
+   for (pdd_index = 0; pdd_index < p->n_pdds; pdd_index++) {
+   struct kfd_process_device *pdd = p->pdds[pdd_index];
+   struct amdgpu_bo *dumper_bo;
+   struct kgd_mem *kgd_mem;
+
idr_for_each_entry(&pdd->alloc_idr, mem, id) {
+   struct kfd_criu_bo_bucket

[Patch v4 13/24] drm/amdkfd: CRIU checkpoint and restore queue mqds

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Checkpoint the contents of queue MQDs on CRIU dump and restore them
during CRIU restore.

Signed-off-by: David Yat Sin 

---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |   2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  72 +++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  14 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   7 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  67 
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  68 
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  68 
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  69 
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   5 +
 .../amd/amdkfd/kfd_process_queue_manager.c| 158 --
 11 files changed, 506 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 3fb155f756fd..146879cd3f2b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
p->pasid,
dev->id);
 
-   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL,
+   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL,
&doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 0c50e67e2b51..3a5303ebcabf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-   &properties, &qid, NULL, NULL);
+   &properties, &qid, NULL, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index a0f5b8533a03..a92274f9f1f7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -331,7 +331,8 @@ static void deallocate_vmid(struct device_queue_manager 
*dqm,
 static int create_queue_nocpsch(struct device_queue_manager *dqm,
struct queue *q,
struct qcm_process_device *qpd,
-   const struct kfd_criu_queue_priv_data *qd)
+   const struct kfd_criu_queue_priv_data *qd,
+   const void *restore_mqd)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -390,8 +391,14 @@ static int create_queue_nocpsch(struct 
device_queue_manager *dqm,
retval = -ENOMEM;
goto out_deallocate_doorbell;
}
-   mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
-   &q->gart_mqd_addr, &q->properties);
+
+   if (qd)
+   mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr,
+   &q->properties, restore_mqd);
+   else
+   mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
+   &q->gart_mqd_addr, &q->properties);
+
if (q->properties.is_active) {
if (!dqm->sched_running) {
WARN_ONCE(1, "Load non-HWS mqd while stopped\n");
@@ -1339,7 +1346,8 @@ static void destroy_kernel_queue_cpsch(struct 
device_queue_manager *dqm,
 
 static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue 
*q,
struct qcm_process_device *qpd,
-   const struct kfd_criu_queue_priv_data *qd)
+   const struct kfd_criu_queue_priv_data *qd,
+   const void *restore_mqd)
 {
int retval;
struct mqd_manager *mqd_mgr;
@@ -1385,8 +1393,12 @@ static int create_queue_cpsch(struct 
device_queue_manager *dqm, struct queue *q,
 * updates the is_evicted flag but is a no-op otherwise.
 */
q->properties.is_evicted = !!qpd->evicted;
-   mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
-   &q->gart_mqd_addr, &q->properties);
+   if (qd)
+   mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr,
+   &q->properties, restore_mqd);
+   else
+   mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
+   &q->gart_mqd_addr, &q->properties);
 
list_add(&q->list, &qpd->queues_list);
qpd->queue_count++;
@@ -1774,6 +1786,50 @@ static int get_wave_state(struct device_queue_manager 
*dqm,

[Patch v4 04/24] drm/amdkfd: CRIU Implement KFD process_info ioctl

2021-12-22 Thread Rajneesh Bhardwaj
This IOCTL is expected to be called as a precursor to the actual
checkpoint operation. It does the basic discovery into the target
process seized by CRIU and relays the information to the userspace that
utilizes it to start the checkpoint operation via another dedicated
IOCTL.

The process_info IOCTL determines the number of GPUs and buffer objects
that are associated with the target process, and its process id in the
caller's namespace, since the /proc/pid/mem interface may be used to
drain the contents of the discovered buffer objects in userspace and
getpid there returns the pid of the CRIU dumper process. Also, the pid
of a process inside a container might be different than its global pid,
so return the ns pid.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 55 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14 ++
 3 files changed, 70 insertions(+), 1 deletion(-)
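The BO count this ioctl reports comes from walking every per-device allocation IDR and keeping only BOs whose GPU VA lies above the device's GPUVM base, as `get_process_num_bos()` below does. A userspace sketch of that filter, with an illustrative flat array standing in for the IDR walk:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative BO record: only the GPU VA matters for the count. */
struct bo_sketch { uint64_t va; };

/* Count BOs whose VA lies strictly above the per-device GPUVM base,
 * mirroring the (uint64_t)kgd_mem->va > pdd->gpuvm_base test; BOs at or
 * below the base are assumed to be reserved mappings and skipped. */
static uint64_t count_checkpointable_bos(const struct bo_sketch *bos,
                                         size_t n, uint64_t gpuvm_base)
{
    uint64_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (bos[i].va > gpuvm_base)
            count++;
    return count;
}
```

The plugin later sizes its bucket array from this count, which is why checkpoint re-validates it against `args->num_bos` before dumping.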

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1b863bd84c96..53d7a20e3c06 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1857,6 +1857,41 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+uint64_t get_process_num_bos(struct kfd_process *p)
+{
+   uint64_t num_of_bos = 0, i;
+
+   /* Run over all PDDs of the process */
for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+   void *mem;
+   int id;
+
idr_for_each_entry(&pdd->alloc_idr, mem, id) {
+   struct kgd_mem *kgd_mem = (struct kgd_mem *)mem;
+
+   if ((uint64_t)kgd_mem->va > pdd->gpuvm_base)
+   num_of_bos++;
+   }
+   }
+   return num_of_bos;
+}
+
+static void criu_get_process_object_info(struct kfd_process *p,
+uint32_t *num_bos,
+uint64_t *objs_priv_size)
+{
+   uint64_t priv_size;
+
+   *num_bos = get_process_num_bos(p);
+
+   if (objs_priv_size) {
+   priv_size = sizeof(struct kfd_criu_process_priv_data);
+   priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
+   *objs_priv_size = priv_size;
+   }
+}
+
 static int criu_checkpoint(struct file *filep,
   struct kfd_process *p,
   struct kfd_ioctl_criu_args *args)
@@ -1889,7 +1924,25 @@ static int criu_process_info(struct file *filep,
struct kfd_process *p,
struct kfd_ioctl_criu_args *args)
 {
-   return 0;
+   int ret = 0;
+
+   mutex_lock(>mutex);
+
+   if (!kfd_has_process_device_data(p)) {
+   pr_err("No pdd for given process\n");
+   ret = -ENODEV;
+   goto err_unlock;
+   }
+
+   args->pid = task_pid_nr_ns(p->lead_thread,
+   task_active_pid_ns(p->lead_thread));
+
criu_get_process_object_info(p, &args->num_bos, &args->priv_data_size);
+
+   dev_dbg(kfd_device, "Num of bos:%u\n", args->num_bos);
+err_unlock:
+   mutex_unlock(>mutex);
+   return ret;
 }
 
 static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void 
*data)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index e68f692362bb..4d9bc7af03af 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -950,6 +950,8 @@ void *kfd_process_device_translate_handle(struct 
kfd_process_device *p,
 void kfd_process_device_remove_obj_handle(struct kfd_process_device *pdd,
int handle);
 
+bool kfd_has_process_device_data(struct kfd_process *p);
+
 /* PASIDs */
 int kfd_pasid_init(void);
 void kfd_pasid_exit(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d4c8a6948a9f..f77d556ca0fc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1456,6 +1456,20 @@ static int init_doorbell_bitmap(struct 
qcm_process_device *qpd,
return 0;
 }
 
+bool kfd_has_process_device_data(struct kfd_process *p)
+{
+   int i;
+
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+
+   if (pdd)
+   return true;
+   }
+
+   return false;
+}
+
 struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
struct kfd_process *p)
 {
-- 
2.17.1



[Patch v4 09/24] drm/amdkfd: CRIU add queues support

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

Add support to the existing CRIU ioctls to save the number of queues
and the queue properties for each queue during checkpoint, and
re-create the queues on restore.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 110 -
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  43 +++-
 .../amd/amdkfd/kfd_process_queue_manager.c| 212 ++
 3 files changed, 357 insertions(+), 8 deletions(-)
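With queues added, the private-data blob size reported to the plugin becomes one process header, one fixed-size record per BO, plus a variable-size queue blob (MQD contents make it differ per queue). A userspace sketch of that accounting in `criu_get_process_object_info()`, with made-up struct sizes standing in for the real `kfd_criu_*_priv_data` sizes:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder sizes; the real ones are sizeof() the kfd_criu_*_priv_data
 * structs in kfd_priv.h. */
#define PROCESS_PRIV_SIZE 16u
#define BO_PRIV_SIZE      48u

/* Total private-data size: process header + per-BO records + the queue
 * blob whose size the queue code reports separately. */
static uint64_t total_priv_size(uint32_t num_bos, uint64_t queues_priv_size)
{
    uint64_t size = PROCESS_PRIV_SIZE;

    size += (uint64_t)num_bos * BO_PRIV_SIZE;
    size += queues_priv_size;
    return size;
}
```

Checkpoint later recomputes this and fails with -EINVAL if it no longer matches what process_info told the plugin, catching processes that changed in between.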

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index db2bb302a8d4..9665c8657929 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2006,19 +2006,36 @@ static int criu_checkpoint_bos(struct kfd_process *p,
return ret;
 }
 
-static void criu_get_process_object_info(struct kfd_process *p,
-uint32_t *num_bos,
-uint64_t *objs_priv_size)
+static int criu_get_process_object_info(struct kfd_process *p,
+   uint32_t *num_bos,
+   uint32_t *num_objects,
+   uint64_t *objs_priv_size)
 {
+   int ret;
uint64_t priv_size;
+   uint32_t num_queues, num_events, num_svm_ranges;
+   uint64_t queues_priv_data_size;
 
*num_bos = get_process_num_bos(p);
 
+   ret = kfd_process_get_queue_info(p,
&num_queues, &queues_priv_data_size);
+   if (ret)
+   return ret;
+
+   num_events = 0; /* TODO: Implement Events */
+   num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */
+
+   *num_objects = num_queues + num_events + num_svm_ranges;
+
if (objs_priv_size) {
priv_size = sizeof(struct kfd_criu_process_priv_data);
priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data);
+   priv_size += queues_priv_data_size;
+   /* TODO: Add Events priv size */
+   /* TODO: Add SVM ranges priv size */
*objs_priv_size = priv_size;
}
+   return 0;
 }
 
 static int criu_checkpoint(struct file *filep,
@@ -2026,7 +2043,7 @@ static int criu_checkpoint(struct file *filep,
   struct kfd_ioctl_criu_args *args)
 {
int ret;
-   uint32_t num_bos;
+   uint32_t num_bos, num_objects;
uint64_t priv_size, priv_offset = 0;
 
if (!args->bos || !args->priv_data)
@@ -2048,9 +2065,12 @@ static int criu_checkpoint(struct file *filep,
goto exit_unlock;
}
 
-   criu_get_process_object_info(p, &num_bos, &priv_size);
+   ret = criu_get_process_object_info(p, &num_bos, &num_objects,
&priv_size);
+   if (ret)
+   goto exit_unlock;
 
if (num_bos != args->num_bos ||
+   num_objects != args->num_objects ||
priv_size != args->priv_data_size) {
 
ret = -EINVAL;
@@ -2067,6 +2087,17 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto exit_unlock;
 
+   if (num_objects) {
+   ret = kfd_criu_checkpoint_queues(p, (uint8_t __user 
*)args->priv_data,
+    &priv_offset);
+   if (ret)
+   goto exit_unlock;
+
+   /* TODO: Dump Events */
+
+   /* TODO: Dump SVM-Ranges */
+   }
+
 exit_unlock:
mutex_unlock(&p->mutex);
if (ret)
@@ -2340,6 +2371,62 @@ static int criu_restore_bos(struct kfd_process *p,
return ret;
 }
 
+static int criu_restore_objects(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args,
+   uint64_t *priv_offset,
+   uint64_t max_priv_data_size)
+{
+   int ret = 0;
+   uint32_t i;
+
+   BUILD_BUG_ON(offsetof(struct kfd_criu_queue_priv_data, object_type));
+   BUILD_BUG_ON(offsetof(struct kfd_criu_event_priv_data, object_type));
+   BUILD_BUG_ON(offsetof(struct kfd_criu_svm_range_priv_data, 
object_type));
+
+   for (i = 0; i < args->num_objects; i++) {
+   uint32_t object_type;
+
+   if (*priv_offset + sizeof(object_type) > max_priv_data_size) {
+   pr_err("Invalid private data size\n");
+   return -EINVAL;
+   }
+
+   ret = get_user(object_type, (uint32_t __user *)(args->priv_data 
+ *priv_offset));
+   if (ret) {
+   pr_err("Failed to copy private information from 
user\n");
+   goto exit;
+   }
+
+   switch (object_type) {
+   case KFD_CRIU_OBJECT_TYPE_QUEUE:
+   ret = kfd_criu_restore_queue(p, (uint8_t __user 
*)args->priv_data,
+priv_offset, 
max_priv_data_size);
+   

[Patch v4 10/24] drm/amdkfd: CRIU restore queue ids

2021-12-22 Thread Rajneesh Bhardwaj
From: David Yat Sin 

When re-creating queues during CRIU restore, restore each queue with
the same queue id value that was used during CRIU dump.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 

---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +
 .../amd/amdkfd/kfd_process_queue_manager.c| 37 +++
 4 files changed, 34 insertions(+), 9 deletions(-)
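Reserving a caller-chosen qid instead of allocating the next free one is a test-and-set on the per-process queue-slot bitmap, failing when the slot is out of range or already taken, as `assign_queue_slot_by_qid()` below does. A userspace sketch with a plain bitmap standing in for the kernel's `__test_and_set_bit()` (the queue limit here is a stand-in for `KFD_MAX_NUM_OF_QUEUES_PER_PROCESS`):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_QUEUES 1024u   /* illustrative limit */

/* Reserve a specific qid, failing like assign_queue_slot_by_qid() when
 * the slot is out of range or already in use. */
static int reserve_qid(uint64_t *bitmap, unsigned int qid)
{
    uint64_t mask;

    if (qid >= MAX_QUEUES)
        return -1;                      /* -EINVAL */

    mask = 1ull << (qid % 64);
    if (bitmap[qid / 64] & mask)
        return -2;                      /* -ENOSPC: qid already in use */

    bitmap[qid / 64] |= mask;           /* __test_and_set_bit() stand-in */
    return 0;
}
```

Because restore re-uses checkpointed qids, a duplicate qid in the image is rejected rather than silently remapped, keeping userspace handles valid.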

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 9665c8657929..3fb155f756fd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
p->pasid,
dev->id);
 
-   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id,
+   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id,
NULL,
&doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 1e30717b5253..0c50e67e2b51 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-   &properties, &qid, NULL);
+   &properties, &qid, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 7c2679a23aa3..8272bd5c4600 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -461,6 +461,7 @@ enum KFD_QUEUE_PRIORITY {
  * it's user mode or kernel mode queue.
  *
  */
+
 struct queue_properties {
enum kfd_queue_type type;
enum kfd_queue_format format;
@@ -1156,6 +1157,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
struct file *f,
struct queue_properties *properties,
unsigned int *qid,
+   const struct kfd_criu_queue_priv_data *q_data,
uint32_t *p_doorbell_offset_in_process);
 int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
 int pqm_update_queue_properties(struct process_queue_manager *pqm, unsigned 
int qid,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 480ad794df4e..275aeebc58fa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -42,6 +42,20 @@ static inline struct process_queue_node *get_queue_by_qid(
return NULL;
 }
 
+static int assign_queue_slot_by_qid(struct process_queue_manager *pqm,
+   unsigned int qid)
+{
+   if (qid >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
+   return -EINVAL;
+
+   if (__test_and_set_bit(qid, pqm->queue_slot_bitmap)) {
+   pr_err("Cannot create new queue because requested qid(%u) is in 
use\n", qid);
+   return -ENOSPC;
+   }
+
+   return 0;
+}
+
 static int find_available_queue_slot(struct process_queue_manager *pqm,
unsigned int *qid)
 {
@@ -194,6 +208,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
struct file *f,
struct queue_properties *properties,
unsigned int *qid,
+   const struct kfd_criu_queue_priv_data *q_data,
uint32_t *p_doorbell_offset_in_process)
 {
int retval;
@@ -225,7 +240,12 @@ int pqm_create_queue(struct process_queue_manager *pqm,
if (pdd->qpd.queue_count >= max_queues)
return -ENOSPC;
 
-   retval = find_available_queue_slot(pqm, qid);
+   if (q_data) {
+   retval = assign_queue_slot_by_qid(pqm, q_data->q_id);
+   *qid = q_data->q_id;
+   } else
+   retval = find_available_queue_slot(pqm, qid);
+
if (retval != 0)
return retval;
 
@@ -528,7 +548,7 @@ int kfd_process_get_queue_info(struct kfd_process *p,
return 0;
 }
 
-static void criu_dump_queue(struct kfd_process_device *pdd,
+static void criu_checkpoint_queue(struct kfd_process_device *pdd,
   struct queue *q,
   struct kfd_criu_queue_priv_data *q_data)
 {
@@ -560,7 +580,7 @@ static void criu_dump_queue(struct kfd_process

[Patch v4 03/24] drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs

2021-12-22 Thread Rajneesh Bhardwaj
Checkpoint-Restore in Userspace (CRIU) is a powerful tool that can
snapshot a running process and later restore it on the same or a remote
machine, but it expects processes that have a device file (e.g. a GPU)
associated with them to provide the necessary driver support to assist
CRIU and its extensible plugin interface. Thus, in order to support
checkpoint-restore of any ROCm process, the AMD Radeon Open Compute
kernel driver needs to provide a set of new APIs that expose the
necessary VRAM metadata and its contents to a userspace component
(the CRIU plugin) that can store it in the form of image files.

This introduces some new ioctls which will be used to checkpoint and
restore any KFD-bound user process. KFD doesn't allow any arbitrary
ioctl call unless it is made by the group leader process. Since these
ioctls are expected to be called from a KFD CRIU plugin, which has
elevated ptrace-attach privileges and the CAP_CHECKPOINT_RESTORE
capability attached to the file descriptors, modify KFD to allow such calls.

(API redesigned by David Yat Sin)
Suggested-by: Felix Kuehling 
Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 94 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 65 +++-
 include/uapi/linux/kfd_ioctl.h   | 79 +++-
 3 files changed, 235 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 4bfc0c8ab764..1b863bd84c96 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "kfd_priv.h"
@@ -1856,6 +1857,75 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+static int criu_checkpoint(struct file *filep,
+  struct kfd_process *p,
+  struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_restore(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_unpause(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_resume(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int criu_process_info(struct file *filep,
+   struct kfd_process *p,
+   struct kfd_ioctl_criu_args *args)
+{
+   return 0;
+}
+
+static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void 
*data)
+{
+   struct kfd_ioctl_criu_args *args = data;
+   int ret;
+
+   dev_dbg(kfd_device, "CRIU operation: %d\n", args->op);
+   switch (args->op) {
+   case KFD_CRIU_OP_PROCESS_INFO:
+   ret = criu_process_info(filep, p, args);
+   break;
+   case KFD_CRIU_OP_CHECKPOINT:
+   ret = criu_checkpoint(filep, p, args);
+   break;
+   case KFD_CRIU_OP_UNPAUSE:
+   ret = criu_unpause(filep, p, args);
+   break;
+   case KFD_CRIU_OP_RESTORE:
+   ret = criu_restore(filep, p, args);
+   break;
+   case KFD_CRIU_OP_RESUME:
+   ret = criu_resume(filep, p, args);
+   break;
+   default:
+   dev_dbg(kfd_device, "Unsupported CRIU operation:%d\n", 
args->op);
+   ret = -EINVAL;
+   break;
+   }
+
+   if (ret)
+   dev_dbg(kfd_device, "CRIU operation:%d err:%d\n", args->op, 
ret);
+
+   return ret;
+}
+
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
.cmd_drv = 0, .name = #ioctl}
@@ -1959,6 +2029,9 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 
AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_XNACK_MODE,
kfd_ioctl_set_xnack_mode, 0),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_OP,
+   kfd_ioctl_criu, KFD_IOC_FLAG_CHECKPOINT_RESTORE),
 };
 
 #define AMDKFD_CORE_IOCTL_COUNT	ARRAY_SIZE(amdkfd_ioctls)
@@ -1973,6 +2046,7 @@ static long kfd_ioctl(struct file *filep, unsigned int 
cmd, unsigned long arg)
char *kdata = NULL;
unsigned int usize, asize;
int retcode = -EINVAL;
+   bool ptrace_attached = false;
 
if (nr >= AMDKFD_CORE_IOCTL_COUNT)
goto err_i1;
@@ -1998,7 +2072,15 @@ static long kfd_ioctl(struct file *filep, unsigned int 
cmd, unsigned long arg)
 * processes need to create their own KFD device context.
 */
process 

[Patch v4 07/24] drm/amdkfd: CRIU Implement KFD resume ioctl

2021-12-22 Thread Rajneesh Bhardwaj
This adds support to create userptr BOs on restore and introduces a new
ioctl to restart memory notifiers for the restored userptr BOs.
When doing a CRIU restore, MMU notifications can happen anytime after we
call amdgpu_mn_register. Prevent MMU notifications until we reach stage-4
of the restore process, i.e. the criu_resume ioctl is received and the
process is ready to be resumed. This ioctl is different from the other
KFD CRIU ioctls since it is called by the CRIU master restore process for
all the target processes being resumed by CRIU.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  6 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 51 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 44 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 35 +++--
 5 files changed, 123 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index fcbc8a9c9e06..5c5fc839f701 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -131,6 +131,7 @@ struct amdkfd_process_info {
atomic_t evicted_bos;
struct delayed_work restore_userptr_work;
struct pid *pid;
+   bool block_mmu_notifications;
 };
 
 int amdgpu_amdkfd_init(void);
@@ -269,7 +270,7 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void 
*drm_priv);
 int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
struct amdgpu_device *adev, uint64_t va, uint64_t size,
void *drm_priv, struct kgd_mem **mem,
-   uint64_t *offset, uint32_t flags);
+   uint64_t *offset, uint32_t flags, bool criu_resume);
 int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
struct amdgpu_device *adev, struct kgd_mem *mem, void *drm_priv,
uint64_t *size);
@@ -297,6 +298,9 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device 
*adev,
 int amdgpu_amdkfd_get_tile_config(struct amdgpu_device *adev,
struct tile_config *config);
 void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev);
+void amdgpu_amdkfd_block_mmu_notifications(void *p);
+int amdgpu_amdkfd_criu_resume(void *p);
+
 #if IS_ENABLED(CONFIG_HSA_AMD)
 void amdgpu_amdkfd_gpuvm_init_mem_limits(void);
 void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 90b985436878..5679fb75ec88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -846,7 +846,8 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem 
*mem,
  *
  * Returns 0 for success, negative errno for errors.
  */
-static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr)
+static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr,
+  bool criu_resume)
 {
struct amdkfd_process_info *process_info = mem->process_info;
struct amdgpu_bo *bo = mem->bo;
@@ -868,6 +869,17 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr)
goto out;
}
 
+   if (criu_resume) {
+   /*
+* During a CRIU restore operation, the userptr buffer objects
+* will be validated in the restore_userptr_work worker at a
+* later stage when it is scheduled by another ioctl called by
+* CRIU master process for the target pid for restore.
+*/
+   atomic_inc(&mem->invalid);
+   mutex_unlock(&process_info->lock);
+   return 0;
+   }
ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages);
if (ret) {
pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
@@ -1240,6 +1252,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
INIT_DELAYED_WORK(>restore_userptr_work,
  amdgpu_amdkfd_restore_userptr_worker);
 
+   info->block_mmu_notifications = false;
*process_info = info;
*ef = dma_fence_get(>eviction_fence->base);
}
@@ -1456,10 +1469,37 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void 
*drm_priv)
return avm->pd_phys_addr;
 }
 
+void amdgpu_amdkfd_block_mmu_notifications(void *p)
+{
+   struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p;
+
+   pinfo->block_mmu_notifications = true;
+}
+
+int amdgpu_amdkfd_criu_resume(void *p)
+{
+   int ret = 0;
+   struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p;
+
+   mutex_lock(&pinfo->lock);
+   pr_debug("scheduling work\n");
+   atomic_inc

[Patch v4 01/24] x86/configs: CRIU update debug rock defconfig

2021-12-22 Thread Rajneesh Bhardwaj
 - Update debug config for Checkpoint-Restore (CR) support
 - Also include necessary options for CR with docker containers.

Signed-off-by: Rajneesh Bhardwaj 
---
 arch/x86/configs/rock-dbg_defconfig | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/arch/x86/configs/rock-dbg_defconfig 
b/arch/x86/configs/rock-dbg_defconfig
index 4877da183599..bc2a34666c1d 100644
--- a/arch/x86/configs/rock-dbg_defconfig
+++ b/arch/x86/configs/rock-dbg_defconfig
@@ -249,6 +249,7 @@ CONFIG_KALLSYMS_ALL=y
 CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
 CONFIG_KALLSYMS_BASE_RELATIVE=y
 # CONFIG_USERFAULTFD is not set
+CONFIG_USERFAULTFD=y
 CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
 CONFIG_KCMP=y
 CONFIG_RSEQ=y
@@ -1015,6 +1016,11 @@ CONFIG_PACKET_DIAG=y
 CONFIG_UNIX=y
 CONFIG_UNIX_SCM=y
 CONFIG_UNIX_DIAG=y
+CONFIG_SMC_DIAG=y
+CONFIG_XDP_SOCKETS_DIAG=y
+CONFIG_INET_MPTCP_DIAG=y
+CONFIG_TIPC_DIAG=y
+CONFIG_VSOCKETS_DIAG=y
 # CONFIG_TLS is not set
 CONFIG_XFRM=y
 CONFIG_XFRM_ALGO=y
@@ -1052,15 +1058,17 @@ CONFIG_SYN_COOKIES=y
 # CONFIG_NET_IPVTI is not set
 # CONFIG_NET_FOU is not set
 # CONFIG_NET_FOU_IP_TUNNELS is not set
-# CONFIG_INET_AH is not set
-# CONFIG_INET_ESP is not set
-# CONFIG_INET_IPCOMP is not set
-CONFIG_INET_TUNNEL=y
-CONFIG_INET_DIAG=y
-CONFIG_INET_TCP_DIAG=y
-# CONFIG_INET_UDP_DIAG is not set
-# CONFIG_INET_RAW_DIAG is not set
-# CONFIG_INET_DIAG_DESTROY is not set
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_ESP_OFFLOAD=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+CONFIG_INET_RAW_DIAG=m
+CONFIG_INET_DIAG_DESTROY=y
 CONFIG_TCP_CONG_ADVANCED=y
 # CONFIG_TCP_CONG_BIC is not set
 CONFIG_TCP_CONG_CUBIC=y
@@ -1085,12 +1093,14 @@ CONFIG_TCP_MD5SIG=y
 CONFIG_IPV6=y
 # CONFIG_IPV6_ROUTER_PREF is not set
 # CONFIG_IPV6_OPTIMISTIC_DAD is not set
-CONFIG_INET6_AH=y
-CONFIG_INET6_ESP=y
-# CONFIG_INET6_ESP_OFFLOAD is not set
-# CONFIG_INET6_ESPINTCP is not set
-# CONFIG_INET6_IPCOMP is not set
-# CONFIG_IPV6_MIP6 is not set
+CONFIG_INET6_AH=m
+CONFIG_INET6_ESP=m
+CONFIG_INET6_ESP_OFFLOAD=m
+CONFIG_INET6_IPCOMP=m
+CONFIG_IPV6_MIP6=m
+CONFIG_INET6_XFRM_TUNNEL=m
+CONFIG_INET_DCCP_DIAG=m
+CONFIG_INET_SCTP_DIAG=m
 # CONFIG_IPV6_ILA is not set
 # CONFIG_IPV6_VTI is not set
 CONFIG_IPV6_SIT=y
@@ -1146,8 +1156,13 @@ CONFIG_NF_CT_PROTO_UDPLITE=y
 # CONFIG_NF_CONNTRACK_SANE is not set
 # CONFIG_NF_CONNTRACK_SIP is not set
 # CONFIG_NF_CONNTRACK_TFTP is not set
-# CONFIG_NF_CT_NETLINK is not set
-# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
+CONFIG_COMPAT_NETLINK_MESSAGES=y
+CONFIG_NF_CT_NETLINK=m
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_SCSI_NETLINK=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
 CONFIG_NF_NAT=m
 CONFIG_NF_NAT_REDIRECT=y
 CONFIG_NF_NAT_MASQUERADE=y
@@ -1992,7 +2007,7 @@ CONFIG_NETCONSOLE_DYNAMIC=y
 CONFIG_NETPOLL=y
 CONFIG_NET_POLL_CONTROLLER=y
 # CONFIG_RIONET is not set
-# CONFIG_TUN is not set
+CONFIG_TUN=y
 # CONFIG_TUN_VNET_CROSS_LE is not set
 CONFIG_VETH=y
 # CONFIG_NLMON is not set
@@ -3990,7 +4005,7 @@ CONFIG_MANDATORY_FILE_LOCKING=y
 CONFIG_FSNOTIFY=y
 CONFIG_DNOTIFY=y
 CONFIG_INOTIFY_USER=y
-# CONFIG_FANOTIFY is not set
+CONFIG_FANOTIFY=y
 CONFIG_QUOTA=y
 CONFIG_QUOTA_NETLINK_INTERFACE=y
 # CONFIG_PRINT_QUOTA_WARNING is not set
-- 
2.17.1



[Patch v4 00/24] CHECKPOINT RESTORE WITH ROCm

2021-12-22 Thread Rajneesh Bhardwaj
CRIU is a user space tool which is very popular for container live
migration in datacentres. It can checkpoint a running application, save
its complete state, memory contents and all system resources to images
on disk which can be migrated to another machine and restored later.
More information on CRIU can be found at https://criu.org/Main_Page

CRIU currently does not support Checkpoint / Restore with applications
that have devices files open so it cannot perform checkpoint and restore
on GPU devices which are very complex and have their own VRAM managed
privately. CRIU, however can support external devices by using a plugin
architecture. We feel that we are getting close to finalizing our IOCTL
APIs which were again changed since V3 for an improved modular design.

Our changes to the CRIU user space tool can be obtained from here:
https://github.com/RadeonOpenCompute/criu/tree/amdgpu_rfc-211222

We have tested the following scenarios:
 - Checkpoint / Restore of a Pytorch (BERT) workload
 - kfdtests with queues and events
 - Gfx9 and Gfx10 based multi GPU test systems 
 - On baremetal and inside a docker container
 - Restoring on a different system

V1: Initial
V2: Addressed review comments
V3: Rebased on latest amd-staging-drm-next (5.15 based)
v4: New API design and basic support for SVM; however, there is an
outstanding issue with SVM restore which is currently under debug.
Hopefully it won't impact the ioctl APIs, as SVM ranges are treated as
private data hidden from user space, like queues and events, with the new
approach.


David Yat Sin (9):
  drm/amdkfd: CRIU Implement KFD unpause operation
  drm/amdkfd: CRIU add queues support
  drm/amdkfd: CRIU restore queue ids
  drm/amdkfd: CRIU restore sdma id for queues
  drm/amdkfd: CRIU restore queue doorbell id
  drm/amdkfd: CRIU checkpoint and restore queue mqds
  drm/amdkfd: CRIU checkpoint and restore queue control stack
  drm/amdkfd: CRIU checkpoint and restore events
  drm/amdkfd: CRIU implement gpu_id remapping

Rajneesh Bhardwaj (15):
  x86/configs: CRIU update debug rock defconfig
  x86/configs: Add rock-rel_defconfig for amd-feature-criu branch
  drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs
  drm/amdkfd: CRIU Implement KFD process_info ioctl
  drm/amdkfd: CRIU Implement KFD checkpoint ioctl
  drm/amdkfd: CRIU Implement KFD restore ioctl
  drm/amdkfd: CRIU Implement KFD resume ioctl
  drm/amdkfd: CRIU export BOs as prime dmabuf objects
  drm/amdkfd: CRIU checkpoint and restore xnack mode
  drm/amdkfd: CRIU allow external mm for svm ranges
  drm/amdkfd: use user_gpu_id for svm ranges
  drm/amdkfd: CRIU Discover svm ranges
  drm/amdkfd: CRIU Save Shared Virtual Memory ranges
  drm/amdkfd: CRIU prepare for svm resume
  drm/amdkfd: CRIU resume shared virtual memory ranges

 arch/x86/configs/rock-dbg_defconfig   |   53 +-
 arch/x86/configs/rock-rel_defconfig   | 4927 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|6 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   51 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   20 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 1453 -
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  185 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |   18 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  313 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   14 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |   72 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |   74 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |   89 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |   81 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  166 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   86 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  377 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  326 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h  |   39 +
 include/uapi/linux/kfd_ioctl.h|   79 +-
 22 files changed, 8099 insertions(+), 334 deletions(-)
 create mode 100644 arch/x86/configs/rock-rel_defconfig

-- 
2.17.1



[Patch v2] drm/amdgpu: Don't inherit GEM object VMAs in child process

2021-12-10 Thread Rajneesh Bhardwaj
When an application having open file access to a node forks, its shared
mappings also get reflected in the address space of the child process,
even though the child cannot access them with the object permissions
applied. Given the existing permission checks on the gem objects, it is
reasonable to also create the VMAs with the VM_DONTCOPY flag, so a user
space application doesn't need to explicitly call the
madvise(addr, len, MADV_DONTFORK) system call to prevent the pages in
the mapped range from appearing in the address space of the child
process. It also prevents memory leaks due to additional reference
counts on the mapped BOs in the child process that prevented freeing the
memory in the parent, which we had previously worked around in user
space inside the thunk library.

Additionally, we faced this issue when using CRIU to checkpoint-restore
an application that had such inherited mappings in the child, which
confuse CRIU when it mmaps on restore. Having this flag set for the
render node VMAs helps. VMAs mapped via KFD already take care of this,
so this is needed only for the render nodes.

To limit the impact of the change to user space consumers such as OpenGL
etc, limit it to KFD BOs only.

Cc: Felix Kuehling 

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---

Changes in v2:
 * Addressed Christian's concerns for user space impact
 * Further reduced the scope to KFD BOs only

 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index a224b5295edd..64a7931eda8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -263,6 +263,9 @@ static int amdgpu_gem_object_mmap(struct drm_gem_object 
*obj, struct vm_area_str
!(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)))
vma->vm_flags &= ~VM_MAYWRITE;
 
+   if (bo->kfd_bo)
+   vma->vm_flags |= VM_DONTCOPY;
+
return drm_gem_ttm_mmap(obj, vma);
 }
 
-- 
2.17.1



[PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-08 Thread Rajneesh Bhardwaj
When an application having open file access to a node forks, its shared
mappings also get reflected in the address space of the child process,
even though the child cannot access them with the object permissions
applied. Given the existing permission checks on the gem objects, it is
reasonable to also create the VMAs with the VM_DONTCOPY flag, so a user
space application doesn't need to explicitly call the
madvise(addr, len, MADV_DONTFORK) system call to prevent the pages in
the mapped range from appearing in the address space of the child
process. It also prevents memory leaks due to additional reference
counts on the mapped BOs in the child process that prevented freeing the
memory in the parent, which we had previously worked around in user
space inside the thunk library.

Additionally, we faced this issue when using CRIU to checkpoint-restore
an application that had such inherited mappings in the child, which
confuse CRIU when it mmaps on restore. Having this flag set for the
render node VMAs helps. VMAs mapped via KFD already take care of this,
so this is needed only for the render nodes.

Cc: Felix Kuehling 

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/drm_gem.c   | 3 ++-
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 09c820045859..d9c4149f36dd 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1058,7 +1058,8 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned 
long obj_size,
goto err_drm_gem_object_put;
}
 
-   vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | 
VM_DONTDUMP;
+   vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND
+   | VM_DONTDUMP | VM_DONTCOPY;
vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
}
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 33680c94127c..420a4898fdd2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -566,7 +566,7 @@ int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct 
ttm_buffer_object *bo)
 
vma->vm_private_data = bo;
 
-   vma->vm_flags |= VM_PFNMAP;
+   vma->vm_flags |= VM_PFNMAP | VM_DONTCOPY;
vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
return 0;
 }
-- 
2.17.1



[PATCH] drm/amdkfd: fix kernel-doc and cleanup

2020-07-13 Thread Rajneesh Bhardwaj
 - fix some styling issues
 - fix kernel-doc annotations

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 61 +++
 1 file changed, 25 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 51ba2020732e..4b86d912a5e1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -97,7 +97,7 @@
  * Size of the per-process TBA+TMA buffer: 2 pages
  *
  * The first page is the TBA used for the CWSR ISA code. The second
- * page is used as TMA for daisy changing a user-mode trap handler.
+ * page is used as TMA for user-mode trap handler setup in daisy-chain mode.
  */
 #define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
 #define KFD_CWSR_TMA_OFFSET PAGE_SIZE
@@ -157,29 +157,19 @@ extern int debug_largebar;
  */
 extern int ignore_crat;
 
-/*
- * Set sh_mem_config.retry_disable on Vega10
- */
+/* Set sh_mem_config.retry_disable on GFX v9 */
 extern int amdgpu_noretry;
 
-/*
- * Halt if HWS hang is detected
- */
+/* Halt if HWS hang is detected */
 extern int halt_if_hws_hang;
 
-/*
- * Whether MEC FW support GWS barriers
- */
+/* Whether MEC FW support GWS barriers */
 extern bool hws_gws_support;
 
-/*
- * Queue preemption timeout in ms
- */
+/* Queue preemption timeout in ms */
 extern int queue_preemption_timeout_ms;
 
-/*
- * Enable eviction debug messages
- */
+/* Enable eviction debug messages */
 extern bool debug_evictions;
 
 enum cache_policy {
@@ -301,7 +291,7 @@ struct kfd_dev {
 
/* xGMI */
uint64_t hive_id;
-
+
/* UUID */
uint64_t unique_id;
 
@@ -313,7 +303,7 @@ struct kfd_dev {
/* Compute Profile ref. count */
atomic_t compute_profile;
 
-   /* Global GWS resource shared b/t processes*/
+   /* Global GWS resource shared between processes */
void *gws;
 
/* Clients watching SMI events */
@@ -333,7 +323,7 @@ void kfd_chardev_exit(void);
 struct device *kfd_chardev(void);
 
 /**
- * enum kfd_unmap_queues_filter
+ * enum kfd_unmap_queues_filter - Enum for queue filters.
  *
  * @KFD_UNMAP_QUEUES_FILTER_SINGLE_QUEUE: Preempts single queue.
  *
@@ -352,15 +342,17 @@ enum kfd_unmap_queues_filter {
 };
 
 /**
- * enum kfd_queue_type
+ * enum kfd_queue_type - Enum for various queue types.
  *
  * @KFD_QUEUE_TYPE_COMPUTE: Regular user mode queue type.
  *
- * @KFD_QUEUE_TYPE_SDMA: Sdma user mode queue type.
+ * @KFD_QUEUE_TYPE_SDMA: SDMA user mode queue type.
  *
  * @KFD_QUEUE_TYPE_HIQ: HIQ queue type.
  *
  * @KFD_QUEUE_TYPE_DIQ: DIQ queue type.
+ *
+ * @KFD_QUEUE_TYPE_SDMA_XGMI: Special SDMA queue for XGMI interface.
  */
 enum kfd_queue_type  {
KFD_QUEUE_TYPE_COMPUTE,
@@ -406,9 +398,9 @@ enum KFD_QUEUE_PRIORITY {
  *
  * @write_ptr: Defines the number of dwords written to the ring buffer.
  *
- * @doorbell_ptr: This field aim is to notify the H/W of new packet written to
- * the queue ring buffer. This field should be similar to write_ptr and the
- * user should update this field after he updated the write_ptr.
+ * @doorbell_ptr: Notifies the H/W of new packet written to the queue ring
+ * buffer. This field should be similar to write_ptr and the user should
+ * update this field after updating the write_ptr.
  *
  * @doorbell_off: The doorbell offset in the doorbell pci-bar.
  *
@@ -477,7 +469,7 @@ struct queue_properties {
  *
  * @list: Queue linked list.
  *
- * @mqd: The queue MQD.
+ * @mqd: The queue MQD (memory queue descriptor).
  *
  * @mqd_mem_obj: The MQD local gpu memory object.
  *
@@ -486,7 +478,7 @@ struct queue_properties {
  * @properties: The queue properties.
  *
  * @mec: Used only in no cp scheduling mode and identifies to micro engine id
- *  that the queue should be execute on.
+ *  that the queue should be executed on.
  *
  * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe
  *   id.
@@ -527,9 +519,6 @@ struct queue {
struct kobject kobj;
 };
 
-/*
- * Please read the kfd_mqd_manager.h description.
- */
 enum KFD_MQD_TYPE {
KFD_MQD_TYPE_HIQ = 0,   /* for hiq */
KFD_MQD_TYPE_CP,/* for cp queues and diq */
@@ -587,9 +576,7 @@ struct qcm_process_device {
 */
bool mapped_gws_queue;
 
-   /*
-* All the memory management data should be here too
-*/
+   /* All the memory management data should be here too */
uint64_t gds_context_area;
/* Contains page table flags such as AMDGPU_PTE_VALID since gfx9 */
uint64_t page_table_base;
@@ -789,11 +776,13 @@ extern DECLARE_HASHTABLE(kfd_processes_table, 
KFD_PROCESS_TABLE_SIZE);
 extern struct srcu_struct kfd_processes_srcu;
 
 /**
- * Ioctl function type.
+ * typedef amdkfd_ioctl_t - typedef for ioctl function pointer.
+ *
+ * @filep: pointer to file structure.
+ * @p: amdkfd process pointer.
+ * @data: pointer to arg that was copied from user.
  *
- * \param

[PATCH] drm/amdgpu: restrict bo mapping within gpu address limits

2020-06-02 Thread Rajneesh Bhardwaj
Enforce a strict check on BO mapping since on some systems, such as A+A
or hybrid ones, the CPU might support 5-level paging or can address
memory above 48 bits, while the GPU might be limited by hardware to just
48 bits. In general, this applies to all ASICs where this limitation can
be checked against their max_pfn range. This restricts BO mappings to
within the practical limits of the CPU and GPU for shared virtual memory
access.

Reviewed-by: Oak Zeng 
Reviewed-by: Christian König 
Reviewed-by: Hawking Zhang 
Acked-by: Alex Deucher 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7417754e9141..71e005cf2952 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2208,7 +2208,8 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
/* make sure object fit at this offset */
eaddr = saddr + size - 1;
if (saddr >= eaddr ||
-   (bo && offset + size > amdgpu_bo_size(bo)))
+   (bo && offset + size > amdgpu_bo_size(bo)) ||
+   (eaddr >= adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT))
return -EINVAL;
 
saddr /= AMDGPU_GPU_PAGE_SIZE;
@@ -2273,7 +2274,8 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
/* make sure object fit at this offset */
eaddr = saddr + size - 1;
if (saddr >= eaddr ||
-   (bo && offset + size > amdgpu_bo_size(bo)))
+   (bo && offset + size > amdgpu_bo_size(bo)) ||
+   (eaddr >= adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT))
return -EINVAL;
 
/* Allocate all the needed memory */
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu/gmc: Fix spelling mistake.

2020-04-15 Thread Rajneesh Bhardwaj
Fixes a recurring minor spelling mistake in the file.

Reviewed-by: Christian König 
Reviewed-by: Alex Deucher 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4f8fd067d150..acabb57aa8af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -136,8 +136,8 @@ uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo)
 /**
  * amdgpu_gmc_vram_location - try to find VRAM location
  *
- * @adev: amdgpu device structure holding all necessary informations
- * @mc: memory controller structure holding memory informations
+ * @adev: amdgpu device structure holding all necessary information
+ * @mc: memory controller structure holding memory information
  * @base: base address at which to put VRAM
  *
  * Function will try to place VRAM at base address provided
@@ -165,8 +165,8 @@ void amdgpu_gmc_vram_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc,
 /**
  * amdgpu_gmc_gart_location - try to find GART location
  *
- * @adev: amdgpu device structure holding all necessary informations
- * @mc: memory controller structure holding memory informations
+ * @adev: amdgpu device structure holding all necessary information
+ * @mc: memory controller structure holding memory information
  *
  * Function will place try to place GART before or after VRAM.
  *
@@ -207,8 +207,8 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
 
 /**
  * amdgpu_gmc_agp_location - try to find AGP location
- * @adev: amdgpu device structure holding all necessary informations
- * @mc: memory controller structure holding memory informations
+ * @adev: amdgpu device structure holding all necessary information
+ * @mc: memory controller structure holding memory information
  *
  * Function will place try to find a place for the AGP BAR in the MC address
  * space.
-- 
2.17.1



[Patch v4 1/4] drm/amdgpu: Fix missing error check in suspend

2020-02-10 Thread Rajneesh Bhardwaj
amdgpu_device_suspend might return an error code since it can be called
from both runtime and system suspend flows. Add the missing return code
in case of a failure.

Reviewed-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Reviewed-by: Alex Deucher 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1a8659efad53..058ad0231f8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1236,6 +1236,9 @@ static int amdgpu_pmops_runtime_suspend(struct device 
*dev)
drm_kms_helper_poll_disable(drm_dev);
 
ret = amdgpu_device_suspend(drm_dev, false);
+   if (ret)
+   return ret;
+
if (amdgpu_device_supports_boco(drm_dev)) {
/* Only need to handle PCI state in the driver for ATPX
 * PCI core handles it for _PR3.
-- 
2.17.1



[Patch v4 2/4] drm/amdkfd: show warning when kfd is locked

2020-02-10 Thread Rajneesh Bhardwaj
During system suspend the kfd driver acquires a lock that prohibits
further kfd actions unless the gpu is resumed. This adds some info which
can be useful while debugging.

Reviewed-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 275f79ab0900..86b919d82129 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -127,6 +127,8 @@ static int kfd_open(struct inode *inode, struct file *filep)
return PTR_ERR(process);
 
if (kfd_is_locked()) {
+   dev_dbg(kfd_device, "kfd is locked!\n"
+   "process %d unreferenced", process->pasid);
kfd_unref_process(process);
return -EAGAIN;
}
-- 
2.17.1



[Patch v4 3/4] drm/amdkfd: refactor runtime pm for baco

2020-02-10 Thread Rajneesh Bhardwaj
So far the kfd driver implemented the same routines for runtime and
system wide suspend and resume (s2idle or mem). During system wide
suspend the kfd acquires an atomic lock that prevents user processes
from creating queues and interacting with the kfd driver and amd gpu.
This mechanism created a problem when the amdgpu device is runtime
suspended with BACO enabled. Any application that relies on the kfd
driver fails to load because the driver reports a locked kfd device
since the gpu is runtime suspended.

However, in an ideal case, when the gpu is runtime suspended the kfd
driver should be able to:

 - auto resume the amdgpu driver whenever a client requests compute service
 - prevent runtime suspend for amdgpu while kfd is in use

This change refactors the amdgpu and amdkfd drivers to support BACO and
runtime power management.

Reviewed-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 12 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 29 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 42 --
 6 files changed, 70 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 8609287620ea..314c4a2a0354 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -178,18 +178,18 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
kgd2kfd_interrupt(adev->kfd.dev, ih_ring_entry);
 }
 
-void amdgpu_amdkfd_suspend(struct amdgpu_device *adev)
+void amdgpu_amdkfd_suspend(struct amdgpu_device *adev, bool run_pm)
 {
if (adev->kfd.dev)
-   kgd2kfd_suspend(adev->kfd.dev);
+   kgd2kfd_suspend(adev->kfd.dev, run_pm);
 }
 
-int amdgpu_amdkfd_resume(struct amdgpu_device *adev)
+int amdgpu_amdkfd_resume(struct amdgpu_device *adev, bool run_pm)
 {
int r = 0;
 
if (adev->kfd.dev)
-   r = kgd2kfd_resume(adev->kfd.dev);
+   r = kgd2kfd_resume(adev->kfd.dev, run_pm);
 
return r;
 }
@@ -713,11 +713,11 @@ void kgd2kfd_exit(void)
 {
 }
 
-void kgd2kfd_suspend(struct kfd_dev *kfd)
+void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm)
 {
 }
 
-int kgd2kfd_resume(struct kfd_dev *kfd)
+int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm)
 {
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f2957d1f..9e8db702d878 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -122,8 +122,8 @@ struct amdkfd_process_info {
 int amdgpu_amdkfd_init(void);
 void amdgpu_amdkfd_fini(void);
 
-void amdgpu_amdkfd_suspend(struct amdgpu_device *adev);
-int amdgpu_amdkfd_resume(struct amdgpu_device *adev);
+void amdgpu_amdkfd_suspend(struct amdgpu_device *adev, bool run_pm);
+int amdgpu_amdkfd_resume(struct amdgpu_device *adev, bool run_pm);
 void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry);
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
@@ -249,8 +249,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 struct drm_device *ddev,
 const struct kgd2kfd_shared_resources *gpu_resources);
 void kgd2kfd_device_exit(struct kfd_dev *kfd);
-void kgd2kfd_suspend(struct kfd_dev *kfd);
-int kgd2kfd_resume(struct kfd_dev *kfd);
+void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm);
+int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm);
 int kgd2kfd_pre_reset(struct kfd_dev *kfd);
 int kgd2kfd_post_reset(struct kfd_dev *kfd);
 void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8f4849f94fb0..9a7229c94b8e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3343,7 +3343,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
}
}
 
-   amdgpu_amdkfd_suspend(adev);
+   amdgpu_amdkfd_suspend(adev, !fbcon);
 
amdgpu_ras_suspend(adev);
 
@@ -3427,7 +3427,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
}
}
}
-   r = amdgpu_amdkfd_resume(adev);
+   r = amdgpu_amdkfd_resume(adev, !fbcon);
if (r)
return r;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 798ad1c8f799..42ee9ea5c45a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -732,7 +732,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 voi

[Patch v4 4/4] drm/amdgpu/runpm: enable runpm on baco capable VI+ asics

2020-02-10 Thread Rajneesh Bhardwaj
From: Alex Deucher 

Runtime pm with BACO seems to work reliably on VI+ asics except for a
few, so enable runpm on them, barring those where BACO is not yet
supported for runtime power management.

[rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to
enable runtime pm with BACO for kfd. Also fixed a checkpatch warning and
added extra checks for VEGA20 and ARCTURUS.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 3a0ea9096498..0f3563926ad1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -170,10 +170,16 @@ int amdgpu_driver_load_kms(struct drm_device *dev, 
unsigned long flags)
}
 
if (amdgpu_device_supports_boco(dev) &&
-   (amdgpu_runtime_pm != 0)) /* enable runpm by default */
+   (amdgpu_runtime_pm != 0)) /* enable runpm by default for boco */
adev->runpm = true;
else if (amdgpu_device_supports_baco(dev) &&
-(amdgpu_runtime_pm > 0)) /* enable runpm if runpm=1 */
+(amdgpu_runtime_pm != 0) &&
+(adev->asic_type >= CHIP_TOPAZ) &&
+(adev->asic_type != CHIP_VEGA20) &&
+(adev->asic_type != CHIP_ARCTURUS)) /* enable runpm on VI+ */
+   adev->runpm = true;
+   else if (amdgpu_device_supports_baco(dev) &&
+(amdgpu_runtime_pm > 0))  /* enable runpm if runpm=1 on CI */
adev->runpm = true;
 
/* Call ACPI methods: require modeset init
-- 
2.17.1



[Patch v4 0/4] Enable BACO with KFD

2020-02-10 Thread Rajneesh Bhardwaj
Changes in v4:
 * Addressed recent feedback from Felix.

Changes in v3:
 * Addressed review comments.
 * Rebased to latest amd-staging-drm-next.
 * Slightly modified patch 4 to avoid runpm on devices where baco is not
   yet fully supported for runtime pm but still is used for gpu reset.

Changes in v2:
 * Rebased on latest amd-staging-drm-next
 * Addressed review comments from Felix, Oak and Alex for v1
 * Removed 60 second hack for auto-suspend delay and simplified the
   logic
 * Dropped kfd debugfs patch
 * Folded in Alex's patch from this series to enable and test with kfd.
   https://patchwork.freedesktop.org/series/67885/ also fixed a
   checkpatch warning.

Link to v1: https://www.spinics.net/lists/amd-gfx/msg44551.html

This series aims to enable BACO by default on supported AMD platforms
and ensures that the AMD Kernel Fusion Driver can co-exist with this
feature when the GPU devices are runtime suspended and firmware pushes
the envelope to save more power with the BACO entry sequence. The
current approach makes sure that if KFD is using GPU services for
compute, it won't let the AMDGPU driver suspend; and if the AMDGPU
driver is already runtime suspended with the GPUs in deep power saving
mode with BACO, the KFD driver wakes up AMDGPU and then starts the
compute workload execution.

This series has been tested with a single GPU (fiji) and dual GPUs
(fiji and fiji/tonga) and seems to work fine with kfdtest. Also tested
system wide suspend to mem and suspend to idle with this series, and it
worked fine.



Alex Deucher (1):
  drm/amdgpu/runpm: enable runpm on baco capable VI+ asics

Rajneesh Bhardwaj (3):
  drm/amdgpu: Fix missing error check in suspend
  drm/amdkfd: show warning when kfd is locked
  drm/amdkfd: refactor runtime pm for baco

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 12 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 10 --
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 29 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 42 --
 9 files changed, 83 insertions(+), 28 deletions(-)

-- 
2.17.1



[Patch v3 3/4] drm/amdkfd: refactor runtime pm for baco

2020-02-06 Thread Rajneesh Bhardwaj
So far the kfd driver implemented the same routines for runtime and
system wide suspend and resume (s2idle or mem). During system wide
suspend the kfd acquires an atomic lock that prevents user processes
from creating queues and interacting with the kfd driver and amd gpu.
This mechanism created a problem when the amdgpu device is runtime
suspended with BACO enabled. Any application that relies on the kfd
driver fails to load because the driver reports a locked kfd device
since the gpu is runtime suspended.

However, in an ideal case, when the gpu is runtime suspended the kfd
driver should be able to:

 - auto resume the amdgpu driver whenever a client requests compute service
 - prevent runtime suspend for amdgpu while kfd is in use

This change refactors the amdgpu and amdkfd drivers to support BACO and
runtime power management.

Reviewed-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 12 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 29 +---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 40 --
 6 files changed, 68 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 8609287620ea..314c4a2a0354 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -178,18 +178,18 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
kgd2kfd_interrupt(adev->kfd.dev, ih_ring_entry);
 }
 
-void amdgpu_amdkfd_suspend(struct amdgpu_device *adev)
+void amdgpu_amdkfd_suspend(struct amdgpu_device *adev, bool run_pm)
 {
if (adev->kfd.dev)
-   kgd2kfd_suspend(adev->kfd.dev);
+   kgd2kfd_suspend(adev->kfd.dev, run_pm);
 }
 
-int amdgpu_amdkfd_resume(struct amdgpu_device *adev)
+int amdgpu_amdkfd_resume(struct amdgpu_device *adev, bool run_pm)
 {
int r = 0;
 
if (adev->kfd.dev)
-   r = kgd2kfd_resume(adev->kfd.dev);
+   r = kgd2kfd_resume(adev->kfd.dev, run_pm);
 
return r;
 }
@@ -713,11 +713,11 @@ void kgd2kfd_exit(void)
 {
 }
 
-void kgd2kfd_suspend(struct kfd_dev *kfd)
+void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm)
 {
 }
 
-int kgd2kfd_resume(struct kfd_dev *kfd)
+int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm)
 {
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f2957d1f..9e8db702d878 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -122,8 +122,8 @@ struct amdkfd_process_info {
 int amdgpu_amdkfd_init(void);
 void amdgpu_amdkfd_fini(void);
 
-void amdgpu_amdkfd_suspend(struct amdgpu_device *adev);
-int amdgpu_amdkfd_resume(struct amdgpu_device *adev);
+void amdgpu_amdkfd_suspend(struct amdgpu_device *adev, bool run_pm);
+int amdgpu_amdkfd_resume(struct amdgpu_device *adev, bool run_pm);
 void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry);
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
@@ -249,8 +249,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 struct drm_device *ddev,
 const struct kgd2kfd_shared_resources *gpu_resources);
 void kgd2kfd_device_exit(struct kfd_dev *kfd);
-void kgd2kfd_suspend(struct kfd_dev *kfd);
-int kgd2kfd_resume(struct kfd_dev *kfd);
+void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm);
+int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm);
 int kgd2kfd_pre_reset(struct kfd_dev *kfd);
 int kgd2kfd_post_reset(struct kfd_dev *kfd);
 void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f33c49ed4f94..18fa78806317 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3312,7 +3312,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
}
}
 
-   amdgpu_amdkfd_suspend(adev);
+   amdgpu_amdkfd_suspend(adev, !fbcon);
 
amdgpu_ras_suspend(adev);
 
@@ -3396,7 +3396,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
}
}
}
-   r = amdgpu_amdkfd_resume(adev);
+   r = amdgpu_amdkfd_resume(adev, !fbcon);
if (r)
return r;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 798ad1c8f799..42ee9ea5c45a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -732,7 +732,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 voi

[Patch v3 1/4] drm/amdgpu: Fix missing error check in suspend

2020-02-06 Thread Rajneesh Bhardwaj
amdgpu_device_suspend might return an error code since it can be called
from both runtime and system suspend flows. Add the missing return code
in case of a failure.

Reviewed-by: Oak Zeng 
Reviewed-by: Felix Kuehling 
Reviewed-by: Alex Deucher 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f28d040de3ce..0026ff56542c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1235,6 +1235,9 @@ static int amdgpu_pmops_runtime_suspend(struct device 
*dev)
drm_kms_helper_poll_disable(drm_dev);
 
ret = amdgpu_device_suspend(drm_dev, false);
+   if (ret)
+   return ret;
+
if (amdgpu_device_supports_boco(drm_dev)) {
/* Only need to handle PCI state in the driver for ATPX
 * PCI core handles it for _PR3.
-- 
2.17.1



[Patch v3 0/4] Enable BACO with KFD

2020-02-06 Thread Rajneesh Bhardwaj
Changes in v3:
 * Addressed review comments.
 * Rebased to latest amd-staging-drm-next.
 * Slightly modified patch 4 to avoid runpm on devices where baco is not
   yet fully supported for runtime pm but still is used for gpu reset.

Changes in v2:
 * Rebased on latest amd-staging-drm-next
 * Addressed review comments from Felix, Oak and Alex for v1
 * Removed 60 second hack for auto-suspend delay and simplified the
   logic
 * Dropped kfd debugfs patch
 * Folded in Alex's patch from this series to enable and test with kfd.
   https://patchwork.freedesktop.org/series/67885/ also fixed a
   checkpatch warning.

Link to v1: https://www.spinics.net/lists/amd-gfx/msg44551.html

This series aims to enable BACO by default on supported AMD platforms
and ensures that the AMD Kernel Fusion Driver can co-exist with this
feature when the GPU devices are runtime suspended and firmware pushes
the envelope to save more power with the BACO entry sequence. The
current approach makes sure that if KFD is using GPU services for
compute, it won't let the AMDGPU driver suspend; and if the AMDGPU
driver is already runtime suspended with the GPUs in deep power saving
mode with BACO, the KFD driver wakes up AMDGPU and then starts the
compute workload execution.

This series has been tested with a single GPU (fiji) and dual GPUs
(fiji and fiji/tonga) and seems to work fine with kfdtest. Also tested
system wide suspend to mem and suspend to idle with this series, and it
worked fine.


Alex Deucher (1):
  drm/amdgpu/runpm: enable runpm on baco capable VI+ asics

Rajneesh Bhardwaj (3):
  drm/amdgpu: Fix missing error check in suspend
  drm/amdkfd: show warning when kfd is locked
  drm/amdkfd: refactor runtime pm for baco

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 12 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 10 --
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 29 +---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 40 --
 9 files changed, 81 insertions(+), 28 deletions(-)

-- 
2.17.1


