[PATCH 16/24] drm/amdkfd: Clamp EOP queue size correctly on Gfx8

2017-08-15 Thread Felix Kuehling
From: Jay Cornwall 

Gfx8 HW incorrectly clamps CP_HQD_EOP_CONTROL.EOP_SIZE, which can
lead to scheduling deadlock due to SE EOP done counter overflow.

Enforce an EOP queue size limit which prevents the CP from sending
more than 0xFF events at a time.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Acked-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index f4c8c23..98a930e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -135,8 +135,15 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
3 << CP_HQD_IB_CONTROL__MIN_IB_AVAIL_SIZE__SHIFT |
mtype << CP_HQD_IB_CONTROL__MTYPE__SHIFT;
 
-   m->cp_hqd_eop_control |=
-   ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1;
+   /*
+* HW does not clamp this field correctly. Maximum EOP queue size
+* is constrained by per-SE EOP done signal count, which is 8-bit.
+* Limit is 0xFF EOP entries (= 0x7F8 dwords). CP will not submit
+* more than (EOP entry count - 1) so a queue size of 0x800 dwords
+* is safe, giving a maximum field value of 0xA.
+*/
+   m->cp_hqd_eop_control |= min(0xA,
+   ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1);
m->cp_hqd_eop_base_addr_lo =
lower_32_bits(q->eop_ring_buffer_address >> 8);
m->cp_hqd_eop_base_addr_hi =
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
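
For readers following the arithmetic in the comment above, here is a minimal userspace sketch of the clamp. It is not driver code: first_set_bit() is a stand-in for the kernel's ffs(), eop_size_field() is a hypothetical name, and the input is assumed to be a non-zero power-of-two size in bytes with 4-byte dwords.

```c
#define EOP_SIZE_FIELD_MAX 0xA	/* limit taken from the patch comment */

/* Stand-in for the kernel's ffs(): 1-based index of the lowest set
 * bit, or 0 if no bit is set.
 */
static int first_set_bit(unsigned int x)
{
	int pos = 1;

	if (x == 0)
		return 0;
	while ((x & 1u) == 0) {
		x >>= 1;
		pos++;
	}
	return pos;
}

/* Hypothetical helper mirroring the patched expression:
 * min(0xA, ffs(size_in_dwords) - 1 - 1)
 */
static int eop_size_field(unsigned int eop_ring_buffer_size)
{
	unsigned int dwords = eop_ring_buffer_size /
			      (unsigned int)sizeof(unsigned int);
	int field = first_set_bit(dwords) - 1 - 1;

	return field < EOP_SIZE_FIELD_MAX ? field : EOP_SIZE_FIELD_MAX;
}
```

Under these assumptions a 4 KiB buffer (0x400 dwords) yields field value 9, an 8 KiB buffer (0x800 dwords) yields 0xA, and anything larger is clamped to 0xA.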


[PATCH 05/24] drm/amdgpu: Remove hard-coded assumptions about compute pipes

2017-08-15 Thread Felix Kuehling
Remove hard-coded assumption that the first compute pipe is
reserved for amdgpu. Pipe 0 actually means pipe 0 now.

Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 5254562..31c4fbd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -186,7 +186,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+   uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
lock_srbm(kgd, mec, pipe, queue_id, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 133d066..c8ac402 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -147,7 +147,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+   uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
lock_srbm(kgd, mec, pipe, queue_id, 0);
@@ -216,7 +216,7 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
uint32_t mec;
uint32_t pipe;
 
-   mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+   mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
lock_srbm(kgd, mec, pipe, 0, 0);
-- 
2.7.4
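
The effect of dropping the pre-increment can be seen in a small standalone sketch. The struct and function names below are hypothetical and only model the two expressions from the diff; num_pipe_per_mec is passed in as a plain parameter.

```c
#include <stdint.h>

struct mec_pipe {
	uint32_t mec;
	uint32_t pipe;
};

/* Old mapping: pipe_id is incremented first, so pipe 0 of MEC 1 is
 * never selected by this path.
 */
static struct mec_pipe map_pipe_old(uint32_t pipe_id,
				    uint32_t num_pipe_per_mec)
{
	struct mec_pipe r;

	++pipe_id;
	r.mec = (pipe_id / num_pipe_per_mec) + 1;
	r.pipe = pipe_id % num_pipe_per_mec;
	return r;
}

/* New mapping: pipe 0 really means pipe 0. */
static struct mec_pipe map_pipe_new(uint32_t pipe_id,
				    uint32_t num_pipe_per_mec)
{
	struct mec_pipe r;

	r.mec = (pipe_id / num_pipe_per_mec) + 1;
	r.pipe = pipe_id % num_pipe_per_mec;
	return r;
}
```

Assuming 4 pipes per MEC, pipe_id 0 used to select MEC 1 pipe 1 and now selects MEC 1 pipe 0; pipe_id 3 used to select MEC 2 pipe 0 and now selects MEC 1 pipe 3.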



[PATCH 07/24] drm/amdkfd: Consolidate and clean up log commands

2017-08-15 Thread Felix Kuehling
From: Kent Russell 

Consolidate log commands so that dev_info(NULL, "Error...") uses the more
accurate pr_err, remove the module name from the log (can be seen via
dynamic debugging with +m), and the function name (can be seen via
dynamic debugging with +f). We also don't need debug messages saying
what function we're in. Those can be added by devs when needed.

Don't print vendor and device ID in error messages. They are typically
the same for all GPUs in a multi-GPU system. So this doesn't add any
value to the message.

Lastly, remove parentheses around %d, %i and 0x%llX.
According to kernel.org:
"Printing numbers in parentheses (%d) adds no value and should be
avoided."

Signed-off-by: Kent Russell 
Signed-off-by: Yong Zhao 
Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 64 -
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c| 38 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c|  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 51 ++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 81 +++---
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c  | 21 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c| 22 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c  | 16 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c|  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 10 ---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|  8 +--
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c| 34 -
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   |  4 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 27 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c  |  6 +-
 17 files changed, 158 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 2603b7c..6763972 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -142,12 +142,12 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
struct kfd_ioctl_create_queue_args *args)
 {
if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
-   pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
+   pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
return -EINVAL;
}
 
if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
-   pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
+   pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
return -EINVAL;
}
 
@@ -155,26 +155,26 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
(!access_ok(VERIFY_WRITE,
(const void __user *) args->ring_base_address,
sizeof(uint64_t)))) {
-   pr_err("kfd: can't access ring base address\n");
+   pr_err("Can't access ring base address\n");
return -EFAULT;
}
 
if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
-   pr_err("kfd: ring size must be a power of 2 or 0\n");
+   pr_err("Ring size must be a power of 2 or 0\n");
return -EINVAL;
}
 
if (!access_ok(VERIFY_WRITE,
(const void __user *) args->read_pointer_address,
sizeof(uint32_t))) {
-   pr_err("kfd: can't access read pointer\n");
+   pr_err("Can't access read pointer\n");
return -EFAULT;
}
 
if (!access_ok(VERIFY_WRITE,
(const void __user *) args->write_pointer_address,
sizeof(uint32_t))) {
-   pr_err("kfd: can't access write pointer\n");
+   pr_err("Can't access write pointer\n");
return -EFAULT;
}
 
@@ -182,7 +182,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
!access_ok(VERIFY_WRITE,
(const void __user *) args->eop_buffer_address,
sizeof(uint32_t))) {
-   pr_debug("kfd: can't access eop buffer");
+   pr_debug("Can't access eop buffer");
return -EFAULT;
}
 
@@ -190,7 +190,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
!access_ok(VERIFY_WRITE,
(const void __user *) 

[PATCH 22/24] drm/amdkfd: Adding new IOCTL for scratch memory v2

2017-08-15 Thread Felix Kuehling
From: Moses Reuben 

v2:
* Renamed ALLOC_MEMORY_OF_SCRATCH to SET_SCRATCH_BACKING_VA
* Removed size parameter from the ioctl, it was unused
* Removed hole in ioctl number space
* No more call to write_config_static_mem
* Return correct error code from ioctl

Signed-off-by: Moses Reuben 
Signed-off-by: Ben Goz 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 37 ++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  3 ++
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 ++
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |  1 +
 include/uapi/linux/kfd_ioctl.h | 11 ++-
 6 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 65b506f1..7436d34 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -848,6 +848,40 @@ static int kfd_ioctl_wait_events(struct file *filp, struct kfd_process *p,
 
return err;
 }
+static int kfd_ioctl_set_scratch_backing_va(struct file *filep,
+   struct kfd_process *p, void *data)
+{
+   struct kfd_ioctl_set_scratch_backing_va_args *args = data;
+   struct kfd_process_device *pdd;
+   struct kfd_dev *dev;
+   long err;
+
+   dev = kfd_device_by_id(args->gpu_id);
+   if (!dev)
+   return -EINVAL;
+
+   mutex_lock(&p->mutex);
+
+   pdd = kfd_bind_process_to_device(dev, p);
+   if (IS_ERR(pdd)) {
+   err = PTR_ERR(pdd);
+   goto bind_process_to_device_fail;
+   }
+
+   pdd->qpd.sh_hidden_private_base = args->va_addr;
+
+   mutex_unlock(&p->mutex);
+
+   if (sched_policy == KFD_SCHED_POLICY_NO_HWS && pdd->qpd.vmid != 0)
+   dev->kfd2kgd->set_scratch_backing_va(
+   dev->kgd, args->va_addr, pdd->qpd.vmid);
+
+   return 0;
+
+bind_process_to_device_fail:
+   mutex_unlock(&p->mutex);
+   return err;
+}
 
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
@@ -902,6 +936,9 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 
AMDKFD_IOCTL_DEF(AMDKFD_IOC_DBG_WAVE_CONTROL,
kfd_ioctl_dbg_wave_control, 0),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_SCRATCH_BACKING_VA,
+   kfd_ioctl_set_scratch_backing_va, 0),
 };
 
 #define AMDKFD_CORE_IOCTL_COUNT ARRAY_SIZE(amdkfd_ioctls)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 618ac65..53a66e8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -270,6 +270,9 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
q->pipe, q->queue);
 
+   dqm->dev->kfd2kgd->set_scratch_backing_va(
+   dqm->dev->kgd, qpd->sh_hidden_private_base, qpd->vmid);
+
retval = mqd->load_mqd(mqd, q->mqd, q->pipe, q->queue, &q->properties,
   q->process->mm);
if (retval)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index fadc56a..72c3cba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -24,6 +24,7 @@
 #include "kfd_device_queue_manager.h"
 #include "cik_regs.h"
 #include "oss/oss_2_4_sh_mask.h"
+#include "gca/gfx_7_2_sh_mask.h"
 
 static bool set_cache_memory_policy_cik(struct device_queue_manager *dqm,
   struct qcm_process_device *qpd,
@@ -123,6 +124,7 @@ static int register_process_cik(struct device_queue_manager *dqm,
} else {
temp = get_sh_mem_bases_nybble_64(pdd);
qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+   qpd->sh_mem_config |= 1  << SH_MEM_CONFIG__PRIVATE_ATC__SHIFT;
}
 
pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index 15e81ae..40e9ddd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -135,6 +135,8 @@ static int register_process_vi(struct device_queue_manager *dqm,
qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
qpd->sh_mem_config |= SH_MEM_ADDRESS_MODE_HSA64 <<

[PATCH 21/24] drm/amdgpu: Add kgd/kfd interface to support scratch memory v2

2017-08-15 Thread Felix Kuehling
From: Moses Reuben 

v2:
* Shortened headline
* Removed write_config_static_mem, it gets initialized by gfx_v?_0_gpu_init
* Renamed alloc_memory_of_scratch to set_scratch_backing_va
* Made set_scratch_backing_va a void function
* Documented set_scratch_backing in kgd_kfd_interface.h

Signed-off-by: Moses Reuben 
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 15 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 16 +++-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  5 +
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index d1719be..3793d7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -135,6 +135,8 @@ static uint16_t get_atc_vmid_pasid_mapping_pasid(struct kgd_dev *kgd,
 static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid);
 
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+   uint64_t va, uint32_t vmid);
 
 static const struct kfd2kgd_calls kfd2kgd = {
.init_gtt_mem_allocation = alloc_gtt_mem,
@@ -159,7 +161,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_atc_vmid_pasid_mapping_pasid = get_atc_vmid_pasid_mapping_pasid,
.get_atc_vmid_pasid_mapping_valid = get_atc_vmid_pasid_mapping_valid,
.write_vmid_invalidate_request = write_vmid_invalidate_request,
-   .get_fw_version = get_fw_version
+   .get_fw_version = get_fw_version,
+   .set_scratch_backing_va = set_scratch_backing_va,
 };
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void)
@@ -652,6 +655,16 @@ static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid)
WREG32(mmVM_INVALIDATE_REQUEST, 1 << vmid);
 }
 
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+   uint64_t va, uint32_t vmid)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
+
+   lock_srbm(kgd, 0, 0, 0, vmid);
+   WREG32(mmSH_HIDDEN_PRIVATE_BASE_VMID, va);
+   unlock_srbm(kgd);
+}
+
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 {
struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 29a6f5d..61f6457 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -94,6 +94,8 @@ static uint16_t get_atc_vmid_pasid_mapping_pasid(struct kgd_dev *kgd,
uint8_t vmid);
 static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid);
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+   uint64_t va, uint32_t vmid);
 
 static const struct kfd2kgd_calls kfd2kgd = {
.init_gtt_mem_allocation = alloc_gtt_mem,
@@ -120,12 +122,14 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_atc_vmid_pasid_mapping_valid =
get_atc_vmid_pasid_mapping_valid,
.write_vmid_invalidate_request = write_vmid_invalidate_request,
-   .get_fw_version = get_fw_version
+   .get_fw_version = get_fw_version,
+   .set_scratch_backing_va = set_scratch_backing_va,
 };
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void)
 {
return (struct kfd2kgd_calls *)&kfd2kgd;
+   return (struct kfd2kgd_calls *)&kfd2kgd;
 }
 
 static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
@@ -573,6 +577,16 @@ static uint32_t kgd_address_watch_get_offset(struct kgd_dev *kgd,
return 0;
 }
 
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+   uint64_t va, uint32_t vmid)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
+
+   lock_srbm(kgd, 0, 0, 0, vmid);
+   WREG32(mmSH_HIDDEN_PRIVATE_BASE_VMID, va);
+   unlock_srbm(kgd);
+}
+
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 {
struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index ffafda0..2a9cc5e 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -128,6 +128,9 @@ struct kgd2kfd_shared_resources {
  *
  * @get_fw_version: Returns FW versions from the header
  *
+ * @set_scratch_backing_va: Sets VA for scratch backing memory of a VMID.
+ * Only used for no cp scheduling mode
+ *
  * This 

[PATCH 23/24] drm/amdgpu: Add kgd kfd interface get_tile_config() v2

2017-08-15 Thread Felix Kuehling
From: Yong Zhao 

v2:
* Removed amdgpu_amdkfd prefix from static functions
* Documented get_tile_config in kgd_kfd_interface.h

Signed-off-by: Yong Zhao 
Signed-off-by: Felix Kuehling 
Acked-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 26 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 26 +++
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   | 14 
 3 files changed, 66 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 3793d7b..b9dbbf9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -138,6 +138,31 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
 static void set_scratch_backing_va(struct kgd_dev *kgd,
uint64_t va, uint32_t vmid);
 
+/* Because of REG_GET_FIELD() being used, we put this function in the
+ * asic specific file.
+ */
+static int get_tile_config(struct kgd_dev *kgd,
+   struct tile_config *config)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+   config->gb_addr_config = adev->gfx.config.gb_addr_config;
+   config->num_banks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+   MC_ARB_RAMCFG, NOOFBANK);
+   config->num_ranks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+   MC_ARB_RAMCFG, NOOFRANKS);
+
+   config->tile_config_ptr = adev->gfx.config.tile_mode_array;
+   config->num_tile_configs =
+   ARRAY_SIZE(adev->gfx.config.tile_mode_array);
+   config->macro_tile_config_ptr =
+   adev->gfx.config.macrotile_mode_array;
+   config->num_macro_tile_configs =
+   ARRAY_SIZE(adev->gfx.config.macrotile_mode_array);
+
+   return 0;
+}
+
 static const struct kfd2kgd_calls kfd2kgd = {
.init_gtt_mem_allocation = alloc_gtt_mem,
.free_gtt_mem = free_gtt_mem,
@@ -163,6 +188,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
.write_vmid_invalidate_request = write_vmid_invalidate_request,
.get_fw_version = get_fw_version,
.set_scratch_backing_va = set_scratch_backing_va,
+   .get_tile_config = get_tile_config,
 };
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 61f6457..fb6e5db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -97,6 +97,31 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
 static void set_scratch_backing_va(struct kgd_dev *kgd,
uint64_t va, uint32_t vmid);
 
+/* Because of REG_GET_FIELD() being used, we put this function in the
+ * asic specific file.
+ */
+static int get_tile_config(struct kgd_dev *kgd,
+   struct tile_config *config)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+   config->gb_addr_config = adev->gfx.config.gb_addr_config;
+   config->num_banks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+   MC_ARB_RAMCFG, NOOFBANK);
+   config->num_ranks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+   MC_ARB_RAMCFG, NOOFRANKS);
+
+   config->tile_config_ptr = adev->gfx.config.tile_mode_array;
+   config->num_tile_configs =
+   ARRAY_SIZE(adev->gfx.config.tile_mode_array);
+   config->macro_tile_config_ptr =
+   adev->gfx.config.macrotile_mode_array;
+   config->num_macro_tile_configs =
+   ARRAY_SIZE(adev->gfx.config.macrotile_mode_array);
+
+   return 0;
+}
+
 static const struct kfd2kgd_calls kfd2kgd = {
.init_gtt_mem_allocation = alloc_gtt_mem,
.free_gtt_mem = free_gtt_mem,
@@ -124,6 +149,7 @@ static const struct kfd2kgd_calls kfd2kgd = {
.write_vmid_invalidate_request = write_vmid_invalidate_request,
.get_fw_version = get_fw_version,
.set_scratch_backing_va = set_scratch_backing_va,
+   .get_tile_config = get_tile_config,
 };
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void)
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 2a9cc5e..94277cb 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -87,6 +87,17 @@ struct kgd2kfd_shared_resources {
size_t doorbell_start_offset;
 };
 
+struct tile_config {
+   uint32_t *tile_config_ptr;
+   uint32_t *macro_tile_config_ptr;
+   

[PATCH 01/24] drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts

2017-08-15 Thread Felix Kuehling
Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index d5e19b5..8b14a4e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -823,7 +823,7 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
(dev->kgd, vmid)) {
-   if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
+   if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_pasid
(dev->kgd, vmid) == p->pasid) {
pr_debug("Killing wave fronts of vmid %d and pasid %d\n",
vmid, p->pasid);
-- 
2.7.4
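
Since this patch has no description, a small model of the typo may be useful. This is a sketch only: struct vmid_mapping and both lookup functions are hypothetical and merely imitate the shape of the loop in the diff, with a plain array standing in for the hardware VMID-to-PASID mapping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VMID mapping state. */
struct vmid_mapping {
	bool valid;
	uint16_t pasid;
};

/* Before the fix: the "valid" query is called twice, so the second
 * test compares a boolean against the PASID.
 */
static int find_vmid_buggy(const struct vmid_mapping *map,
			   int first, int last, uint16_t pasid)
{
	int vmid;

	for (vmid = first; vmid <= last; vmid++)
		if (map[vmid].valid && map[vmid].valid == pasid)
			return vmid;
	return -1;
}

/* After the fix: the second test reads the mapped PASID. */
static int find_vmid_fixed(const struct vmid_mapping *map,
			   int first, int last, uint16_t pasid)
{
	int vmid;

	for (vmid = first; vmid <= last; vmid++)
		if (map[vmid].valid && map[vmid].pasid == pasid)
			return vmid;
	return -1;
}
```

In this model the buggy variant can only match when the PASID happens to equal 1, which suggests why the wavefront reset would rarely find the process's VMID before the fix.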



[PATCH 19/24] drm/amd: Update MEC HQD loading code for KFD

2017-08-15 Thread Felix Kuehling
Various bug fixes and improvements that accumulated over the last two
years.

Signed-off-by: Felix Kuehling 
Acked-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  16 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 130 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 165 ++---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |   7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c  |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h   |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  23 +--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|  16 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |   5 -
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h|  11 +-
 drivers/gpu/drm/radeon/radeon_kfd.c|  12 +-
 11 files changed, 322 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index b8802a5..8d689ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -26,6 +26,7 @@
 #define AMDGPU_AMDKFD_H_INCLUDED
 
 #include 
+#include 
 #include 
 
 struct amdgpu_device;
@@ -60,4 +61,19 @@ uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 
+#define read_user_wptr(mmptr, wptr, dst)   \
+   ({  \
+   bool valid = false; \
+   if ((mmptr) && (wptr)) {\
+   if ((mmptr) == current->mm) {   \
+   valid = !get_user((dst), (wptr));   \
+   } else if (current->mm == NULL) {   \
+   use_mm(mmptr);  \
+   valid = !get_user((dst), (wptr));   \
+   unuse_mm(mmptr);\
+   }   \
+   }   \
+   valid;  \
+   })
+
 #endif /* AMDGPU_AMDKFD_H_INCLUDED */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index dcd90e8..d1719be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -39,6 +39,12 @@
 #include "gmc/gmc_7_1_sh_mask.h"
 #include "cik_structs.h"
 
+enum hqd_dequeue_request_type {
+   NO_ACTION = 0,
+   DRAIN_PIPE,
+   RESET_WAVES
+};
+
 enum {
MAX_TRAPID = 8, /* 3 bits in the bitfield. */
MAX_WATCH_ADDRESSES = 4
@@ -96,12 +102,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-   uint32_t queue_id, uint32_t __user *wptr);
+   uint32_t queue_id, uint32_t __user *wptr,
+   uint32_t wptr_shift, uint32_t wptr_mask,
+   struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
uint32_t pipe_id, uint32_t queue_id);
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+   enum kfd_preempt_type reset_type,
unsigned int utimeout, uint32_t pipe_id,
uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@@ -290,20 +299,38 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-   uint32_t queue_id, uint32_t __user *wptr)
+   uint32_t queue_id, uint32_t __user *wptr,
+   uint32_t wptr_shift, uint32_t wptr_mask,
+   struct mm_struct *mm)
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
-   uint32_t wptr_shadow, is_wptr_shadow_valid;
struct cik_mqd *m;
+   uint32_t *mqd_hqd;
+   uint32_t reg, wptr_val, data;
 
m = get_mqd(mqd);
 
-   is_wptr_shadow_valid = !get_user(wptr_shadow, wptr);
-   if (is_wptr_shadow_valid)
-   m->cp_hqd_pq_wptr = wptr_shadow;
-
acquire_queue(kgd, pipe_id, queue_id);
-   

[PATCH 13/24] drm/amdkfd: Allocate gtt_sa_bitmap in long units

2017-08-15 Thread Felix Kuehling
gtt_sa_bitmap is accessed by bitmap functions, which operate on longs.
Therefore the array should be allocated in long units. Also round up
in case the number of bits is not a multiple of BITS_PER_LONG.

Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index cb7ed02..416955f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -395,7 +395,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
unsigned int chunk_size)
 {
-   unsigned int num_of_bits;
+   unsigned int num_of_longs;
 
BUG_ON(buf_size < chunk_size);
BUG_ON(buf_size == 0);
@@ -404,10 +404,10 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
kfd->gtt_sa_chunk_size = chunk_size;
kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
 
-   num_of_bits = kfd->gtt_sa_num_of_chunks / BITS_PER_BYTE;
-   BUG_ON(num_of_bits == 0);
+   num_of_longs = (kfd->gtt_sa_num_of_chunks + BITS_PER_LONG - 1) /
+   BITS_PER_LONG;
 
-   kfd->gtt_sa_bitmap = kzalloc(num_of_bits, GFP_KERNEL);
+   kfd->gtt_sa_bitmap = kcalloc(num_of_longs, sizeof(long), GFP_KERNEL);
 
if (!kfd->gtt_sa_bitmap)
return -ENOMEM;
-- 
2.7.4
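
The sizing change can be checked with a short userspace sketch. The names below are hypothetical; BITS_PER_LONG_SKETCH stands in for the kernel's BITS_PER_LONG, and the two helpers only reproduce the old and new size computations, not the allocation itself.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_BYTE_SKETCH 8
#define BITS_PER_LONG_SKETCH (sizeof(long) * CHAR_BIT)

/* Old computation: bytes, rounded down. A bitmap for fewer than 8
 * chunks came out as 0 bytes, and the result was generally not a
 * whole number of longs.
 */
static size_t old_bitmap_bytes(size_t num_chunks)
{
	return num_chunks / BITS_PER_BYTE_SKETCH;
}

/* New computation: longs, rounded up, so every chunk has a bit and
 * the bitmap helpers can always touch whole longs.
 */
static size_t new_bitmap_longs(size_t num_chunks)
{
	return (num_chunks + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;
}
```

For example, 12 chunks needed 12 bits but the old code asked for a single byte; the new code asks for one long, which covers them regardless of whether long is 32 or 64 bits.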



[PATCH 20/24] drm/amdgpu: Program SH_STATIC_MEM_CONFIG globally, not per-VMID

2017-08-15 Thread Felix Kuehling
This register only has a single instance in the hardware. Its value
applies to all VMIDs.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 53a4af7..0086876 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -1921,6 +1921,7 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
   ELEMENT_SIZE, 1);
sh_static_mem_cfg = REG_SET_FIELD(sh_static_mem_cfg, SH_STATIC_MEM_CONFIG,
   INDEX_STRIDE, 3);
+   WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
 
mutex_lock(&adev->srbm_mutex);
for (i = 0; i < adev->vm_manager.id_mgr[0].num_ids; i++) {
@@ -1934,7 +1935,6 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
WREG32(mmSH_MEM_APE1_BASE, 1);
WREG32(mmSH_MEM_APE1_LIMIT, 0);
WREG32(mmSH_MEM_BASES, sh_mem_base);
-   WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
}
cik_srbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 0710b0b..832e592 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -3707,6 +3707,8 @@ static void gfx_v8_0_gpu_init(struct amdgpu_device *adev)
   ELEMENT_SIZE, 1);
sh_static_mem_cfg = REG_SET_FIELD(sh_static_mem_cfg, SH_STATIC_MEM_CONFIG,
   INDEX_STRIDE, 3);
+   WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
+
mutex_lock(&adev->srbm_mutex);
for (i = 0; i < adev->vm_manager.id_mgr[0].num_ids; i++) {
vi_srbm_select(adev, 0, 0, 0, i);
@@ -3730,7 +3732,6 @@ static void gfx_v8_0_gpu_init(struct amdgpu_device *adev)
 
WREG32(mmSH_MEM_APE1_BASE, 1);
WREG32(mmSH_MEM_APE1_LIMIT, 0);
-   WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
}
vi_srbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
-- 
2.7.4



[PATCH 18/24] drm/amdgpu: Disable GFX PG on CZ

2017-08-15 Thread Felix Kuehling
It's causing problems with user mode queues and the HIQ, and can
lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.

Signed-off-by: Felix Kuehling 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 6cac291..9ff69b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1028,8 +1028,7 @@ static int vi_common_early_init(void *handle)
/* rev0 hardware requires workarounds to support PG */
adev->pg_flags = 0;
if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
-   adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
-   AMD_PG_SUPPORT_GFX_SMG |
+   adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
AMD_PG_SUPPORT_GFX_PIPELINE |
AMD_PG_SUPPORT_CP |
AMD_PG_SUPPORT_UVD |
-- 
2.7.4



[PATCH 14/24] drm/amdkfd: Handle remaining BUG_ONs more gracefully v2

2017-08-15 Thread Felix Kuehling
In most cases, BUG_ONs can be replaced with WARN_ON with an error
return. In some void functions just turn them into a WARN_ON and
possibly an early exit.

v2:
* Cleaned up error handling in pm_send_unmap_queue
* Removed redundant WARN_ON in kfd_process_destroy_delayed

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c|  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c|  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 16 
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 19 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c  | 20 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c| 44 ++
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   |  9 ++---
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  7 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c  |  4 +-
 14 files changed, 84 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 3841cad..0aa021a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -60,7 +60,8 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
unsigned int *ib_packet_buff;
int status;
 
-   BUG_ON(!size_in_bytes);
+   if (WARN_ON(!size_in_bytes))
+   return -EINVAL;
 
kq = dbgdev->kq;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 2dc..3da25f7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -64,7 +64,8 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct 
kfd_dev *pdev)
enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
struct kfd_dbgmgr *new_buff;
 
-   BUG_ON(!pdev->init_complete);
+   if (WARN_ON(!pdev->init_complete))
+   return false;
 
new_buff = kfd_alloc_struct(new_buff);
if (!new_buff) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 416955f..f628ac3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -98,7 +98,7 @@ static const struct kfd_device_info 
*lookup_device_info(unsigned short did)
 
for (i = 0; i < ARRAY_SIZE(supported_devices); i++) {
if (supported_devices[i].did == did) {
-   BUG_ON(!supported_devices[i].device_info);
+   WARN_ON(!supported_devices[i].device_info);
return supported_devices[i].device_info;
}
}
@@ -212,9 +212,8 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int 
pasid,
flags);
 
dev = kfd_device_by_pci_dev(pdev);
-   BUG_ON(!dev);
-
-   kfd_signal_iommu_event(dev, pasid, address,
+   if (!WARN_ON(!dev))
+   kfd_signal_iommu_event(dev, pasid, address,
flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
 
return AMD_IOMMU_INV_PRI_RSP_INVALID;
@@ -397,9 +396,12 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned 
int buf_size,
 {
unsigned int num_of_longs;
 
-   BUG_ON(buf_size < chunk_size);
-   BUG_ON(buf_size == 0);
-   BUG_ON(chunk_size == 0);
+   if (WARN_ON(buf_size < chunk_size))
+   return -EINVAL;
+   if (WARN_ON(buf_size == 0))
+   return -EINVAL;
+   if (WARN_ON(chunk_size == 0))
+   return -EINVAL;
 
kfd->gtt_sa_chunk_size = chunk_size;
kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 2486dfb..e553c5e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -388,7 +388,8 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
 {
struct mqd_manager *mqd;
 
-   BUG_ON(type >= KFD_MQD_TYPE_MAX);
+   if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+   return NULL;
 
pr_debug("mqd type %d\n", type);
 
@@ -513,7 +514,7 @@ static void uninitialize_nocpsch(struct 
device_queue_manager *dqm)
 {
int i;
 
-   BUG_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
+   WARN_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
 
kfree(dqm->allocated_queues);
for (i = 0 ; i < KFD_MQD_TYPE_MAX ; i++)
@@ -1129,8 +1130,8 @@ struct device_queue_manager 
*device_queue_manager_init(struct kfd_dev *dev)
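
The conversion this patch applies, BUG_ON replaced by WARN_ON plus an error return, can be sketched outside the kernel. This is a minimal userspace illustration, assuming a simplified WARN_ON stand-in that only prints (the real kernel macro also dumps a stack trace); warn_on_impl() and check_sa_sizes() are hypothetical names, the latter modelled on the kfd_gtt_sa_init() checks shown above.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's WARN_ON(): report the failed
 * condition and hand its truth value back so it can sit inside an if.
 * Assumption: the real macro does more (stack dump, taint); this
 * sketch only prints.
 */
static int warn_on_impl(int cond, const char *expr, const char *file, int line)
{
	if (cond)
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	return cond;
}
#define WARN_ON(cond) warn_on_impl(!!(cond), #cond, __FILE__, __LINE__)

/* Hypothetical helper modelled on the kfd_gtt_sa_init() argument checks. */
static int check_sa_sizes(unsigned int buf_size, unsigned int chunk_size)
{
	/* Each former BUG_ON becomes a warning and a recoverable error. */
	if (WARN_ON(buf_size < chunk_size))
		return -EINVAL;
	if (WARN_ON(buf_size == 0))
		return -EINVAL;
	if (WARN_ON(chunk_size == 0))
		return -EINVAL;

	return 0;
}
```

The caller now sees -EINVAL and can unwind, where BUG_ON would have halted the machine.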

[PATCH 08/24] drm/amdkfd: Change x==NULL/false references to !x

2017-08-15 Thread Felix Kuehling
From: Kent Russell 

Upstream prefers the !x notation to x==NULL or x==false. Along those lines
change the ==true or !=NULL references as well. Also make the references
to !x the same, excluding () for readability.

Signed-off-by: Kent Russell 
Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   | 22 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c| 20 -
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c|  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 ++---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 50 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c|  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c  |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c| 26 +--
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   |  6 +--
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c  |  6 +--
 15 files changed, 85 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6763972..44c6bfe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -265,7 +265,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
 
pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
dev = kfd_device_by_id(args->gpu_id);
-   if (dev == NULL) {
+   if (!dev) {
pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
return -EINVAL;
}
@@ -400,7 +400,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
}
 
dev = kfd_device_by_id(args->gpu_id);
-   if (dev == NULL)
+   if (!dev)
return -EINVAL;
 
mutex_lock(&p->mutex);
@@ -443,7 +443,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
long status = 0;
 
dev = kfd_device_by_id(args->gpu_id);
-   if (dev == NULL)
+   if (!dev)
return -EINVAL;
 
if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -465,7 +465,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
return PTR_ERR(pdd);
}
 
-   if (dev->dbgmgr == NULL) {
+   if (!dev->dbgmgr) {
/* In case of a legal call, we have no dbgmgr yet */
create_ok = kfd_dbgmgr_create(_ptr, dev);
if (create_ok) {
@@ -494,7 +494,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
long status;
 
dev = kfd_device_by_id(args->gpu_id);
-   if (dev == NULL)
+   if (!dev)
return -EINVAL;
 
if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -505,7 +505,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
mutex_lock(kfd_get_dbgmgr_mutex());
 
status = kfd_dbgmgr_unregister(dev->dbgmgr, p);
-   if (status == 0) {
+   if (!status) {
kfd_dbgmgr_destroy(dev->dbgmgr);
dev->dbgmgr = NULL;
}
@@ -539,7 +539,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
memset((void *) _info, 0, sizeof(struct dbg_address_watch_info));
 
dev = kfd_device_by_id(args->gpu_id);
-   if (dev == NULL)
+   if (!dev)
return -EINVAL;
 
if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -646,7 +646,7 @@ static int kfd_ioctl_dbg_wave_control(struct file *filep,
sizeof(wac_info.trapId);
 
dev = kfd_device_by_id(args->gpu_id);
-   if (dev == NULL)
+   if (!dev)
return -EINVAL;
 
if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -782,9 +782,9 @@ static int kfd_ioctl_get_process_apertures(struct file 
*filp,
"scratch_limit %llX\n", pdd->scratch_limit);
 
args->num_of_nodes++;
-   } while ((pdd = kfd_get_next_process_device_data(p, pdd)) !=
-   NULL &&
-   (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
+
+   pdd = kfd_get_next_process_device_data(p, pdd);
+   } while (pdd && (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
}
 
mutex_unlock(&p->mutex);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index bf8ee19..0ef9136 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -77,7 +77,7 @@ static int 

[PATCH 12/24] drm/amdkfd: Fix doorbell initialization and finalization

2017-08-15 Thread Felix Kuehling
Handle errors in doorbell aperture initialization instead of BUG_ON.
iounmap doorbell aperture during finalization.

Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  9 -
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 13 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  3 ++-
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index e28e818..cb7ed02 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -260,7 +260,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
goto kfd_gtt_sa_init_error;
}
 
-   kfd_doorbell_init(kfd);
+   if (kfd_doorbell_init(kfd)) {
+   dev_err(kfd_device,
+   "Error initializing doorbell aperture\n");
+   goto kfd_doorbell_error;
+   }
 
if (kfd_topology_add_device(kfd)) {
dev_err(kfd_device, "Error adding device to topology\n");
@@ -315,6 +319,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 kfd_interrupt_error:
kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
+   kfd_doorbell_fini(kfd);
+kfd_doorbell_error:
kfd_gtt_sa_fini(kfd);
 kfd_gtt_sa_init_error:
kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
@@ -332,6 +338,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
amd_iommu_free_device(kfd->pdev);
kfd_interrupt_exit(kfd);
kfd_topology_remove_device(kfd);
+   kfd_doorbell_fini(kfd);
kfd_gtt_sa_fini(kfd);
kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 0055270..acf4d2a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -59,7 +59,7 @@ static inline size_t doorbell_process_allocation(void)
 }
 
 /* Doorbell calculations for device init. */
-void kfd_doorbell_init(struct kfd_dev *kfd)
+int kfd_doorbell_init(struct kfd_dev *kfd)
 {
size_t doorbell_start_offset;
size_t doorbell_aperture_size;
@@ -95,7 +95,8 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
doorbell_process_allocation());
 
-   BUG_ON(!kfd->doorbell_kernel_ptr);
+   if (!kfd->doorbell_kernel_ptr)
+   return -ENOMEM;
 
pr_debug("Doorbell initialization:\n");
pr_debug("doorbell base   == 0x%08lX\n",
@@ -115,6 +116,14 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
 
pr_debug("doorbell kernel address == 0x%08lX\n",
(uintptr_t)kfd->doorbell_kernel_ptr);
+
+   return 0;
+}
+
+void kfd_doorbell_fini(struct kfd_dev *kfd)
+{
+   if (kfd->doorbell_kernel_ptr)
+   iounmap(kfd->doorbell_kernel_ptr);
 }
 
 int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 469b7ea..f0d55cc0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -576,7 +576,8 @@ unsigned int kfd_pasid_alloc(void);
 void kfd_pasid_free(unsigned int pasid);
 
 /* Doorbells */
-void kfd_doorbell_init(struct kfd_dev *kfd);
+int kfd_doorbell_init(struct kfd_dev *kfd);
+void kfd_doorbell_fini(struct kfd_dev *kfd);
 int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma);
 u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
unsigned int *doorbell_off);
-- 
2.7.4
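
The error handling in this patch follows the usual unwind ladder: each init step that can fail jumps to a label that tears down only what was already set up, and the exit path runs the same teardown in reverse order. A minimal userspace sketch of that shape, assuming malloc/free as stand-ins for ioremap/iounmap and the GTT allocator; struct fake_dev, its fail_* knobs and every function name below are hypothetical and not part of amdkfd.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device state; not the real struct kfd_dev. */
struct fake_dev {
	void *gtt;          /* stand-in for the GTT sub-allocator */
	void *doorbell;     /* stand-in for doorbell_kernel_ptr */
	bool fail_doorbell; /* test knob: make doorbell_init() fail */
	bool fail_topology; /* test knob: make the following step fail */
};

/* Returns 0 on success or a negative errno, like the patched init. */
static int doorbell_init(struct fake_dev *d)
{
	d->doorbell = d->fail_doorbell ? NULL : malloc(64);
	if (!d->doorbell)
		return -ENOMEM;
	return 0;
}

/* Safe to call even if the mapping was never created. */
static void doorbell_fini(struct fake_dev *d)
{
	if (d->doorbell)
		free(d->doorbell);
	d->doorbell = NULL;
}

static bool device_init(struct fake_dev *d)
{
	d->gtt = malloc(64);
	if (!d->gtt)
		goto gtt_error;
	if (doorbell_init(d))
		goto doorbell_error;
	if (d->fail_topology)
		goto topology_error;
	return true;

	/* Labels fall through, undoing steps in reverse order. */
topology_error:
	doorbell_fini(d);
doorbell_error:
	free(d->gtt);
	d->gtt = NULL;
gtt_error:
	return false;
}

static void device_exit(struct fake_dev *d)
{
	doorbell_fini(d);
	free(d->gtt);
	d->gtt = NULL;
}
```

Placing the new label between the topology and GTT labels mirrors where the doorbell step sits in the init sequence, so a doorbell failure skips doorbell_fini() but still releases the GTT state.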



[PATCH 04/24] drm/amdkfd: Fix allocated_queues bitmap initialization

2017-08-15 Thread Felix Kuehling
Use shared_resources.queue_bitmap to determine the queues available
for KFD in each pipe.

Signed-off-by: Felix Kuehling 
Reviewed-by: Oded Gabbay 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 42de22b..9d2796b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -513,7 +513,7 @@ static int init_scheduler(struct device_queue_manager *dqm)
 
 static int initialize_nocpsch(struct device_queue_manager *dqm)
 {
-   int i;
+   int pipe, queue;
 
BUG_ON(!dqm);
 
@@ -531,8 +531,14 @@ static int initialize_nocpsch(struct device_queue_manager 
*dqm)
return -ENOMEM;
}
 
-   for (i = 0; i < get_pipes_per_mec(dqm); i++)
-   dqm->allocated_queues[i] = (1 << get_queues_per_pipe(dqm)) - 1;
+   for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
+   int pipe_offset = pipe * get_queues_per_pipe(dqm);
+
+   for (queue = 0; queue < get_queues_per_pipe(dqm); queue++)
+   if (test_bit(pipe_offset + queue,
+dqm->dev->shared_resources.queue_bitmap))
+   dqm->allocated_queues[pipe] |= 1 << queue;
+   }
 
dqm->vmid_bitmap = (1 << VMID_PER_DEVICE) - 1;
dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
-- 
2.7.4
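
The loop added by this patch slices a flat queue bitmap into one mask per pipe. A minimal userspace sketch of the same indexing, assuming 4 pipes per MEC and 8 queues per pipe (illustrative values only) and a bitmap stored in 32-bit words; the kernel's test_bit() works on unsigned long, so test_flat_bit() and fill_allocated_queues() here are hypothetical stand-ins.

```c
#include <stdint.h>

/* Assumed sizes for illustration; the driver queries these at runtime. */
#define PIPES_PER_MEC   4
#define QUEUES_PER_PIPE 8

/* Test one bit in a flat bitmap stored as 32-bit words. */
static int test_flat_bit(unsigned int nr, const uint32_t *bitmap)
{
	return (bitmap[nr / 32] >> (nr % 32)) & 1u;
}

/*
 * Build one per-pipe mask from the flat bitmap. Queue q of pipe p lives
 * at flat index p * QUEUES_PER_PIPE + q, as in the patch.
 */
static void fill_allocated_queues(const uint32_t *queue_bitmap,
				  uint32_t allocated[PIPES_PER_MEC])
{
	unsigned int pipe, queue;

	for (pipe = 0; pipe < PIPES_PER_MEC; pipe++) {
		unsigned int pipe_offset = pipe * QUEUES_PER_PIPE;

		/* The sketch zeroes explicitly before OR-ing bits in. */
		allocated[pipe] = 0;
		for (queue = 0; queue < QUEUES_PER_PIPE; queue++)
			if (test_flat_bit(pipe_offset + queue, queue_bitmap))
				allocated[pipe] |= 1u << queue;
	}
}
```

With a flat bitmap word of 0xFF00F0FC, pipe 0 gets mask 0xFC, pipe 1 gets 0xF0, pipe 2 gets no queues and pipe 3 gets all eight.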



Re: [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup

2017-08-15 Thread Felix Kuehling
I'll turn it into a dev_warn and I'll make the error message more
helpful. I think this happens when a new DID has been added to amdgpu,
but not to KFD yet. That's probably worth a warning.


Regards,
  Felix


On 2017-08-14 11:50 PM, Zhao, Yong wrote:
>
> Oded, I agree with you. When I made the change, there was already a
> WARN_ON in the same function, lookup_device_info(), so I followed suit
> and used WARN again. It is indeed a bit of an overkill.
>
>
> Felix, do I need to fix it or can you fix it directly?
>
>
> Yong
>
> 
> *From:* Oded Gabbay 
> *Sent:* Saturday, August 12, 2017 10:54:41 AM
> *To:* Kuehling, Felix
> *Cc:* amd-gfx list; Zhao, Yong
> *Subject:* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to
> help bringup
>  
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling
>  wrote:
> > From: Yong Zhao 
> >
> > Signed-off-by: Yong Zhao 
> > Signed-off-by: Felix Kuehling 
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > index f628ac3..e1c2ad2 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > @@ -103,6 +103,8 @@ static const struct kfd_device_info
> *lookup_device_info(unsigned short did)
> > }
> > }
> >
> > +   WARN(1, "device is not added to supported_devices\n");
> > +
> I think WARN is a bit excessive here. It's not actually a warning - an
> AMD gpu device is present but not supported in amdkfd.
> Maybe a dev_info is more appropriate here.
>
> Oded
>
> > return NULL;
> >  }
> >
> > @@ -114,8 +116,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
> > const struct kfd_device_info *device_info =
> >
> lookup_device_info(pdev->device);
> >
> > -   if (!device_info)
> > +   if (!device_info) {
> > +   dev_err(kfd_device, "kgd2kfd_probe failed\n");
> > return NULL;
> > +   }
> >
> > kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
> > if (!kfd)
> > @@ -364,8 +368,11 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
> >
> > if (kfd->init_complete) {
> > err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> > -   if (err < 0)
> > +   if (err < 0) {
> > +   dev_err(kfd_device, "failed to initialize
> iommu\n");
> > return -ENXIO;
> > +   }
> > +
> > amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> >
> iommu_pasid_shutdown_callback);
> > amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> iommu_invalid_ppr_cb);
> > --
> > 2.7.4
> >
> With the above fixed, this patch is:
> Reviewed-by: Oded Gabbay 



Re: [PATCH 4/4] drm/amdkfd: Implement image tiling mode support

2017-08-15 Thread Michel Dänzer
On 16/08/17 09:35 AM, Felix Kuehling wrote:
> On 2017-08-15 06:20 AM, Oded Gabbay wrote:
>> I prefer to do it incrementally, to avoid very large patch-sets which
>> usually end up in longer cycles of review-fix, which causes you more
>> pain because internal development continues during this time and you
>> need to keep everything synchronized. If you do it in small pieces,
>> there is more chance it will get to upstream faster and then you can
>> cross it off your list permanently and no longer worry about it not
>> being synchronized with internal development.
>>
>> If you are talking about the current patch-set (you actually sent 2
>> patch-sets), then once you rebase them on the branches I mentioned,
>> they are more or less good to go (except for very small fixes we
>> talked about). If you can do it during this week, I think we can make
>> it for the 4.14 merge window.
>>
>> Does that make sense ?
> 
> Sounds good. I'm hoping to get a bit more into 4.14. We'll see how it
> goes. I've probably been told before but forgot: What's Dave Airlie's
> deadline for accepting patches into 4.14?

The deadline is usually around -rc6 of the previous cycle, so for 4.14
it might be the end of this week (keep in mind that Dave's in Australia,
so his work day is over when yours starts).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH] drm/amdgpu: add mjpeg check for uvd phycial mode msg buffer

2017-08-15 Thread Michel Dänzer
On 16/08/17 05:12 AM, Leo Liu wrote:
> Signed-off-by: Leo Liu 

There's a typo ("phycial") in the shortlog, and acronyms should always
be written in upper case. So it should probably be:

drm/amdgpu: Add MJPEG check for UVD physical mode msg buffer


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


RE: [PATCH] drm/amdgpu: add mjpeg check for uvd phycial mode msg buffer

2017-08-15 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Leo Liu
> Sent: Tuesday, August 15, 2017 4:13 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Leo
> Subject: [PATCH] drm/amdgpu: add mjpeg check for uvd phycial mode msg
> buffer
> 
> Signed-off-by: Leo Liu 

We need to bump the driver version as well so userspace knows which versions of 
the kernel driver support this.  Please send a patch that bumps the driver 
version. With that addressed:
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index ff8ae50..b464f62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -588,6 +588,10 @@ static int amdgpu_uvd_cs_msg_decode(struct
> amdgpu_device *adev, uint32_t *msg,
>   }
>   break;
> 
> + case 8: /* MJPEG */
> + min_dpb_size = 0;
> + break;
> +
>   case 16: /* H265 */
>   image_size = (ALIGN(width, 16) * ALIGN(height, 16) * 3) / 2;
>   image_size = ALIGN(image_size, 256);
> --
> 2.7.4
> 


[PATCH] drm/amdgpu: add mjpeg check for uvd phycial mode msg buffer

2017-08-15 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index ff8ae50..b464f62 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -588,6 +588,10 @@ static int amdgpu_uvd_cs_msg_decode(struct amdgpu_device 
*adev, uint32_t *msg,
}
break;
 
+   case 8: /* MJPEG */
+   min_dpb_size = 0;
+   break;
+
case 16: /* H265 */
image_size = (ALIGN(width, 16) * ALIGN(height, 16) * 3) / 2;
image_size = ALIGN(image_size, 256);
-- 
2.7.4
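
For contrast with the MJPEG case, which sets min_dpb_size to 0, the H265 context lines of the hunk round the frame dimensions up before sizing the image. A worked sketch of just those two lines, assuming a local ALIGN_UP macro with the same power-of-two contract as the kernel's ALIGN; h265_image_size() is a hypothetical wrapper, and the rest of the decoder's size checks are not reproduced here.

```c
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((uint32_t)(a) - 1))

/* Hypothetical wrapper around the two H265 lines shown in the diff. */
static uint32_t h265_image_size(uint32_t width, uint32_t height)
{
	/* 4:2:0 frame: 1.5 bytes per pixel on 16-aligned dimensions. */
	uint32_t image_size =
		(ALIGN_UP(width, 16) * ALIGN_UP(height, 16) * 3) / 2;

	/* The total is then rounded up to a 256-byte boundary. */
	return ALIGN_UP(image_size, 256);
}
```

For a 1920x1080 stream the height rounds up to 1088, giving 1920 * 1088 * 3 / 2 = 3133440 bytes, which is already a multiple of 256.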



drm/amdgpu update uvd 7.0 enc test

2017-08-15 Thread Zhang, Boyuan
Update UVD 7.0 enc test according to firmware interface changes for session 
info ib.


Please review.


Regards,

Boyuan

From 41a3084055bb070b0541315cef5cfa7356624afa Mon Sep 17 00:00:00 2001
From: Boyuan Zhang 
Date: Tue, 15 Aug 2017 16:29:37 -0400
Subject: [PATCH] drm/amdgpu: update uvd enc test for new fw

session info interface changed due to fw interface changes,
update test ib accordingly.

Signed-off-by: Boyuan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 23a8575..466aff9 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -225,8 +225,8 @@ static int uvd_v7_0_enc_get_create_msg(struct amdgpu_ring *ring, uint32_t handle
 	ib->length_dw = 0;
 	ib->ptr[ib->length_dw++] = 0x0018;
 	ib->ptr[ib->length_dw++] = 0x0001; /* session info */
-	ib->ptr[ib->length_dw++] = handle;
 	ib->ptr[ib->length_dw++] = 0x;
+	ib->ptr[ib->length_dw++] = 0x0001;
 	ib->ptr[ib->length_dw++] = upper_32_bits(dummy);
 	ib->ptr[ib->length_dw++] = dummy;
 
@@ -288,8 +288,8 @@ int uvd_v7_0_enc_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle,
 	ib->length_dw = 0;
 	ib->ptr[ib->length_dw++] = 0x0018;
 	ib->ptr[ib->length_dw++] = 0x0001;
-	ib->ptr[ib->length_dw++] = handle;
 	ib->ptr[ib->length_dw++] = 0x;
+	ib->ptr[ib->length_dw++] = 0x0001;
 	ib->ptr[ib->length_dw++] = upper_32_bits(dummy);
 	ib->ptr[ib->length_dw++] = dummy;
 
-- 
2.7.4



Re: [PATCH 2/4] drm/amdkfd: Adding new IOCTL for scratch memory

2017-08-15 Thread Christian König

Am 15.08.2017 um 18:03 schrieb Felix Kuehling:

On 2017-08-15 04:24 AM, Christian König wrote:

Am 14.08.2017 um 17:31 schrieb Felix Kuehling:

[SNIP]
Repeating the same argument I made on another email:

Commented on that in the other mail, let's keep the discussion on this
topic there.


BTW: What exactly is this good for?

Scratch memory is private memory per work-item. It's used when a shader
program has too few registers available. With HSA we use flat scratch
addressing, where shaders can access private memory in a special scratch
aperture using normal memory instructions. Using the same virtual
address, each work item gets its own private piece of memory. The
hardware does the address translation from the VA in the private
aperture to a scratch-backing VA. The application is responsible for
allocating the memory to back that scratch area, and to map it somewhere
in its virtual address space.

This ioctl tells the hardware (or HWS firmware) the VA of the scratch
backing memory.

Ok, you've got me lost here. Not that I'm deeply into that stuff, but
my last status is that those apertures are global and not per process
or per VMID.

The apertures for private (scratch) and LDS are configured in
SH_MEM_BASES. As far as I know, this is per VMID. At least that's the
way gfx_v8_0_init_compute_vmid initializes it. When we use the HWS we
don't program this directly, but this information is included in the
map_process packet in the runlist and the firmware programs the
registers when it assigns a VMID to a process.

The scratch backing VA is configured in SH_HIDDEN_PRIVATE_BASE_VMID,
which is per VMID. Again, with the HWS, this is programmed by the
firmware and the driver just includes it in the map_process packet.


Ok, that makes sense to me.

The same question came up in a different context a few months back in a 
thread and I probably just confused this with some other SH_* registers 
which turned out to be global while we always thought they were per VMID.


Christian.


Regards,
   Felix


That would mean that when two processes try to set two different
addresses we are completely lost here.

Christian.



Re: [PATCH libdrm] tests/amdgpu: add uvd encode unit tests

2017-08-15 Thread Christian König

Am 15.08.2017 um 17:33 schrieb Boyuan Zhang:

Signed-off-by: Boyuan Zhang 
Acked-by: Alex Deucher 


Acked-by: Christian König 


---
  tests/amdgpu/Makefile.am |   1 +
  tests/amdgpu/amdgpu_test.c   |   6 +
  tests/amdgpu/amdgpu_test.h   |  15 ++
  tests/amdgpu/frame.h |   2 +-
  tests/amdgpu/uvd_enc_tests.c | 500 
  tests/amdgpu/uve_ib.h| 527 +++
  6 files changed, 1050 insertions(+), 1 deletion(-)
  create mode 100644 tests/amdgpu/uvd_enc_tests.c
  create mode 100644 tests/amdgpu/uve_ib.h

diff --git a/tests/amdgpu/Makefile.am b/tests/amdgpu/Makefile.am
index 9e08578..13b3dc8 100644
--- a/tests/amdgpu/Makefile.am
+++ b/tests/amdgpu/Makefile.am
@@ -27,4 +27,5 @@ amdgpu_test_SOURCES = \
vce_tests.c \
vce_ib.h \
frame.h \
+   uvd_enc_tests.c \
vcn_tests.c
diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c
index 1d44b09..cd6b826 100644
--- a/tests/amdgpu/amdgpu_test.c
+++ b/tests/amdgpu/amdgpu_test.c
@@ -91,6 +91,12 @@ static CU_SuiteInfo suites[] = {
.pCleanupFunc = suite_vcn_tests_clean,
.pTests = vcn_tests,
},
+   {
+   .pName = "UVD ENC Tests",
+   .pInitFunc = suite_uvd_enc_tests_init,
+   .pCleanupFunc = suite_uvd_enc_tests_clean,
+   .pTests = uvd_enc_tests,
+   },
CU_SUITE_INFO_NULL,
  };
  
diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h

index c75a07a..d0b61ba 100644
--- a/tests/amdgpu/amdgpu_test.h
+++ b/tests/amdgpu/amdgpu_test.h
@@ -120,6 +120,21 @@ int suite_vcn_tests_clean();
  extern CU_TestInfo vcn_tests[];
  
  /**

+ * Initialize uvd enc test suite
+ */
+int suite_uvd_enc_tests_init();
+
+/**
+ * Deinitialize uvd enc test suite
+ */
+int suite_uvd_enc_tests_clean();
+
+/**
+ * Tests in uvd enc test suite
+ */
+extern CU_TestInfo uvd_enc_tests[];
+
+/**
   * Helper functions
   */
  static inline amdgpu_bo_handle gpu_mem_alloc(
diff --git a/tests/amdgpu/frame.h b/tests/amdgpu/frame.h
index 4c946c2..335401c 100644
--- a/tests/amdgpu/frame.h
+++ b/tests/amdgpu/frame.h
@@ -24,7 +24,7 @@
  #ifndef _frame_h_
  #define _frame_h_
  
-const uint8_t frame[] = {

+static const uint8_t frame[] = {
0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 
0xeb, 0xeb, 0xeb, 0xeb,
0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 
0xd2, 0xd2, 0xd2, 0xd2,
0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 
0xd2, 0xaa, 0xaa, 0xaa,
diff --git a/tests/amdgpu/uvd_enc_tests.c b/tests/amdgpu/uvd_enc_tests.c
new file mode 100644
index 000..6c19f7b
--- /dev/null
+++ b/tests/amdgpu/uvd_enc_tests.c
@@ -0,0 +1,500 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+*/
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include 
+#include 
+
+#include "CUnit/Basic.h"
+
+#include "util_math.h"
+
+#include "amdgpu_test.h"
+#include "amdgpu_drm.h"
+#include "amdgpu_internal.h"
+#include "frame.h"
+#include "uve_ib.h"
+
+#define IB_SIZE4096
+#define MAX_RESOURCES  16
+
+struct amdgpu_uvd_enc_bo {
+   amdgpu_bo_handle handle;
+   amdgpu_va_handle va_handle;
+   uint64_t addr;
+   uint64_t size;
+   uint8_t *ptr;
+};
+
+struct amdgpu_uvd_enc {
+   unsigned width;
+   unsigned height;
+   struct amdgpu_uvd_enc_bo session;
+   struct amdgpu_uvd_enc_bo vbuf;
+   struct amdgpu_uvd_enc_bo bs;
+   struct amdgpu_uvd_enc_bo fb;
+   struct amdgpu_uvd_enc_bo cpb;
+};
+
+static amdgpu_device_handle device_handle;
+static uint32_t major_version;
+static uint32_t minor_version;
+static uint32_t family_id;
+
+static amdgpu_context_handle context_handle;
+static 

[PATCH libdrm] tests/amdgpu: add uvd encode unit tests

2017-08-15 Thread Boyuan Zhang
Signed-off-by: Boyuan Zhang 
Acked-by: Alex Deucher 
---
 tests/amdgpu/Makefile.am |   1 +
 tests/amdgpu/amdgpu_test.c   |   6 +
 tests/amdgpu/amdgpu_test.h   |  15 ++
 tests/amdgpu/frame.h |   2 +-
 tests/amdgpu/uvd_enc_tests.c | 500 
 tests/amdgpu/uve_ib.h| 527 +++
 6 files changed, 1050 insertions(+), 1 deletion(-)
 create mode 100644 tests/amdgpu/uvd_enc_tests.c
 create mode 100644 tests/amdgpu/uve_ib.h

diff --git a/tests/amdgpu/Makefile.am b/tests/amdgpu/Makefile.am
index 9e08578..13b3dc8 100644
--- a/tests/amdgpu/Makefile.am
+++ b/tests/amdgpu/Makefile.am
@@ -27,4 +27,5 @@ amdgpu_test_SOURCES = \
vce_tests.c \
vce_ib.h \
frame.h \
+   uvd_enc_tests.c \
vcn_tests.c
diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c
index 1d44b09..cd6b826 100644
--- a/tests/amdgpu/amdgpu_test.c
+++ b/tests/amdgpu/amdgpu_test.c
@@ -91,6 +91,12 @@ static CU_SuiteInfo suites[] = {
.pCleanupFunc = suite_vcn_tests_clean,
.pTests = vcn_tests,
},
+   {
+   .pName = "UVD ENC Tests",
+   .pInitFunc = suite_uvd_enc_tests_init,
+   .pCleanupFunc = suite_uvd_enc_tests_clean,
+   .pTests = uvd_enc_tests,
+   },
CU_SUITE_INFO_NULL,
 };
 
diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h
index c75a07a..d0b61ba 100644
--- a/tests/amdgpu/amdgpu_test.h
+++ b/tests/amdgpu/amdgpu_test.h
@@ -120,6 +120,21 @@ int suite_vcn_tests_clean();
 extern CU_TestInfo vcn_tests[];
 
 /**
+ * Initialize uvd enc test suite
+ */
+int suite_uvd_enc_tests_init();
+
+/**
+ * Deinitialize uvd enc test suite
+ */
+int suite_uvd_enc_tests_clean();
+
+/**
+ * Tests in uvd enc test suite
+ */
+extern CU_TestInfo uvd_enc_tests[];
+
+/**
  * Helper functions
  */
 static inline amdgpu_bo_handle gpu_mem_alloc(
diff --git a/tests/amdgpu/frame.h b/tests/amdgpu/frame.h
index 4c946c2..335401c 100644
--- a/tests/amdgpu/frame.h
+++ b/tests/amdgpu/frame.h
@@ -24,7 +24,7 @@
 #ifndef _frame_h_
 #define _frame_h_
 
-const uint8_t frame[] = {
+static const uint8_t frame[] = {
0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 
0xeb, 0xeb, 0xeb, 0xeb,
0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xeb, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 
0xd2, 0xd2, 0xd2, 0xd2,
0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 0xd2, 
0xd2, 0xaa, 0xaa, 0xaa,
diff --git a/tests/amdgpu/uvd_enc_tests.c b/tests/amdgpu/uvd_enc_tests.c
new file mode 100644
index 000..6c19f7b
--- /dev/null
+++ b/tests/amdgpu/uvd_enc_tests.c
@@ -0,0 +1,500 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+*/
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include 
+#include 
+
+#include "CUnit/Basic.h"
+
+#include "util_math.h"
+
+#include "amdgpu_test.h"
+#include "amdgpu_drm.h"
+#include "amdgpu_internal.h"
+#include "frame.h"
+#include "uve_ib.h"
+
+#define IB_SIZE		4096
+#define MAX_RESOURCES  16
+
+struct amdgpu_uvd_enc_bo {
+   amdgpu_bo_handle handle;
+   amdgpu_va_handle va_handle;
+   uint64_t addr;
+   uint64_t size;
+   uint8_t *ptr;
+};
+
+struct amdgpu_uvd_enc {
+   unsigned width;
+   unsigned height;
+   struct amdgpu_uvd_enc_bo session;
+   struct amdgpu_uvd_enc_bo vbuf;
+   struct amdgpu_uvd_enc_bo bs;
+   struct amdgpu_uvd_enc_bo fb;
+   struct amdgpu_uvd_enc_bo cpb;
+};
+
+static amdgpu_device_handle device_handle;
+static uint32_t major_version;
+static uint32_t minor_version;
+static uint32_t family_id;
+
+static amdgpu_context_handle context_handle;
+static amdgpu_bo_handle ib_handle;
+static amdgpu_va_handle ib_va_handle;
+static uint64_t ib_mc_address;
+static uint32_t *ib_cpu;
+
+static struct 

Re: [PATCH] drm/amdgpu/gfx7: fix function name

2017-08-15 Thread Christian König

On 15.08.2017 at 17:17, Harry Wentland wrote:

On 2017-08-15 10:36 AM, Alex Deucher wrote:

Was using the wrong prefix (gmc rather than gfx).  The function
is related to the gfx hw, not gmc.  This also makes it consistent
with the naming in gfx8.

Signed-off-by: Alex Deucher 

Reviewed-by: Harry Wentland 


Reviewed-by: Christian König 



Harry


---
  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index ad4b5c3..50e5263 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -1823,7 +1823,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device *adev)
  }
  
  /**

- * gmc_v7_0_init_compute_vmid - gart enable
+ * gfx_v7_0_init_compute_vmid - gart enable
   *
   * @adev: amdgpu_device pointer
   *
@@ -1833,7 +1833,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device *adev)
  #define DEFAULT_SH_MEM_BASES  (0x6000)
  #define FIRST_COMPUTE_VMID (8)
  #define LAST_COMPUTE_VMID (16)
-static void gmc_v7_0_init_compute_vmid(struct amdgpu_device *adev)
+static void gfx_v7_0_init_compute_vmid(struct amdgpu_device *adev)
  {
int i;
uint32_t sh_mem_config;
@@ -1939,7 +1939,7 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
cik_srbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
  
-	gmc_v7_0_init_compute_vmid(adev);

+   gfx_v7_0_init_compute_vmid(adev);
  
  	WREG32(mmSX_DEBUG_1, 0x20);
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx





Re: [PATCH] drm/amdgpu/gfx7: fix function name

2017-08-15 Thread Harry Wentland
On 2017-08-15 10:36 AM, Alex Deucher wrote:
> Was using the wrong prefix (gmc rather than gfx).  The function
> is related to the gfx hw, not gmc.  This also makes it consistent
> with the naming in gfx8.
> 
> Signed-off-by: Alex Deucher 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> index ad4b5c3..50e5263 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> @@ -1823,7 +1823,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device 
> *adev)
>  }
>  
>  /**
> - * gmc_v7_0_init_compute_vmid - gart enable
> + * gfx_v7_0_init_compute_vmid - gart enable
>   *
>   * @adev: amdgpu_device pointer
>   *
> @@ -1833,7 +1833,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device 
> *adev)
>  #define DEFAULT_SH_MEM_BASES (0x6000)
>  #define FIRST_COMPUTE_VMID   (8)
>  #define LAST_COMPUTE_VMID (16)
> -static void gmc_v7_0_init_compute_vmid(struct amdgpu_device *adev)
> +static void gfx_v7_0_init_compute_vmid(struct amdgpu_device *adev)
>  {
>   int i;
>   uint32_t sh_mem_config;
> @@ -1939,7 +1939,7 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device 
> *adev)
>   cik_srbm_select(adev, 0, 0, 0, 0);
>   mutex_unlock(&adev->srbm_mutex);
>  
> - gmc_v7_0_init_compute_vmid(adev);
> + gfx_v7_0_init_compute_vmid(adev);
>  
>   WREG32(mmSX_DEBUG_1, 0x20);
>  
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/4] drm/amdgpu: Adding new kgd/kfd interface functions to support scratch memory

2017-08-15 Thread Deucher, Alexander
> -Original Message-
> From: Kuehling, Felix
> Sent: Monday, August 14, 2017 9:13 PM
> To: Oded Gabbay; Marek Olšák; Deucher, Alexander
> Cc: amd-gfx list
> Subject: Re: [PATCH 1/4] drm/amdgpu: Adding new kgd/kfd interface
> functions to support scratch memory
> 
> [+Marek, Alex for comment, see below]
> 
> On 2017-08-13 04:56 AM, Oded Gabbay wrote:
> > On Sat, Aug 12, 2017 at 7:47 AM, Felix Kuehling 
> wrote:
> >> From: Moses Reuben 
> >>
> >> Signed-off-by: Moses Reuben 
> >> Signed-off-by: Felix Kuehling 
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 34
> +-
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 35
> ++-
> >>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  4 +++
> >>  3 files changed, 71 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> >> index 994d262..11d515a 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> >> @@ -135,6 +135,10 @@ static uint16_t
> get_atc_vmid_pasid_mapping_pasid(struct kgd_dev *kgd,
> >>  static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t
> vmid);
> >>
> >>  static uint16_t get_fw_version(struct kgd_dev *kgd, enum
> kgd_engine_type type);
> >> +static int alloc_memory_of_scratch(struct kgd_dev *kgd,
> >> +uint64_t va, uint32_t vmid);
> >> +static int write_config_static_mem(struct kgd_dev *kgd, bool
> swizzle_enable,
> >> +   uint8_t element_size, uint8_t index_stride, uint8_t mtype);
> >>
> >>  static const struct kfd2kgd_calls kfd2kgd = {
> >> .init_gtt_mem_allocation = alloc_gtt_mem,
> >> @@ -159,7 +163,9 @@ static const struct kfd2kgd_calls kfd2kgd = {
> >> .get_atc_vmid_pasid_mapping_pasid =
> get_atc_vmid_pasid_mapping_pasid,
> >> .get_atc_vmid_pasid_mapping_valid =
> get_atc_vmid_pasid_mapping_valid,
> >> .write_vmid_invalidate_request = write_vmid_invalidate_request,
> >> -   .get_fw_version = get_fw_version
> >> +   .get_fw_version = get_fw_version,
> >> +   .alloc_memory_of_scratch = alloc_memory_of_scratch,
> >> +   .write_config_static_mem = write_config_static_mem
> >>  };
> >>
> >>  struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void)
> >> @@ -652,6 +658,32 @@ static void write_vmid_invalidate_request(struct
> kgd_dev *kgd, uint8_t vmid)
> >> WREG32(mmVM_INVALIDATE_REQUEST, 1 << vmid);
> >>  }
> >>
> >> +static int write_config_static_mem(struct kgd_dev *kgd, bool
> swizzle_enable,
> >> +   uint8_t element_size, uint8_t index_stride, uint8_t mtype)
> >> +{
> >> +   uint32_t reg;
> >> +   struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
> >> +
> >> +   reg = swizzle_enable <<
> SH_STATIC_MEM_CONFIG__SWIZZLE_ENABLE__SHIFT |
> >> +   element_size <<
> SH_STATIC_MEM_CONFIG__ELEMENT_SIZE__SHIFT |
> >> +   index_stride <<
> SH_STATIC_MEM_CONFIG__INDEX_STRIDE__SHIFT |
> >> +   mtype << SH_STATIC_MEM_CONFIG__PRIVATE_MTYPE__SHIFT;
> >> +
> >> +   WREG32(mmSH_STATIC_MEM_CONFIG, reg);
> > Don't you need to select and lock srbm before you write to this register ?
> 
> No, this register seems to be global. SRBM_GFX_CNTL had no effect on it
> when I experimented with it.
> 
> Since this is global initialization, I'm wondering if this should be
> moved into part of the amdgpu initialization. Having amdkfd call back
> into amdgpu like this during its initialization always seemed a bit
> roundabout.
> 
> Marek, do you know if this register affects Mesa performance or
> behaviour with respect to scratch memory? According to the register spec
> it only affects flat scratch memory addressing. Not sure if you're using
> that addressing scheme in Mesa.
> 
> Alex, I see that SH_STATIC_MEM_CONFIG gets initialized in
> gfx_v[78]_0_gpu_init identically to what we do in KFD. You're doing it
> per-VMID (under cik_srbm_select). But my experiments show that this
> register is global. So I think your initialization already works for KFD
> and we can probably just drop this from the KFD initialization. Do you
> have access to hardware docs that confirm that it's global?

We'll have to confirm with the hw teams.  I always thought it was per vmid.

Alex

> 
> Regards,
>   Felix
> 
> >
> >> +   return 0;
> >> +}
> >> +static int alloc_memory_of_scratch(struct kgd_dev *kgd,
> >> +uint64_t va, uint32_t vmid)
> >> +{
> >> +   struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
> >> +
> >> +   lock_srbm(kgd, 0, 0, 0, vmid);
> >> +   WREG32(mmSH_HIDDEN_PRIVATE_BASE_VMID, va);
> >> +   unlock_srbm(kgd);
> >> +
> >> +   return 0;
> >> +}
> >> +
> >>  static 

Re: [PATCH] drm/amd/amdgpu: fix bug fail to remove debugfs when rmmod

2017-08-15 Thread Alex Deucher
On Tue, Aug 15, 2017 at 4:26 AM, Christian König
 wrote:
> Ok in this case I don't think we should do anything here.
>
> That we can't rmmod the driver on pre 4.11 is a bit annoying, but doesn't
> affect the upstream driver.

This patch can be applied to the KCL for dkms.

Alex

>
> Christian.
>
>
> On 15.08.2017 at 05:00, Wang, Annie wrote:
>>
>> Hi Christian,
>>
>> Yes. drm_debugfs_cleanup()  should delete all the files, after  the
>> following commit.
>>
>> commit 086f2e5cde747dcae800b2d8b0ac7741a8c6fcbe
>> Author: Noralf Trønnes 
>> Date:   Thu Jan 26 23:56:03 2017 +0100
>>
>>  drm: debugfs: Remove all files automatically on cleanup
>>
>>  Instead of having the drivers call drm_debugfs_remove_files() in
>>  their drm_driver->debugfs_cleanup hook, do it automatically by
>>  traversing minor->debugfs_list.
>>  Also use debugfs_remove_recursive() so drivers who add their own
>>  debugfs files don't have to keep track of them for removal.
>>
>>  Signed-off-by: Noralf Trønnes 
>>  Signed-off-by: Daniel Vetter 
>>  Link:
>> http://patchwork.freedesktop.org/patch/msgid/20170126225621.12314-2-nor...@tronnes.org
>>
>>
>> Kernels before 4.11 don't clean up debugfs files automatically.
>>
>>> -Original Message-
>>> From: Christian König [mailto:deathsim...@vodafone.de]
>>> Sent: Friday, August 11, 2017 4:15 PM
>>> To: Wang, Annie ; Alex Deucher
>>> 
>>> Cc: amd-gfx list 
>>> Subject: Re: [PATCH] drm/amd/amdgpu: fix bug fail to remove debugfs when
>>> rmmod
>>>
>>> Hi Annie,
>>>
>>> well something is clearly not working as expected here.
>>>
>>> All those files are registered with drm_debugfs_create_files() and should
>>> automatically be removed when drm_debugfs_cleanup() is called.
>>>
>>> Can you figure out what is going wrong here?
>>>
>>> Thanks,
>>> Christian.
>>>
>>> On 11.08.2017 at 04:25, Wang, Annie wrote:

 Hi Alex,

 The following files are all left.

 amdgpu_fence_info
 amdgpu_firmware_info
 amdgpu_gem_info
 amdgpu_gpu_reset
 amdgpu_gtt_mm
 amdgpu_pm_info
 amdgpu_sa_info
 amdgpu_test_ib
 amdgpu_vram_mm
 ttm_dma_page_pool
 ttm_page_pool

 Instead of cleaning them up separately, how about implementing the
 debugfs_fini callback?

.debugfs_fini = amdgpu_debugfs_cleanup,

 In the future I will send another patch to collect all the init
 functions:
.debugfs_init = amdgpu_debugfs_init,


> -Original Message-
> From: Alex Deucher [mailto:alexdeuc...@gmail.com]
> Sent: Friday, August 11, 2017 1:01 AM
> To: Wang, Annie 
> Cc: amd-gfx list 
> Subject: Re: [PATCH] drm/amd/amdgpu: fix bug fail to remove debugfs
> when rmmod
>
> On Thu, Aug 10, 2017 at 5:12 AM, Wang Hongcheng 
> wrote:
>>
>> Some debug files are not removed at fini. Remove them all in
>> pci_remove.
>>
>> BUG: SWDEV-129297
>
> What files are failing to get removed?  I'd prefer to remove them in
> the relevant places in the code (to mirror where they are added)
> rather than generically cleaning everything up.  Either that or unify
> all the debugfs init/fini stuff in one place.
>
> Alex
>
>> Signed-off-by: Wang Hongcheng 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h|  2 ++
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19
>>>
>>> +++
>>
>>drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  2 ++
>>3 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index c28069e4..b542191 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -1250,6 +1250,8 @@ struct amdgpu_debugfs {  int
>> amdgpu_debugfs_add_files(struct amdgpu_device *adev,
>>const struct drm_info_list *files,
>>unsigned nfiles);
>> +int amdgpu_debugfs_cleanup(struct amdgpu_device *adev);
>> +
>>int amdgpu_debugfs_fence_init(struct amdgpu_device *adev);
>>
>>#if defined(CONFIG_DEBUG_FS)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 7e40071..7594abb 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3215,6 +3215,25 @@ int amdgpu_debugfs_add_files(struct
>
> amdgpu_device *adev,
>>
>>   return 0;
>>}
>>
>> +int amdgpu_debugfs_cleanup(struct 

[PATCH] drm/amdgpu/gfx7: fix function name

2017-08-15 Thread Alex Deucher
Was using the wrong prefix (gmc rather than gfx).  The function
is related to the gfx hw, not gmc.  This also makes it consistent
with the naming in gfx8.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index ad4b5c3..50e5263 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -1823,7 +1823,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device *adev)
 }
 
 /**
- * gmc_v7_0_init_compute_vmid - gart enable
+ * gfx_v7_0_init_compute_vmid - gart enable
  *
  * @adev: amdgpu_device pointer
  *
@@ -1833,7 +1833,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device *adev)
 #define DEFAULT_SH_MEM_BASES   (0x6000)
 #define FIRST_COMPUTE_VMID (8)
 #define LAST_COMPUTE_VMID  (16)
-static void gmc_v7_0_init_compute_vmid(struct amdgpu_device *adev)
+static void gfx_v7_0_init_compute_vmid(struct amdgpu_device *adev)
 {
int i;
uint32_t sh_mem_config;
@@ -1939,7 +1939,7 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
cik_srbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
 
-   gmc_v7_0_init_compute_vmid(adev);
+   gfx_v7_0_init_compute_vmid(adev);
 
WREG32(mmSX_DEBUG_1, 0x20);
 
-- 
2.5.5

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/4] drm/amdkfd: Adding new IOCTL for scratch memory

2017-08-15 Thread Alex Deucher
On Tue, Aug 15, 2017 at 4:24 AM, Christian König
 wrote:
> On 14.08.2017 at 17:31, Felix Kuehling wrote:
>>
>> [SNIP]
>> Repeating the same argument I made on another email:
>
>
> Commented on that in the other mail, let's keep the discussion on this topic
> there.
>
>>> BTW: What exactly is this good for?
>>
>> Scratch memory is private memory per work-item. It's used when a shader
>> program has too few registers available. With HSA we use flat scratch
>> addressing, where shaders can access private memory in a special scratch
>> aperture using normal memory instructions. Using the same virtual
>> address, each work item gets its own private piece of memory. The
>> hardware does the address translation from the VA in the private
>> aperture to a scratch-backing VA. The application is responsible for
>> allocating the memory to back that scratch area, and to map it somewhere
>> in its virtual address space.
>>
>> This ioctl tells the hardware (or HWS firmware) the VA of the scratch
>> backing memory.
>
>
> Ok, you've got me lost here. Not that I'm deeply into that stuff, but my
> last understanding is that those apertures are global, not per-process or
> per-VMID.
>
> That would mean that when two processes try to set two different addresses
> we are completely lost here.
>

The scratch and lds apertures are per vmid.  See gfx_v8_0_init_compute_vmid()

Alex
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: add tracing of kernel DMA mappings and enable on init (v2)

2017-08-15 Thread Tom St Denis

On 15/08/17 08:16 AM, Tom St Denis wrote:

On 15/08/17 08:11 AM, Christian König wrote:

On 15.08.2017 at 13:33, Tom St Denis wrote:

On 15/08/17 07:22 AM, Christian König wrote:

On 14.08.2017 at 13:37, Tom St Denis wrote:

This patch adds trace points to the amdgpu bo_kmap/bo/kunmap functions
which capture internal allocations.

(v2): Simply add a sleep to the init.  Users can use this script to
load with map debugging enabled:

#!/bin/bash
modprobe amdgpu map_debug=1 &
sleep 1
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_unpopulate/enable
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_populate/enable


I've just realized that there is a far simpler method than that or 
enabling trace points using the module command line.


Assuming your GPU is PCI device 01:00.0 connected through bridge 
00:02.1 (use lspci -t to figure the connection out):


# Temporarily remove the PCIe device from the bus
echo 1 > /sys/bus/pci/devices/\:01\:00.0/remove
# Load amdgpu; this allows you to enable the trace points you want
# and also set probes etc.
modprobe amdgpu
# Rescan the bus; the device subsystem will automatically probe
# amdgpu with this device
echo 1 > /sys/bus/pci/devices/\:00\:02.1/rescan


That would definitely be a bunch trickier to script up.  Nobody will 
run this by hand.  So we'd need to remove all devices that are AMD 
vendor ID but GPU only (e.g. not other controllers) which means we 
need to parse the lspci output for VGA controller related text.


Definitely doable and probably "cleaner" from a race point of view.

Apart from that I would really like to see those trace points in TTM 
instead. I also don't mind adding a dev or pdev pointer to the 
ttm_bo_device structure for this.


Not against this in principle.  There's a bit of confusion... our ttm 
amdgpu functions are passed


struct ttm_tt *ttm

which we cast to

struct amdgpu_ttm_tt {
	struct ttm_dma_tt	ttm;
	struct amdgpu_device	*adev;


BTW: Isn't this adev pointer here superfluous? I mean the ttm_tt has a
bdev pointer to use, doesn't it?


Honestly without cscoping it I wouldn't know :-)



Yes, you can find the adev via

static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device 
*bdev)

{
return container_of(bdev, struct amdgpu_device, mman.bdev);
}

But looking at ttm_bo_device I don't see an obvious path to a device 
structure (or pci structure).  So that would need to be added (as you 
previously commented).
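
To illustrate the lookup above, here is a minimal userspace sketch of the container_of pattern that amdgpu_ttm_adev() relies on. The demo_* structures are hypothetical stand-ins, not the real TTM or amdgpu definitions, and the macro is a simplified version without the kernel's type checking. It assumes a compiler such as GCC or Clang that accepts a nested member path in offsetof.

```c
#include <stddef.h>

/* Simplified userspace container_of; the kernel macro also type-checks ptr. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct ttm_bo_device. */
struct demo_bo_device {
	int placeholder;
};

/* Hypothetical stand-in for the memory manager that embeds the bdev. */
struct demo_mman {
	int initialized;
	struct demo_bo_device bdev;
};

/* Hypothetical stand-in for struct amdgpu_device. */
struct demo_device {
	int id;
	struct demo_mman mman;
};

/*
 * Same shape as amdgpu_ttm_adev(): subtract the offset of the embedded
 * mman.bdev member to get back to the enclosing device structure.
 */
static struct demo_device *demo_adev(struct demo_bo_device *bdev)
{
	return container_of(bdev, struct demo_device, mman.bdev);
}
```

This only works because the bdev is embedded by value in the device structure. It gives no path from the bdev to a struct device or struct pci_dev unless the enclosing driver structure is known, which is why a generic TTM trace point would need an extra pointer.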


Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: add tracing of kernel DMA mappings and enable on init (v2)

2017-08-15 Thread Tom St Denis

On 15/08/17 08:11 AM, Christian König wrote:

On 15.08.2017 at 13:33, Tom St Denis wrote:

On 15/08/17 07:22 AM, Christian König wrote:

On 14.08.2017 at 13:37, Tom St Denis wrote:

This patch adds trace points to the amdgpu bo_kmap/bo/kunmap functions
which capture internal allocations.

(v2): Simply add a sleep to the init.  Users can use this script to
load with map debugging enabled:

#!/bin/bash
modprobe amdgpu map_debug=1 &
sleep 1
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_unpopulate/enable
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_populate/enable


I've just realized that there is a far simpler method than that or 
enabling trace points using the module command line.


Assuming your GPU is PCI device 01:00.0 connected through bridge 
00:02.1 (use lspci -t to figure the connection out):


# Temporarily remove the PCIe device from the bus
echo 1 > /sys/bus/pci/devices/\:01\:00.0/remove
# Load amdgpu; this allows you to enable the trace points you want
# and also set probes etc.
modprobe amdgpu
# Rescan the bus; the device subsystem will automatically probe
# amdgpu with this device
echo 1 > /sys/bus/pci/devices/\:00\:02.1/rescan


That would definitely be a bunch trickier to script up.  Nobody will 
run this by hand.  So we'd need to remove all devices that are AMD 
vendor ID but GPU only (e.g. not other controllers) which means we 
need to parse the lspci output for VGA controller related text.


Definitely doable and probably "cleaner" from a race point of view.

Apart from that I would really like to see those trace points in TTM 
instead. I also don't mind adding a dev or pdev pointer to the 
ttm_bo_device structure for this.


Not against this in principle.  There's a bit of confusion... our ttm 
amdgpu functions are passed


struct ttm_tt *ttm

which we cast to

struct amdgpu_ttm_tt {
	struct ttm_dma_tt	ttm;
	struct amdgpu_device	*adev;


BTW: Isn't this adev pointer here superfluous? I mean the ttm_tt has a
bdev pointer to use, doesn't it?


Honestly without cscoping it I wouldn't know :-)




u64 offset;
... 

Then we use the "->ttm" of that to get dma_address[] but pages[] from 
the original ttm_tt structure ... from what I recall (when I looked 
into this last week) ttm_dma_tt and ttm_tt aren't the same structure 
right? (and looking at it right now ttm_dma_tt begins with ttm_tt).


So wherever I put the trace in ttm I need access to dma_address[] 
which doesn't seem so clear cut.  Can I just cast a ttm_tt pointer to 
ttm_dma_tt?


Ah, ok now I understand your problem. Well, printing the dma addresses 
in this situation is probably not a good idea to start with.


When a buffer object is mapped to kernel space there is no guarantee
that it already has its DMA mapping. For amdgpu we probably never run
into this situation, but for other drivers that might now be the case.


At the points I'm putting the traces a "map" has already happened per 
page like in ttm_populate and our kmap functions.



Do we really need that info?


To reverse the IOMMU translation I need the bus/dma address, physical 
address, and PCI device information (because the dma <=> phys 
translation is only valid per PCI device).


Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: add tracing of kernel DMA mappings and enable on init (v2)

2017-08-15 Thread Christian König

On 15.08.2017 at 13:33, Tom St Denis wrote:

On 15/08/17 07:22 AM, Christian König wrote:

On 14.08.2017 at 13:37, Tom St Denis wrote:

This patch adds trace points to the amdgpu bo_kmap/bo/kunmap functions
which capture internal allocations.

(v2): Simply add a sleep to the init.  Users can use this script to
load with map debugging enabled:

#!/bin/bash
modprobe amdgpu map_debug=1 &
sleep 1
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_unpopulate/enable
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_populate/enable


I've just realized that there is a far simpler method than that or 
enabling trace points using the module command line.


Assuming your GPU is PCI device 01:00.0 connected through bridge 
00:02.1 (use lspci -t to figure the connection out):


# Temporarily remove the PCIe device from the bus
echo 1 > /sys/bus/pci/devices/\:01\:00.0/remove
# Load amdgpu; this allows you to enable the trace points you want
# and also set probes etc.
modprobe amdgpu
# Rescan the bus; the device subsystem will automatically probe
# amdgpu with this device
echo 1 > /sys/bus/pci/devices/\:00\:02.1/rescan


That would definitely be a bunch trickier to script up.  Nobody will 
run this by hand.  So we'd need to remove all devices that are AMD 
vendor ID but GPU only (e.g. not other controllers) which means we 
need to parse the lspci output for VGA controller related text.


Definitely doable and probably "cleaner" from a race point of view.

Apart from that I would really like to see those trace points in TTM 
instead. I also don't mind adding a dev or pdev pointer to the 
ttm_bo_device structure for this.


Not against this in principle.  There's a bit of confusion... our ttm 
amdgpu functions are passed


struct ttm_tt *ttm

which we cast to

struct amdgpu_ttm_tt {
	struct ttm_dma_tt	ttm;
	struct amdgpu_device	*adev;


BTW: Isn't this adev pointer here superfluous? I mean the ttm_tt has a
bdev pointer to use, doesn't it?



u64 offset;
... 

Then we use the "->ttm" of that to get dma_address[] but pages[] from 
the original ttm_tt structure ... from what I recall (when I looked 
into this last week) ttm_dma_tt and ttm_tt aren't the same structure 
right? (and looking at it right now ttm_dma_tt begins with ttm_tt).


So wherever I put the trace in ttm I need access to dma_address[] 
which doesn't seem so clear cut.  Can I just cast a ttm_tt pointer to 
ttm_dma_tt?


Ah, ok now I understand your problem. Well, printing the dma addresses 
in this situation is probably not a good idea to start with.


When a buffer object is mapped to kernel space there is no guarantee
that it already has its DMA mapping. For amdgpu we probably never run
into this situation, but for other drivers that might now be the case.


Do we really need that info?

Christian.



Tom



Regards,
Christian.



Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 12 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  4 ++--
  4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index d2aaad77c353..2f5781df88c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -121,6 +121,7 @@ extern int amdgpu_cntl_sb_buf_per_se;
  extern int amdgpu_param_buf_per_se;
  extern int amdgpu_job_hang_limit;
  extern int amdgpu_lbpw;
+extern int amdgpu_enable_map_debugging;
  #ifdef CONFIG_DRM_AMDGPU_SI
  extern int amdgpu_si_support;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 2cdf8443e7d3..0ed777cc2f63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -120,6 +120,7 @@ int amdgpu_cntl_sb_buf_per_se = 0;
  int amdgpu_param_buf_per_se = 0;
  int amdgpu_job_hang_limit = 0;
  int amdgpu_lbpw = -1;
+int amdgpu_enable_map_debugging = 0;
  MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in 
megabytes");

  module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -263,6 +264,9 @@ module_param_named(job_hang_limit, 
amdgpu_job_hang_limit, int ,0444);
  MODULE_PARM_DESC(lbpw, "Load Balancing Per Watt (LBPW) support (1 
= enable, 0 = disable, -1 = auto)");

  module_param_named(lbpw, amdgpu_lbpw, int, 0444);
+MODULE_PARM_DESC(map_debug, "Enable map debugging in kernel (1 = 
on, 0 = off)");

+module_param_named(map_debug, amdgpu_enable_map_debugging, int, 0444);
+
  #ifdef CONFIG_DRM_AMDGPU_SI
  #if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
@@ -865,11 +869,19 @@ static struct pci_driver amdgpu_kms_pci_driver 
= {

  };
+// enable trace events during init
+static void amdgpu_enable_dma_tracers(void)
+{
+msleep(3000);
+}
  static int __init 

Re: [PATCH] drm/amd/amdgpu: add tracing of kernel DMA mappings and enable on init (v2)

2017-08-15 Thread Tom St Denis

On 15/08/17 07:22 AM, Christian König wrote:

On 14.08.2017 at 13:37, Tom St Denis wrote:

This patch adds trace points to the amdgpu bo_kmap/bo/kunmap functions
which capture internal allocations.

(v2): Simply add a sleep to the init.  Users can use this script to
load with map debugging enabled:

#!/bin/bash
modprobe amdgpu map_debug=1 &
sleep 1
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_unpopulate/enable
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_populate/enable


I've just realized that there is a far simpler method than that or 
enabling trace points using the module command line.


Assuming your GPU is PCI device 01:00.0 connected through bridge 00:02.1 
(use lspci -t to figure the connection out):


# Temporarily remove the PCIe device from the bus
echo 1 > /sys/bus/pci/devices/\:01\:00.0/remove
# Load amdgpu; this allows you to enable the trace points you want
# and also set probes etc.
modprobe amdgpu
# Rescan the bus; the device subsystem will automatically probe
# amdgpu with this device
echo 1 > /sys/bus/pci/devices/\:00\:02.1/rescan


That would definitely be a bunch trickier to script up.  Nobody will run 
this by hand.  So we'd need to remove all devices that are AMD vendor ID 
but GPU only (e.g. not other controllers) which means we need to parse 
the lspci output for VGA controller related text.
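
As a rough sketch of that filtering step, one alternative to parsing lspci text would be to read each device's sysfs "vendor" and "class" attributes and test them directly. The helper below is hypothetical (it appears in no patch in this thread) and assumes the caller has already read the two attribute strings, for example "0x1002" and "0x030000", from the device's sysfs directory.

```c
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_PCI_VENDOR_AMD	0x1002UL	/* AMD/ATI PCI vendor ID */
#define DEMO_PCI_CLASS_DISPLAY	0x03UL		/* PCI base class: display controller */

/*
 * Hypothetical helper: decide whether a PCI device is an AMD GPU from the
 * text of its sysfs "vendor" and "class" attributes. strtoul() with base 16
 * accepts the leading "0x" and stops at the trailing newline.
 */
static bool demo_is_amd_gpu(const char *vendor_str, const char *class_str)
{
	unsigned long vendor = strtoul(vendor_str, NULL, 16);
	unsigned long class_code = strtoul(class_str, NULL, 16);

	/* The sysfs class value is 24 bits: base class, subclass, prog-if. */
	return vendor == DEMO_PCI_VENDOR_AMD &&
	       ((class_code >> 16) & 0xff) == DEMO_PCI_CLASS_DISPLAY;
}
```

Matching on the base class rather than on "VGA controller" text would also catch display controllers that are not VGA compatible, while still skipping AMD audio or USB functions on the same card.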


Definitely doable and probably "cleaner" from a race point of view.

Apart from that I would really like to see those trace points in TTM 
instead. I also don't mind adding a dev or pdev pointer to the 
ttm_bo_device structure for this.


Not against this in principle.  There's a bit of confusion... our ttm 
amdgpu functions are passed


struct ttm_tt *ttm

which we cast to

struct amdgpu_ttm_tt {
struct ttm_dma_tt   ttm;
struct amdgpu_device*adev;
u64 offset;
... 

Then we use the "->ttm" of that to get dma_address[] but pages[] from 
the original ttm_tt structure ... from what I recall (when I looked into 
this last week) ttm_dma_tt and ttm_tt aren't the same structure right? 
(and looking at it right now ttm_dma_tt begins with ttm_tt).


So wherever I put the trace in ttm I need access to dma_address[] which 
doesn't seem so clear cut.  Can I just cast a ttm_tt pointer to ttm_dma_tt?
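
For reference, a minimal sketch of the cast being asked about, using hypothetical demo_* structures rather than the real TTM definitions. C guarantees that a pointer to a structure and a pointer to its first member convert to each other, so the cast is well defined only when the ttm_tt really is the first member of an enclosing ttm_dma_tt. Whether every driver allocates its ttm_tt that way is an assumption this sketch does not settle.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct ttm_tt. */
struct demo_tt {
	void **pages;
	unsigned long num_pages;
};

/* Hypothetical stand-in for struct ttm_dma_tt: the base object comes first. */
struct demo_dma_tt {
	struct demo_tt ttm;
	uint64_t *dma_address;
};

/*
 * Valid only if tt is embedded as the first member of a demo_dma_tt.
 * Calling this on a standalone demo_tt would read past the object.
 */
static struct demo_dma_tt *demo_to_dma_tt(struct demo_tt *tt)
{
	return (struct demo_dma_tt *)tt;
}
```

A generic trace point cannot tell the two cases apart from the pointer alone, so it would need some flag or driver knowledge before touching dma_address[].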


Tom



Regards,
Christian.



Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 12 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  4 ++--
  4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index d2aaad77c353..2f5781df88c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -121,6 +121,7 @@ extern int amdgpu_cntl_sb_buf_per_se;
  extern int amdgpu_param_buf_per_se;
  extern int amdgpu_job_hang_limit;
  extern int amdgpu_lbpw;
+extern int amdgpu_enable_map_debugging;
  #ifdef CONFIG_DRM_AMDGPU_SI
  extern int amdgpu_si_support;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 2cdf8443e7d3..0ed777cc2f63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -120,6 +120,7 @@ int amdgpu_cntl_sb_buf_per_se = 0;
  int amdgpu_param_buf_per_se = 0;
  int amdgpu_job_hang_limit = 0;
  int amdgpu_lbpw = -1;
+int amdgpu_enable_map_debugging = 0;
  MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
  module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -263,6 +264,9 @@ module_param_named(job_hang_limit, 
amdgpu_job_hang_limit, int ,0444);
  MODULE_PARM_DESC(lbpw, "Load Balancing Per Watt (LBPW) support (1 = 
enable, 0 = disable, -1 = auto)");

  module_param_named(lbpw, amdgpu_lbpw, int, 0444);
+MODULE_PARM_DESC(map_debug, "Enable map debugging in kernel (1 = on, 
0 = off)");

+module_param_named(map_debug, amdgpu_enable_map_debugging, int, 0444);
+
  #ifdef CONFIG_DRM_AMDGPU_SI
  #if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
@@ -865,11 +869,19 @@ static struct pci_driver amdgpu_kms_pci_driver = {
  };
+// enable trace events during init
+static void amdgpu_enable_dma_tracers(void)
+{
+msleep(3000);
+}
  static int __init amdgpu_init(void)
  {
  int r;
+if (amdgpu_enable_map_debugging)
+amdgpu_enable_dma_tracers();
+
  r = amdgpu_sync_init();
  if (r)
  goto error_sync;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index e7e899190bef..a292e86fbaa7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -37,6 +37,9 @@
  #include "amdgpu.h"
  #include "amdgpu_trace.h"
+void amdgpu_trace_dma_map(struct 

Re: [PATCH] drm/amd/amdgpu: add tracing of kernel DMA mappings and enable on init (v2)

2017-08-15 Thread Christian König

Am 14.08.2017 um 13:37 schrieb Tom St Denis:

This patch adds trace points to the amdgpu bo_kmap/bo_kunmap functions
which capture internal allocations.

(v2): Simply add a sleep to the init.  Users can use this script to
load with map debugging enabled:

#!/bin/bash
modprobe amdgpu map_debug=1 &
sleep 1
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_unpopulate/enable
echo 1 > /sys/kernel/debug/tracing/events/amdgpu/amdgpu_ttm_tt_populate/enable


I've just realized that there is a far simpler method than that or 
enabling trace points using the module command line.


Assuming your GPU is PCI device 01:00.0 connected through bridge 00:02.1 
(use lspci -t to figure the connection out):


# Temporarily remove the PCIe device from the bus
echo 1 > /sys/bus/pci/devices/\:01\:00.0/remove
# Load amdgpu; this lets you enable the trace points you want and
# also set probes etc.
modprobe amdgpu
# Rescan the bus; the device subsystem will automatically probe amdgpu
# with this device
echo 1 > /sys/bus/pci/devices/\:00\:02.1/rescan
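
For anyone who would rather drive this from a small helper than from the 
shell, the same remove/rescan sequence could be sketched in C as below.  
The helper names are invented for this example, and the full 
domain:bus:device.function address format (e.g. "0000:01:00.0") is an 
assumption; the addresses above are shown without a domain.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/bus/pci/devices/<bdf>/<attr>" in buf.
 * Returns 0 on success, -1 on empty input or if the result was truncated.
 */
int pci_sysfs_path(char *buf, size_t len, const char *bdf, const char *attr)
{
	int n;

	if (strlen(bdf) == 0 || strlen(attr) == 0)
		return -1;

	n = snprintf(buf, len, "/sys/bus/pci/devices/%s/%s", bdf, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" to a sysfs attribute, like `echo 1 > path`. */
int sysfs_write_one(const char *path)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs("1", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would build the "remove" path for the GPU, write to it, load the 
module and set up tracing, then build the "rescan" path for the bridge and 
write to that.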

Apart from that I would really like to see those trace points in TTM 
instead. I also don't mind adding a dev or pdev pointer to the 
ttm_bo_device structure for this.


Regards,
Christian.



Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 12 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  4 ++--
  4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d2aaad77c353..2f5781df88c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -121,6 +121,7 @@ extern int amdgpu_cntl_sb_buf_per_se;
  extern int amdgpu_param_buf_per_se;
  extern int amdgpu_job_hang_limit;
  extern int amdgpu_lbpw;
+extern int amdgpu_enable_map_debugging;
  
  #ifdef CONFIG_DRM_AMDGPU_SI

  extern int amdgpu_si_support;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2cdf8443e7d3..0ed777cc2f63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -120,6 +120,7 @@ int amdgpu_cntl_sb_buf_per_se = 0;
  int amdgpu_param_buf_per_se = 0;
  int amdgpu_job_hang_limit = 0;
  int amdgpu_lbpw = -1;
+int amdgpu_enable_map_debugging = 0;
  
  MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");

  module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -263,6 +264,9 @@ module_param_named(job_hang_limit, amdgpu_job_hang_limit, int ,0444);
  MODULE_PARM_DESC(lbpw, "Load Balancing Per Watt (LBPW) support (1 = enable, 0 = disable, -1 = auto)");
  module_param_named(lbpw, amdgpu_lbpw, int, 0444);
  
+MODULE_PARM_DESC(map_debug, "Enable map debugging in kernel (1 = on, 0 = off)");
+module_param_named(map_debug, amdgpu_enable_map_debugging, int, 0444);
+
  #ifdef CONFIG_DRM_AMDGPU_SI
  
  #if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)

@@ -865,11 +869,19 @@ static struct pci_driver amdgpu_kms_pci_driver = {
  };
  
  
+// enable trace events during init
+static void amdgpu_enable_dma_tracers(void)
+{
+	msleep(3000);
+}
  
  static int __init amdgpu_init(void)
  {
  	int r;
  
+	if (amdgpu_enable_map_debugging)
+		amdgpu_enable_dma_tracers();
+
  	r = amdgpu_sync_init();
  	if (r)
  		goto error_sync;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index e7e899190bef..a292e86fbaa7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -37,6 +37,9 @@
  #include "amdgpu.h"
  #include "amdgpu_trace.h"
  
+void amdgpu_trace_dma_map(struct ttm_tt *ttm);
+void amdgpu_trace_dma_unmap(struct ttm_tt *ttm);
+
  static void amdgpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
  {
struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
@@ -625,6 +628,9 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
  	if (r)
  		return r;
  
+	if (unlikely(trace_amdgpu_ttm_tt_populate_enabled()) && bo->tbo.mem.mem_type != TTM_PL_VRAM)
+		amdgpu_trace_dma_map(bo->tbo.ttm);
+
  	if (ptr)
  		*ptr = amdgpu_bo_kptr(bo);
  
@@ -640,8 +646,11 @@ void *amdgpu_bo_kptr(struct amdgpu_bo *bo)
  
  void amdgpu_bo_kunmap(struct amdgpu_bo *bo)
  {
-	if (bo->kmap.bo)
+	if (bo->kmap.bo) {
+		if (unlikely(trace_amdgpu_ttm_tt_unpopulate_enabled()) && bo->tbo.mem.mem_type != TTM_PL_VRAM)
+			amdgpu_trace_dma_unmap(bo->tbo.ttm);
  		ttm_bo_kunmap(&bo->kmap);
+	}
  }
  
  struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo *bo)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 

Re: [PATCH 2/2] drm/amd/amdgpu: expose fragment size as module parameter

2017-08-15 Thread Christian König

Am 15.08.2017 um 12:10 schrieb Roger He:

Change-Id: I70e4ea94b8520e19cfee5ba6c9a0ecf1ee3f5f1a
Signed-off-by: Roger He 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 25 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  3 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  3 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  3 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  9 -
  9 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d2aaad7..957bd2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -97,6 +97,7 @@ extern int amdgpu_bapm;
  extern int amdgpu_deep_color;
  extern int amdgpu_vm_size;
  extern int amdgpu_vm_block_size;
+extern int amdgpu_vm_fragment_size;
  extern int amdgpu_vm_fault_stop;
  extern int amdgpu_vm_debug;
  extern int amdgpu_vm_update_mode;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index dd1dc87..9da391f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1077,6 +1077,12 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
  		amdgpu_gtt_size = -1;
  	}
  
+	/* valid range is between 4 and 9 inclusive */
+	if (amdgpu_vm_fragment_size > 9 || amdgpu_vm_fragment_size < 4) {
+		dev_warn(adev->dev, "valid range is between 4 and 9\n");
+		amdgpu_vm_fragment_size = -1;
+	}
+
  	amdgpu_check_vm_size(adev);
  
  	amdgpu_check_block_size(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2cdf844..d5c63d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -94,6 +94,7 @@ unsigned amdgpu_ip_block_mask = 0x;
  int amdgpu_bapm = -1;
  int amdgpu_deep_color = 0;
  int amdgpu_vm_size = -1;
+int amdgpu_vm_fragment_size = -1;
  int amdgpu_vm_block_size = -1;
  int amdgpu_vm_fault_stop = 0;
  int amdgpu_vm_debug = 0;
@@ -184,6 +185,9 @@ module_param_named(deep_color, amdgpu_deep_color, int, 0444);
  MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 64GB)");
  module_param_named(vm_size, amdgpu_vm_size, int, 0444);
  
+MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M)");
+module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
+
  MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default depending on vm_size)");
  module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4ad04cd..b72d547 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2417,12 +2417,26 @@ static uint32_t amdgpu_vm_get_block_size(uint64_t 
vm_size)
  }
  
  /**

- * amdgpu_vm_adjust_size - adjust vm size and block size
+ * amdgpu_vm_set_fragment_size - adjust fragment size in PTE
+ *
+ * @adev: amdgpu_device pointer
+ * @fragment_size_default: the default fragment size if it's set auto
+ */
+void amdgpu_vm_set_fragment_size(struct amdgpu_device *adev, uint32_t 
fragment_size_default)
+{
+   if (amdgpu_vm_fragment_size == -1)
+   adev->vm_manager.fragment_size = fragment_size_default;
+   else
+   adev->vm_manager.fragment_size = amdgpu_vm_fragment_size;
+}
+
+/**
+ * amdgpu_vm_adjust_size - adjust vm size, block size and fragment size
   *
   * @adev: amdgpu_device pointer
   * @vm_size: the default vm size if it's set auto
   */
-void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size)
+void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, 
uint32_t fragment_size_default)
  {
/* adjust vm size firstly */
if (amdgpu_vm_size == -1)
@@ -2437,8 +2451,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size)
  	else
  		adev->vm_manager.block_size = amdgpu_vm_block_size;
  
-	DRM_INFO("vm size is %llu GB, block size is %u-bit\n",
-		adev->vm_manager.vm_size, adev->vm_manager.block_size);
+	amdgpu_vm_set_fragment_size(adev, fragment_size_default);
+
+	DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is %u-bit\n",
+		adev->vm_manager.vm_size, adev->vm_manager.block_size,
+		adev->vm_manager.fragment_size);
  }
  
  /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h

Re: [PATCH 2/2] drm/amd/amdgpu: expose fragment size as module parameter

2017-08-15 Thread Christian König

Am 15.08.2017 um 11:53 schrieb He, Roger:

[SNIP]
Don't adjust the global parameter here.

Better to add the default handling to amdgpu_vm_adjust_size(), like you had 
in the last version of the patch.

On Vega10, fragment handling works a bit differently: the L1 works like
GFX8, but the L2 is currently fixed to 2M.

So different fragment sizes still make sense for Vega10; you should just not 
change the hardware setup of the L2.
[Roger]: which register do you mean here?


The VM_L2_CNTL* registers. They work differently on Vega10 compared to 
earlier generations.
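
For reference, the relationship between the vm_fragment_size value and the 
resulting fragment size, plus the "auto" fallback discussed in this thread, 
can be sketched as follows.  This is an illustration and not the driver 
code: the function and macro names are made up, and the 4 KiB base page is 
an assumption inferred from the parameter description (4 = 64K, 9 = 2M).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GPU_PAGE_SHIFT 12   /* assumed 4 KiB base page */
#define SKETCH_FRAG_MIN        4   /* 64 KiB */
#define SKETCH_FRAG_MAX        9   /* 2 MiB */
#define SKETCH_FRAG_AUTO     (-1)  /* let the ASIC code pick */

/* Fragment size in bytes for a given log2 page count. */
uint64_t fragment_bytes(unsigned int bits)
{
	assert(bits <= SKETCH_FRAG_MAX);
	return (uint64_t)1 << (SKETCH_GPU_PAGE_SHIFT + bits);
}

/*
 * Pick the effective fragment size: the per-ASIC default when the
 * parameter is "auto" or out of range, otherwise the user's value.
 * Checking for "auto" first avoids treating -1 as an invalid value.
 */
unsigned int resolve_fragment_size(int param, unsigned int asic_default)
{
	if (param == SKETCH_FRAG_AUTO)
		return asic_default;
	if (param < SKETCH_FRAG_MIN || param > SKETCH_FRAG_MAX)
		return asic_default;
	return (unsigned int)param;
}
```

Keeping the per-ASIC default as an argument instead of writing it back into 
the global parameter matches the suggestion above not to adjust the global.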


Christian.



Thanks
Roger(Hongbo.He)


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: fix vega10 graphic hang issue in S3 test

2017-08-15 Thread Ken Wang
Change-Id: If01e32baa903c8c35991b1517c6d8bde98f5dae2
Signed-off-by: Ken Wang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 68a0d40..f47ee5c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -230,6 +230,7 @@ static int gfx_v9_0_ring_test_ring(struct amdgpu_ring *ring)
r = -EINVAL;
}
amdgpu_gfx_scratch_free(adev, scratch);
+
return r;
 }
 
@@ -1670,6 +1671,7 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device 
*adev)
u32 tmp;
u32 rb_bufsz;
u64 rb_addr, rptr_addr, wptr_gpu_addr;
+   int r;
 
/* Set the write pointer delay */
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_DELAY, 0);
@@ -1729,6 +1731,17 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device 
*adev)
 
/* start the ring */
gfx_v9_0_cp_gfx_start(adev);
+
+   r = amdgpu_ring_alloc(ring, 3);
+   if (r) {
+   return r;
+   }
+   amdgpu_ring_write(ring, PACKET3(PACKET3_SET_UCONFIG_REG, 1));
+   tmp = ((2 << 28) | (SOC15_REG_OFFSET(GC, 0, mmVGT_INDEX_TYPE) - 
PACKET3_SET_UCONFIG_REG_START));
+   amdgpu_ring_write(ring, tmp);
+   amdgpu_ring_write(ring, 0);
+   amdgpu_ring_commit(ring);
+
ring->ready = true;
 
return 0;
-- 
2.7.4



Re: [PATCH 4/4] drm/amdkfd: Implement image tiling mode support

2017-08-15 Thread Oded Gabbay
On Tue, Aug 15, 2017 at 1:55 AM, Felix Kuehling  wrote:
> OK, I was hoping to keep the ABI unchanged during upstreaming, but I
> realize that may not be realistic. And I feel I'm getting valuable
> feedback, so I don't want to limit myself with ABI requirements that are
> not necessary.
I think that's the correct way to go forward.

>
> I still don't have commit access to git repos on freedesktop.org. But
> I'm now planning to upload my WIP to the ROCm repositories on Github as
> upstreaming branches for ROCk (kernel) and ROCt (Thunk). I should have
> that in place in the next day or two.
>
> If I need to tweak the ABI in the upstreaming process, all I need to
> change in user mode is the Thunk, and I can make it available on my
> GitHub branch for people to try, regardless of ROCm release schedules.
> All the rest of the user mode stack should work unchanged as long as the
> Thunk API doesn't need to be changed.
>
> With that in mind, I'll remove the gaps from the ioctl assignment and
> accept that KFD will diverge (for the better) from our internal branch
> more or less significantly in the upstreaming process.
>
> What's the plan for upstreaming on your side? Do you want to push things
> incrementally? Or do you prefer to wait until everything (or most of it)
> is reviewed and tested on an upstream-based branch? If it's the latter,
> we could reshuffle the ioctls later to better match the current release
> ABI before going upstream.
>
> Regards,
>   Felix

I prefer to do it incrementally, to avoid very large patch-sets which
usually end up in longer cycles of review-fix, which causes you more
pain because internal development continues during this time and you
need to keep everything synchronized. If you do it in small pieces,
there is more chance it will get to upstream faster and then you can
cross it off your list permanently and no longer worry about it not
being synchronized with internal development.

If you are talking about the current patch-set (you actually sent 2
patch-sets), then once you rebase them on the branches I mentioned,
they are more or less good to go (except for the very small fixes we
talked about). If you can do it during this week, I think we can make
it for the 4.14 merge window.

Does that make sense?

Oded

>
> On 2017-08-14 11:18 AM, Felix Kuehling wrote:
>> On 2017-08-13 05:08 AM, Oded Gabbay wrote:
>>> As in the previous patch, there is a hole here in the IOCTLs
>>> numbering. I suggest that when you upstream new IOCTLs, you will put
>>> them in consecutive numbers per the upstream driver, and then take
>>> that change downstream because you can easily change it in your
>>> driver.
>> We can easily change it downstream, but it makes it harder for people to
>> test the new upstream kernel with already released ROCm user mode
>> drivers that are easy to install and without having to wait a few weeks
>> for the next release.
>>
>> If testability is not a priority, then I'd rather upstream the new
>> ioctls in the order in which we made them. The only reason I'm going
>> out-of-order is to make testing as easy as possible, as early as
>> possible, with the most recently released user mode stack.
>>
>> It's your call, though.
>>
>> Regards,
>>   Felix
>>
>>> Oded


[PATCH] drm/amdgpu: fix vega10 graphic hang issue in S3 test

2017-08-15 Thread Ken Wang
Change-Id: If01e32baa903c8c35991b1517c6d8bde98f5dae2
Signed-off-by: Ken Wang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 68a0d40..66312d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1670,6 +1670,7 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device 
*adev)
u32 tmp;
u32 rb_bufsz;
u64 rb_addr, rptr_addr, wptr_gpu_addr;
+   int r;
 
/* Set the write pointer delay */
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_DELAY, 0);
@@ -1729,6 +1730,17 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device 
*adev)
 
/* start the ring */
gfx_v9_0_cp_gfx_start(adev);
+
+   r = amdgpu_ring_alloc(ring, 3);
+   if (r) {
+   return r;
+   }
+   amdgpu_ring_write(ring, PACKET3(PACKET3_SET_UCONFIG_REG, 1));
+   tmp = ((2 << 28) | (SOC15_REG_OFFSET(GC, 0, mmVGT_INDEX_TYPE) - 
PACKET3_SET_UCONFIG_REG_START));
+   amdgpu_ring_write(ring, tmp);
+   amdgpu_ring_write(ring, 0);
+   amdgpu_ring_commit(ring);
+
ring->ready = true;
 
return 0;
-- 
2.7.4



[PATCH 2/2] drm/amd/amdgpu: expose fragment size as module parameter

2017-08-15 Thread Roger He
Change-Id: I70e4ea94b8520e19cfee5ba6c9a0ecf1ee3f5f1a
Signed-off-by: Roger He 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 25 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 -
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  3 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  3 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  3 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  9 -
 9 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d2aaad7..957bd2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -97,6 +97,7 @@ extern int amdgpu_bapm;
 extern int amdgpu_deep_color;
 extern int amdgpu_vm_size;
 extern int amdgpu_vm_block_size;
+extern int amdgpu_vm_fragment_size;
 extern int amdgpu_vm_fault_stop;
 extern int amdgpu_vm_debug;
 extern int amdgpu_vm_update_mode;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index dd1dc87..9da391f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1077,6 +1077,12 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
 		amdgpu_gtt_size = -1;
 	}
 
+	/* valid range is between 4 and 9 inclusive */
+	if (amdgpu_vm_fragment_size > 9 || amdgpu_vm_fragment_size < 4) {
+		dev_warn(adev->dev, "valid range is between 4 and 9\n");
+		amdgpu_vm_fragment_size = -1;
+	}
+
 	amdgpu_check_vm_size(adev);
 
 	amdgpu_check_block_size(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2cdf844..d5c63d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -94,6 +94,7 @@ unsigned amdgpu_ip_block_mask = 0x;
 int amdgpu_bapm = -1;
 int amdgpu_deep_color = 0;
 int amdgpu_vm_size = -1;
+int amdgpu_vm_fragment_size = -1;
 int amdgpu_vm_block_size = -1;
 int amdgpu_vm_fault_stop = 0;
 int amdgpu_vm_debug = 0;
@@ -184,6 +185,9 @@ module_param_named(deep_color, amdgpu_deep_color, int, 
0444);
 MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 64GB)");
 module_param_named(vm_size, amdgpu_vm_size, int, 0444);
 
+MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 
64K (default), Max 9 = 2M)");
+module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
+
 MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default depending 
on vm_size)");
 module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4ad04cd..b72d547 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2417,12 +2417,26 @@ static uint32_t amdgpu_vm_get_block_size(uint64_t 
vm_size)
 }
 
 /**
- * amdgpu_vm_adjust_size - adjust vm size and block size
+ * amdgpu_vm_set_fragment_size - adjust fragment size in PTE
+ *
+ * @adev: amdgpu_device pointer
+ * @fragment_size_default: the default fragment size if it's set auto
+ */
+void amdgpu_vm_set_fragment_size(struct amdgpu_device *adev, uint32_t 
fragment_size_default)
+{
+   if (amdgpu_vm_fragment_size == -1)
+   adev->vm_manager.fragment_size = fragment_size_default;
+   else
+   adev->vm_manager.fragment_size = amdgpu_vm_fragment_size;
+}
+
+/**
+ * amdgpu_vm_adjust_size - adjust vm size, block size and fragment size
  *
  * @adev: amdgpu_device pointer
  * @vm_size: the default vm size if it's set auto
  */
-void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size)
+void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, 
uint32_t fragment_size_default)
 {
/* adjust vm size firstly */
if (amdgpu_vm_size == -1)
@@ -2437,8 +2451,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, 
uint64_t vm_size)
else
adev->vm_manager.block_size = amdgpu_vm_block_size;
 
-   DRM_INFO("vm size is %llu GB, block size is %u-bit\n",
-   adev->vm_manager.vm_size, adev->vm_manager.block_size);
+   amdgpu_vm_set_fragment_size(adev, fragment_size_default);
+
+   DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is 
%u-bit\n",
+   adev->vm_manager.vm_size, adev->vm_manager.block_size,
+   adev->vm_manager.fragment_size);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index d426384..10bac33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -262,7 

[PATCH] drm/amdgpu: fix vega10 graphic hang issue in S3 test

2017-08-15 Thread Huang Rui
From: Ken Wang 

Change-Id: If01e32baa903c8c35991b1517c6d8bde98f5dae2
Signed-off-by: Ken Wang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 68a0d40..66312d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1670,6 +1670,7 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device 
*adev)
u32 tmp;
u32 rb_bufsz;
u64 rb_addr, rptr_addr, wptr_gpu_addr;
+   int r;
 
/* Set the write pointer delay */
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_DELAY, 0);
@@ -1729,6 +1730,17 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device 
*adev)
 
/* start the ring */
gfx_v9_0_cp_gfx_start(adev);
+
+   r = amdgpu_ring_alloc(ring, 3);
+   if (r) {
+   return r;
+   }
+   amdgpu_ring_write(ring, PACKET3(PACKET3_SET_UCONFIG_REG, 1));
+   tmp = ((2 << 28) | (SOC15_REG_OFFSET(GC, 0, mmVGT_INDEX_TYPE) - 
PACKET3_SET_UCONFIG_REG_START));
+   amdgpu_ring_write(ring, tmp);
+   amdgpu_ring_write(ring, 0);
+   amdgpu_ring_commit(ring);
+
ring->ready = true;
 
return 0;
-- 
2.7.4



RE: [PATCH 2/2] drm/amd/amdgpu: expose fragment size as module parameter

2017-08-15 Thread He, Roger
-----Original Message-----
From: Koenig, Christian 
Sent: Tuesday, August 15, 2017 4:48 PM
To: He, Roger ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amd/amdgpu: expose fragment size as module 
parameter

Am 15.08.2017 um 10:36 schrieb Roger He:
> Change-Id: I70e4ea94b8520e19cfee5ba6c9a0ecf1ee3f5f1a
> Signed-off-by: Roger He 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +--
>   drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  | 1 -
>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 1 -
>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 1 -
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
>   8 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index d2aaad7..957bd2b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -97,6 +97,7 @@ extern int amdgpu_bapm;
>   extern int amdgpu_deep_color;
>   extern int amdgpu_vm_size;
>   extern int amdgpu_vm_block_size;
> +extern int amdgpu_vm_fragment_size;
>   extern int amdgpu_vm_fault_stop;
>   extern int amdgpu_vm_debug;
>   extern int amdgpu_vm_update_mode;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index dd1dc87..44c66a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1077,6 +1077,14 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
>   		amdgpu_gtt_size = -1;
>   	}
>   
> +	/* make sense only for GFX8 and previous ASICs
> +	 * valid range is between 4 and 9 inclusive
> +	 */
> +	if (amdgpu_vm_fragment_size > 9 || amdgpu_vm_fragment_size < 4) {
> +		dev_warn(adev->dev, "valid range is between 4 and 9\n");
> +		amdgpu_vm_fragment_size = 4;
> +	}
> +
>   	amdgpu_check_vm_size(adev);
>   
>   	amdgpu_check_block_size(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 2cdf844..d9522a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -94,6 +94,7 @@ unsigned amdgpu_ip_block_mask = 0x;
>   int amdgpu_bapm = -1;
>   int amdgpu_deep_color = 0;
>   int amdgpu_vm_size = -1;
> +int amdgpu_vm_fragment_size = 4;
>   int amdgpu_vm_block_size = -1;
>   int amdgpu_vm_fault_stop = 0;
>   int amdgpu_vm_debug = 0;
> @@ -184,6 +185,9 @@ module_param_named(deep_color, amdgpu_deep_color, int, 
> 0444);
>   MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 
> 64GB)");
>   module_param_named(vm_size, amdgpu_vm_size, int, 0444);
>   
> +MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M)");
> +module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
> +
>   MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default 
> depending on vm_size)");
>   module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 4ad04cd..85ef4d5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2437,8 +2437,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, 
> uint64_t vm_size)
>   else
>   adev->vm_manager.block_size = amdgpu_vm_block_size;
>   
> - DRM_INFO("vm size is %llu GB, block size is %u-bit\n",
> - adev->vm_manager.vm_size, adev->vm_manager.block_size);
> + adev->vm_manager.fragment_size = amdgpu_vm_fragment_size;
> +
> + DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is 
> %u-bit\n",
> + adev->vm_manager.vm_size, adev->vm_manager.block_size,
> + adev->vm_manager.fragment_size);
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index dcb053f..56218ac 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -815,7 +815,6 @@ static int gmc_v6_0_sw_init(void *handle)
>   return r;
>   
>   amdgpu_vm_adjust_size(adev, 64);
> - adev->vm_manager.fragment_size = 4;
>   adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
>   
>   	adev->mc.mc_mask = 0xffULL;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index 2ac9afa..7f5eb02 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -951,7 +951,6 @@ static int gmc_v7_0_sw_init(void *handle)
>* Max GPUVM size for cayman and SI is 

[PATCH 2/2] drm/amd/amdgpu: expose fragment size as module parameter

2017-08-15 Thread Roger He
Change-Id: I70e4ea94b8520e19cfee5ba6c9a0ecf1ee3f5f1a
Signed-off-by: Roger He 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  | 1 -
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 1 -
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 1 -
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
 8 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d2aaad7..957bd2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -97,6 +97,7 @@ extern int amdgpu_bapm;
 extern int amdgpu_deep_color;
 extern int amdgpu_vm_size;
 extern int amdgpu_vm_block_size;
+extern int amdgpu_vm_fragment_size;
 extern int amdgpu_vm_fault_stop;
 extern int amdgpu_vm_debug;
 extern int amdgpu_vm_update_mode;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index dd1dc87..44c66a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1077,6 +1077,14 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
 		amdgpu_gtt_size = -1;
 	}
 
+	/* make sense only for GFX8 and previous ASICs
+	 * valid range is between 4 and 9 inclusive
+	 */
+	if (amdgpu_vm_fragment_size > 9 || amdgpu_vm_fragment_size < 4) {
+		dev_warn(adev->dev, "valid range is between 4 and 9\n");
+		amdgpu_vm_fragment_size = 4;
+	}
+
 	amdgpu_check_vm_size(adev);
 
 	amdgpu_check_block_size(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2cdf844..d9522a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -94,6 +94,7 @@ unsigned amdgpu_ip_block_mask = 0x;
 int amdgpu_bapm = -1;
 int amdgpu_deep_color = 0;
 int amdgpu_vm_size = -1;
+int amdgpu_vm_fragment_size = 4;
 int amdgpu_vm_block_size = -1;
 int amdgpu_vm_fault_stop = 0;
 int amdgpu_vm_debug = 0;
@@ -184,6 +185,9 @@ module_param_named(deep_color, amdgpu_deep_color, int, 0444);
 MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 64GB)");
 module_param_named(vm_size, amdgpu_vm_size, int, 0444);
 
+MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M)");
+module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
+
 MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default depending on vm_size)");
 module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4ad04cd..85ef4d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2437,8 +2437,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size)
else
adev->vm_manager.block_size = amdgpu_vm_block_size;
 
-   DRM_INFO("vm size is %llu GB, block size is %u-bit\n",
-   adev->vm_manager.vm_size, adev->vm_manager.block_size);
+   adev->vm_manager.fragment_size = amdgpu_vm_fragment_size;
+
+   DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is %u-bit\n",
+   adev->vm_manager.vm_size, adev->vm_manager.block_size,
+   adev->vm_manager.fragment_size);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index dcb053f..56218ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -815,7 +815,6 @@ static int gmc_v6_0_sw_init(void *handle)
return r;
 
amdgpu_vm_adjust_size(adev, 64);
-   adev->vm_manager.fragment_size = 4;
adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
 
adev->mc.mc_mask = 0xffULL;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 2ac9afa..7f5eb02 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -951,7 +951,6 @@ static int gmc_v7_0_sw_init(void *handle)
 * Max GPUVM size for cayman and SI is 40 bits.
 */
amdgpu_vm_adjust_size(adev, 64);
-   adev->vm_manager.fragment_size = 4;
adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
 
/* Set the internal MC address mask
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 27c70d8..1ffba0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1049,7 +1049,6 @@ static int gmc_v8_0_sw_init(void *handle)
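
As a rough illustration of what the new vm_fragment_size parameter means (a sketch, not code from the patch): the range check above falls back to 4 for any value outside 4..9, and the fragment covers (1 << fragment_size) GPU pages. The helper names below are hypothetical, and the 4 KiB page size is assumed to match AMDGPU_GPU_PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GPU page size is 4 KiB, as with AMDGPU_GPU_PAGE_SIZE. */
#define GPU_PAGE_SIZE 4096u
#define FRAG_MIN      4
#define FRAG_MAX      9
#define FRAG_DEFAULT  4

/* Hypothetical helper mirroring the patch's range check:
 * values outside 4..9 fall back to the default of 4. */
static int clamp_fragment_size(int requested)
{
    if (requested > FRAG_MAX || requested < FRAG_MIN)
        return FRAG_DEFAULT;
    return requested;
}

/* Hypothetical helper: fragment size in bytes for a given
 * log2 number of pages per fragment. */
static uint64_t fragment_bytes(int fragment_size)
{
    assert(fragment_size >= FRAG_MIN && fragment_size <= FRAG_MAX);
    return ((uint64_t)1 << fragment_size) * GPU_PAGE_SIZE;
}
```

With these assumptions, a value of 4 gives 64 KiB and a value of 9 gives 2 MiB, which agrees with the text of the MODULE_PARM_DESC string.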
 

Re: [PATCH 2/2] drm/amd/amdgpu: expose fragment size as module parameter

2017-08-15 Thread Christian König

On 15.08.2017 at 10:36, Roger He wrote:

Change-Id: I70e4ea94b8520e19cfee5ba6c9a0ecf1ee3f5f1a
Signed-off-by: Roger He 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  | 1 -
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 1 -
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 1 -
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
  8 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d2aaad7..957bd2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -97,6 +97,7 @@ extern int amdgpu_bapm;
  extern int amdgpu_deep_color;
  extern int amdgpu_vm_size;
  extern int amdgpu_vm_block_size;
+extern int amdgpu_vm_fragment_size;
  extern int amdgpu_vm_fault_stop;
  extern int amdgpu_vm_debug;
  extern int amdgpu_vm_update_mode;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index dd1dc87..44c66a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1077,6 +1077,14 @@ static void amdgpu_check_arguments(struct amdgpu_device 
*adev)
amdgpu_gtt_size = -1;
}
  
+	/* make sense only for GFX8 and previous ASICs

+* valid rang is between 4 and 9 inclusive
+*/
+   if (amdgpu_vm_fragment_size > 9 || amdgpu_vm_fragment_size < 4) {
+   dev_warn(adev->dev, "valid rang is between 4 and 9\n");
+   amdgpu_vm_fragment_size = 4;
+   }
+
amdgpu_check_vm_size(adev);
  
  	amdgpu_check_block_size(adev);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2cdf844..d9522a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -94,6 +94,7 @@ unsigned amdgpu_ip_block_mask = 0x;
  int amdgpu_bapm = -1;
  int amdgpu_deep_color = 0;
  int amdgpu_vm_size = -1;
+int amdgpu_vm_fragment_size = 4;
  int amdgpu_vm_block_size = -1;
  int amdgpu_vm_fault_stop = 0;
  int amdgpu_vm_debug = 0;
@@ -184,6 +185,9 @@ module_param_named(deep_color, amdgpu_deep_color, int, 
0444);
  MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 
64GB)");
  module_param_named(vm_size, amdgpu_vm_size, int, 0444);
  
+MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M)");

+module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
+
  MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default depending on 
vm_size)");
  module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 4ad04cd..85ef4d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2437,8 +2437,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, 
uint64_t vm_size)
else
adev->vm_manager.block_size = amdgpu_vm_block_size;
  
-	DRM_INFO("vm size is %llu GB, block size is %u-bit\n",

-   adev->vm_manager.vm_size, adev->vm_manager.block_size);
+   adev->vm_manager.fragment_size = amdgpu_vm_fragment_size;
+
+   DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is 
%u-bit\n",
+   adev->vm_manager.vm_size, adev->vm_manager.block_size,
+   adev->vm_manager.fragment_size);
  }
  
  /**

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index dcb053f..56218ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -815,7 +815,6 @@ static int gmc_v6_0_sw_init(void *handle)
return r;
  
  	amdgpu_vm_adjust_size(adev, 64);

-   adev->vm_manager.fragment_size = 4;
adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
  
  	adev->mc.mc_mask = 0xffULL;

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 2ac9afa..7f5eb02 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -951,7 +951,6 @@ static int gmc_v7_0_sw_init(void *handle)
 * Max GPUVM size for cayman and SI is 40 bits.
 */
amdgpu_vm_adjust_size(adev, 64);
-   adev->vm_manager.fragment_size = 4;
adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
  
  	/* Set the internal MC address mask

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 27c70d8..1ffba0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1049,7 +1049,6 

Re: [PATCH 1/2] drm/amd/amdgpu: add fragment size in vm_manager

2017-08-15 Thread Christian König

Only two nit picks, the first one is that we need a better commit message.

Something like: "This adds the fragment_size in the vm_manager structure 
and implements hardware setup for it."


For the second see below.

On 15.08.2017 at 10:35, Roger He wrote:

Change-Id: If8de884538b8eca2214f21242925d854e16e63b7
Signed-off-by: Roger He 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  | 5 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   | 4 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h   | 6 +-
  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 5 +++--
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c| 8 ++--
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c| 9 ++---
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c| 9 ++---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 8 ++--
  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  | 5 +++--
  9 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 4a6407d..b850bf92 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -590,11 +590,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE;
dev_info.virtual_address_max = 
(uint64_t)adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
dev_info.virtual_address_alignment = max((int)PAGE_SIZE, 
AMDGPU_GPU_PAGE_SIZE);
-   dev_info.pte_fragment_size =
-   (1 << AMDGPU_LOG2_PAGES_PER_FRAG(adev)) *
-   AMDGPU_GPU_PAGE_SIZE;
+   dev_info.pte_fragment_size = (1 << 
adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE;
dev_info.gart_page_size = AMDGPU_GPU_PAGE_SIZE;
-
dev_info.cu_active_number = adev->gfx.cu_info.number;
dev_info.cu_ao_mask = adev->gfx.cu_info.ao_cu_mask;
dev_info.ce_ram_size = adev->gfx.ce_ram_size;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4b964f5..4ad04cd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1419,9 +1419,7 @@ static int amdgpu_vm_frag_ptes(struct 
amdgpu_pte_update_params*params,
 * Userspace can support this by aligning virtual base address and
 * allocation size to the fragment size.
 */
-
-   /* SI and newer are optimized for 64KB */
-   unsigned pages_per_frag = AMDGPU_LOG2_PAGES_PER_FRAG(params->adev);
+   unsigned pages_per_frag = params->adev->vm_manager.fragment_size;
uint64_t frag_flags = AMDGPU_PTE_FRAG(pages_per_frag);
uint64_t frag_align = 1 << pages_per_frag;
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h

index 6e94cd2..d426384 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -50,11 +50,6 @@ struct amdgpu_bo_list_entry;
  /* PTBs (Page Table Blocks) need to be aligned to 32K */
  #define AMDGPU_VM_PTB_ALIGN_SIZE   32768
  
-/* LOG2 number of continuous pages for the fragment field */

-#define AMDGPU_LOG2_PAGES_PER_FRAG(adev) \
-   ((adev)->asic_type < CHIP_VEGA10 ? 4 : \
-(adev)->vm_manager.block_size)
-
  #define AMDGPU_PTE_VALID  (1ULL << 0)
  #define AMDGPU_PTE_SYSTEM (1ULL << 1)
  #define AMDGPU_PTE_SNOOPED(1ULL << 2)
@@ -191,6 +186,7 @@ struct amdgpu_vm_manager {
uint32_tnum_level;
uint64_tvm_size;
uint32_tblock_size;
+   uint32_tfragment_size;
/* vram base address for page table entry  */
u64 vram_base_offset;
/* vm pte handling */
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
index 6c8040e..4f2788b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
@@ -124,7 +124,7 @@ static void gfxhub_v1_0_init_tlb_regs(struct amdgpu_device 
*adev)
  
  static void gfxhub_v1_0_init_cache_regs(struct amdgpu_device *adev)

  {
-   uint32_t tmp;
+   uint32_t tmp, field;
  
  	/* Setup L2 cache */

tmp = RREG32_SOC15(GC, 0, mmVM_L2_CNTL);
@@ -143,8 +143,9 @@ static void gfxhub_v1_0_init_cache_regs(struct 
amdgpu_device *adev)
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
WREG32_SOC15(GC, 0, mmVM_L2_CNTL2, tmp);
  
+	field = adev->vm_manager.fragment_size;

tmp = mmVM_L2_CNTL3_DEFAULT;
-   tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 9);
+   tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, field);
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, 6);
WREG32_SOC15(GC, 0, mmVM_L2_CNTL3, 

Re: [PATCH xf86-video-amdgpu] Adapt to PixmapDirtyUpdateRec::src being a DrawablePtr

2017-08-15 Thread Michel Dänzer
On 15/08/17 05:26 PM, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> Signed-off-by: Michel Dänzer 
> ---

I meant to mention here that these patches adapt to
https://patchwork.freedesktop.org/patch/150938/ , which I'm about to push.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH xf86-video-amdgpu] Adapt to PixmapDirtyUpdateRec::src being a DrawablePtr

2017-08-15 Thread Michel Dänzer
From: Michel Dänzer 

Signed-off-by: Michel Dänzer 
---
 src/amdgpu_drv.h  | 26 ++
 src/amdgpu_kms.c  | 20 ++--
 src/drmmode_display.c | 10 +++---
 3 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/src/amdgpu_drv.h b/src/amdgpu_drv.h
index e5c44dc36..9e088e71a 100644
--- a/src/amdgpu_drv.h
+++ b/src/amdgpu_drv.h
@@ -170,6 +170,32 @@ typedef enum {
 #define AMDGPU_PIXMAP_SHARING 1
 #define amdgpu_is_gpu_screen(screen) (screen)->isGPU
 #define amdgpu_is_gpu_scrn(scrn) (scrn)->is_gpu
+
+static inline ScreenPtr
+amdgpu_dirty_master(PixmapDirtyUpdatePtr dirty)
+{
+#ifdef HAS_DIRTYTRACKING_DRAWABLE_SRC
+   ScreenPtr screen = dirty->src->pScreen;
+#else
+   ScreenPtr screen = dirty->src->drawable.pScreen;
+#endif
+
+   if (screen->current_master)
+   return screen->current_master;
+
+   return screen;
+}
+
+static inline Bool
+amdgpu_dirty_src_equals(PixmapDirtyUpdatePtr dirty, PixmapPtr pixmap)
+{
+#ifdef HAS_DIRTYTRACKING_DRAWABLE_SRC
+   return dirty->src == &pixmap->drawable;
+#else
+   return dirty->src == pixmap;
+#endif
+}
+
 #else
 #define amdgpu_is_gpu_screen(screen) 0
 #define amdgpu_is_gpu_scrn(scrn) 0
diff --git a/src/amdgpu_kms.c b/src/amdgpu_kms.c
index c86f117f9..20a552baa 100644
--- a/src/amdgpu_kms.c
+++ b/src/amdgpu_kms.c
@@ -448,7 +448,7 @@ dirty_region(PixmapDirtyUpdatePtr dirty)
 static void
 redisplay_dirty(PixmapDirtyUpdatePtr dirty, RegionPtr region)
 {
-   ScrnInfoPtr scrn = xf86ScreenToScrn(dirty->src->drawable.pScreen);
+   ScrnInfoPtr scrn = xf86ScreenToScrn(dirty->slave_dst->drawable.pScreen);
 
if (RegionNil(region))
goto out;
@@ -481,12 +481,12 @@ amdgpu_prime_scanout_update_abort(xf86CrtcPtr crtc, void 
*event_data)
 void
 amdgpu_sync_shared_pixmap(PixmapDirtyUpdatePtr dirty)
 {
-   ScreenPtr master_screen = dirty->src->master_pixmap->drawable.pScreen;
+   ScreenPtr master_screen = amdgpu_dirty_master(dirty);
PixmapDirtyUpdatePtr ent;
RegionPtr region;
 
xorg_list_for_each_entry(ent, &master_screen->pixmap_dirty_list, ent) {
-   if (ent->slave_dst != dirty->src)
+   if (!amdgpu_dirty_src_equals(dirty, ent->slave_dst))
continue;
 
region = dirty_region(ent);
@@ -501,7 +501,7 @@ amdgpu_sync_shared_pixmap(PixmapDirtyUpdatePtr dirty)
 static Bool
 master_has_sync_shared_pixmap(ScrnInfoPtr scrn, PixmapDirtyUpdatePtr dirty)
 {
-   ScreenPtr master_screen = dirty->src->master_pixmap->drawable.pScreen;
+   ScreenPtr master_screen = amdgpu_dirty_master(dirty);
 
return master_screen->SyncSharedPixmap != NULL;
 }
@@ -517,7 +517,7 @@ slave_has_sync_shared_pixmap(ScrnInfoPtr scrn, 
PixmapDirtyUpdatePtr dirty)
 static void
 call_sync_shared_pixmap(PixmapDirtyUpdatePtr dirty)
 {
-   ScreenPtr master_screen = dirty->src->master_pixmap->drawable.pScreen;
+   ScreenPtr master_screen = amdgpu_dirty_master(dirty);
 
master_screen->SyncSharedPixmap(dirty);
 }
@@ -527,7 +527,7 @@ call_sync_shared_pixmap(PixmapDirtyUpdatePtr dirty)
 static Bool
 master_has_sync_shared_pixmap(ScrnInfoPtr scrn, PixmapDirtyUpdatePtr dirty)
 {
-   ScrnInfoPtr master_scrn = 
xf86ScreenToScrn(dirty->src->master_pixmap->drawable.pScreen);
+   ScrnInfoPtr master_scrn = xf86ScreenToScrn(amdgpu_dirty_master(dirty));
 
return master_scrn->driverName == scrn->driverName;
 }
@@ -562,7 +562,7 @@ amdgpu_prime_dirty_to_crtc(PixmapDirtyUpdatePtr dirty)
xf86CrtcPtr xf86_crtc = xf86_config->crtc[c];
drmmode_crtc_private_ptr drmmode_crtc = 
xf86_crtc->driver_private;
 
-   if (drmmode_crtc->prime_scanout_pixmap == dirty->src)
+   if (amdgpu_dirty_src_equals(dirty, 
drmmode_crtc->prime_scanout_pixmap))
return xf86_crtc;
}
 
@@ -579,7 +579,7 @@ amdgpu_prime_scanout_do_update(xf86CrtcPtr crtc, unsigned 
scanout_id)
Bool ret = FALSE;
 
xorg_list_for_each_entry(dirty, &screen->pixmap_dirty_list, ent) {
-   if (dirty->src == drmmode_crtc->prime_scanout_pixmap) {
+   if (amdgpu_dirty_src_equals(dirty, 
drmmode_crtc->prime_scanout_pixmap)) {
RegionPtr region;
 
if (master_has_sync_shared_pixmap(scrn, dirty))
@@ -756,10 +756,10 @@ amdgpu_dirty_update(ScrnInfoPtr scrn)
PixmapDirtyUpdatePtr region_ent = ent;
 
if (master_has_sync_shared_pixmap(scrn, ent)) {
-   ScreenPtr master_screen = 
ent->src->master_pixmap->drawable.pScreen;
+   ScreenPtr master_screen = 
amdgpu_dirty_master(ent);
 
xorg_list_for_each_entry(region_ent, 
&master_screen->pixmap_dirty_list, ent) {
-   if (region_ent->slave_dst == ent->src)
+

Re: [PATCH 2/4] drm/amdkfd: Adding new IOCTL for scratch memory

2017-08-15 Thread Christian König

On 14.08.2017 at 17:31, Felix Kuehling wrote:

[SNIP]
Repeating the same argument I made on another email:


Commented on that in the other mail, let's keep the discussion on this 
topic there.



BTW: What exactly is this good for?

Scratch memory is private memory per work-item. It's used when a shader
program has too few registers available. With HSA we use flat scratch
addressing, where shaders can access private memory in a special scratch
aperture using normal memory instructions. Using the same virtual
address, each work item gets its own private piece of memory. The
hardware does the address translation from the VA in the private
aperture to a scratch-backing VA. The application is responsible for
allocating the memory to back that scratch area, and to map it somewhere
in its virtual address space.

This ioctl tells the hardware (or HWS firmware) the VA of the scratch
backing memory.
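
To make the description above a bit more concrete, here is a deliberately simplified model of the idea: every work item uses the same VA inside the scratch aperture, and the translation lands in a per-work-item slice of the backing memory. The struct, the function name and the linear layout are all assumptions for illustration only; the real hardware translation and layout are not described in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not the hardware's actual data structures. */
struct scratch_model {
    uint64_t aperture_base;  /* start of the private (scratch) aperture */
    uint64_t backing_va;     /* VA of the scratch backing memory, i.e. the
                              * value the ioctl is said to communicate */
    uint64_t bytes_per_item; /* private bytes reserved per work item */
};

/* Assumed linear layout: work item N owns the N-th slice of the
 * backing memory. Real hardware may interleave or swizzle instead. */
static uint64_t scratch_translate(const struct scratch_model *m,
                                  uint32_t work_item,
                                  uint64_t aperture_va)
{
    uint64_t offset = aperture_va - m->aperture_base;

    /* The access must stay inside this work item's private slice. */
    assert(offset < m->bytes_per_item);
    return m->backing_va + (uint64_t)work_item * m->bytes_per_item + offset;
}
```

In this model two work items that issue the same aperture VA end up at different backing addresses, which is the property the explanation above relies on.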


Ok, you've got me lost here. Not that I'm deeply into that stuff, but my 
last understanding was that those apertures are global and not per process or per VMID.


That would mean that when two processes try to set two different 
addresses we are completely lost here.


Christian.