Re: [PATCH v6 08/22] drm/virtio: Unlock reservations on dma_resv_reserve_fences() error

2022-06-28 Thread Intel



On 5/27/22 01:50, Dmitry Osipenko wrote:

Unlock reservations on dma_resv_reserve_fences() error to fix recursive
locking of the reservations when this error happens.

Fixes:

Cc: sta...@vger.kernel.org


With that fixed,

Reviewed-by: Thomas Hellström 



Signed-off-by: Dmitry Osipenko 
---
  drivers/gpu/drm/virtio/virtgpu_gem.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c 
b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 580a78809836..7db48d17ee3a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -228,8 +228,10 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
  
  	for (i = 0; i < objs->nents; ++i) {

ret = dma_resv_reserve_fences(objs->objs[i]->resv, 1);
-   if (ret)
+   if (ret) {
+   virtio_gpu_array_unlock_resv(objs);
return ret;
+   }
}
return ret;
  }


Re: [PATCH v6 02/22] drm/gem: Move mapping of imported dma-bufs to drm_gem_mmap_obj()

2022-06-28 Thread Intel



On 5/27/22 01:50, Dmitry Osipenko wrote:

Drivers that use drm_gem_mmap() and drm_gem_mmap_obj() helpers don't
handle imported dma-bufs properly, which results in mapping something
other than the imported dma-buf. For example, on NVIDIA Tegra we get a hard
lockup when userspace writes to the memory mapping of a dma-buf that was
imported into Tegra's DRM GEM.

To fix this bug, move mapping of imported dma-bufs to drm_gem_mmap_obj().
Now mmaping of imported dma-bufs works properly for all DRM drivers.

Same comment about Fixes: as in patch 1,


Cc: sta...@vger.kernel.org
Signed-off-by: Dmitry Osipenko 
---
  drivers/gpu/drm/drm_gem.c  | 3 +++
  drivers/gpu/drm/drm_gem_shmem_helper.c | 9 -
  drivers/gpu/drm/tegra/gem.c| 4 
  3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 86d670c71286..7c0b025508e4 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1038,6 +1038,9 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size,
if (obj_size < vma->vm_end - vma->vm_start)
return -EINVAL;
  
+	if (obj->import_attach)

+   return dma_buf_mmap(obj->dma_buf, vma, 0);


If we start enabling mmapping of imported dma-bufs on a majority of 
drivers in this way, how do we ensure that user-space is not blindly 
using the object mmap without calling the needed DMA_BUF_IOCTL_SYNC, 
which is required before and after CPU access of mmap'ed dma-bufs?
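
For reference, the bracketing that user-space is expected to do around CPU
access of an mmap'ed dma-buf looks roughly like the sketch below. This is
only an illustration against the generic dma-buf sync UAPI; the dmabuf fd,
the mapping and the memcpy() are placeholders, not code from any particular
driver.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Bracket a CPU write to an mmap'ed dma-buf with the required sync calls. */
static int cpu_write_dmabuf(int dmabuf_fd, void *map, const void *src, size_t len)
{
	struct dma_buf_sync sync = { .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE };

	/* Tell the exporter that CPU write access begins. */
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync))
		return -1;

	memcpy(map, src, len);

	/* ...and that it has ended, so caches/coherency can be handled. */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}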


I was under the impression (admittedly without looking) that the few 
drivers that actually called into dma_buf_mmap() had some private 
user-mode driver code in place that ensured this happened.


/Thomas



+
/* Take a ref for this mapping of the object, so that the fault
 * handler can dereference the mmap offset's pointer to the object.
 * This reference is cleaned up by the corresponding vm_close
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 8ad0e02991ca..6190f5018986 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -609,17 +609,8 @@ EXPORT_SYMBOL_GPL(drm_gem_shmem_vm_ops);
   */
  int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct *vma)
  {
-   struct drm_gem_object *obj = &shmem->base;
int ret;
  
-	if (obj->import_attach) {

-   /* Drop the reference drm_gem_mmap_obj() acquired.*/
-   drm_gem_object_put(obj);
-   vma->vm_private_data = NULL;
-
-   return dma_buf_mmap(obj->dma_buf, vma, 0);
-   }
-
ret = drm_gem_shmem_get_pages(shmem);
if (ret) {
drm_gem_vm_close(vma);
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 7c7dd84e6db8..f92aa20d63bb 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -564,6 +564,10 @@ int __tegra_gem_mmap(struct drm_gem_object *gem, struct vm_area_struct *vma)
  {
struct tegra_bo *bo = to_tegra_bo(gem);
  
+	/* imported dma-buf is mapped by drm_gem_mmap_obj() */

+   if (gem->import_attach)
+   return 0;
+
if (!bo->pages) {
unsigned long vm_pgoff = vma->vm_pgoff;
int err;


Re: [PATCH v6 02/14] mm: handling Non-LRU pages returned by vm_normal_pages

2022-06-28 Thread Sierra Guiza, Alejandro (Alex)



On 6/28/2022 5:42 AM, David Hildenbrand wrote:

On 28.06.22 02:14, Alex Sierra wrote:

With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
device-managed anonymous pages that are not LRU pages. Although they
behave like normal pages for purposes of mapping in CPU page tables and
for COW, they do not support LRU lists, NUMA migration or THP.

We also introduced a FOLL_LRU flag that adds the same behaviour to
follow_page and related APIs, to allow callers to specify that they
expect to put pages on an LRU list.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---

I think my review feedback regarding FOLL_LRU has been ignored.


Sorry David, this has been addressed in v7.

Regards,
Alex Sierra






[PATCH v7 14/14] tools: add selftests to hmm for COW in device memory

2022-06-28 Thread Alex Sierra
The objective is to test the device migration mechanism on pages marked
as COW, for both private and coherent device types. In case of writing to
COW private page(s), a page fault will migrate the pages back to system
memory first. Then, these pages will be duplicated. In case of the COW
device coherent type, pages are duplicated directly from device
memory.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
---
 tools/testing/selftests/vm/hmm-tests.c | 80 ++
 1 file changed, 80 insertions(+)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index bb38b9777610..716b62c05e3d 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -1874,4 +1874,84 @@ TEST_F(hmm, hmm_gup_test)
close(gup_fd);
hmm_buffer_free(buffer);
 }
+
+/*
+ * Test copy-on-write in device pages.
+ * In case of writing to COW private page(s), a page fault will migrate pages
+ * back to system memory first. Then, these pages will be duplicated. In case
+ * of COW device coherent type, pages are duplicated directly from device
+ * memory.
+ */
+TEST_F(hmm, hmm_cow_in_device)
+{
+   struct hmm_buffer *buffer;
+   unsigned long npages;
+   unsigned long size;
+   unsigned long i;
+   int *ptr;
+   int ret;
+   unsigned char *m;
+   pid_t pid;
+   int status;
+
+   npages = 4;
+   size = npages << self->page_shift;
+
+   buffer = malloc(sizeof(*buffer));
+   ASSERT_NE(buffer, NULL);
+
+   buffer->fd = -1;
+   buffer->size = size;
+   buffer->mirror = malloc(size);
+   ASSERT_NE(buffer->mirror, NULL);
+
+   buffer->ptr = mmap(NULL, size,
+  PROT_READ | PROT_WRITE,
+  MAP_PRIVATE | MAP_ANONYMOUS,
+  buffer->fd, 0);
+   ASSERT_NE(buffer->ptr, MAP_FAILED);
+
+   /* Initialize buffer in system memory. */
+   for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+   ptr[i] = i;
+
+   /* Migrate memory to device. */
+
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
+   ASSERT_EQ(ret, 0);
+   ASSERT_EQ(buffer->cpages, npages);
+
+   pid = fork();
+   if (pid == -1)
+   ASSERT_EQ(pid, 0);
+   if (!pid) {
+   /* Child process waits for SIGTERM from the parent. */
+   while (1) {
+   }
+   perror("Should not reach this\n");
+   exit(0);
+   }
+   /* Parent process writes to COW page(s) and gets a
+* new copy in system. In case of device private pages,
+* this write causes a migration to system mem first.
+*/
+   for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+   ptr[i] = i;
+
+   /* Terminate child and wait */
+   EXPECT_EQ(0, kill(pid, SIGTERM));
+   EXPECT_EQ(pid, waitpid(pid, &status, 0));
+   EXPECT_NE(0, WIFSIGNALED(status));
+   EXPECT_EQ(SIGTERM, WTERMSIG(status));
+
+   /* Take snapshot to CPU pagetables */
+   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_SNAPSHOT, buffer, npages);
+   ASSERT_EQ(ret, 0);
+   ASSERT_EQ(buffer->cpages, npages);
+   m = buffer->mirror;
+   for (i = 0; i < npages; i++)
+   ASSERT_EQ(HMM_DMIRROR_PROT_WRITE, m[i]);
+
+   hmm_buffer_free(buffer);
+}
 TEST_HARNESS_MAIN
-- 
2.32.0



[PATCH v7 13/14] tools: add hmm gup tests for device coherent type

2022-06-28 Thread Alex Sierra
The intention is to test the hmm device coherent type under different get
user pages paths. Also, test gup with the FOLL_LONGTERM flag set on
device coherent pages. These pages should get migrated back to system
memory.

Signed-off-by: Alex Sierra 
Reviewed-by: Alistair Popple 
---
 tools/testing/selftests/vm/hmm-tests.c | 110 +
 1 file changed, 110 insertions(+)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 4b547188ec40..bb38b9777610 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -36,6 +36,7 @@
  * in the usual include/uapi/... directory.
  */
 #include "../../../../lib/test_hmm_uapi.h"
+#include "../../../../mm/gup_test.h"
 
 struct hmm_buffer {
void*ptr;
@@ -59,6 +60,9 @@ enum {
 #define NTIMES 10
 
 #define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
+/* Just the flags we need, copied from mm.h: */
+#define FOLL_WRITE 0x01/* check pte is writable */
+#define FOLL_LONGTERM   0x10000 /* mapping lifetime is indefinite */
 
 FIXTURE(hmm)
 {
@@ -1764,4 +1768,110 @@ TEST_F(hmm, exclusive_cow)
hmm_buffer_free(buffer);
 }
 
+static int gup_test_exec(int gup_fd, unsigned long addr, int cmd,
+int npages, int size, int flags)
+{
+   struct gup_test gup = {
+   .nr_pages_per_call  = npages,
+   .addr   = addr,
+   .gup_flags  = FOLL_WRITE | flags,
+   .size   = size,
+   };
+
+   if (ioctl(gup_fd, cmd, &gup)) {
+   perror("ioctl on error\n");
+   return errno;
+   }
+
+   return 0;
+}
+
+/*
+ * Test get user device pages through gup_test. Setting PIN_LONGTERM flag.
+ * This should trigger a migration back to system memory for both private
+ * and coherent type pages.
+ * This test makes use of gup_test module. Make sure GUP_TEST_CONFIG is added
+ * to your configuration before you run it.
+ */
+TEST_F(hmm, hmm_gup_test)
+{
+   struct hmm_buffer *buffer;
+   int gup_fd;
+   unsigned long npages;
+   unsigned long size;
+   unsigned long i;
+   int *ptr;
+   int ret;
+   unsigned char *m;
+
+   gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+   if (gup_fd == -1)
+   SKIP(return, "Skipping test, could not find gup_test driver");
+
+   npages = 4;
+   size = npages << self->page_shift;
+
+   buffer = malloc(sizeof(*buffer));
+   ASSERT_NE(buffer, NULL);
+
+   buffer->fd = -1;
+   buffer->size = size;
+   buffer->mirror = malloc(size);
+   ASSERT_NE(buffer->mirror, NULL);
+
+   buffer->ptr = mmap(NULL, size,
+  PROT_READ | PROT_WRITE,
+  MAP_PRIVATE | MAP_ANONYMOUS,
+  buffer->fd, 0);
+   ASSERT_NE(buffer->ptr, MAP_FAILED);
+
+   /* Initialize buffer in system memory. */
+   for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+   ptr[i] = i;
+
+   /* Migrate memory to device. */
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
+   ASSERT_EQ(ret, 0);
+   ASSERT_EQ(buffer->cpages, npages);
+   /* Check what the device read. */
+   for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i)
+   ASSERT_EQ(ptr[i], i);
+
+   ASSERT_EQ(gup_test_exec(gup_fd,
+   (unsigned long)buffer->ptr,
+   GUP_BASIC_TEST, 1, self->page_size, 0), 0);
+   ASSERT_EQ(gup_test_exec(gup_fd,
+   (unsigned long)buffer->ptr + 1 * self->page_size,
+   GUP_FAST_BENCHMARK, 1, self->page_size, 0), 0);
+   ASSERT_EQ(gup_test_exec(gup_fd,
+   (unsigned long)buffer->ptr + 2 * self->page_size,
+   PIN_FAST_BENCHMARK, 1, self->page_size, FOLL_LONGTERM), 0);
+   ASSERT_EQ(gup_test_exec(gup_fd,
+   (unsigned long)buffer->ptr + 3 * self->page_size,
+   PIN_LONGTERM_BENCHMARK, 1, self->page_size, 0), 0);
+
+   /* Take snapshot to CPU pagetables */
+   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_SNAPSHOT, buffer, npages);
+   ASSERT_EQ(ret, 0);
+   ASSERT_EQ(buffer->cpages, npages);
+   m = buffer->mirror;
+   if (hmm_is_coherent_type(variant->device_number)) {
+   ASSERT_EQ(HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL | HMM_DMIRROR_PROT_WRITE, m[0]);
+   ASSERT_EQ(HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL | HMM_DMIRROR_PROT_WRITE, m[1]);
+   } else {
+   ASSERT_EQ(HMM_DMIRROR_PROT_WRITE, m[0]);
+   ASSERT_EQ(HMM_DMIRROR_PROT_WRITE, m[1]);
+   }
+   ASSERT_EQ(HMM_DMIRROR_PROT_WRITE, m[2]);
+   ASSERT_EQ(HMM_DMIRROR_PROT_WRITE, m[3]);
+   /*
+* Check again the cont

[PATCH v7 11/14] tools: update hmm-test to support device coherent type

2022-06-28 Thread Alex Sierra
Test cases such as migrate_fault and migrate_multiple were modified to
migrate explicitly from device to system memory, without the need for page
faults, when using the device coherent type.

The snapshot test case was updated to read the device memory type first
and, based on that, check the proper returned results. A migrate_ping_pong
test case was added to test explicit migration from device to system
memory for both private and coherent zone types.

Helpers to migrate from device to system memory and vice versa
were also added.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
Signed-off-by: Christoph Hellwig 
---
 tools/testing/selftests/vm/hmm-tests.c | 121 -
 1 file changed, 100 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 203323967b50..4b547188ec40 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -46,6 +46,13 @@ struct hmm_buffer {
uint64_tfaults;
 };
 
+enum {
+   HMM_PRIVATE_DEVICE_ONE,
+   HMM_PRIVATE_DEVICE_TWO,
+   HMM_COHERENCE_DEVICE_ONE,
+   HMM_COHERENCE_DEVICE_TWO,
+};
+
 #define TWOMEG (1 << 21)
 #define HMM_BUFFER_SIZE (1024 << 12)
 #define HMM_PATH_MAX64
@@ -60,6 +67,21 @@ FIXTURE(hmm)
unsigned intpage_shift;
 };
 
+FIXTURE_VARIANT(hmm)
+{
+   int device_number;
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_private)
+{
+   .device_number = HMM_PRIVATE_DEVICE_ONE,
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_coherent)
+{
+   .device_number = HMM_COHERENCE_DEVICE_ONE,
+};
+
 FIXTURE(hmm2)
 {
int fd0;
@@ -68,6 +90,24 @@ FIXTURE(hmm2)
unsigned intpage_shift;
 };
 
+FIXTURE_VARIANT(hmm2)
+{
+   int device_number0;
+   int device_number1;
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_private)
+{
+   .device_number0 = HMM_PRIVATE_DEVICE_ONE,
+   .device_number1 = HMM_PRIVATE_DEVICE_TWO,
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_coherent)
+{
+   .device_number0 = HMM_COHERENCE_DEVICE_ONE,
+   .device_number1 = HMM_COHERENCE_DEVICE_TWO,
+};
+
 static int hmm_open(int unit)
 {
char pathname[HMM_PATH_MAX];
@@ -81,12 +121,19 @@ static int hmm_open(int unit)
return fd;
 }
 
+static bool hmm_is_coherent_type(int dev_num)
+{
+   return (dev_num >= HMM_COHERENCE_DEVICE_ONE);
+}
+
 FIXTURE_SETUP(hmm)
 {
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
 
-   self->fd = hmm_open(0);
+   self->fd = hmm_open(variant->device_number);
+   if (self->fd < 0 && hmm_is_coherent_type(variant->device_number))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd, 0);
 }
 
@@ -95,9 +142,11 @@ FIXTURE_SETUP(hmm2)
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
 
-   self->fd0 = hmm_open(0);
+   self->fd0 = hmm_open(variant->device_number0);
+   if (self->fd0 < 0 && hmm_is_coherent_type(variant->device_number0))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd0, 0);
-   self->fd1 = hmm_open(1);
+   self->fd1 = hmm_open(variant->device_number1);
ASSERT_GE(self->fd1, 0);
 }
 
@@ -211,6 +260,20 @@ static void hmm_nanosleep(unsigned int n)
nanosleep(&t, NULL);
 }
 
+static int hmm_migrate_sys_to_dev(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_DEV, buffer, npages);
+}
+
+static int hmm_migrate_dev_to_sys(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_SYS, buffer, npages);
+}
+
 /*
  * Simple NULL test of device open/close.
  */
@@ -875,7 +938,7 @@ TEST_F(hmm, migrate)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -923,7 +986,7 @@ TEST_F(hmm, migrate_fault)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -936,7 +999,7 @@ TEST_F(hmm, migrate_fault)
ASSERT_EQ(ptr[i], i);
 
/* Migrate memory to the device again. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, 

[PATCH v7 05/14] mm: remove the vma check in migrate_vma_setup()

2022-06-28 Thread Alex Sierra
From: Alistair Popple 

migrate_vma_setup() checks that a valid vma is passed so that the page
tables can be walked to find the pfns associated with a given address
range. However, in some cases the pfns are already known, such as when
migrating device coherent pages during pin_user_pages(), meaning a valid
vma isn't required.
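
For illustration, the no-vma usage this enables looks roughly like the
minimal sketch below. It is simplified from the migrate_device_page()
helper added later in this series (error handling is reduced and the
function name is only illustrative):

/*
 * Migrate one device coherent page back to system memory. The source pfn
 * is already known, so no vma or page table walk is needed.
 */
static struct page *migrate_one_coherent_page(struct page *page)
{
	unsigned long src_pfn, dst_pfn = 0;
	struct migrate_vma args = {
		.src	= &src_pfn,
		.dst	= &dst_pfn,
		.cpages	= 1,
		.npages	= 1,
		.vma	= NULL,		/* allowed after this change */
	};
	struct page *dpage;

	lock_page(page);
	src_pfn = migrate_pfn(page_to_pfn(page)) | MIGRATE_PFN_MIGRATE;

	migrate_vma_setup(&args);
	if (!(src_pfn & MIGRATE_PFN_MIGRATE))
		return NULL;

	dpage = alloc_page(GFP_USER | __GFP_NOWARN);
	if (dpage) {
		lock_page(dpage);
		dst_pfn = migrate_pfn(page_to_pfn(dpage));
	}

	/* If no destination page was allocated, migration is aborted here. */
	migrate_vma_pages(&args);
	if (src_pfn & MIGRATE_PFN_MIGRATE)
		copy_highpage(dpage, page);
	migrate_vma_finalize(&args);

	return dpage;
}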

Signed-off-by: Alistair Popple 
Acked-by: Felix Kuehling 
Signed-off-by: Christoph Hellwig 
---
 mm/migrate_device.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 18bc6483f63a..cf9668376c5a 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -486,24 +486,24 @@ int migrate_vma_setup(struct migrate_vma *args)
 
args->start &= PAGE_MASK;
args->end &= PAGE_MASK;
-   if (!args->vma || is_vm_hugetlb_page(args->vma) ||
-   (args->vma->vm_flags & VM_SPECIAL) || vma_is_dax(args->vma))
-   return -EINVAL;
-   if (nr_pages <= 0)
-   return -EINVAL;
-   if (args->start < args->vma->vm_start ||
-   args->start >= args->vma->vm_end)
-   return -EINVAL;
-   if (args->end <= args->vma->vm_start || args->end > args->vma->vm_end)
-   return -EINVAL;
if (!args->src || !args->dst)
return -EINVAL;
-
-   memset(args->src, 0, sizeof(*args->src) * nr_pages);
-   args->cpages = 0;
-   args->npages = 0;
-
-   migrate_vma_collect(args);
+   if (args->vma) {
+   if (is_vm_hugetlb_page(args->vma) ||
+   (args->vma->vm_flags & VM_SPECIAL) || vma_is_dax(args->vma))
+   return -EINVAL;
+   if (args->start < args->vma->vm_start ||
+   args->start >= args->vma->vm_end)
+   return -EINVAL;
+   if (args->end <= args->vma->vm_start ||
+   args->end > args->vma->vm_end)
+   return -EINVAL;
+   memset(args->src, 0, sizeof(*args->src) * nr_pages);
+   args->cpages = 0;
+   args->npages = 0;
+
+   migrate_vma_collect(args);
+   }
 
if (args->cpages)
migrate_vma_unmap(args);
@@ -685,7 +685,7 @@ void migrate_vma_pages(struct migrate_vma *migrate)
continue;
}
 
-   if (!page) {
+   if (!page && migrate->vma) {
if (!(migrate->src[i] & MIGRATE_PFN_MIGRATE))
continue;
if (!notified) {
-- 
2.32.0



[PATCH v7 09/14] lib: test_hmm add module param for zone device type

2022-06-28 Thread Alex Sierra
In order to configure the device coherent type in test_hmm, two module
parameters must be passed, spm_addr_dev0 & spm_addr_dev1, which correspond
to the SP start address of each of the two devices. If no parameters are
passed, the private device type is configured.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
Signed-off-by: Christoph Hellwig 
---
 lib/test_hmm.c  | 73 -
 lib/test_hmm_uapi.h |  1 +
 2 files changed, 53 insertions(+), 21 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 915ef6b5b0d4..afb30af9f3ff 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -37,6 +37,16 @@
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+static unsigned long spm_addr_dev0;
+module_param(spm_addr_dev0, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev0,
+   "Specify start address for SPM (special purpose memory) used 
for device 0. By setting this Coherent device type will be used. Make sure 
spm_addr_dev1 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
+
+static unsigned long spm_addr_dev1;
+module_param(spm_addr_dev1, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev1,
+   "Specify start address for SPM (special purpose memory) used 
for device 1. By setting this Coherent device type will be used. Make sure 
spm_addr_dev0 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
+
 static const struct dev_pagemap_ops dmirror_devmem_ops;
 static const struct mmu_interval_notifier_ops dmirror_min_ops;
 static dev_t dmirror_dev;
@@ -455,28 +465,44 @@ static int dmirror_write(struct dmirror *dmirror, struct hmm_dmirror_cmd *cmd)
return ret;
 }
 
-static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
+static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
   struct page **ppage)
 {
struct dmirror_chunk *devmem;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long pfn;
unsigned long pfn_first;
unsigned long pfn_last;
void *ptr;
+   int ret = -ENOMEM;
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
-   return false;
+   return ret;
 
-   res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
- "hmm_dmirror");
-   if (IS_ERR(res))
+   switch (mdevice->zone_device_type) {
+   case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE:
+   res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
+ "hmm_dmirror");
+   if (IS_ERR_OR_NULL(res))
+   goto err_devmem;
+   devmem->pagemap.range.start = res->start;
+   devmem->pagemap.range.end = res->end;
+   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+   break;
+   case HMM_DMIRROR_MEMORY_DEVICE_COHERENT:
+   devmem->pagemap.range.start = (MINOR(mdevice->cdevice.dev) - 2) ?
+   spm_addr_dev0 :
+   spm_addr_dev1;
+   devmem->pagemap.range.end = devmem->pagemap.range.start +
+   DEVMEM_CHUNK_SIZE - 1;
+   devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
+   break;
+   default:
+   ret = -EINVAL;
goto err_devmem;
+   }
 
-   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
-   devmem->pagemap.range.start = res->start;
-   devmem->pagemap.range.end = res->end;
devmem->pagemap.nr_range = 1;
devmem->pagemap.ops = &dmirror_devmem_ops;
devmem->pagemap.owner = mdevice;
@@ -497,10 +523,14 @@ static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
mdevice->devmem_capacity = new_capacity;
mdevice->devmem_chunks = new_chunks;
}
-
ptr = memremap_pages(&devmem->pagemap, numa_node_id());
-   if (IS_ERR(ptr))
+   if (IS_ERR_OR_NULL(ptr)) {
+   if (ptr)
+   ret = PTR_ERR(ptr);
+   else
+   ret = -EFAULT;
goto err_release;
+   }
 
devmem->mdevice = mdevice;
pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
@@ -529,15 +559,17 @@ static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
}
spin_unlock(&mdevice->lock);
 
-   return true;
+   return 0;
 
 err_release:
mutex_unlock(&mdevice->devmem_lock);
-   release_mem_region(devmem->pagemap.range.start, range_len(&devmem->pagemap.range));
+   if (res && devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
+   release_mem_region(devmem->pagemap.range.start,
+  range_len(&devmem->pagemap.range));
 

[PATCH v7 12/14] tools: update test_hmm script to support SP config

2022-06-28 Thread Alex Sierra
Add two more parameters to set the spm_addr_dev0 & spm_addr_dev1
addresses. These two parameters configure the start SP
addresses for each device in the test_hmm driver.
Consequently, this configures the zone device type as coherent.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
Signed-off-by: Christoph Hellwig 
---
 tools/testing/selftests/vm/test_hmm.sh | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/test_hmm.sh 
b/tools/testing/selftests/vm/test_hmm.sh
index 0647b525a625..539c9371e592 100755
--- a/tools/testing/selftests/vm/test_hmm.sh
+++ b/tools/testing/selftests/vm/test_hmm.sh
@@ -40,11 +40,26 @@ check_test_requirements()
 
 load_driver()
 {
-   modprobe $DRIVER > /dev/null 2>&1
+   if [ $# -eq 0 ]; then
+   modprobe $DRIVER > /dev/null 2>&1
+   else
+   if [ $# -eq 2 ]; then
+   modprobe $DRIVER spm_addr_dev0=$1 spm_addr_dev1=$2
+   > /dev/null 2>&1
+   else
+   echo "Missing module parameters. Make sure pass"\
+   "spm_addr_dev0 and spm_addr_dev1"
+   usage
+   fi
+   fi
if [ $? == 0 ]; then
major=$(awk "\$2==\"HMM_DMIRROR\" {print \$1}" /proc/devices)
mknod /dev/hmm_dmirror0 c $major 0
mknod /dev/hmm_dmirror1 c $major 1
+   if [ $# -eq 2 ]; then
+   mknod /dev/hmm_dmirror2 c $major 2
+   mknod /dev/hmm_dmirror3 c $major 3
+   fi
fi
 }
 
@@ -58,7 +73,7 @@ run_smoke()
 {
echo "Running smoke test. Note, this test provides basic coverage."
 
-   load_driver
+   load_driver $1 $2
$(dirname "${BASH_SOURCE[0]}")/hmm-tests
unload_driver
 }
@@ -75,6 +90,9 @@ usage()
echo "# Smoke testing"
echo "./${TEST_NAME}.sh smoke"
echo
+   echo "# Smoke testing with SPM enabled"
+   echo "./${TEST_NAME}.sh smoke  "
+   echo
exit 0
 }
 
@@ -84,7 +102,7 @@ function run_test()
usage
else
if [ "$1" = "smoke" ]; then
-   run_smoke
+   run_smoke $2 $3
else
usage
fi
-- 
2.32.0



[PATCH v7 08/14] lib: test_hmm add ioctl to get zone device type

2022-06-28 Thread Alex Sierra
A new ioctl cmd is added to query the zone device type. This will be
used once test_hmm adds the zone device coherent type.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
Signed-off-by: Christoph Hellwig 
---
 lib/test_hmm.c  | 11 +--
 lib/test_hmm_uapi.h | 14 ++
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index cfe632047839..915ef6b5b0d4 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -87,6 +87,7 @@ struct dmirror_chunk {
 struct dmirror_device {
struct cdev cdevice;
struct hmm_devmem   *devmem;
+   unsigned intzone_device_type;
 
unsigned intdevmem_capacity;
unsigned intdevmem_count;
@@ -1260,14 +1261,20 @@ static void dmirror_device_remove(struct dmirror_device *mdevice)
 static int __init hmm_dmirror_init(void)
 {
int ret;
-   int id;
+   int id = 0;
+   int ndevices = 0;
 
ret = alloc_chrdev_region(&dmirror_dev, 0, DMIRROR_NDEVICES,
  "HMM_DMIRROR");
if (ret)
goto err_unreg;
 
-   for (id = 0; id < DMIRROR_NDEVICES; id++) {
+   memset(dmirror_devices, 0, DMIRROR_NDEVICES * sizeof(dmirror_devices[0]));
+   dmirror_devices[ndevices++].zone_device_type =
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   dmirror_devices[ndevices++].zone_device_type =
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   for (id = 0; id < ndevices; id++) {
ret = dmirror_device_init(dmirror_devices + id, id);
if (ret)
goto err_chrdev;
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index f14dea5dcd06..0511af7464ee 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -31,10 +31,11 @@ struct hmm_dmirror_cmd {
 /* Expose the address space of the calling process through hmm device file */
 #define HMM_DMIRROR_READ   _IOWR('H', 0x00, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_WRITE  _IOWR('H', 0x01, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_MIGRATE_IOWR('H', 0x02, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_SNAPSHOT   _IOWR('H', 0x03, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_EXCLUSIVE  _IOWR('H', 0x04, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_CHECK_EXCLUSIVE_IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_TO_DEV _IOWR('H', 0x02, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_TO_SYS _IOWR('H', 0x03, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_SNAPSHOT   _IOWR('H', 0x04, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_EXCLUSIVE  _IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_CHECK_EXCLUSIVE_IOWR('H', 0x06, struct hmm_dmirror_cmd)
 
 /*
  * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
@@ -62,4 +63,9 @@ enum {
HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
 };
 
+enum {
+   /* 0 is reserved to catch uninitialized type fields */
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
+};
+
 #endif /* _LIB_TEST_HMM_UAPI_H */
-- 
2.32.0



[PATCH v7 07/14] drm/amdkfd: add SPM support for SVM

2022-06-28 Thread Alex Sierra
When the CPU is connected through XGMI, it has coherent
access to the VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which could
be DEVICE_PRIVATE or DEVICE_COHERENT, to create the device
page map region.
Also, the MIGRATE_VMA_SELECT_DEVICE_COHERENT flag is selected for
the coherent type case during migration to the device.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 34 +++-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index e44376c2ecdc..f73e3e340413 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -671,13 +671,15 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
migrate.vma = vma;
migrate.start = start;
migrate.end = end;
-   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
+   if (adev->gmc.xgmi.connected_to_cpu)
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+   else
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
 
buf = kvcalloc(npages,
   2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t),
   GFP_KERNEL);
-
if (!buf)
goto out;
 
@@ -947,7 +949,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
 {
struct kfd_dev *kfddev = adev->kfd.dev;
struct dev_pagemap *pgmap;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long size;
void *r;
 
@@ -962,28 +964,34 @@ int svm_migrate_init(struct amdgpu_device *adev)
 * should remove reserved size
 */
size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-   res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
-   if (IS_ERR(res))
-   return -ENOMEM;
+   if (adev->gmc.xgmi.connected_to_cpu) {
+   pgmap->range.start = adev->gmc.aper_base;
+   pgmap->range.end = adev->gmc.aper_base + adev->gmc.aper_size - 1;
+   pgmap->type = MEMORY_DEVICE_COHERENT;
+   } else {
+   res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
+   if (IS_ERR(res))
+   return -ENOMEM;
+   pgmap->range.start = res->start;
+   pgmap->range.end = res->end;
+   pgmap->type = MEMORY_DEVICE_PRIVATE;
+   }
 
-   pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_range = 1;
-   pgmap->range.start = res->start;
-   pgmap->range.end = res->end;
pgmap->ops = &svm_migrate_pgmap_ops;
pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-   pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
-
+   pgmap->flags = 0;
/* Device manager releases device-specific resources, memory region and
 * pgmap when driver disconnects from device.
 */
r = devm_memremap_pages(adev->dev, pgmap);
if (IS_ERR(r)) {
pr_err("failed to register HMM device memory\n");
-
/* Disable SVM support capability */
pgmap->type = 0;
-   devm_release_mem_region(adev->dev, res->start, resource_size(res));
+   if (pgmap->type == MEMORY_DEVICE_PRIVATE)
+   devm_release_mem_region(adev->dev, res->start,
+   res->end - res->start + 1);
return PTR_ERR(r);
}
 
-- 
2.32.0



[PATCH v7 06/14] mm/gup: migrate device coherent pages when pinning instead of failing

2022-06-28 Thread Alex Sierra
From: Alistair Popple 

Currently any attempts to pin a device coherent page will fail. This is
because device coherent pages need to be managed by a device driver, and
pinning them would prevent a driver from migrating them off the device.

However, this is no reason to fail pinning of these pages. They are
coherent and accessible from the CPU, so they can be migrated just like
pinned ZONE_MOVABLE pages. So instead of failing all attempts to pin
them, first try migrating them out of ZONE_DEVICE.

Signed-off-by: Alistair Popple 
Acked-by: Felix Kuehling 
[hch: rebased to the split device memory checks,
  moved migrate_device_page to migrate_device.c]
Signed-off-by: Christoph Hellwig 
---
 mm/gup.c| 47 +++-
 mm/internal.h   |  1 +
 mm/migrate_device.c | 53 +
 3 files changed, 96 insertions(+), 5 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index b65fe8bf5af4..9b6b9923d22d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1891,9 +1891,43 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
continue;
prev_folio = folio;
 
-   if (folio_is_longterm_pinnable(folio))
+   /*
+* Device private pages will get faulted in during gup so it
+* shouldn't be possible to see one here.
+*/
+   if (WARN_ON_ONCE(folio_is_device_private(folio))) {
+   ret = -EFAULT;
+   goto unpin_pages;
+   }
+
+   /*
+* Device coherent pages are managed by a driver and should not
+* be pinned indefinitely as it prevents the driver moving the
+* page. So when trying to pin with FOLL_LONGTERM instead try
+* to migrate the page out of device memory.
+*/
+   if (folio_is_device_coherent(folio)) {
+   WARN_ON_ONCE(PageCompound(&folio->page));
+
+   /*
+* Migration will fail if the page is pinned, so convert
+* the pin on the source page to a normal reference.
+*/
+   if (gup_flags & FOLL_PIN) {
+   get_page(&folio->page);
+   unpin_user_page(&folio->page);
+   }
+
+   pages[i] = migrate_device_page(&folio->page, gup_flags);
+   if (!pages[i]) {
+   ret = -EBUSY;
+   goto unpin_pages;
+   }
continue;
+   }
 
+   if (folio_is_longterm_pinnable(folio))
+   continue;
/*
 * Try to move out any movable page before pinning the range.
 */
@@ -1929,10 +1963,13 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
return nr_pages;
 
 unpin_pages:
-   if (gup_flags & FOLL_PIN) {
-   unpin_user_pages(pages, nr_pages);
-   } else {
-   for (i = 0; i < nr_pages; i++)
+   for (i = 0; i < nr_pages; i++) {
+   if (!pages[i])
+   continue;
+
+   if (gup_flags & FOLL_PIN)
+   unpin_user_page(pages[i]);
+   else
put_page(pages[i]);
}
 
diff --git a/mm/internal.h b/mm/internal.h
index c0f8fbe0445b..eeab4ee7a4a3 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -853,6 +853,7 @@ int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
  unsigned long addr, int page_nid, int *flags);
 
 void free_zone_device_page(struct page *page);
+struct page *migrate_device_page(struct page *page, unsigned int gup_flags);
 
 /*
  * mm/gup.c
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index cf9668376c5a..5decd26dd551 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -794,3 +794,56 @@ void migrate_vma_finalize(struct migrate_vma *migrate)
}
 }
 EXPORT_SYMBOL(migrate_vma_finalize);
+
+/*
+ * Migrate a device coherent page back to normal memory.  The caller should have
+ * a reference on page which will be copied to the new page if migration is
+ * successful or dropped on failure.
+ */
+struct page *migrate_device_page(struct page *page, unsigned int gup_flags)
+{
+   unsigned long src_pfn, dst_pfn = 0;
+   struct migrate_vma args;
+   struct page *dpage;
+
+   lock_page(page);
+   src_pfn = migrate_pfn(page_to_pfn(page)) | MIGRATE_PFN_MIGRATE;
+   args.src = &src_pfn;
+   args.dst = &dst_pfn;
+   args.cpages = 1;
+   args.npages = 1;
+   args.vma = NULL;
+   migrate_vma_setup(&args);
+   if (!(src_pfn & MIGRATE_PFN_MIGRATE))
+   return NULL;
+
+   dpage = alloc_pages(GFP_USER | __

[PATCH v7 10/14] lib: add support for device coherent type in test_hmm

2022-06-28 Thread Alex Sierra
Device Coherent type uses device memory that is coherently accessible by
the CPU. This could be shown as an SP (special purpose) memory range
in the BIOS-e820 memory enumeration. If no SP memory is supported in the
system, this could be faked by setting CONFIG_EFI_FAKE_MEMMAP.

Currently, test_hmm only supports two different SP ranges of at least
256MB size. This could be specified in the kernel parameter variable
efi_fake_mem. Ex. Two SP ranges of 1GB starting at 0x1 &
0x14000 physical address. Ex.
efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4

Private and coherent device mirror instances can be created in the same
probe. This is done by passing the module parameters spm_addr_dev0 &
spm_addr_dev1. In this case, it will create four instances of
device_mirror. The first two correspond to the private device type, the
last two to the coherent type. Then, they can be easily accessed from user
space through /dev/hmm_mirror<num_device>. Usually num_device 0 and 1
are for private, and 2 and 3 for coherent types. If no module
parameters are passed, only two instances of the private type
device_mirror will be created.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
---
 lib/test_hmm.c  | 253 +---
 lib/test_hmm_uapi.h |   4 +
 2 files changed, 196 insertions(+), 61 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index afb30af9f3ff..7930853e7fc5 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -32,11 +32,22 @@
 
 #include "test_hmm_uapi.h"
 
-#define DMIRROR_NDEVICES   2
+#define DMIRROR_NDEVICES   4
 #define DMIRROR_RANGE_FAULT_TIMEOUT1000
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+/*
+ * For device_private pages, dpage is just a dummy struct page
+ * representing a piece of device memory. dmirror_devmem_alloc_page
+ * allocates a real system memory page as backing storage to fake a
+ * real device. zone_device_data points to that backing page. But
+ * for device_coherent memory, the struct page represents real
+ * physical CPU-accessible memory that we can use directly.
+ */
+#define BACKING_PAGE(page) (is_device_private_page((page)) ? \
+  (page)->zone_device_data : (page))
+
 static unsigned long spm_addr_dev0;
 module_param(spm_addr_dev0, long, 0644);
 MODULE_PARM_DESC(spm_addr_dev0,
@@ -125,6 +136,21 @@ static int dmirror_bounce_init(struct dmirror_bounce *bounce,
return 0;
 }
 
+static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
+{
+   return (mdevice->zone_device_type ==
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ? true : false;
+}
+
+static enum migrate_vma_direction
+dmirror_select_device(struct dmirror *dmirror)
+{
+   return (dmirror->mdevice->zone_device_type ==
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ?
+   MIGRATE_VMA_SELECT_DEVICE_PRIVATE :
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+}
+
 static void dmirror_bounce_fini(struct dmirror_bounce *bounce)
 {
vfree(bounce->ptr);
@@ -575,16 +601,19 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
 {
struct page *dpage = NULL;
-   struct page *rpage;
+   struct page *rpage = NULL;
 
/*
-* This is a fake device so we alloc real system memory to store
-* our device memory.
+* For ZONE_DEVICE private type, this is a fake device so we allocate
+* real system memory to store our device memory.
+* For ZONE_DEVICE coherent type we use the actual dpage to store the
+* data and ignore rpage.
 */
-   rpage = alloc_page(GFP_HIGHUSER);
-   if (!rpage)
-   return NULL;
-
+   if (dmirror_is_private_zone(mdevice)) {
+   rpage = alloc_page(GFP_HIGHUSER);
+   if (!rpage)
+   return NULL;
+   }
spin_lock(&mdevice->lock);
 
if (mdevice->free_pages) {
@@ -603,7 +632,8 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
return dpage;
 
 error:
-   __free_page(rpage);
+   if (rpage)
+   __free_page(rpage);
return NULL;
 }
 
@@ -629,12 +659,16 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 * unallocated pte_none() or read-only zero page.
 */
spage = migrate_pfn_to_page(*src);
+   if (WARN(spage && is_zone_device_page(spage),
+"page already in device spage pfn: 0x%lx\n",
+page_to_pfn(spage)))
+   continue;
 
dpage = dmirror_devmem_alloc_page(mdevice);
if (!dpage)
continue;
 
-   rpage = dpage->zone_device_data;
+   rpage = BACKING_PAGE(dpage);
   

[PATCH v7 03/14] mm: handling Non-LRU pages returned by vm_normal_pages

2022-06-28 Thread Alex Sierra
With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
device-managed anonymous pages that are not LRU pages. Although they
behave like normal pages for purposes of mapping in CPU page tables and
for COW, they do not support LRU lists, NUMA migration or THP.

Callers of follow_page() that expect LRU pages now also check for device
zone pages, due to the DEVICE_COHERENT type.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling  (v2)
Reviewed-by: Alistair Popple  (v6)
---
 fs/proc/task_mmu.c | 2 +-
 mm/huge_memory.c   | 2 +-
 mm/khugepaged.c| 9 ++---
 mm/ksm.c   | 6 +++---
 mm/madvise.c   | 4 ++--
 mm/memory.c| 9 -
 mm/mempolicy.c | 2 +-
 mm/migrate.c   | 4 ++--
 mm/mlock.c | 2 +-
 mm/mprotect.c  | 2 +-
 10 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2d04e3470d4c..2dd8c8a66924 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1792,7 +1792,7 @@ static struct page *can_gather_numa_stats(pte_t pte, struct vm_area_struct *vma,
return NULL;
 
page = vm_normal_page(vma, addr, pte);
-   if (!page)
+   if (!page || is_zone_device_page(page))
return NULL;
 
if (PageReserved(page))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 834f288b3769..c47e95b02244 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2910,7 +2910,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 
if (IS_ERR(page))
continue;
-   if (!page)
+   if (!page || is_zone_device_page(page))
continue;
 
if (!is_transparent_hugepage(page))
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 16be62d493cd..671ac7800e53 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -618,7 +618,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
goto out;
}
page = vm_normal_page(vma, address, pteval);
-   if (unlikely(!page)) {
+   if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
result = SCAN_PAGE_NULL;
goto out;
}
@@ -1267,7 +1267,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
writable = true;
 
page = vm_normal_page(vma, _address, pteval);
-   if (unlikely(!page)) {
+   if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
result = SCAN_PAGE_NULL;
goto out_unmap;
}
@@ -1479,7 +1479,8 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
goto abort;
 
page = vm_normal_page(vma, addr, *pte);
-
+   if (WARN_ON_ONCE(page && is_zone_device_page(page)))
+   page = NULL;
/*
 * Note that uprobe, debugger, or MAP_PRIVATE may change the
 * page table, but the new page will not be a subpage of hpage.
@@ -1497,6 +1498,8 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
if (pte_none(*pte))
continue;
page = vm_normal_page(vma, addr, *pte);
+   if (WARN_ON_ONCE(page && is_zone_device_page(page)))
+   goto abort;
page_remove_rmap(page, vma, false);
}
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 54f78c9eecae..831b18a7a50b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
cond_resched();
page = follow_page(vma, addr,
FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
-   if (IS_ERR_OR_NULL(page))
+   if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
break;
if (PageKsm(page))
ret = handle_mm_fault(vma, addr,
@@ -560,7 +560,7 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
goto out;
 
page = follow_page(vma, addr, FOLL_GET);
-   if (IS_ERR_OR_NULL(page))
+   if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
goto out;
if (PageAnon(page)) {
flush_anon_page(vma, page, addr);
@@ -2308,7 +2308,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
if (ksm_test_exit(mm))
break;
*page = follow_page(vma, ksm_scan.address, FOLL_GET);
-   if (IS_ERR_OR_NULL(*page)) {
+   if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
ksm_scan.address += PAGE_SIZE;
co

[PATCH v7 04/14] mm: add device coherent vma selection for memory migration

2022-06-28 Thread Alex Sierra
This case is used to migrate pages from device memory back to system
memory. Device coherent type memory is cache coherent from the device and
CPU point of view.

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
Signed-off-by: Christoph Hellwig 
---
 include/linux/migrate.h |  1 +
 mm/migrate_device.c | 12 +---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 069a89e847f3..b84908debe5c 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -148,6 +148,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 enum migrate_vma_direction {
MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
 };
 
 struct migrate_vma {
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index a4847ad65da3..18bc6483f63a 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -148,15 +148,21 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (is_writable_device_private_entry(entry))
mpfn |= MIGRATE_PFN_WRITE;
} else {
-   if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
-   goto next;
pfn = pte_pfn(pte);
-   if (is_zero_pfn(pfn)) {
+   if (is_zero_pfn(pfn) &&
+   (migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) {
mpfn = MIGRATE_PFN_MIGRATE;
migrate->cpages++;
goto next;
}
page = vm_normal_page(migrate->vma, addr, pte);
+   if (page && !is_zone_device_page(page) &&
+   !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
+   goto next;
+   else if (page && is_device_coherent_page(page) &&
+   (!(migrate->flags & 
MIGRATE_VMA_SELECT_DEVICE_COHERENT) ||
+page->pgmap->owner != migrate->pgmap_owner))
+   goto next;
mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
}
-- 
2.32.0



[PATCH v7 01/14] mm: rename is_pinnable_pages to is_pinnable_longterm_pages

2022-06-28 Thread Alex Sierra
is_pinnable_page() and folio_is_pinnable() were renamed to
is_longterm_pinnable_page() and folio_is_longterm_pinnable()
respectively. These functions are used in the FOLL_LONGTERM flag
context.
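
As a reminder of that context, the sketch below shows the kind of caller
these helpers ultimately serve: a long-term pin of user memory. This is
only an illustration (the wrapper name and parameters are made up);
pin_user_pages_fast() with FOLL_LONGTERM is the path that reaches
check_and_migrate_movable_pages(), where folio_is_longterm_pinnable()
is consulted.

/* Illustrative only: take a long-term pin on a user buffer. */
static int pin_user_buffer_longterm(unsigned long uaddr, int nr_pages,
				    struct page **pages)
{
	/*
	 * FOLL_LONGTERM makes GUP migrate pages that must not be pinned
	 * indefinitely (CMA, ZONE_MOVABLE) before taking the pin.
	 */
	return pin_user_pages_fast(uaddr, nr_pages,
				   FOLL_WRITE | FOLL_LONGTERM, pages);
}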

Signed-off-by: Alex Sierra 
---
 include/linux/memremap.h | 24 
 include/linux/mm.h   | 24 
 mm/gup.c |  4 ++--
 mm/gup_test.c|  4 ++--
 mm/hugetlb.c |  2 +-
 5 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 8af304f6b504..c272bd0af3c1 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -150,6 +150,30 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
page->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
 }
 
+/* MIGRATE_CMA and ZONE_MOVABLE do not allow pin pages */
+#ifdef CONFIG_MIGRATION
+static inline bool is_longterm_pinnable_page(struct page *page)
+{
+#ifdef CONFIG_CMA
+   int mt = get_pageblock_migratetype(page);
+
+   if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
+   return false;
+#endif
+   return !(is_zone_movable_page(page) ||
+is_zero_pfn(page_to_pfn(page)));
+}
+#else
+static inline bool is_longterm_pinnable_page(struct page *page)
+{
+   return true;
+}
+#endif
+static inline bool folio_is_longterm_pinnable(struct folio *folio)
+{
+   return is_longterm_pinnable_page(&folio->page);
+}
+
 #ifdef CONFIG_ZONE_DEVICE
 void *memremap_pages(struct dev_pagemap *pgmap, int nid);
 void memunmap_pages(struct dev_pagemap *pgmap);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index cf3d0d673f6b..bc0f201a4cff 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1590,30 +1590,6 @@ static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma,
return page_maybe_dma_pinned(page);
 }
 
-/* MIGRATE_CMA and ZONE_MOVABLE do not allow pin pages */
-#ifdef CONFIG_MIGRATION
-static inline bool is_pinnable_page(struct page *page)
-{
-#ifdef CONFIG_CMA
-   int mt = get_pageblock_migratetype(page);
-
-   if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
-   return false;
-#endif
-   return !is_zone_movable_page(page) || is_zero_pfn(page_to_pfn(page));
-}
-#else
-static inline bool is_pinnable_page(struct page *page)
-{
-   return true;
-}
-#endif
-
-static inline bool folio_is_pinnable(struct folio *folio)
-{
-   return is_pinnable_page(&folio->page);
-}
-
 static inline void set_page_zone(struct page *page, enum zone_type zone)
 {
page->flags &= ~(ZONES_MASK << ZONES_PGSHIFT);
diff --git a/mm/gup.c b/mm/gup.c
index 551264407624..b65fe8bf5af4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -133,7 +133,7 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
 * path.
 */
if (unlikely((flags & FOLL_LONGTERM) &&
-!is_pinnable_page(page)))
+!is_longterm_pinnable_page(page)))
return NULL;
 
/*
@@ -1891,7 +1891,7 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
continue;
prev_folio = folio;
 
-   if (folio_is_pinnable(folio))
+   if (folio_is_longterm_pinnable(folio))
continue;
 
/*
diff --git a/mm/gup_test.c b/mm/gup_test.c
index d974dec19e1c..9d705ba6737e 100644
--- a/mm/gup_test.c
+++ b/mm/gup_test.c
@@ -1,5 +1,5 @@
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -53,7 +53,7 @@ static void verify_dma_pinned(unsigned int cmd, struct page **pages,
dump_page(page, "gup_test failure");
break;
} else if (cmd == PIN_LONGTERM_BENCHMARK &&
-   WARN(!is_pinnable_page(page),
+   WARN(!is_longterm_pinnable_page(page),
 "pages[%lu] is NOT pinnable but pinned\n",
 i)) {
dump_page(page, "gup_test failure");
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a57e1be41401..368fd33787b0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1135,7 +1135,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
 
lockdep_assert_held(&hugetlb_lock);
list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
-   if (pin && !is_pinnable_page(page))
+   if (pin && !is_longterm_pinnable_page(page))
continue;
 
if (PageHWPoison(page))
-- 
2.32.0



[PATCH v7 02/14] mm: add zone device coherent type memory support

2022-06-28 Thread Alex Sierra
Device memory that is cache coherent from device and CPU point of view.
This is used on platforms that have an advanced system bus (like CAPI
or CXL). Any page of a process can be migrated to such memory. However,
no one should be allowed to pin such memory so that it can always be
evicted.
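
For illustration, a driver would hotplug such memory roughly as in the
sketch below (not taken from this patch; the ops, device and address range
are placeholders, and the real users are the test_hmm and amdkfd patches
later in this series):

/* Placeholder pagemap ops; a real driver returns pages to its allocator. */
static void my_devmem_page_free(struct page *page)
{
}

static const struct dev_pagemap_ops my_devmem_ops = {
	.page_free = my_devmem_page_free,
};

static void *register_coherent_devmem(struct device *dev,
				      struct dev_pagemap *pgmap,
				      u64 base, u64 size)
{
	pgmap->type = MEMORY_DEVICE_COHERENT;
	pgmap->range.start = base;
	pgmap->range.end = base + size - 1;
	pgmap->nr_range = 1;
	pgmap->ops = &my_devmem_ops;
	pgmap->owner = dev;	/* matched against migrate_vma.pgmap_owner */

	/* Creates struct pages for the range; they can then be migrated
	 * to and from system memory like device private pages. */
	return devm_memremap_pages(dev, pgmap);
}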

Signed-off-by: Alex Sierra 
Acked-by: Felix Kuehling 
Reviewed-by: Alistair Popple 
[hch: rebased ontop of the refcount changes,
  removed is_dev_private_or_coherent_page]
Signed-off-by: Christoph Hellwig 
---
 include/linux/memremap.h | 22 +-
 mm/memcontrol.c  |  7 ---
 mm/memory-failure.c  |  8 ++--
 mm/memremap.c| 10 ++
 mm/migrate_device.c  | 16 +++-
 mm/rmap.c|  5 +++--
 6 files changed, 51 insertions(+), 17 deletions(-)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index c272bd0af3c1..6fc0ced64b2d 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -41,6 +41,13 @@ struct vmem_altmap {
  * A more complete discussion of unaddressable memory may be found in
  * include/linux/hmm.h and Documentation/vm/hmm.rst.
  *
+ * MEMORY_DEVICE_COHERENT:
+ * Device memory that is cache coherent from device and CPU point of view. This
+ * is used on platforms that have an advanced system bus (like CAPI or CXL). A
+ * driver can hotplug the device memory using ZONE_DEVICE and with that memory
+ * type. Any page of a process can be migrated to such memory. However no one
+ * should be allowed to pin such memory so that it can always be evicted.
+ *
  * MEMORY_DEVICE_FS_DAX:
  * Host memory that has similar access semantics as System RAM i.e. DMA
  * coherent and supports page pinning. In support of coordinating page
@@ -61,6 +68,7 @@ struct vmem_altmap {
 enum memory_type {
/* 0 is reserved to catch uninitialized type fields */
MEMORY_DEVICE_PRIVATE = 1,
+   MEMORY_DEVICE_COHERENT,
MEMORY_DEVICE_FS_DAX,
MEMORY_DEVICE_GENERIC,
MEMORY_DEVICE_PCI_P2PDMA,
@@ -143,6 +151,17 @@ static inline bool folio_is_device_private(const struct folio *folio)
return is_device_private_page(&folio->page);
 }
 
+static inline bool is_device_coherent_page(const struct page *page)
+{
+   return is_zone_device_page(page) &&
+   page->pgmap->type == MEMORY_DEVICE_COHERENT;
+}
+
+static inline bool folio_is_device_coherent(const struct folio *folio)
+{
+   return is_device_coherent_page(&folio->page);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
@@ -160,7 +179,8 @@ static inline bool is_longterm_pinnable_page(struct page *page)
if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
return false;
 #endif
-   return !(is_zone_movable_page(page) ||
+   return !(is_device_coherent_page(page) ||
+is_zone_movable_page(page) ||
 is_zero_pfn(page_to_pfn(page)));
 }
 #else
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 618c366a2f07..5d37a85c67da 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5665,8 +5665,8 @@ static int mem_cgroup_move_account(struct page *page,
  *   2(MC_TARGET_SWAP): if the swap entry corresponding to this pte is a
  * target for charge migration. if @target is not NULL, the entry is stored
  * in target->ent.
- *   3(MC_TARGET_DEVICE): like MC_TARGET_PAGE  but page is MEMORY_DEVICE_PRIVATE
- * (so ZONE_DEVICE page and thus not on the lru).
+ *   3(MC_TARGET_DEVICE): like MC_TARGET_PAGE  but page is device memory and
+ *   thus not on the lru.
  * For now we such page is charge like a regular page would be as for all
  * intent and purposes it is just special memory taking the place of a
  * regular page.
@@ -5704,7 +5704,8 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
 */
if (page_memcg(page) == mc.from) {
ret = MC_TARGET_PAGE;
-   if (is_device_private_page(page))
+   if (is_device_private_page(page) ||
+   is_device_coherent_page(page))
ret = MC_TARGET_DEVICE;
if (target)
target->page = page;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index da39ec8afca8..79f175eeb190 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1685,12 +1685,16 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
goto unlock;
}
 
-   if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
+   switch (pgmap->type) {
+   case MEMORY_DEVICE_PRIVATE:
+   case MEMORY_DEVICE_COHERENT:
/*
-* TODO: Handle HMM pages which may need coordination
+* TODO: Handle device pages which may need coordination
 * with device-side memory.

[PATCH v7 00/14] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-06-28 Thread Alex Sierra
This is our MEMORY_DEVICE_COHERENT patch series rebased and updated
for current 5.19.0-rc4

Changes since the last version:
- Fixed problems with migration during long-term pinning in
get_user_pages
- Open coded vm_normal_lru_pages as suggested in previous code review
- Update hmm_gup_test with more get_user_pages calls, include
hmm_cow_in_device in hmm-test.

This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like
MEMORY_DEVICE_PRIVATE.

This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.

System stability and performance are not affected according to our
ongoing testing, including xfstests.

How it works: The system BIOS advertises the GPU device memory
(aka VRAM) as SPM (special purpose memory) in the UEFI system address
map.

The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
this hardware page migration capability is the Frontier supercomputer
project. This functionality is not AMD-specific. We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.

Our test nodes in the lab are similar to the Frontier configuration,
with .5 TB of system memory plus 256 GB of device memory split across
4 GPUs, all in a single coherent address space. Page migration is
expected to improve application efficiency significantly. We will
report empirical results as they become available.

Coherent device type pages at gup are now migrated back to system
memory if they are being pinned long-term (FOLL_LONGTERM). The reason
is that long-term pinning would interfere with the device memory
manager owning the device-coherent pages (e.g. evictions in TTM).
This series incorporates Alistair Popple's patches to do this
migration from pin_user_pages() calls. hmm_gup_test has been added to
hmm-test to test different get_user_pages() calls.

This series includes handling of device-managed anonymous pages
returned by vm_normal_pages. Although they behave like normal pages
for purposes of mapping in CPU page tables and for COW, they do not
support LRU lists, NUMA migration or THP.
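
As an illustrative example only (not taken from the series) of what a
caller that expects an LRU page has to do with such pages:

/* Illustrative only; not a helper from the series. */
static struct page *example_get_lru_page(struct vm_area_struct *vma,
					 unsigned long addr, pte_t pte)
{
	struct page *page = vm_normal_page(vma, addr, pte);

	/*
	 * Device-managed anonymous pages map and COW like normal pages,
	 * but must not be put on an LRU list, NUMA-migrated or collapsed
	 * into a THP.
	 */
	if (page && is_device_coherent_page(page))
		return NULL;

	return page;
}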

We also introduced a FOLL_LRU flag that adds the same behaviour to
follow_page and related APIs, to allow callers to specify that they
expect to put pages on an LRU list.

v2:
- Rebase to latest 5.18-rc7.
- Drop patch "mm: add device coherent checker to remove migration pte"
and modify try_to_migrate_one, to let DEVICE_COHERENT pages fall
through to normal page path. Based on Alistair Popple's comment.
- Fix comment formatting.
- Reword comment in vm_normal_page about pte_devmap().
- Merge "drm/amdkfd: coherent type as sys mem on migration to ram" to
"drm/amdkfd: add SPM support for SVM".

v3:
- Rebase to latest 5.18.0.
- Patch "mm: handling Non-LRU pages returned by vm_normal_pages"
reordered.
- Add WARN_ON_ONCE for thp device coherent case.

v4:
- Rebase to latest 5.18.0
- Fix consistency between pages with FOLL_LRU flag set and pte_devmap
at follow_page_pte.

v5:
- Remove unused zone_device_type from lib/test_hmm and
selftest/vm/hmm-test.c.

v6:
- Rebase to 5.19.0-rc4
- Rename is_pinnable_page to is_longterm_pinnable_page and add a
coherent device checker.
- Add a new gup test to hmm-test to cover fast pinnable case with
FOLL_LONGTERM.

v7:
- Reorder patch series.
- Remove FOLL_LRU and check on each caller for LRU pages handling
instead.

Alex Sierra (12):
  mm: rename is_pinnable_pages to is_pinnable_longterm_pages
  mm: add zone device coherent type memory support
  mm: handling Non-LRU pages returned by vm_normal_pages
  mm: add device coherent vma selection for memory migration
  drm/amdkfd: add SPM support for SVM
  lib: test_hmm add ioctl to get zone device type
  lib: test_hmm add module param for zone device type
  lib: add support for device coherent type in test_hmm
  tools: update hmm-test to support device coherent type
  tools: update test_hmm script to support SP config
  tools: add hmm gup tests for device coherent type
  tools: add selftests to hmm for COW in device memory

Alistair Popple (2):
  mm: remove the vma check in migrate_vma_setup()
  mm/gup: migrate device coherent pages when pinning instead of failing

 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  34 ++-
 fs/proc/task_mmu.c   |   2 +-
 include/linux/memremap.h |  44 +++
 include/linux/migrate.h  |   1 +
 include/linux/mm.h   |  24 --
 lib/test_hmm.c   | 337 +--
 lib/test_hmm_uapi.h  |  19 +-
 mm/gup.c |  49 +++-
 mm/gup_test.c|   4 +-
 mm/huge_memory.c |   2 +-
 mm/hugetlb.c |   2 +-
 mm/internal.h   

[PATCH 7/7] Revert "drm/amdgpu/gmc11: avoid cpu accessing registers to flush VM"

2022-06-28 Thread Jack Xiao
This reverts commit 5af39cf2fbadbaac1a04c94a604b298a9a325670,
since the driver now enables MES to access registers.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 51 +-
 1 file changed, 1 insertion(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
index 2be785cfc6dc..cd6b97d7184f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
@@ -261,12 +261,6 @@ static void gmc_v11_0_flush_vm_hub(struct amdgpu_device 
*adev, uint32_t vmid,
 static void gmc_v11_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
uint32_t vmhub, uint32_t flush_type)
 {
-   struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
-   struct dma_fence *fence;
-   struct amdgpu_job *job;
-
-   int r;
-
if ((vmhub == AMDGPU_GFXHUB_0) && !adev->gfx.is_poweron)
return;
 
@@ -290,51 +284,8 @@ static void gmc_v11_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
}
 
mutex_lock(&adev->mman.gtt_window_lock);
-
-   if (vmhub == AMDGPU_MMHUB_0) {
-   gmc_v11_0_flush_vm_hub(adev, vmid, AMDGPU_MMHUB_0, 0);
-   mutex_unlock(&adev->mman.gtt_window_lock);
-   return;
-   }
-
-   BUG_ON(vmhub != AMDGPU_GFXHUB_0);
-
-   if (!adev->mman.buffer_funcs_enabled ||
-   !adev->ib_pool_ready ||
-   amdgpu_in_reset(adev) ||
-   ring->sched.ready == false) {
-   gmc_v11_0_flush_vm_hub(adev, vmid, AMDGPU_GFXHUB_0, 0);
-   mutex_unlock(&adev->mman.gtt_window_lock);
-   return;
-   }
-
-   r = amdgpu_job_alloc_with_ib(adev, 16 * 4, AMDGPU_IB_POOL_IMMEDIATE,
-&job);
-   if (r)
-   goto error_alloc;
-
-   job->vm_pd_addr = amdgpu_gmc_pd_addr(adev->gart.bo);
-   job->vm_needs_flush = true;
-   job->ibs->ptr[job->ibs->length_dw++] = ring->funcs->nop;
-   amdgpu_ring_pad_ib(ring, &job->ibs[0]);
-   r = amdgpu_job_submit(job, &adev->mman.entity,
- AMDGPU_FENCE_OWNER_UNDEFINED, &fence);
-   if (r)
-   goto error_submit;
-
-   mutex_unlock(&adev->mman.gtt_window_lock);
-
-   dma_fence_wait(fence, false);
-   dma_fence_put(fence);
-
-   return;
-
-error_submit:
-   amdgpu_job_free(job);
-
-error_alloc:
+   gmc_v11_0_flush_vm_hub(adev, vmid, vmhub, 0);
mutex_unlock(&adev->mman.gtt_window_lock);
-   DRM_ERROR("Error flushing GPU TLB using the SDMA (%d)!\n", r);
return;
 }
 
-- 
2.35.1



[PATCH 5/7] drm/amdgpu: enable mes to access registers v2

2022-06-28 Thread Jack Xiao
Enable mes to access registers.

v2: squash mes sched ring enablement flag

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  | 8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 6 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c   | 1 +
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 5d6b04fc6206..9c8e4cd488b2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -699,6 +699,9 @@ uint32_t amdgpu_kiq_rreg(struct amdgpu_device *adev, 
uint32_t reg)
if (amdgpu_device_skip_hw_access(adev))
return 0;
 
+   if (adev->mes.ring.sched.ready)
+   return amdgpu_mes_rreg(adev, reg);
+
BUG_ON(!ring->funcs->emit_rreg);
 
spin_lock_irqsave(&kiq->ring_lock, flags);
@@ -766,6 +769,11 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t 
reg, uint32_t v)
if (amdgpu_device_skip_hw_access(adev))
return;
 
+   if (adev->mes.ring.sched.ready) {
+   amdgpu_mes_wreg(adev, reg, v);
+   return;
+   }
+
spin_lock_irqsave(&kiq->ring_lock, flags);
amdgpu_ring_alloc(ring, 32);
amdgpu_ring_emit_wreg(ring, reg, v);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 84807dbf5563..8f824eaee3dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -79,6 +79,12 @@ void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device 
*adev,
unsigned long flags;
uint32_t seq;
 
+   if (adev->mes.ring.sched.ready) {
+   amdgpu_mes_reg_write_reg_wait(adev, reg0, reg1,
+ ref, mask);
+   return;
+   }
+
spin_lock_irqsave(&kiq->ring_lock, flags);
amdgpu_ring_alloc(ring, 32);
amdgpu_ring_emit_reg_write_reg_wait(ring, reg0, reg1,
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
index 9865ab1ce9e4..2be785cfc6dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
@@ -276,7 +276,7 @@ static void gmc_v11_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
/* For SRIOV run time, driver shouldn't access the register through MMIO
 * Directly use kiq to do the vm invalidation instead
 */
-   if (adev->gfx.kiq.ring.sched.ready && !adev->enable_mes &&
+   if ((adev->gfx.kiq.ring.sched.ready || adev->mes.ring.sched.ready) &&
(amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
struct amdgpu_vmhub *hub = &adev->vmhub[vmhub];
const unsigned eng = 17;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index e2aa1ebb3a00..2a6c7a680c62 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -1200,6 +1200,7 @@ static int mes_v11_0_hw_init(void *handle)
 * with MES enabled.
 */
adev->gfx.kiq.ring.sched.ready = false;
+   adev->mes.ring.sched.ready = true;
 
return 0;
 
-- 
2.35.1



[PATCH 6/7] drm/amdgpu/mes: add mes ring test

2022-06-28 Thread Jack Xiao
Use read/write register to test mes ring.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 36 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  1 +
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  |  6 +
 3 files changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b6c2a5058b64..c18ea0bc00eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -926,6 +926,42 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, 
uint32_t reg,
return r;
 }
 
+int amdgpu_mes_ring_test_ring(struct amdgpu_device *adev)
+{
+   uint32_t scratch;
+   uint32_t tmp = 0;
+   unsigned i;
+   int r = 0;
+
+   r = amdgpu_gfx_scratch_get(adev, &scratch);
+   if (r) {
+   DRM_ERROR("amdgpu: mes failed to get scratch reg (%d).\n", r);
+   return r;
+   }
+
+   WREG32(scratch, 0xCAFEDEAD);
+
+   tmp = amdgpu_mes_rreg(adev, scratch);
+   if (tmp != 0xCAFEDEAD) {
+   DRM_ERROR("mes failed to read register\n");
+   goto error;
+   }
+
+   r = amdgpu_mes_wreg(adev, scratch, 0xDEADBEEF);
+   if (r)
+   goto error;
+
+   tmp = RREG32(scratch);
+   if (tmp != 0xDEADBEEF) {
+   DRM_ERROR("mes failed to write register\n");
+   r = -EIO;
+   }
+
+error:
+   amdgpu_gfx_scratch_free(adev, scratch);
+   return r;
+}
+
 static void
 amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
   struct amdgpu_ring *ring,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 93b2ba817916..81610e3f3059 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -341,6 +341,7 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, 
uint32_t reg,
 int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev,
  uint32_t reg0, uint32_t reg1,
  uint32_t ref, uint32_t mask);
+int amdgpu_mes_ring_test_ring(struct amdgpu_device *adev);
 
 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
int queue_type, int idx,
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 2a6c7a680c62..c4d085429d26 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -1194,6 +1194,12 @@ static int mes_v11_0_hw_init(void *handle)
goto failure;
}
 
+   r = amdgpu_mes_ring_test_ring(adev);
+   if (r) {
+   DRM_ERROR("MES ring test failed\n");
+   goto failure;
+   }
+
/*
 * Disable KIQ ring usage from the driver once MES is enabled.
 * MES uses KIQ ring exclusively so driver cannot access KIQ ring
-- 
2.35.1



[PATCH 4/7] drm/amdgpu/mes: add mes register access interface

2022-06-28 Thread Jack Xiao
Add mes register access routines:
1. read register
2. write register
3. wait register
4. write and wait register
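
For illustration only, not part of the patch: a caller would use these
roughly as follows, where reg_offset, reg0, reg1, ref and mask are
placeholders:

/* Illustrative only; register offsets, reference value and mask are placeholders. */
static int example_mes_reg_access(struct amdgpu_device *adev,
				  uint32_t reg_offset, uint32_t reg0,
				  uint32_t reg1, uint32_t ref, uint32_t mask)
{
	uint32_t val;
	int r;

	/* read-modify-write through the MES scheduler */
	val = amdgpu_mes_rreg(adev, reg_offset);
	r = amdgpu_mes_wreg(adev, reg_offset, val | 0x1);
	if (r)
		return r;

	/* write reg0, then poll reg1 until (reg1 & mask) == ref */
	return amdgpu_mes_reg_write_reg_wait(adev, reg0, reg1, ref, mask);
}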

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 132 +++-
 1 file changed, 131 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 2e86baa32c55..b6c2a5058b64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -189,15 +189,29 @@ int amdgpu_mes_init(struct amdgpu_device *adev)
 
r = amdgpu_device_wb_get(adev, &adev->mes.query_status_fence_offs);
if (r) {
+   amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
dev_err(adev->dev,
"(%d) query_status_fence_offs wb alloc failed\n", r);
-   return r;
+   goto error_ids;
}
adev->mes.query_status_fence_gpu_addr =
adev->wb.gpu_addr + (adev->mes.query_status_fence_offs * 4);
adev->mes.query_status_fence_ptr =
(uint64_t *)&adev->wb.wb[adev->mes.query_status_fence_offs];
 
+   r = amdgpu_device_wb_get(adev, &adev->mes.read_val_offs);
+   if (r) {
+   amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+   amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
+   dev_err(adev->dev,
+   "(%d) read_val_offs alloc failed\n", r);
+   goto error_ids;
+   }
+   adev->mes.read_val_gpu_addr =
+   adev->wb.gpu_addr + (adev->mes.read_val_offs * 4);
+   adev->mes.read_val_ptr =
+   (uint32_t *)&adev->wb.wb[adev->mes.read_val_offs];
+
r = amdgpu_mes_doorbell_init(adev);
if (r)
goto error;
@@ -206,6 +220,8 @@ int amdgpu_mes_init(struct amdgpu_device *adev)
 
 error:
amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+   amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
+   amdgpu_device_wb_free(adev, adev->mes.read_val_offs);
 error_ids:
idr_destroy(&adev->mes.pasid_idr);
idr_destroy(&adev->mes.gang_id_idr);
@@ -218,6 +234,8 @@ int amdgpu_mes_init(struct amdgpu_device *adev)
 void amdgpu_mes_fini(struct amdgpu_device *adev)
 {
amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+   amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
+   amdgpu_device_wb_free(adev, adev->mes.read_val_offs);
 
idr_destroy(&adev->mes.pasid_idr);
idr_destroy(&adev->mes.gang_id_idr);
@@ -796,6 +814,118 @@ int amdgpu_mes_unmap_legacy_queue(struct amdgpu_device 
*adev,
return r;
 }
 
+uint32_t amdgpu_mes_rreg(struct amdgpu_device *adev, uint32_t reg)
+{
+   struct mes_misc_op_input op_input;
+   int r, val = 0;
+
+   amdgpu_mes_lock(&adev->mes);
+
+   op_input.op = MES_MISC_OP_READ_REG;
+   op_input.read_reg.reg_offset = reg;
+   op_input.read_reg.buffer_addr = adev->mes.read_val_gpu_addr;
+
+   if (!adev->mes.funcs->misc_op) {
+   DRM_ERROR("mes rreg is not supported!\n");
+   goto error;
+   }
+
+   r = adev->mes.funcs->misc_op(&adev->mes, &op_input);
+   if (r)
+   DRM_ERROR("failed to read reg (0x%x)\n", reg);
+   else
+   val = *(adev->mes.read_val_ptr);
+
+error:
+   amdgpu_mes_unlock(&adev->mes);
+   return val;
+}
+
+int amdgpu_mes_wreg(struct amdgpu_device *adev,
+   uint32_t reg, uint32_t val)
+{
+   struct mes_misc_op_input op_input;
+   int r;
+
+   amdgpu_mes_lock(&adev->mes);
+
+   op_input.op = MES_MISC_OP_WRITE_REG;
+   op_input.write_reg.reg_offset = reg;
+   op_input.write_reg.reg_value = val;
+
+   if (!adev->mes.funcs->misc_op) {
+   DRM_ERROR("mes wreg is not supported!\n");
+   r = -EINVAL;
+   goto error;
+   }
+
+   r = adev->mes.funcs->misc_op(&adev->mes, &op_input);
+   if (r)
+   DRM_ERROR("failed to write reg (0x%x)\n", reg);
+
+error:
+   amdgpu_mes_unlock(&adev->mes);
+   return r;
+}
+
+int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask)
+{
+   struct mes_misc_op_input op_input;
+   int r;
+
+   amdgpu_mes_lock(&adev->mes);
+
+   op_input.op = MES_MISC_OP_WRM_REG_WR_WAIT;
+   op_input.wrm_reg.reg0 = reg0;
+   op_input.wrm_reg.reg1 = reg1;
+   op_input.wrm_reg.ref = ref;
+   op_input.wrm_reg.mask = mask;
+
+   if (!adev->mes.funcs->misc_op) {
+   DRM_ERROR("mes reg_write_reg_wait is not supported!\n");
+   r = -EINVAL;
+   goto error;
+   }
+
+   r = adev->mes.funcs->misc_op(&adev->mes, &op_input);
+   if (r)
+   DRM_ERROR("failed to reg_write_reg_wait\n");
+
+

[PATCH 3/7] drm/amdgpu/mes11: add mes11 misc op

2022-06-28 Thread Jack Xiao
Add misc op commands in mes11.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index d5200cbceb8a..e2aa1ebb3a00 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -262,6 +262,58 @@ static int mes_v11_0_query_sched_status(struct amdgpu_mes 
*mes)
&mes_status_pkt, sizeof(mes_status_pkt));
 }
 
+static int mes_v11_0_misc_op(struct amdgpu_mes *mes,
+struct mes_misc_op_input *input)
+{
+   union MESAPI__MISC misc_pkt;
+
+   memset(&misc_pkt, 0, sizeof(misc_pkt));
+
+   misc_pkt.header.type = MES_API_TYPE_SCHEDULER;
+   misc_pkt.header.opcode = MES_SCH_API_MISC;
+   misc_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
+
+   switch (input->op) {
+   case MES_MISC_OP_READ_REG:
+   misc_pkt.opcode = MESAPI_MISC__READ_REG;
+   misc_pkt.read_reg.reg_offset = input->read_reg.reg_offset;
+   misc_pkt.read_reg.buffer_addr = input->read_reg.buffer_addr;
+   break;
+   case MES_MISC_OP_WRITE_REG:
+   misc_pkt.opcode = MESAPI_MISC__WRITE_REG;
+   misc_pkt.write_reg.reg_offset = input->write_reg.reg_offset;
+   misc_pkt.write_reg.reg_value = input->write_reg.reg_value;
+   break;
+   case MES_MISC_OP_WRM_REG_WAIT:
+   misc_pkt.opcode = MESAPI_MISC__WAIT_REG_MEM;
+   misc_pkt.wait_reg_mem.op = WRM_OPERATION__WAIT_REG_MEM;
+   misc_pkt.wait_reg_mem.reference = input->wrm_reg.ref;
+   misc_pkt.wait_reg_mem.mask = input->wrm_reg.mask;
+   misc_pkt.wait_reg_mem.reg_offset1 = input->wrm_reg.reg0;
+   misc_pkt.wait_reg_mem.reg_offset2 = 0;
+   break;
+   case MES_MISC_OP_WRM_REG_WR_WAIT:
+   misc_pkt.opcode = MESAPI_MISC__WAIT_REG_MEM;
+   misc_pkt.wait_reg_mem.op = WRM_OPERATION__WR_WAIT_WR_REG;
+   misc_pkt.wait_reg_mem.reference = input->wrm_reg.ref;
+   misc_pkt.wait_reg_mem.mask = input->wrm_reg.mask;
+   misc_pkt.wait_reg_mem.reg_offset1 = input->wrm_reg.reg0;
+   misc_pkt.wait_reg_mem.reg_offset2 = input->wrm_reg.reg1;
+   break;
+   default:
+   DRM_ERROR("unsupported misc op (%d) \n", input->op);
+   return -EINVAL;
+   }
+
+   misc_pkt.api_status.api_completion_fence_addr =
+   mes->ring.fence_drv.gpu_addr;
+   misc_pkt.api_status.api_completion_fence_value =
+   ++mes->ring.fence_drv.sync_seq;
+
+   return mes_v11_0_submit_pkt_and_poll_completion(mes,
+   &misc_pkt, sizeof(misc_pkt));
+}
+
 static int mes_v11_0_set_debug_vmid(struct amdgpu_mes *mes,
   struct mes_debug_vmid_input *input)
 {
@@ -355,6 +407,7 @@ static const struct amdgpu_mes_funcs mes_v11_0_funcs = {
.suspend_gang = mes_v11_0_suspend_gang,
.resume_gang = mes_v11_0_resume_gang,
.set_debug_vmid = mes_v11_0_set_debug_vmid,
+   .misc_op = mes_v11_0_misc_op,
 };
 
 static int mes_v11_0_init_microcode(struct amdgpu_device *adev,
-- 
2.35.1



[PATCH 1/7] drm/amdgpu/mes11: update mes interface for accessing registers

2022-06-28 Thread Jack Xiao
Update MES firmware api for accessing registers.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/include/mes_v11_api_def.h | 37 +--
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/mes_v11_api_def.h 
b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
index 9a70af44614b..94776e6c3ad9 100644
--- a/drivers/gpu/drm/amd/include/mes_v11_api_def.h
+++ b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
@@ -508,27 +508,40 @@ union MESAPI__SET_DEBUG_VMID {
 };
 
 enum MESAPI_MISC_OPCODE {
-   MESAPI_MISC__MODIFY_REG,
+   MESAPI_MISC__WRITE_REG,
MESAPI_MISC__INV_GART,
MESAPI_MISC__QUERY_STATUS,
+   MESAPI_MISC__READ_REG,
+   MESAPI_MISC__WAIT_REG_MEM,
MESAPI_MISC__MAX,
 };
 
-enum MODIFY_REG_SUBCODE {
-   MODIFY_REG__OVERWRITE,
-   MODIFY_REG__RMW_OR,
-   MODIFY_REG__RMW_AND,
-   MODIFY_REG__MAX,
-};
-
 enum { MISC_DATA_MAX_SIZE_IN_DWORDS = 20 };
 
-struct MODIFY_REG {
-   enum MODIFY_REG_SUBCODE   subcode;
+struct WRITE_REG {
uint32_t  reg_offset;
uint32_t  reg_value;
 };
 
+struct READ_REG {
+   uint32_t  reg_offset;
+   uint64_t  buffer_addr;
+};
+
+enum WRM_OPERATION {
+   WRM_OPERATION__WAIT_REG_MEM,
+   WRM_OPERATION__WR_WAIT_WR_REG,
+   WRM_OPERATION__MAX,
+};
+
+struct WAIT_REG_MEM {
+   enum WRM_OPERATION op;
+   uint32_t   reference;
+   uint32_t   mask;
+   uint32_t   reg_offset1;
+   uint32_t   reg_offset2;
+};
+
 struct INV_GART {
uint64_t  inv_range_va_start;
uint64_t  inv_range_size;
@@ -545,9 +558,11 @@ union MESAPI__MISC {
struct MES_API_STATUS   api_status;
 
union {
-   struct  MODIFY_REG modify_reg;
+   struct  WRITE_REG write_reg;
struct  INV_GART inv_gart;
struct  QUERY_STATUS query_status;
+   struct  READ_REG read_reg;
+   struct  WAIT_REG_MEM wait_reg_mem;
uint32_tdata[MISC_DATA_MAX_SIZE_IN_DWORDS];
};
};
-- 
2.35.1



[PATCH 2/7] drm/amdgpu: add common interface for mes misc op

2022-06-28 Thread Jack Xiao
Add a common interface for MES misc ops, including the register
access interface.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 46 +
 1 file changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 92ddee5e33db..93b2ba817916 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -108,6 +108,10 @@ struct amdgpu_mes {
uint32_tquery_status_fence_offs;
uint64_tquery_status_fence_gpu_addr;
uint64_t*query_status_fence_ptr;
+   uint32_tread_val_offs;
+   uint64_tread_val_gpu_addr;
+   uint32_t*read_val_ptr;
+
uint32_tsaved_flags;
 
/* initialize kiq pipe */
@@ -246,6 +250,36 @@ struct mes_debug_vmid_input {
uint32_toa_mask;
 };
 
+enum mes_misc_opcode {
+   MES_MISC_OP_WRITE_REG,
+   MES_MISC_OP_READ_REG,
+   MES_MISC_OP_WRM_REG_WAIT,
+   MES_MISC_OP_WRM_REG_WR_WAIT,
+};
+
+struct mes_misc_op_input {
+   enum mes_misc_opcode op;
+
+   union {
+   struct {
+   uint32_t  reg_offset;
+   uint64_t  buffer_addr;
+   } read_reg;
+
+   struct {
+   uint32_t  reg_offset;
+   uint32_t  reg_value;
+   } write_reg;
+
+   struct {
+   uint32_t   ref;
+   uint32_t   mask;
+   uint32_t   reg0;
+   uint32_t   reg1;
+   } wrm_reg;
+   };
+};
+
 struct amdgpu_mes_funcs {
int (*add_hw_queue)(struct amdgpu_mes *mes,
struct mes_add_queue_input *input);
@@ -264,6 +298,9 @@ struct amdgpu_mes_funcs {
 
int (*set_debug_vmid)(struct amdgpu_mes *mes,
   struct mes_debug_vmid_input *input);
+
+   int (*misc_op)(struct amdgpu_mes *mes,
+  struct mes_misc_op_input *input);
 };
 
 #define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
@@ -296,6 +333,15 @@ int amdgpu_mes_unmap_legacy_queue(struct amdgpu_device 
*adev,
  enum amdgpu_unmap_queues_action action,
  u64 gpu_addr, u64 seq);
 
+uint32_t amdgpu_mes_rreg(struct amdgpu_device *adev, uint32_t reg);
+int amdgpu_mes_wreg(struct amdgpu_device *adev,
+   uint32_t reg, uint32_t val);
+int amdgpu_mes_reg_wait(struct amdgpu_device *adev, uint32_t reg,
+   uint32_t val, uint32_t mask);
+int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask);
+
 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
int queue_type, int idx,
struct amdgpu_mes_ctx_data *ctx_data,
-- 
2.35.1



SYNCOBJ TIMELINE Test failed while running amdgpu_test

2022-06-28 Thread Zhang, Jesse(Jie)

Hi Alex and Mario
We found that the “Syncobj timeline” test fails on Ubuntu 22 (kernel
version >= 5.15.34).
Failed log:
Suite: SYNCOBJ TIMELINE Tests
  Test: syncobj timeline test ...FAILED
1. sources/drm/tests/amdgpu/syncobj_tests.c:299  - 
CU_ASSERT_EQUAL(payload,18)
2. sources/drm/tests/amdgpu/syncobj_tests.c:309  - 
CU_ASSERT_EQUAL(payload,20)

We found that it is caused by this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab66fdace8581ef3b4e7cf5381a168ed4058d779.

So I added a patch, please help to review.

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 7e48dcd1bee4..d5db818f1c76 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -887,7 +887,7 @@ static int drm_syncobj_flatten_chain(struct dma_fence **f)
goto free_fences;

dma_fence_put(*f);
-   *f = &array->base;
+   *f = array->fences[0];
return 0;



Thanks
Jesse


RE: [PATCH] drm/amdgpu/display: reduce stack size in dml32_ModeSupportAndSystemConfigurationFull()

2022-06-28 Thread Chen, Guchun
Acked-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Tuesday, June 28, 2022 10:33 PM
To: Deucher, Alexander 
Cc: Stephen Rothwell ; Pillai, Aurabindo 
; Siqueira, Rodrigo ; 
amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu/display: reduce stack size in 
dml32_ModeSupportAndSystemConfigurationFull()

Ping?

Alex

On Wed, Jun 22, 2022 at 10:48 AM Alex Deucher  wrote:
>
> Move more stack variable in to dummy vars structure on the heap.
>
> Fixes stack frame size errors:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c: In 
> function 'dml32_ModeSupportAndSystemConfigurationFull':
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32
> .c:3833:1: error: the frame size of 2720 bytes is larger than 2048 
> bytes [-Werror=frame-larger-than=]
>  3833 | } // ModeSupportAndSystemConfigurationFull
>   | ^
>
> Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321")
> Cc: Stephen Rothwell 
> Cc: Aurabindo Pillai 
> Cc: Rodrigo Siqueira Jordao 
> Signed-off-by: Alex Deucher 
> ---
>  .../dc/dml/dcn32/display_mode_vba_32.c| 77 ---
>  .../drm/amd/display/dc/dml/display_mode_vba.h |  3 +-
>  2 files changed, 36 insertions(+), 44 deletions(-)
>
> diff --git 
> a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
> b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
> index 510b7a81ee12..7f144adb1e36 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
> @@ -1660,8 +1660,7 @@ static void 
> DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
>
>  void dml32_ModeSupportAndSystemConfigurationFull(struct 
> display_mode_lib *mode_lib)  {
> -   bool dummy_boolean[2];
> -   unsigned int dummy_integer[1];
> +   unsigned int dummy_integer[4];
> bool MPCCombineMethodAsNeededForPStateChangeAndVoltage;
> bool MPCCombineMethodAsPossible;
> enum odm_combine_mode dummy_odm_mode[DC__NUM_DPP__MAX]; @@ 
> -1973,10 +1972,10 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
> display_mode_lib *mode_l
> 
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[5],
>  /* LongDETBufferSizeInKByte[]  */
> 
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[6],
>  /* LongDETBufferSizeY[]  */
> 
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[7],
>  /* LongDETBufferSizeC[]  */
> -   &dummy_boolean[0], /* bool   
> *UnboundedRequestEnabled  */
> -   &dummy_integer[0], /* Long   
> *CompressedBufferSizeInkByte  */
> +   
> &v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[0][0],
>  /* bool   *UnboundedRequestEnabled  */
> +   
> &v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[0][0],
>  /* Long   *CompressedBufferSizeInkByte  */
> 
> mode_lib->vba.SingleDPPViewportSizeSupportPerSurface,/* bool 
> ViewportSizeSupportPerSurface[] */
> -   &dummy_boolean[1]); /* bool   
> *ViewportSizeSupport */
> +   
> &v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[1][0]);
>  /* bool   *ViewportSizeSupport */
>
> MPCCombineMethodAsNeededForPStateChangeAndVoltage = false;
> MPCCombineMethodAsPossible = false; @@ -2506,7 +2505,6 @@ void 
> dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
> //
> for (i = 0; i < (int) v->soc.num_states; ++i) {
> for (j = 0; j <= 1; ++j) {
> -   bool dummy_boolean_array[1][DC__NUM_DPP__MAX];
> for (k = 0; k < mode_lib->vba.NumberOfActiveSurfaces; 
> ++k) {
> mode_lib->vba.RequiredDPPCLKThisState[k] = 
> mode_lib->vba.RequiredDPPCLK[i][j][k];
> mode_lib->vba.NoOfDPPThisState[k] = 
> mode_lib->vba.NoOfDPP[i][j][k]; @@ -2570,7 +2568,7 @@ void 
> dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
> mode_lib->vba.DETBufferSizeCThisState,
> 
> &mode_lib->vba.UnboundedRequestEnabledThisState,
> 
> &mode_lib->vba.CompressedBufferSizeInkByteThisState,
> -   dummy_boolean_array[0],
> +   
> + v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_bool
> + ean_array[0],
> 
> &mode_lib->vba.ViewportSizeSupport[i][j]);
>
>  

RE: [PATCH] drm/amdkfd: Fix warnings from static analyzer Smatch

2022-06-28 Thread Errabolu, Ramesh

My responses are inline

-Original Message-
From: Kuehling, Felix  
Sent: Tuesday, June 28, 2022 6:41 PM
To: Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org; 
dan.carpen...@oracle.com
Subject: Re: [PATCH] drm/amdkfd: Fix warnings from static analyzer Smatch


On 2022-06-28 at 19:25, Ramesh Errabolu wrote:
> The patch fixes couple of warnings, as reported by Smatch a static 
> analyzer
>
> Signed-off-by: Ramesh Errabolu 
> Reported-by: Dan Carpenter 
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 36 ---
>   1 file changed, 19 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 25990bec600d..9d7b9ad70bc8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1417,15 +1417,17 @@ static int 
> kfd_create_indirect_link_prop(struct kfd_topology_device *kdev, int g
>   
>   /* find CPU <-->  CPU links */
>   cpu_dev = kfd_topology_device_by_proximity_domain(i);
> - if (cpu_dev) {
> - list_for_each_entry(cpu_link,
> - &cpu_dev->io_link_props, list) {
> - if (cpu_link->node_to == gpu_link->node_to)
> - break;
> - }
> - }
> + if (!cpu_dev)
> + continue;
> +
> + cpu_link = NULL;

This initialization is unnecessary. list_for_each_entry will always initialize 
it.


> + list_for_each_entry(cpu_link, &cpu_dev->io_link_props, list)
> + if (cpu_link->node_to == gpu_link->node_to)
> + break;
>   
> - if (cpu_link->node_to != gpu_link->node_to)
> + /* Ensures we didn't exit from list search with no hits */
> + if (list_entry_is_head(cpu_link, &cpu_dev->io_link_props, list) 
> ||
> + (cpu_link->node_to != gpu_link->node_to))

The second condition is redundant. If the list entry is not the head, 
the node_to must have already matched in the loop.

Ramesh: Syntactically, it is possible to walk down the list without a hit.
The check list_entry_is_head() is for that scenario.

But I'm not sure this solution is going to satisfy the static checker. It 
objects to using the iterator (cpu_link) outside the loop. I think a 
proper solution, that doesn't make any assumptions about how 
list_for_each_entry is implemented, would be to declare a separate 
variable as the iterator, and assign cpu_link in the loop only if there 
is a match.

Ramesh: Will wait for a response from Dan.

Regards,
   Felix


>   return -ENOMEM;
>   
>   /* CPU <--> CPU <--> GPU, GPU node*/
> @@ -1510,16 +1512,16 @@ static int kfd_add_peer_prop(struct 
> kfd_topology_device *kdev,
>   cpu_dev = 
> kfd_topology_device_by_proximity_domain(iolink1->node_to);
>   if (cpu_dev) {
>   list_for_each_entry(iolink3, &cpu_dev->io_link_props, 
> list)
> - if (iolink3->node_to == iolink2->node_to)
> + if (iolink3->node_to == iolink2->node_to) {
> + props->weight += iolink3->weight;
> + props->min_latency += 
> iolink3->min_latency;
> + props->max_latency += 
> iolink3->max_latency;
> + props->min_bandwidth = 
> min(props->min_bandwidth,
> + 
> iolink3->min_bandwidth);
> + props->max_bandwidth = 
> min(props->max_bandwidth,
> + 
> iolink3->max_bandwidth);
>   break;
> -
> - props->weight += iolink3->weight;
> - props->min_latency += iolink3->min_latency;
> - props->max_latency += iolink3->max_latency;
> - props->min_bandwidth = min(props->min_bandwidth,
> - iolink3->min_bandwidth);
> - props->max_bandwidth = min(props->max_bandwidth,
> - iolink3->max_bandwidth);
> + }
>   } else {
>   WARN(1, "CPU node not found");
>   }


Re: [PATCH] drm/amdkfd: Fix warnings from static analyzer Smatch

2022-06-28 Thread Felix Kuehling



On 2022-06-28 at 19:25, Ramesh Errabolu wrote:

The patch fixes couple of warnings, as reported by Smatch
a static analyzer

Signed-off-by: Ramesh Errabolu 
Reported-by: Dan Carpenter 
---
  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 36 ---
  1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 25990bec600d..9d7b9ad70bc8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1417,15 +1417,17 @@ static int kfd_create_indirect_link_prop(struct 
kfd_topology_device *kdev, int g
  
  		/* find CPU <-->  CPU links */

cpu_dev = kfd_topology_device_by_proximity_domain(i);
-   if (cpu_dev) {
-   list_for_each_entry(cpu_link,
-   &cpu_dev->io_link_props, list) {
-   if (cpu_link->node_to == gpu_link->node_to)
-   break;
-   }
-   }
+   if (!cpu_dev)
+   continue;
+
+   cpu_link = NULL;


This initialization is unnecessary. list_for_each_entry will always 
initialize it.




+   list_for_each_entry(cpu_link, &cpu_dev->io_link_props, list)
+   if (cpu_link->node_to == gpu_link->node_to)
+   break;
  
-		if (cpu_link->node_to != gpu_link->node_to)

+   /* Ensures we didn't exit from list search with no hits */
+   if (list_entry_is_head(cpu_link, &cpu_dev->io_link_props, list) 
||
+   (cpu_link->node_to != gpu_link->node_to))


The second condition is redundant. If the list entry is not the head, 
the node_to must have already matched in the loop.


But I'm not sure this solution is going to satisfy the static checker. It 
objects to using the iterator (cpu_link) outside the loop. I think a 
proper solution, that doesn't make any assumptions about how 
list_for_each_entry is implemented, would be to declare a separate 
variable as the iterator, and assign cpu_link in the loop only if there 
is a match.
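
A rough sketch of that shape, for illustration only (it assumes the
kfd_iolink_properties type that io_link_props already holds):

	struct kfd_iolink_properties *link;

	cpu_link = NULL;
	list_for_each_entry(link, &cpu_dev->io_link_props, list) {
		if (link->node_to == gpu_link->node_to) {
			cpu_link = link;
			break;
		}
	}

	if (!cpu_link)
		return -ENOMEM;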


Regards,
  Felix



return -ENOMEM;
  
  		/* CPU <--> CPU <--> GPU, GPU node*/

@@ -1510,16 +1512,16 @@ static int kfd_add_peer_prop(struct kfd_topology_device 
*kdev,
cpu_dev = 
kfd_topology_device_by_proximity_domain(iolink1->node_to);
if (cpu_dev) {
list_for_each_entry(iolink3, &cpu_dev->io_link_props, 
list)
-   if (iolink3->node_to == iolink2->node_to)
+   if (iolink3->node_to == iolink2->node_to) {
+   props->weight += iolink3->weight;
+   props->min_latency += 
iolink3->min_latency;
+   props->max_latency += 
iolink3->max_latency;
+   props->min_bandwidth = 
min(props->min_bandwidth,
+   
iolink3->min_bandwidth);
+   props->max_bandwidth = 
min(props->max_bandwidth,
+   
iolink3->max_bandwidth);
break;
-
-   props->weight += iolink3->weight;
-   props->min_latency += iolink3->min_latency;
-   props->max_latency += iolink3->max_latency;
-   props->min_bandwidth = min(props->min_bandwidth,
-   iolink3->min_bandwidth);
-   props->max_bandwidth = min(props->max_bandwidth,
-   iolink3->max_bandwidth);
+   }
} else {
WARN(1, "CPU node not found");
}


[PATCH] drm/amdkfd: Fix warnings from static analyzer Smatch

2022-06-28 Thread Ramesh Errabolu
The patch fixes a couple of warnings reported by Smatch,
a static analyzer.

Signed-off-by: Ramesh Errabolu 
Reported-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 36 ---
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 25990bec600d..9d7b9ad70bc8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1417,15 +1417,17 @@ static int kfd_create_indirect_link_prop(struct 
kfd_topology_device *kdev, int g
 
/* find CPU <-->  CPU links */
cpu_dev = kfd_topology_device_by_proximity_domain(i);
-   if (cpu_dev) {
-   list_for_each_entry(cpu_link,
-   &cpu_dev->io_link_props, list) {
-   if (cpu_link->node_to == gpu_link->node_to)
-   break;
-   }
-   }
+   if (!cpu_dev)
+   continue;
+
+   cpu_link = NULL;
+   list_for_each_entry(cpu_link, &cpu_dev->io_link_props, list)
+   if (cpu_link->node_to == gpu_link->node_to)
+   break;
 
-   if (cpu_link->node_to != gpu_link->node_to)
+   /* Ensures we didn't exit from list search with no hits */
+   if (list_entry_is_head(cpu_link, &cpu_dev->io_link_props, list) 
||
+   (cpu_link->node_to != gpu_link->node_to))
return -ENOMEM;
 
/* CPU <--> CPU <--> GPU, GPU node*/
@@ -1510,16 +1512,16 @@ static int kfd_add_peer_prop(struct kfd_topology_device 
*kdev,
cpu_dev = 
kfd_topology_device_by_proximity_domain(iolink1->node_to);
if (cpu_dev) {
list_for_each_entry(iolink3, &cpu_dev->io_link_props, 
list)
-   if (iolink3->node_to == iolink2->node_to)
+   if (iolink3->node_to == iolink2->node_to) {
+   props->weight += iolink3->weight;
+   props->min_latency += 
iolink3->min_latency;
+   props->max_latency += 
iolink3->max_latency;
+   props->min_bandwidth = 
min(props->min_bandwidth,
+   
iolink3->min_bandwidth);
+   props->max_bandwidth = 
min(props->max_bandwidth,
+   
iolink3->max_bandwidth);
break;
-
-   props->weight += iolink3->weight;
-   props->min_latency += iolink3->min_latency;
-   props->max_latency += iolink3->max_latency;
-   props->min_bandwidth = min(props->min_bandwidth,
-   iolink3->min_bandwidth);
-   props->max_bandwidth = min(props->max_bandwidth,
-   iolink3->max_bandwidth);
+   }
} else {
WARN(1, "CPU node not found");
}
-- 
2.35.1



Re: [PATCH v2] drm/amd/display: expose additional modifier for DCN32/321

2022-06-28 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Tue, Jun 28, 2022 at 10:25 PM Aurabindo Pillai
 wrote:
>
> [Why&How]
> Some userspace expect a backwards compatible modifier on DCN32/321. For
> hardware with num_pipes more than 16, we expose the most efficient
> modifier first. As a fall back method, we need to expose slightly inefficient
> modifier AMD_FMT_MOD_TILE_GFX9_64K_R_X after the best option.
>
> Also set the number of packers to fixed value as required per hardware
> documentation. This value is cached during hardware initialization and
> can be read through the base driver.
>
> Signed-off-by: Aurabindo Pillai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  3 +-
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 66 ++-
>  2 files changed, 36 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index 1a512d78673a..0f5bfe5df627 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -743,8 +743,7 @@ static int convert_tiling_flags_to_modifier(struct 
> amdgpu_framebuffer *afb)
> switch (version) {
> case AMD_FMT_MOD_TILE_VER_GFX11:
> pipe_xor_bits = min(block_size_bits - 8, 
> pipes);
> -   packers = min(block_size_bits - 8 - 
> pipe_xor_bits,
> -   
> ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs));
> +   packers = 
> ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs);
> break;
> case AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS:
> pipe_xor_bits = min(block_size_bits - 8, 
> pipes);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 98bb65377e98..adccaf2f539d 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5208,6 +5208,7 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
> int num_pkrs = 0;
> int pkrs = 0;
> u32 gb_addr_config;
> +   u8 i = 0;
> unsigned swizzle_r_x;
> uint64_t modifier_r_x;
> uint64_t modifier_dcc_best;
> @@ -5223,37 +5224,40 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
> num_pipes = 1 << REG_GET_FIELD(gb_addr_config, GB_ADDR_CONFIG, 
> NUM_PIPES);
> pipe_xor_bits = ilog2(num_pipes);
>
> -   /* R_X swizzle modes are the best for rendering and DCC requires 
> them. */
> -   swizzle_r_x = num_pipes > 16 ? AMD_FMT_MOD_TILE_GFX11_256K_R_X :
> -  AMD_FMT_MOD_TILE_GFX9_64K_R_X;
> -
> -   modifier_r_x = AMD_FMT_MOD |
> -   AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX11) |
> -   AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
> -   AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
> -   AMD_FMT_MOD_SET(PACKERS, pkrs);
> -
> -   /* DCC_CONSTANT_ENCODE is not set because it can't vary with gfx11 
> (it's implied to be 1). */
> -   modifier_dcc_best = modifier_r_x |
> -   AMD_FMT_MOD_SET(DCC, 1) |
> -   AMD_FMT_MOD_SET(DCC_INDEPENDENT_64B, 0) |
> -   AMD_FMT_MOD_SET(DCC_INDEPENDENT_128B, 1) |
> -   AMD_FMT_MOD_SET(DCC_MAX_COMPRESSED_BLOCK, 
> AMD_FMT_MOD_DCC_BLOCK_128B);
> -
> -   /* DCC settings for 4K and greater resolutions. (required by display 
> hw) */
> -   modifier_dcc_4k = modifier_r_x |
> -   AMD_FMT_MOD_SET(DCC, 1) |
> -   AMD_FMT_MOD_SET(DCC_INDEPENDENT_64B, 1) |
> -   AMD_FMT_MOD_SET(DCC_INDEPENDENT_128B, 1) |
> -   AMD_FMT_MOD_SET(DCC_MAX_COMPRESSED_BLOCK, 
> AMD_FMT_MOD_DCC_BLOCK_64B);
> -
> -   add_modifier(mods, size, capacity, modifier_dcc_best);
> -   add_modifier(mods, size, capacity, modifier_dcc_4k);
> -
> -   add_modifier(mods, size, capacity, modifier_dcc_best | 
> AMD_FMT_MOD_SET(DCC_RETILE, 1));
> -   add_modifier(mods, size, capacity, modifier_dcc_4k | 
> AMD_FMT_MOD_SET(DCC_RETILE, 1));
> -
> -   add_modifier(mods, size, capacity, modifier_r_x);
> +   for (i = 0; i < 2; i++) {
> +   /* Insert the best one first. */
> +   /* R_X swizzle modes are the best for rendering and DCC 
> requires them. */
> +   if (num_pipes > 16)
> +   swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX11_256K_R_X : 
> AMD_FMT_MOD_TILE_GFX9_64K_R_X;
> +   else
> +   swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX9_64K_R_X : 
> AMD_FMT_MOD_TILE_GFX11_256K_R_X;
> +
> +   modifier_r_x = AMD_FMT_MOD |
> +  AMD_FMT_MOD_SET(TILE_VERSION, 
> AMD_FMT_MOD_TILE_VER_GFX11) |
>

[PATCH 4/4] libhsakmt: allocate unified memory for ctx save restore area

2022-06-28 Thread Eric Huang
To improve performance on queue preemption, allocate ctx s/r
 area in VRAM instead of system memory, and migrate it back
 to system memory when VRAM is full.

Signed-off-by: Eric Huang 
Change-Id: If775782027188dbe84b6868260e429373675434c
---
 include/hsakmttypes.h |   1 +
 src/queues.c  | 109 --
 2 files changed, 96 insertions(+), 14 deletions(-)

diff --git a/include/hsakmttypes.h b/include/hsakmttypes.h
index 9063f85..2c1c7cc 100644
--- a/include/hsakmttypes.h
+++ b/include/hsakmttypes.h
@@ -1329,6 +1329,7 @@ typedef enum _HSA_SVM_FLAGS {
HSA_SVM_FLAG_GPU_RO  = 0x0008, // GPUs only read, allows 
replication
HSA_SVM_FLAG_GPU_EXEC= 0x0010, // Allow execution on GPU
HSA_SVM_FLAG_GPU_READ_MOSTLY = 0x0020, // GPUs mostly read, may 
allow similar optimizations as RO, but writes fault
+   HSA_SVM_FLAG_GPU_ALWAYS_MAPPED = 0x0040, // Keep GPU memory mapping 
always valid as if XNACK is disable
 } HSA_SVM_FLAGS;
 
 typedef enum _HSA_SVM_ATTR_TYPE {
diff --git a/src/queues.c b/src/queues.c
index c83dd93..e65103d 100644
--- a/src/queues.c
+++ b/src/queues.c
@@ -68,6 +68,7 @@ struct queue {
uint32_t eop_buffer_size;
uint32_t gfxv;
bool use_ats;
+   bool unified_ctx_save_restore;
/* This queue structure is allocated from GPU with page aligned size
 * but only small bytes are used. We use the extra space in the end for
 * cu_mask bits array.
@@ -383,13 +384,50 @@ static void free_exec_aligned_memory(void *addr, uint32_t 
size, uint32_t align,
munmap(addr, size);
 }
 
+static HSAKMT_STATUS register_exec_svm_range(void *mem, uint32_t size,
+   uint32_t gpuNode, uint32_t prefetchNode,
+   uint32_t preferredNode, bool alwaysMapped)
+{
+   HSA_SVM_ATTRIBUTE *attrs;
+   HSAuint64 s_attr;
+   HSAuint32 nattr;
+   HSAuint32 flags;
+
+   flags = HSA_SVM_FLAG_HOST_ACCESS |
+   HSA_SVM_FLAG_GPU_EXEC;
+
+   if (alwaysMapped)
+   flags |= HSA_SVM_FLAG_GPU_ALWAYS_MAPPED;
+
+   nattr = 5;
+   s_attr = sizeof(*attrs) * nattr;
+   attrs = (HSA_SVM_ATTRIBUTE *)alloca(s_attr);
+
+   attrs[0].type = HSA_SVM_ATTR_PREFETCH_LOC;
+   attrs[0].value = prefetchNode;
+   attrs[1].type = HSA_SVM_ATTR_PREFERRED_LOC;
+   attrs[1].value = preferredNode;
+   attrs[2].type = HSA_SVM_ATTR_CLR_FLAGS;
+   attrs[2].value = flags;
+   attrs[3].type = HSA_SVM_ATTR_SET_FLAGS;
+   attrs[3].value = flags;
+   attrs[4].type = HSA_SVM_ATTR_ACCESS;
+   attrs[4].value = gpuNode;
+
+   return hsaKmtSVMSetAttr(mem, size, nattr, attrs);
+}
+
 static void free_queue(struct queue *q)
 {
if (q->eop_buffer)
free_exec_aligned_memory(q->eop_buffer,
 q->eop_buffer_size,
 PAGE_SIZE, q->use_ats);
-   if (q->ctx_save_restore)
+   if (q->unified_ctx_save_restore)
+   munmap(q->ctx_save_restore,
+   ALIGN_UP(q->ctx_save_restore_size + 
q->debug_memory_size,
+   PAGE_SIZE));
+   else if (q->ctx_save_restore)
free_exec_aligned_memory(q->ctx_save_restore,
 q->ctx_save_restore_size,
 PAGE_SIZE, q->use_ats);
@@ -425,6 +463,8 @@ static int handle_concrete_asic(struct queue *q,
if (ret) {
uint32_t total_mem_alloc_size = 0;
HsaUserContextSaveAreaHeader *header;
+   HsaNodeProperties node;
+   bool svm_api;
 
args->ctx_save_restore_size = q->ctx_save_restore_size;
args->ctl_stack_size = q->ctl_stack_size;
@@ -434,22 +474,63 @@ static int handle_concrete_asic(struct queue *q,
 */
total_mem_alloc_size = q->ctx_save_restore_size +
   q->debug_memory_size;
-   q->ctx_save_restore =
-   allocate_exec_aligned_memory(total_mem_alloc_size,
-q->use_ats, NodeId, false, false);
 
-   if (!q->ctx_save_restore)
-   return HSAKMT_STATUS_NO_MEMORY;
+   if (hsaKmtGetNodeProperties(NodeId, &node))
+   svm_api = false;
+   else
+   svm_api = node.Capability.ui32.SVMAPISupported;
 
-   args->ctx_save_restore_address = (uintptr_t)q->ctx_save_restore;
+   /* Allocate unified memory for context save restore
+* area on dGPU.
+*/
+   if (!q->use_ats && svm_api) {
+   uint32_t size = ALIGN_UP(total_mem_alloc_size, 
PAGE_SIZE);
+   void *addr;
+   HSAKMT_STATUS r =

[PATCH 3/4] libhsakmt: add new flags for svm

2022-06-28 Thread Eric Huang
Add a new option for always keeping the GPU mapping.

Signed-off-by: Eric Huang 
Change-Id: Iebee35e6de4d52fa29f82dd19f6bbf5640249492
---
 include/linux/kfd_ioctl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/kfd_ioctl.h b/include/linux/kfd_ioctl.h
index 8a0ed49..5c45f58 100644
--- a/include/linux/kfd_ioctl.h
+++ b/include/linux/kfd_ioctl.h
@@ -1069,6 +1069,8 @@ struct kfd_ioctl_cross_memory_copy_args {
 #define KFD_IOCTL_SVM_FLAG_GPU_EXEC0x0010
 /* GPUs mostly read, may allow similar optimizations as RO, but writes fault */
 #define KFD_IOCTL_SVM_FLAG_GPU_READ_MOSTLY 0x0020
+/* Keep GPU memory mapping always valid as if XNACK is disable */
+#define KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED   0x0040
 
 /**
  * kfd_ioctl_svm_op - SVM ioctl operations
-- 
2.25.1



[PATCH 0/4] Unified memory for CWSR save restore area

2022-06-28 Thread Eric Huang
amdkfd changes:

Eric Huang (2):
  drm/amdkfd: add new flag for svm
  drm/amdkfd: change svm range evict

 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 --
 include/uapi/linux/kfd_ioctl.h   |  2 ++
 2 files changed, 10 insertions(+), 2 deletions(-)

libhsakmt(thunk) changes:
which are based on https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface

Eric Huang (2):
  libhsakmt: add new flags for svm
  libhsakmt: allocate unified memory for ctx save restore area

 include/hsakmttypes.h |   1 +
 include/linux/kfd_ioctl.h |   2 +
 src/queues.c  | 109 +-
 3 files changed, 98 insertions(+), 14 deletions(-)

-- 
2.25.1



[PATCH 2/2] drm/amdkfd: change svm range evict

2022-06-28 Thread Eric Huang
Two changes:
1. Reduce unnecessary evict/unmap when the range is not mapped to a GPU.
2. Always evict when the range flags have always_mapped set.

Signed-off-by: Eric Huang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 4bf2f75f853b..76e817687ef9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1767,12 +1767,16 @@ svm_range_evict(struct svm_range *prange, struct 
mm_struct *mm,
struct kfd_process *p;
int r = 0;
 
+   if (!prange->mapped_to_gpu)
+   return 0;
+
p = container_of(svms, struct kfd_process, svms);
 
pr_debug("invalidate svms 0x%p prange [0x%lx 0x%lx] [0x%lx 0x%lx]\n",
 svms, prange->start, prange->last, start, last);
 
-   if (!p->xnack_enabled) {
+   if (!p->xnack_enabled ||
+   (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) {
int evicted_ranges;
 
list_for_each_entry(pchild, &prange->child_list, child_list) {
@@ -3321,7 +3325,9 @@ svm_range_set_attr(struct kfd_process *p, struct 
mm_struct *mm,
if (r)
goto out_unlock_range;
 
-   if (migrated && !p->xnack_enabled) {
+   if (migrated && (!p->xnack_enabled ||
+   (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
+   prange->mapped_to_gpu) {
pr_debug("restore_work will update mappings of GPUs\n");
mutex_unlock(&prange->migrate_mutex);
continue;
-- 
2.25.1



[PATCH 1/2] drm/amdkfd: add new flag for svm

2022-06-28 Thread Eric Huang
Add a new option for always keeping the GPU mapping.

Signed-off-by: Eric Huang 
---
 include/uapi/linux/kfd_ioctl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index fd49dde4d5f4..eba04ebfd9a8 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -1076,6 +1076,8 @@ struct kfd_ioctl_cross_memory_copy_args {
 #define KFD_IOCTL_SVM_FLAG_GPU_EXEC0x0010
 /* GPUs mostly read, may allow similar optimizations as RO, but writes fault */
 #define KFD_IOCTL_SVM_FLAG_GPU_READ_MOSTLY 0x0020
+/* Keep GPU memory mapping always valid as if XNACK is disable */
+#define KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED   0x0040
 
 /**
  * kfd_ioctl_svm_op - SVM ioctl operations
-- 
2.25.1



Re: [RFC PATCH 4/5] drm/drm_color_mgmt: add 3D LUT to color mgmt properties

2022-06-28 Thread Harry Wentland



On 6/19/22 18:31, Melissa Wen wrote:
> Add 3D LUT for gammar correction using a 3D lookup table.  The position
> in the color correction pipeline where 3D LUT is applied depends on hw
> design, being after CTM or gamma. If just after CTM, a shaper lut must
> be set to shape the content for a non-linear space. That details should
> be handled by the driver according to its color capabilities.
> 
> Signed-off-by: Melissa Wen 
> ---
>  drivers/gpu/drm/drm_atomic_state_helper.c |  3 ++
>  drivers/gpu/drm/drm_atomic_uapi.c | 14 +-
>  drivers/gpu/drm/drm_color_mgmt.c  | 58 +++
>  drivers/gpu/drm/drm_fb_helper.c   |  2 +
>  drivers/gpu/drm/drm_mode_config.c | 14 ++
>  include/drm/drm_color_mgmt.h  |  4 ++
>  include/drm/drm_crtc.h| 12 -
>  include/drm/drm_mode_config.h | 13 +
>  8 files changed, 117 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c 
> b/drivers/gpu/drm/drm_atomic_state_helper.c
> index cf0545bb6e00..64800bc41365 100644
> --- a/drivers/gpu/drm/drm_atomic_state_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_state_helper.c
> @@ -141,6 +141,8 @@ void __drm_atomic_helper_crtc_duplicate_state(struct 
> drm_crtc *crtc,
>   drm_property_blob_get(state->ctm);
>   if (state->shaper_lut)
>   drm_property_blob_get(state->shaper_lut);
> + if (state->lut3d)
> + drm_property_blob_get(state->lut3d);
>   if (state->gamma_lut)
>   drm_property_blob_get(state->gamma_lut);
>  
> @@ -216,6 +218,7 @@ void __drm_atomic_helper_crtc_destroy_state(struct 
> drm_crtc_state *state)
>   drm_property_blob_put(state->degamma_lut);
>   drm_property_blob_put(state->ctm);
>   drm_property_blob_put(state->shaper_lut);
> + drm_property_blob_put(state->lut3d);
>   drm_property_blob_put(state->gamma_lut);
>  }
>  EXPORT_SYMBOL(__drm_atomic_helper_crtc_destroy_state);
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
> b/drivers/gpu/drm/drm_atomic_uapi.c
> index 6468f2a080bc..1896c0422f73 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -472,6 +472,14 @@ static int drm_atomic_crtc_set_property(struct drm_crtc 
> *crtc,
>   &replaced);
>   state->color_mgmt_changed |= replaced;
>   return ret;
> + } else if (property == config->lut3d_property) {
> + ret = drm_atomic_replace_property_blob_from_id(dev,
> + &state->lut3d,
> + val,
> + -1, sizeof(struct drm_color_lut),
> + &replaced);
> + state->color_mgmt_changed |= replaced;
> + return ret;
>   } else if (property == config->gamma_lut_property) {
>   ret = drm_atomic_replace_property_blob_from_id(dev,
>   &state->gamma_lut,
> @@ -523,10 +531,12 @@ drm_atomic_crtc_get_property(struct drm_crtc *crtc,
>   *val = (state->degamma_lut) ? state->degamma_lut->base.id : 0;
>   else if (property == config->ctm_property)
>   *val = (state->ctm) ? state->ctm->base.id : 0;
> - else if (property == config->gamma_lut_property)
> - *val = (state->gamma_lut) ? state->gamma_lut->base.id : 0;
>   else if (property == config->shaper_lut_property)
>   *val = (state->shaper_lut) ? state->shaper_lut->base.id : 0;
> + else if (property == config->lut3d_property)
> + *val = (state->lut3d) ? state->lut3d->base.id : 0;
> + else if (property == config->gamma_lut_property)
> + *val = (state->gamma_lut) ? state->gamma_lut->base.id : 0;
>   else if (property == config->prop_out_fence_ptr)
>   *val = 0;
>   else if (property == crtc->scaling_filter_property)
> diff --git a/drivers/gpu/drm/drm_color_mgmt.c 
> b/drivers/gpu/drm/drm_color_mgmt.c
> index 4f57dc60fe03..696fe1e37801 100644
> --- a/drivers/gpu/drm/drm_color_mgmt.c
> +++ b/drivers/gpu/drm/drm_color_mgmt.c
> @@ -87,6 +87,25 @@
>   *   publish the largest size, and sub-sample smaller sized LUTs
>   *   appropriately.
>   *
> + * “LUT3D”:
> + *   Blob property to set the 3D LUT mapping pixel data after the color
> + *   transformation matrix and before gamma 1D lut correction. The
> + *   data is interpreted as an array of &struct drm_color_lut elements.
> + *   Hardware might choose not to use the full precision of the LUT
> + *   elements.
> + *
> + *   Setting this to NULL (blob property value set to 0) means a the output
> + *   color is identical to the input color. This is generally the driver
> + *   boot-up state too. Drivers can access this blob through
> + *   &drm_crtc_state.gamma_lut.
> + *
> + * “LUT3D_SIZE”:
> + *   Unsigned range property to give the size of the 3D lookup ta

Re: [PATCH v6 14/22] dma-buf: Introduce new locking convention

2022-06-28 Thread Intel



On 5/30/22 15:57, Dmitry Osipenko wrote:

On 5/30/22 16:41, Christian König wrote:

Hi Dmitry,

On 30.05.22 at 15:26, Dmitry Osipenko wrote:

Hello Christian,

On 5/30/22 09:50, Christian König wrote:

Hi Dmitry,

First of all please separate out this patch from the rest of the series,
since this is a complex separate structural change.

I assume all the patches will go via the DRM tree in the end since the
rest of the DRM patches in this series depend on this dma-buf change.
But I see that separation may ease reviewing of the dma-buf changes, so
let's try it.

That sounds like you are underestimating a bit how much trouble this
will be.


I have tried this before and failed because catching all the locks in
the right code paths are very tricky. So expect some fallout from this
and make sure the kernel test robot and CI systems are clean.

Sure, I'll fix up all the reported things in the next iteration.

BTW, have you ever posted your version of the patch? It would be great if
we could compare the changed code paths.

No, I never even finished creating it after realizing how much work it
would be.


This patch introduces a new locking convention for dma-buf users. From
now on, all dma-buf importers are responsible for holding the dma-buf
reservation lock around operations performed on dma-bufs.

This patch implements the new dma-buf locking convention by:

     1. Making the dma-buf API functions take the reservation lock.

     2. Adding new locked variants of the dma-buf API functions for
drivers
    that need to manage imported dma-bufs under the held lock.

Instead of adding new locked variants please mark all variants which
expect to be called without a lock with an _unlocked postfix.

This should make it easier to remove those in a follow up patch set and
then fully move the locking into the importer.
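
For illustration, a rough sketch of what the importer side could look like
once the locking is fully moved into the importer. The locked/_unlocked
split is exactly what is being discussed here, so the function names below
are assumptions for illustration, not the final API:

/*
 * Sketch: the importer takes the reservation lock itself and calls the
 * "locked" dma-buf operation while holding it.
 */
static int importer_vmap_sketch(struct dma_buf *dmabuf, struct iosys_map *map)
{
	int ret;

	ret = dma_resv_lock(dmabuf->resv, NULL);
	if (ret)
		return ret;

	/* caller holds the reservation lock, so the locked variant is used */
	ret = dma_buf_vmap(dmabuf, map);

	dma_resv_unlock(dmabuf->resv);
	return ret;
}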

Do we really want to move all the locks to the importers? Seems the
majority of drivers should be happy with the dma-buf helpers handling
the locking for them.

Yes, I clearly think so.


     3. Converting all drivers to the new locking scheme.

I have strong doubts that you got all of them. At least radeon and
nouveau should grab the reservation lock in their ->attach callbacks
somehow.

Radeon and Nouveau use gem_prime_import_sg_table() and they already take
the resv lock, so it seems they should be okay (?)

You are looking at the wrong side. You need to fix the export code path,
not the import ones.

See for example attach on radeon works like this
drm_gem_map_attach->drm_gem_pin->radeon_gem_prime_pin->radeon_bo_reserve->ttm_bo_reserve->dma_resv_lock.
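
Spelled out as a sketch (illustrative only, not actual code from either
driver), the recursion above looks like this once the importer holds the
reservation lock across the attach call:

static struct dma_buf_attachment *
importer_attach_sketch(struct dma_buf *dmabuf, struct device *dev)
{
	struct dma_buf_attachment *attach;
	int ret;

	/* importer side, under the proposed convention */
	ret = dma_resv_lock(dmabuf->resv, NULL);
	if (ret)
		return ERR_PTR(ret);

	/*
	 * Exporter side (radeon): ->attach -> drm_gem_map_attach() ->
	 * drm_gem_pin() -> radeon_gem_prime_pin() -> radeon_bo_reserve() ->
	 * ttm_bo_reserve() -> dma_resv_lock() on the same object: deadlock.
	 */
	attach = dma_buf_attach(dmabuf, dev);

	dma_resv_unlock(dmabuf->resv);
	return attach;
}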

Yeah, I was looking at both sides, but missed this one.


Also i915 will run into trouble with attach. In particular since i915 
starts a full ww transaction in its attach callback to be able to lock 
other objects if migration is needed. I think i915 CI would catch this 
in a selftest.


Perhaps it's worthwhile to take a step back and figure out which
callbacks might need a ww acquire context, if the importer is required
to lock?


(And off-topic: since we do a lot of fancy stuff under dma-resv locks,
including waiting for fences and other locks, IMO taking these locks
uninterruptibly should ring a warning bell.)


/Thomas




Same for nouveau and probably a few other exporters as well. That will
certainly cause a deadlock if you don't fix it.

I strongly suggest doing this step by step: first attach/detach and then
the rest.

Thank you very much for the suggestions. I'll implement them in the next
version.



Re: [RFC PATCH 2/5] Documentation/amdgpu/display: add DC color caps info

2022-06-28 Thread Harry Wentland



On 6/19/22 18:31, Melissa Wen wrote:
> Add details about color correction capabilities and explain a bit about
> differences between DC hw generations and also how they are mapped
> between DRM and DC interface. Two schemas for DCN 2.0 and 3.0
> (rasterized from the original png) is included to illustrate it. They
> were obtained from a discussion[1] in the amd-gfx mailing list.
> 
> [1] 
> https://lore.kernel.org/amd-gfx/20220422142811.dm6vtk6v64jcwydk@mail.igalia.com/
> 
> Signed-off-by: Melissa Wen 
> ---
>  .../amdgpu/display/dcn2_cm_drm_current.svg| 1370 +++
>  .../amdgpu/display/dcn3_cm_drm_current.svg| 1528 +
>  .../gpu/amdgpu/display/display-manager.rst|   35 +
>  drivers/gpu/drm/amd/display/dc/dc.h   |   53 +-
>  4 files changed, 2985 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/gpu/amdgpu/display/dcn2_cm_drm_current.svg
>  create mode 100644 Documentation/gpu/amdgpu/display/dcn3_cm_drm_current.svg
> 
> diff --git a/Documentation/gpu/amdgpu/display/dcn2_cm_drm_current.svg 
> b/Documentation/gpu/amdgpu/display/dcn2_cm_drm_current.svg
> new file mode 100644
> index ..0156f56d4482
> --- /dev/null
> +++ b/Documentation/gpu/amdgpu/display/dcn2_cm_drm_current.svg
> @@ -0,0 +1,1370 @@
> +[... ~1370 lines of Inkscape-generated SVG markup for the DCN 2.0 color
> +     pipeline diagram; the quoted hunk was mangled by the list archive's
> +     HTML stripping and link rewriting and is elided here ...]

[PATCH] drm/amd: Add debug mask for subviewport mclk switch

2022-06-28 Thread Aurabindo Pillai
[Why&How]
Expose a new debug mask bit to force a subviewport memory clock switch
to facilitate easy testing.

Signed-off-by: Aurabindo Pillai 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
 drivers/gpu/drm/amd/include/amd_shared.h  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index c9145864ed2b..7a034ca95be2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1559,6 +1559,9 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
if (amdgpu_dc_debug_mask & DC_DISABLE_CLOCK_GATING)
adev->dm.dc->debug.disable_clock_gate = true;
 
+   if (amdgpu_dc_debug_mask & DC_FORCE_SUBVP_MCLK_SWITCH)
+   adev->dm.dc->debug.force_subvp_mclk_switch = true;
+
r = dm_dmub_hw_init(adev);
if (r) {
DRM_ERROR("DMUB interface failed to initialize: status=%d\n", 
r);
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index bcdf7453a403..b1c55dd7b498 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -247,6 +247,7 @@ enum DC_DEBUG_MASK {
DC_DISABLE_DSC = 0x4,
DC_DISABLE_CLOCK_GATING = 0x8,
DC_DISABLE_PSR = 0x10,
+   DC_FORCE_SUBVP_MCLK_SWITCH = 0x20,
 };
 
 enum amd_dpm_forced_level;
-- 
2.36.1



[linux-next:master] BUILD REGRESSION cb71b93c2dc36d18a8b05245973328d018272cdf

2022-06-28 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: cb71b93c2dc36d18a8b05245973328d018272cdf  Add linux-next specific 
files for 20220628

Error/Warning: (recently discovered and may have been fixed)

arch/powerpc/kernel/interrupt.c:542:55: error: suggest braces around empty body 
in an 'if' statement [-Werror=empty-body]
arch/powerpc/kernel/interrupt.c:542:55: warning: suggest braces around empty 
body in an 'if' statement [-Wempty-body]
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1025:33: warning: 
variable 'pre_connection_type' set but not used [-Wunused-but-set-variable]
drivers/pci/endpoint/functions/pci-epf-vntb.c:975:5: warning: no previous 
prototype for 'pci_read' [-Wmissing-prototypes]
drivers/pci/endpoint/functions/pci-epf-vntb.c:984:5: warning: no previous 
prototype for 'pci_write' [-Wmissing-prototypes]
vmlinux.o: warning: objtool: __ct_user_enter+0x8c: call to 
ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_idle_enter+0x19: call to ftrace_likely_update() 
leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_idle_exit+0x3e: call to ftrace_likely_update() 
leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_irq_enter+0x6a: call to ftrace_likely_update() 
leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_irq_exit+0x6a: call to ftrace_likely_update() 
leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_kernel_enter.constprop.0+0x2a: call to 
ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_kernel_enter_state+0x2d: call to 
ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_kernel_exit.constprop.0+0x53: call to 
ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_kernel_exit_state+0x2d: call to 
ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: ct_nmi_enter+0x4b: call to ftrace_likely_update() 
leaves .noinstr.text section

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/net/ethernet/microchip/lan743x_main.c:1238:1: internal compiler error: 
in arc_ifcvt, at config/arc/arc.c:9637
drivers/soc/mediatek/mtk-mutex.c:799:1: internal compiler error: in arc_ifcvt, 
at config/arc/arc.c:9637
drivers/staging/media/zoran/zr36016.c:430:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/staging/media/zoran/zr36050.c:829:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/staging/media/zoran/zr36060.c:869:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/thunderbolt/tmu.c:758:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
sound/soc/sof/intel/mtl.c:547:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link.c:warning:variable-pre_connection_type-set-but-not-used
|   |-- 
drivers-pci-endpoint-functions-pci-epf-vntb.c:warning:no-previous-prototype-for-pci_read
|   `-- 
drivers-pci-endpoint-functions-pci-epf-vntb.c:warning:no-previous-prototype-for-pci_write
|-- alpha-randconfig-r002-20220628
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link.c:warning:variable-pre_connection_type-set-but-not-used
|-- alpha-randconfig-r003-20220628
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link.c:warning:variable-pre_connection_type-set-but-not-used
|-- arc-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link.c:warning:variable-pre_connection_type-set-but-not-used
|   |-- 
drivers-net-ethernet-microchip-lan743x_main.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|   |-- 
drivers-pci-endpoint-functions-pci-epf-vntb.c:warning:no-previous-prototype-for-pci_read
|   |-- 
drivers-pci-endpoint-functions-pci-epf-vntb.c:warning:no-previous-prototype-for-pci_write
|   |-- 
drivers-soc-mediatek-mtk-mutex.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|   |-- 
drivers-staging-media-zoran-zr36016.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|   |-- 
drivers-staging-media-zoran-zr36050.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|   |-- 
drivers-staging-media-zoran-zr36060.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|   |-- 
drivers-thunderbolt-tmu.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|   `-- 
sound-soc-sof-intel-mtl.c:internal-compiler-error:in-arc_ifcvt-at-config-arc-arc.c
|-- arm-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link.c:warning:variable-pre_connection_type-set-but-not-used
|   |-- 
drivers-pci-endpoint-functions-pci-epf-vntb.c:warning:no-previous-prototype-for-pci_read
|   `-- 
drivers-pci-endpoint-functions-pci-epf-vntb.c:warning:no-previous-prototype-for-pci

Re: [PATCH 1/3] drm/amdkfd: add new flags for svm

2022-06-28 Thread Eric Huang

Thank you, Felix.

I will send all libhsakmt changes and amdkfd changes to amd-gfx.

Regards,
Eric

On 2022-06-28 16:44, Felix Kuehling wrote:

On 2022-06-27 at 12:01, Eric Huang wrote:
No. There is only an internal link for now, because it is under review.
Once it is submitted, an external link should be available in gerrit git
for libhsakmt.


Hi Eric,

For anything that requires ioctl API changes, the user mode and kernel 
mode changes need to be reviewed together in public. You can either 
post the libhsakmt change by email to amd-gfx, or you can push your 
libhsakmt development branch to a personal branch on github and 
include a link to that in the kernel commit description.


Alex, some background about this series: We are looking into using 
unified memory for CWSR context save space. This allows us to get 
lower preemption latency when VRAM is available, but migrate it to 
system memory when more VRAM is needed for application allocations. 
Because we cannot preempt in the trap handler, and we want to 
guarantee finite time for preemption and trap handler execution, we 
need to prevent page faults on any memory accessed by the trap 
handler. The KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag is meant to 
guarantee that.


I think the KFD_IOCTL_SVM_FLAG_CUSTOM is not necessary. I've responded 
to Eric with an alternative idea.


Regards,
  Felix




Regards,
Eric

On 2022-06-27 11:58, Alex Deucher wrote:
On Mon, Jun 27, 2022 at 11:36 AM Eric Huang 
 wrote:

http://gerrit-git.amd.com/c/compute/ec/libhsakmt/+/697296

Got an external link?

Alex


Regards,
Eric

On 2022-06-27 11:33, Alex Deucher wrote:
On Fri, Jun 24, 2022 at 12:03 PM Eric Huang 
 wrote:

It adds new options for always keeping the gpu mapping valid
and for custom coarse grain allocation instead of the default
fine grain allocation.

Signed-off-by: Eric Huang 

Can you provide a link to the proposed userspace for this?

Alex


---
   include/uapi/linux/kfd_ioctl.h | 4 
   1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/kfd_ioctl.h 
b/include/uapi/linux/kfd_ioctl.h

index fd49dde4d5f4..9dbf215675a0 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -1076,6 +1076,10 @@ struct kfd_ioctl_cross_memory_copy_args {
   #define KFD_IOCTL_SVM_FLAG_GPU_EXEC    0x0010
   /* GPUs mostly read, may allow similar optimizations as RO, 
but writes fault */

   #define KFD_IOCTL_SVM_FLAG_GPU_READ_MOSTLY 0x0020
+/* Keep GPU memory mapping always valid as if XNACK is disable */
+#define KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED 0x0040
+/* Allow set custom flags instead of defaults */
+#define KFD_IOCTL_SVM_FLAG_CUSTOM  0x8000

   /**
    * kfd_ioctl_svm_op - SVM ioctl operations
--
2.25.1







Re: CONFIG_ANDROID (was: rcu_sched detected expedited stalls in amdgpu after suspend)

2022-06-28 Thread Uladzislau Rezki
> Excerpts from Paul E. McKenney's message of June 28, 2022 2:54 pm:
> > All you need to do to get the previous behavior is to add something like
> > this to your defconfig file:
> > 
> > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=21000
> > 
> > Any reason why this will not work for you?
> 
> As far as I know, I do not require any particular RCU debugging features 
> intended for developers; as an individual user and distro maintainer, I 
> would like to select the option corresponding to "emit errors for 
> unexpected conditions which should be reported upstream", not "emit 
> debugging information for development purposes".
> 
Sorry, but we need to apply some assumptions here: to me, CONFIG_ANDROID
indicates that the kernel runs on an Android-type device. When you enable
this option on your specific box, it is assumed that some Android-related
code is also activated on your device, which may lead to side effects.

>
> Therefore, I think 0 is a suitable setting for me and most ordinary 
> (not tightly controlled) distributions. My concern is that other users 
> and distro maintainers will also have confusion about what value to set 
> and whether the warnings are important, since the help text does not say 
> anything about Android, and "make oldconfig" does not indicate that the 
> default value is different for Android.
> 

diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 9b64e55d4f61..ced0d1f7c675 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -94,7 +94,8 @@ config RCU_EXP_CPU_STALL_TIMEOUT
  If the RCU grace period persists, additional CPU stall warnings
  are printed at more widely spaced intervals.  A value of zero
  says to use the RCU_CPU_STALL_TIMEOUT value converted from
- seconds to milliseconds.
+ seconds to milliseconds. If CONFIG_ANDROID is set on a non-Android
+ platform and you are unsure, set RCU_EXP_CPU_STALL_TIMEOUT to zero.

 config RCU_TRACE
bool "Enable tracing for RCU"


Will it work for you?

--
Uladzislau Rezki


Re: [PATCH 1/3] drm/amdkfd: add new flags for svm

2022-06-28 Thread Felix Kuehling

On 2022-06-27 at 12:01, Eric Huang wrote:
No. There is only an internal link for now, because it is under review.
Once it is submitted, an external link should be available in gerrit git for libhsakmt.


Hi Eric,

For anything that requires ioctl API changes, the user mode and kernel 
mode changes need to be reviewed together in public. You can either post 
the libhsakmt change by email to amd-gfx, or you can push your libhsakmt 
development branch to a personal branch on github and include a link to 
that in the kernel commit description.


Alex, some background about this series: We are looking into using 
unified memory for CWSR context save space. This allows us to get lower 
preemption latency when VRAM is available, but migrate it to system 
memory when more VRAM is needed for application allocations. Because we 
cannot preempt in the trap handler, and we want to guarantee finite time 
for preemption and trap handler execution, we need to prevent page 
faults on any memory accessed by the trap handler. The 
KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag is meant to guarantee that.
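
For reference, a minimal user-space sketch of how the new flag could be
applied to the CWSR range through the existing SVM ioctl. The flag value
comes from the patch under review; the rest is illustrative only (field
names per kfd_ioctl.h) and is not the pending libhsakmt change:

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kfd_ioctl.h>

static int svm_set_always_mapped(int kfd_fd, uint64_t start, uint64_t size)
{
	size_t sz = sizeof(struct kfd_ioctl_svm_args) +
		    sizeof(struct kfd_ioctl_svm_attribute);
	struct kfd_ioctl_svm_args *args = calloc(1, sz);
	int ret;

	if (!args)
		return -1;

	args->start_addr = start;
	args->size = size;
	args->op = KFD_IOCTL_SVM_OP_SET_ATTR;
	args->nattr = 1;
	args->attrs[0].type = KFD_IOCTL_SVM_ATTR_SET_FLAGS;
	/* proposed flag from this series */
	args->attrs[0].value = KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED;

	ret = ioctl(kfd_fd, AMDKFD_IOC_SVM, args);
	free(args);
	return ret;
}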


I think the KFD_IOCTL_SVM_FLAG_CUSTOM is not necessary. I've responded 
to Eric with an alternative idea.


Regards,
  Felix




Regards,
Eric

On 2022-06-27 11:58, Alex Deucher wrote:
On Mon, Jun 27, 2022 at 11:36 AM Eric Huang 
 wrote:

http://gerrit-git.amd.com/c/compute/ec/libhsakmt/+/697296

Got an external link?

Alex


Regards,
Eric

On 2022-06-27 11:33, Alex Deucher wrote:
On Fri, Jun 24, 2022 at 12:03 PM Eric Huang 
 wrote:

It adds new options for always keeping the gpu mapping valid
and for custom coarse grain allocation instead of the default
fine grain allocation.

Signed-off-by: Eric Huang 

Can you provide a link to the proposed userspace for this?

Alex


---
   include/uapi/linux/kfd_ioctl.h | 4 
   1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/kfd_ioctl.h 
b/include/uapi/linux/kfd_ioctl.h

index fd49dde4d5f4..9dbf215675a0 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -1076,6 +1076,10 @@ struct kfd_ioctl_cross_memory_copy_args {
   #define KFD_IOCTL_SVM_FLAG_GPU_EXEC    0x0010
   /* GPUs mostly read, may allow similar optimizations as RO, but 
writes fault */

   #define KFD_IOCTL_SVM_FLAG_GPU_READ_MOSTLY 0x0020
+/* Keep GPU memory mapping always valid as if XNACK is disable */
+#define KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED   0x0040
+/* Allow set custom flags instead of defaults */
+#define KFD_IOCTL_SVM_FLAG_CUSTOM  0x8000

   /**
    * kfd_ioctl_svm_op - SVM ioctl operations
--
2.25.1





Re: [RFC PATCH 1/5] Documentation/amdgpu_dm: Add DM color correction documentation

2022-06-28 Thread Harry Wentland



On 2022-06-19 18:31, Melissa Wen wrote:
> AMDGPU DM maps DRM color management properties (degamma, ctm and gamma)
> to DC color correction entities. Part of this mapping is already
> documented as code comments and can be converted as kernel docs.
> 
> Signed-off-by: Melissa Wen 
> ---
>  .../gpu/amdgpu/display/display-manager.rst|   9 ++
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   2 +
>  .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 122 +-
>  3 files changed, 101 insertions(+), 32 deletions(-)
> 
> diff --git a/Documentation/gpu/amdgpu/display/display-manager.rst 
> b/Documentation/gpu/amdgpu/display/display-manager.rst
> index 7ce31f89d9a0..b1b0f11aed83 100644
> --- a/Documentation/gpu/amdgpu/display/display-manager.rst
> +++ b/Documentation/gpu/amdgpu/display/display-manager.rst
> @@ -40,3 +40,12 @@ Atomic Implementation
>  
>  .. kernel-doc:: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> :functions: amdgpu_dm_atomic_check amdgpu_dm_atomic_commit_tail
> +
> +Color Management Properties
> +===
> +
> +.. kernel-doc:: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +   :doc: overview
> +
> +.. kernel-doc:: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +   :internal:
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index 3cc5c15303e6..8fd1be7f2583 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -242,6 +242,8 @@ struct hpd_rx_irq_offload_work {
>   * @force_timing_sync: set via debugfs. When set, indicates that all 
> connected
>   *  displays will be forced to synchronize.
>   * @dmcub_trace_event_en: enable dmcub trace events
> + * @num_of_edps: dumber of embedded Display Ports

/s/dumber/number

Thanks for turning these into kerneldocs. With the above minor nit fixed this is
Reviewed-by: Harry Wentland 

Harry

> + * @disable_hpd_irq: disable Hot Plug Detect handling
>   */
>  struct amdgpu_display_manager {
>  
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> index a71177305bcd..1f4a7c908587 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
> @@ -29,7 +29,9 @@
>  #include "modules/color/color_gamma.h"
>  #include "basics/conversion.h"
>  
> -/*
> +/**
> + * DOC: overview
> + *
>   * The DC interface to HW gives us the following color management blocks
>   * per pipe (surface):
>   *
> @@ -71,8 +73,8 @@
>  
>  #define MAX_DRM_LUT_VALUE 0xFFFF
>  
> -/*
> - * Initialize the color module.
> +/**
> + * amdgpu_dm_init_color_mod - Initialize the color module.
>   *
>   * We're not using the full color module, only certain components.
>   * Only call setup functions for components that we need.
> @@ -82,7 +84,14 @@ void amdgpu_dm_init_color_mod(void)
>   setup_x_points_distribution();
>  }
>  
> -/* Extracts the DRM lut and lut size from a blob. */
> +/**
> + * __extract_blob_lut - Extracts the DRM lut and lut size from a blob.
> + * @blob: DRM color mgmt property blob
> + * @size: lut size
> + *
> + * Returns:
> + * DRM LUT or NULL
> + */
>  static const struct drm_color_lut *
>  __extract_blob_lut(const struct drm_property_blob *blob, uint32_t *size)
>  {
> @@ -90,13 +99,18 @@ __extract_blob_lut(const struct drm_property_blob *blob, 
> uint32_t *size)
>   return blob ? (struct drm_color_lut *)blob->data : NULL;
>  }
>  
> -/*
> - * Return true if the given lut is a linear mapping of values, i.e. it acts
> - * like a bypass LUT.
> +/**
> + * __is_lut_linear - check if the given lut is a linear mapping of values
> + * @lut: given lut to check values
> + * @size: lut size
>   *
>   * It is considered linear if the lut represents:
> - * f(a) = (0xFF00/MAX_COLOR_LUT_ENTRIES-1)a; for integer a in
> - *   [0, MAX_COLOR_LUT_ENTRIES)
> + * f(a) = (0xFF00/MAX_COLOR_LUT_ENTRIES-1)a; for integer a in [0,
> + * MAX_COLOR_LUT_ENTRIES)
> + *
> + * Returns:
> + * True if the given lut is a linear mapping of values, i.e. it acts like a
> + * bypass LUT. Otherwise, false.
>   */
>  static bool __is_lut_linear(const struct drm_color_lut *lut, uint32_t size)
>  {
> @@ -119,9 +133,13 @@ static bool __is_lut_linear(const struct drm_color_lut 
> *lut, uint32_t size)
>   return true;
>  }
>  
> -/*
> - * Convert the drm_color_lut to dc_gamma. The conversion depends on the size
> - * of the lut - whether or not it's legacy.
> +/**
> + * __drm_lut_to_dc_gamma - convert the drm_color_lut to dc_gamma.
> + * @lut: DRM lookup table for color conversion
> + * @gamma: DC gamma to set entries
> + * @is_legacy: legacy or atomic gamma
> + *
> + * The conversion depends on the size of the lut - whether or not it's 
> legacy.
>   */
>  static void __drm_lut_to_dc_gamma(cons

[PATCH] drm/amd: Load TA firmware for DCN32/321

2022-06-28 Thread Aurabindo Pillai
[Why&How]
TA firmware is needed to enable HDCP

Signed-off-by: Aurabindo Pillai 
---
 drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c 
b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
index 9e1ef81933ff..4df45c2a7d0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
@@ -40,6 +40,7 @@ MODULE_FIRMWARE("amdgpu/psp_13_0_8_toc.bin");
 MODULE_FIRMWARE("amdgpu/psp_13_0_8_ta.bin");
 MODULE_FIRMWARE("amdgpu/psp_13_0_0_sos.bin");
 MODULE_FIRMWARE("amdgpu/psp_13_0_7_sos.bin");
+MODULE_FIRMWARE("amdgpu/psp_13_0_7_ta.bin");
 
 /* For large FW files the time to complete can be very long */
 #define USBC_PD_POLLING_LIMIT_S 240
@@ -103,6 +104,10 @@ static int psp_v13_0_init_microcode(struct psp_context 
*psp)
case IP_VERSION(13, 0, 0):
case IP_VERSION(13, 0, 7):
err = psp_init_sos_microcode(psp, chip_name);
+   if (err)
+   return err;
+   /* It's not necessary to load ras ta on Guest side */
+   err = psp_init_ta_microcode(psp, chip_name);
if (err)
return err;
break;
-- 
2.36.1



[PATCH v2] drm/amd/display: expose additional modifier for DCN32/321

2022-06-28 Thread Aurabindo Pillai
[Why&How]
Some userspace expects a backwards compatible modifier on DCN32/321. For
hardware with num_pipes greater than 16, we expose the most efficient
modifier first. As a fallback, we also need to expose the slightly less
efficient modifier AMD_FMT_MOD_TILE_GFX9_64K_R_X after the best option.

Also set the number of packers to fixed value as required per hardware
documentation. This value is cached during hardware initialization and
can be read through the base driver.

Signed-off-by: Aurabindo Pillai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  3 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 66 ++-
 2 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 1a512d78673a..0f5bfe5df627 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -743,8 +743,7 @@ static int convert_tiling_flags_to_modifier(struct 
amdgpu_framebuffer *afb)
switch (version) {
case AMD_FMT_MOD_TILE_VER_GFX11:
pipe_xor_bits = min(block_size_bits - 8, pipes);
-   packers = min(block_size_bits - 8 - 
pipe_xor_bits,
-   
ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs));
+   packers = 
ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs);
break;
case AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS:
pipe_xor_bits = min(block_size_bits - 8, pipes);
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 98bb65377e98..adccaf2f539d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5208,6 +5208,7 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
int num_pkrs = 0;
int pkrs = 0;
u32 gb_addr_config;
+   u8 i = 0;
unsigned swizzle_r_x;
uint64_t modifier_r_x;
uint64_t modifier_dcc_best;
@@ -5223,37 +5224,40 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
num_pipes = 1 << REG_GET_FIELD(gb_addr_config, GB_ADDR_CONFIG, 
NUM_PIPES);
pipe_xor_bits = ilog2(num_pipes);
 
-   /* R_X swizzle modes are the best for rendering and DCC requires them. 
*/
-   swizzle_r_x = num_pipes > 16 ? AMD_FMT_MOD_TILE_GFX11_256K_R_X :
-  AMD_FMT_MOD_TILE_GFX9_64K_R_X;
-
-   modifier_r_x = AMD_FMT_MOD |
-   AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX11) |
-   AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
-   AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
-   AMD_FMT_MOD_SET(PACKERS, pkrs);
-
-   /* DCC_CONSTANT_ENCODE is not set because it can't vary with gfx11 
(it's implied to be 1). */
-   modifier_dcc_best = modifier_r_x |
-   AMD_FMT_MOD_SET(DCC, 1) |
-   AMD_FMT_MOD_SET(DCC_INDEPENDENT_64B, 0) |
-   AMD_FMT_MOD_SET(DCC_INDEPENDENT_128B, 1) |
-   AMD_FMT_MOD_SET(DCC_MAX_COMPRESSED_BLOCK, 
AMD_FMT_MOD_DCC_BLOCK_128B);
-
-   /* DCC settings for 4K and greater resolutions. (required by display 
hw) */
-   modifier_dcc_4k = modifier_r_x |
-   AMD_FMT_MOD_SET(DCC, 1) |
-   AMD_FMT_MOD_SET(DCC_INDEPENDENT_64B, 1) |
-   AMD_FMT_MOD_SET(DCC_INDEPENDENT_128B, 1) |
-   AMD_FMT_MOD_SET(DCC_MAX_COMPRESSED_BLOCK, 
AMD_FMT_MOD_DCC_BLOCK_64B);
-
-   add_modifier(mods, size, capacity, modifier_dcc_best);
-   add_modifier(mods, size, capacity, modifier_dcc_4k);
-
-   add_modifier(mods, size, capacity, modifier_dcc_best | 
AMD_FMT_MOD_SET(DCC_RETILE, 1));
-   add_modifier(mods, size, capacity, modifier_dcc_4k | 
AMD_FMT_MOD_SET(DCC_RETILE, 1));
-
-   add_modifier(mods, size, capacity, modifier_r_x);
+   for (i = 0; i < 2; i++) {
+   /* Insert the best one first. */
+   /* R_X swizzle modes are the best for rendering and DCC 
requires them. */
+   if (num_pipes > 16)
+   swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX11_256K_R_X : 
AMD_FMT_MOD_TILE_GFX9_64K_R_X;
+   else
+   swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX9_64K_R_X : 
AMD_FMT_MOD_TILE_GFX11_256K_R_X;
+
+   modifier_r_x = AMD_FMT_MOD |
+  AMD_FMT_MOD_SET(TILE_VERSION, 
AMD_FMT_MOD_TILE_VER_GFX11) |
+  AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
+  AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
+  AMD_FMT_MOD_SET(PACKERS, pkrs);
+
+   /* DCC_CONSTANT_ENCODE is not set because it can't vary with 
gfx11 (it's imp

Re: [PATCH v6 01/22] drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error

2022-06-28 Thread Intel

Hi,

On 5/27/22 01:50, Dmitry Osipenko wrote:

Use ww_acquire_fini() in the error code paths. Otherwise lockdep
thinks that lock is held when lock's memory is freed after the
drm_gem_lock_reservations() error. The WW needs to be annotated
as "freed"


s /WW/ww_acquire_context/ ?
s /"freed"/"released"/ ?



, which fixes the noisy "WARNING: held lock freed!" splat
of VirtIO-GPU driver with CONFIG_DEBUG_MUTEXES=y and enabled lockdep.

Cc: sta...@vger.kernel.org


Can you dig up the commit in error and add a Fixes: Tag?

Using that and "dim fixes" will also make the Cc: stable tag a bit more 
verbose.


With that fixed,

Reviewed-by: Thomas Hellström 



Signed-off-by: Dmitry Osipenko 
---
  drivers/gpu/drm/drm_gem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index eb0c2d041f13..86d670c71286 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1226,7 +1226,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, 
int count,
ret = dma_resv_lock_slow_interruptible(obj->resv,
 acquire_ctx);
if (ret) {
-   ww_acquire_done(acquire_ctx);
+   ww_acquire_fini(acquire_ctx);
return ret;
}
}
@@ -1251,7 +1251,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, 
int count,
goto retry;
}
  
-			ww_acquire_done(acquire_ctx);

+   ww_acquire_fini(acquire_ctx);
return ret;
}
}
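
As an aside, for readers following the ww_acquire_done()/ww_acquire_fini()
distinction the fix relies on, a rough lifecycle sketch (illustrative, not
taken from drm_gem.c):

/*
 * ww_acquire_done() only marks the acquire phase as finished; the context
 * is still live as far as lockdep is concerned. ww_acquire_fini() releases
 * the context itself. On the error paths above no locks remain held, so
 * fini, not done, is the right annotation before the context memory goes
 * away.
 */
static void ww_ctx_lifecycle_sketch(void)
{
	struct ww_acquire_ctx ctx;

	ww_acquire_init(&ctx, &reservation_ww_class);

	/* ... lock all objects, possibly backing off and retrying ... */

	ww_acquire_done(&ctx);	/* success: no further locks will be acquired */

	/* ... use the objects, then unlock them all ... */

	ww_acquire_fini(&ctx);	/* always last: the context may now go away */
}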


Re: [RFC PATCH 0/5] DRM CRTC 3D LUT interface for AMD DCN

2022-06-28 Thread Harry Wentland



On 2022-06-19 18:30, Melissa Wen wrote:
> Hi,
> 
> I've been working on a proposal to add 3D LUT interface to DRM CRTC
> color mgmt, that means new **after-blending** properties for color
> correction. As I'm targeting AMD display drivers, I need to map these
> new properties to AMD DC interface and I have some doubts about the 3D
> LUT programming on DCN blocks.
> 
> First of all, this patchset is a work in progress and further
> discussion about the DRM interface is still needed. I've examined
> previous proposals to add a 3D LUT[1][2] and I understand that the main
> difference between them is regarding the property position in the DRM
> color management pipeline (between CTM and Gamma 1D or after Gamma 1D).
> On the other hand, AMD DC always considers a shaper (1D) LUT before a 3D
> LUT, used to delinearize and shape the content.  These LUTs are then
> positioned between DRM CTM and Gamma (1D).
> 
> By comparing the AMD design with the other proposals, I see that it's
> possible to cover all of them by adding and combining shaper (1D) LUT
> and 3D LUT as new color mgmt properties. Moreover, it'll not limit the
> exposure of AMD color correction caps to the userspace. Therefore, my
> proposal is to add these two new properties in the DRM color mgmt
> pipeline as follows:
> 
>  +------------+
>  |            |
>  |  Degamma   |
>  +-----+------+
>        |
>  +-----v------+
>  |            |
>  |    CTM     |
>  +-----+------+
>        |
> ++-----v------++
> ||            ||
> || Shaper LUT ||
> ++-----+------++
>        |
> ++-----v------++
> ||            ||
> ||   3D LUT   ||
> ++-----+------++
>        |
>  +-----v------+
>  |            |
>  | Gamma (1D) |
>  +------------+
> 

As Ville already mentioned on patch 4, the increasing complexity of the
color pipeline and the arguments about the placement of the 3D LUT mean
that we will probably need a definition of a particular HW's color
pipeline. Something like this proposal from Sebastian:
https://gitlab.freedesktop.org/pq/color-and-hdr/-/issues/11

> However, many doubts arose when I was mapping these two new properties
> to the DC interface. This is why I decided to share a not-yet-completed
> implementation to get some feedback and explanations.
> 
> This RFC patchset is divided in three scopes of change. The first two
> patches document the AMD DM color correction mapping. Some comments were
> rewritten as kernel doc entries. I also summarized all the information
> provided in previous discussions[3] and redid those diagrams in
> SVG. All docs should be reviewed and some struct members lack an
> explanation. I can add them to the documentation if you can provide a
> description. Some examples that lack description so far:
> * in amdgpu_display_manager: dmub_outbox_params, dmub_aux_transfer_done, 
> delayed_hpd_wq;
> * in dpp_color_caps: dgam_ram, dgam_rom_for_yuv;
> * in mpc_color_caps: ogam_ram.
> 
> The next two patches expand the DRM color mgmt interface to add shaper LUT
> and 3D LUT. Finally, the last patch focuses on mapping DRM properties to
> the DC interface, and these are my doubts so far:
> 
> - To configure a shaper LUT, I related it to the current configuration
>   of gamma 1D. For dc_transfer_func, I should set tf according to the
>   input space, that is, LINEAR for a shaper LUT after blending, right?
>   When 3D LUT is set, the input space for gamma 1D will no longer be
>   linear, so how should the tf be defined?  Should I set tf to sRGB, BT709 or
>   what?
> 

We don't know the input space. It's nowhere defined in the KMS API. It
entirely depends on how a compositor renders the framebuffer (or transforms
it using some future KMS plane API).

DC interfaces are designed for a system where the driver is aware of the input
color space and linearity/non-linearity. This means that you'll often need
to dig through the API down to the HW programming bits to understand what
it actually does. A leaky abstraction, essentially.

Because KMS drivers don't know the linearity/non-linearity at any point
in the pipeline, we need an API where the KMS client provides the
appropriate shaper LUT. In the case of any current KMS client that
will always be non-colormanaged and is assumed to be sRGB.

If your framebuffer is non-linear (sRGB) and you're not linearizing it
using the CRTC Degamma you'll already have non-linear values and won't
need to program the shaper LUT (i.e. use it in bypass or linear).

If your framebuffer is linear and you're not de-linearizing it with the
CRTC Degamma LUT you'll have linear values and need to program the
inverse EOTF for sRGB in your shaper (or degamma) LUT.

> - I see the 3dlut values being mapped to struct tetrahedral_17 as four
>   arrays lut0-4. From that I am considering tetrahedral interpolation.
>   Is there any other interpolation option? Also, as the total size of the
>   four arrays is the same of the 3D LUT size, I'm mapping DRM color lut
>   values in ascending order, starting by filling lut0 to lut4. Is it right
>   or is the

Re: [RFC PATCH 4/5] drm/drm_color_mgmt: add 3D LUT to color mgmt properties

2022-06-28 Thread Harry Wentland



On 2022-06-27 08:18, Ville Syrjälä wrote:
> On Sun, Jun 19, 2022 at 09:31:03PM -0100, Melissa Wen wrote:
>> Add 3D LUT for gamma correction using a 3D lookup table.  The position
>> in the color correction pipeline where the 3D LUT is applied depends on hw
>> design, being after CTM or gamma. If just after CTM, a shaper lut must
>> be set to shape the content for a non-linear space. Those details should
>> be handled by the driver according to its color capabilities.
> 
> I also cooked up a WIP 3D LUT support some time ago for Intel hw:
> https://github.com/vsyrjala/linux/commits/3dlut
> But that dried up due to having no userspace for it.
> 
> I also cooked up some basic igts for it:
> https://patchwork.freedesktop.org/series/90165/
> 
>> + * “LUT3D”:
>> + *  Blob property to set the 3D LUT mapping pixel data after the color
>> + *  transformation matrix and before gamma 1D lut correction.
> 
> On Intel hw the 3DLUT is after the gamma LUT in the pipeline, which is
> where I placed it in my branch.
> 
> There is now some discussion happening about exposing some
> kind of color pipeline description/configuration properties:
> https://gitlab.freedesktop.org/pq/color-and-hdr/-/issues/11

After all the discussions about properties to support color management for
HDR and other features it's becoming clear to me that we'll need some color
pipeline description going forward, i.e. something like the one Sebastian
proposed. It's complex but if we're not defining this now we'll be in a pickle
when the next driver implementer goes and finds that their HW looks different
yet again and doesn't match any of the orders we've defined so far.

Harry
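
For context, a sketch of how a compositor could hand a 3D LUT blob to the
proposed CRTC property through libdrm. The property name and layout come
from this RFC and may still change; the lookup of the property id by name
is elided, and the include paths depend on the libdrm installation:

#include <xf86drmMode.h>
#include <drm/drm_mode.h>	/* struct drm_color_lut */

static int set_lut3d(int fd, uint32_t crtc_id, uint32_t lut3d_prop_id,
		     const struct drm_color_lut *lut, uint32_t dim)
{
	uint32_t blob_id = 0;
	int ret;

	/* dim^3 entries, e.g. 17x17x17 for the tetrahedral_17 layout */
	ret = drmModeCreatePropertyBlob(fd, lut,
					dim * dim * dim * sizeof(*lut),
					&blob_id);
	if (ret)
		return ret;

	return drmModeObjectSetProperty(fd, crtc_id, DRM_MODE_OBJECT_CRTC,
					lut3d_prop_id, blob_id);
}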


Re: [PATCH 10/20] drm/amd/display: Insert pulling smu busy status before sending another request

2022-06-28 Thread Mike Lothian
Hi

I'm seeing the following stack trace, which I'm guessing is due to the assert:

[3.516409] [ cut here ]
[3.516412] WARNING: CPU: 1 PID: 1 at
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c:98
rn_vbios_smu_send_msg_with_param+0x3e/0xe0
[3.516422] Modules linked in:
[3.516425] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc4-tip+ #3199
[3.516428] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022
[3.516432] RIP: 0010:rn_vbios_smu_send_msg_with_param+0x3e/0xe0
[3.516437] Code: f6 48 89 fb 48 8b 3b be 9b 62 01 00 48 c7 c2 02
bd 06 83 e8 44 c6 f0 ff 85 c0 75 12 bf c6 a7 00 00 e8 f6 9a b1 ff ff
c5 75 da <0f> 0b eb 05 83 f8 01 75 f7 48 8b 3b be 9b 62 01 00 48 c7 c1
3
c 86
[3.516442] RSP: 0018:88810026f628 EFLAGS: 00010202
[3.516445] RAX: 00fe RBX: 8881058a3200 RCX: 
[3.516447] RDX:  RSI: 888105adbb80 RDI: 888104f8
[3.516450] RBP: fffcf2bf R08: 888110ca6800 R09: 7fc9117f
[3.516452] R10:  R11: 819bca10 R12: 888110cd
[3.516454] R13: 888100cc2300 R14: 000d R15: 0001
[3.516457] FS:  () GS:888fde44()
knlGS:
[3.516460] CS:  0010 DS:  ES:  CR0: 80050033
[3.516462] CR2:  CR3: b360c000 CR4: 00350ee0
[3.516465] Call Trace:
[3.516468]  
[3.516470]  ? rn_clk_mgr_construct+0x744/0x760
[3.516475]  ? dc_clk_mgr_create+0x1f0/0x4f0
[3.516478]  ? dc_create+0x43a/0x5c0
[3.516481]  ? dm_hw_init+0x29a/0x2380
[3.516485]  ? vprintk_emit+0x106/0x230
[3.516488]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[3.516492]  ? dev_vprintk_emit+0x152/0x179
[3.516496]  ? smu_hw_init+0x255/0x290
[3.516500]  ? amdgpu_device_ip_init+0x32a/0x4a0
[3.516504]  ? amdgpu_device_init+0x1622/0x1bb0
[3.516507]  ? pci_bus_read_config_word+0x35/0x50
[3.516512]  ? amdgpu_driver_load_kms+0x14/0x150
[3.516515]  ? amdgpu_pci_probe+0x1c0/0x3d0
[3.516518]  ? pci_device_probe+0xd3/0x170
[3.516520]  ? really_probe+0x13e/0x320
[3.516523]  ? __driver_probe_device+0x91/0xd0
[3.516525]  ? driver_probe_device+0x1a/0x160
[3.516527]  ? __driver_attach+0xe6/0x1b0
[3.516530]  ? bus_add_driver+0x16e/0x270
[3.516533]  ? driver_register+0x85/0x120
[3.516535]  ?
__initstub__kmod_gpu_sched__180_178_drm_sched_fence_slab_init6+0x3f/0x3f
[3.516540]  ? do_one_initcall+0x100/0x290
[3.516545]  ? do_initcall_level+0x8a/0xe8
[3.516549]  ? do_initcalls+0x44/0x6b
[3.516551]  ? kernel_init_freeable+0xc7/0x10d
[3.516554]  ? rest_init+0xc0/0xc0
[3.516558]  ? kernel_init+0x15/0x140
[3.516561]  ? ret_from_fork+0x22/0x30
[3.516564]  
[3.516565] ---[ end trace  ]---

On Fri, 8 Apr 2022 at 18:27, Pavle Kotarac  wrote:
>
> From: Oliver Logush 
>
> [why]
> Make sure smu is not busy before sending another request, this is to
> prevent stress failures from MS.
>
> [how]
> Check to make sure the SMU fw busy signal is cleared before sending
> another request
>
> Reviewed-by: Charlene Liu 
> Reviewed-by: Nicholas Kazlauskas 
> Acked-by: Pavle Kotarac 
> Signed-off-by: Oliver Logush 
> ---
>  .../drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c| 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git 
> a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c 
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c
> index 8161a6ae410d..30c6f9cd717f 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr_vbios_smu.c
> @@ -94,6 +94,9 @@ static int rn_vbios_smu_send_msg_with_param(struct 
> clk_mgr_internal *clk_mgr,
>  {
> uint32_t result;
>
> +   result = rn_smu_wait_for_response(clk_mgr, 10, 20);
> +   ASSERT(result == VBIOSSMC_Result_OK);
> +
> /* First clear response register */
> REG_WRITE(MP1_SMN_C2PMSG_91, VBIOSSMC_Status_BUSY);
>
> --
> 2.32.0
>
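
For readers unfamiliar with the helper being called here,
rn_smu_wait_for_response(clk_mgr, 10, 20) follows the usual register-poll
pattern. A simplified sketch, based only on the register and status names
visible in the quoted patch, with the delay/retry parameter meaning assumed
from the call site:

static uint32_t rn_smu_wait_for_response_sketch(struct clk_mgr_internal *clk_mgr,
						unsigned int delay_us,
						unsigned int max_retries)
{
	uint32_t res = VBIOSSMC_Status_BUSY;

	do {
		res = REG_READ(MP1_SMN_C2PMSG_91);
		if (res != VBIOSSMC_Status_BUSY)
			break;		/* SMU finished the previous request */

		if (delay_us >= 1000)
			msleep(delay_us / 1000);
		else if (delay_us > 0)
			udelay(delay_us);
	} while (max_retries--);

	return res;	/* VBIOSSMC_Result_OK on success, still BUSY on timeout */
}

The assert in the trace above fires whenever that result is something other
than VBIOSSMC_Result_OK before the first request, e.g. if the response
register has not been written yet at init time.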


Re: CONFIG_ANDROID (was: rcu_sched detected expedited stalls in amdgpu after suspend)

2022-06-28 Thread Paul E. McKenney
On Tue, Jun 28, 2022 at 11:02:40AM -0400, Alex Xu (Hello71) wrote:
> Excerpts from Paul E. McKenney's message of June 28, 2022 12:12 am:
> > On Mon, Jun 27, 2022 at 09:50:53PM -0400, Alex Xu (Hello71) wrote:
> >> Ah, I see. I have selected the default value for 
> >> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT, but that is 20 if ANDROID. I am not 
> >> using Android; I'm not sure there exist Android devices with AMD GPUs. 
> >> However, I have set CONFIG_ANDROID=y in order to use 
> >> ANDROID_BINDER_IPC=m for emulation.
> >> 
> >> In general, I think CONFIG_ANDROID is not a reliable method for 
> >> detecting if the kernel is for an Android device; for example, Fedora 
> >> sets CONFIG_ANDROID, but (AFAIK) its kernel is not intended for use with 
> >> Android userspace.
> >> 
> >> On the other hand, it's not clear to me why the value 20 should be for 
> >> Android only anyways. If, as you say in 
> >> https://lore.kernel.org/lkml/20220216195508.GM4285@paulmck-ThinkPad-P17-Gen-1/,
> >> it is related to the size of the system, perhaps some other heuristic 
> >> would be more appropriate.
> > 
> > It is related to the fact that quite a few Android guys want these
> > 20-millisecond short-timeout expedited RCU CPU stall warnings, but no one
> > else does.  Not yet anyway.
> > 
> > And let's face it, the intent and purpose of CONFIG_ANDROID=y is extremely
> > straightforward and unmistakeable.  So perhaps people not running Android
> > devices but wanting a little bit of the Android functionality should do
> > something other than setting CONFIG_ANDROID=y in their .config files.  Me,
> > I am surprised that it took this long for something like this to bite you.
> > 
> > But just out of curiosity, what would you suggest instead?
> 
> Both Debian and Fedora set CONFIG_ANDROID, specifically for binder. If 
> major distro vendors are consistently making this "mistake", then 
> perhaps the problem is elsewhere.
> 
> In my own opinion, assuming that binderfs means Android vendor is not a 
> good assumption. The ANDROID help says:
> 
> > Enable support for various drivers needed on the Android platform
> 
> It doesn't say "Enable only if building an Android device", or "Enable 
> only if you are Google". Isn't the traditional Linux philosophy a 
> collection of pieces to be assembled, without gratuitous hidden 
> dependencies? For example, [0] removes the unnecessary Android 
> dependency, it doesn't block the whole thing with "depends on ANDROID".
> 
> It seems to me that the proper way to set some configuration for Android 
> kernels is or should be to ask the Android kernel config maintainers, 
> not to set it based on an upstream kernel option. There is, after all, 
> no CONFIG_FEDORA or CONFIG_UBUNTU or CONFIG_HANNAH_MONTANA.
> 
> WireGuard and random also use CONFIG_ANDROID in a similar "proxy" way as 
> rcu, there to see if suspends are "frequent". This seems dubious for the 
> same reasons.
> 
> I wonder if it might be time to retire CONFIG_ANDROID: the only 
> remaining driver covered is binder, which originates from Android but 
> is no longer used exclusively on Android systems. Like ufs-qcom, binder 
> is no longer used exclusively on Android devices; it is also used for 
> Android device emulators, which might be used on Android-like mobile 
> devices, or might not.
> 
> My understanding is that both Android and upstream kernel developers 
> intend to add no more Android-specific drivers, so binder should be the 
> only one covered for the foreseeable future.

Thank you for the perspective, but you never did suggest an alternative.

So here is what I suggest given the current setup:

config RCU_EXP_CPU_STALL_TIMEOUT
int "Expedited RCU CPU stall timeout in milliseconds"
depends on RCU_STALL_COMMON
range 0 21000
default 20 if ANDROID
default 0 if !ANDROID
help
  If a given expedited RCU grace period extends more than the
  specified number of milliseconds, a CPU stall warning is printed.
  If the RCU grace period persists, additional CPU stall warnings
  are printed at more widely spaced intervals.  A value of zero
  says to use the RCU_CPU_STALL_TIMEOUT value converted from
  seconds to milliseconds.

The default, and only the default, is controlled by ANDROID.

All you need to do to get the previous behavior is to add something like
this to your defconfig file:

CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=21000

Any reason why this will not work for you?

> > For that matter, why the private reply?
> 
> Mail client issues, not intentional. Lists re-added, plus Android, 
> WireGuard, and random.

Thank you!

Thanx, Paul

> Thanks,
> Alex.
> 
> [0] https://lore.kernel.org/all/20220321151853.24138-1-k...@kernel.org/


Re: [PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-06-28 Thread Rob Clark
On Tue, Jun 28, 2022 at 5:51 AM Dmitry Osipenko
 wrote:
>
> On 6/28/22 15:31, Robin Murphy wrote:
> > ->8-
> > [   68.295951] ==
> > [   68.295956] WARNING: possible circular locking dependency detected
> > [   68.295963] 5.19.0-rc3+ #400 Not tainted
> > [   68.295972] --
> > [   68.295977] cc1/295 is trying to acquire lock:
> > [   68.295986] 08d7f1a0
> > (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_free+0x7c/0x198
> > [   68.296036]
> > [   68.296036] but task is already holding lock:
> > [   68.296041] 8c14b820 (fs_reclaim){+.+.}-{0:0}, at:
> > __alloc_pages_slowpath.constprop.0+0x4d8/0x1470
> > [   68.296080]
> > [   68.296080] which lock already depends on the new lock.
> > [   68.296080]
> > [   68.296085]
> > [   68.296085] the existing dependency chain (in reverse order) is:
> > [   68.296090]
> > [   68.296090] -> #1 (fs_reclaim){+.+.}-{0:0}:
> > [   68.296111]fs_reclaim_acquire+0xb8/0x150
> > [   68.296130]dma_resv_lockdep+0x298/0x3fc
> > [   68.296148]do_one_initcall+0xe4/0x5f8
> > [   68.296163]kernel_init_freeable+0x414/0x49c
> > [   68.296180]kernel_init+0x2c/0x148
> > [   68.296195]ret_from_fork+0x10/0x20
> > [   68.296207]
> > [   68.296207] -> #0 (reservation_ww_class_mutex){+.+.}-{3:3}:
> > [   68.296229]__lock_acquire+0x1724/0x2398
> > [   68.296246]lock_acquire+0x218/0x5b0
> > [   68.296260]__ww_mutex_lock.constprop.0+0x158/0x2378
> > [   68.296277]ww_mutex_lock+0x7c/0x4d8
> > [   68.296291]drm_gem_shmem_free+0x7c/0x198
> > [   68.296304]panfrost_gem_free_object+0x118/0x138
> > [   68.296318]drm_gem_object_free+0x40/0x68
> > [   68.296334]drm_gem_shmem_shrinker_run_objects_scan+0x42c/0x5b8
> > [   68.296352]drm_gem_shmem_shrinker_scan_objects+0xa4/0x170
> > [   68.296368]do_shrink_slab+0x220/0x808
> > [   68.296381]shrink_slab+0x11c/0x408
> > [   68.296392]shrink_node+0x6ac/0xb90
> > [   68.296403]do_try_to_free_pages+0x1dc/0x8d0
> > [   68.296416]try_to_free_pages+0x1ec/0x5b0
> > [   68.296429]__alloc_pages_slowpath.constprop.0+0x528/0x1470
> > [   68.296444]__alloc_pages+0x4e0/0x5b8
> > [   68.296455]__folio_alloc+0x24/0x60
> > [   68.296467]vma_alloc_folio+0xb8/0x2f8
> > [   68.296483]alloc_zeroed_user_highpage_movable+0x58/0x68
> > [   68.296498]__handle_mm_fault+0x918/0x12a8
> > [   68.296513]handle_mm_fault+0x130/0x300
> > [   68.296527]do_page_fault+0x1d0/0x568
> > [   68.296539]do_translation_fault+0xa0/0xb8
> > [   68.296551]do_mem_abort+0x68/0xf8
> > [   68.296562]el0_da+0x74/0x100
> > [   68.296572]el0t_64_sync_handler+0x68/0xc0
> > [   68.296585]el0t_64_sync+0x18c/0x190
> > [   68.296596]
> > [   68.296596] other info that might help us debug this:
> > [   68.296596]
> > [   68.296601]  Possible unsafe locking scenario:
> > [   68.296601]
> > [   68.296604]CPU0CPU1
> > [   68.296608]
> > [   68.296612]   lock(fs_reclaim);
> > [   68.296622] lock(reservation_ww_class_mutex);
> > [   68.296633]lock(fs_reclaim);
> > [   68.296644]   lock(reservation_ww_class_mutex);
> > [   68.296654]
> > [   68.296654]  *** DEADLOCK ***
>
> This splat can be ignored for now. I'm aware of it, although I
> haven't looked closely at how to fix it since it's a kind of lockdep
> misreporting.

The lockdep splat could be fixed with something similar to what I've
done in msm, i.e. basically just not acquiring the lock in the finalizer:

https://patchwork.freedesktop.org/patch/489364/

There is one gotcha to watch for, as danvet pointed out
(scan_objects() could still see the obj in the LRU before the
finalizer removes it), but if scan_objects() does the
kref_get_unless_zero() trick, it is safe.

BR,
-R
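
A sketch of the guard Rob describes, for the shape of it (illustrative;
not the actual msm or shmem-helper code):

static bool shrinker_try_evict(struct drm_gem_object *obj)
{
	/* object already on its way to destruction? leave it alone */
	if (!kref_get_unless_zero(&obj->refcount))
		return false;

	/* trylock: a shrinker should not block on contended reservations */
	if (dma_resv_trylock(obj->resv)) {
		/* ... purge/evict the object's backing pages here ... */
		dma_resv_unlock(obj->resv);
	}

	drm_gem_object_put(obj);
	return true;
}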


Re: CONFIG_ANDROID (was: rcu_sched detected expedited stalls in amdgpu after suspend)

2022-06-28 Thread Jason A. Donenfeld
Hi Alex,

On Tue, Jun 28, 2022 at 11:02:40AM -0400, Alex Xu (Hello71) wrote:
> WireGuard and random also use CONFIG_ANDROID in a similar "proxy" way as 
> rcu, there to see if suspends are "frequent". This seems dubious for the 
> same reasons.

I'd be happy to take a patch in WireGuard and random.c to get rid of the
CONFIG_ANDROID usage, if you can conduct an analysis and conclude this
won't break anything inadvertently.

Jason


Re: [PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-06-28 Thread Dmitry Osipenko
Hello Robin,

On 6/28/22 15:31, Robin Murphy wrote:
>> Hello,
>>
>> This patchset introduces memory shrinker for the VirtIO-GPU DRM driver
>> and adds memory purging and eviction support to VirtIO-GPU driver.
>>
>> The new dma-buf locking convention is introduced here as well.
>>
>> During OOM, the shrinker will release BOs that are marked as "not needed"
>> by userspace using the new madvise IOCTL, it will also evict idling BOs
>> to SWAP. The userspace in this case is the Mesa VirGL driver, it will
>> mark
>> the cached BOs as "not needed", allowing kernel driver to release memory
>> of the cached shmem BOs on lowmem situations, preventing OOM kills.
>>
>> The Panfrost driver is switched to use generic memory shrinker.
> 
> I think we still have some outstanding issues here - Alyssa reported
> some weirdness yesterday, so I just tried provoking a low-memory
> condition locally with this series applied and a few debug options
> enabled, and the results as below were... interesting.

The warning and crash that you got are actually minor issues.

Alyssa caught an interesting PREEMPT_DEBUG issue in the shrinker that I
haven't seen before.

She is also experiencing another problem in the Panfrost driver with
bad shmem pages (I think). It is unrelated to this patchset and
apparently requires an extra setup to reproduce.

-- 
Best regards,
Dmitry


Re: [PATCH] drm/amdgpu/display: reduce stack size in dml32_ModeSupportAndSystemConfigurationFull()

2022-06-28 Thread Rodrigo Siqueira Jordao




On 2022-06-22 10:47, Alex Deucher wrote:

Move more stack variables into the dummy vars structure on the heap.

Fixes stack frame size errors:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c: In 
function 'dml32_ModeSupportAndSystemConfigurationFull':
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:3833:1:
 error: the frame size of 2720 bytes is larger than 2048 bytes 
[-Werror=frame-larger-than=]
  3833 | } // ModeSupportAndSystemConfigurationFull
   | ^

Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321")
Cc: Stephen Rothwell 
Cc: Aurabindo Pillai 
Cc: Rodrigo Siqueira Jordao 
Signed-off-by: Alex Deucher 
---
  .../dc/dml/dcn32/display_mode_vba_32.c| 77 ---
  .../drm/amd/display/dc/dml/display_mode_vba.h |  3 +-
  2 files changed, 36 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index 510b7a81ee12..7f144adb1e36 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -1660,8 +1660,7 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
  
  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)

  {
-   bool dummy_boolean[2];
-   unsigned int dummy_integer[1];
+   unsigned int dummy_integer[4];
bool MPCCombineMethodAsNeededForPStateChangeAndVoltage;
bool MPCCombineMethodAsPossible;
enum odm_combine_mode dummy_odm_mode[DC__NUM_DPP__MAX];
@@ -1973,10 +1972,10 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l

v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[5],
 /* LongDETBufferSizeInKByte[]  */

v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[6],
 /* LongDETBufferSizeY[]  */

v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[7],
 /* LongDETBufferSizeC[]  */
-   &dummy_boolean[0], /* bool   
*UnboundedRequestEnabled  */
-   &dummy_integer[0], /* Long   
*CompressedBufferSizeInkByte  */
+   
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[0][0],
 /* bool   *UnboundedRequestEnabled  */
+   
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[0][0],
 /* Long   *CompressedBufferSizeInkByte  */
mode_lib->vba.SingleDPPViewportSizeSupportPerSurface,/* 
bool ViewportSizeSupportPerSurface[] */
-   &dummy_boolean[1]); /* bool   
*ViewportSizeSupport */
+   
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[1][0]);
 /* bool   *ViewportSizeSupport */
  
  	MPCCombineMethodAsNeededForPStateChangeAndVoltage = false;

MPCCombineMethodAsPossible = false;
@@ -2506,7 +2505,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
//
for (i = 0; i < (int) v->soc.num_states; ++i) {
for (j = 0; j <= 1; ++j) {
-   bool dummy_boolean_array[1][DC__NUM_DPP__MAX];
for (k = 0; k < mode_lib->vba.NumberOfActiveSurfaces; 
++k) {
mode_lib->vba.RequiredDPPCLKThisState[k] = 
mode_lib->vba.RequiredDPPCLK[i][j][k];
mode_lib->vba.NoOfDPPThisState[k] = 
mode_lib->vba.NoOfDPP[i][j][k];
@@ -2570,7 +2568,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
mode_lib->vba.DETBufferSizeCThisState,

&mode_lib->vba.UnboundedRequestEnabledThisState,

&mode_lib->vba.CompressedBufferSizeInkByteThisState,
-   dummy_boolean_array[0],
+   
v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[0],

&mode_lib->vba.ViewportSizeSupport[i][j]);
  
  			for (k = 0; k < mode_lib->vba.NumberOfActiveSurfaces; ++k) {

@@ -2708,9 +2706,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
}
  
  			{

-   bool dummy_boolean_array[2][DC__NUM_DPP__MAX];
-   unsigned int 
dummy_integer_array[22][DC__NUM_DPP__MAX];
-
dml32_CalculateVMRowAndSwath(

mode_lib->vba.NumberOfActiveSurfaces,
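
In essence, the change applies the usual stack-to-heap scratch-space
pattern: large local arrays that push the function past the 2048-byte
frame limit are moved into a per-instance "dummy vars" structure that
already lives on the heap. A minimal, self-contained sketch of that
pattern follows, using hypothetical names (NUM_DPP_MAX, dummy_vars,
mode_ctx, helper) rather than the real DML structures.

#include <stdbool.h>
#include <stdlib.h>

#define NUM_DPP_MAX 8

struct dummy_vars {
	/* Scratch arrays that used to be local variables on the stack. */
	bool dummy_boolean_array[2][NUM_DPP_MAX];
	unsigned int dummy_integer_array[22][NUM_DPP_MAX];
};

struct mode_ctx {
	struct dummy_vars dummy_vars;	/* allocated once with the context */
};

static void helper(bool *viewport_ok, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		viewport_ok[i] = true;	/* stand-in for the real computation */
}

static void compute_mode_support(struct mode_ctx *v)
{
	/*
	 * Before: "bool dummy_boolean_array[2][NUM_DPP_MAX];" was declared
	 * here and counted against the frame size. After: the scratch space
	 * is borrowed from the heap-resident context instead.
	 */
	helper(v->dummy_vars.dummy_boolean_array[0], NUM_DPP_MAX);
}

int main(void)
{
	struct mode_ctx *v = calloc(1, sizeof(*v));

	if (!v)
		return 1;
	compute_mode_support(v);
	free(v);
	return 0;
}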

[PATCH 10/11] libhsakmt: add open SMI event handle

2022-06-28 Thread Philip Yang
System Management Interface events are read from an anonymous file
handle; this helper wraps the ioctl interface to get the anonymous file
handle for a GPU node ID.

Define SMI event IDs and event triggers, copying the same values from
kfd_ioctl.h to avoid translation.

Change-Id: I5c8ba5301473bb3b80bb4e2aa33a9f675bedb001
Signed-off-by: Philip Yang 
---
 include/hsakmt.h  | 16 ++
 include/hsakmttypes.h | 49 +++
 src/events.c  | 27 
 src/libhsakmt.ver |  1 +
 4 files changed, 93 insertions(+)

diff --git a/include/hsakmt.h b/include/hsakmt.h
index abc617f..ca586ba 100644
--- a/include/hsakmt.h
+++ b/include/hsakmt.h
@@ -877,6 +877,22 @@ hsaKmtGetXNACKMode(
 HSAint32 * enable  // OUT: returns XNACK value.
 );
 
+/**
+   Open anonymous file handle to enable events and read SMI events.
+
+   To enable events, write 64bit events mask to fd, event enums as bit index.
+   for example, event mask 
(HSA_SMI_EVENT_MASK_FROM_INDEX(HSA_SMI_EVENT_INDEX_MAX) - 1) to enable all 
events
+
+   Read event from fd is not blocking, use poll with timeout value to check if 
event is available.
+   Event is dropped if kernel event fifo is full.
+*/
+HSAKMT_STATUS
+HSAKMTAPI
+hsaKmtOpenSMI(
+HSAuint32 NodeId,   // IN: GPU node_id to receive the SMI event from
+int *fd // OUT: anonymous file handle
+);
+
 #ifdef __cplusplus
 }   //extern "C"
 #endif
diff --git a/include/hsakmttypes.h b/include/hsakmttypes.h
index ab2591b..690e001 100644
--- a/include/hsakmttypes.h
+++ b/include/hsakmttypes.h
@@ -1354,6 +1354,55 @@ typedef struct _HSA_SVM_ATTRIBUTE {
HSAuint32 value; // attribute value
 } HSA_SVM_ATTRIBUTE;
 
+typedef enum _HSA_SMI_EVENT {
+   HSA_SMI_EVENT_NONE = 0, /* not used */
+   HSA_SMI_EVENT_VMFAULT = 1, /* event start counting at 1 */
+   HSA_SMI_EVENT_THERMAL_THROTTLE = 2,
+   HSA_SMI_EVENT_GPU_PRE_RESET = 3,
+   HSA_SMI_EVENT_GPU_POST_RESET = 4,
+   HSA_SMI_EVENT_MIGRATE_START = 5,
+   HSA_SMI_EVENT_MIGRATE_END = 6,
+   HSA_SMI_EVENT_PAGE_FAULT_START = 7,
+   HSA_SMI_EVENT_PAGE_FAULT_END = 8,
+   HSA_SMI_EVENT_QUEUE_EVICTION = 9,
+   HSA_SMI_EVENT_QUEUE_RESTORE = 10,
+   HSA_SMI_EVENT_UNMAP_FROM_GPU = 11,
+   HSA_SMI_EVENT_INDEX_MAX = 12,
+
+   /*
+* max event number, as a flag bit to get events from all processes,
+* this requires super user permission, otherwise will not be able to
+* receive event from any process. Without this flag to receive events
+* from same process.
+*/
+   HSA_SMI_EVENT_ALL_PROCESS = 64
+} HSA_EVENT_TYPE;
+
+typedef enum _HSA_MIGRATE_TRIGGERS {
+   HSA_MIGRATE_TRIGGER_PREFETCH,
+   HSA_MIGRATE_TRIGGER_PAGEFAULT_GPU,
+   HSA_MIGRATE_TRIGGER_PAGEFAULT_CPU,
+   HSA_MIGRATE_TRIGGER_TTM_EVICTION
+} HSA_MIGRATE_TRIGGERS;
+
+typedef enum _HSA_QUEUE_EVICTION_TRIGGERS {
+   HSA_QUEUE_EVICTION_TRIGGER_SVM,
+   HSA_QUEUE_EVICTION_TRIGGER_USERPTR,
+   HSA_QUEUE_EVICTION_TRIGGER_TTM,
+   HSA_QUEUE_EVICTION_TRIGGER_SUSPEND,
+   HSA_QUEUE_EVICTION_CRIU_CHECKPOINT,
+   HSA_QUEUE_EVICTION_CRIU_RESTORE
+} HSA_QUEUE_EVICTION_TRIGGERS;
+
+typedef enum _HSA_SVM_UNMAP_TRIGGERS {
+   HSA_SVM_UNMAP_TRIGGER_MMU_NOTIFY,
+   HSA_SVM_UNMAP_TRIGGER_MMU_NOTIFY_MIGRATE,
+   HSA_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU
+} HSA_SVM_UNMAP_TRIGGERS;
+
+#define HSA_SMI_EVENT_MASK_FROM_INDEX(i) (1ULL << ((i) - 1))
+#define HSA_SMI_EVENT_MSG_SIZE 96
+
 #pragma pack(pop, hsakmttypes_h)
 
 
diff --git a/src/events.c b/src/events.c
index d4c751c..06d3959 100644
--- a/src/events.c
+++ b/src/events.c
@@ -339,3 +339,30 @@ out:
 
return result;
 }
+
+HSAKMT_STATUS HSAKMTAPI hsaKmtOpenSMI(HSAuint32 NodeId, int *fd)
+{
+   struct kfd_ioctl_smi_events_args args;
+   HSAKMT_STATUS result;
+   uint32_t gpuid;
+
+   CHECK_KFD_OPEN();
+
+   pr_debug("[%s] node %d\n", __func__, NodeId);
+
+   result = validate_nodeid(NodeId, &gpuid);
+   if (result != HSAKMT_STATUS_SUCCESS) {
+   pr_err("[%s] invalid node ID: %d\n", __func__, NodeId);
+   return result;
+   }
+
+   args.gpuid = gpuid;
+   result = kmtIoctl(kfd_fd, AMDKFD_IOC_SMI_EVENTS, &args);
+   if (result) {
+   pr_debug("open SMI event fd failed %s\n", strerror(errno));
+   return HSAKMT_STATUS_ERROR;
+   }
+
+   *fd = args.anon_fd;
+   return HSAKMT_STATUS_SUCCESS;
+}
diff --git a/src/libhsakmt.ver b/src/libhsakmt.ver
index 50c309d..46370c6 100644
--- a/src/libhsakmt.ver
+++ b/src/libhsakmt.ver
@@ -69,6 +69,7 @@ hsaKmtSVMSetAttr;
 hsaKmtSVMGetAttr;
 hsaKmtSetXNACKMode;
 hsaKmtGetXNACKMode;
+hsaKmtOpenSMI;
 
 local: *;
 };
-- 
2.35.1
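
A rough sketch of how a tool might consume the new handle, based only on
the declarations in this patch: open the fd for a node, write a 64-bit
event mask to enable events, then poll and read the text records. Error
handling is trimmed, NODE_ID is a made-up placeholder, and the sketch
assumes the library was already initialized via hsaKmtOpenKFD(); it is
not part of the patch itself.

#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include "hsakmt.h"

#define NODE_ID 1	/* hypothetical GPU node id from the topology */

static int watch_smi_events(void)
{
	HSAuint64 mask = HSA_SMI_EVENT_MASK_FROM_INDEX(HSA_SMI_EVENT_INDEX_MAX) - 1;
	char buf[1024];		/* room for a few queued records */
	struct pollfd pfd;
	int fd;

	if (hsaKmtOpenSMI(NODE_ID, &fd) != HSAKMT_STATUS_SUCCESS)
		return -1;

	/* Enable every per-process event (all bits below INDEX_MAX). */
	if (write(fd, &mask, sizeof(mask)) != sizeof(mask)) {
		close(fd);
		return -1;
	}

	pfd.fd = fd;
	pfd.events = POLLIN;
	while (poll(&pfd, 1, 1000 /* ms */) > 0) {
		ssize_t n = read(fd, buf, sizeof(buf) - 1);

		if (n <= 0)
			break;
		buf[n] = '\0';
		fputs(buf, stdout);	/* one line of text per event */
	}

	close(fd);
	return 0;
}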



[PATCH v5 8/11] drm/amdkfd: Bump KFD API version for SMI profiling event

2022-06-28 Thread Philip Yang
Indicate that SMI profiling events are available.

Signed-off-by: Philip Yang 
---
 include/uapi/linux/kfd_ioctl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index f239e260796b..b024e8ba865d 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -35,9 +35,10 @@
  * - 1.7 - Checkpoint Restore (CRIU) API
  * - 1.8 - CRIU - Support for SDMA transfers with GTT BOs
  * - 1.9 - Add available memory ioctl
+ * - 1.10 - Add SMI profiler event log
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 9
+#define KFD_IOCTL_MINOR_VERSION 10
 
 struct kfd_ioctl_get_version_args {
__u32 major_version;/* from KFD */
-- 
2.35.1



[PATCH 11/11] ROCR-Runtime Basic SVM profiler

2022-06-28 Thread Philip Yang
From: Sean Keely 

Mostly a demo at this point.  Logs SVM (aka HMM) info to
HSA_SVM_PROFILE if set.

Example: HSA_SVM_PROFILE=log.txt SomeApp

Change-Id: Ib6fd688f661a21b2c695f586b833be93662a15f4
---
 src/CMakeLists.txt|   1 +
 src/core/inc/amd_gpu_agent.h  |   3 +
 src/core/inc/runtime.h|   9 +
 src/core/inc/svm_profiler.h   |  67 ++
 src/core/runtime/runtime.cpp  |   8 +
 src/core/runtime/svm_profiler.cpp | 364 ++
 src/core/util/flag.h  |   6 +
 7 files changed, 458 insertions(+)
 create mode 100644 src/core/inc/svm_profiler.h
 create mode 100644 src/core/runtime/svm_profiler.cpp

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 8fb02b14..1b7bf9b0 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -180,6 +180,7 @@ set ( SRCS core/util/lnx/os_linux.cpp
core/runtime/signal.cpp
core/runtime/queue.cpp
core/runtime/cache.cpp
+   core/runtime/svm_profiler.cpp
core/common/shared.cpp
core/common/hsa_table_interface.cpp
loader/executable.cpp
diff --git a/src/core/inc/amd_gpu_agent.h b/src/core/inc/amd_gpu_agent.h
index ed64d5be..fbdccaae 100644
--- a/src/core/inc/amd_gpu_agent.h
+++ b/src/core/inc/amd_gpu_agent.h
@@ -283,6 +283,9 @@ class GpuAgent : public GpuAgentInt {
   // @brief Returns Hive ID
   __forceinline uint64_t HiveId() const override { return  properties_.HiveID; 
}
 
+  // @brief Returns KFD's GPU id which is a hash used internally.
+  __forceinline uint64_t KfdGpuID() const { return properties_.KFDGpuID; }
+
   // @brief Returns node property.
   __forceinline const HsaNodeProperties& properties() const {
 return properties_;
diff --git a/src/core/inc/runtime.h b/src/core/inc/runtime.h
index 9f5b8acc..13190c75 100644
--- a/src/core/inc/runtime.h
+++ b/src/core/inc/runtime.h
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "core/inc/hsa_ext_interface.h"
 #include "core/inc/hsa_internal.h"
@@ -60,6 +61,7 @@
 #include "core/inc/memory_region.h"
 #include "core/inc/signal.h"
 #include "core/inc/interrupt_signal.h"
+#include "core/inc/svm_profiler.h"
 #include "core/util/flag.h"
 #include "core/util/locks.h"
 #include "core/util/os.h"
@@ -312,6 +314,8 @@ class Runtime {
 
   const std::vector& gpu_ids() { return gpu_ids_; }
 
+  Agent* agent_by_gpuid(uint32_t gpuid) { return agents_by_gpuid_[gpuid]; }
+
   Agent* region_gpu() { return region_gpu_; }
 
   const std::vector& system_regions_fine() const {
@@ -508,6 +512,9 @@ class Runtime {
   // Agent map containing all agents indexed by their KFD node IDs.
   std::map > agents_by_node_;
 
+  // Agent map containing all agents indexed by their KFD gpuid.
+  std::map agents_by_gpuid_;
+
   // Agent list containing all compatible gpu agent ids in the platform.
   std::vector gpu_ids_;
 
@@ -590,6 +597,8 @@ class Runtime {
   // Kfd version
   KfdVersion_t kfd_version;
 
+  std::unique_ptr svm_profile_;
+
   // Frees runtime memory when the runtime library is unloaded if safe to do 
so.
   // Failure to release the runtime indicates an incorrect application but is
   // common (example: calls library routines at process exit).
diff --git a/src/core/inc/svm_profiler.h b/src/core/inc/svm_profiler.h
new file mode 100644
index ..064965c7
--- /dev/null
+++ b/src/core/inc/svm_profiler.h
@@ -0,0 +1,67 @@
+
+//
+// The University of Illinois/NCSA
+// Open Source License (NCSA)
+//
+// Copyright (c) 2022-2022, Advanced Micro Devices, Inc. All rights reserved.
+//
+// Developed by:
+//
+// AMD Research and AMD HSA Software Development
+//
+// Advanced Micro Devices, Inc.
+//
+// www.amd.com
+//
+// Permission is hereby granted, free of charge, to any person obtaining a copy
+// of this software and associated documentation files (the "Software"), to
+// deal with the Software without restriction, including without limitation
+// the rights to use, copy, modify, merge, publish, distribute, sublicense,
+// and/or sell copies of the Software, and to permit persons to whom the
+// Software is furnished to do so, subject to the following conditions:
+//
+//  - Redistributions of source code must retain the above copyright notice,
+//this list of conditions and the following disclaimers.
+//  - Redistributions in binary form must reproduce the above copyright
+//notice, this list of conditions and the following disclaimers in
+//the documentation and/or other materials provided with the distribution.
+//  - Neither the names of Advanced Micro Devices, Inc,
+//nor the names of its contributors may be used to endorse or promote
+//products derived from this Software without specific prior written
+//permission.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCL

[PATCH v5 6/11] drm/amdkfd: Add unmap from GPU SMI event

2022-06-28 Thread Philip Yang
An SVM range is unmapped from the GPUs when the range is unmapped from
the CPU, or, with XNACK on, from the MMU notifier when the range is
evicted or migrated.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c |  9 
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c| 25 +++--
 3 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 3917c38204d0..e5896b7a16dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -318,6 +318,15 @@ void kfd_smi_event_queue_restore_rescheduled(struct 
mm_struct *mm)
kfd_unref_process(p);
 }
 
+void kfd_smi_event_unmap_from_gpu(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, unsigned long last,
+ uint32_t trigger)
+{
+   kfd_smi_event_add(pid, dev, KFD_SMI_EVENT_UNMAP_FROM_GPU,
+ "%lld -%d @%lx(%lx) %x %d\n", ktime_get_boottime_ns(),
+ pid, address, last - address + 1, dev->id, trigger);
+}
+
 int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 {
struct kfd_smi_client *client;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
index b23292637239..76fe4e0ec2d2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
@@ -46,4 +46,7 @@ void kfd_smi_event_queue_eviction(struct kfd_dev *dev, pid_t 
pid,
  uint32_t trigger);
 void kfd_smi_event_queue_restore(struct kfd_dev *dev, pid_t pid);
 void kfd_smi_event_queue_restore_rescheduled(struct mm_struct *mm);
+void kfd_smi_event_unmap_from_gpu(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, unsigned long last,
+ uint32_t trigger);
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index ddc1e4651919..bf888ae84c92 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1200,7 +1200,7 @@ svm_range_unmap_from_gpu(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
 
 static int
 svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start,
- unsigned long last)
+ unsigned long last, uint32_t trigger)
 {
DECLARE_BITMAP(bitmap, MAX_GPU_INSTANCE);
struct kfd_process_device *pdd;
@@ -1232,6 +1232,9 @@ svm_range_unmap_from_gpus(struct svm_range *prange, 
unsigned long start,
return -EINVAL;
}
 
+   kfd_smi_event_unmap_from_gpu(pdd->dev, p->lead_thread->pid,
+start, last, trigger);
+
r = svm_range_unmap_from_gpu(pdd->dev->adev,
 drm_priv_to_vm(pdd->drm_priv),
 start, last, &fence);
@@ -1759,7 +1762,8 @@ static void svm_range_restore_work(struct work_struct 
*work)
  */
 static int
 svm_range_evict(struct svm_range *prange, struct mm_struct *mm,
-   unsigned long start, unsigned long last)
+   unsigned long start, unsigned long last,
+   enum mmu_notifier_event event)
 {
struct svm_range_list *svms = prange->svms;
struct svm_range *pchild;
@@ -1804,6 +1808,12 @@ svm_range_evict(struct svm_range *prange, struct 
mm_struct *mm,
msecs_to_jiffies(AMDGPU_SVM_RANGE_RESTORE_DELAY_MS));
} else {
unsigned long s, l;
+   uint32_t trigger;
+
+   if (event == MMU_NOTIFY_MIGRATE)
+   trigger = KFD_SVM_UNMAP_TRIGGER_MMU_NOTIFY_MIGRATE;
+   else
+   trigger = KFD_SVM_UNMAP_TRIGGER_MMU_NOTIFY;
 
pr_debug("invalidate unmap svms 0x%p [0x%lx 0x%lx] from GPUs\n",
 prange->svms, start, last);
@@ -1812,13 +1822,13 @@ svm_range_evict(struct svm_range *prange, struct 
mm_struct *mm,
s = max(start, pchild->start);
l = min(last, pchild->last);
if (l >= s)
-   svm_range_unmap_from_gpus(pchild, s, l);
+   svm_range_unmap_from_gpus(pchild, s, l, 
trigger);
mutex_unlock(&pchild->lock);
}
s = max(start, prange->start);
l = min(last, prange->last);
if (l >= s)
-   svm_range_unmap_from_gpus(prange, s, l);
+   svm_range_unmap_from_gpus(prange, s, l, trigger);
}
 
return r;
@@ -2232,6 +2242,7 @@ static void
 svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_r

[PATCH 9/11] libhsakmt: hsaKmtGetNodeProperties add gpu_id

2022-06-28 Thread Philip Yang
Add KFDGpuID to HsaNodeProperties to return the gpu_id to the upper
layer; gpu_id is a hash ID generated by KFD to distinguish the GPUs on
the system. ROCr and ROCProfiler will use gpu_id to analyze SMI event
messages.

Change-Id: I6eabe6849230e04120674f5bc55e6ea254a532d6
Signed-off-by: Philip Yang 
---
 include/hsakmttypes.h |  4 +++-
 src/fmm.c |  7 +++
 src/libhsakmt.h   |  1 -
 src/topology.c| 26 --
 4 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/include/hsakmttypes.h b/include/hsakmttypes.h
index 9063f85..ab2591b 100644
--- a/include/hsakmttypes.h
+++ b/include/hsakmttypes.h
@@ -328,7 +328,9 @@ typedef struct _HsaNodeProperties
 
 HSAuint32   VGPRSizePerCU; // VGPR size in bytes per CU
 HSAuint32   SGPRSizePerCU; // SGPR size in bytes per CU
-HSAuint8Reserved[12];
+
+HSAuint32   KFDGpuID;  // GPU Hash ID generated by KFD
+HSAuint8Reserved[8];
 } HsaNodeProperties;
 
 
diff --git a/src/fmm.c b/src/fmm.c
index 35da3b8..92b76e1 100644
--- a/src/fmm.c
+++ b/src/fmm.c
@@ -2170,7 +2170,6 @@ HSAKMT_STATUS fmm_init_process_apertures(unsigned int 
NumNodes)
 {
uint32_t i;
int32_t gpu_mem_id = 0;
-   uint32_t gpu_id;
HsaNodeProperties props;
struct kfd_process_device_apertures *process_apertures;
uint32_t num_of_sysfs_nodes;
@@ -2235,14 +2234,14 @@ HSAKMT_STATUS fmm_init_process_apertures(unsigned int 
NumNodes)
 
for (i = 0; i < NumNodes; i++) {
memset(&props, 0, sizeof(props));
-   ret = topology_sysfs_get_node_props(i, &props, &gpu_id, NULL, 
NULL);
+   ret = topology_sysfs_get_node_props(i, &props, NULL, NULL);
if (ret != HSAKMT_STATUS_SUCCESS)
goto sysfs_parse_failed;
 
topology_setup_is_dgpu_param(&props);
 
/* Skip non-GPU nodes */
-   if (gpu_id != 0) {
+   if (props.KFDGpuID) {
int fd = open_drm_render_device(props.DrmRenderMinor);
if (fd <= 0) {
ret = HSAKMT_STATUS_ERROR;
@@ -2254,7 +2253,7 @@ HSAKMT_STATUS fmm_init_process_apertures(unsigned int 
NumNodes)
gpu_mem[gpu_mem_count].EngineId.ui32.Stepping = 
props.EngineId.ui32.Stepping;
 
gpu_mem[gpu_mem_count].drm_render_fd = fd;
-   gpu_mem[gpu_mem_count].gpu_id = gpu_id;
+   gpu_mem[gpu_mem_count].gpu_id = props.KFDGpuID;
gpu_mem[gpu_mem_count].local_mem_size = 
props.LocalMemSize;
gpu_mem[gpu_mem_count].device_id = props.DeviceId;
gpu_mem[gpu_mem_count].node_id = i;
diff --git a/src/libhsakmt.h b/src/libhsakmt.h
index e4246e0..822744b 100644
--- a/src/libhsakmt.h
+++ b/src/libhsakmt.h
@@ -173,7 +173,6 @@ HSAKMT_STATUS validate_nodeid_array(uint32_t **gpu_id_array,
uint32_t NumberOfNodes, uint32_t *NodeArray);
 
 HSAKMT_STATUS topology_sysfs_get_node_props(uint32_t node_id, 
HsaNodeProperties *props,
-   uint32_t *gpu_id,
bool *p2p_links, uint32_t 
*num_p2pLinks);
 HSAKMT_STATUS topology_sysfs_get_system_props(HsaSystemProperties *props);
 void topology_setup_is_dgpu_param(HsaNodeProperties *props);
diff --git a/src/topology.c b/src/topology.c
index 81ff62f..99a6a03 100644
--- a/src/topology.c
+++ b/src/topology.c
@@ -56,7 +56,6 @@
 #define KFD_SYSFS_PATH_NODES "/sys/devices/virtual/kfd/kfd/topology/nodes"
 
 typedef struct {
-   uint32_t gpu_id;
HsaNodeProperties node;
HsaMemoryProperties *mem; /* node->NumBanks elements */
HsaCacheProperties *cache;
@@ -1037,7 +1036,6 @@ static int topology_get_marketing_name(int minor, 
uint16_t *marketing_name)
 
 HSAKMT_STATUS topology_sysfs_get_node_props(uint32_t node_id,
HsaNodeProperties *props,
-   uint32_t *gpu_id,
bool *p2p_links,
uint32_t *num_p2pLinks)
 {
@@ -1056,13 +1054,14 @@ HSAKMT_STATUS topology_sysfs_get_node_props(uint32_t 
node_id,
HSAKMT_STATUS ret = HSAKMT_STATUS_SUCCESS;
 
assert(props);
-   assert(gpu_id);
ret = topology_sysfs_map_node_id(node_id, &sys_node_id);
if (ret != HSAKMT_STATUS_SUCCESS)
return ret;
 
/* Retrieve the GPU ID */
-   ret = topology_sysfs_get_gpu_id(sys_node_id, gpu_id);
+   ret = topology_sysfs_get_gpu_id(sys_node_id, &props->KFDGpuID);
+   if (ret != HSAKMT_STATUS_SUCCESS)
+   return ret;
 
read_buf = malloc(PAGE_SIZE);
if (!read_buf)
@@ -1723,7 +1722,7 @@ static int32_t gpu_get_direct_link_cpu(uint32_t gpu_node, 
node_props_t *node_pro
  

[PATCH v5 4/11] drm/amdkfd: Add migration SMI event

2022-06-28 Thread Philip Yang
For the migration start and end events, output the timestamp when the
migration starts and ends, the SVM range address and size, the GPU IDs
of the migration source and destination, and the SVM range attributes.

The migration trigger can be prefetch, a CPU or GPU page fault, or TTM
eviction.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c| 53 -
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h|  5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 22 +
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  8 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c| 16 ---
 5 files changed, 83 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index fb8a94e52656..9667015a6cbc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -32,6 +32,7 @@
 #include "kfd_priv.h"
 #include "kfd_svm.h"
 #include "kfd_migrate.h"
+#include "kfd_smi_events.h"
 
 #ifdef dev_fmt
 #undef dev_fmt
@@ -402,8 +403,9 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
 static long
 svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
struct vm_area_struct *vma, uint64_t start,
-   uint64_t end)
+   uint64_t end, uint32_t trigger)
 {
+   struct kfd_process *p = container_of(prange->svms, struct kfd_process, 
svms);
uint64_t npages = (end - start) >> PAGE_SHIFT;
struct kfd_process_device *pdd;
struct dma_fence *mfence = NULL;
@@ -430,6 +432,11 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.dst = migrate.src + npages;
scratch = (dma_addr_t *)(migrate.dst + npages);
 
+   kfd_smi_event_migration_start(adev->kfd.dev, p->lead_thread->pid,
+ start >> PAGE_SHIFT, end >> PAGE_SHIFT,
+ 0, adev->kfd.dev->id, 
prange->prefetch_loc,
+ prange->preferred_loc, trigger);
+
r = migrate_vma_setup(&migrate);
if (r) {
dev_err(adev->dev, "%s: vma setup fail %d range [0x%lx 
0x%lx]\n",
@@ -458,6 +465,10 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
svm_migrate_copy_done(adev, mfence);
migrate_vma_finalize(&migrate);
 
+   kfd_smi_event_migration_end(adev->kfd.dev, p->lead_thread->pid,
+   start >> PAGE_SHIFT, end >> PAGE_SHIFT,
+   0, adev->kfd.dev->id, trigger);
+
svm_range_dma_unmap(adev->dev, scratch, 0, npages);
svm_range_free_dma_mappings(prange);
 
@@ -479,6 +490,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
  * @prange: range structure
  * @best_loc: the device to migrate to
  * @mm: the process mm structure
+ * @trigger: reason of migration
  *
  * Context: Process context, caller hold mmap read lock, svms lock, prange lock
  *
@@ -487,7 +499,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
  */
 static int
 svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc,
-   struct mm_struct *mm)
+   struct mm_struct *mm, uint32_t trigger)
 {
unsigned long addr, start, end;
struct vm_area_struct *vma;
@@ -524,7 +536,7 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t 
best_loc,
break;
 
next = min(vma->vm_end, end);
-   r = svm_migrate_vma_to_vram(adev, prange, vma, addr, next);
+   r = svm_migrate_vma_to_vram(adev, prange, vma, addr, next, 
trigger);
if (r < 0) {
pr_debug("failed %ld to migrate\n", r);
break;
@@ -655,8 +667,10 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
  */
 static long
 svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
-  struct vm_area_struct *vma, uint64_t start, uint64_t end)
+  struct vm_area_struct *vma, uint64_t start, uint64_t end,
+  uint32_t trigger)
 {
+   struct kfd_process *p = container_of(prange->svms, struct kfd_process, 
svms);
uint64_t npages = (end - start) >> PAGE_SHIFT;
unsigned long upages = npages;
unsigned long cpages = 0;
@@ -685,6 +699,11 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.dst = migrate.src + npages;
scratch = (dma_addr_t *)(migrate.dst + npages);
 
+   kfd_smi_event_migration_start(adev->kfd.dev, p->lead_thread->pid,
+ start >> PAGE_SHIFT, end >> PAGE_SHIFT,
+ adev->kfd.dev->id, 0, 
prange->prefetch_loc,
+   

[PATCH v5 3/11] drm/amdkfd: Add GPU recoverable fault SMI event

2022-06-28 Thread Philip Yang
Use ktime_get_boottime_ns() as the timestamp to correlate with other
APIs. Output the timestamp when the GPU recoverable fault starts and
when it ends, whether migration happened or only the GPU page table was
updated to recover the fault, the fault address, and whether it was a
read or write fault.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 17 +
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c| 17 +
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h|  2 +-
 4 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 55ed026435e2..b7e68283925f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -244,6 +244,23 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, 
uint16_t pasid)
  task_info.pid, task_info.task_name);
 }
 
+void kfd_smi_event_page_fault_start(struct kfd_dev *dev, pid_t pid,
+   unsigned long address, bool write_fault,
+   ktime_t ts)
+{
+   kfd_smi_event_add(pid, dev, KFD_SMI_EVENT_PAGE_FAULT_START,
+ "%lld -%d @%lx(%x) %c\n", ktime_to_ns(ts), pid,
+ address, dev->id, write_fault ? 'W' : 'R');
+}
+
+void kfd_smi_event_page_fault_end(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, bool migration)
+{
+   kfd_smi_event_add(pid, dev, KFD_SMI_EVENT_PAGE_FAULT_END,
+ "%lld -%d @%lx(%x) %c\n", ktime_get_boottime_ns(),
+ pid, address, dev->id, migration ? 'M' : 'U');
+}
+
 int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 {
struct kfd_smi_client *client;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
index dfe101c21166..7903718cd9eb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
@@ -29,5 +29,9 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, 
uint16_t pasid);
 void kfd_smi_event_update_thermal_throttling(struct kfd_dev *dev,
 uint64_t throttle_bitmask);
 void kfd_smi_event_update_gpu_reset(struct kfd_dev *dev, bool post_reset);
-
+void kfd_smi_event_page_fault_start(struct kfd_dev *dev, pid_t pid,
+   unsigned long address, bool write_fault,
+   ktime_t ts);
+void kfd_smi_event_page_fault_end(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, bool migration);
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index d6fc00d51c8c..2ad08a1f38dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -32,6 +32,7 @@
 #include "kfd_priv.h"
 #include "kfd_svm.h"
 #include "kfd_migrate.h"
+#include "kfd_smi_events.h"
 
 #ifdef dev_fmt
 #undef dev_fmt
@@ -1617,7 +1618,7 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
svm_range_unreserve_bos(&ctx);
 
if (!r)
-   prange->validate_timestamp = ktime_to_us(ktime_get());
+   prange->validate_timestamp = ktime_get_boottime();
 
return r;
 }
@@ -2694,11 +2695,12 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
struct svm_range_list *svms;
struct svm_range *prange;
struct kfd_process *p;
-   uint64_t timestamp;
+   ktime_t timestamp = ktime_get_boottime();
int32_t best_loc;
int32_t gpuidx = MAX_GPU_INSTANCE;
bool write_locked = false;
struct vm_area_struct *vma;
+   bool migration = false;
int r = 0;
 
if (!KFD_IS_SVM_API_SUPPORTED(adev->kfd.dev)) {
@@ -2775,9 +2777,9 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
goto out_unlock_range;
}
 
-   timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
/* skip duplicate vm fault on different pages of same range */
-   if (timestamp < AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING) {
+   if (ktime_before(timestamp, ktime_add_ns(prange->validate_timestamp,
+   AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING))) {
pr_debug("svms 0x%p [0x%lx %lx] already restored\n",
 svms, prange->start, prange->last);
r = 0;
@@ -2813,7 +2815,11 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
 svms, prange->start, prange->last, best_loc,
 prange->actual_loc);
 
+   kfd_smi_event_page_fault_start(adev->kfd.dev, p->lead_thread->pid, addr,
+  write_fault, timestamp);
+
if (prange->actual_

[PATCH v5 5/11] drm/amdkfd: Add user queue eviction restore SMI event

2022-06-28 Thread Philip Yang
Output user queue eviction and restore events. User queue eviction may be
triggered by the SVM or userptr MMU notifier, TTM eviction, device
suspend, or CRIU checkpoint and restore.

User queue restore may be rescheduled if eviction happens again while
restoring.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 12 ---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 15 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   | 35 +++
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   |  4 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  6 ++--
 9 files changed, 69 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index b25b41f50213..73bf8b5f2aa9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -336,7 +336,7 @@ void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo)
 }
 #endif
 /* KGD2KFD callbacks */
-int kgd2kfd_quiesce_mm(struct mm_struct *mm);
+int kgd2kfd_quiesce_mm(struct mm_struct *mm, uint32_t trigger);
 int kgd2kfd_resume_mm(struct mm_struct *mm);
 int kgd2kfd_schedule_evict_and_restore_process(struct mm_struct *mm,
struct dma_fence *fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 5ba9070d8722..6a7e045ddcc5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -32,6 +32,7 @@
 #include "amdgpu_dma_buf.h"
 #include 
 #include "amdgpu_xgmi.h"
+#include "kfd_smi_events.h"
 
 /* Userptr restore delay, just long enough to allow consecutive VM
  * changes to accumulate
@@ -2381,7 +2382,7 @@ int amdgpu_amdkfd_evict_userptr(struct kgd_mem *mem,
evicted_bos = atomic_inc_return(&process_info->evicted_bos);
if (evicted_bos == 1) {
/* First eviction, stop the queues */
-   r = kgd2kfd_quiesce_mm(mm);
+   r = kgd2kfd_quiesce_mm(mm, KFD_QUEUE_EVICTION_TRIGGER_USERPTR);
if (r)
pr_err("Failed to quiesce KFD\n");
schedule_delayed_work(&process_info->restore_userptr_work,
@@ -2655,13 +2656,16 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct 
work_struct *work)
 
 unlock_out:
mutex_unlock(&process_info->lock);
-   mmput(mm);
-   put_task_struct(usertask);
 
/* If validation failed, reschedule another attempt */
-   if (evicted_bos)
+   if (evicted_bos) {
schedule_delayed_work(&process_info->restore_userptr_work,
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
+
+   kfd_smi_event_queue_restore_rescheduled(mm);
+   }
+   mmput(mm);
+   put_task_struct(usertask);
 }
 
 /** amdgpu_amdkfd_gpuvm_restore_process_bos - Restore all BOs for the given
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index a0246b4bae6b..6abfe10229a2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2428,7 +2428,7 @@ static int criu_restore(struct file *filep,
 * Set the process to evicted state to avoid running any new queues 
before all the memory
 * mappings are ready.
 */
-   ret = kfd_process_evict_queues(p);
+   ret = kfd_process_evict_queues(p, KFD_QUEUE_EVICTION_CRIU_RESTORE);
if (ret)
goto exit_unlock;
 
@@ -2547,7 +2547,7 @@ static int criu_process_info(struct file *filep,
goto err_unlock;
}
 
-   ret = kfd_process_evict_queues(p);
+   ret = kfd_process_evict_queues(p, KFD_QUEUE_EVICTION_CRIU_CHECKPOINT);
if (ret)
goto err_unlock;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index c8fee0dbfdcb..6ec0e9f0927d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -837,7 +837,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void 
*ih_ring_entry)
spin_unlock_irqrestore(&kfd->interrupt_lock, flags);
 }
 
-int kgd2kfd_quiesce_mm(struct mm_struct *mm)
+int kgd2kfd_quiesce_mm(struct mm_struct *mm, uint32_t trigger)
 {
struct kfd_process *p;
int r;
@@ -851,7 +851,7 @@ int kgd2kfd_quiesce_mm(struct mm_struct *mm)
return -ESRCH;
 
WARN(debug_evictions, "Evicting pid %d", p->lead_thread->pid);
-   r = kfd_process_evict_queues(p);
+   r = kfd_process_evict_queues(p, trigger);
 
kfd_unref_process(p);
return r;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_

[PATCH v5 7/11] drm/amdkfd: Asynchronously free smi_client

2022-06-28 Thread Philip Yang
synchronize_rcu() may take several ms, which noticeably slows down
applications when they close the SMI event handle. Use call_rcu() to free
client->fifo and the client asynchronously, eliminating the
synchronize_rcu() call in the user thread.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e5896b7a16dd..0472b56de245 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -38,6 +38,7 @@ struct kfd_smi_client {
uint64_t events;
struct kfd_dev *dev;
spinlock_t lock;
+   struct rcu_head rcu;
pid_t pid;
bool suser;
 };
@@ -137,6 +138,14 @@ static ssize_t kfd_smi_ev_write(struct file *filep, const 
char __user *user,
return sizeof(events);
 }
 
+static void kfd_smi_ev_client_free(struct rcu_head *p)
+{
+   struct kfd_smi_client *ev = container_of(p, struct kfd_smi_client, rcu);
+
+   kfifo_free(&ev->fifo);
+   kfree(ev);
+}
+
 static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
 {
struct kfd_smi_client *client = filep->private_data;
@@ -146,10 +155,7 @@ static int kfd_smi_ev_release(struct inode *inode, struct 
file *filep)
list_del_rcu(&client->list);
spin_unlock(&dev->smi_lock);
 
-   synchronize_rcu();
-   kfifo_free(&client->fifo);
-   kfree(client);
-
+   call_rcu(&client->rcu, kfd_smi_ev_client_free);
return 0;
 }
 
-- 
2.35.1



[PATCH v5 1/11] drm/amdkfd: Add KFD SMI event IDs and triggers

2022-06-28 Thread Philip Yang
Define new system management interface event IDs for the migration, GPU
recoverable page fault, user queue eviction, restore, and unmap-from-GPU
events, along with the corresponding event triggers; these will be
implemented in the following patches.

Signed-off-by: Philip Yang 
---
 include/uapi/linux/kfd_ioctl.h | 37 ++
 1 file changed, 37 insertions(+)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index c648ed7c5ff1..f239e260796b 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -468,6 +468,43 @@ enum kfd_smi_event {
KFD_SMI_EVENT_THERMAL_THROTTLE = 2,
KFD_SMI_EVENT_GPU_PRE_RESET = 3,
KFD_SMI_EVENT_GPU_POST_RESET = 4,
+   KFD_SMI_EVENT_MIGRATE_START = 5,
+   KFD_SMI_EVENT_MIGRATE_END = 6,
+   KFD_SMI_EVENT_PAGE_FAULT_START = 7,
+   KFD_SMI_EVENT_PAGE_FAULT_END = 8,
+   KFD_SMI_EVENT_QUEUE_EVICTION = 9,
+   KFD_SMI_EVENT_QUEUE_RESTORE = 10,
+   KFD_SMI_EVENT_UNMAP_FROM_GPU = 11,
+
+   /*
+* max event number, as a flag bit to get events from all processes,
+* this requires super user permission, otherwise will not be able to
+* receive event from any process. Without this flag to receive events
+* from same process.
+*/
+   KFD_SMI_EVENT_ALL_PROCESS = 64
+};
+
+enum KFD_MIGRATE_TRIGGERS {
+   KFD_MIGRATE_TRIGGER_PREFETCH,
+   KFD_MIGRATE_TRIGGER_PAGEFAULT_GPU,
+   KFD_MIGRATE_TRIGGER_PAGEFAULT_CPU,
+   KFD_MIGRATE_TRIGGER_TTM_EVICTION
+};
+
+enum KFD_QUEUE_EVICTION_TRIGGERS {
+   KFD_QUEUE_EVICTION_TRIGGER_SVM,
+   KFD_QUEUE_EVICTION_TRIGGER_USERPTR,
+   KFD_QUEUE_EVICTION_TRIGGER_TTM,
+   KFD_QUEUE_EVICTION_TRIGGER_SUSPEND,
+   KFD_QUEUE_EVICTION_CRIU_CHECKPOINT,
+   KFD_QUEUE_EVICTION_CRIU_RESTORE
+};
+
+enum KFD_SVM_UNMAP_TRIGGERS {
+   KFD_SVM_UNMAP_TRIGGER_MMU_NOTIFY,
+   KFD_SVM_UNMAP_TRIGGER_MMU_NOTIFY_MIGRATE,
+   KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU
 };
 
 #define KFD_SMI_EVENT_MASK_FROM_INDEX(i) (1ULL << ((i) - 1))
-- 
2.35.1
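
As a small illustration of the bit layout defined above (not part of the
patch): a consumer that wanted only the two migration events, from every
process on the node, would build its enable mask as below. Per the
comment in the header, the ALL_PROCESS bit needs superuser permission.

#include <linux/types.h>
#include <linux/kfd_ioctl.h>

static inline __u64 kfd_smi_migrate_all_mask(void)
{
	/* Bits 4 and 5 for MIGRATE_START/END, bit 63 for ALL_PROCESS. */
	return KFD_SMI_EVENT_MASK_FROM_INDEX(KFD_SMI_EVENT_MIGRATE_START) |
	       KFD_SMI_EVENT_MASK_FROM_INDEX(KFD_SMI_EVENT_MIGRATE_END) |
	       KFD_SMI_EVENT_MASK_FROM_INDEX(KFD_SMI_EVENT_ALL_PROCESS);
}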



[PATCH v5 2/11] drm/amdkfd: Enable per process SMI event

2022-06-28 Thread Philip Yang
A process receives events from the same process by default. Add a flag to
be able to receive events from all processes; this requires superuser
permission.

Events sent with pid 0 go to all processes, to keep the default
behavior of the existing SMI events.

Signed-off-by: Philip Yang 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 37 +++--
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index f2e1d506ba21..55ed026435e2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -38,6 +38,8 @@ struct kfd_smi_client {
uint64_t events;
struct kfd_dev *dev;
spinlock_t lock;
+   pid_t pid;
+   bool suser;
 };
 
 #define MAX_KFIFO_SIZE 1024
@@ -151,16 +153,27 @@ static int kfd_smi_ev_release(struct inode *inode, struct 
file *filep)
return 0;
 }
 
-static void add_event_to_kfifo(struct kfd_dev *dev, unsigned int smi_event,
- char *event_msg, int len)
+static bool kfd_smi_ev_enabled(pid_t pid, struct kfd_smi_client *client,
+  unsigned int event)
+{
+   uint64_t all = KFD_SMI_EVENT_MASK_FROM_INDEX(KFD_SMI_EVENT_ALL_PROCESS);
+   uint64_t events = READ_ONCE(client->events);
+
+   if (pid && client->pid != pid && !(client->suser && (events & all)))
+   return false;
+
+   return events & KFD_SMI_EVENT_MASK_FROM_INDEX(event);
+}
+
+static void add_event_to_kfifo(pid_t pid, struct kfd_dev *dev,
+  unsigned int smi_event, char *event_msg, int len)
 {
struct kfd_smi_client *client;
 
rcu_read_lock();
 
list_for_each_entry_rcu(client, &dev->smi_clients, list) {
-   if (!(READ_ONCE(client->events) &
-   KFD_SMI_EVENT_MASK_FROM_INDEX(smi_event)))
+   if (!kfd_smi_ev_enabled(pid, client, smi_event))
continue;
spin_lock(&client->lock);
if (kfifo_avail(&client->fifo) >= len) {
@@ -176,9 +189,9 @@ static void add_event_to_kfifo(struct kfd_dev *dev, 
unsigned int smi_event,
rcu_read_unlock();
 }
 
-__printf(3, 4)
-static void kfd_smi_event_add(struct kfd_dev *dev, unsigned int event,
- char *fmt, ...)
+__printf(4, 5)
+static void kfd_smi_event_add(pid_t pid, struct kfd_dev *dev,
+ unsigned int event, char *fmt, ...)
 {
char fifo_in[KFD_SMI_EVENT_MSG_SIZE];
int len;
@@ -193,7 +206,7 @@ static void kfd_smi_event_add(struct kfd_dev *dev, unsigned 
int event,
len += vsnprintf(fifo_in + len, sizeof(fifo_in) - len, fmt, args);
va_end(args);
 
-   add_event_to_kfifo(dev, event, fifo_in, len);
+   add_event_to_kfifo(pid, dev, event, fifo_in, len);
 }
 
 void kfd_smi_event_update_gpu_reset(struct kfd_dev *dev, bool post_reset)
@@ -206,13 +219,13 @@ void kfd_smi_event_update_gpu_reset(struct kfd_dev *dev, 
bool post_reset)
event = KFD_SMI_EVENT_GPU_PRE_RESET;
++(dev->reset_seq_num);
}
-   kfd_smi_event_add(dev, event, "%x\n", dev->reset_seq_num);
+   kfd_smi_event_add(0, dev, event, "%x\n", dev->reset_seq_num);
 }
 
 void kfd_smi_event_update_thermal_throttling(struct kfd_dev *dev,
 uint64_t throttle_bitmask)
 {
-   kfd_smi_event_add(dev, KFD_SMI_EVENT_THERMAL_THROTTLE, "%llx:%llx\n",
+   kfd_smi_event_add(0, dev, KFD_SMI_EVENT_THERMAL_THROTTLE, "%llx:%llx\n",
  throttle_bitmask,
  amdgpu_dpm_get_thermal_throttling_counter(dev->adev));
 }
@@ -227,7 +240,7 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, 
uint16_t pasid)
if (!task_info.pid)
return;
 
-   kfd_smi_event_add(dev, KFD_SMI_EVENT_VMFAULT, "%x:%s\n",
+   kfd_smi_event_add(0, dev, KFD_SMI_EVENT_VMFAULT, "%x:%s\n",
  task_info.pid, task_info.task_name);
 }
 
@@ -251,6 +264,8 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
spin_lock_init(&client->lock);
client->events = 0;
client->dev = dev;
+   client->pid = current->tgid;
+   client->suser = capable(CAP_SYS_ADMIN);
 
spin_lock(&dev->smi_lock);
list_add_rcu(&client->list, &dev->smi_clients);
-- 
2.35.1



[PATCH v5 0/11] HMM profiler interface

2022-06-28 Thread Philip Yang
This implements KFD profiling APIs to expose HMM migration and
recoverable page fault profiling data. The ROCm profiler will be
shared-linked with the application to collect and expose the profiling
data, so application developers can tune their applications based on how
the address range attributes affect behavior and performance. Kernel perf
and ftrace require superuser permission to collect data, so they are not
suitable for the ROCm profiler.

The profiling data consists of per-process, per-device events delivered
through the existing SMI (system management interface) event API. Each
event log is one line of text with the event-specific information, as
illustrated by the sketch below.
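
For instance, the recoverable page-fault pair added in patch 3/11 uses
the payload format "%lld -%d @%lx(%x) %c". The toy program below is not
part of the series and all values in it are made up; it only prints two
such payloads so the line shape is visible (the kernel side may also
prefix each record with the event id, which is not reproduced here).

#include <stdio.h>

int main(void)
{
	long long ts = 278153619793LL;	/* ktime_get_boottime_ns() */
	int pid = 4321;			/* faulting process */
	unsigned long addr = 0x7ffe0;	/* faulting address */
	unsigned int gpu_id = 0x3b42;	/* KFD GPU hash id */

	/* start record: 'W' = write fault, 'R' = read fault */
	printf("%lld -%d @%lx(%x) %c\n", ts, pid, addr, gpu_id, 'W');
	/* end record: 'M' = resolved by migration, 'U' = page-table update only */
	printf("%lld -%d @%lx(%x) %c\n", ts + 1000, pid, addr, gpu_id, 'M');
	return 0;
}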

For user-space usage examples:
the Thunk libhsakmt patches 9/11 and 10/11 are based on
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface

the ROCr Basic-SVM-profiler patch 11/11 is based on
https://github.com/RadeonOpenCompute/ROCR-Runtime

v5:
 * Fix multi-thead profiling support
 * Added user space usage example Thunk and ROCr patch

v4:
 * Add event helper function
 * Rebase to 5.16 kernel

v3:
 * Changes from Felix's review

v2:
 * Keep existing events behaviour
 * Use ktime_get_boottime_ns() as timestamp to correlate with other APIs
 * Use compact message layout, stick with existing message convention
 * Add unmap from GPU event

Philip Yang (8):
  drm/amdkfd: Add KFD SMI event IDs and triggers
  drm/amdkfd: Enable per process SMI event
  drm/amdkfd: Add GPU recoverable fault SMI event
  drm/amdkfd: Add migration SMI event
  drm/amdkfd: Add user queue eviction restore SMI event
  drm/amdkfd: Add unmap from GPU SMI event
  drm/amdkfd: Asynchronously free smi_client
  drm/amdkfd: Bump KFD API version for SMI profiling event

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  12 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c  |  53 +--
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h  |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  15 +-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   | 134 --
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   |  21 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  64 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h  |   2 +-
 include/uapi/linux/kfd_ioctl.h|  40 +-
 13 files changed, 293 insertions(+), 65 deletions(-)



Re: [PATCH] drm/amdgpu: Fix typos in amdgpu_stop_pending_resets

2022-06-28 Thread Alex Deucher
On Tue, Jun 28, 2022 at 10:42 AM Kent Russell  wrote:
>
> Change amdggpu to amdgpu and pedning to pending
>
> Signed-off-by: Kent Russell 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a2c268d48edd..39a875494edb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5075,7 +5075,7 @@ static void amdgpu_device_recheck_guilty_jobs(
> }
>  }
>
> -static inline void amdggpu_device_stop_pedning_resets(struct amdgpu_device 
> *adev)
> +static inline void amdgpu_device_stop_pending_resets(struct amdgpu_device 
> *adev)
>  {
> struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
>
> @@ -5256,7 +5256,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
>  * Drop all pending non scheduler resets. Scheduler resets
>  * were already dropped during drm_sched_stop
>  */
> -   amdggpu_device_stop_pedning_resets(tmp_adev);
> +   amdgpu_device_stop_pending_resets(tmp_adev);
> }
>
> tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter));
> --
> 2.25.1
>


[PATCH] drm/amdgpu: Fix typos in amdgpu_stop_pending_resets

2022-06-28 Thread Kent Russell
Change amdggpu to amdgpu and pedning to pending

Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a2c268d48edd..39a875494edb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5075,7 +5075,7 @@ static void amdgpu_device_recheck_guilty_jobs(
}
 }
 
-static inline void amdggpu_device_stop_pedning_resets(struct amdgpu_device 
*adev)
+static inline void amdgpu_device_stop_pending_resets(struct amdgpu_device 
*adev)
 {
struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
 
@@ -5256,7 +5256,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 * Drop all pending non scheduler resets. Scheduler resets
 * were already dropped during drm_sched_stop
 */
-   amdggpu_device_stop_pedning_resets(tmp_adev);
+   amdgpu_device_stop_pending_resets(tmp_adev);
}
 
tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter));
-- 
2.25.1



Re: [PATCH 11/22] drm: amd: amd_shared.h: Add missing doc for PP_GFX_DCS_MASK

2022-06-28 Thread Alex Deucher
Applied.  Thanks!

On Tue, Jun 28, 2022 at 5:46 AM Mauro Carvalho Chehab
 wrote:
>
> This symbol is missing documentation:
>
> drivers/gpu/drm/amd/include/amd_shared.h:224: warning: Enum value 
> 'PP_GFX_DCS_MASK' not described in enum 'PP_FEATURE_MASK'
>
> Document it.
>
> Fixes: 680602d6c2d6 ("drm/amd/pm: enable DCS")
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>
> To avoid mailbombing on a large number of people, only mailing lists were C/C 
> on the cover.
> See [PATCH 00/22] at: 
> https://lore.kernel.org/all/cover.1656409369.git.mche...@kernel.org/
>
>  drivers/gpu/drm/amd/include/amd_shared.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
> b/drivers/gpu/drm/amd/include/amd_shared.h
> index bcdf7453a403..2e02a6fc1717 100644
> --- a/drivers/gpu/drm/amd/include/amd_shared.h
> +++ b/drivers/gpu/drm/amd/include/amd_shared.h
> @@ -193,6 +193,7 @@ enum amd_powergating_state {
>   * @PP_ACG_MASK: Adaptive clock generator.
>   * @PP_STUTTER_MODE: Stutter mode.
>   * @PP_AVFS_MASK: Adaptive voltage and frequency scaling.
> + * @PP_GFX_DCS_MASK: GFX Async DCS.
>   *
>   * To override these settings on boot, append amdgpu.ppfeaturemask= to
>   * the kernel's command line parameters. This is usually done through a 
> system's
> --
> 2.36.1
>


Re: [PATCH 10/22] drm: amdgpu: amdgpu_device.c: fix a kernel-doc markup

2022-06-28 Thread Alex Deucher
On Tue, Jun 28, 2022 at 5:46 AM Mauro Carvalho Chehab
 wrote:
>
> The function was renamed without also renaming the kernel-doc markup:
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5095: warning: expecting 
> prototype for amdgpu_device_gpu_recover_imp(). Prototype was for 
> amdgpu_device_gpu_recover() instead
>
> Signed-off-by: Mauro Carvalho Chehab 

I actually sent out the same patch a few days ago, however, the code
has since changed with Andrey's recent GPU reset series and the patch
is no longer applicable.

Thanks,

Alex


> ---
>
> To avoid mailbombing on a large number of people, only mailing lists were C/C 
> on the cover.
> See [PATCH 00/22] at: 
> https://lore.kernel.org/all/cover.1656409369.git.mche...@kernel.org/
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 9d6418bb963e..6d74767591e7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5079,7 +5079,7 @@ static inline void 
> amdggpu_device_stop_pedning_resets(struct amdgpu_device *adev
>
>
>  /**
> - * amdgpu_device_gpu_recover_imp - reset the asic and recover scheduler
> + * amdgpu_device_gpu_recover - reset the asic and recover scheduler
>   *
>   * @adev: amdgpu_device pointer
>   * @job: which job trigger hang
> --
> 2.36.1
>


Re: [PATCH 09/22] drm: amdgpu: amdgpu_dm: fix kernel-doc markups

2022-06-28 Thread Alex Deucher
Applied.  Thanks!

On Tue, Jun 28, 2022 at 5:46 AM Mauro Carvalho Chehab
 wrote:
>
> There are 4 undocumented fields at struct amdgpu_display_manager.
>
> Add documentation for them, fixing those warnings:
>
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'dmub_outbox_params' not described in 
> 'amdgpu_display_manager'
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'num_of_edps' not described in 
> 'amdgpu_display_manager'
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'disable_hpd_irq' not described in 
> 'amdgpu_display_manager'
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'dmub_aux_transfer_done' not described in 
> 'amdgpu_display_manager'
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'delayed_hpd_wq' not described in 
> 'amdgpu_display_manager'
>
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>
> To avoid mailbombing on a large number of people, only mailing lists were C/C 
> on the cover.
> See [PATCH 00/22] at: 
> https://lore.kernel.org/all/cover.1656409369.git.mche...@kernel.org/
>
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index 547fc1547977..73755b304299 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -242,6 +242,13 @@ struct hpd_rx_irq_offload_work {
>   * @force_timing_sync: set via debugfs. When set, indicates that all 
> connected
>   *displays will be forced to synchronize.
>   * @dmcub_trace_event_en: enable dmcub trace events
> + * @dmub_outbox_params: DMUB Outbox parameters
> + * @num_of_edps: number of backlight eDPs
> + * @disable_hpd_irq: disables all HPD and HPD RX interrupt handling in the
> + *  driver when true
> + * @dmub_aux_transfer_done: struct completion used to indicate when DMUB
> + * transfers are done
> + * @delayed_hpd_wq: work queue used to delay DMUB HPD work
>   */
>  struct amdgpu_display_manager {
>
> --
> 2.36.1
>


Re: [PATCH] drm/amdgpu/display: reduce stack size in dml32_ModeSupportAndSystemConfigurationFull()

2022-06-28 Thread Alex Deucher
Ping?

Alex

On Wed, Jun 22, 2022 at 10:48 AM Alex Deucher  wrote:
>
> Move more stack variables into the dummy vars structure on the heap.
>
> Fixes stack frame size errors:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c: In 
> function 'dml32_ModeSupportAndSystemConfigurationFull':
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:3833:1:
>  error: the frame size of 2720 bytes is larger than 2048 bytes 
> [-Werror=frame-larger-than=]
>  3833 | } // ModeSupportAndSystemConfigurationFull
>   | ^
>
> Fixes: dda4fb85e433 ("drm/amd/display: DML changes for DCN32/321")
> Cc: Stephen Rothwell 
> Cc: Aurabindo Pillai 
> Cc: Rodrigo Siqueira Jordao 
> Signed-off-by: Alex Deucher 
> ---
>  .../dc/dml/dcn32/display_mode_vba_32.c| 77 ---
>  .../drm/amd/display/dc/dml/display_mode_vba.h |  3 +-
>  2 files changed, 36 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
> b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
> index 510b7a81ee12..7f144adb1e36 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
> @@ -1660,8 +1660,7 @@ static void 
> DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
>
>  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
> *mode_lib)
>  {
> -   bool dummy_boolean[2];
> -   unsigned int dummy_integer[1];
> +   unsigned int dummy_integer[4];
> bool MPCCombineMethodAsNeededForPStateChangeAndVoltage;
> bool MPCCombineMethodAsPossible;
> enum odm_combine_mode dummy_odm_mode[DC__NUM_DPP__MAX];
> @@ -1973,10 +1972,10 @@ void 
> dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
> 
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[5],
>  /* LongDETBufferSizeInKByte[]  */
> 
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[6],
>  /* LongDETBufferSizeY[]  */
> 
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[7],
>  /* LongDETBufferSizeC[]  */
> -   &dummy_boolean[0], /* bool   
> *UnboundedRequestEnabled  */
> -   &dummy_integer[0], /* Long   
> *CompressedBufferSizeInkByte  */
> +   
> &v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[0][0],
>  /* bool   *UnboundedRequestEnabled  */
> +   
> &v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer_array[0][0],
>  /* Long   *CompressedBufferSizeInkByte  */
> 
> mode_lib->vba.SingleDPPViewportSizeSupportPerSurface,/* bool 
> ViewportSizeSupportPerSurface[] */
> -   &dummy_boolean[1]); /* bool   
> *ViewportSizeSupport */
> +   
> &v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[1][0]);
>  /* bool   *ViewportSizeSupport */
>
> MPCCombineMethodAsNeededForPStateChangeAndVoltage = false;
> MPCCombineMethodAsPossible = false;
> @@ -2506,7 +2505,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
> display_mode_lib *mode_l
> //
> for (i = 0; i < (int) v->soc.num_states; ++i) {
> for (j = 0; j <= 1; ++j) {
> -   bool dummy_boolean_array[1][DC__NUM_DPP__MAX];
> for (k = 0; k < mode_lib->vba.NumberOfActiveSurfaces; 
> ++k) {
> mode_lib->vba.RequiredDPPCLKThisState[k] = 
> mode_lib->vba.RequiredDPPCLK[i][j][k];
> mode_lib->vba.NoOfDPPThisState[k] = 
> mode_lib->vba.NoOfDPP[i][j][k];
> @@ -2570,7 +2568,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
> display_mode_lib *mode_l
> mode_lib->vba.DETBufferSizeCThisState,
> 
> &mode_lib->vba.UnboundedRequestEnabledThisState,
> 
> &mode_lib->vba.CompressedBufferSizeInkByteThisState,
> -   dummy_boolean_array[0],
> +   
> v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_boolean_array[0],
> 
> &mode_lib->vba.ViewportSizeSupport[i][j]);
>
> for (k = 0; k < mode_lib->vba.NumberOfActiveSurfaces; 
> ++k) {
> @@ -2708,9 +2706,6 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
> display_mode_lib *mode_l
> }
>
> {
> -   bool dummy_boolean_array[2][DC__NUM_DPP__MAX];
> -
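
For reference, the pattern behind the fix above, as a stand-alone and purely
illustrative sketch (the names here are made up and are not from the DML code):
large scratch arrays move off the stack into a heap-allocated "dummy vars"
structure that is passed in by pointer, so the function's own frame stays under
the -Werror=frame-larger-than= limit.

    #include <stdbool.h>
    #include <stdlib.h>

    /* Hypothetical stand-in for the DML dummy-vars structure. */
    struct dummy_vars {
            bool dummy_boolean_array[2][128];
            unsigned int dummy_integer_array[8][128];
    };

    /* All scratch space lives in *dv (heap), not in this stack frame. */
    static unsigned int mode_support_calc(struct dummy_vars *dv)
    {
            dv->dummy_boolean_array[0][0] = true;
            dv->dummy_integer_array[0][0] = 42;
            return dv->dummy_integer_array[0][0];
    }

    int main(void)
    {
            struct dummy_vars *dv = calloc(1, sizeof(*dv));

            if (!dv)
                    return 1;
            mode_support_calc(dv);
            free(dv);
            return 0;
    }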

RE: [PATCH 2/2] drm/amdgpu: fix documentation warning

2022-06-28 Thread Russell, Kent

Not sure why no one responded, but this is something even I can RB.

Reviewed-by: Kent Russell 



> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Monday, June 27, 2022 5:41 PM
> To: Deucher, Alexander 
> Cc: Stephen Rothwell ; amd-gfx list  g...@lists.freedesktop.org>
> Subject: Re: [PATCH 2/2] drm/amdgpu: fix documentation warning
> 
> Ping?
> 
> On Thu, Jun 23, 2022 at 12:41 PM Alex Deucher 
> wrote:
> >
> > Fixes this issue:
> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5094: warning: expecting
> prototype for amdgpu_device_gpu_recover_imp(). Prototype was for
> amdgpu_device_gpu_recover() instead
> >
> > Fixes: cf727044144d ("drm/amdgpu: Rename
> amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover")
> > Reported-by: Stephen Rothwell 
> > Signed-off-by: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index f2a4c268ac72..6c0fbc662b3a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -5079,7 +5079,7 @@ static inline void
> amdggpu_device_stop_pedning_resets(struct amdgpu_device *adev
> >
> >
> >  /**
> > - * amdgpu_device_gpu_recover_imp - reset the asic and recover scheduler
> > + * amdgpu_device_gpu_recover - reset the asic and recover scheduler
> >   *
> >   * @adev: amdgpu_device pointer
> >   * @job: which job trigger hang
> > --
> > 2.35.3
> >


Re: (subset) [PATCH 00/22] Fix kernel-doc warnings at linux-next

2022-06-28 Thread Mark Brown
On Tue, 28 Jun 2022 10:46:04 +0100, Mauro Carvalho Chehab wrote:
> As we're currently discussing making kernel-doc issues fatal when
> CONFIG_WERROR is enabled, let's fix all 60 kernel-doc warnings
> inside linux-next:
> 
>   arch/x86/include/uapi/asm/sgx.h:19: warning: Enum value 
> 'SGX_PAGE_MEASURE' not described in enum 'sgx_page_flags'
>   arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
> member 'rdi' not described in 'sgx_enclave_user_handler_t'
>   arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
> member 'rsi' not described in 'sgx_enclave_user_handler_t'
>   arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
> member 'rdx' not described in 'sgx_enclave_user_handler_t'
>   arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
> member 'rsp' not described in 'sgx_enclave_user_handler_t'
>   arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
> member 'r8' not described in 'sgx_enclave_user_handler_t'
>   arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
> member 'r9' not described in 'sgx_enclave_user_handler_t'
>   arch/x86/include/uapi/asm/sgx.h:124: warning: Function parameter or 
> member 'reserved' not described in 'sgx_enclave_run'
>   drivers/devfreq/devfreq.c:707: warning: Function parameter or member 
> 'val' not described in 'qos_min_notifier_call'
>   drivers/devfreq/devfreq.c:707: warning: Function parameter or member 
> 'ptr' not described in 'qos_min_notifier_call'
>   drivers/devfreq/devfreq.c:717: warning: Function parameter or member 
> 'val' not described in 'qos_max_notifier_call'
>   drivers/devfreq/devfreq.c:717: warning: Function parameter or member 
> 'ptr' not described in 'qos_max_notifier_call'
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5095: warning: expecting 
> prototype for amdgpu_device_gpu_recover_imp(). Prototype was for 
> amdgpu_device_gpu_recover() instead
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'dmub_outbox_params' not described in 
> 'amdgpu_display_manager'
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'num_of_edps' not described in 
> 'amdgpu_display_manager'
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'disable_hpd_irq' not described in 
> 'amdgpu_display_manager'
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'dmub_aux_transfer_done' not described in 
> 'amdgpu_display_manager'
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
> Function parameter or member 'delayed_hpd_wq' not described in 
> 'amdgpu_display_manager'
>   drivers/gpu/drm/amd/include/amd_shared.h:224: warning: Enum value 
> 'PP_GFX_DCS_MASK' not described in enum 'PP_FEATURE_MASK'
>   drivers/gpu/drm/scheduler/sched_main.c:999: warning: Function parameter 
> or member 'dev' not described in 'drm_sched_init'
>   drivers/usb/dwc3/core.h:1328: warning: Function parameter or member 
> 'async_callbacks' not described in 'dwc3'
>   drivers/usb/dwc3/gadget.c:675: warning: Function parameter or member 
> 'mult' not described in 'dwc3_gadget_calc_tx_fifo_size'
>   fs/attr.c:36: warning: Function parameter or member 'ia_vfsuid' not 
> described in 'chown_ok'
>   fs/attr.c:36: warning: Excess function parameter 'uid' description in 
> 'chown_ok'
>   fs/attr.c:63: warning: Function parameter or member 'ia_vfsgid' not 
> described in 'chgrp_ok'
>   fs/attr.c:63: warning: Excess function parameter 'gid' description in 
> 'chgrp_ok'
>   fs/namei.c:649: warning: Function parameter or member 'mnt' not 
> described in 'path_connected'
>   fs/namei.c:649: warning: Function parameter or member 'dentry' not 
> described in 'path_connected'
>   fs/namei.c:1089: warning: Function parameter or member 'inode' not 
> described in 'may_follow_link'
>   include/drm/gpu_scheduler.h:463: warning: Function parameter or member 
> 'dev' not described in 'drm_gpu_scheduler'
>   include/linux/dcache.h:309: warning: expecting prototype for dget, 
> dget_dlock(). Prototype was for dget_dlock() instead
>   include/linux/fscache.h:270: warning: Function parameter or member 
> 'cookie' not described in 'fscache_use_cookie'
>   include/linux/fscache.h:270: warning: Excess function parameter 
> 'object' description in 'fscache_use_cookie'
>   include/linux/fscache.h:287: warning: Function parameter or member 
> 'cookie' not described in 'fscache_unuse_cookie'
>   include/linux/fscache.h:287: warning: Excess function parameter 
> 'object' description in 'fscache_unuse_cookie'
>   include/linux/genalloc.h:54: warning: Function parameter or member 
> 'start_addr' not described in 'genpool_algo_t'
>   include/linux/kfence.h:221: warning: Functio

Re: [PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-06-28 Thread Robin Murphy

On 2022-05-27 00:50, Dmitry Osipenko wrote:

Hello,

This patchset introduces memory shrinker for the VirtIO-GPU DRM driver
and adds memory purging and eviction support to VirtIO-GPU driver.

The new dma-buf locking convention is introduced here as well.

During OOM, the shrinker will release BOs that are marked as "not needed"
by userspace using the new madvise IOCTL; it will also evict idle BOs
to swap. The userspace in this case is the Mesa VirGL driver, which will mark
the cached BOs as "not needed", allowing the kernel driver to release memory
of the cached shmem BOs in lowmem situations, preventing OOM kills.

The Panfrost driver is switched to use generic memory shrinker.


I think we still have some outstanding issues here - Alyssa reported 
some weirdness yesterday, so I just tried provoking a low-memory 
condition locally with this series applied and a few debug options 
enabled, and the results as below were... interesting.


Thanks,
Robin.

->8-
[   68.295951] ==
[   68.295956] WARNING: possible circular locking dependency detected
[   68.295963] 5.19.0-rc3+ #400 Not tainted
[   68.295972] --
[   68.295977] cc1/295 is trying to acquire lock:
[   68.295986] 08d7f1a0 
(reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_free+0x7c/0x198

[   68.296036]
[   68.296036] but task is already holding lock:
[   68.296041] 8c14b820 (fs_reclaim){+.+.}-{0:0}, at: 
__alloc_pages_slowpath.constprop.0+0x4d8/0x1470

[   68.296080]
[   68.296080] which lock already depends on the new lock.
[   68.296080]
[   68.296085]
[   68.296085] the existing dependency chain (in reverse order) is:
[   68.296090]
[   68.296090] -> #1 (fs_reclaim){+.+.}-{0:0}:
[   68.296111]fs_reclaim_acquire+0xb8/0x150
[   68.296130]dma_resv_lockdep+0x298/0x3fc
[   68.296148]do_one_initcall+0xe4/0x5f8
[   68.296163]kernel_init_freeable+0x414/0x49c
[   68.296180]kernel_init+0x2c/0x148
[   68.296195]ret_from_fork+0x10/0x20
[   68.296207]
[   68.296207] -> #0 (reservation_ww_class_mutex){+.+.}-{3:3}:
[   68.296229]__lock_acquire+0x1724/0x2398
[   68.296246]lock_acquire+0x218/0x5b0
[   68.296260]__ww_mutex_lock.constprop.0+0x158/0x2378
[   68.296277]ww_mutex_lock+0x7c/0x4d8
[   68.296291]drm_gem_shmem_free+0x7c/0x198
[   68.296304]panfrost_gem_free_object+0x118/0x138
[   68.296318]drm_gem_object_free+0x40/0x68
[   68.296334]drm_gem_shmem_shrinker_run_objects_scan+0x42c/0x5b8
[   68.296352]drm_gem_shmem_shrinker_scan_objects+0xa4/0x170
[   68.296368]do_shrink_slab+0x220/0x808
[   68.296381]shrink_slab+0x11c/0x408
[   68.296392]shrink_node+0x6ac/0xb90
[   68.296403]do_try_to_free_pages+0x1dc/0x8d0
[   68.296416]try_to_free_pages+0x1ec/0x5b0
[   68.296429]__alloc_pages_slowpath.constprop.0+0x528/0x1470
[   68.296444]__alloc_pages+0x4e0/0x5b8
[   68.296455]__folio_alloc+0x24/0x60
[   68.296467]vma_alloc_folio+0xb8/0x2f8
[   68.296483]alloc_zeroed_user_highpage_movable+0x58/0x68
[   68.296498]__handle_mm_fault+0x918/0x12a8
[   68.296513]handle_mm_fault+0x130/0x300
[   68.296527]do_page_fault+0x1d0/0x568
[   68.296539]do_translation_fault+0xa0/0xb8
[   68.296551]do_mem_abort+0x68/0xf8
[   68.296562]el0_da+0x74/0x100
[   68.296572]el0t_64_sync_handler+0x68/0xc0
[   68.296585]el0t_64_sync+0x18c/0x190
[   68.296596]
[   68.296596] other info that might help us debug this:
[   68.296596]
[   68.296601]  Possible unsafe locking scenario:
[   68.296601]
[   68.296604]CPU0CPU1
[   68.296608]
[   68.296612]   lock(fs_reclaim);
[   68.296622] 
lock(reservation_ww_class_mutex);

[   68.296633]lock(fs_reclaim);
[   68.296644]   lock(reservation_ww_class_mutex);
[   68.296654]
[   68.296654]  *** DEADLOCK ***
[   68.296654]
[   68.296658] 3 locks held by cc1/295:
[   68.29]  #0: 0616e898 (&mm->mmap_lock){}-{3:3}, at: 
do_page_fault+0x144/0x568
[   68.296702]  #1: 8c14b820 (fs_reclaim){+.+.}-{0:0}, at: 
__alloc_pages_slowpath.constprop.0+0x4d8/0x1470
[   68.296740]  #2: 8c1215b0 (shrinker_rwsem){}-{3:3}, at: 
shrink_slab+0xc0/0x408

[   68.296774]
[   68.296774] stack backtrace:
[   68.296780] CPU: 2 PID: 295 Comm: cc1 Not tainted 5.19.0-rc3+ #400
[   68.296794] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Sep  3 2019

[   68.296803] Call trace:
[   68.296808]  dump_backtrace+0x1e4/0x1f0
[   68.296821]  show_stack+0x20/0x70
[   68.296832]  dump_stack_lvl+0x8c/0xb8
[   68.296849]  dump_stack+0x1c/0x38
[   68.296864]  print_circular_bug.isra.0+0x284/0x378
[   68.296881]  check_noncircular+0x1d8/0x1

Re: [PATCH v6 02/14] mm: handling Non-LRU pages returned by vm_normal_pages

2022-06-28 Thread David Hildenbrand
On 28.06.22 02:14, Alex Sierra wrote:
> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
> device-managed anonymous pages that are not LRU pages. Although they
> behave like normal pages for purposes of mapping in CPU page tables and for
> COW, they do not support LRU lists, NUMA migration or THP.
> 
> We also introduced a FOLL_LRU flag that adds the same behaviour to
> follow_page and related APIs, to allow callers to specify that they
> expect to put pages on an LRU list.
> 
> Signed-off-by: Alex Sierra 
> Acked-by: Felix Kuehling 
> Reviewed-by: Alistair Popple 
> ---

I think my review feedback regarding FOLL_LRU has been ignored.


-- 
Thanks,

David / dhildenb



Re: [PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-06-28 Thread Dmitry Osipenko
On 6/28/22 15:31, Robin Murphy wrote:
> [  100.511411]
> ==
> [  100.511419] BUG: KASAN: use-after-free in irq_work_single+0xa4/0x110
> [  100.511445] Write of size 4 at addr 107f5830 by task
> glmark2-es2-drm/280
> [  100.511458]
> [  100.511464] CPU: 1 PID: 280 Comm: glmark2-es2-drm Not tainted
> 5.19.0-rc3+ #400
> [  100.511479] Hardware name: ARM LTD ARM Juno Development Platform/ARM
> Juno Development Platform, BIOS EDK II Sep  3 2019
> [  100.511489] Call trace:
> [  100.511494]  dump_backtrace+0x1e4/0x1f0
> [  100.511512]  show_stack+0x20/0x70
> [  100.511523]  dump_stack_lvl+0x8c/0xb8
> [  100.511543]  print_report+0x16c/0x668
> [  100.511559]  kasan_report+0x80/0x208
> [  100.511574]  kasan_check_range+0x100/0x1b8
> [  100.511590]  __kasan_check_write+0x34/0x60
> [  100.511607]  irq_work_single+0xa4/0x110
> [  100.511619]  irq_work_run_list+0x6c/0x88
> [  100.511632]  irq_work_run+0x28/0x48
> [  100.511644]  ipi_handler+0x254/0x468
> [  100.511664]  handle_percpu_devid_irq+0x11c/0x518
> [  100.511681]  generic_handle_domain_irq+0x50/0x70
> [  100.511699]  gic_handle_irq+0xd4/0x118
> [  100.511711]  call_on_irq_stack+0x2c/0x58
> [  100.511725]  do_interrupt_handler+0xc0/0xc8
> [  100.511741]  el1_interrupt+0x40/0x68
> [  100.511754]  el1h_64_irq_handler+0x18/0x28
> [  100.511767]  el1h_64_irq+0x64/0x68
> [  100.511778]  irq_work_queue+0xc0/0xd8
> [  100.511790]  drm_sched_entity_fini+0x2c4/0x3b0
> [  100.511805]  drm_sched_entity_destroy+0x2c/0x40
> [  100.511818]  panfrost_job_close+0x44/0x1c0
> [  100.511833]  panfrost_postclose+0x38/0x60
> [  100.511845]  drm_file_free.part.0+0x33c/0x4b8
> [  100.511862]  drm_close_helper.isra.0+0xc0/0xd8
> [  100.511877]  drm_release+0xe4/0x1e0
> [  100.511891]  __fput+0xf8/0x390
> [  100.511904]  fput+0x18/0x28
> [  100.511917]  task_work_run+0xc4/0x1e0
> [  100.511929]  do_exit+0x554/0x1168
> [  100.511945]  do_group_exit+0x60/0x108
> [  100.511960]  __arm64_sys_exit_group+0x34/0x38
> [  100.511977]  invoke_syscall+0x64/0x180
> [  100.511993]  el0_svc_common.constprop.0+0x13c/0x170
> [  100.512012]  do_el0_svc+0x48/0xe8
> [  100.512028]  el0_svc+0x5c/0xe0
> [  100.512038]  el0t_64_sync_handler+0xb8/0xc0
> [  100.512051]  el0t_64_sync+0x18c/0x190
> [  100.512064]

This one should be fixed by [1], which is not in the RC kernel yet; please
use linux-next.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20220628&id=7d64c40a7d96190d9d06e240305389e025295916

-- 
Best regards,
Dmitry


Re: [PATCH v6 06/14] mm: add device coherent checker to is_pinnable_page

2022-06-28 Thread David Hildenbrand
On 28.06.22 02:14, Alex Sierra wrote:
> A device-coherent checker was added to is_pinnable_page(), which was also
> renamed to is_longterm_pinnable_page(). The reason is that device-coherent
> pages are not supported for long-term pinning.
> 
> Signed-off-by: Alex Sierra 
> ---
>  include/linux/memremap.h | 25 +
>  include/linux/mm.h   | 24 
>  mm/gup.c |  5 ++---
>  mm/gup_test.c|  4 ++--
>  mm/hugetlb.c |  2 +-
>  5 files changed, 30 insertions(+), 30 deletions(-)


Rename of the function should be a separate cleanup patch before any
other changes, and the remaining change should be squashed into patch
#1, to logically make sense, because it still states "no one should be
allowed to pin such memory so that it can always be evicted."

Or am I missing something?

-- 
Thanks,

David / dhildenb



Re: [PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

2022-06-28 Thread Dmitry Osipenko
On 6/28/22 15:31, Robin Murphy wrote:
> ->8-
> [   68.295951] ==
> [   68.295956] WARNING: possible circular locking dependency detected
> [   68.295963] 5.19.0-rc3+ #400 Not tainted
> [   68.295972] --
> [   68.295977] cc1/295 is trying to acquire lock:
> [   68.295986] 08d7f1a0
> (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_free+0x7c/0x198
> [   68.296036]
> [   68.296036] but task is already holding lock:
> [   68.296041] 8c14b820 (fs_reclaim){+.+.}-{0:0}, at:
> __alloc_pages_slowpath.constprop.0+0x4d8/0x1470
> [   68.296080]
> [   68.296080] which lock already depends on the new lock.
> [   68.296080]
> [   68.296085]
> [   68.296085] the existing dependency chain (in reverse order) is:
> [   68.296090]
> [   68.296090] -> #1 (fs_reclaim){+.+.}-{0:0}:
> [   68.296111]    fs_reclaim_acquire+0xb8/0x150
> [   68.296130]    dma_resv_lockdep+0x298/0x3fc
> [   68.296148]    do_one_initcall+0xe4/0x5f8
> [   68.296163]    kernel_init_freeable+0x414/0x49c
> [   68.296180]    kernel_init+0x2c/0x148
> [   68.296195]    ret_from_fork+0x10/0x20
> [   68.296207]
> [   68.296207] -> #0 (reservation_ww_class_mutex){+.+.}-{3:3}:
> [   68.296229]    __lock_acquire+0x1724/0x2398
> [   68.296246]    lock_acquire+0x218/0x5b0
> [   68.296260]    __ww_mutex_lock.constprop.0+0x158/0x2378
> [   68.296277]    ww_mutex_lock+0x7c/0x4d8
> [   68.296291]    drm_gem_shmem_free+0x7c/0x198
> [   68.296304]    panfrost_gem_free_object+0x118/0x138
> [   68.296318]    drm_gem_object_free+0x40/0x68
> [   68.296334]    drm_gem_shmem_shrinker_run_objects_scan+0x42c/0x5b8
> [   68.296352]    drm_gem_shmem_shrinker_scan_objects+0xa4/0x170
> [   68.296368]    do_shrink_slab+0x220/0x808
> [   68.296381]    shrink_slab+0x11c/0x408
> [   68.296392]    shrink_node+0x6ac/0xb90
> [   68.296403]    do_try_to_free_pages+0x1dc/0x8d0
> [   68.296416]    try_to_free_pages+0x1ec/0x5b0
> [   68.296429]    __alloc_pages_slowpath.constprop.0+0x528/0x1470
> [   68.296444]    __alloc_pages+0x4e0/0x5b8
> [   68.296455]    __folio_alloc+0x24/0x60
> [   68.296467]    vma_alloc_folio+0xb8/0x2f8
> [   68.296483]    alloc_zeroed_user_highpage_movable+0x58/0x68
> [   68.296498]    __handle_mm_fault+0x918/0x12a8
> [   68.296513]    handle_mm_fault+0x130/0x300
> [   68.296527]    do_page_fault+0x1d0/0x568
> [   68.296539]    do_translation_fault+0xa0/0xb8
> [   68.296551]    do_mem_abort+0x68/0xf8
> [   68.296562]    el0_da+0x74/0x100
> [   68.296572]    el0t_64_sync_handler+0x68/0xc0
> [   68.296585]    el0t_64_sync+0x18c/0x190
> [   68.296596]
> [   68.296596] other info that might help us debug this:
> [   68.296596]
> [   68.296601]  Possible unsafe locking scenario:
> [   68.296601]
> [   68.296604]    CPU0    CPU1
> [   68.296608]        
> [   68.296612]   lock(fs_reclaim);
> [   68.296622] lock(reservation_ww_class_mutex);
> [   68.296633]    lock(fs_reclaim);
> [   68.296644]   lock(reservation_ww_class_mutex);
> [   68.296654]
> [   68.296654]  *** DEADLOCK ***

This splat can be ignored for now. I'm aware of it, although I haven't
looked closely at how to fix it since it's a kind of lockdep
misreporting.

-- 
Best regards,
Dmitry


Re: Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash

2022-06-28 Thread Jocelyn Falempe

On 28/06/2022 10:43, Thomas Zimmermann wrote:

Hi

On 27.06.22 at 19:25, Linus Torvalds wrote:

On Mon, Jun 27, 2022 at 1:02 AM Javier Martinez Canillas
 wrote:


The flag was dropped because it was causing drivers that requested their
memory resource with pci_request_region() to fail with -EBUSY (e.g: the
vmwgfx driver):

https://www.spinics.net/lists/dri-devel/msg329672.html


See, *that* link would have been useful in the commit.

Rather than the useless link it has.

Anyway, removing the busy bit just made things worse.


If simplefb is actually still using that frame buffer, it's a problem.
If it isn't, then maybe that resource should have been released?


It's supposed to be released once amdgpu asks for conflicting 
framebuffers
to be removed calling 
drm_aperture_remove_conflicting_pci_framebuffers().


That most definitely doesn't happen. This is on a running system:

   [torvalds@ryzen linux]$ cat /proc/iomem | grep BOOTFB
 - : BOOTFB

so I suspect that the BUSY bit was never the problem - even for
vmwgfx). The problem was that simplefb doesn't remove its resource.

Guys, the *reason* for resource management is to catch people that
trample over each other's resources.

You literally basically disabled the code that checked for it by
removing the BUSY flag, and just continued to have conflicting
resources.

That isn't a "fix", that is literally "we are ignoring and breaking
the whole reason that the resource tree exists, but we'll still use it
for no good reason".


The EFI/VESA framebuffer is represented by a platform device. The BUSY 
flag we removed is in the 'sysfb' code that creates this device. The 
BOOTFB resource you see in your /proc/iomem is the framebuffer memory. 
The code is in sysfb_create_simplefb() [1]


Later during boot a device driver, 'simplefb' or 'simpledrm', binds to 
the device and reserves the framebuffer memory for rendering into it. 
For example in simpledrm. [2] At that point a BUSY flag is set for that 
reservation.




Yeah, yeah, most modern drivers ignore the IO resource tree, because
they end up working on another resource level entirely: they work on
not the IO resources, but on the "driver level" instead, and just
attach to PCI devices.

So these days, few enough drivers even care about the IO resource
tree, and it's mostly used for (a) legacy devices (think ISA) and (b)
the actual bus resource handling (so the PCI code itself uses it to
sort out resource use and avoid conflicts, but PCI drivers themselves
generally then don't care, because the bus has "taken care of it").

So that's why the amdgpu driver itself doesn't care about resource
allocations, and we only get a warning for that memory type case, not
for any deeper resource case.

And apparently the vmwgfx driver still uses that legacy "let's claim
all PCI resources in the resource tree" instead of just claiming the
device itself. Which is why it hit this whole BOOTFB resource thing
even harder.

But the real bug is that BOOTFB seems to claim this resource even
after it is done with it and other drivers want to take over.


Once amdgpu wants to take over, it has to remove the platform device 
that represents the EFI framebuffer. It does so by calling the 
drm_aperture_ function, which in turn calls 
platform_device_unregister(). Afterwards, the platform device, driver 
and BOOTFB range are supposed to be entirely gone.
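
For reference, a minimal sketch of the call described above, assuming the
v5.19-era helper signature that takes the requesting drm_driver; the driver
and function names here are illustrative, not taken from amdgpu:

    static int example_pci_probe(struct pci_dev *pdev,
                                 const struct pci_device_id *ent)
    {
            int ret;

            /* Kick out simplefb/simpledrm and, with the patches mentioned
             * below, the sysfb platform device backing the firmware
             * framebuffer, before the driver takes over the hardware. */
            ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev,
                                                &example_drm_driver);
            if (ret)
                    return ret;

            /* ... regular device initialization continues here ... */
            return 0;
    }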


Unfortunately, this currently only works if a driver is bound to the 
platform device. Without simpledrm or simplefb, amdgpu won't find the 
platform device to remove.


I guess what happens on your system is that sysfb creates a device for 
the EFI framebuffer and then amdgpu comes and doesn't find it for 
removal. And later you see these warnings because BOOTFB is still around.


Javier already provided patches for this scenario, which are in the DRM 
tree. From drm-next, please cherry-pick


   0949ee75da6c ("firmware: sysfb: Make sysfb_create_simplefb() return a 
pdev pointer")


   bc824922b264 ("firmware: sysfb: Add sysfb_disable() helper function")

   873eb3b11860 ("fbdev: Disable sysfb device registration when removing 
conflicting FBs")


for testing. With these patches, amdgpu will find the sysfb device and 
unregister it.


The patches are queued up for the next merge window. If they resolve the
issue, we'll send them with the next round of fixes.


I was able to reproduce the warning with kernel v5.19-rc4, a radeon GPU 
and the following config:


CONFIG_SYSFB=y
CONFIG_SYSFB_SIMPLEFB=y
# CONFIG_DRM_SIMPLEDRM is not set
# CONFIG_FB_SIMPLE is not set

After applying the 3 patches you mentioned, the issue is resolved. (at 
least on my setup).


Best regards,

--

Jocelyn



Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/firmware/sysfb_simplefb.c#L115 

[2] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/tiny/simpledrm.c#L544 





Not the BUSY bit.

  Linus






Re: [PATCH] drm/amd/display: expose additional modifier for DCN32/321

2022-06-28 Thread Marek Olšák
This needs to be a loop inserting all 64K_R_X and all 256K_R_X modifiers.

If num_pipes > 16, insert 256K_R_X first, else insert 64K_R_X first. Insert
the other one after that. For example:

  for (unsigned i = 0; i < 2; i++) {
     unsigned swizzle_r_x;

     /* Insert the best one first. */
     if (num_pipes > 16)
        swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX11_256K_R_X :
                           AMD_FMT_MOD_TILE_GFX9_64K_R_X;
     else
        swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX9_64K_R_X :
                           AMD_FMT_MOD_TILE_GFX11_256K_R_X;

     uint64_t modifier_r_x = ...

     add_modifier(,,,
     add_modifier(,,,
     add_modifier(,,,
     add_modifier(,,,
     add_modifier(,,,
  }


Marek
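
Filling in the elided pieces purely for illustration (reusing only the helper
and macro names visible in the quoted patch below, and leaving out the DCC
variants), the loop could look roughly like this:

    for (unsigned int i = 0; i < 2; i++) {
            unsigned int swizzle_r_x;
            uint64_t modifier_r_x;

            /* Insert the best swizzle mode first. */
            if (num_pipes > 16)
                    swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX11_256K_R_X :
                                       AMD_FMT_MOD_TILE_GFX9_64K_R_X;
            else
                    swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX9_64K_R_X :
                                       AMD_FMT_MOD_TILE_GFX11_256K_R_X;

            modifier_r_x = AMD_FMT_MOD |
                    AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX11) |
                    AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
                    AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
                    AMD_FMT_MOD_SET(PACKERS, pkrs);

            /* The DCC-enabled variants would be added here first, as in the
             * existing add_gfx11_modifiers(), then the plain R_X one. */
            add_modifier(mods, size, capacity, modifier_r_x);
    }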

On Mon, Jun 27, 2022 at 10:32 AM Aurabindo Pillai 
wrote:

> [Why&How]
> Some userspace expects a backwards-compatible modifier on DCN32/321. For
> hardware with num_pipes more than 16, we expose the most efficient
> modifier first. As a fallback, we need to expose the slightly inefficient
> modifier AMD_FMT_MOD_TILE_GFX9_64K_R_X after the best option.
>
> Also set the number of packers to fixed value as required per hardware
> documentation. This value is cached during hardware initialization and
> can be read through the base driver.
>
> Fixes: 0a2c19562ffe ('Revert "drm/amd/display: ignore modifiers when
> checking for format support"')
> Signed-off-by: Aurabindo Pillai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 3 +--
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++-
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index 1a512d78673a..0f5bfe5df627 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -743,8 +743,7 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
> switch (version) {
> case AMD_FMT_MOD_TILE_VER_GFX11:
> pipe_xor_bits = min(block_size_bits - 8,
> pipes);
> -   packers = min(block_size_bits - 8 -
> pipe_xor_bits,
> -
>  ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs));
> +   packers =
> ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs);
> break;
> case AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS:
> pipe_xor_bits = min(block_size_bits - 8,
> pipes);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index c9145864ed2b..bea9cee37f65 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5203,6 +5203,7 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
> int pkrs = 0;
> u32 gb_addr_config;
> unsigned swizzle_r_x;
> +   uint64_t modifier_r_x_best;
> uint64_t modifier_r_x;
> uint64_t modifier_dcc_best;
> uint64_t modifier_dcc_4k;
> @@ -5223,10 +5224,12 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
>
> modifier_r_x = AMD_FMT_MOD |
> AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX11) |
> -   AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
> AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
> AMD_FMT_MOD_SET(PACKERS, pkrs);
>
> +   modifier_r_x_best = modifier_r_x | AMD_FMT_MOD_SET(TILE,
> AMD_FMT_MOD_TILE_GFX11_256K_R_X);
> +   modifier_r_x = modifier_r_x | AMD_FMT_MOD_SET(TILE,
> AMD_FMT_MOD_TILE_GFX9_64K_R_X);
> +
> /* DCC_CONSTANT_ENCODE is not set because it can't vary with gfx11
> (it's implied to be 1). */
> modifier_dcc_best = modifier_r_x |
> AMD_FMT_MOD_SET(DCC, 1) |
> @@ -5247,6 +5250,9 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
> add_modifier(mods, size, capacity, modifier_dcc_best |
> AMD_FMT_MOD_SET(DCC_RETILE, 1));
> add_modifier(mods, size, capacity, modifier_dcc_4k |
> AMD_FMT_MOD_SET(DCC_RETILE, 1));
>
> +   if (num_pipes > 16)
> +   add_modifier(mods, size, capacity, modifier_r_x_best);
> +
> add_modifier(mods, size, capacity, modifier_r_x);
>
> add_modifier(mods, size, capacity, AMD_FMT_MOD |
> --
> 2.36.1
>
>


Re: [PATCH 09/14] drm/radeon: use drm_oom_badness

2022-06-28 Thread Michel Dänzer
On 2022-06-24 10:04, Christian König wrote:
> This allows the OOM killer to make a better decision which process to reap.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/radeon/radeon_drv.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
> b/drivers/gpu/drm/radeon/radeon_drv.c
> index 956c72b5aa33..11d310cdd2e8 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -550,6 +550,7 @@ static const struct file_operations 
> radeon_driver_kms_fops = {
>  #ifdef CONFIG_COMPAT
>   .compat_ioctl = radeon_kms_compat_ioctl,
>  #endif
> + .file_rss = drm_file_rss,
>  };
>  
>  static const struct drm_ioctl_desc radeon_ioctls_kms[] = {

Shortlog should now say "use drm_file_rss", right?


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer


[PATCH 09/22] drm: amdgpu: amdgpu_dm: fix kernel-doc markups

2022-06-28 Thread Mauro Carvalho Chehab
There are 5 undocumented fields in struct amdgpu_display_manager.

Add documentation for them, fixing those warnings:

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'dmub_outbox_params' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'num_of_edps' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'disable_hpd_irq' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'dmub_aux_transfer_done' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'delayed_hpd_wq' not described in 
'amdgpu_display_manager'

Signed-off-by: Mauro Carvalho Chehab 
---

To avoid mailbombing on a large number of people, only mailing lists were C/C 
on the cover.
See [PATCH 00/22] at: 
https://lore.kernel.org/all/cover.1656409369.git.mche...@kernel.org/

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 547fc1547977..73755b304299 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -242,6 +242,13 @@ struct hpd_rx_irq_offload_work {
  * @force_timing_sync: set via debugfs. When set, indicates that all connected
  *displays will be forced to synchronize.
  * @dmcub_trace_event_en: enable dmcub trace events
+ * @dmub_outbox_params: DMUB Outbox parameters
+ * @num_of_edps: number of backlight eDPs
+ * @disable_hpd_irq: disables all HPD and HPD RX interrupt handling in the
+ *  driver when true
+ * @dmub_aux_transfer_done: struct completion used to indicate when DMUB
+ * transfers are done
+ * @delayed_hpd_wq: work queue used to delay DMUB HPD work
  */
 struct amdgpu_display_manager {
 
-- 
2.36.1



[PATCH 11/22] drm: amd: amd_shared.h: Add missing doc for PP_GFX_DCS_MASK

2022-06-28 Thread Mauro Carvalho Chehab
This symbol is missing documentation:

drivers/gpu/drm/amd/include/amd_shared.h:224: warning: Enum value 
'PP_GFX_DCS_MASK' not described in enum 'PP_FEATURE_MASK'

Document it.

Fixes: 680602d6c2d6 ("drm/amd/pm: enable DCS")
Signed-off-by: Mauro Carvalho Chehab 
---

To avoid mailbombing on a large number of people, only mailing lists were C/C 
on the cover.
See [PATCH 00/22] at: 
https://lore.kernel.org/all/cover.1656409369.git.mche...@kernel.org/

 drivers/gpu/drm/amd/include/amd_shared.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index bcdf7453a403..2e02a6fc1717 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -193,6 +193,7 @@ enum amd_powergating_state {
  * @PP_ACG_MASK: Adaptive clock generator.
  * @PP_STUTTER_MODE: Stutter mode.
  * @PP_AVFS_MASK: Adaptive voltage and frequency scaling.
+ * @PP_GFX_DCS_MASK: GFX Async DCS.
  *
  * To override these settings on boot, append amdgpu.ppfeaturemask= to
  * the kernel's command line parameters. This is usually done through a 
system's
-- 
2.36.1



[PATCH 10/22] drm: amdgpu: amdgpu_device.c: fix a kernel-doc markup

2022-06-28 Thread Mauro Carvalho Chehab
The function was renamed without also renaming the kernel-doc markup:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5095: warning: expecting 
prototype for amdgpu_device_gpu_recover_imp(). Prototype was for 
amdgpu_device_gpu_recover() instead

Signed-off-by: Mauro Carvalho Chehab 
---

To avoid mailbombing on a large number of people, only mailing lists were C/C 
on the cover.
See [PATCH 00/22] at: 
https://lore.kernel.org/all/cover.1656409369.git.mche...@kernel.org/

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9d6418bb963e..6d74767591e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5079,7 +5079,7 @@ static inline void 
amdggpu_device_stop_pedning_resets(struct amdgpu_device *adev
 
 
 /**
- * amdgpu_device_gpu_recover_imp - reset the asic and recover scheduler
+ * amdgpu_device_gpu_recover - reset the asic and recover scheduler
  *
  * @adev: amdgpu_device pointer
  * @job: which job trigger hang
-- 
2.36.1



[PATCH 00/22] Fix kernel-doc warnings at linux-next

2022-06-28 Thread Mauro Carvalho Chehab
As we're currently discussing making kernel-doc issues fatal when
CONFIG_WERROR is enabled, let's fix all 60 kernel-doc warnings
inside linux-next:

arch/x86/include/uapi/asm/sgx.h:19: warning: Enum value 
'SGX_PAGE_MEASURE' not described in enum 'sgx_page_flags'
arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
member 'rdi' not described in 'sgx_enclave_user_handler_t'
arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
member 'rsi' not described in 'sgx_enclave_user_handler_t'
arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
member 'rdx' not described in 'sgx_enclave_user_handler_t'
arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
member 'rsp' not described in 'sgx_enclave_user_handler_t'
arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
member 'r8' not described in 'sgx_enclave_user_handler_t'
arch/x86/include/uapi/asm/sgx.h:97: warning: Function parameter or 
member 'r9' not described in 'sgx_enclave_user_handler_t'
arch/x86/include/uapi/asm/sgx.h:124: warning: Function parameter or 
member 'reserved' not described in 'sgx_enclave_run'
drivers/devfreq/devfreq.c:707: warning: Function parameter or member 
'val' not described in 'qos_min_notifier_call'
drivers/devfreq/devfreq.c:707: warning: Function parameter or member 
'ptr' not described in 'qos_min_notifier_call'
drivers/devfreq/devfreq.c:717: warning: Function parameter or member 
'val' not described in 'qos_max_notifier_call'
drivers/devfreq/devfreq.c:717: warning: Function parameter or member 
'ptr' not described in 'qos_max_notifier_call'
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5095: warning: expecting 
prototype for amdgpu_device_gpu_recover_imp(). Prototype was for 
amdgpu_device_gpu_recover() instead
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'dmub_outbox_params' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'num_of_edps' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'disable_hpd_irq' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'dmub_aux_transfer_done' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:544: warning: 
Function parameter or member 'delayed_hpd_wq' not described in 
'amdgpu_display_manager'
drivers/gpu/drm/amd/include/amd_shared.h:224: warning: Enum value 
'PP_GFX_DCS_MASK' not described in enum 'PP_FEATURE_MASK'
drivers/gpu/drm/scheduler/sched_main.c:999: warning: Function parameter 
or member 'dev' not described in 'drm_sched_init'
drivers/usb/dwc3/core.h:1328: warning: Function parameter or member 
'async_callbacks' not described in 'dwc3'
drivers/usb/dwc3/gadget.c:675: warning: Function parameter or member 
'mult' not described in 'dwc3_gadget_calc_tx_fifo_size'
fs/attr.c:36: warning: Function parameter or member 'ia_vfsuid' not 
described in 'chown_ok'
fs/attr.c:36: warning: Excess function parameter 'uid' description in 
'chown_ok'
fs/attr.c:63: warning: Function parameter or member 'ia_vfsgid' not 
described in 'chgrp_ok'
fs/attr.c:63: warning: Excess function parameter 'gid' description in 
'chgrp_ok'
fs/namei.c:649: warning: Function parameter or member 'mnt' not 
described in 'path_connected'
fs/namei.c:649: warning: Function parameter or member 'dentry' not 
described in 'path_connected'
fs/namei.c:1089: warning: Function parameter or member 'inode' not 
described in 'may_follow_link'
include/drm/gpu_scheduler.h:463: warning: Function parameter or member 
'dev' not described in 'drm_gpu_scheduler'
include/linux/dcache.h:309: warning: expecting prototype for dget, 
dget_dlock(). Prototype was for dget_dlock() instead
include/linux/fscache.h:270: warning: Function parameter or member 
'cookie' not described in 'fscache_use_cookie'
include/linux/fscache.h:270: warning: Excess function parameter 
'object' description in 'fscache_use_cookie'
include/linux/fscache.h:287: warning: Function parameter or member 
'cookie' not described in 'fscache_unuse_cookie'
include/linux/fscache.h:287: warning: Excess function parameter 
'object' description in 'fscache_unuse_cookie'
include/linux/genalloc.h:54: warning: Function parameter or member 
'start_addr' not described in 'genpool_algo_t'
include/linux/kfence.h:221: warning: Function parameter or member 
'slab' not described in '__kfence_obj_info'
include/linux/regulator/driver.h:434: warning: Function parameter or 
member 'n_r

[Bug][5.19-rc0] Between commits fdaf9a5840ac and babf0bb978e3 GPU stopped entering in graphic mode.

2022-06-28 Thread Mikhail Gavrilov
Hi guys.
Between commits fdaf9a5840ac and babf0bb978e3 GPU stopped entering in
graphic mode instead I see black screen with constantly glowing
cursor. Demonstration: https://youtu.be/rGL4LsHMae4
In the kernel logs there are references to hung processes:
[  149.363465] rfkill: input handler disabled
[  249.072478] INFO: task (brt-dbus):1645 blocked for more than 122 seconds.
[  249.072515]   Tainted: GWL     ---
5.19.0-0.rc0.20220526gitbabf0bb978e3.4.fc37.x86_64 #1
[  249.072520] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  249.072524] task:(brt-dbus)  state:D stack:14384 pid: 1645
ppid: 1 flags:0x0002
[  249.072536] Call Trace:
[  249.072540]  
[  249.072551]  __schedule+0x492/0x1640
[  249.072560]  ? lock_is_held_type+0xe8/0x140
[  249.072569]  ? find_held_lock+0x32/0x80
[  249.072584]  schedule+0x4e/0xb0
[  249.072591]  schedule_preempt_disabled+0x14/0x20
[  249.072597]  __mutex_lock+0x423/0x890
[  249.072608]  ? amdgpu_ctx_mgr_entity_flush+0x32/0xc0 [amdgpu]
[  249.072818]  ? amdgpu_ctx_mgr_entity_flush+0x32/0xc0 [amdgpu]
[  249.073010]  amdgpu_ctx_mgr_entity_flush+0x32/0xc0 [amdgpu]
[  249.073207]  amdgpu_flush+0x25/0x40 [amdgpu]
[  249.074088]  filp_close+0x31/0x70
[  249.074097]  __close_range+0x130/0x320
[  249.074108]  __x64_sys_close_range+0x13/0x20
[  249.074113]  do_syscall_64+0x5b/0x80
[  249.074120]  ? lockdep_hardirqs_on+0x7d/0x100
[  249.074127]  ? do_syscall_64+0x67/0x80
[  249.074135]  ? do_syscall_64+0x67/0x80
[  249.074140]  ? lockdep_hardirqs_on+0x7d/0x100
[  249.074147]  ? do_syscall_64+0x67/0x80
[  249.074154]  ? lock_is_held_type+0xe8/0x140
[  249.074164]  ? asm_exc_page_fault+0x27/0x30
[  249.074171]  ? lockdep_hardirqs_on+0x7d/0x100
[  249.074178]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  249.074184] RIP: 0033:0x7fd71f54f97b
[  249.074208] RSP: 002b:7fffc8e752a8 EFLAGS: 0246 ORIG_RAX:
01b4
[  249.074215] RAX: ffda RBX: 7fffc8e752b0 RCX: 7fd71f54f97b
[  249.074220] RDX:  RSI:  RDI: 0027
[  249.074224] RBP: 7fffc8e75330 R08:  R09: 7fffc8e75380
[  249.074228] R10: 7fffc8e751f0 R11: 0246 R12: 0002
[  249.074232] R13: 7fffc8e75340 R14:  R15: 0002
[  249.074252]  
[  249.074261] INFO: task (ostnamed):1718 blocked for more than 122 seconds.
[  249.074266]   Tainted: GWL     ---
5.19.0-0.rc0.20220526gitbabf0bb978e3.4.fc37.x86_64 #1
[  249.074285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  249.074289] task:(ostnamed)  state:D stack:14552 pid: 1718
ppid: 1 flags:0x0006
[  249.074299] Call Trace:
[  249.074302]  
[  249.074310]  __schedule+0x492/0x1640
[  249.074316]  ? lock_is_held_type+0xe8/0x140
[  249.074324]  ? find_held_lock+0x32/0x80
[  249.074339]  schedule+0x4e/0xb0
[  249.074346]  schedule_preempt_disabled+0x14/0x20
[  249.074352]  __mutex_lock+0x423/0x890
[  249.074361]  ? amdgpu_ctx_mgr_entity_flush+0x32/0xc0 [amdgpu]
[  249.074564]  ? amdgpu_ctx_mgr_entity_flush+0x32/0xc0 [amdgpu]
[  249.074754]  amdgpu_ctx_mgr_entity_flush+0x32/0xc0 [amdgpu]
[  249.074950]  amdgpu_flush+0x25/0x40 [amdgpu]
[  249.075133]  filp_close+0x31/0x70
[  249.075140]  __close_range+0x130/0x320
[  249.075150]  __x64_sys_close_range+0x13/0x20
[  249.075154]  do_syscall_64+0x5b/0x80
[  249.075164]  ? lock_is_held_type+0xe8/0x140
[  249.075175]  ? do_syscall_64+0x67/0x80
[  249.075180]  ? lockdep_hardirqs_on+0x7d/0x100
[  249.075187]  ? do_syscall_64+0x67/0x80
[  249.075194]  ? lock_is_held_type+0xe8/0x140
[  249.075204]  ? asm_exc_page_fault+0x27/0x30
[  249.075210]  ? lockdep_hardirqs_on+0x7d/0x100
[  249.075217]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  249.075222] RIP: 0033:0x7fd71f54f97b
[  249.075231] RSP: 002b:7fffc8e752a8 EFLAGS: 0246 ORIG_RAX:
01b4
[  249.075237] RAX: ffda RBX: 7fffc8e752b0 RCX: 7fd71f54f97b
[  249.075241] RDX:  RSI: 00b9 RDI: 0027
[  249.075245] RBP: 7fffc8e75330 R08:  R09: 7fffc8e75380
[  249.075249] R10: 7fffc8e751f0 R11: 0246 R12: 0004
[  249.075253] R13: 7fffc8e75340 R14:  R15: 0003
[  249.075289]  
[  249.075294] INFO: task (pcscd):1749 blocked for more than 122 seconds.
[  249.075298]   Tainted: GWL     ---
5.19.0-0.rc0.20220526gitbabf0bb978e3.4.fc37.x86_64 #1
[  249.075302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  249.075306] task:(pcscd) state:D stack:14256 pid: 1749
ppid: 1 flags:0x0002
[  249.075314] Call Trace:
[  249.075318]  
[  249.075325]  __schedule+0x492/0x1640
[  249.075331]  ? lock_is_held_type+0xe8/0x140
[  249.075339]  ? find_held_lock+0x32/0x80
[  249.075353]  schedule+0x4e/0xb0
[  249.075360]  schedule_preempt_disabled+

Re: Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash

2022-06-28 Thread Thomas Zimmermann

Hi

On 27.06.22 at 19:25, Linus Torvalds wrote:

On Mon, Jun 27, 2022 at 1:02 AM Javier Martinez Canillas
 wrote:


The flag was dropped because it was causing drivers that requested their
memory resource with pci_request_region() to fail with -EBUSY (e.g: the
vmwgfx driver):

https://www.spinics.net/lists/dri-devel/msg329672.html


See, *that* link would have been useful in the commit.

Rather than the useless link it has.

Anyway, removing the busy bit just made things worse.


If simplefb is actually still using that frame buffer, it's a problem.
If it isn't, then maybe that resource should have been released?


It's supposed to be released once amdgpu asks for conflicting framebuffers
to be removed calling drm_aperture_remove_conflicting_pci_framebuffers().


That most definitely doesn't happen. This is on a running system:

   [torvalds@ryzen linux]$ cat /proc/iomem | grep BOOTFB
 - : BOOTFB

so I suspect that the BUSY bit was never the problem - even for
vmwgfx). The problem was that simplefb doesn't remove its resource.

Guys, the *reason* for resource management is to catch people that
trample over each other's resources.

You literally basically disabled the code that checked for it by
removing the BUSY flag, and just continued to have conflicting
resources.

That isn't a "fix", that is literally "we are ignoring and breaking
the whole reason that the resource tree exists, but we'll still use it
for no good reason".


The EFI/VESA framebuffer is represented by a platform device. The BUSY 
flag we removed is in the 'sysfb' code that creates this device. The 
BOOTFB resource you see in your /proc/iomem is the framebuffer memory. 
The code is in sysfb_create_simplefb() [1]


Later during boot a device driver, 'simplefb' or 'simpledrm', binds to 
the device and reserves the framebuffer memory for rendering into it. 
For example in simpledrm. [2] At that point a BUSY flag is set for that 
reservation.




Yeah, yeah, most modern drivers ignore the IO resource tree, because
they end up working on another resource level entirely: they work on
not the IO resources, but on the "driver level" instead, and just
attach to PCI devices.

So these days, few enough drivers even care about the IO resource
tree, and it's mostly used for (a) legacy devices (think ISA) and (b)
the actual bus resource handling (so the PCI code itself uses it to
sort out resource use and avoid conflicts, but PCI drivers themselves
generally then don't care, because the bus has "taken care of it").

So that's why the amdgpu driver itself doesn't care about resource
allocations, and we only get a warning for that memory type case, not
for any deeper resource case.

And apparently the vmwgfx driver still uses that legacy "let's claim
all PCI resources in the resource tree" instead of just claiming the
device itself. Which is why it hit this whole BOOTFB resource thing
even harder.

But the real bug is that BOOTFB seems to claim this resource even
after it is done with it and other drivers want to take over.


Once amdgpu wants to take over, it has to remove the platform device 
that represents the EFI framebuffer. It does so by calling the 
drm_aperture_ function, which in turn calls 
platform_device_unregister(). Afterwards, the platform device, driver 
and BOOTFB range are supposed to be entirely gone.


Unfortunately, this currently only works if a driver is bound to the 
platform device. Without simpledrm or simplefb, amdgpu won't find the 
platform device to remove.


I guess what happens on your system is that sysfb creates a device for 
the EFI framebuffer and then amdgpu comes and doesn't find it for 
removal. And later you see these warnings because BOOTFB is still around.


Javier already provided patches for this scenario, which are in the DRM 
tree. From drm-next, please cherry-pick


  0949ee75da6c ("firmware: sysfb: Make sysfb_create_simplefb() return a 
pdev pointer")


  bc824922b264 ("firmware: sysfb: Add sysfb_disable() helper function")

  873eb3b11860 ("fbdev: Disable sysfb device registration when removing 
conflicting FBs")


for testing. With these patches, amdgpu will find the sysfb device and 
unregister it.


The patches are queued up for the next merge window. If they resolve the
issue, we'll send them with the next round of fixes.


Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/firmware/sysfb_simplefb.c#L115
[2] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/tiny/simpledrm.c#L544




Not the BUSY bit.

  Linus


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev

