Re: [PATCH] drm/amdgpu: Add early fini callback

2021-05-19 Thread Andrey Grodzovsky



On 2021-05-19 11:29 p.m., Felix Kuehling wrote:

Am 2021-05-19 um 11:20 p.m. schrieb Andrey Grodzovsky:

Use it to call disply code dependent on device->drv_data
before it's set to NULL on device unplug

v5:
Move HW finilization into this callback to prevent MMIO accesses
post cpi remove.

v7:
Split kfd suspend from device exit to expdite HW related
stuff to amdgpu_pci_remove

v8:
Squash previous KFD commit into this commit to avoid compile break.

Signed-off-by: Andrey Grodzovsky 
Acked-by: Christian König 

See one cosmetic comment inline. With that fixed the patch is

Reviewed-by: Felix Kuehling 



Thanks for quick response, updated.
Since this was last commit to review I also pushed the series to
drm-misc-next.

Andrey





---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 59 +--
  drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  3 +-
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
  drivers/gpu/drm/amd/include/amd_shared.h  |  2 +
  6 files changed, 56 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5f6696a3c778..2b06dee9a0ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
}
  }
  
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)

+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
  {
if (adev->kfd.dev) {
kgd2kfd_device_exit(adev->kfd.dev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 5ffb07b02810..d8a537e8aea5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry);
  void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
  void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
  int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
uint32_t *ib_cmd, uint32_t ib_len);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8bee95ad32d9..bc75e35dd8d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
return 0;
  }
  
-/**

- * amdgpu_device_ip_fini - run fini for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Main teardown pass for hardware IPs.  The list of all the hardware
- * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
- * are run.  hw_fini tears down the hardware associated with each IP
- * and sw_fini tears down any software state associated with each IP.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
  {
int i, r;
  
-	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)

-   amdgpu_virt_release_ras_err_handler_data(adev);
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   if (!adev->ip_blocks[i].version->funcs->early_fini)
+   continue;
  
-	amdgpu_ras_pre_fini(adev);

+   r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
+   if (r) {
+   DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
+ adev->ip_blocks[i].version->funcs->name, r);
+   }
+   }
  
-	if (adev->gmc.xgmi.num_physical_nodes > 1)

-   amdgpu_xgmi_remove_device(adev);
+   amdgpu_amdkfd_suspend(adev, false);
  
  	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);

amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
  
-	amdgpu_amdkfd_device_fini(adev);

-
/* need to disable SMC first */
for (i = 0; i < adev->num_ip_blocks; i++) {
if (!adev->ip_blocks[i].status.hw)
@@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)
adev->ip_blocks[i].status.hw = false;
}
  
+	return 0;

+}
+
+/**
+ * amdgpu_device_ip_fini - run fini for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Main teardown pass for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
+ * are run.  hw_fini tears down 

[GIT PULL] exynos-drm-fixes

2021-05-19 Thread Inki Dae
Hi Dave,

   Just one fixup to kerneldoc and two cleanups to drop redundant error
   messages.

   Please kindly let me know if there is any problem.

Thanks,
Inki Dae

The following changes since commit d07f6ca923ea0927a1024dfccafc5b53b61cfecc:

  Linux 5.13-rc2 (2021-05-16 15:27:44 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos 
tags/exynos-drm-fixes-for-v5.13-rc3

for you to fetch changes up to a470c5665b3b918c31bcc912234862803b10ba00:

  drm/exynos/decon5433: Remove redundant error printing in 
exynos5433_decon_probe() (2021-05-17 20:31:39 +0900)


Fixup
- Correct kerneldoc of fimd_shadow_protect_win function.

Cleanup
- Drop redundant error messages.


Krzysztof Kozlowski (1):
  drm/exynos: correct exynos_drm_fimd kerneldoc

Zhen Lei (2):
  drm/exynos: Remove redundant error printing in exynos_dsi_probe()
  drm/exynos/decon5433: Remove redundant error printing in 
exynos5433_decon_probe()

 drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 4 +---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c   | 4 +---
 drivers/gpu/drm/exynos/exynos_drm_fimd.c  | 2 +-
 3 files changed, 3 insertions(+), 7 deletions(-)


Re: [PATCH] drm/amdgpu: Let userptr BO ttm have TTM_PAGE_FLAG_SG set

2021-05-19 Thread Felix Kuehling
I think this works for KFD userptr BOs. But this problem is probably not
specific to KFD. It's only most obvious with KFD because we rely so
heavily for userptrs.

I don't really understand why we're messing with TTM_PAGE_FLAG_SG in
amdgpu_ttm_tt_populate and amdgpu_ttm_tt_unpopulate. And why are userptr
BOs created as ttm_bo_type_device, not ttm_bo_type_sg? Christian, do you
know about the history of this code?

Either way, the patch is

Acked-by: Felix Kuehling 

Thanks for looking into this!

Regards,
  Felix

Am 2021-05-19 um 11:15 p.m. schrieb xinhui pan:
> We have met memory corruption due to unexcepted swapout/swapin.
>
> swapout function create one swap storage which is filled with zero. And
> set ttm->page_flags as TTM_PAGE_FLAG_SWAPPED. But because userptr BO ttm
> has no backend page at that time, no real data is swapout to swap
> storage.
>
> swapin function is called during userptr BO populate as
> TTM_PAGE_FLAG_SWAPPED is set. Now here is the problem, we swapin data to
> ttm bakend memory from swap storage. That just causes the memory been
> overwritten.
>
> CPU 1 CPU 2
> kfd alloc BO A(userptr) alloc BO B(GTT)
> ->init -> validate(create ttm)-> init -> validate -> populate
> init_user_pages   -> swapout BO A
> -> get_user_pages (fill up ttm->pages)
>  -> validate -> populate
>   -> swapin BO A // memory overwritten
>
> To fix this issue, we can set TTM_PAGE_FLAG_SG when we create userptr BO
> ttm. Then swapout function would not swap it.
>
> Signed-off-by: xinhui pan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  | 4 
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..9a6ea966ddb2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1410,7 +1410,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   } else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
>   domain = AMDGPU_GEM_DOMAIN_GTT;
>   alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
> - alloc_flags = 0;
> + alloc_flags = AMDGPU_AMDKFD_CREATE_USERPTR_BO;
>   if (!offset || !*offset)
>   return -EINVAL;
>   user_addr = untagged_addr(*offset);
> @@ -1477,8 +1477,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   }
>   bo->kfd_bo = *mem;
>   (*mem)->bo = bo;
> - if (user_addr)
> - bo->flags |= AMDGPU_AMDKFD_CREATE_USERPTR_BO;
>  
>   (*mem)->va = va;
>   (*mem)->domain = domain;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index c7f5cc503601..5b3f45637fb5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1119,6 +1119,10 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct 
> ttm_buffer_object *bo,
>   kfree(gtt);
>   return NULL;
>   }
> +
> + if (abo->flags & AMDGPU_AMDKFD_CREATE_USERPTR_BO)
> + gtt->ttm.page_flags |= TTM_PAGE_FLAG_SG;
> +
>   return >ttm;
>  }
>  


Re: [PATCH] drm/amdgpu: Add early fini callback

2021-05-19 Thread Felix Kuehling
Am 2021-05-19 um 11:20 p.m. schrieb Andrey Grodzovsky:
> Use it to call disply code dependent on device->drv_data
> before it's set to NULL on device unplug
>
> v5:
> Move HW finilization into this callback to prevent MMIO accesses
> post cpi remove.
>
> v7:
> Split kfd suspend from device exit to expdite HW related
> stuff to amdgpu_pci_remove
>
> v8:
> Squash previous KFD commit into this commit to avoid compile break.
>
> Signed-off-by: Andrey Grodzovsky 
> Acked-by: Christian König 

See one cosmetic comment inline. With that fixed the patch is

Reviewed-by: Felix Kuehling 


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 59 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  3 +-
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
>  drivers/gpu/drm/amd/include/amd_shared.h  |  2 +
>  6 files changed, 56 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 5f6696a3c778..2b06dee9a0ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>   }
>  }
>  
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
>  {
>   if (adev->kfd.dev) {
>   kgd2kfd_device_exit(adev->kfd.dev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 5ffb07b02810..d8a537e8aea5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
>   const void *ih_ring_entry);
>  void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>  void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
> -void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
> +void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
>  int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
>   uint32_t vmid, uint64_t gpu_addr,
>   uint32_t *ib_cmd, uint32_t ib_len);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 8bee95ad32d9..bc75e35dd8d8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct 
> amdgpu_device *adev)
>   return 0;
>  }
>  
> -/**
> - * amdgpu_device_ip_fini - run fini for hardware IPs
> - *
> - * @adev: amdgpu_device pointer
> - *
> - * Main teardown pass for hardware IPs.  The list of all the hardware
> - * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
> - * are run.  hw_fini tears down the hardware associated with each IP
> - * and sw_fini tears down any software state associated with each IP.
> - * Returns 0 on success, negative error code on failure.
> - */
> -static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
> +static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
>  {
>   int i, r;
>  
> - if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
> - amdgpu_virt_release_ras_err_handler_data(adev);
> + for (i = 0; i < adev->num_ip_blocks; i++) {
> + if (!adev->ip_blocks[i].version->funcs->early_fini)
> + continue;
>  
> - amdgpu_ras_pre_fini(adev);
> + r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
> + if (r) {
> + DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
> +   adev->ip_blocks[i].version->funcs->name, r);
> + }
> + }
>  
> - if (adev->gmc.xgmi.num_physical_nodes > 1)
> - amdgpu_xgmi_remove_device(adev);
> + amdgpu_amdkfd_suspend(adev, false);
>  
>   amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
>   amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
>  
> - amdgpu_amdkfd_device_fini(adev);
> -
>   /* need to disable SMC first */
>   for (i = 0; i < adev->num_ip_blocks; i++) {
>   if (!adev->ip_blocks[i].status.hw)
> @@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
> *adev)
>   adev->ip_blocks[i].status.hw = false;
>   }
>  
> + return 0;
> +}
> +
> +/**
> + * amdgpu_device_ip_fini - run fini for hardware IPs
> + *
> + * @adev: amdgpu_device pointer
> + *
> + * Main teardown pass for hardware IPs.  The list of all the hardware
> + * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
> + * are run.  hw_fini tears down the 

[PATCH] drm/amdgpu: Add early fini callback

2021-05-19 Thread Andrey Grodzovsky
Use it to call disply code dependent on device->drv_data
before it's set to NULL on device unplug

v5:
Move HW finilization into this callback to prevent MMIO accesses
post cpi remove.

v7:
Split kfd suspend from device exit to expdite HW related
stuff to amdgpu_pci_remove

v8:
Squash previous KFD commit into this commit to avoid compile break.

Signed-off-by: Andrey Grodzovsky 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 59 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  3 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
 drivers/gpu/drm/amd/include/amd_shared.h  |  2 +
 6 files changed, 56 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5f6696a3c778..2b06dee9a0ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
}
 }
 
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
 {
if (adev->kfd.dev) {
kgd2kfd_device_exit(adev->kfd.dev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 5ffb07b02810..d8a537e8aea5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry);
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
 int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
uint32_t *ib_cmd, uint32_t ib_len);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8bee95ad32d9..bc75e35dd8d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
return 0;
 }
 
-/**
- * amdgpu_device_ip_fini - run fini for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Main teardown pass for hardware IPs.  The list of all the hardware
- * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
- * are run.  hw_fini tears down the hardware associated with each IP
- * and sw_fini tears down any software state associated with each IP.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
 {
int i, r;
 
-   if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
-   amdgpu_virt_release_ras_err_handler_data(adev);
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   if (!adev->ip_blocks[i].version->funcs->early_fini)
+   continue;
 
-   amdgpu_ras_pre_fini(adev);
+   r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
+   if (r) {
+   DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
+ adev->ip_blocks[i].version->funcs->name, r);
+   }
+   }
 
-   if (adev->gmc.xgmi.num_physical_nodes > 1)
-   amdgpu_xgmi_remove_device(adev);
+   amdgpu_amdkfd_suspend(adev, false);
 
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-   amdgpu_amdkfd_device_fini(adev);
-
/* need to disable SMC first */
for (i = 0; i < adev->num_ip_blocks; i++) {
if (!adev->ip_blocks[i].status.hw)
@@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)
adev->ip_blocks[i].status.hw = false;
}
 
+   return 0;
+}
+
+/**
+ * amdgpu_device_ip_fini - run fini for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Main teardown pass for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
+ * are run.  hw_fini tears down the hardware associated with each IP
+ * and sw_fini tears down any software state associated with each IP.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+{
+   int i, r;
+
+   if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
+

[PATCH] drm/amdgpu: Let userptr BO ttm have TTM_PAGE_FLAG_SG set

2021-05-19 Thread xinhui pan
We have met memory corruption due to unexcepted swapout/swapin.

swapout function create one swap storage which is filled with zero. And
set ttm->page_flags as TTM_PAGE_FLAG_SWAPPED. But because userptr BO ttm
has no backend page at that time, no real data is swapout to swap
storage.

swapin function is called during userptr BO populate as
TTM_PAGE_FLAG_SWAPPED is set. Now here is the problem, we swapin data to
ttm bakend memory from swap storage. That just causes the memory been
overwritten.

CPU 1   CPU 2
kfd alloc BO A(userptr) alloc BO B(GTT)
->init -> validate(create ttm)  -> init -> validate -> populate
init_user_pages   -> swapout BO A
-> get_user_pages (fill up ttm->pages)
 -> validate -> populate
  -> swapin BO A // memory overwritten

To fix this issue, we can set TTM_PAGE_FLAG_SG when we create userptr BO
ttm. Then swapout function would not swap it.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  | 4 
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 928e8d57cd08..9a6ea966ddb2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1410,7 +1410,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
} else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
domain = AMDGPU_GEM_DOMAIN_GTT;
alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
-   alloc_flags = 0;
+   alloc_flags = AMDGPU_AMDKFD_CREATE_USERPTR_BO;
if (!offset || !*offset)
return -EINVAL;
user_addr = untagged_addr(*offset);
@@ -1477,8 +1477,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
}
bo->kfd_bo = *mem;
(*mem)->bo = bo;
-   if (user_addr)
-   bo->flags |= AMDGPU_AMDKFD_CREATE_USERPTR_BO;
 
(*mem)->va = va;
(*mem)->domain = domain;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c7f5cc503601..5b3f45637fb5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1119,6 +1119,10 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct 
ttm_buffer_object *bo,
kfree(gtt);
return NULL;
}
+
+   if (abo->flags & AMDGPU_AMDKFD_CREATE_USERPTR_BO)
+   gtt->ttm.page_flags |= TTM_PAGE_FLAG_SG;
+
return >ttm;
 }
 
-- 
2.25.1



[pull] amdgpu, amdkfd, radeon drm-next-5.14

2021-05-19 Thread Alex Deucher
Hi Dave, Daniel,

New stuff for 5.14, same as last week, but with fixed up fixes tag.

The following changes since commit af8352f1ff54c4fecf84e36315fd1928809a580b:

  Merge tag 'drm-msm-next-2021-04-11' of https://gitlab.freedesktop.org/drm/msm 
into drm-next (2021-04-13 23:35:54 +0200)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-5.14-2021-05-19

for you to fetch changes up to 2bb5b5f688cbbd5030629905d3ed8032ab46e79f:

  drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are 
connected (2021-05-19 22:29:40 -0400)


amd-drm-next-5.14-2021-05-19:

amdgpu:
- Aldebaran updates
- More LTTPR display work
- Vangogh updates
- SDMA 5.x GCR fixes
- RAS fixes
- PCIe ASPM support
- Modifier fixes
- Enable TMZ on Renoir
- Buffer object code cleanup
- Display overlay fixes
- Initial support for multiple eDP panels
- Initial SR-IOV support for Aldebaran
- DP link training refactor
- Misc code cleanups and bug fixes
- SMU regression fixes for variable sized arrays
- MAINTAINERS fixes for amdgpu

amdkfd:
- Initial SR-IOV support for Aldebaran
- Topology fixes
- Initial HMM SVM support
- Misc code cleanups and bug fixes

radeon:
- Misc code cleanups and bug fixes
- SMU regression fixes for variable sized arrays
- Flickering fix for Oland with multiple 4K displays

UAPI:
- amdgpu: Drop AMDGPU_GEM_CREATE_SHADOW flag.
  This was always a kernel internal flag and userspace use of it has always 
been blocked.
  It's no longer needed so remove it.
- amdkgd: HMM SVM support
  Overview: https://patchwork.freedesktop.org/series/85562/
  Porposed userspace: 
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/fxkamd/hmm-wip


Alex Deucher (12):
  drm/amdgpu/gmc9: remove dummy read workaround for newer chips
  drm/amdgpu/display: add documentation for dmcub_trace_event_en
  MAINTAINERS: fix a few more amdgpu tree links
  drm/amdgpu: Add graphics cache rinse packet for sdma 5.0
  drm/amdgpu: drop the GCR packet from the emit_ib frame for sdma5.0
  drm/amdgpu: change the default timeout for kernel compute queues
  drm/amdgpu/pm: add documentation for pp_od_clock_voltage for APUs
  drm/amdgpu/pm: add documentation for pp_od_clock_voltage for vangogh
  drm/amdgpu/display: remove an old DCN3 guard
  drm/amdgpu/display: fix warning when CONFIG_DRM_AMD_DC_DCN is not defined
  drm/amdgpu/display: fix build when CONFIG_DRM_AMD_DC_DCN is not defined
  drm/amdgpu/display: fix dal_allocation documentation

Alex Sierra (12):
  drm/amdkfd: helper to convert gpu id and idx
  drm/amdkfd: add xnack enabled flag to kfd_process
  drm/amdkfd: add ioctl to configure and query xnack retries
  drm/amdgpu: enable 48-bit IH timestamp counter
  drm/amdkfd: SVM API call to restore page tables
  drm/amdkfd: add svm_bo reference for eviction fence
  drm/amdgpu: add param bit flag to create SVM BOs
  drm/amdgpu: svm bo enable_signal call condition
  drm/amdgpu: add svm_bo eviction to enable_signal cb
  drm/amdgpu: extend xnack limit page fault timeout
  drm/amdkfd: svm ranges creation for unregistered memory
  drm/amdkfd: set attribute access for default ranges

Anthony Koo (6):
  drm/amd/display: [FW Promotion] Release 0.0.60
  drm/amd/display: [FW Promotion] Release 0.0.61
  drm/amd/display: [FW Promotion] Release 0.0.62
  drm/amd/display: [FW Promotion] Release 0.0.63
  drm/amd/display: [FW Promotion] Release 0.0.64
  drm/amd/display: [FW Promotion] Release 0.0.65

Anthony Wang (4):
  drm/amd/display: Force vsync flip when reconfiguring MPCC
  drm/amd/display: Add DSC check to seamless boot validation
  drm/amd/display: disable seamless boot for external DP
  drm/amd/display: Handle potential dpp_inst mismatch with pipe_idx

Aric Cyr (8):
  drm/amd/display: 3.2.131
  drm/amd/display: Fix FreeSync when RGB MPO in use
  drm/amd/display: 3.2.132
  drm/amd/display: 3.2.133
  drm/amdgpu/dc: Revert commit "treat memory as a single-channel"
  drm/amd/display: 3.2.134
  drm/amd/display: 3.2.135
  drm/amd/display: 3.2.135.1

Bas Nieuwenhuizen (2):
  drm/amdgpu: Init GFX10_ADDR_CONFIG for VCN v3 in DPG mode.
  drm/amdgpu: Use device specific BO size & stride check.

Bing Guo (1):
  drm/amd/display: add helper for enabling mst stream features

Brandon Syu (1):
  drm/amd/display: fix HDCP reset sequence on reinitialize

Calvin Hou (1):
  drm/amd/display: remove checking sink in is_timing_changed

Chaitanya Dhere (1):
  drm/amd/display: DETBufferSizeInKbyte variable type modifications

Chris Park (1):
  drm/amd/display: Fix BSOD with NULL check

Christian König (4):
  drm/amdgpu: fix coding style and documentation in amdgpu_gtt_mgr.c
  drm/amdgpu: 

回复: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Pan, Xinhui
[AMD Official Use Only]

I am not sure if we can create a ttm_bo_type_sg bo for userptr. But I have 
another idea now. we can use flag AMDGPU_AMDKFD_CREATE_USERPTR_BO to create the 
userptr bo.

发件人: Kuehling, Felix 
发送时间: 2021年5月19日 23:11
收件人: Christian König; Pan, Xinhui; amd-...@lists.freedesktop.org
抄送: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; 
dri-devel@lists.freedesktop.org
主题: Re: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout 
and swapin

Looks like we're creating the userptr BO as ttm_bo_type_device. I guess
we should be using ttm_bo_type_sg? BTW, amdgpu_gem_userptr_ioctl also
uses ttm_bo_type_device.

Regards,
  Felix


Am 2021-05-19 um 6:01 a.m. schrieb Christian König:
> I'm scratching my head how that is even possible.
>
> See when a BO is created in the system domain it is just an empty
> hull, e.g. without backing store and allocated pages.
>
> So the swapout function will just ignore it.
>
> Christian.
>
> Am 19.05.21 um 07:07 schrieb Pan, Xinhui:
>> [AMD Official Use Only]
>>
>> I have reverted Chris'  patch, still hit this failure.
>> Just see two lines in Chris' patch. Any BO in cpu domian would be
>> swapout first. That is why we hit this issue frequently now. But the
>> bug is there long time ago.
>>
>> -   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>> -   list_for_each_entry(bo, >swap_lru[i], swap) {
>> [snip]
>> +   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
>> +   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
>>
>>
>> 
>> 发件人: Pan, Xinhui 
>> 发送时间: 2021年5月19日 12:09
>> 收件人: Kuehling, Felix; amd-...@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian;
>> dri-devel@lists.freedesktop.org; dan...@ffwll.ch
>> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to
>> swapout and swapin
>>
>> yes, we really dont swapout SG BOs.
>> The problems is that before we validate a userptr BO, we create this
>> BO in CPU domain by default. So this BO has chance to swapout.
>>
>> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too
>> late.
>> I have not try to revert Chris' patch as I think it desnt help. Or I
>> can have a try later.
>>
>> 
>> 发件人: Kuehling, Felix 
>> 发送时间: 2021年5月19日 11:29
>> 收件人: Pan, Xinhui; amd-...@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian;
>> dri-devel@lists.freedesktop.org; dan...@ffwll.ch
>> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to
>> swapout and swapin
>>
>> Swapping SG BOs makes no sense, because TTM doesn't own the pages of
>> this type of BO.
>>
>> Last I checked, userptr BOs (and other SG BOs) were protected from
>> swapout by the fact that they would not be added to the swap-LRU. But it
>> looks like Christian just removed the swap-LRU. I guess this broke that
>> protection:
>>
>> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
>> Author: Christian König 
>> Date:   Tue Oct 6 16:30:09 2020 +0200
>>
>>   drm/ttm: remove swap LRU v3
>>
>>   Instead evict round robin from each devices SYSTEM and TT domain.
>>
>>   v2: reorder num_pages access reported by Dan's script
>>   v3: fix rebase fallout, num_pages should be 32bit
>>
>>   Signed-off-by: Christian König 
>>   Tested-by: Nirmoy Das 
>>   Reviewed-by: Huang Rui 
>>   Reviewed-by: Matthew Auld 
>>   Link: https://patchwork.freedesktop.org/patch/424009/
>>
>> Regards,
>> Felix
>>
>>
>> On 2021-05-18 10:28 p.m., xinhui pan wrote:
>>> cpu 1   cpu 2
>>> kfd alloc BO A(userptr) alloc BO B(GTT)
>>>   ->init -> validate   -> init ->
>>> validate -> populate
>>>   init_user_pages-> swapout BO A
>>> //hit ttm pages limit
>>>-> get_user_pages (fill up ttm->pages)
>>> -> validate -> populate
>>> -> swapin BO A // Now hit the BUG
>>>
>>> We know that get_user_pages may race with swapout on same BO.
>>> Threre are some issues I have met.
>>> 1) memory corruption.
>>> This is because we do a swap before memory is setup. ttm_tt_swapout()
>>> just create a swap_storage with its content being 0x0. So when we setup
>>> memory after the swapout. The following swapin makes the memory
>>> corrupted.
>>>
>>> 2) panic
>>> When swapout happes with get_user_pages, they touch ttm->pages without
>>> anylock. It causes memory corruption too. But I hit page fault mostly.
>>>
>>> Signed-off-by: xinhui pan 
>>> ---
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16
>>> +++-
>>>1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> index 928e8d57cd08..42460e4480f8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c

回复: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Pan, Xinhui
[AMD Official Use Only]

swapout function create one swap storage which is filled with zero. And set 
ttm->page_flags as TTM_PAGE_FLAG_SWAPPED.  Just because ttm has no backend page 
this time, no real data is swapout to this swap storage.

swapin function is called during populate as TTM_PAGE_FLAG_SWAPPED is set.
Now here is the problem, we swapin data to ttm bakend memory from swap storage. 
That just causes the memory been overwritten.


发件人: Christian König 
发送时间: 2021年5月19日 18:01
收件人: Pan, Xinhui; Kuehling, Felix; amd-...@lists.freedesktop.org
抄送: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; 
dri-devel@lists.freedesktop.org
主题: Re: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout 
and swapin

I'm scratching my head how that is even possible.

See when a BO is created in the system domain it is just an empty hull,
e.g. without backing store and allocated pages.

So the swapout function will just ignore it.

Christian.

Am 19.05.21 um 07:07 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> I have reverted Chris'  patch, still hit this failure.
> Just see two lines in Chris' patch. Any BO in cpu domian would be swapout 
> first. That is why we hit this issue frequently now. But the bug is there 
> long time ago.
>
> -   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -   list_for_each_entry(bo, >swap_lru[i], swap) {
> [snip]
> +   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
> +   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
>
>
> 
> 发件人: Pan, Xinhui 
> 发送时间: 2021年5月19日 12:09
> 收件人: Kuehling, Felix; amd-...@lists.freedesktop.org
> 抄送: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; 
> dan...@ffwll.ch
> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and 
> swapin
>
> yes, we really dont swapout SG BOs.
> The problems is that before we validate a userptr BO, we create this BO in 
> CPU domain by default. So this BO has chance to swapout.
>
> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too late.
> I have not try to revert Chris' patch as I think it desnt help. Or I can have 
> a try later.
>
> 
> 发件人: Kuehling, Felix 
> 发送时间: 2021年5月19日 11:29
> 收件人: Pan, Xinhui; amd-...@lists.freedesktop.org
> 抄送: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; 
> dan...@ffwll.ch
> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and 
> swapin
>
> Swapping SG BOs makes no sense, because TTM doesn't own the pages of
> this type of BO.
>
> Last I checked, userptr BOs (and other SG BOs) were protected from
> swapout by the fact that they would not be added to the swap-LRU. But it
> looks like Christian just removed the swap-LRU. I guess this broke that
> protection:
>
> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
> Author: Christian König 
> Date:   Tue Oct 6 16:30:09 2020 +0200
>
>   drm/ttm: remove swap LRU v3
>
>   Instead evict round robin from each devices SYSTEM and TT domain.
>
>   v2: reorder num_pages access reported by Dan's script
>   v3: fix rebase fallout, num_pages should be 32bit
>
>   Signed-off-by: Christian König 
>   Tested-by: Nirmoy Das 
>   Reviewed-by: Huang Rui 
>   Reviewed-by: Matthew Auld 
>   Link: 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F424009%2Fdata=04%7C01%7CXinhui.Pan%40amd.com%7Cb4422d8b3e4947cd243c08d91aad14c3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637570152942496679%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=K3%2FnFpN56y8L49UuYRM6SqefVFLnqIwpDAtWpS1XvnQ%3Dreserved=0
>
> Regards,
> Felix
>
>
> On 2021-05-18 10:28 p.m., xinhui pan wrote:
>> cpu 1   cpu 2
>> kfd alloc BO A(userptr) alloc BO B(GTT)
>>   ->init -> validate   -> init -> validate 
>> -> populate
>>   init_user_pages-> swapout BO A //hit ttm 
>> pages limit
>>-> get_user_pages (fill up ttm->pages)
>> -> validate -> populate
>> -> swapin BO A // Now hit the BUG
>>
>> We know that get_user_pages may race with swapout on same BO.
>> Threre are some issues I have met.
>> 1) memory corruption.
>> This is because we do a swap before memory is setup. ttm_tt_swapout()
>> just create a swap_storage with its content being 0x0. So when we setup
>> memory after the swapout. The following swapin makes the memory
>> corrupted.
>>
>> 2) panic
>> When swapout happes with get_user_pages, they touch ttm->pages without
>> anylock. It causes memory corruption too. But I hit page fault mostly.
>>
>> Signed-off-by: xinhui pan 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++-
>>1 file 

[pull] radeon, amdgpu drm-fixes-5.13

2021-05-19 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.13.

The following changes since commit d07f6ca923ea0927a1024dfccafc5b53b61cfecc:

  Linux 5.13-rc2 (2021-05-16 15:27:44 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-5.13-2021-05-19

for you to fetch changes up to a2b4785f01280a4291edb9fda69032fc2e4bfd3f:

  drm/amdgpu: stop touching sched.ready in the backend (2021-05-19 18:07:43 
-0400)


amd-drm-fixes-5.13-2021-05-19:

amdgpu:
- Fix downscaling ratio on DCN3.x
- Fix for non-4K pages
- PCO/RV compute hang fix
- Dongle fix
- Aldebaran codec query support
- Refcount leak fix
- Use after free fix
- Navi12 golden settings updates
- GPU reset fixes

radeon:
- Fix for imported BO handling


Changfeng (1):
  drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Chris Park (1):
  drm/amd/display: Disconnect non-DP with no EDID

Christian König (2):
  drm/radeon: use the dummy page for GART if needed
  drm/amdgpu: stop touching sched.ready in the backend

Guchun Chen (2):
  drm/amdgpu: update gc golden setting for Navi12
  drm/amdgpu: update sdma golden setting for Navi12

James Zhu (1):
  drm/amdgpu: add video_codecs query support for aldebaran

Jingwen Chen (1):
  drm/amd/amdgpu: fix refcount leak

Lang Yu (1):
  drm/amd/amdgpu: fix a potential deadlock in gpu reset

Nikola Cornij (1):
  drm/amd/display: Use the correct max downscaling value for DCN3.x family

Yi Li (1):
  drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE

xinhui pan (1):
  drm/amdgpu: Fix a use-after-free

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  3 ++-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |  6 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 10 +++---
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c |  2 --
 drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c |  2 --
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c |  4 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c |  5 -
 drivers/gpu/drm/amd/amdgpu/soc15.c |  3 +--
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c  |  8 +---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c  | 18 ++
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c  |  7 ---
 .../gpu/drm/amd/display/dc/dcn301/dcn301_resource.c|  7 ---
 .../gpu/drm/amd/display/dc/dcn302/dcn302_resource.c|  7 ---
 drivers/gpu/drm/radeon/radeon_gart.c   |  3 ++-
 16 files changed, 54 insertions(+), 35 deletions(-)


Re: [PATCH 0/7] component: Make into an aggregate bus

2021-05-19 Thread Stephen Boyd
Quoting Saravana Kannan (2021-05-19 18:27:50)
> On Wed, May 19, 2021 at 5:25 PM Stephen Boyd  wrote:
> >
> > This series is from discussion we had on reordering the device lists for
> > drm shutdown paths[1]. I've introduced an 'aggregate' bus that we put
> > the aggregate device onto and then we probe the device once all the
> > components are probed and call component_add(). The probe/remove hooks
> > are where the bind/unbind calls go, and then a shutdown hook is added
> > that can be used to shutdown the drm display pipeline at the right time.
> >
> > This works for me on my sc7180 board, but I'm currently struggling with
> > the last patch where we migrate the msm driver. It runs into a runtime
> > PM problem where the parent device isn't runtime PM enabled yet. I'm
> > still trying to figure out a clean solution there. Moving runtime PM
> > around breaks boot and I think that's because the power domain is off.
> >
> > Cc: Daniel Vetter 
> > Cc: "Rafael J. Wysocki" 
> > Cc: Rob Clark 
> > Cc: Russell King 
> > Cc: Saravana Kannan 
> >
> > [1] https://lore.kernel.org/r/20210508074118.1621729-1-swb...@chromium.org
> >
>
> I skimmed through the series and in general the idea is good, but I'm
> not sure why each component user needs to be converted/"modern" before
> it can make use of the benefits of this series. Why not just have
> wrapper functions around the component ops that the new aggregate bus
> driver can just call? That'll give all the existing component users
> the new ability to use the new ops without having to have two
> versions.

The existing users can only have one or the other. Either use the ops
structure or use the struct aggregate_driver. What benefits of this
series are they not gaining?

> That'll also allow us to do other improvements (I have some
> in mind) that'll apply to all the component users instead of only the
> converted ones.

What do you have in mind? I didn't want to convert drivers over to the
new way of doing things without making them consciously change their
code. Otherwise I worry it will break things in random, subtle ways. The
last patch, as I mentioned above in the cover, causes warnings because
the display driver is enabling runtime PM in an odd spot as part of the
bind callback of the aggregate/master. That should move out of there and
into the msm_pdev driver that registers the aggregate from what I can
tell.


[PATCH 7/7] drm/msm: Migrate to aggregate driver

2021-05-19 Thread Stephen Boyd
The device lists are poorly ordered when the component device code is
used. This is because component_master_add_with_match() returns 0
regardless of component devices calling component_add() first. It can
really only fail if an allocation fails, in which case everything is
going bad and we're out of memory. The driver that registers the
aggregate driver, can succeed at probe and put the attached device on
the DPM lists before any of the component devices are probed and put on
the lists.

Within the component device framework this usually isn't that bad
because the real driver work is done at bind time via
component{,master}_ops::bind(). It becomes a problem when the driver
core, or host driver, wants to operate on the component device outside
of the bind/unbind functions, e.g. via 'remove' or 'shutdown'. The
driver core doesn't understand the relationship between the host device
and the component devices and could possibly try to operate on component
devices when they're already removed from the system or shut down.

Normally, device links or probe defer would reorder the lists and put
devices that depend on other devices in the lists at the correct
location, but with component devices this doesn't happen because this
information isn't expressed anywhere. Drivers simply succeed at
registering their component or the aggregate driver with the component
framework and wait for their bind() callback to be called once the other
components are ready. In summary, the drivers that make up the aggregate
driver can probe in any order.

This ordering problem becomes fairly obvious when shutting down the
device with a DSI controller connected to a DSI bridge that is
controlled via i2c. In this case, the msm display driver wants to tear
down the display pipeline on shutdown via msm_pdev_shutdown() by calling
drm_atomic_helper_shutdown(), and it can't do that unless the whole
display chain is still probed and active in the system. When a display
bridge is on i2c, the i2c device for the bridge will be created whenever
the i2c controller probes, which could be before or after the msm
display driver probes. If the i2c controller probes after the display
driver, then the i2c controller will be shutdown before the display
controller during system wide shutdown and thus i2c transactions will
stop working before the display pipeline is shut down. This means we'll
have the display bridge trying to access an i2c bus that's shut down
because drm_atomic_helper_shutdown() is trying to disable the bridge
after the bridge is off.

The solution is to make the aggregate driver into a real struct driver
that is bound to a device when the other component devices have all
probed. Now that the component driver code is a proper bus, we can
simply register an aggregate driver with that bus via
component_aggregate_register() and then attach the shutdown hook to that
driver to be sure that the shutdown for the display pipeline is called
before any of the component device driver shutdown hooks are called.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---

As stated in the cover letter, this isn't perfect but it still works. I
get a warning from runtime PM that the parent device (e0.mdss) is
not runtime PM enabled but the child device (the aggregate device) is
being enabled by the bus logic. I need to move around the place that the
parent device is runtime PM enabled and probably keep it powered up
during the entire time that the driver is probed until the aggregate
driver probes.

 drivers/gpu/drm/msm/msm_drv.c | 47 +++
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index e1104d2454e2..0c64e6a2ce25 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1265,19 +1265,35 @@ static int add_gpu_components(struct device *dev,
return 0;
 }
 
-static int msm_drm_bind(struct device *dev)
+static int msm_drm_bind(struct aggregate_device *adev)
 {
-   return msm_drm_init(dev, _driver);
+   return msm_drm_init(adev->dev.parent, _driver);
 }
 
-static void msm_drm_unbind(struct device *dev)
+static void msm_drm_unbind(struct aggregate_device *adev)
 {
-   msm_drm_uninit(dev);
+   msm_drm_uninit(adev->dev.parent);
+}
+
+static void msm_drm_shutdown(struct aggregate_device *adev)
+{
+   struct drm_device *drm = 
platform_get_drvdata(to_platform_device(adev->dev.parent));
+   struct msm_drm_private *priv = drm ? drm->dev_private : NULL;
+
+   if (!priv || !priv->kms)
+   return;
+
+   drm_atomic_helper_shutdown(drm);
 }
 
-static const struct component_master_ops msm_drm_ops = {
-   .bind = msm_drm_bind,
-   .unbind = msm_drm_unbind,
+static struct aggregate_driver msm_drm_aggregate_driver = {
+   .probe = msm_drm_bind,
+   .remove = msm_drm_unbind,
+   .shutdown = 

[PATCH 3/7] component: Introduce struct aggregate_device

2021-05-19 Thread Stephen Boyd
Replace 'struct master' with 'struct aggregate_device' and then rename
'master' to 'adev' everywhere in the code. While we're here, put a
struct device inside the aggregate device so that we can register it
with a bus_type in the next patch.

The diff is large but that's because this is mostly a rename, where
sometimes 'master' is replaced with 'adev' and other times it is
replaced with 'parent' to indicate that the struct device that was being
used is actually the parent of the aggregate device and driver.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---
 drivers/base/component.c  | 249 --
 include/linux/component.h |   2 +-
 2 files changed, 134 insertions(+), 117 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 5e79299f6c3f..55e30e0b0952 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -9,6 +9,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -58,18 +59,21 @@ struct component_match {
struct component_match_array *compare;
 };
 
-struct master {
+struct aggregate_device {
struct list_head node;
bool bound;
 
const struct component_master_ops *ops;
struct device *parent;
+   struct device dev;
struct component_match *match;
+
+   int id;
 };
 
 struct component {
struct list_head node;
-   struct master *master;
+   struct aggregate_device *adev;
bool bound;
 
const struct component_ops *ops;
@@ -79,7 +83,9 @@ struct component {
 
 static DEFINE_MUTEX(component_mutex);
 static LIST_HEAD(component_list);
-static LIST_HEAD(masters);
+static LIST_HEAD(aggregate_devices);
+
+static DEFINE_IDA(aggregate_ida);
 
 #ifdef CONFIG_DEBUG_FS
 
@@ -87,12 +93,12 @@ static struct dentry *component_debugfs_dir;
 
 static int component_devices_show(struct seq_file *s, void *data)
 {
-   struct master *m = s->private;
+   struct aggregate_device *m = s->private;
struct component_match *match = m->match;
size_t i;
 
mutex_lock(_mutex);
-   seq_printf(s, "%-40s %20s\n", "master name", "status");
+   seq_printf(s, "%-40s %20s\n", "aggregate_device name", "status");
seq_puts(s, 
"-\n");
seq_printf(s, "%-40s %20s\n\n",
   dev_name(m->parent), m->bound ? "bound" : "not bound");
@@ -122,46 +128,46 @@ static int __init component_debug_init(void)
 
 core_initcall(component_debug_init);
 
-static void component_master_debugfs_add(struct master *m)
+static void component_master_debugfs_add(struct aggregate_device *m)
 {
debugfs_create_file(dev_name(m->parent), 0444, component_debugfs_dir, m,
_devices_fops);
 }
 
-static void component_master_debugfs_del(struct master *m)
+static void component_master_debugfs_del(struct aggregate_device *m)
 {
debugfs_remove(debugfs_lookup(dev_name(m->parent), 
component_debugfs_dir));
 }
 
 #else
 
-static void component_master_debugfs_add(struct master *m)
+static void component_master_debugfs_add(struct aggregate_device *m)
 { }
 
-static void component_master_debugfs_del(struct master *m)
+static void component_master_debugfs_del(struct aggregate_device *m)
 { }
 
 #endif
 
-static struct master *__master_find(struct device *parent,
+static struct aggregate_device *__aggregate_find(struct device *parent,
const struct component_master_ops *ops)
 {
-   struct master *m;
+   struct aggregate_device *m;
 
-   list_for_each_entry(m, , node)
+   list_for_each_entry(m, _devices, node)
if (m->parent == parent && (!ops || m->ops == ops))
return m;
 
return NULL;
 }
 
-static struct component *find_component(struct master *master,
+static struct component *find_component(struct aggregate_device *adev,
struct component_match_array *mc)
 {
struct component *c;
 
list_for_each_entry(c, _list, node) {
-   if (c->master && c->master != master)
+   if (c->adev && c->adev != adev)
continue;
 
if (mc->compare && mc->compare(c->dev, mc->data))
@@ -175,101 +181,102 @@ static struct component *find_component(struct master 
*master,
return NULL;
 }
 
-static int find_components(struct master *master)
+static int find_components(struct aggregate_device *adev)
 {
-   struct component_match *match = master->match;
+   struct component_match *match = adev->match;
size_t i;
int ret = 0;
 
/*
 * Scan the array of match functions and attach
-* any components which are found to this master.
+* any components which are found to this adev.
 */
for (i = 0; i < match->num; i++) {
struct component_match_array *mc = 

[PATCH 2/7] component: Rename 'dev' to 'parent'

2021-05-19 Thread Stephen Boyd
Let's rename this struct member to 'parent' to better reflect the
reality that it's the parent device of this psuedo-device. In the next
patch we'll put a 'struct device' inside of this struct so moving this
away simplifies that patch by reducing the number of places that 'dev'
is modified.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---
 drivers/base/component.c | 89 +++-
 1 file changed, 42 insertions(+), 47 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index bbe1757dfa89..5e79299f6c3f 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -63,7 +63,7 @@ struct master {
bool bound;
 
const struct component_master_ops *ops;
-   struct device *dev;
+   struct device *parent;
struct component_match *match;
 };
 
@@ -95,7 +95,7 @@ static int component_devices_show(struct seq_file *s, void 
*data)
seq_printf(s, "%-40s %20s\n", "master name", "status");
seq_puts(s, 
"-\n");
seq_printf(s, "%-40s %20s\n\n",
-  dev_name(m->dev), m->bound ? "bound" : "not bound");
+  dev_name(m->parent), m->bound ? "bound" : "not bound");
 
seq_printf(s, "%-40s %20s\n", "device name", "status");
seq_puts(s, 
"-\n");
@@ -124,13 +124,13 @@ core_initcall(component_debug_init);
 
 static void component_master_debugfs_add(struct master *m)
 {
-   debugfs_create_file(dev_name(m->dev), 0444, component_debugfs_dir, m,
+   debugfs_create_file(dev_name(m->parent), 0444, component_debugfs_dir, m,
_devices_fops);
 }
 
 static void component_master_debugfs_del(struct master *m)
 {
-   debugfs_remove(debugfs_lookup(dev_name(m->dev), component_debugfs_dir));
+   debugfs_remove(debugfs_lookup(dev_name(m->parent), 
component_debugfs_dir));
 }
 
 #else
@@ -143,13 +143,13 @@ static void component_master_debugfs_del(struct master *m)
 
 #endif
 
-static struct master *__master_find(struct device *dev,
+static struct master *__master_find(struct device *parent,
const struct component_master_ops *ops)
 {
struct master *m;
 
list_for_each_entry(m, , node)
-   if (m->dev == dev && (!ops || m->ops == ops))
+   if (m->parent == parent && (!ops || m->ops == ops))
return m;
 
return NULL;
@@ -189,7 +189,7 @@ static int find_components(struct master *master)
struct component_match_array *mc = >compare[i];
struct component *c;
 
-   dev_dbg(master->dev, "Looking for component %zu\n", i);
+   dev_dbg(master->parent, "Looking for component %zu\n", i);
 
if (match->compare[i].component)
continue;
@@ -200,7 +200,7 @@ static int find_components(struct master *master)
break;
}
 
-   dev_dbg(master->dev, "found component %s, duplicate %u\n", 
dev_name(c->dev), !!c->master);
+   dev_dbg(master->parent, "found component %s, duplicate %u\n", 
dev_name(c->dev), !!c->master);
 
/* Attach this component to the master */
match->compare[i].duplicate = !!c->master;
@@ -233,28 +233,28 @@ static int try_to_bring_up_master(struct master *master,
 {
int ret;
 
-   dev_dbg(master->dev, "trying to bring up master\n");
+   dev_dbg(master->parent, "trying to bring up master\n");
 
if (find_components(master)) {
-   dev_dbg(master->dev, "master has incomplete components\n");
+   dev_dbg(master->parent, "master has incomplete components\n");
return 0;
}
 
if (component && component->master != master) {
-   dev_dbg(master->dev, "master is not for this component (%s)\n",
+   dev_dbg(master->parent, "master is not for this component 
(%s)\n",
dev_name(component->dev));
return 0;
}
 
-   if (!devres_open_group(master->dev, NULL, GFP_KERNEL))
+   if (!devres_open_group(master->parent, NULL, GFP_KERNEL))
return -ENOMEM;
 
/* Found all components */
-   ret = master->ops->bind(master->dev);
+   ret = master->ops->bind(master->parent);
if (ret < 0) {
-   devres_release_group(master->dev, NULL);
+   devres_release_group(master->parent, NULL);
if (ret != -EPROBE_DEFER)
-   dev_info(master->dev, "master bind failed: %d\n", ret);
+   dev_info(master->parent, "master bind failed: %d\n", 
ret);
return ret;
}
 
@@ -281,32 +281,27 @@ static int try_to_bring_up_masters(struct component 
*component)
 static 

[PATCH 6/7] component: Move struct aggregate_device out to header file

2021-05-19 Thread Stephen Boyd
This allows aggregate driver writers to use the device passed to their
probe/remove/shutdown functions properly instead of treating it as an
opaque pointer.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---
 drivers/base/component.c  | 13 -
 include/linux/component.h | 15 +--
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 155aab7eefa6..1b4c84453319 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -62,19 +62,6 @@ struct component_match {
struct component_match_array *compare;
 };
 
-struct aggregate_device {
-   const struct component_master_ops *ops;
-   struct device dev;
-   struct component_match *match;
-
-   int id;
-};
-
-static inline struct aggregate_device *to_aggregate_device(struct device *d)
-{
-   return container_of(d, struct aggregate_device, dev);
-}
-
 struct component {
struct list_head node;
struct aggregate_device *adev;
diff --git a/include/linux/component.h b/include/linux/component.h
index bc71d34a3416..d19cc3418d12 100644
--- a/include/linux/component.h
+++ b/include/linux/component.h
@@ -39,7 +39,7 @@ void component_del(struct device *, const struct 
component_ops *);
 int component_bind_all(struct device *master, void *master_data);
 void component_unbind_all(struct device *master, void *master_data);
 
-struct aggregate_device;
+struct component_match;
 
 /**
  * struct component_master_ops - callback for the aggregate driver
@@ -80,7 +80,18 @@ struct component_master_ops {
void (*unbind)(struct device *master);
 };
 
-struct component_match;
+struct aggregate_device {
+   const struct component_master_ops *ops;
+   struct device dev;
+   struct component_match *match;
+
+   int id;
+};
+
+static inline struct aggregate_device *to_aggregate_device(struct device *d)
+{
+   return container_of(d, struct aggregate_device, dev);
+}
 
 /**
  * struct aggregate_driver - Aggregate driver (made up of other drivers)
-- 
https://chromeos.dev



[PATCH 5/7] component: Use dev.parent instead of adev->parent

2021-05-19 Thread Stephen Boyd
We left this in place to ease the code diff, but now we can remove it
because the aggregate device parent pointer is the same.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---
 drivers/base/component.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index a6c0bb7ccdbc..155aab7eefa6 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -64,7 +64,6 @@ struct component_match {
 
 struct aggregate_device {
const struct component_master_ops *ops;
-   struct device *parent;
struct device dev;
struct component_match *match;
 
@@ -105,7 +104,7 @@ static int component_devices_show(struct seq_file *s, void 
*data)
seq_printf(s, "%-40s %20s\n", "aggregate_device name", "status");
seq_puts(s, 
"-\n");
seq_printf(s, "%-40s %20s\n\n",
-  dev_name(m->parent), m->dev.driver ? "bound" : "not bound");
+  dev_name(m->dev.parent), m->dev.driver ? "bound" : "not 
bound");
 
seq_printf(s, "%-40s %20s\n", "device name", "status");
seq_puts(s, 
"-\n");
@@ -134,13 +133,13 @@ core_initcall(component_debug_init);
 
 static void component_master_debugfs_add(struct aggregate_device *m)
 {
-   debugfs_create_file(dev_name(m->parent), 0444, component_debugfs_dir, m,
+   debugfs_create_file(dev_name(m->dev.parent), 0444, 
component_debugfs_dir, m,
_devices_fops);
 }
 
 static void component_master_debugfs_del(struct aggregate_device *m)
 {
-   debugfs_remove(debugfs_lookup(dev_name(m->parent), 
component_debugfs_dir));
+   debugfs_remove(debugfs_lookup(dev_name(m->dev.parent), 
component_debugfs_dir));
 }
 
 #else
@@ -203,7 +202,7 @@ static int find_components(struct aggregate_device *adev)
struct component_match_array *mc = >compare[i];
struct component *c;
 
-   dev_dbg(adev->parent, "Looking for component %zu\n", i);
+   dev_dbg(adev->dev.parent, "Looking for component %zu\n", i);
 
if (match->compare[i].component)
continue;
@@ -212,7 +211,7 @@ static int find_components(struct aggregate_device *adev)
if (!c)
return 0;
 
-   dev_dbg(adev->parent, "found component %s, duplicate %u\n",
+   dev_dbg(adev->dev.parent, "found component %s, duplicate %u\n",
dev_name(c->dev), !!c->adev);
 
/* Attach this component to the adev */
@@ -420,12 +419,12 @@ static int aggregate_device_match(struct device *dev, 
struct device_driver *drv)
 /* TODO: Remove once all aggregate drivers use component_aggregate_register() 
*/
 static int component_probe_bind(struct aggregate_device *adev)
 {
-   return adev->ops->bind(adev->parent);
+   return adev->ops->bind(adev->dev.parent);
 }
 
 static void component_remove_unbind(struct aggregate_device *adev)
 {
-   adev->ops->unbind(adev->parent);
+   adev->ops->unbind(adev->dev.parent);
 }
 
 static int aggregate_driver_probe(struct device *dev)
@@ -443,7 +442,7 @@ static int aggregate_driver_probe(struct device *dev)
}
 
mutex_lock(_mutex);
-   if (devres_open_group(adev->parent, NULL, GFP_KERNEL)) {
+   if (devres_open_group(adev->dev.parent, NULL, GFP_KERNEL)) {
ret = adrv->probe(adev);
if (ret)
devres_release_group(dev->parent, NULL);
@@ -549,8 +548,6 @@ static struct aggregate_device *aggregate_device_add(struct 
device *parent,
}
 
adev->id = id;
-   adev->parent = parent;
-
adev->dev.parent = parent;
adev->dev.bus = _bus_type;
adev->dev.release = aggregate_device_release;
@@ -709,7 +706,7 @@ static void component_unbind(struct component *component,
WARN_ON(!component->bound);
 
if (component->ops && component->ops->unbind)
-   component->ops->unbind(component->dev, adev->parent, data);
+   component->ops->unbind(component->dev, adev->dev.parent, data);
component->bound = false;
 
/* Release all resources claimed in the binding of this component */
@@ -758,7 +755,7 @@ static int component_bind(struct component *component, 
struct aggregate_device *
 * This allows us to roll-back a failed component without
 * affecting anything else.
 */
-   if (!devres_open_group(adev->parent, NULL, GFP_KERNEL))
+   if (!devres_open_group(adev->dev.parent, NULL, GFP_KERNEL))
return -ENOMEM;
 
/*
@@ -767,14 +764,14 @@ static int component_bind(struct component *component, 
struct aggregate_device *
 * at 

[PATCH 4/7] component: Introduce the aggregate bus_type

2021-05-19 Thread Stephen Boyd
The component driver only provides 'bind' and 'unbind' callbacks to tell
the host driver that it is time to assemble the aggregate driver now
that all the components have probed. The component driver model doesn't
attempt to resolve runtime PM or suspend/resume ordering, and explicitly
mentions this in the code. This lack of support leads to some pretty
gnarly usages of the 'prepare' and 'complete' power management hooks in
drivers that host the aggregate device, and it fully breaks down when
faced with ordering shutdown between the various components, the
aggregate driver, and the host driver that registers the whole thing.

In a concrete example, the MSM display driver at drivers/gpu/drm/msm is
using 'prepare' and 'complete' to call the drm helpers
drm_mode_config_helper_suspend() and drm_mode_config_helper_resume()
respectively, so that it can move the aggregate driver suspend/resume
callbacks to be before and after the components that make up the drm
device call any suspend/resume hooks they have. This only works as long
as the component devices don't do anything in their own 'prepare' and
'complete' callbacks. If they did, then the ordering would be incorrect
and we would be doing something in the component drivers before the
aggregate driver could do anything. Yuck!

Similarly, when trying to add shutdown support to the MSM driver we run
across a problem where we're trying to shutdown the drm device via
drm_atomic_helper_shutdown(), but some of the devices in the encoder
chain have already been shutdown. This time, the component devices
aren't the problem (although they could be if they did anything in their
shutdown callbacks), but there's a DSI to eDP bridge in the encoder
chain that has already been shutdown before the driver hosting the
aggregate device runs shutdown. The ordering of driver probe is like
this:

 1. msm_pdev_probe() (host driver)
 2. DSI bridge
 3. aggregate bind

When it comes to shutdown we have this order:

 1. DSI bridge
 2. msm_pdev_shutdown() (host driver)

and so the bridge is already off, but we want to communicate to it to
turn things off on the display during msm_pdev_shutdown(). Double yuck!
Unfortunately, this time we can't split shutdown into multiple phases
and swap msm_pdev_shutdown() with the DSI bridge.

Let's make the component driver into an actual device driver that has
probe/remove/shutdown functions. The driver will only be bound to the
aggregate device once all component drivers have called component_add()
to indicate they're ready to assemble the aggregate driver. This allows
us to attach shutdown logic (and in the future runtime PM logic) to the
aggregate driver so that it runs the hooks in the correct order.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---
 drivers/base/component.c  | 429 +++---
 include/linux/component.h |  60 +-
 2 files changed, 359 insertions(+), 130 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 55e30e0b0952..a6c0bb7ccdbc 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -15,6 +15,9 @@
 #include 
 #include 
 #include 
+#include 
+
+#include "base.h"
 
 /**
  * DOC: overview
@@ -60,9 +63,6 @@ struct component_match {
 };
 
 struct aggregate_device {
-   struct list_head node;
-   bool bound;
-
const struct component_master_ops *ops;
struct device *parent;
struct device dev;
@@ -71,6 +71,11 @@ struct aggregate_device {
int id;
 };
 
+static inline struct aggregate_device *to_aggregate_device(struct device *d)
+{
+   return container_of(d, struct aggregate_device, dev);
+}
+
 struct component {
struct list_head node;
struct aggregate_device *adev;
@@ -83,7 +88,6 @@ struct component {
 
 static DEFINE_MUTEX(component_mutex);
 static LIST_HEAD(component_list);
-static LIST_HEAD(aggregate_devices);
 
 static DEFINE_IDA(aggregate_ida);
 
@@ -101,7 +105,7 @@ static int component_devices_show(struct seq_file *s, void 
*data)
seq_printf(s, "%-40s %20s\n", "aggregate_device name", "status");
seq_puts(s, 
"-\n");
seq_printf(s, "%-40s %20s\n\n",
-  dev_name(m->parent), m->bound ? "bound" : "not bound");
+  dev_name(m->parent), m->dev.driver ? "bound" : "not bound");
 
seq_printf(s, "%-40s %20s\n", "device name", "status");
seq_puts(s, 
"-\n");
@@ -149,16 +153,21 @@ static void component_master_debugfs_del(struct 
aggregate_device *m)
 
 #endif
 
-static struct aggregate_device *__aggregate_find(struct device *parent,
-   const struct component_master_ops *ops)
+struct aggregate_bus_find_data {
+   const struct component_master_ops *ops;
+   struct device *parent;
+};
+
+static int 

[PATCH 1/7] component: Drop 'dev' argument to component_match_realloc()

2021-05-19 Thread Stephen Boyd
This argument isn't used. Drop it.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 
Signed-off-by: Stephen Boyd 
---
 drivers/base/component.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 272ba42392f0..bbe1757dfa89 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -307,8 +307,7 @@ static void devm_component_match_release(struct device 
*dev, void *res)
component_match_release(dev, res);
 }
 
-static int component_match_realloc(struct device *dev,
-   struct component_match *match, size_t num)
+static int component_match_realloc(struct component_match *match, size_t num)
 {
struct component_match_array *new;
 
@@ -359,7 +358,7 @@ static void __component_match_add(struct device *master,
size_t new_size = match->alloc + 16;
int ret;
 
-   ret = component_match_realloc(master, match, new_size);
+   ret = component_match_realloc(match, new_size);
if (ret) {
*matchptr = ERR_PTR(ret);
return;
@@ -469,7 +468,7 @@ int component_master_add_with_match(struct device *dev,
int ret;
 
/* Reallocate the match array for its true size */
-   ret = component_match_realloc(dev, match, match->num);
+   ret = component_match_realloc(match, match->num);
if (ret)
return ret;
 
-- 
https://chromeos.dev



[PATCH 0/7] component: Make into an aggregate bus

2021-05-19 Thread Stephen Boyd
This series is from discussion we had on reordering the device lists for
drm shutdown paths[1]. I've introduced an 'aggregate' bus that we put
the aggregate device onto and then we probe the device once all the
components are probed and call component_add(). The probe/remove hooks
are where the bind/unbind calls go, and then a shutdown hook is added
that can be used to shutdown the drm display pipeline at the right time.

This works for me on my sc7180 board, but I'm currently struggling with
the last patch where we migrate the msm driver. It runs into a runtime
PM problem where the parent device isn't runtime PM enabled yet. I'm
still trying to figure out a clean solution there. Moving runtime PM
around breaks boot and I think that's because the power domain is off.

Cc: Daniel Vetter 
Cc: "Rafael J. Wysocki" 
Cc: Rob Clark 
Cc: Russell King 
Cc: Saravana Kannan 

[1] https://lore.kernel.org/r/20210508074118.1621729-1-swb...@chromium.org

Stephen Boyd (7):
  component: Drop 'dev' argument to component_match_realloc()
  component: Rename 'dev' to 'parent'
  component: Introduce struct aggregate_device
  component: Introduce the aggregate bus_type
  component: Use dev.parent instead of adev->parent
  component: Move struct aggregate_device out to header file
  drm/msm: Migrate to aggregate driver

 drivers/base/component.c  | 614 ++
 drivers/gpu/drm/msm/msm_drv.c |  47 +--
 include/linux/component.h |  73 +++-
 3 files changed, 487 insertions(+), 247 deletions(-)


base-commit: 6efb943b8616ec53a5e444193dccf1af9ad627b5
-- 
https://chromeos.dev



linux-next: manual merge of the drm-intel tree with Linus' tree

2021-05-19 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the drm-intel tree got a conflict in:

  drivers/gpu/drm/i915/i915_mm.c

between commit:

  293837b9ac8d ("Revert "i915: fix remap_io_sg to verify the pgprot"")

from Linus' tree and commit:

  ec279384c6a0 ("drm/i915: Initialize err in remap_io_sg()")

from the drm-intel tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/i915/i915_mm.c
index 9a777b0ff59b,25576fa73ff0..
--- a/drivers/gpu/drm/i915/i915_mm.c
+++ b/drivers/gpu/drm/i915/i915_mm.c
@@@ -82,13 -46,8 +82,13 @@@ int remap_io_sg(struct vm_area_struct *
unsigned long addr, unsigned long size,
struct scatterlist *sgl, resource_size_t iobase)
  {
 -  unsigned long pfn, len, remapped = 0;
 +  struct remap_pfn r = {
 +  .mm = vma->vm_mm,
 +  .prot = vma->vm_page_prot,
 +  .sgt = __sgt_iter(sgl, use_dma(iobase)),
 +  .iobase = iobase,
 +  };
-   int err;
+   int err = 0;
  
/* We rely on prevalidation of the io-mapping to skip track_pfn(). */
GEM_BUG_ON((vma->vm_flags & EXPECTED_FLAGS) != EXPECTED_FLAGS);


pgpWD69SQ5TEH.pgp
Description: OpenPGP digital signature


Re: [Intel-gfx] [PATCH 0/7] Per client engine busyness

2021-05-19 Thread Nieto, David M
[AMD Official Use Only]

Parsing over 550 processes for fdinfo is taking between 40-100ms single 
threaded in a 2GHz skylake IBRS within a VM using simple string comparisons and 
DIRent parsing. And that is pretty much the worst case scenario with some more 
optimized implementations.

David

From: Daniel Vetter 
Sent: Wednesday, May 19, 2021 11:23 AM
To: Tvrtko Ursulin 
Cc: Daniel Stone ; jhubb...@nvidia.com 
; nouv...@lists.freedesktop.org 
; Intel Graphics Development 
; Maling list - DRI developers 
; Simon Ser ; Koenig, 
Christian ; arit...@nvidia.com ; 
Nieto, David M 
Subject: Re: [Intel-gfx] [PATCH 0/7] Per client engine busyness

On Wed, May 19, 2021 at 6:16 PM Tvrtko Ursulin
 wrote:
>
>
> On 18/05/2021 10:40, Tvrtko Ursulin wrote:
> >
> > On 18/05/2021 10:16, Daniel Stone wrote:
> >> Hi,
> >>
> >> On Tue, 18 May 2021 at 10:09, Tvrtko Ursulin
> >>  wrote:
> >>> I was just wondering if stat(2) and a chrdev major check would be a
> >>> solid criteria to more efficiently (compared to parsing the text
> >>> content) detect drm files while walking procfs.
> >>
> >> Maybe I'm missing something, but is the per-PID walk actually a
> >> measurable performance issue rather than just a bit unpleasant?
> >
> > Per pid and per each open fd.
> >
> > As said in the other thread what bothers me a bit in this scheme is that
> > the cost of obtaining GPU usage scales based on non-GPU criteria.
> >
> > For use case of a top-like tool which shows all processes this is a
> > smaller additional cost, but then for a gpu-top like tool it is somewhat
> > higher.
>
> To further expand, not only cost would scale per pid multiplies per open
> fd, but to detect which of the fds are DRM I see these three options:
>
> 1) Open and parse fdinfo.
> 2) Name based matching ie /dev/dri/.. something.
> 3) Stat the symlink target and check for DRM major.

stat with symlink following should be plenty fast.

> All sound quite sub-optimal to me.
>
> Name based matching is probably the least evil on system resource usage
> (Keeping the dentry cache too hot? Too many syscalls?), even though
> fundamentally I don't it is the right approach.
>
> What happens with dup(2) is another question.

We need benchmark numbers showing that on anything remotely realistic
it's an actual problem. Until we've demonstrated it's a real problem
we don't need to solve it.

E.g. top with any sorting enabled also parses way more than it
displays on every update. It seems to be doing Just Fine (tm).

> Does anyone have any feedback on the /proc//gpu idea at all?

When we know we have a problem to solve we can take a look at solutions.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2Fdata=04%7C01%7CDavid.Nieto%40amd.com%7Cf6aea97532cf41f916de08d91af32cc1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637570453997158377%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=4CFrY9qWbJREcIcSzeO9KIn2P%2Fw6k%2BYdNlh6rdS%2BEh4%3Dreserved=0


Re: [PATCH v16 4/4] dt-bindings: msm/dp: Add bindings of MSM DisplayPort controller

2021-05-19 Thread Rob Herring
On Tue, May 18, 2021 at 03:33:45PM +0530, Krishna Manikandan wrote:
> Add bindings for Snapdragon DisplayPort controller driver.
> 
> Signed-off-by: Chandan Uddaraju 
> Signed-off-by: Vara Reddy 
> Signed-off-by: Tanmay Shah 
> Signed-off-by: Kuogee Hsieh 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in V2:
> -Provide details about sel-gpio
> 
> Changes in V4:
> -Provide details about max dp lanes
> -Change the commit text
> 
> Changes in V5:
> -moved dp.txt to yaml file
> 
> Changes in v6:
> - Squash all AUX LUT properties into one pattern Property
> - Make aux-cfg[0-9]-settings properties optional
> - Remove PLL/PHY bindings from DP controller dts
> - Add DP clocks description
> - Remove _clk suffix from clock names
> - Rename pixel clock to stream_pixel
> - Remove redundant bindings (GPIO, PHY, HDCP clock, etc..)
> - Fix indentation
> - Add Display Port as interface of DPU in DPU bindings
>   and add port mapping accordingly.
> 
> Chages in v7:
> - Add dp-controller.yaml file common between multiple SOC
> - Rename dp-sc7180.yaml to dp-controller-sc7180.yaml
> - change compatible string and add SOC name to it.
> - Remove Root clock generator for pixel clock
> - Add assigned-clocks and assigned-clock-parents bindings
> - Remove redundant properties, descriptions and blank lines
> - Add DP port in DPU bindings
> - Update depends-on tag in commit message and rebase change accordingly
> 
> Changes in v8:
> - Add MDSS AHB clock in bindings
> 
> Changes in v9:
> - Remove redundant reg-name property
> - Change assigned-clocks and assigned-clocks-parents counts to 2
> - Use IRQ flags in example dts
> 
> Changes in v10:
> - Change title of this patch as it does not contain PLL bindings anymore
> - Remove redundant properties
> - Remove use of IRQ flag
> - Fix ports property
> 
> Changes in v11:
> - add ports required of both #address-cells and  #size-cells
> - add required operating-points-v2
> - add required #sound-dai-cells
> - add required power-domains
> - update maintainer list
> 
> Changes in v12:
> - remove soc node from examples (Stephen Boyd)
> - split dpu-sc7180.yaml changes to separate patch (Stephen Boyd)
> 
> Changes in v13:
> - add assigned-clocks
> - add assigned-clock-parents
> 
> Changes in v14:
> - add reference for ports (Rob Herring)
> ---
>  .../bindings/display/msm/dp-controller.yaml| 159 
> +
>  1 file changed, 159 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> 
> diff --git a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml 
> b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> new file mode 100644
> index 000..bcce567
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
> @@ -0,0 +1,159 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dp-controller.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: MSM Display Port Controller
> +
> +maintainers:
> +  - Kuogee Hsieh 
> +
> +description: |
> +  Device tree bindings for DisplayPort host controller for MSM targets
> +  that are compatible with VESA DisplayPort interface specification.
> +
> +properties:
> +  compatible:
> +enum:
> +  - qcom,sc7180-dp
> +
> +  reg:
> +maxItems: 1
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: AHB clock to enable register access
> +  - description: Display Port AUX clock
> +  - description: Display Port Link clock
> +  - description: Link interface clock between DP and PHY
> +  - description: Display Port Pixel clock
> +
> +  clock-names:
> +items:
> +  - const: core_iface
> +  - const: core_aux
> +  - const: ctrl_link
> +  - const: ctrl_link_iface
> +  - const: stream_pixel
> +
> +  assigned-clocks:
> +items:
> +  - description: link clock source
> +  - description: pixel clock source
> +
> +  assigned-clock-parents:
> +items:
> +  - description: phy 0 parent
> +  - description: phy 1 parent
> +
> +  phys:
> +maxItems: 1
> +
> +  phy-names:
> +items:
> +  - const: dp
> +
> +  operating-points-v2:
> +maxItems: 1
> +
> +  power-domains:
> +maxItems: 1
> +
> +  "#sound-dai-cells":
> +const: 0
> +
> +  ports:
> +$ref: /schemas/graph.yaml#/properties/ports
> +properties:
> +  "#address-cells":
> +const: 1
> +
> +  "#size-cells":
> +const: 0

These 2 are covered by the common schema already. Drop

> +
> +  port@0:
> +$ref: /schemas/graph.yaml#/properties/port
> +description: Input endpoint of the controller
> +
> +  port@1:
> +$ref: /schemas/graph.yaml#/properties/port
> +description: Output endpoint of the controller
> +
> +required:
> +  - "#address-cells"
> +  - "#size-cells"

And this.

> +
> +

Re: [PATCH v16 3/4] dt-bindings: msm: dsi: add yaml schemas for DSI PHY bindings

2021-05-19 Thread Rob Herring
On Tue, 18 May 2021 15:33:44 +0530, Krishna Manikandan wrote:
> Add YAML schema for the device tree bindings for DSI PHY.
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v1:
>- Merge dsi-phy.yaml and dsi-phy-10nm.yaml (Stephen Boyd)
>- Remove qcom,dsi-phy-regulator-ldo-mode (Stephen Boyd)
>- Add clock cells properly (Stephen Boyd)
>- Remove unnecessary decription from clock names (Stephen Boyd)
>- Add pin names for the supply entries for 10nm phy which is
>  used in sc7180 and sdm845 (Stephen Boyd)
>- Remove unused header files from examples (Stephen Boyd)
>- Drop labels for display nodes and correct node name (Stephen Boyd)
> 
> Changes in v2:
>- Drop maxItems for clock (Stephen Boyd)
>- Add vdds supply pin information for sdm845 (Stephen Boyd)
>- Add examples for 14nm, 20nm and 28nm phy yaml files (Stephen Boyd)
>- Keep child nodes directly under soc node (Stephen Boyd)
> 
> Changes in v3:
>- Use a separate yaml file to describe the common properties
>  for all the dsi phy versions (Stephen Boyd)
>- Remove soc from examples (Stephen Boyd)
>- Add description for register property
> 
> Changes in v4:
>- Modify the title for all the phy versions (Stephen Boyd)
>- Drop description for all the phy versions (Stephen Boyd)
>- Modify the description for register property (Stephen Boyd)
> 
> Changes in v5:
>- Remove unused properties from common dsi phy file
>- Add clock-cells and phy-cells to required property
>  list (Stephen Boyd)
> 
> Changes in v6:
>- Add proper compatible string in example
> ---
>  .../bindings/display/msm/dsi-phy-10nm.yaml | 68 +
>  .../bindings/display/msm/dsi-phy-14nm.yaml | 66 
>  .../bindings/display/msm/dsi-phy-20nm.yaml | 71 
> ++
>  .../bindings/display/msm/dsi-phy-28nm.yaml | 68 +
>  .../bindings/display/msm/dsi-phy-common.yaml   | 40 
>  5 files changed, 313 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-20nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-phy-common.yaml
> 

Reviewed-by: Rob Herring 


Re: [PATCH v16 1/4] dt-bindings: msm: disp: add yaml schemas for DPU bindings

2021-05-19 Thread Rob Herring
On Tue, 18 May 2021 15:33:42 +0530, Krishna Manikandan wrote:
> MSM Mobile Display Subsystem (MDSS) encapsulates sub-blocks
> like DPU display controller, DSI etc. Add YAML schema
> for DPU device tree bindings.
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v2:
> - Changed dpu to DPU (Sam Ravnborg)
> - Fixed indentation issues (Sam Ravnborg)
> - Added empty line between different properties (Sam Ravnborg)
> - Replaced reference txt files with  their corresponding
>   yaml files (Sam Ravnborg)
> - Modified the file to use "|" only when it is
>   necessary (Sam Ravnborg)
> 
> Changes in v3:
> - Corrected the license used (Rob Herring)
> - Added maxItems for properties (Rob Herring)
> - Dropped generic descriptions (Rob Herring)
> - Added ranges property (Rob Herring)
> - Corrected the indendation (Rob Herring)
> - Added additionalProperties (Rob Herring)
> - Split dsi file into two, one for dsi controller
>   and another one for dsi phy per target (Rob Herring)
> - Corrected description for pinctrl-names (Rob Herring)
> - Corrected the examples used in yaml file (Rob Herring)
> - Delete dsi.txt and dpu.txt (Rob Herring)
> 
> Changes in v4:
> - Move schema up by one level (Rob Herring)
> - Add patternProperties for mdp node (Rob Herring)
> - Corrected description of some properties (Rob Herring)
> 
> Changes in v5:
> - Correct the indentation (Rob Herring)
> - Remove unnecessary description from properties (Rob Herring)
> - Correct the number of interconnect entries (Rob Herring)
> - Add interconnect names for sc7180 (Rob Herring)
> - Add description for ports (Rob Herring)
> - Remove common properties (Rob Herring)
> - Add unevalutatedProperties (Rob Herring)
> - Reference existing dsi controller yaml in the common
>   dsi controller file (Rob Herring)
> - Correct the description of clock names to include only the
>   clocks that are required (Rob Herring)
> - Remove properties which are already covered under the common
>   binding (Rob Herring)
> - Add dsi phy supply nodes which are required for sc7180 and
>   sdm845 targets (Rob Herring)
> - Add type ref for syscon-sfpb (Rob Herring)
> 
> Changes in v6:
> - Fixed errors during dt_binding_check (Rob Herring)
> - Add maxItems for phys and phys-names (Rob Herring)
> - Use unevaluatedProperties wherever required (Rob Herring)
> - Removed interrupt controller from required properties for
>   dsi controller (Rob Herring)
> - Add constraints for dsi-phy reg-names based on the compatible
>   phy version (Rob Herring)
> - Add constraints for dsi-phy supply nodes based on the
>   compatible phy version (Rob Herring)
> 
> Changes in v7:
> - Add default value for qcom,mdss-mdp-transfer-time-us (Rob Herring)
> - Modify the schema for data-lanes (Rob Herring)
> - Split the phy schema into separate schemas based on
>   the phy version (Rob Herring)
> 
> Changes in v8:
> - Resolve merge conflicts with latest dsi.txt file
> - Include dp yaml change also in the same series
> 
> Changes in v9:
> - Combine target specific dsi controller yaml files
>   to a single yaml file (Rob Herring)
> - Combine target specific dsi phy yaml files into a
>   single yaml file (Rob Herring)
> - Use unevaluatedProperties and additionalProperties
>   wherever required
> - Remove duplicate properties from common yaml files
> 
> Changes in v10:
> - Split the patch into separate patches for DPU, DSI and
>   PHY (Stephen Boyd)
> - Drop unnecessary fullstop (Stephen Boyd)
> - Add newline whereever required (Stephen Boyd)
> - Add description for clock used (Stephen Boyd)
> - Modify the description for interconnect entries  (Stephen Boyd)
> - Drop assigned clock entries as it a generic property (Stephen Boyd)
> - Correct the definition for interrupts (Stephen Boyd)
> - Drop clock names from required properties (Stephen Boyd)
> - Drop labels for display nodes from example (Stephen Boyd)
> - Drop flags from interrupts entries (Stephen Boyd)
> 
> Changes in v11:
> - Drop maxItems for clocks (Stephen Boyd)
> 
> Changes in v12:
> - Add description for register property (Stephen Boyd)
> - Add maxItems for interrupts (Stephen Boyd)
> - Add description for iommus property (Stephen Boyd)
> - Add description for interconnects (Stephen Boyd)
> - Change display node name to display_controller (Stephen Boyd)
> 
> Changes in v13:
> - Add maxItems for reg property (Stephen Boyd)
> - Add ranges property in example (Stephen Boyd)
> - Modify description for iommus property (Stephen Boyd)
> - Add Dp bindings for ports in the same patch (Stephen Boyd)
> - Remove soc from examples and change address and size cells
>   accordingly (Stephen Boyd)
> - Add reference for ports
> 
> Changes 

Re: [PATCH v16 2/4] dt-bindings: msm: dsi: add yaml schemas for DSI bindings

2021-05-19 Thread Rob Herring
On Tue, May 18, 2021 at 03:33:43PM +0530, Krishna Manikandan wrote:
> Add YAML schema for the device tree bindings for DSI
> 
> Signed-off-by: Krishna Manikandan 
> 
> Changes in v1:
> - Separate dsi controller bindings to a separate patch (Stephen Boyd)
> - Merge dsi-common-controller.yaml and dsi-controller-main.yaml to
>   a single file (Stephen Boyd)
> - Drop supply entries and definitions from properties (Stephen Boyd)
> - Modify phy-names property for dsi controller (Stephen Boyd)
> - Remove boolean from description (Stephen Boyd)
> - Drop pinctrl properties as they are standard entries (Stephen Boyd)
> - Modify the description for ports property and keep the reference
>   to the generic binding where this is defined (Stephen Boyd)
> - Add description to clock names (Stephen Boyd)
> - Correct the indendation (Stephen Boyd)
> - Drop the label for display dt nodes and correct the node
>   name (Stephen Boyd)
> 
> Changes in v2:
> - Drop maxItems for clock (Stephen Boyd)
> - Drop qcom,mdss-mdp-transfer-time-us as it is not used in upstream
>   dt file (Stephen Boyd)
> - Keep child node directly under soc node (Stephen Boyd)
> - Drop qcom,sync-dual-dsi as it is not used in upstream dt
> 
> Changes in v3:
> - Add description for register property (Stephen Boyd)
> 
> Changes in v4:
> - Add maxItems for phys property (Stephen Boyd)
> - Add maxItems for reg property (Stephen Boyd)
> - Add reference for data-lanes property (Stephen Boyd)
> - Remove soc from example (Stephen Boyd)
> 
> Changes in v5:
> - Modify title and description (Stephen Boyd)
> - Add required properties for ports node (Stephen Boyd)
> - Add data-lanes in the example (Stephen Boyd)
> - Drop qcom,master-dsi property (Stephen Boyd)
> 
> Changes in v6:
> - Add required properties for port@0, port@1 and corresponding
>   endpoints (Stephen Boyd)
> - Add address-cells and size-cells for ports (Stephen Boyd)
> - Use additionalProperties instead of unevaluatedProperties (Stephen Boyd)
> 
> Changes in v7:
> - Add reference for ports and data-lanes (Rob Herring)
> - Add maxItems and minItems for data-lanes (Rob Herring)
> ---
>  .../bindings/display/msm/dsi-controller-main.yaml  | 209 +
>  .../devicetree/bindings/display/msm/dsi.txt| 249 
> -
>  2 files changed, 209 insertions(+), 249 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/msm/dsi.txt
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml 
> b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
> new file mode 100644
> index 000..80f5218
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
> @@ -0,0 +1,209 @@
> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/msm/dsi-controller-main.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm Display DSI controller
> +
> +maintainers:
> +  - Krishna Manikandan 
> +
> +allOf:
> +  - $ref: "../dsi-controller.yaml#"
> +
> +properties:
> +  compatible:
> +items:
> +  - const: qcom,mdss-dsi-ctrl
> +
> +  reg:
> +maxItems: 1
> +
> +  reg-names:
> +const: dsi_ctrl
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: Display byte clock
> +  - description: Display byte interface clock
> +  - description: Display pixel clock
> +  - description: Display escape clock
> +  - description: Display AHB clock
> +  - description: Display AXI clock
> +
> +  clock-names:
> +items:
> +  - const: byte
> +  - const: byte_intf
> +  - const: pixel
> +  - const: core
> +  - const: iface
> +  - const: bus
> +
> +  phys:
> +maxItems: 1
> +
> +  phy-names:
> +const: dsi
> +
> +  "#address-cells": true
> +
> +  "#size-cells": true
> +
> +  syscon-sfpb:
> +description: A phandle to mmss_sfpb syscon node (only for DSIv2).
> +$ref: "/schemas/types.yaml#/definitions/phandle"
> +
> +  qcom,dual-dsi-mode:
> +type: boolean
> +description: |
> +  Indicates if the DSI controller is driving a panel which needs
> +  2 DSI links.
> +
> +  power-domains:
> +maxItems: 1
> +
> +  operating-points-v2: true
> +
> +  ports:
> +$ref: "/schemas/graph.yaml#/properties/ports"
> +description: |
> +  Contains DSI controller input and output ports as children, each
> +  containing one endpoint subnode.
> +
> +properties:
> +  port@0:
> +$ref: "/schemas/graph.yaml#/properties/port"
> +description: |
> +  Input endpoints of the controller.
> +
> +properties:
> +  reg:
> +const: 0

You 

Re: [PATCH v7 00/10] drm: Fix EDID reading on ti-sn65dsi86 by introducing the DP AUX bus

2021-05-19 Thread Lyude Paul
JFYI I haven't had a chance yet but I'm hoping to look at this this week

On Mon, 2021-05-17 at 13:08 -0700, Douglas Anderson wrote:
> The primary goal of this series is to try to properly fix EDID reading
> for eDP panels using the ti-sn65dsi86 bridge.
> 
> Previously we had a patch that added EDID reading but it turned out
> not to work at bootup. This caused some extra churn at bootup as we
> tried (and failed) to read the EDID several times and also ended up
> forcing us to use the hardcoded mode at boot. With this patch series I
> believe EDID reading is reliable at boot now and we never use the
> hardcoded mode.
> 
> High level note: in this series the EDID reading is driven by the
> panel driver, not by the bridge chip driver. I believe this makes a
> reasonable amount of sense since the panel driver already _could_
> drive reading the EDID if provided with the DDC bus and in future
> planned work we'll want to give the panel driver the DDC bus (to make
> decisions based on EDID) and the AUX bus (to control the
> backlight). There are also planned patches from Laurent to make
> ti-sn65dsi86 able to drive full DP monitors. In that case the bridge
> chip will still be in charge of reading the EDID, but it's not hard to
> make this dynamic.
> 
> This series is the logical successor to the 3-part series containing
> the patch ("drm/bridge: ti-sn65dsi86: Properly get the EDID, but only
> if refclk") [1].
> 
> This patch was tested against drm-misc-next commit 60a6b73dd821
> ("drm/ingenic: Fix pixclock rate for 24-bit serial panels") on a
> sc7180-trogdor-lazor device.
> 
> At v7 now, this patch series grew a bit from v6 because it introduces
> the DP AUX bus.
> 
> Between v2 and v3, high-level view of changes:
> - stop doing the EDID caching in the core.
> 
> Between v3 and v4, high-level view of changes:
> - EDID reading is actually driven by the panel driver now. See above.
> - Lots of chicken-and-egg problems solved w/ sub-devices.
> 
> Between v4 and v5, high-level view of changes.
> - Some of the early patches landed, so dropped from series.
> - New pm_runtime_disable() fix (fixed a patch that already landed).
> - Added Bjorn's tags to most patches
> - Fixed problems when building as a module.
> - Reordered debugfs patch and fixed error handling there.
> - Dropped last patch. I'm not convinced it's safe w/out more work.
> 
> Between v5 and v6, high-level view of changes:
> - Added the patch ("drm/dp: Allow an early call to register DDC i2c
>   bus")
> - Many patches had been landed, so only a few "controversial" ones
>   left.
> 
> Between v6 and v7, high-level view of changes:
> - New AUX DP bus!
> 
> [1]
> https://lore.kernel.org/r/20210304155144.3.I60a7fb23ce4589006bc95c64ab8d15c74b876e68@changeid/
> 
> Changes in v7:
> - pm_runtime_dont_use_autosuspend() fix new for v7.
> - List hpd properties bindings patch new for v7.
> - ti-sn65dsi86: Add aux-bus child patch new for v7.
> - Patch introducing the DP AUX bus is new for v7.
> - Patch to allow panel-simple to be DP AUX EP new for v7.
> - Patch using the DP AUX for DDC new for v7.
> - Remove use of now-dropped drm_dp_aux_register_ddc() call.
> - Beefed up commit message in context of the DP AUX bus.
> - Set the proper sub-device "dev" pointer in the AUX structure.
> - Patch to support for DP AUX bus on ti-sn65dsi86 new for v7.
> - Adjusted commit message to talk about DP AUX bus.
> - Panel now under bridge chip instead of getting a link to ddc-i2c
> 
> Changes in v6:
> - Use new drm_dp_aux_register_ddc() calls.
> 
> Douglas Anderson (10):
>   drm/panel: panel-simple: Add missing pm_runtime_dont_use_autosuspend()
>     calls
>   dt-bindings: display: simple: List hpd properties in panel-simple
>   dt-bindings: drm/bridge: ti-sn65dsi86: Add aux-bus child
>   drm: Introduce the DP AUX bus
>   drm/panel: panel-simple: Allow panel-simple be a DP AUX endpoint
>     device
>   drm/panel: panel-simple: Stash DP AUX bus; allow using it for DDC
>   drm/bridge: ti-sn65dsi86: Promote the AUX channel to its own sub-dev
>   drm/bridge: ti-sn65dsi86: Add support for the DP AUX bus
>   drm/bridge: ti-sn65dsi86: Don't read EDID blob over DDC
>   arm64: dts: qcom: sc7180-trogdor: Move panel under the bridge chip
> 
>  .../bindings/display/bridge/ti,sn65dsi86.yaml |  22 +-
>  .../bindings/display/panel/panel-simple.yaml  |   2 +
>  arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi  |  30 +-
>  drivers/gpu/drm/Kconfig   |   5 +
>  drivers/gpu/drm/Makefile  |   2 +
>  drivers/gpu/drm/bridge/Kconfig    |   1 +
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 111 --
>  drivers/gpu/drm/drm_dp_aux_bus.c  | 322 ++
>  drivers/gpu/drm/panel/Kconfig |   1 +
>  drivers/gpu/drm/panel/panel-simple.c  |  66 +++-
>  include/drm/drm_dp_aux_bus.h  |  57 
>  11 files changed, 563 insertions(+), 56 deletions(-)
>  create mode 100644 

Re: [PATCH] dt-bindings: display: ssd1307fb: Convert to json-schema

2021-05-19 Thread Rob Herring
On Tue, 18 May 2021 09:51:31 +0200, Geert Uytterhoeven wrote:
> Convert the Solomon SSD1307 Framebuffer Device Tree binding
> documentation to json-schema.
> 
> Fix the spelling of the "pwms" property.
> Document default values.
> Make properties with default values not required.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
> I have listed Maxime as the maintainer, as he wrote the original driver
> and bindings.  Maxime: Please scream if this is inappropriate ;-)
> ---
>  .../bindings/display/solomon,ssd1307fb.yaml   | 166 ++
>  .../devicetree/bindings/display/ssd1307fb.txt |  60 ---
>  2 files changed, 166 insertions(+), 60 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/display/solomon,ssd1307fb.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/ssd1307fb.txt
> 

Reviewed-by: Rob Herring 


Re: [PATCH v7 03/10] dt-bindings: drm/bridge: ti-sn65dsi86: Add aux-bus child

2021-05-19 Thread Doug Anderson
Hi,

On Wed, May 19, 2021 at 1:02 PM Rob Herring  wrote:
>
> On Mon, May 17, 2021 at 01:09:00PM -0700, Douglas Anderson wrote:
> > We want to be able to list an eDP panel as a child of a ti-sn65dsi86
> > node to represent the fact that the panel is connected to the bridge's
> > DP AUX bus. Though the panel and the bridge chip are connected in
> > several ways, the DP AUX bus is the primary control interface between
> > the two and thus makes the most sense to model in device tree
> > hierarchy.
> >
> > Listing a panel in this way makes it possible for the panel driver to
> > easily get access to the DP AUX bus that it resides on, which can be
> > useful to help in auto-detecting the panel and for turning on various
> > bits.
> >
> > NOTE: it's still possible to continue using the bridge chip and point
> > to a panel that _isn't_ listed as a child of the bridge chip (since
> > it's worked that way previously), but that should be deprecated since
> > there is no downside to listing the panel under the bridge chip.
> >
> > The idea for this bus's design was hashed out over IRC [1].
> >
> > [1] 
> > https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel=2021-05-11
> >
> > Signed-off-by: Douglas Anderson 
> > ---
> > Possibly we might want something fancier that could be included by
> > other eDP controller bindings. If we want to do this, I'd love to be
> > pointed at a good example to follow.
> >
> > Changes in v7:
> > - ti-sn65dsi86: Add aux-bus child patch new for v7.
> >
> >  .../bindings/display/bridge/ti,sn65dsi86.yaml | 22 ++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git 
> > a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml 
> > b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml
> > index 26932d2e86ab..51f5a29e216c 100644
> > --- a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml
> > +++ b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml
> > @@ -70,6 +70,11 @@ properties:
> >  const: 1
> >  description: See ../../pwm/pwm.yaml for description of the cell 
> > formats.
> >
> > +  aux-bus:
>
> As this is a node:
>
> type: object
>
> > +description:
> > +  It is recommended that you place your panel under the aux-bus node
> > +  here to represent the control hierarchy.
> > +
> >ports:
> >  $ref: /schemas/graph.yaml#/properties/ports
> >
> > @@ -201,11 +206,26 @@ examples:
> >
> >port@1 {
> >  reg = <1>;
> > -endpoint {
> > +sn65dsi86_out: endpoint {
> >remote-endpoint = <_in_edp>;
> >  };
> >};
> >  };
> > +
> > +aux-bus {
> > +  panel {
>
> We should perhaps have a separate aux-bus schema.

Yeah. Before spending lots of time digging into how to do this I
wanted to see if anyone was going to give me a big-old NAK on the
whole approach. ;-)

I guess I'd make a file called "dp-aux-bus.yaml" (maybe right under
bindings/display?) and then I'd include it like this:

aux-bus:
  $ref: "../dp-aux-bus.yaml#"


> Something should
> define the child node is 'panel' and nothing else.

At the moment the code also requires that the node name is 'aux-bus'.
Any objections to that?


> Though perhaps
> connectors are valid too?

They might be. We could always add it later?


[PATCH] drm/kmb: Fix an error handling path

2021-05-19 Thread Christophe JAILLET
If 'platform_get_irq()' fails, it is spurious to call
'of_reserved_mem_device_release()' in the error handling path, because
'of_reserved_mem_device_init() has not been called yet.

Moreover, a previous 'kmb_initialize_clocks()' is unbalanced by a
corresponding 'kmb_display_clk_disable()' call, has already done in the
remove function.

It is likely that 'kmb_display_clk_disable()' is expected in the error
handling path, instead of 'kmb_display_clk_disable()'.


Also, it is spurious to return directly if 'of_reserved_mem_device_init()'
fails.
Goto the error handling path instead to free some resources.

Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display")
Signed-off-by: Christophe JAILLET 
---
 drivers/gpu/drm/kmb/kmb_drv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/kmb/kmb_drv.c b/drivers/gpu/drm/kmb/kmb_drv.c
index f64e06e1067d..b41b8789fe57 100644
--- a/drivers/gpu/drm/kmb/kmb_drv.c
+++ b/drivers/gpu/drm/kmb/kmb_drv.c
@@ -138,13 +138,13 @@ static int kmb_hw_init(struct drm_device *drm, unsigned 
long flags)
irq_lcd = platform_get_irq(pdev, 0);
if (irq_lcd < 0) {
drm_err(>drm, "irq_lcd not found");
-   goto setup_fail;
+   goto disable_clk_err;
}
 
/* Get the optional framebuffer memory resource */
ret = of_reserved_mem_device_init(drm->dev);
if (ret && ret != -ENODEV)
-   return ret;
+   goto disable_clk_err;
 
spin_lock_init(>irq_lock);
 
@@ -152,8 +152,8 @@ static int kmb_hw_init(struct drm_device *drm, unsigned 
long flags)
 
return 0;
 
- setup_fail:
-   of_reserved_mem_device_release(drm->dev);
+ disable_clk_err:
+   kmb_display_clk_disable(kmb);
 
return ret;
 }
-- 
2.30.2



[PATCH v2 1/3] dt-bindings: display: convert faraday,tve200

2021-05-19 Thread Corentin Labbe
Converts display/faraday,tve200.txt to yaml.

Signed-off-by: Corentin Labbe 
---
Changes since v1:
- added two subsequent patchs fixing issue found when converting
- fixed all issues reported by Rob Herring
 .../bindings/display/faraday,tve200.txt   | 54 ---
 .../bindings/display/faraday,tve200.yaml  | 68 +++
 2 files changed, 68 insertions(+), 54 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/display/faraday,tve200.txt
 create mode 100644 
Documentation/devicetree/bindings/display/faraday,tve200.yaml

diff --git a/Documentation/devicetree/bindings/display/faraday,tve200.txt 
b/Documentation/devicetree/bindings/display/faraday,tve200.txt
deleted file mode 100644
index 82e3bc0b7485..
--- a/Documentation/devicetree/bindings/display/faraday,tve200.txt
+++ /dev/null
@@ -1,54 +0,0 @@
-* Faraday TV Encoder TVE200
-
-Required properties:
-
-- compatible: must be one of:
-   "faraday,tve200"
-   "cortina,gemini-tvc", "faraday,tve200"
-
-- reg: base address and size of the control registers block
-
-- interrupts: contains an interrupt specifier for the interrupt
-   line from the TVE200
-
-- clock-names: should contain "PCLK" for the clock line clocking the
-   silicon and "TVE" for the 27MHz clock to the video driver
-
-- clocks: contains phandle and clock specifier pairs for the entries
-   in the clock-names property. See
-   Documentation/devicetree/bindings/clock/clock-bindings.txt
-
-Optional properties:
-
-- resets: contains the reset line phandle for the block
-
-Required sub-nodes:
-
-- port: describes LCD panel signals, following the common binding
-   for video transmitter interfaces; see
-   Documentation/devicetree/bindings/media/video-interfaces.txt
-   This port should have the properties:
-   reg = <0>;
-   It should have one endpoint connected to a remote endpoint where
-   the display is connected.
-
-Example:
-
-display-controller@6a00 {
-   #address-cells = <1>;
-   #size-cells = <0>;
-   compatible = "faraday,tve200";
-   reg = <0x6a00 0x1000>;
-   interrupts = <13 IRQ_TYPE_EDGE_RISING>;
-   resets = < GEMINI_RESET_TVC>;
-   clocks = < GEMINI_CLK_GATE_TVC>,
-< GEMINI_CLK_TVC>;
-   clock-names = "PCLK", "TVE";
-
-   port@0 {
-   reg = <0>;
-   display_out: endpoint {
-   remote-endpoint = <_in>;
-   };
-   };
-};
diff --git a/Documentation/devicetree/bindings/display/faraday,tve200.yaml 
b/Documentation/devicetree/bindings/display/faraday,tve200.yaml
new file mode 100644
index ..e2ee77767321
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/faraday,tve200.yaml
@@ -0,0 +1,68 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/faraday,tve200.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Faraday TV Encoder TVE200
+
+maintainers:
+  - Linus Walleij 
+
+properties:
+  compatible:
+oneOf:
+  - const: faraday,tve200
+  - items:
+  - const: cortina,gemini-tvc
+  - const: faraday,tve200
+
+  reg:
+maxItems: 1
+
+  interrupts:
+minItems: 1
+
+  clock-names:
+items:
+  - const: PCLK
+  - const: TVE
+
+  clocks:
+minItems: 2
+
+  resets:
+minItems: 1
+
+  port:
+$ref: /schemas/graph.yaml#/properties/port
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clock-names
+  - clocks
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+#include 
+display-controller@6a00 {
+  compatible = "faraday,tve200";
+  reg = <0x6a00 0x1000>;
+  interrupts = <13 IRQ_TYPE_EDGE_RISING>;
+  resets = < GEMINI_RESET_TVC>;
+  clocks = < GEMINI_CLK_GATE_TVC>,
+   < GEMINI_CLK_TVC>;
+  clock-names = "PCLK", "TVE";
+
+  port {
+display_out: endpoint {
+  remote-endpoint = <_in>;
+};
+  };
+};
-- 
2.26.3



[PATCH v2 3/3] ARM: dts: gemini: remove xxx-cells from display

2021-05-19 Thread Corentin Labbe
dtb_check complains about #address-cells and #size-cells, so lets
remove them.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/gemini.dtsi | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm/boot/dts/gemini.dtsi b/arch/arm/boot/dts/gemini.dtsi
index fa708f5d0c72..34961e5bc7b2 100644
--- a/arch/arm/boot/dts/gemini.dtsi
+++ b/arch/arm/boot/dts/gemini.dtsi
@@ -417,8 +417,6 @@ display-controller@6a00 {
clock-names = "PCLK", "TVE";
pinctrl-names = "default";
pinctrl-0 = <_default_pins>;
-   #address-cells = <1>;
-   #size-cells = <0>;
status = "disabled";
};
 
-- 
2.26.3



[PATCH v2 2/3] ARM: dts: gemini-dlink-dir-685: Remove address from display port

2021-05-19 Thread Corentin Labbe
The address and reg adds no value to the port node, remove them.

Signed-off-by: Corentin Labbe 
---
 arch/arm/boot/dts/gemini-dlink-dir-685.dts | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/gemini-dlink-dir-685.dts 
b/arch/arm/boot/dts/gemini-dlink-dir-685.dts
index cc39289e99dd..2eeb142b5464 100644
--- a/arch/arm/boot/dts/gemini-dlink-dir-685.dts
+++ b/arch/arm/boot/dts/gemini-dlink-dir-685.dts
@@ -492,8 +492,7 @@ drive0: ide-port@0 {
display-controller@6a00 {
status = "okay";
 
-   port@0 {
-   reg = <0>;
+   port {
display_out: endpoint {
remote-endpoint = <_in>;
};
-- 
2.26.3



Re: [PATCH v1 1/3] dt-bindings: display: simple: add Innolux G070Y2-T02 panel

2021-05-19 Thread Rob Herring
On Tue, 18 May 2021 09:15:53 +0200, Oleksij Rempel wrote:
> Add binding for the Innolux G070Y2-T02 panel. It is 7" WVGA (800x480)
> TFT LCD panel with TTL interface and a backlight unit.
> 
> Signed-off-by: Oleksij Rempel 
> ---
>  .../devicetree/bindings/display/panel/panel-simple.yaml | 2 ++
>  1 file changed, 2 insertions(+)
> 

Acked-by: Rob Herring 


Re: [PATCH V2] dt-bindings: display: Fix spacing in lvds.yaml

2021-05-19 Thread Rob Herring
On Tue, 18 May 2021 00:43:36 +0200, Marek Vasut wrote:
> Add missing spaces to make the diagrams readable, no functional change.
> 
> Signed-off-by: Marek Vasut 
> Cc: Laurent Pinchart 
> Cc: Rob Herring 
> Cc: Sam Ravnborg 
> Cc: devicet...@vger.kernel.org
> To: dri-devel@lists.freedesktop.org
> ---
> V2: Replace all the other tabs too
> ---
>  .../bindings/display/panel/lvds.yaml  | 46 +--
>  1 file changed, 23 insertions(+), 23 deletions(-)
> 

Applied, thanks!


[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail

2021-05-19 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=211277

--- Comment #29 from James Zhu (jam...@amd.com) ---
Hi Jeromec,I think debug turn-on changes a little bit timing. log without debug
info can't give me any help. The amdgpu_fence_info looks good for all cases.
this issue is possible device specified.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v7 03/10] dt-bindings: drm/bridge: ti-sn65dsi86: Add aux-bus child

2021-05-19 Thread Rob Herring
On Mon, May 17, 2021 at 01:09:00PM -0700, Douglas Anderson wrote:
> We want to be able to list an eDP panel as a child of a ti-sn65dsi86
> node to represent the fact that the panel is connected to the bridge's
> DP AUX bus. Though the panel and the bridge chip are connected in
> several ways, the DP AUX bus is the primary control interface between
> the two and thus makes the most sense to model in device tree
> hierarchy.
> 
> Listing a panel in this way makes it possible for the panel driver to
> easily get access to the DP AUX bus that it resides on, which can be
> useful to help in auto-detecting the panel and for turning on various
> bits.
> 
> NOTE: it's still possible to continue using the bridge chip and point
> to a panel that _isn't_ listed as a child of the bridge chip (since
> it's worked that way previously), but that should be deprecated since
> there is no downside to listing the panel under the bridge chip.
> 
> The idea for this bus's design was hashed out over IRC [1].
> 
> [1] 
> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel=2021-05-11
> 
> Signed-off-by: Douglas Anderson 
> ---
> Possibly we might want something fancier that could be included by
> other eDP controller bindings. If we want to do this, I'd love to be
> pointed at a good example to follow.
> 
> Changes in v7:
> - ti-sn65dsi86: Add aux-bus child patch new for v7.
> 
>  .../bindings/display/bridge/ti,sn65dsi86.yaml | 22 ++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml 
> b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml
> index 26932d2e86ab..51f5a29e216c 100644
> --- a/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml
> +++ b/Documentation/devicetree/bindings/display/bridge/ti,sn65dsi86.yaml
> @@ -70,6 +70,11 @@ properties:
>  const: 1
>  description: See ../../pwm/pwm.yaml for description of the cell formats.
>  
> +  aux-bus:

As this is a node:

type: object

> +description:
> +  It is recommended that you place your panel under the aux-bus node
> +  here to represent the control hierarchy.
> +
>ports:
>  $ref: /schemas/graph.yaml#/properties/ports
>  
> @@ -201,11 +206,26 @@ examples:
>  
>port@1 {
>  reg = <1>;
> -endpoint {
> +sn65dsi86_out: endpoint {
>remote-endpoint = <_in_edp>;
>  };
>};
>  };
> +
> +aux-bus {
> +  panel {

We should perhaps have a separate aux-bus schema. Something should 
define the child node is 'panel' and nothing else. Though perhaps 
connectors are valid too?

> +compatible = "boe,nv133fhm-n62";
> +power-supply = <_dx_edp>;
> +backlight = <>;
> +hpd-gpios = <_bridge 2 GPIO_ACTIVE_HIGH>;
> +
> +port {
> +  panel_in_edp: endpoint {
> +remote-endpoint = <_out>;
> +  };
> +};
> +  };
> +};
>};
>  };
>- |
> -- 
> 2.31.1.751.gd2f1c929bd-goog
> 


[Bug 211277] sometimes crash at s2ram-wake (Ryzen 3500U): amdgpu, drm, commit_tail, amdgpu_dm_atomic_commit_tail

2021-05-19 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=211277

--- Comment #28 from Jerome C (m...@jeromec.com) ---
Created attachment 296877
  --> https://bugzilla.kernel.org/attachment.cgi?id=296877=edit
AMDGPU fence info

(In reply to James Zhu from comment #27)
> Hi Jeromec, thanks for your feedback, can you also add drm.debug=0x1ff
> modprobe? I need log: case 1 dmesg and
> /sys/kernel/debug/dri/0/amdgpu_fence_info (if you can). James.

I've tested text mode and gui/drm mode with "drm.debug=0x1ff" set and found no
crashes... when "drm.debug=0x1ff" is unset... the crashes/timeouts are back...
I think this is why your unable to reproduce the problem...

I've never known debug option(s) to remove issue(s)... oh well

I've added the contents of the file
"/sys/kernel/debug/dri/0/amdgpu_fence_info".

The file contains 4 different boot states ( vcn on/off, drm debug on/off )
clearly marked/seperated in the attached file

I'm using 5.12.5 now but I also tried this on 5.12.4. Usually the crashes
happen within 50 suspensions/resumes but today I left it to do over 2000
suspensions/resumes just to make sure...

I know you asked for a log but I spent so much time on this ( other things too
), it wasn't on my mind so I'll get that by Friday, if you still need it
ofcourse

thanks

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v7 02/15] swiotlb: Refactor swiotlb_create_debugfs

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Split the debugfs creation to make the code reusable for supporting
> different bounce buffer pools, e.g. restricted DMA pool.
> 
> Signed-off-by: Claire Chang 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v7 07/15] swiotlb: Update is_swiotlb_active to add a struct device argument

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Update is_swiotlb_active to add a struct device argument. This will be
> useful later to allow for restricted DMA pool.
> 
> Signed-off-by: Claire Chang 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v7 06/15] swiotlb: Update is_swiotlb_buffer to add a struct device argument

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Update is_swiotlb_buffer to add a struct device argument. This will be
> useful later to allow for restricted DMA pool.
> 
> Signed-off-by: Claire Chang 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v7 05/15] swiotlb: Add a new get_io_tlb_mem getter

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Add a new getter, get_io_tlb_mem, to help select the io_tlb_mem struct.
> The restricted DMA pool is preferred if available.
> 
> Signed-off-by: Claire Chang 

Reviewed-by: Florian Fainelli 
-- 
Florian


[PULL] drm-intel-next

2021-05-19 Thread Rodrigo Vivi
Hi Dave and Daniel,

Here goes the first pull request targeting 5.14.

Main highlight goes to the ADL-P platform and display XeLPD IP
enabling patches. Also with a refactor on how we handle the graphics
and display IP versions.

drm-intel-next-2021-05-19-1:
Core Changes:

- drm: Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec (Jose).

Driver Changes:

- Display plane clock rates fixes and improvements (Ville).
- Uninint DMC FW loader state during shutdown (Imre).
- Convert snprintf to sysfs_emit (Xuezhi).
- Fix invalid access to ACPI _DSM objects (Takashi).
- A big refactor around how i915 addresses the graphics
  and display IP versions. (Matt, Lucas).
- Backlight fix (Lyude).
- Display watermark and DBUF fixes (Ville).
- HDCP fix (Anshuman).
- Improve cases where display is not available (Jose).
- Defeature PSR2 for RKL and ALD-S (Jose).
- VLV DSI panel power fixes and improvements (Hans).
- display-12 workaround (Jose).
- Fix modesetting (Imre).
- Drop redundant address-of op before lttpr_common_caps array (Imre).
- Fix compiler checks (Jose, Jason).
- GLK display fixes (Ville).
- Fix error code returns (Dan).
- eDP novel: back again to slow and wide link training everywhere (Kai-Heng).
- Abstract DMC FW path (Rodrigo).
- Preparation and changes for upcoming
  XeLPD display IP (Jose, Matt, Ville, Juha-Pekka, Animesh).
- Fix comment typo in DSI code (zuoqilin).
- Simplify CCS and UV plane alignment handling (Imre).
- PSR Fixes on TGL (Gwan-gyeong, Jose).
- Add intel_dp_hdcp.h and rename init (Jani).
- Move crtc and dpll declarations around (Jani).
- Fix pre-skl DP AUX precharge length (Ville).
- Remove stray newlines from random files (Ville).
- crtc->index and intel_crtc+drm_crtc pointer clean-up (Ville).
- Add frontbuffer tracking tracepoints (Ville).
- ADL-S PCI ID updates (Anand).
- Use unique backlight device names (Jani).
- A few clean-ups on i915/audio (Jani).
- Use intel_framebuffer instead of drm one on intel_fb functions (Imre).
- Add the missing MC CCS/XYUV format support on display >= 12 (Imre).
- Nuke display error state (Ville).
- ADL-P initial enablement patches
  starting to land (Clint, Imre, Jose, Umesh, Vandita, Mika).
- Display clean-up around VBT and the strap bits (Lucas).
- Try YCbCr420 color when RGB fails (Werner).
- More PSR fixes and improvements (Jose).
- Other generic display code clean-up (Jose, Ville).
- Use correct downstream caps for check Src-Ctl mode for PCON (Ankit).
- Disable HiZ Raw Stall Optimization on broken gen7 (Simon).

Thanks,
Rodrigo.

The following changes since commit 6efb943b8616ec53a5e444193dccf1af9ad627b5:

  Linux 5.13-rc1 (2021-05-09 14:17:44 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-next-2021-05-19-1

for you to fetch changes up to ec279384c6a02cf04a96054e82b1294a7aad6577:

  drm/i915: Initialize err in remap_io_sg() (2021-05-18 11:00:07 -0700)


Core Changes:

- drm: Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec (Jose).

Driver Changes:

- Display plane clock rates fixes and improvements (Ville).
- Uninint DMC FW loader state during shutdown (Imre).
- Convert snprintf to sysfs_emit (Xuezhi).
- Fix invalid access to ACPI _DSM objects (Takashi).
- A big refactor around how i915 addresses the graphics
  and display IP versions. (Matt, Lucas).
- Backlight fix (Lyude).
- Display watermark and DBUF fixes (Ville).
- HDCP fix (Anshuman).
- Improve cases where display is not available (Jose).
- Defeature PSR2 for RKL and ALD-S (Jose).
- VLV DSI panel power fixes and improvements (Hans).
- display-12 workaround (Jose).
- Fix modesetting (Imre).
- Drop redundant address-of op before lttpr_common_caps array (Imre).
- Fix compiler checks (Jose, Jason).
- GLK display fixes (Ville).
- Fix error code returns (Dan).
- eDP novel: back again to slow and wide link training everywhere (Kai-Heng).
- Abstract DMC FW path (Rodrigo).
- Preparation and changes for upcoming
  XeLPD display IP (Jose, Matt, Ville, Juha-Pekka, Animesh).
- Fix comment typo in DSI code (zuoqilin).
- Simplify CCS and UV plane alignment handling (Imre).
- PSR Fixes on TGL (Gwan-gyeong, Jose).
- Add intel_dp_hdcp.h and rename init (Jani).
- Move crtc and dpll declarations around (Jani).
- Fix pre-skl DP AUX precharge length (Ville).
- Remove stray newlines from random files (Ville).
- crtc->index and intel_crtc+drm_crtc pointer clean-up (Ville).
- Add frontbuffer tracking tracepoints (Ville).
- ADL-S PCI ID updates (Anand).
- Use unique backlight device names (Jani).
- A few clean-ups on i915/audio (Jani).
- Use intel_framebuffer instead of drm one on intel_fb functions (Imre).
- Add the missing MC CCS/XYUV format support on display >= 12 (Imre).
- Nuke display error state (Ville).
- ADL-P initial enablement patches
  starting to land (Clint, Imre, Jose, Umesh, Vandita, Mika).
- Display clean-up around VBT and the strap bits (Lucas).
- Try 

Re: [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"

2021-05-19 Thread Jason Ekstrand

On May 19, 2021 12:16:15 Daniel Vetter  wrote:


On Wed, May 19, 2021 at 5:06 PM Jason Ekstrand  wrote:


Once we no longer rely on error propagation, I think there's a lot we
can rip out.


I honestly did not find that much ... what did you uncover?


When I was digging through this earlier today, I think I convinced myself 
that the cmdparser is the only user of the entire fence error architecture. 
I may have missed something, though.


--Jason



-Daniel



--Jason

On Wed, May 19, 2021 at 5:15 AM Daniel Vetter  wrote:


From: Jason Ekstrand 

This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
since that commit, we've been having issues where a hang in one client
can propagate to another.  In particular, a hang in an app can propagate
to the X server which causes the whole desktop to lock up.

Error propagation along fences sound like a good idea, but as your bug
shows, surprising consequences, since propagating errors across security
boundaries is not a good thing.

What we do have is track the hangs on the ctx, and report information to
userspace using RESET_STATS. That's how arb_robustness works. Also, if my
understanding is still correct, the EIO from execbuf is when your context
is banned (because not recoverable or too many hangs). And in all these
cases it's up to userspace to figure out what is all impacted and should
be reported to the application, that's not on the kernel to guess and
automatically propagate.

What's more, we're also building more features on top of ctx error
reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
userspace fence wait also relies on that mechanism. So it is the path
going forward for reporting gpu hangs and resets to userspace.

So all together that's why I think we should just bury this idea again as
not quite the direction we want to go to, hence why I think the revert is
the right option here.Signed-off-by: Jason Ekstrand 

v2: Augment commit message. Also restore Jason's sob that I
accidentally lost.

Signed-off-by: Jason Ekstrand  (v1)
Reported-by: Marcin Slusarz 
Cc:  # v5.6+
Cc: Jason Ekstrand 
Cc: Marcin Slusarz 
Cc: Jon Bloomfield 
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already 
signaled fences")

Signed-off-by: Daniel Vetter 
---
drivers/gpu/drm/i915/i915_request.c | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c

index 970d8f4986bb..b796197c0772 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,

 do {
 fence = *child++;
-   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
-   i915_sw_fence_set_error_once(>submit, 
fence->error);

+   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
 continue;
-   }

 if (fence->context == rq->fence.context)
 continue;
@@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request 
*rq, struct dma_fence *fence)


 do {
 fence = *child++;
-   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
-   i915_sw_fence_set_error_once(>submit, 
fence->error);

+   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
 continue;
-   }

 /*
  * Requests on the same timeline are explicitly ordered, along
--
2.31.0




--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch




Re: [PATCH v7 03/15] swiotlb: Add DMA_RESTRICTED_POOL

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Add a new kconfig symbol, DMA_RESTRICTED_POOL, for restricted DMA pool.
> 
> Signed-off-by: Claire Chang 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v7 11/15] dma-direct: Add a new wrapper __dma_direct_free_pages()

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Add a new wrapper __dma_direct_free_pages() that will be useful later
> for swiotlb_free().
> 
> Signed-off-by: Claire Chang 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v7 04/15] swiotlb: Add restricted DMA pool initialization

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Add the initialization function to create restricted DMA pools from
> matching reserved-memory nodes.
> 
> Signed-off-by: Claire Chang 
> ---
>  include/linux/device.h  |  4 +++
>  include/linux/swiotlb.h |  3 +-
>  kernel/dma/swiotlb.c| 76 +
>  3 files changed, 82 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 38a2071cf776..4987608ea4ff 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -416,6 +416,7 @@ struct dev_links_info {
>   * @dma_pools:   Dma pools (if dma'ble device).
>   * @dma_mem: Internal for coherent mem override.
>   * @cma_area:Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
>   * @archdata:For arch-specific additions.
>   * @of_node: Associated device tree node.
>   * @fwnode:  Associated device node supplied by platform firmware.
> @@ -521,6 +522,9 @@ struct device {
>  #ifdef CONFIG_DMA_CMA
>   struct cma *cma_area;   /* contiguous memory area for dma
>  allocations */
> +#endif
> +#ifdef CONFIG_DMA_RESTRICTED_POOL
> + struct io_tlb_mem *dma_io_tlb_mem;
>  #endif
>   /* arch specific additions */
>   struct dev_archdata archdata;
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 216854a5e513..03ad6e3b4056 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -72,7 +72,8 @@ extern enum swiotlb_force swiotlb_force;
>   *   range check to see if the memory was in fact allocated by this
>   *   API.
>   * @nslabs:  The number of IO TLB blocks (in groups of 64) between @start and
> - *   @end. This is command line adjustable via setup_io_tlb_npages.
> + *   @end. For default swiotlb, this is command line adjustable via
> + *   setup_io_tlb_npages.
>   * @used:The number of used IO TLB block.
>   * @list:The free list describing the number of free entries available
>   *   from each index.
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index b849b01a446f..1d8eb4de0d01 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -39,6 +39,13 @@
>  #ifdef CONFIG_DEBUG_FS
>  #include 
>  #endif
> +#ifdef CONFIG_DMA_RESTRICTED_POOL
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#endif
>  
>  #include 
>  #include 
> @@ -690,3 +697,72 @@ static int __init swiotlb_create_default_debugfs(void)
>  late_initcall(swiotlb_create_default_debugfs);
>  
>  #endif
> +
> +#ifdef CONFIG_DMA_RESTRICTED_POOL
> +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> + struct device *dev)
> +{
> + struct io_tlb_mem *mem = rmem->priv;
> + unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
> +
> + if (dev->dma_io_tlb_mem)
> + return 0;
> +
> + /*
> +  * Since multiple devices can share the same pool, the private data,
> +  * io_tlb_mem struct, will be initialized by the first device attached
> +  * to it.
> +  */
> + if (!mem) {
> + mem = kzalloc(struct_size(mem, slots, nslabs), GFP_KERNEL);
> + if (!mem)
> + return -ENOMEM;
> +
> + if (PageHighMem(pfn_to_page(PHYS_PFN(rmem->base {
> + kfree(mem);
> + return -EINVAL;

This could probably deserve a warning here to indicate that the reserved
area must be accessible within the linear mapping as I would expect a
lot of people to trip over that.

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v7 01/15] swiotlb: Refactor swiotlb init functions

2021-05-19 Thread Florian Fainelli



On 5/17/2021 11:42 PM, Claire Chang wrote:
> Add a new function, swiotlb_init_io_tlb_mem, for the io_tlb_mem struct
> initialization to make the code reusable.
> 
> Note that we now also call set_memory_decrypted in swiotlb_init_with_tbl.
> 
> Signed-off-by: Claire Chang 
> ---
>  kernel/dma/swiotlb.c | 51 ++--
>  1 file changed, 25 insertions(+), 26 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 8ca7d505d61c..d3232fc19385 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -168,9 +168,30 @@ void __init swiotlb_update_mem_attributes(void)
>   memset(vaddr, 0, bytes);
>  }
>  
> -int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int 
> verbose)
> +static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t 
> start,
> + unsigned long nslabs, bool late_alloc)
>  {
> + void *vaddr = phys_to_virt(start);
>   unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
> +
> + mem->nslabs = nslabs;
> + mem->start = start;
> + mem->end = mem->start + bytes;
> + mem->index = 0;
> + mem->late_alloc = late_alloc;
> + spin_lock_init(>lock);
> + for (i = 0; i < mem->nslabs; i++) {
> + mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> + mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> + mem->slots[i].alloc_size = 0;
> + }
> +
> + set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
> + memset(vaddr, 0, bytes);

You are doing an unconditional set_memory_decrypted() followed by a
memset here, and then:

> +}
> +
> +int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int 
> verbose)
> +{
>   struct io_tlb_mem *mem;
>   size_t alloc_size;
>  
> @@ -186,16 +207,8 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned 
> long nslabs, int verbose)
>   if (!mem)
>   panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
> __func__, alloc_size, PAGE_SIZE);
> - mem->nslabs = nslabs;
> - mem->start = __pa(tlb);
> - mem->end = mem->start + bytes;
> - mem->index = 0;
> - spin_lock_init(>lock);
> - for (i = 0; i < mem->nslabs; i++) {
> - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> - mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> - mem->slots[i].alloc_size = 0;
> - }
> +
> + swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);

You convert this call site with swiotlb_init_io_tlb_mem() which did not
do the set_memory_decrypted()+memset(). Is this okay or should
swiotlb_init_io_tlb_mem() add an additional argument to do this
conditionally?
-- 
Florian


[RFC 3/3] drm/msm: Wire up gpu boost

2021-05-19 Thread Rob Clark
From: Rob Clark 

Note, at this point I haven't given a lot of consideration into how much
we should boost, and for how long.  And perhaps we should only boost at
less than 50% utilization?  At this point, this is only an example of
dma_fence_boost() implementation.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_fence.c | 10 ++
 drivers/gpu/drm/msm/msm_gpu.c   | 13 +
 drivers/gpu/drm/msm/msm_gpu.h   |  2 ++
 3 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index cd59a5918038..e58895603726 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -8,6 +8,7 @@
 
 #include "msm_drv.h"
 #include "msm_fence.h"
+#include "msm_gpu.h"
 
 
 struct msm_fence_context *
@@ -114,10 +115,19 @@ static bool msm_fence_signaled(struct dma_fence *fence)
return fence_completed(f->fctx, f->base.seqno);
 }
 
+static void msm_fence_boost(struct dma_fence *fence)
+{
+   struct msm_fence *f = to_msm_fence(fence);
+   struct msm_drm_private *priv = f->fctx->dev->dev_private;
+
+   msm_gpu_boost(priv->gpu);
+}
+
 static const struct dma_fence_ops msm_fence_ops = {
.get_driver_name = msm_fence_get_driver_name,
.get_timeline_name = msm_fence_get_timeline_name,
.signaled = msm_fence_signaled,
+   .boost = msm_fence_boost,
 };
 
 struct dma_fence *
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 9dd1c58430ab..c90b79116500 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -62,6 +62,10 @@ static int msm_devfreq_get_dev_status(struct device *dev,
status->total_time = ktime_us_delta(time, gpu->devfreq.time);
gpu->devfreq.time = time;
 
+   if (atomic_dec_if_positive(>devfreq.boost) >= 0) {
+   status->busy_time = status->total_time;
+   }
+
return 0;
 }
 
@@ -84,6 +88,15 @@ static struct devfreq_dev_profile msm_devfreq_profile = {
.get_cur_freq = msm_devfreq_get_cur_freq,
 };
 
+void msm_gpu_boost(struct msm_gpu *gpu)
+{
+   if (!gpu->funcs->gpu_busy)
+   return;
+
+   /* Add three devfreq polling intervals worth of boost: */
+   atomic_add(3, >devfreq.boost);
+}
+
 static void msm_devfreq_init(struct msm_gpu *gpu)
 {
/* We need target support to do devfreq */
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 18baf935e143..7a082a12d98f 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -150,6 +150,7 @@ struct msm_gpu {
struct devfreq *devfreq;
u64 busy_cycles;
ktime_t time;
+   atomic_t boost;
} devfreq;
 
uint32_t suspend_count;
@@ -295,6 +296,7 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, 
u32 hi, u64 val)
 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
 int msm_gpu_pm_resume(struct msm_gpu *gpu);
 void msm_gpu_resume_devfreq(struct msm_gpu *gpu);
+void msm_gpu_boost(struct msm_gpu *gpu);
 
 int msm_gpu_hw_init(struct msm_gpu *gpu);
 
-- 
2.30.2



[RFC 0/3] dma-fence: Add a "boost" mechanism

2021-05-19 Thread Rob Clark
From: Rob Clark 

In some cases, like double-buffered rendering, missing vblanks can
trick the GPU into running at a lower frequence, when really we
want to be running at a higher frequency to not miss the vblanks
in the first place.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

The last patch is just proof of concept, in reality I think it
may want to be a bit more clever.  But sending this out as it
is as an RFC to get feedback.

Rob Clark (3):
  dma-fence: Add boost fence op
  drm/atomic: Call dma_fence_boost() when we've missed a vblank
  drm/msm: Wire up gpu boost

 drivers/gpu/drm/drm_atomic_helper.c | 11 +++
 drivers/gpu/drm/msm/msm_fence.c | 10 ++
 drivers/gpu/drm/msm/msm_gpu.c   | 13 +
 drivers/gpu/drm/msm/msm_gpu.h   |  2 ++
 include/linux/dma-fence.h   | 26 ++
 5 files changed, 62 insertions(+)

-- 
2.30.2



[RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank

2021-05-19 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/drm_atomic_helper.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index 560aaecba31b..fe10fc2e7f86 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device 
*dev,
int i, ret;
 
for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+   u64 vblank_count;
+
if (!new_plane_state->fence)
continue;
 
WARN_ON(!new_plane_state->fb);
 
+   vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
+
/*
 * If waiting for fences pre-swap (ie: nonblock), userspace can
 * still interrupt the operation. Instead of blocking until the
@@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device 
*dev,
if (ret)
return ret;
 
+   /*
+* Check if we've missed a vblank while waiting, and if we have
+* signal the fence that it's signaler should be boosted
+*/
+   if (vblank_count != 
drm_crtc_vblank_count(new_plane_state->crtc))
+   dma_fence_boost(new_plane_state->fence);
+
dma_fence_put(new_plane_state->fence);
new_plane_state->fence = NULL;
}
-- 
2.30.2



[RFC 1/3] dma-fence: Add boost fence op

2021-05-19 Thread Rob Clark
From: Rob Clark 

Add a way to hint to the fence signaler that a fence waiter has missed a
deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization,
when really we want to go in the opposite direction and boost gpu freq.
The boost callback gives some feedback to the fence signaler that we
are missing deadlines, so it can take this into account in it's freq/
utilization calculations.

Signed-off-by: Rob Clark 
---
 include/linux/dma-fence.h | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
signed long (*wait)(struct dma_fence *fence,
bool intr, signed long timeout);
 
+   /**
+* @boost:
+*
+* Optional callback, to indicate that a fence waiter missed a deadline.
+* This can serve as a signal that (if possible) whatever signals the
+* fence should boost it's clocks.
+*
+* This can be called in any context that can call dma_fence_wait().
+*/
+   void (*boost)(struct dma_fence *fence);
+
/**
 * @release:
 *
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence 
*fence, bool intr)
return ret < 0 ? ret : 0;
 }
 
+/**
+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this in to device
+ * power state decisions
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+   if (fence->ops->boost)
+   fence->ops->boost(fence);
+}
+
 struct dma_fence *dma_fence_get_stub(void);
 u64 dma_fence_context_alloc(unsigned num);
 
-- 
2.30.2



Re: [Intel-gfx] [PATCH 0/7] Per client engine busyness

2021-05-19 Thread Daniel Vetter
On Wed, May 19, 2021 at 6:16 PM Tvrtko Ursulin
 wrote:
>
>
> On 18/05/2021 10:40, Tvrtko Ursulin wrote:
> >
> > On 18/05/2021 10:16, Daniel Stone wrote:
> >> Hi,
> >>
> >> On Tue, 18 May 2021 at 10:09, Tvrtko Ursulin
> >>  wrote:
> >>> I was just wondering if stat(2) and a chrdev major check would be a
> >>> solid criteria to more efficiently (compared to parsing the text
> >>> content) detect drm files while walking procfs.
> >>
> >> Maybe I'm missing something, but is the per-PID walk actually a
> >> measurable performance issue rather than just a bit unpleasant?
> >
> > Per pid and per each open fd.
> >
> > As said in the other thread what bothers me a bit in this scheme is that
> > the cost of obtaining GPU usage scales based on non-GPU criteria.
> >
> > For use case of a top-like tool which shows all processes this is a
> > smaller additional cost, but then for a gpu-top like tool it is somewhat
> > higher.
>
> To further expand, not only cost would scale per pid multiplies per open
> fd, but to detect which of the fds are DRM I see these three options:
>
> 1) Open and parse fdinfo.
> 2) Name based matching ie /dev/dri/.. something.
> 3) Stat the symlink target and check for DRM major.

stat with symlink following should be plenty fast.

> All sound quite sub-optimal to me.
>
> Name based matching is probably the least evil on system resource usage
> (Keeping the dentry cache too hot? Too many syscalls?), even though
> fundamentally I don't it is the right approach.
>
> What happens with dup(2) is another question.

We need benchmark numbers showing that on anything remotely realistic
it's an actual problem. Until we've demonstrated it's a real problem
we don't need to solve it.

E.g. top with any sorting enabled also parses way more than it
displays on every update. It seems to be doing Just Fine (tm).

> Does anyone have any feedback on the /proc//gpu idea at all?

When we know we have a problem to solve we can take a look at solutions.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Jason Gunthorpe
> Sorry for the noise.

Not at all, it is good that more people understand things!

Jason


Re: [Intel-gfx] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-19 Thread Matthew Brost
On Wed, May 19, 2021 at 01:10:04PM +0200, Daniel Vetter wrote:
> On Tue, May 18, 2021 at 04:58:30PM -0700, Matthew Brost wrote:
> > Add entry fpr i915 new parallel submission uAPI plan.
> > 
> > v2:
> >  (Daniel Vetter):
> >   - Expand logical order explaination
> >   - Add dummy header
> >   - Only allow N BBs in execbuf IOCTL
> >   - Configure parallel submission per slot not per gem context
> > 
> > Cc: Tvrtko Ursulin 
> > Cc: Tony Ye 
> > CC: Carl Zhang 
> > Cc: Daniel Vetter 
> > Cc: Jason Ekstrand 
> > Signed-off-by: Matthew Brost 
> > ---
> >  Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 ++
> >  Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
> >  2 files changed, 196 insertions(+), 1 deletion(-)
> >  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > 
> > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > new file mode 100644
> > index ..8c64b983ccad
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > @@ -0,0 +1,144 @@
> > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > i915_context_engines_parallel_submit */
> > +
> > +/*
> > + * i915_context_engines_parallel_submit:
> > + *
> > + * Setup a slot to allow multiple BBs to be submitted in a single execbuf 
> > IOCTL.
> > + * Those BBs will then be scheduled to run on the GPU in parallel. Multiple
> > + * hardware contexts are created internally in the i915 run these BBs. 
> > Once a
> > + * slot is configured for N BBs only N BBs can be submitted in each execbuf
> > + * IOCTL and this is implict behavior (e.g. the user doesn't tell the 
> > execbuf
> > + * IOCTL there are N BBs, the execbuf IOCTL know how many BBs there are 
> > based on
> > + * the slots configuration).
> > + *
> > + * Their are two currently defined ways to control the placement of the
> > + * hardware contexts on physical engines: default behavior (no flags) and
> > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the
> > + * future as new hardware / use cases arise. Details of how to use this
> > + * interface below above the flags.
> > + *
> > + * Returns -EINVAL if hardware context placement configuration invalid or 
> > if the
> > + * placement configuration isn't supported on the platform / submission
> > + * interface.
> > + * Returns -ENODEV if extension isn't supported on the platform / 
> > submission
> > + * inteface.
> > + */
> > +struct i915_context_engines_parallel_submit {
> > +   struct i915_user_extension base;
> > +
> > +   __u16 engine_index; /* slot for parallel engine */
> > +   __u16 width;/* number of contexts per parallel engine */
> > +   __u16 num_siblings; /* number of siblings per context */
> > +   __u16 mbz16;
> 
> Ok the big picture looks reasonable now, the flags still confuse me.
> 

Yea, it is a bit confusing.

> > +/*
> > + * Default placement behvavior (currently unsupported):
> > + *
> > + * Rather than restricting parallel submission to a single class with a
> > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a 
> > mode that
> > + * enables parallel submission across multiple engine classes. In this 
> > case each
> > + * context's logical engine mask indicates where that context can placed. 
> > It is
> > + * implied in this mode that all contexts have mutual exclusive placement 
> > (e.g.
> > + * if one context is running CS0 no other contexts can run on CS0).
> > + *
> > + * Example 1 pseudo code:
> > + * CSX[Y] = engine class X, logical instance Y
> > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > + * set_engines(INVALID)
> > + * set_parallel(engine_index=0, width=2, num_siblings=2,
> > + * engines=CS0[0],CS0[1],CS1[0],CS1[1])
> > + *
> > + * Results in the following valid placements:
> > + * CS0[0], CS1[0]
> > + * CS0[0], CS1[1]
> > + * CS0[1], CS1[0]
> > + * CS0[1], CS1[1]
> > + *
> > + * This can also be though of as 2 virtual engines:
> > + * VE[0] = CS0[0], CS0[1]
> > + * VE[1] = CS1[0], CS1[1]
> > + *
> > + * Example 2 pseudo code:
> > + * CS[X] = generic engine of same class, logical instance X
> > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > + * set_engines(INVALID)
> > + * set_parallel(engine_index=0, width=2, num_siblings=3,
> > + * engines=CS[0],CS[1],CS[2],CS[0],CS[1],CS[2])
> > + *
> > + * Results in the following valid placements:
> > + * CS[0], CS[1]
> > + * CS[0], CS[2]
> > + * CS[1], CS[0]
> > + * CS[1], CS[2]
> > + * CS[2], CS[0]
> > + * CS[2], CS[1]
> > + *
> > + *
> > + * This can also be though of as 2 virtual engines:
> > + * VE[0] = CS[0], CS[1], CS[2]
> > + * VE[1] = CS[0], CS[1], CS[2]
> > +
> > + * This enables a use case where all engines are created equally, we don't 
> > care
> > + * where they are scheduled, we just want a certain number of resources, 
> > for
> > + * those resources to be scheduled 

Re: [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"

2021-05-19 Thread Daniel Vetter
On Wed, May 19, 2021 at 5:06 PM Jason Ekstrand  wrote:
>
> Once we no longer rely on error propagation, I think there's a lot we
> can rip out.

I honestly did not find that much ... what did you uncover?
-Daniel

>
> --Jason
>
> On Wed, May 19, 2021 at 5:15 AM Daniel Vetter  wrote:
> >
> > From: Jason Ekstrand 
> >
> > This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
> > since that commit, we've been having issues where a hang in one client
> > can propagate to another.  In particular, a hang in an app can propagate
> > to the X server which causes the whole desktop to lock up.
> >
> > Error propagation along fences sound like a good idea, but as your bug
> > shows, surprising consequences, since propagating errors across security
> > boundaries is not a good thing.
> >
> > What we do have is track the hangs on the ctx, and report information to
> > userspace using RESET_STATS. That's how arb_robustness works. Also, if my
> > understanding is still correct, the EIO from execbuf is when your context
> > is banned (because not recoverable or too many hangs). And in all these
> > cases it's up to userspace to figure out what is all impacted and should
> > be reported to the application, that's not on the kernel to guess and
> > automatically propagate.
> >
> > What's more, we're also building more features on top of ctx error
> > reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
> > userspace fence wait also relies on that mechanism. So it is the path
> > going forward for reporting gpu hangs and resets to userspace.
> >
> > So all together that's why I think we should just bury this idea again as
> > not quite the direction we want to go to, hence why I think the revert is
> > the right option here.Signed-off-by: Jason Ekstrand 
> > 
> >
> > v2: Augment commit message. Also restore Jason's sob that I
> > accidentally lost.
> >
> > Signed-off-by: Jason Ekstrand  (v1)
> > Reported-by: Marcin Slusarz 
> > Cc:  # v5.6+
> > Cc: Jason Ekstrand 
> > Cc: Marcin Slusarz 
> > Cc: Jon Bloomfield 
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> > Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already 
> > signaled fences")
> > Signed-off-by: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/i915/i915_request.c | 8 ++--
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c 
> > b/drivers/gpu/drm/i915/i915_request.c
> > index 970d8f4986bb..b796197c0772 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
> >
> > do {
> > fence = *child++;
> > -   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
> > -   i915_sw_fence_set_error_once(>submit, 
> > fence->error);
> > +   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
> > continue;
> > -   }
> >
> > if (fence->context == rq->fence.context)
> > continue;
> > @@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request 
> > *rq, struct dma_fence *fence)
> >
> > do {
> > fence = *child++;
> > -   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
> > -   i915_sw_fence_set_error_once(>submit, 
> > fence->error);
> > +   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
> > continue;
> > -   }
> >
> > /*
> >  * Requests on the same timeline are explicitly ordered, 
> > along
> > --
> > 2.31.0
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-19 Thread Matthew Brost
On Wed, May 19, 2021 at 01:45:39PM +0200, Christian König wrote:
> Oh, yeah we call that gang submit on the AMD side.
> 
> Had already some internal discussions how to implement this, but so far
> couldn't figure out how to cleanly introduce that into the DRM scheduler.
> 
> Can you briefly describe in a few words how that is supposed to work on the
> Intel side?
> 

Sure, I've done a quick PoC internally and have been able to hook this
into the DRM scheduler.

Basically each BB still maps to a single job as each job is somewhat
unique (e.g. each job has its own ring, lrc, seqno, etc...). However all
the jobs configured to run in parallel map to a single sched_entity
which maintains the order each job was generated from the execbuf IOCTL
(1 - N). When the backend receives jobs 1 to N - 1 it basically just
updates some internal state. When the backend sees job N (last job) it
actually does the submit for jobs 1 - N which with GuC submission is a
simple command moving the LRC tail of the N jobs.

Daniel has suggested that we create a single job for the NN BBs but that
would be huge rework to the internals of the i915 and likely won't
happen by the time this code first lands.

Also worth noting one way a job isn't really a treated individually is
the excl slot with dma-resv. In that case we create a composite fence of
all jobs (dma_fence_array).

Matt

> Thanks,
> Christian.
> 
> Am 19.05.21 um 01:58 schrieb Matthew Brost:
> > Add entry fpr i915 new parallel submission uAPI plan.
> > 
> > v2:
> >   (Daniel Vetter):
> >- Expand logical order explaination
> >- Add dummy header
> >- Only allow N BBs in execbuf IOCTL
> >- Configure parallel submission per slot not per gem context
> > 
> > Cc: Tvrtko Ursulin 
> > Cc: Tony Ye 
> > CC: Carl Zhang 
> > Cc: Daniel Vetter 
> > Cc: Jason Ekstrand 
> > Signed-off-by: Matthew Brost 
> > ---
> >   Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 ++
> >   Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
> >   2 files changed, 196 insertions(+), 1 deletion(-)
> >   create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
> > 
> > diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> > b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > new file mode 100644
> > index ..8c64b983ccad
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> > @@ -0,0 +1,144 @@
> > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > i915_context_engines_parallel_submit */
> > +
> > +/*
> > + * i915_context_engines_parallel_submit:
> > + *
> > + * Setup a slot to allow multiple BBs to be submitted in a single execbuf 
> > IOCTL.
> > + * Those BBs will then be scheduled to run on the GPU in parallel. Multiple
> > + * hardware contexts are created internally in the i915 run these BBs. 
> > Once a
> > + * slot is configured for N BBs only N BBs can be submitted in each execbuf
> > + * IOCTL and this is implict behavior (e.g. the user doesn't tell the 
> > execbuf
> > + * IOCTL there are N BBs, the execbuf IOCTL know how many BBs there are 
> > based on
> > + * the slots configuration).
> > + *
> > + * Their are two currently defined ways to control the placement of the
> > + * hardware contexts on physical engines: default behavior (no flags) and
> > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the
> > + * future as new hardware / use cases arise. Details of how to use this
> > + * interface below above the flags.
> > + *
> > + * Returns -EINVAL if hardware context placement configuration invalid or 
> > if the
> > + * placement configuration isn't supported on the platform / submission
> > + * interface.
> > + * Returns -ENODEV if extension isn't supported on the platform / 
> > submission
> > + * inteface.
> > + */
> > +struct i915_context_engines_parallel_submit {
> > +   struct i915_user_extension base;
> > +
> > +   __u16 engine_index; /* slot for parallel engine */
> > +   __u16 width;/* number of contexts per parallel engine */
> > +   __u16 num_siblings; /* number of siblings per context */
> > +   __u16 mbz16;
> > +/*
> > + * Default placement behvavior (currently unsupported):
> > + *
> > + * Rather than restricting parallel submission to a single class with a
> > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a 
> > mode that
> > + * enables parallel submission across multiple engine classes. In this 
> > case each
> > + * context's logical engine mask indicates where that context can placed. 
> > It is
> > + * implied in this mode that all contexts have mutual exclusive placement 
> > (e.g.
> > + * if one context is running CS0 no other contexts can run on CS0).
> > + *
> > + * Example 1 pseudo code:
> > + * CSX[Y] = engine class X, logical instance Y
> > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > + * set_engines(INVALID)
> > + * set_parallel(engine_index=0, width=2, num_siblings=2,

Re: [PATCH] vgaarb: Use ACPI HID name to find integrated GPU

2021-05-19 Thread Alex Deucher
On Wed, May 19, 2021 at 9:57 AM Kai-Heng Feng
 wrote:
>
> Commit 3d42f1ddc47a ("vgaarb: Keep adding VGA device in queue") assumes
> the first device is an integrated GPU. However, on AMD platforms an
> integrated GPU can have higher PCI device number than a discrete GPU.
>
> Integrated GPU on ACPI platform generally has _DOD and _DOS method, so
> use that as predicate to find integrated GPU. If the new strategy
> doesn't work, fallback to use the first device as boot VGA.
>
> Signed-off-by: Kai-Heng Feng 

Reviewed-by: Alex Deucher 

Unless there are any other comments, I'll apply it tomorrow.

Alex

> ---
>  drivers/gpu/vga/vgaarb.c | 31 ++-
>  1 file changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
> index 5180c5687ee5..949fde433ea2 100644
> --- a/drivers/gpu/vga/vgaarb.c
> +++ b/drivers/gpu/vga/vgaarb.c
> @@ -50,6 +50,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>
> @@ -1450,9 +1451,23 @@ static struct miscdevice vga_arb_device = {
> MISC_DYNAMIC_MINOR, "vga_arbiter", _arb_device_fops
>  };
>
> +#if defined(CONFIG_ACPI)
> +static bool vga_arb_integrated_gpu(struct device *dev)
> +{
> +   struct acpi_device *adev = ACPI_COMPANION(dev);
> +
> +   return adev && !strcmp(acpi_device_hid(adev), ACPI_VIDEO_HID);
> +}
> +#else
> +static bool vga_arb_integrated_gpu(struct device *dev)
> +{
> +   return false;
> +}
> +#endif
> +
>  static void __init vga_arb_select_default_device(void)
>  {
> -   struct pci_dev *pdev;
> +   struct pci_dev *pdev, *found = NULL;
> struct vga_device *vgadev;
>
>  #if defined(CONFIG_X86) || defined(CONFIG_IA64)
> @@ -1505,20 +1520,26 @@ static void __init vga_arb_select_default_device(void)
>  #endif
>
> if (!vga_default_device()) {
> -   list_for_each_entry(vgadev, _list, list) {
> +   list_for_each_entry_reverse(vgadev, _list, list) {
> struct device *dev = >pdev->dev;
> u16 cmd;
>
> pdev = vgadev->pdev;
> pci_read_config_word(pdev, PCI_COMMAND, );
> if (cmd & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
> -   vgaarb_info(dev, "setting as boot device (VGA 
> legacy resources not available)\n");
> -   vga_set_default_device(pdev);
> -   break;
> +   found = pdev;
> +   if (vga_arb_integrated_gpu(dev))
> +   break;
> }
> }
> }
>
> +   if (found) {
> +   vgaarb_info(>dev, "setting as boot device (VGA legacy 
> resources not available)\n");
> +   vga_set_default_device(found);
> +   return;
> +   }
> +
> if (!vga_default_device()) {
> vgadev = list_first_entry_or_null(_list,
>   struct vga_device, list);
> --
> 2.31.1
>


Re: [PATCH 1/3] gpu: drm: replace occurrences of invalid character

2021-05-19 Thread Alex Deucher
Pushed out to drm-misc-next.  Also fixed up Michel's name.

Alex

On Wed, May 19, 2021 at 11:56 AM Randy Dunlap  wrote:
>
> On 5/19/21 1:15 AM, Mauro Carvalho Chehab wrote:
> > There are some places at drm that ended receiving a
> > REPLACEMENT CHARACTER U+fffd ('�'), probably because of
> > some bad charset conversion.
> >
> > Fix them by using what it seems   to be the proper
> > character.
> >
> > Signed-off-by: Mauro Carvalho Chehab 
>
> Acked-by: Randy Dunlap 
>
> Thanks.
>
> > ---
> >  drivers/gpu/drm/amd/include/atombios.h   | 10 +-
> >  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  2 +-
> >  drivers/gpu/drm/i915/i915_gpu_error.h|  2 +-
> >  drivers/gpu/drm/r128/r128_drv.h  |  2 +-
> >  4 files changed, 8 insertions(+), 8 deletions(-)
> >
>
> --
> ~Randy
>


[PATCH -next] drm/rockchip: cdn-dp-core: add missing clk_disable_unprepare() on error in cdn_dp_grf_write()

2021-05-19 Thread Yang Yingliang
After calling clk_prepare_enable(), clk_disable_unprepare() need
be called when calling regmap_write() failed.

Fixes: 1a0f7ed3abe2 ("drm/rockchip: cdn-dp: add cdn DP support for rk3399")
Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 drivers/gpu/drm/rockchip/cdn-dp-core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c 
b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index a4a45daf93f2..6802d9b65f82 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -73,6 +73,7 @@ static int cdn_dp_grf_write(struct cdn_dp_device *dp,
ret = regmap_write(dp->grf, reg, val);
if (ret) {
DRM_DEV_ERROR(dp->dev, "Could not write to GRF: %d\n", ret);
+   clk_disable_unprepare(dp->grf_clk);
return ret;
}
 
-- 
2.25.1



[PATCH] drivers/video/fbdev/core/fbmem.c: add pointer judgment

2021-05-19 Thread songqiang
Signed-off-by: songqiang 
---
 drivers/video/fbdev/core/fbmem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 072780b0e570..6036ab849475 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1859,6 +1859,9 @@ void fb_set_suspend(struct fb_info *info, int state)
 {
WARN_CONSOLE_UNLOCKED();
 
+   if (!info) {
+   return;
+   }
if (state) {
fbcon_suspended(info);
info->state = FBINFO_STATE_SUSPENDED;
-- 
2.20.1





[PATCH] drm/i915: Force DPCD backlight mode for Samsung 16727 panel

2021-05-19 Thread Aaron Ma
Another Samsung OLED panel needs DPCD to get control of backlight.
Kernel 5.12+ support the backlight via:
commit: <4a8d79901d5b> ("drm/i915/dp: Enable Intel's HDR backlight interface 
(only SDR for now)")
Only make backlight work on lower versions of kernel.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3474
Cc: sta...@vger.kernel.org # 5.11-
Signed-off-by: Aaron Ma 
---
 drivers/gpu/drm/drm_dp_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 5bd0934004e3..7b91d8a76cd6 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1960,6 +1960,7 @@ static const struct edid_quirk edid_quirk_list[] = {
{ MFG(0x4d, 0x10), PROD_ID(0xe6, 0x14), 
BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
{ MFG(0x4c, 0x83), PROD_ID(0x47, 0x41), 
BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
{ MFG(0x09, 0xe5), PROD_ID(0xde, 0x08), 
BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
+   { MFG(0x4c, 0x83), PROD_ID(0x57, 0x41), 
BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
 };
 
 #undef MFG
-- 
2.25.1



[PATCH v5 3/3] MAINTAINERS: Add maintainer for hyperv video device

2021-05-19 Thread Deepak Rawat
Maintainer for hyperv synthetic video device.

Signed-off-by: Deepak Rawat 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 41f2b2b85b6d..dbe4ed540e11 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6084,6 +6084,14 @@ T:   git git://anongit.freedesktop.org/drm/drm-misc
 F: Documentation/devicetree/bindings/display/hisilicon/
 F: drivers/gpu/drm/hisilicon/
 
+DRM DRIVER FOR HYPERV SYNTHETIC VIDEO DEVICE
+M: Deepak Rawat 
+L: linux-hyp...@vger.kernel.org
+L: dri-devel@lists.freedesktop.org
+S: Maintained
+T: git git://anongit.freedesktop.org/drm/drm-misc
+F: drivers/gpu/drm/hyperv
+
 DRM DRIVERS FOR LIMA
 M: Qiang Yu 
 L: dri-devel@lists.freedesktop.org
-- 
2.31.1



[PATCH v5 2/3] drm/hyperv: Handle feature change message from device

2021-05-19 Thread Deepak Rawat
Virtual device inform if screen update is needed or not with
SYNTHVID_FEATURE_CHANGE message. Handle this message to set dirt_needed
flag.

Suggested-by: Dexuan Cui 
Signed-off-by: Deepak Rawat 
Acked-by: Thomas Zimmermann 
---
 drivers/gpu/drm/hyperv/hyperv_drm.h   | 1 +
 drivers/gpu/drm/hyperv/hyperv_drm_drv.c   | 2 ++
 drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 7 +++
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/hyperv/hyperv_drm.h 
b/drivers/gpu/drm/hyperv/hyperv_drm.h
index e1d1fdea96f2..886add4f9cd0 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm.h
+++ b/drivers/gpu/drm/hyperv/hyperv_drm.h
@@ -29,6 +29,7 @@ struct hyperv_drm_device {
struct completion wait;
u32 synthvid_version;
u32 mmio_megabytes;
+   bool dirt_needed;
 
u8 init_buf[VMBUS_MAX_PACKET_SIZE];
u8 recv_buf[VMBUS_MAX_PACKET_SIZE];
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c 
b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
index c346dc7544aa..878b48a186c2 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
@@ -198,6 +198,8 @@ static int hyperv_vmbus_probe(struct hv_device *hdev,
if (ret)
drm_warn(dev, "Failed to update vram location.\n");
 
+   hv->dirt_needed = true;
+
ret = hyperv_mode_config_init(hv);
if (ret)
goto err_vmbus_close;
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c 
b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
index 700870b243fe..6fff24b40974 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
@@ -301,8 +301,12 @@ int hyperv_update_situation(struct hv_device *hdev, u8 
active, u32 bpp,
 
 int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect)
 {
+   struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
struct synthvid_msg msg;
 
+   if (!hv->dirt_needed)
+   return 0;
+
memset(, 0, sizeof(struct synthvid_msg));
 
msg.vid_hdr.type = SYNTHVID_DIRT;
@@ -387,6 +391,9 @@ static void hyperv_receive_sub(struct hv_device *hdev)
complete(>wait);
return;
}
+
+   if (msg->vid_hdr.type == SYNTHVID_FEATURE_CHANGE)
+   hv->dirt_needed = msg->feature_chg.is_dirt_needed;
 }
 
 static void hyperv_receive(void *ctx)
-- 
2.31.1



[PATCH v5 1/3] drm/hyperv: Add DRM driver for hyperv synthetic video device

2021-05-19 Thread Deepak Rawat
DRM driver for hyperv synthetic video device, based on hyperv_fb
framebuffer driver. Also added config option "DRM_HYPERV" to enabled
this driver.

v2:
- Add support for gen2 VM
- Fixed review comments

v3:
- Split into multiple files as suggested by Thomas Zimmermann
- Fixed hibernation issue as suggested by Dexuan Cui
- Use ioremap_cache as suggested by Dexuan Cui
- Incorporated other review comments

v4:
- Fix bitrotted code
- Review comments
- Updated the copyright and license to match hyperv_fb

v5:
- Address review comments and rebased with drm-misc-next

Signed-off-by: Deepak Rawat 
Acked-by: Thomas Zimmermann 
---
 drivers/gpu/drm/Kconfig |  12 +
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/hyperv/Makefile |   8 +
 drivers/gpu/drm/hyperv/hyperv_drm.h |  51 +++
 drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 307 +
 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c | 231 ++
 drivers/gpu/drm/hyperv/hyperv_drm_proto.c   | 478 
 7 files changed, 1088 insertions(+)
 create mode 100644 drivers/gpu/drm/hyperv/Makefile
 create mode 100644 drivers/gpu/drm/hyperv/hyperv_drm.h
 create mode 100644 drivers/gpu/drm/hyperv/hyperv_drm_drv.c
 create mode 100644 drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
 create mode 100644 drivers/gpu/drm/hyperv/hyperv_drm_proto.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index d3a9ca4b1cec..9d0ea894ba70 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -380,6 +380,18 @@ source "drivers/gpu/drm/xlnx/Kconfig"
 
 source "drivers/gpu/drm/gud/Kconfig"
 
+config DRM_HYPERV
+   tristate "DRM Support for hyperv synthetic video device"
+   depends on DRM && PCI && MMU && HYPERV
+   select DRM_KMS_HELPER
+   select DRM_GEM_SHMEM_HELPER
+   help
+This is a KMS driver for hyperv synthetic video device. Choose this
+option if you would like to enable drm driver for Hyper-V virtual
+machine.
+
+If M is selected the module will be called hyperv_drm.
+
 # Keep legacy drivers last
 
 menuconfig DRM_LEGACY
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index a91cc7684904..a118692a6df7 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -126,3 +126,4 @@ obj-$(CONFIG_DRM_MCDE) += mcde/
 obj-$(CONFIG_DRM_TIDSS) += tidss/
 obj-y  += xlnx/
 obj-y  += gud/
+obj-$(CONFIG_DRM_HYPERV) += hyperv/
diff --git a/drivers/gpu/drm/hyperv/Makefile b/drivers/gpu/drm/hyperv/Makefile
new file mode 100644
index ..265f12f2c660
--- /dev/null
+++ b/drivers/gpu/drm/hyperv/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+hyperv_drm-y := \
+   hyperv_drm_drv.o \
+   hyperv_drm_modeset.o \
+   hyperv_drm_proto.o
+
+obj-$(CONFIG_DRM_HYPERV) += hyperv_drm.o
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm.h 
b/drivers/gpu/drm/hyperv/hyperv_drm.h
new file mode 100644
index ..e1d1fdea96f2
--- /dev/null
+++ b/drivers/gpu/drm/hyperv/hyperv_drm.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright 2021 Microsoft
+ */
+
+#ifndef _HYPERV_DRM_H_
+#define _HYPERV_DRM_H_
+
+#define VMBUS_MAX_PACKET_SIZE 0x4000
+
+struct hyperv_drm_device {
+   /* drm */
+   struct drm_device dev;
+   struct drm_simple_display_pipe pipe;
+   struct drm_connector connector;
+
+   /* mode */
+   u32 screen_width_max;
+   u32 screen_height_max;
+   u32 preferred_width;
+   u32 preferred_height;
+   u32 screen_depth;
+
+   /* hw */
+   struct resource *mem;
+   void __iomem *vram;
+   unsigned long fb_base;
+   unsigned long fb_size;
+   struct completion wait;
+   u32 synthvid_version;
+   u32 mmio_megabytes;
+
+   u8 init_buf[VMBUS_MAX_PACKET_SIZE];
+   u8 recv_buf[VMBUS_MAX_PACKET_SIZE];
+
+   struct hv_device *hdev;
+};
+
+#define to_hv(_dev) container_of(_dev, struct hyperv_drm_device, dev)
+
+/* hyperv_drm_modeset */
+int hyperv_mode_config_init(struct hyperv_drm_device *hv);
+
+/* hyperv_drm_proto */
+int hyperv_update_vram_location(struct hv_device *hdev, phys_addr_t vram_pp);
+int hyperv_update_situation(struct hv_device *hdev, u8 active, u32 bpp,
+   u32 w, u32 h, u32 pitch);
+int hyperv_update_dirt(struct hv_device *hdev, struct drm_rect *rect);
+int hyperv_connect_vsp(struct hv_device *hdev);
+
+#endif
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c 
b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
new file mode 100644
index ..c346dc7544aa
--- /dev/null
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
@@ -0,0 +1,307 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2021 Microsoft
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hyperv_drm.h"
+
+#define DRIVER_NAME "hyperv_drm"
+#define DRIVER_DESC "DRM 

Re: [PATCH v4 1/3] drm/dp_mst: Add self-tests for up requests

2021-05-19 Thread Lyude Paul
On Wed, 2021-05-19 at 13:51 +1000, Sam McNally wrote:
> On Wed, 19 May 2021 at 08:01, Lyude Paul  wrote:
> > 
> > Looks like these tests might still need to be fixed up a bit:
> > 
> > [   34.785042]  (null): [drm:drm_dp_sideband_parse_req [drm_kms_helper]]
> > connection status reply parse length fail 2 1
> > [   34.785082]  (null): [drm:drm_dp_sideband_parse_req [drm_kms_helper]]
> > resource status reply parse length fail 2 1
> > [   34.785114]  (null): [drm:drm_dp_sideband_parse_req [drm_kms_helper]]
> > sink event notify parse length fail 2 1
> > [   34.785146]  (null): [drm] *ERROR* Got unknown request 0x23
> > (REMOTE_I2C_WRITE)
> > 
> 
> Those are expected parse failures - testing that parse rejects
> messages that are too short or are unsupported. I'll set the mock
> device name to make this clearer in the next version, producing
> logging like:
> [   25.163682]  [drm_dp_mst_helper] expected parse failure:
> [drm:drm_dp_sideband_parse_req] connection status reply parse length
> fail 2 1
> [   25.163706]  [drm_dp_mst_helper] expected parse failure:
> [drm:drm_dp_sideband_parse_req] resource status reply parse length
> fail 2 1
> [   25.163719]  [drm_dp_mst_helper] expected parse failure:
> [drm:drm_dp_sideband_parse_req] sink event notify parse length fail 2
> 1
> [   25.163730]  [drm_dp_mst_helper] expected parse failure: [drm]
> *ERROR* Got unknown request 0x23 (REMOTE_I2C_WRITE)
> 

sgtm

> > 
> > On Tue, 2021-05-18 at 22:35 +1000, Sam McNally wrote:
> > Up requests are decoded by drm_dp_sideband_parse_req(), which operates
> > on a drm_dp_sideband_msg_rx, unlike down requests. Expand the existing
> > self-test helper sideband_msg_req_encode_decode() to copy the message
> > contents and length from a drm_dp_sideband_msg_tx to
> > drm_dp_sideband_msg_rx and use the parse function under test in place of
> > decode.
> > 
> > Add support for currently-supported up requests to
> > drm_dp_dump_sideband_msg_req_body(); add support to
> > drm_dp_encode_sideband_req() to allow encoding for the self-tests.
> > 
> > Add self-tests for CONNECTION_STATUS_NOTIFY and RESOURCE_STATUS_NOTIFY.
> > 
> > Signed-off-by: Sam McNally 
> > ---
> > 
> > Changes in v4:
> > - New in v4
> > 
> >  drivers/gpu/drm/drm_dp_mst_topology.c |  54 ++-
> >  .../gpu/drm/drm_dp_mst_topology_internal.h    |   4 +
> >  .../drm/selftests/test-drm_dp_mst_helper.c    | 147 --
> >  3 files changed, 190 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 54604633e65c..573f39a3dc16 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -442,6 +442,37 @@ drm_dp_encode_sideband_req(const struct
> > drm_dp_sideband_msg_req_body *req,
> >     idx++;
> >     }
> >     break;
> > +   case DP_CONNECTION_STATUS_NOTIFY: {
> > +   const struct drm_dp_connection_status_notify *msg;
> > +
> > +   msg = >u.conn_stat;
> > +   buf[idx] = (msg->port_number & 0xf) << 4;
> > +   idx++;
> > +   memcpy(>msg[idx], msg->guid, 16);
> > +   idx += 16;
> > +   raw->msg[idx] = 0;
> > +   raw->msg[idx] |= msg->legacy_device_plug_status ? BIT(6) :
> > 0;
> > +   raw->msg[idx] |= msg->displayport_device_plug_status ?
> > BIT(5) :
> > 0;
> > +   raw->msg[idx] |= msg->message_capability_status ? BIT(4) :
> > 0;
> > +   raw->msg[idx] |= msg->input_port ? BIT(3) : 0;
> > +   raw->msg[idx] |= FIELD_PREP(GENMASK(2, 0), msg-
> > > peer_device_type);
> > +   idx++;
> > +   break;
> > +   }
> > +   case DP_RESOURCE_STATUS_NOTIFY: {
> > +   const struct drm_dp_resource_status_notify *msg;
> > +
> > +   msg = >u.resource_stat;
> > +   buf[idx] = (msg->port_number & 0xf) << 4;
> > +   idx++;
> > +   memcpy(>msg[idx], msg->guid, 16);
> > +   idx += 16;
> > +   buf[idx] = (msg->available_pbn & 0xff00) >> 8;
> > +   idx++;
> > +   buf[idx] = (msg->available_pbn & 0xff);
> > +   idx++;
> > +   break;
> > +   }
> >     }
> >     raw->cur_len = idx;
> >  }
> > @@ -672,6 +703,22 @@ drm_dp_dump_sideband_msg_req_body(const struct
> > drm_dp_sideband_msg_req_body *req
> >   req->u.enc_status.stream_behavior,
> >   req->u.enc_status.valid_stream_behavior);
> >     break;
> > +   case DP_CONNECTION_STATUS_NOTIFY:
> > +   P("port=%d guid=%*ph legacy=%d displayport=%d messaging=%d
> > input=%d peer_type=%d",
> > + req->u.conn_stat.port_number,
> > + (int)ARRAY_SIZE(req->u.conn_stat.guid), req-
> > >u.conn_stat.guid,
> > + 

Re: [PATCH] drm/i915: Force DPCD backlight mode for Samsung 16727 panel

2021-05-19 Thread Lyude Paul
Seems reasonable to me:

Reviewed-by: Lyude Paul 

On Wed, 2021-05-19 at 17:53 +0800, Aaron Ma wrote:
> Another Samsung OLED panel needs DPCD to get control of backlight.
> Kernel 5.12+ support the backlight via:
> commit: <4a8d79901d5b> ("drm/i915/dp: Enable Intel's HDR backlight interface
> (only SDR for now)")
> Only make backlight work on lower versions of kernel.
> 
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3474
> Cc: sta...@vger.kernel.org # 5.11-
> Signed-off-by: Aaron Ma 
> ---
>  drivers/gpu/drm/drm_dp_helper.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 5bd0934004e3..7b91d8a76cd6 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -1960,6 +1960,7 @@ static const struct edid_quirk edid_quirk_list[] = {
> { MFG(0x4d, 0x10), PROD_ID(0xe6, 0x14),
> BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
> { MFG(0x4c, 0x83), PROD_ID(0x47, 0x41),
> BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
> { MFG(0x09, 0xe5), PROD_ID(0xde, 0x08),
> BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
> +   { MFG(0x4c, 0x83), PROD_ID(0x57, 0x41),
> BIT(DP_QUIRK_FORCE_DPCD_BACKLIGHT) },
>  };
>  
>  #undef MFG

-- 
Sincerely,
   Lyude Paul (she/her)
   Software Engineer at Red Hat
   
Note: I deal with a lot of emails and have a lot of bugs on my plate. If you've
asked me a question, are waiting for a review/merge on a patch, etc. and I
haven't responded in a while, please feel free to send me another email to check
on my status. I don't bite!



Re: [PATCH 0/7] Per client engine busyness

2021-05-19 Thread Tvrtko Ursulin



On 18/05/2021 10:40, Tvrtko Ursulin wrote:


On 18/05/2021 10:16, Daniel Stone wrote:

Hi,

On Tue, 18 May 2021 at 10:09, Tvrtko Ursulin
 wrote:

I was just wondering if stat(2) and a chrdev major check would be a
solid criteria to more efficiently (compared to parsing the text
content) detect drm files while walking procfs.


Maybe I'm missing something, but is the per-PID walk actually a
measurable performance issue rather than just a bit unpleasant?


Per pid and per each open fd.

As said in the other thread what bothers me a bit in this scheme is that 
the cost of obtaining GPU usage scales based on non-GPU criteria.


For use case of a top-like tool which shows all processes this is a 
smaller additional cost, but then for a gpu-top like tool it is somewhat 
higher.


To further expand, not only cost would scale per pid multiplies per open 
fd, but to detect which of the fds are DRM I see these three options:


1) Open and parse fdinfo.
2) Name based matching ie /dev/dri/.. something.
3) Stat the symlink target and check for DRM major.

All sound quite sub-optimal to me.

Name based matching is probably the least evil on system resource usage 
(Keeping the dentry cache too hot? Too many syscalls?), even though 
fundamentally I don't it is the right approach.


What happens with dup(2) is another question.

Does anyone have any feedback on the /proc//gpu idea at all?

Regards,

Tvrtko


Re: [PATCH 1/3] gpu: drm: replace occurrences of invalid character

2021-05-19 Thread Randy Dunlap
On 5/19/21 1:15 AM, Mauro Carvalho Chehab wrote:
> There are some places at drm that ended receiving a
> REPLACEMENT CHARACTER U+fffd ('�'), probably because of
> some bad charset conversion.
> 
> Fix them by using what it seems   to be the proper
> character.
> 
> Signed-off-by: Mauro Carvalho Chehab 

Acked-by: Randy Dunlap 

Thanks.

> ---
>  drivers/gpu/drm/amd/include/atombios.h   | 10 +-
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.h|  2 +-
>  drivers/gpu/drm/r128/r128_drv.h  |  2 +-
>  4 files changed, 8 insertions(+), 8 deletions(-)
> 

-- 
~Randy



Re: [RFC] Add DMA_RESV_USAGE flags

2021-05-19 Thread Michel Dänzer
On 2021-05-19 5:21 p.m., Jason Ekstrand wrote:
> On Wed, May 19, 2021 at 5:52 AM Michel Dänzer  wrote:
>>
>> On 2021-05-19 12:06 a.m., Jason Ekstrand wrote:
>>> On Tue, May 18, 2021 at 4:17 PM Daniel Vetter  wrote:

 On Tue, May 18, 2021 at 7:40 PM Christian König
  wrote:
>
> Am 18.05.21 um 18:48 schrieb Daniel Vetter:
>> On Tue, May 18, 2021 at 2:49 PM Christian König
>>  wrote:
>>
>>> And as long as we are all inside amdgpu we also don't have any oversync,
>>> the issue only happens when we share dma-bufs with i915 (radeon and
>>> AFAIK nouveau does the right thing as well).
>> Yeah because then you can't use the amdgpu dma_resv model anymore and
>> have to use the one atomic helpers use. Which is also the one that
>> e.g. Jason is threathening to bake in as uapi with his dma_buf ioctl,
>> so as soon as that lands and someone starts using it, something has to
>> adapt _anytime_ you have a dma-buf hanging around. Not just when it's
>> shared with another device.
>
> Yeah, and that is exactly the reason why I will NAK this uAPI change.
>
> This doesn't works for amdgpu at all for the reasons outlined above.

 Uh that's really not how uapi works. "my driver is right, everyone
 else is wrong" is not how cross driver contracts are defined. If that
 means a perf impact until you've fixed your rules, that's on you.

 Also you're a few years too late with nacking this, it's already uapi
 in the form of the dma-buf poll() support.
>>>
>>> ^^  My fancy new ioctl doesn't expose anything that isn't already
>>> there.  It just lets you take a snap-shot of a wait instead of doing
>>> an active wait which might end up with more fences added depending on
>>> interrupts and retries.  The dma-buf poll waits on all fences for
>>> POLLOUT and only the exclusive fence for POLLIN.  It's already uAPI.
>>
>> Note that the dma-buf poll support could be useful to Wayland compositors 
>> for the same purpose as Jason's new ioctl (only using client buffers which 
>> have finished drawing for an output frame, to avoid missing a refresh cycle 
>> due to client drawing), *if* it didn't work differently with amdgpu.
>>
>> Am I understanding correctly that Jason's new ioctl would also work 
>> differently with amdgpu as things stand currently? If so, that would be a 
>> real bummer and might hinder adoption of the ioctl by Wayland compositors.
> 
> My new ioctl has identical semantics to poll().  It just lets you take
> a snapshot in time to wait on later instead of waiting on whatever
> happens to be set right now.  IMO, having identical semantics to
> poll() isn't something we want to change.

Agreed.

I'd argue then that making amdgpu poll semantics match those of other drivers 
is a pre-requisite for the new ioctl, otherwise it seems unlikely that the 
ioctl will be widely adopted.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2] backlight: Kconfig whitespace and indentation cleanups

2021-05-19 Thread Daniel Thompson
On Wed, May 19, 2021 at 01:03:00PM +0200, Juerg Haefliger wrote:
> Remove leading whitespaces, replace multi spaces with tabs, and fix help
> text indentation.
> 
> Signed-off-by: Juerg Haefliger 

Reviewed-by: Daniel Thompson 


Daniel.

> ---
>  drivers/video/backlight/Kconfig | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig
> index d83c87b902c1..c887338de386 100644
> --- a/drivers/video/backlight/Kconfig
> +++ b/drivers/video/backlight/Kconfig
> @@ -128,12 +128,12 @@ config LCD_HX8357
> If you have a HX-8357 LCD panel, say Y to enable its LCD control
> driver.
>  
> -  config LCD_OTM3225A
> - tristate "ORISE Technology OTM3225A support"
> - depends on SPI
> - help
> -   If you have a panel based on the OTM3225A controller
> -   chip then say y to include a driver for it.
> +config LCD_OTM3225A
> + tristate "ORISE Technology OTM3225A support"
> + depends on SPI
> + help
> +   If you have a panel based on the OTM3225A controller
> +   chip then say y to include a driver for it.
>  
>  endif # LCD_CLASS_DEVICE
>  
> @@ -269,11 +269,11 @@ config BACKLIGHT_MAX8925
> WLED output, say Y here to enable this driver.
>  
>  config BACKLIGHT_APPLE
> -   tristate "Apple Backlight Driver"
> -   depends on X86 && ACPI
> -   help
> -  If you have an Intel-based Apple say Y to enable a driver for its
> -  backlight.
> + tristate "Apple Backlight Driver"
> + depends on X86 && ACPI
> + help
> +   If you have an Intel-based Apple say Y to enable a driver for its
> +   backlight.
>  
>  config BACKLIGHT_TOSA
>   tristate "Sharp SL-6000 Backlight Driver"
> -- 
> 2.27.0
> 


Re: [RFC] Add DMA_RESV_USAGE flags

2021-05-19 Thread Jason Ekstrand
On Wed, May 19, 2021 at 6:43 AM Christian König
 wrote:
>
> Am 19.05.21 um 00:06 schrieb Jason Ekstrand:
> > [SNIP]
> >>> E.g. we can't add a fence which doesn't wait for the exclusive one as
> >>> shared.
> >> Ok I think that's a real problem, and  guess it's also related to all
> >> the ttm privatization tricks and all that. So essentially we'd need
> >> the opposite of ttm_bo->moving, as in you can't ignore it, but
> >> otherwise it completely ignores all the userspace implicit fence
> >> stuff.
> > Would you mind explaining it to the rest of the class?  I get the need
> > to do a TLB flush after a BO is removed from the processes address
> > space and I get that it may be super-heavy and that it has to be
> > delayed.  I also get that the driver needs to hold a reference to the
> > underlying pages until that TLB flush is done.  What I don't get is
> > what this has to do with the exclusive fence.  Why can't the driver
> > just gather up all the dma_resv fences on the current object (or,
> > better yet, just the ones from the current amdgpu process) and wait on
> > them all?  Why does it need to insert an exclusive fence that then
> > clogs up the whole works?
>
> Because we have mixed up resource management with implicit syncing.
>
> When I sum up all fences in (for example) a dma_fence_array container
> and add that as explicit fence to the dma_resv object resource
> management will do what I want and wait for everything to finish before
> moving or freeing the buffer. But implicit sync will just horrible over
> sync and wait for stuff it shouldn't wait for in the first place.
>
> When I add the fence as shared fence I can run into the problem the the
> TLB flush might finish before the exclusive fence. Which is not allowed
> according to the DMA-buf fencing rules.

I'm starting to feel a bit dense here, sorry...  So the problem is
that the TLB flush really wants to just wait on memory management
fences and not implicit sync fences?  Or is it that you need to wait
on the exclusive fence in case it actually matters but you don't want
to if it was stuffed in there for implicit sync and doesn't have any
memory implications?  Also, how bad is it for the TLB flush to come in
late?  Is other stuff blocking on it?

> We currently have some rather crude workarounds to make use cases like
> this work as expected. E.g. by using a
> dma_fence_chain()/dma_fence_array() and/or adding the explusive fence to
> the shared fences etc etc...
>
> >>> Let's say that you have a buffer which is shared between two drivers A
> >>> and B and let's say driver A has thrown a fence on it just to ensure
> >>> that the BO doesn't get swapped out to disk until it's at a good
> >>> stopping point.  Then driver B comes along and wants to throw a
> >>> write-fence on it.  Suddenly, your memory fence from driver A causes
> >>> driver B to have to stall waiting for a "good" time to throw in a
> >>> fence.  It sounds like this is the sort of scenario that Christian is
> >>> running into.  And, yes, with certain Vulkan drivers being a bit
> >>> sloppy about exactly when they throw in write fences, I could see it
> >>> being a real problem.
> >> Yes this is a potential problem, and on the i915 side we need to do
> >> some shuffling here most likely. Especially due to discrete, but the
> >> problem is pre-existing. tbh I forgot about the implications here
> >> until I pondered this again yesterday evening.
> >>
> >> But afaiui the amdgpu code and winsys in mesa, this isn't (yet) the
> >> problem amd vk drivers have. The issue is that with amdgpu, all you
> >> supply are the following bits at CS time:
> >> - list of always mapped private buffers, which is implicit and O(1) in
> >> the kernel fastpath
> >> - additional list of shared buffers that are used by the current CS
> >>
> >> I didn't check how exactly that works wrt winsys buffer ownership, but
> >> the thing is that on the kernel side _any_ buffer in there is treated
> >> as a implicit sync'ed write. Which means if you render your winsys
> >> with a bunch of command submission split over 3d and compute pipes,
> >> you end up with horrendous amounts of oversync.
> > What are you talking about? We have no sync at all for submissions from
> > the same client.
>  Yes. Except when the buffer is shared with another driver, at which
>  point you sync a _lot_ and feel the pain.
> >>> Yes, exactly that's the problem.
> >>>
> >>> We basically don't know during CS if a BO is shared or not.
> >>>
> >>> We do know that during importing or exporting the BO thought.
> >> No you don't. Or at least that's massively awkward, see Jason's reply.
> > Please.  In Vulkan, we know explicitly whether or not any BO will ever
> > be shared and, if a BO is ever flagged as shared even though it's not,
> > that's the app being stupid and they can eat the perf hit.
>
> Yeah, that's not a 

Re: [Intel-gfx] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-19 Thread Tvrtko Ursulin



On 19/05/2021 12:10, Daniel Vetter wrote:

On Tue, May 18, 2021 at 04:58:30PM -0700, Matthew Brost wrote:

Add entry fpr i915 new parallel submission uAPI plan.

v2:
  (Daniel Vetter):
   - Expand logical order explaination
   - Add dummy header
   - Only allow N BBs in execbuf IOCTL
   - Configure parallel submission per slot not per gem context

Cc: Tvrtko Ursulin 
Cc: Tony Ye 
CC: Carl Zhang 
Cc: Daniel Vetter 
Cc: Jason Ekstrand 
Signed-off-by: Matthew Brost 
---
  Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 ++
  Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
  2 files changed, 196 insertions(+), 1 deletion(-)
  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h

diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
b/Documentation/gpu/rfc/i915_parallel_execbuf.h
new file mode 100644
index ..8c64b983ccad
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
@@ -0,0 +1,144 @@
+#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
i915_context_engines_parallel_submit */
+
+/*
+ * i915_context_engines_parallel_submit:
+ *
+ * Setup a slot to allow multiple BBs to be submitted in a single execbuf 
IOCTL.
+ * Those BBs will then be scheduled to run on the GPU in parallel. Multiple
+ * hardware contexts are created internally in the i915 run these BBs. Once a
+ * slot is configured for N BBs only N BBs can be submitted in each execbuf
+ * IOCTL and this is implict behavior (e.g. the user doesn't tell the execbuf
+ * IOCTL there are N BBs, the execbuf IOCTL know how many BBs there are based 
on
+ * the slots configuration).
+ *
+ * Their are two currently defined ways to control the placement of the
+ * hardware contexts on physical engines: default behavior (no flags) and
+ * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the
+ * future as new hardware / use cases arise. Details of how to use this
+ * interface below above the flags.
+ *
+ * Returns -EINVAL if hardware context placement configuration invalid or if 
the
+ * placement configuration isn't supported on the platform / submission
+ * interface.
+ * Returns -ENODEV if extension isn't supported on the platform / submission
+ * inteface.
+ */
+struct i915_context_engines_parallel_submit {
+   struct i915_user_extension base;
+
+   __u16 engine_index; /* slot for parallel engine */
+   __u16 width;/* number of contexts per parallel engine */
+   __u16 num_siblings; /* number of siblings per context */
+   __u16 mbz16;


Ok the big picture looks reasonable now, the flags still confuse me.


+/*
+ * Default placement behvavior (currently unsupported):
+ *
+ * Rather than restricting parallel submission to a single class with a
+ * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a mode 
that
+ * enables parallel submission across multiple engine classes. In this case 
each
+ * context's logical engine mask indicates where that context can placed. It is
+ * implied in this mode that all contexts have mutual exclusive placement (e.g.
+ * if one context is running CS0 no other contexts can run on CS0).
+ *
+ * Example 1 pseudo code:
+ * CSX[Y] = engine class X, logical instance Y
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID)
+ * set_parallel(engine_index=0, width=2, num_siblings=2,
+ * engines=CS0[0],CS0[1],CS1[0],CS1[1])
+ *
+ * Results in the following valid placements:
+ * CS0[0], CS1[0]
+ * CS0[0], CS1[1]
+ * CS0[1], CS1[0]
+ * CS0[1], CS1[1]
+ *
+ * This can also be though of as 2 virtual engines:
+ * VE[0] = CS0[0], CS0[1]
+ * VE[1] = CS1[0], CS1[1]
+ *
+ * Example 2 pseudo code:
+ * CS[X] = generic engine of same class, logical instance X
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID)
+ * set_parallel(engine_index=0, width=2, num_siblings=3,
+ * engines=CS[0],CS[1],CS[2],CS[0],CS[1],CS[2])
+ *
+ * Results in the following valid placements:
+ * CS[0], CS[1]
+ * CS[0], CS[2]
+ * CS[1], CS[0]
+ * CS[1], CS[2]
+ * CS[2], CS[0]
+ * CS[2], CS[1]
+ *
+ *
+ * This can also be though of as 2 virtual engines:
+ * VE[0] = CS[0], CS[1], CS[2]
+ * VE[1] = CS[0], CS[1], CS[2]
+
+ * This enables a use case where all engines are created equally, we don't care
+ * where they are scheduled, we just want a certain number of resources, for
+ * those resources to be scheduled in parallel, and possibly across multiple
+ * engine classes.
+ */


So I don't really get what this does compared to setting the flag below.
Is this just about running the batchbuffers the wrong way round, i.e. if
you have (simplest case)

width=2, num_sibglings=1, engines=CS[0], CS[1]

Then both
CS[0], CS[1]
and
CS[1], CS[0]
are possible options for running 2 batches? Iow, the backend is allowed to
run the batchbuffers the wrong way round, which gains us nothing, since we
assume the batches take equally long 

Re: [RFC] Add DMA_RESV_USAGE flags

2021-05-19 Thread Jason Ekstrand
On Wed, May 19, 2021 at 5:52 AM Michel Dänzer  wrote:
>
> On 2021-05-19 12:06 a.m., Jason Ekstrand wrote:
> > On Tue, May 18, 2021 at 4:17 PM Daniel Vetter  wrote:
> >>
> >> On Tue, May 18, 2021 at 7:40 PM Christian König
> >>  wrote:
> >>>
> >>> Am 18.05.21 um 18:48 schrieb Daniel Vetter:
>  On Tue, May 18, 2021 at 2:49 PM Christian König
>   wrote:
> 
> > And as long as we are all inside amdgpu we also don't have any oversync,
> > the issue only happens when we share dma-bufs with i915 (radeon and
> > AFAIK nouveau does the right thing as well).
>  Yeah because then you can't use the amdgpu dma_resv model anymore and
>  have to use the one atomic helpers use. Which is also the one that
>  e.g. Jason is threathening to bake in as uapi with his dma_buf ioctl,
>  so as soon as that lands and someone starts using it, something has to
>  adapt _anytime_ you have a dma-buf hanging around. Not just when it's
>  shared with another device.
> >>>
> >>> Yeah, and that is exactly the reason why I will NAK this uAPI change.
> >>>
> >>> This doesn't works for amdgpu at all for the reasons outlined above.
> >>
> >> Uh that's really not how uapi works. "my driver is right, everyone
> >> else is wrong" is not how cross driver contracts are defined. If that
> >> means a perf impact until you've fixed your rules, that's on you.
> >>
> >> Also you're a few years too late with nacking this, it's already uapi
> >> in the form of the dma-buf poll() support.
> >
> > ^^  My fancy new ioctl doesn't expose anything that isn't already
> > there.  It just lets you take a snap-shot of a wait instead of doing
> > an active wait which might end up with more fences added depending on
> > interrupts and retries.  The dma-buf poll waits on all fences for
> > POLLOUT and only the exclusive fence for POLLIN.  It's already uAPI.
>
> Note that the dma-buf poll support could be useful to Wayland compositors for 
> the same purpose as Jason's new ioctl (only using client buffers which have 
> finished drawing for an output frame, to avoid missing a refresh cycle due to 
> client drawing), *if* it didn't work differently with amdgpu.
>
> Am I understanding correctly that Jason's new ioctl would also work 
> differently with amdgpu as things stand currently? If so, that would be a 
> real bummer and might hinder adoption of the ioctl by Wayland compositors.

My new ioctl has identical semantics to poll().  It just lets you take
a snapshot in time to wait on later instead of waiting on whatever
happens to be set right now.  IMO, having identical semantics to
poll() isn't something we want to change.

--Jason


Re: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Felix Kuehling
Looks like we're creating the userptr BO as ttm_bo_type_device. I guess
we should be using ttm_bo_type_sg? BTW, amdgpu_gem_userptr_ioctl also
uses ttm_bo_type_device.

Regards,
  Felix


Am 2021-05-19 um 6:01 a.m. schrieb Christian König:
> I'm scratching my head how that is even possible.
>
> See when a BO is created in the system domain it is just an empty
> hull, e.g. without backing store and allocated pages.
>
> So the swapout function will just ignore it.
>
> Christian.
>
> Am 19.05.21 um 07:07 schrieb Pan, Xinhui:
>> [AMD Official Use Only]
>>
>> I have reverted Chris'  patch, still hit this failure.
>> Just see two lines in Chris' patch. Any BO in cpu domian would be
>> swapout first. That is why we hit this issue frequently now. But the
>> bug is there long time ago.
>>
>> -   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>> -   list_for_each_entry(bo, >swap_lru[i], swap) {
>> [snip]
>> +   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
>> +   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
>>
>>
>> 
>> 发件人: Pan, Xinhui 
>> 发送时间: 2021年5月19日 12:09
>> 收件人: Kuehling, Felix; amd-...@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian;
>> dri-devel@lists.freedesktop.org; dan...@ffwll.ch
>> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to
>> swapout and swapin
>>
>> yes, we really dont swapout SG BOs.
>> The problems is that before we validate a userptr BO, we create this
>> BO in CPU domain by default. So this BO has chance to swapout.
>>
>> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too
>> late.
>> I have not try to revert Chris' patch as I think it desnt help. Or I
>> can have a try later.
>>
>> 
>> 发件人: Kuehling, Felix 
>> 发送时间: 2021年5月19日 11:29
>> 收件人: Pan, Xinhui; amd-...@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian;
>> dri-devel@lists.freedesktop.org; dan...@ffwll.ch
>> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to
>> swapout and swapin
>>
>> Swapping SG BOs makes no sense, because TTM doesn't own the pages of
>> this type of BO.
>>
>> Last I checked, userptr BOs (and other SG BOs) were protected from
>> swapout by the fact that they would not be added to the swap-LRU. But it
>> looks like Christian just removed the swap-LRU. I guess this broke that
>> protection:
>>
>> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
>> Author: Christian König 
>> Date:   Tue Oct 6 16:30:09 2020 +0200
>>
>>   drm/ttm: remove swap LRU v3
>>
>>   Instead evict round robin from each devices SYSTEM and TT domain.
>>
>>   v2: reorder num_pages access reported by Dan's script
>>   v3: fix rebase fallout, num_pages should be 32bit
>>
>>   Signed-off-by: Christian König 
>>   Tested-by: Nirmoy Das 
>>   Reviewed-by: Huang Rui 
>>   Reviewed-by: Matthew Auld 
>>   Link: https://patchwork.freedesktop.org/patch/424009/
>>
>> Regards,
>>     Felix
>>
>>
>> On 2021-05-18 10:28 p.m., xinhui pan wrote:
>>> cpu 1   cpu 2
>>> kfd alloc BO A(userptr) alloc BO B(GTT)
>>>   ->init -> validate   -> init ->
>>> validate -> populate
>>>   init_user_pages    -> swapout BO A
>>> //hit ttm pages limit
>>>    -> get_user_pages (fill up ttm->pages)
>>>     -> validate -> populate
>>>     -> swapin BO A // Now hit the BUG
>>>
>>> We know that get_user_pages may race with swapout on same BO.
>>> Threre are some issues I have met.
>>> 1) memory corruption.
>>> This is because we do a swap before memory is setup. ttm_tt_swapout()
>>> just create a swap_storage with its content being 0x0. So when we setup
>>> memory after the swapout. The following swapin makes the memory
>>> corrupted.
>>>
>>> 2) panic
>>> When swapout happes with get_user_pages, they touch ttm->pages without
>>> anylock. It causes memory corruption too. But I hit page fault mostly.
>>>
>>> Signed-off-by: xinhui pan 
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16
>>> +++-
>>>    1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> index 928e8d57cd08..42460e4480f8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> @@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem,
>>> uint64_t user_addr)
>>>    struct amdkfd_process_info *process_info = mem->process_info;
>>>    struct amdgpu_bo *bo = mem->bo;
>>>    struct ttm_operation_ctx ctx = { true, false };
>>> + struct page **pages;
>>>    int ret = 0;
>>>
>>>    mutex_lock(_info->lock);
>>> @@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem,
>>> uint64_t user_addr)
>>>   

Re: [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"

2021-05-19 Thread Jason Ekstrand
Once we no longer rely on error propagation, I think there's a lot we
can rip out.

--Jason

On Wed, May 19, 2021 at 5:15 AM Daniel Vetter  wrote:
>
> From: Jason Ekstrand 
>
> This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
> since that commit, we've been having issues where a hang in one client
> can propagate to another.  In particular, a hang in an app can propagate
> to the X server which causes the whole desktop to lock up.
>
> Error propagation along fences sound like a good idea, but as your bug
> shows, surprising consequences, since propagating errors across security
> boundaries is not a good thing.
>
> What we do have is track the hangs on the ctx, and report information to
> userspace using RESET_STATS. That's how arb_robustness works. Also, if my
> understanding is still correct, the EIO from execbuf is when your context
> is banned (because not recoverable or too many hangs). And in all these
> cases it's up to userspace to figure out what is all impacted and should
> be reported to the application, that's not on the kernel to guess and
> automatically propagate.
>
> What's more, we're also building more features on top of ctx error
> reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
> userspace fence wait also relies on that mechanism. So it is the path
> going forward for reporting gpu hangs and resets to userspace.
>
> So all together that's why I think we should just bury this idea again as
> not quite the direction we want to go to, hence why I think the revert is
> the right option here.Signed-off-by: Jason Ekstrand 
>
> v2: Augment commit message. Also restore Jason's sob that I
> accidentally lost.
>
> Signed-off-by: Jason Ekstrand  (v1)
> Reported-by: Marcin Slusarz 
> Cc:  # v5.6+
> Cc: Jason Ekstrand 
> Cc: Marcin Slusarz 
> Cc: Jon Bloomfield 
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled 
> fences")
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/i915/i915_request.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c 
> b/drivers/gpu/drm/i915/i915_request.c
> index 970d8f4986bb..b796197c0772 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
>
> do {
> fence = *child++;
> -   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
> -   i915_sw_fence_set_error_once(>submit, 
> fence->error);
> +   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
> continue;
> -   }
>
> if (fence->context == rq->fence.context)
> continue;
> @@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, 
> struct dma_fence *fence)
>
> do {
> fence = *child++;
> -   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags)) {
> -   i915_sw_fence_set_error_once(>submit, 
> fence->error);
> +   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags))
> continue;
> -   }
>
> /*
>  * Requests on the same timeline are explicitly ordered, along
> --
> 2.31.0
>


Re: [PATCH 1/2] drm/i915/cmdparser: No-op failed batches on all platforms

2021-05-19 Thread Jason Ekstrand
On Wed, May 19, 2021 at 2:43 AM Daniel Vetter  wrote:
>
> On gen9 for blt cmd parser we relied on the magic fence error
> propagation which:
> - doesn't work on gen7, because there's no scheduler with ringbuffers
>   there yet
> - fence error propagation can be weaponized to attack other things, so
>   not a good design idea
>
> Instead of magic, do the same thing on gen9 as on gen7.

I think the commit message could be improved.  Maybe something like this?

When we re-introduced the command parser on Gen9 platforms to protect
against BLT CS register writes, we did things a bit differently than
on previous platforms.  On Gen7 platforms, if a batch contains
unsupported commands, we smash the start of the shadow batch to
MI_BATCH_BUFFER_END to cancel the batch.  If it's mostly ok
(-EACCESS), we trampoline to run in unprivileged mode and let the
limited HW parser handle security.  On Gen9, we only care about
rejecting batches because we don't trust the HW parser for a few cases
so we don't need this second trampoline case.

However, instead of stopping there and avoiding the trampoline, we
chose to avoid executing the new batch all together on Gen9 by use of
dma-fence error propagation.  When the batch parser fails, it returns
a non-zero error and we would propgate that through the chain of
fences and trust the scheduler to know to cancel anything dependent on
a fence with an error.  However, fence error propagation is sketchy at
best and can be weaponized to attack other things so it's not really a
good design.  This commit restores a bit of the Gen7 functionality on
Gen9 (smashing the start of the shadow batch to MI_BB_END) so that
it's always safe to run the batch post-parser.  A later commit will
get rid of the error propagation nonsense.

>
> Kudos to Jason for figuring this out.
>
> Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled 
> fences")
> Cc:  # v5.6+
> Cc: Jason Ekstrand 
> Cc: Marcin Slusarz 
> Cc: Jon Bloomfield 
> Relates: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +-
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 5b4b2bd46e7c..2d3336ab7ba3 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -1509,6 +1509,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
> *engine,
> }
> }
>
> +   /* Batch unsafe to execute with privileges, cancel! */
> +   if (ret) {
> +   cmd = page_mask_bits(shadow->obj->mm.mapping);
> +   *cmd = MI_BATCH_BUFFER_END;
> +   }
> +
> if (trampoline) {
> /*
>  * With the trampoline, the shadow is executed twice.
> @@ -1524,26 +1530,20 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
> *engine,
>  */
> *batch_end = MI_BATCH_BUFFER_END;

Bit of a bike shed but, given the new structure of the code, I think
it makes it more clear if we do

if (ret == -EACCESS) {
   /* stuff */
   __gen6_emit_bb_start(...);
} else {
   *batch_end = MI_BATCH_BUFFER_END;
}

That way it's clear that we're making a choice between firing off the
client batch in privileged mode and ending early.

>
> -   if (ret) {
> -   /* Batch unsafe to execute with privileges, cancel! */
> -   cmd = page_mask_bits(shadow->obj->mm.mapping);
> -   *cmd = MI_BATCH_BUFFER_END;
> +   /* If batch is unsafe but valid, jump to the original */
> +   if (ret == -EACCES) {
> +   unsigned int flags;
>
> -   /* If batch is unsafe but valid, jump to the original 
> */
> -   if (ret == -EACCES) {
> -   unsigned int flags;
> +   flags = MI_BATCH_NON_SECURE_I965;
> +   if (IS_HASWELL(engine->i915))
> +   flags = MI_BATCH_NON_SECURE_HSW;
>
> -   flags = MI_BATCH_NON_SECURE_I965;
> -   if (IS_HASWELL(engine->i915))
> -   flags = MI_BATCH_NON_SECURE_HSW;
> +   GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
> +   __gen6_emit_bb_start(batch_end,
> +batch_addr,
> +flags);
>
> -   GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
> -   __gen6_emit_bb_start(batch_end,
> -batch_addr,
> -flags);
> -
> -   ret = 0; /* allow execution */
> -  

Re: [Freedreno] [PATCH 0/4] drm/msm/dp: Add support for SC8180x eDP controller

2021-05-19 Thread Bjorn Andersson
On Tue 18 May 22:41 CDT 2021, abhin...@codeaurora.org wrote:

> Hi Bjorn
> 
> I had a quick glance on the series and before getting to other things wanted
> to know how you are initializing two different connectors for
> DP & EDP resp.
> 
> The connector type for DP should be DRM_MODE_CONNECTOR_DisplayPort and eDP
> should be DRM_MODE_CONNECTOR_eDP.

As far as I've been able to conclude there is no eDP support in the
upstream DPU driver; an encoder of type DRM_MODE_ENCODER_TMDS will only
attach to INTF_DP.

> We need both to be created so that both EDP and DP can be supported
> concurrently.
> 

Further more the DP controller driver has a global variable to track
state and the INTF-picker will always pick the interface of index 0 when
setting up the DP controller.

> Will these changes work for concurrent eDP and DP case?
> 

The proposed changes are all that I need to get eDP working on my
sc8180x laptop. But the DPU code does not currently support more than a
single DP interface - and that has to be on the first INTF_DP that the
DPU driver knows about.

But this is a limitation we should fix, rather than claiming that you
can only have one of each. Further more, afaict the sc7280 DP controller
can do both DP and eDP, so it would make sense not to distinguish the
interfaces as eDP or DP - just because the product in mind will use eDP.


PS. I've currently disabled the eDP interface on my laptop and am
working on trying to get Type-C DP working. Once that's in place I'd
need a better INTF/encoder picker - because the current model of just
picking INTF_DP 0 (or in a sequential fashion) won't work.

Regards,
Bjorn

> Thanks
> 
> Abhinav
> 
> On 2021-05-10 21:20, Bjorn Andersson wrote:
> > The first patch in the series is somewhat unrelated to the support, but
> > simplifies reasoning and debugging of timing related issues.
> > 
> > The second patch introduces support for dealing with different register
> > block
> > layouts, which is used in the forth patch to describe the hardware
> > blocks found
> > in the SC8180x eDP block.
> > 
> > The third patch configures the INTF_CONFIG register, which carries the
> > configuration for widebus handling. As with the DPU the bootloader
> > enables
> > widebus and we need to disable it, or implement support for adjusting
> > the
> > timing.
> > 
> > Bjorn Andersson (4):
> >   drm/msm/dp: Simplify the mvid/nvid calculation
> >   drm/msm/dp: Store each subblock in the io region
> >   drm/msm/dp: Initialize the INTF_CONFIG register
> >   drm/msm/dp: Add support for SC8180x eDP
> > 
> >  drivers/gpu/drm/msm/dp/dp_catalog.c | 99 +++--
> >  drivers/gpu/drm/msm/dp/dp_display.c |  1 +
> >  drivers/gpu/drm/msm/dp/dp_parser.c  | 22 +++
> >  drivers/gpu/drm/msm/dp/dp_parser.h  |  8 +++
> >  4 files changed, 53 insertions(+), 77 deletions(-)


Re: [PATCH v7 4/4] backlight: rt4831: Adds support for Richtek RT4831 backlight

2021-05-19 Thread Lee Jones
On Mon, 17 May 2021, cy_huang wrote:

> From: ChiYuan Huang 
> 
> Adds support for Richtek RT4831 backlight.
> 
> Signed-off-by: ChiYuan Huang 
> Reviewed-by: Daniel Thompson 
> ---
> since v7
> - Fix typo 'common' to 'commonly' in Kconfig description.
> ---
>  drivers/video/backlight/Kconfig|   8 ++
>  drivers/video/backlight/Makefile   |   1 +
>  drivers/video/backlight/rt4831-backlight.c | 203 
> +
>  3 files changed, 212 insertions(+)
>  create mode 100644 drivers/video/backlight/rt4831-backlight.c

Applied, thanks.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v7 3/4] mfd: rt4831: Adds DT binding document for Richtek RT4831

2021-05-19 Thread Lee Jones
On Mon, 17 May 2021, cy_huang wrote:

> From: ChiYuan Huang 
> 
> Adds DT binding document for Richtek RT4831.
> 
> Signed-off-by: ChiYuan Huang 
> Reviewed-by: Rob Herring 
> ---
>  .../devicetree/bindings/mfd/richtek,rt4831.yaml| 90 
> ++
>  1 file changed, 90 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mfd/richtek,rt4831.yaml

Applied, thanks.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v7 2/4] backlight: rt4831: Adds DT binding document for Richtek RT4831 backlight

2021-05-19 Thread Lee Jones
On Mon, 17 May 2021, cy_huang wrote:

> From: ChiYuan Huang 
> 
> Adds DT binding document for Richtek RT4831 backlight.
> 
> Signed-off-by: ChiYuan Huang 
> Reviewed-by: Daniel Thompson 
> ---
> since v7
> - Add allOf property refer to common.yaml.
> - Remove default-brightness/max-brightness description and refer string.
> ---
>  .../leds/backlight/richtek,rt4831-backlight.yaml   | 62 
> ++
>  include/dt-bindings/leds/rt4831-backlight.h| 23 
>  2 files changed, 85 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/leds/backlight/richtek,rt4831-backlight.yaml
>  create mode 100644 include/dt-bindings/leds/rt4831-backlight.h

Applied, thanks.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v7 1/4] mfd: rt4831: Adds support for Richtek RT4831

2021-05-19 Thread Lee Jones
On Mon, 17 May 2021, cy_huang wrote:

> From: ChiYuan Huang 
> 
> This adds support Richtek RT4831 core. It includes four channel WLED driver
> and Display Bias Voltage outputs.
> 
> Signed-off-by: ChiYuan Huang 
> ---
> - Send the patch series for the wrong mail subject.
> 
> The RT4831 regulator patches are already on main stream, and can be referred 
> to
> 9351ab8b0cb6 regulator: rt4831: Adds support for Richtek RT4831 DSV regulator
> 934b05e81862 regulator: rt4831: Adds DT binding document for Richtek RT4831 
> DSV regulator
> 
> since v6
> - Respin the date from 2020 to 2021.
> - Rmove the shutdown routine.
> - Change the macro OF_MFD_CELL to MFD_CELL_OF.
> 
> since v5
> - Rename file name from rt4831-core.c to rt4831.c
> - Change RICHTEK_VID to RICHTEK_VENDOR_ID.
> - Change gpio_desc nameing from 'enable' to 'enable_gpio' in probe.
> - Change variable 'val' to the meaningful name 'chip_id'.
> - Refine the error log when vendor id is not matched.
> - Remove of_match_ptr.
> 
> since v2
> - Refine Kconfig descriptions.
> - Add copyright.
> - Refine error logs in probe.
> - Refine comment lines in remove and shutdown.
> ---
>  drivers/mfd/Kconfig  |  10 +
>  drivers/mfd/Makefile |   1 +
>  drivers/mfd/rt4831.c | 115 
> +++
>  3 files changed, 126 insertions(+)
>  create mode 100644 drivers/mfd/rt4831.c

Applied, thanks.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[Bug 213145] AMDGPU resets, timesout and crashes after "*ERROR* Waiting for fences timed out!"

2021-05-19 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213145

--- Comment #2 from Tomas Gayoso (tgay...@gmail.com) ---
Created attachment 296871
  --> https://bugzilla.kernel.org/attachment.cgi?id=296871=edit
kernel configuration

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 213145] AMDGPU resets, timesout and crashes after "*ERROR* Waiting for fences timed out!"

2021-05-19 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213145

--- Comment #1 from Tomas Gayoso (tgay...@gmail.com) ---
Created attachment 296869
  --> https://bugzilla.kernel.org/attachment.cgi?id=296869=edit
lsmod  output

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 213145] New: AMDGPU resets, timesout and crashes after "*ERROR* Waiting for fences timed out!"

2021-05-19 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213145

Bug ID: 213145
   Summary: AMDGPU resets, timesout and crashes after "*ERROR*
Waiting for fences timed out!"
   Product: Drivers
   Version: 2.5
Kernel Version: 5.10.37 and 5.10.38
  Hardware: x86-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: high
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: tgay...@gmail.com
Regression: No

Created attachment 296867
  --> https://bugzilla.kernel.org/attachment.cgi?id=296867=edit
lspci output

AMDGPU driver crashes randomly corrupting screen and freezing X with:


[   60.449781] [drm:0xc25a7a57] *ERROR* Waiting for fences timed out!
[   60.971941] [drm:0xc25281ae] *ERROR* ring gfx timeout, signaled
seq=3658, emitted seq=3660
[   60.971946] [drm:0xc25281cb] *ERROR* Process information: process
Xorg pid 1192 thread Xorg:cs0 pid 1193
[   60.971952] amdgpu :05:00.0: amdgpu: GPU reset begin!

... (some output suppressed for clarity, look at attached dmesg, please). 

[   61.800343] amdgpu :05:00.0: amdgpu: recover vram bo from shadow start
[   61.800346] amdgpu :05:00.0: amdgpu: recover vram bo from shadow done
[   61.800348] [drm] Skip scheduling IBs!
[   61.800350] [drm] Skip scheduling IBs!
[   61.800382] [drm] Skip scheduling IBs!
[   61.800398] amdgpu :05:00.0: amdgpu: GPU reset(2) succeeded!
[   61.800566] [drm] Skip scheduling IBs!
[   61.800580] [drm] Skip scheduling IBs!
[   61.800627] [drm] Skip scheduling IBs!
[   61.801012] [drm] Skip scheduling IBs!
[   61.801024] [drm] Skip scheduling IBs!
[   61.801052] [drm] Skip scheduling IBs!
[   61.801062] [drm] Skip scheduling IBs!
[   61.801096] [drm] Skip scheduling IBs!
[   61.801105] [drm] Skip scheduling IBs!
[   61.801137] [drm] Skip scheduling IBs!
[   61.801806] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.808746] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.809392] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.809764] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.810389] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.810866] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.812529] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.813359] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.814770] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   61.816488] [drm:0xc24219b8] *ERROR* Failed to initialize parser
-125!
[   62.541982] ucsi_acpi USBC000:00: PPM init failed (-110)
[   67.004898] amdgpu_cs_ioctl: 1467 callbacks suppressed




Hardware: ASUS TUF506IU laptop (dual GPU Renoir and Nvidia GeForce GTX 1660 Ti
Mobile)

Detailed hardware in lspci.txt. 
Detailed modules in lsmod.txt. 
Kernle config attached in Kernel_config.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v1 1/1] backlight: lm3630a_bl: Put fwnode in error case during ->probe()

2021-05-19 Thread Lee Jones
On Mon, 10 May 2021, Andy Shevchenko wrote:

> device_for_each_child_node() bumps a reference counting of a returned 
> variable.
> We have to balance it whenever we return to the caller.
> 
> Fixes: 8fbce8efe15cd ("backlight: lm3630a: Add firmware node support")
> Cc: Brian Masney 
> Cc: Dan Murphy 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/video/backlight/lm3630a_bl.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Applied, thanks.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH] drm/sched: Avoid data corruptions

2021-05-19 Thread Christian König

Am 19.05.21 um 16:14 schrieb Andrey Grodzovsky:

Wait for all dependencies of a job  to complete before
killing it to avoid data corruptions.

Signed-off-by: Andrey Grodzovsky 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/scheduler/sched_entity.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 2e93e881b65f..d5cf61972558 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -222,11 +222,16 @@ static void drm_sched_entity_kill_jobs_cb(struct 
dma_fence *f,
  static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
  {
struct drm_sched_job *job;
+   struct dma_fence *f;
int r;
  
  	while ((job = to_drm_sched_job(spsc_queue_pop(>job_queue {

struct drm_sched_fence *s_fence = job->s_fence;
  
+		/* Wait for all dependencies to avoid data corruptions */

+   while ((f = job->sched->ops->dependency(job, entity)))
+   dma_fence_wait(f);
+
drm_sched_fence_scheduled(s_fence);
dma_fence_set_error(_fence->finished, -ESRCH);
  




[PATCH] drm/sched: Avoid data corruptions

2021-05-19 Thread Andrey Grodzovsky
Wait for all dependencies of a job  to complete before
killing it to avoid data corruptions.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 2e93e881b65f..d5cf61972558 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -222,11 +222,16 @@ static void drm_sched_entity_kill_jobs_cb(struct 
dma_fence *f,
 static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
 {
struct drm_sched_job *job;
+   struct dma_fence *f;
int r;
 
while ((job = to_drm_sched_job(spsc_queue_pop(>job_queue {
struct drm_sched_fence *s_fence = job->s_fence;
 
+   /* Wait for all dependencies to avoid data corruptions */
+   while ((f = job->sched->ops->dependency(job, entity)))
+   dma_fence_wait(f);
+
drm_sched_fence_scheduled(s_fence);
dma_fence_set_error(_fence->finished, -ESRCH);
 
-- 
2.25.1



Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
On Wed, May 19, 2021 at 10:28:42AM -0300, Jason Gunthorpe wrote:
> On Tue, May 18, 2021 at 07:45:05PM -0400, Peter Xu wrote:
> > On Tue, May 18, 2021 at 08:03:27PM -0300, Jason Gunthorpe wrote:
> > > Logically during fork all these device exclusive pages should be
> > > reverted back to their CPU pages, write protected and the CPU page PTE
> > > copied to the fork.
> > > 
> > > We should not copy the device exclusive page PTE to the fork. I think
> > > I pointed to this on an earlier rev..
> > 
> > Agreed.  Though please see the question I posted in the other thread: now I 
> > am
> > not very sure whether we'll be able to mark a page as device exclusive if 
> > that
> > page has mapcount>1.
> 
> IMHO it is similar to write protect done by filesystems on shared
> mappings - all VMAs with a copy of the CPU page have to get switched
> to the device exclusive PTE. This is why the rmap stuff is involved in
> the migration helpers

Right, I think Alistair corrected me there that I missed the early COW
happening in GUP.

Actually even without that GUP triggering early COW it won't be a problem,
because as long as one child mm restored the pte from exclusive to normal
(before any further COW happens) device exclusiveness is broken in the mmu
notifiers, and after that point all previous-exclusive ptes actually becomes
the same as a very normal PageAnon.  Then it's very sane to even not have the
original page in parent process, because we know each COWed page will contain
all the device atomic modifications (so we don't really have the requirement to
return the original page to parent).

Sorry for the noise.

-- 
Peter Xu



Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Peter Xu
On Wed, May 19, 2021 at 11:11:55PM +1000, Alistair Popple wrote:
> On Wednesday, 19 May 2021 10:15:41 PM AEST Peter Xu wrote:
> > External email: Use caution opening links or attachments
> > 
> > On Wed, May 19, 2021 at 09:04:53PM +1000, Alistair Popple wrote:
> > > Failing fork() because we couldn't take a lock doesn't seem like the right
> > > approach though, especially as there is already existing code that
> > > retries. I get this adds complexity though, so would be happy to take a
> > > look at cleaning copy_pte_range() up in future.
> > 
> > Yes, I proposed that as this one won't affect any existing applications
> > (unlike the existing ones) but only new userspace driver apps that will use
> > this new atomic feature.
> > 
> > IMHO it'll be a pity to add extra complexity and maintainance burden into
> > fork() if only for keeping the "logical correctness of fork()" however the
> > code never triggers. If we start with trylock we'll know whether people
> > will use it, since people will complain with a reason when needed; however
> > I still doubt whether a sane userspace device driver should fork() within
> > busy interaction with the device underneath..
> 
> I will refrain from commenting on the sanity or otherwise of doing that :-)
> 
> Agree such a scenario seems unlikely in practice (and possibly unreasonable). 
> Keeping the "logical correctness of fork()" still seems worthwhile to me, but 
> if the added complexity/maintenance burden for an admittedly fairly specific 
> feature is going to stop progress here I am happy to take the fail fork 
> approach. I could then possibly fix it up as a future clean up to 
> copy_pte_range(). Perhaps others have thoughts?

Yes, it's more about making this series easier to be accepted, and it'll be
great to have others' input.

Btw, just to mention that I don't even think fail fork() on failed trylock() is
against "logical correctness of fork()": IMHO it's still 100% correct just like
most syscalls can return with -EAGAIN, that suggests the userspace to try again
the syscall, and I hope that also works for fork().  I'd be more than glad to
be corrected too.

-- 
Peter Xu



[PATCH] vgaarb: Use ACPI HID name to find integrated GPU

2021-05-19 Thread Kai-Heng Feng
Commit 3d42f1ddc47a ("vgaarb: Keep adding VGA device in queue") assumes
the first device is an integrated GPU. However, on AMD platforms an
integrated GPU can have higher PCI device number than a discrete GPU.

Integrated GPU on ACPI platform generally has _DOD and _DOS method, so
use that as predicate to find integrated GPU. If the new strategy
doesn't work, fallback to use the first device as boot VGA.

Signed-off-by: Kai-Heng Feng 
---
 drivers/gpu/vga/vgaarb.c | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
index 5180c5687ee5..949fde433ea2 100644
--- a/drivers/gpu/vga/vgaarb.c
+++ b/drivers/gpu/vga/vgaarb.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -1450,9 +1451,23 @@ static struct miscdevice vga_arb_device = {
MISC_DYNAMIC_MINOR, "vga_arbiter", _arb_device_fops
 };
 
+#if defined(CONFIG_ACPI)
+static bool vga_arb_integrated_gpu(struct device *dev)
+{
+   struct acpi_device *adev = ACPI_COMPANION(dev);
+
+   return adev && !strcmp(acpi_device_hid(adev), ACPI_VIDEO_HID);
+}
+#else
+static bool vga_arb_integrated_gpu(struct device *dev)
+{
+   return false;
+}
+#endif
+
 static void __init vga_arb_select_default_device(void)
 {
-   struct pci_dev *pdev;
+   struct pci_dev *pdev, *found = NULL;
struct vga_device *vgadev;
 
 #if defined(CONFIG_X86) || defined(CONFIG_IA64)
@@ -1505,20 +1520,26 @@ static void __init vga_arb_select_default_device(void)
 #endif
 
if (!vga_default_device()) {
-   list_for_each_entry(vgadev, _list, list) {
+   list_for_each_entry_reverse(vgadev, _list, list) {
struct device *dev = >pdev->dev;
u16 cmd;
 
pdev = vgadev->pdev;
pci_read_config_word(pdev, PCI_COMMAND, );
if (cmd & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
-   vgaarb_info(dev, "setting as boot device (VGA 
legacy resources not available)\n");
-   vga_set_default_device(pdev);
-   break;
+   found = pdev;
+   if (vga_arb_integrated_gpu(dev))
+   break;
}
}
}
 
+   if (found) {
+   vgaarb_info(>dev, "setting as boot device (VGA legacy 
resources not available)\n");
+   vga_set_default_device(found);
+   return;
+   }
+
if (!vga_default_device()) {
vgadev = list_first_entry_or_null(_list,
  struct vga_device, list);
-- 
2.31.1



Re: [Mesa-dev] [RFC 2/2] drm/doc/rfc: i915 new parallel submission uAPI plan

2021-05-19 Thread Marcin Ślusarz
śr., 19 maj 2021 o 01:41 Matthew Brost  napisał(a):
>
> Add entry fpr i915 new parallel submission uAPI plan.

s/fpr/for/

>
> v2:
>  (Daniel Vetter):
>   - Expand logical order explaination
>   - Add dummy header
>   - Only allow N BBs in execbuf IOCTL
>   - Configure parallel submission per slot not per gem context
>
> Cc: Tvrtko Ursulin 
> Cc: Tony Ye 
> CC: Carl Zhang 
> Cc: Daniel Vetter 
> Cc: Jason Ekstrand 
> Signed-off-by: Matthew Brost 
> ---
>  Documentation/gpu/rfc/i915_parallel_execbuf.h | 144 ++
>  Documentation/gpu/rfc/i915_scheduler.rst  |  53 ++-
>  2 files changed, 196 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
>
> diff --git a/Documentation/gpu/rfc/i915_parallel_execbuf.h 
> b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> new file mode 100644
> index ..8c64b983ccad
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_parallel_execbuf.h
> @@ -0,0 +1,144 @@
> +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> i915_context_engines_parallel_submit */
> +
> +/*
> + * i915_context_engines_parallel_submit:
> + *
> + * Setup a slot to allow multiple BBs to be submitted in a single execbuf 
> IOCTL.
> + * Those BBs will then be scheduled to run on the GPU in parallel. Multiple
> + * hardware contexts are created internally in the i915 run these BBs. Once a
> + * slot is configured for N BBs only N BBs can be submitted in each execbuf
> + * IOCTL and this is implict behavior (e.g. the user doesn't tell the execbuf
> + * IOCTL there are N BBs, the execbuf IOCTL know how many BBs there are 
> based on
> + * the slots configuration).
> + *
> + * Their are two currently defined ways to control the placement of the

s/Their/There/

> + * hardware contexts on physical engines: default behavior (no flags) and
> + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the

IMPLICIT? This typo is repeated multiple times
s/the in the/in the/

> + * future as new hardware / use cases arise. Details of how to use this
> + * interface below above the flags.

"below above"? :)

> + *
> + * Returns -EINVAL if hardware context placement configuration invalid or if 
> the

is invalid

> + * placement configuration isn't supported on the platform / submission
> + * interface.
> + * Returns -ENODEV if extension isn't supported on the platform / submission
> + * inteface.
> + */
> +struct i915_context_engines_parallel_submit {
> +   struct i915_user_extension base;
> +
> +   __u16 engine_index; /* slot for parallel engine */
> +   __u16 width;/* number of contexts per parallel engine */
> +   __u16 num_siblings; /* number of siblings per context */
> +   __u16 mbz16;
> +/*
> + * Default placement behvavior (currently unsupported):

behavior

> + *
> + * Rather than restricting parallel submission to a single class with a
> + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a mode 
> that
> + * enables parallel submission across multiple engine classes. In this case 
> each
> + * context's logical engine mask indicates where that context can placed. It 
> is

can be placed

> + * implied in this mode that all contexts have mutual exclusive placement 
> (e.g.
> + * if one context is running CS0 no other contexts can run on CS0).
> + *
> + * Example 1 pseudo code:
> + * CSX[Y] = engine class X, logical instance Y
> + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> + * set_engines(INVALID)
> + * set_parallel(engine_index=0, width=2, num_siblings=2,
> + * engines=CS0[0],CS0[1],CS1[0],CS1[1])
> + *
> + * Results in the following valid placements:
> + * CS0[0], CS1[0]
> + * CS0[0], CS1[1]
> + * CS0[1], CS1[0]
> + * CS0[1], CS1[1]
> + *
> + * This can also be though of as 2 virtual engines:

thought
(This typo repeats multiple times)

> + * VE[0] = CS0[0], CS0[1]
> + * VE[1] = CS1[0], CS1[1]
> + *
> + * Example 2 pseudo code:
> + * CS[X] = generic engine of same class, logical instance X
> + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> + * set_engines(INVALID)
> + * set_parallel(engine_index=0, width=2, num_siblings=3,
> + * engines=CS[0],CS[1],CS[2],CS[0],CS[1],CS[2])
> + *
> + * Results in the following valid placements:
> + * CS[0], CS[1]
> + * CS[0], CS[2]
> + * CS[1], CS[0]
> + * CS[1], CS[2]
> + * CS[2], CS[0]
> + * CS[2], CS[1]
> + *
> + *
> + * This can also be though of as 2 virtual engines:
> + * VE[0] = CS[0], CS[1], CS[2]
> + * VE[1] = CS[0], CS[1], CS[2]
> +
> + * This enables a use case where all engines are created equally, we don't 
> care
> + * where they are scheduled, we just want a certain number of resources, for
> + * those resources to be scheduled in parallel, and possibly across multiple
> + * engine classes.
> + */
> +
> +/*
> + * I915_PARALLEL_IMPLICT_BONDS - Create implict bonds between each context.
> + * Each context must have the same number 

Re: New uAPI for color management proposal and feedback request

2021-05-19 Thread Ville Syrjälä
On Wed, May 19, 2021 at 12:34:05PM +0300, Pekka Paalanen wrote:
> On Wed, 12 May 2021 16:04:16 +0300
> Ville Syrjälä  wrote:
> 
> > On Wed, May 12, 2021 at 02:06:56PM +0200, Werner Sembach wrote:
> > > Hello,
> > > 
> > > In addition to the existing "max bpc", and "Broadcast RGB/output_csc" drm 
> > > properties I propose 4 new properties:
> > > "preferred pixel encoding", "active color depth", "active color range", 
> > > and "active pixel encoding"
> > > 
> > > 
> > > Motivation:
> > > 
> > > Current monitors have a variety pixel encodings available: RGB, YCbCr 
> > > 4:4:4, YCbCr 4:2:2, YCbCr 4:2:0.
> > > 
> > > In addition they might be full or limited RGB range and the monitors 
> > > accept different bit depths.
> > > 
> > > Currently the kernel driver for AMD and Intel GPUs automatically 
> > > configure the color settings automatically with little
> > > to no influence of the user. However there are several real world 
> > > scenarios where the user might disagree with the
> > > default chosen by the drivers and wants to set his or her own preference.
> > > 
> > > Some examples:
> > > 
> > > 1. While RGB and YCbCr 4:4:4 in theory carry the same amount of color 
> > > information, some screens might look better on one
> > > than the other because of bad internal conversion. The driver currently 
> > > however has a fixed default that is chosen if
> > > available (RGB for Intel and YCbCr 4:4:4 for AMD). The only way to change 
> > > this currently is by editing and overloading
> > > the edid reported by the monitor to the kernel.
> > > 
> > > 2. RGB and YCbCr 4:4:4 need a higher port clock then YCbCr 4:2:0. Some 
> > > hardware might report that it supports the higher
> > > port clock, but because of bad shielding on the PC, the cable, or the 
> > > monitor the screen cuts out every few seconds when
> > > RGB or YCbCr 4:4:4 encoding is used, while YCbCr 4:2:0 might just work 
> > > fine without changing hardware. The drivers
> > > currently however always default to the "best available" option even if 
> > > it might be broken.
> > > 
> > > 3. Some screens natively only supporting 8-bit color, simulate 10-Bit 
> > > color by rapidly switching between 2 adjacent
> > > colors. They advertise themselves to the kernel as 10-bit monitors but 
> > > the user might not like the "fake" 10-bit effect
> > > and prefer running at the native 8-bit per color.
> > > 
> > > 4. Some screens are falsely classified as full RGB range wile they 
> > > actually use limited RGB range. This results in
> > > washed out colors in dark and bright scenes. A user override can be 
> > > helpful to manually fix this issue when it occurs.
> > > 
> > > There already exist several requests, discussion, and patches regarding 
> > > the thematic:
> > > 
> > > - https://gitlab.freedesktop.org/drm/amd/-/issues/476
> > > 
> > > - https://gitlab.freedesktop.org/drm/amd/-/issues/1548
> > > 
> > > - https://lkml.org/lkml/2021/5/7/695
> > > 
> > > - https://lkml.org/lkml/2021/5/11/416
> > > 
> 
> ...
> 
> > > Adoption:
> > > 
> > > A KDE dev wants to implement the settings in the KDE settings GUI:
> > > https://gitlab.freedesktop.org/drm/amd/-/issues/476#note_912370
> > > 
> > > Tuxedo Computers (my employer) wants to implement the settings desktop 
> > > environment agnostic in Tuxedo Control Center. I
> > > will start work on this in parallel to implementing the new kernel code.  
> > 
> > I suspect everyone would be happier to accept new uapi if we had
> > multiple compositors signed up to implement it.
> 
> I think having Weston support for these would be good, but for now it
> won't be much of an UI: just weston.ini to set, and the log to see what
> happened.
> 
> However, knowing what happened is going to be important for color
> calibration auditing:
> https://gitlab.freedesktop.org/wayland/weston/-/issues/467
> 
> Yes, please, very much for read-only properties for the feedback part.
> Properties that both userspace and kernel will write are hard to deal
> with in general.
> 
> Btw. "max bpc" I can kind of guess that conversion from framebuffer
> format to the wire bpc happens automatically and only as the final
> step,

Well, there could be dithering and whatnot also involved. So it's
not super well specified atm either.

> but "Broadcast RGB" is more complicated: is the output from the
> abstract pixel pipeline sent as-is and "Broadcast RGB" is just another
> inforframe bit to the monitor, or does "Broadcast RGB" setting actually
> change what happens in the pixel pipeline *and* set infoframe bits?

It does indeed compress the actual pixel data. There was once a patch
porposed to introduce a new enum value that only sets the infoframe and
thus would allow userspace to pass through already limited range data.
Shouldn't be hard to resurrect that if needed.

> 
> My vague recollection is that framebuffer was always assumed to be in
> full range, and then if "Broadcast RGB" was set to limited range, the
> driver would mangle the pixel 

Re: [PATCH 9/9] platform/x86/intel_cht_int33fe: Correct "displayport" fwnode reference

2021-05-19 Thread Hans de Goede
Hi,

On 5/3/21 5:46 PM, Hans de Goede wrote:
> The Type-C connector on these devices is connected to DP-2 not DP-1,
> so the reference must be to the DD04 child-node of the GPU, rather
> then the DD02 child-node.
> 
> Signed-off-by: Hans de Goede 

Since this is pretty much independent from the rest of the series,
I'll take this upstream through the pdx86 tree.

I've added this to my review-hans branch now, and it will get added
to for-next from there.

Regards,

Hans



> ---
>  drivers/platform/x86/intel_cht_int33fe_typec.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel_cht_int33fe_typec.c 
> b/drivers/platform/x86/intel_cht_int33fe_typec.c
> index b61bad9cc8d2..d59544167430 100644
> --- a/drivers/platform/x86/intel_cht_int33fe_typec.c
> +++ b/drivers/platform/x86/intel_cht_int33fe_typec.c
> @@ -168,8 +168,8 @@ static int cht_int33fe_setup_dp(struct cht_int33fe_data 
> *data)
>   return -ENODEV;
>   }
>  
> - /* Then the DP child device node */
> - data->dp = device_get_named_child_node(>dev, "DD02");
> + /* Then the DP-2 child device node */
> + data->dp = device_get_named_child_node(>dev, "DD04");
>   pci_dev_put(pdev);
>   if (!data->dp)
>   return -ENODEV;
> 



Re: [PATCH 2/2] drm/doc: document drm_mode_get_plane

2021-05-19 Thread Leandro Ribeiro



On 5/6/21 6:10 AM, Pekka Paalanen wrote:
> On Wed, 28 Apr 2021 18:36:51 -0300
> Leandro Ribeiro  wrote:
> 
>> Add a small description and document struct fields of
>> drm_mode_get_plane.
>>
>> Signed-off-by: Leandro Ribeiro 
> 
> Hi,
> 
> thanks a lot for revising these.
> 
>> ---
>>  include/uapi/drm/drm_mode.h | 22 ++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
>> index a5e76aa06ad5..8fa6495cd948 100644
>> --- a/include/uapi/drm/drm_mode.h
>> +++ b/include/uapi/drm/drm_mode.h
>> @@ -312,16 +312,38 @@ struct drm_mode_set_plane {
>>  __u32 src_w;
>>  };
>>
>> +/**
>> + * struct drm_mode_get_plane - Get plane metadata.
>> + *
>> + * Userspace can perform a GETPLANE ioctl to retrieve information about a
>> + * plane.
>> + */
>>  struct drm_mode_get_plane {
>> +/** @plane_id: Object ID of the plane. */
> 
> This is an "in" field, right?
> 
> "in" meaning that userspace sets it to the ID of the plane it wants
> information on.
> 
> "out" field is a field written by the kernel as a response.
> 
> I'm not sure if the kernel has a habit of documenting these, because we
> use libdrm to abstract this so users do not need to care, but I think
> it would be nice.
> 

In a quick look, I couldn't find anything. But I can change the phrasing
to the following:

"@plane_id: Object ID of the plane whose information should be
retrieved. IN field, set by userspace."

>>  __u32 plane_id;
>>
>> +/** @crtc_id: Object ID of the current CRTC. */
>>  __u32 crtc_id;
>> +/** @fb_id: Object ID of the current fb. */
>>  __u32 fb_id;
>>
>> +/**
>> + * @possible_crtcs: Bitmask of CRTC's compatible with the plane. CRTC's
>> + * are created and they receive an index, which corresponds to their
>> + * position in the bitmask. CRTC with index 0 will be in bit 0, and so
>> + * on. To learn how to find out the index of a certain CRTC, please see
>> + * :ref:`crtc_index`.
> 
> This could be shortened to something like bit N corresponds to CRTC
> index N, and make "CRTC index N" a hyperlink.
> 

Nice, I'll apply this change.

>> + */
>>  __u32 possible_crtcs;
>> +/** @gamma_size: Number of entries of the legacy gamma lookup table. */
>>  __u32 gamma_size;
>>
>> +/** @count_format_types: Number of formats. */
>>  __u32 count_format_types;
>> +/**
>> + * @format_type_ptr: Pointer to ``__u32`` array of formats that are
>> + * supported by the plane. These formats do not require modifiers.
>> + */
>>  __u64 format_type_ptr;
> 
> The count/ptr fields have an interesting usage pattern, which I suppose
> is common for all DRM ioctls. Makes me wonder if it should be documented.
> 
> AFAIU, count is in+out field: when set to 0, the kernel uses it to
> return the count needed. Then userspace allocates space and calls the
> ioctl again with the right count and ptr set to point to the allocated
> array of count elements. This is so that kernel never allocates memory
> on behalf of userspace for the return data, making things much simpler
> at the cost of maybe needing to call the ioctl twice to first figure
> out long the array should be.
> 
> This can be seen in libdrm code for drmModeGetPlane().
>
> There is certainly no point in explaining all that here, that is too
> much. But if there was a way to annotate the count member as in+out,
> that would be nice. And the ptr member as "in".
> 

This is documented in the description of struct drm_mode_get_connector:

https://www.kernel.org/doc/html/latest/gpu/drm-uapi.html#c.drm_mode_get_connector

Would be enough to have something similar in struct drm_mode_get_plane?

> 
> Thanks,
> pq
> 
>>  };
>>
>> --
>> 2.31.1
>>
>> ___
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 


Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Jason Gunthorpe
On Tue, May 18, 2021 at 07:45:05PM -0400, Peter Xu wrote:
> On Tue, May 18, 2021 at 08:03:27PM -0300, Jason Gunthorpe wrote:
> > Logically during fork all these device exclusive pages should be
> > reverted back to their CPU pages, write protected and the CPU page PTE
> > copied to the fork.
> > 
> > We should not copy the device exclusive page PTE to the fork. I think
> > I pointed to this on an earlier rev..
> 
> Agreed.  Though please see the question I posted in the other thread: now I am
> not very sure whether we'll be able to mark a page as device exclusive if that
> page has mapcount>1.

IMHO it is similar to write protect done by filesystems on shared
mappings - all VMAs with a copy of the CPU page have to get switched
to the device exclusive PTE. This is why the rmap stuff is involved in
the migration helpers

Jason


RE: [drm-rerere PATCH] nightly.conf: drop amd branches from drm-tip

2021-05-19 Thread Deucher, Alexander
[AMD Public Use]

> -Original Message-
> From: Jani Nikula 
> Sent: Wednesday, May 19, 2021 4:50 AM
> To: dim-to...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; intel-...@lists.freedesktop.org;
> jani.nik...@intel.com; Deucher, Alexander
> ; Koenig, Christian
> ; Pan; Pan, Xinhui ;
> Daniel Vetter 
> Subject: [drm-rerere PATCH] nightly.conf: drop amd branches from drm-tip
> 
> We've had a stale repo for amd in drm-tip since around v4.15 i.e. for more
> than three years. Nobody seems to notice or care. Drop the amd branches
> from drm-tip.
> 
> Having the current amd branches in drm-tip would be nice to have, if only to
> have a common drm integration tree. However, maintaining that has a cost
> due to the inevitable conflicts. We can add the branches back if and when
> there's interest in sharing the burden.
> 
> Cc: Alex Deucher 
> Cc: Christian König 
> Cc: Pan, Xinhui 
> Cc: Daniel Vetter 
> Signed-off-by: Jani Nikula 

Reviewed-by: Alex Deucher 

> ---
>  nightly.conf | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/nightly.conf b/nightly.conf index 9211550ef75c..35fb1d9ba600
> 100644
> --- a/nightly.conf
> +++ b/nightly.conf
> @@ -40,12 +40,6 @@ git://anongit.freedesktop.org/drm-misc
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fano
> ngit.freedesktop.org%2Fgit%2Fdrm%2Fdrm-
> miscdata=04%7C01%7Calexander.deucher%40amd.com%7C5903896cf
> 2e642afb05408d91aa30f6d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> C0%7C637570109906926805%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4w
> LjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000
> p;sdata=espN%2BoIX9SjLh2Py%2FkqlVsi0p9Ru%2Fet2M11XWqJ5eUQ%3D
> mp;reserved=0
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fano
> ngit.freedesktop.org%2Fgit%2Fdrm%2Fdrm-
> misc.gitdata=04%7C01%7Calexander.deucher%40amd.com%7C590389
> 6cf2e642afb05408d91aa30f6d%7C3dd8961fe4884e608e11a82d994e183d%7C0
> %7C0%7C637570109906926805%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC
> 4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&
> amp;sdata=E5cwRH0Pr9JkIfIMNkNzjlLn5hN6k0inxBkk%2Bwhd1lk%3Dr
> eserved=0
>  "
> -drm_tip_repos[drm-amd]="
> -ssh://git.freedesktop.org/git/drm/drm-amd
> -git://anongit.freedesktop.org/drm/drm-amd
> -
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fano
> ngit.freedesktop.org%2Fgit%2Fdrm%2Fdrm-
> amddata=04%7C01%7Calexander.deucher%40amd.com%7C5903896cf
> 2e642afb05408d91aa30f6d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> C0%7C637570109906926805%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4w
> LjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000
> p;sdata=1kQe4t89CyANqRhNUpQ2RP3Ndz7A3sdd%2FiWZ7FmKHM4%3D
> mp;reserved=0
> -
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fano
> ngit.freedesktop.org%2Fgit%2Fdrm%2Fdrm-
> amd.gitdata=04%7C01%7Calexander.deucher%40amd.com%7C590389
> 6cf2e642afb05408d91aa30f6d%7C3dd8961fe4884e608e11a82d994e183d%7C0
> %7C0%7C637570109906926805%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC
> 4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&
> amp;sdata=vVqMWMbdJFHJW8j09tn1m7ItGSL0RmfeDbJZFWoYBf4%3D
> p;reserved=0
> -"
>  drm_tip_repos[drm]="
>  ssh://git.freedesktop.org/git/drm/drm
>  git://anongit.freedesktop.org/drm/drm
> @@ -76,17 +70,14 @@ drm_tip_config=(
>   "drmdrm-fixes"
>   "drm-misc   drm-misc-fixes"
>   "drm-intel  drm-intel-fixes"
> - "drm-amddrm-amd-fixes"
> 
>   "drmdrm-next"
>   "drm-misc   drm-misc-next-fixes"
>   "drm-intel  drm-intel-next-fixes"
> - "drm-amddrm-amd-next-fixes"
> 
>   "drm-misc   drm-misc-next"
>   "drm-intel  drm-intel-next"
>   "drm-intel  drm-intel-gt-next"
> - "drm-amddrm-amd-next"
> 
>   "sound-upstream for-linus"
>   "sound-upstream for-next"
> --
> 2.20.1


Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Alistair Popple
On Wednesday, 19 May 2021 10:15:41 PM AEST Peter Xu wrote:
> External email: Use caution opening links or attachments
> 
> On Wed, May 19, 2021 at 09:04:53PM +1000, Alistair Popple wrote:
> > Failing fork() because we couldn't take a lock doesn't seem like the right
> > approach though, especially as there is already existing code that
> > retries. I get this adds complexity though, so would be happy to take a
> > look at cleaning copy_pte_range() up in future.
> 
> Yes, I proposed that as this one won't affect any existing applications
> (unlike the existing ones) but only new userspace driver apps that will use
> this new atomic feature.
> 
> IMHO it'll be a pity to add extra complexity and maintainance burden into
> fork() if only for keeping the "logical correctness of fork()" however the
> code never triggers. If we start with trylock we'll know whether people
> will use it, since people will complain with a reason when needed; however
> I still doubt whether a sane userspace device driver should fork() within
> busy interaction with the device underneath..

I will refrain from commenting on the sanity or otherwise of doing that :-)

Agree such a scenario seems unlikely in practice (and possibly unreasonable). 
Keeping the "logical correctness of fork()" still seems worthwhile to me, but 
if the added complexity/maintenance burden for an admittedly fairly specific 
feature is going to stop progress here I am happy to take the fail fork 
approach. I could then possibly fix it up as a future clean up to 
copy_pte_range(). Perhaps others have thoughts?

> In all cases, please still consider to keep them in copy_nonpresent_pte()
> (and if to rework, separating patches would be great).
>
> Thanks,
> 
> --
> Peter Xu






Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Alistair Popple
On Wednesday, 19 May 2021 10:21:08 PM AEST Peter Xu wrote:
> External email: Use caution opening links or attachments
> 
> On Wed, May 19, 2021 at 09:35:10PM +1000, Alistair Popple wrote:
> > I think the approach you are describing is similar to what
> > migrate_vma_collect()/migrate_vma_unamp() does now and I think it could be
> > made to work. I ended up going with the GUP+unmap approach in part because
> > Christoph suggested it but primarily because it required less code
> > especially given we needed to call something to fault the page in/break
> > COW anyway (or extend what was there to call handle_mm_fault(), etc.).
> 
> I see, thank. Would you mind add a short paragraph in the commit message
> talking about these two solutions and why we choose the GUP way?

Sure.

 - Alistair

> --
> Peter Xu






Re: [PATCH v8 5/8] mm: Device exclusive memory access

2021-05-19 Thread Alistair Popple
On Wednesday, 19 May 2021 10:24:27 PM AEST Peter Xu wrote:
> External email: Use caution opening links or attachments
> 
> On Wed, May 19, 2021 at 08:49:01PM +1000, Alistair Popple wrote:
> > On Wednesday, 19 May 2021 7:16:38 AM AEST Peter Xu wrote:
> > > External email: Use caution opening links or attachments
> > > 
> > > 
> > > On Wed, Apr 07, 2021 at 06:42:35PM +1000, Alistair Popple wrote:
> > > 
> > > [...]
> > > 
> > > > +static bool try_to_protect(struct page *page, struct mm_struct *mm,
> > > > +unsigned long address, void *arg)
> > > > +{
> > > > + struct ttp_args ttp = {
> > > > + .mm = mm,
> > > > + .address = address,
> > > > + .arg = arg,
> > > > + .valid = false,
> > > > + };
> > > > + struct rmap_walk_control rwc = {
> > > > + .rmap_one = try_to_protect_one,
> > > > + .done = page_not_mapped,
> > > > + .anon_lock = page_lock_anon_vma_read,
> > > > + .arg = ,
> > > > + };
> > > > +
> > > > + /*
> > > > +  * Restrict to anonymous pages for now to avoid potential
> > > > writeback
> > > > +  * issues.
> > > > +  */
> > > > + if (!PageAnon(page))
> > > > + return false;
> > > > +
> > > > + /*
> > > > +  * During exec, a temporary VMA is setup and later moved.
> > > > +  * The VMA is moved under the anon_vma lock but not the
> > > > +  * page tables leading to a race where migration cannot
> > > > +  * find the migration ptes. Rather than increasing the
> > > > +  * locking requirements of exec(), migration skips
> > > > +  * temporary VMAs until after exec() completes.
> > > > +  */
> > > > + if (!PageKsm(page) && PageAnon(page))
> > > > + rwc.invalid_vma = invalid_migration_vma;
> > > > +
> > > > + rmap_walk(page, );
> > > > +
> > > > + return ttp.valid && !page_mapcount(page);
> > > > +}
> > > 
> > > I raised a question in the other thread regarding fork():
> > > 
> > > https://lore.kernel.org/lkml/YKQjmtMo+YQGx%2FwZ@t490s/
> > > 
> > > While I suddenly noticed that we may have similar issues even if we
> > > fork()
> > > before creating the ptes.
> > > 
> > > In that case, we may see multiple read-only ptes pointing to the same
> > > page.
> > > We will convert all of them into device exclusive read ptes in
> > > rmap_walk()
> > > above, however how do we guarantee after all COW done in the parent and
> > > all
> > > the childs processes, the device owned page will be returned to the
> > > parent?
> > 
> > I assume you are talking about a fork() followed by a call to
> > make_device_exclusive()? I think this should be ok because
> > make_device_exclusive() always calls GUP with FOLL_WRITE both to break COW
> > and because a device performing atomic operations needs to write to the
> > page. I suppose a comment here highlighting the need to break COW to
> > avoid this scenario would be useful though.
> 
> Indeed, sorry for the false alarm!  Yes it would be great to mention that
> too.

No problem! Thanks for the comments.

> --
> Peter Xu






Re: [PATCH v8 3/8] mm/rmap: Split try_to_munlock from try_to_unmap

2021-05-19 Thread Alistair Popple
On Wednesday, 19 May 2021 6:04:51 AM AEST Liam Howlett wrote:
> External email: Use caution opening links or attachments
> 
> * Alistair Popple  [210407 04:43]:
> > The behaviour of try_to_unmap_one() is difficult to follow because it
> > performs different operations based on a fairly large set of flags used
> > in different combinations.
> > 
> > TTU_MUNLOCK is one such flag. However it is exclusively used by
> > try_to_munlock() which specifies no other flags. Therefore rather than
> > overload try_to_unmap_one() with unrelated behaviour split this out into
> > it's own function and remove the flag.
> > 
> > Signed-off-by: Alistair Popple 
> > Reviewed-by: Ralph Campbell 
> > Reviewed-by: Christoph Hellwig 
> > 
> > ---
> > 
> > v8:
> > * Renamed try_to_munlock to page_mlock to better reflect what the
> > 
> >   function actually does.
> > 
> > * Removed the TODO from the documentation that this patch addresses.
> > 
> > v7:
> > * Added Christoph's Reviewed-by
> > 
> > v4:
> > * Removed redundant check for VM_LOCKED
> > ---
> > 
> >  Documentation/vm/unevictable-lru.rst | 33 ---
> >  include/linux/rmap.h |  3 +-
> >  mm/mlock.c   | 10 +++---
> >  mm/rmap.c| 48 +---
> >  4 files changed, 55 insertions(+), 39 deletions(-)
> > 
> > diff --git a/Documentation/vm/unevictable-lru.rst
> > b/Documentation/vm/unevictable-lru.rst index 0e1490524f53..eae3af17f2d9
> > 100644
> > --- a/Documentation/vm/unevictable-lru.rst
> > +++ b/Documentation/vm/unevictable-lru.rst
> > @@ -389,14 +389,14 @@ mlocked, munlock_vma_page() updates that zone
> > statistics for the number of> 
> >  mlocked pages.  Note, however, that at this point we haven't checked
> >  whether the page is mapped by other VM_LOCKED VMAs.
> > 
> > -We can't call try_to_munlock(), the function that walks the reverse map
> > to
> > +We can't call page_mlock(), the function that walks the reverse map to
> > 
> >  check for other VM_LOCKED VMAs, without first isolating the page from the
> >  LRU.> 
> > -try_to_munlock() is a variant of try_to_unmap() and thus requires that
> > the page +page_mlock() is a variant of try_to_unmap() and thus requires
> > that the page> 
> >  not be on an LRU list [more on these below].  However, the call to
> > 
> > -isolate_lru_page() could fail, in which case we couldn't
> > try_to_munlock().  So, +isolate_lru_page() could fail, in which case we
> > can't call page_mlock().  So,> 
> >  we go ahead and clear PG_mlocked up front, as this might be the only
> >  chance we> 
> > -have.  If we can successfully isolate the page, we go ahead and
> > -try_to_munlock(), which will restore the PG_mlocked flag and update the
> > zone +have.  If we can successfully isolate the page, we go ahead and
> > call +page_mlock(), which will restore the PG_mlocked flag and update the
> > zone> 
> >  page statistics if it finds another VMA holding the page mlocked.  If we
> >  fail to isolate the page, we'll have left a potentially mlocked page on
> >  the LRU. This is fine, because we'll catch it later if and if vmscan
> >  tries to reclaim> 
> > @@ -545,31 +545,24 @@ munlock or munmap system calls, mm teardown
> > (munlock_vma_pages_all), reclaim,> 
> >  holepunching, and truncation of file pages and their anonymous COWed
> >  pages.
> > 
> > -try_to_munlock() Reverse Map Scan
> > +page_mlock() Reverse Map Scan
> > 
> >  -
> > 
> > -.. warning::
> > -   [!] TODO/FIXME: a better name might be page_mlocked() - analogous to
> > the -   page_referenced() reverse map walker.
> > -
> > 
> >  When munlock_vma_page() [see section :ref:`munlock()/munlockall() System
> >  Call Handling ` above] tries to munlock a
> >  page, it needs to determine whether or not the page is mapped by any
> >  VM_LOCKED VMA without actually attempting to unmap all PTEs from the
> >  page.  For this purpose, the unevictable/mlock infrastructure
> > 
> > -introduced a variant of try_to_unmap() called try_to_munlock().
> > +introduced a variant of try_to_unmap() called page_mlock().
> > 
> > -try_to_munlock() calls the same functions as try_to_unmap() for anonymous
> > and -mapped file and KSM pages with a flag argument specifying unlock
> > versus unmap -processing.  Again, these functions walk the respective
> > reverse maps looking -for VM_LOCKED VMAs.  When such a VMA is found, as
> > in the try_to_unmap() case, -the functions mlock the page via
> > mlock_vma_page() and return SWAP_MLOCK.  This -undoes the pre-clearing of
> > the page's PG_mlocked done by munlock_vma_page. +page_mlock() walks the
> > respective reverse maps looking for VM_LOCKED VMAs. When +such a VMA is
> > found the page is mlocked via mlock_vma_page(). This undoes the
> > +pre-clearing of the page's PG_mlocked done by munlock_vma_page.
> > 
> > -Note that try_to_munlock()'s reverse map walk must visit every VMA in a
> > page's +Note that page_mlock()'s reverse map 

  1   2   >