Re: [PATCH] drm/amd/amdgpu: fix potential bad job hw_fence underflow

2021-10-25 Thread JingWen Chen


On 2021/10/25 11:18 PM, Andrey Grodzovsky wrote:
>
> On 2021-10-24 10:56 p.m., JingWen Chen wrote:
>> On 2021/10/23 4:41 AM, Andrey Grodzovsky wrote:
>>> What do you mean by underflow in this case ? You mean use after free 
>>> because of extra dma_fence_put() ?
>> yes
>
>
> Then maybe update the description  because 'underflow' is very confusing
>
will do
>
>>> On 2021-10-22 4:14 a.m., JingWen Chen wrote:
 ping

 On 2021/10/22 AM11:33, Jingwen Chen wrote:
> [Why]
> In advanced TDR mode, the real bad job will be resubmitted twice, and
> drm_sched_resubmit_jobs_ext does a dma_fence_put each time, so the bad job
> is put one more time than other jobs.
>
> [How]
> Add a dma_fence_get before resubmitting the job in
> amdgpu_device_recheck_guilty_jobs, and put the fence for normal jobs.
>
> Signed-off-by: Jingwen Chen 
> ---
>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
>    1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 41ce86244144..975f069f6fe8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4841,6 +4841,9 @@ static void amdgpu_device_recheck_guilty_jobs(
>      /* clear job's guilty and depend the folowing step to decide 
> the real one */
>    drm_sched_reset_karma(s_job);
> +    /* The real bad job will be resubmitted twice, so add a dma_fence_get
> +     * here to make sure its fence refcount stays balanced. */
>>>
>>> But that put in drm_sched_resubmit_jobs_ext is for the previous parent 
>>> fence.
>>> fence = sched->ops->run_job(s_job); returns a new HW fence and the put 
>>> drops the refcount on the old one.
>>>
>>> Andrey
>>>
>>>
>> Hi Andrey,
>>
>> If I remember correctly, after we embedded the hw_fence into amdgpu_job, 
>> there is no fence replacement in amdgpu_job_run.
>
>
> Right, I forgot that... What about removing the line at
> https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L265 ?
> What if you make dma_fence_get unconditional instead?
>
> Andrey
>
>
Hi Andrey,

I have tried this, and it causes normal jobs to never be freed (they lack a 
dma_fence_put). I have gone through all the get/put pairs

for sched_jobs, and only the bad job lacks a dma_fence_get; other jobs are just 
fine.
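
To summarize the intended balance, here is a simplified, hypothetical sketch
of one pass of the recheck loop (this is not the actual amdgpu code; the
helper name is made up, and error handling plus the guilty-job path are
omitted):

#include <drm/gpu_scheduler.h>
#include <linux/dma-fence.h>

/* Hypothetical helper showing the hw_fence refcount balance for one
 * suspected job in amdgpu_device_recheck_guilty_jobs(). */
static void recheck_one_job_sketch(struct drm_gpu_scheduler *sched,
				   struct drm_sched_job *s_job)
{
	/* The real bad job is resubmitted twice, and each pass through
	 * drm_sched_resubmit_jobs_ext() drops one reference on the old
	 * parent (hw) fence, so take an extra reference up front. */
	dma_fence_get(s_job->s_fence->parent);

	drm_sched_reset_karma(s_job);
	drm_sched_resubmit_jobs_ext(sched, 1);

	dma_fence_wait_timeout(s_job->s_fence->parent, false, sched->timeout);

	/* The job was not the guilty one after all: drop the extra
	 * reference taken above so its hw_fence can still be released. */
	dma_fence_put(s_job->s_fence->parent);
}
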

>>
> +    dma_fence_get(s_job->s_fence->parent);
>    drm_sched_resubmit_jobs_ext(&ring->sched, 1);
>      ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, 
> ring->sched.timeout);
> @@ -4876,6 +4879,7 @@ static void amdgpu_device_recheck_guilty_jobs(
>      /* got the hw fence, signal finished fence */
>    atomic_dec(ring->sched.score);
> +    dma_fence_put(s_job->s_fence->parent);
>    dma_fence_get(&s_job->s_fence->finished);
>    dma_fence_signal(&s_job->s_fence->finished);
>    dma_fence_put(&s_job->s_fence->finished);


Re: Lockdep splat on killing a process

2021-10-25 Thread Andrey Grodzovsky

On 2021-10-25 3:56 p.m., Christian König wrote:

In general I'm all there to get this fixed, but there is one major 
problem: Drivers don't expect the lock to be dropped.



I am probably missing something, but in my approach we only modify the 
code for those clients that call dma_fence_signal,
not dma_fence_signal_locked. In those cases the drivers are agnostic to 
the lock behavior (or should be, at least) since the lock
is acquired within the dma fence code. Note that if you are worried 
about calling the callback without the lock, then the same exact
concern applies to using irq_work directly in the fence code, 
since the irq_work will execute at a later time without fence->lock
held (which is the point of using irq_work).



What we could do is to change all drivers so they always call the 
dma_fence_signal functions and drop the _locked variants. This way we 
could move calling the callback out of the spinlock.


But that requires audit of all drivers, so quite a lot of work to do.



As I said earlier - if we only modify dma_fence_signal and don't touch 
dma_fence_signal_locked, then our only concern should be the users of 
dma_fence_signal.

Please let me know if I am still missing some point of yours.

Andrey




Regards,
Christian.

On 25.10.21 at 21:10, Andrey Grodzovsky wrote:
Adding back Daniel (somehow he got off the addresses list) and Chris 
who worked a lot in this area.


On 2021-10-21 2:34 a.m., Christian König wrote:



On 20.10.21 at 21:32, Andrey Grodzovsky wrote:

On 2021-10-04 4:14 a.m., Christian König wrote:


The problem is a bit different.

The callback is on the dependent fence, while we need to signal 
the scheduler fence.


Daniel is right that this needs an irq_work struct to handle this 
properly.


Christian.



So we had some discussions with Christian regarding irq_work and 
agreed I should look into doing it but stepping back for a sec -


Why do we insist on calling the dma_fence_cb with fence->lock held?
Is it because of dma_fence_add_callback?
Because we first test for DMA_FENCE_FLAG_SIGNALED_BIT and only 
after that take fence->lock? If so, can't we
move the DMA_FENCE_FLAG_SIGNALED_BIT check inside the locked section? 
Because if in theory
we could call the cb with fence->lock unlocked (i.e. this kind of 
iteration: 
https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/ttm/ttm_resource.c#L117)

we wouldn't have the lockdep splat. And in general, is it really
the correct approach to call third-party code from a callback 
with a locked spinlock? We don't know what the cb does inside,
and I don't see any explicit restrictions in the documentation of 
dma_fence_func_t on what can and cannot be done there.


Yeah, that's exactly what I meant with using the irq_work directly 
in the fence code.



My idea is not to use irq_work at all but instead to implement 
unlocked dma_fence cb execution using an iteration
which drops the spinlock before each cb is executed and acquires 
it again afterwards (until cb_list is empty).






The problem is dma_fence_signal_locked() which is used by quite a 
number of drivers to signal the fence while holding the lock.



For this I think we should not reuse dma_fence_signal_locked inside 
dma_fence_signal and instead implement it using the
unlocked iteration I mentioned above. I looked a bit at the code and 
the history, and I see that until some time ago
(this commit by Chris: 0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5) 
dma_fence_signal was indeed doing its own, locked iteration
and wasn't reusing dma_fence_signal_locked. This way whoever relies 
on dma_fence_signal_locked won't be impacted,
and whoever does not (like us in 
drm_sched_fence_scheduled/drm_sched_fence_finished) should also not 
be impacted by the narrower
scope of the lock. I also looked at dma_fence_default_wait and how it 
takes fence->lock and checks whether the fence is signaled

before the wait starts, and I don't see a problem there either.

I attached a quick draft of this proposal to clarify.
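
Roughly, the unlocked iteration I have in mind would look something like the
sketch below (this is only an illustration of the idea, not the attached
draft; the helper name is made up, and the timestamp handling and other
bookkeeping of the real dma_fence_signal are omitted):

#include <linux/dma-fence.h>
#include <linux/list.h>
#include <linux/spinlock.h>

static int dma_fence_signal_unlocked_cbs(struct dma_fence *fence)
{
	struct dma_fence_cb *cb;
	unsigned long flags;

	spin_lock_irqsave(fence->lock, flags);

	if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
		spin_unlock_irqrestore(fence->lock, flags);
		return -EINVAL; /* already signaled */
	}

	/* Pop one callback at a time and run it with fence->lock dropped,
	 * so the callback is free to take fence->lock (or any other lock)
	 * itself without recursing on the spinlock. */
	while (!list_empty(&fence->cb_list)) {
		cb = list_first_entry(&fence->cb_list,
				      struct dma_fence_cb, node);
		list_del_init(&cb->node);

		spin_unlock_irqrestore(fence->lock, flags);
		cb->func(fence, cb);
		spin_lock_irqsave(fence->lock, flags);
	}

	spin_unlock_irqrestore(fence->lock, flags);
	return 0;
}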

Andrey




Otherwise we could indeed simplify the fence handling a lot.

Christian.



Andrey




On 01.10.21 at 17:10, Andrey Grodzovsky wrote:
From what I see here you are supposed to have an actual deadlock and not 
only a warning: sched_fence->finished is first signaled from within
the hw fence done callback (drm_sched_job_done_cb) but then again 
from within its own callback (drm_sched_entity_kill_jobs_cb), so it
looks like the same fence object is recursively signaled twice. This 
leads to an attempt to lock fence->lock a second time while it's already
locked. I don't see a need to call drm_sched_fence_finished from 
within drm_sched_entity_kill_jobs_cb, as this callback is already registered
on the sched_fence->finished fence (entity->last_scheduled ==
s_fence->finished) and hence the signaling already took place.

Re: [Intel-gfx] [PATCH 1/2] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.

2021-10-25 Thread Sean Paul
On Wed, Oct 13, 2021 at 02:12:20PM -0400, Mark Yacoub wrote:
> From: Mark Yacoub 
> 
> [Why]
> 1. drm_atomic_helper_check doesn't check for the LUT sizes of either Gamma
> or Degamma props in the new CRTC state, allowing any invalid size to
> be passed on.
> 2. Each driver has its own LUT size, which could also be different for
> legacy users.
> 
> [How]
> 1. Create |degamma_lut_size| and |gamma_lut_size| to save the LUT sizes
> assigned by the driver when it's initializing its color and CTM
> management.
> 2. Create drm_atomic_helper_check_crtcs, which is called by
> drm_atomic_helper_check to check that the LUT sizes saved in drm_crtc
> match the sizes in the new CRTC state.
> 3. Rename the older LUT checks that test for the color channels to indicate
> it's a channel check. It's not included in drm_atomic_helper_check_crtcs,
> as it's hardware specific and is to be called by the driver.
> 4. As the LUT size check now happens in drm_atomic_helper_check, remove
> the lut check in intel_color.c

I'd probably split the rename out from the crtc check since they're only
tangentially related.

> 
> Fixes: igt@kms_color@pipe-A-invalid-gamma-lut-sizes on MTK
> Tested on Zork(amdgpu) and Jacuzzi(mediatek), volteer(TGL)
> 
> v1:
> 1. Fix typos
> 2. Remove the LUT size check from intel driver
> 3. Rename old LUT check to indicate it's a channel check
> 
> Signed-off-by: Mark Yacoub 
> ---
>  drivers/gpu/drm/drm_atomic_helper.c| 60 ++
>  drivers/gpu/drm/drm_color_mgmt.c   | 14 ++---
>  drivers/gpu/drm/i915/display/intel_color.c | 14 ++---
>  include/drm/drm_atomic_helper.h|  1 +
>  include/drm/drm_color_mgmt.h   |  7 +--
>  include/drm/drm_crtc.h | 11 
>  6 files changed, 89 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/drm_atomic_helper.c
> index bc3487964fb5e..5feb2ad0209c3 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -929,6 +929,62 @@ drm_atomic_helper_check_planes(struct drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_atomic_helper_check_planes);
>  
> +/**
> + * drm_atomic_helper_check_crtcs - validate state object for CRTC changes
> + * @state: the driver state object
> + *
> + * Check the CRTC state object such as the Gamma/Degamma LUT sizes if the new
> + * state holds them.
> + *
> + * RETURNS:
> + * Zero for success or -errno
> + */
> +int drm_atomic_helper_check_crtcs(struct drm_atomic_state *state)
> +{
> + struct drm_crtc *crtc;
> + struct drm_crtc_state *new_crtc_state;
> + int i;
> +
> + for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
> + if (new_crtc_state->color_mgmt_changed &&
> + new_crtc_state->gamma_lut) {
> + uint64_t supported_lut_size = crtc->gamma_lut_size;
> + uint32_t supported_legacy_lut_size = crtc->gamma_size;
> + uint32_t new_state_lut_size =
> + drm_color_lut_size(new_crtc_state->gamma_lut);
> +
> + if (new_state_lut_size != supported_lut_size &&
> + new_state_lut_size != supported_legacy_lut_size) {
> + drm_dbg_state(
> + state->dev,
> + "Invalid Gamma LUT size. Should be %u 
> (or %u for legacy) but got %u.\n",
> + supported_lut_size,
> + supported_legacy_lut_size,
> + new_state_lut_size);
> + return -EINVAL;
> + }
> + }
> +
> + if (new_crtc_state->color_mgmt_changed &&
> + new_crtc_state->degamma_lut) {
> + uint32_t new_state_lut_size =
> + drm_color_lut_size(new_crtc_state->degamma_lut);
> + uint64_t supported_lut_size = crtc->degamma_lut_size;
> +
> + if (new_state_lut_size != supported_lut_size) {
> + drm_dbg_state(
> + state->dev,
> + "Invalid Degamma LUT size. Should be %u 
> but got %u.\n",
> + supported_lut_size, new_state_lut_size);
> + return -EINVAL;
> + }
> + }
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_atomic_helper_check_crtcs);
> +
>  /**
>   * drm_atomic_helper_check - validate state object
>   * @dev: DRM device
> @@ -974,6 +1030,10 @@ int drm_atomic_helper_check(struct drm_device *dev,
>   if (ret)
>   return ret;
>  
> + ret = drm_atomic_helper_check_crtcs(state);
> + if (ret)
> + return ret;
> +
>   if (state->legacy_cursor_update)
>   state->async_updat

Re: [PATCH] mm/migrate.c: Remove MIGRATE_PFN_LOCKED

2021-10-25 Thread Ralph Campbell



On 10/24/21 21:16, Alistair Popple wrote:

MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect(). If it
wasn't, then a second attempt is made to lock the page. However if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity. So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.

Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.

Signed-off-by: Alistair Popple 


You can add:
Reviewed-by: Ralph Campbell 


---
  Documentation/vm/hmm.rst |   2 +-
  arch/powerpc/kvm/book3s_hv_uvmem.c   |   4 +-
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |   2 -
  drivers/gpu/drm/nouveau/nouveau_dmem.c   |   4 +-
  include/linux/migrate.h  |   1 -
  lib/test_hmm.c   |   5 +-
  mm/migrate.c | 145 +--
  7 files changed, 35 insertions(+), 128 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index a14c2938e7af..f2a59ed82ed3 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -360,7 +360,7 @@ between device driver specific code and shared common code:
 system memory page, locks the page with ``lock_page()``, and fills in the
 ``dst`` array entry with::
  
- dst[i] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;

+ dst[i] = migrate_pfn(page_to_pfn(dpage));
  
 Now that the driver knows that this page is being migrated, it can

 invalidate device private MMU mappings and copy device private memory
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index a7061ee3b157..28c436df9935 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -560,7 +560,7 @@ static int __kvmppc_svm_page_out(struct vm_area_struct *vma,
  gpa, 0, page_shift);
  
  	if (ret == U_SUCCESS)

-   *mig.dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED;
+   *mig.dst = migrate_pfn(pfn);
else {
unlock_page(dpage);
__free_page(dpage);
@@ -774,7 +774,7 @@ static int kvmppc_svm_page_in(struct vm_area_struct *vma,
}
}
  
-	*mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;

+   *mig.dst = migrate_pfn(page_to_pfn(dpage));
migrate_vma_pages(&mig);
  out_finalize:
migrate_vma_finalize(&mig);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 4a16e3c257b9..41d9417f182b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -300,7 +300,6 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate->dst[i] = svm_migrate_addr_to_pfn(adev, dst[i]);
svm_migrate_get_vram_page(prange, migrate->dst[i]);
migrate->dst[i] = migrate_pfn(migrate->dst[i]);
-   migrate->dst[i] |= MIGRATE_PFN_LOCKED;
src[i] = dma_map_page(dev, spage, 0, PAGE_SIZE,
  DMA_TO_DEVICE);
r = dma_mapping_error(dev, src[i]);
@@ -580,7 +579,6 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
  dst[i] >> PAGE_SHIFT, page_to_pfn(dpage));
  
  		migrate->dst[i] = migrate_pfn(page_to_pfn(dpage));

-   migrate->dst[i] |= MIGRATE_PFN_LOCKED;
j++;
}
  
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c

index 92987daa5e17..3828aafd3ac4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -166,7 +166,7 @@ static vm_fault_t nouveau_dmem_fault_copy_one(struct 
nouveau_drm *drm,
goto error_dma_unmap;
mutex_unlock(&svmm->mutex);
  
-	args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;

+   args->dst[0] = migrate_pfn(page_to_pfn(dpage));
return 0;
  
  error_dma_unmap:

@@ -602,7 +602,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct 
nouveau_drm *drm,
((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
if (src & MIGRATE_PFN_WRITE)
*pfn |= NVIF_VMM_PFNMAP_V0_W;
-   return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
+   return migrate_pfn(page_to_pfn(dpage));
  
  out_dma_unmap:

dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index c8077e936691..479b861ae490 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -119,7 +119,6 @@ static inline int migrate_misplaced_page(struct page *page,
   */
  #define MIGRATE_PF

Re: [PATCH v2 2/2] drm/amdkfd: Remove cu mask from struct queue_properties(v2)

2021-10-25 Thread Felix Kuehling
Am 2021-10-25 um 5:58 a.m. schrieb Lang Yu:
> Actually, cu_mask has been copied to mqd memory and
> doesn't have to persist in queue_properties. Remove it
> from queue_properties.
>
> And use struct mqd_update_info to store such properties,
> then pass it to update queue operation.
>
> v2:
> * Rename pqm_update_queue to pqm_update_queue_properties.
> * Rename struct queue_update_info to struct mqd_update_info.
> * Rename pqm_set_cu_mask to pqm_update_mqd.
>
> Suggested-by: Felix Kuehling 
> Signed-off-by: Lang Yu 

The series is

Reviewed-by: Felix Kuehling 


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 31 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  1 -
>  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  9 +++---
>  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  9 +++---
>  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  9 +++---
>  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  9 +++---
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 23 +-
>  .../amd/amdkfd/kfd_process_queue_manager.c| 20 +++-
>  8 files changed, 57 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 9317a2e238d0..24ebd61395d8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -405,7 +405,7 @@ static int kfd_ioctl_update_queue(struct file *filp, 
> struct kfd_process *p,
>  
>   mutex_lock(&p->mutex);
>  
> - retval = pqm_update_queue(&p->pqm, args->queue_id, &properties);
> + retval = pqm_update_queue_properties(&p->pqm, args->queue_id, 
> &properties);
>  
>   mutex_unlock(&p->mutex);
>  
> @@ -418,7 +418,7 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, 
> struct kfd_process *p,
>   int retval;
>   const int max_num_cus = 1024;
>   struct kfd_ioctl_set_cu_mask_args *args = data;
> - struct queue_properties properties;
> + struct mqd_update_info minfo = {0};
>   uint32_t __user *cu_mask_ptr = (uint32_t __user *)args->cu_mask_ptr;
>   size_t cu_mask_size = sizeof(uint32_t) * (args->num_cu_mask / 32);
>  
> @@ -428,8 +428,8 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, 
> struct kfd_process *p,
>   return -EINVAL;
>   }
>  
> - properties.cu_mask_count = args->num_cu_mask;
> - if (properties.cu_mask_count == 0) {
> + minfo.cu_mask.count = args->num_cu_mask;
> + if (minfo.cu_mask.count == 0) {
>   pr_debug("CU mask cannot be 0");
>   return -EINVAL;
>   }
> @@ -438,32 +438,33 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, 
> struct kfd_process *p,
>* limit of max_num_cus bits.  We can then just drop any CU mask bits
>* past max_num_cus bits and just use the first max_num_cus bits.
>*/
> - if (properties.cu_mask_count > max_num_cus) {
> + if (minfo.cu_mask.count > max_num_cus) {
>   pr_debug("CU mask cannot be greater than 1024 bits");
> - properties.cu_mask_count = max_num_cus;
> + minfo.cu_mask.count = max_num_cus;
>   cu_mask_size = sizeof(uint32_t) * (max_num_cus/32);
>   }
>  
> - properties.cu_mask = kzalloc(cu_mask_size, GFP_KERNEL);
> - if (!properties.cu_mask)
> + minfo.cu_mask.ptr = kzalloc(cu_mask_size, GFP_KERNEL);
> + if (!minfo.cu_mask.ptr)
>   return -ENOMEM;
>  
> - retval = copy_from_user(properties.cu_mask, cu_mask_ptr, cu_mask_size);
> + retval = copy_from_user(minfo.cu_mask.ptr, cu_mask_ptr, cu_mask_size);
>   if (retval) {
>   pr_debug("Could not copy CU mask from userspace");
> - kfree(properties.cu_mask);
> - return -EFAULT;
> + retval = -EFAULT;
> + goto out;
>   }
>  
> + minfo.update_flag = UPDATE_FLAG_CU_MASK;
> +
>   mutex_lock(&p->mutex);
>  
> - retval = pqm_set_cu_mask(&p->pqm, args->queue_id, &properties);
> + retval = pqm_update_mqd(&p->pqm, args->queue_id, &minfo);
>  
>   mutex_unlock(&p->mutex);
>  
> - if (retval)
> - kfree(properties.cu_mask);
> -
> +out:
> + kfree(minfo.cu_mask.ptr);
>   return retval;
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index a2b77d1df854..64b4ac339904 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -136,7 +136,6 @@ static bool kq_initialize(struct kernel_queue *kq, struct 
> kfd_dev *dev,
>   prop.write_ptr = (uint32_t *) kq->wptr_gpu_addr;
>   prop.eop_ring_buffer_address = kq->eop_gpu_addr;
>   prop.eop_ring_buffer_size = PAGE_SIZE;
> - prop.cu_mask = NULL;
>  
>   if (init_queue(&kq->queue, &prop) != 0)
>   goto err_init_queue;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
> b/drivers/gpu/drm/amd/amdkfd/

[PATCH RESEND v5 0/4] drm/amdgpu, Add DP 2.0 MST support + drm/dp_mst helpers

2021-10-25 Thread Lyude Paul
(Sorry for the noise, had to resend because I typo'd amd's mailing list
email address by accident)

Just resubmitting this patch series from AMD with _very_ minor changes
(just a typo and fixing a debug message) so that this can be pushed
upstream with a proper patchwork link. Will be pushing this into a topic
branch and submitting to airlied in a moment.

Bhawanpreet Lakha (3):
  drm: Remove slot checks in dp mst topology during commit
  drm: Update MST First Link Slot Information Based on Encoding Format
  drm/amd/display: Add DP 2.0 MST DM Support

Fangzhi Zuo (1):
  drm/amd/display: Add DP 2.0 MST DC Support

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  29 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |   3 +
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |   5 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  14 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 292 ++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  19 ++
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h|  13 +
 drivers/gpu/drm/drm_dp_mst_topology.c |  42 ++-
 drivers/gpu/drm/i915/display/intel_dp_mst.c   |   4 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.c   |   2 +-
 drivers/gpu/drm/radeon/radeon_dp_mst.c|   4 +-
 include/drm/drm_dp_mst_helper.h   |   5 +-
 13 files changed, 423 insertions(+), 16 deletions(-)

-- 
2.31.1



[PATCH RESEND v5 4/4] drm/amd/display: Add DP 2.0 MST DM Support

2021-10-25 Thread Lyude Paul
From: Bhawanpreet Lakha 

[Why]
Add DP2 MST and debugfs support

[How]
Update the slot info based on the link encoding format

Reviewed-by: "Lin, Wayne" 
Signed-off-by: Bhawanpreet Lakha 
Signed-off-by: Fangzhi Zuo 
Signed-off-by: Lyude Paul 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |  3 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  5 +++-
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f35561b5a465..ecdeeedb1cde 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10684,6 +10684,8 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
 #if defined(CONFIG_DRM_AMD_DC_DCN)
struct dsc_mst_fairness_vars vars[MAX_PIPES];
 #endif
+   struct drm_dp_mst_topology_state *mst_state;
+   struct drm_dp_mst_topology_mgr *mgr;
 
trace_amdgpu_dm_atomic_check_begin(state);
 
@@ -10891,6 +10893,33 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
lock_and_validation_needed = true;
}
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   /* set the slot info for each mst_state based on the link encoding 
format */
+   for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
+   struct amdgpu_dm_connector *aconnector;
+   struct drm_connector *connector;
+   struct drm_connector_list_iter iter;
+   u8 link_coding_cap;
+
+   if (!mgr->mst_state )
+   continue;
+
+   drm_connector_list_iter_begin(dev, &iter);
+   drm_for_each_connector_iter(connector, &iter) {
+   int id = connector->index;
+
+   if (id == mst_state->mgr->conn_base_id) {
+   aconnector = to_amdgpu_dm_connector(connector);
+   link_coding_cap = 
dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
+   drm_dp_mst_update_slots(mst_state, 
link_coding_cap);
+
+   break;
+   }
+   }
+   drm_connector_list_iter_end(&iter);
+
+   }
+#endif
/**
 * Streams and planes are reset when there are changes that affect
 * bandwidth. Anything that affects bandwidth needs to go through
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 814f67d86a3c..3d44896149a2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -294,6 +294,9 @@ static ssize_t dp_link_settings_write(struct file *f, const 
char __user *buf,
case LINK_RATE_RBR2:
case LINK_RATE_HIGH2:
case LINK_RATE_HIGH3:
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   case LINK_RATE_UHBR10:
+#endif
break;
default:
valid_input = false;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 6169488e2011..53b5cc7b0679 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -219,6 +219,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
struct drm_dp_mst_topology_mgr *mst_mgr;
struct drm_dp_mst_port *mst_port;
bool ret;
+   u8 link_coding_cap;
 
aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context;
/* Accessing the connector state is required for vcpi_slots allocation
@@ -238,6 +239,8 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
 
mst_port = aconnector->port;
 
+   link_coding_cap = 
dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
+
if (enable) {
 
ret = drm_dp_mst_allocate_vcpi(mst_mgr, mst_port,
@@ -251,7 +254,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
}
 
/* It's OK for this to fail */
-   drm_dp_update_payload_part1(mst_mgr, 1);
+   drm_dp_update_payload_part1(mst_mgr, (link_coding_cap == 
DP_CAP_ANSI_128B132B) ? 0:1);
 
/* mst_mgr->->payloads are VC payload notify MST branch using DPCD or
 * AUX message. The sequence is slot 1-63 allocated sequence for each
-- 
2.31.1



[PATCH RESEND v5 3/4] drm/amd/display: Add DP 2.0 MST DC Support

2021-10-25 Thread Lyude Paul
From: Fangzhi Zuo 

[Why]
configure/call DC interface for DP2 mst support. This is needed to make DP2
mst work.

[How]
- add encoding type, logging, mst update/reduce payload functions

Use the link encoding to determine the DP type (1.4 or 2.0) and add a
flag to dc_stream_update to determine whether to increase/reduce
payloads.

v2:
* add DP_UNKNOWN_ENCODING handling

Signed-off-by: Fangzhi Zuo 
Reviewed-by: "Lin, Wayne" 
Signed-off-by: Bhawanpreet Lakha 
Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  14 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 292 ++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  19 ++
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h|  13 +
 5 files changed, 345 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index da942e9f5142..782141ba8ac5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2356,6 +2356,11 @@ static enum surface_update_type 
check_update_surfaces_for_stream(
if (stream_update->dsc_config)
su_flags->bits.dsc_changed = 1;
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   if (stream_update->mst_bw_update)
+   su_flags->bits.mst_bw = 1;
+#endif
+
if (su_flags->raw != 0)
overall_type = UPDATE_TYPE_FULL;
 
@@ -2741,6 +2746,15 @@ static void commit_planes_do_stream_update(struct dc *dc,
if (stream_update->dsc_config)
dp_update_dsc_config(pipe_ctx);
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   if (stream_update->mst_bw_update) {
+   if (stream_update->mst_bw_update->is_increase)
+   dc_link_increase_mst_payload(pipe_ctx, 
stream_update->mst_bw_update->mst_stream_bw);
+   else
+   dc_link_reduce_mst_payload(pipe_ctx, 
stream_update->mst_bw_update->mst_stream_bw);
+   }
+#endif
+
if (stream_update->pending_test_pattern) {
dc_link_dp_set_test_pattern(stream->link,
stream->test_pattern.type,
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index ca5dc3c168ec..fd12561b70cc 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3272,6 +3272,9 @@ static struct fixed31_32 get_pbn_from_timing(struct 
pipe_ctx *pipe_ctx)
 static void update_mst_stream_alloc_table(
struct dc_link *link,
struct stream_encoder *stream_enc,
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   struct hpo_dp_stream_encoder *hpo_dp_stream_enc, // TODO: Rename 
stream_enc to dio_stream_enc?
+#endif
const struct dp_mst_stream_allocation_table *proposed_table)
 {
struct link_mst_stream_allocation work_table[MAX_CONTROLLER_NUM] = {
@@ -3308,6 +3311,9 @@ static void update_mst_stream_alloc_table(
work_table[i].slot_count =

proposed_table->stream_allocations[i].slot_count;
work_table[i].stream_enc = stream_enc;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   work_table[i].hpo_dp_stream_enc = hpo_dp_stream_enc;
+#endif
}
}
 
@@ -3430,6 +3436,10 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
struct dc_link *link = stream->link;
struct link_encoder *link_encoder = NULL;
struct stream_encoder *stream_encoder = pipe_ctx->stream_res.stream_enc;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   struct hpo_dp_link_encoder *hpo_dp_link_encoder = link->hpo_dp_link_enc;
+   struct hpo_dp_stream_encoder *hpo_dp_stream_encoder = 
pipe_ctx->stream_res.hpo_dp_stream_enc;
+#endif
struct dp_mst_stream_allocation_table proposed_table = {0};
struct fixed31_32 avg_time_slots_per_mtp;
struct fixed31_32 pbn;
@@ -3457,7 +3467,14 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
&proposed_table,
true)) {
update_mst_stream_alloc_table(
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   link,
+   pipe_ctx->stream_res.stream_enc,
+   pipe_ctx->stream_res.hpo_dp_stream_enc,
+   &proposed_table);
+#else
link, pipe_ctx->stream_res.stream_enc, 
&proposed_table);
+#endif
}
else
DC_LOG_WARNING("Failed to update"
@@ -3471,23 +3488,56 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)

[PATCH RESEND v5 2/4] drm: Update MST First Link Slot Information Based on Encoding Format

2021-10-25 Thread Lyude Paul
From: Bhawanpreet Lakha 

The 8b/10b encoding format requires reserving the first slot for
recording metadata. Real data transmission starts from the second slot,
with a total of 63 slots available.

In the 128b/132b encoding format, metadata is transmitted separately
in an LLCP packet before the MTP. Real data transmission starts from
the first slot, with a total of 64 slots available.

v2:
* Move total/start slots to mst_state, and copy it to mst_mgr in
atomic_check

v3:
* Only keep the slot info on the mst_state
* add a start_slot parameter to the payload function, to facilitate non
  atomic drivers (this is a temporary workaround and should be removed when
  we are moving out the non atomic driver helpers)

v4:
*fixed typo and formatting

v5: (no functional changes)
* Fixed formatting in drm_dp_mst_update_slots()
* Reference mst_state instead of mst_state->mgr for debugging info

Signed-off-by: Bhawanpreet Lakha 
Signed-off-by: Fangzhi Zuo 
[v5 nitpicks]
Reviewed-by: Lyude Paul 
Signed-off-by: Lyude Paul 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  2 +-
 drivers/gpu/drm/drm_dp_mst_topology.c | 36 ---
 drivers/gpu/drm/i915/display/intel_dp_mst.c   |  4 +--
 drivers/gpu/drm/nouveau/dispnv50/disp.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_dp_mst.c|  4 +--
 include/drm/drm_dp_mst_helper.h   |  5 ++-
 6 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index ff0f91c93ba4..6169488e2011 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -251,7 +251,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
}
 
/* It's OK for this to fail */
-   drm_dp_update_payload_part1(mst_mgr);
+   drm_dp_update_payload_part1(mst_mgr, 1);
 
/* mst_mgr->->payloads are VC payload notify MST branch using DPCD or
 * AUX message. The sequence is slot 1-63 allocated sequence for each
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
b/drivers/gpu/drm/drm_dp_mst_topology.c
index 04ed34a7f71c..571da0c2f39f 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3355,6 +3355,10 @@ static int drm_dp_destroy_payload_step2(struct 
drm_dp_mst_topology_mgr *mgr,
 /**
  * drm_dp_update_payload_part1() - Execute payload update part 1
  * @mgr: manager to use.
+ * @start_slot: this is the cur slot
+ *
+ * NOTE: start_slot is a temporary workaround for non-atomic drivers,
+ * this will be removed when non-atomic mst helpers are moved out of the helper
  *
  * This iterates over all proposed virtual channels, and tries to
  * allocate space in the link for them. For 0->slots transitions,
@@ -3365,12 +3369,12 @@ static int drm_dp_destroy_payload_step2(struct 
drm_dp_mst_topology_mgr *mgr,
  * after calling this the driver should generate ACT and payload
  * packets.
  */
-int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr)
+int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr, int 
start_slot)
 {
struct drm_dp_payload req_payload;
struct drm_dp_mst_port *port;
int i, j;
-   int cur_slots = 1;
+   int cur_slots = start_slot;
bool skip;
 
mutex_lock(&mgr->payload_lock);
@@ -4505,6 +4509,27 @@ int drm_dp_atomic_release_vcpi_slots(struct 
drm_atomic_state *state,
 }
 EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots);
 
+/**
+ * drm_dp_mst_update_slots() - updates the slot info depending on the DP 
ecoding format
+ * @mst_state: mst_state to update
+ * @link_encoding_cap: the ecoding format on the link
+ */
+void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, 
uint8_t link_encoding_cap)
+{
+   if (link_encoding_cap == DP_CAP_ANSI_128B132B) {
+   mst_state->total_avail_slots = 64;
+   mst_state->start_slot = 0;
+   } else {
+   mst_state->total_avail_slots = 63;
+   mst_state->start_slot = 1;
+   }
+
+   DRM_DEBUG_KMS("%s encoding format on mst_state 0x%p\n",
+ (link_encoding_cap == DP_CAP_ANSI_128B132B) ? 
"128b/132b":"8b/10b",
+ mst_state);
+}
+EXPORT_SYMBOL(drm_dp_mst_update_slots);
+
 /**
  * drm_dp_mst_allocate_vcpi() - Allocate a virtual channel
  * @mgr: manager for this port
@@ -5224,7 +5249,7 @@ drm_dp_mst_atomic_check_vcpi_alloc_limit(struct 
drm_dp_mst_topology_mgr *mgr,
 struct drm_dp_mst_topology_state 
*mst_state)
 {
struct drm_dp_vcpi_allocation *vcpi;
-   int avail_slots = 63, payload_count = 0;
+   int avail_slots = mst_state->total_avail_slots, payload_count = 0;
 
list_for_each_entry(vcpi, &mst_state->vcpis, next) {
/* Releasing VCPI is always OK-even if the port is gone */
@@ -5253,7 +5278,7 @@ dr

[PATCH RESEND v5 1/4] drm: Remove slot checks in dp mst topology during commit

2021-10-25 Thread Lyude Paul
From: Bhawanpreet Lakha 

This code path is used during commit, and we don't expect things to fail
during the commit stage, so remove this.

Signed-off-by: Bhawanpreet Lakha 
Reviewed-by: Lyude Paul 
Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
b/drivers/gpu/drm/drm_dp_mst_topology.c
index 86d13d6bc463..04ed34a7f71c 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -4334,10 +4334,6 @@ static int drm_dp_init_vcpi(struct 
drm_dp_mst_topology_mgr *mgr,
 {
int ret;
 
-   /* max. time slots - one slot for MTP header */
-   if (slots > 63)
-   return -ENOSPC;
-
vcpi->pbn = pbn;
vcpi->aligned_pbn = slots * mgr->pbn_div;
vcpi->num_slots = slots;
@@ -4540,7 +4536,7 @@ bool drm_dp_mst_allocate_vcpi(struct 
drm_dp_mst_topology_mgr *mgr,
 
ret = drm_dp_init_vcpi(mgr, &port->vcpi, pbn, slots);
if (ret) {
-   drm_dbg_kms(mgr->dev, "failed to init vcpi slots=%d max=63 
ret=%d\n",
+   drm_dbg_kms(mgr->dev, "failed to init vcpi slots=%d ret=%d\n",
DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
drm_dp_mst_topology_put_port(port);
goto out;
-- 
2.31.1



[PATCH v5 4/4] drm/amd/display: Add DP 2.0 MST DM Support

2021-10-25 Thread Lyude Paul
From: Bhawanpreet Lakha 

[Why]
Add DP2 MST and debugfs support

[How]
Update the slot info based on the link encoding format

Reviewed-by: "Lin, Wayne" 
Signed-off-by: Bhawanpreet Lakha 
Signed-off-by: Fangzhi Zuo 
Signed-off-by: Lyude Paul 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |  3 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  5 +++-
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f35561b5a465..ecdeeedb1cde 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10684,6 +10684,8 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
 #if defined(CONFIG_DRM_AMD_DC_DCN)
struct dsc_mst_fairness_vars vars[MAX_PIPES];
 #endif
+   struct drm_dp_mst_topology_state *mst_state;
+   struct drm_dp_mst_topology_mgr *mgr;
 
trace_amdgpu_dm_atomic_check_begin(state);
 
@@ -10891,6 +10893,33 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
lock_and_validation_needed = true;
}
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   /* set the slot info for each mst_state based on the link encoding 
format */
+   for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
+   struct amdgpu_dm_connector *aconnector;
+   struct drm_connector *connector;
+   struct drm_connector_list_iter iter;
+   u8 link_coding_cap;
+
+   if (!mgr->mst_state )
+   continue;
+
+   drm_connector_list_iter_begin(dev, &iter);
+   drm_for_each_connector_iter(connector, &iter) {
+   int id = connector->index;
+
+   if (id == mst_state->mgr->conn_base_id) {
+   aconnector = to_amdgpu_dm_connector(connector);
+   link_coding_cap = 
dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
+   drm_dp_mst_update_slots(mst_state, 
link_coding_cap);
+
+   break;
+   }
+   }
+   drm_connector_list_iter_end(&iter);
+
+   }
+#endif
/**
 * Streams and planes are reset when there are changes that affect
 * bandwidth. Anything that affects bandwidth needs to go through
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 814f67d86a3c..3d44896149a2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -294,6 +294,9 @@ static ssize_t dp_link_settings_write(struct file *f, const 
char __user *buf,
case LINK_RATE_RBR2:
case LINK_RATE_HIGH2:
case LINK_RATE_HIGH3:
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   case LINK_RATE_UHBR10:
+#endif
break;
default:
valid_input = false;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 6169488e2011..53b5cc7b0679 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -219,6 +219,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
struct drm_dp_mst_topology_mgr *mst_mgr;
struct drm_dp_mst_port *mst_port;
bool ret;
+   u8 link_coding_cap;
 
aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context;
/* Accessing the connector state is required for vcpi_slots allocation
@@ -238,6 +239,8 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
 
mst_port = aconnector->port;
 
+   link_coding_cap = 
dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link);
+
if (enable) {
 
ret = drm_dp_mst_allocate_vcpi(mst_mgr, mst_port,
@@ -251,7 +254,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
}
 
/* It's OK for this to fail */
-   drm_dp_update_payload_part1(mst_mgr, 1);
+   drm_dp_update_payload_part1(mst_mgr, (link_coding_cap == 
DP_CAP_ANSI_128B132B) ? 0:1);
 
/* mst_mgr->->payloads are VC payload notify MST branch using DPCD or
 * AUX message. The sequence is slot 1-63 allocated sequence for each
-- 
2.31.1



[PATCH v5 3/4] drm/amd/display: Add DP 2.0 MST DC Support

2021-10-25 Thread Lyude Paul
From: Fangzhi Zuo 

[Why]
configure/call DC interface for DP2 mst support. This is needed to make DP2
mst work.

[How]
- add encoding type, logging, mst update/reduce payload functions

Use the link encoding to determine the DP type (1.4 or 2.0) and add a
flag to dc_stream_update to determine whether to increase/reduce
payloads.

v2:
* add DP_UNKNOWN_ENCODING handling

Signed-off-by: Fangzhi Zuo 
Reviewed-by: "Lin, Wayne" 
Signed-off-by: Bhawanpreet Lakha 
Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  14 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 292 ++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  19 ++
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h|  13 +
 5 files changed, 345 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index da942e9f5142..782141ba8ac5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2356,6 +2356,11 @@ static enum surface_update_type 
check_update_surfaces_for_stream(
if (stream_update->dsc_config)
su_flags->bits.dsc_changed = 1;
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   if (stream_update->mst_bw_update)
+   su_flags->bits.mst_bw = 1;
+#endif
+
if (su_flags->raw != 0)
overall_type = UPDATE_TYPE_FULL;
 
@@ -2741,6 +2746,15 @@ static void commit_planes_do_stream_update(struct dc *dc,
if (stream_update->dsc_config)
dp_update_dsc_config(pipe_ctx);
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   if (stream_update->mst_bw_update) {
+   if (stream_update->mst_bw_update->is_increase)
+   dc_link_increase_mst_payload(pipe_ctx, 
stream_update->mst_bw_update->mst_stream_bw);
+   else
+   dc_link_reduce_mst_payload(pipe_ctx, 
stream_update->mst_bw_update->mst_stream_bw);
+   }
+#endif
+
if (stream_update->pending_test_pattern) {
dc_link_dp_set_test_pattern(stream->link,
stream->test_pattern.type,
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index ca5dc3c168ec..fd12561b70cc 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3272,6 +3272,9 @@ static struct fixed31_32 get_pbn_from_timing(struct 
pipe_ctx *pipe_ctx)
 static void update_mst_stream_alloc_table(
struct dc_link *link,
struct stream_encoder *stream_enc,
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   struct hpo_dp_stream_encoder *hpo_dp_stream_enc, // TODO: Rename 
stream_enc to dio_stream_enc?
+#endif
const struct dp_mst_stream_allocation_table *proposed_table)
 {
struct link_mst_stream_allocation work_table[MAX_CONTROLLER_NUM] = {
@@ -3308,6 +3311,9 @@ static void update_mst_stream_alloc_table(
work_table[i].slot_count =

proposed_table->stream_allocations[i].slot_count;
work_table[i].stream_enc = stream_enc;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   work_table[i].hpo_dp_stream_enc = hpo_dp_stream_enc;
+#endif
}
}
 
@@ -3430,6 +3436,10 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
struct dc_link *link = stream->link;
struct link_encoder *link_encoder = NULL;
struct stream_encoder *stream_encoder = pipe_ctx->stream_res.stream_enc;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   struct hpo_dp_link_encoder *hpo_dp_link_encoder = link->hpo_dp_link_enc;
+   struct hpo_dp_stream_encoder *hpo_dp_stream_encoder = 
pipe_ctx->stream_res.hpo_dp_stream_enc;
+#endif
struct dp_mst_stream_allocation_table proposed_table = {0};
struct fixed31_32 avg_time_slots_per_mtp;
struct fixed31_32 pbn;
@@ -3457,7 +3467,14 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
&proposed_table,
true)) {
update_mst_stream_alloc_table(
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   link,
+   pipe_ctx->stream_res.stream_enc,
+   pipe_ctx->stream_res.hpo_dp_stream_enc,
+   &proposed_table);
+#else
link, pipe_ctx->stream_res.stream_enc, 
&proposed_table);
+#endif
}
else
DC_LOG_WARNING("Failed to update"
@@ -3471,23 +3488,56 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)

[PATCH v5 2/4] drm: Update MST First Link Slot Information Based on Encoding Format

2021-10-25 Thread Lyude Paul
From: Bhawanpreet Lakha 

The 8b/10b encoding format requires reserving the first slot for
recording metadata. Real data transmission starts from the second slot,
with a total of 63 slots available.

In the 128b/132b encoding format, metadata is transmitted separately
in an LLCP packet before the MTP. Real data transmission starts from
the first slot, with a total of 64 slots available.

v2:
* Move total/start slots to mst_state, and copy it to mst_mgr in
atomic_check

v3:
* Only keep the slot info on the mst_state
* add a start_slot parameter to the payload function, to facilitate non
  atomic drivers (this is a temporary workaround and should be removed when
  we are moving out the non atomic driver helpers)

v4:
*fixed typo and formatting

v5: (no functional changes)
* Fixed formatting in drm_dp_mst_update_slots()
* Reference mst_state instead of mst_state->mgr for debugging info

Signed-off-by: Bhawanpreet Lakha 
Signed-off-by: Fangzhi Zuo 
[v5 nitpicks]
Reviewed-by: Lyude Paul 
Signed-off-by: Lyude Paul 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  2 +-
 drivers/gpu/drm/drm_dp_mst_topology.c | 36 ---
 drivers/gpu/drm/i915/display/intel_dp_mst.c   |  4 +--
 drivers/gpu/drm/nouveau/dispnv50/disp.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_dp_mst.c|  4 +--
 include/drm/drm_dp_mst_helper.h   |  5 ++-
 6 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index ff0f91c93ba4..6169488e2011 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -251,7 +251,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
}
 
/* It's OK for this to fail */
-   drm_dp_update_payload_part1(mst_mgr);
+   drm_dp_update_payload_part1(mst_mgr, 1);
 
/* mst_mgr->->payloads are VC payload notify MST branch using DPCD or
 * AUX message. The sequence is slot 1-63 allocated sequence for each
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
b/drivers/gpu/drm/drm_dp_mst_topology.c
index 04ed34a7f71c..571da0c2f39f 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3355,6 +3355,10 @@ static int drm_dp_destroy_payload_step2(struct 
drm_dp_mst_topology_mgr *mgr,
 /**
  * drm_dp_update_payload_part1() - Execute payload update part 1
  * @mgr: manager to use.
+ * @start_slot: this is the cur slot
+ *
+ * NOTE: start_slot is a temporary workaround for non-atomic drivers,
+ * this will be removed when non-atomic mst helpers are moved out of the helper
  *
  * This iterates over all proposed virtual channels, and tries to
  * allocate space in the link for them. For 0->slots transitions,
@@ -3365,12 +3369,12 @@ static int drm_dp_destroy_payload_step2(struct 
drm_dp_mst_topology_mgr *mgr,
  * after calling this the driver should generate ACT and payload
  * packets.
  */
-int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr)
+int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr, int 
start_slot)
 {
struct drm_dp_payload req_payload;
struct drm_dp_mst_port *port;
int i, j;
-   int cur_slots = 1;
+   int cur_slots = start_slot;
bool skip;
 
mutex_lock(&mgr->payload_lock);
@@ -4505,6 +4509,27 @@ int drm_dp_atomic_release_vcpi_slots(struct 
drm_atomic_state *state,
 }
 EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots);
 
+/**
+ * drm_dp_mst_update_slots() - updates the slot info depending on the DP 
ecoding format
+ * @mst_state: mst_state to update
+ * @link_encoding_cap: the ecoding format on the link
+ */
+void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, 
uint8_t link_encoding_cap)
+{
+   if (link_encoding_cap == DP_CAP_ANSI_128B132B) {
+   mst_state->total_avail_slots = 64;
+   mst_state->start_slot = 0;
+   } else {
+   mst_state->total_avail_slots = 63;
+   mst_state->start_slot = 1;
+   }
+
+   DRM_DEBUG_KMS("%s encoding format on mst_state 0x%p\n",
+ (link_encoding_cap == DP_CAP_ANSI_128B132B) ? 
"128b/132b":"8b/10b",
+ mst_state);
+}
+EXPORT_SYMBOL(drm_dp_mst_update_slots);
+
 /**
  * drm_dp_mst_allocate_vcpi() - Allocate a virtual channel
  * @mgr: manager for this port
@@ -5224,7 +5249,7 @@ drm_dp_mst_atomic_check_vcpi_alloc_limit(struct 
drm_dp_mst_topology_mgr *mgr,
 struct drm_dp_mst_topology_state 
*mst_state)
 {
struct drm_dp_vcpi_allocation *vcpi;
-   int avail_slots = 63, payload_count = 0;
+   int avail_slots = mst_state->total_avail_slots, payload_count = 0;
 
list_for_each_entry(vcpi, &mst_state->vcpis, next) {
/* Releasing VCPI is always OK-even if the port is gone */
@@ -5253,7 +5278,7 @@ dr

Re: Lockdep splat on killing a process

2021-10-25 Thread Christian König
In general I'm all there to get this fixed, but there is one major 
problem: Drivers don't expect the lock to be dropped.


What we could do is to change all drivers so they always call the 
dma_fence_signal functions and drop the _locked variants. This way we 
could move calling the callback out of the spinlock.


But that requires audit of all drivers, so quite a lot of work to do.

Regards,
Christian.

On 25.10.21 at 21:10, Andrey Grodzovsky wrote:
Adding back Daniel (somehow he got off the addresses list) and Chris 
who worked a lot in this area.


On 2021-10-21 2:34 a.m., Christian König wrote:



On 20.10.21 at 21:32, Andrey Grodzovsky wrote:

On 2021-10-04 4:14 a.m., Christian König wrote:


The problem is a bit different.

The callback is on the dependent fence, while we need to signal the 
scheduler fence.


Daniel is right that this needs an irq_work struct to handle this 
properly.


Christian.



So we had some discussions with Christian regarding irq_work and 
agreed I should look into doing it but stepping back for a sec -


Why do we insist on calling the dma_fence_cb with fence->lock held?
Is it because of dma_fence_add_callback?
Because we first test for DMA_FENCE_FLAG_SIGNALED_BIT and only after 
that take fence->lock? If so, can't we
move the DMA_FENCE_FLAG_SIGNALED_BIT check inside the locked section? 
Because if in theory
we could call the cb with fence->lock unlocked (i.e. this kind of 
iteration: 
https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/ttm/ttm_resource.c#L117)

we wouldn't have the lockdep splat. And in general, is it really
the correct approach to call third-party code from a callback 
with a locked spinlock? We don't know what the cb does inside,
and I don't see any explicit restrictions in the documentation of 
dma_fence_func_t on what can and cannot be done there.


Yeah, that's exactly what I meant with using the irq_work directly in 
the fence code.



My idea is not to use irq_work at all but instead to implement 
unlocked dma_fence cb execution using an iteration
which drops the spinlock before each cb is executed and acquires 
it again afterwards (until cb_list is empty).






The problem is dma_fence_signal_locked() which is used by quite a 
number of drivers to signal the fence while holding the lock.



For this I think we should not reuse dma_fence_signal_locked inside 
dma_fence_signal and instead implement it using the
unlocked iteration I mentioned above. I looked a bit at the code and 
the history, and I see that until some time ago
(this commit by Chris: 0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5) 
dma_fence_signal was indeed doing its own, locked iteration
and wasn't reusing dma_fence_signal_locked. This way whoever relies on 
dma_fence_signal_locked won't be impacted,
and whoever does not (like us in 
drm_sched_fence_scheduled/drm_sched_fence_finished) should also not be 
impacted by the narrower
scope of the lock. I also looked at dma_fence_default_wait and how it 
takes fence->lock and checks whether the fence is signaled

before the wait starts, and I don't see a problem there either.

I attached a quick draft of this proposal to clarify.

Andrey




Otherwise we could indeed simplify the fence handling a lot.

Christian.



Andrey




On 01.10.21 at 17:10, Andrey Grodzovsky wrote:
From what I see here you are supposed to have an actual deadlock and not 
only a warning: sched_fence->finished is first signaled from within
the hw fence done callback (drm_sched_job_done_cb) but then again from 
within its own callback (drm_sched_entity_kill_jobs_cb), so it
looks like the same fence object is recursively signaled twice. This 
leads to an attempt to lock fence->lock a second time while it's already
locked. I don't see a need to call drm_sched_fence_finished from 
within drm_sched_entity_kill_jobs_cb, as this callback is already 
registered
on the sched_fence->finished fence (entity->last_scheduled == 
s_fence->finished) and hence the signaling already took place.


Andrey

On 2021-10-01 6:50 a.m., Christian König wrote:

Hey, Andrey.

while investigating some memory management problems I've got the 
lockdep splat below.


Looks like something is wrong with 
drm_sched_entity_kill_jobs_cb(), can you investigate?


Thanks,
Christian.

[11176.741052] ============================================
[11176.741056] WARNING: possible recursive locking detected
[11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted
[11176.741066] --------------------------------------------
[11176.741070] swapper/12/0 is trying to acquire lock:
[11176.741074] 9c337ed175a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80

[11176.741088]
   but task is already holding lock:
[11176.741092] 9c337ed172a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80

[11176.741100]
   other info that might help us debug this:
[11176.741104]  Possible unsafe locking scenario:

[11176.741108]    CPU0
[11176.741110]    ----
[11176.741113]   lock(&fence->lock);
[1117

Re: [PATCH] drm/amdgpu/display: add quirk handling for stutter mode

2021-10-25 Thread Harry Wentland
On 2021-10-25 14:52, Alex Deucher wrote:
> Stutter mode is a power-saving feature on GPUs; however, at
> least one early Raven system exhibits stability issues with
> it. Add a quirk to disable it for that system.
> 
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214417
> Signed-off-by: Alex Deucher 

Reviewed-by: Harry Wentland 

Harry

> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 +++
>  1 file changed, 33 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 5b8d1635da5c..b635b893837b 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1295,6 +1295,37 @@ static struct hpd_rx_irq_offload_work_queue 
> *hpd_rx_irq_create_workqueue(struct
>   return hpd_rx_offload_wq;
>  }
>  
> +struct amdgpu_stutter_quirk {
> + u16 chip_vendor;
> + u16 chip_device;
> + u16 subsys_vendor;
> + u16 subsys_device;
> + u8 revision;
> +};
> +
> +static const struct amdgpu_stutter_quirk amdgpu_stutter_quirk_list[] = {
> + /* https://bugzilla.kernel.org/show_bug.cgi?id=214417 */
> + { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 },
> + { 0, 0, 0, 0, 0 },
> +};
> +
> +static bool dm_should_disable_stutter(struct pci_dev *pdev)
> +{
> + const struct amdgpu_stutter_quirk *p = amdgpu_stutter_quirk_list;
> +
> + while (p && p->chip_device != 0) {
> + if (pdev->vendor == p->chip_vendor &&
> + pdev->device == p->chip_device &&
> + pdev->subsystem_vendor == p->subsys_vendor &&
> + pdev->subsystem_device == p->subsys_device &&
> + pdev->revision == p->revision) {
> + return true;
> + }
> + ++p;
> + }
> + return false;
> +}
> +
>  static int amdgpu_dm_init(struct amdgpu_device *adev)
>  {
>   struct dc_init_data init_data;
> @@ -1406,6 +1437,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
>  
>   if (adev->asic_type != CHIP_CARRIZO && adev->asic_type != CHIP_STONEY)
>   adev->dm.dc->debug.disable_stutter = amdgpu_pp_feature_mask & 
> PP_STUTTER_MODE ? false : true;
> + if (dm_should_disable_stutter(adev->pdev))
> + adev->dm.dc->debug.disable_stutter = true;
>  
>   if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER)
>   adev->dm.dc->debug.disable_stutter = true;
> 



Re: Lockdep spalt on killing a processes

2021-10-25 Thread Andrey Grodzovsky
Adding back Daniel (somehow he got off the addresses list) and Chris who 
worked a lot in this area.


On 2021-10-21 2:34 a.m., Christian König wrote:



On 20.10.21 at 21:32, Andrey Grodzovsky wrote:

On 2021-10-04 4:14 a.m., Christian König wrote:


The problem is a bit different.

The callback is on the dependent fence, while we need to signal the 
scheduler fence.


Daniel is right that this needs an irq_work struct to handle this 
properly.


Christian.



So we had some discussions with Christian regarding irq_work and 
agreed I should look into doing it but stepping back for a sec -


Why do we insist on calling the dma_fence_cb with fence->lock locked? 
Is it because of dma_fence_add_callback? Because we first test for 
DMA_FENCE_FLAG_SIGNALED_BIT and only after that lock the fence->lock? 
If so, can't we move the DMA_FENCE_FLAG_SIGNALED_BIT check inside the 
locked section? Because if in theory we could call the cb with 
fence->lock unlocked (i.e. this kind of iteration 
https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/ttm/ttm_resource.c#L117)

we wouldn't have the lockdep splat. And in general, is it really the 
correct approach to call third-party code from a callback with a locked 
spinlock? We don't know what the cb does inside, and I don't see any 
explicit restrictions in the documentation of dma_fence_func_t on what 
can and cannot be done there.


Yeah, that's exactly what I meant with using the irq_work directly in 
the fence code.



My idea is not to use irq work at all but instead to implement unlocked 
dma_fence cb execution using an iteration which drops the spinlock each 
time the next cb is executed and reacquires it afterwards (until 
cb_list is empty).






The problem is dma_fence_signal_locked() which is used by quite a 
number of drivers to signal the fence while holding the lock.



For this I think we should not reuse dma_fence_signal_locked inside 
dma_fence_signal and instead implement it using the unlocked iteration 
I mentioned above. I looked a bit in the code and the history and I see 
that until some time ago (this commit by Chris 
0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5), dma_fence_signal indeed did 
its own, locked iteration and wasn't reusing dma_fence_signal_locked. 
This way whoever relies on dma_fence_signal_locked won't be impacted, 
and whoever is not (like us in 
drm_sched_fence_scheduled/drm_sched_fence_finished) should also not be 
impacted by the narrower scope of the lock. I also looked at 
dma_fence_default_wait and how it locks the fence->lock and checks 
whether the fence is signaled before the wait starts, and I don't see a 
problem there either.

I attached a quick draft of this proposal to clarify.

Andrey
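
(The draft itself isn't preserved in this archive. The following is only a 
rough, hypothetical sketch of the unlocked iteration being described - not 
Andrey's actual attachment - where the signal path walks cb_list and drops 
the lock around each callback invocation:)

static void dma_fence_signal_cbs_unlocked(struct dma_fence *fence)
{
	struct dma_fence_cb *cur;
	unsigned long flags;

	spin_lock_irqsave(fence->lock, flags);
	while (!list_empty(&fence->cb_list)) {
		cur = list_first_entry(&fence->cb_list,
				       struct dma_fence_cb, node);
		list_del_init(&cur->node);

		/* Drop the lock so the callback runs unlocked; it may then
		 * take fence->lock itself (e.g. to signal another fence that
		 * uses the same lock) without recursing. */
		spin_unlock_irqrestore(fence->lock, flags);
		cur->func(fence, cur);
		spin_lock_irqsave(fence->lock, flags);
	}
	spin_unlock_irqrestore(fence->lock, flags);
}

(This assumes it is only called from dma_fence_signal() after the signaled 
bit is set, so no new callbacks can be added to cb_list concurrently, and 
that dma_fence_signal_locked() users keep the existing behaviour.)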




Otherwise we could indeed simplify the fence handling a lot.

Christian.



Andrey




On 01.10.21 at 17:10, Andrey Grodzovsky wrote:
From what I see here you are supposed to have an actual deadlock and not 
only a warning: sched_fence->finished is first signaled from within the 
hw fence done callback (drm_sched_job_done_cb) but then again from 
within its own callback (drm_sched_entity_kill_jobs_cb), so it looks 
like the same fence object is recursively signaled twice. This leads to 
an attempt to lock fence->lock a second time while it's already locked. 
I don't see a need to call drm_sched_fence_finished from within 
drm_sched_entity_kill_jobs_cb as this callback is already registered on 
the sched_fence->finished fence (entity->last_scheduled == 
s_fence->finished) and hence the signaling already took place.


Andrey

On 2021-10-01 6:50 a.m., Christian König wrote:

Hey, Andrey.

while investigating some memory management problems I've got the 
logdep splat below.


Looks like something is wrong with 
drm_sched_entity_kill_jobs_cb(), can you investigate?


Thanks,
Christian.

[11176.741052] ============================================
[11176.741056] WARNING: possible recursive locking detected
[11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted
[11176.741066] --------------------------------------------
[11176.741070] swapper/12/0 is trying to acquire lock:
[11176.741074] 9c337ed175a8 (&fence->lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741088]
   but task is already holding lock:
[11176.741092] 9c337ed172a8 (&fence->lock){-.-.}-{3:3}, at: 
dma_fence_signal+0x28/0x80

[11176.741100]
   other info that might help us debug this:
[11176.741104]  Possible unsafe locking scenario:

[11176.741108]    CPU0
[11176.741110]    
[11176.741113]   lock(&fence->lock);
[11176.741118]   lock(&fence->lock);
[11176.741122]
    *** DEADLOCK ***

[11176.741125]  May be due to missing lock nesting notation

[11176.741128] 2 locks held by swapper/12/0:
[11176.741133]  #0: 9c339c30f768 
(&ring->fence_drv.lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80
[11176.741142]  #1: 9c337ed172a8 (&fence->lock){-.-.}-{3:3}, 
at: dma_fence_signal+0x28/0x80

[11176.741151]
   stack backtrace:
[11176.741155] CPU: 12 P

[PATCH] drm/amdgpu/display: add quirk handling for stutter mode

2021-10-25 Thread Alex Deucher
Stutter mode is a power saving feature on GPUs, however at
least one early raven system exhibits stability issues with
it.  Add a quirk to disable it for that system.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214417
Signed-off-by: Alex Deucher 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 +++
 1 file changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 5b8d1635da5c..b635b893837b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1295,6 +1295,37 @@ static struct hpd_rx_irq_offload_work_queue 
*hpd_rx_irq_create_workqueue(struct
return hpd_rx_offload_wq;
 }
 
+struct amdgpu_stutter_quirk {
+   u16 chip_vendor;
+   u16 chip_device;
+   u16 subsys_vendor;
+   u16 subsys_device;
+   u8 revision;
+};
+
+static const struct amdgpu_stutter_quirk amdgpu_stutter_quirk_list[] = {
+   /* https://bugzilla.kernel.org/show_bug.cgi?id=214417 */
+   { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 },
+   { 0, 0, 0, 0, 0 },
+};
+
+static bool dm_should_disable_stutter(struct pci_dev *pdev)
+{
+   const struct amdgpu_stutter_quirk *p = amdgpu_stutter_quirk_list;
+
+   while (p && p->chip_device != 0) {
+   if (pdev->vendor == p->chip_vendor &&
+   pdev->device == p->chip_device &&
+   pdev->subsystem_vendor == p->subsys_vendor &&
+   pdev->subsystem_device == p->subsys_device &&
+   pdev->revision == p->revision) {
+   return true;
+   }
+   ++p;
+   }
+   return false;
+}
+
 static int amdgpu_dm_init(struct amdgpu_device *adev)
 {
struct dc_init_data init_data;
@@ -1406,6 +1437,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
 
if (adev->asic_type != CHIP_CARRIZO && adev->asic_type != CHIP_STONEY)
adev->dm.dc->debug.disable_stutter = amdgpu_pp_feature_mask & 
PP_STUTTER_MODE ? false : true;
+   if (dm_should_disable_stutter(adev->pdev))
+   adev->dm.dc->debug.disable_stutter = true;
 
if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER)
adev->dm.dc->debug.disable_stutter = true;
-- 
2.31.1
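
(For reference, extending the quirk to another affected system would just 
mean adding an entry before the terminator. The second entry below is 
purely hypothetical - its subsystem IDs are made up for illustration and 
are not from a real report:)

static const struct amdgpu_stutter_quirk amdgpu_stutter_quirk_list[] = {
	/* https://bugzilla.kernel.org/show_bug.cgi?id=214417 */
	{ 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 },
	/* hypothetical second entry for another affected board */
	{ 0x1002, 0x15dd, 0x1043, 0x876b, 0xc8 },
	{ 0, 0, 0, 0, 0 },
};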



RE: [PATCH] drm/amdgpu: Restore information reporting in RAS

2021-10-25 Thread Russell, Kent
[AMD Official Use Only]



> -Original Message-
> From: Kuehling, Felix 
> Sent: Monday, October 25, 2021 1:30 PM
> To: Tuikov, Luben ; amd-gfx@lists.freedesktop.org
> Cc: Russell, Kent ; Deucher, Alexander
> 
> Subject: Re: [PATCH] drm/amdgpu: Restore information reporting in RAS
> 
> On 2021-10-25 at 12:02 p.m., Luben Tuikov wrote:
> > A recent patch took away the reporting of number of RAS records and
> > the threshold due to the way it was edited/spliced on top of the code.
> > This patch restores this reporting.
> >
> > Cc: Kent Russell 
> > Cc: Alex Deucher 
> > Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for 
> > ignoring bad
> page threshold")
> > Signed-off-by: Luben Tuikov 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> > index ae64ca02ccc4f8..05117eda105b55 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> > @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct
> amdgpu_ras_eeprom_control *control,
> > res = 0;
> > } else {
> > *exceed_err_limit = true;
> > -   dev_err(adev->dev, "GPU will not be 
> > initialized. Replace this
> GPU or increase the threshold.");
> > +   dev_err(adev->dev,
> > +   "RAS records:%d exceed threshold:%d, "
> > +   "GPU will not be initialized. Replace 
> > this GPU or
> increase the threshold",
> 
> Splitting messages across multiple lines is usually discouraged
> (presumably because it makes them hard to grep). I think checkpatch will
> treat this as an error, while a long line is just a warning. Therefore
> it seems that long lines are less bad than split messages.

There are a few spots in the eeprom file where it gets done like this; I don't 
really like it either. Under 
https://www.kernel.org/doc/html/v5.13/process/coding-style.html , I see it 
supporting splitting for ASM (point 20) but not for regular strings (point 2). 

In this one he's just restoring something I dropped, verbatim, so I have no 
issue giving it my RB.

 Kent

> 
> Regards,
>   Felix
> 
> 
> > +   control->ras_num_recs, ras-
> >bad_page_cnt_threshold);
> > }
> > }
> > } else {
> >
> > base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d


[PATCH 5/5] drm/amd/display: Enable dpia in dmub only for DCN31 B0

2021-10-25 Thread Nicholas Kazlauskas
From: Jude Shih 

[Why]
DMUB binary is common for both A0 and B0. Hence, the driver should
notify FW about the support for DPIA in B0.

[How]
Added a dpia_supported bit in dmub_fw_boot_options; it is set
only for B0.

Assign dpia_supported to true before dm_dmub_hw_init
in the B0 case.

Signed-off-by: Jude Shih 
Reviewed-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 
 drivers/gpu/drm/amd/display/dmub/dmub_srv.h   |  1 +
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c |  1 +
 3 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0ad6d0e2a2d8..139a7707f317 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1017,6 +1017,7 @@ static int dm_dmub_hw_init(struct amdgpu_device *adev)
const unsigned char *fw_inst_const, *fw_bss_data;
uint32_t i, fw_inst_const_size, fw_bss_data_size;
bool has_hw_support;
+   struct dc *dc = adev->dm.dc;
 
if (!dmub_srv)
/* DMUB isn't supported on the ASIC. */
@@ -1103,6 +1104,17 @@ static int dm_dmub_hw_init(struct amdgpu_device *adev)
for (i = 0; i < fb_info->num_fb; ++i)
hw_params.fb[i] = &fb_info->fb[i];
 
+   switch (adev->asic_type) {
+   case CHIP_YELLOW_CARP:
+   if (dc->ctx->asic_id.hw_internal_rev != YELLOW_CARP_A0) {
+   hw_params.dpia_supported = true;
+   hw_params.disable_dpia = 
dc->debug.dpia_debug.bits.disable_dpia;
+   }
+   break;
+   default:
+   break;
+   }
+
status = dmub_srv_hw_init(dmub_srv, &hw_params);
if (status != DMUB_STATUS_OK) {
DRM_ERROR("Error initializing DMUB HW: %d\n", status);
diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h 
b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
index 6c4f0ada163f..717c0e572d2f 100644
--- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
+++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
@@ -238,6 +238,7 @@ struct dmub_srv_hw_params {
bool load_inst_const;
bool skip_panel_power_sequence;
bool disable_z10;
+   bool dpia_supported;
bool disable_dpia;
 };
 
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c 
b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c
index 5df990277dd4..10ebf20eaa41 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c
@@ -338,6 +338,7 @@ void dmub_dcn31_enable_dmub_boot_options(struct dmub_srv 
*dmub, const struct dmu
union dmub_fw_boot_options boot_options = {0};
 
boot_options.bits.z10_disable = params->disable_z10;
+   boot_options.bits.dpia_supported = params->dpia_supported;
boot_options.bits.enable_dpia = params->disable_dpia ? 0 : 1;
 
boot_options.bits.sel_mux_phy_c_d_phy_f_g = (dmub->asic == 
DMUB_ASIC_DCN31B) ? 1 : 0;
-- 
2.25.1



[PATCH 3/5] drm/amd/display: Fix USB4 hot plug crash issue

2021-10-25 Thread Nicholas Kazlauskas
From: Jude Shih 

[Why]
Notify data from the outbox is corrupt: the notify type should be 2 (HPD)
instead of 0 (No data). We copied the address instead of the value, and the
memory might be freed at the end of the outbox IRQ.

[How]
Allocate memory for the notification and copy the whole content from the
outbox into the HPD handler function.

Fixes: 5cecad78ea53 ("drm/amd/display: Support for SET_CONFIG processing with 
DMUB")
Signed-off-by: Jude Shih 
Reviewed-by: Nicholas Kazlauskas 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +--
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 4cd64529b180..0ad6d0e2a2d8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -730,6 +730,8 @@ static void dm_handle_hpd_work(struct work_struct *work)

dmub_hpd_wrk->adev->dm.dmub_callback[dmub_hpd_wrk->dmub_notify->type](dmub_hpd_wrk->adev,
dmub_hpd_wrk->dmub_notify);
}
+
+   kfree(dmub_hpd_wrk->dmub_notify);
kfree(dmub_hpd_wrk);
 
 }
@@ -755,12 +757,6 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params)
 
if (dc_enable_dmub_notifications(adev->dm.dc) &&
irq_params->irq_src == DC_IRQ_SOURCE_DMCUB_OUTBOX) {
-   dmub_hpd_wrk = kzalloc(sizeof(*dmub_hpd_wrk), GFP_ATOMIC);
-   if (!dmub_hpd_wrk) {
-   DRM_ERROR("Failed to allocate dmub_hpd_wrk");
-   return;
-   }
-   INIT_WORK(&dmub_hpd_wrk->handle_hpd_work, dm_handle_hpd_work);
 
do {
dc_stat_get_dmub_notification(adev->dm.dc, &notify);
@@ -769,7 +765,20 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params)
continue;
}
if (dm->dmub_thread_offload[notify.type] == true) {
-   dmub_hpd_wrk->dmub_notify = &notify;
+   dmub_hpd_wrk = kzalloc(sizeof(*dmub_hpd_wrk), 
GFP_ATOMIC);
+   if (!dmub_hpd_wrk) {
+   DRM_ERROR("Failed to allocate 
dmub_hpd_wrk");
+   return;
+   }
+   dmub_hpd_wrk->dmub_notify = 
kzalloc(sizeof(struct dmub_notification), GFP_ATOMIC);
+   if (!dmub_hpd_wrk->dmub_notify) {
+   kfree(dmub_hpd_wrk);
+   DRM_ERROR("Failed to allocate 
dmub_hpd_wrk->dmub_notify");
+   return;
+   }
+   INIT_WORK(&dmub_hpd_wrk->handle_hpd_work, 
dm_handle_hpd_work);
+   if (dmub_hpd_wrk->dmub_notify)
+   memcpy(dmub_hpd_wrk->dmub_notify, 
&notify, sizeof(struct dmub_notification));
dmub_hpd_wrk->adev = adev;
if (notify.type == DMUB_NOTIFICATION_HPD) {
plink = 
adev->dm.dc->links[notify.link_index];
-- 
2.25.1
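
(The underlying pattern here is generic: when an interrupt handler defers 
processing to a workqueue, everything the work item will later touch has to 
be copied out of the handler's scope first. A minimal, hypothetical sketch 
of the idea - not the amdgpu_dm code above:)

#include <linux/slab.h>
#include <linux/workqueue.h>

struct dmub_notification;	/* driver-defined payload */

struct hpd_work {
	struct work_struct work;
	struct dmub_notification *notify;	/* deep copy, not a pointer into the IRQ path */
};

static void hpd_work_func(struct work_struct *work)
{
	struct hpd_work *w = container_of(work, struct hpd_work, work);

	/* ... act on w->notify ... */
	kfree(w->notify);
	kfree(w);
}

static void irq_defers_to_work(const struct dmub_notification *notify, size_t size)
{
	struct hpd_work *w = kzalloc(sizeof(*w), GFP_ATOMIC);

	if (!w)
		return;
	/* Copy the whole payload: the caller's copy is gone by the time
	 * the work item runs. */
	w->notify = kmemdup(notify, size, GFP_ATOMIC);
	if (!w->notify) {
		kfree(w);
		return;
	}
	INIT_WORK(&w->work, hpd_work_func);
	schedule_work(&w->work);
}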



[PATCH 1/5] drm/amd/display: Fallback to clocks which meet requested voltage on DCN31

2021-10-25 Thread Nicholas Kazlauskas
From: Michael Strauss 

[WHY]
On certain configs, SMU clock table voltages don't match, which causes the
parser to behave incorrectly by leaving dcfclk and socclk table entries
unpopulated.

[HOW]
Currently the function that finds the corresponding clock for a given voltage
only checks for exact voltage level matches. If no match is found, the parser
now falls back to searching for the max clock which meets the requested
voltage (i.e. its corresponding voltage is below the requested one).

Signed-off-by: Michael Strauss 
Reviewed-by: Nicholas Kazlauskas 
---
 .../amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c| 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
index 0088dff441da..f4c9a458ace8 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
@@ -523,14 +523,21 @@ static unsigned int find_clk_for_voltage(
unsigned int voltage)
 {
int i;
+   int max_voltage = 0;
+   int clock = 0;
 
for (i = 0; i < NUM_SOC_VOLTAGE_LEVELS; i++) {
-   if (clock_table->SocVoltage[i] == voltage)
+   if (clock_table->SocVoltage[i] == voltage) {
return clocks[i];
+   } else if (clock_table->SocVoltage[i] >= max_voltage &&
+   clock_table->SocVoltage[i] < voltage) {
+   max_voltage = clock_table->SocVoltage[i];
+   clock = clocks[i];
+   }
}
 
-   ASSERT(0);
-   return 0;
+   ASSERT(clock);
+   return clock;
 }
 
 void dcn31_clk_mgr_helper_populate_bw_params(
-- 
2.25.1
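
(As a worked example of the fallback, with made-up numbers: if the table's
SocVoltage entries are 700, 750, 800 and 850 and the requested voltage is
825, there is no exact match, so the loop keeps the highest entry that is
still below the request (800) and returns the clock associated with that
level instead of asserting and returning 0.)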



[PATCH 4/5] drm/amd/display: MST support for DPIA

2021-10-25 Thread Nicholas Kazlauskas
From: Meenakshikumar Somasundaram 

[Why]
- DPIA MST slot registers are not programmed during payload
allocation and hence MST does not work with DPIA.
- HPD RX interrupts are not handled for DPIA.

[How]
- Added inbox command to program the MST slots whenever
  payload allocation happens for DPIA links.
- Added support for handling HPD RX interrupts

Signed-off-by: Meenakshikumar Somasundaram 
Reviewed-by: Jun Lei 
Acked-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 54 +++
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 28 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |  6 +++
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 23 
 4 files changed, 111 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 9d33b00b7e27..12e5470fa567 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3750,6 +3750,60 @@ bool dc_process_dmub_set_config_async(struct dc *dc,
return is_cmd_complete;
 }
 
+/**
+ *
+ *  Function: dc_process_dmub_set_mst_slots
+ *
+ *  @brief
+ * Submits mst slot allocation command to dmub via inbox message
+ *
+ *  @param
+ * [in] dc: dc structure
+ * [in] link_index: link index
+ * [in] mst_alloc_slots: mst slots to be allotted
+ * [out] mst_slots_in_use: mst slots in use returned in failure 
case
+ *
+ * @return
+ * DC_OK if successful, DC_ERROR if failure
+ *
+ */
+enum dc_status dc_process_dmub_set_mst_slots(const struct dc *dc,
+   uint32_t link_index,
+   uint8_t mst_alloc_slots,
+   uint8_t *mst_slots_in_use)
+{
+   union dmub_rb_cmd cmd = {0};
+   struct dc_dmub_srv *dmub_srv = dc->ctx->dmub_srv;
+
+   /* prepare MST_ALLOC_SLOTS command */
+   cmd.set_mst_alloc_slots.header.type = DMUB_CMD__DPIA;
+   cmd.set_mst_alloc_slots.header.sub_type = 
DMUB_CMD__DPIA_MST_ALLOC_SLOTS;
+
+   cmd.set_mst_alloc_slots.mst_slots_control.instance = 
dc->links[link_index]->ddc_hw_inst;
+   cmd.set_mst_alloc_slots.mst_slots_control.mst_alloc_slots = 
mst_alloc_slots;
+
+   if (!dc_dmub_srv_cmd_with_reply_data(dmub_srv, &cmd))
+   /* command is not processed by dmub */
+   return DC_ERROR_UNEXPECTED;
+
+   /* command processed by dmub, if ret_status is 1 */
+   if (cmd.set_config_access.header.ret_status != 1)
+   /* command processing error */
+   return DC_ERROR_UNEXPECTED;
+
+   /* command processed and we have a status of 2, mst not enabled in dpia 
*/
+   if (cmd.set_mst_alloc_slots.mst_slots_control.immed_status == 2)
+   return DC_FAIL_UNSUPPORTED_1;
+
+   /* previously configured mst alloc and used slots did not match */
+   if (cmd.set_mst_alloc_slots.mst_slots_control.immed_status == 3) {
+   *mst_slots_in_use = 
cmd.set_mst_alloc_slots.mst_slots_control.mst_slots_in_use;
+   return DC_NOT_SUPPORTED;
+   }
+
+   return DC_OK;
+}
+
 /**
  * dc_disable_accelerated_mode - disable accelerated mode
  * @dc: dc structure
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 27b9ff98a569..2796bdd17de1 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3499,6 +3499,20 @@ enum dc_status dc_link_allocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
 
ASSERT(proposed_table.stream_count > 0);
 
+   if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) {
+   static enum dc_status status;
+   uint8_t mst_alloc_slots = 0, prev_mst_slots_in_use = 0xFF;
+
+   for (i = 0; i < link->mst_stream_alloc_table.stream_count; i++)
+   mst_alloc_slots += 
link->mst_stream_alloc_table.stream_allocations[i].slot_count;
+
+   status = dc_process_dmub_set_mst_slots(link->dc, 
link->link_index,
+   mst_alloc_slots, &prev_mst_slots_in_use);
+   ASSERT(status == DC_OK);
+   DC_LOG_MST("dpia : status[%d]: alloc_slots[%d]: 
used_slots[%d]\n",
+   status, mst_alloc_slots, prev_mst_slots_in_use);
+   }
+
/* program DP source TX for payload */
 #if defined(CONFIG_DRM_AMD_DC_DCN)
switch (dp_get_link_encoding_format(&link->cur_link_settings)) {
@@ -3842,6 +3856,20 @@ static enum dc_status deallocate_mst_payload(struct 
pipe_ctx *pipe_ctx)
 #endif
}
 
+   if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) {
+   enum dc_status status;
+   uint8_t mst_alloc_slots = 0, prev_mst_slots_in_u

[PATCH 2/5] drm/amd/display: Fix deadlock when falling back to v2 from v3

2021-10-25 Thread Nicholas Kazlauskas
[Why]
A deadlock in the kernel occurs when we fall back from the V3 to the V2
add_topology_to_display or remove_topology_to_display because they
both try to acquire the dtm_mutex, but recursive locking isn't
supported on mutex_lock().

[How]
Make the mutex_lock/unlock more fine-grained and move them up such that
they're only held around the psp invocation itself.

Fixes: bf62221e9d0e ("drm/amd/display: Add DCN3.1 HDCP support")

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c 
b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c
index e9bd84ec027d..be61975f1470 100644
--- a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c
+++ b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c
@@ -105,6 +105,7 @@ static enum mod_hdcp_status remove_display_from_topology_v3(
dtm_cmd->dtm_status = TA_DTM_STATUS__GENERIC_FAILURE;
 
psp_dtm_invoke(psp, dtm_cmd->cmd_id);
+   mutex_unlock(&psp->dtm_context.mutex);
 
if (dtm_cmd->dtm_status != TA_DTM_STATUS__SUCCESS) {
status = remove_display_from_topology_v2(hdcp, index);
@@ -115,8 +116,6 @@ static enum mod_hdcp_status remove_display_from_topology_v3(
HDCP_TOP_REMOVE_DISPLAY_TRACE(hdcp, display->index);
}
 
-   mutex_unlock(&psp->dtm_context.mutex);
-
return status;
 }
 
@@ -205,6 +204,7 @@ static enum mod_hdcp_status add_display_to_topology_v3(
dtm_cmd->dtm_in_message.topology_update_v3.link_hdcp_cap = 
link->hdcp_supported_informational;
 
psp_dtm_invoke(psp, dtm_cmd->cmd_id);
+   mutex_unlock(&psp->dtm_context.mutex);
 
if (dtm_cmd->dtm_status != TA_DTM_STATUS__SUCCESS) {
status = add_display_to_topology_v2(hdcp, display);
@@ -214,8 +214,6 @@ static enum mod_hdcp_status add_display_to_topology_v3(
HDCP_TOP_ADD_DISPLAY_TRACE(hdcp, display->index);
}
 
-   mutex_unlock(&psp->dtm_context.mutex);
-
return status;
 }
 
-- 
2.25.1
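
(The shape of the bug is the classic non-recursive mutex deadlock: the v3
path still holds dtm_mutex when it falls back to the v2 path, which tries
to take the same mutex again. A stripped-down, hypothetical sketch of the
before/after - not the modules/hdcp code itself:)

#include <linux/mutex.h>

struct dtm_ctx { struct mutex dtm_mutex; };

static int psp_invoke(struct dtm_ctx *c);        /* stand-in for the psp call */
static int add_display_v2(struct dtm_ctx *c);    /* takes c->dtm_mutex internally */

/* Before: the fallback runs with the mutex still held -> deadlock. */
static int add_display_v3_broken(struct dtm_ctx *c)
{
	int ret;

	mutex_lock(&c->dtm_mutex);
	ret = psp_invoke(c);
	if (ret)
		ret = add_display_v2(c);	/* re-locks c->dtm_mutex */
	mutex_unlock(&c->dtm_mutex);
	return ret;
}

/* After: the mutex only covers the psp invocation, so the v2
 * fallback can safely take it again. */
static int add_display_v3_fixed(struct dtm_ctx *c)
{
	int ret;

	mutex_lock(&c->dtm_mutex);
	ret = psp_invoke(c);
	mutex_unlock(&c->dtm_mutex);

	if (ret)
		ret = add_display_v2(c);
	return ret;
}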



[PATCH 0/5] Fix USBC lightup on DCN31B

2021-10-25 Thread Nicholas Kazlauskas
This patchset is a mini-promotion to fix USBC lightup
on DCN31B for both SST and MST displays.

There are five issues at play here:
1. Invalid clock table causing mode validation to fail
2. HDCP downgrade to v2 causes deadlock
3. MST payload allocation not being propagated
4. PHY mux selection not being properly indicated to DMCUB
5. Hang occurring during HPD on PHYF/G.

Jude Shih (2):
  drm/amd/display: Fix USB4 hot plug crash issue
  drm/amd/display: Enable dpia in dmub only for DCN31 B0

Meenakshikumar Somasundaram (1):
  drm/amd/display: MST support for DPIA

Michael Strauss (1):
  drm/amd/display: Fallback to clocks which meet requested voltage on
DCN31

Nicholas Kazlauskas (1):
  drm/amd/display: Fix deadlock when falling back to v2 from v3

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 35 +---
 .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  | 13 +++--
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 54 +++
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 28 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |  6 +++
 drivers/gpu/drm/amd/display/dmub/dmub_srv.h   |  1 +
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 23 
 .../gpu/drm/amd/display/dmub/src/dmub_dcn31.c |  1 +
 .../drm/amd/display/modules/hdcp/hdcp_psp.c   |  6 +--
 9 files changed, 153 insertions(+), 14 deletions(-)

-- 
2.25.1



Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13

2021-10-25 Thread PGNet Dev

sbios settings


any of these raise a suspicion?

screenshot from the ASRockRack X470D4U's BIOS setup:

  https://imgur.com/a/rdhGQNy



Re: [PATCH] drm/amdgpu: Restore information reporting in RAS

2021-10-25 Thread Felix Kuehling
On 2021-10-25 at 12:02 p.m., Luben Tuikov wrote:
> A recent patch took away the reporting of number of RAS records and
> the threshold due to the way it was edited/spliced on top of the code.
> This patch restores this reporting.
>
> Cc: Kent Russell 
> Cc: Alex Deucher 
> Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring 
> bad page threshold")
> Signed-off-by: Luben Tuikov 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index ae64ca02ccc4f8..05117eda105b55 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct 
> amdgpu_ras_eeprom_control *control,
>   res = 0;
>   } else {
>   *exceed_err_limit = true;
> - dev_err(adev->dev, "GPU will not be 
> initialized. Replace this GPU or increase the threshold.");
> + dev_err(adev->dev,
> + "RAS records:%d exceed threshold:%d, "
> + "GPU will not be initialized. Replace 
> this GPU or increase the threshold",

Splitting messages across multiple lines is usually discouraged
(presumably because it makes them hard to grep). I think checkpatch will
treat this as an error, while a long line is just a warning. Therefore
it seems that long lines are less bad than split messages.

Regards,
  Felix


> + control->ras_num_recs, 
> ras->bad_page_cnt_threshold);
>   }
>   }
>   } else {
>
> base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
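
(For illustration, the form being asked for keeps the user-visible format
string on a single line, even past 80 columns, so the message stays
greppable - a sketch of the same call, not the posted patch:)

	dev_err(adev->dev,
		"RAS records:%d exceed threshold:%d, GPU will not be initialized. Replace this GPU or increase the threshold",
		control->ras_num_recs, ras->bad_page_cnt_threshold);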


Re: I got an IOMMU IO page fault. What to do now?

2021-10-25 Thread Christian König

Hi Robin,

On 25.10.21 at 18:01, Robin Murphy wrote:

On 2021-10-25 12:23, Christian König wrote:

Hi Paul,

not sure how the IOMMU gives out addresses, but the printed ones look 
suspicious to me. Something like we are using an invalid address like 
-1 or similar.


FWIW those look like believable DMA addresses to me, assuming that the 
DMA mapping APIs are being backed iommu_dma_ops and the device has a 
40-bit DMA mask, since the IOVA allocator works top-down.


Thanks for that information. In that case the addresses look valid to me.

Likely causes are either a race where the dma_unmap_*() call happens 
before the hardware has really stopped accessing the relevant 
addresses, or the device's DMA mask has been set larger than it should 
be, and thus the upper bits have been truncated in the round-trip 
through the hardware.


That actually looks correct to me. The device indeed has a 40-bit DMA mask.

There is a third possibility which is actually quite likely, and that is 
stale reads in the pipeline.


See, for some use cases the device can queue reads into an internal 
pipeline, but when it later finds that the read isn't needed it doesn't 
flush the pipeline.


The next operation pushes more read requests into the pipeline and 
eventually the stale read requests are executed as well.


Without an IOMMU the results of those reads are simply discarded, so no 
harm done. But with the IOMMU enabled it is perfectly possible that the 
stale read is now accessing unmapped memory -> BAM.


That's one of the reasons why we almost always have GPUs in passthrough 
mode on x86 and for example don't use system memory for GPU page tables 
on APUs.


Regards,
Christian.



Given the addresses involved, my suspicions would initially lean 
towards the latter case - the faults are in the very topmost pages 
which imply they're the first things mapped in that range. The other 
contributing factor being the trick that the IOVA allocator plays for 
PCI devices, where it tries to prefer 32-bit addresses. Thus you're 
only likely to see this happen once you already have ~3.5-4GB of live 
DMA-mapped memory to exhaust the 32-bit IOVA space (minus some 
reserved areas) and start allocating from the full DMA mask. You 
should be able to check that with a 5.13 or newer kernel by booting 
with "iommu.forcedac=1" and seeing if it breaks immediately 
(unfortunately with an older kernel you'd have to manually hack 
iommu_dma_alloc_iova() to the same effect).


Robin.

Can you try that on an up to date kernel as well? E.g. ideally 
bleeding edge amd-staging-drm-next from Alex repository.


Regards,
Christian.

On 25.10.21 at 12:25, Paul Menzel wrote:

Dear Linux folks,


On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages 
below. (GPU hang in amdgpu issue #1762 [1] might be related.)


    $ lspci -nn -s 05:00.0
    05:00.0 VGA compatible controller [0300]: Advanced Micro 
Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] 
[1002:6611] (rev 87)

    $ dmesg
    […]
    [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020]
    [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020]
    [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020]
    [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020]
    [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020]
    [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020]
    [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020]
    [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020]
    [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020]
    [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020]
    [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020]
    [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020]
    [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020]
    [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020]
    [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020]
    [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xf

Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13

2021-10-25 Thread PGNet Dev

hi


The driver is not able to find the vbios image which is required for
the driver to properly enumerate the hardware.  I would guess it's a
platform issue.  Is there a newer sbios image available for your
platform?

...

I would start with an sbios update is possible.


not that i'm aware.

this board is an ASRockRack X470D4U server board


https://www.asrockrack.com/general/productdetail.asp?Model=X470D4U#Specifications

latest BIOS


https://www.asrockrack.com/general/productdetail.asp?Model=X470D4U#Download

is,

4.208/12/2021   BIOS

which is installed,

dmidecode | grep "BIOS Information" -A5
BIOS Information
Vendor: American Megatrends International, LLC.
Version: P4.20
Release Date: 04/14/2021
Address: 0xF
Runtime Size: 64 kB


check if there are any options in the
sbios regarding the behavior of the integrated graphics when an
external GPU is present.  I suspect the one of the following is the
problem:
1. The sbios should disable the integrated graphics when a dGPU is
present, but due to a bug in the sbios or a particular sbios settings
it has failed to.
2. The sbios should be providing a vbios image for the integrated
graphics, but due to a bug in the sbios or a particular sbios settings
it has failed to.
3. The platform uses some alternative method to provide access to the
vbios image for the integrated graphics that Linux does not yet
handle.


Checking on the specific options for Dual/Concurrent GPU support is ... 
challenging ... so far.
I haven't found a clear statement/doc on how it's intended to behave, what 
options are available, or details on the individual options.

Per a chat late last week with ASRockRack, my understanding is that dual GPU 
support is *supposed* to work.

atm, I'm not clear on how specifically to test/answer any of your suspected 
issues :-/
Reading online to see what to check, etc.


To add to the list - check if ACPI support is broken or skipped.


It doesn't appear to me to be; here's dmesg output,

dmesg | grep -i acpi
...
[0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] 
ACPI NVS
[0.00] BIOS-e820: [mem 0xbae2e000-0xbae70fff] 
ACPI data
[0.00] BIOS-e820: [mem 0xbae71000-0xbd0defff] 
ACPI NVS
[0.00] reserve setup_data: [mem 
0x0a20-0x0a20efff] ACPI NVS
[0.00] reserve setup_data: [mem 
0xbae2e000-0xbae70fff] ACPI data
[0.00] reserve setup_data: [mem 
0xbae71000-0xbd0defff] ACPI NVS
[0.00] efi: ACPI=0xbd0c8000 ACPI 2.0=0xbd0c8014 
SMBIOS=0xbdd6d000 SMBIOS 3.0=0xbdd6c000 MEMATTR=0xb5ee6698 ESRT=0xb5804518 
MOKvar=0xb568b000 RNG=0xbdd9eb18
[0.004625] ACPI: Early table checksum verification disabled
[0.004627] ACPI: RSDP 0xBD0C8014 24 (v02 ALASKA)
[0.004630] ACPI: XSDT 0xBD0C7728 E4 (v01 ALASKA A M I   
 01072009 AMI  0113)
[0.004634] ACPI: FACP 0xBAE6 000114 (v06 ALASKA A M I   
 01072009 AMI  00010013)
[0.004637] ACPI: DSDT 0xBAE59000 006308 (v02 ALASKA A M I   
 01072009 INTL 20120913)
[0.004639] ACPI: FACS 0xBC0C2000 40
[0.004640] ACPI: IVRS 0xBAE7 D0 (v02 AMD
AmdTable 0001 AMD  0001)
[0.004642] ACPI: SPMI 0xBAE6F000 41 (v05 ALASKA A M I   
  AMI. )
[0.004644] ACPI: SPMI 0xBAE6E000 41 (v05 ALASKA A M I   
  AMI. )
[0.004645] ACPI: SSDT 0xBAE66000 007229 (v02 AMDMYRTLE  
 0002 MSFT 0400)
[0.004647] ACPI: SSDT 0xBAE62000 003BD7 (v01 AMDAMD AOD 
 0001 INTL 20120913)
[0.004648] ACPI: SSDT 0xBAE61000 C8 (v02 ALASKA CPUSSDT 
 01072009 AMI  01072009)
[0.004650] ACPI: FIDT 0xBAE58000 9C (v01 ALASKA A M I   
 01072009 AMI  00010013)
[0.004651] ACPI: MCFG 0xBAE57000 3C (v01 ALASKA A M I   
 01072009 MSFT 00010013)
[0.004653] ACPI: AAFT 0xBAE56000 68 (v01 ALASKA OEMAAFT 
 01072009 MSFT 0097)
[0.004654] ACPI: HPET 0xBAE55000 38 (v01 ALASKA A M I   
 01072009 AMI  0005)
[0.004656] ACPI: SPCR 0xBAE54000 50 (v02 A M I  APTIO V 
 01072009 AMI. 00050011)
[0.004657] ACPI: SSDT 0xBAE51000 002B44 (v02 AMD
AmdTable 0001 AMD  0001)
[0.004659] ACPI: CRAT 0xBAE5 000B68 (v01 AMD
AmdTable 0001 AMD  0001)
[0.004660] ACPI: CDIT 0xBAE4F000 29 (v01 AMD
AmdTable 0001 AMD  0001)
[0.004662] ACPI: SSDT 0xBAE4E000 000D53 (v01 AMD
MYRTLEG2 0001 INTL 20120913)
[ 

Re: [PATCH] drm/amdgpu: Restore information reporting in RAS

2021-10-25 Thread Lazar, Lijo
[Public]

Does the message need a mention about the newly added option to ignore 
threshold?

Thanks,
Lijo

From: amd-gfx  on behalf of Luben Tuikov 

Sent: Monday, October 25, 2021 9:32:20 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Tuikov, Luben ; Russell, Kent ; 
Deucher, Alexander 
Subject: [PATCH] drm/amdgpu: Restore information reporting in RAS

A recent patch took away the reporting of number of RAS records and
the threshold due to the way it was edited/spliced on top of the code.
This patch restores this reporting.

Cc: Kent Russell 
Cc: Alex Deucher 
Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring 
bad page threshold")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index ae64ca02ccc4f8..05117eda105b55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct 
amdgpu_ras_eeprom_control *control,
 res = 0;
 } else {
 *exceed_err_limit = true;
-   dev_err(adev->dev, "GPU will not be 
initialized. Replace this GPU or increase the threshold.");
+   dev_err(adev->dev,
+   "RAS records:%d exceed threshold:%d, "
+   "GPU will not be initialized. Replace 
this GPU or increase the threshold",
+   control->ras_num_recs, 
ras->bad_page_cnt_threshold);
 }
 }
 } else {

base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
--
2.33.1.558.g2bd2f258f4



Re: [PATCH] drm/amdgpu: Restore information reporting in RAS

2021-10-25 Thread Alex Deucher
On Mon, Oct 25, 2021 at 12:02 PM Luben Tuikov  wrote:
>
> A recent patch took away the reporting of number of RAS records and
> the threshold due to the way it was edited/spliced on top of the code.
> This patch restores this reporting.
>
> Cc: Kent Russell 
> Cc: Alex Deucher 
> Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring 
> bad page threshold")
> Signed-off-by: Luben Tuikov 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index ae64ca02ccc4f8..05117eda105b55 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct 
> amdgpu_ras_eeprom_control *control,
> res = 0;
> } else {
> *exceed_err_limit = true;
> -   dev_err(adev->dev, "GPU will not be 
> initialized. Replace this GPU or increase the threshold.");
> +   dev_err(adev->dev,
> +   "RAS records:%d exceed threshold:%d, "
> +   "GPU will not be initialized. Replace 
> this GPU or increase the threshold",
> +   control->ras_num_recs, 
> ras->bad_page_cnt_threshold);
> }
> }
> } else {
>
> base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
> --
> 2.33.1.558.g2bd2f258f4
>


RE: [PATCH] drm/amdgpu: Restore information reporting in RAS

2021-10-25 Thread Russell, Kent
[AMD Official Use Only]

Thanks Luben

Reviewed-by: Kent Russell 

 Kent

> -Original Message-
> From: Tuikov, Luben 
> Sent: Monday, October 25, 2021 12:02 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Tuikov, Luben ; Russell, Kent 
> ;
> Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu: Restore information reporting in RAS
> 
> A recent patch took away the reporting of number of RAS records and
> the threshold due to the way it was edited/spliced on top of the code.
> This patch restores this reporting.
> 
> Cc: Kent Russell 
> Cc: Alex Deucher 
> Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring 
> bad page
> threshold")
> Signed-off-by: Luben Tuikov 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index ae64ca02ccc4f8..05117eda105b55 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct
> amdgpu_ras_eeprom_control *control,
>   res = 0;
>   } else {
>   *exceed_err_limit = true;
> - dev_err(adev->dev, "GPU will not be 
> initialized. Replace this
> GPU or increase the threshold.");
> + dev_err(adev->dev,
> + "RAS records:%d exceed threshold:%d, "
> + "GPU will not be initialized. Replace 
> this GPU or
> increase the threshold",
> + control->ras_num_recs, ras-
> >bad_page_cnt_threshold);
>   }
>   }
>   } else {
> 
> base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
> --
> 2.33.1.558.g2bd2f258f4


[PATCH] drm/amdgpu: Restore information reporting in RAS

2021-10-25 Thread Luben Tuikov
A recent patch took away the reporting of number of RAS records and
the threshold due to the way it was edited/spliced on top of the code.
This patch restores this reporting.

Cc: Kent Russell 
Cc: Alex Deucher 
Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring 
bad page threshold")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index ae64ca02ccc4f8..05117eda105b55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct 
amdgpu_ras_eeprom_control *control,
res = 0;
} else {
*exceed_err_limit = true;
-   dev_err(adev->dev, "GPU will not be 
initialized. Replace this GPU or increase the threshold.");
+   dev_err(adev->dev,
+   "RAS records:%d exceed threshold:%d, "
+   "GPU will not be initialized. Replace 
this GPU or increase the threshold",
+   control->ras_num_recs, 
ras->bad_page_cnt_threshold);
}
}
} else {

base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
-- 
2.33.1.558.g2bd2f258f4



Re: I got an IOMMU IO page fault. What to do now?

2021-10-25 Thread Robin Murphy

On 2021-10-25 12:23, Christian König wrote:

Hi Paul,

not sure how the IOMMU gives out addresses, but the printed ones look 
suspicious to me. Something like we are using an invalid address like -1 
or similar.


FWIW those look like believable DMA addresses to me, assuming that the 
DMA mapping APIs are being backed iommu_dma_ops and the device has a 
40-bit DMA mask, since the IOVA allocator works top-down.


Likely causes are either a race where the dma_unmap_*() call happens 
before the hardware has really stopped accessing the relevant addresses, 
or the device's DMA mask has been set larger than it should be, and thus 
the upper bits have been truncated in the round-trip through the hardware.


Given the addresses involved, my suspicions would initially lean towards 
the latter case - the faults are in the very topmost pages which imply 
they're the first things mapped in that range. The other contributing 
factor being the trick that the IOVA allocator plays for PCI devices, 
where it tries to prefer 32-bit addresses. Thus you're only likely to 
see this happen once you already have ~3.5-4GB of live DMA-mapped memory 
to exhaust the 32-bit IOVA space (minus some reserved areas) and start 
allocating from the full DMA mask. You should be able to check that with 
a 5.13 or newer kernel by booting with "iommu.forcedac=1" and seeing if 
it breaks immediately (unfortunately with an older kernel you'd have to 
manually hack iommu_dma_alloc_iova() to the same effect).


Robin.

Can you try that on an up to date kernel as well? E.g. ideally bleeding 
edge amd-staging-drm-next from Alex repository.


Regards,
Christian.

On 25.10.21 at 12:25, Paul Menzel wrote:

Dear Linux folks,


On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages 
below. (GPU hang in amdgpu issue #1762 [1] might be related.)


    $ lspci -nn -s 05:00.0
    05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, 
Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] 
(rev 87)

    $ dmesg
    […]
    [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020]
    [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020]
    [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020]
    [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020]
    [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020]
    [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020]
    [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020]
    [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020]
    [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020]
    [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020]
    [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020]
    [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020]
    [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020]
    [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020]
    [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020]
    [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffa0c0 flags=0x0020]
    [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffbdc0 flags=0x0020]
    [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffa4c0 flags=0x0020]
    [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xff90c0 flags=0x0020]
    [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffabc0 flags=0x0020]


As this is not reproducible, how would debugging go? (The system was 
rebooted in the meantime.) What options should be enabled, that next 
time the required information is logged, or what commands should I 
execute when the system is still in that state, so the bug (driver, 
userspace, …) can be pinpointed and fixed?



Kind regards,

Paul


[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
 "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"

Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane

2021-10-25 Thread Harry Wentland



On 2021-10-25 11:12, Paul Menzel wrote:
> Dear Nicholas, dear Harry,
> 
> 
> On 25.10.21 16:42, Kazlauskas, Nicholas wrote:
>> On 2021-10-25 9:58 a.m., Harry Wentland wrote:
> 
>>> On 2021-10-25 07:25, Paul Menzel wrote:
> 
 On 24.10.21 15:31, Rodrigo Siqueira wrote:
> From: Wenjing Liu 
>
> [why]
> We have a regression that cause maximize lane settings to use
> uninitialized data from unused lanes.

 Which commit caused the regression? Please amend the commit message.

> This will cause link training to fail for 1 or 2 lanes because the lane
> adjust is populated incorrectly sometimes.

 On what card did you test this, and how can it be reproduced?

 Please describe the fix/implementation in the commit message.

> Reviewed-by: Eric Yang 
> Acked-by: Rodrigo Siqueira 
> Signed-off-by: Wenjing Liu 
> ---
>    .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 35 +--
>    1 file changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> index 653279ab96f4..f6ba7c734f54 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> @@ -108,6 +108,9 @@ static struct dc_link_settings 
> get_common_supported_link_settings(
>    struct dc_link_settings link_setting_b);
>    static void maximize_lane_settings(const struct link_training_settings 
> *lt_settings,
>    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
> +static void override_lane_settings(const struct link_training_settings 
> *lt_settings,
> +    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
> +
>    static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link,
>    const struct dc_link_settings *link_settings)
>    {
> @@ -734,15 +737,13 @@ void dp_decide_lane_settings(
>    }
>    #endif
>    }
> -
> -    /* we find the maximum of the requested settings across all lanes*/
> -    /* and set this maximum for all lanes*/
>    dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
> dpcd_lane_settings);
>      if (lt_settings->disallow_per_lane_settings) {
>    /* we find the maximum of the requested settings across all 
> lanes*/
>    /* and set this maximum for all lanes*/
>    maximize_lane_settings(lt_settings, hw_lane_settings);
> +    override_lane_settings(lt_settings, hw_lane_settings);
>      if (lt_settings->always_match_dpcd_with_hw_lane_settings)
>    dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
> dpcd_lane_settings);
> @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct 
> link_training_settings *lt_setti
>    }
>    }
>    +static void override_lane_settings(const struct 
> link_training_settings *lt_settings,
> +    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
> +{
> +    uint32_t lane;
> +
> +    if (lt_settings->voltage_swing == NULL &&
> +    lt_settings->pre_emphasis == NULL &&
> +#if defined(CONFIG_DRM_AMD_DC_DP2_0)
> +    lt_settings->ffe_preset == NULL &&
> +#endif
> +    lt_settings->post_cursor2 == NULL)
> +
> +    return;
> +
> +    for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
> +    if (lt_settings->voltage_swing)
> +    lane_settings[lane].VOLTAGE_SWING = 
> *lt_settings->voltage_swing;
> +    if (lt_settings->pre_emphasis)
> +    lane_settings[lane].PRE_EMPHASIS = 
> *lt_settings->pre_emphasis;
> +    if (lt_settings->post_cursor2)
> +    lane_settings[lane].POST_CURSOR2 = 
> *lt_settings->post_cursor2;
> +#if defined(CONFIG_DRM_AMD_DC_DP2_0)
> +    if (lt_settings->ffe_preset)
> +    lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;
> +#endif

 Normally these checks should be done in C and not the preprocessor. `if 
 CONFIG(DRM_AMD_DC_DP2_0)` or similar should work.

>>>
>>> Interesting. I've never seen this before. Do you have an example or link to 
>>> a doc? A cursory search doesn't yield any results but I might not be 
>>> searching for the right thing.
>>>
>>> Harry
>>
>> I'm curious about this too. The compiler with optimizations should remove 
>> the constant check, but technically the C standard only permits it - it 
>> doesn't guarantee that it happens.
>>
>> However, this patch should actually be changed to drop these 
>> CONFIG_DRM_AMD_DC_DP2_0 guards - this isn't a Kconfig option nor will there 
>> be one specifically for DP2. This should be folded under the DCN support.
> 
> From the Li
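
(For reference, the closest in-kernel equivalent of the C-level check Paul
is describing is the IS_ENABLED() macro from <linux/kconfig.h>. A minimal,
hypothetical sketch follows, assuming the guarded fields were declared
unconditionally - which, as Nicholas points out, is the real catch here
since CONFIG_DRM_AMD_DC_DP2_0 isn't an actual Kconfig symbol:)

/* Preprocessor form: the guarded code isn't even compiled when the
 * option is off, so it can bit-rot unnoticed. */
#if defined(CONFIG_DRM_AMD_DC_DP2_0)
	if (lt_settings->ffe_preset)
		lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;
#endif

/* C form: always compiled and type-checked; the dead branch is
 * normally discarded by the compiler when the option is disabled.
 * Requires the fields it touches to be declared unconditionally. */
	if (IS_ENABLED(CONFIG_DRM_AMD_DC_DP2_0) && lt_settings->ffe_preset)
		lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;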

Re: [PATCH] drm/amd/amdgpu: fix potential bad job hw_fence underflow

2021-10-25 Thread Andrey Grodzovsky



On 2021-10-24 10:56 p.m., JingWen Chen wrote:

On 2021/10/23 4:41 AM, Andrey Grodzovsky wrote:

What do you mean by underflow in this case ? You mean use after free because of 
extra dma_fence_put() ?

yes



Then maybe update the description  because 'underflow' is very confusing



On 2021-10-22 4:14 a.m., JingWen Chen wrote:

ping

On 2021/10/22 AM11:33, Jingwen Chen wrote:

[Why]
In advance tdr mode, the real bad job will be resubmitted twice, while
in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job
is put one more time than other jobs.

[How]
Adding dma_fence_get before resubmit job in
amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs

Signed-off-by: Jingwen Chen 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
   1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 41ce86244144..975f069f6fe8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4841,6 +4841,9 @@ static void amdgpu_device_recheck_guilty_jobs(
     /* clear job's guilty and depend the folowing step to decide the 
real one */
   drm_sched_reset_karma(s_job);
+    /* for the real bad job, it will be resubmitted twice, adding a 
dma_fence_get
+ * to make sure fence is balanced */


But that put in drm_sched_resubmit_jobs_ext is for the previous parent fence.
fence = sched->ops->run_job(s_job); returns a new HW fence and the put drops 
the refcount on the old one.

Andrey



Hi Andrey,

If I remember correctly, after we embedded the hw_fence into amdgpu_job, there 
will be no fence replacement in amdgpu_job_run.



Right, I forgot that... What about removing line 
https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L265 
?

What if you make dma_fence_get unconditional instead ?

Andrey





+    dma_fence_get(s_job->s_fence->parent);
   drm_sched_resubmit_jobs_ext(&ring->sched, 1);
     ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, 
ring->sched.timeout);
@@ -4876,6 +4879,7 @@ static void amdgpu_device_recheck_guilty_jobs(
     /* got the hw fence, signal finished fence */
   atomic_dec(ring->sched.score);
+    dma_fence_put(s_job->s_fence->parent);
   dma_fence_get(&s_job->s_fence->finished);
   dma_fence_signal(&s_job->s_fence->finished);
   dma_fence_put(&s_job->s_fence->finished);


Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13

2021-10-25 Thread Lazar, Lijo




On 10/25/2021 7:45 PM, Alex Deucher wrote:

On Mon, Oct 25, 2021 at 9:48 AM PGNet Dev  wrote:


( cc'ing this here, OP -> dri-devel@ )

i've a dual gpu system

 inxi -GS
 System:Host: ws05 Kernel: 5.14.13-200.fc34.x86_64 x86_64 
bits: 64 Console: tty pts/0
Distro: Fedora release 34 (Thirty Four)
(1) Graphics:  Device-1: NVIDIA GK208B [GeForce GT 710] driver: 
nvidia v: 470.74
(2)Device-2: Advanced Micro Devices [AMD/ATI] Cezanne 
driver: N/A
Display: server: X.org 1.20.11 driver: loaded: 
nvidia unloaded: fbdev,modesetting,vesa
Message: Advanced graphics data unavailable for 
root.

running on

 cpu:Ryzen 5 5600G
 mobo:   ASRockRack X470D4U
 bios:   vP4.20, 04/14/2021
 kernel: 5.14.13-200.fc34.x86_64 x86_64

where,

 the nvidia is a PCIe card
 the amdgpu is the Ryzen-integrated gpu

the nvidia PCI is currently my primary
it's screen-attached, and boots/functions correctly

 lsmod | grep nvidia
 nvidia_drm 69632  0
 nvidia_modeset   1200128  1 nvidia_drm
 nvidia  35332096  1 nvidia_modeset
 drm_kms_helper303104  2 amdgpu,nvidia_drm
 drm   630784  8 
gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm

 dmesg | grep -i nvidia
 [5.755494] nvidia: loading out-of-tree module taints 
kernel.
 [5.755503] nvidia: module license 'NVIDIA' taints kernel.
 [5.759769] nvidia: module verification failed: signature 
and/or required key missing - tainting kernel
 [5.774894] nvidia-nvlink: Nvlink Core is being 
initialized, major device number 234
 [5.775299] nvidia :10:00.0: vgaarb: changed VGA 
decodes: olddecodes=io+mem,decodes=none:owns=io+mem
 [5.975449] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  
470.74  Mon Sep 13 23:09:15 UTC 2021
 [6.013181] nvidia-modeset: Loading NVIDIA Kernel Mode 
Setting Driver for UNIX platforms  470.74  Mon Sep 13 22:59:50 UTC 2021
 [6.016444] [drm] [nvidia-drm] [GPU ID 0x1000] Loading 
driver
 [6.227295] caller _nv000723rm+0x1ad/0x200 [nvidia] mapping 
multiple BARs
 [6.954906] [drm] Initialized nvidia-drm 0.0.0 20160202 for 
:10:00.0 on minor 0
 [   16.820758] input: HDA NVidia HDMI/DP,pcm=3 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input13
 [   16.820776] input: HDA NVidia HDMI/DP,pcm=7 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input14
 [   16.820808] input: HDA NVidia HDMI/DP,pcm=8 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input15
 [   16.820826] input: HDA NVidia HDMI/DP,pcm=9 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input16
 [   16.820841] input: HDA NVidia HDMI/DP,pcm=10 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input17

the amdgpu is not (currently/yet) in use; no attached screen

in BIOS, currently,

 'PCI Express' (nvidia gpu) is selected as primary
 'HybridGraphics' is enabled
 'OnBoard VGA' is enabled


on boot, mods are loaded

 lsmod | grep gpu
 amdgpu   7802880  0
 drm_ttm_helper 16384  1 amdgpu
 ttm81920  2 amdgpu,drm_ttm_helper
 iommu_v2   24576  1 amdgpu
 gpu_sched  45056  1 amdgpu
 drm_kms_helper303104  2 amdgpu,nvidia_drm
 drm   630784  8 
gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm
 i2c_algo_bit   16384  2 igb,amdgpu

but i see a 'fatal error' and 'failed' probe,

 dmesg | grep -i amdgpu
 [5.161923] [drm] amdgpu kernel modesetting enabled.
 [5.162097] amdgpu: Virtual CRAT table created for CPU
 [5.162104] amdgpu: Topology: Add CPU node
 [5.162197] amdgpu :30:00.0: enabling device ( -> 
0003)
 [5.162232] amdgpu :30:00.0: amdgpu: Trusted Memory 
Zone (TMZ) feature enabled
 [5.169105] amdgpu :30:00.0: BAR 6: can't assign [??? 
0x flags 0x2000] (bogus alignment)
 [5.174413] amdgpu :30:00.0: amdgpu: Unable to locate a 
BIOS ROM
 [5.174415] amdgpu :30:00.0: amdgpu: Fatal error during 
GPU init
 [5.174416] amdgpu :30:00.0: amdgpu: amdgpu: finishing 
device.
 [5.174425] Modules linked in: amdgpu(+) uas usb_

Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane

2021-10-25 Thread Paul Menzel

Dear Nicholas, dear Harry,


On 25.10.21 16:42, Kazlauskas, Nicholas wrote:

On 2021-10-25 9:58 a.m., Harry Wentland wrote:



On 2021-10-25 07:25, Paul Menzel wrote:



On 24.10.21 15:31, Rodrigo Siqueira wrote:

From: Wenjing Liu 

[why]
We have a regression that cause maximize lane settings to use
uninitialized data from unused lanes.


Which commit caused the regression? Please amend the commit message.


This will cause link training to fail for 1 or 2 lanes because the lane
adjust is populated incorrectly sometimes.


On what card did you test this, and how can it be reproduced?

Please describe the fix/implementation in the commit message.


Reviewed-by: Eric Yang 
Acked-by: Rodrigo Siqueira 
Signed-off-by: Wenjing Liu 
---
   .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 35 +--
   1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c

index 653279ab96f4..f6ba7c734f54 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -108,6 +108,9 @@ static struct dc_link_settings 
get_common_supported_link_settings(

   struct dc_link_settings link_setting_b);
   static void maximize_lane_settings(const struct 
link_training_settings *lt_settings,

   struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+static void override_lane_settings(const struct 
link_training_settings *lt_settings,

+    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+
   static uint32_t get_cr_training_aux_rd_interval(struct dc_link 
*link,

   const struct dc_link_settings *link_settings)
   {
@@ -734,15 +737,13 @@ void dp_decide_lane_settings(
   }
   #endif
   }
-
-    /* we find the maximum of the requested settings across all 
lanes*/

-    /* and set this maximum for all lanes*/
   dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
dpcd_lane_settings);

     if (lt_settings->disallow_per_lane_settings) {
   /* we find the maximum of the requested settings across 
all lanes*/

   /* and set this maximum for all lanes*/
   maximize_lane_settings(lt_settings, hw_lane_settings);
+    override_lane_settings(lt_settings, hw_lane_settings);
     if (lt_settings->always_match_dpcd_with_hw_lane_settings)
   dp_hw_to_dpcd_lane_settings(lt_settings, 
hw_lane_settings, dpcd_lane_settings);
@@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct 
link_training_settings *lt_setti

   }
   }
   +static void override_lane_settings(const struct 
link_training_settings *lt_settings,

+    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
+{
+    uint32_t lane;
+
+    if (lt_settings->voltage_swing == NULL &&
+    lt_settings->pre_emphasis == NULL &&
+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+    lt_settings->ffe_preset == NULL &&
+#endif
+    lt_settings->post_cursor2 == NULL)
+
+    return;
+
+    for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
+    if (lt_settings->voltage_swing)
+    lane_settings[lane].VOLTAGE_SWING = 
*lt_settings->voltage_swing;

+    if (lt_settings->pre_emphasis)
+    lane_settings[lane].PRE_EMPHASIS = 
*lt_settings->pre_emphasis;

+    if (lt_settings->post_cursor2)
+    lane_settings[lane].POST_CURSOR2 = 
*lt_settings->post_cursor2;

+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+    if (lt_settings->ffe_preset)
+    lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;
+#endif


Normally these checks should be done in C and not the preprocessor. 
`if CONFIG(DRM_AMD_DC_DP2_0)` or similar should work.




Interesting. I've never seen this before. Do you have an example or 
link to a doc? A cursory search doesn't yield any results but I might 
not be searching for the right thing.


Harry


I'm curious about this too. The compiler with optimizations should 
remove the constant check, but technically the C standard only permits 
it - it doesn't guarantee that it happens.


However, this patch should actually be changed to drop these 
CONFIG_DRM_AMD_DC_DP2_0 guards - this isn't a Kconfig option nor will 
there be one specifically for DP2. This should be folded under the DCN 
support.


From the Linux kernel coding style [1][2]:


Within code, where possible, use the IS_ENABLED macro to convert a
Kconfig symbol into a C boolean expression, and use it in a normal C
conditional:

if (IS_ENABLED(CONFIG_SOMETHING)) {
...
}
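
As a concrete, purely illustrative sketch - assuming for the sake of the
example that DRM_AMD_DC_DP2_0 were a real Kconfig symbol and that the
ffe_preset/FFE_PRESET members are declared unconditionally - the guarded
assignment above would become:

        if (IS_ENABLED(CONFIG_DRM_AMD_DC_DP2_0) && lt_settings->ffe_preset)
                lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;

As Nicholas notes above, DRM_AMD_DC_DP2_0 is not actually a Kconfig
option, so the cleaner fix here is to drop the guards entirely.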



Kind regards,

Paul



+    }
+}
+
   enum dc_status dp_get_lane_status_and_lane_adjust(
   struct dc_link *link,
   const struct link_training_settings *link_training_setting,


[1]: 
https://www.kernel.org/doc/html/latest/process/coding-style.html#conditional-compilation

[2]: Documentation/process/coding-style.rst


Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane

2021-10-25 Thread Rodrigo Siqueira Jordao

Hi Paul,

First of all, thanks for your feedback.

Before I address your comments, I just want to give you some background 
about this type of patchset.


All patches in this series are from the past week and came from all of 
AMD's display teams. Keep in mind that we share our display core with 
other OSes; as a result, we have multiple updates every week. To keep 
upstream aligned with the display core, we run a weekly process where we 
extract the patches made for other platforms that might be good 
candidates to be sent upstream. The process looks like this (oversimplified):


1. Run a script to check which commits from last week are good candidates 
to go upstream, then port all of those patches to the upstream branch 
(this is a manual task).
2. Prepare a branch based on amd-staging-drm-next with all the new 
patches on top of it.
3. Send it to our validation team to run an extensive set of tests on 
the new series.
4. If everything is OK, send the patchset to amd-gfx.
5. If the test results are positive, we merge this series upstream.

Again, that's just a summary; we have way more steps here...

On 2021-10-25 7:25 a.m., Paul Menzel wrote:

Dear Wenjing, dear Rodrigo,


On 24.10.21 15:31, Rodrigo Siqueira wrote:

From: Wenjing Liu 

[why]
We have a regression that cause maximize lane settings to use
uninitialized data from unused lanes.


Which commit caused the regression? Please amend the commit message.


This is one of the situations where the weekly interval makes some 
descriptions sound a little off. Basically, this patch refers to the 
patch "implement decide lane settings" earlier in this series; since that 
patch was not reverted, we decided to keep both patches in the series.



This will cause link training to fail for 1 or 2 lanes because the lane
adjust is populated incorrectly sometimes.


On what card did you test this, and how can it be reproduced?


Our DQE team validates this series using multiple ASICs. See the final 
report from Daniel in reply to this series cover letter for more details.



Please describe the fix/implementation in the commit message.


Reviewed-by: Eric Yang 
Acked-by: Rodrigo Siqueira 
Signed-off-by: Wenjing Liu 
---
  .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 35 +--
  1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c

index 653279ab96f4..f6ba7c734f54 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -108,6 +108,9 @@ static struct dc_link_settings 
get_common_supported_link_settings(

  struct dc_link_settings link_setting_b);
  static void maximize_lane_settings(const struct 
link_training_settings *lt_settings,

  struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+static void override_lane_settings(const struct 
link_training_settings *lt_settings,

+    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+
  static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link,
  const struct dc_link_settings *link_settings)
  {
@@ -734,15 +737,13 @@ void dp_decide_lane_settings(
  }
  #endif
  }
-
-    /* we find the maximum of the requested settings across all lanes*/
-    /* and set this maximum for all lanes*/
  dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
dpcd_lane_settings);

  if (lt_settings->disallow_per_lane_settings) {
  /* we find the maximum of the requested settings across all 
lanes*/

  /* and set this maximum for all lanes*/
  maximize_lane_settings(lt_settings, hw_lane_settings);
+    override_lane_settings(lt_settings, hw_lane_settings);
  if (lt_settings->always_match_dpcd_with_hw_lane_settings)
  dp_hw_to_dpcd_lane_settings(lt_settings, 
hw_lane_settings, dpcd_lane_settings);
@@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct 
link_training_settings *lt_setti

  }
  }
+static void override_lane_settings(const struct 
link_training_settings *lt_settings,

+    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
+{
+    uint32_t lane;
+
+    if (lt_settings->voltage_swing == NULL &&
+    lt_settings->pre_emphasis == NULL &&
+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+    lt_settings->ffe_preset == NULL &&
+#endif
+    lt_settings->post_cursor2 == NULL)
+
+    return;
+
+    for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
+    if (lt_settings->voltage_swing)
+    lane_settings[lane].VOLTAGE_SWING = 
*lt_settings->voltage_swing;

+    if (lt_settings->pre_emphasis)
+    lane_settings[lane].PRE_EMPHASIS = 
*lt_settings->pre_emphasis;

+    if (lt_settings->post_cursor2)
+    lane_settings[lane].POST_CURSOR2 = 
*lt_settings->post_cursor2;

+#if defined(CONFIG_DRM_AMD_DC_DP2_0)

Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane

2021-10-25 Thread Kazlauskas, Nicholas

On 2021-10-25 9:58 a.m., Harry Wentland wrote:



On 2021-10-25 07:25, Paul Menzel wrote:

Dear Wenjing, dear Rodrigo,


On 24.10.21 15:31, Rodrigo Siqueira wrote:

From: Wenjing Liu 

[why]
We have a regression that cause maximize lane settings to use
uninitialized data from unused lanes.


Which commit caused the regression? Please amend the commit message.


This will cause link training to fail for 1 or 2 lanes because the lane
adjust is populated incorrectly sometimes.


On what card did you test this, and how can it be reproduced?

Please describe the fix/implementation in the commit message.


Reviewed-by: Eric Yang 
Acked-by: Rodrigo Siqueira 
Signed-off-by: Wenjing Liu 
---
   .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 35 +--
   1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 653279ab96f4..f6ba7c734f54 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -108,6 +108,9 @@ static struct dc_link_settings 
get_common_supported_link_settings(
   struct dc_link_settings link_setting_b);
   static void maximize_lane_settings(const struct link_training_settings 
*lt_settings,
   struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+static void override_lane_settings(const struct link_training_settings 
*lt_settings,
+    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+
   static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link,
   const struct dc_link_settings *link_settings)
   {
@@ -734,15 +737,13 @@ void dp_decide_lane_settings(
   }
   #endif
   }
-
-    /* we find the maximum of the requested settings across all lanes*/
-    /* and set this maximum for all lanes*/
   dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
dpcd_lane_settings);
     if (lt_settings->disallow_per_lane_settings) {
   /* we find the maximum of the requested settings across all lanes*/
   /* and set this maximum for all lanes*/
   maximize_lane_settings(lt_settings, hw_lane_settings);
+    override_lane_settings(lt_settings, hw_lane_settings);
     if (lt_settings->always_match_dpcd_with_hw_lane_settings)
   dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
dpcd_lane_settings);
@@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct 
link_training_settings *lt_setti
   }
   }
   +static void override_lane_settings(const struct link_training_settings 
*lt_settings,
+    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
+{
+    uint32_t lane;
+
+    if (lt_settings->voltage_swing == NULL &&
+    lt_settings->pre_emphasis == NULL &&
+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+    lt_settings->ffe_preset == NULL &&
+#endif
+    lt_settings->post_cursor2 == NULL)
+
+    return;
+
+    for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
+    if (lt_settings->voltage_swing)
+    lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing;
+    if (lt_settings->pre_emphasis)
+    lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis;
+    if (lt_settings->post_cursor2)
+    lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2;
+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+    if (lt_settings->ffe_preset)
+    lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;
+#endif


Normally these checks should be done in C and not the preprocessor. `if 
CONFIG(DRM_AMD_DC_DP2_0)` or similar should work.



Interesting. I've never seen this before. Do you have an example or link to a 
doc? A cursory search doesn't yield any results but I might not be searching 
for the right thing.

Harry


I'm curious about this too. The compiler with optimizations should 
remove the constant check, but technically the C standard only permits 
it - it doesn't guarantee that it happens.


However, this patch should actually be changed to drop these 
CONFIG_DRM_AMD_DC_DP2_0 guards - this isn't a Kconfig option nor will 
there be one specifically for DP2. This should be folded under the DCN 
support.


Regards,
Nicholas Kazlauskas




+    }
+}
+
   enum dc_status dp_get_lane_status_and_lane_adjust(
   struct dc_link *link,
   const struct link_training_settings *link_training_setting,




Kind regards,

Paul






Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13

2021-10-25 Thread Alex Deucher
On Mon, Oct 25, 2021 at 9:48 AM PGNet Dev  wrote:
>
> ( cc'ing this here, OP -> dri-devel@ )
>
> i've a dual gpu system
>
> inxi -GS
> System:Host: ws05 Kernel: 5.14.13-200.fc34.x86_64 x86_64 
> bits: 64 Console: tty pts/0
>Distro: Fedora release 34 (Thirty Four)
> (1) Graphics:  Device-1: NVIDIA GK208B [GeForce GT 710] driver: 
> nvidia v: 470.74
> (2)Device-2: Advanced Micro Devices [AMD/ATI] Cezanne 
> driver: N/A
>Display: server: X.org 1.20.11 driver: loaded: 
> nvidia unloaded: fbdev,modesetting,vesa
>Message: Advanced graphics data unavailable for 
> root.
>
> running on
>
> cpu:Ryzen 5 5600G
> mobo:   ASRockRack X470D4U
> bios:   vP4.20, 04/14/2021
> kernel: 5.14.13-200.fc34.x86_64 x86_64
>
> where,
>
> the nvidia is a PCIe card
> the amdgpu is the Ryzen-integrated gpu
>
> the nvidia PCI is currently my primary
> it's screen-attached, and boots/functions correctly
>
> lsmod | grep nvidia
> nvidia_drm 69632  0
> nvidia_modeset   1200128  1 nvidia_drm
> nvidia  35332096  1 nvidia_modeset
> drm_kms_helper303104  2 amdgpu,nvidia_drm
> drm   630784  8 
> gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm
>
> dmesg | grep -i nvidia
> [5.755494] nvidia: loading out-of-tree module taints 
> kernel.
> [5.755503] nvidia: module license 'NVIDIA' taints kernel.
> [5.759769] nvidia: module verification failed: signature 
> and/or required key missing - tainting kernel
> [5.774894] nvidia-nvlink: Nvlink Core is being 
> initialized, major device number 234
> [5.775299] nvidia :10:00.0: vgaarb: changed VGA 
> decodes: olddecodes=io+mem,decodes=none:owns=io+mem
> [5.975449] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 
>  470.74  Mon Sep 13 23:09:15 UTC 2021
> [6.013181] nvidia-modeset: Loading NVIDIA Kernel Mode 
> Setting Driver for UNIX platforms  470.74  Mon Sep 13 22:59:50 UTC 2021
> [6.016444] [drm] [nvidia-drm] [GPU ID 0x1000] Loading 
> driver
> [6.227295] caller _nv000723rm+0x1ad/0x200 [nvidia] 
> mapping multiple BARs
> [6.954906] [drm] Initialized nvidia-drm 0.0.0 20160202 
> for :10:00.0 on minor 0
> [   16.820758] input: HDA NVidia HDMI/DP,pcm=3 as 
> /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input13
> [   16.820776] input: HDA NVidia HDMI/DP,pcm=7 as 
> /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input14
> [   16.820808] input: HDA NVidia HDMI/DP,pcm=8 as 
> /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input15
> [   16.820826] input: HDA NVidia HDMI/DP,pcm=9 as 
> /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input16
> [   16.820841] input: HDA NVidia HDMI/DP,pcm=10 as 
> /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input17
>
> the amdgpu is not (currently/yet) in use; no attached screen
>
> in BIOS, currently,
>
> 'PCI Express' (nvidia gpu) is selected as primary
> 'HybridGraphics' is enabled
> 'OnBoard VGA' is enabled
>
>
> on boot, mods are loaded
>
> lsmod | grep gpu
> amdgpu   7802880  0
> drm_ttm_helper 16384  1 amdgpu
> ttm81920  2 amdgpu,drm_ttm_helper
> iommu_v2   24576  1 amdgpu
> gpu_sched  45056  1 amdgpu
> drm_kms_helper303104  2 amdgpu,nvidia_drm
> drm   630784  8 
> gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm
> i2c_algo_bit   16384  2 igb,amdgpu
>
> but i see a 'fatal error' and 'failed' probe,
>
> dmesg | grep -i amdgpu
> [5.161923] [drm] amdgpu kernel modesetting enabled.
> [5.162097] amdgpu: Virtual CRAT table created for CPU
> [5.162104] amdgpu: Topology: Add CPU node
> [5.162197] amdgpu :30:00.0: enabling device ( -> 
> 0003)
> [5.162232] amdgpu :30:00.0: amdgpu: Trusted Memory 
> Zone (TMZ) feature enabled
> [5.169105] amdgpu :30:00.0: BAR 6: can't assign [??? 
> 0x flags 0x2000] (bogus alignment)
> [5.174413] amdgpu :30:00.0: amdgpu: Unable to locate 
> a BIOS ROM
> [5.174415] amdgpu :30:00.0: amdgpu: Fatal error 
> during GPU init
> [5.174416] amdgpu :30:00.0: amdg

Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane

2021-10-25 Thread Harry Wentland



On 2021-10-25 07:25, Paul Menzel wrote:
> Dear Wenjing, dear Rodrigo,
> 
> 
> On 24.10.21 15:31, Rodrigo Siqueira wrote:
>> From: Wenjing Liu 
>>
>> [why]
>> We have a regression that cause maximize lane settings to use
>> uninitialized data from unused lanes.
> 
> Which commit caused the regression? Please amend the commit message.
> 
>> This will cause link training to fail for 1 or 2 lanes because the lane
>> adjust is populated incorrectly sometimes.
> 
> On what card did you test this, and how can it be reproduced?
> 
> Please describe the fix/implementation in the commit message.
> 
>> Reviewed-by: Eric Yang 
>> Acked-by: Rodrigo Siqueira 
>> Signed-off-by: Wenjing Liu 
>> ---
>>   .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 35 +--
>>   1 file changed, 32 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
>> b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> index 653279ab96f4..f6ba7c734f54 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> @@ -108,6 +108,9 @@ static struct dc_link_settings 
>> get_common_supported_link_settings(
>>   struct dc_link_settings link_setting_b);
>>   static void maximize_lane_settings(const struct link_training_settings 
>> *lt_settings,
>>   struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
>> +static void override_lane_settings(const struct link_training_settings 
>> *lt_settings,
>> +    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
>> +
>>   static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link,
>>   const struct dc_link_settings *link_settings)
>>   {
>> @@ -734,15 +737,13 @@ void dp_decide_lane_settings(
>>   }
>>   #endif
>>   }
>> -
>> -    /* we find the maximum of the requested settings across all lanes*/
>> -    /* and set this maximum for all lanes*/
>>   dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
>> dpcd_lane_settings);
>>     if (lt_settings->disallow_per_lane_settings) {
>>   /* we find the maximum of the requested settings across all lanes*/
>>   /* and set this maximum for all lanes*/
>>   maximize_lane_settings(lt_settings, hw_lane_settings);
>> +    override_lane_settings(lt_settings, hw_lane_settings);
>>     if (lt_settings->always_match_dpcd_with_hw_lane_settings)
>>   dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
>> dpcd_lane_settings);
>> @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct 
>> link_training_settings *lt_setti
>>   }
>>   }
>>   +static void override_lane_settings(const struct link_training_settings 
>> *lt_settings,
>> +    struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
>> +{
>> +    uint32_t lane;
>> +
>> +    if (lt_settings->voltage_swing == NULL &&
>> +    lt_settings->pre_emphasis == NULL &&
>> +#if defined(CONFIG_DRM_AMD_DC_DP2_0)
>> +    lt_settings->ffe_preset == NULL &&
>> +#endif
>> +    lt_settings->post_cursor2 == NULL)
>> +
>> +    return;
>> +
>> +    for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
>> +    if (lt_settings->voltage_swing)
>> +    lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing;
>> +    if (lt_settings->pre_emphasis)
>> +    lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis;
>> +    if (lt_settings->post_cursor2)
>> +    lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2;
>> +#if defined(CONFIG_DRM_AMD_DC_DP2_0)
>> +    if (lt_settings->ffe_preset)
>> +    lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;
>> +#endif
> 
> Normally these checks should be done in C and not the preprocessor. `if 
> CONFIG(DRM_AMD_DC_DP2_0)` or similar should work.
> 

Interesting. I've never seen this before. Do you have an example or link to a 
doc? A cursory search doesn't yield any results but I might not be searching 
for the right thing.

Harry

>> +    }
>> +}
>> +
>>   enum dc_status dp_get_lane_status_and_lane_adjust(
>>   struct dc_link *link,
>>   const struct link_training_settings *link_training_setting,
>>
> 
> 
> Kind regards,
> 
> Paul



Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13

2021-10-25 Thread PGNet Dev

adding a trace,

...
[5.201715] [drm] amdgpu kernel modesetting enabled.
[5.201902] amdgpu: Virtual CRAT table created for CPU
[5.201909] amdgpu: Topology: Add CPU node
[5.201968] checking generic (e100 1d5000) vs hw (c000 1000)
[5.201969] checking generic (e100 1d5000) vs hw (d000 20)
[5.201970] checking generic (e100 1d5000) vs hw (fc50 8)
[5.201988] amdgpu :30:00.0: enabling device ( -> 0003)
[5.202020] [drm] initializing kernel modesetting (RENOIR 0x1002:0x1638 
0x1002:0x1636 0xC9).
[5.202024] amdgpu :30:00.0: amdgpu: Trusted Memory Zone (TMZ) feature 
enabled
[5.202033] [drm] register mmio base: 0xFC50
[5.202033] [drm] register mmio size: 524288
[5.202035] [drm] PCIE atomic ops is not supported
[5.203075] [drm] add ip block number 0 
[5.203076] [drm] add ip block number 1 
[5.203077] [drm] add ip block number 2 
[5.203078] [drm] add ip block number 3 
[5.203078] [drm] add ip block number 4 
[5.203079] [drm] add ip block number 5 
[5.203079] [drm] add ip block number 6 
[5.203080] [drm] add ip block number 7 
[5.203081] [drm] add ip block number 8 
[5.203081] [drm] add ip block number 9 
[5.208784] [drm] BIOS signature incorrect 0 0
[5.208789] amdgpu :30:00.0: BAR 6: can't assign [??? 0x flags 
0x2000] (bogus alignment)
[5.214038] [drm] BIOS signature incorrect 0 0
[5.214042] amdgpu :30:00.0: amdgpu: Unable to locate a BIOS ROM
[5.214044] amdgpu :30:00.0: amdgpu: Fatal error during GPU init
[5.214045] amdgpu :30:00.0: amdgpu: amdgpu: finishing device.
[5.214048] [ cut here ]
[5.214049] WARNING: CPU: 5 PID: 539 at kernel/workqueue.c:3044 
__flush_work.isra.0+0x1ef/0x200
[5.214054] Modules linked in: fjes(-) amdgpu(+) raid1 ast drm_vram_helper 
drm_ttm_helper iommu_v2 ttm gpu_sched drm_kms_helper igb crct10dif_pclmul 
crc32_pclmul crc32c_intel ghash_clmulni_intel cec dca i2c_algo_bit sp5100_tco 
ccp drm uas usb_storage wmi video sunrpc tcp_bbr nct6775 hwmon_vid k10temp
[5.214065] CPU: 5 PID: 539 Comm: systemd-udevd Not tainted 
5.14.13-200.fc34.x86_64 #1
[5.214067] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./X470D4U, BIOS P4.20 04/14/2021
[5.214068] RIP: 0010:__flush_work.isra.0+0x1ef/0x200
[5.214070] Code: 8b 4d 00 48 8b 55 08 83 e1 08 48 0f ba 6d 00 03 80 c9 f0 e9 37 
ff ff ff 0f 0b 48 83 c4 48 44 89 e0 5b 5d 41 5c 41 5d 41 5e c3 <0f> 0b 45 31 e4 
e9 46 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00
[5.214071] RSP: 0018:9d5f00f0fa80 EFLAGS: 00010246
[5.214073] RAX: 0011 RBX:  RCX: 0027
[5.214074] RDX: 0001 RSI: 0001 RDI: 88bc91e25ab8
[5.214074] RBP: 88bc91e25ab8 R08:  R09: 9d5f00f0f898
[5.214075] R10: 9d5f00f0f890 R11: 88c39e1fcfe8 R12: 0001
[5.214075] R13: 88bc92622800 R14: 88bc91e2 R15: 9d5f00f0fde0
[5.214076] FS:  7f231d7deb40() GS:88c37df4() 
knlGS:
[5.214077] CS:  0010 DS:  ES:  CR0: 80050033
[5.214078] CR2: 7fa1bbfa5ff0 CR3: 000104b94000 CR4: 00750ea0
[5.214078] PKRU: 5554
[5.214079] Call Trace:
[5.214082]  ? dev_printk_emit+0x3e/0x40
[5.214085]  __cancel_work_timer+0xea/0x170
[5.214086]  ? del_timer_sync+0x57/0x80
[5.214089]  ttm_bo_lock_delayed_workqueue+0x11/0x20 [ttm]
[5.214093]  amdgpu_device_fini_hw+0x33/0x2c5 [amdgpu]
[5.214225]  amdgpu_driver_load_kms.cold+0x72/0x94 [amdgpu]
[5.214338]  amdgpu_pci_probe+0x110/0x1a0 [amdgpu]
[5.214420]  local_pci_probe+0x42/0x80
[5.214423]  ? __cond_resched+0x16/0x40
[5.214426]  pci_device_probe+0xd9/0x190
[5.214427]  really_probe+0x1f5/0x3f0
[5.214429]  __driver_probe_device+0xfe/0x180
[5.214430]  driver_probe_device+0x1e/0x90
[5.214431]  __driver_attach+0xc0/0x1c0
[5.214433]  ? __device_attach_driver+0xe0/0xe0
[5.214434]  ? __device_attach_driver+0xe0/0xe0
[5.214434]  bus_for_each_dev+0x64/0x90
[5.214436]  bus_add_driver+0x12b/0x1e0
[5.214438]  driver_register+0x8f/0xe0
[5.214439]  ? 0xc0d62000
[5.214440]  do_one_initcall+0x44/0x1d0
[5.214443]  ? kmem_cache_alloc_trace+0x15c/0x280
[5.214445]  do_init_module+0x5c/0x270
[5.214448]  __do_sys_init_module+0x11d/0x180
[5.214450]  do_syscall_64+0x3b/0x90
[5.214452]  ? handle_mm_fault+0xcf/0x2a0
[5.214454]  ? do_user_addr_fault+0x1d5/0x680
[5.214457]  ? syscall_exit_to_user_mode+0x18/0x40
[5.214458]  ? exc_page_fault+0x72/0x150
[5.214459]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[5.214461] RIP: 0033:0x7f231e42a0fe
[5.214463] Code: 48 8b 0d 7d 1d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 
84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff 
ff 73

amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13

2021-10-25 Thread PGNet Dev

( cc'ing this here, OP -> dri-devel@ )

i've a dual gpu system

inxi -GS
System:Host: ws05 Kernel: 5.14.13-200.fc34.x86_64 x86_64 
bits: 64 Console: tty pts/0
   Distro: Fedora release 34 (Thirty Four)
(1) Graphics:  Device-1: NVIDIA GK208B [GeForce GT 710] driver: 
nvidia v: 470.74
(2)Device-2: Advanced Micro Devices [AMD/ATI] Cezanne 
driver: N/A
   Display: server: X.org 1.20.11 driver: loaded: 
nvidia unloaded: fbdev,modesetting,vesa
   Message: Advanced graphics data unavailable for root.

running on

cpu:Ryzen 5 5600G
mobo:   ASRockRack X470D4U
bios:   vP4.20, 04/14/2021
kernel: 5.14.13-200.fc34.x86_64 x86_64

where,

the nvidia is a PCIe card
the amdgpu is the Ryzen-integrated gpu

the nvidia PCI is currently my primary
it's screen-attached, and boots/functions correctly

lsmod | grep nvidia
nvidia_drm 69632  0
nvidia_modeset   1200128  1 nvidia_drm
nvidia  35332096  1 nvidia_modeset
drm_kms_helper303104  2 amdgpu,nvidia_drm
drm   630784  8 
gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm

dmesg | grep -i nvidia
[5.755494] nvidia: loading out-of-tree module taints kernel.
[5.755503] nvidia: module license 'NVIDIA' taints kernel.
[5.759769] nvidia: module verification failed: signature 
and/or required key missing - tainting kernel
[5.774894] nvidia-nvlink: Nvlink Core is being initialized, 
major device number 234
[5.775299] nvidia :10:00.0: vgaarb: changed VGA 
decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[5.975449] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  
470.74  Mon Sep 13 23:09:15 UTC 2021
[6.013181] nvidia-modeset: Loading NVIDIA Kernel Mode 
Setting Driver for UNIX platforms  470.74  Mon Sep 13 22:59:50 UTC 2021
[6.016444] [drm] [nvidia-drm] [GPU ID 0x1000] Loading 
driver
[6.227295] caller _nv000723rm+0x1ad/0x200 [nvidia] mapping 
multiple BARs
[6.954906] [drm] Initialized nvidia-drm 0.0.0 20160202 for 
:10:00.0 on minor 0
[   16.820758] input: HDA NVidia HDMI/DP,pcm=3 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input13
[   16.820776] input: HDA NVidia HDMI/DP,pcm=7 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input14
[   16.820808] input: HDA NVidia HDMI/DP,pcm=8 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input15
[   16.820826] input: HDA NVidia HDMI/DP,pcm=9 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input16
[   16.820841] input: HDA NVidia HDMI/DP,pcm=10 as 
/devices/pci:00/:00:01.1/:10:00.1/sound/card0/input17

the amdgpu is not (currently/yet) in use; no attached screen

in BIOS, currently,

'PCI Express' (nvidia gpu) is selected as primary
'HybridGraphics' is enabled
'OnBoard VGA' is enabled


on boot, mods are loaded

lsmod | grep gpu
amdgpu   7802880  0
drm_ttm_helper 16384  1 amdgpu
ttm81920  2 amdgpu,drm_ttm_helper
iommu_v2   24576  1 amdgpu
gpu_sched  45056  1 amdgpu
drm_kms_helper303104  2 amdgpu,nvidia_drm
drm   630784  8 
gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm
i2c_algo_bit   16384  2 igb,amdgpu

but i see a 'fatal error' and 'failed' probe,

dmesg | grep -i amdgpu
[5.161923] [drm] amdgpu kernel modesetting enabled.
[5.162097] amdgpu: Virtual CRAT table created for CPU
[5.162104] amdgpu: Topology: Add CPU node
[5.162197] amdgpu :30:00.0: enabling device ( -> 
0003)
[5.162232] amdgpu :30:00.0: amdgpu: Trusted Memory Zone 
(TMZ) feature enabled
[5.169105] amdgpu :30:00.0: BAR 6: can't assign [??? 
0x flags 0x2000] (bogus alignment)
[5.174413] amdgpu :30:00.0: amdgpu: Unable to locate a 
BIOS ROM
[5.174415] amdgpu :30:00.0: amdgpu: Fatal error during 
GPU init
[5.174416] amdgpu :30:00.0: amdgpu: amdgpu: finishing 
device.
[5.174425] Modules linked in: amdgpu(+) uas usb_storage 
fjes(-) raid1 drm_ttm_helper ttm iommu_v2 gpu_sched drm_kms_helper 
crct10dif_pclmul crc32_pclmul igb crc32c_intel cec ghash_clmulni_intel drm 
sp51

RE: [PATCH 00/33] DC Patches October 24, 2020

2021-10-25 Thread Wheeler, Daniel

Hi all,
 
This week this patchset was tested on the following systems:
 
HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 
60hz, 4k 60hz  (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 
1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA)
 
AMD Ryzen 9 5900H, with the following display types: eDP 1080p 60hz, 4k 60hz  
(via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via 
USB-C to DP and then DP to DVI/VGA)
 
Sapphire Pulse RX5700XT with the following display types:
4k 60hz  (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to 
DVI/VGA)
 
Reference AMD RX6800 with the following display types:
4k 60hz  (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI 
and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA)
 
Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz, and 3x 1080p 
60hz on all systems. Also tested DSC via USB-C to DP DSC Hub with 3x 4k 60hz on 
Ryzen 9 5900h and Ryzen 5 4500u.
 
 
Tested-by: Daniel Wheeler 
 
 
Thank you,
 
Dan Wheeler
Technologist  |  AMD
SW Display
--

-Original Message-
From: amd-gfx  On Behalf Of Rodrigo 
Siqueira
Sent: October 24, 2021 9:31 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wentland, Harry ; Li, Sun peng (Leo) 
; Lakha, Bhawanpreet ; Siqueira, 
Rodrigo ; Pillai, Aurabindo 
; Zhuo, Qingqing ; Lipski, 
Mikita ; Li, Roman ; Jacob, Anson 
; Lin, Wayne ; Wang, Chao-kai (Stylon) 
; Chiu, Solomon ; Kotarac, Pavle 
; Gutierrez, Agustin 
Subject: [PATCH 00/33] DC Patches October 24, 2020

This new DC version brings improvements in the following areas:
- Improvements for USB4;
- Isolate FPU code for DCN20, DCN301, and DSC;
- Fixes on Linking training;
- Refactoring some parts of the code, such as PSR;

Thanks

Ahmad Othman (2):
  drm/amd/display: Add support for USB4 on C20 PHY for DCN3.1
  drm/amd/display: fix a crash on USB4 over C20 PHY

Anson Jacob (2):
  drm/amd/display: dcn20_resource_construct reduce scope of FPU enabled
  drm/amd/display: Remove unused macros

Anthony Koo (2):
  drm/amd/display: [FW Promotion] Release 0.0.89
  drm/amd/display: [FW Promotion] Release 0.0.90

Aric Cyr (4):
  drm/amd/display: Handle I2C-over-AUX write channel status update
  drm/amd/display: 3.2.158
  drm/amd/display: Fix 3DLUT skipped programming
  drm/amd/display: 3.2.159

Dmytro Laktyushkin (3):
  drm/amd/display: clean up dcn31 revision check
  drm/amd/display: restyle dcn31 resource header inline with other asics
  drm/amd/display: allow windowed mpo + odm

George Shen (2):
  drm/amd/display: Implement fixed DP drive settings
  drm/amd/display: Add comment for preferred_training_settings

Guo, Bing (2):
  drm/amd/display: Get ceiling for v_total calc
  drm/amd/display: set Layout properly for 8ch audio at timing
validation

Hansen (1):
  drm/amd/display: Set phy_mux_sel bit in dmub scratch register

Jimmy Kizito (1):
  drm/amd/display: Add workaround flag for EDID read on certain docks

Lewis Huang (1):
  drm/amd/display: Align bw context with hw config when system resume

Martin Leung (1):
  drm/amd/display: Manually adjust strobe for DCN303

Meenakshikumar Somasundaram (2):
  drm/amd/display: FEC configuration for dpia links
  drm/amd/display: FEC configuration for dpia links in MST mode

Michael Strauss (2):
  drm/amd/display: Set i2c memory to light sleep during hw init
  drm/amd/display: Defer GAMCOR and DSCL power down sequence to vupdate

Qingqing Zhuo (2):
  drm/amd/display: move FPU associated DSC code to DML folder
  drm/amd/display: move FPU associated DCN301 code to DML folder

Robin Chen (1):
  drm/amd/display: dc_link_set_psr_allow_active refactoring

Wenjing Liu (5):
  drm/amd/display: adopt DP2.0 LT SCR revision 8
  drm/amd/display: implement decide lane settings
  drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings
  drm/amd/display: add two lane settings training options
  drm/amd/display: fix link training regression for 1 or 2 lane

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   2 +-
 .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |   2 +
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c |  10 +-
 .../drm/amd/display/dc/bios/bios_parser2.c|   2 +
 .../gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c  |  16 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c  |  17 +-
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 118 -  
.../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  15 +-  
.../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 498 +++---
 .../drm/amd/display/dc/core/dc_link_dpia.c|  57 +-
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  14 +-
 drivers/gpu/drm/amd/display/dc/dc.h   |  23 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |   5 -
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  13 +-
 .../gp

[PATCH v2 2/8] drm: improve drm_buddy_alloc function

2021-10-25 Thread Arunpravin
- Make drm_buddy_alloc a single function to handle
  range allocation and non-range allocation demands

- Implemented a new function alloc_range() which allocates
  the requested power-of-two block complying with the range limitations

- Moved order computation and memory alignment logic from
  i915 driver to drm buddy
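
(A rough sketch of the kind of computation this refers to; the names and
exact rounding below are assumptions for illustration, not the series
code. A byte size is translated into a buddy order relative to the chunk
size and clamped to the minimum page size:)

        u64 pages = size >> ilog2(mm->chunk_size);
        unsigned int order = fls64(pages) - 1;  /* largest block <= size */
        unsigned int min_order = ilog2(min_page_size) - ilog2(mm->chunk_size);

        order = max(order, min_order);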

V2:
merged below changes to keep the build unbroken
- drm_buddy_alloc_range() becomes obsolete and may be removed
- enable ttm range allocation (fpfn / lpfn) support in i915 driver
- apply enhanced drm_buddy_alloc() function to i915 driver

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 265 +++---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
 include/drm/drm_buddy.h   |  22 +-
 4 files changed, 207 insertions(+), 149 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 39eb1d224bec..406e3d521903 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -274,63 +274,6 @@ void drm_buddy_free_list(struct drm_buddy_mm *mm, struct 
list_head *objects)
 }
 EXPORT_SYMBOL(drm_buddy_free_list);
 
-/**
- * drm_buddy_alloc - allocate power-of-two blocks
- *
- * @mm: DRM buddy manager to allocate from
- * @order: size of the allocation
- *
- * The order value here translates to:
- *
- * 0 = 2^0 * mm->chunk_size
- * 1 = 2^1 * mm->chunk_size
- * 2 = 2^2 * mm->chunk_size
- *
- * Returns:
- * allocated ptr to the &drm_buddy_block on success
- */
-struct drm_buddy_block *
-drm_buddy_alloc(struct drm_buddy_mm *mm, unsigned int order)
-{
-   struct drm_buddy_block *block = NULL;
-   unsigned int i;
-   int err;
-
-   for (i = order; i <= mm->max_order; ++i) {
-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
-   }
-
-   if (!block)
-   return ERR_PTR(-ENOSPC);
-
-   BUG_ON(!drm_buddy_block_is_free(block));
-
-   while (i != order) {
-   err = split_block(mm, block);
-   if (unlikely(err))
-   goto out_free;
-
-   /* Go low */
-   block = block->left;
-   i--;
-   }
-
-   mark_allocated(block);
-   mm->avail -= drm_buddy_block_size(mm, block);
-   kmemleak_update_trace(block);
-   return block;
-
-out_free:
-   if (i != order)
-   __drm_buddy_free(mm, block);
-   return ERR_PTR(err);
-}
-EXPORT_SYMBOL(drm_buddy_alloc);
-
 static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
 {
return s1 <= e2 && e1 >= s2;
@@ -341,52 +284,22 @@ static inline bool contains(u64 s1, u64 e1, u64 s2, u64 
e2)
return s1 <= s2 && e1 >= e2;
 }
 
-/**
- * drm_buddy_alloc_range - allocate range
- *
- * @mm: DRM buddy manager to allocate from
- * @blocks: output list head to add allocated blocks
- * @start: start of the allowed range for this block
- * @size: size of the allocation
- *
- * Intended for pre-allocating portions of the address space, for example to
- * reserve a block for the initial framebuffer or similar, hence the 
expectation
- * here is that drm_buddy_alloc() is still the main vehicle for
- * allocations, so if that's not the case then the drm_mm range allocator is
- * probably a much better fit, and so you should probably go use that instead.
- *
- * Note that it's safe to chain together multiple alloc_ranges
- * with the same blocks list
- *
- * Returns:
- * 0 on success, error code on failure.
- */
-int drm_buddy_alloc_range(struct drm_buddy_mm *mm,
- struct list_head *blocks,
- u64 start, u64 size)
+static struct drm_buddy_block *
+alloc_range(struct drm_buddy_mm *mm,
+   u64 start, u64 end,
+   unsigned int order)
 {
struct drm_buddy_block *block;
struct drm_buddy_block *buddy;
-   LIST_HEAD(allocated);
LIST_HEAD(dfs);
-   u64 end;
int err;
int i;
 
-   if (size < mm->chunk_size)
-   return -EINVAL;
-
-   if (!IS_ALIGNED(size | start, mm->chunk_size))
-   return -EINVAL;
-
-   if (range_overflows(start, size, mm->size))
-   return -EINVAL;
+   end = end - 1;
 
for (i = 0; i < mm->n_roots; ++i)
list_add_tail(&mm->roots[i]->tmp_link, &dfs);
 
-   end = start + size - 1;
-
do {
u64 block_start;
u64 block_end;
@@ -394,31 +307,32 @@ int drm_buddy_alloc_range(struct drm_buddy_mm *mm,
block = list_first_entry_or_null(&dfs,
 struct drm_buddy_block,
 tmp_link);
+
if (!block)
break;

[PATCH v2 1/8] drm: move the buddy allocator from i915 into common drm

2021-10-25 Thread Arunpravin
Move the base i915 buddy allocator code into drm
- Move i915_buddy.h to include/drm
- Move i915_buddy.c to drm root folder
- Rename "i915" string with "drm" string wherever applicable
- Rename "I915" string with "DRM" string wherever applicable
- Fix header file dependencies
- Fix alignment issues
- add Makefile support for drm buddy
- export functions and write kerneldoc description
- Remove i915 selftest config check condition as buddy selftest
  will be moved to drm selftest folder

Clean up the i915 buddy references in the i915 driver module
and replace them with drm buddy

V2:
- include header file in alphabetical order (Thomas)
- merged changes listed in the body section into a single patch
  to keep the build intact (Christian, Jani)
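
A minimal driver-side usage sketch of the relocated allocator (the fini
call assumes the i915_buddy_fini counterpart is renamed the same way;
sizes are placeholders):

        struct drm_buddy_mm mm;
        int err;

        /* manage 4 GiB of address space in 4 KiB chunks */
        err = drm_buddy_init(&mm, SZ_4G, SZ_4K);
        if (err)
                return err;

        /* ... drm_buddy_alloc() / drm_buddy_free_list() ... */

        drm_buddy_fini(&mm);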

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Makefile  |   2 +-
 drivers/gpu/drm/drm_buddy.c   | 521 ++
 drivers/gpu/drm/drm_drv.c |   3 +
 drivers/gpu/drm/i915/Makefile |   1 -
 drivers/gpu/drm/i915/i915_buddy.c | 466 
 drivers/gpu/drm/i915/i915_buddy.h | 143 -
 drivers/gpu/drm/i915/i915_module.c|   3 -
 drivers/gpu/drm/i915/i915_scatterlist.c   |  11 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  33 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   4 +-
 include/drm/drm_buddy.h   | 153 +
 11 files changed, 702 insertions(+), 638 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 include/drm/drm_buddy.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 0dff40bb863c..dc61e91a3154 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -18,7 +18,7 @@ drm-y   :=drm_aperture.o drm_auth.o drm_cache.o \
drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
-   drm_managed.o drm_vblank_work.o
+   drm_managed.o drm_vblank_work.o drm_buddy.o
 
 drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
drm_dma.o \
drm_legacy_misc.o drm_lock.o drm_memory.o 
drm_scatter.o \
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
new file mode 100644
index ..39eb1d224bec
--- /dev/null
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -0,0 +1,521 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+static struct kmem_cache *slab_blocks;
+
+static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
+  struct drm_buddy_block *parent,
+  unsigned int order,
+  u64 offset)
+{
+   struct drm_buddy_block *block;
+
+   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
+
+   block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
+   if (!block)
+   return NULL;
+
+   block->header = offset;
+   block->header |= order;
+   block->parent = parent;
+
+   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
+   return block;
+}
+
+static void drm_block_free(struct drm_buddy_mm *mm,
+  struct drm_buddy_block *block)
+{
+   kmem_cache_free(slab_blocks, block);
+}
+
+static void mark_allocated(struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_ALLOCATED;
+
+   list_del(&block->link);
+}
+
+static void mark_free(struct drm_buddy_mm *mm,
+ struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_FREE;
+
+   list_add(&block->link,
+&mm->free_list[drm_buddy_block_order(block)]);
+}
+
+static void mark_split(struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_SPLIT;
+
+   list_del(&block->link);
+}
+
+/**
+ * drm_buddy_init - init memory manager
+ *
+ * @mm: DRM buddy manager to initialize
+ * @size: size in bytes to manage
+ * @chunk_size: minimum page size in bytes for our allocations
+ *
+ * Initializes the memory manager and its resources.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
+{
+   unsigned int i;
+   u64 offset;
+
+   if (size < chunk_size)
+   return -EINVAL;
+
+   if (chunk_size < PAGE_SIZE)
+   return -EINVAL;
+
+   if (!is_power_of_2(chunk_size))
+   return -EINVAL;
+
+   size = round_down(size, chunk_size);
+
+   mm->size = size;
+   mm->av

[PATCH 8/8] drm/amdgpu: add drm buddy support to amdgpu

2021-10-25 Thread Arunpravin
- Remove drm_mm references and replace with drm buddy functionalities
- Add res cursor support for drm buddy
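
A rough sketch of how a caller walks a resource with the cursor after this
change (do_something() is a placeholder; the calling pattern itself is
unchanged):

        struct amdgpu_res_cursor cursor;

        amdgpu_res_first(res, 0, res->num_pages << PAGE_SHIFT, &cursor);
        while (cursor.remaining) {
                /* each step yields one contiguous span: a buddy block
                 * for VRAM, a drm_mm node for GTT */
                do_something(cursor.start, cursor.size);
                amdgpu_res_next(&cursor, cursor.size);
        }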

Signed-off-by: Arunpravin 
---
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 239 ++
 3 files changed, 221 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..da12b4ff2e45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= amdgpu_node_size(block)) {
+   start -= amdgpu_node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct 
drm_buddy_block, link);
+   }
+
+   cur->start = amdgpu_node_start(block) + start;
+   cur->size = min(amdgpu_node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t 
size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+   struct drm_mm_node *node;
+   struct list_head *next;
 
BUG_ON(size > cur->remaining);
 
@@ -99,9 +140,27 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor 
*cur, uint64_t size)
return;
}
 
-   cur->node = ++node;
-   cur->start = node->start << PAGE_SHIFT;
-   cur->size = min(node->size << PAGE_SHIFT, cur->remaining);
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   block = cur->node;
+
+   next = block->link.next;
+   block = list_entry(next, struct drm_buddy_block, link);
+
+   cur->node = block;
+   cur->start = amdgpu_node_start(block);
+   cur->size = min(amdgpu_node_size(block), cur->remaining);
+   break;
+   case TTM_PL_TT:
+   node = cur->node;
+
+   cur->node = ++node;
+   cur->start = 

[PATCH 7/8] drm/amdgpu: move vram inline functions into a header

2021-10-25 Thread Arunpravin
Move shared vram inline functions and structs
into a header file
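
For reference, amdgpu_node_size() below converts a buddy block order into
bytes; e.g. with 4 KiB pages an order-0 block is 4 KiB and an order-9
block is PAGE_SIZE << 9 = 2 MiB.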

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 51 
 1 file changed, 51 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..32cd6f54b6f3
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+static inline uint64_t amdgpu_node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline uint64_t amdgpu_node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+#endif
-- 
2.25.1



[PATCH 6/8] drm/i915: add free_unused_pages support to i915

2021-10-25 Thread Arunpravin
add drm_buddy_free_unused_pages() support on
contiguous allocation

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 963468228392..162947af8e04 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -98,6 +98,14 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
if (unlikely(err))
goto err_free_blocks;
 
+   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+   err = drm_buddy_free_unused_pages(mm, (uint64_t)n_pages << 
PAGE_SHIFT,
+  &bman_res->blocks);
+
+   if (unlikely(err))
+   goto err_free_blocks;
+   }
+
*res = &bman_res->base;
return 0;
 
-- 
2.25.1



[PATCH 5/8] drm: Implement method to free unused pages

2021-10-25 Thread Arunpravin
On contiguous allocation, we round up the size
to the *next* power of 2; implement a function
to free the unused pages after the newly allocated block.
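
As a worked example (sizes chosen purely for illustration, 4 KiB chunks):
a contiguous request of 768 KiB gets rounded up to a single 1 MiB block;
drm_buddy_free_unused_pages() then splits that block, keeps the 512 KiB +
256 KiB blocks covering the requested 768 KiB on the allocation list, and
returns the trailing unused 256 KiB to the free lists.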

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 103 
 include/drm/drm_buddy.h |   4 ++
 2 files changed, 107 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 9d3547bcc5da..0da8510736eb 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -284,6 +284,109 @@ static inline bool contains(u64 s1, u64 e1, u64 s2, u64 
e2)
return s1 <= s2 && e1 >= e2;
 }
 
+/**
+ * drm_buddy_free_unused_pages - free unused pages
+ *
+ * @mm: DRM buddy manager
+ * @actual_size: original size requested
+ * @blocks: output list head to add allocated blocks
+ *
+ * For contiguous allocation, we round up the size to the nearest
+ * power of two value, drivers consume *actual* size, so remaining
+ * portions are unused and it can be freed.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm,
+   u64 actual_size,
+   struct list_head *blocks)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   u64 actual_start;
+   u64 actual_end;
+   LIST_HEAD(dfs);
+   u64 count = 0;
+   int err;
+
+   if (!list_is_singular(blocks))
+   return -EINVAL;
+
+   block = list_first_entry_or_null(blocks,
+struct drm_buddy_block,
+link);
+
+   if (!block)
+   return -EINVAL;
+
+   if (actual_size > drm_buddy_block_size(mm, block))
+   return -EINVAL;
+
+   if (actual_size == drm_buddy_block_size(mm, block))
+   return 0;
+
+   list_del(&block->link);
+
+   actual_start = drm_buddy_block_offset(block);
+   actual_end = actual_start + actual_size - 1;
+
+   if (drm_buddy_block_is_allocated(block))
+   mark_free(mm, block);
+
+   list_add(&block->tmp_link, &dfs);
+
+   while (1) {
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (count == actual_size)
+   return 0;
+
+   if (contains(actual_start, actual_end, 
drm_buddy_block_offset(block),
+   (drm_buddy_block_offset(block) + 
drm_buddy_block_size(mm, block) - 1))) {
+   BUG_ON(!drm_buddy_block_is_free(block));
+
+   /* Allocate only required blocks */
+   mark_allocated(block);
+   mm->avail -= drm_buddy_block_size(mm, block);
+   list_add_tail(&block->link, blocks);
+   count += drm_buddy_block_size(mm, block);
+   continue;
+   }
+
+   if (drm_buddy_block_order(block) == 0)
+   continue;
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+
+   if (unlikely(err))
+   goto err_undo;
+   }
+
+   list_add(&block->right->tmp_link, &dfs);
+   list_add(&block->left->tmp_link, &dfs);
+   }
+
+   return -ENOSPC;
+
+err_undo:
+   buddy = get_buddy(block);
+   if (buddy &&
+   (drm_buddy_block_is_free(block) &&
+drm_buddy_block_is_free(buddy)))
+   __drm_buddy_free(mm, block);
+   return err;
+}
+EXPORT_SYMBOL(drm_buddy_free_unused_pages);
+
 static struct drm_buddy_block *
 alloc_range(struct drm_buddy_mm *mm,
u64 start, u64 end,
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index cd8021d2d6e7..1dfc80c88e1f 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -145,6 +145,10 @@ int drm_buddy_alloc(struct drm_buddy_mm *mm,
struct list_head *blocks,
unsigned long flags);
 
+int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm,
+   u64 actual_size,
+   struct list_head *blocks);
+
 void drm_buddy_free(struct drm_buddy_mm *mm, struct drm_buddy_block *block);
 
 void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects);
-- 
2.25.1



[PATCH 4/8] drm/i915: add top-down alloc support to i915

2021-10-25 Thread Arunpravin
Add top-down allocation support to the i915 driver.

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 75197ba6e40d..963468228392 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man,
INIT_LIST_HEAD(&bman_res->blocks);
bman_res->mm = mm;
 
+   if (place->flags & TTM_PL_FLAG_TOPDOWN)
+   bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
if (place->fpfn || lpfn != man->size)
bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
-- 
2.25.1



[PATCH 3/8] drm: implement top-down allocation method

2021-10-25 Thread Arunpravin
Implemented a function which walks through the order list,
compares the offsets and returns the maximum offset block.
This method is unpredictable in obtaining high range
address blocks, since it depends on the history of
allocations and deallocations: for instance, if a driver
requests an address in a specific low range, the allocator
traverses from the root block and splits the larger blocks
until it reaches the specific block; in the process of
splitting, the lower orders of the freelist become occupied
with low range address blocks, and a subsequent TOPDOWN
request may then be served from those low range blocks. To
overcome this issue, we may go with the approach below.

The other approach: keep the entries of each order list
sorted in ascending order, then compare the last entry of
each order list in the freelist and return the max block.
This creates sorting overhead on every drm_buddy_free()
request and on the split-up of larger blocks for a single
page request.
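
For comparison, an editor's sketch (not part of this patch) of what the
sorted-freelist alternative described above could look like; the helper
name and where it would be called from are assumptions:

	/* Keep each order's free list sorted by offset at free time, so the
	 * maximum-offset block is always the tail of the list.  The list walk
	 * below is the per-free overhead mentioned above. */
	static void add_sorted_by_offset(struct drm_buddy_block *block,
					 struct list_head *head)
	{
		struct drm_buddy_block *node;

		list_for_each_entry(node, head, link) {
			if (drm_buddy_block_offset(node) >
			    drm_buddy_block_offset(block)) {
				/* list_add_tail(new, pos) links 'new' just before 'pos' */
				list_add_tail(&block->link, &node->link);
				return;
			}
		}
		list_add_tail(&block->link, head);
	}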

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 42 +++--
 include/drm/drm_buddy.h |  1 +
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 406e3d521903..9d3547bcc5da 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -362,6 +362,27 @@ alloc_range(struct drm_buddy_mm *mm,
return ERR_PTR(err);
 }
 
+static struct drm_buddy_block *
+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }
+
+   return max_block;
+}
+
 static struct drm_buddy_block *
 alloc_from_freelist(struct drm_buddy_mm *mm,
unsigned int order,
@@ -372,13 +393,22 @@ alloc_from_freelist(struct drm_buddy_mm *mm,
int err;
 
for (i = order; i <= mm->max_order; ++i) {
-   if (!list_empty(&mm->free_list[i])) {
-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   if (!list_empty(&mm->free_list[i])) {
+   block = get_maxblock(&mm->free_list[i]);
 
-   if (block)
-   break;
+   if (block)
+   break;
+   }
+   } else {
+   if (!list_empty(&mm->free_list[i])) {
+   block = list_first_entry_or_null(&mm->free_list[i],
+struct drm_buddy_block,
+link);
+
+   if (block)
+   break;
+   }
}
}
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index c7bb5509a7ad..cd8021d2d6e7 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
 })
 
 #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0)
+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-- 
2.25.1



Re: I got an IOMMU IO page fault. What to do now?

2021-10-25 Thread Paul Menzel

Dear Christian,


Thank you for your reply.


On 25.10.21 13:23, Christian König wrote:

I am not sure how the IOMMU gives out addresses, but the printed ones look 
suspicious to me. It looks like we are using an invalid address such as -1 
or similar.


Can you try that on an up to date kernel as well? E.g. ideally bleeding 
edge amd-staging-drm-next from Alex repository.


These are production desktops, so I’d need to talk to the user. 
Currently, Linux 5.10.70 is running.



Kind regards,

Paul


Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane

2021-10-25 Thread Paul Menzel

Dear Wenjing, dear Rodrigo,


On 24.10.21 15:31, Rodrigo Siqueira wrote:

From: Wenjing Liu 

[why]
We have a regression that cause maximize lane settings to use
uninitialized data from unused lanes.


Which commit caused the regression? Please amend the commit message.


This will cause link training to fail for 1 or 2 lanes because the lane
adjust is populated incorrectly sometimes.


On what card did you test this, and how can it be reproduced?

Please describe the fix/implementation in the commit message.


Reviewed-by: Eric Yang 
Acked-by: Rodrigo Siqueira 
Signed-off-by: Wenjing Liu 
---
  .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 35 +--
  1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 653279ab96f4..f6ba7c734f54 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -108,6 +108,9 @@ static struct dc_link_settings 
get_common_supported_link_settings(
struct dc_link_settings link_setting_b);
  static void maximize_lane_settings(const struct link_training_settings 
*lt_settings,
struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+static void override_lane_settings(const struct link_training_settings 
*lt_settings,
+   struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]);
+
  static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link,
const struct dc_link_settings *link_settings)
  {
@@ -734,15 +737,13 @@ void dp_decide_lane_settings(
}
  #endif
}
-
-   /* we find the maximum of the requested settings across all lanes*/
-   /* and set this maximum for all lanes*/
dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, 
dpcd_lane_settings);
  
  	if (lt_settings->disallow_per_lane_settings) {

/* we find the maximum of the requested settings across all 
lanes*/
/* and set this maximum for all lanes*/
maximize_lane_settings(lt_settings, hw_lane_settings);
+   override_lane_settings(lt_settings, hw_lane_settings);
  
  		if (lt_settings->always_match_dpcd_with_hw_lane_settings)

dp_hw_to_dpcd_lane_settings(lt_settings, 
hw_lane_settings, dpcd_lane_settings);
@@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct 
link_training_settings *lt_setti
}
  }
  
+static void override_lane_settings(const struct link_training_settings *lt_settings,

+   struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX])
+{
+   uint32_t lane;
+
+   if (lt_settings->voltage_swing == NULL &&
+   lt_settings->pre_emphasis == NULL &&
+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+   lt_settings->ffe_preset == NULL &&
+#endif
+   lt_settings->post_cursor2 == NULL)
+
+   return;
+
+   for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
+   if (lt_settings->voltage_swing)
+   lane_settings[lane].VOLTAGE_SWING = 
*lt_settings->voltage_swing;
+   if (lt_settings->pre_emphasis)
+   lane_settings[lane].PRE_EMPHASIS = 
*lt_settings->pre_emphasis;
+   if (lt_settings->post_cursor2)
+   lane_settings[lane].POST_CURSOR2 = 
*lt_settings->post_cursor2;
+#if defined(CONFIG_DRM_AMD_DC_DP2_0)
+   if (lt_settings->ffe_preset)
+   lane_settings[lane].FFE_PRESET = 
*lt_settings->ffe_preset;
+#endif


Normally these checks should be done in C and not the preprocessor. `if 
CONFIG(DRM_AMD_DC_DP2_0)` or similar should work.
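
For illustration, an editor's sketch of the C-level check using the 
kernel's IS_ENABLED() helper (this assumes ffe_preset/FFE_PRESET are 
declared in both configurations; if they only exist under the config 
option, the preprocessor guard has to stay):

	if (IS_ENABLED(CONFIG_DRM_AMD_DC_DP2_0) && lt_settings->ffe_preset)
		lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset;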



+   }
+}
+
  enum dc_status dp_get_lane_status_and_lane_adjust(
struct dc_link *link,
const struct link_training_settings *link_training_setting,




Kind regards,

Paul


Re: I got an IOMMU IO page fault. What to do now?

2021-10-25 Thread Christian König

Hi Paul,

I am not sure how the IOMMU gives out addresses, but the printed ones look 
suspicious to me. It looks like we are using an invalid address such as -1 
or similar.


Can you try that on an up to date kernel as well? E.g. ideally bleeding 
edge amd-staging-drm-next from Alex repository.


Regards,
Christian.

Am 25.10.21 um 12:25 schrieb Paul Menzel:

Dear Linux folks,


On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages 
below. (GPU hang in amdgpu issue #1762 [1] might be related.)


    $ lspci -nn -s 05:00.0
    05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, 
Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] 
(rev 87)

    $ dmesg
    […]
    [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020]
    [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020]
    [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020]
    [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020]
    [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020]
    [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020]
    [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020]
    [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020]
    [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020]
    [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020]
    [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020]
    [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020]
    [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020]
    [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020]
    [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020]
    [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffa0c0 flags=0x0020]
    [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffbdc0 flags=0x0020]
    [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffa4c0 flags=0x0020]
    [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xff90c0 flags=0x0020]
    [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT 
device=05:00.0 domain=0x000c address=0xffabc0 flags=0x0020]


As this is not reproducible, how would debugging go? (The system was 
rebooted in the meantime.) What options should be enabled, that next 
time the required information is logged, or what commands should I 
execute when the system is still in that state, so the bug (driver, 
userspace, …) can be pinpointed and fixed?



Kind regards,

Paul


[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
 "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"




I got an IOMMU IO page fault. What to do now?

2021-10-25 Thread Paul Menzel

Dear Linux folks,


On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages below. 
(GPU hang in amdgpu issue #1762 [1] might be related.)


$ lspci -nn -s 05:00.0
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, 
Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87)

$ dmesg
[…]
[6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020]
[6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020]
[6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020]
[6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020]
[6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020]
[6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020]
[6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020]
[6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020]
[6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020]
[6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020]
[6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffc1c0 flags=0x0020]
[6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffc8c0 flags=0x0020]
[6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffb0c0 flags=0x0020]
[6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffcfc0 flags=0x0020]
[6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffb6c0 flags=0x0020]
[6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffa0c0 flags=0x0020]
[6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffbdc0 flags=0x0020]
[6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffa4c0 flags=0x0020]
[6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xff90c0 flags=0x0020]
[6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 
domain=0x000c address=0xffabc0 flags=0x0020]


As this is not reproducible, how would debugging go? (The system was 
rebooted in the meantime.) What options should be enabled, that next 
time the required information is logged, or what commands should I 
execute when the system is still in that state, so the bug (driver, 
userspace, …) can be pinpointed and fixed?



Kind regards,

Paul


[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
 "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"


Re: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN

2021-10-25 Thread Zhou1, Tao
[AMD Official Use Only]

Thanks, will add the comment as you suggested.

Regards,
Tao

From: Zhang, Hawking 
Sent: Monday, October 25, 2021 5:29 PM
To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org 

Subject: RE: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on 
ALDEBARAN

[AMD Official Use Only]

Please add an inline comment to remind us to remove this workaround later. Other 
than that, the patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking

-Original Message-
From: Zhou1, Tao 
Sent: Monday, October 25, 2021 16:56
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Cc: Zhou1, Tao 
Subject: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN

Skip GPRs init under a specific condition, since the current GPRs init 
algorithm only works for some CU settings.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
index 00a2b36a24b3..a1f424e2ff5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
@@ -706,6 +706,10 @@ int gfx_v9_4_2_do_edc_gpr_workarounds(struct amdgpu_device 
*adev)
 if (!amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX))
 return 0;

+   /* workaround for some CU settings, skip GPRs init */
+   if (adev->gfx.cu_info.bitmap[1][1] == 0x3f7f)
+   return 0;
+
 gfx_v9_4_2_do_sgprs_init(adev);

 gfx_v9_4_2_do_vgprs_init(adev);
--
2.17.1


[PATCH v2 2/2] drm/amdkfd: Remove cu mask from struct queue_properties(v2)

2021-10-25 Thread Lang Yu
Actually, cu_mask has already been copied to the MQD memory and
doesn't have to persist in queue_properties. Remove it
from queue_properties.

Use struct mqd_update_info to store such properties instead,
and pass it to the update queue operation.

v2:
* Rename pqm_update_queue to pqm_update_queue_properties.
* Rename struct queue_update_info to struct mqd_update_info.
* Rename pqm_set_cu_mask to pqm_update_mqd.

Suggested-by: Felix Kuehling 
Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 31 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c |  1 -
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  9 +++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  9 +++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  9 +++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  9 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 23 +-
 .../amd/amdkfd/kfd_process_queue_manager.c| 20 +++-
 8 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 9317a2e238d0..24ebd61395d8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -405,7 +405,7 @@ static int kfd_ioctl_update_queue(struct file *filp, struct 
kfd_process *p,
 
mutex_lock(&p->mutex);
 
-   retval = pqm_update_queue(&p->pqm, args->queue_id, &properties);
+   retval = pqm_update_queue_properties(&p->pqm, args->queue_id, 
&properties);
 
mutex_unlock(&p->mutex);
 
@@ -418,7 +418,7 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct 
kfd_process *p,
int retval;
const int max_num_cus = 1024;
struct kfd_ioctl_set_cu_mask_args *args = data;
-   struct queue_properties properties;
+   struct mqd_update_info minfo = {0};
uint32_t __user *cu_mask_ptr = (uint32_t __user *)args->cu_mask_ptr;
size_t cu_mask_size = sizeof(uint32_t) * (args->num_cu_mask / 32);
 
@@ -428,8 +428,8 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct 
kfd_process *p,
return -EINVAL;
}
 
-   properties.cu_mask_count = args->num_cu_mask;
-   if (properties.cu_mask_count == 0) {
+   minfo.cu_mask.count = args->num_cu_mask;
+   if (minfo.cu_mask.count == 0) {
pr_debug("CU mask cannot be 0");
return -EINVAL;
}
@@ -438,32 +438,33 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, 
struct kfd_process *p,
 * limit of max_num_cus bits.  We can then just drop any CU mask bits
 * past max_num_cus bits and just use the first max_num_cus bits.
 */
-   if (properties.cu_mask_count > max_num_cus) {
+   if (minfo.cu_mask.count > max_num_cus) {
pr_debug("CU mask cannot be greater than 1024 bits");
-   properties.cu_mask_count = max_num_cus;
+   minfo.cu_mask.count = max_num_cus;
cu_mask_size = sizeof(uint32_t) * (max_num_cus/32);
}
 
-   properties.cu_mask = kzalloc(cu_mask_size, GFP_KERNEL);
-   if (!properties.cu_mask)
+   minfo.cu_mask.ptr = kzalloc(cu_mask_size, GFP_KERNEL);
+   if (!minfo.cu_mask.ptr)
return -ENOMEM;
 
-   retval = copy_from_user(properties.cu_mask, cu_mask_ptr, cu_mask_size);
+   retval = copy_from_user(minfo.cu_mask.ptr, cu_mask_ptr, cu_mask_size);
if (retval) {
pr_debug("Could not copy CU mask from userspace");
-   kfree(properties.cu_mask);
-   return -EFAULT;
+   retval = -EFAULT;
+   goto out;
}
 
+   minfo.update_flag = UPDATE_FLAG_CU_MASK;
+
mutex_lock(&p->mutex);
 
-   retval = pqm_set_cu_mask(&p->pqm, args->queue_id, &properties);
+   retval = pqm_update_mqd(&p->pqm, args->queue_id, &minfo);
 
mutex_unlock(&p->mutex);
 
-   if (retval)
-   kfree(properties.cu_mask);
-
+out:
+   kfree(minfo.cu_mask.ptr);
return retval;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index a2b77d1df854..64b4ac339904 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -136,7 +136,6 @@ static bool kq_initialize(struct kernel_queue *kq, struct 
kfd_dev *dev,
prop.write_ptr = (uint32_t *) kq->wptr_gpu_addr;
prop.eop_ring_buffer_address = kq->eop_gpu_addr;
prop.eop_ring_buffer_size = PAGE_SIZE;
-   prop.cu_mask = NULL;
 
if (init_queue(&kq->queue, &prop) != 0)
goto err_init_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 00bcaa11ff57..8128f4d312f1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -42,16 +42,17 @@ static inline struct cik_sd

[PATCH v2 1/2] drm/amdkfd: Add an optional argument into update queue operation(v2)

2021-10-25 Thread Lang Yu
Currently, the queue is updated with data in queue_properties,
and all resources allocated in queue_properties are not
freed until the queue is destroyed.

But some properties (e.g., the CU mask) bring memory
management headaches (e.g., memory leaks) and make the code
complex. Actually, they have already been copied to the MQD and
don't have to persist in queue_properties.

Add an argument to the update queue operation to pass such properties,
so we can then remove them from queue_properties.

v2: Don't use void *.

Suggested-by: Felix Kuehling 
Signed-off-by: Lang Yu 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  5 ++--
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  3 ++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 23 +---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 10 ---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 10 ---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 26 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 ++
 .../amd/amdkfd/kfd_process_queue_manager.c|  6 ++---
 9 files changed, 52 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index f8fce9d05f50..533b27b35fc9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -557,7 +557,8 @@ static int destroy_queue_nocpsch(struct 
device_queue_manager *dqm,
return retval;
 }
 
-static int update_queue(struct device_queue_manager *dqm, struct queue *q)
+static int update_queue(struct device_queue_manager *dqm, struct queue *q,
+   struct mqd_update_info *minfo)
 {
int retval = 0;
struct mqd_manager *mqd_mgr;
@@ -605,7 +606,7 @@ static int update_queue(struct device_queue_manager *dqm, 
struct queue *q)
}
}
 
-   mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties);
+   mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties, minfo);
 
/*
 * check active state vs. the previous state and modify
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index c8719682c4da..499fc0ea387f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -93,7 +93,7 @@ struct device_queue_manager_ops {
struct queue *q);
 
int (*update_queue)(struct device_queue_manager *dqm,
-   struct queue *q);
+   struct queue *q, struct mqd_update_info *minfo);
 
int (*register_process)(struct device_queue_manager *dqm,
struct qcm_process_device *qpd);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 6e6918ccedfd..965e17c5dbb4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -80,7 +80,8 @@ struct mqd_manager {
struct mm_struct *mms);
 
void(*update_mqd)(struct mqd_manager *mm, void *mqd,
-   struct queue_properties *q);
+   struct queue_properties *q,
+   struct mqd_update_info *minfo);
 
int (*destroy_mqd)(struct mqd_manager *mm, void *mqd,
enum kfd_preempt_type type,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 064914e1e8d6..00bcaa11ff57 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -135,7 +135,7 @@ static void init_mqd(struct mqd_manager *mm, void **mqd,
*mqd = m;
if (gart_addr)
*gart_addr = addr;
-   mm->update_mqd(mm, m, q);
+   mm->update_mqd(mm, m, q, NULL);
 }
 
 static void init_mqd_sdma(struct mqd_manager *mm, void **mqd,
@@ -152,7 +152,7 @@ static void init_mqd_sdma(struct mqd_manager *mm, void 
**mqd,
if (gart_addr)
*gart_addr = mqd_mem_obj->gpu_addr;
 
-   mm->update_mqd(mm, m, q);
+   mm->update_mqd(mm, m, q, NULL);
 }
 
 static void free_mqd(struct mqd_manager *mm, void *mqd,
@@ -185,7 +185,8 @@ static int load_mqd_sdma(struct mqd_manager *mm, void *mqd,
 }
 
 static void __update_mqd(struct mqd_manager *mm, void *mqd,
-   struct queue_properties *q, unsigned int atc_bit)
+   struct queue_properties *q, struct mqd_update_info 
*minfo,
+   unsigned int atc_bit)
 {
struct cik_mqd *m;
 
@@ -221,9 +222,10 @@ static void __update_mqd(struct mqd_manager *mm, void *mqd,
 }
 
 static void update_mqd(struct mqd_manager *mm, void *mqd,
-  

RE: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN

2021-10-25 Thread Zhang, Hawking
[AMD Official Use Only]

Please add an inline comment to remind us to remove this workaround later. Other 
than that, the patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking

-Original Message-
From: Zhou1, Tao  
Sent: Monday, October 25, 2021 16:56
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Cc: Zhou1, Tao 
Subject: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN

Skip GPRs init under a specific condition, since the current GPRs init 
algorithm only works for some CU settings.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
index 00a2b36a24b3..a1f424e2ff5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
@@ -706,6 +706,10 @@ int gfx_v9_4_2_do_edc_gpr_workarounds(struct amdgpu_device 
*adev)
if (!amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX))
return 0;
 
+   /* workaround for some CU settings, skip GPRs init */
+   if (adev->gfx.cu_info.bitmap[1][1] == 0x3f7f)
+   return 0;
+
gfx_v9_4_2_do_sgprs_init(adev);
 
gfx_v9_4_2_do_vgprs_init(adev);
-- 
2.17.1


RE: [PATCH 2/2] drm/amdkfd: Remove cu mask from struct queue_properties

2021-10-25 Thread Yu, Lang
[AMD Official Use Only]



>-Original Message-
>From: Kuehling, Felix 
>Sent: Friday, October 22, 2021 1:11 AM
>To: Yu, Lang ; amd-gfx@lists.freedesktop.org
>Cc: Deucher, Alexander ; Huang, Ray
>
>Subject: Re: [PATCH 2/2] drm/amdkfd: Remove cu mask from struct
>queue_properties
>
>Am 2021-10-15 um 4:48 a.m. schrieb Lang Yu:
>> +enum queue_update_flag {
>> +UPDATE_FLAG_PROPERTITY = 0,
>> +UPDATE_FLAG_CU_MASK,
>> +};
>> +
>> +struct queue_update_info {
>> +union {
>> +struct queue_properties properties;
>> +struct {
>> +uint32_t count; /* Must be a multiple of 32 */
>> +uint32_t *ptr;
>> +} cu_mask;
>> +};
>> +
>> +enum queue_update_flag update_flag;
>> +};
>> +
>
>This doesn't make sense to me. As I understand it, queue_update_info is for
>information that is not stored in queue_properties but only in the MQDs.
>Therefore, it should not include the queue_properties.
>
>All the low level functions in the MQD managers get both the queue_properties
>and the queue_update_info. So trying to wrap both in the same union doesn't
>make sense there either.
>
>I think you only need this because you tried to generalize pqm_update_queue to
>handle both updates to queue_properties and CU mask updates with a single
>argument. IMO this does not make the interface any clearer. I think it would be
>more straight-forward to keep a separate pqm_set_cu_mask function that takes
>a queue_update_info parameter. If you're looking for more generic names, I
>suggest the following:
>
>  * Rename pqm_update_queue to pqm_update_queue_properties
>  * Rename struct queue_update_info to struct mqd_update_info
>  * Rename pqm_set_cu_mask to pqm_update_mqd. For now this is only used
>for CU mask (the union has only one struct member for now). It may
>be used for other MQD properties that don't need to be stored in
>queue_properties in the future

Got it. Thanks for your suggestions!

Regards,
Lang
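
(Editor's note: to make the suggested shape concrete, here is a sketch of
the resulting structure; the field names are inferred from the v2 usage
earlier in this archive (minfo.cu_mask.count/ptr, minfo.update_flag) and
may not match the final header exactly.)

	enum mqd_update_flag {
		UPDATE_FLAG_CU_MASK = 0,
	};

	struct mqd_update_info {
		union {
			struct {
				uint32_t count; /* must be a multiple of 32 */
				uint32_t *ptr;
			} cu_mask;
		};
		enum mqd_update_flag update_flag;
	};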

>
>Regards,
>  Felix
>


[PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN

2021-10-25 Thread Tao Zhou
Skip GPRs init under a specific condition, since the current GPRs init 
algorithm only works for some CU settings.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
index 00a2b36a24b3..a1f424e2ff5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c
@@ -706,6 +706,10 @@ int gfx_v9_4_2_do_edc_gpr_workarounds(struct amdgpu_device 
*adev)
if (!amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX))
return 0;
 
+   /* workaround for some CU settings, skip GPRs init */
+   if (adev->gfx.cu_info.bitmap[1][1] == 0x3f7f)
+   return 0;
+
gfx_v9_4_2_do_sgprs_init(adev);
 
gfx_v9_4_2_do_vgprs_init(adev);
-- 
2.17.1



RE: [PATCH 1/2] drm/amdkfd: Add an optional argument into update queue operation

2021-10-25 Thread Yu, Lang
[AMD Official Use Only]



>-Original Message-
>From: Kuehling, Felix 
>Sent: Friday, October 22, 2021 12:46 AM
>To: Yu, Lang ; amd-gfx@lists.freedesktop.org
>Cc: Deucher, Alexander ; Huang, Ray
>
>Subject: Re: [PATCH 1/2] drm/amdkfd: Add an optional argument into update
>queue operation
>
>
>Am 2021-10-15 um 4:48 a.m. schrieb Lang Yu:
>> Currently, queue is updated with data stored in queue_properties.
>> And all allocated resource in queue_properties will not be freed until
>> the queue is destroyed.
>>
>> But some properties(e.g., cu mask) bring some memory management
>> headaches(e.g., memory leak) and make code complex. Actually they
>> don't have to persist in queue_properties.
>>
>> So add an argument into update queue to pass such properties and
>> remove them from queue_properties.
>>
>> Signed-off-by: Lang Yu 
>> ---
>>  .../drm/amd/amdkfd/kfd_device_queue_manager.c |  4 ++--
>> .../drm/amd/amdkfd/kfd_device_queue_manager.h |  2 +-
>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  2 +-
>> .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 18 +++
>> .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  8 +++
>>  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  8 +++
>>  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 22 +--
>>  .../amd/amdkfd/kfd_process_queue_manager.c|  6 ++---
>>  8 files changed, 35 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index f8fce9d05f50..7f6f4937eedb 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -557,7 +557,7 @@ static int destroy_queue_nocpsch(struct
>device_queue_manager *dqm,
>>  return retval;
>>  }
>>
>> -static int update_queue(struct device_queue_manager *dqm, struct
>> queue *q)
>> +static int update_queue(struct device_queue_manager *dqm, struct
>> +queue *q, void *args)
>
>Please don't use a void * here. If you don't want to declare the struct
>queue_update_info in this patch, you can just declare it as an abstract
>type:
>
>struct queue_update_info;
>
>You can cast NULL to (struct queue_update_info *) without requiring the
>structure definition.

Got it. Thanks!

Regards,
Lang
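
(Editor's note: a minimal sketch of the forward-declaration pattern Felix
describes; every type other than queue_update_info is a stand-in.)

	/* Header side: an incomplete (opaque) type is enough for a pointer
	 * parameter, so callers never need the structure definition. */
	struct queue_update_info;
	struct device_queue_manager;
	struct queue;

	int update_queue(struct device_queue_manager *dqm, struct queue *q,
			 struct queue_update_info *minfo);

	/* A caller with nothing extra to pass simply uses NULL:
	 *	update_queue(dqm, q, NULL);
	 */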
>
>Regards,
>  Felix
>
>
>>  {
>>  int retval = 0;
>>  struct mqd_manager *mqd_mgr;
>> @@ -605,7 +605,7 @@ static int update_queue(struct device_queue_manager
>*dqm, struct queue *q)
>>  }
>>  }
>>
>> -mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties);
>> +mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties, args);
>>
>>  /*
>>   * check active state vs. the previous state and modify diff --git
>> a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> index c8719682c4da..08cfc2a2fdbb 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> @@ -93,7 +93,7 @@ struct device_queue_manager_ops {
>>  struct queue *q);
>>
>>  int (*update_queue)(struct device_queue_manager *dqm,
>> -struct queue *q);
>> +struct queue *q, void *args);
>>
>>  int (*register_process)(struct device_queue_manager *dqm,
>>  struct qcm_process_device *qpd); diff --
>git
>> a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
>> index 6e6918ccedfd..6ddf93629b8c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
>> @@ -80,7 +80,7 @@ struct mqd_manager {
>>  struct mm_struct *mms);
>>
>>  void(*update_mqd)(struct mqd_manager *mm, void *mqd,
>> -struct queue_properties *q);
>> +struct queue_properties *q, void *args);
>>
>>  int (*destroy_mqd)(struct mqd_manager *mm, void *mqd,
>>  enum kfd_preempt_type type,
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> index 064914e1e8d6..8bb2fd4cba41 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> @@ -135,7 +135,7 @@ static void init_mqd(struct mqd_manager *mm, void
>**mqd,
>>  *mqd = m;
>>  if (gart_addr)
>>  *gart_addr = addr;
>> -mm->update_mqd(mm, m, q);
>> +mm->update_mqd(mm, m, q, NULL);
>>  }
>>
>>  static void init_mqd_sdma(struct mqd_manager *mm, void **mqd, @@
>> -152,7 +152,7 @@ static void init_mqd_sdma(struct mqd_manager *mm, void
>**mqd,
>>  if (gart_addr)
>>  *gart_addr = mqd_mem_obj->gpu_addr;
>>
>> -mm->update_mqd(mm, m, q);
>> +mm->update_mqd(mm, m, q, NULL)