Re: [PATCH] drm/amd/amdgpu: fix potential bad job hw_fence underflow
On 2021/10/25 下午11:18, Andrey Grodzovsky wrote: > > On 2021-10-24 10:56 p.m., JingWen Chen wrote: >> On 2021/10/23 上午4:41, Andrey Grodzovsky wrote: >>> What do you mean by underflow in this case ? You mean use after free >>> because of extra dma_fence_put() ? >> yes > > > Then maybe update the description because 'underflow' is very confusing > will do > >>> On 2021-10-22 4:14 a.m., JingWen Chen wrote: ping On 2021/10/22 AM11:33, Jingwen Chen wrote: > [Why] > In advance tdr mode, the real bad job will be resubmitted twice, while > in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job > is put one more time than other jobs. > > [How] > Adding dma_fence_get before resbumit job in > amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs > > Signed-off-by: Jingwen Chen > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 41ce86244144..975f069f6fe8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -4841,6 +4841,9 @@ static void amdgpu_device_recheck_guilty_jobs( > /* clear job's guilty and depend the folowing step to decide > the real one */ > drm_sched_reset_karma(s_job); > + /* for the real bad job, it will be resubmitted twice, adding a > dma_fence_get > + * to make sure fence is balanced */ >>> >>> But that put in drm_sched_resubmit_jobs_ext is for the previous parent >>> fence. >>> fence = sched->ops->run_job(s_job); returns a new HW fence and the put >>> drops the refcount on the old one. >>> >>> Andrey >>> >>> >> Hi Andrey, >> >> If I remember correctly, after we embedded the hw_fence into amdgpu_job, >> there will be not fence replacement in amdgpu_job_run. > > > Right, I forgot that... What about removing line > https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L265 > ? > What if you make dma_get_fence unconditional instead ? > > Andrey > > Hi Andrey, I have tried this and this will cause normal jobs cannot be free(lacks a dma_fence_put). I have figured out all the get/put for sched_jobs and only the bad job lacks a dma_fence_get, other jobs are just fine. >> > + dma_fence_get(s_job->s_fence->parent); > drm_sched_resubmit_jobs_ext(&ring->sched, 1); > ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, > ring->sched.timeout); > @@ -4876,6 +4879,7 @@ static void amdgpu_device_recheck_guilty_jobs( > /* got the hw fence, signal finished fence */ > atomic_dec(ring->sched.score); > + dma_fence_put(s_job->s_fence->parent); > dma_fence_get(&s_job->s_fence->finished); > dma_fence_signal(&s_job->s_fence->finished); > dma_fence_put(&s_job->s_fence->finished);
Re: Lockdep spalt on killing a processes
On 2021-10-25 3:56 p.m., Christian König wrote: In general I'm all there to get this fixed, but there is one major problem: Drivers don't expect the lock to be dropped. I am probably missing something but in my approach we only modify the code for those clients that call dma_fence_signal, not dma_fence_signal_locked. In those cases the drivers are agnostic to lock behavior (or should be at least) since the lock is acquired within the dma fence code. Note that if you are worried about calling the callback without lock then same exact concern is relevant to using the irq_work directly in the fence code since the irq_work will execute at a later time without locked fence->lock (which is the point of using irq_work). What we could do is to change all drivers so they call always call the dma_fence_signal functions and drop the _locked variants. This way we could move calling the callback out of the spinlock. But that requires audit of all drivers, so quite a lot of work to do. As i said earlier - if we only modify dma_fence_signal and don't touch dma_fence_signal_locked then our only concern should the users of dma_fence_signal. Let me please know if I am still missing some point of yours. Andrey Regards, Christian. Am 25.10.21 um 21:10 schrieb Andrey Grodzovsky: Adding back Daniel (somehow he got off the addresses list) and Chris who worked a lot in this area. On 2021-10-21 2:34 a.m., Christian König wrote: Am 20.10.21 um 21:32 schrieb Andrey Grodzovsky: On 2021-10-04 4:14 a.m., Christian König wrote: The problem is a bit different. The callback is on the dependent fence, while we need to signal the scheduler fence. Daniel is right that this needs an irq_work struct to handle this properly. Christian. So we had some discussions with Christian regarding irq_work and agreed I should look into doing it but stepping back for a sec - Why we insist on calling the dma_fence_cb with fence->lock locked ? Is it because of dma_fence_add_callback ? Because we first test for DMA_FENCE_FLAG_SIGNALED_BIT and only after that lock the fence->lock ? If so, can't we move DMA_FENCE_FLAG_SIGNALED_BIT check inside the locked section ? Because if in theory we could call the cb with unlocked fence->lock (i.e. this kind of iteration https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.15-rc6%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fttm%2Fttm_resource.c%23L117&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc8a4525f94c244bebbd208d997f19242%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637707886075917091%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=YBq%2BNkDuYKERc8XJDWTfeM%2FSknpuCBHQYgN8Uo5PFv0%3D&reserved=0) we wouldn't have the lockdep splat. And in general, is it really the correct approach to call a third party code from a call back with locked spinlock ? We don't know what the cb does inside and I don't see any explicit restrictions in documentation of dma_fence_func_t what can and cannot be done there. Yeah, that's exactly what I meant with using the irq_work directly in the fence code. My idea is not to use irq work at all but instead to implement unlocked dma_fence cb execution using iteration which drops the spinlock each time next cb is executed and acquiring it again after (until cb_list is empy). The problem is dma_fence_signal_locked() which is used by quite a number of drivers to signal the fence while holding the lock. For this I think we should not reuse dma_fence_signal_locked inside dma_fence_signal and instead implement it using the unlocked iteration I mentioned above. I looked a bit in the code and the history and I see that until some time ago (this commit by Chris 0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5), indeed dma_fence_signal was doing it's own, locked iteration and wasn't reusing dma_fence_signal_locked. This way whoever relies on the dma_fence_signal_locked won't be impacted an who is not (like us in drm_sched_fence_scheduled/drm_sched_fence_finished) should also not be impacted by more narrow scope of the lock. I also looked at dma_fence_default_wait and how it locks the fence->lock and check if fence is signaled before wait start and I don't see a problem there either. I attached quick draft of this proposal to clarify. Andrey Otherwise we could indeed simplify the fence handling a lot. Christian. Andrey Am 01.10.21 um 17:10 schrieb Andrey Grodzovsky: From what I see here you supposed to have actual deadlock and not only warning, sched_fence->finished is first signaled from within hw fence done callback (drm_sched_job_done_cb) but then again from within it's own callback (drm_sched_entity_kill_jobs_cb) and so looks like same fence object is recursively signaled twice. This leads to attempt to lock fence->lock second time while it's already locked. I don't see a need to call drm_sched_fence_finished from withi
Re: [Intel-gfx] [PATCH 1/2] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.
On Wed, Oct 13, 2021 at 02:12:20PM -0400, Mark Yacoub wrote: > From: Mark Yacoub > > [Why] > 1. drm_atomic_helper_check doesn't check for the LUT sizes of either Gamma > or Degamma props in the new CRTC state, allowing any invalid size to > be passed on. > 2. Each driver has its own LUT size, which could also be different for > legacy users. > > [How] > 1. Create |degamma_lut_size| and |gamma_lut_size| to save the LUT sizes > assigned by the driver when it's initializing its color and CTM > management. > 2. Create drm_atomic_helper_check_crtc which is called by > drm_atomic_helper_check to check the LUT sizes saved in drm_crtc that > they match the sizes in the new CRTC state. > 3. Rename older lut checks that test for the color channels to indicate > it's a channel check. It's not included in drm_atomic_helper_check_crtc > as it's hardware specific and is to be called by the driver. > 4. As the LUT size check now happens in drm_atomic_helper_check, remove > the lut check in intel_color.c I'd probably split the rename out from the crtc check since they're only tangentially related. > > Fixes: igt@kms_color@pipe-A-invalid-gamma-lut-sizes on MTK > Tested on Zork(amdgpu) and Jacuzzi(mediatek), volteer(TGL) > > v1: > 1. Fix typos > 2. Remove the LUT size check from intel driver > 3. Rename old LUT check to indicate it's a channel change > > Signed-off-by: Mark Yacoub > --- > drivers/gpu/drm/drm_atomic_helper.c| 60 ++ > drivers/gpu/drm/drm_color_mgmt.c | 14 ++--- > drivers/gpu/drm/i915/display/intel_color.c | 14 ++--- > include/drm/drm_atomic_helper.h| 1 + > include/drm/drm_color_mgmt.h | 7 +-- > include/drm/drm_crtc.h | 11 > 6 files changed, 89 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > b/drivers/gpu/drm/drm_atomic_helper.c > index bc3487964fb5e..5feb2ad0209c3 100644 > --- a/drivers/gpu/drm/drm_atomic_helper.c > +++ b/drivers/gpu/drm/drm_atomic_helper.c > @@ -929,6 +929,62 @@ drm_atomic_helper_check_planes(struct drm_device *dev, > } > EXPORT_SYMBOL(drm_atomic_helper_check_planes); > > +/** > + * drm_atomic_helper_check_crtcs - validate state object for CRTC changes > + * @state: the driver state object > + * > + * Check the CRTC state object such as the Gamma/Degamma LUT sizes if the new > + * state holds them. > + * > + * RETURNS: > + * Zero for success or -errno > + */ > +int drm_atomic_helper_check_crtcs(struct drm_atomic_state *state) > +{ > + struct drm_crtc *crtc; > + struct drm_crtc_state *new_crtc_state; > + int i; > + > + for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) { > + if (new_crtc_state->color_mgmt_changed && > + new_crtc_state->gamma_lut) { > + uint64_t supported_lut_size = crtc->gamma_lut_size; > + uint32_t supported_legacy_lut_size = crtc->gamma_size; > + uint32_t new_state_lut_size = > + drm_color_lut_size(new_crtc_state->gamma_lut); > + > + if (new_state_lut_size != supported_lut_size && > + new_state_lut_size != supported_legacy_lut_size) { > + drm_dbg_state( > + state->dev, > + "Invalid Gamma LUT size. Should be %u > (or %u for legacy) but got %u.\n", > + supported_lut_size, > + supported_legacy_lut_size, > + new_state_lut_size); > + return -EINVAL; > + } > + } > + > + if (new_crtc_state->color_mgmt_changed && > + new_crtc_state->degamma_lut) { > + uint32_t new_state_lut_size = > + drm_color_lut_size(new_crtc_state->degamma_lut); > + uint64_t supported_lut_size = crtc->degamma_lut_size; > + > + if (new_state_lut_size != supported_lut_size) { > + drm_dbg_state( > + state->dev, > + "Invalid Degamma LUT size. Should be %u > but got %u.\n", > + supported_lut_size, new_state_lut_size); > + return -EINVAL; > + } > + } > + } > + > + return 0; > +} > +EXPORT_SYMBOL(drm_atomic_helper_check_crtcs); > + > /** > * drm_atomic_helper_check - validate state object > * @dev: DRM device > @@ -974,6 +1030,10 @@ int drm_atomic_helper_check(struct drm_device *dev, > if (ret) > return ret; > > + ret = drm_atomic_helper_check_crtcs(state); > + if (ret) > + return ret; > + > if (state->legacy_cursor_update) > state->async_updat
Re: [PATCH] mm/migrate.c: Remove MIGRATE_PFN_LOCKED
On 10/24/21 21:16, Alistair Popple wrote: MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a source page was already locked during migrate_vma_collect(). If it wasn't then the a second attempt is made to lock the page. However if the first attempt failed it's unlikely a second attempt will succeed, and the retry adds complexity. So clean this up by removing the retry and MIGRATE_PFN_LOCKED flag. Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag set, but nothing actually checks that. Signed-off-by: Alistair Popple You can add: Reviewed-by: Ralph Campbell --- Documentation/vm/hmm.rst | 2 +- arch/powerpc/kvm/book3s_hv_uvmem.c | 4 +- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 - drivers/gpu/drm/nouveau/nouveau_dmem.c | 4 +- include/linux/migrate.h | 1 - lib/test_hmm.c | 5 +- mm/migrate.c | 145 +-- 7 files changed, 35 insertions(+), 128 deletions(-) diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst index a14c2938e7af..f2a59ed82ed3 100644 --- a/Documentation/vm/hmm.rst +++ b/Documentation/vm/hmm.rst @@ -360,7 +360,7 @@ between device driver specific code and shared common code: system memory page, locks the page with ``lock_page()``, and fills in the ``dst`` array entry with:: - dst[i] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + dst[i] = migrate_pfn(page_to_pfn(dpage)); Now that the driver knows that this page is being migrated, it can invalidate device private MMU mappings and copy device private memory diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c index a7061ee3b157..28c436df9935 100644 --- a/arch/powerpc/kvm/book3s_hv_uvmem.c +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c @@ -560,7 +560,7 @@ static int __kvmppc_svm_page_out(struct vm_area_struct *vma, gpa, 0, page_shift); if (ret == U_SUCCESS) - *mig.dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED; + *mig.dst = migrate_pfn(pfn); else { unlock_page(dpage); __free_page(dpage); @@ -774,7 +774,7 @@ static int kvmppc_svm_page_in(struct vm_area_struct *vma, } } - *mig.dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + *mig.dst = migrate_pfn(page_to_pfn(dpage)); migrate_vma_pages(&mig); out_finalize: migrate_vma_finalize(&mig); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c index 4a16e3c257b9..41d9417f182b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c @@ -300,7 +300,6 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange, migrate->dst[i] = svm_migrate_addr_to_pfn(adev, dst[i]); svm_migrate_get_vram_page(prange, migrate->dst[i]); migrate->dst[i] = migrate_pfn(migrate->dst[i]); - migrate->dst[i] |= MIGRATE_PFN_LOCKED; src[i] = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_TO_DEVICE); r = dma_mapping_error(dev, src[i]); @@ -580,7 +579,6 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange, dst[i] >> PAGE_SHIFT, page_to_pfn(dpage)); migrate->dst[i] = migrate_pfn(page_to_pfn(dpage)); - migrate->dst[i] |= MIGRATE_PFN_LOCKED; j++; } diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 92987daa5e17..3828aafd3ac4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -166,7 +166,7 @@ static vm_fault_t nouveau_dmem_fault_copy_one(struct nouveau_drm *drm, goto error_dma_unmap; mutex_unlock(&svmm->mutex); - args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + args->dst[0] = migrate_pfn(page_to_pfn(dpage)); return 0; error_dma_unmap: @@ -602,7 +602,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, ((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT); if (src & MIGRATE_PFN_WRITE) *pfn |= NVIF_VMM_PFNMAP_V0_W; - return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + return migrate_pfn(page_to_pfn(dpage)); out_dma_unmap: dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL); diff --git a/include/linux/migrate.h b/include/linux/migrate.h index c8077e936691..479b861ae490 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -119,7 +119,6 @@ static inline int migrate_misplaced_page(struct page *page, */ #define MIGRATE_PF
Re: [PATCH v2 2/2] drm/amdkfd: Remove cu mask from struct queue_properties(v2)
Am 2021-10-25 um 5:58 a.m. schrieb Lang Yu: > Actually, cu_mask has been copied to mqd memory and > does't have to persist in queue_properties. Remove it > from queue_properties. > > And use struct mqd_update_info to store such properties, > then pass it to update queue operation. > > v2: > * Rename pqm_update_queue to pqm_update_queue_properties. > * Rename struct queue_update_info to struct mqd_update_info. > * Rename pqm_set_cu_mask to pqm_update_mqd. > > Suggested-by: Felix Kuehling > Signed-off-by: Lang Yu The series is Reviewed-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 31 ++- > drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 1 - > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 9 +++--- > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 9 +++--- > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 9 +++--- > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 9 +++--- > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 23 +- > .../amd/amdkfd/kfd_process_queue_manager.c| 20 +++- > 8 files changed, 57 insertions(+), 54 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > index 9317a2e238d0..24ebd61395d8 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > @@ -405,7 +405,7 @@ static int kfd_ioctl_update_queue(struct file *filp, > struct kfd_process *p, > > mutex_lock(&p->mutex); > > - retval = pqm_update_queue(&p->pqm, args->queue_id, &properties); > + retval = pqm_update_queue_properties(&p->pqm, args->queue_id, > &properties); > > mutex_unlock(&p->mutex); > > @@ -418,7 +418,7 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, > struct kfd_process *p, > int retval; > const int max_num_cus = 1024; > struct kfd_ioctl_set_cu_mask_args *args = data; > - struct queue_properties properties; > + struct mqd_update_info minfo = {0}; > uint32_t __user *cu_mask_ptr = (uint32_t __user *)args->cu_mask_ptr; > size_t cu_mask_size = sizeof(uint32_t) * (args->num_cu_mask / 32); > > @@ -428,8 +428,8 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, > struct kfd_process *p, > return -EINVAL; > } > > - properties.cu_mask_count = args->num_cu_mask; > - if (properties.cu_mask_count == 0) { > + minfo.cu_mask.count = args->num_cu_mask; > + if (minfo.cu_mask.count == 0) { > pr_debug("CU mask cannot be 0"); > return -EINVAL; > } > @@ -438,32 +438,33 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, > struct kfd_process *p, >* limit of max_num_cus bits. We can then just drop any CU mask bits >* past max_num_cus bits and just use the first max_num_cus bits. >*/ > - if (properties.cu_mask_count > max_num_cus) { > + if (minfo.cu_mask.count > max_num_cus) { > pr_debug("CU mask cannot be greater than 1024 bits"); > - properties.cu_mask_count = max_num_cus; > + minfo.cu_mask.count = max_num_cus; > cu_mask_size = sizeof(uint32_t) * (max_num_cus/32); > } > > - properties.cu_mask = kzalloc(cu_mask_size, GFP_KERNEL); > - if (!properties.cu_mask) > + minfo.cu_mask.ptr = kzalloc(cu_mask_size, GFP_KERNEL); > + if (!minfo.cu_mask.ptr) > return -ENOMEM; > > - retval = copy_from_user(properties.cu_mask, cu_mask_ptr, cu_mask_size); > + retval = copy_from_user(minfo.cu_mask.ptr, cu_mask_ptr, cu_mask_size); > if (retval) { > pr_debug("Could not copy CU mask from userspace"); > - kfree(properties.cu_mask); > - return -EFAULT; > + retval = -EFAULT; > + goto out; > } > > + minfo.update_flag = UPDATE_FLAG_CU_MASK; > + > mutex_lock(&p->mutex); > > - retval = pqm_set_cu_mask(&p->pqm, args->queue_id, &properties); > + retval = pqm_update_mqd(&p->pqm, args->queue_id, &minfo); > > mutex_unlock(&p->mutex); > > - if (retval) > - kfree(properties.cu_mask); > - > +out: > + kfree(minfo.cu_mask.ptr); > return retval; > } > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c > b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c > index a2b77d1df854..64b4ac339904 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c > @@ -136,7 +136,6 @@ static bool kq_initialize(struct kernel_queue *kq, struct > kfd_dev *dev, > prop.write_ptr = (uint32_t *) kq->wptr_gpu_addr; > prop.eop_ring_buffer_address = kq->eop_gpu_addr; > prop.eop_ring_buffer_size = PAGE_SIZE; > - prop.cu_mask = NULL; > > if (init_queue(&kq->queue, &prop) != 0) > goto err_init_queue; > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c > b/drivers/gpu/drm/amd/amdkfd/
[PATCH RESEND v5 0/4] drm/amdgpu, Add DP 2.0 MST support + drm/dp_mst helpers
(Sorry for the noise, had to resend because I typo'd amd's mailing list email address by accident) Just resubmitting this patch series from AMD with _very_ minor changes (just a typo and fixing a debug message) so that this can be pushed upstream with a proper patchwork link. Will be pushing this into a topic branch and submitting to airlied in a moment. Bhawanpreet Lakha (3): drm: Remove slot checks in dp mst topology during commit drm: Update MST First Link Slot Information Based on Encoding Format drm/amd/display: Add DP 2.0 MST DM Support Fangzhi Zuo (1): drm/amd/display: Add DP 2.0 MST DC Support .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 ++ .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 3 + .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 5 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 14 + drivers/gpu/drm/amd/display/dc/core/dc_link.c | 292 ++ .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 19 ++ drivers/gpu/drm/amd/display/dc/dc_link.h | 7 + drivers/gpu/drm/amd/display/dc/dc_stream.h| 13 + drivers/gpu/drm/drm_dp_mst_topology.c | 42 ++- drivers/gpu/drm/i915/display/intel_dp_mst.c | 4 +- drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +- drivers/gpu/drm/radeon/radeon_dp_mst.c| 4 +- include/drm/drm_dp_mst_helper.h | 5 +- 13 files changed, 423 insertions(+), 16 deletions(-) -- 2.31.1
[PATCH RESEND v5 4/4] drm/amd/display: Add DP 2.0 MST DM Support
From: Bhawanpreet Lakha [Why] Add DP2 MST and debugfs support [How] Update the slot info based on the link encoding format Reviewed-by: "Lin, Wayne" Signed-off-by: Bhawanpreet Lakha Signed-off-by: Fangzhi Zuo Signed-off-by: Lyude Paul --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 +++ .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 3 ++ .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 5 +++- 3 files changed, 36 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f35561b5a465..ecdeeedb1cde 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -10684,6 +10684,8 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, #if defined(CONFIG_DRM_AMD_DC_DCN) struct dsc_mst_fairness_vars vars[MAX_PIPES]; #endif + struct drm_dp_mst_topology_state *mst_state; + struct drm_dp_mst_topology_mgr *mgr; trace_amdgpu_dm_atomic_check_begin(state); @@ -10891,6 +10893,33 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, lock_and_validation_needed = true; } +#if defined(CONFIG_DRM_AMD_DC_DCN) + /* set the slot info for each mst_state based on the link encoding format */ + for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) { + struct amdgpu_dm_connector *aconnector; + struct drm_connector *connector; + struct drm_connector_list_iter iter; + u8 link_coding_cap; + + if (!mgr->mst_state ) + continue; + + drm_connector_list_iter_begin(dev, &iter); + drm_for_each_connector_iter(connector, &iter) { + int id = connector->index; + + if (id == mst_state->mgr->conn_base_id) { + aconnector = to_amdgpu_dm_connector(connector); + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link); + drm_dp_mst_update_slots(mst_state, link_coding_cap); + + break; + } + } + drm_connector_list_iter_end(&iter); + + } +#endif /** * Streams and planes are reset when there are changes that affect * bandwidth. Anything that affects bandwidth needs to go through diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 814f67d86a3c..3d44896149a2 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -294,6 +294,9 @@ static ssize_t dp_link_settings_write(struct file *f, const char __user *buf, case LINK_RATE_RBR2: case LINK_RATE_HIGH2: case LINK_RATE_HIGH3: +#if defined(CONFIG_DRM_AMD_DC_DCN) + case LINK_RATE_UHBR10: +#endif break; default: valid_input = false; diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index 6169488e2011..53b5cc7b0679 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -219,6 +219,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( struct drm_dp_mst_topology_mgr *mst_mgr; struct drm_dp_mst_port *mst_port; bool ret; + u8 link_coding_cap; aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context; /* Accessing the connector state is required for vcpi_slots allocation @@ -238,6 +239,8 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( mst_port = aconnector->port; + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link); + if (enable) { ret = drm_dp_mst_allocate_vcpi(mst_mgr, mst_port, @@ -251,7 +254,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( } /* It's OK for this to fail */ - drm_dp_update_payload_part1(mst_mgr, 1); + drm_dp_update_payload_part1(mst_mgr, (link_coding_cap == DP_CAP_ANSI_128B132B) ? 0:1); /* mst_mgr->->payloads are VC payload notify MST branch using DPCD or * AUX message. The sequence is slot 1-63 allocated sequence for each -- 2.31.1
[PATCH RESEND v5 3/4] drm/amd/display: Add DP 2.0 MST DC Support
From: Fangzhi Zuo [Why] configure/call DC interface for DP2 mst support. This is needed to make DP2 mst work. [How] - add encoding type, logging, mst update/reduce payload functions Use the link encoding to determine the DP type (1.4 or 2.0) and add a flag to dc_stream_update to determine whether to increase/reduce payloads. v2: * add DP_UNKNOWN_ENCODING handling Signed-off-by: Fangzhi Zuo Reviewed-by: "Lin, Wayne" Signed-off-by: Bhawanpreet Lakha Signed-off-by: Lyude Paul --- drivers/gpu/drm/amd/display/dc/core/dc.c | 14 + drivers/gpu/drm/amd/display/dc/core/dc_link.c | 292 ++ .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 19 ++ drivers/gpu/drm/amd/display/dc/dc_link.h | 7 + drivers/gpu/drm/amd/display/dc/dc_stream.h| 13 + 5 files changed, 345 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index da942e9f5142..782141ba8ac5 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -2356,6 +2356,11 @@ static enum surface_update_type check_update_surfaces_for_stream( if (stream_update->dsc_config) su_flags->bits.dsc_changed = 1; +#if defined(CONFIG_DRM_AMD_DC_DCN) + if (stream_update->mst_bw_update) + su_flags->bits.mst_bw = 1; +#endif + if (su_flags->raw != 0) overall_type = UPDATE_TYPE_FULL; @@ -2741,6 +2746,15 @@ static void commit_planes_do_stream_update(struct dc *dc, if (stream_update->dsc_config) dp_update_dsc_config(pipe_ctx); +#if defined(CONFIG_DRM_AMD_DC_DCN) + if (stream_update->mst_bw_update) { + if (stream_update->mst_bw_update->is_increase) + dc_link_increase_mst_payload(pipe_ctx, stream_update->mst_bw_update->mst_stream_bw); + else + dc_link_reduce_mst_payload(pipe_ctx, stream_update->mst_bw_update->mst_stream_bw); + } +#endif + if (stream_update->pending_test_pattern) { dc_link_dp_set_test_pattern(stream->link, stream->test_pattern.type, diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index ca5dc3c168ec..fd12561b70cc 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3272,6 +3272,9 @@ static struct fixed31_32 get_pbn_from_timing(struct pipe_ctx *pipe_ctx) static void update_mst_stream_alloc_table( struct dc_link *link, struct stream_encoder *stream_enc, +#if defined(CONFIG_DRM_AMD_DC_DCN) + struct hpo_dp_stream_encoder *hpo_dp_stream_enc, // TODO: Rename stream_enc to dio_stream_enc? +#endif const struct dp_mst_stream_allocation_table *proposed_table) { struct link_mst_stream_allocation work_table[MAX_CONTROLLER_NUM] = { @@ -3308,6 +3311,9 @@ static void update_mst_stream_alloc_table( work_table[i].slot_count = proposed_table->stream_allocations[i].slot_count; work_table[i].stream_enc = stream_enc; +#if defined(CONFIG_DRM_AMD_DC_DCN) + work_table[i].hpo_dp_stream_enc = hpo_dp_stream_enc; +#endif } } @@ -3430,6 +3436,10 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx) struct dc_link *link = stream->link; struct link_encoder *link_encoder = NULL; struct stream_encoder *stream_encoder = pipe_ctx->stream_res.stream_enc; +#if defined(CONFIG_DRM_AMD_DC_DCN) + struct hpo_dp_link_encoder *hpo_dp_link_encoder = link->hpo_dp_link_enc; + struct hpo_dp_stream_encoder *hpo_dp_stream_encoder = pipe_ctx->stream_res.hpo_dp_stream_enc; +#endif struct dp_mst_stream_allocation_table proposed_table = {0}; struct fixed31_32 avg_time_slots_per_mtp; struct fixed31_32 pbn; @@ -3457,7 +3467,14 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx) &proposed_table, true)) { update_mst_stream_alloc_table( +#if defined(CONFIG_DRM_AMD_DC_DCN) + link, + pipe_ctx->stream_res.stream_enc, + pipe_ctx->stream_res.hpo_dp_stream_enc, + &proposed_table); +#else link, pipe_ctx->stream_res.stream_enc, &proposed_table); +#endif } else DC_LOG_WARNING("Failed to update" @@ -3471,23 +3488,56 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx)
[PATCH RESEND v5 2/4] drm: Update MST First Link Slot Information Based on Encoding Format
From: Bhawanpreet Lakha 8b/10b encoding format requires to reserve the first slot for recording metadata. Real data transmission starts from the second slot, with a total of available 63 slots available. In 128b/132b encoding format, metadata is transmitted separately in LLCP packet before MTP. Real data transmission starts from the first slot, with a total of 64 slots available. v2: * Move total/start slots to mst_state, and copy it to mst_mgr in atomic_check v3: * Only keep the slot info on the mst_state * add a start_slot parameter to the payload function, to facilitate non atomic drivers (this is a temporary workaround and should be removed when we are moving out the non atomic driver helpers) v4: *fixed typo and formatting v5: (no functional changes) * Fixed formatting in drm_dp_mst_update_slots() * Reference mst_state instead of mst_state->mgr for debugging info Signed-off-by: Bhawanpreet Lakha Signed-off-by: Fangzhi Zuo [v5 nitpicks] Reviewed-by: Lyude Paul Signed-off-by: Lyude Paul --- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 2 +- drivers/gpu/drm/drm_dp_mst_topology.c | 36 --- drivers/gpu/drm/i915/display/intel_dp_mst.c | 4 +-- drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +- drivers/gpu/drm/radeon/radeon_dp_mst.c| 4 +-- include/drm/drm_dp_mst_helper.h | 5 ++- 6 files changed, 42 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index ff0f91c93ba4..6169488e2011 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -251,7 +251,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( } /* It's OK for this to fail */ - drm_dp_update_payload_part1(mst_mgr); + drm_dp_update_payload_part1(mst_mgr, 1); /* mst_mgr->->payloads are VC payload notify MST branch using DPCD or * AUX message. The sequence is slot 1-63 allocated sequence for each diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 04ed34a7f71c..571da0c2f39f 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -3355,6 +3355,10 @@ static int drm_dp_destroy_payload_step2(struct drm_dp_mst_topology_mgr *mgr, /** * drm_dp_update_payload_part1() - Execute payload update part 1 * @mgr: manager to use. + * @start_slot: this is the cur slot + * + * NOTE: start_slot is a temporary workaround for non-atomic drivers, + * this will be removed when non-atomic mst helpers are moved out of the helper * * This iterates over all proposed virtual channels, and tries to * allocate space in the link for them. For 0->slots transitions, @@ -3365,12 +3369,12 @@ static int drm_dp_destroy_payload_step2(struct drm_dp_mst_topology_mgr *mgr, * after calling this the driver should generate ACT and payload * packets. */ -int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr) +int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr, int start_slot) { struct drm_dp_payload req_payload; struct drm_dp_mst_port *port; int i, j; - int cur_slots = 1; + int cur_slots = start_slot; bool skip; mutex_lock(&mgr->payload_lock); @@ -4505,6 +4509,27 @@ int drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state, } EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots); +/** + * drm_dp_mst_update_slots() - updates the slot info depending on the DP ecoding format + * @mst_state: mst_state to update + * @link_encoding_cap: the ecoding format on the link + */ +void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, uint8_t link_encoding_cap) +{ + if (link_encoding_cap == DP_CAP_ANSI_128B132B) { + mst_state->total_avail_slots = 64; + mst_state->start_slot = 0; + } else { + mst_state->total_avail_slots = 63; + mst_state->start_slot = 1; + } + + DRM_DEBUG_KMS("%s encoding format on mst_state 0x%p\n", + (link_encoding_cap == DP_CAP_ANSI_128B132B) ? "128b/132b":"8b/10b", + mst_state); +} +EXPORT_SYMBOL(drm_dp_mst_update_slots); + /** * drm_dp_mst_allocate_vcpi() - Allocate a virtual channel * @mgr: manager for this port @@ -5224,7 +5249,7 @@ drm_dp_mst_atomic_check_vcpi_alloc_limit(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_topology_state *mst_state) { struct drm_dp_vcpi_allocation *vcpi; - int avail_slots = 63, payload_count = 0; + int avail_slots = mst_state->total_avail_slots, payload_count = 0; list_for_each_entry(vcpi, &mst_state->vcpis, next) { /* Releasing VCPI is always OK-even if the port is gone */ @@ -5253,7 +5278,7 @@ dr
[PATCH RESEND v5 1/4] drm: Remove slot checks in dp mst topology during commit
From: Bhawanpreet Lakha This code path is used during commit, and we dont expect things to fail during the commit stage, so remove this. Signed-off-by: Bhawanpreet Lakha Reviewed-by: Lyude Paul Signed-off-by: Lyude Paul --- drivers/gpu/drm/drm_dp_mst_topology.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 86d13d6bc463..04ed34a7f71c 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -4334,10 +4334,6 @@ static int drm_dp_init_vcpi(struct drm_dp_mst_topology_mgr *mgr, { int ret; - /* max. time slots - one slot for MTP header */ - if (slots > 63) - return -ENOSPC; - vcpi->pbn = pbn; vcpi->aligned_pbn = slots * mgr->pbn_div; vcpi->num_slots = slots; @@ -4540,7 +4536,7 @@ bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr, ret = drm_dp_init_vcpi(mgr, &port->vcpi, pbn, slots); if (ret) { - drm_dbg_kms(mgr->dev, "failed to init vcpi slots=%d max=63 ret=%d\n", + drm_dbg_kms(mgr->dev, "failed to init vcpi slots=%d ret=%d\n", DIV_ROUND_UP(pbn, mgr->pbn_div), ret); drm_dp_mst_topology_put_port(port); goto out; -- 2.31.1
[PATCH v5 4/4] drm/amd/display: Add DP 2.0 MST DM Support
From: Bhawanpreet Lakha [Why] Add DP2 MST and debugfs support [How] Update the slot info based on the link encoding format Reviewed-by: "Lin, Wayne" Signed-off-by: Bhawanpreet Lakha Signed-off-by: Fangzhi Zuo Signed-off-by: Lyude Paul --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 +++ .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 3 ++ .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 5 +++- 3 files changed, 36 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f35561b5a465..ecdeeedb1cde 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -10684,6 +10684,8 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, #if defined(CONFIG_DRM_AMD_DC_DCN) struct dsc_mst_fairness_vars vars[MAX_PIPES]; #endif + struct drm_dp_mst_topology_state *mst_state; + struct drm_dp_mst_topology_mgr *mgr; trace_amdgpu_dm_atomic_check_begin(state); @@ -10891,6 +10893,33 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, lock_and_validation_needed = true; } +#if defined(CONFIG_DRM_AMD_DC_DCN) + /* set the slot info for each mst_state based on the link encoding format */ + for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) { + struct amdgpu_dm_connector *aconnector; + struct drm_connector *connector; + struct drm_connector_list_iter iter; + u8 link_coding_cap; + + if (!mgr->mst_state ) + continue; + + drm_connector_list_iter_begin(dev, &iter); + drm_for_each_connector_iter(connector, &iter) { + int id = connector->index; + + if (id == mst_state->mgr->conn_base_id) { + aconnector = to_amdgpu_dm_connector(connector); + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link); + drm_dp_mst_update_slots(mst_state, link_coding_cap); + + break; + } + } + drm_connector_list_iter_end(&iter); + + } +#endif /** * Streams and planes are reset when there are changes that affect * bandwidth. Anything that affects bandwidth needs to go through diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 814f67d86a3c..3d44896149a2 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -294,6 +294,9 @@ static ssize_t dp_link_settings_write(struct file *f, const char __user *buf, case LINK_RATE_RBR2: case LINK_RATE_HIGH2: case LINK_RATE_HIGH3: +#if defined(CONFIG_DRM_AMD_DC_DCN) + case LINK_RATE_UHBR10: +#endif break; default: valid_input = false; diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index 6169488e2011..53b5cc7b0679 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -219,6 +219,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( struct drm_dp_mst_topology_mgr *mst_mgr; struct drm_dp_mst_port *mst_port; bool ret; + u8 link_coding_cap; aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context; /* Accessing the connector state is required for vcpi_slots allocation @@ -238,6 +239,8 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( mst_port = aconnector->port; + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(aconnector->dc_link); + if (enable) { ret = drm_dp_mst_allocate_vcpi(mst_mgr, mst_port, @@ -251,7 +254,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( } /* It's OK for this to fail */ - drm_dp_update_payload_part1(mst_mgr, 1); + drm_dp_update_payload_part1(mst_mgr, (link_coding_cap == DP_CAP_ANSI_128B132B) ? 0:1); /* mst_mgr->->payloads are VC payload notify MST branch using DPCD or * AUX message. The sequence is slot 1-63 allocated sequence for each -- 2.31.1
[PATCH v5 3/4] drm/amd/display: Add DP 2.0 MST DC Support
From: Fangzhi Zuo [Why] configure/call DC interface for DP2 mst support. This is needed to make DP2 mst work. [How] - add encoding type, logging, mst update/reduce payload functions Use the link encoding to determine the DP type (1.4 or 2.0) and add a flag to dc_stream_update to determine whether to increase/reduce payloads. v2: * add DP_UNKNOWN_ENCODING handling Signed-off-by: Fangzhi Zuo Reviewed-by: "Lin, Wayne" Signed-off-by: Bhawanpreet Lakha Signed-off-by: Lyude Paul --- drivers/gpu/drm/amd/display/dc/core/dc.c | 14 + drivers/gpu/drm/amd/display/dc/core/dc_link.c | 292 ++ .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 19 ++ drivers/gpu/drm/amd/display/dc/dc_link.h | 7 + drivers/gpu/drm/amd/display/dc/dc_stream.h| 13 + 5 files changed, 345 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index da942e9f5142..782141ba8ac5 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -2356,6 +2356,11 @@ static enum surface_update_type check_update_surfaces_for_stream( if (stream_update->dsc_config) su_flags->bits.dsc_changed = 1; +#if defined(CONFIG_DRM_AMD_DC_DCN) + if (stream_update->mst_bw_update) + su_flags->bits.mst_bw = 1; +#endif + if (su_flags->raw != 0) overall_type = UPDATE_TYPE_FULL; @@ -2741,6 +2746,15 @@ static void commit_planes_do_stream_update(struct dc *dc, if (stream_update->dsc_config) dp_update_dsc_config(pipe_ctx); +#if defined(CONFIG_DRM_AMD_DC_DCN) + if (stream_update->mst_bw_update) { + if (stream_update->mst_bw_update->is_increase) + dc_link_increase_mst_payload(pipe_ctx, stream_update->mst_bw_update->mst_stream_bw); + else + dc_link_reduce_mst_payload(pipe_ctx, stream_update->mst_bw_update->mst_stream_bw); + } +#endif + if (stream_update->pending_test_pattern) { dc_link_dp_set_test_pattern(stream->link, stream->test_pattern.type, diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index ca5dc3c168ec..fd12561b70cc 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3272,6 +3272,9 @@ static struct fixed31_32 get_pbn_from_timing(struct pipe_ctx *pipe_ctx) static void update_mst_stream_alloc_table( struct dc_link *link, struct stream_encoder *stream_enc, +#if defined(CONFIG_DRM_AMD_DC_DCN) + struct hpo_dp_stream_encoder *hpo_dp_stream_enc, // TODO: Rename stream_enc to dio_stream_enc? +#endif const struct dp_mst_stream_allocation_table *proposed_table) { struct link_mst_stream_allocation work_table[MAX_CONTROLLER_NUM] = { @@ -3308,6 +3311,9 @@ static void update_mst_stream_alloc_table( work_table[i].slot_count = proposed_table->stream_allocations[i].slot_count; work_table[i].stream_enc = stream_enc; +#if defined(CONFIG_DRM_AMD_DC_DCN) + work_table[i].hpo_dp_stream_enc = hpo_dp_stream_enc; +#endif } } @@ -3430,6 +3436,10 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx) struct dc_link *link = stream->link; struct link_encoder *link_encoder = NULL; struct stream_encoder *stream_encoder = pipe_ctx->stream_res.stream_enc; +#if defined(CONFIG_DRM_AMD_DC_DCN) + struct hpo_dp_link_encoder *hpo_dp_link_encoder = link->hpo_dp_link_enc; + struct hpo_dp_stream_encoder *hpo_dp_stream_encoder = pipe_ctx->stream_res.hpo_dp_stream_enc; +#endif struct dp_mst_stream_allocation_table proposed_table = {0}; struct fixed31_32 avg_time_slots_per_mtp; struct fixed31_32 pbn; @@ -3457,7 +3467,14 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx) &proposed_table, true)) { update_mst_stream_alloc_table( +#if defined(CONFIG_DRM_AMD_DC_DCN) + link, + pipe_ctx->stream_res.stream_enc, + pipe_ctx->stream_res.hpo_dp_stream_enc, + &proposed_table); +#else link, pipe_ctx->stream_res.stream_enc, &proposed_table); +#endif } else DC_LOG_WARNING("Failed to update" @@ -3471,23 +3488,56 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx)
[PATCH v5 2/4] drm: Update MST First Link Slot Information Based on Encoding Format
From: Bhawanpreet Lakha 8b/10b encoding format requires to reserve the first slot for recording metadata. Real data transmission starts from the second slot, with a total of available 63 slots available. In 128b/132b encoding format, metadata is transmitted separately in LLCP packet before MTP. Real data transmission starts from the first slot, with a total of 64 slots available. v2: * Move total/start slots to mst_state, and copy it to mst_mgr in atomic_check v3: * Only keep the slot info on the mst_state * add a start_slot parameter to the payload function, to facilitate non atomic drivers (this is a temporary workaround and should be removed when we are moving out the non atomic driver helpers) v4: *fixed typo and formatting v5: (no functional changes) * Fixed formatting in drm_dp_mst_update_slots() * Reference mst_state instead of mst_state->mgr for debugging info Signed-off-by: Bhawanpreet Lakha Signed-off-by: Fangzhi Zuo [v5 nitpicks] Reviewed-by: Lyude Paul Signed-off-by: Lyude Paul --- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 2 +- drivers/gpu/drm/drm_dp_mst_topology.c | 36 --- drivers/gpu/drm/i915/display/intel_dp_mst.c | 4 +-- drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +- drivers/gpu/drm/radeon/radeon_dp_mst.c| 4 +-- include/drm/drm_dp_mst_helper.h | 5 ++- 6 files changed, 42 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index ff0f91c93ba4..6169488e2011 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -251,7 +251,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table( } /* It's OK for this to fail */ - drm_dp_update_payload_part1(mst_mgr); + drm_dp_update_payload_part1(mst_mgr, 1); /* mst_mgr->->payloads are VC payload notify MST branch using DPCD or * AUX message. The sequence is slot 1-63 allocated sequence for each diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 04ed34a7f71c..571da0c2f39f 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -3355,6 +3355,10 @@ static int drm_dp_destroy_payload_step2(struct drm_dp_mst_topology_mgr *mgr, /** * drm_dp_update_payload_part1() - Execute payload update part 1 * @mgr: manager to use. + * @start_slot: this is the cur slot + * + * NOTE: start_slot is a temporary workaround for non-atomic drivers, + * this will be removed when non-atomic mst helpers are moved out of the helper * * This iterates over all proposed virtual channels, and tries to * allocate space in the link for them. For 0->slots transitions, @@ -3365,12 +3369,12 @@ static int drm_dp_destroy_payload_step2(struct drm_dp_mst_topology_mgr *mgr, * after calling this the driver should generate ACT and payload * packets. */ -int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr) +int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr, int start_slot) { struct drm_dp_payload req_payload; struct drm_dp_mst_port *port; int i, j; - int cur_slots = 1; + int cur_slots = start_slot; bool skip; mutex_lock(&mgr->payload_lock); @@ -4505,6 +4509,27 @@ int drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state, } EXPORT_SYMBOL(drm_dp_atomic_release_vcpi_slots); +/** + * drm_dp_mst_update_slots() - updates the slot info depending on the DP ecoding format + * @mst_state: mst_state to update + * @link_encoding_cap: the ecoding format on the link + */ +void drm_dp_mst_update_slots(struct drm_dp_mst_topology_state *mst_state, uint8_t link_encoding_cap) +{ + if (link_encoding_cap == DP_CAP_ANSI_128B132B) { + mst_state->total_avail_slots = 64; + mst_state->start_slot = 0; + } else { + mst_state->total_avail_slots = 63; + mst_state->start_slot = 1; + } + + DRM_DEBUG_KMS("%s encoding format on mst_state 0x%p\n", + (link_encoding_cap == DP_CAP_ANSI_128B132B) ? "128b/132b":"8b/10b", + mst_state); +} +EXPORT_SYMBOL(drm_dp_mst_update_slots); + /** * drm_dp_mst_allocate_vcpi() - Allocate a virtual channel * @mgr: manager for this port @@ -5224,7 +5249,7 @@ drm_dp_mst_atomic_check_vcpi_alloc_limit(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_topology_state *mst_state) { struct drm_dp_vcpi_allocation *vcpi; - int avail_slots = 63, payload_count = 0; + int avail_slots = mst_state->total_avail_slots, payload_count = 0; list_for_each_entry(vcpi, &mst_state->vcpis, next) { /* Releasing VCPI is always OK-even if the port is gone */ @@ -5253,7 +5278,7 @@ dr
Re: Lockdep spalt on killing a processes
In general I'm all there to get this fixed, but there is one major problem: Drivers don't expect the lock to be dropped. What we could do is to change all drivers so they call always call the dma_fence_signal functions and drop the _locked variants. This way we could move calling the callback out of the spinlock. But that requires audit of all drivers, so quite a lot of work to do. Regards, Christian. Am 25.10.21 um 21:10 schrieb Andrey Grodzovsky: Adding back Daniel (somehow he got off the addresses list) and Chris who worked a lot in this area. On 2021-10-21 2:34 a.m., Christian König wrote: Am 20.10.21 um 21:32 schrieb Andrey Grodzovsky: On 2021-10-04 4:14 a.m., Christian König wrote: The problem is a bit different. The callback is on the dependent fence, while we need to signal the scheduler fence. Daniel is right that this needs an irq_work struct to handle this properly. Christian. So we had some discussions with Christian regarding irq_work and agreed I should look into doing it but stepping back for a sec - Why we insist on calling the dma_fence_cb with fence->lock locked ? Is it because of dma_fence_add_callback ? Because we first test for DMA_FENCE_FLAG_SIGNALED_BIT and only after that lock the fence->lock ? If so, can't we move DMA_FENCE_FLAG_SIGNALED_BIT check inside the locked section ? Because if in theory we could call the cb with unlocked fence->lock (i.e. this kind of iteration https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/ttm/ttm_resource.c#L117) we wouldn't have the lockdep splat. And in general, is it really the correct approach to call a third party code from a call back with locked spinlock ? We don't know what the cb does inside and I don't see any explicit restrictions in documentation of dma_fence_func_t what can and cannot be done there. Yeah, that's exactly what I meant with using the irq_work directly in the fence code. My idea is not to use irq work at all but instead to implement unlocked dma_fence cb execution using iteration which drops the spinlock each time next cb is executed and acquiring it again after (until cb_list is empy). The problem is dma_fence_signal_locked() which is used by quite a number of drivers to signal the fence while holding the lock. For this I think we should not reuse dma_fence_signal_locked inside dma_fence_signal and instead implement it using the unlocked iteration I mentioned above. I looked a bit in the code and the history and I see that until some time ago (this commit by Chris 0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5), indeed dma_fence_signal was doing it's own, locked iteration and wasn't reusing dma_fence_signal_locked. This way whoever relies on the dma_fence_signal_locked won't be impacted an who is not (like us in drm_sched_fence_scheduled/drm_sched_fence_finished) should also not be impacted by more narrow scope of the lock. I also looked at dma_fence_default_wait and how it locks the fence->lock and check if fence is signaled before wait start and I don't see a problem there either. I attached quick draft of this proposal to clarify. Andrey Otherwise we could indeed simplify the fence handling a lot. Christian. Andrey Am 01.10.21 um 17:10 schrieb Andrey Grodzovsky: From what I see here you supposed to have actual deadlock and not only warning, sched_fence->finished is first signaled from within hw fence done callback (drm_sched_job_done_cb) but then again from within it's own callback (drm_sched_entity_kill_jobs_cb) and so looks like same fence object is recursively signaled twice. This leads to attempt to lock fence->lock second time while it's already locked. I don't see a need to call drm_sched_fence_finished from within drm_sched_entity_kill_jobs_cb as this callback already registered on sched_fence->finished fence (entity->last_scheduled == s_fence->finished) and hence the signaling already took place. Andrey On 2021-10-01 6:50 a.m., Christian König wrote: Hey, Andrey. while investigating some memory management problems I've got the logdep splat below. Looks like something is wrong with drm_sched_entity_kill_jobs_cb(), can you investigate? Thanks, Christian. [11176.741052] [11176.741056] WARNING: possible recursive locking detected [11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted [11176.741066] [11176.741070] swapper/12/0 is trying to acquire lock: [11176.741074] 9c337ed175a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80 [11176.741088] but task is already holding lock: [11176.741092] 9c337ed172a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80 [11176.741100] other info that might help us debug this: [11176.741104] Possible unsafe locking scenario: [11176.741108] CPU0 [11176.741110] [11176.741113] lock(&fence->lock); [1117
Re: [PATCH] drm/amdgpu/display: add quirk handling for stutter mode
On 2021-10-25 14:52, Alex Deucher wrote: > Stutter mode is a power saving feature on GPUs, however at > least one early raven system exhibits stability issues with > it. Add a quirk to disable it for that system. > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214417 > Signed-off-by: Alex Deucher Reviewed-by: Harry Wentland Harry > --- > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 +++ > 1 file changed, 33 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 5b8d1635da5c..b635b893837b 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -1295,6 +1295,37 @@ static struct hpd_rx_irq_offload_work_queue > *hpd_rx_irq_create_workqueue(struct > return hpd_rx_offload_wq; > } > > +struct amdgpu_stutter_quirk { > + u16 chip_vendor; > + u16 chip_device; > + u16 subsys_vendor; > + u16 subsys_device; > + u8 revision; > +}; > + > +static const struct amdgpu_stutter_quirk amdgpu_stutter_quirk_list[] = { > + /* https://bugzilla.kernel.org/show_bug.cgi?id=214417 */ > + { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 }, > + { 0, 0, 0, 0, 0 }, > +}; > + > +static bool dm_should_disable_stutter(struct pci_dev *pdev) > +{ > + const struct amdgpu_stutter_quirk *p = amdgpu_stutter_quirk_list; > + > + while (p && p->chip_device != 0) { > + if (pdev->vendor == p->chip_vendor && > + pdev->device == p->chip_device && > + pdev->subsystem_vendor == p->subsys_vendor && > + pdev->subsystem_device == p->subsys_device && > + pdev->revision == p->revision) { > + return true; > + } > + ++p; > + } > + return false; > +} > + > static int amdgpu_dm_init(struct amdgpu_device *adev) > { > struct dc_init_data init_data; > @@ -1406,6 +1437,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev) > > if (adev->asic_type != CHIP_CARRIZO && adev->asic_type != CHIP_STONEY) > adev->dm.dc->debug.disable_stutter = amdgpu_pp_feature_mask & > PP_STUTTER_MODE ? false : true; > + if (dm_should_disable_stutter(adev->pdev)) > + adev->dm.dc->debug.disable_stutter = true; > > if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER) > adev->dm.dc->debug.disable_stutter = true; >
Re: Lockdep spalt on killing a processes
Adding back Daniel (somehow he got off the addresses list) and Chris who worked a lot in this area. On 2021-10-21 2:34 a.m., Christian König wrote: Am 20.10.21 um 21:32 schrieb Andrey Grodzovsky: On 2021-10-04 4:14 a.m., Christian König wrote: The problem is a bit different. The callback is on the dependent fence, while we need to signal the scheduler fence. Daniel is right that this needs an irq_work struct to handle this properly. Christian. So we had some discussions with Christian regarding irq_work and agreed I should look into doing it but stepping back for a sec - Why we insist on calling the dma_fence_cb with fence->lock locked ? Is it because of dma_fence_add_callback ? Because we first test for DMA_FENCE_FLAG_SIGNALED_BIT and only after that lock the fence->lock ? If so, can't we move DMA_FENCE_FLAG_SIGNALED_BIT check inside the locked section ? Because if in theory we could call the cb with unlocked fence->lock (i.e. this kind of iteration https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/ttm/ttm_resource.c#L117) we wouldn't have the lockdep splat. And in general, is it really the correct approach to call a third party code from a call back with locked spinlock ? We don't know what the cb does inside and I don't see any explicit restrictions in documentation of dma_fence_func_t what can and cannot be done there. Yeah, that's exactly what I meant with using the irq_work directly in the fence code. My idea is not to use irq work at all but instead to implement unlocked dma_fence cb execution using iteration which drops the spinlock each time next cb is executed and acquiring it again after (until cb_list is empy). The problem is dma_fence_signal_locked() which is used by quite a number of drivers to signal the fence while holding the lock. For this I think we should not reuse dma_fence_signal_locked inside dma_fence_signal and instead implement it using the unlocked iteration I mentioned above. I looked a bit in the code and the history and I see that until some time ago (this commit by Chris 0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5), indeed dma_fence_signal was doing it's own, locked iteration and wasn't reusing dma_fence_signal_locked. This way whoever relies on the dma_fence_signal_locked won't be impacted an who is not (like us in drm_sched_fence_scheduled/drm_sched_fence_finished) should also not be impacted by more narrow scope of the lock. I also looked at dma_fence_default_wait and how it locks the fence->lock and check if fence is signaled before wait start and I don't see a problem there either. I attached quick draft of this proposal to clarify. Andrey Otherwise we could indeed simplify the fence handling a lot. Christian. Andrey Am 01.10.21 um 17:10 schrieb Andrey Grodzovsky: From what I see here you supposed to have actual deadlock and not only warning, sched_fence->finished is first signaled from within hw fence done callback (drm_sched_job_done_cb) but then again from within it's own callback (drm_sched_entity_kill_jobs_cb) and so looks like same fence object is recursively signaled twice. This leads to attempt to lock fence->lock second time while it's already locked. I don't see a need to call drm_sched_fence_finished from within drm_sched_entity_kill_jobs_cb as this callback already registered on sched_fence->finished fence (entity->last_scheduled == s_fence->finished) and hence the signaling already took place. Andrey On 2021-10-01 6:50 a.m., Christian König wrote: Hey, Andrey. while investigating some memory management problems I've got the logdep splat below. Looks like something is wrong with drm_sched_entity_kill_jobs_cb(), can you investigate? Thanks, Christian. [11176.741052] [11176.741056] WARNING: possible recursive locking detected [11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted [11176.741066] [11176.741070] swapper/12/0 is trying to acquire lock: [11176.741074] 9c337ed175a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80 [11176.741088] but task is already holding lock: [11176.741092] 9c337ed172a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80 [11176.741100] other info that might help us debug this: [11176.741104] Possible unsafe locking scenario: [11176.741108] CPU0 [11176.741110] [11176.741113] lock(&fence->lock); [11176.741118] lock(&fence->lock); [11176.741122] *** DEADLOCK *** [11176.741125] May be due to missing lock nesting notation [11176.741128] 2 locks held by swapper/12/0: [11176.741133] #0: 9c339c30f768 (&ring->fence_drv.lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80 [11176.741142] #1: 9c337ed172a8 (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80 [11176.741151] stack backtrace: [11176.741155] CPU: 12 P
[PATCH] drm/amdgpu/display: add quirk handling for stutter mode
Stutter mode is a power saving feature on GPUs, however at least one early raven system exhibits stability issues with it. Add a quirk to disable it for that system. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214417 Signed-off-by: Alex Deucher --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 33 +++ 1 file changed, 33 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 5b8d1635da5c..b635b893837b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1295,6 +1295,37 @@ static struct hpd_rx_irq_offload_work_queue *hpd_rx_irq_create_workqueue(struct return hpd_rx_offload_wq; } +struct amdgpu_stutter_quirk { + u16 chip_vendor; + u16 chip_device; + u16 subsys_vendor; + u16 subsys_device; + u8 revision; +}; + +static const struct amdgpu_stutter_quirk amdgpu_stutter_quirk_list[] = { + /* https://bugzilla.kernel.org/show_bug.cgi?id=214417 */ + { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 }, + { 0, 0, 0, 0, 0 }, +}; + +static bool dm_should_disable_stutter(struct pci_dev *pdev) +{ + const struct amdgpu_stutter_quirk *p = amdgpu_stutter_quirk_list; + + while (p && p->chip_device != 0) { + if (pdev->vendor == p->chip_vendor && + pdev->device == p->chip_device && + pdev->subsystem_vendor == p->subsys_vendor && + pdev->subsystem_device == p->subsys_device && + pdev->revision == p->revision) { + return true; + } + ++p; + } + return false; +} + static int amdgpu_dm_init(struct amdgpu_device *adev) { struct dc_init_data init_data; @@ -1406,6 +1437,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev) if (adev->asic_type != CHIP_CARRIZO && adev->asic_type != CHIP_STONEY) adev->dm.dc->debug.disable_stutter = amdgpu_pp_feature_mask & PP_STUTTER_MODE ? false : true; + if (dm_should_disable_stutter(adev->pdev)) + adev->dm.dc->debug.disable_stutter = true; if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER) adev->dm.dc->debug.disable_stutter = true; -- 2.31.1
RE: [PATCH] drm/amdgpu: Restore information reporting in RAS
[AMD Official Use Only] > -Original Message- > From: Kuehling, Felix > Sent: Monday, October 25, 2021 1:30 PM > To: Tuikov, Luben ; amd-gfx@lists.freedesktop.org > Cc: Russell, Kent ; Deucher, Alexander > > Subject: Re: [PATCH] drm/amdgpu: Restore information reporting in RAS > > Am 2021-10-25 um 12:02 p.m. schrieb Luben Tuikov: > > A recent patch took away the reporting of number of RAS records and > > the threshold due to the way it was edited/spliced on top of the code. > > This patch restores this reporting. > > > > Cc: Kent Russell > > Cc: Alex Deucher > > Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for > > ignoring bad > page threshold") > > Signed-off-by: Luben Tuikov > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 - > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > > index ae64ca02ccc4f8..05117eda105b55 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > > @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct > amdgpu_ras_eeprom_control *control, > > res = 0; > > } else { > > *exceed_err_limit = true; > > - dev_err(adev->dev, "GPU will not be > > initialized. Replace this > GPU or increase the threshold."); > > + dev_err(adev->dev, > > + "RAS records:%d exceed threshold:%d, " > > + "GPU will not be initialized. Replace > > this GPU or > increase the threshold", > > Splitting messages across multiple lines is usually discouraged > (presumably because it makes them hard to grep). I think checkpatch will > treat this as an error, while a long line is just a warning. Therefore > it seems that long lines are less bad than split messages. There are a few spots in the eeprom file where it gets done like this; I don't really like it either. Under https://www.kernel.org/doc/html/v5.13/process/coding-style.html , I see it supporting splitting for ASM (point 20) but not for regular string (point 2). In this one he's just restoring something I dropped, verbatim, so I have no issue giving it my RB. Kent > > Regards, > Felix > > > > + control->ras_num_recs, ras- > >bad_page_cnt_threshold); > > } > > } > > } else { > > > > base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
[PATCH 5/5] drm/amd/display: Enable dpia in dmub only for DCN31 B0
From: Jude Shih [Why] DMUB binary is common for both A0 and B0. Hence, driver should notify FW about the support for DPIA in B0. [How] Added dpia_supported bit in dmub_fw_boot_options and will be set only for B0. Assign dpia_supported to true before dm_dmub_hw_init in B0 case. Signed-off-by: Jude Shih Reviewed-by: Nicholas Kazlauskas --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 1 + drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c | 1 + 3 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 0ad6d0e2a2d8..139a7707f317 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1017,6 +1017,7 @@ static int dm_dmub_hw_init(struct amdgpu_device *adev) const unsigned char *fw_inst_const, *fw_bss_data; uint32_t i, fw_inst_const_size, fw_bss_data_size; bool has_hw_support; + struct dc *dc = adev->dm.dc; if (!dmub_srv) /* DMUB isn't supported on the ASIC. */ @@ -1103,6 +1104,17 @@ static int dm_dmub_hw_init(struct amdgpu_device *adev) for (i = 0; i < fb_info->num_fb; ++i) hw_params.fb[i] = &fb_info->fb[i]; + switch (adev->asic_type) { + case CHIP_YELLOW_CARP: + if (dc->ctx->asic_id.hw_internal_rev != YELLOW_CARP_A0) { + hw_params.dpia_supported = true; + hw_params.disable_dpia = dc->debug.dpia_debug.bits.disable_dpia; + } + break; + default: + break; + } + status = dmub_srv_hw_init(dmub_srv, &hw_params); if (status != DMUB_STATUS_OK) { DRM_ERROR("Error initializing DMUB HW: %d\n", status); diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h index 6c4f0ada163f..717c0e572d2f 100644 --- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h @@ -238,6 +238,7 @@ struct dmub_srv_hw_params { bool load_inst_const; bool skip_panel_power_sequence; bool disable_z10; + bool dpia_supported; bool disable_dpia; }; diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c index 5df990277dd4..10ebf20eaa41 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c @@ -338,6 +338,7 @@ void dmub_dcn31_enable_dmub_boot_options(struct dmub_srv *dmub, const struct dmu union dmub_fw_boot_options boot_options = {0}; boot_options.bits.z10_disable = params->disable_z10; + boot_options.bits.dpia_supported = params->dpia_supported; boot_options.bits.enable_dpia = params->disable_dpia ? 0 : 1; boot_options.bits.sel_mux_phy_c_d_phy_f_g = (dmub->asic == DMUB_ASIC_DCN31B) ? 1 : 0; -- 2.25.1
[PATCH 3/5] drm/amd/display: Fix USB4 hot plug crash issue
From: Jude Shih [Why] Notify data from outbox corrupt, the notify type should be 2 (HPD) instead of 0 (No data). We copied the address instead of the value. The memory might be freed in the end of outbox IRQ [How] We should allocate the memory of notify and copy the whole content from outbox to hpd handle function Fixes: 5cecad78ea53 ("drm/amd/display: Support for SET_CONFIG processing with DMUB") Signed-off-by: Jude Shih Reviewed-by: Nicholas Kazlauskas --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +-- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 4cd64529b180..0ad6d0e2a2d8 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -730,6 +730,8 @@ static void dm_handle_hpd_work(struct work_struct *work) dmub_hpd_wrk->adev->dm.dmub_callback[dmub_hpd_wrk->dmub_notify->type](dmub_hpd_wrk->adev, dmub_hpd_wrk->dmub_notify); } + + kfree(dmub_hpd_wrk->dmub_notify); kfree(dmub_hpd_wrk); } @@ -755,12 +757,6 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params) if (dc_enable_dmub_notifications(adev->dm.dc) && irq_params->irq_src == DC_IRQ_SOURCE_DMCUB_OUTBOX) { - dmub_hpd_wrk = kzalloc(sizeof(*dmub_hpd_wrk), GFP_ATOMIC); - if (!dmub_hpd_wrk) { - DRM_ERROR("Failed to allocate dmub_hpd_wrk"); - return; - } - INIT_WORK(&dmub_hpd_wrk->handle_hpd_work, dm_handle_hpd_work); do { dc_stat_get_dmub_notification(adev->dm.dc, ¬ify); @@ -769,7 +765,20 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params) continue; } if (dm->dmub_thread_offload[notify.type] == true) { - dmub_hpd_wrk->dmub_notify = ¬ify; + dmub_hpd_wrk = kzalloc(sizeof(*dmub_hpd_wrk), GFP_ATOMIC); + if (!dmub_hpd_wrk) { + DRM_ERROR("Failed to allocate dmub_hpd_wrk"); + return; + } + dmub_hpd_wrk->dmub_notify = kzalloc(sizeof(struct dmub_notification), GFP_ATOMIC); + if (!dmub_hpd_wrk->dmub_notify) { + kfree(dmub_hpd_wrk); + DRM_ERROR("Failed to allocate dmub_hpd_wrk->dmub_notify"); + return; + } + INIT_WORK(&dmub_hpd_wrk->handle_hpd_work, dm_handle_hpd_work); + if (dmub_hpd_wrk->dmub_notify) + memcpy(dmub_hpd_wrk->dmub_notify, ¬ify, sizeof(struct dmub_notification)); dmub_hpd_wrk->adev = adev; if (notify.type == DMUB_NOTIFICATION_HPD) { plink = adev->dm.dc->links[notify.link_index]; -- 2.25.1
[PATCH 1/5] drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
From: Michael Strauss [WHY] On certain configs, SMU clock table voltages don't match which cause parser to behave incorrectly by leaving dcfclk and socclk table entries unpopulated. [HOW] Currently the function that finds the corresponding clock for a given voltage only checks for exact voltage level matches. In the case that no match gets found, parser now falls back to searching for the max clock which meets the requested voltage (i.e. its corresponding voltage is below requested). Signed-off-by: Michael Strauss Reviewed-by: Nicholas Kazlauskas --- .../amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c| 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c index 0088dff441da..f4c9a458ace8 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c @@ -523,14 +523,21 @@ static unsigned int find_clk_for_voltage( unsigned int voltage) { int i; + int max_voltage = 0; + int clock = 0; for (i = 0; i < NUM_SOC_VOLTAGE_LEVELS; i++) { - if (clock_table->SocVoltage[i] == voltage) + if (clock_table->SocVoltage[i] == voltage) { return clocks[i]; + } else if (clock_table->SocVoltage[i] >= max_voltage && + clock_table->SocVoltage[i] < voltage) { + max_voltage = clock_table->SocVoltage[i]; + clock = clocks[i]; + } } - ASSERT(0); - return 0; + ASSERT(clock); + return clock; } void dcn31_clk_mgr_helper_populate_bw_params( -- 2.25.1
[PATCH 4/5] drm/amd/display: MST support for DPIA
From: Meenakshikumar Somasundaram [Why] - DPIA MST slot registers are not programmed during payload allocation and hence MST does not work with DPIA. - HPD RX interrupts are not handled for DPIA. [How] - Added inbox command to program the MST slots whenever payload allocation happens for DPIA links. - Added support for handling HPD RX interrupts Signed-off-by: Meenakshikumar Somasundaram Reviewed-by: Jun Lei Acked-by: Nicholas Kazlauskas --- drivers/gpu/drm/amd/display/dc/core/dc.c | 54 +++ drivers/gpu/drm/amd/display/dc/core/dc_link.c | 28 ++ drivers/gpu/drm/amd/display/dc/dc.h | 6 +++ .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 23 4 files changed, 111 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 9d33b00b7e27..12e5470fa567 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -3750,6 +3750,60 @@ bool dc_process_dmub_set_config_async(struct dc *dc, return is_cmd_complete; } +/** + * + * Function: dc_process_dmub_set_mst_slots + * + * @brief + * Submits mst slot allocation command to dmub via inbox message + * + * @param + * [in] dc: dc structure + * [in] link_index: link index + * [in] mst_alloc_slots: mst slots to be allotted + * [out] mst_slots_in_use: mst slots in use returned in failure case + * + * @return + * DC_OK if successful, DC_ERROR if failure + * + */ +enum dc_status dc_process_dmub_set_mst_slots(const struct dc *dc, + uint32_t link_index, + uint8_t mst_alloc_slots, + uint8_t *mst_slots_in_use) +{ + union dmub_rb_cmd cmd = {0}; + struct dc_dmub_srv *dmub_srv = dc->ctx->dmub_srv; + + /* prepare MST_ALLOC_SLOTS command */ + cmd.set_mst_alloc_slots.header.type = DMUB_CMD__DPIA; + cmd.set_mst_alloc_slots.header.sub_type = DMUB_CMD__DPIA_MST_ALLOC_SLOTS; + + cmd.set_mst_alloc_slots.mst_slots_control.instance = dc->links[link_index]->ddc_hw_inst; + cmd.set_mst_alloc_slots.mst_slots_control.mst_alloc_slots = mst_alloc_slots; + + if (!dc_dmub_srv_cmd_with_reply_data(dmub_srv, &cmd)) + /* command is not processed by dmub */ + return DC_ERROR_UNEXPECTED; + + /* command processed by dmub, if ret_status is 1 */ + if (cmd.set_config_access.header.ret_status != 1) + /* command processing error */ + return DC_ERROR_UNEXPECTED; + + /* command processed and we have a status of 2, mst not enabled in dpia */ + if (cmd.set_mst_alloc_slots.mst_slots_control.immed_status == 2) + return DC_FAIL_UNSUPPORTED_1; + + /* previously configured mst alloc and used slots did not match */ + if (cmd.set_mst_alloc_slots.mst_slots_control.immed_status == 3) { + *mst_slots_in_use = cmd.set_mst_alloc_slots.mst_slots_control.mst_slots_in_use; + return DC_NOT_SUPPORTED; + } + + return DC_OK; +} + /** * dc_disable_accelerated_mode - disable accelerated mode * @dc: dc structure diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index 27b9ff98a569..2796bdd17de1 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3499,6 +3499,20 @@ enum dc_status dc_link_allocate_mst_payload(struct pipe_ctx *pipe_ctx) ASSERT(proposed_table.stream_count > 0); + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) { + static enum dc_status status; + uint8_t mst_alloc_slots = 0, prev_mst_slots_in_use = 0xFF; + + for (i = 0; i < link->mst_stream_alloc_table.stream_count; i++) + mst_alloc_slots += link->mst_stream_alloc_table.stream_allocations[i].slot_count; + + status = dc_process_dmub_set_mst_slots(link->dc, link->link_index, + mst_alloc_slots, &prev_mst_slots_in_use); + ASSERT(status == DC_OK); + DC_LOG_MST("dpia : status[%d]: alloc_slots[%d]: used_slots[%d]\n", + status, mst_alloc_slots, prev_mst_slots_in_use); + } + /* program DP source TX for payload */ #if defined(CONFIG_DRM_AMD_DC_DCN) switch (dp_get_link_encoding_format(&link->cur_link_settings)) { @@ -3842,6 +3856,20 @@ static enum dc_status deallocate_mst_payload(struct pipe_ctx *pipe_ctx) #endif } + if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) { + enum dc_status status; + uint8_t mst_alloc_slots = 0, prev_mst_slots_in_u
[PATCH 2/5] drm/amd/display: Fix deadlock when falling back to v2 from v3
[Why] A deadlock in the kernel occurs when we fallback from the V3 to V2 add_topology_to_display or remove_topology_to_display because they both try to acquire the dtm_mutex but recursive locking isn't supported on mutex_lock(). [How] Make the mutex_lock/unlock more fine grained and move them up such that they're only required for the psp invocation itself. Fixes: bf62221e9d0e ("drm/amd/display: Add DCN3.1 HDCP support") Signed-off-by: Nicholas Kazlauskas Reviewed-by: Aric Cyr --- drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c index e9bd84ec027d..be61975f1470 100644 --- a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c +++ b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c @@ -105,6 +105,7 @@ static enum mod_hdcp_status remove_display_from_topology_v3( dtm_cmd->dtm_status = TA_DTM_STATUS__GENERIC_FAILURE; psp_dtm_invoke(psp, dtm_cmd->cmd_id); + mutex_unlock(&psp->dtm_context.mutex); if (dtm_cmd->dtm_status != TA_DTM_STATUS__SUCCESS) { status = remove_display_from_topology_v2(hdcp, index); @@ -115,8 +116,6 @@ static enum mod_hdcp_status remove_display_from_topology_v3( HDCP_TOP_REMOVE_DISPLAY_TRACE(hdcp, display->index); } - mutex_unlock(&psp->dtm_context.mutex); - return status; } @@ -205,6 +204,7 @@ static enum mod_hdcp_status add_display_to_topology_v3( dtm_cmd->dtm_in_message.topology_update_v3.link_hdcp_cap = link->hdcp_supported_informational; psp_dtm_invoke(psp, dtm_cmd->cmd_id); + mutex_unlock(&psp->dtm_context.mutex); if (dtm_cmd->dtm_status != TA_DTM_STATUS__SUCCESS) { status = add_display_to_topology_v2(hdcp, display); @@ -214,8 +214,6 @@ static enum mod_hdcp_status add_display_to_topology_v3( HDCP_TOP_ADD_DISPLAY_TRACE(hdcp, display->index); } - mutex_unlock(&psp->dtm_context.mutex); - return status; } -- 2.25.1
[PATCH 0/5] Fix USBC lightup on DCN31B
This patchset is a mini-promotion to fix USBC lightup on DCN31B for both SST and MST displays. There are five issues at play here: 1. Invalid clock table causing mode validation to fail 2. HDCP downgrade to v2 causes deadlock 3. MST payload allocation not being propagated 4. PHY mux selection not being properly indicated to DMCUB 5. Hang occuring during HPD on PHYF/G. Jude Shih (2): drm/amd/display: Fix USB4 hot plug crash issue drm/amd/display: Enable dpia in dmub only for DCN31 B0 Meenakshikumar Somasundaram (1): drm/amd/display: MST support for DPIA Michael Strauss (1): drm/amd/display: Fallback to clocks which meet requested voltage on DCN31 Nicholas Kazlauskas (1): drm/amd/display: Fix deadlock when falling back to v2 from v3 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 35 +--- .../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 13 +++-- drivers/gpu/drm/amd/display/dc/core/dc.c | 54 +++ drivers/gpu/drm/amd/display/dc/core/dc_link.c | 28 ++ drivers/gpu/drm/amd/display/dc/dc.h | 6 +++ drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 1 + .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 23 .../gpu/drm/amd/display/dmub/src/dmub_dcn31.c | 1 + .../drm/amd/display/modules/hdcp/hdcp_psp.c | 6 +-- 9 files changed, 153 insertions(+), 14 deletions(-) -- 2.25.1
Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13
sbios settings any of these raise a suspicion? screenshot from the ASRockRack X470D4U's BIOS setup: https://imgur.com/a/rdhGQNy
Re: [PATCH] drm/amdgpu: Restore information reporting in RAS
Am 2021-10-25 um 12:02 p.m. schrieb Luben Tuikov: > A recent patch took away the reporting of number of RAS records and > the threshold due to the way it was edited/spliced on top of the code. > This patch restores this reporting. > > Cc: Kent Russell > Cc: Alex Deucher > Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring > bad page threshold") > Signed-off-by: Luben Tuikov > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > index ae64ca02ccc4f8..05117eda105b55 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct > amdgpu_ras_eeprom_control *control, > res = 0; > } else { > *exceed_err_limit = true; > - dev_err(adev->dev, "GPU will not be > initialized. Replace this GPU or increase the threshold."); > + dev_err(adev->dev, > + "RAS records:%d exceed threshold:%d, " > + "GPU will not be initialized. Replace > this GPU or increase the threshold", Splitting messages across multiple lines is usually discouraged (presumably because it makes them hard to grep). I think checkpatch will treat this as an error, while a long line is just a warning. Therefore it seems that long lines are less bad than split messages. Regards, Felix > + control->ras_num_recs, > ras->bad_page_cnt_threshold); > } > } > } else { > > base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
Re: I got an IOMMU IO page fault. What to do now?
Hi Robin, Am 25.10.21 um 18:01 schrieb Robin Murphy: On 2021-10-25 12:23, Christian König wrote: Hi Paul, not sure how the IOMMU gives out addresses, but the printed ones look suspicious to me. Something like we are using an invalid address like -1 or similar. FWIW those look like believable DMA addresses to me, assuming that the DMA mapping APIs are being backed iommu_dma_ops and the device has a 40-bit DMA mask, since the IOVA allocator works top-down. Thanks for that information. In that case the addresses look valid to me. Likely causes are either a race where the dma_unmap_*() call happens before the hardware has really stopped accessing the relevant addresses, or the device's DMA mask has been set larger than it should be, and thus the upper bits have been truncated in the round-trip through the hardware. That actually looks correct to me. The device indeed has a 40-bit DMA mask. There is a third possibility which is actually quite likely and that are stale reads in the pipeline. See for some use cases the device can queue reads into an internal pipeline, but when it later finds that the read isn't needed doesn't flush the pipeline. The next operation pushes more read requests into the pipeline and eventually the stale read requests are executed as well. Without IOMMU the result of those reads are simply discarded, so no harm done. But with IOMMU enabled it is perfectly possible that the stale read is now accessing unmapped memory -> BAM. That's one of the reasons why we almost always have GPUs in passthrough mode on x86 and for example don't use system memory for GPU page tables on APUs. Regards, Christian. Given the addresses involved, my suspicions would initially lean towards the latter case - the faults are in the very topmost pages which imply they're the first things mapped in that range. The other contributing factor being the trick that the IOVA allocator plays for PCI devices, where it tries to prefer 32-bit addresses. Thus you're only likely to see this happen once you already have ~3.5-4GB of live DMA-mapped memory to exhaust the 32-bit IOVA space (minus some reserved areas) and start allocating from the full DMA mask. You should be able to check that with a 5.13 or newer kernel by booting with "iommu.forcedac=1" and seeing if it breaks immediately (unfortunately with an older kernel you'd have to manually hack iommu_dma_alloc_iova() to the same effect). Robin. Can you try that on an up to date kernel as well? E.g. ideally bleeding edge amd-staging-drm-next from Alex repository. Regards, Christian. Am 25.10.21 um 12:25 schrieb Paul Menzel: Dear Linux folks, On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages below. (GPU hang in amdgpu issue #1762 [1] might be related.) $ lspci -nn -s 05:00.0 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87) $ dmesg […] [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020] [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020] [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020] [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020] [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020] [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020] [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020] [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020] [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020] [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020] [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020] [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020] [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020] [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020] [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020] [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xf
Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13
hi The driver is not able to find the vbios image which is required for the driver to properly enumerate the hardware. I would guess it's a platform issue. Is there a newer sbios image available for your platform? ... I would start with an sbios update is possible. not that i'm aware. this board is an ASRockRack X470D4U server board https://www.asrockrack.com/general/productdetail.asp?Model=X470D4U#Specifications latest BIOS https://www.asrockrack.com/general/productdetail.asp?Model=X470D4U#Download is, 4.208/12/2021 BIOS which is installed, dmidecode | grep "BIOS Information" -A5 BIOS Information Vendor: American Megatrends International, LLC. Version: P4.20 Release Date: 04/14/2021 Address: 0xF Runtime Size: 64 kB check if there are any options in the sbios regarding the behavior of the integrated graphics when an external GPU is present. I suspect the one of the following is the problem: 1. The sbios should disable the integrated graphics when a dGPU is present, but due to a bug in the sbios or a particular sbios settings it has failed to. 2. The sbios should be providing a vbios image for the integrated graphics, but due to a bug in the sbios or a particular sbios settings it has failed to. 3. The platform uses some alternative method to provide access to the vbios image for the integrated graphics that Linux does not yet handle. Checking on the specific options for Dual/Concurrent GPU support is ... challenging ... so far. I haven't found a clear statement/doc on how it's intended to behave, what options are available, or details on the individual options. Per a chat late last week with ASRockRack, my understanding is that dual CPU support is *supposed* to work. atm, I'm not clear on how specifically test/answer for any of your suspected issues :-/ Reading online to see what to check, etc. To add to the list - check if ACPI support is broken or skipped. It doesn't appear to me to be; here's dmesg output, dmesg | grep -i acpi ... [0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] ACPI NVS [0.00] BIOS-e820: [mem 0xbae2e000-0xbae70fff] ACPI data [0.00] BIOS-e820: [mem 0xbae71000-0xbd0defff] ACPI NVS [0.00] reserve setup_data: [mem 0x0a20-0x0a20efff] ACPI NVS [0.00] reserve setup_data: [mem 0xbae2e000-0xbae70fff] ACPI data [0.00] reserve setup_data: [mem 0xbae71000-0xbd0defff] ACPI NVS [0.00] efi: ACPI=0xbd0c8000 ACPI 2.0=0xbd0c8014 SMBIOS=0xbdd6d000 SMBIOS 3.0=0xbdd6c000 MEMATTR=0xb5ee6698 ESRT=0xb5804518 MOKvar=0xb568b000 RNG=0xbdd9eb18 [0.004625] ACPI: Early table checksum verification disabled [0.004627] ACPI: RSDP 0xBD0C8014 24 (v02 ALASKA) [0.004630] ACPI: XSDT 0xBD0C7728 E4 (v01 ALASKA A M I 01072009 AMI 0113) [0.004634] ACPI: FACP 0xBAE6 000114 (v06 ALASKA A M I 01072009 AMI 00010013) [0.004637] ACPI: DSDT 0xBAE59000 006308 (v02 ALASKA A M I 01072009 INTL 20120913) [0.004639] ACPI: FACS 0xBC0C2000 40 [0.004640] ACPI: IVRS 0xBAE7 D0 (v02 AMD AmdTable 0001 AMD 0001) [0.004642] ACPI: SPMI 0xBAE6F000 41 (v05 ALASKA A M I AMI. ) [0.004644] ACPI: SPMI 0xBAE6E000 41 (v05 ALASKA A M I AMI. ) [0.004645] ACPI: SSDT 0xBAE66000 007229 (v02 AMDMYRTLE 0002 MSFT 0400) [0.004647] ACPI: SSDT 0xBAE62000 003BD7 (v01 AMDAMD AOD 0001 INTL 20120913) [0.004648] ACPI: SSDT 0xBAE61000 C8 (v02 ALASKA CPUSSDT 01072009 AMI 01072009) [0.004650] ACPI: FIDT 0xBAE58000 9C (v01 ALASKA A M I 01072009 AMI 00010013) [0.004651] ACPI: MCFG 0xBAE57000 3C (v01 ALASKA A M I 01072009 MSFT 00010013) [0.004653] ACPI: AAFT 0xBAE56000 68 (v01 ALASKA OEMAAFT 01072009 MSFT 0097) [0.004654] ACPI: HPET 0xBAE55000 38 (v01 ALASKA A M I 01072009 AMI 0005) [0.004656] ACPI: SPCR 0xBAE54000 50 (v02 A M I APTIO V 01072009 AMI. 00050011) [0.004657] ACPI: SSDT 0xBAE51000 002B44 (v02 AMD AmdTable 0001 AMD 0001) [0.004659] ACPI: CRAT 0xBAE5 000B68 (v01 AMD AmdTable 0001 AMD 0001) [0.004660] ACPI: CDIT 0xBAE4F000 29 (v01 AMD AmdTable 0001 AMD 0001) [0.004662] ACPI: SSDT 0xBAE4E000 000D53 (v01 AMD MYRTLEG2 0001 INTL 20120913) [
Re: [PATCH] drm/amdgpu: Restore information reporting in RAS
[Public] Does the message need a mention about the newly added option to ignore threshold? Thanks, Lijo From: amd-gfx on behalf of Luben Tuikov Sent: Monday, October 25, 2021 9:32:20 PM To: amd-gfx@lists.freedesktop.org Cc: Tuikov, Luben ; Russell, Kent ; Deucher, Alexander Subject: [PATCH] drm/amdgpu: Restore information reporting in RAS A recent patch took away the reporting of number of RAS records and the threshold due to the way it was edited/spliced on top of the code. This patch restores this reporting. Cc: Kent Russell Cc: Alex Deucher Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring bad page threshold") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index ae64ca02ccc4f8..05117eda105b55 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control, res = 0; } else { *exceed_err_limit = true; - dev_err(adev->dev, "GPU will not be initialized. Replace this GPU or increase the threshold."); + dev_err(adev->dev, + "RAS records:%d exceed threshold:%d, " + "GPU will not be initialized. Replace this GPU or increase the threshold", + control->ras_num_recs, ras->bad_page_cnt_threshold); } } } else { base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d -- 2.33.1.558.g2bd2f258f4
Re: [PATCH] drm/amdgpu: Restore information reporting in RAS
On Mon, Oct 25, 2021 at 12:02 PM Luben Tuikov wrote: > > A recent patch took away the reporting of number of RAS records and > the threshold due to the way it was edited/spliced on top of the code. > This patch restores this reporting. > > Cc: Kent Russell > Cc: Alex Deucher > Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring > bad page threshold") > Signed-off-by: Luben Tuikov Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > index ae64ca02ccc4f8..05117eda105b55 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct > amdgpu_ras_eeprom_control *control, > res = 0; > } else { > *exceed_err_limit = true; > - dev_err(adev->dev, "GPU will not be > initialized. Replace this GPU or increase the threshold."); > + dev_err(adev->dev, > + "RAS records:%d exceed threshold:%d, " > + "GPU will not be initialized. Replace > this GPU or increase the threshold", > + control->ras_num_recs, > ras->bad_page_cnt_threshold); > } > } > } else { > > base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d > -- > 2.33.1.558.g2bd2f258f4 >
RE: [PATCH] drm/amdgpu: Restore information reporting in RAS
[AMD Official Use Only] Thanks Luben Reviewed-by: Kent Russell Kent > -Original Message- > From: Tuikov, Luben > Sent: Monday, October 25, 2021 12:02 PM > To: amd-gfx@lists.freedesktop.org > Cc: Tuikov, Luben ; Russell, Kent > ; > Deucher, Alexander > Subject: [PATCH] drm/amdgpu: Restore information reporting in RAS > > A recent patch took away the reporting of number of RAS records and > the threshold due to the way it was edited/spliced on top of the code. > This patch restores this reporting. > > Cc: Kent Russell > Cc: Alex Deucher > Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring > bad page > threshold") > Signed-off-by: Luben Tuikov > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > index ae64ca02ccc4f8..05117eda105b55 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c > @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct > amdgpu_ras_eeprom_control *control, > res = 0; > } else { > *exceed_err_limit = true; > - dev_err(adev->dev, "GPU will not be > initialized. Replace this > GPU or increase the threshold."); > + dev_err(adev->dev, > + "RAS records:%d exceed threshold:%d, " > + "GPU will not be initialized. Replace > this GPU or > increase the threshold", > + control->ras_num_recs, ras- > >bad_page_cnt_threshold); > } > } > } else { > > base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d > -- > 2.33.1.558.g2bd2f258f4
[PATCH] drm/amdgpu: Restore information reporting in RAS
A recent patch took away the reporting of number of RAS records and the threshold due to the way it was edited/spliced on top of the code. This patch restores this reporting. Cc: Kent Russell Cc: Alex Deucher Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring bad page threshold") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index ae64ca02ccc4f8..05117eda105b55 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control, res = 0; } else { *exceed_err_limit = true; - dev_err(adev->dev, "GPU will not be initialized. Replace this GPU or increase the threshold."); + dev_err(adev->dev, + "RAS records:%d exceed threshold:%d, " + "GPU will not be initialized. Replace this GPU or increase the threshold", + control->ras_num_recs, ras->bad_page_cnt_threshold); } } } else { base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d -- 2.33.1.558.g2bd2f258f4
Re: I got an IOMMU IO page fault. What to do now?
On 2021-10-25 12:23, Christian König wrote: Hi Paul, not sure how the IOMMU gives out addresses, but the printed ones look suspicious to me. Something like we are using an invalid address like -1 or similar. FWIW those look like believable DMA addresses to me, assuming that the DMA mapping APIs are being backed iommu_dma_ops and the device has a 40-bit DMA mask, since the IOVA allocator works top-down. Likely causes are either a race where the dma_unmap_*() call happens before the hardware has really stopped accessing the relevant addresses, or the device's DMA mask has been set larger than it should be, and thus the upper bits have been truncated in the round-trip through the hardware. Given the addresses involved, my suspicions would initially lean towards the latter case - the faults are in the very topmost pages which imply they're the first things mapped in that range. The other contributing factor being the trick that the IOVA allocator plays for PCI devices, where it tries to prefer 32-bit addresses. Thus you're only likely to see this happen once you already have ~3.5-4GB of live DMA-mapped memory to exhaust the 32-bit IOVA space (minus some reserved areas) and start allocating from the full DMA mask. You should be able to check that with a 5.13 or newer kernel by booting with "iommu.forcedac=1" and seeing if it breaks immediately (unfortunately with an older kernel you'd have to manually hack iommu_dma_alloc_iova() to the same effect). Robin. Can you try that on an up to date kernel as well? E.g. ideally bleeding edge amd-staging-drm-next from Alex repository. Regards, Christian. Am 25.10.21 um 12:25 schrieb Paul Menzel: Dear Linux folks, On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages below. (GPU hang in amdgpu issue #1762 [1] might be related.) $ lspci -nn -s 05:00.0 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87) $ dmesg […] [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020] [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020] [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020] [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020] [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020] [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020] [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020] [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020] [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020] [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020] [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020] [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020] [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020] [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020] [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020] [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffa0c0 flags=0x0020] [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffbdc0 flags=0x0020] [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffa4c0 flags=0x0020] [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xff90c0 flags=0x0020] [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffabc0 flags=0x0020] As this is not reproducible, how would debugging go? (The system was rebooted in the meantime.) What options should be enabled, that next time the required information is logged, or what commands should I execute when the system is still in that state, so the bug (driver, userspace, …) can be pinpointed and fixed? Kind regards, Paul [1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"
Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane
On 2021-10-25 11:12, Paul Menzel wrote: > Dear Nicholas, dear Harry, > > > On 25.10.21 16:42, Kazlauskas, Nicholas wrote: >> On 2021-10-25 9:58 a.m., Harry Wentland wrote: > >>> On 2021-10-25 07:25, Paul Menzel wrote: > On 24.10.21 15:31, Rodrigo Siqueira wrote: > From: Wenjing Liu > > [why] > We have a regression that cause maximize lane settings to use > uninitialized data from unused lanes. Which commit caused the regression? Please amend the commit message. > This will cause link training to fail for 1 or 2 lanes because the lane > adjust is populated incorrectly sometimes. On what card did you test this, and how can it be reproduced? Please describe the fix/implemantation in the commit message. > Reviewed-by: Eric Yang > Acked-by: Rodrigo Siqueira > Signed-off-by: Wenjing Liu > --- > .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 35 +-- > 1 file changed, 32 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c > b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c > index 653279ab96f4..f6ba7c734f54 100644 > --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c > @@ -108,6 +108,9 @@ static struct dc_link_settings > get_common_supported_link_settings( > struct dc_link_settings link_setting_b); > static void maximize_lane_settings(const struct link_training_settings > *lt_settings, > struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); > +static void override_lane_settings(const struct link_training_settings > *lt_settings, > + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); > + > static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link, > const struct dc_link_settings *link_settings) > { > @@ -734,15 +737,13 @@ void dp_decide_lane_settings( > } > #endif > } > - > - /* we find the maximum of the requested settings across all lanes*/ > - /* and set this maximum for all lanes*/ > dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, > dpcd_lane_settings); > if (lt_settings->disallow_per_lane_settings) { > /* we find the maximum of the requested settings across all > lanes*/ > /* and set this maximum for all lanes*/ > maximize_lane_settings(lt_settings, hw_lane_settings); > + override_lane_settings(lt_settings, hw_lane_settings); > if (lt_settings->always_match_dpcd_with_hw_lane_settings) > dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, > dpcd_lane_settings); > @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct > link_training_settings *lt_setti > } > } > +static void override_lane_settings(const struct > link_training_settings *lt_settings, > + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]) > +{ > + uint32_t lane; > + > + if (lt_settings->voltage_swing == NULL && > + lt_settings->pre_emphasis == NULL && > +#if defined(CONFIG_DRM_AMD_DC_DP2_0) > + lt_settings->ffe_preset == NULL && > +#endif > + lt_settings->post_cursor2 == NULL) > + > + return; > + > + for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) { > + if (lt_settings->voltage_swing) > + lane_settings[lane].VOLTAGE_SWING = > *lt_settings->voltage_swing; > + if (lt_settings->pre_emphasis) > + lane_settings[lane].PRE_EMPHASIS = > *lt_settings->pre_emphasis; > + if (lt_settings->post_cursor2) > + lane_settings[lane].POST_CURSOR2 = > *lt_settings->post_cursor2; > +#if defined(CONFIG_DRM_AMD_DC_DP2_0) > + if (lt_settings->ffe_preset) > + lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset; > +#endif Normally these checks should be done in C and not the preprocessor. `if CONFIG(DRM_AMD_DC_DP2_0)` or similar should work. >>> >>> Interesting. I've never seen this before. Do you have an example or link to >>> a doc? A cursory search doesn't yield any results but I might not be >>> searching for the right thing. >>> >>> Harry >> >> I'm curious about this too. The compiler with optimizations should remove >> the constant check, but technically the C standard only permits it - it >> doesn't guarantee that it happens. >> >> However, this patch should actually be changed to drop these >> CONFIG_DRM_AMD_DC_DP2_0 guards - this isn't a Kconfig option nor will there >> be one specifically for DP2. This should be folded under the DCN support. > > From the Li
Re: [PATCH] drm/amd/amdgpu: fix potential bad job hw_fence underflow
On 2021-10-24 10:56 p.m., JingWen Chen wrote: On 2021/10/23 上午4:41, Andrey Grodzovsky wrote: What do you mean by underflow in this case ? You mean use after free because of extra dma_fence_put() ? yes Then maybe update the description because 'underflow' is very confusing On 2021-10-22 4:14 a.m., JingWen Chen wrote: ping On 2021/10/22 AM11:33, Jingwen Chen wrote: [Why] In advance tdr mode, the real bad job will be resubmitted twice, while in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job is put one more time than other jobs. [How] Adding dma_fence_get before resbumit job in amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs Signed-off-by: Jingwen Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 41ce86244144..975f069f6fe8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4841,6 +4841,9 @@ static void amdgpu_device_recheck_guilty_jobs( /* clear job's guilty and depend the folowing step to decide the real one */ drm_sched_reset_karma(s_job); + /* for the real bad job, it will be resubmitted twice, adding a dma_fence_get + * to make sure fence is balanced */ But that put in drm_sched_resubmit_jobs_ext is for the previous parent fence. fence = sched->ops->run_job(s_job); returns a new HW fence and the put drops the refcount on the old one. Andrey Hi Andrey, If I remember correctly, after we embedded the hw_fence into amdgpu_job, there will be not fence replacement in amdgpu_job_run. Right, I forgot that... What about removing line https://elixir.bootlin.com/linux/v5.15-rc6/source/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c#L265 ? What if you make dma_get_fence unconditional instead ? Andrey + dma_fence_get(s_job->s_fence->parent); drm_sched_resubmit_jobs_ext(&ring->sched, 1); ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, ring->sched.timeout); @@ -4876,6 +4879,7 @@ static void amdgpu_device_recheck_guilty_jobs( /* got the hw fence, signal finished fence */ atomic_dec(ring->sched.score); + dma_fence_put(s_job->s_fence->parent); dma_fence_get(&s_job->s_fence->finished); dma_fence_signal(&s_job->s_fence->finished); dma_fence_put(&s_job->s_fence->finished);
Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13
On 10/25/2021 7:45 PM, Alex Deucher wrote: On Mon, Oct 25, 2021 at 9:48 AM PGNet Dev wrote: ( cc'ing this here, OP -> dri-devel@ ) i've a dual gpu system inxi -GS System:Host: ws05 Kernel: 5.14.13-200.fc34.x86_64 x86_64 bits: 64 Console: tty pts/0 Distro: Fedora release 34 (Thirty Four) (1) Graphics: Device-1: NVIDIA GK208B [GeForce GT 710] driver: nvidia v: 470.74 (2)Device-2: Advanced Micro Devices [AMD/ATI] Cezanne driver: N/A Display: server: X.org 1.20.11 driver: loaded: nvidia unloaded: fbdev,modesetting,vesa Message: Advanced graphics data unavailable for root. running on cpu:Ryzen 5 5600G mobo: ASRockRack X470D4U bios: vP4.20, 04/14/2021 kernel: 5.14.13-200.fc34.x86_64 x86_64 where, the nvidia is a PCIe card the amdgpu is the Ryzen-integrated gpu the nvidia PCI is currently my primary it's screen-attached, and boots/functions correctly lsmod | grep nvidia nvidia_drm 69632 0 nvidia_modeset 1200128 1 nvidia_drm nvidia 35332096 1 nvidia_modeset drm_kms_helper303104 2 amdgpu,nvidia_drm drm 630784 8 gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm dmesg | grep -i nvidia [5.755494] nvidia: loading out-of-tree module taints kernel. [5.755503] nvidia: module license 'NVIDIA' taints kernel. [5.759769] nvidia: module verification failed: signature and/or required key missing - tainting kernel [5.774894] nvidia-nvlink: Nvlink Core is being initialized, major device number 234 [5.775299] nvidia :10:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem [5.975449] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.74 Mon Sep 13 23:09:15 UTC 2021 [6.013181] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.74 Mon Sep 13 22:59:50 UTC 2021 [6.016444] [drm] [nvidia-drm] [GPU ID 0x1000] Loading driver [6.227295] caller _nv000723rm+0x1ad/0x200 [nvidia] mapping multiple BARs [6.954906] [drm] Initialized nvidia-drm 0.0.0 20160202 for :10:00.0 on minor 0 [ 16.820758] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input13 [ 16.820776] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input14 [ 16.820808] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input15 [ 16.820826] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input16 [ 16.820841] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input17 the amdgpu is not (currently/yet) in use; no attached screen in BIOS, currently, 'PCI Express' (nvidia gpu) is selected as primary 'HybridGraphics' is enabled 'OnBoard VGA' is enabled on boot, mods are loaded lsmod | grep gpu amdgpu 7802880 0 drm_ttm_helper 16384 1 amdgpu ttm81920 2 amdgpu,drm_ttm_helper iommu_v2 24576 1 amdgpu gpu_sched 45056 1 amdgpu drm_kms_helper303104 2 amdgpu,nvidia_drm drm 630784 8 gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm i2c_algo_bit 16384 2 igb,amdgpu but i see a 'fatal error' and 'failed' probe, dmesg | grep -i amdgpu [5.161923] [drm] amdgpu kernel modesetting enabled. [5.162097] amdgpu: Virtual CRAT table created for CPU [5.162104] amdgpu: Topology: Add CPU node [5.162197] amdgpu :30:00.0: enabling device ( -> 0003) [5.162232] amdgpu :30:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled [5.169105] amdgpu :30:00.0: BAR 6: can't assign [??? 0x flags 0x2000] (bogus alignment) [5.174413] amdgpu :30:00.0: amdgpu: Unable to locate a BIOS ROM [5.174415] amdgpu :30:00.0: amdgpu: Fatal error during GPU init [5.174416] amdgpu :30:00.0: amdgpu: amdgpu: finishing device. [5.174425] Modules linked in: amdgpu(+) uas usb_
Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane
Dear Nicholas, dear Harry, On 25.10.21 16:42, Kazlauskas, Nicholas wrote: On 2021-10-25 9:58 a.m., Harry Wentland wrote: On 2021-10-25 07:25, Paul Menzel wrote: On 24.10.21 15:31, Rodrigo Siqueira wrote: From: Wenjing Liu [why] We have a regression that cause maximize lane settings to use uninitialized data from unused lanes. Which commit caused the regression? Please amend the commit message. This will cause link training to fail for 1 or 2 lanes because the lane adjust is populated incorrectly sometimes. On what card did you test this, and how can it be reproduced? Please describe the fix/implemantation in the commit message. Reviewed-by: Eric Yang Acked-by: Rodrigo Siqueira Signed-off-by: Wenjing Liu --- .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 35 +-- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index 653279ab96f4..f6ba7c734f54 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -108,6 +108,9 @@ static struct dc_link_settings get_common_supported_link_settings( struct dc_link_settings link_setting_b); static void maximize_lane_settings(const struct link_training_settings *lt_settings, struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); + static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link, const struct dc_link_settings *link_settings) { @@ -734,15 +737,13 @@ void dp_decide_lane_settings( } #endif } - - /* we find the maximum of the requested settings across all lanes*/ - /* and set this maximum for all lanes*/ dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); if (lt_settings->disallow_per_lane_settings) { /* we find the maximum of the requested settings across all lanes*/ /* and set this maximum for all lanes*/ maximize_lane_settings(lt_settings, hw_lane_settings); + override_lane_settings(lt_settings, hw_lane_settings); if (lt_settings->always_match_dpcd_with_hw_lane_settings) dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct link_training_settings *lt_setti } } +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]) +{ + uint32_t lane; + + if (lt_settings->voltage_swing == NULL && + lt_settings->pre_emphasis == NULL && +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + lt_settings->ffe_preset == NULL && +#endif + lt_settings->post_cursor2 == NULL) + + return; + + for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) { + if (lt_settings->voltage_swing) + lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing; + if (lt_settings->pre_emphasis) + lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis; + if (lt_settings->post_cursor2) + lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2; +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + if (lt_settings->ffe_preset) + lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset; +#endif Normally these checks should be done in C and not the preprocessor. `if CONFIG(DRM_AMD_DC_DP2_0)` or similar should work. Interesting. I've never seen this before. Do you have an example or link to a doc? A cursory search doesn't yield any results but I might not be searching for the right thing. Harry I'm curious about this too. The compiler with optimizations should remove the constant check, but technically the C standard only permits it - it doesn't guarantee that it happens. However, this patch should actually be changed to drop these CONFIG_DRM_AMD_DC_DP2_0 guards - this isn't a Kconfig option nor will there be one specifically for DP2. This should be folded under the DCN support. From the Linux kernel coding style [1][2]: Within code, where possible, use the IS_ENABLED macro to convert a Kconfig symbol into a C boolean expression, and use it in a normal C conditional: if (IS_ENABLED(CONFIG_SOMETHING)) { ... } Kind regards, Paul + } +} + enum dc_status dp_get_lane_status_and_lane_adjust( struct dc_link *link, const struct link_training_settings *link_training_setting, [1]: https://www.kernel.org/doc/html/latest/process/coding-style.html#conditional-compilation [2]: Documentation/process/coding-style.rst
Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane
Hi Paul, First of all, thanks for your feedback. Before I address your comments, I just want to give you some background about this type of patchset. All patches in this series are from a week ago and came from all display teams in AMD. Keep in mind that we share our display core with other OS; as a result, we have multiple updates every week. To keep the upstream aligned with the Display core, we run a weekly process where we extract patches made for other people who might be a good candidates to be sent to the upstream. The process looks like this (oversimplified): 1. Run a script to check which commit from last week is a good candidate to go upstream - finally, port all of those patches to the upstream (this is a manual task). 2. Prepare a branch based on amd-staging-drm-next with all the new patches on top of it. 3. Send it to our validation team to run an extensive set of tests in the new series. 4. If everything is ok, send the patchset to amdgfx. 5. If the test results are positive, we merge this series in the upstream. Again, that's just a summary; we have way more steps here... On 2021-10-25 7:25 a.m., Paul Menzel wrote: Dear Wenjing, dear Rodrigo, On 24.10.21 15:31, Rodrigo Siqueira wrote: From: Wenjing Liu [why] We have a regression that cause maximize lane settings to use uninitialized data from unused lanes. Which commit caused the regression? Please amend the commit message. This is one of the situations where the week interval creates some descriptions that might sound a little bit different. Basically, this patch refers to the patch "implement decide lane settings", in this series; however, we noticed that this is not reverted, and that's why we decided to keep both patches in this series. This will cause link training to fail for 1 or 2 lanes because the lane adjust is populated incorrectly sometimes. On what card did you test this, and how can it be reproduced? Our DQE team validates this series using multiple ASICs. See the final report from Daniel in reply to this series cover letter for more details. Please describe the fix/implemantation in the commit message. Reviewed-by: Eric Yang Acked-by: Rodrigo Siqueira Signed-off-by: Wenjing Liu --- .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 35 +-- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index 653279ab96f4..f6ba7c734f54 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -108,6 +108,9 @@ static struct dc_link_settings get_common_supported_link_settings( struct dc_link_settings link_setting_b); static void maximize_lane_settings(const struct link_training_settings *lt_settings, struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); + static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link, const struct dc_link_settings *link_settings) { @@ -734,15 +737,13 @@ void dp_decide_lane_settings( } #endif } - - /* we find the maximum of the requested settings across all lanes*/ - /* and set this maximum for all lanes*/ dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); if (lt_settings->disallow_per_lane_settings) { /* we find the maximum of the requested settings across all lanes*/ /* and set this maximum for all lanes*/ maximize_lane_settings(lt_settings, hw_lane_settings); + override_lane_settings(lt_settings, hw_lane_settings); if (lt_settings->always_match_dpcd_with_hw_lane_settings) dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct link_training_settings *lt_setti } } +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]) +{ + uint32_t lane; + + if (lt_settings->voltage_swing == NULL && + lt_settings->pre_emphasis == NULL && +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + lt_settings->ffe_preset == NULL && +#endif + lt_settings->post_cursor2 == NULL) + + return; + + for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) { + if (lt_settings->voltage_swing) + lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing; + if (lt_settings->pre_emphasis) + lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis; + if (lt_settings->post_cursor2) + lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2; +#if defined(CONFIG_DRM_AMD_DC_DP2_0)
Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane
On 2021-10-25 9:58 a.m., Harry Wentland wrote: On 2021-10-25 07:25, Paul Menzel wrote: Dear Wenjing, dear Rodrigo, On 24.10.21 15:31, Rodrigo Siqueira wrote: From: Wenjing Liu [why] We have a regression that cause maximize lane settings to use uninitialized data from unused lanes. Which commit caused the regression? Please amend the commit message. This will cause link training to fail for 1 or 2 lanes because the lane adjust is populated incorrectly sometimes. On what card did you test this, and how can it be reproduced? Please describe the fix/implemantation in the commit message. Reviewed-by: Eric Yang Acked-by: Rodrigo Siqueira Signed-off-by: Wenjing Liu --- .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 35 +-- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index 653279ab96f4..f6ba7c734f54 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -108,6 +108,9 @@ static struct dc_link_settings get_common_supported_link_settings( struct dc_link_settings link_setting_b); static void maximize_lane_settings(const struct link_training_settings *lt_settings, struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); + static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link, const struct dc_link_settings *link_settings) { @@ -734,15 +737,13 @@ void dp_decide_lane_settings( } #endif } - - /* we find the maximum of the requested settings across all lanes*/ - /* and set this maximum for all lanes*/ dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); if (lt_settings->disallow_per_lane_settings) { /* we find the maximum of the requested settings across all lanes*/ /* and set this maximum for all lanes*/ maximize_lane_settings(lt_settings, hw_lane_settings); + override_lane_settings(lt_settings, hw_lane_settings); if (lt_settings->always_match_dpcd_with_hw_lane_settings) dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct link_training_settings *lt_setti } } +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]) +{ + uint32_t lane; + + if (lt_settings->voltage_swing == NULL && + lt_settings->pre_emphasis == NULL && +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + lt_settings->ffe_preset == NULL && +#endif + lt_settings->post_cursor2 == NULL) + + return; + + for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) { + if (lt_settings->voltage_swing) + lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing; + if (lt_settings->pre_emphasis) + lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis; + if (lt_settings->post_cursor2) + lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2; +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + if (lt_settings->ffe_preset) + lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset; +#endif Normally these checks should be done in C and not the preprocessor. `if CONFIG(DRM_AMD_DC_DP2_0)` or similar should work. Interesting. I've never seen this before. Do you have an example or link to a doc? A cursory search doesn't yield any results but I might not be searching for the right thing. Harry I'm curious about this too. The compiler with optimizations should remove the constant check, but technically the C standard only permits it - it doesn't guarantee that it happens. However, this patch should actually be changed to drop these CONFIG_DRM_AMD_DC_DP2_0 guards - this isn't a Kconfig option nor will there be one specifically for DP2. This should be folded under the DCN support. Regards, Nicholas Kazlauskas + } +} + enum dc_status dp_get_lane_status_and_lane_adjust( struct dc_link *link, const struct link_training_settings *link_training_setting, Kind regards, Paul
Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13
On Mon, Oct 25, 2021 at 9:48 AM PGNet Dev wrote: > > ( cc'ing this here, OP -> dri-devel@ ) > > i've a dual gpu system > > inxi -GS > System:Host: ws05 Kernel: 5.14.13-200.fc34.x86_64 x86_64 > bits: 64 Console: tty pts/0 >Distro: Fedora release 34 (Thirty Four) > (1) Graphics: Device-1: NVIDIA GK208B [GeForce GT 710] driver: > nvidia v: 470.74 > (2)Device-2: Advanced Micro Devices [AMD/ATI] Cezanne > driver: N/A >Display: server: X.org 1.20.11 driver: loaded: > nvidia unloaded: fbdev,modesetting,vesa >Message: Advanced graphics data unavailable for > root. > > running on > > cpu:Ryzen 5 5600G > mobo: ASRockRack X470D4U > bios: vP4.20, 04/14/2021 > kernel: 5.14.13-200.fc34.x86_64 x86_64 > > where, > > the nvidia is a PCIe card > the amdgpu is the Ryzen-integrated gpu > > the nvidia PCI is currently my primary > it's screen-attached, and boots/functions correctly > > lsmod | grep nvidia > nvidia_drm 69632 0 > nvidia_modeset 1200128 1 nvidia_drm > nvidia 35332096 1 nvidia_modeset > drm_kms_helper303104 2 amdgpu,nvidia_drm > drm 630784 8 > gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm > > dmesg | grep -i nvidia > [5.755494] nvidia: loading out-of-tree module taints > kernel. > [5.755503] nvidia: module license 'NVIDIA' taints kernel. > [5.759769] nvidia: module verification failed: signature > and/or required key missing - tainting kernel > [5.774894] nvidia-nvlink: Nvlink Core is being > initialized, major device number 234 > [5.775299] nvidia :10:00.0: vgaarb: changed VGA > decodes: olddecodes=io+mem,decodes=none:owns=io+mem > [5.975449] NVRM: loading NVIDIA UNIX x86_64 Kernel Module > 470.74 Mon Sep 13 23:09:15 UTC 2021 > [6.013181] nvidia-modeset: Loading NVIDIA Kernel Mode > Setting Driver for UNIX platforms 470.74 Mon Sep 13 22:59:50 UTC 2021 > [6.016444] [drm] [nvidia-drm] [GPU ID 0x1000] Loading > driver > [6.227295] caller _nv000723rm+0x1ad/0x200 [nvidia] > mapping multiple BARs > [6.954906] [drm] Initialized nvidia-drm 0.0.0 20160202 > for :10:00.0 on minor 0 > [ 16.820758] input: HDA NVidia HDMI/DP,pcm=3 as > /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input13 > [ 16.820776] input: HDA NVidia HDMI/DP,pcm=7 as > /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input14 > [ 16.820808] input: HDA NVidia HDMI/DP,pcm=8 as > /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input15 > [ 16.820826] input: HDA NVidia HDMI/DP,pcm=9 as > /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input16 > [ 16.820841] input: HDA NVidia HDMI/DP,pcm=10 as > /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input17 > > the amdgpu is not (currently/yet) in use; no attached screen > > in BIOS, currently, > > 'PCI Express' (nvidia gpu) is selected as primary > 'HybridGraphics' is enabled > 'OnBoard VGA' is enabled > > > on boot, mods are loaded > > lsmod | grep gpu > amdgpu 7802880 0 > drm_ttm_helper 16384 1 amdgpu > ttm81920 2 amdgpu,drm_ttm_helper > iommu_v2 24576 1 amdgpu > gpu_sched 45056 1 amdgpu > drm_kms_helper303104 2 amdgpu,nvidia_drm > drm 630784 8 > gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm > i2c_algo_bit 16384 2 igb,amdgpu > > but i see a 'fatal error' and 'failed' probe, > > dmesg | grep -i amdgpu > [5.161923] [drm] amdgpu kernel modesetting enabled. > [5.162097] amdgpu: Virtual CRAT table created for CPU > [5.162104] amdgpu: Topology: Add CPU node > [5.162197] amdgpu :30:00.0: enabling device ( -> > 0003) > [5.162232] amdgpu :30:00.0: amdgpu: Trusted Memory > Zone (TMZ) feature enabled > [5.169105] amdgpu :30:00.0: BAR 6: can't assign [??? > 0x flags 0x2000] (bogus alignment) > [5.174413] amdgpu :30:00.0: amdgpu: Unable to locate > a BIOS ROM > [5.174415] amdgpu :30:00.0: amdgpu: Fatal error > during GPU init > [5.174416] amdgpu :30:00.0: amdg
Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane
On 2021-10-25 07:25, Paul Menzel wrote: > Dear Wenjing, dear Rodrigo, > > > On 24.10.21 15:31, Rodrigo Siqueira wrote: >> From: Wenjing Liu >> >> [why] >> We have a regression that cause maximize lane settings to use >> uninitialized data from unused lanes. > > Which commit caused the regression? Please amend the commit message. > >> This will cause link training to fail for 1 or 2 lanes because the lane >> adjust is populated incorrectly sometimes. > > On what card did you test this, and how can it be reproduced? > > Please describe the fix/implemantation in the commit message. > >> Reviewed-by: Eric Yang >> Acked-by: Rodrigo Siqueira >> Signed-off-by: Wenjing Liu >> --- >> .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 35 +-- >> 1 file changed, 32 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c >> b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c >> index 653279ab96f4..f6ba7c734f54 100644 >> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c >> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c >> @@ -108,6 +108,9 @@ static struct dc_link_settings >> get_common_supported_link_settings( >> struct dc_link_settings link_setting_b); >> static void maximize_lane_settings(const struct link_training_settings >> *lt_settings, >> struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); >> +static void override_lane_settings(const struct link_training_settings >> *lt_settings, >> + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); >> + >> static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link, >> const struct dc_link_settings *link_settings) >> { >> @@ -734,15 +737,13 @@ void dp_decide_lane_settings( >> } >> #endif >> } >> - >> - /* we find the maximum of the requested settings across all lanes*/ >> - /* and set this maximum for all lanes*/ >> dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, >> dpcd_lane_settings); >> if (lt_settings->disallow_per_lane_settings) { >> /* we find the maximum of the requested settings across all lanes*/ >> /* and set this maximum for all lanes*/ >> maximize_lane_settings(lt_settings, hw_lane_settings); >> + override_lane_settings(lt_settings, hw_lane_settings); >> if (lt_settings->always_match_dpcd_with_hw_lane_settings) >> dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, >> dpcd_lane_settings); >> @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct >> link_training_settings *lt_setti >> } >> } >> +static void override_lane_settings(const struct link_training_settings >> *lt_settings, >> + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]) >> +{ >> + uint32_t lane; >> + >> + if (lt_settings->voltage_swing == NULL && >> + lt_settings->pre_emphasis == NULL && >> +#if defined(CONFIG_DRM_AMD_DC_DP2_0) >> + lt_settings->ffe_preset == NULL && >> +#endif >> + lt_settings->post_cursor2 == NULL) >> + >> + return; >> + >> + for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) { >> + if (lt_settings->voltage_swing) >> + lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing; >> + if (lt_settings->pre_emphasis) >> + lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis; >> + if (lt_settings->post_cursor2) >> + lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2; >> +#if defined(CONFIG_DRM_AMD_DC_DP2_0) >> + if (lt_settings->ffe_preset) >> + lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset; >> +#endif > > Normally these checks should be done in C and not the preprocessor. `if > CONFIG(DRM_AMD_DC_DP2_0)` or similar should work. > Interesting. I've never seen this before. Do you have an example or link to a doc? A cursory search doesn't yield any results but I might not be searching for the right thing. Harry >> + } >> +} >> + >> enum dc_status dp_get_lane_status_and_lane_adjust( >> struct dc_link *link, >> const struct link_training_settings *link_training_setting, >> > > > Kind regards, > > Paul
Re: amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13
adding a trace, ... [5.201715] [drm] amdgpu kernel modesetting enabled. [5.201902] amdgpu: Virtual CRAT table created for CPU [5.201909] amdgpu: Topology: Add CPU node [5.201968] checking generic (e100 1d5000) vs hw (c000 1000) [5.201969] checking generic (e100 1d5000) vs hw (d000 20) [5.201970] checking generic (e100 1d5000) vs hw (fc50 8) [5.201988] amdgpu :30:00.0: enabling device ( -> 0003) [5.202020] [drm] initializing kernel modesetting (RENOIR 0x1002:0x1638 0x1002:0x1636 0xC9). [5.202024] amdgpu :30:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled [5.202033] [drm] register mmio base: 0xFC50 [5.202033] [drm] register mmio size: 524288 [5.202035] [drm] PCIE atomic ops is not supported [5.203075] [drm] add ip block number 0 [5.203076] [drm] add ip block number 1 [5.203077] [drm] add ip block number 2 [5.203078] [drm] add ip block number 3 [5.203078] [drm] add ip block number 4 [5.203079] [drm] add ip block number 5 [5.203079] [drm] add ip block number 6 [5.203080] [drm] add ip block number 7 [5.203081] [drm] add ip block number 8 [5.203081] [drm] add ip block number 9 [5.208784] [drm] BIOS signature incorrect 0 0 [5.208789] amdgpu :30:00.0: BAR 6: can't assign [??? 0x flags 0x2000] (bogus alignment) [5.214038] [drm] BIOS signature incorrect 0 0 [5.214042] amdgpu :30:00.0: amdgpu: Unable to locate a BIOS ROM [5.214044] amdgpu :30:00.0: amdgpu: Fatal error during GPU init [5.214045] amdgpu :30:00.0: amdgpu: amdgpu: finishing device. [5.214048] [ cut here ] [5.214049] WARNING: CPU: 5 PID: 539 at kernel/workqueue.c:3044 __flush_work.isra.0+0x1ef/0x200 [5.214054] Modules linked in: fjes(-) amdgpu(+) raid1 ast drm_vram_helper drm_ttm_helper iommu_v2 ttm gpu_sched drm_kms_helper igb crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel cec dca i2c_algo_bit sp5100_tco ccp drm uas usb_storage wmi video sunrpc tcp_bbr nct6775 hwmon_vid k10temp [5.214065] CPU: 5 PID: 539 Comm: systemd-udevd Not tainted 5.14.13-200.fc34.x86_64 #1 [5.214067] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470D4U, BIOS P4.20 04/14/2021 [5.214068] RIP: 0010:__flush_work.isra.0+0x1ef/0x200 [5.214070] Code: 8b 4d 00 48 8b 55 08 83 e1 08 48 0f ba 6d 00 03 80 c9 f0 e9 37 ff ff ff 0f 0b 48 83 c4 48 44 89 e0 5b 5d 41 5c 41 5d 41 5e c3 <0f> 0b 45 31 e4 e9 46 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 [5.214071] RSP: 0018:9d5f00f0fa80 EFLAGS: 00010246 [5.214073] RAX: 0011 RBX: RCX: 0027 [5.214074] RDX: 0001 RSI: 0001 RDI: 88bc91e25ab8 [5.214074] RBP: 88bc91e25ab8 R08: R09: 9d5f00f0f898 [5.214075] R10: 9d5f00f0f890 R11: 88c39e1fcfe8 R12: 0001 [5.214075] R13: 88bc92622800 R14: 88bc91e2 R15: 9d5f00f0fde0 [5.214076] FS: 7f231d7deb40() GS:88c37df4() knlGS: [5.214077] CS: 0010 DS: ES: CR0: 80050033 [5.214078] CR2: 7fa1bbfa5ff0 CR3: 000104b94000 CR4: 00750ea0 [5.214078] PKRU: 5554 [5.214079] Call Trace: [5.214082] ? dev_printk_emit+0x3e/0x40 [5.214085] __cancel_work_timer+0xea/0x170 [5.214086] ? del_timer_sync+0x57/0x80 [5.214089] ttm_bo_lock_delayed_workqueue+0x11/0x20 [ttm] [5.214093] amdgpu_device_fini_hw+0x33/0x2c5 [amdgpu] [5.214225] amdgpu_driver_load_kms.cold+0x72/0x94 [amdgpu] [5.214338] amdgpu_pci_probe+0x110/0x1a0 [amdgpu] [5.214420] local_pci_probe+0x42/0x80 [5.214423] ? __cond_resched+0x16/0x40 [5.214426] pci_device_probe+0xd9/0x190 [5.214427] really_probe+0x1f5/0x3f0 [5.214429] __driver_probe_device+0xfe/0x180 [5.214430] driver_probe_device+0x1e/0x90 [5.214431] __driver_attach+0xc0/0x1c0 [5.214433] ? __device_attach_driver+0xe0/0xe0 [5.214434] ? __device_attach_driver+0xe0/0xe0 [5.214434] bus_for_each_dev+0x64/0x90 [5.214436] bus_add_driver+0x12b/0x1e0 [5.214438] driver_register+0x8f/0xe0 [5.214439] ? 0xc0d62000 [5.214440] do_one_initcall+0x44/0x1d0 [5.214443] ? kmem_cache_alloc_trace+0x15c/0x280 [5.214445] do_init_module+0x5c/0x270 [5.214448] __do_sys_init_module+0x11d/0x180 [5.214450] do_syscall_64+0x3b/0x90 [5.214452] ? handle_mm_fault+0xcf/0x2a0 [5.214454] ? do_user_addr_fault+0x1d5/0x680 [5.214457] ? syscall_exit_to_user_mode+0x18/0x40 [5.214458] ? exc_page_fault+0x72/0x150 [5.214459] entry_SYSCALL_64_after_hwframe+0x44/0xae [5.214461] RIP: 0033:0x7f231e42a0fe [5.214463] Code: 48 8b 0d 7d 1d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73
amdgpu "Fatal error during GPU init"; Ryzen 5600G integrated GPU + kernel 5.14.13
( cc'ing this here, OP -> dri-devel@ ) i've a dual gpu system inxi -GS System:Host: ws05 Kernel: 5.14.13-200.fc34.x86_64 x86_64 bits: 64 Console: tty pts/0 Distro: Fedora release 34 (Thirty Four) (1) Graphics: Device-1: NVIDIA GK208B [GeForce GT 710] driver: nvidia v: 470.74 (2)Device-2: Advanced Micro Devices [AMD/ATI] Cezanne driver: N/A Display: server: X.org 1.20.11 driver: loaded: nvidia unloaded: fbdev,modesetting,vesa Message: Advanced graphics data unavailable for root. running on cpu:Ryzen 5 5600G mobo: ASRockRack X470D4U bios: vP4.20, 04/14/2021 kernel: 5.14.13-200.fc34.x86_64 x86_64 where, the nvidia is a PCIe card the amdgpu is the Ryzen-integrated gpu the nvidia PCI is currently my primary it's screen-attached, and boots/functions correctly lsmod | grep nvidia nvidia_drm 69632 0 nvidia_modeset 1200128 1 nvidia_drm nvidia 35332096 1 nvidia_modeset drm_kms_helper303104 2 amdgpu,nvidia_drm drm 630784 8 gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm dmesg | grep -i nvidia [5.755494] nvidia: loading out-of-tree module taints kernel. [5.755503] nvidia: module license 'NVIDIA' taints kernel. [5.759769] nvidia: module verification failed: signature and/or required key missing - tainting kernel [5.774894] nvidia-nvlink: Nvlink Core is being initialized, major device number 234 [5.775299] nvidia :10:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem [5.975449] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.74 Mon Sep 13 23:09:15 UTC 2021 [6.013181] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.74 Mon Sep 13 22:59:50 UTC 2021 [6.016444] [drm] [nvidia-drm] [GPU ID 0x1000] Loading driver [6.227295] caller _nv000723rm+0x1ad/0x200 [nvidia] mapping multiple BARs [6.954906] [drm] Initialized nvidia-drm 0.0.0 20160202 for :10:00.0 on minor 0 [ 16.820758] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input13 [ 16.820776] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input14 [ 16.820808] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input15 [ 16.820826] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input16 [ 16.820841] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci:00/:00:01.1/:10:00.1/sound/card0/input17 the amdgpu is not (currently/yet) in use; no attached screen in BIOS, currently, 'PCI Express' (nvidia gpu) is selected as primary 'HybridGraphics' is enabled 'OnBoard VGA' is enabled on boot, mods are loaded lsmod | grep gpu amdgpu 7802880 0 drm_ttm_helper 16384 1 amdgpu ttm81920 2 amdgpu,drm_ttm_helper iommu_v2 24576 1 amdgpu gpu_sched 45056 1 amdgpu drm_kms_helper303104 2 amdgpu,nvidia_drm drm 630784 8 gpu_sched,drm_kms_helper,nvidia,amdgpu,drm_ttm_helper,nvidia_drm,ttm i2c_algo_bit 16384 2 igb,amdgpu but i see a 'fatal error' and 'failed' probe, dmesg | grep -i amdgpu [5.161923] [drm] amdgpu kernel modesetting enabled. [5.162097] amdgpu: Virtual CRAT table created for CPU [5.162104] amdgpu: Topology: Add CPU node [5.162197] amdgpu :30:00.0: enabling device ( -> 0003) [5.162232] amdgpu :30:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled [5.169105] amdgpu :30:00.0: BAR 6: can't assign [??? 0x flags 0x2000] (bogus alignment) [5.174413] amdgpu :30:00.0: amdgpu: Unable to locate a BIOS ROM [5.174415] amdgpu :30:00.0: amdgpu: Fatal error during GPU init [5.174416] amdgpu :30:00.0: amdgpu: amdgpu: finishing device. [5.174425] Modules linked in: amdgpu(+) uas usb_storage fjes(-) raid1 drm_ttm_helper ttm iommu_v2 gpu_sched drm_kms_helper crct10dif_pclmul crc32_pclmul igb crc32c_intel cec ghash_clmulni_intel drm sp51
RE: [PATCH 00/33] DC Patches October 24, 2020
[Public] Hi all, This week this patchset was tested on the following systems: HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) AMD Ryzen 9 5900H, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) Sapphire Pulse RX5700XT with the following display types: 4k 60hz (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA) Reference AMD RX6800 with the following display types: 4k 60hz (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA) Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz, and 3x 1080p 60hz on all systems. Also tested DSC via USB-C to DP DSC Hub with 3x 4k 60hz on Ryzen 9 5900h and Ryzen 5 4500u. Tested-by: Daniel Wheeler Thank you, Dan Wheeler Technologist | AMD SW Display -- 1 Commerce Valley Dr E, Thornhill, ON L3T 7X6 Facebook | Twitter | amd.com -Original Message- From: amd-gfx On Behalf Of Rodrigo Siqueira Sent: October 24, 2021 9:31 AM To: amd-gfx@lists.freedesktop.org Cc: Wentland, Harry ; Li, Sun peng (Leo) ; Lakha, Bhawanpreet ; Siqueira, Rodrigo ; Pillai, Aurabindo ; Zhuo, Qingqing ; Lipski, Mikita ; Li, Roman ; Jacob, Anson ; Lin, Wayne ; Wang, Chao-kai (Stylon) ; Chiu, Solomon ; Kotarac, Pavle ; Gutierrez, Agustin Subject: [PATCH 00/33] DC Patches October 24, 2020 This new DC version brings improvements in the following areas: - Improvements for USB4; - Isolate FPU code for DCN20, DCN301, and DSC; - Fixes on Linking training; - Refactoring some parts of the code, such as PSR; Thanks Ahmad Othman (2): drm/amd/display: Add support for USB4 on C20 PHY for DCN3.1 drm/amd/display: fix a crash on USB4 over C20 PHY Anson Jacob (2): drm/amd/display: dcn20_resource_construct reduce scope of FPU enabled drm/amd/display: Remove unused macros Anthony Koo (2): drm/amd/display: [FW Promotion] Release 0.0.89 drm/amd/display: [FW Promotion] Release 0.0.90 Aric Cyr (4): drm/amd/display: Handle I2C-over-AUX write channel status update drm/amd/display: 3.2.158 drm/amd/display: Fix 3DLUT skipped programming drm/amd/display: 3.2.159 Dmytro Laktyushkin (3): drm/amd/display: clean up dcn31 revision check drm/amd/display: restyle dcn31 resource header inline with other asics drm/amd/display: allow windowed mpo + odm George Shen (2): drm/amd/display: Implement fixed DP drive settings drm/amd/display: Add comment for preferred_training_settings Guo, Bing (2): drm/amd/display: Get ceiling for v_total calc drm/amd/display: set Layout properly for 8ch audio at timing validation Hansen (1): drm/amd/display: Set phy_mux_sel bit in dmub scratch register Jimmy Kizito (1): drm/amd/display: Add workaround flag for EDID read on certain docks Lewis Huang (1): drm/amd/display: Align bw context with hw config when system resume Martin Leung (1): drm/amd/display: Manually adjust strobe for DCN303 Meenakshikumar Somasundaram (2): drm/amd/display: FEC configuration for dpia links drm/amd/display: FEC configuration for dpia links in MST mode Michael Strauss (2): drm/amd/display: Set i2c memory to light sleep during hw init drm/amd/display: Defer GAMCOR and DSCL power down sequence to vupdate Qingqing Zhuo (2): drm/amd/display: move FPU associated DSC code to DML folder drm/amd/display: move FPU associated DCN301 code to DML folder Robin Chen (1): drm/amd/display: dc_link_set_psr_allow_active refactoring Wenjing Liu (5): drm/amd/display: adopt DP2.0 LT SCR revision 8 drm/amd/display: implement decide lane settings drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings drm/amd/display: add two lane settings training options drm/amd/display: fix link training regression for 1 or 2 lane .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 + .../drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c | 10 +- .../drm/amd/display/dc/bios/bios_parser2.c| 2 + .../gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c | 16 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 17 +- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 118 - .../gpu/drm/amd/display/dc/core/dc_link_ddc.c | 15 +- .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 498 +++--- .../drm/amd/display/dc/core/dc_link_dpia.c| 57 +- .../gpu/drm/amd/display/dc/core/dc_resource.c | 14 +- drivers/gpu/drm/amd/display/dc/dc.h | 23 +- drivers/gpu/drm/amd/display/dc/dc_dp_types.h | 5 - drivers/gpu/drm/amd/display/dc/dc_link.h | 13 +- .../gp
[PATCH v2 2/8] drm: improve drm_buddy_alloc function
- Make drm_buddy_alloc a single function to handle range allocation and non-range allocation demands - Implemented a new function alloc_range() which allocates the requested power-of-two block comply with range limitations - Moved order computation and memory alignment logic from i915 driver to drm buddy V2: merged below changes to keep the build unbroken - drm_buddy_alloc_range() becomes obsolete and may be removed - enable ttm range allocation (fpfn / lpfn) support in i915 driver - apply enhanced drm_buddy_alloc() function to i915 driver Signed-off-by: Arunpravin --- drivers/gpu/drm/drm_buddy.c | 265 +++--- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 67 ++--- drivers/gpu/drm/i915/i915_ttm_buddy_manager.h | 2 + include/drm/drm_buddy.h | 22 +- 4 files changed, 207 insertions(+), 149 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 39eb1d224bec..406e3d521903 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -274,63 +274,6 @@ void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects) } EXPORT_SYMBOL(drm_buddy_free_list); -/** - * drm_buddy_alloc - allocate power-of-two blocks - * - * @mm: DRM buddy manager to allocate from - * @order: size of the allocation - * - * The order value here translates to: - * - * 0 = 2^0 * mm->chunk_size - * 1 = 2^1 * mm->chunk_size - * 2 = 2^2 * mm->chunk_size - * - * Returns: - * allocated ptr to the &drm_buddy_block on success - */ -struct drm_buddy_block * -drm_buddy_alloc(struct drm_buddy_mm *mm, unsigned int order) -{ - struct drm_buddy_block *block = NULL; - unsigned int i; - int err; - - for (i = order; i <= mm->max_order; ++i) { - block = list_first_entry_or_null(&mm->free_list[i], -struct drm_buddy_block, -link); - if (block) - break; - } - - if (!block) - return ERR_PTR(-ENOSPC); - - BUG_ON(!drm_buddy_block_is_free(block)); - - while (i != order) { - err = split_block(mm, block); - if (unlikely(err)) - goto out_free; - - /* Go low */ - block = block->left; - i--; - } - - mark_allocated(block); - mm->avail -= drm_buddy_block_size(mm, block); - kmemleak_update_trace(block); - return block; - -out_free: - if (i != order) - __drm_buddy_free(mm, block); - return ERR_PTR(err); -} -EXPORT_SYMBOL(drm_buddy_alloc); - static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2) { return s1 <= e2 && e1 >= s2; @@ -341,52 +284,22 @@ static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2) return s1 <= s2 && e1 >= e2; } -/** - * drm_buddy_alloc_range - allocate range - * - * @mm: DRM buddy manager to allocate from - * @blocks: output list head to add allocated blocks - * @start: start of the allowed range for this block - * @size: size of the allocation - * - * Intended for pre-allocating portions of the address space, for example to - * reserve a block for the initial framebuffer or similar, hence the expectation - * here is that drm_buddy_alloc() is still the main vehicle for - * allocations, so if that's not the case then the drm_mm range allocator is - * probably a much better fit, and so you should probably go use that instead. - * - * Note that it's safe to chain together multiple alloc_ranges - * with the same blocks list - * - * Returns: - * 0 on success, error code on failure. - */ -int drm_buddy_alloc_range(struct drm_buddy_mm *mm, - struct list_head *blocks, - u64 start, u64 size) +static struct drm_buddy_block * +alloc_range(struct drm_buddy_mm *mm, + u64 start, u64 end, + unsigned int order) { struct drm_buddy_block *block; struct drm_buddy_block *buddy; - LIST_HEAD(allocated); LIST_HEAD(dfs); - u64 end; int err; int i; - if (size < mm->chunk_size) - return -EINVAL; - - if (!IS_ALIGNED(size | start, mm->chunk_size)) - return -EINVAL; - - if (range_overflows(start, size, mm->size)) - return -EINVAL; + end = end - 1; for (i = 0; i < mm->n_roots; ++i) list_add_tail(&mm->roots[i]->tmp_link, &dfs); - end = start + size - 1; - do { u64 block_start; u64 block_end; @@ -394,31 +307,32 @@ int drm_buddy_alloc_range(struct drm_buddy_mm *mm, block = list_first_entry_or_null(&dfs, struct drm_buddy_block, tmp_link); + if (!block) break;
[PATCH v2 1/8] drm: move the buddy allocator from i915 into common drm
Move the base i915 buddy allocator code into drm - Move i915_buddy.h to include/drm - Move i915_buddy.c to drm root folder - Rename "i915" string with "drm" string wherever applicable - Rename "I915" string with "DRM" string wherever applicable - Fix header file dependencies - Fix alignment issues - add Makefile support for drm buddy - export functions and write kerneldoc description - Remove i915 selftest config check condition as buddy selftest will be moved to drm selftest folder cleanup i915 buddy references in i915 driver module and replace with drm buddy V2: - include header file in alphabetical order (Thomas) - merged changes listed in the body section into a single patch to keep the build intact (Christian, Jani) Signed-off-by: Arunpravin --- drivers/gpu/drm/Makefile | 2 +- drivers/gpu/drm/drm_buddy.c | 521 ++ drivers/gpu/drm/drm_drv.c | 3 + drivers/gpu/drm/i915/Makefile | 1 - drivers/gpu/drm/i915/i915_buddy.c | 466 drivers/gpu/drm/i915/i915_buddy.h | 143 - drivers/gpu/drm/i915/i915_module.c| 3 - drivers/gpu/drm/i915/i915_scatterlist.c | 11 +- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 33 +- drivers/gpu/drm/i915/i915_ttm_buddy_manager.h | 4 +- include/drm/drm_buddy.h | 153 + 11 files changed, 702 insertions(+), 638 deletions(-) create mode 100644 drivers/gpu/drm/drm_buddy.c delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h create mode 100644 include/drm/drm_buddy.h diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 0dff40bb863c..dc61e91a3154 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -18,7 +18,7 @@ drm-y :=drm_aperture.o drm_auth.o drm_cache.o \ drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \ drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \ drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \ - drm_managed.o drm_vblank_work.o + drm_managed.o drm_vblank_work.o drm_buddy.o drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o drm_dma.o \ drm_legacy_misc.o drm_lock.o drm_memory.o drm_scatter.o \ diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c new file mode 100644 index ..39eb1d224bec --- /dev/null +++ b/drivers/gpu/drm/drm_buddy.c @@ -0,0 +1,521 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2021 Intel Corporation + */ + +#include +#include + +#include + +static struct kmem_cache *slab_blocks; + +static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm, + struct drm_buddy_block *parent, + unsigned int order, + u64 offset) +{ + struct drm_buddy_block *block; + + BUG_ON(order > DRM_BUDDY_MAX_ORDER); + + block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL); + if (!block) + return NULL; + + block->header = offset; + block->header |= order; + block->parent = parent; + + BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED); + return block; +} + +static void drm_block_free(struct drm_buddy_mm *mm, + struct drm_buddy_block *block) +{ + kmem_cache_free(slab_blocks, block); +} + +static void mark_allocated(struct drm_buddy_block *block) +{ + block->header &= ~DRM_BUDDY_HEADER_STATE; + block->header |= DRM_BUDDY_ALLOCATED; + + list_del(&block->link); +} + +static void mark_free(struct drm_buddy_mm *mm, + struct drm_buddy_block *block) +{ + block->header &= ~DRM_BUDDY_HEADER_STATE; + block->header |= DRM_BUDDY_FREE; + + list_add(&block->link, +&mm->free_list[drm_buddy_block_order(block)]); +} + +static void mark_split(struct drm_buddy_block *block) +{ + block->header &= ~DRM_BUDDY_HEADER_STATE; + block->header |= DRM_BUDDY_SPLIT; + + list_del(&block->link); +} + +/** + * drm_buddy_init - init memory manager + * + * @mm: DRM buddy manager to initialize + * @size: size in bytes to manage + * @chunk_size: minimum page size in bytes for our allocations + * + * Initializes the memory manager and its resources. + * + * Returns: + * 0 on success, error code on failure. + */ +int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size) +{ + unsigned int i; + u64 offset; + + if (size < chunk_size) + return -EINVAL; + + if (chunk_size < PAGE_SIZE) + return -EINVAL; + + if (!is_power_of_2(chunk_size)) + return -EINVAL; + + size = round_down(size, chunk_size); + + mm->size = size; + mm->av
[PATCH 8/8] drm/amdgpu: add drm buddy support to amdgpu
- Remove drm_mm references and replace with drm buddy functionalities - Add res cursor support for drm buddy Signed-off-by: Arunpravin --- .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 97 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 239 ++ 3 files changed, 221 insertions(+), 119 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h index acfa207cf970..da12b4ff2e45 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h @@ -30,12 +30,15 @@ #include #include +#include "amdgpu_vram_mgr.h" + /* state back for walking over vram_mgr and gtt_mgr allocations */ struct amdgpu_res_cursor { uint64_tstart; uint64_tsize; uint64_tremaining; - struct drm_mm_node *node; + void*node; + uint32_tmem_type; }; /** @@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource *res, uint64_t start, uint64_t size, struct amdgpu_res_cursor *cur) { + struct drm_buddy_block *block; + struct list_head *head, *next; struct drm_mm_node *node; - if (!res || res->mem_type == TTM_PL_SYSTEM) { - cur->start = start; - cur->size = size; - cur->remaining = size; - cur->node = NULL; - WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT); - return; - } + if (!res) + goto err_out; BUG_ON(start + size > res->num_pages << PAGE_SHIFT); - node = to_ttm_range_mgr_node(res)->mm_nodes; - while (start >= node->size << PAGE_SHIFT) - start -= node++->size << PAGE_SHIFT; + cur->mem_type = res->mem_type; + + switch (cur->mem_type) { + case TTM_PL_VRAM: + head = &to_amdgpu_vram_mgr_node(res)->blocks; + + block = list_first_entry_or_null(head, +struct drm_buddy_block, +link); + if (!block) + goto err_out; + + while (start >= amdgpu_node_size(block)) { + start -= amdgpu_node_size(block); + + next = block->link.next; + if (next != head) + block = list_entry(next, struct drm_buddy_block, link); + } + + cur->start = amdgpu_node_start(block) + start; + cur->size = min(amdgpu_node_size(block) - start, size); + cur->remaining = size; + cur->node = block; + break; + case TTM_PL_TT: + node = to_ttm_range_mgr_node(res)->mm_nodes; + while (start >= node->size << PAGE_SHIFT) + start -= node++->size << PAGE_SHIFT; + + cur->start = (node->start << PAGE_SHIFT) + start; + cur->size = min((node->size << PAGE_SHIFT) - start, size); + cur->remaining = size; + cur->node = node; + break; + default: + goto err_out; + } - cur->start = (node->start << PAGE_SHIFT) + start; - cur->size = min((node->size << PAGE_SHIFT) - start, size); + return; + +err_out: + cur->start = start; + cur->size = size; cur->remaining = size; - cur->node = node; + cur->node = NULL; + WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT); + return; } /** @@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res, */ static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t size) { - struct drm_mm_node *node = cur->node; + struct drm_buddy_block *block; + struct drm_mm_node *node; + struct list_head *next; BUG_ON(size > cur->remaining); @@ -99,9 +140,27 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t size) return; } - cur->node = ++node; - cur->start = node->start << PAGE_SHIFT; - cur->size = min(node->size << PAGE_SHIFT, cur->remaining); + switch (cur->mem_type) { + case TTM_PL_VRAM: + block = cur->node; + + next = block->link.next; + block = list_entry(next, struct drm_buddy_block, link); + + cur->node = block; + cur->start = amdgpu_node_start(block); + cur->size = min(amdgpu_node_size(block), cur->remaining); + break; + case TTM_PL_TT: + node = cur->node; + + cur->node = ++node; + cur->start =
[PATCH 7/8] drm/amdgpu: move vram inline functions into a header
Move shared vram inline functions and structs into a header file Signed-off-by: Arunpravin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 51 1 file changed, 51 insertions(+) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h new file mode 100644 index ..32cd6f54b6f3 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h @@ -0,0 +1,51 @@ +/* SPDX-License-Identifier: MIT + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + */ + +#ifndef __AMDGPU_VRAM_MGR_H__ +#define __AMDGPU_VRAM_MGR_H__ + +#include + +struct amdgpu_vram_mgr_node { + struct ttm_resource base; + struct list_head blocks; + unsigned long flags; +}; + +static inline uint64_t amdgpu_node_start(struct drm_buddy_block *block) +{ + return drm_buddy_block_offset(block); +} + +static inline uint64_t amdgpu_node_size(struct drm_buddy_block *block) +{ + return PAGE_SIZE << drm_buddy_block_order(block); +} + +static inline struct amdgpu_vram_mgr_node * +to_amdgpu_vram_mgr_node(struct ttm_resource *res) +{ + return container_of(res, struct amdgpu_vram_mgr_node, base); +} + +#endif -- 2.25.1
[PATCH 6/8] drm/i915: add free_unused_pages support to i915
add drm_buddy_free_unused_pages() support on contiguous allocation Signed-off-by: Arunpravin --- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c index 963468228392..162947af8e04 100644 --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c @@ -98,6 +98,14 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man, if (unlikely(err)) goto err_free_blocks; + if (place->flags & TTM_PL_FLAG_CONTIGUOUS) { + err = drm_buddy_free_unused_pages(mm, (uint64_t)n_pages << PAGE_SHIFT, + &bman_res->blocks); + + if (unlikely(err)) + goto err_free_blocks; + } + *res = &bman_res->base; return 0; -- 2.25.1
[PATCH 5/8] drm: Implement method to free unused pages
On contiguous allocation, we round up the size to the *next* power of 2, implement a function to free the unused pages after the newly allocate block. Signed-off-by: Arunpravin --- drivers/gpu/drm/drm_buddy.c | 103 include/drm/drm_buddy.h | 4 ++ 2 files changed, 107 insertions(+) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 9d3547bcc5da..0da8510736eb 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -284,6 +284,109 @@ static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2) return s1 <= s2 && e1 >= e2; } +/** + * drm_buddy_free_unused_pages - free unused pages + * + * @mm: DRM buddy manager + * @actual_size: original size requested + * @blocks: output list head to add allocated blocks + * + * For contiguous allocation, we round up the size to the nearest + * power of two value, drivers consume *actual* size, so remaining + * portions are unused and it can be freed. + * + * Returns: + * 0 on success, error code on failure. + */ +int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm, + u64 actual_size, + struct list_head *blocks) +{ + struct drm_buddy_block *block; + struct drm_buddy_block *buddy; + u64 actual_start; + u64 actual_end; + LIST_HEAD(dfs); + u64 count = 0; + int err; + + if (!list_is_singular(blocks)) + return -EINVAL; + + block = list_first_entry_or_null(blocks, +struct drm_buddy_block, +link); + + if (!block) + return -EINVAL; + + if (actual_size > drm_buddy_block_size(mm, block)) + return -EINVAL; + + if (actual_size == drm_buddy_block_size(mm, block)) + return 0; + + list_del(&block->link); + + actual_start = drm_buddy_block_offset(block); + actual_end = actual_start + actual_size - 1; + + if (drm_buddy_block_is_allocated(block)) + mark_free(mm, block); + + list_add(&block->tmp_link, &dfs); + + while (1) { + block = list_first_entry_or_null(&dfs, +struct drm_buddy_block, +tmp_link); + + if (!block) + break; + + list_del(&block->tmp_link); + + if (count == actual_size) + return 0; + + if (contains(actual_start, actual_end, drm_buddy_block_offset(block), + (drm_buddy_block_offset(block) + drm_buddy_block_size(mm, block) - 1))) { + BUG_ON(!drm_buddy_block_is_free(block)); + + /* Allocate only required blocks */ + mark_allocated(block); + mm->avail -= drm_buddy_block_size(mm, block); + list_add_tail(&block->link, blocks); + count += drm_buddy_block_size(mm, block); + continue; + } + + if (drm_buddy_block_order(block) == 0) + continue; + + if (!drm_buddy_block_is_split(block)) { + err = split_block(mm, block); + + if (unlikely(err)) + goto err_undo; + } + + list_add(&block->right->tmp_link, &dfs); + list_add(&block->left->tmp_link, &dfs); + } + + return -ENOSPC; + +err_undo: + buddy = get_buddy(block); + if (buddy && + (drm_buddy_block_is_free(block) && +drm_buddy_block_is_free(buddy))) + __drm_buddy_free(mm, block); + return err; +} +EXPORT_SYMBOL(drm_buddy_free_unused_pages); + static struct drm_buddy_block * alloc_range(struct drm_buddy_mm *mm, u64 start, u64 end, diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index cd8021d2d6e7..1dfc80c88e1f 100644 --- a/include/drm/drm_buddy.h +++ b/include/drm/drm_buddy.h @@ -145,6 +145,10 @@ int drm_buddy_alloc(struct drm_buddy_mm *mm, struct list_head *blocks, unsigned long flags); +int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm, + u64 actual_size, + struct list_head *blocks); + void drm_buddy_free(struct drm_buddy_mm *mm, struct drm_buddy_block *block); void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects); -- 2.25.1
[PATCH 4/8] drm/i915: add top-down alloc support to i915
add top down allocation support to i915 driver Signed-off-by: Arunpravin --- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c index 75197ba6e40d..963468228392 100644 --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c @@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man, INIT_LIST_HEAD(&bman_res->blocks); bman_res->mm = mm; + if (place->flags & TTM_PL_FLAG_TOPDOWN) + bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION; + if (place->fpfn || lpfn != man->size) bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION; -- 2.25.1
[PATCH 3/8] drm: implement top-down allocation method
Implemented a function which walk through the order list, compares the offset and returns the maximum offset block, this method is unpredictable in obtaining the high range address blocks which depends on allocation and deallocation. for instance, if driver requests address at a low specific range, allocator traverses from the root block and splits the larger blocks until it reaches the specific block and in the process of splitting, lower orders in the freelist are occupied with low range address blocks and for the subsequent TOPDOWN memory request we may return the low range blocks.To overcome this issue, we may go with the below approach. The other approach, sorting each order list entries in ascending order and compares the last entry of each order list in the freelist and return the max block. This creates sorting overhead on every drm_buddy_free() request and split up of larger blocks for a single page request. Signed-off-by: Arunpravin --- drivers/gpu/drm/drm_buddy.c | 42 +++-- include/drm/drm_buddy.h | 1 + 2 files changed, 37 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 406e3d521903..9d3547bcc5da 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -362,6 +362,27 @@ alloc_range(struct drm_buddy_mm *mm, return ERR_PTR(err); } +static struct drm_buddy_block * +get_maxblock(struct list_head *head) +{ + struct drm_buddy_block *max_block = NULL, *node; + + max_block = list_first_entry_or_null(head, +struct drm_buddy_block, +link); + + if (!max_block) + return NULL; + + list_for_each_entry(node, head, link) { + if (drm_buddy_block_offset(node) > + drm_buddy_block_offset(max_block)) + max_block = node; + } + + return max_block; +} + static struct drm_buddy_block * alloc_from_freelist(struct drm_buddy_mm *mm, unsigned int order, @@ -372,13 +393,22 @@ alloc_from_freelist(struct drm_buddy_mm *mm, int err; for (i = order; i <= mm->max_order; ++i) { - if (!list_empty(&mm->free_list[i])) { - block = list_first_entry_or_null(&mm->free_list[i], -struct drm_buddy_block, -link); + if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) { + if (!list_empty(&mm->free_list[i])) { + block = get_maxblock(&mm->free_list[i]); - if (block) - break; + if (block) + break; + } + } else { + if (!list_empty(&mm->free_list[i])) { + block = list_first_entry_or_null(&mm->free_list[i], +struct drm_buddy_block, +link); + + if (block) + break; + } } } diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index c7bb5509a7ad..cd8021d2d6e7 100644 --- a/include/drm/drm_buddy.h +++ b/include/drm/drm_buddy.h @@ -28,6 +28,7 @@ }) #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0) +#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1) struct drm_buddy_block { #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12) -- 2.25.1
Re: I got an IOMMU IO page fault. What to do now?
Dear Christian, Thank you for your reply. On 25.10.21 13:23, Christian König wrote: not sure how the IOMMU gives out addresses, but the printed ones look suspicious to me. Something like we are using an invalid address like -1 or similar. Can you try that on an up to date kernel as well? E.g. ideally bleeding edge amd-staging-drm-next from Alex repository. These are production desktops, so I’d need to talk to the user. Currently, Linux 5.10.70 is running. Kind regards, Paul
Re: [PATCH 32/33] drm/amd/display: fix link training regression for 1 or 2 lane
Dear Wenjing, dear Rodrigo, On 24.10.21 15:31, Rodrigo Siqueira wrote: From: Wenjing Liu [why] We have a regression that cause maximize lane settings to use uninitialized data from unused lanes. Which commit caused the regression? Please amend the commit message. This will cause link training to fail for 1 or 2 lanes because the lane adjust is populated incorrectly sometimes. On what card did you test this, and how can it be reproduced? Please describe the fix/implemantation in the commit message. Reviewed-by: Eric Yang Acked-by: Rodrigo Siqueira Signed-off-by: Wenjing Liu --- .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 35 +-- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index 653279ab96f4..f6ba7c734f54 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -108,6 +108,9 @@ static struct dc_link_settings get_common_supported_link_settings( struct dc_link_settings link_setting_b); static void maximize_lane_settings(const struct link_training_settings *lt_settings, struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]); + static uint32_t get_cr_training_aux_rd_interval(struct dc_link *link, const struct dc_link_settings *link_settings) { @@ -734,15 +737,13 @@ void dp_decide_lane_settings( } #endif } - - /* we find the maximum of the requested settings across all lanes*/ - /* and set this maximum for all lanes*/ dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); if (lt_settings->disallow_per_lane_settings) { /* we find the maximum of the requested settings across all lanes*/ /* and set this maximum for all lanes*/ maximize_lane_settings(lt_settings, hw_lane_settings); + override_lane_settings(lt_settings, hw_lane_settings); if (lt_settings->always_match_dpcd_with_hw_lane_settings) dp_hw_to_dpcd_lane_settings(lt_settings, hw_lane_settings, dpcd_lane_settings); @@ -833,6 +834,34 @@ static void maximize_lane_settings(const struct link_training_settings *lt_setti } } +static void override_lane_settings(const struct link_training_settings *lt_settings, + struct dc_lane_settings lane_settings[LANE_COUNT_DP_MAX]) +{ + uint32_t lane; + + if (lt_settings->voltage_swing == NULL && + lt_settings->pre_emphasis == NULL && +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + lt_settings->ffe_preset == NULL && +#endif + lt_settings->post_cursor2 == NULL) + + return; + + for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) { + if (lt_settings->voltage_swing) + lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing; + if (lt_settings->pre_emphasis) + lane_settings[lane].PRE_EMPHASIS = *lt_settings->pre_emphasis; + if (lt_settings->post_cursor2) + lane_settings[lane].POST_CURSOR2 = *lt_settings->post_cursor2; +#if defined(CONFIG_DRM_AMD_DC_DP2_0) + if (lt_settings->ffe_preset) + lane_settings[lane].FFE_PRESET = *lt_settings->ffe_preset; +#endif Normally these checks should be done in C and not the preprocessor. `if CONFIG(DRM_AMD_DC_DP2_0)` or similar should work. + } +} + enum dc_status dp_get_lane_status_and_lane_adjust( struct dc_link *link, const struct link_training_settings *link_training_setting, Kind regards, Paul
Re: I got an IOMMU IO page fault. What to do now?
Hi Paul, not sure how the IOMMU gives out addresses, but the printed ones look suspicious to me. Something like we are using an invalid address like -1 or similar. Can you try that on an up to date kernel as well? E.g. ideally bleeding edge amd-staging-drm-next from Alex repository. Regards, Christian. Am 25.10.21 um 12:25 schrieb Paul Menzel: Dear Linux folks, On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages below. (GPU hang in amdgpu issue #1762 [1] might be related.) $ lspci -nn -s 05:00.0 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87) $ dmesg […] [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020] [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020] [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020] [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020] [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020] [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020] [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020] [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020] [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020] [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020] [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020] [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020] [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020] [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020] [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020] [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffa0c0 flags=0x0020] [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffbdc0 flags=0x0020] [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffa4c0 flags=0x0020] [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xff90c0 flags=0x0020] [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffabc0 flags=0x0020] As this is not reproducible, how would debugging go? (The system was rebooted in the meantime.) What options should be enabled, that next time the required information is logged, or what commands should I execute when the system is still in that state, so the bug (driver, userspace, …) can be pinpointed and fixed? Kind regards, Paul [1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"
I got an IOMMU IO page fault. What to do now?
Dear Linux folks, On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages below. (GPU hang in amdgpu issue #1762 [1] might be related.) $ lspci -nn -s 05:00.0 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87) $ dmesg […] [6318399.745242] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff0c0 flags=0x0020] [6318399.757283] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfff7c0 flags=0x0020] [6318399.769154] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe0c0 flags=0x0020] [6318399.780913] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffec0 flags=0x0020] [6318399.792734] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffe5c0 flags=0x0020] [6318399.804309] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd0c0 flags=0x0020] [6318399.816091] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffecc0 flags=0x0020] [6318399.827407] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffd3c0 flags=0x0020] [6318399.838708] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffc0c0 flags=0x0020] [6318399.850029] amdgpu :05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xffdac0 flags=0x0020] [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc1c0 flags=0x0020] [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffc8c0 flags=0x0020] [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb0c0 flags=0x0020] [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffcfc0 flags=0x0020] [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffb6c0 flags=0x0020] [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffa0c0 flags=0x0020] [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffbdc0 flags=0x0020] [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffa4c0 flags=0x0020] [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xff90c0 flags=0x0020] [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x000c address=0xffabc0 flags=0x0020] As this is not reproducible, how would debugging go? (The system was rebooted in the meantime.) What options should be enabled, that next time the required information is logged, or what commands should I execute when the system is still in that state, so the bug (driver, userspace, …) can be pinpointed and fixed? Kind regards, Paul [1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"
Re: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN
[AMD Official Use Only] Thanks, will and comment as you suggested. Regards, Tao From: Zhang, Hawking Sent: Monday, October 25, 2021 5:29 PM To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN [AMD Official Use Only] Please add inline comment to remind us remove this workaround later. Other than the patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Zhou1, Tao Sent: Monday, October 25, 2021 16:56 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking Cc: Zhou1, Tao Subject: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN Skip GPRs init in specific condition since current GPRs init algorithm only works for some CU settings. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c index 00a2b36a24b3..a1f424e2ff5f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c @@ -706,6 +706,10 @@ int gfx_v9_4_2_do_edc_gpr_workarounds(struct amdgpu_device *adev) if (!amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX)) return 0; + /* workaround for some CU settings, skip GPRs init */ + if (adev->gfx.cu_info.bitmap[1][1] == 0x3f7f) + return 0; + gfx_v9_4_2_do_sgprs_init(adev); gfx_v9_4_2_do_vgprs_init(adev); -- 2.17.1
[PATCH v2 2/2] drm/amdkfd: Remove cu mask from struct queue_properties(v2)
Actually, cu_mask has been copied to mqd memory and does't have to persist in queue_properties. Remove it from queue_properties. And use struct mqd_update_info to store such properties, then pass it to update queue operation. v2: * Rename pqm_update_queue to pqm_update_queue_properties. * Rename struct queue_update_info to struct mqd_update_info. * Rename pqm_set_cu_mask to pqm_update_mqd. Suggested-by: Felix Kuehling Signed-off-by: Lang Yu --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 31 ++- drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 1 - .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 9 +++--- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 9 +++--- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 9 +++--- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 9 +++--- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 23 +- .../amd/amdkfd/kfd_process_queue_manager.c| 20 +++- 8 files changed, 57 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 9317a2e238d0..24ebd61395d8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -405,7 +405,7 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p, mutex_lock(&p->mutex); - retval = pqm_update_queue(&p->pqm, args->queue_id, &properties); + retval = pqm_update_queue_properties(&p->pqm, args->queue_id, &properties); mutex_unlock(&p->mutex); @@ -418,7 +418,7 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct kfd_process *p, int retval; const int max_num_cus = 1024; struct kfd_ioctl_set_cu_mask_args *args = data; - struct queue_properties properties; + struct mqd_update_info minfo = {0}; uint32_t __user *cu_mask_ptr = (uint32_t __user *)args->cu_mask_ptr; size_t cu_mask_size = sizeof(uint32_t) * (args->num_cu_mask / 32); @@ -428,8 +428,8 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct kfd_process *p, return -EINVAL; } - properties.cu_mask_count = args->num_cu_mask; - if (properties.cu_mask_count == 0) { + minfo.cu_mask.count = args->num_cu_mask; + if (minfo.cu_mask.count == 0) { pr_debug("CU mask cannot be 0"); return -EINVAL; } @@ -438,32 +438,33 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct kfd_process *p, * limit of max_num_cus bits. We can then just drop any CU mask bits * past max_num_cus bits and just use the first max_num_cus bits. */ - if (properties.cu_mask_count > max_num_cus) { + if (minfo.cu_mask.count > max_num_cus) { pr_debug("CU mask cannot be greater than 1024 bits"); - properties.cu_mask_count = max_num_cus; + minfo.cu_mask.count = max_num_cus; cu_mask_size = sizeof(uint32_t) * (max_num_cus/32); } - properties.cu_mask = kzalloc(cu_mask_size, GFP_KERNEL); - if (!properties.cu_mask) + minfo.cu_mask.ptr = kzalloc(cu_mask_size, GFP_KERNEL); + if (!minfo.cu_mask.ptr) return -ENOMEM; - retval = copy_from_user(properties.cu_mask, cu_mask_ptr, cu_mask_size); + retval = copy_from_user(minfo.cu_mask.ptr, cu_mask_ptr, cu_mask_size); if (retval) { pr_debug("Could not copy CU mask from userspace"); - kfree(properties.cu_mask); - return -EFAULT; + retval = -EFAULT; + goto out; } + minfo.update_flag = UPDATE_FLAG_CU_MASK; + mutex_lock(&p->mutex); - retval = pqm_set_cu_mask(&p->pqm, args->queue_id, &properties); + retval = pqm_update_mqd(&p->pqm, args->queue_id, &minfo); mutex_unlock(&p->mutex); - if (retval) - kfree(properties.cu_mask); - +out: + kfree(minfo.cu_mask.ptr); return retval; } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c index a2b77d1df854..64b4ac339904 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c @@ -136,7 +136,6 @@ static bool kq_initialize(struct kernel_queue *kq, struct kfd_dev *dev, prop.write_ptr = (uint32_t *) kq->wptr_gpu_addr; prop.eop_ring_buffer_address = kq->eop_gpu_addr; prop.eop_ring_buffer_size = PAGE_SIZE; - prop.cu_mask = NULL; if (init_queue(&kq->queue, &prop) != 0) goto err_init_queue; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c index 00bcaa11ff57..8128f4d312f1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c @@ -42,16 +42,17 @@ static inline struct cik_sd
[PATCH v2 1/2] drm/amdkfd: Add an optional argument into update queue operation(v2)
Currently, queue is updated with data in queue_properties. And all allocated resource in queue_properties will not be freed until the queue is destroyed. But some properties(e.g., cu mask) bring some memory management headaches(e.g., memory leak) and make code complex. Actually they have been copied to mqd and don't have to persist in queue_properties. Add an argument into update queue to pass such properties, then we can remove them from queue_properties. v2: Don't use void *. Suggested-by: Felix Kuehling Signed-off-by: Lang Yu --- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 5 ++-- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h | 3 ++- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 23 +--- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 10 --- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 10 --- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 26 +++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 ++ .../amd/amdkfd/kfd_process_queue_manager.c| 6 ++--- 9 files changed, 52 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index f8fce9d05f50..533b27b35fc9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -557,7 +557,8 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm, return retval; } -static int update_queue(struct device_queue_manager *dqm, struct queue *q) +static int update_queue(struct device_queue_manager *dqm, struct queue *q, + struct mqd_update_info *minfo) { int retval = 0; struct mqd_manager *mqd_mgr; @@ -605,7 +606,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q) } } - mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties); + mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties, minfo); /* * check active state vs. the previous state and modify diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h index c8719682c4da..499fc0ea387f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h @@ -93,7 +93,7 @@ struct device_queue_manager_ops { struct queue *q); int (*update_queue)(struct device_queue_manager *dqm, - struct queue *q); + struct queue *q, struct mqd_update_info *minfo); int (*register_process)(struct device_queue_manager *dqm, struct qcm_process_device *qpd); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h index 6e6918ccedfd..965e17c5dbb4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h @@ -80,7 +80,8 @@ struct mqd_manager { struct mm_struct *mms); void(*update_mqd)(struct mqd_manager *mm, void *mqd, - struct queue_properties *q); + struct queue_properties *q, + struct mqd_update_info *minfo); int (*destroy_mqd)(struct mqd_manager *mm, void *mqd, enum kfd_preempt_type type, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c index 064914e1e8d6..00bcaa11ff57 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c @@ -135,7 +135,7 @@ static void init_mqd(struct mqd_manager *mm, void **mqd, *mqd = m; if (gart_addr) *gart_addr = addr; - mm->update_mqd(mm, m, q); + mm->update_mqd(mm, m, q, NULL); } static void init_mqd_sdma(struct mqd_manager *mm, void **mqd, @@ -152,7 +152,7 @@ static void init_mqd_sdma(struct mqd_manager *mm, void **mqd, if (gart_addr) *gart_addr = mqd_mem_obj->gpu_addr; - mm->update_mqd(mm, m, q); + mm->update_mqd(mm, m, q, NULL); } static void free_mqd(struct mqd_manager *mm, void *mqd, @@ -185,7 +185,8 @@ static int load_mqd_sdma(struct mqd_manager *mm, void *mqd, } static void __update_mqd(struct mqd_manager *mm, void *mqd, - struct queue_properties *q, unsigned int atc_bit) + struct queue_properties *q, struct mqd_update_info *minfo, + unsigned int atc_bit) { struct cik_mqd *m; @@ -221,9 +222,10 @@ static void __update_mqd(struct mqd_manager *mm, void *mqd, } static void update_mqd(struct mqd_manager *mm, void *mqd, -
RE: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN
[AMD Official Use Only] Please add inline comment to remind us remove this workaround later. Other than the patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Zhou1, Tao Sent: Monday, October 25, 2021 16:56 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking Cc: Zhou1, Tao Subject: [PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN Skip GPRs init in specific condition since current GPRs init algorithm only works for some CU settings. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c index 00a2b36a24b3..a1f424e2ff5f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c @@ -706,6 +706,10 @@ int gfx_v9_4_2_do_edc_gpr_workarounds(struct amdgpu_device *adev) if (!amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX)) return 0; + /* workaround for some CU settings, skip GPRs init */ + if (adev->gfx.cu_info.bitmap[1][1] == 0x3f7f) + return 0; + gfx_v9_4_2_do_sgprs_init(adev); gfx_v9_4_2_do_vgprs_init(adev); -- 2.17.1
RE: [PATCH 2/2] drm/amdkfd: Remove cu mask from struct queue_properties
[AMD Official Use Only] >-Original Message- >From: Kuehling, Felix >Sent: Friday, October 22, 2021 1:11 AM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Huang, Ray > >Subject: Re: [PATCH 2/2] drm/amdkfd: Remove cu mask from struct >queue_properties > >Am 2021-10-15 um 4:48 a.m. schrieb Lang Yu: >> +enum queue_update_flag { >> +UPDATE_FLAG_PROPERTITY = 0, >> +UPDATE_FLAG_CU_MASK, >> +}; >> + >> +struct queue_update_info { >> +union { >> +struct queue_properties properties; >> +struct { >> +uint32_t count; /* Must be a multiple of 32 */ >> +uint32_t *ptr; >> +} cu_mask; >> +}; >> + >> +enum queue_update_flag update_flag; >> +}; >> + > >This doesn't make sense to me. As I understand it, queue_update_info is for >information that is not stored in queue_properties but only in the MQDs. >Therefore, it should not include the queue_properties. > >All the low level functions in the MQD managers get both the queue_properties >and the queue_update_info. So trying to wrap both in the same union doesn't >make sense there either. > >I think you only need this because you tried to generalize pqm_update_queue to >handle both updates to queue_properties and CU mask updates with a single >argument. IMO this does not make the interface any clearer. I think it would be >more straight-forward to keep a separate pqm_set_cu_mask function that takes >a queue_update_info parameter. If you're looking for more generic names, I >suggest the following: > > * Rename pqm_update_queue to pqm_update_queue_properties > * Rename struct queue_update_info to struct mqd_update_info > * Rename pqm_set_cu_mask to pqm_update_mqd. For now this is only used >for CU mask (the union has only one struct member for now). It may >be used for other MQD properties that don't need to be stored in >queue_properties in the future Got it. Thanks for your suggestions! Regards, Lang > >Regards, > Felix >
[PATCH] drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN
Skip GPRs init in specific condition since current GPRs init algorithm only works for some CU settings. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c index 00a2b36a24b3..a1f424e2ff5f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c @@ -706,6 +706,10 @@ int gfx_v9_4_2_do_edc_gpr_workarounds(struct amdgpu_device *adev) if (!amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__GFX)) return 0; + /* workaround for some CU settings, skip GPRs init */ + if (adev->gfx.cu_info.bitmap[1][1] == 0x3f7f) + return 0; + gfx_v9_4_2_do_sgprs_init(adev); gfx_v9_4_2_do_vgprs_init(adev); -- 2.17.1
RE: [PATCH 1/2] drm/amdkfd: Add an optional argument into update queue operation
[AMD Official Use Only] >-Original Message- >From: Kuehling, Felix >Sent: Friday, October 22, 2021 12:46 AM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Huang, Ray > >Subject: Re: [PATCH 1/2] drm/amdkfd: Add an optional argument into update >queue operation > > >Am 2021-10-15 um 4:48 a.m. schrieb Lang Yu: >> Currently, queue is updated with data stored in queue_properties. >> And all allocated resource in queue_properties will not be freed until >> the queue is destroyed. >> >> But some properties(e.g., cu mask) bring some memory management >> headaches(e.g., memory leak) and make code complex. Actually they >> don't have to persist in queue_properties. >> >> So add an argument into update queue to pass such properties and >> remove them from queue_properties. >> >> Signed-off-by: Lang Yu >> --- >> .../drm/amd/amdkfd/kfd_device_queue_manager.c | 4 ++-- >> .../drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +- >> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h | 2 +- >> .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 18 +++ >> .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 8 +++ >> .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 8 +++ >> .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 22 +-- >> .../amd/amdkfd/kfd_process_queue_manager.c| 6 ++--- >> 8 files changed, 35 insertions(+), 35 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c >> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c >> index f8fce9d05f50..7f6f4937eedb 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c >> @@ -557,7 +557,7 @@ static int destroy_queue_nocpsch(struct >device_queue_manager *dqm, >> return retval; >> } >> >> -static int update_queue(struct device_queue_manager *dqm, struct >> queue *q) >> +static int update_queue(struct device_queue_manager *dqm, struct >> +queue *q, void *args) > >Please don't use a void * here. If you don't want to declare the struct >queue_update_info in this patch, you can just declare it as an abstract >type: > >struct queue_update_info; > >You can cast NULL to (struct queue_update_info *) without requiring the >structure definition. Got it. Thanks! Regards, Lang > >Regards, > Felix > > >> { >> int retval = 0; >> struct mqd_manager *mqd_mgr; >> @@ -605,7 +605,7 @@ static int update_queue(struct device_queue_manager >*dqm, struct queue *q) >> } >> } >> >> -mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties); >> +mqd_mgr->update_mqd(mqd_mgr, q->mqd, &q->properties, args); >> >> /* >> * check active state vs. the previous state and modify diff --git >> a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h >> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h >> index c8719682c4da..08cfc2a2fdbb 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h >> @@ -93,7 +93,7 @@ struct device_queue_manager_ops { >> struct queue *q); >> >> int (*update_queue)(struct device_queue_manager *dqm, >> -struct queue *q); >> +struct queue *q, void *args); >> >> int (*register_process)(struct device_queue_manager *dqm, >> struct qcm_process_device *qpd); diff -- >git >> a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h >> b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h >> index 6e6918ccedfd..6ddf93629b8c 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h >> @@ -80,7 +80,7 @@ struct mqd_manager { >> struct mm_struct *mms); >> >> void(*update_mqd)(struct mqd_manager *mm, void *mqd, >> -struct queue_properties *q); >> +struct queue_properties *q, void *args); >> >> int (*destroy_mqd)(struct mqd_manager *mm, void *mqd, >> enum kfd_preempt_type type, >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c >> b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c >> index 064914e1e8d6..8bb2fd4cba41 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c >> @@ -135,7 +135,7 @@ static void init_mqd(struct mqd_manager *mm, void >**mqd, >> *mqd = m; >> if (gart_addr) >> *gart_addr = addr; >> -mm->update_mqd(mm, m, q); >> +mm->update_mqd(mm, m, q, NULL); >> } >> >> static void init_mqd_sdma(struct mqd_manager *mm, void **mqd, @@ >> -152,7 +152,7 @@ static void init_mqd_sdma(struct mqd_manager *mm, void >**mqd, >> if (gart_addr) >> *gart_addr = mqd_mem_obj->gpu_addr; >> >> -mm->update_mqd(mm, m, q); >> +mm->update_mqd(mm, m, q, NULL)