Re: [PATCH v5] drm/amd/display: Fix vblank refcount in vrr transition
On 2022-09-21 17:20, Yunxiang Li wrote:

manage_dm_interrupts disables/enables vblank using drm_crtc_vblank_off/on,
which causes drm_crtc_vblank_get in vrr_transition to fail; later, when
drm_crtc_vblank_put is called, the refcount on vblank will be messed up.
Therefore move the call to after manage_dm_interrupts.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1247
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1380
Signed-off-by: Yunxiang Li
---
v2: check the return code for calls that might fail and warn on them
v3/v4: make the sequence closer to the original and remove redundant local variables
v5: add bug tracking info

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 55 +--
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index ece2003a74cc..97cc8ceaeea0 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7484,15 +7484,15 @@ static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
 		 * We also need vupdate irq for the actual core vblank handling
 		 * at end of vblank.
 		 */
-		dm_set_vupdate_irq(new_state->base.crtc, true);
-		drm_crtc_vblank_get(new_state->base.crtc);
+		WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, true) != 0);
+		WARN_ON(drm_crtc_vblank_get(new_state->base.crtc) != 0);
 		DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
 				 __func__, new_state->base.crtc->base.id);
 	} else if (old_vrr_active && !new_vrr_active) {
 		/* Transition VRR active -> inactive:
 		 * Allow vblank irq disable again for fixed refresh rate.
 		 */
-		dm_set_vupdate_irq(new_state->base.crtc, false);
+		WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, false) != 0);
 		drm_crtc_vblank_put(new_state->base.crtc);
 		DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
 				 __func__, new_state->base.crtc->base.id);
@@ -8257,23 +8257,6 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 		mutex_unlock(>dc_lock);
 	}

-	/* Count number of newly disabled CRTCs for dropping PM refs later. */
-	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
-				      new_crtc_state, i) {
-		if (old_crtc_state->active && !new_crtc_state->active)
-			crtc_disable_count++;
-
-		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
-		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
-
-		/* For freesync config update on crtc state and params for irq */
-		update_stream_irq_parameters(dm, dm_new_crtc_state);
-
-		/* Handle vrr on->off / off->on transitions */
-		amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
-						dm_new_crtc_state);
-	}
-
 	/**
 	 * Enable interrupts for CRTCs that are newly enabled or went through
 	 * a modeset. It was intentionally deferred until after the front end
@@ -8283,16 +8266,29 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
 		struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
 #ifdef CONFIG_DEBUG_FS
-		bool configure_crc = false;
 		enum amdgpu_dm_pipe_crc_source cur_crc_src;
 #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
-		struct crc_rd_work *crc_rd_wrk = dm->crc_rd_wrk;
+		struct crc_rd_work *crc_rd_wrk;
+#endif
+#endif
+		/* Count number of newly disabled CRTCs for dropping PM refs later. */
+		if (old_crtc_state->active && !new_crtc_state->active)
+			crtc_disable_count++;
+
+		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+		/* For freesync config update on crtc state and params for irq */
+		update_stream_irq_parameters(dm, dm_new_crtc_state);
+
+#ifdef CONFIG_DEBUG_FS
+#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
+		crc_rd_wrk = dm->crc_rd_wrk;
 #endif
 		spin_lock_irqsave(_to_drm(adev)->event_lock, flags);
 		cur_crc_src = acrtc->dm_irq_params.crc_src;
 		spin_unlock_irqrestore(_to_drm(adev)->event_lock, flags);
 #endif
-		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);

 		if (new_crtc_state->active &&
 		    (!old_crtc_state->active ||
@@ -8300,16 +8296,19 @@ static
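The refcount bug this patch addresses comes from drm_crtc_vblank_off() invalidating an earlier drm_crtc_vblank_get(), so the later put underflows. The invariant can be sketched with a toy userspace model (all names here are illustrative stand-ins, not the real DRM API):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a vblank reference counter: get() fails while the
 * counter is disabled (as drm_crtc_vblank_get() does after
 * drm_crtc_vblank_off()), and put() must never run without a
 * matching successful get(). */
struct toy_vblank {
	bool enabled;
	int refcount;
};

static int toy_vblank_get(struct toy_vblank *v)
{
	if (!v->enabled)
		return -1;	/* get fails on a disabled CRTC */
	v->refcount++;
	return 0;
}

static int toy_vblank_put(struct toy_vblank *v)
{
	if (v->refcount <= 0)
		return -1;	/* underflow: the bug the patch fixes */
	v->refcount--;
	return 0;
}
```

In this model, calling get() before the CRTC is re-enabled (the old ordering) fails, and the later put() underflows; the patch reorders the transition handling so get() runs only after manage_dm_interrupts has re-enabled vblank.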
[PATCH] drm/amd/amdgpu: Update debugfs SRBM/GRBM logic
Currently our debugfs bank selection logic only supports a single GC instance. This updates the functions amdgpu_gfx_select_se_sh() and amdgpu_gfx_select_me_pipe_q() to support a GC instance parameter and, ultimately, a GC instance selection via the IOCTL to debugfs.

Signed-off-by: Tom St Denis
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 44 +++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h     |  7 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_umr.h     | 15 ++-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c      | 17 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c      | 17 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c       | 18 +++--
 9 files changed, 98 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 6066aebf491c..3a09028277f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -139,7 +139,7 @@ static int amdgpu_debugfs_process_reg_op(bool read, struct file *f,
 					sh_bank, instance_bank);
 	} else if (use_ring) {
 		mutex_lock(>srbm_mutex);
-		amdgpu_gfx_select_me_pipe_q(adev, me, pipe, queue, vmid);
+		amdgpu_gfx_select_me_pipe_q(adev, me, pipe, queue, vmid, 0);
 	}

 	if (pm_pg_lock)
@@ -172,7 +172,7 @@ static int amdgpu_debugfs_process_reg_op(bool read, struct file *f,
 		amdgpu_gfx_select_se_sh(adev, 0x, 0x, 0x);
 		mutex_unlock(>grbm_idx_mutex);
 	} else if (use_ring) {
-		amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0);
+		amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0, 0);
 		mutex_unlock(>srbm_mutex);
 	}
@@ -261,15 +261,15 @@ static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 off
 			return -EINVAL;
 		}
 		mutex_lock(>grbm_idx_mutex);
-		amdgpu_gfx_select_se_sh(adev, rd->id.grbm.se,
+		amdgpu_gfx_select_se_sh_instanced(adev, rd->id.grbm.se,
 					rd->id.grbm.sh,
-					rd->id.grbm.instance);
+					rd->id.grbm.instance, rd->id.gc_instance);
 	}

 	if (rd->id.use_srbm) {
 		mutex_lock(>srbm_mutex);
 		amdgpu_gfx_select_me_pipe_q(adev, rd->id.srbm.me, rd->id.srbm.pipe,
-					    rd->id.srbm.queue, rd->id.srbm.vmid);
+					    rd->id.srbm.queue, rd->id.srbm.vmid, rd->id.gc_instance);
 	}

 	if (rd->id.pg_lock)
@@ -295,12 +295,12 @@ static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 off
 	}
 end:
 	if (rd->id.use_grbm) {
-		amdgpu_gfx_select_se_sh(adev, 0x, 0x, 0x);
+		amdgpu_gfx_select_se_sh_instanced(adev, 0x, 0x, 0x, 0);
 		mutex_unlock(>grbm_idx_mutex);
 	}

 	if (rd->id.use_srbm) {
-		amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0);
+		amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0, 0);
 		mutex_unlock(>srbm_mutex);
 	}
@@ -319,17 +319,39 @@ static ssize_t amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, u32 off
 static long amdgpu_debugfs_regs2_ioctl(struct file *f, unsigned int cmd, unsigned long data)
 {
 	struct amdgpu_debugfs_regs2_data *rd = f->private_data;
+	struct amdgpu_debugfs_regs2_iocdata v1_data;
 	int r;

+	mutex_lock(>lock);
+
 	switch (cmd) {
+	case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE_V2:
+		r = copy_from_user(>id, (struct amdgpu_debugfs_regs2_iocdata_v2 *)data, sizeof rd->id);
+		if (r)
+			return r ? -EINVAL : 0;
+		goto done;
 	case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE:
-		mutex_lock(>lock);
-		r = copy_from_user(>id, (struct amdgpu_debugfs_regs2_iocdata *)data, sizeof rd->id);
-		mutex_unlock(>lock);
-		return r ? -EINVAL : 0;
+		r = copy_from_user(_data, (struct amdgpu_debugfs_regs2_iocdata *)data, sizeof v1_data);
+		if (r)
+			return r ? -EINVAL : 0;
+		goto v1_copy;
 	default:
 		return -EINVAL;
 	}
+
+v1_copy:
+	rd->id.use_srbm = v1_data.use_srbm;
+	rd->id.use_grbm = v1_data.use_grbm;
+	rd->id.pg_lock = v1_data.pg_lock;
+	rd->id.grbm.se = v1_data.grbm.se;
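The ioctl rework above keeps the old SET_STATE command working by copying the v1 struct into the v2 state field by field, defaulting the new GC-instance field to 0. That compatibility pattern can be sketched as follows (the toy_* struct names are made up for illustration, not the real amdgpu iocdata layouts):

```c
#include <assert.h>
#include <string.h>

/* v1 state: no GC instance field. */
struct toy_iocdata_v1 {
	unsigned use_srbm, use_grbm, pg_lock;
};

/* v2 state: superset of v1, adds gc_instance. */
struct toy_iocdata_v2 {
	unsigned use_srbm, use_grbm, pg_lock;
	unsigned gc_instance;
};

/* v1 handler: copy each v1 field and default the new field to 0, so
 * old userspace keeps selecting GC instance 0, as in the patch. */
static void toy_set_state_v1(struct toy_iocdata_v2 *state,
			     const struct toy_iocdata_v1 *v1)
{
	memset(state, 0, sizeof(*state));
	state->use_srbm = v1->use_srbm;
	state->use_grbm = v1->use_grbm;
	state->pg_lock = v1->pg_lock;
	state->gc_instance = 0;
}
```

The field-by-field copy (rather than a memcpy of the prefix) keeps the v1 path correct even if the v2 struct's layout later diverges from a strict prefix of v1.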
Re: [PATCH] drm/amd/display: Enable dpia support for dcn314
On 10/3/2022 13:00, roman...@amd.com wrote:
> From: Roman Li
>
> [Why]
> DCN 3.1.4 supports DPIA.
>
> [How]
> - Set dpia_supported flag for dcn314 in dmub_hw_init()
> - Remove comment that becomes irrelevant after this change.
>
> Signed-off-by: Roman Li
> Reviewed-by: Nicholas Kazlauskas

FYI: as this is just adding the ID, I think this is a good stable candidate.

Cc: sta...@vger.kernel.org # 6.0
Reviewed-by: Mario Limonciello

> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index ece2003a74cc..8471c21496c9 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1105,7 +1105,8 @@ static int dm_dmub_hw_init(struct amdgpu_device *adev)
>  		hw_params.fb[i] = _info->fb[i];
>
>  	switch (adev->ip_versions[DCE_HWIP][0]) {
> -	case IP_VERSION(3, 1, 3): /* Only for this asic hw internal rev B0 */
> +	case IP_VERSION(3, 1, 3):
> +	case IP_VERSION(3, 1, 4):
>  		hw_params.dpia_supported = true;
>  		hw_params.disable_dpia = adev->dm.dc->debug.dpia_debug.bits.disable_dpia;
>  		break;
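The patch enables the capability simply by adding a fall-through case next to the existing one. The per-IP-version selection pattern can be sketched like this (TOY_IP_VERSION is an illustrative stand-in for the kernel's IP_VERSION encoding, which lives in amdgpu headers):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for IP_VERSION(maj, min, rev). */
#define TOY_IP_VERSION(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

/* Fall-through cases mark every DCN revision that supports DPIA,
 * mirroring how the patch adds IP_VERSION(3, 1, 4) beside (3, 1, 3). */
static bool toy_dpia_supported(unsigned int ip_version)
{
	switch (ip_version) {
	case TOY_IP_VERSION(3, 1, 3):
	case TOY_IP_VERSION(3, 1, 4):
		return true;
	default:
		return false;
	}
}
```

Keeping the capability decision in one switch makes adding a new ASIC a one-line change, which is also why the reviewers consider the patch a safe stable backport.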
Re: [PATCH v2 1/8] mm/memory.c: Fix race when faulting a device private page
On 2022-10-02 20:53, Alistair Popple wrote:
> Felix Kuehling writes:
>> On 2022-09-28 08:01, Alistair Popple wrote:
>>> When the CPU tries to access a device private page the migrate_to_ram()
>>> callback associated with the pgmap for the page is called. However no
>>> reference is taken on the faulting page. Therefore a concurrent
>>> migration of the device private page can free the page and possibly the
>>> underlying pgmap. This results in a race which can crash the kernel due
>>> to the migrate_to_ram() function pointer becoming invalid. It also means
>>> drivers can't reliably read the zone_device_data field because the page
>>> may have been freed with memunmap_pages().
>>>
>>> Close the race by getting a reference on the page while holding the ptl
>>> to ensure it has not been freed. Unfortunately the elevated reference
>>> count will cause the migration required to handle the fault to fail. To
>>> avoid this failure pass the faulting page into the migrate_vma functions
>>> so that if an elevated reference count is found it can be checked to see
>>> if it's expected or not.
>>
>> Do we really have to drag the fault_page all the way into the migrate
>> structure? Is the elevated refcount still needed at that time? Maybe it
>> would be easier to drop the refcount early in the ops->migrate_to_ram
>> callbacks, so we won't have to deal with it in all the migration code.
>
> That would also work. Honestly I don't really like either solution :-)

Then we agree. :)

> I didn't like having to plumb it all through the migration code but I
> ended up going this way because I felt it was easier to explain the life
> time of vmf->page for the migrate_to_ram() callback. This way vmf->page
> is guaranteed to be valid for the duration of the migrate_to_ram()
> callback.
>
> As you suggest we could instead have drivers call put_page(vmf->page)
> somewhere in migrate_to_ram() before calling migrate_vma_setup(). The
> reason I didn't go this way is IMHO it's more subtle because in general
> the page will remain valid after that put_page() anyway. So it would be
> easy for drivers to introduce a bug assuming the vmf->page is still
> valid and not notice because most of the time it is.

I guess I'm biased because my migrate_to_ram implementation doesn't use the vmf->page at all. I agree that dropping the refcount in the callback is subtle. But handling an elevated refcount for just one special page in the migration code also looks a bit fragile to me.

It's not my call to make. But my preference is very clear. Either way, if the decision is to go with your solution, then you have my

Acked-by: Felix Kuehling

> Let me know if you disagree with my reasoning though - would appreciate
> any review here.

Regards,
  Felix

>>> Signed-off-by: Alistair Popple
>>> Cc: Jason Gunthorpe
>>> Cc: John Hubbard
>>> Cc: Ralph Campbell
>>> Cc: Michael Ellerman
>>> Cc: Felix Kuehling
>>> Cc: Lyude Paul
>>> ---
>>>  arch/powerpc/kvm/book3s_hv_uvmem.c       | 15 ++-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 +++--
>>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |  2 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 11 +---
>>>  include/linux/migrate.h                  |  8 ++-
>>>  lib/test_hmm.c                           |  7 ++---
>>>  mm/memory.c                              | 16 +++-
>>>  mm/migrate.c                             | 34 ++---
>>>  mm/migrate_device.c                      | 18 +
>>>  9 files changed, 87 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> index 5980063..d4eacf4 100644
>>> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> @@ -508,10 +508,10 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
>>>  static int __kvmppc_svm_page_out(struct vm_area_struct *vma,
>>>  		unsigned long start,
>>>  		unsigned long end, unsigned long page_shift,
>>> -		struct kvm *kvm, unsigned long gpa)
>>> +		struct kvm *kvm, unsigned long gpa, struct page *fault_page)
>>>  {
>>>  	unsigned long src_pfn, dst_pfn = 0;
>>> -	struct migrate_vma mig;
>>> +	struct migrate_vma mig = { 0 };
>>>  	struct page *dpage, *spage;
>>>  	struct kvmppc_uvmem_page_pvt *pvt;
>>>  	unsigned long pfn;
>>> @@ -525,6 +525,7 @@ static int __kvmppc_svm_page_out(struct vm_area_struct *vma,
>>>  	mig.dst = _pfn;
>>>  	mig.pgmap_owner = _uvmem_pgmap;
>>>  	mig.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
>>> +	mig.fault_page = fault_page;
>>>
>>>  	/* The requested page is already paged-out, nothing to do */
>>>  	if (!kvmppc_gfn_is_uvmem_pfn(gpa >> page_shift, kvm, NULL))
>>> @@ -580,12 +581,14 @@ static int __kvmppc_svm_page_out(struct vm_area_struct *vma,
>>>  static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
>>>  		unsigned long start, unsigned long end,
>>>  		unsigned long page_shift,
>>> -		struct kvm *kvm, unsigned long gpa)
>>> +
[PATCH 1/1] drm/amdgpu: Set vmbo destroy after pt bo is created
Under VRAM usage pressure, map to GPU may fail to create the pt bo, in which case vmbo->shadow_list is never initialized; ttm_bo_release then calls amdgpu_bo_vm_destroy, which accesses vmbo->shadow_list and produces the dmesg output and NULL pointer access backtrace below.

Set the vmbo destroy callback to amdgpu_bo_vm_destroy only after creating the pt bo successfully; otherwise use the default callback amdgpu_bo_destroy.

amdgpu: amdgpu_vm_bo_update failed
amdgpu: update_gpuvm_pte() failed
amdgpu: Failed to map bo to gpuvm
amdgpu :43:00.0: amdgpu: Failed to map peer::43:00.0 mem_domain:2
BUG: kernel NULL pointer dereference, address:
RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu]
Call Trace:
 ttm_bo_release+0x207/0x320 [amdttm]
 amdttm_bo_init_reserved+0x1d6/0x210 [amdttm]
 amdgpu_bo_create+0x1ba/0x520 [amdgpu]
 amdgpu_bo_create_vm+0x3a/0x80 [amdgpu]
 amdgpu_vm_pt_create+0xde/0x270 [amdgpu]
 amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu]
 amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu]
 amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu]
 update_gpuvm_pte+0x160/0x420 [amdgpu]
 amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu]
 kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu]
 kfd_ioctl+0x24a/0x5b0 [amdgpu]

Signed-off-by: Philip Yang
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 4570ad449390..ae924db72b62 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -688,11 +688,11 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev,
 	 * num of amdgpu_vm_pt entries.
 	 */
 	BUG_ON(bp->bo_ptr_size < sizeof(struct amdgpu_bo_vm));
-	bp->destroy = _bo_vm_destroy;
 	r = amdgpu_bo_create(adev, bp, _ptr);
 	if (r)
 		return r;

+	bo_ptr->tbo.destroy = _bo_vm_destroy;
 	*vmbo_ptr = to_amdgpu_bo_vm(bo_ptr);
 	INIT_LIST_HEAD(&(*vmbo_ptr)->shadow_list);
 	return r;
--
2.35.1
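The fix is an instance of a general pattern: install a specialized destructor only after every field it depends on is initialized, so a failed create tears down through a default destructor that touches nothing extra. A toy userspace model (all names are illustrative, not the amdgpu/TTM API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_bo {
	void (*destroy)(struct toy_bo *bo);
	int *list;	/* stands in for vmbo->shadow_list; NULL until init */
};

static int default_destroy_ran, vm_destroy_ran;

static void toy_bo_destroy(struct toy_bo *bo)
{
	(void)bo;
	default_destroy_ran++;	/* safe: never touches bo->list */
}

static void toy_bo_vm_destroy(struct toy_bo *bo)
{
	vm_destroy_ran++;
	*bo->list = 0;		/* would deref NULL if list was never set up */
}

/* Create with the default destructor installed; switch to the
 * specialized one only after everything it relies on exists,
 * mirroring the ordering the patch enforces. */
static int toy_bo_create_vm(struct toy_bo *bo, bool fail_pt_alloc)
{
	static int storage;

	bo->destroy = toy_bo_destroy;
	bo->list = NULL;
	if (fail_pt_alloc) {
		bo->destroy(bo);	/* failure path: default destroy, no crash */
		return -1;
	}
	bo->list = &storage;
	bo->destroy = toy_bo_vm_destroy;	/* now safe to specialize */
	return 0;
}
```

With the original ordering (specialized destructor set before create), the failure path would run toy_bo_vm_destroy on the half-initialized object, which is exactly the NULL dereference in the backtrace.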
Re: [PATCH] drm/amd/display: disable psr whenever applicable
Ping!

Regards,
Shirish S

On 9/30/2022 7:17 PM, S, Shirish wrote:
> On 9/30/2022 6:59 PM, Harry Wentland wrote:
>> +Leo
>>
>> On 9/30/22 06:27, Shirish S wrote:
>>> [Why]
>>> psr feature continues to be enabled for non capable links.
>>
>> Do you have more info on what issues you're seeing with this?
>
> Code-wise, without this change we end up setting the
> "vblank_disable_immediate" parameter to false for the failing links also.
> Issue-wise, there is a remote chance of this leading to eDP/connected
> monitor not lighting up.
>
>>> [How]
>>> disable the feature on links that are not capable of the same.
>>>
>>> Signed-off-by: Shirish S
>>> ---
>>>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c | 10 --
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
>>> index 8ca10ab3dfc1..f73af028f312 100644
>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
>>> @@ -60,11 +60,17 @@ static bool link_supports_psrsu(struct dc_link *link)
>>>   */
>>>  void amdgpu_dm_set_psr_caps(struct dc_link *link)
>>>  {
>>> -	if (!(link->connector_signal & SIGNAL_TYPE_EDP))
>>> +	if (!(link->connector_signal & SIGNAL_TYPE_EDP)) {
>>> +		DRM_ERROR("Disabling PSR as connector is not eDP\n")
>>
>> I don't think we should log an error here.
>
> My objective of logging an error was to inform the user/developer that
> this boot's PSR enablement had issues. I am fine with moving it to INFO
> or removing it, if you insist.
>
> Thanks for your comments.
>
> Regards,
> Shirish S
>
>>> +		link->psr_settings.psr_feature_enabled = false;
>>>  		return;
>>> +	}
>>>
>>> -	if (link->type == dc_connection_none)
>>> +	if (link->type == dc_connection_none) {
>>> +		DRM_ERROR("Disabling PSR as eDP connection type is invalid\n")
>>
>> Same here, this doesn't warrant an error log.
>>
>> Harry
>>
>>> +		link->psr_settings.psr_feature_enabled = false;
>>>  		return;
>>> +	}
>>>
>>>  	if (link->dpcd_caps.psr_info.psr_version == 0) {
>>>  		link->psr_settings.psr_version = DC_PSR_VERSION_UNSUPPORTED;
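The change under discussion turns bare early returns into guard clauses that also clear the feature flag, so no later consumer sees stale state. A minimal sketch of that pattern with toy stand-ins for the dc_link fields:

```c
#include <assert.h>
#include <stdbool.h>

enum toy_signal { TOY_SIGNAL_EDP = 1, TOY_SIGNAL_OTHER = 2 };

struct toy_link {
	enum toy_signal signal;
	bool connected;
	bool psr_enabled;
};

/* Guard-clause pattern from the patch: every bail-out path explicitly
 * clears the capability flag instead of leaving whatever value was
 * there before. Names are illustrative, not the real DC structures. */
static void toy_set_psr_caps(struct toy_link *link)
{
	if (link->signal != TOY_SIGNAL_EDP) {
		link->psr_enabled = false;
		return;
	}
	if (!link->connected) {
		link->psr_enabled = false;
		return;
	}
	link->psr_enabled = true;
}
```

Without the explicit clears, a link object whose flag defaulted to true would keep reporting PSR support on the early-return paths, which is the behavior the patch fixes.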
[PATCH] drm: amd: clean up dcn32_fpu.c kernel-doc
Rectify multiple kernel-doc warnings in dcn32_fpu.c. E.g.:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:247: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Finds dummy_latency_index when MCLK switching using firmware based
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:484: warning: Function parameter or member 'phantom_stream' not described in 'dcn32_set_phantom_stream_timing'
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:601: warning: Function parameter or member 'dc' not described in 'dcn32_assign_subvp_pipe'
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:601: warning: Function parameter or member 'context' not described in 'dcn32_assign_subvp_pipe'
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:601: warning: Function parameter or member 'index' not described in 'dcn32_assign_subvp_pipe'
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:2140: warning: Function parameter or member 'dc' not described in 'dcn32_update_bw_bounding_box_fpu'
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:2140: warning: Function parameter or member 'bw_params' not described in 'dcn32_update_bw_bounding_box_fpu'
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:2140: warning: expecting prototype for dcn32_update_bw_bounding_box(). Prototype was for dcn32_update_bw_bounding_box_fpu() instead

Reported-by: kernel test robot
Signed-off-by: Randy Dunlap
Cc: George Shen
Cc: Alvin Lee
Cc: Nevenko Stupar
Cc: Harry Wentland
Cc: Leo Li
Cc: Rodrigo Siqueira
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Cc: Alex Deucher
Cc: Christian König
Cc: "Pan, Xinhui"
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 116 --
 1 file changed, 49 insertions(+), 67 deletions(-)

--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -243,7 +243,7 @@ void dcn32_build_wm_range_table_fpu(stru
 	clk_mgr->base.bw_params->wm_table.nv_entries[WM_D].pmfw_breakdown.max_uclk = 0x;
 }

-/**
+/*
  * Finds dummy_latency_index when MCLK switching using firmware based
  * vblank stretch is enabled. This function will iterate through the
  * table of dummy pstate latencies until the lowest value that allows
@@ -290,15 +290,14 @@ int dcn32_find_dummy_latency_index_for_f
 /**
  * dcn32_helper_populate_phantom_dlg_params - Get DLG params for phantom pipes
  * and populate pipe_ctx with those params.
- *
- * This function must be called AFTER the phantom pipes are added to context
- * and run through DML (so that the DLG params for the phantom pipes can be
- * populated), and BEFORE we program the timing for the phantom pipes.
- *
  * @dc: [in] current dc state
  * @context: [in] new dc state
  * @pipes: [in] DML pipe params array
  * @pipe_cnt: [in] DML pipe count
+ *
+ * This function must be called AFTER the phantom pipes are added to context
+ * and run through DML (so that the DLG params for the phantom pipes can be
+ * populated), and BEFORE we program the timing for the phantom pipes.
  */
 void dcn32_helper_populate_phantom_dlg_params(struct dc *dc,
 					      struct dc_state *context,
@@ -331,8 +330,9 @@ void dcn32_helper_populate_phantom_dlg_p
 }

 /**
- * ***
- * dcn32_predict_pipe_split: Predict if pipe split will occur for a given DML pipe
+ * dcn32_predict_pipe_split - Predict if pipe split will occur for a given DML pipe
+ * @context: [in] New DC state to be programmed
+ * @pipe_e2e: [in] DML pipe end to end context
  *
  * This function takes in a DML pipe (pipe_e2e) and predicts if pipe split is required (both
  * ODM and MPC). For pipe split, ODM combine is determined by the ODM mode, and MPC combine is
@@ -343,12 +343,7 @@ void dcn32_helper_populate_phantom_dlg_p
  * - MPC combine is only chosen if there is no ODM combine requirements / policy in place, and
  *   MPC is required
  *
- * @param [in]: context: New DC state to be programmed
- * @param [in]: pipe_e2e: DML pipe end to end context
- *
- * @return: Number of splits expected (1 for 2:1 split, 3 for 4:1 split, 0 for no splits).
- *
- * ***
+ * Return: Number of splits expected (1 for 2:1 split, 3 for 4:1 split, 0 for no splits).
  */
 uint8_t dcn32_predict_pipe_split(struct dc_state *context,
 				 display_e2e_pipe_params_st *pipe_e2e)
@@ -504,7 +499,14 @@ void insert_entry_into_table_sorted(stru
 }

 /**
- * dcn32_set_phantom_stream_timing: Set timing params for the phantom stream
+ * dcn32_set_phantom_stream_timing - Set timing params for the phantom stream
+ * @dc: current dc state
+ * @context: new dc
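All of the fixes in this patch move comments toward the canonical kernel-doc shape: a "name - summary" line, @param lines directly after it, and a Return: section, with the long description after the parameter block. A minimal illustrative example on a hypothetical function (toy_clamp_index is not from the driver):

```c
#include <assert.h>

/**
 * toy_clamp_index - Clamp an index into a table of @len entries
 * @index: [in] candidate index
 * @len: [in] number of valid entries, must be > 0
 *
 * The long description goes after the parameter block, which is the
 * rearrangement the patch applies to
 * dcn32_helper_populate_phantom_dlg_params().
 *
 * Return: @index if it is within range, otherwise the last valid index.
 */
static int toy_clamp_index(int index, int len)
{
	return index < len ? index : len - 1;
}
```

Comments that open with "/**" but do not follow this shape should open with "/*" instead, which is the other class of fix in the patch (e.g. the dummy_latency_index comment).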
Re: [PATCH v2 1/8] mm/memory.c: Fix race when faulting a device private page
Felix Kuehling writes: > On 2022-09-28 08:01, Alistair Popple wrote: >> When the CPU tries to access a device private page the migrate_to_ram() >> callback associated with the pgmap for the page is called. However no >> reference is taken on the faulting page. Therefore a concurrent >> migration of the device private page can free the page and possibly the >> underlying pgmap. This results in a race which can crash the kernel due >> to the migrate_to_ram() function pointer becoming invalid. It also means >> drivers can't reliably read the zone_device_data field because the page >> may have been freed with memunmap_pages(). >> >> Close the race by getting a reference on the page while holding the ptl >> to ensure it has not been freed. Unfortunately the elevated reference >> count will cause the migration required to handle the fault to fail. To >> avoid this failure pass the faulting page into the migrate_vma functions >> so that if an elevated reference count is found it can be checked to see >> if it's expected or not. > > Do we really have to drag the fault_page all the way into the migrate > structure? > Is the elevated refcount still needed at that time? Maybe it would be easier > to > drop the refcount early in the ops->migrage_to_ram callbacks, so we won't have > to deal with it in all the migration code. That would also work. Honestly I don't really like either solution :-) I didn't like having to plumb it all through the migration code but I ended up going this way because I felt it was easier to explain the life time of vmf->page for the migrate_to_ram() callback. This way vmf->page is guaranteed to be valid for the duration of the migrate_to_ram() callbak. As you suggest we could instead have drivers call put_page(vmf->page) somewhere in migrate_to_ram() before calling migrate_vma_setup(). The reason I didn't go this way is IMHO it's more subtle because in general the page will remain valid after that put_page() anyway. 
So it would be easy for drivers to introduce a bug assuming the vmf->page is still valid and not notice because most of the time it is. Let me know if you disagree with my reasoning though - would appreciate any review here. > Regards, > Felix > > >> >> Signed-off-by: Alistair Popple >> Cc: Jason Gunthorpe >> Cc: John Hubbard >> Cc: Ralph Campbell >> Cc: Michael Ellerman >> Cc: Felix Kuehling >> Cc: Lyude Paul >> --- >> arch/powerpc/kvm/book3s_hv_uvmem.c | 15 ++- >> drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 +++-- >> drivers/gpu/drm/amd/amdkfd/kfd_migrate.h | 2 +- >> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +--- >> include/linux/migrate.h | 8 ++- >> lib/test_hmm.c | 7 ++--- >> mm/memory.c | 16 +++- >> mm/migrate.c | 34 ++--- >> mm/migrate_device.c | 18 + >> 9 files changed, 87 insertions(+), 41 deletions(-) >> >> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c >> b/arch/powerpc/kvm/book3s_hv_uvmem.c >> index 5980063..d4eacf4 100644 >> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c >> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c >> @@ -508,10 +508,10 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm) >> static int __kvmppc_svm_page_out(struct vm_area_struct *vma, >> unsigned long start, >> unsigned long end, unsigned long page_shift, >> -struct kvm *kvm, unsigned long gpa) >> +struct kvm *kvm, unsigned long gpa, struct page *fault_page) >> { >> unsigned long src_pfn, dst_pfn = 0; >> -struct migrate_vma mig; >> +struct migrate_vma mig = { 0 }; >> struct page *dpage, *spage; >> struct kvmppc_uvmem_page_pvt *pvt; >> unsigned long pfn; >> @@ -525,6 +525,7 @@ static int __kvmppc_svm_page_out(struct vm_area_struct >> *vma, >> mig.dst = _pfn; >> mig.pgmap_owner = _uvmem_pgmap; >> mig.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE; >> +mig.fault_page = fault_page; >> /* The requested page is already paged-out, nothing to do */ >> if (!kvmppc_gfn_is_uvmem_pfn(gpa >> page_shift, kvm, NULL)) >> @@ -580,12 +581,14 @@ static int __kvmppc_svm_page_out(struct vm_area_struct >> *vma, 
>>  static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
>>  		unsigned long start, unsigned long end,
>>  		unsigned long page_shift,
>> -		struct kvm *kvm, unsigned long gpa)
>> +		struct kvm *kvm, unsigned long gpa,
>> +		struct page *fault_page)
>>  {
>>  	int ret;
>>
>>  	mutex_lock(&kvm->arch.uvmem_lock);
>> -	ret = __kvmppc_svm_page_out(vma, start, end, page_shift, kvm, gpa);
>> +	ret = __kvmppc_svm_page_out(vma, start, end, page_shift, kvm, gpa,
>> +			fault_page);
Re: [RFC PATCH 1/1] drm/amdgpu: Fix null-ptr-deref in amdgpu_device_fini_sw()
I had done more digging into this as well; thankfully it was not forgotten. Here is additional info on the blackout (I was going to contact AMD):

The problem, as far as I could trace it, occurs in amdgpu_psp.c in the function psp_cmd_submit_buf(). The counter 'timeout' normally uses a max of about 400 iterations with usleep_range(5, 80). Under normal operation 'psp->fence_buf' will equal 'index' and the loop exits. I could not track what fills the kernel buffer object's 'virtual' pointer (psp->fence_buf); I assume it is tied to a firmware file that returns an error on read, getting stuck in a loop. The error condition I found occurs when 'psp->fence_buf' does not equal 'index', breaking from the loop with 'timeout' == 0.

Blackouts happened on about 80% of reboots, but I found that reconfiguring the kernel on 5.18.19 with CONFIG_ATA_ACPI=y drops that to about 30%. I have now avoided all blackouts by additionally setting CONFIG_SATA_PMP=y (at least a week free so far). This is pure guesswork, but maybe the ARM PSP processor depends on libata procedures, one of which causes a lockup some of the time.

Steve

On Fri, Sep 30, 2022 at 21:41, Zhang Boyang wrote:

After amdgpu_device_init() failed, adev->reset_domain may be NULL. Thus a subsequent call to amdgpu_device_fini_sw() may result in a null-ptr-deref. This patch fixes the problem by adding a NULL pointer check around the code releasing adev->reset_domain in amdgpu_device_fini_sw().
Fixes: cfbb6b004744 ("drm/amdgpu: Rework reset domain to be refcounted.")
Signed-off-by: Zhang Boyang
Link: https://lore.kernel.org/lkml/a8bce489-8ccc-aa95-3de6-f854e03ad...@suddenlinkmail.com/
Link: https://lore.kernel.org/lkml/at9whr.3z1t3vi9a2...@att.net/
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index be7aff2d4a57..204daad06b32 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4021,8 +4021,10 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 	if (adev->mman.discovery_bin)
 		amdgpu_discovery_fini(adev);
 
-	amdgpu_reset_put_reset_domain(adev->reset_domain);
-	adev->reset_domain = NULL;
+	if (adev->reset_domain) {
+		amdgpu_reset_put_reset_domain(adev->reset_domain);
+		adev->reset_domain = NULL;
+	}
 
 	kfree(adev->pci_state);
-- 
2.30.2
RE: [PATCH 1/2] drm/amdkfd: correct RB_SIZE in SDMA0_QUEUE0_RB_CNTL
[AMD Official Use Only - General]

Hi Felix,

Yes, I just realized ffs starts counting from 1. Sorry for the noise.

Best Regards,
Yifan

-----Original Message-----
From: Kuehling, Felix
Sent: Friday, September 30, 2022 10:55 PM
To: Zhang, Yifan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Ji, Ruili
Subject: Re: [PATCH 1/2] drm/amdkfd: correct RB_SIZE in SDMA0_QUEUE0_RB_CNTL

On 2022-09-30 at 02:16, Yifan Zhang wrote:
> In SDMA0_QUEUE0_RB_CNTL, queue size is 2^RB_SIZE, not 2^(RB_SIZE + 1).
>
> Signed-off-by: Yifan Zhang
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c
> index d3e2b6a599a4..03699a9ad3d9 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c
> @@ -329,7 +329,7 @@ static void update_mqd_sdma(struct mqd_manager *mm, void *mqd,
>  	struct v10_sdma_mqd *m;
>
>  	m = get_sdma_mqd(mqd);
> -	m->sdmax_rlcx_rb_cntl = (ffs(q->queue_size / sizeof(unsigned int)) - 1)
> +	m->sdmax_rlcx_rb_cntl = order_base_2(q->queue_size / 4)

I think these two are equivalent. ffs(1) == 1. order_base_2(1) == 0. You're not correcting anything. You're just writing it differently.
Regards,
Felix

>  		<< SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
>  		q->vmid << SDMA0_RLC0_RB_CNTL__RB_VMID__SHIFT |
>  		1 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_ENABLE__SHIFT |
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> index 26b53b6d673e..451fcb9bb051 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> @@ -329,7 +329,7 @@ static void update_mqd_sdma(struct mqd_manager *mm, void *mqd,
>  	struct v11_sdma_mqd *m;
>
>  	m = get_sdma_mqd(mqd);
> -	m->sdmax_rlcx_rb_cntl = (ffs(q->queue_size / sizeof(unsigned int)) - 1)
> +	m->sdmax_rlcx_rb_cntl = order_base_2(q->queue_size / 4)
>  		<< SDMA0_QUEUE0_RB_CNTL__RB_SIZE__SHIFT |
>  		q->vmid << SDMA0_QUEUE0_RB_CNTL__RB_VMID__SHIFT |
>  		1 << SDMA0_QUEUE0_RB_CNTL__RPTR_WRITEBACK_ENABLE__SHIFT |
Re: KMS atomic state sets, full vs. minimal (Re: [PATCH v3 0/6] Add support for atomic async page-flips)
On Mon, Oct 03, 2022 at 11:48:49AM +0300, Pekka Paalanen wrote:
> On Fri, 30 Sep 2022 18:45:09 +0300 Ville Syrjälä wrote:
> > On Fri, Sep 30, 2022 at 06:37:00PM +0300, Pekka Paalanen wrote:
> > > On Fri, 30 Sep 2022 18:09:55 +0300 Ville Syrjälä wrote:
> > > > That would actively discourage people from even attempting the "just dump all the state into the ioctl" approach with async flips, since even the props whose value isn't even changing would be rejected.
> > >
> > > About that.
> > >
> > > To me it looks like just a classic case of broken communication.
> > >
> > > The kernel developers assume that of course userspace will minimize the state set to only those properties that change, because...?
> > >
> > > Userspace developers think that of course the kernel will optimize the state set into minimal changes, because the kernel actually has the authoritative knowledge of what the current state is, and the driver actually knows which properties are costly and need to be optimized and which ones don't matter. It has never even occurred to me that the kernel would not compare next state to current state.
> > >
> > > No-one ever documented any expectations, did they?
> >
> > Do you really want that for async flips? Async flips can't be done atomically with anything else, so why are you even asking the kernel to do that?
>
> I'm not talking about async flips only.
>
> I'm talking about atomic commits in general. I don't think it's a good idea to make async atomic commits behave fundamentally differently from sync atomic commits wrt. what non-changing state you are allowed to list in your state set.
>
> Isn't there common DRM code to convert an atomic commit state set into state objects? It probably starts by copying the current state, and then playing through the commit state set, setting all listed properties to their new values? Why wouldn't that loop maintain the knowledge of what actually changed?

Any such book keeping is entirely ad-hoc atm. It's also not super obvious how much of it is actually useful. You have to do a real commit on the crtc anyway if the crtc (or any of its associated objects) is listed in the commit, so there's not necessarily much to be gained by tracking changes in all properties. And that behaviour again enters very muddy waters when combined with the async flip flag for the entire commit. The current approach being proposed seems to suggest that we can instead short circuit async commits when nothing has changed. That is not at all how sync atomic commits work.

> When you copy the current data, reset all changed-flags to false. When you apply each property from the commit set, compare the value and set the changed-flag only if the value changes.
>
> This manufacturing of the new tentative state set happens at atomic commit ioctl time before the ioctl returns, right? So the current state is well-defined: any previous atomic sync or async commit can be assumed to have succeeded, even if it hasn't been applied in hardware yet, if the commit ioctl for it succeeded, right?

Yes.

We could certainly try to fully track all changes in properties, but no one has measured whether there's any real benefit to doing so.

-- 
Ville Syrjälä
Intel
Re: KMS atomic state sets, full vs. minimal (Re: [PATCH v3 0/6] Add support for atomic async page-flips)
On Fri, 30 Sep 2022 18:45:09 +0300 Ville Syrjälä wrote:
> On Fri, Sep 30, 2022 at 06:37:00PM +0300, Pekka Paalanen wrote:
> > On Fri, 30 Sep 2022 18:09:55 +0300 Ville Syrjälä wrote:
> > > That would actively discourage people from even attempting the "just dump all the state into the ioctl" approach with async flips, since even the props whose value isn't even changing would be rejected.
> >
> > About that.
> >
> > To me it looks like just a classic case of broken communication.
> >
> > The kernel developers assume that of course userspace will minimize the state set to only those properties that change, because...?
> >
> > Userspace developers think that of course the kernel will optimize the state set into minimal changes, because the kernel actually has the authoritative knowledge of what the current state is, and the driver actually knows which properties are costly and need to be optimized and which ones don't matter. It has never even occurred to me that the kernel would not compare next state to current state.
> >
> > No-one ever documented any expectations, did they?
>
> Do you really want that for async flips? Async flips can't be done atomically with anything else, so why are you even asking the kernel to do that?

I'm not talking about async flips only.

I'm talking about atomic commits in general. I don't think it's a good idea to make async atomic commits behave fundamentally differently from sync atomic commits wrt. what non-changing state you are allowed to list in your state set.

Isn't there common DRM code to convert an atomic commit state set into state objects? It probably starts by copying the current state, and then playing through the commit state set, setting all listed properties to their new values? Why wouldn't that loop maintain the knowledge of what actually changed?

When you copy the current data, reset all changed-flags to false. When you apply each property from the commit set, compare the value and set the changed-flag only if the value changes.

This manufacturing of the new tentative state set happens at atomic commit ioctl time before the ioctl returns, right? So the current state is well-defined: any previous atomic sync or async commit can be assumed to have succeeded, even if it hasn't been applied in hardware yet, if the commit ioctl for it succeeded, right?

Thanks,
pq