RE: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)
[AMD Official Use Only - Internal Distribution Only]

Not sure if this is the right direction. I think user mode wants all synchronization to be explicit. Implicit sync often confuses people who don't know its history. I remember Jason from Intel is driving explicit synchronization through the Linux ecosystem, which even removes implicit sync of shared buffers.

-David

From: amd-gfx On Behalf Of Marek Olšák
Sent: Tuesday, June 9, 2020 6:58 PM
To: amd-gfx mailing list
Subject: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

Hi,

This enables a full pipeline sync for implicit sync. It's Christian's patch with the driver version bumped. With this, user mode drivers don't have to wait for idle at the end of gfx IBs.

Any concerns?

Thanks,
Marek

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace
[AMD Official Use Only - Internal Distribution Only]

That's fine with me.

-David

From: Koenig, Christian
Sent: Friday, February 21, 2020 11:33 PM
To: Deucher, Alexander; Christian König; Zhou, David(ChunMing); He, Jacob; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

I would just do this as part of the vm_flush() callback on the ring. E.g. check if the VMID you want to flush is reserved, and if yes, enable SPM. Maybe pass along a flag or something in the job to make things easier.

Christian.

On 21.02.20 at 16:31, Deucher, Alexander wrote:

[AMD Public Use]

We already have the RESERVE_VMID ioctl interface; can't we just use that internally in the kernel to update the RLC register via the ring when we schedule the relevant IB? E.g., add a new ring callback to set the SPM state, set it to the reserved VMID before we schedule the IB, and then reset it to 0 after the IB in amdgpu_ib_schedule():

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 4b2342d11520..e0db9362c6ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -185,6 +185,9 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	if (ring->funcs->insert_start)
 		ring->funcs->insert_start(ring);

+	if (ring->funcs->setup_spm)
+		ring->funcs->setup_spm(ring, job);
+
 	if (job) {
 		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
 		if (r) {
@@ -273,6 +276,9 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		return r;
 	}

+	if (ring->funcs->setup_spm)
+		ring->funcs->setup_spm(ring, NULL);
+
 	if (ring->funcs->insert_end)
 		ring->funcs->insert_end(ring);

Alex

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Friday, February 21, 2020 5:28 AM
To: Zhou, David(ChunMing) <david1.z...@amd.com>; He, Jacob <jacob...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

That would probably be a no-go, but we could enhance the kernel driver to update the RLC_SPM_VMID register with the reserved VMID. Handling that in userspace is most likely not working anyway, since the RLC registers are usually not accessible by userspace.

Regards,
Christian.

On 20.02.20 at 16:15, Zhou, David(ChunMing) wrote:

[AMD Official Use Only - Internal Distribution Only]

You can enhance amdgpu_vm_ioctl in amdgpu_vm.c to return the VMID to userspace.

-David

From: He, Jacob <jacob...@amd.com>
Sent: Thursday, February 20, 2020 10:46 PM
To: Zhou, David(ChunMing) <david1.z...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

amdgpu_vm_reserve_vmid doesn't return the reserved VMID back to user space, so there is no chance for the user mode driver to update RLC_SPM_VMID.

Thanks
Jacob

From: He, Jacob <jacob...@amd.com>
Sent: Thursday, February 20, 2020 6:20 PM
To: Zhou, David(ChunMing) <david1.z...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

Looks like amdgpu_vm_reserve_vmid could work; let me have a try at updating RLC_SPM_VMID with PM4 packets in the UMD.

Thanks
Jacob

From: Zhou, David(ChunMing) <david1.z...@amd.com>
Sent: Thursday, February 20, 2020 10:13 AM
To: Koenig, Christian <christian.koe...@amd.com>; He, Jacob <jacob...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

[AMD Official Use Only - Internal Distribution Only]

Christian is right here; that would cause many problems for simply using a VMID in the kernel. We already have a pair of interfaces for RGP, amdgpu_vm_reserve_vmid / amdgpu_vm_unreserve_vmid; I think you can use them instead of involving an additional kernel change.

-David

-----Original Message-----
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Christian König
Sent: Wednesday, February 19, 2020 7:03 PM
To: He, Jacob <jacob...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace
RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace
[AMD Official Use Only - Internal Distribution Only]

You can enhance amdgpu_vm_ioctl in amdgpu_vm.c to return the VMID to userspace.

-David

From: He, Jacob
Sent: Thursday, February 20, 2020 10:46 PM
To: Zhou, David(ChunMing); Koenig, Christian; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

amdgpu_vm_reserve_vmid doesn't return the reserved VMID back to user space, so there is no chance for the user mode driver to update RLC_SPM_VMID.

Thanks
Jacob

From: He, Jacob <jacob...@amd.com>
Sent: Thursday, February 20, 2020 6:20 PM
To: Zhou, David(ChunMing) <david1.z...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

Looks like amdgpu_vm_reserve_vmid could work; let me have a try at updating RLC_SPM_VMID with PM4 packets in the UMD.

Thanks
Jacob

From: Zhou, David(ChunMing) <david1.z...@amd.com>
Sent: Thursday, February 20, 2020 10:13 AM
To: Koenig, Christian <christian.koe...@amd.com>; He, Jacob <jacob...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

[AMD Official Use Only - Internal Distribution Only]

Christian is right here; that would cause many problems for simply using a VMID in the kernel. We already have a pair of interfaces for RGP, amdgpu_vm_reserve_vmid / amdgpu_vm_unreserve_vmid; I think you can use them instead of involving an additional kernel change.

-David

-----Original Message-----
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Christian König
Sent: Wednesday, February 19, 2020 7:03 PM
To: He, Jacob <jacob...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

On 19.02.20 at 11:15, Jacob He wrote:
> [WHY]
> When SPM trace enabled, SPM_VMID should be updated with the current
> vmid.
>
> [HOW]
> Add a chunk id, AMDGPU_CHUNK_ID_SPM_TRACE, so that UMD can tell us
> which job should update SPM_VMID.
> Right before a job is submitted to GPU, set the SPM_VMID accordingly.
>
> [Limitation]
> Running more than one SPM trace enabled processes simultaneously is
> not supported.

Well, there are multiple problems with that patch.

First of all, you need to better describe what SPM tracing is in the commit message.

Then the update of mmRLC_SPM_MC_CNTL must be executed asynchronously on the ring. Otherwise we might corrupt an already executing SPM trace.

And you also need to make sure to disable the tracing again, or otherwise we run into a bunch of trouble when the VMID is reused.

You also need to make sure that IBs using the SPM trace are serialized with each other, e.g. hack into the amdgpu_ids.c file and make sure that only one VMID at a time can have that attribute.

Regards,
Christian.

>
> Change-Id: Ic932ef6ac9dbf244f03aaee90550e8ff3a675666
> Signed-off-by: Jacob He <jacob...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 10 +++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 15 ++-
>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 15 ++-
>  8 files changed, 48 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index f9fa6e104fef..3f32c4db5232 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -113,6 +113,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
>  	uint32_t uf_offset = 0;
>  	int i;
>  	int ret;
> +	bool update_spm_vmid = false;
>
>  	if (cs->in.num_chunks == 0)
>  		return 0;
> @@ -221,6 +222,10 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
>  	case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL:
>  		break;
>
> +	case AMDGPU_CHUNK_ID_SPM_TRACE:
> +		update_spm_vmid = true;
> +		break;
> +
>  	default:
>  		ret = -EINVAL;
>  		goto free_partial_kdata;
> @@ -231,6 +236,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
>  	if (ret)
>  		goto free_all_kdata;
>
> +	p->job->ne
RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace
[AMD Official Use Only - Internal Distribution Only]

Christian is right here; that would cause many problems for simply using a VMID in the kernel. We already have a pair of interfaces for RGP, amdgpu_vm_reserve_vmid / amdgpu_vm_unreserve_vmid; I think you can use them instead of involving an additional kernel change.

-David

-----Original Message-----
From: amd-gfx On Behalf Of Christian König
Sent: Wednesday, February 19, 2020 7:03 PM
To: He, Jacob; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

On 19.02.20 at 11:15, Jacob He wrote:
> [WHY]
> When SPM trace enabled, SPM_VMID should be updated with the current
> vmid.
>
> [HOW]
> Add a chunk id, AMDGPU_CHUNK_ID_SPM_TRACE, so that UMD can tell us
> which job should update SPM_VMID.
> Right before a job is submitted to GPU, set the SPM_VMID accordingly.
>
> [Limitation]
> Running more than one SPM trace enabled processes simultaneously is
> not supported.

Well, there are multiple problems with that patch.

First of all, you need to better describe what SPM tracing is in the commit message.

Then the update of mmRLC_SPM_MC_CNTL must be executed asynchronously on the ring. Otherwise we might corrupt an already executing SPM trace.

And you also need to make sure to disable the tracing again, or otherwise we run into a bunch of trouble when the VMID is reused.

You also need to make sure that IBs using the SPM trace are serialized with each other, e.g. hack into the amdgpu_ids.c file and make sure that only one VMID at a time can have that attribute.

Regards,
Christian.

>
> Change-Id: Ic932ef6ac9dbf244f03aaee90550e8ff3a675666
> Signed-off-by: Jacob He
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 10 +++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 15 ++-
>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 15 ++-
>  8 files changed, 48 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index f9fa6e104fef..3f32c4db5232 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -113,6 +113,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
>  	uint32_t uf_offset = 0;
>  	int i;
>  	int ret;
> +	bool update_spm_vmid = false;
>
>  	if (cs->in.num_chunks == 0)
>  		return 0;
> @@ -221,6 +222,10 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
>  	case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL:
>  		break;
>
> +	case AMDGPU_CHUNK_ID_SPM_TRACE:
> +		update_spm_vmid = true;
> +		break;
> +
>  	default:
>  		ret = -EINVAL;
>  		goto free_partial_kdata;
> @@ -231,6 +236,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
>  	if (ret)
>  		goto free_all_kdata;
>
> +	p->job->need_update_spm_vmid = update_spm_vmid;
> +
>  	if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
>  		ret = -ECANCELED;
>  		goto free_all_kdata;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index cae81914c821..36faab12b585 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -156,9 +156,13 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>  		return -EINVAL;
>  	}
>
> -	if (vm && !job->vmid) {
> -		dev_err(adev->dev, "VM IB without ID\n");
> -		return -EINVAL;
> +	if (vm) {
> +		if (!job->vmid) {
> +			dev_err(adev->dev, "VM IB without ID\n");
> +			return -EINVAL;
> +		} else if (adev->gfx.rlc.funcs->update_spm_vmid && job->need_update_spm_vmid) {
> +			adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
> +		}
>  	}
>
>  	alloc_size = ring->funcs->emit_frame_size + num_ibs *
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> index 2e2110dddb76..4582536961c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> @@ -52,6 +52,7 @@ struct amdgpu_job {
>  	bool			vm_needs_flush;
>  	uint64_t		vm_pd_addr;
>  	unsigned		vmid;
> +	bool			need_update_spm_vmid;
>  	unsigned		pasid;
>  	uint32_t		gds_base, gds_size;
>
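[Editor's note] Christian's serialization requirement above — only one VMID at a time may hold the SPM attribute, and tracing must be disabled again before the VMID is reused — can be modeled with a small standalone sketch. This is toy code for illustration only, not amdgpu internals; all names (`spm_owner`, `spm_acquire`, `spm_release`) are invented:

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy model: SPM ownership is a single slot. 0 means "no VMID owns SPM";
 * a non-zero value is the owning VMID. Invented for illustration only. */
static atomic_uint spm_owner;

/* Try to claim SPM for a VMID; fails while another VMID owns it,
 * which serializes SPM-tracing submissions. */
static int spm_acquire(unsigned int vmid)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong(&spm_owner, &expected, vmid) ? 0 : -1;
}

/* Release only succeeds for the current owner, mirroring the rule that
 * tracing must be disabled again before the VMID can be reused. */
static int spm_release(unsigned int vmid)
{
	unsigned int expected = vmid;

	return atomic_compare_exchange_strong(&spm_owner, &expected, 0) ? 0 : -1;
}
```

The compare-and-swap makes acquisition atomic, so two concurrent submissions cannot both believe they own the SPM slot.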
Re: [PATCH] drm/ttm: use the parent resv for ghost objects v2
On 2019/10/24 6:25 PM, Christian König wrote:
> Ping?
>
> On 18.10.19 at 13:58, Christian König wrote:
>> This way the TTM is destroyed with the correct dma_resv object
>> locked and we can even pipeline imported BO evictions.
>>
>> v2: Limit this to only cases when the parent object uses a separate
>> reservation object as well. This fixes another OOM problem.
>>
>> Signed-off-by: Christian König
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +---
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index e030c27f53cf..45e440f80b7b 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -512,7 +512,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>>  	kref_init(&fbo->base.kref);
>>  	fbo->base.destroy = &ttm_transfered_destroy;
>>  	fbo->base.acc_size = 0;
>> -	fbo->base.base.resv = &fbo->base.base._resv;
>> +	if (bo->base.resv == &bo->base._resv)
>> +		fbo->base.base.resv = &fbo->base.base._resv;
>> +
>>  	dma_resv_init(fbo->base.base.resv);

Doesn't this lead to an issue if you force-init the parent's resv? Otherwise, how do you deal with the case where the parent's resv is locked?

>>  	ret = dma_resv_trylock(fbo->base.base.resv);
>>  	WARN_ON(!ret);
>> @@ -711,7 +713,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>>  		if (ret)
>>  			return ret;
>>
>> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>
>>  		/**
>>  		 * If we're not moving to fixed memory, the TTM object
>> @@ -724,7 +726,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>>  		else
>>  			bo->ttm = NULL;
>>
>> -		ttm_bo_unreserve(ghost_obj);
>> +		dma_resv_unlock(&ghost_obj->base._resv);

fbo->base.base.resv?

-David

>>  		ttm_bo_put(ghost_obj);
>>  	}
>> @@ -767,7 +769,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>>  		if (ret)
>>  			return ret;
>>
>> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>
>>  		/**
>>  		 * If we're not moving to fixed memory, the TTM object
>> @@ -780,7 +782,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>>  		else
>>  			bo->ttm = NULL;
>>
>> -		ttm_bo_unreserve(ghost_obj);
>> +		dma_resv_unlock(&ghost_obj->base._resv);
>>
>>  		ttm_bo_put(ghost_obj);
>>  	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
>> @@ -836,7 +838,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>>  	if (ret)
>>  		return ret;
>>
>> -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
>> +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
>>  	/* Last resort, wait for the BO to be idle when we are OOM */
>>  	if (ret)
>>  		ttm_bo_wait(bo, false, false);
>> @@ -845,7 +847,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>>  	bo->mem.mem_type = TTM_PL_SYSTEM;
>>  	bo->ttm = NULL;
>>
>> -	ttm_bo_unreserve(ghost);
>> +	dma_resv_unlock(&ghost->base._resv);
>>  	ttm_bo_put(ghost);
>>
>>  	return 0;
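[Editor's note] The v2 rule being discussed — the ghost object shares the parent's reservation object only when the parent already uses a *separate* one, and otherwise falls back to its own embedded `_resv` — can be sketched with a standalone toy model. The structs below are simplified stand-ins, not the real TTM types:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for dma_resv / ttm_buffer_object. A BO embeds a
 * reservation object (_resv) and carries a pointer (resv) to the one
 * actually in use, which may be the embedded one or a separate object
 * (e.g. for an imported DMA-buf). */
struct toy_resv { int dummy; };

struct toy_bo {
	struct toy_resv _resv;	/* embedded reservation object */
	struct toy_resv *resv;	/* the one actually in use */
};

static void toy_transfer(const struct toy_bo *bo, struct toy_bo *ghost)
{
	*ghost = *bo;			/* ghost starts as a copy of the parent */
	if (bo->resv == &bo->_resv)	/* parent uses its embedded resv ... */
		ghost->resv = &ghost->_resv;	/* ... so ghost must use its own */
	/* otherwise the ghost keeps the pointer to the parent's separate resv */
}
```

The point of the limitation: an embedded `_resv` dies with its owner, so the ghost may only share the parent's reservation object when that object has a lifetime independent of the parent.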
RE: [PATCH] drm/amdgpu: remove gfx9 NGG
+Alex Yan to confirm that this doesn't affect us.

-----Original Message-----
From: amd-gfx On Behalf Of Marek Olšák
Sent: Friday, September 20, 2019 10:16 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: remove gfx9 NGG

From: Marek Olšák

Never used.

Signed-off-by: Marek Olšák
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |   5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  41 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  25 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  11 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 195 
 5 files changed, 277 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6ff02bb60140..80116e63e209 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -140,25 +140,20 @@ extern int amdgpu_dc;
 extern int amdgpu_sched_jobs;
 extern int amdgpu_sched_hw_submission;
 extern uint amdgpu_pcie_gen_cap;
 extern uint amdgpu_pcie_lane_cap;
 extern uint amdgpu_cg_mask;
 extern uint amdgpu_pg_mask;
 extern uint amdgpu_sdma_phase_quantum;
 extern char *amdgpu_disable_cu;
 extern char *amdgpu_virtual_display;
 extern uint amdgpu_pp_feature_mask;
-extern int amdgpu_ngg;
-extern int amdgpu_prim_buf_per_se;
-extern int amdgpu_pos_buf_per_se;
-extern int amdgpu_cntl_sb_buf_per_se;
-extern int amdgpu_param_buf_per_se;
 extern int amdgpu_job_hang_limit;
 extern int amdgpu_lbpw;
 extern int amdgpu_compute_multipipe;
 extern int amdgpu_gpu_recovery;
 extern int amdgpu_emu_mode;
 extern uint amdgpu_smu_memory_pool_size;
 extern uint amdgpu_dc_feature_mask;
 extern uint amdgpu_dm_abm_level;
 extern struct amdgpu_mgpu_info mgpu_info;
 extern int amdgpu_ras_enable;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b49ed39c1fea..cbe4ef4813f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -119,25 +119,20 @@ int amdgpu_sched_jobs = 32;
 int amdgpu_sched_hw_submission = 2;
 uint amdgpu_pcie_gen_cap = 0;
 uint amdgpu_pcie_lane_cap = 0;
 uint amdgpu_cg_mask = 0x;
 uint amdgpu_pg_mask = 0x;
 uint amdgpu_sdma_phase_quantum = 32;
 char *amdgpu_disable_cu = NULL;
 char *amdgpu_virtual_display = NULL;
 /* OverDrive(bit 14) disabled by default*/
 uint amdgpu_pp_feature_mask = 0xbfff;
-int amdgpu_ngg = 0;
-int amdgpu_prim_buf_per_se = 0;
-int amdgpu_pos_buf_per_se = 0;
-int amdgpu_cntl_sb_buf_per_se = 0;
-int amdgpu_param_buf_per_se = 0;
 int amdgpu_job_hang_limit = 0;
 int amdgpu_lbpw = -1;
 int amdgpu_compute_multipipe = -1;
 int amdgpu_gpu_recovery = -1; /* auto */
 int amdgpu_emu_mode = 0;
 uint amdgpu_smu_memory_pool_size = 0;
 /* FBC (bit 0) disabled by default*/
 uint amdgpu_dc_feature_mask = 0;
 int amdgpu_async_gfx_ring = 1;
 int amdgpu_mcbp = 0;
@@ -443,56 +438,20 @@ module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
  * DOC: virtual_display (charp)
  * Set to enable virtual display feature. This feature provides a virtual display hardware on headless boards
  * or in virtualized environments. It will be set like :xx:xx.x,x;:xx:xx.x,x. It's the pci address of
  * the device, plus the number of crtcs to expose. E.g., :26:00.0,4 would enable 4 virtual crtcs on the pci
  * device at 26:00.0. The default is NULL.
  */
 MODULE_PARM_DESC(virtual_display,
		  "Enable virtual display feature (the virtual_display will be set like :xx:xx.x,x;:xx:xx.x,x)");
 module_param_named(virtual_display, amdgpu_virtual_display, charp, 0444);
-/**
- * DOC: ngg (int)
- * Set to enable Next Generation Graphics (1 = enable). The default is 0 (disabled).
- */
-MODULE_PARM_DESC(ngg, "Next Generation Graphics (1 = enable, 0 = disable(default depending on gfx))");
-module_param_named(ngg, amdgpu_ngg, int, 0444);
-
-/**
- * DOC: prim_buf_per_se (int)
- * Override the size of Primitive Buffer per Shader Engine in Byte. The default is 0 (depending on gfx).
- */
-MODULE_PARM_DESC(prim_buf_per_se, "the size of Primitive Buffer per Shader Engine (default depending on gfx)");
-module_param_named(prim_buf_per_se, amdgpu_prim_buf_per_se, int, 0444);
-
-/**
- * DOC: pos_buf_per_se (int)
- * Override the size of Position Buffer per Shader Engine in Byte. The default is 0 (depending on gfx).
- */
-MODULE_PARM_DESC(pos_buf_per_se, "the size of Position Buffer per Shader Engine (default depending on gfx)");
-module_param_named(pos_buf_per_se, amdgpu_pos_buf_per_se, int, 0444);
-
-/**
- * DOC: cntl_sb_buf_per_se (int)
- * Override the size of Control Sideband per Shader Engine in Byte. The default is 0 (depending on gfx).
- */
-MODULE_PARM_DESC(cntl_sb_buf_per_se, "the size of Control Sideband per Shader Engine (default depending on gfx)");
-module_param_named(cntl_sb_buf_per_se, amdgpu_cntl_sb_buf_per_se, int, 0444);
-
-/**
- * DOC: param_buf_per_se (int)
- * Override the size of Off-Chip Parameter
Re: [PATCH] drm/amdgpu: resvert "disable bulk moves for now"
I don't know the DKMS status; anyway, we should submit this one as early as possible.

-------- Original message --------
Subject: Re: [PATCH] drm/amdgpu: resvert "disable bulk moves for now"
From: Christian König
To: "Zhou, David(ChunMing)"; amd-gfx@lists.freedesktop.org
Cc:

Just to double check: we have had that enabled in the DKMS package for a while and didn't encounter any more problems with it, correct?

Thanks,
Christian.

On 12.09.19 at 16:02, Chunming Zhou wrote:
> RB on it to go ahead.
>
> -David
>
> On 2019/9/12 18:15, Christian König wrote:
>> This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.
>>
>> The changes to fix this should have landed in 5.1.
>>
>> Signed-off-by: Christian König
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 48349e4f0701..fd3fbaa73fa3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>>  	struct ttm_bo_global *glob = adev->mman.bdev.glob;
>>  	struct amdgpu_vm_bo_base *bo_base;
>>
>> -#if 0
>>  	if (vm->bulk_moveable) {
>>  		spin_lock(&glob->lru_lock);
>>  		ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
>>  		spin_unlock(&glob->lru_lock);
>>  		return;
>>  	}
>> -#endif
>>
>>  	memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
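[Editor's note] The bulk move being re-enabled above replaces many per-BO LRU updates with one splice of a VM's contiguous LRU range to the tail. A standalone sketch of that idea (minimal list code, invented names, not the real `ttm_bo_bulk_move_lru_tail()`):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of TTM's bulk LRU move: instead of moving each BO to the
 * LRU tail one by one, remember the first and last node of a VM's
 * contiguous range and splice the whole range to the tail in O(1). */
struct node { struct node *prev, *next; };

struct lru {
	struct node head;	/* sentinel: head.next is the LRU front */
};

static void lru_init(struct lru *l)
{
	l->head.prev = l->head.next = &l->head;
}

static void lru_add_tail(struct lru *l, struct node *n)
{
	n->prev = l->head.prev;
	n->next = &l->head;
	l->head.prev->next = n;
	l->head.prev = n;
}

/* Splice the closed range [first, last] to the tail in constant time,
 * independent of how many nodes the range contains. */
static void lru_bulk_move_tail(struct lru *l, struct node *first,
			       struct node *last)
{
	/* unlink the range */
	first->prev->next = last->next;
	last->next->prev = first->prev;
	/* re-insert the range before the sentinel */
	first->prev = l->head.prev;
	last->next = &l->head;
	l->head.prev->next = first;
	l->head.prev = last;
}
```

The cost of the move no longer depends on how many BOs the VM owns, which is what makes keeping `bulk_moveable` state worthwhile.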
RE: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables
Do you need to update the VRAM size reported to the UMD?

-David

-----Original Message-----
From: amd-gfx On Behalf Of Christian König
Sent: Monday, September 2, 2019 6:52 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables

This hopefully helps reduce the contention for page tables.

Signed-off-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h       | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9 +++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 2eda3a8c330d..3352a87b822e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -99,6 +99,9 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_VM_FAULT_STOP_FIRST	1
 #define AMDGPU_VM_FAULT_STOP_ALWAYS	2

+/* Reserve 4MB VRAM for page tables */
+#define AMDGPU_VM_RESERVED_VRAM		(4ULL << 20)
+
 /* max number of VMHUB */
 #define AMDGPU_MAX_VMHUBS	3
 #define AMDGPU_GFXHUB_0		0
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 1150e34bc28f..59440f71d304 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -24,6 +24,7 @@
 #include
 #include "amdgpu.h"
+#include "amdgpu_vm.h"

 struct amdgpu_vram_mgr {
 	struct drm_mm mm;
@@ -276,7 +277,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	struct drm_mm_node *nodes;
 	enum drm_mm_insert_mode mode;
 	unsigned long lpfn, num_nodes, pages_per_node, pages_left;
-	uint64_t vis_usage = 0, mem_bytes;
+	uint64_t vis_usage = 0, mem_bytes, max_bytes;
 	unsigned i;
 	int r;

@@ -284,9 +285,13 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	if (!lpfn)
 		lpfn = man->size;

+	max_bytes = adev->gmc.mc_vram_size;
+	if (tbo->type != ttm_bo_type_kernel)
+		max_bytes -= AMDGPU_VM_RESERVED_VRAM;
+
 	/* bail out quickly if there's likely not enough VRAM for this BO */
 	mem_bytes = (u64)mem->num_pages << PAGE_SHIFT;
-	if (atomic64_add_return(mem_bytes, &mgr->usage) > adev->gmc.mc_vram_size) {
+	if (atomic64_add_return(mem_bytes, &mgr->usage) > max_bytes) {
 		atomic64_sub(mem_bytes, &mgr->usage);
 		mem->mm_node = NULL;
 		return 0;
--
2.17.1
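[Editor's note] The admission check in the patch above can be modeled standalone: normal BOs only see the VRAM size minus the 4MB headroom, while kernel BOs (page tables) may dip into the reserve. This is a simplified toy (plain arithmetic instead of the driver's atomic usage counter; all names invented):

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RESERVED_VRAM (4ull << 20)	/* mirrors AMDGPU_VM_RESERVED_VRAM */

/* Toy model of the check in amdgpu_vram_mgr_new(): normal BOs may only
 * fill VRAM up to (size - 4MB); kernel BOs can use the reserved headroom. */
struct toy_vram {
	uint64_t size;	/* total managed VRAM */
	uint64_t usage;	/* bytes currently admitted */
};

static int toy_vram_alloc(struct toy_vram *v, uint64_t bytes, int is_kernel)
{
	uint64_t max_bytes = v->size;

	if (!is_kernel)
		max_bytes -= TOY_RESERVED_VRAM;
	if (v->usage + bytes > max_bytes)
		return -1;	/* likely not enough VRAM, bail out */
	v->usage += bytes;
	return 0;
}
```

This also hints at an answer to David's question: the headroom is only invisible to non-kernel allocations, so whether the size reported to the UMD should shrink depends on whether that report is meant to describe user-allocatable VRAM.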
RE: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
Patch #1,#5,#6,#8,#9,#10 are Reviewed-by: Chunming Zhou
Patch #2,#3,#4 are Acked-by: Chunming Zhou

-David

> -----Original Message-----
> From: dri-devel On Behalf Of Christian König
> Sent: Wednesday, May 29, 2019 8:27 PM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
>
> This avoids OOM situations when we have lots of threads submitting at the
> same time.
>
> v3: apply this to the whole driver, not just CS
>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c    | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 20f2955d2a55..3e2da24cd17a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>  	}
>
>  	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -				   &duplicates, true);
> +				   &duplicates, false);
>  	if (unlikely(r != 0)) {
>  		if (r != -ERESTARTSYS)
>  			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> index 06f83cac0d3a..f660628e6af9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> @@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>  	list_add(&csa_tv.head, &list);
>  	amdgpu_vm_get_pd_bo(vm, &list, &pd);
>
> -	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
> +	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
>  	if (r) {
>  		DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
>  		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d513a5ad03dd..ed25a4e14404 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
>
>  	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
>
> -	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
> +	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
>  	if (r) {
>  		dev_err(adev->dev, "leaking bo va because "
>  			"we fail to reserve bo (%d)\n", r);
> @@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>
>  	amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
>
> -	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
> +	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
>  	if (r)
>  		goto error_unref;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index c430e8259038..d60593cc436e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)
>  	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>  	int r;
>
> -	r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
> +	r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
>  	if (unlikely(r != 0)) {
>  		if (r != -ERESTARTSYS)
>  			dev_err(adev->dev, "%p reserve failed\n", bo);
> --
> 2.17.1
>
> _______________________________________________
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH 01/11] drm/ttm: Make LRU removal optional.
> -Original Message- > From: Christian König > Sent: Tuesday, May 14, 2019 8:31 PM > To: Olsak, Marek ; Zhou, David(ChunMing) > ; Liang, Prike ; dri- > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Subject: [PATCH 01/11] drm/ttm: Make LRU removal optional. > > [CAUTION: External Email] > > We are already doing this for DMA-buf imports and also for amdgpu VM BOs > for quite a while now. > > If this doesn't run into any problems we are probably going to stop removing > BOs from the LRU altogether. > > Signed-off-by: Christian König > --- [snip] > diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c > b/drivers/gpu/drm/ttm/ttm_execbuf_util.c > index 0075eb9a0b52..957ec375a4ba 100644 > --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c > +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c > @@ -69,7 +69,8 @@ void ttm_eu_backoff_reservation(struct > ww_acquire_ctx *ticket, > list_for_each_entry(entry, list, head) { > struct ttm_buffer_object *bo = entry->bo; > > - ttm_bo_add_to_lru(bo); > + if (list_empty(>lru)) > + ttm_bo_add_to_lru(bo); > reservation_object_unlock(bo->resv); > } > spin_unlock(>lru_lock); > @@ -93,7 +94,7 @@ EXPORT_SYMBOL(ttm_eu_backoff_reservation); > > int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket, >struct list_head *list, bool intr, > - struct list_head *dups) > + struct list_head *dups, bool del_lru) > { > struct ttm_bo_global *glob; > struct ttm_validate_buffer *entry; @@ -172,11 +173,11 @@ int > ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket, > list_add(>head, list); > } > > - if (ticket) > - ww_acquire_done(ticket); > - spin_lock(>lru_lock); > - ttm_eu_del_from_lru_locked(list); > - spin_unlock(>lru_lock); > + if (del_lru) { > + spin_lock(>lru_lock); > + ttm_eu_del_from_lru_locked(list); > + spin_unlock(>lru_lock); > + } Can you make bo to lru tail here when del_lru is false? Busy iteration in evict_first will try other process Bos first, which could save loop time. 
> return 0; > } > EXPORT_SYMBOL(ttm_eu_reserve_buffers); > @@ -203,7 +204,10 @@ void ttm_eu_fence_buffer_objects(struct > ww_acquire_ctx *ticket, > reservation_object_add_shared_fence(bo->resv, fence); > else > reservation_object_add_excl_fence(bo->resv, fence); > - ttm_bo_add_to_lru(bo); > + if (list_empty(&bo->lru)) > + ttm_bo_add_to_lru(bo); > + else > + ttm_bo_move_to_lru_tail(bo, NULL); If that move to the LRU tail is already done above, then we don't need this here. -David > reservation_object_unlock(bo->resv); > } > spin_unlock(&glob->lru_lock); > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > index 161b80fee492..5cffaa24259f 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > @@ -63,7 +63,7 @@ static int virtio_gpu_object_list_validate(struct > ww_acquire_ctx *ticket, > struct virtio_gpu_object *qobj; > int ret; > > - ret = ttm_eu_reserve_buffers(ticket, head, true, NULL); > + ret = ttm_eu_reserve_buffers(ticket, head, true, NULL, true); > if (ret != 0) > return ret; > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > index a7c30e567f09..d28cbedba0b5 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > @@ -465,7 +465,8 @@ vmw_resource_check_buffer(struct ww_acquire_ctx > *ticket, > val_buf->bo = &res->backup->base; > val_buf->num_shared = 0; > list_add_tail(&val_buf->head, &val_list); > - ret = ttm_eu_reserve_buffers(ticket, &val_list, interruptible, NULL); > + ret = ttm_eu_reserve_buffers(ticket, &val_list, interruptible, NULL, > +true); > if (unlikely(ret != 0)) > goto out_no_reserve; > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h > b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h > index 3b396fea40d7..ac435b51f4eb 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h > @@ -165,7 +165,7 @@ vmw_validation_bo_reserve
Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
Ah, sorry, I missed "+ ttm_bo_move_to_lru_tail(bo, NULL);". Right, moving them to end before releasing is fixing my concern. Sorry for noise. -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: "Koenig, Christian" To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: [CAUTION: External Email] BO list? No, we stop removing them from the LRU. But we still move them to the end of the LRU before releasing them. Christian. Am 15.05.19 um 16:21 schrieb Zhou, David(ChunMing): Isn't this patch trying to stop removing for all BOs from bo list? -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Olsak, Marek" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org> CC: [CAUTION: External Email] That is a good point, but actually not a problem in practice. See the change to ttm_eu_fence_buffer_objects: - ttm_bo_add_to_lru(bo); + if (list_empty(>lru)) + ttm_bo_add_to_lru(bo); + else + ttm_bo_move_to_lru_tail(bo, NULL); We still move the BOs to the end of the LRU in the same order we have before, we just don't remove them when they are reserved. Regards, Christian. Am 14.05.19 um 16:31 schrieb Zhou, David(ChunMing): how to refresh LRU to keep the order align with bo list passed from user space? you can verify it by some games, performance could be different much between multiple runnings. -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org> CC: [CAUTION: External Email] Hui? 
What do you mean with that? Christian. Am 14.05.19 um 15:12 schrieb Zhou, David(ChunMing): my only concern is how to fresh LRU when bo is from bo list. -David Original Message ---- Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org> CC: [CAUTION: External Email] This avoids OOM situations when we have lots of threads submitting at the same time. Signed-off-by: Christian König <mailto:christian.koe...@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index fff558cf385b..f9240a94217b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, } r = ttm_eu_reserve_buffers(>ticket, >validated, true, - , true); + , false); if (unlikely(r != 0)) { if (r != -ERESTARTSYS) DRM_ERROR("ttm_eu_reserve_buffers failed.\n"); -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
Isn't this patch trying to stop the removal for all BOs from the BO list? -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Olsak, Marek" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: That is a good point, but actually not a problem in practice. See the change to ttm_eu_fence_buffer_objects: - ttm_bo_add_to_lru(bo); + if (list_empty(&bo->lru)) + ttm_bo_add_to_lru(bo); + else + ttm_bo_move_to_lru_tail(bo, NULL); We still move the BOs to the end of the LRU in the same order we had before, we just don't remove them while they are reserved. Regards, Christian. On 14.05.19 at 16:31, Zhou, David(ChunMing) wrote: how do we refresh the LRU to keep the order aligned with the BO list passed from user space? You can verify it with some games; performance can differ a lot between runs. -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: Huh? What do you mean by that? Christian. On 14.05.19 at 15:12, Zhou, David(ChunMing) wrote: my only concern is how to refresh the LRU when a BO comes from the BO list. -David Original Message Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: This avoids OOM situations when we have lots of threads submitting at the same time.
Signed-off-by: Christian König <mailto:christian.koe...@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index fff558cf385b..f9240a94217b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, } r = ttm_eu_reserve_buffers(>ticket, >validated, true, - , true); + , false); if (unlikely(r != 0)) { if (r != -ERESTARTSYS) DRM_ERROR("ttm_eu_reserve_buffers failed.\n"); -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
how do we refresh the LRU to keep the order aligned with the BO list passed from user space? You can verify it with some games; performance can differ a lot between runs. -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: Huh? What do you mean by that? Christian. On 14.05.19 at 15:12, Zhou, David(ChunMing) wrote: my only concern is how to refresh the LRU when a BO comes from the BO list. -David Original Message Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: This avoids OOM situations when we have lots of threads submitting at the same time. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index fff558cf385b..f9240a94217b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, } r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true, - &duplicates, true); + &duplicates, false); if (unlikely(r != 0)) { if (r != -ERESTARTSYS) DRM_ERROR("ttm_eu_reserve_buffers failed.\n"); -- 2.17.1
Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
my only concern is how to refresh the LRU when a BO comes from the BO list. -David Original Message Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org CC: This avoids OOM situations when we have lots of threads submitting at the same time. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index fff558cf385b..f9240a94217b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, } r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true, - &duplicates, true); + &duplicates, false); if (unlikely(r != 0)) { if (r != -ERESTARTSYS) DRM_ERROR("ttm_eu_reserve_buffers failed.\n"); -- 2.17.1
RE: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.
Sorry, I can only put my Acked-by: Chunming Zhou on patch #3. I cannot fully judge patches #4, #5 and #6. -David From: amd-gfx On Behalf Of Grodzovsky, Andrey Sent: Friday, April 26, 2019 10:09 PM To: Koenig, Christian ; Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; e...@anholt.net; etna...@lists.freedesktop.org Cc: Kazlauskas, Nicholas ; Liu, Monk Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. Ping (mostly David and Monk). Andrey On 4/24/19 3:09 AM, Christian König wrote: On 24.04.19 at 05:02, Zhou, David(ChunMing) wrote: >> -drm_sched_stop(&ring->sched, &job->base); >> - >> /* after all hw jobs are reset, hw fence is meaningless, so >> force_completion */ >> amdgpu_fence_driver_force_completion(ring); >> } The HW fences are already force-completed, so I think we can just disable IRQ fence processing and ignore HW fence signals while we are trying to do the GPU reset. Otherwise the logic becomes much more complex. If this situation happens because of long execution times, we can increase the timeout of the reset detection. You are not thinking widely enough; forcing the hw fence to complete can trigger other parts of the system to start activity. We first need to stop everything and make sure that we don't do any processing any more, and only then start with our reset procedure, including forcing all hw fences to complete. Christian.
-David From: amd-gfx <mailto:amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Grodzovsky, Andrey Sent: Wednesday, April 24, 2019 12:00 AM To: Zhou, David(ChunMing) <mailto:david1.z...@amd.com>; dri-de...@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; e...@anholt.net<mailto:e...@anholt.net>; etna...@lists.freedesktop.org<mailto:etna...@lists.freedesktop.org>; ckoenig.leichtzumer...@gmail.com<mailto:ckoenig.leichtzumer...@gmail.com> Cc: Kazlauskas, Nicholas <mailto:nicholas.kazlaus...@amd.com>; Liu, Monk <mailto:monk@amd.com> Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. No, i mean the actual HW fence which signals when the job finished execution on the HW. Andrey On 4/23/19 11:19 AM, Zhou, David(ChunMing) wrote: do you mean fence timer? why not stop it as well when stopping sched for the reason of hw reset? Original Message Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMing)" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com> CC: "Kazlauskas, Nicholas" ,"Liu, Monk" On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote: > +Monk. > > GPU reset is used widely in SRIOV, so need virtulizatino guy take a look. > > But out of curious, why guilty job can signal more if the job is already > set to guilty? set it wrongly? > > > -David It's possible that the job does completes at a later time then it's timeout handler started processing so in this patch we try to protect against this by rechecking the HW fence after stopping all SW schedulers. 
We do it BEFORE marking guilty on the job's sched_entity so at the point we check the guilty flag is not set yet. Andrey > > 在 2019/4/18 23:00, Andrey Grodzovsky 写道: >> Also reject TDRs if another one already running. >> >> v2: >> Stop all schedulers across device and entire XGMI hive before >> force signaling HW fences. >> Avoid passing job_signaled to helper fnctions to keep all the decision >> making about skipping HW reset in one place. >> >> v3: >> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced >> against it's decrement in drm_sched_stop in non HW reset case. >> v4: rebase >> v5: Revert v3 as we do it now in sceduler code. >> >> Signed-off-by: Andrey Grodzovsky >> <mailto:andrey.grodzov...@amd.com> >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 >> +++-- >>1 file changed, 95 insertions(+), 48 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index a0e165c..85f8792 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_
RE: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.
>> -drm_sched_stop(&ring->sched, &job->base); >> - >> /* after all hw jobs are reset, hw fence is meaningless, so >> force_completion */ >> amdgpu_fence_driver_force_completion(ring); >> } The HW fences are already force-completed, so I think we can just disable IRQ fence processing and ignore HW fence signals while we are trying to do the GPU reset. Otherwise the logic becomes much more complex. If this situation happens because of long execution times, we can increase the timeout of the reset detection. -David From: amd-gfx On Behalf Of Grodzovsky, Andrey Sent: Wednesday, April 24, 2019 12:00 AM To: Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; e...@anholt.net; etna...@lists.freedesktop.org; ckoenig.leichtzumer...@gmail.com Cc: Kazlauskas, Nicholas ; Liu, Monk Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. No, I mean the actual HW fence which signals when the job finished execution on the HW. Andrey On 4/23/19 11:19 AM, Zhou, David(ChunMing) wrote: Do you mean the fence timer? Why not stop it as well when stopping the scheduler for the HW reset? Original Message Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMing)" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com CC: "Kazlauskas, Nicholas" ,"Liu, Monk" On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote: > +Monk. > > GPU reset is used widely in SRIOV, so we need a virtualization guy to take a look. > > But out of curiosity, why can the guilty job still signal if the job is already > set to guilty? Was it set wrongly?
> > > -David It's possible that the job does completes at a later time then it's timeout handler started processing so in this patch we try to protect against this by rechecking the HW fence after stopping all SW schedulers. We do it BEFORE marking guilty on the job's sched_entity so at the point we check the guilty flag is not set yet. Andrey > > 在 2019/4/18 23:00, Andrey Grodzovsky 写道: >> Also reject TDRs if another one already running. >> >> v2: >> Stop all schedulers across device and entire XGMI hive before >> force signaling HW fences. >> Avoid passing job_signaled to helper fnctions to keep all the decision >> making about skipping HW reset in one place. >> >> v3: >> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced >> against it's decrement in drm_sched_stop in non HW reset case. >> v4: rebase >> v5: Revert v3 as we do it now in sceduler code. >> >> Signed-off-by: Andrey Grodzovsky >> <mailto:andrey.grodzov...@amd.com> >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 >> +++-- >>1 file changed, 95 insertions(+), 48 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index a0e165c..85f8792 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_asic_reset(struct >> amdgpu_device *adev, >> if (!ring || !ring->sched.thread) >> continue; >> >> -drm_sched_stop(>sched, >base); >> - >> /* after all hw jobs are reset, hw fence is meaningless, so >> force_completion */ >> amdgpu_fence_driver_force_completion(ring); >> } >> @@ -3343,6 +3341,7 @@ static int amdgpu_device_pre_asic_reset(struct >> amdgpu_device *adev, >> if(job) >> drm_sched_increase_karma(>base); >> >> +/* Don't suspend on bare metal if we are not going to HW reset the ASIC >> */ >> if (!amdgpu_sriov_vf(adev)) { >> >> if (!need_full_reset) >> @@ -3480,37 +3479,21 @@ static int 
amdgpu_do_asic_reset(struct >> amdgpu_hive_info *hive, >> return r; >>} >> >> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev) >> +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, bool >> trylock) >>{ >> -int i; >> - >> -for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { >> -struct amdgpu_ring *ring = ad
Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.
Do you mean the fence timer? Why not stop it as well when stopping the scheduler for the HW reset? Original Message Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMing)" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com CC: "Kazlauskas, Nicholas" ,"Liu, Monk" On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote: > +Monk. > > GPU reset is used widely in SRIOV, so we need a virtualization guy to take a look. > > But out of curiosity, why can the guilty job still signal if the job is already > set to guilty? Was it set wrongly? > > > -David It's possible that the job completes at a later time than when its timeout handler started processing, so in this patch we try to protect against this by rechecking the HW fence after stopping all SW schedulers. We do it BEFORE marking guilty on the job's sched_entity, so at the point we check, the guilty flag is not set yet. Andrey > > On 2019/4/18 23:00, Andrey Grodzovsky wrote: >> Also reject TDRs if another one is already running. >> >> v2: >> Stop all schedulers across device and entire XGMI hive before >> force signaling HW fences. >> Avoid passing job_signaled to helper functions to keep all the decision >> making about skipping HW reset in one place. >> >> v3: >> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced >> against its decrement in drm_sched_stop in non HW reset case. >> v4: rebase >> v5: Revert v3 as we do it now in scheduler code.
>> >> Signed-off-by: Andrey Grodzovsky >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 >> +++-- >>1 file changed, 95 insertions(+), 48 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index a0e165c..85f8792 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_asic_reset(struct >> amdgpu_device *adev, >> if (!ring || !ring->sched.thread) >> continue; >> >> -drm_sched_stop(>sched, >base); >> - >> /* after all hw jobs are reset, hw fence is meaningless, so >> force_completion */ >> amdgpu_fence_driver_force_completion(ring); >> } >> @@ -3343,6 +3341,7 @@ static int amdgpu_device_pre_asic_reset(struct >> amdgpu_device *adev, >> if(job) >> drm_sched_increase_karma(>base); >> >> +/* Don't suspend on bare metal if we are not going to HW reset the ASIC >> */ >> if (!amdgpu_sriov_vf(adev)) { >> >> if (!need_full_reset) >> @@ -3480,37 +3479,21 @@ static int amdgpu_do_asic_reset(struct >> amdgpu_hive_info *hive, >> return r; >>} >> >> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev) >> +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, bool >> trylock) >>{ >> -int i; >> - >> -for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { >> -struct amdgpu_ring *ring = adev->rings[i]; >> - >> -if (!ring || !ring->sched.thread) >> -continue; >> - >> -if (!adev->asic_reset_res) >> -drm_sched_resubmit_jobs(>sched); >> +if (trylock) { >> +if (!mutex_trylock(>lock_reset)) >> +return false; >> +} else >> +mutex_lock(>lock_reset); >> >> -drm_sched_start(>sched, !adev->asic_reset_res); >> -} >> - >> -if (!amdgpu_device_has_dc_support(adev)) { >> -drm_helper_resume_force_mode(adev->ddev); >> -} >> - >> -adev->asic_reset_res = 0; >> -} >> - >> -static void amdgpu_device_lock_adev(struct amdgpu_device *adev) >> -{ >> -mutex_lock(>lock_reset); >> atomic_inc(>gpu_reset_counter); 
>> adev->in_gpu_reset = 1; >> /* Block kfd: SRIOV would do it separately */ >> if (!amdgpu_sriov_vf(adev)) >>amdgpu_amdkfd_pre_reset(adev); >> + >> +return true; >>} >> >>static void amdgpu_device_unlock_adev(struct amdgpu_device *adev) >> @@ -3538,40 +3521,42 @@ s
Re: [PATCH v5 3/6] drm/scheduler: rework job destruction
This patch is to fix the deadlock between fence->lock and sched->job_list_lock, right? So I suggest just moving list_del_init(&s_job->node) from drm_sched_process_job() to the worker thread. That would avoid the deadlock described in the link. Original Message Subject: Re: [PATCH v5 3/6] drm/scheduler: rework job destruction From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMing)" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com CC: "Kazlauskas, Nicholas" ,"Koenig, Christian" On 4/22/19 8:48 AM, Chunming Zhou wrote: > Hi Andrey, > > static void drm_sched_process_job(struct dma_fence *f, struct > dma_fence_cb *cb) > { > ... > spin_lock_irqsave(&sched->job_list_lock, flags); > /* remove job from ring_mirror_list */ > list_del_init(&s_job->node); > spin_unlock_irqrestore(&sched->job_list_lock, flags); > [David] How about just moving the above to the worker, out of the IRQ > processing? Any problem? Maybe I missed your previous discussion, but I think removing > the lock for the list is a risk for future maintenance, although you make sure it is > thread safe currently. > > -David We remove the lock exactly because of the fact that insertion and removal to/from the list will be done from exactly one thread at any time now. So I am not sure I understand what you mean. Andrey > > ... > > schedule_work(&s_job->finish_work); > } > > On 2019/4/18 23:00, Andrey Grodzovsky wrote: >> From: Christian König >> >> We now destroy finished jobs from the worker thread to make sure that >> we never destroy a job currently in timeout processing. >> By this we avoid holding lock around ring mirror list in drm_sched_stop >> which should solve a deadlock reported by a user. >> >> v2: Remove unused variable. >> v4: Move guilty job free into sched code. >> v5: >> Move sched->hw_rq_count to drm_sched_start to account for counter >> decrement in drm_sched_stop even when we don't call resubmit jobs >> if guilty job did signal.
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 >> >> Signed-off-by: Christian König >> Signed-off-by: Andrey Grodzovsky >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +- >>drivers/gpu/drm/etnaviv/etnaviv_dump.c | 4 - >>drivers/gpu/drm/etnaviv/etnaviv_sched.c| 2 +- >>drivers/gpu/drm/lima/lima_sched.c | 2 +- >>drivers/gpu/drm/panfrost/panfrost_job.c| 2 +- >>drivers/gpu/drm/scheduler/sched_main.c | 159 >> + >>drivers/gpu/drm/v3d/v3d_sched.c| 2 +- >>include/drm/gpu_scheduler.h| 6 +- >>8 files changed, 102 insertions(+), 84 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index 7cee269..a0e165c 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -3334,7 +3334,7 @@ static int amdgpu_device_pre_asic_reset(struct >> amdgpu_device *adev, >>if (!ring || !ring->sched.thread) >>continue; >> >> - drm_sched_stop(>sched); >> + drm_sched_stop(>sched, >base); >> >>/* after all hw jobs are reset, hw fence is meaningless, so >> force_completion */ >>amdgpu_fence_driver_force_completion(ring); >> @@ -3343,8 +3343,6 @@ static int amdgpu_device_pre_asic_reset(struct >> amdgpu_device *adev, >>if(job) >>drm_sched_increase_karma(>base); >> >> - >> - >>if (!amdgpu_sriov_vf(adev)) { >> >>if (!need_full_reset) >> @@ -3482,8 +3480,7 @@ static int amdgpu_do_asic_reset(struct >> amdgpu_hive_info *hive, >>return r; >>} >> >> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev, >> - struct amdgpu_job *job) >> +static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev) >>{ >>int i; >> >> @@ -3623,7 +3620,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device >> *adev, >> >>/* Post ASIC reset for all devs .*/ >>list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) { >> - amdgpu_device_post_asic_reset(tmp_adev, tmp_adev == adev ? 
job : NULL); >> + amdgpu_device_post_asic_reset(tmp_adev); >> >>if (r) { >>/* bad news, how to tell it to userspace ? */ >> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c >> b/drivers/gpu/drm/e
RE: DMA-buf P2P
Which test are you using? Can you share it? -David > -Original Message- > From: dri-devel On Behalf Of > Christian König > Sent: Thursday, April 18, 2019 8:09 PM > To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Subject: DMA-buf P2P > > Hi guys, > > as promised this is the patch set which enables P2P buffer sharing with DMA- > buf. > > Basic idea is that importers can set a flag noting that they can deal with an > sgt which doesn't contain pages. > > This in turn is the signal to the exporter that we don't need to move a buffer > to system memory any more when a remote device wants to access it. > > Please review and/or comment, > Christian.
RE: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4
> -Original Message- > From: Lionel Landwerlin > Sent: Saturday, March 30, 2019 10:09 PM > To: Koenig, Christian ; Zhou, David(ChunMing) > ; dri-de...@lists.freedesktop.org; amd- > g...@lists.freedesktop.org; ja...@jlekstrand.net; Hector, Tobias > > Subject: Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point > interface v4 > > On 28/03/2019 15:18, Christian König wrote: > > Am 28.03.19 um 14:50 schrieb Lionel Landwerlin: > >> On 25/03/2019 08:32, Chunming Zhou wrote: > >>> From: Christian König > >>> > >>> Use the dma_fence_chain object to create a timeline of fence objects > >>> instead of just replacing the existing fence. > >>> > >>> v2: rebase and cleanup > >>> v3: fix garbage collection parameters > >>> v4: add unorder point check, print a warn calltrace > >>> > >>> Signed-off-by: Christian König > >>> Cc: Lionel Landwerlin > >>> --- > >>> drivers/gpu/drm/drm_syncobj.c | 39 > >>> +++ > >>> include/drm/drm_syncobj.h | 5 + > >>> 2 files changed, 44 insertions(+) > >>> > >>> diff --git a/drivers/gpu/drm/drm_syncobj.c > >>> b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119 > >>> 100644 > >>> --- a/drivers/gpu/drm/drm_syncobj.c > >>> +++ b/drivers/gpu/drm/drm_syncobj.c > >>> @@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct > >>> drm_syncobj *syncobj, > >>> spin_unlock(>lock); > >>> } > >>> +/** > >>> + * drm_syncobj_add_point - add new timeline point to the syncobj > >>> + * @syncobj: sync object to add timeline point do > >>> + * @chain: chain node to use to add the point > >>> + * @fence: fence to encapsulate in the chain node > >>> + * @point: sequence number to use for the point > >>> + * > >>> + * Add the chain node as new timeline point to the syncobj. 
> >>> + */ > >>> +void drm_syncobj_add_point(struct drm_syncobj *syncobj, > >>> + struct dma_fence_chain *chain, > >>> + struct dma_fence *fence, > >>> + uint64_t point) > >>> +{ > >>> + struct syncobj_wait_entry *cur, *tmp; > >>> + struct dma_fence *prev; > >>> + > >>> + dma_fence_get(fence); > >>> + > >>> + spin_lock(&syncobj->lock); > >>> + > >>> + prev = drm_syncobj_fence_get(syncobj); > >>> + /* You are adding an unorder point to timeline, which could > >>> cause payload returned from query_ioctl is 0! */ > >>> + WARN_ON_ONCE(prev && prev->seqno >= point); > >> > >> > >> I think the WARN/BUG macros should only fire when there is an issue > >> with programming from within the kernel. > >> > >> But this particular warning can be triggered by an application. > >> > >> > >> Probably best to just remove it? > > > > Yeah, that was also my argument against it. > > > > Key point here is that we still want to note somehow that userspace > > did something wrong and returning an error is not an option. > > > > Maybe just use DRM_ERROR with a static variable to print the message > > only once. > > > > Christian. > > I don't really see any point in printing an error once. If you run your > application twice you end up thinking there was an issue just on the first run > but it's actually always wrong. > Apart from this nitpick, is there any other concern blocking the whole patch set? Is it time to push it? -David > > Unless we're willing to take the syncobj lock for longer periods of time when > adding points, I guess we'll have to defer validation to validation layers.
> > -Lionel > > > > >> > >> > >> -Lionel > >> > >> > >>> + dma_fence_chain_init(chain, prev, fence, point); > >>> + rcu_assign_pointer(syncobj->fence, &chain->base); > >>> + > >>> + list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) { > >>> + list_del_init(&cur->node); > >>> + syncobj_wait_syncobj_func(syncobj, cur); > >>> + } > >>> + spin_unlock(&syncobj->lock); > >>> + > >>> + /* Walk the chain once to trigger garbage collection */ > >>> + dma_fence_c
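The unordered-point check debated in this thread can be sketched in isolation. The following is a userspace analog, not the kernel code; the struct and function names are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace analog of the unordered-point check: "timeline" stands in
 * for the syncobj, and only the sequence numbers matter here. */
struct timeline {
    uint64_t last_seqno; /* seqno of the newest chain node, 0 if empty */
    bool warned;         /* report the problem only once, as proposed */
};

/* Returns false when the caller hands in an out-of-order point. The
 * point is still accepted, mirroring the fact that the ioctl cannot
 * return an error at this stage. */
static bool timeline_add_point(struct timeline *tl, uint64_t point)
{
    bool ordered = !(tl->last_seqno && tl->last_seqno >= point);

    if (!ordered && !tl->warned) {
        fprintf(stderr, "unordered point %llu after %llu\n",
                (unsigned long long)point,
                (unsigned long long)tl->last_seqno);
        tl->warned = true;
    }
    tl->last_seqno = point;
    return ordered;
}
```

Lionel's objection maps directly onto this sketch: the warning path is reachable purely from userspace input, which is why a kernel-side WARN is the wrong tool.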
RE: [PATCH] drm/amdgpu: fix old fence check in amdgpu_fence_emit
> -Original Message- > From: amd-gfx On Behalf Of > Christian König > Sent: Saturday, March 30, 2019 2:33 AM > To: amd-gfx@lists.freedesktop.org > Subject: [PATCH] drm/amdgpu: fix old fence check in amdgpu_fence_emit > > We don't hold a reference to the old fence, so it can go away any time we are > waiting for it to signal. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 24 +++++++++++++++++------- > 1 file changed, 17 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > index ee47c11e92ce..4dee2326b29c 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > @@ -136,8 +136,9 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, > struct dma_fence **f, { > struct amdgpu_device *adev = ring->adev; > struct amdgpu_fence *fence; > - struct dma_fence *old, **ptr; > + struct dma_fence __rcu **ptr; > uint32_t seq; > + int r; > > fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL); > if (fence == NULL) > @@ -153,15 +154,24 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, > struct dma_fence **f, > seq, flags | AMDGPU_FENCE_FLAG_INT); > > ptr = &ring->fence_drv.fences[seq & ring->fence_drv.num_fences_mask]; > + if (unlikely(rcu_dereference_protected(*ptr, 1))) { Isn't this line redundant with dma_fence_get_rcu_safe()? I think it's unnecessary. Otherwise looks OK to me. -David > + struct dma_fence *old; > + > + rcu_read_lock(); > + old = dma_fence_get_rcu_safe(ptr); > + rcu_read_unlock(); > + > + if (old) { > + r = dma_fence_wait(old, false); > + dma_fence_put(old); > + if (r) > + return r; > + } > + } > + > /* This function can't be called concurrently anyway, otherwise > * emitting the fence would mess up the hardware ring buffer.
> */ > - old = rcu_dereference_protected(*ptr, 1); > - if (old && !dma_fence_is_signaled(old)) { > - DRM_INFO("rcu slot is busy\n"); > - dma_fence_wait(old, false); > - } > - > rcu_assign_pointer(*ptr, dma_fence_get(&fence->base)); > > *f = &fence->base; > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
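As background for the slot logic in the patch above: the ring keeps a small array of fence slots, and a fence's slot is picked by masking its sequence number, which only works when the slot count is a power of two. A minimal sketch with illustrative names, not the driver code:

```c
#include <stdint.h>

/* A fence's slot is its sequence number masked by (num_fences - 1).
 * The same slot is therefore reused every num_fences emissions, which
 * is why the previous occupant must be dealt with first, exactly the
 * window the patch above is closing. */
static unsigned fence_slot(uint32_t seq, unsigned num_fences)
{
    /* num_fences must be a power of two for the mask to be exact */
    return seq & (num_fences - 1);
}
```

With four slots, sequence numbers 0 and 4 land in the same slot, so by the time seqno 4 is emitted, the fence for seqno 0 must already be signaled (or waited on, as the patch now does with a proper reference held).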
Re: [PATCH 1/9] dma-buf: add new dma_fence_chain container v6
Can cmpxchg() be replaced by some simple C statement? Otherwise we have to remove __rcu from chain->prev. -David Original Message Subject: Re: [PATCH 1/9] dma-buf: add new dma_fence_chain container v6 From: Christian König To: "Zhou, David(ChunMing)" ,kbuild test robot ,"Zhou, David(ChunMing)" CC: kbuild-...@01.org,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,lionel.g.landwer...@intel.com,ja...@jlekstrand.net,"Koenig, Christian" ,"Hector, Tobias" Hi David, For the cmpxchg() case I off-hand don't know either. Looks like so far nobody has used cmpxchg() with RCU protected structures. The other cases should be replaced by RCU_INIT_POINTER() or rcu_dereference_protected(.., true); Regards, Christian. Am 21.03.19 um 07:34 schrieb zhoucm1: > Hi Lionel and Christian, > > Below is the robot report for chain->prev, which was annotated __rcu as you > suggested. > > How do we fix this line: "tmp = cmpxchg(&chain->prev, prev, replacement);"? > I checked the kernel header files; it seems there is no cmpxchg for RCU pointers. > > Any suggestion to fix this robot report? > > Thanks, > -David > > On 2019-03-21 08:24, kbuild test robot wrote: >> Hi Chunming, >> >> I love your patch!
Perhaps something to improve: >> >> [auto build test WARNING on linus/master] >> [also build test WARNING on v5.1-rc1 next-20190320] >> [if your patch is applied to the wrong git tree, please drop us a >> note to help improve the system] >> >> url: >> https://github.com/0day-ci/linux/commits/Chunming-Zhou/dma-buf-add-new-dma_fence_chain-container-v6/20190320-223607 >> reproduce: >> # apt-get install sparse >> make ARCH=x86_64 allmodconfig >> make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' >> >> >> sparse warnings: (new ones prefixed by >>) >> >>>> drivers/dma-buf/dma-fence-chain.c:73:23: sparse: incorrect type in >>>> initializer (different address spaces) @@expected struct >>>> dma_fence [noderef] *__old @@got dma_fence [noderef] >>>> *__old @@ >> drivers/dma-buf/dma-fence-chain.c:73:23:expected struct >> dma_fence [noderef] *__old >> drivers/dma-buf/dma-fence-chain.c:73:23:got struct dma_fence >> *[assigned] prev >>>> drivers/dma-buf/dma-fence-chain.c:73:23: sparse: incorrect type in >>>> initializer (different address spaces) @@expected struct >>>> dma_fence [noderef] *__new @@got dma_fence [noderef] >>>> *__new @@ >> drivers/dma-buf/dma-fence-chain.c:73:23:expected struct >> dma_fence [noderef] *__new >> drivers/dma-buf/dma-fence-chain.c:73:23:got struct dma_fence >> *[assigned] replacement >>>> drivers/dma-buf/dma-fence-chain.c:73:21: sparse: incorrect type in >>>> assignment (different address spaces) @@expected struct >>>> dma_fence *tmp @@got struct dma_fence [noderef] >>> dma_fence *tmp @@ >> drivers/dma-buf/dma-fence-chain.c:73:21:expected struct >> dma_fence *tmp >> drivers/dma-buf/dma-fence-chain.c:73:21:got struct dma_fence >> [noderef] *[assigned] __ret >>>> drivers/dma-buf/dma-fence-chain.c:190:28: sparse: incorrect type in >>>> argument 1 (different address spaces) @@expected struct >>>> dma_fence *fence @@got struct dma_fence struct dma_fence *fence @@ >> drivers/dma-buf/dma-fence-chain.c:190:28:expected struct >> dma_fence *fence >> 
drivers/dma-buf/dma-fence-chain.c:190:28:got struct dma_fence >> [noderef] *prev >>>> drivers/dma-buf/dma-fence-chain.c:222:21: sparse: incorrect type in >>>> assignment (different address spaces) @@expected struct >>>> dma_fence [noderef] *prev @@got [noderef] *prev @@ >> drivers/dma-buf/dma-fence-chain.c:222:21:expected struct >> dma_fence [noderef] *prev >> drivers/dma-buf/dma-fence-chain.c:222:21:got struct dma_fence >> *prev >> drivers/dma-buf/dma-fence-chain.c:235:33: sparse: expression >> using sizeof(void) >> drivers/dma-buf/dma-fence-chain.c:235:33: sparse: expression >> using sizeof(void) >> >> vim +73 drivers/dma-buf/dma-fence-chain.c >> >> 38 >> 39/** >> 40 * dma_fence_chain_walk - chain walking function >> 41 * @fence: current chain node >> 42 * >> 43 * Walk the chain to the next node. Returns the next fence >> or NULL if we are at >> 44 * the end of the chain. Garbage collects chain nodes >> which are already >>
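The cmpxchg() question in this thread is about atomically replacing chain->prev only when no concurrent walker has already swapped it. The semantics (though not the sparse __rcu annotation problem the robot is flagging) can be shown with C11 atomics in userspace; all names here are illustrative:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a chain node; only its identity matters here. */
struct node { int id; };

/* Replace *slot with repl only if it still holds expect. Returns true
 * on success, false if another thread (or a concurrent chain walker)
 * got there first, which is exactly what cmpxchg() gives the kernel
 * code minus the __rcu address-space annotation sparse complains about. */
static bool replace_prev(_Atomic(struct node *) *slot,
                         struct node *expect, struct node *repl)
{
    return atomic_compare_exchange_strong(slot, &expect, repl);
}
```

The sparse warnings above are purely about the `__rcu` address space on the pointer; the compare-and-swap logic itself is the standard pattern shown here.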
Re: [PATCH] drm/amdgpu: enable bo priority setting from user space
Yes, per-submission BO list priority is already used by us, but per-VM BO priority is still in flight; there is no priority on that yet. -David sent from my phone Original Message Subject: Re: [PATCH] drm/amdgpu: enable bo priority setting from user space From: "Koenig, Christian" To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org CC: Well you can already use the per submission priority for the BOs. Additional to that, as I said, for per VM BOs we can add a priority to sort them in the LRU. Not sure how effective both of those actually are. Regards, Christian. Am 07.03.19 um 14:09 schrieb Zhou, David(ChunMing): Yes, you are right, thanks for pointing it out. Will see if there is another way. -David sent from my phone Original Message Subject: Re: [PATCH] drm/amdgpu: enable bo priority setting from user space From: Christian König To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org CC: Am 07.03.19 um 10:15 schrieb Chunming Zhou: > Signed-off-by: Chunming Zhou Well, NAK to the whole approach. The TTM priority is a global priority, but processes are only allowed to specify the priority inside their own allocations. So this approach will never fly upstream. What you can do is to add a priority for per-VM BOs to affect their sort order on the LRU, but I doubt that this will have much of an effect. Regards, Christian.
> --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 1 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 13 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h| 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + > include/drm/ttm/ttm_bo_driver.h| 9 - > include/uapi/drm/amdgpu_drm.h | 3 +++ > 7 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > index 5cbde74b97dd..70a6baf20c22 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > @@ -144,6 +144,7 @@ static int amdgpufb_create_pinned_object(struct > amdgpu_fbdev *rfbdev, >size = mode_cmd->pitches[0] * height; >aligned_size = ALIGN(size, PAGE_SIZE); >ret = amdgpu_gem_object_create(adev, aligned_size, 0, domain, > +TTM_BO_PRIORITY_NORMAL, > AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED | > AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | > AMDGPU_GEM_CREATE_VRAM_CLEARED, > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index d21dd2f369da..7c1c2362c67e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -44,6 +44,7 @@ void amdgpu_gem_object_free(struct drm_gem_object *gobj) > > int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size, > int alignment, u32 initial_domain, > + enum ttm_bo_priority priority, > u64 flags, enum ttm_bo_type type, > struct reservation_object *resv, > struct drm_gem_object **obj) > @@ -60,6 +61,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, > unsigned long size, >bp.type = type; >bp.resv = resv; >bp.preferred_domain = initial_domain; > + bp.priority = priority; > retry: >bp.flags = flags; >bp.domain = initial_domain; > @@ -229,6 +231,14 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void > *data, >if (args->in.domains & ~AMDGPU_GEM_DOMAIN_MASK) >return -EINVAL; > > + /* check 
priority */ > + if (args->in.priority == 0) { > + /* default is normal */ > + args->in.priority = TTM_BO_PRIORITY_NORMAL; > + } else if (args->in.priority > TTM_MAX_BO_PRIORITY) { > + args->in.priority = TTM_MAX_BO_PRIORITY; > + DRM_ERROR("priority specified from user space is over MAX > priority\n"); > + } >/* create a gem object to contain this object in */ >if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS | >AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) { > @@ -252,6 +262,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void > *data, > >r = amdgpu_gem_object_create(adev, size, args->in.alignment, > (u32)(0x & args->in.domains), > +
Re: [PATCH] drm/amdgpu: enable bo priority setting from user space
Yes, you are right, thanks to point it out. Will see if there is other way. -David send from my phone Original Message Subject: Re: [PATCH] drm/amdgpu: enable bo priority setting from user space From: Christian König To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org CC: Am 07.03.19 um 10:15 schrieb Chunming Zhou: > Signed-off-by: Chunming Zhou Well NAK to the whole approach. The TTM priority is a global priority, but processes are only allowed to specific the priority inside their own allocations. So this approach will never fly upstream. What you can do is to add a priority for per vm BOs to affect their sort order on the LRU, but I doubt that this will have much of an effect. Regards, Christian. > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 1 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 13 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h| 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + > include/drm/ttm/ttm_bo_driver.h| 9 - > include/uapi/drm/amdgpu_drm.h | 3 +++ > 7 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > index 5cbde74b97dd..70a6baf20c22 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c > @@ -144,6 +144,7 @@ static int amdgpufb_create_pinned_object(struct > amdgpu_fbdev *rfbdev, >size = mode_cmd->pitches[0] * height; >aligned_size = ALIGN(size, PAGE_SIZE); >ret = amdgpu_gem_object_create(adev, aligned_size, 0, domain, > +TTM_BO_PRIORITY_NORMAL, > AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED | > AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | > AMDGPU_GEM_CREATE_VRAM_CLEARED, > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index d21dd2f369da..7c1c2362c67e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -44,6 +44,7 @@ void 
amdgpu_gem_object_free(struct drm_gem_object *gobj) > > int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size, > int alignment, u32 initial_domain, > + enum ttm_bo_priority priority, > u64 flags, enum ttm_bo_type type, > struct reservation_object *resv, > struct drm_gem_object **obj) > @@ -60,6 +61,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, > unsigned long size, >bp.type = type; >bp.resv = resv; >bp.preferred_domain = initial_domain; > + bp.priority = priority; > retry: >bp.flags = flags; >bp.domain = initial_domain; > @@ -229,6 +231,14 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void > *data, >if (args->in.domains & ~AMDGPU_GEM_DOMAIN_MASK) >return -EINVAL; > > + /* check priority */ > + if (args->in.priority == 0) { > + /* default is normal */ > + args->in.priority = TTM_BO_PRIORITY_NORMAL; > + } else if (args->in.priority > TTM_MAX_BO_PRIORITY) { > + args->in.priority = TTM_MAX_BO_PRIORITY; > + DRM_ERROR("priority specified from user space is over MAX > priority\n"); > + } >/* create a gem object to contain this object in */ >if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS | >AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) { > @@ -252,6 +262,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void > *data, > >r = amdgpu_gem_object_create(adev, size, args->in.alignment, > (u32)(0x & args->in.domains), > + args->in.priority - 1, > flags, ttm_bo_type_device, resv, ); >if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) { >if (!r) { > @@ -304,6 +315,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void > *data, > >/* create a gem object to contain this object in */ >r = amdgpu_gem_object_create(adev, args->size, 0, > AMDGPU_GEM_DOMAIN_CPU, > + TTM_BO_PRIORITY_NORMAL, > 0, ttm_bo_type_device, NULL, ); >if (r) >return r; > @@ -755,6 +
RE: [PATCH] drm/amdgpu: force to use CPU_ACCESS hint optimization
> -Original Message- > From: Christian König > Sent: Wednesday, March 06, 2019 7:55 PM > To: Zhou, David(ChunMing) ; Koenig, Christian > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH] drm/amdgpu: force to use CPU_ACCESS hint > optimization > > Am 06.03.19 um 12:52 schrieb Chunming Zhou: > > As we know, visible VRAM can be moved to invisible VRAM when there is no CPU access. > > > > Signed-off-by: Chunming Zhou > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 +++- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > index bc62bf41b7e9..823deb66f5da 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > @@ -592,8 +592,7 @@ static int amdgpu_info_ioctl(struct drm_device > > *dev, void *data, struct drm_file > > > > vram_gtt.vram_size = adev->gmc.real_vram_size - > > atomic64_read(&adev->vram_pin_size); > > - vram_gtt.vram_cpu_accessible_size = adev->gmc.visible_vram_size - > > - atomic64_read(&adev->visible_pin_size); > > + vram_gtt.vram_cpu_accessible_size = vram_gtt.vram_size; > > Well, NAK that would of course report the full VRAM as visible which isn't > correct. UMD gave the same reason: they want explicit VRAM info reported to the application. No idea how to do that otherwise. -David > > Christian.
> > vram_gtt.gtt_size = adev->mman.bdev.man[TTM_PL_TT].size; > > vram_gtt.gtt_size *= PAGE_SIZE; > > vram_gtt.gtt_size -= atomic64_read(&adev->gart_pin_size); > > @@ -612,9 +611,8 @@ static int amdgpu_info_ioctl(struct drm_device > *dev, void *data, struct drm_file > > mem.vram.max_allocation = mem.vram.usable_heap_size * 3 / 4; > > > > mem.cpu_accessible_vram.total_heap_size = > > - adev->gmc.visible_vram_size; > > - mem.cpu_accessible_vram.usable_heap_size = adev->gmc.visible_vram_size - > > - atomic64_read(&adev->visible_pin_size); > > + mem.vram.total_heap_size; > > + mem.cpu_accessible_vram.usable_heap_size = > > + mem.vram.usable_heap_size; > > mem.cpu_accessible_vram.heap_usage = > > amdgpu_vram_mgr_vis_usage(&adev->mman.bdev.man[TTM_PL_VRAM]); > > mem.cpu_accessible_vram.max_allocation =
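The disagreement in this thread is over which numbers to report to userspace. The two accounting policies can be sketched side by side; the field names mirror the quoted patch, but the function names and the labels "strict"/"optimistic" are this sketch's own:

```c
#include <stdint.h>

struct vram_info {
    uint64_t real_vram_size;    /* total VRAM */
    uint64_t visible_vram_size; /* CPU-visible aperture */
    uint64_t vram_pin_size;     /* pinned anywhere in VRAM */
    uint64_t visible_pin_size;  /* pinned inside the aperture */
};

/* What the kernel reported before the patch: only the CPU-visible
 * aperture, minus what is pinned inside it. */
static uint64_t visible_usable_strict(const struct vram_info *v)
{
    return v->visible_vram_size - v->visible_pin_size;
}

/* What the patch proposes: count all unpinned VRAM, on the grounds
 * that visible BOs can be migrated out when the CPU stops touching
 * them. Christian's NAK: this overstates what is actually visible. */
static uint64_t visible_usable_optimistic(const struct vram_info *v)
{
    return v->real_vram_size - v->vram_pin_size;
}
```

The gap between the two numbers is exactly what UMDs see differently depending on which policy the kernel picks, which is why the thread stalls without a resolution.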
RE: [PATCH 1/3] drm/amdgpu: change Vega IH ring 1 config
Acked-by: Chunming Zhou > -Original Message- > From: amd-gfx On Behalf Of > Christian König > Sent: Wednesday, March 06, 2019 5:29 PM > To: amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/3] drm/amdgpu: change Vega IH ring 1 config > > Ping? Can anybody review this? > > Thanks, > Christian. > > Am 04.03.19 um 20:10 schrieb Christian König: > > Disable overflow and enable full drain. This makes fault handling on > > ring 1 much more reliable since we don't generate back pressure any more. > > > > Signed-off-by: Christian König > > --- > > drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c > > b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c > > index 6d1f804277f8..d4a3cc413af8 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c > > +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c > > @@ -203,6 +203,10 @@ static int vega10_ih_irq_init(struct > > amdgpu_device *adev) > > > > ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, > mmIH_RB_CNTL_RING1); > > ih_rb_cntl = vega10_ih_rb_cntl(ih, ih_rb_cntl); > > + ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, > > + WPTR_OVERFLOW_ENABLE, 0); > > + ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, > > + RB_FULL_DRAIN_ENABLE, 1); > > WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, > ih_rb_cntl); > > > > /* set rptr, wptr to 0 */ > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: Error handling issues about CHECKED_RETURN
> -Original Message- > From: Bo YU > Sent: Thursday, February 14, 2019 12:46 PM > To: Deucher, Alexander ; Koenig, Christian > ; Zhou, David(ChunMing) > ; airl...@linux.ie; dan...@ffwll.ch; Zhu, Rex > ; Grodzovsky, Andrey > ; dri-de...@lists.freedesktop.org; linux- > ker...@vger.kernel.org > Cc: Bo Yu ; amd-gfx@lists.freedesktop.org > Subject: [PATCH] drm/amdgpu: Error handling issues about > CHECKED_RETURN > > From: Bo Yu > > Calling "amdgpu_ring_test_helper" without checking return value We may need to continue the ring tests even if one ring test fails. -David > > Signed-off-by: Bo Yu > --- > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > index 57cb3a51bda7..48465a61516b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > @@ -4728,7 +4728,9 @@ static int gfx_v8_0_cp_test_all_rings(struct > amdgpu_device *adev) > > for (i = 0; i < adev->gfx.num_compute_rings; i++) { > ring = >gfx.compute_ring[i]; > - amdgpu_ring_test_helper(ring); > + r = amdgpu_ring_test_helper(ring); > + if (r) > + return r; > } > > return 0; > -- > 2.11.0
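David's objection above is that returning on the first failure stops the remaining rings from being exercised at all. A middle ground that tests everything and still propagates the first error could look like this; the names are illustrative, not the driver's:

```c
/* Run every ring test; remember the first failure instead of aborting,
 * so all rings still get exercised, then report the error to the caller. */
static int test_all_rings(int (*test_one)(int ring), int num_rings)
{
    int i, r, first_err = 0;

    for (i = 0; i < num_rings; i++) {
        r = test_one(i);
        if (r && !first_err)
            first_err = r; /* keep testing the rest */
    }
    return first_err;
}

static int demo_calls;

/* Example ring test that fails on ring 1 but must not stop the loop. */
static int demo_test_one(int ring)
{
    demo_calls++;
    return ring == 1 ? -5 : 0;
}
```

This keeps CHECKED_RETURN happy (the return value is consumed) without losing coverage of the rings after the failing one.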
RE: [PATCH] drm/amdgpu: partial revert cleanup setting bulk_movable v2
If Tom tests it OK as well, feel free to add my RB and submit it ASAP. -David > -Original Message- > From: amd-gfx On Behalf Of > Christian König > Sent: Thursday, January 31, 2019 3:57 PM > To: amd-gfx@lists.freedesktop.org > Subject: [PATCH] drm/amdgpu: partial revert cleanup setting bulk_movable > v2 > > We still need to set bulk_movable to false when new BOs are added or > removed. > > v2: also set it to false on removal > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 79f9dde70bc0..822546a149fa 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -332,6 +332,7 @@ static void amdgpu_vm_bo_base_init(struct > amdgpu_vm_bo_base *base, > if (bo->tbo.resv != vm->root.base.bo->tbo.resv) > return; > > + vm->bulk_moveable = false; > if (bo->tbo.type == ttm_bo_type_kernel) > amdgpu_vm_bo_relocated(base); > else > @@ -2772,6 +2773,9 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device > *adev, > struct amdgpu_vm_bo_base **base; > > if (bo) { > + if (bo->tbo.resv == vm->root.base.bo->tbo.resv) > + vm->bulk_moveable = false; > + > for (base = &bo_va->base.bo->vm_bo; *base; > base = &(*base)->next) { > if (*base != &bo_va->base) > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
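The partial revert above restores an invariant: the cached "everything is already sorted for a bulk move" flag must be dropped whenever a BO sharing the root reservation is added or removed. In miniature, with illustrative names:

```c
#include <stdbool.h>

struct vm_lru {
    bool bulk_moveable; /* true only while the per-VM BOs sit together
                         * in LRU order and can be moved as one block */
    int  num_bos;
};

static void vm_bo_add(struct vm_lru *vm)
{
    vm->num_bos++;
    vm->bulk_moveable = false; /* membership changed: resort needed */
}

static void vm_bo_remove(struct vm_lru *vm)
{
    vm->num_bos--;
    vm->bulk_moveable = false; /* same: the cached order is stale */
}

static void vm_move_to_lru_tail(struct vm_lru *vm)
{
    /* ... move all per-VM BOs to the LRU tail ..., then cache it */
    vm->bulk_moveable = true;
}
```

The cleanup in patch 2/2 dropped the invalidation on add/remove, which is exactly what this revert puts back.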
Re: [PATCH 2/2] drm/amdgpu: cleanup setting bulk_movable
Reviewed-by: Chunming Zhou sent from my phone Original Message Subject: [PATCH 2/2] drm/amdgpu: cleanup setting bulk_movable From: Christian König To: amd-gfx@lists.freedesktop.org CC: We only need to set this to false now when BOs are removed from the LRU. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index a404ac17e5ae..79f9dde70bc0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -332,7 +332,6 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base, if (bo->tbo.resv != vm->root.base.bo->tbo.resv) return; - vm->bulk_moveable = false; if (bo->tbo.type == ttm_bo_type_kernel) amdgpu_vm_bo_relocated(base); else @@ -698,8 +697,6 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_vm_bo_base *bo_base, *tmp; int r = 0; - vm->bulk_moveable &= list_empty(&vm->evicted); - list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) { struct amdgpu_bo *bo = bo_base->bo; @@ -2775,9 +2772,6 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev, struct amdgpu_vm_bo_base **base; if (bo) { - if (bo->tbo.resv == vm->root.base.bo->tbo.resv) - vm->bulk_moveable = false; - for (base = &bo_va->base.bo->vm_bo; *base; base = &(*base)->next) { if (*base != &bo_va->base) -- 2.17.1
RE: [PATCH libdrm] amdgpu: add a faster BO list API
Looks good to me, Reviewed-by: Chunming Zhou > -Original Message- > From: amd-gfx On Behalf Of > Marek Ol?ák > Sent: Tuesday, January 08, 2019 3:31 AM > To: amd-gfx@lists.freedesktop.org > Subject: [PATCH libdrm] amdgpu: add a faster BO list API > > From: Marek Olšák > > --- > amdgpu/amdgpu-symbol-check | 3 ++ > amdgpu/amdgpu.h| 56 > +- > amdgpu/amdgpu_bo.c | 36 > amdgpu/amdgpu_cs.c | 25 + > 4 files changed, 119 insertions(+), 1 deletion(-) > > diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol- > check index 6f5e0f95..96a44b40 100755 > --- a/amdgpu/amdgpu-symbol-check > +++ b/amdgpu/amdgpu-symbol-check > @@ -12,20 +12,22 @@ _edata > _end > _fini > _init > amdgpu_bo_alloc > amdgpu_bo_cpu_map > amdgpu_bo_cpu_unmap > amdgpu_bo_export > amdgpu_bo_free > amdgpu_bo_import > amdgpu_bo_inc_ref > +amdgpu_bo_list_create_raw > +amdgpu_bo_list_destroy_raw > amdgpu_bo_list_create > amdgpu_bo_list_destroy > amdgpu_bo_list_update > amdgpu_bo_query_info > amdgpu_bo_set_metadata > amdgpu_bo_va_op > amdgpu_bo_va_op_raw > amdgpu_bo_wait_for_idle > amdgpu_create_bo_from_user_mem > amdgpu_cs_chunk_fence_info_to_data > @@ -40,20 +42,21 @@ amdgpu_cs_destroy_semaphore > amdgpu_cs_destroy_syncobj amdgpu_cs_export_syncobj > amdgpu_cs_fence_to_handle amdgpu_cs_import_syncobj > amdgpu_cs_query_fence_status amdgpu_cs_query_reset_state > amdgpu_query_sw_info amdgpu_cs_signal_semaphore > amdgpu_cs_submit amdgpu_cs_submit_raw > +amdgpu_cs_submit_raw2 > amdgpu_cs_syncobj_export_sync_file > amdgpu_cs_syncobj_import_sync_file > amdgpu_cs_syncobj_reset > amdgpu_cs_syncobj_signal > amdgpu_cs_syncobj_wait > amdgpu_cs_wait_fences > amdgpu_cs_wait_semaphore > amdgpu_device_deinitialize > amdgpu_device_initialize > amdgpu_find_bo_by_cpu_mapping > diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h index > dc51659a..5b800033 100644 > --- a/amdgpu/amdgpu.h > +++ b/amdgpu/amdgpu.h > @@ -35,20 +35,21 @@ > #define _AMDGPU_H_ > > #include > #include > > #ifdef __cplusplus > extern "C" { > #endif > > 
struct drm_amdgpu_info_hw_ip; > +struct drm_amdgpu_bo_list_entry; > > > /*--*/ > /* --- Defines > */ /*- > -*/ > > /** > * Define max. number of Command Buffers (IB) which could be sent to the > single > * hardware IP to accommodate CE/DE requirements > * > * \sa amdgpu_cs_ib_info > @@ -767,34 +768,65 @@ int amdgpu_bo_cpu_unmap(amdgpu_bo_handle > buf_handle); > *and no GPU access is scheduled. > * 1 GPU access is in fly or scheduled > * > * \return 0 - on success > * <0 - Negative POSIX Error code > */ > int amdgpu_bo_wait_for_idle(amdgpu_bo_handle buf_handle, > uint64_t timeout_ns, > bool *buffer_busy); > > +/** > + * Creates a BO list handle for command submission. > + * > + * \param dev - \c [in] Device handle. > + * See #amdgpu_device_initialize() > + * \param number_of_buffers- \c [in] Number of BOs in the list > + * \param buffers - \c [in] List of BO handles > + * \param result - \c [out] Created BO list handle > + * > + * \return 0 on success\n > + * <0 - Negative POSIX Error code > + * > + * \sa amdgpu_bo_list_destroy_raw() > +*/ > +int amdgpu_bo_list_create_raw(amdgpu_device_handle dev, > + uint32_t number_of_buffers, > + struct drm_amdgpu_bo_list_entry *buffers, > + uint32_t *result); > + > +/** > + * Destroys a BO list handle. > + * > + * \param bo_list - \c [in] BO list handle. > + * > + * \return 0 on success\n > + * <0 - Negative POSIX Error code > + * > + * \sa amdgpu_bo_list_create_raw(), amdgpu_cs_submit_raw2() */ int > +amdgpu_bo_list_destroy_raw(amdgpu_device_handle dev, uint32_t > bo_list); > + > /** > * Creates a BO list handle for command submission. > * > * \param dev - \c [in] Device handle. 
> * See #amdgpu_device_initialize() > * \param number_of_resources - \c [in] Number of BOs in the list > * \param resources- \c [in] List of BO handles > * \param resource_prios - \c [in] Optional priority for each handle > * \param result - \c [out] Created BO list handle > * > * \return 0 on success\n > * <0 - Negative POSIX Error code > * > - * \sa amdgpu_bo_list_destroy() > + * \sa amdgpu_bo_list_destroy(), amdgpu_cs_submit_raw2() > */ > int amdgpu_bo_list_create(amdgpu_device_handle dev, >
RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt accessed
Doesn't gpu check PTE prt bit first and then access va range? Even wrte to dummy page, seem there still is no problem, we don't care that content at all. -David > -Original Message- > From: Christian König > Sent: Thursday, January 03, 2019 5:54 PM > To: Zhou, David(ChunMing) ; Koenig, Christian > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt > accessed > > Writes are then not ignored and garbage the dummy page. > > Christian. > > Am 03.01.19 um 10:46 schrieb Zhou, David(ChunMing): > > Seems we don't need two page table, we just map every prt range to > dummy page, any problem? > > > > -David > > > >> -Original Message- > >> From: Zhou, David(ChunMing) > >> Sent: Thursday, January 03, 2019 5:23 PM > >> To: Koenig, Christian ; amd- > >> g...@lists.freedesktop.org > >> Subject: RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt > >> accessed > >> > >> > >> > >>> -Original Message- > >>> From: Christian König > >>> Sent: Thursday, January 03, 2019 5:05 PM > >>> To: Zhou, David(ChunMing) ; Koenig, Christian > >>> ; Zhou, David(ChunMing) > >>> ; amd-gfx@lists.freedesktop.org > >>> Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt > >>> accessed > >>> > >>> Yes, exactly. > >>> > >>> Problem is that we then probably need two page tables. One for the > >>> CB/TC and one for the SDMA. > >> But when setup page table, how can we know the client is CB/TC or SDMA? > >> > >> -David > >> > >>> Christian. > >>> > >>> Am 03.01.19 um 10:02 schrieb zhoucm1: > >>>> need dummy page for that? > >>>> > >>>> > >>>> -David > >>>> > >>>> > >>>> On 2019年01月03日 17:01, Christian König wrote: > >>>>> NAK, the problem is not the interrupt. > >>>>> > >>>>> E.g. causing faults by accessing unmapped pages with the SDMA can > >>>>> still crash the MC. > >>>>> > >>>>> The key point is that SDMA can't work with PRT tiles on pre-gmc9 > >>>>> and we need to forbid access on the application side. 
> >>>>> > >>>>> Regards, > >>>>> Christian. > >>>>> > >>>>> Am 03.01.19 um 09:54 schrieb Chunming Zhou: > >>>>>> For pre-gmc9, UMD can only access unmapped PRT tile from CB/TC > >>>>>> without firing VM fault. Kernel would still receive the VM fault > >>>>>> interrupt and output the error message if SDMA is the mc_client. > >>>>>> GMC9 don't need the same since it handle the PRT in different way. > >>>>>> We cannot just skip message for SDMA, as Christian pointed, VM > >>>>>> fault could crash mc block, so we disable vm fault irq during prt > >>>>>> range is accesed. > >>>>>> The nagative is normal vm fault could be ignored during that > >>>>>> peroid without enabling vm_debug kernel parameter. > >>>>>> > >>>>>> Change-Id: Ic3c62393768eca90e3e45eaf81e7f26f2e91de84 > >>>>>> Signed-off-by: Chunming Zhou > >>>>>> --- > >>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 6 ++ > >>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++ > >>>>>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++ > >>>>>> 3 files changed, 18 insertions(+) > >>>>>> > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>>>>> index dae73f6768c2..175c4b319559 100644 > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>>>>> @@ -486,6 +486,10 @@ static void gmc_v6_0_set_prt(struct > >>>>>> amdgpu_device *adev, bool enable) > >>>>>> WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high); > >>>>>> WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high); > >>>>>> WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high); > >>>>>> + /* Note: whe
RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt accessed
Seems we don't need two page table, we just map every prt range to dummy page, any problem? -David > -Original Message- > From: Zhou, David(ChunMing) > Sent: Thursday, January 03, 2019 5:23 PM > To: Koenig, Christian ; amd- > g...@lists.freedesktop.org > Subject: RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt > accessed > > > > > -Original Message- > > From: Christian König > > Sent: Thursday, January 03, 2019 5:05 PM > > To: Zhou, David(ChunMing) ; Koenig, Christian > > ; Zhou, David(ChunMing) > > ; amd-gfx@lists.freedesktop.org > > Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt > > accessed > > > > Yes, exactly. > > > > Problem is that we then probably need two page tables. One for the > > CB/TC and one for the SDMA. > > But when setup page table, how can we know the client is CB/TC or SDMA? > > -David > > > > > Christian. > > > > Am 03.01.19 um 10:02 schrieb zhoucm1: > > > need dummy page for that? > > > > > > > > > -David > > > > > > > > > On 2019年01月03日 17:01, Christian König wrote: > > >> NAK, the problem is not the interrupt. > > >> > > >> E.g. causing faults by accessing unmapped pages with the SDMA can > > >> still crash the MC. > > >> > > >> The key point is that SDMA can't work with PRT tiles on pre-gmc9 > > >> and we need to forbid access on the application side. > > >> > > >> Regards, > > >> Christian. > > >> > > >> Am 03.01.19 um 09:54 schrieb Chunming Zhou: > > >>> For pre-gmc9, UMD can only access unmapped PRT tile from CB/TC > > >>> without firing VM fault. Kernel would still receive the VM fault > > >>> interrupt and output the error message if SDMA is the mc_client. > > >>> GMC9 don't need the same since it handle the PRT in different way. > > >>> We cannot just skip message for SDMA, as Christian pointed, VM > > >>> fault could crash mc block, so we disable vm fault irq during prt > > >>> range is accesed. 
> > >>> The nagative is normal vm fault could be ignored during that > > >>> peroid without enabling vm_debug kernel parameter. > > >>> > > >>> Change-Id: Ic3c62393768eca90e3e45eaf81e7f26f2e91de84 > > >>> Signed-off-by: Chunming Zhou > > >>> --- > > >>> drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 6 ++ > > >>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++ > > >>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++ > > >>> 3 files changed, 18 insertions(+) > > >>> > > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > > >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > > >>> index dae73f6768c2..175c4b319559 100644 > > >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > > >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > > >>> @@ -486,6 +486,10 @@ static void gmc_v6_0_set_prt(struct > > >>> amdgpu_device *adev, bool enable) > > >>> WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high); > > >>> WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high); > > >>> WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high); > > >>> + /* Note: when vm_debug enabled, vm fault from SDMAx > > >>> +accessing > > >>> + * PRT range is normal. */ > > >>> + if (!amdgpu_vm_debug) > > >>> + amdgpu_irq_put(adev, >gmc.vm_fault, 0); > > >>> } else { > > >>> WREG32(mmVM_PRT_APERTURE0_LOW_ADDR, 0xfff); > > >>> WREG32(mmVM_PRT_APERTURE1_LOW_ADDR, 0xfff); @@ - > > 495,6 > > >>> +499,8 @@ static void gmc_v6_0_set_prt(struct amdgpu_device > *adev, > > >>> bool enable) > > >>> WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, 0x0); > > >>> WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, 0x0); > > >>> WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, 0x0); > > >>> + if (!amdgpu_vm_debug) > > >>> + amdgpu_irq_get(adev, >gmc.vm_fault, 0); > > >>> } > > >>> } > > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > > >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > > >>> index 5bdeb358bfb5..a4d6d219
RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt accessed
> -Original Message- > From: Christian König > Sent: Thursday, January 03, 2019 5:05 PM > To: Zhou, David(ChunMing) ; Koenig, Christian > ; Zhou, David(ChunMing) > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt > accessed > > Yes, exactly. > > Problem is that we then probably need two page tables. One for the CB/TC > and one for the SDMA. But when setup page table, how can we know the client is CB/TC or SDMA? -David > > Christian. > > Am 03.01.19 um 10:02 schrieb zhoucm1: > > need dummy page for that? > > > > > > -David > > > > > > On 2019年01月03日 17:01, Christian König wrote: > >> NAK, the problem is not the interrupt. > >> > >> E.g. causing faults by accessing unmapped pages with the SDMA can > >> still crash the MC. > >> > >> The key point is that SDMA can't work with PRT tiles on pre-gmc9 and > >> we need to forbid access on the application side. > >> > >> Regards, > >> Christian. > >> > >> Am 03.01.19 um 09:54 schrieb Chunming Zhou: > >>> For pre-gmc9, UMD can only access unmapped PRT tile from CB/TC > >>> without firing VM fault. Kernel would still receive the VM fault > >>> interrupt and output the error message if SDMA is the mc_client. > >>> GMC9 don't need the same since it handle the PRT in different way. > >>> We cannot just skip message for SDMA, as Christian pointed, VM fault > >>> could crash mc block, so we disable vm fault irq during prt range is > >>> accesed. > >>> The nagative is normal vm fault could be ignored during that peroid > >>> without enabling vm_debug kernel parameter. 
> >>> > >>> Change-Id: Ic3c62393768eca90e3e45eaf81e7f26f2e91de84 > >>> Signed-off-by: Chunming Zhou > >>> --- > >>> drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 6 ++ > >>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++ > >>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++ > >>> 3 files changed, 18 insertions(+) > >>> > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>> index dae73f6768c2..175c4b319559 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > >>> @@ -486,6 +486,10 @@ static void gmc_v6_0_set_prt(struct > >>> amdgpu_device *adev, bool enable) > >>> WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high); > >>> WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high); > >>> WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high); > >>> + /* Note: when vm_debug enabled, vm fault from SDMAx > >>> +accessing > >>> + * PRT range is normal. */ > >>> + if (!amdgpu_vm_debug) > >>> + amdgpu_irq_put(adev, >gmc.vm_fault, 0); > >>> } else { > >>> WREG32(mmVM_PRT_APERTURE0_LOW_ADDR, 0xfff); > >>> WREG32(mmVM_PRT_APERTURE1_LOW_ADDR, 0xfff); @@ - > 495,6 > >>> +499,8 @@ static void gmc_v6_0_set_prt(struct amdgpu_device *adev, > >>> bool enable) > >>> WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, 0x0); > >>> WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, 0x0); > >>> WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, 0x0); > >>> + if (!amdgpu_vm_debug) > >>> + amdgpu_irq_get(adev, >gmc.vm_fault, 0); > >>> } > >>> } > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > >>> index 5bdeb358bfb5..a4d6d219f4e8 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > >>> @@ -582,6 +582,10 @@ static void gmc_v7_0_set_prt(struct > >>> amdgpu_device *adev, bool enable) > >>> WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high); > >>> WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high); > >>> WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high); > >>> + /* Note: when vm_debug 
enabled, vm fault from SDMAx > >>> +accessing > >>> + * PRT range is normal. */ > >>> + if (!amdgpu_vm_debug) > >>> + amdgpu_irq_put(adev, >gmc.vm_fault, 0); > >>> } else { > >>>
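David's dummy-page suggestion above can be sketched as a toy userspace model of a page table: every PRT-flagged entry resolves to one shared dummy page, so a stray access returns harmless data instead of faulting. This is purely illustrative, not kernel code; `pt_lookup`, the `PTE_PRT` flag, and the addresses are invented for the sketch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PTE_VALID  (1u << 0)
#define PTE_PRT    (1u << 1)   /* invented flag marking a PRT range */

static uint64_t dummy_page = 0xD0000000;  /* one shared backing page */

struct pte { uint64_t addr; uint32_t flags; };

/* Resolve a GPU VA: PRT entries all map to the shared dummy page, so
 * any client reads harmless data instead of raising a VM fault. */
static uint64_t pt_lookup(const struct pte *pt, uint64_t va)
{
	const struct pte *e = &pt[va >> PAGE_SHIFT];
	uint64_t off = va & ((1u << PAGE_SHIFT) - 1);

	if (e->flags & PTE_PRT)
		return dummy_page | off;
	if (e->flags & PTE_VALID)
		return e->addr | off;
	return (uint64_t)-1;  /* unmapped: would fault */
}
```

Note Christian's objection still applies to the real hardware: with a single page table there is no way to give CB/TC the PRT behavior while keeping SDMA out, which is exactly why the thread ends up discussing two tables.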
RE: [Intel-gfx] [PATCH 03/10] drm/syncobj: add new drm_syncobj_add_point interface v2
+ Daniel Rakos and Jason Ekstrand. Below is the background, which is from Daniel R should be able to explain that's why: " ISVs, especially those coming from D3D12, are unsatisfied with the behavior of the Vulkan semaphores as they are unhappy with the fact that for every single dependency they need to use separate semaphores due to their binary nature. Compared to that a synchronization primitive like D3D12 monitored fences enable one of those to be used to track a sequence of operations by simply associating timeline values to the completion of individual operations. This allows them to track the lifetime and usage of resources and the ordered completion of sequences. Besides that, they also want to use a single synchronization primitive to be able to handle GPU-to-GPU and GPU-to-CPU dependencies, compared to using semaphores for the former and fences for the latter. In addition, compared to legacy semaphores, timeline semaphores are proposed to support wait-before-signal, i.e. allow enqueueing a semaphore wait operation with a wait value that is larger than any of the already enqueued signal values. This seems to be a hard requirement for ISVs. Without UMD-side queue batching, and even UMD-side queue batching doesn’t help the situation when such a semaphore is externally shared with another API. Thus in order to properly support wait-before-signal the KMD implementation has to also be able to support such dependencies. " Btw, we already add test case to igt, and tested by many existing test, like libdrm unit test, igt related test, vulkan cts, and steam games. 
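Daniel R's background above boils down to a single monotonically increasing 64-bit payload replacing a pile of binary semaphores. A minimal single-threaded model of that D3D12-style monitored fence, including the wait-before-signal case (a wait enqueued with a value larger than anything yet signaled simply stays unsatisfied), might look like the following; the helper names are invented for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model of a monitored fence / timeline semaphore: one u64
 * payload tracks an ordered sequence of operations. */
struct timeline { uint64_t payload; };

static void timeline_signal(struct timeline *t, uint64_t value)
{
	if (value > t->payload)   /* payload only ever moves forward */
		t->payload = value;
}

/* Wait-before-signal: this may be evaluated for a value no signal has
 * reached yet; it just reports unsatisfied until one does. */
static bool timeline_wait_satisfied(const struct timeline *t, uint64_t value)
{
	return t->payload >= value;
}
```

The hard kernel problem the thread is about is precisely the wait path: the KMD must block such a wait without letting the unresolved dependency leak into memory management.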
-David > -Original Message- > From: Daniel Vetter > Sent: Wednesday, December 12, 2018 7:15 PM > To: Koenig, Christian > Cc: Zhou, David(ChunMing) ; dri-devel de...@lists.freedesktop.org>; amd-gfx list ; > intel-gfx ; Christian König > > Subject: Re: [Intel-gfx] [PATCH 03/10] drm/syncobj: add new > drm_syncobj_add_point interface v2 > > On Wed, Dec 12, 2018 at 12:08 PM Koenig, Christian > wrote: > > > > Am 12.12.18 um 11:49 schrieb Daniel Vetter: > > > On Fri, Dec 07, 2018 at 11:54:15PM +0800, Chunming Zhou wrote: > > >> From: Christian König > > >> > > >> Use the dma_fence_chain object to create a timeline of fence > > >> objects instead of just replacing the existing fence. > > >> > > >> v2: rebase and cleanup > > >> > > >> Signed-off-by: Christian König > > > Somewhat jumping back into this. Not sure we discussed this already > > > or not. I'm a bit unclear on why we have to chain the fences in the > timeline: > > > > > > - The timeline stuff is modelled after the WDDM2 monitored fences. > Which > > >really are just u64 counters in memory somewhere (I think could be > > >system ram or vram). Because WDDM2 has the memory management > entirely > > >separated from rendering synchronization it totally allows userspace to > > >create loops and deadlocks and everything else nasty using this - the > > >memory manager won't deadlock because these monitored fences > never leak > > >into the buffer manager. And if CS deadlock, gpu reset takes care of > > > the > > >mess. > > > > > > - This has a few consequences, as in they seem to indeed work like a > > >memory location: Userspace incrementing out-of-order (because they > run > > >batches updating the same fence on different engines) is totally fine, > > >as is doing anything else "stupid". > > > > > > - Now on linux we can't allow anything, because we need to make sure > that > > >deadlocks don't leak into the memory manager. 
But as long as we block > > >until the underlying dma_fence has materialized, nothing userspace can > > >do will lead to such a deadlock. Even if userspace ends up submitting > > >jobs without enough built-in synchronization, leading to out-of-order > > >signalling of fences on that "timeline". And I don't think that would > > >pose a problem for us. > > > > > > Essentially I think we can look at timeline syncobj as a dma_fence > > > container indexed through an integer, and there's no need to enforce > > > that the timline works like a real dma_fence timeline, with all it's > > > guarantees. It's just a pile of (possibly, if userspace is stupid) > > > unrelated dma_fences. You could implement the entire thing in > > > userspace after all, except for the "we want to share these timeline > > > objects between processes" problem. > > > > > > tldr; I think
RE: [PATCH 01/10] dma-buf: add new dma_fence_chain container v4
Hi Daniel and Chris, Could you take a look on all the patches? Can we get your RB or AB on all patches including igt patch before we submit to drm-misc? We already fix all existing issues, and also add test case in IGT as your required. Btw, the patch set is tested by below tests: a. vulkan cts " ./deqp-vk -n dEQP-VK. *semaphore*" b. internal vulkan timeline test c. libdrm test "sudo ./amdgpu_test -s 9" d. IGT test, "sudo ./syncobj_basic" e. IGT test, "sudo ./syncobj_wait" f. IGT test, "sudo ./syncobj_timeline" Any other suggestion or requirement is welcome. -David > -Original Message- > From: dri-devel On Behalf Of > Chunming Zhou > Sent: Tuesday, December 11, 2018 6:35 PM > To: Koenig, Christian ; dri- > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; intel- > g...@lists.freedesktop.org > Cc: Christian König ; Koenig, Christian > > Subject: [PATCH 01/10] dma-buf: add new dma_fence_chain container v4 > > From: Christian König > > Lockless container implementation similar to a dma_fence_array, but with > only two elements per node and automatic garbage collection. > > v2: properly document dma_fence_chain_for_each, add > dma_fence_chain_find_seqno, > drop prev reference during garbage collection if it's not a chain fence. 
> v3: use head and iterator for dma_fence_chain_for_each > v4: fix reference count in dma_fence_chain_enable_signaling > > Signed-off-by: Christian König > --- > drivers/dma-buf/Makefile | 3 +- > drivers/dma-buf/dma-fence-chain.c | 241 > ++ > include/linux/dma-fence-chain.h | 81 ++ > 3 files changed, 324 insertions(+), 1 deletion(-) create mode 100644 > drivers/dma-buf/dma-fence-chain.c create mode 100644 include/linux/dma- > fence-chain.h > > diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index > 0913a6ccab5a..1f006e083eb9 100644 > --- a/drivers/dma-buf/Makefile > +++ b/drivers/dma-buf/Makefile > @@ -1,4 +1,5 @@ > -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno- > fence.o > +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ > + reservation.o seqno-fence.o > obj-$(CONFIG_SYNC_FILE) += sync_file.o > obj-$(CONFIG_SW_SYNC)+= sw_sync.o sync_debug.o > obj-$(CONFIG_UDMABUF)+= udmabuf.o > diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma- > fence-chain.c > new file mode 100644 > index ..0c5e3c902fa0 > --- /dev/null > +++ b/drivers/dma-buf/dma-fence-chain.c > @@ -0,0 +1,241 @@ > +/* > + * fence-chain: chain fences together in a timeline > + * > + * Copyright (C) 2018 Advanced Micro Devices, Inc. > + * Authors: > + * Christian König > + * > + * This program is free software; you can redistribute it and/or modify > +it > + * under the terms of the GNU General Public License version 2 as > +published by > + * the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > +WITHOUT > + * ANY WARRANTY; without even the implied warranty of > MERCHANTABILITY > +or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public > +License for > + * more details. 
> + */ > + > +#include > + > +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); > + > +/** > + * dma_fence_chain_get_prev - use RCU to get a reference to the > +previous fence > + * @chain: chain node to get the previous node from > + * > + * Use dma_fence_get_rcu_safe to get a reference to the previous fence > +of the > + * chain node. > + */ > +static struct dma_fence *dma_fence_chain_get_prev(struct > +dma_fence_chain *chain) { > + struct dma_fence *prev; > + > + rcu_read_lock(); > + prev = dma_fence_get_rcu_safe(>prev); > + rcu_read_unlock(); > + return prev; > +} > + > +/** > + * dma_fence_chain_walk - chain walking function > + * @fence: current chain node > + * > + * Walk the chain to the next node. Returns the next fence or NULL if > +we are at > + * the end of the chain. Garbage collects chain nodes which are already > + * signaled. > + */ > +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) { > + struct dma_fence_chain *chain, *prev_chain; > + struct dma_fence *prev, *replacement, *tmp; > + > + chain = to_dma_fence_chain(fence); > + if (!chain) { > + dma_fence_put(fence); > + return NULL; > + } > + > + while ((prev = dma_fence_chain_get_prev(chain))) { > + > + prev_chain = to_dma_fence_chain(prev); > + if (prev_chain) { > + if (!dma_fence_is_signaled(prev_chain->fence)) > + break; > + > + replacement = > dma_fence_chain_get_prev(prev_chain); > + } else { > + if (!dma_fence_is_signaled(prev)) > + break; > + > + replacement = NULL; > + } > + > + tmp
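The garbage-collecting walk in `dma_fence_chain_walk` above can be reduced to a single-threaded model: each node links to its predecessor, and walking unlinks predecessors whose fence has already signaled, which is the "automatic garbage collection" the commit message mentions. This sketch deliberately drops the RCU protection, reference counting, and the cmpxchg replacement logic of the real code:

```c
#include <stdbool.h>
#include <stddef.h>

struct chain_node {
	bool signaled;              /* stands in for the node's fence */
	struct chain_node *prev;
};

/* Step from @node to its predecessor, collecting any already-signaled
 * nodes along the way so the chain stays short. */
static struct chain_node *chain_walk(struct chain_node *node)
{
	if (!node)
		return NULL;
	while (node->prev && node->prev->signaled)
		node->prev = node->prev->prev;
	return node->prev;
}
```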
RE: [PATCH v3 2/2] drm/sched: Rework HW fence processing.
I don't think adding cb to sched job would work as soon as their lifetime is different with fence. Unless you make the sched job reference, otherwise we will get trouble sooner or later. -David > -Original Message- > From: amd-gfx On Behalf Of > Andrey Grodzovsky > Sent: Tuesday, December 11, 2018 5:44 AM > To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; > ckoenig.leichtzumer...@gmail.com; e...@anholt.net; > etna...@lists.freedesktop.org > Cc: Zhou, David(ChunMing) ; Liu, Monk > ; Grodzovsky, Andrey > > Subject: [PATCH v3 2/2] drm/sched: Rework HW fence processing. > > Expedite job deletion from ring mirror list to the HW fence signal callback > instead from finish_work, together with waiting for all such fences to signal > in > drm_sched_stop we garantee that already signaled job will not be processed > twice. > Remove the sched finish fence callback and just submit finish_work directly > from the HW fence callback. > > v2: Fix comments. > > v3: Attach hw fence cb to sched_job > > Suggested-by: Christian Koenig > Signed-off-by: Andrey Grodzovsky > --- > drivers/gpu/drm/scheduler/sched_main.c | 58 -- > > include/drm/gpu_scheduler.h| 6 ++-- > 2 files changed, 30 insertions(+), 34 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index cdf95e2..f0c1f32 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -284,8 +284,6 @@ static void drm_sched_job_finish(struct work_struct > *work) > cancel_delayed_work_sync(>work_tdr); > > spin_lock_irqsave(>job_list_lock, flags); > - /* remove job from ring_mirror_list */ > - list_del_init(_job->node); > /* queue TDR for next job */ > drm_sched_start_timeout(sched); > spin_unlock_irqrestore(>job_list_lock, flags); @@ -293,22 > +291,11 @@ static void drm_sched_job_finish(struct work_struct *work) > sched->ops->free_job(s_job); > } > > -static void drm_sched_job_finish_cb(struct dma_fence *f, > - 
struct dma_fence_cb *cb) > -{ > - struct drm_sched_job *job = container_of(cb, struct drm_sched_job, > - finish_cb); > - schedule_work(>finish_work); > -} > - > static void drm_sched_job_begin(struct drm_sched_job *s_job) { > struct drm_gpu_scheduler *sched = s_job->sched; > unsigned long flags; > > - dma_fence_add_callback(_job->s_fence->finished, _job- > >finish_cb, > -drm_sched_job_finish_cb); > - > spin_lock_irqsave(>job_list_lock, flags); > list_add_tail(_job->node, >ring_mirror_list); > drm_sched_start_timeout(sched); > @@ -359,12 +346,11 @@ void drm_sched_stop(struct drm_gpu_scheduler > *sched, struct drm_sched_job *bad, > list_for_each_entry_reverse(s_job, >ring_mirror_list, node) > { > if (s_job->s_fence->parent && > dma_fence_remove_callback(s_job->s_fence->parent, > - _job->s_fence->cb)) { > + _job->cb)) { > dma_fence_put(s_job->s_fence->parent); > s_job->s_fence->parent = NULL; > atomic_dec(>hw_rq_count); > - } > - else { > + } else { > /* TODO Is it get/put neccessey here ? */ > dma_fence_get(_job->s_fence->finished); > list_add(_job->finish_node, _list); @@ - > 417,31 +403,34 @@ EXPORT_SYMBOL(drm_sched_stop); void > drm_sched_start(struct drm_gpu_scheduler *sched, bool unpark_only) { > struct drm_sched_job *s_job, *tmp; > - unsigned long flags; > int r; > > if (unpark_only) > goto unpark; > > - spin_lock_irqsave(>job_list_lock, flags); > + /* > + * Locking the list is not required here as the sched thread is parked > + * so no new jobs are being pushed in to HW and in drm_sched_stop > we > + * flushed all the jobs who were still in mirror list but who already > + * signaled and removed them self from the list. Also concurrent > + * GPU recovers can't run in parallel. > + */ > list_for_each_entry_safe(s_job, tmp, >ring_mirror_list, > node) { > - struct drm_sched_fence *s_fence = s_job->s_fence; > struct dma_fence *fence = s_job->s_fence->p
RE: [PATCH -next] drm/amdgpu: remove set but not used variable 'grbm_soft_reset'
> -Original Message- > From: YueHaibing > Sent: Saturday, December 08, 2018 11:01 PM > To: Deucher, Alexander ; Koenig, Christian > ; Zhou, David(ChunMing) > ; airl...@linux.ie; Liu, Leo ; > Gao, Likun ; Panariti, David > ; S, Shirish ; Zhu, Rex > ; Grodzovsky, Andrey > Cc: YueHaibing ; amd-gfx@lists.freedesktop.org; > dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; kernel- > janit...@vger.kernel.org > Subject: [PATCH -next] drm/amdgpu: remove set but not used variable > 'grbm_soft_reset' > > Fixes gcc '-Wunused-but-set-variable' warning: > > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function > 'gfx_v8_0_pre_soft_reset': > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:4950:27: warning: > variable 'srbm_soft_reset' set but not used [-Wunused-but-set-variable] > > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function > 'gfx_v8_0_post_soft_reset': > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:5054:27: warning: > variable 'srbm_soft_reset' set but not used [-Wunused-but-set-variable] > > It never used since introduction in commit d31a501ead7f ("drm/amdgpu: add > pre_soft_reset ip func") and e4ae0fc33631 ("drm/amdgpu: implement > gfx8 post_soft_reset") > > Signed-off-by: YueHaibing Reviewed-by: Chunming Zhou > --- > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > index 1454fc3..8c1ba79 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c > @@ -4947,14 +4947,13 @@ static bool gfx_v8_0_check_soft_reset(void > *handle) static int gfx_v8_0_pre_soft_reset(void *handle) { > struct amdgpu_device *adev = (struct amdgpu_device *)handle; > - u32 grbm_soft_reset = 0, srbm_soft_reset = 0; > + u32 grbm_soft_reset = 0; > > if ((!adev->gfx.grbm_soft_reset) && > (!adev->gfx.srbm_soft_reset)) > return 0; > > grbm_soft_reset = adev->gfx.grbm_soft_reset; > - srbm_soft_reset = adev->gfx.srbm_soft_reset; 
> > /* stop the rlc */ > adev->gfx.rlc.funcs->stop(adev); > @@ -5051,14 +5050,13 @@ static int gfx_v8_0_soft_reset(void *handle) > static int gfx_v8_0_post_soft_reset(void *handle) { > struct amdgpu_device *adev = (struct amdgpu_device *)handle; > - u32 grbm_soft_reset = 0, srbm_soft_reset = 0; > + u32 grbm_soft_reset = 0; > > if ((!adev->gfx.grbm_soft_reset) && > (!adev->gfx.srbm_soft_reset)) > return 0; > > grbm_soft_reset = adev->gfx.grbm_soft_reset; > - srbm_soft_reset = adev->gfx.srbm_soft_reset; > > if (REG_GET_FIELD(grbm_soft_reset, GRBM_SOFT_RESET, > SOFT_RESET_CP) || > REG_GET_FIELD(grbm_soft_reset, GRBM_SOFT_RESET, > SOFT_RESET_CPF) || > > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v6
Even you should rename amdgpu_mn.c/h to amdgpu_hmm.c/h. -David > -Original Message- > From: amd-gfx On Behalf Of Yang, > Philip > Sent: Friday, December 07, 2018 5:03 AM > To: amd-gfx@lists.freedesktop.org > Cc: Yang, Philip > Subject: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu > notifier v6 > > Replace our MMU notifier with > hmm_mirror_ops.sync_cpu_device_pagetables > callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a > dependency in DRM_AMDGPU_USERPTR Kconfig. > > It supports both KFD userptr and gfx userptr paths. > > The depdent HMM patchset from Jérôme Glisse are all merged into 4.20.0 > kernel now. > > Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e > Signed-off-by: Philip Yang > --- > drivers/gpu/drm/amd/amdgpu/Kconfig | 6 +- > drivers/gpu/drm/amd/amdgpu/Makefile| 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 122 ++- > -- > drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 2 +- > 4 files changed, 55 insertions(+), 77 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig > b/drivers/gpu/drm/amd/amdgpu/Kconfig > index 9221e5489069..960a63355705 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Kconfig > +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig > @@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK config > DRM_AMDGPU_USERPTR > bool "Always enable userptr write support" > depends on DRM_AMDGPU > - select MMU_NOTIFIER > + select HMM_MIRROR > help > - This option selects CONFIG_MMU_NOTIFIER if it isn't already > - selected to enabled full userptr support. > + This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it > + isn't already selected to enabled full userptr support. 
> > config DRM_AMDGPU_GART_DEBUGFS > bool "Allow GART access through debugfs" > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile > b/drivers/gpu/drm/amd/amdgpu/Makefile > index f76bcb9c45e4..675efc850ff4 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -172,7 +172,7 @@ endif > amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o > amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o > amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o > -amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o > +amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o > > include $(FULL_AMD_PATH)/powerplay/Makefile > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c > index e55508b39496..5d518d2bb9be 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c > @@ -45,7 +45,7 @@ > > #include > #include > -#include > +#include > #include > #include > #include > @@ -58,7 +58,6 @@ > * > * @adev: amdgpu device pointer > * @mm: process address space > - * @mn: MMU notifier structure > * @type: type of MMU notifier > * @work: destruction work item > * @node: hash table node to find structure by adev and mn @@ -66,6 +65,7 > @@ > * @objects: interval tree containing amdgpu_mn_nodes > * @read_lock: mutex for recursive locking of @lock > * @recursion: depth of recursion > + * @mirror: HMM mirror function support > * > * Data for each amdgpu device and process address space. 
> */ > @@ -73,7 +73,6 @@ struct amdgpu_mn { > /* constant after initialisation */ > struct amdgpu_device*adev; > struct mm_struct*mm; > - struct mmu_notifier mn; > enum amdgpu_mn_type type; > > /* only used on destruction */ > @@ -87,6 +86,9 @@ struct amdgpu_mn { > struct rb_root_cached objects; > struct mutexread_lock; > atomic_trecursion; > + > + /* HMM mirror */ > + struct hmm_mirror mirror; > }; > > /** > @@ -103,7 +105,7 @@ struct amdgpu_mn_node { }; > > /** > - * amdgpu_mn_destroy - destroy the MMU notifier > + * amdgpu_mn_destroy - destroy the HMM mirror > * > * @work: previously sheduled work item > * > @@ -129,28 +131,26 @@ static void amdgpu_mn_destroy(struct work_struct > *work) > } > up_write(>lock); > mutex_unlock(>mn_lock); > - mmu_notifier_unregister_no_release(>mn, amn->mm); > + > + hmm_mirror_unregister(>mirror); > kfree(amn); > } > > /** > - * amdgpu_mn_release - callback to notify about mm destruction > + * amdgpu_hmm_mirror_release - callback to notify about mm destruction > * > - * @mn: our notifier > - * @mm: the mm this callback is about > + * @mirror: the HMM mirror (mm) this callback is about > * > - * Shedule a work item to lazy destroy our notifier. > + * Shedule a work item to lazy destroy HMM mirror. > */ > -static void amdgpu_mn_release(struct mmu_notifier *mn, > - struct mm_struct *mm) > +static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror) > { > - struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, > mn); > + struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn, >
RE: [PATCH 02/11] dma-buf: add new dma_fence_chain container v2
> -Original Message- > From: Christian König > Sent: Monday, December 03, 2018 9:56 PM > To: Zhou, David(ChunMing) ; Koenig, Christian > ; dri-de...@lists.freedesktop.org; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container > v2 > > Am 03.12.18 um 14:44 schrieb Chunming Zhou: > > > > 在 2018/12/3 21:28, Christian König 写道: > >> Am 03.12.18 um 14:18 schrieb Chunming Zhou: > >>> 在 2018/12/3 19:00, Christian König 写道: > >>>> Am 03.12.18 um 06:25 schrieb zhoucm1: > >>>>> On 2018年11月28日 22:50, Christian König wrote: > >>>>>> Lockless container implementation similar to a dma_fence_array, > >>>>>> but with only two elements per node and automatic garbage > >>>>>> collection. > >>>>>> > >>>>>> v2: properly document dma_fence_chain_for_each, add > >>>>>> dma_fence_chain_find_seqno, > >>>>>> drop prev reference during garbage collection if it's not > >>>>>> a chain fence. > >>>>>> > >>>>>> Signed-off-by: Christian König > >>>>>> --- [snip] > >>>>>> + > >>>>>> +/** > >>>>>> + * dma_fence_chain_init - initialize a fence chain > >>>>>> + * @chain: the chain node to initialize > >>>>>> + * @prev: the previous fence > >>>>>> + * @fence: the current fence > >>>>>> + * > >>>>>> + * Initialize a new chain node and either start a new chain or > >>>>>> +add > >>>>>> the node to > >>>>>> + * the existing chain of the previous fence. > >>>>>> + */ > >>>>>> +void dma_fence_chain_init(struct dma_fence_chain *chain, > >>>>>> + struct dma_fence *prev, > >>>>>> + struct dma_fence *fence, > >>>>>> + uint64_t seqno) > >>>>>> +{ > >>>>>> + struct dma_fence_chain *prev_chain = > >>>>>> +to_dma_fence_chain(prev); > >>>>>> + uint64_t context; > >>>>>> + > >>>>>> + spin_lock_init(>lock); > >>>>>> + chain->prev = prev; > >>>>>> + chain->fence = fence; > >>>>>> + chain->prev_seqno = 0; > >>>>>> + init_irq_work(>work, dma_fence_chain_irq_work); > >>>>>> + > >>>>>> + /* Try to reuse the context of the previous chain node. 
*/ > >>>>>> + if (prev_chain && seqno > prev->seqno && > >>>>>> + __dma_fence_is_later(seqno, prev->seqno)) { > >>>>> As your patch#1 makes __dma_fence_is_later only be valid for > >>>>> 32bit, we cannot use it for 64bit here, we should remove it from > >>>>> here, just compare seqno directly. > >>>> That is intentional. We must make sure that the number both > >>>> increments as 64bit number as well as not wraps around as 32bit > number. > >>>> > >>>> In other words the largest difference between two sequence numbers > >>>> userspace is allowed to submit is 1<<31. > >>> Why? no one can make sure that, application users would only think > >>> it is an uint64 sequence nubmer, and they can signal any advanced > >>> point. I already see umd guys writing timeline test use max_uint64-1 > >>> as a final signal. > >>> We shouldn't add this limitation here. > >> We need to be backward compatible to hardware which can only do 32bit > >> signaling with the dma-fence implementation. > > I see that, you already explained that before. > > but can't we just grep low 32bit of seqno only when 32bit hardware try > > to use? > > > > then we can make dma_fence_later use 64bit comparation. > > The problem is that we don't know at all times when to use a 32bit compare > and when to use a 64bit compare. > > What we could do is test if any of the upper 32bits of a sequence number is > not 0 and if that is the case do a 64bit compare. This way max_uint64_t would > still be handled correctly. Sounds we can have a try, and we need mask upper 32bits for 32bit hardware case in the meanwhile, right? -David > > > Christian. > > > > >> Otherwise dma_fence_later() could return an inconsistent result and > >> break at other places. > >> > >> So if userspace wants to use more than 1<<31 difference between > >> sequence numbers we need to push back on this. > > It's rare case, but I don't think we can convince them add this > > limitation. So we cannot add this limitation here. 
> > > > -David ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
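Christian's suggestion at the end of this exchange — use a plain 64-bit comparison whenever either number has upper bits set, and fall back to 32-bit wraparound arithmetic otherwise — can be written out in a few lines. This is a sketch of the idea discussed, not the exact dma-fence implementation:

```c
#include <stdint.h>
#include <stdbool.h>

/* Is @b later than @a?  If either seqno uses the upper 32 bits, a
 * plain 64-bit compare is safe (so max_uint64_t behaves).  Otherwise
 * assume 32-bit hardware wraparound: @b is later when the forward
 * distance a -> b, modulo 2^32, is below 1 << 31. */
static bool seqno_is_later(uint64_t a, uint64_t b)
{
	if ((a >> 32) || (b >> 32))
		return b > a;
	return (uint32_t)(b - a) < (1u << 31);
}
```

Note the cost David worries about: in the wraparound branch, two sequence numbers submitted more than 1 << 31 apart compare inconsistently, which is why userspace signaling arbitrary jumps only works cleanly once the upper bits come into play.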
RE: [PATCH libdrm 4/5] wrap syncobj timeline query/wait APIs for amdgpu v3
> -Original Message- > From: Christian König > Sent: Friday, November 30, 2018 5:15 PM > To: Zhou, David(ChunMing) ; dri- > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH libdrm 4/5] wrap syncobj timeline query/wait APIs for > amdgpu v3 > [snip] > >> +drm_public int amdgpu_cs_syncobj_query(amdgpu_device_handle dev, > >> + uint32_t *handles, uint64_t *points, > > This interfaces is public to umd, I think they like "uint64_t > > **points" for batch query, I've verified before, it works well and > > more convenience. > > If removing num_handles, that means only one syncobj to query, I agree > > with "uint64_t *point". > > "handles" as well as "points" are an array of objects. If the UMD wants to > write the points to separate locations it can do so manually after calling the > function. Ok, it doesn't matter. -David > > It doesn't make any sense that libdrm or the kernel does the extra > indirection, the transferred pointers are 64bit as well (even on a 32bit > system) so the overhead is identical. > > Adding another indirection just makes the implementation unnecessary > complex. > > Christian. > > > > > -David > >> + unsigned num_handles) { > >> + if (NULL == dev) > >> + return -EINVAL; > >> + > >> + return drmSyncobjQuery(dev->fd, handles, points, num_handles); } > >> + > >> drm_public int amdgpu_cs_export_syncobj(amdgpu_device_handle dev, > >> uint32_t handle, > >> int *shared_fd) > > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
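Christian's point about `uint64_t *points` versus `uint64_t **points` is that the batch query writes into one contiguous array, and a UMD that wants the values at separate locations scatters them itself afterwards — same number of 64-bit transfers, one less indirection in libdrm and the kernel. A hedged sketch, with `query_points()` standing in for `amdgpu_cs_syncobj_query()`/`drmSyncobjQuery()` and returning fake payloads:

```c
#include <stdint.h>

/* Stand-in for the real batch query: fills points[i] for handles[i]. */
static int query_points(const uint32_t *handles, uint64_t *points,
			unsigned num_handles)
{
	for (unsigned i = 0; i < num_handles; ++i)
		points[i] = (uint64_t)handles[i] * 100;  /* fake payloads */
	return 0;
}

/* The extra indirection, done by the caller when it actually needs it:
 * copy each queried point to its own destination. */
static void scatter(const uint64_t *points, uint64_t **dsts, unsigned n)
{
	for (unsigned i = 0; i < n; ++i)
		*dsts[i] = points[i];
}
```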
RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference
> -Original Message- > From: Christian König > Sent: Monday, November 26, 2018 5:23 PM > To: Sharma, Deepak ; Zhou, David(ChunMing) > ; Koenig, Christian ; > amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer > dereference > > Am 26.11.18 um 02:59 schrieb Sharma, Deepak: > > > > 在 2018/11/24 2:10, Koenig, Christian 写道: > >> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): > >>> 在 2018/11/23 21:30, Koenig, Christian 写道: > >>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): > >>>>> 在 2018/11/22 19:25, Christian König 写道: > >>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: > >>>>>>> when returned fence is not valid mostly due to userspace ignored > >>>>>>> previous error causes NULL pointer dereference. > >>>>>> Again, this is clearly incorrect. The my other mails on the > >>>>>> earlier patch. > >>>>> Sorry for I didn't get your history, but looks from the patch > >>>>> itself, it is still a valid patch, isn't it? > >>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL > >>>> when the fence is already signaled. > >>>> > >>>> So this patch could totally break userspace because it changes the > >>>> behavior when we try to sync to an already signaled fence. > >>> Ah, I got your meaning, how about attached patch? > >> Yeah something like this, but I would just give the > >> DRM_SYNCOBJ_CREATE_SIGNALED instead. > >> > >> I mean that's what this flag is good for isn't it? 
> > Yeah, I gave it a flag initially when creating the patch, but as you know, there is > > a > switch case that is not able to use that flag: > > > > case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: > > fd = get_unused_fd_flags(O_CLOEXEC); > > if (fd < 0) { > > dma_fence_put(fence); > > return fd; > > } > > > > sync_file = sync_file_create(fence); > > dma_fence_put(fence); > > if (!sync_file) { > > put_unused_fd(fd); > > return -ENOMEM; > > } > > > > fd_install(fd, sync_file->file); > > info->out.handle = fd; > > return 0; > > > > So I changed to a stub fence instead. > > Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL > fence. > > We should then probably move the stub fence function into > dma_fence_stub.c under drivers/dma-buf to keep the stuff together. Yes, please wrap it up for review first with your stub fence; we can do that separately. -David > > > > -David > > > > I have not applied this patch. > > The issue I was trying to address is when amdgpu_cs_ioctl() failed due to > low memory (ENOMEM) but userspace chose to proceed and called > amdgpu_cs_fence_to_handle_ioctl(). > > In amdgpu_cs_fence_to_handle_ioctl(), fence is null, later causing a > > NULL pointer dereference; this patch was to avoid that and the system panic. > But I understand now that it's a valid case returning NULL if the fence was already > signaled, and we need to handle that case to avoid the kernel panic. Seems David's patch > should fix this; I will test it tomorrow. > > Mhm, but don't we bail out with an error if we ask for a failed command > submission? If not that sounds like a bug as well. > > Christian. > > > > > -Deepak > >> Christian. > >> > >>> -David > >>>> If that patch was applied then please revert it immediately. > >>>> > >>>> Christian. > >>>> > >>>>> -David > >>>>>> If you have already pushed the patch then please revert. > >>>>>> > >>>>>> Christian. 
> >>>>>> > >>>>>>> Signed-off-by: Deepak Sharma > >>>>>>> --- > >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > >>>>>>> 1 file changed, 2 insertions(+) > >>>>>>> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> index 024dfbd87f11..14166cd8a12f 100644 > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence > >>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev, > >>>>>>> fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); > >>>>>>> amdgpu_ctx_put(ctx); > >>>>>>> + if(!fence) > >>>>>>> + return ERR_PTR(-EINVAL); > >>>>>>> return fence; > >>>>>>> } > > ___ > > amd-gfx mailing list > > amd-gfx@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
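The point Christian makes in the thread above hinges on the kernel's error-pointer encoding: `amdgpu_cs_get_fence()` effectively has a three-way return — a real fence pointer, NULL ("fence already signaled"), or an errno encoded into the pointer. The sketch below is a userspace re-implementation of that encoding, to show why collapsing the NULL case into `ERR_PTR(-EINVAL)` changes userspace-visible behavior:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-implementation of the kernel's ERR_PTR/IS_ERR helpers
 * (include/linux/err.h). Pointers in the last page of the address space
 * are treated as encoded negative errnos. */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error)
{
	return (void *)error;
}

static long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
```

Note that NULL is deliberately *not* an error in this encoding — `IS_ERR(NULL)` is false — which is exactly the "already signaled" case the original patch would have broken by returning an error instead.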
RE: [PATCH] drm/amd: add the checking to avoid NULL pointer dereference
> -Original Message- > From: amd-gfx On Behalf Of > Sharma, Deepak > Sent: Thursday, November 22, 2018 10:37 AM > To: amd-gfx@lists.freedesktop.org > Cc: Sharma, Deepak > Subject: [PATCH] drm/amd: add the checking to avoid NULL pointer > dereference > > when the returned fence is not valid, mostly due to userspace ignoring a previous > error, it causes a NULL pointer dereference > > Signed-off-by: Deepak Sharma > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 024dfbd87f11..c85bb313e6df 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -1420,6 +1420,8 @@ int amdgpu_cs_fence_to_handle_ioctl(struct > drm_device *dev, void *data, > fence = amdgpu_cs_get_fence(adev, filp, &info->in.fence); > if (IS_ERR(fence)) > return PTR_ERR(fence); > + if (!fence) > + return -EINVAL; Could you move this check into the end of amdgpu_cs_get_fence()? Like: if (!fence) return ERR_PTR(-EINVAL); Thanks, -David > > switch (info->in.what) { > case AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ: > -- > 2.15.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests
> -Original Message- > From: Daniel Vetter On Behalf Of Daniel Vetter > Sent: Monday, November 05, 2018 5:39 PM > To: Zhou, David(ChunMing) > Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests > > On Fri, Nov 02, 2018 at 04:26:49PM +0800, Chunming Zhou wrote: > > Signed-off-by: Chunming Zhou > > --- > > tests/amdgpu/Makefile.am | 3 +- > > tests/amdgpu/amdgpu_test.c | 12 ++ > > tests/amdgpu/amdgpu_test.h | 21 +++ > > tests/amdgpu/meson.build | 2 +- > > tests/amdgpu/syncobj_tests.c | 263 > > +++ > > 5 files changed, 299 insertions(+), 2 deletions(-) create mode > > 100644 tests/amdgpu/syncobj_tests.c > > This testcase seems very much a happy sunday scenario, no tests at all for > corner cases, invalid input, and generally trying to pull the kernel over the > table. I think we need a lot more, and preferably in igt, where we already > have a good baseline of drm_syncobj tests. Hi Daniel, OK, if you insist on that, I will switch to implementing the timeline test in IGT. Btw, the timeline syncobj test needs to be based on command submission; can I write it against the amdgpu driver in IGT? And after that, where should I send the igt patch for review? Lastly, if you have time, could you also take a look at the u/k interface of the timeline syncobj? 
Thanks, David Zhou > -Daniel > > > > > diff --git a/tests/amdgpu/Makefile.am b/tests/amdgpu/Makefile.am index > > 447ff217..d3fbe2bb 100644 > > --- a/tests/amdgpu/Makefile.am > > +++ b/tests/amdgpu/Makefile.am > > @@ -33,4 +33,5 @@ amdgpu_test_SOURCES = \ > > vcn_tests.c \ > > uve_ib.h \ > > deadlock_tests.c \ > > - vm_tests.c > > + vm_tests.c \ > > + syncobj_tests.c > > diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c > > index 96fcd687..cdcb93a5 100644 > > --- a/tests/amdgpu/amdgpu_test.c > > +++ b/tests/amdgpu/amdgpu_test.c > > @@ -56,6 +56,7 @@ > > #define UVD_ENC_TESTS_STR "UVD ENC Tests" > > #define DEADLOCK_TESTS_STR "Deadlock Tests" > > #define VM_TESTS_STR "VM Tests" > > +#define SYNCOBJ_TIMELINE_TESTS_STR "SYNCOBJ TIMELINE Tests" > > > > /** > > * Open handles for amdgpu devices > > @@ -116,6 +117,12 @@ static CU_SuiteInfo suites[] = { > > .pCleanupFunc = suite_vm_tests_clean, > > .pTests = vm_tests, > > }, > > + { > > + .pName = SYNCOBJ_TIMELINE_TESTS_STR, > > + .pInitFunc = suite_syncobj_timeline_tests_init, > > + .pCleanupFunc = suite_syncobj_timeline_tests_clean, > > + .pTests = syncobj_timeline_tests, > > + }, > > > > CU_SUITE_INFO_NULL, > > }; > > @@ -165,6 +172,11 @@ static Suites_Active_Status suites_active_stat[] = { > > .pName = VM_TESTS_STR, > > .pActive = suite_vm_tests_enable, > > }, > > + { > > + .pName = SYNCOBJ_TIMELINE_TESTS_STR, > > + .pActive = suite_syncobj_timeline_tests_enable, > > + }, > > + > > }; > > > > > > diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h > > index 0609a74b..946e91c2 100644 > > --- a/tests/amdgpu/amdgpu_test.h > > +++ b/tests/amdgpu/amdgpu_test.h > > @@ -194,6 +194,27 @@ CU_BOOL suite_vm_tests_enable(void); > > */ > > extern CU_TestInfo vm_tests[]; > > > > +/** > > + * Initialize syncobj timeline test suite */ int > > +suite_syncobj_timeline_tests_init(); > > + > > +/** > > + * Deinitialize syncobj timeline test suite */ int > > +suite_syncobj_timeline_tests_clean(); > > + 
> > +/** > > + * Decide if the suite is enabled by default or not. > > + */ > > +CU_BOOL suite_syncobj_timeline_tests_enable(void); > > + > > +/** > > + * Tests in syncobj timeline test suite */ extern CU_TestInfo > > +syncobj_timeline_tests[]; > > + > > + > > /** > > * Helper functions > > */ > > diff --git a/tests/amdgpu/meson.build b/tests/amdgpu/meson.build index > > 4c1237c6..3ceec715 100644 > > --- a/tests/amdgpu/meson.build > > +++ b/tests/amdgpu/meson.build > > @@ -24,7 +24,7 @@ if dep_cunit.found() > > files( > >'amdgpu_test.c', 'basic_tests.c', 'bo_tests.c', 'cs_tests.c', > >'vce_tests.c', 'uvd_enc_tests.c', 'vcn_tests.c', 'deadlock_tests.c', > > - 'vm_tests.c'
RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync
> -Original Message- > From: Koenig, Christian > Sent: Monday, November 05, 2018 3:48 PM > To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Zhou, > David(ChunMing) > Subject: Re: [PATCH 2/3] drm/amdgpu: drop the sched_sync > > Am 05.11.18 um 08:24 schrieb Liu, Monk: > >> David Zhou had a use case which saw a >10% performance drop the last > time he tried it. > > I really don't believe that, because if you insert a WAIT_MEM on an already > signaled fence, it only costs the GPU a couple of clocks to move on, right ? no reason > to slow down up to 10% ... with 3dmark vulkan version test, the performance > is barely different ... with my patch applied ... > > Why do you think that we insert a WAIT_MEM on an already signaled fence? > The pipeline sync always waits for the last fence value (because we can't > handle wraparounds in PM4). > > So you have a pipeline sync when you don't need one and that is really really > bad for things shared between processes, e.g. X/Wayland and its clients. > > I also expect that this doesn't affect 3dmark at all, but everything which runs > in a window which is composed by X could be slowed down massively. > > David do you remember which use case was affected when you tried to drop > this optimization? That was a long time ago. I remember Andrey also tried to remove sched_sync before, but he eventually kept it, right? From Monk's patch, it seems he doesn't change the main logic; he just moved the sched_sync logic to job->need_pipe_sync. But I can at least see a small effect: the job processing saves the fence into sched_sync, but the fence could already be signaled by the time of amdgpu_ib_schedule, in which case no pipeline sync needs to be inserted. Anyway, this is a sensitive path; we should change it carefully and give it wide testing. Regards, David Zhou > > >> When a reset happens we flush the VMIDs when re-submitting the jobs > to the rings and while doing so we also always do a pipeline sync. 
> > I will check that point in my branch, I didn't use drm-next, maybe > > there is gap in this part > > We had that logic for a very long time now, but we recently simplified it. > Could be that there was a bug introduced doing so. > > Maybe we should add a specific flag to run_job() to note that we are re- > running a job and then always add VM flushes/pipeline syncs? > > But my main question is why do you see any issues with quark? That is a > workaround for an issue for Vulkan sync handling and should only surface > when a specific test is run many many times. > > Regards, > Christian. > > > > > /Monk > > -Original Message- > > From: Koenig, Christian > > Sent: Monday, November 5, 2018 3:02 AM > > To: Liu, Monk ; amd-gfx@lists.freedesktop.org > > Subject: Re: [PATCH 2/3] drm/amdgpu: drop the sched_sync > > > >> Can you tell me which game/benchmark will have performance drop with > this fix by your understanding ? > > When you sync between submission things like composing X windows are > slowed down massively. > > > > David Zhou had an use case which saw a >10% performance drop the last > time he tried it. > > > >> The problem I hit is during the massive stress test against > >> multi-process + quark , if the quark process hang the engine while there is > another two job following the bad job, After the TDR these two job will lose > the explicit and the pipeline-sync was also lost. > > Well that is really strange. This workaround is only for a very specific > > Vulkan > CTS test which we are still not 100% sure is actually valid. > > > > When a reset happens we flush the VMIDs when re-submitting the jobs to > the rings and while doing so we also always do a pipeline sync. > > > > So you should never ever run into any issues in quark with that, even when > we completely disable this workaround. > > > > Regards, > > Christian. > > > > Am 04.11.18 um 01:48 schrieb Liu, Monk: > >>> NAK, that would result in a severe performance drop. 
> >>> We need the fence here to determine if we actually need to do the > pipeline sync or not. > >>> E.g. the explicit requested fence could already be signaled. > >> For the performance issue, only insert a WAIT_REG_MEM on > GFX/compute ring *doesn't* give the "severe" drop (it's mimic in fact) ... At > least I didn't observe any performance drop with 3dmark benchmark (also > tested vulkan CTS), Can you tell me which game/benchmark will have > performance drop with this fix by your understanding ? let me check it . > >> > >> The problem I hit is during the massive stress test against > >> multi-process + qua
RE: [PATCH] drm/amdgpu: wait for IB test on first device open
Reviewed-by: Chunming Zhou > -Original Message- > From: amd-gfx On Behalf Of > Christian König > Sent: Friday, November 02, 2018 4:45 PM > To: amd-gfx@lists.freedesktop.org > Subject: [PATCH] drm/amdgpu: wait for IB test on first device open > > Instead of delaying that to the first query. Otherwise we could try to use the > SDMA for VM updates before the IB tests are done. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > index 08d04f68dfeb..f87f717cc905 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > @@ -467,9 +467,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev, > void *data, struct drm_file > if (!info->return_size || !info->return_pointer) > return -EINVAL; > > - /* Ensure IB tests are run on ring */ > - flush_delayed_work(&adev->late_init_work); > - > switch (info->query) { > case AMDGPU_INFO_ACCEL_WORKING: > ui32 = adev->accel_working; > @@ -950,6 +947,9 @@ int amdgpu_driver_open_kms(struct drm_device > *dev, struct drm_file *file_priv) > struct amdgpu_fpriv *fpriv; > int r, pasid; > > + /* Ensure IB tests are run on ring */ > + flush_delayed_work(&adev->late_init_work); > + > file_priv->driver_priv = NULL; > > r = pm_runtime_get_sync(dev->dev); > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [igt-dev] [PATCH] RFC: Make igts for cross-driver stuff mandatory?
If igt is to be made cross-driver, I think you should rename it first; it's not Intel specific. No company wants its employees working on another company's stuff. You could rename it to DGT (drm graphics test) and publish it alongside libdrm, or directly merge it into libdrm; then everyone can use it and develop it on the same page. This is only my personal opinion. Regards, David > -Original Message- > From: dri-devel On Behalf Of Eric > Anholt > Sent: Friday, October 26, 2018 12:36 AM > To: Sean Paul ; Daniel Vetter > Cc: IGT development ; Intel Graphics > Development ; DRI Development de...@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org > Subject: Re: [igt-dev] [PATCH] RFC: Make igts for cross-driver stuff > mandatory? > > Sean Paul writes: > > > On Fri, Oct 19, 2018 at 10:50:49AM +0200, Daniel Vetter wrote: > >> Hi all, > >> > >> This is just to collect feedback on this idea, and see whether the > >> overall dri-devel community stands on all this. I think the past few > >> cross-vendor uapi extensions all came with igts attached, and > >> personally I think there's lots of value in having them: A > >> cross-vendor interface isn't useful if every driver implements it > >> slightly differently. > >> > >> I think there's 2 questions here: > >> > >> - Do we want to make such testcases mandatory? > >> > > > > Yes, more testing == better code. > > > > > >> - If yes, are we there yet, or is there something crucially missing > >> still? > > > > In my experience, no. Last week while trying to replicate an intel-gfx > > CI failure, I tried compiling igt for one of my (intel) chromebooks. > > It seems like cross-compilation (or, in my case, just specifying > > prefix/ld_library_path/sbin_path) is broken on igt. If we want to > > impose restrictions across the entire subsystem, we need to make sure > > that everyone can build and deploy igt easily. > > > > I managed to hack around everything and get it working, but I still > > haven't tried switching out the toolchain. 
Once we have some GitLab CI > > to validate cross-compilation, then we can consider making IGT mandatory. > > > > It's possible that I'm just a meson n00b and didn't use the right > > incantation, so maybe it already works, but then we need better > documentation. > > > > I've pasted my horrible hacks below, I also didn't have libunwind, so > > removed its usage. > > I've also had to cut out libunwind for cross-compiling on many occasions. > Worst library. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet
> -Original Message- > From: amd-gfx On Behalf Of Rex > Zhu > Sent: Wednesday, October 24, 2018 2:03 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhu, Rex > Subject: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet > > The csa buffer is used by the sdma engine to do a context save when preemption > happens. If the mc address is zero, it means the preemption feature (MCBP) is > disabled. > > Signed-off-by: Rex Zhu > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 13 + > drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 ++ > drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 ++-- > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 ++-- > 4 files changed, 27 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c > index 0fb9907..24b80bc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c > @@ -40,3 +40,16 @@ struct amdgpu_sdma_instance * > amdgpu_get_sdma_instance(struct amdgpu_ring *ring) > > return NULL; > } > + > +int amdgpu_get_sdma_index(struct amdgpu_ring *ring, uint32_t *index) { > + struct amdgpu_device *adev = ring->adev; > + int i; > + > + for (i = 0; i < adev->sdma.num_instances; i++) > + if (ring == &adev->sdma.instance[i].ring || > + ring == &adev->sdma.instance[i].page) > + return i; > + > + return -EINVAL; > +} Looping to find the index works, but doesn't look good. If you need the ring index, you can define it first as an enum, and assign the enum index to the ring when the ring is initialized. 
Regards, David Zhou > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h > index 479a245..314078a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h > @@ -26,6 +26,7 @@ > > /* max number of IP instances */ > #define AMDGPU_MAX_SDMA_INSTANCES2 > +#define AMDGPU_SDMA_CSA_SIZE (1024) > > enum amdgpu_sdma_irq { > AMDGPU_SDMA_IRQ_TRAP0 = 0, > @@ -96,4 +97,5 @@ struct amdgpu_buffer_funcs { struct > amdgpu_sdma_instance * amdgpu_get_sdma_instance(struct amdgpu_ring > *ring); > > +int amdgpu_get_sdma_index(struct amdgpu_ring *ring, uint32_t *index); > #endif > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > index f5e6aa2..fdc5d75 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > @@ -424,7 +424,11 @@ static void sdma_v3_0_ring_emit_ib(struct > amdgpu_ring *ring, > bool ctx_switch) > { > unsigned vmid = GET_VMID(job); > + uint64_t csa_mc_addr = job ? 
job->csa_mc_addr : 0; > + uint32_t i = 0; > > + if (amdgpu_get_sdma_index(ring, )) > + return -EINVAL; > /* IB packet must end on a 8 DW boundary */ > sdma_v3_0_ring_insert_nop(ring, (10 - (lower_32_bits(ring->wptr) & > 7)) % 8); > > @@ -434,8 +438,8 @@ static void sdma_v3_0_ring_emit_ib(struct > amdgpu_ring *ring, > amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr) & 0xffe0); > amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr)); > amdgpu_ring_write(ring, ib->length_dw); > - amdgpu_ring_write(ring, 0); > - amdgpu_ring_write(ring, 0); > + amdgpu_ring_write(ring, lower_32_bits(csa_mc_addr + i * > AMDGPU_SDMA_CSA_SIZE)); > + amdgpu_ring_write(ring, upper_32_bits(csa_mc_addr + i * > +AMDGPU_SDMA_CSA_SIZE)); > > } > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > index 2282ac1..e69a584 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > @@ -507,7 +507,11 @@ static void sdma_v4_0_ring_emit_ib(struct > amdgpu_ring *ring, > bool ctx_switch) > { > unsigned vmid = GET_VMID(job); > + uint64_t csa_mc_addr = job ? job->csa_mc_addr : 0; > + uint32_t i = 0; > > + if (amdgpu_get_sdma_index(ring, )) > + return -EINVAL; > /* IB packet must end on a 8 DW boundary */ > sdma_v4_0_ring_insert_nop(ring, (10 - (lower_32_bits(ring->wptr) & > 7)) % 8); > > @@ -517,8 +521,8 @@ static void sdma_v4_0_ring_emit_ib(struct > amdgpu_ring *ring, > amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr) & 0xffe0); > amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr)); > amdgpu_ring_write(ring, ib->length_dw); > - amdgpu_ring_write(ring, 0); > - amdgpu_ring_write(ring, 0); > + amdgpu_ring_write(ring, lower_32_bits(csa_mc_addr + i * > AMDGPU_SDMA_CSA_SIZE)); > + amdgpu_ring_write(ring, upper_32_bits(csa_mc_addr + i * > +AMDGPU_SDMA_CSA_SIZE)); > > } > > -- > 1.9.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
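David's suggestion above — evaluate the instance index once when the ring is initialized instead of looping over the instances on every `emit_ib` — can be sketched as follows. The struct layout and names here are illustrative only, not the real driver's:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: each sdma instance owns a "ring" and a "page" ring.
 * Storing the index in the ring at init time turns the per-submit
 * lookup into a plain field read on the hot path. */
struct ring {
	uint32_t sdma_index; /* set once at init, read on emit_ib */
};

struct sdma_instance {
	struct ring ring;
	struct ring page;
};

#define NUM_INSTANCES 2

static void sdma_rings_init(struct sdma_instance inst[NUM_INSTANCES])
{
	for (uint32_t i = 0; i < NUM_INSTANCES; i++) {
		inst[i].ring.sdma_index = i;
		inst[i].page.sdma_index = i;
	}
}

static int demo(void)
{
	struct sdma_instance inst[NUM_INSTANCES];

	sdma_rings_init(inst);
	return inst[0].ring.sdma_index == 0 &&
	       inst[0].page.sdma_index == 0 &&
	       inst[1].ring.sdma_index == 1 &&
	       inst[1].page.sdma_index == 1;
}
```

With the index stored in the ring, `sdma_v3_0_ring_emit_ib` could compute `csa_mc_addr + ring->sdma_index * AMDGPU_SDMA_CSA_SIZE` directly, with no lookup and no error path in a void emit function.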
RE: [PATCH 2/2] drm/amdgpu: Fix null point errro
A minor suggestion, not sure if it's proper: can we move these callback checks into a function? I know these funcs could be defined as macros; can we change them to function definitions? David > -Original Message- > From: amd-gfx On Behalf Of Rex > Zhu > Sent: Friday, October 19, 2018 10:51 AM > To: amd-gfx@lists.freedesktop.org > Cc: Zhu, Rex > Subject: [PATCH 2/2] drm/amdgpu: Fix null point errro > > need to check adev->powerplay.pp_funcs first, because from AI, the smu ip > may be disabled by the user, and the pp_handle is null in this case. > > Signed-off-by: Rex Zhu > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c| 6 -- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 2 +- > drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c| 2 +- > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 6 -- > 5 files changed, 11 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c > index 297a549..0a4fba1 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c > @@ -135,7 +135,8 @@ static int acp_poweroff(struct generic_pm_domain > *genpd) >* 2. power off the acp tiles >* 3. check and enter ulv state >*/ > - if (adev->powerplay.pp_funcs->set_powergating_by_smu) > + if (adev->powerplay.pp_funcs && > + adev->powerplay.pp_funcs->set_powergating_by_smu) > amdgpu_dpm_set_powergating_by_smu(adev, > AMD_IP_BLOCK_TYPE_ACP, true); > } > return 0; > @@ -517,7 +518,8 @@ static int acp_set_powergating_state(void *handle, > struct amdgpu_device *adev = (struct amdgpu_device *)handle; > bool enable = state == AMD_PG_STATE_GATE ? 
true : false; > > - if (adev->powerplay.pp_funcs->set_powergating_by_smu) > + if (adev->powerplay.pp_funcs && > + adev->powerplay.pp_funcs->set_powergating_by_smu) > amdgpu_dpm_set_powergating_by_smu(adev, > AMD_IP_BLOCK_TYPE_ACP, enable); > > return 0; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 4fca67a..7dad682 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -1783,6 +1783,7 @@ static int amdgpu_device_set_pg_state(struct > amdgpu_device *adev, enum amd_power > adev->ip_blocks[i].version->type == > AMD_IP_BLOCK_TYPE_VCE || > adev->ip_blocks[i].version->type == > AMD_IP_BLOCK_TYPE_VCN || > adev->ip_blocks[i].version->type == > AMD_IP_BLOCK_TYPE_ACP) && > + adev->powerplay.pp_funcs && > adev->powerplay.pp_funcs->set_powergating_by_smu) { > if (!adev->ip_blocks[i].status.valid) { > > amdgpu_dpm_set_powergating_by_smu(adev, adev- > >ip_blocks[i].version->type, state == AMD_PG_STATE_GATE ? 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > index 790fd54..1a656b8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > @@ -392,7 +392,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device > *adev, bool enable) > if (!(adev->powerplay.pp_feature & PP_GFXOFF_MASK)) > return; > > - if (!adev->powerplay.pp_funcs->set_powergating_by_smu) > + if (!adev->powerplay.pp_funcs || > +!adev->powerplay.pp_funcs->set_powergating_by_smu) > return; > > > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c > b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c > index 14649f8..fd23ba1 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c > @@ -280,7 +280,7 @@ void mmhub_v1_0_update_power_gating(struct > amdgpu_device *adev, > return; > > if (enable && adev->pg_flags & AMD_PG_SUPPORT_MMHUB) { > - if (adev->powerplay.pp_funcs->set_powergating_by_smu) > + if (adev->powerplay.pp_funcs && > +adev->powerplay.pp_funcs->set_powergating_by_smu) > amdgpu_dpm_set_powergating_by_smu(adev, > AMD_IP_BLOCK_TYPE_GMC, true); > > } > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > index 2e8365d..d97e6a2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > @@ -1595,7 +1595,8 @@ static int sdma_v4_0_hw_init(void *handle) > int r; > struct amdgpu_device *adev = (struct amdgpu_device *)handle; > > - if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs- > >set_powergating_by_smu) > + if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs > && > + adev->powerplay.pp_funcs- > >set_powergating_by_smu) > amdgpu_dpm_set_powergating_by_smu(adev, >
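David's suggestion above — folding the repeated two-pointer NULL check into one helper function instead of open-coding `pp_funcs && pp_funcs->callback` at every call site — could look like the sketch below. The struct and function names are illustrative stand-ins, not the real driver's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the powerplay function table. */
struct pp_funcs {
	int (*set_powergating_by_smu)(void *handle, int block, bool gate);
};

struct powerplay {
	struct pp_funcs *pp_funcs; /* NULL when the smu ip is disabled */
};

/* One helper instead of repeating both NULL checks at every caller. */
static bool can_set_powergating(const struct powerplay *pp)
{
	return pp->pp_funcs && pp->pp_funcs->set_powergating_by_smu;
}

static int stub_set_pg(void *handle, int block, bool gate)
{
	(void)handle; (void)block; (void)gate;
	return 0;
}

static int demo(void)
{
	struct pp_funcs empty = { NULL };
	struct pp_funcs full = { stub_set_pg };
	struct powerplay no_funcs = { NULL };  /* smu ip disabled */
	struct powerplay no_cb = { &empty };   /* table without callback */
	struct powerplay ok = { &full };

	return (can_set_powergating(&no_funcs) ? 1 : 0) |
	       (can_set_powergating(&no_cb)    ? 2 : 0) |
	       (can_set_powergating(&ok)       ? 4 : 0);
}
```

Only the fully populated case passes the check, so all the `if (adev->powerplay.pp_funcs && ...)` sites in the patch would collapse to one call each.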
RE: [PATCH 7/7] drm/amdgpu: update version for timeline syncobj support in amdgpu
Ping... Christian, Could I get your RB on the series? And help me to push to drm-misc? After that I can rebase libdrm header file based on drm-next. Thanks, David Zhou > -Original Message- > From: amd-gfx On Behalf Of > Chunming Zhou > Sent: Monday, October 15, 2018 4:56 PM > To: dri-de...@lists.freedesktop.org > Cc: Zhou, David(ChunMing) ; amd- > g...@lists.freedesktop.org > Subject: [PATCH 7/7] drm/amdgpu: update version for timeline syncobj > support in amdgpu > > Signed-off-by: Chunming Zhou > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 6870909da926..58cba492ba55 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -70,9 +70,10 @@ > * - 3.25.0 - Add support for sensor query info (stable pstate sclk/mclk). > * - 3.26.0 - GFX9: Process AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE. > * - 3.27.0 - Add new chunk to to AMDGPU_CS to enable BO_LIST creation. > + * - 3.28.0 - Add syncobj timeline support to AMDGPU_CS. > */ > #define KMS_DRIVER_MAJOR 3 > -#define KMS_DRIVER_MINOR 27 > +#define KMS_DRIVER_MINOR 28 > #define KMS_DRIVER_PATCHLEVEL0 > > int amdgpu_vram_limit = 0; > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 3/6] drm: add support of syncobj timeline point wait v2
>> Another general comment (no good place to put it) is that I think we want >> two kinds of waits: Wait for time point to be completed and wait for time >> point to become available. The first is the usual CPU wait for completion >> while the second is for use by userspace drivers to wait until the first >> moment where they can submit work which depends on a given time point. Hi Jason, How about adding two new wait flags? DRM_SYNCOBJ_WAIT_FLAGS_WAIT_COMPLETED DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE Thanks, David From: Christian König Sent: Tuesday, September 25, 2018 5:50 PM To: Jason Ekstrand ; Zhou, David(ChunMing) Cc: amd-gfx mailing list ; Maling list - DRI developers Subject: Re: [PATCH 3/6] drm: add support of syncobj timeline point wait v2 Am 25.09.2018 um 11:22 schrieb Jason Ekstrand: On Thu, Sep 20, 2018 at 6:04 AM Chunming Zhou mailto:david1.z...@amd.com>> wrote: points array is one-to-one match with syncobjs array. v2: add seperate ioctl for timeline point wait, otherwise break uapi. I think ioctl structs can be extended as long as fields aren't re-ordered. I'm not sure on the details of this though as I'm not a particularly experienced kernel developer. Yeah, that is correct. The problem in this particular case is that we don't change the direct IOCTL parameter, but rather the array it points to. We could do something like keep the existing handles array and add a separate optional one for the timeline points. That would also drop the need for the padding of the structure. Another general comment (no good place to put it) is that I think we want two kinds of waits: Wait for time point to be completed and wait for time point to become available. The first is the usual CPU wait for completion while the second is for use by userspace drivers to wait until the first moment where they can submit work which depends on a given time point. Oh, yeah that is a really good point as ell. Christian. 
Signed-off-by: Chunming Zhou mailto:david1.z...@amd.com>> --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c| 2 + drivers/gpu/drm/drm_syncobj.c | 99 +- include/uapi/drm/drm.h | 14 + 4 files changed, 103 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 0c4eb4a9ab31..566d44e3c782 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -183,6 +183,8 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 6b4a633b4240..c0891614f516 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -669,6 +669,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, drm_syncobj_timeline_wait_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 67472bd77c83..a43de0e4616c 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -126,13 +126,14 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj, } static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, +u64 point, struct dma_fence **fence, struct 
drm_syncobj_cb *cb, drm_syncobj_func_t func) { int ret; - ret = drm_syncobj_search_fence(syncobj, 0, 0, fence); + ret = drm_syncobj_search_fence(syncobj, point, 0, fence); if (!ret) return 1; @@ -143,7 +144,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, */ if (!list_empty(>signal_pt_list)) { spin_unlock(>lock); - drm_syncobj_search_fence(syncobj, 0, 0,
RE: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS
> -----Original Message-----
> From: Nicolai Hähnle
> Sent: Wednesday, September 26, 2018 4:44 PM
> To: Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS
>
> > static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > index 1ceec56de015..412359b446f1 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -517,6 +517,8 @@ struct drm_amdgpu_gem_va {
> > #define AMDGPU_CHUNK_ID_SYNCOBJ_IN 0x04
> > #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
> > #define AMDGPU_CHUNK_ID_BO_HANDLES 0x06
> > +#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT 0x07
> > +#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL 0x08
> >
> > struct drm_amdgpu_cs_chunk {
> > __u32 chunk_id;
> > @@ -592,6 +594,14 @@ struct drm_amdgpu_cs_chunk_sem {
> > __u32 handle;
> > };
> >
> > +struct drm_amdgpu_cs_chunk_syncobj {
> > + __u32 handle;
> > + __u32 pad;
> > + __u64 point;
> > + __u64 flags;
> > +};
>
> Sure it's nice to be forward-looking, but can't we just put the flags into the
> padding?

Will change.

Thanks,
David

> Cheers,
> Nicolai
>
> > +
> > +
> > #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ 0
> > #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD 1
> > #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD 2
>
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS
> -----Original Message-----
> From: Nicolai Hähnle
> Sent: Wednesday, September 26, 2018 5:06 PM
> To: Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS
>
> Hey Chunming,
>
> On 20.09.2018 13:03, Chunming Zhou wrote:
> > @@ -1113,48 +1117,91 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
> > }
> >
> > static int amdgpu_cs_process_syncobj_in_dep(struct amdgpu_cs_parser *p,
> > - struct amdgpu_cs_chunk *chunk)
> > + struct amdgpu_cs_chunk *chunk,
> > + bool timeline)
> > {
> > unsigned num_deps;
> > int i, r;
> > - struct drm_amdgpu_cs_chunk_sem *deps;
> >
> > - deps = (struct drm_amdgpu_cs_chunk_sem *)chunk->kdata;
> > - num_deps = chunk->length_dw * 4 /
> > - sizeof(struct drm_amdgpu_cs_chunk_sem);
> > + if (!timeline) {
> > + struct drm_amdgpu_cs_chunk_sem *deps;
> >
> > - for (i = 0; i < num_deps; ++i) {
> > - r = amdgpu_syncobj_lookup_and_add_to_sync(p, deps[i].handle);
> > + deps = (struct drm_amdgpu_cs_chunk_sem *)chunk->kdata;
> > + num_deps = chunk->length_dw * 4 /
> > + sizeof(struct drm_amdgpu_cs_chunk_sem);
> > + for (i = 0; i < num_deps; ++i) {
> > + r = amdgpu_syncobj_lookup_and_add_to_sync(p, deps[i].handle,
> > + 0, 0);
> > if (r)
> > return r;
>
> The indentation looks wrong.
>
> > + }
> > + } else {
> > + struct drm_amdgpu_cs_chunk_syncobj *syncobj_deps;
> > +
> > + syncobj_deps = (struct drm_amdgpu_cs_chunk_syncobj *)chunk->kdata;
> > + num_deps = chunk->length_dw * 4 /
> > + sizeof(struct drm_amdgpu_cs_chunk_syncobj);
> > + for (i = 0; i < num_deps; ++i) {
> > + r = amdgpu_syncobj_lookup_and_add_to_sync(p, syncobj_deps[i].handle,
> > + syncobj_deps[i].point,
> > + syncobj_deps[i].flags);
> > + if (r)
> > + return r;
>
> Here as well.
>
> So I'm wondering a bit about this uapi. Specifically, what happens if you try to
> use timeline syncobjs here as dependencies _without_
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT?
>
> My understanding is, it'll just return -EINVAL without any indication as to
> which syncobj actually failed. What's the caller supposed to do then?

How about adding a print to indicate which syncobj failed?

Thanks,
David Zhou

> Cheers,
> Nicolai
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.
RE: Making use of more Gitlab features for xf86-video-amdgpu
After patches move to being submitted and reviewed as MRs, will the mailing list no longer be used? And many people could miss new patches, right?

Regards,
David Zhou

> -----Original Message-----
> From: amd-gfx On Behalf Of Michel Dänzer
> Sent: Friday, September 21, 2018 3:13 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: Making use of more Gitlab features for xf86-video-amdgpu
>
> On 2018-09-19 6:46 p.m., Michel Dänzer wrote:
> >
> > With the 18.1.0 release out the door, I want to start making use of
> > more Gitlab features for xf86-video-amdgpu development.
> >
> > I've already enabled merge requests (MRs) at
> > https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu . From
> > now on, patches should primarily be submitted and reviewed as MRs. I
> > don't know yet if it'll be possible for this mailing list to get
> > notifications of new MRs; you may want to enable notifications on the page above.
>
> FWIW,
> https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests
> now has some actual content. :)
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Mesa and X developer
RE: [PATCH 3/6] drm: add support of syncobj timeline point wait v2
> -----Original Message-----
> From: amd-gfx On Behalf Of Christian König
> Sent: Thursday, September 20, 2018 7:11 PM
> To: Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 3/6] drm: add support of syncobj timeline point wait v2
>
> Am 20.09.2018 um 13:03 schrieb Chunming Zhou:
> > points array is a one-to-one match with the syncobjs array.
> > v2:
> > add separate ioctl for timeline point wait, otherwise break uapi.
> >
> > Signed-off-by: Chunming Zhou
> > ---
> > drivers/gpu/drm/drm_internal.h | 2 +
> > drivers/gpu/drm/drm_ioctl.c | 2 +
> > drivers/gpu/drm/drm_syncobj.c | 99 +-
> > include/uapi/drm/drm.h | 14 +
> > 4 files changed, 103 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> > index 0c4eb4a9ab31..566d44e3c782 100644
> > --- a/drivers/gpu/drm/drm_internal.h
> > +++ b/drivers/gpu/drm/drm_internal.h
> > @@ -183,6 +183,8 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file_private);
> > int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file_private);
> > +int drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data,
> > + struct drm_file *file_private);
> > int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file_private);
> > int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
> > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
> > index 6b4a633b4240..c0891614f516 100644
> > --- a/drivers/gpu/drm/drm_ioctl.c
> > +++ b/drivers/gpu/drm/drm_ioctl.c
> > @@ -669,6 +669,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
> > DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
> > DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, drm_syncobj_timeline_wait_ioctl,
> > + DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
> > DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
> > diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> > index 67472bd77c83..a43de0e4616c 100644
> > --- a/drivers/gpu/drm/drm_syncobj.c
> > +++ b/drivers/gpu/drm/drm_syncobj.c
> > @@ -126,13 +126,14 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
> > }
> >
> > static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> > + u64 point,
> > struct dma_fence **fence,
> > struct drm_syncobj_cb *cb,
> > drm_syncobj_func_t func)
> > {
> > int ret;
> >
> > - ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > + ret = drm_syncobj_search_fence(syncobj, point, 0, fence);
> > if (!ret)
> > return 1;
> >
> > @@ -143,7 +144,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> > */
> > if (!list_empty(>signal_pt_list)) {
> > spin_unlock(>lock);
> > - drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > + drm_syncobj_search_fence(syncobj, point, 0, fence);
> > if (*fence)
> > return 1;
> > spin_lock(>lock);
> > @@ -358,7 +359,9 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
> > spin_lock(>lock);
> > list_for_each_entry_safe(cur, tmp, >cb_list, node) {
> > list_del_init(>node);
> > + spin_unlock(>lock);
> > cur->func(syncobj, cur);
> > + spin_lock(>lock);
>
> That looks fishy to me. Why do we need to unlock

The cb func will call _search_fence, which will need to grab the lock; otherwise, deadlock.

> and who guarantees that tmp is still valid when we grab the lock again?

Sorry for that; I quickly fixed the deadlock and forgot to
RE: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
> -----Original Message-----
> From: amd-gfx On Behalf Of Christian König
> Sent: Thursday, September 20, 2018 5:35 PM
> To: Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel ; Daniel Vetter ; amd-g...@lists.freedesktop.org
> Subject: Re: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
>
> The only thing I can still see is that you use wait_event_timeout() instead of
> wait_event_interruptible().
>
> Any particular reason for that?

I tried again after what you said in the last thread; CTS always fails, and the syncobj unit test fails as well.

> Apart from that it now looks good to me.

Thanks. Can I get your RB on it?

Btw, I realized the Vulkan spec names the semaphore types binary and timeline, so how about changing _TYPE_INDIVIDUAL to _TYPE_BINARY?

Regards,
David Zhou

> Christian.
>
> Am 20.09.2018 um 11:29 schrieb Zhou, David(ChunMing):
> > Ping...
> >
> >> -----Original Message-----
> >> From: amd-gfx On Behalf Of Chunming Zhou
> >> Sent: Wednesday, September 19, 2018 5:18 PM
> >> To: dri-de...@lists.freedesktop.org
> >> Cc: Zhou, David(ChunMing) ; amd-g...@lists.freedesktop.org; Rakos, Daniel ; Daniel Vetter ; Dave Airlie ; Koenig, Christian
> >> Subject: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
> >>
> >> This patch is for the VK_KHR_timeline_semaphore extension; a semaphore is
> >> called a syncobj on the kernel side:
> >> This extension introduces a new type of syncobj that has an integer
> >> payload identifying a point in a timeline. Such timeline syncobjs
> >> support the following operations:
> >> * CPU query - A host operation that allows querying the payload of the
> >> timeline syncobj.
> >> * CPU wait - A host operation that allows a blocking wait for a
> >> timeline syncobj to reach a specified value.
> >> * Device wait - A device operation that allows waiting for a
> >> timeline syncobj to reach a specified value.
> >> * Device signal - A device operation that allows advancing the
> >> timeline syncobj to a specified value.
> >>
> >> v1:
> >> Since it's a timeline, the earlier time point (PT) is always signaled
> >> before a later PT.
> >> a. signal PT design:
> >> Signal PT fence N depends on the PT[N-1] fence and the signal operation
> >> fence; when the PT[N] fence is signaled, the timeline will increase to
> >> the value of PT[N].
> >> b. wait PT design:
> >> A wait PT fence is signaled by the timeline reaching its point value.
> >> When the timeline is increasing, wait PT values are compared with the
> >> new timeline value; if a PT value is lower than the timeline value, that
> >> wait PT will be signaled, otherwise it is kept in the list.
> >> The syncobj wait operation can wait on any point of the timeline, so an
> >> RB tree is needed to order them. And a wait PT could be ahead of a
> >> signal PT, so we need a submission fence to handle that.
> >>
> >> v2:
> >> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> >> 2. move unexposed definitions to the .c file. (Daniel Vetter)
> >> 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
> >> 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> >> 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
> >> 6. WARN_ON(point != 0) for the NORMAL type syncobj case. (Daniel Vetter)
> >>
> >> v3:
> >> 1. replace normal syncobj with the timeline implementation. (Vetter and Christian)
> >> a. a normal syncobj signal op will create a signal PT at the tail of the signal pt list.
> >> b. a normal syncobj wait op will create a wait pt with the last signal
> >> point, and this wait PT is only signaled by the related signal point PT.
> >> 2. many bug fixes and cleanups
> >> 3. stub fence moving is moved to another patch.
> >>
> >> v4:
> >> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
> >> 2. fix syncobj lifecycle. (Christian)
> >> 3. only enable_signaling when there is a wait_pt. (Christian)
> >> 4. fix timeline path issues.
> >> 5. write a timeline test in libdrm
> >>
> >> v5: (Christian)
> >> 1. semaphore is called syncobj on the kernel side.
> >> 2. don't need 'timeline' characters in some function names.
> >> 3. keep syncobj cb.
> >>
> >> v6: (Christian)
RE: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
Ping...

> -----Original Message-----
> From: amd-gfx On Behalf Of Chunming Zhou
> Sent: Wednesday, September 19, 2018 5:18 PM
> To: dri-de...@lists.freedesktop.org
> Cc: Zhou, David(ChunMing) ; amd-g...@lists.freedesktop.org; Rakos, Daniel ; Daniel Vetter ; Dave Airlie ; Koenig, Christian
> Subject: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
>
> This patch is for the VK_KHR_timeline_semaphore extension; a semaphore is
> called a syncobj on the kernel side:
> This extension introduces a new type of syncobj that has an integer payload
> identifying a point in a timeline. Such timeline syncobjs support the
> following operations:
> * CPU query - A host operation that allows querying the payload of the
> timeline syncobj.
> * CPU wait - A host operation that allows a blocking wait for a
> timeline syncobj to reach a specified value.
> * Device wait - A device operation that allows waiting for a
> timeline syncobj to reach a specified value.
> * Device signal - A device operation that allows advancing the
> timeline syncobj to a specified value.
>
> v1:
> Since it's a timeline, the earlier time point (PT) is always signaled before
> a later PT.
> a. signal PT design:
> Signal PT fence N depends on the PT[N-1] fence and the signal operation
> fence; when the PT[N] fence is signaled, the timeline will increase to the
> value of PT[N].
> b. wait PT design:
> A wait PT fence is signaled by the timeline reaching its point value. When
> the timeline is increasing, wait PT values are compared with the new
> timeline value; if a PT value is lower than the timeline value, that wait PT
> will be signaled, otherwise it is kept in the list.
> The syncobj wait operation can wait on any point of the timeline, so an RB
> tree is needed to order them. And a wait PT could be ahead of a signal PT,
> so we need a submission fence to handle that.
>
> v2:
> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> 2. move unexposed definitions to the .c file. (Daniel Vetter)
> 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
> 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
> 6. WARN_ON(point != 0) for the NORMAL type syncobj case. (Daniel Vetter)
>
> v3:
> 1. replace normal syncobj with the timeline implementation. (Vetter and Christian)
> a. a normal syncobj signal op will create a signal PT at the tail of the signal pt list.
> b. a normal syncobj wait op will create a wait pt with the last signal point,
> and this wait PT is only signaled by the related signal point PT.
> 2. many bug fixes and cleanups
> 3. stub fence moving is moved to another patch.
>
> v4:
> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
> 2. fix syncobj lifecycle. (Christian)
> 3. only enable_signaling when there is a wait_pt. (Christian)
> 4. fix timeline path issues.
> 5. write a timeline test in libdrm
>
> v5: (Christian)
> 1. semaphore is called syncobj on the kernel side.
> 2. don't need 'timeline' characters in some function names.
> 3. keep syncobj cb.
>
> v6: (Christian)
> 1. merge syncobj_timeline into the syncobj structure.
> 2. simplify some check sentences.
> 3. some misc changes.
> 4. fix the CTS failure issue.
>
> v7: (Christian)
> 1. error handling when creating a signal pt.
> 2. remove timeline naming in funcs.
> 3. export flags in find_fence.
> 4. allow resetting the timeline.
>
> individual syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
> timeline syncobj is tested by ./amdgpu_test -s 9
>
> Signed-off-by: Chunming Zhou
> Cc: Christian König
> Cc: Dave Airlie
> Cc: Daniel Rakos
> Cc: Daniel Vetter
> ---
> drivers/gpu/drm/drm_syncobj.c | 293 ++---
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +-
> include/drm/drm_syncobj.h | 65 ++---
> include/uapi/drm/drm.h | 1 +
> 4 files changed, 287 insertions(+), 74 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index f796c9fc3858..95b60ac045c6 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,6 +56,9 @@
> #include "drm_internal.h"
> #include
>
> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
> +#define DRM_SYNCOBJ_INDIVIDUAL_POINT 1
> +
> struct drm_syncobj_stub_fence {
> struct dma_fence base;
> spinlock_t lock;
> @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
> .release = drm_syncobj_stub_fence_release,
> };
>
> +struct drm_syncobj_signal_pt {
RE: [PATCH 1/4] [RFC]drm: add syncobj timeline support v6
> -Original Message- > From: amd-gfx On Behalf Of > Christian K?nig > Sent: Wednesday, September 19, 2018 3:45 PM > To: Zhou, David(ChunMing) ; Zhou, > David(ChunMing) ; dri- > de...@lists.freedesktop.org > Cc: Dave Airlie ; Rakos, Daniel > ; Daniel Vetter ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 1/4] [RFC]drm: add syncobj timeline support v6 > > Am 19.09.2018 um 09:32 schrieb zhoucm1: > > > > > > On 2018年09月19日 15:18, Christian König wrote: > >> Am 19.09.2018 um 06:26 schrieb Chunming Zhou: > > [snip] > >>> *fence = NULL; > >>> drm_syncobj_add_callback_locked(syncobj, cb, func); @@ > >>> -164,6 +177,153 @@ void drm_syncobj_remove_callback(struct > >>> drm_syncobj *syncobj, > >>> spin_unlock(>lock); > >>> } > >>> +static void drm_syncobj_timeline_init(struct drm_syncobj > >>> *syncobj) > >> > >> We still have _timeline_ in the name here. > > the func is relevant to timeline members, or which name is proper? > > Yeah, but we now use the timeline implementation for the individual syncobj > as well. > > Not a big issue, but I would just name it > drm_syncobj_init()/drm_syncobj_fini. There is already drm_syncobj_init/fini in drm_syncboj.c , any other name can be suggested? > > > > >> > >>> +{ > >>> + spin_lock(>lock); > >>> + syncobj->timeline_context = dma_fence_context_alloc(1); > > [snip] > >>> +} > >>> + > >>> +int drm_syncobj_lookup_fence(struct drm_syncobj *syncobj, u64 > >>> +point, > >>> + struct dma_fence **fence) { > >>> + > >>> + return drm_syncobj_search_fence(syncobj, point, > >>> + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, > >> > >> I still have a bad feeling setting that flag as default cause it > >> might change the behavior for the UAPI. > >> > >> Maybe export drm_syncobj_search_fence directly? E.g. with the flags > >> parameter. > > previous v5 indeed do this, you let me wrap it, need change back? 
> > No, the problem is that drm_syncobj_find_fence() is still using > drm_syncobj_lookup_fence() which sets the flag instead of > drm_syncobj_search_fence() without the flag. > > That changes the UAPI behavior because previously we would have returned > an error code and now we block for a fence to appear. > > So I think the right solution would be to add the flags parameter to > drm_syncobj_find_fence() and let the driver decide if we need to block or > get -ENOENT. Got your means, Exporting flag in func is easy, but driver doesn't pass flag, which flag is proper by default? We still need to give a default flag in patch, don't we? Thanks, David Zhou > > Regards, > Christian. > > > > > Regards, > > David Zhou > >> > >> Regards, > >> Christian. > >> > >>> + fence); > >>> +} > >>> +EXPORT_SYMBOL(drm_syncobj_lookup_fence); > >>> + > >>> /** > >>> * drm_syncobj_find_fence - lookup and reference the fence in a > >>> sync object > >>> * @file_private: drm file private pointer @@ -228,7 +443,7 @@ > >>> static int drm_syncobj_assign_null_handle(struct > >>> drm_syncobj *syncobj) > >>> * @fence: out parameter for the fence > >>> * > >>> * This is just a convenience function that combines > >>> drm_syncobj_find() and > >>> - * drm_syncobj_fence_get(). > >>> + * drm_syncobj_lookup_fence(). > >>> * > >>> * Returns 0 on success or a negative error value on failure. On > >>> success @fence > >>> * contains a reference to the fence, which must be released by > >>> calling @@ -236,18 +451,11 @@ static int > >>> drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj) > >>> */ > >>> int drm_syncobj_find_fence(struct drm_file *file_private, > >>> u32 handle, u64 point, > >>> - struct dma_fence **fence) -{ > >>> + struct dma_fence **fence) { > >>> struct drm_syncobj *syncobj = drm_syncobj_find(file_private, > >>> handle); > >>> - int ret = 0; > >>> - > >>> - if (!syncobj) > >>> - return -ENOENT; > >>> + int ret; > >>> - *fe
RE: [PATCH] drm/amdgpu: remove fence fallback
> -----Original Message-----
> From: amd-gfx On Behalf Of Christian König
> Sent: Tuesday, September 18, 2018 4:43 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: remove fence fallback
>
> DC doesn't seem to have a fallback path either.
>
> So when interrupts don't work any more we are pretty much busted no
> matter what.
>
> Signed-off-by: Christian König

Reviewed-by: Chunming Zhou

> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -
> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 56 ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 -
> 3 files changed, 58 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 27382767e15a..c18d68575462 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -146,7 +146,6 @@ extern int amdgpu_cik_support;
> #define AMDGPU_DEFAULT_GTT_SIZE_MB 3072ULL /* 3GB by default */
> #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
> #define AMDGPU_MAX_USEC_TIMEOUT 10 /* 100 ms */
> -#define AMDGPU_FENCE_JIFFIES_TIMEOUT (HZ / 2)
> /* AMDGPU_IB_POOL_SIZE must be a power of 2 */
> #define AMDGPU_IB_POOL_SIZE 16
> #define AMDGPU_DEBUGFS_MAX_COMPONENTS 32
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index da36731460b5..176f28777f5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -195,19 +195,6 @@ int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s)
> return 0;
> }
>
> -/**
> - * amdgpu_fence_schedule_fallback - schedule fallback check
> - *
> - * @ring: pointer to struct amdgpu_ring
> - *
> - * Start a timer as fallback to our interrupts.
> - */ > -static void amdgpu_fence_schedule_fallback(struct amdgpu_ring *ring) -{ > - mod_timer(>fence_drv.fallback_timer, > - jiffies + AMDGPU_FENCE_JIFFIES_TIMEOUT); > -} > - > /** > * amdgpu_fence_process - check for fence activity > * > @@ -229,9 +216,6 @@ void amdgpu_fence_process(struct amdgpu_ring > *ring) > > } while (atomic_cmpxchg(>last_seq, last_seq, seq) != last_seq); > > - if (seq != ring->fence_drv.sync_seq) > - amdgpu_fence_schedule_fallback(ring); > - > if (unlikely(seq == last_seq)) > return; > > @@ -262,21 +246,6 @@ void amdgpu_fence_process(struct amdgpu_ring > *ring) > } while (last_seq != seq); > } > > -/** > - * amdgpu_fence_fallback - fallback for hardware interrupts > - * > - * @work: delayed work item > - * > - * Checks for fence activity. > - */ > -static void amdgpu_fence_fallback(struct timer_list *t) -{ > - struct amdgpu_ring *ring = from_timer(ring, t, > - fence_drv.fallback_timer); > - > - amdgpu_fence_process(ring); > -} > - > /** > * amdgpu_fence_wait_empty - wait for all fences to signal > * > @@ -424,8 +393,6 @@ int amdgpu_fence_driver_init_ring(struct > amdgpu_ring *ring, > atomic_set(>fence_drv.last_seq, 0); > ring->fence_drv.initialized = false; > > - timer_setup(>fence_drv.fallback_timer, > amdgpu_fence_fallback, 0); > - > ring->fence_drv.num_fences_mask = num_hw_submission * 2 - 1; > spin_lock_init(>fence_drv.lock); > ring->fence_drv.fences = kcalloc(num_hw_submission * 2, > sizeof(void *), @@ -501,7 +468,6 @@ void amdgpu_fence_driver_fini(struct > amdgpu_device *adev) > amdgpu_irq_put(adev, ring->fence_drv.irq_src, > ring->fence_drv.irq_type); > drm_sched_fini(>sched); > - del_timer_sync(>fence_drv.fallback_timer); > for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j) > dma_fence_put(ring->fence_drv.fences[j]); > kfree(ring->fence_drv.fences); > @@ -594,27 +560,6 @@ static const char > *amdgpu_fence_get_timeline_name(struct dma_fence *f) > return (const char *)fence->ring->name; } > > -/** > - * 
amdgpu_fence_enable_signaling - enable signalling on fence > - * @fence: fence > - * > - * This function is called with fence_queue lock held, and adds a callback > - * to fence_queue that checks if this fence is signaled, and if so it > - * signals the fence and removes itself. > - */ > -static bool amdgpu_fence_enable_signaling(struct dma_fence *f) -{ > - struct amdgpu_fence *fence = to_amdgpu_fence(f); > - struct amdgpu_ring *ring = fence->ring; > - > - if (!timer_pending(>fence_drv.fallback_timer)) > - amdgpu_fence_schedule_fallback(ring); > - > - DMA_FENCE_TRACE(>base, "armed on ring %i!\n", ring- > >idx); > - > - return true; > -} > - > /** > * amdgpu_fence_free - free up the fence memory > * > @@ -645,7 +590,6 @@ static void amdgpu_fence_release(struct dma_fence > *f)
RE: [PATCH] [RFC]drm: add syncobj timeline support v5
> -Original Message- > From: Daniel Vetter On Behalf Of Daniel Vetter > Sent: Saturday, September 15, 2018 12:11 AM > To: Koenig, Christian > Cc: Zhou, David(ChunMing) ; dri- > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Dave Airlie > ; Rakos, Daniel ; Daniel > Vetter > Subject: Re: [PATCH] [RFC]drm: add syncobj timeline support v5 > > On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote: > > Am 14.09.2018 um 12:37 schrieb Chunming Zhou: > > > This patch is for VK_KHR_timeline_semaphore extension, semaphore is > called syncobj in kernel side: > > > This extension introduces a new type of syncobj that has an integer > > > payload identifying a point in a timeline. Such timeline syncobjs > > > support the following operations: > > > * CPU query - A host operation that allows querying the payload of the > > > timeline syncobj. > > > * CPU wait - A host operation that allows a blocking wait for a > > > timeline syncobj to reach a specified value. > > > * Device wait - A device operation that allows waiting for a > > > timeline syncobj to reach a specified value. > > > * Device signal - A device operation that allows advancing the > > > timeline syncobj to a specified value. > > > > > > Since it's a timeline, that means the front time point(PT) always is > signaled before the late PT. > > > a. signal PT design: > > > Signal PT fence N depends on PT[N-1] fence and signal opertion > > > fence, when PT[N] fence is signaled, the timeline will increase to value > > > of > PT[N]. > > > b. wait PT design: > > > Wait PT fence is signaled by reaching timeline point value, when > > > timeline is increasing, will compare wait PTs value with new > > > timeline value, if PT value is lower than timeline value, then wait > > > PT will be signaled, otherwise keep in list. syncobj wait operation > > > can wait on any point of timeline, so need a RB tree to order them. And > wait PT could ahead of signal PT, we need a sumission fence to perform that. 
> > > > > > v2: > > > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. > move > > > unexposed denitions to .c file. (Daniel Vetter) 3. split up the > > > change to drm_syncobj_find_fence() in a separate patch. (Christian) > > > 4. split up the change to drm_syncobj_replace_fence() in a separate > patch. > > > 5. drop the submission_fence implementation and instead use > > > wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL > > > type syncobj case. (Daniel Vetter) > > > > > > v3: > > > 1. replace normal syncobj with timeline implemenation. (Vetter and > Christian) > > > a. normal syncobj signal op will create a signal PT to tail of > > > signal pt list. > > > b. normal syncobj wait op will create a wait pt with last signal > > > point, and > this wait PT is only signaled by related signal point PT. > > > 2. many bug fix and clean up > > > 3. stub fence moving is moved to other patch. > > > > > > v4: > > > 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. > > > fix syncobj lifecycle. (Christian) 3. only enable_signaling when > > > there is wait_pt. (Christian) 4. fix timeline path issues. > > > 5. write a timeline test in libdrm > > > > > > v5: (Christian) > > > 1. semaphore is called syncobj in kernel side. > > > 2. don't need 'timeline' characters in some function name. > > > 3. keep syncobj cb > > > > > > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline > > > syncobj is tested by ./amdgpu_test -s 9 > > > > > > Signed-off-by: Chunming Zhou > > > Cc: Christian Konig > > > Cc: Dave Airlie > > > Cc: Daniel Rakos > > > Cc: Daniel Vetter > > > > At least on first glance that looks like it should work, going to do a > > detailed review on Monday. > > Just for my understanding, it's all condensed down to 1 patch now? Yes, Christian suggest that. 
> I kinda didn't follow the detailed discussion the last few days at all :-/
>
> Also, is there a testcase? igt highly preferred (because then we'll run it in
> our intel-gfx CI, and a bunch of people outside of intel have already
> discovered that and are using it).

I already wrote the test as a libdrm unit test, since I'm not familiar with the IGT stuff.

Thanks,
David Zhou

> Thanks, Daniel
RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> -Original Message- > From: Koenig, Christian > Sent: Friday, September 14, 2018 3:27 PM > To: Zhou, David(ChunMing) ; Zhou, > David(ChunMing) ; dri- > de...@lists.freedesktop.org > Cc: Dave Airlie ; Rakos, Daniel > ; amd-gfx@lists.freedesktop.org; Daniel Vetter > > Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > > Am 14.09.2018 um 05:59 schrieb zhoucm1: > > > > > > On 2018年09月14日 11:14, zhoucm1 wrote: > >> > >> > >> On 2018年09月13日 18:22, Christian König wrote: > >>> Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing): > >>>> > >>>>> -----Original Message- > >>>>> From: Koenig, Christian > >>>>> Sent: Thursday, September 13, 2018 5:20 PM > >>>>> To: Zhou, David(ChunMing) ; dri- > >>>>> de...@lists.freedesktop.org > >>>>> Cc: Dave Airlie ; Rakos, Daniel > >>>>> ; amd-gfx@lists.freedesktop.org > >>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > >>>>> > >>>>> Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing): > >>>>>>> -Original Message- > >>>>>>> From: Christian König > >>>>>>> Sent: Thursday, September 13, 2018 4:50 PM > >>>>>>> To: Zhou, David(ChunMing) ; Koenig, > >>>>>>> Christian ; > >>>>>>> dri-de...@lists.freedesktop.org > >>>>>>> Cc: Dave Airlie ; Rakos, Daniel > >>>>>>> ; amd-gfx@lists.freedesktop.org > >>>>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support > >>>>>>> v4 > >>>>>>> > >>>>>>> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing): > >>>>>>>>> -Original Message- > >>>>>>>>> From: Koenig, Christian > >>>>>>>>> Sent: Thursday, September 13, 2018 2:56 PM > >>>>>>>>> To: Zhou, David(ChunMing) ; Zhou, > >>>>>>>>> David(ChunMing) ; dri- > >>>>>>>>> de...@lists.freedesktop.org > >>>>>>>>> Cc: Dave Airlie ; Rakos, Daniel > >>>>>>>>> ; amd-gfx@lists.freedesktop.org > >>>>>>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline > >>>>>>>>> support v4 > >>>>>>>>> > >>>>>>>>> Am 13.09.2018 um 04:15 schrieb zhoucm1: > >>>>>>>>>> On 2018年09月12日 19:05, Christian König wrote: > 
>>>>>>>>>>>>>>> [SNIP] > >>>>>>>>>>>>>>> +static void > >>>>>>>>>>>>>>> +drm_syncobj_find_signal_pt_for_wait_pt(struct > >>>>>>>>>>>>>>> drm_syncobj *syncobj, > >>>>>>>>>>>>>>> + struct drm_syncobj_wait_pt > >>>>>>>>>>>>>>> +*wait_pt) { > >>>>>>>>>>>>>> That whole approach still looks horrible complicated to me. > >>>>>>>>>>>> It's already very close to what you said before. > >>>>>>>>>>>> > >>>>>>>>>>>>>> Especially the separation of signal and wait pt is > >>>>>>>>>>>>>> completely unnecessary as far as I can see. > >>>>>>>>>>>>>> When a wait pt is requested we just need to search for > >>>>>>>>>>>>>> the signal point which it will trigger. > >>>>>>>>>>>> Yeah, I tried this, but when I implement cpu wait ioctl on > >>>>>>>>>>>> specific point, we need a advanced wait pt fence, > >>>>>>>>>>>> otherwise, we could still need old syncobj cb. > >>>>>>>>>>> Why? I mean you just need to call drm_syncobj_find_fence() > >>>>>>>>>>> and > >>>>>>> when > >>>>>>>>>>> that one returns NULL you use wait_event_*() to wait for a > >>>>>>>>>>> signal point >= your wait point to appear and tr
RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> -Original Message- > From: Koenig, Christian > Sent: Thursday, September 13, 2018 5:20 PM > To: Zhou, David(ChunMing) ; dri- > de...@lists.freedesktop.org > Cc: Dave Airlie ; Rakos, Daniel > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > > Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing): > > > >> -Original Message- > >> From: Christian König > >> Sent: Thursday, September 13, 2018 4:50 PM > >> To: Zhou, David(ChunMing) ; Koenig, Christian > >> ; dri-de...@lists.freedesktop.org > >> Cc: Dave Airlie ; Rakos, Daniel > >> ; amd-gfx@lists.freedesktop.org > >> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > >> > >> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing): > >>>> -----Original Message- > >>>> From: Koenig, Christian > >>>> Sent: Thursday, September 13, 2018 2:56 PM > >>>> To: Zhou, David(ChunMing) ; Zhou, > >>>> David(ChunMing) ; dri- > >>>> de...@lists.freedesktop.org > >>>> Cc: Dave Airlie ; Rakos, Daniel > >>>> ; amd-gfx@lists.freedesktop.org > >>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > >>>> > >>>> Am 13.09.2018 um 04:15 schrieb zhoucm1: > >>>>> On 2018年09月12日 19:05, Christian König wrote: > >>>>>>>>>> [SNIP] > >>>>>>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct > >>>>>>>>>> drm_syncobj *syncobj, > >>>>>>>>>> + struct drm_syncobj_wait_pt > >>>>>>>>>> +*wait_pt) { > >>>>>>>>> That whole approach still looks horrible complicated to me. > >>>>>>> It's already very close to what you said before. > >>>>>>> > >>>>>>>>> Especially the separation of signal and wait pt is completely > >>>>>>>>> unnecessary as far as I can see. > >>>>>>>>> When a wait pt is requested we just need to search for the > >>>>>>>>> signal point which it will trigger. > >>>>>>> Yeah, I tried this, but when I implement cpu wait ioctl on > >>>>>>> specific point, we need a advanced wait pt fence, otherwise, we > >>>>>>> could still need old syncobj cb. 
> >>>>>> Why? I mean you just need to call drm_syncobj_find_fence() and > >> when > >>>>>> that one returns NULL you use wait_event_*() to wait for a signal > >>>>>> point >= your wait point to appear and try again. > >>>>> e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have > >>>>> no fence yet, as you said, during drm_syncobj_find_fence(A) is > >>>>> working on wait_event, syncobjB and syncobjC could already be > >>>>> signaled, then we don't know which one is first signaled, which is > >>>>> need when wait ioctl returns. > >>>> I don't really see a problem with that. When you wait for the first > >>>> one you need to wait for A,B,C at the same time anyway. > >>>> > >>>> So what you do is to register a fence callback on the fences you > >>>> already have and for the syncobj which doesn't yet have a fence you > >>>> make sure that they wake up your thread when they get one. > >>>> > >>>> So essentially exactly what drm_syncobj_fence_get_or_add_callback() > >>>> already does today. > >>> So do you mean we need still use old syncobj CB for that? > >> Yes, as far as I can see it should work. > >> > >>>Advanced wait pt is bad? > >> Well it isn't bad, I just don't see any advantage in it. > > > > The advantage is to replace old syncobj cb. > > > >> The existing mechanism > >> should already be able to handle that. > > I thought more a bit, we don't that mechanism at all, if use advanced wait > pt, we can easily use fence array to achieve it for wait ioctl, we should use > kernel existing feature as much as possible, not invent another, shouldn't we? > I remember you said it before. > > Yeah, but the syncobj cb is an existing feature. This is obviously a workaround when doing for wait ioctl, Do you see it used in other place? > And I absolutely don't
RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> -Original Message- > From: Christian König > Sent: Thursday, September 13, 2018 4:50 PM > To: Zhou, David(ChunMing) ; Koenig, Christian > ; dri-de...@lists.freedesktop.org > Cc: Dave Airlie ; Rakos, Daniel > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > > Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing): > > > >> -Original Message- > >> From: Koenig, Christian > >> Sent: Thursday, September 13, 2018 2:56 PM > >> To: Zhou, David(ChunMing) ; Zhou, > >> David(ChunMing) ; dri- > >> de...@lists.freedesktop.org > >> Cc: Dave Airlie ; Rakos, Daniel > >> ; amd-gfx@lists.freedesktop.org > >> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > >> > >> Am 13.09.2018 um 04:15 schrieb zhoucm1: > >>> On 2018年09月12日 19:05, Christian König wrote: > >>>>>>>> [SNIP] > >>>>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct > >>>>>>>> drm_syncobj *syncobj, > >>>>>>>> + struct drm_syncobj_wait_pt > >>>>>>>> +*wait_pt) { > >>>>>>> That whole approach still looks horrible complicated to me. > >>>>> It's already very close to what you said before. > >>>>> > >>>>>>> Especially the separation of signal and wait pt is completely > >>>>>>> unnecessary as far as I can see. > >>>>>>> When a wait pt is requested we just need to search for the > >>>>>>> signal point which it will trigger. > >>>>> Yeah, I tried this, but when I implement cpu wait ioctl on > >>>>> specific point, we need a advanced wait pt fence, otherwise, we > >>>>> could still need old syncobj cb. > >>>> Why? I mean you just need to call drm_syncobj_find_fence() and > when > >>>> that one returns NULL you use wait_event_*() to wait for a signal > >>>> point >= your wait point to appear and try again. > >>> e.g. 
when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have > >>> no fence yet, as you said, during drm_syncobj_find_fence(A) is > >>> working on wait_event, syncobjB and syncobjC could already be > >>> signaled, then we don't know which one is first signaled, which is > >>> need when wait ioctl returns. > >> I don't really see a problem with that. When you wait for the first > >> one you need to wait for A,B,C at the same time anyway. > >> > >> So what you do is to register a fence callback on the fences you > >> already have and for the syncobj which doesn't yet have a fence you > >> make sure that they wake up your thread when they get one. > >> > >> So essentially exactly what drm_syncobj_fence_get_or_add_callback() > >> already does today. > > So do you mean we need still use old syncobj CB for that? > > Yes, as far as I can see it should work. > > > Advanced wait pt is bad? > > Well it isn't bad, I just don't see any advantage in it. The advantage is that it replaces the old syncobj cb. > The existing mechanism > should already be able to handle that. I thought about it a bit more: we don't need that mechanism at all. If we use the advanced wait pt, we can easily use a fence array to achieve this for the wait ioctl. We should reuse existing kernel features as much as possible rather than inventing another one, shouldn't we? I remember you said that before. Thanks, David Zhou > > Christian. > > > > > Thanks, > > David Zhou > >> Regards, > >> Christian. > >> > >>> Back to my implementation, it already fixes all your concerns > >>> before, and can be able to easily used in wait_ioctl. When you feel > >>> that is complicated, I guess that is because we merged all logic to > >>> that and much clean up in one patch. In fact, it already is very > >>> simple, timeline_init/fini, create signal/wait_pt, find signal_pt > >>> for wait_pt, garbage collection, just them. > >>> > >>> Thanks, > >>> David Zhou > >>>> Regards, > >>>> Christian.
___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> -Original Message- > From: Koenig, Christian > Sent: Thursday, September 13, 2018 2:56 PM > To: Zhou, David(ChunMing) ; Zhou, > David(ChunMing) ; dri- > de...@lists.freedesktop.org > Cc: Dave Airlie ; Rakos, Daniel > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4 > > Am 13.09.2018 um 04:15 schrieb zhoucm1: > > On 2018年09月12日 19:05, Christian König wrote: > >>>>> > >>>>>> [SNIP] > >>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct > >>>>>> drm_syncobj *syncobj, > >>>>>> + struct drm_syncobj_wait_pt *wait_pt) > >>>>>> +{ > >>>>> > >>>>> That whole approach still looks horrible complicated to me. > >>> It's already very close to what you said before. > >>> > >>>>> > >>>>> Especially the separation of signal and wait pt is completely > >>>>> unnecessary as far as I can see. > >>>>> When a wait pt is requested we just need to search for the signal > >>>>> point which it will trigger. > >>> Yeah, I tried this, but when I implement cpu wait ioctl on specific > >>> point, we need a advanced wait pt fence, otherwise, we could still > >>> need old syncobj cb. > >> > >> Why? I mean you just need to call drm_syncobj_find_fence() and when > >> that one returns NULL you use wait_event_*() to wait for a signal > >> point >= your wait point to appear and try again. > > e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have no > > fence yet, as you said, during drm_syncobj_find_fence(A) is working on > > wait_event, syncobjB and syncobjC could already be signaled, then we > > don't know which one is first signaled, which is need when wait ioctl > > returns. > > I don't really see a problem with that. When you wait for the first one you > need to wait for A,B,C at the same time anyway. > > So what you do is to register a fence callback on the fences you already have > and for the syncobj which doesn't yet have a fence you make sure that they > wake up your thread when they get one. 
> > So essentially exactly what drm_syncobj_fence_get_or_add_callback() > already does today. So do you mean we still need to use the old syncobj CB for that? Is the advanced wait pt bad? Thanks, David Zhou > > Regards, > Christian. > > > > Back to my implementation, it already fixes all your concerns before, > > and can be able to easily used in wait_ioctl. When you feel that is > > complicated, I guess that is because we merged all logic to that and > > much clean up in one patch. In fact, it already is very simple, > > timeline_init/fini, create signal/wait_pt, find signal_pt for wait_pt, > > garbage collection, just them. > > > > Thanks, > > David Zhou > >> > >> Regards, > >> Christian.
RE: [PATCH] drm/scheduler: Add stopped flag to drm_sched_entity
-Original Message- From: dri-devel On Behalf Of Andrey Grodzovsky Sent: Friday, August 17, 2018 11:16 PM To: dri-de...@lists.freedesktop.org Cc: Koenig, Christian ; amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/scheduler: Add stopped flag to drm_sched_entity The flag will prevent another thread from the same process from reinserting the entity queue into the scheduler's rq after it was already removed from there by another thread during drm_sched_entity_flush. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/scheduler/sched_entity.c | 10 +- include/drm/gpu_scheduler.h | 2 ++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 1416edb..07cfe63 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -177,8 +177,12 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout) /* For killed process disable any more IBs enqueue right now */ last_user = cmpxchg(&entity->last_user, current->group_leader, NULL); if ((!last_user || last_user == current->group_leader) && - (current->flags & PF_EXITING) && (current->exit_code == SIGKILL)) + (current->flags & PF_EXITING) && (current->exit_code == SIGKILL)) { + spin_lock(&entity->rq_lock); + entity->stopped = true; drm_sched_rq_remove_entity(entity->rq, entity); + spin_unlock(&entity->rq_lock); + } return ret; } @@ -504,6 +508,10 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job, if (first) { /* Add the entity to the run queue */ spin_lock(&entity->rq_lock); + if (entity->stopped) { + spin_unlock(&entity->rq_lock); + return; + } [DZ] The code has been changing so frequently lately that it picked up this regression; the code I synced last Friday still has the check below: spin_lock(&entity->rq_lock); if (!entity->rq) { DRM_ERROR("Trying to push to a killed entity\n"); spin_unlock(&entity->rq_lock); return; } So you should add a DRM_ERROR here as well when hitting this case.
With that fix, patch is Reviewed-by: Chunming Zhou Regards, David Zhou drm_sched_rq_add_entity(entity->rq, entity); spin_unlock(&entity->rq_lock); drm_sched_wakeup(entity->rq->sched); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 919ae57..daec50f 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -70,6 +70,7 @@ enum drm_sched_priority { * @fini_status: contains the exit status in case the process was signalled. * @last_scheduled: points to the finished fence of the last scheduled job. * @last_user: last group leader pushing a job into the entity. + * @stopped: Marks the entity as removed from rq and destined for termination. * * Entities will emit jobs in order to their corresponding hardware * ring, and the scheduler will alternate between entities based on @@ -92,6 +93,7 @@ struct drm_sched_entity { atomic_t *guilty; struct dma_fence *last_scheduled; struct task_struct *last_user; + bool stopped; }; /** -- 2.7.4 ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH 3/4] drm/scheduler: add new function to get least loaded sched v2
Another big question: I agree the general idea of balancing scheduler load within the same ring family is good. But when jobs from the same entity run on different schedulers, a later job could complete ahead of an earlier one, right? That would break the fence design: in the same fence context, a later fence must be signaled after the earlier one. Am I missing anything? Regards, David Zhou From: dri-devel On Behalf Of Nayan Deshmukh Sent: Thursday, August 02, 2018 12:07 AM To: Grodzovsky, Andrey Cc: amd-gfx@lists.freedesktop.org; Mailing list - DRI developers ; Koenig, Christian Subject: Re: [PATCH 3/4] drm/scheduler: add new function to get least loaded sched v2 Yes, that is correct. Nayan On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky wrote: Clarification question - if the run queues belong to different schedulers they effectively point to different rings, which means we allow moving (rescheduling) a drm_sched_entity from one ring to another - I assume that was the idea in the first place: you have a set of HW rings and you can utilize any of them for your jobs (like compute rings). Correct? Andrey On 08/01/2018 04:20 AM, Nayan Deshmukh wrote: > The function selects the run queue from the rq_list with the > least load. The load is decided by the number of jobs in a > scheduler.
> > v2: avoid using atomic read twice consecutively, instead store > it locally > > Signed-off-by: Nayan Deshmukh > --- > drivers/gpu/drm/scheduler/gpu_scheduler.c | 25 + > 1 file changed, 25 insertions(+) > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c > b/drivers/gpu/drm/scheduler/gpu_scheduler.c > index 375f6f7f6a93..fb4e542660b0 100644 > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c > @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct > drm_sched_entity *entity) > return true; > } > > +/** > + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least load > + * > + * @entity: scheduler entity > + * > + * Return the pointer to the rq with least load. > + */ > +static struct drm_sched_rq * > +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity) > +{ > + struct drm_sched_rq *rq = NULL; > + unsigned int min_jobs = UINT_MAX, num_jobs; > + int i; > + > + for (i = 0; i < entity->num_rq_list; ++i) { > + num_jobs = atomic_read(&entity->rq_list[i]->sched->num_jobs); > + if (num_jobs < min_jobs) { > + min_jobs = num_jobs; > + rq = entity->rq_list[i]; > + } > + } > + > + return rq; > +} > + > static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, > struct dma_fence_cb *cb) > { 
RE: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3)
Typo, excepted -> expected -Original Message- From: amd-gfx On Behalf Of Zhou, David(ChunMing) Sent: Tuesday, July 31, 2018 9:41 AM To: Koenig, Christian ; Zhang, Jerry ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3) Thanks for Jerry still remembers this series. Hi Christian, For upstream of this feature, seems we already had agreement long time ago. Two reasons for upstreaming: 1. this bug was found by an opengl game, so this bug also is in mesa driver in theory. 2. after upstream these patches, we can reduce pro specific patches, and close to open source. Btw, an unit test is excepted when upstreaming, I remember Alex mentioned. Thanks, David Zhou -Original Message- From: Christian König Sent: Monday, July 30, 2018 6:48 PM To: Zhang, Jerry ; amd-gfx@lists.freedesktop.org Cc: Zhou, David(ChunMing) ; Koenig, Christian Subject: Re: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3) Am 30.07.2018 um 12:02 schrieb Junwei Zhang: > From: Chunming Zhou > > v2: get original gem handle from gobj > v3: update find bo data structure as union(in, out) > simply some code logic Do we now have an open source user for this, so that we can upstream it? One more point below. 
> > Signed-off-by: Chunming Zhou > Signed-off-by: Junwei Zhang (v3) > Reviewed-by: Christian König > Reviewed-by: Jammy Zhou > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 63 > + > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +- > include/uapi/drm/amdgpu_drm.h | 21 +++ > 4 files changed, 88 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index 4cd20e7..46c370b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -1213,6 +1213,8 @@ int amdgpu_gem_info_ioctl(struct drm_device *dev, void *data, > struct drm_file *filp); > int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data, > struct drm_file *filp); > +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp); > int amdgpu_gem_mmap_ioctl(struct drm_device *dev, void *data, > struct drm_file *filp); > int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data, > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index 71792d8..bae8417 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -288,6 +288,69 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, > return 0; > } > > +static int amdgpu_gem_get_handle_from_object(struct drm_file *filp, > + struct drm_gem_object *obj) { > + int i; > + struct drm_gem_object *tmp; > + > + spin_lock(&filp->table_lock); > + idr_for_each_entry(&filp->object_idr, tmp, i) { > + if (obj == tmp) { > + drm_gem_object_reference(obj); > + spin_unlock(&filp->table_lock); > + return i; > + } > + } Please double check if that is still up to date. I think we could as well try to use the DMA-buf handle tree for that. Christian. 
> + spin_unlock(&filp->table_lock); > + > + return 0; > +} > + > +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) > +{ > + union drm_amdgpu_gem_find_bo *args = data; > + struct drm_gem_object *gobj; > + struct amdgpu_bo *bo; > + struct ttm_buffer_object *tbo; > + struct vm_area_struct *vma; > + uint32_t handle; > + int r; > + > + if (offset_in_page(args->in.addr | args->in.size)) > + return -EINVAL; > + > + down_read(&current->mm->mmap_sem); > + vma = find_vma(current->mm, args->in.addr); > + if (!vma || vma->vm_file != filp->filp || > + (args->in.size > (vma->vm_end - args->in.addr))) { > + args->out.handle = 0; > + up_read(&current->mm->mmap_sem); > + return -EINVAL; > + } > + args->out.offset = args->in.addr - vma->vm_start; > + > + tbo = vma->vm_private_data; > + bo = container_of(tbo, struct amdgpu_bo, tbo); > + amdgpu_bo_ref(bo); > + gobj = &bo->gem_base; > + > + handle = amdgpu
RE: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3)
Thanks for Jerry still remembers this series. Hi Christian, For upstream of this feature, seems we already had agreement long time ago. Two reasons for upstreaming: 1. this bug was found by an opengl game, so this bug also is in mesa driver in theory. 2. after upstream these patches, we can reduce pro specific patches, and close to open source. Btw, an unit test is excepted when upstreaming, I remember Alex mentioned. Thanks, David Zhou -Original Message- From: Christian König Sent: Monday, July 30, 2018 6:48 PM To: Zhang, Jerry ; amd-gfx@lists.freedesktop.org Cc: Zhou, David(ChunMing) ; Koenig, Christian Subject: Re: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3) Am 30.07.2018 um 12:02 schrieb Junwei Zhang: > From: Chunming Zhou > > v2: get original gem handle from gobj > v3: update find bo data structure as union(in, out) > simply some code logic Do we now have an open source user for this, so that we can upstream it? One more point below. > > Signed-off-by: Chunming Zhou > Signed-off-by: Junwei Zhang (v3) > Reviewed-by: Christian König > Reviewed-by: Jammy Zhou > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 63 > + > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3 +- > include/uapi/drm/amdgpu_drm.h | 21 +++ > 4 files changed, 88 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index 4cd20e7..46c370b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -1213,6 +1213,8 @@ int amdgpu_gem_info_ioctl(struct drm_device *dev, void > *data, > struct drm_file *filp); > int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data, > struct drm_file *filp); > +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void > *data, > + struct drm_file *filp); > int amdgpu_gem_mmap_ioctl(struct drm_device *dev, void *data, > struct drm_file *filp); > int 
amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data, > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index 71792d8..bae8417 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -288,6 +288,69 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, > return 0; > } > > +static int amdgpu_gem_get_handle_from_object(struct drm_file *filp, > + struct drm_gem_object *obj) { > + int i; > + struct drm_gem_object *tmp; > + > + spin_lock(&filp->table_lock); > + idr_for_each_entry(&filp->object_idr, tmp, i) { > + if (obj == tmp) { > + drm_gem_object_reference(obj); > + spin_unlock(&filp->table_lock); > + return i; > + } > + } Please double check if that is still up to date. I think we could as well try to use the DMA-buf handle tree for that. Christian. > + spin_unlock(&filp->table_lock); > + > + return 0; > +} > + > +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) > +{ > + union drm_amdgpu_gem_find_bo *args = data; > + struct drm_gem_object *gobj; > + struct amdgpu_bo *bo; > + struct ttm_buffer_object *tbo; > + struct vm_area_struct *vma; > + uint32_t handle; > + int r; > + > + if (offset_in_page(args->in.addr | args->in.size)) > + return -EINVAL; > + > + down_read(&current->mm->mmap_sem); > + vma = find_vma(current->mm, args->in.addr); > + if (!vma || vma->vm_file != filp->filp || > + (args->in.size > (vma->vm_end - args->in.addr))) { > + args->out.handle = 0; > + up_read(&current->mm->mmap_sem); > + return -EINVAL; > + } > + args->out.offset = args->in.addr - vma->vm_start; > + > + tbo = vma->vm_private_data; > + bo = container_of(tbo, struct amdgpu_bo, tbo); > + amdgpu_bo_ref(bo); > + gobj = &bo->gem_base; > + > + handle = amdgpu_gem_get_handle_from_object(filp, gobj); > + if (!handle) { > + r = drm_gem_handle_create(filp, gobj, &handle); > + if (r) { > + DRM_ERROR("create gem handle failed\n"); > + up_read(&current->mm->mmap_sem); > +
RE: [PATCH 9/9] drm/amdgpu: create an empty bo_list if no handle is provided
Series is Reviewed-by: Chunming Zhou -Original Message- From: amd-gfx On Behalf Of Christian König Sent: Monday, July 30, 2018 10:52 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 9/9] drm/amdgpu: create an empty bo_list if no handle is provided Instead of having extra handling just create an empty bo_list when no handle is provided. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 111 ++--- 1 file changed, 46 insertions(+), 65 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 1d7292ab2b62..502b94fb116a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -561,6 +561,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, union drm_amdgpu_cs *cs) { struct amdgpu_fpriv *fpriv = p->filp->driver_priv; + struct amdgpu_vm *vm = &fpriv->vm; struct amdgpu_bo_list_entry *e; struct list_head duplicates; struct amdgpu_bo *gds; @@ -580,13 +581,17 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, &p->bo_list); if (r) return r; + } else if (!p->bo_list) { + /* Create an empty bo_list when no handle is provided */ + r = amdgpu_bo_list_create(p->adev, p->filp, NULL, 0, + &p->bo_list); + if (r) + return r; } - if (p->bo_list) { - amdgpu_bo_list_get_list(p->bo_list, &p->validated); - if (p->bo_list->first_userptr != p->bo_list->num_entries) - p->mn = amdgpu_mn_get(p->adev, AMDGPU_MN_TYPE_GFX); - } + amdgpu_bo_list_get_list(p->bo_list, &p->validated); + if (p->bo_list->first_userptr != p->bo_list->num_entries) + p->mn = amdgpu_mn_get(p->adev, AMDGPU_MN_TYPE_GFX); INIT_LIST_HEAD(&duplicates); amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd); @@ -605,10 +610,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, goto error_free_pages; } - /* Without a BO list we don't have userptr BOs */ - if (!p->bo_list) - break; - INIT_LIST_HEAD(&need_pages); amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) { struct amdgpu_bo *bo = e->robj; @@ -703,21 +704,12 @@ 
static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved, p->bytes_moved_vis); - if (p->bo_list) { - struct amdgpu_vm *vm = &fpriv->vm; - struct amdgpu_bo_list_entry *e; + gds = p->bo_list->gds_obj; + gws = p->bo_list->gws_obj; + oa = p->bo_list->oa_obj; - gds = p->bo_list->gds_obj; - gws = p->bo_list->gws_obj; - oa = p->bo_list->oa_obj; - - amdgpu_bo_list_for_each_entry(e, p->bo_list) - e->bo_va = amdgpu_vm_bo_find(vm, e->robj); - } else { - gds = p->adev->gds.gds_gfx_bo; - gws = p->adev->gds.gws_gfx_bo; - oa = p->adev->gds.oa_gfx_bo; - } + amdgpu_bo_list_for_each_entry(e, p->bo_list) + e->bo_va = amdgpu_vm_bo_find(vm, e->robj); if (gds) { p->job->gds_base = amdgpu_bo_gpu_offset(gds); @@ -745,15 +737,13 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, error_free_pages: - if (p->bo_list) { - amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) { - if (!e->user_pages) - continue; + amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) { + if (!e->user_pages) + continue; - release_pages(e->user_pages, - e->robj->tbo.ttm->num_pages); - kvfree(e->user_pages); - } + release_pages(e->user_pages, + e->robj->tbo.ttm->num_pages); + kvfree(e->user_pages); } return r; @@ -815,9 +805,10 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error, static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p) { - struct amdgpu_device *adev = p->adev; struct amdgpu_fpriv *fpriv = p->filp->driver_priv; + struct amdgpu_device *adev = p->adev; struct amdgpu_vm *vm = &fpriv->vm; + struct amdgpu_bo_list_entry *e; struct amdgpu_bo_va *bo_va; struct amdgpu_bo *bo; int r; @@ -850,31 +841,26 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p) return r; } -
RE: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function v2
Go ahead with my RB. -Original Message- From: amd-gfx On Behalf Of Christian König Sent: Monday, July 30, 2018 5:19 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function v2 This allows us to trace all VM ranges which should be valid inside a CS. v2: dump mappings without BO as well Signed-off-by: Christian König Reviewed-by: Chunming Zhou (v1) Reviewed-and-tested-by: Andrey Grodzovsky (v1) Reviewed-by: Huang Rui (v1) --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 29 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 + 4 files changed, 37 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 8a49c3b97bd4..871401cd9997 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1220,6 +1220,7 @@ static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p) static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, union drm_amdgpu_cs *cs) { + struct amdgpu_fpriv *fpriv = p->filp->driver_priv; struct amdgpu_ring *ring = p->ring; struct drm_sched_entity *entity = &p->ctx->rings[ring->idx].entity; enum drm_sched_priority priority; @@ -1272,6 +1273,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, amdgpu_job_free_resources(job); trace_amdgpu_cs_ioctl(job); + amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket); priority = job->base.s_priority; drm_sched_entity_push_job(&job->base, entity); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h index 11f262f15200..7206a0025b17 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h @@ -314,6 +314,11 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping, TP_ARGS(mapping) ); +DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_cs, + TP_PROTO(struct amdgpu_bo_va_mapping *mapping), + TP_ARGS(mapping) +); + 
TRACE_EVENT(amdgpu_vm_set_ptes, TP_PROTO(uint64_t pe, uint64_t addr, unsigned count, uint32_t incr, uint64_t flags), diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 5d7d7900ccab..015613b4f98b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2343,6 +2343,35 @@ struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm, return amdgpu_vm_it_iter_first(&vm->va, addr, addr); } +/** + * amdgpu_vm_bo_trace_cs - trace all reserved mappings + * + * @vm: the requested vm + * @ticket: CS ticket + * + * Trace all mappings of BOs reserved during a command submission. + */ +void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx +*ticket) { + struct amdgpu_bo_va_mapping *mapping; + + if (!trace_amdgpu_vm_bo_cs_enabled()) + return; + + for (mapping = amdgpu_vm_it_iter_first(&vm->va, 0, U64_MAX); mapping; +mapping = amdgpu_vm_it_iter_next(mapping, 0, U64_MAX)) { + if (mapping->bo_va && mapping->bo_va->base.bo) { + struct amdgpu_bo *bo; + + bo = mapping->bo_va->base.bo; + if (READ_ONCE(bo->tbo.resv->lock.ctx) != ticket) + continue; + } + + trace_amdgpu_vm_bo_cs(mapping); + } +} + /** * amdgpu_vm_bo_rmv - remove a bo to a specific vm * diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index d416f895233d..67a15d439ac0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -318,6 +318,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev, uint64_t saddr, uint64_t size); struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm, uint64_t addr); +void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx +*ticket); void amdgpu_vm_bo_rmv(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va); void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size, -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
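The shape of amdgpu_vm_bo_trace_cs() above is worth noting: it checks trace_amdgpu_vm_bo_cs_enabled() before walking every mapping, so the interval-tree walk is skipped entirely when nobody is listening. A minimal user-space sketch of that early-exit pattern (the flag and counter are stand-ins for the kernel's static-key tracepoint machinery, not real driver code):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel tracepoint machinery: trace_enabled mimics
 * trace_amdgpu_vm_bo_cs_enabled(), trace_count counts emitted events. */
static bool trace_enabled;
static int trace_count;

/* Mirrors the early-exit shape of amdgpu_vm_bo_trace_cs(): bail out
 * before iterating all mappings when the tracepoint is disabled. */
static void trace_mappings(int num_mappings)
{
    int i;

    if (!trace_enabled)
        return;                 /* cheap check, no interval-tree walk */

    for (i = 0; i < num_mappings; i++)
        trace_count++;          /* trace_amdgpu_vm_bo_cs(mapping) */
}
```

In the kernel the enabled check compiles down to a static key, so the disabled case costs almost nothing on the CS submission path.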
RE: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function
Reviewed-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Friday, July 27, 2018 10:58 PM To: amd-gfx@lists.freedesktop.org; Grodzovsky, Andrey Subject: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function This allows us to trace all VM ranges which should be valid inside a CS. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 30 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 1 + 4 files changed, 38 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 8a49c3b97bd4..871401cd9997 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1220,6 +1220,7 @@ static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p) static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, union drm_amdgpu_cs *cs) { + struct amdgpu_fpriv *fpriv = p->filp->driver_priv; struct amdgpu_ring *ring = p->ring; struct drm_sched_entity *entity = >ctx->rings[ring->idx].entity; enum drm_sched_priority priority; @@ -1272,6 +1273,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, amdgpu_job_free_resources(job); trace_amdgpu_cs_ioctl(job); + amdgpu_vm_bo_trace_cs(>vm, >ticket); priority = job->base.s_priority; drm_sched_entity_push_job(>base, entity); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h index 11f262f15200..7206a0025b17 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h @@ -314,6 +314,11 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping, TP_ARGS(mapping) ); +DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_cs, + TP_PROTO(struct amdgpu_bo_va_mapping *mapping), + TP_ARGS(mapping) +); + TRACE_EVENT(amdgpu_vm_set_ptes, TP_PROTO(uint64_t pe, uint64_t addr, unsigned count, 
uint32_t incr, uint64_t flags), diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 5d7d7900ccab..7aedf3184e36 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2343,6 +2343,36 @@ struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm, return amdgpu_vm_it_iter_first(>va, addr, addr); } +/** + * amdgpu_vm_bo_trace_cs - trace all reserved mappings + * + * @vm: the requested vm + * @ticket: CS ticket + * + * Trace all mappings of BOs reserved during a command submission. + */ +void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx +*ticket) { + struct amdgpu_bo_va_mapping *mapping; + + if (!trace_amdgpu_vm_bo_cs_enabled()) + return; + + for (mapping = amdgpu_vm_it_iter_first(>va, 0, U64_MAX); mapping; +mapping = amdgpu_vm_it_iter_next(mapping, 0, U64_MAX)) { + struct amdgpu_bo *bo; + + if (!mapping->bo_va || !mapping->bo_va->base.bo) + continue; + + bo = mapping->bo_va->base.bo; + if (READ_ONCE(bo->tbo.resv->lock.ctx) != ticket) + continue; + + trace_amdgpu_vm_bo_cs(mapping); + } +} + /** * amdgpu_vm_bo_rmv - remove a bo to a specific vm * diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index d416f895233d..67a15d439ac0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -318,6 +318,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev, uint64_t saddr, uint64_t size); struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm, uint64_t addr); +void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx +*ticket); void amdgpu_vm_bo_rmv(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va); void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size, -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list 
amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: add proper error handling to amdgpu_bo_list_get
Reviewed-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Friday, July 27, 2018 9:39 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/amdgpu: add proper error handling to amdgpu_bo_list_get Otherwise we silently don't use a BO list when the handle is invalid. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 28 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 11 --- 3 files changed, 20 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 13aaa118aca4..4cd20e722d70 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -713,8 +713,8 @@ struct amdgpu_bo_list { struct amdgpu_bo_list_entry *array; }; -struct amdgpu_bo_list * -amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id); +int amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id, + struct amdgpu_bo_list **result); void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list, struct list_head *validated); void amdgpu_bo_list_put(struct amdgpu_bo_list *list); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c index 7679c068c89a..944868e47119 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c @@ -180,27 +180,20 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev, return r; } -struct amdgpu_bo_list * -amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id) +int amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id, + struct amdgpu_bo_list **result) { - struct amdgpu_bo_list *result; - rcu_read_lock(); - result = idr_find(>bo_list_handles, id); + *result = idr_find(>bo_list_handles, id); - if (result) { - if (kref_get_unless_zero(>refcount)) { - rcu_read_unlock(); - mutex_lock(>lock); - } else { - rcu_read_unlock(); - result = NULL; - 
} - } else { + if (*result && kref_get_unless_zero(&(*result)->refcount)) { rcu_read_unlock(); + mutex_lock(&(*result)->lock); + return 0; } - return result; + rcu_read_unlock(); + return -ENOENT; } void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list, @@ -335,9 +328,8 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data, break; case AMDGPU_BO_LIST_OP_UPDATE: - r = -ENOENT; - list = amdgpu_bo_list_get(fpriv, handle); - if (!list) + r = amdgpu_bo_list_get(fpriv, handle, ); + if (r) goto error_free; r = amdgpu_bo_list_set(adev, filp, list, info, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 533b2e7656c0..8a49c3b97bd4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -572,11 +572,16 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, INIT_LIST_HEAD(>validated); /* p->bo_list could already be assigned if AMDGPU_CHUNK_ID_BO_HANDLES is present */ - if (!p->bo_list) - p->bo_list = amdgpu_bo_list_get(fpriv, cs->in.bo_list_handle); - else + if (p->bo_list) { mutex_lock(>bo_list->lock); + } else if (cs->in.bo_list_handle) { + r = amdgpu_bo_list_get(fpriv, cs->in.bo_list_handle, + >bo_list); + if (r) + return r; + } + if (p->bo_list) { amdgpu_bo_list_get_list(p->bo_list, >validated); if (p->bo_list->first_userptr != p->bo_list->num_entries) -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
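The interface change above, returning an int error code and handing the list back through an out parameter instead of returning NULL, is what lets the caller distinguish "no list requested" from "invalid handle". A toy model of the reworked lookup (not driver code: the array stands in for the idr and a plain counter for the kref):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of the reworked amdgpu_bo_list_get(): an invalid handle is
 * now reported as -ENOENT instead of being silently ignored. */
struct bo_list { int refcount; };

#define MAX_HANDLES 4
static struct bo_list *handles[MAX_HANDLES];

static int bo_list_get(int id, struct bo_list **result)
{
    if (id < 0 || id >= MAX_HANDLES || !handles[id])
        return -ENOENT;

    *result = handles[id];
    (*result)->refcount++;      /* kref_get_unless_zero() in the kernel */
    return 0;
}
```

With this shape the CS ioctl can propagate the error to user space instead of quietly submitting without a BO list.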
RE: [PATCH 1/2] drm/amdgpu: remove superflous UVD encode entity
Acked-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Thursday, July 19, 2018 2:45 AM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 1/2] drm/amdgpu: remove superflous UVD encode entity Not sure what that was every used for, but now it is completely unused. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 1 - drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 12 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 14 -- 3 files changed, 27 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h index 8b23a1b00c76..cae3f526216b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h @@ -48,7 +48,6 @@ struct amdgpu_uvd_inst { struct amdgpu_ring ring_enc[AMDGPU_MAX_UVD_ENC_RINGS]; struct amdgpu_irq_src irq; struct drm_sched_entity entity; - struct drm_sched_entity entity_enc; uint32_tsrbm_soft_reset; }; diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c index b796dc8375cd..598dbeaba636 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c @@ -418,16 +418,6 @@ static int uvd_v6_0_sw_init(void *handle) adev->uvd.num_enc_rings = 0; DRM_INFO("UVD ENC is disabled\n"); - } else { - struct drm_sched_rq *rq; - ring = >uvd.inst->ring_enc[0]; - rq = >sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL]; - r = drm_sched_entity_init(>uvd.inst->entity_enc, - , 1, NULL); - if (r) { - DRM_ERROR("Failed setting up UVD ENC run queue.\n"); - return r; - } } r = amdgpu_uvd_resume(adev); @@ -463,8 +453,6 @@ static int uvd_v6_0_sw_fini(void *handle) return r; if (uvd_v6_0_enc_support(adev)) { - drm_sched_entity_destroy(>uvd.inst->ring_enc[0].sched, >uvd.inst->entity_enc); - for (i = 0; i < adev->uvd.num_enc_rings; ++i) amdgpu_ring_fini(>uvd.inst->ring_enc[i]); } diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c index 89fe910e5c9a..2192f4536c24 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c @@ -389,7 +389,6 @@ static int uvd_v7_0_early_init(void *handle) static int uvd_v7_0_sw_init(void *handle) { struct amdgpu_ring *ring; - struct drm_sched_rq *rq; int i, j, r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; @@ -421,17 +420,6 @@ static int uvd_v7_0_sw_init(void *handle) DRM_INFO("PSP loading UVD firmware\n"); } - for (j = 0; j < adev->uvd.num_uvd_inst; j++) { - ring = >uvd.inst[j].ring_enc[0]; - rq = >sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL]; - r = drm_sched_entity_init(>uvd.inst[j].entity_enc, - , 1, NULL); - if (r) { - DRM_ERROR("(%d)Failed setting up UVD ENC run queue.\n", j); - return r; - } - } - r = amdgpu_uvd_resume(adev); if (r) return r; @@ -484,8 +472,6 @@ static int uvd_v7_0_sw_fini(void *handle) return r; for (j = 0; j < adev->uvd.num_uvd_inst; ++j) { - drm_sched_entity_destroy(>uvd.inst[j].ring_enc[0].sched, >uvd.inst[j].entity_enc); - for (i = 0; i < adev->uvd.num_enc_rings; ++i) amdgpu_ring_fini(>uvd.inst[j].ring_enc[i]); } -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu/powerplay: use irq source defines for smu7 sources
Reviewed-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Alex Deucher Sent: Thursday, July 19, 2018 5:09 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu/powerplay: use irq source defines for smu7 sources Use the newly added irq source defines rather than magic numbers for smu7 thermal interrupts. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c index 8eea49e4c74d..2aab1b475945 100644 --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c @@ -27,6 +27,7 @@ #include "atom.h" #include "ivsrcid/thm/irqsrcs_thm_9_0.h" #include "ivsrcid/smuio/irqsrcs_smuio_9_0.h" +#include "ivsrcid/ivsrcid_vislands30.h" uint8_t convert_to_vid(uint16_t vddc) { @@ -545,17 +546,17 @@ int phm_irq_process(struct amdgpu_device *adev, uint32_t src_id = entry->src_id; if (client_id == AMDGPU_IH_CLIENTID_LEGACY) { - if (src_id == 230) + if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH) pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n", PCI_BUS_NUM(adev->pdev->devfn), PCI_SLOT(adev->pdev->devfn), PCI_FUNC(adev->pdev->devfn)); - else if (src_id == 231) + else if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW) pr_warn("GPU under temperature range detected on PCIe %d:%d.%d!\n", PCI_BUS_NUM(adev->pdev->devfn), PCI_SLOT(adev->pdev->devfn), PCI_FUNC(adev->pdev->devfn)); - else if (src_id == 83) + else if (src_id == VISLANDS30_IV_SRCID_GPIO_19) pr_warn("GPU Critical Temperature Fault detected on PCIe %d:%d.%d!\n", PCI_BUS_NUM(adev->pdev->devfn), PCI_SLOT(adev->pdev->devfn), -- 2.13.6 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing 
list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
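The values behind the new names are visible in the diff (230, 231 and 83). As a sketch of why the rename matters, the same dispatch reads far better with named constants; the helper below is hypothetical, only the define names and their values come from the patch:

```c
#include <assert.h>
#include <string.h>

/* Interrupt source ids from the patch; previously bare magic numbers. */
#define VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH 230
#define VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW 231
#define VISLANDS30_IV_SRCID_GPIO_19                     83

/* Hypothetical helper mirroring the phm_irq_process() dispatch. */
static const char *thermal_msg(unsigned int src_id)
{
    if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH)
        return "over temperature";
    if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW)
        return "under temperature";
    if (src_id == VISLANDS30_IV_SRCID_GPIO_19)
        return "critical temperature fault";
    return "unknown";
}
```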
RE: [PATCH 2/2] drm/amdgpu: clean up UVD instance handling v2
Acked-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Thursday, July 19, 2018 2:45 AM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 2/2] drm/amdgpu: clean up UVD instance handling v2 The whole handle, filp and entity handling is superfluous here. We should have reviewed that more thoughtfully. It looks like somebody just made the code instance aware without knowing the background. v2: fix one more missed case in amdgpu_uvd_suspend Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 121 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 10 +-- 2 files changed, 64 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index d708970244eb..80b5c453f8c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c @@ -263,21 +263,20 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev) dev_err(adev->dev, "(%d) failed to allocate UVD bo\n", r); return r; } + } - ring = >uvd.inst[j].ring; - rq = >sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL]; - r = drm_sched_entity_init(>uvd.inst[j].entity, , - 1, NULL); - if (r != 0) { - DRM_ERROR("Failed setting up UVD(%d) run queue.\n", j); - return r; - } - - for (i = 0; i < adev->uvd.max_handles; ++i) { - atomic_set(>uvd.inst[j].handles[i], 0); - adev->uvd.inst[j].filp[i] = NULL; - } + ring = >uvd.inst[0].ring; + rq = >sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL]; + r = drm_sched_entity_init(>uvd.entity, , 1, NULL); + if (r) { + DRM_ERROR("Failed setting up UVD kernel entity.\n"); + return r; } + for (i = 0; i < adev->uvd.max_handles; ++i) { + atomic_set(>uvd.handles[i], 0); + adev->uvd.filp[i] = NULL; + } + /* from uvd v5.0 HW addressing capacity increased to 64 bits */ if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 0)) adev->uvd.address_64_bit = true; @@ -306,11 +305,12 @@ int amdgpu_uvd_sw_fini(struct 
amdgpu_device *adev) { int i, j; + drm_sched_entity_destroy(>uvd.inst->ring.sched, +>uvd.entity); + for (j = 0; j < adev->uvd.num_uvd_inst; ++j) { kfree(adev->uvd.inst[j].saved_bo); - drm_sched_entity_destroy(>uvd.inst[j].ring.sched, >uvd.inst[j].entity); - amdgpu_bo_free_kernel(>uvd.inst[j].vcpu_bo, >uvd.inst[j].gpu_addr, (void **)>uvd.inst[j].cpu_addr); @@ -333,20 +333,20 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev) cancel_delayed_work_sync(>uvd.idle_work); + /* only valid for physical mode */ + if (adev->asic_type < CHIP_POLARIS10) { + for (i = 0; i < adev->uvd.max_handles; ++i) + if (atomic_read(>uvd.handles[i])) + break; + + if (i == adev->uvd.max_handles) + return 0; + } + for (j = 0; j < adev->uvd.num_uvd_inst; ++j) { if (adev->uvd.inst[j].vcpu_bo == NULL) continue; - /* only valid for physical mode */ - if (adev->asic_type < CHIP_POLARIS10) { - for (i = 0; i < adev->uvd.max_handles; ++i) - if (atomic_read(>uvd.inst[j].handles[i])) - break; - - if (i == adev->uvd.max_handles) - continue; - } - size = amdgpu_bo_size(adev->uvd.inst[j].vcpu_bo); ptr = adev->uvd.inst[j].cpu_addr; @@ -398,30 +398,27 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev) void amdgpu_uvd_free_handles(struct amdgpu_device *adev, struct drm_file *filp) { - struct amdgpu_ring *ring; - int i, j, r; - - for (j = 0; j < adev->uvd.num_uvd_inst; j++) { - ring = >uvd.inst[j].ring; + struct amdgpu_ring *ring = >uvd.inst[0].ring; + int i, r; - for (i = 0; i < adev->uvd.max_handles; ++i) { - uint32_t handle = atomic_read(>uvd.inst[j].handles[i]); - if (handle != 0 && adev->uvd.inst[j].filp[i] == filp) { - struct dma_fence *fence; - - r = amdgpu_uvd_get_destroy_msg(ring, handle, - false, ); -
RE: [PATCH] drm/amdgpu: fix job priority handling
Reviewed-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Thursday, July 19, 2018 2:15 AM To: amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/amdgpu: fix job priority handling The job might already be released at this point. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 911c4a12a163..7c5cc33d0cda 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1209,6 +1209,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, { struct amdgpu_ring *ring = p->ring; struct drm_sched_entity *entity = >ctx->rings[ring->idx].entity; + enum drm_sched_priority priority; struct amdgpu_job *job; unsigned i; uint64_t seq; @@ -1258,10 +1259,11 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, amdgpu_job_free_resources(job); trace_amdgpu_cs_ioctl(job); + priority = job->base.s_priority; drm_sched_entity_push_job(>base, entity); ring = to_amdgpu_ring(entity->sched); - amdgpu_ring_priority_get(ring, job->base.s_priority); + amdgpu_ring_priority_get(ring, priority); ttm_eu_fence_buffer_objects(>ticket, >validated, p->fence); amdgpu_mn_unlock(p->mn); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 8b679c85d213..5a2c26a85984 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -126,6 +126,7 @@ void amdgpu_job_free(struct amdgpu_job *job) int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, void *owner, struct dma_fence **f) { + enum drm_sched_priority priority; struct amdgpu_ring *ring; int r; @@ -139,10 +140,11 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity 
*entity, job->owner = owner; *f = dma_fence_get(>base.s_fence->finished); amdgpu_job_free_resources(job); + priority = job->base.s_priority; drm_sched_entity_push_job(>base, entity); ring = to_amdgpu_ring(entity->sched); - amdgpu_ring_priority_get(ring, job->base.s_priority); + amdgpu_ring_priority_get(ring, priority); return 0; } -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
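The bug being fixed here is a read-after-hand-off: once drm_sched_entity_push_job() runs, the scheduler may complete and free the job, so job->base.s_priority must be read into a local first. A minimal sketch of the corrected ordering, with the hand-off modeled by clearing the struct (a stand-in for the scheduler freeing it):

```c
#include <assert.h>
#include <string.h>

struct job { int priority; };

/* Models drm_sched_entity_push_job(): after the hand-off the job may be
 * freed or reused at any time, so its fields must not be read again. */
static void push_job(struct job *job)
{
    memset(job, 0, sizeof(*job));
}

static int submit(struct job *job)
{
    int priority = job->priority;   /* cache before the hand-off */

    push_job(job);
    return priority;                /* safe: no access to *job here */
}
```

The same one-line caching appears in both amdgpu_cs_submit() and amdgpu_job_submit() in the patch above.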
RE: [PATCH] drm/amdgpu: always initialize job->base.sched
Acked-by: Chunming Zhou, though I don't think this is an elegant workaround, even with the explanatory comment in the code. -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König Sent: Tuesday, July 17, 2018 3:05 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/amdgpu: always initialize job->base.sched Otherwise we can't clean up the job if we run into an error before it is pushed to the scheduler. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 024efb7ea6d6..42a4764d728e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -54,6 +54,11 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs, if (!*job) return -ENOMEM; + /* +* Initialize the scheduler to at least some ring so that we always +* have a pointer to adev. +*/ + (*job)->base.sched = >rings[0]->sched; (*job)->vm = vm; (*job)->ibs = (void *)&(*job)[1]; (*job)->num_ibs = num_ibs; -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
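The invariant behind the patch: error paths may free a job before it is ever pushed, and the cleanup code reaches the device through job->base.sched, so that pointer must never be left NULL. A toy sketch of the invariant (all names here are stand-ins, not the real amdgpu structures):

```c
#include <assert.h>
#include <stddef.h>

struct sched { int id; };
struct job   { struct sched *sched; };

static struct sched default_sched;

/* Models the fix in amdgpu_job_alloc(): point at *some* valid scheduler
 * immediately, even though a later push may move the job elsewhere. */
static void job_alloc(struct job *job)
{
    job->sched = &default_sched;
}

/* Cleanup dereferences job->sched; with the old code this would be a
 * NULL dereference when an error occurred before the job was pushed. */
static struct sched *job_cleanup(struct job *job)
{
    assert(job->sched != NULL);
    return job->sched;
}
```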
RE: [PATCH 2/2] drm/amdgpu: change ring priority after pushing the job
Reviewed-by: Chunming Zhou for series. -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Monday, July 16, 2018 9:25 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 2/2] drm/amdgpu: change ring priority after pushing the job Pushing a job can change the ring assignment of an entity. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 -- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 72dc9b36b937..911c4a12a163 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1256,11 +1256,13 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, job->uf_sequence = seq; amdgpu_job_free_resources(job); - amdgpu_ring_priority_get(p->ring, job->base.s_priority); trace_amdgpu_cs_ioctl(job); drm_sched_entity_push_job(>base, entity); + ring = to_amdgpu_ring(entity->sched); + amdgpu_ring_priority_get(ring, job->base.s_priority); + ttm_eu_fence_buffer_objects(>ticket, >validated, p->fence); amdgpu_mn_unlock(p->mn); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 024efb7ea6d6..10c769db5d67 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -121,7 +121,7 @@ void amdgpu_job_free(struct amdgpu_job *job) int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, void *owner, struct dma_fence **f) { - struct amdgpu_ring *ring = to_amdgpu_ring(entity->sched); + struct amdgpu_ring *ring; int r; if (!f) @@ -134,9 +134,11 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, job->owner = owner; *f = dma_fence_get(>base.s_fence->finished); amdgpu_job_free_resources(job); - amdgpu_ring_priority_get(ring, job->base.s_priority); drm_sched_entity_push_job(>base, 
entity); + ring = to_amdgpu_ring(entity->sched); + amdgpu_ring_priority_get(ring, job->base.s_priority); + return 0; } -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 7/7] drm/amdgpu: minor cleanup in amdgpu_job.c
Series is Acked-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Friday, July 13, 2018 11:20 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 7/7] drm/amdgpu: minor cleanup in amdgpu_job.c Remove superflous NULL check, fix coding style a bit, shorten error messages. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index bd708b726003..024efb7ea6d6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -33,7 +33,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); struct amdgpu_job *job = to_amdgpu_job(s_job); - DRM_ERROR("ring %s timeout, last signaled seq=%u, last emitted seq=%u\n", + DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n", job->base.sched->name, atomic_read(>fence_drv.last_seq), ring->fence_drv.sync_seq); @@ -161,16 +161,17 @@ static struct dma_fence *amdgpu_job_dependency(struct drm_sched_job *sched_job, struct amdgpu_ring *ring = to_amdgpu_ring(s_entity->sched); struct amdgpu_job *job = to_amdgpu_job(sched_job); struct amdgpu_vm *vm = job->vm; + struct dma_fence *fence; bool explicit = false; int r; - struct dma_fence *fence = amdgpu_sync_get_fence(>sync, ); + fence = amdgpu_sync_get_fence(>sync, ); if (fence && explicit) { if (drm_sched_dependency_optimized(fence, s_entity)) { r = amdgpu_sync_fence(ring->adev, >sched_sync, fence, false); if (r) - DRM_ERROR("Error adding fence to sync (%d)\n", r); + DRM_ERROR("Error adding fence (%d)\n", r); } } @@ -194,10 +195,6 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) struct amdgpu_job *job; int r; - if (!sched_job) { - DRM_ERROR("job is null\n"); - return NULL; - } job = 
to_amdgpu_job(sched_job); finished = >base.s_fence->finished; -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH v2] drm/amdgpu: Allow to create BO lists in CS ioctl v2
Hi Andrey, Could you add compatibility flag or increase kms driver version? So that user space can keep old path when using new one. Regards, David Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of zhoucm1 Sent: Thursday, July 12, 2018 10:31 AM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Cc: Olsak, Marek ; Koenig, Christian Subject: Re: [PATCH v2] drm/amdgpu: Allow to create BO lists in CS ioctl v2 On 2018年07月12日 04:57, Andrey Grodzovsky wrote: > This change is to support MESA performace optimization. > Modify CS IOCTL to allow its input as command buffer and an array of > buffer handles to create a temporay bo list and then destroy it when > IOCTL completes. > This saves on calling for BO_LIST create and destry IOCTLs in MESA and > by this improves performance. > > v2: Avoid inserting the temp list into idr struct. > > Signed-off-by: Andrey Grodzovsky > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 11 > drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 86 > ++--- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 51 +++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +- > include/uapi/drm/amdgpu_drm.h | 1 + > 5 files changed, 114 insertions(+), 38 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index 8eaba0f..9b472b2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -732,6 +732,17 @@ void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list, >struct list_head *validated); > void amdgpu_bo_list_put(struct amdgpu_bo_list *list); > void amdgpu_bo_list_free(struct amdgpu_bo_list *list); > +int amdgpu_bo_create_list_entry_array(struct drm_amdgpu_bo_list_in *in, > + struct drm_amdgpu_bo_list_entry > **info_param); > + > +int amdgpu_bo_list_create(struct amdgpu_device *adev, > + struct drm_file *filp, > + struct drm_amdgpu_bo_list_entry *info, > + unsigned num_entries, > + int *id, > + struct amdgpu_bo_list **list); > + 
> +void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id); > > /* >* GFX stuff > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c > index 92be7f6..14c7c59 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c > @@ -55,11 +55,12 @@ static void amdgpu_bo_list_release_rcu(struct kref *ref) > kfree_rcu(list, rhead); > } > > -static int amdgpu_bo_list_create(struct amdgpu_device *adev, > +int amdgpu_bo_list_create(struct amdgpu_device *adev, >struct drm_file *filp, >struct drm_amdgpu_bo_list_entry *info, >unsigned num_entries, > - int *id) > + int *id, > + struct amdgpu_bo_list **list_out) > { > int r; > struct amdgpu_fpriv *fpriv = filp->driver_priv; @@ -78,20 +79,25 @@ > static int amdgpu_bo_list_create(struct amdgpu_device *adev, > return r; > } > > + if (id) { > /* idr alloc should be called only after initialization of bo list. */ > - mutex_lock(>bo_list_lock); > - r = idr_alloc(>bo_list_handles, list, 1, 0, GFP_KERNEL); > - mutex_unlock(>bo_list_lock); > - if (r < 0) { > - amdgpu_bo_list_free(list); > - return r; > + mutex_lock(>bo_list_lock); > + r = idr_alloc(>bo_list_handles, list, 1, 0, GFP_KERNEL); > + mutex_unlock(>bo_list_lock); > + if (r < 0) { > + amdgpu_bo_list_free(list); > + return r; > + } > + *id = r; > } > - *id = r; > + > + if (list_out) > + *list_out = list; > > return 0; > } > > -static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int > id) > +void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id) > { > struct amdgpu_bo_list *list; > > @@ -263,53 +269,68 @@ void amdgpu_bo_list_free(struct amdgpu_bo_list *list) > kfree(list); > } > > -int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data, > - struct drm_file *filp) > +int amdgpu_bo_create_list_entry_array(struct drm_amdgpu_bo_list_in *in, > + struct drm_amdgpu_bo_list_entry > **info_param) > { > - const uint32_t info_size = sizeof(struct 
drm_amdgpu_bo_list_entry); > - > - struct amdgpu_device *adev = dev->dev_private; > - struct amdgpu_fpriv *fpriv = filp->driver_priv; > -
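The thread is truncated here, but the v2 interface is clear from the quoted diff: amdgpu_bo_list_create() only registers the list in the handle table when the caller asks for an id, so the CS ioctl can build a temporary, unregistered list and free it when the ioctl completes. A toy model of that split (the fixed-size array stands in for the idr):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct bo_list { int dummy; };

#define MAX_IDS 4
static struct bo_list *idr[MAX_IDS + 1];   /* slot 0 unused, like idr */

/* Toy version of the v2 amdgpu_bo_list_create(): registration in the
 * handle table is skipped entirely when no id is requested. */
static int bo_list_create(struct bo_list *list, int *id,
                          struct bo_list **list_out)
{
    if (id) {
        int i;

        for (i = 1; i <= MAX_IDS; i++)
            if (!idr[i])
                break;
        if (i > MAX_IDS)
            return -ENOMEM;
        idr[i] = list;
        *id = i;
    }
    if (list_out)
        *list_out = list;      /* temporary list; the caller owns it */
    return 0;
}
```

The performance win David asks about comes from the second path: MESA can pass BO handles directly in the CS ioctl instead of paying for separate BO_LIST create and destroy ioctls per submission.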
RE: [PATCH 1/2] drm/amdgpu: switch firmware path for CIK parts
Yes, agreed: the radeon driver uses the radeon path and amdgpu uses the amdgpu path, which makes sense to me. The series is Reviewed-by: Chunming Zhou Regards, David Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Alex Deucher Sent: Tuesday, July 03, 2018 4:32 AM To: Dave Airlie Cc: Deucher, Alexander ; amd-gfx mailing list Subject: Re: [PATCH 1/2] drm/amdgpu: switch firmware path for CIK parts On Mon, Jul 2, 2018 at 4:12 PM, Dave Airlie wrote: > On 3 July 2018 at 05:36, Alex Deucher wrote: >> Use separate firmware path for amdgpu to avoid conflicts with radeon >> on CIK parts. >> > > Won't that cause a chicken-and-egg problem? A new kernel with an old > firmware package will suddenly start failing. Or do we not really care, > since in theory we don't support amdgpu on those parts yet? > > Seems like we'd want to fall back to the old paths if possible. I guess we could fall back, but in most cases the firmware loader will have to time out first, and then most users will assume it's broken anyway. radeon is still the default with most distros, so I don't think it's super critical. Alex > > Dave. 
> >> Signed-off-by: Alex Deucher >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c | 8 ++-- >> drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 10 ++--- >> drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 10 ++--- >> drivers/gpu/drm/amd/amdgpu/ci_dpm.c | 10 ++--- >> drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 24 +-- >> drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 72 >> - >> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 +-- >> 7 files changed, 70 insertions(+), 70 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c >> index e950730f1933..693ec5ea4950 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c >> @@ -314,17 +314,17 @@ static int amdgpu_cgs_get_firmware_info(struct >> cgs_device *cgs_device, >> (adev->pdev->revision == 0x81) || >> (adev->pdev->device == 0x665f)) { >> info->is_kicker = true; >> - strcpy(fw_name, >> "radeon/bonaire_k_smc.bin"); >> + strcpy(fw_name, >> + "amdgpu/bonaire_k_smc.bin"); >> } else { >> - strcpy(fw_name, >> "radeon/bonaire_smc.bin"); >> + strcpy(fw_name, >> + "amdgpu/bonaire_smc.bin"); >> } >> break; >> case CHIP_HAWAII: >> if (adev->pdev->revision == 0x80) { >> info->is_kicker = true; >> - strcpy(fw_name, >> "radeon/hawaii_k_smc.bin"); >> + strcpy(fw_name, >> + "amdgpu/hawaii_k_smc.bin"); >> } else { >> - strcpy(fw_name, >> "radeon/hawaii_smc.bin"); >> + strcpy(fw_name, >> + "amdgpu/hawaii_smc.bin"); >> } >> break; >> case CHIP_TOPAZ: >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c >> index 0b46ea1c6290..3e70eb61a960 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c >> @@ -53,11 +53,11 @@ >> >> /* Firmware Names */ >> #ifdef CONFIG_DRM_AMDGPU_CIK >> -#define FIRMWARE_BONAIRE "radeon/bonaire_uvd.bin" >> -#define FIRMWARE_KABINI"radeon/kabini_uvd.bin" >> -#define FIRMWARE_KAVERI"radeon/kaveri_uvd.bin" >> -#define 
FIRMWARE_HAWAII"radeon/hawaii_uvd.bin" >> -#define FIRMWARE_MULLINS "radeon/mullins_uvd.bin" >> +#define FIRMWARE_BONAIRE "amdgpu/bonaire_uvd.bin" >> +#define FIRMWARE_KABINI"amdgpu/kabini_uvd.bin" >> +#define FIRMWARE_KAVERI"amdgpu/kaveri_uvd.bin" >> +#define FIRMWARE_HAWAII"amdgpu/hawaii_uvd.bin" >> +#define FIRMWARE_MULLINS "amdgpu/mullins_uvd.bin" >> #endif >> #define FIRMWARE_TONGA "amdgpu/tonga_uvd.bin" >> #define FIRMWARE_CARRIZO "amdgpu/carrizo_uvd.bin" >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c >> index b0dcdfd85f5b..6ae1ad7e83b3 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c >> @@ -40,11 +40,11 @@ >> >> /* Firmware Names */ >> #ifdef CONFIG_DRM_AMDGPU_CIK >> -#define FIRMWARE_BONAIRE "radeon/bonaire_vce.bin" >> -#define FIRMWARE_KABINI"radeon/kabini_vce.bin" >> -#define FIRMWARE_KAVERI
RE: [PATCH] drm/amdgpu: remove duplicated codes
Feel free to add my RB on that. Thanks, David Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Flora Cui Sent: Wednesday, June 27, 2018 3:06 PM To: Zhou, David(ChunMing) Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: remove duplicated codes The fence_context and seqno are initialized in both amdgpu_vm_manager_init() and amdgpu_vmid_mgr_init(); remove the amdgpu_vmid_mgr_init() copy. Change-Id: Ic0dbd693bac093e54eb95b5e547c89b64a5743b8 Signed-off-by: Flora Cui --- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c index a1c78f9..3a072a7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c @@ -578,11 +578,6 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev) list_add_tail(_mgr->ids[j].list, _mgr->ids_lru); } } - - adev->vm_manager.fence_context = - dma_fence_context_alloc(AMDGPU_MAX_RINGS); - for (i = 0; i < AMDGPU_MAX_RINGS; ++i) - adev->vm_manager.seqno[i] = 0; } /** -- 2.7.4 On Wed, Jun 27, 2018 at 02:38:09PM +0800, Zhou, David(ChunMing) wrote: > Please add a patch comment describing what is duplicated and where. 
> > -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Flora Cui > Sent: Wednesday, June 27, 2018 2:10 PM > To: amd-gfx@lists.freedesktop.org > Cc: Cui, Flora > Subject: [PATCH] drm/amdgpu: remove duplicated codes > > Change-Id: Ic0dbd693bac093e54eb95b5e547c89b64a5743b8 > Signed-off-by: Flora Cui > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 5 - > 1 file changed, 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > index a1c78f9..3a072a7 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > @@ -578,11 +578,6 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev) > list_add_tail(_mgr->ids[j].list, _mgr->ids_lru); > } > } > - > - adev->vm_manager.fence_context = > - dma_fence_context_alloc(AMDGPU_MAX_RINGS); > - for (i = 0; i < AMDGPU_MAX_RINGS; ++i) > - adev->vm_manager.seqno[i] = 0; > } > > /** > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: remove duplicated codes
Please add a patch comment describing what is duplicated and where. -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Flora Cui Sent: Wednesday, June 27, 2018 2:10 PM To: amd-gfx@lists.freedesktop.org Cc: Cui, Flora Subject: [PATCH] drm/amdgpu: remove duplicated codes Change-Id: Ic0dbd693bac093e54eb95b5e547c89b64a5743b8 Signed-off-by: Flora Cui --- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c index a1c78f9..3a072a7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c @@ -578,11 +578,6 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev) list_add_tail(_mgr->ids[j].list, _mgr->ids_lru); } } - - adev->vm_manager.fence_context = - dma_fence_context_alloc(AMDGPU_MAX_RINGS); - for (i = 0; i < AMDGPU_MAX_RINGS; ++i) - adev->vm_manager.seqno[i] = 0; } /** -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: Add AMDGPU_GPU_PAGES_IN_CPU_PAGE define
The current amdgpu driver indeed always sets the GPU page size to 4096. In fact, our GPU supports bigger page sizes, like 64KB; we just don't use them. I remember the previous amdsoc (old Android kernel driver) used 64KB. Correct me if I'm wrong. Sent from Smartisan Pro. Michel Dänzer wrote on 2018-06-25 at 5:10 PM: On 2018-06-25 03:56 AM, zhoucm1 wrote: > one question to you: > > Did you consider the case that GPU_PAGE_SIZE > CPU_PAGE_SIZE? That is never the case: AMDGPU_GPU_PAGE_SIZE is always 4096, and PAGE_SIZE is always >= 4096 (an integer multiple of it). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH v2 1/2] drm/scheduler: Rename cleanup functions v2.
Acked-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Andrey Grodzovsky Sent: Thursday, June 21, 2018 11:33 PM To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Cc: e...@anholt.net; Koenig, Christian ; Grodzovsky, Andrey ; l.st...@pengutronix.de Subject: [PATCH v2 1/2] drm/scheduler: Rename cleanup functions v2. Everything in the flush code path (i.e. waiting for the SW queue to become empty) is named with *_flush(), and everything in the release code path is named with *_fini(). This patch also affects the amdgpu and etnaviv drivers, which use those functions. v2: Also apply the change to v3d. Signed-off-by: Andrey Grodzovsky Suggested-by: Christian König Acked-by: Lucas Stach --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 8 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 4 ++-- drivers/gpu/drm/scheduler/gpu_scheduler.c | 18 +- drivers/gpu/drm/v3d/v3d_drv.c | 2 +- include/drm/gpu_scheduler.h | 6 +++--- 11 files changed, 26 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index 64b3a1e..c0f06c0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -104,7 +104,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev, failed: for (j = 0; j < i; j++) - drm_sched_entity_fini(>rings[j]->sched, + drm_sched_entity_destroy(>rings[j]->sched, >rings[j].entity); kfree(ctx->fences); ctx->fences = NULL; @@ -178,7 +178,7 @@ static void amdgpu_ctx_do_release(struct kref *ref) if (ctx->adev->rings[i] == >adev->gfx.kiq.ring) continue; - drm_sched_entity_fini(>adev->rings[i]->sched, + drm_sched_entity_destroy(>adev->rings[i]->sched, 
>rings[i].entity); } @@ -466,7 +466,7 @@ void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr) if (ctx->adev->rings[i] == >adev->gfx.kiq.ring) continue; - max_wait = drm_sched_entity_do_release(>adev->rings[i]->sched, + max_wait = drm_sched_entity_flush(>adev->rings[i]->sched, >rings[i].entity, max_wait); } } @@ -492,7 +492,7 @@ void amdgpu_ctx_mgr_entity_cleanup(struct amdgpu_ctx_mgr *mgr) continue; if (kref_read(>refcount) == 1) - drm_sched_entity_cleanup(>adev->rings[i]->sched, + drm_sched_entity_fini(>adev->rings[i]->sched, >rings[i].entity); else DRM_ERROR("ctx %p is still alive\n", ctx); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 0c084d3..0246cb8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -162,7 +162,7 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev) static void amdgpu_ttm_global_fini(struct amdgpu_device *adev) { if (adev->mman.mem_global_referenced) { - drm_sched_entity_fini(adev->mman.entity.sched, + drm_sched_entity_destroy(adev->mman.entity.sched, >mman.entity); mutex_destroy(>mman.gtt_window_lock); drm_global_item_unref(>mman.bo_global_ref.ref); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index cc15d32..0b46ea1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c @@ -309,7 +309,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev) for (j = 0; j < adev->uvd.num_uvd_inst; ++j) { kfree(adev->uvd.inst[j].saved_bo); - drm_sched_entity_fini(>uvd.inst[j].ring.sched, >uvd.inst[j].entity); + drm_sched_entity_destroy(>uvd.inst[j].ring.sched, +>uvd.inst[j].entity); amdgpu_bo_free_kernel(>uvd.inst[j].vcpu_bo, >uvd.inst[j].gpu_addr, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c index 23d960e..b0dcdfd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c +++ 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c @@ -222,7 +222,7 @@ int amdgpu_vce_sw_fini(struct
RE: [PATCH] drm/amdgpu: skip huge page for PRT mapping
Good catch, Reviewed-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Junwei Zhang Sent: Monday, June 04, 2018 10:04 AM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Jerry Subject: [PATCH] drm/amdgpu: skip huge page for PRT mapping PRT mapping doesn't support huge page, since it's per PTE basis. Signed-off-by: Junwei Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 850cd66..4ce8bb0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -,7 +,8 @@ static void amdgpu_vm_handle_huge_pages(struct amdgpu_pte_update_params *p, /* In the case of a mixed PT the PDE must point to it*/ if (p->adev->asic_type >= CHIP_VEGA10 && !p->src && - nptes == AMDGPU_VM_PTE_COUNT(p->adev)) { + nptes == AMDGPU_VM_PTE_COUNT(p->adev) && + !(flags & AMDGPU_PTE_PRT)) { /* Set the huge page flag to stop scanning at this PDE */ flags |= AMDGPU_PDE_PTE; } -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
Looks good, Acked-by: Chunming Zhou -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Emily Deng Sent: Thursday, May 17, 2018 11:05 AM To: amd-gfx@lists.freedesktop.org Cc: Deng, Emily Subject: [PATCH] drm/amdgpu: add rcu_barrier after entity fini Freeing a fence from amdgpu_fence_slab needs two call_rcu() passes. To avoid amdgpu_fence_slab_fini() calling kmem_cache_destroy(amdgpu_fence_slab) before kmem_cache_free(amdgpu_fence_slab, fence) has run, add rcu_barrier() after drm_sched_entity_fini(). The call trace leading to kmem_cache_free(amdgpu_fence_slab, fence) is as below: 1.drm_sched_entity_fini -> drm_sched_entity_cleanup -> dma_fence_put(entity->last_scheduled) -> drm_sched_fence_release_finished -> drm_sched_fence_release_scheduled -> call_rcu(>finished.rcu, drm_sched_fence_free) 2.drm_sched_fence_free -> dma_fence_put(fence->parent) -> amdgpu_fence_release -> call_rcu(>rcu, amdgpu_fence_free) -> kmem_cache_free(amdgpu_fence_slab, fence); Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df Signed-off-by: Emily Deng --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index cc3b067..07b2e10 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev) if (adev->mman.mem_global_referenced) { drm_sched_entity_fini(adev->mman.entity.sched, >mman.entity); + rcu_barrier(); mutex_destroy(>mman.gtt_window_lock); drm_global_item_unref(>mman.bo_global_ref.ref); drm_global_item_unref(>mman.mem_global_ref); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 2/2] drm/amdgpu: set ttm bo priority before initialization
The series is OK to me, Reviewed-by: Chunming Zhou <david1.z...@amd.com> It is better to wait Christian to have a look before pushing patch. Regards, David Zhou -Original Message- From: Junwei Zhang [mailto:jerry.zh...@amd.com] Sent: Friday, May 11, 2018 12:58 PM To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Cc: Koenig, Christian <christian.koe...@amd.com>; Zhou, David(ChunMing) <david1.z...@amd.com>; Zhang, Jerry <jerry.zh...@amd.com> Subject: [PATCH 2/2] drm/amdgpu: set ttm bo priority before initialization Signed-off-by: Junwei Zhang <jerry.zh...@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index e62153a..6a9e46a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -419,6 +419,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo->tbo.bdev = >mman.bdev; amdgpu_ttm_placement_from_domain(bo, bp->domain); + if (bp->type == ttm_bo_type_kernel) + bo->tbo.priority = 1; r = ttm_bo_init_reserved(>mman.bdev, >tbo, size, bp->type, >placement, page_align, , acc_size, @@ -434,9 +436,6 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, else amdgpu_cs_report_moved_bytes(adev, ctx.bytes_moved, 0); - if (bp->type == ttm_bo_type_kernel) - bo->tbo.priority = 1; - if (bp->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED && bo->tbo.mem.placement & TTM_PL_FLAG_VRAM) { struct dma_fence *fence; -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: bo could be null when access in vm bo update
Reviewed-by: Chunming Zhou-Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Junwei Zhang Sent: Monday, April 23, 2018 5:29 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Jerry Subject: [PATCH] drm/amdgpu: bo could be null when access in vm bo update Signed-off-by: Junwei Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 6a372ca..1c00f1a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1509,7 +1509,6 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct drm_mm_node *nodes; struct dma_fence *exclusive, **last_update; uint64_t flags; - uint32_t mem_type; int r; if (clear || !bo_va->base.bo) { @@ -1568,9 +1567,9 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, * the evicted list so that it gets validated again on the * next command submission. */ - mem_type = bo->tbo.mem.mem_type; if (bo && bo->tbo.resv == vm->root.base.bo->tbo.resv && - !(bo->preferred_domains & amdgpu_mem_type_to_domain(mem_type))) + !(bo->preferred_domains & + amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))) list_add_tail(_va->base.vm_status, >evicted); spin_unlock(>status_lock); -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/scheduler: fix build broken by "move last_sched fence updating prior to job popping"
Reviewed-by: Chunming Zhou-Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Wednesday, April 18, 2018 6:06 PM To: amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/scheduler: fix build broken by "move last_sched fence updating prior to job popping" We don't have s_fence as local variable here. Signed-off-by: Christian König --- drivers/gpu/drm/scheduler/gpu_scheduler.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c index 5de79bbb12c8..f4b862503710 100644 --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c @@ -402,7 +402,7 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity) dma_fence_set_error(_job->s_fence->finished, -ECANCELED); dma_fence_put(entity->last_scheduled); - entity->last_scheduled = dma_fence_get(_fence->finished); + entity->last_scheduled = dma_fence_get(_job->s_fence->finished); spsc_queue_pop(>job_queue); return sched_job; -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: PROBLEM: linux-firmware provided firmware files do not support AMDVLK driver
As far as I know, AMDVLK runs on the amdgpu driver; it does not support the radeon kernel driver. Regards, David Zhou -Original Message- From: boomboom psh [mailto:andrewston...@gmail.com] Sent: Wednesday, April 18, 2018 11:16 AM To: Deucher, Alexander <alexander.deuc...@amd.com>; Zhou, David(ChunMing) <david1.z...@amd.com>; Koenig, Christian <christian.koe...@amd.com> Cc: amd-gfx@lists.freedesktop.org Subject: PROBLEM: linux-firmware provided firmware files do not support AMDVLK driver [1.] linux-firmware provided firmware files do not support AMDVLK driver [2.] Full description of the problem/report: Vulkan instance fails to load on the AMDVLK driver, using a radeon HD7770 card. It throws a VK_ERROR_OUT_OF_HOST_MEMORY. This appears to be due to an outdated firmware, as replacing the firmware with the firmware provided by the amdgpu-pro driver fixes the issue ( https://github.com/GPUOpen-Drivers/AMDVLK/issues/17). The issue appears to also be present on pitcairn cards. ( https://github.com/GPUOpen-Drivers/AMDVLK/issues/25) [3.] firmware, AMDVLK, vulkan, SI, verde, pitcairn [4.1.] Linux version 4.15.15-1-ARCH (builduser@heftig-4572) (gcc version 7.3.1 20180312 (GCC)) #1 SMP PREEMPT Sat Mar 31 23:59:25 UTC 2018 [7.] vulkaninfo from https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers demonstrates the problem [8.] AMD Ryzen 3 1200, 8GB DDR4, Radeon HD7770 1GB [X.] Workaround: copy firmware from amdgpupro driver, however this must be redone every time there is an update to linux-firmware. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/3] drm/amdgpu: refresh per vm bo lru
Then how do we keep a unique LRU order? Any ideas? For stable performance we have to keep a unique LRU order; otherwise, as in the issue I am looking into, the F1 game sometimes runs at 40 fps and sometimes at 28 fps, even after re-validating BOs to their allowed domains. The remaining root cause is that the moved BOs are not the same. Sent from Smartisan Pro. Christian König wrote on 2018-03-27 at 6:50 PM: NAK, we already tried that and it is really not a good idea because it massively increases the per-submission overhead. Christian. Am 27.03.2018 um 12:16 schrieb Chunming Zhou: > Change-Id: Ibad84ed585b0746867a5f4cd1eadc2273e7cf596 > Signed-off-by: Chunming Zhou > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 +++ > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 + > 3 files changed, 18 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 383bf2d31c92..414e61799236 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -919,6 +919,8 @@ static int amdgpu_bo_vm_update_pte(struct > amdgpu_cs_parser *p) >} >} > > + amdgpu_vm_refresh_lru(adev, vm); > + >return r; > } > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 5e35e23511cf..8ad2bb705765 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -1902,6 +1902,21 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct > amdgpu_device *adev, >return bo_va; > } > > +void amdgpu_vm_refresh_lru(struct amdgpu_device *adev, struct amdgpu_vm *vm) > +{ > + struct ttm_bo_global *glob = adev->mman.bdev.glob; > + struct amdgpu_vm_bo_base *bo_base; > + > + spin_lock(>status_lock); > + list_for_each_entry(bo_base, >vm_bo_list, vm_bo) { > + spin_lock(>lru_lock); > + ttm_bo_move_to_lru_tail(_base->bo->tbo); > + if (bo_base->bo->shadow) > + ttm_bo_move_to_lru_tail(_base->bo->shadow->tbo); > + spin_unlock(>lru_lock); > + } > + 
spin_unlock(>status_lock); > +} > > /** >* amdgpu_vm_bo_insert_mapping - insert a new mapping > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > index 1886a561c84e..e01895581489 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > @@ -285,6 +285,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev, > struct dma_fence **fence); > int amdgpu_vm_handle_moved(struct amdgpu_device *adev, > struct amdgpu_vm *vm); > +void amdgpu_vm_refresh_lru(struct amdgpu_device *adev, struct amdgpu_vm *vm); > int amdgpu_vm_bo_update(struct amdgpu_device *adev, >struct amdgpu_bo_va *bo_va, >bool clear); ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: use separate status for buffer funcs availability v2
Patch #1 is Reviewed-by: Chunming Zhou. Patches #2~#4 are Acked-by: Chunming Zhou. -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König Sent: Thursday, March 01, 2018 7:53 PM To: amd-gfx@lists.freedesktop.org Cc: ckoenig.leichtzumer...@gmail.com Subject: [PATCH] drm/amdgpu: use separate status for buffer funcs availability v2 The ring status can change during GPU reset, but we still need to be able to schedule TTM buffer moves in the meantime. Otherwise we can run into problems because of aborted move/fill operations during GPU resets. v2: still check if ring is available during direct submit. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 + 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 2aa6823ef503..614811061d3d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -213,9 +213,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo, abo = ttm_to_amdgpu_bo(bo); switch (bo->mem.mem_type) { case TTM_PL_VRAM: - if (adev->mman.buffer_funcs && - adev->mman.buffer_funcs_ring && - adev->mman.buffer_funcs_ring->ready == false) { + if (!adev->mman.buffer_funcs_enabled) { amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU); } else if (adev->gmc.visible_vram_size < adev->gmc.real_vram_size && !(abo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) { @@ -331,7 +329,7 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev, const uint64_t GTT_MAX_BYTES = (AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GPU_PAGE_SIZE); - if (!ring->ready) { + if (!adev->mman.buffer_funcs_enabled) { DRM_ERROR("Trying to move memory with ring turned off.\n"); return -EINVAL; } @@ -577,12 +575,9 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict, amdgpu_move_null(bo, new_mem); return 0; 
} - if (adev->mman.buffer_funcs == NULL || - adev->mman.buffer_funcs_ring == NULL || - !adev->mman.buffer_funcs_ring->ready) { - /* use memcpy */ + + if (!adev->mman.buffer_funcs_enabled) goto memcpy; - } if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type == TTM_PL_SYSTEM) { @@ -1549,6 +1544,7 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable) else size = adev->gmc.visible_vram_size; man->size = size >> PAGE_SHIFT; + adev->mman.buffer_funcs_enabled = enable; } int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma) @@ -1647,6 +1643,11 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset, unsigned i; int r; + if (direct_submit && !ring->ready) { + DRM_ERROR("Trying to move memory with ring turned off.\n"); + return -EINVAL; + } + max_bytes = adev->mman.buffer_funcs->copy_max_bytes; num_loops = DIV_ROUND_UP(byte_count, max_bytes); num_dw = num_loops * adev->mman.buffer_funcs->copy_num_dw; @@ -1720,7 +1721,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, struct amdgpu_job *job; int r; - if (!ring->ready) { + if (!adev->mman.buffer_funcs_enabled) { DRM_ERROR("Trying to clear memory with ring turned off.\n"); return -EINVAL; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index b8117c6e51f1..6ea7de863041 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -53,6 +53,7 @@ struct amdgpu_mman { /* buffer handling */ const struct amdgpu_buffer_funcs*buffer_funcs; struct amdgpu_ring *buffer_funcs_ring; + boolbuffer_funcs_enabled; struct mutexgtt_window_lock; /* Scheduler entity for buffer moves */ -- 2.14.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order
I don't want to prevent all of it; my new approach is to prevent a later allocation from getting ahead of an earlier one and taking the memory space that the earlier one freed up through eviction. Sent from Smartisan Pro. Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 9:24 PM: Yes, exactly, that's the problem. See, when you want to prevent a process B from allocating the memory process A has evicted, you need to prevent all concurrent allocation. And we don't do that because it causes a major performance drop. Regards, Christian. Am 26.01.2018 um 14:21 schrieb Zhou, David(ChunMing): Your patch will prevent concurrent allocation, and will result in a big allocation performance drop. Sent from Smartisan Pro. Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 9:04 PM: Attached is what you actually want to do, cleanly implemented. But as I said, this is a NO-GO. Regards, Christian. Am 26.01.2018 um 13:43 schrieb Christian König: After my investigation, this issue should be a defect of the TTM design itself, which breaks scheduling balance. Yeah, but again, this is intended design we can't change easily. Regards, Christian. Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing): I am off work, so I am replying by phone; the format may not be plain text. Back to the topic itself: the problem does happen on the amdgpu driver; someone reported to me that when an application runs with two instances, their performance differs. I also reproduced the issue with a unit test (bo_eviction_test). They always think our scheduler isn't working as expected. After my investigation, this issue should be a defect of the TTM design itself, which breaks scheduling balance. Further, if we run containers on our GPU, container A could get a high score while container B gets a low score on the same benchmark. So this is a bug that we need to fix. 
Regards, David Zhou Sent from Smartisan Pro. Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 6:31 PM: Am 26.01.2018 um 11:22 schrieb Chunming Zhou: > there is a scheduling balance issue around get node, like: > a. process A allocates full memory and uses it for submission. > b. process B tries to allocate memory and will wait for process A's BO to be idle in > eviction. > c. process A completes the job; process B's eviction will put process A's BO node, > but in the meantime process C comes in to allocate a BO, will directly get the > node successfully and do its submission, > and process B will again wait for process C's BO to be idle. > d. repeat the above steps, and process B could be delayed much more. > > A later allocation must not get ahead of an earlier one for the same placement. Again NAK to the whole approach. At least with amdgpu the problem you described above never occurs because evictions are pipelined operations. We could only block for deleted regions to become free. But independent of that incoming memory requests while we make room for eviction are intended to be served first. Changing that is certainly a no-go cause that would favor memory hungry applications over small clients. Regards, Christian. 
> > Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18 > Signed-off-by: Chunming Zhou <david1.z...@amd.com><mailto:david1.z...@amd.com> > --- > drivers/gpu/drm/ttm/ttm_bo.c| 69 > +++-- > include/drm/ttm/ttm_bo_api.h| 7 + > include/drm/ttm/ttm_bo_driver.h | 7 + > 3 files changed, 80 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > index d33a6bb742a1..558ec2cf465d 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo.c > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct > ttm_buffer_object *bo, >return 0; > } > > +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter, > + struct ttm_buffer_object *bo, > + const struct ttm_place *place) > +{ > + waiter->tbo = bo; > + memcpy((void *)>place, (void *)place, sizeof(*place)); > + INIT_LIST_HEAD(>list); > +} > + > +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man, > +struct ttm_bo_waiter *waiter) > +{ > + if (!waiter) > + return; > + spin_lock(>wait_lock); > + list_add_tail(>list, >waiter_list); > + spin_unlock(>wait_lock); > +} > + > +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man, > +struct ttm_bo_waiter *waiter) > +{ > + if (!waiter) > + return; > + spin_lock(>wait_lock); > + if (!list_empty(>list)) > + list_del(>list); > + spin_unlock(>wait_lock); > + kfree(waiter); > +} > + > +int ttm_man_check_bo
Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order
Your patch will prevent concurrent allocation, and will result in a big allocation performance drop. Sent from Smartisan Pro. Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 9:04 PM: Attached is what you actually want to do, cleanly implemented. But as I said, this is a NO-GO. Regards, Christian. Am 26.01.2018 um 13:43 schrieb Christian König: After my investigation, this issue should be a defect of the TTM design itself, which breaks scheduling balance. Yeah, but again, this is intended design we can't change easily. Regards, Christian. Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing): I am off work, so I am replying by phone; the format may not be plain text. Back to the topic itself: the problem does happen on the amdgpu driver; someone reported to me that when an application runs with two instances, their performance differs. I also reproduced the issue with a unit test (bo_eviction_test). They always think our scheduler isn't working as expected. After my investigation, this issue should be a defect of the TTM design itself, which breaks scheduling balance. Further, if we run containers on our GPU, container A could get a high score while container B gets a low score on the same benchmark. So this is a bug that we need to fix. Regards, David Zhou Sent from Smartisan Pro. Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 6:31 PM: Am 26.01.2018 um 11:22 schrieb Chunming Zhou: > there is a scheduling balance issue around get node, like: > a. process A allocates full memory and uses it for submission. > b. process B tries to allocate memory and will wait for process A's BO to be idle in > eviction. > c. process A completes the job; process B's eviction will put process A's BO node, > but in the meantime process C comes in to allocate a BO, will directly get the > node successfully and do its submission, > and process B will again wait for process C's BO to be idle. > d. repeat the above steps, and process B could be delayed much more. > > A later allocation must not get ahead of an earlier one for the same placement. 
Again NAK to the whole approach.

At least with amdgpu the problem you described above never occurs, because evictions are pipelined operations. We could only block for deleted regions to become free.

But independent of that, incoming memory requests made while we make room through eviction are intended to be served first. Changing that is certainly a no-go, because it would favor memory-hungry applications over small clients.

Regards,
Christian.

> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou <david1.z...@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c    | 69 +++++++++++++++++++++++++++++++++++++---
>  include/drm/ttm/ttm_bo_api.h    |  7 +++++
>  include/drm/ttm/ttm_bo_driver.h |  7 +++++
>  3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  	return 0;
>  }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> +				struct ttm_buffer_object *bo,
> +				const struct ttm_place *place)
> +{
> +	waiter->tbo = bo;
> +	memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> +	INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +			       struct ttm_bo_waiter *waiter)
> +{
> +	if (!waiter)
> +		return;
> +	spin_lock(&man->wait_lock);
> +	list_add_tail(&waiter->list, &man->waiter_list);
> +	spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +			       struct ttm_bo_waiter *waiter)
> +{
> +	if (!waiter)
> +		return;
> +	spin_lock(&man->wait_lock);
> +	if (!list_empty(&waiter->list))
> +		list_del(&waiter->list);
> +	spin_unlock(&man->wait_lock);
> +	kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +		     struct ttm_buffer_object *bo,
> +		     const struct ttm_place *place)
> +{
> +	struct ttm_bo_waiter *waiter, *tmp;
> +
> +	spin_lock(&man->wait_lock);
> +	list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
> +		if ((bo != waiter->tbo) &&
> +		    ((place->fpfn >= waiter->place.fpfn &&
> +		      place->fpfn <= waiter->place.lpfn) ||
> +		     (place->lpfn <= waiter->place.lpfn && place->lpfn >=
> +		      waiter->place.fpfn)))
> +			goto later_bo;
> +	}
> +	spin_unlock(&man->wait_lock);
> +	return true;
> +later_bo:
> +	spin_unlock(&man->wait_lock);
> +	return false;
> +}
Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order
I am off work, so I am replying from my phone and the formatting may not be plain text.

Back to the topic itself: the problem does happen in the amdgpu driver. Someone reported to me that when an application runs with two instances, their performance differs. I also reproduced the issue with a unit test (bo_eviction_test). People keep thinking our scheduler isn't working as expected.

After my investigation, this issue appears to be a defect of the TTM design itself, which breaks scheduling balance. Further, if we run containers on our GPU, container A could get a high score while container B gets a low score on the same benchmark. So this is a bug that we need to fix.

Regards,
David Zhou

Sent from my Smartisan Pro

Christian König wrote on 2018-01-26 at 18:31:

On 26.01.2018 at 11:22, Chunming Zhou wrote:
> There is a scheduling balance issue around get_node, like:
> a. Process A allocates all of memory and uses it for submission.
> b. Process B tries to allocate memory and has to wait in eviction for process A's BO to become idle.
> c. Process A completes its job and process B's eviction will put process A's BO node, but in the meantime process C comes along to allocate a BO, gets a node directly, and submits, so process B again waits for process C's BO to become idle.
> d. Repeat the above steps, and process B can be delayed much longer.
>
> A later allocation must not get ahead of an earlier one for the same place.

Again NAK to the whole approach.

At least with amdgpu the problem you described above never occurs, because evictions are pipelined operations. We could only block for deleted regions to become free.

But independent of that, incoming memory requests made while we make room through eviction are intended to be served first. Changing that is certainly a no-go, because it would favor memory-hungry applications over small clients.

Regards,
Christian.
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c    | 69 +++++++++++++++++++++++++++++++++++++---
>  include/drm/ttm/ttm_bo_api.h    |  7 +++++
>  include/drm/ttm/ttm_bo_driver.h |  7 +++++
>  3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  	return 0;
>  }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> +				struct ttm_buffer_object *bo,
> +				const struct ttm_place *place)
> +{
> +	waiter->tbo = bo;
> +	memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> +	INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +			       struct ttm_bo_waiter *waiter)
> +{
> +	if (!waiter)
> +		return;
> +	spin_lock(&man->wait_lock);
> +	list_add_tail(&waiter->list, &man->waiter_list);
> +	spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +			       struct ttm_bo_waiter *waiter)
> +{
> +	if (!waiter)
> +		return;
> +	spin_lock(&man->wait_lock);
> +	if (!list_empty(&waiter->list))
> +		list_del(&waiter->list);
> +	spin_unlock(&man->wait_lock);
> +	kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +		     struct ttm_buffer_object *bo,
> +		     const struct ttm_place *place)
> +{
> +	struct ttm_bo_waiter *waiter, *tmp;
> +
> +	spin_lock(&man->wait_lock);
> +	list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
> +		if ((bo != waiter->tbo) &&
> +		    ((place->fpfn >= waiter->place.fpfn &&
> +		      place->fpfn <= waiter->place.lpfn) ||
> +		     (place->lpfn <= waiter->place.lpfn && place->lpfn >=
> +		      waiter->place.fpfn)))
> +			goto later_bo;
> +	}
> +	spin_unlock(&man->wait_lock);
> +	return true;
> +later_bo:
> +	spin_unlock(&man->wait_lock);
> +	return false;
> +}
>  /**
>   * Repeatedly evict memory from the LRU for @mem_type until we create enough
>   * space, or we've evicted everything and there isn't enough space.
> @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>  {
>  	struct ttm_bo_device *bdev = bo->bdev;
>  	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
> +	struct ttm_bo_waiter waiter;
>  	int ret;
>
> +	ttm_man_init_waiter(&waiter, bo, place);
> +	ttm_man_add_waiter(man, &waiter);
>  	do {
>  		ret = (*man->func->get_node)(man, bo, place, mem);
> -		if (unlikely(ret != 0))
> +		if (unlikely(ret != 0)) {
> +			ttm_man_del_waiter(man, &waiter);
>  			return ret;
> -		if (mem->mm_node)
RE: [PATCH 1/5] drm/amdgpu: add new asic callbacks for HDP flush/invalidation
It seems amdgpu_asic_invalidate_hdp() isn't used anywhere in the following patches.

Regards,
David Zhou

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Alex Deucher
Sent: Friday, January 05, 2018 12:19 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Li, Samuel
Subject: [PATCH 1/5] drm/amdgpu: add new asic callbacks for HDP flush/invalidation

Needed to properly flush the HDP cache with the CPU rather than the GPU.

Signed-off-by: Alex Deucher
Signed-off-by: Samuel Li
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 6 ++++++
 1 file changed, 6 insertions(+)

I keep needing to resurrect these patches to test things periodically, so I'd like to get them merged even if we don't have a pressing use case at the moment.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 642bea2c9b3a..88f41c41c70a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1287,6 +1287,10 @@ struct amdgpu_asic_funcs {
 	void (*set_pcie_lanes)(struct amdgpu_device *adev, int lanes);
 	/* get config memsize register */
 	u32 (*get_config_memsize)(struct amdgpu_device *adev);
+	/* flush hdp write queue */
+	void (*flush_hdp)(struct amdgpu_device *adev);
+	/* invalidate hdp read cache */
+	void (*invalidate_hdp)(struct amdgpu_device *adev);
 };

 /*
@@ -1836,6 +1840,8 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_asic_read_bios_from_rom(adev, b, l) (adev)->asic_funcs->read_bios_from_rom((adev), (b), (l))
 #define amdgpu_asic_read_register(adev, se, sh, offset, v) ((adev)->asic_funcs->read_register((adev), (se), (sh), (offset), (v)))
 #define amdgpu_asic_get_config_memsize(adev) (adev)->asic_funcs->get_config_memsize((adev))
+#define amdgpu_asic_flush_hdp(adev) (adev)->asic_funcs->flush_hdp((adev))
+#define amdgpu_asic_invalidate_hdp(adev) (adev)->asic_funcs->invalidate_hdp((adev))
 #define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid))
 #define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags))
 #define amdgpu_gart_get_vm_pde(adev, level, dst, flags) (adev)->gart.gart_funcs->get_vm_pde((adev), (level), (dst), (flags))
--
2.13.6

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/4] drm/amdgpu: minor optimize VM moved handling v2
>>>> +		/* Try to reserve the BO to avoid clearing its ptes */
>>>> +		else if (reservation_object_trylock(resv))
>>>> +			clear = false;

This will affect BOs in the BO list, won't it?

Sent from my Smartisan Pro

Koenig, Christian <christian.koe...@amd.com> wrote on 2018-01-03 at 18:47:

On 03.01.2018 at 11:43, Chunming Zhou wrote:
> On 03.01.2018 at 17:25, Christian König wrote:
>> On 03.01.2018 at 09:10, Zhou, David(ChunMing) wrote:
>>> On 02.01.2018 at 22:47, Christian König wrote:
>>>> Try to lock moved BOs; if that is successful we can update the
>>>> PTEs directly to the new location.
>>>>
>>>> v2: rebase
>>>>
>>>> Signed-off-by: Christian König <christian.koe...@amd.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 ++++++++++++++-
>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 3632c69f1814..c1c5ccdee783 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -1697,18 +1697,31 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>>>>  	spin_lock(&vm->status_lock);
>>>>  	while (!list_empty(&vm->moved)) {
>>>>  		struct amdgpu_bo_va *bo_va;
>>>> +		struct reservation_object *resv;
>>>>
>>>>  		bo_va = list_first_entry(&vm->moved,
>>>>  			struct amdgpu_bo_va, base.vm_status);
>>>>  		spin_unlock(&vm->status_lock);
>>>>
>>>> +		resv = bo_va->base.bo->tbo.resv;
>>>> +
>>>>  		/* Per VM BOs never need to be cleared in the page tables */
>>> This reminds us that per-VM BOs need to be cleared as well once we
>>> allow evicting/swapping out per-VM BOs.
>> Actually they don't. The page tables only need to be valid during CS.
>>
>> So what happens is that the per-VM BOs are validated right before we
>> call amdgpu_vm_handle_moved().
> Yeah, agreed for the per-VM BO situation after I checked all the cases
> that add to the moved list:
> 1. validate pt bos
> 2. bo invalidate
> 3. insert_map for per-vm-bo
> Items #1 and #3 are both per-VM BOs; they are already validated before
> handle_moved().
>
> For item #2, there are three places that call it:
> a. amdgpu_bo_vm_update_pte in CS for amdgpu_vm_debug
> b. amdgpu_gem_op_ioctl, but that is for the evicted list, nothing to do
>    with the moved list.
> c. amdgpu_bo_move_notify when a BO is validated.
>
> For case c, your optimization is valid; we don't need to clear for a
> validated BO. But for case a, yours will break the amdgpu_vm_debug
> functionality.
>
> Right?

Interesting point, but no, that should be handled as well. The vm_debug
handling is only for the BOs on the BO list, e.g. per-VM BOs are never
handled here.

Regards,
Christian.

> Regards,
> David Zhou

>>>> -		clear = bo_va->base.bo->tbo.resv != vm->root.base.bo->tbo.resv;
>>>> +		if (resv == vm->root.base.bo->tbo.resv)
>>>> +			clear = false;
>>>> +		/* Try to reserve the BO to avoid clearing its ptes */
>>>> +		else if (reservation_object_trylock(resv))
>>>> +			clear = false;
>>>> +		/* Somebody else is using the BO right now */
>>>> +		else
>>>> +			clear = true;
>>>>
>>>>  		r = amdgpu_vm_bo_update(adev, bo_va, clear);
>>>>  		if (r)
>>>>  			return r;
>>>>
>>>> +		if (!clear && resv != vm->root.base.bo->tbo.resv)
>>>> +			reservation_object_unlock(resv);
>>>> +
>>>>  		spin_lock(&vm->status_lock);
>>>>  	}
>>>>  	spin_unlock(&vm->status_lock);
Re: [PATCH 2/4] drm/amdgpu: minor optimize VM moved handling v2
On 02.01.2018 at 22:47, Christian König wrote:
> Try to lock moved BOs; if that is successful we can update the
> PTEs directly to the new location.
>
> v2: rebase
>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 3632c69f1814..c1c5ccdee783 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1697,18 +1697,31 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
> 	spin_lock(&vm->status_lock);
> 	while (!list_empty(&vm->moved)) {
> 		struct amdgpu_bo_va *bo_va;
> +		struct reservation_object *resv;
>
> 		bo_va = list_first_entry(&vm->moved,
> 			struct amdgpu_bo_va, base.vm_status);
> 		spin_unlock(&vm->status_lock);
>
> +		resv = bo_va->base.bo->tbo.resv;
> +
> 		/* Per VM BOs never need to be cleared in the page tables */
This reminds us that per-VM BOs need to be cleared as well once we allow evicting/swapping out per-VM BOs.

Regards,
David Zhou

> -		clear = bo_va->base.bo->tbo.resv != vm->root.base.bo->tbo.resv;
> +		if (resv == vm->root.base.bo->tbo.resv)
> +			clear = false;
> +		/* Try to reserve the BO to avoid clearing its ptes */
> +		else if (reservation_object_trylock(resv))
> +			clear = false;
> +		/* Somebody else is using the BO right now */
> +		else
> +			clear = true;
>
> 		r = amdgpu_vm_bo_update(adev, bo_va, clear);
> 		if (r)
> 			return r;
>
> +		if (!clear && resv != vm->root.base.bo->tbo.resv)
> +			reservation_object_unlock(resv);
> +
> 		spin_lock(&vm->status_lock);
> 	}
> 	spin_unlock(&vm->status_lock);
RE: [PATCH 4/5] drm/amdgpu: rename vm_id to vmid
Reviewed-by: Chunming Zhou

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König
Sent: Wednesday, December 20, 2017 9:21 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 4/5] drm/amdgpu: rename vm_id to vmid

sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h       |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c    |  8 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c   |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h    |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c    | 14 ++--
 drivers/gpu/drm/amd/amdgpu/cik_ih.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c     | 14 ++--
 drivers/gpu/drm/amd/amdgpu/cz_ih.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c     | 14 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c     | 18
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c     | 18
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c     | 18
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c     | 14 ++--
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c    | 16 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c    | 16 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c    | 18 +++-
 drivers/gpu/drm/amd/amdgpu/si_dma.c       | 16 +++---
 drivers/gpu/drm/amd/amdgpu/si_ih.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c     | 26 +++---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c     | 36 +++
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c     | 10 -
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c     | 18
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c     | 36 +++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c    |  4 ++--
 33 files changed, 188 insertions(+), 194 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 680d4f6de52d..15903ffdf0b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -351,7 +351,7 @@ struct amdgpu_gart_funcs {
 	/* get the pde for a given mc addr */
 	void (*get_vm_pde)(struct amdgpu_device *adev, int level,
 			   u64 *dst, u64 *flags);
-	uint32_t (*get_invalidate_req)(unsigned int vm_id);
+	uint32_t (*get_invalidate_req)(unsigned int vmid);
 };

 /* provided by the ih block */
@@ -1124,7 +1124,7 @@ struct amdgpu_job {
 	void			*owner;
 	uint64_t		fence_ctx; /* the fence_context this job uses */
 	bool			vm_needs_flush;
-	unsigned		vm_id;
+	unsigned		vmid;
 	uint64_t		vm_pd_addr;
 	uint32_t		gds_base, gds_size;
 	uint32_t		gws_base, gws_size;
@@ -1852,7 +1852,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r))
 #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r))
 #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r))
-#define amdgpu_ring_emit_ib(r, ib, vm_id, c) (r)->funcs->emit_ib((r), (ib), (vm_id), (c))
+#define amdgpu_ring_emit_ib(r, ib, vmid, c) (r)->funcs->emit_ib((r), (ib), (vmid), (c))
 #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r))
 #define amdgpu_ring_emit_vm_flush(r, vmid, addr) (r)->funcs->emit_vm_flush((r), (vmid), (addr))
 #define amdgpu_ring_emit_fence(r, addr, seq, flags) (r)->funcs->emit_fence((r), (addr), (seq), (flags))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 03a69942cce5..a162d87ca0c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -149,7 +149,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		return -EINVAL;
 	}

-	if (vm && !job->vm_id) {
+	if (vm && !job->vmid) {
 		dev_err(adev->dev, "VM IB without ID\n");
 		return -EINVAL;
 	}
@@ -211,7 +211,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		    !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
 			continue;
RE: [PATCH 3/5] drm/amdgpu: separate VMID and PASID handling
Looks very good.

Reviewed-by: Chunming Zhou

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König
Sent: Wednesday, December 20, 2017 9:21 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 3/5] drm/amdgpu: separate VMID and PASID handling

Move both into the new files amdgpu_ids.[ch]. No functional change.

Signed-off-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/Makefile               |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c            |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c           | 459 ++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h           |  91 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c           |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 422 +---------------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  44 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c             |   2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c             |   2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c             |   2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c             |   2 +-
 13 files changed, 579 insertions(+), 465 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index d8da12c114b1..d6e5b7273853 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -52,7 +52,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
 	amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
 	amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
 	amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
-	amdgpu_queue_mgr.o amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o
+	amdgpu_queue_mgr.o amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o \
+	amdgpu_ids.o

 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 1e3e9be7d77e..1ae149456c9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -169,8 +169,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_vmem_size = get_vmem_size,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
-	.alloc_pasid = amdgpu_vm_alloc_pasid,
-	.free_pasid = amdgpu_vm_free_pasid,
+	.alloc_pasid = amdgpu_pasid_alloc,
+	.free_pasid = amdgpu_pasid_free,
 	.program_sh_mem_settings = kgd_program_sh_mem_settings,
 	.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
 	.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 056929b8ccd0..e9b436bc8dcb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -128,8 +128,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_vmem_size = get_vmem_size,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
-	.alloc_pasid = amdgpu_vm_alloc_pasid,
-	.free_pasid = amdgpu_vm_free_pasid,
+	.alloc_pasid = amdgpu_pasid_alloc,
+	.free_pasid = amdgpu_pasid_free,
 	.program_sh_mem_settings = kgd_program_sh_mem_settings,
 	.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
 	.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 0cf86eb357d6..03a69942cce5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -230,8 +230,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	if (r) {
 		dev_err(adev->dev, "failed to emit fence (%d)\n", r);
 		if (job && job->vm_id)
-			amdgpu_vm_reset_id(adev, ring->funcs->vmhub,
-					   job->vm_id);
+			amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vm_id);
 		amdgpu_ring_undo(ring);
 		return r;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
new file mode 100644
index ..71f8a76d4c10
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -0,0 +1,459 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of