RE: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10
Torturing the amdgpu driver with Abaqus enough times runs into a deadlock when locking the first busy BO. The caller then returns EAGAIN, and dm_plane_helper_prepare_fb eventually pops up a pin-failed message. For this case, could patch #7 treat EAGAIN like ERESTARTSYS, so the annoying error message is filtered out? Thanks, Prike -Original Message- From: Christian König Sent: Thursday, May 23, 2019 7:04 PM To: Zhou, David(ChunMing) ; Olsak, Marek ; Zhou, David(ChunMing) ; Liang, Prike ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10 [CAUTION: External Email] On 2019-05-23 at 12:24, zhoucm1 wrote: > > > On 2019-05-22 20:59, Christian König wrote: >> [CAUTION: External Email] >> >> BOs on the LRU might be blocked during command submission and cause >> OOM situations. >> >> Avoid this by blocking for the first busy BO not locked by the same >> ticket as the BO we are searching space for. >> >> v10: completely start over with the patch since we didn't >> handle a whole bunch of corner cases. >> >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/ttm/ttm_bo.c | 77 ++-- >> 1 file changed, 66 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c >> b/drivers/gpu/drm/ttm/ttm_bo.c index 4c6389d849ed..861facac33d4 >> 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> @@ -771,32 +771,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable); >>* b. Otherwise, trylock it. 
>>*/ >> static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object >> *bo, >> - struct ttm_operation_ctx *ctx, bool *locked) >> + struct ttm_operation_ctx *ctx, bool *locked, >> bool *busy) >> { >> bool ret = false; >> >> - *locked = false; >> if (bo->resv == ctx->resv) { >> reservation_object_assert_held(bo->resv); >> if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT >> || !list_empty(&bo->ddestroy)) >> ret = true; >> + *locked = false; >> + if (busy) >> + *busy = false; >> } else { >> - *locked = reservation_object_trylock(bo->resv); >> - ret = *locked; >> + ret = reservation_object_trylock(bo->resv); >> + *locked = ret; >> + if (busy) >> + *busy = !ret; >> } >> >> return ret; >> } >> >> +/** >> + * ttm_mem_evict_wait_busy - wait for a busy BO to become available >> + * >> + * @busy_bo: BO which couldn't be locked with trylock >> + * @ctx: operation context >> + * @ticket: acquire ticket >> + * >> + * Try to lock a busy buffer object to avoid failing eviction. >> + */ >> +static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo, >> + struct ttm_operation_ctx *ctx, >> + struct ww_acquire_ctx *ticket) { >> + int r; >> + >> + if (!busy_bo || !ticket) >> + return -EBUSY; >> + >> + if (ctx->interruptible) >> + r = >> + reservation_object_lock_interruptible(busy_bo->resv, >> + ticket); >> + else >> + r = reservation_object_lock(busy_bo->resv, ticket); >> + >> + /* >> +* TODO: It would be better to keep the BO locked until >> allocation is at >> +* least tried one more time, but that would mean a much >> larger rework >> +* of TTM. >> +*/ >> + if (!r) >> + reservation_object_unlock(busy_bo->resv); >> + >> + return r == -EDEADLK ? 
-EAGAIN : r; } >> + >> static int ttm_mem_evict_first(struct ttm_bo_device *bdev, >> uint32_t mem_type, >> const struct ttm_place *place, >> - struct ttm_operation_ctx *ctx) >> + struct ttm_operation_ctx *ctx, >> + struct ww_acquire_ctx *ticket) >> { >> + struct ttm_buffer_object *bo = NULL, *busy_bo = NULL; >> struct ttm_bo_global *glob = bdev->glob; >> struct ttm_mem_type_manager *man = &bdev->man[mem_type]; >> - struct ttm_buffer_object *bo = NULL; >> bool locked = false; >> unsigned i; >> int ret; >> @@ -804,8 +844,15 @@ static int ttm_mem_evict_first(struct >> ttm_bo_device *bdev, >> spin_lock(&glob->lru_lock); >> for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { >> list_for_each_entry(bo, &man->lru[i], lru) { >> - if (!ttm_bo_evict_swapout_allowable(bo, ctx, >> &locked)) >> + bool busy; >> + >> + if (!ttm_bo_evict_swapout_allowable(bo, ctx, >> &locked, >> + &busy)) { >> +
Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls
On Thu, May 23, 2019 at 2:04 AM Catalin Marinas wrote: > > On Wed, May 22, 2019 at 02:16:57PM -0700, Evgenii Stepanov wrote: > > On Wed, May 22, 2019 at 4:49 AM Catalin Marinas > > wrote: > > > On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote: > > > > This patch is a part of a series that extends arm64 kernel ABI to allow > > > > to > > > > pass tagged user pointers (with the top byte set to something else other > > > > than 0x00) as syscall arguments. > > > > > > > > This patch allows tagged pointers to be passed to the following memory > > > > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2, > > > > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap, > > > > remap_file_pages, shmat and shmdt. > > > > > > > > This is done by untagging pointers passed to these syscalls in the > > > > prologues of their handlers. > > > > > > I'll go through them one by one to see if we can tighten the expected > > > ABI while having the MTE in mind. > > > > > > > diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c > > > > index b44065fb1616..933bb9f3d6ec 100644 > > > > --- a/arch/arm64/kernel/sys.c > > > > +++ b/arch/arm64/kernel/sys.c > > > > @@ -35,10 +35,33 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned > > > > long, len, > > > > { > > > > if (offset_in_page(off) != 0) > > > > return -EINVAL; > > > > - > > > > + addr = untagged_addr(addr); > > > > return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> > > > > PAGE_SHIFT); > > > > } > > > > > > If user passes a tagged pointer to mmap() and the address is honoured > > > (or MAP_FIXED is given), what is the expected return pointer? Does it > > > need to be tagged with the value from the hint? > > > > For HWASan the most convenient would be to use the tag from the hint. 
> > But since in the TBI (not MTE) mode the kernel has no idea what > > meaning userspace assigns to pointer tags, perhaps it should not try > > to guess, and should return raw (zero-tagged) address instead. > > Then, just to relax the ABI for hwasan, shall we simply disallow tagged > pointers on mmap() arguments? We can leave them in for > mremap(old_address), madvise(). I think this would be fine. We should allow tagged in pointers in mprotect though. > > > With MTE, we may want to use this as a request for the default colour of > > > the mapped pages (still under discussion). > > > > I like this - and in that case it would make sense to return the > > pointer that can be immediately dereferenced without crashing the > > process, i.e. with the matching tag. > > This came up from the Android investigation work where large memory > allocations (using mmap) could be more efficiently pre-tagged by the > kernel on page fault. Not sure about the implementation details yet. > > -- > Catalin
RE: [PATCH] drm/amdgpu: Need to set the baco cap before baco reset
>-Original Message- >From: Alex Deucher >Sent: Friday, May 24, 2019 12:09 AM >To: Deng, Emily >Cc: amd-gfx list >Subject: Re: [PATCH] drm/amdgpu: Need to set the baco cap before baco >reset > >[CAUTION: External Email] > >On Thu, May 23, 2019 at 6:22 AM Emily Deng wrote: >> >> For passthrough, after rebooted the VM, driver will do a baco reset >> before doing other driver initialization during loading driver. For >> doing the baco reset, it will first check the baco reset capability. >> So first need to set the cap from the vbios information or baco reset >> won't be enabled. >> >> Signed-off-by: Emily Deng >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 >> drivers/gpu/drm/amd/amdgpu/soc15.c | 3 ++- >> drivers/gpu/drm/amd/include/kgd_pp_interface.h | 1 + >> drivers/gpu/drm/amd/powerplay/amd_powerplay.c | 16 >+++ >> drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 1 + >> .../amd/powerplay/hwmgr/vega10_processpptables.c | 24 >++ >> .../amd/powerplay/hwmgr/vega10_processpptables.h | 1 + >> drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 + >> 8 files changed, 54 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index bdd1fe73..2dde672 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -2611,6 +2611,14 @@ int amdgpu_device_init(struct amdgpu_device >*adev, >> * E.g., driver was not cleanly unloaded previously, etc. 
>> */ >> if (!amdgpu_sriov_vf(adev) && >> amdgpu_asic_need_reset_on_init(adev)) { >> + if (amdgpu_passthrough(adev) && adev->powerplay.pp_funcs && >adev->powerplay.pp_funcs->set_asic_baco_cap) { >> + r = adev->powerplay.pp_funcs->set_asic_baco_cap(adev- >>powerplay.pp_handle); >> + if (r) { >> + dev_err(adev->dev, "set baco capability >> failed\n"); >> + goto failed; >> + } >> + } >> + > >I think it would be cleaner to add this to hwmgr_early_init() or something >called from early init for powerplay. I also preferred to put it in the hwmgr_early_init, but as the function set_asic_baco_cap need to get the vbios info, so need to put the amdgpu_get_bios before early init. If so the code changes too big. > >Alex > >> r = amdgpu_asic_reset(adev); >> if (r) { >> dev_err(adev->dev, "asic reset on init >> failed\n"); diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c >> b/drivers/gpu/drm/amd/amdgpu/soc15.c >> index 78bd4fc..d9fdd95 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c >> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c >> @@ -764,7 +764,8 @@ static bool soc15_need_reset_on_init(struct >amdgpu_device *adev) >> /* Just return false for soc15 GPUs. Reset does not seem to >> * be necessary. 
>> */ >> - return false; >> + if (!amdgpu_passthrough(adev)) >> + return false; >> >> if (adev->flags & AMD_IS_APU) >> return false; >> diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h >> b/drivers/gpu/drm/amd/include/kgd_pp_interface.h >> index 9f661bf..c6e2a51 100644 >> --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h >> +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h >> @@ -296,6 +296,7 @@ struct amd_pm_funcs { >> int (*set_hard_min_fclk_by_freq)(void *handle, uint32_t clock); >> int (*set_min_deep_sleep_dcefclk)(void *handle, uint32_t clock); >> int (*get_asic_baco_capability)(void *handle, bool *cap); >> + int (*set_asic_baco_cap)(void *handle); >> int (*get_asic_baco_state)(void *handle, int *state); >> int (*set_asic_baco_state)(void *handle, int state); >> int (*get_ppfeature_status)(void *handle, char *buf); diff >> --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c >> b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c >> index bea1587..9856760 100644 >> --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c >> +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c >> @@ -1404,6 +1404,21 @@ static int pp_set_active_display_count(void >*handle, uint32_t count) >> return ret; >> } >> >> +static int pp_set_asic_baco_cap(void *handle) { >> + struct pp_hwmgr *hwmgr = handle; >> + >> + if (!hwmgr) >> + return -EINVAL; >> + >> + if (!hwmgr->pm_en || !hwmgr->hwmgr_func->set_asic_baco_cap) >> + return 0; >> + >> + hwmgr->hwmgr_func->set_asic_baco_cap(hwmgr); >> + >> + return 0; >> +} >> + >> static int pp_get_asic_baco_capability(void *handle, bool *cap) { >> struct pp_hwmgr *hwmgr = handle; @@ -1546,6 +1561,7 @@ static >> const struct amd_pm_funcs pp_dpm_funcs = { >> .set_hard
[PATCH 4/8] drm/amdgpu: Add function to add/remove gws to kfd process
GWS bo is shared between all kfd processes. Add function to add gws to kfd process's bo list so gws can be evicted from and restored for process. Change-Id: I75d74cfdadb7075ff8b2b68634e205deb73dc1ea Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 103 +-- 2 files changed, 100 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index c00c974..f968bf1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -155,6 +155,8 @@ int amdgpu_amdkfd_alloc_gtt_mem(struct kgd_dev *kgd, size_t size, void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj); int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t size, void **mem_obj); void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj); +int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem); +int amdgpu_amdkfd_remove_gws_from_process(void *info, void *mem); uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type); void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index e1cae4a..87177ed 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -457,6 +457,17 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem, mutex_unlock(&process_info->lock); } +static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem, + struct amdkfd_process_info *process_info) +{ + struct ttm_validate_buffer *bo_list_entry; + + bo_list_entry = &mem->validate_list; + mutex_lock(&process_info->lock); + list_del(&bo_list_entry->head); + mutex_unlock(&process_info->lock); +} + /* Initializes user pages. 
It registers the MMU notifier and validates * the userptr BO in the GTT domain. * @@ -1183,12 +1194,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( if (user_addr) { ret = init_user_pages(*mem, current->mm, user_addr); - if (ret) { - mutex_lock(&avm->process_info->lock); - list_del(&(*mem)->validate_list.head); - mutex_unlock(&avm->process_info->lock); + if (ret) goto allocate_init_user_pages_failed; - } } if (offset) @@ -1197,6 +1204,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( return 0; allocate_init_user_pages_failed: + remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info); amdgpu_bo_unref(&bo); /* Don't unreserve system mem limit twice */ goto err_reserve_limit; @@ -2104,3 +2112,88 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef) kfree(pd_bo_list); return ret; } + +int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem) +{ + struct amdkfd_process_info *process_info = (struct amdkfd_process_info *)info; + struct amdgpu_bo *gws_bo = (struct amdgpu_bo *)gws; + int ret; + + if (!info || !gws) + return -EINVAL; + + *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); + if (!*mem) + return -EINVAL; + + mutex_init(&(*mem)->lock); + (*mem)->bo = amdgpu_bo_ref(gws_bo); + (*mem)->domain = AMDGPU_GEM_DOMAIN_GWS; + (*mem)->process_info = process_info; + add_kgd_mem_to_kfd_bo_list(*mem, process_info, false); + amdgpu_sync_create(&(*mem)->sync); + + + /* Validate gws bo the first time it is added to process */ + mutex_lock(&(*mem)->process_info->lock); + ret = amdgpu_bo_reserve(gws_bo, false); + if (unlikely(ret)) { + pr_err("Reserve gws bo failed %d\n", ret); + goto bo_reservation_failure; + } + + ret = amdgpu_amdkfd_bo_validate(gws_bo, AMDGPU_GEM_DOMAIN_GWS, true); + if (ret) { + pr_err("GWS BO validate failed %d\n", ret); + goto bo_validation_failure; + } + /* GWS resource is shared b/t amdgpu and amdkfd +* Add process eviction fence to bo so they can +* evict each other. 
+*/ + amdgpu_bo_fence(gws_bo, &process_info->eviction_fence->base, true); + amdgpu_bo_unreserve(gws_bo); + mutex_unlock(&(*mem)->process_info->lock); + + return ret; + +bo_validation_failure: + amdgpu_bo_unreserve(gws_bo); +bo_reservation_failure: + mutex_unlock(&(*mem)->process_info->lock); + amdgpu_sync_free(&(*mem)->sync); + remove_kgd_mem_from_kfd_bo_list(*mem, process_info); + amdgpu_bo_unref(&gws_bo); + m
[PATCH 3/8] drm/amdkfd: Allocate gws on device initialization
On device initialization, KFD allocates all (64) gws which is shared by all KFD processes. Change-Id: I1f1274b8d4f6a8ad08785e2791a9a79def75e913 Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 14 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 +++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 7b4ea24..9d1b026 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -553,6 +553,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, } else kfd->max_proc_per_quantum = hws_max_conc_proc; + /* Allocate global GWS that is shared by all KFD processes */ + if (hws_gws_support && amdgpu_amdkfd_alloc_gws(kfd->kgd, + amdgpu_amdkfd_get_num_gws(kfd->kgd), &kfd->gws)) { + dev_err(kfd_device, "Could not allocate %d gws\n", + amdgpu_amdkfd_get_num_gws(kfd->kgd)); + goto out; + } /* calculate max size of mqds needed for queues */ size = max_num_of_queues_per_device * kfd->device_info->mqd_size_aligned; @@ -576,7 +583,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, &kfd->gtt_start_gpu_addr, &kfd->gtt_start_cpu_ptr, false)) { dev_err(kfd_device, "Could not allocate %d bytes\n", size); - goto out; + goto alloc_gtt_mem_failure; } dev_info(kfd_device, "Allocated %d bytes on gart\n", size); @@ -646,6 +653,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, kfd_gtt_sa_fini(kfd); kfd_gtt_sa_init_error: amdgpu_amdkfd_free_gtt_mem(kfd->kgd, kfd->gtt_mem); +alloc_gtt_mem_failure: + if (hws_gws_support) + amdgpu_amdkfd_free_gws(kfd->kgd, kfd->gws); dev_err(kfd_device, "device %x:%x NOT added due to errors\n", kfd->pdev->vendor, kfd->pdev->device); @@ -663,6 +673,8 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd) kfd_doorbell_fini(kfd); kfd_gtt_sa_fini(kfd); amdgpu_amdkfd_free_gtt_mem(kfd->kgd, kfd->gtt_mem); + if (hws_gws_support) + amdgpu_amdkfd_free_gws(kfd->kgd, kfd->gws); } kfree(kfd); diff --git 
a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 338fb07..5969e37 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -288,6 +288,9 @@ struct kfd_dev { /* Compute Profile ref. count */ atomic_t compute_profile; + + /* Global GWS resource shared b/t processes*/ + void *gws; }; enum kfd_mempool { -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 7/8] drm/amdkfd: PM4 packets change to support GWS
Add a field in map_queues packet to indicate whether this is a gws control queue. Only one queue per process can be gws control queue. Change num_gws field in map_process packet to 7 bits Change-Id: I0db91d99c6962d14f9202b2eb950f8e7e497b79e Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c | 1 + drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 7 --- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c index 3dd731c..07f02f8e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c @@ -159,6 +159,7 @@ static int pm_map_queues_v9(struct packet_manager *pm, uint32_t *buffer, packet->bitfields2.engine_sel = engine_sel__mes_map_queues__compute_vi; + packet->bitfields2.gws_control_queue = q->gws ? 1 : 0; packet->bitfields2.queue_type = queue_type__mes_map_queues__normal_compute_vi; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h index 0661339..49ab66b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h @@ -176,8 +176,7 @@ struct pm4_mes_map_process { union { struct { - uint32_t num_gws:6; - uint32_t reserved7:1; + uint32_t num_gws:7; uint32_t sdma_enable:1; uint32_t num_oac:4; uint32_t reserved8:4; @@ -272,7 +271,9 @@ struct pm4_mes_map_queues { struct { uint32_t reserved1:4; enum mes_map_queues_queue_sel_enum queue_sel:2; - uint32_t reserved2:15; + uint32_t reserved5:6; + uint32_t gws_control_queue:1; + uint32_t reserved2:8; enum mes_map_queues_queue_type_enum queue_type:3; uint32_t reserved3:2; enum mes_map_queues_engine_sel_enum engine_sel:3; -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 8/8] drm/amdkfd: Use kfd fd to mmap mmio
TTM doesn't support CPU mapping of sg type bo (under which mmio bo is created). Switch mmaping of mmio page to kfd device file. Change-Id: I1a1a24f2ac0662be3783d460c137731ade007b83 Signed-off-by: Oak Zeng --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 46 drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 + 2 files changed, 47 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 455a3db..67d269b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1309,6 +1309,15 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, args->handle = MAKE_HANDLE(args->gpu_id, idr_handle); args->mmap_offset = offset; + /* MMIO is mapped through kfd device +* Generate a kfd mmap offset +*/ + if (flags & KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) { + args->mmap_offset = KFD_MMAP_TYPE_MMIO | KFD_MMAP_GPU_ID(args->gpu_id); + args->mmap_offset <<= PAGE_SHIFT; + args->mmap_offset |= amdgpu_amdkfd_get_mmio_remap_phys_addr(dev->kgd); + } + return 0; err_free: @@ -1880,6 +1889,39 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) return retcode; } +static int kfd_mmio_mmap(struct kfd_dev *dev, struct kfd_process *process, + struct vm_area_struct *vma) +{ + phys_addr_t address; + int ret; + + if (vma->vm_end - vma->vm_start != PAGE_SIZE) + return -EINVAL; + + address = amdgpu_amdkfd_get_mmio_remap_phys_addr(dev->kgd); + + vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | + VM_DONTDUMP | VM_PFNMAP; + + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + + pr_debug("Process %d mapping mmio page\n" +" target user address == 0x%08llX\n" +" physical address== 0x%08llX\n" +" vm_flags== 0x%04lX\n" +" size== 0x%04lX\n", +process->pasid, (unsigned long long) vma->vm_start, +address, vma->vm_flags, PAGE_SIZE); + + ret = io_remap_pfn_range(vma, + vma->vm_start, + address >> PAGE_SHIFT, + PAGE_SIZE, + vma->vm_page_prot); + return ret; +} + + 
static int kfd_mmap(struct file *filp, struct vm_area_struct *vma) { struct kfd_process *process; @@ -1910,6 +1952,10 @@ static int kfd_mmap(struct file *filp, struct vm_area_struct *vma) if (!dev) return -ENODEV; return kfd_reserved_mem_mmap(dev, process, vma); + case KFD_MMAP_TYPE_MMIO: + if (!dev) + return -ENODEV; + return kfd_mmio_mmap(dev, process, vma); } return -EFAULT; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 40a5c67..b61dc53 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -59,6 +59,7 @@ #define KFD_MMAP_TYPE_DOORBELL (0x3ULL << KFD_MMAP_TYPE_SHIFT) #define KFD_MMAP_TYPE_EVENTS (0x2ULL << KFD_MMAP_TYPE_SHIFT) #define KFD_MMAP_TYPE_RESERVED_MEM (0x1ULL << KFD_MMAP_TYPE_SHIFT) +#define KFD_MMAP_TYPE_MMIO (0x0ULL << KFD_MMAP_TYPE_SHIFT) #define KFD_MMAP_GPU_ID_SHIFT (46 - PAGE_SHIFT) #define KFD_MMAP_GPU_ID_MASK (((1ULL << KFD_GPU_ID_HASH_WIDTH) - 1) \ -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 6/8] drm/amdkfd: New IOCTL to allocate queue GWS
Add a new kfd ioctl to allocate queue GWS. Queue GWS is released on queue destroy. Change-Id: I60153c26a577992ad873e4292e759e5c3d5bbd15 Signed-off-by: Oak Zeng --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 27 +++ include/uapi/linux/kfd_ioctl.h | 20 +++- 2 files changed, 46 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 38ae53f..455a3db 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1559,6 +1559,31 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep, return err; } +static int kfd_ioctl_alloc_queue_gws(struct file *filep, + struct kfd_process *p, void *data) +{ + int retval; + struct kfd_ioctl_alloc_queue_gws_args *args = data; + struct kfd_dev *dev = NULL; + + if (!hws_gws_support || + dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) + return -EINVAL; + + dev = kfd_device_by_id(args->gpu_id); + if (!dev) { + pr_debug("Could not find gpu id 0x%x\n", args->gpu_id); + return -EINVAL; + } + + mutex_lock(&p->mutex); + retval = pqm_set_gws(&p->pqm, args->queue_id, args->num_gws ? 
dev->gws : NULL); + mutex_unlock(&p->mutex); + + args->first_gws = 0; + return retval; +} + static int kfd_ioctl_get_dmabuf_info(struct file *filep, struct kfd_process *p, void *data) { @@ -1761,6 +1786,8 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = { AMDKFD_IOCTL_DEF(AMDKFD_IOC_IMPORT_DMABUF, kfd_ioctl_import_dmabuf, 0), + AMDKFD_IOCTL_DEF(AMDKFD_IOC_ALLOC_QUEUE_GWS, + kfd_ioctl_alloc_queue_gws, 0), }; #define AMDKFD_CORE_IOCTL_COUNTARRAY_SIZE(amdkfd_ioctls) diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h index 20917c5..070d1bc 100644 --- a/include/uapi/linux/kfd_ioctl.h +++ b/include/uapi/linux/kfd_ioctl.h @@ -410,6 +410,21 @@ struct kfd_ioctl_unmap_memory_from_gpu_args { __u32 n_success;/* to/from KFD */ }; +/* Allocate GWS for specific queue + * + * @gpu_id: device identifier + * @queue_id:queue's id that GWS is allocated for + * @num_gws: how many GWS to allocate + * @first_gws: index of the first GWS allocated. + * only support contiguous GWS allocation + */ +struct kfd_ioctl_alloc_queue_gws_args { + __u32 gpu_id; /* to KFD */ + __u32 queue_id; /* to KFD */ + __u32 num_gws; /* to KFD */ + __u32 first_gws;/* from KFD */ +}; + struct kfd_ioctl_get_dmabuf_info_args { __u64 size; /* from KFD */ __u64 metadata_ptr; /* to KFD */ @@ -529,7 +544,10 @@ enum kfd_mmio_remap { #define AMDKFD_IOC_IMPORT_DMABUF \ AMDKFD_IOWR(0x1D, struct kfd_ioctl_import_dmabuf_args) +#define AMDKFD_IOC_ALLOC_QUEUE_GWS \ + AMDKFD_IOWR(0x1E, struct kfd_ioctl_alloc_queue_gws_args) + #define AMDKFD_COMMAND_START 0x01 -#define AMDKFD_COMMAND_END 0x1E +#define AMDKFD_COMMAND_END 0x1F #endif -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 1/8] drm/amdkfd: Add gws number to kfd topology node properties
Add amdgpu_amdkfd interface to get num_gws and add num_gws to /sys/class/kfd/kfd/topology/nodes/x/properties. Only report num_gws if MEC FW support GWS barriers. Currently it is determined by a module parameter which will be replaced with MEC FW version check when firmware is ready. Change-Id: Ie0d00fb20a37ef2856860dbecbe1ad0ca1ef09f7 Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 10 ++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 + drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 5 + drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 + 6 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 98326e3b..a4780d5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -544,6 +544,13 @@ uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev *kgd) return adev->rmmio_remap.bus_addr; } +uint32_t amdgpu_amdkfd_get_num_gws(struct kgd_dev *kgd) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)kgd; + + return adev->gds.gws_size; +} + int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine, uint32_t vmid, uint64_t gpu_addr, uint32_t *ib_cmd, uint32_t ib_len) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index f57f297..5700643 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -169,6 +169,7 @@ int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int dma_buf_fd, uint64_t amdgpu_amdkfd_get_vram_usage(struct kgd_dev *kgd); uint64_t amdgpu_amdkfd_get_hive_id(struct kgd_dev *kgd); uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev *kgd); +uint32_t amdgpu_amdkfd_get_num_gws(struct kgd_dev *kgd); uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *src); #define read_user_wptr(mmptr, wptr, dst) \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index a334d3b..3a03c2b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -666,6 +666,16 @@ MODULE_PARM_DESC(noretry, int halt_if_hws_hang; module_param(halt_if_hws_hang, int, 0644); MODULE_PARM_DESC(halt_if_hws_hang, "Halt if HWS hang is detected (0 = off (default), 1 = on)"); + +/** + * DOC: hws_gws_support(bool) + * Whether HWS support gws barriers. Default value: false (not supported) + * This will be replaced with a MEC firmware version check once firmware + * is ready + */ +bool hws_gws_support; +module_param(hws_gws_support, bool, 0444); +MODULE_PARM_DESC(hws_gws_support, "MEC FW support gws barriers (false = not supported (Default), true = supported)"); #endif /** diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 8f02d78..338fb07 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -160,6 +160,11 @@ extern int noretry; */ extern int halt_if_hws_hang; +/* + * Whether MEC FW support GWS barriers + */ +extern bool hws_gws_support; + enum cache_policy { cache_policy_coherent, cache_policy_noncoherent diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 2c06d6c..128c72c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -454,6 +454,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr, dev->node_props.lds_size_in_kb); sysfs_show_32bit_prop(buffer, "gds_size_in_kb", dev->node_props.gds_size_in_kb); + sysfs_show_32bit_prop(buffer, "num_gws", + dev->node_props.num_gws); sysfs_show_32bit_prop(buffer, "wave_front_size", dev->node_props.wave_front_size); sysfs_show_32bit_prop(buffer, "array_count", @@ -1290,6 +1292,9 @@ int 
kfd_topology_add_device(struct kfd_dev *gpu) dev->node_props.num_sdma_engines = gpu->device_info->num_sdma_engines; dev->node_props.num_sdma_xgmi_engines = gpu->device_info->num_xgmi_sdma_engines; + dev->node_props.num_gws = (hws_gws_support && + dev->gpu->dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) ? + amdgpu_amdkfd_get_num_gws(dev->gpu->kgd) : 0; kfd_fill_mem_clk_max_info(dev); kfd_fill_iolink_non_crat_info(dev); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h index 949e885..276354a 100644 --- a/drivers/gpu/drm/amd/amdkfd
[PATCH 2/8] drm/amdgpu: Add interface to alloc gws from amdgpu
Add amdgpu_amdkfd interface to alloc and free gws from amdgpu

Change-Id: I4eb418356e5a6051aa09c5e2c4a454263631d6ab
Signed-off-by: Oak Zeng
Reviewed-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 34 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  2 ++
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index a4780d5..4af3989 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -339,6 +339,40 @@ void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj)
 	amdgpu_bo_unref(&(bo));
 }
 
+int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t size,
+			void **mem_obj)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+	struct amdgpu_bo *bo = NULL;
+	struct amdgpu_bo_param bp;
+	int r;
+
+	memset(&bp, 0, sizeof(bp));
+	bp.size = size;
+	bp.byte_align = 1;
+	bp.domain = AMDGPU_GEM_DOMAIN_GWS;
+	bp.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
+	bp.type = ttm_bo_type_device;
+	bp.resv = NULL;
+
+	r = amdgpu_bo_create(adev, &bp, &bo);
+	if (r) {
+		dev_err(adev->dev,
+			"failed to allocate gws BO for amdkfd (%d)\n", r);
+		return r;
+	}
+
+	*mem_obj = bo;
+	return 0;
+}
+
+void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj)
+{
+	struct amdgpu_bo *bo = (struct amdgpu_bo *)mem_obj;
+
+	amdgpu_bo_unref(&bo);
+}
+
 uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd,
 				      enum kgd_engine_type type)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 5700643..c00c974 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -153,6 +153,8 @@ int amdgpu_amdkfd_alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
 			void **mem_obj, uint64_t *gpu_addr, void **cpu_ptr,
 			bool mqd_gfx9);
 void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj);
+int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t size, void **mem_obj);
+void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj);
 uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd,
 		enum kgd_engine_type type);
 void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd,
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 5/8] drm/amdkfd: Add function to set queue gws
Add functions in process queue manager to set/unset queue gws. Also set process's number of gws used. Currently only one queue in process can use and use all gws. Change-Id: I03e480c8692db3eabfc3a188cce8904d5d962ab7 Signed-off-by: Oak Zeng --- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 6 +++ .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 57 ++ 2 files changed, 63 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 5969e37..40a5c67 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -454,6 +454,9 @@ struct queue_properties { * * @device: The kfd device that created this queue. * + * @gws: Pointing to gws kgd_mem if this is a gws control queue; NULL + * otherwise. + * * This structure represents user mode compute queues. * It contains all the necessary data to handle such queues. * @@ -475,6 +478,7 @@ struct queue { struct kfd_process *process; struct kfd_dev *device; + void *gws; }; /* @@ -868,6 +872,8 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p); int pqm_set_cu_mask(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p); +int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, + void *gws); struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, unsigned int qid); int pqm_get_wave_state(struct process_queue_manager *pqm, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c index e652e25..5764491 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c @@ -26,6 +26,7 @@ #include "kfd_device_queue_manager.h" #include "kfd_priv.h" #include "kfd_kernel_queue.h" +#include "amdgpu_amdkfd.h" static inline struct process_queue_node *get_queue_by_qid( struct process_queue_manager *pqm, unsigned int qid) @@ 
-74,6 +75,55 @@ void kfd_process_dequeue_from_device(struct kfd_process_device *pdd) pdd->already_dequeued = true; } +int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, + void *gws) +{ + struct kfd_dev *dev = NULL; + struct process_queue_node *pqn; + struct kfd_process_device *pdd; + struct kgd_mem *mem; + int ret; + + pqn = get_queue_by_qid(pqm, qid); + if (!pqn) { + pr_err("Queue id does not match any known queue\n"); + return -EINVAL; + } + + if (pqn->q) + dev = pqn->q->device; + if (WARN_ON(!dev)) + return -ENODEV; + + pdd = kfd_get_process_device_data(dev, pqm->process); + if (!pdd) { + pr_err("Process device data doesn't exist\n"); + return -EINVAL; + } + + /* Only allow one queue per process can have GWS assigned */ + if (gws && pdd->qpd.num_gws) + return -EINVAL; + + if (!gws && pdd->qpd.num_gws == 0) + return -EINVAL; + + if (gws) + ret = amdgpu_amdkfd_add_gws_to_process(pdd->process->kgd_process_info, + gws, &mem); + else + amdgpu_amdkfd_remove_gws_from_process(pdd->process->kgd_process_info, + pqn->q->gws); + if (unlikely(ret)) + return ret; + + pqn->q->gws = gws ? mem : NULL; + pdd->qpd.num_gws = gws ? amdgpu_amdkfd_get_num_gws(dev->kgd) : 0; + + return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm, + pqn->q); +} + void kfd_process_dequeue_from_all_devices(struct kfd_process *p) { struct kfd_process_device *pdd; @@ -330,6 +380,13 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid) if (retval != -ETIME) goto err_destroy_queue; } + + if (pqn->q->gws) { + amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info, + pqn->q->gws); + pdd->qpd.num_gws = 0; + } + kfree(pqn->q->properties.cu_mask); pqn->q->properties.cu_mask = NULL; uninit_queue(pqn->q); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
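[Editorial note] The set/unset rules in pqm_set_gws above can be exercised in isolation. This is a minimal user-space model of the invariant the patch enforces: at most one queue per process holds GWS, a second assignment fails, and unset is only valid while something is set. The struct and function names are illustrative stand-ins for the kfd types, and NUM_GWS is an assumed hardware GWS count, not a value from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_GWS 64	/* assumed hardware GWS count */

struct gws_queue { void *gws; };
struct gws_process { unsigned int num_gws; };

static struct gws_process proc;
static struct gws_queue q1, q2;

static int model_set_gws(struct gws_process *p, struct gws_queue *q,
			 void *gws)
{
	/* Only allow one queue per process to have GWS assigned */
	if (gws && p->num_gws)
		return -EINVAL;
	/* Unsetting GWS that was never set is an error */
	if (!gws && p->num_gws == 0)
		return -EINVAL;

	q->gws = gws;
	p->num_gws = gws ? NUM_GWS : 0;
	return 0;
}
```

Note the model checks p->num_gws rather than walking a queue list, matching the cheaper check suggested in the review of the earlier revision.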
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On 5/23/19 2:11 PM, Catalin Marinas wrote: > Hi Khalid, > > On Thu, May 23, 2019 at 11:51:40AM -0600, Khalid Aziz wrote: >> On 5/21/19 6:04 PM, Kees Cook wrote: >>> As an aside: I think Sparc ADI support in Linux actually side-stepped >>> this[1] (i.e. chose "solution 1"): "All addresses passed to kernel must >>> be non-ADI tagged addresses." (And sadly, "Kernel does not enable ADI >>> for kernel code.") I think this was a mistake we should not repeat for >>> arm64 (we do seem to be at least in agreement about this, I think). >>> >>> [1] https://lore.kernel.org/patchwork/patch/654481/ >> >> That is a very early version of the sparc ADI patch. Support for tagged >> addresses in syscalls was added in later versions and is in the patch >> that is in the kernel. > > I tried to figure out but I'm not familiar with the sparc port. How did > you solve the tagged address going into various syscall implementations > in the kernel (e.g. sys_write)? Is the tag removed on kernel entry or it > ends up deeper in the core code? > Another spot I should point out in ADI patch - Tags are not stored in VMAs and IOMMU does not support ADI tags on M7. ADI tags are stripped before userspace addresses are passed to IOMMU in the following snippet from the patch: diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c index 5335ba3c850e..357b6047653a 100644 --- a/arch/sparc/mm/gup.c +++ b/arch/sparc/mm/gup.c @@ -201,6 +202,24 @@ int __get_user_pages_fast(unsigned long start, int nr_pages , int write, pgd_t *pgdp; int nr = 0; +#ifdef CONFIG_SPARC64 + if (adi_capable()) { + long addr = start; + + /* If userspace has passed a versioned address, kernel +* will not find it in the VMAs since it does not store +* the version tags in the list of VMAs. Storing version +* tags in list of VMAs is impractical since they can be +* changed any time from userspace without dropping into +* kernel. Any address search in VMAs will be done with +* non-versioned addresses. 
Ensure the ADI version bits +* are dropped here by sign extending the last bit before +* ADI bits. IOMMU does not implement version tags. +*/ + addr = (addr << (long)adi_nbits()) >> (long)adi_nbits(); + start = addr; + } +#endif start &= PAGE_MASK; addr = start; len = (unsigned long) nr_pages << PAGE_SHIFT; -- Khalid
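[Editorial note] The tag-stripping idiom in the gup.c hunk above can be shown stand-alone: shifting left by the number of ADI version bits and arithmetic-shifting back sign-extends the last bit before the tag, dropping the version while keeping the address canonical. Below, ADI_NBITS stands in for adi_nbits(); 4 version bits are assumed, and the left shift is done on an unsigned value to stay well-defined in user-space C.

```c
#include <assert.h>

#define ADI_NBITS 4	/* assumed number of ADI version bits */

static long strip_adi_tag(long addr)
{
	/* shift left as unsigned to avoid signed-overflow UB, then
	 * arithmetic right shift as signed to sign-extend over the tag */
	return (long)((unsigned long)addr << ADI_NBITS) >> ADI_NBITS;
}
```

A tagged user address loses its version bits, while a tagged high address sign-extends back into the upper half rather than being folded into user range.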
Re: [PATCH 6/7] drm/amdkfd: New IOCTL to allocate queue GWS
On 2019-05-23 2:41 p.m., Zeng, Oak wrote: > Add a new kfd ioctl to allocate queue GWS. Queue > GWS is released on queue destroy. > > Change-Id: I60153c26a577992ad873e4292e759e5c3d5bbd15 > Signed-off-by: Oak Zeng > --- > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30 ++ > include/uapi/linux/kfd_ioctl.h | 20 +++- > 2 files changed, 49 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > index 38ae53f..f3f08fe 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > @@ -1559,6 +1559,34 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file > *filep, > return err; > } > > +static int kfd_ioctl_alloc_queue_gws(struct file *filep, > + struct kfd_process *p, void *data) > +{ > + int retval; > + struct kfd_ioctl_alloc_queue_gws_args *args = data; > + struct kfd_dev *dev = NULL; > + > + if (args->num_gws == 0) > + return -EINVAL; This could be useful to allow releasing GWS resources without having to destroy the queue. See also my comment in patch 5. Other than that this looks good to me. 
Regards, Felix > + > + if (!hws_gws_support || > + dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) > + return -EINVAL; > + > + dev = kfd_device_by_id(args->gpu_id); > + if (!dev) { > + pr_debug("Could not find gpu id 0x%x\n", args->gpu_id); > + return -EINVAL; > + } > + > + mutex_lock(&p->mutex); > + retval = pqm_set_gws(&p->pqm, args->queue_id, dev->gws); > + mutex_unlock(&p->mutex); > + > + args->first_gws = 0; > + return retval; > +} > + > static int kfd_ioctl_get_dmabuf_info(struct file *filep, > struct kfd_process *p, void *data) > { > @@ -1761,6 +1789,8 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = > { > AMDKFD_IOCTL_DEF(AMDKFD_IOC_IMPORT_DMABUF, > kfd_ioctl_import_dmabuf, 0), > > + AMDKFD_IOCTL_DEF(AMDKFD_IOC_ALLOC_QUEUE_GWS, > + kfd_ioctl_alloc_queue_gws, 0), > }; > > #define AMDKFD_CORE_IOCTL_COUNT ARRAY_SIZE(amdkfd_ioctls) > diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h > index 20917c5..070d1bc 100644 > --- a/include/uapi/linux/kfd_ioctl.h > +++ b/include/uapi/linux/kfd_ioctl.h > @@ -410,6 +410,21 @@ struct kfd_ioctl_unmap_memory_from_gpu_args { > __u32 n_success;/* to/from KFD */ > }; > > +/* Allocate GWS for specific queue > + * > + * @gpu_id: device identifier > + * @queue_id:queue's id that GWS is allocated for > + * @num_gws: how many GWS to allocate > + * @first_gws: index of the first GWS allocated. 
> + * only support contiguous GWS allocation > + */ > +struct kfd_ioctl_alloc_queue_gws_args { > + __u32 gpu_id; /* to KFD */ > + __u32 queue_id; /* to KFD */ > + __u32 num_gws; /* to KFD */ > + __u32 first_gws;/* from KFD */ > +}; > + > struct kfd_ioctl_get_dmabuf_info_args { > __u64 size; /* from KFD */ > __u64 metadata_ptr; /* to KFD */ > @@ -529,7 +544,10 @@ enum kfd_mmio_remap { > #define AMDKFD_IOC_IMPORT_DMABUF\ > AMDKFD_IOWR(0x1D, struct kfd_ioctl_import_dmabuf_args) > > +#define AMDKFD_IOC_ALLOC_QUEUE_GWS \ > + AMDKFD_IOWR(0x1E, struct kfd_ioctl_alloc_queue_gws_args) > + > #define AMDKFD_COMMAND_START0x01 > -#define AMDKFD_COMMAND_END 0x1E > +#define AMDKFD_COMMAND_END 0x1F > > #endif ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
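[Editorial note] The uapi struct in the patch above is four __u32 fields, so its user/kernel ABI layout is a fixed 16 bytes with no padding, and the ioctl number encodes that size. The sketch below uses a local copy of the struct; the _IOWR() magic 'K' is assumed from kfd_ioctl.h's AMDKFD_IOCTL_BASE, and the macro name is a stand-in, not the uapi define.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* local copy of kfd_ioctl_alloc_queue_gws_args from the patch */
struct alloc_queue_gws_args_model {
	uint32_t gpu_id;	/* to KFD */
	uint32_t queue_id;	/* to KFD */
	uint32_t num_gws;	/* to KFD */
	uint32_t first_gws;	/* from KFD */
};

/* assumed to mirror AMDKFD_IOC_ALLOC_QUEUE_GWS = AMDKFD_IOWR(0x1E, ...) */
#define MODEL_IOC_ALLOC_QUEUE_GWS \
	_IOWR('K', 0x1E, struct alloc_queue_gws_args_model)
```

In a real caller the struct would be filled with gpu_id/queue_id/num_gws and passed to ioctl() on the /dev/kfd fd; that part needs hardware and is omitted here.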
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On 5/23/19 2:11 PM, Catalin Marinas wrote: > Hi Khalid, > > On Thu, May 23, 2019 at 11:51:40AM -0600, Khalid Aziz wrote: >> On 5/21/19 6:04 PM, Kees Cook wrote: >>> As an aside: I think Sparc ADI support in Linux actually side-stepped >>> this[1] (i.e. chose "solution 1"): "All addresses passed to kernel must >>> be non-ADI tagged addresses." (And sadly, "Kernel does not enable ADI >>> for kernel code.") I think this was a mistake we should not repeat for >>> arm64 (we do seem to be at least in agreement about this, I think). >>> >>> [1] https://lore.kernel.org/patchwork/patch/654481/ >> >> That is a very early version of the sparc ADI patch. Support for tagged >> addresses in syscalls was added in later versions and is in the patch >> that is in the kernel. > > I tried to figure out but I'm not familiar with the sparc port. How did > you solve the tagged address going into various syscall implementations > in the kernel (e.g. sys_write)? Is the tag removed on kernel entry or it > ends up deeper in the core code? Tag is not removed from the user addresses. Kernel passes tagged addresses to copy_from_user and copy_to_user. MMU checks the tag embedded in the address when kernel accesses userspace addresses. This maintains the ADI integrity even when userspace attempts to access any userspace addresses through system calls. On sparc, access_ok() is defined as: #define access_ok(addr, size) __access_ok((unsigned long)(addr), size) #define __access_ok(addr, size) (__user_ok((addr) & get_fs().seg, (size))) #define __user_ok(addr, size) ({ (void)(size); (addr) < STACK_TOP; }) STACK_TOP for M7 processor (which is the first sparc processor to support ADI) is 0xfff8UL. Tagged addresses pass the access_ok() check fine. Any tag mismatches that happen during kernel access to userspace addresses are handled by do_mcd_err(). -- Khalid
Re: [PATCH 5/7] drm/amdkfd: Add function to set queue gws
On 2019-05-23 2:41 p.m., Zeng, Oak wrote: > Add functions in process queue manager to > set queue gws. Also set process's number > of gws used. Currently only one queue in > process can use and use all gws. > > Change-Id: I03e480c8692db3eabfc3a188cce8904d5d962ab7 > Signed-off-by: Oak Zeng > --- > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 6 +++ > .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 56 > ++ > 2 files changed, 62 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > index 5969e37..40a5c67 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > @@ -454,6 +454,9 @@ struct queue_properties { >* >* @device: The kfd device that created this queue. >* > + * @gws: Pointing to gws kgd_mem if this is a gws control queue; NULL > + * otherwise. > + * >* This structure represents user mode compute queues. >* It contains all the necessary data to handle such queues. >* > @@ -475,6 +478,7 @@ struct queue { > > struct kfd_process *process; > struct kfd_dev *device; > + void *gws; > }; > > /* > @@ -868,6 +872,8 @@ int pqm_update_queue(struct process_queue_manager *pqm, > unsigned int qid, > struct queue_properties *p); > int pqm_set_cu_mask(struct process_queue_manager *pqm, unsigned int qid, > struct queue_properties *p); > +int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, > + void *gws); > struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, > unsigned int qid); > int pqm_get_wave_state(struct process_queue_manager *pqm, > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > index e652e25..d69e17a 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > @@ -26,6 +26,7 @@ > #include "kfd_device_queue_manager.h" > #include "kfd_priv.h" > #include "kfd_kernel_queue.h" > 
+#include "amdgpu_amdkfd.h" > > static inline struct process_queue_node *get_queue_by_qid( > struct process_queue_manager *pqm, unsigned int qid) > @@ -74,6 +75,54 @@ void kfd_process_dequeue_from_device(struct > kfd_process_device *pdd) > pdd->already_dequeued = true; > } > > +int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, > + void *gws) > +{ > + struct kfd_dev *dev = NULL; > + struct process_queue_node *pqn; > + struct kfd_process_device *pdd; > + struct kgd_mem *mem; > + int ret; > + > + /* Don't allow unset gws */ > + if (!gws) > + return -EINVAL; I wonder if it would be useful to allow this (e.g. if kfd_ioctl_alloc_queue_gws is called with args->num_gws == 0). Otherwise the only way to release GWS resources is to destroy the queue. > + > + pqn = get_queue_by_qid(pqm, qid); > + if (!pqn) { > + pr_err("Queue id does not match any known queue\n"); > + return -EINVAL; > + } > + > + if (pqn->q) > + dev = pqn->q->device; > + if (WARN_ON(!dev)) > + return -ENODEV; > + > + pdd = kfd_get_process_device_data(dev, pqm->process); > + if (!pdd) { > + pr_err("Process device data doesn't exist\n"); > + return -EINVAL; > + } > + > + /* Only allow one queue per process can have GWS assigned */ > + list_for_each_entry(pqn, &pqm->queues, process_queue_list) { > + if (pqn->q && pqn->q->gws) > + return -EINVAL; > + } It would be easier and more efficient to just check that pdd->qpd.num_gws is 0. 
Regards, Felix > + > + ret = amdgpu_amdkfd_add_gws_to_process(pdd->process->kgd_process_info, > + gws, &mem); > + if (unlikely(ret)) > + return ret; > + > + pqn->q->gws = mem; > + pdd->qpd.num_gws = amdgpu_amdkfd_get_num_gws(dev->kgd); > + > + return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm, > + pqn->q); > +} > + > void kfd_process_dequeue_from_all_devices(struct kfd_process *p) > { > struct kfd_process_device *pdd; > @@ -330,6 +379,13 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, > unsigned int qid) > if (retval != -ETIME) > goto err_destroy_queue; > } > + > + if (pqn->q->gws) { > + > amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info, > + pqn->q->gws); > + pdd->qpd.num_gws = 0; > + } > + > kfree(pqn->q->properties.cu_mask); >
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Thu, May 23, 2019 at 06:43:46PM +0100, Catalin Marinas wrote: > On Thu, May 23, 2019 at 09:38:19AM -0700, Kees Cook wrote: > > What on this front would you be comfortable with? Given it's a new > > feature isn't it sufficient to have a CONFIG (and/or boot option)? > > I'd rather avoid re-building kernels. A boot option would do, unless we > see value in a per-process (inherited) personality or prctl. The I think I've convinced myself that the normal<->TBI ABI control should be a boot parameter. More below... > > What about testing tools that intentionally insert high bits for syscalls > > and are _expecting_ them to fail? It seems the TBI series will break them? > > In that case, do we need to opt into TBI as well? > > If there are such tools, then we may need a per-process control. It's > basically an ABI change. syzkaller already attempts to randomly inject non-canonical and 0x addresses for user pointers in syscalls in an effort to find bugs like CVE-2017-5123 where waitid() via unchecked put_user() was able to write directly to kernel memory[1]. It seems that using TBI by default and not allowing a switch back to "normal" ABI without a reboot actually means that userspace cannot inject kernel pointers into syscalls any more, since they'll get universally stripped now. Is my understanding correct, here? i.e. exploiting CVE-2017-5123 would be impossible under TBI? If so, then I think we should commit to the TBI ABI and have a boot flag to disable it, but NOT have a process flag, as that would allow attackers to bypass the masking. The only flag should be "TBI or MTE". If so, can I get top byte masking for other architectures too? Like, just to strip high bits off userspace addresses? ;) (Oh, in looking I see this is implemented with sign-extension... why not just a mask? So it'll either be valid userspace address or forced into the non-canonical range?) 
[1] https://salls.github.io/Linux-Kernel-CVE-2017-5123/ > > Alright, the tl;dr appears to be: > > - you want more assurances that we can find __user stripping in the > > kernel more easily. (But this seems like a parallel problem.) > > Yes, and that we found all (most) cases now. The reason I don't see it > as a parallel problem is that, as maintainer, I promise an ABI to user > and I'd rather stick to it. I don't want, for example, ncurses to stop > working because of some ioctl() rejecting tagged pointers. But this is what I don't understand: it would need to be ncurses _using TBI_, that would stop working (having started to work before, but then regress due to a newly added one-off bug). Regular ncurses will be fine because it's not using TBI. So The Golden Rule isn't violated, and by definition, it's a specific regression caused by some bug (since TBI would have had to have worked _before_ in the situation to be considered a regression now). Which describes the normal path for kernel development... add feature, find corner cases where it doesn't work, fix them, encounter new regressions, fix those, repeat forever. > If it's just the occasional one-off bug I'm fine to deal with it. But > no-one convinced me yet that this is the case. You believe there still to be some systemic cases that haven't been found yet? And even if so -- isn't it better to work on that incrementally? > As for the generic driver code (filesystems or other subsystems), > without some clear direction for developers, together with static > checking/sparse, on how user pointers are cast to longs (one example), > it would become my responsibility to identify and fix them up with any > kernel release. This series is not providing such guidance, just adding > untagged_addr() in some places that we think matter. What about adding a nice bit of .rst documentation that describes the situation and shows how to use untagged_addr(). 
This is the kind of kernel-wide change that "everyone" needs to know about, and shouldn't be the arch maintainer's sole responsibility to fix. > > - we might need to opt in to TBI with a prctl() > > Yes, although still up for discussion. I think I've talked myself out of it. I say boot param only! :) So what do you say to these next steps: - change untagged_addr() to use a static branch that is controlled with a boot parameter. - add, say, Documentation/core-api/user-addresses.rst to describe proper care and handling of user space pointers with untagged_addr(), with examples based on all the cases seen so far in this series. - continue work to improve static analysis. Thanks for wading through this with me! :) -- Kees Cook
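[Editorial note] The mask-vs-sign-extension question raised above can be made concrete. Assuming arm64 TBI with 8 tag bits in [63:56]: clearing the top byte folds a tagged kernel pointer into the user half of the address space, while sign-extending from bit 55 keeps it a high (non-user) address. Both helper names below are illustrative, not the kernel's untagged_addr().

```c
#include <assert.h>

#define TBI_TAG_BITS 8	/* assumed tag width, bits [63:56] */

static unsigned long untag_mask(unsigned long addr)
{
	return addr & ((1UL << (64 - TBI_TAG_BITS)) - 1);
}

static unsigned long untag_extend(unsigned long addr)
{
	/* shift left as unsigned (well-defined), then arithmetic right
	 * shift as signed to sign-extend over the tag byte */
	return (unsigned long)((long)(addr << TBI_TAG_BITS) >> TBI_TAG_BITS);
}
```

For a tagged user pointer the two agree; for a tagged kernel-half pointer only the sign-extending variant keeps the result out of user range, which is the point of the "forced into the non-canonical range" remark.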
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
Hi Khalid, On Thu, May 23, 2019 at 11:51:40AM -0600, Khalid Aziz wrote: > On 5/21/19 6:04 PM, Kees Cook wrote: > > As an aside: I think Sparc ADI support in Linux actually side-stepped > > this[1] (i.e. chose "solution 1"): "All addresses passed to kernel must > > be non-ADI tagged addresses." (And sadly, "Kernel does not enable ADI > > for kernel code.") I think this was a mistake we should not repeat for > > arm64 (we do seem to be at least in agreement about this, I think). > > > > [1] https://lore.kernel.org/patchwork/patch/654481/ > > That is a very early version of the sparc ADI patch. Support for tagged > addresses in syscalls was added in later versions and is in the patch > that is in the kernel. I tried to figure out but I'm not familiar with the sparc port. How did you solve the tagged address going into various syscall implementations in the kernel (e.g. sys_write)? Is the tag removed on kernel entry or it ends up deeper in the core code? Thanks. -- Catalin
Re: GPU passthrough support for Stoney [Radeon R2/R3/R4/R5 Graphics]?
On Thu, May 23, 2019 at 9:55 AM Micah Morton wrote: > > On Wed, May 22, 2019 at 6:11 PM Alex Deucher wrote: > > > > On Wed, May 22, 2019 at 7:00 PM Micah Morton wrote: > > > > > > On Wed, May 22, 2019 at 1:39 PM Alex Deucher > > > wrote: > > > > > > > > On Tue, May 21, 2019 at 1:46 PM Micah Morton > > > > wrote: > > > > > > > > > > On Fri, May 17, 2019 at 9:59 AM Alex Deucher > > > > > wrote: > > > > > > > > > > > > On Fri, May 17, 2019 at 11:36 AM Micah Morton > > > > > > wrote: > > > > > > > > > > > > > > On Thu, May 16, 2019 at 1:39 PM Alex Deucher > > > > > > > wrote: > > > > > > > > > > > > > > > > On Thu, May 16, 2019 at 4:07 PM Micah Morton > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Wed, May 15, 2019 at 7:19 PM Alex Deucher > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > On Wed, May 15, 2019 at 2:26 PM Micah Morton > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > > > > > > > > > I'm interested in running a VM on a system with an > > > > > > > > > > > integrated Stoney > > > > > > > > > > > [Radeon R2/R3/R4/R5 Graphics] card and passing through > > > > > > > > > > > the graphics > > > > > > > > > > > card to the VM using the IOMMU. I'm wondering whether > > > > > > > > > > > this is feasible > > > > > > > > > > > and supposed to be doable with the right setup (as > > > > > > > > > > > opposed to passing > > > > > > > > > > > a discrete GPU to the VM, which I think is definitely > > > > > > > > > > > doable?). > > > > > > > > > > > > > > > > > > > > > > So far, I can do all the qemu/kvm/vfio/iommu stuff to run > > > > > > > > > > > the VM and > > > > > > > > > > > pass the integrated GPU to it, but the drm driver in the > > > > > > > > > > > VM fails > > > > > > > > > > > during amdgpu_device_init(). 
Specifically, the logs show > > > > > > > > > > > the SMU being > > > > > > > > > > > unresponsive, which leads to a 'SMU firmware load failed' > > > > > > > > > > > error > > > > > > > > > > > message and kernel panic. I can share VM logs and the > > > > > > > > > > > invocation of > > > > > > > > > > > qemu and such if helpful, but first wanted to know at a > > > > > > > > > > > high level if > > > > > > > > > > > this should be feasible? > > > > > > > > > > > > > > > > > > > > > > P.S.: I'm not initializing the GPU in the host bios or > > > > > > > > > > > host kernel at > > > > > > > > > > > all, so I should be passing a fresh GPU to the VM. Also, > > > > > > > > > > > I'm pretty > > > > > > > > > > > sure I'm running the correct VGA bios for this GPU in the > > > > > > > > > > > guest VM > > > > > > > > > > > bios before guest boot. > > > > > > > > > > > > > > > > > > > > > > Any comments/suggestions would be appreciated! > > > > > > > > > > > > > > > > > > > > It should work in at least once as long as your vm is > > > > > > > > > > properly set up. > > > > > > > > > > > > > > > > > > Is there any reason running coreboot vs UEFI at host boot > > > > > > > > > would make a > > > > > > > > > difference? I was running a modified version of coreboot that > > > > > > > > > avoids > > > > > > > > > doing any GPU initialization in firmware -- so the first POST > > > > > > > > > happens > > > > > > > > > inside the guest. > > > > > > > > > > > > > > > > The GPU on APUs shares a bunch of resources with the CPU. > > > > > > > > There are a > > > > > > > > bunch of blocks which are shared and need to be initialized on > > > > > > > > both > > > > > > > > for everything to work properly. > > > > > > > > > > > > > > Interesting. So skipping running the vbios in the host and waiting > > > > > > > until running it for the first time in the guest SeaBIOS is a bad > > > > > > > idea? 
Would it be better to let APU+CPU initialize normally in the > > > > > > > host and then skip trying to run the vbios in guest SeaBIOS and > > > > > > > just > > > > > > > do some kind of reset before the drm driver starts accessing it > > > > > > > from > > > > > > > the guest? > > > > > > > > > > > > If you let the sbios initialize things, it should work. The driver > > > > > > will do the right thing to init the card when it loads whether its > > > > > > running on bare metal or in a VM. We've never tested any scenarios > > > > > > where the GPU on APUs is not handled by the sbios. Note that the > > > > > > GPU > > > > > > does not have to be posted per se, it just needs to have been > > > > > > properly > > > > > > taken into account when the sbios comes up so that shared components > > > > > > are initialized correctly. I don't know what your patched system > > > > > > does > > > > > > or doesn't do with respect to the platform initialization. > > > > > > > > > > So it sounds like you are suggesting the following: > > > > > > > > > > a) Run the vbios as part of the host sbios to initialize stuff and > > > > > patch up the vbios > > > > > > > > > > b) Don't run the drm/amdgpu driver in the
[PATCH 7/7] drm/amdkfd: PM4 packets change to support GWS
Add a field in map_queues packet to indicate whether this
is a gws control queue. Only one queue per process can be
gws control queue. Change num_gws field in map_process
packet to 7 bits.

Change-Id: I0db91d99c6962d14f9202b2eb950f8e7e497b79e
Signed-off-by: Oak Zeng
Reviewed-by: Felix Kuehling
---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h  | 7 ++++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
index 3dd731c..07f02f8e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
@@ -159,6 +159,7 @@ static int pm_map_queues_v9(struct packet_manager *pm, uint32_t *buffer,
 	packet->bitfields2.engine_sel =
 		engine_sel__mes_map_queues__compute_vi;
+	packet->bitfields2.gws_control_queue = q->gws ? 1 : 0;
 	packet->bitfields2.queue_type =
 		queue_type__mes_map_queues__normal_compute_vi;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
index 0661339..49ab66b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
@@ -176,8 +176,7 @@ struct pm4_mes_map_process {
 	union {
 		struct {
-			uint32_t num_gws:6;
-			uint32_t reserved7:1;
+			uint32_t num_gws:7;
 			uint32_t sdma_enable:1;
 			uint32_t num_oac:4;
 			uint32_t reserved8:4;
@@ -272,7 +271,9 @@ struct pm4_mes_map_queues {
 		struct {
 			uint32_t reserved1:4;
 			enum mes_map_queues_queue_sel_enum queue_sel:2;
-			uint32_t reserved2:15;
+			uint32_t reserved5:6;
+			uint32_t gws_control_queue:1;
+			uint32_t reserved2:8;
 			enum mes_map_queues_queue_type_enum queue_type:3;
 			uint32_t reserved3:2;
 			enum mes_map_queues_engine_sel_enum engine_sel:3;
-- 
2.7.4
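[Editorial note] The bitfield arithmetic above is easy to sanity-check: the old reserved2:15 is split into 6 + 1 + 8 bits, num_gws grows from 6 to 7 bits by absorbing reserved7:1, and neither dword changes width. The model struct below mirrors the patched map_queues dword; the trailing num_queues:3 field is assumed from the full pm4_mes_map_queues definition, which the diff truncates, and enum fields are replaced by plain uint32_t for portability.

```c
#include <assert.h>
#include <stdint.h>

/* stand-alone model of the patched map_queues dword2, not the kernel header */
struct map_queues_dword2_model {
	uint32_t reserved1:4;
	uint32_t queue_sel:2;
	uint32_t reserved5:6;
	uint32_t gws_control_queue:1;
	uint32_t reserved2:8;
	uint32_t queue_type:3;
	uint32_t reserved3:2;
	uint32_t engine_sel:3;
	uint32_t num_queues:3;	/* assumed remaining field */
};
```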
[PATCH 1/7] drm/amdkfd: Add gws number to kfd topology node properties
Add amdgpu_amdkfd interface to get num_gws and add num_gws to /sys/class/kfd/kfd/topology/nodes/x/properties. Only report num_gws if MEC FW support GWS barriers. Currently it is determined by a module parameter which will be replaced with MEC FW version check when firmware is ready. Change-Id: Ie0d00fb20a37ef2856860dbecbe1ad0ca1ef09f7 Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 10 ++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 + drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 5 + drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 + 6 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 98326e3b..a4780d5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -544,6 +544,13 @@ uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev *kgd) return adev->rmmio_remap.bus_addr; } +uint32_t amdgpu_amdkfd_get_num_gws(struct kgd_dev *kgd) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)kgd; + + return adev->gds.gws_size; +} + int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine, uint32_t vmid, uint64_t gpu_addr, uint32_t *ib_cmd, uint32_t ib_len) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index f57f297..5700643 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -169,6 +169,7 @@ int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int dma_buf_fd, uint64_t amdgpu_amdkfd_get_vram_usage(struct kgd_dev *kgd); uint64_t amdgpu_amdkfd_get_hive_id(struct kgd_dev *kgd); uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct kgd_dev *kgd); +uint32_t amdgpu_amdkfd_get_num_gws(struct kgd_dev *kgd); uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *src); #define read_user_wptr(mmptr, wptr, dst) \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index a334d3b..3a03c2b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -666,6 +666,16 @@ MODULE_PARM_DESC(noretry, int halt_if_hws_hang; module_param(halt_if_hws_hang, int, 0644); MODULE_PARM_DESC(halt_if_hws_hang, "Halt if HWS hang is detected (0 = off (default), 1 = on)"); + +/** + * DOC: hws_gws_support(bool) + * Whether HWS support gws barriers. Default value: false (not supported) + * This will be replaced with a MEC firmware version check once firmware + * is ready + */ +bool hws_gws_support; +module_param(hws_gws_support, bool, 0444); +MODULE_PARM_DESC(hws_gws_support, "MEC FW support gws barriers (false = not supported (Default), true = supported)"); #endif /** diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 8f02d78..338fb07 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -160,6 +160,11 @@ extern int noretry; */ extern int halt_if_hws_hang; +/* + * Whether MEC FW support GWS barriers + */ +extern bool hws_gws_support; + enum cache_policy { cache_policy_coherent, cache_policy_noncoherent diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 2c06d6c..128c72c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -454,6 +454,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr, dev->node_props.lds_size_in_kb); sysfs_show_32bit_prop(buffer, "gds_size_in_kb", dev->node_props.gds_size_in_kb); + sysfs_show_32bit_prop(buffer, "num_gws", + dev->node_props.num_gws); sysfs_show_32bit_prop(buffer, "wave_front_size", dev->node_props.wave_front_size); sysfs_show_32bit_prop(buffer, "array_count", @@ -1290,6 +1292,9 @@ int 
kfd_topology_add_device(struct kfd_dev *gpu) dev->node_props.num_sdma_engines = gpu->device_info->num_sdma_engines; dev->node_props.num_sdma_xgmi_engines = gpu->device_info->num_xgmi_sdma_engines; + dev->node_props.num_gws = (hws_gws_support && + dev->gpu->dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) ? + amdgpu_amdkfd_get_num_gws(dev->gpu->kgd) : 0; kfd_fill_mem_clk_max_info(dev); kfd_fill_iolink_non_crat_info(dev); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h index 949e885..276354a 100644 --- a/drivers/gpu/drm/amd/amdkfd
[PATCH 4/7] drm/amdgpu: Add function to add/remove gws to kfd process
GWS bo is shared between all kfd processes. Add function to add gws to kfd process's bo list so gws can be evicted from and restored for process. Change-Id: I75d74cfdadb7075ff8b2b68634e205deb73dc1ea Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 103 +-- 2 files changed, 100 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index c00c974..f968bf1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -155,6 +155,8 @@ int amdgpu_amdkfd_alloc_gtt_mem(struct kgd_dev *kgd, size_t size, void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj); int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t size, void **mem_obj); void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj); +int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem); +int amdgpu_amdkfd_remove_gws_from_process(void *info, void *mem); uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type); void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index e1cae4a..87177ed 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -457,6 +457,17 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem, mutex_unlock(&process_info->lock); } +static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem, + struct amdkfd_process_info *process_info) +{ + struct ttm_validate_buffer *bo_list_entry; + + bo_list_entry = &mem->validate_list; + mutex_lock(&process_info->lock); + list_del(&bo_list_entry->head); + mutex_unlock(&process_info->lock); +} + /* Initializes user pages. 
It registers the MMU notifier and validates * the userptr BO in the GTT domain. * @@ -1183,12 +1194,8 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( if (user_addr) { ret = init_user_pages(*mem, current->mm, user_addr); - if (ret) { - mutex_lock(&avm->process_info->lock); - list_del(&(*mem)->validate_list.head); - mutex_unlock(&avm->process_info->lock); + if (ret) goto allocate_init_user_pages_failed; - } } if (offset) @@ -1197,6 +1204,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( return 0; allocate_init_user_pages_failed: + remove_kgd_mem_from_kfd_bo_list(*mem, avm->process_info); amdgpu_bo_unref(&bo); /* Don't unreserve system mem limit twice */ goto err_reserve_limit; @@ -2104,3 +2112,88 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef) kfree(pd_bo_list); return ret; } + +int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem) +{ + struct amdkfd_process_info *process_info = (struct amdkfd_process_info *)info; + struct amdgpu_bo *gws_bo = (struct amdgpu_bo *)gws; + int ret; + + if (!info || !gws) + return -EINVAL; + + *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); + if (!*mem) + return -EINVAL; + + mutex_init(&(*mem)->lock); + (*mem)->bo = amdgpu_bo_ref(gws_bo); + (*mem)->domain = AMDGPU_GEM_DOMAIN_GWS; + (*mem)->process_info = process_info; + add_kgd_mem_to_kfd_bo_list(*mem, process_info, false); + amdgpu_sync_create(&(*mem)->sync); + + + /* Validate gws bo the first time it is added to process */ + mutex_lock(&(*mem)->process_info->lock); + ret = amdgpu_bo_reserve(gws_bo, false); + if (unlikely(ret)) { + pr_err("Reserve gws bo failed %d\n", ret); + goto bo_reservation_failure; + } + + ret = amdgpu_amdkfd_bo_validate(gws_bo, AMDGPU_GEM_DOMAIN_GWS, true); + if (ret) { + pr_err("GWS BO validate failed %d\n", ret); + goto bo_validation_failure; + } + /* GWS resource is shared b/t amdgpu and amdkfd +* Add process eviction fence to bo so they can +* evict each other. 
+*/ + amdgpu_bo_fence(gws_bo, &process_info->eviction_fence->base, true); + amdgpu_bo_unreserve(gws_bo); + mutex_unlock(&(*mem)->process_info->lock); + + return ret; + +bo_validation_failure: + amdgpu_bo_unreserve(gws_bo); +bo_reservation_failure: + mutex_unlock(&(*mem)->process_info->lock); + amdgpu_sync_free(&(*mem)->sync); + remove_kgd_mem_from_kfd_bo_list(*mem, process_info); + amdgpu_bo_unref(&gws_bo); + m
[PATCH 6/7] drm/amdkfd: New IOCTL to allocate queue GWS
Add a new kfd ioctl to allocate queue GWS. Queue GWS is released on
queue destroy.

Change-Id: I60153c26a577992ad873e4292e759e5c3d5bbd15
Signed-off-by: Oak Zeng
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/kfd_ioctl.h           | 20 +++++++++++++++++++-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 38ae53f..f3f08fe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1559,6 +1559,34 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
 	return err;
 }
 
+static int kfd_ioctl_alloc_queue_gws(struct file *filep,
+		struct kfd_process *p, void *data)
+{
+	int retval;
+	struct kfd_ioctl_alloc_queue_gws_args *args = data;
+	struct kfd_dev *dev;
+
+	if (args->num_gws == 0)
+		return -EINVAL;
+
+	dev = kfd_device_by_id(args->gpu_id);
+	if (!dev) {
+		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+		return -EINVAL;
+	}
+
+	if (!hws_gws_support ||
+	    dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+	retval = pqm_set_gws(&p->pqm, args->queue_id, dev->gws);
+	mutex_unlock(&p->mutex);
+
+	args->first_gws = 0;
+	return retval;
+}
+
 static int kfd_ioctl_get_dmabuf_info(struct file *filep,
 		struct kfd_process *p, void *data)
 {
@@ -1761,6 +1789,8 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 	AMDKFD_IOCTL_DEF(AMDKFD_IOC_IMPORT_DMABUF,
 			kfd_ioctl_import_dmabuf, 0),
+	AMDKFD_IOCTL_DEF(AMDKFD_IOC_ALLOC_QUEUE_GWS,
+			kfd_ioctl_alloc_queue_gws, 0),
 };
 
 #define AMDKFD_CORE_IOCTL_COUNT	ARRAY_SIZE(amdkfd_ioctls)
 
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 20917c5..070d1bc 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -410,6 +410,21 @@ struct kfd_ioctl_unmap_memory_from_gpu_args {
 	__u32 n_success;	/* to/from KFD */
 };
 
+/* Allocate GWS for specific queue
+ *
+ * @gpu_id:    device identifier
+ * @queue_id:  queue's id that GWS is allocated for
+ * @num_gws:   how many GWS to allocate
+ * @first_gws: index of the first GWS allocated.
+ *             only support contiguous GWS allocation
+ */
+struct kfd_ioctl_alloc_queue_gws_args {
+	__u32 gpu_id;		/* to KFD */
+	__u32 queue_id;		/* to KFD */
+	__u32 num_gws;		/* to KFD */
+	__u32 first_gws;	/* from KFD */
+};
+
 struct kfd_ioctl_get_dmabuf_info_args {
 	__u64 size;		/* from KFD */
 	__u64 metadata_ptr;	/* to KFD */
@@ -529,7 +544,10 @@ enum kfd_mmio_remap {
 
 #define AMDKFD_IOC_IMPORT_DMABUF		\
 		AMDKFD_IOWR(0x1D, struct kfd_ioctl_import_dmabuf_args)
 
+#define AMDKFD_IOC_ALLOC_QUEUE_GWS		\
+		AMDKFD_IOWR(0x1E, struct kfd_ioctl_alloc_queue_gws_args)
+
 #define AMDKFD_COMMAND_START		0x01
-#define AMDKFD_COMMAND_END		0x1E
+#define AMDKFD_COMMAND_END		0x1F
 
 #endif
--
2.7.4
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/7] drm/amdgpu: Add interface to alloc gws from amdgpu
Add amdgpu_amdkfd interface to alloc and free gws from amdgpu Change-Id: I4eb418356e5a6051aa09c5e2c4a454263631d6ab Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 34 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 ++ 2 files changed, 36 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index a4780d5..4af3989 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -339,6 +339,40 @@ void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj) amdgpu_bo_unref(&(bo)); } +int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t size, + void **mem_obj) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)kgd; + struct amdgpu_bo *bo = NULL; + struct amdgpu_bo_param bp; + int r; + + memset(&bp, 0, sizeof(bp)); + bp.size = size; + bp.byte_align = 1; + bp.domain = AMDGPU_GEM_DOMAIN_GWS; + bp.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS; + bp.type = ttm_bo_type_device; + bp.resv = NULL; + + r = amdgpu_bo_create(adev, &bp, &bo); + if (r) { + dev_err(adev->dev, + "failed to allocate gws BO for amdkfd (%d)\n", r); + return r; + } + + *mem_obj = bo; + return 0; +} + +void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj) +{ + struct amdgpu_bo *bo = (struct amdgpu_bo *)mem_obj; + + amdgpu_bo_unref(&bo); +} + uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index 5700643..c00c974 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -153,6 +153,8 @@ int amdgpu_amdkfd_alloc_gtt_mem(struct kgd_dev *kgd, size_t size, void **mem_obj, uint64_t *gpu_addr, void **cpu_ptr, bool mqd_gfx9); void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj); +int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t 
size, void **mem_obj); +void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj); uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type); void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd, -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 3/7] drm/amdkfd: Allocate gws on device initialization
On device initialization, KFD allocates all (64) gws which is shared by all KFD processes. Change-Id: I1f1274b8d4f6a8ad08785e2791a9a79def75e913 Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 14 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 +++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 7b4ea24..9d1b026 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -553,6 +553,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, } else kfd->max_proc_per_quantum = hws_max_conc_proc; + /* Allocate global GWS that is shared by all KFD processes */ + if (hws_gws_support && amdgpu_amdkfd_alloc_gws(kfd->kgd, + amdgpu_amdkfd_get_num_gws(kfd->kgd), &kfd->gws)) { + dev_err(kfd_device, "Could not allocate %d gws\n", + amdgpu_amdkfd_get_num_gws(kfd->kgd)); + goto out; + } /* calculate max size of mqds needed for queues */ size = max_num_of_queues_per_device * kfd->device_info->mqd_size_aligned; @@ -576,7 +583,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, &kfd->gtt_start_gpu_addr, &kfd->gtt_start_cpu_ptr, false)) { dev_err(kfd_device, "Could not allocate %d bytes\n", size); - goto out; + goto alloc_gtt_mem_failure; } dev_info(kfd_device, "Allocated %d bytes on gart\n", size); @@ -646,6 +653,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, kfd_gtt_sa_fini(kfd); kfd_gtt_sa_init_error: amdgpu_amdkfd_free_gtt_mem(kfd->kgd, kfd->gtt_mem); +alloc_gtt_mem_failure: + if (hws_gws_support) + amdgpu_amdkfd_free_gws(kfd->kgd, kfd->gws); dev_err(kfd_device, "device %x:%x NOT added due to errors\n", kfd->pdev->vendor, kfd->pdev->device); @@ -663,6 +673,8 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd) kfd_doorbell_fini(kfd); kfd_gtt_sa_fini(kfd); amdgpu_amdkfd_free_gtt_mem(kfd->kgd, kfd->gtt_mem); + if (hws_gws_support) + amdgpu_amdkfd_free_gws(kfd->kgd, kfd->gws); } kfree(kfd); diff --git 
a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 338fb07..5969e37 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -288,6 +288,9 @@ struct kfd_dev { /* Compute Profile ref. count */ atomic_t compute_profile; + + /* Global GWS resource shared b/t processes*/ + void *gws; }; enum kfd_mempool { -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 5/7] drm/amdkfd: Add function to set queue gws
Add functions in process queue manager to set queue gws. Also set process's number of gws used. Currently only one queue in process can use and use all gws. Change-Id: I03e480c8692db3eabfc3a188cce8904d5d962ab7 Signed-off-by: Oak Zeng --- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 6 +++ .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 56 ++ 2 files changed, 62 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 5969e37..40a5c67 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -454,6 +454,9 @@ struct queue_properties { * * @device: The kfd device that created this queue. * + * @gws: Pointing to gws kgd_mem if this is a gws control queue; NULL + * otherwise. + * * This structure represents user mode compute queues. * It contains all the necessary data to handle such queues. * @@ -475,6 +478,7 @@ struct queue { struct kfd_process *process; struct kfd_dev *device; + void *gws; }; /* @@ -868,6 +872,8 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p); int pqm_set_cu_mask(struct process_queue_manager *pqm, unsigned int qid, struct queue_properties *p); +int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, + void *gws); struct kernel_queue *pqm_get_kernel_queue(struct process_queue_manager *pqm, unsigned int qid); int pqm_get_wave_state(struct process_queue_manager *pqm, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c index e652e25..d69e17a 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c @@ -26,6 +26,7 @@ #include "kfd_device_queue_manager.h" #include "kfd_priv.h" #include "kfd_kernel_queue.h" +#include "amdgpu_amdkfd.h" static inline struct process_queue_node *get_queue_by_qid( struct process_queue_manager *pqm, unsigned int qid) @@ -74,6 
+75,54 @@ void kfd_process_dequeue_from_device(struct kfd_process_device *pdd) pdd->already_dequeued = true; } +int pqm_set_gws(struct process_queue_manager *pqm, unsigned int qid, + void *gws) +{ + struct kfd_dev *dev = NULL; + struct process_queue_node *pqn; + struct kfd_process_device *pdd; + struct kgd_mem *mem; + int ret; + + /* Don't allow unset gws */ + if (!gws) + return -EINVAL; + + pqn = get_queue_by_qid(pqm, qid); + if (!pqn) { + pr_err("Queue id does not match any known queue\n"); + return -EINVAL; + } + + if (pqn->q) + dev = pqn->q->device; + if (WARN_ON(!dev)) + return -ENODEV; + + pdd = kfd_get_process_device_data(dev, pqm->process); + if (!pdd) { + pr_err("Process device data doesn't exist\n"); + return -EINVAL; + } + + /* Only allow one queue per process can have GWS assigned */ + list_for_each_entry(pqn, &pqm->queues, process_queue_list) { + if (pqn->q && pqn->q->gws) + return -EINVAL; + } + + ret = amdgpu_amdkfd_add_gws_to_process(pdd->process->kgd_process_info, + gws, &mem); + if (unlikely(ret)) + return ret; + + pqn->q->gws = mem; + pdd->qpd.num_gws = amdgpu_amdkfd_get_num_gws(dev->kgd); + + return pqn->q->device->dqm->ops.update_queue(pqn->q->device->dqm, + pqn->q); +} + void kfd_process_dequeue_from_all_devices(struct kfd_process *p) { struct kfd_process_device *pdd; @@ -330,6 +379,13 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid) if (retval != -ETIME) goto err_destroy_queue; } + + if (pqn->q->gws) { + amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info, + pqn->q->gws); + pdd->qpd.num_gws = 0; + } + kfree(pqn->q->properties.cu_mask); pqn->q->properties.cu_mask = NULL; uninit_queue(pqn->q); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On 5/21/19 6:04 PM, Kees Cook wrote:
> As an aside: I think Sparc ADI support in Linux actually side-stepped
> this[1] (i.e. chose "solution 1"): "All addresses passed to kernel must
> be non-ADI tagged addresses." (And sadly, "Kernel does not enable ADI
> for kernel code.") I think this was a mistake we should not repeat for
> arm64 (we do seem to be at least in agreement about this, I think).
>
> [1] https://lore.kernel.org/patchwork/patch/654481/

That is a very early version of the sparc ADI patch. Support for tagged
addresses in syscalls was added in later versions and is in the patch
that is in the kernel.

That part "Kernel does not enable ADI for kernel code." is correct. It
is a possible enhancement for the future.

--
Khalid
[PATCH] drm/amdgpu/ras: use GPU page size for page retirement
Bad pages are stored in GPU page size since they are VRAM pages.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index da1dc40b9b14..32906d6c621c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1408,8 +1408,8 @@ int amdgpu_ras_reserve_bad_pages(struct amdgpu_device *adev)
 	for (i = data->last_reserved; i < data->count; i++) {
 		bp = data->bps[i].bp;
 
-		if (amdgpu_ras_reserve_vram(adev, bp << PAGE_SHIFT,
-					PAGE_SIZE, &bo))
+		if (amdgpu_ras_reserve_vram(adev, bp << AMDGPU_GPU_PAGE_SHIFT,
+					AMDGPU_GPU_PAGE_SIZE, &bo))
 			DRM_ERROR("RAS ERROR: reserve vram %llx fail\n", bp);
 
 		data->bps[i].bo = bo;
--
2.20.1
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Thu, May 23, 2019 at 09:38:19AM -0700, Kees Cook wrote: > On Thu, May 23, 2019 at 03:44:49PM +0100, Catalin Marinas wrote: > > There is also the obvious requirement which I didn't mention: new user > > space continues to run on new/subsequent kernel versions. That's one of > > the points of contention for this series (ignoring MTE) with the > > maintainers having to guarantee this without much effort. IOW, do the > > 500K+ new lines in a subsequent kernel version break any user space out > > there? I'm only talking about the relaxed TBI ABI. Are the usual LTP, > > syskaller sufficient? Better static analysis would definitely help. > > We can't have perfect coverage of people actively (or accidentally) > working to trick static analyzers (and the compiler) into "forgetting" > about a __user annotation. We can certainly improve analysis (I see > the other sub-thread on that), but I'd like that work not to block > this series. I don't want to block this series either as it's a prerequisite for MTE but at the moment I'm not confident we tracked down all the places where things can break and what the future effort will be to analyse new kernel changes. We've made lots of progress since March last year (I think when the first version was posted) by both identifying new places in the kernel that need untagging and limiting the ranges that user space can tag (e.g. an mmap() on a device should not allow tagged pointers). Can we say we are done or more investigation is needed? > What on this front would you be comfortable with? Given it's a new > feature isn't it sufficient to have a CONFIG (and/or boot option)? I'd rather avoid re-building kernels. A boot option would do, unless we see value in a per-process (inherited) personality or prctl. The per-process option has the slight advantage that I can boot my machine normally while being able to run LTP with a relaxed ABI (and a new libc which tags pointers). 
I admit I don't have a very strong argument on a per-process opt-in. Another option would be a sysctl control together with a cmdline option. > > Or, if we ever want MTE to be turned on by default (i.e. tag checking), > > even if everything is tagged with 0, we have to disallow TBI for user > > and this includes hwasan. There were a small number of programs using > > the TBI (I think some JavaScript compilers tried this). But now we are > > bringing in the hwasan support and this can be a large user base. Shall > > we add an ELF note for such binaries that use TBI/hwasan? > > Just to be clear, you say "disallow TBI for user" -- you mean a > particular process, yes? i.e. there is no architectural limitation that > says once we're using MTE nothing can switch to TBI. i.e. a process is > either doing MTE or TBI (or nothing, but that's the same as TBI). I may have answered this below. The architecture does not allow TBI (i.e. faults) if MTE is enabled for a process and address range. > > > So there needs to be some way to let the kernel know which of three > > > things it should be doing: > > > 1- leaving userspace addresses as-is (present) > > > 2- wiping the top bits before using (this series) > > > > (I'd say tolerating rather than wiping since get_user still uses the tag > > in the current series) > > > > The current series does not allow any choice between 1 and 2, the > > default ABI basically becomes option 2. > > What about testing tools that intentionally insert high bits for syscalls > and are _expecting_ them to fail? It seems the TBI series will break them? > In that case, do we need to opt into TBI as well? If there are such tools, then we may need a per-process control. It's basically an ABI change. > > > 3- wiping the top bits for most things, but retaining them for MTE as > > >needed (the future) > > > > 2 and 3 are not entirely compatible as a tagged pointer may be checked > > against the memory colour by the hardware. 
So you can't have hwasan > > binary with MTE enabled. > > Right: a process must be either MTE or TBI, not both. Indeed. > > > I expect MTE to be the "default" in the future. Once a system's libc has > > > grown support for it, everything will be trying to use MTE. TBI will be > > > the special case (but TBI is effectively a prerequisite). > > > > The kernel handling of tagged pointers is indeed a prerequisite. The ABI > > distinction between the above 2 and 3 needs to be solved. > > Does that need solving now or when the MTE series appears? As there is > no reason to distinguish between "normal" and "TBI", that doesn't seem > to need solving at this point? We don't need to solve 2 vs 3 as long as option 3 is an explicit opt-in (prctl) by the user. Otherwise the kernel cannot guess whether the user is using TBI or not (and if it does and still opts in to MTE, we can say it's a user problem ;)). > > > To choose between "2" and "3", it seems we need a per-process flag to > > > opt into TBI (and out of MTE). > > > > Or leave option 2 the default and get it t
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Thu, May 23, 2019 at 08:44:12AM -0700, enh wrote: > On Thu, May 23, 2019 at 7:45 AM Catalin Marinas > wrote: > > On Wed, May 22, 2019 at 01:47:36PM -0700, Kees Cook wrote: > > > For userspace, how would a future binary choose TBI over MTE? If it's > > > a library issue, we can't use an ELF bit, since the choice may be > > > "late" after ELF load (this implies the need for a prctl().) If it's > > > binary-only ("built with HWKASan") then an ELF bit seems sufficient. > > > And without the marking, I'd expect the kernel to enforce MTE when > > > there are high bits. > > > > The current plan is that a future binary issues a prctl(), after > > checking the HWCAP_MTE bit (as I replied to Elliot, the MTE instructions > > are not in the current NOP space). I'd expect this to be done by the > > libc or dynamic loader under the assumption that the binaries it loads > > do _not_ use the top pointer byte for anything else. > > yeah, it sounds like to support hwasan and MTE, the dynamic linker > will need to not use either itself. > > > With hwasan compiled objects this gets more confusing (any ELF note > > to identify them?). > > no, at the moment code that wants to know checks for the presence of > __hwasan_init. (and bionic doesn't actually look at any ELF notes > right now.) but we can always add something if we need to. It's a userspace decision to make. In the kernel, we are proposing that bionic calls a prctl() to enable MTE explicitly. It could first check for the presence of __hwasan_init. -- Catalin
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Thu, May 23, 2019 at 11:42:57AM +0100, Dave P Martin wrote: > On Wed, May 22, 2019 at 09:20:52PM -0300, Jason Gunthorpe wrote: > > On Wed, May 22, 2019 at 02:49:28PM +0100, Dave Martin wrote: > > > If multiple people will care about this, perhaps we should try to > > > annotate types more explicitly in SYSCALL_DEFINEx() and ABI data > > > structures. > > > > > > For example, we could have a couple of mutually exclusive modifiers > > > > > > T __object * > > > T __vaddr * (or U __vaddr) > > > > > > In the first case the pointer points to an object (in the C sense) > > > that the call may dereference but not use for any other purpose. > > > > How would you use these two differently? > > > > So far the kernel has worked that __user should tag any pointer that > > is from userspace and then you can't do anything with it until you > > transform it into a kernel something > > Ultimately it would be good to disallow casting __object pointers execpt > to compatible __object pointer types, and to make get_user etc. demand > __object. > > __vaddr pointers / addresses would be freely castable, but not to > __object and so would not be dereferenceable even indirectly. I think it gets too complicated and there are ambiguous cases that we may not be able to distinguish. For example copy_from_user() may be used to copy a user data structure into the kernel, hence __object would work, while the same function may be used to copy opaque data to a file, so __vaddr may be a better option (unless I misunderstood your proposal). We currently have T __user * and I think it's a good starting point. The prior attempt [1] was shut down because it was just hiding the cast using __force. We'd need to work through those cases again and rather start changing the function prototypes to avoid unnecessary casting in the callers (e.g. 
get_user_pages(void __user *) or come up with a new type) while changing the explicit casting to a macro where it needs to be obvious that we are converting a user pointer, potentially typed (tagged), to an untyped address range. We may need a user_ptr_to_ulong() macro or similar (it seems that we have a u64_to_user_ptr, wasn't aware of it). It may actually not be far from what you suggested but I'd keep the current T __user * to denote possible dereference. [1] https://lore.kernel.org/lkml/5d54526e5ff2e5ad63d0dfdd9ab17cf359afa4f2.1535629099.git.andreyk...@google.com/ -- Catalin
Re: GPU passthrough support for Stoney [Radeon R2/R3/R4/R5 Graphics]?
On Wed, May 22, 2019 at 6:11 PM Alex Deucher wrote: > > On Wed, May 22, 2019 at 7:00 PM Micah Morton wrote: > > > > On Wed, May 22, 2019 at 1:39 PM Alex Deucher wrote: > > > > > > On Tue, May 21, 2019 at 1:46 PM Micah Morton wrote: > > > > > > > > On Fri, May 17, 2019 at 9:59 AM Alex Deucher > > > > wrote: > > > > > > > > > > On Fri, May 17, 2019 at 11:36 AM Micah Morton > > > > > wrote: > > > > > > > > > > > > On Thu, May 16, 2019 at 1:39 PM Alex Deucher > > > > > > wrote: > > > > > > > > > > > > > > On Thu, May 16, 2019 at 4:07 PM Micah Morton > > > > > > > wrote: > > > > > > > > > > > > > > > > On Wed, May 15, 2019 at 7:19 PM Alex Deucher > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Wed, May 15, 2019 at 2:26 PM Micah Morton > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Hi folks, > > > > > > > > > > > > > > > > > > > > I'm interested in running a VM on a system with an > > > > > > > > > > integrated Stoney > > > > > > > > > > [Radeon R2/R3/R4/R5 Graphics] card and passing through the > > > > > > > > > > graphics > > > > > > > > > > card to the VM using the IOMMU. I'm wondering whether this > > > > > > > > > > is feasible > > > > > > > > > > and supposed to be doable with the right setup (as opposed > > > > > > > > > > to passing > > > > > > > > > > a discrete GPU to the VM, which I think is definitely > > > > > > > > > > doable?). > > > > > > > > > > > > > > > > > > > > So far, I can do all the qemu/kvm/vfio/iommu stuff to run > > > > > > > > > > the VM and > > > > > > > > > > pass the integrated GPU to it, but the drm driver in the VM > > > > > > > > > > fails > > > > > > > > > > during amdgpu_device_init(). Specifically, the logs show > > > > > > > > > > the SMU being > > > > > > > > > > unresponsive, which leads to a 'SMU firmware load failed' > > > > > > > > > > error > > > > > > > > > > message and kernel panic. 
I can share VM logs and the > > > > > > > > > > invocation of > > > > > > > > > > qemu and such if helpful, but first wanted to know at a > > > > > > > > > > high level if > > > > > > > > > > this should be feasible? > > > > > > > > > > > > > > > > > > > > P.S.: I'm not initializing the GPU in the host bios or host > > > > > > > > > > kernel at > > > > > > > > > > all, so I should be passing a fresh GPU to the VM. Also, > > > > > > > > > > I'm pretty > > > > > > > > > > sure I'm running the correct VGA bios for this GPU in the > > > > > > > > > > guest VM > > > > > > > > > > bios before guest boot. > > > > > > > > > > > > > > > > > > > > Any comments/suggestions would be appreciated! > > > > > > > > > > > > > > > > > > It should work in at least once as long as your vm is > > > > > > > > > properly set up. > > > > > > > > > > > > > > > > Is there any reason running coreboot vs UEFI at host boot would > > > > > > > > make a > > > > > > > > difference? I was running a modified version of coreboot that > > > > > > > > avoids > > > > > > > > doing any GPU initialization in firmware -- so the first POST > > > > > > > > happens > > > > > > > > inside the guest. > > > > > > > > > > > > > > The GPU on APUs shares a bunch of resources with the CPU. There > > > > > > > are a > > > > > > > bunch of blocks which are shared and need to be initialized on > > > > > > > both > > > > > > > for everything to work properly. > > > > > > > > > > > > Interesting. So skipping running the vbios in the host and waiting > > > > > > until running it for the first time in the guest SeaBIOS is a bad > > > > > > idea? Would it be better to let APU+CPU initialize normally in the > > > > > > host and then skip trying to run the vbios in guest SeaBIOS and just > > > > > > do some kind of reset before the drm driver starts accessing it from > > > > > > the guest? > > > > > > > > > > If you let the sbios initialize things, it should work. 
The driver > > > > > will do the right thing to init the card when it loads whether its > > > > > running on bare metal or in a VM. We've never tested any scenarios > > > > > where the GPU on APUs is not handled by the sbios. Note that the GPU > > > > > does not have to be posted per se, it just needs to have been properly > > > > > taken into account when the sbios comes up so that shared components > > > > > are initialized correctly. I don't know what your patched system does > > > > > or doesn't do with respect to the platform initialization. > > > > > > > > So it sounds like you are suggesting the following: > > > > > > > > a) Run the vbios as part of the host sbios to initialize stuff and > > > > patch up the vbios > > > > > > > > b) Don't run the drm/amdgpu driver in the host, wait until the guest > > > > for that > > > > > > > > c) Avoid running the vbios (again) in the guest sbios since it has > > > > already been run. Although since "the driver needs access to the vbios > > > > image in the guest to get device specific configuration details" I > > > > should still do '-device > > > > vfio-pci,...,romfile=
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Thu, May 23, 2019 at 03:44:49PM +0100, Catalin Marinas wrote: > There is also the obvious requirement which I didn't mention: new user > space continues to run on new/subsequent kernel versions. That's one of > the points of contention for this series (ignoring MTE) with the > maintainers having to guarantee this without much effort. IOW, do the > 500K+ new lines in a subsequent kernel version break any user space out > there? I'm only talking about the relaxed TBI ABI. Are the usual LTP, > syskaller sufficient? Better static analysis would definitely help. We can't have perfect coverage of people actively (or accidentally) working to trick static analyzers (and the compiler) into "forgetting" about a __user annotation. We can certainly improve analysis (I see the other sub-thread on that), but I'd like that work not to block this series. What on this front would you be comfortable with? Given it's a new feature isn't it sufficient to have a CONFIG (and/or boot option)? > Or, if we ever want MTE to be turned on by default (i.e. tag checking), > even if everything is tagged with 0, we have to disallow TBI for user > and this includes hwasan. There were a small number of programs using > the TBI (I think some JavaScript compilers tried this). But now we are > bringing in the hwasan support and this can be a large user base. Shall > we add an ELF note for such binaries that use TBI/hwasan? Just to be clear, you say "disallow TBI for user" -- you mean a particular process, yes? i.e. there is no architectural limitation that says once we're using MTE nothing can switch to TBI. i.e. a process is either doing MTE or TBI (or nothing, but that's the same as TBI). > This needs solving as well. Most driver developers won't know why > untagged_addr() is needed unless we have more rigorous types or type > annotations and a tool to check them (we should probably revive the old > sparse thread). This seems like a parallel concern: we can do that separately from this series. 
Without landing it, is it much harder for people to test it, look for bugs, help with types/annotations, etc. > > So there needs to be some way to let the kernel know which of three > > things it should be doing: > > 1- leaving userspace addresses as-is (present) > > 2- wiping the top bits before using (this series) > > (I'd say tolerating rather than wiping since get_user still uses the tag > in the current series) > > The current series does not allow any choice between 1 and 2, the > default ABI basically becomes option 2. What about testing tools that intentionally insert high bits for syscalls and are _expecting_ them to fail? It seems the TBI series will break them? In that case, do we need to opt into TBI as well? > > 3- wiping the top bits for most things, but retaining them for MTE as > >needed (the future) > > 2 and 3 are not entirely compatible as a tagged pointer may be checked > against the memory colour by the hardware. So you can't have hwasan > binary with MTE enabled. Right: a process must be either MTE or TBI, not both. > > I expect MTE to be the "default" in the future. Once a system's libc has > > grown support for it, everything will be trying to use MTE. TBI will be > > the special case (but TBI is effectively a prerequisite). > > The kernel handling of tagged pointers is indeed a prerequisite. The ABI > distinction between the above 2 and 3 needs to be solved. Does that need solving now or when the MTE series appears? As there is no reason to distinguish between "normal" and "TBI", that doesn't seem to need solving at this point? > > AFAICT, the only difference I see between 2 and 3 will be the tag handling > > in usercopy (all other places will continue to ignore the top bits). Is > > that accurate? > > Yes, mostly (for the kernel). If MTE is enabled by default for a hwasan > binary, it will SEGFAULT (either in user space or in kernel uaccess). > How does the kernel choose between 2 and 3? Right -- that was my question as well. 
> > Is "1" a per-process state we want to keep? (I assume not, but rather it > > is available via no TBI/MTE CONFIG or a boot-time option, if at all?) > > Possibly, though not necessarily per process. For testing or if > something goes wrong during boot, a command line option with a static > label would do. The AT_FLAGS bit needs to be checked by user space. My > preference would be per-process. I would agree. > > To choose between "2" and "3", it seems we need a per-process flag to > > opt into TBI (and out of MTE). > > Or leave option 2 the default and get it to opt in to MTE. Given that MTE has to "start" at some point in the binary lifetime, I'm fine with opting into MTE. I do expect, though, this will feel redundant in a couple years as everything will immediately opt-in. But, okay, this is therefore an issue for the MTE series. > The current plan is that a future binary issues a prctl(), after > checking the HWCAP_MTE bit (as I replied to Elliot, the MTE instruction
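[Editor's note: the "wiping" of tag bits discussed above boils down to a sign-extension from bit 55, which zeroes the tag byte of a user pointer while leaving kernel pointers untouched. The sketch below is a standalone model of the idea behind the series' untagged_addr(); the helper name and the shift trick are illustrative, not the kernel's literal implementation.]

```c
#include <stdint.h>

/*
 * Standalone model of top-byte-ignore (TBI) untagging: sign-extend the
 * pointer from bit 55. A user pointer (bit 55 clear) gets its tag byte
 * zeroed; a kernel pointer (bit 55 set) comes back unchanged. Relies on
 * signed right shift behaving arithmetically, as it does on the
 * compilers relevant here.
 */
static inline uint64_t untag_user_addr(uint64_t addr)
{
	/* Shift the tag byte out, then arithmetic-shift copies of bit 55
	 * back in. */
	return (uint64_t)(((int64_t)(addr << 8)) >> 8);
}
```

Under this model, a tagged user pointer like 0x5600007fffff1000 untags to 0x00007fffff1000, while an already-untagged user or kernel address passes through unchanged, which is why get_user-style accesses can tolerate tags without any per-call handling.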
[PATCH] drm/amdgpu: enable PCIE atomics ops support
GPU atomic operations depend on PCIe atomics support. Always enable PCIe atomic ops support in case it hasn't been enabled yet. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index bdd1fe73..a2c6064 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2562,6 +2562,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (adev->rio_mem == NULL) DRM_INFO("PCI I/O BAR is not found.\n"); + /* enable PCIE atomic ops */ + r = pci_enable_atomic_ops_to_root(adev->pdev, + PCI_EXP_DEVCAP2_ATOMIC_COMP32 | + PCI_EXP_DEVCAP2_ATOMIC_COMP64); + if (r) + DRM_INFO("PCIE atomic ops is not supported\n"); + amdgpu_device_get_pcie_info(adev); /* early init functions */ -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
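[Editor's note: the capability test behind pci_enable_atomic_ops_to_root() is essentially a bitmask check against the root port's Device Capabilities 2 register. A minimal standalone model follows; the mask values mirror the Linux PCI_EXP_DEVCAP2_ATOMIC_COMP32/64 definitions, but the helper itself is illustrative, not kernel code.]

```c
#include <stdbool.h>
#include <stdint.h>

/* AtomicOp completer-capability bits of the PCIe DevCap2 register
 * (values match the Linux PCI_EXP_DEVCAP2_ATOMIC_COMP32/64 macros). */
#define ATOMIC_COMP32 0x0080u
#define ATOMIC_COMP64 0x0100u

/*
 * Illustrative model: enabling atomics can only succeed if every
 * requested completer width is advertised in the DevCap2 register.
 */
static bool atomic_caps_supported(uint32_t devcap2, uint32_t required)
{
	return (devcap2 & required) == required;
}
```

This mirrors why the patch only logs an informational message on failure: a root port lacking 64-bit completion simply reports the bit clear, and the driver carries on without PCIe atomics rather than failing init.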
Re: [PATCH] drm/amdgpu: Need to set the baco cap before baco reset
On Thu, May 23, 2019 at 6:22 AM Emily Deng wrote: > > For passthrough, after rebooted the VM, driver will do > a baco reset before doing other driver initialization during loading > driver. For doing the baco reset, it will first > check the baco reset capability. So first need to set the > cap from the vbios information or baco reset won't be > enabled. > > Signed-off-by: Emily Deng > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 > drivers/gpu/drm/amd/amdgpu/soc15.c | 3 ++- > drivers/gpu/drm/amd/include/kgd_pp_interface.h | 1 + > drivers/gpu/drm/amd/powerplay/amd_powerplay.c | 16 +++ > drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 1 + > .../amd/powerplay/hwmgr/vega10_processpptables.c | 24 > ++ > .../amd/powerplay/hwmgr/vega10_processpptables.h | 1 + > drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 + > 8 files changed, 54 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index bdd1fe73..2dde672 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -2611,6 +2611,14 @@ int amdgpu_device_init(struct amdgpu_device *adev, > * E.g., driver was not cleanly unloaded previously, etc. > */ > if (!amdgpu_sriov_vf(adev) && amdgpu_asic_need_reset_on_init(adev)) { > + if (amdgpu_passthrough(adev) && adev->powerplay.pp_funcs && > adev->powerplay.pp_funcs->set_asic_baco_cap) { > + r = > adev->powerplay.pp_funcs->set_asic_baco_cap(adev->powerplay.pp_handle); > + if (r) { > + dev_err(adev->dev, "set baco capability > failed\n"); > + goto failed; > + } > + } > + I think it would be cleaner to add this to hwmgr_early_init() or something called from early init for powerplay. 
Alex > r = amdgpu_asic_reset(adev); > if (r) { > dev_err(adev->dev, "asic reset on init failed\n"); > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c > b/drivers/gpu/drm/amd/amdgpu/soc15.c > index 78bd4fc..d9fdd95 100644 > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c > @@ -764,7 +764,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device > *adev) > /* Just return false for soc15 GPUs. Reset does not seem to > * be necessary. > */ > - return false; > + if (!amdgpu_passthrough(adev)) > + return false; > > if (adev->flags & AMD_IS_APU) > return false; > diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h > b/drivers/gpu/drm/amd/include/kgd_pp_interface.h > index 9f661bf..c6e2a51 100644 > --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h > +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h > @@ -296,6 +296,7 @@ struct amd_pm_funcs { > int (*set_hard_min_fclk_by_freq)(void *handle, uint32_t clock); > int (*set_min_deep_sleep_dcefclk)(void *handle, uint32_t clock); > int (*get_asic_baco_capability)(void *handle, bool *cap); > + int (*set_asic_baco_cap)(void *handle); > int (*get_asic_baco_state)(void *handle, int *state); > int (*set_asic_baco_state)(void *handle, int state); > int (*get_ppfeature_status)(void *handle, char *buf); > diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > index bea1587..9856760 100644 > --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > @@ -1404,6 +1404,21 @@ static int pp_set_active_display_count(void *handle, > uint32_t count) > return ret; > } > > +static int pp_set_asic_baco_cap(void *handle) > +{ > + struct pp_hwmgr *hwmgr = handle; > + > + if (!hwmgr) > + return -EINVAL; > + > + if (!hwmgr->pm_en || !hwmgr->hwmgr_func->set_asic_baco_cap) > + return 0; > + > + hwmgr->hwmgr_func->set_asic_baco_cap(hwmgr); > + > + return 0; > +} > + > static int 
pp_get_asic_baco_capability(void *handle, bool *cap) > { > struct pp_hwmgr *hwmgr = handle; > @@ -1546,6 +1561,7 @@ static const struct amd_pm_funcs pp_dpm_funcs = { > .set_hard_min_dcefclk_by_freq = pp_set_hard_min_dcefclk_by_freq, > .set_hard_min_fclk_by_freq = pp_set_hard_min_fclk_by_freq, > .get_asic_baco_capability = pp_get_asic_baco_capability, > + .set_asic_baco_cap = pp_set_asic_baco_cap, > .get_asic_baco_state = pp_get_asic_baco_state, > .set_asic_baco_state = pp_set_asic_baco_state, > .get_ppfeature_status = pp_get_ppfeature_status, > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c > b/drivers/gpu/drm/amd/powerplay/hwmgr/vega1
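[Editor's note: the pp_set_asic_baco_cap() wrapper quoted above follows powerplay's usual dispatch pattern: a NULL handle is an error, a missing optional hwmgr callback is treated as silent success, and otherwise the call is forwarded. A standalone sketch of that pattern, with hypothetical names:]

```c
#include <errno.h>

/* Minimal stand-in for struct pp_hwmgr: an enable flag plus one
 * optional callback slot. */
struct hwmgr_model {
	int pm_en;
	int (*set_cap)(struct hwmgr_model *hwmgr);
};

/* Example callback standing in for a hwmgr_func implementation. */
static int sample_set_cap(struct hwmgr_model *hwmgr)
{
	(void)hwmgr;
	return 0;
}

/*
 * Model of the wrapper pattern in pp_set_asic_baco_cap(): reject a NULL
 * handle, no-op when powerplay is disabled or the ASIC doesn't provide
 * the callback, otherwise dispatch to the hwmgr implementation.
 */
static int set_cap_wrapper(struct hwmgr_model *hwmgr)
{
	if (!hwmgr)
		return -EINVAL;
	if (!hwmgr->pm_en || !hwmgr->set_cap)
		return 0;
	return hwmgr->set_cap(hwmgr);
}
```

The design choice here is that only a missing handle is a caller bug; a missing callback just means the ASIC (e.g. anything other than Vega10 in the quoted patch) has nothing to do, so returning 0 keeps the common init path unconditional.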
Re: Quick question for the mobile Raven Ridge auto-rotate function
On Thu, May 23, 2019 at 1:41 AM Luya Tshimbalanga wrote: > > Hello team, > > Thank you for making mobile Raven Ridge nearly fully functional with the > open source driver for multiple devices like HP Envy x360 Ryzen 2500u. > However, missing is the ability to auto-rotate the screen when switching > from landscape to portrait in tablet mode. > > Which channel will be better to request enabling that function in open > source driver? See the related issue below: > > Red Hat bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1651886 > > Linux kernel report: https://bugzilla.kernel.org/show_bug.cgi?id=199715 > > I will be willing to test such driver to report the functionality. > On most programs the accelerometers were provided by the OEM as part of their platform design. I'm not sure how it works on newer programs. I'll ask around internally. Alex > > Thanks in advance, > > Luya Tshimbalanga > Fedora Design Team
Re: [PATCH] Revert "drm/amdgpu: Need to set the baco cap before baco reset"
On Thu, May 23, 2019 at 2:24 AM Emily Deng wrote: > > This reverts commit 13f2a375b58918873833cf6859332f960c6cf922. > > Signed-off-by: Emily Deng Please go ahead and revert. Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +- > drivers/gpu/drm/amd/include/kgd_pp_interface.h | 1 - > drivers/gpu/drm/amd/powerplay/amd_powerplay.c | 16 > drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 1 - > .../amd/powerplay/hwmgr/vega10_processpptables.c | 22 > -- > .../amd/powerplay/hwmgr/vega10_processpptables.h | 1 - > drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 - > 7 files changed, 1 insertion(+), 51 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 242c38c..bdd1fe73 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -2610,15 +2610,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, > /* check if we need to reset the asic > * E.g., driver was not cleanly unloaded previously, etc. 
> */ > - if (amdgpu_passthrough(adev) && amdgpu_asic_need_reset_on_init(adev)) > { > - if (adev->powerplay.pp_funcs && > adev->powerplay.pp_funcs->set_asic_baco_cap) { > - r = > adev->powerplay.pp_funcs->set_asic_baco_cap(adev->powerplay.pp_handle); > - if (r) { > - dev_err(adev->dev, "set baco capability > failed\n"); > - goto failed; > - } > - } > - > + if (!amdgpu_sriov_vf(adev) && amdgpu_asic_need_reset_on_init(adev)) { > r = amdgpu_asic_reset(adev); > if (r) { > dev_err(adev->dev, "asic reset on init failed\n"); > diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h > b/drivers/gpu/drm/amd/include/kgd_pp_interface.h > index c6e2a51..9f661bf 100644 > --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h > +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h > @@ -296,7 +296,6 @@ struct amd_pm_funcs { > int (*set_hard_min_fclk_by_freq)(void *handle, uint32_t clock); > int (*set_min_deep_sleep_dcefclk)(void *handle, uint32_t clock); > int (*get_asic_baco_capability)(void *handle, bool *cap); > - int (*set_asic_baco_cap)(void *handle); > int (*get_asic_baco_state)(void *handle, int *state); > int (*set_asic_baco_state)(void *handle, int state); > int (*get_ppfeature_status)(void *handle, char *buf); > diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > index 9856760..bea1587 100644 > --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c > @@ -1404,21 +1404,6 @@ static int pp_set_active_display_count(void *handle, > uint32_t count) > return ret; > } > > -static int pp_set_asic_baco_cap(void *handle) > -{ > - struct pp_hwmgr *hwmgr = handle; > - > - if (!hwmgr) > - return -EINVAL; > - > - if (!hwmgr->pm_en || !hwmgr->hwmgr_func->set_asic_baco_cap) > - return 0; > - > - hwmgr->hwmgr_func->set_asic_baco_cap(hwmgr); > - > - return 0; > -} > - > static int pp_get_asic_baco_capability(void *handle, bool *cap) > { > struct pp_hwmgr *hwmgr = handle; > @@ 
-1561,7 +1546,6 @@ static const struct amd_pm_funcs pp_dpm_funcs = { > .set_hard_min_dcefclk_by_freq = pp_set_hard_min_dcefclk_by_freq, > .set_hard_min_fclk_by_freq = pp_set_hard_min_fclk_by_freq, > .get_asic_baco_capability = pp_get_asic_baco_capability, > - .set_asic_baco_cap = pp_set_asic_baco_cap, > .get_asic_baco_state = pp_get_asic_baco_state, > .set_asic_baco_state = pp_set_asic_baco_state, > .get_ppfeature_status = pp_get_ppfeature_status, > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c > b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c > index e5bcbc8..ce6aeb5 100644 > --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c > +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c > @@ -5302,7 +5302,6 @@ static const struct pp_hwmgr_func vega10_hwmgr_funcs = { > .odn_edit_dpm_table = vega10_odn_edit_dpm_table, > .get_performance_level = vega10_get_performance_level, > .get_asic_baco_capability = smu9_baco_get_capability, > - .set_asic_baco_cap = vega10_baco_set_cap, > .get_asic_baco_state = smu9_baco_get_state, > .set_asic_baco_state = vega10_baco_set_state, > .enable_mgpu_fan_boost = vega10_enable_mgpu_fan_boost, > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c > b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c > index 8fdeb23..b6767d7 100644 > --- a/drivers
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Thu, May 23, 2019 at 7:45 AM Catalin Marinas wrote: > > On Wed, May 22, 2019 at 01:47:36PM -0700, Kees Cook wrote: > > On Wed, May 22, 2019 at 05:35:27PM +0100, Catalin Marinas wrote: > > > The two hard requirements I have for supporting any new hardware feature > > > in Linux are (1) a single kernel image binary continues to run on old > > > hardware while making use of the new feature if available and (2) old > > > user space continues to run on new hardware while new user space can > > > take advantage of the new feature. > > > > Agreed! And I think the series meets these requirements, yes? > > Yes. I mentioned this just to make sure people don't expect different > kernel builds for different hardware features. > > There is also the obvious requirement which I didn't mention: new user > space continues to run on new/subsequent kernel versions. That's one of > the points of contention for this series (ignoring MTE) with the > maintainers having to guarantee this without much effort. IOW, do the > 500K+ new lines in a subsequent kernel version break any user space out > there? I'm only talking about the relaxed TBI ABI. Are the usual LTP, > syskaller sufficient? Better static analysis would definitely help. > > > > For MTE, we just can't enable it by default since there are applications > > > who use the top byte of a pointer and expect it to be ignored rather > > > than failing with a mismatched tag. Just think of a hwasan compiled > > > binary where TBI is expected to work and you try to run it with MTE > > > turned on. > > > > Ah! Okay, here's the use-case I wasn't thinking of: the concern is TBI > > conflicting with MTE. And anything that starts using TBI suddenly can't > > run in the future because it's being interpreted as MTE bits? (Is that > > the ABI concern? > > That's another aspect to figure out when we add the MTE support. I don't > think we'd be able to do this without an explicit opt-in by the user. 
> > Or, if we ever want MTE to be turned on by default (i.e. tag checking), > even if everything is tagged with 0, we have to disallow TBI for user > and this includes hwasan. There were a small number of programs using > the TBI (I think some JavaScript compilers tried this). But now we are > bringing in the hwasan support and this can be a large user base. Shall > we add an ELF note for such binaries that use TBI/hwasan? > > This series is still required for MTE but we may decide not to relax the > ABI blindly, therefore the opt-in (prctl) or personality idea. > > > I feel like we got into the weeds about ioctl()s and one-off bugs...) > > This needs solving as well. Most driver developers won't know why > untagged_addr() is needed unless we have more rigorous types or type > annotations and a tool to check them (we should probably revive the old > sparse thread). > > > So there needs to be some way to let the kernel know which of three > > things it should be doing: > > 1- leaving userspace addresses as-is (present) > > 2- wiping the top bits before using (this series) > > (I'd say tolerating rather than wiping since get_user still uses the tag > in the current series) > > The current series does not allow any choice between 1 and 2, the > default ABI basically becomes option 2. > > > 3- wiping the top bits for most things, but retaining them for MTE as > >needed (the future) > > 2 and 3 are not entirely compatible as a tagged pointer may be checked > against the memory colour by the hardware. So you can't have hwasan > binary with MTE enabled. > > > I expect MTE to be the "default" in the future. Once a system's libc has > > grown support for it, everything will be trying to use MTE. TBI will be > > the special case (but TBI is effectively a prerequisite). > > The kernel handling of tagged pointers is indeed a prerequisite. The ABI > distinction between the above 2 and 3 needs to be solved. 
> > > AFAICT, the only difference I see between 2 and 3 will be the tag handling > > in usercopy (all other places will continue to ignore the top bits). Is > > that accurate? > > Yes, mostly (for the kernel). If MTE is enabled by default for a hwasan > binary, it will SEGFAULT (either in user space or in kernel uaccess). > How does the kernel choose between 2 and 3? > > > Is "1" a per-process state we want to keep? (I assume not, but rather it > > is available via no TBI/MTE CONFIG or a boot-time option, if at all?) > > Possibly, though not necessarily per process. For testing or if > something goes wrong during boot, a command line option with a static > label would do. The AT_FLAGS bit needs to be checked by user space. My > preference would be per-process. > > > To choose between "2" and "3", it seems we need a per-process flag to > > opt into TBI (and out of MTE). > > Or leave option 2 the default and get it to opt in to MTE. > > > For userspace, how would a future binary choose TBI over MTE? If it's > > a library issue, we can't use an ELF bit, since the choice may be > > "late" after ELF loa
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Wed, May 22, 2019 at 09:58:22AM -0700, enh wrote: > i was questioning the argument about the ioctl issues, and saying that > from my perspective, untagging bugs are not really any different than > any other kind of kernel bug. Once this series gets in, they are indeed just kernel bugs. What I want is an easier way to identify them, ideally before they trigger in the field. > i still don't see how this isn't just a regular testing/CI issue, the > same as any other kind of kernel bug. it's already the case that i can > get a bad kernel... The testing would have a smaller code coverage in terms of drivers, filesystems than something like a static checker (though one does not exclude the other). -- Catalin
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Wed, May 22, 2019 at 12:21:27PM -0700, Kees Cook wrote: > If a process wants to not tag, that's also up to the allocator where > it can decide not to ask the kernel, and just not tag. Nothing breaks in > userspace if a process is NOT tagging and untagged_addr() exists or is > missing. This, I think, is the core way this doesn't trip over the > golden rule: an old system image will run fine (because it's not > tagging). A *new* system may encounter bugs with tagging because it's a > new feature: this is The Way Of Things. But we don't break old userspace > because old userspace isn't using tags. With this series and hwasan binaries, at some point in the future they will be considered "old userspace" and they do use pointer tags which expect to be ignored by both the hardware and the kernel. MTE breaks this assumption. -- Catalin
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Wed, May 22, 2019 at 01:47:36PM -0700, Kees Cook wrote: > On Wed, May 22, 2019 at 05:35:27PM +0100, Catalin Marinas wrote: > > The two hard requirements I have for supporting any new hardware feature > > in Linux are (1) a single kernel image binary continues to run on old > > hardware while making use of the new feature if available and (2) old > > user space continues to run on new hardware while new user space can > > take advantage of the new feature. > > Agreed! And I think the series meets these requirements, yes? Yes. I mentioned this just to make sure people don't expect different kernel builds for different hardware features. There is also the obvious requirement which I didn't mention: new user space continues to run on new/subsequent kernel versions. That's one of the points of contention for this series (ignoring MTE) with the maintainers having to guarantee this without much effort. IOW, do the 500K+ new lines in a subsequent kernel version break any user space out there? I'm only talking about the relaxed TBI ABI. Are the usual LTP, syskaller sufficient? Better static analysis would definitely help. > > For MTE, we just can't enable it by default since there are applications > > who use the top byte of a pointer and expect it to be ignored rather > > than failing with a mismatched tag. Just think of a hwasan compiled > > binary where TBI is expected to work and you try to run it with MTE > > turned on. > > Ah! Okay, here's the use-case I wasn't thinking of: the concern is TBI > conflicting with MTE. And anything that starts using TBI suddenly can't > run in the future because it's being interpreted as MTE bits? (Is that > the ABI concern? That's another aspect to figure out when we add the MTE support. I don't think we'd be able to do this without an explicit opt-in by the user. Or, if we ever want MTE to be turned on by default (i.e. tag checking), even if everything is tagged with 0, we have to disallow TBI for user and this includes hwasan. 
There were a small number of programs using the TBI (I think some JavaScript compilers tried this). But now we are bringing in the hwasan support and this can be a large user base. Shall we add an ELF note for such binaries that use TBI/hwasan? This series is still required for MTE but we may decide not to relax the ABI blindly, therefore the opt-in (prctl) or personality idea. > I feel like we got into the weeds about ioctl()s and one-off bugs...) This needs solving as well. Most driver developers won't know why untagged_addr() is needed unless we have more rigorous types or type annotations and a tool to check them (we should probably revive the old sparse thread). > So there needs to be some way to let the kernel know which of three > things it should be doing: > 1- leaving userspace addresses as-is (present) > 2- wiping the top bits before using (this series) (I'd say tolerating rather than wiping since get_user still uses the tag in the current series) The current series does not allow any choice between 1 and 2, the default ABI basically becomes option 2. > 3- wiping the top bits for most things, but retaining them for MTE as >needed (the future) 2 and 3 are not entirely compatible as a tagged pointer may be checked against the memory colour by the hardware. So you can't have hwasan binary with MTE enabled. > I expect MTE to be the "default" in the future. Once a system's libc has > grown support for it, everything will be trying to use MTE. TBI will be > the special case (but TBI is effectively a prerequisite). The kernel handling of tagged pointers is indeed a prerequisite. The ABI distinction between the above 2 and 3 needs to be solved. > AFAICT, the only difference I see between 2 and 3 will be the tag handling > in usercopy (all other places will continue to ignore the top bits). Is > that accurate? Yes, mostly (for the kernel). If MTE is enabled by default for a hwasan binary, it will SEGFAULT (either in user space or in kernel uaccess). 
How does the kernel choose between 2 and 3? > Is "1" a per-process state we want to keep? (I assume not, but rather it > is available via no TBI/MTE CONFIG or a boot-time option, if at all?) Possibly, though not necessarily per process. For testing or if something goes wrong during boot, a command line option with a static label would do. The AT_FLAGS bit needs to be checked by user space. My preference would be per-process. > To choose between "2" and "3", it seems we need a per-process flag to > opt into TBI (and out of MTE). Or leave option 2 the default and get it to opt in to MTE. > For userspace, how would a future binary choose TBI over MTE? If it's > a library issue, we can't use an ELF bit, since the choice may be > "late" after ELF load (this implies the need for a prctl().) If it's > binary-only ("built with HWKASan") then an ELF bit seems sufficient. > And without the marking, I'd expect the kernel to enforce MTE when > there are high bits. The current plan is tha
Re: [PATCH] drm/amdgpu: sort probed modes before adding common modes
On 5/23/19 10:03 AM, Mohan Marimuthu, Yogesh wrote: > > [Why] > There are monitors which can have more than one preferred mode > set. There are chances in these monitors that if common modes are > added in function amdgpu_dm_connector_add_common_modes(), these > common modes can be calculated with different preferred mode than > the one used in function decide_crtc_timing_for_drm_display_mode(). > The preferred mode can be different because after common modes > are added, the mode list is sorted and this changes the order of > preferred modes in the list. The first mode in the list with > preferred flag set is selected as preferred mode. Due to this the > preferred mode selected varies. > If same preferred mode is not selected in common mode calculation > and crtc timing, then during mode set instead of setting preferred > timing, common mode timing will be applied which can cause "out of > range" message in the monitor with monitor blanking out. > > [How] > Sort the modes before adding common modes. The same sorting function > is called during common mode addition and deciding crtc timing. > > Signed-off-by: Yogesh Mohan Marimuthu > Change-Id: I48036a476ad621de022e2fda5e1861e72e7e8c30 With the patch title fixed to start with "drm/amd/display:", this is: Reviewed-by: Nicholas Kazlauskas So the sorting normally seems to happen at the end of drm_helper_probe_single_connector_modes, or after both get_modes and fill_modes have been called. So sorting the list in advance ahead of DRM shouldn't hurt here. This should help the order at least match the order we query for the preferred mode later on when creating the sink for the stream, but it still is just masking the overall problem of not checking the right mode for deciding when to scale or not. We'll probably need to either add a private flag to the common modes to indicate that they should be scaled, or check the scaling against what we thought was the native mode, i.e. the one from the encoder.
Nicholas Kazlauskas > --- > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 4a1755bce..418e3956a 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -4677,6 +4677,15 @@ static void amdgpu_dm_connector_ddc_get_modes(struct > drm_connector *connector, > amdgpu_dm_connector->num_modes = > drm_add_edid_modes(connector, edid); > > + /* sorting the probed modes before calling function > +* amdgpu_dm_get_native_mode() since EDID can have > +* more than one preferred mode. The modes that are > +* later in the probed mode list could be of higher > +* and preferred resolution. For example, 3840x2160 > +* resolution in base EDID preferred timing and 4096x2160 > +* preferred resolution in DID extension block later. > +*/ > + drm_mode_sort(&connector->probed_modes); > amdgpu_dm_get_native_mode(connector); > } else { > amdgpu_dm_connector->num_modes = 0; > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
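[Editor's note: the interaction described in the commit message can be reproduced in a few lines. With two preferred modes in the list, whichever preferred mode sorts first becomes the native mode, so both call sites must see the same order. The comparator below is a simplified model in the spirit of drm_mode_sort() (the real comparator also considers refresh and clock); all names are illustrative.]

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a probed display mode. */
struct probe_mode { int w, h; bool preferred; };

/* Simplified ordering: preferred modes first, then larger resolution,
 * loosely mirroring drm_mode_sort()'s comparator. */
static int mode_cmp(const void *pa, const void *pb)
{
	const struct probe_mode *a = pa, *b = pb;

	if (a->preferred != b->preferred)
		return b->preferred - a->preferred;
	return b->w * b->h - a->w * a->h;
}

/* Both common-mode addition and crtc timing selection take the first
 * preferred mode in list order, so sorting first makes them agree. */
static const struct probe_mode *pick_native(struct probe_mode *m, size_t n)
{
	qsort(m, n, sizeof(*m), mode_cmp);
	for (size_t i = 0; i < n; i++)
		if (m[i].preferred)
			return &m[i];
	return n ? &m[0] : NULL;
}

/* Reproduce the example from the patch comment: 3840x2160 preferred in
 * the base EDID, 4096x2160 preferred in a DID extension block later in
 * the list. After sorting, 4096x2160 is consistently chosen. */
static int native_width_demo(void)
{
	struct probe_mode modes[] = {
		{ 3840, 2160, true },
		{ 1920, 1080, false },
		{ 4096, 2160, true },
	};

	return pick_native(modes, 3)->w;
}
```

Without the up-front sort, one call site could walk the unsorted list (picking 3840x2160) while the other walks the sorted one (picking 4096x2160), which is exactly the mismatch that produced the "out of range" mode set.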
Re: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10
Am 23.05.19 um 13:50 schrieb Zhou, David(ChunMing): > 在 2019/5/23 19:03, Christian König 写道: >> [CAUTION: External Email] >> >> Am 23.05.19 um 12:24 schrieb zhoucm1: >>> >>> On 2019年05月22日 20:59, Christian König wrote: [CAUTION: External Email] BOs on the LRU might be blocked during command submission and cause OOM situations. Avoid this by blocking for the first busy BO not locked by the same ticket as the BO we are searching space for. v10: completely start over with the patch since we didn't handled a whole bunch of corner cases. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 77 ++-- 1 file changed, 66 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 4c6389d849ed..861facac33d4 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -771,32 +771,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable); * b. Otherwise, trylock it. */ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo, - struct ttm_operation_ctx *ctx, bool *locked) + struct ttm_operation_ctx *ctx, bool *locked, bool *busy) { bool ret = false; - *locked = false; if (bo->resv == ctx->resv) { reservation_object_assert_held(bo->resv); if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || !list_empty(&bo->ddestroy)) ret = true; + *locked = false; + if (busy) + *busy = false; } else { - *locked = reservation_object_trylock(bo->resv); - ret = *locked; + ret = reservation_object_trylock(bo->resv); + *locked = ret; + if (busy) + *busy = !ret; } return ret; } +/** + * ttm_mem_evict_wait_busy - wait for a busy BO to become available + * + * @busy_bo: BO which couldn't be locked with trylock + * @ctx: operation context + * @ticket: acquire ticket + * + * Try to lock a busy buffer object to avoid failing eviction. 
+ */ +static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo, + struct ttm_operation_ctx *ctx, + struct ww_acquire_ctx *ticket) +{ + int r; + + if (!busy_bo || !ticket) + return -EBUSY; + + if (ctx->interruptible) + r = reservation_object_lock_interruptible(busy_bo->resv, + ticket); + else + r = reservation_object_lock(busy_bo->resv, ticket); + + /* + * TODO: It would be better to keep the BO locked until allocation is at + * least tried one more time, but that would mean a much larger rework + * of TTM. + */ + if (!r) + reservation_object_unlock(busy_bo->resv); + + return r == -EDEADLK ? -EAGAIN : r; +} + static int ttm_mem_evict_first(struct ttm_bo_device *bdev, uint32_t mem_type, const struct ttm_place *place, - struct ttm_operation_ctx *ctx) + struct ttm_operation_ctx *ctx, + struct ww_acquire_ctx *ticket) { + struct ttm_buffer_object *bo = NULL, *busy_bo = NULL; struct ttm_bo_global *glob = bdev->glob; struct ttm_mem_type_manager *man = &bdev->man[mem_type]; - struct ttm_buffer_object *bo = NULL; bool locked = false; unsigned i; int ret; @@ -804,8 +844,15 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev, spin_lock(&glob->lru_lock); for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { list_for_each_entry(bo, &man->lru[i], lru) { - if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked)) + bool busy; + + if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked, + &busy)) { + if (busy && !busy_bo && + bo->resv->lock.ctx != ticket) + busy_bo = bo; continue; + }
[PATCH] drm/amdgpu: sort probed modes before adding common modes
[Why] There are monitors which can have more than one preferred mode set. There are chances in these monitors that if common modes are added in function amdgpu_dm_connector_add_common_modes(), these common modes can be calculated with different preferred mode than the one used in function decide_crtc_timing_for_drm_display_mode(). The preferred mode can be different because after common modes are added, the mode list is sorted and this changes the order of preferred modes in the list. The first mode in the list with preferred flag set is selected as preferred mode. Due to this the preferred mode selected varies. If same preferred mode is not selected in common mode calculation and crtc timing, then during mode set instead of setting preferred timing, common mode timing will be applied which can cause "out of range" message in the monitor with monitor blanking out. [How] Sort the modes before adding common modes. The same sorting function is called during common mode addition and deciding crtc timing. Signed-off-by: Yogesh Mohan Marimuthu Change-Id: I48036a476ad621de022e2fda5e1861e72e7e8c30 --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 4a1755bce..418e3956a 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4677,6 +4677,15 @@ static void amdgpu_dm_connector_ddc_get_modes(struct drm_connector *connector, amdgpu_dm_connector->num_modes = drm_add_edid_modes(connector, edid); + /* sorting the probed modes before calling function +* amdgpu_dm_get_native_mode() since EDID can have +* more than one preferred mode. The modes that are +* later in the probed mode list could be of higher +* and preferred resolution. 
For example, 3840x2160 +* resolution in base EDID preferred timing and 4096x2160 +* preferred resolution in DID extension block later. +*/ + drm_mode_sort(&connector->probed_modes); amdgpu_dm_get_native_mode(connector); } else { amdgpu_dm_connector->num_modes = 0; -- 2.17.1
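The failure mode this patch addresses — the first list entry carrying the preferred flag wins, so list order decides which of several preferred modes gets chosen — can be sketched with a toy model. The `toy_mode` type and `pick_preferred()` helper below are hypothetical stand-ins, not the real `drm_display_mode` or amdgpu code:

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for struct drm_display_mode and
 * the DRM_MODE_TYPE_PREFERRED flag -- not the real kernel types. */
#define MODE_PREFERRED 0x1u

struct toy_mode {
	int hdisplay, vdisplay;
	unsigned int type;
};

/* Picks the first mode carrying the preferred flag, mirroring how a
 * walk over the probed list behaves: with two preferred modes in the
 * list, the *order* of the list decides which one wins. */
static const struct toy_mode *pick_preferred(const struct toy_mode *modes, int n)
{
	for (int i = 0; i < n; i++)
		if (modes[i].type & MODE_PREFERRED)
			return &modes[i];
	return 0;
}
```

With both 3840x2160 (base EDID) and 4096x2160 (DID extension block) flagged preferred, sorting the list so the larger mode comes first — roughly what `drm_mode_sort()` does — changes which entry this walk selects; calling the same sort before both consumers makes them agree.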
Re: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10
在 2019/5/23 19:03, Christian König 写道: > [CAUTION: External Email] > > Am 23.05.19 um 12:24 schrieb zhoucm1: >> >> >> On 2019年05月22日 20:59, Christian König wrote: >>> [CAUTION: External Email] >>> >>> BOs on the LRU might be blocked during command submission >>> and cause OOM situations. >>> >>> Avoid this by blocking for the first busy BO not locked by >>> the same ticket as the BO we are searching space for. >>> >>> v10: completely start over with the patch since we didn't >>> handled a whole bunch of corner cases. >>> >>> Signed-off-by: Christian König >>> --- >>> drivers/gpu/drm/ttm/ttm_bo.c | 77 >>> ++-- >>> 1 file changed, 66 insertions(+), 11 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c >>> b/drivers/gpu/drm/ttm/ttm_bo.c >>> index 4c6389d849ed..861facac33d4 100644 >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >>> @@ -771,32 +771,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable); >>> * b. Otherwise, trylock it. >>> */ >>> static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object >>> *bo, >>> - struct ttm_operation_ctx *ctx, bool *locked) >>> + struct ttm_operation_ctx *ctx, bool *locked, >>> bool *busy) >>> { >>> bool ret = false; >>> >>> - *locked = false; >>> if (bo->resv == ctx->resv) { >>> reservation_object_assert_held(bo->resv); >>> if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT >>> || !list_empty(&bo->ddestroy)) >>> ret = true; >>> + *locked = false; >>> + if (busy) >>> + *busy = false; >>> } else { >>> - *locked = reservation_object_trylock(bo->resv); >>> - ret = *locked; >>> + ret = reservation_object_trylock(bo->resv); >>> + *locked = ret; >>> + if (busy) >>> + *busy = !ret; >>> } >>> >>> return ret; >>> } >>> >>> +/** >>> + * ttm_mem_evict_wait_busy - wait for a busy BO to become available >>> + * >>> + * @busy_bo: BO which couldn't be locked with trylock >>> + * @ctx: operation context >>> + * @ticket: acquire ticket >>> + * >>> + * Try to lock a busy buffer object to avoid failing eviction. 
>>> + */ >>> +static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo, >>> + struct ttm_operation_ctx *ctx, >>> + struct ww_acquire_ctx *ticket) >>> +{ >>> + int r; >>> + >>> + if (!busy_bo || !ticket) >>> + return -EBUSY; >>> + >>> + if (ctx->interruptible) >>> + r = >>> reservation_object_lock_interruptible(busy_bo->resv, >>> + ticket); >>> + else >>> + r = reservation_object_lock(busy_bo->resv, ticket); >>> + >>> + /* >>> + * TODO: It would be better to keep the BO locked until >>> allocation is at >>> + * least tried one more time, but that would mean a much >>> larger rework >>> + * of TTM. >>> + */ >>> + if (!r) >>> + reservation_object_unlock(busy_bo->resv); >>> + >>> + return r == -EDEADLK ? -EAGAIN : r; >>> +} >>> + >>> static int ttm_mem_evict_first(struct ttm_bo_device *bdev, >>> uint32_t mem_type, >>> const struct ttm_place *place, >>> - struct ttm_operation_ctx *ctx) >>> + struct ttm_operation_ctx *ctx, >>> + struct ww_acquire_ctx *ticket) >>> { >>> + struct ttm_buffer_object *bo = NULL, *busy_bo = NULL; >>> struct ttm_bo_global *glob = bdev->glob; >>> struct ttm_mem_type_manager *man = &bdev->man[mem_type]; >>> - struct ttm_buffer_object *bo = NULL; >>> bool locked = false; >>> unsigned i; >>> int ret; >>> @@ -804,8 +844,15 @@ static int ttm_mem_evict_first(struct >>> ttm_bo_device *bdev, >>> spin_lock(&glob->lru_lock); >>> for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { >>> list_for_each_entry(bo, &man->lru[i], lru) { >>> - if (!ttm_bo_evict_swapout_allowable(bo, ctx, >>> &locked)) >>> + bool busy; >>> + >>> + if (!ttm_bo_evict_swapout_allowable(bo, ctx, >>> &locked, >>> + &busy)) { >>> + if (busy && !busy_bo && >>> + bo->resv->lock.ctx != ticket) >>> + busy_bo = bo; >>> continue; >>> + } >>> >>> if (place && >>> !bdev->driver->eviction_valuable(bo, >>> place)) { >>> @@ -824,8 +871,13 @@ static int ttm_mem_evict_first(struct >>> ttm_bo_device *bdev, >>> } >>> >>>
Re: [PATCH 01/12] dma-buf: add dynamic caching of sg_table
On Thu, May 23, 2019 at 1:30 PM Daniel Vetter wrote: > > On Thu, May 23, 2019 at 1:21 PM Koenig, Christian > wrote: > > > > Am 22.05.19 um 20:30 schrieb Daniel Vetter: > > > [SNIP] > > >> Well, it seems you are making incorrect assumptions about the cache > > >> maintenance of DMA-buf here. > > >> > > >> At least for all DRM devices I'm aware of mapping/unmapping an > > >> attachment does *NOT* have any cache maintenance implications. > > >> > > >> E.g. the use case you describe above would certainly fail with amdgpu, > > >> radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the > > >> exporter from reading/writing to that buffer (just the opposite > > >> actually). > > >> > > >> All of them assume perfectly coherent access to the underlying memory. > > >> As far as I know there is no documented cache maintenance requirements > > >> for DMA-buf. > > > I think it is documented. It's just that on x86, we ignore that > > > because the dma-api pretends there's never a need for cache flushing > > > on x86, and that everything snoops the cpu caches. Which isn't true > > > since over 20 ago when AGP happened. The actual rules for x86 dma-buf > > > are very much ad-hoc (and we occasionally reapply some duct-tape when > > > cacheline noise shows up somewhere). > > > > Well I strongly disagree on this. Even on x86 at least AMD GPUs are also > > not fully coherent. > > > > For example you have the texture cache and the HDP read/write cache. So > > when both amdgpu as well as i915 would write to the same buffer at the > > same time we would get a corrupted data as well. > > > > The key point is that it is NOT DMA-buf in it's map/unmap call who is > > defining the coherency, but rather the reservation object and its > > attached dma_fence instances. > > > > So for example as long as a exclusive reservation object fence is still > > not signaled I can't assume that all caches are flushed and so can't > > start with my own operation/access to the data in question. 
> > The dma-api doesn't flush device caches, ever. It might flush some > iommu caches or some other bus cache somewhere in-between. So it also > won't ever make sure that multiple devices don't trample on each > another. For that you need something else (like reservation object, > but I think that's not really followed outside of drm much). > > The other bit is the coherent vs. non-coherent thing, which in the > dma-api land just talks about whether cpu/device access need extra > flushing or not. Now in practice that extra flushing is always only > cpu side, i.e. will cpu writes/reads go through the cpu cache, and > will device reads/writes snoop the cpu caches. That's (afaik at least, > an in practice, not the abstract spec) the _only_ thing dma-api's > cache maintenance does. For 0 copy that's all completely irrelevant, > because as soon as you pick a mode where you need to do manual cache > management you've screwed up, it's not 0-copy anymore really. > > The other hilarious stuff is that on x86 we let userspace (at least > with i915) do that cache management, so the kernel doesn't even have a > clue. I think what we need in dma-buf (and dma-api people will scream > about the "abstraction leak") is some notition about whether an > importer should snoop or not (or if that device always uses non-snoop > or snooped transactions). But that would shred the illusion the > dma-api tries to keep up that all that matters is whether a mapping is > coherent from the cpu's pov or not, and you can achieve coherence both > with a cache cpu mapping + snooped transactions, or with wc cpu side > and non-snooped transactions. Trying to add cache managment (which > some dma-buf exporter do indeed attempt to) will be even worse. > > Again, none of this is about preventing concurrent writes, or making > sure device caches are flushed correctly around batches. btw I just grepped for reservation_object, no one outside of drivers/gpu is using that. 
So for device access synchronization everyone else is relying on userspace ordering requests correctly on its own. Iirc v4l/media is pondering adding dma-fence support, but that's not going anywhere. Also, for correctness reservations aren't needed, we allow explicit syncing userspace to managed dma-fences/drm_syncobj on their own, and they are allowed to get this wrong. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
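Christian's point above — that coherency is governed by the reservation object's attached fences rather than by map/unmap — can be illustrated with a toy model. Every name here (`toy_fence`, `toy_buffer`, `toy_try_read`) is a hypothetical simplification of `dma_fence`/`reservation_object`, not a kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a consumer must not touch the buffer until the exclusive
 * (write) fence has signaled, because only then are the producer's
 * writes -- and any cache flushes its completion implies -- done. */
struct toy_fence {
	bool signaled;   /* producer sets this when its access completes */
};

struct toy_buffer {
	struct toy_fence *excl_fence;  /* exclusive fence, may be NULL */
	int data;
};

/* Returns false if reading now would race with the producer. */
static bool toy_try_read(const struct toy_buffer *buf, int *out)
{
	if (buf->excl_fence && !buf->excl_fence->signaled)
		return false;  /* device caches may still hold dirty data */
	*out = buf->data;
	return true;
}
```

The design point being argued is exactly this split: map/unmap establishes *where* the buffer is addressable, while the fences establish *when* it is safe to access it.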
Re: [PATCH 01/12] dma-buf: add dynamic caching of sg_table
On Thu, May 23, 2019 at 1:21 PM Koenig, Christian wrote: > > Am 22.05.19 um 20:30 schrieb Daniel Vetter: > > [SNIP] > >> Well, it seems you are making incorrect assumptions about the cache > >> maintenance of DMA-buf here. > >> > >> At least for all DRM devices I'm aware of mapping/unmapping an > >> attachment does *NOT* have any cache maintenance implications. > >> > >> E.g. the use case you describe above would certainly fail with amdgpu, > >> radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the > >> exporter from reading/writing to that buffer (just the opposite actually). > >> > >> All of them assume perfectly coherent access to the underlying memory. > >> As far as I know there is no documented cache maintenance requirements > >> for DMA-buf. > > I think it is documented. It's just that on x86, we ignore that > > because the dma-api pretends there's never a need for cache flushing > > on x86, and that everything snoops the cpu caches. Which isn't true > > since over 20 ago when AGP happened. The actual rules for x86 dma-buf > > are very much ad-hoc (and we occasionally reapply some duct-tape when > > cacheline noise shows up somewhere). > > Well I strongly disagree on this. Even on x86 at least AMD GPUs are also > not fully coherent. > > For example you have the texture cache and the HDP read/write cache. So > when both amdgpu as well as i915 would write to the same buffer at the > same time we would get a corrupted data as well. > > The key point is that it is NOT DMA-buf in it's map/unmap call who is > defining the coherency, but rather the reservation object and its > attached dma_fence instances. > > So for example as long as a exclusive reservation object fence is still > not signaled I can't assume that all caches are flushed and so can't > start with my own operation/access to the data in question. The dma-api doesn't flush device caches, ever. It might flush some iommu caches or some other bus cache somewhere in-between. 
So it also won't ever make sure that multiple devices don't trample on each another. For that you need something else (like reservation object, but I think that's not really followed outside of drm much). The other bit is the coherent vs. non-coherent thing, which in the dma-api land just talks about whether cpu/device access need extra flushing or not. Now in practice that extra flushing is always only cpu side, i.e. will cpu writes/reads go through the cpu cache, and will device reads/writes snoop the cpu caches. That's (afaik at least, an in practice, not the abstract spec) the _only_ thing dma-api's cache maintenance does. For 0 copy that's all completely irrelevant, because as soon as you pick a mode where you need to do manual cache management you've screwed up, it's not 0-copy anymore really. The other hilarious stuff is that on x86 we let userspace (at least with i915) do that cache management, so the kernel doesn't even have a clue. I think what we need in dma-buf (and dma-api people will scream about the "abstraction leak") is some notition about whether an importer should snoop or not (or if that device always uses non-snoop or snooped transactions). But that would shred the illusion the dma-api tries to keep up that all that matters is whether a mapping is coherent from the cpu's pov or not, and you can achieve coherence both with a cache cpu mapping + snooped transactions, or with wc cpu side and non-snooped transactions. Trying to add cache managment (which some dma-buf exporter do indeed attempt to) will be even worse. Again, none of this is about preventing concurrent writes, or making sure device caches are flushed correctly around batches. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 01/12] dma-buf: add dynamic caching of sg_table
Am 22.05.19 um 20:30 schrieb Daniel Vetter: > [SNIP] >> Well, it seems you are making incorrect assumptions about the cache >> maintenance of DMA-buf here. >> >> At least for all DRM devices I'm aware of mapping/unmapping an >> attachment does *NOT* have any cache maintenance implications. >> >> E.g. the use case you describe above would certainly fail with amdgpu, >> radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the >> exporter from reading/writing to that buffer (just the opposite actually). >> >> All of them assume perfectly coherent access to the underlying memory. >> As far as I know there is no documented cache maintenance requirements >> for DMA-buf. > I think it is documented. It's just that on x86, we ignore that > because the dma-api pretends there's never a need for cache flushing > on x86, and that everything snoops the cpu caches. Which isn't true > since over 20 ago when AGP happened. The actual rules for x86 dma-buf > are very much ad-hoc (and we occasionally reapply some duct-tape when > cacheline noise shows up somewhere). Well I strongly disagree on this. Even on x86 at least AMD GPUs are also not fully coherent. For example you have the texture cache and the HDP read/write cache. So when both amdgpu as well as i915 would write to the same buffer at the same time we would get a corrupted data as well. The key point is that it is NOT DMA-buf in it's map/unmap call who is defining the coherency, but rather the reservation object and its attached dma_fence instances. So for example as long as a exclusive reservation object fence is still not signaled I can't assume that all caches are flushed and so can't start with my own operation/access to the data in question. Regards, Christian. > > I've just filed this away as another instance of the dma-api not > fitting gpus, and I think giving recent discussions that won't improve > anytime soon. So we're stuck with essentially undefined dma-buf > behaviour. 
> -Daniel
Re: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10
Am 23.05.19 um 12:24 schrieb zhoucm1: On 2019年05月22日 20:59, Christian König wrote: [CAUTION: External Email] BOs on the LRU might be blocked during command submission and cause OOM situations. Avoid this by blocking for the first busy BO not locked by the same ticket as the BO we are searching space for. v10: completely start over with the patch since we didn't handled a whole bunch of corner cases. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 77 ++-- 1 file changed, 66 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 4c6389d849ed..861facac33d4 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -771,32 +771,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable); * b. Otherwise, trylock it. */ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo, - struct ttm_operation_ctx *ctx, bool *locked) + struct ttm_operation_ctx *ctx, bool *locked, bool *busy) { bool ret = false; - *locked = false; if (bo->resv == ctx->resv) { reservation_object_assert_held(bo->resv); if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || !list_empty(&bo->ddestroy)) ret = true; + *locked = false; + if (busy) + *busy = false; } else { - *locked = reservation_object_trylock(bo->resv); - ret = *locked; + ret = reservation_object_trylock(bo->resv); + *locked = ret; + if (busy) + *busy = !ret; } return ret; } +/** + * ttm_mem_evict_wait_busy - wait for a busy BO to become available + * + * @busy_bo: BO which couldn't be locked with trylock + * @ctx: operation context + * @ticket: acquire ticket + * + * Try to lock a busy buffer object to avoid failing eviction. 
+ */ +static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo, + struct ttm_operation_ctx *ctx, + struct ww_acquire_ctx *ticket) +{ + int r; + + if (!busy_bo || !ticket) + return -EBUSY; + + if (ctx->interruptible) + r = reservation_object_lock_interruptible(busy_bo->resv, + ticket); + else + r = reservation_object_lock(busy_bo->resv, ticket); + + /* + * TODO: It would be better to keep the BO locked until allocation is at + * least tried one more time, but that would mean a much larger rework + * of TTM. + */ + if (!r) + reservation_object_unlock(busy_bo->resv); + + return r == -EDEADLK ? -EAGAIN : r; +} + static int ttm_mem_evict_first(struct ttm_bo_device *bdev, uint32_t mem_type, const struct ttm_place *place, - struct ttm_operation_ctx *ctx) + struct ttm_operation_ctx *ctx, + struct ww_acquire_ctx *ticket) { + struct ttm_buffer_object *bo = NULL, *busy_bo = NULL; struct ttm_bo_global *glob = bdev->glob; struct ttm_mem_type_manager *man = &bdev->man[mem_type]; - struct ttm_buffer_object *bo = NULL; bool locked = false; unsigned i; int ret; @@ -804,8 +844,15 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev, spin_lock(&glob->lru_lock); for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { list_for_each_entry(bo, &man->lru[i], lru) { - if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked)) + bool busy; + + if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked, + &busy)) { + if (busy && !busy_bo && + bo->resv->lock.ctx != ticket) + busy_bo = bo; continue; + } if (place && !bdev->driver->eviction_valuable(bo, place)) { @@ -824,8 +871,13 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev, } if (!bo) { + if (busy_bo) + ttm_bo_get(busy_bo); spin_unlock(&glob->lru_lock); - return -EBUSY; + ret = ttm_mem_evict_wait_busy(busy_bo, ctx, ticket); If you rely on EAGAIN, why do you still try to lock busy_bo? any negative effect if directly return EAGAIN without tring lock? 
Yeah, that would burn a lot of CPU cycles because we would essentially busy-wait for the BO to become unlocked. When we only return in case of a deadlock, the other thread can continue with its eviction while we rea
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Wed, May 22, 2019 at 09:20:52PM -0300, Jason Gunthorpe wrote: > On Wed, May 22, 2019 at 02:49:28PM +0100, Dave Martin wrote: > > On Tue, May 21, 2019 at 03:48:56PM -0300, Jason Gunthorpe wrote: > > > On Fri, May 17, 2019 at 03:49:31PM +0100, Catalin Marinas wrote: > > > > > > > The tagged pointers (whether hwasan or MTE) should ideally be a > > > > transparent feature for the application writer but I don't think we can > > > > solve it entirely and make it seamless for the multitude of ioctls(). > > > > I'd say you only opt in to such feature if you know what you are doing > > > > and the user code takes care of specific cases like ioctl(), hence the > > > > prctl() proposal even for the hwasan. > > > > > > I'm not sure such a dire view is warrented.. > > > > > > The ioctl situation is not so bad, other than a few special cases, > > > most drivers just take a 'void __user *' and pass it as an argument to > > > some function that accepts a 'void __user *'. sparse et al verify > > > this. > > > > > > As long as the core functions do the right thing the drivers will be > > > OK. > > > > > > The only place things get dicy is if someone casts to unsigned long > > > (ie for vma work) but I think that reflects that our driver facing > > > APIs for VMAs are compatible with static analysis (ie I have no > > > earthly idea why get_user_pages() accepts an unsigned long), not that > > > this is too hard. > > > > If multiple people will care about this, perhaps we should try to > > annotate types more explicitly in SYSCALL_DEFINEx() and ABI data > > structures. > > > > For example, we could have a couple of mutually exclusive modifiers > > > > T __object * > > T __vaddr * (or U __vaddr) > > > > In the first case the pointer points to an object (in the C sense) > > that the call may dereference but not use for any other purpose. > > How would you use these two differently? 
> > So far the kernel has worked that __user should tag any pointer that > is from userspace and then you can't do anything with it until you > transform it into a kernel something Ultimately it would be good to disallow casting __object pointers except to compatible __object pointer types, and to make get_user etc. demand __object. __vaddr pointers / addresses would be freely castable, but not to __object and so would not be dereferenceable even indirectly. Or that's the general idea. Figuring out a sane set of rules that we could actually check / enforce would require a bit of work. (Whether the __vaddr base type is a pointer or an integer type is probably moot, due to the restrictions we would place on the use of these anyway.) > > to tell static analysers the real type of pointers smuggled through > UAPI disguised as other types (*cough* KVM, etc.) > Yes, that would help a lot, we often have to pass pointers through a > u64 in the uAPI, and there is no static checker support to make sure > they are run through the u64_to_user_ptr() helper. Agreed. Cheers ---Dave
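For context, the untagging operation itself is cheap: under arm64's Top Byte Ignore, this series has the kernel strip tags by sign-extending bit 55 over the tag bits 63:56 (`sign_extend64(addr, 55)`). A minimal userspace sketch of that operation — the function name is ours, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

/* Replicate bit 55 into bits 63:56, discarding any tag stored in the
 * top byte.  A tagged userspace pointer (bit 55 clear) gets its top
 * byte zeroed; a kernel pointer (bit 55 set) keeps its all-ones top
 * byte.  Written with masks to stay fully defined in plain C. */
static uint64_t toy_untag(uint64_t addr)
{
	if ((addr >> 55) & 1)
		return addr | 0xFF00000000000000ULL;
	return addr & 0x00FFFFFFFFFFFFFFULL;
}
```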
Re: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10
On 2019年05月22日 20:59, Christian König wrote: [CAUTION: External Email] BOs on the LRU might be blocked during command submission and cause OOM situations. Avoid this by blocking for the first busy BO not locked by the same ticket as the BO we are searching space for. v10: completely start over with the patch since we didn't handled a whole bunch of corner cases. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 77 ++-- 1 file changed, 66 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 4c6389d849ed..861facac33d4 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -771,32 +771,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable); * b. Otherwise, trylock it. */ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo, - struct ttm_operation_ctx *ctx, bool *locked) + struct ttm_operation_ctx *ctx, bool *locked, bool *busy) { bool ret = false; - *locked = false; if (bo->resv == ctx->resv) { reservation_object_assert_held(bo->resv); if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || !list_empty(&bo->ddestroy)) ret = true; + *locked = false; + if (busy) + *busy = false; } else { - *locked = reservation_object_trylock(bo->resv); - ret = *locked; + ret = reservation_object_trylock(bo->resv); + *locked = ret; + if (busy) + *busy = !ret; } return ret; } +/** + * ttm_mem_evict_wait_busy - wait for a busy BO to become available + * + * @busy_bo: BO which couldn't be locked with trylock + * @ctx: operation context + * @ticket: acquire ticket + * + * Try to lock a busy buffer object to avoid failing eviction. 
+ */ +static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo, + struct ttm_operation_ctx *ctx, + struct ww_acquire_ctx *ticket) +{ + int r; + + if (!busy_bo || !ticket) + return -EBUSY; + + if (ctx->interruptible) + r = reservation_object_lock_interruptible(busy_bo->resv, + ticket); + else + r = reservation_object_lock(busy_bo->resv, ticket); + + /* +* TODO: It would be better to keep the BO locked until allocation is at +* least tried one more time, but that would mean a much larger rework +* of TTM. +*/ + if (!r) + reservation_object_unlock(busy_bo->resv); + + return r == -EDEADLK ? -EAGAIN : r; +} + static int ttm_mem_evict_first(struct ttm_bo_device *bdev, uint32_t mem_type, const struct ttm_place *place, - struct ttm_operation_ctx *ctx) + struct ttm_operation_ctx *ctx, + struct ww_acquire_ctx *ticket) { + struct ttm_buffer_object *bo = NULL, *busy_bo = NULL; struct ttm_bo_global *glob = bdev->glob; struct ttm_mem_type_manager *man = &bdev->man[mem_type]; - struct ttm_buffer_object *bo = NULL; bool locked = false; unsigned i; int ret; @@ -804,8 +844,15 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev, spin_lock(&glob->lru_lock); for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { list_for_each_entry(bo, &man->lru[i], lru) { - if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked)) + bool busy; + + if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked, + &busy)) { + if (busy && !busy_bo && + bo->resv->lock.ctx != ticket) + busy_bo = bo; continue; + } if (place && !bdev->driver->eviction_valuable(bo, place)) { @@ -824,8 +871,13 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev, } if (!bo) { + if (busy_bo) + ttm_bo_get(busy_bo); spin_unlock(&glob->lru_lock); - return -EBUSY; + ret = ttm_mem_evict_wait_busy(busy_bo, ctx, ticket); If you rely on EAGAIN, why do you still try to lock busy_bo? any negative effect if directly return EAGAIN without tring lock? -David + if (busy_bo) + ttm_bo_put(busy_bo); +
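The control flow of the patch — trylock every BO on the LRU, remember the first busy one, and only block on it when nothing could be locked — can be modeled with plain pthread mutexes. This is an illustrative sketch only: TTM's reservation objects are ww_mutex-based and deadlock-aware (hence the -EDEADLK to -EAGAIN translation), which a plain mutex is not, and all `toy_` names are ours:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct toy_bo {
	pthread_mutex_t resv;
};

/* Walk the "LRU": try-lock each BO; remember the first busy one.  If
 * one trylock succeeds, hand that BO to the caller for eviction.  If
 * none does, block once on the first busy BO, drop the lock again,
 * and report -EAGAIN so the caller restarts -- mirroring the
 * lock/unlock/return shape of ttm_mem_evict_wait_busy(). */
static int toy_evict_first(struct toy_bo *bos, int n, struct toy_bo **locked)
{
	struct toy_bo *busy_bo = NULL;

	for (int i = 0; i < n; i++) {
		if (pthread_mutex_trylock(&bos[i].resv) == 0) {
			*locked = &bos[i];
			return 0;              /* caller evicts this BO */
		}
		if (!busy_bo)
			busy_bo = &bos[i];     /* first busy BO on the list */
	}

	if (!busy_bo)
		return -EBUSY;

	/* Wait for the busy BO instead of busy-waiting in a retry loop,
	 * then release it immediately and ask the caller to retry. */
	pthread_mutex_lock(&busy_bo->resv);
	pthread_mutex_unlock(&busy_bo->resv);
	return -EAGAIN;
}
```

This also shows why blocking (rather than returning -EBUSY straight away, as David asks) matters: without the wait, the caller's retry loop would spin on the CPU until the busy BO's holder finishes.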
[PATCH] drm/amdgpu: Need to set the baco cap before baco reset
For passthrough, after the VM is rebooted, the driver performs a BACO reset before the rest of driver initialization during driver load. The BACO reset path first checks the BACO reset capability, so the capability needs to be set from the vbios information first, or BACO reset won't be enabled.

Signed-off-by: Emily Deng
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c         |  8 
 drivers/gpu/drm/amd/amdgpu/soc15.c                 |  3 ++-
 drivers/gpu/drm/amd/include/kgd_pp_interface.h     |  1 +
 drivers/gpu/drm/amd/powerplay/amd_powerplay.c      | 16 +++
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c |  1 +
 .../amd/powerplay/hwmgr/vega10_processpptables.c   | 24 ++
 .../amd/powerplay/hwmgr/vega10_processpptables.h   |  1 +
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h          |  1 +
 8 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bdd1fe73..2dde672 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2611,6 +2611,14 @@ int amdgpu_device_init(struct amdgpu_device *adev,
	 * E.g., driver was not cleanly unloaded previously, etc.
	 */
	if (!amdgpu_sriov_vf(adev) && amdgpu_asic_need_reset_on_init(adev)) {
+		if (amdgpu_passthrough(adev) && adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->set_asic_baco_cap) {
+			r = adev->powerplay.pp_funcs->set_asic_baco_cap(adev->powerplay.pp_handle);
+			if (r) {
+				dev_err(adev->dev, "set baco capability failed\n");
+				goto failed;
+			}
+		}
+
		r = amdgpu_asic_reset(adev);
		if (r) {
			dev_err(adev->dev, "asic reset on init failed\n");

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 78bd4fc..d9fdd95 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -764,7 +764,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
	/* Just return false for soc15 GPUs. Reset does not seem to
	 * be necessary.
	 */
-	return false;
+	if (!amdgpu_passthrough(adev))
+		return false;

	if (adev->flags & AMD_IS_APU)
		return false;

diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
index 9f661bf..c6e2a51 100644
--- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
@@ -296,6 +296,7 @@ struct amd_pm_funcs {
	int (*set_hard_min_fclk_by_freq)(void *handle, uint32_t clock);
	int (*set_min_deep_sleep_dcefclk)(void *handle, uint32_t clock);
	int (*get_asic_baco_capability)(void *handle, bool *cap);
+	int (*set_asic_baco_cap)(void *handle);
	int (*get_asic_baco_state)(void *handle, int *state);
	int (*set_asic_baco_state)(void *handle, int state);
	int (*get_ppfeature_status)(void *handle, char *buf);

diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
index bea1587..9856760 100644
--- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
@@ -1404,6 +1404,21 @@ static int pp_set_active_display_count(void *handle, uint32_t count)
	return ret;
 }

+static int pp_set_asic_baco_cap(void *handle)
+{
+	struct pp_hwmgr *hwmgr = handle;
+
+	if (!hwmgr)
+		return -EINVAL;
+
+	if (!hwmgr->pm_en || !hwmgr->hwmgr_func->set_asic_baco_cap)
+		return 0;
+
+	hwmgr->hwmgr_func->set_asic_baco_cap(hwmgr);
+
+	return 0;
+}
+
 static int pp_get_asic_baco_capability(void *handle, bool *cap)
 {
	struct pp_hwmgr *hwmgr = handle;
@@ -1546,6 +1561,7 @@ static const struct amd_pm_funcs pp_dpm_funcs = {
	.set_hard_min_dcefclk_by_freq = pp_set_hard_min_dcefclk_by_freq,
	.set_hard_min_fclk_by_freq = pp_set_hard_min_fclk_by_freq,
	.get_asic_baco_capability = pp_get_asic_baco_capability,
+	.set_asic_baco_cap = pp_set_asic_baco_cap,
	.get_asic_baco_state = pp_get_asic_baco_state,
	.set_asic_baco_state = pp_set_asic_baco_state,
	.get_ppfeature_status = pp_get_ppfeature_status,

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index ce6aeb5..e5bcbc8 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -5302,6 +5302,7 @@ static const struct pp_hwmgr_func vega10_hwmgr_funcs = {
	.odn_edit_dpm_table = vega10_odn_edit_dpm_table,
	.get_performance_level = vega10_get_performance_level,
	.get_asic_baco_capability =
Re: [PATCH 01/10] drm/ttm: Make LRU removal optional.
Am 23.05.19 um 11:15 schrieb zhoucm1:

On 2019年05月22日 20:59, Christian König wrote:
[SNIP]
@@ -203,7 +204,10 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
		reservation_object_add_shared_fence(bo->resv, fence);
	else
		reservation_object_add_excl_fence(bo->resv, fence);
-	ttm_bo_add_to_lru(bo);
+	if (list_empty(&bo->lru))
+		ttm_bo_add_to_lru(bo);
+	else
+		ttm_bo_move_to_lru_tail(bo, NULL);

Can ttm_bo_move_to_lru_tail() be moved into ttm_eu_reserve_buffers() when del_lru is false?

No, that won't work. The BO might have moved to another domain, and if we did the ttm_bo_move_to_lru_tail() during reservation we wouldn't handle that correctly.

Christian.

-David

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
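As an aside for readers following the series: the trylock-based eviction walk this thread builds on can be sketched outside the kernel with plain pthreads. This is an illustrative model only, not the TTM code; the names fake_bo and evict_first are made up.

```c
/*
 * Sketch of the eviction walk discussed in this series, modelled in
 * userspace: try-lock each buffer on a fake LRU, evict the first one
 * we can lock, and remember the first busy one so a caller could
 * block on it later. All names are illustrative; this is not TTM.
 */
#include <pthread.h>
#include <stdbool.h>

struct fake_bo {
	pthread_mutex_t lock;
	bool evicted;
};

/*
 * Returns the index of the evicted BO, or -1 if none could be locked.
 * *first_busy receives the index of the first BO found locked by
 * someone else, or -1 if all BOs were lockable.
 */
static int evict_first(struct fake_bo *bos, int n, int *first_busy)
{
	int i;

	*first_busy = -1;
	for (i = 0; i < n; i++) {
		if (pthread_mutex_trylock(&bos[i].lock) != 0) {
			/* Busy: skip it, but remember the first one */
			if (*first_busy < 0)
				*first_busy = i;
			continue;
		}
		bos[i].evicted = true;	/* got the lock: evict this one */
		pthread_mutex_unlock(&bos[i].lock);
		return i;
	}
	return -1;
}
```

The point the patches make is visible in the model: a busy (locked) buffer is never evicted, only remembered, so the caller can later block on it instead of failing outright.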
Re: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
Leaving BOs on the LRU is harmless. We have always done this for VM page table and per-VM BOs. The key point is that BOs which couldn't be reserved can't be evicted.

So what happened is that an application used basically all of VRAM during CS, and because of this the X server couldn't pin a BO for scanout. Now we keep the BOs on the LRU and modify TTM to block until the CS completes, which in turn allows the X server to pin its BO for scanout.

Christian.

Am 22.05.19 um 21:43 schrieb Kuehling, Felix:

Can you explain how this avoids OOM situations? When is it safe to leave a reserved BO on the LRU list?

Could we do the same thing in amdgpu_amdkfd_gpuvm.c? And if we did, what would be the expected side effects or consequences?

Thanks,
  Felix

On 2019-05-22 8:59 a.m., Christian König wrote:

This avoids OOM situations when we have lots of threads submitting at the same time.

v3: apply this to the whole driver, not just CS

Signed-off-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c    | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20f2955d2a55..3e2da24cd17a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
	}

	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates, true);
+				   &duplicates, false);
	if (unlikely(r != 0)) {
		if (r != -ERESTARTSYS)
			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 06f83cac0d3a..f660628e6af9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
	list_add(&csa_tv.head, &list);
	amdgpu_vm_get_pd_bo(vm, &list, &pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
	if (r) {
		DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
		return r;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d513a5ad03dd..ed25a4e14404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
+	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
	if (r) {
		dev_err(adev->dev, "leaking bo va because "
			"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
	amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
	if (r)
		goto error_unref;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c430e8259038..d60593cc436e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)
	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
	int r;

-	r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
+	r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
	if (unlikely(r != 0)) {
		if (r != -ERESTARTSYS)
			dev_err(adev->dev, "%p reserve failed\n", bo);
--
2.17.1
Re: [PATCH 01/10] drm/ttm: Make LRU removal optional.
On 2019年05月22日 20:59, Christian König wrote:

We are already doing this for DMA-buf imports and also for amdgpu VM BOs for quite a while now.

If this doesn't run into any problems we are probably going to stop removing BOs from the LRU altogether.

Signed-off-by: Christian König
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |  9 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c      |  4 ++--
 drivers/gpu/drm/qxl/qxl_release.c            |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c          |  2 +-
 drivers/gpu/drm/radeon/radeon_object.c       |  2 +-
 drivers/gpu/drm/ttm/ttm_execbuf_util.c       | 20 +++
 drivers/gpu/drm/virtio/virtgpu_ioctl.c       |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c     |  3 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.h   |  2 +-
 include/drm/ttm/ttm_bo_driver.h              |  5 -
 include/drm/ttm/ttm_execbuf_util.h           |  3 ++-
 13 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index e1cae4a37113..647e18f9e136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -574,7 +574,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
	amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);

	ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
-				     false, &ctx->duplicates);
+				     false, &ctx->duplicates, true);
	if (!ret)
		ctx->reserved = true;
	else {
@@ -647,7 +647,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
	}

	ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
-				     false, &ctx->duplicates);
+				     false, &ctx->duplicates, true);
	if (!ret)
		ctx->reserved = true;
	else
@@ -1800,7 +1800,8 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
	}

	/* Reserve all BOs and page tables for validation */
-	ret = ttm_eu_reserve_buffers(&ticket, &resv_list, false, &duplicates);
+	ret = ttm_eu_reserve_buffers(&ticket, &resv_list, false, &duplicates,
+				     true);
	WARN(!list_empty(&duplicates), "Duplicates should be empty");
	if (ret)
		goto out_free;
@@ -2006,7 +2007,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
	}

	ret = ttm_eu_reserve_buffers(&ctx.ticket, &ctx.list,
-				     false, &duplicate_save);
+				     false, &duplicate_save, true);
	if (ret) {
		pr_debug("Memory eviction: TTM Reserve Failed. Try again\n");
		goto ttm_reserve_fail;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d72cc583ebd1..fff558cf385b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
	}

	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates);
+				   &duplicates, true);
	if (unlikely(r != 0)) {
		if (r != -ERESTARTSYS)
			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 54dd02a898b9..06f83cac0d3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
	list_add(&csa_tv.head, &list);
	amdgpu_vm_get_pd_bo(vm, &list, &pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
	if (r) {
		DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
		return r;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 7b840367004c..d513a5ad03dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
+	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
	if (r) {
		dev_err(adev->dev, "leaking bo va because "
Re: [PATCH v15 05/17] arm64: untag user pointers passed to memory syscalls
On Wed, May 22, 2019 at 02:16:57PM -0700, Evgenii Stepanov wrote:
> On Wed, May 22, 2019 at 4:49 AM Catalin Marinas wrote:
> > On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > >
> > > This patch allows tagged pointers to be passed to the following memory
> > > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> > > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> > > remap_file_pages, shmat and shmdt.
> > >
> > > This is done by untagging pointers passed to these syscalls in the
> > > prologues of their handlers.
> >
> > I'll go through them one by one to see if we can tighten the expected
> > ABI while having the MTE in mind.
> >
> > > diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
> > > index b44065fb1616..933bb9f3d6ec 100644
> > > --- a/arch/arm64/kernel/sys.c
> > > +++ b/arch/arm64/kernel/sys.c
> > > @@ -35,10 +35,33 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
> > > {
> > > 	if (offset_in_page(off) != 0)
> > > 		return -EINVAL;
> > > -
> > > +	addr = untagged_addr(addr);
> > > 	return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
> > > }
> >
> > If user passes a tagged pointer to mmap() and the address is honoured
> > (or MAP_FIXED is given), what is the expected return pointer? Does it
> > need to be tagged with the value from the hint?
>
> For HWASan the most convenient would be to use the tag from the hint.
> But since in the TBI (not MTE) mode the kernel has no idea what
> meaning userspace assigns to pointer tags, perhaps it should not try
> to guess, and should return raw (zero-tagged) address instead.

Then, just to relax the ABI for hwasan, shall we simply disallow tagged
pointers on mmap() arguments? We can leave them in for
mremap(old_address), madvise().

> > With MTE, we may want to use this as a request for the default colour of
> > the mapped pages (still under discussion).
>
> I like this - and in that case it would make sense to return the
> pointer that can be immediately dereferenced without crashing the
> process, i.e. with the matching tag.

This came up from the Android investigation work where large memory
allocations (using mmap) could be more efficiently pre-tagged by the
kernel on page fault. Not sure about the implementation details yet.

--
Catalin
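For readers outside the thread: the untagging done in these syscall prologues can be modelled in plain C. The sketch below mimics the sign-extend-from-bit-55 behaviour of the kernel's untagged_addr() macro; the helper name untag_addr() is made up for illustration.

```c
#include <stdint.h>

/*
 * Sketch of arm64-style address untagging: sign-extend from bit 55 so
 * the top byte (the tag) becomes 0x00 for user addresses (bit 55 == 0)
 * and 0xff for kernel addresses (bit 55 == 1). Modelled on the
 * kernel's untagged_addr(); the name untag_addr() is ours.
 */
static inline uint64_t untag_addr(uint64_t addr)
{
	/* Shift the tag byte out, then shift back arithmetically */
	return (uint64_t)(((int64_t)(addr << 8)) >> 8);
}
```

Note how the same operation leaves a kernel address (all-ones top bits) unchanged, which is why the kernel can apply it unconditionally in these prologues.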
RE: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
Hi Christian,

Thanks for your patch. Those patches fix the amdgpu BO pin failure seen during dm_plane_helper_prepare_fb, and Abaqus performance seems improved. But some error messages can still be observed. Should we drop the amdgpu_vm_validate_pt_bos() error message and the other warning debug info?

[ 1910.674541] Call Trace:
[ 1910.676944] [] dump_stack+0x19/0x1b
[ 1910.682236] [] __warn+0xd8/0x100
[ 1910.687195] [] warn_slowpath_null+0x1d/0x20
[ 1910.693167] [] amdgpu_bo_move+0x169/0x1c0 [amdgpu]
[ 1910.699719] [] ttm_bo_handle_move_mem+0x26b/0x5d0 [amdttm]
[ 1910.706976] [] ttm_bo_evict+0x147/0x3b0 [amdttm]
[ 1910.713358] [] ? drm_mm_insert_node_in_range+0x299/0x4d0 [drm]
[ 1910.720881] [] ? _kcl_reservation_object_reserve_shared+0xfe/0x1a0 [amdkcl]
[ 1910.729710] [] ttm_mem_evict_first+0x29e/0x3a0 [amdttm]
[ 1910.736705] [] amdttm_bo_mem_space+0x1ae/0x300 [amdttm]
[ 1910.743696] [] amdttm_bo_validate+0xc4/0x140 [amdttm]
[ 1910.750529] [] amdgpu_cs_bo_validate+0xa5/0x220 [amdgpu]
[ 1910.757625] [] amdgpu_cs_validate+0x47/0x2e0 [amdgpu]
[ 1910.764463] [] ? amdgpu_cs_bo_validate+0x220/0x220 [amdgpu]
[ 1910.771736] [] amdgpu_vm_validate_pt_bos+0x92/0x140 [amdgpu]
[ 1910.779248] [] amdgpu_cs_ioctl+0x18a7/0x1d50 [amdgpu]
[ 1910.785992] [] ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[ 1910.793486] [] drm_ioctl_kernel+0x6c/0xb0 [drm]
[ 1910.799777] [] drm_ioctl+0x1e7/0x420 [drm]
[ 1910.805643] [] ? amdgpu_cs_find_mapping+0x120/0x120 [amdgpu]
[ 1910.813090] [] amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
[ 1910.819639] [] do_vfs_ioctl+0x3a0/0x5a0
[ 1910.825217] [] ? __schedule+0x13a/0x890
[ 1910.830795] [] SyS_ioctl+0xa1/0xc0
[ 1910.835943] [] system_call_fastpath+0x22/0x27
[ 1910.842048] ---[ end trace a5c00b151c061d53 ]---
[ 1910.846814] [TTM] Buffer eviction failed
[ 1910.850838] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
[ 1910.858905] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -22!
...
Thanks,
Prike

-Original Message-
From: Christian König
Sent: Wednesday, May 22, 2019 9:00 PM
To: Olsak, Marek ; Zhou, David(ChunMing) ; Liang, Prike ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Subject: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

This avoids OOM situations when we have lots of threads submitting at the same time.

v3: apply this to the whole driver, not just CS

Signed-off-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c    | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20f2955d2a55..3e2da24cd17a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
	}

	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates, true);
+				   &duplicates, false);
	if (unlikely(r != 0)) {
		if (r != -ERESTARTSYS)
			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 06f83cac0d3a..f660628e6af9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
	list_add(&csa_tv.head, &list);
	amdgpu_vm_get_pd_bo(vm, &list, &pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
	if (r) {
		DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
		return r;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d513a5ad03dd..ed25a4e14404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
+	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
	if (r) {
		dev_err(adev->dev, "leaking bo va because "
			"we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
	amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
+	r = ttm_eu_reserve_buffers(&ticket, &list, tru
Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel
On Wed, May 22, 2019 at 04:09:31PM -0700, enh wrote:
> On Wed, May 22, 2019 at 4:03 PM Evgenii Stepanov wrote:
> > On Wed, May 22, 2019 at 1:47 PM Kees Cook wrote:
> > > On Wed, May 22, 2019 at 05:35:27PM +0100, Catalin Marinas wrote:
> > > > I would also expect the C library or dynamic loader to check for the
> > > > presence of a HWCAP_MTE bit before starting to tag memory allocations,
> > > > otherwise it would get SIGILL on the first MTE instruction it tries to
> > > > execute.
> > >
> > > I've got the same question as Elliot: aren't MTE instructions just NOP
> > > to older CPUs? I.e. if the CPU (or kernel) don't support it, it just
> > > gets entirely ignored: checking is only needed to satisfy curiosity
> > > or behavioral expectations.
> >
> > MTE instructions are not NOP. Most of them have side effects (changing
> > register values, zeroing memory).
>
> no, i meant "they're encoded in a space that was previously no-ops, so
> running MTE code on old hardware doesn't cause SIGILL".

It does result in SIGILL: there wasn't enough encoding space left in the
NOP space for old/current CPU implementations (in hindsight, we should
have reserved a bigger NOP space).

As Evgenii said, the libc needs to be careful when tagging the heap, as
that would cause a SIGILL if HWCAP_MTE is not set. A standard
application doesn't need to be recompiled, as it would not issue MTE
colouring instructions, just standard LDR/STR.

Stack tagging is problematic if you want to colour each frame
individually: the function prologue would need the non-NOP MTE
instructions. The best we can do here is just having the (thread) stacks
of different colours.

--
Catalin
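To make the TBI side of this discussion concrete: with the top byte ignored by the hardware on loads and stores, userspace can stash an 8-bit colour there itself. The helpers below are an illustrative sketch only (the names set_tag/get_tag are ours; real allocators such as HWASan do this with considerably more machinery, and dereferencing tagged pointers requires TBI-capable hardware).

```c
#include <stdint.h>

#define TAG_SHIFT 56	/* top byte of a 64-bit address */

/*
 * Illustrative helpers for the TBI model: pack an 8-bit colour into
 * the top byte of a pointer-sized value, and read it back. Pure
 * integer arithmetic, so this compiles and runs on any 64-bit target.
 */
static inline uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~(0xffULL << TAG_SHIFT)) | ((uint64_t)tag << TAG_SHIFT);
}

static inline uint8_t get_tag(uint64_t addr)
{
	return (uint8_t)(addr >> TAG_SHIFT);
}
```

This also illustrates why the libc must check HWCAP_MTE (or, for plain TBI, the tagged-address ABI) before handing out tagged pointers: the tag arithmetic itself is harmless everywhere, but using the resulting pointers is not.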