On 11/7/25 14:39, Timur Kristóf wrote:
> On Fri, 2025-11-07 at 14:33 +0100, Christian König wrote:
>> On 11/7/25 14:31, Timur Kristóf wrote:
>>> On Fri, 2025-11-07 at 14:14 +0100, Christian König wrote:
>>>> On 11/7/25 11:47, Timur Kristóf wrote:
>>>>> On Fri, 2025-11-07 at 11:01 +0100, Christian König wrote:
>>>>>> On 11/7/25 10:53, Timur Kristóf wrote:
>>>>>>> On Fri, 2025-11-07 at 10:49 +0100, Christian König wrote:
>>>>>>>> On 11/6/25 19:44, Timur Kristóf wrote:
>>>>>>>>> VCE uses the VCPU BO to keep track of currently active
>>>>>>>>> encoding sessions. To make sure the VCE functions correctly
>>>>>>>>> after suspend/resume, save the VCPU BO to system RAM on
>>>>>>>>> suspend and restore it on resume.
>>>>>>>>>
>>>>>>>>> Additionally, make sure to keep the VCPU BO pinned.
>>>>>>>>> The VCPU BO needs to stay at the same location before and
>>>>>>>>> after sleep/resume because the FW code is not relocatable
>>>>>>>>> once it's started.
>>>>>>>>>
>>>>>>>>> Unfortunately this is not enough to make VCE suspend work
>>>>>>>>> when there are encoding sessions active, so don't allow
>>>>>>>>> that yet.
>>>>>>>>
>>>>>>>> Whether this is the right approach or not is a question only
>>>>>>>> Leo and Ruijing can answer.
>>>>>>>>
>>>>>>>> If we can get VCE1-3 to keep sessions working after
>>>>>>>> suspend/resume, we should make this change; otherwise we
>>>>>>>> should rather invalidate all old sessions after
>>>>>>>> suspend/resume and only re-load the FW.
>>>>>>>>
>>>>>>>> Anyway, I think you should make the removal of
>>>>>>>> "amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr);" a separate
>>>>>>>> patch, because that seems to be a good cleanup no matter
>>>>>>>> what.
>>>>>>>
>>>>>>> This change is necessary for the VCE1 code when it loads the
>>>>>>> firmware signature.
>>>>>>> Without this patch, we would need to call reserve(), kmap(),
>>>>>>> kunmap() and unreserve() in vce_v1_0_load_fw_signature().
>>>>>>>
>>>>>>> Let me know which approach you prefer.
>>>>>>
>>>>>> In this case, definitely make the removal of
>>>>>> amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr) a separate patch.
>>>>>
>>>>> Sorry, can you clarify what you mean?
>>>>> Should I just go back to the way things were in the first
>>>>> version of the series, and then clean it up later?
>>>>
>>>> Just add a patch (early in the series, probably as the first
>>>> patch) to remove amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr).
>>>
>>> Is it allowed to keep a BO mapped while it is unreserved?
>>
>> Yes, as long as it is also pinned.
>>
>> IIRC the VCE BO is allocated by amdgpu_bo_create_kernel() and that
>> both pins and maps the BO.
>
> I am asking because amdgpu_vce_resume() calls amdgpu_bo_unreserve()
> on the BO, but then vce_v1_0_load_fw_signature() needs to access it.
> So I wonder whether the CPU address of the BO will still be correct
> while it is unreserved. I also wonder, do I have to call
> amdgpu_bo_reserve() before accessing the BO? Or is it enough that
> it's mapped?
It should be enough that it is mapped.

That reserve/unreserve code is most likely just a leftover from
radeon. On radeon we pinned/unpinned the BO on suspend/resume because
we didn't know that the FW is not relocatable once started. Well, that
was a long long time ago :)

So the reserve/unreserve can be removed as well.

Regards,
Christian.

>>
>> Regards,
>> Christian.
>>
>>>> Then put the memset_io() into amdgpu_vce_resume() like you had
>>>> in the first version of the series, so that VCE1 behaves the
>>>> same as VCE2-3.
>>>
>>> Ok
>>>
>>>> And when the series has landed, we can clean everything up
>>>> depending on what Leo/Ruijing have decided to do with
>>>> suspend/resume on VCE1-3.
>>>
>>> Sounds good.
>>>
>>>>>> I want to get initial VCE1 working and landed independent of
>>>>>> what Leo/Ruijing say about suspend/resume.
>>>>>
>>>>> Agreed.
>>>>>
>>>>>> It can be that suspend/resume is then still broken, but that
>>>>>> is also the case for VCE2-3 then.
>>>>>
>>>>> Yes, some extra work is going to be needed on top of this patch
>>>>> to make suspend/resume work (if it's possible at all).
>>>>>>>>> Signed-off-by: Timur Kristóf <[email protected]>
>>>>>>>>> ---
>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 44 ++++++++-------------
>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 52 ++++---------------------
>>>>>>>>>  2 files changed, 24 insertions(+), 72 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>>>>>> index 2297608c5191..4beec5b56c4f 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>>>>>> @@ -206,6 +206,10 @@ int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size)
>>>>>>>>>  	if (!adev->vce.fw)
>>>>>>>>>  		return -ENOENT;
>>>>>>>>>  
>>>>>>>>> +	adev->vce.saved_bo = kvmalloc(size, GFP_KERNEL);
>>>>>>>>> +	if (!adev->vce.saved_bo)
>>>>>>>>> +		return -ENOMEM;
>>>>>>>>> +
>>>>>>>>>  	r = amdgpu_bo_create_kernel(adev, size, PAGE_SIZE,
>>>>>>>>>  				    AMDGPU_GEM_DOMAIN_VRAM |
>>>>>>>>>  				    AMDGPU_GEM_DOMAIN_GTT,
>>>>>>>>> @@ -254,6 +258,9 @@ int amdgpu_vce_sw_fini(struct amdgpu_device *adev)
>>>>>>>>>  	amdgpu_bo_free_kernel(&adev->vce.vcpu_bo, &adev->vce.gpu_addr,
>>>>>>>>>  		(void **)&adev->vce.cpu_addr);
>>>>>>>>>  
>>>>>>>>> +	kvfree(adev->vce.saved_bo);
>>>>>>>>> +	adev->vce.saved_bo = NULL;
>>>>>>>>> +
>>>>>>>>>  	return 0;
>>>>>>>>>  }
>>>>>>>>>  
>>>>>>>>> @@ -290,13 +297,19 @@ int amdgpu_vce_entity_init(struct amdgpu_device *adev, struct amdgpu_ring *ring)
>>>>>>>>>   */
>>>>>>>>>  int amdgpu_vce_suspend(struct amdgpu_device *adev)
>>>>>>>>>  {
>>>>>>>>> -	int i;
>>>>>>>>> +	int i, idx;
>>>>>>>>>  
>>>>>>>>>  	cancel_delayed_work_sync(&adev->vce.idle_work);
>>>>>>>>>  
>>>>>>>>>  	if (adev->vce.vcpu_bo == NULL)
>>>>>>>>>  		return 0;
>>>>>>>>>  
>>>>>>>>> +	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>>>> +		memcpy_fromio(adev->vce.saved_bo, adev->vce.cpu_addr,
>>>>>>>>> +			      amdgpu_bo_size(adev->vce.vcpu_bo));
>>>>>>>>> +		drm_dev_exit(idx);
>>>>>>>>> +	}
>>>>>>>>> +
>>>>>>>>>  	for (i = 0; i < AMDGPU_MAX_VCE_HANDLES; ++i)
>>>>>>>>>  		if (atomic_read(&adev->vce.handles[i]))
>>>>>>>>>  			break;
>>>>>>>>> @@ -316,40 +329,17 @@ int amdgpu_vce_suspend(struct amdgpu_device *adev)
>>>>>>>>>   */
>>>>>>>>>  int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>>>>>  {
>>>>>>>>> -	void *cpu_addr;
>>>>>>>>> -	const struct common_firmware_header *hdr;
>>>>>>>>> -	unsigned int offset;
>>>>>>>>> -	int r, idx;
>>>>>>>>> +	int idx;
>>>>>>>>>  
>>>>>>>>>  	if (adev->vce.vcpu_bo == NULL)
>>>>>>>>>  		return -EINVAL;
>>>>>>>>>  
>>>>>>>>> -	r = amdgpu_bo_reserve(adev->vce.vcpu_bo, false);
>>>>>>>>> -	if (r) {
>>>>>>>>> -		dev_err(adev->dev, "(%d) failed to reserve VCE bo\n", r);
>>>>>>>>> -		return r;
>>>>>>>>> -	}
>>>>>>>>> -
>>>>>>>>> -	r = amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr);
>>>>>>>>> -	if (r) {
>>>>>>>>> -		amdgpu_bo_unreserve(adev->vce.vcpu_bo);
>>>>>>>>> -		dev_err(adev->dev, "(%d) VCE map failed\n", r);
>>>>>>>>> -		return r;
>>>>>>>>> -	}
>>>>>>>>> -
>>>>>>>>> -	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>>>>> -	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>>>>>> -
>>>>>>>>>  	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>>>> -		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>>>>>> -			    adev->vce.fw->size - offset);
>>>>>>>>> +		memcpy_toio(adev->vce.cpu_addr, adev->vce.saved_bo,
>>>>>>>>> +			    amdgpu_bo_size(adev->vce.vcpu_bo));
>>>>>>>>>  		drm_dev_exit(idx);
>>>>>>>>>  	}
>>>>>>>>>  
>>>>>>>>> -	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>>>>>>> -
>>>>>>>>> -	amdgpu_bo_unreserve(adev->vce.vcpu_bo);
>>>>>>>>> -
>>>>>>>>>  	return 0;
>>>>>>>>>  }
>>>>>>>>>  
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>>>>>> index 2d64002bed61..21b6656b2f41 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>>>>>> @@ -448,14 +448,8 @@ static int vce_v4_0_sw_init(struct amdgpu_ip_block *ip_block)
>>>>>>>>>  		return r;
>>>>>>>>>  
>>>>>>>>>  	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>>>> -		const struct common_firmware_header *hdr;
>>>>>>>>> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>>>>>> -
>>>>>>>>> -		adev->vce.saved_bo = kvmalloc(size, GFP_KERNEL);
>>>>>>>>> -		if (!adev->vce.saved_bo)
>>>>>>>>> -			return -ENOMEM;
>>>>>>>>> -
>>>>>>>>> -		hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>>>>> +		const struct common_firmware_header *hdr =
>>>>>>>>> +			(const struct common_firmware_header *)adev->vce.fw->data;
>>>>>>>>>  		adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].ucode_id = AMDGPU_UCODE_ID_VCE;
>>>>>>>>>  		adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].fw = adev->vce.fw;
>>>>>>>>>  		adev->firmware.fw_size +=
>>>>>>>>> @@ -506,11 +500,6 @@ static int vce_v4_0_sw_fini(struct amdgpu_ip_block *ip_block)
>>>>>>>>>  	/* free MM table */
>>>>>>>>>  	amdgpu_virt_free_mm_table(adev);
>>>>>>>>>  
>>>>>>>>> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>>>> -		kvfree(adev->vce.saved_bo);
>>>>>>>>> -		adev->vce.saved_bo = NULL;
>>>>>>>>> -	}
>>>>>>>>> -
>>>>>>>>>  	r = amdgpu_vce_suspend(adev);
>>>>>>>>>  	if (r)
>>>>>>>>>  		return r;
>>>>>>>>> @@ -561,20 +550,7 @@ static int vce_v4_0_hw_fini(struct amdgpu_ip_block *ip_block)
>>>>>>>>>  static int vce_v4_0_suspend(struct amdgpu_ip_block *ip_block)
>>>>>>>>>  {
>>>>>>>>>  	struct amdgpu_device *adev = ip_block->adev;
>>>>>>>>> -	int r, idx;
>>>>>>>>> -
>>>>>>>>> -	if (adev->vce.vcpu_bo == NULL)
>>>>>>>>> -		return 0;
>>>>>>>>> -
>>>>>>>>> -	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>>>> -		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>>>> -			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>>>>>> -			void *ptr = adev->vce.cpu_addr;
>>>>>>>>> -
>>>>>>>>> -			memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>>>>>> -		}
>>>>>>>>> -		drm_dev_exit(idx);
>>>>>>>>> -	}
>>>>>>>>> +	int r;
>>>>>>>>>  
>>>>>>>>>  	/*
>>>>>>>>>  	 * Proper cleanups before halting the HW engine:
>>>>>>>>> @@ -609,25 +585,11 @@ static int vce_v4_0_suspend(struct amdgpu_ip_block *ip_block)
>>>>>>>>>  static int vce_v4_0_resume(struct amdgpu_ip_block *ip_block)
>>>>>>>>>  {
>>>>>>>>>  	struct amdgpu_device *adev = ip_block->adev;
>>>>>>>>> -	int r, idx;
>>>>>>>>> -
>>>>>>>>> -	if (adev->vce.vcpu_bo == NULL)
>>>>>>>>> -		return -EINVAL;
>>>>>>>>> -
>>>>>>>>> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>>>> -
>>>>>>>>> -		if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>>>> -			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>>>>>> -			void *ptr = adev->vce.cpu_addr;
>>>>>>>>> -
>>>>>>>>> -			memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>>>>>> -			drm_dev_exit(idx);
>>>>>>>>> -		}
>>>>>>>>> -	} else {
>>>>>>>>> -		r = amdgpu_vce_resume(adev);
>>>>>>>>> -		if (r)
>>>>>>>>> -			return r;
>>>>>>>>> -	}
>>>>>>>>> +	r = amdgpu_vce_resume(adev);
>>>>>>>>> +	if (r)
>>>>>>>>> +		return r;
>>>>>>>>>  
>>>>>>>>>  	return vce_v4_0_hw_init(ip_block);
>>>>>>>>>  }
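[Editorial note: the save/restore scheme the patch implements can be sketched in plain userspace C. Everything below is a hypothetical stand-in, not driver code: `fake_vce`, `vcpu_mem` and `saved_bo` model `adev->vce.cpu_addr` and `adev->vce.saved_bo`, and plain `memcpy()` stands in for `memcpy_fromio()`/`memcpy_toio()` done under `drm_dev_enter()`/`drm_dev_exit()` in the real patch.]

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the VCE state touched by the patch:
 * vcpu_mem stands in for the pinned VCPU BO mapping (adev->vce.cpu_addr),
 * saved_bo for its system-RAM shadow (adev->vce.saved_bo). */
struct fake_vce {
	unsigned char *vcpu_mem;
	unsigned char *saved_bo;
	size_t size;
};

/* Mirrors the sw_init hunk: allocate the shadow alongside the BO. */
static int fake_vce_sw_init(struct fake_vce *v, size_t size)
{
	v->saved_bo = malloc(size);  /* kvmalloc() in the patch */
	v->vcpu_mem = malloc(size);  /* amdgpu_bo_create_kernel() in the patch */
	if (!v->saved_bo || !v->vcpu_mem)
		return -1;
	v->size = size;
	return 0;
}

/* Mirrors amdgpu_vce_suspend(): snapshot the whole BO into system RAM. */
static void fake_vce_suspend(struct fake_vce *v)
{
	memcpy(v->saved_bo, v->vcpu_mem, v->size);
}

/* Mirrors amdgpu_vce_resume(): write the snapshot back. This only
 * preserves sessions because the BO stays pinned, i.e. the firmware
 * sees the same GPU address before and after sleep. */
static void fake_vce_resume(struct fake_vce *v)
{
	memcpy(v->vcpu_mem, v->saved_bo, v->size);
}
```

The property the thread turns on is visible here: the restore is only valid because the BO remains pinned and mapped across the whole cycle, so the CPU address stays usable without re-reserving the BO.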
