RE: [PATCH v3] drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure
[Public]

Reviewed-by: Guchun Chen

Regards,
Guchun

-----Original Message-----
From: Shi, Leslie
Sent: Friday, December 17, 2021 10:26 AM
To: Grodzovsky, Andrey; Koenig, Christian; Pan, Xinhui; Deucher, Alexander; amd-gfx@lists.freedesktop.org
Cc: Chen, Guchun; Shi, Leslie
Subject: [PATCH v3] drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure

[Why]
In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it starts the error-handling path immediately and calls into amdgpu_device_unmap_mmio as well to release the mapped VRAM. However, in the following release callback, the driver still accesses the unmapped memory, such as vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini, so a kernel crash occurs.

[How]
Call amdgpu_device_unmap_mmio() only if the device is unplugged, to prevent an invalid memory access in vcn_v3_0_sw_fini() when GPU initialization fails.

Signed-off-by: Leslie Shi
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f31caec669e7..f6a47b927cfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3899,7 +3899,9 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)

 	amdgpu_gart_dummy_page_fini(adev);

-	amdgpu_device_unmap_mmio(adev);
+	if (drm_dev_is_unplugged(adev_to_drm(adev)))
+		amdgpu_device_unmap_mmio(adev);
+
 }

 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
--
2.25.1
[PATCH] drm/amdgpu: potential dereference of null pointer
The return value of dma_alloc_coherent() needs to be checked, to avoid use of a null pointer in memcpy_toio() in case the allocation fails.

Fixes: 57430471e2fa ("drm/amdgpu: Add support for USBC PD FW download")
Signed-off-by: Jiasheng Jiang
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a09483beb968..613e25bf87e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -3034,6 +3034,10 @@ static ssize_t psp_usbc_pd_fw_sysfs_write(struct device *dev,

 	/* We need contiguous physical mem to place the FW for psp to access */
 	cpu_addr = dma_alloc_coherent(adev->dev, usbc_pd_fw->size, &dma_addr, GFP_KERNEL);
+	if (!cpu_addr) {
+		ret = -ENOMEM;
+		goto rel_buf;
+	}

 	ret = dma_mapping_error(adev->dev, dma_addr);
 	if (ret)
--
2.25.1
Re: Re: Re: Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Dear Alex:

> Is the issue reproducible with the same board in bare metal on x86?
> Or does it only happen with passthrough on ARM?

Unfortunately, my current environment is not convenient for testing this GPU board on an x86 platform, but I can tell you the problem still occurs on ARM without passthrough to a virtual machine. In addition, at the end of 2020, my colleagues also found similar problems on MIPS platforms with Radeon R7 340 graphics chips. So I think it can happen regardless of whether the platform is x86, ARM or MIPS. I hope the above information is helpful to you, and I also think it would be better for users if we can root cause this issue.

Best regards.

Subject: Re: Re: Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Date: 2021-12-16 23:28
From: Alex Deucher
To: 周宗敏

Is the issue reproducible with the same board in bare metal on x86? Or does it only happen with passthrough on ARM? Looking through the archives, the SI patch I made was for an x86 laptop. It would be nice to root cause this, but there weren't any gfx8 boards with more than 64G of vram, so I think it's safe. That said, if you see similar issues with newer gfx IPs then we have an issue since the upper bit will be meaningful, so it would be nice to root cause this.

Alex

On Thu, Dec 16, 2021 at 4:36 AM 周宗敏 wrote:

Hi Christian,

I'm testing the GPU passthrough feature, so I pass this GPU through to a virtual machine. It is based on an arm64 system. As far as I know, Alex dealt with a similar problem in drm/radeon/si.c. Maybe they have the same cause? The historical commit message is below:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ca223b029a261e82fb2f50c52eb85d510f4260e

Thanks very much.

Subject: Re: Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Date: 2021-12-16 16:15
From: Christian König
To: 周宗敏; Alex Deucher

Hi Zongmin,

that strongly sounds like the ASIC is not correctly initialized when trying to read the register. What board and environment are you using this GPU with?
Is that a normal x86 system?

Regards,
Christian.

On 16.12.21 04:11, 周宗敏 wrote:

The problematic board that I have tested is [AMD/ATI] Lexa PRO [Radeon RX 550/550X], and the vbios version is 113-RXF9310-C09-BT. When an exception occurs, I can see changes in the vram size value read from RREG32(mmCONFIG_MEMSIZE); it seems to have garbage in the upper 16 bits, and I can also see dmesg output like below.

When the vram size register has garbage, we may see an error message like:

amdgpu :09:00.0: VRAM: 4286582784M 0x00F4 - 0x000FF8F4 (4286582784M used)

The correct message should look like:

amdgpu :09:00.0: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used)

If you have any problems, please send me mail. Thanks very much.

Subject: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Date: 2021-12-16 04:23
From: Alex Deucher
To: Zongmin Zhou

On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote:
>
> Some boards (like RX550) seem to have garbage in the upper
> 16 bits of the vram size register. Check for
> this and clamp the size properly. Fixes
> boards reporting bogus amounts of vram.
>
> After adding this patch, the maximum GPU VRAM size is 64GB;
> otherwise only 64GB vram size will be used.

Can you provide some examples of problematic boards and possibly a vbios image from the problematic board? What values are you seeing? It would be nice to see what the boards are reporting and whether the lower 16 bits are actually correct or if it is some other issue. This register is undefined until the asic has been initialized. The vbios programs it as part of its asic init sequence (either via vesa/gop or the OS driver).
Alex

> Signed-off-by: Zongmin Zhou
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index 492ebed2915b..63b890f1e8af 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -515,10 +515,10 @@ static void
[PATCH v3] drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure
[Why]
In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it starts the error-handling path immediately and calls into amdgpu_device_unmap_mmio as well to release the mapped VRAM. However, in the following release callback, the driver still accesses the unmapped memory, such as vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini, so a kernel crash occurs.

[How]
Call amdgpu_device_unmap_mmio() only if the device is unplugged, to prevent an invalid memory access in vcn_v3_0_sw_fini() when GPU initialization fails.

Signed-off-by: Leslie Shi
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f31caec669e7..f6a47b927cfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3899,7 +3899,9 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)

 	amdgpu_gart_dummy_page_fini(adev);

-	amdgpu_device_unmap_mmio(adev);
+	if (drm_dev_is_unplugged(adev_to_drm(adev)))
+		amdgpu_device_unmap_mmio(adev);
+
 }

 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
--
2.25.1
RE: [PATCH 3/4] drm/amdkfd: add reset queue function for RAS poison
[AMD Official Use Only]

OK, I'll rename it before submitting.

Regards,
Tao

> -----Original Message-----
> From: Zhang, Hawking
> Sent: Thursday, December 16, 2021 8:22 PM
> To: Zhou1, Tao; amd-gfx@lists.freedesktop.org; Yang, Stanley; Chai, Thomas; Kuehling, Felix
> Subject: RE: [PATCH 3/4] drm/amdkfd: add reset queue function for RAS poison
>
> > +
> > +	int (*unmap_queues_cpsch_poison)(struct device_queue_manager *dqm,
> > +			uint16_t pasid);
> > };
>
> Might be better to call it reset_queue directly (to match update_queue, create_queue, etc.)
>
> Others look good to me.
>
> The series (4 patches) is
>
> Reviewed-by: Hawking Zhang
>
> Regards,
> Hawking
>
> -----Original Message-----
> From: Zhou1, Tao
> Sent: Thursday, December 16, 2021 19:36
> To: amd-gfx@lists.freedesktop.org; Zhang, Hawking; Yang, Stanley; Chai, Thomas; Kuehling, Felix
> Cc: Zhou1, Tao
> Subject: [PATCH 3/4] drm/amdkfd: add reset queue function for RAS poison
>
> The new interface unmaps queues with reset mode for the process that consumes RAS poison; it's only for compute queues.
>
> Signed-off-by: Tao Zhou
> ---
>  .../drm/amd/amdkfd/kfd_device_queue_manager.c | 16 ++++++++++++++++
>  .../drm/amd/amdkfd/kfd_device_queue_manager.h |  5 +++++
>  2 files changed, 21 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 01a2cc3928ac..b4b0495c7024 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1476,6 +1476,21 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
>  	return retval;
>  }
>
> +/* only for compute queue */
> +static int unmap_queues_cpsch_poison(struct device_queue_manager *dqm,
> +			uint16_t pasid)
> +{
> +	int retval;
> +
> +	dqm_lock(dqm);
> +
> +	retval = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_BY_PASID,
> +			pasid, true);
> +
> +	dqm_unlock(dqm);
> +	return retval;
> +}
> +
>  /* dqm->lock mutex has to be locked before calling this function */
>  static int execute_queues_cpsch(struct device_queue_manager *dqm,
>  			enum kfd_unmap_queues_filter filter,
> @@ -1896,6 +1911,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>  		dqm->ops.evict_process_queues = evict_process_queues_cpsch;
>  		dqm->ops.restore_process_queues = restore_process_queues_cpsch;
>  		dqm->ops.get_wave_state = get_wave_state;
> +		dqm->ops.unmap_queues_cpsch_poison = unmap_queues_cpsch_poison;
>  		break;
>  	case KFD_SCHED_POLICY_NO_HWS:
>  		/* initialize dqm for no cp scheduling */
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> index 499fc0ea387f..19ec3e8859e8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> @@ -81,6 +81,8 @@ struct device_process_node {
>   *
>   * @get_wave_state: Retrieves context save state and optionally copies the
>   * control stack, if kept in the MQD, to the given userspace
> address.
> + *
> + * @unmap_queues_cpsch_poison: reset queue which consumes RAS poison
>   */
>
>  struct device_queue_manager_ops {
> @@ -134,6 +136,9 @@ struct device_queue_manager_ops {
>  			void __user *ctl_stack,
>  			u32 *ctl_stack_used_size,
>  			u32 *save_area_used_size);
> +
> +	int (*unmap_queues_cpsch_poison)(struct device_queue_manager *dqm,
> +			uint16_t pasid);
>  };
>
>  struct device_queue_manager_asic_ops {
> --
> 2.17.1
Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
On Thu, Dec 16, 2021 at 8:43 PM Quan, Evan wrote:
>
> [AMD Official Use Only]
>
> Hi Alex,
>
> Per our checking, vcn_v2 and vcn_v3 already have the dpm disablement (below)
> in their ->suspend routine, which should prevent them from the issue here.
>
> 	if (adev->pm.dpm_enabled)
> 		amdgpu_dpm_enable_uvd(adev, false);
>
> So, maybe it's a different story with those newer APUs. Can you forward the
> bug reports to me?

https://gitlab.freedesktop.org/drm/amd/-/issues/1712#note_1192187

Alex

> Thanks,
> Evan
>
> > -----Original Message-----
> > From: Alex Deucher
> > Sent: Thursday, December 16, 2021 11:38 PM
> > To: Quan, Evan
> > Cc: Zhu, James; Gong, Curry; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Liu, Leo
> > Subject: Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
> >
> > FWIW, it looks like all versions of VCN need the same fix. There have been
> > reports of suspend failing when VCN is in use on other newer APUs as well.
> >
> > Alex
> >
> > On Tue, Dec 14, 2021 at 12:59 AM Quan, Evan wrote:
> > >
> > > [AMD Official Use Only]
> > >
> > > From: Zhu, James
> > > Sent: Monday, December 13, 2021 9:39 PM
> > > To: Gong, Curry; Zhu, James; amd-gfx@lists.freedesktop.org
> > > Cc: Liu, Leo; Quan, Evan; Deucher, Alexander
> > > Subject: Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
> > >
> > > Hi Curry, Evan,
> > >
> > > It seems the vcn1.0 power gate sequence is special.
> > >
> > > If the original solution is thread safe, then the following solution will be
> > > thread safe also.
> > >
> > > static int vcn_v1_0_hw_fini(void *handle)
> > > {
> > > 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > >
> > > 	cancel_delayed_work_sync(&adev->vcn.idle_work);
> > >
> > > 	if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
> > > 		(adev->vcn.cur_state != AMD_PG_STATE_GATE &&
> > > 		 RREG32_SOC15(VCN, 0, mmUVD_STATUS))) {
> > > +		if (adev->pm.dpm_enabled)
> > > +			amdgpu_dpm_enable_uvd(adev, false);
> > > +		else
> > > +			vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
> > >
> > > [Quan, Evan] Considering adev->pm.dpm_enabled is always true,
> > > "vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);" will
> > > become dead code.
> > >
> > > Also, vcn_v1_0_hw_fini is also used in other scenarios (beyond the
> > > suspend/resume discussed here). So it seems quite risky compared with
> > > Curry's original patch.
> > >
> > > Before we can come up with a better idea, I would suggest landing Curry's
> > > original patch as a quick fix.
> > >
> > > BR
> > > Evan
> > >
> > > 	}
> > >
> > > Best Regards!
> > > James
> > >
> > > On 2021-12-13 3:55 a.m., Gong, Curry wrote:
> > >
> > > [AMD Official Use Only]
> > >
> > > Hi James:
> > >
> > > With the following patch, an error will be reported when the driver is loaded:
> > >
> > > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > @@ -1202,6 +1204,9 @@ static int vcn_v1_0_stop(struct amdgpu_device *adev)
> > > 	else
> > > 		r = vcn_v1_0_stop_spg_mode(adev);
> > >
> > > c
> > > 	return r;
> > > }
> > >
> > > $ dmesg
> > > [ 363.181081] INFO: task kworker/3:2:223 blocked for more than 120 seconds.
> > > [ 363.181150] Tainted: G OE 5.11.0-41-generic #45~20.04.1-Ubuntu
> > > [ 363.181208] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [ 363.181266] task:kworker/3:2 state:D stack:0 pid: 223 ppid: 2 flags:0x4000
> > > [ 363.181276] Workqueue: events vcn_v1_0_idle_work_handler [amdgpu]
> > > [ 363.181612] Call Trace:
> > > [ 363.181618] __schedule+0x44c/0x8a0
> > > [ 363.181627] schedule+0x4f/0xc0
> > > [ 363.181631] schedule_preempt_disabled+0xe/0x10
> > > [ 363.181636] __mutex_lock.isra.0+0x183/0x4d0
> > > [ 363.181643] __mutex_lock_slowpath+0x13/0x20
> > > [ 363.181648] mutex_lock+0x32/0x40
> > > [ 363.181652] amdgpu_dpm_set_powergating_by_smu+0x9c/0x180 [amdgpu]
> > > [ 363.182055] amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu]
> > > [ 363.182454] vcn_v1_0_set_powergating_state+0x2e7e/0x3cf0 [amdgpu]
> > > [ 363.182776] amdgpu_device_ip_set_powergating_state+0x6c/0xc0 [amdgpu]
> > > [ 363.183028] smu10_powergate_vcn+0x2a/0x80 [amdgpu]
> > > [ 363.183361] pp_set_powergating_by_smu+0xc5/0x2b0 [amdgpu]
> > > [ 363.183699] amdgpu_dpm_set_powergating_by_smu+0xb6/0x180 [amdgpu]
> > > [ 363.184040] amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu]
> > > [ 363.184391]
RE: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
[AMD Official Use Only]

Hi Alex,

Per our checking, vcn_v2 and vcn_v3 already have the dpm disablement (below)
in their ->suspend routine, which should prevent them from the issue here.

	if (adev->pm.dpm_enabled)
		amdgpu_dpm_enable_uvd(adev, false);

So, maybe it's a different story with those newer APUs. Can you forward the
bug reports to me?

Thanks,
Evan

> -----Original Message-----
> From: Alex Deucher
> Sent: Thursday, December 16, 2021 11:38 PM
> To: Quan, Evan
> Cc: Zhu, James; Gong, Curry; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Liu, Leo
> Subject: Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
>
> FWIW, it looks like all versions of VCN need the same fix. There have been
> reports of suspend failing when VCN is in use on other newer APUs as well.
>
> Alex
>
> On Tue, Dec 14, 2021 at 12:59 AM Quan, Evan wrote:
> >
> > [AMD Official Use Only]
> >
> > From: Zhu, James
> > Sent: Monday, December 13, 2021 9:39 PM
> > To: Gong, Curry; Zhu, James; amd-gfx@lists.freedesktop.org
> > Cc: Liu, Leo; Quan, Evan; Deucher, Alexander
> > Subject: Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
> >
> > Hi Curry, Evan,
> >
> > It seems the vcn1.0 power gate sequence is special.
> >
> > If the original solution is thread safe, then the following solution will be
> > thread safe also.
> >
> > static int vcn_v1_0_hw_fini(void *handle)
> > {
> > 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> >
> > 	cancel_delayed_work_sync(&adev->vcn.idle_work);
> >
> > 	if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
> > 		(adev->vcn.cur_state != AMD_PG_STATE_GATE &&
> > 		 RREG32_SOC15(VCN, 0, mmUVD_STATUS))) {
> > +		if (adev->pm.dpm_enabled)
> > +			amdgpu_dpm_enable_uvd(adev, false);
> > +		else
> > +			vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
> >
> > [Quan, Evan] Considering adev->pm.dpm_enabled is always true,
> > "vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);" will
> > become dead code.
> >
> > Also, vcn_v1_0_hw_fini is also used in other scenarios (beyond the
> > suspend/resume discussed here). So it seems quite risky compared with
> > Curry's original patch.
> >
> > Before we can come up with a better idea, I would suggest landing Curry's
> > original patch as a quick fix.
> >
> > BR
> > Evan
> >
> > 	}
> >
> > Best Regards!
> > James
> >
> > On 2021-12-13 3:55 a.m., Gong, Curry wrote:
> >
> > [AMD Official Use Only]
> >
> > Hi James:
> >
> > With the following patch, an error will be reported when the driver is loaded:
> >
> > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > @@ -1202,6 +1204,9 @@ static int vcn_v1_0_stop(struct amdgpu_device *adev)
> > 	else
> > 		r = vcn_v1_0_stop_spg_mode(adev);
> >
> > c
> > 	return r;
> > }
> >
> > $ dmesg
> > [ 363.181081] INFO: task kworker/3:2:223 blocked for more than 120 seconds.
> > [ 363.181150] Tainted: G OE 5.11.0-41-generic #45~20.04.1-Ubuntu
> > [ 363.181208] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 363.181266] task:kworker/3:2 state:D stack:0 pid: 223 ppid: 2 flags:0x4000
> > [ 363.181276] Workqueue: events vcn_v1_0_idle_work_handler [amdgpu]
> > [ 363.181612] Call Trace:
> > [ 363.181618] __schedule+0x44c/0x8a0
> > [ 363.181627] schedule+0x4f/0xc0
> > [ 363.181631] schedule_preempt_disabled+0xe/0x10
> > [ 363.181636] __mutex_lock.isra.0+0x183/0x4d0
> > [ 363.181643] __mutex_lock_slowpath+0x13/0x20
> > [ 363.181648] mutex_lock+0x32/0x40
> > [ 363.181652] amdgpu_dpm_set_powergating_by_smu+0x9c/0x180 [amdgpu]
> > [ 363.182055] amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu]
> > [ 363.182454] vcn_v1_0_set_powergating_state+0x2e7e/0x3cf0 [amdgpu]
> > [ 363.182776] amdgpu_device_ip_set_powergating_state+0x6c/0xc0 [amdgpu]
> > [ 363.183028] smu10_powergate_vcn+0x2a/0x80 [amdgpu]
> > [ 363.183361] pp_set_powergating_by_smu+0xc5/0x2b0 [amdgpu]
> > [ 363.183699] amdgpu_dpm_set_powergating_by_smu+0xb6/0x180 [amdgpu]
> > [ 363.184040] amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu]
> > [ 363.184391] vcn_v1_0_idle_work_handler+0xe1/0x130 [amdgpu]
> > [ 363.184667] process_one_work+0x220/0x3c0
> > [ 363.184674] worker_thread+0x4d/0x3f0
> > [ 363.184677] ? process_one_work+0x3c0/0x3c0
> > [ 363.184680] kthread+0x12b/0x150
> > [ 363.184685] ? set_kthread_struct+0x40/0x40
> > [ 363.184690] ret_from_fork+0x22/0x30
> > [ 363.184699] INFO: task kworker/2:2:233 blocked for more than 120 seconds.
> > [ 363.184739] Tainted: G OE
RE: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init
[AMD Official Use Only]

I will fix the indent issue. There shouldn't be any conflicts between this patch and the xgmi init changes. We've run with those xgmi init patches applied on top of mainline-dkms-5.13 (where this change is already present) and found no issues.

Thanks,
Victor

-----Original Message-----
From: Alex Deucher
Sent: Thursday, December 16, 2021 3:18 PM
To: Skvortsov, Victor
Cc: amd-gfx list; Deng, Emily; Liu, Monk; Ming, Davis; Liu, Shaoyun; Zhou, Peng Ju; Chen, JingWen; Chen, Horace; Nieto, David M
Subject: Re: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init

[CAUTION: External Email]

On Thu, Dec 16, 2021 at 2:43 PM Victor Skvortsov wrote:
>
> Driver needs to call get_xgmi_info() before ip_init to determine
> whether it needs to handle a pending hive reset.
>
> Signed-off-by: Victor Skvortsov
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c     | 6 ------
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 6 ------
>  3 files changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5bd785cfc5ca..4fd370016834 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3576,6 +3576,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> 	if (r)
> 		return r;
>
> +	/* Need to get xgmi info early to decide the reset behavior*/
> +	if (adev->gmc.xgmi.supported) {
> +		r = adev->gfxhub.funcs->get_xgmi_info(adev);
> +		if (r)

Indentation looks off here, please fix that. Also, will this conflict with Shaoyun's xgmi init patch which may or may not have landed? Other than that, these patches look pretty good to me.

Alex

> +			return r;
> +	}
> +
> 	/* enable PCIE atomic ops */
> 	if (amdgpu_sriov_vf(adev))
> 		adev->have_atomics_support = ((struct amd_sriov_msg_pf2vf_info *)
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index ae46eb35b3d7..3d5d47a799e3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -914,12 +914,6 @@ static int gmc_v10_0_sw_init(void *handle)
> 		return r;
> 	}
>
> -	if (adev->gmc.xgmi.supported) {
> -		r = adev->gfxhub.funcs->get_xgmi_info(adev);
> -		if (r)
> -			return r;
> -	}
> -
> 	r = gmc_v10_0_mc_init(adev);
> 	if (r)
> 		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 2b86c63b032a..57f2729a7bd0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -1628,12 +1628,6 @@ static int gmc_v9_0_sw_init(void *handle)
> 	}
> 	adev->need_swiotlb = drm_need_swiotlb(44);
>
> -	if (adev->gmc.xgmi.supported) {
> -		r = adev->gfxhub.funcs->get_xgmi_info(adev);
> -		if (r)
> -			return r;
> -	}
> -
> 	r = gmc_v9_0_mc_init(adev);
> 	if (r)
> 		return r;
> --
> 2.25.1
>
[pull] amdgpu, amdkfd, radeon drm-next-5.17
Hi Dave, Daniel,

More updates for 5.17.

The following changes since commit 3c021931023a30316db415044531b116b85e6ebd:

  drm/amdgpu: replace drm_detect_hdmi_monitor() with drm_display_info.is_hdmi (2021-12-07 13:13:07 -0500)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-5.17-2021-12-16

for you to fetch changes up to a342655865b2f14d1fbf346356d3b3360e63e872:

  drm/radeon: Fix syntax errors in comments (2021-12-14 16:11:02 -0500)

amdgpu:
- Add some display debugfs entries
- RAS fixes
- SR-IOV fixes
- W=1 fixes
- Documentation fixes
- IH timestamp fix
- Misc power fixes
- IP discovery fixes
- Large driver documentation updates
- Multi-GPU memory use reductions
- Misc display fixes and cleanups
- Add new SMU debug option

amdkfd:
- SVM fixes

radeon:
- Fix typo in comment

Anthony Koo (1):
      drm/amd/display: [FW Promotion] Release 0.0.97

Aric Cyr (1):
      drm/amd/display: 3.2.166

Aurabindo Pillai (1):
      drm/amd/display: Add feature flags to disable LTTPR

Eric Bernstein (1):
      drm/amd/display: ODM + MPO window on only one half of ODM

Evan Quan (2):
      drm/amdgpu: move smu_debug_mask to a more proper place
      drm/amdgpu: correct the wrong cached state for GMC on PICASSO

Fangzhi Zuo (1):
      drm/amd/display: Add Debugfs Entry to Force in SST Sequence

Felix Kuehling (4):
      drm/amdkfd: Fix error handling in svm_range_add
      drm/amdkfd: Fix svm_range_is_same_attrs
      drm/amdkfd: Don't split unchanged SVM ranges
      drm/amdkfd: Make KFD support on Hawaii experimental

Graham Sider (1):
      drm/amdkfd: add Navi2x to GWS init conditions

Guchun Chen (1):
      drm/amdgpu: use adev_to_drm to get drm_device pointer

Hawking Zhang (6):
      drm/amdgpu: add helper to load ip_discovery binary from file
      drm/amdgpu: rename discovery_read_binary helper
      drm/amdgpu: add helper to verify ip discovery binary signature
      drm/amdgpu: read and authenticate ip discovery binary
      drm/amdgpu: don't override default ECO_BITs setting
      drm/amdgpu: check df_funcs and its callback pointers

Isabella Basso (10):
      drm/amd: Mark IP_BASE definition as __maybe_unused
      drm/amd: fix improper docstring syntax
      drm/amdgpu: fix function scopes
      drm/amdkfd: fix function scopes
      drm/amd: append missing includes
      drm/amdgpu: fix location of prototype for amdgpu_kms_compat_ioctl
      drm/amdgpu: fix amdgpu_ras_mca_query_error_status scope
      drm/amdgpu: remove unnecessary variables
      drm/amdgpu: re-format file header comments
      drm/amd/display: fix function scopes

Jingwen Chen (2):
      drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV
      drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV

Jonathan Kim (1):
      drm/amdgpu: disable default navi2x co-op kernel support

Lang Yu (5):
      drm/amdgpu: remove power on/off SDMA in SMU hw_init/fini()
      drm/amdgpu: only hw fini SMU fisrt for ASICs need that
      drm/amdgpu: introduce a kind of halt state for amdgpu device
      drm/amdgpu: add support for SMU debug option
      drm/amd/pm: fix a potential gpu_metrics_table memory leak

Le Ma (1):
      drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE

Leslie Shi (2):
      drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()
      drm/amdgpu: fix incorrect VCN revision in SRIOV

Lijo Lazar (1):
      drm/amd/pm: Skip power state allocation

Mario Limonciello (4):
      drm/amd: add some extra checks that is_dig_enabled is defined
      drm/amd: move variable to local scope
      drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC
      drivers/amd/pm: drop statement to print FW version for smu_v13

Martin Leung (1):
      drm/amd/display: implement dc_mode_memclk

Michael Strauss (1):
      drm/amd/display: Force det buf size to 192KB with 3+ streams and upscaling

Michel Dänzer (2):
      drm/amd/display: Reduce stack size for dml31_ModeSupportAndSystemConfigurationFull
      drm/amd/display: Reduce stack size for dml31 UseMinimumDCFCLK

Mikita Lipski (1):
      drm/amd/display: parse and check PSR SU caps

Nicholas Kazlauskas (2):
      drm/amd/display: Set exit_optimized_pwr_state for DCN31
      drm/amd/display: Reset DMCUB before HW init

Philip Yang (3):
      drm/amdgpu: Handle fault with same timestamp
      drm/amdgpu: Detect if amdgpu in IOMMU direct map mode
      drm/amdgpu: Reduce SG bo memory usage for mGPUs

Rodrigo Siqueira (6):
      Documentation/gpu: Reorganize DC documentation
      Documentation/gpu: Document amdgpu_dm_visual_confirm debugfs entry
      Documentation/gpu: Document pipe split visual confirmation
      Documentation/gpu: How to collect DTN log
      Documentation/gpu: Add basic overview of DC pipeline
      Documentation/gpu: Add amdgpu and dc glossary

Solomon Chiu (1):
Re: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init
On Thu, Dec 16, 2021 at 2:43 PM Victor Skvortsov wrote:
>
> Driver needs to call get_xgmi_info() before ip_init
> to determine whether it needs to handle a pending hive reset.
>
> Signed-off-by: Victor Skvortsov
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c     | 6 ------
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 6 ------
>  3 files changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5bd785cfc5ca..4fd370016834 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3576,6 +3576,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> 	if (r)
> 		return r;
>
> +	/* Need to get xgmi info early to decide the reset behavior*/
> +	if (adev->gmc.xgmi.supported) {
> +		r = adev->gfxhub.funcs->get_xgmi_info(adev);
> +		if (r)

Indentation looks off here, please fix that. Also, will this conflict with Shaoyun's xgmi init patch which may or may not have landed? Other than that, these patches look pretty good to me.

Alex

> +			return r;
> +	}
> +
> 	/* enable PCIE atomic ops */
> 	if (amdgpu_sriov_vf(adev))
> 		adev->have_atomics_support = ((struct amd_sriov_msg_pf2vf_info *)
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index ae46eb35b3d7..3d5d47a799e3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -914,12 +914,6 @@ static int gmc_v10_0_sw_init(void *handle)
> 		return r;
> 	}
>
> -	if (adev->gmc.xgmi.supported) {
> -		r = adev->gfxhub.funcs->get_xgmi_info(adev);
> -		if (r)
> -			return r;
> -	}
> -
> 	r = gmc_v10_0_mc_init(adev);
> 	if (r)
> 		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 2b86c63b032a..57f2729a7bd0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -1628,12 +1628,6 @@ static int gmc_v9_0_sw_init(void *handle)
> 	}
> 	adev->need_swiotlb = drm_need_swiotlb(44);
>
> -	if (adev->gmc.xgmi.supported) {
> -		r = adev->gfxhub.funcs->get_xgmi_info(adev);
> -		if (r)
> -			return r;
> -	}
> -
> 	r = gmc_v9_0_mc_init(adev);
> 	if (r)
> 		return r;
> --
> 2.25.1
>
RE: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init
[AMD Official Use Only]

Reviewed by: shaoyun.liu

-----Original Message-----
From: Skvortsov, Victor
Sent: Thursday, December 16, 2021 2:43 PM
To: amd-gfx@lists.freedesktop.org; Deng, Emily; Liu, Monk; Ming, Davis; Liu, Shaoyun; Zhou, Peng Ju; Chen, JingWen; Chen, Horace; Nieto, David M
Cc: Skvortsov, Victor
Subject: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init

Driver needs to call get_xgmi_info() before ip_init to determine whether it needs to handle a pending hive reset.

Signed-off-by: Victor Skvortsov
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++++++
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c     | 6 ------
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 6 ------
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5bd785cfc5ca..4fd370016834 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3576,6 +3576,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	if (r)
 		return r;

+	/* Need to get xgmi info early to decide the reset behavior*/
+	if (adev->gmc.xgmi.supported) {
+		r = adev->gfxhub.funcs->get_xgmi_info(adev);
+		if (r)
+			return r;
+	}
+
 	/* enable PCIE atomic ops */
 	if (amdgpu_sriov_vf(adev))
 		adev->have_atomics_support = ((struct amd_sriov_msg_pf2vf_info *)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index ae46eb35b3d7..3d5d47a799e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -914,12 +914,6 @@ static int gmc_v10_0_sw_init(void *handle)
 		return r;
 	}

-	if (adev->gmc.xgmi.supported) {
-		r = adev->gfxhub.funcs->get_xgmi_info(adev);
-		if (r)
-			return r;
-	}
-
 	r = gmc_v10_0_mc_init(adev);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 2b86c63b032a..57f2729a7bd0 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1628,12 +1628,6 @@ static int gmc_v9_0_sw_init(void *handle)
 	}
 	adev->need_swiotlb = drm_need_swiotlb(44);

-	if (adev->gmc.xgmi.supported) {
-		r = adev->gfxhub.funcs->get_xgmi_info(adev);
-		if (r)
-			return r;
-	}
-
 	r = gmc_v9_0_mc_init(adev);
 	if (r)
 		return r;
--
2.25.1
Re: [PATCH v3 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Thursday, December 16, 2021 11:42 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Cc: Skvortsov, Victor Subject: [PATCH v3 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov Expand RLCG interface for new GC read & write commands. New interface will only be used if the PF enables the flag in pf2vf msg. v2: Added a description for the scratch registers Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 117 -- 1 file changed, 89 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index edb3e3b08eed..9189fb85a4dd 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -63,6 +63,13 @@ #define mmGCEA_PROBE_MAP 0x070c #define mmGCEA_PROBE_MAP_BASE_IDX 0 +#define GFX9_RLCG_GC_WRITE_OLD (0x8 << 28) +#define GFX9_RLCG_GC_WRITE (0x0 << 28) +#define GFX9_RLCG_GC_READ (0x1 << 28) +#define GFX9_RLCG_VFGATE_DISABLED 0x400 +#define GFX9_RLCG_WRONG_OPERATION_TYPE 0x200 +#define GFX9_RLCG_NOT_IN_RANGE 0x100 + MODULE_FIRMWARE("amdgpu/vega10_ce.bin"); MODULE_FIRMWARE("amdgpu/vega10_pfp.bin"); MODULE_FIRMWARE("amdgpu/vega10_me.bin"); @@ -739,7 +746,7 @@ static const u32 GFX_RLC_SRM_INDEX_CNTL_DATA_OFFSETS[] = mmRLC_SRM_INDEX_CNTL_DATA_7 - mmRLC_SRM_INDEX_CNTL_DATA_0, }; -static void gfx_v9_0_rlcg_w(struct amdgpu_device *adev, u32 offset, u32 v, u32 flag) +static u32 gfx_v9_0_rlcg_rw(struct amdgpu_device *adev, u32 offset, u32 v, uint32_t flag) { static void *scratch_reg0; static void *scratch_reg1; @@ -748,21 +755,20 @@ static void gfx_v9_0_rlcg_w(struct amdgpu_device *adev, u32 offset, u32 v, u32 f static void *spare_int; static uint32_t grbm_cntl; static uint32_t grbm_idx; + uint32_t i = 0; + uint32_t retries = 5; + u32 ret = 0; + u32 tmp; scratch_reg0 =
adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + mmSCRATCH_REG0)*4; scratch_reg1 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG1)*4; - scratch_reg2 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG2)*4; - scratch_reg3 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG3)*4; + scratch_reg2 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG2_BASE_IDX] + mmSCRATCH_REG2)*4; + scratch_reg3 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG3_BASE_IDX] + mmSCRATCH_REG3)*4; spare_int = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmRLC_SPARE_INT_BASE_IDX] + mmRLC_SPARE_INT)*4; grbm_cntl = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_CNTL_BASE_IDX] + mmGRBM_GFX_CNTL; grbm_idx = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_INDEX_BASE_IDX] + mmGRBM_GFX_INDEX; - if (amdgpu_sriov_runtime(adev)) { - pr_err("shouldn't call rlcg write register during runtime\n"); - return; - } - if (offset == grbm_cntl || offset == grbm_idx) { if (offset == grbm_cntl) writel(v, scratch_reg2); @@ -771,41 +777,95 @@ static void gfx_v9_0_rlcg_w(struct amdgpu_device *adev, u32 offset, u32 v, u32 f writel(v, ((void __iomem *)adev->rmmio) + (offset * 4)); } else { - uint32_t i = 0; - uint32_t retries = 5; - + /* +* SCRATCH_REG0 = read/write value +* SCRATCH_REG1[30:28] = command +* SCRATCH_REG1[19:0] = address in dword +* SCRATCH_REG1[26:24] = Error reporting +*/ writel(v, scratch_reg0); - writel(offset | 0x80000000, scratch_reg1); + writel(offset | flag, scratch_reg1); writel(1, spare_int); - for (i = 0; i < retries; i++) { - u32 tmp; + for (i = 0; i < retries; i++) { tmp = readl(scratch_reg1); - if (!(tmp & 0x80000000)) + if (!(tmp & flag)) break; udelay(10); } - if (i >= retries) - pr_err("timeout: rlcg program reg:0x%05x failed !\n", offset); + + if (i >= retries) { + if (amdgpu_sriov_reg_indirect_gc(adev)) { + if (tmp & GFX9_RLCG_VFGATE_DISABLED) +
Re: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Thursday, December 16, 2021 11:42 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Cc: Skvortsov, Victor Subject: [PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init Driver needs to call get_xgmi_info() before ip_init to determine whether it needs to handle a pending hive reset. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 6 -- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 -- 3 files changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 5bd785cfc5ca..4fd370016834 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3576,6 +3576,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (r) return r; + /* Need to get xgmi info early to decide the reset behavior*/ + if (adev->gmc.xgmi.supported) { + r = adev->gfxhub.funcs->get_xgmi_info(adev); + if (r) + return r; + } + /* enable PCIE atomic ops */ if (amdgpu_sriov_vf(adev)) adev->have_atomics_support = ((struct amd_sriov_msg_pf2vf_info *) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index ae46eb35b3d7..3d5d47a799e3 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -914,12 +914,6 @@ static int gmc_v10_0_sw_init(void *handle) return r; } - if (adev->gmc.xgmi.supported) { - r = adev->gfxhub.funcs->get_xgmi_info(adev); - if (r) - return r; - } - r = gmc_v10_0_mc_init(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 2b86c63b032a..57f2729a7bd0 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ 
-1628,12 +1628,6 @@ static int gmc_v9_0_sw_init(void *handle) } adev->need_swiotlb = drm_need_swiotlb(44); - if (adev->gmc.xgmi.supported) { - r = adev->gfxhub.funcs->get_xgmi_info(adev); - if (r) - return r; - } - r = gmc_v9_0_mc_init(adev); if (r) return r; -- 2.25.1
Re: [PATCH v3 3/5] drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Thursday, December 16, 2021 11:42 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Cc: Skvortsov, Victor Subject: [PATCH v3 3/5] drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov Modify GC register access from MMIO to RLCG if the indirect flag is set Signed-off-by: Victor Skvortsov --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index ddfe7aff919d..1abf662a0e91 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -166,7 +166,7 @@ int kgd_gfx_v9_init_interrupts(struct amdgpu_device *adev, uint32_t pipe_id) lock_srbm(adev, mec, pipe, 0, 0); - WREG32(SOC15_REG_OFFSET(GC, 0, mmCPC_INT_CNTL), + WREG32_SOC15(GC, 0, mmCPC_INT_CNTL, CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK | CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK); @@ -279,7 +279,7 @@ int kgd_gfx_v9_hqd_load(struct amdgpu_device *adev, void *mqd, lower_32_bits((uintptr_t)wptr)); WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_POLL_ADDR_HI), upper_32_bits((uintptr_t)wptr)); - WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_PQ_WPTR_POLL_CNTL1), + WREG32_SOC15(GC, 0, mmCP_PQ_WPTR_POLL_CNTL1, (uint32_t)get_queue_mask(adev, pipe_id, queue_id)); } @@ -488,13 +488,13 @@ bool kgd_gfx_v9_hqd_is_occupied(struct amdgpu_device *adev, uint32_t low, high; acquire_queue(adev, pipe_id, queue_id); - act = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE)); + act = RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE); if (act) { low = lower_32_bits(queue_address >> 8); high = upper_32_bits(queue_address >> 8); - if (low == RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_BASE)) && - high == RREG32(SOC15_REG_OFFSET(GC, 
0, mmCP_HQD_PQ_BASE_HI))) + if (low == RREG32_SOC15(GC, 0, mmCP_HQD_PQ_BASE) && + high == RREG32_SOC15(GC, 0, mmCP_HQD_PQ_BASE_HI)) retval = true; } release_queue(adev); @@ -556,7 +556,7 @@ int kgd_gfx_v9_hqd_destroy(struct amdgpu_device *adev, void *mqd, end_jiffies = (utimeout * HZ / 1000) + jiffies; while (true) { - temp = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE)); + temp = RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE); if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK)) break; if (time_after(jiffies, end_jiffies)) { @@ -645,7 +645,7 @@ int kgd_gfx_v9_wave_control_execute(struct amdgpu_device *adev, mutex_lock(&adev->grbm_idx_mutex); WREG32_SOC15_RLC_SHADOW(GC, 0, mmGRBM_GFX_INDEX, gfx_index_val); - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CMD), sq_cmd); + WREG32_SOC15(GC, 0, mmSQ_CMD, sq_cmd); data = REG_SET_FIELD(data, GRBM_GFX_INDEX, INSTANCE_BROADCAST_WRITES, 1); @@ -722,7 +722,7 @@ static void get_wave_count(struct amdgpu_device *adev, int queue_idx, pipe_idx = queue_idx / adev->gfx.mec.num_queue_per_pipe; queue_slot = queue_idx % adev->gfx.mec.num_queue_per_pipe; soc15_grbm_select(adev, 1, pipe_idx, queue_slot, 0); - reg_val = RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_CSQ_WF_ACTIVE_COUNT_0) + + reg_val = RREG32_SOC15_IP(GC, SOC15_REG_OFFSET(GC, 0, mmSPI_CSQ_WF_ACTIVE_COUNT_0) + queue_slot); *wave_cnt = reg_val & SPI_CSQ_WF_ACTIVE_COUNT_0__COUNT_MASK; if (*wave_cnt != 0) @@ -809,8 +809,7 @@ void kgd_gfx_v9_get_cu_occupancy(struct amdgpu_device *adev, int pasid, for (sh_idx = 0; sh_idx < sh_cnt; sh_idx++) { gfx_v9_0_select_se_sh(adev, se_idx, sh_idx, 0xffffffff); - queue_map = RREG32(SOC15_REG_OFFSET(GC, 0, - mmSPI_CSQ_WF_ACTIVE_STATUS)); + queue_map = RREG32_SOC15(GC, 0, mmSPI_CSQ_WF_ACTIVE_STATUS); /* * Assumption: queue map encodes following schema: four @@ -860,17 +859,17 @@ void kgd_gfx_v9_program_trap_handler_settings(struct amdgpu_device *adev, /* * Program TBA registers */ - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_SHADER_TBA_LO), + WREG32_SOC15(GC, 0, mmSQ_SHADER_TBA_LO,
lower_32_bits(tba_addr >> 8)); - WREG32(SOC15_REG_OFFSET(GC, 0,
Re: [PATCH v3 2/5] drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Thursday, December 16, 2021 11:42 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Cc: Skvortsov, Victor Subject: [PATCH v3 2/5] drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov Modify GC register access from MMIO to RLCG if the indirect flag is set v2: Replaced ternary operator with if-else for better readability Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 57 --- 1 file changed, 43 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index a5471923b3f6..2b86c63b032a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -478,9 +478,18 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev, hub = &adev->vmhub[j]; for (i = 0; i < 16; i++) { reg = hub->vm_context0_cntl + i; - tmp = RREG32(reg); + + if (j == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP(GC, reg); + else + tmp = RREG32_SOC15_IP(MMHUB, reg); + tmp &= ~bits; - WREG32(reg, tmp); + + if (j == AMDGPU_GFXHUB_0) + WREG32_SOC15_IP(GC, reg, tmp); + else + WREG32_SOC15_IP(MMHUB, reg, tmp); } } break; @@ -489,9 +498,18 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev, hub = &adev->vmhub[j]; for (i = 0; i < 16; i++) { reg = hub->vm_context0_cntl + i; - tmp = RREG32(reg); + + if (j == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP(GC, reg); + else + tmp = RREG32_SOC15_IP(MMHUB, reg); + tmp |= bits; - WREG32(reg, tmp); + + if (j == AMDGPU_GFXHUB_0) + WREG32_SOC15_IP(GC, reg, tmp); + else + WREG32_SOC15_IP(MMHUB, reg, tmp); } } break; @@ -788,9 +806,12 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, /* TODO: It needs to continue working on debugging with semaphore for GFXHUB as well. 
*/ if (use_semaphore) { for (j = 0; j < adev->usec_timeout; j++) { - /* a read return value of 1 means semaphore acuqire */ - tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_sem + - hub->eng_distance * eng); + /* a read return value of 1 means semaphore acquire */ + if (vmhub == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP_NO_KIQ(GC, hub->vm_inv_eng0_sem + hub->eng_distance * eng); + else + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, hub->vm_inv_eng0_sem + hub->eng_distance * eng); + if (tmp & 0x1) break; udelay(1); @@ -801,8 +822,10 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, } do { - WREG32_NO_KIQ(hub->vm_inv_eng0_req + - hub->eng_distance * eng, inv_req); + if (vmhub == AMDGPU_GFXHUB_0) + WREG32_SOC15_IP_NO_KIQ(GC, hub->vm_inv_eng0_req + hub->eng_distance * eng, inv_req); + else + WREG32_SOC15_IP_NO_KIQ(MMHUB, hub->vm_inv_eng0_req + hub->eng_distance * eng, inv_req); /* * Issue a dummy read to wait for the ACK register to @@ -815,8 +838,11 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, hub->eng_distance * eng); for (j = 0; j < adev->usec_timeout; j++) { - tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_ack + - hub->eng_distance * eng); + if (vmhub == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP_NO_KIQ(GC,
Re: [PATCH v3 1/5] drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Thursday, December 16, 2021 11:42 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Cc: Skvortsov, Victor Subject: [PATCH v3 1/5] drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions Add helper macros to change register access from direct to indirect. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index 8a9ca87d8663..473767e03676 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h +++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h @@ -51,6 +51,8 @@ #define RREG32_SOC15_IP(ip, reg) __RREG32_SOC15_RLC__(reg, 0, ip##_HWIP) +#define RREG32_SOC15_IP_NO_KIQ(ip, reg) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP) + #define RREG32_SOC15_NO_KIQ(ip, inst, reg) \ __RREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ AMDGPU_REGS_NO_KIQ, ip##_HWIP) @@ -65,6 +67,9 @@ #define WREG32_SOC15_IP(ip, reg, value) \ __WREG32_SOC15_RLC__(reg, value, 0, ip##_HWIP) +#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value) \ +__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP) + #define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \ __WREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ value, AMDGPU_REGS_NO_KIQ, ip##_HWIP) -- 2.25.1
[PATCH v3 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov
Expand RLCG interface for new GC read & write commands. New interface will only be used if the PF enables the flag in pf2vf msg. v2: Added a description for the scratch registers Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 117 -- 1 file changed, 89 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index edb3e3b08eed..9189fb85a4dd 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -63,6 +63,13 @@ #define mmGCEA_PROBE_MAP 0x070c #define mmGCEA_PROBE_MAP_BASE_IDX 0 +#define GFX9_RLCG_GC_WRITE_OLD (0x8 << 28) +#define GFX9_RLCG_GC_WRITE (0x0 << 28) +#define GFX9_RLCG_GC_READ (0x1 << 28) +#define GFX9_RLCG_VFGATE_DISABLED 0x400 +#define GFX9_RLCG_WRONG_OPERATION_TYPE 0x200 +#define GFX9_RLCG_NOT_IN_RANGE 0x100 + MODULE_FIRMWARE("amdgpu/vega10_ce.bin"); MODULE_FIRMWARE("amdgpu/vega10_pfp.bin"); MODULE_FIRMWARE("amdgpu/vega10_me.bin"); @@ -739,7 +746,7 @@ static const u32 GFX_RLC_SRM_INDEX_CNTL_DATA_OFFSETS[] = mmRLC_SRM_INDEX_CNTL_DATA_7 - mmRLC_SRM_INDEX_CNTL_DATA_0, }; -static void gfx_v9_0_rlcg_w(struct amdgpu_device *adev, u32 offset, u32 v, u32 flag) +static u32 gfx_v9_0_rlcg_rw(struct amdgpu_device *adev, u32 offset, u32 v, uint32_t flag) { static void *scratch_reg0; static void *scratch_reg1; @@ -748,21 +755,20 @@ static void gfx_v9_0_rlcg_w(struct amdgpu_device *adev, u32 offset, u32 v, u32 f static void *spare_int; static uint32_t grbm_cntl; static uint32_t grbm_idx; + uint32_t i = 0; + uint32_t retries = 5; + u32 ret = 0; + u32 tmp; scratch_reg0 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + mmSCRATCH_REG0)*4; scratch_reg1 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG1)*4; - scratch_reg2 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG2)*4; - scratch_reg3 = adev->rmmio + 
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG3)*4; + scratch_reg2 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG2_BASE_IDX] + mmSCRATCH_REG2)*4; + scratch_reg3 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG3_BASE_IDX] + mmSCRATCH_REG3)*4; spare_int = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmRLC_SPARE_INT_BASE_IDX] + mmRLC_SPARE_INT)*4; grbm_cntl = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_CNTL_BASE_IDX] + mmGRBM_GFX_CNTL; grbm_idx = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_INDEX_BASE_IDX] + mmGRBM_GFX_INDEX; - if (amdgpu_sriov_runtime(adev)) { - pr_err("shouldn't call rlcg write register during runtime\n"); - return; - } - if (offset == grbm_cntl || offset == grbm_idx) { if (offset == grbm_cntl) writel(v, scratch_reg2); @@ -771,41 +777,95 @@ static void gfx_v9_0_rlcg_w(struct amdgpu_device *adev, u32 offset, u32 v, u32 f writel(v, ((void __iomem *)adev->rmmio) + (offset * 4)); } else { - uint32_t i = 0; - uint32_t retries = 5; - + /* +* SCRATCH_REG0 = read/write value +* SCRATCH_REG1[30:28] = command +* SCRATCH_REG1[19:0] = address in dword +* SCRATCH_REG1[26:24] = Error reporting +*/ writel(v, scratch_reg0); - writel(offset | 0x80000000, scratch_reg1); + writel(offset | flag, scratch_reg1); writel(1, spare_int); - for (i = 0; i < retries; i++) { - u32 tmp; + for (i = 0; i < retries; i++) { tmp = readl(scratch_reg1); - if (!(tmp & 0x80000000)) + if (!(tmp & flag)) break; udelay(10); } - if (i >= retries) - pr_err("timeout: rlcg program reg:0x%05x failed !\n", offset); + + if (i >= retries) { + if (amdgpu_sriov_reg_indirect_gc(adev)) { + if (tmp & GFX9_RLCG_VFGATE_DISABLED) + pr_err("The vfgate is disabled, program reg:0x%05x failed!\n", offset); + else if (tmp & GFX9_RLCG_WRONG_OPERATION_TYPE) + pr_err("Wrong operation type, program reg:0x%05x failed!\n", offset); + else if (tmp & GFX9_RLCG_NOT_IN_RANGE) + pr_err("The
[PATCH v3 4/5] drm/amdgpu: get xgmi info before ip_init
Driver needs to call get_xgmi_info() before ip_init to determine whether it needs to handle a pending hive reset. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 6 -- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 -- 3 files changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 5bd785cfc5ca..4fd370016834 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3576,6 +3576,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (r) return r; + /* Need to get xgmi info early to decide the reset behavior*/ + if (adev->gmc.xgmi.supported) { + r = adev->gfxhub.funcs->get_xgmi_info(adev); + if (r) + return r; + } + /* enable PCIE atomic ops */ if (amdgpu_sriov_vf(adev)) adev->have_atomics_support = ((struct amd_sriov_msg_pf2vf_info *) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index ae46eb35b3d7..3d5d47a799e3 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -914,12 +914,6 @@ static int gmc_v10_0_sw_init(void *handle) return r; } - if (adev->gmc.xgmi.supported) { - r = adev->gfxhub.funcs->get_xgmi_info(adev); - if (r) - return r; - } - r = gmc_v10_0_mc_init(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 2b86c63b032a..57f2729a7bd0 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1628,12 +1628,6 @@ static int gmc_v9_0_sw_init(void *handle) } adev->need_swiotlb = drm_need_swiotlb(44); - if (adev->gmc.xgmi.supported) { - r = adev->gfxhub.funcs->get_xgmi_info(adev); - if (r) - return r; - } - r = gmc_v9_0_mc_init(adev); if (r) return r; -- 2.25.1
[PATCH v3 3/5] drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov
Modify GC register access from MMIO to RLCG if the indirect flag is set Signed-off-by: Victor Skvortsov --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index ddfe7aff919d..1abf662a0e91 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -166,7 +166,7 @@ int kgd_gfx_v9_init_interrupts(struct amdgpu_device *adev, uint32_t pipe_id) lock_srbm(adev, mec, pipe, 0, 0); - WREG32(SOC15_REG_OFFSET(GC, 0, mmCPC_INT_CNTL), + WREG32_SOC15(GC, 0, mmCPC_INT_CNTL, CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK | CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK); @@ -279,7 +279,7 @@ int kgd_gfx_v9_hqd_load(struct amdgpu_device *adev, void *mqd, lower_32_bits((uintptr_t)wptr)); WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_POLL_ADDR_HI), upper_32_bits((uintptr_t)wptr)); - WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_PQ_WPTR_POLL_CNTL1), + WREG32_SOC15(GC, 0, mmCP_PQ_WPTR_POLL_CNTL1, (uint32_t)get_queue_mask(adev, pipe_id, queue_id)); } @@ -488,13 +488,13 @@ bool kgd_gfx_v9_hqd_is_occupied(struct amdgpu_device *adev, uint32_t low, high; acquire_queue(adev, pipe_id, queue_id); - act = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE)); + act = RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE); if (act) { low = lower_32_bits(queue_address >> 8); high = upper_32_bits(queue_address >> 8); - if (low == RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_BASE)) && - high == RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_BASE_HI))) + if (low == RREG32_SOC15(GC, 0, mmCP_HQD_PQ_BASE) && + high == RREG32_SOC15(GC, 0, mmCP_HQD_PQ_BASE_HI)) retval = true; } release_queue(adev); @@ -556,7 +556,7 @@ int kgd_gfx_v9_hqd_destroy(struct amdgpu_device *adev, void *mqd, end_jiffies = (utimeout * HZ / 1000) + jiffies; while (true) { - temp = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE)); + 
temp = RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE); if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK)) break; if (time_after(jiffies, end_jiffies)) { @@ -645,7 +645,7 @@ int kgd_gfx_v9_wave_control_execute(struct amdgpu_device *adev, mutex_lock(&adev->grbm_idx_mutex); WREG32_SOC15_RLC_SHADOW(GC, 0, mmGRBM_GFX_INDEX, gfx_index_val); - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CMD), sq_cmd); + WREG32_SOC15(GC, 0, mmSQ_CMD, sq_cmd); data = REG_SET_FIELD(data, GRBM_GFX_INDEX, INSTANCE_BROADCAST_WRITES, 1); @@ -722,7 +722,7 @@ static void get_wave_count(struct amdgpu_device *adev, int queue_idx, pipe_idx = queue_idx / adev->gfx.mec.num_queue_per_pipe; queue_slot = queue_idx % adev->gfx.mec.num_queue_per_pipe; soc15_grbm_select(adev, 1, pipe_idx, queue_slot, 0); - reg_val = RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_CSQ_WF_ACTIVE_COUNT_0) + + reg_val = RREG32_SOC15_IP(GC, SOC15_REG_OFFSET(GC, 0, mmSPI_CSQ_WF_ACTIVE_COUNT_0) + queue_slot); *wave_cnt = reg_val & SPI_CSQ_WF_ACTIVE_COUNT_0__COUNT_MASK; if (*wave_cnt != 0) @@ -809,8 +809,7 @@ void kgd_gfx_v9_get_cu_occupancy(struct amdgpu_device *adev, int pasid, for (sh_idx = 0; sh_idx < sh_cnt; sh_idx++) { gfx_v9_0_select_se_sh(adev, se_idx, sh_idx, 0xffffffff); - queue_map = RREG32(SOC15_REG_OFFSET(GC, 0, - mmSPI_CSQ_WF_ACTIVE_STATUS)); + queue_map = RREG32_SOC15(GC, 0, mmSPI_CSQ_WF_ACTIVE_STATUS); /* * Assumption: queue map encodes following schema: four @@ -860,17 +859,17 @@ void kgd_gfx_v9_program_trap_handler_settings(struct amdgpu_device *adev, /* * Program TBA registers */ - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_SHADER_TBA_LO), + WREG32_SOC15(GC, 0, mmSQ_SHADER_TBA_LO, lower_32_bits(tba_addr >> 8)); - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_SHADER_TBA_HI), + WREG32_SOC15(GC, 0, mmSQ_SHADER_TBA_HI, upper_32_bits(tba_addr >> 8)); /* * Program TMA registers */ - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_SHADER_TMA_LO), + WREG32_SOC15(GC, 0, mmSQ_SHADER_TMA_LO, lower_32_bits(tma_addr >> 8)); - WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_SHADER_TMA_HI), + WREG32_SOC15(GC, 0, 
mmSQ_SHADER_TMA_HI,
[PATCH v3 2/5] drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov
Modify GC register access from MMIO to RLCG if the indirect flag is set v2: Replaced ternary operator with if-else for better readability Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 57 --- 1 file changed, 43 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index a5471923b3f6..2b86c63b032a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -478,9 +478,18 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev, hub = &adev->vmhub[j]; for (i = 0; i < 16; i++) { reg = hub->vm_context0_cntl + i; - tmp = RREG32(reg); + + if (j == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP(GC, reg); + else + tmp = RREG32_SOC15_IP(MMHUB, reg); + tmp &= ~bits; - WREG32(reg, tmp); + + if (j == AMDGPU_GFXHUB_0) + WREG32_SOC15_IP(GC, reg, tmp); + else + WREG32_SOC15_IP(MMHUB, reg, tmp); } } break; @@ -489,9 +498,18 @@ static int gmc_v9_0_vm_fault_interrupt_state(struct amdgpu_device *adev, hub = &adev->vmhub[j]; for (i = 0; i < 16; i++) { reg = hub->vm_context0_cntl + i; - tmp = RREG32(reg); + + if (j == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP(GC, reg); + else + tmp = RREG32_SOC15_IP(MMHUB, reg); + tmp |= bits; - WREG32(reg, tmp); + + if (j == AMDGPU_GFXHUB_0) + WREG32_SOC15_IP(GC, reg, tmp); + else + WREG32_SOC15_IP(MMHUB, reg, tmp); } } break; @@ -788,9 +806,12 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, /* TODO: It needs to continue working on debugging with semaphore for GFXHUB as well. 
*/ if (use_semaphore) { for (j = 0; j < adev->usec_timeout; j++) { - /* a read return value of 1 means semaphore acuqire */ - tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_sem + - hub->eng_distance * eng); + /* a read return value of 1 means semaphore acquire */ + if (vmhub == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP_NO_KIQ(GC, hub->vm_inv_eng0_sem + hub->eng_distance * eng); + else + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, hub->vm_inv_eng0_sem + hub->eng_distance * eng); + if (tmp & 0x1) break; udelay(1); @@ -801,8 +822,10 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, } do { - WREG32_NO_KIQ(hub->vm_inv_eng0_req + - hub->eng_distance * eng, inv_req); + if (vmhub == AMDGPU_GFXHUB_0) + WREG32_SOC15_IP_NO_KIQ(GC, hub->vm_inv_eng0_req + hub->eng_distance * eng, inv_req); + else + WREG32_SOC15_IP_NO_KIQ(MMHUB, hub->vm_inv_eng0_req + hub->eng_distance * eng, inv_req); /* * Issue a dummy read to wait for the ACK register to @@ -815,8 +838,11 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, hub->eng_distance * eng); for (j = 0; j < adev->usec_timeout; j++) { - tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_ack + - hub->eng_distance * eng); + if (vmhub == AMDGPU_GFXHUB_0) + tmp = RREG32_SOC15_IP_NO_KIQ(GC, hub->vm_inv_eng0_ack + hub->eng_distance * eng); + else + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, hub->vm_inv_eng0_ack + hub->eng_distance * eng); + if (tmp & (1 << vmid)) break; udelay(1); @@ -827,13 +853,16 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, } while (inv_req);
[PATCH v3 1/5] drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions
Add helper macros to change register access from direct to indirect. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index 8a9ca87d8663..473767e03676 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h +++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h @@ -51,6 +51,8 @@ #define RREG32_SOC15_IP(ip, reg) __RREG32_SOC15_RLC__(reg, 0, ip##_HWIP) +#define RREG32_SOC15_IP_NO_KIQ(ip, reg) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP) + #define RREG32_SOC15_NO_KIQ(ip, inst, reg) \ __RREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ AMDGPU_REGS_NO_KIQ, ip##_HWIP) @@ -65,6 +67,9 @@ #define WREG32_SOC15_IP(ip, reg, value) \ __WREG32_SOC15_RLC__(reg, value, 0, ip##_HWIP) +#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value) \ +__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP) + #define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \ __WREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ value, AMDGPU_REGS_NO_KIQ, ip##_HWIP) -- 2.25.1
[PATCH v3 0/5] *** GFX9 RLCG Interface modifications ***
This patchset introduces an expanded sriov RLCG interface. This interface will be used by Aldebaran in sriov mode for indirect GC register access during full access. v2: Added descriptions to scratch registers, and improved code readability. v3: Remove the RLC function pointer init change. Move xgmi_info call to earlier in init. Victor Skvortsov (5): drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov drm/amdgpu: get xgmi info before ip_init drm/amdgpu: Modify indirect register access for gfx9 sriov .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 27 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 7 ++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 117 +- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c| 6 - drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 63 +++--- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 5 + 6 files changed, 157 insertions(+), 68 deletions(-) -- 2.25.1
Re: [PATCH] drm/amdkfd: make SPDX License expression more sound
On Thu, Dec 16, 2021 at 11:14 AM Richard Fontana wrote: > > On Thu, Dec 16, 2021 at 4:45 AM Lukas Bulwahn wrote: > > > > Commit b5f57384805a ("drm/amdkfd: Add sysfs bitfields and enums to uAPI") > > adds include/uapi/linux/kfd_sysfs.h with the "GPL-2.0 OR MIT WITH > > Linux-syscall-note" SPDX-License expression. > > > > The command ./scripts/spdxcheck.py warns: > > > > include/uapi/linux/kfd_sysfs.h: 1:48 Exception not valid for license MIT: > > Linux-syscall-note > > > > For a uapi header, the file under GPLv2 License must be combined with the > > Linux-syscall-note, but combining the MIT License with the > > Linux-syscall-note makes no sense, as the note provides an exception for > > GPL-licensed code, not for permissively licensed code. > > > > So, reorganize the SPDX expression to only combine the note with the GPL > > License condition. This makes spdxcheck happy again. > > > > Signed-off-by: Lukas Bulwahn > > --- > > I am not a lawyer and I do not intend to modify the actual licensing of > > this header file. So, I really would like to have an Ack from some AMD > > developer here. > > > > Maybe also a lawyer on the linux-spdx list can check my reasoning on the > > licensing with the exception note? > > I believe "MIT WITH Linux-syscall-note" is a syntactically correct > SPDX expression but is otherwise sort of non-meaningful. > "(GPL-2.0 WITH Linux-syscall-note) OR MIT" is presumably what is > intended here. But yes would be good to get confirmation from someone > associated with AMD. Thanks Lukas, I agree that this is indeed clearer. +1 Reviewed-by: kstew...@linuxfoundation.org
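For reference, the reorganization discussed in this thread would leave the uapi header's first line as (assuming the file keeps its dual license):

```c
/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
```

so the syscall-note exception attaches only to the GPL-2.0 branch of the expression.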
[PATCH] drm/amd/display: fix dereference before NULL check
The "plane_state" pointer was accessed before checking if it was NULL. Avoid a possible NULL pointer dereference by accessing the plane address after the check. Addresses-Coverity-ID: 1474582 ("Dereference before null check") Fixes: 3f68c01be9a22 ("drm/amd/display: add cyan_skillfish display support") Signed-off-by: José Expósito --- drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c index cfd09b3f705e..fe22530242d2 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c @@ -134,11 +134,12 @@ void dcn201_update_plane_addr(const struct dc *dc, struct pipe_ctx *pipe_ctx) PHYSICAL_ADDRESS_LOC addr; struct dc_plane_state *plane_state = pipe_ctx->plane_state; struct dce_hwseq *hws = dc->hwseq; - struct dc_plane_address uma = plane_state->address; + struct dc_plane_address uma; if (plane_state == NULL) return; + uma = plane_state->address; addr_patched = patch_address_for_sbs_tb_stereo(pipe_ctx, ); plane_address_in_gpu_space_to_uma(hws, ); -- 2.25.1
Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers
[Public] If it needs to be in drm-next, please make sure it lands there. Alex From: amd-gfx on behalf of Liu, Shaoyun Sent: Thursday, December 16, 2021 12:51 PM To: Skvortsov, Victor ; Alex Deucher Cc: Ming, Davis ; Chen, JingWen ; amd-gfx list ; Deng, Emily ; Chen, Horace ; Liu, Monk ; Nieto, David M Subject: RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [AMD Official Use Only] Actually I don't know why the change " a35f147621bc drm/amdgpu: get xgmi info at eary_init " not in drm-next , instead it’s in amd-mainline-dkms-5.13. That change is necessary for passthrough XGMI hive to a VM and rely on our driver to do the reset on whole hive when driver is loaded . I checked the code again, it seems we should be ok as long as we get xgmi info at eary_init. So since gfx_v9_0_set_rlc_funcs() already gets called in gfx_v9_0_early_init(), we can move get xgmi info out of gmc_early_init and call it at the last step early_init . Regards Shaoyun.liu -Original Message- From: Skvortsov, Victor Sent: Thursday, December 16, 2021 9:28 AM To: Alex Deucher Cc: amd-gfx list ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Subject: RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [AMD Official Use Only] Gotcha, I will skip this patch for drm-next -Original Message- From: Alex Deucher Sent: Thursday, December 16, 2021 8:53 AM To: Skvortsov, Victor Cc: amd-gfx list ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [CAUTION: External Email] On Wed, Dec 15, 2021 at 6:58 PM Skvortsov, Victor wrote: > > [AMD Official Use Only] > > Hey Alex, > > This change was based on the fact that amd-mainline-dkms-5.13 calls > get_xgmi_info() in gmc_v9_0_early_init(). But I can see that drm-next it's > instead called in gmc_v9_0_sw_init(). 
So, I'm not sure whats the correct > behavior. But I do agree that the change is kind of ugly. I don't know where > else to put it if we do need to call get_xgmi_info() in early_init. > We could skip this patch for drm-next and just apply it to the dkms branch. There's already a lot of ugly stuff in there to deal with multiple kernel versions. Alex > Thanks, > Victor > > -Original Message- > From: Alex Deucher > Sent: Wednesday, December 15, 2021 4:38 PM > To: Skvortsov, Victor > Cc: amd-gfx list ; Deng, Emily > ; Liu, Monk ; Ming, Davis > ; Liu, Shaoyun ; Zhou, Peng > Ju ; Chen, JingWen ; Chen, > Horace ; Nieto, David M > Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function > pointers > > [CAUTION: External Email] > > On Wed, Dec 15, 2021 at 1:56 PM Victor Skvortsov > wrote: > > > > In SRIOV, RLC function pointers must be initialized early as we rely > > on the RLCG interface for all GC register access. > > > > Signed-off-by: Victor Skvortsov > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +-- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h | 2 ++ > > 3 files changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > index 65e1f6cc59dd..1bc92a38d124 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > @@ -844,6 +844,8 @@ static int amdgpu_discovery_set_gc_ip_blocks(struct > > amdgpu_device *adev) > > case IP_VERSION(9, 4, 1): > > case IP_VERSION(9, 4, 2): > > amdgpu_device_ip_block_add(adev, > > _v9_0_ip_block); > > + if (amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] > > == IP_VERSION(9, 4, 2)) > > + gfx_v9_0_set_rlc_funcs(adev); > > amdgpu_discovery.c is IP independent. I'd rather not add random IP specific > function calls. gfx_v9_0_set_rlc_funcs() already gets called in > gfx_v9_0_early_init(). 
Is that not early enough? In general we shouldn't be > touching the hardware much if at all in early_init. > > Alex > > > break; > > case IP_VERSION(10, 1, 10): > > case IP_VERSION(10, 1, 2): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > index edb3e3b08eed..d252b06efa43 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -816,7 +816,6 @@ static void gfx_v9_0_sriov_wreg(struct > > amdgpu_device *adev, u32 offset, static void > > gfx_v9_0_set_ring_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_irq_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_gds_init(struct amdgpu_device *adev); -static void > >
RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers
[AMD Official Use Only] Actually I don't know why the change " a35f147621bc drm/amdgpu: get xgmi info at eary_init " not in drm-next , instead it’s in amd-mainline-dkms-5.13. That change is necessary for passthrough XGMI hive to a VM and rely on our driver to do the reset on whole hive when driver is loaded . I checked the code again, it seems we should be ok as long as we get xgmi info at eary_init. So since gfx_v9_0_set_rlc_funcs() already gets called in gfx_v9_0_early_init(), we can move get xgmi info out of gmc_early_init and call it at the last step early_init . Regards Shaoyun.liu -Original Message- From: Skvortsov, Victor Sent: Thursday, December 16, 2021 9:28 AM To: Alex Deucher Cc: amd-gfx list ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Subject: RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [AMD Official Use Only] Gotcha, I will skip this patch for drm-next -Original Message- From: Alex Deucher Sent: Thursday, December 16, 2021 8:53 AM To: Skvortsov, Victor Cc: amd-gfx list ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [CAUTION: External Email] On Wed, Dec 15, 2021 at 6:58 PM Skvortsov, Victor wrote: > > [AMD Official Use Only] > > Hey Alex, > > This change was based on the fact that amd-mainline-dkms-5.13 calls > get_xgmi_info() in gmc_v9_0_early_init(). But I can see that drm-next it's > instead called in gmc_v9_0_sw_init(). So, I'm not sure whats the correct > behavior. But I do agree that the change is kind of ugly. I don't know where > else to put it if we do need to call get_xgmi_info() in early_init. > We could skip this patch for drm-next and just apply it to the dkms branch. There's already a lot of ugly stuff in there to deal with multiple kernel versions. 
Alex > Thanks, > Victor > > -Original Message- > From: Alex Deucher > Sent: Wednesday, December 15, 2021 4:38 PM > To: Skvortsov, Victor > Cc: amd-gfx list ; Deng, Emily > ; Liu, Monk ; Ming, Davis > ; Liu, Shaoyun ; Zhou, Peng > Ju ; Chen, JingWen ; Chen, > Horace ; Nieto, David M > Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function > pointers > > [CAUTION: External Email] > > On Wed, Dec 15, 2021 at 1:56 PM Victor Skvortsov > wrote: > > > > In SRIOV, RLC function pointers must be initialized early as we rely > > on the RLCG interface for all GC register access. > > > > Signed-off-by: Victor Skvortsov > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +-- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h | 2 ++ > > 3 files changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > index 65e1f6cc59dd..1bc92a38d124 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > @@ -844,6 +844,8 @@ static int amdgpu_discovery_set_gc_ip_blocks(struct > > amdgpu_device *adev) > > case IP_VERSION(9, 4, 1): > > case IP_VERSION(9, 4, 2): > > amdgpu_device_ip_block_add(adev, > > _v9_0_ip_block); > > + if (amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] > > == IP_VERSION(9, 4, 2)) > > + gfx_v9_0_set_rlc_funcs(adev); > > amdgpu_discovery.c is IP independent. I'd rather not add random IP specific > function calls. gfx_v9_0_set_rlc_funcs() already gets called in > gfx_v9_0_early_init(). Is that not early enough? In general we shouldn't be > touching the hardware much if at all in early_init. 
> > Alex > > > break; > > case IP_VERSION(10, 1, 10): > > case IP_VERSION(10, 1, 2): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > index edb3e3b08eed..d252b06efa43 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -816,7 +816,6 @@ static void gfx_v9_0_sriov_wreg(struct > > amdgpu_device *adev, u32 offset, static void > > gfx_v9_0_set_ring_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_irq_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_gds_init(struct amdgpu_device *adev); -static void > > gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev); static int > > gfx_v9_0_get_cu_info(struct amdgpu_device *adev, > > struct amdgpu_cu_info *cu_info); > > static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device > > *adev); @@ -7066,7 +7065,7 @@ static void gfx_v9_0_set_irq_funcs(struct > > amdgpu_device *adev) > > adev->gfx.cp_ecc_error_irq.funcs = > >
Re: [PATCH] drm/amdkfd: make SPDX License expression more sound
On Thu, Dec 16, 2021 at 4:45 AM Lukas Bulwahn wrote: > > Commit b5f57384805a ("drm/amdkfd: Add sysfs bitfields and enums to uAPI") > adds include/uapi/linux/kfd_sysfs.h with the "GPL-2.0 OR MIT WITH > Linux-syscall-note" SPDX-License expression. > > The command ./scripts/spdxcheck.py warns: > > include/uapi/linux/kfd_sysfs.h: 1:48 Exception not valid for license MIT: > Linux-syscall-note > > For a uapi header, the file under GPLv2 License must be combined with the > Linux-syscall-note, but combining the MIT License with the > Linux-syscall-note makes no sense, as the note provides an exception for > GPL-licensed code, not for permissively licensed code. > > So, reorganize the SPDX expression to only combine the note with the GPL > License condition. This makes spdxcheck happy again. > > Signed-off-by: Lukas Bulwahn > --- > I am not a lawyer and I do not intend to modify the actual licensing of > this header file. So, I really would like to have an Ack from some AMD > developer here. > > Maybe also a lawyer on the linux-spdx list can check my reasoning on the > licensing with the exception note? I believe "MIT WITH Linux-syscall-note" is a syntactically correct SPDX expression but is otherwise sort of non-meaningful. "(GPL-2.0 WITH Linux-syscall-note) OR MIT" is presumably what is intended here. But yes would be good to get confirmation from someone associated with AMD. Richard
Re: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure
Maybe we just should use drm_dev_is_unplugged() for this particular case because, there would be no race since when device is unplugged it's final. It's the other way around that requires strict drm_dev_enter/exit scope. Andrey On 2021-12-16 3:38 a.m., Christian König wrote: The !drm_dev_enter() is quite unusual and deserves a comment explaining what's going on here. Apart from that it looks good with the typos fixed I think. Christian. Am 16.12.21 um 08:27 schrieb Chen, Guchun: [Public] My BAD to misunderstand this. There are both spell typos in patch subject and body, s/iff/if. The patch is: Reviewed-by: Guchun Chen Please wait for the ack from Andrey and Christian before pushing this. Regards, Guchun -Original Message- From: Shi, Leslie Sent: Thursday, December 16, 2021 3:00 PM To: Chen, Guchun ; Grodzovsky, Andrey ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure [Public] Hi Guchun, As Andrey says, "we should not call amdgpu_device_unmap_mmio unless device is unplugged", I think we should call amdgpu_device_unmap_mmio() only if device is unplugged (drm_dev_enter() return false) . +if (!drm_dev_enter(adev_to_drm(adev), )) + amdgpu_device_unmap_mmio(adev); + else # drm_dev_exit(idx); Regards, Leslie -Original Message- From: Chen, Guchun Sent: Thursday, December 16, 2021 2:46 PM To: Shi, Leslie ; Grodzovsky, Andrey ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure [Public] Hi Leslie, I think we need to modify it like: +if (drm_dev_enter(adev_to_drm(adev), )) { + amdgpu_device_unmap_mmio(adev); + drm_dev_exit(idx); +} Also you need to credit Andrey a 'suggested-by' in your patch. 
Regards, Guchun -Original Message- From: Shi, Leslie Sent: Thursday, December 16, 2021 2:14 PM To: Grodzovsky, Andrey ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander ; amd-gfx@lists.freedesktop.org Cc: Chen, Guchun ; Shi, Leslie Subject: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure [Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns error during driver modprobe, it will start the error handle path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, driver stills visits the unmapped memory like vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini. So a kernel crash occurs. [How] call amdgpu_device_unmap_mmio() iff device is unplugged to prevent invalid memory address in vcn_v3_0_sw_fini() when GPU initialization failure. Signed-off-by: Leslie Shi --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index fb03d75880ec..d3656e7b60c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3845,6 +3845,8 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev) */ void amdgpu_device_fini_hw(struct amdgpu_device *adev) { + int idx; + dev_info(adev->dev, "amdgpu: finishing device.\n"); flush_delayed_work(>delayed_init_work); if (adev->mman.initialized) { @@ -3888,7 +3890,11 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) amdgpu_gart_dummy_page_fini(adev); - amdgpu_device_unmap_mmio(adev); + if (!drm_dev_enter(adev_to_drm(adev), )) + amdgpu_device_unmap_mmio(adev); + else + drm_dev_exit(idx); + } void amdgpu_device_fini_sw(struct amdgpu_device *adev) -- 2.25.1
RE: [PATCH] drm/amdgpu: add support for IP discovery gc_info table v2
[AMD Official Use Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Thursday, December 16, 2021 11:18 To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu: add support for IP discovery gc_info table v2 Used on gfx9 based systems. Fixes incorrect CU counts reported in the kernel log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 76 +-- drivers/gpu/drm/amd/include/discovery.h | 49 2 files changed, 103 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index ea00090b3fb3..bcc9343353b5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -526,10 +526,15 @@ void amdgpu_discovery_harvest_ip(struct amdgpu_device *adev) } } +union gc_info { + struct gc_info_v1_0 v1; + struct gc_info_v2_0 v2; +}; + int amdgpu_discovery_get_gfx_info(struct amdgpu_device *adev) { struct binary_header *bhdr; - struct gc_info_v1_0 *gc_info; + union gc_info *gc_info; if (!adev->mman.discovery_bin) { DRM_ERROR("ip discovery uninitialized\n"); @@ -537,28 +542,55 @@ int amdgpu_discovery_get_gfx_info(struct amdgpu_device *adev) } bhdr = (struct binary_header *)adev->mman.discovery_bin; - gc_info = (struct gc_info_v1_0 *)(adev->mman.discovery_bin + + gc_info = (union gc_info *)(adev->mman.discovery_bin + le16_to_cpu(bhdr->table_list[GC].offset)); - - adev->gfx.config.max_shader_engines =
le32_to_cpu(gc_info->gc_num_se); - adev->gfx.config.max_cu_per_sh = 2 * (le32_to_cpu(gc_info->gc_num_wgp0_per_sa) + - le32_to_cpu(gc_info->gc_num_wgp1_per_sa)); - adev->gfx.config.max_sh_per_se = le32_to_cpu(gc_info->gc_num_sa_per_se); - adev->gfx.config.max_backends_per_se = le32_to_cpu(gc_info->gc_num_rb_per_se); - adev->gfx.config.max_texture_channel_caches = le32_to_cpu(gc_info->gc_num_gl2c); - adev->gfx.config.max_gprs = le32_to_cpu(gc_info->gc_num_gprs); - adev->gfx.config.max_gs_threads = le32_to_cpu(gc_info->gc_num_max_gs_thds); - adev->gfx.config.gs_vgt_table_depth = le32_to_cpu(gc_info->gc_gs_table_depth); - adev->gfx.config.gs_prim_buffer_depth = le32_to_cpu(gc_info->gc_gsprim_buff_depth); - adev->gfx.config.double_offchip_lds_buf = le32_to_cpu(gc_info->gc_double_offchip_lds_buffer); - adev->gfx.cu_info.wave_front_size = le32_to_cpu(gc_info->gc_wave_size); - adev->gfx.cu_info.max_waves_per_simd = le32_to_cpu(gc_info->gc_max_waves_per_simd); - adev->gfx.cu_info.max_scratch_slots_per_cu = le32_to_cpu(gc_info->gc_max_scratch_slots_per_cu); - adev->gfx.cu_info.lds_size = le32_to_cpu(gc_info->gc_lds_size); - adev->gfx.config.num_sc_per_sh = le32_to_cpu(gc_info->gc_num_sc_per_se) / -le32_to_cpu(gc_info->gc_num_sa_per_se); - adev->gfx.config.num_packer_per_sc = le32_to_cpu(gc_info->gc_num_packer_per_sc); - + switch (gc_info->v1.header.version_major) { + case 1: + adev->gfx.config.max_shader_engines = le32_to_cpu(gc_info->v1.gc_num_se); + adev->gfx.config.max_cu_per_sh = 2 * (le32_to_cpu(gc_info->v1.gc_num_wgp0_per_sa) + + le32_to_cpu(gc_info->v1.gc_num_wgp1_per_sa)); + adev->gfx.config.max_sh_per_se = le32_to_cpu(gc_info->v1.gc_num_sa_per_se); + adev->gfx.config.max_backends_per_se = le32_to_cpu(gc_info->v1.gc_num_rb_per_se); + adev->gfx.config.max_texture_channel_caches = le32_to_cpu(gc_info->v1.gc_num_gl2c); + adev->gfx.config.max_gprs = le32_to_cpu(gc_info->v1.gc_num_gprs); + adev->gfx.config.max_gs_threads = le32_to_cpu(gc_info->v1.gc_num_max_gs_thds); + 
adev->gfx.config.gs_vgt_table_depth = le32_to_cpu(gc_info->v1.gc_gs_table_depth); + adev->gfx.config.gs_prim_buffer_depth = le32_to_cpu(gc_info->v1.gc_gsprim_buff_depth); + adev->gfx.config.double_offchip_lds_buf = le32_to_cpu(gc_info->v1.gc_double_offchip_lds_buffer); + adev->gfx.cu_info.wave_front_size = le32_to_cpu(gc_info->v1.gc_wave_size); + adev->gfx.cu_info.max_waves_per_simd =
RE: [PATCH v2] drm/amdgpu: Separate vf2pf work item init from virt data exchange
[AMD Official Use Only] This one looks better and more logical . Reviewed By :Shaoyun.liu -Original Message- From: Skvortsov, Victor Sent: Thursday, December 16, 2021 10:39 AM To: amd-gfx@lists.freedesktop.org; Liu, Shaoyun ; Nieto, David M Cc: Skvortsov, Victor Subject: [PATCH v2] drm/amdgpu: Separate vf2pf work item init from virt data exchange We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 36 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 1 + 3 files changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 48aeca3b8f16..ddc67b900587 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2316,6 +2316,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) /* need to do gmc hw init early so we can allocate gpu mem */ if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) { + /* Try to reserve bad pages early */ + if (amdgpu_sriov_vf(adev)) + amdgpu_virt_exchange_data(adev); + r = amdgpu_device_vram_scratch_init(adev); if (r) { DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r); @@ -2347,7 +2351,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) } if (amdgpu_sriov_vf(adev)) - amdgpu_virt_init_data_exchange(adev); + amdgpu_virt_exchange_data(adev); r = amdgpu_ib_pool_init(adev); if (r) { diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 3fc49823f527..f8e574cc0e22 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -622,17 +622,35 @@ void amdgpu_virt_fini_data_exchange(struct amdgpu_device *adev) void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev) { - uint64_t bp_block_offset = 0; - uint32_t bp_block_size = 0; - struct amd_sriov_msg_pf2vf_info *pf2vf_v2 = NULL; - adev->virt.fw_reserve.p_pf2vf = NULL; adev->virt.fw_reserve.p_vf2pf = NULL; adev->virt.vf2pf_update_interval_ms = 0; - if (adev->mman.fw_vram_usage_va != NULL) { + if (adev->bios != NULL) { adev->virt.vf2pf_update_interval_ms = 2000; + adev->virt.fw_reserve.p_pf2vf = + (struct amd_sriov_msg_pf2vf_info_header *) + (adev->bios + (AMD_SRIOV_MSG_PF2VF_OFFSET_KB << 10)); + + amdgpu_virt_read_pf2vf_data(adev); + } + + if (adev->virt.vf2pf_update_interval_ms != 0) { + INIT_DELAYED_WORK(>virt.vf2pf_work, amdgpu_virt_update_vf2pf_work_item); + schedule_delayed_work(&(adev->virt.vf2pf_work), msecs_to_jiffies(adev->virt.vf2pf_update_interval_ms)); + } +} + + +void amdgpu_virt_exchange_data(struct amdgpu_device *adev) { + uint64_t bp_block_offset = 0; + uint32_t bp_block_size = 0; + struct amd_sriov_msg_pf2vf_info *pf2vf_v2 = NULL; + + if (adev->mman.fw_vram_usage_va != NULL) { + adev->virt.fw_reserve.p_pf2vf = (struct amd_sriov_msg_pf2vf_info_header *) (adev->mman.fw_vram_usage_va + (AMD_SRIOV_MSG_PF2VF_OFFSET_KB << 10)); @@ -663,16 +681,10 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev) (adev->bios + (AMD_SRIOV_MSG_PF2VF_OFFSET_KB << 10)); amdgpu_virt_read_pf2vf_data(adev); - - return; - } - - if (adev->virt.vf2pf_update_interval_ms != 0) { - INIT_DELAYED_WORK(>virt.vf2pf_work, amdgpu_virt_update_vf2pf_work_item); - schedule_delayed_work(&(adev->virt.vf2pf_work), adev->virt.vf2pf_update_interval_ms); } } + void amdgpu_detect_virtualization(struct amdgpu_device *adev) { uint32_t 
reg; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h index 8d4c20bb71c5..9adfb8d63280 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h @@ -308,6 +308,7 @@ int
[PATCH v2] drm/amdgpu: Separate vf2pf work item init from virt data exchange
We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 36 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 1 + 3 files changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 48aeca3b8f16..ddc67b900587 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2316,6 +2316,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) /* need to do gmc hw init early so we can allocate gpu mem */ if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) { + /* Try to reserve bad pages early */ + if (amdgpu_sriov_vf(adev)) + amdgpu_virt_exchange_data(adev); + r = amdgpu_device_vram_scratch_init(adev); if (r) { DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r); @@ -2347,7 +2351,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) } if (amdgpu_sriov_vf(adev)) - amdgpu_virt_init_data_exchange(adev); + amdgpu_virt_exchange_data(adev); r = amdgpu_ib_pool_init(adev); if (r) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 3fc49823f527..f8e574cc0e22 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -622,17 +622,35 @@ void amdgpu_virt_fini_data_exchange(struct amdgpu_device *adev) void amdgpu_virt_init_data_exchange(struct amdgpu_device 
*adev) { - uint64_t bp_block_offset = 0; - uint32_t bp_block_size = 0; - struct amd_sriov_msg_pf2vf_info *pf2vf_v2 = NULL; - adev->virt.fw_reserve.p_pf2vf = NULL; adev->virt.fw_reserve.p_vf2pf = NULL; adev->virt.vf2pf_update_interval_ms = 0; - if (adev->mman.fw_vram_usage_va != NULL) { + if (adev->bios != NULL) { adev->virt.vf2pf_update_interval_ms = 2000; + adev->virt.fw_reserve.p_pf2vf = + (struct amd_sriov_msg_pf2vf_info_header *) + (adev->bios + (AMD_SRIOV_MSG_PF2VF_OFFSET_KB << 10)); + + amdgpu_virt_read_pf2vf_data(adev); + } + + if (adev->virt.vf2pf_update_interval_ms != 0) { + INIT_DELAYED_WORK(>virt.vf2pf_work, amdgpu_virt_update_vf2pf_work_item); + schedule_delayed_work(&(adev->virt.vf2pf_work), msecs_to_jiffies(adev->virt.vf2pf_update_interval_ms)); + } +} + + +void amdgpu_virt_exchange_data(struct amdgpu_device *adev) +{ + uint64_t bp_block_offset = 0; + uint32_t bp_block_size = 0; + struct amd_sriov_msg_pf2vf_info *pf2vf_v2 = NULL; + + if (adev->mman.fw_vram_usage_va != NULL) { + adev->virt.fw_reserve.p_pf2vf = (struct amd_sriov_msg_pf2vf_info_header *) (adev->mman.fw_vram_usage_va + (AMD_SRIOV_MSG_PF2VF_OFFSET_KB << 10)); @@ -663,16 +681,10 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev) (adev->bios + (AMD_SRIOV_MSG_PF2VF_OFFSET_KB << 10)); amdgpu_virt_read_pf2vf_data(adev); - - return; - } - - if (adev->virt.vf2pf_update_interval_ms != 0) { - INIT_DELAYED_WORK(>virt.vf2pf_work, amdgpu_virt_update_vf2pf_work_item); - schedule_delayed_work(&(adev->virt.vf2pf_work), adev->virt.vf2pf_update_interval_ms); } } + void amdgpu_detect_virtualization(struct amdgpu_device *adev) { uint32_t reg; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h index 8d4c20bb71c5..9adfb8d63280 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h @@ -308,6 +308,7 @@ int amdgpu_virt_alloc_mm_table(struct amdgpu_device *adev); void amdgpu_virt_free_mm_table(struct 
amdgpu_device *adev); void amdgpu_virt_release_ras_err_handler_data(struct amdgpu_device *adev); void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev); +void amdgpu_virt_exchange_data(struct amdgpu_device *adev); void amdgpu_virt_fini_data_exchange(struct amdgpu_device *adev); void
Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
FWIW, it looks like all versions of VCN need the same fix. There have been reports of suspend failing when VCN is in use on other newer APUs as well. Alex On Tue, Dec 14, 2021 at 12:59 AM Quan, Evan wrote: > > [AMD Official Use Only] > > > > > > > > From: Zhu, James > Sent: Monday, December 13, 2021 9:39 PM > To: Gong, Curry ; Zhu, James ; > amd-gfx@lists.freedesktop.org > Cc: Liu, Leo ; Quan, Evan ; Deucher, > Alexander > Subject: Re: [PATCH] drm/amdgpu: When the VCN(1.0) block is suspended, > powergating is explicitly enabled > > > > Hi Curry, Evan, > > It seems vcn1.0 power gate sequence are special. > > if the original solution is thread safe, then the following solution will be > thread safe also. > > static int vcn_v1_0_hw_fini(void *handle) > { > struct amdgpu_device *adev = (struct amdgpu_device *)handle; > > cancel_delayed_work_sync(>vcn.idle_work); > > if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || > (adev->vcn.cur_state != AMD_PG_STATE_GATE && > RREG32_SOC15(VCN, 0, mmUVD_STATUS))) { > +if (adev->pm.dpm_enabled) > +amdgpu_dpm_enable_uvd(adev, false); > +else > +vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE); > > [Quan, Evan] Considering adev->pm.dpm_enabled is always true, > “vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE); “ will become dead > code. > > Also, the vcn_v1_0_hw_fini is also used in other scenarios(except the > suspend/resume discussed here). So, it seems quite risky compared with > Curry’s original patch. > > Before we can come up with a better idea, I would suggest to land Curry’s > original patch as a quick fix. > > > > BR > > Evan > > > } > > Best Regards! 
> > James > > On 2021-12-13 3:55 a.m., Gong, Curry wrote: > > [AMD Official Use Only] > > > > Hi James: > > > > With the following patch, an error will be reported when the driver is loaded > > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c > > @@ -1202,6 +1204,9 @@ static int vcn_v1_0_stop(struct amdgpu_device *adev) > > else > > r = vcn_v1_0_stop_spg_mode(adev); > > > > c > > return r; > > } > > > > > > $ dmesg > > [ 363.181081] INFO: task kworker/3:2:223 blocked for more than 120 seconds. > > [ 363.181150] Tainted: G OE 5.11.0-41-generic > #45~20.04.1-Ubuntu > > [ 363.181208] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > > [ 363.181266] task:kworker/3:2 state:D stack:0 pid: 223 ppid: 2 > flags:0x4000 > > [ 363.181276] Workqueue: events vcn_v1_0_idle_work_handler [amdgpu] > > [ 363.181612] Call Trace: > > [ 363.181618] __schedule+0x44c/0x8a0 > > [ 363.181627] schedule+0x4f/0xc0 > > [ 363.181631] schedule_preempt_disabled+0xe/0x10 > > [ 363.181636] __mutex_lock.isra.0+0x183/0x4d0 > > [ 363.181643] __mutex_lock_slowpath+0x13/0x20 > > [ 363.181648] mutex_lock+0x32/0x40 > > [ 363.181652] amdgpu_dpm_set_powergating_by_smu+0x9c/0x180 [amdgpu] > > [ 363.182055] amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu] > > [ 363.182454] vcn_v1_0_set_powergating_state+0x2e7e/0x3cf0 [amdgpu] > > [ 363.182776] amdgpu_device_ip_set_powergating_state+0x6c/0xc0 [amdgpu] > > [ 363.183028] smu10_powergate_vcn+0x2a/0x80 [amdgpu] > > [ 363.183361] pp_set_powergating_by_smu+0xc5/0x2b0 [amdgpu] > > [ 363.183699] amdgpu_dpm_set_powergating_by_smu+0xb6/0x180 [amdgpu] > > [ 363.184040] amdgpu_dpm_enable_uvd+0x38/0x110 [amdgpu] > > [ 363.184391] vcn_v1_0_idle_work_handler+0xe1/0x130 [amdgpu] > > [ 363.184667] process_one_work+0x220/0x3c0 > > [ 363.184674] worker_thread+0x4d/0x3f0 > > [ 363.184677] ? process_one_work+0x3c0/0x3c0 > > [ 363.184680] kthread+0x12b/0x150 > > [ 363.184685] ? 
set_kthread_struct+0x40/0x40 > > [ 363.184690] ret_from_fork+0x22/0x30 > > [ 363.184699] INFO: task kworker/2:2:233 blocked for more than 120 seconds. > > [ 363.184739] Tainted: G OE 5.11.0-41-generic > #45~20.04.1-Ubuntu > > [ 363.184782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > > [ 363.184825] task:kworker/2:2 state:D stack:0 pid: 233 ppid: 2 > flags:0x4000 > > [ 363.184831] Workqueue: events amdgpu_device_delayed_init_work_handler > [amdgpu] > > [ 363.185085] Call Trace: > > [ 363.185087] __schedule+0x44c/0x8a0 > > [ 363.185092] schedule+0x4f/0xc0 > > [ 363.185095] schedule_timeout+0x202/0x290 > > [ 363.185099] ? sched_clock_cpu+0x11/0xb0 > > [ 363.185105] wait_for_completion+0x94/0x100 > > [ 363.185110] __flush_work+0x12a/0x1e0 > > [ 363.185113] ? worker_detach_from_pool+0xc0/0xc0 > > [ 363.185119] __cancel_work_timer+0x10e/0x190 > > [ 363.185123] cancel_delayed_work_sync+0x13/0x20 > > [ 363.185126] vcn_v1_0_ring_begin_use+0x20/0x70 [amdgpu] > > [ 363.185401] amdgpu_ring_alloc+0x48/0x60 [amdgpu] > > [ 363.185640] amdgpu_ib_schedule+0x493/0x600 [amdgpu] > > [ 363.185884] amdgpu_job_submit_direct+0x3c/0xd0 [amdgpu] > > [
Re: Re: Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Is the issue reproducible with the same board in bare metal on x86? Or does it only happen with passthrough on ARM? Looking through the archives, the SI patch I made was for an x86 laptop. It would be nice to root cause this, but there weren't any gfx8 boards with more than 64G of vram, so I think it's safe. That said, if you see similar issues with newer gfx IPs then we have an issue since the upper bit will be meaningful, so it would be nice to root cause this. Alex On Thu, Dec 16, 2021 at 4:36 AM 周宗敏 wrote: > Hi Christian, > > I'm testing the GPU passthrough feature, so I pass this GPU through to a > virtual machine to use. It is based on an arm64 system. > > As far as I know, Alex dealt with a similar problem in > dri/radeon/si.c. Maybe they have the same cause? > > The historical commit message is below: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ca223b029a261e82fb2f50c52eb85d510f4260e > > Thanks very much. > > *Subject:* Re: Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8 > *Date:* 2021-12-16 16:15 > *From:* Christian König > *To:* 周宗敏, Alex Deucher > > Hi Zongmin, > that strongly sounds like the ASIC is not correctly initialized when > trying to read the register. > What board and environment are you using this GPU with? Is that a > normal x86 system? > Regards, > Christian. > > On 16.12.21 at 04:11, 周宗敏 wrote: > > 1. The problematic board that I have tested is an [AMD/ATI] Lexa > PRO [Radeon RX 550/550X], and the vbios version is > 113-RXF9310-C09-BT. > 2. When an exception occurs I can see the following change in > the VRAM size value read from RREG32(mmCONFIG_MEMSIZE); > it seems to have garbage in the upper 16 bits. > [register value screenshot omitted] > 3.
> > and then I can also see some dmesg like below: > when the VRAM size register has garbage, we may see an error > message like below: > amdgpu :09:00.0: VRAM: 4286582784M 0x00F4 - > 0x000FF8F4 (4286582784M used) > the correct message should look like below: > amdgpu :09:00.0: VRAM: 4096M 0x00F4 - > 0x00F4 (4096M used) > > if you have any problems, please send me mail. > thanks very much. > > *Subject:* Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8 > *Date:* 2021-12-16 04:23 > *From:* Alex Deucher > *To:* Zongmin Zhou > > On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote: > > > > Some boards (like RX550) seem to have garbage in the upper > > 16 bits of the vram size register. Check for > > this and clamp the size properly. Fixes > > boards reporting bogus amounts of vram. > > > > After adding this patch, the maximum GPU VRAM size is 64GB; > > otherwise only 64GB of vram will be used. > > Can you provide some examples of problematic boards and > possibly a > vbios image from the problematic board? What values are you > seeing? > It would be nice to see what the boards are reporting and > whether the > lower 16 bits are actually correct or if it is some other > issue. This > register is undefined until the asic has been initialized. > The vbios > programs it as part of its asic init sequence (either via > vesa/gop or > the OS driver).
> > Alex > > > > > > Signed-off-by: Zongmin Zhou >> --- >> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 13 > ++--- >> 1 file changed, 10 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c >> index 492ebed2915b..63b890f1e8af 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c >> @@ -515,10 +515,10 @@ static void > gmc_v8_0_mc_program(struct amdgpu_device *adev) >> static int gmc_v8_0_mc_init(struct amdgpu_device > *adev) >> { >> int r; >> + u32 tmp; >> >> adev->gmc.vram_width = > amdgpu_atombios_get_vram_width(adev); >> if (!adev->gmc.vram_width) { >> - u32 tmp; >> int chansize, numchan; >> >> /* Get VRAM informations */ >> @@ -562,8 +562,15 @@ static int gmc_v8_0_mc_init(struct > amdgpu_device *adev) >> adev->gmc.vram_width = numchan * > chansize; >> } >> /* size in MB on si */ >> - adev->gmc.mc_vram_size = > RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL; >> -
Re: [PATCH] drm/amd/display: Fix warning comparing pointer to 0
Applied. Thanks! Alex On Thu, Dec 16, 2021 at 2:50 AM Jiapeng Chong wrote: > > Fix the following coccicheck warning: > > ./drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c:744:35-36: > WARNING comparing pointer to 0. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong > --- > drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > index 8f78e62b28b7..48005def1164 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c > @@ -741,7 +741,7 @@ void vg_clk_mgr_construct( > sizeof(struct watermarks), > &clk_mgr->smu_wm_set.mc_address.quad_part); > > - if (clk_mgr->smu_wm_set.wm_set == 0) { > + if (!clk_mgr->smu_wm_set.wm_set) { > clk_mgr->smu_wm_set.wm_set = _wms; > clk_mgr->smu_wm_set.mc_address.quad_part = 0; > } > -- > 2.20.1.7.g153144c >
Re: [PATCH] drm/amdgpu: add support for IP discovery gc_info table v2
Ping? On Wed, Dec 15, 2021 at 10:18 PM Alex Deucher wrote: > > Used on gfx9 based systems. Fixes incorrect CU counts reported > in the kernel log. > > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 76 +-- > drivers/gpu/drm/amd/include/discovery.h | 49 > 2 files changed, 103 insertions(+), 22 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > index ea00090b3fb3..bcc9343353b5 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > @@ -526,10 +526,15 @@ void amdgpu_discovery_harvest_ip(struct amdgpu_device > *adev) > } > } > > +union gc_info { > + struct gc_info_v1_0 v1; > + struct gc_info_v2_0 v2; > +}; > + > int amdgpu_discovery_get_gfx_info(struct amdgpu_device *adev) > { > struct binary_header *bhdr; > - struct gc_info_v1_0 *gc_info; > + union gc_info *gc_info; > > if (!adev->mman.discovery_bin) { > DRM_ERROR("ip discovery uninitialized\n"); > @@ -537,28 +542,55 @@ int amdgpu_discovery_get_gfx_info(struct amdgpu_device > *adev) > } > > bhdr = (struct binary_header *)adev->mman.discovery_bin; > - gc_info = (struct gc_info_v1_0 *)(adev->mman.discovery_bin + > + gc_info = (union gc_info *)(adev->mman.discovery_bin + > le16_to_cpu(bhdr->table_list[GC].offset)); > - > - adev->gfx.config.max_shader_engines = le32_to_cpu(gc_info->gc_num_se); > - adev->gfx.config.max_cu_per_sh = 2 * > (le32_to_cpu(gc_info->gc_num_wgp0_per_sa) + > - > le32_to_cpu(gc_info->gc_num_wgp1_per_sa)); > - adev->gfx.config.max_sh_per_se = > le32_to_cpu(gc_info->gc_num_sa_per_se); > - adev->gfx.config.max_backends_per_se = > le32_to_cpu(gc_info->gc_num_rb_per_se); > - adev->gfx.config.max_texture_channel_caches = > le32_to_cpu(gc_info->gc_num_gl2c); > - adev->gfx.config.max_gprs = le32_to_cpu(gc_info->gc_num_gprs); > - adev->gfx.config.max_gs_threads = > 
le32_to_cpu(gc_info->gc_num_max_gs_thds); > - adev->gfx.config.gs_vgt_table_depth = > le32_to_cpu(gc_info->gc_gs_table_depth); > - adev->gfx.config.gs_prim_buffer_depth = > le32_to_cpu(gc_info->gc_gsprim_buff_depth); > - adev->gfx.config.double_offchip_lds_buf = > le32_to_cpu(gc_info->gc_double_offchip_lds_buffer); > - adev->gfx.cu_info.wave_front_size = > le32_to_cpu(gc_info->gc_wave_size); > - adev->gfx.cu_info.max_waves_per_simd = > le32_to_cpu(gc_info->gc_max_waves_per_simd); > - adev->gfx.cu_info.max_scratch_slots_per_cu = > le32_to_cpu(gc_info->gc_max_scratch_slots_per_cu); > - adev->gfx.cu_info.lds_size = le32_to_cpu(gc_info->gc_lds_size); > - adev->gfx.config.num_sc_per_sh = > le32_to_cpu(gc_info->gc_num_sc_per_se) / > - > le32_to_cpu(gc_info->gc_num_sa_per_se); > - adev->gfx.config.num_packer_per_sc = > le32_to_cpu(gc_info->gc_num_packer_per_sc); > - > + switch (gc_info->v1.header.version_major) { > + case 1: > + adev->gfx.config.max_shader_engines = > le32_to_cpu(gc_info->v1.gc_num_se); > + adev->gfx.config.max_cu_per_sh = 2 * > (le32_to_cpu(gc_info->v1.gc_num_wgp0_per_sa) + > + > le32_to_cpu(gc_info->v1.gc_num_wgp1_per_sa)); > + adev->gfx.config.max_sh_per_se = > le32_to_cpu(gc_info->v1.gc_num_sa_per_se); > + adev->gfx.config.max_backends_per_se = > le32_to_cpu(gc_info->v1.gc_num_rb_per_se); > + adev->gfx.config.max_texture_channel_caches = > le32_to_cpu(gc_info->v1.gc_num_gl2c); > + adev->gfx.config.max_gprs = > le32_to_cpu(gc_info->v1.gc_num_gprs); > + adev->gfx.config.max_gs_threads = > le32_to_cpu(gc_info->v1.gc_num_max_gs_thds); > + adev->gfx.config.gs_vgt_table_depth = > le32_to_cpu(gc_info->v1.gc_gs_table_depth); > + adev->gfx.config.gs_prim_buffer_depth = > le32_to_cpu(gc_info->v1.gc_gsprim_buff_depth); > + adev->gfx.config.double_offchip_lds_buf = > le32_to_cpu(gc_info->v1.gc_double_offchip_lds_buffer); > + adev->gfx.cu_info.wave_front_size = > le32_to_cpu(gc_info->v1.gc_wave_size); > + adev->gfx.cu_info.max_waves_per_simd = > 
le32_to_cpu(gc_info->v1.gc_max_waves_per_simd); > + adev->gfx.cu_info.max_scratch_slots_per_cu = > le32_to_cpu(gc_info->v1.gc_max_scratch_slots_per_cu); > + adev->gfx.cu_info.lds_size = > le32_to_cpu(gc_info->v1.gc_lds_size); > + adev->gfx.config.num_sc_per_sh = > le32_to_cpu(gc_info->v1.gc_num_sc_per_se) / > + le32_to_cpu(gc_info->v1.gc_num_sa_per_se); > +
Re: [PATCH 1/2] drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates
On Thu, Dec 16, 2021 at 9:24 AM Christian König wrote: > > > > Am 16.12.21 um 15:14 schrieb Alex Deucher: > > Add a new CTX ioctl operation to set stable pstates for profiling. > > When creating traces for tools like RGP or using SPM or doing > > performance profiling, it's required to enable a special > > stable profiling power state on the GPU. These profiling > > states set fixed clocks and disable certain other power > > features like powergating which may impact the results. > > > > Historically, these profiling pstates were enabled via sysfs, > > but this adds an interface to enable it via the CTX ioctl > > from the application. Since the power state is global > > only one application can set it at a time, so if multiple > > applications try and use it only the first will get it, > > the ioctl will return -EBUSY for others. The sysfs interface > > will override whatever has been set by this interface. > > > > Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 > > > > Signed-off-by: Alex Deucher > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 145 - > > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h| 1 + > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > > drivers/gpu/drm/amd/pm/amdgpu_pm.c | 7 + > > drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h| 3 + > > include/uapi/drm/amdgpu_drm.h | 17 ++- > > 6 files changed, 171 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > > index 468003583b2a..327cf308c2ab 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > > @@ -237,6 +237,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev, > > ctx->vram_lost_counter = atomic_read(&adev->vram_lost_counter); > > ctx->init_priority = priority; > > ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET; > > + ctx->stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE; > > > > return 0; > > } > > @@ -255,6 +256,102 @@ static void
amdgpu_ctx_fini_entity(struct > > amdgpu_ctx_entity *entity) > > kfree(entity); > > } > > > > +static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx, > > + u32 *stable_pstate) > > +{ > > + struct amdgpu_device *adev = ctx->adev; > > + const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs; > > + enum amd_dpm_forced_level current_level; > > + > > + if (!ctx) > > + return -EINVAL; > > + > > + if (pp_funcs->get_performance_level) > > + current_level = amdgpu_dpm_get_performance_level(adev); > > + else > > + current_level = adev->pm.dpm.forced_level; > > + > > + switch (current_level) { > > + case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD: > > + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_STANDARD; > > + break; > > + case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK: > > + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_MIN_SCLK; > > + break; > > + case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK: > > + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK; > > + break; > > + case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK: > > + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_PEAK; > > + break; > > + default: > > + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE; > > + break; > > + } > > + return 0; > > +} > > + > > +static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx, > > + u32 stable_pstate) > > +{ > > + struct amdgpu_device *adev = ctx->adev; > > + const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs; > > + enum amd_dpm_forced_level level, current_level; > > + int r = 0; > > I would avoid initializing the return value and rather set it below > after everything worked out. Will fix. 
> > > + > > + if (!ctx) > > + return -EINVAL; > > + > > + mutex_lock(&adev->pm.stable_pstate_ctx_lock); > > + if (adev->pm.stable_pstate_ctx && adev->pm.stable_pstate_ctx != ctx) { > > + r = -EBUSY; > > + goto done; > > + } > > + > > + switch (stable_pstate) { > > + case AMDGPU_CTX_STABLE_PSTATE_NONE: > > + level = AMD_DPM_FORCED_LEVEL_AUTO; > > + break; > > + case AMDGPU_CTX_STABLE_PSTATE_STANDARD: > > + level = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD; > > + break; > > + case AMDGPU_CTX_STABLE_PSTATE_MIN_SCLK: > > + level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK; > > + break; > > + case AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK: > > + level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK; > > + break; > > + case AMDGPU_CTX_STABLE_PSTATE_PEAK: > > + level = AMD_DPM_FORCED_LEVEL_PROFILE_PEAK; > > +
Re: [PATCH] drm/amdkfd: use max() and min() to make code cleaner
On 2021-12-15 3:52 a.m., cgel@gmail.com wrote: From: Changcheng Deng Use max() and min() in order to make code cleaner. Reported-by: Zeal Robot Signed-off-by: Changcheng Deng Reviewed-by: Philip Yang Applied, thanks. --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 7e92dcea4ce8..c6d3555b5be6 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2254,8 +2254,8 @@ svm_range_cpu_invalidate_pagetables(struct mmu_interval_notifier *mni, start = mni->interval_tree.start; last = mni->interval_tree.last; - start = (start > range->start ? start : range->start) >> PAGE_SHIFT; - last = (last < (range->end - 1) ? last : range->end - 1) >> PAGE_SHIFT; + start = max(start, range->start) >> PAGE_SHIFT; + last = min(last, range->end - 1) >> PAGE_SHIFT; pr_debug("[0x%lx 0x%lx] range[0x%lx 0x%lx] notifier[0x%lx 0x%lx] %d\n", start, last, range->start >> PAGE_SHIFT, (range->end - 1) >> PAGE_SHIFT,
RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers
[Public] Hi Victor, calling get_xgmi_info() in gmc_v9_0_early_init is from the patch below, which was also sent out to amd-gfx for review as an upstream patch. Do you know why it's missing from upstream? drm/amdgpu: get xgmi info at eary_init Driver need to get XGMI info function earlier before ip_init since driver need to check the XGMI setting to determine how to perform reset during init Signed-off-by: shaoyunl Acked-by: Alex Deucher Change-Id: Ic37276bbb6640bb4e9360220fed99494cedd3ef5 Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Skvortsov, Victor Sent: Thursday, December 16, 2021 10:28 PM To: Alex Deucher Cc: Ming, Davis ; Chen, JingWen ; amd-gfx list ; Deng, Emily ; Nieto, David M ; Chen, Horace ; Liu, Monk ; Liu, Shaoyun Subject: RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [AMD Official Use Only] Gotcha, I will skip this patch for drm-next -Original Message- From: Alex Deucher Sent: Thursday, December 16, 2021 8:53 AM To: Skvortsov, Victor Cc: amd-gfx list ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [CAUTION: External Email] On Wed, Dec 15, 2021 at 6:58 PM Skvortsov, Victor wrote: > > [AMD Official Use Only] > > Hey Alex, > > This change was based on the fact that amd-mainline-dkms-5.13 calls > get_xgmi_info() in gmc_v9_0_early_init(). But I can see that in drm-next it's > instead called in gmc_v9_0_sw_init(). So, I'm not sure what's the correct > behavior. But I do agree that the change is kind of ugly. I don't know where > else to put it if we do need to call get_xgmi_info() in early_init. > We could skip this patch for drm-next and just apply it to the dkms branch. There's already a lot of ugly stuff in there to deal with multiple kernel versions.
Alex > Thanks, > Victor > > -Original Message- > From: Alex Deucher > Sent: Wednesday, December 15, 2021 4:38 PM > To: Skvortsov, Victor > Cc: amd-gfx list ; Deng, Emily > ; Liu, Monk ; Ming, Davis > ; Liu, Shaoyun ; Zhou, Peng > Ju ; Chen, JingWen ; Chen, > Horace ; Nieto, David M > Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function > pointers > > [CAUTION: External Email] > > On Wed, Dec 15, 2021 at 1:56 PM Victor Skvortsov > wrote: > > > > In SRIOV, RLC function pointers must be initialized early as we rely > > on the RLCG interface for all GC register access. > > > > Signed-off-by: Victor Skvortsov > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +-- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h | 2 ++ > > 3 files changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > index 65e1f6cc59dd..1bc92a38d124 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > @@ -844,6 +844,8 @@ static int amdgpu_discovery_set_gc_ip_blocks(struct > > amdgpu_device *adev) > > case IP_VERSION(9, 4, 1): > > case IP_VERSION(9, 4, 2): > > amdgpu_device_ip_block_add(adev, > > _v9_0_ip_block); > > + if (amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] > > == IP_VERSION(9, 4, 2)) > > + gfx_v9_0_set_rlc_funcs(adev); > > amdgpu_discovery.c is IP independent. I'd rather not add random IP specific > function calls. gfx_v9_0_set_rlc_funcs() already gets called in > gfx_v9_0_early_init(). Is that not early enough? In general we shouldn't be > touching the hardware much if at all in early_init. 
> > Alex > > > break; > > case IP_VERSION(10, 1, 10): > > case IP_VERSION(10, 1, 2): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > index edb3e3b08eed..d252b06efa43 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -816,7 +816,6 @@ static void gfx_v9_0_sriov_wreg(struct > > amdgpu_device *adev, u32 offset, static void > > gfx_v9_0_set_ring_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_irq_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_gds_init(struct amdgpu_device *adev); -static void > > gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev); static int > > gfx_v9_0_get_cu_info(struct amdgpu_device *adev, > > struct amdgpu_cu_info *cu_info); > > static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device > > *adev); @@ -7066,7 +7065,7 @@ static void gfx_v9_0_set_irq_funcs(struct > > amdgpu_device *adev) > > adev->gfx.cp_ecc_error_irq.funcs = > > _v9_0_cp_ecc_error_irq_funcs; } > > > >
Re: [PATCH] drm/amdgpu/sriov : Check the memory can be accesssed by ttm_device_clear_dma_mappings.
On Tue, Dec 14, 2021 at 12:44 PM Surbhi Kakarya wrote: > > In an SR-IOV environment, if event guard is enabled and the VF doesn't > receive an ack from the PF for full access, the guest driver load crashes. > This is caused by the call to ttm_device_clear_dma_mappings with a > non-initialized > mman during driver tear down. > > This patch adds the necessary condition to check whether mman initialization > passed or not > and takes the path based on that. Is this actually sr-iov specific? I think any failure that happens before ttm is set up would hit this. So I think the wording could be updated to drop the SR-IOV. Alex > > Signed-off-by: skakarya > Change-Id: I1c18c7eb3500687c8b6e7fc414503dcf2a20b94c > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 598250a380f5..226110be7a2f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3984,7 +3984,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) > > amdgpu_irq_fini_hw(adev); > > - ttm_device_clear_dma_mappings(&adev->mman.bdev); > + if (adev->mman.initialized) > + ttm_device_clear_dma_mappings(&adev->mman.bdev); > > amdgpu_gart_dummy_page_fini(adev); > > -- > 2.25.1 >
RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers
[AMD Official Use Only] Gotcha, I will skip this patch for drm-next -Original Message- From: Alex Deucher Sent: Thursday, December 16, 2021 8:53 AM To: Skvortsov, Victor Cc: amd-gfx list ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Nieto, David M Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers [CAUTION: External Email] On Wed, Dec 15, 2021 at 6:58 PM Skvortsov, Victor wrote: > > [AMD Official Use Only] > > Hey Alex, > > This change was based on the fact that amd-mainline-dkms-5.13 calls > get_xgmi_info() in gmc_v9_0_early_init(). But I can see that in drm-next it's > instead called in gmc_v9_0_sw_init(). So, I'm not sure what's the correct > behavior. But I do agree that the change is kind of ugly. I don't know where > else to put it if we do need to call get_xgmi_info() in early_init. > We could skip this patch for drm-next and just apply it to the dkms branch. There's already a lot of ugly stuff in there to deal with multiple kernel versions. Alex > Thanks, > Victor > -Original Message- From: Alex Deucher > Sent: Wednesday, December 15, 2021 4:38 PM > To: Skvortsov, Victor > Cc: amd-gfx list ; Deng, Emily > ; Liu, Monk ; Ming, Davis > ; Liu, Shaoyun ; Zhou, Peng > Ju ; Chen, JingWen ; Chen, > Horace ; Nieto, David M > Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function > pointers > > [CAUTION: External Email] > > On Wed, Dec 15, 2021 at 1:56 PM Victor Skvortsov > wrote: > > > > In SRIOV, RLC function pointers must be initialized early as we rely > > on the RLCG interface for all GC register access.
> > > > Signed-off-by: Victor Skvortsov > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +-- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h | 2 ++ > > 3 files changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > index 65e1f6cc59dd..1bc92a38d124 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > @@ -844,6 +844,8 @@ static int amdgpu_discovery_set_gc_ip_blocks(struct > > amdgpu_device *adev) > > case IP_VERSION(9, 4, 1): > > case IP_VERSION(9, 4, 2): > > amdgpu_device_ip_block_add(adev, > > _v9_0_ip_block); > > + if (amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] > > == IP_VERSION(9, 4, 2)) > > + gfx_v9_0_set_rlc_funcs(adev); > > amdgpu_discovery.c is IP independent. I'd rather not add random IP specific > function calls. gfx_v9_0_set_rlc_funcs() already gets called in > gfx_v9_0_early_init(). Is that not early enough? In general we shouldn't be > touching the hardware much if at all in early_init. 
> > Alex > > > break; > > case IP_VERSION(10, 1, 10): > > case IP_VERSION(10, 1, 2): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > index edb3e3b08eed..d252b06efa43 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -816,7 +816,6 @@ static void gfx_v9_0_sriov_wreg(struct > > amdgpu_device *adev, u32 offset, static void > > gfx_v9_0_set_ring_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_irq_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_gds_init(struct amdgpu_device *adev); -static void > > gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev); static int > > gfx_v9_0_get_cu_info(struct amdgpu_device *adev, > > struct amdgpu_cu_info *cu_info); > > static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device > > *adev); @@ -7066,7 +7065,7 @@ static void gfx_v9_0_set_irq_funcs(struct > > amdgpu_device *adev) > > adev->gfx.cp_ecc_error_irq.funcs = > > _v9_0_cp_ecc_error_irq_funcs; } > > > > -static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev) > > +void gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev) > > { > > switch (adev->ip_versions[GC_HWIP][0]) { > > case IP_VERSION(9, 0, 1): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > index dfe8d4841f58..1817e252354f 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > @@ -29,4 +29,6 @@ extern const struct amdgpu_ip_block_version > > gfx_v9_0_ip_block; void gfx_v9_0_select_se_sh(struct amdgpu_device *adev, > > u32 se_num, u32 sh_num, > >u32 instance); > > > > +void gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev); > > + > > #endif > > -- > > 2.25.1 > >
Re: [PATCH 1/2] drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates
Am 16.12.21 um 15:14 schrieb Alex Deucher: Add a new CTX ioctl operation to set stable pstates for profiling. When creating traces for tools like RGP or using SPM or doing performance profiling, it's required to enable a special stable profiling power state on the GPU. These profiling states set fixed clocks and disable certain other power features like powergating which may impact the results. Historically, these profiling pstates were enabled via sysfs, but this adds an interface to enable it via the CTX ioctl from the application. Since the power state is global only one application can set it at a time, so if multiple applications try and use it only the first will get it, the ioctl will return -EBUSY for others. The sysfs interface will override whatever has been set by this interface. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 145 - drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + drivers/gpu/drm/amd/pm/amdgpu_pm.c | 7 + drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h| 3 + include/uapi/drm/amdgpu_drm.h | 17 ++- 6 files changed, 171 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index 468003583b2a..327cf308c2ab 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -237,6 +237,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev, ctx->vram_lost_counter = atomic_read(>vram_lost_counter); ctx->init_priority = priority; ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET; + ctx->stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE; return 0; } @@ -255,6 +256,102 @@ static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity) kfree(entity); } +static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx, + u32 *stable_pstate) +{ + struct amdgpu_device *adev = ctx->adev; + const struct amd_pm_funcs 
*pp_funcs = adev->powerplay.pp_funcs; + enum amd_dpm_forced_level current_level; + + if (!ctx) + return -EINVAL; + + if (pp_funcs->get_performance_level) + current_level = amdgpu_dpm_get_performance_level(adev); + else + current_level = adev->pm.dpm.forced_level; + + switch (current_level) { + case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_STANDARD; + break; + case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_MIN_SCLK; + break; + case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK; + break; + case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_PEAK; + break; + default: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE; + break; + } + return 0; +} + +static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx, + u32 stable_pstate) +{ + struct amdgpu_device *adev = ctx->adev; + const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs; + enum amd_dpm_forced_level level, current_level; + int r = 0; I would avoid initializing the return value and rather set it below after everything worked out. 
+ + if (!ctx) + return -EINVAL; + + mutex_lock(&adev->pm.stable_pstate_ctx_lock); + if (adev->pm.stable_pstate_ctx && adev->pm.stable_pstate_ctx != ctx) { + r = -EBUSY; + goto done; + } + + switch (stable_pstate) { + case AMDGPU_CTX_STABLE_PSTATE_NONE: + level = AMD_DPM_FORCED_LEVEL_AUTO; + break; + case AMDGPU_CTX_STABLE_PSTATE_STANDARD: + level = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD; + break; + case AMDGPU_CTX_STABLE_PSTATE_MIN_SCLK: + level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK; + break; + case AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK: + level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK; + break; + case AMDGPU_CTX_STABLE_PSTATE_PEAK: + level = AMD_DPM_FORCED_LEVEL_PROFILE_PEAK; + break; + default: + r = -EINVAL; + goto done; + } + + if (pp_funcs->get_performance_level) + current_level = amdgpu_dpm_get_performance_level(adev); + else + current_level = adev->pm.dpm.forced_level; Mhm, that looks strongly like it doesn't belong into the ctx code. Didn't Evan have some patches to clean that
Re: [bug report] new kvmalloc() WARN() triggered by DRM ioctls tracking
On 16/12/2021 14:15, Boris Brezillon wrote: > Hi Steve, > > On Thu, 16 Dec 2021 14:02:25 + > Steven Price wrote: > >> + Boris >> >> On 16/12/2021 12:08, Dan Carpenter wrote: >>> Hi DRM Devs, >>> >>> In commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") >>> from July, Linus added a WARN_ONCE() for "crazy" allocations over 2GB. >>> I have a static checker warning for this and most of the warnings are >>> from DRM ioctls. >>> >>> drivers/gpu/drm/lima/lima_drv.c:124 lima_ioctl_gem_submit() warn: uncapped >>> user size for kvmalloc() will WARN >>> drivers/gpu/drm/radeon/radeon_cs.c:291 radeon_cs_parser_init() warn: >>> uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/v3d/v3d_gem.c:311 v3d_lookup_bos() warn: uncapped user size >>> for kvmalloc() will WARN >>> drivers/gpu/drm/v3d/v3d_gem.c:319 v3d_lookup_bos() warn: uncapped user size >>> for kvmalloc() will WARN >>> drivers/gpu/drm/v3d/v3d_gem.c:601 v3d_get_multisync_post_deps() warn: >>> uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:476 etnaviv_ioctl_gem_submit() >>> warn: uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:477 etnaviv_ioctl_gem_submit() >>> warn: uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:478 etnaviv_ioctl_gem_submit() >>> warn: uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:479 etnaviv_ioctl_gem_submit() >>> warn: uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/virtio/virtgpu_ioctl.c:186 virtio_gpu_execbuffer_ioctl() >>> warn: uncapped user size for kvmalloc() will WARN >>> drivers/gpu/drm/panfrost/panfrost_drv.c:198 panfrost_copy_in_sync() warn: >>> uncapped user size for kvmalloc() will WARN >> >> I believe this one in Panfrost would be fixed by Boris's series >> reworking the submit ioctl[1]. 
>> >> Boris: are you planning on submitting that series soon - or is it worth >> cherry picking the rework in patch 5 to fix this issue? > > Don't know when I'll get back to it, so I'd recommend cherry-picking > what you need. Thanks, no problem - it was mostly when I looked at the code I had the feeling that "surely this has already been fixed", then discovered your series was never merged ;) I'll hammer out a patch for this one issue. Thanks, Steve
Re: [bug report] new kvmalloc() WARN() triggered by DRM ioctls tracking
Hi Steve, On Thu, 16 Dec 2021 14:02:25 + Steven Price wrote: > + Boris > > On 16/12/2021 12:08, Dan Carpenter wrote: > > Hi DRM Devs, > > > > In commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") > > from July, Linus added a WARN_ONCE() for "crazy" allocations over 2GB. > > I have a static checker warning for this and most of the warnings are > > from DRM ioctls. > > > > drivers/gpu/drm/lima/lima_drv.c:124 lima_ioctl_gem_submit() warn: uncapped > > user size for kvmalloc() will WARN > > drivers/gpu/drm/radeon/radeon_cs.c:291 radeon_cs_parser_init() warn: > > uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/v3d/v3d_gem.c:311 v3d_lookup_bos() warn: uncapped user size > > for kvmalloc() will WARN > > drivers/gpu/drm/v3d/v3d_gem.c:319 v3d_lookup_bos() warn: uncapped user size > > for kvmalloc() will WARN > > drivers/gpu/drm/v3d/v3d_gem.c:601 v3d_get_multisync_post_deps() warn: > > uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:476 etnaviv_ioctl_gem_submit() > > warn: uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:477 etnaviv_ioctl_gem_submit() > > warn: uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:478 etnaviv_ioctl_gem_submit() > > warn: uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:479 etnaviv_ioctl_gem_submit() > > warn: uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/virtio/virtgpu_ioctl.c:186 virtio_gpu_execbuffer_ioctl() > > warn: uncapped user size for kvmalloc() will WARN > > drivers/gpu/drm/panfrost/panfrost_drv.c:198 panfrost_copy_in_sync() warn: > > uncapped user size for kvmalloc() will WARN > > I believe this one in Panfrost would be fixed by Boris's series > reworking the submit ioctl[1]. > > Boris: are you planning on submitting that series soon - or is it worth > cherry picking the rework in patch 5 to fix this issue? 
Don't know when I'll get back to it, so I'd recommend cherry-picking what you need. Regards, Boris
[PATCH 2/2] drm/amdgpu: bump driver version for new CTX OP to set/get stable pstates
So mesa and tools know when this is available. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index bc1355c6248d..320706b3a8e0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -99,9 +99,10 @@ * - 3.42.0 - Add 16bpc fixed point display support * - 3.43.0 - Add device hot plug/unplug support * - 3.44.0 - DCN3 supports DCC independent block settings: !64B && 128B, 64B && 128B + * - 3.45.0 - Add context ioctl stable pstate interface */ #define KMS_DRIVER_MAJOR 3 -#define KMS_DRIVER_MINOR 44 +#define KMS_DRIVER_MINOR 45 #define KMS_DRIVER_PATCHLEVEL 0 int amdgpu_vram_limit; -- 2.33.1
[PATCH 1/2] drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates
Add a new CTX ioctl operation to set stable pstates for profiling. When creating traces for tools like RGP, or when using SPM or doing performance profiling, it's required to enable a special stable profiling power state on the GPU. These profiling states set fixed clocks and disable certain other power features like powergating which may impact the results. Historically, these profiling pstates were enabled via sysfs, but this adds an interface to enable them via the CTX ioctl from the application. Since the power state is global, only one application can set it at a time; if multiple applications try to use it, only the first will get it, and the ioctl will return -EBUSY for the others. The sysfs interface will override whatever has been set by this interface. Mesa MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 145 - drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + drivers/gpu/drm/amd/pm/amdgpu_pm.c | 7 + drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 3 + include/uapi/drm/amdgpu_drm.h | 17 ++- 6 files changed, 171 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index 468003583b2a..327cf308c2ab 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -237,6 +237,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev, ctx->vram_lost_counter = atomic_read(&adev->vram_lost_counter); ctx->init_priority = priority; ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET; + ctx->stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE; return 0; } @@ -255,6 +256,102 @@ static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity) kfree(entity); } +static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx, + u32 *stable_pstate) +{ + struct amdgpu_device *adev = ctx->adev; + const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs; + enum 
amd_dpm_forced_level current_level; + + if (!ctx) + return -EINVAL; + + if (pp_funcs->get_performance_level) + current_level = amdgpu_dpm_get_performance_level(adev); + else + current_level = adev->pm.dpm.forced_level; + + switch (current_level) { + case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_STANDARD; + break; + case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_MIN_SCLK; + break; + case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK; + break; + case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_PEAK; + break; + default: + *stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE; + break; + } + return 0; +} + +static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx, + u32 stable_pstate) +{ + struct amdgpu_device *adev = ctx->adev; + const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs; + enum amd_dpm_forced_level level, current_level; + int r = 0; + + if (!ctx) + return -EINVAL; + + mutex_lock(&adev->pm.stable_pstate_ctx_lock); + if (adev->pm.stable_pstate_ctx && adev->pm.stable_pstate_ctx != ctx) { + r = -EBUSY; + goto done; + } + + switch (stable_pstate) { + case AMDGPU_CTX_STABLE_PSTATE_NONE: + level = AMD_DPM_FORCED_LEVEL_AUTO; + break; + case AMDGPU_CTX_STABLE_PSTATE_STANDARD: + level = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD; + break; + case AMDGPU_CTX_STABLE_PSTATE_MIN_SCLK: + level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK; + break; + case AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK: + level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK; + break; + case AMDGPU_CTX_STABLE_PSTATE_PEAK: + level = AMD_DPM_FORCED_LEVEL_PROFILE_PEAK; + break; + default: + r = -EINVAL; + goto done; + } + + if (pp_funcs->get_performance_level) + current_level = amdgpu_dpm_get_performance_level(adev); + else + current_level = adev->pm.dpm.forced_level; + + if ((current_level != level) && pp_funcs->force_performance_level) { + 
mutex_lock(&adev->pm.mutex); + r = amdgpu_dpm_force_performance_level(adev, level); + if (!r) + adev->pm.dpm.forced_level = level; +
Re: [bug report] new kvmalloc() WARN() triggered by DRM ioctls tracking
+ Boris On 16/12/2021 12:08, Dan Carpenter wrote: > Hi DRM Devs, > > In commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") > from July, Linus added a WARN_ONCE() for "crazy" allocations over 2GB. > I have a static checker warning for this and most of the warnings are > from DRM ioctls. > > drivers/gpu/drm/lima/lima_drv.c:124 lima_ioctl_gem_submit() warn: uncapped > user size for kvmalloc() will WARN > drivers/gpu/drm/radeon/radeon_cs.c:291 radeon_cs_parser_init() warn: uncapped > user size for kvmalloc() will WARN > drivers/gpu/drm/v3d/v3d_gem.c:311 v3d_lookup_bos() warn: uncapped user size > for kvmalloc() will WARN > drivers/gpu/drm/v3d/v3d_gem.c:319 v3d_lookup_bos() warn: uncapped user size > for kvmalloc() will WARN > drivers/gpu/drm/v3d/v3d_gem.c:601 v3d_get_multisync_post_deps() warn: > uncapped user size for kvmalloc() will WARN > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:476 etnaviv_ioctl_gem_submit() > warn: uncapped user size for kvmalloc() will WARN > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:477 etnaviv_ioctl_gem_submit() > warn: uncapped user size for kvmalloc() will WARN > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:478 etnaviv_ioctl_gem_submit() > warn: uncapped user size for kvmalloc() will WARN > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:479 etnaviv_ioctl_gem_submit() > warn: uncapped user size for kvmalloc() will WARN > drivers/gpu/drm/virtio/virtgpu_ioctl.c:186 virtio_gpu_execbuffer_ioctl() > warn: uncapped user size for kvmalloc() will WARN > drivers/gpu/drm/panfrost/panfrost_drv.c:198 panfrost_copy_in_sync() warn: > uncapped user size for kvmalloc() will WARN I believe this one in Panfrost would be fixed by Boris's series reworking the submit ioctl[1]. Boris: are you planning on submitting that series soon - or is it worth cherry picking the rework in patch 5 to fix this issue? 
[1] https://lore.kernel.org/all/20210930190954.1525933-1-boris.brezil...@collabora.com/ > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:120 amdgpu_cs_parser_init() warn: > uncapped user size for kvmalloc() will WARN > > These ioctls can all trigger the stack dump. The line numbers are from > linux next (next-20211214). > > I feel like ideally if this could be fixed in a central way, but if not > then hopefully I've added the relevant lists to the CC. I've only looked at Panfrost, but at least here we're simply allowing user space to allocate an arbitrary amount of kernel memory in one go - which is always going to be a good way of triggering the OOM killer if nothing else. Boris's series includes a change that means instead of trying to copy an (attacker-controlled) sized array into the kernel to process, we copy each element of the array in turn. So I don't really see how this could be fixed in a central way (but some of the other cases might be different). Thanks, Steve > regards, > dan carpenter >
Re: [PATCH] drm/amdgpu: Filter security violation registers
[Public] Reviewed-by: Alex Deucher From: amd-gfx on behalf of Bokun Zhang Sent: Wednesday, December 15, 2021 7:52 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Bokun Subject: [PATCH] drm/amdgpu: Filter security violation registers Recently, there is security policy update under SRIOV. We need to filter the registers that hit the violation and move the code to the host driver side so that the guest driver can execute correctly. Signed-off-by: Bokun Zhang Change-Id: Ida893bb17de17a80e865c7662f04c5562f5d2727 --- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 83 ++ 1 file changed, 46 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index 4f546f632223..d3d6d5b045b8 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -542,9 +542,6 @@ static void sdma_v5_2_ctx_switch_enable(struct amdgpu_device *adev, bool enable) } for (i = 0; i < adev->sdma.num_instances; i++) { - f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL)); - f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL, - AUTO_CTXSW_ENABLE, enable ? 1 : 0); if (enable && amdgpu_sdma_phase_quantum) { WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_PHASE0_QUANTUM), phase_quantum); @@ -553,7 +550,13 @@ static void sdma_v5_2_ctx_switch_enable(struct amdgpu_device *adev, bool enable) WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_PHASE2_QUANTUM), phase_quantum); } - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), f32_cntl); + + if (!amdgpu_sriov_vf(adev)) { + f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL)); + f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL, + AUTO_CTXSW_ENABLE, enable ? 
1 : 0); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), f32_cntl); + } } } @@ -576,10 +579,12 @@ static void sdma_v5_2_enable(struct amdgpu_device *adev, bool enable) sdma_v5_2_rlc_stop(adev); } - for (i = 0; i < adev->sdma.num_instances; i++) { - f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL)); - f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, enable ? 0 : 1); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl); + if (!amdgpu_sriov_vf(adev)) { + for (i = 0; i < adev->sdma.num_instances; i++) { + f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL)); + f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, enable ? 0 : 1); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl); + } } } @@ -608,7 +613,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) ring = >sdma.instance[i].ring; wb_offset = (ring->rptr_offs * 4); - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); + if (!amdgpu_sriov_vf(adev)) + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); /* Set ring buffer size in dwords */ rb_bufsz = order_base_2(ring->ring_size / 4); @@ -683,32 +689,34 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) sdma_v5_2_ring_set_wptr(ring); /* set minor_ptr_update to 0 after wptr programed */ - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_MINOR_PTR_UPDATE), 0); - /* set utc l1 enable flag always to 1 */ - temp = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL)); - temp = REG_SET_FIELD(temp, SDMA0_CNTL, UTC_L1_ENABLE, 1); - - /* enable MCBP */ - temp = REG_SET_FIELD(temp, SDMA0_CNTL, MIDCMD_PREEMPT_ENABLE, 1); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), temp); - - /* Set up RESP_MODE to non-copy addresses */ - temp = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_UTCL1_CNTL)); - temp = REG_SET_FIELD(temp, 
SDMA0_UTCL1_CNTL, RESP_MODE, 3); - temp = REG_SET_FIELD(temp, SDMA0_UTCL1_CNTL, REDO_DELAY, 9); - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_UTCL1_CNTL), temp); - - /* program default cache read and write policy */ - temp =
Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers
On Wed, Dec 15, 2021 at 6:58 PM Skvortsov, Victor wrote: > > [AMD Official Use Only] > > Hey Alex, > > This change was based on the fact that amd-mainline-dkms-5.13 calls > get_xgmi_info() in gmc_v9_0_early_init(). But I can see that drm-next it's > instead called in gmc_v9_0_sw_init(). So, I'm not sure whats the correct > behavior. But I do agree that the change is kind of ugly. I don't know where > else to put it if we do need to call get_xgmi_info() in early_init. > We could skip this patch for drm-next and just apply it to the dkms branch. There's already a lot of ugly stuff in there to deal with multiple kernel versions. Alex > Thanks, > Victor > > -Original Message- > From: Alex Deucher > Sent: Wednesday, December 15, 2021 4:38 PM > To: Skvortsov, Victor > Cc: amd-gfx list ; Deng, Emily > ; Liu, Monk ; Ming, Davis > ; Liu, Shaoyun ; Zhou, Peng Ju > ; Chen, JingWen ; Chen, Horace > ; Nieto, David M > Subject: Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function > pointers > > [CAUTION: External Email] > > On Wed, Dec 15, 2021 at 1:56 PM Victor Skvortsov > wrote: > > > > In SRIOV, RLC function pointers must be initialized early as we rely > > on the RLCG interface for all GC register access. 
> > > > Signed-off-by: Victor Skvortsov > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +-- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h | 2 ++ > > 3 files changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > index 65e1f6cc59dd..1bc92a38d124 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > > @@ -844,6 +844,8 @@ static int amdgpu_discovery_set_gc_ip_blocks(struct > > amdgpu_device *adev) > > case IP_VERSION(9, 4, 1): > > case IP_VERSION(9, 4, 2): > > amdgpu_device_ip_block_add(adev, _v9_0_ip_block); > > + if (amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] > > == IP_VERSION(9, 4, 2)) > > + gfx_v9_0_set_rlc_funcs(adev); > > amdgpu_discovery.c is IP independent. I'd rather not add random IP specific > function calls. gfx_v9_0_set_rlc_funcs() already gets called in > gfx_v9_0_early_init(). Is that not early enough? In general we shouldn't be > touching the hardware much if at all in early_init. 
> > Alex > > > break; > > case IP_VERSION(10, 1, 10): > > case IP_VERSION(10, 1, 2): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > index edb3e3b08eed..d252b06efa43 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -816,7 +816,6 @@ static void gfx_v9_0_sriov_wreg(struct > > amdgpu_device *adev, u32 offset, static void > > gfx_v9_0_set_ring_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_irq_funcs(struct amdgpu_device *adev); static void > > gfx_v9_0_set_gds_init(struct amdgpu_device *adev); -static void > > gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev); static int > > gfx_v9_0_get_cu_info(struct amdgpu_device *adev, > > struct amdgpu_cu_info *cu_info); > > static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device > > *adev); @@ -7066,7 +7065,7 @@ static void gfx_v9_0_set_irq_funcs(struct > > amdgpu_device *adev) > > adev->gfx.cp_ecc_error_irq.funcs = > > _v9_0_cp_ecc_error_irq_funcs; } > > > > -static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev) > > +void gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev) > > { > > switch (adev->ip_versions[GC_HWIP][0]) { > > case IP_VERSION(9, 0, 1): > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > index dfe8d4841f58..1817e252354f 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h > > @@ -29,4 +29,6 @@ extern const struct amdgpu_ip_block_version > > gfx_v9_0_ip_block; void gfx_v9_0_select_se_sh(struct amdgpu_device *adev, > > u32 se_num, u32 sh_num, > >u32 instance); > > > > +void gfx_v9_0_set_rlc_funcs(struct amdgpu_device *adev); > > + > > #endif > > -- > > 2.25.1 > >
[PATCH] drm/amdkfd: make SPDX License expression more sound
Commit b5f57384805a ("drm/amdkfd: Add sysfs bitfields and enums to uAPI") adds include/uapi/linux/kfd_sysfs.h with the "GPL-2.0 OR MIT WITH Linux-syscall-note" SPDX-License expression. The command ./scripts/spdxcheck.py warns: include/uapi/linux/kfd_sysfs.h: 1:48 Exception not valid for license MIT: Linux-syscall-note For a uapi header, the file under GPLv2 License must be combined with the Linux-syscall-note, but combining the MIT License with the Linux-syscall-note makes no sense, as the note provides an exception for GPL-licensed code, not for permissively licensed code. So, reorganize the SPDX expression to only combine the note with the GPL License condition. This makes spdxcheck happy again. Signed-off-by: Lukas Bulwahn --- I am not a lawyer and I do not intend to modify the actual licensing of this header file. So, I really would like to have an Ack from some AMD developer here. Maybe also a lawyer on the linux-spdx list can check my reasoning on the licensing with the exception note? include/uapi/linux/kfd_sysfs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/kfd_sysfs.h b/include/uapi/linux/kfd_sysfs.h index e1fb78b4bf09..3e330f368917 100644 --- a/include/uapi/linux/kfd_sysfs.h +++ b/include/uapi/linux/kfd_sysfs.h @@ -1,4 +1,4 @@ -/* SPDX-License-Identifier: GPL-2.0 OR MIT WITH Linux-syscall-note */ +/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */ /* * Copyright 2021 Advanced Micro Devices, Inc. * -- 2.17.1
Reply: Re: Reply: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Hi Christian, I'm testing the GPU passthrough feature, so I pass this GPU through to a virtual machine to use. It is based on an arm64 system. As far as I know, Alex dealt with a similar problem in dri/radeon/si.c. Maybe they have the same root cause? The historical commit message is below: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ca223b029a261e82fb2f50c52eb85d510f4260e Thanks very much. Subject: Re: Reply: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8 Date: 2021-12-16 16:15 From: Christian König To: 周宗敏, Alex Deucher Hi Zongmin, that strongly sounds like the ASIC is not correctly initialized when trying to read the register. What board and environment are you using this GPU with? Is that a normal x86 system? Regards, Christian. On 16.12.21 at 04:11, 周宗敏 wrote: the problematic board that I have tested is [AMD/ATI] Lexa PRO [Radeon RX 550/550X], and the vbios version is 113-RXF9310-C09-BT. When an exception occurs I can see the following change in the vram size value read from RREG32(mmCONFIG_MEMSIZE): it seems to have garbage in the upper 16 bits, and then I can also see some dmesg like below. When the vram size register has garbage, we may see an error message like: amdgpu :09:00.0: VRAM: 4286582784M 0x00F4 - 0x000FF8F4 (4286582784M used) The correct message should look like: amdgpu :09:00.0: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used) If you have any problems, please send me mail. Thanks very much. Subject: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8 Date: 2021-12-16 04:23 From: Alex Deucher To: Zongmin Zhou On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote: > > Some boards(like RX550) seem to have garbage in the upper > 16 bits of the vram size register. Check for > this and clamp the size properly. Fixes > boards reporting bogus amounts of vram. > > after add this patch,the maximum GPU VRAM size is 64GB, > otherwise only 64GB vram size will be used. 
Can you provide some examples of problematic boards and possibly a vbios image from the problematic board? What values are you seeing? It would be nice to see what the boards are reporting and whether the lower 16 bits are actually correct or if it is some other issue. This register is undefined until the asic has been initialized. The vbios programs it as part of its asic init sequence (either via vesa/gop or the OS driver). Alex > > Signed-off-by: Zongmin Zhou > --- > drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > index 492ebed2915b..63b890f1e8af 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > @@ -515,10 +515,10 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev) > static int gmc_v8_0_mc_init(struct amdgpu_device *adev) > { > int r; > + u32 tmp; > > adev->gmc.vram_width = amdgpu_atombios_get_vram_width(adev); > if (!adev->gmc.vram_width) { > - u32 tmp; > int chansize, numchan; > > /* Get VRAM informations */ > @@ -562,8 +562,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev) > adev->gmc.vram_width = numchan * chansize; > } > /* size in MB on si */ > - adev->gmc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL; > - adev->gmc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL; > + tmp = RREG32(mmCONFIG_MEMSIZE); > + /* some boards may have garbage in the upper 16 bits */ > + if (tmp & 0xffff0000) { > + DRM_INFO("Probable bad vram size: 0x%08x\n", tmp); > + if (tmp & 0xffff) > +
RE: [PATCH 3/4] drm/amdkfd: add reset queue function for RAS poison
+ + int (*unmap_queues_cpsch_poison)(struct device_queue_manager *dqm, + uint16_t pasid); }; Might be better call it reset_queue directly (match with update_queue, create_queue, .etc.,) Others look good to me The series (4 patches) is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Zhou1, Tao Sent: Thursday, December 16, 2021 19:36 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Yang, Stanley ; Chai, Thomas ; Kuehling, Felix Cc: Zhou1, Tao Subject: [PATCH 3/4] drm/amdkfd: add reset queue function for RAS poison The new interface unmaps queues with reset mode for the process consumes RAS poison, it's only for compute queue. Signed-off-by: Tao Zhou --- .../drm/amd/amdkfd/kfd_device_queue_manager.c| 16 .../drm/amd/amdkfd/kfd_device_queue_manager.h| 5 + 2 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 01a2cc3928ac..b4b0495c7024 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1476,6 +1476,21 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm, return retval; } +/* only for compute queue */ +static int unmap_queues_cpsch_poison(struct device_queue_manager *dqm, + uint16_t pasid) +{ + int retval; + + dqm_lock(dqm); + + retval = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_BY_PASID, + pasid, true); + + dqm_unlock(dqm); + return retval; +} + /* dqm->lock mutex has to be locked before calling this function */ static int execute_queues_cpsch(struct device_queue_manager *dqm, enum kfd_unmap_queues_filter filter, @@ -1896,6 +1911,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev) dqm->ops.evict_process_queues = evict_process_queues_cpsch; dqm->ops.restore_process_queues = restore_process_queues_cpsch; dqm->ops.get_wave_state = get_wave_state; + dqm->ops.unmap_queues_cpsch_poison = unmap_queues_cpsch_poison; break; 
case KFD_SCHED_POLICY_NO_HWS: /* initialize dqm for no cp scheduling */ diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h index 499fc0ea387f..19ec3e8859e8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h @@ -81,6 +81,8 @@ struct device_process_node { * * @get_wave_state: Retrieves context save state and optionally copies the * control stack, if kept in the MQD, to the given userspace address. + * + * @unmap_queues_cpsch_poison: reset queue which consumes RAS poison */ struct device_queue_manager_ops { @@ -134,6 +136,9 @@ struct device_queue_manager_ops { void __user *ctl_stack, u32 *ctl_stack_used_size, u32 *save_area_used_size); + + int (*unmap_queues_cpsch_poison)(struct device_queue_manager *dqm, + uint16_t pasid); }; struct device_queue_manager_asic_ops { -- 2.17.1
[bug report] new kvmalloc() WARN() triggered by DRM ioctls tracking
Hi DRM Devs, In commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") from July, Linus added a WARN_ONCE() for "crazy" allocations over 2GB. I have a static checker warning for this and most of the warnings are from DRM ioctls. drivers/gpu/drm/lima/lima_drv.c:124 lima_ioctl_gem_submit() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/radeon/radeon_cs.c:291 radeon_cs_parser_init() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/v3d/v3d_gem.c:311 v3d_lookup_bos() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/v3d/v3d_gem.c:319 v3d_lookup_bos() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/v3d/v3d_gem.c:601 v3d_get_multisync_post_deps() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:476 etnaviv_ioctl_gem_submit() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:477 etnaviv_ioctl_gem_submit() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:478 etnaviv_ioctl_gem_submit() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:479 etnaviv_ioctl_gem_submit() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/virtio/virtgpu_ioctl.c:186 virtio_gpu_execbuffer_ioctl() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/panfrost/panfrost_drv.c:198 panfrost_copy_in_sync() warn: uncapped user size for kvmalloc() will WARN drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:120 amdgpu_cs_parser_init() warn: uncapped user size for kvmalloc() will WARN These ioctls can all trigger the stack dump. The line numbers are from linux next (next-20211214). I feel like ideally if this could be fixed in a central way, but if not then hopefully I've added the relevant lists to the CC. regards, dan carpenter
[PATCH 3/4] drm/amdkfd: add reset queue function for RAS poison
The new interface unmaps queues with reset mode for the process that consumes RAS poison; it's only for compute queues. Signed-off-by: Tao Zhou --- .../drm/amd/amdkfd/kfd_device_queue_manager.c| 16 .../drm/amd/amdkfd/kfd_device_queue_manager.h| 5 + 2 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 01a2cc3928ac..b4b0495c7024 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1476,6 +1476,21 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm, return retval; } +/* only for compute queue */ +static int unmap_queues_cpsch_poison(struct device_queue_manager *dqm, + uint16_t pasid) +{ + int retval; + + dqm_lock(dqm); + + retval = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_BY_PASID, + pasid, true); + + dqm_unlock(dqm); + return retval; +} + /* dqm->lock mutex has to be locked before calling this function */ static int execute_queues_cpsch(struct device_queue_manager *dqm, enum kfd_unmap_queues_filter filter, @@ -1896,6 +1911,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev) dqm->ops.evict_process_queues = evict_process_queues_cpsch; dqm->ops.restore_process_queues = restore_process_queues_cpsch; dqm->ops.get_wave_state = get_wave_state; + dqm->ops.unmap_queues_cpsch_poison = unmap_queues_cpsch_poison; break; case KFD_SCHED_POLICY_NO_HWS: /* initialize dqm for no cp scheduling */ diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h index 499fc0ea387f..19ec3e8859e8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h @@ -81,6 +81,8 @@ struct device_process_node { * * @get_wave_state: Retrieves context save state and optionally copies the * control stack, if kept in the MQD, to the given userspace 
address. + * + * @unmap_queues_cpsch_poison: reset queue which consumes RAS poison */ struct device_queue_manager_ops { @@ -134,6 +136,9 @@ struct device_queue_manager_ops { void __user *ctl_stack, u32 *ctl_stack_used_size, u32 *save_area_used_size); + + int (*unmap_queues_cpsch_poison)(struct device_queue_manager *dqm, + uint16_t pasid); }; struct device_queue_manager_asic_ops { -- 2.17.1
[PATCH 4/4] drm/amdkfd: reset queue which consumes RAS poison (v2)
CP supports unmap queue with reset mode which only destroys the specific queue without affecting others. Replacing whole gpu reset with reset queue mode for RAS poison consumption saves much time, and we can also fall back to the gpu reset solution if resetting the queue fails. v2: Return directly if process is NULL; reset queue solution is not applicable to SDMA, fall back to legacy way; call kfd_unref_process after looking up the process. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 6 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 3 +- .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 44 +-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 + 4 files changed, 47 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 46cf48b3904a..0bf09a94d944 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -721,13 +721,13 @@ bool amdgpu_amdkfd_have_atomics_support(struct amdgpu_device *adev) return adev->have_atomics_support; } -void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev) +void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev, bool reset) { struct ras_err_data err_data = {0, 0, 0, NULL}; /* CPU MCA will handle page retirement if connected_to_cpu is 1 */ if (!adev->gmc.xgmi.connected_to_cpu) - amdgpu_umc_process_ras_data_cb(adev, &err_data, NULL); - else + amdgpu_umc_do_page_retirement(adev, &err_data, NULL, reset); + else if (reset) amdgpu_amdkfd_gpu_reset(adev); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index fcbc8a9c9e06..61f899e54fd5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -296,7 +296,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, uint64_t *mmap_offset); int amdgpu_amdkfd_get_tile_config(struct amdgpu_device *adev, struct tile_config *config); -void 
amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev); +void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev, + bool reset); #if IS_ENABLED(CONFIG_HSA_AMD) void amdgpu_amdkfd_gpuvm_init_mem_limits(void); void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c index deb64168c9e8..bd29b50532ea 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c @@ -89,6 +89,44 @@ enum SQ_INTERRUPT_ERROR_TYPE { #define KFD_SQ_INT_DATA__ERR_TYPE_MASK 0xF0 #define KFD_SQ_INT_DATA__ERR_TYPE__SHIFT 20 +static void event_interrupt_poison_consumption(struct kfd_dev *dev, + uint16_t pasid, uint16_t source_id) +{ + int ret = -EINVAL; + struct kfd_process *p = kfd_lookup_process_by_pasid(pasid); + + if (!p) + return; + + /* all queues of a process will be unmapped in one time */ + if (atomic_read(>poison)) { + kfd_unref_process(p); + return; + } + + atomic_set(>poison, 1); + kfd_unref_process(p); + + switch (source_id) { + case SOC15_INTSRC_SQ_INTERRUPT_MSG: + if (dev->dqm->ops.unmap_queues_cpsch_poison) + ret = dev->dqm->ops.unmap_queues_cpsch_poison(dev->dqm, pasid); + break; + case SOC15_INTSRC_SDMA_ECC: + default: + break; + } + + kfd_signal_poison_consumed_event(dev, pasid); + + /* resetting queue passes, do page retirement without gpu reset + resetting queue fails, fallback to gpu reset solution */ + if (!ret) + amdgpu_amdkfd_ras_poison_consumption_handler(dev->adev, false); + else + amdgpu_amdkfd_ras_poison_consumption_handler(dev->adev, true); +} + static bool event_interrupt_isr_v9(struct kfd_dev *dev, const uint32_t *ih_ring_entry, uint32_t *patched_ihre, @@ -230,8 +268,7 @@ static void event_interrupt_wq_v9(struct kfd_dev *dev, sq_intr_err); if (sq_intr_err != SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST && sq_intr_err != SQ_INTERRUPT_ERROR_TYPE_MEMVIOL) { - 
kfd_signal_poison_consumed_event(dev, pasid); - amdgpu_amdkfd_ras_poison_consumption_handler(dev->adev); + event_interrupt_poison_consumption(dev, pasid, + source_id);
[PATCH 2/4] drm/amdkfd: add reset parameter for unmap queues
So we can set reset mode for unmap operation, no functional change. Signed-off-by: Tao Zhou --- .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c| 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index dd0b952f0173..01a2cc3928ac 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -47,7 +47,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, uint32_t filter_param); static int unmap_queues_cpsch(struct device_queue_manager *dqm, enum kfd_unmap_queues_filter filter, - uint32_t filter_param); + uint32_t filter_param, bool reset); static int map_queues_cpsch(struct device_queue_manager *dqm); @@ -570,7 +570,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q, /* Make sure the queue is unmapped before updating the MQD */ if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) { retval = unmap_queues_cpsch(dqm, - KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0); + KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0, false); if (retval) { pr_err("unmap queue failed\n"); goto out_unlock; @@ -1223,7 +1223,7 @@ static int stop_cpsch(struct device_queue_manager *dqm) } if (!dqm->is_hws_hang) - unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0); + unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, false); hanging = dqm->is_hws_hang || dqm->is_resetting; dqm->sched_running = false; @@ -1419,7 +1419,7 @@ static int map_queues_cpsch(struct device_queue_manager *dqm) /* dqm->lock mutex has to be locked before calling this function */ static int unmap_queues_cpsch(struct device_queue_manager *dqm, enum kfd_unmap_queues_filter filter, - uint32_t filter_param) + uint32_t filter_param, bool reset) { int retval = 0; struct mqd_manager *mqd_mgr; @@ -1432,7 +1432,7 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm, return 
retval; retval = pm_send_unmap_queue(&dqm->packet_mgr, KFD_QUEUE_TYPE_COMPUTE, - filter, filter_param, false, 0); + filter, filter_param, reset, 0); if (retval) return retval; @@ -1485,7 +1485,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, if (dqm->is_hws_hang) return -EIO; - retval = unmap_queues_cpsch(dqm, filter, filter_param); + retval = unmap_queues_cpsch(dqm, filter, filter_param, false); if (retval) return retval; -- 2.17.1
[PATCH 1/4] drm/amdgpu: add gpu reset control for umc page retirement
Add a reset parameter for umc page retirement and let the caller decide whether to trigger a gpu reset in umc page retirement. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 15 --- drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 5 +++-- 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c index 6e4bea012ea4..0c33f367a4e5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c @@ -23,6 +23,13 @@ #include "amdgpu_ras.h" +static int amdgpu_umc_process_ras_data_cb(struct amdgpu_device *adev, + void *ras_error_status, + struct amdgpu_iv_entry *entry) +{ + return amdgpu_umc_do_page_retirement(adev, ras_error_status, entry, true); +} + int amdgpu_umc_ras_late_init(struct amdgpu_device *adev) { int r; @@ -88,9 +95,10 @@ void amdgpu_umc_ras_fini(struct amdgpu_device *adev) } } -int amdgpu_umc_process_ras_data_cb(struct amdgpu_device *adev, +int amdgpu_umc_do_page_retirement(struct amdgpu_device *adev, void *ras_error_status, - struct amdgpu_iv_entry *entry) + struct amdgpu_iv_entry *entry, + bool reset) { struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status; struct amdgpu_ras *con = amdgpu_ras_get_context(adev); @@ -164,7 +172,8 @@ int amdgpu_umc_process_ras_data_cb(struct amdgpu_device *adev, adev->smu.ppt_funcs->send_hbm_bad_pages_num(&adev->smu, con->eeprom_control.ras_num_recs); } - amdgpu_ras_reset_gpu(adev); + if (reset) + amdgpu_ras_reset_gpu(adev); } kfree(err_data->err_addr); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h index 9e40bade0a68..8d18d5121f66 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h @@ -78,9 +78,10 @@ struct amdgpu_umc { int amdgpu_umc_ras_late_init(struct amdgpu_device *adev); void amdgpu_umc_ras_fini(struct amdgpu_device *adev); -int amdgpu_umc_process_ras_data_cb(struct amdgpu_device *adev, +int
amdgpu_umc_do_page_retirement(struct amdgpu_device *adev, void *ras_error_status, - struct amdgpu_iv_entry *entry); + struct amdgpu_iv_entry *entry, + bool reset); int amdgpu_umc_process_ecc_irq(struct amdgpu_device *adev, struct amdgpu_irq_src *source, struct amdgpu_iv_entry *entry); -- 2.17.1
Re: [PATCH v4 2/6] drm: improve drm_buddy_alloc function
Hi Am 01.12.21 um 17:39 schrieb Arunpravin: - Make drm_buddy_alloc a single function to handle range allocation and non-range allocation demands - Implemented a new function alloc_range() which allocates the requested power-of-two block complying with range limitations - Moved order computation and memory alignment logic from i915 driver to drm buddy v2: merged below changes to keep the build unbroken - drm_buddy_alloc_range() becomes obsolete and may be removed - enable ttm range allocation (fpfn / lpfn) support in i915 driver - apply enhanced drm_buddy_alloc() function to i915 driver v3(Matthew Auld): - Fix alignment issues and remove unnecessary list_empty check - add more validation checks for input arguments - make alloc_range() block allocations as bottom-up - optimize order computation logic - replace uint64_t with u64, which is preferred in the kernel v4(Matthew Auld): - keep drm_buddy_alloc_range() function implementation for generic actual range allocations - keep alloc_range() implementation for end bias allocations Signed-off-by: Arunpravin --- drivers/gpu/drm/drm_buddy.c | 316 +- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 67 ++-- drivers/gpu/drm/i915/i915_ttm_buddy_manager.h | 2 + include/drm/drm_buddy.h | 22 +- 4 files changed, 285 insertions(+), 122 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 9340a4b61c5a..7f47632821f4 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -280,23 +280,97 @@ void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects) } EXPORT_SYMBOL(drm_buddy_free_list); -/** - * drm_buddy_alloc - allocate power-of-two blocks - * - * @mm: DRM buddy manager to allocate from - * @order: size of the allocation - * - * The order value here translates to: - * - * 0 = 2^0 * mm->chunk_size - * 1 = 2^1 * mm->chunk_size - * 2 = 2^2 * mm->chunk_size - * - * Returns: - * allocated ptr to the &drm_buddy_block on success - */ -struct drm_buddy_block *
-drm_buddy_alloc(struct drm_buddy_mm *mm, unsigned int order) +static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2) +{ + return s1 <= e2 && e1 >= s2; +} + +static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2) +{ + return s1 <= s2 && e1 >= e2; +} + +static struct drm_buddy_block * +alloc_range_bias(struct drm_buddy_mm *mm, +u64 start, u64 end, +unsigned int order) +{ + struct drm_buddy_block *block; + struct drm_buddy_block *buddy; + LIST_HEAD(dfs); + int err; + int i; + + end = end - 1; + + for (i = 0; i < mm->n_roots; ++i) + list_add_tail(&mm->roots[i]->tmp_link, &dfs); + + do { + u64 block_start; + u64 block_end; + + block = list_first_entry_or_null(&dfs, +struct drm_buddy_block, +tmp_link); + if (!block) + break; + + list_del(&block->tmp_link); + + if (drm_buddy_block_order(block) < order) + continue; + + block_start = drm_buddy_block_offset(block); + block_end = block_start + drm_buddy_block_size(mm, block) - 1; + + if (!overlaps(start, end, block_start, block_end)) + continue; + + if (drm_buddy_block_is_allocated(block)) + continue; + + if (contains(start, end, block_start, block_end) && + order == drm_buddy_block_order(block)) { + /* +* Find the free block within the range. +*/ + if (drm_buddy_block_is_free(block)) + return block; + + continue; + } + + if (!drm_buddy_block_is_split(block)) { + err = split_block(mm, block); + if (unlikely(err)) + goto err_undo; + } + + list_add(&block->right->tmp_link, &dfs); + list_add(&block->left->tmp_link, &dfs); + } while (1); + + return ERR_PTR(-ENOSPC); + +err_undo: + /* +* We really don't want to leave around a bunch of split blocks, since +* bigger is better, so make sure we merge everything back before we +* free the allocated blocks. +*/ + buddy = get_buddy(block); + if (buddy && + (drm_buddy_block_is_free(block) && +drm_buddy_block_is_free(buddy))) + __drm_buddy_free(mm, block); + return ERR_PTR(err); +} + +static struct drm_buddy_block * +alloc_from_freelist(struct drm_buddy_mm *mm, + unsigned int order, + unsigned long
Re: [PATCH v4 2/6] drm: improve drm_buddy_alloc function
On 15/12/2021 20:46, Arunpravin wrote: On 14/12/21 12:29 am, Matthew Auld wrote: On 09/12/2021 15:47, Paneer Selvam, Arunpravin wrote: [AMD Official Use Only] Hi Matthew, Ping on this? No new comments from me :) I guess just a question of what we should do with the selftests, and then ofc at some point being able to throw this at CI, or at least test locally, once the series builds. sure :) I think we should rewrite the i915 buddy selftests since now we have a single function for range and non-range requirements. I will rewrite the i915 buddy selftests and move to drm selftests folder? so for the time being, I remove the i915_buddy_mock_selftest() from i915_mock_selftests.h list to avoid build errors? Yeah, whatever is easiest. Regards, Arun
[PATCH] drm/amd/display: Fix warning comparing pointer to 0
Fix the following coccicheck warning: ./drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c:744:35-36: WARNING comparing pointer to 0. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c index 8f78e62b28b7..48005def1164 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c @@ -741,7 +741,7 @@ void vg_clk_mgr_construct( sizeof(struct watermarks), &clk_mgr->smu_wm_set.mc_address.quad_part); - if (clk_mgr->smu_wm_set.wm_set == 0) { + if (!clk_mgr->smu_wm_set.wm_set) { clk_mgr->smu_wm_set.wm_set = &dummy_wms; clk_mgr->smu_wm_set.mc_address.quad_part = 0; } -- 2.20.1.7.g153144c
Re: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure
The !drm_dev_enter() is quite unusual and deserves a comment explaining what's going on here. Apart from that it looks good with the typos fixed, I think. Christian. Am 16.12.21 um 08:27 schrieb Chen, Guchun: [Public] My BAD to misunderstand this. There are both spell typos in patch subject and body, s/iff/if. The patch is: Reviewed-by: Guchun Chen Please wait for the ack from Andrey and Christian before pushing this. Regards, Guchun -Original Message- From: Shi, Leslie Sent: Thursday, December 16, 2021 3:00 PM To: Chen, Guchun ; Grodzovsky, Andrey ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure [Public] Hi Guchun, As Andrey says, "we should not call amdgpu_device_unmap_mmio unless device is unplugged", I think we should call amdgpu_device_unmap_mmio() only if the device is unplugged (drm_dev_enter() returns false). + if (!drm_dev_enter(adev_to_drm(adev), &idx)) + amdgpu_device_unmap_mmio(adev); + else + drm_dev_exit(idx); Regards, Leslie -Original Message- From: Chen, Guchun Sent: Thursday, December 16, 2021 2:46 PM To: Shi, Leslie ; Grodzovsky, Andrey ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure [Public] Hi Leslie, I think we need to modify it like: + if (drm_dev_enter(adev_to_drm(adev), &idx)) { + amdgpu_device_unmap_mmio(adev); + drm_dev_exit(idx); +} Also you need to credit Andrey a 'suggested-by' in your patch.
Regards, Guchun -Original Message- From: Shi, Leslie Sent: Thursday, December 16, 2021 2:14 PM To: Grodzovsky, Andrey ; Koenig, Christian ; Pan, Xinhui ; Deucher, Alexander ; amd-gfx@lists.freedesktop.org Cc: Chen, Guchun ; Shi, Leslie Subject: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure [Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns error during driver modprobe, it will start the error handle path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, driver still visits the unmapped memory like vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini. So a kernel crash occurs. [How] Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent invalid memory address in vcn_v3_0_sw_fini() on GPU initialization failure. Signed-off-by: Leslie Shi --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index fb03d75880ec..d3656e7b60c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3845,6 +3845,8 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev) */ void amdgpu_device_fini_hw(struct amdgpu_device *adev) { + int idx; + dev_info(adev->dev, "amdgpu: finishing device.\n"); flush_delayed_work(&adev->delayed_init_work); if (adev->mman.initialized) { @@ -3888,7 +3890,11 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) amdgpu_gart_dummy_page_fini(adev); - amdgpu_device_unmap_mmio(adev); + if (!drm_dev_enter(adev_to_drm(adev), &idx)) + amdgpu_device_unmap_mmio(adev); + else + drm_dev_exit(idx); + } void amdgpu_device_fini_sw(struct amdgpu_device *adev) -- 2.25.1
Re: [PATCH 6/7] drm/amdgpu: Ensure kunmap is called on error
Am 15.12.21 um 22:09 schrieb Ira Weiny: On Tue, Dec 14, 2021 at 08:09:29AM +0100, Christian König wrote: Am 14.12.21 um 04:37 schrieb Ira Weiny: On Mon, Dec 13, 2021 at 09:37:32PM +0100, Christian König wrote: Am 11.12.21 um 00:24 schrieb ira.we...@intel.com: From: Ira Weiny The default case leaves the buffer object mapped in error. Add amdgpu_bo_kunmap() to that case to ensure the mapping is cleaned up. Mhm, good catch. But why do you want to do this in the first place? I'm not sure I understand the question. Any mapping of memory should be paired with an unmapping when no longer needed. And this is supported by the call to amdgpu_bo_kunmap() in the other non-default cases. Do you believe the mapping is not needed? No, the unmapping is not needed here. See the function amdgpu_bo_kmap(), it either creates the mapping or return the cached pointer. Ah I missed that. Thanks. A call to amdgpu_bo_kunmap() is only done in a few places where we know that the created mapping most likely won't be needed any more. If that's not done the mapping is automatically destroyed when the BO is moved or freed up. I mean good bug fix, but you seem to see this as some kind of prerequisite to some follow up work converting TTM to use kmap_local() which most likely won't work in the first place. Sure. I see now that it is more complicated than I thought but I never thought of this as a strict prerequisite. Just something I found while trying to figure out how this works. How much of a speed up is it to maintain the ttm_bo_map_kmap map type? Good question. I don't really know. This used to be pretty important for older drivers since there the kernel needs to kmap individual pages and patch them up before sending the command stream to the hardware. It most likely doesn't matter for modern hardware. Could this all be done with vmap and just remove the kmap stuff? Maybe, but I wouldn't bet on it and I don't really want to touch any of the old drivers to figure that out. Christian. 
Ira Regards, Christian. Ira Christian. Signed-off-by: Ira Weiny --- NOTE: It seems like this function could use a fair bit of refactoring but this is the easiest way to fix the actual bug. --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 1 + 1 file changed, 1 insertion(+) nice diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index 6f8de11a17f1..b3ffd0f6b35f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c @@ -889,6 +889,7 @@ static int amdgpu_uvd_cs_msg(struct amdgpu_uvd_cs_ctx *ctx, return 0; default: + amdgpu_bo_kunmap(bo); DRM_ERROR("Illegal UVD message type (%d)!\n", msg_type); }
Re: [PATCH] drm/amdgpu: fix mismatch warning between the prototype and function name
Am 16.12.21 um 05:39 schrieb Huang Rui: Fix the typo to align with the prototype and function name. All warnings (new ones prefixed by >>): drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:631: warning: expecting prototype for amdgpu_fence_clear_job_fences(). Prototype was for amdgpu_fence_driver_clear_job_fences() instead Reported-by: kernel test robot Signed-off-by: Huang Rui Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index db41d16838b9..9afd11ca2709 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -622,7 +622,7 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev) } /** - * amdgpu_fence_clear_job_fences - clear job embedded fences of ring + * amdgpu_fence_driver_clear_job_fences - clear job embedded fences of ring * * @ring: fence of the ring to be cleared *
Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Hi Zongmin, that strongly sounds like the ASIC is not correctly initialized when trying to read the register. What board and environment are you using this GPU with? Is that a normal x86 system? Regards, Christian. Am 16.12.21 um 04:11 schrieb 周宗敏: 1. The problematic board that I have tested is an [AMD/ATI] Lexa PRO [Radeon RX 550/550X]; the vbios version is 113-RXF9310-C09-BT. 2. When the exception occurs I can see the following changes in the vram size value read from RREG32(mmCONFIG_MEMSIZE); it seems to have garbage in the upper 16 bits. [inline screenshot omitted] 3. And then I can also see some dmesg like below. When the vram size register has garbage, we may see an error message like: amdgpu 0000:09:00.0: VRAM: 4286582784M 0x00F4 - 0x000FF8F4 (4286582784M used) The correct message should look like: amdgpu 0000:09:00.0: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used) If you have any problems, please send me mail. Thanks very much. *Subject:* Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8 *Date:* 2021-12-16 04:23 *From:* Alex Deucher *To:* Zongmin Zhou On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote: > > Some boards (like RX550) seem to have garbage in the upper > 16 bits of the vram size register. Check for > this and clamp the size properly. Fixes > boards reporting bogus amounts of vram. > > After adding this patch, the maximum GPU VRAM size is 64GB; > otherwise only 64GB vram size will be used. Can you provide some examples of problematic boards and possibly a vbios image from the problematic board? What values are you seeing? It would be nice to see what the boards are reporting and whether the lower 16 bits are actually correct or if it is some other issue. This register is undefined until the asic has been initialized. The vbios programs it as part of its asic init sequence (either via vesa/gop or the OS driver).
Alex > > Signed-off-by: Zongmin Zhou > --- > drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > index 492ebed2915b..63b890f1e8af 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > @@ -515,10 +515,10 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev) > static int gmc_v8_0_mc_init(struct amdgpu_device *adev) > { > int r; > + u32 tmp; > > adev->gmc.vram_width = amdgpu_atombios_get_vram_width(adev); > if (!adev->gmc.vram_width) { > - u32 tmp; > int chansize, numchan; > > /* Get VRAM informations */ > @@ -562,8 +562,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev) > adev->gmc.vram_width = numchan * chansize; > } > /* size in MB on si */ > - adev->gmc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL; > - adev->gmc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL; > + tmp = RREG32(mmCONFIG_MEMSIZE); > + /* some boards may have garbage in the upper 16 bits */ > + if (tmp & 0xffff0000) { > + DRM_INFO("Probable bad vram size: 0x%08x\n", tmp); > + if (tmp & 0xffff0000) > + tmp &= 0xffff; > + } > + adev->gmc.mc_vram_size = tmp * 1024ULL * 1024ULL; > + adev->gmc.real_vram_size = adev->gmc.mc_vram_size; > > if (!(adev->flags & AMD_IS_APU)) { > r = amdgpu_device_resize_fb_bar(adev); > -- > 2.25.1