Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
Sure, patch 4 Reviewed-by: Andrey Grodzovsky andrey.grodzov...@amd.com and patch 5 Acked-by: Andrey Grodzovsky andrey.grodzov...@amd.com since I am not sure about the KFD bits. Andrey On 2021-03-08 11:10 a.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Hi, Andrey. The first 3 patches in this serial already been acked by Alex. D, can you help review the rest two ? Thanks Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 10:53 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe I see, thanks for explaning. Andrey On 2021-03-08 10:27 a.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Check the function amdgpu_xgmi_add_device, when psp XGMI TA is bot available , the driver will assign a faked hive ID 0x10 for all GPUs, it means all GPU will belongs to one same hive . So I can still use hive->tb to sync the reset on all GPUs. The reason I can not use the default amdgpu_do_asic_reset function is because we want to build correct hive and node topology for all GPUs after reset, so we need to call amdgpu_xgmi_add_device inside the amdgpu_do_asic_reset function . To make this works , we need to destroy the hive by remove the device (call amdgpu_xgmi_remove_device) first , so when calling amdgpu_do_asic_reset , the faked hive(0x10) already been destroyed. And the hive->tb will not work in this case . That's the reason I need to call the reset explicitly with the faked hive and then destroy the hive , build the device_list for amdgpu_do_asic_reset without the hive . Hope I explain it clearly . Thanks Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 1:28 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe But the hive->tb object is used regardless, inside amdgpu_device_xgmi_reset_func currently, it means then even when you explcicitly schdule xgmi_reset_work as you do now they code will try to sync using a not well iniitlized tb object. Maybe you can define a global static tb object, fill it in the loop where you send xgmi_reset_work for all devices in system and use it from within amdgpu_device_xgmi_reset_func instead of the regular per hive tb object (obviosly under your special use case). Andrey On 2021-03-06 4:11 p.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] It seems I can not directly reuse the reset HW function inside the amdgpu_do_asic_reset, the synchronization is based on hive->tb, but as explained , we actually don't know the GPU belongs to which hive and will rebuild the correct hive info inside the amdgpu_do_asic_reset function with amdgpu_xgmi_add_device . so I need to remove all GPUs from the hive first . This will lead to the sync don't work since the hive->tb will be removed as well when all GPUs are removed . Thanks shaopyunliu -Original Message- From: amd-gfx On Behalf Of Liu, Shaoyun Sent: Saturday, March 6, 2021 3:41 PM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe [AMD Official Use Only - Internal Distribution Only] I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so the reset won't be execute twice . but probably I can set this parameter to true and remove the code schedule for reset since now I already build the device_list not based on hive. Let me try that . For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually not wait for SMU to start. As I explained , I need to reset the all the GPUs in the system since I don't know which gpus belongs to which hive. So this time is allow system to probe all the GPUs in the system which means when this delayed thread starts , we can assume all the devices already been populated in mgpu_info. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Saturday, March 6, 2021 1:09 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicilty in the begining of amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset which will do the same thing too. It looks like the physical reset will execute twice for each device. Another thing is, more like improvement suggestion - currently you schedule delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give enough time for SMU to start ? Is there maybe a way to instead poll for SMU start completion and then execute this - some SMU status registers maybe ? Just to avoid relying on
RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
[AMD Official Use Only - Internal Distribution Only] Hi, Andrey. The first 3 patches in this serial already been acked by Alex. D, can you help review the rest two ? Thanks Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 10:53 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe I see, thanks for explaning. Andrey On 2021-03-08 10:27 a.m., Liu, Shaoyun wrote: > [AMD Official Use Only - Internal Distribution Only] > > Check the function amdgpu_xgmi_add_device, when psp XGMI TA is bot available > , the driver will assign a faked hive ID 0x10 for all GPUs, it means all > GPU will belongs to one same hive . So I can still use hive->tb to sync the > reset on all GPUs. The reason I can not use the default > amdgpu_do_asic_reset function is because we want to build correct hive and > node topology for all GPUs after reset, so we need to call > amdgpu_xgmi_add_device inside the amdgpu_do_asic_reset function . To make > this works , we need to destroy the hive by remove the device (call > amdgpu_xgmi_remove_device) first , so when calling amdgpu_do_asic_reset , > the faked hive(0x10) already been destroyed. And the hive->tb will not > work in this case . That's the reason I need to call the reset explicitly > with the faked hive and then destroy the hive , build the device_list for > amdgpu_do_asic_reset without the hive . > Hope I explain it clearly . > > Thanks > Shaoyun.liu > > -Original Message- > From: Grodzovsky, Andrey > Sent: Monday, March 8, 2021 1:28 AM > To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI > hive duirng probe > > But the hive->tb object is used regardless, inside > amdgpu_device_xgmi_reset_func currently, it means then even when you > explcicitly schdule xgmi_reset_work as you do now they code will try to sync > using a not well iniitlized tb object. Maybe you can define a global static > tb object, fill it in the loop where you send xgmi_reset_work for all devices > in system and use it from within amdgpu_device_xgmi_reset_func instead of the > regular per hive tb object (obviosly under your special use case). > > Andrey > > On 2021-03-06 4:11 p.m., Liu, Shaoyun wrote: >> [AMD Official Use Only - Internal Distribution Only] >> >> It seems I can not directly reuse the reset HW function inside the >> amdgpu_do_asic_reset, the synchronization is based on hive->tb, but as >> explained , we actually don't know the GPU belongs to which hive and will >> rebuild the correct hive info inside the amdgpu_do_asic_reset function with >> amdgpu_xgmi_add_device . so I need to remove all GPUs from the hive first >> . This will lead to the sync don't work since the hive->tb will be removed >> as well when all GPUs are removed . >> >> Thanks >> shaopyunliu >> >> -----Original Message- >> From: amd-gfx On Behalf Of >> Liu, Shaoyun >> Sent: Saturday, March 6, 2021 3:41 PM >> To: Grodzovsky, Andrey ; >> amd-gfx@lists.freedesktop.org >> Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI >> hive duirng probe >> >> [AMD Official Use Only - Internal Distribution Only] >> >> I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so >> the reset won't be execute twice . but probably I can set this parameter >> to true and remove the code schedule for reset since now I already build the >> device_list not based on hive. Let me try that . >> For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually >> not wait for SMU to start. As I explained , I need to reset the all the >> GPUs in the system since I don't know which gpus belongs to which hive. So >> this time is allow system to probe all the GPUs in the system which means >> when this delayed thread starts , we can assume all the devices already >> been populated in mgpu_info. >> >> Regards >> Shaoyun.liu >> >> -Original Message- >> From: Grodzovsky, Andrey >> Sent: Saturday, March 6, 2021 1:09 AM >> To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI >> hive duirng probe >> >> Thanks for explaining this, one thing I still don't understand is why you >> schedule the reset work explicilty in the begining of >> amdgpu_drv_delayed_reset_work_handler and then also call >> amdgpu_do_asic_reset which will do the same th
Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
I see, thanks for explaning. Andrey On 2021-03-08 10:27 a.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Check the function amdgpu_xgmi_add_device, when psp XGMI TA is bot available , the driver will assign a faked hive ID 0x10 for all GPUs, it means all GPU will belongs to one same hive . So I can still use hive->tb to sync the reset on all GPUs. The reason I can not use the default amdgpu_do_asic_reset function is because we want to build correct hive and node topology for all GPUs after reset, so we need to call amdgpu_xgmi_add_device inside the amdgpu_do_asic_reset function . To make this works , we need to destroy the hive by remove the device (call amdgpu_xgmi_remove_device) first , so when calling amdgpu_do_asic_reset , the faked hive(0x10) already been destroyed. And the hive->tb will not work in this case . That's the reason I need to call the reset explicitly with the faked hive and then destroy the hive , build the device_list for amdgpu_do_asic_reset without the hive . Hope I explain it clearly . Thanks Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 1:28 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe But the hive->tb object is used regardless, inside amdgpu_device_xgmi_reset_func currently, it means then even when you explcicitly schdule xgmi_reset_work as you do now they code will try to sync using a not well iniitlized tb object. Maybe you can define a global static tb object, fill it in the loop where you send xgmi_reset_work for all devices in system and use it from within amdgpu_device_xgmi_reset_func instead of the regular per hive tb object (obviosly under your special use case). Andrey On 2021-03-06 4:11 p.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] It seems I can not directly reuse the reset HW function inside the amdgpu_do_asic_reset, the synchronization is based on hive->tb, but as explained , we actually don't know the GPU belongs to which hive and will rebuild the correct hive info inside the amdgpu_do_asic_reset function with amdgpu_xgmi_add_device . so I need to remove all GPUs from the hive first . This will lead to the sync don't work since the hive->tb will be removed as well when all GPUs are removed . Thanks shaopyunliu -Original Message- From: amd-gfx On Behalf Of Liu, Shaoyun Sent: Saturday, March 6, 2021 3:41 PM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe [AMD Official Use Only - Internal Distribution Only] I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so the reset won't be execute twice . but probably I can set this parameter to true and remove the code schedule for reset since now I already build the device_list not based on hive. Let me try that . For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually not wait for SMU to start. As I explained , I need to reset the all the GPUs in the system since I don't know which gpus belongs to which hive. So this time is allow system to probe all the GPUs in the system which means when this delayed thread starts , we can assume all the devices already been populated in mgpu_info. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Saturday, March 6, 2021 1:09 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicilty in the begining of amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset which will do the same thing too. It looks like the physical reset will execute twice for each device. Another thing is, more like improvement suggestion - currently you schedule delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give enough time for SMU to start ? Is there maybe a way to instead poll for SMU start completion and then execute this - some SMU status registers maybe ? Just to avoid relying on this arbitrary value. Andrey On 2021-03-05 8:37 p.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Hi, Andrey The existing reset function (amdgpu_device_gpu_recover or amd do_asic _reset) assumed driver already have the correct hive info . But in my case, it's not true . The gpus are in a bad state and the XGMI TA might not functional properly , so driver can not get the hive and node info when probe the device . It means driver even don't know the device belongs to which hive on a system with multiple hive configuration (ex, 8 gpus in two hive). The only solution I can think of is let
RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
[AMD Official Use Only - Internal Distribution Only] Check the function amdgpu_xgmi_add_device, when psp XGMI TA is bot available , the driver will assign a faked hive ID 0x10 for all GPUs, it means all GPU will belongs to one same hive . So I can still use hive->tb to sync the reset on all GPUs. The reason I can not use the default amdgpu_do_asic_reset function is because we want to build correct hive and node topology for all GPUs after reset, so we need to call amdgpu_xgmi_add_device inside the amdgpu_do_asic_reset function . To make this works , we need to destroy the hive by remove the device (call amdgpu_xgmi_remove_device) first , so when calling amdgpu_do_asic_reset , the faked hive(0x10) already been destroyed. And the hive->tb will not work in this case . That's the reason I need to call the reset explicitly with the faked hive and then destroy the hive , build the device_list for amdgpu_do_asic_reset without the hive . Hope I explain it clearly . Thanks Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 1:28 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe But the hive->tb object is used regardless, inside amdgpu_device_xgmi_reset_func currently, it means then even when you explcicitly schdule xgmi_reset_work as you do now they code will try to sync using a not well iniitlized tb object. Maybe you can define a global static tb object, fill it in the loop where you send xgmi_reset_work for all devices in system and use it from within amdgpu_device_xgmi_reset_func instead of the regular per hive tb object (obviosly under your special use case). Andrey On 2021-03-06 4:11 p.m., Liu, Shaoyun wrote: > [AMD Official Use Only - Internal Distribution Only] > > It seems I can not directly reuse the reset HW function inside the > amdgpu_do_asic_reset, the synchronization is based on hive->tb, but as > explained , we actually don't know the GPU belongs to which hive and will > rebuild the correct hive info inside the amdgpu_do_asic_reset function with > amdgpu_xgmi_add_device . so I need to remove all GPUs from the hive first . > This will lead to the sync don't work since the hive->tb will be removed as > well when all GPUs are removed . > > Thanks > shaopyunliu > > -Original Message- > From: amd-gfx On Behalf Of > Liu, Shaoyun > Sent: Saturday, March 6, 2021 3:41 PM > To: Grodzovsky, Andrey ; > amd-gfx@lists.freedesktop.org > Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI > hive duirng probe > > [AMD Official Use Only - Internal Distribution Only] > > I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so > the reset won't be execute twice . but probably I can set this parameter to > true and remove the code schedule for reset since now I already build the > device_list not based on hive. Let me try that . > For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually > not wait for SMU to start. As I explained , I need to reset the all the GPUs > in the system since I don't know which gpus belongs to which hive. So this > time is allow system to probe all the GPUs in the system which means when > this delayed thread starts , we can assume all the devices already been > populated in mgpu_info. > > Regards > Shaoyun.liu > > -Original Message- > From: Grodzovsky, Andrey > Sent: Saturday, March 6, 2021 1:09 AM > To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI > hive duirng probe > > Thanks for explaining this, one thing I still don't understand is why you > schedule the reset work explicilty in the begining of > amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset > which will do the same thing too. It looks like the physical reset will > execute twice for each device. > Another thing is, more like improvement suggestion - currently you schedule > delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give > enough time for SMU to start ? Is there maybe a way to instead poll for SMU > start completion and then execute this - some SMU status registers maybe ? > Just to avoid relying on this arbitrary value. > > Andrey > > On 2021-03-05 8:37 p.m., Liu, Shaoyun wrote: >> [AMD Official Use Only - Internal Distribution Only] >> >> Hi, Andrey >> The existing reset function (amdgpu_device_gpu_recover or amd do_asic >> _reset) assumed driver already have the correct hive info . But in my case, >> it's not true . The gpus are in a bad state and the XGMI TA might not >>
Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
But the hive->tb object is used regardless, inside amdgpu_device_xgmi_reset_func currently, it means then even when you explcicitly schdule xgmi_reset_work as you do now they code will try to sync using a not well iniitlized tb object. Maybe you can define a global static tb object, fill it in the loop where you send xgmi_reset_work for all devices in system and use it from within amdgpu_device_xgmi_reset_func instead of the regular per hive tb object (obviosly under your special use case). Andrey On 2021-03-06 4:11 p.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] It seems I can not directly reuse the reset HW function inside the amdgpu_do_asic_reset, the synchronization is based on hive->tb, but as explained , we actually don't know the GPU belongs to which hive and will rebuild the correct hive info inside the amdgpu_do_asic_reset function with amdgpu_xgmi_add_device . so I need to remove all GPUs from the hive first . This will lead to the sync don't work since the hive->tb will be removed as well when all GPUs are removed . Thanks shaopyunliu -Original Message- From: amd-gfx On Behalf Of Liu, Shaoyun Sent: Saturday, March 6, 2021 3:41 PM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe [AMD Official Use Only - Internal Distribution Only] I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so the reset won't be execute twice . but probably I can set this parameter to true and remove the code schedule for reset since now I already build the device_list not based on hive. Let me try that . For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually not wait for SMU to start. As I explained , I need to reset the all the GPUs in the system since I don't know which gpus belongs to which hive. So this time is allow system to probe all the GPUs in the system which means when this delayed thread starts , we can assume all the devices already been populated in mgpu_info. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Saturday, March 6, 2021 1:09 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicilty in the begining of amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset which will do the same thing too. It looks like the physical reset will execute twice for each device. Another thing is, more like improvement suggestion - currently you schedule delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give enough time for SMU to start ? Is there maybe a way to instead poll for SMU start completion and then execute this - some SMU status registers maybe ? Just to avoid relying on this arbitrary value. Andrey On 2021-03-05 8:37 p.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Hi, Andrey The existing reset function (amdgpu_device_gpu_recover or amd do_asic _reset) assumed driver already have the correct hive info . But in my case, it's not true . The gpus are in a bad state and the XGMI TA might not functional properly , so driver can not get the hive and node info when probe the device . It means driver even don't know the device belongs to which hive on a system with multiple hive configuration (ex, 8 gpus in two hive). The only solution I can think of is let driver trigger the reset on all gpus at the same time after driver do the minimum initialization on the HW ( bring up the SMU IP) no matter they belongs to the same hive or not and call amdgpu_xgmi_add_device for each device after re-init . The 100 ms delay added after the baco reset . I think they can be removed . let me verify it. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Friday, March 5, 2021 2:27 PM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe On 2021-03-05 12:52 p.m., shaoyunl wrote: In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices without sync to each other. This could cause device hang since for XGMI configuration, all the devices within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time Signed-off-by: shaoyunl Change-Id: I34e838e611b7623c7ad824704c7ce350808014fc --- drivers/gpu/drm/a
RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
[AMD Official Use Only - Internal Distribution Only] It seems I can not directly reuse the reset HW function inside the amdgpu_do_asic_reset, the synchronization is based on hive->tb, but as explained , we actually don't know the GPU belongs to which hive and will rebuild the correct hive info inside the amdgpu_do_asic_reset function with amdgpu_xgmi_add_device . so I need to remove all GPUs from the hive first . This will lead to the sync don't work since the hive->tb will be removed as well when all GPUs are removed . Thanks shaopyunliu -Original Message- From: amd-gfx On Behalf Of Liu, Shaoyun Sent: Saturday, March 6, 2021 3:41 PM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe [AMD Official Use Only - Internal Distribution Only] I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so the reset won't be execute twice . but probably I can set this parameter to true and remove the code schedule for reset since now I already build the device_list not based on hive. Let me try that . For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually not wait for SMU to start. As I explained , I need to reset the all the GPUs in the system since I don't know which gpus belongs to which hive. So this time is allow system to probe all the GPUs in the system which means when this delayed thread starts , we can assume all the devices already been populated in mgpu_info. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Saturday, March 6, 2021 1:09 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicilty in the begining of amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset which will do the same thing too. It looks like the physical reset will execute twice for each device. Another thing is, more like improvement suggestion - currently you schedule delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give enough time for SMU to start ? Is there maybe a way to instead poll for SMU start completion and then execute this - some SMU status registers maybe ? Just to avoid relying on this arbitrary value. Andrey On 2021-03-05 8:37 p.m., Liu, Shaoyun wrote: > [AMD Official Use Only - Internal Distribution Only] > > Hi, Andrey > The existing reset function (amdgpu_device_gpu_recover or amd do_asic _reset) > assumed driver already have the correct hive info . But in my case, it's > not true . The gpus are in a bad state and the XGMI TA might not functional > properly , so driver can not get the hive and node info when probe the > device . It means driver even don't know the device belongs to which hive > on a system with multiple hive configuration (ex, 8 gpus in two hive). The > only solution I can think of is let driver trigger the reset on all gpus at > the same time after driver do the minimum initialization on the HW ( bring up > the SMU IP) no matter they belongs to the same hive or not and call > amdgpu_xgmi_add_device for each device after re-init . > The 100 ms delay added after the baco reset . I think they can be removed . > let me verify it. > > Regards > Shaoyun.liu > > > > -Original Message- > From: Grodzovsky, Andrey > Sent: Friday, March 5, 2021 2:27 PM > To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI > hive duirng probe > > > > On 2021-03-05 12:52 p.m., shaoyunl wrote: >> In passthrough configuration, hypervisior will trigger the >> SBR(Secondary bus reset) to the devices without sync to each other. >> This could cause device hang since for XGMI configuration, all the >> devices within the hive need to be reset at a limit time slot. This >> serial of patches try to solve this issue by co-operate with new SMU >> which will only do minimum house keeping to response the SBR request >> but don't do the real reset job and leave it to driver. Driver need >> to do the whole sw init and minimum HW init to bring up the SMU and >> trigger the reset(possibly BACO) on all the ASICs at the same time >> >> Signed-off-by: shaoyunl >> Change-Id: I34e838e611b7623c7ad824704c7ce350808014fc >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu.h| 13 +++ >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 102 +++-- >>drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 71 ++ >>drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 1 + >>drivers
RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
[AMD Official Use Only - Internal Distribution Only] I call the amdgpu_do_asic_reset with the parameter skip_hw_reset = true so the reset won't be execute twice . but probably I can set this parameter to true and remove the code schedule for reset since now I already build the device_list not based on hive. Let me try that . For the schedule delayed work thread with AMDGPU_RESUME_MS, It's actually not wait for SMU to start. As I explained , I need to reset the all the GPUs in the system since I don't know which gpus belongs to which hive. So this time is allow system to probe all the GPUs in the system which means when this delayed thread starts , we can assume all the devices already been populated in mgpu_info. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Saturday, March 6, 2021 1:09 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicilty in the begining of amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset which will do the same thing too. It looks like the physical reset will execute twice for each device. Another thing is, more like improvement suggestion - currently you schedule delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give enough time for SMU to start ? Is there maybe a way to instead poll for SMU start completion and then execute this - some SMU status registers maybe ? Just to avoid relying on this arbitrary value. Andrey On 2021-03-05 8:37 p.m., Liu, Shaoyun wrote: > [AMD Official Use Only - Internal Distribution Only] > > Hi, Andrey > The existing reset function (amdgpu_device_gpu_recover or amd do_asic _reset) > assumed driver already have the correct hive info . But in my case, it's > not true . The gpus are in a bad state and the XGMI TA might not functional > properly , so driver can not get the hive and node info when probe the > device . It means driver even don't know the device belongs to which hive > on a system with multiple hive configuration (ex, 8 gpus in two hive). The > only solution I can think of is let driver trigger the reset on all gpus at > the same time after driver do the minimum initialization on the HW ( bring up > the SMU IP) no matter they belongs to the same hive or not and call > amdgpu_xgmi_add_device for each device after re-init . > The 100 ms delay added after the baco reset . I think they can be removed . > let me verify it. > > Regards > Shaoyun.liu > > > > -Original Message- > From: Grodzovsky, Andrey > Sent: Friday, March 5, 2021 2:27 PM > To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI > hive duirng probe > > > > On 2021-03-05 12:52 p.m., shaoyunl wrote: >> In passthrough configuration, hypervisior will trigger the >> SBR(Secondary bus reset) to the devices without sync to each other. >> This could cause device hang since for XGMI configuration, all the >> devices within the hive need to be reset at a limit time slot. This >> serial of patches try to solve this issue by co-operate with new SMU >> which will only do minimum house keeping to response the SBR request >> but don't do the real reset job and leave it to driver. Driver need >> to do the whole sw init and minimum HW init to bring up the SMU and >> trigger the reset(possibly BACO) on all the ASICs at the same time >> >> Signed-off-by: shaoyunl >> Change-Id: I34e838e611b7623c7ad824704c7ce350808014fc >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu.h| 13 +++ >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 102 +++-- >>drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 71 ++ >>drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 1 + >>drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 +- >>5 files changed, 165 insertions(+), 30 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h >> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h >> index d46d3794699e..5602c6edee97 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h >> @@ -125,6 +125,10 @@ struct amdgpu_mgpu_info >> uint32_tnum_gpu; >> uint32_tnum_dgpu; >> uint32_tnum_apu; >> + >> +/* delayed reset_func for XGMI configuration if necessary */ >> +struct delayed_work delayed_reset_work; >> +boolpending_reset; >>
Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicilty in the begining of amdgpu_drv_delayed_reset_work_handler and then also call amdgpu_do_asic_reset which will do the same thing too. It looks like the physical reset will execute twice for each device. Another thing is, more like improvement suggestion - currently you schedule delayed_reset_work using AMDGPU_RESUME_MS - so i guesss this should give enough time for SMU to start ? Is there maybe a way to instead poll for SMU start completion and then execute this - some SMU status registers maybe ? Just to avoid relying on this arbitrary value. Andrey On 2021-03-05 8:37 p.m., Liu, Shaoyun wrote: [AMD Official Use Only - Internal Distribution Only] Hi, Andrey The existing reset function (amdgpu_device_gpu_recover or amd do_asic _reset) assumed driver already have the correct hive info . But in my case, it's not true . The gpus are in a bad state and the XGMI TA might not functional properly , so driver can not get the hive and node info when probe the device . It means driver even don't know the device belongs to which hive on a system with multiple hive configuration (ex, 8 gpus in two hive). The only solution I can think of is let driver trigger the reset on all gpus at the same time after driver do the minimum initialization on the HW ( bring up the SMU IP) no matter they belongs to the same hive or not and call amdgpu_xgmi_add_device for each device after re-init . The 100 ms delay added after the baco reset . I think they can be removed . let me verify it. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Friday, March 5, 2021 2:27 PM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe On 2021-03-05 12:52 p.m., shaoyunl wrote: In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices without sync to each other. This could cause device hang since for XGMI configuration, all the devices within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time Signed-off-by: shaoyunl Change-Id: I34e838e611b7623c7ad824704c7ce350808014fc --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 13 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 102 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 71 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 +- 5 files changed, 165 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index d46d3794699e..5602c6edee97 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -125,6 +125,10 @@ struct amdgpu_mgpu_info uint32_tnum_gpu; uint32_tnum_dgpu; uint32_tnum_apu; + + /* delayed reset_func for XGMI configuration if necessary */ + struct delayed_work delayed_reset_work; + boolpending_reset; }; #define AMDGPU_MAX_TIMEOUT_PARAM_LENGTH 256 @@ -1124,6 +1128,15 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev, bool amdgpu_device_asic_has_dc_support(enum amd_asic_type asic_type); bool amdgpu_device_has_dc_support(struct amdgpu_device *adev); +int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev, + struct amdgpu_job *job, + bool *need_full_reset_arg); + +int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, + struct list_head *device_list_handle, + bool *need_full_reset_arg, + bool skip_hw_reset); + int emu_soc_asic_init(struct amdgpu_device *adev); /* diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 3c35b0c1e710..5b520f70e660 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1220,6 +1220,10 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev) } } + /* Don't post if we need to reset whole hive on init */ + if (adev->gmc.xgmi.pending_reset) + return false; + if (adev->has_hw_reset) { adev->has_hw_reset = false; return true; @@ -2149,6 +2153,9 @@ static int amdgpu_device_fw_loading(struct amdgpu_devi
RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
[AMD Official Use Only - Internal Distribution Only] Hi, Andrey The existing reset function (amdgpu_device_gpu_recover or amd do_asic _reset) assumed driver already have the correct hive info . But in my case, it's not true . The gpus are in a bad state and the XGMI TA might not functional properly , so driver can not get the hive and node info when probe the device . It means driver even don't know the device belongs to which hive on a system with multiple hive configuration (ex, 8 gpus in two hive). The only solution I can think of is let driver trigger the reset on all gpus at the same time after driver do the minimum initialization on the HW ( bring up the SMU IP) no matter they belongs to the same hive or not and call amdgpu_xgmi_add_device for each device after re-init . The 100 ms delay added after the baco reset . I think they can be removed . let me verify it. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Friday, March 5, 2021 2:27 PM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe On 2021-03-05 12:52 p.m., shaoyunl wrote: > In passthrough configuration, hypervisior will trigger the > SBR(Secondary bus reset) to the devices without sync to each other. > This could cause device hang since for XGMI configuration, all the > devices within the hive need to be reset at a limit time slot. This > serial of patches try to solve this issue by co-operate with new SMU > which will only do minimum house keeping to response the SBR request > but don't do the real reset job and leave it to driver. Driver need to > do the whole sw init and minimum HW init to bring up the SMU and > trigger the reset(possibly BACO) on all the ASICs at the same time > > Signed-off-by: shaoyunl > Change-Id: I34e838e611b7623c7ad824704c7ce350808014fc > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h| 13 +++ > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 102 +++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 71 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 1 + > drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 +- > 5 files changed, 165 insertions(+), 30 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index d46d3794699e..5602c6edee97 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -125,6 +125,10 @@ struct amdgpu_mgpu_info > uint32_tnum_gpu; > uint32_tnum_dgpu; > uint32_tnum_apu; > + > + /* delayed reset_func for XGMI configuration if necessary */ > + struct delayed_work delayed_reset_work; > + boolpending_reset; > }; > > #define AMDGPU_MAX_TIMEOUT_PARAM_LENGTH 256 > @@ -1124,6 +1128,15 @@ void amdgpu_device_indirect_wreg64(struct > amdgpu_device *adev, > bool amdgpu_device_asic_has_dc_support(enum amd_asic_type asic_type); > bool amdgpu_device_has_dc_support(struct amdgpu_device *adev); > > +int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev, > + struct amdgpu_job *job, > + bool *need_full_reset_arg); > + > +int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, > + struct list_head *device_list_handle, > + bool *need_full_reset_arg, > + bool skip_hw_reset); > + > int emu_soc_asic_init(struct amdgpu_device *adev); > > /* > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 3c35b0c1e710..5b520f70e660 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -1220,6 +1220,10 @@ bool amdgpu_device_need_post(struct amdgpu_device > *adev) > } > } > > + /* Don't post if we need to reset whole hive on init */ > + if (adev->gmc.xgmi.pending_reset) > + return false; > + > if (adev->has_hw_reset) { > adev->has_hw_reset = false; > return true; > @@ -2149,6 +2153,9 @@ static int amdgpu_device_fw_loading(struct > amdgpu_device *adev) > if (adev->ip_blocks[i].version->type != > AMD_IP_BLOCK_TYPE_PSP) > continue; > > + if (!adev->ip_blocks[i].status.sw) > + continue; > + > /* no need to do the fw loading again if already done*/ > if
Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe
On 2021-03-05 12:52 p.m., shaoyunl wrote: In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices without sync to each other. This could cause device hang since for XGMI configuration, all the devices within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time Signed-off-by: shaoyunl Change-Id: I34e838e611b7623c7ad824704c7ce350808014fc --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 13 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 102 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 71 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 +- 5 files changed, 165 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index d46d3794699e..5602c6edee97 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -125,6 +125,10 @@ struct amdgpu_mgpu_info uint32_tnum_gpu; uint32_tnum_dgpu; uint32_tnum_apu; + + /* delayed reset_func for XGMI configuration if necessary */ + struct delayed_work delayed_reset_work; + boolpending_reset; }; #define AMDGPU_MAX_TIMEOUT_PARAM_LENGTH 256 @@ -1124,6 +1128,15 @@ void amdgpu_device_indirect_wreg64(struct amdgpu_device *adev, bool amdgpu_device_asic_has_dc_support(enum amd_asic_type asic_type); bool amdgpu_device_has_dc_support(struct amdgpu_device *adev); +int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev, + struct amdgpu_job *job, + bool *need_full_reset_arg); + +int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, + struct list_head *device_list_handle, + bool *need_full_reset_arg, + bool skip_hw_reset); + int emu_soc_asic_init(struct amdgpu_device *adev); /* diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 3c35b0c1e710..5b520f70e660 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1220,6 +1220,10 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev) } } + /* Don't post if we need to reset whole hive on init */ + if (adev->gmc.xgmi.pending_reset) + return false; + if (adev->has_hw_reset) { adev->has_hw_reset = false; return true; @@ -2149,6 +2153,9 @@ static int amdgpu_device_fw_loading(struct amdgpu_device *adev) if (adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_PSP) continue; + if (!adev->ip_blocks[i].status.sw) + continue; + /* no need to do the fw loading again if already done*/ if (adev->ip_blocks[i].status.hw == true) break; @@ -2289,7 +2296,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) if (adev->gmc.xgmi.num_physical_nodes > 1) amdgpu_xgmi_add_device(adev); - amdgpu_amdkfd_device_init(adev); + + /* Don't init kfd if whole hive need to be reset during init */ + if (!adev->gmc.xgmi.pending_reset) + amdgpu_amdkfd_device_init(adev); amdgpu_fru_get_product_info(adev); @@ -2734,6 +2744,16 @@ static int amdgpu_device_ip_suspend_phase2(struct amdgpu_device *adev) adev->ip_blocks[i].status.hw = false; continue; } + + /* skip unnecessary suspend if we do not initialize them yet */ + if (adev->gmc.xgmi.pending_reset && + !(adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC || + adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC || + adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON || + adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH)) { + adev->ip_blocks[i].status.hw = false; + continue; + } /* XXX handle errors */ r = adev->ip_blocks[i].version->funcs->suspend(adev); /* XXX handle errors */ @@ -3407,10 +3427,28 @@ int amdgpu_device_init(struct amdgpu_device *adev, * E.g., driver