Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support
On 2017-11-13 11:12 AM, Oded Gabbay wrote: > ok, thanks for the explanation. > I'll wait for your v2. I was on vacation for a few days. I'll work on v2 this week, once I caught up with my email backlog. Regards, Felix ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support
ok, thanks for the explanation. I'll wait for your v2. Oded On Fri, Nov 10, 2017 at 12:02 AM, Kuehling, Felix <felix.kuehl...@amd.com> wrote: > The file contains the assembly code as a reference (inside #if 0) and an > array with the pre-compiled code. It gets #included in kfd_device.c. > > The trap handler binary gets compiled/linked into the kernel. To be > GPL-compliant the source code needs to be included. Otherwise we'd have to > load the trap handler as firmware to get around licensing issues. > > Regards, > Felix > > > From: Oded Gabbay <oded.gab...@gmail.com> > Sent: Thursday, November 9, 2017 2:02:18 AM > To: Kuehling, Felix > Cc: amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support > > Yes, that would be helpful to review. > I assume the asm file provided is just a reference code and doesn't > need to be in the kernel tree for this to work, correct ? > > > On Thu, Nov 9, 2017 at 4:46 AM, Felix Kuehling <felix.kuehl...@amd.com> wrote: >> I'll probably need a v2 of this patch. #include should >> not be needed. That's a leftover from some late cleanup. >> >> This change also ended up a bit big after squashing about 12 changes and >> fixes, and adding more cleanups on top of that. Let me know you want me >> to split it. I'm thinking this may make sense: >> >> 1. Add trap handler .asm file >> 2. Implement CWSR support >> 3. Implement user mode trap handler support >> >> I also pushed an updated kfd_ioctl.h to the Thunk github WIP branch. >> >> Regards, >> Felix >> >> >> On 2017-11-08 09:29 PM, Felix Kuehling wrote: >>> From: shaoyunl <shaoyun@amd.com> >>> >>> CWSR = compute wave save restore >>> >>> This hardware feature allows the GPU to preempt shader execution in >>> the middle of a compute wave, save the state and restore it later >>> to resume execution. >>> >>> This feature requires help from a trap handler, which is like an >>> interrupt handler running on the compute unit. The trap handler is >>> written mostly by the hardware team. It's provided as pre-compiled >>> shader code with the SP3 assembly source code as reference. >>> >>> Memory for saving the state is allocated per queue in user mode and >>> the address and size passed to the create_queue ioctl. The size >>> depends on the number of waves that can be in flight simultaneously >>> on a given ASIC. >>> >>> A second-level user mode trap handler can be installed. The CWSR trap >>> handler jumps to the secondary trap handler conditionally for any >>> conditions not handled by it. This can be used e.g. for debugging or >>> catching math exceptions. >>> >>> Signed-off-by: Shaoyun.liu <shaoyun@amd.com> >>> Signed-off-by: Jay Cornwall <jay.cornw...@amd.com> >>> Signed-off-by: Yong Zhao <yong.z...@amd.com> >>> Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com> >>> --- >>> .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 1384 >>> >>> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_device.c| 24 +- >>> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 28 + >>> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h |5 + >>> drivers/gpu/drm/amd/amdkfd/kfd_module.c|4 + >>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c| 27 + >>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 31 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 89 +- >>> .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |4 +- >>> include/uapi/linux/kfd_ioctl.h | 15 +- >>> 11 files changed, 1644 insertions(+), 11 deletions(-) >>> create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >>> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >>> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >>> new file mode 100644 >>> index 000..751cc2e >>> --- /dev/null >>> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >>> @@ -0,0 +1,1384 @@ >>> +/* >>> + * Copyright 2015-2017 Advanced Micro Devices, Inc. >>> + * >>> + * Permission is hereby granted, free of charge, to any person obtaining a >>> + * copy of this software and associated documentati
Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support
The file contains the assembly code as a reference (inside #if 0) and an array with the pre-compiled code. It gets #included in kfd_device.c. The trap handler binary gets compiled/linked into the kernel. To be GPL-compliant the source code needs to be included. Otherwise we'd have to load the trap handler as firmware to get around licensing issues. Regards, Felix From: Oded Gabbay <oded.gab...@gmail.com> Sent: Thursday, November 9, 2017 2:02:18 AM To: Kuehling, Felix Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support Yes, that would be helpful to review. I assume the asm file provided is just a reference code and doesn't need to be in the kernel tree for this to work, correct ? On Thu, Nov 9, 2017 at 4:46 AM, Felix Kuehling <felix.kuehl...@amd.com> wrote: > I'll probably need a v2 of this patch. #include should > not be needed. That's a leftover from some late cleanup. > > This change also ended up a bit big after squashing about 12 changes and > fixes, and adding more cleanups on top of that. Let me know you want me > to split it. I'm thinking this may make sense: > > 1. Add trap handler .asm file > 2. Implement CWSR support > 3. Implement user mode trap handler support > > I also pushed an updated kfd_ioctl.h to the Thunk github WIP branch. > > Regards, > Felix > > > On 2017-11-08 09:29 PM, Felix Kuehling wrote: >> From: shaoyunl <shaoyun@amd.com> >> >> CWSR = compute wave save restore >> >> This hardware feature allows the GPU to preempt shader execution in >> the middle of a compute wave, save the state and restore it later >> to resume execution. >> >> This feature requires help from a trap handler, which is like an >> interrupt handler running on the compute unit. The trap handler is >> written mostly by the hardware team. It's provided as pre-compiled >> shader code with the SP3 assembly source code as reference. >> >> Memory for saving the state is allocated per queue in user mode and >> the address and size passed to the create_queue ioctl. The size >> depends on the number of waves that can be in flight simultaneously >> on a given ASIC. >> >> A second-level user mode trap handler can be installed. The CWSR trap >> handler jumps to the secondary trap handler conditionally for any >> conditions not handled by it. This can be used e.g. for debugging or >> catching math exceptions. >> >> Signed-off-by: Shaoyun.liu <shaoyun@amd.com> >> Signed-off-by: Jay Cornwall <jay.cornw...@amd.com> >> Signed-off-by: Yong Zhao <yong.z...@amd.com> >> Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com> >> --- >> .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 1384 >> >> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +- >> drivers/gpu/drm/amd/amdkfd/kfd_device.c| 24 +- >> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 28 + >> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h |5 + >> drivers/gpu/drm/amd/amdkfd/kfd_module.c|4 + >> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c| 27 + >> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 31 +- >> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 89 +- >> .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |4 +- >> include/uapi/linux/kfd_ioctl.h | 15 +- >> 11 files changed, 1644 insertions(+), 11 deletions(-) >> create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> >> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> new file mode 100644 >> index 000..751cc2e >> --- /dev/null >> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> @@ -0,0 +1,1384 @@ >> +/* >> + * Copyright 2015-2017 Advanced Micro Devices, Inc. >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included >> in >> + * all copies or substantial portions
Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support
Yes, that would be helpful to review. I assume the asm file provided is just a reference code and doesn't need to be in the kernel tree for this to work, correct ? On Thu, Nov 9, 2017 at 4:46 AM, Felix Kuehlingwrote: > I'll probably need a v2 of this patch. #include should > not be needed. That's a leftover from some late cleanup. > > This change also ended up a bit big after squashing about 12 changes and > fixes, and adding more cleanups on top of that. Let me know you want me > to split it. I'm thinking this may make sense: > > 1. Add trap handler .asm file > 2. Implement CWSR support > 3. Implement user mode trap handler support > > I also pushed an updated kfd_ioctl.h to the Thunk github WIP branch. > > Regards, > Felix > > > On 2017-11-08 09:29 PM, Felix Kuehling wrote: >> From: shaoyunl >> >> CWSR = compute wave save restore >> >> This hardware feature allows the GPU to preempt shader execution in >> the middle of a compute wave, save the state and restore it later >> to resume execution. >> >> This feature requires help from a trap handler, which is like an >> interrupt handler running on the compute unit. The trap handler is >> written mostly by the hardware team. It's provided as pre-compiled >> shader code with the SP3 assembly source code as reference. >> >> Memory for saving the state is allocated per queue in user mode and >> the address and size passed to the create_queue ioctl. The size >> depends on the number of waves that can be in flight simultaneously >> on a given ASIC. >> >> A second-level user mode trap handler can be installed. The CWSR trap >> handler jumps to the secondary trap handler conditionally for any >> conditions not handled by it. This can be used e.g. for debugging or >> catching math exceptions. >> >> Signed-off-by: Shaoyun.liu >> Signed-off-by: Jay Cornwall >> Signed-off-by: Yong Zhao >> Signed-off-by: Felix Kuehling >> --- >> .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 1384 >> >> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +- >> drivers/gpu/drm/amd/amdkfd/kfd_device.c| 24 +- >> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 28 + >> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h |5 + >> drivers/gpu/drm/amd/amdkfd/kfd_module.c|4 + >> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c| 27 + >> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 31 +- >> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 89 +- >> .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |4 +- >> include/uapi/linux/kfd_ioctl.h | 15 +- >> 11 files changed, 1644 insertions(+), 11 deletions(-) >> create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> >> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> new file mode 100644 >> index 000..751cc2e >> --- /dev/null >> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm >> @@ -0,0 +1,1384 @@ >> +/* >> + * Copyright 2015-2017 Advanced Micro Devices, Inc. >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included >> in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. >> + */ >> + >> +#if 0 >> +HW (VI) source code for CWSR trap handler >> +#Version 18 + multiple trap handler >> + >> +// this performance-optimal version was originally from Seven Xu at SRDC >> + >> +// Revison #18 --... >> +/* Rev History >> +** #1. Branch from gc dv. >> //gfxip/gfx8/main/src/test/suites/block/cs/sr/cs_trap_handler.sp3#1,#50, >> #51, #52-53(Skip, Already Fixed by PV), #54-56(merged),#57-58(mergerd, >> skiped-already fixed by PV) >> +** #4. SR Memory Layout: >> +** 1. VGPR-SGPR-HWREG-{LDS} >> +** 2. tba_hi.bits.26 -