Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support

2017-11-13 Thread Felix Kuehling
On 2017-11-13 11:12 AM, Oded Gabbay wrote:
> ok, thanks for the explanation.
> I'll wait for your v2.
I was on vacation for a few days. I'll work on v2 this week, once I
caught up with my email backlog.

Regards,
  Felix


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support

2017-11-13 Thread Oded Gabbay
ok, thanks for the explanation.
I'll wait for your v2.

Oded

On Fri, Nov 10, 2017 at 12:02 AM, Kuehling, Felix
<felix.kuehl...@amd.com> wrote:
> The file contains the assembly code as a reference (inside #if 0) and an 
> array with the pre-compiled code. It gets #included in kfd_device.c.
>
> The trap handler binary gets compiled/linked into the kernel. To be 
> GPL-compliant the source code needs to be included. Otherwise we'd have to 
> load the trap handler as firmware to get around licensing issues.
>
> Regards,
>   Felix
>
> 
> From: Oded Gabbay <oded.gab...@gmail.com>
> Sent: Thursday, November 9, 2017 2:02:18 AM
> To: Kuehling, Felix
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support
>
> Yes, that would be helpful to review.
> I assume the asm file provided is just a reference code and doesn't
> need to be in the kernel tree for this to work, correct ?
>
>
> On Thu, Nov 9, 2017 at 4:46 AM, Felix Kuehling <felix.kuehl...@amd.com> wrote:
>> I'll probably need a v2 of this patch. #include  should
>> not be needed. That's a leftover from some late cleanup.
>>
>> This change also ended up a bit big after squashing about 12 changes and
>> fixes, and adding more cleanups on top of that. Let me know you want me
>> to split it. I'm thinking this may make sense:
>>
>>  1. Add trap handler .asm file
>>  2. Implement CWSR support
>>  3. Implement user mode trap handler support
>>
>> I also pushed an updated kfd_ioctl.h to the Thunk github WIP branch.
>>
>> Regards,
>>   Felix
>>
>>
>> On 2017-11-08 09:29 PM, Felix Kuehling wrote:
>>> From: shaoyunl <shaoyun@amd.com>
>>>
>>> CWSR = compute wave save restore
>>>
>>> This hardware feature allows the GPU to preempt shader execution in
>>> the middle of a compute wave, save the state and restore it later
>>> to resume execution.
>>>
>>> This feature requires help from a trap handler, which is like an
>>> interrupt handler running on the compute unit. The trap handler is
>>> written mostly by the hardware team. It's provided as pre-compiled
>>> shader code with the SP3 assembly source code as reference.
>>>
>>> Memory for saving the state is allocated per queue in user mode and
>>> the address and size passed to the create_queue ioctl. The size
>>> depends on the number of waves that can be in flight simultaneously
>>> on a given ASIC.
>>>
>>> A second-level user mode trap handler can be installed. The CWSR trap
>>> handler jumps to the secondary trap handler conditionally for any
>>> conditions not handled by it. This can be used e.g. for debugging or
>>> catching math exceptions.
>>>
>>> Signed-off-by: Shaoyun.liu <shaoyun@amd.com>
>>> Signed-off-by: Jay Cornwall <jay.cornw...@amd.com>
>>> Signed-off-by: Yong Zhao <yong.z...@amd.com>
>>> Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com>
>>> ---
>>>  .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 1384 
>>> 
>>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |   44 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c|   24 +-
>>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |   28 +
>>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |5 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_module.c|4 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|   27 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |   31 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c   |   89 +-
>>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |4 +-
>>>  include/uapi/linux/kfd_ioctl.h |   15 +-
>>>  11 files changed, 1644 insertions(+), 11 deletions(-)
>>>  create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm 
>>> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>>> new file mode 100644
>>> index 000..751cc2e
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>>> @@ -0,0 +1,1384 @@
>>> +/*
>>> + * Copyright 2015-2017 Advanced Micro Devices, Inc.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentati

Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support

2017-11-09 Thread Kuehling, Felix
The file contains the assembly code as a reference (inside #if 0) and an array 
with the pre-compiled code. It gets #included in kfd_device.c.

The trap handler binary gets compiled/linked into the kernel. To be 
GPL-compliant the source code needs to be included. Otherwise we'd have to load 
the trap handler as firmware to get around licensing issues.

Regards,
  Felix


From: Oded Gabbay <oded.gab...@gmail.com>
Sent: Thursday, November 9, 2017 2:02:18 AM
To: Kuehling, Felix
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support

Yes, that would be helpful to review.
I assume the asm file provided is just a reference code and doesn't
need to be in the kernel tree for this to work, correct ?


On Thu, Nov 9, 2017 at 4:46 AM, Felix Kuehling <felix.kuehl...@amd.com> wrote:
> I'll probably need a v2 of this patch. #include  should
> not be needed. That's a leftover from some late cleanup.
>
> This change also ended up a bit big after squashing about 12 changes and
> fixes, and adding more cleanups on top of that. Let me know you want me
> to split it. I'm thinking this may make sense:
>
>  1. Add trap handler .asm file
>  2. Implement CWSR support
>  3. Implement user mode trap handler support
>
> I also pushed an updated kfd_ioctl.h to the Thunk github WIP branch.
>
> Regards,
>   Felix
>
>
> On 2017-11-08 09:29 PM, Felix Kuehling wrote:
>> From: shaoyunl <shaoyun@amd.com>
>>
>> CWSR = compute wave save restore
>>
>> This hardware feature allows the GPU to preempt shader execution in
>> the middle of a compute wave, save the state and restore it later
>> to resume execution.
>>
>> This feature requires help from a trap handler, which is like an
>> interrupt handler running on the compute unit. The trap handler is
>> written mostly by the hardware team. It's provided as pre-compiled
>> shader code with the SP3 assembly source code as reference.
>>
>> Memory for saving the state is allocated per queue in user mode and
>> the address and size passed to the create_queue ioctl. The size
>> depends on the number of waves that can be in flight simultaneously
>> on a given ASIC.
>>
>> A second-level user mode trap handler can be installed. The CWSR trap
>> handler jumps to the secondary trap handler conditionally for any
>> conditions not handled by it. This can be used e.g. for debugging or
>> catching math exceptions.
>>
>> Signed-off-by: Shaoyun.liu <shaoyun@amd.com>
>> Signed-off-by: Jay Cornwall <jay.cornw...@amd.com>
>> Signed-off-by: Yong Zhao <yong.z...@amd.com>
>> Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com>
>> ---
>>  .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 1384 
>> 
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |   44 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c|   24 +-
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |   28 +
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |5 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_module.c|4 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|   27 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |   31 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c   |   89 +-
>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |4 +-
>>  include/uapi/linux/kfd_ioctl.h |   15 +-
>>  11 files changed, 1644 insertions(+), 11 deletions(-)
>>  create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm 
>> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>> new file mode 100644
>> index 000..751cc2e
>> --- /dev/null
>> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>> @@ -0,0 +1,1384 @@
>> +/*
>> + * Copyright 2015-2017 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions

Re: [PATCH 1/1] drm/amdkfd: CWSR and trap handler support

2017-11-08 Thread Oded Gabbay
Yes, that would be helpful to review.
I assume the asm file provided is just a reference code and doesn't
need to be in the kernel tree for this to work, correct ?


On Thu, Nov 9, 2017 at 4:46 AM, Felix Kuehling  wrote:
> I'll probably need a v2 of this patch. #include  should
> not be needed. That's a leftover from some late cleanup.
>
> This change also ended up a bit big after squashing about 12 changes and
> fixes, and adding more cleanups on top of that. Let me know you want me
> to split it. I'm thinking this may make sense:
>
>  1. Add trap handler .asm file
>  2. Implement CWSR support
>  3. Implement user mode trap handler support
>
> I also pushed an updated kfd_ioctl.h to the Thunk github WIP branch.
>
> Regards,
>   Felix
>
>
> On 2017-11-08 09:29 PM, Felix Kuehling wrote:
>> From: shaoyunl 
>>
>> CWSR = compute wave save restore
>>
>> This hardware feature allows the GPU to preempt shader execution in
>> the middle of a compute wave, save the state and restore it later
>> to resume execution.
>>
>> This feature requires help from a trap handler, which is like an
>> interrupt handler running on the compute unit. The trap handler is
>> written mostly by the hardware team. It's provided as pre-compiled
>> shader code with the SP3 assembly source code as reference.
>>
>> Memory for saving the state is allocated per queue in user mode and
>> the address and size passed to the create_queue ioctl. The size
>> depends on the number of waves that can be in flight simultaneously
>> on a given ASIC.
>>
>> A second-level user mode trap handler can be installed. The CWSR trap
>> handler jumps to the secondary trap handler conditionally for any
>> conditions not handled by it. This can be used e.g. for debugging or
>> catching math exceptions.
>>
>> Signed-off-by: Shaoyun.liu 
>> Signed-off-by: Jay Cornwall 
>> Signed-off-by: Yong Zhao 
>> Signed-off-by: Felix Kuehling 
>> ---
>>  .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 1384 
>> 
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c   |   44 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c|   24 +-
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |   28 +
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |5 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_module.c|4 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c|   27 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h  |   31 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c   |   89 +-
>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |4 +-
>>  include/uapi/linux/kfd_ioctl.h |   15 +-
>>  11 files changed, 1644 insertions(+), 11 deletions(-)
>>  create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm 
>> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>> new file mode 100644
>> index 000..751cc2e
>> --- /dev/null
>> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
>> @@ -0,0 +1,1384 @@
>> +/*
>> + * Copyright 2015-2017 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> + * OTHER DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +#if 0
>> +HW (VI) source code for CWSR trap handler
>> +#Version 18 + multiple trap handler
>> +
>> +// this performance-optimal version was originally from Seven Xu at SRDC
>> +
>> +// Revison #18   --...
>> +/* Rev History
>> +** #1. Branch from gc dv.   
>> //gfxip/gfx8/main/src/test/suites/block/cs/sr/cs_trap_handler.sp3#1,#50, 
>> #51, #52-53(Skip, Already Fixed by PV), #54-56(merged),#57-58(mergerd, 
>> skiped-already fixed by PV)
>> +** #4. SR Memory Layout:
>> +** 1. VGPR-SGPR-HWREG-{LDS}
>> +** 2. tba_hi.bits.26 -