amdkfd: only keep release_mem function for Hawaii

Kuehling, Felix Thu, 07 Nov 2019 12:43:24 -0800

On 2019-11-07 13:32, Alex Deucher wrote:
> On Thu, Nov 7, 2019 at 12:47 PM Kuehling, Felix <felix.kuehl...@amd.com> 
> wrote:
>> No, please lets not add a new nomenclature for PM4 packet versions. GFX 
>> versions are agreed on between hardware, firmware, and software and it's 
>> generally understood what they mean. If we add a new PM4 packet versioning 
>> scheme on our own, then this will add a lot of confusion when talking to 
>> firmware teams.
>>
>> You know, this would all be more straight forward if we accepted a little 
>> bit of code duplication and had packet writing functions per GFX version. 
>> You'll see this pattern a lot in the amdgpu driver where each IP version 
>> duplicates a bunch of code. In many cases you may be able to save a few 
>> lines of code by sharing functions between IP versions. But you'll add some 
>> confusion and burden on future maintenance.
>>
>> This is the price we pay for micro-optimizing minor code duplication.
> What we've tried to do in amdgpu is to break out shared code in to
> common helpers that are not IP specific and use that in each IP module
> (e.g., amdgpu_uvd.c amdgpu_gfx.c, etc.).  Sometimes we can use a
> particular chunk of code across multiple generations.  E.g., the uvd
> stuff is a good example.  We have shared generic uvd helpers that work
> for most UVD IP versions, and then if we need an IP specific version,
> we override that in the callbacks with a version specific one.  E.g.,
> for the video decode engines we use the generic helpers for a number
> of ring functions:
>
> static const struct amdgpu_ring_funcs uvd_v7_0_ring_vm_funcs = {
> ...
>      .test_ring = uvd_v7_0_ring_test_ring,
>          .test_ib = amdgpu_uvd_ring_test_ib,
>          .insert_nop = uvd_v7_0_ring_insert_nop,
>          .pad_ib = amdgpu_ring_generic_pad_ib,
>          .begin_use = amdgpu_uvd_ring_begin_use,
>          .end_use = amdgpu_uvd_ring_end_use,
> ...
> };
>
> while we override more of them for the video encode engines:
>
> static const struct amdgpu_ring_funcs uvd_v7_0_enc_ring_vm_funcs = {
> ...
>      .test_ring = uvd_v7_0_enc_ring_test_ring,
>          .test_ib = uvd_v7_0_enc_ring_test_ib,
>          .insert_nop = amdgpu_ring_insert_nop,
>      .insert_end = uvd_v7_0_enc_ring_insert_end,
>          .pad_ib = amdgpu_ring_generic_pad_ib,
>          .begin_use = amdgpu_uvd_ring_begin_use,
>          .end_use = amdgpu_uvd_ring_end_use,
> ...
> };
>
> But still maintain IP specific components.


Thanks Alex. In this case the common code is in kfd_packet_manager.c and 
the IP-version-specific code that writes the actual PM4 packets is in 
the kernel_queue_v*.c files. Yong is trying to merge the PM4 packet 
writing code for v9 and v10 because the packet formats are essentially 
unchanged. It makes the naming conventions in the code a bit meaningless 
because v9 now really means "v9 and v10". Apparently there is precedent 
for this, as we already did the same thing with v7 and v8, which I 
forgot about in my initial code review.

Regards,
   Felix


>
> Alex
>
>> Regards,
>>    Felix
>>
>> On 2019-11-07 12:39, Zhao, Yong wrote:
>>
>> Hi Kent,
>>
>> I can't agree more on this. Also, the same applies to the file names. 
>> Definitely we need to agree on the naming scheme before making it happen.
>>
>> Yong
>>
>> On 2019-11-07 12:33 p.m., Russell, Kent wrote:
>>
>> I think that the versioning is getting a little confusing since we’re using 
>> the old GFX versions, but not really sticking to it due to the shareability 
>> of certain managers and shaders. Could we look into doing something like 
>> gen1 or gen2, or some other more ambiguous non-GFX-related versioning? 
>> Otherwise we’ll keep having these questions of “why is Hawaii GFX8”, “why is 
>> Arcturus GFX9”, etc. Then if things change, we just up the value concretely, 
>> instead of maybe doing a v11 if GFX11 changes things, and only GFX11 ASICs 
>> use those functions/variables.
>>
>>
>>
>> Obviously not high-priority, but maybe something to consider as you continue 
>> to consolidate and remove duplicate code.
>>
>>
>>
>> Kent
>>
>>
>>
>> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Zhao, Yong
>> Sent: Thursday, November 7, 2019 11:57 AM
>> To: Kuehling, Felix <felix.kuehl...@amd.com>; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for 
>> Hawaii
>>
>>
>>
>> Hi Felix,
>>
>>
>>
>> That's because v8 and v7 share the same packet_manager_funcs. In this case, 
>> it is better to keep it as it is.
>>
>>
>>
>> Regards,
>>
>> Yong
>>
>> ________________________________
>>
>> From: Kuehling, Felix <felix.kuehl...@amd.com>
>> Sent: Wednesday, November 6, 2019 11:45 PM
>> To: Zhao, Yong <yong.z...@amd.com>; amd-gfx@lists.freedesktop.org 
>> <amd-gfx@lists.freedesktop.org>
>> Subject: Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for 
>> Hawaii
>>
>>
>>
>> On 2019-10-30 20:17, Zhao, Yong wrote:
>>> release_mem won't be used at all on GFX9 and GFX10, so delete it.
>> Hawaii was GFXv7. So we're not using the release_mem packet on GFXv8
>> either. Why arbitrarily limit this change to GFXv9 and 10?
>>
>> Regards,
>>     Felix
>>
>>> Change-Id: I13787a8a29b83e7516c582a7401f2e14721edf5f
>>> Signed-off-by: Yong Zhao <yong.z...@amd.com>
>>> ---
>>>    .../gpu/drm/amd/amdkfd/kfd_kernel_queue_v10.c | 35 ++-----------------
>>>    .../gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c  | 33 ++---------------
>>>    2 files changed, 4 insertions(+), 64 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v10.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v10.c
>>> index aed32ab7102e..bfd6221acae9 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v10.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v10.c
>>> @@ -298,37 +298,6 @@ static int pm_query_status_v10(struct packet_manager 
>>> *pm, uint32_t *buffer,
>>>         return 0;
>>>    }
>>>
>>> -
>>> -static int pm_release_mem_v10(uint64_t gpu_addr, uint32_t *buffer)
>>> -{
>>> -     struct pm4_mec_release_mem *packet;
>>> -
>>> -     WARN_ON(!buffer);
>>> -
>>> -     packet = (struct pm4_mec_release_mem *)buffer;
>>> -     memset(buffer, 0, sizeof(struct pm4_mec_release_mem));
>>> -
>>> -     packet->header.u32All = pm_build_pm4_header(IT_RELEASE_MEM,
>>> -                                     sizeof(struct pm4_mec_release_mem));
>>> -
>>> -     packet->bitfields2.event_type = CACHE_FLUSH_AND_INV_TS_EVENT;
>>> -     packet->bitfields2.event_index = 
>>> event_index__mec_release_mem__end_of_pipe;
>>> -     packet->bitfields2.tcl1_action_ena = 1;
>>> -     packet->bitfields2.tc_action_ena = 1;
>>> -     packet->bitfields2.cache_policy = cache_policy__mec_release_mem__lru;
>>> -
>>> -     packet->bitfields3.data_sel = 
>>> data_sel__mec_release_mem__send_32_bit_low;
>>> -     packet->bitfields3.int_sel =
>>> -             int_sel__mec_release_mem__send_interrupt_after_write_confirm;
>>> -
>>> -     packet->bitfields4.address_lo_32b = (gpu_addr & 0xffffffff) >> 2;
>>> -     packet->address_hi = upper_32_bits(gpu_addr);
>>> -
>>> -     packet->data_lo = 0;
>>> -
>>> -     return sizeof(struct pm4_mec_release_mem) / sizeof(unsigned int);
>>> -}
>>> -
>>>    const struct packet_manager_funcs kfd_v10_pm_funcs = {
>>>         .map_process                    = pm_map_process_v10,
>>>         .runlist                        = pm_runlist_v10,
>>> @@ -336,13 +305,13 @@ const struct packet_manager_funcs kfd_v10_pm_funcs = {
>>>         .map_queues                     = pm_map_queues_v10,
>>>         .unmap_queues                   = pm_unmap_queues_v10,
>>>         .query_status                   = pm_query_status_v10,
>>> -     .release_mem                    = pm_release_mem_v10,
>>> +     .release_mem                    = NULL,
>>>         .map_process_size               = sizeof(struct 
>>> pm4_mes_map_process),
>>>         .runlist_size                   = sizeof(struct pm4_mes_runlist),
>>>         .set_resources_size             = sizeof(struct 
>>> pm4_mes_set_resources),
>>>         .map_queues_size                = sizeof(struct pm4_mes_map_queues),
>>>         .unmap_queues_size              = sizeof(struct 
>>> pm4_mes_unmap_queues),
>>>         .query_status_size              = sizeof(struct 
>>> pm4_mes_query_status),
>>> -     .release_mem_size               = sizeof(struct pm4_mec_release_mem)
>>> +     .release_mem_size               = 0,
>>>    };
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c 
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
>>> index 3b5ca2b1d7a6..f0e4910a8865 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c
>>> @@ -336,35 +336,6 @@ static int pm_query_status_v9(struct packet_manager 
>>> *pm, uint32_t *buffer,
>>>         return 0;
>>>    }
>>>
>>> -
>>> -static int pm_release_mem_v9(uint64_t gpu_addr, uint32_t *buffer)
>>> -{
>>> -     struct pm4_mec_release_mem *packet;
>>> -
>>> -     packet = (struct pm4_mec_release_mem *)buffer;
>>> -     memset(buffer, 0, sizeof(struct pm4_mec_release_mem));
>>> -
>>> -     packet->header.u32All = pm_build_pm4_header(IT_RELEASE_MEM,
>>> -                                     sizeof(struct pm4_mec_release_mem));
>>> -
>>> -     packet->bitfields2.event_type = CACHE_FLUSH_AND_INV_TS_EVENT;
>>> -     packet->bitfields2.event_index = 
>>> event_index__mec_release_mem__end_of_pipe;
>>> -     packet->bitfields2.tcl1_action_ena = 1;
>>> -     packet->bitfields2.tc_action_ena = 1;
>>> -     packet->bitfields2.cache_policy = cache_policy__mec_release_mem__lru;
>>> -
>>> -     packet->bitfields3.data_sel = 
>>> data_sel__mec_release_mem__send_32_bit_low;
>>> -     packet->bitfields3.int_sel =
>>> -             int_sel__mec_release_mem__send_interrupt_after_write_confirm;
>>> -
>>> -     packet->bitfields4.address_lo_32b = (gpu_addr & 0xffffffff) >> 2;
>>> -     packet->address_hi = upper_32_bits(gpu_addr);
>>> -
>>> -     packet->data_lo = 0;
>>> -
>>> -     return 0;
>>> -}
>>> -
>>>    const struct packet_manager_funcs kfd_v9_pm_funcs = {
>>>         .map_process            = pm_map_process_v9,
>>>         .runlist                = pm_runlist_v9,
>>> @@ -372,12 +343,12 @@ const struct packet_manager_funcs kfd_v9_pm_funcs = {
>>>         .map_queues             = pm_map_queues_v9,
>>>         .unmap_queues           = pm_unmap_queues_v9,
>>>         .query_status           = pm_query_status_v9,
>>> -     .release_mem            = pm_release_mem_v9,
>>> +     .release_mem            = NULL,
>>>         .map_process_size       = sizeof(struct pm4_mes_map_process),
>>>         .runlist_size           = sizeof(struct pm4_mes_runlist),
>>>         .set_resources_size     = sizeof(struct pm4_mes_set_resources),
>>>         .map_queues_size        = sizeof(struct pm4_mes_map_queues),
>>>         .unmap_queues_size      = sizeof(struct pm4_mes_unmap_queues),
>>>         .query_status_size      = sizeof(struct pm4_mes_query_status),
>>> -     .release_mem_size       = sizeof(struct pm4_mec_release_mem)
>>> +     .release_mem_size       = 0,
>>>    };
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for Hawaii

Reply via email to