On Sat, Apr 12, 2025 at 4:29 PM Rodrigo Siqueira <sique...@igalia.com> wrote: > > On 04/08, Rodrigo Siqueira wrote: > > On 04/07, Alex Deucher wrote: > > > On Mon, Apr 7, 2025 at 4:15 PM Rodrigo Siqueira <sique...@igalia.com> > > > wrote: > > > > > > > > On 04/07, Alex Deucher wrote: > > > > > On Sun, Apr 6, 2025 at 7:07 PM Rodrigo Siqueira <sique...@igalia.com> > > > > > wrote: > > > > > > > > > > > > This patchset was inspired and made on top of the below series: > > > > > > > > > > > > https://lore.kernel.org/amd-gfx/20250319162225.3775315-1-alexander.deuc...@amd.com/ > > > > > > > > > > > > In the above series, there is a bug fix in many functions named > > > > > > gfx_vX_0_get_csb_buffer (where X ranges from 6 to 11). After closely > > > > > > looking at those functions, it became clear that most of the code > > > > > > could > > > > > > be shared from gfx6 to gfx11. Aside from the code duplication > > > > > > removal, > > > > > > this also improves maintainability since a bug fix in a shared > > > > > > function > > > > > > can be propagated to all ASICs. > > > > > > > > > > > > The first patch of this series created one dedicated file for > > > > > > encapsulating common GC registers (gc_common_offset.h); this series > > > > > > only > > > > > > adds registers associated with the CSB. In the future, this file can > > > > > > keep growing as we identify common registers to be shared in the > > > > > > gc_common_offset.h. > > > > > > > > > > > > The second patch introduces the generic gfx_get_csb_buffer function, > > > > > > which has the same implementation found in gfx_v10_0_get_csb_buffer > > > > > > and > > > > > > gfx_v11_0_get_csb_buffer (these two functions have the same code). > > > > > > After > > > > > > that, every patch is dedicated to absorbing one of the csb_buffer > > > > > > functions from gfx from 9 to 6; notice that some adaptations were > > > > > > required. > > > > > > > > > > I don't really like the register header changes and moving all of the > > > > > IP version specific logic into the common code. These register > > > > > headers are used in other places as well and moving some registers > > > > > into a common header can get confusing and may lead to bugs later if > > > > > other chips change the offset of these registers. > > > > > > > > In that case, what do you think if I just abstract the first part of the > > > > function for a V2? The first part is the same for all gfx from 6 to 11. > > > > Something like this: > > > > > > > > int gfx_get_pre_setup_csb_buffer(struct amdgpu_device *adev, volatile > > > > u32 *buffer) > > > > { > > > > u32 count = 0, i; > > > > const struct cs_section_def *sect = NULL; > > > > const struct cs_extent_def *ext = NULL; > > > > int ctx_reg_offset; > > > > > > > > if (adev->gfx.rlc.cs_data == NULL) > > > > return 1; > > > > if (buffer == NULL) > > > > return 1; > > > > > > > > buffer[count++] = cpu_to_le32(PACKET3(PACKET3_PREAMBLE_CNTL, 0)); > > > > buffer[count++] = > > > > cpu_to_le32(PACKET3_PREAMBLE_BEGIN_CLEAR_STATE); > > > > > > > > buffer[count++] = cpu_to_le32(PACKET3(PACKET3_CONTEXT_CONTROL, > > > > 1)); > > > > buffer[count++] = cpu_to_le32(0x80000000); > > > > buffer[count++] = cpu_to_le32(0x80000000); > > > > > > > > for (sect = adev->gfx.rlc.cs_data; sect->section != NULL; > > > > ++sect) { > > > > for (ext = sect->section; ext->extent != NULL; ++ext) { > > > > if (sect->id == SECT_CONTEXT) { > > > > buffer[count++] = > > > > > > > > cpu_to_le32(PACKET3(PACKET3_SET_CONTEXT_REG, ext->reg_count)); > > > > buffer[count++] = > > > > cpu_to_le32(ext->reg_index - > > > > > > > > PACKET3_SET_CONTEXT_REG_START); > > > > for (i = 0; i < ext->reg_count; i++) > > > > buffer[count++] = > > > > cpu_to_le32(ext->extent[i]); > > > > } > > > > } > > > > } > > > > > > > > return 0; > > > > } > > > > > > Sure helpers are fine. maybe a preamble_start helper > > > buffer[count++] = cpu_to_le32(PACKET3(PACKET3_PREAMBLE_CNTL, 0)); > > > buffer[count++] = cpu_to_le32(PACKET3_PREAMBLE_BEGIN_CLEAR_STATE); > > > > > > buffer[count++] = cpu_to_le32(PACKET3(PACKET3_CONTEXT_CONTROL, > > > 1)); > > > buffer[count++] = cpu_to_le32(0x80000000); > > > buffer[count++] = cpu_to_le32(0x80000000); > > > and a preamble_end helper: > > > buffer[count++] = cpu_to_le32(PACKET3(PACKET3_PREAMBLE_CNTL, 0)); > > > buffer[count++] = cpu_to_le32(PACKET3_PREAMBLE_END_CLEAR_STATE); > > > > > > buffer[count++] = cpu_to_le32(PACKET3(PACKET3_CLEAR_STATE, 0)); > > > buffer[count++] = cpu_to_le32(0); > > > and a cs_data parser helper: > > > for (sect = adev->gfx.rlc.cs_data; sect->section != NULL; ++sect) > > > { > > > for (ext = sect->section; ext->extent != NULL; ++ext) { > > > if (sect->id == SECT_CONTEXT) { > > > buffer[count++] = > > > > > > cpu_to_le32(PACKET3(PACKET3_SET_CONTEXT_REG, ext->reg_count)); > > > buffer[count++] = > > > cpu_to_le32(ext->reg_index - > > > > > > PACKET3_SET_CONTEXT_REG_START); > > > for (i = 0; i < ext->reg_count; i++) > > > buffer[count++] = > > > cpu_to_le32(ext->extent[i]); > > > } > > > } > > > } > > > > Nice. I'll prepare a V2. > > > > > > > > Also, while you are at it, you could clean up gfx_v9_0_cp_gfx_start(), > > > etc. to use the cs buffer and cs size directly rather than re-parsing > > > everything again. > > I was looking into the gfx_vX_0_cp_gfx_start() functions, but I was > slightly confused about how to approach this part. I can see from > gfx_vX_0_cp_gfx_start() that a preamble start, parser, and preamble end > are common parts, but I felt I could not re-use some of the helpers > created in the V2 of this series because gfx_vX_0_cp_gfx_start() writes > directly in the ring buffer and it has some special checks. Should I > create a dedicated helper for those parts? Or did I misunderstand what > you suggested?
We already have the contents of the csb buffer stored in adev->gfx.rlc.cs_ptr when we call get_csb_buffer(). So in the gfx_vX_0_cp_gfx_start() functions we can just walk that buffer to submit the csb to the firmware rather than parsing everything again. E.g., u32 *csb_buffer = adev->gfx.rlc.cs_ptr; u32 csb_size = get_csb_size(); amdgpu_ring_alloc(ring, csb_size + 4 + 3); for (i = 0; i < csb_size; i++) amdgpu_ring_write(ring, csb_buffer[i]); amdgpu_ring_write(ring, ...); ... amdgpu_ring_commit(ring); Alex > > > > > Sure, I'll make it as a separate patchset. > > > > Thanks > > > > > > > > Alex > > > > > > > > > > > Thanks > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > Rodrigo Siqueira (6): > > > > > > drm/amd/amdgpu: Create a headears to keep some common GC registers > > > > > > drm/amdgpu/gfx: Introduce generic gfx_get_csb_buffer > > > > > > drm/amdgpu/gfx: Integrate gfx_v9_0_get_csb_buffer into > > > > > > gfx_get_csb_buffer > > > > > > drm/amdgpu/gfx: Absorb gfx_v8_0_get_csb_buffer into > > > > > > gfx_get_csb_buffer > > > > > > drm/amdgpu/gfx: Assimilate gfx_v7_0_get_csb_buffer into > > > > > > gfx_get_csb_buffer > > > > > > drm/amdgpu/gfx: Merge gfx_v6_0_get_csb_buffer into > > > > > > gfx_get_csb_buffer > > > > > > > > > > > > Documentation/gpu/amdgpu/amdgpu-glossary.rst | 3 + > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 101 > > > > > > ++++++++++++++++++ > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 1 + > > > > > > drivers/gpu/drm/amd/amdgpu/cik.c | 2 + > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 51 +-------- > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 53 +-------- > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 46 +------- > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 70 +----------- > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 51 +-------- > > > > > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 43 +------- > > > > > > drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 1 + > > > > > > drivers/gpu/drm/amd/amdgpu/si.c | 2 + > > > > > > drivers/gpu/drm/amd/amdgpu/vi.c | 2 + > > > > > > .../include/asic_reg/gc/gc_10_1_0_offset.h | 3 - > > > > > > .../include/asic_reg/gc/gc_10_3_0_offset.h | 3 - > > > > > > .../include/asic_reg/gc/gc_11_0_0_offset.h | 2 - > > > > > > .../include/asic_reg/gc/gc_11_0_3_offset.h | 2 - > > > > > > .../include/asic_reg/gc/gc_11_5_0_offset.h | 2 - > > > > > > .../include/asic_reg/gc/gc_12_0_0_offset.h | 2 - > > > > > > .../amd/include/asic_reg/gc/gc_9_0_offset.h | 3 - > > > > > > .../amd/include/asic_reg/gc/gc_9_1_offset.h | 3 - > > > > > > .../amd/include/asic_reg/gc/gc_9_2_1_offset.h | 3 - > > > > > > .../amd/include/asic_reg/gc/gc_9_4_2_offset.h | 2 - > > > > > > .../amd/include/asic_reg/gc/gc_9_4_3_offset.h | 2 - > > > > > > .../include/asic_reg/gc/gc_common_offset.h | 11 ++ > > > > > > .../drm/amd/include/asic_reg/gca/gfx_6_0_d.h | 1 - > > > > > > .../drm/amd/include/asic_reg/gca/gfx_7_0_d.h | 1 - > > > > > > .../drm/amd/include/asic_reg/gca/gfx_7_2_d.h | 1 - > > > > > > .../drm/amd/include/asic_reg/gca/gfx_8_0_d.h | 1 - > > > > > > .../drm/amd/include/asic_reg/gca/gfx_8_1_d.h | 1 - > > > > > > 30 files changed, 141 insertions(+), 328 deletions(-) > > > > > > create mode 100644 > > > > > > drivers/gpu/drm/amd/include/asic_reg/gc/gc_common_offset.h > > > > > > > > > > > > -- > > > > > > 2.49.0 > > > > > > > > > > > > > > -- > > > > Rodrigo Siqueira > > > > -- > > Rodrigo Siqueira > > -- > Rodrigo Siqueira