Module: Mesa
Branch: staging/20.1
Commit: 49c56c275adbb04c2e53963f80b122db1643603d
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=49c56c275adbb04c2e53963f80b122db1643603d

Author: Marek Olšák <[email protected]>
Date:   Thu Jul 30 08:19:48 2020 -0400

radeonsi: fix applying the NGG minimum vertex count requirement

The code applied the restriction too late, which could overflow
LDS size, which started happening more often after the minimum
vertex count was increased for Sienna.

Incorporate the clamping into the previous code for rounding up
the counts. Now the LDS size can never overflow, but it may use
vector lanes less efficiently (max_gsprims can be decreased more),
which will be addressed in the next commit.

Fixes: 4ecc39e1aa1 ("radeonsi/gfx10: NGG geometry shader PM4 and upload")
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6137>
(cherry picked from commit 64c741ffb7aa0ae40c4302bc065fef0192123c6a)

---

 .pick_status.json                               |  2 +-
 src/gallium/drivers/radeonsi/gfx10_shader_ngg.c | 11 ++++++++---
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index 3eb233b6dc5..78b63342dd2 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -121,7 +121,7 @@
         "description": "radeonsi: fix applying the NGG minimum vertex count requirement",
         "nominated": true,
         "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
         "master_sha": null,
         "because_sha": "4ecc39e1aa1568f19ebf54a99ffe14643bac7d15"
     },
diff --git a/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c b/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c
index c9fdceef605..2eb278fbb79 100644
--- a/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c
+++ b/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c
@@ -2003,6 +2003,8 @@ void gfx10_ngg_calculate_subgroup_info(struct si_shader *shader)
          max_esverts =
             MIN2(max_esverts, (max_lds_size - max_gsprims * gsprim_lds_size) / esvert_lds_size);
          max_esverts = MIN2(max_esverts, max_gsprims * max_verts_per_prim);
+         /* Hardware restriction: minimum value of max_esverts */
+         max_esverts = MAX2(max_esverts, 23 + max_verts_per_prim);
 
          max_gsprims = align(max_gsprims, wavesize);
          max_gsprims = MIN2(max_gsprims, max_gsprims_base);
@@ -2012,10 +2014,13 @@ void gfx10_ngg_calculate_subgroup_info(struct si_shader *shader)
             clamp_gsprims_to_esverts(&max_gsprims, max_esverts, min_verts_per_prim, use_adjacency);
          assert(max_esverts >= max_verts_per_prim && max_gsprims >= 1);
       } while (orig_max_esverts != max_esverts || orig_max_gsprims != max_gsprims);
-   }
 
-   /* Hardware restriction: minimum value of max_esverts */
-   max_esverts = MAX2(max_esverts, 23 + max_verts_per_prim);
+      /* Verify the restriction. */
+      assert(max_esverts >= 23 + max_verts_per_prim);
+   } else {
+      /* Hardware restriction: minimum value of max_esverts */
+      max_esverts = MAX2(max_esverts, 23 + max_verts_per_prim);
+   }
 
    unsigned max_out_vertices =
       max_vert_out_per_gs_instance

_______________________________________________
mesa-commit mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-commit
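
Illustrative note (not part of the commit): the ordering problem described in the commit message can be sketched with a much simplified LDS budget model. Everything below is an assumption made for illustration: `struct lds_model`, `struct subgroup`, `lds_usage`, `clamp_late` and `clamp_early` are hypothetical names, the numbers are invented, and the real `gfx10_ngg_calculate_subgroup_info` does considerably more (rounding to the wave size, iterating until the counts stop changing). The sketch only shows why raising `max_esverts` to the hardware minimum after the LDS fit can exceed the budget, and how applying the minimum before the primitive count is re-fitted keeps the total within it.

```c
#include <assert.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, simplified LDS budget model; not the driver's data structure. */
struct lds_model {
   unsigned max_lds_size;    /* total LDS budget (arbitrary units) */
   unsigned esvert_lds_size; /* LDS cost per ES vertex */
   unsigned gsprim_lds_size; /* LDS cost per GS primitive */
   unsigned min_esverts;     /* hardware minimum; 23 + max_verts_per_prim in the patch */
};

/* Hypothetical pair of subgroup limits. */
struct subgroup {
   unsigned esverts;
   unsigned gsprims;
};

static unsigned lds_usage(const struct lds_model *m, struct subgroup s)
{
   return s.esverts * m->esvert_lds_size + s.gsprims * m->gsprim_lds_size;
}

/* Old ordering: fit the vertex count into LDS, then apply the minimum last.
 * Nothing re-checks the budget after the MAX2, so the result can overflow. */
static struct subgroup clamp_late(const struct lds_model *m, struct subgroup s)
{
   s.esverts = MIN2(s.esverts,
                    (m->max_lds_size - s.gsprims * m->gsprim_lds_size) / m->esvert_lds_size);
   s.esverts = MAX2(s.esverts, m->min_esverts);
   return s;
}

/* New ordering (simplified): apply the minimum right after the LDS fit, then
 * shrink the primitive count so the raised vertex count still fits.
 * Assumes min_esverts * esvert_lds_size <= max_lds_size. */
static struct subgroup clamp_early(const struct lds_model *m, struct subgroup s)
{
   s.esverts = MIN2(s.esverts,
                    (m->max_lds_size - s.gsprims * m->gsprim_lds_size) / m->esvert_lds_size);
   s.esverts = MAX2(s.esverts, m->min_esverts);
   s.gsprims = MIN2(s.gsprims,
                    (m->max_lds_size - s.esverts * m->esvert_lds_size) / m->gsprim_lds_size);
   return s;
}
```

With an invented budget of 1000 units, 30 units per vertex, 10 per primitive and a minimum of 26 vertices, starting from 64 vertices and 40 primitives, `clamp_late` ends at 26 vertices and 40 primitives (1180 units, over budget), while `clamp_early` ends at 26 vertices and 22 primitives (exactly 1000 units). The lower primitive count corresponds to the less efficient lane usage that the commit message mentions.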
