Module: Mesa
Branch: main
Commit: e386523380d8fac9b1bca3848b1fafa8bdc90a65
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e386523380d8fac9b1bca3848b1fafa8bdc90a65

Author: Rhys Perry <[email protected]>
Date:   Wed Jan  4 10:51:24 2023 +0000

aco/gfx11: fix discard early exit removal optimization

This optimization never happened because the NULL target was removed in
GFX11.
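
A minimal sketch of the check this fix is after, assuming a simplified model
rather than the real ACO data structures. GfxLevel, ExportInstr,
null_export_dest and is_null_export are hypothetical names, and the target
encodings (MRT0 = 0, NULL = 9) are stand-ins for V_008DFC_SQ_EXP_MRT and
V_008DFC_SQ_EXP_NULL. The idea is that on GFX11 a "null" export appears to be
modelled as an MRT export with no channels enabled, so both the target and
the enabled mask need checking.

```cpp
// Hypothetical stand-ins for Mesa definitions; names and values are
// assumptions made for illustration only.
enum GfxLevel { GFX10 = 10, GFX11 = 11 };

constexpr unsigned EXP_TARGET_MRT0 = 0; // stand-in for V_008DFC_SQ_EXP_MRT
constexpr unsigned EXP_TARGET_NULL = 9; // stand-in for V_008DFC_SQ_EXP_NULL

// Simplified export instruction: only the two fields the check reads.
struct ExportInstr {
   unsigned dest;         // export target
   unsigned enabled_mask; // one bit per enabled channel
};

// Target that a "null" export uses on the given hardware generation.
unsigned null_export_dest(GfxLevel level)
{
   return level >= GFX11 ? EXP_TARGET_MRT0 : EXP_TARGET_NULL;
}

// An export counts as null only if it hits the null target AND writes no
// channels. Without the mask test, a real MRT0 colour export on GFX11 would
// be mistaken for a null export.
bool is_null_export(const ExportInstr& e, GfxLevel level)
{
   return e.dest == null_export_dest(level) && e.enabled_mask == 0;
}
```

Under these assumptions, is_null_export({0, 0}, GFX11) is true, while
is_null_export({0, 0xf}, GFX11) is false because channels are enabled.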

fossil-db (gfx1100):
Totals from 5439 (4.04% of 134574) affected shaders:
Instrs: 407865 -> 387123 (-5.09%)
CodeSize: 2163340 -> 2060644 (-4.75%)
Latency: 3432378 -> 3327802 (-3.05%)
InvThroughput: 270133 -> 262980 (-2.65%)
Branches: 8524 -> 3085 (-63.81%)

Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Timur Kristóf <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20513>

---

 src/amd/compiler/aco_lower_to_hw_instr.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/compiler/aco_lower_to_hw_instr.cpp b/src/amd/compiler/aco_lower_to_hw_instr.cpp
index 63e8aba1ce6..3c82fe2577a 100644
--- a/src/amd/compiler/aco_lower_to_hw_instr.cpp
+++ b/src/amd/compiler/aco_lower_to_hw_instr.cpp
@@ -2198,7 +2198,7 @@ lower_to_hw_instr(Program* program)
                if ((block->instructions.size() - 1 - instr_idx) <= 4 &&
                    block->instructions.back()->opcode == aco_opcode::s_endpgm) {
                   unsigned null_exp_dest =
-                     (ctx.program->stage.hw == HWStage::FS) ? 9 /* NULL */ : V_008DFC_SQ_EXP_POS;
+                     program->gfx_level >= GFX11 ? V_008DFC_SQ_EXP_MRT : V_008DFC_SQ_EXP_NULL;
                   bool ignore_early_exit = true;
 
                   for (unsigned k = instr_idx + 1; k < block->instructions.size(); ++k) {
@@ -2207,7 +2207,8 @@ lower_to_hw_instr(Program* program)
                          instr2->opcode == aco_opcode::p_logical_end)
                         continue;
                      else if (instr2->opcode == aco_opcode::exp &&
-                              instr2->exp().dest == null_exp_dest)
+                              instr2->exp().dest == null_exp_dest &&
+                              instr2->exp().enabled_mask == 0)
                         continue;
                      else if (instr2->opcode == aco_opcode::p_parallelcopy &&
                               instr2->definitions[0].isFixed() &&
