A number of games have large arrays of constants, which we promote to uniforms. This introduces copies from the uniform array to the original temporary array. Normally, copy propagation eliminates those copies, making everything refer to the uniform array directly.
A number of shaders in "Deus Ex: Mankind Divided" recently exposed a limitation of copy propagation - if we had any intrinsics (i.e. image access in a compute shader), we weren't able to get rid of these copies. That meant that any variable indexing remained on the temporary array rather being moved to the uniform array. i965's scalar backend currently doesn't support indirect addressing of temporary arrays, which meant lowering it to if-ladders. This was horrible. On Skylake: total instructions in shared programs: 13700090 -> 13654519 (-0.33%) instructions in affected programs: 56438 -> 10867 (-80.75%) helped: 14 HURT: 0 total cycles in shared programs: 288879704 -> 291270232 (0.83%) cycles in affected programs: 12758080 -> 15148608 (18.74%) helped: 6 HURT: 8 All shaders helped are compute shaders in Tomb Raider or Deus Ex. Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> --- src/compiler/glsl/opt_copy_propagation.cpp | 31 ++++++++++++++++++++++++++---- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp index 247c498..2240421 100644 --- a/src/compiler/glsl/opt_copy_propagation.cpp +++ b/src/compiler/glsl/opt_copy_propagation.cpp @@ -186,11 +186,34 @@ ir_copy_propagation_visitor::visit_enter(ir_call *ir) } } - /* Since we're unlinked, we don't (necessarily) know the side effects of - * this call. So kill all copies. + /* Since this pass can run when unlinked, we don't (necessarily) know + * the side effects of calls. (When linked, most calls are inlined + * anyway, so it doesn't matter much.) + * + * One place where this does matter is IR intrinsics. They're never + * inlined. We also know what they do - while some have side effects + * (such as image writes), none edit random global variables. So we + * can assume they're side-effect free (other than the return value + * and out parameters). */ - _mesa_hash_table_clear(acp, NULL); - this->killed_all = true; + if (!ir->callee->is_intrinsic()) { + _mesa_hash_table_clear(acp, NULL); + this->killed_all = true; + } else { + if (ir->return_deref) + kill(ir->return_deref->var); + + foreach_two_lists(formal_node, &ir->callee->parameters, + actual_node, &ir->actual_parameters) { + ir_variable *sig_param = (ir_variable *) formal_node; + if (sig_param->data.mode == ir_var_function_out || + sig_param->data.mode == ir_var_function_inout) { + ir_rvalue *ir = (ir_rvalue *) actual_node; + ir_variable *var = ir->variable_referenced(); + kill(var); + } + } + } return visit_continue_with_parent; } -- 2.10.2 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev