This series supersedes the "Global dead write vars removal pass".

The goal here is to perform copy propagation among values in different blocks. While this currently has small benefits (it helped some cases with uniforms), we expect it to become more useful as we move other resources to be addressed with derefs (e.g. SSBOs), in particular for compute shaders.

To be able to do this I had to extract the dead write removal from the copy propagation pass. When working beyond a single block, the information for that optimization flows in a different direction (backwards), so it helps to keep the two separated.

The pass uses an approach similar to what we do in GLSL copy prop: we propagate values forward following the control flow graph. It doesn't try to merge values from different branches or handle more detailed control flow.

I think this approach is a good intermediate step. I've experimented with various approaches to implement a full data-flow analysis, but all of them ended up either too complex or too messy. Some factors that contributed to that:

(a) we have load/stores and copies, so a value in the ACP needs to be "broken up into pieces";
(b) copies with wildcards force us to take into consideration whether derefs are contained or not, at many levels;
(c) we have writemasks (for the vectors) associated.

In particular, (b) made the tree-based deref_map structure I've discussed elsewhere not as good as I'd expected. Because we want to keep track of "a[*].x", "a[1].x" and "a[indirect].x", the walk on the tree is not linear in the size of the deref.

A future idea I'll explore is splitting the problem into different pieces, directed by the inputs we see. E.g. maybe a data-flow analysis only of the copies, or only of the fully qualified load/stores, or handling only scalars (after a vec to scalar pass).

For now, I've shelved the global optimization for dead write removal. It wasn't helping any cases, so I'll wait until we have more derefs around to see the difference.
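
To make factor (b) more concrete, here is a minimal sketch of what comparing two derefs with wildcard and indirect indices involves. It is an illustration under simplifying assumptions, not the code in this series: the names (struct idx, compare_paths, the result bits) are hypothetical, only array-index steps are modelled, and both paths are assumed to start at the same variable and go through the same struct members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a deref path: only the array-index
 * steps are kept.  This is not the actual NIR representation. */
enum idx_kind { IDX_CONST, IDX_WILDCARD, IDX_INDIRECT };

struct idx {
   enum idx_kind kind;
   int value;               /* Only meaningful for IDX_CONST. */
};

/* Result bits; a result of 0 means the two derefs can never overlap. */
enum {
   MAY_ALIAS    = 1 << 0,
   A_CONTAINS_B = 1 << 1,
   B_CONTAINS_A = 1 << 2,
};

static int
compare_paths(const struct idx *a, const struct idx *b, size_t len)
{
   assert(len == 0 || (a != NULL && b != NULL));

   /* Start by assuming the paths are identical and narrow from there. */
   int result = MAY_ALIAS | A_CONTAINS_B | B_CONTAINS_A;

   for (size_t i = 0; i < len; i++) {
      if (a[i].kind == IDX_WILDCARD && b[i].kind == IDX_WILDCARD)
         continue;                         /* a[*] vs a[*]: same set. */

      if (a[i].kind == IDX_WILDCARD) {
         result &= ~B_CONTAINS_A;          /* a is wider at this level. */
      } else if (b[i].kind == IDX_WILDCARD) {
         result &= ~A_CONTAINS_B;          /* b is wider at this level. */
      } else if (a[i].kind == IDX_CONST && b[i].kind == IDX_CONST) {
         if (a[i].value != b[i].value)
            return 0;                      /* a[1] vs a[2]: disjoint. */
      } else {
         /* At least one indirect index: which element is unknown, so
          * neither side can be said to contain the other. */
         result &= ~(A_CONTAINS_B | B_CONTAINS_A);
      }
   }

   return result;
}
```

With this model, "a[*]" against "a[1]" yields MAY_ALIAS | A_CONTAINS_B, while "a[indirect]" against "a[1]" yields only MAY_ALIAS. An entry tracked for "a[1].x" therefore has to be checked against writes to all three forms, which is why a lookup keyed on the exact deref path is not enough.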

Caio Marcelo de Oliveira Filho (11):
  util: Add foreach_reverse for dynarray
  util: Add macro to get number of elements in dynarray
  nir: Add test file for vars related passes
  nir: Add tests for dead write elimination
  nir: Separate dead write removal into its own pass
  intel/nir: Use the separated dead write vars pass
  freedreno/ir3: Use the separated dead write vars pass
  nir: Remove handling of dead writes from copy_prop_vars
  nir: Add tests for copy propagation of derefs
  nir: Take call instruction into account in copy_prop_vars
  nir: Copy propagation between blocks

 src/compiler/Makefile.nir.am                |  34 +-
 src/compiler/Makefile.sources               |   1 +
 src/compiler/nir/meson.build                |  12 +
 src/compiler/nir/nir.h                      |   2 +
 src/compiler/nir/nir_opt_copy_prop_vars.c   | 481 +++++++++----
 src/compiler/nir/nir_opt_dead_write_vars.c  | 216 ++++++
 src/compiler/nir/tests/vars_tests.cpp       | 737 ++++++++++++++++++++
 src/gallium/drivers/freedreno/ir3/ir3_nir.c |   1 +
 src/intel/compiler/brw_nir.c                |   1 +
 src/util/u_dynarray.h                       |   7 +
 10 files changed, 1329 insertions(+), 163 deletions(-)
 create mode 100644 src/compiler/nir/nir_opt_dead_write_vars.c
 create mode 100644 src/compiler/nir/tests/vars_tests.cpp

-- 
2.19.0