Re: [Mesa-dev] [PATCH v2 14/22] intel/compiler: Do image load/store lowering to NIR

2018-08-29 Thread Kenneth Graunke
On Wednesday, August 29, 2018 10:11:48 AM PDT Jason Ekstrand wrote:
> This commit moves our storage image format conversion codegen into NIR
> instead of doing it in the back-end.  This has the advantage of letting
> us run it through NIR's optimizer which is pretty effective at shrinking
> things down.  In the common case of rgba8, the number of instructions
> emitted after NIR is done with it is half of what it was with the
> lowering happening in the back-end.  On the downside, the back-end's
> lowering is able to directly use predicates and the NIR lowering has to
> use IFs.
> 
> Shader-db results on Kaby Lake:
> 
> total instructions in shared programs: 15166910 -> 15166872 (<.01%)
> instructions in affected programs: 5895 -> 5857 (-0.64%)
> helped: 15
> HURT: 0
> 
> Clearly, we don't have that much image_load_store happening in the
> shaders in shader-db

Looks like the untyped stride checks are back and the unnecessary typed
load checks (mistakenly copied from atomics) are gone.  Nice.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 14/22] intel/compiler: Do image load/store lowering to NIR

2018-08-29 Thread Jason Ekstrand
This commit moves our storage image format conversion codegen into NIR
instead of doing it in the back-end.  This has the advantage of letting
us run it through NIR's optimizer which is pretty effective at shrinking
things down.  In the common case of rgba8, the number of instructions
emitted after NIR is done with it is half of what it was with the
lowering happening in the back-end.  On the downside, the back-end's
lowering is able to directly use predicates and the NIR lowering has to
use IFs.

Shader-db results on Kaby Lake:

total instructions in shared programs: 15166910 -> 15166872 (<.01%)
instructions in affected programs: 5895 -> 5857 (-0.64%)
helped: 15
HURT: 0

Clearly, we don't have that much image_load_store happening in the
shaders in shader-db
---
 src/compiler/nir/nir_intrinsics.py|9 +
 src/intel/Makefile.sources|1 +
 src/intel/compiler/brw_fs_nir.cpp |  126 +-
 src/intel/compiler/brw_fs_surface_builder.cpp | 1030 -
 src/intel/compiler/brw_fs_surface_builder.h   |   20 -
 src/intel/compiler/brw_nir.h  |3 +
 .../compiler/brw_nir_lower_image_load_store.c |  822 +
 src/intel/compiler/meson.build|1 +
 src/intel/vulkan/anv_pipeline.c   |2 +
 src/mesa/drivers/dri/i965/brw_program.c   |2 +
 10 files changed, 896 insertions(+), 1120 deletions(-)
 create mode 100644 src/intel/compiler/brw_nir_lower_image_load_store.c

diff --git a/src/compiler/nir/nir_intrinsics.py 
b/src/compiler/nir/nir_intrinsics.py
index 17212c4862f..170f954e375 100644
--- a/src/compiler/nir/nir_intrinsics.py
+++ b/src/compiler/nir/nir_intrinsics.py
@@ -312,6 +312,15 @@ intrinsic("image_deref_atomic_fadd",  src_comp=[1, 4, 1, 
1], dest_comp=1)
 intrinsic("image_deref_size",src_comp=[1], dest_comp=0, 
flags=[CAN_ELIMINATE, CAN_REORDER])
 intrinsic("image_deref_samples", src_comp=[1], dest_comp=1, 
flags=[CAN_ELIMINATE, CAN_REORDER])
 
+# Intel-specific query for loading from the brw_image_param struct passed
+# into the shader as a uniform.  The variable is a deref to the image
+# variable. The const index specifies which of the six parameters to load.
+intrinsic("image_deref_load_param_intel", src_comp=[1], dest_comp=0,
+  indices=[BASE], flags=[CAN_ELIMINATE, CAN_REORDER])
+intrinsic("image_deref_load_raw_intel", src_comp=[1, 1], dest_comp=0,
+  flags=[CAN_ELIMINATE])
+intrinsic("image_deref_store_raw_intel", src_comp=[1, 1, 0])
+
 # Vulkan descriptor set intrinsics
 #
 # The Vulkan API uses a different binding model from GL.  In the Vulkan
diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 5f6cd96825b..d10c4511734 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -84,6 +84,7 @@ COMPILER_FILES = \
compiler/brw_nir_analyze_ubo_ranges.c \
compiler/brw_nir_attribute_workarounds.c \
compiler/brw_nir_lower_cs_intrinsics.c \
+   compiler/brw_nir_lower_image_load_store.c \
compiler/brw_nir_opt_peephole_ffma.c \
compiler/brw_nir_tcs_workarounds.c \
compiler/brw_packed_float.c \
diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 67c0bee7acd..b2be91f9117 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3865,38 +3865,33 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
case nir_intrinsic_image_deref_atomic_xor:
case nir_intrinsic_image_deref_atomic_exchange:
case nir_intrinsic_image_deref_atomic_comp_swap: {
-  using namespace image_access;
-
   if (stage == MESA_SHADER_FRAGMENT &&
   instr->intrinsic != nir_intrinsic_image_deref_load)
  brw_wm_prog_data(prog_data)->has_side_effects = true;
 
   /* Get the referenced image variable and type. */
   nir_deref_instr *deref = nir_src_as_deref(instr->src[0]);
-  const nir_variable *var = nir_deref_instr_get_variable(deref);
-  const glsl_type *type = var->type->without_array();
-  const brw_reg_type base_type = get_image_base_type(type);
+  const glsl_type *type = deref->type;
 
   /* Get some metadata from the image intrinsic. */
   const nir_intrinsic_info *info = _intrinsic_infos[instr->intrinsic];
-  const unsigned arr_dims = type->sampler_array ? 1 : 0;
-  const unsigned surf_dims = type->coordinate_components() - arr_dims;
-  const unsigned format = var->data.image.format;
+  const unsigned dims = type->coordinate_components();
   const unsigned dest_components = nir_intrinsic_dest_components(instr);
 
   /* Get the arguments of the image intrinsic. */
   const fs_reg image = get_nir_image_deref(deref);
-  const fs_reg addr = retype(get_nir_src(instr->src[1]),
- BRW_REGISTER_TYPE_UD);
+  const fs_reg coords = retype(get_nir_src(instr->src[1]),
+