vtn: "raw" pointer support

Karol Herbst Fri, 23 Mar 2018 14:16:30 -0700

On Fri, Mar 23, 2018 at 10:07 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> +list
>
> On Fri, Mar 23, 2018 at 1:45 PM, Karol Herbst <kher...@redhat.com> wrote:
>>
>> On Fri, Mar 23, 2018 at 9:30 PM, Jason Ekstrand <ja...@jlekstrand.net>
>> wrote:
>> > As I've been rewriting core NIR deref handling, I've been thinking about
>> > this problem quite a bit.  One objective I have is to actually make UBO
>> > and
>> > SSBO access go through derefs instead of just being an offset and index
>> > so
>> > that the compiler can better reason about them.  In particular, I want
>> > to be
>> > able to start doing load/store elimination on SSBOs, SLM, and whatever
>> > CL
>> > has which would be great for everyone's compute performance (GL, Vulkan,
>> > CL,
>> > etc.).
>> >
>> > I would be lying if I said I had a full plan but I do have part of a
>> > plan.
>> > In my patch which adds the deref instructions, I add a new "cast" deref
>> > type
>> > which takes an arbitrary value as it's source and kicks out a deref with
>> > a
>> > type.  Whenever we discover that the source of the cast is actually
>> > another
>> > deref which is compatible (same type etc.), copy propagation gets rid of
>> > the
>> > cast for you.  The idea is that, instead of doing a load_raw(raw_ptr),
>> > you
>> > would do a load((type *)raw_ptr).
>> >
>> > Right now, most of the core NIR optimizations will throw a fit if they
>> > ever
>> > see a cast.  This is intentional because it requires us to manually go
>> > through and handle casts.  This would mean that, at the moment, you
>> > would
>> > have to lower to load_raw intrinsics almost immediately after coming out
>> > of
>> > SPIR-V.
>> >
>>
>> Well it gets more fun with OpenCL 2.0 where you can have generic
>> pointer where you only know the type at creation type. You can also
>> declare generic pointers as function inputs in a way, that you never
>> actually know from where you have to load if you only have that one
>> function. So the actual load operation depends on when you create the
>> initial pointer variable (you can cast from X to generic, but not the
>> other way around).
>>
>> Which in the end means you can end up with load(generic_ptr) and only
>> following the chain up to it's creation (with function inlining in
>> mind) you know the actual memory target.
>
>
> Yup.  And there will always be crazy cases where you can't actually follow
> it and you have to emit a pile of code to load different ways depending on
> some bits somewhere that tell you how to load it.  I'm well aware of the
> insanity. :-)  This is part of the reason why I'm glad I'm not trying to
> write an OpenCL 2.0 driver.
>
> This insanity is exactly why I'm suggesting the pointer casting.  Sure, you
> may not know the data type until the actual load.  In that case, you end up
> with the cast being right before the load.  If you don't know the storage
> class, maybe you have to switch and do multiple casts based on some bits.
> Alternatively, if you don't know the storage class, we can just let the
> deref mode be 0 for "I don't know". or maybe multiple bits for "these are
> the things it might be".  In any case, I think we can handle it.
>


there shouldn't be a situation where we don't know, except when you
don't inline all functions. I think Rob had the idea of fat pointers
where a pointer is a vec2 and the 2nd component contains the actual
pointer type and you end up with a switch over the type to get the
correct storage class. And if the compiler inlines all functions, it
should be able to optimize that switch away.

> It's insane but we need some sort of structure to be able to reason about
> the insanity.  Immediately lowering everything to load_raw is a good way to
> get a driver off the ground.  What it's not so good for is making an
> optimizing compiler that can reason about these crazy pointers and actually
> optimize them.  Lest I sound too negative, I'm 100% fine with taking a short
> path to getting something working now so long as it doesn't cloud up our
> ability to do better in the future.
>
>>
>> And I think the issue here is not that it is some kind of raw pointer
>> in the patch, but more like an unbound/physical pointer, which doesn't
>> relate to any variable. It is just a value like any other int/long as
>> well.
>>
>> > On Fri, Mar 23, 2018 at 12:33 PM, Karol Herbst <kher...@redhat.com>
>> > wrote:
>> >>
>> >> From: Rob Clark <robdcl...@gmail.com>
>> >>
>> >> An attempt to add physical pointer support to vtn.  I'm not totally
>> >> happy about the handling of logical pointers vs physical pointers.
>> >> So this is really more of an RFS (request for suggestions)
>> >>
>> >> v2: treat vec3 types as vec4 when dereferencing
>> >>
>> >> Signed-off-by: Karol Herbst <kher...@redhat.com>
>> >> ---
>> >>  src/compiler/spirv/spirv_to_nir.c  |  87 ++++++++---
>> >>  src/compiler/spirv/vtn_private.h   |  20 ++-
>> >>  src/compiler/spirv/vtn_variables.c | 300
>> >> ++++++++++++++++++++++++++++++++-----
>> >>  3 files changed, 347 insertions(+), 60 deletions(-)
>> >>
>> >> diff --git a/src/compiler/spirv/spirv_to_nir.c
>> >> b/src/compiler/spirv/spirv_to_nir.c
>> >> index 334bcab9a82..d58a68f80ef 100644
>> >> --- a/src/compiler/spirv/spirv_to_nir.c
>> >> +++ b/src/compiler/spirv/spirv_to_nir.c
>> >> @@ -572,6 +572,7 @@ vtn_types_compatible(struct vtn_builder *b,
>> >>               vtn_types_compatible(b, t1->array_element,
>> >> t2->array_element);
>> >>
>> >>     case vtn_base_type_pointer:
>> >> +   case vtn_base_type_raw_pointer:
>> >>        return vtn_types_compatible(b, t1->deref, t2->deref);
>> >>
>> >>     case vtn_base_type_struct:
>> >> @@ -609,6 +610,7 @@ vtn_type_copy(struct vtn_builder *b, struct
>> >> vtn_type
>> >> *src)
>> >>     case vtn_base_type_matrix:
>> >>     case vtn_base_type_array:
>> >>     case vtn_base_type_pointer:
>> >> +   case vtn_base_type_raw_pointer:
>> >>     case vtn_base_type_image:
>> >>     case vtn_base_type_sampler:
>> >>     case vtn_base_type_sampled_image:
>> >> @@ -939,6 +941,14 @@ vtn_type_layout_std430(struct vtn_builder *b,
>> >> struct
>> >> vtn_type *type,
>> >>        return type;
>> >>     }
>> >>
>> >> +   case vtn_base_type_raw_pointer: {
>> >> +      uint32_t comp_size = b->ptr_size / 8;
>> >> +      vtn_assert(comp_size);
>> >> +      *size_out = comp_size;
>> >> +      *align_out = comp_size;
>> >> +      return type;
>> >> +   }
>> >> +
>> >>     case vtn_base_type_vector: {
>> >>        uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
>> >>        assert(type->length > 0 && type->length <= 4);
>> >> @@ -1003,6 +1013,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
>> >> opcode,
>> >>        val->type->base_type = vtn_base_type_scalar;
>> >>        val->type->type = glsl_bool_type();
>> >>        val->type->length = 1;
>> >> +      val->type->stride = 4;
>> >>        break;
>> >>     case SpvOpTypeInt: {
>> >>        int bit_size = w[2];
>> >> @@ -1025,6 +1036,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
>> >> opcode,
>> >>           vtn_fail("Invalid int bit size");
>> >>        }
>> >>        val->type->length = 1;
>> >> +      val->type->stride = bit_size / 8;
>> >>        break;
>> >>     }
>> >>
>> >> @@ -1045,6 +1057,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
>> >> opcode,
>> >>           vtn_fail("Invalid float bit size");
>> >>        }
>> >>        val->type->length = 1;
>> >> +      val->type->stride = bit_size / 8;
>> >>        break;
>> >>     }
>> >>
>> >> @@ -1061,6 +1074,10 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
>> >> opcode,
>> >>        val->type->type =
>> >> glsl_vector_type(glsl_get_base_type(base->type),
>> >> elems);
>> >>        val->type->length = elems;
>> >>        val->type->stride = glsl_get_bit_size(base->type) / 8;
>> >> +      /* special case: vec3 is aligned to vec4 */
>> >> +      if (elems == 3)
>> >> +         elems = 4;
>> >> +      val->type->stride *= elems;
>> >>        val->type->array_element = base;
>> >>        break;
>> >>     }
>> >> @@ -1138,7 +1155,11 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
>> >> opcode,
>> >>
>> >>        const char *name = val->name ? val->name : "struct";
>> >>
>> >> -      val->type->type = glsl_struct_type(fields, num_fields, name,
>> >> false);
>> >> +      val->type->type = glsl_struct_type(fields, num_fields, name,
>> >> +                                         val->type->packed);
>> >> +      // TODO stride for a struct only matters for kernel shaders,
>> >> where
>> >> +      // cl_size is the right thing.. but still a bit ugly to
>> >> hard-code.
>> >> +      val->type->stride = glsl_get_cl_size(val->type->type);
>> >>        break;
>> >>     }
>> >>
>> >> @@ -1167,25 +1188,47 @@ vtn_handle_type(struct vtn_builder *b, SpvOp
>> >> opcode,
>> >>        val->type->storage_class = storage_class;
>> >>        val->type->deref = deref_type;
>> >>
>> >> -      if (storage_class == SpvStorageClassUniform ||
>> >> -          storage_class == SpvStorageClassStorageBuffer) {
>> >> -         /* These can actually be stored to nir_variables and used as
>> >> SSA
>> >> -          * values so they need a real glsl_type.
>> >> -          */
>> >> -         val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
>> >> -      }
>> >> -
>> >> -      if (storage_class == SpvStorageClassWorkgroup &&
>> >> -          b->options->lower_workgroup_access_to_offsets) {
>> >> +      // XXX handling the "fake" glsl pointers vs "raw" pointers in
>> >> kernel
>> >> +      // is a bit ugly..  need to understand how "pointers" are used
>> >> in
>> >> vk
>> >> +      // and figure out something better
>> >> +      if (storage_class == SpvStorageClassFunction ||
>> >> +          storage_class == SpvStorageClassUniformConstant ||
>> >> +          storage_class == SpvStorageClassWorkgroup ||
>> >> +          !b->kernel_mode) {
>> >> +         if (storage_class == SpvStorageClassUniform ||
>> >> +             storage_class == SpvStorageClassStorageBuffer) {
>> >> +            /* These can actually be stored to nir_variables and used
>> >> as
>> >> SSA
>> >> +             * values so they need a real glsl_type.
>> >> +             */
>> >> +            val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
>> >> +         } else if (storage_class == SpvStorageClassWorkgroup &&
>> >> +                    b->options->lower_workgroup_access_to_offsets) {
>> >> +            uint32_t size, align;
>> >> +            val->type->deref = vtn_type_layout_std430(b,
>> >> val->type->deref,
>> >> +                                                      &size, &align);
>> >> +            val->type->length = size;
>> >> +            val->type->align = align;
>> >> +            /* These can actually be stored to nir_variables and used
>> >> as
>> >> SSA
>> >> +             * values so they need a real glsl_type.
>> >> +             */
>> >> +            val->type->type = glsl_uint_type();
>> >> +         }
>> >> +      } else {
>> >> +         vtn_assert(storage_class == SpvStorageClassCrossWorkgroup ||
>> >> +                    storage_class == SpvStorageClassInput);
>> >>           uint32_t size, align;
>> >> +         if (b->ptr_size == 64) {
>> >> +            val->type->type = glsl_uint64_t_type();
>> >> +         } else {
>> >> +            val->type->type = glsl_uint_type();
>> >> +         }
>> >> +         val->type->base_type = vtn_base_type_raw_pointer;
>> >> +         /* pointers can be accessed as array, so set the stride as
>> >> size:
>> >> */
>> >>           val->type->deref = vtn_type_layout_std430(b,
>> >> val->type->deref,
>> >>                                                     &size, &align);
>> >> -         val->type->length = size;
>> >> +#define ALIGN(_v, _d) (((_v) + ((_d) - 1)) & ~((_d) - 1))
>> >> +         val->type->stride = ALIGN(size, align);
>> >>           val->type->align = align;
>> >> -         /* These can actually be stored to nir_variables and used as
>> >> SSA
>> >> -          * values so they need a real glsl_type.
>> >> -          */
>> >> -         val->type->type = glsl_uint_type();
>> >>        }
>> >>        break;
>> >>     }
>> >> @@ -3391,9 +3434,16 @@ vtn_handle_preamble_instruction(struct
>> >> vtn_builder
>> >> *b, SpvOp opcode,
>> >>        break;
>> >>
>> >>     case SpvOpMemoryModel:
>> >> -      vtn_assert(w[1] == SpvAddressingModelLogical);
>> >> +      vtn_assert(w[1] == SpvAddressingModelLogical ||
>> >> +                 w[1] == SpvAddressingModelPhysical32 ||
>> >> +                 w[1] == SpvAddressingModelPhysical64);
>> >>        vtn_assert(w[2] == SpvMemoryModelSimple ||
>> >> -                 w[2] == SpvMemoryModelGLSL450);
>> >> +                 w[2] == SpvMemoryModelGLSL450 ||
>> >> +                 w[2] == SpvMemoryModelOpenCL);
>> >> +      if (w[1] == SpvAddressingModelPhysical32)
>> >> +         b->ptr_size = 32;
>> >> +      else if (w[1] == SpvAddressingModelPhysical64)
>> >> +         b->ptr_size = 64;
>> >>        break;
>> >>
>> >>     case SpvOpEntryPoint: {
>> >> @@ -3780,6 +3830,7 @@ vtn_handle_body_instruction(struct vtn_builder
>> >> *b,
>> >> SpvOp opcode,
>> >>           sel_type = glsl_vector_type(GLSL_TYPE_BOOL,
>> >> res_val->type->length);
>> >>           break;
>> >>        case vtn_base_type_pointer:
>> >> +      case vtn_base_type_raw_pointer:
>> >>           /* We need to have actual storage for pointer types */
>> >>           vtn_fail_if(res_val->type->type == NULL,
>> >>                       "Invalid pointer result type for OpSelect");
>> >> diff --git a/src/compiler/spirv/vtn_private.h
>> >> b/src/compiler/spirv/vtn_private.h
>> >> index bd273122703..dbfe9eab58a 100644
>> >> --- a/src/compiler/spirv/vtn_private.h
>> >> +++ b/src/compiler/spirv/vtn_private.h
>> >> @@ -262,7 +262,8 @@ enum vtn_base_type {
>> >>     vtn_base_type_matrix,
>> >>     vtn_base_type_array,
>> >>     vtn_base_type_struct,
>> >> -   vtn_base_type_pointer,
>> >> +   vtn_base_type_pointer,         /* a logical pointer */
>> >> +   vtn_base_type_raw_pointer,     /* a physical pointer */
>> >>     vtn_base_type_image,
>> >>     vtn_base_type_sampler,
>> >>     vtn_base_type_sampled_image,
>> >> @@ -407,12 +408,14 @@ enum vtn_variable_mode {
>> >>     vtn_variable_mode_local,
>> >>     vtn_variable_mode_global,
>> >>     vtn_variable_mode_param,
>> >> +   vtn_variable_mode_const,
>> >>     vtn_variable_mode_ubo,
>> >>     vtn_variable_mode_ssbo,
>> >>     vtn_variable_mode_push_constant,
>> >>     vtn_variable_mode_image,
>> >>     vtn_variable_mode_sampler,
>> >>     vtn_variable_mode_workgroup,
>> >> +   vtn_variable_mode_cross_workgroup,
>> >>     vtn_variable_mode_input,
>> >>     vtn_variable_mode_output,
>> >>  };
>> >> @@ -588,6 +591,13 @@ struct vtn_builder {
>> >>
>> >>     bool has_loop_continue;
>> >>
>> >> +   /* pointer size is:
>> >> +    *   AddressingModelLogical:    0    (default)
>> >> +    *   AddressingModelPhysical32: 32
>> >> +    *   AddressingModelPhysical64: 64
>> >> +    */
>> >> +   unsigned ptr_size;
>> >> +
>> >>     bool kernel_mode;
>> >>  };
>> >>
>> >> @@ -624,7 +634,8 @@ vtn_push_ssa(struct vtn_builder *b, uint32_t
>> >> value_id,
>> >>               struct vtn_type *type, struct vtn_ssa_value *ssa)
>> >>  {
>> >>     struct vtn_value *val;
>> >> -   if (type->base_type == vtn_base_type_pointer) {
>> >> +   if (type->base_type == vtn_base_type_pointer ||
>> >> +       type->base_type == vtn_base_type_raw_pointer) {
>> >>        val = vtn_push_value(b, value_id, vtn_value_type_pointer);
>> >>        val->pointer = vtn_pointer_from_ssa(b, ssa->def, type);
>> >>     } else {
>> >> @@ -688,6 +699,11 @@ struct vtn_ssa_value *vtn_local_load(struct
>> >> vtn_builder *b, nir_deref_var *src);
>> >>  void vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src,
>> >>                       nir_deref_var *dest);
>> >>
>> >> +struct vtn_ssa_value *vtn_pointer_load(struct vtn_builder *b,
>> >> +                                       struct vtn_pointer *ptr);
>> >> +void vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value
>> >> *src,
>> >> +                       struct vtn_pointer *ptr);
>> >> +
>> >>  struct vtn_ssa_value *
>> >>  vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src);
>> >>
>> >> diff --git a/src/compiler/spirv/vtn_variables.c
>> >> b/src/compiler/spirv/vtn_variables.c
>> >> index b2897407fb1..af9222d6f4e 100644
>> >> --- a/src/compiler/spirv/vtn_variables.c
>> >> +++ b/src/compiler/spirv/vtn_variables.c
>> >> @@ -51,6 +51,9 @@ vtn_access_chain_extend(struct vtn_builder *b, struct
>> >> vtn_access_chain *old,
>> >>     unsigned old_len = old ? old->length : 0;
>> >>     chain = vtn_access_chain_create(b, old_len + new_ids);
>> >>
>> >> +   if (old)
>> >> +      chain->ptr_as_array = old->ptr_as_array;
>> >> +
>> >>     for (unsigned i = 0; i < old_len; i++)
>> >>        chain->link[i] = old->link[i];
>> >>
>> >> @@ -88,11 +91,12 @@ vtn_access_chain_pointer_dereference(struct
>> >> vtn_builder *b,
>> >>        vtn_access_chain_extend(b, base->chain, deref_chain->length);
>> >>     struct vtn_type *type = base->type;
>> >>
>> >> -   /* OpPtrAccessChain is only allowed on things which support
>> >> variable
>> >> -    * pointers.  For everything else, the client is expected to just
>> >> pass
>> >> us
>> >> -    * the right access chain.
>> >> +   /* We can get an *PtrAccessChain with cl kernels, with first
>> >> element
>> >> +    * of deref chain being literal zero.  But we can't really cope w/
>> >> +    * tacking on another level of ptr_as_array.
>> >>      */
>> >> -   vtn_assert(!deref_chain->ptr_as_array);
>> >> +   vtn_assert(!(chain->ptr_as_array && deref_chain->ptr_as_array));
>> >> +   chain->ptr_as_array = deref_chain->ptr_as_array;
>> >>
>> >>     unsigned start = base->chain ? base->chain->length : 0;
>> >>     for (unsigned i = 0; i < deref_chain->length; i++) {
>> >> @@ -135,6 +139,21 @@ vtn_access_link_as_ssa(struct vtn_builder *b,
>> >> struct
>> >> vtn_access_link link,
>> >>     }
>> >>  }
>> >>
>> >> +static struct vtn_access_link
>> >> +vtn_to_access_link(struct vtn_builder *b, uint32_t link_id)
>> >> +{
>> >> +   struct vtn_value *link_val = vtn_untyped_value(b, link_id);
>> >> +   static struct vtn_access_link link;
>> >> +   if (link_val->value_type == vtn_value_type_constant) {
>> >> +      link.mode = vtn_access_mode_literal;
>> >> +      link.id = link_val->constant->values[0].u32[0];
>> >> +   } else {
>> >> +      link.mode = vtn_access_mode_id;
>> >> +      link.id = link_id;
>> >> +   }
>> >> +   return link;
>> >> +}
>> >> +
>> >>  static nir_ssa_def *
>> >>  vtn_variable_resource_index(struct vtn_builder *b, struct vtn_variable
>> >> *var,
>> >>                              nir_ssa_def *desc_array_index)
>> >> @@ -207,6 +226,12 @@ vtn_ssa_offset_pointer_dereference(struct
>> >> vtn_builder
>> >> *b,
>> >>              /* You can't have a zero-length OpPtrAccessChain */
>> >>              vtn_assert(deref_chain->length >= 1);
>> >>              desc_arr_idx = vtn_access_link_as_ssa(b,
>> >> deref_chain->link[0], 1);
>> >> +         } else if (!glsl_type_is_struct(type->type)) {
>> >> +            /* if we have something that is ptr to simple type, like
>> >> an
>> >> +             * int ptr, then just treat that like an array of length
>> >> 1,
>> >> +             * we just want to deref the zero'th element:
>> >> +             */
>> >> +            desc_arr_idx = nir_imm_int(&b->nb, 0);
>> >>           } else {
>> >>              /* We have a regular non-array SSBO. */
>> >>              desc_arr_idx = NULL;
>> >> @@ -304,7 +329,9 @@ vtn_ssa_offset_pointer_dereference(struct
>> >> vtn_builder
>> >> *b,
>> >>        case GLSL_TYPE_FLOAT:
>> >>        case GLSL_TYPE_FLOAT16:
>> >>        case GLSL_TYPE_DOUBLE:
>> >> -      case GLSL_TYPE_BOOL:
>> >> +      case GLSL_TYPE_BOOL: {
>> >> +         break;
>> >> +      }
>> >>        case GLSL_TYPE_ARRAY: {
>> >>           nir_ssa_def *elem_offset =
>> >>              vtn_access_link_as_ssa(b, deref_chain->link[idx],
>> >> type->stride);
>> >> @@ -336,6 +363,98 @@ vtn_ssa_offset_pointer_dereference(struct
>> >> vtn_builder
>> >> *b,
>> >>     return ptr;
>> >>  }
>> >>
>> >> +static struct vtn_pointer *
>> >> +vtn_ssa_raw_pointer_dereference(struct vtn_builder *b,
>> >> +                                struct vtn_pointer *base,
>> >> +                                struct vtn_access_chain *deref_chain)
>> >> +{
>> >> +   nir_ssa_def *offset = nir_imm_int(&b->nb, 0);
>> >> +   struct vtn_type *type = base->type;
>> >> +
>> >> +
>> >> +   vtn_assert(type);
>> >> +
>> >> +   unsigned idx = 0;
>> >> +
>> >> +   /* For the *PtrAccessChain deref instructions, the first entry in
>> >> +    * the deref chain is deref'ing the ptr as an array, ie. if you
>> >> +    * have "struct foo *f" then it can be deref'd as either "f->blah",
>> >> +    * which is equiv to "f[0].blah", or "f[n].blah".
>> >> +    */
>> >> +   if (deref_chain->ptr_as_array) {
>> >> +      nir_ssa_def *elem_offset =
>> >> +         vtn_access_link_as_ssa(b, deref_chain->link[idx],
>> >> type->stride);
>> >> +      offset = nir_iadd(&b->nb, offset, elem_offset);
>> >> +      idx++;
>> >> +   }
>> >> +
>> >> +   /* TODO this can probably be refactored out into helper?  Nearly
>> >> +    * the same as in vtn_ssa_offset_pointer_dereference() and perhaps
>> >> +    * elsewhere..
>> >> +    *
>> >> +    * Note that the type here in the switch is of what the pointer
>> >> +    * points to, ie. if the base type is uint, we are dereferencing
>> >> +    * (possibly as an array) a uint* pointer.
>> >> +    */
>> >> +   for (; idx < deref_chain->length; idx++) {
>> >> +      switch (glsl_get_base_type(type->type)) {
>> >> +      case GLSL_TYPE_UINT:
>> >> +      case GLSL_TYPE_INT:
>> >> +      case GLSL_TYPE_UINT16:
>> >> +      case GLSL_TYPE_INT16:
>> >> +      case GLSL_TYPE_UINT8:
>> >> +      case GLSL_TYPE_INT8:
>> >> +      case GLSL_TYPE_UINT64:
>> >> +      case GLSL_TYPE_INT64:
>> >> +      case GLSL_TYPE_FLOAT:
>> >> +      case GLSL_TYPE_FLOAT16:
>> >> +      case GLSL_TYPE_DOUBLE:
>> >> +      case GLSL_TYPE_BOOL:
>> >> +      case GLSL_TYPE_ARRAY: {
>> >> +         nir_ssa_def *elem_offset =
>> >> +            vtn_access_link_as_ssa(b, deref_chain->link[idx],
>> >> type->stride);
>> >> +         offset = nir_iadd(&b->nb, offset, elem_offset);
>> >> +         if (type->array_element) {
>> >> +            type = type->array_element;
>> >> +         } else {
>> >> +            vtn_assert(idx == (deref_chain->length - 1));
>> >> +         }
>> >> +         break;
>> >> +      }
>> >> +
>> >> +      case GLSL_TYPE_STRUCT: {
>> >> +         vtn_assert(deref_chain->link[idx].mode ==
>> >> vtn_access_mode_literal);
>> >> +         unsigned member = deref_chain->link[idx].id;
>> >> +         nir_ssa_def *mem_offset = nir_imm_int(&b->nb,
>> >> type->offsets[member]);
>> >> +         offset = nir_iadd(&b->nb, offset, mem_offset);
>> >> +         type = type->members[member];
>> >> +         break;
>> >> +      }
>> >> +
>> >> +      default:
>> >> +         vtn_fail("Invalid type for deref");
>> >> +      }
>> >> +   }
>> >> +
>> >> +   /* at this point, offset is 32b, but pointer can be either 32b or
>> >> 64b
>> >> +    * depending on memory model:
>> >> +    */
>> >> +   if (b->ptr_size == 64) {
>> >> +      offset = nir_u2u64(&b->nb, offset);
>> >> +   }
>> >> +
>> >> +   /* add pointer address to calculated offset: */
>> >> +   if (base->offset)
>> >> +      offset = nir_iadd(&b->nb, offset, base->offset);
>> >> +
>> >> +   struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
>> >> +   ptr->mode = base->mode;
>> >> +   ptr->type = base->type;
>> >> +   ptr->offset = offset;
>> >> +
>> >> +   return ptr;
>> >> +}
>> >> +
>> >>  /* Dereference the given base pointer by the access chain */
>> >>  static struct vtn_pointer *
>> >>  vtn_pointer_dereference(struct vtn_builder *b,
>> >> @@ -344,6 +463,8 @@ vtn_pointer_dereference(struct vtn_builder *b,
>> >>  {
>> >>     if (vtn_pointer_uses_ssa_offset(b, base)) {
>> >>        return vtn_ssa_offset_pointer_dereference(b, base, deref_chain);
>> >> +   } else if (base->ptr_type->base_type == vtn_base_type_raw_pointer)
>> >> {
>> >> +      return vtn_ssa_raw_pointer_dereference(b, base, deref_chain);
>> >>     } else {
>> >>        return vtn_access_chain_pointer_dereference(b, base,
>> >> deref_chain);
>> >>     }
>> >> @@ -373,7 +494,8 @@ vtn_pointer_for_variable(struct vtn_builder *b,
>> >>
>> >>     pointer->mode = var->mode;
>> >>     pointer->type = var->type;
>> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
>> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
>> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
>> >>     vtn_assert(ptr_type->deref->type == var->type->type);
>> >>     pointer->ptr_type = ptr_type;
>> >>     pointer->var = var;
>> >> @@ -384,6 +506,12 @@ vtn_pointer_for_variable(struct vtn_builder *b,
>> >>  nir_deref_var *
>> >>  vtn_pointer_to_deref(struct vtn_builder *b, struct vtn_pointer *ptr)
>> >>  {
>> >> +   /* once you've chased a pointer, you no longer really have a var,
>> >> +    * so this case is handled differently
>> >> +    */
>> >> +   if (!ptr->var)
>> >> +      return NULL;
>> >> +
>> >>     /* Do on-the-fly copy propagation for samplers. */
>> >>     if (ptr->var->copy_prop_sampler)
>> >>        return vtn_pointer_to_deref(b, ptr->var->copy_prop_sampler);
>> >> @@ -408,7 +536,19 @@ vtn_pointer_to_deref(struct vtn_builder *b, struct
>> >> vtn_pointer *ptr)
>> >>     nir_deref *tail = &deref_var->deref;
>> >>     nir_variable **members = ptr->var->members;
>> >>
>> >> -   for (unsigned i = 0; i < chain->length; i++) {
>> >> +   unsigned i = 0;
>> >> +   if (chain->ptr_as_array) {
>> >> +      /* We can't currently handle this in a nir deref chain.  Perhaps
>> >> +       * a new type of deref node is needed?  But we can at least
>> >> handle
>> >> +       * the case where the first deref link is literal zero by
>> >> skipping
>> >> +       * it.
>> >> +       */
>> >> +      vtn_assert(chain->link[0].mode == vtn_access_mode_literal);
>> >> +      vtn_assert(chain->link[0].id == 0);
>> >> +      i++;
>> >> +   }
>> >> +
>> >> +   for (; i < chain->length; i++) {
>> >>        enum glsl_base_type base_type =
>> >> glsl_get_base_type(deref_type->type);
>> >>        switch (base_type) {
>> >>        case GLSL_TYPE_UINT:
>> >> @@ -598,6 +738,49 @@ vtn_local_store(struct vtn_builder *b, struct
>> >> vtn_ssa_value *src,
>> >>     }
>> >>  }
>> >>
>> >> +struct vtn_ssa_value *
>> >> +vtn_pointer_load(struct vtn_builder *b, struct vtn_pointer *ptr)
>> >> +{
>> >> +   const struct glsl_type *type = ptr->type->type;
>> >> +   struct vtn_ssa_value *val = vtn_create_ssa_value(b, type);
>> >> +   nir_intrinsic_op op = nir_intrinsic_load_global;
>> >> +
>> >> +   vtn_assert(ptr->offset);
>> >> +
>> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader,
>> >> op);
>> >> +   intrin->num_components = glsl_get_vector_elements(type);
>> >> +   intrin->src[0] = nir_src_for_ssa(ptr->offset);
>> >> +
>> >> +   nir_ssa_dest_init(&intrin->instr, &intrin->dest,
>> >> +                     intrin->num_components,
>> >> +                     glsl_get_bit_size(type),
>> >> +                     NULL);
>> >> +   val->def = &intrin->dest.ssa;
>> >> +
>> >> +   nir_builder_instr_insert(&b->nb, &intrin->instr);
>> >> +
>> >> +   return val;
>> >> +}
>> >> +
>> >> +void
>> >> +vtn_pointer_store(struct vtn_builder *b, struct vtn_ssa_value *src,
>> >> +                  struct vtn_pointer *ptr)
>> >> +{
>> >> +   const struct glsl_type *type = ptr->type->type;
>> >> +   nir_intrinsic_op op = nir_intrinsic_store_global;
>> >> +
>> >> +   vtn_assert(ptr->offset);
>> >> +
>> >> +   nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader,
>> >> op);
>> >> +   intrin->num_components = MAX2(1, glsl_get_vector_elements(type));
>> >> +   intrin->src[0] = nir_src_for_ssa(src->def);
>> >> +   intrin->src[1] = nir_src_for_ssa(ptr->offset);
>> >> +
>> >> +   nir_intrinsic_set_write_mask(intrin, (1 << intrin->num_components)
>> >> -
>> >> 1);
>> >> +
>> >> +   nir_builder_instr_insert(&b->nb, &intrin->instr);
>> >> +}
>> >> +
>> >>  nir_ssa_def *
>> >>  vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr,
>> >>                        nir_ssa_def **index_out, unsigned *end_idx_out)
>> >> @@ -1004,20 +1187,33 @@ _vtn_variable_load_store(struct vtn_builder *b,
>> >> bool load,
>> >>     case GLSL_TYPE_FLOAT:
>> >>     case GLSL_TYPE_FLOAT16:
>> >>     case GLSL_TYPE_BOOL:
>> >> -   case GLSL_TYPE_DOUBLE:
>> >> -      /* At this point, we have a scalar, vector, or matrix so we know
>> >> that
>> >> -       * there cannot be any structure splitting still in the way.  By
>> >> -       * stopping at the matrix level rather than the vector level, we
>> >> -       * ensure that matrices get loaded in the optimal way even if
>> >> they
>> >> -       * are storred row-major in a UBO.
>> >> -       */
>> >> -      if (load) {
>> >> -         *inout = vtn_local_load(b, vtn_pointer_to_deref(b, ptr));
>> >> +   case GLSL_TYPE_DOUBLE: {
>> >> +      nir_deref_var *deref_var = vtn_pointer_to_deref(b, ptr);
>> >> +
>> >> +      if (deref_var) {
>> >> +         /* At this point, we have a scalar, vector, or matrix so we
>> >> know
>> >> that
>> >> +          * there cannot be any structure splitting still in the way.
>> >> By
>> >> +          * stopping at the matrix level rather than the vector level,
>> >> we
>> >> +          * ensure that matrices get loaded in the optimal way even if
>> >> they
>> >> +          * are storred row-major in a UBO.
>> >> +          */
>> >> +         if (load) {
>> >> +            *inout = vtn_local_load(b, deref_var);
>> >> +         } else {
>> >> +            vtn_local_store(b, *inout, deref_var);
>> >> +         }
>> >>        } else {
>> >> -         vtn_local_store(b, *inout, vtn_pointer_to_deref(b, ptr));
>> >> +         /* If deref'ing a raw pointer, we don't have a variable to go
>> >> along
>> >> +          * with it.  Just directly generate load/store_global
>> >> intrinsics:
>> >> +          */
>> >> +         if (load) {
>> >> +            *inout = vtn_pointer_load(b, ptr);
>> >> +         } else {
>> >> +            vtn_pointer_store(b, *inout, ptr);
>> >> +         }
>> >>        }
>> >>        return;
>> >> -
>> >> +   }
>> >>     case GLSL_TYPE_ARRAY:
>> >>     case GLSL_TYPE_STRUCT: {
>> >>        unsigned elems = glsl_get_length(ptr->type->type);
>> >> @@ -1052,6 +1248,8 @@ vtn_variable_load(struct vtn_builder *b, struct
>> >> vtn_pointer *src)
>> >>  {
>> >>     if (vtn_pointer_is_external_block(b, src)) {
>> >>        return vtn_block_load(b, src);
>> >> +   } else if (!src->var) {
>> >> +      return vtn_pointer_load(b, src);
>> >>     } else {
>> >>        struct vtn_ssa_value *val = NULL;
>> >>        _vtn_variable_load_store(b, true, src, &val);
>> >> @@ -1065,8 +1263,11 @@ vtn_variable_store(struct vtn_builder *b, struct
>> >> vtn_ssa_value *src,
>> >>  {
>> >>     if (vtn_pointer_is_external_block(b, dest)) {
>> >>        vtn_assert(dest->mode == vtn_variable_mode_ssbo ||
>> >> -                 dest->mode == vtn_variable_mode_workgroup);
>> >> +                 dest->mode == vtn_variable_mode_workgroup ||
>> >> +                 dest->mode == vtn_variable_mode_param);
>> >>        vtn_block_store(b, src, dest);
>> >> +   } else if (!dest->var) {
>> >> +      vtn_pointer_store(b, src, dest);
>> >>     } else {
>> >>        _vtn_variable_load_store(b, false, dest, &src);
>> >>     }
>> >> @@ -1608,6 +1809,7 @@ static enum vtn_variable_mode
>> >>  vtn_storage_class_to_mode(struct vtn_builder *b,
>> >>                            SpvStorageClass class,
>> >>                            struct vtn_type *interface_type,
>> >> +                          bool has_initializer,
>> >>                            nir_variable_mode *nir_mode_out)
>> >>  {
>> >>     enum vtn_variable_mode mode;
>> >> @@ -1635,8 +1837,12 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
>> >>        } else if (glsl_type_is_sampler(interface_type->type)) {
>> >>           mode = vtn_variable_mode_sampler;
>> >>           nir_mode = nir_var_uniform;
>> >> +      } else if (has_initializer) {
>> >> +         mode = vtn_variable_mode_const;
>> >> +         nir_mode = nir_var_global;
>> >>        } else {
>> >> -         vtn_fail("Invalid uniform constant variable type");
>> >> +         mode = vtn_variable_mode_push_constant;
>> >> +         nir_mode = nir_var_uniform;
>> >>        }
>> >>        break;
>> >>     case SpvStorageClassPushConstant:
>> >> @@ -1664,6 +1870,10 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
>> >>        nir_mode = nir_var_shared;
>> >>        break;
>> >>     case SpvStorageClassCrossWorkgroup:
>> >> +      mode = vtn_variable_mode_cross_workgroup;
>> >> +      // TODO do we want a new nir_mode for raw pointers?
>> >> +      nir_mode = nir_var_global;
>> >> +      break;
>> >>     case SpvStorageClassGeneric:
>> >>     case SpvStorageClassAtomicCounter:
>> >>     default:
>> >> @@ -1692,7 +1902,14 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct
>> >> vtn_pointer *ptr)
>> >>        struct vtn_access_chain chain = {
>> >>           .length = 0,
>> >>        };
>> >> -      ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
>> >> +
>> >> +      if (vtn_pointer_uses_ssa_offset(b, ptr)) {
>> >> +         ptr = vtn_ssa_offset_pointer_dereference(b, ptr, &chain);
>> >> +      } else if (ptr->ptr_type->base_type ==
>> >> vtn_base_type_raw_pointer) {
>> >> +         ptr = vtn_ssa_raw_pointer_dereference(b, ptr, &chain);
>> >> +      } else {
>> >> +         vtn_fail("unhandled");
>> >> +      }
>> >>     }
>> >>
>> >>     vtn_assert(ptr->offset);
>> >> @@ -1710,27 +1927,34 @@ struct vtn_pointer *
>> >>  vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
>> >>                       struct vtn_type *ptr_type)
>> >>  {
>> >> -   vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
>> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
>> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
>> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
>> >>     vtn_assert(ptr_type->deref->base_type != vtn_base_type_pointer);
>> >>     /* This pointer type needs to have actual storage */
>> >>     vtn_assert(ptr_type->type);
>> >>
>> >>     struct vtn_pointer *ptr = rzalloc(b, struct vtn_pointer);
>> >>     ptr->mode = vtn_storage_class_to_mode(b, ptr_type->storage_class,
>> >> -                                         ptr_type, NULL);
>> >> +                                         ptr_type, false, NULL);
>> >> +
>> >> +   if (ptr_type->base_type == vtn_base_type_raw_pointer) {
>> >> +      vtn_assert(ssa->num_components == 1 && ssa->bit_size ==
>> >> b->ptr_size);
>> >> +   } else {
>> >> +      vtn_assert(ssa->num_components <= 2 && ssa->bit_size == 32);
>> >> +   }
>> >> +
>> >>     ptr->type = ptr_type->deref;
>> >>     ptr->ptr_type = ptr_type;
>> >>
>> >>     if (ssa->num_components > 1) {
>> >>        vtn_assert(ssa->num_components == 2);
>> >>        vtn_assert(ptr->mode == vtn_variable_mode_ubo ||
>> >> -                 ptr->mode == vtn_variable_mode_ssbo);
>> >> +                 ptr->mode == vtn_variable_mode_ssbo ||
>> >> +                 ptr->mode == vtn_variable_mode_global);
>> >>        ptr->block_index = nir_channel(&b->nb, ssa, 0);
>> >>        ptr->offset = nir_channel(&b->nb, ssa, 1);
>> >>     } else {
>> >>        vtn_assert(ssa->num_components == 1);
>> >> -      vtn_assert(ptr->mode == vtn_variable_mode_workgroup);
>> >>        ptr->block_index = NULL;
>> >>        ptr->offset = ssa;
>> >>     }
>> >> @@ -1761,7 +1985,8 @@ vtn_create_variable(struct vtn_builder *b, struct
>> >> vtn_value *val,
>> >>                      struct vtn_type *ptr_type, SpvStorageClass
>> >> storage_class,
>> >>                      nir_constant *initializer)
>> >>  {
>> >> -   vtn_assert(ptr_type->base_type == vtn_base_type_pointer);
>> >> +   vtn_assert(ptr_type->base_type == vtn_base_type_pointer ||
>> >> +              ptr_type->base_type == vtn_base_type_raw_pointer);
>> >>     struct vtn_type *type = ptr_type->deref;
>> >>
>> >>     struct vtn_type *without_array = type;
>> >> @@ -1770,7 +1995,8 @@ vtn_create_variable(struct vtn_builder *b, struct
>> >> vtn_value *val,
>> >>
>> >>     enum vtn_variable_mode mode;
>> >>     nir_variable_mode nir_mode;
>> >> -   mode = vtn_storage_class_to_mode(b, storage_class, without_array,
>> >> &nir_mode);
>> >> +   mode = vtn_storage_class_to_mode(b, storage_class, without_array,
>> >> +                                    !!initializer, &nir_mode);
>> >>
>> >>     switch (mode) {
>> >>     case vtn_variable_mode_ubo:
>> >> @@ -1933,6 +2159,7 @@ vtn_create_variable(struct vtn_builder *b, struct
>> >> vtn_value *val,
>> >>     case vtn_variable_mode_ubo:
>> >>     case vtn_variable_mode_ssbo:
>> >>     case vtn_variable_mode_push_constant:
>> >> +   case vtn_variable_mode_cross_workgroup:
>> >>        /* These don't need actual variables. */
>> >>        break;
>> >>     }
>> >> @@ -2031,25 +2258,18 @@ vtn_handle_variables(struct vtn_builder *b,
>> >> SpvOp
>> >> opcode,
>> >>     case SpvOpAccessChain:
>> >>     case SpvOpPtrAccessChain:
>> >>     case SpvOpInBoundsAccessChain: {
>> >> +      struct vtn_type *ptr_type = vtn_value(b, w[1],
>> >> vtn_value_type_type)->type;
>> >> +      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
>> >>        struct vtn_access_chain *chain = vtn_access_chain_create(b,
>> >> count -
>> >> 4);
>> >> -      chain->ptr_as_array = (opcode == SpvOpPtrAccessChain);
>> >> +      chain->ptr_as_array = (opcode == SpvOpPtrAccessChain) ||
>> >> +                            (opcode == SpvOpInBoundsPtrAccessChain);
>> >>
>> >>        unsigned idx = 0;
>> >>        for (int i = 4; i < count; i++) {
>> >> -         struct vtn_value *link_val = vtn_untyped_value(b, w[i]);
>> >> -         if (link_val->value_type == vtn_value_type_constant) {
>> >> -            chain->link[idx].mode = vtn_access_mode_literal;
>> >> -            chain->link[idx].id =
>> >> link_val->constant->values[0].u32[0];
>> >> -         } else {
>> >> -            chain->link[idx].mode = vtn_access_mode_id;
>> >> -            chain->link[idx].id = w[i];
>> >> -
>> >> -         }
>> >> +         chain->link[idx] = vtn_to_access_link(b, w[i]);
>> >>           idx++;
>> >>        }
>> >>
>> >> -      struct vtn_type *ptr_type = vtn_value(b, w[1],
>> >> vtn_value_type_type)->type;
>> >> -      struct vtn_value *base_val = vtn_untyped_value(b, w[3]);
>> >>        if (base_val->value_type == vtn_value_type_sampled_image) {
>> >>           /* This is rather insane.  SPIR-V allows you to use
>> >> OpSampledImage
>> >>            * to combine an array of images with a single sampler to get
>> >> an
>> >> --
>> >> 2.14.3
>> >>
>> >> _______________________________________________
>> >> mesa-dev mailing list
>> >> mesa-dev@lists.freedesktop.org
>> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> >
>> >
>
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 06/19] RFC: nir/vtn: "raw" pointer support

Reply via email to