On Fri, Feb 23, 2018 at 12:28 PM, Chema Casanova <jmcasan...@igalia.com> wrote:
> > > El 23/02/18 a las 17:26, Jason Ekstrand escribió: > > On Fri, Feb 23, 2018 at 1:26 AM, Jose Maria Casanova Crespo > > <jmcasan...@igalia.com <mailto:jmcasan...@igalia.com>> wrote: > > > > The surfaces that backup the GPU buffers have a boundary check that > > considers that access to partial dwords are considered out-of-bounds. > > For example is basic 16-bit cases of buffers with size 2 or 6 where > the > > last two bytes will always be read as 0 or its writting ignored. > > > > The introduction of 16-bit types implies that we need to align the > size > > to 4-bytes multiples so that partial dwords could be read/written. > > Adding an inconditional +2 size to buffers not being multiple of 2 > > solves this issue for the general cases of UBO or SSBO. > > > > But, when unsized_arrays of 16-bit elements are used it is not > possible > > to know if the size was padded or not. To solve this issue the > > implementation of SSBO calculates the needed size of the surface, > > as suggested by Jason: > > > > surface_size = 2 * aling_u64(buffer_size, 4) - buffer_size > > > > So when we calculate backwards the SpvOpArrayLenght with a nir > expresion > > when the array stride is not multiple of 4. > > > > array_size = (surface_size & ~3) - (surface_size & 3) > > > > It is also exposed this buffer requirements when robust buffer access > > is enabled so these buffer sizes recommend being multiple of 4. > > --- > > > > I have some doubts if vtn_variables.c is the best place to include > > this specific to calculate the real buffer size as this is new > > calculus seems to be quite HW dependent and maybe other drivers > > different > > to ANV don't need this kind of solution. > > > > src/compiler/spirv/vtn_variables.c | 14 ++++++++++++++ > > src/intel/vulkan/anv_descriptor_set.c | 16 ++++++++++++++++ > > src/intel/vulkan/anv_device.c | 11 +++++++++++ > > 3 files changed, 41 insertions(+) > > > > diff --git a/src/compiler/spirv/vtn_variables.c > > b/src/compiler/spirv/vtn_variables.c > > index 9eb85c24e9..78adab3ed2 100644 > > --- a/src/compiler/spirv/vtn_variables.c > > +++ b/src/compiler/spirv/vtn_variables.c > > @@ -2113,6 +2113,20 @@ vtn_handle_variables(struct vtn_builder *b, > > SpvOp opcode, > > nir_builder_instr_insert(&b->nb, &instr->instr); > > nir_ssa_def *buf_size = &instr->dest.ssa; > > > > + /* Calculate real length if padding was done to align the > buffer > > + * to 32-bits. This only could happen is stride is not > multiple > > + * of 4. Introduced to support 16-bit type unsized arrays in > anv. > > + */ > > + if (stride % 4) { > > + buf_size = nir_isub(&b->nb, > > + nir_iand(&b->nb, > > + buf_size, > > + nir_imm_int(&b->nb, ~3)), > > + nir_iand (&b->nb, > > + buf_size, > > + nir_imm_int(&b->nb, 3))); > > > > > > We can't do this in spirv_to_nir as it's also used by radv and they may > > not have the same issue. Instead, we need to handle it either in > > anv_nir_apply_pipeline_layout or in the back-end compiler. Doing it > > here has the advantage that we can only do it in the "stride % 4 != 0" > > case but I don't think the three instructions are all that big of a deal > > given that we just did a send and are about to do an integer divide. My > > preference would be to put most of it in ISL and the back-end compiler > > if we can. > > I've already had my doubts in my commit comment. So I'll implement it > properly in the backend implementation of nir_intrinsic_get_buffer_size. > I should have a look to that code before. > > > > > + } > > + > > /* array_length = max(buffer_size - offset, 0) / stride */ > > nir_ssa_def *array_length = > > nir_idiv(&b->nb, > > diff --git a/src/intel/vulkan/anv_descriptor_set.c > > b/src/intel/vulkan/anv_descriptor_set.c > > index edb829601e..a97f2f37dc 100644 > > --- a/src/intel/vulkan/anv_descriptor_set.c > > +++ b/src/intel/vulkan/anv_descriptor_set.c > > @@ -704,6 +704,22 @@ anv_descriptor_set_write_buffer(struct > > anv_descriptor_set *set, > > bview->offset = buffer->offset + offset; > > bview->range = anv_buffer_get_range(buffer, offset, range); > > > > + /* Uniform and Storage buffers need to have surface size > > + * not less that the aligned 32-bit size of the buffer. > > + * To calculate the array lenght on unsized arrays > > + * in StorageBuffer the last 2 bits store the padding size > > + * added to the surface, so we can calculate latter the > original > > + * buffer size to know the number of elements. > > + * > > + * surface_size = 2 * aling_u64(buffer_size, 4) - > buffer_size > > + * > > + * array_size = (surface_size & ~3) - (surface_size & 3) > > + */ > > + if (type == VK_DESCRIPTOR_TYPE_STORAGE_BUFFER) > > + bview->range = 2 * align_u64(bview->range, 4) - > bview->range; > > + else if (type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER) > > + bview->range = align_u64(bview->range, 4); > > > We chatted on another e-mail about doing this in ISL instead of here but > > you mentioned issues with failing GL tests. Did you ever get to the > > bottom of those? > > Yes I've already implemented that way but it was originally creating 10 > regressions in Jenkins on DEQP in tests like > dEQP-GLES31.functional.texture.texture_buffer.modify. > bufferdata.buffer_size_131071 > > > Although it was not possible to reproduce it locally (it was a kind of > poltergeist) I hate it when that happens. > it was consistent as I was messing with the texture size of > the buffer. > > So the solution was add extra restrictions to the size alignment padding > to detect UniformBuffers only when next 3 conditions matched: > - isl_format was ISL_FORMAT_R32G32B32A32_FLOAT > - stride = 1 > - size % 4 != 0 > I think that mechanism is more-or-less fine. You could also use a condition like this: if (format == ISL_FORMAT_RAW || stride < isl_format_get_layout(format)->bpb / 8) { assert(stride == 1); /* Do the math */ } That would enable it only for UBOs SSBOs and storage texel buffers where we don't have a matching typed format. Also, it would prevent us from changing num_elements in any case other than stride == 1. > But when working testing this solution I realized that for Storage > Buffers there was a another entry point where I couldn't differentiate > from an Storage Buffer at anv_image.c for buffer usage of > VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT. at is uses stride = 1 and > ISL_RAW format. > > So in order to avoid messing with the sizes in that case I though it was > better to move the padding decision to the place I could differentiate > it correctly. Maybe it was an unfounded doubt and it wouldn't mind to > modify the size in that VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT case. > > If so, what was the conclusion? > > So after re-thinking aloud, and if my doubts with > VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT is unfounded I think I'll > recover ISL approach patch, as this would be needed any case for mediump > support. > I wouldn't worry about those. The only time we will use SURFTYPE_RAW for storage texel buffers is when we don't have a matching typed format which only happens for formats which are at least 32bit and so the size will be a multiple of 4 anyway. --Jason > > + > > /* If we're writing descriptors through a push command, we > > need to > > * allocate the surface state from the command buffer. > > Otherwise it will > > * be allocated by the descriptor pool when calling > > diff --git a/src/intel/vulkan/anv_device.c > > b/src/intel/vulkan/anv_device.c > > index a83b7a39f6..cedeed5621 100644 > > --- a/src/intel/vulkan/anv_device.c > > +++ b/src/intel/vulkan/anv_device.c > > @@ -2103,6 +2103,17 @@ void anv_GetBufferMemoryRequirements( > > > > pMemoryRequirements->size = buffer->size; > > pMemoryRequirements->alignment = alignment; > > + > > + /* Storage and Uniform buffers should have their size aligned to > > + * 32-bits to avoid boundary checks when last DWord is not > complete. > > + * This would ensure that not internal padding would be needed > for > > + * 16-bit types. > > + */ > > + if (device->robust_buffer_access && > > + (buffer->usage & VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT || > > + buffer->usage & VK_BUFFER_USAGE_STORAGE_BUFFER_BIT)) > > + pMemoryRequirements->size = align_u64(buffer->size, 4); > > + > > pMemoryRequirements->memoryTypeBits = memory_types; > > } > > > > -- > > 2.14.3 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists. > freedesktop.org> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > <https://lists.freedesktop.org/mailman/listinfo/mesa-dev> > > > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev