On 12/18/18 11:18 AM, Ilia Mirkin wrote: > Nv_image_formats adds all the core formats to gles. > > It's pretty easy to add them all in, so I don't see why not. >
Ok, I didn't know about that extension, thanks for the pointer. It then makes sense to support all these formats, yes. I will submit an v2 including them. Thanks, Ilia. > On Tue, Dec 18, 2018, 03:17 Eduardo Lima Mitev <el...@igalia.com > <mailto:el...@igalia.com> wrote: > > > > On 12/18/18 9:05 AM, Eduardo Lima Mitev wrote: > > On 12/17/18 10:02 PM, Ilia Mirkin wrote: > >> Note that the format may not be known. I suspect that falls into your > >> "default" case. > >> > > > > Hi Ilia, > > > > while on GLES profiles the format qualifier must be declared for all > > image variables, it is true that core profiles allow to omit it for > > writeonly-qualified images. > > > > I guess we can add a special case for GL_NONE to the switch, that also > > returns 4 components, to at least not break current behavior if image > > format is not known. What do you think? > > > > Thinking a bit more about this, if we care about core profile then there > are more GL formats to consider apart from those supported by GLES 3.1. > > However, since image format layout qualifiers were introduced in core > 4.20, I don't think we should care for now. So I would leave the patch > as-is and fail the assert on unknown formats. > > Eduardo > > > Eduardo > > > >> On Mon, Dec 17, 2018 at 3:41 PM Eduardo Lima Mitev > <el...@igalia.com <mailto:el...@igalia.com>> wrote: > >>> > >>> emit_intrinsic_store_image() is always using 4 components when > >>> collecting registers for the value. When image has less than > >>> 4 components (e.g, r32f, r32i, r32ui) this results in extra mov > >>> instructions. > >>> > >>> This patch uses the actual number of components from the image > format. > >>> > >>> For example, in a shader like: > >>> > >>> layout (r32f, binding=0) writeonly uniform imageBuffer u_image; > >>> ... > >>> void main(void) { > >>> ... > >>> imageStore (u_image, some_offset, vec4(1.0)); > >>> ... > >>> } > >>> > >>> instruction count is reduced in at least 3 instructions (note image > >>> format is r32f, 1 component only). > >>> > >>> This obviously reduces register pressure as well. > >>> --- > >>> src/freedreno/ir3/ir3_compiler_nir.c | 34 > ++++++++++++++++++++++++++-- > >>> 1 file changed, 32 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/src/freedreno/ir3/ir3_compiler_nir.c > b/src/freedreno/ir3/ir3_compiler_nir.c > >>> index 85f14f354d2..cc00602c249 100644 > >>> --- a/src/freedreno/ir3/ir3_compiler_nir.c > >>> +++ b/src/freedreno/ir3/ir3_compiler_nir.c > >>> @@ -1251,6 +1251,35 @@ emit_intrinsic_load_image(struct > ir3_context *ctx, nir_intrinsic_instr *intr, > >>> ir3_split_dest(b, dst, sam, 0, 4); > >>> } > >>> > >>> +/* Get the number of components of the different image formats > supported > >>> + * by the GLES 3.1 spec. > >>> + */ > >>> +static unsigned > >>> +get_num_components_for_glformat(GLuint format) > >>> +{ > >>> + switch (format) { > >>> + case GL_R32F: > >>> + case GL_R32I: > >>> + case GL_R32UI: > >>> + return 1; > >>> + > >>> + case GL_RGBA32F: > >>> + case GL_RGBA16F: > >>> + case GL_RGBA8: > >>> + case GL_RGBA8_SNORM: > >>> + case GL_RGBA32I: > >>> + case GL_RGBA16I: > >>> + case GL_RGBA8I: > >>> + case GL_RGBA32UI: > >>> + case GL_RGBA16UI: > >>> + case GL_RGBA8UI: > >>> + return 4; > >>> + > >>> + default: > >>> + assert(!"Unsupported GL format for image"); > >>> + } > >>> +} > >>> + > >>> /* src[] = { deref, coord, sample_index, value }. const_index[] > = {} */ > >>> static void > >>> emit_intrinsic_store_image(struct ir3_context *ctx, > nir_intrinsic_instr *intr) > >>> @@ -1262,6 +1291,7 @@ emit_intrinsic_store_image(struct > ir3_context *ctx, nir_intrinsic_instr *intr) > >>> struct ir3_instruction * const *coords = > ir3_get_src(ctx, &intr->src[1]); > >>> unsigned ncoords = get_image_coords(var, NULL); > >>> unsigned tex_idx = get_image_slot(ctx, > nir_src_as_deref(intr->src[0])); > >>> + unsigned ncomp = > get_num_components_for_glformat(var->data.image.format); > >>> > >>> /* src0 is value > >>> * src1 is coords > >>> @@ -1276,10 +1306,10 @@ emit_intrinsic_store_image(struct > ir3_context *ctx, nir_intrinsic_instr *intr) > >>> */ > >>> > >>> stib = ir3_STIB(b, create_immed(b, tex_idx), 0, > >>> - ir3_create_collect(ctx, value, 4), 0, > >>> + ir3_create_collect(ctx, value, ncomp), 0, > >>> ir3_create_collect(ctx, coords, ncoords), 0, > >>> offset, 0); > >>> - stib->cat6.iim_val = 4; > >>> + stib->cat6.iim_val = ncomp; > >>> stib->cat6.d = ncoords; > >>> stib->cat6.type = get_image_type(var); > >>> stib->cat6.typed = true; > >>> -- > >>> 2.19.2 > >>> > >>> _______________________________________________ > >>> mesa-dev mailing list > >>> mesa-dev@lists.freedesktop.org > <mailto:mesa-dev@lists.freedesktop.org> > >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev