Re: [Mesa-dev] [PATCH] draw: handle edge flags in llvm path
On 12/14/2015 08:38 PM, srol...@vmware.com wrote: From: Roland ScheideggerWe just ignored them altogether. While this feature is rather old-fashioned supporting it is actually rather trivial. This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag and all (7) of point-vertex-id). --- src/gallium/auxiliary/draw/draw_llvm.c | 77 +++--- .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 1 + 2 files changed, 54 insertions(+), 24 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index a966e45..18a3d81 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm, LLVMValueRef* aos, int attrib, int num_outputs, -LLVMValueRef clipmask) +LLVMValueRef clipmask, +boolean need_edgeflag) { LLVMBuilderRef builder = gallivm->builder; LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib); @@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm, */ assert(DRAW_TOTAL_CLIP_PLANES==14); /* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */ - vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES); - val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), vertex_id_pad_edgeflag); + if (!need_edgeflag) { + vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES); + } + else { + vertex_id_pad_edgeflag = (0x << 16); + } + val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), + vertex_id_pad_edgeflag); /* OR with the clipmask */ cliptmp = LLVMBuildOr(builder, val, clipmask, ""); for (i = 0; i < vector_length; i++) { @@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm, LLVMValueRef clipmask, int num_outputs, struct lp_type soa_type, - boolean have_clipdist) + boolean need_edgeflag) { LLVMBuilderRef builder = gallivm->builder; unsigned chan, attrib, i; @@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm, aos, attrib, num_outputs, 
- clipmask); + clipmask, + need_edgeflag); } #if DEBUG_STORE lp_build_printf(gallivm, " # storing end\n"); @@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm, struct gallivm_state *gallivm, struct lp_type vs_type, LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], - boolean clip_xy, - boolean clip_z, - boolean clip_user, - boolean clip_halfz, - unsigned ucp_enable, + struct draw_llvm_variant_key *key, LLVMValueRef context_ptr, boolean *have_clipdist) { @@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm, const unsigned pos = llvm->draw->vs.position_output; const unsigned cv = llvm->draw->vs.clipvertex_output; int num_written_clipdistance = llvm->draw->vs.vertex_shader->info.num_written_clipdistance; - bool have_cd = false; + boolean have_cd = false; + boolean clip_user = key->clip_user; + unsigned ucp_enable = key->ucp_enable; unsigned cd[2]; cd[0] = llvm->draw->vs.clipdistance_output[0]; @@ -1196,7 +1202,7 @@ generate_clipmask(struct draw_llvm *llvm, } /* Cliptest, for hardwired planes */ - if (clip_xy) { + if (key->clip_xy) { /* plane 1 */ test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, pos_x , pos_w); temp = shift; @@ -1224,9 +1230,9 @@ generate_clipmask(struct draw_llvm *llvm, mask = LLVMBuildOr(builder, mask, test, ""); } - if (clip_z) { + if (key->clip_z) { temp = lp_build_const_int_vec(gallivm, i32_type, 16); - if (clip_halfz) { + if (key->clip_halfz) { /* plane 5 */ test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, zero, pos_z); test = LLVMBuildAnd(builder, test, temp, ""); @@ -1313,6 +1319,20 @@ generate_clipmask(struct draw_llvm *llvm, } } } + if (key->need_edgeflags) { + /* + * This isn't really part of clipmask but stored the same in vertex + * header later, so do it here. 
+ */ + unsigned edge_attr = llvm->draw->vs.edgeflag_output; + LLVMValueRef one = lp_build_const_vec(gallivm, f32_type, 1.0); + LLVMValueRef edgeflag = LLVMBuildLoad(builder, outputs[edge_attr][0], ""); + test = lp_build_compare(gallivm, f32_type,
[Mesa-dev] [Bug 92570] 10 bit h264 OMX UVD decode outputs NV12
https://bugs.freedesktop.org/show_bug.cgi?id=92570

--- Comment #3 from Andy Furniss ---
(In reply to Andy Furniss from comment #2)
> If so why not output nv16 or something else 10 bit?

lol at me re-reading this and remembering that nv16 is 8 bit 422.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] i965/fs: do not disable the FS unit in the presence of shader storage
On Tue, Dec 15, 2015 at 9:30 AM, Francisco Jerez wrote:
> Jason Ekstrand writes:
>
>> On Dec 15, 2015 3:52 AM, "Iago Toral Quiroga" wrote:
>>>
>>> We want to make sure that the driver does not disable the FS unit if
>>> the shader code only has SSBO writes (i.e. no color or depth output).
>>>
>>> We could go a step further and check if the shader storage is actually
>>> used for writing, but does not seem worth the trouble. Also, we do the
>>> same thing for atomic buffers.
>>>
>>> Fixes the following CTS test:
>>> ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs
>>> ---
>>>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++-
>>>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 1 +
>>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>>> index 06d5e65..d292b13 100644
>>> --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
>>> +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>>> @@ -77,7 +77,8 @@ upload_wm_state(struct brw_context *brw)
>>>        dw1 |= GEN7_WM_KILL_ENABLE;
>>>     }
>>>
>>> -   if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) {
>>> +   if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
>>> +       _mesa_active_fragment_shader_has_shader_storage(&brw->ctx)) {
>>
>> Ugh... We also need to be checking for images.
>>
>
> The same bit is set when the shader has images or a bunch of other
> things a couple of lines below. No idea why atomic counters are handled
> separately.

Right. So I guess this series is correct if not optimal. I think I'm
still a fan of has_side_effects, but I don't care too much how it's
done as long as we make some effort to be consistent.

>> How about we change it to active_fragment_shader_has_side_effects and make
>> it check all three?
>>
>>>        dw1 |= GEN7_WM_DISPATCH_ENABLE;
>>>     }
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>>> index 945f710..8769269 100644
>>> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>>> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>>> @@ -91,6 +91,7 @@ gen8_upload_ps_extra(struct brw_context *brw,
>>>      * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS | _NEW_COLOR
>>>      */
>>>     if ((_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
>>> +        _mesa_active_fragment_shader_has_shader_storage(&brw->ctx) ||
>>>          prog_data->base.nr_image_params) &&
>>>         !brw_color_buffer_write_enabled(brw))
>>>        dw1 |= GEN8_PSX_SHADER_HAS_UAV;
>>> --
>>> 1.9.1
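The single has_side_effects check suggested in this thread could look roughly like the sketch below. This is only an illustration under stated assumptions: struct fs_resources, its fields, and both function names are hypothetical stand-ins and are not Mesa's gl_context state or any existing _mesa_* helper. It folds the three conditions mentioned (atomic counters, shader storage, images) into one predicate so that every caller tests the same thing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the bound fragment shader; not a Mesa type. */
struct fs_resources {
   unsigned num_atomic_buffers;  /* atomic counter buffers */
   unsigned num_ssbos;           /* shader storage blocks */
   unsigned num_images;          /* image uniforms */
};

/* True if the shader may write memory even with no color or depth output. */
static bool
fs_has_side_effects(const struct fs_resources *fs)
{
   assert(fs != NULL);
   return fs->num_atomic_buffers > 0 ||
          fs->num_ssbos > 0 ||
          fs->num_images > 0;
}

/* Keep fragment dispatch enabled whenever something observable can happen. */
static bool
fs_needs_dispatch(const struct fs_resources *fs, bool writes_color_or_depth)
{
   return writes_color_or_depth || fs_has_side_effects(fs);
}
```

With one predicate, the gen7 and gen8 state upload paths would not need to list each resource type separately, which is the consistency concern raised above.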
Re: [Mesa-dev] [PATCH] draw: handle edge flags in llvm path
Am 15.12.2015 um 17:25 schrieb Brian Paul: > On 12/14/2015 08:38 PM, srol...@vmware.com wrote: >> From: Roland Scheidegger>> >> We just ignored them altogether. While this feature is rather >> old-fashioned >> supporting it is actually rather trivial. >> This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 >> gl-2.0-edgeflag >> and all (7) of point-vertex-id). >> --- >> src/gallium/auxiliary/draw/draw_llvm.c | 77 >> +++--- >> .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 1 + >> 2 files changed, 54 insertions(+), 24 deletions(-) >> >> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c >> b/src/gallium/auxiliary/draw/draw_llvm.c >> index a966e45..18a3d81 100644 >> --- a/src/gallium/auxiliary/draw/draw_llvm.c >> +++ b/src/gallium/auxiliary/draw/draw_llvm.c >> @@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm, >> LLVMValueRef* aos, >> int attrib, >> int num_outputs, >> -LLVMValueRef clipmask) >> +LLVMValueRef clipmask, >> +boolean need_edgeflag) >> { >> LLVMBuilderRef builder = gallivm->builder; >> LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib); >> @@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm, >> */ >> assert(DRAW_TOTAL_CLIP_PLANES==14); >> /* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */ >> - vertex_id_pad_edgeflag = (0x << 16) | (1 << >> DRAW_TOTAL_CLIP_PLANES); >> - val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), >> vertex_id_pad_edgeflag); >> + if (!need_edgeflag) { >> + vertex_id_pad_edgeflag = (0x << 16) | (1 << >> DRAW_TOTAL_CLIP_PLANES); >> + } >> + else { >> + vertex_id_pad_edgeflag = (0x << 16); >> + } >> + val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), >> + vertex_id_pad_edgeflag); >> /* OR with the clipmask */ >> cliptmp = LLVMBuildOr(builder, val, clipmask, ""); >> for (i = 0; i < vector_length; i++) { >> @@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm, >> LLVMValueRef clipmask, >> int num_outputs, >> struct lp_type soa_type, >> - 
boolean have_clipdist) >> + boolean need_edgeflag) >> { >> LLVMBuilderRef builder = gallivm->builder; >> unsigned chan, attrib, i; >> @@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm, >> aos, >> attrib, >> num_outputs, >> - clipmask); >> + clipmask, >> + need_edgeflag); >> } >> #if DEBUG_STORE >> lp_build_printf(gallivm, " # storing end\n"); >> @@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm, >> struct gallivm_state *gallivm, >> struct lp_type vs_type, >> LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], >> - boolean clip_xy, >> - boolean clip_z, >> - boolean clip_user, >> - boolean clip_halfz, >> - unsigned ucp_enable, >> + struct draw_llvm_variant_key *key, >> LLVMValueRef context_ptr, >> boolean *have_clipdist) >> { >> @@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm, >> const unsigned pos = llvm->draw->vs.position_output; >> const unsigned cv = llvm->draw->vs.clipvertex_output; >> int num_written_clipdistance = >> llvm->draw->vs.vertex_shader->info.num_written_clipdistance; >> - bool have_cd = false; >> + boolean have_cd = false; >> + boolean clip_user = key->clip_user; >> + unsigned ucp_enable = key->ucp_enable; >> unsigned cd[2]; >> >> cd[0] = llvm->draw->vs.clipdistance_output[0]; >> @@ -1196,7 +1202,7 @@ generate_clipmask(struct draw_llvm *llvm, >> } >> >> /* Cliptest, for hardwired planes */ >> - if (clip_xy) { >> + if (key->clip_xy) { >> /* plane 1 */ >> test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, >> pos_x , pos_w); >> temp = shift; >> @@ -1224,9 +1230,9 @@ generate_clipmask(struct draw_llvm *llvm, >> mask = LLVMBuildOr(builder, mask, test, ""); >> } >> >> - if (clip_z) { >> + if (key->clip_z) { >> temp = lp_build_const_int_vec(gallivm, i32_type, 16); >> - if (clip_halfz) { >> + if (key->clip_halfz) { >>/* plane 5 */ >>test = lp_build_compare(gallivm, f32_type, >> PIPE_FUNC_GREATER, zero, pos_z); >>test = LLVMBuildAnd(builder, test, temp, ""); >> @@ -1313,6 +1319,20 @@ generate_clipmask(struct 
draw_llvm *llvm, >>} >> } >> } >> + if (key->need_edgeflags) { >> +
Re: [Mesa-dev] [PATCH 1/8] nir: Silence missing field initializer warnings for nir_src
On 12/14/2015 07:36 PM, Jason Ekstrand wrote:
> On Mon, Dec 14, 2015 at 5:12 PM, Ian Romanick wrote:
>> On 12/14/2015 04:39 PM, Ilia Mirkin wrote:
>>> On Mon, Dec 14, 2015 at 7:28 PM, Ian Romanick wrote:
>>>> On 12/14/2015 03:38 PM, Ilia Mirkin wrote:
>>>>> It's a pretty standard feature of compilers to init things to 0 and
>>>>> not have the full structure specified like that... what compiler are
>>>>> you seeing these with? Can we just fix the glitch with a
>>>>> -Wno-stupid-warnings?
>>>>
>>>> I have observed this with several versions of GCC. In C, you can
>>>> avoid this with a trailing comma like:
>>>>
>>>> #define NIR_SRC_INIT (nir_src) { { NULL }, }
>>>>
>>>> However, nir.h is also used in some C++ code where that doesn't help.
>>>>
>>>> To be honest, I'm not a big fan of these macros. Without C99
>>>> designated initializers, maintaining initializers like these (or the
>>>> ones in src/glsl/builtin_variables.cpp) is a real pain. We can't use
>>>> those, and we can't use C++ constructors. We have no good options
>>>> available. :(
>>>>
>>>> I thought about replacing them with a static inline function that
>>>> returns a zero-initialized struct. The compiler should generate the
>>>> same code. However, that doesn't work with uses like those in patch 3.
>>>>
>>>> I'm also a little curious why you didn't raise this issue when I sent
>>>> these patches out in August. I removed the patch from the series that
>>>> you objected to back then.
>>>
>>> I have absolutely no recollection of any of that. Perhaps I saw "nir"
>>> and thought to myself, "don't care, let them do whatever, this won't
>>> ever affect me". Which is a sentiment I'm happy to continue with, by
>>> the way.
>>
>> Fair enough. :) The patch I removed was one that removed the gl_context
>> parameter from a function in dd_function_table.
>>
>> http://patchwork.freedesktop.org/patch/58048/
>>
>>> I know that doing
>>>
>>> x = {}
>>>
>>> is a gcc extension, but I thought that {0} should always work (with
>>> enough {} nesting in case the first element is a struct). Perhaps it
>>
>> {0} is basically what we're doing now, and GCC complains about it with
>> -Wmissing-field-initializers or -Wextra. When we added C-style struct
>
> I'm not a big fan of spending time fixing warnings that you have to
> add -Wextra to get. However, if there are C++ issues, then those
> definitely need to get fixed.

Those options found real bugs in builtin_variables.cpp, and I'm a big
fan of that.

>> and array initializers to GLSL, we discussed adding this sort of
>> implicit zero initialization. I did some digging in the C89 and C99
>> specs, and I have some recollection that in this case the missing fields
>> get undefined values... but, starting with C99, {0, } implicitly
>> initializes the missing fields to zero. I also seem to recall that bit
>> of weirdness in C is why quite a few people were opposed to adding it to
>> GLSL. This was several years ago, so my memory may not be completely
>> reliable.
>>
>>> doesn't in C++? I could believe that, although I'd be surprised.
>>
>> The initializer support in C++ is intentionally quite a bit more primitive
>> than in C99. The language designers want you to use constructors
>> whether it's the best tool for the job or not... which is why there are
>> no designated initializers.
>
> So, I've got a patch somewhere that switches based on __cplusplus and
> defines NIR_SRC_INIT as either the C99 thing or nir_src() for C++.

I thought about doing something like that too. Having to maintain and
keep in sync two separate versions of the initializer / constructor
doesn't sound like a maintainable solution either. At best, it's the
kind of thing that I expect someone to see in a year, say "WTF?", and
submit a patch to change. At worst, in a year we decide to add some
field to nir_src that isn't zero initialized, and we forget to update
one of the initializers... and end up with a hard to find bug.

> Would that solve this problem? There was also a bug recently about us
> not building with Oracle Studio that it would probably fix. If so,
> let's do that rather than a gigantic mess of braces and zeros.

We explicitly removed support for Oracle Studio, so that's not a
consideration.

> --Jason
>
>>> Anyways, didn't mean to stir the pot too much, just thought there
>>> might be a simpler way out of all this.
>>
>> Well, there are. :) We just can't use them due to some combination of
>> MSVC, C++, and C99.
>>
>>> Cheers,
>>>
>>> -ilia
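For readers following the initializer discussion, here is a small C99 sketch of the two styles being compared. The struct names and layout are hypothetical stand-ins and do not match the real nir_src. As far as I know, C99 (6.7.8) initializes members left out of a brace-enclosed list the same way as objects with static storage duration, i.e. to zero; GCC's -Wmissing-field-initializers (part of -Wextra) may still warn about the partial form depending on compiler version, while the designated form is documented not to trigger that warning. The thread notes that designated initializers are not an option for Mesa because of MSVC and C++, so the second function is shown only for comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical nested struct, loosely shaped like a "source" record. */
struct inner {
   void *ptr;
   int index;
};

struct src_like {
   struct inner reg;
   int is_ssa;
};

/* Partial initializer: only the first member of the first sub-struct is
 * written out; the remaining members are implicitly zero in C99. */
#define SRC_LIKE_INIT { { NULL } }

static struct src_like
src_like_partial(void)
{
   struct src_like s = SRC_LIKE_INIT;
   return s;
}

/* C99 designated initializer: unnamed members are also zero. */
static struct src_like
src_like_designated(void)
{
   struct src_like s = { .is_ssa = 1 };
   return s;
}
```

Both functions return fully initialized structs; the difference being argued about in the thread is which spellings compile cleanly, without warnings, in C, in C++, and on MSVC at the same time.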
[Mesa-dev] [PATCH] draw: handle edge flags in llvm path
From: Roland Scheidegger

We just ignored them altogether. While this feature is rather
old-fashioned supporting it is actually rather trivial.
This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag
and all (7) of point-vertex-id).

v2: comment fixes, and make the use of the edgeflag in clipmask consistent
with when it's actually there (should be impossible to hit a case where the
difference would actually matter but still...)
---
 src/gallium/auxiliary/draw/draw_llvm.c | 86 +++---
 .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 1 +
 2 files changed, 61 insertions(+), 26 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index a966e45..89ed045 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm,
                 LLVMValueRef* aos,
                 int attrib,
                 int num_outputs,
-                LLVMValueRef clipmask)
+                LLVMValueRef clipmask,
+                boolean need_edgeflag)
 {
    LLVMBuilderRef builder = gallivm->builder;
    LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib);
@@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm,
     */
    assert(DRAW_TOTAL_CLIP_PLANES==14);
    /* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */
-   vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES);
-   val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), vertex_id_pad_edgeflag);
+   if (!need_edgeflag) {
+      vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES);
+   }
+   else {
+      vertex_id_pad_edgeflag = (0x << 16);
+   }
+   val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type),
+                                vertex_id_pad_edgeflag);
    /* OR with the clipmask */
    cliptmp = LLVMBuildOr(builder, val, clipmask, "");
    for (i = 0; i < vector_length; i++) {
@@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm,
                LLVMValueRef clipmask,
                int num_outputs,
                struct lp_type soa_type,
-               boolean have_clipdist)
+               boolean need_edgeflag)
 {
    LLVMBuilderRef builder = gallivm->builder;
    unsigned chan, attrib, i;
@@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm,
                       aos,
                       attrib,
                       num_outputs,
-                      clipmask);
+                      clipmask,
+                      need_edgeflag);
    }
 #if DEBUG_STORE
    lp_build_printf(gallivm, " # storing end\n");
@@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm,
                   struct gallivm_state *gallivm,
                   struct lp_type vs_type,
                   LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS],
-                  boolean clip_xy,
-                  boolean clip_z,
-                  boolean clip_user,
-                  boolean clip_halfz,
-                  unsigned ucp_enable,
+                  struct draw_llvm_variant_key *key,
                   LLVMValueRef context_ptr,
                   boolean *have_clipdist)
 {
@@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm,
    const unsigned pos = llvm->draw->vs.position_output;
    const unsigned cv = llvm->draw->vs.clipvertex_output;
    int num_written_clipdistance = llvm->draw->vs.vertex_shader->info.num_written_clipdistance;
-   bool have_cd = false;
+   boolean have_cd = false;
+   boolean clip_user = key->clip_user;
+   unsigned ucp_enable = key->ucp_enable;
    unsigned cd[2];

    cd[0] = llvm->draw->vs.clipdistance_output[0];
@@ -1196,7 +1202,11 @@ generate_clipmask(struct draw_llvm *llvm,
    }

    /* Cliptest, for hardwired planes */
-   if (clip_xy) {
+   /*
+    * XXX should take guardband into account (currently not in key).
+    * Otherwise might run the draw pipeline stages for nothing.
+    */
+   if (key->clip_xy) {
       /* plane 1 */
       test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, pos_x , pos_w);
       temp = shift;
@@ -1224,9 +1234,9 @@ generate_clipmask(struct draw_llvm *llvm,
       mask = LLVMBuildOr(builder, mask, test, "");
    }

-   if (clip_z) {
+   if (key->clip_z) {
       temp = lp_build_const_int_vec(gallivm, i32_type, 16);
-      if (clip_halfz) {
+      if (key->clip_halfz) {
          /* plane 5 */
          test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, zero, pos_z);
          test = LLVMBuildAnd(builder, test, temp, "");
@@ -1313,6 +1323,20 @@ generate_clipmask(struct draw_llvm *llvm,
          }
       }
    }
+   if (key->need_edgeflags) {
+      /*
+       * This isn't really part of clipmask but stored the same in vertex
+       * header later, so do it here.
+       */
+      unsigned
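The header packing that this patch touches can be illustrated in plain C, outside of LLVM IR generation. The sketch below is an assumption-laden illustration, not the draw module's code: the function names are made up, the vertex id is passed in as a parameter because the constant is garbled in the quoted patch, and how the shader's edgeflag output becomes a mask bit is not shown because the quoted hunk is cut off. What it does mirror from the patch is the layout implied by the comment and the shifts (clip bits in the low 14 bits, the edge flag at bit DRAW_TOTAL_CLIP_PLANES, the vertex id in the top 16 bits) and the rule that the edge flag bit is preset to 1 only when the shader does not supply one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOTAL_CLIP_PLANES 14u                  /* mirrors the assert in the patch */
#define EDGEFLAG_BIT (1u << TOTAL_CLIP_PLANES) /* bit 14 of the header word */

/* Constant part of the header word: vertex id in bits 16..31, plus a
 * preset edge flag when the vertex shader does not write one. */
static uint32_t
header_constant_bits(uint32_t vertex_id, bool need_edgeflag)
{
   assert(vertex_id <= 0xffffu);
   uint32_t bits = vertex_id << 16;
   if (!need_edgeflag)
      bits |= EDGEFLAG_BIT;
   return bits;
}

/* Final header word: OR in the mask. When need_edgeflag is set, the mask
 * is expected to already carry the per-vertex edge flag in EDGEFLAG_BIT. */
static uint32_t
header_finish(uint32_t constant_bits, uint32_t mask)
{
   return constant_bits | mask;
}
```

Because the two parts are combined with a plain OR, leaving the preset bit at 1 in the need_edgeflag case would make every edge look flagged regardless of what the shader wrote, which is why the patch clears it there.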
Re: [Mesa-dev] [PATCH] draw: handle edge flags in llvm path
Reviewed-by: Brian PaulOn 12/15/2015 10:06 AM, srol...@vmware.com wrote: From: Roland Scheidegger We just ignored them altogether. While this feature is rather old-fashioned supporting it is actually rather trivial. This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag and all (7) of point-vertex-id). v2: comment fixes, and make the use of the edgeflag in clipmask consistent with when it's actually there (should be impossible to hit a case where the difference would actually matter but still...) --- src/gallium/auxiliary/draw/draw_llvm.c | 86 +++--- .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 1 + 2 files changed, 61 insertions(+), 26 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index a966e45..89ed045 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -880,7 +880,8 @@ store_aos_array(struct gallivm_state *gallivm, LLVMValueRef* aos, int attrib, int num_outputs, -LLVMValueRef clipmask) +LLVMValueRef clipmask, +boolean need_edgeflag) { LLVMBuilderRef builder = gallivm->builder; LLVMValueRef attr_index = lp_build_const_int32(gallivm, attrib); @@ -912,8 +913,14 @@ store_aos_array(struct gallivm_state *gallivm, */ assert(DRAW_TOTAL_CLIP_PLANES==14); /* initialize vertex id:16 = 0x, pad:1 = 0, edgeflag:1 = 1 */ - vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES); - val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), vertex_id_pad_edgeflag); + if (!need_edgeflag) { + vertex_id_pad_edgeflag = (0x << 16) | (1 << DRAW_TOTAL_CLIP_PLANES); + } + else { + vertex_id_pad_edgeflag = (0x << 16); + } + val = lp_build_const_int_vec(gallivm, lp_int_type(soa_type), + vertex_id_pad_edgeflag); /* OR with the clipmask */ cliptmp = LLVMBuildOr(builder, val, clipmask, ""); for (i = 0; i < vector_length; i++) { @@ -943,7 +950,7 @@ convert_to_aos(struct gallivm_state *gallivm, LLVMValueRef clipmask, int num_outputs, struct lp_type 
soa_type, - boolean have_clipdist) + boolean need_edgeflag) { LLVMBuilderRef builder = gallivm->builder; unsigned chan, attrib, i; @@ -999,7 +1006,8 @@ convert_to_aos(struct gallivm_state *gallivm, aos, attrib, num_outputs, - clipmask); + clipmask, + need_edgeflag); } #if DEBUG_STORE lp_build_printf(gallivm, " # storing end\n"); @@ -1135,11 +1143,7 @@ generate_clipmask(struct draw_llvm *llvm, struct gallivm_state *gallivm, struct lp_type vs_type, LLVMValueRef (*outputs)[TGSI_NUM_CHANNELS], - boolean clip_xy, - boolean clip_z, - boolean clip_user, - boolean clip_halfz, - unsigned ucp_enable, + struct draw_llvm_variant_key *key, LLVMValueRef context_ptr, boolean *have_clipdist) { @@ -1155,7 +1159,9 @@ generate_clipmask(struct draw_llvm *llvm, const unsigned pos = llvm->draw->vs.position_output; const unsigned cv = llvm->draw->vs.clipvertex_output; int num_written_clipdistance = llvm->draw->vs.vertex_shader->info.num_written_clipdistance; - bool have_cd = false; + boolean have_cd = false; + boolean clip_user = key->clip_user; + unsigned ucp_enable = key->ucp_enable; unsigned cd[2]; cd[0] = llvm->draw->vs.clipdistance_output[0]; @@ -1196,7 +1202,11 @@ generate_clipmask(struct draw_llvm *llvm, } /* Cliptest, for hardwired planes */ - if (clip_xy) { + /* +* XXX should take guardband into account (currently not in key). +* Otherwise might run the draw pipeline stages for nothing. 
+*/ + if (key->clip_xy) { /* plane 1 */ test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, pos_x , pos_w); temp = shift; @@ -1224,9 +1234,9 @@ generate_clipmask(struct draw_llvm *llvm, mask = LLVMBuildOr(builder, mask, test, ""); } - if (clip_z) { + if (key->clip_z) { temp = lp_build_const_int_vec(gallivm, i32_type, 16); - if (clip_halfz) { + if (key->clip_halfz) { /* plane 5 */ test = lp_build_compare(gallivm, f32_type, PIPE_FUNC_GREATER, zero, pos_z); test = LLVMBuildAnd(builder, test, temp, ""); @@ -1313,6 +1323,20 @@ generate_clipmask(struct draw_llvm *llvm, } } } + if (key->need_edgeflags) { +
Re: [Mesa-dev] [PATCH 2/3] i965/fs: do not disable the FS unit in the presence of shader storage
Jason Ekstrand writes:

> On Dec 15, 2015 3:52 AM, "Iago Toral Quiroga" wrote:
>>
>> We want to make sure that the driver does not disable the FS unit if
>> the shader code only has SSBO writes (i.e. no color or depth output).
>>
>> We could go a step further and check if the shader storage is actually
>> used for writing, but does not seem worth the trouble. Also, we do the
>> same thing for atomic buffers.
>>
>> Fixes the following CTS test:
>> ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs
>> ---
>>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++-
>>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 1 +
>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> index 06d5e65..d292b13 100644
>> --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
>> @@ -77,7 +77,8 @@ upload_wm_state(struct brw_context *brw)
>>        dw1 |= GEN7_WM_KILL_ENABLE;
>>     }
>>
>> -   if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) {
>> +   if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
>> +       _mesa_active_fragment_shader_has_shader_storage(&brw->ctx)) {
>
> Ugh... We also need to be checking for images.
>

The same bit is set when the shader has images or a bunch of other
things a couple of lines below. No idea why atomic counters are handled
separately.

> How about we change it to active_fragment_shader_has_side_effects and make
> it check all three?
>
>>        dw1 |= GEN7_WM_DISPATCH_ENABLE;
>>     }
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> index 945f710..8769269 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
>> @@ -91,6 +91,7 @@ gen8_upload_ps_extra(struct brw_context *brw,
>>      * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS | _NEW_COLOR
>>      */
>>     if ((_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
>> +        _mesa_active_fragment_shader_has_shader_storage(&brw->ctx) ||
>>          prog_data->base.nr_image_params) &&
>>         !brw_color_buffer_write_enabled(brw))
>>        dw1 |= GEN8_PSX_SHADER_HAS_UAV;
>> --
>> 1.9.1
Re: [Mesa-dev] [PATCH 2/7] mesa: Add core mesa support for GL_ARB_shader_draw_parameters
On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensenwrote: > --- > src/glsl/builtin_variables.cpp | 5 + > src/glsl/glsl_parser_extras.cpp | 1 + > src/glsl/glsl_parser_extras.h | 2 ++ > src/glsl/nir/nir.c | 8 > src/glsl/nir/nir_intrinsics.h | 2 ++ > src/glsl/nir/shader_enums.h | 20 > src/glsl/standalone_scaffolding.cpp | 1 + > src/mesa/main/extensions_table.h| 1 + > src/mesa/main/mtypes.h | 1 + > 9 files changed, 41 insertions(+) > > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp > index e8eab80..e82c99e 100644 > --- a/src/glsl/builtin_variables.cpp > +++ b/src/glsl/builtin_variables.cpp > @@ -951,6 +951,11 @@ builtin_variable_generator::generate_vs_special_vars() >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB"); > if (state->ARB_draw_instanced_enable || state->is_version(140, 300)) >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID"); > + if (state->ARB_shader_draw_parameters_enable) { > + add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB"); > + add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, > "gl_BaseInstanceARB"); > + add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB"); > + } > if (state->AMD_vertex_shader_layer_enable) { >var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer"); >var->data.interpolation = INTERP_QUALIFIER_FLAT; > diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp > index 29cf0c6..8c46f14 100644 > --- a/src/glsl/glsl_parser_extras.cpp > +++ b/src/glsl/glsl_parser_extras.cpp > @@ -608,6 +608,7 @@ static const _mesa_glsl_extension > _mesa_glsl_supported_extensions[] = { > EXT(ARB_shader_atomic_counters, true, false, > ARB_shader_atomic_counters), > EXT(ARB_shader_bit_encoding, true, false, > ARB_shader_bit_encoding), > EXT(ARB_shader_clock, true, false, ARB_shader_clock), > + EXT(ARB_shader_draw_parameters, true, false, > ARB_shader_draw_parameters), > EXT(ARB_shader_image_load_store, true, false, > 
ARB_shader_image_load_store), > EXT(ARB_shader_image_size,true, false, > ARB_shader_image_size), > EXT(ARB_shader_precision, true, false, > ARB_shader_precision), > diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h > index a4bda77..afb99af 100644 > --- a/src/glsl/glsl_parser_extras.h > +++ b/src/glsl/glsl_parser_extras.h > @@ -536,6 +536,8 @@ struct _mesa_glsl_parse_state { > bool ARB_shader_bit_encoding_warn; > bool ARB_shader_clock_enable; > bool ARB_shader_clock_warn; > + bool ARB_shader_draw_parameters_enable; > + bool ARB_shader_draw_parameters_warn; > bool ARB_shader_image_load_store_enable; > bool ARB_shader_image_load_store_warn; > bool ARB_shader_image_size_enable; > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c > index 35fc1de..4b70e7c 100644 > --- a/src/glsl/nir/nir.c > +++ b/src/glsl/nir/nir.c > @@ -1588,6 +1588,10 @@ nir_intrinsic_from_system_value(gl_system_value val) >return nir_intrinsic_load_vertex_id; > case SYSTEM_VALUE_INSTANCE_ID: >return nir_intrinsic_load_instance_id; > + case SYSTEM_VALUE_DRAW_ID: > + return nir_intrinsic_load_draw_id; > + case SYSTEM_VALUE_BASE_INSTANCE: > + return nir_intrinsic_load_base_instance; > case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: >return nir_intrinsic_load_vertex_id_zero_base; > case SYSTEM_VALUE_BASE_VERTEX: > @@ -1633,6 +1637,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op > intrin) >return SYSTEM_VALUE_VERTEX_ID; > case nir_intrinsic_load_instance_id: >return SYSTEM_VALUE_INSTANCE_ID; > + case nir_intrinsic_load_draw_id: > + return SYSTEM_VALUE_DRAW_ID; > + case nir_intrinsic_load_base_instance: > + return SYSTEM_VALUE_BASE_INSTANCE; > case nir_intrinsic_load_vertex_id_zero_base: >return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; > case nir_intrinsic_load_base_vertex: > diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h > index 9811fb3..917c805 100644 > --- a/src/glsl/nir/nir_intrinsics.h > +++ b/src/glsl/nir/nir_intrinsics.h > @@ -239,6 +239,8 @@ 
SYSTEM_VALUE(vertex_id, 1, 0) > SYSTEM_VALUE(vertex_id_zero_base, 1, 0) > SYSTEM_VALUE(base_vertex, 1, 0) > SYSTEM_VALUE(instance_id, 1, 0) > +SYSTEM_VALUE(base_instance, 1, 0) > +SYSTEM_VALUE(draw_id, 1, 0) > SYSTEM_VALUE(sample_id, 1, 0) > SYSTEM_VALUE(sample_pos, 2, 0) > SYSTEM_VALUE(sample_mask_in, 1, 0) > diff --git a/src/glsl/nir/shader_enums.h b/src/glsl/nir/shader_enums.h > index dd0e0ba..0be217c 100644 > --- a/src/glsl/nir/shader_enums.h > +++ b/src/glsl/nir/shader_enums.h > @@ -379,6 +379,26 @@ typedef enum > * \sa SYSTEM_VALUE_VERTEX_ID,
Re: [Mesa-dev] [PATCH 1/7] mesa/vbo: Add draw_id field to struct _mesa_prim
On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen wrote:
> The drivers will need this for passing in gl_DrawIDARB. For indirect
> multidraw calls, we get the prim array and prim[i].draw_id == i and is
> redundant. But for non-indirect calls, we get one primitive at a time
> and need the draw_id field.
> ---
>  src/mesa/vbo/vbo.h            | 1 +
>  src/mesa/vbo/vbo_exec_array.c | 5 +
>  2 files changed, 6 insertions(+)
>
> diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
> index 00e843c..cef3b8c 100644
> --- a/src/mesa/vbo/vbo.h
> +++ b/src/mesa/vbo/vbo.h
> @@ -58,6 +58,7 @@ struct _mesa_prim {
>     GLint basevertex;
>     GLuint num_instances;
>     GLuint base_instance;
> +   GLuint draw_id;
>
>     GLsizeiptr indirect_offset;
>  };
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index e27fdd9..7ff78dc 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -1,3 +1,4 @@
> +
>  /**
>   *
>   * Copyright 2003 VMware, Inc.
> @@ -1341,6 +1342,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
>           prim[i].indexed = 1;
>           prim[i].num_instances = 1;
>           prim[i].base_instance = 0;
> +         prim[i].draw_id = i;
>           prim[i].is_indirect = 0;
>           if (basevertex != NULL)
>              prim[i].basevertex = basevertex[i];
> @@ -1371,6 +1373,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
>           prim[0].indexed = 1;
>           prim[0].num_instances = 1;
>           prim[0].base_instance = 0;
> +         prim[0].draw_id = i;
>           prim[0].is_indirect = 0;
>           if (basevertex != NULL)
>              prim[0].basevertex = basevertex[i];
> @@ -1598,6 +1601,7 @@ vbo_validated_multidrawarraysindirect(struct gl_context *ctx,
>        prim[i].mode = mode;
>        prim[i].indirect_offset = offset;
>        prim[i].is_indirect = 1;
> +      prim[i].draw_id = i;
>     }
>
>     check_buffers_are_unmapped(exec->array.inputs);
> @@ -1684,6 +1688,7 @@ vbo_validated_multidrawelementsindirect(struct gl_context *ctx,
>        prim[i].indexed = 1;
>        prim[i].indirect_offset = offset;
>        prim[i].is_indirect = 1;
> +      prim[i].draw_id = i;
>     }
>
>     check_buffers_are_unmapped(exec->array.inputs);
> --
> 2.5.0

Reviewed-by: Anuj Phogat
Re: [Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen wrote:
> We already have gl_BaseVertexARB in the .x component of the SGVS vec4
> and plug gl_BaseInstanceARB into the last free component (.y).
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.h          | 2 ++
>  src/mesa/drivers/dri/i965/brw_context.h           | 9 --
>  src/mesa/drivers/dri/i965/brw_draw.c              | 12 ++--
>  src/mesa/drivers/dri/i965/brw_draw_upload.c       | 35
> ++-
>  src/mesa/drivers/dri/i965/brw_fs.cpp              | 3 +-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp          | 10 ++-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp      | 6 +++-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp            | 12 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp        | 10 ++-
>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 +++-
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c      | 35
> ++-
>  11 files changed, 102 insertions(+), 38 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 218d9c7..58ee966 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -547,6 +547,8 @@ struct brw_vs_prog_data {
>
>     bool uses_vertexid;
>     bool uses_instanceid;
> +   bool uses_basevertex;
> +   bool uses_baseinstance;

Missed bool uses_drawid ?

>  };
>
>  struct brw_tcs_prog_data
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index a845541..1378402 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -905,8 +905,13 @@ struct brw_context
>     uint32_t pma_stall_bits;
>
>     struct {
> -      /** The value of gl_BaseVertex for the current _mesa_prim. */
> -      int gl_basevertex;
> +      struct {
> +         /** The value of gl_BaseVertex for the current _mesa_prim. */
> +         int gl_basevertex;
> +
> +         /** The value of gl_BaseInstance for the current _mesa_prim. */
> +         int gl_baseinstance;
> +      } params;

Missed gl_drawid and gl_drawid_bo ?

>
>        /**
>         * Buffer and offset used for GL_ARB_shader_draw_parameters
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 8398471..298ac06 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx,
>           }
>        }
>
> -      brw->draw.gl_basevertex =
> +      brw->draw.params.gl_basevertex =
>           prims[i].indexed ? prims[i].basevertex : prims[i].start;
> -
> +      brw->draw.params.gl_baseinstance = prims[i].base_instance;
>        drm_intel_bo_unreference(brw->draw.draw_params_bo);
>
>        if (prims[i].is_indirect) {
> @@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx,
>           brw->draw.draw_params_offset = 0;
>        }
>
> +      /* gl_DrawID always needs its own vertex buffer since it's not part of
> +       * the indirect parameter buffer. */
> +      if (brw->vs.prog_data->uses_drawid) {
> +         brw->draw.gl_drawid = prims[i].drawid;

brw->draw.gl_drawid = prims[i].draw_id;

> +         drm_intel_bo_unreference(brw->draw.draw_id_bo);
> +         brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +      }
> +
>        if (brw->gen < 6)
>           brw_set_prim(brw, &prims[i]);
>        else
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index ea0f6f2..ccf963c 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -592,8 +592,10 @@ void
> brw_prepare_shader_draw_parameters(struct brw_context *brw)
>  {
>     /* For non-indirect draws, upload gl_BaseVertex. */
> -   if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL)
> {
> -      intel_upload_data(brw, &brw->draw.gl_basevertex, 4, 4,
> +   if ((brw->vs.prog_data->uses_basevertex ||
> +        brw->vs.prog_data->uses_baseinstance) &&
> +       brw->draw.draw_params_bo == NULL) {
> +      intel_upload_data(brw, &brw->draw.params, sizeof(brw->draw.params), 4,
>                          &brw->draw.draw_params_bo,
>                          &brw->draw.draw_params_offset);
>     }
> @@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw)
>     brw_emit_query_begin(brw);
>
>     unsigned nr_elements = brw->vb.nr_enabled;
> -   if (brw->vs.prog_data->uses_vertexid ||
> brw->vs.prog_data->uses_instanceid)
> +   if (brw->vs.prog_data->uses_vertexid ||
> brw->vs.prog_data->uses_instanceid ||
> +       brw->vs.prog_data->uses_basevertex ||
> brw->vs.prog_data->uses_baseinstance)
>        ++nr_elements;
>
>     /* If the VS doesn't read any inputs (calculating vertex position from
> @@ -693,8 +696,10 @@ brw_emit_vertices(struct brw_context *brw)
>     /* Now emit VB and VEP
Re: [Mesa-dev] [PATCH] i965/gen8/cs: fix constant push buffer
Ah! I had just also discovered this issue yesterday in some related
work but I didn't get the chance to try the CTS yet! :)

For the subject I had:
"Gen 8 requires 64 byte alignment for push constant data"

On 2015-12-15 03:55:15, Iago Toral Quiroga wrote:
> Page 502 of the Command Reference Broadwell PRM says that CURBE Total
> Data Length must be 64-bit aligned.

I think both the base and the size alignments are bumped from 32 to 64.
Could you add the base address?

How about giving the volume/chapter/section in the spec reference
rather than the page number?

Also, could you update the call to brw_state_batch to also use 64 byte
alignment for the base on gen8+?

-Jordan

>
> Fixes the following CTS tests:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..dbd1967 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
>
>     unsigned push_constant_data_size =
>        (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> +   unsigned reg_aligned_constant_size =
> +      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>     unsigned push_constant_regs = reg_aligned_constant_size / 32;
>     unsigned threads = get_cs_thread_count(cs_prog_data);
>
> @@ -241,7 +242,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
>
>        const unsigned push_constant_data_size =
>           (local_id_dwords + prog_data->nr_params) *
> sizeof(gl_constant_value);
> -      const unsigned reg_aligned_constant_size =
> ALIGN(push_constant_data_size, 32);
> +      const unsigned reg_aligned_constant_size =
> +         ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>        const unsigned param_aligned_count =
>           reg_aligned_constant_size / sizeof(*param);
>
> --
> 1.9.1
>
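The size computation in the quoted patch can be sketched in isolation as follows. This is only an illustration under stated assumptions: `align_u32`, `push_constant_aligned_size` and `push_constant_regs` are hypothetical names standing in for the driver's `ALIGN()` macro and the local variables in `brw_upload_cs_state()`, and `gen` is passed as a plain integer rather than read from `brw->gen`.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a non-zero power of two.
 */
static inline uint32_t
align_u32(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: padded byte size of the push constant data,
 * using 32-byte alignment before gen8 and 64-byte alignment on gen8+,
 * as the patch does.
 */
static inline uint32_t
push_constant_aligned_size(uint32_t data_size, int gen)
{
   return align_u32(data_size, gen < 8 ? 32 : 64);
}

/* Hypothetical helper: number of 32-byte registers covering the padded
 * size. The divisor stays 32 on every generation, so gen8+ always ends
 * up with an even register count.
 */
static inline uint32_t
push_constant_regs(uint32_t data_size, int gen)
{
   return push_constant_aligned_size(data_size, gen) / 32;
}
```

For example, 72 bytes of push constant data would occupy 3 registers with the 32-byte rule and 4 registers with the 64-byte rule.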
[Mesa-dev] [PATCH 2/7] mesa: Add core mesa support for GL_ARB_shader_draw_parameters
---
 src/glsl/builtin_variables.cpp      | 5 +
 src/glsl/glsl_parser_extras.cpp     | 1 +
 src/glsl/glsl_parser_extras.h       | 2 ++
 src/glsl/nir/nir.c                  | 8
 src/glsl/nir/nir_intrinsics.h       | 2 ++
 src/glsl/nir/shader_enums.h         | 20
 src/glsl/standalone_scaffolding.cpp | 1 +
 src/mesa/main/extensions_table.h    | 1 +
 src/mesa/main/mtypes.h              | 1 +
 9 files changed, 41 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index e8eab80..e82c99e 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -951,6 +951,11 @@ builtin_variable_generator::generate_vs_special_vars()
       add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
    if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
       add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
+   if (state->ARB_shader_draw_parameters_enable) {
+      add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB");
+      add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, "gl_BaseInstanceARB");
+      add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB");
+   }
    if (state->AMD_vertex_shader_layer_enable) {
       var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
       var->data.interpolation = INTERP_QUALIFIER_FLAT;
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 29cf0c6..8c46f14 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -608,6 +608,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(ARB_shader_atomic_counters, true, false, ARB_shader_atomic_counters),
    EXT(ARB_shader_bit_encoding, true, false, ARB_shader_bit_encoding),
    EXT(ARB_shader_clock, true, false, ARB_shader_clock),
+   EXT(ARB_shader_draw_parameters, true, false, ARB_shader_draw_parameters),
    EXT(ARB_shader_image_load_store, true, false, ARB_shader_image_load_store),
    EXT(ARB_shader_image_size, true, false, ARB_shader_image_size),
    EXT(ARB_shader_precision, true, false, ARB_shader_precision),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index a4bda77..afb99af 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -536,6 +536,8 @@ struct _mesa_glsl_parse_state {
    bool ARB_shader_bit_encoding_warn;
    bool ARB_shader_clock_enable;
    bool ARB_shader_clock_warn;
+   bool ARB_shader_draw_parameters_enable;
+   bool ARB_shader_draw_parameters_warn;
    bool ARB_shader_image_load_store_enable;
    bool ARB_shader_image_load_store_warn;
    bool ARB_shader_image_size_enable;
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 35fc1de..4b70e7c 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1588,6 +1588,10 @@ nir_intrinsic_from_system_value(gl_system_value val)
       return nir_intrinsic_load_vertex_id;
    case SYSTEM_VALUE_INSTANCE_ID:
       return nir_intrinsic_load_instance_id;
+   case SYSTEM_VALUE_DRAW_ID:
+      return nir_intrinsic_load_draw_id;
+   case SYSTEM_VALUE_BASE_INSTANCE:
+      return nir_intrinsic_load_base_instance;
    case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
       return nir_intrinsic_load_vertex_id_zero_base;
    case SYSTEM_VALUE_BASE_VERTEX:
@@ -1633,6 +1637,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
       return SYSTEM_VALUE_VERTEX_ID;
    case nir_intrinsic_load_instance_id:
       return SYSTEM_VALUE_INSTANCE_ID;
+   case nir_intrinsic_load_draw_id:
+      return SYSTEM_VALUE_DRAW_ID;
+   case nir_intrinsic_load_base_instance:
+      return SYSTEM_VALUE_BASE_INSTANCE;
    case nir_intrinsic_load_vertex_id_zero_base:
       return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
    case nir_intrinsic_load_base_vertex:
diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 9811fb3..917c805 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -239,6 +239,8 @@ SYSTEM_VALUE(vertex_id, 1, 0)
 SYSTEM_VALUE(vertex_id_zero_base, 1, 0)
 SYSTEM_VALUE(base_vertex, 1, 0)
 SYSTEM_VALUE(instance_id, 1, 0)
+SYSTEM_VALUE(base_instance, 1, 0)
+SYSTEM_VALUE(draw_id, 1, 0)
 SYSTEM_VALUE(sample_id, 1, 0)
 SYSTEM_VALUE(sample_pos, 2, 0)
 SYSTEM_VALUE(sample_mask_in, 1, 0)
diff --git a/src/glsl/nir/shader_enums.h b/src/glsl/nir/shader_enums.h
index dd0e0ba..0be217c 100644
--- a/src/glsl/nir/shader_enums.h
+++ b/src/glsl/nir/shader_enums.h
@@ -379,6 +379,26 @@ typedef enum
     * \sa SYSTEM_VALUE_VERTEX_ID, SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
     */
    SYSTEM_VALUE_BASE_VERTEX,
+
+   /**
+    * Value of \c baseinstance passed to instanced draw entry points
+    *
+    * \sa SYSTEM_VALUE_INSTANCE_ID
+    */
+   SYSTEM_VALUE_BASE_INSTANCE,
+
+   /**
+    * From _ARB_shader_draw_parameters:
+    *
+    *
[Mesa-dev] [PATCH 5/7] i965: Add support for gl_DrawIDARB and enable extension
We have to break open a new vec4 for gl_DrawIDARB. We've used up all
space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
own separate vertex buffer anyway. This is because we point the vb for
base vertex and base instance into the draw parameter BO for indirect
draw calls, but the draw id is generated by mesa in a different buffer.
---
 src/mesa/drivers/dri/i965/brw_compiler.h          | 1 +
 src/mesa/drivers/dri/i965/brw_context.h           | 9 +
 src/mesa/drivers/dri/i965/brw_draw.c              | 8 ++--
 src/mesa/drivers/dri/i965/brw_draw_upload.c       | 45 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp              | 2 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp          | 10 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp      | 10 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp            | 8 +++-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp        | 10 -
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 5 +++
 src/mesa/drivers/dri/i965/gen8_draw_upload.c      | 34 -
 src/mesa/drivers/dri/i965/intel_extensions.c      | 1 +
 12 files changed, 132 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h
index 58ee966..2333f4a 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -549,6 +549,7 @@ struct brw_vs_prog_data {
    bool uses_instanceid;
    bool uses_basevertex;
    bool uses_baseinstance;
+   bool uses_drawid;
 };
 
 struct brw_tcs_prog_data
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 1378402..97ebf06 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -919,6 +919,15 @@ struct brw_context
        */
       drm_intel_bo *draw_params_bo;
       uint32_t draw_params_offset;
+
+      /**
+       * The value of gl_DrawID for the current _mesa_prim. This always comes
+       * in from its own vertex buffer since it's not part of the indirect
+       * draw parameters.
+       */
+      int gl_drawid;
+      drm_intel_bo *draw_id_bo;
+      uint32_t draw_id_offset;
    } draw;
 
    struct {
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 298ac06..b0710c67 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -513,11 +513,9 @@ brw_try_draw_prims(struct gl_context *ctx,
 
       /* gl_DrawID always needs its own vertex buffer since it's not part of
        * the indirect parameter buffer. */
-      if (brw->vs.prog_data->uses_drawid) {
-         brw->draw.gl_drawid = prims[i].drawid;
-         drm_intel_bo_unreference(brw->draw.draw_id_bo);
-         brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
-      }
+      brw->draw.gl_drawid = prims[i].draw_id;
+      drm_intel_bo_unreference(brw->draw.draw_id_bo);
+      brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
 
       if (brw->gen < 6)
          brw_set_prim(brw, &prims[i]);
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index ccf963c..e601190 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -599,6 +599,12 @@ brw_prepare_shader_draw_parameters(struct brw_context *brw)
                         &brw->draw.draw_params_bo,
                         &brw->draw.draw_params_offset);
    }
+
+   if (brw->vs.prog_data->uses_drawid) {
+      intel_upload_data(brw, &brw->draw.gl_drawid, sizeof(brw->draw.gl_drawid), 4,
+                        &brw->draw.draw_id_bo,
+                        &brw->draw.draw_id_offset);
+   }
 }
 
 /**
@@ -663,6 +669,8 @@ brw_emit_vertices(struct brw_context *brw)
    if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid ||
        brw->vs.prog_data->uses_basevertex || brw->vs.prog_data->uses_baseinstance)
       ++nr_elements;
+   if (brw->vs.prog_data->uses_drawid)
+      nr_elements++;
 
    /* If the VS doesn't read any inputs (calculating vertex position from
     * a state variable for some reason, for example), emit a single pad
@@ -699,7 +707,8 @@ brw_emit_vertices(struct brw_context *brw)
    const bool uses_draw_params =
       brw->vs.prog_data->uses_basevertex ||
       brw->vs.prog_data->uses_baseinstance;
-   const unsigned nr_buffers = brw->vb.nr_buffers + uses_draw_params;
+   const unsigned nr_buffers = brw->vb.nr_buffers +
+      uses_draw_params + brw->vs.prog_data->uses_drawid;
 
    if (nr_buffers) {
       if (brw->gen >= 6) {
@@ -726,6 +735,16 @@ brw_emit_vertices(struct brw_context *brw)
                                   0, /* stride */
                                   0); /* step rate */
       }
+
+      if (brw->vs.prog_data->uses_drawid) {
+         EMIT_VERTEX_BUFFER_STATE(brw, brw->vb.nr_buffers + 1,
+                                  brw->draw.draw_id_bo,
+
[Mesa-dev] [PATCH 7/7] i965: Reduce vertex state reemission
We can inspect VS prog_data for iterations i > 0, and only flag
BRW_NEW_VERTICES when one of our system values changes.

This change also flags BRW_NEW_VERTICES in one case we were missing
before: if we're doing an indirect draw, prims[i].basevertex is always 0
and the real base vertex value is in the indirect parameter buffer.
Thus, if a program uses base vertex or base instance, and the draw call
is indirect, flag BRW_NEW_VERTICES. A new piglit test,
spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 44
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index b0710c67..9e400ca 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -491,9 +491,44 @@ brw_try_draw_prims(struct gl_context *ctx,
          }
       }
 
-      brw->draw.params.gl_basevertex =
+      /* Determine if we need to flag BRW_NEW_VERTICES for updating the
+       * gl_BaseVertexARB, gl_BaseInstanceARB or gl_DrawIDARB values. As
+       * above, we don't need to check first iteration, since the flag is set
+       * before the loop. We also can't rely on vs prog_data in the first
+       * iteration, but after drawing once, we've uploaded the programs and
+       * can look at prog_data.
+       *
+       * Despite the prims[] name, each iteration corresponds to a draw call
+       * from a glMulti* style draw call. We need to re-upload vertex state if
+       *
+       *  1) the program uses gl_DrawIDARB (changes every iteration),
+       *
+       *  2) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
+       *     draw call is indirect (meaning we can't check if the value changed
+       *     or not), or
+       *
+       *  3) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
+       *     value changed
+       */
+      const int new_basevertex =
          prims[i].indexed ? prims[i].basevertex : prims[i].start;
-      brw->draw.params.gl_baseinstance = prims[i].base_instance;
+      const int new_baseinstance = prims[i].base_instance;
+      if (i > 0) {
+         const bool uses_draw_parameters =
+            brw->vs.prog_data->uses_basevertex ||
+            brw->vs.prog_data->uses_baseinstance;
+
+         if (brw->vs.prog_data->uses_drawid ||
+             (uses_draw_parameters && prims[i].is_indirect) ||
+             (brw->vs.prog_data->uses_basevertex &&
+              brw->draw.params.gl_basevertex != new_basevertex) ||
+             (brw->vs.prog_data->uses_baseinstance &&
+              brw->draw.params.gl_baseinstance != new_baseinstance))
+            brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
+      }
+
+      brw->draw.params.gl_basevertex = new_basevertex;
+      brw->draw.params.gl_baseinstance = new_baseinstance;
       drm_intel_bo_unreference(brw->draw.draw_params_bo);
 
       if (prims[i].is_indirect) {
@@ -512,10 +547,11 @@ brw_try_draw_prims(struct gl_context *ctx,
       }
 
       /* gl_DrawID always needs its own vertex buffer since it's not part of
-       * the indirect parameter buffer. */
+       * the indirect parameter buffer.
+       */
       brw->draw.gl_drawid = prims[i].draw_id;
       drm_intel_bo_unreference(brw->draw.draw_id_bo);
-      brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
+      brw->draw.draw_id_bo = NULL;
 
       if (brw->gen < 6)
          brw_set_prim(brw, &prims[i]);
-- 
2.5.0
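The three re-upload conditions listed in the patch comment can be read as a single predicate. Below is a minimal standalone sketch of that predicate, assuming hypothetical stand-in types: `vs_usage_sketch`, `draw_params_sketch` and `needs_vertex_reemit` do not exist in the driver and only mirror the `brw->vs.prog_data` flags and `brw->draw.params` fields that the patch reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the uses_* flags in brw_vs_prog_data. */
struct vs_usage_sketch {
   bool uses_drawid;
   bool uses_basevertex;
   bool uses_baseinstance;
};

/* Hypothetical stand-in for brw->draw.params. */
struct draw_params_sketch {
   int basevertex;
   int baseinstance;
};

/* Returns true when iteration i of a multi-draw would need vertex state
 * re-emitted. prev holds the values from the previous iteration, next the
 * values computed for the current one.
 */
static bool
needs_vertex_reemit(unsigned i, bool is_indirect,
                    const struct vs_usage_sketch *vs,
                    const struct draw_params_sketch *prev,
                    const struct draw_params_sketch *next)
{
   assert(vs != NULL && prev != NULL && next != NULL);

   /* The first iteration is already flagged before the loop. */
   if (i == 0)
      return false;

   const bool uses_draw_parameters =
      vs->uses_basevertex || vs->uses_baseinstance;

   return vs->uses_drawid ||                                /* case 1 */
          (uses_draw_parameters && is_indirect) ||          /* case 2 */
          (vs->uses_basevertex &&
           prev->basevertex != next->basevertex) ||         /* case 3 */
          (vs->uses_baseinstance &&
           prev->baseinstance != next->baseinstance);       /* case 3 */
}
```

A shader that reads none of the three system values never triggers re-emission in this sketch, which is the saving the commit message describes.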
[Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
We already have gl_BaseVertexARB in the .x component of the SGVS vec4
and plug gl_BaseInstanceARB into the last free component (.y).
---
 src/mesa/drivers/dri/i965/brw_compiler.h          | 2 ++
 src/mesa/drivers/dri/i965/brw_context.h           | 9 --
 src/mesa/drivers/dri/i965/brw_draw.c              | 12 ++--
 src/mesa/drivers/dri/i965/brw_draw_upload.c       | 35 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp              | 3 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp          | 10 ++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp      | 6 +++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp            | 12 ++--
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp        | 10 ++-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 +++-
 src/mesa/drivers/dri/i965/gen8_draw_upload.c      | 35 ++-
 11 files changed, 102 insertions(+), 38 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h
index 218d9c7..58ee966 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -547,6 +547,8 @@ struct brw_vs_prog_data {
 
    bool uses_vertexid;
    bool uses_instanceid;
+   bool uses_basevertex;
+   bool uses_baseinstance;
 };
 
 struct brw_tcs_prog_data
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index a845541..1378402 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -905,8 +905,13 @@ struct brw_context
    uint32_t pma_stall_bits;
 
    struct {
-      /** The value of gl_BaseVertex for the current _mesa_prim. */
-      int gl_basevertex;
+      struct {
+         /** The value of gl_BaseVertex for the current _mesa_prim. */
+         int gl_basevertex;
+
+         /** The value of gl_BaseInstance for the current _mesa_prim. */
+         int gl_baseinstance;
+      } params;
 
       /**
        * Buffer and offset used for GL_ARB_shader_draw_parameters
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 8398471..298ac06 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx,
          }
       }
 
-      brw->draw.gl_basevertex =
+      brw->draw.params.gl_basevertex =
          prims[i].indexed ? prims[i].basevertex : prims[i].start;
-
+      brw->draw.params.gl_baseinstance = prims[i].base_instance;
       drm_intel_bo_unreference(brw->draw.draw_params_bo);
 
       if (prims[i].is_indirect) {
@@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx,
          brw->draw.draw_params_offset = 0;
       }
 
+      /* gl_DrawID always needs its own vertex buffer since it's not part of
+       * the indirect parameter buffer. */
+      if (brw->vs.prog_data->uses_drawid) {
+         brw->draw.gl_drawid = prims[i].drawid;
+         drm_intel_bo_unreference(brw->draw.draw_id_bo);
+         brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
+      }
+
       if (brw->gen < 6)
          brw_set_prim(brw, &prims[i]);
       else
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index ea0f6f2..ccf963c 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -592,8 +592,10 @@ void
 brw_prepare_shader_draw_parameters(struct brw_context *brw)
 {
    /* For non-indirect draws, upload gl_BaseVertex. */
-   if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL) {
-      intel_upload_data(brw, &brw->draw.gl_basevertex, 4, 4,
+   if ((brw->vs.prog_data->uses_basevertex ||
+        brw->vs.prog_data->uses_baseinstance) &&
+       brw->draw.draw_params_bo == NULL) {
+      intel_upload_data(brw, &brw->draw.params, sizeof(brw->draw.params), 4,
                         &brw->draw.draw_params_bo,
                         &brw->draw.draw_params_offset);
    }
@@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw)
    brw_emit_query_begin(brw);
 
    unsigned nr_elements = brw->vb.nr_enabled;
-   if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid)
+   if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid ||
+       brw->vs.prog_data->uses_basevertex || brw->vs.prog_data->uses_baseinstance)
       ++nr_elements;
 
    /* If the VS doesn't read any inputs (calculating vertex position from
@@ -693,8 +696,10 @@ brw_emit_vertices(struct brw_context *brw)
 
    /* Now emit VB and VEP state packets. */
 
-   unsigned nr_buffers =
-      brw->vb.nr_buffers + brw->vs.prog_data->uses_vertexid;
+   const bool uses_draw_params =
+      brw->vs.prog_data->uses_basevertex ||
+      brw->vs.prog_data->uses_baseinstance;
+   const unsigned nr_buffers = brw->vb.nr_buffers + uses_draw_params;
 
    if (nr_buffers) {
       if (brw->gen >= 6) {
@@ -713,7 +718,7 @@ brw_emit_vertices(struct brw_context
[Mesa-dev] [PATCH 3/7] i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered
fs_visitor::emit_vs_system_value() looks like it's trying to handle
SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
backend.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 68f2548..d5193a9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -46,6 +46,7 @@ fs_visitor::emit_vs_system_value(int location)
       vs_prog_data->uses_vertexid = true;
       break;
    case SYSTEM_VALUE_VERTEX_ID:
+      unreachable("should have been lowered");
    case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
       reg->reg_offset = 2;
       vs_prog_data->uses_vertexid = true;
-- 
2.5.0
[Mesa-dev] [PATCH 0/7] GL_ARB_shader_draw_parameters
Hi,

Here are 7 patches to implement GL_ARB_shader_draw_parameters:

  https://www.opengl.org/registry/specs/ARB/shader_draw_parameters.txt

and I have a few new piglit tests for the extension as well.

Kristian

Kristian Høgsberg Kristensen (7):
  mesa/vbo: Add draw_id field to struct _mesa_prim
  mesa: Add core mesa support for GL_ARB_shader_draw_parameters
  i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered
  i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
  i965: Add support for gl_DrawIDARB and enable extension
  nir: Teach nir_opt_algebraic about adding and subtracting the same thing
  i965: Reduce vertex state reemission

 src/glsl/builtin_variables.cpp                    | 5 ++
 src/glsl/glsl_parser_extras.cpp                   | 1 +
 src/glsl/glsl_parser_extras.h                     | 2 +
 src/glsl/nir/nir.c                                | 8 +++
 src/glsl/nir/nir_intrinsics.h                     | 2 +
 src/glsl/nir/nir_opt_algebraic.py                 | 4 ++
 src/glsl/nir/shader_enums.h                       | 20 ++
 src/glsl/standalone_scaffolding.cpp               | 1 +
 src/mesa/drivers/dri/i965/brw_compiler.h          | 3 +
 src/mesa/drivers/dri/i965/brw_context.h           | 18 +-
 src/mesa/drivers/dri/i965/brw_draw.c              | 44 -
 src/mesa/drivers/dri/i965/brw_draw_upload.c       | 78 +++
 src/mesa/drivers/dri/i965/brw_fs.cpp              | 5 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp          | 18 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp      | 17 -
 src/mesa/drivers/dri/i965/brw_vec4.cpp            | 20 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp        | 18 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 11 +++-
 src/mesa/drivers/dri/i965/gen8_draw_upload.c      | 65 +++
 src/mesa/drivers/dri/i965/intel_extensions.c      | 1 +
 src/mesa/main/extensions_table.h                  | 1 +
 src/mesa/main/mtypes.h                            | 1 +
 src/mesa/vbo/vbo.h                                | 1 +
 src/mesa/vbo/vbo_exec_array.c                     | 5 ++
 24 files changed, 311 insertions(+), 38 deletions(-)

-- 
2.5.0
[Mesa-dev] [PATCH 1/7] mesa/vbo: Add draw_id field to struct _mesa_prim
The drivers will need this for passing in gl_DrawIDARB. For indirect
multidraw calls, we get the prim array and prim[i].draw_id == i and is
redundant. But for non-indirect calls, we get one primitive at a time
and need the draw_id field.
---
 src/mesa/vbo/vbo.h            | 1 +
 src/mesa/vbo/vbo_exec_array.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/src/mesa/vbo/vbo.h b/src/mesa/vbo/vbo.h
index 00e843c..cef3b8c 100644
--- a/src/mesa/vbo/vbo.h
+++ b/src/mesa/vbo/vbo.h
@@ -58,6 +58,7 @@ struct _mesa_prim {
    GLint basevertex;
    GLuint num_instances;
    GLuint base_instance;
+   GLuint draw_id;
 
    GLsizeiptr indirect_offset;
 };
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index e27fdd9..7ff78dc 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1,3 +1,4 @@
+
 /**
  *
  * Copyright 2003 VMware, Inc.
@@ -1341,6 +1342,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
          prim[i].indexed = 1;
          prim[i].num_instances = 1;
          prim[i].base_instance = 0;
+         prim[i].draw_id = i;
          prim[i].is_indirect = 0;
          if (basevertex != NULL)
             prim[i].basevertex = basevertex[i];
@@ -1371,6 +1373,7 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
          prim[0].indexed = 1;
          prim[0].num_instances = 1;
          prim[0].base_instance = 0;
+         prim[0].draw_id = i;
          prim[0].is_indirect = 0;
          if (basevertex != NULL)
             prim[0].basevertex = basevertex[i];
@@ -1598,6 +1601,7 @@ vbo_validated_multidrawarraysindirect(struct gl_context *ctx,
       prim[i].mode = mode;
       prim[i].indirect_offset = offset;
       prim[i].is_indirect = 1;
+      prim[i].draw_id = i;
    }
 
    check_buffers_are_unmapped(exec->array.inputs);
@@ -1684,6 +1688,7 @@ vbo_validated_multidrawelementsindirect(struct gl_context *ctx,
       prim[i].indexed = 1;
       prim[i].indirect_offset = offset;
       prim[i].is_indirect = 1;
+      prim[i].draw_id = i;
    }
 
    check_buffers_are_unmapped(exec->array.inputs);
-- 
2.5.0
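The two situations the commit message distinguishes can be sketched as follows. This is an illustration only: `prim_sketch`, `fill_prim_array` and `make_single_prim` are hypothetical names that merely echo the shape of `struct _mesa_prim` and the loops touched by the patch, not code from the vbo module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct _mesa_prim. */
struct prim_sketch {
   unsigned start;
   unsigned count;
   unsigned draw_id;
};

/* Array path: the driver receives the whole prims[] array, so draw_id
 * always equals the array index and carries no extra information.
 */
static void
fill_prim_array(struct prim_sketch *prims, const unsigned *first,
                const unsigned *count, unsigned primcount)
{
   assert(prims != NULL && first != NULL && count != NULL);

   for (unsigned i = 0; i < primcount; i++) {
      prims[i].start = first[i];
      prims[i].count = count[i];
      prims[i].draw_id = i;
   }
}

/* One-at-a-time path: a single prim is built and handed to the driver
 * per sub-draw, so the index is lost unless it is stored in draw_id.
 */
static struct prim_sketch
make_single_prim(const unsigned *first, const unsigned *count, unsigned i)
{
   assert(first != NULL && count != NULL);

   struct prim_sketch prim;
   prim.start = first[i];
   prim.count = count[i];
   prim.draw_id = i;
   return prim;
}
```

In the second function the caller would pass the returned prim to the driver as `prim[0]`, which is why the field is needed there.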
[Mesa-dev] [PATCH 6/7] nir: Teach nir_opt_algebraic about adding and subtracting the same thing
This optimizes a + b - b to just a. Modest shader-db results (BDW):

  total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
  instructions in affected programs:     61938 -> 61348 (-0.95%)
  total loops in shared programs:        2131 -> 2131 (0.00%)
  helped:                                263
  HURT:                                  0
  GAINED:                                0
  LOST:                                  0

but the optimization turns gl_VertexID - gl_BaseVertexARB into just a
reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the i965 hardware
supports natively. That means we can avoid using the internal vertex
buffer for gl_BaseVertexARB in this case.
---
 src/glsl/nir/nir_opt_algebraic.py | 4
 1 file changed, 4 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
index cb715c0..1fdad3d 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -62,6 +62,10 @@ optimizations = [
    (('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
    (('fadd', ('fneg', a), a), 0.0),
    (('iadd', ('ineg', a), a), 0),
+   (('iadd', ('ineg', a), ('iadd', a, b)), b),
+   (('iadd', a, ('iadd', ('ineg', a), b)), b),
+   (('fadd', ('fneg', a), ('fadd', a, b)), b),
+   (('fadd', a, ('fadd', ('fneg', a), b)), b),
    (('fmul', a, 0.0), 0.0),
    (('imul', a, 0), 0),
    (('umul_unorm_4x8', a, 0), 0),
-- 
2.5.0
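The integer rules added above rely on wraparound arithmetic, where negating and re-adding the same value cancels exactly. The sketch below checks that identity in plain C with unsigned 32-bit values; it covers only the two `iadd` patterns, and the function names are hypothetical illustrations rather than anything in NIR.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side of the first new iadd rule: (-a) + (a + b).
 * Unsigned arithmetic wraps modulo 2^32, so the result equals b.
 */
static uint32_t
neg_a_plus_sum(uint32_t a, uint32_t b)
{
   return (0u - a) + (a + b);
}

/* Left-hand side of the second new iadd rule: a + ((-a) + b). */
static uint32_t
a_plus_neg_sum(uint32_t a, uint32_t b)
{
   return a + ((0u - a) + b);
}

/* The case from the commit message: gl_VertexID is the zero-based vertex
 * index plus the base vertex, so subtracting the base vertex again
 * recovers the zero-based index.
 */
static uint32_t
vertex_id_zero_base(uint32_t vertex_id, uint32_t base_vertex)
{
   uint32_t zero_base = vertex_id - base_vertex;
   assert(zero_base + base_vertex == vertex_id); /* holds modulo 2^32 */
   return zero_base;
}
```

With a zero-based index of 7 and a base vertex of 100, `vertex_id_zero_base(107, 100)` gives back 7, which is the value the rewritten shader reads directly.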
Re: [Mesa-dev] [PATCH v2] mesa: fix interface matching done in validate_io
On Tue, 2015-12-15 at 07:58 +0200, Tapani Pälli wrote:
> On 12/15/2015 03:31 AM, Timothy Arceri wrote:
> > On Mon, 2015-12-14 at 10:29 +0200, Tapani Pälli wrote:
> > > Patch makes following changes for interface matching:
> > >
> > > - do not try to match builtin variables
> > > - handle swizzle in input name, as example 'a.z' should
> > >   match with 'a'
> > > - check that amount of inputs and outputs matches
> > >
> > > These changes make interface matching tests to work in:
> > > ES31-CTS.sepshaderobjs.StateInteraction
> > >
> > > Test does not still pass completely due to errors in rendering
> > > output. IMO this is unrelated to interface matching.
> > >
> > > v2: add spec reference, return true on desktop since we do not
> > > have failing cases for it, inputs and outputs amount do not
> > > need to match on desktop.
> > >
> > > Signed-off-by: Tapani Pälli
> >
> > Hi Tapani,
> >
> > Just a general comment first.
> >
> > I think we should first move _mesa_validate_pipeline_io() and
> > validate_io() to src/mesa/main/pipelineobj.c. I don't think it
> > belongs here, right?
>
> Sure, it can be done now. Original intention was to use program
> resources and that is why it ended up being here.
>
> > > ---
> > >  src/mesa/main/shader_query.cpp | 54 ++
> > >  1 file changed, 50 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
> > > index ced10a9..bc01b97 100644
> > > --- a/src/mesa/main/shader_query.cpp
> > > +++ b/src/mesa/main/shader_query.cpp
> > > @@ -1377,19 +1377,38 @@ validate_io(const struct gl_shader *input_stage,
> > >              const struct gl_shader *output_stage, bool isES)
> > >  {
> > >     assert(input_stage && output_stage);
> > > +   unsigned inputs = 0, outputs = 0;
> > > +
> > > +   /* Currently no matching done for desktop. */
> >
> > I think the spec quote should be moved here as it applies to all the
> > rules in the function; then you can also have the comment explaining
> > why validation for desktop is not done.
>
> OK
>
> > I've also filed a spec bug for desktop for the reasons I outlined in
> > irc previously. It would be great if you could quote the bug here
> > also. Something like:
> >
> > /* FIXME: Update once Khronos spec bug #15331 is resolved. */
>
> Sure, will add.
>
> > > +   if (!isES)
> > > +      return true;
> > >
> > >     /* For each output in a, find input in b and do any required checks. */
> > >     foreach_in_list(ir_instruction, out, input_stage->ir) {
> > >        ir_variable *out_var = out->as_variable();
> >
> > It's existing code but it would also be nice to have a patch that
> > renames input_stage/output_stage to producer_stage/consumer_stage;
> > this is what they are called in the linker code. Maybe it's just me
> > but getting the outputs from input_stage just looks wrong.
>
> OK, can change this.
>
> > > -      if (!out_var || out_var->data.mode != ir_var_shader_out)
> > > +      if (!out_var || out_var->data.mode != ir_var_shader_out ||
> > > +          is_gl_identifier(out_var->name))
> > >           continue;
> > >
> > > +      outputs++;
> > > +
> > > +      inputs = 0;
> > >        foreach_in_list(ir_instruction, in, output_stage->ir) {
> >
> > Two comments here:
> >
> > 1. Take a look at cross_validate_outputs_to_inputs() in
> > link_varyings.cpp for a way to avoid the nested loop? Although it may
> > cause even more overhead using the symbol table, not sure.
>
> I don't know if symbol table can be trusted as variables that get
> optimized away or changed in some way are still there. Only way to be
> sure is to iterate IR or use resource list. Also, symbol table gets
> destroyed after linking. My first implementation was using a hash but
> that was also a bad idea because variable names do not necessarily
> match exactly.

The code in cross_validate_outputs_to_inputs() doesn't use *the* symbol table; it uses *a* symbol table, which it builds by iterating over the producer's IR. But I guess it will have the same problem as the hash table. I wonder why the linking code uses it rather than a plain hash table.

> > 2. Take a look at the same function for matching via explicit
> > location. Does the CTS not test for mismatched explicit locations?
> > Maybe we should create a piglit test for this as your existing code
> > doesn't take into account explicit locations.
>
> No, I haven't seen this test using explicit locations. This patch
> also makes the interface matching pass.

Right, but it would break any varyings with explicit locations that don't have matching names, which is legal.

"An output variable is considered to match an input variable in the subsequent shader if:

- the two variables match in name, type, and qualification; or
- the two variables are declared with the same
Re: [Mesa-dev] [PATCH 2/8] st/va: cleanup filter color standard handling
On 11 December 2015 at 12:33, Christian König wrote:
> From: Christian König
>
> Signed-off-by: Christian König
> ---
>  src/gallium/state_trackers/va/surface.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c
> index c052c8f..4a18a6f 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -697,11 +697,11 @@ vlVaQueryVideoProcFilterCaps(VADriverContextP ctx, VAContextID context,
>     return VA_STATUS_SUCCESS;
>  }
>
> -static VAProcColorStandardType vpp_input_color_standards[VAProcColorStandardCount] = {
> +static VAProcColorStandardType vpp_input_color_standards[] = {
>     VAProcColorStandardBT601
>  };
>
> -static VAProcColorStandardType vpp_output_color_standards[VAProcColorStandardCount] = {
> +static VAProcColorStandardType vpp_output_color_standards[] = {
>     VAProcColorStandardBT601
>  };
>

I was going to suggest constifying them while we're here, yet it seems that the VAAPI will just discard the qualifier. The whole API seems to have only a few const qualifiers :-(

Reviewed-by: Emil Velikov

-Emil
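To make the effect of dropping the explicit array size concrete, here is a minimal C sketch. The enum `color_standard` and its values are hypothetical stand-ins for `VAProcColorStandardType`, not the real libva definitions. With an explicit size, the elements not listed in the initializer are zero-initialized, so the array silently contains extra entries; with `[]` the array has exactly as many elements as the initializer lists.

```c
#include <stddef.h>

/* Hypothetical stand-in for VAProcColorStandardType; values are illustrative. */
enum color_standard {
    COLOR_STANDARD_NONE = 0,
    COLOR_STANDARD_BT601,
    COLOR_STANDARD_BT709,
    COLOR_STANDARD_COUNT
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Explicit size: the two trailing elements are zero-initialized. */
static enum color_standard sized[COLOR_STANDARD_COUNT] = {
    COLOR_STANDARD_BT601
};

/* Size deduced from the initializer: exactly one element. */
static enum color_standard deduced[] = {
    COLOR_STANDARD_BT601
};

static size_t sized_len(void)   { return ARRAY_LEN(sized); }
static size_t deduced_len(void) { return ARRAY_LEN(deduced); }
```

A caller that reports the number of supported standards from the array length would see three entries in the first case and one in the second.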
Re: [Mesa-dev] [PATCH 1/2] r600: fix viewport clipping magic.
> I have to NAK this series, but I was able to find something about the issue.
>
> If oViewport is used, VGT_REUSE_OFF must disable reuse. That's the correct
> fix.
>
> If oViewport is constant, reuse can be enabled, but
> VTE_VPORT_PROVOKE_DISABLE must be set.

Okay, I can confirm that setting VGT_REUSE_OFF fixed the bug. I haven't got time this week to smash out real patches and test them, but I'll try to get to it soon.

Dave.
Re: [Mesa-dev] [PATCH 13/15] i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware
On 12/10/2015 06:23 AM, Jason Ekstrand wrote:
> While we're at it, we also add support for the possibility that the
> indirect is, in fact, a constant. This shouldn't happen in the common case
> (if it does, that means NIR failed to constant-fold something), but it's
> possible so we should handle it.

Perhaps this should be re-ordered before patch 3?

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp           |  4 ++
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51 +++---
>  2 files changed, 42 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 9eaf8d0..a2ec03e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -4424,6 +4424,10 @@ get_lowered_simd_width(const struct brw_device_info *devinfo,
>     case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
>        return 8;
>
> +   case SHADER_OPCODE_MOV_INDIRECT:
> +      /* Prior to Broadwell, we only have 8 address subregisters */
> +      return devinfo->gen < 8 ? 8 : inst->exec_size;
> +
>     default:
>        return inst->exec_size;
>     }
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index d86eee1..7fa6d84 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -351,22 +351,47 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
>
>     unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr;
>
> -   /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
> -   struct brw_reg addr = vec8(brw_address_reg(0));
> +   if (indirect_byte_offset.file == BRW_IMMEDIATE_VALUE) {
> +      imm_byte_offset += indirect_byte_offset.ud;
>
> -   /* The destination stride of an instruction (in bytes) must be greater
> -    * than or equal to the size of the rest of the instruction. Since the
> -    * address register is of type UW, we can't use a D-type instruction.
> -    * In order to get around this, re re-type to UW and use a stride.
> -    */
> -   indirect_byte_offset =
> -      retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> +      reg.nr = imm_byte_offset / REG_SIZE;
> +      reg.subnr = imm_byte_offset % REG_SIZE;
> +      brw_MOV(p, dst, reg);
> +   } else {
> +      /* Prior to Broadwell, there are only 8 address registers. */
> +      assert(inst->exec_size == 8 || devinfo->gen >= 8);
>
> -   /* Prior to Broadwell, there are only 8 address registers. */
> -   assert(inst->exec_size == 8 || devinfo->gen >= 8);
> +      /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
> +      struct brw_reg addr = vec8(brw_address_reg(0));
>
> -   brw_MOV(p, addr, indirect_byte_offset);
> -   brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
> +      /* The destination stride of an instruction (in bytes) must be greater
> +       * than or equal to the size of the rest of the instruction. Since the
> +       * address register is of type UW, we can't use a D-type instruction.
> +       * In order to get around this, re re-type to UW and use a stride.
> +       */
> +      indirect_byte_offset =
> +         retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> +
> +      if (devinfo->gen < 8) {
> +         /* Prior to broadwell, we have a restriction that the bottom 5 bits
> +          * of the base offset and the bottom 5 bits of the indirect must add
> +          * to less than 32. In other words, the hardware needs to be able to
> +          * add the bottom five bits of the two to get the subnumber and add
> +          * the next 7 bits of each to get the actual register number. Since
> +          * the indirect may cause us to cross a register boundary, this makes
> +          * it almost useless. We could try and do something clever where we
> +          * use a actual base offset if base_offset % 32 == 0 but that would
> +          * mean we were generating different code depending on the base
> +          * offset. Instead, for the sake of consistency, we'll just do the
> +          * add ourselves.
> +          */
> +         brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
> +         brw_MOV(p, dst, retype(brw_VxH_indirect(0, 0), dst.type));
> +      } else {
> +         brw_MOV(p, addr, indirect_byte_offset);
> +         brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
> +      }
> +   }
>  }
>
>  void
>
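The two pieces of arithmetic in the patch can be tried in isolation. The sketch below is plain C, not i965 code: `struct reg_addr`, `fold_constant_offset` and `low_bits_fit` are hypothetical names, and `REG_SIZE` is assumed to be 32 bytes to match the "% 32" wording in the patch comment. The first function mirrors the constant-indirect path (fold the indirect into the immediate byte offset, then split it into register number and sub-register offset); the second expresses the pre-Broadwell restriction described in the comment.

```c
#include <assert.h>
#include <stdbool.h>

#define REG_SIZE 32u /* bytes per register; an assumption for this sketch */

/* Hypothetical stand-in for the nr/subnr fields of a register. */
struct reg_addr {
    unsigned nr;    /* register number */
    unsigned subnr; /* byte offset within the register */
};

/* Constant-indirect case: add the indirect to the immediate byte offset
 * and split the sum back into register number and sub-register offset. */
static struct reg_addr fold_constant_offset(unsigned nr, unsigned subnr,
                                            unsigned indirect_bytes)
{
    assert(subnr < REG_SIZE);
    unsigned imm_byte_offset = nr * REG_SIZE + subnr + indirect_bytes;
    struct reg_addr out = { imm_byte_offset / REG_SIZE,
                            imm_byte_offset % REG_SIZE };
    return out;
}

/* Restriction quoted in the patch comment: the low five bits of the base
 * offset and of the indirect must add up to less than 32. */
static bool low_bits_fit(unsigned base_bytes, unsigned indirect_bytes)
{
    return (base_bytes & 31u) + (indirect_bytes & 31u) < 32u;
}
```

For example, a base of register 2, byte 4 with a constant indirect of 40 bytes lands in register 3 at byte 12. A base offset of 16 with an indirect of 24 fails the low-bits check, which is the kind of case the patch sidesteps by doing the add itself.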
[Mesa-dev] [Bug 91724] GL/gl_mangle.h misses symbols from GLES/gl.h
https://bugs.freedesktop.org/show_bug.cgi?id=91724

--- Comment #4 from Frederic Devernay ---
I updated the gist for the newest Mesa release:
https://gist.github.com/devernay/71f3d7661d910e6494a9

Note that, despite what Emil said in
http://lists.freedesktop.org/archives/mesa-dev/2014-December/072575.html
using dlopen(RTLD_LOCAL) may not be a viable option, for example if both the system's libGL and the Mesa libGL depend on libraries with the same soname (llvm or X11 for example) but which are incompatible for some reason (in my case, I am loading Mesa from a plugin, and I don't know what the host application has loaded before - the only safe way is to use a mangled Mesa).

In short, RTLD_LOCAL works for the loaded library (in my case, it is the plugin), but not its dependencies. See
https://sourceware.org/ml/libc-help/2014-08/msg00042.html
for a full explanation.

So please, don't remove the option to mangle Mesa symbols, unless there is a viable and portable possibility to load a non-mangled mesa together with the system libGL.
Re: [Mesa-dev] [PATCH v4 1/1] i965: Do not overwrite optimizer dumps
On Thu, 2015-12-10 at 09:47 -0800, Matt Turner wrote:
> Assuming that the cause is indeed non-orthogonal state changes, yes.
> But I never saw an answer to that question.
>
> Reviewed-by: Matt Turner

After rebasing and testing against master (11.1-branchpoint-653-g5c5ad4d) I can't reproduce this issue anymore. Optimizations in both brw vec4 and fs are called just once, so no steps get overwritten.

So I think I'll keep this patch unpushed for now. Thanks anyway.

J.A.
Re: [Mesa-dev] [PATCH 6/8] st/va: remove fence handling
Are you sure the flush after calling the compositor is really necessary?

That clearly looks odd, but if it works I'm fine with keeping that for now.

Regards,
Christian.

On 15.12.2015 10:06, Julien Isorce wrote:
> And the attachment :)
>
> On 15 December 2015 at 09:06, Julien Isorce wrote:
>> Hi Christian,
>>
>> I tried your v2.
>>
>> I had to apply attached change on top of your patch. (the one in buffer.c
>> to avoid crashing, the one postproc.c otherwise same behavior as the v1 of
>> this patch). Note that I export the RGB-like surface (the one that vpp
>> output), not the NV12 one that comes from the decoder directly.
>>
>> Cheers
>> Julien
>>
>> On 14 December 2015 at 10:11, Christian König wrote:
>>> Also note that in this pipeline, HW decoding is done with nouveau driver
>>> and rendering is done with intel. dmabuf in between.
>>>
>>> Yeah, I already thought that somebody is using it like this. I'm not sure
>>> if this is actually supposed to work because we don't have proper
>>> synchronization between kernel drivers with DMA-buf yet.
>>>
>>> Maybe the idea of the patch is good but something is still wrong.
>>>
>>> While it is not the proper solution I would say let's keep the pipeline
>>> draining during exporting the handle for now if that's really necessary for
>>> your use case. Please test the attached patch.
>>>
>>> Coding the patch I've just noticed that there wasn't a pipe->flush()
>>> before exporting the handle. Does it work as well if you just flush the
>>> pipeline without waiting for the commands to be finished?
>>>
>>> Regards,
>>> Christian.
>>>
>>> On 14.12.2015 10:14, Julien Isorce wrote:
>>>
>>> Hi Christian,
>>>
>>> I have tested this patch but then the displayed video is garbage (mostly
>>> white and sometimes just garbage). It also stalls the nouveau driver which
>>> requires to reboot but I guess this is another issue.
>>> I tested with:
>>> GST_GL_WINDOW=x11 GST_GL_PLATFORM=egl GST_GL_API=gles2 GST_DEBUG=2 LIBVA_DRIVER_NAME=gallium gst-launch-1.0 filesrc location=simpson.mp4 ! qtdemux ! vaapidecodebin ! glimagesink
>>>
>>> (to test that you need my gstreamer-vaapi and gstgl branches on my github
>>> but I would not waste time to try them since they should be merged upstream
>>> at some point)
>>>
>>> Also note that in this pipeline, HW decoding is done with nouveau driver
>>> and rendering is done with intel. dmabuf in between.
>>>
>>> Maybe the idea of the patch is good but something is still wrong.
>>> I can test any update if it helps.
>>>
>>> Cheers
>>> Julien
>>>
>>> On 11 December 2015 at 12:33, Christian König wrote:
>>>> From: Christian König
>>>>
>>>> It's nonsense to drain the pipeline like this.
>>>>
>>>> Signed-off-by: Christian König
>>>> ---
>>>>  src/gallium/state_trackers/va/buffer.c     |  5 -
>>>>  src/gallium/state_trackers/va/image.c      |  1 -
>>>>  src/gallium/state_trackers/va/postproc.c   |  6 --
>>>>  src/gallium/state_trackers/va/surface.c    | 10 +-
>>>>  src/gallium/state_trackers/va/va_private.h |  2 --
>>>>  5 files changed, 1 insertion(+), 23 deletions(-)
>>>>
>>>> diff --git a/src/gallium/state_trackers/va/buffer.c b/src/gallium/state_trackers/va/buffer.c
>>>> index 769305e..2ec187c 100644
>>>> --- a/src/gallium/state_trackers/va/buffer.c
>>>> +++ b/src/gallium/state_trackers/va/buffer.c
>>>> @@ -257,11 +257,6 @@ vlVaAcquireBufferHandle(VADriverContextP ctx, VABufferID buf_id,
>>>>
>>>>     screen = VL_VA_PSCREEN(ctx);
>>>>
>>>> -   if (buf->derived_surface.fence) {
>>>> -      screen->fence_finish(screen, buf->derived_surface.fence, PIPE_TIMEOUT_INFINITE);
>>>> -      screen->fence_reference(screen, &buf->derived_surface.fence, NULL);
>>>> -   }
>>>> -
>>>>     if (buf->export_refcount > 0) {
>>>>        if (buf->export_state.mem_type != mem_type)
>>>>           return VA_STATUS_ERROR_INVALID_PARAMETER;
>>>> diff --git a/src/gallium/state_trackers/va/image.c b/src/gallium/state_trackers/va/image.c
>>>> index ae07da8..58c9ff7 100644
>>>> --- a/src/gallium/state_trackers/va/image.c
>>>> +++ b/src/gallium/state_trackers/va/image.c
>>>> @@ -266,7 +266,6 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID surface, VAImage *image)
>>>>     img_buf->type = VAImageBufferType;
Re: [Mesa-dev] [PATCH] clover: Fix build against LLVM 3.8 SVN >= r255078
On 15.12.2015 09:17, Ilia Mirkin wrote:
> On Wed, Dec 9, 2015 at 5:30 AM, Francisco Jerez wrote:
>> Michel Dänzer writes:
>>
>>> From: Michel Dänzer
>>>
>>> Signed-off-by: Michel Dänzer
>>
>> Looks OK to me,
>> Reviewed-by: Francisco Jerez
>>
>>> ---
>>>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> index 3b37f08..4d11c24 100644
>>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> @@ -661,7 +661,11 @@ namespace {
>>>
>>>        if (dump_asm) {
>>>           LLVMSetTargetMachineAsmVerbosity(tm, true);
>>> +#if HAVE_LLVM >= 0x0308
>>> +         LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod).release());
>>> +#else
>>>           LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod));
>>> +#endif
>>>           emit_code(tm, debug_mod, LLVMAssemblyFile, &out_buffer, r_log);
>>>           buffer_size = LLVMGetBufferSize(out_buffer);
>>>           buffer_data = LLVMGetBufferStart(out_buffer);
>
> Emil, consider cherry-picking this into 11.1 and perhaps even 11.0 to
> save people from unnecessary compilation trouble. This is commit
> b4a03e7f8f upstream.

FWIW, I still think that's a bad idea at this point: supporting unreleased snapshots of LLVM simply isn't feasible on stable Mesa branches, because the next similar breakage can appear in LLVM SVN at any time.

To help people running into this, maybe stable Mesa branches could get a change to configure.ac which aborts with a descriptive error message when trying to build against a version of LLVM which isn't supported on that stable Mesa branch yet.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH 6/8] st/va: remove fence handling
And the attachment :)

On 15 December 2015 at 09:06, Julien Isorce wrote:
> Hi Christian,
>
> I tried your v2.
>
> I had to apply attached change on top of your patch. (the one in buffer.c
> to avoid crashing, the one postproc.c otherwise same behavior as the v1 of
> this patch). Note that I export the RGB-like surface (the one that vpp
> output), not the NV12 one that comes from the decoder directly.
>
> Cheers
> Julien
>
> On 14 December 2015 at 10:11, Christian König wrote:
>
>> Also note that in this pipeline, HW decoding is done with nouveau driver
>> and rendering is done with intel. dmabuf in between.
>>
>> Yeah, I already thought that somebody is using it like this. I'm not sure
>> if this is actually supposed to work because we don't have proper
>> synchronization between kernel drivers with DMA-buf yet.
>>
>> Maybe the idea of the patch is good but something is still wrong.
>>
>> While it is not the proper solution I would say let's keep the pipeline
>> draining during exporting the handle for now if that's really necessary for
>> your use case. Please test the attached patch.
>>
>> Coding the patch I've just noticed that there wasn't a pipe->flush()
>> before exporting the handle. Does it work as well if you just flush the
>> pipeline without waiting for the commands to be finished?
>>
>> Regards,
>> Christian.
>>
>> On 14.12.2015 10:14, Julien Isorce wrote:
>>
>> Hi Christian,
>>
>> I have tested this patch but then the displayed video is garbage (mostly
>> white and sometimes just garbage). It also stalls the nouveau driver which
>> requires to reboot but I guess this is another issue.
>> I tested with:
>> GST_GL_WINDOW=x11 GST_GL_PLATFORM=egl GST_GL_API=gles2 GST_DEBUG=2 LIBVA_DRIVER_NAME=gallium gst-launch-1.0 filesrc location=simpson.mp4 ! qtdemux ! vaapidecodebin ! glimagesink
>>
>> (to test that you need my gstreamer-vaapi and gstgl branches on my github
>> but I would not waste time to try them since they should be merged upstream
>> at some point)
>>
>> Also note that in this pipeline, HW decoding is done with nouveau driver
>> and rendering is done with intel. dmabuf in between.
>>
>> Maybe the idea of the patch is good but something is still wrong.
>> I can test any update if it helps.
>>
>> Cheers
>> Julien
>>
>> On 11 December 2015 at 12:33, Christian König wrote:
>>
>>> From: Christian König
>>>
>>> It's nonsense to drain the pipeline like this.
>>>
>>> Signed-off-by: Christian König
>>> ---
>>>  src/gallium/state_trackers/va/buffer.c     |  5 -
>>>  src/gallium/state_trackers/va/image.c      |  1 -
>>>  src/gallium/state_trackers/va/postproc.c   |  6 --
>>>  src/gallium/state_trackers/va/surface.c    | 10 +-
>>>  src/gallium/state_trackers/va/va_private.h |  2 --
>>>  5 files changed, 1 insertion(+), 23 deletions(-)
>>>
>>> diff --git a/src/gallium/state_trackers/va/buffer.c b/src/gallium/state_trackers/va/buffer.c
>>> index 769305e..2ec187c 100644
>>> --- a/src/gallium/state_trackers/va/buffer.c
>>> +++ b/src/gallium/state_trackers/va/buffer.c
>>> @@ -257,11 +257,6 @@ vlVaAcquireBufferHandle(VADriverContextP ctx, VABufferID buf_id,
>>>
>>>     screen = VL_VA_PSCREEN(ctx);
>>>
>>> -   if (buf->derived_surface.fence) {
>>> -      screen->fence_finish(screen, buf->derived_surface.fence, PIPE_TIMEOUT_INFINITE);
>>> -      screen->fence_reference(screen, &buf->derived_surface.fence, NULL);
>>> -   }
>>> -
>>>     if (buf->export_refcount > 0) {
>>>        if (buf->export_state.mem_type != mem_type)
>>>           return VA_STATUS_ERROR_INVALID_PARAMETER;
>>> diff --git a/src/gallium/state_trackers/va/image.c b/src/gallium/state_trackers/va/image.c
>>> index ae07da8..58c9ff7 100644
>>> --- a/src/gallium/state_trackers/va/image.c
>>> +++ b/src/gallium/state_trackers/va/image.c
>>> @@ -266,7 +266,6 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID surface, VAImage *image)
>>>     img_buf->type = VAImageBufferType;
>>>     img_buf->size = image->data_size;
>>>     img_buf->num_elements = 1;
>>> -   img_buf->derived_surface.fence = surf->fence;
>>>
>>>     pipe_resource_reference(&img_buf->derived_surface.resource, surfaces[0]->texture);
>>>
>>> diff --git a/src/gallium/state_trackers/va/postproc.c b/src/gallium/state_trackers/va/postproc.c
>>> index 105f251..1ee3587 100644
>>> --- a/src/gallium/state_trackers/va/postproc.c
>>> +++ b/src/gallium/state_trackers/va/postproc.c
>>> @@ -54,7 +54,6 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv, vlVaContext *contex
>>>     vlVaSurface *src_surface;
>>>     VAProcPipelineParameterBuffer *pipeline_param;
>>>     struct pipe_surface **surfaces;
>>> -   struct pipe_screen *screen;
>>>     struct pipe_surface *psurf;
>>>
>>>     if (!drv || !context)
>>> @@ -77,8 +76,6 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv,
Re: [Mesa-dev] [PATCH 5/8] st/va: handle default post process regions
On 11 December 2015 at 12:33, Christian König wrote:
> From: Christian König
>
> Avoid referencing NULL pointers.
>

Lacking any prior knowledge of the subsequent patches, I'm afraid this commit message doesn't make any sense. How about "Will be used in the follow-up patches" or something similar?

> Signed-off-by: Christian König
> ---
>  src/gallium/state_trackers/va/postproc.c | 36 +---
>  1 file changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/postproc.c b/src/gallium/state_trackers/va/postproc.c
> index 2d17694..105f251 100644
> --- a/src/gallium/state_trackers/va/postproc.c
> +++ b/src/gallium/state_trackers/va/postproc.c
> @@ -29,9 +29,26 @@
>
>  #include "va_private.h"
>
> +static const VARectangle *
> +vlVaRegionDefault(const VARectangle *region, struct pipe_video_buffer *buf,
> +                  VARectangle *def)
> +{
> +   if (region)
> +      return region;
> +
> +   def->x = 0;
> +   def->y = 0;
> +   def->width = buf->width;
> +   def->height = buf->height;
> +
> +   return def;
> +}
> +
>  VAStatus
>  vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *buf)
>  {
> +   VARectangle def_src_region, def_dst_region;
> +   const VARectangle *src_region, *dst_region;
>     struct u_rect src_rect;
>     struct u_rect dst_rect;
>     vlVaSurface *src_surface;
> @@ -64,15 +81,18 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver *drv, vlVaContext *contex
>
>     psurf = surfaces[0];
>
> -   src_rect.x0 = pipeline_param->surface_region->x;
> -   src_rect.y0 = pipeline_param->surface_region->y;
> -   src_rect.x1 = pipeline_param->surface_region->x + pipeline_param->surface_region->width;
> -   src_rect.y1 = pipeline_param->surface_region->y + pipeline_param->surface_region->height;
> +   src_region = vlVaRegionDefault(pipeline_param->surface_region, src_surface->buffer, &def_src_region);
> +   dst_region = vlVaRegionDefault(pipeline_param->output_region, context->target, &def_dst_region);
> +

Mind moving this a couple of lines down, alongside the users of dst_rect?

Thanks
Emil
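The fallback pattern that vlVaRegionDefault() introduces can be exercised on its own. This is a minimal sketch with hypothetical stand-in types (`struct rect` for VARectangle, `struct video_buffer` for pipe_video_buffer); the field types are assumptions and not the real VA-API or gallium definitions. The caller supplies storage for the default, so the helper never allocates and the returned pointer is always safe to dereference.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real VARectangle and pipe_video_buffer differ. */
struct rect {
    int x, y;
    unsigned width, height;
};

struct video_buffer {
    unsigned width, height;
};

/* Return the caller-provided region, or fill *def with a rectangle that
 * covers the whole buffer and return that instead. */
static const struct rect *
region_or_default(const struct rect *region, const struct video_buffer *buf,
                  struct rect *def)
{
    if (region)
        return region;

    def->x = 0;
    def->y = 0;
    def->width = buf->width;
    def->height = buf->height;

    return def;
}
```

Passing NULL as the region yields a full-buffer rectangle stored in `def`; passing a real region returns it untouched.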
Re: [Mesa-dev] [PATCH 4/8] st/va: fix unused variable warning
On 15.12.2015 11:08, Emil Velikov wrote:
> On 11 December 2015 at 12:33, Christian König wrote:
>> From: Christian König
>>
>> Signed-off-by: Christian König
>> ---
>>  src/gallium/state_trackers/va/picture.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
>> index 8623139..7b30bf8 100644
>> --- a/src/gallium/state_trackers/va/picture.c
>> +++ b/src/gallium/state_trackers/va/picture.c
>> @@ -92,7 +92,6 @@ vlVaGetReferenceFrame(vlVaDriver *drv, VASurfaceID surface_id,
>>  static VAStatus
>>  handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *buf)
>>  {
>> -   unsigned int i;
>>     VAStatus vaStatus = VA_STATUS_SUCCESS;
>
> Can I bribe you to also remove the "set once, used once" variable vaStatus?

Unfortunately I already pushed this one yesterday after Julien gave me his rb. But I'm going to keep that in mind the next time I touch the function.

Regards,
Christian.

> Either way
> Reviewed-by: Emil Velikov
>
> -Emil
[Mesa-dev] [PATCH 1/3] mesa: Add helper to check if the active fragment shader has shader storage
Some drivers can disable the FS unit if there is nothing in the shader code
that writes to an output (i.e. color, depth, etc). For drivers that check for
these things, this helper function is useful to avoid that optimization in
the case that the shader has shader storage space assigned (since it could be
writing to it).
---
 src/mesa/main/mtypes.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 48309bf..acacae0 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4544,6 +4544,13 @@ _mesa_active_fragment_shader_has_atomic_ops(const struct gl_context *ctx)
       ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumAtomicBuffers > 0;
 }

+static inline bool
+_mesa_active_fragment_shader_has_shader_storage(const struct gl_context *ctx)
+{
+   return ctx->Shader._CurrentFragmentProgram != NULL &&
+      ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumShaderStorageBlocks > 0;
+}
+
 #ifdef __cplusplus
 }
 #endif
--
1.9.1
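As an illustration of how a driver might consume such a helper, here is a self-contained sketch. All types and names in it (`struct context`, `fs_has_shader_storage`, `can_skip_fragment_stage`) are hypothetical and heavily simplified; they are not the Mesa structures or any driver's actual logic. The point is only the decision: a fragment shader that writes no outputs can still have visible side effects if it has shader storage blocks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the Mesa context structures. */
struct linked_shader {
    unsigned num_shader_storage_blocks;
};

struct shader_program {
    struct linked_shader *fragment;
};

struct context {
    struct shader_program *current_fragment_program;
};

/* Same shape as the helper in the patch: NULL check first, then the count. */
static bool fs_has_shader_storage(const struct context *ctx)
{
    return ctx->current_fragment_program != NULL &&
           ctx->current_fragment_program->fragment->num_shader_storage_blocks > 0;
}

/* Sketch of a driver-side decision: the stage may only be skipped when it
 * writes no outputs and cannot write to shader storage either. */
static bool can_skip_fragment_stage(const struct context *ctx,
                                    bool writes_outputs)
{
    return !writes_outputs && !fs_has_shader_storage(ctx);
}
```

With no bound fragment program the stage can be skipped; once the linked shader reports at least one storage block it cannot, even if it writes neither color nor depth.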
Re: [Mesa-dev] [PATCH v2] mesa: fix interface matching done in validate_io
On 12/15/2015 10:56 AM, Timothy Arceri wrote:
> On Tue, 2015-12-15 at 07:58 +0200, Tapani Pälli wrote:
>> On 12/15/2015 03:31 AM, Timothy Arceri wrote:
>>> On Mon, 2015-12-14 at 10:29 +0200, Tapani Pälli wrote:
>>>> Patch makes following changes for interface matching:
>>>>
>>>> - do not try to match builtin variables
>>>> - handle swizzle in input name, as example 'a.z' should
>>>>   match with 'a'
>>>> - check that amount of inputs and outputs matches
>>>>
>>>> These changes make interface matching tests to work in:
>>>> ES31-CTS.sepshaderobjs.StateInteraction
>>>>
>>>> Test does not still pass completely due to errors in rendering
>>>> output. IMO this is unrelated to interface matching.
>>>>
>>>> v2: add spec reference, return true on desktop since we do not
>>>> have failing cases for it, inputs and outputs amount do not
>>>> need to match on desktop.
>>>>
>>>> Signed-off-by: Tapani Pälli
>>>
>>> Hi Tapani,
>>>
>>> Just a general comment first.
>>>
>>> I think we should first move _mesa_validate_pipeline_io() and
>>> validate_io() to src/mesa/main/pipelineobj.c. I don't think it
>>> belongs here, right?
>>
>> Sure, it can be done now. Original intention was to use program
>> resources and that is why it ended up being here.

Ah, but it uses ir_variable so it may be painful to move. Would it be OK to still have it in shader_query.cpp?

>>>> ---
>>>>  src/mesa/main/shader_query.cpp | 54 ++
>>>>  1 file changed, 50 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
>>>> index ced10a9..bc01b97 100644
>>>> --- a/src/mesa/main/shader_query.cpp
>>>> +++ b/src/mesa/main/shader_query.cpp
>>>> @@ -1377,19 +1377,38 @@ validate_io(const struct gl_shader *input_stage,
>>>>              const struct gl_shader *output_stage, bool isES)
>>>>  {
>>>>     assert(input_stage && output_stage);
>>>> +   unsigned inputs = 0, outputs = 0;
>>>> +
>>>> +   /* Currently no matching done for desktop. */
>>>
>>> I think the spec quote should be moved here as it applies to all the
>>> rules in the function; then you can also have the comment explaining
>>> why validation for desktop is not done.
>>
>> OK
>>
>>> I've also filed a spec bug for desktop for the reasons I outlined in
>>> irc previously. It would be great if you could quote the bug here
>>> also. Something like:
>>>
>>> /* FIXME: Update once Khronos spec bug #15331 is resolved. */
>>
>> Sure, will add.
>>
>>>> +   if (!isES)
>>>> +      return true;
>>>>
>>>>     /* For each output in a, find input in b and do any required checks. */
>>>>     foreach_in_list(ir_instruction, out, input_stage->ir) {
>>>>        ir_variable *out_var = out->as_variable();
>>>
>>> It's existing code but it would also be nice to have a patch that
>>> renames input_stage/output_stage to producer_stage/consumer_stage;
>>> this is what they are called in the linker code. Maybe it's just me
>>> but getting the outputs from input_stage just looks wrong.
>>
>> OK, can change this.
>>
>>>> -      if (!out_var || out_var->data.mode != ir_var_shader_out)
>>>> +      if (!out_var || out_var->data.mode != ir_var_shader_out ||
>>>> +          is_gl_identifier(out_var->name))
>>>>           continue;
>>>>
>>>> +      outputs++;
>>>> +
>>>> +      inputs = 0;
>>>>        foreach_in_list(ir_instruction, in, output_stage->ir) {
>>>
>>> Two comments here:
>>>
>>> 1. Take a look at cross_validate_outputs_to_inputs() in
>>> link_varyings.cpp for a way to avoid the nested loop? Although it may
>>> cause even more overhead using the symbol table, not sure.
>>
>> I don't know if symbol table can be trusted as variables that get
>> optimized away or changed in some way are still there. Only way to be
>> sure is to iterate IR or use resource list. Also, symbol table gets
>> destroyed after linking. My first implementation was using a hash but
>> that was also a bad idea because variable names do not necessarily
>> match exactly.
>
> The code in cross_validate_outputs_to_inputs() doesn't use *the* symbol
> table; it uses *a* symbol table, which it builds by iterating over the
> producer's IR. But I guess it will have the same problem as the hash
> table. I wonder why the linking code uses it rather than a plain hash
> table.
>
>>> 2. Take a look at the same function for matching via explicit
>>> location. Does the CTS not test for mismatched explicit locations?
>>> Maybe we should create a piglit test for this as your existing code
>>> doesn't take into account explicit locations.
>>
>> No, I haven't seen this test using explicit locations. This patch
>> also makes the interface matching pass.
>
> Right, but it would break any varyings with explicit locations that
> don't have matching names, which is legal.
>
> "An output variable is considered to match an input variable in the
> subsequent shader if:
>
> - the two variables match in name, type, and qualification; or
> - the two variables are declared with the same location qualifier and
>   match in type and qualification."
>
> I was going to suggest sharing the code between here and the linker;
> however, I'm about to add a bunch of rules for matching the component
> qualifier for enhanced layouts, so I'm not entirely sure if we should do
> this. What do you think?

Linker will need to do much more so maybe do separately,
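The matching rule quoted in this thread, combined with the swizzle handling from the patch description, can be sketched as a small standalone predicate. Everything below is hypothetical and simplified: `struct io_var` is not Mesa's ir_variable, types and qualifiers are reduced to opaque integers, and treating everything after the first '.' as a swizzle is an assumption made for the sketch. It is meant to show the shape of the rule (name match OR explicit-location match, with type and qualification required either way), not the code in validate_io().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of a shader interface variable. */
struct io_var {
    const char *name;
    int type;       /* opaque type id */
    int qualifiers; /* opaque qualification bits */
    int location;   /* explicit location, or -1 if none was declared */
};

/* Length of the name up to an optional ".swizzle" suffix. */
static size_t base_len(const char *name)
{
    const char *dot = strchr(name, '.');
    return dot ? (size_t)(dot - name) : strlen(name);
}

/* 'a.z' on the input side should match an output called 'a'. */
static bool names_match_ignoring_swizzle(const char *input_name,
                                         const char *output_name)
{
    size_t n = base_len(input_name);
    return strlen(output_name) == n &&
           strncmp(input_name, output_name, n) == 0;
}

/* Match by name, or by identical explicit location; type and
 * qualification must agree in both cases. */
static bool vars_match(const struct io_var *out, const struct io_var *in)
{
    if (out->type != in->type || out->qualifiers != in->qualifiers)
        return false;

    bool same_name = names_match_ignoring_swizzle(in->name, out->name);
    bool same_loc = out->location >= 0 && out->location == in->location;

    return same_name || same_loc;
}
```

Under this sketch, an output `a` at location 3 matches an input `b` at location 3, which is the explicit-location case Timothy points out a name-only comparison would reject.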
Re: [Mesa-dev] [PATCH 1/3] mesa: Add helper to check if the active fragment shader has shader storage
Yep, I remember when and why this was done for atomic counters.

Patches 1 and 2 are
Reviewed-by: Tapani Pälli

On 12/15/2015 01:51 PM, Iago Toral Quiroga wrote:
> Some drivers can disable the FS unit if there is nothing in the shader code
> that writes to an output (i.e. color, depth, etc). For drivers that check for
> these things, this helper function is useful to avoid that optimization in
> the case that the shader has shader storage space assigned (since it could be
> writing to it).
> ---
>  src/mesa/main/mtypes.h | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 48309bf..acacae0 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4544,6 +4544,13 @@ _mesa_active_fragment_shader_has_atomic_ops(const struct gl_context *ctx)
>        ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumAtomicBuffers > 0;
>  }
>
> +static inline bool
> +_mesa_active_fragment_shader_has_shader_storage(const struct gl_context *ctx)
> +{
> +   return ctx->Shader._CurrentFragmentProgram != NULL &&
> +      ctx->Shader._CurrentFragmentProgram->_LinkedShaders[MESA_SHADER_FRAGMENT]->NumShaderStorageBlocks > 0;
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
[Mesa-dev] [PATCH] i965/gen8/cs: fix constant push buffer
Page 502 of the Command Reference Broadwell PRM says that CURBE Total Data
Length must be 64-bit aligned.

Fixes the following CTS tests:
ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
---
 src/mesa/drivers/dri/i965/gen7_cs_state.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
index 1fde69c..dbd1967 100644
--- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
@@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
 
    unsigned push_constant_data_size =
       (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
-   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
+   unsigned reg_aligned_constant_size =
+      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
    unsigned push_constant_regs = reg_aligned_constant_size / 32;
    unsigned threads = get_cs_thread_count(cs_prog_data);
 
@@ -241,7 +242,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
 
    const unsigned push_constant_data_size =
       (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
-   const unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
+   const unsigned reg_aligned_constant_size =
+      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
    const unsigned param_aligned_count =
       reg_aligned_constant_size / sizeof(*param);
 
-- 
1.9.1
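To make the effect of the changed alignment concrete, here is a small standalone sketch of the size computation. align_up() is a stand-in written for this example rather than Mesa's ALIGN() macro, cs_push_constant_size() is a hypothetical name, and the 4-byte size assumed for gl_constant_value is an assumption as well.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * Stand-in for ALIGN(); `alignment` must be a power of two. */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper mirroring the patched expression: the push constant
 * data is padded to 32 bytes before gen8 and to 64 bytes on gen8+. */
static uint32_t
cs_push_constant_size(int gen, uint32_t nr_dwords)
{
   const uint32_t data_size = nr_dwords * 4;   /* assumed 4 bytes per value */
   return align_up(data_size, gen < 8 ? 32 : 64);
}
```

With 5 dwords (20 bytes) this gives 32 bytes on gen7 and 64 bytes on gen8. Since push_constant_regs is the aligned size divided by 32, the gen8 path always ends up with an even number of registers.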
Re: [Mesa-dev] [OT] some contribution statistics
On Tue, Dec 15, 2015 at 10:22 PM, Kenneth Graunke wrote:
> On Tuesday, December 15, 2015 02:23:07 PM Giuseppe Bilotta wrote:
>> The only problem with these numbers is actually the lack of a .mailmap
>> to normalize contributor name/emails, which obviously skews the
>> results a little bit towards the lower end. I don't suppose someone
>> has a .mailmap for Mesa contributors, or is interested in creating
>> one?
>
> I actually have one of those!
>
> http://cgit.freedesktop.org/~kwg/mesa/commit/?h=gitdm

Doh, now I wish you would have replied earlier 8-) In the meantime I
prepared a .mailmap myself … A merge might be needed. I think I'll try
sending mine to the list.

-- 
Giuseppe "Oblomov" Bilotta
Re: [Mesa-dev] [OT] some contribution statistics
On 15.12.2015 16:22, Kenneth Graunke wrote:
> On Tuesday, December 15, 2015 02:23:07 PM Giuseppe Bilotta wrote:
>> The only problem with these numbers is actually the lack of a .mailmap
>> to normalize contributor name/emails, which obviously skews the
>> results a little bit towards the lower end. I don't suppose someone
>> has a .mailmap for Mesa contributors, or is interested in creating
>> one?
>
> I actually have one of those!
>
> http://cgit.freedesktop.org/~kwg/mesa/commit/?h=gitdm

Do you take patches?

Nicolai

> --Ken
Re: [Mesa-dev] [PATCH 2/7] mesa: Add core mesa support for GL_ARB_shader_draw_parameters
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote: > --- > src/glsl/builtin_variables.cpp | 5 + > src/glsl/glsl_parser_extras.cpp | 1 + > src/glsl/glsl_parser_extras.h | 2 ++ > src/glsl/nir/nir.c | 8 > src/glsl/nir/nir_intrinsics.h | 2 ++ > src/glsl/nir/shader_enums.h | 20 > src/glsl/standalone_scaffolding.cpp | 1 + > src/mesa/main/extensions_table.h| 1 + > src/mesa/main/mtypes.h | 1 + > 9 files changed, 41 insertions(+) > > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp > index e8eab80..e82c99e 100644 > --- a/src/glsl/builtin_variables.cpp > +++ b/src/glsl/builtin_variables.cpp > @@ -951,6 +951,11 @@ builtin_variable_generator::generate_vs_special_vars() >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB"); > if (state->ARB_draw_instanced_enable || state->is_version(140, 300)) >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID"); > + if (state->ARB_shader_draw_parameters_enable) { > + add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB"); > + add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, > "gl_BaseInstanceARB"); > + add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB"); > + } > if (state->AMD_vertex_shader_layer_enable) { >var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer"); >var->data.interpolation = INTERP_QUALIFIER_FLAT; > diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp > index 29cf0c6..8c46f14 100644 > --- a/src/glsl/glsl_parser_extras.cpp > +++ b/src/glsl/glsl_parser_extras.cpp > @@ -608,6 +608,7 @@ static const _mesa_glsl_extension > _mesa_glsl_supported_extensions[] = { > EXT(ARB_shader_atomic_counters, true, false, > ARB_shader_atomic_counters), > EXT(ARB_shader_bit_encoding, true, false, > ARB_shader_bit_encoding), > EXT(ARB_shader_clock, true, false, ARB_shader_clock), > + EXT(ARB_shader_draw_parameters, true, false, > ARB_shader_draw_parameters), > EXT(ARB_shader_image_load_store, true, false, > 
ARB_shader_image_load_store), > EXT(ARB_shader_image_size,true, false, > ARB_shader_image_size), > EXT(ARB_shader_precision, true, false, > ARB_shader_precision), > diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h > index a4bda77..afb99af 100644 > --- a/src/glsl/glsl_parser_extras.h > +++ b/src/glsl/glsl_parser_extras.h > @@ -536,6 +536,8 @@ struct _mesa_glsl_parse_state { > bool ARB_shader_bit_encoding_warn; > bool ARB_shader_clock_enable; > bool ARB_shader_clock_warn; > + bool ARB_shader_draw_parameters_enable; > + bool ARB_shader_draw_parameters_warn; > bool ARB_shader_image_load_store_enable; > bool ARB_shader_image_load_store_warn; > bool ARB_shader_image_size_enable; > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c > index 35fc1de..4b70e7c 100644 > --- a/src/glsl/nir/nir.c > +++ b/src/glsl/nir/nir.c > @@ -1588,6 +1588,10 @@ nir_intrinsic_from_system_value(gl_system_value val) >return nir_intrinsic_load_vertex_id; > case SYSTEM_VALUE_INSTANCE_ID: >return nir_intrinsic_load_instance_id; > + case SYSTEM_VALUE_DRAW_ID: > + return nir_intrinsic_load_draw_id; > + case SYSTEM_VALUE_BASE_INSTANCE: > + return nir_intrinsic_load_base_instance; > case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: >return nir_intrinsic_load_vertex_id_zero_base; > case SYSTEM_VALUE_BASE_VERTEX: > @@ -1633,6 +1637,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op > intrin) >return SYSTEM_VALUE_VERTEX_ID; > case nir_intrinsic_load_instance_id: >return SYSTEM_VALUE_INSTANCE_ID; > + case nir_intrinsic_load_draw_id: > + return SYSTEM_VALUE_DRAW_ID; > + case nir_intrinsic_load_base_instance: > + return SYSTEM_VALUE_BASE_INSTANCE; > case nir_intrinsic_load_vertex_id_zero_base: >return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; > case nir_intrinsic_load_base_vertex: > diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h > index 9811fb3..917c805 100644 > --- a/src/glsl/nir/nir_intrinsics.h > +++ b/src/glsl/nir/nir_intrinsics.h > @@ -239,6 +239,8 @@ 
SYSTEM_VALUE(vertex_id, 1, 0) > SYSTEM_VALUE(vertex_id_zero_base, 1, 0) > SYSTEM_VALUE(base_vertex, 1, 0) > SYSTEM_VALUE(instance_id, 1, 0) > +SYSTEM_VALUE(base_instance, 1, 0) > +SYSTEM_VALUE(draw_id, 1, 0) > SYSTEM_VALUE(sample_id, 1, 0) > SYSTEM_VALUE(sample_pos, 2, 0) > SYSTEM_VALUE(sample_mask_in, 1, 0) > diff --git a/src/glsl/nir/shader_enums.h b/src/glsl/nir/shader_enums.h > index dd0e0ba..0be217c 100644 > --- a/src/glsl/nir/shader_enums.h > +++ b/src/glsl/nir/shader_enums.h > @@ -379,6 +379,26 @@ typedef enum > * \sa SYSTEM_VALUE_VERTEX_ID,
Re: [Mesa-dev] [PATCH 3/7] i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> fs_visitor::emit_vs_system_value() looks like it's trying to handle
> SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
> backend.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 68f2548..d5193a9 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -46,6 +46,7 @@ fs_visitor::emit_vs_system_value(int location)
>        vs_prog_data->uses_vertexid = true;
>        break;
>     case SYSTEM_VALUE_VERTEX_ID:
> +      unreachable("should have been lowered");
>     case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
>        reg->reg_offset = 2;
>        vs_prog_data->uses_vertexid = true;
>

There was some reason that Ken and I decided to do this like this, but I
don't remember what it was. I *think* this is probably a good change, but
I'd like Ken to weigh in.
Re: [Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
On 12/15/2015 11:48 AM, Anuj Phogat wrote: > On Tue, Dec 15, 2015 at 12:28 AM, Kristian Høgsberg Kristensen >wrote: >> We already have gl_BaseVertexARB in the .x component of the SGVS vec4 >> and plug gl_BaseInstanceARB into the last free component (.y). >> --- >> src/mesa/drivers/dri/i965/brw_compiler.h | 2 ++ >> src/mesa/drivers/dri/i965/brw_context.h | 9 -- >> src/mesa/drivers/dri/i965/brw_draw.c | 12 ++-- >> src/mesa/drivers/dri/i965/brw_draw_upload.c | 35 >> ++- >> src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 ++- >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 +++- >> src/mesa/drivers/dri/i965/brw_vec4.cpp| 12 ++-- >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 ++- >> src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 +++- >> src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 >> ++- >> 11 files changed, 102 insertions(+), 38 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h >> b/src/mesa/drivers/dri/i965/brw_compiler.h >> index 218d9c7..58ee966 100644 >> --- a/src/mesa/drivers/dri/i965/brw_compiler.h >> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h >> @@ -547,6 +547,8 @@ struct brw_vs_prog_data { >> >> bool uses_vertexid; >> bool uses_instanceid; >> + bool uses_basevertex; >> + bool uses_baseinstance; > Missed bool uses_drawid ? It looks like there may be some rebase or patch splitting issues. These are added in the next patch, but they are already used in this patch. I think it will compile after patch 5, but I don't think it will compile after patch 4. >> }; >> >> struct brw_tcs_prog_data >> diff --git a/src/mesa/drivers/dri/i965/brw_context.h >> b/src/mesa/drivers/dri/i965/brw_context.h >> index a845541..1378402 100644 >> --- a/src/mesa/drivers/dri/i965/brw_context.h >> +++ b/src/mesa/drivers/dri/i965/brw_context.h >> @@ -905,8 +905,13 @@ struct brw_context >> uint32_t pma_stall_bits; >> >> struct { >> - /** The value of gl_BaseVertex for the current _mesa_prim. 
*/ >> - int gl_basevertex; >> + struct { >> + /** The value of gl_BaseVertex for the current _mesa_prim. */ >> + int gl_basevertex; >> + >> + /** The value of gl_BaseInstance for the current _mesa_prim. */ >> + int gl_baseinstance; >> + } params; > Missed gl_drawid and gl_drawid_bo ? >> >>/** >> * Buffer and offset used for GL_ARB_shader_draw_parameters >> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c >> b/src/mesa/drivers/dri/i965/brw_draw.c >> index 8398471..298ac06 100644 >> --- a/src/mesa/drivers/dri/i965/brw_draw.c >> +++ b/src/mesa/drivers/dri/i965/brw_draw.c >> @@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx, >> } >>} >> >> - brw->draw.gl_basevertex = >> + brw->draw.params.gl_basevertex = >> prims[i].indexed ? prims[i].basevertex : prims[i].start; >> - >> + brw->draw.params.gl_baseinstance = prims[i].base_instance; >>drm_intel_bo_unreference(brw->draw.draw_params_bo); >> >>if (prims[i].is_indirect) { >> @@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx, >> brw->draw.draw_params_offset = 0; >>} >> >> + /* gl_DrawID always needs its own vertex buffer since it's not part of >> + * the indirect parameter buffer. */ >> + if (brw->vs.prog_data->uses_drawid) { >> + brw->draw.gl_drawid = prims[i].drawid; > brw->draw.gl_drawid = prims[i].draw_id; >> + drm_intel_bo_unreference(brw->draw.draw_id_bo); >> + brw->ctx.NewDriverState |= BRW_NEW_VERTICES; >> + } >> + >>if (brw->gen < 6) >> brw_set_prim(brw, [i]); >>else >> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c >> b/src/mesa/drivers/dri/i965/brw_draw_upload.c >> index ea0f6f2..ccf963c 100644 >> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c >> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c >> @@ -592,8 +592,10 @@ void >> brw_prepare_shader_draw_parameters(struct brw_context *brw) >> { >> /* For non-indirect draws, upload gl_BaseVertex. 
*/ >> - if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == >> NULL) { >> - intel_upload_data(brw, >draw.gl_basevertex, 4, 4, >> + if ((brw->vs.prog_data->uses_basevertex || >> +brw->vs.prog_data->uses_baseinstance) && >> + brw->draw.draw_params_bo == NULL) { >> + intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4, >> >draw.draw_params_bo, >> >draw.draw_params_offset); >> } >> @@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw) >> brw_emit_query_begin(brw); >> >> unsigned nr_elements = brw->vb.nr_enabled; >> - if (brw->vs.prog_data->uses_vertexid || >>
Re: [Mesa-dev] [PATCH 6/7] nir: Teach nir_opt_algebraic about adding and subtracting the same thing
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> This optimizes a + b - b to just a. Modest shader-db results (BDW):
>
> total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
> instructions in affected programs: 61938 -> 61348 (-0.95%)
> total loops in shared programs: 2131 -> 2131 (0.00%)
> helped: 263
> HURT: 0
> GAINED: 0
> LOST: 0
>
> but the optimization turns
>
>   gl_VertexID - gl_BaseVertexARB
>
> into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the
> i965 hardware supports natively. That means we can avoid using the
> internal vertex buffer for gl_BaseVertexARB in this case.

Removing that extra state should be a bigger real win than removing the
instructions.

This patch is
Reviewed-by: Ian Romanick

> ---
>  src/glsl/nir/nir_opt_algebraic.py | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
> index cb715c0..1fdad3d 100644
> --- a/src/glsl/nir/nir_opt_algebraic.py
> +++ b/src/glsl/nir/nir_opt_algebraic.py
> @@ -62,6 +62,10 @@ optimizations = [
>     (('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
>     (('fadd', ('fneg', a), a), 0.0),
>     (('iadd', ('ineg', a), a), 0),
> +   (('iadd', ('ineg', a), ('iadd', a, b)), b),
> +   (('iadd', a, ('iadd', ('ineg', a), b)), b),
> +   (('fadd', ('fneg', a), ('fadd', a, b)), b),
> +   (('fadd', a, ('fadd', ('fneg', a), b)), b),
>     (('fmul', a, 0.0), 0.0),
>     (('imul', a, 0), 0),
>     (('umul_unorm_4x8', a, 0), 0),
>
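As a sanity check on the two integer rules, the identity can be written out in plain C with 32-bit wraparound arithmetic. This is only an illustration of the algebra, not of how nir_opt_algebraic matches patterns; iadd(), ineg(), lhs_first(), lhs_second() and check_rules() are names made up for the sketch. The patch adds the same rewrite for fadd/fneg; in floating point the two sides can differ after rounding, so the sketch sticks to the integer form.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit add and negate with wraparound, standing in for iadd and ineg. */
static uint32_t iadd(uint32_t a, uint32_t b) { return a + b; }
static uint32_t ineg(uint32_t a) { return 0u - a; }

/* Left-hand side of the first new rule: (-a) + (a + b). */
static uint32_t
lhs_first(uint32_t a, uint32_t b)
{
   return iadd(ineg(a), iadd(a, b));
}

/* Left-hand side of the second new rule: a + ((-a) + b). */
static uint32_t
lhs_second(uint32_t a, uint32_t b)
{
   return iadd(a, iadd(ineg(a), b));
}

/* Both rules replace the expression with plain b; check that the
 * unoptimized forms agree with it for a given pair of inputs. */
static void
check_rules(uint32_t a, uint32_t b)
{
   assert(lhs_first(a, b) == b);
   assert(lhs_second(a, b) == b);
}
```

Because unsigned arithmetic wraps modulo 2^32, the identity holds even when a + b overflows, which is why the integer rewrite is safe for all inputs.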
Re: [Mesa-dev] [PATCH] mesa: remove validation of shaders that should be done elsewhere
On Tue, 2015-12-15 at 14:32 +0200, Tapani Pälli wrote:
> On 12/15/2015 01:25 AM, Timothy Arceri wrote:
> > On Wed, 2015-12-09 at 00:17 +1100, Timothy Arceri wrote:
> > > In core profile even if re-linking fails rendering shouldn't fail
> > > as the previous successfully linked program will still be
> > > available. It also shouldn't be possible to have an unlinked
> > > program as part of the current rendering state.
> >
> > Hey guys,
> >
> > Any thoughts on this change?
> >
> > Thinking about this some more we should probably rework the compat
> > code also and only do the check for link status if there is an
> > assembly shader right?
>
> I wanted to hear from others first since for me it feels this change
> seems specific to separate shader programs (I had a patch on list that
> skipped the check for those programs that were not in use by current
> pipeline).
> The reason is that with regular programs I can't see a way to continue
> if relinking fails (because program is now in bad state). I think user
> should detach the malfunctioning stage and link again. However with SSO
> relink to an unused stage may fail but we can still have a complete
> working program with stages marked as used.

Hi Tapani,

I don't see anything that says this is specific to separate shader
programs. For full programs you still need to call UseProgram to install
the executable code as part of the rendering state just like with SSO.

From Section 7.3 (Program Objects) of the OpenGL 4.5 spec under
UseProgram:

"This will install executable code as part of the current rendering state
for each shader stage present when the program was last successfully
linked."

...

"If LinkProgram or ProgramBinary successfully re-links a program object
that is active for any shader stage, then the newly generated executable
code will be installed as part of the current rendering state for all
shader stages where the program is active."

...

"If a program object that is active for any shader stage is re-linked
unsuccessfully, the link status will be set to FALSE, but any existing
executables and associated state will remain part of the current
rendering state until a subsequent call to UseProgram, UseProgramStages,
or BindProgramPipeline removes them from use."

...

"An unsuccessfully linked program may not be made part of the current
rendering state by UseProgram or added to program pipeline objects by
UseProgramStages until it is successfully re-linked."

As far as I can tell it should not be possible to have an unsuccessfully
linked program as part of the current rendering state, which is why this
patch removes the LinkStatus check completely.

I can add all of this to the commit message.

Tim

> >
> > Thanks,
> > Tim
>
> // Tapani
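The behaviour described by those spec quotes can be modelled with a toy state machine. This is a rough sketch of the rules as quoted, not of Mesa's implementation; struct program, struct context, use_program() and relink() are all hypothetical names, and executables are represented by plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): an executable is just a non-zero integer id. */
struct program {
   bool link_status;   /* result of the most recent link attempt */
   int executable;     /* executable from the last successful link */
};

struct context {
   const struct program *active;   /* program made current by use_program() */
   int installed;                  /* executable in the rendering state */
};

/* Models UseProgram: a program whose last link failed is rejected and the
 * rendering state is left untouched. */
static bool
use_program(struct context *ctx, const struct program *prog)
{
   if (prog != NULL && !prog->link_status)
      return false;
   ctx->active = prog;
   ctx->installed = prog != NULL ? prog->executable : 0;
   return true;
}

/* Models LinkProgram on an existing program object. */
static void
relink(struct context *ctx, struct program *prog, bool success,
       int new_executable)
{
   assert(new_executable != 0);
   prog->link_status = success;
   if (!success)
      return;   /* existing executable stays installed */
   prog->executable = new_executable;
   if (ctx->active == prog)
      ctx->installed = new_executable;   /* re-installed on success */
}
```

In this model the only way to reach the rendering state is through a program that linked successfully at some point, which matches the argument that a LinkStatus check at draw time has nothing left to catch.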
[Mesa-dev] [PATCH] ttn: Use the new nir_load_system_value helper
Only compile-tested.

Cc: Eric Anholt
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index 5def6d3..122e87b 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -544,9 +544,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
       break;
 
    case TGSI_FILE_SYSTEM_VALUE: {
-      nir_intrinsic_instr *load;
       nir_intrinsic_op op;
-      unsigned ncomp = 1;
 
       assert(!indirect);
       assert(!dim);
@@ -568,13 +566,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
          unreachable("bad system value");
       }
 
-      load = nir_intrinsic_instr_create(b->shader, op);
-      load->num_components = ncomp;
-
-      nir_ssa_dest_init(&load->instr, &load->dest, ncomp, NULL);
-      nir_builder_instr_insert(b, &load->instr);
-
-      src = nir_src_for_ssa(&load->dest.ssa);
+      src = nir_src_for_ssa(nir_load_system_value(b, op, 0));
       break;
    }
 
-- 
2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH 1/3] nir/lower_system_values: Stop supporting non-SSA
Jason Ekstrand writes:

> The one user of this (i965) only ever calls it while in SSA form.

This series is:

Reviewed-by: Eric Anholt
[Mesa-dev] [PATCH] Add .mailmap
This adds a first tentative .mailmap file, to canonicize contributor
name/emails in shortlogs and other statistical endeavours.

There's a couple of root and richard entries which I don't know who they
belong to, and hopefully not too many overeager merges.

Signed-off-by: Giuseppe Bilotta
---
 .mailmap | 460 +++
 1 file changed, 460 insertions(+)
 create mode 100644 .mailmap

diff --git a/.mailmap b/.mailmap
new file mode 100644
index 000..bf8b4d9
--- /dev/null
+++ b/.mailmap
@@ -0,0 +1,460 @@
+Aapo Tahkola
+
+Adam Jackson
+Adam Jackson
+
+Adrian Marius Negreanu Adrian Negreanu
+Adrian Marius Negreanu Negreanu Marius Adrian
+
+Dave Airlie
+Dave Airlie airlied
+Dave Airlie
+Dave Airlie
+Dave Airlie
+Dave Airlie
+Dave Airlie
+Dave Airlie
+Dave Airlie
+
+Alan Coopersmith
+
+Alan Hourihane
+Alan Hourihane
+Alan Hourihane
+
+Alexander Monakov
+
+Alexander von Gluck IV Alexander von Gluck
+
+Alex Corscadden
+Alex Corscadden
+
+Alex Deucher
+Alex Deucher
+Alex Deucher
+Alex Deucher
+Alex Deucher
+Alex Deucher
+
+Andreas Fänger
+
+Andreas Hartmetz
+
+Andre Heider
+Andreas Heider
+
+Andreas Pokorny
+
+Andrew Randrianasulu
+Andrew Randrianasulu
+
+Arthur Huillet Arthur HUILLET
+
+Benjamin Franzke ben
+
+Ben Skeggs
+Ben Skeggs
+Ben Skeggs
+Ben Skeggs
+Ben Skeggs
+Ben Skeggs
+Ben Skeggs
+
+Ben Widawsky Ben Widawsky
+
+Blair Sadewitz Blair Sadewitz
+
+bma
+
+Brian Paul Brian
+Brian Paul
+Brian Paul
+Brian Paul
+Brian Paul brian
+Brian Paul Brian
+Brian Paul Brian
+Brian Paul Brian
+Brian Paul Brian
+Brian Paul Brian
+Brian Paul Brian
+Brian Paul root
+
+Bruce Merry
+
+caner
+
+Carl-Philip Hänsch Carl-Philip Haensch
+Carl-Philip Hänsch Carl-Philip Haensch
+Carl-Philip Hänsch Carl-Philip Haensch
+
+Chad Versace
+Chad Versace
+Chad Versace
Re: [Mesa-dev] [PATCH 1/3] nir/lower_system_values: Stop supporting non-SSA
Jason Ekstrand writes:

> On Tue, Dec 15, 2015 at 12:26 PM, Eric Anholt wrote:
>> Jason Ekstrand writes:
>>
>>> The one user of this (i965) only ever calls it while in SSA form.
>>
>> This series is:
>>
>> Reviewed-by: Eric Anholt
>
> Thanks!
>
> Did you happen to run it on something that actually uses clip plane
> lowering? I'd like to not break things.

I hadn't, just reviewed. I checked now, and piglit's user-clip does pass.
Re: [Mesa-dev] [PATCH 4/7] i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
This patch is really doing two different things. It changes the existing SYSTEM_VALUE_BASE_VERTEX to be independent from SYSTEM_VALUE_VERTEX_ID_ZERO. It also adds SYSTEM_VALUE_BASE_INSTANCE support. I was going to let that go, but because the two things happened in one patch, I overlooked the extra gl_DrawID related cruft that should have been in the next patch. Thankfully Anuj caught it. On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote: > We already have gl_BaseVertexARB in the .x component of the SGVS vec4 > and plug gl_BaseInstanceARB into the last free component (.y). > --- > src/mesa/drivers/dri/i965/brw_compiler.h | 2 ++ > src/mesa/drivers/dri/i965/brw_context.h | 9 -- > src/mesa/drivers/dri/i965/brw_draw.c | 12 ++-- > src/mesa/drivers/dri/i965/brw_draw_upload.c | 35 > ++- > src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 ++- > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 +++- > src/mesa/drivers/dri/i965/brw_vec4.cpp| 12 ++-- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 ++- > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 +++- > src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 > ++- > 11 files changed, 102 insertions(+), 38 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h > b/src/mesa/drivers/dri/i965/brw_compiler.h > index 218d9c7..58ee966 100644 > --- a/src/mesa/drivers/dri/i965/brw_compiler.h > +++ b/src/mesa/drivers/dri/i965/brw_compiler.h > @@ -547,6 +547,8 @@ struct brw_vs_prog_data { > > bool uses_vertexid; > bool uses_instanceid; > + bool uses_basevertex; > + bool uses_baseinstance; > }; > > struct brw_tcs_prog_data > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index a845541..1378402 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -905,8 +905,13 @@ struct brw_context > uint32_t pma_stall_bits; > > struct { > - /** The value of gl_BaseVertex for the 
current _mesa_prim. */ > - int gl_basevertex; > + struct { > + /** The value of gl_BaseVertex for the current _mesa_prim. */ > + int gl_basevertex; > + > + /** The value of gl_BaseInstance for the current _mesa_prim. */ > + int gl_baseinstance; > + } params; > >/** > * Buffer and offset used for GL_ARB_shader_draw_parameters > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c > b/src/mesa/drivers/dri/i965/brw_draw.c > index 8398471..298ac06 100644 > --- a/src/mesa/drivers/dri/i965/brw_draw.c > +++ b/src/mesa/drivers/dri/i965/brw_draw.c > @@ -491,9 +491,9 @@ brw_try_draw_prims(struct gl_context *ctx, > } >} > > - brw->draw.gl_basevertex = > + brw->draw.params.gl_basevertex = > prims[i].indexed ? prims[i].basevertex : prims[i].start; > - > + brw->draw.params.gl_baseinstance = prims[i].base_instance; >drm_intel_bo_unreference(brw->draw.draw_params_bo); > >if (prims[i].is_indirect) { > @@ -511,6 +511,14 @@ brw_try_draw_prims(struct gl_context *ctx, > brw->draw.draw_params_offset = 0; >} > > + /* gl_DrawID always needs its own vertex buffer since it's not part of > + * the indirect parameter buffer. */ > + if (brw->vs.prog_data->uses_drawid) { > + brw->draw.gl_drawid = prims[i].drawid; > + drm_intel_bo_unreference(brw->draw.draw_id_bo); > + brw->ctx.NewDriverState |= BRW_NEW_VERTICES; > + } > + >if (brw->gen < 6) >brw_set_prim(brw, [i]); >else > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c > b/src/mesa/drivers/dri/i965/brw_draw_upload.c > index ea0f6f2..ccf963c 100644 > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c > @@ -592,8 +592,10 @@ void > brw_prepare_shader_draw_parameters(struct brw_context *brw) > { > /* For non-indirect draws, upload gl_BaseVertex. 
*/ > - if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL) > { > - intel_upload_data(brw, >draw.gl_basevertex, 4, 4, > + if ((brw->vs.prog_data->uses_basevertex || > +brw->vs.prog_data->uses_baseinstance) && > + brw->draw.draw_params_bo == NULL) { > + intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4, > >draw.draw_params_bo, > >draw.draw_params_offset); > } > @@ -658,7 +660,8 @@ brw_emit_vertices(struct brw_context *brw) > brw_emit_query_begin(brw); > > unsigned nr_elements = brw->vb.nr_enabled; > - if (brw->vs.prog_data->uses_vertexid || > brw->vs.prog_data->uses_instanceid) > + if (brw->vs.prog_data->uses_vertexid || > brw->vs.prog_data->uses_instanceid || > +
Re: [Mesa-dev] [PATCH 5/7] i965: Add support for gl_DrawIDARB and enable extension
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote: > We have to break open a new vec4 for gl_DrawIDARB. We've used up all > space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its > own separate vertex buffer anyway. This is because we point the vb for > base vertex and base instance into the draw parameter BO for indirect > draw calls, but the draw id is generated by mesa in a different buffer. > --- > src/mesa/drivers/dri/i965/brw_compiler.h | 1 + > src/mesa/drivers/dri/i965/brw_context.h | 9 + > src/mesa/drivers/dri/i965/brw_draw.c | 8 ++-- > src/mesa/drivers/dri/i965/brw_draw_upload.c | 45 > ++- > src/mesa/drivers/dri/i965/brw_fs.cpp | 2 + > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 - > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 10 + > src/mesa/drivers/dri/i965/brw_vec4.cpp| 8 +++- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 10 - > src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 5 +++ > src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 - > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + > 12 files changed, 132 insertions(+), 11 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h > b/src/mesa/drivers/dri/i965/brw_compiler.h > index 58ee966..2333f4a 100644 > --- a/src/mesa/drivers/dri/i965/brw_compiler.h > +++ b/src/mesa/drivers/dri/i965/brw_compiler.h > @@ -549,6 +549,7 @@ struct brw_vs_prog_data { > bool uses_instanceid; > bool uses_basevertex; > bool uses_baseinstance; > + bool uses_drawid; > }; > > struct brw_tcs_prog_data > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 1378402..97ebf06 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -919,6 +919,15 @@ struct brw_context > */ >drm_intel_bo *draw_params_bo; >uint32_t draw_params_offset; > + > + /** > + * The value of gl_DrawID for the current _mesa_prim. 
This always comes > + * in from it's own vertex buffer since it's not part of the indirect > + * draw parameters. > + */ > + int gl_drawid; > + drm_intel_bo *draw_id_bo; > + uint32_t draw_id_offset; > } draw; > > struct { > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c > b/src/mesa/drivers/dri/i965/brw_draw.c > index 298ac06..b0710c67 100644 > --- a/src/mesa/drivers/dri/i965/brw_draw.c > +++ b/src/mesa/drivers/dri/i965/brw_draw.c > @@ -513,11 +513,9 @@ brw_try_draw_prims(struct gl_context *ctx, > >/* gl_DrawID always needs its own vertex buffer since it's not part of > * the indirect parameter buffer. */ The */ goes on its own line. > - if (brw->vs.prog_data->uses_drawid) { > - brw->draw.gl_drawid = prims[i].drawid; > - drm_intel_bo_unreference(brw->draw.draw_id_bo); > - brw->ctx.NewDriverState |= BRW_NEW_VERTICES; > - } > + brw->draw.gl_drawid = prims[i].draw_id; > + drm_intel_bo_unreference(brw->draw.draw_id_bo); > + brw->ctx.NewDriverState |= BRW_NEW_VERTICES; The previous patch (incorrectly) added this block, and it seems like this should be conditional on uses_drawid. 
> >if (brw->gen < 6) >brw_set_prim(brw, [i]); > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c > b/src/mesa/drivers/dri/i965/brw_draw_upload.c > index ccf963c..e601190 100644 > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c > @@ -599,6 +599,12 @@ brw_prepare_shader_draw_parameters(struct brw_context > *brw) > >draw.draw_params_bo, > >draw.draw_params_offset); > } > + > + if (brw->vs.prog_data->uses_drawid) { > + intel_upload_data(brw, >draw.gl_drawid, > sizeof(brw->draw.gl_drawid), 4, > + >draw.draw_id_bo, > +>draw.draw_id_offset); > + } > } > > /** > @@ -663,6 +669,8 @@ brw_emit_vertices(struct brw_context *brw) > if (brw->vs.prog_data->uses_vertexid || > brw->vs.prog_data->uses_instanceid || > brw->vs.prog_data->uses_basevertex || > brw->vs.prog_data->uses_baseinstance) >++nr_elements; > + if (brw->vs.prog_data->uses_drawid) > + nr_elements++; > > /* If the VS doesn't read any inputs (calculating vertex position from > * a state variable for some reason, for example), emit a single pad > @@ -699,7 +707,8 @@ brw_emit_vertices(struct brw_context *brw) > const bool uses_draw_params = >brw->vs.prog_data->uses_basevertex || >brw->vs.prog_data->uses_baseinstance; > - const unsigned nr_buffers = brw->vb.nr_buffers + uses_draw_params; > + const unsigned nr_buffers = brw->vb.nr_buffers + > + uses_draw_params + brw->vs.prog_data->uses_drawid; > > if (nr_buffers)
[Mesa-dev] [PATCH 0/5] i965: Non-overridden OpenGLES 3.1 context on Gen8+
git://people.freedesktop.org/~jljusten/mesa es31-gen8-v1

With this series, gen8+ should be able to create an OpenGLES 3.1 context
without any environment variable overrides.

Jordan Justen (5):
  main: Add MESA_VERBOSE=api for LinkProgram & UseProgram
  main: Allow compute shaders to be compiled with OpenGLES 3.1
  main/version: Don't require ARB_compute_shader for OpenGLES 3.1
  i965: Enable compute shaders in more cases for OpenGLES 3.1
  i965/screen: Allow OpenGLES 3.1 for gen8+

 src/mesa/drivers/dri/i965/brw_context.c  | 5 -
 src/mesa/drivers/dri/i965/intel_screen.c | 5 +
 src/mesa/main/shaderapi.c                | 7 ++-
 src/mesa/main/version.c                  | 9 ++---
 4 files changed, 21 insertions(+), 5 deletions(-)

-- 
2.6.2
[Mesa-dev] [PATCH 5/5] i965/screen: Allow OpenGLES 3.1 for gen8+
OpenGLES 3.1 cannot be enabled for gen 7 (Ivy Bridge, Haswell) since they
are still missing ARB_stencil_texturing.

Signed-off-by: Jordan Justen
Cc: Ian Romanick
Cc: Marta Lofstedt
---
 src/mesa/drivers/dri/i965/intel_screen.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 825a7c1..13498f4 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1338,6 +1338,11 @@ set_max_gl_versions(struct intel_screen *screen)
    switch (screen->devinfo->gen) {
    case 9:
    case 8:
+      psp->max_gl_core_version = 33;
+      psp->max_gl_compat_version = 30;
+      psp->max_gl_es1_version = 11;
+      psp->max_gl_es2_version = 31;
+      break;
    case 7:
    case 6:
       psp->max_gl_core_version = 33;
--
2.6.2
[Mesa-dev] [PATCH 2/5] main: Allow compute shaders to be compiled with OpenGLES 3.1
Previous OpenGLES 3.1 testing had been done with ARB_compute_shader
overridden to be enabled.

Signed-off-by: Jordan Justen
Cc: Marta Lofstedt
---
 src/mesa/main/shaderapi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index a732d83..e258ad9 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -208,7 +208,7 @@ _mesa_validate_shader_target(const struct gl_context *ctx, GLenum type)
    case GL_TESS_EVALUATION_SHADER:
       return ctx == NULL || _mesa_has_tessellation(ctx);
    case GL_COMPUTE_SHADER:
-      return ctx == NULL || ctx->Extensions.ARB_compute_shader;
+      return ctx == NULL || _mesa_has_compute_shaders(ctx);
    default:
       return false;
    }
--
2.6.2
[Mesa-dev] [PATCH 4/5] i965: Enable compute shaders in more cases for OpenGLES 3.1
Previously we were checking the desktop OpenGL ARB_compute_shader
requirements, but for OpenGLES 3.1, the requirements are lower.

Signed-off-by: Jordan Justen
Cc: Marta Lofstedt
---
 src/mesa/drivers/dri/i965/brw_context.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 0abe601..5105625 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -377,7 +377,10 @@ brw_initialize_context_constants(struct brw_context *brw)
       [MESA_SHADER_GEOMETRY] = brw->gen >= 6,
       [MESA_SHADER_FRAGMENT] = true,
       [MESA_SHADER_COMPUTE] =
-         (ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
+         (ctx->API == API_OPENGL_CORE &&
+          ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
+         (ctx->API == API_OPENGLES2 &&
+          ctx->Const.MaxComputeWorkGroupSize[0] >= 128) ||
          _mesa_extension_override_enables.ARB_compute_shader,
    };

--
2.6.2
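The per-API condition in this hunk can be read as a small predicate. The following is only a minimal sketch: `enum sketch_api` and `sketch_compute_supported` are hypothetical stand-ins invented for illustration, not Mesa's `gl_context` or `gl_api`. Only the 1024 and 128 thresholds and the override term are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for Mesa's gl_api enum; not the real definition. */
enum sketch_api {
   SKETCH_API_OPENGL_COMPAT,
   SKETCH_API_OPENGL_CORE,
   SKETCH_API_OPENGLES2
};

/*
 * Mirrors the condition in the patch: a desktop core context needs a work
 * group size of at least 1024 in X, an OpenGLES context only needs 128, and
 * the extension override turns the feature on regardless of API.
 */
static bool
sketch_compute_supported(enum sketch_api api,
                         unsigned max_work_group_size_x,
                         bool override_enabled)
{
   return (api == SKETCH_API_OPENGL_CORE && max_work_group_size_x >= 1024) ||
          (api == SKETCH_API_OPENGLES2 && max_work_group_size_x >= 128) ||
          override_enabled;
}
```

Read this way, a compatibility-profile context only gets compute shaders through the override, whatever its work group size limit is.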
[Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1
The OpenGL ARB_compute_shader extension specification requires at least
1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1 only
requires 128.

Signed-off-by: Jordan Justen
Cc: Ian Romanick
Cc: Marta Lofstedt
---
 src/mesa/main/version.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index e92bb11..112a73d 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions *extensions)
 }

 static GLuint
-compute_version_es2(const struct gl_extensions *extensions)
+compute_version_es2(const struct gl_extensions *extensions,
+                    const struct gl_constants *consts)
 {
    /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
    const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
@@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions *extensions)
                          extensions->EXT_texture_snorm &&
                          extensions->NV_primitive_restart &&
                          extensions->OES_depth_texture_cube_map);
+   const bool es31_compute_shader =
+      consts->MaxComputeWorkGroupInvocations >= 128;
    const bool ver_3_1 = (ver_3_0 &&
                          extensions->ARB_arrays_of_arrays &&
-                         extensions->ARB_compute_shader &&
+                         es31_compute_shader &&
                          extensions->ARB_draw_indirect &&
                          extensions->ARB_explicit_uniform_location &&
                          extensions->ARB_framebuffer_no_attachments &&
@@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions *extensions,
    case API_OPENGLES:
       return compute_version_es1(extensions);
    case API_OPENGLES2:
-      return compute_version_es2(extensions);
+      return compute_version_es2(extensions, consts);
    }
    return 0;
 }
--
2.6.2
[Mesa-dev] [PATCH 1/5] main: Add MESA_VERBOSE=api for LinkProgram & UseProgram
Signed-off-by: Jordan Justen
---
 src/mesa/main/shaderapi.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index ac40891..a732d83 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1514,6 +1514,8 @@ void GLAPIENTRY
 _mesa_LinkProgram(GLhandleARB programObj)
 {
    GET_CURRENT_CONTEXT(ctx);
+   if (MESA_VERBOSE & VERBOSE_API)
+      _mesa_debug(ctx, "glLinkProgram %u\n", programObj);
    link_program(ctx, programObj);
 }

@@ -1731,6 +1733,9 @@ _mesa_UseProgram(GLhandleARB program)
    GET_CURRENT_CONTEXT(ctx);
    struct gl_shader_program *shProg;

+   if (MESA_VERBOSE & VERBOSE_API)
+      _mesa_debug(ctx, "glUseProgram %u\n", program);
+
    if (_mesa_is_xfb_active_and_unpaused(ctx)) {
       _mesa_error(ctx, GL_INVALID_OPERATION,
                   "glUseProgram(transform feedback active)");
--
2.6.2
[Mesa-dev] [PATCH] svga: don't use debug code in update_state() in release builds
---
 src/gallium/drivers/svga/svga_state.c | 4
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_state.c b/src/gallium/drivers/svga/svga_state.c
index 722b369..4479a27 100644
--- a/src/gallium/drivers/svga/svga_state.c
+++ b/src/gallium/drivers/svga/svga_state.c
@@ -129,7 +129,11 @@ update_state(struct svga_context *svga,
              const struct svga_tracked_state *atoms[],
              unsigned *state)
 {
+#ifdef DEBUG
    boolean debug = TRUE;
+#else
+   boolean debug = FALSE;
+#endif
    enum pipe_error ret = PIPE_OK;
    unsigned i;

--
1.9.1
[Mesa-dev] [PATCH 2/2] st/osmesa: add OSMesaCreateContextAttribs() function
As with the previous commit, except for gallium.
---
 src/gallium/state_trackers/osmesa/osmesa.c | 96 +-
 1 file changed, 93 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/osmesa/osmesa.c b/src/gallium/state_trackers/osmesa/osmesa.c
index 0f27ba8..ee78910 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -544,11 +544,39 @@ GLAPI OSMesaContext GLAPIENTRY
 OSMesaCreateContextExt(GLenum format, GLint depthBits, GLint stencilBits,
                        GLint accumBits, OSMesaContext sharelist)
 {
+   int attribs[100], n = 0;
+
+   attribs[n++] = OSMESA_FORMAT;
+   attribs[n++] = format;
+   attribs[n++] = OSMESA_DEPTH_BITS;
+   attribs[n++] = depthBits;
+   attribs[n++] = OSMESA_STENCIL_BITS;
+   attribs[n++] = stencilBits;
+   attribs[n++] = OSMESA_ACCUM_BITS;
+   attribs[n++] = accumBits;
+   attribs[n++] = 0;
+
+   return OSMesaCreateContextAttribs(attribs, sharelist);
+}
+
+
+/**
+ * New in Mesa 11.2
+ *
+ * Create context with attribute list.
+ */
+GLAPI OSMesaContext GLAPIENTRY
+OSMesaCreateContextAttribs(const int *attribList, OSMesaContext sharelist)
+{
    OSMesaContext osmesa;
    struct st_context_iface *st_shared;
    enum st_context_error st_error = 0;
    struct st_context_attribs attribs;
    struct st_api *stapi = get_st_api();
+   GLenum format = GL_RGBA;
+   int depthBits = 0, stencilBits = 0, accumBits = 0;
+   int profile = OSMESA_COMPAT_PROFILE, version_major = 1, version_minor = 0;
+   int i;

    if (sharelist) {
       st_shared = sharelist->stctx;
@@ -561,6 +589,64 @@ OSMesaCreateContextExt(GLenum format, GLint depthBits, GLint stencilBits,
    if (!osmesa)
       return NULL;

+   for (i = 0; attribList[i]; i += 2) {
+      switch (attribList[i]) {
+      case OSMESA_FORMAT:
+         format = attribList[i+1];
+         switch (format) {
+         case OSMESA_COLOR_INDEX:
+         case OSMESA_RGBA:
+         case OSMESA_BGRA:
+         case OSMESA_ARGB:
+         case OSMESA_RGB:
+         case OSMESA_BGR:
+         case OSMESA_RGB_565:
+            /* legal */
+            break;
+         default:
+            return NULL;
+         }
+         break;
+      case OSMESA_DEPTH_BITS:
+         depthBits = attribList[i+1];
+         if (depthBits < 0)
+            return NULL;
+         break;
+      case OSMESA_STENCIL_BITS:
+         stencilBits = attribList[i+1];
+         if (stencilBits < 0)
+            return NULL;
+         break;
+      case OSMESA_ACCUM_BITS:
+         accumBits = attribList[i+1];
+         if (accumBits < 0)
+            return NULL;
+         break;
+      case OSMESA_PROFILE:
+         profile = attribList[i+1];
+         if (profile != OSMESA_CORE_PROFILE &&
+             profile != OSMESA_COMPAT_PROFILE)
+            return NULL;
+         break;
+      case OSMESA_CONTEXT_MAJOR_VERSION:
+         version_major = attribList[i+1];
+         if (version_major < 1)
+            return NULL;
+         break;
+      case OSMESA_CONTEXT_MINOR_VERSION:
+         version_minor = attribList[i+1];
+         if (version_minor < 0)
+            return NULL;
+         break;
+      case 0:
+         /* end of list */
+         break;
+      default:
+         fprintf(stderr, "Bad attribute in OSMesaCreateContextAttribs()\n");
+         return NULL;
+      }
+   }
+
    /* Choose depth/stencil/accum buffer formats */
    if (accumBits > 0) {
       osmesa->accum_format = PIPE_FORMAT_R16G16B16A16_SNORM;
@@ -581,9 +667,11 @@ OSMesaCreateContextExt(GLenum format, GLint depthBits, GLint stencilBits,
    /*
     * Create the rendering context
     */
-   attribs.profile = ST_PROFILE_DEFAULT;
-   attribs.major = 2;
-   attribs.minor = 1;
+   memset(&attribs, 0, sizeof(attribs));
+   attribs.profile = (profile == OSMESA_CORE_PROFILE)
+      ? ST_PROFILE_OPENGL_CORE : ST_PROFILE_DEFAULT;
+   attribs.major = version_major;
+   attribs.minor = version_minor;
    attribs.flags = 0;  /* ST_CONTEXT_FLAG_x */
    attribs.options.force_glsl_extensions_warn = FALSE;
    attribs.options.disable_blend_func_extended = FALSE;
@@ -614,6 +702,7 @@ OSMesaCreateContextExt(GLenum format, GLint depthBits, GLint stencilBits,
 }


+
 /**
  * Destroy an Off-Screen Mesa rendering context.
  *
@@ -883,6 +972,7 @@ struct name_function
 static struct name_function functions[] = {
    { "OSMesaCreateContext", (OSMESAproc) OSMesaCreateContext },
    { "OSMesaCreateContextExt", (OSMESAproc) OSMesaCreateContextExt },
+   { "OSMesaCreateContextAttribs", (OSMESAproc) OSMesaCreateContextAttribs },
    { "OSMesaDestroyContext", (OSMESAproc) OSMesaDestroyContext },
    { "OSMesaMakeCurrent", (OSMESAproc) OSMesaMakeCurrent },
    { "OSMesaGetCurrentContext", (OSMESAproc) OSMesaGetCurrentContext },
--
1.9.1
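The attribute-list walk in this patch follows the usual (attribute, value) pair convention with a 0 terminator. Below is a minimal, self-contained sketch of that loop. The `SK_*` tokens, `struct sk_config` and `sk_parse_attribs` are hypothetical names made up for this note; the numeric values only mirror the OSMESA_* defines from the 1/2 header hunk, and only a subset of the attributes is handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tokens; the real ones are the OSMESA_* defines in GL/osmesa.h. */
enum {
   SK_DEPTH_BITS = 0x30,
   SK_STENCIL_BITS = 0x31,
   SK_MAJOR_VERSION = 0x36,
   SK_MINOR_VERSION = 0x37
};

/* Hypothetical result struct, used only by this sketch. */
struct sk_config {
   int depth_bits;
   int stencil_bits;
   int major;
   int minor;
};

/*
 * Walk (attribute, value) pairs until the 0 terminator.  Returns false on
 * an unknown attribute or an out-of-range value, which is where the patch
 * returns NULL.
 */
static bool
sk_parse_attribs(const int *list, struct sk_config *out)
{
   /* Defaults as in the patch: no depth/stencil buffer, version 1.0. */
   out->depth_bits = 0;
   out->stencil_bits = 0;
   out->major = 1;
   out->minor = 0;

   for (size_t i = 0; list[i]; i += 2) {
      const int value = list[i + 1];

      switch (list[i]) {
      case SK_DEPTH_BITS:
         if (value < 0)
            return false;
         out->depth_bits = value;
         break;
      case SK_STENCIL_BITS:
         if (value < 0)
            return false;
         out->stencil_bits = value;
         break;
      case SK_MAJOR_VERSION:
         if (value < 1)
            return false;
         out->major = value;
         break;
      case SK_MINOR_VERSION:
         if (value < 0)
            return false;
         out->minor = value;
         break;
      default:
         return false;
      }
   }
   return true;
}
```

A caller would pass something like `{ SK_DEPTH_BITS, 24, SK_MAJOR_VERSION, 3, SK_MINOR_VERSION, 3, 0 }`. One design point the sketch makes visible: parsing into a plain struct before allocating anything keeps the error returns free of cleanup, whereas, as far as the hunk shows, the patch allocates the context before the loop.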
Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
Hardly a complete review, but a handful of comments:

On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/Makefile.sources                 |   1 +
>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 ++
>  src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
>  src/mesa/state_tracker/st_atom_constbuf.c |  14 +
>  src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
>  src/mesa/state_tracker/st_cb_program.c    |  35 +-
>  src/mesa/state_tracker/st_program.c       |  22 +
>  src/mesa/state_tracker/st_program.h       |   1 +
>  8 files changed, 920 insertions(+), 1 deletion(-)
>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
>
> +static struct ureg_src prepare_argument(struct st_translate *t, const unsigned argId,
> +      const struct atifragshader_src_register *srcReg)
> +{
> +   struct ureg_src src = get_source(t, srcReg->Index);
> +   struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
> +
> +   switch (srcReg->argRep) {
> +   case GL_NONE:
> +      break;
> +   case GL_RED:
> +      src = ureg_swizzle(src,
> +            TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X);
> +      break;
> +   case GL_GREEN:
> +      src = ureg_swizzle(src,
> +            TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y);
> +      break;
> +   case GL_BLUE:
> +      src = ureg_swizzle(src,
> +            TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z);
> +      break;
> +   case GL_ALPHA:
> +      src = ureg_swizzle(src,
> +            TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W);
> +      break;
> +   }
> +   emit_insn(t, TGSI_OPCODE_MOV, &arg, 1, &src, 1);
> +
> +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
> +      struct ureg_src modsrc[2];
> +      modsrc[0] = ureg_imm1f(t->ureg, 1.0);
> +      modsrc[1] = ureg_src(arg);
> +
> +      emit_insn(t, TGSI_OPCODE_SUB, &arg, 1, modsrc, 2);
> +   }
> +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
> +      struct ureg_src modsrc[2];
> +      modsrc[0] = ureg_src(arg);
> +      modsrc[1] = ureg_imm1f(t->ureg, 0.5);
> +
> +      emit_insn(t, TGSI_OPCODE_SUB, &arg, 1, modsrc, 2);
> +   }
> +   if (srcReg->argMod & GL_2X_BIT_ATI) {
> +      struct ureg_src modsrc[2];
> +      modsrc[0] = ureg_src(arg);
> +      modsrc[1] = ureg_imm1f(t->ureg, 2.0);
> +
> +      emit_insn(t, TGSI_OPCODE_MUL, &arg, 1, modsrc, 2);

aka ADD arg, arg, arg

> +   }
> +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
> +      struct ureg_src modsrc[2];
> +      modsrc[0] = ureg_src(arg);
> +      modsrc[1] = ureg_imm1f(t->ureg, -1.0);
> +
> +      emit_insn(t, TGSI_OPCODE_MUL, &arg, 1, modsrc, 2);

aka NEG arg, arg

> +   }
> +   return ureg_src(arg);
> +}
> +
> +/* These instructions have no direct equivalent in TGSI */
> +static void emit_special_inst(struct st_translate *t, struct instruction_desc *desc,
> +      struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
> +{
> +   struct ureg_dst tmp[1];
> +   struct ureg_src src[3];
> +
> +   if(desc->special == 1) {
> +      tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3
> +      src[0] = ureg_imm1f(t->ureg, 0.5f);
> +      src[1] = args[2];
> +      emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
> +      src[0] = ureg_src(tmp[0]);
> +      src[1] = args[0];
> +      src[2] = args[1];
> +      emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
> +   } else if (desc->special == 2) {
> +      tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3
> +      src[0] = args[2];
> +      src[1] = ureg_imm1f(t->ureg, 0.0f);
> +      emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
> +      src[0] = ureg_src(tmp[0]);
> +      src[1] = args[0];
> +      src[2] = args[1];
> +      emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);

Isn't this the CMP instruction? Just flip the args.

http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP

The other one should be expressible as CMP as well I think.

> +   } else if (desc->special == 3) {
> +      src[0] = args[0];
> +      src[1] = args[1];
> +      src[2] = ureg_swizzle(args[2],
> +            TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z);
> +      emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3);
> +   }
> +}
> +
> +static void emit_arith_inst(struct st_translate *t,
> +      struct instruction_desc *desc,
> +      struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
> +{
> +   if (desc->special) {
> +      return emit_special_inst(t, desc, dst, args, argcount);
> +   }
> +
> +   emit_insn(t, desc->TGSI_opcode, dst, 1, args, argcount);
> +}
> +
> +static void emit_dstmod(struct st_translate *t,
> +      struct ureg_dst dst, GLuint dstMod)
> +{
> +   float imm = 0.0;

1.0 right? (if you just have the saturate bit)

> +   struct ureg_src src[3];
> +
> +   if
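The CMP suggestion above can be checked per component with scalar stand-ins. This is a sketch under stated assumptions, not code from the patch: `tgsi_cmp` models the documented TGSI CMP semantics (dst = src0 < 0 ? src1 : src2), the `*_reference` functions model what the quoted SLT/SGE + LRP sequences select for finite inputs, and the `*_via_cmp` rewrites, including the extra `0.5 - c` subtraction for CND, are one possible formulation chosen for illustration.

```c
/* TGSI CMP, per component: dst = (src0 < 0.0) ? src1 : src2. */
static float
tgsi_cmp(float src0, float src1, float src2)
{
   return (src0 < 0.0f) ? src1 : src2;
}

/* What the quoted SLT + LRP pair selects: a when c > 0.5, otherwise b. */
static float
cnd_reference(float a, float b, float c)
{
   return (c > 0.5f) ? a : b;
}

/* What the quoted SGE + LRP pair selects: a when c >= 0, otherwise b. */
static float
cnd0_reference(float a, float b, float c)
{
   return (c >= 0.0f) ? a : b;
}

/* CND0 via CMP: same test operand, with the two value operands flipped. */
static float
cnd0_via_cmp(float a, float b, float c)
{
   return tgsi_cmp(c, b, a);
}

/* CND via CMP: (0.5 - c) < 0 holds exactly when c > 0.5. */
static float
cnd_via_cmp(float a, float b, float c)
{
   return tgsi_cmp(0.5f - c, a, b);
}
```

For CND0 this needs a single CMP with `args[2]` as the condition and `args[1]`, `args[0]` as the values. The CND form still costs one subtraction before the CMP, and the equivalence at the boundary assumes IEEE subtraction without flush-to-zero.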
Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
On 12/15/2015 04:40 PM, Ilia Mirkin wrote: > Hardly a complete review, but a handful of comments: > > On Tue, Dec 15, 2015 at 6:05 PM, Miklós Mátéwrote: >> --- >> src/mesa/Makefile.sources | 1 + >> src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 >> ++ >> src/mesa/state_tracker/st_atifs_to_tgsi.h | 49 ++ >> src/mesa/state_tracker/st_atom_constbuf.c | 14 + >> src/mesa/state_tracker/st_cb_drawpixels.c | 1 + >> src/mesa/state_tracker/st_cb_program.c| 35 +- >> src/mesa/state_tracker/st_program.c | 22 + >> src/mesa/state_tracker/st_program.h | 1 + >> 8 files changed, 920 insertions(+), 1 deletion(-) >> create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c >> create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h >> >> +static struct ureg_src prepare_argument(struct st_translate *t, const >> unsigned argId, >> + const struct atifragshader_src_register *srcReg) >> +{ >> + struct ureg_src src = get_source(t, srcReg->Index); >> + struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId); >> + >> + switch (srcReg->argRep) { >> + case GL_NONE: >> + break; >> + case GL_RED: >> + src = ureg_swizzle(src, >> + TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, >> TGSI_SWIZZLE_X); >> + break; >> + case GL_GREEN: >> + src = ureg_swizzle(src, >> + TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, >> TGSI_SWIZZLE_Y); >> + break; >> + case GL_BLUE: >> + src = ureg_swizzle(src, >> + TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, >> TGSI_SWIZZLE_Z); >> + break; >> + case GL_ALPHA: >> + src = ureg_swizzle(src, >> + TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, >> TGSI_SWIZZLE_W); >> + break; >> + } >> + emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1); >> + >> + if (srcReg->argMod & GL_COMP_BIT_ATI) { >> + struct ureg_src modsrc[2]; >> + modsrc[0] = ureg_imm1f(t->ureg, 1.0); >> + modsrc[1] = ureg_src(arg); >> + >> + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); >> + } >> + if (srcReg->argMod & GL_BIAS_BIT_ATI) { >> + struct ureg_src modsrc[2]; >> + modsrc[0] = 
ureg_src(arg); >> + modsrc[1] = ureg_imm1f(t->ureg, 0.5); >> + >> + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); >> + } >> + if (srcReg->argMod & GL_2X_BIT_ATI) { >> + struct ureg_src modsrc[2]; >> + modsrc[0] = ureg_src(arg); >> + modsrc[1] = ureg_imm1f(t->ureg, 2.0); >> + >> + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); > > aka ADD arg, arg, arg > >> + } >> + if (srcReg->argMod & GL_NEGATE_BIT_ATI) { >> + struct ureg_src modsrc[2]; >> + modsrc[0] = ureg_src(arg); >> + modsrc[1] = ureg_imm1f(t->ureg, -1.0); >> + >> + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); > > aka NEG arg, arg > >> + } >> + return ureg_src(arg); >> +} >> + >> +/* These instructions have no direct equivalent in TGSI */ >> +static void emit_special_inst(struct st_translate *t, struct >> instruction_desc *desc, >> + struct ureg_dst *dst, struct ureg_src *args, unsigned argcount) >> +{ >> + struct ureg_dst tmp[1]; >> + struct ureg_src src[3]; >> + >> + if(desc->special == 1) { >> + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose >> a3 >> + src[0] = ureg_imm1f(t->ureg, 0.5f); >> + src[1] = args[2]; >> + emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2); >> + src[0] = ureg_src(tmp[0]); >> + src[1] = args[0]; >> + src[2] = args[1]; >> + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); >> + } else if (desc->special == 2) { >> + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose >> a3 >> + src[0] = args[2]; >> + src[1] = ureg_imm1f(t->ureg, 0.0f); >> + emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2); >> + src[0] = ureg_src(tmp[0]); >> + src[1] = args[0]; >> + src[2] = args[1]; >> + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); > > Isn't this the CMP instruction? Just flip the args. > > http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP > > The other one should be expressible as CMP as well I think. 
> >> + } else if (desc->special == 3) { >> + src[0] = args[0]; >> + src[1] = args[1]; >> + src[2] = ureg_swizzle(args[2], >> +TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z); >> + emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3); >> + } >> +} >> + >> +static void emit_arith_inst(struct st_translate *t, >> + struct instruction_desc *desc, >> + struct ureg_dst *dst, struct ureg_src *args, unsigned argcount) >> +{ >> + if (desc->special) { >> + return emit_special_inst(t, desc, dst, args, argcount); >> + } >> + >> + emit_insn(t, desc->TGSI_opcode, dst, 1, args, argcount); >> +} >> + >> +static void
Re: [Mesa-dev] [PATCH 4/5] i965: Enable compute shaders in more cases for OpenGLES 3.1
Doesn't this make patch 3 irrelevant? FWIW, I like this better.

On 12/15/2015 04:08 PM, Jordan Justen wrote:
> Previously we were checking the desktop OpenGL ARB_compute_shader
> requirements, but for OpenGLES 3.1, the requirements are lower.
>
> Signed-off-by: Jordan Justen
> Cc: Marta Lofstedt
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 0abe601..5105625 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -377,7 +377,10 @@ brw_initialize_context_constants(struct brw_context *brw)
>        [MESA_SHADER_GEOMETRY] = brw->gen >= 6,
>        [MESA_SHADER_FRAGMENT] = true,
>        [MESA_SHADER_COMPUTE] =
> -         (ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> +         (ctx->API == API_OPENGL_CORE &&
> +          ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> +         (ctx->API == API_OPENGLES2 &&
> +          ctx->Const.MaxComputeWorkGroupSize[0] >= 128) ||
>           _mesa_extension_override_enables.ARB_compute_shader,
>     };
>
>
Re: [Mesa-dev] [PATCH 7/7] i965: Reduce vertex state reemission
On 12/15/2015 12:28 AM, Kristian Høgsberg Kristensen wrote:
> We can inspect VS prog_data for iterations i > 0, and only flag
> BRW_NEW_VERTICES when one of our system values change.
>
> This change also flags BRW_NEW_VERTICES in one case we were missing
> before: if we're doing an indirect draw, prims[i].basevertex is always 0
> and the real base vertex value is in the indirect parameter
> buffer. Thus, if a program uses base vertex or base instance, and the
> draw call is indirect, flag BRW_NEW_VERTICES. A new piglit test,
> spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c | 44
>  1 file changed, 40 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index b0710c67..9e400ca 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -491,9 +491,44 @@ brw_try_draw_prims(struct gl_context *ctx,
>           }
>        }
>
> -      brw->draw.params.gl_basevertex =
> +      /* Determine if we need to flag BRW_NEW_VERTICES for updating the
> +       * gl_BaseVertexARB, gl_BaseInstanceARB or gl_DrawIDARB values. As
> +       * above, we don't need to check first iteration, since the flag is set
> +       * before the loop. We also can't rely on vs prog_data in the first
> +       * iteration, but after drawing once, we've uploaded the programs and
> +       * can look at prog_data.
> +       *
> +       * Despite the prims[] name, eache iteration correspond to a draw call

each                                 corresponds

> +       * from a glMulti* style draw call. We need to re-upload vertex state if
> +       *
> +       *   1) the program uses gl_DrawIDARB (changes every iteration),
> +       *
> +       *   2) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
> +       *      draw call is indirect (meaning we can't check if the value change
> +       *      or not), or
> +       *
> +       *   3) the program uses gl_BaseVertexARB or gl_BaseInstanceARB and the
> +       *      value changed
> +       */
> +      const int new_basevertex =
>           prims[i].indexed ? prims[i].basevertex : prims[i].start;
> -      brw->draw.params.gl_baseinstance = prims[i].base_instance;
> +      const int new_baseinstance = prims[i].base_instance;
> +      if (i > 0) {
> +         const bool uses_draw_parameters =
> +            brw->vs.prog_data->uses_basevertex ||
> +            brw->vs.prog_data->uses_baseinstance;
> +
> +         if (brw->vs.prog_data->uses_drawid ||
> +             (uses_draw_parameters && prims[i].is_indirect) ||
> +             (brw->vs.prog_data->uses_basevertex &&
> +              brw->draw.params.gl_basevertex != new_basevertex) ||
> +             (brw->vs.prog_data->uses_baseinstance &&
> +              brw->draw.params.gl_baseinstance != new_baseinstance))
> +            brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +      }
> +
> +      brw->draw.params.gl_basevertex = new_basevertex;
> +      brw->draw.params.gl_baseinstance = new_baseinstance;
>        drm_intel_bo_unreference(brw->draw.draw_params_bo);
>
>        if (prims[i].is_indirect) {
> @@ -512,10 +547,11 @@ brw_try_draw_prims(struct gl_context *ctx,
>        }
>
>        /* gl_DrawID always needs its own vertex buffer since it's not part of
> -       * the indirect parameter buffer. */
> +       * the indirect parameter buffer.
> +       */

Lol

>        brw->draw.gl_drawid = prims[i].draw_id;
>        drm_intel_bo_unreference(brw->draw.draw_id_bo);
> -      brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
> +      brw->draw.draw_id_bo = NULL;

It seems odd that this change is in this patch. Should it have always
been after the drm_intel_bo_unreference call?

>
>        if (brw->gen < 6)
>           brw_set_prim(brw, &prims[i]);
>
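The three numbered cases in the quoted comment reduce to one boolean expression. A minimal sketch of that decision follows, pulled out of the driver so it can be tested in isolation; `struct sk_vs_flags` and `sk_needs_new_vertices` are hypothetical names for this note, and the caller is assumed to invoke it only for iterations i > 0, as the patch does.

```c
#include <stdbool.h>

/* Hypothetical condensed view of the VS prog_data flags the patch reads. */
struct sk_vs_flags {
   bool uses_basevertex;
   bool uses_baseinstance;
   bool uses_drawid;
};

/*
 * Returns true when a later iteration of a multi-draw must re-upload vertex
 * state, following the three cases listed in the patch comment.
 */
static bool
sk_needs_new_vertices(const struct sk_vs_flags *vs, bool is_indirect,
                      int old_basevertex, int new_basevertex,
                      int old_baseinstance, int new_baseinstance)
{
   const bool uses_draw_parameters =
      vs->uses_basevertex || vs->uses_baseinstance;

   /* Case 1: gl_DrawIDARB changes on every iteration. */
   if (vs->uses_drawid)
      return true;

   /* Case 2: indirect draw, so the real values cannot be compared here. */
   if (uses_draw_parameters && is_indirect)
      return true;

   /* Case 3: a value the program actually reads has changed. */
   return (vs->uses_basevertex && old_basevertex != new_basevertex) ||
          (vs->uses_baseinstance && old_baseinstance != new_baseinstance);
}
```

Note that a changed base instance does not trigger a re-upload when the program only reads the base vertex, which is where the reduction in state re-emission comes from.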
[Mesa-dev] [PATCH 1/2] osmesa: add new OSMesaCreateContextAttribs function
This allows specifying a GL profile and version so one can get a core- profile context. --- docs/relnotes/11.2.0.html| 2 + include/GL/osmesa.h | 45 - src/mesa/drivers/osmesa/osmesa.c | 104 ++- 3 files changed, 148 insertions(+), 3 deletions(-) diff --git a/docs/relnotes/11.2.0.html b/docs/relnotes/11.2.0.html index 12e0f07..e382856 100644 --- a/docs/relnotes/11.2.0.html +++ b/docs/relnotes/11.2.0.html @@ -56,6 +56,8 @@ Note: some of the new features are only available with certain drivers. GL_ARB_vertex_type_10f_11f_11f_rev on freedreno/a4xx GL_KHR_texture_compression_astc_ldr on freedreno/a4xx GL_AMD_performance_monitor on radeonsi (CIK+ only) +New OSMesaCreateContextAttribs() function (for creating core profile +contexts) Bug fixes diff --git a/include/GL/osmesa.h b/include/GL/osmesa.h index ca0d167..39cd54e 100644 --- a/include/GL/osmesa.h +++ b/include/GL/osmesa.h @@ -58,8 +58,8 @@ extern "C" { #include -#define OSMESA_MAJOR_VERSION 10 -#define OSMESA_MINOR_VERSION 0 +#define OSMESA_MAJOR_VERSION 11 +#define OSMESA_MINOR_VERSION 2 #define OSMESA_PATCH_VERSION 0 @@ -95,6 +95,18 @@ extern "C" { #define OSMESA_MAX_WIDTH 0x24 /* new in 4.0 */ #define OSMESA_MAX_HEIGHT 0x25 /* new in 4.0 */ +/* + * Accepted in OSMesaCreateContextAttrib's attribute list. + */ +#define OSMESA_DEPTH_BITS0x30 +#define OSMESA_STENCIL_BITS 0x31 +#define OSMESA_ACCUM_BITS0x32 +#define OSMESA_PROFILE 0x33 +#define OSMESA_CORE_PROFILE 0x34 +#define OSMESA_COMPAT_PROFILE0x35 +#define OSMESA_CONTEXT_MAJOR_VERSION 0x36 +#define OSMESA_CONTEXT_MINOR_VERSION 0x37 + typedef struct osmesa_context *OSMesaContext; @@ -128,6 +140,35 @@ OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits, /* + * Create an Off-Screen Mesa rendering context with attribute list. + * The list is composed of (attribute, value) pairs and terminated with + * attribute==0. Supported Attributes: + * + * AttributesValues + * -- + * OSMESA_FORMAT OSMESA_RGBA*, OSMESA_BGRA, OSMESA_ARGB, etc. 
+ * OSMESA_DEPTH_BITS 0*, 16, 24, 32 + * OSMESA_STENCIL_BITS 0*, 8 + * OSMESA_ACCUM_BITS 0*, 16 + * OSMESA_PROFILEOSMESA_COMPAT_PROFILE*, OSMESA_CORE_PROFILE + * OSMESA_CONTEXT_MAJOR_VERSION 1*, 2, 3 + * OSMESA_CONTEXT_MINOR_VERSION 0+ + * + * Note: * = default value + * + * We return a context version >= what's specified by OSMESA_CONTEXT_MAJOR/ + * MINOR_VERSION for the given profile. For example, if you request a GL 1.4 + * compat profile, you might get a GL 3.0 compat profile. + * Otherwise, null is returned if the version/profile is not supported. + * + * New in Mesa 11.2 + */ +GLAPI OSMesaContext GLAPIENTRY +OSMesaCreateContextAttribs( const int *attribList, OSMesaContext sharelist ); + + + +/* * Destroy an Off-Screen Mesa rendering context. * * Input: ctx - the context to destroy diff --git a/src/mesa/drivers/osmesa/osmesa.c b/src/mesa/drivers/osmesa/osmesa.c index 5c7dcac..8f14dfd 100644 --- a/src/mesa/drivers/osmesa/osmesa.c +++ b/src/mesa/drivers/osmesa/osmesa.c @@ -645,10 +645,104 @@ GLAPI OSMesaContext GLAPIENTRY OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits, GLint accumBits, OSMesaContext sharelist ) { + int attribs[100], n = 0; + + attribs[n++] = OSMESA_FORMAT; + attribs[n++] = format; + attribs[n++] = OSMESA_DEPTH_BITS; + attribs[n++] = depthBits; + attribs[n++] = OSMESA_STENCIL_BITS; + attribs[n++] = stencilBits; + attribs[n++] = OSMESA_ACCUM_BITS; + attribs[n++] = accumBits; + attribs[n++] = 0; + + return OSMesaCreateContextAttribs(attribs, sharelist); +} + + +/** + * New in Mesa 11.2 + * + * Create context with attribute list. 
+ */ +GLAPI OSMesaContext GLAPIENTRY +OSMesaCreateContextAttribs(const int *attribList, OSMesaContext sharelist) +{ OSMesaContext osmesa; struct dd_function_table functions; GLint rind, gind, bind, aind; GLint redBits = 0, greenBits = 0, blueBits = 0, alphaBits =0; + GLenum format = OSMESA_RGBA; + GLint depthBits = 0, stencilBits = 0, accumBits = 0; + int profile = OSMESA_COMPAT_PROFILE, version_major = 1, version_minor = 0; + gl_api api_profile = API_OPENGL_COMPAT; + int i; + + osmesa = (OSMesaContext) CALLOC_STRUCT(osmesa_context); + if (!osmesa) + return NULL; + + for (i = 0; attribList[i]; i += 2) { + switch (attribList[i]) { + case OSMESA_FORMAT: + format = attribList[i+1]; + switch (format) { + case OSMESA_COLOR_INDEX: + case OSMESA_RGBA: + case OSMESA_BGRA: + case OSMESA_ARGB: + case OSMESA_RGB: + case OSMESA_BGR: + case OSMESA_RGB_565: +/* legal */ +
[Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
--- src/mesa/Makefile.sources | 1 + src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 ++ src/mesa/state_tracker/st_atifs_to_tgsi.h | 49 ++ src/mesa/state_tracker/st_atom_constbuf.c | 14 + src/mesa/state_tracker/st_cb_drawpixels.c | 1 + src/mesa/state_tracker/st_cb_program.c| 35 +- src/mesa/state_tracker/st_program.c | 22 + src/mesa/state_tracker/st_program.h | 1 + 8 files changed, 920 insertions(+), 1 deletion(-) create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index ed9848c..a8e645d 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -390,6 +390,7 @@ VBO_FILES = \ vbo/vbo_split_inplace.c STATETRACKER_FILES = \ + state_tracker/st_atifs_to_tgsi.c \ state_tracker/st_atom_array.c \ state_tracker/st_atom_blend.c \ state_tracker/st_atom.c \ diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c b/src/mesa/state_tracker/st_atifs_to_tgsi.c new file mode 100644 index 000..1d704cb --- /dev/null +++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c @@ -0,0 +1,798 @@ + +#include "main/mtypes.h" +#include "main/atifragshader.h" +#include "main/texobj.h" +#include "main/errors.h" +#include "program/prog_parameter.h" + +#include "tgsi/tgsi_ureg.h" +#include "util/u_math.h" +#include "util/u_memory.h" + +#include "st_program.h" +#include "st_atifs_to_tgsi.h" + +/** + * Intermediate state used during shader translation. 
+ */ +struct st_translate { + struct ureg_program *ureg; + struct gl_context *ctx; + struct ati_fragment_shader *atifs; + + struct ureg_dst temps[MAX_PROGRAM_TEMPS]; + struct ureg_src *constants; + struct ureg_dst outputs[PIPE_MAX_SHADER_OUTPUTS]; + struct ureg_src inputs[PIPE_MAX_SHADER_INPUTS]; + struct ureg_dst address[1]; + struct ureg_src samplers[PIPE_MAX_SAMPLERS]; + struct ureg_src systemValues[SYSTEM_VALUE_MAX]; + + const GLuint *inputMapping; + const GLuint *outputMapping; + + /* Keep a record of the tgsi instruction number that each mesa +* instruction starts at, will be used to fix up labels after +* translation. +*/ + unsigned *insn; + unsigned insn_size; + unsigned insn_count; + + unsigned current_pass; + + bool regs_written[MAX_NUM_PASSES_ATI][MAX_NUM_FRAGMENT_REGISTERS_ATI]; + + boolean error; +}; + +struct instruction_desc { + unsigned TGSI_opcode; + const char *name; + unsigned char arg_count; + unsigned char special; /* no 1:1 corresponding TGSI instruction */ +}; + +/* index this array as inst_desc[ATI_opcode-GL_MOV_ATI] */ +static struct instruction_desc inst_desc[] = { + {TGSI_OPCODE_MOV, "MOV", 1, 0}, + {TGSI_OPCODE_NOP, "UND", 0, 0}, /* unused */ + {TGSI_OPCODE_ADD, "ADD", 2, 0}, + {TGSI_OPCODE_MUL, "MUL", 2, 0}, + {TGSI_OPCODE_SUB, "SUB", 2, 0}, + {TGSI_OPCODE_DP3, "DOT3", 2, 0}, + {TGSI_OPCODE_DP4, "DOT4", 2, 0}, + {TGSI_OPCODE_MAD, "MAD", 3, 0}, + {TGSI_OPCODE_LRP, "LERP", 3, 0}, + {TGSI_OPCODE_NOP, "CND", 3, 1}, + {TGSI_OPCODE_NOP, "CND0", 3, 2}, + {TGSI_OPCODE_NOP, "DOT2_ADD", 3, 3} +}; + +/** + * Called prior to emitting the TGSI code for each Mesa instruction. + * Allocate additional space for instructions if needed. + * Update the insn[] array so the next Mesa instruction points to + * the next TGSI instruction. 
+ * Copied from st_mesa_to_tgsi.c + */ +static void set_insn_start(struct st_translate *t, + unsigned start) +{ + if (t->insn_count + 1 >= t->insn_size) { + t->insn_size = 1 << (util_logbase2(t->insn_size) + 1); + t->insn = realloc(t->insn, t->insn_size * sizeof t->insn[0]); + if (t->insn == NULL) { + t->error = TRUE; + return; + } + } + + t->insn[t->insn_count++] = start; +} + +static void emit_insn(struct st_translate *t, + unsigned opcode, + const struct ureg_dst *dst, + unsigned nr_dst, + const struct ureg_src *src, + unsigned nr_src) +{ + set_insn_start(t, ureg_get_instruction_number(t->ureg)); + ureg_insn(t->ureg, opcode, dst, nr_dst, src, nr_src); +} + +static struct ureg_dst get_temp(struct st_translate *t, unsigned index) +{ + if (ureg_dst_is_undef(t->temps[index])) + t->temps[index] = ureg_DECL_temporary(t->ureg); + return t->temps[index]; +} + +static struct ureg_src apply_swizzle(struct st_translate *t, + struct ureg_src src, GLuint swizzle) +{ + if (swizzle == GL_SWIZZLE_STR_ATI) { + return src; + } else if (swizzle == GL_SWIZZLE_STQ_ATI) { + return ureg_swizzle(src, +TGSI_SWIZZLE_X, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_W, TGSI_SWIZZLE_Z); + } else { + struct ureg_dst tmp[2]; + struct ureg_src imm[3]; + + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI); + tmp[1] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+1); + imm[0] = src; + imm[1]
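The growth step in set_insn_start() can be tried in isolation. Below is a minimal sketch, not the patch's code: struct insn_table and floor_log2() are hypothetical stand-ins for the st_translate fields and for util_logbase2(), which is assumed here to return 0 for an input of 0. One deliberate difference from the patch is that the realloc() result goes into a temporary, so the old buffer is not lost if the allocation fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the insn bookkeeping in struct st_translate. */
struct insn_table {
    unsigned *insn;   /* start index of each translated instruction */
    unsigned size;    /* allocated entries */
    unsigned count;   /* used entries */
    int error;        /* set when an allocation fails */
};

/* floor(log2(n)) with 0 mapped to 0; local substitute for util_logbase2(). */
static unsigned floor_log2(unsigned n)
{
    unsigned r = 0;
    while (n >>= 1)
        r++;
    return r;
}

/* Append one start index, growing to the next power of two when needed. */
static void table_push(struct insn_table *t, unsigned start)
{
    assert(t != NULL);
    if (t->count + 1 >= t->size) {
        unsigned new_size = 1u << (floor_log2(t->size) + 1);
        unsigned *grown = realloc(t->insn, new_size * sizeof t->insn[0]);
        if (grown == NULL) {
            t->error = 1;   /* old buffer is still valid and owned by t */
            return;
        }
        t->insn = grown;
        t->size = new_size;
    }
    t->insn[t->count++] = start;
}
```

Starting from a zeroed table, the capacity goes 2, 4, 8, ... and the table always keeps at least one spare slot.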
[Mesa-dev] [PATCH 07/11] program: fix comment about the fog formula
---
 src/mesa/program/prog_statevars.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c b/src/mesa/program/prog_statevars.c
index bdb335e..12490d0 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -474,7 +474,7 @@ _mesa_fetch_state(struct gl_context *ctx, const gl_state_index state[],
           * single MAD.
           * linear: fogcoord * -1/(end-start) + end/(end-start)
           * exp: 2^-(density/ln(2) * fogcoord)
-          * exp2: 2^-((density/(ln(2)^2) * fogcoord)^2)
+          * exp2: 2^-((density/(sqrt(ln(2))) * fogcoord)^2)
           */
          value[0] = (ctx->Fog.End == ctx->Fog.Start)
             ? 1.0f : (GLfloat)(-1.0F / (ctx->Fog.End - ctx->Fog.Start));
-- 
2.6.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/11] [RFC] mesa: optimize out the realloc from glCopyTexImagexD()
Apitrace showed this call to be 5ms (9 times per frame), but in reality
it's about 500us. This shortcut makes it 20us.
---
 src/mesa/main/teximage.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index ab60a2f..ba13720 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -3393,6 +3393,21 @@ formats_differ_in_component_sizes(mesa_format f1, mesa_format f2)
    return GL_FALSE;
 }
 
+static GLboolean
+canAvoidRealloc(struct gl_texture_image *texImage, GLenum internalFormat,
+                GLint x, GLint y, GLsizei width, GLsizei height, GLint border)
+{
+   if (texImage->InternalFormat != internalFormat)
+      return false;
+   if (texImage->Border != border)
+      return false;
+   if (texImage->Width2 != width)
+      return false;
+   if (texImage->Height2 != height)
+      return false;
+   return true;
+}
+
 /**
  * Implement the glCopyTexImage1/2D() functions.
  */
@@ -3433,6 +3448,20 @@ copyteximage(struct gl_context *ctx, GLuint dims,
    texObj = _mesa_get_current_tex_object(ctx, target);
    assert(texObj);
 
+   _mesa_lock_texture(ctx, texObj);
+   {
+      texImage = _mesa_select_tex_image(texObj, target, level);
+      if (texImage && canAvoidRealloc(texImage, internalFormat,
+                                      x, y, width, height, border)) {
+         _mesa_unlock_texture(ctx, texObj);
+         //_mesa_debug(0, "using shortcut\n");
+         return _mesa_copy_texture_sub_image(ctx, dims, texObj, target, level,
+                                             0, 0, 0, x, y, width, height, "CopyTexImage");
+      }
+      //_mesa_debug(0, "can't shortcut %p, %dx%d\n", texImage, width, height);
+   }
+   _mesa_unlock_texture(ctx, texObj);
+
    texFormat = _mesa_choose_texture_format(ctx, texObj, target, level,
                                            internalFormat, GL_NONE, GL_NONE);
-- 
2.6.4
[Mesa-dev] [PATCH 09/11] swrast: move two global defines to the only place where they are used
---
 src/mesa/main/mtypes.h            | 2 --
 src/mesa/swrast/s_atifragshader.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5c71ac4..99e7912 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2278,8 +2278,6 @@ struct gl_compute_program_state
 /**
  * ATI_fragment_shader runtime state
  */
-#define ATI_FS_INPUT_PRIMARY 0
-#define ATI_FS_INPUT_SECONDARY 1
 
 struct atifs_instruction;
 struct atifs_setupinst;
diff --git a/src/mesa/swrast/s_atifragshader.c b/src/mesa/swrast/s_atifragshader.c
index 2974dee..414a414 100644
--- a/src/mesa/swrast/s_atifragshader.c
+++ b/src/mesa/swrast/s_atifragshader.c
@@ -26,6 +26,8 @@
 #include "swrast/s_atifragshader.h"
 #include "swrast/s_context.h"
 
+#define ATI_FS_INPUT_PRIMARY 0
+#define ATI_FS_INPUT_SECONDARY 1
 
 /**
  * State for executing ATI fragment shader.
-- 
2.6.4
[Mesa-dev] [PATCH 00/11] GL_ATI_fragment_shader support for Gallium
Hi,

This series aims to improve the looks of Star Wars: Knights of the Old
Republic (via Wine), but features some additional cleanup as well. The
main component of the series is the implementation of
GL_ATI_fragment_shader for all Gallium drivers (though I could only
test it with radeonsi, llvmpipe, and softpipe). If this extension is
available, the game uses it quite extensively: perhaps the most notable
effect is the animated water ripples, but it also fixes the grass,
improves the specular on wet characters (e.g. the Selkath) and it is
used for regular texturing almost everywhere. The game has two optional
post-process effects that also depend on this extension: framebuffer
effects (light bloom, distortion), and soft shadows. Patches 5&6 are
needed to fix crashing with post-processing. With current fglrx the
grass is wrong, and post-process crashes, but my previous Radeon cards
ran this game perfectly on Windows.

One other game that can use GL_ATI_fragment_shader is Doom 3, if
r_renderer="r200" instead of "best" (which means "arb2", if
GL_ARB_fragment_program is available). By default
image_useNormalCompression=0, which results in wrong lighting and makes
the specular overbright with r200. Setting it to 1 fixes r200, but
messes up arb2, setting it to 2 fixes both. The light interaction is
the same in r200 and arb2, but r200 doesn't have the heathaze shader.
Later idTech4 games don't support r200 anymore: in Quake 4 everything
is green, in Prey the organic walls are black, and ETQW has a
completely revised renderer. I verified these with fglrx.

The series is based on the 11.0 branch of Mesa. Patches 1-4 implement
GL_ATI_fragment_shader, 5-6 fix crashing in post-process of KotOR, 7-11
are various cleanups. There are a few TODO comments where I wasn't
entirely sure, and the two RFC patches are more like ideas than
solutions, but most of the code should be fine.

After this series the following issues remain in KotOR that I've been
unable to fix:

1. Enabling soft shadows makes all characters disappear. When drawing
   the post-process effects the game switches between scratch
   framebuffers and the real one with glXMakeContextCurrent() several
   times. The scratch buffers have no depth, and after switching back
   to the real one the depth buffer is lost, so all subsequent depth
   tests fail.

2. Enabling MSAA results in black screen when post-process is enabled,
   only the light bloom is visible. I don't know how to debug this.

3. Post-process filters are extremely slow. Normally the game runs
   around 80fps (cpu-bound), but drops to 15fps with framebuffer
   effects, 20fps with soft shadows, 9fps with both. I've tried to
   profile this with apitrace, and found a bottleneck (see patch 10)
   that cost 5ms per call, but it turned out that it's not the real
   bottleneck. Both capturing and replaying are very slow compared to
   the game (15fps), so the profiler basically measures its own
   latency. I've tried to find the real culprit by adding time
   measurement to the calls made when drawing the post-process effects,
   but haven't found anything yet.

Screenshot gallery:

Dantooine
  Fixed-function: http://postimg.org/image/5de014vd5/
  With grass: http://postimg.org/image/u7xhv7g7d/
  With ATIfs: http://postimg.org/image/jijt2y4eh/

Kashyyyk Shadowlands
  Fixed-function: http://postimg.org/image/mchb7drbv/
  ATIfs without fog: http://postimg.org/image/dk0cjp66z/
  ATIfs with apply_fog(): http://postimg.org/image/rcerfbwyj/

Manaan
  Fixed-function: http://postimg.org/image/4l13f6mjf/
  ATIfs: http://postimg.org/image/nat2vxfa3/
  Framebuffer effects: http://postimg.org/image/vhl2ni5cr/

Stealth Mission
  Without framebuffer effects: http://postimg.org/image/xcy12i7v3/
  With framebuffer effects: http://postimg.org/image/75wu6jplb/

Shadows
  Hard: http://postimg.org/image/ycjqkgxn3/
  Soft: http://postimg.org/image/lmk3l4f2n/

Miklós Máté (11):
  mesa: Don't leak ATIfs instructions in DeleteFragmentShader
  mesa: optionally associate a gl_program to ati_fragment_shader
  st/mesa: implement GL_ATI_fragment_shader
  st/mesa: enable GL_ATI_fragment_shader
  [RFC] mesa: allow binding framebuffer without depth
  st/mesa: fix handling the fallback texture
  program: fix comment about the fog formula
  mesa: improve debug log in atifragshader
  swrast: move two global defines to the only place where they are used
  [RFC] mesa: optimize out the realloc from glCopyTexImagexD()
  program: Remove extra reference_program()

 src/mesa/Makefile.sources             | 1 +
 src/mesa/drivers/common/driverfuncs.c | 3 +
 src/mesa/main/atifragshader.c         | 18 +-
 src/mesa/main/context.c               | 10 +-
 src/mesa/main/dd.h                    | 6 +-
 src/mesa/main/mtypes.h                | 3 +-
 src/mesa/main/state.c                 | 14 +-
 src/mesa/main/teximage.c              | 29 ++
 src/mesa/program/ir_to_mesa.cpp       | 2 -
 src/mesa/program/prog_statevars.c     | 2 +-
[Mesa-dev] [PATCH 11/11] program: Remove extra reference_program()
It was already done in get_mesa_program()
---
 src/mesa/program/ir_to_mesa.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 8f58f3e..a28cf97 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2938,8 +2938,6 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
       if (linked_prog) {
          _mesa_copy_linked_program_data((gl_shader_stage) i, prog, linked_prog);
 
-         _mesa_reference_program(ctx, &prog->_LinkedShaders[i]->Program,
-                                 linked_prog);
          if (!ctx->Driver.ProgramStringNotify(ctx,
                                               _mesa_shader_stage_to_program(i),
                                               linked_prog)) {
-- 
2.6.4
[Mesa-dev] [PATCH 02/11] mesa: optionally associate a gl_program to ati_fragment_shader
the state tracker will use it
---
 src/mesa/drivers/common/driverfuncs.c | 3 +++
 src/mesa/main/atifragshader.c         | 13 -
 src/mesa/main/dd.h                    | 6 +-
 src/mesa/main/mtypes.h                | 1 +
 src/mesa/main/state.c                 | 14 +-
 5 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/driverfuncs.c b/src/mesa/drivers/common/driverfuncs.c
index 6fe42b1..36e9281 100644
--- a/src/mesa/drivers/common/driverfuncs.c
+++ b/src/mesa/drivers/common/driverfuncs.c
@@ -118,6 +118,9 @@ _mesa_init_driver_functions(struct dd_function_table *driver)
    driver->NewProgram = _mesa_new_program;
    driver->DeleteProgram = _mesa_delete_program;
 
+   /* ATI_fragment_shader */
+   driver->NewATIfs = NULL;
+
    /* simple state commands */
    driver->AlphaFunc = NULL;
    driver->BlendColor = NULL;
diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index 3ddc51d..d1c07c5 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -30,6 +30,7 @@
 #include "main/mtypes.h"
 #include "main/dispatch.h"
 #include "main/atifragshader.h"
+#include "program/program.h"
 
 #define MESA_DEBUG_ATI_FS 0
 
@@ -63,6 +64,7 @@ _mesa_delete_ati_fragment_shader(struct gl_context *ctx, struct ati_fragment_sha
       free(s->Instructions[i]);
       free(s->SetupInst[i]);
    }
+   _mesa_reference_program(ctx, &s->Program, NULL);
    free(s);
 }
 
@@ -321,6 +323,8 @@ _mesa_BeginFragmentShaderATI(void)
       free(ctx->ATIFragmentShader.Current->SetupInst[i]);
    }
 
+   _mesa_reference_program(ctx, &ctx->ATIFragmentShader.Current->Program, NULL);
+
    /* malloc the instructions here - not sure if the best place but its
       a start */
    for (i = 0; i < MAX_NUM_PASSES_ATI; i++) {
@@ -402,7 +406,14 @@ _mesa_EndFragmentShaderATI(void)
    }
 #endif
 
-   if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI, NULL)) {
+   if (ctx->Driver.NewATIfs) {
+      struct gl_program *prog = ctx->Driver.NewATIfs(ctx,
+                                                     ctx->ATIFragmentShader.Current->Id);
+      _mesa_reference_program(ctx, &ctx->ATIFragmentShader.Current->Program, prog);
+   }
+
+   if (!ctx->Driver.ProgramStringNotify(ctx, GL_FRAGMENT_SHADER_ATI,
+                                        curProg->Program)) {
       ctx->ATIFragmentShader.Current->isValid = GL_FALSE;
       /* XXX is this the right error? */
       _mesa_error(ctx, GL_INVALID_OPERATION,
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 87eb63e..9d24279 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -471,7 +471,11 @@ struct dd_function_table {
    struct gl_program * (*NewProgram)(struct gl_context *ctx, GLenum target,
                                      GLuint id);
    /** Delete a program */
-   void (*DeleteProgram)(struct gl_context *ctx, struct gl_program *prog);
+   void (*DeleteProgram)(struct gl_context *ctx, struct gl_program *prog);
+   /**
+    * Allocate a program to associate with the new ATI fragment shader (optional)
+    */
+   struct gl_program * (*NewATIfs)(struct gl_context *ctx, GLuint id);
    /**
     * Notify driver that a program string (and GPU code) has been specified
     * or modified. Return GL_TRUE or GL_FALSE to indicate if the program is
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cc8f350..5c71ac4 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2303,6 +2303,7 @@ struct ati_fragment_shader
    GLboolean interpinp1;
    GLboolean isValid;
    GLuint swizzlerq;
+   struct gl_program *Program;
 };
 
 /**
diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index d3b1c72..cabba1b 100644
--- a/src/mesa/main/state.c
+++ b/src/mesa/main/state.c
@@ -124,7 +124,8 @@ update_program(struct gl_context *ctx)
     * follows:
     * 1. OpenGL 2.0/ARB vertex/fragment shaders
     * 2. ARB/NV vertex/fragment programs
-    * 3. Programs derived from fixed-function state.
+    * 3. ATI fragment shader
+    * 4. Programs derived from fixed-function state.
     *
     * Note: it's possible for a vertex shader to get used with a fragment
     * program (and vice versa) here, but in practice that shouldn't ever
@@ -152,6 +153,17 @@ update_program(struct gl_context *ctx)
       _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._TexEnvProgram,
                                NULL);
    }
+   else if (ctx->ATIFragmentShader._Enabled
+            && ctx->ATIFragmentShader.Current->Program) {
+      /* Use the enabled ATI fragment shader's associated program */
+      _mesa_reference_shader_program(ctx,
+                                     &ctx->_Shader->_CurrentFragmentProgram,
+                                     NULL);
+      _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._Current,
+                               gl_fragment_program(ctx->ATIFragmentShader.Current->Program));
+      _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._TexEnvProgram,
+                               NULL);
+   }
    else if
[Mesa-dev] [PATCH 01/11] mesa: Don't leak ATIfs instructions in DeleteFragmentShader
---
 src/mesa/main/atifragshader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index 935ba05..3ddc51d 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -293,7 +293,7 @@ _mesa_DeleteFragmentShaderATI(GLuint id)
          prog->RefCount--;
          if (prog->RefCount <= 0) {
             assert(prog != );
-            free(prog);
+            _mesa_delete_ati_fragment_shader(ctx, prog);
          }
       }
    }
-- 
2.6.4
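Why a bare free(prog) leaks can be shown with a toy object that owns per-pass arrays. Everything below is hypothetical (toy_shader, the counted allocator, NUM_PASSES); it only mirrors the shape of the problem, where the object holds pointers that must be released before the object itself.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_PASSES 2   /* stand-in for MAX_NUM_PASSES_ATI */

static int live_allocs;   /* blocks allocated and not yet freed */

static void *counted_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (p)
        live_allocs++;
    return p;
}

static void counted_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

struct toy_shader {
    int ref_count;
    int *instructions[NUM_PASSES];
    int *setup_inst[NUM_PASSES];
};

static struct toy_shader *toy_shader_new(void)
{
    struct toy_shader *s = counted_calloc(1, sizeof *s);
    assert(s != NULL);
    s->ref_count = 1;
    for (int i = 0; i < NUM_PASSES; i++) {
        s->instructions[i] = counted_calloc(8, sizeof(int));
        s->setup_inst[i] = counted_calloc(8, sizeof(int));
    }
    return s;
}

/* Full destructor: release the owned arrays, then the object. */
static void toy_shader_delete(struct toy_shader *s)
{
    for (int i = 0; i < NUM_PASSES; i++) {
        counted_free(s->instructions[i]);
        counted_free(s->setup_inst[i]);
    }
    counted_free(s);
}

/* Drop one reference; the last one must go through the destructor. */
static void toy_shader_unref(struct toy_shader *s)
{
    if (--s->ref_count <= 0)
        toy_shader_delete(s);
}
```

If toy_shader_unref() called counted_free(s) directly, the four per-pass arrays would stay allocated, which is the same kind of leak the one-line patch removes.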
[Mesa-dev] [PATCH 08/11] mesa: improve debug log in atifragshader
---
 src/mesa/main/atifragshader.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index d1c07c5..8b19a35 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -349,6 +349,9 @@ _mesa_BeginFragmentShaderATI(void)
    ctx->ATIFragmentShader.Current->isValid = GL_FALSE;
    ctx->ATIFragmentShader.Current->swizzlerq = 0;
    ctx->ATIFragmentShader.Compiling = 1;
+#if MESA_DEBUG_ATI_FS
+   _mesa_debug(ctx, "%s %u\n", __func__, ctx->ATIFragmentShader.Current->Id);
+#endif
 }
 
 void GLAPIENTRY
-- 
2.6.4
[Mesa-dev] [PATCH 04/11] st/mesa: enable GL_ATI_fragment_shader
---
 src/mesa/state_tracker/st_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index d97dfde..45ceae1 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -652,6 +652,7 @@ void st_init_extensions(struct pipe_screen *screen,
    extensions->EXT_texture_env_dot3 = GL_TRUE;
    extensions->EXT_vertex_array_bgra = GL_TRUE;
 
+   extensions->ATI_fragment_shader = GL_TRUE;
    extensions->ATI_texture_env_combine3 = GL_TRUE;
 
    extensions->MESA_pack_invert = GL_TRUE;
-- 
2.6.4
[Mesa-dev] [PATCH 06/11] st/mesa: fix handling the fallback texture
---
 src/mesa/state_tracker/st_atom_sampler.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_sampler.c b/src/mesa/state_tracker/st_atom_sampler.c
index 4252c27..7d3d8e7 100644
--- a/src/mesa/state_tracker/st_atom_sampler.c
+++ b/src/mesa/state_tracker/st_atom_sampler.c
@@ -131,7 +131,7 @@ convert_sampler(struct st_context *st,
                 struct pipe_sampler_state *sampler,
                 GLuint texUnit)
 {
-   const struct gl_texture_object *texobj;
+   struct gl_texture_object *texobj;
    struct gl_context *ctx = st->ctx;
    struct gl_sampler_object *msamp;
    GLenum texBaseFormat;
@@ -144,6 +144,10 @@ convert_sampler(struct st_context *st,
    texBaseFormat = _mesa_texture_base_format(texobj);
 
    msamp = _mesa_get_samplerobj(ctx, texUnit);
+   if (!msamp) {
+      /* handle the fallback texture */
+      msamp = &texobj->Sampler;
+   }
 
    memset(sampler, 0, sizeof(*sampler));
    sampler->wrap_s = gl_wrap_xlate(msamp->WrapS);
-- 
2.6.4
[Mesa-dev] [PATCH 05/11] [RFC] mesa: allow binding framebuffer without depth
this works with radeonsi, but crashes with llvmpipe
---
 src/mesa/main/context.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 888c461..dcaf524 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1550,10 +1550,10 @@ check_compatible(const struct gl_context *ctx,
       return GL_FALSE;
    if (ctxvis->haveAccumBuffer && !bufvis->haveAccumBuffer)
       return GL_FALSE;
-   if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer)
-      return GL_FALSE;
+   /*if (ctxvis->haveDepthBuffer && !bufvis->haveDepthBuffer)
+      return GL_FALSE;
    if (ctxvis->haveStencilBuffer && !bufvis->haveStencilBuffer)
-      return GL_FALSE;
+      return GL_FALSE;*/
    if (ctxvis->redMask && ctxvis->redMask != bufvis->redMask)
       return GL_FALSE;
    if (ctxvis->greenMask && ctxvis->greenMask != bufvis->greenMask)
@@ -1565,8 +1565,8 @@ check_compatible(const struct gl_context *ctx,
    if (ctxvis->depthBits && ctxvis->depthBits != bufvis->depthBits)
       return GL_FALSE;
 #endif
-   if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits)
-      return GL_FALSE;
+   /*if (ctxvis->stencilBits && ctxvis->stencilBits != bufvis->stencilBits)
+      return GL_FALSE;*/
 
    return GL_TRUE;
 }
-- 
2.6.4
Re: [Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1
On 12/15/2015 04:08 PM, Jordan Justen wrote:
> The OpenGL ARB_compute_shader extension specification requires at least
> 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
> only required 128.

Does this mean that extensions->ARB_compute_shader is not set? I'm a
little bit nervous about that. Are we sure that we check for compute
shader support correctly everywhere (i.e., don't just check the
extension bit that isn't set)?

> Signed-off-by: Jordan Justen
> Cc: Ian Romanick
> Cc: Marta Lofstedt
> ---
>  src/mesa/main/version.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index e92bb11..112a73d 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions *extensions)
>  }
>
>  static GLuint
> -compute_version_es2(const struct gl_extensions *extensions)
> +compute_version_es2(const struct gl_extensions *extensions,
> +                    const struct gl_constants *consts)
>  {
>     /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
>     const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
> @@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions *extensions)
>                           extensions->EXT_texture_snorm &&
>                           extensions->NV_primitive_restart &&
>                           extensions->OES_depth_texture_cube_map);
> +   const bool es31_compute_shader =
> +      consts->MaxComputeWorkGroupInvocations >= 128;
>     const bool ver_3_1 = (ver_3_0 &&
>                           extensions->ARB_arrays_of_arrays &&
> -                         extensions->ARB_compute_shader &&
> +                         es31_compute_shader &&
>                           extensions->ARB_draw_indirect &&
>                           extensions->ARB_explicit_uniform_location &&
>                           extensions->ARB_framebuffer_no_attachments &&
> @@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions *extensions,
>     case API_OPENGLES:
>        return compute_version_es1(extensions);
>     case API_OPENGLES2:
> -      return compute_version_es2(extensions);
> +      return compute_version_es2(extensions, consts);
>     }
>     return 0;
>  }
Re: [Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1
On 2015-12-15 16:50:39, Ian Romanick wrote:
> On 12/15/2015 04:08 PM, Jordan Justen wrote:
> > The OpenGL ARB_compute_shader extension specification requires at least
> > 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
> > only required 128.
>
> Does this mean that extensions->ARB_compute_shader is not set?

Yes. I think we can't set this in some cases due to desktop GL
requirements, but we should still be able to support CS on ES 3.1.

> I'm a little bit nervous about that. Are we sure that we check for
> compute shader support correctly everywhere (i.e., don't just check
> the extension bit that isn't set)?

I think we have it pretty well covered. The ES 3.1 CTS seems pretty
happy with what we have. That said, patch 2 was yet another fix to use
_mesa_has_compute_shaders, and I wouldn't be surprised if we ended up
finding some more. (I did try to grep to find anything we might have
missed.)

-Jordan

> > Signed-off-by: Jordan Justen
> > Cc: Ian Romanick
> > Cc: Marta Lofstedt
> > ---
> >  src/mesa/main/version.c | 9 ++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> > index e92bb11..112a73d 100644
> > --- a/src/mesa/main/version.c
> > +++ b/src/mesa/main/version.c
> > @@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions *extensions)
> >  }
> >
> >  static GLuint
> > -compute_version_es2(const struct gl_extensions *extensions)
> > +compute_version_es2(const struct gl_extensions *extensions,
> > +                    const struct gl_constants *consts)
> >  {
> >     /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
> >     const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
> > @@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions *extensions)
> >                           extensions->EXT_texture_snorm &&
> >                           extensions->NV_primitive_restart &&
> >                           extensions->OES_depth_texture_cube_map);
> > +   const bool es31_compute_shader =
> > +      consts->MaxComputeWorkGroupInvocations >= 128;
> >     const bool ver_3_1 = (ver_3_0 &&
> >                           extensions->ARB_arrays_of_arrays &&
> > -                         extensions->ARB_compute_shader &&
> > +                         es31_compute_shader &&
> >                           extensions->ARB_draw_indirect &&
> >                           extensions->ARB_explicit_uniform_location &&
> >                           extensions->ARB_framebuffer_no_attachments &&
> > @@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions *extensions,
> >     case API_OPENGLES:
> >        return compute_version_es1(extensions);
> >     case API_OPENGLES2:
> > -      return compute_version_es2(extensions);
> > +      return compute_version_es2(extensions, consts);
> >     }
> >     return 0;
> >  }
Re: [Mesa-dev] [PATCH 2/3] i965/fs: do not disable the FS unit in the presence of shader storage
On Dec 15, 2015 3:52 AM, "Iago Toral Quiroga" wrote:
>
> We want to make sure that the driver does not disable the FS unit if
> the shader code only has SSBO writes (i.e. no color or depth output).
>
> We could go a step further and check if the shader storage is actually
> used for writing, but does not seem worth the trouble. Also, we do the
> same thing for atomic buffers.
>
> Fixes the following CTS test:
> ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs
> ---
>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 ++-
>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c
> index 06d5e65..d292b13 100644
> --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
> @@ -77,7 +77,8 @@ upload_wm_state(struct brw_context *brw)
>        dw1 |= GEN7_WM_KILL_ENABLE;
>     }
>
> -   if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx)) {
> +   if (_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
> +       _mesa_active_fragment_shader_has_shader_storage(&brw->ctx)) {

Ugh... We also need to be checking for images. How about we change it
to active_fragment_shader_has_side_effects and make it check all three?

>        dw1 |= GEN7_WM_DISPATCH_ENABLE;
>     }
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> index 945f710..8769269 100644
> --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> @@ -91,6 +91,7 @@ gen8_upload_ps_extra(struct brw_context *brw,
>      * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS | _NEW_COLOR
>      */
>     if ((_mesa_active_fragment_shader_has_atomic_ops(&brw->ctx) ||
> +        _mesa_active_fragment_shader_has_shader_storage(&brw->ctx) ||
>          prog_data->base.nr_image_params) &&
>         !brw_color_buffer_write_enabled(brw))
>        dw1 |= GEN8_PSX_SHADER_HAS_UAV;
> --
> 1.9.1
Re: [Mesa-dev] [PATCH 00/11] GL_ATI_fragment_shader support for Gallium
On 16.12.2015 at 00:05, Miklós Máté wrote:
> Hi,
>
> This series aims to improve the looks of Star Wars: Knights of the
> Old Republic (via Wine), but features some additional cleanup as
> well. The main component of the series is the implementation of
> GL_ATI_fragment_shader for all Gallium drivers (though I could only
> test it with radeonsi, llvmpipe, and softpipe). If this extension is
> available, the game uses it quite extensively: perhaps the most
> notable effect is the animated water ripples, but it also fixes the
> grass, improves the specular on wet characters (e.g. the Selkath) and
> it is used for regular texturing almost everywhere. The game has two
> optional post-process effects that also depend on this extension:
> framebuffer effects (light bloom, distortion), and soft shadows.
> Patches 5&6 are needed to fix crashing with post-processing. With
> current fglrx the grass is wrong, and post-process crashes, but my
> previous Radeon cards ran this game perfectly on Windows.
>
> One other game that can use GL_ATI_fragment_shader is Doom 3, if
> r_renderer="r200" instead of "best" (which means "arb2", if
> GL_ARB_fragment_program is available). By default
> image_useNormalCompression=0, which results in wrong lighting and
> makes the specular overbright with r200. Setting it to 1 fixes r200,
> but messes up arb2, setting it to 2 fixes both. The light interaction
> is the same in r200 and arb2, but r200 doesn't have the heathaze
> shader. Later idTech4 games don't support r200 anymore: in Quake 4
> everything is green, in Prey the organic walls are black, and ETQW
> has a completely revised renderer. I verified these with fglrx.

I think the reason why no one was interested in making ATI_fs supported
so far on anything other than r200 was that there just wasn't really
anything depending on it. As doom3 could use arb_fs just fine... But I
guess if wine can use it there's some more apps probably...

FWIW I think quake4 should work fine. Back when I implemented this for
r200, it was indeed broken and I traced that back to something broken
in the main shader (can't remember what, something trivial like wrong
tex unit used in an instruction). I reported that and got told it was
already fixed in the game - however there was never a new demo released
thus if you just have the demo it's still broken.

Roland
Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
On Dec 15, 2015 8:59 PM, "Ian Romanick"wrote: > > On 12/15/2015 05:08 PM, Ilia Mirkin wrote: > > On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick wrote: > >> On 12/15/2015 04:40 PM, Ilia Mirkin wrote: > >>> Hardly a complete review, but a handful of comments: > >>> > >>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté wrote: > --- > src/mesa/Makefile.sources | 1 + > src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 ++ > src/mesa/state_tracker/st_atifs_to_tgsi.h | 49 ++ > src/mesa/state_tracker/st_atom_constbuf.c | 14 + > src/mesa/state_tracker/st_cb_drawpixels.c | 1 + > src/mesa/state_tracker/st_cb_program.c| 35 +- > src/mesa/state_tracker/st_program.c | 22 + > src/mesa/state_tracker/st_program.h | 1 + > 8 files changed, 920 insertions(+), 1 deletion(-) > create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c > create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h > > +static struct ureg_src prepare_argument(struct st_translate *t, const unsigned argId, > + const struct atifragshader_src_register *srcReg) > +{ > + struct ureg_src src = get_source(t, srcReg->Index); > + struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId); > + > + switch (srcReg->argRep) { > + case GL_NONE: > + break; > + case GL_RED: > + src = ureg_swizzle(src, > + TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X); > + break; > + case GL_GREEN: > + src = ureg_swizzle(src, > + TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y); > + break; > + case GL_BLUE: > + src = ureg_swizzle(src, > + TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z); > + break; > + case GL_ALPHA: > + src = ureg_swizzle(src, > + TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W); > + break; > + } > + emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1); > + > + if (srcReg->argMod & GL_COMP_BIT_ATI) { > + struct ureg_src modsrc[2]; > + modsrc[0] = ureg_imm1f(t->ureg, 1.0); > + modsrc[1] = ureg_src(arg); > + > + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); > + } > + 
if (srcReg->argMod & GL_BIAS_BIT_ATI) { > + struct ureg_src modsrc[2]; > + modsrc[0] = ureg_src(arg); > + modsrc[1] = ureg_imm1f(t->ureg, 0.5); > + > + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); > + } > + if (srcReg->argMod & GL_2X_BIT_ATI) { > + struct ureg_src modsrc[2]; > + modsrc[0] = ureg_src(arg); > + modsrc[1] = ureg_imm1f(t->ureg, 2.0); > + > + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); > >>> > >>> aka ADD arg, arg, arg > >>> > + } > + if (srcReg->argMod & GL_NEGATE_BIT_ATI) { > + struct ureg_src modsrc[2]; > + modsrc[0] = ureg_src(arg); > + modsrc[1] = ureg_imm1f(t->ureg, -1.0); > + > + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); > >>> > >>> aka NEG arg, arg > >>> > + } > + return ureg_src(arg); > +} > + > +/* These instructions have no direct equivalent in TGSI */ > +static void emit_special_inst(struct st_translate *t, struct instruction_desc *desc, > + struct ureg_dst *dst, struct ureg_src *args, unsigned argcount) > +{ > + struct ureg_dst tmp[1]; > + struct ureg_src src[3]; > + > + if(desc->special == 1) { > + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3 > + src[0] = ureg_imm1f(t->ureg, 0.5f); > + src[1] = args[2]; > + emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2); > + src[0] = ureg_src(tmp[0]); > + src[1] = args[0]; > + src[2] = args[1]; > + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); > + } else if (desc->special == 2) { > + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3 > + src[0] = args[2]; > + src[1] = ureg_imm1f(t->ureg, 0.0f); > + emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2); > + src[0] = ureg_src(tmp[0]); > + src[1] = args[0]; > + src[2] = args[1]; > + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); > >>> > >>> Isn't this the CMP instruction? Just flip the args. > >>> > >>> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP > >>> > >>> The other one should be expressible as CMP as well I think. > >>> > + } else if (desc->special == 3) { > + src[0] =
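The review comment about CMP can be checked on scalars. The sketch below models the TGSI opcodes as plain C functions, using the documented semantics CMP: dst = (src0 < 0) ? src1 : src2 and LRP: dst = src0*src1 + (1 - src0)*src2. The function names are made up, and the CMP forms shown are one possible reading of "just flip the args", not code from the patch or the review.

```c
#include <assert.h>

/* Scalar model of TGSI CMP: dst = (src0 < 0) ? src1 : src2. */
static float tgsi_cmp(float src0, float src1, float src2)
{
    return src0 < 0.0f ? src1 : src2;
}

/* Scalar model of TGSI LRP: dst = src0 * src1 + (1 - src0) * src2. */
static float tgsi_lrp(float src0, float src1, float src2)
{
    return src0 * src1 + (1.0f - src0) * src2;
}

/* CND as the patch emits it: SLT tmp, 0.5, a2; LRP dst, tmp, a0, a1. */
static float cnd_slt_lrp(float a0, float a1, float a2)
{
    float tmp = (0.5f < a2) ? 1.0f : 0.0f;
    return tgsi_lrp(tmp, a0, a1);
}

/* CND through CMP: (0.5 - a2) < 0 selects a0, otherwise a1. */
static float cnd_cmp(float a0, float a1, float a2)
{
    return tgsi_cmp(0.5f - a2, a0, a1);
}

/* CND0 as the patch emits it: SGE tmp, a2, 0; LRP dst, tmp, a0, a1. */
static float cnd0_sge_lrp(float a0, float a1, float a2)
{
    float tmp = (a2 >= 0.0f) ? 1.0f : 0.0f;
    return tgsi_lrp(tmp, a0, a1);
}

/* CND0 through CMP with the value operands swapped. */
static float cnd0_cmp(float a0, float a1, float a2)
{
    return tgsi_cmp(a2, a1, a0);
}

/* Compare both forms on a few selector values, including the thresholds. */
static void check_cnd_equivalence(void)
{
    static const float sel[] = { -1.0f, 0.0f, 0.25f, 0.5f, 0.75f };
    for (unsigned i = 0; i < sizeof sel / sizeof sel[0]; i++) {
        assert(cnd_slt_lrp(2.0f, 7.0f, sel[i]) == cnd_cmp(2.0f, 7.0f, sel[i]));
        assert(cnd0_sge_lrp(2.0f, 7.0f, sel[i]) == cnd0_cmp(2.0f, 7.0f, sel[i]));
    }
}
```

CND0 maps to a single CMP. CND still needs the 0.5 - a2 term computed first, so whether it saves an instruction in the real translator depends on details not covered by this scalar model (NaN and infinity handling are also ignored here).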
Re: [Mesa-dev] [PATCH] Add .mailmap
On 16.12.2015 06:40, Giuseppe Bilotta wrote:
> This adds a first tentative .mailmap file, to canonicize contributor
> name/emails in shortlogs and other statistical endeavours.
>
> There's a couple of root and richard entries which I don't know who
> they belong to, and hopefully not too many overeager merges.
>
> Signed-off-by: Giuseppe Bilotta

[...]

> diff --git a/.mailmap b/.mailmap
> new file mode 100644
> index 000..bf8b4d9
> --- /dev/null
> +++ b/.mailmap
> @@ -0,0 +1,460 @@
> +
> +Adam Jackson
> +Adam Jackson

In Adam's case, you put a personal e-mail address first and his
employer's address last.

> +Michel Dänzer Michel Daenzer
> +Michel Dänzer Michel Daenzer
> +Michel Dänzer
> +Michel Dänzer
> +Michel Dänzer

In my case, you put my current employer's address first and my personal
and former employers' addresses last.

What's the (intended) meaning of this mapping? If it means that all my
contributions will be accounted to AMD, I'm afraid that's not very
accurate.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/15] i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware
On Dec 15, 2015 12:30 AM, "Abdiel Janulgue"wrote: > > > > On 12/10/2015 06:23 AM, Jason Ekstrand wrote: > > While we're at it, we also add support for the possibility that the > > indirect is, in fact, a constant. This shouldn't happen in the common case > > (if it does, that means NIR failed to constant-fold something), but it's > > possible so we should handle it. > > Perhaps this should re-ordered before patch 3? We could, but it really doesn't matter. No MOV_INDIRECT ever hits the generator pre-BDW prior to patch 15. They get lowered away to pull constant loads. --Jason > > --- > > src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++ > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51 +++--- > > 2 files changed, 42 insertions(+), 13 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp > > index 9eaf8d0..a2ec03e 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp > > @@ -4424,6 +4424,10 @@ get_lowered_simd_width(const struct brw_device_info *devinfo, > > case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL: > >return 8; > > > > + case SHADER_OPCODE_MOV_INDIRECT: > > + /* Prior to Broadwell, we only have 8 address subregisters */ > > + return devinfo->gen < 8 ? 8 : inst->exec_size; > > + > > default: > >return inst->exec_size; > > } > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > index d86eee1..7fa6d84 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > @@ -351,22 +351,47 @@ fs_generator::generate_mov_indirect(fs_inst *inst, > > > > unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr; > > > > - /* We use VxH indirect addressing, clobbering a0.0 through a0.7. 
*/ > > - struct brw_reg addr = vec8(brw_address_reg(0)); > > + if (indirect_byte_offset.file == BRW_IMMEDIATE_VALUE) { > > + imm_byte_offset += indirect_byte_offset.ud; > > > > - /* The destination stride of an instruction (in bytes) must be greater > > -* than or equal to the size of the rest of the instruction. Since the > > -* address register is of type UW, we can't use a D-type instruction. > > -* In order to get around this, re re-type to UW and use a stride. > > -*/ > > - indirect_byte_offset = > > - retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW); > > + reg.nr = imm_byte_offset / REG_SIZE; > > + reg.subnr = imm_byte_offset % REG_SIZE; > > + brw_MOV(p, dst, reg); > > + } else { > > + /* Prior to Broadwell, there are only 8 address registers. */ > > + assert(inst->exec_size == 8 || devinfo->gen >= 8); > > > > - /* Prior to Broadwell, there are only 8 address registers. */ > > - assert(inst->exec_size == 8 || devinfo->gen >= 8); > > + /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */ > > + struct brw_reg addr = vec8(brw_address_reg(0)); > > > > - brw_MOV(p, addr, indirect_byte_offset); > > - brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type)); > > + /* The destination stride of an instruction (in bytes) must be greater > > + * than or equal to the size of the rest of the instruction. Since the > > + * address register is of type UW, we can't use a D-type instruction. > > + * In order to get around this, re re-type to UW and use a stride. > > + */ > > + indirect_byte_offset = > > + retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW); > > + > > + if (devinfo->gen < 8) { > > + /* Prior to broadwell, we have a restriction that the bottom 5 bits > > + * of the base offset and the bottom 5 bits of the indirect must add > > + * to less than 32. 
In other words, the hardware needs to be able to > > + * add the bottom five bits of the two to get the subnumber and add > > + * the next 7 bits of each to get the actual register number. Since > > + * the indirect may cause us to cross a register boundary, this makes > > + * it almost useless. We could try and do something clever where we > > + * use a actual base offset if base_offset % 32 == 0 but that would > > + * mean we were generating different code depending on the base > > + * offset. Instead, for the sake of consistency, we'll just do the > > + * add ourselves. > > + */ > > + brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset)); > > + brw_MOV(p, dst, retype(brw_VxH_indirect(0, 0), dst.type)); > > + } else { > > + brw_MOV(p, addr, indirect_byte_offset); > > + brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type)); > > + } > > + } > > } > > > > void > >
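The offset arithmetic discussed in the quoted patch can be sketched in plain C. This is only an illustration under stated assumptions: `reg_pos`, `fold_constant_offset` and `split_add_is_safe` are hypothetical names, not i965 driver functions; only the 32-byte `REG_SIZE` and the "bottom 5 bits must add to less than 32" rule come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

#define REG_SIZE 32u /* bytes per register, as used in the quoted patch */

/* Hypothetical helper type: a register number plus a byte subnumber. */
struct reg_pos {
   unsigned nr;
   unsigned subnr;
};

/* Constant-indirect case: fold the immediate offset into the base
 * position, so a plain MOV from the resulting register is enough. */
static struct reg_pos
fold_constant_offset(unsigned nr, unsigned subnr, unsigned indirect_bytes)
{
   assert(subnr < REG_SIZE);
   unsigned byte_offset = nr * REG_SIZE + subnr + indirect_bytes;
   struct reg_pos out = { byte_offset / REG_SIZE, byte_offset % REG_SIZE };
   return out;
}

/* Pre-Broadwell restriction as described in the patch comment: the low
 * 5 bits of the base offset and of the indirect must sum to less than 32,
 * otherwise the carry into the register number is lost. */
static bool
split_add_is_safe(unsigned base_bytes, unsigned indirect_bytes)
{
   return (base_bytes % REG_SIZE) + (indirect_bytes % REG_SIZE) < REG_SIZE;
}
```

Because the second check can fail for arbitrary indirect values, the patch sidesteps it on older hardware by doing the ADD into the address register itself and using a zero immediate offset.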
[Mesa-dev] [PATCH] gallivm: add a horrible hack for stencil texturing with border
From: Roland Scheidegger

mesa/st doesn't give us a useful swizzle when stencil texturing. Moreover,
it's not even obvious what the swizzle actually should be - the channel
which is used for the fetch (Y) is not the same as the one which must be
used for the border component (X), which is due to a mismatch between GL
and gallium interface. (On top of that, I have no idea what GL expects in
YZW channels in the end.)
So add some special case for stencil texturing with border, to fetch the
right border component. Though it seems there has to be some better
solution...

This fixes piglit texwrap GL_ARB_texture_stencil8 bordercolor (only the
fixed version).
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 28 +--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index e21933f..efba5a8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -187,9 +187,33 @@ lp_build_sample_texel_soa(struct lp_build_sample_context *bld,
       border_type.length = 4;
       /*
        * Only replace channels which are actually present. The others should
-       * get optimized away eventually by sampler_view swizzle anyway but it's
-       * easier too.
+       * get optimized away eventually by sampler_view swizzle in most cases...
+       * If not, for "ordinary" color textures, fetch will have placed the
+       * correct default values there, since missing channels must use default
+       * values regardless of border.
+       * We do, however, some horrendous hack for stencil textures. We won't
+       * get a useful swizzle, and furthermore the channel to fetch (Y) doesn't
+       * match the channel for the border color (X).
        */
+      if (util_format_has_stencil(format_desc) &&
+          !util_format_has_depth(format_desc)) {
+         LLVMValueRef zero = lp_build_const_int32(bld->gallivm, 0);
+         LLVMValueRef border_col;
+         border_col = lp_build_extract_broadcast(bld->gallivm,
+                                                 border_type,
+                                                 bld->texel_type,
+                                                 bld->border_color_clamped,
+                                                 zero);
+         /*
+          * Replace first 3 chans (match what fetch did).
+          */
+         for (chan = 0; chan < 3; chan++) {
+            texel_out[chan] = lp_build_select(&bld->texel_bld, use_border,
+                                              border_col, texel_out[chan]);
+         }
+         return;
+      }
+
       for (chan = 0; chan < 4; chan++) {
          unsigned chan_s;
          /* reverse-map channel... */
--
2.1.4
Re: [Mesa-dev] [PATCH] i965/gen8/cs: fix constant push buffer
Thanks Iago!

This patch fixes not only the ssbo tests mentioned below, but also a lot
of other GLES 3.1 CTS tests.

> -Original Message-
> From: Iago Toral Quiroga [mailto:ito...@igalia.com]
> Sent: Tuesday, December 15, 2015 12:55 PM
> To: mesa-dev@lists.freedesktop.org
> Cc: Lofstedt, Marta; Justen, Jordan L; Palli, Tapani; Iago Toral Quiroga
> Subject: [PATCH] i965/gen8/cs: fix constant push buffer
>
> Page 502 of the Command Reference Broadwell PRM says that CURBE Total
> Data Length must be 64-bit aligned.
>
> Fixes the following CTS tests:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..dbd1967 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
>
>     unsigned push_constant_data_size =
>        (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> +   unsigned reg_aligned_constant_size =
> +      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>     unsigned push_constant_regs = reg_aligned_constant_size / 32;
>     unsigned threads = get_cs_thread_count(cs_prog_data);
>
> @@ -241,7 +242,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
>
>        const unsigned push_constant_data_size =
>           (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
> -      const unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> +      const unsigned reg_aligned_constant_size =
> +         ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>        const unsigned param_aligned_count =
>           reg_aligned_constant_size / sizeof(*param);
>
> --
> 1.9.1
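To make the effect of the quoted change concrete, here is a small stand-alone sketch of the size computation. `align_up` and `cs_push_constant_regs` are hypothetical names used for illustration (the driver uses its own `ALIGN` macro); the 32/64 alignment choice and the division by 32 are taken from the patch.

```c
#include <assert.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static unsigned
align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Number of 32-byte push constant registers for a given payload size,
 * using the stricter 64 alignment on gen >= 8 as in the patch. */
static unsigned
cs_push_constant_regs(unsigned data_size_bytes, int gen)
{
   unsigned aligned = align_up(data_size_bytes, gen < 8 ? 32 : 64);
   return aligned / 32;
}
```

With a 72-byte payload, for example, this yields 3 registers before gen8 and 4 on gen8 and later, so the register count on Broadwell is always even.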
Re: [Mesa-dev] [PATCH] mesa: remove validation of shaders that should be done elsewhere
On 12/15/2015 01:25 AM, Timothy Arceri wrote:
> On Wed, 2015-12-09 at 00:17 +1100, Timothy Arceri wrote:
>> In core profile even if re-linking fails rendering shouldn't fail as the
>> previous successfully linked program will still be available. It also
>> shouldn't be possible to have an unlinked program as part of the current
>> rendering state.
>
> Hey guys,
>
> Any thoughts on this change? Thinking about this some more we should
> probably rework the compat code also and only do the check for link
> status if there is an assembly shader right?

I wanted to hear from others first since to me this change seems specific
to separate shader programs (I had a patch on list that skipped the check
for those programs that were not in use by the current pipeline). The
reason is that with regular programs I can't see a way to continue if
relinking fails (because the program is now in a bad state). I think the
user should detach the malfunctioning stage and link again. However, with
SSO a relink of an unused stage may fail but we can still have a complete
working program with stages marked as used.

> Thanks,
> Tim

// Tapani
[Mesa-dev] [OT] some contribution statistics
Hello all,

when Steam first announced they'd give all present and future games free
to all Mesa contributors with at least 25 commits[1], I was curious to see
how many people would be affected by this choice, so I ran some statistics
on the number of committers (and contributions by committer) on Mesa at
the time, with the following results (for April 9, 2015):

# count: 684
# min: 1
# max: 14020
# mid: 7010.5
# range: 14019
# mean: 101.35818713450293
# stddev: 652.7501707733724
# mode(s): 1
# median: 2
# quartiles: 1 2 10
# IQR: 9

Having come across an old discussion about these stats, I decided to rerun
the stats now, and the results are:

# count: 736
# min: 1
# max: 14310
# mid: 7155.5
# range: 14309
# mean: 102.19701086956522
# stddev: 651.6642244733528
# mode(s): 1
# median: 3
# quartiles: 1 3 12
# IQR: 11

And I would say that this counts as an improvement: the mean number of
contributions per developer has gone up, but most importantly the _median_
contribution has gone up (from 2 to 3 contributions), and ditto for the
upper quartile (from 10 to 12). In some sense, the 'long tail' of
contributors in Mesa has _shortened_ in these 8 months, even though the
number of contributions has increased!

The only problem with these numbers is actually the lack of a .mailmap to
normalize contributor name/emails, which obviously skews the results a
little bit towards the lower end. I don't suppose someone has a .mailmap
for Mesa contributors, or is interested in creating one?

[1]: http://lists.freedesktop.org/archives/dri-devel/2015-April/081045.html

--
Giuseppe "Oblomov" Bilotta
Re: [Mesa-dev] [PATCH 1/3] nir/lower_system_values: Stop supporting non-SSA
On Tue, Dec 15, 2015 at 12:26 PM, Eric Anholt wrote:
> Jason Ekstrand writes:
>
>> The one user of this (i965) only ever calls it while in SSA form.
>
> This series is:
>
> Reviewed-by: Eric Anholt

Thanks! Did you happen to run it on something that actually uses clip
plane lowering? I'd like to not break things.

--Jason
Re: [Mesa-dev] [OT] some contribution statistics
On Tuesday, December 15, 2015 02:23:07 PM Giuseppe Bilotta wrote:
> The only problem with these numbers is actually the lack of a .mailmap
> to normalize contributor name/emails, which obviously skews the
> results a little bit towards the lower end. I don't suppose someone
> has a .mailmap for Mesa contributors, or is interested in creating
> one?

I actually have one of those!

http://cgit.freedesktop.org/~kwg/mesa/commit/?h=gitdm

--Ken
Re: [Mesa-dev] [PATCH 01/11] mesa: Don't leak ATIfs instructions in DeleteFragmentShader
This patch is

Reviewed-by: Ian Romanick
Cc: "11.0 11.1"

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/main/atifragshader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
> index 935ba05..3ddc51d 100644
> --- a/src/mesa/main/atifragshader.c
> +++ b/src/mesa/main/atifragshader.c
> @@ -293,7 +293,7 @@ _mesa_DeleteFragmentShaderATI(GLuint id)
>           prog->RefCount--;
>           if (prog->RefCount <= 0) {
>              assert(prog != );
> -            free(prog);
> +            _mesa_delete_ati_fragment_shader(ctx, prog);
>           }
>        }
>  }
>
Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
On Dec 15, 2015 6:19 PM, "Ilia Mirkin"wrote: > > > On Dec 15, 2015 8:59 PM, "Ian Romanick" wrote: > > > > On 12/15/2015 05:08 PM, Ilia Mirkin wrote: > > > On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick wrote: > > >> On 12/15/2015 04:40 PM, Ilia Mirkin wrote: > > >>> Hardly a complete review, but a handful of comments: > > >>> > > >>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté wrote: > > --- > > src/mesa/Makefile.sources | 1 + > > src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 ++ > > src/mesa/state_tracker/st_atifs_to_tgsi.h | 49 ++ > > src/mesa/state_tracker/st_atom_constbuf.c | 14 + > > src/mesa/state_tracker/st_cb_drawpixels.c | 1 + > > src/mesa/state_tracker/st_cb_program.c| 35 +- > > src/mesa/state_tracker/st_program.c | 22 + > > src/mesa/state_tracker/st_program.h | 1 + > > 8 files changed, 920 insertions(+), 1 deletion(-) > > create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c > > create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h > > > > +static struct ureg_src prepare_argument(struct st_translate *t, const unsigned argId, > > + const struct atifragshader_src_register *srcReg) > > +{ > > + struct ureg_src src = get_source(t, srcReg->Index); > > + struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId); > > + > > + switch (srcReg->argRep) { > > + case GL_NONE: > > + break; > > + case GL_RED: > > + src = ureg_swizzle(src, > > + TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X); > > + break; > > + case GL_GREEN: > > + src = ureg_swizzle(src, > > + TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y); > > + break; > > + case GL_BLUE: > > + src = ureg_swizzle(src, > > + TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z); > > + break; > > + case GL_ALPHA: > > + src = ureg_swizzle(src, > > + TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W); > > + break; > > + } > > + emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1); > > + > > + if (srcReg->argMod & GL_COMP_BIT_ATI) { > > + struct 
ureg_src modsrc[2]; > > + modsrc[0] = ureg_imm1f(t->ureg, 1.0); > > + modsrc[1] = ureg_src(arg); > > + > > + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); > > + } > > + if (srcReg->argMod & GL_BIAS_BIT_ATI) { > > + struct ureg_src modsrc[2]; > > + modsrc[0] = ureg_src(arg); > > + modsrc[1] = ureg_imm1f(t->ureg, 0.5); > > + > > + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); > > + } > > + if (srcReg->argMod & GL_2X_BIT_ATI) { > > + struct ureg_src modsrc[2]; > > + modsrc[0] = ureg_src(arg); > > + modsrc[1] = ureg_imm1f(t->ureg, 2.0); > > + > > + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); > > >>> > > >>> aka ADD arg, arg, arg > > >>> > > + } > > + if (srcReg->argMod & GL_NEGATE_BIT_ATI) { > > + struct ureg_src modsrc[2]; > > + modsrc[0] = ureg_src(arg); > > + modsrc[1] = ureg_imm1f(t->ureg, -1.0); > > + > > + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); > > >>> > > >>> aka NEG arg, arg > > >>> > > + } > > + return ureg_src(arg); > > +} > > + > > +/* These instructions have no direct equivalent in TGSI */ > > +static void emit_special_inst(struct st_translate *t, struct instruction_desc *desc, > > + struct ureg_dst *dst, struct ureg_src *args, unsigned argcount) > > +{ > > + struct ureg_dst tmp[1]; > > + struct ureg_src src[3]; > > + > > + if(desc->special == 1) { > > + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3 > > + src[0] = ureg_imm1f(t->ureg, 0.5f); > > + src[1] = args[2]; > > + emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2); > > + src[0] = ureg_src(tmp[0]); > > + src[1] = args[0]; > > + src[2] = args[1]; > > + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); > > + } else if (desc->special == 2) { > > + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3 > > + src[0] = args[2]; > > + src[1] = ureg_imm1f(t->ureg, 0.0f); > > + emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2); > > + src[0] = ureg_src(tmp[0]); > > + src[1] = args[0]; > > + src[2] = args[1]; > > + emit_insn(t, TGSI_OPCODE_LRP, dst,
Re: [Mesa-dev] [PATCH] ir_to_mesa: Skip useless comparison instructions.
On Mon, Dec 7, 2015 at 10:50 AM, Matt Turner wrote:
> ---
> With this, we generate the same number of Mesa IR instructions before
> and after my series. all() is the same as well.

Maybe Ian could have a look?
Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanickwrote: > On 12/15/2015 04:40 PM, Ilia Mirkin wrote: >> Hardly a complete review, but a handful of comments: >> >> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté wrote: >>> --- >>> src/mesa/Makefile.sources | 1 + >>> src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 >>> ++ >>> src/mesa/state_tracker/st_atifs_to_tgsi.h | 49 ++ >>> src/mesa/state_tracker/st_atom_constbuf.c | 14 + >>> src/mesa/state_tracker/st_cb_drawpixels.c | 1 + >>> src/mesa/state_tracker/st_cb_program.c| 35 +- >>> src/mesa/state_tracker/st_program.c | 22 + >>> src/mesa/state_tracker/st_program.h | 1 + >>> 8 files changed, 920 insertions(+), 1 deletion(-) >>> create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c >>> create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h >>> >>> +static struct ureg_src prepare_argument(struct st_translate *t, const >>> unsigned argId, >>> + const struct atifragshader_src_register *srcReg) >>> +{ >>> + struct ureg_src src = get_source(t, srcReg->Index); >>> + struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId); >>> + >>> + switch (srcReg->argRep) { >>> + case GL_NONE: >>> + break; >>> + case GL_RED: >>> + src = ureg_swizzle(src, >>> + TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, >>> TGSI_SWIZZLE_X); >>> + break; >>> + case GL_GREEN: >>> + src = ureg_swizzle(src, >>> + TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, >>> TGSI_SWIZZLE_Y); >>> + break; >>> + case GL_BLUE: >>> + src = ureg_swizzle(src, >>> + TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, >>> TGSI_SWIZZLE_Z); >>> + break; >>> + case GL_ALPHA: >>> + src = ureg_swizzle(src, >>> + TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, >>> TGSI_SWIZZLE_W); >>> + break; >>> + } >>> + emit_insn(t, TGSI_OPCODE_MOV, , 1, , 1); >>> + >>> + if (srcReg->argMod & GL_COMP_BIT_ATI) { >>> + struct ureg_src modsrc[2]; >>> + modsrc[0] = ureg_imm1f(t->ureg, 1.0); >>> + modsrc[1] = ureg_src(arg); >>> + >>> + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); 
>>> + } >>> + if (srcReg->argMod & GL_BIAS_BIT_ATI) { >>> + struct ureg_src modsrc[2]; >>> + modsrc[0] = ureg_src(arg); >>> + modsrc[1] = ureg_imm1f(t->ureg, 0.5); >>> + >>> + emit_insn(t, TGSI_OPCODE_SUB, , 1, modsrc, 2); >>> + } >>> + if (srcReg->argMod & GL_2X_BIT_ATI) { >>> + struct ureg_src modsrc[2]; >>> + modsrc[0] = ureg_src(arg); >>> + modsrc[1] = ureg_imm1f(t->ureg, 2.0); >>> + >>> + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); >> >> aka ADD arg, arg, arg >> >>> + } >>> + if (srcReg->argMod & GL_NEGATE_BIT_ATI) { >>> + struct ureg_src modsrc[2]; >>> + modsrc[0] = ureg_src(arg); >>> + modsrc[1] = ureg_imm1f(t->ureg, -1.0); >>> + >>> + emit_insn(t, TGSI_OPCODE_MUL, , 1, modsrc, 2); >> >> aka NEG arg, arg >> >>> + } >>> + return ureg_src(arg); >>> +} >>> + >>> +/* These instructions have no direct equivalent in TGSI */ >>> +static void emit_special_inst(struct st_translate *t, struct >>> instruction_desc *desc, >>> + struct ureg_dst *dst, struct ureg_src *args, unsigned argcount) >>> +{ >>> + struct ureg_dst tmp[1]; >>> + struct ureg_src src[3]; >>> + >>> + if(desc->special == 1) { >>> + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // >>> re-purpose a3 >>> + src[0] = ureg_imm1f(t->ureg, 0.5f); >>> + src[1] = args[2]; >>> + emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2); >>> + src[0] = ureg_src(tmp[0]); >>> + src[1] = args[0]; >>> + src[2] = args[1]; >>> + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); >>> + } else if (desc->special == 2) { >>> + tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // >>> re-purpose a3 >>> + src[0] = args[2]; >>> + src[1] = ureg_imm1f(t->ureg, 0.0f); >>> + emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2); >>> + src[0] = ureg_src(tmp[0]); >>> + src[1] = args[0]; >>> + src[2] = args[1]; >>> + emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3); >> >> Isn't this the CMP instruction? Just flip the args. 
>> >> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP >> >> The other one should be expressible as CMP as well I think. >> >>> + } else if (desc->special == 3) { >>> + src[0] = args[0]; >>> + src[1] = args[1]; >>> + src[2] = ureg_swizzle(args[2], >>> +TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, >>> TGSI_SWIZZLE_Z); >>> + emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3); >>> + } >>> +} >>> + >>> +static void emit_arith_inst(struct st_translate *t, >>> + struct instruction_desc *desc, >>> + struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
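The review comment about CMP can be checked with a scalar model of the opcodes involved. The sketch below assumes the per-channel semantics given in the Gallium TGSI documentation (SLT/SGE produce 1.0 or 0.0, LRP is `src0*src1 + (1-src0)*src2`, CMP is `src0 < 0 ? src1 : src2`); the `special1_*` and `special2_*` function names are hypothetical and simply mirror the `desc->special` values in the quoted patch. The CMP form of case 1 is one possible reading of "should be expressible as CMP", not something the thread spells out.

```c
/* Scalar, single-channel models of the TGSI opcodes used in the patch. */
static float tgsi_slt(float a, float b) { return a < b ? 1.0f : 0.0f; }
static float tgsi_sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }
static float tgsi_lrp(float w, float a, float b) { return w * a + (1.0f - w) * b; }
static float tgsi_cmp(float c, float a, float b) { return c < 0.0f ? a : b; }

/* special == 2 as written in the patch: SGE followed by LRP. */
static float special2_lrp(float a0, float a1, float a2)
{
   return tgsi_lrp(tgsi_sge(a2, 0.0f), a0, a1);
}

/* The same selection as a single CMP with the value arguments flipped. */
static float special2_cmp(float a0, float a1, float a2)
{
   return tgsi_cmp(a2, a1, a0);
}

/* special == 1 as written in the patch: SLT(0.5, a2) followed by LRP. */
static float special1_lrp(float a0, float a1, float a2)
{
   return tgsi_lrp(tgsi_slt(0.5f, a2), a0, a1);
}

/* Assumed CMP form: compare (0.5 - a2) against zero instead. */
static float special1_cmp(float a0, float a1, float a2)
{
   return tgsi_cmp(0.5f - a2, a0, a1);
}
```

For finite inputs both forms pick the same operand; they can differ when `a2` is NaN, which may or may not matter for this extension.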
Re: [Mesa-dev] [PATCH 4/5] i965: Enable compute shaders in more cases for OpenGLES 3.1
On 2015-12-15 17:00:55, Ian Romanick wrote:
> Doesn't this make patch 3 irrelevant? FWIW, I like this better.

This change only updates the way we program some constants. It is for a
local stage_exists array, which we then use later in the same function
when programming context constants.

For example, without this change, I don't think image_load_store has any
images to work with for the compute stage.

-Jordan

>
> On 12/15/2015 04:08 PM, Jordan Justen wrote:
> > Previously we were checking the desktop OpenGL ARB_compute_shader
> > requirements, but for OpenGLES 3.1, the requirements are lower.
> >
> > Signed-off-by: Jordan Justen
> > Cc: Marta Lofstedt
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> > index 0abe601..5105625 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -377,7 +377,10 @@ brw_initialize_context_constants(struct brw_context *brw)
> >        [MESA_SHADER_GEOMETRY] = brw->gen >= 6,
> >        [MESA_SHADER_FRAGMENT] = true,
> >        [MESA_SHADER_COMPUTE] =
> > -         (ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> > +         (ctx->API == API_OPENGL_CORE &&
> > +          ctx->Const.MaxComputeWorkGroupSize[0] >= 1024) ||
> > +         (ctx->API == API_OPENGLES2 &&
> > +          ctx->Const.MaxComputeWorkGroupSize[0] >= 128) ||
> >           _mesa_extension_override_enables.ARB_compute_shader,
> >     };
> >
> >
>
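The condition introduced by the quoted hunk can be read as a small predicate. The following is a simplified sketch, not the driver code: `enum api_kind` and `compute_stage_exists` are hypothetical stand-ins for Mesa's `ctx->API` values and the `stage_exists[MESA_SHADER_COMPUTE]` initializer, and only the 1024 and 128 thresholds come from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the API values compared in the patch. */
enum api_kind { API_KIND_GL_CORE, API_KIND_GLES2 };

/* Whether the compute stage should be treated as present, following the
 * per-API work group size thresholds from the quoted change. */
static bool
compute_stage_exists(enum api_kind api, unsigned max_group_size_x,
                     bool override_enabled)
{
   return (api == API_KIND_GL_CORE && max_group_size_x >= 1024) ||
          (api == API_KIND_GLES2 && max_group_size_x >= 128) ||
          override_enabled;
}
```

A device reporting a maximum work group size of 512 would therefore get a compute stage in an ES context but not in a desktop core context, unless the extension override is set.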
Re: [Mesa-dev] [PATCH 07/11] program: fix comment about the fog formula
Yes... that matches the GL_ARB_fragment_program spec.

This patch is

Reviewed-by: Ian Romanick

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/program/prog_statevars.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/program/prog_statevars.c b/src/mesa/program/prog_statevars.c
> index bdb335e..12490d0 100644
> --- a/src/mesa/program/prog_statevars.c
> +++ b/src/mesa/program/prog_statevars.c
> @@ -474,7 +474,7 @@ _mesa_fetch_state(struct gl_context *ctx, const gl_state_index state[],
>         * single MAD.
>         * linear: fogcoord * -1/(end-start) + end/(end-start)
>         * exp: 2^-(density/ln(2) * fogcoord)
> -       * exp2: 2^-((density/(ln(2)^2) * fogcoord)^2)
> +       * exp2: 2^-((density/(sqrt(ln(2))) * fogcoord)^2)
>         */
>        value[0] = (ctx->Fog.End == ctx->Fog.Start)
>           ? 1.0f : (GLfloat)(-1.0F / (ctx->Fog.End - ctx->Fog.Start));
>
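The corrected comment can be verified by rewriting the exponential fog factors in base 2. A short derivation, assuming the usual definitions of the EXP and EXP2 fog modes with density d and fog coordinate c (the symbols f, d and c are chosen here for illustration):

```latex
\begin{align*}
f_{\mathrm{exp}}  &= e^{-d c}
                   = 2^{-\frac{d}{\ln 2}\, c} \\
f_{\mathrm{exp2}} &= e^{-(d c)^2}
                   = 2^{-\frac{(d c)^2}{\ln 2}}
                   = 2^{-\left(\frac{d}{\sqrt{\ln 2}}\, c\right)^2}
\end{align*}
```

Since the division by ln 2 sits outside the square, moving it inside turns it into a division by sqrt(ln 2), which agrees with the patched comment and not with the old ln(2)^2 form.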
Re: [Mesa-dev] [PATCH 3/5] main/version: Don't require ARB_compute_shader for OpenGLES 3.1
On 12/15/2015 05:01 PM, Jordan Justen wrote:
> On 2015-12-15 16:50:39, Ian Romanick wrote:
>> On 12/15/2015 04:08 PM, Jordan Justen wrote:
>>> The OpenGL ARB_compute_shader extension specification requires at least
>>> 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1
>>> only required 128.
>>
>> Does this mean that extensions->ARB_compute_shader is not set?
>
> Yes. I think we can't set this in some cases due to desktop GL
> requirements, but we should still be able to support CS on ES 3.1.
>
>> I'm a little bit nervous about that. Are we sure that we check for
>> compute shader support correctly everywhere (i.e., don't just check
>> the extension bit that isn't set)?
>
> I think we have it pretty well covered. The ES 3.1 CTS seems pretty
> happy with what we have.
>
> That said, patch 2 was yet another fix to use
> _mesa_has_compute_shaders, and I wouldn't be surprised if we ended up
> finding some more. (I did try to grep to find anything we might have
> missed.)

I just did that too. I didn't see anything that looked problematic except:

src/mesa/main/get.c:/* HACK: remove when ARB_compute_shader is actually supported */

This patch is

Reviewed-by: Ian Romanick

> -Jordan
>
>>> Signed-off-by: Jordan Justen
>>> Cc: Ian Romanick
>>> Cc: Marta Lofstedt
>>> ---
>>>  src/mesa/main/version.c | 9 ++---
>>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
>>> index e92bb11..112a73d 100644
>>> --- a/src/mesa/main/version.c
>>> +++ b/src/mesa/main/version.c
>>> @@ -433,7 +433,8 @@ compute_version_es1(const struct gl_extensions *extensions)
>>>  }
>>>
>>>  static GLuint
>>> -compute_version_es2(const struct gl_extensions *extensions)
>>> +compute_version_es2(const struct gl_extensions *extensions,
>>> +                    const struct gl_constants *consts)
>>>  {
>>>     /* OpenGL ES 2.0 is derived from OpenGL 2.0 */
>>>     const bool ver_2_0 = (extensions->ARB_texture_cube_map &&
>>> @@ -464,9 +465,11 @@ compute_version_es2(const struct gl_extensions *extensions)
>>>                           extensions->EXT_texture_snorm &&
>>>                           extensions->NV_primitive_restart &&
>>>                           extensions->OES_depth_texture_cube_map);
>>> +   const bool es31_compute_shader =
>>> +      consts->MaxComputeWorkGroupInvocations >= 128;
>>>     const bool ver_3_1 = (ver_3_0 &&
>>>                           extensions->ARB_arrays_of_arrays &&
>>> -                         extensions->ARB_compute_shader &&
>>> +                         es31_compute_shader &&
>>>                           extensions->ARB_draw_indirect &&
>>>                           extensions->ARB_explicit_uniform_location &&
>>>                           extensions->ARB_framebuffer_no_attachments &&
>>> @@ -508,7 +511,7 @@ _mesa_get_version(const struct gl_extensions *extensions,
>>>     case API_OPENGLES:
>>>        return compute_version_es1(extensions);
>>>     case API_OPENGLES2:
>>> -      return compute_version_es2(extensions);
>>> +      return compute_version_es2(extensions, consts);
>>>     }
>>>     return 0;
>>>  }
>>>
[Mesa-dev] [PATCH v4 1/1] i965: add opportunistic behaviour to opt_vector_float()
opt_vector_float() transforms several scalar MOV operations into a single
vector MOV. This is done when those MOVs cover all the components of the
destination register. So something like:

   mov vgrf3.0.xy:D, 0D
   mov vgrf3.0.w:D, 1065353216D
   mov vgrf3.0.z:D, 0D

is transformed into:

   mov vgrf3.0:F, [0F, 0F, 0F, 1F]

But there are cases where not all the components are written. For example,
in:

   mov vgrf2.0.x:D, 1073741824D
   mov vgrf3.0.xy:D, 0D
   mov vgrf3.0.w:D, 1065353216D
   mov vgrf4.0.xy:D, 1065353216D
   mov vgrf4.0.w:D, 0D
   mov vgrf6.0:UD, u4.xyzw:UD

Neither the vgrf3 nor the vgrf4 .z component is written, so the
optimization is not applied.

But it could be applied anyway to the components that are covered, using a
writemask to select the ones written. So we could transform it into:

   mov vgrf2.0.x:D, 1073741824D
   mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
   mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
   mov vgrf6.0:UD, u4.xyzw:UD

This commit does precisely that: opportunistically apply
opt_vector_float() when possible.

The improvement obtained over current upstream
(11.1-branchpoint-654-gc51c09c) is:

total instructions in shared programs: 6846435 -> 6838649 (-0.11%)
instructions in affected programs:     393820 -> 386034 (-1.98%)
total loops in shared programs:        1971 -> 1971 (0.00%)
helped:                                3980
HURT:                                  0
GAINED:                                0
LOST:                                  0

v2: change vectorize_mov() signature (Matt).
v3: take predicates into account (Juan).

Signed-off-by: Juan A. Suarez Romero
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 62 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  4 +++
 2 files changed, 44 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index a697bdf..ffbbf1a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -309,6 +309,29 @@ src_reg::equals(const src_reg ) const
 }
 
 bool
+vec4_visitor::vectorize_mov(bblock_t *block, vec4_instruction *inst, uint8_t imm[4],
+                            vec4_instruction *imm_inst[4], int inst_count,
+                            unsigned writemask)
+{
+   if (inst_count < 2) {
+      return false;
+   }
+
+   unsigned vf;
+   memcpy(&vf, imm, sizeof(vf));
+   vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf));
+   mov->dst.type = BRW_REGISTER_TYPE_F;
+   mov->dst.writemask = writemask;
+   inst->insert_before(block, mov);
+
+   for (int i = 0; i < inst_count; i++) {
+      imm_inst[i]->remove(block);
+   }
+
+   return true;}
+
+
+bool
 vec4_visitor::opt_vector_float()
 {
    bool progress = false;
@@ -316,27 +339,37 @@ vec4_visitor::opt_vector_float()
    int last_reg = -1, last_reg_offset = -1;
    enum brw_reg_file last_reg_file = BAD_FILE;
 
-   int remaining_channels = 0;
-   uint8_t imm[4];
+   uint8_t imm[4] = { 0 };
    int inst_count = 0;
    vec4_instruction *imm_inst[4];
+   unsigned writemask = 0;
 
    foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
       if (last_reg != inst->dst.nr ||
           last_reg_offset != inst->dst.reg_offset ||
           last_reg_file != inst->dst.file) {
+
+         progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, writemask);
+
+         inst_count = 0;
+         writemask = 0;
          last_reg = inst->dst.nr;
          last_reg_offset = inst->dst.reg_offset;
          last_reg_file = inst->dst.file;
-         remaining_channels = WRITEMASK_XYZW;
-
-         inst_count = 0;
+         for (int i = 0; i < 4; i++) {
+            imm[i] = 0;
+         }
       }
 
       if (inst->opcode != BRW_OPCODE_MOV ||
           inst->dst.writemask == WRITEMASK_XYZW ||
-          inst->src[0].file != IMM)
+          inst->src[0].file != IMM ||
+          inst->predicate != BRW_PREDICATE_NONE) {
+         progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, writemask);
+         inst_count = 0;
+         last_reg = -1;
          continue;
+      }
 
       int vf = brw_float_to_vf(inst->src[0].f);
       if (vf == -1)
@@ -351,23 +384,8 @@ vec4_visitor::opt_vector_float()
       if ((inst->dst.writemask & WRITEMASK_W) != 0)
          imm[3] = vf;
 
+      writemask |= inst->dst.writemask;
       imm_inst[inst_count++] = inst;
-
-      remaining_channels &= ~inst->dst.writemask;
-      if (remaining_channels == 0) {
-         unsigned vf;
-         memcpy(&vf, imm, sizeof(vf));
-         vec4_instruction *mov = MOV(inst->dst, brw_imm_vf(vf));
-         mov->dst.type = BRW_REGISTER_TYPE_F;
-         mov->dst.writemask = WRITEMASK_XYZW;
-         inst->insert_after(block, mov);
-         last_reg = -1;
-
-         for (int i = 0; i < inst_count; i++) {
-            imm_inst[i]->remove(block);
-         }
-         progress = true;
-      }
    }
 
    if (progress)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h
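The grouping step described in the commit message can be sketched in plain C. This is only an illustration under simplifying assumptions: `sk_mov`, `sk_vector_mov` and `sk_vectorize_mov` are hypothetical names, the 8-bit immediates are treated as opaque values, and the register tracking and predicate checks of the real pass are left out. It shows the two things the patch changes: the group is merged even when some channels are missing, and the merged MOV carries the union of the writemasks instead of a fixed XYZW mask.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
   SK_WRITEMASK_X = 1 << 0,
   SK_WRITEMASK_Y = 1 << 1,
   SK_WRITEMASK_Z = 1 << 2,
   SK_WRITEMASK_W = 1 << 3,
};

/* Hypothetical, simplified stand-in for a scalar immediate MOV. */
struct sk_mov {
   unsigned writemask;  /* channels written by this MOV */
   uint8_t vf;          /* immediate, already encoded as an 8-bit value */
};

struct sk_vector_mov {
   unsigned writemask;  /* union of the channels written by the group */
   uint32_t imm;        /* four 8-bit immediates copied in x, y, z, w order */
};

/* Merge `count` MOVs that all target the same register. Returns 1 and
 * fills `out` when merging is worthwhile (at least two MOVs), 0 otherwise. */
int
sk_vectorize_mov(const struct sk_mov *movs, int count,
                 struct sk_vector_mov *out)
{
   uint8_t imm[4] = { 0, 0, 0, 0 };
   unsigned writemask = 0;

   assert(movs != NULL && out != NULL);

   if (count < 2)
      return 0;

   for (int i = 0; i < count; i++) {
      /* Record the immediate for every channel this MOV writes. */
      for (int chan = 0; chan < 4; chan++) {
         if (movs[i].writemask & (1u << chan))
            imm[chan] = movs[i].vf;
      }
      writemask |= movs[i].writemask;
   }

   memcpy(&out->imm, imm, sizeof(out->imm));
   out->writemask = writemask;
   return 1;
}
```

Feeding it an .xy MOV and a .w MOV yields one merged MOV with an .xyw writemask, which mirrors the vgrf3 case in the commit message; a lone MOV is left untouched.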
[Mesa-dev] [PATCH v4 0/1] i965: add opportunistic behaviour to opt_vector_float()
While working on a related issue, I found out that the previous patch (and
the original version) were applying opt_vector_float incorrectly in some
cases.

Specifically, for this piece of code:

   cmp.nz.f0.0 null:F, vgrf6.xyzz:F, vgrf17.xyzz:F
   mov vgrf2.0.x:D, 0D
   (+f0.0.any4h) mov vgrf2.0.x:D, -1D
   mov vgrf2.0.yzw:D, 0D

opt_vector_float was generating:

   cmp.nz.f0.0 null:F, vgrf6.xyzz:F, vgrf17.xyzz:F
   (+f0.0.any4h) mov vgrf2.0.x:D, -1D
   mov vgrf2.0:F, [0F, 0F, 0F, 0F]
   cmp.nz.f0.0 null:D, vgrf2.xyzw:D, 0D

As can be noticed, in the former code vgrf2.x could be 0 or -1, depending on
the predicate, while in the result it is always 0.

The problem is that the optimization was ignoring the predicate when it was
applied.

The next patch updates the previous version to fix this problem.

Juan A. Suarez Romero (1):
  i965: add opportunistic behaviour to opt_vector_float()

 src/mesa/drivers/dri/i965/brw_vec4.cpp | 62 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  4 +++
 2 files changed, 44 insertions(+), 22 deletions(-)

-- 
2.5.0
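The hazard described above can be reproduced with a tiny model of the MOV sequence. The sketch below is an illustration only: `sk_inst`, `sk_run`, `sk_original` and `sk_miscompiled` are hypothetical names, the predicate is reduced to a single boolean, and only the MOVs into vgrf2 are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of an immediate MOV into one vec4 register. */
struct sk_inst {
   unsigned writemask;  /* bit 0 = x ... bit 3 = w */
   int value;           /* immediate written to each enabled channel */
   bool predicated;     /* executes only when the flag is set */
};

/* Execute `count` instructions on `reg`, with `flag` as the predicate value. */
void
sk_run(const struct sk_inst *insts, size_t count, bool flag, int reg[4])
{
   assert(insts != NULL && reg != NULL);

   for (size_t i = 0; i < count; i++) {
      if (insts[i].predicated && !flag)
         continue;
      for (int chan = 0; chan < 4; chan++) {
         if (insts[i].writemask & (1u << chan))
            reg[chan] = insts[i].value;
      }
   }
}

/* The three MOVs from the report, in program order. */
const struct sk_inst sk_original[3] = {
   { 0x1, 0, false },   /* mov vgrf2.0.x:D, 0D */
   { 0x1, -1, true },   /* (+f0.0.any4h) mov vgrf2.0.x:D, -1D */
   { 0xe, 0, false },   /* mov vgrf2.0.yzw:D, 0D */
};

/* What the faulty pass produced: the unpredicated MOVs merged into one
 * full-width MOV placed after the predicated one. */
const struct sk_inst sk_miscompiled[2] = {
   { 0x1, -1, true },   /* (+f0.0.any4h) mov vgrf2.0.x:D, -1D */
   { 0xf, 0, false },   /* mov vgrf2.0:F, [0F, 0F, 0F, 0F] */
};
```

Running both sequences with the flag set leaves -1 in the x channel for the original order and 0 for the merged one, which is the difference the fix avoids by ending the current group whenever a predicated instruction is seen.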
Re: [Mesa-dev] [PATCH 8/9] nir: move to compiler
On Sat, Nov 28, 2015 at 8:45 AM, Emil Velikov wrote:
> On 27 November 2015 at 20:45, Jason Ekstrand wrote:
>> On Nov 27, 2015 11:26 AM, "Matt Turner" wrote:
>>> On Fri, Nov 27, 2015 at 6:50 AM, Emil Velikov
>>> wrote:
>>> > On 25 November 2015 at 22:01, Matt Turner wrote:
>>> >> On Wed, Nov 25, 2015 at 1:32 PM, Emil Velikov
>>> >> wrote:
>>> >
>>> >>> --- a/src/Makefile.am
>>> >>> +++ b/src/Makefile.am
>>> >>> @@ -23,6 +23,7 @@ SUBDIRS = . gtest util mapi/glapi/gen mapi
>>> >>>
>>> >>>  # XXX: conditionally include
>>> >>>  SUBDIRS += compiler
>>> >>> +SUBDIRS += compiler/nir
>>> >>
>>> >> We have a non-recursive build in src/glsl today. I don't want to go
>>> >> backwards.
>>> > Not sure I fully get that can you elaborate ? Are you concerned that
>>> > things won't build in parallel, increasing the compilation times ?
>>> >
>>> > On my dual core system running with -j2 results in approx 15 seconds
>>> > increase. I'm willing to take that trade off for the improved
>>> > readability. What is the difference on your system ?
>>>
>>> src/glsl has single Makefile that builds libglcpp, glcpp, libglsl,
>>> glsl_compiler, glsl_test, libnir, and various test programs, allowing
>>> all of these things to happen in parallel. The Makefile is perfectly
>>> maintainable as it is and there's no advantage of splitting it,
>>> especially when the work has been done to get things to this state
>>> (commits 86d30dea, efd201ca) and NIR was added without an additional
>>> Makefile.
>>
>> I would tend to agree. Making things hierarchical is nice but,
>> unfortunately, autotools makes this and parallelization mutually exclusive.
>
> Actually I have some ancient work where we benefit from both. Namely
> have a single top level Makefile.am, which directly includes the
> subdirectory Automake.mk files, resulting in one big Makefile at the
> very end.
>
> That aside can we get some quantitative representation of the penalty
> you guys see. On my (old) machine the difference is negligible ~15
> sec of a ~11 minute `make all' and ~16 minute `make distcheck'.

What happened to this? (Yes, "Busy doing a release" is a valid answer)

--Jason
Re: [Mesa-dev] [PATCH 08/11] mesa: improve debug log in atifragshader
This patch is Reviewed-by: Ian Romanick

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/main/atifragshader.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
> index d1c07c5..8b19a35 100644
> --- a/src/mesa/main/atifragshader.c
> +++ b/src/mesa/main/atifragshader.c
> @@ -349,6 +349,9 @@ _mesa_BeginFragmentShaderATI(void)
>     ctx->ATIFragmentShader.Current->isValid = GL_FALSE;
>     ctx->ATIFragmentShader.Current->swizzlerq = 0;
>     ctx->ATIFragmentShader.Compiling = 1;
> +#if MESA_DEBUG_ATI_FS
> +   _mesa_debug(ctx, "%s %u\n", __func__, ctx->ATIFragmentShader.Current->Id);
> +#endif
>  }
>
>  void GLAPIENTRY
Re: [Mesa-dev] [PATCH 09/11] swrast: move two global defines to the only place where they are used
This patch is Reviewed-by: Ian Romanick

Assuming there are no objections, I'll push this in 24 hours.

On 12/15/2015 03:05 PM, Miklós Máté wrote:
> ---
>  src/mesa/main/mtypes.h            | 2 --
>  src/mesa/swrast/s_atifragshader.c | 2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 5c71ac4..99e7912 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2278,8 +2278,6 @@ struct gl_compute_program_state
>  /**
>   * ATI_fragment_shader runtime state
>   */
> -#define ATI_FS_INPUT_PRIMARY 0
> -#define ATI_FS_INPUT_SECONDARY 1
>
>  struct atifs_instruction;
>  struct atifs_setupinst;
> diff --git a/src/mesa/swrast/s_atifragshader.c
> b/src/mesa/swrast/s_atifragshader.c
> index 2974dee..414a414 100644
> --- a/src/mesa/swrast/s_atifragshader.c
> +++ b/src/mesa/swrast/s_atifragshader.c
> @@ -26,6 +26,8 @@
>  #include "swrast/s_atifragshader.h"
>  #include "swrast/s_context.h"
>
> +#define ATI_FS_INPUT_PRIMARY 0
> +#define ATI_FS_INPUT_SECONDARY 1
>
>  /**
>   * State for executing ATI fragment shader.
Re: [Mesa-dev] [PATCH 03/11] st/mesa: implement GL_ATI_fragment_shader
On 12/15/2015 05:08 PM, Ilia Mirkin wrote:
> On Tue, Dec 15, 2015 at 7:59 PM, Ian Romanick wrote:
>> On 12/15/2015 04:40 PM, Ilia Mirkin wrote:
>>> Hardly a complete review, but a handful of comments:
>>>
>>> On Tue, Dec 15, 2015 at 6:05 PM, Miklós Máté wrote:
>>>> ---
>>>>  src/mesa/Makefile.sources                 |   1 +
>>>>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 798 ++
>>>>  src/mesa/state_tracker/st_atifs_to_tgsi.h |  49 ++
>>>>  src/mesa/state_tracker/st_atom_constbuf.c |  14 +
>>>>  src/mesa/state_tracker/st_cb_drawpixels.c |   1 +
>>>>  src/mesa/state_tracker/st_cb_program.c    |  35 +-
>>>>  src/mesa/state_tracker/st_program.c       |  22 +
>>>>  src/mesa/state_tracker/st_program.h       |   1 +
>>>>  8 files changed, 920 insertions(+), 1 deletion(-)
>>>>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.c
>>>>  create mode 100644 src/mesa/state_tracker/st_atifs_to_tgsi.h
>>>>
>>>> +static struct ureg_src prepare_argument(struct st_translate *t, const unsigned argId,
>>>> +      const struct atifragshader_src_register *srcReg)
>>>> +{
>>>> +   struct ureg_src src = get_source(t, srcReg->Index);
>>>> +   struct ureg_dst arg = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+argId);
>>>> +
>>>> +   switch (srcReg->argRep) {
>>>> +   case GL_NONE:
>>>> +      break;
>>>> +   case GL_RED:
>>>> +      src = ureg_swizzle(src,
>>>> +            TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X, TGSI_SWIZZLE_X);
>>>> +      break;
>>>> +   case GL_GREEN:
>>>> +      src = ureg_swizzle(src,
>>>> +            TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y, TGSI_SWIZZLE_Y);
>>>> +      break;
>>>> +   case GL_BLUE:
>>>> +      src = ureg_swizzle(src,
>>>> +            TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z);
>>>> +      break;
>>>> +   case GL_ALPHA:
>>>> +      src = ureg_swizzle(src,
>>>> +            TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W, TGSI_SWIZZLE_W);
>>>> +      break;
>>>> +   }
>>>> +   emit_insn(t, TGSI_OPCODE_MOV, &arg, 1, &src, 1);
>>>> +
>>>> +   if (srcReg->argMod & GL_COMP_BIT_ATI) {
>>>> +      struct ureg_src modsrc[2];
>>>> +      modsrc[0] = ureg_imm1f(t->ureg, 1.0);
>>>> +      modsrc[1] = ureg_src(arg);
>>>> +
>>>> +      emit_insn(t, TGSI_OPCODE_SUB, &arg, 1, modsrc, 2);
>>>> +   }
>>>> +   if (srcReg->argMod & GL_BIAS_BIT_ATI) {
>>>> +      struct ureg_src modsrc[2];
>>>> +      modsrc[0] = ureg_src(arg);
>>>> +      modsrc[1] = ureg_imm1f(t->ureg, 0.5);
>>>> +
>>>> +      emit_insn(t, TGSI_OPCODE_SUB, &arg, 1, modsrc, 2);
>>>> +   }
>>>> +   if (srcReg->argMod & GL_2X_BIT_ATI) {
>>>> +      struct ureg_src modsrc[2];
>>>> +      modsrc[0] = ureg_src(arg);
>>>> +      modsrc[1] = ureg_imm1f(t->ureg, 2.0);
>>>> +
>>>> +      emit_insn(t, TGSI_OPCODE_MUL, &arg, 1, modsrc, 2);
>>>
>>> aka ADD arg, arg, arg
>>>
>>>> +   }
>>>> +   if (srcReg->argMod & GL_NEGATE_BIT_ATI) {
>>>> +      struct ureg_src modsrc[2];
>>>> +      modsrc[0] = ureg_src(arg);
>>>> +      modsrc[1] = ureg_imm1f(t->ureg, -1.0);
>>>> +
>>>> +      emit_insn(t, TGSI_OPCODE_MUL, &arg, 1, modsrc, 2);
>>>
>>> aka NEG arg, arg
>>>
>>>> +   }
>>>> +   return ureg_src(arg);
>>>> +}
>>>> +
>>>> +/* These instructions have no direct equivalent in TGSI */
>>>> +static void emit_special_inst(struct st_translate *t, struct instruction_desc *desc,
>>>> +      struct ureg_dst *dst, struct ureg_src *args, unsigned argcount)
>>>> +{
>>>> +   struct ureg_dst tmp[1];
>>>> +   struct ureg_src src[3];
>>>> +
>>>> +   if(desc->special == 1) {
>>>> +      tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3
>>>> +      src[0] = ureg_imm1f(t->ureg, 0.5f);
>>>> +      src[1] = args[2];
>>>> +      emit_insn(t, TGSI_OPCODE_SLT, tmp, 1, src, 2);
>>>> +      src[0] = ureg_src(tmp[0]);
>>>> +      src[1] = args[0];
>>>> +      src[2] = args[1];
>>>> +      emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>>>> +   } else if (desc->special == 2) {
>>>> +      tmp[0] = get_temp(t, MAX_NUM_FRAGMENT_REGISTERS_ATI+2); // re-purpose a3
>>>> +      src[0] = args[2];
>>>> +      src[1] = ureg_imm1f(t->ureg, 0.0f);
>>>> +      emit_insn(t, TGSI_OPCODE_SGE, tmp, 1, src, 2);
>>>> +      src[0] = ureg_src(tmp[0]);
>>>> +      src[1] = args[0];
>>>> +      src[2] = args[1];
>>>> +      emit_insn(t, TGSI_OPCODE_LRP, dst, 1, src, 3);
>>>
>>> Isn't this the CMP instruction? Just flip the args.
>>>
>>> http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-CMP
>>>
>>> The other one should be expressible as CMP as well I think.
>>>
>>>> +   } else if (desc->special == 3) {
>>>> +      src[0] = args[0];
>>>> +      src[1] = args[1];
>>>> +      src[2] = ureg_swizzle(args[2],
>>>> +            TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z, TGSI_SWIZZLE_Z);
>>>> +      emit_insn(t, TGSI_OPCODE_DP2A, dst, 1, src, 3);
>>>> +   }
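The review comment about CMP can be checked with scalar models of the opcodes. The sketch below is an illustration, not state tracker code: the `sk_*` helpers are hypothetical and implement the per-channel semantics as documented for TGSI (SGE yields 1.0 or 0.0, LRP is src0 * src1 + (1 - src0) * src2, CMP selects src1 when src0 < 0 and src2 otherwise). It compares the SGE + LRP pair used for the `special == 2` case with a single CMP whose value operands are swapped.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar models of the TGSI opcodes involved (one channel each). */
float sk_sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }
float sk_lrp(float t, float a, float b) { return t * a + (1.0f - t) * b; }
float sk_cmp(float c, float a, float b) { return c < 0.0f ? a : b; }

/* The "special == 2" path from the patch: SGE followed by LRP. */
float
sk_special2_sge_lrp(const float args[3])
{
   assert(args != NULL);

   float tmp = sk_sge(args[2], 0.0f);
   return sk_lrp(tmp, args[0], args[1]);
}

/* The reviewer's suggestion: a single CMP with the two value operands
 * flipped, so that a negative selector picks args[1]. */
float
sk_special2_cmp(const float args[3])
{
   assert(args != NULL);

   return sk_cmp(args[2], args[1], args[0]);
}
```

For finite inputs both functions select args[0] when args[2] >= 0 and args[1] otherwise. They are not guaranteed to agree when the selector is NaN, or when the operand that is not selected is infinite or NaN, since LRP still multiplies it by zero while CMP ignores it. The `special == 1` case compares against 0.5 instead of 0, so expressing it as CMP would need the selector rebased first (for example 0.5 - args[2]), which is an extra instruction in this model.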
[Mesa-dev] stencil texturing trouble
Hi,

looking at some piglit failures, I was wondering what is actually the
correct thing to do with stencil texturing. What do you put in the missing
channels? The GL spec seems to say depth texture mode is only applicable
to depth textures, so what is it then?
It looks like nvidia is returning the same value in all channels, but of
all the possibilities I can think of for what should be returned (honor
depth texture mode, treat it like a GL_RED texture, ...) this seems to be
about the least likely to actually be correct. Any pointers? Or maybe it
just doesn't matter?
(Oh and before I forget, the piglit texwrap test is quite busted wrt
stencil textures, so don't trust it...)

There's actually another problem related to gallium: right now mesa/st
uses (in contrast to depth textures) XYZW swizzle (albeit if it's a
depth/stencil texture in stencil sampling mode, it would use the swizzle
according to depth texture mode). That's quite problematic for several
reasons, not least because in gallium stencil textures essentially use the
"Y" component for sampling stencil, not X. Not to mention that the other
components do not map to anything (as opposed to "ordinary" color formats).
There's also some interface mismatch there which looks like it can't be
solved easily - we sample the Y component, however we still need to use
the X component from the border color. Hmm.

Roland