Re: [Mesa-dev] Mesa 11.1.0 release candidate 3
On 8 December 2015 at 04:33, Timothy Arceriwrote: > > On Mon, 2015-12-07 at 14:05 +, Emil Velikov wrote: >> The third release candidate for Mesa 11.1.0 is now available. >> > > Hi Emil, > > 76c09c1792ff1209bd34e1ae0e17b9c4929a892f > 618612f867ed8140409796535314d9b9ed58440b > 8117f46f496fb31339fc97a2501d5b3325a1fefb > 2ab9cd0c4dcefb3e63266cadc1e06079e67c3962 > > I sent to the stable list after commiting them to master. To make thing > s worse they got stuck in moderation as my collabora address wasn't > signed up to the stable list. > > These fix some rather important issues for SSO and would be good to get > into the 11.1 release. > > Let me know if I should resend them to stable or if anything else is > required. > Everything's fine Tim. I've kicked the mesa-stable queue a few days ago and these patches are on the list. I'll make sure I double-check for late nominations. Thanks for the reminder. Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/3] mesa: do not fail when stage not linked is not in use
Shader linking may fail but if glUseProgramStages does not take shader in use, previously set stages to pipeline should continue to work. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction Signed-off-by: Tapani Pälli--- src/mesa/main/context.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index d73c984..df2c41c 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1977,6 +1977,24 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) bool from_glsl_shader[MESA_SHADER_COMPUTE] = { false }; for (i = 0; i < MESA_SHADER_COMPUTE; i++) { + + /* If stage has not marked as active in the pipeline, it does not + * matter if linking this shader failed. If stage has been set as used + * it should have ValidProgramUse set. + * + * OpenGL ES 3.1 spec (7.3 Program objects): + * + * "If a program object that is active for any shader stage is + * re-linked unsuccessfully, the link status will be set to FALSE, + * but any existing executables and associated state will remain + * part of the current rendering state until a subsequent call to + * UseProgram, UseProgramStages, or BindProgramPipeline removes + * them from use. + */ + if (ctx->_Shader->CurrentProgram[i] && + !ctx->_Shader->CurrentProgram[i]->ValidProgramUse) +continue; + if (!shader_linked_or_absent(ctx, ctx->_Shader->CurrentProgram[i], _glsl_shader[i], where)) return GL_FALSE; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline
Validation checks that we do not have active shader stages that are not used by the pipeline. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction v2: move as generic validation check for both ES and desktop (Timothy) Signed-off-by: Tapani Pälli--- src/mesa/main/context.c | 9 + src/mesa/main/pipelineobj.c | 35 +++ src/mesa/main/pipelineobj.h | 2 ++ 3 files changed, 46 insertions(+) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index be983d4..d73c984 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) } } + /* If SSO in use, validate that all linked program stages are used. */ + if (ctx->Pipeline.Current) { + if (!_mesa_active_program_stages_used(ctx->Pipeline.Current)) { + _mesa_error(ctx, GL_INVALID_OPERATION, + "Shader program active for shader stages that " + "are not used by the pipeline"); + } + } + /* If a program is active and SSO not in use, check if validation of * samplers succeeded for the active program. */ if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx->Pipeline.Current) { diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c index f4ce3ed..9067551 100644 --- a/src/mesa/main/pipelineobj.c +++ b/src/mesa/main/pipelineobj.c @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const struct gl_pipeline_object *pipe) return false; } +bool +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe) +{ + unsigned i; + for (i = 0; i < MESA_SHADER_STAGES; i++) { + if (!pipe->CurrentProgram[i]) + continue; + /* If program has linked but not used by pipeline. */ + if (pipe->CurrentProgram[i]->LinkStatus && + !pipe->CurrentProgram[i]->ValidProgramUse) + return false; + } + return true; +} + extern GLboolean _mesa_validate_program_pipeline(struct gl_context* ctx, struct gl_pipeline_object *pipe) @@ -839,6 +854,26 @@ _mesa_validate_program_pipeline(struct gl_context* ctx, return GL_FALSE; } + /* Section 11.1.3.11 (Validation) of the OpenGL 4.5 and OpenGL ES 3.1 +* spec say: +* +* "An INVALID_OPERATION error is generated by any command that trans- +* fers vertices to the GL or launches compute work if the current set +* of active program objects cannot be executed, for reasons including: +* +* ... +* +* "A program object is active for at least one, but not all of +* the shader stages that were present when the program was linked." +*/ + if (!_mesa_active_program_stages_used(pipe)) { + pipe->InfoLog = + ralloc_strdup(pipe, + "Shader program active for shader stages that " + "are not used by the pipeline"); + return GL_FALSE; + } + /* Section 2.11.11 (Shader Execution), subheading "Validation," of the * OpenGL 4.1 spec says: * diff --git a/src/mesa/main/pipelineobj.h b/src/mesa/main/pipelineobj.h index fbcb765..e56b522 100644 --- a/src/mesa/main/pipelineobj.h +++ b/src/mesa/main/pipelineobj.h @@ -70,6 +70,8 @@ extern GLboolean _mesa_validate_program_pipeline(struct gl_context * ctx, struct gl_pipeline_object *pipe); +extern bool +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe); extern void GLAPIENTRY _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program); -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/3] mesa: add a new flag to track program usage in a pipeline
Patch adds ValidProgramUse flag to gl_shader_program that can be used to track if a program is fully utilized by program pipeline. Flag is set by glUseProgramStages and will be used for pipeline validation by the next patch. Signed-off-by: Tapani PälliSuggested-by: Timothy Arceri --- src/glsl/linker.cpp | 1 + src/mesa/main/mtypes.h | 1 + src/mesa/main/pipelineobj.c | 27 +++ 3 files changed, 29 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index ae628cd..7e0af44 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -4062,6 +4062,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) prog->LinkStatus = true; /* All error paths will set this to false */ prog->Validated = false; prog->_Used = false; + prog->ValidProgramUse = false; prog->ARB_fragment_coord_conventions_enable = false; diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 1eb1e21..770055f 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2746,6 +2746,7 @@ struct gl_shader_program GLboolean Validated; GLboolean _Used;/**< Ever used for drawing? */ GLboolean SamplersValidated; /**< Samplers validated against texture units? */ + GLboolean ValidProgramUse; /**< SSO, is every stage used by the pipeline? */ GLchar *InfoLog; unsigned Version; /**< GLSL version used for linking */ diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c index 5eda4e5..f4ce3ed 100644 --- a/src/mesa/main/pipelineobj.c +++ b/src/mesa/main/pipelineobj.c @@ -219,6 +219,31 @@ _mesa_reference_pipeline_object_(struct gl_context *ctx, } /** + * Raise a flag if shader programs in pipeline contain + * stages that are not used. This is used by validation later. + */ +static void +update_program_usage(struct gl_pipeline_object *pipe) +{ + unsigned i, j; + for (i = 0; i < MESA_SHADER_STAGES; i++) { + struct gl_shader_program *shProg = pipe->CurrentProgram[i]; + if (!shProg) + continue; + shProg->ValidProgramUse = true; + for (j = 0; j < MESA_SHADER_STAGES; j++) { + if (!shProg->_LinkedShaders[j]) +continue; + /* Make sure that any shader stage is also used in the pipeline. */ + if (shProg != pipe->CurrentProgram[j]) { +shProg->ValidProgramUse = false; +return; + } + } + } +} + +/** * Bound program to severals stages of the pipeline */ void GLAPIENTRY @@ -341,6 +366,8 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program) if ((stages & GL_COMPUTE_SHADER_BIT) != 0) _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe); + + update_program_usage(pipe); } /** -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 0/3] more pipeline validation
Hi; I marked this as v2 as this is alternative approach to patch series I sent ealier: http://lists.freedesktop.org/archives/mesa-dev/2015-December/102548.html I took then direction suggested by Timothy which makes validation step simpler. These patches fix 2 ES31-CTS.sepshaderobjs.StateInteraction cases leaving only interface matching part which I will tackle next. Tapani Pälli (3): mesa: add a new flag to track program usage in a pipeline mesa: validate program usage in a pipeline mesa: do not fail when stage not linked is not in use src/glsl/linker.cpp | 1 + src/mesa/main/context.c | 27 src/mesa/main/mtypes.h | 1 + src/mesa/main/pipelineobj.c | 62 + src/mesa/main/pipelineobj.h | 2 ++ 5 files changed, 93 insertions(+) -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Add defines for gather push constants
On 12/07/2015 04:45 PM, Ilia Mirkin wrote: > On Mon, Dec 7, 2015 at 7:39 AM, Abdiel Janulgue >wrote: >> v2 (Francisco Jerez): >>- Rename HSW_GATHER_CONSTANTS_RESERVED to >> HSW_GATHER_POOL_ALLOC_MUST_BE_ONE. >>- Rename BRW_GATHER_* prefix to HSW_GATHER_CONSTANTS_*. >> >> Signed-off-by: Abdiel Janulgue >> --- >> src/mesa/drivers/dri/i965/brw_defines.h | 19 +++ >> 1 file changed, 19 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h >> b/src/mesa/drivers/dri/i965/brw_defines.h >> index a511d5c..f997cf9 100644 >> --- a/src/mesa/drivers/dri/i965/brw_defines.h >> +++ b/src/mesa/drivers/dri/i965/brw_defines.h >> @@ -2565,6 +2565,25 @@ enum brw_wm_barycentric_interp_mode { >> #define _3DSTATE_CONSTANT_HS 0x7819 /* GEN7+ */ >> #define _3DSTATE_CONSTANT_DS 0x781A /* GEN7+ */ >> >> +/* Resource streamer gather constants */ >> +#define HSW_GATHER_POOL_ALLOC_MUST_BE_ONE (3 << 4) /* GEN7.5 only */ >> +#define _3DSTATE_GATHER_POOL_ALLOC0x791A /* GEN7.5+ */ >> +#define _3DSTATE_GATHER_CONSTANT_VS 0x7834 >> +#define _3DSTATE_GATHER_CONSTANT_GS 0x7835 >> +#define _3DSTATE_GATHER_CONSTANT_HS 0x7836 >> +#define _3DSTATE_GATHER_CONSTANT_DS 0x7837 >> +#define _3DSTATE_GATHER_CONSTANT_PS 0x7838 >> + >> +#define HSW_GATHER_CONSTANTS_ENABLE (1 << 11) /* GEN7.5+ */ >> +#define HSW_GATHER_CONSTANTS_BUFFER_VALID_SHIFT 16 >> +#define HSW_GATHER_CONSTANTS_BUFFER_VALID_MASK INTEL_MASK(31, 16) >> +#define HSW_GATHER_CONSTANTS_BINDING_TABLE_BLOCK_SHIFT 12 >> +#define HSW_GATHER_CONSTANTS_BINDING_TABLE_BLOCK_MASK INTEL_MASK(15, 12) >> +#define HSW_GATHER_CONSTANTS_CONST_BUFFER_OFFSET_SHIFT 8 >> +#define HSW_GATHER_CONSTANTS_CONST_BUFFER_OFFSET_MASK INTEL_MASK(15, 8) > > I mentioned this on IRC, but shouldn't this one have been > INTEL_MASK(11, 8)? Seems unlikely that there are overlapping > bitfields... Actually, the CONST_BUFFER_OFFSET field is part of the "gather constant entry" fields which occupies separate dword blocks from the BINDING_TABLE_BLOCK field. -Abdiel > > -ilia > >> +#define HSW_GATHER_CONSTANTS_CHANNEL_MASK_SHIFT 4 >> +#define HSW_GATHER_CONSTANTS_CHANNEL_MASK_MASK INTEL_MASK(7, 4) >> + >> #define _3DSTATE_STREAMOUT0x781e /* GEN7+ */ >> /* DW1 */ >> # define SO_FUNCTION_ENABLE(1 << 31) >> -- >> 1.9.1 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline
On 12/08/2015 01:31 PM, Tapani Pälli wrote: On 12/08/2015 12:57 PM, Timothy Arceri wrote: On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote: Validation checks that we do not have active shader stages that are not used by the pipeline. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction v2: move as generic validation check for both ES and desktop (Timothy) Signed-off-by: Tapani Pälli--- src/mesa/main/context.c | 9 + src/mesa/main/pipelineobj.c | 35 +++ src/mesa/main/pipelineobj.h | 2 ++ 3 files changed, 46 insertions(+) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index be983d4..d73c984 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) } } + /* If SSO in use, validate that all linked program stages are used. */ + if (ctx->Pipeline.Current) { + if (!_mesa_active_program_stages_used(ctx->Pipeline.Current)) { + _mesa_error(ctx, GL_INVALID_OPERATION, + "Shader program active for shader stages that " + "are not used by the pipeline"); You can remove this now. The call to _mesa_validate_program_pipeline() below should cause the error to be thrown for us when it fails. Oops forgot this one. This is necessary to be run as the full validation happens only if pipeline was not validated before. + } + } + /* If a program is active and SSO not in use, check if validation of * samplers succeeded for the active program. */ if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx ->Pipeline.Current) { diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c index f4ce3ed..9067551 100644 --- a/src/mesa/main/pipelineobj.c +++ b/src/mesa/main/pipelineobj.c @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const struct gl_pipeline_object *pipe) return false; } +bool +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe) +{ + unsigned i; + for (i = 0; i < MESA_SHADER_STAGES; i++) { + if (!pipe->CurrentProgram[i]) + continue; + /* If program has linked but not used by pipeline. */ + if (pipe->CurrentProgram[i]->LinkStatus && + !pipe->CurrentProgram[i]->ValidProgramUse) Two things here first if ValidProgramUse is false we need to redo the validation to be sure it really is an invalid use as its possible that the program was relinked since the UseProgramStages call and is now valid. I'm not sure about this. This feels like the ValidProgramUse would be then unnecessary as we would anyway need to iterate through everything (just because of possible relink)? Second are you sure about the LinkStatus check? Won't this cause validation to pass when it should fail if the program was linked successfully but ValidProgramUse is false, and we then do a relink that fails? As the program will stay in the pipeline. IMO we need to check that program has been linked, otherwise it should not be considered active at all. UseProgramStages will need to be called first to have actual value in ValidProgramUse and that cannot be called with program that has not been succesfully linked. Make that three things. It seems there is already code that should be doing this validation already. The first validation test in _mesa_validate_program_pipeline() calls program_stages_all_active() in a loop which should fail for this case. The code pretty much does what I was suggesting only it always does the validation, there is no flag to reduce calls. I guess it really shouldn't matter a huge deal since its only done once when somethings relinked or a sampler changes. Yeah it seems, I did not realize it's trying to test this same thing as it did not catch this but then again I guess it's not called at all during draw time currently. From my quick testing on the CTS test the program in the pipeline only has a single shader when it should have two so the test fails with this existing code. As per my first point I believe we always need to The test in question is "Separable program with both vertex and fragment shaders attached to only one stage". It fails because there is no existing check run at draw time for the case where you enable one stage but supply a program with multiple stages. revalidate here to be sure that it really is an invalid use of the program so what we really need to do is track down why there is only a single shader in the program when it seems there should be two. If we can resolve that we might not need to rework this code unless there really is a performance hit. + return false; + } + return true; +} + extern GLboolean _mesa_validate_program_pipeline(struct gl_context* ctx, struct gl_pipeline_object *pipe) @@ -839,6 +854,26 @@
Re: [Mesa-dev] [PATCH 2/2] r600: Add ARB_copy_image support
For the series: Reviewed-by: Marek OlšákMarek On Mon, Dec 7, 2015 at 2:44 PM, Edward O'Callaghan wrote: > Signed-off-by: Edward O'Callaghan > --- > docs/GL3.txt | 2 +- > src/gallium/drivers/r600/r600_pipe.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/docs/GL3.txt b/docs/GL3.txt > index 50b429c..f564263 100644 > --- a/docs/GL3.txt > +++ b/docs/GL3.txt > @@ -153,7 +153,7 @@ GL 4.3, GLSL 4.30: >GL_ARB_ES3_compatibility DONE (all drivers > that support GLSL 3.30) >GL_ARB_clear_buffer_object DONE (all drivers) >GL_ARB_compute_shaderin progress (jljusten) > - GL_ARB_copy_imageDONE (i965, nv50, > nvc0, radeonsi) > + GL_ARB_copy_imageDONE (i965, nv50, > nvc0, r600, radeonsi) >GL_KHR_debug DONE (all drivers) >GL_ARB_explicit_uniform_location DONE (all drivers > that support GLSL) >GL_ARB_fragment_layer_viewport DONE (i965, nv50, > nvc0, r600, radeonsi, llvmpipe) > diff --git a/src/gallium/drivers/r600/r600_pipe.c > b/src/gallium/drivers/r600/r600_pipe.c > index eb2ec73..ba5d9be 100644 > --- a/src/gallium/drivers/r600/r600_pipe.c > +++ b/src/gallium/drivers/r600/r600_pipe.c > @@ -277,6 +277,7 @@ static int r600_get_param(struct pipe_screen* pscreen, > enum pipe_cap param) > case PIPE_CAP_TEXTURE_FLOAT_LINEAR: > case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR: > case PIPE_CAP_TGSI_TXQS: > + case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS: > return 1; > > case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: > @@ -346,7 +347,6 @@ static int r600_get_param(struct pipe_screen* pscreen, > enum pipe_cap param) > case PIPE_CAP_DEPTH_BOUNDS_TEST: > case PIPE_CAP_FORCE_PERSAMPLE_INTERP: > case PIPE_CAP_SHAREABLE_SHADERS: > - case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS: > case PIPE_CAP_CLEAR_TEXTURE: > return 0; > > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 93261] GL/DRI3 over ssh broken
https://bugs.freedesktop.org/show_bug.cgi?id=93261 --- Comment #1 from Martin Peres--- I can reproduce the issue, I will investigate this issue this week! -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] gallium: add GREMEDY_string_marker
On 7 December 2015 at 18:55, Rob Clarkwrote: > On Mon, Dec 7, 2015 at 1:42 PM, Emil Velikov wrote: >> On 7 December 2015 at 16:45, Rob Clark wrote: >>> From: Rob Clark >>> >>> Only exposed w/ ST_DEBUG=gremedy. >>> >> Perhaps a bit of a silly question - why expose the extension only for >> debug mesa builds ? Afaict there isn't any noticeable performance >> implication (from infrastructural POV) is there ? >> >> If driver X performance goes down the drain, just have them >> conditionally return 0/1 for PIPE_CAP_STRING_MARKER ? > > It was suggested, iirc by Ian, on the basis that apps might do > something non-performant if they see the extension (since the original > use-case was that the extension is injected by the gremedy debugger). > Hmm fair enough. Can you add some/most of that in the commit message, please ? Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline
On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote: > Validation checks that we do not have active shader stages that are > not > used by the pipeline. > > This fixes a subtest in following CTS test: > ES31-CTS.sepshaderobjs.StateInteraction > > v2: move as generic validation check for both ES and desktop > (Timothy) > > Signed-off-by: Tapani Pälli> --- > src/mesa/main/context.c | 9 + > src/mesa/main/pipelineobj.c | 35 +++ > src/mesa/main/pipelineobj.h | 2 ++ > 3 files changed, 46 insertions(+) > > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c > index be983d4..d73c984 100644 > --- a/src/mesa/main/context.c > +++ b/src/mesa/main/context.c > @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx, > const char *where) >} > } > > + /* If SSO in use, validate that all linked program stages are > used. */ > + if (ctx->Pipeline.Current) { > + if (!_mesa_active_program_stages_used(ctx->Pipeline.Current)) > { > + _mesa_error(ctx, GL_INVALID_OPERATION, > + "Shader program active for shader stages that " > + "are not used by the pipeline"); You can remove this now. The call to _mesa_validate_program_pipeline() below should cause the error to be thrown for us when it fails. > + } > + } > + > /* If a program is active and SSO not in use, check if validation > of > * samplers succeeded for the active program. */ > if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx > ->Pipeline.Current) { > diff --git a/src/mesa/main/pipelineobj.c > b/src/mesa/main/pipelineobj.c > index f4ce3ed..9067551 100644 > --- a/src/mesa/main/pipelineobj.c > +++ b/src/mesa/main/pipelineobj.c > @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const > struct gl_pipeline_object *pipe) > return false; > } > > +bool > +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe) > +{ > + unsigned i; > + for (i = 0; i < MESA_SHADER_STAGES; i++) { > + if (!pipe->CurrentProgram[i]) > + continue; > + /* If program has linked but not used by pipeline. */ > + if (pipe->CurrentProgram[i]->LinkStatus && > + !pipe->CurrentProgram[i]->ValidProgramUse) Two things here first if ValidProgramUse is false we need to redo the validation to be sure it really is an invalid use as its possible that the program was relinked since the UseProgramStages call and is now valid. Second are you sure about the LinkStatus check? Won't this cause validation to pass when it should fail if the program was linked successfully but ValidProgramUse is false, and we then do a relink that fails? As the program will stay in the pipeline. Make that three things. It seems there is already code that should be doing this validation already. The first validation test in _mesa_validate_program_pipeline() calls program_stages_all_active() in a loop which should fail for this case. The code pretty much does what I was suggesting only it always does the validation, there is no flag to reduce calls. I guess it really shouldn't matter a huge deal since its only done once when somethings relinked or a sampler changes. From my quick testing on the CTS test the program in the pipeline only has a single shader when it should have two so the test fails with this existing code. As per my first point I believe we always need to revalidate here to be sure that it really is an invalid use of the program so what we really need to do is track down why there is only a single shader in the program when it seems there should be two. If we can resolve that we might not need to rework this code unless there really is a performance hit. > + return false; > + } > + return true; > +} > + > extern GLboolean > _mesa_validate_program_pipeline(struct gl_context* ctx, > struct gl_pipeline_object *pipe) > @@ -839,6 +854,26 @@ _mesa_validate_program_pipeline(struct > gl_context* ctx, >return GL_FALSE; > } > > + /* Section 11.1.3.11 (Validation) of the OpenGL 4.5 and OpenGL ES > 3.1 > +* spec say: > +* > +* "An INVALID_OPERATION error is generated by any command > that trans- > +* fers vertices to the GL or launches compute work if the > current set > +* of active program objects cannot be executed, for reasons > including: > +* > +* ... > +* > +* "A program object is active for at least one, but not all > of > +* the shader stages that were present when the program was > linked." > +*/ > + if (!_mesa_active_program_stages_used(pipe)) { > + pipe->InfoLog = > + ralloc_strdup(pipe, > + "Shader program active for shader stages that > " > + "are not used by the pipeline"); > + return GL_FALSE; > + } > + > /* Section 2.11.11 (Shader Execution), subheading
Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline
On 12/08/2015 12:57 PM, Timothy Arceri wrote: On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote: Validation checks that we do not have active shader stages that are not used by the pipeline. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction v2: move as generic validation check for both ES and desktop (Timothy) Signed-off-by: Tapani Pälli--- src/mesa/main/context.c | 9 + src/mesa/main/pipelineobj.c | 35 +++ src/mesa/main/pipelineobj.h | 2 ++ 3 files changed, 46 insertions(+) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index be983d4..d73c984 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) } } + /* If SSO in use, validate that all linked program stages are used. */ + if (ctx->Pipeline.Current) { + if (!_mesa_active_program_stages_used(ctx->Pipeline.Current)) { + _mesa_error(ctx, GL_INVALID_OPERATION, + "Shader program active for shader stages that " + "are not used by the pipeline"); You can remove this now. The call to _mesa_validate_program_pipeline() below should cause the error to be thrown for us when it fails. + } + } + /* If a program is active and SSO not in use, check if validation of * samplers succeeded for the active program. */ if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx ->Pipeline.Current) { diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c index f4ce3ed..9067551 100644 --- a/src/mesa/main/pipelineobj.c +++ b/src/mesa/main/pipelineobj.c @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const struct gl_pipeline_object *pipe) return false; } +bool +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe) +{ + unsigned i; + for (i = 0; i < MESA_SHADER_STAGES; i++) { + if (!pipe->CurrentProgram[i]) + continue; + /* If program has linked but not used by pipeline. */ + if (pipe->CurrentProgram[i]->LinkStatus && + !pipe->CurrentProgram[i]->ValidProgramUse) Two things here first if ValidProgramUse is false we need to redo the validation to be sure it really is an invalid use as its possible that the program was relinked since the UseProgramStages call and is now valid. I'm not sure about this. This feels like the ValidProgramUse would be then unnecessary as we would anyway need to iterate through everything (just because of possible relink)? Second are you sure about the LinkStatus check? Won't this cause validation to pass when it should fail if the program was linked successfully but ValidProgramUse is false, and we then do a relink that fails? As the program will stay in the pipeline. IMO we need to check that program has been linked, otherwise it should not be considered active at all. UseProgramStages will need to be called first to have actual value in ValidProgramUse and that cannot be called with program that has not been succesfully linked. Make that three things. It seems there is already code that should be doing this validation already. The first validation test in _mesa_validate_program_pipeline() calls program_stages_all_active() in a loop which should fail for this case. The code pretty much does what I was suggesting only it always does the validation, there is no flag to reduce calls. I guess it really shouldn't matter a huge deal since its only done once when somethings relinked or a sampler changes. Yeah it seems, I did not realize it's trying to test this same thing as it did not catch this but then again I guess it's not called at all during draw time currently. From my quick testing on the CTS test the program in the pipeline only has a single shader when it should have two so the test fails with this existing code. As per my first point I believe we always need to The test in question is "Separable program with both vertex and fragment shaders attached to only one stage". It fails because there is no existing check run at draw time for the case where you enable one stage but supply a program with multiple stages. revalidate here to be sure that it really is an invalid use of the program so what we really need to do is track down why there is only a single shader in the program when it seems there should be two. If we can resolve that we might not need to rework this code unless there really is a performance hit. + return false; + } + return true; +} + extern GLboolean _mesa_validate_program_pipeline(struct gl_context* ctx, struct gl_pipeline_object *pipe) @@ -839,6 +854,26 @@ _mesa_validate_program_pipeline(struct gl_context* ctx, return GL_FALSE; } + /* Section 11.1.3.11 (Validation) of the OpenGL 4.5 and OpenGL ES 3.1 +* spec
[Mesa-dev] [PATCH] mesa: remove validation of shaders that should be done elsewhere
In core profile even if re-linking fails rendering shouldn't fail as the previous succesfully linked program will still be available. It also shouldn't be possible to have an unlinked program as part of the current rendering state. This fixes a subtest in: ES31-CTS.sepshaderobjs.StateInteraction This change should improve performance on CPU limited benchmarks as noted in commit d6c6b186cf308f. From Section 7.3 (Program Objects) of the OpenGL 4.5 spec: "If a program object that is active for any shader stage is re-linked unsuccessfully, the link status will be set to FALSE, but any existing executables and associated state will remain part of the current rendering state until a subsequent call to UseProgram, UseProgramStages, or BindProgramPipeline removes them from use. If such a program is attached to any program pipeline object, the existing executables and associated state will remain part of the program pipeline object until a subsequent call to UseProgramStages removes them from use. An unsuccessfully linked program may not be made part of the current rendering state by UseProgram or added to program pipeline objects by UseProgramStages until it is successfully re-linked." "void UseProgram(uint program); ... An INVALID_OPERATION error is generated if program has not been linked, or was last linked unsuccessfully. The current rendering state is not modified." Cc: Tapani PälliCc: Ian Romanick Cc: Brian Paul --- src/mesa/main/context.c | 23 +++ 1 file changed, 3 insertions(+), 20 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index be983d4..2340c11 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1930,6 +1930,7 @@ _mesa_check_blend_func_error(struct gl_context *ctx) return GL_TRUE; } + static bool shader_linked_or_absent(struct gl_context *ctx, const struct gl_shader_program *shProg, @@ -1956,6 +1957,7 @@ shader_linked_or_absent(struct gl_context *ctx, return true; } + /** * Prior to drawing anything with glBegin, glDrawArrays, etc. this function * is called to see if it's valid to render. This involves checking that @@ -1967,33 +1969,14 @@ shader_linked_or_absent(struct gl_context *ctx, GLboolean _mesa_valid_to_render(struct gl_context *ctx, const char *where) { - unsigned i; - /* This depends on having up to date derived state (shaders) */ if (ctx->NewState) _mesa_update_state(ctx); - if (ctx->API == API_OPENGL_CORE || ctx->API == API_OPENGLES2) { - bool from_glsl_shader[MESA_SHADER_COMPUTE] = { false }; - - for (i = 0; i < MESA_SHADER_COMPUTE; i++) { - if (!shader_linked_or_absent(ctx, ctx->_Shader->CurrentProgram[i], - _glsl_shader[i], where)) -return GL_FALSE; - } - - /* In OpenGL Core Profile and OpenGL ES 2.0 / 3.0, there are no assembly - * shaders. Don't check state related to those. - */ - } else { + if (ctx->API == API_OPENGL_COMPAT) { bool has_vertex_shader = false; bool has_fragment_shader = false; - /* In OpenGL Compatibility Profile, there is only vertex shader and - * fragment shader. We take this path also for API_OPENGLES because - * optimizing that path would make the other (more common) paths - * slightly slower. - */ if (!shader_linked_or_absent(ctx, ctx->_Shader->CurrentProgram[MESA_SHADER_VERTEX], _vertex_shader, where)) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] scons: support for llvm 3.7.
On 07/12/15 16:13, olivier.pena...@gmail.com wrote: From: Olivier Pena--- scons/llvm.py | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/scons/llvm.py b/scons/llvm.py index c59b8cb..1fc8a3f 100644 --- a/scons/llvm.py +++ b/scons/llvm.py @@ -106,7 +106,19 @@ def generate(env): ]) env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')]) # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter` -if llvm_version >= distutils.version.LooseVersion('3.6'): +if llvm_version >= distutils.version.LooseVersion('3.7'): +env.Prepend(LIBS = [ +'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser', +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter', +'LLVMCodeGen', 'LLVMScalarOpts', 'LLVMProfileData', +'LLVMInstCombine', 'LLVMInstrumentation', 'LLVMTransformUtils', 'LLVMipa', +'LLVMAnalysis', 'LLVMX86Desc', 'LLVMMCDisassembler', +'LLVMX86Info', 'LLVMX86AsmPrinter', 'LLVMX86Utils', +'LLVMMCJIT', 'LLVMTarget', 'LLVMExecutionEngine', +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser', +'LLVMBitReader', 'LLVMMC', 'LLVMCore', 'LLVMSupport' +]) +elif llvm_version >= distutils.version.LooseVersion('3.6'): env.Prepend(LIBS = [ 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser', 'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter', Pushed. Thanks. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline
On 12/08/2015 03:55 PM, Timothy Arceri wrote: On Tue, 2015-12-08 at 13:34 +0200, Tapani Pälli wrote: On 12/08/2015 01:31 PM, Tapani Pälli wrote: On 12/08/2015 12:57 PM, Timothy Arceri wrote: On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote: Validation checks that we do not have active shader stages that are not used by the pipeline. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction v2: move as generic validation check for both ES and desktop (Timothy) Signed-off-by: Tapani Pälli--- src/mesa/main/context.c | 9 + src/mesa/main/pipelineobj.c | 35 +++ src/mesa/main/pipelineobj.h | 2 ++ 3 files changed, 46 insertions(+) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index be983d4..d73c984 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx, const char *where) } } + /* If SSO in use, validate that all linked program stages are used. */ + if (ctx->Pipeline.Current) { + if (!_mesa_active_program_stages_used(ctx ->Pipeline.Current)) { + _mesa_error(ctx, GL_INVALID_OPERATION, + "Shader program active for shader stages that " + "are not used by the pipeline"); You can remove this now. The call to _mesa_validate_program_pipeline() below should cause the error to be thrown for us when it fails. Oops forgot this one. This is necessary to be run as the full validation happens only if pipeline was not validated before. I see. Well in that case shouldn't we set the validated flag to false when UseProgramStages is set? Since the pipeline could now be invalid for this rule and for: "One program object is active for at least two shader stages and a second program is active for a shader stage between two stages for which the first program was active." Surely it would be less of a preformance hit to re-run full validation when UseProgramStages is called on an than to re-validate these two rules everytime? Yeah I guess so, overall I was hoping to prevent full validation runs and run just 'what is absolutely required' both in UseProgramStages and in draw time but I agree it would be good to get functionality in place before trying to optimize too much. It looks like interface matching will anyway be the hottest spot to optimize, not these small loops. + } + } + /* If a program is active and SSO not in use, check if validation of * samplers succeeded for the active program. */ if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx ->Pipeline.Current) { diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c index f4ce3ed..9067551 100644 --- a/src/mesa/main/pipelineobj.c +++ b/src/mesa/main/pipelineobj.c @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const struct gl_pipeline_object *pipe) return false; } +bool +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe) +{ + unsigned i; + for (i = 0; i < MESA_SHADER_STAGES; i++) { + if (!pipe->CurrentProgram[i]) + continue; + /* If program has linked but not used by pipeline. */ + if (pipe->CurrentProgram[i]->LinkStatus && + !pipe->CurrentProgram[i]->ValidProgramUse) Two things here first if ValidProgramUse is false we need to redo the validation to be sure it really is an invalid use as its possible that the program was relinked since the UseProgramStages call and is now valid. I'm not sure about this. This feels like the ValidProgramUse would be then unnecessary as we would anyway need to iterate through everything (just because of possible relink)? Right I'm leaning towards removing ValidProgramUse and setting the validated pipeline flag to false when UseProgramStages is called then revalidating everything. As you said with regards to the sampler checks we can break things up later and introduce flags if perf testing shows it to be useful. Yeah, this seems the simplest way to get things working. Second are you sure about the LinkStatus check? Won't this cause validation to pass when it should fail if the program was linked successfully but ValidProgramUse is false, and we then do a relink that fails? As the program will stay in the pipeline. IMO we need to check that program has been linked, otherwise it should not be considered active at all. UseProgramStages will need to be called first to have actual value in ValidProgramUse and that cannot be called with program that has not been succesfully linked. But the LinkStatus flag gets set to false for a program that has already been used with UseProgramStage but is then relinked unsuccesfully which is what patch 3 is about. I've also sent an alternate fix for patch 3 as I spent a bunch of time testing and digging into it. It was just easier to send it out since it was done. OK, cool. Make
Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output
On 08/12/15 01:42, Brian Paul wrote: When a buffer is created with GL_STATIC_DRAW, its contents should not be changed frequently. But that's exactly what one application I'm debugging does. This patch adds code to try to detect inefficient buffer use in a couple places. The GL_ARB_debug_output mechanism is used to report the issue. NVIDIA's driver detects these sort of things too. Other types of inefficient buffer use could also be detected in the future. --- src/mesa/main/bufferobj.c | 55 +++ src/mesa/main/mtypes.h| 4 2 files changed, 59 insertions(+) diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c index f985982..6bc1b5e 100644 --- a/src/mesa/main/bufferobj.c +++ b/src/mesa/main/bufferobj.c @@ -51,6 +51,34 @@ /** + * We count the number of buffer modification calls to check for + * inefficient buffer use. This is the number of such calls before we + * issue a warning. + */ +#define BUFFER_WARNING_CALL_COUNT 4 + + +/** + * Helper to warn of possible performance issues, such as frequently + * updating a buffer created with GL_STATIC_DRAW. + */ +static void +buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...) +{ + va_list args; + GLuint msg_id = 0; + + va_start(args, fmt); + _mesa_gl_vdebug(ctx, _id, + MESA_DEBUG_SOURCE_API, + MESA_DEBUG_TYPE_PERFORMANCE, + MESA_DEBUG_SEVERITY_MEDIUM, + fmt, args); + va_end(args); +} + + +/** * Used as a placeholder for buffer objects between glGenBuffers() and * glBindBuffer() so that glIsBuffer() can work correctly. */ @@ -1677,6 +1705,21 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct gl_buffer_object *bufObj, if (size == 0) return; + bufObj->NumSubDataCalls++; + + if ((bufObj->Usage == GL_STATIC_DRAW || +bufObj->Usage == GL_STATIC_COPY) && + bufObj->NumSubDataCalls >= BUFFER_WARNING_CALL_COUNT) { + /* If the application declared the buffer as static draw/copy or stream + * draw, it should not be frequently modified with glBufferSubData. + */ + buffer_usage_warning(ctx, + "using %s(buffer %u, offset %u, size %u) to " + "update a %s buffer", + func, bufObj->Name, offset, size, + _mesa_enum_to_string(bufObj->Usage)); + } + bufObj->Written = GL_TRUE; assert(ctx->Driver.BufferSubData); @@ -2384,6 +2427,18 @@ _mesa_map_buffer_range(struct gl_context *ctx, return NULL; } + if (access & GL_MAP_WRITE_BIT) { + bufObj->NumMapBufferWriteCalls++; + if ((bufObj->Usage == GL_STATIC_DRAW || + bufObj->Usage == GL_STATIC_COPY) && + bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) { + buffer_usage_warning(ctx, + "using %s(buffer %u, offset %u, length %u) to " + "update a %s buffer", + func, bufObj->Name, offset, length, + _mesa_enum_to_string(bufObj->Usage)); + } + } assert(ctx->Driver.MapBufferRange); map = ctx->Driver.MapBufferRange(ctx, offset, length, access, bufObj, diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 1eb1e21..de54169 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -1275,6 +1275,10 @@ struct gl_buffer_object GLboolean Immutable; /**< GL_ARB_buffer_storage */ gl_buffer_usage UsageHistory; /**< How has this buffer been used so far? */ + /** Counters used for buffer usage warnings */ + GLuint NumSubDataCalls; + GLuint NumMapBufferWriteCalls; + struct gl_buffer_mapping Mappings[MAP_COUNT]; }; LGTM. Reviewed-by: Jose Fonseca___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline
On Tue, 2015-12-08 at 13:34 +0200, Tapani Pälli wrote: > On 12/08/2015 01:31 PM, Tapani Pälli wrote: > > On 12/08/2015 12:57 PM, Timothy Arceri wrote: > > > On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote: > > > > Validation checks that we do not have active shader stages that > > > > are > > > > not > > > > used by the pipeline. > > > > > > > > This fixes a subtest in following CTS test: > > > > ES31-CTS.sepshaderobjs.StateInteraction > > > > > > > > v2: move as generic validation check for both ES and desktop > > > > (Timothy) > > > > > > > > Signed-off-by: Tapani Pälli> > > > --- > > > > src/mesa/main/context.c | 9 + > > > > src/mesa/main/pipelineobj.c | 35 > > > > +++ > > > > src/mesa/main/pipelineobj.h | 2 ++ > > > > 3 files changed, 46 insertions(+) > > > > > > > > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c > > > > index be983d4..d73c984 100644 > > > > --- a/src/mesa/main/context.c > > > > +++ b/src/mesa/main/context.c > > > > @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context > > > > *ctx, > > > > const char *where) > > > > } > > > > } > > > > + /* If SSO in use, validate that all linked program stages > > > > are > > > > used. */ > > > > + if (ctx->Pipeline.Current) { > > > > + if (!_mesa_active_program_stages_used(ctx > > > > ->Pipeline.Current)) > > > > { > > > > + _mesa_error(ctx, GL_INVALID_OPERATION, > > > > + "Shader program active for shader stages > > > > that " > > > > + "are not used by the pipeline"); > > > You can remove this now. The call to > > > _mesa_validate_program_pipeline() > > > below should cause the error to be thrown for us when it fails. > > > > > Oops forgot this one. This is necessary to be run as the full > validation > happens only if pipeline was not validated before. I see. Well in that case shouldn't we set the validated flag to false when UseProgramStages is set? Since the pipeline could now be invalid for this rule and for: "One program object is active for at least two shader stages and a second program is active for a shader stage between two stages for which the first program was active." Surely it would be less of a preformance hit to re-run full validation when UseProgramStages is called on an than to re-validate these two rules everytime? > > > > > + } > > > > + } > > > > + > > > > /* If a program is active and SSO not in use, check if > > > > validation > > > > of > > > > * samplers succeeded for the active program. */ > > > > if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx > > > > ->Pipeline.Current) { > > > > diff --git a/src/mesa/main/pipelineobj.c > > > > b/src/mesa/main/pipelineobj.c > > > > index f4ce3ed..9067551 100644 > > > > --- a/src/mesa/main/pipelineobj.c > > > > +++ b/src/mesa/main/pipelineobj.c > > > > @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const > > > > struct gl_pipeline_object *pipe) > > > > return false; > > > > } > > > > +bool > > > > +_mesa_active_program_stages_used(struct gl_pipeline_object > > > > *pipe) > > > > +{ > > > > + unsigned i; > > > > + for (i = 0; i < MESA_SHADER_STAGES; i++) { > > > > + if (!pipe->CurrentProgram[i]) > > > > + continue; > > > > + /* If program has linked but not used by pipeline. */ > > > > + if (pipe->CurrentProgram[i]->LinkStatus && > > > > + !pipe->CurrentProgram[i]->ValidProgramUse) > > > Two things here first if ValidProgramUse is false we need to redo > > > the > > > validation to be sure it really is an invalid use as its possible > > > that > > > the program was relinked since the UseProgramStages call and is > > > now > > > valid. > > I'm not sure about this. This feels like the ValidProgramUse would > > be > > then unnecessary as we would anyway need to iterate through > > everything > > (just because of possible relink)? Right I'm leaning towards removing ValidProgramUse and setting the validated pipeline flag to false when UseProgramStages is called then revalidating everything. As you said with regards to the sampler checks we can break things up later and introduce flags if perf testing shows it to be useful. > > > > > Second are you sure about the LinkStatus check? Won't this cause > > > validation to pass when it should fail if the program was linked > > > successfully but ValidProgramUse is false, and we then do a > > > relink that > > > fails? As the program will stay in the pipeline. > > > > IMO we need to check that program has been linked, otherwise it > > should > > not be considered active at all. UseProgramStages will need to be > > called first to have actual value in ValidProgramUse and that > > cannot > > be called with program that has not been succesfully linked. But the LinkStatus flag gets set to false for a program that has already been used with UseProgramStage but is
Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro
On 4 December 2015 at 17:37,wrote: > On 2015-12-05 01:33, Emil Velikov wrote: >> >> You mentioned that you've done these with a Coccinelle schematic patch >> (script) - mind if we get that (actually start collecting) in tree ... >> perhaps in bin/ or bin/cocci/ ? > > > Definitely! This is in fact part of my proposed plan after I get some of > the major churn out the way and see what passes review and hence which > .cocci scripts are suitable for mesa. > Sweet. I guess I missed that part in your summary email. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: invalidate pipeline status after glUseProgramStages
This will cause validation to run during next draw, this is done because possible changes in used stages and programs can cause invalid pipeline state. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction Signed-off-by: Tapani Pälli--- src/mesa/main/pipelineobj.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c index 5eda4e5..f2a872d 100644 --- a/src/mesa/main/pipelineobj.c +++ b/src/mesa/main/pipelineobj.c @@ -341,6 +341,8 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program) if ((stages & GL_COMPUTE_SHADER_BIT) != 0) _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe); + + pipe->Validated = false; } /** -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Fix crash when calling glViewport with no surface bound
If EGL_KHR_surfaceless_context is used then glViewport can be called with NULL for the draw and read surfaces. This was previously causing a crash because the i965 driver tries to use this point to invalidate the surfaces and it was derferencing the NULL pointer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257 Cc: Nanley CheryCc: "11.1" --- I've also posted a Piglit test for this here: http://patchwork.freedesktop.org/patch/67356/ src/mesa/drivers/dri/i965/brw_context.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 7d7643c..5374bad 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -159,8 +159,10 @@ intel_viewport(struct gl_context *ctx) __DRIcontext *driContext = brw->driContext; if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) { - dri2InvalidateDrawable(driContext->driDrawablePriv); - dri2InvalidateDrawable(driContext->driReadablePriv); + if (driContext->driDrawablePriv) + dri2InvalidateDrawable(driContext->driDrawablePriv); + if (driContext->driReadablePriv) + dri2InvalidateDrawable(driContext->driReadablePriv); } } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v4 15/44] i965: Work around L3 state leaks during context switches.
This is going to require some rather intrusive kernel changes to fix properly, in the meantime (and forever on at least pre-v4.1 kernels) we'll have to restore the hardware defaults at the end of every batch in which the L3 configuration was changed to avoid interfering with the DDX and GL clients that use an older non-L3-aware version of Mesa. Reviewed-by: Samuel Iglesias GonsálvezReviewed-by: Kristian Høgsberg v4: Optimize look-up of the default configuration by assuming it's the first entry of the L3 config array in order to avoid an FPS regression in GpuTest Triangle and SynMark OglBatch2-7 on most affected platforms. --- src/mesa/drivers/dri/i965/brw_state.h | 4 +++ src/mesa/drivers/dri/i965/gen7_l3_state.c | 51 +++ src/mesa/drivers/dri/i965/intel_batchbuffer.c | 7 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 6 +++- 4 files changed, 67 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h index 49f301a..b7c0039 100644 --- a/src/mesa/drivers/dri/i965/brw_state.h +++ b/src/mesa/drivers/dri/i965/brw_state.h @@ -380,6 +380,10 @@ void gen7_update_binding_table_from_array(struct brw_context *brw, void gen7_disable_hw_binding_tables(struct brw_context *brw); void gen7_reset_hw_bt_pool_offsets(struct brw_context *brw); +/* gen7_l3_state.c */ +void +gen7_restore_default_l3_config(struct brw_context *brw); + #ifdef __cplusplus } #endif diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c index 7956935..7fa7336 100644 --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c @@ -520,3 +520,54 @@ const struct brw_tracked_state gen7_l3_state = { }, .emit = emit_l3_state }; + +/** + * Hack to restore the default L3 configuration. + * + * This will be called at the end of every batch in order to reset the L3 + * configuration to the default values for the time being until the kernel is + * fixed. Until kernel commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b + * (included in v4.1) we would set the MI_RESTORE_INHIBIT bit when submitting + * batch buffers for the default context used by the DDX, which meant that any + * context state changed by the GL would leak into the DDX, the assumption + * being that the DDX would initialize any state it cares about manually. The + * DDX is however not careful enough to program an L3 configuration + * explicitly, and it makes assumptions about it (URB size) which won't hold + * and cause it to misrender if we let our L3 set-up to leak into the DDX. + * + * Since v4.1 of the Linux kernel the default context is saved and restored + * normally, so it's far less likely for our L3 programming to interfere with + * other contexts -- In fact restoring the default L3 configuration at the end + * of the batch will be redundant most of the time. A kind of state leak is + * still possible though if the context making assumptions about L3 state is + * created immediately after our context was active (e.g. without the DDX + * default context being scheduled in between) because at present the DRM + * doesn't fully initialize the contents of newly created contexts and instead + * sets the MI_RESTORE_INHIBIT flag causing it to inherit the state from the + * last active context. + * + * It's possible to realize such a scenario if, say, an X server (or a GL + * application using an outdated non-L3-aware Mesa version) is started while + * another GL application is running and happens to have modified the L3 + * configuration, or if no X server is running at all and a GL application + * using a non-L3-aware Mesa version is started after another GL application + * ran and modified the L3 configuration -- The latter situation can actually + * be reproduced easily on IVB in our CI system. + */ +void +gen7_restore_default_l3_config(struct brw_context *brw) +{ + const struct brw_device_info *devinfo = brw->intelScreen->devinfo; + /* For efficiency assume that the first entry of the array matches the +* default configuration. +*/ + const struct brw_l3_config *const cfg = get_l3_configs(devinfo); + assert(cfg == get_l3_config(devinfo, + get_default_l3_weights(devinfo, false, false))); + + if (cfg != brw->l3.config && brw->can_do_pipelined_register_writes) { + setup_l3_config(brw, cfg); + update_urb_size(brw, cfg); + brw->l3.config = cfg; + } +} diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 0363bd3..f778074 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -208,6 +208,13 @@ brw_finish_batch(struct brw_context *brw) brw_emit_query_end(brw); if (brw->batch.ring == RENDER_RING) { + /* Work around L3 state leaks into
Re: [Mesa-dev] [PATCH 01/10] gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager
On 06.12.2015 19:00, Marek Olšák wrote: From: Marek OlšákThis simplified (basically duplicated) version of pb_cache_manager will allow removing some ugly hacks from radeon and amdgpu winsyses and flatten simplify their design. The difference is that winsyses must manually add buffers to the cache in "destroy" functions and the cache doesn't know about the buffers before that. The integration is therefore trivial and the impact on the winsys design is negligible. --- src/gallium/auxiliary/Makefile.sources | 1 + src/gallium/auxiliary/pipebuffer/pb_cache.c | 286 src/gallium/auxiliary/pipebuffer/pb_cache.h | 74 +++ 3 files changed, 361 insertions(+) create mode 100644 src/gallium/auxiliary/pipebuffer/pb_cache.c create mode 100644 src/gallium/auxiliary/pipebuffer/pb_cache.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index 6160192..817308d 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -93,6 +93,7 @@ C_SOURCES := \ pipebuffer/pb_bufmgr_ondemand.c \ pipebuffer/pb_bufmgr_pool.c \ pipebuffer/pb_bufmgr_slab.c \ + pipebuffer/pb_cache.c \ I believe pb_cache.h needs to be added as well. pipebuffer/pb_validate.c \ pipebuffer/pb_validate.h \ postprocess/filters.h \ diff --git a/src/gallium/auxiliary/pipebuffer/pb_cache.c b/src/gallium/auxiliary/pipebuffer/pb_cache.c new file mode 100644 index 000..45f600d --- /dev/null +++ b/src/gallium/auxiliary/pipebuffer/pb_cache.c ... +/** + * \return 1 if compatible and can be reclaimed + * 0 if incompatible + *-1 if compatible and can't be reclaimed + */ +static int +pb_cache_is_buffer_compat(struct pb_cache_entry *entry, + pb_size size, unsigned alignment, unsigned usage) +{ + struct pb_buffer *buf = entry->buffer; + + if (usage & entry->mgr->bypass_usage) + return 0; It should be possible to move this test to the top of pb_cache_reclaim_buffer, right? + if (buf->size < size) + return 0; + + /* be lenient with size */ + if (buf->size > (unsigned) (entry->mgr->size_factor * size)) + return 0; + + if (!pb_check_alignment(alignment, buf->alignment)) + return 0; + + if (!pb_check_usage(usage, buf->usage)) + return 0; + + return entry->mgr->can_reclaim(buf) ? 1 : -1; +} + +/** + * Find a compatible buffer in the cache, return it, and remove it + * from the cache. + */ +struct pb_buffer * +pb_cache_reclaim_buffer(struct pb_cache *mgr, pb_size size, +unsigned alignment, unsigned usage) +{ + struct pb_cache_entry *entry; + struct pb_cache_entry *cur_entry; + struct list_head *cur, *next; + int64_t now; + int ret = 0; + + pipe_mutex_lock(mgr->mutex); + + entry = NULL; + cur = mgr->cache.next; + next = cur->next; + + /* search in the expired buffers, freeing them in the process */ + now = os_time_get(); + while (cur != >cache) { + cur_entry = LIST_ENTRY(struct pb_cache_entry, cur, head); + + if (!entry && (ret = pb_cache_is_buffer_compat(cur_entry, size, + alignment, usage) > 0)) + entry = cur_entry; + else if (os_time_timeout(cur_entry->start, cur_entry->end, now)) + destroy_buffer_locked(cur_entry); + else + /* This buffer (and all hereafter) are still hot in cache */ + break; + + /* the buffer is busy (and probably all remaining ones too) */ + if (ret == -1) + break; + + cur = next; + next = cur->next; + } + + /* keep searching in the hot buffers */ + if (!entry && ret != -1) { + while (cur != >cache) { + cur_entry = LIST_ENTRY(struct pb_cache_entry, cur, head); + ret = pb_cache_is_buffer_compat(cur_entry, size, alignment, usage); + + if (ret > 0) { +entry = cur_entry; +break; + } + if (ret == -1) +break; + /* no need to check the timeout here */ + cur = next; + next = cur->next; + } + } + + /* found a compatible buffer, return it */ + if (entry) { + struct pb_buffer *buf = entry->buffer; + + mgr->cache_size -= buf->size; + LIST_DEL(>head); + --mgr->num_buffers; + pipe_mutex_unlock(mgr->mutex); + /* Increase refcount */ + pipe_reference_init(>reference, 1); + return buf; + } + + pipe_mutex_unlock(mgr->mutex); + return NULL; +} + +/** + * Empty the cache. Useful when there is not enough memory. + */ +void +pb_cache_release_all_buffers(struct pb_cache *mgr) +{ + struct list_head *curr, *next; + struct pb_cache_entry *buf; + + pipe_mutex_lock(mgr->mutex); + curr = mgr->cache.next; + next = curr->next; + while (curr != >cache) { + buf = LIST_ENTRY(struct pb_cache_entry, curr, head); +
Re: [Mesa-dev] [PATCH 09/10] winsys/radeon: use pb_cache instead of pb_cache_manager
On 06.12.2015 19:01, Marek Olšák wrote: From: Marek OlšákThis is a prerequisite for the removal of radeon_winsys_cs_handle. --- src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 212 +++--- src/gallium/winsys/radeon/drm/radeon_drm_bo.h | 14 +- src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 22 +-- src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 4 +- 4 files changed, 74 insertions(+), 178 deletions(-) diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c index 4c38379..9532c77 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c ... static const struct pb_vtbl radeon_bo_vtbl = { -radeon_bo_destroy, -NULL, /* never called */ -NULL, /* never called */ -radeon_bo_validate, -radeon_bo_fence, -radeon_bo_get_base_buffer, +radeon_bo_destroy_or_cache }; I take it the other functions aren't called anymore? Perhaps this patch and #3 could use an explanation to that effect. Nicolai #ifndef RADEON_GEM_GTT_WC @@ -540,40 +490,39 @@ static const struct pb_vtbl radeon_bo_vtbl = { #define RADEON_GEM_NO_CPU_ACCESS (1 << 4) #endif -static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr, -pb_size size, -const struct pb_desc *desc) +static struct radeon_bo *radeon_create_bo(struct radeon_drm_winsys *rws, + unsigned size, unsigned alignment, + unsigned usage, + unsigned initial_domains, + unsigned flags) { -struct radeon_bomgr *mgr = radeon_bomgr(_mgr); -struct radeon_drm_winsys *rws = mgr->rws; struct radeon_bo *bo; struct drm_radeon_gem_create args; -struct radeon_bo_desc *rdesc = (struct radeon_bo_desc*)desc; int r; memset(, 0, sizeof(args)); -assert(rdesc->initial_domains); -assert((rdesc->initial_domains & +assert(initial_domains); +assert((initial_domains & ~(RADEON_GEM_DOMAIN_GTT | RADEON_GEM_DOMAIN_VRAM)) == 0); args.size = size; -args.alignment = desc->alignment; -args.initial_domain = rdesc->initial_domains; +args.alignment = alignment; +args.initial_domain = initial_domains; args.flags = 0; -if (rdesc->flags & RADEON_FLAG_GTT_WC) +if (flags & RADEON_FLAG_GTT_WC) args.flags |= RADEON_GEM_GTT_WC; -if (rdesc->flags & RADEON_FLAG_CPU_ACCESS) +if (flags & RADEON_FLAG_CPU_ACCESS) args.flags |= RADEON_GEM_CPU_ACCESS; -if (rdesc->flags & RADEON_FLAG_NO_CPU_ACCESS) +if (flags & RADEON_FLAG_NO_CPU_ACCESS) args.flags |= RADEON_GEM_NO_CPU_ACCESS; if (drmCommandWriteRead(rws->fd, DRM_RADEON_GEM_CREATE, , sizeof(args))) { fprintf(stderr, "radeon: Failed to allocate a buffer:\n"); fprintf(stderr, "radeon:size : %d bytes\n", size); -fprintf(stderr, "radeon:alignment : %d bytes\n", desc->alignment); +fprintf(stderr, "radeon:alignment : %d bytes\n", alignment); fprintf(stderr, "radeon:domains : %d\n", args.initial_domain); fprintf(stderr, "radeon:flags : %d\n", args.flags); return NULL; @@ -584,20 +533,21 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr, return NULL; pipe_reference_init(>base.reference, 1); -bo->base.alignment = desc->alignment; -bo->base.usage = desc->usage; +bo->base.alignment = alignment; +bo->base.usage = usage; bo->base.size = size; bo->base.vtbl = _bo_vtbl; bo->rws = rws; bo->handle = args.handle; bo->va = 0; -bo->initial_domain = rdesc->initial_domains; +bo->initial_domain = initial_domains; pipe_mutex_init(bo->map_mutex); +pb_cache_init_entry(>bo_cache, >cache_entry, >base); if (rws->info.r600_virtual_address) { struct drm_radeon_gem_va va; -bo->va = radeon_bomgr_find_va(rws, size, desc->alignment); +bo->va = radeon_bomgr_find_va(rws, size, alignment); va.handle = bo->handle; va.vm_id = 0; @@ -610,7 +560,7 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr, if (r && va.operation == RADEON_VA_RESULT_ERROR) { fprintf(stderr, "radeon: Failed to allocate virtual address for buffer:\n"); fprintf(stderr, "radeon:size : %d bytes\n", size); -fprintf(stderr, "radeon:alignment : %d bytes\n", desc->alignment); +fprintf(stderr, "radeon:alignment : %d bytes\n", alignment); fprintf(stderr, "radeon:domains : %d\n", args.initial_domain);
[Mesa-dev] [PATCH 1/2] glsl: do not loose always_active_io when packing varyings
Otherwise packed and inactive varyings get optimized away. This needs to be prevented when using separate shader objects where interface needs to be preserved. Signed-off-by: Tapani Pälli--- src/glsl/lower_packed_varyings.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/lower_packed_varyings.cpp index 037c27d..8d1eb17 100644 --- a/src/glsl/lower_packed_varyings.cpp +++ b/src/glsl/lower_packed_varyings.cpp @@ -622,6 +622,7 @@ lower_packed_varyings_visitor::get_packed_varying_deref( packed_var->data.interpolation = unpacked_var->data.interpolation; packed_var->data.location = location; packed_var->data.precision = unpacked_var->data.precision; + packed_var->data.always_active_io = unpacked_var->data.always_active_io; unpacked_var->insert_before(packed_var); this->packed_varyings[slot] = packed_var; } else { -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] mesa: fix interface matching done in validate_io
Patch makes following changes for interface matching: - do not try to match builtin variables - handle swizzle in input name, as example 'a.z' should match with 'a' - check that amount of inputs and outputs matches These changes make interface matching tests to work in: ES31-CTS.sepshaderobjs.StateInteraction Test does not still pass completely due to errors in rendering output. I believe this is unrelated to interface matching. Signed-off-by: Tapani Pälli--- src/mesa/main/shader_query.cpp | 25 + 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp index 5d15006..8bfbd3f 100644 --- a/src/mesa/main/shader_query.cpp +++ b/src/mesa/main/shader_query.cpp @@ -1372,19 +1372,34 @@ validate_io(const struct gl_shader *input_stage, const struct gl_shader *output_stage, bool isES) { assert(input_stage && output_stage); + unsigned inputs = 0, outputs = 0; /* For each output in a, find input in b and do any required checks. */ foreach_in_list(ir_instruction, out, input_stage->ir) { ir_variable *out_var = out->as_variable(); - if (!out_var || out_var->data.mode != ir_var_shader_out) + if (!out_var || out_var->data.mode != ir_var_shader_out || + is_gl_identifier(out_var->name)) continue; + outputs++; + + inputs = 0; foreach_in_list(ir_instruction, in, output_stage->ir) { ir_variable *in_var = in->as_variable(); - if (!in_var || in_var->data.mode != ir_var_shader_in) + if (!in_var || in_var->data.mode != ir_var_shader_in || + is_gl_identifier(in_var->name)) continue; - if (strcmp(in_var->name, out_var->name) == 0) { + inputs++; + + unsigned len = strlen(in_var->name); + + /* Handle input swizzle in variable name. */ + const char *dot = strchr(in_var->name, '.'); + if (dot) +len = dot - in_var->name; + + if (strncmp(in_var->name, out_var->name, len) == 0) { /* Since we now only validate precision, we can skip this step for * desktop GLSL shaders, there precision qualifier is ignored. * @@ -1407,7 +1422,9 @@ validate_io(const struct gl_shader *input_stage, } } } - return true; + + /* Amount should match. */ + return (inputs == outputs); } /** -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics
On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote: [snip] > @@ -461,8 +455,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, > print_state *state) > } > > nir_foreach_variable(var, var_list) { > - if ((var->data.driver_location == instr->const_index[0]) && > - var->name) { > + if ((var->data.driver_location == instr->const_index[0]) && var->name) > { > fprintf(fp, "\t/* %s */", var->name); > break; >} > This last hunk seems unrelated - perhaps it should be a separate patch? That patch would get a R-b. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.
On Fri, Dec 4, 2015 at 10:51 AM, Neil Robertswrote: > Matt Turner writes: > >> # Written in the form (, ) where is an expression >> # and is either an expression or a value. An expression is >> # defined as a tuple of the form (, , , , ) >> @@ -94,6 +97,8 @@ optimizations = [ >> (('inot', ('ige', a, b)), ('ilt', a, b)), >> (('inot', ('ieq', a, b)), ('ine', a, b)), >> (('inot', ('ine', a, b)), ('ieq', a, b)), >> + (('ieq', 'a@bool', true), a), >> + (('ine', 'a@bool', false), a), > > I think this second one is already in the file on line 187 here: > ># Boolean simplifications >(('ine', 'a@bool', 0), 'a'), >(('ieq', 'a@bool', 0), ('inot', 'a')), > > Maybe you could add the first one near these two? It could be good to > add a line like this as well to complete the set: > >(('ine', 'a@bool', true), ('inot', a)), > > Regards, > - Neil Thanks. Both are good suggestions. I've squashed them into the patch. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 05/12] nir/lower_io: Get rid of load/store_foo_indirect
On Tuesday, December 08, 2015 01:46:22 PM Jason Ekstrand wrote: > diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c > index f64ac69..a2723d5 100644 > --- a/src/glsl/nir/nir_lower_io.c > +++ b/src/glsl/nir/nir_lower_io.c > @@ -333,18 +311,18 @@ nir_lower_io(nir_shader *shader, nir_variable_mode mode, > * Return the indirect source for a load/store indirect intrinsic. This comment could use updating. How about: /** * Return the offset soruce for a load/store intrinsic. */ > */ > nir_src * > -nir_get_io_indirect_src(nir_intrinsic_instr *instr) > +nir_get_io_offset_src(nir_intrinsic_instr *instr) > { > switch (instr->intrinsic) { > - case nir_intrinsic_load_input_indirect: > - case nir_intrinsic_load_output_indirect: > - case nir_intrinsic_load_uniform_indirect: > + case nir_intrinsic_load_input: > + case nir_intrinsic_load_output: > + case nir_intrinsic_load_uniform: >return >src[0]; > - case nir_intrinsic_load_per_vertex_input_indirect: > - case nir_intrinsic_load_per_vertex_output_indirect: > - case nir_intrinsic_store_output_indirect: > + case nir_intrinsic_load_per_vertex_input: > + case nir_intrinsic_load_per_vertex_output: > + case nir_intrinsic_store_output: >return >src[1]; > - case nir_intrinsic_store_per_vertex_output_indirect: > + case nir_intrinsic_store_per_vertex_output: >return >src[2]; > default: >return NULL; signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics
On Tue, Dec 8, 2015 at 4:19 PM, Kenneth Graunkewrote: > On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote: >> +/* src[] = { buffer_index, offset }. No const_index */ >> +LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE) > > I'm pretty sure const_index[0] is a writemask, so this comemnt is wrong. For stores, yes, but not for loads. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics
On Tue, Dec 8, 2015 at 2:47 PM, Kenneth Graunkewrote: > On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote: > [snip] >> @@ -461,8 +455,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, >> print_state *state) >> } >> >> nir_foreach_variable(var, var_list) { >> - if ((var->data.driver_location == instr->const_index[0]) && >> - var->name) { >> + if ((var->data.driver_location == instr->const_index[0]) && >> var->name) { >> fprintf(fp, "\t/* %s */", var->name); >> break; >>} >> > > This last hunk seems unrelated - perhaps it should be a separate patch? > That patch would get a R-b. FIxed locally. Thanks! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.
This format has been added in commit: 28090b30dd6b5977de085f48c620574214b6b4ba But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format. It was causing the screen in Supertuxkart to be darker than expected, see: https://bugs.freedesktop.org/show_bug.cgi?id=92759 Cc: Boyan DingCc: "11.0 11.1" Fixes: 28090b30dd6 "i965: Add XRGB format to intel_screen_make_configs" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759 --- src/mesa/drivers/dri/i965/intel_screen.c | 9 + 1 file changed, 9 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index cc90efe..75d5a65 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen) stencil_bits[2] = 8; num_depth_stencil_bits = 3; } + } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) { + depth_bits[1] = 24; + stencil_bits[1] = 0; } else { depth_bits[1] = 24; stencil_bits[1] = 8; @@ -1261,6 +1264,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen) if (formats[i] == MESA_FORMAT_B5G6R5_UNORM) { depth_bits[0] = 16; stencil_bits[0] = 0; + } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) { + depth_bits[0] = 24; + stencil_bits[0] = 0; } else { depth_bits[0] = 24; stencil_bits[0] = 8; @@ -1301,6 +1307,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen) if (formats[i] == MESA_FORMAT_B5G6R5_UNORM) { depth_bits[1] = 16; stencil_bits[1] = 0; + } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) { + depth_bits[1] = 24; + stencil_bits[1] = 0; } else { depth_bits[1] = 24; stencil_bits[1] = 8; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 06/12] i965/fs: Get rid of load/store_foo_indirect
On Tue, Dec 8, 2015 at 4:44 PM, Kenneth Graunkewrote: > On Tuesday, December 08, 2015 01:46:23 PM Jason Ekstrand wrote: >> --- >> src/mesa/drivers/dri/i965/brw_fs.h | 2 +- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 >> ++- >> src/mesa/drivers/dri/i965/brw_nir.c | 45 ++--- >> 3 files changed, 84 insertions(+), 69 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h >> b/src/mesa/drivers/dri/i965/brw_fs.h >> index bca4589..b55589f 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs.h >> +++ b/src/mesa/drivers/dri/i965/brw_fs.h >> @@ -278,7 +278,7 @@ public: >> unsigned stream_id); >> void emit_gs_thread_end(); >> void emit_gs_input_load(const fs_reg , const nir_src _src, >> - const fs_reg _offset, unsigned >> imm_offset, >> + unsigned base_offset, const nir_src _src, >> unsigned num_components); >> void emit_cs_terminate(); >> fs_reg *emit_cs_local_invocation_id_setup(); >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index 3754622..18c9d65 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -1603,29 +1603,31 @@ fs_visitor::emit_gs_vertex(const nir_src >> _count_nir_src, >> void >> fs_visitor::emit_gs_input_load(const fs_reg , >> const nir_src _src, >> - const fs_reg _offset, >> - unsigned imm_offset, >> + unsigned base_offset, >> + const nir_src _src, >> unsigned num_components) >> { >> struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) >> prog_data; >> >> + nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); >> + nir_const_value *offset_const = nir_src_as_const_value(offset_src); >> + const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; >> + >> /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y], >> * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w]. Only >> * gl_PointSize is available as a GS input, however, so it must be that. >> */ >> - const bool is_point_size = >> - indirect_offset.file == BAD_FILE && imm_offset == 0; >> - >> - nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); >> - const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; >> + const bool is_point_size = (base_offset == 0); >> >> - if (indirect_offset.file == BAD_FILE && vertex_const != NULL && >> - 4 * imm_offset < push_reg_count) { >> - imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count; >> + if (offset_const != NULL && vertex_const != NULL && >> + 4 * offset_const->u[0] < push_reg_count) { > > This needs to be > > 4 * (base_offset + offset_const->u[0]) < push_reg_count > > otherwise we use the push model for inputs that aren't actually pushed, > which breaks around 1500 piglit tests with INTEL_SCALAR_GS=1. > >> + int imm_offset = (base_offset + offset_const->u[0]) * 4 + >> + vertex_const->u[0] * push_reg_count; >>/* This input was pushed into registers. */ >>if (is_point_size) { >> /* gl_PointSize comes in .w */ >> - bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); >> + assert(imm_offset == 0); >> + bld.MOV(dst, fs_reg(ATTR, 3, dst.type)); > > This is wrong. imm_offset contains the offset derived from the vertex > index, and gl_PointSize exists for multiple vertices. So it won't be 0. > > Please just remove this change, putting it back to: > > bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); > > That fixes assertion failures in 3 Piglit tests. Fixed both locally. Thanks for the careful review and for watching out for scalar GS! > With that fixed, patches 1-8 and 11-12 are: > Reviewed-by: Kenneth Graunke Thanks! --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Fix locking of GLsync objects
Hi, I was told that it's easier for people to review my patch if it comes in via email than being stuck in the bug tracker; FWIW, this is for bug 120238. (It's the same patch as is already in the tracker.) /* Steinar */ === From 6e3d1880fa78a3a965cb7eb51ee12b1f785f84bb Mon Sep 17 00:00:00 2001 From: "Steinar H. Gunderson"Date: Tue, 1 Dec 2015 22:05:11 +0100 Subject: [PATCH] Fix locking of GLsync objects. GLsync objects had a race condition when used from multiple threads (which is the main point of the extension, really); it could be validated as a sync object at the beginning of the function, and then deleted by another thread before use, causing crashes. Fix this by changing all casts from GLsync to struct gl_sync_object to a new function _mesa_get_sync() that validates and increases the refcount. In a similar vein, validation itself uses _mesa_set_search(), which requires synchronization -- it was called without a mutex held, causing spurious error returns and other issues. Since _mesa_get_sync() now takes the shared context mutex, this problem is also resolved. Signed-off-by: Steinar H. Gunderson --- src/mesa/main/objectlabel.c | 11 -- src/mesa/main/shared.c | 2 +- src/mesa/main/syncobj.c | 89 ++--- src/mesa/main/syncobj.h | 11 ++ 4 files changed, 64 insertions(+), 49 deletions(-) diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c index 41f370c..b083c43 100644 --- a/src/mesa/main/objectlabel.c +++ b/src/mesa/main/objectlabel.c @@ -288,7 +288,7 @@ void GLAPIENTRY _mesa_ObjectPtrLabel(const void *ptr, GLsizei length, const GLchar *label) { GET_CURRENT_CONTEXT(ctx); - struct gl_sync_object *const syncObj = (struct gl_sync_object *) ptr; + struct gl_sync_object *syncObj = _mesa_get_sync(ctx, sync, true); const char *callerstr; char **labelPtr; @@ -297,7 +297,7 @@ _mesa_ObjectPtrLabel(const void *ptr, GLsizei length, const GLchar *label) else callerstr = "glObjectPtrLabelKHR"; - if (!_mesa_validate_sync(ctx, syncObj)) { + if (!syncObj) { _mesa_error(ctx, GL_INVALID_VALUE, "%s (not a valid sync object)", callerstr); return; @@ -306,6 +306,7 @@ _mesa_ObjectPtrLabel(const void *ptr, GLsizei length, const GLchar *label) labelPtr = >Label; set_label(ctx, labelPtr, label, length, callerstr); + _mesa_unref_sync_object(ctx, syncObj, 1); } void GLAPIENTRY @@ -313,7 +314,7 @@ _mesa_GetObjectPtrLabel(const void *ptr, GLsizei bufSize, GLsizei *length, GLchar *label) { GET_CURRENT_CONTEXT(ctx); - struct gl_sync_object *const syncObj = (struct gl_sync_object *) ptr; + struct gl_sync_object *syncObj; const char *callerstr; char **labelPtr; @@ -328,7 +329,8 @@ _mesa_GetObjectPtrLabel(const void *ptr, GLsizei bufSize, GLsizei *length, return; } - if (!_mesa_validate_sync(ctx, syncObj)) { + syncObj = _mesa_get_sync(ctx, sync, true); + if (!syncObj) { _mesa_error(ctx, GL_INVALID_VALUE, "%s (not a valid sync object)", callerstr); return; @@ -337,4 +339,5 @@ _mesa_GetObjectPtrLabel(const void *ptr, GLsizei bufSize, GLsizei *length, labelPtr = >Label; copy_label(*labelPtr, label, length, bufSize); + _mesa_unref_sync_object(ctx, syncObj, 1); } diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c index c37b31d..b9f7bb6 100644 --- a/src/mesa/main/shared.c +++ b/src/mesa/main/shared.c @@ -338,7 +338,7 @@ free_shared_state(struct gl_context *ctx, struct gl_shared_state *shared) struct set_entry *entry; set_foreach(shared->SyncObjects, entry) { - _mesa_unref_sync_object(ctx, (struct gl_sync_object *) entry->key); + _mesa_unref_sync_object(ctx, (struct gl_sync_object *) entry->key, 1); } } _mesa_set_destroy(shared->SyncObjects, NULL); diff --git a/src/mesa/main/syncobj.c b/src/mesa/main/syncobj.c index c1b2d3b..d1c6c06 100644 --- a/src/mesa/main/syncobj.c +++ b/src/mesa/main/syncobj.c @@ -167,34 +167,42 @@ _mesa_free_sync_data(struct gl_context *ctx) * - not in sync objects hash table * - type is GL_SYNC_FENCE * - not marked as deleted + * + * Returns the internal gl_sync_object pointer if the sync object is valid + * or NULL if it isn't. + * + * If "incRefCount" is true, the reference count is incremented, which is + * normally what you want; otherwise, a glDeleteSync from another thread + * could delete the sync object while you are still working on it. */ -bool -_mesa_validate_sync(struct gl_context *ctx, -const struct gl_sync_object *syncObj) +struct gl_sync_object * +_mesa_get_sync(struct gl_context *ctx, GLsync sync, bool incRefCount) { - return (syncObj != NULL) + struct gl_sync_object *syncObj = (struct gl_sync_object *) sync; + mtx_lock(>Shared->Mutex); + if (syncObj != NULL &&
[Mesa-dev] [PATCH] svga: initialize pipe_driver_query_info entries with a macro
To be safe, set all the fields in case the enums ordering/values ever change. --- src/gallium/drivers/svga/svga_screen.c | 43 ++ 1 file changed, 28 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c index 09a3d33..fca501b 100644 --- a/src/gallium/drivers/svga/svga_screen.c +++ b/src/gallium/drivers/svga/svga_screen.c @@ -780,26 +780,39 @@ svga_get_driver_query_info(struct pipe_screen *screen, unsigned index, struct pipe_driver_query_info *info) { +#define QUERY(NAME, ENUM, UNITS) \ + {NAME, ENUM, {0}, UNITS, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE, 0, 0x0} + static const struct pipe_driver_query_info queries[] = { /* per-frame counters */ - {"num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS, {0}}, - {"num-fallbacks", SVGA_QUERY_NUM_FALLBACKS, {0}}, - {"num-flushes", SVGA_QUERY_NUM_FLUSHES, {0}}, - {"num-validations", SVGA_QUERY_NUM_VALIDATIONS, {0}}, - {"map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME, {0}, - PIPE_DRIVER_QUERY_TYPE_MICROSECONDS}, - {"num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED, {0}}, - {"num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED, {0}, - PIPE_DRIVER_QUERY_TYPE_BYTES, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE}, + QUERY("num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-fallbacks", SVGA_QUERY_NUM_FALLBACKS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-flushes", SVGA_QUERY_NUM_FLUSHES, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-validations", SVGA_QUERY_NUM_VALIDATIONS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME, +PIPE_DRIVER_QUERY_TYPE_MICROSECONDS), + QUERY("num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED, +PIPE_DRIVER_QUERY_TYPE_BYTES), /* running total counters */ - {"memory-used", SVGA_QUERY_MEMORY_USED, {0}, - PIPE_DRIVER_QUERY_TYPE_BYTES}, - {"num-shaders", SVGA_QUERY_NUM_SHADERS, {0}}, - {"num-resources", SVGA_QUERY_NUM_RESOURCES, {0}}, - {"num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS, {0}}, - {"num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS, {0}}, + QUERY("memory-used", SVGA_QUERY_MEMORY_USED, +PIPE_DRIVER_QUERY_TYPE_BYTES), + QUERY("num-shaders", SVGA_QUERY_NUM_SHADERS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-resources", SVGA_QUERY_NUM_RESOURCES, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS, +PIPE_DRIVER_QUERY_TYPE_UINT64), }; +#undef QUERY if (!info) return Elements(queries); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 06/12] i965/fs: Get rid of load/store_foo_indirect
On Tuesday, December 08, 2015 01:46:23 PM Jason Ekstrand wrote: > --- > src/mesa/drivers/dri/i965/brw_fs.h | 2 +- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 > ++- > src/mesa/drivers/dri/i965/brw_nir.c | 45 ++--- > 3 files changed, 84 insertions(+), 69 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h > b/src/mesa/drivers/dri/i965/brw_fs.h > index bca4589..b55589f 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.h > +++ b/src/mesa/drivers/dri/i965/brw_fs.h > @@ -278,7 +278,7 @@ public: > unsigned stream_id); > void emit_gs_thread_end(); > void emit_gs_input_load(const fs_reg , const nir_src _src, > - const fs_reg _offset, unsigned > imm_offset, > + unsigned base_offset, const nir_src _src, > unsigned num_components); > void emit_cs_terminate(); > fs_reg *emit_cs_local_invocation_id_setup(); > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index 3754622..18c9d65 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -1603,29 +1603,31 @@ fs_visitor::emit_gs_vertex(const nir_src > _count_nir_src, > void > fs_visitor::emit_gs_input_load(const fs_reg , > const nir_src _src, > - const fs_reg _offset, > - unsigned imm_offset, > + unsigned base_offset, > + const nir_src _src, > unsigned num_components) > { > struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) > prog_data; > > + nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); > + nir_const_value *offset_const = nir_src_as_const_value(offset_src); > + const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; > + > /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y], > * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w]. Only > * gl_PointSize is available as a GS input, however, so it must be that. > */ > - const bool is_point_size = > - indirect_offset.file == BAD_FILE && imm_offset == 0; > - > - nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); > - const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; > + const bool is_point_size = (base_offset == 0); > > - if (indirect_offset.file == BAD_FILE && vertex_const != NULL && > - 4 * imm_offset < push_reg_count) { > - imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count; > + if (offset_const != NULL && vertex_const != NULL && > + 4 * offset_const->u[0] < push_reg_count) { This needs to be 4 * (base_offset + offset_const->u[0]) < push_reg_count otherwise we use the push model for inputs that aren't actually pushed, which breaks around 1500 piglit tests with INTEL_SCALAR_GS=1. > + int imm_offset = (base_offset + offset_const->u[0]) * 4 + > + vertex_const->u[0] * push_reg_count; >/* This input was pushed into registers. */ >if (is_point_size) { > /* gl_PointSize comes in .w */ > - bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); > + assert(imm_offset == 0); > + bld.MOV(dst, fs_reg(ATTR, 3, dst.type)); This is wrong. imm_offset contains the offset derived from the vertex index, and gl_PointSize exists for multiple vertices. So it won't be 0. Please just remove this change, putting it back to: bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); That fixes assertion failures in 3 Piglit tests. With that fixed, patches 1-8 and 11-12 are: Reviewed-by: Kenneth Graunkesignature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics
On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote: > +/* src[] = { buffer_index, offset }. No const_index */ > +LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE) I'm pretty sure const_index[0] is a writemask, so this comemnt is wrong. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: initialize pipe_driver_query_info entries with a macro
Reviewed-by: Charmaine LeeFrom: Brian Paul Sent: Tuesday, December 8, 2015 4:35 PM To: mesa-dev@lists.freedesktop.org Cc: Charmaine Lee Subject: [PATCH] svga: initialize pipe_driver_query_info entries with a macro To be safe, set all the fields in case the enums ordering/values ever change. --- src/gallium/drivers/svga/svga_screen.c | 43 ++ 1 file changed, 28 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c index 09a3d33..fca501b 100644 --- a/src/gallium/drivers/svga/svga_screen.c +++ b/src/gallium/drivers/svga/svga_screen.c @@ -780,26 +780,39 @@ svga_get_driver_query_info(struct pipe_screen *screen, unsigned index, struct pipe_driver_query_info *info) { +#define QUERY(NAME, ENUM, UNITS) \ + {NAME, ENUM, {0}, UNITS, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE, 0, 0x0} + static const struct pipe_driver_query_info queries[] = { /* per-frame counters */ - {"num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS, {0}}, - {"num-fallbacks", SVGA_QUERY_NUM_FALLBACKS, {0}}, - {"num-flushes", SVGA_QUERY_NUM_FLUSHES, {0}}, - {"num-validations", SVGA_QUERY_NUM_VALIDATIONS, {0}}, - {"map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME, {0}, - PIPE_DRIVER_QUERY_TYPE_MICROSECONDS}, - {"num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED, {0}}, - {"num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED, {0}, - PIPE_DRIVER_QUERY_TYPE_BYTES, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE}, + QUERY("num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-fallbacks", SVGA_QUERY_NUM_FALLBACKS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-flushes", SVGA_QUERY_NUM_FLUSHES, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-validations", SVGA_QUERY_NUM_VALIDATIONS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME, +PIPE_DRIVER_QUERY_TYPE_MICROSECONDS), + QUERY("num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED, +PIPE_DRIVER_QUERY_TYPE_BYTES), /* running total counters */ - {"memory-used", SVGA_QUERY_MEMORY_USED, {0}, - PIPE_DRIVER_QUERY_TYPE_BYTES}, - {"num-shaders", SVGA_QUERY_NUM_SHADERS, {0}}, - {"num-resources", SVGA_QUERY_NUM_RESOURCES, {0}}, - {"num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS, {0}}, - {"num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS, {0}}, + QUERY("memory-used", SVGA_QUERY_MEMORY_USED, +PIPE_DRIVER_QUERY_TYPE_BYTES), + QUERY("num-shaders", SVGA_QUERY_NUM_SHADERS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-resources", SVGA_QUERY_NUM_RESOURCES, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS, +PIPE_DRIVER_QUERY_TYPE_UINT64), + QUERY("num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS, +PIPE_DRIVER_QUERY_TYPE_UINT64), }; +#undef QUERY if (!info) return Elements(queries); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.
On Mon, Dec 7, 2015 at 5:32 PM, Dawid Ganwrote: > This format has been added in commit: > 28090b30dd6b5977de085f48c620574214b6b4ba > But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format. > It was causing the screen in Supertuxkart to be darker than expected, see: > https://bugs.freedesktop.org/show_bug.cgi?id=92759 > > Cc: Boyan Ding > Cc: "11.0 11.1" > Fixes: 28090b30dd6 "i965: Add XRGB format to intel_screen_make_configs" > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759 > --- > src/mesa/drivers/dri/i965/intel_screen.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > b/src/mesa/drivers/dri/i965/intel_screen.c > index cc90efe..75d5a65 100644 > --- a/src/mesa/drivers/dri/i965/intel_screen.c > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > @@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen) > stencil_bits[2] = 8; > num_depth_stencil_bits = 3; > } > + } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) { > + depth_bits[1] = 24; > + stencil_bits[1] = 0; Why would you want depth without stencil when using BGRX? I don't see how the two are connected... Are you sure you're picking the right visual? -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] configure.ac: fix test for SSE4.1 assembler support
Change the __m128i variables to be volatile so gcc 4.9 won't optimise all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls still get optimised out but now there is at least one SSE4.1 instruction generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions got optimised out the configure test would incorrectly pass when the compiler supported the intrinsics and the assembler didn't support the instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 Signed-off-by: Jonathan GrayCc: "11.0 11.1" --- configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 260934d..1d82e47 100644 --- a/configure.ac +++ b/configure.ac @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS" AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ #include int main () { -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c; +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c; c = _mm_max_epu32(a, b); return 0; }]])], SSE41_SUPPORTED=1) -- 2.6.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/8] glsl: only update doubles inputs for vertex inputs.
From: Dave AirlieThis doesn't apply to other stages. This is only used in the mesa/st code, which needs further fixes. Signed-off-by: Dave Airlie --- src/glsl/ir_set_program_inouts.cpp | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp index 838b5e7..c9b8a38 100644 --- a/src/glsl/ir_set_program_inouts.cpp +++ b/src/glsl/ir_set_program_inouts.cpp @@ -115,8 +115,11 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len, else prog->InputsRead |= bitfield; - if (var->type->without_array()->is_dual_slot_double()) + /* double inputs read is only for vertex inputs */ + if (stage == MESA_SHADER_VERTEX && + var->type->without_array()->is_dual_slot_double()) prog->DoubleInputsRead |= bitfield; + if (stage == MESA_SHADER_FRAGMENT) { gl_fragment_program *fprog = (gl_fragment_program *) prog; fprog->InterpQualifier[idx] = -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] glsl: pass stage into mark function
From: Dave AirlieDon't use a bool here, as for some 64-bit fixes we need the stage. Signed-off-by: Dave Airlie --- src/glsl/ir_set_program_inouts.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp index d7c29b0..70d754f 100644 --- a/src/glsl/ir_set_program_inouts.cpp +++ b/src/glsl/ir_set_program_inouts.cpp @@ -90,7 +90,7 @@ is_dual_slot(ir_variable *var) static void mark(struct gl_program *prog, ir_variable *var, int offset, int len, - bool is_fragment_shader) + gl_shader_stage stage) { /* As of GLSL 1.20, varyings can only be floats, floating-point * vectors or matrices, or arrays of them. For Mesa programs using @@ -125,7 +125,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len, if (dual_slot) prog->DoubleInputsRead |= bitfield; - if (is_fragment_shader) { + if (stage == MESA_SHADER_FRAGMENT) { gl_fragment_program *fprog = (gl_fragment_program *) prog; fprog->InterpQualifier[idx] = (glsl_interp_qualifier) var->data.interpolation; @@ -178,7 +178,7 @@ ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var) } mark(this->prog, var, 0, type->count_attribute_slots(), -this->shader_stage == MESA_SHADER_FRAGMENT); +this->shader_stage); } /* Default handler: Mark all the locations in the variable as used. */ @@ -302,7 +302,7 @@ ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var, } mark(this->prog, var, index_as_constant->value.u[0] * elem_width, -elem_width, this->shader_stage == MESA_SHADER_FRAGMENT); +elem_width, this->shader_stage); return true; } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/8] glsl/fp64: add helper for dual slot double detection.
From: Dave AirlieThe old function didn't work for matrices, and we need this in other places to fix some other problems, so move to a helper in glsl type and fix the one user so far. A dual slot double is one that has 3 or 4 components in it's base type. Signed-off-by: Dave Airlie --- src/glsl/ir_set_program_inouts.cpp | 10 +- src/glsl/nir/glsl_types.h | 8 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp index 70d754f..782f1b1 100644 --- a/src/glsl/ir_set_program_inouts.cpp +++ b/src/glsl/ir_set_program_inouts.cpp @@ -81,13 +81,6 @@ is_shader_inout(ir_variable *var) var->data.mode == ir_var_system_value; } -static inline bool -is_dual_slot(ir_variable *var) -{ - const glsl_type *type = var->type->without_array(); - return type == glsl_type::dvec4_type || type == glsl_type::dvec3_type; -} - static void mark(struct gl_program *prog, ir_variable *var, int offset, int len, gl_shader_stage stage) @@ -101,7 +94,6 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len, */ for (int i = 0; i < len; i++) { - bool dual_slot = is_dual_slot(var); int idx = var->data.location + var->data.index + offset + i; bool is_patch_generic = var->data.patch && idx != VARYING_SLOT_TESS_LEVEL_INNER && @@ -123,7 +115,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len, else prog->InputsRead |= bitfield; - if (dual_slot) + if (var->type->without_array()->is_dual_slot_double()) prog->DoubleInputsRead |= bitfield; if (stage == MESA_SHADER_FRAGMENT) { gl_fragment_program *fprog = (gl_fragment_program *) prog; diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h index d8a999a..26f25a1 100644 --- a/src/glsl/nir/glsl_types.h +++ b/src/glsl/nir/glsl_types.h @@ -471,6 +471,14 @@ struct glsl_type { } /** +* Query whether a double takes two slots. +*/ + bool is_dual_slot_double() const + { + return base_type == GLSL_TYPE_DOUBLE && vector_elements > 2; + } + + /** * Query whether or not a type is a non-array boolean type */ bool is_boolean() const -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/8] glsl: use dual slot helper in the linker code.
From: Dave AirlieSigned-off-by: Dave Airlie --- src/glsl/linker.cpp | 11 +-- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index ae628cd..89659c7 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -2603,17 +2603,8 @@ assign_attribute_or_color_locations(gl_shader_program *prog, * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this * is optional behavior, but it seems preferable. */ -const glsl_type *type = var->type->without_array(); -if (type == glsl_type::dvec3_type || -type == glsl_type::dvec4_type || -type == glsl_type::dmat2x3_type || -type == glsl_type::dmat2x4_type || -type == glsl_type::dmat3_type || -type == glsl_type::dmat3x4_type || -type == glsl_type::dmat4x3_type || -type == glsl_type::dmat4_type) { +if (var->type->without_array()->is_dual_slot_double()) double_storage_locations |= (use_mask << attr); -} } continue; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] glsl: fix count_attribute_slots to allow for different 64-bit handling
From: Dave AirlieSo vertex shader input attributes are handled different than internal varyings between shader stages, dvec3 and dvec4 only count as one slot for vertex attributes, but for internal varyings, they count as 2. This patch comments all the uses of this API to clarify what we pass in, except one which needs further investigation Signed-off-by: Dave Airlie --- src/glsl/ir_set_program_inouts.cpp | 7 ++- src/glsl/link_varyings.cpp | 6 -- src/glsl/linker.cpp| 5 +++-- src/glsl/nir/glsl_types.cpp| 18 +- src/glsl/nir/glsl_types.h | 5 - 5 files changed, 30 insertions(+), 11 deletions(-) diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp index 782f1b1..838b5e7 100644 --- a/src/glsl/ir_set_program_inouts.cpp +++ b/src/glsl/ir_set_program_inouts.cpp @@ -146,6 +146,7 @@ void ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var) { const glsl_type *type = var->type; + bool vertex_input = false; if (this->shader_stage == MESA_SHADER_GEOMETRY && var->data.mode == ir_var_shader_in && type->is_array()) { type = type->fields.array; @@ -169,7 +170,11 @@ ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var) type = type->fields.array; } - mark(this->prog, var, 0, type->count_attribute_slots(), + if (this->shader_stage == MESA_SHADER_VERTEX && + var->data.mode == ir_var_shader_in) + vertex_input = true; + + mark(this->prog, var, 0, type->count_attribute_slots(vertex_input), this->shader_stage); } diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp index 71750d1..9f6467b 100644 --- a/src/glsl/link_varyings.cpp +++ b/src/glsl/link_varyings.cpp @@ -1712,7 +1712,8 @@ check_against_output_limit(struct gl_context *ctx, if (var && var->data.mode == ir_var_shader_out && var_counts_against_varying_limit(producer->Stage, var)) { - output_vectors += var->type->count_attribute_slots(); + /* outputs for fragment shader can't be doubles */ + output_vectors += var->type->count_attribute_slots(false); } } @@ -1753,7 +1754,8 @@ check_against_input_limit(struct gl_context *ctx, if (var && var->data.mode == ir_var_shader_in && var_counts_against_varying_limit(consumer->Stage, var)) { - input_vectors += var->type->count_attribute_slots(); + /* vertex inputs aren't varying counted */ + input_vectors += var->type->count_attribute_slots(false); } } diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 89659c7..1670036 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -2466,7 +2466,7 @@ assign_attribute_or_color_locations(gl_shader_program *prog, return false; } - const unsigned slots = var->type->count_attribute_slots(); + const unsigned slots = var->type->count_attribute_slots(false); /* If the variable is not a built-in and has a location statically * assigned in the shader (presumably via a layout qualifier), make sure @@ -2995,7 +2995,8 @@ check_image_resources(struct gl_context *ctx, struct gl_shader_program *prog) foreach_in_list(ir_instruction, node, sh->ir) { ir_variable *var = node->as_variable(); if (var && var->data.mode == ir_var_shader_out) - fragment_outputs += var->type->count_attribute_slots(); + /* since there are no double fs outputs - pass false */ + fragment_outputs += var->type->count_attribute_slots(false); } } } diff --git a/src/glsl/nir/glsl_types.cpp b/src/glsl/nir/glsl_types.cpp index 3cf2f03..44d3056 100644 --- a/src/glsl/nir/glsl_types.cpp +++ b/src/glsl/nir/glsl_types.cpp @@ -1640,7 +1640,7 @@ glsl_type::std430_size(bool row_major) const } unsigned -glsl_type::count_attribute_slots() const +glsl_type::count_attribute_slots(bool vertex_input_slots) const { /* From page 31 (page 37 of the PDF) of the GLSL 1.50 spec: * @@ -1661,27 +1661,35 @@ glsl_type::count_attribute_slots() const * allows varying structs, the number of varying slots taken up by a * varying struct is simply equal to the sum of the number of slots taken * up by each element. +* +* Doubles are counted different depending on whether they are vertex +* inputs or everything else. Vertex inputs from ARB_vertex_attrib_64bit +* take one location no matter what size they are, otherwise dvec3/4 +* take two locations. */ switch (this->base_type) { case GLSL_TYPE_UINT: case GLSL_TYPE_INT: case GLSL_TYPE_FLOAT: case GLSL_TYPE_BOOL: - case GLSL_TYPE_DOUBLE: return this->matrix_columns; - + case GLSL_TYPE_DOUBLE: + if (this->vector_elements > 2 && !vertex_input_slots) + return
[Mesa-dev] glsl fp64 fixes.
I've been trying to get GL41-CTS.gpu_shader_fp64.varyings to pass on mesa for a while now on radeonsi/r600. I finally got it to pass, so this is the first round of changes. The second round are all st/glsl->tgsi convertor changes, so I decided to post the GLSL changes first. The main changes are ARB_vertex_attrib_64bit related, where the vertex input locations are counted different than the other areas, but it also fixes transform feedback with doubles. It also introduces a new types helper and uses it in a couple of places. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] glsl: fix partial marking for fp64 types.
From: Dave AirlieThis doubles the element width for the types that are greater than 2 elements wide. Signed-off-by: Dave Airlie --- src/glsl/ir_set_program_inouts.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp index c9b8a38..a2dea67 100644 --- a/src/glsl/ir_set_program_inouts.cpp +++ b/src/glsl/ir_set_program_inouts.cpp @@ -301,6 +301,13 @@ ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var, return false; } + /* double element width for double types that takes two slots */ + if (this->shader_stage != MESA_SHADER_VERTEX || + var->data.mode != ir_var_shader_in) { + if (type->without_array()->is_dual_slot_double()) +elem_width *= 2; + } + mark(this->prog, var, index_as_constant->value.u[0] * elem_width, elem_width, this->shader_stage); return true; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50/ir: fix texture grad for cubemaps
We were ignoring the partial derivatives on the last dim. Signed-off-by: Ilia Mirkin--- A little frightening that we weren't failing any tests... perhaps I got this wrong somehow? src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp | 2 +- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp | 2 +- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 7 +++ 4 files changed, 6 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp index 1d2caab..b233860 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp @@ -1897,7 +1897,7 @@ Converter::handleTEX(Value *dst[4], int R, int S, int L, int C, int Dx, int Dy) shd = fetchSrc(C >> 4, C & 3); if (texi->op == OP_TXD) { - for (c = 0; c < tgt.getDim(); ++c) { + for (c = 0; c < tgt.getDim() + tgt.isCube(); ++c) { texi->dPdx[c].set(fetchSrc(Dx >> 4, (Dx & 3) + c)); texi->dPdy[c].set(fetchSrc(Dy >> 4, (Dy & 3) + c)); } diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp index 420cc4e..0b90378 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp @@ -57,7 +57,7 @@ GM107LoweringPass::handleManualTXD(TexInstruction *i) Instruction *tex, *add; Value *zero = bld.loadImm(bld.getSSA(), 0); int l, c; - const int dim = i->tex.target.getDim(); + const int dim = i->tex.target.getDim() + i->tex.target.isCube(); const int array = i->tex.target.isArray(); i->op = OP_TEX; // no need to clone dPdx/dPdy later diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp index 82f7a8f..19965ff 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp @@ -911,7 +911,7 @@ NV50LoweringPreSSA::handleTXD(TexInstruction *i) Instruction *tex; Value *zero = bld.loadImm(bld.getSSA(), 0); int l, c; - const int dim = i->tex.target.getDim(); + const int dim = i->tex.target.getDim() + i->tex.target.isCube(); handleTEX(i); i->op = OP_TEX; // no need to clone dPdx/dPdy later diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index 0f575f2..4f430cd 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp @@ -870,7 +870,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i) Instruction *tex; Value *zero = bld.loadImm(bld.getSSA(), 0); int l, c; - const int dim = i->tex.target.getDim(); + const int dim = i->tex.target.getDim() + i->tex.target.isCube(); const int array = i->tex.target.isArray(); i->op = OP_TEX; // no need to clone dPdx/dPdy later @@ -917,7 +917,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i) bool NVC0LoweringPass::handleTXD(TexInstruction *txd) { - int dim = txd->tex.target.getDim(); + int dim = txd->tex.target.getDim() + txd->tex.target.isCube(); unsigned arg = txd->tex.target.getArgCount(); unsigned expected_args = arg; const int chipset = prog->getTarget()->getChipset(); @@ -937,8 +937,7 @@ NVC0LoweringPass::handleTXD(TexInstruction *txd) if (expected_args > 4 || dim > 2 || - txd->tex.target.isShadow() || - txd->tex.target.isCube()) + txd->tex.target.isShadow()) txd->op = OP_TEX; handleTEX(txd); -- 2.4.10 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] clover: Fix build against LLVM 3.8 SVN >= r255078
From: Michel DänzerSigned-off-by: Michel Dänzer --- src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index 3b37f08..4d11c24 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -661,7 +661,11 @@ namespace { if (dump_asm) { LLVMSetTargetMachineAsmVerbosity(tm, true); +#if HAVE_LLVM >= 0x0308 + LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod).release()); +#else LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod)); +#endif emit_code(tm, debug_mod, LLVMAssemblyFile, _buffer, r_log); buffer_size = LLVMGetBufferSize(out_buffer); buffer_data = LLVMGetBufferStart(out_buffer); -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression
https://bugs.freedesktop.org/show_bug.cgi?id=91888 --- Comment #15 from nerdopol...@verizon.net --- (In reply to Daniel Stone from comment #14) > Ugh, so I think I'm using the same options, and I can't see what's going > wrong with either big GL or GLES. To be honest, I'm at a total loss ... can > you break on _mesa_error when running SDL to see where it's triggered? Doesn't seem that break _mesa_error works, it's not defined... -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] softpipe: V.2 implement some support for multiple viewports
This fixes my initial attempt so that piglit now passes 14/14. Thanks to a couple of tips from Roland in the previous patch I was able to fix the remaining issue. This should be golden now. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/8] glsl: fix transform feedback for 64-bit outupts.
From: Dave AirlieThis fixes the calculations for transform feedback for doubles. Signed-off-by: Dave Airlie --- src/glsl/link_varyings.cpp | 7 +-- src/glsl/link_varyings.h | 26 +- 2 files changed, 30 insertions(+), 3 deletions(-) diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp index 9f6467b..9cc77fe 100644 --- a/src/glsl/link_varyings.cpp +++ b/src/glsl/link_varyings.cpp @@ -432,6 +432,8 @@ tfeedback_decl::assign_location(struct gl_context *ctx, this->matched_candidate->type->fields.array->matrix_columns; const unsigned vector_elements = this->matched_candidate->type->fields.array->vector_elements; + const unsigned dmul = + this->matched_candidate->type->fields.array->is_double() ? 2 : 1; unsigned actual_array_size; switch (this->lowered_builtin_array_variable) { case clip_distance: @@ -459,7 +461,7 @@ tfeedback_decl::assign_location(struct gl_context *ctx, return false; } unsigned array_elem_size = this->lowered_builtin_array_variable ? -1 : vector_elements * matrix_cols; +1 : vector_elements * matrix_cols * dmul; fine_location += array_elem_size * this->array_subscript; this->size = 1; } else { @@ -519,7 +521,6 @@ tfeedback_decl::get_num_outputs() const if (!this->is_varying()) { return 0; } - return (this->num_components() + this->location_frac + 3)/4; } @@ -968,6 +969,8 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var) } else { slots = type->matrix_columns; } + if (type->without_array()->is_dual_slot_double()) + slots *= 2; this->matches[this->num_matches].num_components = 4 * slots; } else { this->matches[this->num_matches].num_components diff --git a/src/glsl/link_varyings.h b/src/glsl/link_varyings.h index 2ce72d4..1d12978 100644 --- a/src/glsl/link_varyings.h +++ b/src/glsl/link_varyings.h @@ -131,7 +131,8 @@ public: if (this->lowered_builtin_array_variable) return this->size; else - return this->vector_elements * this->matrix_columns * this->size; + return this->vector_elements * this->matrix_columns * this->size * +(this->is_double() ? 2 : 1); } unsigned get_location() const { @@ -139,6 +140,29 @@ public: } private: + + bool is_double() const + { + switch (this->type) { + case GL_DOUBLE: + case GL_DOUBLE_VEC2: + case GL_DOUBLE_VEC3: + case GL_DOUBLE_VEC4: + case GL_DOUBLE_MAT2: + case GL_DOUBLE_MAT2x3: + case GL_DOUBLE_MAT2x4: + case GL_DOUBLE_MAT3: + case GL_DOUBLE_MAT3x2: + case GL_DOUBLE_MAT3x4: + case GL_DOUBLE_MAT4: + case GL_DOUBLE_MAT4x2: + case GL_DOUBLE_MAT4x3: + return true; + default: + return false; + } + } + /** * The name that was supplied to glTransformFeedbackVaryings. Used for * error reporting and glGetTransformFeedbackVarying(). -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] glsl: only divide left components when it is a dual slot double.
From: Dave AirlieSigned-off-by: Dave Airlie --- src/glsl/lower_packed_varyings.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/lower_packed_varyings.cpp index 037c27d..ec9af62 100644 --- a/src/glsl/lower_packed_varyings.cpp +++ b/src/glsl/lower_packed_varyings.cpp @@ -472,7 +472,7 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue, char right_swizzle_name[4] = { 0, 0, 0, 0 }; left_components = 4 - fine_location % 4; - if (rvalue->type->is_double()) { + if (rvalue->type->is_dual_slot_double()) { /* We might actually end up with 0 left components! */ left_components /= 2; } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] llvmpipe: fix layer/vp input into fs when not written by prior stages
From: Roland ScheideggerARB_fragment_layer_viewport requires that if a fs reads layer or viewport index but it wasn't output by gs (or vs with other extensions), then it reads 0. This never worked for llvmpipe, and is surprisingly non-trivial to fix. The problem is the mechanism to handle non-existing outputs in draw is rather crude, it will simply redirect them to whatever is at output 0, thus later stages will just get garbage. So, rather than trying to fix this up (which looks non-trivial), fix this up in llvmpipe setup by detecting this case there and output a fixed zero directly. While here, also optimize the hw vertex layout a bit - previously if the gs outputted layer (or vp) and the fs read those inputs, we'd add them twice to the vertex layout, which is unnecessary. And do some minor cleanup, slots don't require that many bits, there was some bogus (but harmless) float/int mixup for psize slot too, make the slots all unsigned (we always put pos at pos zero thus everything else has to be positive if it exists), and make sure they are properly initialized (layer and vp index slot were not which looked fishy as they might not have got set back to zero when changing from a gs which outputs them to one which does not). This fixes the failures in piglit's arb_fragment_layer_viewport group (3 each for layer and vp). --- src/gallium/drivers/llvmpipe/lp_bld_interp.h| 3 +- src/gallium/drivers/llvmpipe/lp_context.h | 18 -- src/gallium/drivers/llvmpipe/lp_setup.c | 2 +- src/gallium/drivers/llvmpipe/lp_setup_context.h | 8 +-- src/gallium/drivers/llvmpipe/lp_setup_point.c | 2 +- src/gallium/drivers/llvmpipe/lp_state_derived.c | 78 + src/gallium/drivers/llvmpipe/lp_state_setup.c | 29 + src/gallium/drivers/llvmpipe/lp_state_setup.h | 9 ++- 8 files changed, 96 insertions(+), 53 deletions(-) diff --git a/src/gallium/drivers/llvmpipe/lp_bld_interp.h b/src/gallium/drivers/llvmpipe/lp_bld_interp.h index 9029d2a..0a52642 100644 --- a/src/gallium/drivers/llvmpipe/lp_bld_interp.h +++ b/src/gallium/drivers/llvmpipe/lp_bld_interp.h @@ -63,7 +63,8 @@ enum lp_interp { LP_INTERP_LINEAR, LP_INTERP_PERSPECTIVE, LP_INTERP_POSITION, - LP_INTERP_FACING + LP_INTERP_FACING, + LP_INTERP_ZERO }; struct lp_shader_input { diff --git a/src/gallium/drivers/llvmpipe/lp_context.h b/src/gallium/drivers/llvmpipe/lp_context.h index c9a5d67..9dcc102 100644 --- a/src/gallium/drivers/llvmpipe/lp_context.h +++ b/src/gallium/drivers/llvmpipe/lp_context.h @@ -108,22 +108,28 @@ struct llvmpipe_context { struct vertex_info vertex_info; /** Which vertex shader output slot contains color */ - int color_slot[2]; + uint8_t color_slot[2]; /** Which vertex shader output slot contains bcolor */ - int bcolor_slot[2]; + uint8_t bcolor_slot[2]; /** Which vertex shader output slot contains point size */ - int psize_slot; + uint8_t psize_slot; /** Which vertex shader output slot contains viewport index */ - int viewport_index_slot; + uint8_t viewport_index_slot; /** Which geometry shader output slot contains layer */ - int layer_slot; + uint8_t layer_slot; /** A fake frontface output for unfilled primitives */ - int face_slot; + uint8_t face_slot; + + /** Which output slot is used for the fake vp index info */ + uint8_t fake_vpindex_slot; + + /** Which output slot is used for the fake layer info */ + uint8_t fake_layer_slot; /** Depth format and bias settings. */ boolean floating_point_depth; diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c index 1778b13..ddbb88e 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup.c +++ b/src/gallium/drivers/llvmpipe/lp_setup.c @@ -1207,7 +1207,7 @@ lp_setup_update_state( struct lp_setup_context *setup, /* Will probably need to move this somewhere else, just need * to know about vertex shader point size attribute. */ - setup->psize = lp->psize_slot; + setup->psize_slot = lp->psize_slot; setup->viewport_index_slot = lp->viewport_index_slot; setup->layer_slot = lp->layer_slot; setup->face_slot = lp->face_slot; diff --git a/src/gallium/drivers/llvmpipe/lp_setup_context.h b/src/gallium/drivers/llvmpipe/lp_setup_context.h index 2410e23..4451284 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup_context.h +++ b/src/gallium/drivers/llvmpipe/lp_setup_context.h @@ -105,10 +105,10 @@ struct lp_setup_context float pixel_offset; float line_width; float point_size; - float psize; - unsigned viewport_index_slot; - unsigned layer_slot; - int face_slot; + uint8_t psize_slot; + uint8_t viewport_index_slot; + uint8_t layer_slot; + uint8_t face_slot; struct pipe_framebuffer_state fb; struct u_rect framebuffer; diff --git a/src/gallium/drivers/llvmpipe/lp_setup_point.c
Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports
The following changes since commit a13b14930d94b024160fe17814e091356d07f7fb: llvmpipe: fix fp64 inputs to geom shader. (2015-12-09 13:56:39 +1000) are available in the git repository at: https://github.com/victoredwardocallaghan/mesa-GLwork.git softpipe-ARB_viewport_array for you to fetch changes up to 04e17cf34ba04f99e2536f2e598a2ee7d8bb1237: softpipe: Update GL3.txt doc with GL_ARB_viewport_array support (2015-12-09 15:09:17 +1100) Edward O'Callaghan (2): softpipe: implement some support for multiple viewports softpipe: Update GL3.txt doc with GL_ARB_viewport_array support docs/GL3.txt | 2 +- src/gallium/drivers/softpipe/sp_context.h | 9 ++- src/gallium/drivers/softpipe/sp_quad.h| 1 + src/gallium/drivers/softpipe/sp_quad_depth_test.c | 5 +- src/gallium/drivers/softpipe/sp_screen.c | 2 +- src/gallium/drivers/softpipe/sp_setup.c | 49 +++ src/gallium/drivers/softpipe/sp_setup.h | 5 ++ src/gallium/drivers/softpipe/sp_state_clip.c | 15 +++-- src/gallium/drivers/softpipe/sp_state_derived.c | 76 +++ src/gallium/drivers/softpipe/sp_surface.c | 4 +- 10 files changed, 114 insertions(+), 54 deletions(-) On 2015-12-09 15:16, Edward O'Callaghan wrote: This fixes my initial attempt so that piglit now passes 14/14. Thanks to a couple of tips from Roland in the previous patch I was able to fix the remaining issue. This should be golden now. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] configure.ac: use pkg-config for libelf
Use PKG_CHECK_MODULES to get the flags to link libelf v2: keep AC_CHECK_LIB as a fallback for elfutils provided libelf that doesn't install a pkg-config file. Signed-off-by: Jonathan GrayReviewed-by: Michel D�nzer Tested-by: Michel D�nzer Cc: "11.0 11.1" --- configure.ac | 12 +--- src/gallium/drivers/radeon/Makefile.am | 5 +++-- src/gallium/targets/opencl/Makefile.am | 5 - 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/configure.ac b/configure.ac index 25fdc39..260934d 100644 --- a/configure.ac +++ b/configure.ac @@ -1724,7 +1724,15 @@ AC_ARG_WITH([clang-libdir], [CLANG_LIBDIR='']) PKG_CHECK_EXISTS([libclc], [have_libclc=yes], [have_libclc=no]) -AC_CHECK_LIB([elf], [elf_memory], [have_libelf=yes;ELF_LIB=-lelf]) +PKG_CHECK_MODULES([LIBELF], [libelf], [have_libelf=yes], [have_libelf=no]) + +if test "x$have_libelf" = xno; then + LIBELF_LIBS='' + LIBELF_CFLAGS='' + AC_CHECK_LIB([elf], [elf_memory], [have_libelf=yes;LIBELF_LIBS=-lelf], [have_libelf=no]) + AC_SUBST([LIBELF_LIBS]) + AC_SUBST([LIBELF_CFLAGS]) +fi if test "x$enable_opencl" = xyes; then if test -z "$with_gallium_drivers"; then @@ -2305,8 +2313,6 @@ if test "x$USE_VC4_SIMULATOR" = xyes -a "x$HAVE_GALLIUM_ILO" = xyes; then AC_MSG_ERROR([VC4 simulator on x86 replaces i965 driver build, so ilo must be disabled.]) fi -AC_SUBST([ELF_LIB]) - AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes) AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes) AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes) diff --git a/src/gallium/drivers/radeon/Makefile.am b/src/gallium/drivers/radeon/Makefile.am index 13d8976..a6fc145 100644 --- a/src/gallium/drivers/radeon/Makefile.am +++ b/src/gallium/drivers/radeon/Makefile.am @@ -16,7 +16,8 @@ libradeon_la_SOURCES = \ if NEED_RADEON_LLVM AM_CFLAGS += \ - $(LLVM_CFLAGS) + $(LLVM_CFLAGS) \ + $(LIBELF_CFLAGS) libradeon_la_SOURCES += \ $(LLVM_C_FILES) @@ -24,7 +25,7 @@ libradeon_la_SOURCES += \ libradeon_la_LIBADD = \ $(CLOCK_LIB) \ $(LLVM_LIBS) \ - $(ELF_LIB) + $(LIBELF_LIBS) libradeon_la_LDFLAGS = \ $(LLVM_LDFLAGS) diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index 08f95e8..f3ba1e3 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -2,6 +2,9 @@ include $(top_srcdir)/src/gallium/Automake.inc lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la +AM_CPPFLAGS = \ +$(LIBELF_CFLAGS) + lib@OPENCL_LIBNAME@_la_LDFLAGS = \ $(LLVM_LDFLAGS) \ -no-undefined \ @@ -19,7 +22,7 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \ $(top_builddir)/src/gallium/state_trackers/clover/libclover.la \ $(top_builddir)/src/gallium/auxiliary/libgallium.la \ $(top_builddir)/src/util/libmesautil.la \ - $(ELF_LIB) \ + $(LIBELF_LIBS) \ $(DLOPEN_LIBS) \ -lclangCodeGen \ -lclangFrontendTool \ -- 2.6.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: add loading doubles from lds and geom inputs
On 09.12.2015 13:03, Dave Airlie wrote: > From: Dave Airlie> > This adds two cases of missing double handling to radeonsi, > one for loading doubles from LDS and one for fetch doubles > for geometry shader inputs. > > Signed-off-by: Dave Airlie Please split it up into two patches. Then both of them are Reviewed-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi: add loading doubles from lds and geom inputs
From: Dave AirlieThis adds two cases of missing double handling to radeonsi, one for loading doubles from LDS and one for fetch doubles for geometry shader inputs. Signed-off-by: Dave Airlie --- src/gallium/drivers/radeonsi/si_shader.c | 28 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 14f12df..1baa2eb 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -594,6 +594,14 @@ static LLVMValueRef lds_load(struct lp_build_tgsi_context *bld_base, lp_build_const_int32(gallivm, swizzle)); value = build_indexed_load(si_shader_ctx, si_shader_ctx->lds, dw_addr); + if (type == TGSI_TYPE_DOUBLE) { + LLVMValueRef value2; + dw_addr = lp_build_add(_base->uint_bld, dw_addr, + lp_build_const_int32(gallivm, swizzle + 1)); + value2 = build_indexed_load(si_shader_ctx, si_shader_ctx->lds, dw_addr); + return radeon_llvm_emit_fetch_double(bld_base, value, value2); + } + return LLVMBuildBitCast(gallivm->builder, value, tgsi2llvmtype(bld_base, type), ""); } @@ -733,6 +741,7 @@ static LLVMValueRef fetch_input_gs( unsigned semantic_name = info->input_semantic_name[reg->Register.Index]; unsigned semantic_index = info->input_semantic_index[reg->Register.Index]; unsigned param; + LLVMValueRef value; if (swizzle != ~0 && semantic_name == TGSI_SEMANTIC_PRIMID) return get_primitive_id(bld_base, swizzle); @@ -774,11 +783,22 @@ static LLVMValueRef fetch_input_gs( args[7] = uint->zero; /* SLC */ args[8] = uint->zero; /* TFE */ + value = lp_build_intrinsic(gallivm->builder, + "llvm.SI.buffer.load.dword.i32.i32", + i32, args, 9, + LLVMReadOnlyAttribute | LLVMNoUnwindAttribute); + if (type == TGSI_TYPE_DOUBLE) { + LLVMValueRef value2; + args[2] = lp_build_const_int32(gallivm, (param * 4 + swizzle + 1) * 256); + value2 = lp_build_intrinsic(gallivm->builder, + "llvm.SI.buffer.load.dword.i32.i32", + i32, args, 9, + LLVMReadOnlyAttribute | LLVMNoUnwindAttribute); + return radeon_llvm_emit_fetch_double(bld_base, +value, value2); + } return LLVMBuildBitCast(gallivm->builder, - lp_build_intrinsic(gallivm->builder, - "llvm.SI.buffer.load.dword.i32.i32", - i32, args, 9, - LLVMReadOnlyAttribute | LLVMNoUnwindAttribute), + value, tgsi2llvmtype(bld_base, type), ""); } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/10] Rework of pb_cache_manager for removal of radeon_winsys_cs_handle
On 06.12.2015 19:00, Marek Olšák wrote: This series addresses the weirdness in radeon drivers that every buffer has 2 handles: - pb_buffer (== pb_cache_buffer) - radeon_winsys_cs_handle (winsys-specific pb_buffer) The inefficiency of converting pb_cache_buffer into the winsys-specific buffer made me introduce radeon_winsys_cs_handle a few years ago, which has been used for almost everything (map/unmap/command submission/etc.) and pb_buffer has only held the reference keeping the buffer alive. Now it's time to do this right. This series moves the pb_cache_manager logic into an independent module "pb_cache". Next, all dependencies on pb_manager are removed from both winsyses and the new module is used instead. The result is that pb_buffer is equal to radeon_winsys_cs_handle, and the latter can be removed. Very nice! I only have some comments on #1 and an identical remark about #3 & #9 (those two patches are also a bit awkward because they combine several seemingly unrelated changes, though I don't mind *that* much). Modulo the comments mentioned above, the series is Reviewed-by: Nicolai HähnleCheers, Nicolai Please review. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 00/12] nir: Get rid of the *_indirect variants of
This is my third sending of this series. I think this version is close to working on all of the relevant drivers (it works on i965). Hopefully, we can actually push it this time. As mentioned before, all but the first two patches will be squashed together so it's important to get this right. Thank you very much to Rob Clark and Eric Anholt for all your work helping me debug this on your drivers. Jason Ekstrand (12): vc4: Do all uniform loads with byte offsets i965/fs_nir: Refactor store_output, load_input, and load_uniform nir: Get rid of *_indirect variants of input/output load/store intrinsics nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect nir/lower_io: Get rid of load/store_foo_indirect i965/fs: Get rid of load/store_foo_indirect i965/vec4: Get rid of load/store_foo_indirect tgsi_to_nir: Get rid of load/store_foo_indirect ir3/nir: Use the new unified io intrinsics vc4/nir: Use the new unified io intrinsics nir/lower_clip: Update to the new load/store intrinsics nir/lower_two_sided_color: Update to the new load intrinsic src/gallium/auxiliary/nir/tgsi_to_nir.c| 52 .../drivers/freedreno/ir3/ir3_compiler_nir.c | 79 ++- src/gallium/drivers/vc4/vc4_nir_lower_blend.c | 1 + src/gallium/drivers/vc4/vc4_nir_lower_io.c | 55 +--- src/gallium/drivers/vc4/vc4_program.c | 44 +++--- src/glsl/nir/glsl_to_nir.cpp | 47 +-- src/glsl/nir/nir.h | 2 +- src/glsl/nir/nir_intrinsics.h | 81 ++-- src/glsl/nir/nir_lower_clip.c | 3 + src/glsl/nir/nir_lower_io.c| 111 ++-- src/glsl/nir/nir_lower_phis_to_scalar.c| 4 - src/glsl/nir/nir_lower_two_sided_color.c | 2 + src/glsl/nir/nir_print.c | 9 +- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 147 + src/mesa/drivers/dri/i965/brw_nir.c| 45 +-- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 7 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 95 ++--- 18 files changed, 380 insertions(+), 406 deletions(-) -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 10/12] vc4/nir: Use the new unified io intrinsics
--- src/gallium/drivers/vc4/vc4_nir_lower_blend.c | 1 + src/gallium/drivers/vc4/vc4_nir_lower_io.c| 38 -- src/gallium/drivers/vc4/vc4_program.c | 47 +-- 3 files changed, 58 insertions(+), 28 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_blend.c b/src/gallium/drivers/vc4/vc4_nir_lower_blend.c index 0672a92..5bd69fb 100644 --- a/src/gallium/drivers/vc4/vc4_nir_lower_blend.c +++ b/src/gallium/drivers/vc4/vc4_nir_lower_blend.c @@ -49,6 +49,7 @@ vc4_nir_get_dst_color(nir_builder *b) nir_intrinsic_load_input); load->num_components = 1; load->const_index[0] = VC4_NIR_TLB_COLOR_READ_INPUT; +load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0)); nir_ssa_dest_init(>instr, >dest, 1, NULL); nir_builder_instr_insert(b, >instr); return >dest.ssa; diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c b/src/gallium/drivers/vc4/vc4_nir_lower_io.c index e3d018f..c3bafbf 100644 --- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c +++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c @@ -179,6 +179,12 @@ vc4_nir_lower_vertex_attr(struct vc4_compile *c, nir_builder *b, /* All TGSI-to-NIR inputs are vec4. */ assert(intr->num_components == 4); +/* We only accept direct outputs and TGSI only ever gives them to us + * with an offset value of 0. + */ +assert(nir_src_as_const_value(intr->src[0]) && + nir_src_as_const_value(intr->src[0])->u[0] == 0); + /* Generate dword loads for the VPM values (Since these intrinsics may * be reordered, the actual reads will be generated at the top of the * shader by ntq_setup_inputs(). @@ -190,6 +196,7 @@ vc4_nir_lower_vertex_attr(struct vc4_compile *c, nir_builder *b, nir_intrinsic_load_input); intr_comp->num_components = 1; intr_comp->const_index[0] = intr->const_index[0] * 4 + i; +intr_comp->src[0] = nir_src_for_ssa(nir_imm_int(b, 0)); nir_ssa_dest_init(_comp->instr, _comp->dest, 1, NULL); nir_builder_instr_insert(b, _comp->instr); @@ -243,6 +250,12 @@ vc4_nir_lower_fs_input(struct vc4_compile *c, nir_builder *b, /* All TGSI-to-NIR inputs are vec4. */ assert(intr->num_components == 4); +/* We only accept direct inputs and TGSI only ever gives them to us + * with an offset value of 0. + */ +assert(nir_src_as_const_value(intr->src[0]) && + nir_src_as_const_value(intr->src[0])->u[0] == 0); + /* Generate scalar loads equivalent to the original VEC4. */ nir_ssa_def *dests[4]; for (unsigned i = 0; i < intr->num_components; i++) { @@ -250,6 +263,8 @@ vc4_nir_lower_fs_input(struct vc4_compile *c, nir_builder *b, nir_intrinsic_instr_create(c->s, nir_intrinsic_load_input); intr_comp->num_components = 1; intr_comp->const_index[0] = intr->const_index[0] * 4 + i; +intr_comp->src[0] = nir_src_for_ssa(nir_imm_int(b, 0)); + nir_ssa_dest_init(_comp->instr, _comp->dest, 1, NULL); nir_builder_instr_insert(b, _comp->instr); @@ -317,6 +332,12 @@ vc4_nir_lower_output(struct vc4_compile *c, nir_builder *b, /* All TGSI-to-NIR outputs are VEC4. */ assert(intr->num_components == 4); +/* We only accept direct outputs and TGSI only ever gives them to us + * with an offset value of 0. + */ +assert(nir_src_as_const_value(intr->src[1]) && + nir_src_as_const_value(intr->src[1])->u[0] == 0); + b->cursor = nir_before_instr(>instr); for (unsigned i = 0; i < intr->num_components; i++) { @@ -328,6 +349,7 @@ vc4_nir_lower_output(struct vc4_compile *c, nir_builder *b, assert(intr->src[0].is_ssa); intr_comp->src[0] = nir_src_for_ssa(nir_channel(b, intr->src[0].ssa, i)); +intr_comp->src[1] = nir_src_for_ssa(nir_imm_int(b, 0)); nir_builder_instr_insert(b, _comp->instr); } @@ -358,15 +380,12 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, /* Convert the base offset to bytes and add the component */ intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 4); -if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) { -/* Convert the variable TGSI register index to a byte - * offset. - */ -intr_comp->src[0] = -nir_src_for_ssa(nir_ishl(b, - intr->src[0].ssa, -
[Mesa-dev] [PATCH] nv50/ir: check if the target supports the new offset before inlining
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=93300 Signed-off-by: Ilia Mirkin--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 9 ++--- src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 2 ++ .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 15 +++ src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h | 2 ++ 4 files changed, 25 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 1384a25..7b7352e 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -290,20 +290,23 @@ IndirectPropagation::visit(BasicBlock *bb) continue; if (insn->op == OP_ADD && !isFloatType(insn->dType)) { if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) || -!insn->src(1).getImmediate(imm)) +!insn->src(1).getImmediate(imm) || +!targ->insnCanLoadOffset(i, s, imm.reg.data.s32)) continue; i->setIndirect(s, 0, insn->getSrc(0)); i->setSrc(s, cloneShallow(func, i->getSrc(s))); i->src(s).get()->reg.data.offset += imm.reg.data.u32; } else if (insn->op == OP_SUB && !isFloatType(insn->dType)) { if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) || -!insn->src(1).getImmediate(imm)) +!insn->src(1).getImmediate(imm) || +!targ->insnCanLoadOffset(i, s, -imm.reg.data.s32)) continue; i->setIndirect(s, 0, insn->getSrc(0)); i->setSrc(s, cloneShallow(func, i->getSrc(s))); i->src(s).get()->reg.data.offset -= imm.reg.data.u32; } else if (insn->op == OP_MOV) { -if (!insn->src(0).getImmediate(imm)) +if (!insn->src(0).getImmediate(imm) || +!targ->insnCanLoadOffset(i, s, imm.reg.data.s32)) continue; i->setIndirect(s, 0, NULL); i->setSrc(s, cloneShallow(func, i->getSrc(s))); diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h index 4e33997..673f881 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h @@ -191,6 +191,8 @@ public: virtual bool insnCanLoad(const Instruction *insn, int s, const Instruction *ld) const = 0; + virtual bool insnCanLoadOffset(const Instruction *insn, int s, + int offset) const { return true; } virtual bool isOpSupported(operation, DataType) const = 0; virtual bool isAccessSupported(DataFile, DataType) const = 0; virtual bool isModSupported(const Instruction *, diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp index 2d5baed..101082e 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp @@ -385,6 +385,21 @@ TargetNV50::insnCanLoad(const Instruction *i, int s, } bool +TargetNV50::insnCanLoadOffset(const Instruction *i, int s, int offset) const +{ + if (!i->src(s).isIndirect(0)) + return true; + offset += i->src(s).get()->reg.data.offset; + if (i->op == OP_LOAD || i->op == OP_STORE) { + // There are some restrictions in theory, but in practice they're never + // going to be hit. When we enable shared/global memory, this will + // become more important. + return true; + } + return offset >= 0 && offset <= (int32_t)(127 * i->src(s).get()->reg.size); +} + +bool TargetNV50::isAccessSupported(DataFile file, DataType ty) const { if (ty == TYPE_B96 || ty == TYPE_NONE) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h index 0cbf180..00243d7 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h @@ -46,6 +46,8 @@ public: virtual bool insnCanLoad(const Instruction *insn, int s, const Instruction *ld) const; + virtual bool insnCanLoadOffset(const Instruction *insn, int s, + int offset) const; virtual bool isOpSupported(operation, DataType) const; virtual bool isAccessSupported(DataFile, DataType) const; virtual bool isModSupported(const Instruction *, int s, Modifier) const; -- 2.4.10 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: invalidate pipeline status after glUseProgramStages
On Tue, 2015-12-08 at 19:02 +0200, Tapani Pälli wrote: > This will cause validation to run during next draw, this is done > because possible changes in used stages and programs can cause > invalid pipeline state. > > This fixes a subtest in following CTS test: > ES31-CTS.sepshaderobjs.StateInteraction > > Signed-off-by: Tapani PälliReviewed-by: Timothy Arceri Thanks for your patience. If you end up trying to optimise this again feel free to Cc me. Although I do think we will need to add some more corner cases to the piglit tests before doing so. > --- > src/mesa/main/pipelineobj.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/mesa/main/pipelineobj.c > b/src/mesa/main/pipelineobj.c > index 5eda4e5..f2a872d 100644 > --- a/src/mesa/main/pipelineobj.c > +++ b/src/mesa/main/pipelineobj.c > @@ -341,6 +341,8 @@ _mesa_UseProgramStages(GLuint pipeline, > GLbitfield stages, GLuint program) > > if ((stages & GL_COMPUTE_SHADER_BIT) != 0) >_mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, > pipe); > + > + pipe->Validated = false; > } > > /** ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 04/12] nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect
--- src/glsl/nir/glsl_to_nir.cpp | 47 +++- 1 file changed, 7 insertions(+), 40 deletions(-) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index 45d045c..ba23b91 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -853,24 +853,12 @@ nir_visitor::visit(ir_call *ir) ir_constant *write_mask = ((ir_instruction *)param)->as_constant(); assert(write_mask); - /* Check if we need the indirect version */ - ir_constant *const_offset = offset->as_constant(); - if (!const_offset) { -op = nir_intrinsic_store_ssbo_indirect; -ralloc_free(instr); -instr = nir_intrinsic_instr_create(shader, op); -instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset)); -instr->const_index[0] = 0; - } else { -instr->const_index[0] = const_offset->value.u[0]; - } - - instr->const_index[1] = write_mask->value.u[0]; - instr->src[0] = nir_src_for_ssa(evaluate_rvalue(val)); + instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block)); + instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset)); + instr->const_index[0] = write_mask->value.u[0]; instr->num_components = val->type->vector_elements; - instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block)); nir_builder_instr_insert(, >instr); break; } @@ -881,20 +869,8 @@ nir_visitor::visit(ir_call *ir) param = param->get_next(); ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue(); - /* Check if we need the indirect version */ - ir_constant *const_offset = offset->as_constant(); - if (!const_offset) { -op = nir_intrinsic_load_ssbo_indirect; -ralloc_free(instr); -instr = nir_intrinsic_instr_create(shader, op); -instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset)); -instr->const_index[0] = 0; -dest = >dest; - } else { -instr->const_index[0] = const_offset->value.u[0]; - } - instr->src[0] = nir_src_for_ssa(evaluate_rvalue(block)); + instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset)); const glsl_type *type = ir->return_deref->var->type; instr->num_components = type->vector_elements; @@ -1174,20 +1150,11 @@ nir_visitor::visit(ir_expression *ir) /* Some special cases */ switch (ir->operation) { case ir_binop_ubo_load: { - ir_constant *const_index = ir->operands[1]->as_constant(); - - nir_intrinsic_op op; - if (const_index) { - op = nir_intrinsic_load_ubo; - } else { - op = nir_intrinsic_load_ubo_indirect; - } - nir_intrinsic_instr *load = nir_intrinsic_instr_create(this->shader, op); + nir_intrinsic_instr *load = + nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_ubo); load->num_components = ir->type->vector_elements; - load->const_index[0] = const_index ? const_index->value.u[0] : 0; /* base offset */ load->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[0])); - if (!const_index) - load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1])); + load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1])); add_instr(>instr, ir->type->vector_elements); /* -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 02/12] i965/fs_nir: Refactor store_output, load_input, and load_uniform
There was way too much incrementing of things going on. Instead, let's just start everything off at the right base location, and then increment in the loop. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 45 ++-- 1 file changed, 19 insertions(+), 26 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index a00fd0e..3754622 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2304,16 +2304,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , nir_intrinsic_instr *instr assert(instr->const_index[0] % 4 == 0); assert(instr->const_index[1] % 4 == 0); - fs_reg uniform_reg(UNIFORM, instr->const_index[0] / 4); - uniform_reg.reg_offset = instr->const_index[1] / 4; + fs_reg src(UNIFORM, instr->const_index[0] / 4, dest.type); + src.reg_offset = instr->const_index[1] / 4; - for (unsigned j = 0; j < instr->num_components; j++) { - fs_reg src = offset(retype(uniform_reg, dest.type), bld, j); - if (has_indirect) -src.reladdr = new(mem_ctx) fs_reg(get_nir_src(instr->src[0])); + if (has_indirect) + src.reladdr = new(mem_ctx) fs_reg(get_nir_src(instr->src[0])); - bld.MOV(dest, src); - dest = offset(dest, bld, 1); + for (unsigned j = 0; j < instr->num_components; j++) { + bld.MOV(offset(dest, bld, j), offset(src, bld, j)); } break; } @@ -2431,19 +2429,16 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , nir_intrinsic_instr *instr unreachable("Not allowed"); /* fallthrough */ case nir_intrinsic_load_input: { - unsigned index = 0; - for (unsigned j = 0; j < instr->num_components; j++) { - fs_reg src; - if (stage == MESA_SHADER_VERTEX) { -src = offset(fs_reg(ATTR, instr->const_index[0], dest.type), bld, index); - } else { -src = offset(retype(nir_inputs, dest.type), bld, - instr->const_index[0] + index); - } - index++; + fs_reg src; + if (stage == MESA_SHADER_VERTEX) { + src = fs_reg(ATTR, instr->const_index[0], dest.type); + } else { + src = offset(retype(nir_inputs, dest.type), bld, + instr->const_index[0]); + } - bld.MOV(dest, src); - dest = offset(dest, bld, 1); + for (unsigned j = 0; j < instr->num_components; j++) { + bld.MOV(offset(dest, bld, j), offset(src, bld, j)); } break; } @@ -2516,13 +2511,11 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , nir_intrinsic_instr *instr /* fallthrough */ case nir_intrinsic_store_output: { fs_reg src = get_nir_src(instr->src[0]); - unsigned index = 0; + fs_reg new_dest = offset(retype(nir_outputs, src.type), bld, + instr->const_index[0]); + for (unsigned j = 0; j < instr->num_components; j++) { - fs_reg new_dest = offset(retype(nir_outputs, src.type), bld, - instr->const_index[0] + index); - index++; - bld.MOV(new_dest, src); - src = offset(src, bld, 1); + bld.MOV(offset(new_dest, bld, j), offset(src, bld, j)); } break; } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 06/12] i965/fs: Get rid of load/store_foo_indirect
--- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 ++- src/mesa/drivers/dri/i965/brw_nir.c | 45 ++--- 3 files changed, 84 insertions(+), 69 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index bca4589..b55589f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -278,7 +278,7 @@ public: unsigned stream_id); void emit_gs_thread_end(); void emit_gs_input_load(const fs_reg , const nir_src _src, - const fs_reg _offset, unsigned imm_offset, + unsigned base_offset, const nir_src _src, unsigned num_components); void emit_cs_terminate(); fs_reg *emit_cs_local_invocation_id_setup(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 3754622..18c9d65 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1603,29 +1603,31 @@ fs_visitor::emit_gs_vertex(const nir_src _count_nir_src, void fs_visitor::emit_gs_input_load(const fs_reg , const nir_src _src, - const fs_reg _offset, - unsigned imm_offset, + unsigned base_offset, + const nir_src _src, unsigned num_components) { struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) prog_data; + nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); + nir_const_value *offset_const = nir_src_as_const_value(offset_src); + const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; + /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y], * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w]. Only * gl_PointSize is available as a GS input, however, so it must be that. */ - const bool is_point_size = - indirect_offset.file == BAD_FILE && imm_offset == 0; - - nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); - const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; + const bool is_point_size = (base_offset == 0); - if (indirect_offset.file == BAD_FILE && vertex_const != NULL && - 4 * imm_offset < push_reg_count) { - imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count; + if (offset_const != NULL && vertex_const != NULL && + 4 * offset_const->u[0] < push_reg_count) { + int imm_offset = (base_offset + offset_const->u[0]) * 4 + + vertex_const->u[0] * push_reg_count; /* This input was pushed into registers. */ if (is_point_size) { /* gl_PointSize comes in .w */ - bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); + assert(imm_offset == 0); + bld.MOV(dst, fs_reg(ATTR, 3, dst.type)); } else { for (unsigned i = 0; i < num_components; i++) { bld.MOV(offset(dst, bld, i), @@ -1683,21 +1685,21 @@ fs_visitor::emit_gs_input_load(const fs_reg , } fs_inst *inst; - if (indirect_offset.file == BAD_FILE) { + if (offset_const) { /* Constant indexing - use global offset. */ inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle); - inst->offset = imm_offset; + inst->offset = base_offset + offset_const->u[0]; inst->base_mrf = -1; inst->mlen = 1; inst->regs_written = num_components; } else { /* Indirect indexing - use per-slot offsets as well. */ - const fs_reg srcs[] = { icp_handle, indirect_offset }; + const fs_reg srcs[] = { icp_handle, get_nir_src(offset_src) }; fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0); inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload); - inst->offset = imm_offset; + inst->offset = base_offset; inst->base_mrf = -1; inst->mlen = 2; inst->regs_written = num_components; @@ -1763,17 +1765,12 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder , retype(fs_reg(brw_vec8_grf(2, 0)), BRW_REGISTER_TYPE_UD)); break; - case nir_intrinsic_load_input_indirect: case nir_intrinsic_load_input: unreachable("load_input intrinsics are invalid for the GS stage"); - case nir_intrinsic_load_per_vertex_input_indirect: - indirect_offset = retype(get_nir_src(instr->src[1]), BRW_REGISTER_TYPE_D); - /* fallthrough */ case nir_intrinsic_load_per_vertex_input: - emit_gs_input_load(dest, instr->src[0], - indirect_offset, instr->const_index[0], -
[Mesa-dev] [PATCH v3 08/12] tgsi_to_nir: Get rid of load/store_foo_indirect
--- src/gallium/auxiliary/nir/tgsi_to_nir.c | 52 ++--- 1 file changed, 28 insertions(+), 24 deletions(-) diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index 5fef542..5def6d3 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -468,7 +468,7 @@ ttn_emit_immediate(struct ttn_compile *c) nir_builder_instr_insert(b, _const->instr); } -static nir_src +static nir_ssa_def * ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect); /* generate either a constant or indirect deref chain for accessing an @@ -487,7 +487,7 @@ ttn_array_deref(struct ttn_compile *c, nir_intrinsic_instr *instr, if (indirect) { arr->deref_array_type = nir_deref_array_type_indirect; - arr->indirect = ttn_src_for_indirect(c, indirect); + arr->indirect = nir_src_for_ssa(ttn_src_for_indirect(c, indirect)); } else { arr->deref_array_type = nir_deref_array_type_direct; } @@ -586,19 +586,14 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, switch (file) { case TGSI_FILE_INPUT: - op = indirect ? nir_intrinsic_load_input_indirect : - nir_intrinsic_load_input; + op = nir_intrinsic_load_input; assert(!dim); break; case TGSI_FILE_CONSTANT: if (dim) { -op = indirect ? nir_intrinsic_load_ubo_indirect : -nir_intrinsic_load_ubo; -/* convert index from vec4 to byte: */ -index *= 16; +op = nir_intrinsic_load_ubo; } else { -op = indirect ? nir_intrinsic_load_uniform_indirect : -nir_intrinsic_load_uniform; +op = nir_intrinsic_load_uniform; } break; default: @@ -609,7 +604,6 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, load = nir_intrinsic_instr_create(b->shader, op); load->num_components = 4; - load->const_index[0] = index; if (dim) { if (dimind) { load->src[srcn] = @@ -622,17 +616,26 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, } srcn++; } - if (indirect) { - load->src[srcn] = ttn_src_for_indirect(c, indirect); - if (dim) { -assert(load->src[srcn].is_ssa); -/* we also need to covert vec4 to byte here too: */ -load->src[srcn] = - nir_src_for_ssa(nir_ishl(b, load->src[srcn].ssa, -nir_imm_int(b, 4))); + + nir_ssa_def *offset; + if (dim) { + /* UBO loads don't have a const_index[0] base offset. */ + offset = nir_imm_int(b, index); + if (indirect) { +offset = nir_iadd(b, offset, ttn_src_for_indirect(c, indirect)); + } + /* UBO offsets are in bytes, but TGSI gives them to us in vec4's */ + offset = nir_ishl(b, offset, nir_imm_int(b, 4)); + } else { + load->const_index[0] = index; + if (indirect) { +offset = ttn_src_for_indirect(c, indirect); + } else { +offset = nir_imm_int(b, 0); } - srcn++; } + load->src[srcn++] = nir_src_for_ssa(offset); + nir_ssa_dest_init(>instr, >dest, 4, NULL); nir_builder_instr_insert(b, >instr); @@ -648,7 +651,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, return src; } -static nir_src +static nir_ssa_def * ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect) { nir_builder *b = >build; @@ -660,7 +663,7 @@ ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect) indirect->File, indirect->Index, NULL, NULL, NULL); - return nir_src_for_ssa(nir_imov_alu(b, src, 1)); + return nir_imov_alu(b, src, 1); } static nir_alu_dest @@ -729,7 +732,7 @@ ttn_get_dest(struct ttn_compile *c, struct tgsi_full_dst_register *tgsi_fdst) if (tgsi_dst->Indirect && (tgsi_dst->File != TGSI_FILE_TEMPORARY)) { nir_src *indirect = ralloc(c->build.shader, nir_src); - *indirect = ttn_src_for_indirect(c, _fdst->Indirect); + *indirect = nir_src_for_ssa(ttn_src_for_indirect(c, _fdst->Indirect)); dest.dest.reg.indirect = indirect; } @@ -1927,9 +1930,10 @@ ttn_add_output_stores(struct ttn_compile *c) nir_intrinsic_instr_create(b->shader, nir_intrinsic_store_output); unsigned loc = var->data.driver_location + i; store->num_components = 4; - store->const_index[0] = loc; store->src[0].reg.reg = c->output_regs[loc].reg;
[Mesa-dev] [PATCH v3 01/12] vc4: Do all uniform loads with byte offsets
Previously, we lowered direct uniform loads to dword offsets and indirect loads to byte offsets in vc4_nir_lower_io. However, it simplifies things a bit if we just use byte offsets for everything and then divide by 4 when we handle the direct uniform load. --- src/gallium/drivers/vc4/vc4_nir_lower_io.c | 17 + src/gallium/drivers/vc4/vc4_program.c | 11 +++ 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c b/src/gallium/drivers/vc4/vc4_nir_lower_io.c index 1afe52a..e3d018f 100644 --- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c +++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c @@ -338,8 +338,8 @@ static void vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, nir_intrinsic_instr *intr) { -/* All TGSI-to-NIR uniform loads are vec4, but we may create dword - * loads in our lowering passes. +/* All TGSI-to-NIR uniform loads are vec4, but we need byte offsets + * in the backend. */ if (intr->num_components == 1) return; @@ -355,6 +355,9 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, intr_comp->num_components = 1; nir_ssa_dest_init(_comp->instr, _comp->dest, 1, NULL); +/* Convert the base offset to bytes and add the component */ +intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 4); + if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) { /* Convert the variable TGSI register index to a byte * offset. @@ -363,16 +366,6 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, nir_src_for_ssa(nir_ishl(b, intr->src[0].ssa, nir_imm_int(b, 4))); - -/* Convert the offset to be a byte index, too. */ -intr_comp->const_index[0] = (intr->const_index[0] * 16 + - i * 4); -} else { -/* We want a dword index for non-indirect uniform - * loads. - */ -intr_comp->const_index[0] = (intr->const_index[0] * 4 + - i); } dests[i] = _comp->dest.ssa; diff --git a/src/gallium/drivers/vc4/vc4_program.c b/src/gallium/drivers/vc4/vc4_program.c index 081adfd..5ce1143 100644 --- a/src/gallium/drivers/vc4/vc4_program.c +++ b/src/gallium/drivers/vc4/vc4_program.c @@ -1440,6 +1440,7 @@ static void ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr) { const nir_intrinsic_info *info = _intrinsic_infos[instr->intrinsic]; +int dword_offset; struct qreg *dest = NULL; if (info->has_dest) { @@ -1449,11 +1450,13 @@ ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr) switch (instr->intrinsic) { case nir_intrinsic_load_uniform: assert(instr->num_components == 1); -if (instr->const_index[0] < VC4_NIR_STATE_UNIFORM_OFFSET) { -*dest = qir_uniform(c, QUNIFORM_UNIFORM, -instr->const_index[0]); +/* The offset is in bytes, but we need dwords */ +assert(instr->const_index[0] % 4 == 0); +dword_offset = instr->const_index[0] / 4; +if (dword_offset < VC4_NIR_STATE_UNIFORM_OFFSET) { +*dest = qir_uniform(c, QUNIFORM_UNIFORM, dword_offset); } else { -*dest = qir_uniform(c, instr->const_index[0] - +*dest = qir_uniform(c, dword_offset - VC4_NIR_STATE_UNIFORM_OFFSET, 0); } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 09/12] ir3/nir: Use the new unified io intrinsics
--- .../drivers/freedreno/ir3/ir3_compiler_nir.c | 79 +- 1 file changed, 47 insertions(+), 32 deletions(-) diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c index 2723959..eea5c5e 100644 --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c @@ -1218,6 +1218,7 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, nir_intrinsic_instr *intr, { struct ir3_block *b = ctx->block; struct ir3_instruction *addr, *src0, *src1; + nir_const_value *const_offset; /* UBO addresses are the first driver params: */ unsigned ubo = regid(ctx->so->first_driver_param + IR3_UBOS_OFF, 0); unsigned off = intr->const_index[0]; @@ -1231,7 +1232,10 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, nir_intrinsic_instr *intr, addr = create_uniform_indirect(ctx, ubo, get_addr(ctx, src0)); } - if (intr->intrinsic == nir_intrinsic_load_ubo_indirect) { + const_offset = nir_src_as_const_value(intr->src[1]); + if (const_offset) { + off += const_offset->u[0]; + } else { /* For load_ubo_indirect, second src is indirect offset: */ src1 = get_src(ctx, >src[1])[0]; @@ -1394,6 +1398,7 @@ emit_intrinisic(struct ir3_compile *ctx, nir_intrinsic_instr *intr) struct ir3_instruction **dst, **src; struct ir3_block *b = ctx->block; unsigned idx = intr->const_index[0]; + nir_const_value *const_offset; if (info->has_dest) { dst = get_dst(ctx, >dest, intr->num_components); @@ -1403,43 +1408,49 @@ emit_intrinisic(struct ir3_compile *ctx, nir_intrinsic_instr *intr) switch (intr->intrinsic) { case nir_intrinsic_load_uniform: - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = create_uniform(ctx, n); - } - break; - case nir_intrinsic_load_uniform_indirect: - src = get_src(ctx, >src[0]); - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = create_uniform_indirect(ctx, n, - get_addr(ctx, src[0])); + const_offset = nir_src_as_const_value(intr->src[0]); + if (const_offset) { + idx += const_offset->u[0]; + for (int i = 0; i < intr->num_components; i++) { + unsigned n = idx * 4 + i; + dst[i] = create_uniform(ctx, n); + } + } else { + src = get_src(ctx, >src[0]); + for (int i = 0; i < intr->num_components; i++) { + unsigned n = idx * 4 + i; + dst[i] = create_uniform_indirect(ctx, n, + get_addr(ctx, src[0])); + } + /* NOTE: if relative addressing is used, we set +* constlen in the compiler (to worst-case value) +* since we don't know in the assembler what the max +* addr reg value can be: +*/ + ctx->so->constlen = ctx->s->num_uniforms; } - /* NOTE: if relative addressing is used, we set constlen in -* the compiler (to worst-case value) since we don't know in -* the assembler what the max addr reg value can be: -*/ - ctx->so->constlen = ctx->s->num_uniforms; break; case nir_intrinsic_load_ubo: - case nir_intrinsic_load_ubo_indirect: emit_intrinsic_load_ubo(ctx, intr, dst); break; case nir_intrinsic_load_input: - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = ctx->ir->inputs[n]; - } - break; - case nir_intrinsic_load_input_indirect: - src = get_src(ctx, >src[0]); - struct ir3_instruction *collect = - create_collect(b, ctx->ir->inputs, ctx->ir->ninputs); - struct ir3_instruction *addr = get_addr(ctx, src[0]); - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = create_indirect_load(ctx, ctx->ir->ninputs, - n, addr, collect); + const_offset = nir_src_as_const_value(intr->src[0]); + if (const_offset) { + idx +=
[Mesa-dev] [PATCH v3 12/12] nir/lower_two_sided_color: Update to the new load intrinsic
--- src/glsl/nir/nir_lower_two_sided_color.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/glsl/nir/nir_lower_two_sided_color.c b/src/glsl/nir/nir_lower_two_sided_color.c index 6995b9d..7df12e0 100644 --- a/src/glsl/nir/nir_lower_two_sided_color.c +++ b/src/glsl/nir/nir_lower_two_sided_color.c @@ -73,6 +73,7 @@ load_input(nir_builder *b, nir_variable *in) load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_input); load->num_components = 4; load->const_index[0] = in->data.driver_location; + load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0)); nir_ssa_dest_init(>instr, >dest, 4, NULL); nir_builder_instr_insert(b, >instr); @@ -151,6 +152,7 @@ nir_lower_two_sided_color_block(nir_block *block, void *void_state) unsigned drvloc = state->colors[idx].front->data.driver_location; if (intr->const_index[0] == drvloc) { +assert(nir_src_as_const_value(intr->src[0])); break; } } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics
There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the mean time, having the *_indirect variants adds special cases a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convdert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the *_indirect variants entirely. --- src/glsl/nir/nir_intrinsics.h | 81 + src/glsl/nir/nir_lower_phis_to_scalar.c | 4 -- src/glsl/nir/nir_print.c| 9 +--- 3 files changed, 42 insertions(+), 52 deletions(-) diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h index b2565c5..63df21e 100644 --- a/src/glsl/nir/nir_intrinsics.h +++ b/src/glsl/nir/nir_intrinsics.h @@ -228,54 +228,55 @@ SYSTEM_VALUE(num_work_groups, 3, 0) SYSTEM_VALUE(helper_invocation, 1, 0) /* - * The format of the indices depends on the type of the load. For uniforms, - * the first index is the base address and the second index is an offset that - * should be added to the base address. (This way you can determine in the - * back-end which variable is being accessed even in an array.) For inputs, - * the one and only index corresponds to the attribute slot. UBO loads also - * have a single index which is the base address to load from. + * Load operations pull data from some piece of GPU memory. All load + * operations operate in terms of offsets into some piece of theoretical + * memory. Loads from externally visible memory (UBO and SSBO) simply take a + * byte offset as a source. Loads from opaque memory (uniforms, inputs, etc.) + * take a base+offset pair where the base (const_index[0]) gives the location + * of the start of the variable being loaded and and the offset source is a + * offset into that variable. * - * UBO loads have a (possibly constant) source which is the UBO buffer index. - * For each type of load, the _indirect variant has one additional source - * (the second in the case of UBO's) that is the is an indirect to be added to - * the constant address or base offset to compute the final offset. + * Some load operations such as UBO/SSBO load and per_vertex loads take an + * additional source to specify which UBO/SSBO/vertex to load from. * - * For vector backends, the address is in terms of one vec4, and so each array - * element is +4 scalar components from the previous array element. For scalar - * backends, the address is in terms of a single 4-byte float/int and arrays - * elements begin immediately after the previous array element. + * The exact address type depends on the lowering pass that generates the + * load/store intrinsics. Typically, this is vec4 units for things such as + * varying slots and float units for fragment shader inputs. UBO and SSBO + * offsets are always in bytes. */ -#define LOAD(name, extra_srcs, indices, flags) \ - INTRINSIC(load_##name, extra_srcs, ARR(1), true, 0, 0, indices, flags) \ - INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \ - true, 0, 0, indices, flags) +#define LOAD(name, srcs, indices, flags) \ + INTRINSIC(load_##name, srcs, ARR(1, 1, 1, 1), true, 0, 0, indices, flags) -LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) -LOAD(output, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE) -LOAD(per_vertex_output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) +/* src[] = { offset }. const_index[] = { base } */ +LOAD(uniform, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { buffer_index, offset }. No const_index */ +LOAD(ubo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { offset }. const_index[0] = { base } */ +LOAD(input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { vertex, offset }. const_index[] = { base } */ +LOAD(per_vertex_input, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { buffer_index, offset }. No const_index */ +LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE) +/* src[] = { offset }. const_index[] = { base } */ +LOAD(output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) +/* src[] = { vertex, offset }. const_index[] = { base } */ +LOAD(per_vertex_output, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE) /* - * Stores work the same way as loads, except now the first register input is - * the value or array to store and the optional second input is the indirect - * offset. SSBO stores are similar, but they accept
[Mesa-dev] [PATCH v3 05/12] nir/lower_io: Get rid of load/store_foo_indirect
--- src/glsl/nir/nir.h | 2 +- src/glsl/nir/nir_lower_io.c | 111 +--- 2 files changed, 44 insertions(+), 69 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index e161b70..2e72e66 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1969,7 +1969,7 @@ void nir_assign_var_locations(struct exec_list *var_list, void nir_lower_io(nir_shader *shader, nir_variable_mode mode, int (*type_size)(const struct glsl_type *)); -nir_src *nir_get_io_indirect_src(nir_intrinsic_instr *instr); +nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr); nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr); void nir_lower_vars_to_ssa(nir_shader *shader); diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c index f64ac69..a2723d5 100644 --- a/src/glsl/nir/nir_lower_io.c +++ b/src/glsl/nir/nir_lower_io.c @@ -86,10 +86,9 @@ is_per_vertex_output(struct lower_io_state *state, nir_variable *var) stage == MESA_SHADER_TESS_CTRL; } -static unsigned +static nir_ssa_def * get_io_offset(nir_builder *b, nir_deref_var *deref, nir_ssa_def **vertex_index, - nir_ssa_def **out_indirect, int (*type_size)(const struct glsl_type *)) { nir_deref *tail = >deref; @@ -109,8 +108,8 @@ get_io_offset(nir_builder *b, nir_deref_var *deref, *vertex_index = vtx; } - nir_ssa_def *indirect = NULL; - unsigned base_offset = 0; + /* Just emit code and let constant-folding go to town */ + nir_ssa_def *offset = nir_imm_int(b, 0); while (tail->child != NULL) { const struct glsl_type *parent_type = tail->type; @@ -120,55 +119,46 @@ get_io_offset(nir_builder *b, nir_deref_var *deref, nir_deref_array *deref_array = nir_deref_as_array(tail); unsigned size = type_size(tail->type); - base_offset += size * deref_array->base_offset; + offset = nir_iadd(b, offset, + nir_imm_int(b, size * deref_array->base_offset)); if (deref_array->deref_array_type == nir_deref_array_type_indirect) { nir_ssa_def *mul = nir_imul(b, nir_imm_int(b, size), nir_ssa_for_src(b, deref_array->indirect, 1)); -indirect = indirect ? nir_iadd(b, indirect, mul) : mul; +offset = nir_iadd(b, offset, mul); } } else if (tail->deref_type == nir_deref_type_struct) { nir_deref_struct *deref_struct = nir_deref_as_struct(tail); + unsigned field_offset = 0; for (unsigned i = 0; i < deref_struct->index; i++) { -base_offset += type_size(glsl_get_struct_field(parent_type, i)); +field_offset += type_size(glsl_get_struct_field(parent_type, i)); } + offset = nir_iadd(b, offset, nir_imm_int(b, field_offset)); } } - *out_indirect = indirect; - return base_offset; + return offset; } static nir_intrinsic_op load_op(struct lower_io_state *state, -nir_variable_mode mode, bool per_vertex, bool has_indirect) +nir_variable_mode mode, bool per_vertex) { nir_intrinsic_op op; switch (mode) { case nir_var_shader_in: - if (per_vertex) { - op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect : - nir_intrinsic_load_per_vertex_input; - } else { - op = has_indirect ? nir_intrinsic_load_input_indirect : - nir_intrinsic_load_input; - } + op = per_vertex ? nir_intrinsic_load_per_vertex_input : +nir_intrinsic_load_input; break; case nir_var_shader_out: - if (per_vertex) { - op = has_indirect ? nir_intrinsic_load_per_vertex_output_indirect : - nir_intrinsic_load_per_vertex_output; - } else { - op = has_indirect ? nir_intrinsic_load_output_indirect : - nir_intrinsic_load_output; - } + op = per_vertex ? nir_intrinsic_load_per_vertex_output : +nir_intrinsic_load_output; break; case nir_var_uniform: - op = has_indirect ? nir_intrinsic_load_uniform_indirect : - nir_intrinsic_load_uniform; + op = nir_intrinsic_load_uniform; break; default: unreachable("Unknown variable mode"); @@ -211,32 +201,25 @@ nir_lower_io_block(nir_block *block, void *void_state) is_per_vertex_input(state, intrin->variables[0]->var) || is_per_vertex_output(state, intrin->variables[0]->var); - nir_ssa_def *indirect; + nir_ssa_def *offset; nir_ssa_def *vertex_index; - unsigned offset = get_io_offset(b, intrin->variables[0], - per_vertex ? _index : NULL, - , state->type_size); +
[Mesa-dev] [PATCH v3 07/12] i965/vec4: Get rid of load/store_foo_indirect
--- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 7 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 95 +++ 2 files changed, 43 insertions(+), 59 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp index e51ef4b..6f66978 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp @@ -60,19 +60,19 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) src_reg src; switch (instr->intrinsic) { - case nir_intrinsic_load_per_vertex_input_indirect: - assert(!"EmitNoIndirectInput should prevent this."); case nir_intrinsic_load_per_vertex_input: { /* The EmitNoIndirectInput flag guarantees our vertex index will * be constant. We should handle indirects someday. */ nir_const_value *vertex = nir_src_as_const_value(instr->src[0]); + nir_const_value *offset = nir_src_as_const_value(instr->src[1]); /* Make up a type...we have no way of knowing... */ const glsl_type *const type = glsl_type::ivec(instr->num_components); src = src_reg(ATTR, BRW_VARYING_SLOT_COUNT * vertex->u[0] + - instr->const_index[0], type); + instr->const_index[0] + offset->u[0], +type); dest = get_nir_dest(instr->dest, src.type); dest.writemask = brw_writemask_for_size(instr->num_components); emit(MOV(dest, src)); @@ -80,7 +80,6 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) } case nir_intrinsic_load_input: - case nir_intrinsic_load_input_indirect: unreachable("nir_lower_io should have produced per_vertex intrinsics"); case nir_intrinsic_emit_vertex_with_counter: { diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 50570cd..f965b39 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -369,22 +369,17 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) dst_reg dest; src_reg src; - bool has_indirect = false; - switch (instr->intrinsic) { - case nir_intrinsic_load_input_indirect: - has_indirect = true; - /* fallthrough */ case nir_intrinsic_load_input: { - int offset = instr->const_index[0]; - src = src_reg(ATTR, offset, glsl_type::uvec4_type); + nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]); + + /* We set EmitNoIndirectInput for VS */ + assert(const_offset); + + src = src_reg(ATTR, instr->const_index[0] + const_offset->u[0], +glsl_type::uvec4_type); - if (has_indirect) { - dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[0], - BRW_REGISTER_TYPE_D, - 1)); - } dest = get_nir_dest(instr->dest, src.type); dest.writemask = brw_writemask_for_size(instr->num_components); @@ -392,11 +387,11 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) break; } - case nir_intrinsic_store_output_indirect: - unreachable("nir_lower_outputs_to_temporaries should prevent this"); - case nir_intrinsic_store_output: { - int varying = instr->const_index[0]; + nir_const_value *const_offset = nir_src_as_const_value(instr->src[1]); + assert(const_offset); + + int varying = instr->const_index[0] + const_offset->u[0]; src = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_F, instr->num_components); @@ -431,9 +426,6 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) break; } - case nir_intrinsic_store_ssbo_indirect: - has_indirect = true; - /* fallthrough */ case nir_intrinsic_store_ssbo: { assert(devinfo->gen >= 7); @@ -458,20 +450,19 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) } /* Offset */ - src_reg offset_reg = src_reg(this, glsl_type::uint_type); - unsigned const_offset_bytes = 0; - if (has_indirect) { - emit(MOV(dst_reg(offset_reg), get_nir_src(instr->src[2], 1))); + src_reg offset_reg; + nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]); + if (const_offset) { + offset_reg = brw_imm_ud(const_offset->u[0]); } else { - const_offset_bytes = instr->const_index[0]; - emit(MOV(dst_reg(offset_reg), brw_imm_ud(const_offset_bytes))); + offset_reg = get_nir_src(instr->src[2], 1); } /* Value */ src_reg val_reg = get_nir_src(instr->src[0], 4); /* Writemask */ - unsigned write_mask = instr->const_index[1]; + unsigned write_mask = instr->const_index[0]; /* IvyBridge does not have a native SIMD4x2 untyped write
[Mesa-dev] [PATCH v3 11/12] nir/lower_clip: Update to the new load/store intrinsics
--- src/glsl/nir/nir_lower_clip.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/glsl/nir/nir_lower_clip.c b/src/glsl/nir/nir_lower_clip.c index c58c778..e2a2bb6 100644 --- a/src/glsl/nir/nir_lower_clip.c +++ b/src/glsl/nir/nir_lower_clip.c @@ -74,6 +74,7 @@ store_clipdist_output(nir_builder *b, nir_variable *out, nir_ssa_def **val) store->const_index[0] = out->data.driver_location; store->src[0].ssa = nir_vec4(b, val[0], val[1], val[2], val[3]); store->src[0].is_ssa = true; + store->src[1] = nir_src_for_ssa(nir_imm_int(b, 0)); nir_builder_instr_insert(b, >instr); } @@ -85,6 +86,7 @@ load_clipdist_input(nir_builder *b, nir_variable *in, nir_ssa_def **val) load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_input); load->num_components = 4; load->const_index[0] = in->data.driver_location; + load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0)); nir_ssa_dest_init(>instr, >dest, 4, NULL); nir_builder_instr_insert(b, >instr); @@ -112,6 +114,7 @@ find_output_in_block(nir_block *block, void *void_state) intr->const_index[0] == state->drvloc) { assert(state->def == NULL); assert(intr->src[0].is_ssa); +assert(nir_src_as_const_value(intr->src[1])); state->def = intr->src[0].ssa; #if !defined(DEBUG) -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 1626] X server should not poll() on DRM fd
https://bugs.freedesktop.org/show_bug.cgi?id=1626 --- Comment #3 from Alex Perez--- The last comment on this bug was eight and a half years ago...the bug itself was filed back in 2004. Probably close as WONTFIX, if it's even relevant any longer? -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev