date:20151208

Re: [Mesa-dev] Mesa 11.1.0 release candidate 3

2015-12-08 Thread Emil Velikov

On 8 December 2015 at 04:33, Timothy Arceri
 wrote:
>
> On Mon, 2015-12-07 at 14:05 +, Emil Velikov wrote:
>> The third release candidate for Mesa 11.1.0 is now available.
>>
>
> Hi Emil,
>
> 76c09c1792ff1209bd34e1ae0e17b9c4929a892f
> 618612f867ed8140409796535314d9b9ed58440b
> 8117f46f496fb31339fc97a2501d5b3325a1fefb
> 2ab9cd0c4dcefb3e63266cadc1e06079e67c3962
>
> I sent to the stable list after commiting them to master. To make thing
> s worse they got stuck in moderation as my collabora address wasn't
> signed up to the stable list.
>
> These fix some rather important issues for SSO and would be good to get
> into the 11.1 release.
>
> Let me know if I should resend them to stable or if anything else is
> required.
>
Everything's fine Tim. I've kicked the mesa-stable queue a few days
ago and these patches are on the list. I'll make sure I double-check
for late nominations.

Thanks for the reminder.
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 3/3] mesa: do not fail when stage not linked is not in use

2015-12-08 Thread Tapani Pälli

Shader linking may fail but if glUseProgramStages does not take shader
in use, previously set stages to pipeline should continue to work.

This fixes a subtest in following CTS test:
ES31-CTS.sepshaderobjs.StateInteraction

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/context.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index d73c984..df2c41c 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1977,6 +1977,24 @@ _mesa_valid_to_render(struct gl_context *ctx, const char 
*where)
   bool from_glsl_shader[MESA_SHADER_COMPUTE] = { false };
 
   for (i = 0; i < MESA_SHADER_COMPUTE; i++) {
+
+ /* If stage has not marked as active in the pipeline, it does not
+  * matter if linking this shader failed. If stage has been set as used
+  * it should have ValidProgramUse set.
+  *
+  * OpenGL ES 3.1 spec (7.3 Program objects):
+  *
+  * "If a program object that is active for any shader stage is
+  * re-linked unsuccessfully, the link status will be set to FALSE,
+  * but any existing executables and associated state will remain
+  * part of the current rendering state until a subsequent call to
+  * UseProgram, UseProgramStages, or BindProgramPipeline removes
+  * them from use.
+  */
+ if (ctx->_Shader->CurrentProgram[i] &&
+ !ctx->_Shader->CurrentProgram[i]->ValidProgramUse)
+continue;
+
  if (!shader_linked_or_absent(ctx, ctx->_Shader->CurrentProgram[i],
   _glsl_shader[i], where))
 return GL_FALSE;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline

2015-12-08 Thread Tapani Pälli

Validation checks that we do not have active shader stages that are not
used by the pipeline.

This fixes a subtest in following CTS test:
ES31-CTS.sepshaderobjs.StateInteraction

v2: move as generic validation check for both ES and desktop (Timothy)

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/context.c |  9 +
 src/mesa/main/pipelineobj.c | 35 +++
 src/mesa/main/pipelineobj.h |  2 ++
 3 files changed, 46 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index be983d4..d73c984 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx, const char 
*where)
   }
}
 
+   /* If SSO in use, validate that all linked program stages are used. */
+   if (ctx->Pipeline.Current) {
+  if (!_mesa_active_program_stages_used(ctx->Pipeline.Current)) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "Shader program active for shader stages that "
+ "are not used by the pipeline");
+  }
+   }
+
/* If a program is active and SSO not in use, check if validation of
 * samplers succeeded for the active program. */
if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx->Pipeline.Current) {
diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index f4ce3ed..9067551 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const struct 
gl_pipeline_object *pipe)
return false;
 }
 
+bool
+_mesa_active_program_stages_used(struct gl_pipeline_object *pipe)
+{
+   unsigned i;
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (!pipe->CurrentProgram[i])
+ continue;
+  /* If program has linked but not used by pipeline. */
+  if (pipe->CurrentProgram[i]->LinkStatus &&
+  !pipe->CurrentProgram[i]->ValidProgramUse)
+ return false;
+   }
+   return true;
+}
+
 extern GLboolean
 _mesa_validate_program_pipeline(struct gl_context* ctx,
 struct gl_pipeline_object *pipe)
@@ -839,6 +854,26 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
   return GL_FALSE;
}
 
+   /* Section 11.1.3.11 (Validation) of the OpenGL 4.5 and OpenGL ES 3.1
+* spec say:
+*
+* "An INVALID_OPERATION error is generated by any command that trans-
+* fers vertices to the GL or launches compute work if the current set
+* of active program objects cannot be executed, for reasons including:
+*
+* ...
+*
+* "A program object is active for at least one, but not all of
+* the shader stages that were present when the program was linked."
+*/
+   if (!_mesa_active_program_stages_used(pipe)) {
+  pipe->InfoLog =
+ ralloc_strdup(pipe,
+   "Shader program active for shader stages that "
+   "are not used by the pipeline");
+  return GL_FALSE;
+   }
+
/* Section 2.11.11 (Shader Execution), subheading "Validation," of the
 * OpenGL 4.1 spec says:
 *
diff --git a/src/mesa/main/pipelineobj.h b/src/mesa/main/pipelineobj.h
index fbcb765..e56b522 100644
--- a/src/mesa/main/pipelineobj.h
+++ b/src/mesa/main/pipelineobj.h
@@ -70,6 +70,8 @@ extern GLboolean
 _mesa_validate_program_pipeline(struct gl_context * ctx,
 struct gl_pipeline_object *pipe);
 
+extern bool
+_mesa_active_program_stages_used(struct gl_pipeline_object *pipe);
 
 extern void GLAPIENTRY
 _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program);
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 1/3] mesa: add a new flag to track program usage in a pipeline

2015-12-08 Thread Tapani Pälli

Patch adds ValidProgramUse flag to gl_shader_program that can be used
to track if a program is fully utilized by program pipeline. Flag is
set by glUseProgramStages and will be used for pipeline validation by
the next patch.

Signed-off-by: Tapani Pälli 
Suggested-by: Timothy Arceri 
---
 src/glsl/linker.cpp |  1 +
 src/mesa/main/mtypes.h  |  1 +
 src/mesa/main/pipelineobj.c | 27 +++
 3 files changed, 29 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index ae628cd..7e0af44 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -4062,6 +4062,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
prog->LinkStatus = true; /* All error paths will set this to false */
prog->Validated = false;
prog->_Used = false;
+   prog->ValidProgramUse = false;
 
prog->ARB_fragment_coord_conventions_enable = false;
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1eb1e21..770055f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2746,6 +2746,7 @@ struct gl_shader_program
GLboolean Validated;
GLboolean _Used;/**< Ever used for drawing? */
GLboolean SamplersValidated; /**< Samplers validated against texture units? 
*/
+   GLboolean ValidProgramUse; /**< SSO, is every stage used by the pipeline? */
GLchar *InfoLog;
 
unsigned Version;   /**< GLSL version used for linking */
diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 5eda4e5..f4ce3ed 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -219,6 +219,31 @@ _mesa_reference_pipeline_object_(struct gl_context *ctx,
 }
 
 /**
+ * Raise a flag if shader programs in pipeline contain
+ * stages that are not used. This is used by validation later.
+ */
+static void
+update_program_usage(struct gl_pipeline_object *pipe)
+{
+   unsigned i, j;
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_shader_program *shProg = pipe->CurrentProgram[i];
+  if (!shProg)
+ continue;
+  shProg->ValidProgramUse = true;
+  for (j = 0; j < MESA_SHADER_STAGES; j++) {
+ if (!shProg->_LinkedShaders[j])
+continue;
+ /* Make sure that any shader stage is also used in the pipeline. */
+ if (shProg != pipe->CurrentProgram[j]) {
+shProg->ValidProgramUse = false;
+return;
+ }
+  }
+   }
+}
+
+/**
  * Bound program to severals stages of the pipeline
  */
 void GLAPIENTRY
@@ -341,6 +366,8 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, 
GLuint program)
 
if ((stages & GL_COMPUTE_SHADER_BIT) != 0)
   _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe);
+
+   update_program_usage(pipe);
 }
 
 /**
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 0/3] more pipeline validation

2015-12-08 Thread Tapani Pälli

Hi;

I marked this as v2 as this is alternative approach to patch series
I sent ealier:

http://lists.freedesktop.org/archives/mesa-dev/2015-December/102548.html

I took then direction suggested by Timothy which makes validation step
simpler. These patches fix 2 ES31-CTS.sepshaderobjs.StateInteraction
cases leaving only interface matching part which I will tackle next.

Tapani Pälli (3):
  mesa: add a new flag to track program usage in a pipeline
  mesa: validate program usage in a pipeline
  mesa: do not fail when stage not linked is not in use

 src/glsl/linker.cpp |  1 +
 src/mesa/main/context.c | 27 
 src/mesa/main/mtypes.h  |  1 +
 src/mesa/main/pipelineobj.c | 62 +
 src/mesa/main/pipelineobj.h |  2 ++
 5 files changed, 93 insertions(+)

-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Add defines for gather push constants

2015-12-08 Thread Abdiel Janulgue



On 12/07/2015 04:45 PM, Ilia Mirkin wrote:
> On Mon, Dec 7, 2015 at 7:39 AM, Abdiel Janulgue
>  wrote:
>> v2 (Francisco Jerez):
>>- Rename HSW_GATHER_CONSTANTS_RESERVED to 
>> HSW_GATHER_POOL_ALLOC_MUST_BE_ONE.
>>- Rename BRW_GATHER_* prefix to HSW_GATHER_CONSTANTS_*.
>>
>> Signed-off-by: Abdiel Janulgue 
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h | 19 +++
>>  1 file changed, 19 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index a511d5c..f997cf9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -2565,6 +2565,25 @@ enum brw_wm_barycentric_interp_mode {
>>  #define _3DSTATE_CONSTANT_HS  0x7819 /* GEN7+ */
>>  #define _3DSTATE_CONSTANT_DS  0x781A /* GEN7+ */
>>
>> +/* Resource streamer gather constants */
>> +#define HSW_GATHER_POOL_ALLOC_MUST_BE_ONE (3 << 4) /* GEN7.5 only */
>> +#define _3DSTATE_GATHER_POOL_ALLOC0x791A /* GEN7.5+ */
>> +#define _3DSTATE_GATHER_CONSTANT_VS   0x7834
>> +#define _3DSTATE_GATHER_CONSTANT_GS   0x7835
>> +#define _3DSTATE_GATHER_CONSTANT_HS   0x7836
>> +#define _3DSTATE_GATHER_CONSTANT_DS   0x7837
>> +#define _3DSTATE_GATHER_CONSTANT_PS   0x7838
>> +
>> +#define HSW_GATHER_CONSTANTS_ENABLE   (1 << 11) /* GEN7.5+ */
>> +#define HSW_GATHER_CONSTANTS_BUFFER_VALID_SHIFT 16
>> +#define HSW_GATHER_CONSTANTS_BUFFER_VALID_MASK  INTEL_MASK(31, 16)
>> +#define HSW_GATHER_CONSTANTS_BINDING_TABLE_BLOCK_SHIFT  12
>> +#define HSW_GATHER_CONSTANTS_BINDING_TABLE_BLOCK_MASK   INTEL_MASK(15, 12)
>> +#define HSW_GATHER_CONSTANTS_CONST_BUFFER_OFFSET_SHIFT  8
>> +#define HSW_GATHER_CONSTANTS_CONST_BUFFER_OFFSET_MASK   INTEL_MASK(15, 8)
> 
> I mentioned this on IRC, but shouldn't this one have been
> INTEL_MASK(11, 8)? Seems unlikely that there are overlapping
> bitfields...

Actually, the CONST_BUFFER_OFFSET field is part of the "gather constant
entry" fields which occupies separate dword blocks from the
BINDING_TABLE_BLOCK field.

-Abdiel

> 
>   -ilia
> 
>> +#define HSW_GATHER_CONSTANTS_CHANNEL_MASK_SHIFT 4
>> +#define HSW_GATHER_CONSTANTS_CHANNEL_MASK_MASK  INTEL_MASK(7, 4)
>> +
>>  #define _3DSTATE_STREAMOUT0x781e /* GEN7+ */
>>  /* DW1 */
>>  # define SO_FUNCTION_ENABLE(1 << 31)
>> --
>> 1.9.1
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline

2015-12-08 Thread Tapani Pälli


On 12/08/2015 01:31 PM, Tapani Pälli wrote:

On 12/08/2015 12:57 PM, Timothy Arceri wrote:

On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote:

Validation checks that we do not have active shader stages that are
not
used by the pipeline.

This fixes a subtest in following CTS test:
ES31-CTS.sepshaderobjs.StateInteraction

v2: move as generic validation check for both ES and desktop
(Timothy)

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/context.c |  9 +
  src/mesa/main/pipelineobj.c | 35 +++
  src/mesa/main/pipelineobj.h |  2 ++
  3 files changed, 46 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index be983d4..d73c984 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx,
const char *where)
}
 }
  +   /* If SSO in use, validate that all linked program stages are
used. */
+   if (ctx->Pipeline.Current) {
+  if (!_mesa_active_program_stages_used(ctx->Pipeline.Current))
{
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "Shader program active for shader stages that "
+ "are not used by the pipeline");

You can remove this now. The call to _mesa_validate_program_pipeline()
below should cause the error to be thrown for us when it fails.



Oops forgot this one. This is necessary to be run as the full validation 
happens only if pipeline was not validated before.



+  }
+   }
+
 /* If a program is active and SSO not in use, check if validation
of
  * samplers succeeded for the active program. */
 if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx
->Pipeline.Current) {
diff --git a/src/mesa/main/pipelineobj.c
b/src/mesa/main/pipelineobj.c
index f4ce3ed..9067551 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const
struct gl_pipeline_object *pipe)
 return false;
  }
  +bool
+_mesa_active_program_stages_used(struct gl_pipeline_object *pipe)
+{
+   unsigned i;
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (!pipe->CurrentProgram[i])
+ continue;
+  /* If program has linked but not used by pipeline. */
+  if (pipe->CurrentProgram[i]->LinkStatus &&
+  !pipe->CurrentProgram[i]->ValidProgramUse)

Two things here first if ValidProgramUse is false we need to redo the
validation to be sure it really is an invalid use as its possible that
the program was relinked since the UseProgramStages call and is now
valid.
I'm not sure about this. This feels like the ValidProgramUse would be 
then unnecessary as we would anyway need to iterate through everything 
(just because of possible relink)?



Second are you sure about the LinkStatus check? Won't this cause
validation to pass when it should fail if the program was linked
successfully but ValidProgramUse is false, and we then do a relink that
fails? As the program will stay in the pipeline.


IMO we need to check that program has been linked, otherwise it should 
not be considered active at all. UseProgramStages will need to be 
called first to have actual value in ValidProgramUse and that cannot 
be called with program that has not been succesfully linked.




Make that three things. It seems there is already code that should be
doing this validation already. The first validation test in
_mesa_validate_program_pipeline() calls program_stages_all_active() in
a loop which should fail for this case. The code pretty much does what
I was suggesting only it always does the validation, there is no flag
to reduce calls. I guess it really shouldn't matter a huge deal since
its only done once when somethings relinked or a sampler changes.


Yeah it seems, I did not realize it's trying to test this same thing 
as it did not catch this but then again I guess it's not called at all 
during draw time currently.



 From my quick testing on the CTS test the program in the pipeline only
has a single shader when it should have two so the test fails with this
existing code. As per my first point I believe we always need to


The test in question is "Separable program with both vertex and 
fragment shaders attached to only one stage". It fails because there 
is no existing check run at draw time for the case where you enable 
one stage but supply a program with multiple stages.



revalidate here to be sure that it really is an invalid use of the
program so what we really need to do is track down why there is only a
single shader in the program when it seems there should be two. If we
can resolve that we might not need to rework this code unless there
really is a performance hit.



+ return false;
+   }
+   return true;
+}
+
  extern GLboolean
  _mesa_validate_program_pipeline(struct gl_context* ctx,
  struct gl_pipeline_object *pipe)
@@ -839,6 +854,26 @@

Re: [Mesa-dev] [PATCH 2/2] r600: Add ARB_copy_image support

2015-12-08 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Dec 7, 2015 at 2:44 PM, Edward O'Callaghan
 wrote:
> Signed-off-by: Edward O'Callaghan 
> ---
>  docs/GL3.txt | 2 +-
>  src/gallium/drivers/r600/r600_pipe.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 50b429c..f564263 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -153,7 +153,7 @@ GL 4.3, GLSL 4.30:
>GL_ARB_ES3_compatibility DONE (all drivers 
> that support GLSL 3.30)
>GL_ARB_clear_buffer_object   DONE (all drivers)
>GL_ARB_compute_shaderin progress (jljusten)
> -  GL_ARB_copy_imageDONE (i965, nv50, 
> nvc0, radeonsi)
> +  GL_ARB_copy_imageDONE (i965, nv50, 
> nvc0, r600, radeonsi)
>GL_KHR_debug DONE (all drivers)
>GL_ARB_explicit_uniform_location DONE (all drivers 
> that support GLSL)
>GL_ARB_fragment_layer_viewport   DONE (i965, nv50, 
> nvc0, r600, radeonsi, llvmpipe)
> diff --git a/src/gallium/drivers/r600/r600_pipe.c 
> b/src/gallium/drivers/r600/r600_pipe.c
> index eb2ec73..ba5d9be 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -277,6 +277,7 @@ static int r600_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_TEXTURE_FLOAT_LINEAR:
> case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR:
> case PIPE_CAP_TGSI_TXQS:
> +   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
> return 1;
>
> case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
> @@ -346,7 +347,6 @@ static int r600_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_DEPTH_BOUNDS_TEST:
> case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
> case PIPE_CAP_SHAREABLE_SHADERS:
> -   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
> case PIPE_CAP_CLEAR_TEXTURE:
> return 0;
>
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 93261] GL/DRI3 over ssh broken

2015-12-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93261

--- Comment #1 from Martin Peres  ---
I can reproduce the issue, I will investigate this issue this week!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] gallium: add GREMEDY_string_marker

2015-12-08 Thread Emil Velikov

On 7 December 2015 at 18:55, Rob Clark  wrote:
> On Mon, Dec 7, 2015 at 1:42 PM, Emil Velikov  wrote:
>> On 7 December 2015 at 16:45, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> Only exposed w/ ST_DEBUG=gremedy.
>>>
>> Perhaps a bit of a silly question - why expose the extension only for
>> debug mesa builds ? Afaict there isn't any noticeable performance
>> implication (from infrastructural POV) is there ?
>>
>> If driver X performance goes down the drain, just have them
>> conditionally return 0/1 for PIPE_CAP_STRING_MARKER ?
>
> It was suggested, iirc by Ian, on the basis that apps might do
> something non-performant if they see the extension (since the original
> use-case was that the extension is injected by the gremedy debugger).
>
Hmm fair enough. Can you add some/most of that in the commit message, please ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline

2015-12-08 Thread Timothy Arceri

On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote:
> Validation checks that we do not have active shader stages that are
> not
> used by the pipeline.
> 
> This fixes a subtest in following CTS test:
>   ES31-CTS.sepshaderobjs.StateInteraction
> 
> v2: move as generic validation check for both ES and desktop
> (Timothy)
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/context.c |  9 +
>  src/mesa/main/pipelineobj.c | 35 +++
>  src/mesa/main/pipelineobj.h |  2 ++
>  3 files changed, 46 insertions(+)
> 
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index be983d4..d73c984 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx,
> const char *where)
>}
> }
>  
> +   /* If SSO in use, validate that all linked program stages are
> used. */
> +   if (ctx->Pipeline.Current) {
> +  if (!_mesa_active_program_stages_used(ctx->Pipeline.Current))
> {
> + _mesa_error(ctx, GL_INVALID_OPERATION,
> + "Shader program active for shader stages that "
> + "are not used by the pipeline");

You can remove this now. The call to _mesa_validate_program_pipeline()
below should cause the error to be thrown for us when it fails.

> +  }
> +   }
> +
> /* If a program is active and SSO not in use, check if validation
> of
>  * samplers succeeded for the active program. */
> if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx
> ->Pipeline.Current) {
> diff --git a/src/mesa/main/pipelineobj.c
> b/src/mesa/main/pipelineobj.c
> index f4ce3ed..9067551 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const
> struct gl_pipeline_object *pipe)
> return false;
>  }
>  
> +bool
> +_mesa_active_program_stages_used(struct gl_pipeline_object *pipe)
> +{
> +   unsigned i;
> +   for (i = 0; i < MESA_SHADER_STAGES; i++) {
> +  if (!pipe->CurrentProgram[i])
> + continue;
> +  /* If program has linked but not used by pipeline. */
> +  if (pipe->CurrentProgram[i]->LinkStatus &&
> +  !pipe->CurrentProgram[i]->ValidProgramUse)

Two things here first if ValidProgramUse is false we need to redo the
validation to be sure it really is an invalid use as its possible that
the program was relinked since the UseProgramStages call and is now
valid.

Second are you sure about the LinkStatus check? Won't this cause
validation to pass when it should fail if the program was linked
successfully but ValidProgramUse is false, and we then do a relink that
fails? As the program will stay in the pipeline.

Make that three things. It seems there is already code that should be
doing this validation already. The first validation test in
_mesa_validate_program_pipeline() calls program_stages_all_active() in
a loop which should fail for this case. The code pretty much does what
I was suggesting only it always does the validation, there is no flag
to reduce calls. I guess it really shouldn't matter a huge deal since
its only done once when somethings relinked or a sampler changes.

From my quick testing on the CTS test the program in the pipeline only
has a single shader when it should have two so the test fails with this
existing code. As per my first point I believe we always need to
revalidate here to be sure that it really is an invalid use of the
program so what we really need to do is track down why there is only a
single shader in the program when it seems there should be two. If we
can resolve that we might not need to rework this code unless there
really is a performance hit.

> + return false;
> +   }
> +   return true;
> +}
> +
>  extern GLboolean
>  _mesa_validate_program_pipeline(struct gl_context* ctx,
>  struct gl_pipeline_object *pipe)
> @@ -839,6 +854,26 @@ _mesa_validate_program_pipeline(struct
> gl_context* ctx,
>return GL_FALSE;
> }
>  
> +   /* Section 11.1.3.11 (Validation) of the OpenGL 4.5 and OpenGL ES
> 3.1
> +* spec say:
> +*
> +* "An INVALID_OPERATION error is generated by any command
> that trans-
> +* fers vertices to the GL or launches compute work if the
> current set
> +* of active program objects cannot be executed, for reasons
> including:
> +*
> +* ...
> +*
> +* "A program object is active for at least one, but not all
> of
> +* the shader stages that were present when the program was
> linked."
> +*/
> +   if (!_mesa_active_program_stages_used(pipe)) {
> +  pipe->InfoLog =
> + ralloc_strdup(pipe,
> +   "Shader program active for shader stages that
> "
> +   "are not used by the pipeline");
> +  return GL_FALSE;
> +   }
> +
> /* Section 2.11.11 (Shader Execution), subheading

Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline

2015-12-08 Thread Tapani Pälli


On 12/08/2015 12:57 PM, Timothy Arceri wrote:

On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote:

Validation checks that we do not have active shader stages that are
not
used by the pipeline.

This fixes a subtest in following CTS test:
ES31-CTS.sepshaderobjs.StateInteraction

v2: move as generic validation check for both ES and desktop
(Timothy)

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/context.c |  9 +
  src/mesa/main/pipelineobj.c | 35 +++
  src/mesa/main/pipelineobj.h |  2 ++
  3 files changed, 46 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index be983d4..d73c984 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context *ctx,
const char *where)
}
 }
  
+   /* If SSO in use, validate that all linked program stages are

used. */
+   if (ctx->Pipeline.Current) {
+  if (!_mesa_active_program_stages_used(ctx->Pipeline.Current))
{
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "Shader program active for shader stages that "
+ "are not used by the pipeline");

You can remove this now. The call to _mesa_validate_program_pipeline()
below should cause the error to be thrown for us when it fails.


+  }
+   }
+
 /* If a program is active and SSO not in use, check if validation
of
  * samplers succeeded for the active program. */
 if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx
->Pipeline.Current) {
diff --git a/src/mesa/main/pipelineobj.c
b/src/mesa/main/pipelineobj.c
index f4ce3ed..9067551 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const
struct gl_pipeline_object *pipe)
 return false;
  }
  
+bool

+_mesa_active_program_stages_used(struct gl_pipeline_object *pipe)
+{
+   unsigned i;
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (!pipe->CurrentProgram[i])
+ continue;
+  /* If program has linked but not used by pipeline. */
+  if (pipe->CurrentProgram[i]->LinkStatus &&
+  !pipe->CurrentProgram[i]->ValidProgramUse)

Two things here first if ValidProgramUse is false we need to redo the
validation to be sure it really is an invalid use as its possible that
the program was relinked since the UseProgramStages call and is now
valid.
I'm not sure about this. This feels like the ValidProgramUse would be 
then unnecessary as we would anyway need to iterate through everything 
(just because of possible relink)?



Second are you sure about the LinkStatus check? Won't this cause
validation to pass when it should fail if the program was linked
successfully but ValidProgramUse is false, and we then do a relink that
fails? As the program will stay in the pipeline.


IMO we need to check that program has been linked, otherwise it should 
not be considered active at all. UseProgramStages will need to be called 
first to have actual value in ValidProgramUse and that cannot be called 
with program that has not been succesfully linked.




Make that three things. It seems there is already code that should be
doing this validation already. The first validation test in
_mesa_validate_program_pipeline() calls program_stages_all_active() in
a loop which should fail for this case. The code pretty much does what
I was suggesting only it always does the validation, there is no flag
to reduce calls. I guess it really shouldn't matter a huge deal since
its only done once when somethings relinked or a sampler changes.


Yeah it seems, I did not realize it's trying to test this same thing as 
it did not catch this but then again I guess it's not called at all 
during draw time currently.



 From my quick testing on the CTS test the program in the pipeline only
has a single shader when it should have two so the test fails with this
existing code. As per my first point I believe we always need to


The test in question is "Separable program with both vertex and fragment 
shaders attached to only one stage". It fails because there is no 
existing check run at draw time for the case where you enable one stage 
but supply a program with multiple stages.



revalidate here to be sure that it really is an invalid use of the
program so what we really need to do is track down why there is only a
single shader in the program when it seems there should be two. If we
can resolve that we might not need to rework this code unless there
really is a performance hit.



+ return false;
+   }
+   return true;
+}
+
  extern GLboolean
  _mesa_validate_program_pipeline(struct gl_context* ctx,
  struct gl_pipeline_object *pipe)
@@ -839,6 +854,26 @@ _mesa_validate_program_pipeline(struct
gl_context* ctx,
return GL_FALSE;
 }
  
+   /* Section 11.1.3.11 (Validation) of the OpenGL 4.5 and OpenGL ES

3.1
+* spec

[Mesa-dev] [PATCH] mesa: remove validation of shaders that should be done elsewhere

2015-12-08 Thread Timothy Arceri

In core profile even if re-linking fails rendering shouldn't fail as the
previous succesfully linked program will still be available. It also
shouldn't be possible to have an unlinked program as part of the
current rendering state.

This fixes a subtest in:
ES31-CTS.sepshaderobjs.StateInteraction

This change should improve performance on CPU limited benchmarks as noted
in commit d6c6b186cf308f.

From Section 7.3 (Program Objects) of the OpenGL 4.5 spec:

   "If a program object that is active for any shader stage is re-linked
unsuccessfully, the link status will be set to FALSE, but any existing
executables and associated state will remain part of the current rendering
state until a subsequent call to UseProgram, UseProgramStages, or
BindProgramPipeline removes them from use. If such a program is attached to
any program pipeline object, the existing executables and associated state
will remain part of the program pipeline object until a subsequent call to
UseProgramStages removes them from use. An unsuccessfully linked program may
not be made part of the current rendering state by UseProgram or added to
program pipeline objects by UseProgramStages until it is successfully
re-linked."

   "void UseProgram(uint program);

   ...

   An INVALID_OPERATION error is generated if program has not been linked, or
   was last linked unsuccessfully.  The current rendering state is not 
modified."

Cc: Tapani Pälli 
Cc: Ian Romanick 
Cc: Brian Paul 
---
 src/mesa/main/context.c | 23 +++
 1 file changed, 3 insertions(+), 20 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index be983d4..2340c11 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1930,6 +1930,7 @@ _mesa_check_blend_func_error(struct gl_context *ctx)
return GL_TRUE;
 }
 
+
 static bool
 shader_linked_or_absent(struct gl_context *ctx,
 const struct gl_shader_program *shProg,
@@ -1956,6 +1957,7 @@ shader_linked_or_absent(struct gl_context *ctx,
return true;
 }
 
+
 /**
  * Prior to drawing anything with glBegin, glDrawArrays, etc. this function
  * is called to see if it's valid to render.  This involves checking that
@@ -1967,33 +1969,14 @@ shader_linked_or_absent(struct gl_context *ctx,
 GLboolean
 _mesa_valid_to_render(struct gl_context *ctx, const char *where)
 {
-   unsigned i;
-
/* This depends on having up to date derived state (shaders) */
if (ctx->NewState)
   _mesa_update_state(ctx);
 
-   if (ctx->API == API_OPENGL_CORE || ctx->API == API_OPENGLES2) {
-  bool from_glsl_shader[MESA_SHADER_COMPUTE] = { false };
-
-  for (i = 0; i < MESA_SHADER_COMPUTE; i++) {
- if (!shader_linked_or_absent(ctx, ctx->_Shader->CurrentProgram[i],
-  _glsl_shader[i], where))
-return GL_FALSE;
-  }
-
-  /* In OpenGL Core Profile and OpenGL ES 2.0 / 3.0, there are no assembly
-   * shaders.  Don't check state related to those.
-   */
-   } else {
+   if (ctx->API == API_OPENGL_COMPAT) {
   bool has_vertex_shader = false;
   bool has_fragment_shader = false;
 
-  /* In OpenGL Compatibility Profile, there is only vertex shader and
-   * fragment shader.  We take this path also for API_OPENGLES because
-   * optimizing that path would make the other (more common) paths
-   * slightly slower.
-   */
   if (!shader_linked_or_absent(ctx,

ctx->_Shader->CurrentProgram[MESA_SHADER_VERTEX],
_vertex_shader, where))
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] scons: support for llvm 3.7.

2015-12-08 Thread Jose Fonseca


On 07/12/15 16:13, olivier.pena...@gmail.com wrote:

From: Olivier Pena 

---
  scons/llvm.py | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/scons/llvm.py b/scons/llvm.py
index c59b8cb..1fc8a3f 100644
--- a/scons/llvm.py
+++ b/scons/llvm.py
@@ -106,7 +106,19 @@ def generate(env):
  ])
  env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
  # LIBS should match the output of `llvm-config --libs engine mcjit 
bitwriter x86asmprinter`
-if llvm_version >= distutils.version.LooseVersion('3.6'):
+if llvm_version >= distutils.version.LooseVersion('3.7'):
+env.Prepend(LIBS = [
+'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
+'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
+'LLVMCodeGen', 'LLVMScalarOpts', 'LLVMProfileData',
+'LLVMInstCombine', 'LLVMInstrumentation', 
'LLVMTransformUtils', 'LLVMipa',
+'LLVMAnalysis', 'LLVMX86Desc', 'LLVMMCDisassembler',
+'LLVMX86Info', 'LLVMX86AsmPrinter', 'LLVMX86Utils',
+'LLVMMCJIT', 'LLVMTarget', 'LLVMExecutionEngine',
+'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
+'LLVMBitReader', 'LLVMMC', 'LLVMCore', 'LLVMSupport'
+])
+elif llvm_version >= distutils.version.LooseVersion('3.6'):
  env.Prepend(LIBS = [
  'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
  'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',




Pushed. Thanks.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline

2015-12-08 Thread Tapani Pälli


On 12/08/2015 03:55 PM, Timothy Arceri wrote:

On Tue, 2015-12-08 at 13:34 +0200, Tapani Pälli wrote:

On 12/08/2015 01:31 PM, Tapani Pälli wrote:

On 12/08/2015 12:57 PM, Timothy Arceri wrote:

On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote:

Validation checks that we do not have active shader stages that
are
not
used by the pipeline.

This fixes a subtest in following CTS test:
 ES31-CTS.sepshaderobjs.StateInteraction

v2: move as generic validation check for both ES and desktop
(Timothy)

Signed-off-by: Tapani Pälli 
---
   src/mesa/main/context.c |  9 +
   src/mesa/main/pipelineobj.c | 35
+++
   src/mesa/main/pipelineobj.h |  2 ++
   3 files changed, 46 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index be983d4..d73c984 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context
*ctx,
const char *where)
 }
  }
   +   /* If SSO in use, validate that all linked program stages
are
used. */
+   if (ctx->Pipeline.Current) {
+  if (!_mesa_active_program_stages_used(ctx
->Pipeline.Current))
{
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "Shader program active for shader stages
that "
+ "are not used by the pipeline");

You can remove this now. The call to
_mesa_validate_program_pipeline()
below should cause the error to be thrown for us when it fails.


Oops forgot this one. This is necessary to be run as the full
validation
happens only if pipeline was not validated before.

I see. Well in that case shouldn't we set the validated flag to false
when UseProgramStages is set? Since the pipeline could now be invalid
for this rule and for:

"One program object is active for at least two shader stages
and a second program is active for a shader stage between two
stages for which the first program was active."

Surely it would be less of a preformance hit to re-run full validation
when UseProgramStages is called on an than to re-validate these two
rules everytime?


Yeah I guess so, overall I was hoping to prevent full validation runs 
and run just 'what is absolutely required' both in UseProgramStages and 
in draw time but I agree it would be good to get functionality in place 
before trying to optimize too much. It looks like interface matching 
will anyway be the hottest spot to optimize, not these small loops.





+  }
+   }
+
  /* If a program is active and SSO not in use, check if
validation
of
   * samplers succeeded for the active program. */
  if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx
->Pipeline.Current) {
diff --git a/src/mesa/main/pipelineobj.c
b/src/mesa/main/pipelineobj.c
index f4ce3ed..9067551 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const
struct gl_pipeline_object *pipe)
  return false;
   }
   +bool
+_mesa_active_program_stages_used(struct gl_pipeline_object
*pipe)
+{
+   unsigned i;
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (!pipe->CurrentProgram[i])
+ continue;
+  /* If program has linked but not used by pipeline. */
+  if (pipe->CurrentProgram[i]->LinkStatus &&
+  !pipe->CurrentProgram[i]->ValidProgramUse)

Two things here first if ValidProgramUse is false we need to redo
the
validation to be sure it really is an invalid use as its possible
that
the program was relinked since the UseProgramStages call and is
now
valid.

I'm not sure about this. This feels like the ValidProgramUse would
be
then unnecessary as we would anyway need to iterate through
everything
(just because of possible relink)?

Right I'm leaning towards removing ValidProgramUse and setting the
validated pipeline flag to false when UseProgramStages is called then
revalidating everything. As you said with regards to the sampler checks
we can break things up later and introduce flags if perf testing shows
it to be useful.

Yeah, this seems the simplest way to get things working.


Second are you sure about the LinkStatus check? Won't this cause
validation to pass when it should fail if the program was linked
successfully but ValidProgramUse is false, and we then do a
relink that
fails? As the program will stay in the pipeline.

IMO we need to check that program has been linked, otherwise it
should
not be considered active at all. UseProgramStages will need to be
called first to have actual value in ValidProgramUse and that
cannot
be called with program that has not been succesfully linked.

But the LinkStatus flag gets set to false for a program that has
already been used with UseProgramStage but is then relinked
unsuccesfully which is what patch 3 is about.

I've also sent an alternate fix for patch 3 as I spent a bunch of time
testing and digging into it. It was just easier to send it out since it
was done.


OK, cool.




Make

Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output

2015-12-08 Thread Jose Fonseca


On 08/12/15 01:42, Brian Paul wrote:

When a buffer is created with GL_STATIC_DRAW, its contents should not
be changed frequently.  But that's exactly what one application I'm
debugging does.  This patch adds code to try to detect inefficient
buffer use in a couple places.  The GL_ARB_debug_output mechanism is
used to report the issue.

NVIDIA's driver detects these sort of things too.

Other types of inefficient buffer use could also be detected in the
future.
---
  src/mesa/main/bufferobj.c | 55 +++
  src/mesa/main/mtypes.h|  4 
  2 files changed, 59 insertions(+)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index f985982..6bc1b5e 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -51,6 +51,34 @@


  /**
+ * We count the number of buffer modification calls to check for
+ * inefficient buffer use.  This is the number of such calls before we
+ * issue a warning.
+ */
+#define BUFFER_WARNING_CALL_COUNT 4
+
+
+/**
+ * Helper to warn of possible performance issues, such as frequently
+ * updating a buffer created with GL_STATIC_DRAW.
+ */
+static void
+buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
+{
+   va_list args;
+   GLuint msg_id = 0;
+
+   va_start(args, fmt);
+   _mesa_gl_vdebug(ctx, _id,
+   MESA_DEBUG_SOURCE_API,
+   MESA_DEBUG_TYPE_PERFORMANCE,
+   MESA_DEBUG_SEVERITY_MEDIUM,
+   fmt, args);
+   va_end(args);
+}
+
+
+/**
   * Used as a placeholder for buffer objects between glGenBuffers() and
   * glBindBuffer() so that glIsBuffer() can work correctly.
   */
@@ -1677,6 +1705,21 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct 
gl_buffer_object *bufObj,
 if (size == 0)
return;

+   bufObj->NumSubDataCalls++;
+
+   if ((bufObj->Usage == GL_STATIC_DRAW ||
+bufObj->Usage == GL_STATIC_COPY) &&
+   bufObj->NumSubDataCalls >= BUFFER_WARNING_CALL_COUNT) {
+  /* If the application declared the buffer as static draw/copy or stream
+   * draw, it should not be frequently modified with glBufferSubData.
+   */
+  buffer_usage_warning(ctx,
+   "using %s(buffer %u, offset %u, size %u) to "
+   "update a %s buffer",
+   func, bufObj->Name, offset, size,
+   _mesa_enum_to_string(bufObj->Usage));
+   }
+
 bufObj->Written = GL_TRUE;

 assert(ctx->Driver.BufferSubData);
@@ -2384,6 +2427,18 @@ _mesa_map_buffer_range(struct gl_context *ctx,
return NULL;
 }

+   if (access & GL_MAP_WRITE_BIT) {
+  bufObj->NumMapBufferWriteCalls++;
+  if ((bufObj->Usage == GL_STATIC_DRAW ||
+   bufObj->Usage == GL_STATIC_COPY) &&
+  bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) {
+ buffer_usage_warning(ctx,
+  "using %s(buffer %u, offset %u, length %u) to "
+  "update a %s buffer",
+  func, bufObj->Name, offset, length,
+  _mesa_enum_to_string(bufObj->Usage));
+  }
+   }

 assert(ctx->Driver.MapBufferRange);
 map = ctx->Driver.MapBufferRange(ctx, offset, length, access, bufObj,
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1eb1e21..de54169 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1275,6 +1275,10 @@ struct gl_buffer_object
 GLboolean Immutable; /**< GL_ARB_buffer_storage */
 gl_buffer_usage UsageHistory; /**< How has this buffer been used so far? */

+   /** Counters used for buffer usage warnings */
+   GLuint NumSubDataCalls;
+   GLuint NumMapBufferWriteCalls;
+
 struct gl_buffer_mapping Mappings[MAP_COUNT];
  };




LGTM.

Reviewed-by: Jose Fonseca 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate program usage in a pipeline

2015-12-08 Thread Timothy Arceri

On Tue, 2015-12-08 at 13:34 +0200, Tapani Pälli wrote:
> On 12/08/2015 01:31 PM, Tapani Pälli wrote:
> > On 12/08/2015 12:57 PM, Timothy Arceri wrote:
> > > On Tue, 2015-12-08 at 11:16 +0200, Tapani Pälli wrote:
> > > > Validation checks that we do not have active shader stages that
> > > > are
> > > > not
> > > > used by the pipeline.
> > > > 
> > > > This fixes a subtest in following CTS test:
> > > > ES31-CTS.sepshaderobjs.StateInteraction
> > > > 
> > > > v2: move as generic validation check for both ES and desktop
> > > > (Timothy)
> > > > 
> > > > Signed-off-by: Tapani Pälli 
> > > > ---
> > > >   src/mesa/main/context.c |  9 +
> > > >   src/mesa/main/pipelineobj.c | 35
> > > > +++
> > > >   src/mesa/main/pipelineobj.h |  2 ++
> > > >   3 files changed, 46 insertions(+)
> > > > 
> > > > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> > > > index be983d4..d73c984 100644
> > > > --- a/src/mesa/main/context.c
> > > > +++ b/src/mesa/main/context.c
> > > > @@ -2042,6 +2042,15 @@ _mesa_valid_to_render(struct gl_context
> > > > *ctx,
> > > > const char *where)
> > > > }
> > > >  }
> > > >   +   /* If SSO in use, validate that all linked program stages
> > > > are
> > > > used. */
> > > > +   if (ctx->Pipeline.Current) {
> > > > +  if (!_mesa_active_program_stages_used(ctx
> > > > ->Pipeline.Current))
> > > > {
> > > > + _mesa_error(ctx, GL_INVALID_OPERATION,
> > > > + "Shader program active for shader stages
> > > > that "
> > > > + "are not used by the pipeline");
> > > You can remove this now. The call to
> > > _mesa_validate_program_pipeline()
> > > below should cause the error to be thrown for us when it fails.
> > > 
> 
> Oops forgot this one. This is necessary to be run as the full
> validation 
> happens only if pipeline was not validated before.

I see. Well in that case shouldn't we set the validated flag to false
when UseProgramStages is set? Since the pipeline could now be invalid
for this rule and for:

"One program object is active for at least two shader stages
and a second program is active for a shader stage between two
stages for which the first program was active."

Surely it would be less of a preformance hit to re-run full validation
when UseProgramStages is called on an than to re-validate these two
rules everytime?


> 
> > > > +  }
> > > > +   }
> > > > +
> > > >  /* If a program is active and SSO not in use, check if
> > > > validation
> > > > of
> > > >   * samplers succeeded for the active program. */
> > > >  if (ctx->_Shader->ActiveProgram && ctx->_Shader != ctx
> > > > ->Pipeline.Current) {
> > > > diff --git a/src/mesa/main/pipelineobj.c
> > > > b/src/mesa/main/pipelineobj.c
> > > > index f4ce3ed..9067551 100644
> > > > --- a/src/mesa/main/pipelineobj.c
> > > > +++ b/src/mesa/main/pipelineobj.c
> > > > @@ -782,6 +782,21 @@ program_stages_interleaved_illegally(const
> > > > struct gl_pipeline_object *pipe)
> > > >  return false;
> > > >   }
> > > >   +bool
> > > > +_mesa_active_program_stages_used(struct gl_pipeline_object
> > > > *pipe)
> > > > +{
> > > > +   unsigned i;
> > > > +   for (i = 0; i < MESA_SHADER_STAGES; i++) {
> > > > +  if (!pipe->CurrentProgram[i])
> > > > + continue;
> > > > +  /* If program has linked but not used by pipeline. */
> > > > +  if (pipe->CurrentProgram[i]->LinkStatus &&
> > > > +  !pipe->CurrentProgram[i]->ValidProgramUse)
> > > Two things here first if ValidProgramUse is false we need to redo
> > > the
> > > validation to be sure it really is an invalid use as its possible
> > > that
> > > the program was relinked since the UseProgramStages call and is
> > > now
> > > valid.
> > I'm not sure about this. This feels like the ValidProgramUse would
> > be 
> > then unnecessary as we would anyway need to iterate through
> > everything 
> > (just because of possible relink)?

Right I'm leaning towards removing ValidProgramUse and setting the
validated pipeline flag to false when UseProgramStages is called then
revalidating everything. As you said with regards to the sampler checks
we can break things up later and introduce flags if perf testing shows
it to be useful.

> > 
> > > Second are you sure about the LinkStatus check? Won't this cause
> > > validation to pass when it should fail if the program was linked
> > > successfully but ValidProgramUse is false, and we then do a
> > > relink that
> > > fails? As the program will stay in the pipeline.
> > 
> > IMO we need to check that program has been linked, otherwise it
> > should 
> > not be considered active at all. UseProgramStages will need to be 
> > called first to have actual value in ValidProgramUse and that
> > cannot 
> > be called with program that has not been succesfully linked.

But the LinkStatus flag gets set to false for a program that has
already been used with UseProgramStage but is

Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro

2015-12-08 Thread Emil Velikov

On 4 December 2015 at 17:37,   wrote:
> On 2015-12-05 01:33, Emil Velikov wrote:

>>
>> You mentioned that you've done these with a Coccinelle schematic patch
>> (script) - mind if we get that (actually start collecting) in tree ...
>> perhaps in bin/ or bin/cocci/ ?
>
>
> Definitely! This is in fact part of my proposed plan after I get some of
> the major churn out the way and see what passes review and hence which
> .cocci scripts are suitable for mesa.
>
Sweet. I guess I missed that part in your summary email.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: invalidate pipeline status after glUseProgramStages

2015-12-08 Thread Tapani Pälli

This will cause validation to run during next draw, this is done
because possible changes in used stages and programs can cause
invalid pipeline state.

This fixes a subtest in following CTS test:
ES31-CTS.sepshaderobjs.StateInteraction

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/pipelineobj.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 5eda4e5..f2a872d 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -341,6 +341,8 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, 
GLuint program)
 
if ((stages & GL_COMPUTE_SHADER_BIT) != 0)
   _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe);
+
+   pipe->Validated = false;
 }
 
 /**
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: Fix crash when calling glViewport with no surface bound

2015-12-08 Thread Neil Roberts

If EGL_KHR_surfaceless_context is used then glViewport can be called
with NULL for the draw and read surfaces. This was previously causing
a crash because the i965 driver tries to use this point to invalidate
the surfaces and it was derferencing the NULL pointer.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257
Cc: Nanley Chery 
Cc: "11.1" 
---

I've also posted a Piglit test for this here:

http://patchwork.freedesktop.org/patch/67356/

 src/mesa/drivers/dri/i965/brw_context.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 7d7643c..5374bad 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -159,8 +159,10 @@ intel_viewport(struct gl_context *ctx)
__DRIcontext *driContext = brw->driContext;
 
if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
-  dri2InvalidateDrawable(driContext->driDrawablePriv);
-  dri2InvalidateDrawable(driContext->driReadablePriv);
+  if (driContext->driDrawablePriv)
+ dri2InvalidateDrawable(driContext->driDrawablePriv);
+  if (driContext->driReadablePriv)
+ dri2InvalidateDrawable(driContext->driReadablePriv);
}
 }
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v4 15/44] i965: Work around L3 state leaks during context switches.

2015-12-08 Thread Francisco Jerez

This is going to require some rather intrusive kernel changes to fix
properly, in the meantime (and forever on at least pre-v4.1 kernels)
we'll have to restore the hardware defaults at the end of every batch
in which the L3 configuration was changed to avoid interfering with
the DDX and GL clients that use an older non-L3-aware version of Mesa.

Reviewed-by: Samuel Iglesias Gonsálvez 
Reviewed-by: Kristian Høgsberg  

v4: Optimize look-up of the default configuration by assuming it's the
first entry of the L3 config array in order to avoid an FPS
regression in GpuTest Triangle and SynMark OglBatch2-7 on most
affected platforms.
---
 src/mesa/drivers/dri/i965/brw_state.h |  4 +++
 src/mesa/drivers/dri/i965/gen7_l3_state.c | 51 +++
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |  7 
 src/mesa/drivers/dri/i965/intel_batchbuffer.h |  6 +++-
 4 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 49f301a..b7c0039 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -380,6 +380,10 @@ void gen7_update_binding_table_from_array(struct 
brw_context *brw,
 void gen7_disable_hw_binding_tables(struct brw_context *brw);
 void gen7_reset_hw_bt_pool_offsets(struct brw_context *brw);
 
+/* gen7_l3_state.c */
+void
+gen7_restore_default_l3_config(struct brw_context *brw);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index 7956935..7fa7336 100644
--- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
@@ -520,3 +520,54 @@ const struct brw_tracked_state gen7_l3_state = {
},
.emit = emit_l3_state
 };
+
+/**
+ * Hack to restore the default L3 configuration.
+ *
+ * This will be called at the end of every batch in order to reset the L3
+ * configuration to the default values for the time being until the kernel is
+ * fixed.  Until kernel commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b
+ * (included in v4.1) we would set the MI_RESTORE_INHIBIT bit when submitting
+ * batch buffers for the default context used by the DDX, which meant that any
+ * context state changed by the GL would leak into the DDX, the assumption
+ * being that the DDX would initialize any state it cares about manually.  The
+ * DDX is however not careful enough to program an L3 configuration
+ * explicitly, and it makes assumptions about it (URB size) which won't hold
+ * and cause it to misrender if we let our L3 set-up to leak into the DDX.
+ *
+ * Since v4.1 of the Linux kernel the default context is saved and restored
+ * normally, so it's far less likely for our L3 programming to interfere with
+ * other contexts -- In fact restoring the default L3 configuration at the end
+ * of the batch will be redundant most of the time.  A kind of state leak is
+ * still possible though if the context making assumptions about L3 state is
+ * created immediately after our context was active (e.g. without the DDX
+ * default context being scheduled in between) because at present the DRM
+ * doesn't fully initialize the contents of newly created contexts and instead
+ * sets the MI_RESTORE_INHIBIT flag causing it to inherit the state from the
+ * last active context.
+ *
+ * It's possible to realize such a scenario if, say, an X server (or a GL
+ * application using an outdated non-L3-aware Mesa version) is started while
+ * another GL application is running and happens to have modified the L3
+ * configuration, or if no X server is running at all and a GL application
+ * using a non-L3-aware Mesa version is started after another GL application
+ * ran and modified the L3 configuration -- The latter situation can actually
+ * be reproduced easily on IVB in our CI system.
+ */
+void
+gen7_restore_default_l3_config(struct brw_context *brw)
+{
+   const struct brw_device_info *devinfo = brw->intelScreen->devinfo;
+   /* For efficiency assume that the first entry of the array matches the
+* default configuration.
+*/
+   const struct brw_l3_config *const cfg = get_l3_configs(devinfo);
+   assert(cfg == get_l3_config(devinfo,
+   get_default_l3_weights(devinfo, false, false)));
+
+   if (cfg != brw->l3.config && brw->can_do_pipelined_register_writes) {
+  setup_l3_config(brw, cfg);
+  update_urb_size(brw, cfg);
+  brw->l3.config = cfg;
+   }
+}
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 0363bd3..f778074 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -208,6 +208,13 @@ brw_finish_batch(struct brw_context *brw)
brw_emit_query_end(brw);
 
if (brw->batch.ring == RENDER_RING) {
+  /* Work around L3 state leaks into

Re: [Mesa-dev] [PATCH 01/10] gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager

2015-12-08 Thread Nicolai Hähnle


On 06.12.2015 19:00, Marek Olšák wrote:

From: Marek Olšák 

This simplified (basically duplicated) version of pb_cache_manager will
allow removing some ugly hacks from radeon and amdgpu winsyses and
flatten simplify their design.

The difference is that winsyses must manually add buffers to the cache
in "destroy" functions and the cache doesn't know about the buffers before
that. The integration is therefore trivial and the impact on the winsys
design is negligible.
---
  src/gallium/auxiliary/Makefile.sources  |   1 +
  src/gallium/auxiliary/pipebuffer/pb_cache.c | 286 
  src/gallium/auxiliary/pipebuffer/pb_cache.h |  74 +++
  3 files changed, 361 insertions(+)
  create mode 100644 src/gallium/auxiliary/pipebuffer/pb_cache.c
  create mode 100644 src/gallium/auxiliary/pipebuffer/pb_cache.h

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index 6160192..817308d 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -93,6 +93,7 @@ C_SOURCES := \
pipebuffer/pb_bufmgr_ondemand.c \
pipebuffer/pb_bufmgr_pool.c \
pipebuffer/pb_bufmgr_slab.c \
+   pipebuffer/pb_cache.c \


I believe pb_cache.h needs to be added as well.


pipebuffer/pb_validate.c \
pipebuffer/pb_validate.h \
postprocess/filters.h \
diff --git a/src/gallium/auxiliary/pipebuffer/pb_cache.c 
b/src/gallium/auxiliary/pipebuffer/pb_cache.c
new file mode 100644
index 000..45f600d
--- /dev/null
+++ b/src/gallium/auxiliary/pipebuffer/pb_cache.c

...

+/**
+ * \return 1   if compatible and can be reclaimed
+ * 0   if incompatible
+ *-1   if compatible and can't be reclaimed
+ */
+static int
+pb_cache_is_buffer_compat(struct pb_cache_entry *entry,
+  pb_size size, unsigned alignment, unsigned usage)
+{
+   struct pb_buffer *buf = entry->buffer;
+
+   if (usage & entry->mgr->bypass_usage)
+  return 0;


It should be possible to move this test to the top of 
pb_cache_reclaim_buffer, right?



+   if (buf->size < size)
+  return 0;
+
+   /* be lenient with size */
+   if (buf->size > (unsigned) (entry->mgr->size_factor * size))
+  return 0;
+
+   if (!pb_check_alignment(alignment, buf->alignment))
+  return 0;
+
+   if (!pb_check_usage(usage, buf->usage))
+  return 0;
+
+   return entry->mgr->can_reclaim(buf) ? 1 : -1;
+}
+
+/**
+ * Find a compatible buffer in the cache, return it, and remove it
+ * from the cache.
+ */
+struct pb_buffer *
+pb_cache_reclaim_buffer(struct pb_cache *mgr, pb_size size,
+unsigned alignment, unsigned usage)
+{
+   struct pb_cache_entry *entry;
+   struct pb_cache_entry *cur_entry;
+   struct list_head *cur, *next;
+   int64_t now;
+   int ret = 0;
+
+   pipe_mutex_lock(mgr->mutex);
+
+   entry = NULL;
+   cur = mgr->cache.next;
+   next = cur->next;
+
+   /* search in the expired buffers, freeing them in the process */
+   now = os_time_get();
+   while (cur != >cache) {
+  cur_entry = LIST_ENTRY(struct pb_cache_entry, cur, head);
+
+  if (!entry && (ret = pb_cache_is_buffer_compat(cur_entry, size,
+ alignment, usage) > 0))
+ entry = cur_entry;
+  else if (os_time_timeout(cur_entry->start, cur_entry->end, now))
+ destroy_buffer_locked(cur_entry);
+  else
+ /* This buffer (and all hereafter) are still hot in cache */
+ break;
+
+  /* the buffer is busy (and probably all remaining ones too) */
+  if (ret == -1)
+ break;
+
+  cur = next;
+  next = cur->next;
+   }
+
+   /* keep searching in the hot buffers */
+   if (!entry && ret != -1) {
+  while (cur != >cache) {
+ cur_entry = LIST_ENTRY(struct pb_cache_entry, cur, head);
+ ret = pb_cache_is_buffer_compat(cur_entry, size, alignment, usage);
+
+ if (ret > 0) {
+entry = cur_entry;
+break;
+ }
+ if (ret == -1)
+break;
+ /* no need to check the timeout here */
+ cur = next;
+ next = cur->next;
+  }
+   }
+
+   /* found a compatible buffer, return it */
+   if (entry) {
+  struct pb_buffer *buf = entry->buffer;
+
+  mgr->cache_size -= buf->size;
+  LIST_DEL(>head);
+  --mgr->num_buffers;
+  pipe_mutex_unlock(mgr->mutex);
+  /* Increase refcount */
+  pipe_reference_init(>reference, 1);
+  return buf;
+   }
+
+   pipe_mutex_unlock(mgr->mutex);
+   return NULL;
+}
+
+/**
+ * Empty the cache. Useful when there is not enough memory.
+ */
+void
+pb_cache_release_all_buffers(struct pb_cache *mgr)
+{
+   struct list_head *curr, *next;
+   struct pb_cache_entry *buf;
+
+   pipe_mutex_lock(mgr->mutex);
+   curr = mgr->cache.next;
+   next = curr->next;
+   while (curr != >cache) {
+  buf = LIST_ENTRY(struct pb_cache_entry, curr, head);
+

Re: [Mesa-dev] [PATCH 09/10] winsys/radeon: use pb_cache instead of pb_cache_manager

2015-12-08 Thread Nicolai Hähnle


On 06.12.2015 19:01, Marek Olšák wrote:

From: Marek Olšák 

This is a prerequisite for the removal of radeon_winsys_cs_handle.
---
  src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 212 +++---
  src/gallium/winsys/radeon/drm/radeon_drm_bo.h |  14 +-
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c |  22 +--
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.h |   4 +-
  4 files changed, 74 insertions(+), 178 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 4c38379..9532c77 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c

...

  static const struct pb_vtbl radeon_bo_vtbl = {
-radeon_bo_destroy,
-NULL, /* never called */
-NULL, /* never called */
-radeon_bo_validate,
-radeon_bo_fence,
-radeon_bo_get_base_buffer,
+radeon_bo_destroy_or_cache
  };


I take it the other functions aren't called anymore? Perhaps this patch 
and #3 could use an explanation to that effect.


Nicolai



  #ifndef RADEON_GEM_GTT_WC
@@ -540,40 +490,39 @@ static const struct pb_vtbl radeon_bo_vtbl = {
  #define RADEON_GEM_NO_CPU_ACCESS  (1 << 4)
  #endif

-static struct pb_buffer *radeon_bomgr_create_bo(struct pb_manager *_mgr,
-pb_size size,
-const struct pb_desc *desc)
+static struct radeon_bo *radeon_create_bo(struct radeon_drm_winsys *rws,
+  unsigned size, unsigned alignment,
+  unsigned usage,
+  unsigned initial_domains,
+  unsigned flags)
  {
-struct radeon_bomgr *mgr = radeon_bomgr(_mgr);
-struct radeon_drm_winsys *rws = mgr->rws;
  struct radeon_bo *bo;
  struct drm_radeon_gem_create args;
-struct radeon_bo_desc *rdesc = (struct radeon_bo_desc*)desc;
  int r;

  memset(, 0, sizeof(args));

-assert(rdesc->initial_domains);
-assert((rdesc->initial_domains &
+assert(initial_domains);
+assert((initial_domains &
  ~(RADEON_GEM_DOMAIN_GTT | RADEON_GEM_DOMAIN_VRAM)) == 0);

  args.size = size;
-args.alignment = desc->alignment;
-args.initial_domain = rdesc->initial_domains;
+args.alignment = alignment;
+args.initial_domain = initial_domains;
  args.flags = 0;

-if (rdesc->flags & RADEON_FLAG_GTT_WC)
+if (flags & RADEON_FLAG_GTT_WC)
  args.flags |= RADEON_GEM_GTT_WC;
-if (rdesc->flags & RADEON_FLAG_CPU_ACCESS)
+if (flags & RADEON_FLAG_CPU_ACCESS)
  args.flags |= RADEON_GEM_CPU_ACCESS;
-if (rdesc->flags & RADEON_FLAG_NO_CPU_ACCESS)
+if (flags & RADEON_FLAG_NO_CPU_ACCESS)
  args.flags |= RADEON_GEM_NO_CPU_ACCESS;

  if (drmCommandWriteRead(rws->fd, DRM_RADEON_GEM_CREATE,
  , sizeof(args))) {
  fprintf(stderr, "radeon: Failed to allocate a buffer:\n");
  fprintf(stderr, "radeon:size  : %d bytes\n", size);
-fprintf(stderr, "radeon:alignment : %d bytes\n", desc->alignment);
+fprintf(stderr, "radeon:alignment : %d bytes\n", alignment);
  fprintf(stderr, "radeon:domains   : %d\n", args.initial_domain);
  fprintf(stderr, "radeon:flags : %d\n", args.flags);
  return NULL;
@@ -584,20 +533,21 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct 
pb_manager *_mgr,
  return NULL;

  pipe_reference_init(>base.reference, 1);
-bo->base.alignment = desc->alignment;
-bo->base.usage = desc->usage;
+bo->base.alignment = alignment;
+bo->base.usage = usage;
  bo->base.size = size;
  bo->base.vtbl = _bo_vtbl;
  bo->rws = rws;
  bo->handle = args.handle;
  bo->va = 0;
-bo->initial_domain = rdesc->initial_domains;
+bo->initial_domain = initial_domains;
  pipe_mutex_init(bo->map_mutex);
+pb_cache_init_entry(>bo_cache, >cache_entry, >base);

  if (rws->info.r600_virtual_address) {
  struct drm_radeon_gem_va va;

-bo->va = radeon_bomgr_find_va(rws, size, desc->alignment);
+bo->va = radeon_bomgr_find_va(rws, size, alignment);

  va.handle = bo->handle;
  va.vm_id = 0;
@@ -610,7 +560,7 @@ static struct pb_buffer *radeon_bomgr_create_bo(struct 
pb_manager *_mgr,
  if (r && va.operation == RADEON_VA_RESULT_ERROR) {
  fprintf(stderr, "radeon: Failed to allocate virtual address for 
buffer:\n");
  fprintf(stderr, "radeon:size  : %d bytes\n", size);
-fprintf(stderr, "radeon:alignment : %d bytes\n", 
desc->alignment);
+fprintf(stderr, "radeon:alignment : %d bytes\n", alignment);
  fprintf(stderr, "radeon:domains   : %d\n", 
args.initial_domain);

[Mesa-dev] [PATCH 1/2] glsl: do not loose always_active_io when packing varyings

2015-12-08 Thread Tapani Pälli

Otherwise packed and inactive varyings get optimized away. This needs
to be prevented when using separate shader objects where interface
needs to be preserved.

Signed-off-by: Tapani Pälli 
---
 src/glsl/lower_packed_varyings.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index 037c27d..8d1eb17 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -622,6 +622,7 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
   packed_var->data.interpolation = unpacked_var->data.interpolation;
   packed_var->data.location = location;
   packed_var->data.precision = unpacked_var->data.precision;
+  packed_var->data.always_active_io = unpacked_var->data.always_active_io;
   unpacked_var->insert_before(packed_var);
   this->packed_varyings[slot] = packed_var;
} else {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] mesa: fix interface matching done in validate_io

2015-12-08 Thread Tapani Pälli

Patch makes following changes for interface matching:

   - do not try to match builtin variables
   - handle swizzle in input name, as example 'a.z' should
 match with 'a'
   - check that amount of inputs and outputs matches

These changes make interface matching tests to work in:
   ES31-CTS.sepshaderobjs.StateInteraction

Test does not still pass completely due to errors in rendering
output. I believe this is unrelated to interface matching.

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/shader_query.cpp | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 5d15006..8bfbd3f 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1372,19 +1372,34 @@ validate_io(const struct gl_shader *input_stage,
 const struct gl_shader *output_stage, bool isES)
 {
assert(input_stage && output_stage);
+   unsigned inputs = 0, outputs = 0;
 
/* For each output in a, find input in b and do any required checks. */
foreach_in_list(ir_instruction, out, input_stage->ir) {
   ir_variable *out_var = out->as_variable();
-  if (!out_var || out_var->data.mode != ir_var_shader_out)
+  if (!out_var || out_var->data.mode != ir_var_shader_out ||
+  is_gl_identifier(out_var->name))
  continue;
 
+  outputs++;
+
+  inputs = 0;
   foreach_in_list(ir_instruction, in, output_stage->ir) {
  ir_variable *in_var = in->as_variable();
- if (!in_var || in_var->data.mode != ir_var_shader_in)
+ if (!in_var || in_var->data.mode != ir_var_shader_in ||
+ is_gl_identifier(in_var->name))
 continue;
 
- if (strcmp(in_var->name, out_var->name) == 0) {
+ inputs++;
+
+ unsigned len = strlen(in_var->name);
+
+ /* Handle input swizzle in variable name. */
+ const char *dot = strchr(in_var->name, '.');
+ if (dot)
+len = dot - in_var->name;
+
+ if (strncmp(in_var->name, out_var->name, len) == 0) {
 /* Since we now only validate precision, we can skip this step for
  * desktop GLSL shaders, there precision qualifier is ignored.
  *
@@ -1407,7 +1422,9 @@ validate_io(const struct gl_shader *input_stage,
  }
   }
}
-   return true;
+
+   /* Amount should match. */
+   return (inputs == outputs);
 }
 
 /**
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-08 Thread Kenneth Graunke

On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote:
[snip]
> @@ -461,8 +455,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, 
> print_state *state)
> }
>  
> nir_foreach_variable(var, var_list) {
> -  if ((var->data.driver_location == instr->const_index[0]) &&
> -  var->name) {
> +  if ((var->data.driver_location == instr->const_index[0]) && var->name) 
> {
>   fprintf(fp, "\t/* %s */", var->name);
>   break;
>}
> 

This last hunk seems unrelated - perhaps it should be a separate patch?
That patch would get a R-b.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.

2015-12-08 Thread Matt Turner

On Fri, Dec 4, 2015 at 10:51 AM, Neil Roberts  wrote:
> Matt Turner  writes:
>
>>  # Written in the form (, ) where  is an expression
>>  # and  is either an expression or a value.  An expression is
>>  # defined as a tuple of the form (, , , , )
>> @@ -94,6 +97,8 @@ optimizations = [
>> (('inot', ('ige', a, b)), ('ilt', a, b)),
>> (('inot', ('ieq', a, b)), ('ine', a, b)),
>> (('inot', ('ine', a, b)), ('ieq', a, b)),
>> +   (('ieq', 'a@bool', true), a),
>> +   (('ine', 'a@bool', false), a),
>
> I think this second one is already in the file on line 187 here:
>
># Boolean simplifications
>(('ine', 'a@bool', 0), 'a'),
>(('ieq', 'a@bool', 0), ('inot', 'a')),
>
> Maybe you could add the first one near these two? It could be good to
> add a line like this as well to complete the set:
>
>(('ine', 'a@bool', true), ('inot', a)),
>
> Regards,
> - Neil

Thanks. Both are good suggestions. I've squashed them into the patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 05/12] nir/lower_io: Get rid of load/store_foo_indirect

2015-12-08 Thread Kenneth Graunke

On Tuesday, December 08, 2015 01:46:22 PM Jason Ekstrand wrote:
> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> index f64ac69..a2723d5 100644
> --- a/src/glsl/nir/nir_lower_io.c
> +++ b/src/glsl/nir/nir_lower_io.c
> @@ -333,18 +311,18 @@ nir_lower_io(nir_shader *shader, nir_variable_mode mode,
>   * Return the indirect source for a load/store indirect intrinsic.

This comment could use updating.  How about:

/**
 * Return the offset soruce for a load/store intrinsic.
 */

>   */
>  nir_src *
> -nir_get_io_indirect_src(nir_intrinsic_instr *instr)
> +nir_get_io_offset_src(nir_intrinsic_instr *instr)
>  {
> switch (instr->intrinsic) {
> -   case nir_intrinsic_load_input_indirect:
> -   case nir_intrinsic_load_output_indirect:
> -   case nir_intrinsic_load_uniform_indirect:
> +   case nir_intrinsic_load_input:
> +   case nir_intrinsic_load_output:
> +   case nir_intrinsic_load_uniform:
>return >src[0];
> -   case nir_intrinsic_load_per_vertex_input_indirect:
> -   case nir_intrinsic_load_per_vertex_output_indirect:
> -   case nir_intrinsic_store_output_indirect:
> +   case nir_intrinsic_load_per_vertex_input:
> +   case nir_intrinsic_load_per_vertex_output:
> +   case nir_intrinsic_store_output:
>return >src[1];
> -   case nir_intrinsic_store_per_vertex_output_indirect:
> +   case nir_intrinsic_store_per_vertex_output:
>return >src[2];
> default:
>return NULL;


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-08 Thread Jason Ekstrand

On Tue, Dec 8, 2015 at 4:19 PM, Kenneth Graunke  wrote:
> On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote:
>> +/* src[] = { buffer_index, offset }. No const_index */
>> +LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE)
>
> I'm pretty sure const_index[0] is a writemask, so this comemnt is wrong.

For stores, yes, but not for loads.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-08 Thread Jason Ekstrand

On Tue, Dec 8, 2015 at 2:47 PM, Kenneth Graunke  wrote:
> On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote:
> [snip]
>> @@ -461,8 +455,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, 
>> print_state *state)
>> }
>>
>> nir_foreach_variable(var, var_list) {
>> -  if ((var->data.driver_location == instr->const_index[0]) &&
>> -  var->name) {
>> +  if ((var->data.driver_location == instr->const_index[0]) && 
>> var->name) {
>>   fprintf(fp, "\t/* %s */", var->name);
>>   break;
>>}
>>
>
> This last hunk seems unrelated - perhaps it should be a separate patch?
> That patch would get a R-b.

FIxed locally.  Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.

2015-12-08 Thread Dawid Gan

This format has been added in commit:
28090b30dd6b5977de085f48c620574214b6b4ba
But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format.
It was causing the screen in Supertuxkart to be darker than expected, see:
https://bugs.freedesktop.org/show_bug.cgi?id=92759

Cc: Boyan Ding 
Cc: "11.0 11.1" 
Fixes: 28090b30dd6 "i965: Add XRGB format to intel_screen_make_configs"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
---
 src/mesa/drivers/dri/i965/intel_screen.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index cc90efe..75d5a65 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
  stencil_bits[2] = 8;
  num_depth_stencil_bits = 3;
  }
+  } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
+ depth_bits[1] = 24;
+ stencil_bits[1] = 0;
   } else {
  depth_bits[1] = 24;
  stencil_bits[1] = 8;
@@ -1261,6 +1264,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
   if (formats[i] == MESA_FORMAT_B5G6R5_UNORM) {
  depth_bits[0] = 16;
  stencil_bits[0] = 0;
+  } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
+ depth_bits[0] = 24;
+ stencil_bits[0] = 0;
   } else {
  depth_bits[0] = 24;
  stencil_bits[0] = 8;
@@ -1301,6 +1307,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
   if (formats[i] == MESA_FORMAT_B5G6R5_UNORM) {
  depth_bits[1] = 16;
  stencil_bits[1] = 0;
+  } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
+ depth_bits[1] = 24;
+ stencil_bits[1] = 0;
   } else {
  depth_bits[1] = 24;
  stencil_bits[1] = 8;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 06/12] i965/fs: Get rid of load/store_foo_indirect

2015-12-08 Thread Jason Ekstrand

On Tue, Dec 8, 2015 at 4:44 PM, Kenneth Graunke  wrote:
> On Tuesday, December 08, 2015 01:46:23 PM Jason Ekstrand wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h   |   2 +-
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 
>> ++-
>>  src/mesa/drivers/dri/i965/brw_nir.c  |  45 ++---
>>  3 files changed, 84 insertions(+), 69 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> b/src/mesa/drivers/dri/i965/brw_fs.h
>> index bca4589..b55589f 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -278,7 +278,7 @@ public:
>> unsigned stream_id);
>> void emit_gs_thread_end();
>> void emit_gs_input_load(const fs_reg , const nir_src _src,
>> -   const fs_reg _offset, unsigned 
>> imm_offset,
>> +   unsigned base_offset, const nir_src _src,
>> unsigned num_components);
>> void emit_cs_terminate();
>> fs_reg *emit_cs_local_invocation_id_setup();
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 3754622..18c9d65 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1603,29 +1603,31 @@ fs_visitor::emit_gs_vertex(const nir_src 
>> _count_nir_src,
>>  void
>>  fs_visitor::emit_gs_input_load(const fs_reg ,
>> const nir_src _src,
>> -   const fs_reg _offset,
>> -   unsigned imm_offset,
>> +   unsigned base_offset,
>> +   const nir_src _src,
>> unsigned num_components)
>>  {
>> struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) 
>> prog_data;
>>
>> +   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
>> +   nir_const_value *offset_const = nir_src_as_const_value(offset_src);
>> +   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
>> +
>> /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y],
>>  * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w].  Only
>>  * gl_PointSize is available as a GS input, however, so it must be that.
>>  */
>> -   const bool is_point_size =
>> -  indirect_offset.file == BAD_FILE && imm_offset == 0;
>> -
>> -   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
>> -   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
>> +   const bool is_point_size = (base_offset == 0);
>>
>> -   if (indirect_offset.file == BAD_FILE && vertex_const != NULL &&
>> -   4 * imm_offset < push_reg_count) {
>> -  imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count;
>> +   if (offset_const != NULL && vertex_const != NULL &&
>> +   4 * offset_const->u[0] < push_reg_count) {
>
> This needs to be
>
> 4 * (base_offset + offset_const->u[0]) < push_reg_count
>
> otherwise we use the push model for inputs that aren't actually pushed,
> which breaks around 1500 piglit tests with INTEL_SCALAR_GS=1.
>
>> +  int imm_offset = (base_offset + offset_const->u[0]) * 4 +
>> +   vertex_const->u[0] * push_reg_count;
>>/* This input was pushed into registers. */
>>if (is_point_size) {
>>   /* gl_PointSize comes in .w */
>> - bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type));
>> + assert(imm_offset == 0);
>> + bld.MOV(dst, fs_reg(ATTR, 3, dst.type));
>
> This is wrong.  imm_offset contains the offset derived from the vertex
> index, and gl_PointSize exists for multiple vertices.  So it won't be 0.
>
> Please just remove this change, putting it back to:
>
>  bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type));
>
> That fixes assertion failures in 3 Piglit tests.

Fixed both locally.  Thanks for the careful review and for watching
out for scalar GS!

> With that fixed, patches 1-8 and 11-12 are:
> Reviewed-by: Kenneth Graunke 

Thanks!
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] Fix locking of GLsync objects

2015-12-08 Thread Steinar H. Gunderson

Hi,

I was told that it's easier for people to review my patch if it comes in via
email than being stuck in the bug tracker; FWIW, this is for bug 120238.
(It's the same patch as is already in the tracker.)

/* Steinar */

===

From 6e3d1880fa78a3a965cb7eb51ee12b1f785f84bb Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" 
Date: Tue, 1 Dec 2015 22:05:11 +0100
Subject: [PATCH] Fix locking of GLsync objects.

GLsync objects had a race condition when used from multiple threads
(which is the main point of the extension, really); it could be
validated as a sync object at the beginning of the function, and then
deleted by another thread before use, causing crashes. Fix this by
changing all casts from GLsync to struct gl_sync_object to a new
function _mesa_get_sync() that validates and increases the refcount.

In a similar vein, validation itself uses _mesa_set_search(), which
requires synchronization -- it was called without a mutex held, causing
spurious error returns and other issues. Since _mesa_get_sync() now
takes the shared context mutex, this problem is also resolved.

Signed-off-by: Steinar H. Gunderson 
---
 src/mesa/main/objectlabel.c | 11 --
 src/mesa/main/shared.c  |  2 +-
 src/mesa/main/syncobj.c | 89 ++---
 src/mesa/main/syncobj.h | 11 ++
 4 files changed, 64 insertions(+), 49 deletions(-)

diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c
index 41f370c..b083c43 100644
--- a/src/mesa/main/objectlabel.c
+++ b/src/mesa/main/objectlabel.c
@@ -288,7 +288,7 @@ void GLAPIENTRY
 _mesa_ObjectPtrLabel(const void *ptr, GLsizei length, const GLchar *label)
 {
GET_CURRENT_CONTEXT(ctx);
-   struct gl_sync_object *const syncObj = (struct gl_sync_object *) ptr;
+   struct gl_sync_object *syncObj = _mesa_get_sync(ctx, sync, true);
const char *callerstr;
char **labelPtr;
 
@@ -297,7 +297,7 @@ _mesa_ObjectPtrLabel(const void *ptr, GLsizei length, const 
GLchar *label)
else
   callerstr = "glObjectPtrLabelKHR";
 
-   if (!_mesa_validate_sync(ctx, syncObj)) {
+   if (!syncObj) {
   _mesa_error(ctx, GL_INVALID_VALUE, "%s (not a valid sync object)",
   callerstr);
   return;
@@ -306,6 +306,7 @@ _mesa_ObjectPtrLabel(const void *ptr, GLsizei length, const 
GLchar *label)
labelPtr = >Label;
 
set_label(ctx, labelPtr, label, length, callerstr);
+   _mesa_unref_sync_object(ctx, syncObj, 1);
 }
 
 void GLAPIENTRY
@@ -313,7 +314,7 @@ _mesa_GetObjectPtrLabel(const void *ptr, GLsizei bufSize, 
GLsizei *length,
 GLchar *label)
 {
GET_CURRENT_CONTEXT(ctx);
-   struct gl_sync_object *const syncObj = (struct gl_sync_object *) ptr;
+   struct gl_sync_object *syncObj;
const char *callerstr;
char **labelPtr;
 
@@ -328,7 +329,8 @@ _mesa_GetObjectPtrLabel(const void *ptr, GLsizei bufSize, 
GLsizei *length,
   return;
}
 
-   if (!_mesa_validate_sync(ctx, syncObj)) {
+   syncObj = _mesa_get_sync(ctx, sync, true);
+   if (!syncObj) {
   _mesa_error(ctx, GL_INVALID_VALUE, "%s (not a valid sync object)",
   callerstr);
   return;
@@ -337,4 +339,5 @@ _mesa_GetObjectPtrLabel(const void *ptr, GLsizei bufSize, 
GLsizei *length,
labelPtr = >Label;
 
copy_label(*labelPtr, label, length, bufSize);
+   _mesa_unref_sync_object(ctx, syncObj, 1);
 }
diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
index c37b31d..b9f7bb6 100644
--- a/src/mesa/main/shared.c
+++ b/src/mesa/main/shared.c
@@ -338,7 +338,7 @@ free_shared_state(struct gl_context *ctx, struct 
gl_shared_state *shared)
   struct set_entry *entry;
 
   set_foreach(shared->SyncObjects, entry) {
- _mesa_unref_sync_object(ctx, (struct gl_sync_object *) entry->key);
+ _mesa_unref_sync_object(ctx, (struct gl_sync_object *) entry->key, 1);
   }
}
_mesa_set_destroy(shared->SyncObjects, NULL);
diff --git a/src/mesa/main/syncobj.c b/src/mesa/main/syncobj.c
index c1b2d3b..d1c6c06 100644
--- a/src/mesa/main/syncobj.c
+++ b/src/mesa/main/syncobj.c
@@ -167,34 +167,42 @@ _mesa_free_sync_data(struct gl_context *ctx)
  *  - not in sync objects hash table
  *  - type is GL_SYNC_FENCE
  *  - not marked as deleted
+ *
+ * Returns the internal gl_sync_object pointer if the sync object is valid
+ * or NULL if it isn't.
+ *
+ * If "incRefCount" is true, the reference count is incremented, which is
+ * normally what you want; otherwise, a glDeleteSync from another thread
+ * could delete the sync object while you are still working on it.
  */
-bool
-_mesa_validate_sync(struct gl_context *ctx,
-const struct gl_sync_object *syncObj)
+struct gl_sync_object *
+_mesa_get_sync(struct gl_context *ctx, GLsync sync, bool incRefCount)
 {
-   return (syncObj != NULL)
+   struct gl_sync_object *syncObj = (struct gl_sync_object *) sync;
+   mtx_lock(>Shared->Mutex);
+   if (syncObj != NULL
   &&

[Mesa-dev] [PATCH] svga: initialize pipe_driver_query_info entries with a macro

2015-12-08 Thread Brian Paul

To be safe, set all the fields in case the enums ordering/values
ever change.
---
 src/gallium/drivers/svga/svga_screen.c | 43 ++
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_screen.c 
b/src/gallium/drivers/svga/svga_screen.c
index 09a3d33..fca501b 100644
--- a/src/gallium/drivers/svga/svga_screen.c
+++ b/src/gallium/drivers/svga/svga_screen.c
@@ -780,26 +780,39 @@ svga_get_driver_query_info(struct pipe_screen *screen,
unsigned index,
struct pipe_driver_query_info *info)
 {
+#define QUERY(NAME, ENUM, UNITS) \
+   {NAME, ENUM, {0}, UNITS, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE, 0, 0x0}
+
static const struct pipe_driver_query_info queries[] = {
   /* per-frame counters */
-  {"num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS, {0}},
-  {"num-fallbacks", SVGA_QUERY_NUM_FALLBACKS, {0}},
-  {"num-flushes", SVGA_QUERY_NUM_FLUSHES, {0}},
-  {"num-validations", SVGA_QUERY_NUM_VALIDATIONS, {0}},
-  {"map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME, {0},
-   PIPE_DRIVER_QUERY_TYPE_MICROSECONDS},
-  {"num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED, {0}},
-  {"num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED, {0},
-   PIPE_DRIVER_QUERY_TYPE_BYTES, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE},
+  QUERY("num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-fallbacks", SVGA_QUERY_NUM_FALLBACKS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-flushes", SVGA_QUERY_NUM_FLUSHES,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-validations", SVGA_QUERY_NUM_VALIDATIONS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME,
+PIPE_DRIVER_QUERY_TYPE_MICROSECONDS),
+  QUERY("num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED,
+PIPE_DRIVER_QUERY_TYPE_BYTES),
 
   /* running total counters */
-  {"memory-used", SVGA_QUERY_MEMORY_USED, {0},
-   PIPE_DRIVER_QUERY_TYPE_BYTES},
-  {"num-shaders", SVGA_QUERY_NUM_SHADERS, {0}},
-  {"num-resources", SVGA_QUERY_NUM_RESOURCES, {0}},
-  {"num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS, {0}},
-  {"num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS, {0}},
+  QUERY("memory-used", SVGA_QUERY_MEMORY_USED,
+PIPE_DRIVER_QUERY_TYPE_BYTES),
+  QUERY("num-shaders", SVGA_QUERY_NUM_SHADERS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-resources", SVGA_QUERY_NUM_RESOURCES,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
};
+#undef QUERY
 
if (!info)
   return Elements(queries);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 06/12] i965/fs: Get rid of load/store_foo_indirect

2015-12-08 Thread Kenneth Graunke

On Tuesday, December 08, 2015 01:46:23 PM Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h   |   2 +-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 
> ++-
>  src/mesa/drivers/dri/i965/brw_nir.c  |  45 ++---
>  3 files changed, 84 insertions(+), 69 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index bca4589..b55589f 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -278,7 +278,7 @@ public:
> unsigned stream_id);
> void emit_gs_thread_end();
> void emit_gs_input_load(const fs_reg , const nir_src _src,
> -   const fs_reg _offset, unsigned 
> imm_offset,
> +   unsigned base_offset, const nir_src _src,
> unsigned num_components);
> void emit_cs_terminate();
> fs_reg *emit_cs_local_invocation_id_setup();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 3754622..18c9d65 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1603,29 +1603,31 @@ fs_visitor::emit_gs_vertex(const nir_src 
> _count_nir_src,
>  void
>  fs_visitor::emit_gs_input_load(const fs_reg ,
> const nir_src _src,
> -   const fs_reg _offset,
> -   unsigned imm_offset,
> +   unsigned base_offset,
> +   const nir_src _src,
> unsigned num_components)
>  {
> struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) 
> prog_data;
>  
> +   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
> +   nir_const_value *offset_const = nir_src_as_const_value(offset_src);
> +   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
> +
> /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y],
>  * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w].  Only
>  * gl_PointSize is available as a GS input, however, so it must be that.
>  */
> -   const bool is_point_size =
> -  indirect_offset.file == BAD_FILE && imm_offset == 0;
> -
> -   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
> -   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
> +   const bool is_point_size = (base_offset == 0);
>  
> -   if (indirect_offset.file == BAD_FILE && vertex_const != NULL &&
> -   4 * imm_offset < push_reg_count) {
> -  imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count;
> +   if (offset_const != NULL && vertex_const != NULL &&
> +   4 * offset_const->u[0] < push_reg_count) {

This needs to be

4 * (base_offset + offset_const->u[0]) < push_reg_count

otherwise we use the push model for inputs that aren't actually pushed,
which breaks around 1500 piglit tests with INTEL_SCALAR_GS=1.

> +  int imm_offset = (base_offset + offset_const->u[0]) * 4 +
> +   vertex_const->u[0] * push_reg_count;
>/* This input was pushed into registers. */
>if (is_point_size) {
>   /* gl_PointSize comes in .w */
> - bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type));
> + assert(imm_offset == 0);
> + bld.MOV(dst, fs_reg(ATTR, 3, dst.type));

This is wrong.  imm_offset contains the offset derived from the vertex
index, and gl_PointSize exists for multiple vertices.  So it won't be 0.

Please just remove this change, putting it back to:

 bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type));

That fixes assertion failures in 3 Piglit tests.

With that fixed, patches 1-8 and 11-12 are:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-08 Thread Kenneth Graunke

On Tuesday, December 08, 2015 01:46:20 PM Jason Ekstrand wrote:
> +/* src[] = { buffer_index, offset }. No const_index */
> +LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE)

I'm pretty sure const_index[0] is a writemask, so this comemnt is wrong.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] svga: initialize pipe_driver_query_info entries with a macro

2015-12-08 Thread Charmaine Lee

Reviewed-by: Charmaine Lee 

From: Brian Paul 
Sent: Tuesday, December 8, 2015 4:35 PM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee
Subject: [PATCH] svga: initialize pipe_driver_query_info entries with a macro

To be safe, set all the fields in case the enums ordering/values
ever change.
---
 src/gallium/drivers/svga/svga_screen.c | 43 ++
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_screen.c 
b/src/gallium/drivers/svga/svga_screen.c
index 09a3d33..fca501b 100644
--- a/src/gallium/drivers/svga/svga_screen.c
+++ b/src/gallium/drivers/svga/svga_screen.c
@@ -780,26 +780,39 @@ svga_get_driver_query_info(struct pipe_screen *screen,
unsigned index,
struct pipe_driver_query_info *info)
 {
+#define QUERY(NAME, ENUM, UNITS) \
+   {NAME, ENUM, {0}, UNITS, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE, 0, 0x0}
+
static const struct pipe_driver_query_info queries[] = {
   /* per-frame counters */
-  {"num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS, {0}},
-  {"num-fallbacks", SVGA_QUERY_NUM_FALLBACKS, {0}},
-  {"num-flushes", SVGA_QUERY_NUM_FLUSHES, {0}},
-  {"num-validations", SVGA_QUERY_NUM_VALIDATIONS, {0}},
-  {"map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME, {0},
-   PIPE_DRIVER_QUERY_TYPE_MICROSECONDS},
-  {"num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED, {0}},
-  {"num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED, {0},
-   PIPE_DRIVER_QUERY_TYPE_BYTES, PIPE_DRIVER_QUERY_RESULT_TYPE_AVERAGE},
+  QUERY("num-draw-calls", SVGA_QUERY_NUM_DRAW_CALLS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-fallbacks", SVGA_QUERY_NUM_FALLBACKS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-flushes", SVGA_QUERY_NUM_FLUSHES,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-validations", SVGA_QUERY_NUM_VALIDATIONS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("map-buffer-time", SVGA_QUERY_MAP_BUFFER_TIME,
+PIPE_DRIVER_QUERY_TYPE_MICROSECONDS),
+  QUERY("num-resources-mapped", SVGA_QUERY_NUM_RESOURCES_MAPPED,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-bytes-uploaded", SVGA_QUERY_NUM_BYTES_UPLOADED,
+PIPE_DRIVER_QUERY_TYPE_BYTES),

   /* running total counters */
-  {"memory-used", SVGA_QUERY_MEMORY_USED, {0},
-   PIPE_DRIVER_QUERY_TYPE_BYTES},
-  {"num-shaders", SVGA_QUERY_NUM_SHADERS, {0}},
-  {"num-resources", SVGA_QUERY_NUM_RESOURCES, {0}},
-  {"num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS, {0}},
-  {"num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS, {0}},
+  QUERY("memory-used", SVGA_QUERY_MEMORY_USED,
+PIPE_DRIVER_QUERY_TYPE_BYTES),
+  QUERY("num-shaders", SVGA_QUERY_NUM_SHADERS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-resources", SVGA_QUERY_NUM_RESOURCES,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-state-objects", SVGA_QUERY_NUM_STATE_OBJECTS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
+  QUERY("num-surface-views", SVGA_QUERY_NUM_SURFACE_VIEWS,
+PIPE_DRIVER_QUERY_TYPE_UINT64),
};
+#undef QUERY

if (!info)
   return Elements(queries);
--
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.

2015-12-08 Thread Ilia Mirkin

On Mon, Dec 7, 2015 at 5:32 PM, Dawid Gan  wrote:
> This format has been added in commit:
> 28090b30dd6b5977de085f48c620574214b6b4ba
> But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format.
> It was causing the screen in Supertuxkart to be darker than expected, see:
> https://bugs.freedesktop.org/show_bug.cgi?id=92759
>
> Cc: Boyan Ding 
> Cc: "11.0 11.1" 
> Fixes: 28090b30dd6 "i965: Add XRGB format to intel_screen_make_configs"
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index cc90efe..75d5a65 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
>   stencil_bits[2] = 8;
>   num_depth_stencil_bits = 3;
>   }
> +  } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
> + depth_bits[1] = 24;
> + stencil_bits[1] = 0;

Why would you want depth without stencil when using BGRX? I don't see
how the two are connected... Are you sure you're picking the right
visual?

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-08 Thread Jonathan Gray

Change the __m128i variables to be volatile so gcc 4.9 won't optimise
all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
still get optimised out but now there is at least one SSE4.1 instruction
generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
got optimised out the configure test would incorrectly pass when the
compiler supported the intrinsics and the assembler didn't support the
instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
Signed-off-by: Jonathan Gray 
Cc: "11.0 11.1" 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 260934d..1d82e47 100644
--- a/configure.ac
+++ b/configure.ac
@@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
 AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
 #include 
 int main () {
-__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
+volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
 c = _mm_max_epu32(a, b);
 return 0;
 }]])], SSE41_SUPPORTED=1)
-- 
2.6.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/8] glsl: only update doubles inputs for vertex inputs.

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

This doesn't apply to other stages. This is only
used in the mesa/st code, which needs further fixes.

Signed-off-by: Dave Airlie 
---
 src/glsl/ir_set_program_inouts.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ir_set_program_inouts.cpp 
b/src/glsl/ir_set_program_inouts.cpp
index 838b5e7..c9b8a38 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -115,8 +115,11 @@ mark(struct gl_program *prog, ir_variable *var, int 
offset, int len,
  else
 prog->InputsRead |= bitfield;
 
- if (var->type->without_array()->is_dual_slot_double())
+ /* double inputs read is only for vertex inputs */
+ if (stage == MESA_SHADER_VERTEX &&
+ var->type->without_array()->is_dual_slot_double())
 prog->DoubleInputsRead |= bitfield;
+
  if (stage == MESA_SHADER_FRAGMENT) {
 gl_fragment_program *fprog = (gl_fragment_program *) prog;
 fprog->InterpQualifier[idx] =
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/8] glsl: pass stage into mark function

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

Don't use a bool here, as for some 64-bit fixes we need
the stage.

Signed-off-by: Dave Airlie 
---
 src/glsl/ir_set_program_inouts.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ir_set_program_inouts.cpp 
b/src/glsl/ir_set_program_inouts.cpp
index d7c29b0..70d754f 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -90,7 +90,7 @@ is_dual_slot(ir_variable *var)
 
 static void
 mark(struct gl_program *prog, ir_variable *var, int offset, int len,
- bool is_fragment_shader)
+ gl_shader_stage stage)
 {
/* As of GLSL 1.20, varyings can only be floats, floating-point
 * vectors or matrices, or arrays of them.  For Mesa programs using
@@ -125,7 +125,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, 
int len,
 
  if (dual_slot)
 prog->DoubleInputsRead |= bitfield;
- if (is_fragment_shader) {
+ if (stage == MESA_SHADER_FRAGMENT) {
 gl_fragment_program *fprog = (gl_fragment_program *) prog;
 fprog->InterpQualifier[idx] =
(glsl_interp_qualifier) var->data.interpolation;
@@ -178,7 +178,7 @@ 
ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var)
}
 
mark(this->prog, var, 0, type->count_attribute_slots(),
-this->shader_stage == MESA_SHADER_FRAGMENT);
+this->shader_stage);
 }
 
 /* Default handler: Mark all the locations in the variable as used. */
@@ -302,7 +302,7 @@ 
ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var,
}
 
mark(this->prog, var, index_as_constant->value.u[0] * elem_width,
-elem_width, this->shader_stage == MESA_SHADER_FRAGMENT);
+elem_width, this->shader_stage);
return true;
 }
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/8] glsl/fp64: add helper for dual slot double detection.

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

The old function didn't work for matrices, and we need this
in other places to fix some other problems, so move to a helper
in glsl type and fix the one user so far.

A dual slot double is one that has 3 or 4 components in it's
base type.

Signed-off-by: Dave Airlie 
---
 src/glsl/ir_set_program_inouts.cpp | 10 +-
 src/glsl/nir/glsl_types.h  |  8 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/glsl/ir_set_program_inouts.cpp 
b/src/glsl/ir_set_program_inouts.cpp
index 70d754f..782f1b1 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -81,13 +81,6 @@ is_shader_inout(ir_variable *var)
   var->data.mode == ir_var_system_value;
 }
 
-static inline bool
-is_dual_slot(ir_variable *var)
-{
-   const glsl_type *type = var->type->without_array();
-   return type == glsl_type::dvec4_type || type == glsl_type::dvec3_type;
-}
-
 static void
 mark(struct gl_program *prog, ir_variable *var, int offset, int len,
  gl_shader_stage stage)
@@ -101,7 +94,6 @@ mark(struct gl_program *prog, ir_variable *var, int offset, 
int len,
 */
 
for (int i = 0; i < len; i++) {
-  bool dual_slot = is_dual_slot(var);
   int idx = var->data.location + var->data.index + offset + i;
   bool is_patch_generic = var->data.patch &&
   idx != VARYING_SLOT_TESS_LEVEL_INNER &&
@@ -123,7 +115,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, 
int len,
  else
 prog->InputsRead |= bitfield;
 
- if (dual_slot)
+ if (var->type->without_array()->is_dual_slot_double())
 prog->DoubleInputsRead |= bitfield;
  if (stage == MESA_SHADER_FRAGMENT) {
 gl_fragment_program *fprog = (gl_fragment_program *) prog;
diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h
index d8a999a..26f25a1 100644
--- a/src/glsl/nir/glsl_types.h
+++ b/src/glsl/nir/glsl_types.h
@@ -471,6 +471,14 @@ struct glsl_type {
}
 
/**
+* Query whether a double takes two slots.
+*/
+   bool is_dual_slot_double() const
+   {
+  return base_type == GLSL_TYPE_DOUBLE && vector_elements > 2;
+   }
+
+   /**
 * Query whether or not a type is a non-array boolean type
 */
bool is_boolean() const
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/8] glsl: use dual slot helper in the linker code.

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

Signed-off-by: Dave Airlie 
---
 src/glsl/linker.cpp | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index ae628cd..89659c7 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2603,17 +2603,8 @@ assign_attribute_or_color_locations(gl_shader_program 
*prog,
  * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this
  * is optional behavior, but it seems preferable.
  */
-const glsl_type *type = var->type->without_array();
-if (type == glsl_type::dvec3_type ||
-type == glsl_type::dvec4_type ||
-type == glsl_type::dmat2x3_type ||
-type == glsl_type::dmat2x4_type ||
-type == glsl_type::dmat3_type ||
-type == glsl_type::dmat3x4_type ||
-type == glsl_type::dmat4x3_type ||
-type == glsl_type::dmat4_type) {
+if (var->type->without_array()->is_dual_slot_double())
double_storage_locations |= (use_mask << attr);
-}
 }
 
 continue;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/8] glsl: fix count_attribute_slots to allow for different 64-bit handling

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

So vertex shader input attributes are handled different than internal
varyings between shader stages, dvec3 and dvec4 only count as
one slot for vertex attributes, but for internal varyings, they
count as 2.

This patch comments all the uses of this API to clarify what we
pass in, except one which needs further investigation

Signed-off-by: Dave Airlie 
---
 src/glsl/ir_set_program_inouts.cpp |  7 ++-
 src/glsl/link_varyings.cpp |  6 --
 src/glsl/linker.cpp|  5 +++--
 src/glsl/nir/glsl_types.cpp| 18 +-
 src/glsl/nir/glsl_types.h  |  5 -
 5 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/src/glsl/ir_set_program_inouts.cpp 
b/src/glsl/ir_set_program_inouts.cpp
index 782f1b1..838b5e7 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -146,6 +146,7 @@ void
 ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var)
 {
const glsl_type *type = var->type;
+   bool vertex_input = false;
if (this->shader_stage == MESA_SHADER_GEOMETRY &&
var->data.mode == ir_var_shader_in && type->is_array()) {
   type = type->fields.array;
@@ -169,7 +170,11 @@ 
ir_set_program_inouts_visitor::mark_whole_variable(ir_variable *var)
   type = type->fields.array;
}
 
-   mark(this->prog, var, 0, type->count_attribute_slots(),
+   if (this->shader_stage == MESA_SHADER_VERTEX &&
+   var->data.mode == ir_var_shader_in)
+  vertex_input = true;
+
+   mark(this->prog, var, 0, type->count_attribute_slots(vertex_input),
 this->shader_stage);
 }
 
diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 71750d1..9f6467b 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -1712,7 +1712,8 @@ check_against_output_limit(struct gl_context *ctx,
 
   if (var && var->data.mode == ir_var_shader_out &&
   var_counts_against_varying_limit(producer->Stage, var)) {
- output_vectors += var->type->count_attribute_slots();
+ /* outputs for fragment shader can't be doubles */
+ output_vectors += var->type->count_attribute_slots(false);
   }
}
 
@@ -1753,7 +1754,8 @@ check_against_input_limit(struct gl_context *ctx,
 
   if (var && var->data.mode == ir_var_shader_in &&
   var_counts_against_varying_limit(consumer->Stage, var)) {
- input_vectors += var->type->count_attribute_slots();
+ /* vertex inputs aren't varying counted */
+ input_vectors += var->type->count_attribute_slots(false);
   }
}
 
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 89659c7..1670036 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2466,7 +2466,7 @@ assign_attribute_or_color_locations(gl_shader_program 
*prog,
  return false;
   }
 
-  const unsigned slots = var->type->count_attribute_slots();
+  const unsigned slots = var->type->count_attribute_slots(false);
 
   /* If the variable is not a built-in and has a location statically
* assigned in the shader (presumably via a layout qualifier), make sure
@@ -2995,7 +2995,8 @@ check_image_resources(struct gl_context *ctx, struct 
gl_shader_program *prog)
 foreach_in_list(ir_instruction, node, sh->ir) {
ir_variable *var = node->as_variable();
if (var && var->data.mode == ir_var_shader_out)
-  fragment_outputs += var->type->count_attribute_slots();
+  /* since there are no double fs outputs - pass false */
+  fragment_outputs += var->type->count_attribute_slots(false);
 }
  }
   }
diff --git a/src/glsl/nir/glsl_types.cpp b/src/glsl/nir/glsl_types.cpp
index 3cf2f03..44d3056 100644
--- a/src/glsl/nir/glsl_types.cpp
+++ b/src/glsl/nir/glsl_types.cpp
@@ -1640,7 +1640,7 @@ glsl_type::std430_size(bool row_major) const
 }
 
 unsigned
-glsl_type::count_attribute_slots() const
+glsl_type::count_attribute_slots(bool vertex_input_slots) const
 {
/* From page 31 (page 37 of the PDF) of the GLSL 1.50 spec:
 *
@@ -1661,27 +1661,35 @@ glsl_type::count_attribute_slots() const
 * allows varying structs, the number of varying slots taken up by a
 * varying struct is simply equal to the sum of the number of slots taken
 * up by each element.
+*
+* Doubles are counted different depending on whether they are vertex
+* inputs or everything else. Vertex inputs from ARB_vertex_attrib_64bit
+* take one location no matter what size they are, otherwise dvec3/4
+* take two locations.
 */
switch (this->base_type) {
case GLSL_TYPE_UINT:
case GLSL_TYPE_INT:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_BOOL:
-   case GLSL_TYPE_DOUBLE:
   return this->matrix_columns;
-
+   case GLSL_TYPE_DOUBLE:
+  if (this->vector_elements > 2 && !vertex_input_slots)
+ return

[Mesa-dev] glsl fp64 fixes.

2015-12-08 Thread Dave Airlie

I've been trying to get GL41-CTS.gpu_shader_fp64.varyings to
pass on mesa for a while now on radeonsi/r600.

I finally got it to pass, so this is the first round of changes.

The second round are all st/glsl->tgsi convertor changes, so I
decided to post the GLSL changes first.

The main changes are ARB_vertex_attrib_64bit related, where
the vertex input locations are counted different than the other
areas, but it also fixes transform feedback with doubles.

It also introduces a new types helper and uses it in a couple
of places.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/8] glsl: fix partial marking for fp64 types.

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

This doubles the element width for the types that are greater
than 2 elements wide.

Signed-off-by: Dave Airlie 
---
 src/glsl/ir_set_program_inouts.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/ir_set_program_inouts.cpp 
b/src/glsl/ir_set_program_inouts.cpp
index c9b8a38..a2dea67 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -301,6 +301,13 @@ 
ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var,
   return false;
}
 
+   /* double element width for double types that takes two slots */
+   if (this->shader_stage != MESA_SHADER_VERTEX ||
+   var->data.mode != ir_var_shader_in) {
+  if (type->without_array()->is_dual_slot_double())
+elem_width *= 2;
+   }
+
mark(this->prog, var, index_as_constant->value.u[0] * elem_width,
 elem_width, this->shader_stage);
return true;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50/ir: fix texture grad for cubemaps

2015-12-08 Thread Ilia Mirkin

We were ignoring the partial derivatives on the last dim.

Signed-off-by: Ilia Mirkin 
---

A little frightening that we weren't failing any tests... perhaps I
got this wrong somehow?

 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp | 2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp  | 2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 7 +++
 4 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 1d2caab..b233860 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1897,7 +1897,7 @@ Converter::handleTEX(Value *dst[4], int R, int S, int L, 
int C, int Dx, int Dy)
   shd = fetchSrc(C >> 4, C & 3);
 
if (texi->op == OP_TXD) {
-  for (c = 0; c < tgt.getDim(); ++c) {
+  for (c = 0; c < tgt.getDim() + tgt.isCube(); ++c) {
  texi->dPdx[c].set(fetchSrc(Dx >> 4, (Dx & 3) + c));
  texi->dPdy[c].set(fetchSrc(Dy >> 4, (Dy & 3) + c));
   }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp
index 420cc4e..0b90378 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp
@@ -57,7 +57,7 @@ GM107LoweringPass::handleManualTXD(TexInstruction *i)
Instruction *tex, *add;
Value *zero = bld.loadImm(bld.getSSA(), 0);
int l, c;
-   const int dim = i->tex.target.getDim();
+   const int dim = i->tex.target.getDim() + i->tex.target.isCube();
const int array = i->tex.target.isArray();
 
i->op = OP_TEX; // no need to clone dPdx/dPdy later
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index 82f7a8f..19965ff 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
@@ -911,7 +911,7 @@ NV50LoweringPreSSA::handleTXD(TexInstruction *i)
Instruction *tex;
Value *zero = bld.loadImm(bld.getSSA(), 0);
int l, c;
-   const int dim = i->tex.target.getDim();
+   const int dim = i->tex.target.getDim() + i->tex.target.isCube();
 
handleTEX(i);
i->op = OP_TEX; // no need to clone dPdx/dPdy later
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 0f575f2..4f430cd 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -870,7 +870,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
Instruction *tex;
Value *zero = bld.loadImm(bld.getSSA(), 0);
int l, c;
-   const int dim = i->tex.target.getDim();
+   const int dim = i->tex.target.getDim() + i->tex.target.isCube();
const int array = i->tex.target.isArray();
 
i->op = OP_TEX; // no need to clone dPdx/dPdy later
@@ -917,7 +917,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
 bool
 NVC0LoweringPass::handleTXD(TexInstruction *txd)
 {
-   int dim = txd->tex.target.getDim();
+   int dim = txd->tex.target.getDim() + txd->tex.target.isCube();
unsigned arg = txd->tex.target.getArgCount();
unsigned expected_args = arg;
const int chipset = prog->getTarget()->getChipset();
@@ -937,8 +937,7 @@ NVC0LoweringPass::handleTXD(TexInstruction *txd)
 
if (expected_args > 4 ||
dim > 2 ||
-   txd->tex.target.isShadow() ||
-   txd->tex.target.isCube())
+   txd->tex.target.isShadow())
   txd->op = OP_TEX;
 
handleTEX(txd);
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] clover: Fix build against LLVM 3.8 SVN >= r255078

2015-12-08 Thread Michel Dänzer

From: Michel Dänzer 

Signed-off-by: Michel Dänzer 
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 3b37f08..4d11c24 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -661,7 +661,11 @@ namespace {
 
   if (dump_asm) {
  LLVMSetTargetMachineAsmVerbosity(tm, true);
+#if HAVE_LLVM >= 0x0308
+ LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod).release());
+#else
  LLVMModuleRef debug_mod = wrap(llvm::CloneModule(mod));
+#endif
  emit_code(tm, debug_mod, LLVMAssemblyFile, _buffer, r_log);
  buffer_size = LLVMGetBufferSize(out_buffer);
  buffer_data = LLVMGetBufferStart(out_buffer);
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression

2015-12-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #15 from nerdopol...@verizon.net ---
(In reply to Daniel Stone from comment #14)
> Ugh, so I think I'm using the same options, and I can't see what's going
> wrong with either big GL or GLES. To be honest, I'm at a total loss ... can
> you break on _mesa_error when running SDL to see where it's triggered?

Doesn't seem that break _mesa_error works, it's not defined...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] softpipe: V.2 implement some support for multiple viewports

2015-12-08 Thread Edward O'Callaghan

This fixes my initial attempt so that piglit now passes 14/14. Thanks
to a couple of tips from Roland in the previous patch I was able to
fix the remaining issue. This should be golden now.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 7/8] glsl: fix transform feedback for 64-bit outupts.

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

This fixes the calculations for transform feedback for doubles.

Signed-off-by: Dave Airlie 
---
 src/glsl/link_varyings.cpp |  7 +--
 src/glsl/link_varyings.h   | 26 +-
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 9f6467b..9cc77fe 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -432,6 +432,8 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
  this->matched_candidate->type->fields.array->matrix_columns;
   const unsigned vector_elements =
  this->matched_candidate->type->fields.array->vector_elements;
+  const unsigned dmul =
+ this->matched_candidate->type->fields.array->is_double() ? 2 : 1;
   unsigned actual_array_size;
   switch (this->lowered_builtin_array_variable) {
   case clip_distance:
@@ -459,7 +461,7 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
 return false;
  }
  unsigned array_elem_size = this->lowered_builtin_array_variable ?
-1 : vector_elements * matrix_cols;
+1 : vector_elements * matrix_cols * dmul;
  fine_location += array_elem_size * this->array_subscript;
  this->size = 1;
   } else {
@@ -519,7 +521,6 @@ tfeedback_decl::get_num_outputs() const
if (!this->is_varying()) {
   return 0;
}
-
return (this->num_components() + this->location_frac + 3)/4;
 }
 
@@ -968,6 +969,8 @@ varying_matches::record(ir_variable *producer_var, 
ir_variable *consumer_var)
   } else {
  slots = type->matrix_columns;
   }
+  if (type->without_array()->is_dual_slot_double())
+ slots *= 2;
   this->matches[this->num_matches].num_components = 4 * slots;
} else {
   this->matches[this->num_matches].num_components
diff --git a/src/glsl/link_varyings.h b/src/glsl/link_varyings.h
index 2ce72d4..1d12978 100644
--- a/src/glsl/link_varyings.h
+++ b/src/glsl/link_varyings.h
@@ -131,7 +131,8 @@ public:
   if (this->lowered_builtin_array_variable)
  return this->size;
   else
- return this->vector_elements * this->matrix_columns * this->size;
+ return this->vector_elements * this->matrix_columns * this->size *
+(this->is_double() ? 2 : 1);
}
 
unsigned get_location() const {
@@ -139,6 +140,29 @@ public:
}
 
 private:
+
+   bool is_double() const
+   {
+  switch (this->type) {
+  case GL_DOUBLE:
+  case GL_DOUBLE_VEC2:
+  case GL_DOUBLE_VEC3:
+  case GL_DOUBLE_VEC4:
+  case GL_DOUBLE_MAT2:
+  case GL_DOUBLE_MAT2x3:
+  case GL_DOUBLE_MAT2x4:
+  case GL_DOUBLE_MAT3:
+  case GL_DOUBLE_MAT3x2:
+  case GL_DOUBLE_MAT3x4:
+  case GL_DOUBLE_MAT4:
+  case GL_DOUBLE_MAT4x2:
+  case GL_DOUBLE_MAT4x3:
+ return true;
+  default:
+ return false;
+  }
+   }
+
/**
 * The name that was supplied to glTransformFeedbackVaryings.  Used for
 * error reporting and glGetTransformFeedbackVarying().
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 8/8] glsl: only divide left components when it is a dual slot double.

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

Signed-off-by: Dave Airlie 
---
 src/glsl/lower_packed_varyings.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index 037c27d..ec9af62 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -472,7 +472,7 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue 
*rvalue,
   char right_swizzle_name[4] = { 0, 0, 0, 0 };
 
   left_components = 4 - fine_location % 4;
-  if (rvalue->type->is_double()) {
+  if (rvalue->type->is_dual_slot_double()) {
  /* We might actually end up with 0 left components! */
  left_components /= 2;
   }
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] llvmpipe: fix layer/vp input into fs when not written by prior stages

2015-12-08 Thread sroland

From: Roland Scheidegger 

ARB_fragment_layer_viewport requires that if a fs reads layer or viewport
index but it wasn't output by gs (or vs with other extensions), then it reads
0. This never worked for llvmpipe, and is surprisingly non-trivial to fix.
The problem is the mechanism to handle non-existing outputs in draw is rather
crude, it will simply redirect them to whatever is at output 0, thus later
stages will just get garbage. So, rather than trying to fix this up (which
looks non-trivial), fix this up in llvmpipe setup by detecting this case there
and output a fixed zero directly.
While here, also optimize the hw vertex layout a bit - previously if the gs
outputted layer (or vp) and the fs read those inputs, we'd add them twice
to the vertex layout, which is unnecessary.
And do some minor cleanup, slots don't require that many bits, there was some
bogus (but harmless) float/int mixup for psize slot too, make the slots all
unsigned (we always put pos at pos zero thus everything else has to be positive
if it exists), and make sure they are properly initialized (layer and vp index
slot were not which looked fishy as they might not have got set back to zero
when changing from a gs which outputs them to one which does not).

This fixes the failures in piglit's arb_fragment_layer_viewport group
(3 each for layer and vp).
---
 src/gallium/drivers/llvmpipe/lp_bld_interp.h|  3 +-
 src/gallium/drivers/llvmpipe/lp_context.h   | 18 --
 src/gallium/drivers/llvmpipe/lp_setup.c |  2 +-
 src/gallium/drivers/llvmpipe/lp_setup_context.h |  8 +--
 src/gallium/drivers/llvmpipe/lp_setup_point.c   |  2 +-
 src/gallium/drivers/llvmpipe/lp_state_derived.c | 78 +
 src/gallium/drivers/llvmpipe/lp_state_setup.c   | 29 +
 src/gallium/drivers/llvmpipe/lp_state_setup.h   |  9 ++-
 8 files changed, 96 insertions(+), 53 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_bld_interp.h 
b/src/gallium/drivers/llvmpipe/lp_bld_interp.h
index 9029d2a..0a52642 100644
--- a/src/gallium/drivers/llvmpipe/lp_bld_interp.h
+++ b/src/gallium/drivers/llvmpipe/lp_bld_interp.h
@@ -63,7 +63,8 @@ enum lp_interp {
LP_INTERP_LINEAR,
LP_INTERP_PERSPECTIVE,
LP_INTERP_POSITION,
-   LP_INTERP_FACING
+   LP_INTERP_FACING,
+   LP_INTERP_ZERO
 };
 
 struct lp_shader_input {
diff --git a/src/gallium/drivers/llvmpipe/lp_context.h 
b/src/gallium/drivers/llvmpipe/lp_context.h
index c9a5d67..9dcc102 100644
--- a/src/gallium/drivers/llvmpipe/lp_context.h
+++ b/src/gallium/drivers/llvmpipe/lp_context.h
@@ -108,22 +108,28 @@ struct llvmpipe_context {
struct vertex_info vertex_info;

/** Which vertex shader output slot contains color */
-   int color_slot[2];
+   uint8_t color_slot[2];
 
/** Which vertex shader output slot contains bcolor */
-   int bcolor_slot[2];
+   uint8_t bcolor_slot[2];
 
/** Which vertex shader output slot contains point size */
-   int psize_slot;
+   uint8_t psize_slot;
 
/** Which vertex shader output slot contains viewport index */
-   int viewport_index_slot;
+   uint8_t viewport_index_slot;
 
/** Which geometry shader output slot contains layer */
-   int layer_slot;
+   uint8_t layer_slot;
 
/** A fake frontface output for unfilled primitives */
-   int face_slot;
+   uint8_t face_slot;
+
+   /** Which output slot is used for the fake vp index info */
+   uint8_t fake_vpindex_slot;
+
+   /** Which output slot is used for the fake layer info */
+   uint8_t fake_layer_slot;
 
/** Depth format and bias settings. */
boolean floating_point_depth;
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
b/src/gallium/drivers/llvmpipe/lp_setup.c
index 1778b13..ddbb88e 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -1207,7 +1207,7 @@ lp_setup_update_state( struct lp_setup_context *setup,
   /* Will probably need to move this somewhere else, just need  
* to know about vertex shader point size attribute.
*/
-  setup->psize = lp->psize_slot;
+  setup->psize_slot = lp->psize_slot;
   setup->viewport_index_slot = lp->viewport_index_slot;
   setup->layer_slot = lp->layer_slot;
   setup->face_slot = lp->face_slot;
diff --git a/src/gallium/drivers/llvmpipe/lp_setup_context.h 
b/src/gallium/drivers/llvmpipe/lp_setup_context.h
index 2410e23..4451284 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup_context.h
+++ b/src/gallium/drivers/llvmpipe/lp_setup_context.h
@@ -105,10 +105,10 @@ struct lp_setup_context
float pixel_offset;
float line_width;
float point_size;
-   float psize;
-   unsigned viewport_index_slot;
-   unsigned layer_slot;
-   int face_slot;
+   uint8_t psize_slot;
+   uint8_t viewport_index_slot;
+   uint8_t layer_slot;
+   uint8_t face_slot;
 
struct pipe_framebuffer_state fb;
struct u_rect framebuffer;
diff --git a/src/gallium/drivers/llvmpipe/lp_setup_point.c

Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports

2015-12-08 Thread eocallaghan

The following changes since commit 
a13b14930d94b024160fe17814e091356d07f7fb:


  llvmpipe: fix fp64 inputs to geom shader. (2015-12-09 13:56:39 +1000)

are available in the git repository at:

  https://github.com/victoredwardocallaghan/mesa-GLwork.git 
softpipe-ARB_viewport_array


for you to fetch changes up to 04e17cf34ba04f99e2536f2e598a2ee7d8bb1237:

  softpipe: Update GL3.txt doc with GL_ARB_viewport_array support 
(2015-12-09 15:09:17 +1100)



Edward O'Callaghan (2):
  softpipe: implement some support for multiple viewports
  softpipe: Update GL3.txt doc with GL_ARB_viewport_array support

 docs/GL3.txt  |  2 +-
 src/gallium/drivers/softpipe/sp_context.h |  9 ++-
 src/gallium/drivers/softpipe/sp_quad.h|  1 +
 src/gallium/drivers/softpipe/sp_quad_depth_test.c |  5 +-
 src/gallium/drivers/softpipe/sp_screen.c  |  2 +-
 src/gallium/drivers/softpipe/sp_setup.c   | 49 +++
 src/gallium/drivers/softpipe/sp_setup.h   |  5 ++
 src/gallium/drivers/softpipe/sp_state_clip.c  | 15 +++--
 src/gallium/drivers/softpipe/sp_state_derived.c   | 76 
+++

 src/gallium/drivers/softpipe/sp_surface.c |  4 +-
 10 files changed, 114 insertions(+), 54 deletions(-)


On 2015-12-09 15:16, Edward O'Callaghan wrote:

This fixes my initial attempt so that piglit now passes 14/14. Thanks
to a couple of tips from Roland in the previous patch I was able to
fix the remaining issue. This should be golden now.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] configure.ac: use pkg-config for libelf

2015-12-08 Thread Jonathan Gray

Use PKG_CHECK_MODULES to get the flags to link libelf

v2: keep AC_CHECK_LIB as a fallback for elfutils provided
libelf that doesn't install a pkg-config file.

Signed-off-by: Jonathan Gray 
Reviewed-by: Michel D�nzer 
Tested-by: Michel D�nzer 
Cc: "11.0 11.1" 
---
 configure.ac   | 12 +---
 src/gallium/drivers/radeon/Makefile.am |  5 +++--
 src/gallium/targets/opencl/Makefile.am |  5 -
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index 25fdc39..260934d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1724,7 +1724,15 @@ AC_ARG_WITH([clang-libdir],
[CLANG_LIBDIR=''])
 
 PKG_CHECK_EXISTS([libclc], [have_libclc=yes], [have_libclc=no])
-AC_CHECK_LIB([elf], [elf_memory], [have_libelf=yes;ELF_LIB=-lelf])
+PKG_CHECK_MODULES([LIBELF], [libelf], [have_libelf=yes], [have_libelf=no])
+
+if test "x$have_libelf" = xno; then
+   LIBELF_LIBS=''
+   LIBELF_CFLAGS=''
+   AC_CHECK_LIB([elf], [elf_memory], [have_libelf=yes;LIBELF_LIBS=-lelf], 
[have_libelf=no])
+   AC_SUBST([LIBELF_LIBS])
+   AC_SUBST([LIBELF_CFLAGS])
+fi
 
 if test "x$enable_opencl" = xyes; then
 if test -z "$with_gallium_drivers"; then
@@ -2305,8 +2313,6 @@ if test "x$USE_VC4_SIMULATOR" = xyes -a 
"x$HAVE_GALLIUM_ILO" = xyes; then
 AC_MSG_ERROR([VC4 simulator on x86 replaces i965 driver build, so ilo must 
be disabled.])
 fi
 
-AC_SUBST([ELF_LIB])
-
 AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
 AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes)
 AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
diff --git a/src/gallium/drivers/radeon/Makefile.am 
b/src/gallium/drivers/radeon/Makefile.am
index 13d8976..a6fc145 100644
--- a/src/gallium/drivers/radeon/Makefile.am
+++ b/src/gallium/drivers/radeon/Makefile.am
@@ -16,7 +16,8 @@ libradeon_la_SOURCES = \
 if NEED_RADEON_LLVM
 
 AM_CFLAGS += \
-   $(LLVM_CFLAGS)
+   $(LLVM_CFLAGS) \
+   $(LIBELF_CFLAGS)
 
 libradeon_la_SOURCES += \
$(LLVM_C_FILES)
@@ -24,7 +25,7 @@ libradeon_la_SOURCES += \
 libradeon_la_LIBADD = \
$(CLOCK_LIB) \
$(LLVM_LIBS) \
-   $(ELF_LIB)
+   $(LIBELF_LIBS)
 
 libradeon_la_LDFLAGS = \
$(LLVM_LDFLAGS)
diff --git a/src/gallium/targets/opencl/Makefile.am 
b/src/gallium/targets/opencl/Makefile.am
index 08f95e8..f3ba1e3 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -2,6 +2,9 @@ include $(top_srcdir)/src/gallium/Automake.inc
 
 lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la
 
+AM_CPPFLAGS = \
+$(LIBELF_CFLAGS)
+
 lib@OPENCL_LIBNAME@_la_LDFLAGS = \
$(LLVM_LDFLAGS) \
-no-undefined \
@@ -19,7 +22,7 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(top_builddir)/src/util/libmesautil.la \
-   $(ELF_LIB) \
+   $(LIBELF_LIBS) \
$(DLOPEN_LIBS) \
-lclangCodeGen \
-lclangFrontendTool \
-- 
2.6.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeonsi: add loading doubles from lds and geom inputs

2015-12-08 Thread Michel Dänzer

On 09.12.2015 13:03, Dave Airlie wrote:
> From: Dave Airlie 
> 
> This adds two cases of missing double handling to radeonsi,
> one for loading doubles from LDS and one for fetch doubles
> for geometry shader inputs.
> 
> Signed-off-by: Dave Airlie 

Please split it up into two patches. Then both of them are

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeonsi: add loading doubles from lds and geom inputs

2015-12-08 Thread Dave Airlie

From: Dave Airlie 

This adds two cases of missing double handling to radeonsi,
one for loading doubles from LDS and one for fetch doubles
for geometry shader inputs.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/radeonsi/si_shader.c | 28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 14f12df..1baa2eb 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -594,6 +594,14 @@ static LLVMValueRef lds_load(struct lp_build_tgsi_context 
*bld_base,
lp_build_const_int32(gallivm, swizzle));
 
value = build_indexed_load(si_shader_ctx, si_shader_ctx->lds, dw_addr);
+   if (type == TGSI_TYPE_DOUBLE) {
+   LLVMValueRef value2;
+   dw_addr = lp_build_add(_base->uint_bld, dw_addr,
+  lp_build_const_int32(gallivm, swizzle + 
1));
+   value2 = build_indexed_load(si_shader_ctx, si_shader_ctx->lds, 
dw_addr);
+   return radeon_llvm_emit_fetch_double(bld_base, value, value2);
+   }
+
return LLVMBuildBitCast(gallivm->builder, value,
tgsi2llvmtype(bld_base, type), "");
 }
@@ -733,6 +741,7 @@ static LLVMValueRef fetch_input_gs(
unsigned semantic_name = info->input_semantic_name[reg->Register.Index];
unsigned semantic_index = 
info->input_semantic_index[reg->Register.Index];
unsigned param;
+   LLVMValueRef value;
 
if (swizzle != ~0 && semantic_name == TGSI_SEMANTIC_PRIMID)
return get_primitive_id(bld_base, swizzle);
@@ -774,11 +783,22 @@ static LLVMValueRef fetch_input_gs(
args[7] = uint->zero; /* SLC */
args[8] = uint->zero; /* TFE */
 
+   value = lp_build_intrinsic(gallivm->builder,
+  "llvm.SI.buffer.load.dword.i32.i32",
+  i32, args, 9,
+  LLVMReadOnlyAttribute | 
LLVMNoUnwindAttribute);
+   if (type == TGSI_TYPE_DOUBLE) {
+   LLVMValueRef value2;
+   args[2] = lp_build_const_int32(gallivm, (param * 4 + swizzle + 
1) * 256);
+   value2 = lp_build_intrinsic(gallivm->builder,
+   "llvm.SI.buffer.load.dword.i32.i32",
+   i32, args, 9,
+   LLVMReadOnlyAttribute | 
LLVMNoUnwindAttribute);
+   return radeon_llvm_emit_fetch_double(bld_base,
+value, value2);
+   }
return LLVMBuildBitCast(gallivm->builder,
-   lp_build_intrinsic(gallivm->builder,
-   
"llvm.SI.buffer.load.dword.i32.i32",
-   i32, args, 9,
-   LLVMReadOnlyAttribute | 
LLVMNoUnwindAttribute),
+   value,
tgsi2llvmtype(bld_base, type), "");
 }
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/10] Rework of pb_cache_manager for removal of radeon_winsys_cs_handle

2015-12-08 Thread Nicolai Hähnle


On 06.12.2015 19:00, Marek Olšák wrote:

This series addresses the weirdness in radeon drivers that every buffer has 2 
handles:
- pb_buffer (== pb_cache_buffer)
- radeon_winsys_cs_handle (winsys-specific pb_buffer)

The inefficiency of converting pb_cache_buffer into the winsys-specific buffer 
made me introduce radeon_winsys_cs_handle a few years ago, which has been used 
for almost everything (map/unmap/command submission/etc.) and pb_buffer has 
only held the reference keeping the buffer alive.

Now it's time to do this right.

This series moves the pb_cache_manager logic into an independent module 
"pb_cache". Next, all dependencies on pb_manager are removed from both winsyses 
and the new module is used instead. The result is that pb_buffer is equal to 
radeon_winsys_cs_handle, and the latter can be removed.


Very nice! I only have some comments on #1 and an identical remark about 
#3 & #9 (those two patches are also a bit awkward because they combine 
several seemingly unrelated changes, though I don't mind *that* much).


Modulo the comments mentioned above, the series is

Reviewed-by: Nicolai Hähnle 

Cheers,
Nicolai

Please review.


Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 00/12] nir: Get rid of the *_indirect variants of

2015-12-08 Thread Jason Ekstrand

This is my third sending of this series.  I think this version is close to
working on all of the relevant drivers (it works on i965).  Hopefully, we
can actually push it this time.

As mentioned before, all but the first two patches will be squashed
together so it's important to get this right.  Thank you very much to Rob
Clark and Eric Anholt for all your work helping me debug this on your
drivers.

Jason Ekstrand (12):
  vc4: Do all uniform loads with byte offsets
  i965/fs_nir: Refactor store_output, load_input, and load_uniform
  nir: Get rid of *_indirect variants of input/output load/store
intrinsics
  nir/glsl: Stop handling UBO/SSBO load/stores differently depending on
indirect
  nir/lower_io: Get rid of load/store_foo_indirect
  i965/fs: Get rid of load/store_foo_indirect
  i965/vec4: Get rid of load/store_foo_indirect
  tgsi_to_nir: Get rid of load/store_foo_indirect
  ir3/nir: Use the new unified io intrinsics
  vc4/nir: Use the new unified io intrinsics
  nir/lower_clip: Update to the new load/store intrinsics
  nir/lower_two_sided_color: Update to the new load intrinsic

 src/gallium/auxiliary/nir/tgsi_to_nir.c|  52 
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   |  79 ++-
 src/gallium/drivers/vc4/vc4_nir_lower_blend.c  |   1 +
 src/gallium/drivers/vc4/vc4_nir_lower_io.c |  55 +---
 src/gallium/drivers/vc4/vc4_program.c  |  44 +++---
 src/glsl/nir/glsl_to_nir.cpp   |  47 +--
 src/glsl/nir/nir.h |   2 +-
 src/glsl/nir/nir_intrinsics.h  |  81 ++--
 src/glsl/nir/nir_lower_clip.c  |   3 +
 src/glsl/nir/nir_lower_io.c| 111 ++--
 src/glsl/nir/nir_lower_phis_to_scalar.c|   4 -
 src/glsl/nir/nir_lower_two_sided_color.c   |   2 +
 src/glsl/nir/nir_print.c   |   9 +-
 src/mesa/drivers/dri/i965/brw_fs.h |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 147 +
 src/mesa/drivers/dri/i965/brw_nir.c|  45 +--
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp  |   7 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |  95 ++---
 18 files changed, 380 insertions(+), 406 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 10/12] vc4/nir: Use the new unified io intrinsics

2015-12-08 Thread Jason Ekstrand

---
 src/gallium/drivers/vc4/vc4_nir_lower_blend.c |  1 +
 src/gallium/drivers/vc4/vc4_nir_lower_io.c| 38 --
 src/gallium/drivers/vc4/vc4_program.c | 47 +--
 3 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_blend.c 
b/src/gallium/drivers/vc4/vc4_nir_lower_blend.c
index 0672a92..5bd69fb 100644
--- a/src/gallium/drivers/vc4/vc4_nir_lower_blend.c
+++ b/src/gallium/drivers/vc4/vc4_nir_lower_blend.c
@@ -49,6 +49,7 @@ vc4_nir_get_dst_color(nir_builder *b)
nir_intrinsic_load_input);
 load->num_components = 1;
 load->const_index[0] = VC4_NIR_TLB_COLOR_READ_INPUT;
+load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
 nir_ssa_dest_init(>instr, >dest, 1, NULL);
 nir_builder_instr_insert(b, >instr);
 return >dest.ssa;
diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c 
b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
index e3d018f..c3bafbf 100644
--- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c
+++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
@@ -179,6 +179,12 @@ vc4_nir_lower_vertex_attr(struct vc4_compile *c, 
nir_builder *b,
 /* All TGSI-to-NIR inputs are vec4. */
 assert(intr->num_components == 4);
 
+/* We only accept direct outputs and TGSI only ever gives them to us
+ * with an offset value of 0.
+ */
+assert(nir_src_as_const_value(intr->src[0]) &&
+   nir_src_as_const_value(intr->src[0])->u[0] == 0);
+
 /* Generate dword loads for the VPM values (Since these intrinsics may
  * be reordered, the actual reads will be generated at the top of the
  * shader by ntq_setup_inputs().
@@ -190,6 +196,7 @@ vc4_nir_lower_vertex_attr(struct vc4_compile *c, 
nir_builder *b,
nir_intrinsic_load_input);
 intr_comp->num_components = 1;
 intr_comp->const_index[0] = intr->const_index[0] * 4 + i;
+intr_comp->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
 nir_ssa_dest_init(_comp->instr, _comp->dest, 1, 
NULL);
 nir_builder_instr_insert(b, _comp->instr);
 
@@ -243,6 +250,12 @@ vc4_nir_lower_fs_input(struct vc4_compile *c, nir_builder 
*b,
 /* All TGSI-to-NIR inputs are vec4. */
 assert(intr->num_components == 4);
 
+/* We only accept direct inputs and TGSI only ever gives them to us
+ * with an offset value of 0.
+ */
+assert(nir_src_as_const_value(intr->src[0]) &&
+   nir_src_as_const_value(intr->src[0])->u[0] == 0);
+
 /* Generate scalar loads equivalent to the original VEC4. */
 nir_ssa_def *dests[4];
 for (unsigned i = 0; i < intr->num_components; i++) {
@@ -250,6 +263,8 @@ vc4_nir_lower_fs_input(struct vc4_compile *c, nir_builder 
*b,
 nir_intrinsic_instr_create(c->s, 
nir_intrinsic_load_input);
 intr_comp->num_components = 1;
 intr_comp->const_index[0] = intr->const_index[0] * 4 + i;
+intr_comp->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
+
 nir_ssa_dest_init(_comp->instr, _comp->dest, 1, 
NULL);
 nir_builder_instr_insert(b, _comp->instr);
 
@@ -317,6 +332,12 @@ vc4_nir_lower_output(struct vc4_compile *c, nir_builder *b,
 /* All TGSI-to-NIR outputs are VEC4. */
 assert(intr->num_components == 4);
 
+/* We only accept direct outputs and TGSI only ever gives them to us
+ * with an offset value of 0.
+ */
+assert(nir_src_as_const_value(intr->src[1]) &&
+   nir_src_as_const_value(intr->src[1])->u[0] == 0);
+
 b->cursor = nir_before_instr(>instr);
 
 for (unsigned i = 0; i < intr->num_components; i++) {
@@ -328,6 +349,7 @@ vc4_nir_lower_output(struct vc4_compile *c, nir_builder *b,
 assert(intr->src[0].is_ssa);
 intr_comp->src[0] =
 nir_src_for_ssa(nir_channel(b, intr->src[0].ssa, i));
+intr_comp->src[1] = nir_src_for_ssa(nir_imm_int(b, 0));
 nir_builder_instr_insert(b, _comp->instr);
 }
 
@@ -358,15 +380,12 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder 
*b,
 /* Convert the base offset to bytes and add the component */
 intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 
4);
 
-if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) {
-/* Convert the variable TGSI register index to a byte
- * offset.
- */
-intr_comp->src[0] =
-nir_src_for_ssa(nir_ishl(b,
- intr->src[0].ssa,
-

[Mesa-dev] [PATCH] nv50/ir: check if the target supports the new offset before inlining

2015-12-08 Thread Ilia Mirkin

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=93300
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp  |  9 ++---
 src/gallium/drivers/nouveau/codegen/nv50_ir_target.h  |  2 ++
 .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp   | 15 +++
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h |  2 ++
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1384a25..7b7352e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -290,20 +290,23 @@ IndirectPropagation::visit(BasicBlock *bb)
 continue;
  if (insn->op == OP_ADD && !isFloatType(insn->dType)) {
 if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) ||
-!insn->src(1).getImmediate(imm))
+!insn->src(1).getImmediate(imm) ||
+!targ->insnCanLoadOffset(i, s, imm.reg.data.s32))
continue;
 i->setIndirect(s, 0, insn->getSrc(0));
 i->setSrc(s, cloneShallow(func, i->getSrc(s)));
 i->src(s).get()->reg.data.offset += imm.reg.data.u32;
  } else if (insn->op == OP_SUB && !isFloatType(insn->dType)) {
 if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) ||
-!insn->src(1).getImmediate(imm))
+!insn->src(1).getImmediate(imm) ||
+!targ->insnCanLoadOffset(i, s, -imm.reg.data.s32))
continue;
 i->setIndirect(s, 0, insn->getSrc(0));
 i->setSrc(s, cloneShallow(func, i->getSrc(s)));
 i->src(s).get()->reg.data.offset -= imm.reg.data.u32;
  } else if (insn->op == OP_MOV) {
-if (!insn->src(0).getImmediate(imm))
+if (!insn->src(0).getImmediate(imm) ||
+!targ->insnCanLoadOffset(i, s, imm.reg.data.s32))
continue;
 i->setIndirect(s, 0, NULL);
 i->setSrc(s, cloneShallow(func, i->getSrc(s)));
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
index 4e33997..673f881 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
@@ -191,6 +191,8 @@ public:
 
virtual bool insnCanLoad(const Instruction *insn, int s,
 const Instruction *ld) const = 0;
+   virtual bool insnCanLoadOffset(const Instruction *insn, int s,
+  int offset) const { return true; }
virtual bool isOpSupported(operation, DataType) const = 0;
virtual bool isAccessSupported(DataFile, DataType) const = 0;
virtual bool isModSupported(const Instruction *,
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 2d5baed..101082e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -385,6 +385,21 @@ TargetNV50::insnCanLoad(const Instruction *i, int s,
 }
 
 bool
+TargetNV50::insnCanLoadOffset(const Instruction *i, int s, int offset) const
+{
+   if (!i->src(s).isIndirect(0))
+  return true;
+   offset += i->src(s).get()->reg.data.offset;
+   if (i->op == OP_LOAD || i->op == OP_STORE) {
+  // There are some restrictions in theory, but in practice they're never
+  // going to be hit. When we enable shared/global memory, this will
+  // become more important.
+  return true;
+   }
+   return offset >= 0 && offset <= (int32_t)(127 * i->src(s).get()->reg.size);
+}
+
+bool
 TargetNV50::isAccessSupported(DataFile file, DataType ty) const
 {
if (ty == TYPE_B96 || ty == TYPE_NONE)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h
index 0cbf180..00243d7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.h
@@ -46,6 +46,8 @@ public:
 
virtual bool insnCanLoad(const Instruction *insn, int s,
 const Instruction *ld) const;
+   virtual bool insnCanLoadOffset(const Instruction *insn, int s,
+  int offset) const;
virtual bool isOpSupported(operation, DataType) const;
virtual bool isAccessSupported(DataFile, DataType) const;
virtual bool isModSupported(const Instruction *, int s, Modifier) const;
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: invalidate pipeline status after glUseProgramStages

2015-12-08 Thread Timothy Arceri

On Tue, 2015-12-08 at 19:02 +0200, Tapani Pälli wrote:
> This will cause validation to run during next draw, this is done
> because possible changes in used stages and programs can cause
> invalid pipeline state.
> 
> This fixes a subtest in following CTS test:
>   ES31-CTS.sepshaderobjs.StateInteraction
> 
> Signed-off-by: Tapani Pälli 

Reviewed-by: Timothy Arceri 

Thanks for your patience. If you end up trying to optimise this again
feel free to Cc me. Although I do think we will need to add some more
corner cases to the piglit tests before doing so.


> ---
>  src/mesa/main/pipelineobj.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/mesa/main/pipelineobj.c
> b/src/mesa/main/pipelineobj.c
> index 5eda4e5..f2a872d 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -341,6 +341,8 @@ _mesa_UseProgramStages(GLuint pipeline,
> GLbitfield stages, GLuint program)
>  
> if ((stages & GL_COMPUTE_SHADER_BIT) != 0)
>_mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg,
> pipe);
> +
> +   pipe->Validated = false;
>  }
>  
>  /**
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 04/12] nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect

2015-12-08 Thread Jason Ekstrand

---
 src/glsl/nir/glsl_to_nir.cpp | 47 +++-
 1 file changed, 7 insertions(+), 40 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 45d045c..ba23b91 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -853,24 +853,12 @@ nir_visitor::visit(ir_call *ir)
  ir_constant *write_mask = ((ir_instruction *)param)->as_constant();
  assert(write_mask);
 
- /* Check if we need the indirect version */
- ir_constant *const_offset = offset->as_constant();
- if (!const_offset) {
-op = nir_intrinsic_store_ssbo_indirect;
-ralloc_free(instr);
-instr = nir_intrinsic_instr_create(shader, op);
-instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset));
-instr->const_index[0] = 0;
- } else {
-instr->const_index[0] = const_offset->value.u[0];
- }
-
- instr->const_index[1] = write_mask->value.u[0];
-
  instr->src[0] = nir_src_for_ssa(evaluate_rvalue(val));
+ instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block));
+ instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset));
+ instr->const_index[0] = write_mask->value.u[0];
  instr->num_components = val->type->vector_elements;
 
- instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block));
  nir_builder_instr_insert(, >instr);
  break;
   }
@@ -881,20 +869,8 @@ nir_visitor::visit(ir_call *ir)
  param = param->get_next();
  ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue();
 
- /* Check if we need the indirect version */
- ir_constant *const_offset = offset->as_constant();
- if (!const_offset) {
-op = nir_intrinsic_load_ssbo_indirect;
-ralloc_free(instr);
-instr = nir_intrinsic_instr_create(shader, op);
-instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset));
-instr->const_index[0] = 0;
-dest = >dest;
- } else {
-instr->const_index[0] = const_offset->value.u[0];
- }
-
  instr->src[0] = nir_src_for_ssa(evaluate_rvalue(block));
+ instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset));
 
  const glsl_type *type = ir->return_deref->var->type;
  instr->num_components = type->vector_elements;
@@ -1174,20 +1150,11 @@ nir_visitor::visit(ir_expression *ir)
/* Some special cases */
switch (ir->operation) {
case ir_binop_ubo_load: {
-  ir_constant *const_index = ir->operands[1]->as_constant();
-
-  nir_intrinsic_op op;
-  if (const_index) {
- op = nir_intrinsic_load_ubo;
-  } else {
- op = nir_intrinsic_load_ubo_indirect;
-  }
-  nir_intrinsic_instr *load = nir_intrinsic_instr_create(this->shader, op);
+  nir_intrinsic_instr *load =
+ nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_ubo);
   load->num_components = ir->type->vector_elements;
-  load->const_index[0] = const_index ? const_index->value.u[0] : 0; /* 
base offset */
   load->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[0]));
-  if (!const_index)
- load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
+  load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
   add_instr(>instr, ir->type->vector_elements);
 
   /*
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 02/12] i965/fs_nir: Refactor store_output, load_input, and load_uniform

2015-12-08 Thread Jason Ekstrand

There was way too much incrementing of things going on.  Instead, let's
just start everything off at the right base location, and then increment in
the loop.
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 45 ++--
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index a00fd0e..3754622 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2304,16 +2304,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   assert(instr->const_index[0] % 4 == 0);
   assert(instr->const_index[1] % 4 == 0);
 
-  fs_reg uniform_reg(UNIFORM, instr->const_index[0] / 4);
-  uniform_reg.reg_offset = instr->const_index[1] / 4;
+  fs_reg src(UNIFORM, instr->const_index[0] / 4, dest.type);
+  src.reg_offset = instr->const_index[1] / 4;
 
-  for (unsigned j = 0; j < instr->num_components; j++) {
- fs_reg src = offset(retype(uniform_reg, dest.type), bld, j);
- if (has_indirect)
-src.reladdr = new(mem_ctx) fs_reg(get_nir_src(instr->src[0]));
+  if (has_indirect)
+ src.reladdr = new(mem_ctx) fs_reg(get_nir_src(instr->src[0]));
 
- bld.MOV(dest, src);
- dest = offset(dest, bld, 1);
+  for (unsigned j = 0; j < instr->num_components; j++) {
+ bld.MOV(offset(dest, bld, j), offset(src, bld, j));
   }
   break;
}
@@ -2431,19 +2429,16 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   unreachable("Not allowed");
   /* fallthrough */
case nir_intrinsic_load_input: {
-  unsigned index = 0;
-  for (unsigned j = 0; j < instr->num_components; j++) {
- fs_reg src;
- if (stage == MESA_SHADER_VERTEX) {
-src = offset(fs_reg(ATTR, instr->const_index[0], dest.type), bld, 
index);
- } else {
-src = offset(retype(nir_inputs, dest.type), bld,
- instr->const_index[0] + index);
- }
- index++;
+  fs_reg src;
+  if (stage == MESA_SHADER_VERTEX) {
+ src = fs_reg(ATTR, instr->const_index[0], dest.type);
+  } else {
+ src = offset(retype(nir_inputs, dest.type), bld,
+  instr->const_index[0]);
+  }
 
- bld.MOV(dest, src);
- dest = offset(dest, bld, 1);
+  for (unsigned j = 0; j < instr->num_components; j++) {
+ bld.MOV(offset(dest, bld, j), offset(src, bld, j));
   }
   break;
}
@@ -2516,13 +2511,11 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   /* fallthrough */
case nir_intrinsic_store_output: {
   fs_reg src = get_nir_src(instr->src[0]);
-  unsigned index = 0;
+  fs_reg new_dest = offset(retype(nir_outputs, src.type), bld,
+   instr->const_index[0]);
+
   for (unsigned j = 0; j < instr->num_components; j++) {
- fs_reg new_dest = offset(retype(nir_outputs, src.type), bld,
-  instr->const_index[0] + index);
- index++;
- bld.MOV(new_dest, src);
- src = offset(src, bld, 1);
+ bld.MOV(offset(new_dest, bld, j), offset(src, bld, j));
   }
   break;
}
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 06/12] i965/fs: Get rid of load/store_foo_indirect

2015-12-08 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_fs.h   |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 ++-
 src/mesa/drivers/dri/i965/brw_nir.c  |  45 ++---
 3 files changed, 84 insertions(+), 69 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index bca4589..b55589f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -278,7 +278,7 @@ public:
unsigned stream_id);
void emit_gs_thread_end();
void emit_gs_input_load(const fs_reg , const nir_src _src,
-   const fs_reg _offset, unsigned imm_offset,
+   unsigned base_offset, const nir_src _src,
unsigned num_components);
void emit_cs_terminate();
fs_reg *emit_cs_local_invocation_id_setup();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 3754622..18c9d65 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1603,29 +1603,31 @@ fs_visitor::emit_gs_vertex(const nir_src 
_count_nir_src,
 void
 fs_visitor::emit_gs_input_load(const fs_reg ,
const nir_src _src,
-   const fs_reg _offset,
-   unsigned imm_offset,
+   unsigned base_offset,
+   const nir_src _src,
unsigned num_components)
 {
struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) 
prog_data;
 
+   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
+   nir_const_value *offset_const = nir_src_as_const_value(offset_src);
+   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
+
/* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y],
 * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w].  Only
 * gl_PointSize is available as a GS input, however, so it must be that.
 */
-   const bool is_point_size =
-  indirect_offset.file == BAD_FILE && imm_offset == 0;
-
-   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
-   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
+   const bool is_point_size = (base_offset == 0);
 
-   if (indirect_offset.file == BAD_FILE && vertex_const != NULL &&
-   4 * imm_offset < push_reg_count) {
-  imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count;
+   if (offset_const != NULL && vertex_const != NULL &&
+   4 * offset_const->u[0] < push_reg_count) {
+  int imm_offset = (base_offset + offset_const->u[0]) * 4 +
+   vertex_const->u[0] * push_reg_count;
   /* This input was pushed into registers. */
   if (is_point_size) {
  /* gl_PointSize comes in .w */
- bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type));
+ assert(imm_offset == 0);
+ bld.MOV(dst, fs_reg(ATTR, 3, dst.type));
   } else {
  for (unsigned i = 0; i < num_components; i++) {
 bld.MOV(offset(dst, bld, i),
@@ -1683,21 +1685,21 @@ fs_visitor::emit_gs_input_load(const fs_reg ,
   }
 
   fs_inst *inst;
-  if (indirect_offset.file == BAD_FILE) {
+  if (offset_const) {
  /* Constant indexing - use global offset. */
  inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle);
- inst->offset = imm_offset;
+ inst->offset = base_offset + offset_const->u[0];
  inst->base_mrf = -1;
  inst->mlen = 1;
  inst->regs_written = num_components;
   } else {
  /* Indirect indexing - use per-slot offsets as well. */
- const fs_reg srcs[] = { icp_handle, indirect_offset };
+ const fs_reg srcs[] = { icp_handle, get_nir_src(offset_src) };
  fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
  bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
 
  inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload);
- inst->offset = imm_offset;
+ inst->offset = base_offset;
  inst->base_mrf = -1;
  inst->mlen = 2;
  inst->regs_written = num_components;
@@ -1763,17 +1765,12 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder ,
   retype(fs_reg(brw_vec8_grf(2, 0)), BRW_REGISTER_TYPE_UD));
   break;
 
-   case nir_intrinsic_load_input_indirect:
case nir_intrinsic_load_input:
   unreachable("load_input intrinsics are invalid for the GS stage");
 
-   case nir_intrinsic_load_per_vertex_input_indirect:
-  indirect_offset = retype(get_nir_src(instr->src[1]), 
BRW_REGISTER_TYPE_D);
-  /* fallthrough */
case nir_intrinsic_load_per_vertex_input:
-  emit_gs_input_load(dest, instr->src[0],
- indirect_offset, instr->const_index[0],
-

[Mesa-dev] [PATCH v3 08/12] tgsi_to_nir: Get rid of load/store_foo_indirect

2015-12-08 Thread Jason Ekstrand

---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 52 ++---
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index 5fef542..5def6d3 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -468,7 +468,7 @@ ttn_emit_immediate(struct ttn_compile *c)
nir_builder_instr_insert(b, _const->instr);
 }
 
-static nir_src
+static nir_ssa_def *
 ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register 
*indirect);
 
 /* generate either a constant or indirect deref chain for accessing an
@@ -487,7 +487,7 @@ ttn_array_deref(struct ttn_compile *c, nir_intrinsic_instr 
*instr,
 
if (indirect) {
   arr->deref_array_type = nir_deref_array_type_indirect;
-  arr->indirect = ttn_src_for_indirect(c, indirect);
+  arr->indirect = nir_src_for_ssa(ttn_src_for_indirect(c, indirect));
} else {
   arr->deref_array_type = nir_deref_array_type_direct;
}
@@ -586,19 +586,14 @@ ttn_src_for_file_and_index(struct ttn_compile *c, 
unsigned file, unsigned index,
 
   switch (file) {
   case TGSI_FILE_INPUT:
- op = indirect ? nir_intrinsic_load_input_indirect :
- nir_intrinsic_load_input;
+ op = nir_intrinsic_load_input;
  assert(!dim);
  break;
   case TGSI_FILE_CONSTANT:
  if (dim) {
-op = indirect ? nir_intrinsic_load_ubo_indirect :
-nir_intrinsic_load_ubo;
-/* convert index from vec4 to byte: */
-index *= 16;
+op = nir_intrinsic_load_ubo;
  } else {
-op = indirect ? nir_intrinsic_load_uniform_indirect :
-nir_intrinsic_load_uniform;
+op = nir_intrinsic_load_uniform;
  }
  break;
   default:
@@ -609,7 +604,6 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
   load = nir_intrinsic_instr_create(b->shader, op);
 
   load->num_components = 4;
-  load->const_index[0] = index;
   if (dim) {
  if (dimind) {
 load->src[srcn] =
@@ -622,17 +616,26 @@ ttn_src_for_file_and_index(struct ttn_compile *c, 
unsigned file, unsigned index,
  }
  srcn++;
   }
-  if (indirect) {
- load->src[srcn] = ttn_src_for_indirect(c, indirect);
- if (dim) {
-assert(load->src[srcn].is_ssa);
-/* we also need to covert vec4 to byte here too: */
-load->src[srcn] =
-   nir_src_for_ssa(nir_ishl(b, load->src[srcn].ssa,
-nir_imm_int(b, 4)));
+
+  nir_ssa_def *offset;
+  if (dim) {
+ /* UBO loads don't have a const_index[0] base offset. */
+ offset = nir_imm_int(b, index);
+ if (indirect) {
+offset = nir_iadd(b, offset, ttn_src_for_indirect(c, indirect));
+ }
+ /* UBO offsets are in bytes, but TGSI gives them to us in vec4's */
+ offset = nir_ishl(b, offset, nir_imm_int(b, 4));
+  } else {
+ load->const_index[0] = index;
+ if (indirect) {
+offset = ttn_src_for_indirect(c, indirect);
+ } else {
+offset = nir_imm_int(b, 0);
  }
- srcn++;
   }
+  load->src[srcn++] = nir_src_for_ssa(offset);
+
   nir_ssa_dest_init(>instr, >dest, 4, NULL);
   nir_builder_instr_insert(b, >instr);
 
@@ -648,7 +651,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
return src;
 }
 
-static nir_src
+static nir_ssa_def *
 ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect)
 {
nir_builder *b = >build;
@@ -660,7 +663,7 @@ ttn_src_for_indirect(struct ttn_compile *c, struct 
tgsi_ind_register *indirect)
 indirect->File,
 indirect->Index,
 NULL, NULL, NULL);
-   return nir_src_for_ssa(nir_imov_alu(b, src, 1));
+   return nir_imov_alu(b, src, 1);
 }
 
 static nir_alu_dest
@@ -729,7 +732,7 @@ ttn_get_dest(struct ttn_compile *c, struct 
tgsi_full_dst_register *tgsi_fdst)
 
if (tgsi_dst->Indirect && (tgsi_dst->File != TGSI_FILE_TEMPORARY)) {
   nir_src *indirect = ralloc(c->build.shader, nir_src);
-  *indirect = ttn_src_for_indirect(c, _fdst->Indirect);
+  *indirect = nir_src_for_ssa(ttn_src_for_indirect(c, 
_fdst->Indirect));
   dest.dest.reg.indirect = indirect;
}
 
@@ -1927,9 +1930,10 @@ ttn_add_output_stores(struct ttn_compile *c)
 nir_intrinsic_instr_create(b->shader, nir_intrinsic_store_output);
  unsigned loc = var->data.driver_location + i;
  store->num_components = 4;
- store->const_index[0] = loc;
  store->src[0].reg.reg = c->output_regs[loc].reg;

[Mesa-dev] [PATCH v3 01/12] vc4: Do all uniform loads with byte offsets

2015-12-08 Thread Jason Ekstrand

Previously, we lowered direct uniform loads to dword offsets and indirect
loads to byte offsets in vc4_nir_lower_io.  However, it simplifies things a
bit if we just use byte offsets for everything and then divide by 4 when we
handle the direct uniform load.
---
 src/gallium/drivers/vc4/vc4_nir_lower_io.c | 17 +
 src/gallium/drivers/vc4/vc4_program.c  | 11 +++
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c 
b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
index 1afe52a..e3d018f 100644
--- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c
+++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
@@ -338,8 +338,8 @@ static void
 vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b,
   nir_intrinsic_instr *intr)
 {
-/* All TGSI-to-NIR uniform loads are vec4, but we may create dword
- * loads in our lowering passes.
+/* All TGSI-to-NIR uniform loads are vec4, but we need byte offsets
+ * in the backend.
  */
 if (intr->num_components == 1)
 return;
@@ -355,6 +355,9 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b,
 intr_comp->num_components = 1;
 nir_ssa_dest_init(_comp->instr, _comp->dest, 1, 
NULL);
 
+/* Convert the base offset to bytes and add the component */
+intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 
4);
+
 if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) {
 /* Convert the variable TGSI register index to a byte
  * offset.
@@ -363,16 +366,6 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder 
*b,
 nir_src_for_ssa(nir_ishl(b,
  intr->src[0].ssa,
  nir_imm_int(b, 4)));
-
-/* Convert the offset to be a byte index, too. */
-intr_comp->const_index[0] = (intr->const_index[0] * 16 
+
- i * 4);
-} else {
-/* We want a dword index for non-indirect uniform
- * loads.
- */
-intr_comp->const_index[0] = (intr->const_index[0] * 4 +
- i);
 }
 
 dests[i] = _comp->dest.ssa;
diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 081adfd..5ce1143 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1440,6 +1440,7 @@ static void
 ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr)
 {
 const nir_intrinsic_info *info = 
_intrinsic_infos[instr->intrinsic];
+int dword_offset;
 struct qreg *dest = NULL;
 
 if (info->has_dest) {
@@ -1449,11 +1450,13 @@ ntq_emit_intrinsic(struct vc4_compile *c, 
nir_intrinsic_instr *instr)
 switch (instr->intrinsic) {
 case nir_intrinsic_load_uniform:
 assert(instr->num_components == 1);
-if (instr->const_index[0] < VC4_NIR_STATE_UNIFORM_OFFSET) {
-*dest = qir_uniform(c, QUNIFORM_UNIFORM,
-instr->const_index[0]);
+/* The offset is in bytes, but we need dwords */
+assert(instr->const_index[0] % 4 == 0);
+dword_offset = instr->const_index[0] / 4;
+if (dword_offset < VC4_NIR_STATE_UNIFORM_OFFSET) {
+*dest = qir_uniform(c, QUNIFORM_UNIFORM, dword_offset);
 } else {
-*dest = qir_uniform(c, instr->const_index[0] -
+*dest = qir_uniform(c, dword_offset -
 VC4_NIR_STATE_UNIFORM_OFFSET,
 0);
 }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 09/12] ir3/nir: Use the new unified io intrinsics

2015-12-08 Thread Jason Ekstrand

---
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 79 +-
 1 file changed, 47 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 2723959..eea5c5e 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1218,6 +1218,7 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr,
 {
struct ir3_block *b = ctx->block;
struct ir3_instruction *addr, *src0, *src1;
+   nir_const_value *const_offset;
/* UBO addresses are the first driver params: */
unsigned ubo = regid(ctx->so->first_driver_param + IR3_UBOS_OFF, 0);
unsigned off = intr->const_index[0];
@@ -1231,7 +1232,10 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr,
addr = create_uniform_indirect(ctx, ubo, get_addr(ctx, src0));
}
 
-   if (intr->intrinsic == nir_intrinsic_load_ubo_indirect) {
+   const_offset = nir_src_as_const_value(intr->src[1]);
+   if (const_offset) {
+   off += const_offset->u[0];
+   } else {
/* For load_ubo_indirect, second src is indirect offset: */
src1 = get_src(ctx, >src[1])[0];
 
@@ -1394,6 +1398,7 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
struct ir3_instruction **dst, **src;
struct ir3_block *b = ctx->block;
unsigned idx = intr->const_index[0];
+   nir_const_value *const_offset;
 
if (info->has_dest) {
dst = get_dst(ctx, >dest, intr->num_components);
@@ -1403,43 +1408,49 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
 
switch (intr->intrinsic) {
case nir_intrinsic_load_uniform:
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = create_uniform(ctx, n);
-   }
-   break;
-   case nir_intrinsic_load_uniform_indirect:
-   src = get_src(ctx, >src[0]);
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = create_uniform_indirect(ctx, n,
-   get_addr(ctx, src[0]));
+   const_offset = nir_src_as_const_value(intr->src[0]);
+   if (const_offset) {
+   idx += const_offset->u[0];
+   for (int i = 0; i < intr->num_components; i++) {
+   unsigned n = idx * 4 + i;
+   dst[i] = create_uniform(ctx, n);
+   }
+   } else {
+   src = get_src(ctx, >src[0]);
+   for (int i = 0; i < intr->num_components; i++) {
+   unsigned n = idx * 4 + i;
+   dst[i] = create_uniform_indirect(ctx, n,
+   get_addr(ctx, src[0]));
+   }
+   /* NOTE: if relative addressing is used, we set
+* constlen in the compiler (to worst-case value)
+* since we don't know in the assembler what the max
+* addr reg value can be:
+*/
+   ctx->so->constlen = ctx->s->num_uniforms;
}
-   /* NOTE: if relative addressing is used, we set constlen in
-* the compiler (to worst-case value) since we don't know in
-* the assembler what the max addr reg value can be:
-*/
-   ctx->so->constlen = ctx->s->num_uniforms;
break;
case nir_intrinsic_load_ubo:
-   case nir_intrinsic_load_ubo_indirect:
emit_intrinsic_load_ubo(ctx, intr, dst);
break;
case nir_intrinsic_load_input:
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = ctx->ir->inputs[n];
-   }
-   break;
-   case nir_intrinsic_load_input_indirect:
-   src = get_src(ctx, >src[0]);
-   struct ir3_instruction *collect =
-   create_collect(b, ctx->ir->inputs, 
ctx->ir->ninputs);
-   struct ir3_instruction *addr = get_addr(ctx, src[0]);
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = create_indirect_load(ctx, ctx->ir->ninputs,
-   n, addr, collect);
+   const_offset = nir_src_as_const_value(intr->src[0]);
+   if (const_offset) {
+   idx +=

[Mesa-dev] [PATCH v3 12/12] nir/lower_two_sided_color: Update to the new load intrinsic

2015-12-08 Thread Jason Ekstrand

---
 src/glsl/nir/nir_lower_two_sided_color.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/glsl/nir/nir_lower_two_sided_color.c 
b/src/glsl/nir/nir_lower_two_sided_color.c
index 6995b9d..7df12e0 100644
--- a/src/glsl/nir/nir_lower_two_sided_color.c
+++ b/src/glsl/nir/nir_lower_two_sided_color.c
@@ -73,6 +73,7 @@ load_input(nir_builder *b, nir_variable *in)
load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_input);
load->num_components = 4;
load->const_index[0] = in->data.driver_location;
+   load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
nir_ssa_dest_init(>instr, >dest, 4, NULL);
nir_builder_instr_insert(b, >instr);
 
@@ -151,6 +152,7 @@ nir_lower_two_sided_color_block(nir_block *block, void 
*void_state)
  unsigned drvloc =
 state->colors[idx].front->data.driver_location;
  if (intr->const_index[0] == drvloc) {
+assert(nir_src_as_const_value(intr->src[0]));
 break;
  }
   }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 03/12] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-08 Thread Jason Ekstrand

There is some special-casing needed in a competent back-end.  However, they
can do their special-casing easily enough based on whether or not the
offset is a constant.  In the mean time, having the *_indirect variants
adds special cases a number of places where they don't need to be and, in
general, only complicates things.  To complicate matters, NIR had no way to
convdert an indirect load/store to a direct one in the case that the
indirect was a constant so we would still not really get what the back-ends
wanted.  The best solution seems to be to get rid of the *_indirect
variants entirely.
---
 src/glsl/nir/nir_intrinsics.h   | 81 +
 src/glsl/nir/nir_lower_phis_to_scalar.c |  4 --
 src/glsl/nir/nir_print.c|  9 +---
 3 files changed, 42 insertions(+), 52 deletions(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index b2565c5..63df21e 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -228,54 +228,55 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
 SYSTEM_VALUE(helper_invocation, 1, 0)
 
 /*
- * The format of the indices depends on the type of the load.  For uniforms,
- * the first index is the base address and the second index is an offset that
- * should be added to the base address.  (This way you can determine in the
- * back-end which variable is being accessed even in an array.)  For inputs,
- * the one and only index corresponds to the attribute slot.  UBO loads also
- * have a single index which is the base address to load from.
+ * Load operations pull data from some piece of GPU memory.  All load
+ * operations operate in terms of offsets into some piece of theoretical
+ * memory.  Loads from externally visible memory (UBO and SSBO) simply take a
+ * byte offset as a source.  Loads from opaque memory (uniforms, inputs, etc.)
+ * take a base+offset pair where the base (const_index[0]) gives the location
+ * of the start of the variable being loaded and and the offset source is a
+ * offset into that variable.
  *
- * UBO loads have a (possibly constant) source which is the UBO buffer index.
- * For each type of load, the _indirect variant has one additional source
- * (the second in the case of UBO's) that is the is an indirect to be added to
- * the constant address or base offset to compute the final offset.
+ * Some load operations such as UBO/SSBO load and per_vertex loads take an
+ * additional source to specify which UBO/SSBO/vertex to load from.
  *
- * For vector backends, the address is in terms of one vec4, and so each array
- * element is +4 scalar components from the previous array element. For scalar
- * backends, the address is in terms of a single 4-byte float/int and arrays
- * elements begin immediately after the previous array element.
+ * The exact address type depends on the lowering pass that generates the
+ * load/store intrinsics.  Typically, this is vec4 units for things such as
+ * varying slots and float units for fragment shader inputs.  UBO and SSBO
+ * offsets are always in bytes.
  */
 
-#define LOAD(name, extra_srcs, indices, flags) \
-   INTRINSIC(load_##name, extra_srcs, ARR(1), true, 0, 0, indices, flags) \
-   INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \
- true, 0, 0, indices, flags)
+#define LOAD(name, srcs, indices, flags) \
+   INTRINSIC(load_##name, srcs, ARR(1, 1, 1, 1), true, 0, 0, indices, flags)
 
-LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
-LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
-LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
-LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
-LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
-LOAD(output, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE)
-LOAD(per_vertex_output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
+/* src[] = { offset }. const_index[] = { base } */
+LOAD(uniform, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { buffer_index, offset }. No const_index */
+LOAD(ubo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { offset }. const_index[0] = { base } */
+LOAD(input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { vertex, offset }. const_index[] = { base } */
+LOAD(per_vertex_input, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { buffer_index, offset }. No const_index */
+LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE)
+/* src[] = { offset }. const_index[] = { base } */
+LOAD(output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
+/* src[] = { vertex, offset }. const_index[] = { base } */
+LOAD(per_vertex_output, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE)
 
 /*
- * Stores work the same way as loads, except now the first register input is
- * the value or array to store and the optional second input is the indirect
- * offset. SSBO stores are similar, but they accept

[Mesa-dev] [PATCH v3 05/12] nir/lower_io: Get rid of load/store_foo_indirect

2015-12-08 Thread Jason Ekstrand

---
 src/glsl/nir/nir.h  |   2 +-
 src/glsl/nir/nir_lower_io.c | 111 +---
 2 files changed, 44 insertions(+), 69 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index e161b70..2e72e66 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1969,7 +1969,7 @@ void nir_assign_var_locations(struct exec_list *var_list,
 void nir_lower_io(nir_shader *shader,
   nir_variable_mode mode,
   int (*type_size)(const struct glsl_type *));
-nir_src *nir_get_io_indirect_src(nir_intrinsic_instr *instr);
+nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
 nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
 
 void nir_lower_vars_to_ssa(nir_shader *shader);
diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index f64ac69..a2723d5 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -86,10 +86,9 @@ is_per_vertex_output(struct lower_io_state *state, 
nir_variable *var)
   stage == MESA_SHADER_TESS_CTRL;
 }
 
-static unsigned
+static nir_ssa_def *
 get_io_offset(nir_builder *b, nir_deref_var *deref,
   nir_ssa_def **vertex_index,
-  nir_ssa_def **out_indirect,
   int (*type_size)(const struct glsl_type *))
 {
nir_deref *tail = >deref;
@@ -109,8 +108,8 @@ get_io_offset(nir_builder *b, nir_deref_var *deref,
   *vertex_index = vtx;
}
 
-   nir_ssa_def *indirect = NULL;
-   unsigned base_offset = 0;
+   /* Just emit code and let constant-folding go to town */
+   nir_ssa_def *offset = nir_imm_int(b, 0);
 
while (tail->child != NULL) {
   const struct glsl_type *parent_type = tail->type;
@@ -120,55 +119,46 @@ get_io_offset(nir_builder *b, nir_deref_var *deref,
  nir_deref_array *deref_array = nir_deref_as_array(tail);
  unsigned size = type_size(tail->type);
 
- base_offset += size * deref_array->base_offset;
+ offset = nir_iadd(b, offset,
+   nir_imm_int(b, size * deref_array->base_offset));
 
  if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
 nir_ssa_def *mul =
nir_imul(b, nir_imm_int(b, size),
 nir_ssa_for_src(b, deref_array->indirect, 1));
 
-indirect = indirect ? nir_iadd(b, indirect, mul) : mul;
+offset = nir_iadd(b, offset, mul);
  }
   } else if (tail->deref_type == nir_deref_type_struct) {
  nir_deref_struct *deref_struct = nir_deref_as_struct(tail);
 
+ unsigned field_offset = 0;
  for (unsigned i = 0; i < deref_struct->index; i++) {
-base_offset += type_size(glsl_get_struct_field(parent_type, i));
+field_offset += type_size(glsl_get_struct_field(parent_type, i));
  }
+ offset = nir_iadd(b, offset, nir_imm_int(b, field_offset));
   }
}
 
-   *out_indirect = indirect;
-   return base_offset;
+   return offset;
 }
 
 static nir_intrinsic_op
 load_op(struct lower_io_state *state,
-nir_variable_mode mode, bool per_vertex, bool has_indirect)
+nir_variable_mode mode, bool per_vertex)
 {
nir_intrinsic_op op;
switch (mode) {
case nir_var_shader_in:
-  if (per_vertex) {
- op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect :
- nir_intrinsic_load_per_vertex_input;
-  } else {
- op = has_indirect ? nir_intrinsic_load_input_indirect :
- nir_intrinsic_load_input;
-  }
+  op = per_vertex ? nir_intrinsic_load_per_vertex_input :
+nir_intrinsic_load_input;
   break;
case nir_var_shader_out:
-  if (per_vertex) {
- op = has_indirect ? nir_intrinsic_load_per_vertex_output_indirect :
- nir_intrinsic_load_per_vertex_output;
-  } else {
- op = has_indirect ? nir_intrinsic_load_output_indirect :
- nir_intrinsic_load_output;
-  }
+  op = per_vertex ? nir_intrinsic_load_per_vertex_output :
+nir_intrinsic_load_output;
   break;
case nir_var_uniform:
-  op = has_indirect ? nir_intrinsic_load_uniform_indirect :
-  nir_intrinsic_load_uniform;
+  op = nir_intrinsic_load_uniform;
   break;
default:
   unreachable("Unknown variable mode");
@@ -211,32 +201,25 @@ nir_lower_io_block(nir_block *block, void *void_state)
 is_per_vertex_input(state, intrin->variables[0]->var) ||
 is_per_vertex_output(state, intrin->variables[0]->var);
 
- nir_ssa_def *indirect;
+ nir_ssa_def *offset;
  nir_ssa_def *vertex_index;
 
- unsigned offset = get_io_offset(b, intrin->variables[0],
- per_vertex ? _index : NULL,
- , state->type_size);
+

[Mesa-dev] [PATCH v3 07/12] i965/vec4: Get rid of load/store_foo_indirect

2015-12-08 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp |  7 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 95 +++
 2 files changed, 43 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
index e51ef4b..6f66978 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
@@ -60,19 +60,19 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
src_reg src;
 
switch (instr->intrinsic) {
-   case nir_intrinsic_load_per_vertex_input_indirect:
-  assert(!"EmitNoIndirectInput should prevent this.");
case nir_intrinsic_load_per_vertex_input: {
   /* The EmitNoIndirectInput flag guarantees our vertex index will
* be constant.  We should handle indirects someday.
*/
   nir_const_value *vertex = nir_src_as_const_value(instr->src[0]);
+  nir_const_value *offset = nir_src_as_const_value(instr->src[1]);
 
   /* Make up a type...we have no way of knowing... */
   const glsl_type *const type = glsl_type::ivec(instr->num_components);
 
   src = src_reg(ATTR, BRW_VARYING_SLOT_COUNT * vertex->u[0] +
-  instr->const_index[0], type);
+  instr->const_index[0] + offset->u[0],
+type);
   dest = get_nir_dest(instr->dest, src.type);
   dest.writemask = brw_writemask_for_size(instr->num_components);
   emit(MOV(dest, src));
@@ -80,7 +80,6 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
}
 
case nir_intrinsic_load_input:
-   case nir_intrinsic_load_input_indirect:
   unreachable("nir_lower_io should have produced per_vertex intrinsics");
 
case nir_intrinsic_emit_vertex_with_counter: {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 50570cd..f965b39 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -369,22 +369,17 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
dst_reg dest;
src_reg src;
 
-   bool has_indirect = false;
-
switch (instr->intrinsic) {
 
-   case nir_intrinsic_load_input_indirect:
-  has_indirect = true;
-  /* fallthrough */
case nir_intrinsic_load_input: {
-  int offset = instr->const_index[0];
-  src = src_reg(ATTR, offset, glsl_type::uvec4_type);
+  nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
+
+  /* We set EmitNoIndirectInput for VS */
+  assert(const_offset);
+
+  src = src_reg(ATTR, instr->const_index[0] + const_offset->u[0],
+glsl_type::uvec4_type);
 
-  if (has_indirect) {
- dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[0],
- BRW_REGISTER_TYPE_D,
- 1));
-  }
   dest = get_nir_dest(instr->dest, src.type);
   dest.writemask = brw_writemask_for_size(instr->num_components);
 
@@ -392,11 +387,11 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
   break;
}
 
-   case nir_intrinsic_store_output_indirect:
-  unreachable("nir_lower_outputs_to_temporaries should prevent this");
-
case nir_intrinsic_store_output: {
-  int varying = instr->const_index[0];
+  nir_const_value *const_offset = nir_src_as_const_value(instr->src[1]);
+  assert(const_offset);
+
+  int varying = instr->const_index[0] + const_offset->u[0];
 
   src = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_F,
 instr->num_components);
@@ -431,9 +426,6 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   break;
}
 
-   case nir_intrinsic_store_ssbo_indirect:
-  has_indirect = true;
-  /* fallthrough */
case nir_intrinsic_store_ssbo: {
   assert(devinfo->gen >= 7);
 
@@ -458,20 +450,19 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
   }
 
   /* Offset */
-  src_reg offset_reg = src_reg(this, glsl_type::uint_type);
-  unsigned const_offset_bytes = 0;
-  if (has_indirect) {
- emit(MOV(dst_reg(offset_reg), get_nir_src(instr->src[2], 1)));
+  src_reg offset_reg;
+  nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]);
+  if (const_offset) {
+ offset_reg = brw_imm_ud(const_offset->u[0]);
   } else {
- const_offset_bytes = instr->const_index[0];
- emit(MOV(dst_reg(offset_reg), brw_imm_ud(const_offset_bytes)));
+ offset_reg = get_nir_src(instr->src[2], 1);
   }
 
   /* Value */
   src_reg val_reg = get_nir_src(instr->src[0], 4);
 
   /* Writemask */
-  unsigned write_mask = instr->const_index[1];
+  unsigned write_mask = instr->const_index[0];
 
   /* IvyBridge does not have a native SIMD4x2 untyped write

[Mesa-dev] [PATCH v3 11/12] nir/lower_clip: Update to the new load/store intrinsics

2015-12-08 Thread Jason Ekstrand

---
 src/glsl/nir/nir_lower_clip.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/glsl/nir/nir_lower_clip.c b/src/glsl/nir/nir_lower_clip.c
index c58c778..e2a2bb6 100644
--- a/src/glsl/nir/nir_lower_clip.c
+++ b/src/glsl/nir/nir_lower_clip.c
@@ -74,6 +74,7 @@ store_clipdist_output(nir_builder *b, nir_variable *out, 
nir_ssa_def **val)
store->const_index[0] = out->data.driver_location;
store->src[0].ssa = nir_vec4(b, val[0], val[1], val[2], val[3]);
store->src[0].is_ssa = true;
+   store->src[1] = nir_src_for_ssa(nir_imm_int(b, 0));
nir_builder_instr_insert(b, >instr);
 }
 
@@ -85,6 +86,7 @@ load_clipdist_input(nir_builder *b, nir_variable *in, 
nir_ssa_def **val)
load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_input);
load->num_components = 4;
load->const_index[0] = in->data.driver_location;
+   load->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
nir_ssa_dest_init(>instr, >dest, 4, NULL);
nir_builder_instr_insert(b, >instr);
 
@@ -112,6 +114,7 @@ find_output_in_block(nir_block *block, void *void_state)
  intr->const_index[0] == state->drvloc) {
 assert(state->def == NULL);
 assert(intr->src[0].is_ssa);
+assert(nir_src_as_const_value(intr->src[1]));
 state->def = intr->src[0].ssa;
 
 #if !defined(DEBUG)
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 1626] X server should not poll() on DRM fd

2015-12-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=1626

--- Comment #3 from Alex Perez  ---
The last comment on this bug was eight and a half years ago...the bug itself
was filed back in 2004. Probably close as WONTFIX, if it's even relevant any
longer?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

74 matches

Mail list logo