Re: [Mesa-dev] [PATCH mesa] u_debug: do not compile asserts when they are disabled

2017-11-30 Thread Gert Wollny
There was actually some ping-pong about this issue: 

d885c9dad (Keith Whitwell) introduced it, 
7a2271c65 (José Fonseca) removed it , and  
f0ba7d897 (Keith Whitwell) introduced it again. 

Given this I would give some preference to not to forward standard
assert instead, but don't really see it as an issue to change
debug_assert.

With a31d289de6091 José Fonseca introduced p_debug.h which would
eventually become u_debug,h, and there assert is already channeled to
debug_assert, but it is not clear why it was seen as problematic to
used the standard C assert in 2008.

In any case:
Reviewed-By: Gert Wollny 

Am Donnerstag, den 30.11.2017, 12:16 + schrieb Eric Engestrom:
> Commit f0ba7d897d1c22202531a added this code to expose asserts to the
> compiler in an attempt to hide 'unused variable' warnings,
> incorrectly
> claiming it was a no-op. This has two bad effects:
> - any assert with side-effects are executed when they should be
>   disabled
> - the whole content of the assert must be understandable by the
>   compiler, which isn't true if variable or members are correctly
>   guarded by NDEBUG
> 
> Fix this by effectively reverting f0ba7d897d1c22202531a.
> 
> Unused variables warnings can be addressed by marking the variables
> as MAYBE_UNUSED instead, as is done in the rest of Mesa.
> 
> Fixes: f0ba7d897d1c22202531a "util: better fix for unused variable
>    warnings with asserts"
> Cc: Keith Whitwell 
> Reported-by: Gert Wollny 
> Signed-off-by: Eric Engestrom 
> ---
>  src/gallium/auxiliary/util/u_debug.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_debug.h
> b/src/gallium/auxiliary/util/u_debug.h
> index d2ea89f59c10e1bc0944..88e9bb8a29d63826167e 100644
> --- a/src/gallium/auxiliary/util/u_debug.h
> +++ b/src/gallium/auxiliary/util/u_debug.h
> @@ -188,7 +188,7 @@ void _debug_assert_fail(const char *expr,
>  #ifndef NDEBUG
>  #define debug_assert(expr) ((expr) ? (void)0 :
> _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__))
>  #else
> -#define debug_assert(expr) (void)(0 && (expr))
> +#define debug_assert(expr) ((void)0)
>  #endif
>  
>  
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 24/25] i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders

2017-11-30 Thread Eduardo Lima Mitev
On 12/01/2017 04:44 AM, Timothy Arceri wrote:
>
>
> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
>> This is the main fork of the shader compilation code-path, where a NIR
>> shader is obtained by calling spirv_to_nir() or glsl_to_nir(),
>> depending on its nature..
>>
>> v2: Use 'spirv_data' member from gl_linked_shader to know which method
>>     to call. (Timothy Arceri)
>> ---
>>   src/mesa/drivers/dri/i965/brw_program.c | 14 --
>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_program.c
>> b/src/mesa/drivers/dri/i965/brw_program.c
>> index 755d4973cc0..596118f2fe5 100644
>> --- a/src/mesa/drivers/dri/i965/brw_program.c
>> +++ b/src/mesa/drivers/dri/i965/brw_program.c
>> @@ -31,6 +31,7 @@
>>     #include 
>>   #include "main/imports.h"
>> +#include "main/glspirv.h"
>>   #include "program/prog_parameter.h"
>>   #include "program/prog_print.h"
>>   #include "program/prog_to_nir.h"
>> @@ -73,9 +74,18 @@ brw_create_nir(struct brw_context *brw,
>>     ctx->Const.ShaderCompilerOptions[stage].NirOptions;
>>  nir_shader *nir;
>>   -   /* First, lower the GLSL IR or Mesa IR to NIR */
>> +   /* First, lower the GLSL/Mesa IR or SPIR-V to NIR */
>>  if (shader_prog) {
>> -  nir = glsl_to_nir(shader_prog, stage, options);
>> +  bool is_spirv_shader =
>> + (shader_prog->_LinkedShaders[stage]->spirv_data != NULL);
>> +
>> +  if (!is_spirv_shader) {
>> + nir = glsl_to_nir(shader_prog, stage, options);
>> +  } else {
>> + nir = _mesa_spirv_to_nir(ctx, shader_prog, stage, options);
>> +  }
>> +  assert (nir);
>
> Rather than messing around with bools, null checks and !'s I'd just
> change this to:
>
>   /* First, lower the GLSL/Mesa IR or SPIR-V to NIR */
>   if (shader_prog->_LinkedShaders[stage]->spirv_data) {
>  nir = _mesa_spirv_to_nir(ctx, shader_prog, stage, options);
>   } else {
>  nir = glsl_to_nir(shader_prog, stage, options);
>   }
>   assert (nir);
>
>

My intention was to make it clearer that we are using a pointer nullness
to decide code-path.
I don't care much so I will use your inlined version above, maybe just
adding a comment.

Thanks,

Eduardo

>> +
>>     nir_remove_dead_variables(nir, nir_var_shader_in |
>> nir_var_shader_out);
>>     nir_lower_returns(nir);
>>     nir_validate_shader(nir);
>>
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] glsl: don't run intrastage array validation when the interface type is not an array

2017-11-30 Thread Samuel Iglesias Gonsálvez
On Thu, 2017-11-30 at 15:47 +0100, Nicolai Hähnle wrote:
> Can you add an explanation / spec quote for this?
> 

Yes.

"We validate that the interface block array type's definition matches.
However, the function could be previously called if an non-array
interface block has different type definitions -for example, when the
precision qualifier differs in a GLSL ES shader, we would create two
different types-, and it would return invalid as both definitions are
non-arrays.

We fix this by specifying that at least one definition should be an
array to call the validation."

Does it sound good to you?

Sam

> On 09.11.2017 12:48, Samuel Iglesias Gonsálvez wrote:
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> > ---
> >   src/compiler/glsl/link_interface_blocks.cpp | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/compiler/glsl/link_interface_blocks.cpp
> > b/src/compiler/glsl/link_interface_blocks.cpp
> > index c2c3b58f821..ce90d916075 100644
> > --- a/src/compiler/glsl/link_interface_blocks.cpp
> > +++ b/src/compiler/glsl/link_interface_blocks.cpp
> > @@ -137,7 +137,7 @@ intrastage_match(ir_variable *a,
> >  /* If a block is an array then it must match across the
> > shader.
> >   * Unsized arrays are also processed and matched agaist sized
> > arrays.
> >   */
> > -   if (b->type != a->type &&
> > +   if (b->type != a->type && (b->type->is_array() || a->type-
> > >is_array()) &&
> >  (b->is_interface_instance() || a->is_interface_instance()) 
> > &&
> >  !validate_intrastage_arrays(prog, b, a))
> > return false;
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 00/25] Initial gl_spirv and spirv_extensions support in Mesa and i965

2017-11-30 Thread Eduardo Lima Mitev
On 12/01/2017 04:54 AM, Timothy Arceri wrote:
> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
>> Hello,
>>
>> This is the second version of the series providing initial support
>> for ARB_gl_spirv and ARB_spirv_extensions in Mesa and i965.
>>
>> First version of the series can be found at
>> .
>>
>> In this series we hope we have addressed all issues detected during
>> the initial review. Thank you all who participated!
>>
>> Taking the nitpicks and minor fixes apart, most important changes
>> compared to the first version are:
>>
>> * A dedicated 'spirv' flag was removed from gl_shader struct. Now we
>> use the nulness of 'spirv_data' member for the same purpose.
>>
>> * The per-program 'spirv' flag was moved out of this series, but will
>> likely be re-introduced in the next delivery, because it will become
>> necessary.
>>
>> * We enforce one SPIR-V shader per stage, and fail linking if this
>> condition is not met.
>
> Sorry can you point me to the patch that contains this I couldn't find
> it when skimming over the series. Thanks.
>

Ohh, I'm very sorry. The revised patch got lost during rebase.

I have just sent a v3 of
"[PATCH v2 22/25] mesa/glspirv: Create gl_linked_shader objects for a
SPIR-V program" that includes the check in question.

Thanks for catching this!

Eduardo

>>
>> * 'SpirVCapabilities' struct of GL context constants is no longer a
>> pointer but a static struct.
>>
>> As usual, a tree of this series can be found at
>> .
>>
>> A tree of the larger WIP branch from which this series is taken:
>> .
>>
>> Thanks in advance for the reviews!
>>
>> cheers,
>> Eduardo
>>
>> Alejandro Piñeiro (9):
>>    spirv_extensions: rename nir_spirv_supported_extensions
>>    mesa: move nir_spirv_supported_capabilities definition
>>    i965: initialize SPIR-V capabilities
>>    spirv_extensions: add GL_ARB_spirv_extensions boilerplate
>>    spirv_extensions: add list of extensions and to_string method
>>    spirv_extensions: define spirv_extensions_supported
>>    spirv_extensions: add spirv_supported_extensions on gl_constants
>>    spirv_extensions: i965: initialize SPIR-V extensions
>>    nir/spirv: add gl_spirv_validation method
>>
>> Eduardo Lima Mitev (8):
>>    mesa/glspirv: Add struct gl_shader_spirv_data
>>    mesa/glspirv: Add a _mesa_spirv_link_shaders() placeholder
>>    mesa/program: Link SPIR-V shaders using the SPIR-V code-path
>>    mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader
>>    mesa/glspirv: Create gl_linked_shader objects for a SPIR-V program
>>    mesa/glspirv: Add a _mesa_spirv_to_nir() function
>>    i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders
>>    i965: Don't call process_glsl_ir() for SPIR-V shaders
>>
>> Neil Roberts (1):
>>    mesa: Add boilerplate for the GL 4.6 alias of glSpecializeShaderARB
>>
>> Nicolai Hähnle (7):
>>    mesa: add GL_ARB_gl_spirv boilerplate
>>    mesa/glspirv: Add struct gl_spirv_module
>>    mesa: implement SPIR-V loading in glShaderBinary
>>    mesa/shaderapi: add a getter for GL_SPIR_V_BINARY_ARB
>>    mesa: refuse to compile SPIR-V shaders or link mixed shaders
>>    mesa: add gl_constants::SpirVCapabilities
>>    mesa: Implement glSpecializeShaderARB
>>
>>   src/amd/vulkan/radv_shader.c    |   4 +-
>>   src/compiler/Makefile.sources   |   2 +
>>   src/compiler/spirv/nir_spirv.h  |  21 +-
>>   src/compiler/spirv/spirv_extensions.c   |  77 +++
>>   src/compiler/spirv/spirv_extensions.h   |  63 ++
>>   src/compiler/spirv/spirv_to_nir.c   | 160 +-
>>   src/compiler/spirv/vtn_private.h    |   2 +-
>>   src/intel/vulkan/anv_pipeline.c |   4 +-
>>   src/mapi/glapi/gen/ARB_gl_spirv.xml |  21 ++
>>   src/mapi/glapi/gen/ARB_spirv_extensions.xml |  13 ++
>>   src/mapi/glapi/gen/GL4x.xml |  11 +
>>   src/mapi/glapi/gen/Makefile.am  |   2 +
>>   src/mapi/glapi/gen/gl_API.xml   |   8 +
>>   src/mapi/glapi/gen/gl_genexec.py    |   1 +
>>   src/mapi/glapi/gen/meson.build  |   2 +
>>   src/mesa/Makefile.sources   |   4 +
>>   src/mesa/drivers/dri/i965/brw_context.c |  26 +++
>>   src/mesa/drivers/dri/i965/brw_link.cpp  |   3 +-
>>   src/mesa/drivers/dri/i965/brw_program.c |  14 +-
>>   src/mesa/main/context.c |   2 +
>>   src/mesa/main/extensions_table.h    |   2 +
>>   src/mesa/main/get.c |   7 +
>>   src/mesa/main/get_hash_params.py    |   3 +
>>   src/mesa/main/getstring.c   |  12 +
>>   src/mesa/main/glspirv.c | 331
>> 
>>   src/mesa/main/glspirv.h | 108 +
>>   

[Mesa-dev] [PATCH v3 22/25] mesa/glspirv: Create gl_linked_shader objects for a SPIR-V program

2017-11-30 Thread Eduardo Lima Mitev
v2: Bail out if we see more that one shader for the same stage, and add
   a corresponding comment. (Timothy Arceri)
---
 src/mesa/main/glspirv.c | 61 ++---
 1 file changed, 58 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
index e533853f7fa..0934ceccbb3 100644
--- a/src/mesa/main/glspirv.c
+++ b/src/mesa/main/glspirv.c
@@ -29,6 +29,8 @@
 #include "compiler/nir/nir.h"
 #include "compiler/spirv/nir_spirv.h"
 
+#include "program/program.h"
+
 #include "util/u_atomic.h"
 
 void
@@ -104,14 +106,67 @@ _mesa_spirv_shader_binary(struct gl_context *ctx,
}
 }
 
+/**
+ * This is the equivalent to compiler/glsl/linker.cpp::link_shaders()
+ * but for SPIR-V programs.
+ *
+ * This method just creates the gl_linked_shader structs with a reference to
+ * the SPIR-V data collected during previous steps.
+ *
+ * The real linking happens later in the driver-specifc call LinkShader().
+ * This is so backends can implement different linking strategies for
+ * SPIR-V programs.
+ */
 void
 _mesa_spirv_link_shaders(struct gl_context *ctx, struct gl_shader_program 
*prog)
 {
-   /* @TODO: This is a placeholder for the equivalent of
-* compiler/glsl/linker.cpp::link_shaders() but for SPIR-V.
-*/
prog->data->LinkStatus = linking_success;
prog->data->Validated = false;
+
+   for (unsigned i = 0; i < prog->NumShaders; i++) {
+  struct gl_shader *shader = prog->Shaders[i];
+  gl_shader_stage shader_type = shader->Stage;
+
+  /* We only support one shader per stage. The gl_spirv spec doesn't seem
+   * to prevent this, but the way the API is designed, requiring all 
shaders
+   * to be specialized with an entry point, makes supporting this quite
+   * undefined.
+   */
+  if (prog->_LinkedShaders[shader_type]) {
+ prog->data->LinkStatus = linking_failure;
+ return;
+  }
+
+  assert(shader->spirv_data);
+
+  struct gl_linked_shader *linked = rzalloc(NULL, struct gl_linked_shader);
+  linked->Stage = shader_type;
+
+  /* Create program and attach it to the linked shader */
+  struct gl_program *gl_prog =
+ ctx->Driver.NewProgram(ctx,
+_mesa_shader_stage_to_program(shader_type),
+prog->Name, false);
+  if (!gl_prog) {
+ prog->data->LinkStatus = linking_failure;
+ _mesa_delete_linked_shader(ctx, linked);
+ return;
+  }
+
+  _mesa_reference_shader_program_data(ctx,
+  _prog->sh.data,
+  prog->data);
+
+  /* Don't use _mesa_reference_program() just take ownership */
+  linked->Program = gl_prog;
+
+  /* Reference the SPIR-V data from shader to the linked shader */
+  _mesa_shader_spirv_data_reference(>spirv_data,
+shader->spirv_data);
+
+  prog->_LinkedShaders[shader_type] = linked;
+  prog->data->linked_stages |= 1 << shader_type;
+   }
 }
 
 void GLAPIENTRY
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl/es: precision qualifier doesn't need to match in UBOs

2017-11-30 Thread Samuel Iglesias Gonsálvez
On Thu, 2017-11-30 at 15:46 +0100, Nicolai Hähnle wrote:
> On 09.11.2017 12:48, Samuel Iglesias Gonsálvez wrote:
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> 
> Can you also update the comment? I can't quite match what the
> comment 
> says to what the code does.
> 
> I think it should be something like:
> 
> "They might mismatch due to the two shaders using different GLSL 
> versions, and that's ok in desktop GL. In ES, precision qualifiers
> don't need to match."
> 

I replied to this patch saying that I forgot to add it and suggesting
one. I am going to use yours.

> If this understanding is correct and the comment added,
> 
> Reviewed-by: Nicolai Hähnle 
> 

Thanks,

Sam

> 
> 
> > ---
> >   src/compiler/glsl/link_interface_blocks.cpp | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/compiler/glsl/link_interface_blocks.cpp
> > b/src/compiler/glsl/link_interface_blocks.cpp
> > index 510d4f71bbe..c2c3b58f821 100644
> > --- a/src/compiler/glsl/link_interface_blocks.cpp
> > +++ b/src/compiler/glsl/link_interface_blocks.cpp
> > @@ -114,7 +114,7 @@ intrastage_match(ir_variable *a,
> >  */
> > if ((a->data.how_declared != ir_var_declared_implicitly ||
> >  b->data.how_declared != ir_var_declared_implicitly) &&
> > -  (!prog->IsES || prog->data->Version != 310 ||
> > +  (!prog->IsES ||
> >  interstage_member_mismatch(prog, a-
> > >get_interface_type(),
> > b->get_interface_type(
> >return false;
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nv50/ir: rework the madsp subop handling

2017-11-30 Thread Karol Herbst
Creating correct SubOps for OP_MADSP wasn't easy, because devs needed to know
the proper values for each data type. Also the third source doesn't know any
explicit signess and the current code tried to work around this fact.

To make the creating of the subops much easier, this commit introduces a helper
macro, which can be used in an intuitive way and eliminates the need to know
the correct numbers for creating such subops.

Values for src0 and src1 kept their meaning, src2 changed significantly:

Before:
bit 0 changed the sign of src0
bit 1 changed the sign of src1
bits 2-3 choosed the data type out of 32, 24 and 16L

Now:
bits 0-1 choose the data type out of 32, 24 and 16L

This change also lead to an additional << 2 shift in the emmiter for src2

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  | 24 +-
 .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp |  8 +++-
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  |  8 +++-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 10 -
 4 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index bc15992df0..10b872e0dd 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -239,11 +239,25 @@ enum operation
 #define NV50_IR_SUBOP_LOAD_LOCKED1
 #define NV50_IR_SUBOP_STORE_UNLOCKED 2
 #define NV50_IR_SUBOP_MADSP_SD 0x
-// Yes, we could represent those with DataType.
-// Or put the type into operation and have a couple 1000 values in that enum.
-// This will have to do for now.
-// The bitfields are supposed to correspond to nve4 ISA.
-#define NV50_IR_SUBOP_MADSP(a,b,c) (((c) << 8) | ((b) << 4) | (a))
+#define NV50_IR_SUBOP_MADSP_SRC0_U32  0x000
+#define NV50_IR_SUBOP_MADSP_SRC0_S32  0x001
+#define NV50_IR_SUBOP_MADSP_SRC0_U24  0x002
+#define NV50_IR_SUBOP_MADSP_SRC0_S24  0x003
+#define NV50_IR_SUBOP_MADSP_SRC0_U16L 0x004
+#define NV50_IR_SUBOP_MADSP_SRC0_S16L 0x005
+#define NV50_IR_SUBOP_MADSP_SRC0_U16H 0x006
+#define NV50_IR_SUBOP_MADSP_SRC0_S16H 0x007
+#define NV50_IR_SUBOP_MADSP_SRC1_U24  0x000
+#define NV50_IR_SUBOP_MADSP_SRC1_S24  0x010
+#define NV50_IR_SUBOP_MADSP_SRC1_U16L 0x020
+#define NV50_IR_SUBOP_MADSP_SRC1_S16L 0x030
+#define NV50_IR_SUBOP_MADSP_SRC2_32   0x000
+#define NV50_IR_SUBOP_MADSP_SRC2_24   0x100
+#define NV50_IR_SUBOP_MADSP_SRC2_16L  0x200
+#define NV50_IR_SUBOP_MADSP_TUPLE(a, b, c) \
+   ( NV50_IR_SUBOP_MADSP_SRC0_ ## a | \
+ NV50_IR_SUBOP_MADSP_SRC1_ ## b | \
+ NV50_IR_SUBOP_MADSP_SRC2_ ## c )
 #define NV50_IR_SUBOP_V1(d,a,b)(((d) << 10) | ((b) << 5) | (a) | 0x)
 #define NV50_IR_SUBOP_V2(d,a,b)(((d) << 10) | ((b) << 5) | (a) | 0x4000)
 #define NV50_IR_SUBOP_V4(d,a,b)(((d) << 10) | ((b) << 5) | (a) | 0x8000)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
index 370427d0d1..f9c34a2760 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
@@ -556,11 +556,9 @@ CodeEmitterGK110::emitMADSP(const Instruction *i)
if (i->subOp == NV50_IR_SUBOP_MADSP_SD) {
   code[1] |= 0x00c0;
} else {
-  code[1] |= (i->subOp & 0x00f) << 19; // imadp1
-  code[1] |= (i->subOp & 0x0f0) << 20; // imadp2
-  code[1] |= (i->subOp & 0x100) << 11; // imadp3
-  code[1] |= (i->subOp & 0x200) << 15; // imadp3
-  code[1] |= (i->subOp & 0xc00) << 12; // imadp3
+  code[1] |= (i->subOp & 0x007) << 19; // src0
+  code[1] |= (i->subOp & 0x030) << 20; // src1
+  code[1] |= (i->subOp & 0x300) << 14; // src2
}
 
if (i->flagsDef >= 0)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index 58594f02c7..e6dcf2c6cb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -826,11 +826,9 @@ CodeEmitterNVC0::emitMADSP(const Instruction *i)
if (i->subOp == NV50_IR_SUBOP_MADSP_SD) {
   code[1] |= 0x0180;
} else {
-  code[0] |= (i->subOp & 0x00f) << 7;
-  code[0] |= (i->subOp & 0x0f0) << 1;
-  code[0] |= (i->subOp & 0x100) >> 3;
-  code[0] |= (i->subOp & 0x200) >> 2;
-  code[1] |= (i->subOp & 0xc00) << 13;
+  code[0] |= (i->subOp & 0x007) << 7;
+  code[0] |= (i->subOp & 0x030) << 1;
+  code[1] |= (i->subOp & 0x300) << 11;
}
 
if (i->flagsDef >= 0)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 7243b1d2e4..d0eff75bef 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1893,17 

[Mesa-dev] [Bug 103699] Latest mesa breaks firefox on kde plasma with compositing on

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103699

--- Comment #24 from Tapani Pälli  ---
(In reply to Fredrik Höglund from comment #23)
> (In reply to Tapani Pälli from comment #22)
> > (In reply to Fredrik Höglund from comment #20)
> > > Created attachment 135782 [details]
> > > modified test case
> > >
> > > The window decoration is a separate texture in kwin, so the decoration
> > > should be rendered normally even if the fbconfig is causing a blending 
> > > issue
> > > with the window contents. I think it is more likely that binding the 
> > > pixmap
> > > to a texture failed, and kwin is skipping the window entirely as a result.
> > >
> >
> > It is possible that there are 2 separate issues. One is the rendering
> > results of krunner, panel and launcher menu, that is way too dark and then
> > there are these invisible windows. Both get fixed by not setting sRGB
> > capable config for 32bit visual, which I think is correct thing todo since
> > there is only 1 such (non-conformant) visual exposed.
> 
> Yes, I believe the dark windows is a result of Qt not using the ARGB visual
> for those windows, so the background which is supposed to be translucent is
> solid black.
> 
> I don't disagree with the solution, but I don't think these are blending
> issues; it's rather that applications are looking for a non-sRGB capable
> fbconfig associated with the ARGB visual, and either giving up or falling
> back to a non-ARGB visual after not finding one.

Yep having done some more investigation I do agree it's not about blending,
it's about getting wrong kind of visual. I'll wait for review comments on the
xorg change.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 103699] Latest mesa breaks firefox on kde plasma with compositing on

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103699

--- Comment #25 from Mario Kleiner  ---
Another problem caused by the new Mesa srgb visuals -- and assigning srgb to
the special 32 bit composite overlay --  seems to be that the xf86-video-intel
ddx will only give you a black screen under KDE Plasma 5, as tested with
KUbuntu 16.04.3 LTS + distribution provided X-Server 1.19.3. Found this when i
disabled depth 30 after testing my 10 bpc Mesa patches on top of the new srgb
patch and stared at a black screen after logging in again.

Works fine with the intel-ddx at X-Screen "DefaultDepth 30", or with the
modesetting ddx at regular depth 24.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 20/22] nv50, nvc0: Support BGRX1010102 format for visuals.

2017-11-30 Thread Mario Kleiner

On 11/29/2017 04:38 PM, Ilia Mirkin wrote:

Why is this required? Can't you just use the BGR10_A2 format directly?



If i don't define this PIPE_FORMAT_B10G10R10X2_UNORM as "TD" = 
displayable, then it doesn't get exposed by the state tracker as a 
visual/fbconfig with RGBA = 10 10 10 0 under nouveau.


Wayland's Weston doesn't like that at all and gives a screen with pixel 
trash instead of a proper desktop, probably because it falls back to a 
BGRA1010102 format with alpha channel, that might indeed be zero.


On X11, all redirected/composited rendering only gives a black window 
client area, e.g., glxgears ends up as a black rectangle.


With the patch i get proper Weston display, and proper composited X11. 
"Proper" within the limitations imposed by my hacks + tbd work on the 
ddx and kms driver.



The problem with exposing these sorts of formats is that blending with
DST_ALPHA would be incorrect -- it should get read in as 1.0, but will
end up with bogus values.


Hm. My own application uses DST_ALPHA and ONE_MINUS_DST_ALPHA blending 
on the window system backbuffer in some demos and it seems to work fine 
on nouveau in depth 30 from what i can see. Not sure if this is due to 
the way my demos handle this though and there might be other cases that 
misbehave like you describe.


Unfortunately nv50/g80_defs.xml.h doesn't define a BGR10 surface format 
without alpha channel.


-mario



Cheers,

   -ilia

On Tue, Nov 28, 2017 at 11:21 PM, Mario Kleiner
 wrote:

Add it as displayable/scanout capable, so it can be
exposed as valid visual/fbconfig.

Signed-off-by: Mario Kleiner 
---
  src/gallium/drivers/nouveau/nv50/nv50_formats.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c 
b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
index 706c34f..f2f9a4f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
@@ -154,6 +154,7 @@ const struct nv50_format 
nv50_format_table[PIPE_FORMAT_COUNT] =

 C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, 
IB),
 C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, 
TD),
+   F3(A, B10G10R10X2_UNORM, BGR10_A2_UNORM, B, G, R, xx, UNORM, A2B10G10R10, 
TD),
 C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T),
 C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T),
 C4(A, R10G10B10A2_UINT, RGB10_A2_UINT, R, G, B, A, UINT, A2B10G10R10, TR),
--
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 103699] Latest mesa breaks firefox on kde plasma with compositing on

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103699

--- Comment #23 from Fredrik Höglund  ---
(In reply to Tapani Pälli from comment #22)
> (In reply to Fredrik Höglund from comment #20)
> > Created attachment 135782 [details]
> > modified test case
> >
> > The window decoration is a separate texture in kwin, so the decoration
> > should be rendered normally even if the fbconfig is causing a blending issue
> > with the window contents. I think it is more likely that binding the pixmap
> > to a texture failed, and kwin is skipping the window entirely as a result.
> >
>
> It is possible that there are 2 separate issues. One is the rendering
> results of krunner, panel and launcher menu, that is way too dark and then
> there are these invisible windows. Both get fixed by not setting sRGB
> capable config for 32bit visual, which I think is correct thing todo since
> there is only 1 such (non-conformant) visual exposed.

Yes, I believe the dark windows is a result of Qt not using the ARGB visual for
those windows, so the background which is supposed to be translucent is solid
black.

I don't disagree with the solution, but I don't think these are blending
issues; it's rather that applications are looking for a non-sRGB capable
fbconfig associated with the ARGB visual, and either giving up or falling back
to a non-ARGB visual after not finding one.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] spirv: Allow OpPtrAccessChain for block indices

2017-11-30 Thread Jason Ekstrand
After talking with Kristian a bit on IRC, I think there is a bit more
explanation needed.  I would be happy to add something like this to the
commit message or as a comment upon request.

There is an interesting edge case in the variable pointers extension when
you do an OpPtrAccessChain on a pointer to a struct decorated block.  In
this case there is not entirely clear if you should look for an array
stride and treat it as an implicit array of such structs or if you should
treat it as a single block in an array of blocks which would imply
incrementing the index portion of the descriptor tuple.  We choose the
later approach and assume that it's an array of blocks and therefore an
array of SSBOs.  This has two advantages:

 1) The SPIR-V spec for the OpPtrAccessChain instruction says "*Base* is
treated as the address of the first element of an array, and the *Element*
element’s address is computed to be the base for the *Indexes*, as per
*OpAccessChain*."  Taken literally, that would mean that your struct type
is supposed to be treated as an array of such a struct and, since it's
decorated block, that means an array of blocks which corresponds to an
array descriptor.

 2) Taking this approach ensures that any array dereference in an
OpAccessChain can be replaced with an OpAccessChain that selects element 0
of the array together with an OpPtrAccessChain taking that result and
adding to the index.  In particular, it ensures that this works when the
array dereference is on an array of SSBOs.

The downside (there always is one) is that this might be somewhat
surprising behavior to apps.  I can easily see an app slapping an
ArrayStride on the pointer and expecting that struct to implicitly turn
into an array of structs and being a bit shocked when they get GPU hangs
because of trying to access some random texture as an SSBO.  We could
attempt to be a bit "smarter" and go looking for an ArrayStride decoration
and, if we find one, use that instead of incrementing the block index.
However, that seems like a recipe for disaster because the behavior
suddenly depends on adding a decoration.

Really, this whole mess is an unfortunate complication arising from
attempting to add variable pointers onto a description of resource
descriptors that's based on GLSL syntax.  I'm in the process of trying to
get things clarified within the SPIR-V working group in Khronos.  In the
mean time, I believe this to be the best and most reasonable interpretation
of the spec as-written today.

--Jason


On Thu, Nov 30, 2017 at 7:06 PM, Jason Ekstrand 
wrote:

> The SPIR-V spec is a bit underspecified when it comes to exactly how
> you're allowed to use OpPtrAccessChain and what it means in certain edge
> cases.  In particular, what if the base pointer of the OpPtrAccessChain
> points to the base struct of an SSBO instead of an element in that SSBO.
> The original variable pointers implementation in mesa assumed that you
> weren't allowed to do an OpPtrAccessChain that adjusted the block index
> and asserted such.  However, there are some CTS tests that do this and,
> if the CTS does it, someone will do it in the wild so we should probably
> handle it.  With this commit, we significantly reduce our assumptions
> and should be able to handle more-or-less anything.
>
> The one assumption we still make for correctness is that if we see an
> OpPtrAccessChain on a pointer to a struct decorated block that the block
> index should be adjusted.  In theory, someone could try to put an array
> stride on such a pointer and try to make the SSBO an implicit array of
> the base struct and we would not give them what they want.  That said,
> any index other than 0 would count as an out-of-bounds access which is
> invalid.
> ---
>  src/compiler/spirv/vtn_variables.c | 128 --
> ---
>  1 file changed, 83 insertions(+), 45 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_
> variables.c
> index 09b3d35..89ce939 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -157,6 +157,22 @@ vtn_variable_resource_index(struct vtn_builder *b,
> struct vtn_variable *var,
> return >dest.ssa;
>  }
>
> +static nir_ssa_def *
> +vtn_resource_reindex(struct vtn_builder *b, nir_ssa_def *base_index,
> + nir_ssa_def *offset_index)
> +{
> +   nir_intrinsic_instr *instr =
> +  nir_intrinsic_instr_create(b->nb.shader,
> + nir_intrinsic_vulkan_resource_reindex);
> +   instr->src[0] = nir_src_for_ssa(base_index);
> +   instr->src[1] = nir_src_for_ssa(offset_index);
> +
> +   nir_ssa_dest_init(>instr, >dest, 1, 32, NULL);
> +   nir_builder_instr_insert(>nb, >instr);
> +
> +   return >dest.ssa;
> +}
> +
>  static struct vtn_pointer *
>  vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
> struct vtn_pointer *base,
> @@ -167,47 +183,61 @@ 

[Mesa-dev] [PATCH 2/2] r600/atomic: add cayman version of atomic save/restore from GDS

2017-11-30 Thread Dave Airlie
From: Dave Airlie 

On Cayman we don't use the append/consume counters (fglrx doesn't)
and they don't seem to work well with compute shaders.

This just uses GDS instead to do the atomic operations.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/evergreen_state.c | 60 +++-
 src/gallium/drivers/r600/r600_shader.c | 91 +++---
 2 files changed, 129 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 850165b30b..c44ed27b2c 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2659,6 +2659,7 @@ static void cayman_init_atom_start_cs(struct r600_context 
*rctx)
r600_store_value(cb, 0x76543210); /* 
CM_R_028BD4_PA_SC_CENTROID_PRIORITY_0 */
r600_store_value(cb, 0xfedcba98); /* 
CM_R_028BD8_PA_SC_CENTROID_PRIORITY_1 */
 
+   r600_store_context_reg(cb, R_028724_GDS_ADDR_SIZE, 0x3fff);
r600_store_context_reg_seq(cb, R_0288E8_SQ_LDS_ALLOC, 2);
r600_store_value(cb, 0); /* R_0288E8_SQ_LDS_ALLOC */
r600_store_value(cb, 0); /* R_0288EC_SQ_LDS_ALLOC_PS */
@@ -4502,6 +4503,51 @@ static void evergreen_emit_event_write_eos(struct 
r600_context *rctx,
radeon_emit(cs, reloc);
 }
 
+/* writes count from a buffer into GDS */
+static void cayman_write_count_to_gds(struct r600_context *rctx,
+ struct r600_shader_atomic *atomic,
+ struct r600_resource *resource,
+ uint32_t pkt_flags)
+{
+   struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
+   unsigned reloc = radeon_add_to_buffer_list(>b, >b.gfx,
+  resource,
+  RADEON_USAGE_READ,
+  
RADEON_PRIO_SHADER_RW_BUFFER);
+   uint64_t dst_offset = resource->gpu_address + (atomic->start * 4);
+
+   radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0) | pkt_flags);
+   radeon_emit(cs, dst_offset & 0x);
+   radeon_emit(cs, PKT3_CP_DMA_CP_SYNC | PKT3_CP_DMA_DST_SEL(1) | 
((dst_offset >> 32) & 0xff));// GDS
+   radeon_emit(cs, atomic->hw_idx * 4);
+   radeon_emit(cs, 0);
+   radeon_emit(cs, PKT3_CP_DMA_CMD_DAS | 4);
+   radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
+   radeon_emit(cs, reloc);
+}
+
+static void cayman_read_count_from_gds(struct r600_context *rctx,
+   struct r600_shader_atomic *atomic,
+   struct r600_resource *resource,
+   uint32_t pkt_flags)
+{
+   struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
+   unsigned reloc = radeon_add_to_buffer_list(>b, >b.gfx,
+  resource,
+  RADEON_USAGE_WRITE,
+  
RADEON_PRIO_SHADER_RW_BUFFER);
+   uint64_t dst_offset = resource->gpu_address + (atomic->start * 4);
+
+   radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0) | pkt_flags);
+   radeon_emit(cs, atomic->hw_idx * 4);
+   radeon_emit(cs, PKT3_CP_DMA_CP_SYNC | PKT3_CP_DMA_SRC_SEL(1));// GDS
+   radeon_emit(cs, dst_offset & 0x);
+   radeon_emit(cs, (dst_offset >> 32) & 0xff);
+   radeon_emit(cs, PKT3_CP_DMA_CMD_SAS | 4);
+   radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
+   radeon_emit(cs, reloc);
+}
+
 bool evergreen_emit_atomic_buffer_setup(struct r600_context *rctx,
struct r600_shader_atomic 
*combined_atomics,
uint8_t *atomic_used_mask_p)
@@ -4549,7 +4595,10 @@ bool evergreen_emit_atomic_buffer_setup(struct 
r600_context *rctx,
struct r600_resource *resource = 
r600_resource(astate->buffer[atomic->buffer_id].buffer);
assert(resource);
 
-   evergreen_emit_set_append_cnt(rctx, atomic, resource, 
pkt_flags);
+   if (rctx->b.chip_class == CAYMAN)
+   cayman_write_count_to_gds(rctx, atomic, resource, 
pkt_flags);
+   else
+   evergreen_emit_set_append_cnt(rctx, atomic, resource, 
pkt_flags);
}
*atomic_used_mask_p = atomic_used_mask;
return true;
@@ -4577,8 +4626,15 @@ void evergreen_emit_atomic_buffer_save(struct 
r600_context *rctx,
struct r600_resource *resource = 
r600_resource(astate->buffer[atomic->buffer_id].buffer);
assert(resource);
 
-   evergreen_emit_event_write_eos(rctx, atomic, resource, 
pkt_flags);
+   if (rctx->b.chip_class == CAYMAN)
+   cayman_read_count_from_gds(rctx, atomic, resource, 
pkt_flags);
+   else
+   

[Mesa-dev] [PATCH 1/2] r600/atomic: refactor out evergreen atomic setup/save code.

2017-11-30 Thread Dave Airlie
From: Dave Airlie 

For cayman we want to use different code paths.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/evergreen_state.c | 80 +++---
 1 file changed, 50 insertions(+), 30 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index a9982b5915..850165b30b 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -4455,6 +4455,53 @@ void eg_trace_emit(struct r600_context *rctx)
radeon_emit(cs, AC_ENCODE_TRACE_POINT(rctx->trace_id));
 }
 
+static void evergreen_emit_set_append_cnt(struct r600_context *rctx,
+ struct r600_shader_atomic *atomic,
+ struct r600_resource *resource,
+ uint32_t pkt_flags)
+{
+   struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
+   unsigned reloc = radeon_add_to_buffer_list(>b, >b.gfx,
+  resource,
+  RADEON_USAGE_READ,
+  
RADEON_PRIO_SHADER_RW_BUFFER);
+   uint64_t dst_offset = resource->gpu_address + (atomic->start * 4);
+   uint32_t base_reg_0 = R_02872C_GDS_APPEND_COUNT_0;
+
+   uint32_t reg_val = (base_reg_0 + atomic->hw_idx * 4 - 
EVERGREEN_CONTEXT_REG_OFFSET) >> 2;
+
+   radeon_emit(cs, PKT3(PKT3_SET_APPEND_CNT, 2, 0) | pkt_flags);
+   radeon_emit(cs, (reg_val << 16) | 0x3);
+   radeon_emit(cs, dst_offset & 0xfffc);
+   radeon_emit(cs, (dst_offset >> 32) & 0xff);
+   radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
+   radeon_emit(cs, reloc);
+}
+
+static void evergreen_emit_event_write_eos(struct r600_context *rctx,
+  struct r600_shader_atomic *atomic,
+  struct r600_resource *resource,
+  uint32_t pkt_flags)
+{
+   struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
+   uint32_t event = EVENT_TYPE_PS_DONE;
+   uint32_t base_reg_0 = R_02872C_GDS_APPEND_COUNT_0;
+   uint32_t reloc = radeon_add_to_buffer_list(>b, >b.gfx,
+  resource,
+  RADEON_USAGE_WRITE,
+  
RADEON_PRIO_SHADER_RW_BUFFER);
+   uint64_t dst_offset = resource->gpu_address + (atomic->start * 4);
+   uint32_t reg_val = (base_reg_0 + atomic->hw_idx * 4) >> 2;
+
+   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE_EOS, 3, 0) | pkt_flags);
+   radeon_emit(cs, EVENT_TYPE(event) | EVENT_INDEX(6));
+   radeon_emit(cs, (dst_offset) & 0x);
+   radeon_emit(cs, (0 << 29) | ((dst_offset >> 32) & 0xff));
+   radeon_emit(cs, reg_val);
+   radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
+   radeon_emit(cs, reloc);
+}
+
 bool evergreen_emit_atomic_buffer_setup(struct r600_context *rctx,
struct r600_shader_atomic 
*combined_atomics,
uint8_t *atomic_used_mask_p)
@@ -4501,21 +4548,8 @@ bool evergreen_emit_atomic_buffer_setup(struct 
r600_context *rctx,
struct r600_shader_atomic *atomic = 
_atomics[atomic_index];
struct r600_resource *resource = 
r600_resource(astate->buffer[atomic->buffer_id].buffer);
assert(resource);
-   unsigned reloc = radeon_add_to_buffer_list(>b, 
>b.gfx,
-  resource,
-  RADEON_USAGE_READ,
-  
RADEON_PRIO_SHADER_RW_BUFFER);
-   uint64_t dst_offset = resource->gpu_address + (atomic->start * 
4);
-   uint32_t base_reg_0 = R_02872C_GDS_APPEND_COUNT_0;
-
-   uint32_t reg_val = (base_reg_0 + atomic->hw_idx * 4 - 
EVERGREEN_CONTEXT_REG_OFFSET) >> 2;
-
-   radeon_emit(cs, PKT3(PKT3_SET_APPEND_CNT, 2, 0) | pkt_flags);
-   radeon_emit(cs, (reg_val << 16) | 0x3);
-   radeon_emit(cs, dst_offset & 0xfffc);
-   radeon_emit(cs, (dst_offset >> 32) & 0xff);
-   radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
-   radeon_emit(cs, reloc);
+
+   evergreen_emit_set_append_cnt(rctx, atomic, resource, 
pkt_flags);
}
*atomic_used_mask_p = atomic_used_mask;
return true;
@@ -4543,21 +4577,7 @@ void evergreen_emit_atomic_buffer_save(struct 
r600_context *rctx,
struct r600_resource *resource = 
r600_resource(astate->buffer[atomic->buffer_id].buffer);
assert(resource);
 
-   uint32_t base_reg_0 = R_02872C_GDS_APPEND_COUNT_0;
-   reloc = 

[Mesa-dev] r600: cayman atomic gds support

2017-11-30 Thread Dave Airlie
There appears to be some bad interaction with the append/consume counters
on cayman (and compute shaders at least). I traced fglrx and it appears
it directly uses GDS memory.

This adds cayman specific paths to directly use GDS memory for these
atomics.

Dave.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 00/25] Initial gl_spirv and spirv_extensions support in Mesa and i965

2017-11-30 Thread Timothy Arceri

On 01/12/17 04:28, Eduardo Lima Mitev wrote:

Hello,

This is the second version of the series providing initial support for 
ARB_gl_spirv and ARB_spirv_extensions in Mesa and i965.

First version of the series can be found at 
.

In this series we hope we have addressed all issues detected during the initial 
review. Thank you all who participated!

Taking the nitpicks and minor fixes apart, most important changes compared to 
the first version are:

* A dedicated 'spirv' flag was removed from gl_shader struct. Now we use the 
nulness of 'spirv_data' member for the same purpose.

* The per-program 'spirv' flag was moved out of this series, but will likely be 
re-introduced in the next delivery, because it will become necessary.

* We enforce one SPIR-V shader per stage, and fail linking if this condition is 
not met.


Sorry can you point me to the patch that contains this I couldn't find 
it when skimming over the series. Thanks.




* 'SpirVCapabilities' struct of GL context constants is no longer a pointer but 
a static struct.

As usual, a tree of this series can be found at 
.

A tree of the larger WIP branch from which this series is taken: 
.

Thanks in advance for the reviews!

cheers,
Eduardo

Alejandro Piñeiro (9):
   spirv_extensions: rename nir_spirv_supported_extensions
   mesa: move nir_spirv_supported_capabilities definition
   i965: initialize SPIR-V capabilities
   spirv_extensions: add GL_ARB_spirv_extensions boilerplate
   spirv_extensions: add list of extensions and to_string method
   spirv_extensions: define spirv_extensions_supported
   spirv_extensions: add spirv_supported_extensions on gl_constants
   spirv_extensions: i965: initialize SPIR-V extensions
   nir/spirv: add gl_spirv_validation method

Eduardo Lima Mitev (8):
   mesa/glspirv: Add struct gl_shader_spirv_data
   mesa/glspirv: Add a _mesa_spirv_link_shaders() placeholder
   mesa/program: Link SPIR-V shaders using the SPIR-V code-path
   mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader
   mesa/glspirv: Create gl_linked_shader objects for a SPIR-V program
   mesa/glspirv: Add a _mesa_spirv_to_nir() function
   i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders
   i965: Don't call process_glsl_ir() for SPIR-V shaders

Neil Roberts (1):
   mesa: Add boilerplate for the GL 4.6 alias of glSpecializeShaderARB

Nicolai Hähnle (7):
   mesa: add GL_ARB_gl_spirv boilerplate
   mesa/glspirv: Add struct gl_spirv_module
   mesa: implement SPIR-V loading in glShaderBinary
   mesa/shaderapi: add a getter for GL_SPIR_V_BINARY_ARB
   mesa: refuse to compile SPIR-V shaders or link mixed shaders
   mesa: add gl_constants::SpirVCapabilities
   mesa: Implement glSpecializeShaderARB

  src/amd/vulkan/radv_shader.c|   4 +-
  src/compiler/Makefile.sources   |   2 +
  src/compiler/spirv/nir_spirv.h  |  21 +-
  src/compiler/spirv/spirv_extensions.c   |  77 +++
  src/compiler/spirv/spirv_extensions.h   |  63 ++
  src/compiler/spirv/spirv_to_nir.c   | 160 +-
  src/compiler/spirv/vtn_private.h|   2 +-
  src/intel/vulkan/anv_pipeline.c |   4 +-
  src/mapi/glapi/gen/ARB_gl_spirv.xml |  21 ++
  src/mapi/glapi/gen/ARB_spirv_extensions.xml |  13 ++
  src/mapi/glapi/gen/GL4x.xml |  11 +
  src/mapi/glapi/gen/Makefile.am  |   2 +
  src/mapi/glapi/gen/gl_API.xml   |   8 +
  src/mapi/glapi/gen/gl_genexec.py|   1 +
  src/mapi/glapi/gen/meson.build  |   2 +
  src/mesa/Makefile.sources   |   4 +
  src/mesa/drivers/dri/i965/brw_context.c |  26 +++
  src/mesa/drivers/dri/i965/brw_link.cpp  |   3 +-
  src/mesa/drivers/dri/i965/brw_program.c |  14 +-
  src/mesa/main/context.c |   2 +
  src/mesa/main/extensions_table.h|   2 +
  src/mesa/main/get.c |   7 +
  src/mesa/main/get_hash_params.py|   3 +
  src/mesa/main/getstring.c   |  12 +
  src/mesa/main/glspirv.c | 331 
  src/mesa/main/glspirv.h | 108 +
  src/mesa/main/mtypes.h  |  31 +++
  src/mesa/main/shaderapi.c   |  60 -
  src/mesa/main/shaderobj.c   |   3 +
  src/mesa/main/spirv_extensions.c|  60 +
  src/mesa/main/spirv_extensions.h|  49 
  src/mesa/main/tests/dispatch_sanity.cpp |   3 +
  src/mesa/meson.build|   4 +
  src/mesa/program/ir_to_mesa.cpp |  23 +-
  34 files changed, 1098 insertions(+), 38 deletions(-)
  create mode 100644 src/compiler/spirv/spirv_extensions.c
  create mode 100644 

Re: [Mesa-dev] [PATCH v2 24/25] i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders

2017-11-30 Thread Timothy Arceri



On 01/12/17 04:28, Eduardo Lima Mitev wrote:

This is the main fork of the shader compilation code-path, where a NIR
shader is obtained by calling spirv_to_nir() or glsl_to_nir(),
depending on its nature..

v2: Use 'spirv_data' member from gl_linked_shader to know which method
to call. (Timothy Arceri)
---
  src/mesa/drivers/dri/i965/brw_program.c | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 755d4973cc0..596118f2fe5 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -31,6 +31,7 @@
  
  #include 

  #include "main/imports.h"
+#include "main/glspirv.h"
  #include "program/prog_parameter.h"
  #include "program/prog_print.h"
  #include "program/prog_to_nir.h"
@@ -73,9 +74,18 @@ brw_create_nir(struct brw_context *brw,
ctx->Const.ShaderCompilerOptions[stage].NirOptions;
 nir_shader *nir;
  
-   /* First, lower the GLSL IR or Mesa IR to NIR */

+   /* First, lower the GLSL/Mesa IR or SPIR-V to NIR */
 if (shader_prog) {
-  nir = glsl_to_nir(shader_prog, stage, options);
+  bool is_spirv_shader =
+ (shader_prog->_LinkedShaders[stage]->spirv_data != NULL);
+
+  if (!is_spirv_shader) {
+ nir = glsl_to_nir(shader_prog, stage, options);
+  } else {
+ nir = _mesa_spirv_to_nir(ctx, shader_prog, stage, options);
+  }
+  assert (nir);


Rather than messing around with bools, null checks and !'s I'd just 
change this to:


  /* First, lower the GLSL/Mesa IR or SPIR-V to NIR */
  if (shader_prog->_LinkedShaders[stage]->spirv_data) {
 nir = _mesa_spirv_to_nir(ctx, shader_prog, stage, options);
  } else {
 nir = glsl_to_nir(shader_prog, stage, options);
  }
  assert (nir);



+
nir_remove_dead_variables(nir, nir_var_shader_in | nir_var_shader_out);
nir_lower_returns(nir);
nir_validate_shader(nir);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] anv: Handle nir_intrinsic_vulkan_resource_reindex

2017-11-30 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c 
b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
index f8d8164..612b3f7 100644
--- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
+++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
@@ -129,6 +129,25 @@ lower_res_index_intrinsic(nir_intrinsic_instr *intrin,
 }
 
 static void
+lower_res_reindex_intrinsic(nir_intrinsic_instr *intrin,
+struct apply_pipeline_layout_state *state)
+{
+   nir_builder *b = >builder;
+
+   /* For us, the resource indices are just indices into the binding table and
+* array elements are sequential.  A resource_reindex just turns into an
+* add of the two indices.
+*/
+   assert(intrin->src[0].is_ssa && intrin->src[0].is_ssa);
+   nir_ssa_def *new_index = nir_iadd(b, intrin->src[0].ssa,
+intrin->src[1].ssa);
+
+   assert(intrin->dest.is_ssa);
+   nir_ssa_def_rewrite_uses(>dest.ssa, nir_src_for_ssa(new_index));
+   nir_instr_remove(>instr);
+}
+
+static void
 lower_tex_deref(nir_tex_instr *tex, nir_deref_var *deref,
 unsigned *const_index, unsigned array_size,
 nir_tex_src_type src_type, bool allow_indirect,
@@ -265,8 +284,15 @@ apply_pipeline_layout_block(nir_block *block,
   switch (instr->type) {
   case nir_instr_type_intrinsic: {
  nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
- if (intrin->intrinsic == nir_intrinsic_vulkan_resource_index) {
+ switch (intrin->intrinsic) {
+ case nir_intrinsic_vulkan_resource_index:
 lower_res_index_intrinsic(intrin, state);
+break;
+ case nir_intrinsic_vulkan_resource_reindex:
+lower_res_reindex_intrinsic(intrin, state);
+break;
+ default:
+break;
  }
  break;
   }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] spirv: Allow OpPtrAccessChain for block indices

2017-11-30 Thread Jason Ekstrand
The SPIR-V spec is a bit underspecified when it comes to exactly how
you're allowed to use OpPtrAccessChain and what it means in certain edge
cases.  In particular, what if the base pointer of the OpPtrAccessChain
points to the base struct of an SSBO instead of an element in that SSBO.
The original variable pointers implementation in mesa assumed that you
weren't allowed to do an OpPtrAccessChain that adjusted the block index
and asserted such.  However, there are some CTS tests that do this and,
if the CTS does it, someone will do it in the wild so we should probably
handle it.  With this commit, we significantly reduce our assumptions
and should be able to handle more-or-less anything.

The one assumption we still make for correctness is that if we see an
OpPtrAccessChain on a pointer to a struct decorated block that the block
index should be adjusted.  In theory, someone could try to put an array
stride on such a pointer and try to make the SSBO an implicit array of
the base struct and we would not give them what they want.  That said,
any index other than 0 would count as an out-of-bounds access which is
invalid.
---
 src/compiler/spirv/vtn_variables.c | 128 -
 1 file changed, 83 insertions(+), 45 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 09b3d35..89ce939 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -157,6 +157,22 @@ vtn_variable_resource_index(struct vtn_builder *b, struct 
vtn_variable *var,
return >dest.ssa;
 }
 
+static nir_ssa_def *
+vtn_resource_reindex(struct vtn_builder *b, nir_ssa_def *base_index,
+ nir_ssa_def *offset_index)
+{
+   nir_intrinsic_instr *instr =
+  nir_intrinsic_instr_create(b->nb.shader,
+ nir_intrinsic_vulkan_resource_reindex);
+   instr->src[0] = nir_src_for_ssa(base_index);
+   instr->src[1] = nir_src_for_ssa(offset_index);
+
+   nir_ssa_dest_init(>instr, >dest, 1, 32, NULL);
+   nir_builder_instr_insert(>nb, >instr);
+
+   return >dest.ssa;
+}
+
 static struct vtn_pointer *
 vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
struct vtn_pointer *base,
@@ -167,47 +183,61 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
struct vtn_type *type = base->type;
 
unsigned idx = 0;
-   if (deref_chain->ptr_as_array) {
-  /* We need ptr_type for the stride */
-  assert(base->ptr_type);
-  /* This must be a pointer to an actual element somewhere */
-  assert(offset);
-  assert(block_index || base->mode == vtn_variable_mode_workgroup);
-  /* We need at least one element in the chain */
-  assert(deref_chain->length >= 1);
+   if (base->mode == vtn_variable_mode_ubo ||
+   base->mode == vtn_variable_mode_ssbo) {
+  if (!block_index) {
+ assert(base->var && base->type);
+ nir_ssa_def *desc_arr_idx;
+ if (glsl_type_is_array(type->type)) {
+if (deref_chain->length >= 1) {
+   desc_arr_idx =
+  vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
+   idx++;
+   /* This consumes a level of type */
+   type = type->array_element;
+} else {
+   /* This is annoying.  We've been asked for a pointer to the
+* array of UBOs/SSBOs and not a specifc buffer.  Return a
+* pointer with a descriptor index of 0 and we'll have to do
+* a reindex later to adjust it to the right thing.
+*/
+   desc_arr_idx = nir_imm_int(>nb, 0);
+}
+ } else if (deref_chain->ptr_as_array) {
+/* You can't have a zero-length OpPtrAccessChain */
+assert(deref_chain->length >= 1);
+desc_arr_idx = vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
+ } else {
+/* We have a regular non-array SSBO. */
+desc_arr_idx = NULL;
+ }
+ block_index = vtn_variable_resource_index(b, base->var, desc_arr_idx);
+  } else if (deref_chain->ptr_as_array &&
+ type->base_type == vtn_base_type_struct && type->block) {
+ /* We are doing an OpPtrAccessChain on a pointer to a struct
+  * decorated block.  Do a reindex to add the first link in the
+  * OpPtrAccessChain to the index we received.
+  */
+ assert(deref_chain->length >= 1);
+ nir_ssa_def *offset_index =
+vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
+ idx++;
 
-  nir_ssa_def *elem_offset =
- vtn_access_link_as_ssa(b, deref_chain->link[idx],
-base->ptr_type->stride);
-  offset = nir_iadd(>nb, offset, elem_offset);
-  idx++;
+ block_index = vtn_resource_reindex(b, block_index, offset_index);
+  }
}
 
if (!offset) {
-  /* This is 

[Mesa-dev] [PATCH 1/3] nir: Add a vulkan_resource_reindex intrinsic

2017-11-30 Thread Jason Ekstrand
This is required for being able to handle OpPtrAccessChain in SPIR-V
where the base type of the incoming pointer requires us to add to the
block index instead of the byte offset.
---
 src/compiler/nir/nir_intrinsics.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index 20bef33..ccf8e06 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -210,7 +210,7 @@ INTRINSIC(image_samples, 0, ARR(0), true, 1, 1, 0, xx, xx, 
xx,
   NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 
 /*
- * Vulkan descriptor set intrinsic
+ * Vulkan descriptor set intrinsics
  *
  * The Vulkan API uses a different binding model from GL.  In the Vulkan
  * API, all external resources are represented by a tuple:
@@ -224,10 +224,17 @@ INTRINSIC(image_samples, 0, ARR(0), true, 1, 1, 0, xx, 
xx, xx,
  *
  * The intended usage is that the shader will call vulkan_surface_index to
  * get an index and then pass that as the buffer index ubo/ssbo calls.
+ *
+ * The vulkan_resource_reindex intrinsic takes a resource index in src0
+ * (the result of a vulkan_resource_index or vulkan_resource_reindex) which
+ * corresponds to the tuple (set, binding, index) and computes an index
+ * corresponding to tuple (set, binding, idx + src1).
  */
 INTRINSIC(vulkan_resource_index, 1, ARR(1), true, 1, 0, 2,
   DESC_SET, BINDING, xx,
   NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+INTRINSIC(vulkan_resource_reindex, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx,
+  NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 
 /*
  * variable atomic intrinsics
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 12/32] radv: Move wsi initialization later in physical_device_init

2017-11-30 Thread Chad Versace
On Tue 28 Nov 2017, Jason Ekstrand wrote:
> We need it to happen after memory type setup so that we can query memory
> types in wsi_device_init.
> ---
>  src/amd/vulkan/radv_device.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)

Patches 10-12 are
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 07/32] radv: Implement VK_EXT_external_memory_dma_buf

2017-11-30 Thread Chad Versace
On Tue 28 Nov 2017, Jason Ekstrand wrote:
> ---
>  src/amd/vulkan/radv_device.c  | 8 ++--
>  src/amd/vulkan/radv_extensions.py | 1 +
>  src/amd/vulkan/radv_formats.c | 8 ++--
>  3 files changed, 13 insertions(+), 4 deletions(-)

With the changes in your wip/vulkan-wsi-prime branch, patches 6 & 7 are
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 05/32] vulkan/wsi: Add a mock image creation extension

2017-11-30 Thread Chad Versace
On Tue 28 Nov 2017, Jason Ekstrand wrote:
> ---
>  src/vulkan/wsi/wsi_common.h | 18 ++
>  1 file changed, 18 insertions(+)

Patches 1-5 are
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 13/32] vulkan/wsi: Implement prime in a completely generic way

2017-11-30 Thread Chad Versace
On Tue 28 Nov 2017, Jason Ekstrand wrote:
> ---
>  src/amd/vulkan/radv_wsi.c   | 137 +++
>  src/intel/vulkan/anv_wsi.c  |  14 +-
>  src/vulkan/wsi/wsi_common.c | 341 
> +++-
>  src/vulkan/wsi/wsi_common.h |  54 +-
>  src/vulkan/wsi/wsi_common_private.h |  16 ++
>  src/vulkan/wsi/wsi_common_wayland.c |   6 +-
>  src/vulkan/wsi/wsi_common_x11.c |  87 +
>  7 files changed, 475 insertions(+), 180 deletions(-)
> 
> diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
> index 247f7cc..589eb5c 100644
> --- a/src/amd/vulkan/radv_wsi.c
> +++ b/src/amd/vulkan/radv_wsi.c
> @@ -40,6 +40,13 @@ radv_wsi_proc_addr(VkPhysicalDevice physicalDevice, const 
> char *pName)
>   return radv_lookup_entrypoint(pName);
>  }
>  
> +static uint32_t
> +anv_wsi_queue_get_family_index(VkQueue _queue)

This function should have the 'radv' prefix, not 'anv'.

Other than that, this patch is
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/gen10: Change the order of PIPE_CONTROL and load register.

2017-11-30 Thread Kenneth Graunke
On Thursday, November 30, 2017 4:53:44 PM PST Rafael Antognolli wrote:
> I believe the workaround describes that the MI_LOAD_REGISTER_IMM should
> come right after the 3DSTATE_SAMPLE_PATTERN.
> 
> Signed-off-by: Rafael Antognolli 
> Cc: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen8_multisample_state.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c 
> b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> index 9f849d64bbc..904e0fee2e5 100644
> --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> @@ -57,15 +57,15 @@ gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context 
> *brw)
> const struct gen_device_info *devinfo = >screen->devinfo;
> assert(devinfo->gen == 10);
>  
> +   /* Write to CACHE_MODE_0 (0x7000) */
> +   brw_load_register_imm32(brw, GEN7_CACHE_MODE_0, 0);
> +
> /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must
>  * be idle; i.e., full flush is required.
>  */
> brw_emit_pipe_control_flush(brw,
> PIPE_CONTROL_CACHE_FLUSH_BITS |
> PIPE_CONTROL_CACHE_INVALIDATE_BITS);
> -
> -   /* Write to CACHE_MODE_0 (0x7000) */
> -   brw_load_register_imm32(brw, GEN7_CACHE_MODE_0, 0);
>  }
>  
>  /**
> 

I would include in the commit message:

This fixes GPU hangs in the i965 initial state batchbuffer when running
some Piglit tests with always_flush_batch=true.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] Implement WaClearTDRRegBeforeEOTForNonPS.

2017-11-30 Thread Kenneth Graunke
On Thursday, November 30, 2017 4:42:48 PM PST Rafael Antognolli wrote:
> The bspec describes:
> 
>"WA: Clear tdr register before send EOT in all non-PS shader kernels
> 
>mov(8) tdr0:ud 0x0:ud {NoMask}"
> 
> Signed-off-by: Rafael Antognolli 
> ---
>  src/intel/compiler/brw_fs_generator.cpp | 7 +++
>  src/intel/compiler/brw_reg.h| 6 ++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_fs_generator.cpp 
> b/src/intel/compiler/brw_fs_generator.cpp
> index 28790c86a64..78aa764fe73 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -573,6 +573,13 @@ fs_generator::generate_urb_write(fs_inst *inst, struct 
> brw_reg payload)
>  {
> brw_inst *insn;
>  
> +   if (inst->eot) {
> +  brw_push_insn_state(p);
> +  brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> +  brw_MOV(p, brw_tdr_reg(), brw_imm_uw(0));
> +  brw_pop_insn_state(p);
> +   }
> +
> insn = brw_next_insn(p, BRW_OPCODE_SEND);
>  
> brw_set_dest(p, insn, brw_null_reg());
> diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
> index ec1045b612a..a039c6f676c 100644
> --- a/src/intel/compiler/brw_reg.h
> +++ b/src/intel/compiler/brw_reg.h
> @@ -774,6 +774,12 @@ brw_address_reg(unsigned subnr)
> return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ADDRESS, 
> subnr);
>  }
>  
> +static inline struct brw_reg
> +brw_tdr_reg(void)
> +{
> +   return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_TDR, 0);
> +}
> +
>  /* If/else instructions break in align16 mode if writemask & swizzle
>   * aren't xyzw.  This goes against the convention for other scalar
>   * regs:
> 

Still not clear whether this is needed for compute shaders or not.

Looks good for the other stages.

My gut feeling is that this only matters if you're doing preemption
(possibly mid-draw preemption), but I have no data to back that up.
We may as well do it...

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: emit 3DSTATE_MULTISAMPLE more often.

2017-11-30 Thread Kenneth Graunke
On Thursday, November 30, 2017 4:42:47 PM PST Rafael Antognolli wrote:
> On CNL, we see multiple multisample failures on piglit tests. By
> emitting this extra state, though not documented in the bspec, those
> failures seem to go away.
> 
> This workaround could be removed if we ever find out a better solution,
> but it should be good enough for now.
> 
> Signed-off-by: Rafael Antognolli 
> Cc: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 8e500d3d285..bcfa3ee1bd8 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -3298,7 +3298,8 @@ genX(upload_multisample_state)(struct brw_context *brw)
>  
>  static const struct brw_tracked_state genX(multisample_state) = {
> .dirty = {
> -  .mesa = _NEW_MULTISAMPLE,
> +  .mesa = _NEW_MULTISAMPLE |
> +  (GEN_GEN == 10) ? _NEW_BUFFERS : 0,

I can't find any information at all about why this would be required,
but the other driver appears to re-emit 3DSTATE_MULTISAMPLE when
changing render targets, so it's possible they haven't run into this
issue.  Let's go with this for now.

(GEN_GEN == 10 ? _NEW_BUFFERS : 0),

Reviewed-by: Kenneth Graunke 

>.brw = BRW_NEW_BLORP |
>   BRW_NEW_CONTEXT |
>   BRW_NEW_NUM_SAMPLES,
> 



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/gen10: Change the order of PIPE_CONTROL and load register.

2017-11-30 Thread Rafael Antognolli
I believe the workaround describes that the MI_LOAD_REGISTER_IMM should
come right after the 3DSTATE_SAMPLE_PATTERN.

Signed-off-by: Rafael Antognolli 
Cc: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_multisample_state.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c 
b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
index 9f849d64bbc..904e0fee2e5 100644
--- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
@@ -57,15 +57,15 @@ gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context 
*brw)
const struct gen_device_info *devinfo = >screen->devinfo;
assert(devinfo->gen == 10);
 
+   /* Write to CACHE_MODE_0 (0x7000) */
+   brw_load_register_imm32(brw, GEN7_CACHE_MODE_0, 0);
+
/* Before changing the value of CACHE_MODE_0 register, GFX pipeline must
 * be idle; i.e., full flush is required.
 */
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_CACHE_FLUSH_BITS |
PIPE_CONTROL_CACHE_INVALIDATE_BITS);
-
-   /* Write to CACHE_MODE_0 (0x7000) */
-   brw_load_register_imm32(brw, GEN7_CACHE_MODE_0, 0);
 }
 
 /**
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] Implement WaClearTDRRegBeforeEOTForNonPS.

2017-11-30 Thread Rafael Antognolli
It looks like I forgot to prefix the subject with "intel/compiler:".
Fixed locally.

On Thu, Nov 30, 2017 at 04:42:48PM -0800, Rafael Antognolli wrote:
> The bspec describes:
> 
>"WA: Clear tdr register before send EOT in all non-PS shader kernels
> 
>mov(8) tdr0:ud 0x0:ud {NoMask}"
> 
> Signed-off-by: Rafael Antognolli 
> ---
>  src/intel/compiler/brw_fs_generator.cpp | 7 +++
>  src/intel/compiler/brw_reg.h| 6 ++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_fs_generator.cpp 
> b/src/intel/compiler/brw_fs_generator.cpp
> index 28790c86a64..78aa764fe73 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -573,6 +573,13 @@ fs_generator::generate_urb_write(fs_inst *inst, struct 
> brw_reg payload)
>  {
> brw_inst *insn;
>  
> +   if (inst->eot) {
> +  brw_push_insn_state(p);
> +  brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> +  brw_MOV(p, brw_tdr_reg(), brw_imm_uw(0));
> +  brw_pop_insn_state(p);
> +   }
> +
> insn = brw_next_insn(p, BRW_OPCODE_SEND);
>  
> brw_set_dest(p, insn, brw_null_reg());
> diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
> index ec1045b612a..a039c6f676c 100644
> --- a/src/intel/compiler/brw_reg.h
> +++ b/src/intel/compiler/brw_reg.h
> @@ -774,6 +774,12 @@ brw_address_reg(unsigned subnr)
> return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ADDRESS, 
> subnr);
>  }
>  
> +static inline struct brw_reg
> +brw_tdr_reg(void)
> +{
> +   return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_TDR, 0);
> +}
> +
>  /* If/else instructions break in align16 mode if writemask & swizzle
>   * aren't xyzw.  This goes against the convention for other scalar
>   * regs:
> -- 
> 2.13.6
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] Implement WaClearTDRRegBeforeEOTForNonPS.

2017-11-30 Thread Rafael Antognolli
The bspec describes:

   "WA: Clear tdr register before send EOT in all non-PS shader kernels

   mov(8) tdr0:ud 0x0:ud {NoMask}"

Signed-off-by: Rafael Antognolli 
---
 src/intel/compiler/brw_fs_generator.cpp | 7 +++
 src/intel/compiler/brw_reg.h| 6 ++
 2 files changed, 13 insertions(+)

diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index 28790c86a64..78aa764fe73 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -573,6 +573,13 @@ fs_generator::generate_urb_write(fs_inst *inst, struct 
brw_reg payload)
 {
brw_inst *insn;
 
+   if (inst->eot) {
+  brw_push_insn_state(p);
+  brw_set_default_mask_control(p, BRW_MASK_DISABLE);
+  brw_MOV(p, brw_tdr_reg(), brw_imm_uw(0));
+  brw_pop_insn_state(p);
+   }
+
insn = brw_next_insn(p, BRW_OPCODE_SEND);
 
brw_set_dest(p, insn, brw_null_reg());
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index ec1045b612a..a039c6f676c 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -774,6 +774,12 @@ brw_address_reg(unsigned subnr)
return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ADDRESS, subnr);
 }
 
+static inline struct brw_reg
+brw_tdr_reg(void)
+{
+   return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_TDR, 0);
+}
+
 /* If/else instructions break in align16 mode if writemask & swizzle
  * aren't xyzw.  This goes against the convention for other scalar
  * regs:
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965: emit 3DSTATE_MULTISAMPLE more often.

2017-11-30 Thread Rafael Antognolli
On CNL, we see multiple multisample failures on piglit tests. By
emitting this extra state, though not documented in the bspec, those
failures seem to go away.

This workaround could be removed if we ever find out a better solution,
but it should be good enough for now.

Signed-off-by: Rafael Antognolli 
Cc: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8e500d3d285..bcfa3ee1bd8 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -3298,7 +3298,8 @@ genX(upload_multisample_state)(struct brw_context *brw)
 
 static const struct brw_tracked_state genX(multisample_state) = {
.dirty = {
-  .mesa = _NEW_MULTISAMPLE,
+  .mesa = _NEW_MULTISAMPLE |
+  (GEN_GEN == 10) ? _NEW_BUFFERS : 0,
   .brw = BRW_NEW_BLORP |
  BRW_NEW_CONTEXT |
  BRW_NEW_NUM_SAMPLES,
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: Enable vaExportSurfaceHandle()

2017-11-30 Thread Mark Thompson
It will be present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson 
---
See:



Also enabled in mpv:


There are some other driver functions added in this new version:
* MFContext (multi-frame) stuff exists for lock-step processing of multiple 
streams.  As far as I can tell, it is only of value for server transcode 
setups, and probably has little benefit when encode is already asynchronous 
(which it isn't in the Intel driver).
* CreateBuffer2 is for passing 2D buffers to/from the driver.  Nothing uses it 
yet.
* QueryProcessingRate is for querying expected performance.  It might be 
sensible to implement, but would need more hardware information than I have to 
make the necessary tables.
All of them are left as NULL.

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c | 8 +++-
 src/gallium/state_trackers/va/surface.c | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 78e1f19ab7..c4abe77cf7 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -89,7 +89,13 @@ static struct VADriverVTable vtable =
,
,
,
-#if 0
+#if VA_CHECK_VERSION(1, 1, 0)
+   NULL, /* vaCreateMFContext */
+   NULL, /* vaMFAddContext */
+   NULL, /* vaMFReleaseContext */
+   NULL, /* vaMFSubmit */
+   NULL, /* vaCreateBuffer2 */
+   NULL, /* vaQueryProcessingRate */
,
 #endif
 };
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 636505b720..f9412ce52e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -923,7 +923,7 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
return VA_STATUS_SUCCESS;
 }
 
-#if 0
+#if VA_CHECK_VERSION(1, 1, 0)
 VAStatus
 vlVaExportSurfaceHandle(VADriverContextP ctx,
 VASurfaceID surface_id,
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests

2017-11-30 Thread Kenneth Graunke
On Thursday, November 30, 2017 1:11:28 PM PST Neil Roberts wrote:
> Kenneth Graunke  writes:
> 
> > Neil, in case you were wondering, I suggested the above organization
> > of vertex elements because it would let us only upload 1 in the common
> > case. Looking in shader-db, there are 3579 shaders that use
> > gl_InstanceID, 186 shaders that use gl_VertexID, and 0 shaders that
> > use gl_BaseVertex, gl_BaseInstance, or gl_DrawID.
> >
> > It looks like your patches kicked gl_BaseVertex off to VE2 instead of
> > gl_BaseInstance. I suppose that works too, I just figured that keeping
> > VertexID/FirstVertex/BaseVertex together would make sense. If it's
> > more convenient to move gl_BaseVertex, I suppose I'm fine with that
> > too...
> 
> Yes, Antia forwarded me your emails with the previous suggestion. The
> order we came up with for this patch series is:
> 
> VE 1: 
> VE 2: 
> 
> The “offset for vertex ID” corresponds to what we were previously
> (incorrectly) sending for gl_BaseVertex, so effectively the values in
> VE1 have not changed at all.

Right...that's what I thought was going to save us from intermediate
breakage.  But it looked like you were uploading the new "offset for
vertex ID" field in VE1.x only when (system_values_read &
SYSTEM_VALUE_BASE_VERTEX_ID) != 0...which would be false...so we'd
effectively stop uploading it.

If we kept uploading it, it'd probably be fine...

> This also means we can continue to directly
> take these two values from the indirect draw params buffer. I believe
> your previous suggestion ended up needing a register store to put the
> values in the right place so with this ordering we get to avoid that.
> VE1 now contains everything needed for the most common values VertexID
> and InstanceID. VE2 is only needed for the rare cases that use gl_DrawID
> or gl_BaseVertex. On Vulkan VE2 won’t even be needed for the BaseVertex
> builtin because the semantics are different and in that case it
> corresponds to “offset for vertex ID”.

Cool, that sounds reasonable.  Thanks for explaining.  Perhaps you could
put the new order of VE's in the commit message for the patch that
changes them?

> I haven’t looked into too much detail about reordering the patches yet,
> but it does sound reasonable to try and avoid intermediate breakages.
> 
> Thanks for looking at the series.
> 
> Regards,
> - Neil



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 02/25] mesa: Add boilerplate for the GL 4.6 alias of glSpecializeShaderARB

2017-11-30 Thread Ian Romanick
I'd squash this in with the previous patch.

Reviewed-by: Ian Romanick 

On 11/30/2017 09:28 AM, Eduardo Lima Mitev wrote:
> From: Neil Roberts 
> 
> ---
>  src/mapi/glapi/gen/GL4x.xml | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
> index 88dba5cd71a..0a8094166c8 100644
> --- a/src/mapi/glapi/gen/GL4x.xml
> +++ b/src/mapi/glapi/gen/GL4x.xml
> @@ -73,6 +73,17 @@
>  
>
>
> +
> +  
> +  
> +
> +  
> +
> +
> +
> +
> +
> +  
>  
>  
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 05/25] mesa: implement SPIR-V loading in glShaderBinary

2017-11-30 Thread Ian Romanick
Two nits below...

On 11/30/2017 09:28 AM, Eduardo Lima Mitev wrote:
> From: Nicolai Hähnle 
> 
> v2: * Add a gl_shader_spirv_data member to gl_shader, which already
>encapsulates a gl_spirv_module where the binary will be saved.
>(Eduardo Lima)
> 
> * Just use the 'spirv_data' member to know whether a gl_shader has
>the SPIR_V_BINARY_ARB state. (Timothy Arceri)
> 
> * Remove redundant argument checks. Move extension presence check
>to API entry point where the rest of checks are. Retype 'n' and
>'length'arguments to use the correct and more standard types.
>(Ian Romanick)
> ---
>  src/mesa/main/glspirv.c   | 43 +++
>  src/mesa/main/glspirv.h   |  5 +
>  src/mesa/main/mtypes.h|  4 
>  src/mesa/main/shaderapi.c | 45 ++---
>  src/mesa/main/shaderobj.c |  2 ++
>  5 files changed, 96 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
> index 8d1e652e088..d2e76bb1927 100644
> --- a/src/mesa/main/glspirv.c
> +++ b/src/mesa/main/glspirv.c
> @@ -25,6 +25,8 @@
>  
>  #include "errors.h"
>  
> +#include "errors.h"
> +
>  #include "util/u_atomic.h"
>  
>  void
> @@ -59,6 +61,47 @@ _mesa_shader_spirv_data_reference(struct 
> gl_shader_spirv_data **dest,
>p_atomic_inc(>RefCount);
>  }
>  
> +void
> +_mesa_spirv_shader_binary(struct gl_context *ctx,
> +  unsigned n, struct gl_shader **shaders,
> +  const void* binary, size_t length)
> +{
> +   struct gl_spirv_module *module;
> +   struct gl_shader_spirv_data *spirv_data;
> +
> +   assert(length >= 0);
> +
> +   module = malloc(sizeof(*module) + (size_t)length);

Don't need the (size_t) because you made length be size_t. :)

> +   if (!module) {
> +  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glShaderBinary");
> +  return;
> +   }
> +
> +   p_atomic_set(>RefCount, 0);
> +   module->Length = length;
> +   memcpy(>Binary[0], binary, length);
> +
> +   for (int i = 0; i < n; ++i) {
> +  struct gl_shader *sh = shaders[i];
> +
> +  spirv_data = rzalloc(NULL, struct gl_shader_spirv_data);
> +  _mesa_shader_spirv_data_reference(>spirv_data, spirv_data);
> +  _mesa_spirv_module_reference(_data->SpirVModule, module);
> +
> +  sh->CompileStatus = compile_failure;
> +
> +  free((void *)sh->Source);
> +  sh->Source = NULL;
> +  free((void *)sh->FallbackSource);
> +  sh->FallbackSource = NULL;
> +
> +  ralloc_free(sh->ir);
> +  sh->ir = NULL;
> +  ralloc_free(sh->symbols);
> +  sh->symbols = NULL;
> +   }
> +}
> +
>  void GLAPIENTRY
>  _mesa_SpecializeShaderARB(GLuint shader,
>const GLchar *pEntryPoint,
> diff --git a/src/mesa/main/glspirv.h b/src/mesa/main/glspirv.h
> index b8a0125ea9f..ba281f68bef 100644
> --- a/src/mesa/main/glspirv.h
> +++ b/src/mesa/main/glspirv.h
> @@ -71,6 +71,11 @@ void
>  _mesa_shader_spirv_data_reference(struct gl_shader_spirv_data **dest,
>struct gl_shader_spirv_data *src);
>  
> +void
> +_mesa_spirv_shader_binary(struct gl_context *ctx,
> +  unsigned n, struct gl_shader **shaders,
> +  const void* binary, size_t length);
> +
>  /**
>   * \name API functions
>   */
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 062eea609c7..50a47e0a65d 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -98,6 +98,7 @@ struct st_context;
>  struct gl_uniform_storage;
>  struct prog_instruction;
>  struct gl_program_parameter_list;
> +struct gl_shader_spirv_data;
>  struct set;
>  struct set_entry;
>  struct vbo_context;
> @@ -2646,6 +2647,9 @@ struct gl_shader
> GLuint TransformFeedbackBufferStride[MAX_FEEDBACK_BUFFERS];
>  
> struct gl_shader_info info;
> +
> +   /* ARB_gl_spirv related data */
> +   struct gl_shader_spirv_data *spirv_data;
>  };
>  
>  
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 72824355838..24058e5ee2e 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -42,6 +42,7 @@
>  #include "main/context.h"
>  #include "main/dispatch.h"
>  #include "main/enums.h"
> +#include "main/glspirv.h"
>  #include "main/hash.h"
>  #include "main/mtypes.h"
>  #include "main/pipelineobj.h"
> @@ -1051,6 +1052,16 @@ set_shader_source(struct gl_shader *sh, const GLchar 
> *source)
>  {
> assert(sh);
>  
> +   /* The GL_ARB_gl_spirv spec adds the following to the end of the 
> description
> +* of ShaderSource:
> +*
> +*   "If  was previously associated with a SPIR-V module (via the
> +*ShaderBinary command), that association is broken. Upon successful
> +*completion of this command the SPIR_V_BINARY_ARB state of 
> +*is set to FALSE."
> +*/
> +   _mesa_shader_spirv_data_reference(>spirv_data, NULL);
> +
> if 

Re: [Mesa-dev] [PATCH v2 07/25] mesa: refuse to compile SPIR-V shaders or link mixed shaders

2017-11-30 Thread Ian Romanick
On 11/30/2017 09:28 AM, Eduardo Lima Mitev wrote:
> From: Nicolai Hähnle 
> 
> Note that gl_shader::CompileStatus will also indicate whether a shader
> has been successfully specialized.
> 
> v2: Use the 'spirv_data' member of gl_shader to know if it is a SPIR-V
>shader, instead of a dedicated flag. (Timothy Arceri)
> ---
>  src/mesa/main/shaderapi.c   | 12 
>  src/mesa/program/ir_to_mesa.cpp | 17 -
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 3ac1419b7ee..251c876ada8 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -1092,6 +1092,18 @@ _mesa_compile_shader(struct gl_context *ctx, struct 
> gl_shader *sh)
> if (!sh)
>return;
>  
> +   /* The GL_ARB_gl_spirv spec says:
> +*
> +*"Add a new error for the CompileShader command:
> +*
> +*  An INVALID_OPERATION error is generated if the SPIR_V_BINARY_ARB
> +*  state of  is TRUE."
> +*/
> +   if (sh->spirv_data) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION, "glCompileShader(SPIR-V)");
> +  return;
> +   }
> +
> if (!sh->Source) {
>/* If the user called glCompileShader without first calling
> * glShaderSource, we should fail to compile, but not raise a GL_ERROR.
> diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
> index aa8b6d7084b..047f5b38f71 100644
> --- a/src/mesa/program/ir_to_mesa.cpp
> +++ b/src/mesa/program/ir_to_mesa.cpp
> @@ -3077,6 +3077,7 @@ void
>  _mesa_glsl_link_shader(struct gl_context *ctx, struct gl_shader_program 
> *prog)
>  {
> unsigned int i;
> +   GLboolean spirv;

Use bool.  With that fixed, this patch is

Reviewed-by: Ian Romanick 

>  
> _mesa_clear_shader_program_data(ctx, prog);
>  
> @@ -3086,7 +3087,21 @@ _mesa_glsl_link_shader(struct gl_context *ctx, struct 
> gl_shader_program *prog)
>  
> for (i = 0; i < prog->NumShaders; i++) {
>if (!prog->Shaders[i]->CompileStatus) {
> -  linker_error(prog, "linking with uncompiled shader");
> +  linker_error(prog, "linking with uncompiled/unspecialized shader");
> +  }
> +
> +  if (!i) {
> + spirv = (prog->Shaders[i]->spirv_data != NULL);
> +  } else if (spirv && !prog->Shaders[i]->spirv_data) {
> + /* The GL_ARB_gl_spirv spec adds a new bullet point to the list of
> +  * reasons LinkProgram can fail:
> +  *
> +  *"All the shader objects attached to  do not have the
> +  * same value for the SPIR_V_BINARY_ARB state."
> +  */
> + linker_error(prog,
> +  "not all attached shaders have the same "
> +  "SPIR_V_BINARY_ARB state");
>}
> }
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 06/25] mesa/shaderapi: add a getter for GL_SPIR_V_BINARY_ARB

2017-11-30 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 11/30/2017 09:28 AM, Eduardo Lima Mitev wrote:
> From: Nicolai Hähnle 
> 
> v2: Use the 'spirv_data' member of gl_shader instead of a
>dedicated flag. (Timothy Arceri)
> ---
>  src/mesa/main/shaderapi.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 24058e5ee2e..3ac1419b7ee 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -961,6 +961,9 @@ get_shaderiv(struct gl_context *ctx, GLuint name, GLenum 
> pname, GLint *params)
> case GL_SHADER_SOURCE_LENGTH:
>*params = shader->Source ? strlen((char *) shader->Source) + 1 : 0;
>break;
> +   case GL_SPIR_V_BINARY_ARB:
> +  *params = (shader->spirv_data != NULL);
> +  break;
> default:
>_mesa_error(ctx, GL_INVALID_ENUM, "glGetShaderiv(pname)");
>return;
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swr/scons: Fix intermittent build failure

2017-11-30 Thread Emil Velikov
On 30 November 2017 at 20:29, George Kyriazis  wrote:
> gen_rasterizer*.cpp now depend on gen_ar_eventhandler.hpp.
> Account for new dependency.
By 'now' i think you meant 'for a few months' :-P

Fixes: 0cc7c46cf43 ("swr/rast: Split rasterizer.cpp to improve compile time")
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965: Emit CS stall before MEDIA_VFE_STATE.

2017-11-30 Thread Matt Turner
From: Kenneth Graunke 

This fixes hangs on GFXBench 5's Aztec Ruins benchmark.

Unfortunately, it regresses OglCSCloth performance by about 10%. There
are some ideas for fixing that.

The Vulkan driver already emits this stall.

Reviewed-by: Matt Turner 
---
Commit message by Matt

 src/mesa/drivers/dri/i965/genX_state_upload.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index c2b1117186..47b29d82ae 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -4168,6 +4168,18 @@ genX(upload_cs_state)(struct brw_context *brw)
uint32_t *bind = brw_state_batch(brw, prog_data->binding_table.size_bytes,
 32, _state->bind_bo_offset);
 
+   /* The MEDIA_VFE_STATE documentation for Gen8+ says:
+*
+* "A stalling PIPE_CONTROL is required before MEDIA_VFE_STATE unless
+*  the only bits that are changed are scoreboard related: Scoreboard
+*  Enable, Scoreboard Type, Scoreboard Mask, Scoreboard * Delta. For
+*  these scoreboard related states, a MEDIA_STATE_FLUSH is sufficient."
+*
+* Earlier generations say "MI_FLUSH" instead of "stalling PIPE_CONTROL",
+* but MI_FLUSH isn't really a thing, so we assume they meant PIPE_CONTROL.
+*/
+   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
+
brw_batch_emit(brw, GENX(MEDIA_VFE_STATE), vfe) {
   if (prog_data->total_scratch) {
  uint32_t bo_offset;
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965: Move PIPE_CONTROL defines and prototypes to brw_pipe_control.h.

2017-11-30 Thread Matt Turner
From: Kenneth Graunke 

We need to be able to emit PIPE_CONTROLs from genX_state_upload.c,
which can't safely include brw_defines.h because it conflicts with
genxml.  Move all the PIPE_CONTROL related stuff together into a
separate header.

Reviewed-by: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.h   | 17 +
 src/mesa/drivers/dri/i965/brw_defines.h   | 43 -
 src/mesa/drivers/dri/i965/brw_pipe_control.h  | 89 +++
 src/mesa/drivers/dri/i965/genX_state_upload.c |  3 -
 4 files changed, 90 insertions(+), 62 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_pipe_control.h

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 0670483806..aa91380b96 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -37,6 +37,7 @@
 #include "main/macros.h"
 #include "main/mtypes.h"
 #include "brw_structs.h"
+#include "brw_pipe_control.h"
 #include "compiler/brw_compiler.h"
 
 #include "isl/isl.h"
@@ -1674,22 +1675,6 @@ bool
 gen9_use_linear_1d_layout(const struct brw_context *brw,
   const struct intel_mipmap_tree *mt);
 
-/* brw_pipe_control.c */
-int brw_init_pipe_control(struct brw_context *brw,
- const struct gen_device_info *info);
-void brw_fini_pipe_control(struct brw_context *brw);
-
-void brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags);
-void brw_emit_pipe_control_write(struct brw_context *brw, uint32_t flags,
- struct brw_bo *bo, uint32_t offset,
- uint64_t imm);
-void brw_emit_end_of_pipe_sync(struct brw_context *brw, uint32_t flags);
-void brw_emit_mi_flush(struct brw_context *brw);
-void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
-void brw_emit_depth_stall_flushes(struct brw_context *brw);
-void gen7_emit_vs_workaround_flush(struct brw_context *brw);
-void gen7_emit_cs_stall_flush(struct brw_context *brw);
-
 /* brw_queryformat.c */
 void brw_query_internal_format(struct gl_context *ctx, GLenum target,
GLenum internalFormat, GLenum pname,
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 59d9e5cf21..99d41cf1a5 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1490,49 +1490,6 @@ enum brw_pixel_shader_coverage_mask_mode {
 #define MI_MATH_OPERAND_ZF   0x32
 #define MI_MATH_OPERAND_CF   0x33
 
-/** @{
- *
- * PIPE_CONTROL operation, a combination MI_FLUSH and register write with
- * additional flushing control.
- */
-#define _3DSTATE_PIPE_CONTROL  (CMD_3D | (3 << 27) | (2 << 24))
-#define PIPE_CONTROL_CS_STALL  (1 << 20)
-#define PIPE_CONTROL_GLOBAL_SNAPSHOT_COUNT_RESET   (1 << 19)
-#define PIPE_CONTROL_TLB_INVALIDATE(1 << 18)
-#define PIPE_CONTROL_SYNC_GFDT (1 << 17)
-#define PIPE_CONTROL_MEDIA_STATE_CLEAR (1 << 16)
-#define PIPE_CONTROL_NO_WRITE  (0 << 14)
-#define PIPE_CONTROL_WRITE_IMMEDIATE   (1 << 14)
-#define PIPE_CONTROL_WRITE_DEPTH_COUNT (2 << 14)
-#define PIPE_CONTROL_WRITE_TIMESTAMP   (3 << 14)
-#define PIPE_CONTROL_DEPTH_STALL   (1 << 13)
-#define PIPE_CONTROL_RENDER_TARGET_FLUSH (1 << 12)
-#define PIPE_CONTROL_INSTRUCTION_INVALIDATE (1 << 11)
-#define PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE  (1 << 10) /* GM45+ only */
-#define PIPE_CONTROL_ISP_DIS   (1 << 9)
-#define PIPE_CONTROL_INTERRUPT_ENABLE  (1 << 8)
-#define PIPE_CONTROL_FLUSH_ENABLE  (1 << 7) /* Gen7+ only */
-/* GT */
-#define PIPE_CONTROL_DATA_CACHE_FLUSH  (1 << 5)
-#define PIPE_CONTROL_VF_CACHE_INVALIDATE   (1 << 4)
-#define PIPE_CONTROL_CONST_CACHE_INVALIDATE(1 << 3)
-#define PIPE_CONTROL_STATE_CACHE_INVALIDATE(1 << 2)
-#define PIPE_CONTROL_STALL_AT_SCOREBOARD   (1 << 1)
-#define PIPE_CONTROL_DEPTH_CACHE_FLUSH (1 << 0)
-#define PIPE_CONTROL_PPGTT_WRITE   (0 << 2)
-#define PIPE_CONTROL_GLOBAL_GTT_WRITE  (1 << 2)
-
-#define PIPE_CONTROL_CACHE_FLUSH_BITS \
-   (PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_DATA_CACHE_FLUSH | \
-PIPE_CONTROL_RENDER_TARGET_FLUSH)
-
-#define PIPE_CONTROL_CACHE_INVALIDATE_BITS \
-   (PIPE_CONTROL_STATE_CACHE_INVALIDATE | PIPE_CONTROL_CONST_CACHE_INVALIDATE 
| \
-PIPE_CONTROL_VF_CACHE_INVALIDATE | PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | 
\
-PIPE_CONTROL_INSTRUCTION_INVALIDATE)
-
-/** @} */
-
 #define XY_SETUP_BLT_CMD   (CMD_2D | (0x01 << 22))
 
 #define XY_COLOR_BLT_CMD   (CMD_2D | (0x50 << 22))
diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.h 
b/src/mesa/drivers/dri/i965/brw_pipe_control.h
new file mode 100644
index 00..6e9a404870
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.h
@@ -0,0 +1,89 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby 

Re: [Mesa-dev] [PATCH v4 22/44] i965/fs: Helpers for un/shuffle 16-bit pairs in 32-bit components

2017-11-30 Thread Chema Casanova


On 30/11/17 23:21, Jason Ekstrand wrote:
> On Wed, Nov 29, 2017 at 6:50 PM, Jose Maria Casanova Crespo
> > wrote:
> 
> This helpers are used to load/store 16-bit types from/to 32-bit
> components.
> 
> The functions shuffle_32bit_load_result_to_16bit_data and
> shuffle_16bit_data_for_32bit_write are implemented in a similar
> way than the analogous functions for handling 64-bit types.
> ---
>  src/intel/compiler/brw_fs.h       | 11 +
>  src/intel/compiler/brw_fs_nir.cpp | 51
> +++
>  2 files changed, 62 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
> index 19b897e7a9..30557324d5 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -497,6 +497,17 @@ void
> shuffle_32bit_load_result_to_64bit_data(const brw::fs_builder ,
>  fs_reg shuffle_64bit_data_for_32bit_write(const brw::fs_builder ,
>                                            const fs_reg ,
>                                            uint32_t components);
> +
> +void shuffle_32bit_load_result_to_16bit_data(const brw::fs_builder
> ,
> +                                             const fs_reg ,
> +                                             const fs_reg ,
> +                                             uint32_t components);
> +
> +void shuffle_16bit_data_for_32bit_write(const brw::fs_builder ,
> +                                        const fs_reg ,
> +                                        const fs_reg ,
> +                                        uint32_t components);
> +
>  fs_reg setup_imm_df(const brw::fs_builder ,
>                      double v);
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 726b2fcee7..c091241132 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4828,6 +4828,33 @@ shuffle_32bit_load_result_to_64bit_data(const
> fs_builder ,
>     }
>  }
> 
> +void
> +shuffle_32bit_load_result_to_16bit_data(const fs_builder ,
> +                                        const fs_reg ,
> +                                        const fs_reg ,
> +                                        uint32_t components)
> +{
> +   assert(type_sz(src.type) == 4);
> +   assert(type_sz(dst.type) == 2);
> +
> +   fs_reg tmp = retype(bld.vgrf(src.type), dst.type);
> +
> +   for (unsigned i = 0; i < components; i++) {
> +      const fs_reg component_i = subscript(offset(src, bld, i / 2),
> dst.type, i % 2);
> +
> +      bld.MOV(offset(tmp, bld, i % 2), component_i);
> +
> +      if (i % 2) {
> +         bld.MOV(offset(dst, bld, i -1), offset(tmp, bld, 0));
> +         bld.MOV(offset(dst, bld, i), offset(tmp, bld, 1));
> +      }
> 
> 
> I'm very confused by this extra moving.  Why can't we just do
> 
> bld.MOV(offset(dst, bld, i), component_i);
> 
> above?  Maybe I'm missing something but I don't see why the extra moves
> are needed.


There is a comment in the previous function
shuffle_32bit_load_result_to_64bit_data that explains the similar
situation that still applies in shuffle_32bit_load_result_to_16bit_data,
what about including the following comment.

/* A temporary that we will use to shuffle the 16-bit data of each
 * component in the vector into valid 32-bit data. We can't write
 * directly to dst because dst can be the same as src and in that
 * case the first MOV in the loop below would overwrite the data
 * read in the second MOV.
 */

But in any case I've just checked, and at first sight at the 6 final
uses of this function this situation never happens.

>  
> 
> +   }
> +   if (components % 2) {
> +      bld.MOV(offset(dst, bld, components - 1), tmp);
> +   }
> +}
> +
> +
>  /**
>   * This helper does the inverse operation of
>   * SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA.
> @@ -4860,6 +4887,30 @@ shuffle_64bit_data_for_32bit_write(const
> fs_builder ,
>     return dst;
>  }
> 
> +void
> +shuffle_16bit_data_for_32bit_write(const fs_builder ,
> +                                   const fs_reg ,
> +                                   const fs_reg ,
> +                                   uint32_t components)
> +{
> +   assert(type_sz(src.type) == 2);
> +   assert(type_sz(dst.type) == 4);
> +
> +   fs_reg tmp = bld.vgrf(dst.type);
> +
> +   for (unsigned i = 0; i < components; i++) {
> +      const fs_reg component_i = offset(src, bld, i);
> +      bld.MOV(subscript(tmp, src.type, i % 2), component_i);
> +      if (i % 2) {
> +         bld.MOV(offset(dst, bld, i / 2), tmp);
> +      }
> 
> 
> Again, why the extra MOVs?  

[Mesa-dev] [Bug 103999] 4x MSAA with RG32F shows garbage on triangle edges

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103999

--- Comment #3 from mais...@archlinux.us ---
RADV that is. Works on other driver stacks I've tested.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 103999] 4x MSAA with RG32F shows garbage on triangle edges

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103999

--- Comment #2 from mais...@archlinux.us ---
No idea, haven't tried this on any other drivers.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 23/44] i965/fs: Enables 16-bit load_ubo with sampler

2017-11-30 Thread Jason Ekstrand
On Wed, Nov 29, 2017 at 6:50 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> load_ubo is using 32-bit loads as uniforms surfaces have a 32-bit
> surface format defined. So when reading 16-bit components with the
> sampler we need to unshuffle two 16-bit components from each 32-bit
> component.
>
> Using the sampler avoids the use of the byte_scattered_read message
> that needs one message for each component and is supposed to be
> slower.
>
> In the case of SKL+ we take advantage of a hardware feature that
> automatically defines a channel mask based on the rlen value, so on
> SKL+ we only use half of the registers without using a header in the
> payload.
> ---
>  src/intel/compiler/brw_fs.cpp   | 31
> +++
>  src/intel/compiler/brw_fs_generator.cpp | 10 --
>  2 files changed, 35 insertions(+), 6 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 1ca4d416b2..9c543496ba 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -184,9 +184,17 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const
> fs_builder ,
>  * a double this means we are only loading 2 elements worth of data.
>  * We also want to use a 32-bit data type for the dst of the load
> operation
>  * so other parts of the driver don't get confused about the size of
> the
> -* result.
> +* result. On the case of 16-bit data we only need half of the 32-bit
> +* components on SKL+ as we take advance of using message return size
> to
> +* define an xy channel mask.
>  */
> -   fs_reg vec4_result = bld.vgrf(BRW_REGISTER_TYPE_F, 4);
> +   fs_reg vec4_result;
> +   if (type_sz(dst.type) == 2 && (devinfo->gen >= 9)) {
> +  vec4_result = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
> +  vec4_result = retype(vec4_result, BRW_REGISTER_TYPE_HF);
> +   } else {
> +  vec4_result = bld.vgrf(BRW_REGISTER_TYPE_F, 4);
> +   }
>
fs_inst *inst = bld.emit(FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_LOGICAL,
>  vec4_result, surf_index, vec4_offset);
> inst->size_written = 4 * vec4_result.component_size(inst->exec_size);
> @@ -197,8 +205,23 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const
> fs_builder ,
> }
>
> vec4_result.type = dst.type;
> -   bld.MOV(dst, offset(vec4_result, bld,
> -   (const_offset & 0xf) / type_sz(vec4_result.type)));
> +
> +   if (type_sz(dst.type) == 2) {
> +  /* 16-bit types need to be unshuffled as each pair of 16-bit
> components
> +   * is packed on a 32-bit compoment because we are using a 32-bit
> format
> +   * in the surface of uniform that is read by the sampler.
> +   * TODO: On BDW+ mark when an uniform has 16-bit type so we could
> setup a
> +   * surface format of 16-bit and use the 16-bit return format at the
> +   * sampler.
> +   */
> +  vec4_result.stride = 2;
> +  bld.MOV(dst, byte_offset(offset(vec4_result, bld,
> +  (const_offset & 0x7) / 4),
> +   (const_offset & 0x7) / 2 % 2 * 2));
> +   } else {
> +  bld.MOV(dst, offset(vec4_result, bld,
> +  (const_offset & 0xf) /
> type_sz(vec4_result.type)));
> +   }
>

This seems overly complicated.  How about something like

fs_reg dw = offset(vec4_result, bld, (const_offset & 0xf) / 4);
switch (type_sz(dst.type)) {
case 2:
   shuffle_32bit_load_result_to_16bit_data(bld, dst, dw, 1);
   bld.MOV(dst, subscript(dw, dst.type, (const_offset / 2) & 1));
   break;
case 4:
   bld.MOV(dst, dw);
   break;
case 8:
   shuffle_32bit_load_result_to_64bit_data(bld, dst, dw, 1);
   break;
default:
   unreachable();
}


>  }
>
>  /**
> diff --git a/src/intel/compiler/brw_fs_generator.cpp
> b/src/intel/compiler/brw_fs_generator.cpp
> index a3861cd68e..00a4e29147 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1381,12 +1381,18 @@ fs_generator::generate_varying
> _pull_constant_load_gen7(fs_inst *inst,
> uint32_t simd_mode, rlen, mlen;
> if (inst->exec_size == 16) {
>mlen = 2;
> -  rlen = 8;
> +  if (type_sz(dst.type) == 2 && (devinfo->gen >= 9))
> + rlen = 4;
> +  else
> + rlen = 8;
>

I'm not sure what I think of this.  We intentionally use a vec4 today
instead of a single float because it lets us potentially batch up a bunch
of loads in a single texture operation.  Are we sure that we never need
more than 2 of those result registers?

--Jason

   simd_mode = BRW_SAMPLER_SIMD_MODE_SIMD16;
> } else {
>assert(inst->exec_size == 8);
>mlen = 1;
> -  rlen = 4;
> +  if (type_sz(dst.type) == 2 && (devinfo->gen >= 9))
> + rlen = 2;
> +  else
> + rlen = 4;
>simd_mode = BRW_SAMPLER_SIMD_MODE_SIMD8;
> }
>
> --
> 2.14.3
>
> ___
> mesa-dev mailing list
> 

Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Nicolai Hähnle

Hi,

I've had a chance to look a bit more closely at the allocator prototype 
repository now. There's a whole bunch of low-level API design feedback, 
but for now let's focus on the high-level stuff first.


Going by the 4.5 major object types (as also seen on slide 5 of your 
presentation [0]), assertions and usages make sense to me.


Capabilities and capability sets should be cleaned up in my opinion, as 
the status quo is overly obfuscating things. What capability sets really 
represent, as far as I understand them, is *memory layouts*, and so 
that's what they should be called.


This conceptually simplifies `derive_capabilities` significantly without 
any loss of expressiveness as far as I can see. Given two lists of 
memory layouts, we simply look for which memory layouts appear in both 
lists, and then merge their constraints and capabilities.


Merging constraints looks good to me.

Capabilities need some more thought. The prototype removes capabilities 
when merging layouts, but I'd argue that that is often undesirable. (In 
fact, I cannot think of capabilities which we'd always want to remove.)


A typical example for this is compression (i.e. DCC in our case). For 
rendering usage, we'd return something like:


Memory layout: AMD/tiled; constraints(alignment=64k); caps(AMD/DCC)

For display usage, we might return (depending on hardware):

Memory layout: AMD/tiled; constraints(alignment=64k); caps(none)

Merging these in the prototype would remove the DCC capability, even 
though it might well make sense to keep it there for rendering. Dealing 
with the fact that display usage does not have this capability is 
precisely one of the two things that transitions are about! The other 
thing that transitions are about is caches.


I think this is kind of what Rob was saying in one of his mails.

Two interesting questions:

1. If we query for multiple usages on the same device, can we get a 
capability which can only be used for a subset of those usages?


2. What happens when we merge memory layouts with sets of capabilities 
where neither is a subset of the other?


As for the actual transition API, I accept that some metadata may be 
required, and the metadata probably needs to depend on the memory 
layout, which is often vendor-specific. But even linear layouts need 
some transitions for caches. We probably need at least some generic 
"off-device usage" bit.


Cheers,
Nicolai

[0] https://www.x.org/wiki/Events/XDC2017/jones_allocator.pdf

On 21.11.2017 02:11, James Jones wrote:
As many here know at this point, I've been working on solving issues 
related to DMA-capable memory allocation for various devices for some 
time now.  I'd like to take this opportunity to apologize for the way I 
handled the EGL stream proposals.  I understand now that the development 
process followed there was unacceptable to the community and likely 
offended many great engineers.


Moving forward, I attempted to reboot talks in a more constructive 
manner with the generic allocator library proposals & discussion forum 
at XDC 2016.  Some great design ideas came out of that, and I've since 
been prototyping some code to prove them out before bringing them back 
as official proposals.  Again, I understand some people are growing 
concerned that I've been doing this off on the side in a github project 
that has primarily NVIDIA contributors.  My goal was only to avoid 
wasting everyone's time with unproven ideas.  The intent was never to 
dump the prototype code as-is on the community and presume acceptance. 
It's just a public research project.


Now the prototyping is nearing completion, and I'd like to renew 
discussion on whether and how the new mechanisms can be integrated with 
the Linux graphics stack.


I'd be interested to know if more work is needed to demonstrate the 
usefulness of the new mechanisms, or whether people think they have 
value at this point.


After talking with people on the hallway track at XDC this year, I've 
heard several proposals for incorporating the new mechanisms:


-Include ideas from the generic allocator design into GBM.  This could 
take the form of designing a "GBM 2.0" API, or incrementally adding to 
the existing GBM API.


-Develop a library to replace GBM.  The allocator prototype code could 
be massaged into something production worthy to jump start this process.


-Develop a library that sits beside or on top of GBM, using GBM for 
low-level graphics buffer allocation, while supporting non-graphics 
kernel APIs directly.  The additional cross-device negotiation and 
sorting of capabilities would be handled in this slightly higher-level 
API before handing off to GBM and other APIs for actual allocation somehow.


-I have also heard some general comments that regardless of the 
relationship between GBM and the new allocator mechanisms, it might be 
time to move GBM out of Mesa so it can be developed as a stand-alone 
project.  I'd be interested what others think about that, 

Re: [Mesa-dev] [PATCH v4 22/44] i965/fs: Helpers for un/shuffle 16-bit pairs in 32-bit components

2017-11-30 Thread Jason Ekstrand
On Wed, Nov 29, 2017 at 6:50 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> This helpers are used to load/store 16-bit types from/to 32-bit
> components.
>
> The functions shuffle_32bit_load_result_to_16bit_data and
> shuffle_16bit_data_for_32bit_write are implemented in a similar
> way than the analogous functions for handling 64-bit types.
> ---
>  src/intel/compiler/brw_fs.h   | 11 +
>  src/intel/compiler/brw_fs_nir.cpp | 51 ++
> +
>  2 files changed, 62 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
> index 19b897e7a9..30557324d5 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -497,6 +497,17 @@ void shuffle_32bit_load_result_to_64bit_data(const
> brw::fs_builder ,
>  fs_reg shuffle_64bit_data_for_32bit_write(const brw::fs_builder ,
>const fs_reg ,
>uint32_t components);
> +
> +void shuffle_32bit_load_result_to_16bit_data(const brw::fs_builder ,
> + const fs_reg ,
> + const fs_reg ,
> + uint32_t components);
> +
> +void shuffle_16bit_data_for_32bit_write(const brw::fs_builder ,
> +const fs_reg ,
> +const fs_reg ,
> +uint32_t components);
> +
>  fs_reg setup_imm_df(const brw::fs_builder ,
>  double v);
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 726b2fcee7..c091241132 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4828,6 +4828,33 @@ shuffle_32bit_load_result_to_64bit_data(const
> fs_builder ,
> }
>  }
>
> +void
> +shuffle_32bit_load_result_to_16bit_data(const fs_builder ,
> +const fs_reg ,
> +const fs_reg ,
> +uint32_t components)
> +{
> +   assert(type_sz(src.type) == 4);
> +   assert(type_sz(dst.type) == 2);
> +
> +   fs_reg tmp = retype(bld.vgrf(src.type), dst.type);
> +
> +   for (unsigned i = 0; i < components; i++) {
> +  const fs_reg component_i = subscript(offset(src, bld, i / 2),
> dst.type, i % 2);
> +
> +  bld.MOV(offset(tmp, bld, i % 2), component_i);
> +
> +  if (i % 2) {
> + bld.MOV(offset(dst, bld, i -1), offset(tmp, bld, 0));
> + bld.MOV(offset(dst, bld, i), offset(tmp, bld, 1));
> +  }
>

I'm very confused by this extra moving.  Why can't we just do

bld.MOV(offset(dst, bld, i), component_i);

above?  Maybe I'm missing something but I don't see why the extra moves are
needed.


> +   }
> +   if (components % 2) {
> +  bld.MOV(offset(dst, bld, components - 1), tmp);
> +   }
> +}
> +
> +
>  /**
>   * This helper does the inverse operation of
>   * SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA.
> @@ -4860,6 +4887,30 @@ shuffle_64bit_data_for_32bit_write(const
> fs_builder ,
> return dst;
>  }
>
> +void
> +shuffle_16bit_data_for_32bit_write(const fs_builder ,
> +   const fs_reg ,
> +   const fs_reg ,
> +   uint32_t components)
> +{
> +   assert(type_sz(src.type) == 2);
> +   assert(type_sz(dst.type) == 4);
> +
> +   fs_reg tmp = bld.vgrf(dst.type);
> +
> +   for (unsigned i = 0; i < components; i++) {
> +  const fs_reg component_i = offset(src, bld, i);
> +  bld.MOV(subscript(tmp, src.type, i % 2), component_i);
> +  if (i % 2) {
> + bld.MOV(offset(dst, bld, i / 2), tmp);
> +  }
>

Again, why the extra MOVs?  Why not

bld.MOV(subscript(offset(tmp, bld, i / 2), src.type, i % 2), component_i);

instead of the extra MOVs?

--Jason


> +   }
> +   if (components % 2) {
> +  bld.MOV(offset(dst, bld, components / 2), tmp);
> +   }
> +}
> +
> +
>  fs_reg
>  setup_imm_df(const fs_builder , double v)
>  {
> --
> 2.14.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: only reset command buffers when the allocation fails

2017-11-30 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Nov 30, 2017 at 10:23 PM, Samuel Pitoiset
 wrote:
>"vkAllocateCommandBuffers can be used to create multiple command
> buffers. If the creation of any of those command buffers fails, the
> implementation must destroy all successfully created command buffer
> objects from this command, set all entries of the pCommandBuffers
> array to NULL and return the error."
>
> This has been suggested by gabr...@system.is.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 18a1c55ad1..fe4f989dd1 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -2140,9 +2140,6 @@ VkResult radv_AllocateCommandBuffers(
> VkResult result = VK_SUCCESS;
> uint32_t i;
>
> -   memset(pCommandBuffers, 0,
> -   
> sizeof(*pCommandBuffers)*pAllocateInfo->commandBufferCount);
> -
> for (i = 0; i < pAllocateInfo->commandBufferCount; i++) {
>
> if (!list_empty(>free_cmd_buffers)) {
> @@ -2164,10 +2161,23 @@ VkResult radv_AllocateCommandBuffers(
> break;
> }
>
> -   if (result != VK_SUCCESS)
> +   if (result != VK_SUCCESS) {
> radv_FreeCommandBuffers(_device, pAllocateInfo->commandPool,
> i, pCommandBuffers);
>
> +   /* From the Vulkan 1.0.66 spec:
> +*
> +* "vkAllocateCommandBuffers can be used to create multiple
> +*  command buffers. If the creation of any of those command
> +*  buffers fails, the implementation must destroy all
> +*  successfully created command buffer objects from this
> +*  command, set all entries of the pCommandBuffers array to
> +*  NULL and return the error."
> +*/
> +   memset(pCommandBuffers, 0,
> +  sizeof(*pCommandBuffers) * 
> pAllocateInfo->commandBufferCount);
> +   }
> +
> return result;
>  }
>
> --
> 2.15.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: do not dump meta shaders with RADV_DEBUG=shaders

2017-11-30 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Nov 30, 2017 at 10:16 PM, Samuel Pitoiset
 wrote:
> It's really annoying and this pollutes the output especially
> when a bunch of non-meta shaders are compiled.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c |  5 +
>  src/amd/vulkan/radv_shader.c   |  2 +-
>  src/amd/vulkan/radv_shader.h   | 10 ++
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index faffca8330..fcbb5804f5 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -1867,10 +1867,7 @@ void radv_create_shaders(struct radv_pipeline 
> *pipeline,
> radv_link_shaders(pipeline, nir);
>
> for (int i = 0; i < MESA_SHADER_STAGES; ++i) {
> -   if (!(device->instance->debug_flags & 
> RADV_DEBUG_DUMP_SHADERS))
> -   continue;
> -
> -   if (modules[i])
> +   if (modules[i] && radv_can_dump_shader(device, modules[i]))
> nir_print_shader(nir[i], stderr);
> }
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 32edf2abd2..5464d3a58e 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -429,7 +429,7 @@ shader_variant_create(struct radv_device *device,
>   unsigned *code_size_out)
>  {
> enum radeon_family chip_family = 
> device->physical_device->rad_info.family;
> -   bool dump_shaders = device->instance->debug_flags & 
> RADV_DEBUG_DUMP_SHADERS;
> +   bool dump_shaders = radv_can_dump_shader(device, module);
> enum ac_target_machine_options tm_options = 0;
> struct radv_shader_variant *variant;
> struct ac_shader_binary binary;
> diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
> index 9bdbe848c8..91f2e7f2a1 100644
> --- a/src/amd/vulkan/radv_shader.h
> +++ b/src/amd/vulkan/radv_shader.h
> @@ -28,6 +28,7 @@
>  #ifndef RADV_SHADER_H
>  #define RADV_SHADER_H
>
> +#include "radv_debug.h"
>  #include "radv_private.h"
>
>  #include "nir/nir.h"
> @@ -112,4 +113,13 @@ radv_shader_dump_stats(struct radv_device *device,
>gl_shader_stage stage,
>FILE *file);
>
> +static inline bool
> +radv_can_dump_shader(struct radv_device *device,
> +struct radv_shader_module *module)
> +{
> +   /* Only dump non-meta shaders, useful for debugging purposes. */
> +   return !module->nir &&
> +  device->instance->debug_flags & RADV_DEBUG_DUMP_SHADERS;
> +}
> +
>  #endif
> --
> 2.15.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: only reset command buffers when the allocation fails

2017-11-30 Thread Samuel Pitoiset
   "vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to NULL and return the error."

This has been suggested by gabr...@system.is.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 18a1c55ad1..fe4f989dd1 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2140,9 +2140,6 @@ VkResult radv_AllocateCommandBuffers(
VkResult result = VK_SUCCESS;
uint32_t i;
 
-   memset(pCommandBuffers, 0,
-   
sizeof(*pCommandBuffers)*pAllocateInfo->commandBufferCount);
-
for (i = 0; i < pAllocateInfo->commandBufferCount; i++) {
 
if (!list_empty(>free_cmd_buffers)) {
@@ -2164,10 +2161,23 @@ VkResult radv_AllocateCommandBuffers(
break;
}
 
-   if (result != VK_SUCCESS)
+   if (result != VK_SUCCESS) {
radv_FreeCommandBuffers(_device, pAllocateInfo->commandPool,
i, pCommandBuffers);
 
+   /* From the Vulkan 1.0.66 spec:
+*
+* "vkAllocateCommandBuffers can be used to create multiple
+*  command buffers. If the creation of any of those command
+*  buffers fails, the implementation must destroy all
+*  successfully created command buffer objects from this
+*  command, set all entries of the pCommandBuffers array to
+*  NULL and return the error."
+*/
+   memset(pCommandBuffers, 0,
+  sizeof(*pCommandBuffers) * 
pAllocateInfo->commandBufferCount);
+   }
+
return result;
 }
 
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeon/vcn: determine idr by pic type

2017-11-30 Thread Leo Liu



On 11/30/2017 04:12 PM, boyuan.zh...@amd.com wrote:

From: Boyuan Zhang 

Vaapi encode interface provides idr frame flags, where omx interface doesn't.
Therefore, change to use picture type to determine idr frame, which will
work for both interfaces.

Signed-off-by: Boyuan Zhang 

Reviewed-by: Leo Liu 


---
  src/gallium/drivers/radeon/radeon_vcn_enc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc.c 
b/src/gallium/drivers/radeon/radeon_vcn_enc.c
index 9806a69..5fc9fc7 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_enc.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_enc.c
@@ -47,7 +47,7 @@ static void radeon_vcn_enc_get_param(struct radeon_encoder 
*enc, struct pipe_h26
enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
enc->enc_pic.not_referenced = pic->not_referenced;
-   enc->enc_pic.is_idr = pic->is_idr;
+   enc->enc_pic.is_idr = (pic->picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_IDR);
enc->enc_pic.crop_left = 0;
enc->enc_pic.crop_right = (align(enc->base.width, 16) - 
enc->base.width) / 2;
enc->enc_pic.crop_top = 0;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: do not dump meta shaders with RADV_DEBUG=shaders

2017-11-30 Thread Samuel Pitoiset
It's really annoying and this pollutes the output especially
when a bunch of non-meta shaders are compiled.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c |  5 +
 src/amd/vulkan/radv_shader.c   |  2 +-
 src/amd/vulkan/radv_shader.h   | 10 ++
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index faffca8330..fcbb5804f5 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1867,10 +1867,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
radv_link_shaders(pipeline, nir);
 
for (int i = 0; i < MESA_SHADER_STAGES; ++i) {
-   if (!(device->instance->debug_flags & RADV_DEBUG_DUMP_SHADERS))
-   continue;
-
-   if (modules[i])
+   if (modules[i] && radv_can_dump_shader(device, modules[i]))
nir_print_shader(nir[i], stderr);
}
 
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 32edf2abd2..5464d3a58e 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -429,7 +429,7 @@ shader_variant_create(struct radv_device *device,
  unsigned *code_size_out)
 {
enum radeon_family chip_family = 
device->physical_device->rad_info.family;
-   bool dump_shaders = device->instance->debug_flags & 
RADV_DEBUG_DUMP_SHADERS;
+   bool dump_shaders = radv_can_dump_shader(device, module);
enum ac_target_machine_options tm_options = 0;
struct radv_shader_variant *variant;
struct ac_shader_binary binary;
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index 9bdbe848c8..91f2e7f2a1 100644
--- a/src/amd/vulkan/radv_shader.h
+++ b/src/amd/vulkan/radv_shader.h
@@ -28,6 +28,7 @@
 #ifndef RADV_SHADER_H
 #define RADV_SHADER_H
 
+#include "radv_debug.h"
 #include "radv_private.h"
 
 #include "nir/nir.h"
@@ -112,4 +113,13 @@ radv_shader_dump_stats(struct radv_device *device,
   gl_shader_stage stage,
   FILE *file);
 
+static inline bool
+radv_can_dump_shader(struct radv_device *device,
+struct radv_shader_module *module)
+{
+   /* Only dump non-meta shaders, useful for debugging purposes. */
+   return !module->nir &&
+  device->instance->debug_flags & RADV_DEBUG_DUMP_SHADERS;
+}
+
 #endif
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeon/vcn: determine idr by pic type

2017-11-30 Thread boyuan.zhang
From: Boyuan Zhang 

Vaapi encode interface provides idr frame flags, where omx interface doesn't.
Therefore, change to use picture type to determine idr frame, which will
work for both interfaces.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vcn_enc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc.c 
b/src/gallium/drivers/radeon/radeon_vcn_enc.c
index 9806a69..5fc9fc7 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_enc.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_enc.c
@@ -47,7 +47,7 @@ static void radeon_vcn_enc_get_param(struct radeon_encoder 
*enc, struct pipe_h26
enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
enc->enc_pic.not_referenced = pic->not_referenced;
-   enc->enc_pic.is_idr = pic->is_idr;
+   enc->enc_pic.is_idr = (pic->picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_IDR);
enc->enc_pic.crop_left = 0;
enc->enc_pic.crop_right = (align(enc->base.width, 16) - 
enc->base.width) / 2;
enc->enc_pic.crop_top = 0;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests

2017-11-30 Thread Neil Roberts
Kenneth Graunke  writes:

> Neil, in case you were wondering, I suggested the above organization
> of vertex elements because it would let us only upload 1 in the common
> case. Looking in shader-db, there are 3579 shaders that use
> gl_InstanceID, 186 shaders that use gl_VertexID, and 0 shaders that
> use gl_BaseVertex, gl_BaseInstance, or gl_DrawID.
>
> It looks like your patches kicked gl_BaseVertex off to VE2 instead of
> gl_BaseInstance. I suppose that works too, I just figured that keeping
> VertexID/FirstVertex/BaseVertex together would make sense. If it's
> more convenient to move gl_BaseVertex, I suppose I'm fine with that
> too...

Yes, Antia forwarded me your emails with the previous suggestion. The
order we came up with for this patch series is:

VE 1: 
VE 2: 

The “offset for vertex ID” corresponds to what we were previously
(incorrectly) sending for gl_BaseVertex, so effectively the values in
VE1 have not changed at all. This also means we can continue to directly
take these two values from the indirect draw params buffer. I believe
your previous suggestion ended up needing a register store to put the
values in the right place so with this ordering we get to avoid that.
VE1 now contains everything needed for the most common values VertexID
and InstanceID. VE2 is only needed for the rare cases that use gl_DrawID
or gl_BaseVertex. On Vulkan VE2 won’t even be needed for the BaseVertex
builtin because the semantics are different and in that case it
corresponds to “offset for vertex ID”.

I haven’t looked into too much detail about reordering the patches yet,
but it does sound reasonable to try and avoid intermediate breakages.

Thanks for looking at the series.

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 25/44] anv: Enable VK_KHR_16bit_storage for SSBO and UBO

2017-11-30 Thread Jason Ekstrand
I'd roll 24 in with 25 personally.  Either way, 24 and 25 are

Reviewed-by: Jason Ekstrand 

I'm going to try and get enough reviewed today that we can land this part.

On Wed, Nov 29, 2017 at 6:57 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> From: Alejandro Piñeiro 
>
> It uses VK_KHR_get_physical_device_properties2 functionality to expose
> if the extension is supported or not.
>
> v2: update due rebase against master (Alejandro)
>
> v3: (Jason Ekstrand)
> - Move this patch up in VK_KHR_16bit_storage series enabling only
>   storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess.
> - Only expose VK_KHR_16bit_storage on Gen8+
>
> Signed-off-by: Jose Maria Casanova Crespo 
> Signed-off-by: Alejandro Piñeiro 
> ---
>  src/intel/vulkan/anv_device.c  | 13 +
>  src/intel/vulkan/anv_extensions.py |  1 +
>  2 files changed, 14 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index b5577ee61d..69a1f5a5f6 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -725,6 +725,19 @@ void anv_GetPhysicalDeviceFeatures2KHR(
>   break;
>}
>
> +  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR:
> {
> + ANV_FROM_HANDLE(anv_physical_device, pdevice, physicalDevice);
> +
> + VkPhysicalDevice16BitStorageFeaturesKHR *features =
> +(VkPhysicalDevice16BitStorageFeaturesKHR *)ext;
> +
> + features->storageBuffer16BitAccess = pdevice->info.gen >= 8;
> + features->uniformAndStorageBuffer16BitAccess =
> pdevice->info.gen >= 8;
> + features->storagePushConstant16 = false;
> + features->storageInputOutput16 = false;
> + break;
> +  }
> +
>default:
>   anv_debug_ignored_stype(ext->sType);
>   break;
> diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_
> extensions.py
> index b1e984b8cd..c49718dfd4 100644
> --- a/src/intel/vulkan/anv_extensions.py
> +++ b/src/intel/vulkan/anv_extensions.py
> @@ -51,6 +51,7 @@ class Extension:
>  # and dEQP-VK.api.info.device fail due to the duplicated strings.
>  EXTENSIONS = [
>  Extension('VK_ANDROID_native_buffer', 5, 'ANDROID'),
> +Extension('VK_KHR_16bit_storage', 1,
> 'device->info.gen >= 8'),
>  Extension('VK_KHR_bind_memory2',  1, True),
>  Extension('VK_KHR_dedicated_allocation',  1, True),
>  Extension('VK_KHR_descriptor_update_template',1, True),
> --
> 2.14.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 20/44] i965/fs: Add byte scattered read message and fs support

2017-11-30 Thread Jason Ekstrand
On Wed, Nov 29, 2017 at 6:50 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> v2: Fix alignment style (Topi Pohjolainen)
> (Jason Ekstrand)
> - Enable bit_size parameter to scattered messages to enable different
>   bitsizes byte/word/dword.
> - Remove use of brw_send_indirect_scattered_message in favor of
>   brw_send_indirect_surface_message.
> - Move scattered messages to surface messages namespace.
> - Assert align1 for scattered messages and assume Gen8+.
> - Inline brw_set_dp_byte_scattered_read.
> ---
>  src/intel/compiler/brw_eu.h|  8 +++
>  src/intel/compiler/brw_eu_defines.h|  2 ++
>  src/intel/compiler/brw_eu_emit.c   | 30
> ++
>  src/intel/compiler/brw_fs.cpp  | 19 
>  src/intel/compiler/brw_fs_copy_propagation.cpp |  2 ++
>  src/intel/compiler/brw_fs_generator.cpp|  6 ++
>  src/intel/compiler/brw_fs_surface_builder.cpp  | 11 +-
>  src/intel/compiler/brw_fs_surface_builder.h|  7 ++
>  src/intel/compiler/brw_shader.cpp  |  6 ++
>  9 files changed, 90 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
> index 3ac3b4342a..2d0f56f793 100644
> --- a/src/intel/compiler/brw_eu.h
> +++ b/src/intel/compiler/brw_eu.h
> @@ -485,6 +485,14 @@ brw_typed_surface_write(struct brw_codegen *p,
>  unsigned msg_length,
>  unsigned num_channels);
>
> +void
> +brw_byte_scattered_read(struct brw_codegen *p,
> +struct brw_reg dst,
> +struct brw_reg payload,
> +struct brw_reg surface,
> +unsigned msg_length,
> +unsigned bit_size);
> +
>  void
>  brw_byte_scattered_write(struct brw_codegen *p,
>   struct brw_reg payload,
> diff --git a/src/intel/compiler/brw_eu_defines.h
> b/src/intel/compiler/brw_eu_defines.h
> index de6330ee54..aa510ebfa4 100644
> --- a/src/intel/compiler/brw_eu_defines.h
> +++ b/src/intel/compiler/brw_eu_defines.h
> @@ -409,6 +409,8 @@ enum opcode {
>  * opcode, but instead of taking a single payload blog they expect
> their
>  * arguments separately as individual sources, like untyped write/read.
>  */
> +   SHADER_OPCODE_BYTE_SCATTERED_READ,
> +   SHADER_OPCODE_BYTE_SCATTERED_READ_LOGICAL,
> SHADER_OPCODE_BYTE_SCATTERED_WRITE,
> SHADER_OPCODE_BYTE_SCATTERED_WRITE_LOGICAL,
>
> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_
> emit.c
> index ded7e228cf..bdc516848a 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -2998,6 +2998,36 @@ static enum brw_data_size
> brw_data_size_from_bit_size(unsigned bit_size)
> }
>  }
>
> +
> +void
> +brw_byte_scattered_read(struct brw_codegen *p,
> +struct brw_reg dst,
> +struct brw_reg payload,
> +struct brw_reg surface,
> +unsigned msg_length,
> +unsigned bit_size)
> +{
> +   assert(brw_inst_access_mode(p->devinfo, p->current) == BRW_ALIGN_1);
> +   const struct gen_device_info *devinfo = p->devinfo;
> +   const unsigned sfid =  GEN7_SFID_DATAPORT_DATA_CACHE;
> +
> +   struct brw_inst *insn = brw_send_indirect_surface_message(
> +  p, sfid, dst, payload, surface, msg_length,
> +  brw_surface_payload_size(p, 1, true, true),
> +  false);
> +
> +   unsigned msg_control = brw_data_size_from_bit_size(bit_size) << 2;
> +
> +   if (brw_inst_exec_size(devinfo, p->current) == BRW_EXECUTE_16)
> +  msg_control |= 1; /* SIMD16 mode */
> +   else
> +  msg_control |= 0; /* SIMD8 mode */
> +
> +   brw_inst_set_dp_msg_type(devinfo, insn,
> +HSW_DATAPORT_DC_PORT0_BYTE_SCATTERED_READ);
> +   brw_inst_set_dp_msg_control(devinfo, insn, msg_control);
> +}
> +
>  void
>  brw_byte_scattered_write(struct brw_codegen *p,
>   struct brw_reg payload,
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 32f1d757f0..1ca4d416b2 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -251,6 +251,7 @@ fs_inst::is_send_from_grf() const
> case SHADER_OPCODE_UNTYPED_SURFACE_READ:
> case SHADER_OPCODE_UNTYPED_SURFACE_WRITE:
> case SHADER_OPCODE_BYTE_SCATTERED_WRITE:
> +   case SHADER_OPCODE_BYTE_SCATTERED_READ:
> case SHADER_OPCODE_TYPED_ATOMIC:
> case SHADER_OPCODE_TYPED_SURFACE_READ:
> case SHADER_OPCODE_TYPED_SURFACE_WRITE:
> @@ -750,6 +751,16 @@ fs_inst::components_read(unsigned i) const
>else
>   return 1;
>
> +   case SHADER_OPCODE_BYTE_SCATTERED_READ_LOGICAL:
> +  assert(src[3].file == IMM &&
> + src[4].file == IMM);
> +  if (i == 0)
> + return 1;
> 

Re: [Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID

2017-11-30 Thread Neil Roberts
Kenneth Graunke  writes:

> We have a number of similar names now:
>
>SYSTEM_VALUE_BASE_VERTEX
>SYSTEM_VALUE_BASE_VERTEX_ID
>SYSTEM_VALUE_VERTEX_ID
>SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
>
> BASE_VERTEX and BASE_VERTEX_ID are really similar names, and honestly
> either one seems like it could be the name for gl_BaseVertex.  I'm
> afraid it would be easy to mix them up by mistake.  IMHO, it would be
> nice to pick a different word, just to keep some distinction between
> the two fairly related concepts...
>
> Perhaps SYSTEM_VALUE_FIRST_VERTEX...?  That's only half the meaning,
> but it at least uses a different word, and makes you think "do I want
> BASE_VERTEX or FIRST_VERTEX?"

Yes, naming this thing is really difficult. I’m not sure if you noticed,
but for Vulkan, the BaseVertex builtin should actually have the value of
BASE_VERTEX_ID unlike GL. So if we rename BASE_VERTEX to something
without “base vertex” in it then it will still be confusing for Vulkan.
So effectively the descriptive names are like:

SYSTEM_VALUE_BASE_VERTEX_ON_GL_BUT_NOT_VULKAN
SYSTEM_VALUE_BASE_VERTEX_ON_VULKAN_OR_OFFSET_FOR_VERTEX_ID_ON_GL

I’m not sure whether that’s enough of an argument against FIRST_VERTEX
though, so personally I don’t really mind either way.

Antia, what do you think?

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 18/44] i965/fs: Add byte scattered write message and fs support

2017-11-30 Thread Jason Ekstrand
On Wed, Nov 29, 2017 at 6:08 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> v2: (Jason Ekstrand)
> - Enable bit_size parameter to scattered messages to enable different
>   bitsizes byte/word/dword.
> - Remove use of brw_send_indirect_scattered_message in favor of
>   brw_send_indirect_surface_message.
> - Move scattered messages to surface messages namespace.
> - Assert align1 for scattered messages and assume Gen8+.
> - Inline brw_set_dp_byte_scattered_write.
>
> Signed-off-by: Jose Maria Casanova Crespo 
> Signed-off-by: Alejandro Piñeiro 
> ---
>  src/intel/compiler/brw_eu.h|  7 +
>  src/intel/compiler/brw_eu_defines.h| 17 +++
>  src/intel/compiler/brw_eu_emit.c   | 42
> ++
>  src/intel/compiler/brw_fs.cpp  | 14 +
>  src/intel/compiler/brw_fs_copy_propagation.cpp |  2 ++
>  src/intel/compiler/brw_fs_generator.cpp|  6 
>  src/intel/compiler/brw_fs_surface_builder.cpp  | 11 +++
>  src/intel/compiler/brw_fs_surface_builder.h|  7 +
>  src/intel/compiler/brw_shader.cpp  |  7 +
>  9 files changed, 113 insertions(+)
>
> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
> index 343dcd867d..3ac3b4342a 100644
> --- a/src/intel/compiler/brw_eu.h
> +++ b/src/intel/compiler/brw_eu.h
> @@ -485,6 +485,13 @@ brw_typed_surface_write(struct brw_codegen *p,
>  unsigned msg_length,
>  unsigned num_channels);
>
> +void
> +brw_byte_scattered_write(struct brw_codegen *p,
> + struct brw_reg payload,
> + struct brw_reg surface,
> + unsigned msg_length,
> + unsigned bit_size);
> +
>  void
>  brw_memory_fence(struct brw_codegen *p,
>   struct brw_reg dst);
> diff --git a/src/intel/compiler/brw_eu_defines.h
> b/src/intel/compiler/brw_eu_defines.h
> index 9d5cf05c86..de6330ee54 100644
> --- a/src/intel/compiler/brw_eu_defines.h
> +++ b/src/intel/compiler/brw_eu_defines.h
> @@ -402,6 +402,16 @@ enum opcode {
>
> SHADER_OPCODE_RND_MODE,
>
> +   /**
> +* Byte scattered write/read opcodes.
> +*
> +* LOGICAL opcodes are eventually translated to the matching
> non-LOGICAL
> +* opcode, but instead of taking a single payload blog they expect
> their
> +* arguments separately as individual sources, like untyped write/read.
> +*/
> +   SHADER_OPCODE_BYTE_SCATTERED_WRITE,
> +   SHADER_OPCODE_BYTE_SCATTERED_WRITE_LOGICAL,
> +
> SHADER_OPCODE_MEMORY_FENCE,
>
> SHADER_OPCODE_GEN4_SCRATCH_READ,
> @@ -1255,4 +1265,11 @@ enum PACKED brw_rnd_mode {
> BRW_RND_MODE_UNSPECIFIED,  /* Unspecified rounding mode */
>  };
>
> +/* MDC_DS - Data Size Message Descriptor Control Field */
> +enum PACKED brw_data_size {
>

I'm not sure how I feel about this being an enum with such a generic name.


> +   GEN7_BYTE_SCATTERED_DATA_SIZE_BYTE = 0,
> +   GEN7_BYTE_SCATTERED_DATA_SIZE_WORD = 1,
> +   GEN7_BYTE_SCATTERED_DATA_SIZE_DWORD = 2
> +};
> +
>  #endif /* BRW_EU_DEFINES_H */
> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_
> emit.c
> index ca97ff7325..ded7e228cf 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -2580,6 +2580,7 @@ brw_send_indirect_surface_message(struct
> brw_codegen *p,
> return insn;
>  }
>
> +
>  static bool
>  while_jumps_before_offset(const struct gen_device_info *devinfo,
>brw_inst *insn, int while_offset, int
> start_offset)
> @@ -2983,6 +2984,47 @@ brw_untyped_surface_write(struct brw_codegen *p,
>p, insn, num_channels);
>  }
>
> +static enum brw_data_size brw_data_size_from_bit_size(unsigned bit_size)
>

Please put the return type on it's own line.


> +{
> +   switch (bit_size) {
> +   case 8:
> +  return GEN7_BYTE_SCATTERED_DATA_SIZE_BYTE;
> +   case 16:
> +  return GEN7_BYTE_SCATTERED_DATA_SIZE_WORD;
> +   case 32:
> +  return GEN7_BYTE_SCATTERED_DATA_SIZE_DWORD;
> +   default:
> +  unreachable("Unsupported bit_size for byte scattered messages");
> +   }
> +}
> +
> +void
> +brw_byte_scattered_write(struct brw_codegen *p,
> + struct brw_reg payload,
> + struct brw_reg surface,
> + unsigned msg_length,
> + unsigned bit_size)
> +{
> +   assert(brw_inst_access_mode(p->devinfo, p->current) == BRW_ALIGN_1);
> +   const struct gen_device_info *devinfo = p->devinfo;
> +   const unsigned sfid = GEN7_SFID_DATAPORT_DATA_CACHE;
> +
> +   struct brw_inst *insn = brw_send_indirect_surface_message(
> +  p, sfid, brw_writemask(brw_null_reg(), WRITEMASK_XYZW),
> +  payload, surface, msg_length, 0, true);
> +
> +   unsigned msg_control = 

[Mesa-dev] [Bug 103999] 4x MSAA with RG32F shows garbage on triangle edges

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103999

--- Comment #1 from Ernst Sjöstrand  ---
Are you saying it worked with 17.2.5?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] swr/scons: Fix intermittent build failure

2017-11-30 Thread George Kyriazis
gen_rasterizer*.cpp now depend on gen_ar_eventhandler.hpp.
Account for new dependency.
---
 src/gallium/drivers/swr/SConscript | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/swr/SConscript 
b/src/gallium/drivers/swr/SConscript
index e35eff7..9204ecb 100644
--- a/src/gallium/drivers/swr/SConscript
+++ b/src/gallium/drivers/swr/SConscript
@@ -167,6 +167,7 @@ env.CodeGenerate(
 Depends(genRasterizerFiles,
 ['rasterizer/core/backends/gen_rasterizer.hpp',
  'rasterizer/archrast/gen_ar_event.hpp',
+ 'rasterizer/archrast/gen_ar_eventhandler.hpp',
  'rasterizer/codegen/gen_knobs.h']
 )
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests

2017-11-30 Thread Kenneth Graunke
On Friday, November 10, 2017 9:53:28 AM PST Antia Puentes wrote:
> Hi,
> 
> the series sets gl_BaseVertex to zero for gl*DrawArrays* in i965.
[snip]

Hi Antia, Neil,

The end result of this series looks reasonable, but it looks like it's
going to break a bunch of things at each step.  Either that or I'm a bit
lost, sorry.

Could you reorganize/split  the patches so that things keep working at
each step along the way?

>   compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID
>   nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics
>   intel/compiler: Add a uses_basevertexid flag
>   i965: emit basevertexid as a vertex element component

It looks like this moves SYSTEM_VALUE_BASE_VERTEX from VE1 to VE2...but
it doesn't update the compiler to fetch gl_BaseVertex from the new
location.  Plus, we haven't switched to using BASE_VERTEX_ID lowering
for gl_VertexID yet...so prog_data->uses_basevertexid will never be set,
so we won't upload BASE_VERTEX_ID either.

It sure looks like this would break anything using gl_VertexID or
gl_BaseVertex.  Maybe I'm missing something.

>   i965: gl_BaseVertex must be zero for non-indexed draw calls

Again, at this point, we're still using SYSTEM_VALUE_BASE_VERTEX for
gl_VertexID lowering, so this would break gl_VertexID tests.

>   intel/compiler: implement the basevertex(id) load intrinsics
>   nir: Offset vertex_id by base_vertex_id instead of base_vertex
>   i965: Let nir lower gl_VertexID instead of the linker
>   spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX

The order I'd expect to see is:

1. Introduce new SYSTEM_VALUE_FIRST_VERTEX enum and nir intrinsics.

2. Add vs_prog_data::uses_first_vertex or system_values_read field.

3. In Intel code, move system values around to new VE locations.

   Vertex Element 1: 
   Vertex Element 2: 

   This would involve changing intel/compiler and both brw and anv's
   state upload code, in the same patch.

4. In Intel code, start uploading SYSTEM_VALUE_FIRST_VERTEX as VE1.y.
   Update the state upload code and the compiler in the same patch.

   Now all the values are in place, but nothing uses the new one yet.

5. Switch to NIR instead of GLSL IR based Vertex ID lowering.
   (i965: Let nir lower gl_VertexID instead of the linker)

6. Make NIR VertexID lowering use SYSTEM_VALUE_FIRST_VERTEX.
   (nir: Offset vertex_id by base_vertex_id instead of base_vertex)

   Now i965 no longer uses SYSTEM_VALUE_BASE_VERTEX for VertexID,
   so you can safely change the meaning.

7. Fix gl_BaseVertex to be 0 for non-indexed draw calls.
   (i965: gl_BaseVertex must be zero for non-indexed draw calls)

Neil, in case you were wondering, I suggested the above organization
of vertex elements because it would let us only upload 1 in the common
case.  Looking in shader-db, there are 3579 shaders that use
gl_InstanceID, 186 shaders that use gl_VertexID, and 0 shaders that use
gl_BaseVertex, gl_BaseInstance, or gl_DrawID.

It looks like your patches kicked gl_BaseVertex off to VE2 instead of
gl_BaseInstance.  I suppose that works too, I just figured that keeping
VertexID/FirstVertex/BaseVertex together would make sense.  If it's more
convenient to move gl_BaseVertex, I suppose I'm fine with that too...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 07/44] spirv/nir: Handle 16-bit types

2017-11-30 Thread Jason Ekstrand
I sprinkled a few mostly trivial comments below.  With those fixed,

Reviewed-by: Jason Ekstrand 

On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> From: Eduardo Lima Mitev 
>
> v2: Added more missing implementations of 16-bit types. (Jason Ekstrand)
>
> v3: Store values in values[0].u16[i] (Jason Ekstrand)
> Include switches based on bitsize for 16-bit types
> (Chema Casanova)
>
> Signed-off-by: Jose Maria Casanova Crespo 
> Signed-off-by: Eduardo Lima 
> ---
>  src/compiler/spirv/spirv_to_nir.c  | 111 ++
> +--
>  src/compiler/spirv/vtn_variables.c |  21 +++
>  2 files changed, 115 insertions(+), 17 deletions(-)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c
> b/src/compiler/spirv/spirv_to_nir.c
> index 027efab88d..f745373473 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -104,10 +104,13 @@ vtn_const_ssa_value(struct vtn_builder *b,
> nir_constant *constant,
> switch (glsl_get_base_type(type)) {
> case GLSL_TYPE_INT:
> case GLSL_TYPE_UINT:
> +   case GLSL_TYPE_INT16:
> +   case GLSL_TYPE_UINT16:
> case GLSL_TYPE_INT64:
> case GLSL_TYPE_UINT64:
> case GLSL_TYPE_BOOL:
> case GLSL_TYPE_FLOAT:
> +   case GLSL_TYPE_FLOAT16:
> case GLSL_TYPE_DOUBLE: {
>int bit_size = glsl_get_bit_size(type);
>if (glsl_type_is_vector_or_scalar(type)) {
> @@ -751,16 +754,38 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
>int bit_size = w[2];
>const bool signedness = w[3];
>val->type->base_type = vtn_base_type_scalar;
> -  if (bit_size == 64)
> +  switch (bit_size) {
> +  case 64:
>   val->type->type = (signedness ? glsl_int64_t_type() :
> glsl_uint64_t_type());
> -  else
> + break;
> +  case 32:
>   val->type->type = (signedness ? glsl_int_type() :
> glsl_uint_type());
> + break;
> +  case 16:
> + val->type->type = (signedness ? glsl_int16_t_type() :
> glsl_uint16_t_type());
> + break;
> +  default:
> + unreachable("Invalid int bit size");
> +  }
>break;
> }
> +
> case SpvOpTypeFloat: {
>int bit_size = w[2];
>val->type->base_type = vtn_base_type_scalar;
> -  val->type->type = bit_size == 64 ? glsl_double_type() :
> glsl_float_type();
> +  switch (bit_size) {
> +  case 16:
> + val->type->type = glsl_float16_t_type();
> + break;
> +  case 32:
> + val->type->type = glsl_float_type();
> + break;
> +  case 64:
> + val->type->type = glsl_double_type();
> + break;
> +  default:
> + assert(!"Invalid float bit size");
>

unreachable()


> +  }
>break;
> }
>
> @@ -980,10 +1005,13 @@ vtn_null_constant(struct vtn_builder *b, const
> struct glsl_type *type)
> switch (glsl_get_base_type(type)) {
> case GLSL_TYPE_INT:
> case GLSL_TYPE_UINT:
> +   case GLSL_TYPE_INT16:
> +   case GLSL_TYPE_UINT16:
> case GLSL_TYPE_INT64:
> case GLSL_TYPE_UINT64:
> case GLSL_TYPE_BOOL:
> case GLSL_TYPE_FLOAT:
> +   case GLSL_TYPE_FLOAT16:
> case GLSL_TYPE_DOUBLE:
>/* Nothing to do here.  It's already initialized to zero */
>break;
> @@ -1106,12 +1134,20 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp
> opcode,
> case SpvOpConstant: {
>assert(glsl_type_is_scalar(val->const_type));
>int bit_size = glsl_get_bit_size(val->const_type);
> -  if (bit_size == 64) {
> +  switch (bit_size) {
> +  case 64: {
>   val->constant->values->u32[0] = w[3];
>   val->constant->values->u32[1] = w[4];
>

A bit unrelated but this should be using vtn_u64_literal and setting u64[0]
instead of assuming the aliasing works out between u32 and u64.


> -  } else {
> - assert(bit_size == 32);
> + break;
> +  }
>

You don't need braces around the 64-bit case.


> +  case 32:
>   val->constant->values->u32[0] = w[3];
> + break;
> +  case 16:
> + val->constant->values->u16[0] = w[3];
> + break;
> +  default:
> + unreachable("Unsupported SpvOpConstant bit size");
>}
>break;
> }
> @@ -1119,11 +1155,21 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp
> opcode,
>assert(glsl_type_is_scalar(val->const_type));
>val->constant->values[0].u32[0] = get_specialization(b, val, w[3]);
>int bit_size = glsl_get_bit_size(val->const_type);
> -  if (bit_size == 64)
> +  switch (bit_size) {
> +  case 64:{
>   val->constant->values[0].u64[0] =
>  get_specialization64(b, val, vtn_u64_literal([3]));
> -  else
> + break;
> +  }
>

Same here.


> +  case 32:
>   val->constant->values[0].u32[0] = get_specialization(b, val,
> w[3]);

Re: [Mesa-dev] [PATCH v2] swr: Correct texture allocation and limit max size to 2GB

2017-11-30 Thread Kyriazis, George
Looks good.

Reviewed-By: George Kyriazis 
>

On Nov 30, 2017, at 11:51 AM, Bruce Cherniak 
> wrote:

This patch fixes piglit tex3d-maxsize by correcting 4 things:

The total_size calculation was using 32-bit math, therefore a >4GB
allocation request overflowed and was not returning false (unsupported).

Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle
4GB allocations.

Added error checking on texture allocations to fail gracefully.

Finally, temporarily decreased supported max texture size from 4GB to 2GB.
The gallivm texture-sampler needs some additional work to correctly handle
larger than 2GB textures (offsets to LLVMBuildGEP are signed).

I'm working on a follow-on patch to allow up to 4GB textures, as this is
useful in HPC visualization applications.

Fixes piglit tex3d-maxsize.

v2: Updated patch description to clarify ">4GB".
---
src/gallium/drivers/swr/rasterizer/common/os.h |  2 +-
src/gallium/drivers/swr/swr_screen.cpp | 12 +---
2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h 
b/src/gallium/drivers/swr/rasterizer/common/os.h
index 4ed6b88e45..358cb33b6e 100644
--- a/src/gallium/drivers/swr/rasterizer/common/os.h
+++ b/src/gallium/drivers/swr/rasterizer/common/os.h
@@ -210,7 +210,7 @@ unsigned char _BitScanReverse(unsigned int *Index, unsigned 
int Mask)
}

inline
-void *AlignedMalloc(unsigned int size, unsigned int alignment)
+void *AlignedMalloc(size_t size, size_t alignment)
{
void *ret;
if (posix_memalign(, alignment, size))
diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 3f2433e65a..71a07ebe8d 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -50,7 +50,7 @@
 * Max texture sizes
 * XXX Check max texture size values against core and sampler.
 */
-#define SWR_MAX_TEXTURE_SIZE (4 * 1024 * 1024 * 1024ULL) /* 4GB */
+#define SWR_MAX_TEXTURE_SIZE (2 * 1024 * 1024 * 1024ULL) /* 2GB */
#define SWR_MAX_TEXTURE_2D_LEVELS 14  /* 8K x 8K for now */
#define SWR_MAX_TEXTURE_3D_LEVELS 12  /* 2K x 2K x 2K for now */
#define SWR_MAX_TEXTURE_CUBE_LEVELS 14  /* 8K x 8K for now */
@@ -821,13 +821,15 @@ swr_texture_layout(struct swr_screen *screen,
 ComputeSurfaceOffset(0, 0, 0, 0, 0, level, >swr);
   }

-   size_t total_size = res->swr.depth * res->swr.qpitch * res->swr.pitch *
-   res->swr.numSamples;
+   size_t total_size = (uint64_t)res->swr.depth * res->swr.qpitch *
+ res->swr.pitch * res->swr.numSamples;
   if (total_size > SWR_MAX_TEXTURE_SIZE)
  return false;

   if (allocate) {
  res->swr.xpBaseAddress = (gfxptr_t)AlignedMalloc(total_size, 64);
+  if (!res->swr.xpBaseAddress)
+ return false;

  if (res->has_depth && res->has_stencil) {
 res->secondary = res->swr;
@@ -843,6 +845,10 @@ swr_texture_layout(struct swr_screen *screen,
  res->secondary.pitch * res->secondary.numSamples;

 res->secondary.xpBaseAddress = (gfxptr_t) AlignedMalloc(total_size, 
64);
+ if (!res->secondary.xpBaseAddress) {
+AlignedFree((void *)res->swr.xpBaseAddress);
+return false;
+ }
  }
   }

--
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radv: do not store gfx9_epitch in radv_color_buffer_info

2017-11-30 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Nov 30, 2017 at 2:32 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 7 ---
>  src/amd/vulkan/radv_device.c | 3 ---
>  src/amd/vulkan/radv_private.h| 1 -
>  3 files changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index bd72ba2a87..18a1c55ad1 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1181,9 +1181,10 @@ radv_emit_depth_biais(struct radv_cmd_buffer 
> *cmd_buffer)
>  static void
>  radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer,
>  int index,
> -struct radv_color_buffer_info *cb)
> +struct radv_attachment_info *att)
>  {
> bool is_vi = cmd_buffer->device->physical_device->rad_info.chip_class 
> >= VI;
> +   struct radv_color_buffer_info *cb = >cb;
>
> if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) 
> {
> radeon_set_context_reg_seq(cmd_buffer->cs, 
> R_028C60_CB_COLOR0_BASE + index * 0x3c, 11);
> @@ -1204,7 +1205,7 @@ radv_emit_fb_color_state(struct radv_cmd_buffer 
> *cmd_buffer,
> radeon_emit(cmd_buffer->cs, cb->cb_dcc_base >> 32);
>
> radeon_set_context_reg(cmd_buffer->cs, 
> R_0287A0_CB_MRT0_EPITCH + index * 4,
> -  cb->gfx9_epitch);
> +  
> S_0287A0_EPITCH(att->attachment->image->surface.u.gfx9.surf.epitch));
> } else {
> radeon_set_context_reg_seq(cmd_buffer->cs, 
> R_028C60_CB_COLOR0_BASE + index * 0x3c, 11);
> radeon_emit(cmd_buffer->cs, cb->cb_color_base);
> @@ -1464,7 +1465,7 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer 
> *cmd_buffer)
> radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, 
> att->attachment->bo, 8);
>
> assert(att->attachment->aspect_mask & 
> VK_IMAGE_ASPECT_COLOR_BIT);
> -   radv_emit_fb_color_state(cmd_buffer, i, >cb);
> +   radv_emit_fb_color_state(cmd_buffer, i, att);
>
> radv_load_color_clear_regs(cmd_buffer, 
> att->attachment->image, i);
> }
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 8e5ae0bc46..336cb125a0 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -3077,9 +3077,6 @@ radv_initialise_color_surface(struct radv_device 
> *device,
> cb->cb_color_attrib2 = 
> S_028C68_MIP0_WIDTH(iview->extent.width - 1) |
> S_028C68_MIP0_HEIGHT(iview->extent.height - 1) |
> S_028C68_MAX_MIP(iview->image->info.levels - 1);
> -
> -   cb->gfx9_epitch = 
> S_0287A0_EPITCH(iview->image->surface.u.gfx9.surf.epitch);
> -
> }
>  }
>
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index addd35e5ce..e8c7af7c64 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -1456,7 +1456,6 @@ struct radv_color_buffer_info {
> uint32_t cb_clear_value0;
> uint32_t cb_clear_value1;
> uint32_t micro_tile_mode;
> -   uint32_t gfx9_epitch;
>  };
>
>  struct radv_ds_buffer_info {
> --
> 2.15.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/12] anv: Add support for the variablePointers feature

2017-11-30 Thread Jason Ekstrand
On Thu, Nov 30, 2017 at 11:52 AM, Kristian Høgsberg 
wrote:

> On Thu, Oct 19, 2017 at 11:04 AM, Jason Ekstrand 
> wrote:
> > Not to be confused with variablePointersStorageBuffer which is the
> > subset of VK_KHR_variable_pointers required to enable the extension.
> > This gives us "full" support for variable pointers.
> >
> > The approach chosen here was to do the lowering to _shared intrinsics
> > directly in spirv_to_nir instead of using the _var intrinsics and using
> > nir_lower_io.  Pointers with a storage class of Workgroup are given an
> > implicit std430 layout and now go through the same offset pointer paths
> as
> > UBO and SSBO access.  The whole thing really ended up working out rather
> > cleanly.
> >
> > There are some downsides to this approach.  One, is that we can't delete
> > unused shared variables post-optimization.  Also, the driver may be able
> to
> > handle better than std430.  Both of these can lead to some waisted SLM
> > space.  This also means that we can't do any deref-based load/store
> > elimination optimizations on SLM but we didn't really before so that's no
> > great loss; SLM is now exactly as hard to optimize as SSBOs.
>
> I was going to take a quick look at this and ended up doing a proper
> review. Very readable and clean series. I thought it was a little odd
> (unmotivated) to add the struct vtn_builder argument to
> vtn_pointer_uses_ssa_offset in "spirv: Refactor a couple of pointer
> query helpers"  when it isn't used until "spirv: Add support for
> lowering workgroup access to offsets", but whatevs.


I mostly did that because I wanted to avoid churn in the patch that started
us using it.  It's annoying to have to touch all those places again.


> I'm curious how
> much better you think a backend should be able to pack? What kind of
> improvements over std430 can you realistically expect?


We would probably do a bit better because we'd just tightly pack everything
with maybe do natural alignment but it's probably not a huge improvement.


> Also,what kind
> of unused shared variables wouldn't nir itself be able to remove?
>

Once we've lowered to offsets in SPIR-V -> NIR, everything has a fixed
address/offset and there's no good way to shuffle things around and
re-pack.  If the shader has a bunch of variables which aren't statically
used (only used inside an "if (0)" for instance) then we would be able to
eliminate them during the NIR optimization process and reduce the size of
SLM.  My primary fear was when processing a module with a large number of
shaders that we would cause the shared memory space to explode with
variables that aren't used but patch 2 should get rid of most of that.


> Reviewed-by: Kristian H. Kristensen 
>

Thanks!


>
> > Connor, Yes, I know that this is not quite the approach you were
> suggesting
> > on IRC.  I considered how we might add some sort of deref intrinsic and I
> > don't see a good way of doing so without rewriting large chunks of NIR.
> I
> > think that rewrite is probably worth it some day but that day is not
> today.
> > We people asking for this feature so I really don't want to delay on a
> > major NIR rewrite.
> >
> > Cc: Connor Abbott 
> > Cc: Chad Versace 
> > Cc: Dave Airlie 
> >
> > Jason Ekstrand (12):
> >   spirv: Drop the impl field from vtn_builder
> >   spirv: Only emit functions which are actually used
> >   spirv: Use a dereference instead of vtn_variable_resource_index
> >   spirv: Add a switch statement for the block store opcode
> >   spirv: Refactor the base case of offset_pointer_dereference
> >   spirv: Convert the supported_extensions struct to spirv_options
> >   spirv: Refactor a couple of pointer query helpers
> >   spirv: Use offset_pointer_dereference to instead of
> > get_vulkan_resource_index
> >   spirv: Add theoretical support for single component pointers
> >   spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
> >   spirv: Add support for lowering workgroup access to offsets
> >   anv: Add support for the variablePointers feature
> >
> >  src/amd/vulkan/radv_shader.c   |  23 ++--
> >  src/compiler/spirv/nir_spirv.h |  34 --
> >  src/compiler/spirv/spirv_to_nir.c  | 180 -
> >  src/compiler/spirv/vtn_cfg.c   |   4 +-
> >  src/compiler/spirv/vtn_private.h   |  30 +++--
> >  src/compiler/spirv/vtn_variables.c | 229 --
> ---
> >  src/intel/vulkan/anv_device.c  |   2 +-
> >  src/intel/vulkan/anv_pipeline.c|  25 ++--
> >  8 files changed, 372 insertions(+), 155 deletions(-)
> >
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list

Re: [Mesa-dev] [PATCH] radv: do not set DISABLE_LSB_CEIL on GFX9

2017-11-30 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Nov 30, 2017 at 8:58 PM, Samuel Pitoiset
 wrote:
> The state no longer exists on GFX9.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 62591c7e3d..7b1b20e335 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -3439,7 +3439,7 @@ radv_init_sampler(struct radv_device *device,
>  
> S_008F38_XY_MIN_FILTER(radv_tex_filter(pCreateInfo->minFilter, max_aniso)) |
>  
> S_008F38_MIP_FILTER(radv_tex_mipfilter(pCreateInfo->mipmapMode)) |
>  S_008F38_MIP_POINT_PRECLAMP(0) |
> -S_008F38_DISABLE_LSB_CEIL(1) |
> +
> S_008F38_DISABLE_LSB_CEIL(device->physical_device->rad_info.chip_class <= VI) 
> |
>  S_008F38_FILTER_PREC_FIX(1) |
>  S_008F38_ANISO_OVERRIDE(is_vi));
> sampler->state[3] = (S_008F3C_BORDER_COLOR_PTR(0) |
> --
> 2.15.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Lyude Paul
On Thu, 2017-11-30 at 13:20 -0500, Rob Clark wrote:
> On Thu, Nov 30, 2017 at 12:59 AM, James Jones  wrote:
> > On 11/29/2017 04:09 PM, Miguel Angel Vico wrote:
> > > 
> > > On Wed, 29 Nov 2017 16:28:15 -0500
> > > Rob Clark  wrote:
> > > > 
> > > > Do we need to define both in-place and copy transitions?  Ie. what if
> > > > GPU is still reading a tiled or compressed texture (ie. sampling from
> > > > previous frame for some reason), but we need to untile/uncompress for
> > > > display.. of maybe there are some other cases like that we should
> > > > think about..
> > > > 
> > > > Maybe you already have some thoughts about that?
> > > 
> > > 
> > > This is the next thing I'll be working on. I haven't given it much
> > > thought myself so far, but I think James might have had some insights.
> > > I'll read through some of his notes to double-check.
> > 
> > 
> > A couple of notes on usage transitions:
> > 
> > While chatting about transitions, a few assertions were made by others
> > that
> > I've come to accept, despite the fact that they reduce the generality of
> > the
> > allocator mechanisms:
> > 
> > -GPUs are the only things that actually need usage transitions as far as I
> > know thus far.  Other engines either share the GPU representations of
> > data,
> > or use more limited representations; the latter being the reason non-GPU
> > usage transitions are a useful thing.
> > 
> > -It's reasonable to assume that a GPU is required to perform a usage
> > transition.  This follows from the above postulate.  If only GPUs are
> > using
> > more advanced representations, you don't need any transitions unless you
> > have a GPU available.
> 
> This seems reasonable.  I can't think of any non-gpu related case
> where you would need a transition, other than perhaps cache flush/inv.
> 
> > From that, I derived the rough API proposal for transitions presented on
> > my
> > XDC 2017 slides.  Transition "metadata" is queried from the allocator
> > given
> > a pair of usages (which may refer to more than one device), but the
> > realization of the transition is left to existing GPU APIs.  I think I put
> > Vulkan-like pseudo-code in the slides, but the GL external objects
> > extensions (GL_EXT_memory_object and GL_EXT_semaphore) would work as well.
> 
> I haven't quite wrapped my head around how this would work in the
> cross-device case.. I mean from the API standpoint for the user, it
> seems straightforward enough.  Just not sure how to implement that and
> what the driver interface would look like.
> 
> I guess we need a capability-conversion (?).. I mean take for example
> the the fb compression capability from your slide #12[1].  If we knew
> there was an available transition to go from "Dev2 FB compression" to
> "normal", then we could have allowed the "Dev2 FB compression" valid
> set?
> 
> [1] https://www.x.org/wiki/Events/XDC2017/jones_allocator.pdf
> 
> > Regarding in-place Vs. copy: To me a transition is something that happens
> > in-place, at least semantically.  If you need to make copies, that's a
> > format conversion blit not a transition, and graphics APIs are already
> > capable of expressing that without any special transitions or help from
> > the
> > allocator.  However, I understand some chipsets perform transitions using
> > something that looks kind of like a blit using on-chip caches and
> > constrained usage semantics.  There's probably some work to do to see
> > whether those need to be accommodated as conversion blits or usgae
> > transitions.
> 
> I guess part of what I was thinking of, is what happens if the
> producing device is still reading from the buffer.  For example,
> viddec -> gpu use case, where the video decoder is also still hanging
> on to the frame to use as a reference frame to decode future frames?
> 
> I guess if transition from devA -> devB can be done in parallel with
> devA still reading the buffer, it isn't a problem.  I guess that
> limits (non-blit) transitions to decompression and cache op's?  Maybe
> that is ok..
> 
> > For our hardware's purposes, transitions are just various levels of
> > decompression or compression reconfiguration and potentially cache
> > flushing/invalidation, so our transition metadata will just be some bits
> > signaling which compression operation is needed, if any.  That's the sort
> > of
> > operation I modeled the API around, so if things are much more exotic than
> > that for others, it will probably require some adjustments.
> > 
> 
> 
> [snip]
> 
> > 
> > Gralloc-on-$new_thing, as well as hwcomposer-on-$new_thing is one of my
> > primary goals.  However, it's a pretty heavy thing to prototype.  If
> > someone
> > has the time though, I think it would be a great experiment.  It would
> > help
> > flesh out the paltry list of usages, constraints, and capabilities in the
> > existing prototype codebase.  The kmscube example really should have added
> > at least a "render" usage, but I 

Re: [Mesa-dev] [PATCH 13/29] anv/cmd_buffer: Rework aux tracking

2017-11-30 Thread Pohjolainen, Topi
On Mon, Nov 27, 2017 at 07:06:03PM -0800, Jason Ekstrand wrote:
> This makes us start tracking two bits per level for aux to describe both
> whether or not something is fast-cleared and whether or not it is
> compressed as opposed to a single "needs resolve" bit.  The previous
> scheme only worked because we assumed that CCS_E compressed images would
> always be read with CCS_E enabled and so they didn't need any sort of
> resolve if there was no fast clear color.
> 
> The assumptions of the previous scheme held because the one case where
> we do need a full resolve when CCS_E is enabled is for window-system
> images.  Since we only ever allowed X-tiled window-system images, CCS
> was entirely disabled on gen9+ and we never got CCS_E.  With the advent
> of Y-tiled window-system buffers, we now need to properly support doing
> a full resolve images marked CCS_E.  This requires us to track things
> more granularly because, if the client does two back-to-back transitions
> where we first do a partial resolve and then a full resolve, we need
> both resolves to happen.
> ---
>  src/intel/vulkan/anv_private.h |  10 +--
>  src/intel/vulkan/genX_cmd_buffer.c | 158 
> ++---
>  2 files changed, 135 insertions(+), 33 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index f805246..e875705 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -2476,16 +2476,16 @@ anv_fast_clear_state_entry_size(const struct 
> anv_device *device)
>  {
> assert(device);
> /* Entry contents:
> -*   ++
> -*   | clear value dword(s) | needs resolve dword |
> -*   ++
> +*   +-+
> +*   | clear value dword(s) | needs resolve dwords |
> +*   +-+
>  */
>  
> -   /* Ensure that the needs resolve dword is in fact dword-aligned to enable
> +   /* Ensure that the needs resolve dwords are in fact dword-aligned to 
> enable
>  * GPU memcpy operations.
>  */
> assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> -   return device->isl_dev.ss.clear_value_size + 4;
> +   return device->isl_dev.ss.clear_value_size + 8;
>  }
>  
>  static inline struct anv_address
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 6aeffa3..2d47179 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -404,6 +404,22 @@ transition_depth_buffer(struct anv_cmd_buffer 
> *cmd_buffer,
> 0, 0, 1, hiz_op);
>  }
>  
> +/** Bitfield representing the needed resolves.
> + *
> + * This bitfield represents the kinds of compression we may or may not have 
> in
> + * a given image.  The ANV_IMAGE_NEEDS_* can be ANDed with the ANV_IMAGE_HAS
> + * bits to determine when a given resolve actually needs to happen.
> + *
> + * For convenience of dealing with MI_PREDICATE, we blow these two bits out
> + * into two dwords in the image meatadata page.

s/meatadata/metadata/

> + */
> +enum anv_image_resolve_bits {
> +   ANV_IMAGE_HAS_FAST_CLEAR_BIT   = 0x1,
> +   ANV_IMAGE_HAS_COMPRESSION_BIT  = 0x2,
> +   ANV_IMAGE_NEEDS_PARTIAL_RESOLVE_BITS   = 0x1,
> +   ANV_IMAGE_NEEDS_FULL_RESOLVE_BITS  = 0x3,
> +};
> +
>  #define MI_PREDICATE_SRC0  0x2400
>  #define MI_PREDICATE_SRC1  0x2408
>  
> @@ -411,23 +427,62 @@ transition_depth_buffer(struct anv_cmd_buffer 
> *cmd_buffer,
>   * performed properly.
>   */
>  static void
> -set_image_needs_resolve(struct anv_cmd_buffer *cmd_buffer,
> -const struct anv_image *image,
> -VkImageAspectFlagBits aspect,
> -unsigned level, bool needs_resolve)
> +set_image_needs_resolve_bits(struct anv_cmd_buffer *cmd_buffer,
> + const struct anv_image *image,
> + VkImageAspectFlagBits aspect,
> + unsigned level,
> + enum anv_image_resolve_bits set_bits)
>  {
> assert(cmd_buffer && image);
> assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> assert(level < anv_image_aux_levels(image, aspect));
>  
> -   /* The HW docs say that there is no way to guarantee the completion of
> -* the following command. We use it nevertheless because it shows no
> -* issues in testing is currently being used in the GL driver.
> -*/

Why are we dropping this comment now?

> -   anv_batch_emit(_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
> -  sdi.Address = anv_image_get_needs_resolve_addr(cmd_buffer->device,
> - image, aspect, level);
> -  sdi.ImmediateData = needs_resolve;
> +   const struct anv_address resolve_flag_addr =
> +  

[Mesa-dev] [PATCH] radv: do not set DISABLE_LSB_CEIL on GFX9

2017-11-30 Thread Samuel Pitoiset
The state no longer exists on GFX9.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 62591c7e3d..7b1b20e335 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3439,7 +3439,7 @@ radv_init_sampler(struct radv_device *device,
 
S_008F38_XY_MIN_FILTER(radv_tex_filter(pCreateInfo->minFilter, max_aniso)) |
 
S_008F38_MIP_FILTER(radv_tex_mipfilter(pCreateInfo->mipmapMode)) |
 S_008F38_MIP_POINT_PRECLAMP(0) |
-S_008F38_DISABLE_LSB_CEIL(1) |
+
S_008F38_DISABLE_LSB_CEIL(device->physical_device->rad_info.chip_class <= VI) |
 S_008F38_FILTER_PREC_FIX(1) |
 S_008F38_ANISO_OVERRIDE(is_vi));
sampler->state[3] = (S_008F3C_BORDER_COLOR_PTR(0) |
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/12] anv: Add support for the variablePointers feature

2017-11-30 Thread Kristian Høgsberg
On Thu, Oct 19, 2017 at 11:04 AM, Jason Ekstrand  wrote:
> Not to be confused with variablePointersStorageBuffer which is the
> subset of VK_KHR_variable_pointers required to enable the extension.
> This gives us "full" support for variable pointers.
>
> The approach chosen here was to do the lowering to _shared intrinsics
> directly in spirv_to_nir instead of using the _var intrinsics and using
> nir_lower_io.  Pointers with a storage class of Workgroup are given an
> implicit std430 layout and now go through the same offset pointer paths as
> UBO and SSBO access.  The whole thing really ended up working out rather
> cleanly.
>
> There are some downsides to this approach.  One, is that we can't delete
> unused shared variables post-optimization.  Also, the driver may be able to
> handle better than std430.  Both of these can lead to some waisted SLM
> space.  This also means that we can't do any deref-based load/store
> elimination optimizations on SLM but we didn't really before so that's no
> great loss; SLM is now exactly as hard to optimize as SSBOs.

I was going to take a quick look at this and ended up doing a proper
review. Very readable and clean series. I thought it was a little odd
(unmotivated) to add the struct vtn_builder argument to
vtn_pointer_uses_ssa_offset in "spirv: Refactor a couple of pointer
query helpers"  when it isn't used until "spirv: Add support for
lowering workgroup access to offsets", but whatevs. I'm curious how
much better you think a backend should be able to pack? What kind of
improvements over std430 can you realistically expect? Also,what kind
of unused shared variables wouldn't nir itself be able to remove?

Reviewed-by: Kristian H. Kristensen 


> Connor, Yes, I know that this is not quite the approach you were suggesting
> on IRC.  I considered how we might add some sort of deref intrinsic and I
> don't see a good way of doing so without rewriting large chunks of NIR.  I
> think that rewrite is probably worth it some day but that day is not today.
> We people asking for this feature so I really don't want to delay on a
> major NIR rewrite.
>
> Cc: Connor Abbott 
> Cc: Chad Versace 
> Cc: Dave Airlie 
>
> Jason Ekstrand (12):
>   spirv: Drop the impl field from vtn_builder
>   spirv: Only emit functions which are actually used
>   spirv: Use a dereference instead of vtn_variable_resource_index
>   spirv: Add a switch statement for the block store opcode
>   spirv: Refactor the base case of offset_pointer_dereference
>   spirv: Convert the supported_extensions struct to spirv_options
>   spirv: Refactor a couple of pointer query helpers
>   spirv: Use offset_pointer_dereference to instead of
> get_vulkan_resource_index
>   spirv: Add theoretical support for single component pointers
>   spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
>   spirv: Add support for lowering workgroup access to offsets
>   anv: Add support for the variablePointers feature
>
>  src/amd/vulkan/radv_shader.c   |  23 ++--
>  src/compiler/spirv/nir_spirv.h |  34 --
>  src/compiler/spirv/spirv_to_nir.c  | 180 -
>  src/compiler/spirv/vtn_cfg.c   |   4 +-
>  src/compiler/spirv/vtn_private.h   |  30 +++--
>  src/compiler/spirv/vtn_variables.c | 229 
> -
>  src/intel/vulkan/anv_device.c  |   2 +-
>  src/intel/vulkan/anv_pipeline.c|  25 ++--
>  8 files changed, 372 insertions(+), 155 deletions(-)
>
> --
> 2.5.0.400.gff86faf
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Lyude Paul
On Thu, 2017-11-30 at 14:20 -0500, Alex Deucher wrote:
> On Thu, Nov 30, 2017 at 2:10 PM, Nicolai Hähnle  wrote:
> > On 30.11.2017 19:52, Rob Clark wrote:
> > > 
> > > On Thu, Nov 30, 2017 at 4:21 AM, Nicolai Hähnle 
> > > wrote:
> > > > 
> > > > On 30.11.2017 01:09, Miguel Angel Vico wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > It seems to me that $new_thing should grow as a separate thing
> > > > > > > whether
> > > > > > > it ends up replacing GBM or GBM internals are somewhat rewritten
> > > > > > > on
> > > > > > > top
> > > > > > > of it. If I'm reading you both correctly, you agree with that,
> > > > > > > so in
> > > > > > > order to move forward, should we go ahead and create a project
> > > > > > > in
> > > > > > > fd.o?
> > > > > > > 
> > > > > > > Before filing the new project request though, we should find an
> > > > > > > appropriate name for $new_thing. Creativity isn't one of my
> > > > > > > strengths,
> > > > > > > but I'll go ahead and start the bikeshedding with "Generic
> > > > > > > Device
> > > > > > > Memory Allocator" or "Generic Device Memory Manager".
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > liballoc - Generic Device Memory Allocator ... seems reasonable to
> > > > > > me..
> > > > > 
> > > > > 
> > > > > 
> > > > > Cool. If there aren't better suggestions, we can go with that. We
> > > > > should also namespace all APIs and structures. Is 'galloc'
> > > > > distinctive
> > > > > enough to be used as namespace? Being an 'r' away from gralloc maybe
> > > > > it's a bit confusing?
> > > > 
> > > > 
> > > > 
> > > > libgalloc with a galloc prefix seems fine.
> > > > 
> > > 
> > > I keep reading "galloc" as "gralloc".. I suspect that will be
> > > confusing.  Maybe libgal/gal_.. or just liballoc/al_?
> > 
> > 
> > True, but liballoc is *very* generic.
> > 
> > libimagealloc?
> > libsurfacealloc?
> > contractions thereof?
> 
> libdevicealloc?

libhwalloc

> 
> Alex
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] meson: install dri internal headers.

2017-11-30 Thread Dylan Baker
Quoting Emil Velikov (2017-11-30 11:12:12)
> On 30 November 2017 at 18:58, Dylan Baker  wrote:
> > Quoting Emil Velikov (2017-11-30 10:49:19)
> >> s/headers/header/ in the title. With that the series is
> >> Reviewed-by: Emil Velikov 
> >>
> >> -Emil
> >
> > Thanks Emil.
> >
> > Should we be installing the dri.pc when you build gallium drivers as dri
> > drivers, but no "pure" dri drivers?
> >
> dri.pc should be installed whenever any *_dri.so module is build/installed.
> 
> -Emil

Thats what I thought. I'll need to fix that too. Thanks again.


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/9] intel/compiler: Add a uses_basevertexid flag

2017-11-30 Thread Kenneth Graunke
On Friday, November 10, 2017 9:53:31 AM PST Antia Puentes wrote:
> From: Neil Roberts 
> 
> ---
>  src/intel/compiler/brw_compiler.h | 1 +
>  src/intel/compiler/brw_vec4.cpp   | 4 
>  2 files changed, 5 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_compiler.h 
> b/src/intel/compiler/brw_compiler.h
> index df6ee018546..6b5b73a54f0 100644
> --- a/src/intel/compiler/brw_compiler.h
> +++ b/src/intel/compiler/brw_compiler.h
> @@ -969,6 +969,7 @@ struct brw_vs_prog_data {
> bool uses_vertexid;
> bool uses_instanceid;
> bool uses_basevertex;
> +   bool uses_basevertexid;
> bool uses_baseinstance;
> bool uses_drawid;
>  };
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index bbe4585e0c7..d996ab8c89f 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -2795,6 +2795,10 @@ brw_compile_vs(const struct brw_compiler *compiler, 
> void *log_data,
> BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
>prog_data->uses_basevertex = true;
>  
> +   if (shader->info.system_values_read &
> +   BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX_ID))
> +  prog_data->uses_basevertexid = true;
> +
> if (shader->info.system_values_read &
> BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE))
>prog_data->uses_baseinstance = true;
> 

This is fine, and gets a:

Reviewed-by: Kenneth Graunke 

but I'm also wondering whether it'd make sense to just include
system_values_read here or in brw_stage_prog_data.  We're up to 6
booleans that come from flags, which seems a little silly...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/9] nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics

2017-11-30 Thread Kenneth Graunke
On Friday, November 10, 2017 9:53:30 AM PST Antia Puentes wrote:
> Reviewed-by: Neil Roberts 
> ---
>  src/compiler/nir/nir.c | 4 
>  src/compiler/nir/nir_gather_info.c | 1 +
>  src/compiler/nir/nir_intrinsics.h  | 1 +
>  3 files changed, 6 insertions(+)
> 
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index 7380bf436a8..6f0477b0676 100644
> --- a/src/compiler/nir/nir.c
> +++ b/src/compiler/nir/nir.c
> @@ -1913,6 +1913,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
>return nir_intrinsic_load_base_instance;
> case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
>return nir_intrinsic_load_vertex_id_zero_base;
> +   case SYSTEM_VALUE_BASE_VERTEX_ID:
> +  return nir_intrinsic_load_base_vertex_id;
> case SYSTEM_VALUE_BASE_VERTEX:
>return nir_intrinsic_load_base_vertex;
> case SYSTEM_VALUE_INVOCATION_ID:
> @@ -1982,6 +1984,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
>return SYSTEM_VALUE_BASE_INSTANCE;
> case nir_intrinsic_load_vertex_id_zero_base:
>return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
> +   case nir_intrinsic_load_base_vertex_id:
> +  return SYSTEM_VALUE_BASE_VERTEX_ID;
> case nir_intrinsic_load_base_vertex:
>return SYSTEM_VALUE_BASE_VERTEX;
> case nir_intrinsic_load_invocation_id:
> diff --git a/src/compiler/nir/nir_gather_info.c 
> b/src/compiler/nir/nir_gather_info.c
> index 13cdae39eca..cf2abb8b8ed 100644
> --- a/src/compiler/nir/nir_gather_info.c
> +++ b/src/compiler/nir/nir_gather_info.c
> @@ -232,6 +232,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
> nir_shader *shader)
> case nir_intrinsic_load_vertex_id:
> case nir_intrinsic_load_vertex_id_zero_base:
> case nir_intrinsic_load_base_vertex:
> +   case nir_intrinsic_load_base_vertex_id:
> case nir_intrinsic_load_base_instance:
> case nir_intrinsic_load_instance_id:
> case nir_intrinsic_load_sample_id:
> diff --git a/src/compiler/nir/nir_intrinsics.h 
> b/src/compiler/nir/nir_intrinsics.h
> index 20bef339ac4..78123dd927a 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -326,6 +326,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx)
>  SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx)
> +SYSTEM_VALUE(base_vertex_id, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx)
> 

Reviewed-by: Kenneth Graunke 

(perhaps s/base_vertex_id/first_vertex/g to go with my suggestion on
patch 1...)



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 06/44] nir: Handle fp16 rounding modes at nir_type_conversion_op

2017-11-30 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> nir_type_conversion enables new operations to handle rounding modes to
> convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne
> and nir_op_f2f16_rtz.
>
> The undefined behaviour doesn't has any effect and uses the original
> nir_op_f2f16 operation.
>
> v2: Indentation fixed (Jason Ekstrand)
>
> v3: Use explicit case for undefined rounding and assert if
> rounding mode is used for non 16-bit float conversions
> (Jason Ekstrand)
> ---
>  src/compiler/glsl/glsl_to_nir.cpp |  3 ++-
>  src/compiler/nir/nir.h|  3 ++-
>  src/compiler/nir/nir_opcodes.py   | 11 +--
>  src/compiler/nir/nir_opcodes_c.py | 15 ++-
>  src/compiler/spirv/vtn_alu.c  |  2 +-
>  5 files changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/src/compiler/glsl/glsl_to_nir.cpp
> b/src/compiler/glsl/glsl_to_nir.cpp
> index 33ebc13edb..1e636225c1 100644
> --- a/src/compiler/glsl/glsl_to_nir.cpp
> +++ b/src/compiler/glsl/glsl_to_nir.cpp
> @@ -1575,7 +1575,8 @@ nir_visitor::visit(ir_expression *ir)
> case ir_unop_u642i64: {
>nir_alu_type src_type = nir_get_nir_type_for_glsl_
> base_type(types[0]);
>nir_alu_type dst_type = nir_get_nir_type_for_glsl_
> base_type(out_type);
> -  result = nir_build_alu(, nir_type_conversion_op(src_type,
> dst_type),
> +  result = nir_build_alu(, nir_type_conversion_op(src_type,
> dst_type,
> + nir_rounding_mode_undef),
>   srcs[0], NULL, NULL, NULL);
>/* b2i and b2f don't have fixed bit-size versions so the builder
> will
> * just assume 32 and we have to fix it up here.
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 883f371d1f..c70c0e0220 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -761,7 +761,8 @@ nir_get_nir_type_for_glsl_type(const struct glsl_type
> *type)
> return nir_get_nir_type_for_glsl_base_type(glsl_get_base_type(type));
>  }
>
> -nir_op nir_type_conversion_op(nir_alu_type src, nir_alu_type dst);
> +nir_op nir_type_conversion_op(nir_alu_type src, nir_alu_type dst,
> +  nir_rounding_mode rnd);
>
>  typedef enum {
> NIR_OP_IS_COMMUTATIVE = (1 << 0),
> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_
> opcodes.py
> index 28a0467228..ac7333fe78 100644
> --- a/src/compiler/nir/nir_opcodes.py
> +++ b/src/compiler/nir/nir_opcodes.py
> @@ -179,8 +179,15 @@ for src_t in [tint, tuint, tfloat]:
>else:
>   bit_sizes = [8, 16, 32, 64]
>for bit_size in bit_sizes:
> - unop_convert("{0}2{1}{2}".format(src_t[0], dst_t[0], bit_size),
> -  dst_t + str(bit_size), src_t, "src0")
> +  if bit_size == 16 and dst_t == tfloat and src_t == tfloat:
> +  rnd_modes = ['rtne', 'rtz', 'undef']
> +  for rnd_mode in rnd_modes:
> +  unop_convert("{0}2{1}{2}_{3}".format(src_t[0],
> dst_t[0],
> +   bit_size,
> rnd_mode),
> +   dst_t + str(bit_size), src_t, "src0")
> +  else:
> +  unop_convert("{0}2{1}{2}".format(src_t[0], dst_t[0],
> bit_size),
> +   dst_t + str(bit_size), src_t, "src0")
>
>  # We'll hand-code the to/from bool conversion opcodes.  Because bool
> doesn't
>  # have multiple bit-sizes, we can always infer the size from the other
> type.
> diff --git a/src/compiler/nir/nir_opcodes_c.py b/src/compiler/nir/nir_
> opcodes_c.py
> index 02bb4738ed..c19185534a 100644
> --- a/src/compiler/nir/nir_opcodes_c.py
> +++ b/src/compiler/nir/nir_opcodes_c.py
> @@ -30,7 +30,7 @@ template = Template("""
>  #include "nir.h"
>
>  nir_op
> -nir_type_conversion_op(nir_alu_type src, nir_alu_type dst)
> +nir_type_conversion_op(nir_alu_type src, nir_alu_type dst,
> nir_rounding_mode rnd)
>  {
> nir_alu_type src_base = (nir_alu_type) nir_alu_type_get_base_type(
> src);
> nir_alu_type dst_base = (nir_alu_type) nir_alu_type_get_base_type(
> dst);
> @@ -64,7 +64,20 @@ nir_type_conversion_op(nir_alu_type src, nir_alu_type
> dst)
> switch (dst_bit_size) {
>  % for dst_bits in [16, 32, 64]:
>case ${dst_bits}:
> +%if src_t == 'float' and dst_t == 'float' and
> dst_bits == 16:
> + switch(rnd) {
> +%   for rnd_t in ['rtne', 'rtz', 'undef']:
> +case nir_rounding_mode_${rnd_t}:
> +   return ${'nir_op_{0}2{1}{2}_{3}'.format(src_t[0],
> dst_t[0],
> +
>  dst_bits, rnd_t)};
> +%   endfor
> +default:
> +   unreachable("Invalid 16-bit nir rounding
> mode");
> + }
> +%

Re: [Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID

2017-11-30 Thread Kenneth Graunke
On Friday, November 10, 2017 9:53:29 AM PST Antia Puentes wrote:
> This VS system value will contain the value passed as 
> for indexed draw calls or the value passed as  for non-indexed
> draw calls. It will be used to calculate the gl_VertexID as
> SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_BASE_VERTEX_ID.
> Note that the current calculation which uses SYSTEM_VALUE_BASE_VERTEX
> is not right, as gl_BaseVertex should be zero for non-indexed calls.
> 
> Reviewed-by: Neil Roberts 
> ---
>  src/compiler/shader_enums.c |  1 +
>  src/compiler/shader_enums.h | 15 +++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
> index b2ca80b49c2..cd8c256f227 100644
> --- a/src/compiler/shader_enums.c
> +++ b/src/compiler/shader_enums.c
> @@ -215,6 +215,7 @@ gl_system_value_name(gl_system_value sysval)
>   ENUM(SYSTEM_VALUE_INSTANCE_ID),
>   ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
>   ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
> + ENUM(SYSTEM_VALUE_BASE_VERTEX_ID),
>   ENUM(SYSTEM_VALUE_BASE_VERTEX),
>   ENUM(SYSTEM_VALUE_BASE_INSTANCE),
>   ENUM(SYSTEM_VALUE_DRAW_ID),
> diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
> index 9d229d4199e..7437cccdd0a 100644
> --- a/src/compiler/shader_enums.h
> +++ b/src/compiler/shader_enums.h
> @@ -474,6 +474,21 @@ typedef enum
>  */
> SYSTEM_VALUE_BASE_VERTEX,
>  
> +
> +   /**
> +* Depending on the type of the draw call (indexed or non-indexed),
> +* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
> +* similar, or is the value of \c first passed to \c glDrawArrays and
> +* similar.
> +*
> +* \note
> +* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as
> +* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c 
> SYSTEM_VALUE_BASE_VERTEX_ID.
> +*
> +* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERTEX_ID
> +*/
> +   SYSTEM_VALUE_BASE_VERTEX_ID,
> +

We have a number of similar names now:

   SYSTEM_VALUE_BASE_VERTEX
   SYSTEM_VALUE_BASE_VERTEX_ID
   SYSTEM_VALUE_VERTEX_ID
   SYSTEM_VALUE_VERTEX_ID_ZERO_BASE

BASE_VERTEX and BASE_VERTEX_ID are really similar names, and honestly
either one seems like it could be the name for gl_BaseVertex.  I'm
afraid it would be easy to mix them up by mistake.  IMHO, it would be
nice to pick a different word, just to keep some distinction between
the two fairly related concepts...

Perhaps SYSTEM_VALUE_FIRST_VERTEX...?  That's only half the meaning,
but it at least uses a different word, and makes you think "do I want
BASE_VERTEX or FIRST_VERTEX?"

> /**
>  * Value of \c baseinstance passed to instanced draw entry points
>  *
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] xlib: remove dummy GLX_MESA_set_3dfx_mode implementation

2017-11-30 Thread Emil Velikov
On 30 November 2017 at 19:09, Ian Romanick  wrote:
> Is xmesa.h something that apps could see?  Removing stuff could,
> hypothetically, cause compilation problems... but also, app developers,
> fix your old crap. :)
>
Some digging showed:
 - the header was never installed
 - seemingly no external users of XMesa
 - used for mesa <> xserver see commit 50aaabc248c9823106ff772873cbf2631d4dadcd

Brian any recollection if have any actual audience of XMesa?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Alex Deucher
On Thu, Nov 30, 2017 at 2:10 PM, Nicolai Hähnle  wrote:
> On 30.11.2017 19:52, Rob Clark wrote:
>>
>> On Thu, Nov 30, 2017 at 4:21 AM, Nicolai Hähnle 
>> wrote:
>>>
>>> On 30.11.2017 01:09, Miguel Angel Vico wrote:
>>
>>
>> It seems to me that $new_thing should grow as a separate thing whether
>> it ends up replacing GBM or GBM internals are somewhat rewritten on
>> top
>> of it. If I'm reading you both correctly, you agree with that, so in
>> order to move forward, should we go ahead and create a project in
>> fd.o?
>>
>> Before filing the new project request though, we should find an
>> appropriate name for $new_thing. Creativity isn't one of my strengths,
>> but I'll go ahead and start the bikeshedding with "Generic Device
>> Memory Allocator" or "Generic Device Memory Manager".
>
>
>
> liballoc - Generic Device Memory Allocator ... seems reasonable to me..



 Cool. If there aren't better suggestions, we can go with that. We
 should also namespace all APIs and structures. Is 'galloc' distinctive
 enough to be used as namespace? Being an 'r' away from gralloc maybe
 it's a bit confusing?
>>>
>>>
>>>
>>> libgalloc with a galloc prefix seems fine.
>>>
>>
>> I keep reading "galloc" as "gralloc".. I suspect that will be
>> confusing.  Maybe libgal/gal_.. or just liballoc/al_?
>
>
> True, but liballoc is *very* generic.
>
> libimagealloc?
> libsurfacealloc?
> contractions thereof?

libdevicealloc?

Alex
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] meson: install dri internal headers.

2017-11-30 Thread Emil Velikov
On 30 November 2017 at 18:58, Dylan Baker  wrote:
> Quoting Emil Velikov (2017-11-30 10:49:19)
>> s/headers/header/ in the title. With that the series is
>> Reviewed-by: Emil Velikov 
>>
>> -Emil
>
> Thanks Emil.
>
> Should we be installing the dri.pc when you build gallium drivers as dri
> drivers, but no "pure" dri drivers?
>
dri.pc should be installed whenever any *_dri.so module is build/installed.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 103999] 4x MSAA with RG32F shows garbage on triangle edges

2017-11-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103999

Bug ID: 103999
   Summary: 4x MSAA with RG32F shows garbage on triangle edges
   Product: Mesa
   Version: 17.2
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: mais...@archlinux.us
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 135833
  --> https://bugs.freedesktop.org/attachment.cgi?id=135833=edit
Bug

On 17.2.6, edges don't resolve correctly in MSAA.
I'm using pResolveAttachments for this.
I could upload a renderdoc trace, but it's too big for the tracker (90MB).
Tested on RX470.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] xlib: remove empty GLX_NV_vertex_array_range stubs

2017-11-30 Thread Ian Romanick
I sent a couple small comments on patches 3 and 4.  Barring any actual
problems there, the series is

Reviewed-by: Ian Romanick 

On 11/29/2017 11:23 AM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The extension was never implemented and seemingly never will.
> The DRI based libGL dropped support for it over 10 years ago.
> 
> Cc: Brian Paul 
> Cc: Ian Romanick 
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/drivers/x11/fakeglx.c | 26 --
>  src/mesa/drivers/x11/glxapi.c  | 35 ---
>  src/mesa/drivers/x11/glxapi.h  |  7 ---
>  3 files changed, 68 deletions(-)
> 
> diff --git a/src/mesa/drivers/x11/fakeglx.c b/src/mesa/drivers/x11/fakeglx.c
> index d2a099f9a20..8902a7cd667 100644
> --- a/src/mesa/drivers/x11/fakeglx.c
> +++ b/src/mesa/drivers/x11/fakeglx.c
> @@ -2759,28 +2759,6 @@ Fake_glXSet3DfxModeMESA( int mode )
>  
>  
>  
> -/*** GLX_NV_vertex_array range ***/
> -static void *
> -Fake_glXAllocateMemoryNV( GLsizei size,
> -  GLfloat readFrequency,
> -  GLfloat writeFrequency,
> -  GLfloat priority )
> -{
> -   (void) size;
> -   (void) readFrequency;
> -   (void) writeFrequency;
> -   (void) priority;
> -   return NULL;
> -}
> -
> -
> -static void 
> -Fake_glXFreeMemoryNV( GLvoid *pointer )
> -{
> -   (void) pointer;
> -}
> -
> -
>  /*** GLX_MESA_agp_offset ***/
>  
>  static GLuint
> @@ -3009,10 +2987,6 @@ _mesa_GetGLXDispatchTable(void)
> /*** GLX_MESA_set_3dfx_mode ***/
> glx.Set3DfxModeMESA = Fake_glXSet3DfxModeMESA;
>  
> -   /*** GLX_NV_vertex_array_range ***/
> -   glx.AllocateMemoryNV = Fake_glXAllocateMemoryNV;
> -   glx.FreeMemoryNV = Fake_glXFreeMemoryNV;
> -
> /*** GLX_MESA_agp_offset ***/
> glx.GetAGPOffsetMESA = Fake_glXGetAGPOffsetMESA;
>  
> diff --git a/src/mesa/drivers/x11/glxapi.c b/src/mesa/drivers/x11/glxapi.c
> index 52e60265697..ff8b2b2ce16 100644
> --- a/src/mesa/drivers/x11/glxapi.c
> +++ b/src/mesa/drivers/x11/glxapi.c
> @@ -1019,37 +1019,6 @@ glXSet3DfxModeMESA(int mode)
>  
>  
>  
> -/*** GLX_NV_vertex_array_range ***/
> -
> -void PUBLIC *
> -glXAllocateMemoryNV( GLsizei size,
> - GLfloat readFrequency,
> - GLfloat writeFrequency,
> - GLfloat priority )
> -{
> -   struct _glxapi_table *t;
> -   Display *dpy = glXGetCurrentDisplay();
> -   GET_DISPATCH(dpy, t);
> -   if (!t)
> -  return NULL;
> -   return t->AllocateMemoryNV(size, readFrequency, writeFrequency, priority);
> -}
> -
> -
> -void PUBLIC
> -glXFreeMemoryNV( GLvoid *pointer )
> -{
> -   struct _glxapi_table *t;
> -   Display *dpy = glXGetCurrentDisplay();
> -   GET_DISPATCH(dpy, t);
> -   if (!t)
> -  return;
> -   t->FreeMemoryNV(pointer);
> -}
> -
> -
> -
> -
>  /*** GLX_MESA_agp_offset */
>  
>  GLuint PUBLIC
> @@ -1288,10 +1257,6 @@ static struct name_address_pair GLX_functions[] = {
> /*** GLX_ARB_get_proc_address ***/
> { "glXGetProcAddressARB", (__GLXextFuncPtr) glXGetProcAddressARB },
>  
> -   /*** GLX_NV_vertex_array_range ***/
> -   { "glXAllocateMemoryNV", (__GLXextFuncPtr) glXAllocateMemoryNV },
> -   { "glXFreeMemoryNV", (__GLXextFuncPtr) glXFreeMemoryNV },
> -
> /*** GLX_MESA_agp_offset ***/
> { "glXGetAGPOffsetMESA", (__GLXextFuncPtr) glXGetAGPOffsetMESA },
>  
> diff --git a/src/mesa/drivers/x11/glxapi.h b/src/mesa/drivers/x11/glxapi.h
> index cc4f902925b..a4930b10dca 100644
> --- a/src/mesa/drivers/x11/glxapi.h
> +++ b/src/mesa/drivers/x11/glxapi.h
> @@ -186,13 +186,6 @@ struct _glxapi_table {
> /*** GLX_MESA_set_3dfx_mode ***/
> Bool (*Set3DfxModeMESA)(int mode);
>  
> -   /*** GLX_NV_vertex_array_range ***/
> -   void * (*AllocateMemoryNV)( GLsizei size,
> -   GLfloat readFrequency,
> -   GLfloat writeFrequency,
> -   GLfloat priority );
> -   void (*FreeMemoryNV)( GLvoid *pointer );
> -
> /*** GLX_MESA_agp_offset ***/
> GLuint (*GetAGPOffsetMESA)( const GLvoid *pointer );
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Nicolai Hähnle

On 30.11.2017 19:52, Rob Clark wrote:

On Thu, Nov 30, 2017 at 4:21 AM, Nicolai Hähnle  wrote:

On 30.11.2017 01:09, Miguel Angel Vico wrote:


It seems to me that $new_thing should grow as a separate thing whether
it ends up replacing GBM or GBM internals are somewhat rewritten on top
of it. If I'm reading you both correctly, you agree with that, so in
order to move forward, should we go ahead and create a project in fd.o?

Before filing the new project request though, we should find an
appropriate name for $new_thing. Creativity isn't one of my strengths,
but I'll go ahead and start the bikeshedding with "Generic Device
Memory Allocator" or "Generic Device Memory Manager".



liballoc - Generic Device Memory Allocator ... seems reasonable to me..



Cool. If there aren't better suggestions, we can go with that. We
should also namespace all APIs and structures. Is 'galloc' distinctive
enough to be used as namespace? Being an 'r' away from gralloc maybe
it's a bit confusing?



libgalloc with a galloc prefix seems fine.



I keep reading "galloc" as "gralloc".. I suspect that will be
confusing.  Maybe libgal/gal_.. or just liballoc/al_?


True, but liballoc is *very* generic.

libimagealloc?
libsurfacealloc?
contractions thereof?

Cheers,
Nicolai



BR,
-R






--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] docs/specs: annotate MESA_agp_offset as obsolete

2017-11-30 Thread Ian Romanick
I think at least moving deprecated specs to old is a good idea.  Once
Khronos has the "Obsolete" notation, we could just delete them.

On 11/29/2017 11:23 AM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> No Mesa driver has implemented the extension in ages. Seemingly non Mesa
> drivers don't implement it either.
> 
> As mentioned by Ian, the extension is effectively superseded by
> ARB_vertex_buffer_object.
> 
> Cc: Brian Paul 
> Cc: Ian Romanick 
> Signed-off-by: Emil Velikov 
> ---
> Should be commit this locally and then upstream to Khronos? Worth moving
> to OLD?
> ---
>  docs/specs/MESA_agp_offset.spec | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/docs/specs/MESA_agp_offset.spec b/docs/specs/MESA_agp_offset.spec
> index 06e1d902edd..ac5ad7d7679 100644
> --- a/docs/specs/MESA_agp_offset.spec
> +++ b/docs/specs/MESA_agp_offset.spec
> @@ -13,8 +13,7 @@ Contact
>  
>  Status
>  
> -Shipping (Mesa 4.0.4 and later.  Only implemented in particular
> -XFree86/DRI drivers.)
> +Obsolete. Effectively superseded by ARB_vertex_buffer_object.
>  
>  Version
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] xlib: remove dummy GLX_MESA_set_3dfx_mode implementation

2017-11-30 Thread Ian Romanick
Is xmesa.h something that apps could see?  Removing stuff could,
hypothetically, cause compilation problems... but also, app developers,
fix your old crap. :)

On 11/29/2017 11:23 AM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The implementation is a simple 'return EGL_FALSE'. Stop pretending and
> simply remove it.
> 
> Cc: Brian Paul 
> Cc: Ian Romanick 
> Signed-off-by: Emil Velikov 
> ---
>  docs/relnotes/17.4.0.html  |  2 +-
>  src/mesa/drivers/x11/fakeglx.c | 13 -
>  src/mesa/drivers/x11/glxapi.c  | 19 ---
>  src/mesa/drivers/x11/glxapi.h  |  3 ---
>  src/mesa/drivers/x11/xm_api.c  |  8 
>  src/mesa/drivers/x11/xmesa.h   | 22 --
>  6 files changed, 1 insertion(+), 66 deletions(-)
> 
> diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
> index ec2386b3305..8fc191a7ef0 100644
> --- a/docs/relnotes/17.4.0.html
> +++ b/docs/relnotes/17.4.0.html
> @@ -60,7 +60,7 @@ TBD
>  Changes
>  
>  
> -TBD
> +Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL
>  
>  
>  
> diff --git a/src/mesa/drivers/x11/fakeglx.c b/src/mesa/drivers/x11/fakeglx.c
> index fa923de9c3c..22d878eb80d 100644
> --- a/src/mesa/drivers/x11/fakeglx.c
> +++ b/src/mesa/drivers/x11/fakeglx.c
> @@ -2749,16 +2749,6 @@ Fake_glXReleaseBuffersMESA( Display *dpy, GLXDrawable 
> d )
>  
>  
>  
> -/*** GLX_MESA_set_3dfx_mode ***/
> -
> -static Bool
> -Fake_glXSet3DfxModeMESA( int mode )
> -{
> -   return XMesaSetFXmode( mode );
> -}
> -
> -
> -
>  /*** GLX_MESA_agp_offset ***/
>  
>  static GLuint
> @@ -2984,9 +2974,6 @@ _mesa_GetGLXDispatchTable(void)
> /*** GLX_MESA_pixmap_colormap ***/
> glx.CreateGLXPixmapMESA = Fake_glXCreateGLXPixmapMESA;
>  
> -   /*** GLX_MESA_set_3dfx_mode ***/
> -   glx.Set3DfxModeMESA = Fake_glXSet3DfxModeMESA;
> -
> /*** GLX_EXT_texture_from_pixmap ***/
> glx.BindTexImageEXT = Fake_glXBindTexImageEXT;
> glx.ReleaseTexImageEXT = Fake_glXReleaseTexImageEXT;
> diff --git a/src/mesa/drivers/x11/glxapi.c b/src/mesa/drivers/x11/glxapi.c
> index 40d73006bbe..e84e2386733 100644
> --- a/src/mesa/drivers/x11/glxapi.c
> +++ b/src/mesa/drivers/x11/glxapi.c
> @@ -1004,21 +1004,6 @@ glXCreateGLXPixmapMESA(Display *dpy, XVisualInfo 
> *visinfo, Pixmap pixmap, Colorm
>  
>  
>  
> -/*** GLX_MESA_set_3dfx_mode ***/
> -
> -Bool PUBLIC
> -glXSet3DfxModeMESA(int mode)
> -{
> -   struct _glxapi_table *t;
> -   Display *dpy = glXGetCurrentDisplay();
> -   GET_DISPATCH(dpy, t);
> -   if (!t)
> -  return False;
> -   return t->Set3DfxModeMESA(mode);
> -}
> -
> -
> -
>  /*** GLX_EXT_texture_from_pixmap */
>  
>  void PUBLIC
> @@ -1065,7 +1050,6 @@ _glxapi_get_extensions(void)
>"GLX_MESA_copy_sub_buffer",
>"GLX_MESA_release_buffers",
>"GLX_MESA_pixmap_colormap",
> -  "GLX_MESA_set_3dfx_mode",
>"GLX_SGIX_fbconfig",
>"GLX_SGIX_pbuffer",
>"GLX_EXT_texture_from_pixmap",
> @@ -1237,9 +1221,6 @@ static struct name_address_pair GLX_functions[] = {
> /*** GLX_MESA_release_buffers ***/
> { "glXReleaseBuffersMESA", (__GLXextFuncPtr) glXReleaseBuffersMESA },
>  
> -   /*** GLX_MESA_set_3dfx_mode ***/
> -   { "glXSet3DfxModeMESA", (__GLXextFuncPtr) glXSet3DfxModeMESA },
> -
> /*** GLX_ARB_get_proc_address ***/
> { "glXGetProcAddressARB", (__GLXextFuncPtr) glXGetProcAddressARB },
>  
> diff --git a/src/mesa/drivers/x11/glxapi.h b/src/mesa/drivers/x11/glxapi.h
> index 18e01b06620..7bccc50aa78 100644
> --- a/src/mesa/drivers/x11/glxapi.h
> +++ b/src/mesa/drivers/x11/glxapi.h
> @@ -183,9 +183,6 @@ struct _glxapi_table {
> /*** GLX_MESA_pixmap_colormap ***/
> GLXPixmap (*CreateGLXPixmapMESA)(Display *dpy, XVisualInfo *visinfo, 
> Pixmap pixmap, Colormap cmap);
>  
> -   /*** GLX_MESA_set_3dfx_mode ***/
> -   Bool (*Set3DfxModeMESA)(int mode);
> -
> /*** GLX_EXT_texture_from_pixmap ***/
> void (*BindTexImageEXT)(Display *dpy, GLXDrawable drawable, int buffer,
> const int *attrib_list);
> diff --git a/src/mesa/drivers/x11/xm_api.c b/src/mesa/drivers/x11/xm_api.c
> index ec2a73cb369..069e9e12b98 100644
> --- a/src/mesa/drivers/x11/xm_api.c
> +++ b/src/mesa/drivers/x11/xm_api.c
> @@ -1314,14 +1314,6 @@ Display *XMesaGetCurrentDisplay(void)
>  
>  
>  
> -GLboolean XMesaSetFXmode( GLint mode )
> -{
> -   (void) mode;
> -   return GL_FALSE;
> -}
> -
> -
> -
>  /*
>   * Copy the back buffer to the front buffer.  If there's no back buffer
>   * this is a no-op.
> diff --git a/src/mesa/drivers/x11/xmesa.h b/src/mesa/drivers/x11/xmesa.h
> index 84b2b27006d..562b9f38cde 100644
> --- a/src/mesa/drivers/x11/xmesa.h
> +++ b/src/mesa/drivers/x11/xmesa.h
> @@ -85,14 +85,6 @@ extern "C" {
>  #define XMESA_EXTENSIONS 2
>  
>  
> -/*
> - * Values passed to XMesaSetFXmode:
> - */
> -#define XMESA_FX_WINDOW   1
> -#define XMESA_FX_FULLSCREEN   2
> -
> -
> -
>  

Re: [Mesa-dev] [PATCH 1/2] meson: install dri internal headers.

2017-11-30 Thread Dylan Baker
Quoting Emil Velikov (2017-11-30 10:49:19)
> s/headers/header/ in the title. With that the series is
> Reviewed-by: Emil Velikov 
> 
> -Emil

Thanks Emil.

Should we be installing the dri.pc when you build gallium drivers as dri
drivers, but no "pure" dri drivers?

Dylan


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Rob Clark
On Thu, Nov 30, 2017 at 4:21 AM, Nicolai Hähnle  wrote:
> On 30.11.2017 01:09, Miguel Angel Vico wrote:

 It seems to me that $new_thing should grow as a separate thing whether
 it ends up replacing GBM or GBM internals are somewhat rewritten on top
 of it. If I'm reading you both correctly, you agree with that, so in
 order to move forward, should we go ahead and create a project in fd.o?

 Before filing the new project request though, we should find an
 appropriate name for $new_thing. Creativity isn't one of my strengths,
 but I'll go ahead and start the bikeshedding with "Generic Device
 Memory Allocator" or "Generic Device Memory Manager".
>>>
>>>
>>> liballoc - Generic Device Memory Allocator ... seems reasonable to me..
>>
>>
>> Cool. If there aren't better suggestions, we can go with that. We
>> should also namespace all APIs and structures. Is 'galloc' distinctive
>> enough to be used as namespace? Being an 'r' away from gralloc maybe
>> it's a bit confusing?
>
>
> libgalloc with a galloc prefix seems fine.
>

I keep reading "galloc" as "gralloc".. I suspect that will be
confusing.  Maybe libgal/gal_.. or just liballoc/al_?

BR,
-R

>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] meson: install dri internal headers.

2017-11-30 Thread Emil Velikov
s/headers/header/ in the title. With that the series is
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Rob Clark
On Thu, Nov 30, 2017 at 1:28 AM, James Jones  wrote:
> On 11/29/2017 01:10 PM, Rob Clark wrote:
>>
>> On Wed, Nov 29, 2017 at 12:33 PM, Jason Ekstrand 
>> wrote:
>>>
>>> On Sat, Nov 25, 2017 at 1:20 PM, Rob Clark  wrote:


 On Sat, Nov 25, 2017 at 12:46 PM, Jason Ekstrand 
 wrote:
>
> I'm not quite some sure what I think about this.  I think I would like
> to
> see $new_thing at least replace the guts of GBM. Whether GBM becomes a
> wrapper around $new_thing or $new_thing implements the GBM API, I'm not
> sure.  What I don't think I want is to see GBM development continuing
> on
> it's own so we have two competing solutions.


 I don't really view them as competing.. there is *some* overlap, ie.
 allocating a buffer.. but even if you are using GBM w/out $new_thing
 you could allocate a buffer externally and import it.  I don't see
 $new_thing as that much different from GBM PoV.

 But things like surfaces (aka swap chains) seem a bit out of place
 when you are thinking about implementing $new_thing for non-gpu
 devices.  Plus EGL<->GBM tie-ins that seem out of place when talking
 about a (for ex.) camera.  I kinda don't want to throw out the baby
 with the bathwater here.
>>>
>>>
>>>
>>> Agreed.  GBM is very EGLish and we don't want the new allocator to be
>>> that.
>>>

 *maybe* GBM could be partially implemented on top of $new_thing.  I
 don't quite see how that would work.  Possibly we could deprecate
 parts of GBM that are no longer needed?  idk..  Either way, I fully
 expect that GBM and mesa's implementation of $new_thing could perhaps
 sit on to of some of the same set of internal APIs.  The public
 interface can be decoupled from the internal implementation.
>>>
>>>
>>>
>>> Maybe I should restate things a bit.  My real point was that modifiers +
>>> $new_thing + Kernel blob should be a complete and more powerful
>>> replacement
>>> for GBM.  I don't know that we really can implement GBM on top of it
>>> because
>>> GBM has lots of wishy-washy concepts such as "cursor plane" which may not
>>> map well at least not without querying the kernel about specifc display
>>> planes.  In particular, I don't want someone to feel like they need to
>>> use
>>> $new_thing and GBM at the same time or together.  Ideally, I'd like them
>>> to
>>> never do that unless we decide gbm_bo is a useful abstraction for
>>> $new_thing.
>>>
>>
>> (just to repeat what I mentioned on irc)
>>
>> I think main thing is how do you create a swapchain/surface and know
>> which is current front buffer after SwapBuffers()..  that is the only
>> bits of GBM that seem like there would still be useful.  idk, maybe
>> there is some other idea.
>
>
> I don't view this as terribly useful except for legacy apps that need an EGL
> window surface and can't be updated to use new methods.  Wayland compositors
> certainly don't fall in that category.  I don't know that any GBM apps do.

kmscube doesn't count?  :-P

Hmm, I assumed weston and the other wayland compositors where still
using gbm to create EGL surfaces, but I confess to have not actually
looked at weston src code for quite a few years now.

Anyways, I think it is perfectly fine for GBM to stay as-is in it's
current form.  It can already import dma-buf fd's, and those can
certainly come from $new_thing.

So I guess we want an EGL extension to return the allocator device
instance for the GPU.  That also takes care of the non-bare-metal
case.

> Rather, I think the way forward for the classes of apps that need something
> like GBM or the generic allocator is more or less the path ChromeOS took
> with their graphics architecture: Render to individual buffers (using FBOs
> bound to imported buffers in GL) and manage buffer exchanges/blits manually.
>
> The useful abstraction surfaces provide isn't so much deciding which buffer
> is currently "front" and "back", but rather handling the transition/hand-off
> to the window system/display device/etc. in SwapBuffers(), and the whole
> idea of the allocator proposals is to make that something the application or
> at least some non-driver utility library handles explicitly based on where
> exactly the buffer is being handed off to.

Hmm, ok..  I guess the transition will need some hook into the driver.
For freedreno and vc4 (and I suspect this is not uncommon for tiler
GPUs), switching FBOs doesn't necessarily flush rendering to hw.
Maybe it would work out if you requested the sync fd file descriptor
from an EGL fence before passing things to next device, as that would
flush rendering.

I wonder a bit about perf tools and related things.. gallium HUD and
apitrace use SwapBuffers() as a frame marker..

> The one other useful information provided by EGL surfaces that I suspect
> only our hardware cares about is whether the app is potentially 

[Mesa-dev] [PATCH 2/2] meson: install khrplatform header for EGL as well as GLES

2017-11-30 Thread Dylan Baker
Signed-off-by: Dylan Baker 
---
 include/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/meson.build b/include/meson.build
index bae6742c4d6..424d89acc83 100644
--- a/include/meson.build
+++ b/include/meson.build
@@ -40,7 +40,7 @@ if with_gles2
   )
 endif
 
-if with_gles1 or with_gles2 # or with_egl
+if with_gles1 or with_gles2 or with_egl
   install_headers('KHR/khrplatform.h', subdir : 'KHR')
 endif
 
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] meson: install dri internal headers.

2017-11-30 Thread Dylan Baker
Reported-by: Marc Dietrich 
Signed-off-by: Dylan Baker 
---
 include/meson.build | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/meson.build b/include/meson.build
index 35e7791507c..bae6742c4d6 100644
--- a/include/meson.build
+++ b/include/meson.build
@@ -66,3 +66,7 @@ if with_egl
 subdir : 'EGL',
   )
 endif
+
+if with_dri
+  install_headers('GL/internal/dri_interface.h', subdir : 'GL/internal')
+endif
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Check if memfd_create is already defined.

2017-11-30 Thread Emil Velikov
On 30 November 2017 at 15:18, Emil Velikov  wrote:
> Vinson please add a small note why now aka "Newly introduced with glibc 2.27"
> With that the patch is:
> Reviewed-by: Emil Velikov 
>
Completely forgot

Cc: 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/14] meson: most of gallium

2017-11-30 Thread Dylan Baker
Quoting Eric Engestrom (2017-11-29 07:44:08)
> On Tuesday, 2017-11-28 15:30:21 -0800, Dylan Baker wrote:
> > This series is the gallium media state trackers, the "nine" Direct3D state
> > tracker, and an architectural change in the way gallium drivers are linked 
> > into
> > the final targets.
> > 
> > This architectural change results in a good deal of code savings, as well as
> > ensuring that generated targets are generated before the targets that 
> > depend on
> > them are built. It makes use of meson's `declare_dependency` construct to 
> > pass
> > bundled arguments, and the result is somewhat similar to the way that 
> > autotools
> > uses the Automake.inc files.
> > 
> > The returning state trackers are the same as the v5 of the remaining
> > drivers + media series, but now making use of the internal dependencies, 
> > and are
> > joined by the D3D "nine" state tracker.
> > 
> > Dylan Baker (14):
> >   meson: Combine gallium target subdirs
> >   meson: sort gallium drivers after winsys
> >   meson: define driver dependencies
> >   meson: use the driver dependencies for the gallium dri target
> 
> Nice cleanups! ^
> I much prefer the code after your patches.
> 
> I had a quick look, and it all looks fine to me; series is:
> Acked-by: Eric Engestrom 
> 
> (except the patches I actually reviewed, obviously :)

Any comments on v2? I'm ready to start landing this stuff and get on to less
interesting work like macOS :)


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/14] meson: most of gallium

2017-11-30 Thread Dylan Baker
Quoting Marc Dietrich (2017-11-30 01:12:42)
> Hi Dylan,
> 
> Am Mittwoch, 29. November 2017, 19:14:37 CET schrieb Dylan Baker:
> > Quoting Marc Dietrich (2017-11-29 04:31:07)
> > 
> > > Hi Dylan,
> > > 
> > > Am Mittwoch, 29. November 2017, 00:30:21 CET schrieb Dylan Baker:
> > > > This series is the gallium media state trackers, the "nine" Direct3D
> > > > state
> > > > tracker, and an architectural change in the way gallium drivers are
> > > > linked
> > > > into the final targets.
> > > > 
> > > > This architectural change results in a good deal of code savings, as
> > > > well as ensuring that generated targets are generated before the
> > > > targets that depend on them are built. It makes use of meson's
> > > > `declare_dependency` construct to pass bundled arguments, and the
> > > > result is somewhat similar to the way that autotools uses the
> > > > Automake.inc files.
> > > > 
> > > > The returning state trackers are the same as the v5 of the remaining
> > > > drivers + media series, but now making use of the internal dependencies,
> > > > and are joined by the D3D "nine" state tracker.
> > > 
> > >pkgconfig/d3d.pc is not installed
> > >pkgconfig/dri.pc is not installed (is it needed for gallium only
> > >drivers?)
> > 
> > d3d.pc not being installed is bad, but I think it's your configuration.
> > I'm not sure about dri.pc, that's a good question. It's created in
> > src/mesa/drivers/dri, but I could see it being correct to install if the
> > gallium drivers are built using dri.
> 
> 
> ok, I adjusted the config and it works now. The question remains if dri.pc 
> should be installed or not. I think it is only useful if you want to develop 
> your own dri driver, so if one disables dri drivers, I guess it can be left 
> out. 
> 
> On my distro (openSUSE tumbleweed) I see the following contents of Mesa-dri-
> devel:
> 
> # rpm -ql Mesa-dri-devel-17.2.5-179.1.x86_64
> /usr/include/GL/internal
> /usr/include/GL/internal/dri_interface.h
> /usr/lib64/pkgconfig/dri.pc
> 
> I think in any case GL/internal/dri_interface.h is missing and should be 
> added 
> to meson.
> 
> Marc

Yup, that header is missing too. I'll add that.

Dylan


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-11-30 Thread Rob Clark
On Thu, Nov 30, 2017 at 12:59 AM, James Jones  wrote:
> On 11/29/2017 04:09 PM, Miguel Angel Vico wrote:
>>
>> On Wed, 29 Nov 2017 16:28:15 -0500
>> Rob Clark  wrote:
>>>
>>> Do we need to define both in-place and copy transitions?  Ie. what if
>>> GPU is still reading a tiled or compressed texture (ie. sampling from
>>> previous frame for some reason), but we need to untile/uncompress for
>>> display.. of maybe there are some other cases like that we should
>>> think about..
>>>
>>> Maybe you already have some thoughts about that?
>>
>>
>> This is the next thing I'll be working on. I haven't given it much
>> thought myself so far, but I think James might have had some insights.
>> I'll read through some of his notes to double-check.
>
>
> A couple of notes on usage transitions:
>
> While chatting about transitions, a few assertions were made by others that
> I've come to accept, despite the fact that they reduce the generality of the
> allocator mechanisms:
>
> -GPUs are the only things that actually need usage transitions as far as I
> know thus far.  Other engines either share the GPU representations of data,
> or use more limited representations; the latter being the reason non-GPU
> usage transitions are a useful thing.
>
> -It's reasonable to assume that a GPU is required to perform a usage
> transition.  This follows from the above postulate.  If only GPUs are using
> more advanced representations, you don't need any transitions unless you
> have a GPU available.

This seems reasonable.  I can't think of any non-gpu related case
where you would need a transition, other than perhaps cache flush/inv.

> From that, I derived the rough API proposal for transitions presented on my
> XDC 2017 slides.  Transition "metadata" is queried from the allocator given
> a pair of usages (which may refer to more than one device), but the
> realization of the transition is left to existing GPU APIs.  I think I put
> Vulkan-like pseudo-code in the slides, but the GL external objects
> extensions (GL_EXT_memory_object and GL_EXT_semaphore) would work as well.

I haven't quite wrapped my head around how this would work in the
cross-device case.. I mean from the API standpoint for the user, it
seems straightforward enough.  Just not sure how to implement that and
what the driver interface would look like.

I guess we need a capability-conversion (?).. I mean take for example
the the fb compression capability from your slide #12[1].  If we knew
there was an available transition to go from "Dev2 FB compression" to
"normal", then we could have allowed the "Dev2 FB compression" valid
set?

[1] https://www.x.org/wiki/Events/XDC2017/jones_allocator.pdf

> Regarding in-place Vs. copy: To me a transition is something that happens
> in-place, at least semantically.  If you need to make copies, that's a
> format conversion blit not a transition, and graphics APIs are already
> capable of expressing that without any special transitions or help from the
> allocator.  However, I understand some chipsets perform transitions using
> something that looks kind of like a blit using on-chip caches and
> constrained usage semantics.  There's probably some work to do to see
> whether those need to be accommodated as conversion blits or usgae
> transitions.

I guess part of what I was thinking of, is what happens if the
producing device is still reading from the buffer.  For example,
viddec -> gpu use case, where the video decoder is also still hanging
on to the frame to use as a reference frame to decode future frames?

I guess if transition from devA -> devB can be done in parallel with
devA still reading the buffer, it isn't a problem.  I guess that
limits (non-blit) transitions to decompression and cache op's?  Maybe
that is ok..

> For our hardware's purposes, transitions are just various levels of
> decompression or compression reconfiguration and potentially cache
> flushing/invalidation, so our transition metadata will just be some bits
> signaling which compression operation is needed, if any.  That's the sort of
> operation I modeled the API around, so if things are much more exotic than
> that for others, it will probably require some adjustments.
>


[snip]

>
> Gralloc-on-$new_thing, as well as hwcomposer-on-$new_thing is one of my
> primary goals.  However, it's a pretty heavy thing to prototype.  If someone
> has the time though, I think it would be a great experiment.  It would help
> flesh out the paltry list of usages, constraints, and capabilities in the
> existing prototype codebase.  The kmscube example really should have added
> at least a "render" usage, but I got lazy and just re-used texture for now.
> That won't actually work on our HW in all cases, but it's good enough for
> kmscube.
>

btw, I did start looking at it.. I guess this gets a bit into the
other side of this thread (ie. where/if GBM fits in).  So far I don't
think mesa has EGL_EXT_device_base, but I'm guessing 

[Mesa-dev] [PATCH v2] swr: Correct texture allocation and limit max size to 2GB

2017-11-30 Thread Bruce Cherniak
This patch fixes piglit tex3d-maxsize by correcting 4 things:

The total_size calculation was using 32-bit math, therefore a >4GB
allocation request overflowed and was not returning false (unsupported).

Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle
>4GB allocations.

Added error checking on texture allocations to fail gracefully.

Finally, temporarily decreased supported max texture size from 4GB to 2GB.
The gallivm texture-sampler needs some additional work to correctly handle
larger than 2GB textures (offsets to LLVMBuildGEP are signed).

I'm working on a follow-on patch to allow up to 4GB textures, as this is
useful in HPC visualization applications.

Fixes piglit tex3d-maxsize.

v2: Updated patch description to clarify ">4GB".
---
 src/gallium/drivers/swr/rasterizer/common/os.h |  2 +-
 src/gallium/drivers/swr/swr_screen.cpp | 12 +---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h 
b/src/gallium/drivers/swr/rasterizer/common/os.h
index 4ed6b88e45..358cb33b6e 100644
--- a/src/gallium/drivers/swr/rasterizer/common/os.h
+++ b/src/gallium/drivers/swr/rasterizer/common/os.h
@@ -210,7 +210,7 @@ unsigned char _BitScanReverse(unsigned int *Index, unsigned 
int Mask)
 }
 
 inline
-void *AlignedMalloc(unsigned int size, unsigned int alignment)
+void *AlignedMalloc(size_t size, size_t alignment)
 {
 void *ret;
 if (posix_memalign(, alignment, size))
diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 3f2433e65a..71a07ebe8d 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -50,7 +50,7 @@
  * Max texture sizes
  * XXX Check max texture size values against core and sampler.
  */
-#define SWR_MAX_TEXTURE_SIZE (4 * 1024 * 1024 * 1024ULL) /* 4GB */
+#define SWR_MAX_TEXTURE_SIZE (2 * 1024 * 1024 * 1024ULL) /* 2GB */
 #define SWR_MAX_TEXTURE_2D_LEVELS 14  /* 8K x 8K for now */
 #define SWR_MAX_TEXTURE_3D_LEVELS 12  /* 2K x 2K x 2K for now */
 #define SWR_MAX_TEXTURE_CUBE_LEVELS 14  /* 8K x 8K for now */
@@ -821,13 +821,15 @@ swr_texture_layout(struct swr_screen *screen,
  ComputeSurfaceOffset(0, 0, 0, 0, 0, level, >swr);
}
 
-   size_t total_size = res->swr.depth * res->swr.qpitch * res->swr.pitch *
-   res->swr.numSamples;
+   size_t total_size = (uint64_t)res->swr.depth * res->swr.qpitch *
+ res->swr.pitch * res->swr.numSamples;
if (total_size > SWR_MAX_TEXTURE_SIZE)
   return false;
 
if (allocate) {
   res->swr.xpBaseAddress = (gfxptr_t)AlignedMalloc(total_size, 64);
+  if (!res->swr.xpBaseAddress)
+ return false;
 
   if (res->has_depth && res->has_stencil) {
  res->secondary = res->swr;
@@ -843,6 +845,10 @@ swr_texture_layout(struct swr_screen *screen,
   res->secondary.pitch * res->secondary.numSamples;
 
  res->secondary.xpBaseAddress = (gfxptr_t) AlignedMalloc(total_size, 
64);
+ if (!res->secondary.xpBaseAddress) {
+AlignedFree((void *)res->swr.xpBaseAddress);
+return false;
+ }
   }
}
 
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] meta: Fix ClearTexture with GL_DEPTH_COMPONENT.

2017-11-30 Thread Kenneth Graunke
We only handled unpacking for GL_DEPTH_STENCIL formats.

Cemu was hitting _mesa_problem() for an unsupported format in
_mesa_unpack_float_32_uint_24_8_depth_stencil_row(), because the
format was depth-only, rather than depth-stencil.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94739
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103966
---
 src/mesa/drivers/common/meta.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 1cc736cff1c..41f4f5526ad 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -3510,15 +3510,21 @@ cleartexsubimage_depth_stencil(struct gl_context *ctx,
   /* Convert the clearValue from whatever format it's in to a floating
* point value for the depth and an integer value for the stencil index
*/
-  _mesa_unpack_float_32_uint_24_8_depth_stencil_row(texImage->TexFormat,
-1, /* n */
-clearValue,
-depthStencilValue);
-  /* We need a memcpy here instead of a cast because we need to
-   * reinterpret the bytes as a float rather than converting it
-   */
-  memcpy(, depthStencilValue, sizeof depthValue);
-  stencilValue = depthStencilValue[1] & 0xff;
+  if (texImage->_BaseFormat == GL_DEPTH_STENCIL) {
+ _mesa_unpack_float_32_uint_24_8_depth_stencil_row(texImage->TexFormat,
+   1, /* n */
+   clearValue,
+   depthStencilValue);
+ stencilValue = depthStencilValue[1] & 0xff;
+
+ /* We need a memcpy here instead of a cast because we need to
+  * reinterpret the bytes as a float rather than converting it
+  */
+ memcpy(, depthStencilValue, sizeof depthValue);
+  } else {
+ _mesa_unpack_float_z_row(texImage->TexFormat, 1 /* n */,
+  clearValue, );
+  }
} else {
   depthValue = 0.0f;
   stencilValue = 0;
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 23/25] mesa/glspirv: Add a _mesa_spirv_to_nir() function

2017-11-30 Thread Eduardo Lima Mitev
This is basically a wrapper around spirv_to_nir() that includes
arguments setup and post-conversion validation.

v2: Rebase update (SpirVCapabilities not a pointer anymore)
---
 src/mesa/main/glspirv.c | 60 +
 src/mesa/main/glspirv.h |  7 ++
 2 files changed, 67 insertions(+)

diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
index e5dc8b26ea9..2a20e4b5cc7 100644
--- a/src/mesa/main/glspirv.c
+++ b/src/mesa/main/glspirv.c
@@ -159,6 +159,66 @@ _mesa_spirv_link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
}
 }
 
+nir_shader *
+_mesa_spirv_to_nir(struct gl_context *ctx,
+   const struct gl_shader_program *prog,
+   gl_shader_stage stage,
+   const nir_shader_compiler_options *options)
+{
+   nir_shader *nir = NULL;
+
+   struct gl_linked_shader *linked_shader = prog->_LinkedShaders[stage];
+   assert (linked_shader);
+
+   struct gl_shader_spirv_data *spirv_data = linked_shader->spirv_data;
+   assert(spirv_data);
+
+   struct gl_spirv_module *spirv_module = spirv_data->SpirVModule;
+   assert (spirv_module != NULL);
+
+   const char *entry_point_name = spirv_data->SpirVEntryPoint;
+   assert(entry_point_name);
+
+   struct nir_spirv_specialization *spec_entries = NULL;
+   spec_entries = calloc(sizeof(*spec_entries),
+ spirv_data->NumSpecializationConstants);
+
+   for (unsigned i = 0; i < spirv_data->NumSpecializationConstants; ++i) {
+  spec_entries[i].id = spirv_data->SpecializationConstantsIndex[i];
+  spec_entries[i].data32 = spirv_data->SpecializationConstantsValue[i];
+  spec_entries[i].defined_on_module = false;
+   }
+
+   nir_function *entry_point =
+  spirv_to_nir((const uint32_t *) _module->Binary[0],
+   spirv_module->Length / 4,
+   spec_entries, spirv_data->NumSpecializationConstants,
+   stage, entry_point_name,
+   >Const.SpirVCapabilities,
+   options);
+   free(spec_entries);
+
+   assert (entry_point);
+   nir = entry_point->shader;
+   assert(nir->info.stage == stage);
+
+   nir->options = options;
+
+   nir->info.name =
+  ralloc_asprintf(nir, "SPIRV:%s:%d",
+  _mesa_shader_stage_to_abbrev(nir->info.stage),
+  prog->Name);
+   nir_validate_shader(nir);
+
+   if (false) {
+  /* @FIXME: Only for debugging purposes */
+  nir_print_shader(nir, stdout);
+  fflush(stdout);
+   }
+
+   return nir;
+}
+
 void GLAPIENTRY
 _mesa_SpecializeShaderARB(GLuint shader,
   const GLchar *pEntryPoint,
diff --git a/src/mesa/main/glspirv.h b/src/mesa/main/glspirv.h
index 0f03b75c111..81626ce75b5 100644
--- a/src/mesa/main/glspirv.h
+++ b/src/mesa/main/glspirv.h
@@ -24,6 +24,7 @@
 #ifndef GLSPIRV_H
 #define GLSPIRV_H
 
+#include "compiler/nir/nir.h"
 #include "mtypes.h"
 
 #ifdef __cplusplus
@@ -80,6 +81,12 @@ void
 _mesa_spirv_link_shaders(struct gl_context *ctx,
  struct gl_shader_program *prog);
 
+nir_shader *
+_mesa_spirv_to_nir(struct gl_context *ctx,
+   const struct gl_shader_program *prog,
+   gl_shader_stage stage,
+   const nir_shader_compiler_options *options);
+
 /**
  * \name API functions
  */
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 25/25] i965: Don't call process_glsl_ir() for SPIR-V shaders

2017-11-30 Thread Eduardo Lima Mitev
v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader
   is SPIR-V. (Timothy Arceri)
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index d18521e792d..6bf4c413db4 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -236,7 +236,8 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
   struct gl_program *prog = shader->Program;
   prog->Parameters = _mesa_new_parameter_list();
 
-  process_glsl_ir(brw, shProg, shader);
+  if (!shader->spirv_data)
+ process_glsl_ir(brw, shProg, shader);
 
   _mesa_copy_linked_program_data(shProg, shader);
 
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 24/25] i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders

2017-11-30 Thread Eduardo Lima Mitev
This is the main fork of the shader compilation code-path, where a NIR
shader is obtained by calling spirv_to_nir() or glsl_to_nir(),
depending on its nature..

v2: Use 'spirv_data' member from gl_linked_shader to know which method
   to call. (Timothy Arceri)
---
 src/mesa/drivers/dri/i965/brw_program.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 755d4973cc0..596118f2fe5 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -31,6 +31,7 @@
 
 #include 
 #include "main/imports.h"
+#include "main/glspirv.h"
 #include "program/prog_parameter.h"
 #include "program/prog_print.h"
 #include "program/prog_to_nir.h"
@@ -73,9 +74,18 @@ brw_create_nir(struct brw_context *brw,
   ctx->Const.ShaderCompilerOptions[stage].NirOptions;
nir_shader *nir;
 
-   /* First, lower the GLSL IR or Mesa IR to NIR */
+   /* First, lower the GLSL/Mesa IR or SPIR-V to NIR */
if (shader_prog) {
-  nir = glsl_to_nir(shader_prog, stage, options);
+  bool is_spirv_shader =
+ (shader_prog->_LinkedShaders[stage]->spirv_data != NULL);
+
+  if (!is_spirv_shader) {
+ nir = glsl_to_nir(shader_prog, stage, options);
+  } else {
+ nir = _mesa_spirv_to_nir(ctx, shader_prog, stage, options);
+  }
+  assert (nir);
+
   nir_remove_dead_variables(nir, nir_var_shader_in | nir_var_shader_out);
   nir_lower_returns(nir);
   nir_validate_shader(nir);
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 22/25] mesa/glspirv: Create gl_linked_shader objects for a SPIR-V program

2017-11-30 Thread Eduardo Lima Mitev
---
 src/mesa/main/glspirv.c | 51 ++---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
index e533853f7fa..e5dc8b26ea9 100644
--- a/src/mesa/main/glspirv.c
+++ b/src/mesa/main/glspirv.c
@@ -29,6 +29,8 @@
 #include "compiler/nir/nir.h"
 #include "compiler/spirv/nir_spirv.h"
 
+#include "program/program.h"
+
 #include "util/u_atomic.h"
 
 void
@@ -104,14 +106,57 @@ _mesa_spirv_shader_binary(struct gl_context *ctx,
}
 }
 
+/**
+ * This is the equivalent to compiler/glsl/linker.cpp::link_shaders()
+ * but for SPIR-V programs.
+ *
+ * This method just creates the gl_linked_shader structs with a reference to
+ * the SPIR-V data collected during previous steps.
+ *
+ * The real linking happens later in the driver-specifc call LinkShader().
+ * This is so backends can implement different linking strategies for
+ * SPIR-V programs.
+ */
 void
 _mesa_spirv_link_shaders(struct gl_context *ctx, struct gl_shader_program 
*prog)
 {
-   /* @TODO: This is a placeholder for the equivalent of
-* compiler/glsl/linker.cpp::link_shaders() but for SPIR-V.
-*/
prog->data->LinkStatus = linking_success;
prog->data->Validated = false;
+
+   for (unsigned i = 0; i < prog->NumShaders; i++) {
+  struct gl_shader *shader = prog->Shaders[i];
+  gl_shader_stage shader_type = shader->Stage;
+
+  assert(shader->spirv_data);
+
+  struct gl_linked_shader *linked = rzalloc(NULL, struct gl_linked_shader);
+  linked->Stage = shader_type;
+
+  /* Create program and attach it to the linked shader */
+  struct gl_program *gl_prog =
+ ctx->Driver.NewProgram(ctx,
+_mesa_shader_stage_to_program(shader_type),
+prog->Name, false);
+  if (!gl_prog) {
+ prog->data->LinkStatus = linking_failure;
+ _mesa_delete_linked_shader(ctx, linked);
+ return;
+  }
+
+  _mesa_reference_shader_program_data(ctx,
+  _prog->sh.data,
+  prog->data);
+
+  /* Don't use _mesa_reference_program() just take ownership */
+  linked->Program = gl_prog;
+
+  /* Reference the SPIR-V data from shader to the linked shader */
+  _mesa_shader_spirv_data_reference(>spirv_data,
+shader->spirv_data);
+
+  prog->_LinkedShaders[shader_type] = linked;
+  prog->data->linked_stages |= 1 << shader_type;
+   }
 }
 
 void GLAPIENTRY
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 19/25] mesa/glspirv: Add a _mesa_spirv_link_shaders() placeholder

2017-11-30 Thread Eduardo Lima Mitev
This will be the equivalent to link_shaders() from
src/compiler/glsl/linker.cpp, but for SPIR-V programs.
---
 src/mesa/main/glspirv.c | 10 ++
 src/mesa/main/glspirv.h |  4 
 2 files changed, 14 insertions(+)

diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
index 18710c0d8fc..e533853f7fa 100644
--- a/src/mesa/main/glspirv.c
+++ b/src/mesa/main/glspirv.c
@@ -104,6 +104,16 @@ _mesa_spirv_shader_binary(struct gl_context *ctx,
}
 }
 
+void
+_mesa_spirv_link_shaders(struct gl_context *ctx, struct gl_shader_program 
*prog)
+{
+   /* @TODO: This is a placeholder for the equivalent of
+* compiler/glsl/linker.cpp::link_shaders() but for SPIR-V.
+*/
+   prog->data->LinkStatus = linking_success;
+   prog->data->Validated = false;
+}
+
 void GLAPIENTRY
 _mesa_SpecializeShaderARB(GLuint shader,
   const GLchar *pEntryPoint,
diff --git a/src/mesa/main/glspirv.h b/src/mesa/main/glspirv.h
index ba281f68bef..0f03b75c111 100644
--- a/src/mesa/main/glspirv.h
+++ b/src/mesa/main/glspirv.h
@@ -76,6 +76,10 @@ _mesa_spirv_shader_binary(struct gl_context *ctx,
   unsigned n, struct gl_shader **shaders,
   const void* binary, size_t length);
 
+void
+_mesa_spirv_link_shaders(struct gl_context *ctx,
+ struct gl_shader_program *prog);
+
 /**
  * \name API functions
  */
-- 
2.15.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >