[Mesa-dev] [PATCH] nv50/ir: process texture offset sources as regular sources

2016-10-18 Thread Ilia Mirkin
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.

This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.

Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
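For readers following along, the mask bookkeeping mentioned above can be
sketched roughly as below. This is a simplified illustration, not the
nouveau code: Swizzle and usedInputMask are hypothetical names, and the
sketch assumes a source operand is described only by a four-entry swizzle
plus a mask of the channels the instruction consumes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a TGSI source swizzle: entry c names the
// input component (0 = x .. 3 = w) read by channel c of the operand.
using Swizzle = std::array<std::uint8_t, 4>;

// Returns the set of input components that are actually read, given a
// mask of consumed operand channels (bit c set = channel c is used).
inline unsigned usedInputMask(const Swizzle &swz, unsigned channelMask)
{
   unsigned used = 0;
   for (unsigned c = 0; c < 4; ++c) {
      if (!(channelMask & (1u << c)))
         continue;               // channel not consumed by the instruction
      if (swz[c] <= 3)
         used |= 1u << swz[c];   // mark the component the swizzle selects
   }
   return used;
}
```

With a 2D texture offset (channel mask 0x3) and a .zzxy swizzle, only the
z component of the input would be marked as read.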
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 146 +
 1 file changed, 93 insertions(+), 53 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index fe71f58..05076e1 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -182,6 +182,7 @@ public:
 
// mask of used components of source s
unsigned int srcMask(unsigned int s) const;
+   unsigned int texOffsetMask() const;
 
SrcRegister getSrc(unsigned int s) const
{
@@ -234,6 +235,34 @@ private:
const struct tgsi_full_instruction *insn;
 };
 
+unsigned int Instruction::texOffsetMask() const
+{
+   const struct tgsi_instruction_texture *tex = &insn->Texture;
+   assert(insn->Instruction.Texture);
+
+   switch (tex->Texture) {
+   case TGSI_TEXTURE_BUFFER:
+   case TGSI_TEXTURE_1D:
+   case TGSI_TEXTURE_1D_ARRAY:
+   case TGSI_TEXTURE_SHADOW1D_ARRAY:
+  return 0x1;
+   case TGSI_TEXTURE_2D:
+   case TGSI_TEXTURE_SHADOW2D:
+   case TGSI_TEXTURE_2D_ARRAY:
+   case TGSI_TEXTURE_SHADOW2D_ARRAY:
+   case TGSI_TEXTURE_RECT:
+   case TGSI_TEXTURE_SHADOWRECT:
+   case TGSI_TEXTURE_2D_MSAA:
+   case TGSI_TEXTURE_2D_ARRAY_MSAA:
+  return 0x3;
+   case TGSI_TEXTURE_3D:
+  return 0x7;
+   default:
+  assert(!"Unexpected texture target");
+  return 0xf;
+   }
+}
+
 unsigned int Instruction::srcMask(unsigned int s) const
 {
unsigned int mask = insn->Dst[0].Register.WriteMask;
@@ -955,6 +984,9 @@ private:
int inferSysValDirection(unsigned sn) const;
bool scanDeclaration(const struct tgsi_full_declaration *);
bool scanInstruction(const struct tgsi_full_instruction *);
+   void scanInstructionSrc(const Instruction& insn,
+   const Instruction::SrcRegister& src,
+   unsigned mask);
void scanProperty(const struct tgsi_full_property *);
void scanImmediate(const struct tgsi_full_immediate *);
 
@@ -1364,6 +1396,61 @@ inline bool Source::isEdgeFlagPassthrough(const Instruction& insn) const
   insn.getSrc(0).getFile() == TGSI_FILE_INPUT;
 }
 
+void Source::scanInstructionSrc(const Instruction& insn,
+const Instruction::SrcRegister& src,
+unsigned mask)
+{
+   if (src.getFile() == TGSI_FILE_TEMPORARY) {
+  if (src.isIndirect(0))
+ indirectTempArrays.insert(src.getArrayId());
+   } else
+   if (src.getFile() == TGSI_FILE_BUFFER ||
+   src.getFile() == TGSI_FILE_IMAGE ||
+   (src.getFile() == TGSI_FILE_MEMORY &&
+memoryFiles[src.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
+  info->io.globalAccess |= (insn.getOpcode() == TGSI_OPCODE_LOAD) ?
+ 0x1 : 0x2;
+   } else
+   if (src.getFile() == TGSI_FILE_OUTPUT) {
+  if (src.isIndirect(0)) {
+ // We don't know which one is accessed, just mark everything for
+ // reading. This is an extremely unlikely occurrence.
+ for (unsigned i = 0; i < info->numOutputs; ++i)
+info->out[i].oread = 1;
+  } else {
+ info->out[src.getIndex(0)].oread = 1;
+  }
+   }
+   if (src.getFile() != TGSI_FILE_INPUT)
+  return;
+
+   if (src.isIndirect(0)) {
+  for (unsigned i = 0; i < info->numInputs; ++i)
+ info->in[i].mask = 0xf;
+   } else {
+  const int i = src.getIndex(0);
+  for (unsigned c = 0; c < 4; ++c) {
+ if (!(mask & (1 << c)))
+continue;
+ int k = src.getSwizzle(c);
+ if (k <= TGSI_SWIZZLE_W)
+info->in[i].mask |= 1 << k;
+  }
+  switch (info->in[i].sn) {
+  case TGSI_SEMANTIC_PSIZE:
+  case TGSI_SEMANTIC_PRIMID:
+  case TGSI_SEMANTIC_FOG:
+ info->in[i].mask &= 0x1;
+ break;
+  case TGSI_SEMANTIC_PCOORD:
+ info->in[i].mask &= 0x3;
+ break;
+  default:
+ break;
+  }
+   }
+}
+
 bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
 {
Instruction insn(inst);
@@ -1396,66 +1483,19 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
 indirectTempArrays.insert(dst.getArrayId());
   } else
   if (dst.getFile() == TGSI_FILE_BUFFER ||
-  dst.getFile() == TGSI_FILE_IMAGE || 
+  dst.getFile() == TGSI_FILE_IMAGE ||
   (dst.getFile() == TGSI_FILE_MEMORY &&
memoryFiles[dst.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
  

Re: [Mesa-dev] [PATCH] anv: drop unused zero macro.

2016-10-18 Thread Jason Ekstrand
rb

On Tue, Oct 18, 2016 at 8:36 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> I can't see this being used anywhere.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/intel/vulkan/anv_private.h | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> private.h
> index 0e25827..3fe9d7d 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -163,8 +163,6 @@ anv_clear_mask(uint32_t *inout_mask, uint32_t
> clear_mask)
> memcpy((dest), (src), (count) * sizeof(*(src))); \
>  })
>
> -#define zero(x) (memset(&(x), 0, sizeof(x)))
> -
>  /* Define no kernel as 1, since that's an illegal offset for a kernel */
>  #define NO_KERNEL 1
>
> --
> 2.5.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.

2016-10-18 Thread Francisco Jerez
Ilia Mirkin  writes:

> Why does it care where those functions are defined? I thought it was
> all one big happy namespace, with the categories just there for
> general amusement. Could you shed some light on what the actual
> situation is?
>

Heh, I won't pretend to understand the dispatch generation mess, but
apparently gl_procs.py treats the ES (and GL_OES) categories specially
and emits forward declarations for them before the actual table,
possibly to hack around build failures with GLES entry points not
defined in desktop GL headers.

> On Tue, Oct 18, 2016 at 11:48 PM, Francisco Jerez  
> wrote:
>> These two GLES 3.2 entry points were being defined in the category of
>> the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions
>> respectively instead of in the ES3.2 category.  Defining them in the
>> ES3.2 category makes sure that the gl_procs.py generator emits
>> declarations in the glprocs.h header file for the unsuffixed GLES-only
>> entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR
>> respectively alias.  This should avoid a compilation failure during
>> scons builds in combination with "mapi: export all GLES 3.2 functions
>> in libGLESv2.so".
>> ---
>>  src/mapi/glapi/gen/gl_API.xml | 30 +-
>>  1 file changed, 17 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
>> index 5998ccf..00c9bb7 100644
>> --- a/src/mapi/glapi/gen/gl_API.xml
>> +++ b/src/mapi/glapi/gen/gl_API.xml
>> @@ -8296,6 +8296,23 @@
>>  
>>  > xmlns:xi="http://www.w3.org/2001/XInclude"/>
>>
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>>  
>>  
>>
>> @@ -8316,7 +8333,6 @@
>>  
>>  
>>
>> -
>>  
>>  
>>
>> @@ -8332,18 +8348,6 @@
>>  
>>  
>>
>> -
>> -
>> -
>> -
>> -
>> -
>> -
>> -
>> -
>> -
>> -
>>  
>>  
>>  
>> --
>> 2.9.0
>>


signature.asc
Description: PGP signature


[Mesa-dev] [PATCH] nv50, nvc0: avoid reading out of bounds when getting bogus so info

2016-10-18 Thread Ilia Mirkin
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.

Signed-off-by: Ilia Mirkin 
---
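A minimal sketch of the kind of guard this patch adds, under stated
assumptions: OutputInfo, StreamOutput and buildStreamOutMap are
hypothetical, simplified stand-ins rather than the driver's structures,
and the per-component check is an extra precaution of the sketch that the
patch itself does not add.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the driver structures.
struct OutputInfo {
   unsigned char slot[4];    // hardware slot of each output component
};

struct StreamOutput {
   unsigned registerIndex;   // index into the shader's output array
   unsigned startComponent;
   unsigned numComponents;
};

// Builds the stream-output slot map, skipping entries that refer to an
// output register the shader does not have.
std::vector<unsigned> buildStreamOutMap(const std::vector<OutputInfo> &outputs,
                                        const std::vector<StreamOutput> &so)
{
   std::vector<unsigned> map;
   for (const StreamOutput &o : so) {
      if (o.registerIndex >= outputs.size())
         continue;   // bogus info: would read past the output array
      for (unsigned c = 0; c < o.numComponents; ++c) {
         const unsigned comp = o.startComponent + c;
         if (comp >= 4)
            break;   // extra guard in this sketch only
         map.push_back(outputs[o.registerIndex].slot[comp]);
      }
   }
   return map;
}
```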
 src/gallium/drivers/nouveau/nv50/nv50_program.c | 3 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 7 +--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c
index 1e39427..9081cd8 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
@@ -308,6 +308,9 @@ nv50_program_create_strmout_state(const struct nv50_ir_prog_info *info,
   const unsigned r = pso->output[i].register_index;
   b = pso->output[i].output_buffer;
 
+  if (r >= info->numOutputs)
+ continue;
+
   for (c = 0; c < pso->output[i].num_components; ++c)
  so->map[base[b] + p + c] = info->out[r].slot[s + c];
}
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 867d84a..50f8083 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -509,11 +509,14 @@ nvc0_program_create_tfb_state(const struct nv50_ir_prog_info *info,
for (i = 0; i < pso->num_outputs; ++i) {
   unsigned s = pso->output[i].start_component;
   unsigned p = pso->output[i].dst_offset;
+  const unsigned r = pso->output[i].register_index;
   b = pso->output[i].output_buffer;
 
+  if (r >= info->numOutputs)
+ continue;
+
   for (c = 0; c < pso->output[i].num_components; ++c)
- tfb->varying_index[b][p++] =
-info->out[pso->output[i].register_index].slot[s + c];
+ tfb->varying_index[b][p++] = info->out[r].slot[s + c];
 
   tfb->varying_count[b] = MAX2(tfb->varying_count[b], p);
   tfb->stream[b] = pso->output[i].stream;
-- 
2.7.3



Re: [Mesa-dev] [PATCH 1/2] glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.

2016-10-18 Thread Ilia Mirkin
Why does it care where those functions are defined? I thought it was
all one big happy namespace, with the categories just there for
general amusement. Could you shed some light on what the actual
situation is?

On Tue, Oct 18, 2016 at 11:48 PM, Francisco Jerez  wrote:
> These two GLES 3.2 entry points were being defined in the category of
> the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions
> respectively instead of in the ES3.2 category.  Defining them in the
> ES3.2 category makes sure that the gl_procs.py generator emits
> declarations in the glprocs.h header file for the unsuffixed GLES-only
> entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR
> respectively alias.  This should avoid a compilation failure during
> scons builds in combination with "mapi: export all GLES 3.2 functions
> in libGLESv2.so".
> ---
>  src/mapi/glapi/gen/gl_API.xml | 30 +-
>  1 file changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 5998ccf..00c9bb7 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -8296,6 +8296,23 @@
>  
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
>
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
>  
>  
>
> @@ -8316,7 +8333,6 @@
>  
>  
>
> -
>  
>  
>
> @@ -8332,18 +8348,6 @@
>  
>  
>
> -
> -
> -
> -
> -
> -
> -
> -
> -
> -
> -
>  
>  
>  
> --
> 2.9.0
>


[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.

2016-10-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98172

--- Comment #28 from Suzuki, Shinji  ---
Yes. I agree with you that we can do without a per-sync-object mutex if
we allow all waiters to enter fence_finish() freely.
That said, a per-sync-object mutex has another benefit: it can
potentially reduce lock contention among waiters on different sync
objects, and with other Mesa components that deal with shared
resources. To be fair, I also have to mention that ctx->Shared.Mutex is
touched in so many places that optimizing in this particular context
alone may not make much sense, and adding a mutex certainly has
associated overhead. Overall, I vote +1 on your strategy if free
execution of fence_finish() is to be allowed.


On Wed, Oct 19, 2016 at 10:21 AM,   wrote:
> Comment # 27 on bug 98172 from Michel Dänzer
>
> Note that if we allow concurrent fence_finish calls, I don't think we need a
> per-sync-object mutex.
>
> 
> You are receiving this mail because:
>
> You reported the bug.
> You are on the CC list for the bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 2/2] Revert "Revert "mapi: export all GLES 3.2 functions in libGLESv2.so""

2016-10-18 Thread Francisco Jerez
This reverts commit 85e9bbc14d93fa7166c9ae075ee7ae29a8313e3f.  The
previous commit should help with the scons build failure caused by the
original commit.
---
 src/mapi/glapi/gen/static_data.py | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/mapi/glapi/gen/static_data.py b/src/mapi/glapi/gen/static_data.py
index 2f403e9..25e78bf 100644
--- a/src/mapi/glapi/gen/static_data.py
+++ b/src/mapi/glapi/gen/static_data.py
@@ -484,17 +484,22 @@ functions = [
 "BindVertexBuffer",
 "BindVertexBuffers",
 "Bitmap",
+"BlendBarrier",
 "BlendColor",
 "BlendColorEXT",
 "BlendEquation",
 "BlendEquationEXT",
+"BlendEquationi",
 "BlendEquationiARB",
 "BlendEquationSeparate",
+"BlendEquationSeparatei",
 "BlendEquationSeparateiARB",
 "BlendFunc",
+"BlendFunci",
 "BlendFunciARB",
 "BlendFuncSeparate",
 "BlendFuncSeparateEXT",
+"BlendFuncSeparatei",
 "BlendFuncSeparateiARB",
 "BlitFramebuffer",
 "BufferData",
@@ -825,6 +830,7 @@ functions = [
 "GetFramebufferAttachmentParameteriv",
 "GetFramebufferAttachmentParameterivEXT",
 "GetFramebufferParameteriv",
+"GetGraphicsResetStatus",
 "GetGraphicsResetStatusARB",
 "GetHandleARB",
 "GetHistogram",
@@ -864,8 +870,11 @@ functions = [
 "GetnSeparableFilterARB",
 "GetnTexImageARB",
 "GetnUniformdvARB",
+"GetnUniformfv",
 "GetnUniformfvARB",
+"GetnUniformiv",
 "GetnUniformivARB",
+"GetnUniformuiv",
 "GetnUniformuivARB",
 "GetObjectLabel",
 "GetObjectParameterfvARB",
@@ -1160,6 +1169,7 @@ functions = [
 "Orthof",
 "Orthox",
 "PassThrough",
+"PatchParameteri",
 "PauseTransformFeedback",
 "PixelMapfv",
 "PixelMapuiv",
@@ -1191,6 +1201,7 @@ functions = [
 "PopDebugGroup",
 "PopMatrix",
 "PopName",
+"PrimitiveBoundingBox",
 "PrimitiveRestartIndex",
 "PrimitiveRestartIndexNV",
 "PrimitiveRestartNV",
@@ -1273,6 +1284,7 @@ functions = [
 "RasterPos4s",
 "RasterPos4sv",
 "ReadBuffer",
+"ReadnPixels",
 "ReadnPixelsARB",
 "ReadPixels",
 "Rectd",
-- 
2.9.0



[Mesa-dev] [PATCH 1/2] glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.

2016-10-18 Thread Francisco Jerez
These two GLES 3.2 entry points were being defined in the category of
the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions
respectively instead of in the ES3.2 category.  Defining them in the
ES3.2 category makes sure that the gl_procs.py generator emits
declarations in the glprocs.h header file for the unsuffixed GLES-only
entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR
respectively alias.  This should avoid a compilation failure during
scons builds in combination with "mapi: export all GLES 3.2 functions
in libGLESv2.so".
---
 src/mapi/glapi/gen/gl_API.xml | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 5998ccf..00c9bb7 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8296,6 +8296,23 @@
 
 http://www.w3.org/2001/XInclude"/>
 
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
 
 
 
@@ -8316,7 +8333,6 @@
 
 
 
-
 
 
 
@@ -8332,18 +8348,6 @@
 
 
 
-
-
-
-
-
-
-
-
-
-
-
 
 
 
-- 
2.9.0



[Mesa-dev] [PATCH] anv: drop unused zero macro.

2016-10-18 Thread Dave Airlie
From: Dave Airlie 

I can't see this being used anywhere.

Signed-off-by: Dave Airlie 
---
 src/intel/vulkan/anv_private.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 0e25827..3fe9d7d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -163,8 +163,6 @@ anv_clear_mask(uint32_t *inout_mask, uint32_t clear_mask)
memcpy((dest), (src), (count) * sizeof(*(src))); \
 })
 
-#define zero(x) (memset(&(x), 0, sizeof(x)))
-
 /* Define no kernel as 1, since that's an illegal offset for a kernel */
 #define NO_KERNEL 1
 
-- 
2.5.5



Re: [Mesa-dev] [PATCH 11/25] mesa/i965/i915/r200: eliminate gl_vertex_program

2016-10-18 Thread Timothy Arceri
On Tue, 2016-10-18 at 12:07 -0700, Ian Romanick wrote:
> I'd like to see two tiny changes:
> 
> 1. A comment for the IsPositionInvariant field that it can only be
> true
> for vertex programs.

I already had a comment saying that this is only used for assembly-style
vertex programs. I've reworded it to say it can only be true for them :)

> 
> 2. An assertion or two like
> 
> assert(p->Target == GL_VERTEX_PROGRAM_ARB ||
>    !p->IsPositionInvariant);

I'm not sure how useful this is.
> 
>    in reasonable places.  I'm thinking:
> 
>    - Where it's assigned in src/mesa/program/arbprogparse.c

It's assigned in mesa_parse_arb_vertex_program() and there is already an

assert(target == GL_VERTEX_PROGRAM_ARB);

> 
>    - Where it's used in src/mesa/state_tracker/st_program.c,

Again this is in st_translate_vertex_program()

so if it's not a vp we already have problems.

>  src/mesa/drivers/dri/i965/brw_program.c, and

It's used inside case GL_VERTEX_PROGRAM_ARB:

I've added the assert to the top of the function but it seems kind of
pointless.

>  src/mesa/tnl/t_vb_program.c (both places).

In both of these the program always comes from
ctx->VertexProgram._Current, so it doesn't seem very useful here either.

> 
> I'd also support a follow-up patch that converts IsPositionInvariant
> from GLboolean to bool. :)
> 
> On 10/17/2016 11:12 PM, Timothy Arceri wrote:
> > 
> > Here we move the only field in gl_vertex_program to the
> > ARB program fields in gl_program.
> > ---
> >  src/mesa/drivers/common/meta.c   | 10 +--
> >  src/mesa/drivers/common/meta.h   |  2 +-
> >  src/mesa/drivers/dri/i915/i915_fragprog.c|  4 +-
> >  src/mesa/drivers/dri/i965/brw_context.h  |  8 +--
> >  src/mesa/drivers/dri/i965/brw_curbe.c|  2 +-
> >  src/mesa/drivers/dri/i965/brw_draw.c |  4 +-
> >  src/mesa/drivers/dri/i965/brw_program.c  |  5 +-
> >  src/mesa/drivers/dri/i965/brw_vs.c   | 41 ++--
> >  src/mesa/drivers/dri/i965/brw_vs_surface_state.c |  2 +-
> >  src/mesa/drivers/dri/i965/gen6_vs_state.c|  4 +-
> >  src/mesa/drivers/dri/r200/r200_context.h |  2 +-
> >  src/mesa/drivers/dri/r200/r200_state_init.c  |  4 +-
> >  src/mesa/drivers/dri/r200/r200_tcl.c |  2 +-
> >  src/mesa/drivers/dri/r200/r200_vertprog.c| 82
> > 
> >  src/mesa/main/arbprogram.c   | 19 +++---
> >  src/mesa/main/context.c  |  8 +--
> >  src/mesa/main/ff_fragment_shader.cpp |  2 +-
> >  src/mesa/main/ffvertex_prog.c| 72 ++
> > ---
> >  src/mesa/main/ffvertex_prog.h|  2 +-
> >  src/mesa/main/mtypes.h   | 17 ++---
> >  src/mesa/main/shared.c   |  5 +-
> >  src/mesa/main/state.c| 26 
> >  src/mesa/main/state.h|  2 +-
> >  src/mesa/program/arbprogparse.c  | 46 ++
> > ---
> >  src/mesa/program/arbprogparse.h  |  2 +-
> >  src/mesa/program/prog_statevars.c|  8 +--
> >  src/mesa/program/program.c   | 15 ++---
> >  src/mesa/program/program.h   | 26 
> >  src/mesa/program/programopt.c| 42 ++--
> >  src/mesa/program/programopt.h|  2 +-
> >  src/mesa/state_tracker/st_atom.c |  4 +-
> >  src/mesa/state_tracker/st_atom_constbuf.c|  2 +-
> >  src/mesa/state_tracker/st_atom_rasterizer.c  |  8 +--
> >  src/mesa/state_tracker/st_atom_sampler.c |  2 +-
> >  src/mesa/state_tracker/st_atom_shader.c  |  4 +-
> >  src/mesa/state_tracker/st_atom_texture.c |  2 +-
> >  src/mesa/state_tracker/st_cb_feedback.c  |  2 +-
> >  src/mesa/state_tracker/st_cb_program.c   |  2 +-
> >  src/mesa/state_tracker/st_debug.c|  4 +-
> >  src/mesa/state_tracker/st_program.c  | 35 +-
> >  src/mesa/state_tracker/st_program.h  |  4 +-
> >  src/mesa/tnl/t_context.c |  4 +-
> >  src/mesa/tnl/t_vb_program.c  | 24 +++
> >  src/mesa/tnl/t_vp_build.c|  4 +-
> >  src/mesa/vbo/vbo_exec_draw.c |  4 +-
> >  src/mesa/vbo/vbo_save_draw.c |  4 +-
> >  46 files changed, 264 insertions(+), 311 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/common/meta.c
> > b/src/mesa/drivers/common/meta.c
> > index 890e98a..ab81eed 100644
> > --- a/src/mesa/drivers/common/meta.c
> > +++ b/src/mesa/drivers/common/meta.c
> > @@ -566,8 +566,8 @@ _mesa_meta_begin(struct gl_context *ctx,
> > GLbitfield state)
> >  
> >    if (ctx->Extensions.ARB_vertex_program) {
> >   save->VertexProgramEnabled = ctx->VertexProgram.Enabled;
> > - _mesa_reference_vertprog(ctx, &save->VertexProgram,
> > -  

[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.

2016-10-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98172

--- Comment #27 from Michel Dänzer  ---
Note that if we allow concurrent fence_finish calls, I don't think we need a
per-sync-object mutex.
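
One way to picture concurrent fence_finish calls without a
per-sync-object mutex is sketched below. This is only an illustration
under stated assumptions, not the Mesa state tracker code: Fence,
SyncObject and clientWait are hypothetical names, the fence is modeled as
an atomic flag, and the shared mutex is held only while taking or
dropping a reference, never while waiting.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical fence: any number of threads may wait on it at once.
struct Fence {
   std::atomic<bool> signaled{false};

   // Waits up to `timeout` for the fence; a zero timeout is a pure poll.
   bool finish(std::chrono::nanoseconds timeout) const
   {
      const auto deadline = std::chrono::steady_clock::now() + timeout;
      while (!signaled.load(std::memory_order_acquire)) {
         if (std::chrono::steady_clock::now() >= deadline)
            return false;
         std::this_thread::yield();
      }
      return true;
   }
};

// Hypothetical sync object; a null fence means "already signaled".
struct SyncObject {
   std::shared_ptr<Fence> fence;
};

bool clientWait(std::mutex &sharedMutex, SyncObject &so,
                std::chrono::nanoseconds timeout)
{
   std::shared_ptr<Fence> fence;
   {
      // Hold the shared lock only long enough to grab a reference.
      std::lock_guard<std::mutex> lock(sharedMutex);
      fence = so.fence;
   }
   if (!fence)
      return true;

   // Wait outside the lock, so a long wait in one thread cannot stall
   // a zero-timeout poll in another.
   if (!fence->finish(timeout))
      return false;

   // Drop the sync object's reference unless another waiter already did.
   std::lock_guard<std::mutex> lock(sharedMutex);
   if (so.fence == fence)
      so.fence.reset();
   return true;
}
```

Because each waiter holds its own reference, the fence stays valid even
if another thread clears the sync object's pointer in the meantime.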

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.

2016-10-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98172

--- Comment #26 from Michel Dänzer  ---
(In reply to Marek Olšák from comment #24)
> Hm. Probably none.

Actually, I think there are: E.g. consider one thread calling glClientWaitSync
with a non-0 timeout, blocking for some time with the mutex locked. If another
thread calls glClientWaitSync with a 0 timeout (or whichever API call ends up
in st_check_sync) during that time, it'll block until the first thread unlocks
the mutex.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Michael Schellenberger Costa

Hi Jan,


On 18.10.2016 00:07, Jan Ziak wrote:

This patch replaces the ir_variable_refcount_entry's linked-list
with an array-list.

The array-list has local storage which does not require ANY additional
allocations if the list has small number of elements. The size of this
storage is configurable for each variable.

Benchmark results for "./run -1 shaders" from shader-db[1]:

- The total number of executed instructions goes down from 64.184 to 63.797
   giga-instructions when Mesa is compiled with "gcc -O0 ..."
- In the call tree starting at function do_dead_code():
   - the number of calls to malloc() is reduced by about 10%
   - the number of calls to free() is reduced by about 30%

[1] git://anongit.freedesktop.org/mesa/shader-db

Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0...@gmail.com>
---
  src/compiler/glsl/ir_variable_refcount.cpp |  14 +--
  src/compiler/glsl/ir_variable_refcount.h   |   8 +-
  src/compiler/glsl/opt_dead_code.cpp|  19 ++--
  src/util/fast_list.h   | 167 +
  4 files changed, 176 insertions(+), 32 deletions(-)

diff --git a/src/compiler/glsl/ir_variable_refcount.cpp 
b/src/compiler/glsl/ir_variable_refcount.cpp
index 8306be1..94d6edc 100644
--- a/src/compiler/glsl/ir_variable_refcount.cpp
+++ b/src/compiler/glsl/ir_variable_refcount.cpp
@@ -46,15 +46,6 @@ static void
  free_entry(struct hash_entry *entry)
  {
 ir_variable_refcount_entry *ivre = (ir_variable_refcount_entry *) 
entry->data;
-
-   /* Free assignment list */
-   exec_node *n;
-   while ((n = ivre->assign_list.pop_head()) != NULL) {
-  struct assignment_entry *assignment_entry =
- exec_node_data(struct assignment_entry, n, link);
-  free(assignment_entry);
-   }
-
 delete ivre;
  }
  
@@ -142,10 +133,7 @@ ir_variable_refcount_visitor::visit_leave(ir_assignment *ir)

 */
assert(entry->referenced_count >= entry->assigned_count);
if (entry->referenced_count == entry->assigned_count) {
- struct assignment_entry *assignment_entry =
-(struct assignment_entry *)calloc(1, sizeof(*assignment_entry));
- assignment_entry->assign = ir;
- entry->assign_list.push_head(&assignment_entry->link);
+ entry->assign_list.add(ir);
}
 }
  
diff --git a/src/compiler/glsl/ir_variable_refcount.h b/src/compiler/glsl/ir_variable_refcount.h

index 08a11c0..c3ec5fe 100644
--- a/src/compiler/glsl/ir_variable_refcount.h
+++ b/src/compiler/glsl/ir_variable_refcount.h
@@ -32,11 +32,7 @@
  #include "ir.h"
  #include "ir_visitor.h"
  #include "compiler/glsl_types.h"
-
-struct assignment_entry {
-   exec_node link;
-   ir_assignment *assign;
-};
+#include "util/fast_list.h"
  
  class ir_variable_refcount_entry

  {
@@ -50,7 +46,7 @@ public:
  * This is intended to be used for dead code optimisation and may
  * not be a complete list.
  */
-   exec_list assign_list;
+   arraylist assign_list;
  
 /** Number of times the variable is referenced, including assignments. */

 unsigned referenced_count;
diff --git a/src/compiler/glsl/opt_dead_code.cpp 
b/src/compiler/glsl/opt_dead_code.cpp
index 75e668a..06e8c3d 100644
--- a/src/compiler/glsl/opt_dead_code.cpp
+++ b/src/compiler/glsl/opt_dead_code.cpp
@@ -52,7 +52,7 @@ do_dead_code(exec_list *instructions, bool 
uniform_locations_assigned)
  
 struct hash_entry *e;

 hash_table_foreach(v.ht, e) {
-  ir_variable_refcount_entry *entry = (ir_variable_refcount_entry 
*)e->data;
+  ir_variable_refcount_entry *const entry = (ir_variable_refcount_entry 
*)e->data;
  
/* Since each assignment is a reference, the refereneced count must be

 * greater than or equal to the assignment count.  If they are equal,
@@ -89,7 +89,7 @@ do_dead_code(exec_list *instructions, bool 
uniform_locations_assigned)
if (entry->var->data.always_active_io)
   continue;
  
-  if (!entry->assign_list.is_empty()) {

+  if (!entry->assign_list.empty()) {
 /* Remove all the dead assignments to the variable we found.
  * Don't do so if it's a shader or function output, though.
  */
@@ -98,26 +98,19 @@ do_dead_code(exec_list *instructions, bool 
uniform_locations_assigned)
   entry->var->data.mode != ir_var_shader_out &&
   entry->var->data.mode != ir_var_shader_storage) {
  
-while (!entry->assign_list.is_empty()) {

-   struct assignment_entry *assignment_entry =
-  exec_node_data(struct assignment_entry,
- entry->assign_list.get_head_raw(), link);
-
-  assignment_entry->assign->remove();
-
+for(ir_assignment *assign : entry->assign_list) {
The existing code puts a space between control-flow keywords such as
for or while and the opening parenthesis, i.e. "for (...". This applies
to all of the code.

+  assign->remove();
   if (debug) 

Re: [Mesa-dev] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Maciej Cencora
On wtorek, 18 października 2016 00:07:18 CEST Jan Ziak wrote:
> This patch replaces the ir_variable_refcount_entry's linked-list
> with an array-list.
>
> The array-list has local storage which does not require ANY additional
> allocations if the list has small number of elements. The size of this
> storage is configurable for each variable.
>
> Benchmark results for "./run -1 shaders" from shader-db[1]:
>
> - The total number of executed instructions goes down from 64.184 to 63.797
>   giga-instructions when Mesa is compiled with "gcc -O0 ..."

Hi,

The total number of instructions at -O0 is not a good indicator of whether
this change is beneficial from a performance point of view.
You should check it with -O2, or whatever the default is in Mesa release
builds.

> - In the call tree starting at function do_dead_code():
>   - the number of calls to malloc() is reduced by about 10%
>   - the number of calls to free() is reduced by about 30%

These are certainly a win.
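
For readers unfamiliar with the idea, an array-list with local storage
can be sketched roughly as follows. This is not the fast_list.h from the
patch, which is not shown here: SmallList and its members are
hypothetical names, and the sketch assumes trivially copyable elements
such as raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Array-list that keeps its first N elements inside the object itself
// and only calls malloc() once that local storage is full.
template <typename T, std::size_t N>
class SmallList {
   static_assert(N > 0, "local storage must hold at least one element");
   static_assert(std::is_trivially_copyable<T>::value,
                 "this sketch moves elements with memcpy");

public:
   SmallList() = default;
   SmallList(const SmallList &) = delete;
   SmallList &operator=(const SmallList &) = delete;

   ~SmallList()
   {
      if (data_ != local_)
         std::free(data_);
   }

   void add(T value)
   {
      if (size_ == capacity_)
         grow();
      data_[size_++] = value;
   }

   bool empty() const { return size_ == 0; }
   std::size_t size() const { return size_; }
   bool usesHeap() const { return data_ != local_; }

   T *begin() { return data_; }
   T *end() { return data_ + size_; }

private:
   void grow()
   {
      const std::size_t newCapacity = capacity_ * 2;
      T *p = static_cast<T *>(std::malloc(newCapacity * sizeof(T)));
      if (!p)
         std::abort();
      std::memcpy(p, data_, size_ * sizeof(T));
      if (data_ != local_)
         std::free(data_);
      data_ = p;
      capacity_ = newCapacity;
   }

   T local_[N];                 // in-object storage, no allocation needed
   T *data_ = local_;           // points at local_ until the first grow()
   std::size_t size_ = 0;
   std::size_t capacity_ = N;
};
```

As long as a variable has at most N recorded assignments, adding to and
destroying such a list never touches malloc() or free(), which is where
the reduction in allocator calls would come from.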

>
> [1] git://anongit.freedesktop.org/mesa/shader-db
>
> Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0...@gmail.com
>
> ---
>  src/compiler/glsl/ir_variable_refcount.cpp |  14 +--
>  src/compiler/glsl/ir_variable_refcount.h   |   8 +-
>  src/compiler/glsl/opt_dead_code.cpp|  19 ++--
>  src/util/fast_list.h   | 167
+
>  4 files changed, 176 insertions(+), 32 deletions(-)
>
> diff --git a/src/compiler/glsl/ir_variable_refcount.cpp
b/src/compiler/glsl/ir_variable_refcount.cpp
> index 8306be1..94d6edc 100644
> --- a/src/compiler/glsl/ir_variable_refcount.cpp
> +++ b/src/compiler/glsl/ir_variable_refcount.cpp
> @@ -46,15 +46,6 @@ static void
>  free_entry(struct hash_entry *entry)
>  {
> ir_variable_refcount_entry *ivre = (ir_variable_refcount_entry *)
entry->data;
> -
> -   /* Free assignment list */
> -   exec_node *n;
> -   while ((n = ivre->assign_list.pop_head()) != NULL) {
> -  struct assignment_entry *assignment_entry =
> - exec_node_data(struct assignment_entry, n, link);
> -  free(assignment_entry);
> -   }
> -
> delete ivre;
>  }
>
> @@ -142,10 +133,7 @@
ir_variable_refcount_visitor::visit_leave(ir_assignment *ir)
> */
>assert(entry->referenced_count >= entry->assigned_count);
>if (entry->referenced_count == entry->assigned_count) {
> - struct assignment_entry *assignment_entry =
> -(struct assignment_entry *)calloc(1,
sizeof(*assignment_entry));
> - assignment_entry->assign = ir;
> - entry->assign_list.push_head(&assignment_entry->link);
> + entry->assign_list.add(ir);
>}
> }
>
> diff --git a/src/compiler/glsl/ir_variable_refcount.h
b/src/compiler/glsl/ir_variable_refcount.h
> index 08a11c0..c3ec5fe 100644
> --- a/src/compiler/glsl/ir_variable_refcount.h
> +++ b/src/compiler/glsl/ir_variable_refcount.h
> @@ -32,11 +32,7 @@
>  #include "ir.h"
>  #include "ir_visitor.h"
>  #include "compiler/glsl_types.h"
> -
> -struct assignment_entry {
> -   exec_node link;
> -   ir_assignment *assign;
> -};
> +#include "util/fast_list.h"
>
>  class ir_variable_refcount_entry
>  {
> @@ -50,7 +46,7 @@ public:
>  * This is intended to be used for dead code optimisation and may
>  * not be a complete list.
>  */
> -   exec_list assign_list;
> +   arraylist assign_list;
>
> /** Number of times the variable is referenced, including
assignments. */
> unsigned referenced_count;
> diff --git a/src/compiler/glsl/opt_dead_code.cpp
b/src/compiler/glsl/opt_dead_code.cpp
> index 75e668a..06e8c3d 100644
> --- a/src/compiler/glsl/opt_dead_code.cpp
> +++ b/src/compiler/glsl/opt_dead_code.cpp
> @@ -52,7 +52,7 @@ do_dead_code(exec_list *instructions, bool
uniform_locations_assigned)
>
> struct hash_entry *e;
> hash_table_foreach(v.ht, e) {
> -  ir_variable_refcount_entry *entry = (ir_variable_refcount_entry
*)e->data;
> +  ir_variable_refcount_entry *const entry =
(ir_variable_refcount_entry *)e->data;
>
>/* Since each assignment is a reference, the refereneced count
must be
> * greater than or equal to the assignment count.  If they are
equal,
> @@ -89,7 +89,7 @@ do_dead_code(exec_list *instructions, bool
uniform_locations_assigned)
>if (entry->var->data.always_active_io)
>   continue;
>
> -  if (!entry->assign_list.is_empty()) {
> +  if (!entry->assign_list.empty()) {
>   /* Remove all the dead assignments to the variable we found.
>* Don't do so if it's a shader or function output, though.
>*/
> @@ -98,26 +98,19 @@ do_dead_code(exec_list *instructions, bool
uniform_locations_assigned)
>   entry->var->data.mode != ir_var_shader_out &&
>   entry->var->data.mode != ir_var_shader_storage) {
>
> -while (!entry->assign_list.is_empty()) {
> -   struct assignment_entry *assignment_entry =
> -  exec_node_data(struct assignment_entry,
> -  

Re: [Mesa-dev] [PATCH v2 103/103] i965/gen7: expose OpenGL 4.0 on Haswell

2016-10-18 Thread Kenneth Graunke
On Tuesday, October 18, 2016 5:12:27 PM PDT Ian Romanick wrote:
> On 10/11/2016 02:02 AM, Iago Toral Quiroga wrote:
> > ARB_gpu_shader_fp64 was the last piece missing. Notice that some
> > hardware and kernel combinations do not support pipelined register
> > writes, which are required for some OpenGL 4.0 features, in which
> > case the driver won't expose 4.0.
> > ---
> >  src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++
> >  src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> > b/src/mesa/drivers/dri/i965/intel_extensions.c
> > index 0491145..a291cd5 100644
> > --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> > @@ -272,6 +272,8 @@ intelInitExtensions(struct gl_context *ctx)
> >  
> > if (brw->gen >= 8)
> >ctx->Const.GLSLVersion = 440;
> > +   else if (brw->is_haswell)
> > +  ctx->Const.GLSLVersion = 400;
> > else if (brw->gen >= 6)
> >ctx->Const.GLSLVersion = 330;
> > else
> > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> > b/src/mesa/drivers/dri/i965/intel_screen.c
> > index 9b23bac..1af7fe6 100644
> > --- a/src/mesa/drivers/dri/i965/intel_screen.c
> > +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> > @@ -1445,7 +1445,7 @@ set_max_gl_versions(struct intel_screen *screen)
> >dri_screen->max_gl_es2_version = has_astc ? 32 : 31;
> >break;
> > case 7:
> > -  dri_screen->max_gl_core_version = 33;
> > +  dri_screen->max_gl_core_version = screen->devinfo.is_haswell ? 40 : 
> > 33;
> 
> I *think* this needs to take the pipelined register writes into
> consideration.  My understanding is if you say 40 here, then
> glXCreateContextAttribs will allow creation of an OpenGL 4.0 context...
> but the context may only be 3.3.

Good catch, Ian.  Checking brw->can_do_pipelined_register_writes here
would be right...but it's awkward, since it's stored in the context, and
doesn't get populated until we actually make a context and run things on
the GPU.  That's probably not too feasible here in screen init time,
where we're trying to decide what kind of contexts to even support.

To make life easier, I might just do:

   dri_screen->max_gl_core_version = screen->has_mi_math_and_lrr ? 40 : 33;

which is the check we use for ARB_query_buffer_object.  On Haswell,
it implies a high enough command parser version that we can do
everything we need.  (We could actually get away with an older kernel
version, but I'm not sure I care...as we move toward 4.1/4.2/4.3 we'd
need to bump it higher anyway...)

The one gotcha is that has_mi_math_and_lrr / cmd_parser_version get
initialized after set_max_gl_versions() is called, so you'll need to
reorder those in the caller.  Should be straightforward.
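
A minimal sketch of that reordering, not the actual intel_screen.c code. The struct and the helper names `init_cmd_parser_caps` and `set_max_gl_versions_gen7` are hypothetical, and the command parser threshold is a parameter because the real value is not stated in this thread. Only the `has_mi_math_and_lrr ? 40 : 33` selection is taken from the suggestion above.

```cpp
// Hypothetical stand-in for the relevant fields of struct intel_screen.
struct screen_sketch {
   int cmd_parser_version;
   bool has_mi_math_and_lrr;
   int max_gl_core_version;
};

// Must run first: derives the capability flag from the kernel's command
// parser version. required_version is a placeholder, not the real threshold.
static void init_cmd_parser_caps(screen_sketch &s, int required_version)
{
   s.has_mi_math_and_lrr = s.cmd_parser_version >= required_version;
}

// Must run second: picks the advertised core version for gen7 from the
// flag computed above.
static void set_max_gl_versions_gen7(screen_sketch &s)
{
   s.max_gl_core_version = s.has_mi_math_and_lrr ? 40 : 33;
}
```

Calling the two helpers in the opposite order would read the flag before it is populated and always advertise 33, which is the gotcha described above.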


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 98271] [radeonsi]Playing videos with vdpau or vaapi hardware acceleration crashes my pc

2016-10-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98271

--- Comment #20 from John  ---
> Installing an older kernel, see if that works with 12.0 mesa.
> If yes we have narrowed it down to the kernel, if not we 
> need to stick a bit more into mesa.
I've tried with a 3.18 kernel and still got the issue, so the issue is not in
the kernel. I had the firmware files from that date as well to eliminate that
possibility.

> Another possibility which came to my mind is that this might not
> be an issue with UVD decoding, but rather presenting it.
> E.g. install both VDPAU and OpenGL from a certain Mesa version
> *AND* make sure that you restart X after that so that the
> X acceleration uses the new library versions as well.
Now this is interesting, as the reboots only ever happened after a freeze,
never to test a particular Mesa version.
I've rolled back to 11 and restarted the computer and will try.

Since you mentioned presenting, could it be the DDX?


New information: I don't need to have the video on screen for the issue to
happen. I can alt-tab or switch to another virtual desktop while the script
runs and it still freezes.



Re: [Mesa-dev] [PATCH v2 021/103] i965/vec4: implement double unpacking

2016-10-18 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 04f70ef..2631bf3 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1538,6 +1538,18 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>break;
> }
>  
> +   case nir_op_unpack_double_2x32_split_x:
> +   case nir_op_unpack_double_2x32_split_y: {
> +  enum opcode oper = (instr->op == nir_op_unpack_double_2x32_split_x) ?
> + VEC4_OPCODE_PICK_LOW_32BIT : VEC4_OPCODE_PICK_HIGH_32BIT;
> +  dst_reg tmp = dst_reg(this, glsl_type::dvec4_type);
> +  emit(MOV(tmp, op[0]));
> +  dst_reg tmp2 = dst_reg(this, glsl_type::uvec4_type);
> +  emit(oper, tmp2, src_reg(tmp));
> +  emit(MOV(dst, src_reg(tmp2)));
> +  break;
> +   }
> +
> case nir_op_unpack_half_2x16:
>/* As NIR does not guarantee that we have a correct swizzle outside the
> * boundaries of a vector, and the implementation of 
> emit_unpack_half_2x16
> 



Re: [Mesa-dev] [PATCH v2 020/103] i965/vec4: don't copy propagate vector opcodes that operate in align1 mode

2016-10-18 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> Basically, ALIGN1 mode will ignore swizzles on the input vectors so we don't
> want the copy propagation pass to mess with them.
> ---
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 24 
> ++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> index 545f4c7..d0045a7 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> @@ -283,6 +283,22 @@ try_constant_propagate(const struct gen_device_info 
> *devinfo,
>  }
>  
>  static bool
> +is_align1_opcode(unsigned opcode)
> +{
> +   switch (opcode) {
> +   case VEC4_OPCODE_DOUBLE_TO_FLOAT:
> +   case VEC4_OPCODE_FLOAT_TO_DOUBLE:
> +   case VEC4_OPCODE_PICK_LOW_32BIT:
> +   case VEC4_OPCODE_PICK_HIGH_32BIT:
> +   case VEC4_OPCODE_SET_LOW_32BIT:
> +   case VEC4_OPCODE_SET_HIGH_32BIT:
> +  return true;
> +   default:
> +  return false;
> +   }
> +}
> +
> +static bool
>  try_copy_propagate(const struct gen_device_info *devinfo,
> vec4_instruction *inst, int arg,
> const copy_entry *entry, int attributes_per_reg)
> @@ -326,6 +342,14 @@ try_copy_propagate(const struct gen_device_info *devinfo,
>  
> unsigned composed_swizzle = brw_compose_swizzle(inst->src[arg].swizzle,
> value.swizzle);
> +
> +   /* Instructions that operate on vectors in ALIGN1 mode will ignore 
> swizzles
> +* so copy-propagation won't be safe if the composed swizzle is anything
> +* other than the identity.
> +*/
> +   if (is_align1_opcode(inst->opcode) && composed_swizzle != 
> BRW_SWIZZLE_XYZW)
> +  return false;
> +
> if (inst->is_3src(devinfo) &&
> (value.file == UNIFORM ||
>  (value.file == ATTR && attributes_per_reg != 1)) &&
> 



Re: [Mesa-dev] [PATCH v2 019/103] i965/vec4: Fix DCE for VEC4_OPCODE_SET_{LOW, HIGH}_32BIT

2016-10-18 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> These align1 opcodes do partial writes of 64-bit data. The problem is that we
> want to use them to write on the same register to implement packDouble2x32 and
> from the point of view of DCE, since both opcodes write to the same register,
> only the last one stands and decides to eliminate the first, which is
> not correct, so prevent this from happening.
> 
> v2: Make a helper in vec4_instruction to know if the instruction is an
> align1 partial write. This will come in handy when we implement a
> simd splitting pass in a later patch.
> ---
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h| 6 ++
>  src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp | 3 ++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> index a8e5f4a..7451f44 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> @@ -232,6 +232,12 @@ public:
> bool can_change_types() const;
> bool has_source_and_destination_hazard() const;
>  
> +   bool is_align1_partial_write()
> +   {
> +  return opcode == VEC4_OPCODE_SET_LOW_32BIT ||
> + opcode == VEC4_OPCODE_SET_HIGH_32BIT;
> +   }
> +
> bool reads_flag()
> {
>return predicate || opcode == VS_OPCODE_UNPACK_FLAGS_SIMD4X2;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> index 50706a9..950c6c8 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> @@ -109,7 +109,8 @@ vec4_visitor::dead_code_eliminate()
>  }
>   }
>  
> - if (inst->dst.file == VGRF && !inst->predicate) {
> + if (inst->dst.file == VGRF && !inst->predicate &&
> + !inst->is_align1_partial_write()) {
>  for (unsigned i = 0; i < regs_written(inst); i++) {
> for (int c = 0; c < 4; c++) {
>if (inst->dst.writemask & (1 << c)) {
> 
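
To illustrate the problem the commit message describes, here is a toy backward liveness walk, heavily simplified from what a real DCE pass does (no sources, no writemasks, one straight-line block). All names here (`inst_sketch`, `dce_sketch`, `honor_partial_writes`) are hypothetical; this is a sketch of the idea, not the i965 pass.

```cpp
#include <vector>

enum class op { mov, set_low_32bit, set_high_32bit };

struct inst_sketch {
   op opcode;
   int dst;     // destination register number
   bool dead;   // set by dce_sketch()
};

static bool is_partial_write(const inst_sketch &i)
{
   return i.opcode == op::set_low_32bit || i.opcode == op::set_high_32bit;
}

// Walks the block backwards. live_out marks registers read after the block.
// With honor_partial_writes == false every write is treated as overwriting
// the whole register, which models the behavior before the fix.
static void dce_sketch(std::vector<inst_sketch> &insts,
                       std::vector<bool> live_out,
                       bool honor_partial_writes)
{
   std::vector<bool> live = live_out;
   for (auto it = insts.rbegin(); it != insts.rend(); ++it) {
      if (!live[it->dst]) {
         it->dead = true;
         continue;
      }
      // A full write ends the register's liveness for earlier
      // instructions; a partial write must leave it live.
      if (!(honor_partial_writes && is_partial_write(*it)))
         live[it->dst] = false;
   }
}
```

With a SET_LOW followed by a SET_HIGH to the same live register, the walk marks the SET_LOW dead when partial writes are ignored and keeps both instructions when they are honored.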



Re: [Mesa-dev] [PATCH v2 017/103] i965/vec4: add VEC4_OPCODE_PICK_{LOW, HIGH}_32BIT opcodes

2016-10-18 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

We may be able to eliminate some of this after I do int64 support.  It
might be cleaner to do unpackInt2x32(doubleBitsToInt64(x)) at a higher
level of the compiler instead.
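
For reference, a CPU-side sketch of what picking the low or high 32 bits of a double means. The helper name `pick_32bit` is made up for illustration; it models the result of unpackDouble2x32 on one component, not the ALIGN1 region trick the patch uses.

```cpp
#include <cstdint>
#include <cstring>

// Returns the low (high == false) or high (high == true) 32 bits of the
// IEEE-754 bit pattern of x.
static uint32_t pick_32bit(double x, bool high)
{
   uint64_t bits;
   std::memcpy(&bits, &x, sizeof(bits));   // bit cast, like doubleBitsToInt64
   return high ? static_cast<uint32_t>(bits >> 32)
               : static_cast<uint32_t>(bits & 0xffffffffu);
}
```

For 1.0, whose bit pattern is 0x3FF0000000000000, the high half is 0x3FF00000 and the low half is 0.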

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> These opcodes will pick the low/high 32-bit in each 64-bit data element
> using Align1 mode. We will use this, for example, to do things like
> unpackDouble2x32.
> 
> We use Align1 mode because in order to implement this in Align16 mode
> we would need to use 32-bit logical swizzles (XZ for low, YW for high),
> but the IR works in terms of 64-bit logical swizzles for DF operands
> all the way up to codegen.
> 
> v2:
>  - use suboffset() instead of get_element_ud()
>  - no need to set the width on the dst
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  |  2 ++
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 
>  src/mesa/drivers/dri/i965/brw_vec4.cpp   |  4 
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 25 
> 
>  4 files changed, 35 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 79b96a4..8ffb50c 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1100,6 +1100,8 @@ enum opcode {
> VEC4_OPCODE_UNPACK_UNIFORM,
> VEC4_OPCODE_DOUBLE_TO_FLOAT,
> VEC4_OPCODE_FLOAT_TO_DOUBLE,
> +   VEC4_OPCODE_PICK_LOW_32BIT,
> +   VEC4_OPCODE_PICK_HIGH_32BIT,
>  
> FS_OPCODE_DDX_COARSE,
> FS_OPCODE_DDX_FINE,
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index b063f77..b2f3a56 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -321,6 +321,10 @@ brw_instruction_name(const struct gen_device_info 
> *devinfo, enum opcode op)
>return "double_to_float";
> case VEC4_OPCODE_FLOAT_TO_DOUBLE:
>return "float_to_double";
> +   case VEC4_OPCODE_PICK_LOW_32BIT:
> +  return "pick_low_32bit";
> +   case VEC4_OPCODE_PICK_HIGH_32BIT:
> +  return "pick_high_32bit";
>  
> case FS_OPCODE_DDX_COARSE:
>return "ddx_coarse";
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 40f8702..4fd04f1 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -255,6 +255,8 @@ vec4_instruction::can_do_writemask(const struct 
> gen_device_info *devinfo)
> case SHADER_OPCODE_GEN4_SCRATCH_READ:
> case VEC4_OPCODE_DOUBLE_TO_FLOAT:
> case VEC4_OPCODE_FLOAT_TO_DOUBLE:
> +   case VEC4_OPCODE_PICK_LOW_32BIT:
> +   case VEC4_OPCODE_PICK_HIGH_32BIT:
> case VS_OPCODE_PULL_CONSTANT_LOAD:
> case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
> case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
> @@ -510,6 +512,8 @@ vec4_visitor::opt_reduce_swizzle()
>  
>case VEC4_OPCODE_FLOAT_TO_DOUBLE:
>case VEC4_OPCODE_DOUBLE_TO_FLOAT:
> +  case VEC4_OPCODE_PICK_LOW_32BIT:
> +  case VEC4_OPCODE_PICK_HIGH_32BIT:
>   swizzle = brw_swizzle_for_size(4);
>   break;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 6f4c438..b8778c4 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1940,6 +1940,31 @@ generate_code(struct brw_codegen *p,
>   break;
>}
>  
> +  case VEC4_OPCODE_PICK_LOW_32BIT:
> +  case VEC4_OPCODE_PICK_HIGH_32BIT: {
> + /* Stores the low/high 32-bit of each 64-bit element in src[0] into
> +  * dst using ALIGN1 mode and a <8,4,2>:UD region on the source.
> +  */
> + assert(type_sz(src[0].type) == 8);
> + assert(type_sz(dst.type) == 4);
> +
> + brw_set_default_access_mode(p, BRW_ALIGN_1);
> +
> + dst = retype(dst, BRW_REGISTER_TYPE_UD);
> + dst.hstride = BRW_HORIZONTAL_STRIDE_1;
> +
> + src[0] = retype(src[0], BRW_REGISTER_TYPE_UD);
> + if (inst->opcode == VEC4_OPCODE_PICK_HIGH_32BIT)
> +src[0] = suboffset(src[0], 1);
> + src[0].vstride = BRW_VERTICAL_STRIDE_8;
> + src[0].width = BRW_WIDTH_4;
> + src[0].hstride = BRW_HORIZONTAL_STRIDE_2;
> + brw_MOV(p, dst, src[0]);
> +
> + brw_set_default_access_mode(p, BRW_ALIGN_16);
> + break;
> +  }
> +
>case VEC4_OPCODE_PACK_BYTES: {
>   /* Is effectively:
>*
> 



Re: [Mesa-dev] [PATCH v2 018/103] i965/vec4: add VEC4_OPCODE_SET_{LOW, HIGH}_32BIT opcodes

2016-10-18 Thread Ian Romanick
This patch is

Reviewed-by: Ian Romanick 

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> These opcodes will set the low/high 32-bit in each 64-bit data element
> using Align1 mode. We will use this to implement packDouble2x32.
> 
> We use Align1 mode because in order to implement this in Align16 mode
> we would need to use 32-bit logical swizzles (XZ for low, YW for high),
> but the IR works in terms of 64-bit logical swizzles for DF operands
> all the way up to codegen.
> 
> v2:
>  - use suboffset() instead of get_element_ud()
>  - no need to set the width on the dst
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  |  2 ++
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 
>  src/mesa/drivers/dri/i965/brw_vec4.cpp   |  4 
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 25 
> 
>  4 files changed, 35 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 8ffb50c..35d638c 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1102,6 +1102,8 @@ enum opcode {
> VEC4_OPCODE_FLOAT_TO_DOUBLE,
> VEC4_OPCODE_PICK_LOW_32BIT,
> VEC4_OPCODE_PICK_HIGH_32BIT,
> +   VEC4_OPCODE_SET_LOW_32BIT,
> +   VEC4_OPCODE_SET_HIGH_32BIT,
>  
> FS_OPCODE_DDX_COARSE,
> FS_OPCODE_DDX_FINE,
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index b2f3a56..153bd43 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -325,6 +325,10 @@ brw_instruction_name(const struct gen_device_info 
> *devinfo, enum opcode op)
>return "pick_low_32bit";
> case VEC4_OPCODE_PICK_HIGH_32BIT:
>return "pick_high_32bit";
> +   case VEC4_OPCODE_SET_LOW_32BIT:
> +  return "set_low_32bit";
> +   case VEC4_OPCODE_SET_HIGH_32BIT:
> +  return "set_high_32bit";
>  
> case FS_OPCODE_DDX_COARSE:
>return "ddx_coarse";
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 4fd04f1..06fa38f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -257,6 +257,8 @@ vec4_instruction::can_do_writemask(const struct 
> gen_device_info *devinfo)
> case VEC4_OPCODE_FLOAT_TO_DOUBLE:
> case VEC4_OPCODE_PICK_LOW_32BIT:
> case VEC4_OPCODE_PICK_HIGH_32BIT:
> +   case VEC4_OPCODE_SET_LOW_32BIT:
> +   case VEC4_OPCODE_SET_HIGH_32BIT:
> case VS_OPCODE_PULL_CONSTANT_LOAD:
> case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
> case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
> @@ -514,6 +516,8 @@ vec4_visitor::opt_reduce_swizzle()
>case VEC4_OPCODE_DOUBLE_TO_FLOAT:
>case VEC4_OPCODE_PICK_LOW_32BIT:
>case VEC4_OPCODE_PICK_HIGH_32BIT:
> +  case VEC4_OPCODE_SET_LOW_32BIT:
> +  case VEC4_OPCODE_SET_HIGH_32BIT:
>   swizzle = brw_swizzle_for_size(4);
>   break;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index b8778c4..120797b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1965,6 +1965,31 @@ generate_code(struct brw_codegen *p,
>   break;
>}
>  
> +  case VEC4_OPCODE_SET_LOW_32BIT:
> +  case VEC4_OPCODE_SET_HIGH_32BIT: {
> + /* Reads consecutive 32-bit elements from src[0] and writes
> +  * them to the low/high 32-bit of each 64-bit element in dst.
> +  */
> + assert(type_sz(src[0].type) == 4);
> + assert(type_sz(dst.type) == 8);
> +
> + brw_set_default_access_mode(p, BRW_ALIGN_1);
> +
> + dst = retype(dst, BRW_REGISTER_TYPE_UD);
> + if (inst->opcode == VEC4_OPCODE_SET_HIGH_32BIT)
> +dst = suboffset(dst, 1);
> + dst.hstride = BRW_HORIZONTAL_STRIDE_2;
> +
> + src[0] = retype(src[0], BRW_REGISTER_TYPE_UD);
> + src[0].vstride = BRW_VERTICAL_STRIDE_4;
> + src[0].width = BRW_WIDTH_4;
> + src[0].hstride = BRW_HORIZONTAL_STRIDE_1;
> + brw_MOV(p, dst, src[0]);
> +
> + brw_set_default_access_mode(p, BRW_ALIGN_16);
> + break;
> +  }
> +
>case VEC4_OPCODE_PACK_BYTES: {
>   /* Is effectively:
>*
> 



Re: [Mesa-dev] [PATCH v2 013/103] i965/vec4: set correct register regions for 32-bit and 64-bit

2016-10-18 Thread Ian Romanick
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> For 32-bit instructions we want to use <4,4,1> regions for VGRF
> sources so we should really set a width of 4 (we were setting 8).
> 
> For 64-bit instructions we want to use a width of 2 because the
> hardware uses 32-bit swizzles, meaning that we can only address 2
> consecutive 64-bit components in a row. Also, Curro suggested that
> the hardware is probably fixing the width to 2 for 64-bit instructions
> anyway, so just go with that and use <2,2,1>.
> 
> v2:
>  - No need to explicitly set the vertical stride of 64-bit regions to 2,
>brw_vecn_grf with a width of 2 will do that for us.
>  - No need to adjust the width of dst registers.
> 
> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 32c04b2..40f8702 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1873,20 +1873,24 @@ vec4_visitor::convert_to_hw_regs()
>   struct src_reg &src = inst->src[i];
>   struct brw_reg reg;
>   switch (src.file) {
> - case VGRF:
> -reg = byte_offset(brw_vec8_grf(src.nr, 0), src.offset);
> + case VGRF: {
> +unsigned type_size = type_sz(src.type);
> +unsigned width = REG_SIZE / 2 / MAX2(4, type_size);

constify these

> +reg = byte_offset(brw_vecn_grf(width, src.nr, 0), src.offset);
>  reg.type = src.type;
>  reg.swizzle = src.swizzle;
>  reg.abs = src.abs;
>  reg.negate = src.negate;
>  break;
> + }
>  
> - case UNIFORM:
> + case UNIFORM: {
> +unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));

constify this one too, and this patch is

Reviewed-by: Ian Romanick 

>  reg = stride(byte_offset(brw_vec4_grf(
>  
> prog_data->base.dispatch_grf_start_reg +
>  src.nr / 2, src.nr % 2 * 4),
>   src.offset),
> - 0, 4, 1);
> + 0, width, 1);
>  reg.type = src.type;
>  reg.swizzle = src.swizzle;
>  reg.abs = src.abs;
> @@ -1895,6 +1899,7 @@ vec4_visitor::convert_to_hw_regs()
>  /* This should have been moved to pull constants. */
>  assert(!src.reladdr);
>  break;
> + }
>  
>   case ARF:
>   case FIXED_GRF:
> 
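
As a quick check of the width arithmetic, here is a standalone sketch. It assumes REG_SIZE is 32 bytes (one GRF); the names `GRF_BYTES_SKETCH` and `region_width` are made up for this example, and the ternary stands in for MAX2.

```cpp
// Assumption: REG_SIZE in i965 is 32 bytes (one GRF register).
constexpr unsigned GRF_BYTES_SKETCH = 32;

// Mirrors REG_SIZE / 2 / MAX2(4, type_size) from the patch.
constexpr unsigned region_width(unsigned type_size)
{
   return GRF_BYTES_SKETCH / 2 / (type_size > 4 ? type_size : 4);
}

static_assert(region_width(4) == 4, "32-bit sources get a width of 4");
static_assert(region_width(8) == 2, "64-bit sources get a width of 2");
```

Under that assumption, 32-bit types get a width of 4 and 64-bit types get a width of 2, matching the <4,4,1> and <2,2,1> regions in the commit message.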



Re: [Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates

2016-10-18 Thread Ian Romanick
On 10/18/2016 05:26 PM, Matt Turner wrote:
> On Tue, Oct 18, 2016 at 5:20 PM, Ian Romanick  wrote:
>> On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++
>>>  1 file changed, 18 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 05e7f29..ce95c8d 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -352,8 +352,15 @@ vec4_visitor::get_indirect_offset(nir_intrinsic_instr 
>>> *instr)
>>>  void
>>>  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>>>  {
>>> -   dst_reg reg = dst_reg(VGRF, alloc.allocate(1));
>>> -   reg.type =  BRW_REGISTER_TYPE_D;
>>> +   dst_reg reg;
>>> +
>>> +   if (instr->def.bit_size == 64) {
>>> +  reg = dst_reg(VGRF, alloc.allocate(2));
>>> +  reg.type = BRW_REGISTER_TYPE_DF;
>>
>> For 32-bits we use an integer type (D).  Should we also use an integer
>> type (Q) here?  I'm worried that I'll have problems with this when I add
>> int64 support.
> 
> Q only exists on Broadwell and newer, so I don't think it's usable
> here (at least for HSW/IVB).

Ah yes.  Good call.

This patch is

Reviewed-by: Ian Romanick 




Re: [Mesa-dev] [PATCH v2 011/103] i965: fix subnr overflow in suboffset()

2016-10-18 Thread Ian Romanick
Reviewed-by: Ian Romanick 

In the interest of reducing the number of patches in flight, I think
this could land ahead of the others.

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_reg.h | 13 +
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_reg.h 
> b/src/mesa/drivers/dri/i965/brw_reg.h
> index 3b46d27..8907c9c 100644
> --- a/src/mesa/drivers/dri/i965/brw_reg.h
> +++ b/src/mesa/drivers/dri/i965/brw_reg.h
> @@ -520,14 +520,6 @@ sechalf(struct brw_reg reg)
>  }
>  
>  static inline struct brw_reg
> -suboffset(struct brw_reg reg, unsigned delta)
> -{
> -   reg.subnr += delta * type_sz(reg.type);
> -   return reg;
> -}
> -
> -
> -static inline struct brw_reg
>  offset(struct brw_reg reg, unsigned delta)
>  {
> reg.nr += delta;
> @@ -544,6 +536,11 @@ byte_offset(struct brw_reg reg, unsigned bytes)
> return reg;
>  }
>  
> +static inline struct brw_reg
> +suboffset(struct brw_reg reg, unsigned delta)
> +{
> +   return byte_offset(reg, delta * type_sz(reg.type));
> +}
>  
>  /** Construct unsigned word[16] register */
>  static inline struct brw_reg
> 
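
The overflow is easy to reproduce outside the driver. The sketch below assumes `subnr` is a 5-bit field and that `byte_offset()` carries whole registers into `nr`; neither detail is shown in the quoted diff, so treat both as assumptions. All names ending in `_sketch`, `_old` or `_new` are hypothetical.

```cpp
// Assumption: one register is 32 bytes and subnr is a 5-bit byte offset.
constexpr unsigned GRF_BYTES = 32;

struct reg_sketch {
   unsigned nr;          // register number
   unsigned subnr : 5;   // byte offset within the register (0..31)
};

// Old behavior: add straight into the narrow field, which wraps silently.
static reg_sketch suboffset_old(reg_sketch reg, unsigned delta,
                                unsigned type_size)
{
   reg.subnr += delta * type_size;
   return reg;
}

// Assumed shape of byte_offset(): carry whole registers into nr.
static reg_sketch byte_offset_sketch(reg_sketch reg, unsigned bytes)
{
   const unsigned total = reg.nr * GRF_BYTES + reg.subnr + bytes;
   reg.nr = total / GRF_BYTES;
   reg.subnr = total % GRF_BYTES;
   return reg;
}

// New behavior: delegate to the carrying helper.
static reg_sketch suboffset_new(reg_sketch reg, unsigned delta,
                                unsigned type_size)
{
   return byte_offset_sketch(reg, delta * type_size);
}
```

Starting at register 10 with a sub-offset of 16 bytes and advancing by five 4-byte elements, the old version stays in register 10 with the sub-offset wrapped to 4, while the new version moves to register 11 at sub-offset 4.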



Re: [Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates

2016-10-18 Thread Matt Turner
On Tue, Oct 18, 2016 at 5:20 PM, Ian Romanick  wrote:
> On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++
>>  1 file changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index 05e7f29..ce95c8d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -352,8 +352,15 @@ vec4_visitor::get_indirect_offset(nir_intrinsic_instr 
>> *instr)
>>  void
>>  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>>  {
>> -   dst_reg reg = dst_reg(VGRF, alloc.allocate(1));
>> -   reg.type =  BRW_REGISTER_TYPE_D;
>> +   dst_reg reg;
>> +
>> +   if (instr->def.bit_size == 64) {
>> +  reg = dst_reg(VGRF, alloc.allocate(2));
>> +  reg.type = BRW_REGISTER_TYPE_DF;
>
> For 32-bits we use an integer type (D).  Should we also use an integer
> type (Q) here?  I'm worried that I'll have problems with this when I add
> int64 support.

Q only exists on Broadwell and newer, so I don't think it's usable
here (at least for HSW/IVB).


Re: [Mesa-dev] [PATCH v2 010/103] i965/vec4: translate d2f/f2d

2016-10-18 Thread Ian Romanick
Reviewed-by: Ian Romanick 

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index ce95c8d..b75337c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -,6 +,30 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>inst = emit(MOV(dst, op[0]));
>break;
>  
> +   case nir_op_d2f: {
> +  dst_reg temp = dst_reg(this, glsl_type::dvec4_type);
> +  emit(MOV(temp, op[0]));
> +
> +  dst_reg temp2 = dst_reg(this, glsl_type::dvec4_type);
> +  temp2 = retype(temp2, BRW_REGISTER_TYPE_F);
> +  emit(VEC4_OPCODE_DOUBLE_TO_FLOAT, temp2, src_reg(temp))
> + ->size_written = 2 * REG_SIZE;
> +
> +  vec4_instruction *inst = emit(MOV(dst, src_reg(temp2)));
> +  inst->saturate = instr->dest.saturate;
> +  break;
> +   }
> +
> +   case nir_op_f2d: {
> +  dst_reg tmp_dst = dst_reg(src_reg(this, glsl_type::dvec4_type));
> +  src_reg tmp_src = src_reg(this, glsl_type::vec4_type);
> +  emit(MOV(dst_reg(tmp_src), retype(op[0], BRW_REGISTER_TYPE_F)));
> +  emit(VEC4_OPCODE_FLOAT_TO_DOUBLE, tmp_dst, tmp_src);
> +  vec4_instruction *inst = emit(MOV(dst, src_reg(tmp_dst)));
> +  inst->saturate = instr->dest.saturate;
> +  break;
> +   }
> +
> case nir_op_fadd:
>/* fall through */
> case nir_op_iadd:
> 



Re: [Mesa-dev] [PATCH v2 009/103] i965/vec4: add double/float conversion pseudo-opcodes

2016-10-18 Thread Ian Romanick
Based on my (fairly weak) understanding of vstrides, this patch is

Reviewed-by: Ian Romanick 

On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> These need to be emitted as align1 MOV's, since they need to have a
> stride of 2 on the float register (whether src or dest) so that data
> from another thread doesn't cross the middle of a SIMD8 register.
> 
> v2 (Iago):
> - The float-to-double needs to align 32-bit data to 64-bit before doing the
> conversion. This was doable in align16 when we tried to use an execsize
> of 4, but with an execsize of 8 we would need another align1 opcode to do
> that (since we need data to cross the middle of a SIMD register). Just
> making the opcode handle this internally seems more practical than adding
> another opcode just for this purpose and having the caller know about this
> before converting.
> - The double-to-float conversion produces 32-bit elements aligned to 64-bit
> so we make the opcode re-pack the result to 32-bit and fit in one register,
> as expected by SIMD4x2 operation. This still requires that callers reserve
> two registers for the float data destination because we need to produce
> 64-bit aligned data first, and repack it later on the same destination
> register, but it saves the need for a re-pack opcode only to achieve this
> making the operation complete in a single opcode. Hopefully that is worth
> the weirdness of the double register allocation...
> 
> Signed-off-by: Connor Abbott 
> Signed-off-by: Iago Toral Quiroga 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  |  2 ++
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 +++
>  src/mesa/drivers/dri/i965/brw_vec4.cpp   |  8 +
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 44 
> 
>  4 files changed, 58 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index c4e0f27..79b96a4 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1098,6 +1098,8 @@ enum opcode {
> VEC4_OPCODE_MOV_BYTES,
> VEC4_OPCODE_PACK_BYTES,
> VEC4_OPCODE_UNPACK_UNIFORM,
> +   VEC4_OPCODE_DOUBLE_TO_FLOAT,
> +   VEC4_OPCODE_FLOAT_TO_DOUBLE,
>  
> FS_OPCODE_DDX_COARSE,
> FS_OPCODE_DDX_FINE,
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index ed81563..b063f77 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -317,6 +317,10 @@ brw_instruction_name(const struct gen_device_info 
> *devinfo, enum opcode op)
>return "pack_bytes";
> case VEC4_OPCODE_UNPACK_UNIFORM:
>return "unpack_uniform";
> +   case VEC4_OPCODE_DOUBLE_TO_FLOAT:
> +  return "double_to_float";
> +   case VEC4_OPCODE_FLOAT_TO_DOUBLE:
> +  return "float_to_double";
>  
> case FS_OPCODE_DDX_COARSE:
>return "ddx_coarse";
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index c29cfb5..32c04b2 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -253,6 +253,8 @@ vec4_instruction::can_do_writemask(const struct 
> gen_device_info *devinfo)
>  {
> switch (opcode) {
> case SHADER_OPCODE_GEN4_SCRATCH_READ:
> +   case VEC4_OPCODE_DOUBLE_TO_FLOAT:
> +   case VEC4_OPCODE_FLOAT_TO_DOUBLE:
> case VS_OPCODE_PULL_CONSTANT_LOAD:
> case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
> case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
> @@ -505,6 +507,12 @@ vec4_visitor::opt_reduce_swizzle()
>case BRW_OPCODE_DP2:
>   swizzle = brw_swizzle_for_size(2);
>   break;
> +
> +  case VEC4_OPCODE_FLOAT_TO_DOUBLE:
> +  case VEC4_OPCODE_DOUBLE_TO_FLOAT:
> + swizzle = brw_swizzle_for_size(4);
> + break;
> +
>default:
>   swizzle = brw_swizzle_for_mask(inst->dst.writemask);
>   break;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 163cf9d..6f4c438 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1896,6 +1896,50 @@ generate_code(struct brw_codegen *p,
>   break;
>}
>  
> +  case VEC4_OPCODE_DOUBLE_TO_FLOAT: {
> + assert(src[0].type == BRW_REGISTER_TYPE_DF);
> + assert(dst.type == BRW_REGISTER_TYPE_F);
> +
> + brw_set_default_access_mode(p, BRW_ALIGN_1);
> +
> + dst.hstride = BRW_HORIZONTAL_STRIDE_2;
> + dst.width = BRW_WIDTH_4;
> + src[0].vstride = BRW_VERTICAL_STRIDE_4;
> + src[0].width = BRW_WIDTH_4;
> + brw_MOV(p, dst, src[0]);
> +
> + struct brw_reg dst_as_src = dst;
> + dst.hstride = BRW_HORIZONTAL_STRIDE_1;
> + dst.width = BRW_WIDTH_8;
> +  

Re: [Mesa-dev] Mesa (master): glsl: Immediately inline built-ins rather than generating calls.

2016-10-18 Thread Brian Paul

On 10/18/2016 05:50 PM, Kenneth Graunke wrote:

On Tuesday, October 18, 2016 4:38:17 PM PDT Brian Paul wrote:

Hi Ken,

I found that this patch causes a regression.  There's a Windows medical
app which fails to link some shaders since this change.

Basically, when the gl_Position VS input is declared as invariant the
linker fails with:

error: declarations for uniform `gl_ModelViewProjectionMatrix' have
mismatching invariant qualifiers

I haven't investigated how to fix this.  I'm hoping you can see a simple
fix.

The attached piglit shader_runner script demonstrates the issue.  Passes
w/ NVIDIA.

Thanks!

-Brian


Oh, sorry about that!  Here are two possible fixes:

https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix
https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix-2

They're both kind of hacks...but the whole invariant propagation pass
is kind of a hack, and we've got some other hacks in place already.
So...maybe best to pile another one on.  Not sure which though.

Maybe Jason or Curro will have an opinion...


Thanks!  Either patch is OK with me (though, I'd suggest putting a 
comment on the first one to explain what's happening).


It'd be great if we can commit one or the other in the next day or so.

-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates

2016-10-18 Thread Ian Romanick
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 05e7f29..ce95c8d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -352,8 +352,15 @@ vec4_visitor::get_indirect_offset(nir_intrinsic_instr 
> *instr)
>  void
>  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>  {
> -   dst_reg reg = dst_reg(VGRF, alloc.allocate(1));
> -   reg.type =  BRW_REGISTER_TYPE_D;
> +   dst_reg reg;
> +
> +   if (instr->def.bit_size == 64) {
> +  reg = dst_reg(VGRF, alloc.allocate(2));
> +  reg.type = BRW_REGISTER_TYPE_DF;

For 32-bits we use an integer type (D).  Should we also use an integer
type (Q) here?  I'm worried that I'll have problems with this when I add
int64 support.

> +   } else {
> +  reg = dst_reg(VGRF, alloc.allocate(1));
> +  reg.type = BRW_REGISTER_TYPE_D;
> +   }
>  
> unsigned remaining = brw_writemask_for_size(instr->def.num_components);
>  
> @@ -368,13 +375,20 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr 
> *instr)
>   continue;
>  
>for (unsigned j = i; j < instr->def.num_components; j++) {
> - if (instr->value.u32[i] == instr->value.u32[j]) {
> + if ((instr->def.bit_size == 32 &&
> +  instr->value.u32[i] == instr->value.u32[j]) ||
> + (instr->def.bit_size == 64 &&
> +  instr->value.f64[i] == instr->value.f64[j])) {
>  writemask |= 1 << j;
>   }
>}
>  
>reg.writemask = writemask;
> -  emit(MOV(reg, brw_imm_d(instr->value.i32[i])));
> +  if (instr->def.bit_size == 64) {
> + emit(MOV(reg, brw_imm_df(instr->value.f64[i])));
> +  } else {
> + emit(MOV(reg, brw_imm_d(instr->value.i32[i])));
> +  }
>  
>remaining &= ~writemask;
> }
> 
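
A side note on the channel-merging test in the hunk above, related to the
integer-versus-float question: comparing value.f64 entries with == is not
the same as comparing their 64-bit patterns. The sketch below is not Mesa
code; the helper names are made up for illustration, and it assumes an
IEEE 754 double.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compare two doubles the way `a == b` does. */
static bool same_by_value(double a, double b)
{
   return a == b;
}

/* Hypothetical helper: compare the raw 64-bit patterns instead,
 * which is what an integer (Q/UQ-style) comparison would see. */
static bool same_by_bits(double a, double b)
{
   uint64_t ua, ub;

   memcpy(&ua, &a, sizeof(ua));
   memcpy(&ub, &b, sizeof(ub));
   return ua == ub;
}
```

With these helpers, +0.0 and -0.0 are equal by value but differ by bits, so
a value-based comparison would appear to fold both channels into one
immediate MOV. Whether that matters for the patch is for the author to say.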



Re: [Mesa-dev] [PATCH v2 004/103] i965/vec4/nir: Add bit-size information to types

2016-10-18 Thread Ian Romanick
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> Reviewed-by: Francisco Jerez 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index af76730..5048c4e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -325,7 +325,7 @@ src_reg
>  vec4_visitor::get_nir_src(const nir_src &src, unsigned num_components)
>  {
> /* if type is not specified, default to signed int */
> -   return get_nir_src(src, nir_type_int, num_components);
> +   return get_nir_src(src, nir_type_int32, num_components);
>  }
>  
>  src_reg
> @@ -747,7 +747,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>const nir_intrinsic_info *info = 
> &nir_intrinsic_infos[instr->intrinsic];
>  
>/* Get the arguments of the atomic intrinsic. */
> -  src_reg offset = get_nir_src(instr->src[0], nir_type_int,
> +  src_reg offset = get_nir_src(instr->src[0], nir_type_int32,
> instr->num_components);
>const src_reg surface = brw_imm_ud(surf_index);
>const src_reg src0 = (info->num_srcs >= 2
> @@ -793,7 +793,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>* from any live channel.
>*/
>   surf_index = src_reg(this, glsl_type::uint_type);
> - emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[0], 
> nir_type_int,
> + emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[0], 
> nir_type_int32,
> instr->num_components),
>brw_imm_ud(prog_data->base.binding_table.ubo_start)));
>   surf_index = emit_uniformize(surf_index);
> @@ -811,7 +811,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>if (const_offset) {
>   offset = brw_imm_ud(const_offset->u32[0] & ~15);
>} else {
> - offset = get_nir_src(instr->src[1], nir_type_int, 1);
> + offset = get_nir_src(instr->src[1], nir_type_uint32, 1);

Does it matter that this changed from int to uint32?

>}
>  
>src_reg packed_consts = src_reg(this, glsl_type::vec4_type);
> 



Re: [Mesa-dev] [PATCH v2 003/103] i965/vec4/nir: allocate two registers for dvec3/dvec4

2016-10-18 Thread Ian Romanick
On 10/11/2016 02:01 AM, Iago Toral Quiroga wrote:
> From: Connor Abbott 
> 
> v2 (Curro):
>   - Do not special-case for a bit-size of 64, divide the bit_size by 32
> instead.
>   - Use DIV_ROUND_UP so we can handle sub-32-bit types.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index ddeff2d..af76730 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -140,8 +140,8 @@ vec4_visitor::nir_emit_impl(nir_function_impl *impl)
> foreach_list_typed(nir_register, reg, node, &impl->registers) {
>unsigned array_elems =
>   reg->num_array_elems == 0 ? 1 : reg->num_array_elems;
> -
> -  nir_locals[reg->index] = dst_reg(VGRF, alloc.allocate(array_elems));
> +  unsigned num_regs = array_elems * DIV_ROUND_UP(reg->bit_size, 32);

constify, and this patch is

Reviewed-by: Ian Romanick 

> +  nir_locals[reg->index] = dst_reg(VGRF, alloc.allocate(num_regs));
> }
>  
> nir_ssa_values = ralloc_array(mem_ctx, dst_reg, impl->ssa_alloc);
> @@ -270,7 +270,8 @@ dst_reg
>  vec4_visitor::get_nir_dest(const nir_dest &dest)
>  {
> if (dest.is_ssa) {
> -  dst_reg dst = dst_reg(VGRF, alloc.allocate(1));
> +  dst_reg dst =
> + dst_reg(VGRF, alloc.allocate(DIV_ROUND_UP(dest.ssa.bit_size, 32)));
>nir_ssa_values[dest.ssa.index] = dst;
>return dst;
> } else {
> 
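
For reference, the register-count arithmetic from the v2 notes can be
sketched on its own. This is only an illustration: DIV_ROUND_UP is
redefined locally with the usual round-up-division idiom, and num_vgrfs is
a hypothetical helper name, not something in the patch.

```c
#include <assert.h>

/* Usual round-up division idiom, redefined so the sketch is self-contained. */
#define DIV_ROUND_UP(A, B) (((A) + (B) - 1) / (B))

/* Hypothetical helper: 32-bit-wide registers needed for array_elems
 * elements of the given bit size. */
static unsigned num_vgrfs(unsigned array_elems, unsigned bit_size)
{
   assert(bit_size > 0);
   return array_elems * DIV_ROUND_UP(bit_size, 32);
}
```

Rounding up is what keeps a 16-bit type at one register instead of zero,
while a 64-bit type gets two without any special case.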



Re: [Mesa-dev] [PATCH v2 103/103] i965/gen7: expose OpenGL 4.0 on Haswell

2016-10-18 Thread Ian Romanick
On 10/11/2016 02:02 AM, Iago Toral Quiroga wrote:
> ARB_gpu_shader_fp64 was the last piece missing. Notice that some
> hardware and kernel combinations do not support pipelined register
> writes, which are required for some OpenGL 4.0 features, in which
> case the driver won't expose 4.0.
> ---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++
>  src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 0491145..a291cd5 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -272,6 +272,8 @@ intelInitExtensions(struct gl_context *ctx)
>  
> if (brw->gen >= 8)
>ctx->Const.GLSLVersion = 440;
> +   else if (brw->is_haswell)
> +  ctx->Const.GLSLVersion = 400;
> else if (brw->gen >= 6)
>ctx->Const.GLSLVersion = 330;
> else
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index 9b23bac..1af7fe6 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1445,7 +1445,7 @@ set_max_gl_versions(struct intel_screen *screen)
>dri_screen->max_gl_es2_version = has_astc ? 32 : 31;
>break;
> case 7:
> -  dri_screen->max_gl_core_version = 33;
> +  dri_screen->max_gl_core_version = screen->devinfo.is_haswell ? 40 : 33;

I *think* this needs to take the pipelined register writes into
consideration.  My understanding is if you say 40 here, then
glXCreateContextAttribs will allow creation of an OpenGL 4.0 context...
but the context may only be 3.3.

>dri_screen->max_gl_compat_version = 30;
>dri_screen->max_gl_es1_version = 11;
>dri_screen->max_gl_es2_version = screen->devinfo.is_haswell ? 31 : 30;
> 
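
One way to read the concern above is as a sketch in which the advertised
maximum depends on both conditions. The struct and its field names below
are hypothetical stand-ins, not the real intel_screen or devinfo fields.

```c
#include <stdbool.h>

/* Hypothetical inputs for the sketch; not the real driver structures. */
struct devinfo_sketch {
   int gen;
   bool is_haswell;
   bool has_pipelined_register_writes;
};

/* Only advertise 4.0 when the context can actually be 4.0. */
static int max_gl_core_version(const struct devinfo_sketch *d)
{
   if (d->gen == 7) {
      if (d->is_haswell && d->has_pipelined_register_writes)
         return 40;
      return 33;
   }
   return 0; /* other generations are outside this sketch */
}
```

Under that assumption, a Haswell system without pipelined register writes
would keep reporting 33, so context creation could not be offered a version
the context cannot deliver.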



Re: [Mesa-dev] [PATCH 20/22] anv: move to using shared wsi code

2016-10-18 Thread Dave Airlie
On 19 October 2016 at 04:26, Emil Velikov  wrote:
> Hi Dave,
>
> Thanks for doing this. It'll be great to get an Ack from the Intel
> devs, on the idea.
>
> Afaics with 22/22 in place you can drop the vk_alloc2/vk_free2
> functions since they are no longer used.

No they are still used in the anv/radv code, just not in the wsi code.

>> src/mesa/main/tests/Makefile
>> src/util/Makefile
>> src/util/tests/hash_table/Makefile
>> -   src/vulkan/Makefile])
>> +   src/vulkan/Makefile
>> +   src/vulkan/wsi/Makefile])
>>
> Just fold the new Makefile into the existing one ? It should be as
> simple as adding wsi/ prefix to files.
> Alternatively we can do that as a follow-up.

Actually we ended up not needing src/vulkan, so this ends up at one line now.

Dave.


[Mesa-dev] [PATCH 2/2] vulkan/wsi: move some things into wsi_device.

2016-10-18 Thread Dave Airlie
From: Dave Airlie 

This copies the allocator callbacks, along with normal
callbacks and physical device into the wsi device.

I'm a bit 50/50 on whether this makes things cleaner so far
---
 src/amd/vulkan/radv_wsi.c   | 17 +
 src/amd/vulkan/radv_wsi_x11.c   |  2 --
 src/intel/vulkan/anv_wsi.c  | 17 +
 src/intel/vulkan/anv_wsi_x11.c  |  2 --
 src/vulkan/wsi/wsi_common.h | 28 +---
 src/vulkan/wsi/wsi_common_wayland.c | 32 +---
 src/vulkan/wsi/wsi_common_x11.c | 23 +--
 src/vulkan/wsi/wsi_common_x11.h |  1 -
 8 files changed, 53 insertions(+), 69 deletions(-)

diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index 3c3abe9..56eacc5 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -37,19 +37,21 @@ radv_init_wsi(struct radv_physical_device *physical_device)
 
memset(physical_device->wsi_device.wsi, 0, 
sizeof(physical_device->wsi_device.wsi));
 
+physical_device->wsi_device.alloc = physical_device->instance->alloc;
+physical_device->wsi_device.physical_device = 
anv_physical_device_to_handle(physical_device);
+physical_device->wsi_device.cbs = &wsi_cbs;
+
 #ifdef VK_USE_PLATFORM_XCB_KHR
-   result = wsi_x11_init_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc);
+   result = wsi_x11_init_wsi(&physical_device->wsi_device);
if (result != VK_SUCCESS)
return result;
 #endif
 
 #ifdef VK_USE_PLATFORM_WAYLAND_KHR
-   result = wsi_wl_init_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc,
-
radv_physical_device_to_handle(physical_device),
-&wsi_cbs);
+   result = wsi_wl_init_wsi(&physical_device->wsi_device);
if (result != VK_SUCCESS) {
 #ifdef VK_USE_PLATFORM_XCB_KHR
-   wsi_x11_finish_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc);
+wsi_x11_finish_wsi(&physical_device->wsi_device);
 #endif
return result;
}
@@ -62,10 +64,10 @@ void
 radv_finish_wsi(struct radv_physical_device *physical_device)
 {
 #ifdef VK_USE_PLATFORM_WAYLAND_KHR
-   wsi_wl_finish_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc);
+   wsi_wl_finish_wsi(&physical_device->wsi_device);
 #endif
 #ifdef VK_USE_PLATFORM_XCB_KHR
-   wsi_x11_finish_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc);
+   wsi_x11_finish_wsi(&physical_device->wsi_device);
 #endif
 }
 
@@ -91,7 +93,6 @@ VkResult radv_GetPhysicalDeviceSurfaceSupportKHR(
struct wsi_interface *iface = device->wsi_device.wsi[surface->platform];
 
   return iface->get_support(surface, &device->wsi_device,
- &device->instance->alloc,
  queueFamilyIndex, pSupported);
 }
 
diff --git a/src/amd/vulkan/radv_wsi_x11.c b/src/amd/vulkan/radv_wsi_x11.c
index 946b990..66c9bbb 100644
--- a/src/amd/vulkan/radv_wsi_x11.c
+++ b/src/amd/vulkan/radv_wsi_x11.c
@@ -44,7 +44,6 @@ VkBool32 radv_GetPhysicalDeviceXcbPresentationSupportKHR(
 
return wsi_get_physical_device_xcb_presentation_support(
   >wsi_device,
-  >instance->alloc,
   queueFamilyIndex, connection, visual_id);
 }
 
@@ -58,7 +57,6 @@ VkBool32 radv_GetPhysicalDeviceXlibPresentationSupportKHR(
 
return wsi_get_physical_device_xcb_presentation_support(
   >wsi_device,
-  >instance->alloc,
   queueFamilyIndex, XGetXCBConnection(dpy), visualID);
 }
 
diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index f816735..3520300 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -36,19 +36,21 @@ anv_init_wsi(struct anv_physical_device *physical_device)
 
memset(physical_device->wsi_device.wsi, 0, 
sizeof(physical_device->wsi_device.wsi));
 
+   physical_device->wsi_device.alloc = physical_device->instance->alloc;
+   physical_device->wsi_device.physical_device = 
anv_physical_device_to_handle(physical_device);
+   physical_device->wsi_device.cbs = &wsi_cbs;
+
 #ifdef VK_USE_PLATFORM_XCB_KHR
-   result = wsi_x11_init_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc);
+   result = wsi_x11_init_wsi(&physical_device->wsi_device);
if (result != VK_SUCCESS)
   return result;
 #endif
 
 #ifdef VK_USE_PLATFORM_WAYLAND_KHR
-   result = wsi_wl_init_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc,
-anv_physical_device_to_handle(physical_device),
-&wsi_cbs);
+   result = wsi_wl_init_wsi(&physical_device->wsi_device);
if (result != VK_SUCCESS) {
 #ifdef VK_USE_PLATFORM_XCB_KHR
-  wsi_x11_finish_wsi(&physical_device->wsi_device, 
&physical_device->instance->alloc);
+  wsi_x11_finish_wsi(&physical_device->wsi_device);
 #endif
   return result;
}
@@ -61,10 +63,10 @@ void
 anv_finish_wsi(struct anv_physical_device *physical_device)
 {
 #ifdef VK_USE_PLATFORM_WAYLAND_KHR
-   wsi_wl_finish_wsi(&physical_device->wsi_device, 

[Mesa-dev] [PATCH 1/2] vulkan/wsi: use swapchain->alloc for destructors.

2016-10-18 Thread Dave Airlie
From: Dave Airlie 

As Jason pointed out the app has to pass in the same thing,
so just destroy using the one we copied earlier.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_wsi.c   |  2 +-
 src/intel/vulkan/anv_wsi.c  |  8 +---
 src/vulkan/wsi/wsi_common.h |  4 ++--
 src/vulkan/wsi/wsi_common_wayland.c | 10 +-
 src/vulkan/wsi/wsi_common_x11.c |  7 +++
 5 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index ba5c37b..3c3abe9 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -291,7 +291,7 @@ void radv_DestroySwapchainKHR(
radv_DestroyFence(device, swapchain->fences[i], 
pAllocator);
}
 
-   swapchain->destroy(swapchain, pAllocator);
+   swapchain->destroy(swapchain);
 }
 
 VkResult radv_GetSwapchainImagesKHR(
diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index 064581d..f816735 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -290,20 +290,14 @@ void anv_DestroySwapchainKHR(
 VkSwapchainKHR   _swapchain,
 const VkAllocationCallbacks* pAllocator)
 {
-   ANV_FROM_HANDLE(anv_device, device, _device);
ANV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain);
-   const VkAllocationCallbacks *alloc;
 
-   if (pAllocator)
- alloc = pAllocator;
-   else
- alloc = &device->alloc;
for (unsigned i = 0; i < ARRAY_SIZE(swapchain->fences); i++) {
   if (swapchain->fences[i] != VK_NULL_HANDLE)
  anv_DestroyFence(_device, swapchain->fences[i], pAllocator);
}
 
-   swapchain->destroy(swapchain, alloc);
+   swapchain->destroy(swapchain);
 }
 
 VkResult anv_GetSwapchainImagesKHR(
diff --git a/src/vulkan/wsi/wsi_common.h b/src/vulkan/wsi/wsi_common.h
index ee67511..1f4e0ae 100644
--- a/src/vulkan/wsi/wsi_common.h
+++ b/src/vulkan/wsi/wsi_common.h
@@ -54,8 +54,8 @@ struct wsi_swapchain {
const struct wsi_image_fns *image_fns;
VkFence fences[3];
 
-   VkResult (*destroy)(struct wsi_swapchain *swapchain,
-   const VkAllocationCallbacks *pAllocator);
+   VkResult (*destroy)(struct wsi_swapchain *swapchain);
+
VkResult (*get_images)(struct wsi_swapchain *swapchain,
   uint32_t *pCount, VkImage *pSwapchainImages);
VkResult (*acquire_next_image)(struct wsi_swapchain *swap_chain,
diff --git a/src/vulkan/wsi/wsi_common_wayland.c 
b/src/vulkan/wsi/wsi_common_wayland.c
index 32a0a51..ecb1ab5 100644
--- a/src/vulkan/wsi/wsi_common_wayland.c
+++ b/src/vulkan/wsi/wsi_common_wayland.c
@@ -647,19 +647,19 @@ wsi_wl_image_init(struct wsi_wl_swapchain *chain,
 }
 
 static VkResult
-wsi_wl_swapchain_destroy(struct wsi_swapchain *wsi_chain,
- const VkAllocationCallbacks *pAllocator)
+wsi_wl_swapchain_destroy(struct wsi_swapchain *wsi_chain)
 {
struct wsi_wl_swapchain *chain = (struct wsi_wl_swapchain *)wsi_chain;
 
for (uint32_t i = 0; i < chain->image_count; i++) {
   if (chain->images[i].buffer)
- chain->base.image_fns->free_wsi_image(chain->base.device, pAllocator,
+ chain->base.image_fns->free_wsi_image(chain->base.device,
+   &chain->base.alloc,
chain->images[i].image,
chain->images[i].memory);
}
 
-   vk_free(pAllocator, chain);
+   vk_free(&chain->base.alloc, chain);
 
return VK_SUCCESS;
 }
@@ -747,7 +747,7 @@ wsi_wl_surface_create_swapchain(VkIcdSurfaceBase 
*icd_surface,
return VK_SUCCESS;
 
 fail:
-   wsi_wl_swapchain_destroy(&chain->base, pAllocator);
+   wsi_wl_swapchain_destroy(&chain->base);
 
return result;
 }
diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 241ef42..3bb8f35 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -706,16 +706,15 @@ x11_image_finish(struct x11_swapchain *chain,
 }
 
 static VkResult
-x11_swapchain_destroy(struct wsi_swapchain *anv_chain,
-  const VkAllocationCallbacks *pAllocator)
+x11_swapchain_destroy(struct wsi_swapchain *anv_chain)
 {
struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain;
for (uint32_t i = 0; i < chain->image_count; i++)
-  x11_image_finish(chain, pAllocator, &chain->images[i]);
+  x11_image_finish(chain, &chain->base.alloc, &chain->images[i]);
 
xcb_unregister_for_special_event(chain->conn, chain->special_event);
 
-   vk_free(pAllocator, chain);
+   vk_free(&chain->base.alloc, chain);
 
return VK_SUCCESS;
 }
-- 
2.5.5
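
The idea behind this change, copying the allocator at creation time and
using that copy on destroy, can be sketched without any Vulkan headers.
Everything below is a stand-in written for illustration: alloc_cbs is a
simplified substitute for VkAllocationCallbacks, and the function names are
hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for VkAllocationCallbacks. */
struct alloc_cbs {
   void *(*alloc)(void *user, size_t size);
   void (*release)(void *user, void *ptr);
   void *user;
};

struct swapchain {
   struct alloc_cbs alloc;                    /* copied at creation time */
   void (*destroy)(struct swapchain *chain);  /* no allocator argument */
};

static void swapchain_destroy(struct swapchain *chain)
{
   /* Copy the callbacks first: they live inside the object being freed. */
   struct alloc_cbs cbs = chain->alloc;

   cbs.release(cbs.user, chain);
}

static struct swapchain *swapchain_create(const struct alloc_cbs *cbs)
{
   struct swapchain *chain = cbs->alloc(cbs->user, sizeof(*chain));

   if (!chain)
      return NULL;
   chain->alloc = *cbs;
   chain->destroy = swapchain_destroy;
   return chain;
}

/* Counting allocator used only to show that alloc and release pair up. */
static void *counting_alloc(void *user, size_t size)
{
   ++*(int *)user;
   return malloc(size);
}

static void counting_release(void *user, void *ptr)
{
   --*(int *)user;
   free(ptr);
}
```

Because the app has to pass the same allocator to destroy as to create, the
stored copy is equivalent and the destroy callback no longer needs the
extra parameter.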



[Mesa-dev] [rfc] wsi device cleanups.

2016-10-18 Thread Dave Airlie
Jason, these should address the comments you made, I'm not sure
these are a win over what was there, but I gave it a go.

If you like them I've no objections.

Dave.



Re: [Mesa-dev] [PATCH] egl/dri2: add a libname to dlopen for OpenBSD

2016-10-18 Thread Jonathan Gray
On Tue, Oct 18, 2016 at 04:24:20PM +0100, Emil Velikov wrote:
> On 18 October 2016 at 00:58, Jonathan Gray  wrote:
> > On Mon, Oct 17, 2016 at 05:34:02PM +0100, Emil Velikov wrote:
> >> On 17 October 2016 at 16:39, Eric Engestrom  
> >> wrote:
> >> > On Monday, 2016-10-17 22:53:20 +1100, Jonathan Gray wrote:
> >> >> On Mon, Oct 17, 2016 at 12:39:11PM +0100, Emil Velikov wrote:
> >> >> > On 17 October 2016 at 10:53, Eric Engestrom 
> >> >> >  wrote:
> >> >> > > On Sunday, 2016-10-16 16:38:35 +1100, Jonathan Gray wrote:
> >> >> > >> On OpenBSD try to dlopen 'libglapi.so', ld.so will find
> >> >> > >> the highest major/minor version and open it in this case.
> >> >> > >>
> >> >> > >> Avoids '#error Unknown glapi provider for this platform' at build 
> >> >> > >> time.
> >> >> > >>
> >> >> > >> Signed-off-by: Jonathan Gray 
> >> >> > >
> >> >> > > LGTM, and I guess the other *BSD will want the same since 7a9c92d0 
> >> >> > > broke
> >> >> > > them too.
> >> >> > >
> >> >> > I'm not 100% sure about that. OpenBSD (unlike other BSD) did bump the
> >> >> > major when the ABI breaks due to 'internal' changes - think of
> >> >> > off_t/time_t on 32 vs 64bit systems and alike.
> >> >> >
> >> >> > Unlike Linux kernel/distros, BSDs tend to be more relaxed when it
> >> >> > comes to ABI, I believe. Don't quote me on that one ;-)
> >> >>
> >> >> OpenBSD tends to favour simplified interfaces over backwards 
> >> >> compatibility
> >> >> and is more like a research system in that respect.  As the kernel
> >> >> and userland are one source tree ioctl compat largely doesn't exist.
> >> >> System calls get deprecated and removed over the course of a few 
> >> >> releases.
> >> >> So we didn't go through the pain of duplicated systems calls for off_t
> >> >> as mentioned, and don't go in for symbol versioning.  Just major.minor
> >> >> library versioning, which is roughly symbol removals, major crank,
> >> >> symbol additions minor crank.
> >> >>
> >> >> I believe FreeBSD tends to go in for backwards compatibility more
> >> >> but am not familiar with the details.  They also have a different ld.so.
> >> >>
> >> >> Perhaps an else case for 'libglapi.so.0' would be appropriate for all
> >> >> the other various unices instead of the #error ?
> >> >
> >> > Yeah actually, I'm thinking reverting this hunk of 7a9c92d0 might be a 
> >> > better,
> >> > to avoid the potentially huge list of every *BSD and other Unix:
> >> >
> >> Fwiw I've intentionally added the hunk since I was a bit lazy to check
> >> if the BSD(s?)/Solaris/others have bumped the major locally. Having a
> >> closer look that's not the case, so indeed we can add revert to
> >> libglapi.so.0 in the else statement.
> >>
> >> Jonathan, how about we go with the above instead?
> >
> > At the moment OpenBSD has libglapi.so.0.2 for Mesa 11.2.2.
> > New versions of Mesa add new shared_dispatch_stub_* symbols,
> > which the minor would crank for.
> >
> Don't think we [intentionally] added any symbols for a long while.

Comparing 11.2.2 libglapi and the latest Mesa I see:

Dynamic export changes:
added:
shared_dispatch_stub_1323
shared_dispatch_stub_1324
shared_dispatch_stub_1325
shared_dispatch_stub_1326
shared_dispatch_stub_1327
shared_dispatch_stub_1328
shared_dispatch_stub_1329

Perhaps this is unique to the non-tls dispatch case though.

> 
> > I'd prefer the diff I mailed for OpenBSD for if the major version
> > should crank for some reason.
> Let's worry about that if/when it happens ?

sure

> 
> Emil
> /me lands the rest of the patches

thanks
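
For anyone following along, the outcome discussed in this thread can be
sketched as a compile-time choice of library name with a generic fallback
instead of an #error. The function name is made up for illustration and
this is not the code that landed.

```c
/* Hypothetical helper: name to hand to dlopen() for the glapi library. */
static const char *glapi_libname(void)
{
#if defined(__OpenBSD__)
   /* Unversioned: ld.so picks the highest major.minor available. */
   return "libglapi.so";
#else
   /* Fallback for platforms that have not bumped the major locally. */
   return "libglapi.so.0";
#endif
}
```

The returned string would then be passed to dlopen() by the caller.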


Re: [Mesa-dev] [PATCH] st/va: disable cabac for h264 baseline profile

2016-10-18 Thread Andy Furniss

Christian König wrote:

Am 18.10.2016 um 15:42 schrieb Andy Furniss:

Andy Furniss wrote:

Christian König wrote:

Am 18.10.2016 um 11:19 schrieb Andy Furniss:

boyuan.zh...@amd.com wrote:

From: Boyuan Zhang 

cabac is only supported in the h264 main and higher
profiles


So shouldn't there be code that allows it if user space
doesn't set baseline?

I don't know how in gstreamer as it seems to try to use
b-frames if you use other than baseline which doesn't work.

With avconv it is possible to call main/high and set b-frames
to 0.

I know it's technically correct spec wise, but seems a shame
as it costs a fair bit in "free" efficiency.

On Windows the raptor game recording app produces files
flagged as high with cabac - but without b-frames.


The problem is that it can easily break decoders. CABAC is
simply not allowed in a stream flagged as baseline compliant.


But with ffmpeg/avconv I can make a stream flagged as main/high
even if it's really baseline + CABAC. I guess Windows may vary
but the test I did seems to take this pragmatic approach, as it
seems do other h/w encoders eg. smartphone output.


It's a pity that we don't support B-frames any more.


Anymore? Now I am curious, seems to work with omx (cqp single
instance)



With that in place we could easily advertise support for
mainline profile.


MBAFF/PAFF?


Sorry if that came over as being pedantic, silly as I think
pragmatism is the way to go and I know intel advertise main/high,
but doubt they do interlaced.


Exactly, I mean we are talking about features to support encoding
into interlaced format. Is anybody still actively doing that?


Well, broadcasters, but I guess "users" never did anyway.


But even then, it's not so much of a problem advertising mainline
profile and then not using MBAFF/PAFF.

But when you advertise B-frames and then can't encode it you got a
serious problem because your frames are not in the right order any
more :)


Yea, it's a shame - was there a reason that b-frame support was abandoned?


In fact vce vaapi is currently advertising them as well (I did
mention it in some thread). Good for letting ffmpeg flag as such
while not using b-frames, not so good for gstreamer as they have
changed the default to high so old command lines will not
explicitly fail, but will produce junk.

I see va.h has a cabac switch and gstreamer exposes it - though
it's not read by the driver. Maybe if that were hooked up then
users could turn it on and profit :-).


Yeah, but again turning it on while the SPS/PPS only advertise the
stream to be baseline compliant is a clear violation of the codec
standard. (Is that actually encodeable in the stream? or does the
encoder switch to some higher level automatically if you use it?).


I was wrong about va.h having a switch, it was something different.

gst-inspect-1.0 vaapih264enc shows it has one - I don't know what, if
anything, it actually does.

I am not trying to say here that flagging as constrained baseline and
using CABAC is in any way correct/legal. It just seems a shame to lose
CABAC = 10-20% less bitrate "for ever".

Looking at phone vids and vce from windows they get to use CABAC while
not using b-frames by flagging as main/high. It would be good if there
were a way to allow Linux vce to be the same.

I don't know how - if you can only advertise baseline, other than user
apps quirking their behavior depending on driver name, which I guess
some do anyway.
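
To make the rule being discussed concrete, here is a rough sketch of gating
CABAC on the advertised profile. The enum and helper are hypothetical and
not part of the state tracker; the numeric values are the H.264
profile_idc codes for those profiles.

```c
#include <stdbool.h>

/* profile_idc values from the H.264 specification. */
enum h264_profile_idc {
   H264_PROFILE_BASELINE = 66,
   H264_PROFILE_MAIN = 77,
   H264_PROFILE_HIGH = 100
};

/* Hypothetical helper: may the encoder enable CABAC for this profile? */
static bool cabac_allowed(enum h264_profile_idc profile)
{
   switch (profile) {
   case H264_PROFILE_MAIN:
   case H264_PROFILE_HIGH:
      return true;
   case H264_PROFILE_BASELINE:
   default:
      return false;
   }
}
```

With something like this, a stream flagged main or high could keep CABAC
even with b-frames set to 0, while baseline stays compliant.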




Regards, Christian.







Christian.




Signed-off-by: Boyuan Zhang
---
 src/gallium/state_trackers/va/picture.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index eae5dc4..db08a3c 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -110,7 +110,6 @@ getEncParamPreset(vlVaContext *context)
    context->desc.h264enc.motion_est.enc_ime2_search_range_y = 0x0004;

    //pic control preset
-   context->desc.h264enc.pic_ctrl.enc_cabac_enable = 0x0001;
    context->desc.h264enc.pic_ctrl.enc_constraint_set_flags = 0x0040;

    //rate control





[Mesa-dev] [PATCH] mesa: remove unused LocalSizeVariable

2016-10-18 Thread Timothy Arceri
Cc: Samuel Pitoiset 
Cc: Kenneth Graunke 
---
 src/mesa/main/mtypes.h| 5 -
 src/mesa/main/shaderapi.c | 1 -
 2 files changed, 6 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index ff20226..f4a9edd 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2078,11 +2078,6 @@ struct gl_compute_program
 * Size of shared variables accessed by the compute shader.
 */
unsigned SharedSize;
-
-   /**
-* Whether a variable work group size has been specified.
-*/
-   bool LocalSizeVariable;
 };
 
 
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index c40bb2d..1af1c3f 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2212,7 +2212,6 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
   for (i = 0; i < 3; i++)
  dst_cp->LocalSize[i] = src->Comp.LocalSize[i];
   dst_cp->SharedSize = src->Comp.SharedSize;
-  dst_cp->LocalSizeVariable = src->Comp.LocalSizeVariable;
   break;
}
default:
-- 
2.7.4



Re: [Mesa-dev] Mesa (master): glsl: Immediately inline built-ins rather than generating calls.

2016-10-18 Thread Kenneth Graunke
On Tuesday, October 18, 2016 4:38:17 PM PDT Brian Paul wrote:
> Hi Ken,
> 
> I found that this patch causes a regression.  There's a Windows medical 
> app which fails to link some shaders since this change.
> 
> Basically, when the gl_Position VS input is declared as invariant the 
> linker fails with:
> 
> error: declarations for uniform `gl_ModelViewProjectionMatrix' have 
> mismatching invariant qualifiers
> 
> I haven't investigated how to fix this.  I'm hoping you can see a simple 
> fix.
> 
> The attached piglit shader_runner script demonstrates the issue.  Passes 
> w/ NVIDIA.
> 
> Thanks!
> 
> -Brian

Oh, sorry about that!  Here are two possible fixes:

https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix
https://cgit.freedesktop.org/~kwg/mesa/commit/?h=invariant-fix-2

They're both kind of hacks...but the whole invariant propagation pass
is kind of a hack, and we've got some other hacks in place already.
So...maybe best to pile another one on.  Not sure which though.

Maybe Jason or Curro will have an opinion...

--Ken




Re: [Mesa-dev] [PATCH 08/11] anv: move to using vk_alloc helpers.

2016-10-18 Thread Dave Airlie
On 19 October 2016 at 03:18, Emil Velikov  wrote:
> Hi Dave,
>
> On 17 October 2016 at 03:07, Dave Airlie  wrote:
>> From: Dave Airlie 
>>
>> This moves all the alloc/free in anv to the generic helpers.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>  src/intel/vulkan/anv_batch_chain.c| 40 +++---
>>  src/intel/vulkan/anv_cmd_buffer.c | 22 -
>>  src/intel/vulkan/anv_descriptor_set.c | 12 -
>>  src/intel/vulkan/anv_device.c | 26 ++--
>>  src/intel/vulkan/anv_image.c  | 14 +--
>>  src/intel/vulkan/anv_intel.c  |  4 +--
>>  src/intel/vulkan/anv_pass.c   | 10 
>>  src/intel/vulkan/anv_pipeline.c   |  6 ++---
>>  src/intel/vulkan/anv_pipeline_cache.c |  8 +++---
>>  src/intel/vulkan/anv_private.h| 46 +--
>>  src/intel/vulkan/anv_query.c  |  6 ++---
>>  src/intel/vulkan/anv_wsi.c|  2 +-
>>  src/intel/vulkan/anv_wsi_wayland.c| 16 ++--
>>  src/intel/vulkan/anv_wsi_x11.c| 22 -
>>  src/intel/vulkan/gen7_pipeline.c  |  4 +--
>>  src/intel/vulkan/gen8_pipeline.c  |  4 +--
>>  src/intel/vulkan/genX_pipeline.c  |  6 ++---
>>  src/intel/vulkan/genX_state.c |  2 +-
>>  18 files changed, 103 insertions(+), 147 deletions(-)
>>
> Wondering whether one shouldn't include the new header only where needed?
> Quick grep shows 33 files which include anv_private.h of which (as per
> above) ~half only need vk_alloc.h.

Don't really see the benefit; splitting anv_private.h would be a
bigger job, I would think.

Dave.


Re: [Mesa-dev] Mesa (master): glsl: Immediately inline built-ins rather than generating calls.

2016-10-18 Thread Brian Paul

Hi Ken,

I found that this patch causes a regression.  There's a Windows medical 
app which fails to link some shaders since this change.


Basically, when the gl_Position VS input is declared as invariant the 
linker fails with:


error: declarations for uniform `gl_ModelViewProjectionMatrix' have 
mismatching invariant qualifiers


I haven't investigated how to fix this.  I'm hoping you can see a simple 
fix.


The attached piglit shader_runner script demonstrates the issue.  Passes 
w/ NVIDIA.


Thanks!

-Brian



On 09/23/2016 05:45 PM, Kenneth Graunke wrote:

Module: Mesa
Branch: master
Commit: b04ef3c08a288a5857349c9e582ee2718fa562f7
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b04ef3c08a288a5857349c9e582ee2718fa562f7

Author: Kenneth Graunke 
Date:   Fri May 30 23:52:22 2014 -0700

glsl: Immediately inline built-ins rather than generating calls.

In the past, we imported the prototypes of built-in functions, generated
calls to those, and waited until link time to resolve the calls and
import the actual code for the built-in functions.

This severely limited our compile-time optimization opportunities: even
trivial functions like dot() were represented as function calls.  We
also had no way of reasoning about those calls; they could have been
1,000 line functions with side-effects for all we knew.

Practically all built-in functions are trivial translations to
ir_expression opcodes, so it makes sense to just generate those inline.
Since we eventually inline all functions anyway, we may as well just do
it for all built-in functions.

There's only one snag: built-in functions that refer to built-in global
variables need those remapped to the variables in the shader being
compiled, rather than the ones in the built-in shader.  Currently,
ftransform() is the only function matching those criteria, so it seemed
easier to just make it a special case.

On Skylake:

total instructions in shared programs: 12023491 -> 12024010 (0.00%)
instructions in affected programs: 77595 -> 78114 (0.67%)
helped: 97
HURT: 309

total cycles in shared programs: 137239044 -> 137295498 (0.04%)
cycles in affected programs: 16714026 -> 16770480 (0.34%)
helped: 4663
HURT: 4923

While these statistics are in the wrong direction, the number of
hurt programs is small (309 / 41282 = 0.75%), and I don't think
anything can be done about it.  A change like this significantly
alters the order in which optimizations are performed.

Signed-off-by: Kenneth Graunke 
Reviewed-by: Ian Romanick 

---

  src/compiler/glsl/ast_function.cpp | 46 ++
  1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/src/compiler/glsl/ast_function.cpp 
b/src/compiler/glsl/ast_function.cpp
index 7e62ab7..ac3b52d 100644
--- a/src/compiler/glsl/ast_function.cpp
+++ b/src/compiler/glsl/ast_function.cpp
@@ -430,7 +430,8 @@ generate_call(exec_list *instructions, 
ir_function_signature *sig,
exec_list *actual_parameters,
ir_variable *sub_var,
ir_rvalue *array_idx,
-  struct _mesa_glsl_parse_state *state)
+  struct _mesa_glsl_parse_state *state,
+  bool inline_immediately)
  {
 void *ctx = state;
 exec_list post_call_conversions;
@@ -542,6 +543,10 @@ generate_call(exec_list *instructions, 
ir_function_signature *sig,
 ir_call *call = new(ctx) ir_call(sig, deref,
  actual_parameters, sub_var, array_idx);
 instructions->push_tail(call);
+   if (inline_immediately) {
+  call->generate_inline(call);
+  call->remove();
+   }

 /* Also emit any necessary out-parameter conversions. */
 instructions->append_list(&post_call_conversions);
@@ -557,19 +562,18 @@ match_function_by_name(const char *name,
 exec_list *actual_parameters,
 struct _mesa_glsl_parse_state *state)
  {
-   void *ctx = state;
 ir_function *f = state->symbols->get_function(name);
 ir_function_signature *local_sig = NULL;
 ir_function_signature *sig = NULL;

 /* Is the function hidden by a record type constructor? */
 if (state->symbols->get_type(name))
-  goto done; /* no match */
+  return sig; /* no match */

 /* Is the function hidden by a variable (impossible in 1.10)? */
 if (!state->symbols->separate_function_namespace
 && state->symbols->get_variable(name))
-  goto done; /* no match */
+  return sig; /* no match */

 if (f != NULL) {
/* In desktop GL, the presence of a user-defined signature hides any
@@ -583,31 +587,15 @@ match_function_by_name(const char *name,
sig = local_sig = 

Re: [Mesa-dev] [PATCH 1/4] configure.ac: print whether GBM is enabled

2016-10-18 Thread Eric Engestrom
On Wednesday, 2016-10-19 00:00:02 +0200, Marek Olšák wrote:
> From: Marek Olšák 

Series is:
Reviewed-by: Eric Engestrom 

> 
> ---
>  configure.ac | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/configure.ac b/configure.ac
> index 8e779d4..bc9b732 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2860,20 +2860,25 @@ if test "$enable_egl" = yes; then
>  egl_drivers=""
>  if test "x$HAVE_EGL_DRIVER_DRI2" != "x"; then
>  egl_drivers="$egl_drivers builtin:egl_dri2"
>  fi
>  if test "x$HAVE_EGL_DRIVER_DRI3" != "x"; then
>  egl_drivers="$egl_drivers builtin:egl_dri3"
>  fi
>  
>  echo "EGL drivers:$egl_drivers"
>  fi
> +if test "x$enable_gbm" = xyes; then
> +echo "GBM: yes"
> +else
> +echo "GBM: no"
> +fi
>  
>  # Vulkan
>  echo ""
>  if test "x$VULKAN_DRIVERS" != x; then
>  echo "Vulkan drivers:  $VULKAN_DRIVERS"
>  echo "Vulkan ICD dir:  $VULKAN_ICD_INSTALL_DIR"
>  else
>  echo "Vulkan drivers:  no"
>  fi
>  
> -- 
> 2.7.4


Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Ian Romanick
On 10/18/2016 10:12 AM, Jan Ziak wrote:
>> Regarding C++ templates, the compiler doesn't use them. If u_vector
>> (Dave Airlie?) provides the same functionality as your array, I
>> suggest we use u_vector instead.
> 
> Let me repeat what you just wrote, because it is unbelievable: You are
> advising the use of non-templated collection types in C++ code.

Are you able to find any templates anywhere in the GLSL compiler?  I
don't think his statement was ambiguous.

>> If you can't use u_vector, you should
>> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
>> Kenneth Graunke) to use C++ templates.
> 
> - You are talking about coding rules some Mesa developers agreed upon
> and didn't bother writing down for other developers to read

It was mostly written down, but it's not documented in the code base.
It seems impossible to even get current, de facto practices documented.
It's one of the few things in Mesa that really does get bike shedded.

Before the current GLSL compiler, there was no C++ in Mesa at all.
While developing the compiler, I found that I was re-implementing
numerous C++ features by hand in C.  It felt pretty insane.  Why am I
filling out all of these virtual function tables by hand?

At the same time, I also observed that almost 100% of shipping,
production-quality compilers were implemented using C++.  The single
exception was GCC.  The need for GCC to bootstrap on minimal, sometimes
dire, C compilers was the one thing keeping C++ out of the GCC code
base.  It wasn't even that long ago that core parts of GCC had to
support pre-C89 compilers.  As far as I am aware, they have since
started using C++ too.  Who am I to be so bold as to declare that
everyone shipping a C compiler is wrong?

In light of that, I opened a discussion about using C++ in the compiler.

Especially at that time (2008-ish), nobody working on Mesa was
particularly skilled at C++.  I had used it some, and, in the mid-90's,
had some really, really bad experiences with the implementations and
side-effects of various language features.  I still have nightmares
about trying to use templates in GCC 2.4.2.  There are quite a few C++
features that are really easy to misuse.  There are also a lot of
subtleties in the language that very few people really understand.

I don't mean this in a pejorative way, but there was and continues to be
a lot of FUD around C++.  I think a lot of this comes from the "Old
Woman Who Swallowed a Fly" nature of solving C++ development problems.
You have a problem.  The only way to solve that problem is to use
another language feature that you may or may not understand how to use
safely.  You use that feature to solve your problem.  Use of that
feature presents a new problem.  The only way to solve the new problem
is to use yet another language feature that you may or may not
understand how to use safely.  Pretty soon nobody knows how anything in
the code works.

After quite a bit of discussion on the mesa-dev list, on #dri-devel, and
face-to-face at XDC, we decided to use C++ with some restrictions.  The
main restriction was that C++ would be limited to the GLSL compiler
stack.  The other restrictions were roughly similar to the embedded C++
subset.

- No exceptions.

- No RTTI.

- No multiple inheritance.

- No operator overloading.  It could be argued that our use of
  placement new deviates from this.  In the previous metaphor, I
  think this was either the spider or the bird.

- No templates.

There are other restrictions (e.g., no STL) that come as natural
consequences of these.

Our goal was that any existing Mesa developer should be able to read any
piece of new C++ code and know what it was doing.

I feel like, due to our collective ignorance about the language, we may
have been slightly too restrictive.  It seems like we could have used
templates in some very, very restricted ways to enable things like
iterators that would have saved typing, encouraged refactoring, and made
the code more understandable.  Instead we have a proliferation of
foreach macros (or callbacks), and every data structure is a linked
list.  It's difficult to say whether it would have made things strictly
better or led us to swallow a bird, a cat, a dog...
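
As an illustration of the foreach-macro-over-linked-list style described
above, here is a minimal sketch in plain C.  The names (struct item,
push_item, foreach_item, sum_items) are hypothetical and are not Mesa's
exec_list API; the point is only that the macro is typed by hand for one
node type, which is the work an iterator template would otherwise do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: the link pointer lives inside the payload. */
struct item {
   int value;
   struct item *next;
};

/* Hand-typed iteration macro; one such macro is needed per node type. */
#define foreach_item(it, head) \
   for (struct item *it = (head); it != NULL; it = it->next)

/* Prepend node to the list and return the new head. */
static struct item *
push_item(struct item *head, struct item *node)
{
   assert(node != NULL);
   node->next = head;
   return node;
}

/* Walk the list with the macro and add up the payloads. */
static int
sum_items(struct item *head)
{
   int total = 0;
   foreach_item(it, head)
      total += it->value;
   return total;
}
```

Every new container or element type needs its own copy of such a macro,
which is roughly the proliferation mentioned above.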

I also feel like that ship has sailed.  When NIR was implemented using
pure C, going so far as to re-invent constructors using macros, the
chances of using more C++ faded substantially.  If, and that's a really,
really big if, additional C++ were to be used, it would have to be
preceded by patches to docs/devinfo.html that documented:

- What features were to be used.

- Why use of those features benefits the code base.  Specifically,
  why use of the new feature is substantially better than a
  different implementation that does not use the feature.

- Any restrictions on the use of those features.

Such a discussion may produce additional alternatives.

> - I am not willing to use u_vector in C++ code


Re: [Mesa-dev] [PATCH] draw: improve vertex fetch (v2)

2016-10-18 Thread Jose Fonseca

On 15/10/16 02:54, srol...@vmware.com wrote:

From: Roland Scheidegger 

The per-element fetch has quite some calculations which are constant,
these can be moved outside both the per-element as well as the main
shader loop (llvm can figure out it's constant mostly on its own, however
this can have a significant compile time cost).
Similarly, it looks easier swapping the fetch loops (outer loop per attrib,
inner loop filling up the per vertex elements - this way the aos->soa
conversion also can be done per attrib and not just at the end though again
this doesn't really make much of a difference in the generated code). (This
would also make it possible to vectorize the calculations leading to the
fetches.)
There's also some minimal change simplifying the overflow math slightly.
All in all, the generated code seems to look slightly simpler (depending
on the actual vs), but more importantly I've seen a significant reduction
in compile times for some vs (albeit with old (3.3) llvm version, and the
time reduction is only really for the optimizations run on the IR).
v2: adapt to other draw change.

No changes with piglit.
---
 src/gallium/auxiliary/draw/draw_llvm.c | 190 +++--
 .../auxiliary/gallivm/lp_bld_arit_overflow.c   |  24 +++
 .../auxiliary/gallivm/lp_bld_arit_overflow.h   |   6 +
 3 files changed, 134 insertions(+), 86 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index 3b56856..2f82d9d 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -659,85 +659,42 @@ generate_vs(struct draw_llvm_variant *variant,
 static void
 generate_fetch(struct gallivm_state *gallivm,
struct draw_context *draw,
-   LLVMValueRef vbuffers_ptr,
+   const struct util_format_description *format_desc,
+   LLVMValueRef vb_stride,
+   LLVMValueRef stride_fixed,
+   LLVMValueRef map_ptr,
+   LLVMValueRef buffer_size_adj,
+   LLVMValueRef ofbit,
LLVMValueRef *res,
-   struct pipe_vertex_element *velem,
-   LLVMValueRef vbuf,
-   LLVMValueRef index,
-   LLVMValueRef instance_id,
-   LLVMValueRef start_instance)
+   LLVMValueRef index)
 {
-   const struct util_format_description *format_desc =
-  util_format_description(velem->src_format);
LLVMValueRef zero = LLVMConstNull(LLVMInt32TypeInContext(gallivm->context));
LLVMBuilderRef builder = gallivm->builder;
-   LLVMValueRef indices =
-  LLVMConstInt(LLVMInt64TypeInContext(gallivm->context),
-   velem->vertex_buffer_index, 0);
-   LLVMValueRef vbuffer_ptr = LLVMBuildGEP(builder, vbuffers_ptr,
-   &indices, 1, "");
-   LLVMValueRef vb_stride = draw_jit_vbuffer_stride(gallivm, vbuf);
-   LLVMValueRef vb_buffer_offset = draw_jit_vbuffer_offset(gallivm, vbuf);
-   LLVMValueRef map_ptr = draw_jit_dvbuffer_map(gallivm, vbuffer_ptr);
-   LLVMValueRef buffer_size = draw_jit_dvbuffer_size(gallivm, vbuffer_ptr);
LLVMValueRef stride;
LLVMValueRef buffer_overflowed;
-   LLVMValueRef needed_buffer_size;
LLVMValueRef temp_ptr =
   lp_build_alloca(gallivm,
   lp_build_vec_type(gallivm, lp_float32_vec4_type()), "");
-   LLVMValueRef ofbit = NULL;
struct lp_build_if_state if_ctx;

-   if (velem->src_format == PIPE_FORMAT_NONE) {
+   if (format_desc->format == PIPE_FORMAT_NONE) {
   *res = lp_build_const_vec(gallivm, lp_float32_vec4_type(), 0);
   return;
}

-   if (velem->instance_divisor) {
-  /* Index is equal to the start instance plus the number of current
-   * instance divided by the divisor. In this case we compute it as:
-   * index = start_instance + (instance_id  / divisor)
-   */
-  LLVMValueRef current_instance;
-  current_instance = LLVMBuildUDiv(builder, instance_id,
-   lp_build_const_int32(gallivm, velem->instance_divisor),
-   "instance_divisor");
-  index = lp_build_uadd_overflow(gallivm, start_instance,
- current_instance, &ofbit);
-   }
-
 stride = lp_build_umul_overflow(gallivm, vb_stride, index, &ofbit);
-   stride = lp_build_uadd_overflow(gallivm, stride, vb_buffer_offset, &ofbit);
-   stride = lp_build_uadd_overflow(
-  gallivm, stride,
-  lp_build_const_int32(gallivm, velem->src_offset), &ofbit);
-   needed_buffer_size = lp_build_uadd_overflow(
-  gallivm, stride,
-  lp_build_const_int32(gallivm,
-   util_format_get_blocksize(velem->src_format)),
-  &ofbit);
+   stride = lp_build_uadd_overflow(gallivm, stride, stride_fixed, &ofbit);

buffer_overflowed = LLVMBuildICmp(builder, LLVMIntUGT,
- needed_buffer_size, buffer_size,
+ 

Re: [Mesa-dev] [PATCH 1/6] util: add vector util code.

2016-10-18 Thread Dave Airlie
On 17 October 2016 at 18:09, Nicolai Hähnle  wrote:
> On 14.10.2016 05:16, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> This is ported from anv, both anv and radv can share this.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>  src/util/Makefile.sources |  4 +-
>>  src/util/u_vector.c   | 98 +++
>>  src/util/u_vector.h   | 85 
>>  3 files changed, 186 insertions(+), 1 deletion(-)
>>  create mode 100644 src/util/u_vector.c
>>  create mode 100644 src/util/u_vector.h
>>
>> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
>> index 8b17bcf..b7b1e91 100644
>> --- a/src/util/Makefile.sources
>> +++ b/src/util/Makefile.sources
>> @@ -35,7 +35,9 @@ MESA_UTIL_FILES :=\
>> strtod.h \
>> texcompress_rgtc_tmp.h \
>> u_atomic.h \
>> -   u_endian.h
>> +   u_endian.h \
>> +   u_vector.c \
>> +   u_vector.h
>>
>>  MESA_UTIL_GENERATED_FILES = \
>> format_srgb.c
>
> [snip]
>
>> diff --git a/src/util/u_vector.h b/src/util/u_vector.h
>> new file mode 100644
>> index 000..ea52837
>> --- /dev/null
>> +++ b/src/util/u_vector.h
>> @@ -0,0 +1,85 @@
>> +/*
>> + * Copyright © 2015 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +#ifndef U_VECTOR_H
>> +#define U_VECTOR_H
>> +
>> +#include 
>> +#include 
>> +#include "util/u_math.h"
>> +#include "util/macros.h"
>> +
>> +static inline uint32_t
>> +u_align_u32(uint32_t v, uint32_t a)
>> +{
>> +   assert(a != 0 && a == (a & -a));
>> +   return (v + a - 1) & ~(a - 1);
>> +}
>
>
> This fits better in u_math.h
>

Yes, I realise this, and I'll probably move it there separately, but I'd
like to start bringing u_math.h into src/util instead of pulling it from
gallium in the future.

I'll add a todo beside this function for now.

Dave.
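
For reference, a small standalone sketch of the alignment helper
discussed above.  It follows the quoted u_align_u32() body; the name
align_u32_sketch is hypothetical so that it does not clash with the
real helper.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a.  a must be a non-zero power of
 * two: (a & -a) isolates the lowest set bit, so it equals a only when
 * exactly one bit is set. */
static inline uint32_t
align_u32_sketch(uint32_t v, uint32_t a)
{
   assert(a != 0 && a == (a & -a));
   /* Adding a - 1 carries into the next multiple unless v is already
    * aligned; masking with ~(a - 1) then clears the low bits. */
   return (v + a - 1) & ~(a - 1);
}
```

For example, with an alignment of 4, the values 1 through 4 all round
up to 4, and 5 rounds up to 8.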


[Mesa-dev] [PATCH 2/4] configure.ac: enable GBM by default

2016-10-18 Thread Marek Olšák
From: Marek Olšák 

---
 configure.ac | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index bc9b732..3431a5d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -948,23 +948,30 @@ AC_ARG_ENABLE([egl],
 [enable_egl="$enableval"],
 [enable_egl=yes])
 
 AC_ARG_ENABLE([xa],
 [AS_HELP_STRING([--enable-xa],
 [enable build of the XA X Acceleration API @<:@default=disabled@:>@])],
 [enable_xa="$enableval"],
 [enable_xa=no])
 AC_ARG_ENABLE([gbm],
[AS_HELP_STRING([--enable-gbm],
- [enable gbm library @<:@default=auto@:>@])],
+ [enable gbm library @<:@default=yes except cygwin@:>@])],
[enable_gbm="$enableval"],
-   [enable_gbm=auto])
+   [case "$host_os" in
+   cygwin*)
+  enable_gbm=no
+  ;;
+   *)
+  enable_gbm=yes
+  ;;
+esac])
 AC_ARG_ENABLE([nine],
 [AS_HELP_STRING([--enable-nine],
 [enable build of the nine Direct3D9 API @<:@default=no@:>@])],
 [enable_nine="$enableval"],
 [enable_nine=no])
 
 AC_ARG_ENABLE([xvmc],
[AS_HELP_STRING([--enable-xvmc],
  [enable xvmc library @<:@default=auto@:>@])],
[enable_xvmc="$enableval"],
@@ -1748,28 +1755,20 @@ if test "x$enable_osmesa" = xyes -o 
"x$enable_gallium_osmesa" = xyes; then
 OSMESA_PC_LIB_PRIV="-lm $PTHREAD_LIBS $SELINUX_LIBS $DLOPEN_LIBS"
 fi
 
 AC_SUBST([OSMESA_LIB_DEPS])
 AC_SUBST([OSMESA_PC_REQ])
 AC_SUBST([OSMESA_PC_LIB_PRIV])
 
 dnl
 dnl gbm configuration
 dnl
-if test "x$enable_gbm" = xauto; then
-case "$with_egl_platforms" in
-*drm*)
-enable_gbm=yes ;;
- *)
-enable_gbm=no ;;
-esac
-fi
 if test "x$enable_gbm" = xyes; then
 if test "x$enable_dri" = xyes; then
 if test "x$enable_shared_glapi" = xno; then
 AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi])
 fi
 else
 # Strictly speaking libgbm does not require --enable-dri, although
 # both of its backends do. Thus one can build libgbm without any
 # backends if --disable-dri is set.
 # To avoid unnecessary complexity of checking if at least one backend
-- 
2.7.4



[Mesa-dev] [PATCH 4/4] configure.ac: check for Glamor requirements only when needed

2016-10-18 Thread Marek Olšák
From: Marek Olšák 

---
 configure.ac | 37 +++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 12c8165..17dfafd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2296,35 +2296,52 @@ dnl Gallium helper functions
 dnl
 gallium_require_llvm() {
 if test "x$MESA_LLVM" = x0; then
 case "$host" in *gnux32) return;; esac
 case "$host_cpu" in
 i*86|x86_64|amd64) AC_MSG_ERROR([LLVM is required to build $1 on x86 
and x86_64]);;
 esac
 fi
 }
 
-dnl This is for Glamor. Skip this if OpenGL is disabled.
-require_egl_drm() {
+dnl If EGL/X11 or GLX is enabled, make sure they are usable.
+check_glamor_requirements() {
 if test "x$enable_opengl" = xno; then
 return 0
 fi
 
+need_glamor=no
+
+if test "x$enable_glx" = xdri; then
+need_glamor=yes
+fi
+
 case "$with_egl_platforms" in
-*drm*)
-;;
- *)
-AC_MSG_ERROR([--with-egl-platforms=drm is required to build the $1 
driver.])
+*x11*)
+need_glamor=yes
 ;;
 esac
-if test "x$enable_gbm" != xyes; then
-AC_MSG_ERROR([--enable-gbm is required to build the $1 driver.])
+
+if test "x$need_glamor" = xyes; then
+suffix="is required for X acceleration with the $1 driver."
+
+if test "x$enable_gbm" != xyes; then
+AC_MSG_ERROR([--enable-gbm $suffix])
+fi
+
+case "$with_egl_platforms" in
+*drm*)
+;;
+*)
+AC_MSG_ERROR([--with-egl-platforms=x11,drm $suffix])
+;;
+esac
 fi
 }
 
 radeon_llvm_check() {
 if test ${LLVM_VERSION_INT} -lt 307; then
 amdgpu_llvm_target_name='r600'
 else
 amdgpu_llvm_target_name='amdgpu'
 fi
 llvm_check_version_for $2 $3 $4 $1
@@ -2427,21 +2444,21 @@ if test -n "$with_gallium_drivers"; then
 radeon_gallium_llvm_check "r600g" "3" "6" "0"
 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
 fi
 ;;
 xradeonsi)
 HAVE_GALLIUM_RADEONSI=yes
 PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= 
$LIBDRM_RADEON_REQUIRED])
 PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= 
$LIBDRM_AMDGPU_REQUIRED])
 require_libdrm "radeonsi"
 radeon_gallium_llvm_check "radeonsi" "3" "6" "0"
-require_egl_drm "radeonsi"
+check_glamor_requirements "radeonsi"
 ;;
 xnouveau)
 HAVE_GALLIUM_NOUVEAU=yes
 PKG_CHECK_MODULES([NOUVEAU], [libdrm_nouveau >= 
$LIBDRM_NOUVEAU_REQUIRED])
 require_libdrm "nouveau"
 ;;
 xfreedreno)
 HAVE_GALLIUM_FREEDRENO=yes
 PKG_CHECK_MODULES([FREEDRENO], [libdrm_freedreno >= 
$LIBDRM_FREEDRENO_REQUIRED])
 require_libdrm "freedreno"
@@ -2478,21 +2495,21 @@ if test -n "$with_gallium_drivers"; then
 require_libdrm "vc4"
 
 PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
   [USE_VC4_SIMULATOR=yes;
DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"],
   [USE_VC4_SIMULATOR=no])
 ;;
 xvirgl)
 HAVE_GALLIUM_VIRGL=yes
 require_libdrm "virgl"
-require_egl_drm "virgl"
+check_glamor_requirements "virgl"
 ;;
 *)
 AC_MSG_ERROR([Unknown Gallium driver: $driver])
 ;;
 esac
 done
 fi
 
 if test "x$HAVE_RADEON_VULKAN" = "xyes"; then
 radeon_llvm_check "radv" "3" "9" "0"
-- 
2.7.4



[Mesa-dev] [PATCH 1/4] configure.ac: print whether GBM is enabled

2016-10-18 Thread Marek Olšák
From: Marek Olšák 

---
 configure.ac | 5 +
 1 file changed, 5 insertions(+)

diff --git a/configure.ac b/configure.ac
index 8e779d4..bc9b732 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2860,20 +2860,25 @@ if test "$enable_egl" = yes; then
 egl_drivers=""
 if test "x$HAVE_EGL_DRIVER_DRI2" != "x"; then
 egl_drivers="$egl_drivers builtin:egl_dri2"
 fi
 if test "x$HAVE_EGL_DRIVER_DRI3" != "x"; then
 egl_drivers="$egl_drivers builtin:egl_dri3"
 fi
 
 echo "EGL drivers:$egl_drivers"
 fi
+if test "x$enable_gbm" = xyes; then
+echo "GBM: yes"
+else
+echo "GBM: no"
+fi
 
 # Vulkan
 echo ""
 if test "x$VULKAN_DRIVERS" != x; then
 echo "Vulkan drivers:  $VULKAN_DRIVERS"
 echo "Vulkan ICD dir:  $VULKAN_ICD_INSTALL_DIR"
 else
 echo "Vulkan drivers:  no"
 fi
 
-- 
2.7.4



[Mesa-dev] [PATCH 3/4] configure.ac: enable EGL platform DRM if GBM is enabled

2016-10-18 Thread Marek Olšák
From: Marek Olšák 

Since GBM is enabled by default, the DRM platform is also enabled by default.

The whitespace changes remove tabs.
---
 configure.ac | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 3431a5d..12c8165 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2010,23 +2010,27 @@ AC_SUBST([EGL_CLIENT_APIS])
 
 dnl
 dnl EGL Platforms configuration
 dnl
 AC_ARG_WITH([egl-platforms],
 [AS_HELP_STRING([--with-egl-platforms@<:@=DIRS...@:>@],
 [comma delimited native platforms libEGL supports, e.g.
 "x11,drm" @<:@default=auto@:>@])],
 [with_egl_platforms="$withval"],
 [if test "x$enable_egl" = xyes; then
-   with_egl_platforms="x11"
+if test "x$enable_gbm" = xyes; then
+   with_egl_platforms="x11,drm"
+else
+   with_egl_platforms="x11"
+fi
 else
-   with_egl_platforms=""
+with_egl_platforms=""
 fi])
 
 if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then
 AC_MSG_ERROR([cannot build egl state tracker without EGL library])
 fi
 
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
 WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner 
wayland-scanner`,
 WAYLAND_SCANNER='')
 if test "x$WAYLAND_SCANNER" = x; then
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/5] intel: genxml: add SAMPLER_BORDER_COLOR_STATE structures

2016-10-18 Thread Jason Ekstrand
Thanks for the Sandy Bridge doc link.  With all of the extra MBZ fields
removed, this patch is

Reviewed-by: Jason Ekstrand 

On Mon, Oct 17, 2016 at 11:39 AM, Lionel Landwerlin 
wrote:

> On Mon, 2016-10-17 at 10:56 -0700, Jason Ekstrand wrote:
> >
> >
> > On Mon, Oct 17, 2016 at 8:46 AM, Lionel Landwerlin wrote:
> > > Signed-off-by: Lionel Landwerlin 
> > > ---
> > >  src/intel/genxml/gen6.xml  | 32 
> > >  src/intel/genxml/gen7.xml  | 12 
> > >  src/intel/genxml/gen75.xml | 40
> > > 
> > >  src/intel/genxml/gen8.xml  | 12 
> > >  src/intel/genxml/gen9.xml  | 12 
> > >  5 files changed, 108 insertions(+)
> > >
> > > diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
> > > index 211716b..7ba8954 100644
> > > --- a/src/intel/genxml/gen6.xml
> > > +++ b/src/intel/genxml/gen6.xml
> > > @@ -372,6 +372,38 @@
> > >  
> > >
> > >
> > > +  
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > +
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > +
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > +
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > +
> > > + > > type="int"/>
> > > + > > type="int"/>
> > > + > > type="int"/>
> > > + > > type="int"/>
> > > +
> > > + > > type="int"/>
> > > + > > type="int"/>
> > > + > > type="int"/>
> > > + > > type="int"/>
> > > +  
> >
> > Are there docs for this anywhere or did you just pull it out of the
> > gen6 GL code?
> >
>
> Yes, there are but indeed not in the PRMs.
>
>
> > > +
> > >
> > >   > > type="bool"/>
> > >   > > type="uint">
> > > diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
> > > index eabb244..a950603 100644
> > > --- a/src/intel/genxml/gen7.xml
> > > +++ b/src/intel/genxml/gen7.xml
> > > @@ -428,6 +428,18 @@
> > >   > > type="u4.8"/>
> > >
> > >
> > > +  
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > +
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > +  
> > > +
> > >
> > >   > > type="bool"/>
> > >   > > type="uint">
> > > diff --git a/src/intel/genxml/gen75.xml
> > > b/src/intel/genxml/gen75.xml
> > > index 27a12cb..42f66cb 100644
> > > --- a/src/intel/genxml/gen75.xml
> > > +++ b/src/intel/genxml/gen75.xml
> > > @@ -438,6 +438,46 @@
> > >   > > type="u4.8"/>
> > >
> > >
> > > +  
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > +
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> >
> > In the rest of the XML, MBZ fields simply don't exist.  The packing
> > functions will automatically zero anything that doesn't have data in
> > it.  I'm not sure if that's true for whole dwords but if it's not, we
> > should fix that.  In other words, I believe the correct solution is
> > to just delete these and let "Border Color 8bit Red" start super-late
> > in the packet.
> >
> > > +
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > end="575" type="uint"/>
> > > + > > end="607" type="uint"/>
> > > + > > end="639" type="uint"/>
> >
> > These can go as well
> >
> > > +
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > end="575" type="uint"/>
> >
> > and this
> >
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > end="639" type="uint"/>
> >
> > and this
> >
> > > +
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > + > > type="uint"/>
> > > +  
> > > +
> > >
> > >   > > type="bool"/>
> > >   > > type="uint">
> > > diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
> > > index ee62614..a281f01 100644
> > > --- a/src/intel/genxml/gen8.xml
> > > +++ b/src/intel/genxml/gen8.xml
> > > @@ -358,6 +358,18 @@
> > >   > > type="s1.6"/>
> > >
> > >
> > > +  
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > + > > type="float"/>
> > > +
> > > + > > 

Re: [Mesa-dev] [PATCH 10/25] mesa/nir/radv/anv: add shader_info param to nir_shader builder

2016-10-18 Thread Jason Ekstrand
On Tue, Oct 18, 2016 at 2:18 PM, Jason Ekstrand 
wrote:

>
>
> On Tue, Oct 18, 2016 at 2:06 PM, Timothy Arceri <
> timothy.arc...@collabora.com> wrote:
>
>> On Tue, 2016-10-18 at 08:47 -0700, Jason Ekstrand wrote:
>> > On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri wrote:
>> > > And pass in a pointer to the shader info in gl_program for ARB
>> > > programs.
>> > > ---
>> > >  src/amd/vulkan/radv_meta_blit.c   | 12 
>> > >  src/amd/vulkan/radv_meta_blit2d.c | 12 
>> > >  src/amd/vulkan/radv_meta_buffer.c |  6 --
>> > >  src/amd/vulkan/radv_meta_bufimage.c   |  3 ++-
>> > >  src/amd/vulkan/radv_meta_clear.c  | 12 
>> > >  src/amd/vulkan/radv_meta_decompress.c |  6 --
>> > >  src/amd/vulkan/radv_meta_fast_clear.c |  6 --
>> > >  src/amd/vulkan/radv_meta_resolve.c|  6 --
>> > >  src/amd/vulkan/radv_meta_resolve_cs.c |  2 +-
>> > >  src/amd/vulkan/radv_pipeline.c|  2 +-
>> > >  src/compiler/nir/nir_builder.h|  5 +++--
>> > >  src/compiler/nir/tests/control_flow_tests.cpp |  3 ++-
>> > >  src/gallium/auxiliary/nir/tgsi_to_nir.c   |  2 +-
>> > >  src/intel/blorp/blorp_blit.c  |  2 +-
>> > >  src/intel/blorp/blorp_clear.c |  2 +-
>> > >  src/mesa/drivers/dri/i965/brw_program.c   |  2 +-
>> > >  src/mesa/drivers/dri/i965/brw_program.h   |  2 +-
>> > >  src/mesa/drivers/dri/i965/brw_tcs.c   |  3 ++-
>> > >  src/mesa/program/prog_to_nir.c|  5 +++--
>> > >  src/mesa/program/prog_to_nir.h|  2 +-
>> > >  20 files changed, 60 insertions(+), 35 deletions(-)
>> > >
>> > > diff --git a/src/amd/vulkan/radv_meta_blit.c
>> > > b/src/amd/vulkan/radv_meta_blit.c
>> > > index bfbf880..3eda43e 100644
>> > > --- a/src/amd/vulkan/radv_meta_blit.c
>> > > +++ b/src/amd/vulkan/radv_meta_blit.c
>> > > @@ -37,7 +37,8 @@ build_nir_vertex_shader(void)
>> > > const struct glsl_type *vec4 = glsl_vec4_type();
>> > > nir_builder b;
>> > >
>> > > -   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_VERTEX, NULL);
>> > > +   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_VERTEX, NULL,
>> > > +  NULL);
>> > > b.shader->info->name = ralloc_strdup(b.shader,
>> > > "meta_blit_vs");
>> > >
>> > > nir_variable *pos_in = nir_variable_create(b.shader,
>> > > nir_var_shader_in,
>> > > @@ -67,7 +68,8 @@ build_nir_copy_fragment_shader(enum
>> > > glsl_sampler_dim tex_dim)
>> > > const struct glsl_type *vec4 = glsl_vec4_type();
>> > > nir_builder b;
>> > >
>> > > -   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_FRAGMENT, NULL);
>> > > +   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_FRAGMENT, NULL,
>> > > +  NULL);
>> > >
>> > > sprintf(shader_name, "meta_blit_fs.%d", tex_dim);
>> > > b.shader->info->name = ralloc_strdup(b.shader,
>> > > shader_name);
>> > > @@ -121,7 +123,8 @@ build_nir_copy_fragment_shader_depth(enum
>> > > glsl_sampler_dim tex_dim)
>> > > const struct glsl_type *vec4 = glsl_vec4_type();
>> > > nir_builder b;
>> > >
>> > > -   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_FRAGMENT, NULL);
>> > > +   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_FRAGMENT, NULL,
>> > > +  NULL);
>> > >
>> > > sprintf(shader_name, "meta_blit_depth_fs.%d", tex_dim);
>> > > b.shader->info->name = ralloc_strdup(b.shader,
>> > > shader_name);
>> > > @@ -175,7 +178,8 @@ build_nir_copy_fragment_shader_stencil(enum
>> > > glsl_sampler_dim tex_dim)
>> > > const struct glsl_type *vec4 = glsl_vec4_type();
>> > > nir_builder b;
>> > >
>> > > -   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_FRAGMENT, NULL);
>> > > +   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_FRAGMENT, NULL,
>> > > +  NULL);
>> > >
>> > > sprintf(shader_name, "meta_blit_stencil_fs.%d", tex_dim);
>> > > b.shader->info->name = ralloc_strdup(b.shader,
>> > > shader_name);
>> > > diff --git a/src/amd/vulkan/radv_meta_blit2d.c
>> > > b/src/amd/vulkan/radv_meta_blit2d.c
>> > > index 6e92f80..bed03a3 100644
>> > > --- a/src/amd/vulkan/radv_meta_blit2d.c
>> > > +++ b/src/amd/vulkan/radv_meta_blit2d.c
>> > > @@ -438,7 +438,8 @@ build_nir_vertex_shader(void)
>> > > const struct glsl_type *vec2 =
>> > > glsl_vector_type(GLSL_TYPE_FLOAT, 2);
>> > > nir_builder b;
>> > >
>> > > -   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_VERTEX, NULL);
>> > > +   nir_builder_init_simple_shader(&b, NULL,
>> > > MESA_SHADER_VERTEX, NULL,
>> > > +  

Re: [Mesa-dev] [PATCH 10/25] mesa/nir/radv/anv: add shader_info param to nir_shader builder

2016-10-18 Thread Jason Ekstrand
On Tue, Oct 18, 2016 at 2:06 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> On Tue, 2016-10-18 at 08:47 -0700, Jason Ekstrand wrote:
> > On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri  > abora.com> wrote:
> > > And pass in a pointer to the shader info in gl_program for ARB
> > > programs.
> > > ---
> > >  src/amd/vulkan/radv_meta_blit.c   | 12 
> > >  src/amd/vulkan/radv_meta_blit2d.c | 12 
> > >  src/amd/vulkan/radv_meta_buffer.c |  6 --
> > >  src/amd/vulkan/radv_meta_bufimage.c   |  3 ++-
> > >  src/amd/vulkan/radv_meta_clear.c  | 12 
> > >  src/amd/vulkan/radv_meta_decompress.c |  6 --
> > >  src/amd/vulkan/radv_meta_fast_clear.c |  6 --
> > >  src/amd/vulkan/radv_meta_resolve.c|  6 --
> > >  src/amd/vulkan/radv_meta_resolve_cs.c |  2 +-
> > >  src/amd/vulkan/radv_pipeline.c|  2 +-
> > >  src/compiler/nir/nir_builder.h|  5 +++--
> > >  src/compiler/nir/tests/control_flow_tests.cpp |  3 ++-
> > >  src/gallium/auxiliary/nir/tgsi_to_nir.c   |  2 +-
> > >  src/intel/blorp/blorp_blit.c  |  2 +-
> > >  src/intel/blorp/blorp_clear.c |  2 +-
> > >  src/mesa/drivers/dri/i965/brw_program.c   |  2 +-
> > >  src/mesa/drivers/dri/i965/brw_program.h   |  2 +-
> > >  src/mesa/drivers/dri/i965/brw_tcs.c   |  3 ++-
> > >  src/mesa/program/prog_to_nir.c|  5 +++--
> > >  src/mesa/program/prog_to_nir.h|  2 +-
> > >  20 files changed, 60 insertions(+), 35 deletions(-)
> > >
> > > diff --git a/src/amd/vulkan/radv_meta_blit.c
> > > b/src/amd/vulkan/radv_meta_blit.c
> > > index bfbf880..3eda43e 100644
> > > --- a/src/amd/vulkan/radv_meta_blit.c
> > > +++ b/src/amd/vulkan/radv_meta_blit.c
> > > @@ -37,7 +37,8 @@ build_nir_vertex_shader(void)
> > > const struct glsl_type *vec4 = glsl_vec4_type();
> > > nir_builder b;
> > >
> > > -   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_VERTEX, NULL);
> > > +   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_VERTEX, NULL,
> > > +  NULL);
> > > b.shader->info->name = ralloc_strdup(b.shader,
> > > "meta_blit_vs");
> > >
> > > nir_variable *pos_in = nir_variable_create(b.shader,
> > > nir_var_shader_in,
> > > @@ -67,7 +68,8 @@ build_nir_copy_fragment_shader(enum
> > > glsl_sampler_dim tex_dim)
> > > const struct glsl_type *vec4 = glsl_vec4_type();
> > > nir_builder b;
> > >
> > > -   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_FRAGMENT, NULL);
> > > +   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_FRAGMENT, NULL,
> > > +  NULL);
> > >
> > > sprintf(shader_name, "meta_blit_fs.%d", tex_dim);
> > > b.shader->info->name = ralloc_strdup(b.shader,
> > > shader_name);
> > > @@ -121,7 +123,8 @@ build_nir_copy_fragment_shader_depth(enum
> > > glsl_sampler_dim tex_dim)
> > > const struct glsl_type *vec4 = glsl_vec4_type();
> > > nir_builder b;
> > >
> > > -   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_FRAGMENT, NULL);
> > > +   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_FRAGMENT, NULL,
> > > +  NULL);
> > >
> > > sprintf(shader_name, "meta_blit_depth_fs.%d", tex_dim);
> > > b.shader->info->name = ralloc_strdup(b.shader,
> > > shader_name);
> > > @@ -175,7 +178,8 @@ build_nir_copy_fragment_shader_stencil(enum
> > > glsl_sampler_dim tex_dim)
> > > const struct glsl_type *vec4 = glsl_vec4_type();
> > > nir_builder b;
> > >
> > > -   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_FRAGMENT, NULL);
> > > +   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_FRAGMENT, NULL,
> > > +  NULL);
> > >
> > > sprintf(shader_name, "meta_blit_stencil_fs.%d", tex_dim);
> > > b.shader->info->name = ralloc_strdup(b.shader,
> > > shader_name);
> > > diff --git a/src/amd/vulkan/radv_meta_blit2d.c
> > > b/src/amd/vulkan/radv_meta_blit2d.c
> > > index 6e92f80..bed03a3 100644
> > > --- a/src/amd/vulkan/radv_meta_blit2d.c
> > > +++ b/src/amd/vulkan/radv_meta_blit2d.c
> > > @@ -438,7 +438,8 @@ build_nir_vertex_shader(void)
> > > const struct glsl_type *vec2 =
> > > glsl_vector_type(GLSL_TYPE_FLOAT, 2);
> > > nir_builder b;
> > >
> > > -   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_VERTEX, NULL);
> > > +   nir_builder_init_simple_shader(&b, NULL,
> > > MESA_SHADER_VERTEX, NULL,
> > > +  NULL);
> > > b.shader->info->name = ralloc_strdup(b.shader,
> > > "meta_blit_vs");
> > >
> > > nir_variable *pos_in = nir_variable_create(b.shader,
> > > nir_var_shader_in,
> > 

[Mesa-dev] [PATCH 1/2] st/nine: Fix leak with integer and boolean constants

2016-10-18 Thread Axel Davy
Leak introduced by:
a83dce01284f220b1bf932774730e13fca6cdd20

The patch also moves the part to
release changed.vs_const_i and changed.vs_const_b
before the if (!cb.buffer_size) check,
to avoid reuploading every draw call if
integer or boolean constants are dirty, but the shaders
use no constants.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_state.c | 39 +---
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index f6bf51e..ea72c77 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -126,7 +126,6 @@ prepare_vs_constants_userbuf_swvp(struct NineDevice9 
*device)
 cb.user_buffer = state->vs_const_i;
 
 state->pipe.cb2_swvp = cb;
-state->changed.vs_const_i = 0;
 }
 
 if (state->changed.vs_const_b || state->changed.group & NINE_STATE_SWVP) {
@@ -138,7 +137,6 @@ prepare_vs_constants_userbuf_swvp(struct NineDevice9 
*device)
 cb.user_buffer = state->vs_const_b;
 
 state->pipe.cb3_swvp = cb;
-state->changed.vs_const_b = 0;
 }
 
 if (!device->driver_caps.user_cbufs) {
@@ -236,14 +234,30 @@ prepare_vs_constants_userbuf(struct NineDevice9 *device)
 if (state->changed.vs_const_i || state->changed.group & NINE_STATE_SWVP) {
 int *idst = (int *)>vs_const_f[4 * device->max_vs_const_f];
 memcpy(idst, state->vs_const_i, NINE_MAX_CONST_I * sizeof(int[4]));
-state->changed.vs_const_i = 0;
 }
 
 if (state->changed.vs_const_b || state->changed.group & NINE_STATE_SWVP) {
 int *idst = (int *)&state->vs_const_f[4 * device->max_vs_const_f];
 uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
 memcpy(bdst, state->vs_const_b, NINE_MAX_CONST_B * sizeof(BOOL));
-state->changed.vs_const_b = 0;
+}
+
+if (device->state.changed.vs_const_i) {
+struct nine_range *r = device->state.changed.vs_const_i;
+struct nine_range *p = r;
+while (p->next)
+p = p->next;
+nine_range_pool_put_chain(&device->range_pool, r, p);
+device->state.changed.vs_const_i = NULL;
+}
+
+if (device->state.changed.vs_const_b) {
+struct nine_range *r = device->state.changed.vs_const_b;
+struct nine_range *p = r;
+while (p->next)
+p = p->next;
+nine_range_pool_put_chain(&device->range_pool, r, p);
+device->state.changed.vs_const_b = NULL;
 }
 
 if (!cb.buffer_size)
@@ -290,23 +304,6 @@ prepare_vs_constants_userbuf(struct NineDevice9 *device)
 device->state.changed.vs_const_f = NULL;
 }
 
-if (device->state.changed.vs_const_i) {
-struct nine_range *r = device->state.changed.vs_const_i;
-struct nine_range *p = r;
-while (p->next)
-p = p->next;
-nine_range_pool_put_chain(&device->range_pool, r, p);
-device->state.changed.vs_const_i = NULL;
-}
-
-if (device->state.changed.vs_const_b) {
-struct nine_range *r = device->state.changed.vs_const_b;
-struct nine_range *p = r;
-while (p->next)
-p = p->next;
-nine_range_pool_put_chain(&device->range_pool, r, p);
-device->state.changed.vs_const_b = NULL;
-}
 state->changed.group &= ~NINE_STATE_VS_CONST;
 state->commit |= NINE_STATE_COMMIT_CONST_VS;
 }
-- 
2.10.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
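
The pattern the patch above relies on can be sketched in isolation. This is a minimal illustration, assuming simplified stand-ins: `struct range`, `struct range_pool`, `pool_put_chain` and `release_dirty_ranges` are hypothetical names, not the nine state tracker's own types or functions. The point is that a dirty-range list has to be walked to its tail and handed back to the pool before the head pointer is cleared; clearing the head alone would drop the only reference to the nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a range node and its recycling pool. */
struct range {
    struct range *next;
    int bgn, end;
};

struct range_pool {
    struct range *free_list;
};

/* Return the whole chain first..last to the pool's free list. */
static void pool_put_chain(struct range_pool *pool,
                           struct range *first, struct range *last)
{
    assert(pool && first && last);
    last->next = pool->free_list;
    pool->free_list = first;
}

/* Release a dirty-range list and only then clear the head pointer. */
static void release_dirty_ranges(struct range_pool *pool, struct range **head)
{
    struct range *first = *head;
    struct range *last = first;

    if (!first)
        return;
    while (last->next)          /* find the tail of the chain */
        last = last->next;
    pool_put_chain(pool, first, last);
    *head = NULL;
}
```

Doing the release before any early return, as the patch does with the `cb.buffer_size` check, keeps the list from staying dirty on every draw call.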


[Mesa-dev] [PATCH 2/2] st/nine: Fix leak at device dtor

2016-10-18 Thread Axel Davy
The datastructures to track dirty constants
weren't freed.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index c0a3c39..d7f3a40 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -481,6 +481,8 @@ void
 NineDevice9_dtor( struct NineDevice9 *This )
 {
 unsigned i;
+struct nine_range *r;
+struct nine_range_pool *pool = &This->base.device->range_pool;
 
 DBG("This=%p\n", This);
 
@@ -514,6 +516,23 @@ NineDevice9_dtor( struct NineDevice9 *This )
 FREE(This->state.vs_const_b);
 FREE(This->state.vs_const_f_swvp);
 
+if (This->state.changed.ps_const_f) {
+for (r = This->state.changed.ps_const_f; r->next; r = r->next);
+nine_range_pool_put_chain(pool, This->state.changed.ps_const_f, r);
+}
+if (This->state.changed.vs_const_f) {
+for (r = This->state.changed.vs_const_f; r->next; r = r->next);
+nine_range_pool_put_chain(pool, This->state.changed.vs_const_f, r);
+}
+if (This->state.changed.vs_const_i) {
+for (r = This->state.changed.vs_const_i; r->next; r = r->next);
+nine_range_pool_put_chain(pool, This->state.changed.vs_const_i, r);
+}
+if (This->state.changed.vs_const_b) {
+for (r = This->state.changed.vs_const_b; r->next; r = r->next);
+nine_range_pool_put_chain(pool, This->state.changed.vs_const_b, r);
+}
+
 if (This->swapchains) {
 for (i = 0; i < This->nswapchains; ++i)
 if (This->swapchains[i])
-- 
2.10.0



Re: [Mesa-dev] [PATCH 10/25] mesa/nir/radv/anv: add shader_info param to nir_shader builder

2016-10-18 Thread Timothy Arceri
On Tue, 2016-10-18 at 08:47 -0700, Jason Ekstrand wrote:
> On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri  abora.com> wrote:
> > And pass in a pointer to the shader info in gl_program for ARB
> > programs.
> > ---
> >  src/amd/vulkan/radv_meta_blit.c               | 12 
> >  src/amd/vulkan/radv_meta_blit2d.c             | 12 
> >  src/amd/vulkan/radv_meta_buffer.c             |  6 --
> >  src/amd/vulkan/radv_meta_bufimage.c           |  3 ++-
> >  src/amd/vulkan/radv_meta_clear.c              | 12 
> >  src/amd/vulkan/radv_meta_decompress.c         |  6 --
> >  src/amd/vulkan/radv_meta_fast_clear.c         |  6 --
> >  src/amd/vulkan/radv_meta_resolve.c            |  6 --
> >  src/amd/vulkan/radv_meta_resolve_cs.c         |  2 +-
> >  src/amd/vulkan/radv_pipeline.c                |  2 +-
> >  src/compiler/nir/nir_builder.h                |  5 +++--
> >  src/compiler/nir/tests/control_flow_tests.cpp |  3 ++-
> >  src/gallium/auxiliary/nir/tgsi_to_nir.c       |  2 +-
> >  src/intel/blorp/blorp_blit.c                  |  2 +-
> >  src/intel/blorp/blorp_clear.c                 |  2 +-
> >  src/mesa/drivers/dri/i965/brw_program.c       |  2 +-
> >  src/mesa/drivers/dri/i965/brw_program.h       |  2 +-
> >  src/mesa/drivers/dri/i965/brw_tcs.c           |  3 ++-
> >  src/mesa/program/prog_to_nir.c                |  5 +++--
> >  src/mesa/program/prog_to_nir.h                |  2 +-
> >  20 files changed, 60 insertions(+), 35 deletions(-)
> > 
> > diff --git a/src/amd/vulkan/radv_meta_blit.c
> > b/src/amd/vulkan/radv_meta_blit.c
> > index bfbf880..3eda43e 100644
> > --- a/src/amd/vulkan/radv_meta_blit.c
> > +++ b/src/amd/vulkan/radv_meta_blit.c
> > @@ -37,7 +37,8 @@ build_nir_vertex_shader(void)
> >         const struct glsl_type *vec4 = glsl_vec4_type();
> >         nir_builder b;
> > 
> > -       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_VERTEX, NULL);
> > +       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_VERTEX, NULL,
> > +                                      NULL);
> >         b.shader->info->name = ralloc_strdup(b.shader,
> > "meta_blit_vs");
> > 
> >         nir_variable *pos_in = nir_variable_create(b.shader,
> > nir_var_shader_in,
> > @@ -67,7 +68,8 @@ build_nir_copy_fragment_shader(enum
> > glsl_sampler_dim tex_dim)
> >         const struct glsl_type *vec4 = glsl_vec4_type();
> >         nir_builder b;
> > 
> > -       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL);
> > +       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL,
> > +                                      NULL);
> > 
> >         sprintf(shader_name, "meta_blit_fs.%d", tex_dim);
> >         b.shader->info->name = ralloc_strdup(b.shader,
> > shader_name);
> > @@ -121,7 +123,8 @@ build_nir_copy_fragment_shader_depth(enum
> > glsl_sampler_dim tex_dim)
> >         const struct glsl_type *vec4 = glsl_vec4_type();
> >         nir_builder b;
> > 
> > -       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL);
> > +       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL,
> > +                                      NULL);
> > 
> >         sprintf(shader_name, "meta_blit_depth_fs.%d", tex_dim);
> >         b.shader->info->name = ralloc_strdup(b.shader,
> > shader_name);
> > @@ -175,7 +178,8 @@ build_nir_copy_fragment_shader_stencil(enum
> > glsl_sampler_dim tex_dim)
> >         const struct glsl_type *vec4 = glsl_vec4_type();
> >         nir_builder b;
> > 
> > -       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL);
> > +       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL,
> > +                                      NULL);
> > 
> >         sprintf(shader_name, "meta_blit_stencil_fs.%d", tex_dim);
> >         b.shader->info->name = ralloc_strdup(b.shader,
> > shader_name);
> > diff --git a/src/amd/vulkan/radv_meta_blit2d.c
> > b/src/amd/vulkan/radv_meta_blit2d.c
> > index 6e92f80..bed03a3 100644
> > --- a/src/amd/vulkan/radv_meta_blit2d.c
> > +++ b/src/amd/vulkan/radv_meta_blit2d.c
> > @@ -438,7 +438,8 @@ build_nir_vertex_shader(void)
> >         const struct glsl_type *vec2 =
> > glsl_vector_type(GLSL_TYPE_FLOAT, 2);
> >         nir_builder b;
> > 
> > -       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_VERTEX, NULL);
> > +       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_VERTEX, NULL,
> > +                                      NULL);
> >         b.shader->info->name = ralloc_strdup(b.shader,
> > "meta_blit_vs");
> > 
> >         nir_variable *pos_in = nir_variable_create(b.shader,
> > nir_var_shader_in,
> > @@ -573,7 +574,8 @@ build_nir_copy_fragment_shader(struct
> > radv_device *device,
> >         const struct glsl_type *vec2 =
> > glsl_vector_type(GLSL_TYPE_FLOAT, 2);
> >         nir_builder b;
> > 
> > -       nir_builder_init_simple_shader(&b, NULL,
> > MESA_SHADER_FRAGMENT, NULL);
> > +       

Re: [Mesa-dev] [PATCH 2/2] i965: Reorder PCI ID list to match release order

2016-10-18 Thread Dylan Baker
Quoting Ben Widawsky (2016-10-18 13:50:08)
> I have some OCD...
> 
> Signed-off-by: Ben Widawsky 
> ---
>  include/pci_ids/i965_pci_ids.h | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index a93228d..e482007 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -109,6 +109,10 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 
> (Broadwell GT3e)")
>  CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
>  CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
>  CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
> +CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
> +CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* 
> Overridden in brw_get_renderer_string */
> +CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
> +CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
>  CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
>  CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
> @@ -134,6 +138,11 @@ CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 
> (Skylake GT4e)")
>  CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
>  CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")
>  CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
> +CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
> +CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
> +CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
> +CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
> +CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
>  CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
>  CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
>  CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
> @@ -154,12 +163,3 @@ CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")
>  CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
>  CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3")
>  CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
> -CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
> -CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* 
> Overridden in brw_get_renderer_string */
> -CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
> -CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
> -CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
> -CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
> -CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
> -CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
> -CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
> -- 
> 2.10.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

I approve of your OCD,

Reviewed-by: Dylan Baker 
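
For readers unfamiliar with how a `CHIPSET(...)` list like the one reordered above is typically consumed, here is a small sketch of the X-macro technique. It is an assumption-laden illustration: `EXAMPLE_CHIPSETS`, `AS_TABLE_ENTRY`, `struct chipset_name` and `chipset_name_for` are hypothetical, and the three entries are copied from the quoted patch rather than from the full header. Since the lookup scans by ID, the order of entries does not affect the result in this sketch.

```c
#include <stddef.h>

/* Hypothetical excerpt of a CHIPSET(id, family, name) list. */
#define EXAMPLE_CHIPSETS(CHIPSET) \
    CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)") \
    CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)") \
    CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kabylake GT2)")

struct chipset_name {
    unsigned id;
    const char *name;
};

/* Expand every list entry into one table row; the family is unused here. */
#define AS_TABLE_ENTRY(id, family, name) { id, name },
static const struct chipset_name chipset_names[] = {
    EXAMPLE_CHIPSETS(AS_TABLE_ENTRY)
};
#undef AS_TABLE_ENTRY

/* Linear lookup by PCI ID; returns NULL for an unknown ID. */
static const char *chipset_name_for(unsigned id)
{
    size_t count = sizeof(chipset_names) / sizeof(chipset_names[0]);

    for (size_t i = 0; i < count; i++) {
        if (chipset_names[i].id == id)
            return chipset_names[i].name;
    }
    return NULL;
}
```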




Re: [Mesa-dev] [PATCH 6/6] radeonsi: rename prefixes from radeon to si

2016-10-18 Thread Marek Olšák
On Tue, Oct 18, 2016 at 11:54 AM, Nicolai Hähnle  wrote:
> Makes sense as a cleanup. At some point it would make sense to look into
> sharing some stuff with radv instead. There's probably not a huge amount
> because of the NIR/TGSI split, but still.

I'm always for code sharing, but I don't know whether radv will ever
be considered important outside of the radv camp. If somebody else
wants to share code with radv, I'll gladly accept patches.

Marek


Re: [Mesa-dev] [PATCH 21/22] wsi: swap srgb/unorm around.

2016-10-18 Thread Dave Airlie
On 19 October 2016 at 06:33, Jason Ekstrand  wrote:
> NAKish... I specifically put them in that order to *cause* Talos to break.
> If we're going to support both UNORM and sRGB, then applications need to
> look at the formats they're getting and pick one intelligently rather than
> just using the first thing they find (which Talos does), especially if that
> app does its own gamma curves.  Apps that just grab the first thing they
> find probably "want" sRGB encoding done for them.  I've talked to the guys
> at Croteam and there is a Talos update in the pipe that fixes this.

you might also need to talk to Sascha Willems.

Dave.

>
> On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie  wrote:
>>
>> From: Dave Airlie 
>>
>> This prevents a Talos regression before radv
>> starts using shared WSI.
>> ---
>>  src/vulkan/wsi/wsi_common_x11.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/vulkan/wsi/wsi_common_x11.c
>> b/src/vulkan/wsi/wsi_common_x11.c
>> index b5832c6..241ef42 100644
>> --- a/src/vulkan/wsi/wsi_common_x11.c
>> +++ b/src/vulkan/wsi/wsi_common_x11.c
>> @@ -135,8 +135,8 @@ wsi_x11_get_connection(struct wsi_device *wsi_dev,
>>  }
>>
>>  static const VkSurfaceFormatKHR formats[] = {
>> -   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>> { .format = VK_FORMAT_B8G8R8A8_UNORM, },
>> +   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>>  };
>>
>>  static const VkPresentModeKHR present_modes[] = {
>> --
>> 2.5.5
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>


[Mesa-dev] [PATCH 1/2] i965: Add some APL and KBL SKU strings

2016-10-18 Thread Ben Widawsky
We got a couple for products that exist on ark.intel.com, so let's just
put them in now.

Signed-off-by: Ben Widawsky 
---
 include/pci_ids/i965_pci_ids.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 1566afd..a93228d 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -144,11 +144,11 @@ CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
 CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
 CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
 CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2")
-CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Intel(R) Kabylake GT2)")
 CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2")
 CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")
 CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")
-CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kabylake GT2)")
 CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
 CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")
 CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
@@ -161,5 +161,5 @@ CHIPSET(0x22B3, chv, "Intel(R) HD Graphics 
(Cherryview)")
 CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
 CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
 CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
-CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics (Broxton)")
-CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
+CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
+CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
-- 
2.10.0



[Mesa-dev] [PATCH 2/2] i965: Reorder PCI ID list to match release order

2016-10-18 Thread Ben Widawsky
I have some OCD...

Signed-off-by: Ben Widawsky 
---
 include/pci_ids/i965_pci_ids.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index a93228d..e482007 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -109,6 +109,10 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 
(Broadwell GT3e)")
 CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
 CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
 CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
+CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
+CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden 
in brw_get_renderer_string */
+CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
+CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
 CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
@@ -134,6 +138,11 @@ CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 
(Skylake GT4e)")
 CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
 CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")
 CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
+CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
+CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
+CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
+CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
+CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
 CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
@@ -154,12 +163,3 @@ CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")
 CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
 CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3")
 CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
-CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
-CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden 
in brw_get_renderer_string */
-CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
-CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
-CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
-CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
-CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
-CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
-CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
-- 
2.10.0



Re: [Mesa-dev] anv/radv: WSI sharing code

2016-10-18 Thread Jason Ekstrand
I've dug through the whole thing now.  I'm not a fan of patch 21 (re-order
UNORM and sRGB) and gave detailed comments on it.  The rest are

Reviewed-by: Jason Ekstrand 

My only other real comment is that I think I'd rather we put a bit more
stuff in wsi_device so we're not passing so much around.  In particular,
the allocation functions and format functions could go there and maybe an
alloc.  If you wanted to clean that up as a follow-on patch, that's fine,
but I would like it cleaned up if you don't mind.

--Jason

On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie  wrote:

> This series builds on top of the previous sharing patches I sent.
>
> The aim here is to share the X11 and wayland WSI code between
> the two vulkan drivers so we have a consistent implementation and
> one place to fix bugs.
>
> The series modifies the anv code in place until it's suitable
> for sharing, then it moves it to shared directory, and ports
> radv to use it.
>
> The final code leaves the WSI APIs in the drivers, but they
> call directly into the shared code once they shed their driver
> specific structs, and pick a pAllocator.
>
> Dave.
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 21/22] wsi: swap srgb/unorm around.

2016-10-18 Thread Jason Ekstrand
NAKish... I specifically put them in that order to *cause* Talos to break.
If we're going to support both UNORM and sRGB, then applications need to
look at the formats they're getting and pick one intelligently rather than
just using the first thing they find (which Talos does), especially if that
app does its own gamma curves.  Apps that just grab the first thing they
find probably "want" sRGB encoding done for them.  I've talked to the guys
at Croteam and there is a Talos update in the pipe that fixes this.
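
As a rough sketch of what "pick one intelligently" could look like on the application side: the code below scans the reported formats for the one the app wants instead of taking entry 0. The types are hypothetical stand-ins (`enum fmt`, `struct surface_format`, `choose_format`), not the Vulkan headers; a real application would apply the same loop to the list it gets back from the surface format query.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the format enum and surface format struct. */
enum fmt { FMT_B8G8R8A8_UNORM, FMT_B8G8R8A8_SRGB, FMT_OTHER };

struct surface_format {
    enum fmt format;
};

/* Return the index of `wanted` in the reported list.  Falls back to the
 * first entry only when the wanted format is not offered at all. */
static size_t choose_format(const struct surface_format *formats,
                            size_t count, enum fmt wanted)
{
    assert(formats && count > 0);
    for (size_t i = 0; i < count; i++) {
        if (formats[i].format == wanted)
            return i;
    }
    return 0;
}
```

An app that applies its own gamma curve would ask for the UNORM format here, while one that expects the hardware to encode would ask for sRGB; either way the result no longer depends on the order the driver reports.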

On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> This prevents a Talos regression before radv
> starts using shared WSI.
> ---
>  src/vulkan/wsi/wsi_common_x11.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
> index b5832c6..241ef42 100644
> --- a/src/vulkan/wsi/wsi_common_x11.c
> +++ b/src/vulkan/wsi/wsi_common_x11.c
> @@ -135,8 +135,8 @@ wsi_x11_get_connection(struct wsi_device *wsi_dev,
>  }
>
>  static const VkSurfaceFormatKHR formats[] = {
> -   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
> { .format = VK_FORMAT_B8G8R8A8_UNORM, },
> +   { .format = VK_FORMAT_B8G8R8A8_SRGB, },
>  };
>
>  static const VkPresentModeKHR present_modes[] = {
> --
> 2.5.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 3/3] [Bug 38970] [bisected]piglit glx/glx-pixmap-multi failed

2016-10-18 Thread Nicolai Hähnle

On 18.10.2016 19:23, Ian Romanick wrote:

On 09/29/2016 01:55 PM, Anutex wrote:

I tried to debug this issue by changing the condition to check only for bad
magic and error, and the test passed.

I am not sure, though, what the correct behaviour is when we hit this condition.
Maybe we need a separate check for the case where the hash table already has
the bucket data.
---
 src/glx/dri2_glx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index af388d9..a1fd9ff 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -411,12 +411,13 @@ dri2CreateDrawable(struct glx_screen *base, XID xDrawable,
   return NULL;
}

-   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw)) {
+   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw) == -1) {


I'm not 100% sure the existing code is wrong.  __glxHashInsert returns
-1 for an error, and it returns 1 if the key is already in the hash
table.  In that case we'll leak the memory for the new pdraw, right?
That also seems bad.

It seems like instead the code should look up xDrawable in the hash
table and return the value that's already there.  Maybe.  I haven't
looked at this code in years, so I may be forgetting some subtlety.
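
To make the three-way return convention concrete, here is a small sketch
with a toy table.  It is not the __glxHash implementation; toy_table,
toy_insert and toy_get_or_insert are hypothetical names, and only the
return convention (0 inserted, 1 key already present, -1 error) is taken
from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8

struct toy_table {
   unsigned long keys[TOY_SLOTS];
   void *values[TOY_SLOTS];
   int used[TOY_SLOTS];
};

/* 0 = inserted, 1 = key already present, -1 = error (table full here). */
static int
toy_insert(struct toy_table *t, unsigned long key, void *value)
{
   int free_slot = -1;

   for (int i = 0; i < TOY_SLOTS; i++) {
      if (t->used[i] && t->keys[i] == key)
         return 1;
      if (!t->used[i] && free_slot < 0)
         free_slot = i;
   }
   if (free_slot < 0)
      return -1;

   t->keys[free_slot] = key;
   t->values[free_slot] = value;
   t->used[free_slot] = 1;
   return 0;
}

static void *
toy_lookup(const struct toy_table *t, unsigned long key)
{
   for (int i = 0; i < TOY_SLOTS; i++) {
      if (t->used[i] && t->keys[i] == key)
         return t->values[i];
   }
   return NULL;
}

/* Hypothetical caller: on a duplicate key, hand back the existing value.
 * *took_fresh tells the caller whether it still owns (and must free)
 * the object it passed in. */
static void *
toy_get_or_insert(struct toy_table *t, unsigned long key, void *fresh,
                  int *took_fresh)
{
   switch (toy_insert(t, key, fresh)) {
   case 0:
      *took_fresh = 1;
      return fresh;
   case 1:
      *took_fresh = 0;
      return toy_lookup(t, key);
   default:
      *took_fresh = 0;
      return NULL;
   }
}
```

Whether returning the existing entry is actually right for GLX drawables
is a separate question; the sketch only shows that "== -1" and "!= 0"
select different cases.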


dri2DestroyDrawable destroys the pdraw though. It also removes the 
xDrawable entry in the hash table without checking whether it points at 
pdraw or not, so on the surface that looks pretty bogus if we create a 
GLXDrawable twice.


_However_, the real question is what the hash is used for in the first 
place. It looks to me like the hash is actually pretty pointless in the 
pixmap case. And it just so happens that the GLX spec forbids creating a 
GLXDrawable from a Window twice, but it doesn't forbid creating a 
GLXDrawable from a Pixmap twice.


Then again, my GLX knowledge is basically zero, so what do I know :)

Nicolai




   (*psc->core->destroyDrawable) (pdraw->driDrawable);
   DRI2DestroyDrawable(psc->base.dpy, xDrawable);
   free(pdraw);
   return None;
}
+   



Spurious whitespace change.


/*
 * Make sure server has the same swap interval we do for the new





Re: [Mesa-dev] [PATCH 14/22] anv/wsi: move further away from passing anv displays around

2016-10-18 Thread Jason Ekstrand
On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> ---
>  src/intel/vulkan/anv_wsi.c | 28 +++-
>  src/intel/vulkan/anv_wsi.h |  3 ++-
>  src/intel/vulkan/anv_wsi_wayland.c | 21 +++--
>  src/intel/vulkan/anv_wsi_x11.c | 22 +++---
>  4 files changed, 35 insertions(+), 39 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
> index 514a29f..89bf780 100644
> --- a/src/intel/vulkan/anv_wsi.c
> +++ b/src/intel/vulkan/anv_wsi.c
> @@ -253,17 +253,21 @@ VkResult anv_CreateSwapchainKHR(
> struct anv_wsi_interface *iface =
>device->instance->physicalDevice.wsi_device.wsi[surface->platform];
> struct anv_swapchain *swapchain;
> +   const VkAllocationCallbacks *alloc;
>
> -   VkResult result = iface->create_swapchain(surface, device,
> pCreateInfo,
> - pAllocator,
> _wsi_image_fns,
> +   if (pAllocator)
> + alloc = pAllocator;
> +   else
> + alloc = >alloc;
> +   VkResult result = iface->create_swapchain(surface, _device,
> + >instance->
> physicalDevice.wsi_device,
> + pCreateInfo,
> + alloc, _wsi_image_fns,
>   );
> if (result != VK_SUCCESS)
>return result;
>
> -   if (pAllocator)
> -  swapchain->alloc = *pAllocator;
> -   else
> -  swapchain->alloc = device->alloc;
> +   swapchain->alloc = *alloc;
>
> for (unsigned i = 0; i < ARRAY_SIZE(swapchain->fences); i++)
>swapchain->fences[i] = VK_NULL_HANDLE;
> @@ -274,18 +278,24 @@ VkResult anv_CreateSwapchainKHR(
>  }
>
>  void anv_DestroySwapchainKHR(
> -VkDevice device,
> +VkDevice _device,
>  VkSwapchainKHR   _swapchain,
>  const VkAllocationCallbacks* pAllocator)
>  {
> +   ANV_FROM_HANDLE(anv_device, device, _device);
> ANV_FROM_HANDLE(anv_swapchain, swapchain, _swapchain);
> +   const VkAllocationCallbacks *alloc;
>
> +   if (pAllocator)
> + alloc = pAllocator;
> +   else
> + alloc = >alloc;
>

This isn't needed.  The client is required to pass the same allocator in
(if any) to this function as it does to Create.  We can just use
swapchain->alloc
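
A minimal sketch of that idea, assuming the allocator is copied into the
object at create time.  The toy_alloc and toy_swapchain types are made up
for this example and are not the anv structures; the id field exists only
so the two allocators can be told apart.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for VkAllocationCallbacks. */
struct toy_alloc {
   void *(*alloc)(size_t size);
   void (*free)(void *ptr);
   int id;
};

struct toy_swapchain {
   struct toy_alloc alloc;   /* allocator chosen at create time */
};

static struct toy_swapchain *
toy_create_swapchain(const struct toy_alloc *app_alloc,
                     const struct toy_alloc *device_alloc)
{
   /* Pick the allocator once, here. */
   const struct toy_alloc *a = app_alloc ? app_alloc : device_alloc;
   struct toy_swapchain *chain = a->alloc(sizeof(*chain));

   if (!chain)
      return NULL;
   chain->alloc = *a;
   return chain;
}

static void
toy_destroy_swapchain(struct toy_swapchain *chain)
{
   /* No pAllocator-or-device selection needed: reuse the stored one.
    * Copy it first because it lives inside the memory being freed. */
   struct toy_alloc a = chain->alloc;

   a.free(chain);
}
```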


> for (unsigned i = 0; i < ARRAY_SIZE(swapchain->fences); i++) {
>if (swapchain->fences[i] != VK_NULL_HANDLE)
> - anv_DestroyFence(device, swapchain->fences[i], pAllocator);
> + anv_DestroyFence(_device, swapchain->fences[i], pAllocator);
> }
>
> -   swapchain->destroy(swapchain, pAllocator);
> +   swapchain->destroy(swapchain, alloc);
>  }
>
>  VkResult anv_GetSwapchainImagesKHR(
> diff --git a/src/intel/vulkan/anv_wsi.h b/src/intel/vulkan/anv_wsi.h
> index 2548e41..236133c 100644
> --- a/src/intel/vulkan/anv_wsi.h
> +++ b/src/intel/vulkan/anv_wsi.h
> @@ -60,7 +60,8 @@ struct anv_wsi_interface {
>   uint32_t* pPresentModeCount,
>   VkPresentModeKHR* pPresentModes);
> VkResult (*create_swapchain)(VkIcdSurfaceBase *surface,
> -struct anv_device *device,
> +VkDevice device,
> +struct anv_wsi_device *wsi_device,
>  const VkSwapchainCreateInfoKHR*
> pCreateInfo,
>  const VkAllocationCallbacks* pAllocator,
>  const struct anv_wsi_image_fns *image_fns,
> diff --git a/src/intel/vulkan/anv_wsi_wayland.c
> b/src/intel/vulkan/anv_wsi_wayland.c
> index e56b3be..16a9647 100644
> --- a/src/intel/vulkan/anv_wsi_wayland.c
> +++ b/src/intel/vulkan/anv_wsi_wayland.c
> @@ -422,14 +422,6 @@ wsi_wl_surface_get_present_modes(VkIcdSurfaceBase
> *surface,
> return VK_SUCCESS;
>  }
>
> -static VkResult
> -wsi_wl_surface_create_swapchain(VkIcdSurfaceBase *surface,
> -struct anv_device *device,
> -const VkSwapchainCreateInfoKHR*
> pCreateInfo,
> -const VkAllocationCallbacks* pAllocator,
> -const struct anv_wsi_image_fns *image_fns,
> -struct anv_swapchain **swapchain);
> -
>  VkResult anv_CreateWaylandSurfaceKHR(
>  VkInstance  _instance,
>  const VkWaylandSurfaceCreateInfoKHR*pCreateInfo,
> @@ -650,7 +642,7 @@ wsi_wl_swapchain_destroy(struct anv_swapchain
> *anv_chain,
>   const VkAllocationCallbacks *pAllocator)
>  {
> struct wsi_wl_swapchain *chain = (struct wsi_wl_swapchain *)anv_chain;
> -   struct anv_device *device = 

Re: [Mesa-dev] [PATCH 06/22] anv/wsi/x11: push anv_device out of the init/finish routines

2016-10-18 Thread Jason Ekstrand
Feel free to shove an alloc in wsi_device.  Might make some of this a bit
simpler.  I guess we usually shove one in wsi_implementation so it's not a
big deal.

On Sun, Oct 16, 2016 at 9:24 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> ---
>  src/intel/vulkan/anv_wsi.c |  6 +++---
>  src/intel/vulkan/anv_wsi.h |  6 --
>  src/intel/vulkan/anv_wsi_x11.c | 22 --
>  3 files changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
> index 56ed3ec..767fa79 100644
> --- a/src/intel/vulkan/anv_wsi.c
> +++ b/src/intel/vulkan/anv_wsi.c
> @@ -31,7 +31,7 @@ anv_init_wsi(struct anv_physical_device *physical_device)
> memset(physical_device->wsi_device.wsi, 0,
> sizeof(physical_device->wsi_device.wsi));
>
>  #ifdef VK_USE_PLATFORM_XCB_KHR
> -   result = anv_x11_init_wsi(physical_device);
> +   result = anv_x11_init_wsi(_device->wsi_device,
> _device->instance->alloc);
> if (result != VK_SUCCESS)
>return result;
>  #endif
> @@ -40,7 +40,7 @@ anv_init_wsi(struct anv_physical_device *physical_device)
> result = anv_wl_init_wsi(physical_device);
> if (result != VK_SUCCESS) {
>  #ifdef VK_USE_PLATFORM_XCB_KHR
> -  anv_x11_finish_wsi(physical_device);
> +  anv_x11_finish_wsi(_device->wsi_device,
> _device->instance->alloc);
>  #endif
>return result;
> }
> @@ -56,7 +56,7 @@ anv_finish_wsi(struct anv_physical_device
> *physical_device)
> anv_wl_finish_wsi(physical_device);
>  #endif
>  #ifdef VK_USE_PLATFORM_XCB_KHR
> -   anv_x11_finish_wsi(physical_device);
> +   anv_x11_finish_wsi(_device->wsi_device,
> _device->instance->alloc);
>  #endif
>  }
>
> diff --git a/src/intel/vulkan/anv_wsi.h b/src/intel/vulkan/anv_wsi.h
> index 2bb8ee3..e1c8d02 100644
> --- a/src/intel/vulkan/anv_wsi.h
> +++ b/src/intel/vulkan/anv_wsi.h
> @@ -70,8 +70,10 @@ struct anv_swapchain {
>  ANV_DEFINE_NONDISP_HANDLE_CASTS(_VkIcdSurfaceBase, VkSurfaceKHR)
>  ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_swapchain, VkSwapchainKHR)
>
> -VkResult anv_x11_init_wsi(struct anv_physical_device *physical_device);
> -void anv_x11_finish_wsi(struct anv_physical_device *physical_device);
> +VkResult anv_x11_init_wsi(struct anv_wsi_device *wsi_device,
> +  const VkAllocationCallbacks *alloc);
> +void anv_x11_finish_wsi(struct anv_wsi_device *wsi_device,
> +const VkAllocationCallbacks *alloc);
>  VkResult anv_wl_init_wsi(struct anv_physical_device *physical_device);
>  void anv_wl_finish_wsi(struct anv_physical_device *physical_device);
>
> diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_
> x11.c
> index 595c922..ccaabea 100644
> --- a/src/intel/vulkan/anv_wsi_x11.c
> +++ b/src/intel/vulkan/anv_wsi_x11.c
> @@ -897,12 +897,13 @@ fail_register:
>  }
>
>  VkResult
> -anv_x11_init_wsi(struct anv_physical_device *device)
> +anv_x11_init_wsi(struct anv_wsi_device *wsi_device,
> + const VkAllocationCallbacks *alloc)
>  {
> struct wsi_x11 *wsi;
> VkResult result;
>
> -   wsi = vk_alloc(>instance->alloc, sizeof(*wsi), 8,
> +   wsi = vk_alloc(alloc, sizeof(*wsi), 8,
> VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
> if (!wsi) {
>result = vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> @@ -934,33 +935,34 @@ anv_x11_init_wsi(struct anv_physical_device *device)
> wsi->base.get_present_modes = x11_surface_get_present_modes;
> wsi->base.create_swapchain = x11_surface_create_swapchain;
>
> -   device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XCB] = >base;
> -   device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XLIB] = >base;
> +   wsi_device->wsi[VK_ICD_WSI_PLATFORM_XCB] = >base;
> +   wsi_device->wsi[VK_ICD_WSI_PLATFORM_XLIB] = >base;
>
> return VK_SUCCESS;
>
>  fail_mutex:
> pthread_mutex_destroy(>mutex);
>  fail_alloc:
> -   vk_free(>instance->alloc, wsi);
> +   vk_free(alloc, wsi);
>  fail:
> -   device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XCB] = NULL;
> -   device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XLIB] = NULL;
> +   wsi_device->wsi[VK_ICD_WSI_PLATFORM_XCB] = NULL;
> +   wsi_device->wsi[VK_ICD_WSI_PLATFORM_XLIB] = NULL;
>
> return result;
>  }
>
>  void
> -anv_x11_finish_wsi(struct anv_physical_device *device)
> +anv_x11_finish_wsi(struct anv_wsi_device *wsi_device,
> +   const VkAllocationCallbacks *alloc)
>  {
> struct wsi_x11 *wsi =
> -  (struct wsi_x11 *)device->wsi_device.wsi[VK_ICD_WSI_PLATFORM_XCB];
> +  (struct wsi_x11 *)wsi_device->wsi[VK_ICD_WSI_PLATFORM_XCB];
>
> if (wsi) {
>_mesa_hash_table_destroy(wsi->connections, NULL);
>
>pthread_mutex_destroy(>mutex);
>
> -  vk_free(>instance->alloc, wsi);
> +  vk_free(alloc, wsi);
> }
>  }
> --
> 2.5.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

[Mesa-dev] [Bug 98308] llvmpipe crashes with glxgears

2016-10-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98308

--- Comment #2 from Roland Scheidegger  ---
I'd be interested to know though why it fails, I don't think LTO should cause
such failures? Seems like it might be related to the threads created by
llvmpipe but I don't really see how.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98308] llvmpipe crashes with glxgears

2016-10-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98308

Marc Dietrich  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |NOTABUG

--- Comment #1 from Marc Dietrich  ---
This crash was caused by using (unsupported) compilation with LTO. Sorry for
the noise.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 11/25] mesa/i965/i915/r200: eliminate gl_vertex_program

2016-10-18 Thread Ian Romanick
I'd like to see two tiny changes:

1. A comment for the IsPositionInvariant field that it can only be true
for vertex programs.

2. An assertion or two like

assert(p->Target == GL_VERTEX_PROGRAM_ARB ||
   !p->IsPositionInvariant);

   in reasonable places.  I'm thinking:

   - Where it's assigned in src/mesa/program/arbprogparse.c

   - Where it's used in src/mesa/state_tracker/st_program.c,
 src/mesa/drivers/dri/i965/brw_program.c, and
 src/mesa/tnl/t_vb_program.c (both places).

I'd also support a follow-up patch that converts IsPositionInvariant
from GLboolean to bool. :)
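
For what it's worth, here is a sketch of the invariant I have in mind,
written against stand-in types so it compiles on its own.  toy_program and
the TOY_* targets are hypothetical; the real check would use gl_program,
GL_VERTEX_PROGRAM_ARB and the field as it ends up after this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the program targets (hypothetical). */
enum toy_target { TOY_VERTEX_PROGRAM, TOY_FRAGMENT_PROGRAM };

struct toy_program {
   enum toy_target Target;
   /* Can only be true for vertex programs. */
   bool IsPositionInvariant;
};

/* The condition the suggested assertions would check. */
static bool
toy_position_invariant_is_valid(const struct toy_program *p)
{
   return p->Target == TOY_VERTEX_PROGRAM || !p->IsPositionInvariant;
}

/* Example of asserting at the point of assignment. */
static void
toy_set_position_invariant(struct toy_program *p, bool value)
{
   p->IsPositionInvariant = value;
   assert(toy_position_invariant_is_valid(p));
}
```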

On 10/17/2016 11:12 PM, Timothy Arceri wrote:
> Here we move the only field in gl_vertex_program to the
> ARB program fields in gl_program.
> ---
>  src/mesa/drivers/common/meta.c   | 10 +--
>  src/mesa/drivers/common/meta.h   |  2 +-
>  src/mesa/drivers/dri/i915/i915_fragprog.c|  4 +-
>  src/mesa/drivers/dri/i965/brw_context.h  |  8 +--
>  src/mesa/drivers/dri/i965/brw_curbe.c|  2 +-
>  src/mesa/drivers/dri/i965/brw_draw.c |  4 +-
>  src/mesa/drivers/dri/i965/brw_program.c  |  5 +-
>  src/mesa/drivers/dri/i965/brw_vs.c   | 41 ++--
>  src/mesa/drivers/dri/i965/brw_vs_surface_state.c |  2 +-
>  src/mesa/drivers/dri/i965/gen6_vs_state.c|  4 +-
>  src/mesa/drivers/dri/r200/r200_context.h |  2 +-
>  src/mesa/drivers/dri/r200/r200_state_init.c  |  4 +-
>  src/mesa/drivers/dri/r200/r200_tcl.c |  2 +-
>  src/mesa/drivers/dri/r200/r200_vertprog.c| 82 
> 
>  src/mesa/main/arbprogram.c   | 19 +++---
>  src/mesa/main/context.c  |  8 +--
>  src/mesa/main/ff_fragment_shader.cpp |  2 +-
>  src/mesa/main/ffvertex_prog.c| 72 ++---
>  src/mesa/main/ffvertex_prog.h|  2 +-
>  src/mesa/main/mtypes.h   | 17 ++---
>  src/mesa/main/shared.c   |  5 +-
>  src/mesa/main/state.c| 26 
>  src/mesa/main/state.h|  2 +-
>  src/mesa/program/arbprogparse.c  | 46 ++---
>  src/mesa/program/arbprogparse.h  |  2 +-
>  src/mesa/program/prog_statevars.c|  8 +--
>  src/mesa/program/program.c   | 15 ++---
>  src/mesa/program/program.h   | 26 
>  src/mesa/program/programopt.c| 42 ++--
>  src/mesa/program/programopt.h|  2 +-
>  src/mesa/state_tracker/st_atom.c |  4 +-
>  src/mesa/state_tracker/st_atom_constbuf.c|  2 +-
>  src/mesa/state_tracker/st_atom_rasterizer.c  |  8 +--
>  src/mesa/state_tracker/st_atom_sampler.c |  2 +-
>  src/mesa/state_tracker/st_atom_shader.c  |  4 +-
>  src/mesa/state_tracker/st_atom_texture.c |  2 +-
>  src/mesa/state_tracker/st_cb_feedback.c  |  2 +-
>  src/mesa/state_tracker/st_cb_program.c   |  2 +-
>  src/mesa/state_tracker/st_debug.c|  4 +-
>  src/mesa/state_tracker/st_program.c  | 35 +-
>  src/mesa/state_tracker/st_program.h  |  4 +-
>  src/mesa/tnl/t_context.c |  4 +-
>  src/mesa/tnl/t_vb_program.c  | 24 +++
>  src/mesa/tnl/t_vp_build.c|  4 +-
>  src/mesa/vbo/vbo_exec_draw.c |  4 +-
>  src/mesa/vbo/vbo_save_draw.c |  4 +-
>  46 files changed, 264 insertions(+), 311 deletions(-)
> 
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index 890e98a..ab81eed 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -566,8 +566,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
>  
>if (ctx->Extensions.ARB_vertex_program) {
>   save->VertexProgramEnabled = ctx->VertexProgram.Enabled;
> - _mesa_reference_vertprog(ctx, >VertexProgram,
> -   ctx->VertexProgram.Current);
> + _mesa_reference_program(ctx, >VertexProgram,
> +  ctx->VertexProgram.Current);
>   _mesa_set_enable(ctx, GL_VERTEX_PROGRAM_ARB, GL_FALSE);
>}
>  
> @@ -945,9 +945,9 @@ _mesa_meta_end(struct gl_context *ctx)
>if (ctx->Extensions.ARB_vertex_program) {
>   _mesa_set_enable(ctx, GL_VERTEX_PROGRAM_ARB,
>save->VertexProgramEnabled);
> - _mesa_reference_vertprog(ctx, >VertexProgram.Current, 
> -  save->VertexProgram);
> -  _mesa_reference_vertprog(ctx, >VertexProgram, NULL);
> + _mesa_reference_program(ctx, >VertexProgram.Current,
> + save->VertexProgram);
> +  _mesa_reference_program(ctx, >VertexProgram, 

Re: [Mesa-dev] [PATCH] st/glsl_to_tgsi: sort input and output decls by TGSI index

2016-10-18 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Tue, Oct 18, 2016 at 6:06 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Fixes a regression introduced by commit 777dcf81b.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98307
> --
> Using std::sort here is quite a bit C++-ier than most parts of Mesa.
> I used it because the standard C library is being its usual lame self.
> If people think using qsort_r is fine from a portability point of view
> (it's a glibc-ism), I'd be happy to use that instead.
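
For comparison, a plain-C sketch of the same sort using standard qsort().
This is not what the patch does: it passes the mapping through a
file-scope pointer, which is only acceptable if the sort is never run
concurrently or re-entered.  toy_decl and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_decl {
   unsigned mesa_index;
};

/* Read by the comparator; not thread-safe and not reentrant. */
static const unsigned *toy_sort_mapping;

static int
toy_compare_by_slot(const void *pa, const void *pb)
{
   const struct toy_decl *a = pa;
   const struct toy_decl *b = pb;
   unsigned slot_a = toy_sort_mapping[a->mesa_index];
   unsigned slot_b = toy_sort_mapping[b->mesa_index];

   /* Yields -1, 0 or 1 without risking overflow from a subtraction. */
   return (slot_a > slot_b) - (slot_a < slot_b);
}

static void
toy_sort_decls_by_slot(struct toy_decl *decls, size_t count,
                       const unsigned *mapping)
{
   toy_sort_mapping = mapping;
   qsort(decls, count, sizeof(decls[0]), toy_compare_by_slot);
   toy_sort_mapping = NULL;
}
```

The functor used with std::sort carries the mapping with it, which avoids
the shared state entirely.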
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 28 
>  1 file changed, 28 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index f49a873..406f4d5 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -48,20 +48,21 @@
>  #include "tgsi/tgsi_ureg.h"
>  #include "tgsi/tgsi_info.h"
>  #include "util/u_math.h"
>  #include "util/u_memory.h"
>  #include "st_program.h"
>  #include "st_mesa_to_tgsi.h"
>  #include "st_format.h"
>  #include "st_glsl_types.h"
>  #include "st_nir.h"
>
> +#include 
>
>  #define PROGRAM_ANY_CONST ((1 << PROGRAM_STATE_VAR) |\
> (1 << PROGRAM_CONSTANT) | \
> (1 << PROGRAM_UNIFORM))
>
>  #define MAX_GLSL_TEXTURE_OFFSET 4
>
>  class st_src_reg;
>  class st_dst_reg;
>
> @@ -6092,20 +6093,43 @@ emit_compute_block_size(const struct gl_program 
> *program,
>(const struct gl_compute_program *)program;
>
> ureg_property(ureg, TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH,
> cp->LocalSize[0]);
> ureg_property(ureg, TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT,
> cp->LocalSize[1]);
> ureg_property(ureg, TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH,
> cp->LocalSize[2]);
>  }
>
> +struct sort_inout_decls {
> +   bool operator()(const struct inout_decl , const struct inout_decl ) 
> const {
> +  return mapping[a.mesa_index] < mapping[b.mesa_index];
> +   }
> +
> +   const GLuint *mapping;
> +};
> +
> +/* Sort the given array of decls by the corresponding slot (TGSI file index).
> + *
> + * This is for the benefit of older drivers which are broken when the
> + * declarations aren't sorted in this way.
> + */
> +static void
> +sort_inout_decls_by_slot(struct inout_decl *decls,
> + unsigned count,
> + const GLuint mapping[])
> +{
> +   sort_inout_decls sorter;
> +   sorter.mapping = mapping;
> +   std::sort(decls, decls + count, sorter);
> +}
> +
>  /**
>   * Translate intermediate IR (glsl_to_tgsi_instruction) to TGSI format.
>   * \param program  the program to translate
>   * \param numInputs  number of input registers used
>   * \param inputMapping  maps Mesa fragment program inputs to TGSI generic
>   *  input indexes
>   * \param inputSemanticName  the TGSI_SEMANTIC flag for each input
>   * \param inputSemanticIndex  the semantic index (ex: which texcoord) for
>   *each input
>   * \param interpMode  the TGSI_INTERPOLATE_LINEAR/PERSP mode for each input
> @@ -6164,20 +6188,22 @@ st_translate_program(
>calloc(t->num_temp_arrays, sizeof(t->arrays[0]));
>
> /*
>  * Declare input attributes.
>  */
> switch (procType) {
> case PIPE_SHADER_FRAGMENT:
> case PIPE_SHADER_GEOMETRY:
> case PIPE_SHADER_TESS_EVAL:
> case PIPE_SHADER_TESS_CTRL:
> +  sort_inout_decls_by_slot(program->inputs, program->num_inputs, 
> inputMapping);
> +
>for (i = 0; i < program->num_inputs; ++i) {
>   struct inout_decl *decl = >inputs[i];
>   unsigned slot = inputMapping[decl->mesa_index];
>   struct ureg_src src;
>   ubyte tgsi_usage_mask = decl->usage_mask;
>
>   if (glsl_base_type_is_64bit(decl->base_type)) {
>  if (tgsi_usage_mask == 1)
> tgsi_usage_mask = TGSI_WRITEMASK_XY;
>  else if (tgsi_usage_mask == 2)
> @@ -6216,20 +6242,22 @@ st_translate_program(
>  * Declare output attributes.
>  */
> switch (procType) {
> case PIPE_SHADER_FRAGMENT:
> case PIPE_SHADER_COMPUTE:
>break;
> case PIPE_SHADER_GEOMETRY:
> case PIPE_SHADER_TESS_EVAL:
> case PIPE_SHADER_TESS_CTRL:
> case PIPE_SHADER_VERTEX:
> +  sort_inout_decls_by_slot(program->outputs, program->num_outputs, 
> outputMapping);
> +
>for (i = 0; i < program->num_outputs; ++i) {
>   struct inout_decl *decl = >outputs[i];
>   unsigned slot = outputMapping[decl->mesa_index];
>   struct ureg_dst dst;
>   ubyte tgsi_usage_mask = decl->usage_mask;
>
>   if (glsl_base_type_is_64bit(decl->base_type)) {
>  if (tgsi_usage_mask == 1)
> tgsi_usage_mask = TGSI_WRITEMASK_XY;

Re: [Mesa-dev] [PATCH 20/22] anv: move to using shared wsi code

2016-10-18 Thread Emil Velikov
Hi Dave,

Thanks for doing this. It'd be great to get an Ack from the Intel
devs on the idea.

Afaics with 22/22 in place you can drop the vk_alloc2/vk_free2
functions since they are no longer used.

Just an extra (small) suggestion below:

On 17 October 2016 at 05:24, Dave Airlie  wrote:

>  delete mode 100644 src/intel/vulkan/wsi_common.h
>  delete mode 100644 src/intel/vulkan/wsi_common_wayland.c
>  delete mode 100644 src/intel/vulkan/wsi_common_wayland.h
>  delete mode 100644 src/intel/vulkan/wsi_common_x11.c
>  delete mode 100644 src/intel/vulkan/wsi_common_x11.h
>  create mode 100644 src/vulkan/wsi/Makefile.am
>  create mode 100644 src/vulkan/wsi/Makefile.sources
>  create mode 100644 src/vulkan/wsi/wsi_common.h
>  create mode 100644 src/vulkan/wsi/wsi_common_wayland.c
>  create mode 100644 src/vulkan/wsi/wsi_common_wayland.h
>  create mode 100644 src/vulkan/wsi/wsi_common_x11.c
>  create mode 100644 src/vulkan/wsi/wsi_common_x11.h
>
Can you use git format-patch -M (or $ git config --global diff.renames
true) so that the diff is friendlier?


> diff --git a/configure.ac b/configure.ac
> index 37cc306..688459b 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2854,7 +2854,8 @@ AC_CONFIG_FILES([Makefile
> src/mesa/main/tests/Makefile
> src/util/Makefile
> src/util/tests/hash_table/Makefile
> -   src/vulkan/Makefile])
> +   src/vulkan/Makefile
> +   src/vulkan/wsi/Makefile])
>
Just fold the new Makefile into the existing one? It should be as
simple as adding the wsi/ prefix to the files.
Alternatively we can do that as a follow-up.

Emil


Re: [Mesa-dev] [PATCH 07/11] vulkan: add vk_alloc.h shared allocation inlines.

2016-10-18 Thread Jason Ekstrand
We already talked on IRC about putting vk_alloc.h in src/util.  Assuming
that's done, the series is

Acked-by: Jason Ekstrand 

Please make sure you do a fairly complete (fedora config?) build test.  I
don't want those MIN/MAX macros to cause problems.

--Jason

On Sun, Oct 16, 2016 at 7:07 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> vulkan allocation allows for overriding the allocator used,
> add some macros for anv/radv to share for this.
>
> Signed-off-by: Dave Airlie 
> ---
>  configure.ac |  5 ++-
>  src/Makefile.am  |  4 +++
>  src/vulkan/Makefile.am   | 26 +++
>  src/vulkan/Makefile.sources  |  2 ++
>  src/vulkan/common/vk_alloc.h | 75 ++
> ++
>  5 files changed, 111 insertions(+), 1 deletion(-)
>  create mode 100644 src/vulkan/Makefile.am
>  create mode 100644 src/vulkan/Makefile.sources
>  create mode 100644 src/vulkan/common/vk_alloc.h
>
> diff --git a/configure.ac b/configure.ac
> index b414edd..37cc306 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2693,6 +2693,8 @@ VA_MINOR=`$PKG_CONFIG --modversion libva | $SED -n
> 's/.*\.\(.*\)\..*$/\1/p'`
>  AC_SUBST([VA_MAJOR], $VA_MAJOR)
>  AC_SUBST([VA_MINOR], $VA_MINOR)
>
> +AM_CONDITIONAL(HAVE_VULKAN_COMMON, test "x$VULKAN_DRIVERS" != "x")
> +
>  AC_SUBST([XVMC_MAJOR], 1)
>  AC_SUBST([XVMC_MINOR], 0)
>
> @@ -2851,7 +2853,8 @@ AC_CONFIG_FILES([Makefile
> src/mesa/drivers/x11/Makefile
> src/mesa/main/tests/Makefile
> src/util/Makefile
> -   src/util/tests/hash_table/Makefile])
> +   src/util/tests/hash_table/Makefile
> +   src/vulkan/Makefile])
>
>  AC_OUTPUT
>
> diff --git a/src/Makefile.am b/src/Makefile.am
> index 17c8798..10e0826 100644
> --- a/src/Makefile.am
> +++ b/src/Makefile.am
> @@ -74,6 +74,10 @@ endif
>  # include only conditionally ?
>  SUBDIRS += compiler
>
> +if HAVE_VULKAN_COMMON
> +SUBDIRS += vulkan
> +endif
> +
>  if HAVE_AMD_DRIVERS
>  SUBDIRS += amd
>  endif
> diff --git a/src/vulkan/Makefile.am b/src/vulkan/Makefile.am
> new file mode 100644
> index 000..abe8404
> --- /dev/null
> +++ b/src/vulkan/Makefile.am
> @@ -0,0 +1,26 @@
> +# Copyright © 2016 Red Hat.
> +#
> +# Permission is hereby granted, free of charge, to any person obtaining a
> +# copy of this software and associated documentation files (the
> "Software"),
> +# to deal in the Software without restriction, including without
> limitation
> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +# and/or sell copies of the Software, and to permit persons to whom the
> +# Software is furnished to do so, subject to the following conditions:
> +#
> +# The above copyright notice and this permission notice (including the
> next
> +# paragraph) shall be included in all copies or substantial portions of
> the
> +# Software.
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> +# IN THE SOFTWARE.
> +
> +include Makefile.sources
> +
> +noinst_LTLIBRARIES =
> +
> +EXTRA_DIST = $(COMMON_HEADER_FILES)
> diff --git a/src/vulkan/Makefile.sources b/src/vulkan/Makefile.sources
> new file mode 100644
> index 000..a73bf99
> --- /dev/null
> +++ b/src/vulkan/Makefile.sources
> @@ -0,0 +1,2 @@
> +COMMON_HEADER_FILES = \
> +   common/vk_alloc.h
> diff --git a/src/vulkan/common/vk_alloc.h b/src/vulkan/common/vk_alloc.h
> new file mode 100644
> index 000..a8e21ca
> --- /dev/null
> +++ b/src/vulkan/common/vk_alloc.h
> @@ -0,0 +1,75 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT

Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Jan Ziak
On Tue, Oct 18, 2016 at 8:04 PM, Marek Olšák  wrote:

> On Tue, Oct 18, 2016 at 7:12 PM, Jan Ziak <0xe2.0x9a.0...@gmail.com>
> wrote:
> >> Regarding C++ templates, the compiler doesn't use them. If u_vector
> >> (Dave Airlie?) provides the same functionality as your array, I
> >> suggest we use u_vector instead.
> >
> > Let me repeat what you just wrote, because it is unbelievable: You are
> > advising the use of non-templated collection types in C++ code.
>
> Absolutely.
>

I don't believe what my own eyes are seeing.


> > If it isn't merged by Thursday (2016-oct-20) I will mark it as
> > rejected (rejected based on personal rather than scientific grounds).
>
> Relax. Things tend to move slowly when people are at conferences, on
> vacation, or just busy with the corporate stuff they have to deal with
> every day, and you can't predict those.
>
> Marek
>

Ok. Let's relax.

Jan


Re: [Mesa-dev] [PATCH 04/11] util: move min/max/clamp macros to util macros.h

2016-10-18 Thread Jason Ekstrand
THANK YOU!  I've been wanting to see this happen for a long time.

On Sun, Oct 16, 2016 at 7:07 PM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> Although the vulkan drivers include mesa macros.h, for
> radv I'd like to move away from that.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/mesa/main/macros.h | 13 -
>  src/util/macros.h  | 13 +
>  2 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
> index ed207d4..03a228b 100644
> --- a/src/mesa/main/macros.h
> +++ b/src/mesa/main/macros.h
> @@ -660,19 +660,6 @@ INTERP_4F(GLfloat t, GLfloat dst[4], const GLfloat
> out[4], const GLfloat in[4])
>
>
>
> -/** Clamp X to [MIN,MAX] */
> -#define CLAMP( X, MIN, MAX )  ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) :
> (X)) )
> -
> -/** Minimum of two values: */
> -#define MIN2( A, B )   ( (A)<(B) ? (A) : (B) )
> -
> -/** Maximum of two values: */
> -#define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
> -
> -/** Minimum and maximum of three values: */
> -#define MIN3( A, B, C ) ((A) < (B) ? MIN2(A, C) : MIN2(B, C))
> -#define MAX3( A, B, C ) ((A) > (B) ? MAX2(A, C) : MAX2(B, C))
> -
>  static inline unsigned
>  minify(unsigned value, unsigned levels)
>  {
> diff --git a/src/util/macros.h b/src/util/macros.h
> index 9dea2a0..27d1b62 100644
> --- a/src/util/macros.h
> +++ b/src/util/macros.h
> @@ -229,4 +229,17 @@ do {   \
>  /** Compute ceiling of integer quotient of A divided by B. */
>  #define DIV_ROUND_UP( A, B )  ( (A) % (B) == 0 ? (A)/(B) : (A)/(B)+1 )
>
> +/** Clamp X to [MIN,MAX] */
> +#define CLAMP( X, MIN, MAX )  ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) :
> (X)) )
> +
> +/** Minimum of two values: */
> +#define MIN2( A, B )   ( (A)<(B) ? (A) : (B) )
> +
> +/** Maximum of two values: */
> +#define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
> +
> +/** Minimum and maximum of three values: */
> +#define MIN3( A, B, C ) ((A) < (B) ? MIN2(A, C) : MIN2(B, C))
> +#define MAX3( A, B, C ) ((A) > (B) ? MAX2(A, C) : MAX2(B, C))
> +
>  #endif /* UTIL_MACROS_H */
> --
> 2.5.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
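
One thing to keep in mind when these macros become visible to more code:
they are plain function-like macros, so each argument may be evaluated more
than once.  The sketch below copies the definitions from the patch and
wraps them in small typed helpers; the toy_* helper names are hypothetical
and not part of util/macros.h.

```c
#include <assert.h>

/* Definitions as moved by the patch. */
#define CLAMP( X, MIN, MAX )  ( (X)<(MIN) ? (MIN) : ((X)>(MAX) ? (MAX) : (X)) )
#define MIN2( A, B )   ( (A)<(B) ? (A) : (B) )
#define MAX2( A, B )   ( (A)>(B) ? (A) : (B) )
#define MIN3( A, B, C ) ((A) < (B) ? MIN2(A, C) : MIN2(B, C))
#define MAX3( A, B, C ) ((A) > (B) ? MAX2(A, C) : MAX2(B, C))

/* Typed wrappers: arguments are evaluated exactly once at the call site,
 * so passing something like i++ or a function call is safe. */
static int
toy_clamp_int(int x, int lo, int hi)
{
   return CLAMP(x, lo, hi);
}

static int
toy_min3_int(int a, int b, int c)
{
   return MIN3(a, b, c);
}

static int
toy_max3_int(int a, int b, int c)
{
   return MAX3(a, b, c);
}
```

Calling the macros directly with side-effect-free arguments is of course
fine; the wrappers only matter when an argument has side effects or is
expensive to compute.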


Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Jan Ziak
Perf stat results for shader-db:

This is measured on an AMD Kaveri CPU.

gcc-6.2.0 -fno-omit-frame-pointer -g -O2

 Unpatched:

$ cd shader-db
$ ../run-upstream perfstat-u --repeat=5 -- ./run -1 shaders >/dev/null

 Performance counter stats for './run -1 shaders' (5 runs):

      13689.962374  task-clock (msec)     #    1.000 CPUs utilized         ( +-  0.29% )
               138  context-switches      #    0.010 K/sec                 ( +- 17.82% )
                 6  cpu-migrations        #    0.000 K/sec                 ( +- 13.36% )
            78,559  page-faults           #    0.006 M/sec                 ( +-  0.24% )
    53,578,642,861  cycles:u              #    3.914 GHz                   ( +-  0.29% )
    44,813,859,985  instructions:u        #    0.84  insn per cycle        ( +-  0.01% )
     1,069,586,875  cache-references:u    #   78.129 M/sec                 ( +-  0.65% )
        51,295,256  cache-misses:u        #    4.796 % of all cache refs   ( +-  0.56% )
     9,508,996,305  branches:u            #  694.596 M/sec                 ( +-  0.01% )
       453,237,236  branch-misses:u       #    4.77% of all branches       ( +-  0.84% )

      13.692494394 seconds time elapsed                                    ( +-  0.29% )
 Patched:

$ cd shader-db
$ ../run-upstream-patched perfstat-u --repeat=5 -- ./run -1 shaders >/dev/null

 Performance counter stats for './run -1 shaders' (5 runs):

      13602.106171      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.14% )
                86      context-switches          #    0.006 K/sec                    ( +- 13.95% )
                 6      cpu-migrations            #    0.000 K/sec                    ( +- 26.35% )
            78,271      page-faults               #    0.006 M/sec                    ( +-  0.82% )
    53,299,046,681      cycles:u                  #    3.918 GHz                      ( +-  0.13% )
    44,577,707,063      instructions:u            #    0.84  insn per cycle           ( +-  0.01% )
     1,078,158,307      cache-references:u        #   79.264 M/sec                    ( +-  0.70% )
        51,521,287      cache-misses:u            #    4.779 % of all cache refs      ( +-  1.03% )
     9,459,962,609      branches:u                #  695.478 M/sec                    ( +-  0.01% )
       456,593,871      branch-misses:u           #    4.83% of all branches          ( +-  0.27% )

      13.603795247 seconds time elapsed                                          ( +-  0.14% )


Re: [Mesa-dev] [PATCH 18/22] anv: move common wsi code to x11/wayland common files.

2016-10-18 Thread Emil Velikov
On 17 October 2016 at 05:24, Dave Airlie  wrote:

> diff --git a/src/intel/vulkan/Makefile.sources 
> b/src/intel/vulkan/Makefile.sources
> index 85df8a5..bd3afc0 100644
> --- a/src/intel/vulkan/Makefile.sources
> +++ b/src/intel/vulkan/Makefile.sources
> @@ -43,14 +43,17 @@ VULKAN_FILES := \
> anv_util.c \
> anv_wsi.c \
> anv_wsi.h \
> +   wsi_common.h \
> genX_pipeline_util.h \
> vk_format_info.h
>
>  VULKAN_WSI_WAYLAND_FILES := \
> -   anv_wsi_wayland.c
> +   anv_wsi_wayland.c \
> +   wsi_common_wayland.c
>
>  VULKAN_WSI_X11_FILES := \
> -   anv_wsi_x11.c
> +   anv_wsi_x11.c \
> +   wsi_common_x11.c
Please include the relevant headers in the lists above.

Also, please copy the license header from the current source. You can of
course add yourself/Red Hat if you wish.

-Emil


Re: [Mesa-dev] [PATCH] svga: minor code improvements in svga_validate_pipe_sampler_view()

2016-10-18 Thread Charmaine Lee

Reviewed-by: Charmaine Lee 


From: Brian Paul 
Sent: Tuesday, October 18, 2016 9:36 AM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee
Subject: [PATCH] svga: minor code improvements in 
svga_validate_pipe_sampler_view()

Use the 'texture' local var in more places.
Rename 'pFormat' to 'viewFormat'.
---
 src/gallium/drivers/svga/svga_state_sampler.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_sampler.c 
b/src/gallium/drivers/svga/svga_state_sampler.c
index 53bb80f..445afcc 100644
--- a/src/gallium/drivers/svga/svga_state_sampler.c
+++ b/src/gallium/drivers/svga/svga_state_sampler.c
@@ -135,21 +135,21 @@ svga_validate_pipe_sampler_view(struct svga_context *svga,
   SVGA3dSurfaceFormat format;
   SVGA3dResourceType resourceDim;
   SVGA3dShaderResourceViewDesc viewDesc;
-  enum pipe_format pformat = sv->base.format;
+  enum pipe_format viewFormat = sv->base.format;

   /* vgpu10 cannot create a BGRX view for a BGRA resource, so force it to
* create a BGRA view (and vice versa).
*/
-  if (pformat == PIPE_FORMAT_B8G8R8X8_UNORM &&
-  sv->base.texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
- pformat = PIPE_FORMAT_B8G8R8A8_UNORM;
+  if (viewFormat == PIPE_FORMAT_B8G8R8X8_UNORM &&
+  texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
+ viewFormat = PIPE_FORMAT_B8G8R8A8_UNORM;
   }
-  else if (pformat == PIPE_FORMAT_B8G8R8A8_UNORM &&
-  sv->base.texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
- pformat = PIPE_FORMAT_B8G8R8X8_UNORM;
+  else if (viewFormat == PIPE_FORMAT_B8G8R8A8_UNORM &&
+  texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
+ viewFormat = PIPE_FORMAT_B8G8R8X8_UNORM;
   }

-  format = svga_translate_format(ss, pformat,
+  format = svga_translate_format(ss, viewFormat,
  PIPE_BIND_SAMPLER_VIEW);
   assert(format != SVGA3D_FORMAT_INVALID);

--
1.9.1
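
The view-format fixup that this patch touches can be read in isolation as a small pure function. The sketch below is only an illustration of that logic: enum fmt and pick_view_format() are hypothetical stand-ins, not the gallium pipe_format enum or any svga driver function.

```c
/* Hypothetical stand-ins for the pipe_format values named in the patch. */
enum fmt {
   FMT_B8G8R8A8_UNORM,
   FMT_B8G8R8X8_UNORM,
   FMT_R8G8B8A8_UNORM
};

/*
 * Mirror of the if / else-if pair in the patch: when the view and the
 * texture disagree only in BGRA vs. BGRX, use the texture's format for
 * the view. Any other combination is returned unchanged.
 */
static enum fmt pick_view_format(enum fmt view_format, enum fmt texture_format)
{
   if (view_format == FMT_B8G8R8X8_UNORM &&
       texture_format == FMT_B8G8R8A8_UNORM)
      return FMT_B8G8R8A8_UNORM;

   if (view_format == FMT_B8G8R8A8_UNORM &&
       texture_format == FMT_B8G8R8X8_UNORM)
      return FMT_B8G8R8X8_UNORM;

   return view_format;
}
```

In both special cases the result equals the texture format, which is how the comment in the patch ("force it to create a BGRA view (and vice versa)") reads.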



Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Marek Olšák
On Tue, Oct 18, 2016 at 7:12 PM, Jan Ziak <0xe2.0x9a.0...@gmail.com> wrote:
>> Regarding C++ templates, the compiler doesn't use them. If u_vector
>> (Dave Airlie?) provides the same functionality as your array, I
>> suggest we use u_vector instead.
>
> Let me repeat what you just wrote, because it is unbelievable: You are
> advising the use of non-templated collection types in C++ code.

Absolutely.

>
>> If you can't use u_vector, you should
>> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
>> Kenneth Graunke) to use C++ templates.
>
> - You are talking about coding rules some Mesa developers agreed upon
> and didn't bother writing down for other developers to read
>
> - I am not willing to use u_vector in C++ code
>
>> I'll repeat some stuff about profiling here but also explain my perspective.
>
> So far (which may be a year or so), there is no indication that you
> are better at optimizing code than me.

Good one.

>
>> Never profile with -O0 or disabled function inlining.
>
> Seriously?

Absolutely.

>
>> Mesa uses -g -O2
>> with --enable-debug, so that's what you should use too. Don't use any
>> other -O* variants.
>
> What if I find a case where -O2 prevents me from easily seeing
> information necessary to optimize the source code?

There are several ways to get useful data from optimized code (using
the frame pointer, using dwarf, etc.). -O0 distorts the profile too much.

>
>> The only profiling tools reporting correct results are perf and
>> sysprof.
>
> I used perf on Metro 2033 Redux and saw do_dead_code() there. Then I
> used callgrind to see some more code.

I recommend building Mesa with the frame pointer enabled, or enabling
dwarf in perf. Otherwise you won't see call trees.

>
>> (both use the same mechanism) If you don't enable dwarf in
>> perf (also sysprof can't use dwarf), you have to build Mesa with
>> -fno-omit-frame-pointer to see call trees. The only reason you would
>> want to enable dwarf-based call trees is when you want to see libc
>> calls. Otherwise, they won't be displayed or counted as part of call
>> trees. For Mesa developers who do profiling often,
>> -fno-omit-frame-pointer should be your default.
>
>> Callgrind counts calls (that one you can trust), but the reported time
>> is incorrect,
>
> Are you nuts? You cannot seriously be assuming that I didn't know about that.
>
>> because it uses its own virtual model of a CPU. Avoid it
>> if you want to measure time spent in functions.
>
> I will *NOT* avoid callgrind because I know how to use it to optimize code.

I didn't suggest avoiding callgrind in all cases.

>
>>Marek
>
> As usual, I would like to notify reviewers of this patch that I
> am not willing to wait months to learn whether the code will be merged
> or rejected.
>
> If it isn't merged by Thursday (2016-oct-20) I will mark it as
> rejected (rejected based on personal rather than scientific grounds).

Relax. Things tend to move slowly when people are at conferences, on
vacation, or just busy with the corporate stuff they have to deal with
every day, and you can't predict any of that.

Marek


Re: [Mesa-dev] [PATCH] nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT

2016-10-18 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

This comes into play with Zcull, I think. But since we don't do Zcull
yet, whatever. I had a patch to effectively convert it into a
layout(early_fragment_tests) if the various settings matched, but
ultimately it didn't seem worthwhile.

  -ilia


On Tue, Oct 18, 2016 at 1:59 PM, Samuel Pitoiset
 wrote:
> Found that information message while replaying a trace from
> Metro 2033 Redux. Mark that property as useless for now.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index db03281..0c98744 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1093,6 +1093,7 @@ void Source::scanProperty(const struct 
> tgsi_full_property *prop)
>break;
> case TGSI_PROPERTY_FS_COORD_ORIGIN:
> case TGSI_PROPERTY_FS_COORD_PIXEL_CENTER:
> +   case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
>// we don't care
>break;
> case TGSI_PROPERTY_VS_PROHIBIT_UCPS:
> --
> 2.10.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50/ir: Split 64-bit integer MAD/MUL operations

2016-10-18 Thread Pierre Moreau
Hello Ian,

Since I am working on a direct SPIR-V to NV50 IR translator, ultimately to be
used for OpenCL kernels, I will still need the patch for that work. (I even
wrote that patch because I needed it when handling 64-bit addresses. :-) )
But thanks for the heads-up!

Pierre


On 02:07 pm - Oct 17 2016, Ian Romanick wrote:
> I don't know if it will make this patch unnecessary, but I have a GLSL
> IR-level lowering pass for 64-bit multiplication.  I'm going to send
> that out with the rest of the GL_ARB_gpu_shader_int64 series within the
> next day or so.
> 
> On 10/15/2016 03:24 PM, Pierre Moreau wrote:
> > Hardware does not support 64-bit integer MAD and MUL operations, so we need
> > to transform them into 32-bit operations.
> > 
> > Signed-off-by: Pierre Moreau 
> > ---
> >  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 121 
> > +
> >  1 file changed, 121 insertions(+)
> > 
> > Tested with (the GPU result was compared to the CPU result):
> > * 0xfff3lu * 0xfff2lu + 0x80070002lu
> > * 0xfff3lu * 0x80070002lu + 0x80070002lu
> > * 0x80010003lu * 0xfff2lu + 0x80070002lu
> > * 0x80010003lu * 0x80070002lu + 0x80070002lu
> > 
> > * -523456791234l * 929835793793l + -15793793l
> > *  523456791234l * 929835793793l + -15793793l
> > * -523456791234l * -929835793793l + -15793793l
> > *  523456791234l * -929835793793l + -15793793l
> > 
> > v2:
> > * Completely re-write the patch, as it was completely flawed (Ilia Mirkin)
> > * Move pass prior to Register Allocation, as some temporaries need to
> >   be created.
> > 
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > index d88bb34..a610eb5 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > @@ -2218,6 +2218,126 @@ LateAlgebraicOpt::visit(Instruction *i)
> >  
> >  // 
> > =
> >  
> > +// Split 64-bit MUL and MAD
> > +class Split64BitOpPreRA : public Pass
> > +{
> > +private:
> > +   virtual bool visit(BasicBlock *);
> > +   void split64BitReg(Function *, Instruction *, Instruction *,
> > +  Instruction *, Value *, int);
> > +   void split64MulMad(Function *, Instruction *, DataType);
> > +
> > +   BuildUtil bld;
> > +};
> > +
> > +bool
> > +Split64BitOpPreRA::visit(BasicBlock *bb)
> > +{
> > +   Instruction *i, *next;
> > +   Modifier mod;
> > +
> > +   for (i = bb->getEntry(); i; i = next) {
> > +  next = i->next;
> > +
> > +  if (typeSizeof(i->dType) != 8)
> > + continue;
> > +
> > +  DataType hTy;
> > +  switch (i->dType) {
> > +  case TYPE_U64: hTy = TYPE_U32; break;
> > +  case TYPE_S64: hTy = TYPE_S32; break;
> > +  default:
> > + continue;
> > +  }
> > +
> > +  if (i->op == OP_MAD || i->op == OP_MUL)
> > + split64MulMad(bb->getFunction(), i, hTy);
> > +   }
> > +
> > +   return true;
> > +}
> > +
> > +void
> > +Split64BitOpPreRA::split64MulMad(Function *fn, Instruction *i, DataType hTy)
> > +{
> > +   assert(i->op == OP_MAD || i->op == OP_MUL);
> > +   if (isFloatType(i->dType) || isFloatType(i->sType))
> > +  return;
> > +
> > +   bld.setPosition(i, true);
> > +
> > +   Value *zero = bld.mkImm(0u);
> > +   Value *carry = bld.getSSA(1, FILE_FLAGS);
> > +
> > +   // We want to compute `d = a * b (+ c)?`, where a, b, c and d are 64-bit
> > +   // values (a, b and c might be 32-bit values), using 32-bit operations. This
> > +   // gives the following operations:
> > +   // * `d.low = low(a.low * b.low) (+ c.low)?`
> > +   // * `d.high = low(a.high * b.low) + low(a.low * b.high)
> > +   //   + high(a.low * b.low) (+ c.high)?`
> > +   //
> > +   // To compute the high bits, we can split in the following operations:
> > +   // * `tmp1   = low(a.high * b.low) (+ c.high)?`
> > +   // * `tmp2   = low(a.low * b.high) + tmp1`
> > +   // * `d.high = high(a.low * b.low) + tmp2`
> > +   //
> > +   // mkSplit put lower bits at index 0 and higher bits at index 1
> > +
> > +   Value *op1[2];
> > +   if (i->getSrc(0)->reg.size == 8)
> > +  bld.mkSplit(op1, typeSizeof(hTy), i->getSrc(0));
> > +   else {
> > +  op1[0] = i->getSrc(0);
> > +  op1[1] = zero;
> > +   }
> > +   Value *op2[2];
> > +   if (i->getSrc(1)->reg.size == 8)
> > +  bld.mkSplit(op2, typeSizeof(hTy), i->getSrc(1));
> > +   else {
> > +  op2[0] = i->getSrc(1);
> > +  op2[1] = zero;
> > +   }
> > +
> > +   Value *op3[2] = { NULL, NULL };
> > +   if (i->op == OP_MAD) {
> > +  if (i->getSrc(2)->reg.size == 8)
> > + bld.mkSplit(op3, typeSizeof(hTy), i->getSrc(2));
> > +  else {
> > + op3[0] = i->getSrc(2);
> > + op3[1] = zero;
> > + 
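
The decomposition described in the quoted patch comment can be checked on the CPU with a few lines of plain C. This is only a sketch under stated assumptions: mad64_split() and mul_hi32() are hypothetical names, mul_hi32() uses a 64-bit multiply as a stand-in for a hardware "multiply high" operation, and the explicit carry out of the low-word addition is my reading of the `carry` value the patch allocates (the quoted hunk is cut off before that part).

```c
#include <stdint.h>

/* Stand-in for a hardware "high 32 bits of a 32x32 multiply" operation. */
static uint32_t mul_hi32(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Computes d = a * b + c modulo 2^64 from 32-bit halves only. */
static uint64_t mad64_split(uint64_t a, uint64_t b, uint64_t c)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
   uint32_t c_lo = (uint32_t)c, c_hi = (uint32_t)(c >> 32);

   /* d.low = low(a.low * b.low) + c.low, remembering the carry out. */
   uint32_t lo_prod = a_lo * b_lo;
   uint32_t d_lo = lo_prod + c_lo;
   uint32_t carry = d_lo < lo_prod;

   /* tmp1 = low(a.high * b.low) + c.high */
   uint32_t tmp1 = a_hi * b_lo + c_hi;
   /* tmp2 = low(a.low * b.high) + tmp1 */
   uint32_t tmp2 = a_lo * b_hi + tmp1;
   /* d.high = high(a.low * b.low) + tmp2, plus the carry from d.low */
   uint32_t d_hi = mul_hi32(a_lo, b_lo) + tmp2 + carry;

   return ((uint64_t)d_hi << 32) | d_lo;
}
```

Because the arithmetic is modulo 2^64, the same routine covers the signed test cases listed in the patch notes when the operands are reinterpreted as unsigned; the a.high * b.high term never contributes to the low 64 bits.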

[Mesa-dev] [PATCH] nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT

2016-10-18 Thread Samuel Pitoiset
Found that information message while replaying a trace from
Metro 2033 Redux. Mark that property as useless for now.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index db03281..0c98744 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1093,6 +1093,7 @@ void Source::scanProperty(const struct tgsi_full_property 
*prop)
   break;
case TGSI_PROPERTY_FS_COORD_ORIGIN:
case TGSI_PROPERTY_FS_COORD_PIXEL_CENTER:
+   case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
   // we don't care
   break;
case TGSI_PROPERTY_VS_PROHIBIT_UCPS:
-- 
2.10.0



Re: [Mesa-dev] [PATCH] st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs

2016-10-18 Thread Brian Paul


Reviewed-by: Brian Paul 


On 10/18/2016 11:48 AM, Marek Olšák wrote:

From: Marek Olšák 

v2: rebased
---
  src/mesa/state_tracker/st_atom_blend.c | 3 ++-
  src/mesa/state_tracker/st_atom_depth.c | 3 ++-
  2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c 
b/src/mesa/state_tracker/st_atom_blend.c
index 76d6a644..b8d65bd 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -259,21 +259,22 @@ update_blend( struct st_context *st )
   blend->rt[i].colormask |= PIPE_MASK_G;
if (ctx->Color.ColorMask[i][2])
   blend->rt[i].colormask |= PIPE_MASK_B;
if (ctx->Color.ColorMask[i][3])
   blend->rt[i].colormask |= PIPE_MASK_A;
 }

 blend->dither = ctx->Color.DitherFlag;

 if (ctx->Multisample.Enabled &&
-   ctx->DrawBuffer->Visual.sampleBuffers > 0) {
+   ctx->DrawBuffer->Visual.sampleBuffers > 0 &&
+   !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
/* Unlike in gallium/d3d10 these operations are only performed
 * if both msaa is enabled and we have a multisample buffer.
 */
blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;
blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne;
 }

 cso_set_blend(st->cso_context, blend);

 {
diff --git a/src/mesa/state_tracker/st_atom_depth.c 
b/src/mesa/state_tracker/st_atom_depth.c
index 267b42c..7092c3f 100644
--- a/src/mesa/state_tracker/st_atom_depth.c
+++ b/src/mesa/state_tracker/st_atom_depth.c
@@ -142,21 +142,22 @@ update_depth_stencil_alpha(struct st_context *st)
else {
   /* This should be unnecessary. Drivers must not expect this to
* contain valid data, except the enabled bit
*/
   dsa->stencil[1] = dsa->stencil[0];
   dsa->stencil[1].enabled = 0;
   sr.ref_value[1] = sr.ref_value[0];
}
 }

-   if (ctx->Color.AlphaEnabled) {
+   if (ctx->Color.AlphaEnabled &&
+   !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
dsa->alpha.enabled = 1;
dsa->alpha.func = st_compare_func_to_pipe(ctx->Color.AlphaFunc);
dsa->alpha.ref_value = ctx->Color.AlphaRefUnclamped;
 }

 cso_set_depth_stencil_alpha(st->cso_context, dsa);
 cso_set_stencil_ref(st->cso_context, &sr);
  }







[Mesa-dev] [PATCH] st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs

2016-10-18 Thread Marek Olšák
From: Marek Olšák 

v2: rebased
---
 src/mesa/state_tracker/st_atom_blend.c | 3 ++-
 src/mesa/state_tracker/st_atom_depth.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c 
b/src/mesa/state_tracker/st_atom_blend.c
index 76d6a644..b8d65bd 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -259,21 +259,22 @@ update_blend( struct st_context *st )
  blend->rt[i].colormask |= PIPE_MASK_G;
   if (ctx->Color.ColorMask[i][2])
  blend->rt[i].colormask |= PIPE_MASK_B;
   if (ctx->Color.ColorMask[i][3])
  blend->rt[i].colormask |= PIPE_MASK_A;
}
 
blend->dither = ctx->Color.DitherFlag;
 
if (ctx->Multisample.Enabled &&
-   ctx->DrawBuffer->Visual.sampleBuffers > 0) {
+   ctx->DrawBuffer->Visual.sampleBuffers > 0 &&
+   !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
   /* Unlike in gallium/d3d10 these operations are only performed
* if both msaa is enabled and we have a multisample buffer.
*/
   blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;
   blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne;
}
 
cso_set_blend(st->cso_context, blend);
 
{
diff --git a/src/mesa/state_tracker/st_atom_depth.c 
b/src/mesa/state_tracker/st_atom_depth.c
index 267b42c..7092c3f 100644
--- a/src/mesa/state_tracker/st_atom_depth.c
+++ b/src/mesa/state_tracker/st_atom_depth.c
@@ -142,21 +142,22 @@ update_depth_stencil_alpha(struct st_context *st)
   else {
  /* This should be unnecessary. Drivers must not expect this to
   * contain valid data, except the enabled bit
   */
  dsa->stencil[1] = dsa->stencil[0];
  dsa->stencil[1].enabled = 0;
  sr.ref_value[1] = sr.ref_value[0];
   }
}
 
-   if (ctx->Color.AlphaEnabled) {
+   if (ctx->Color.AlphaEnabled &&
+   !(ctx->DrawBuffer->_IntegerBuffers & 0x1)) {
   dsa->alpha.enabled = 1;
   dsa->alpha.func = st_compare_func_to_pipe(ctx->Color.AlphaFunc);
   dsa->alpha.ref_value = ctx->Color.AlphaRefUnclamped;
}
 
cso_set_depth_stencil_alpha(st->cso_context, dsa);
cso_set_stencil_ref(st->cso_context, &sr);
 }
 
 
-- 
2.7.4
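
The condition added in both hunks boils down to one bit test. The sketch below assumes, based only on how the patch uses it, that `_IntegerBuffers` is a bitmask in which bit 0 corresponds to color buffer 0; alpha_ops_allowed() and alpha_test_enabled() are hypothetical helpers, not Mesa functions.

```c
#include <stdbool.h>

/*
 * Assumption: bit i of integer_buffers is set when color buffer i has an
 * integer format. The patch only inspects bit 0.
 */
static bool alpha_ops_allowed(unsigned integer_buffers)
{
   return !(integer_buffers & 0x1);
}

/* Alpha test stays enabled only when requested and buffer 0 is not integer. */
static bool alpha_test_enabled(bool alpha_enabled, unsigned integer_buffers)
{
   return alpha_enabled && alpha_ops_allowed(integer_buffers);
}
```

The same predicate gates alpha-to-coverage and alpha-to-one in the st_atom_blend.c hunk, in addition to the existing multisample checks.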



Re: [Mesa-dev] [PATCH] egl/surfaceless: Fix segfault in eglSwapBuffers

2016-10-18 Thread Anuj Phogat
On Tue, Oct 18, 2016 at 9:43 AM, Chad Versace  wrote:
> Since commit 63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless
> platform has allowed creation of pbuffer surfaces. But the vtable entry
> for eglSwapBuffers has remained NULL.
>
> Discovered by running a little pbuffer test.
>
> Cc: Gurchetan Singh 
> ---
>  src/egl/drivers/dri2/platform_surfaceless.c | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/platform_surfaceless.c 
> b/src/egl/drivers/dri2/platform_surfaceless.c
> index fcf7d69..a55c5f1 100644
> --- a/src/egl/drivers/dri2/platform_surfaceless.c
> +++ b/src/egl/drivers/dri2/platform_surfaceless.c
> @@ -178,6 +178,17 @@ dri2_surfaceless_create_pbuffer_surface(_EGLDriver *drv, 
> _EGLDisplay *disp,
>  }
>
>  static EGLBoolean
> +surfaceless_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface 
> *surf)
> +{
> +   assert(!surf || surf->Type == EGL_PBUFFER_BIT);
> +
> +   /* From the EGL 1.5 spec:
> +*If surface is a [...] pbuffer surface, eglSwapBuffers has no effect.
> +*/
> +   return EGL_TRUE;
> +}
> +
> +static EGLBoolean
>  surfaceless_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
>  {
> struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
> @@ -223,6 +234,7 @@ static struct dri2_egl_display_vtbl 
> dri2_surfaceless_display_vtbl = {
> .destroy_surface = surfaceless_destroy_surface,
> .create_image = dri2_create_image_khr,
> .swap_interval = dri2_fallback_swap_interval,
> +   .swap_buffers = surfaceless_swap_buffers,
> .swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage,
> .swap_buffers_region = dri2_fallback_swap_buffers_region,
> .post_sub_buffer = dri2_fallback_post_sub_buffer,
> --
> 2.10.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
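
For context, the failure mode fixed here is the usual one for function-pointer tables: a caller dispatches through an entry that was never filled in. A minimal sketch follows, with hypothetical types and names (struct surface, struct display_vtbl, dispatch_swap_buffers); none of them are the real EGL driver structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical surface type; only here so the callback has an argument. */
struct surface {
   int is_pbuffer;
};

/* Hypothetical one-entry vtable, standing in for a driver display vtable. */
struct display_vtbl {
   bool (*swap_buffers)(struct surface *surf);
};

/* No-op implementation: swapping a pbuffer surface has no effect. */
static bool noop_swap_buffers(struct surface *surf)
{
   (void)surf;
   return true;
}

/* Before the fix: the entry is left NULL, so calling through it would crash. */
static const struct display_vtbl vtbl_without_entry = { .swap_buffers = NULL };

/* After the fix: the entry points at a harmless no-op. */
static const struct display_vtbl vtbl_with_entry = {
   .swap_buffers = noop_swap_buffers
};

/* Hypothetical caller that assumes every vtable entry is populated. */
static bool dispatch_swap_buffers(const struct display_vtbl *vtbl,
                                  struct surface *surf)
{
   return vtbl->swap_buffers(surf);
}
```

Filling the slot with a no-op keeps callers free of NULL checks, which matches the approach the patch takes.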


Re: [Mesa-dev] [PATCH 07/25] mesa/i965: eliminate gl_tess_ctrl_program and use new shared shader_info

2016-10-18 Thread Kenneth Graunke
On Tuesday, October 18, 2016 8:28:02 AM PDT Jason Ekstrand wrote:
> On Tue, Oct 18, 2016 at 8:14 AM, Jason Ekstrand 
> wrote:
> 
> > I want to make a few comments on how this series is structured.  This is
> > not the way I would have done it and I think the way you structured it
> > makes it substantially less rebasable than it could be and a bit harder to
> > review.  The way *I* would have done this would be something like the
> > following:
> >
> >  1) Move shader_info to common code (patches 1-2)
> >  2) Add a shader_info pointer to gl_program (patch 6), break the fill
> > shader_info stuff from glsl_to_nir into its own function, and call it from
> > somewhere such that it always gets filled out.
> >  3) Add new fields to shader_info *and* make sure they get filled out from
> > other GLSL information
> >  4) Convert i965 over to the new shader_info
> >  5) Convert gallium over to the new shader_info
> >  6) Make GLSL fill out shader_info directly and nuke the old shader
> > metadata.
> >  7) Delete the shader_info fill-out function.
> >
> > Something along these lines would go a long way towards avoiding the "mega
> > patch" problem where each patch touches 4 or 5 different components.  It
> > also makes it clearer to review because you don't add fields and then the
> > reviewer goes "Wait, where does this get set?  Oh, in another patch".  I'm
> > not necessarily saying that you have to go back and change your patches.
> > It's more a suggestion for if you end up doing a v3 or another refactor
> > along these lines in the future.
> >
> 
> On the review side, splitting out as I described above would make it much
> easier to review since it would be more-or-less one type of refactor per
> patch.  In this patch, we have several different kinds of refactors:
> 
>  1) Move consumers over to reading shader_info
>  2) Remove gl_tess_ctrl_program and related refactors
>  3) Move producer over to writing shader_info
> 
> Normally, when reviewing, I would just skim (2) and give (1) and (3) more
> effort.  Having them mixed together means I have to pay constant attention
> to what's going on.  Also, having (2) mixed in makes it harder to verify
> (3) because there's a lot of code motion only some of which matters.

I agree with Jason.  This could be structured a lot more cleanly, and it
would make it much easier to review.

For example, patches 3-5 add a bunch of new structure fields.  But they
aren't populated by anything.  The CS local_size_variable field finally
gets populated in patch 12 (a whole 7 patches later!)...and by the end
of the series...I don't see a single consumer of that field.

So, the field is useless.  But I had to use 'git log -p' on a branch and
search through your entire series to determine that.  There's far too
much context to keep in my head while reading, and it means I have to
abandon my usual read-emails-mostly-in-order review process.

I actually added TES shader info fields a little while back (but haven't
sent the patches out yet, as Vulkan tessellation isn't quite ready).
Here's what my patches looked like:

1) Convert spacing from GLenums to a TESS_SPACING_* enum
https://cgit.freedesktop.org/~kwg/mesa/commit/?h=vktess&id=8b49a8485dd37eb405efcaaecd55244a8f63f213
   (simple cleanup I did across the whole codebase)

2) Introduce nir_shader_info fields and populate them in glsl_to_nir and
   spirv_to_nir.
https://cgit.freedesktop.org/~kwg/mesa/commit/?h=vktess&id=3142efa913965324ad21c3cefc792ab83e1a1390
   (fields are at least populated in all frontends, but may be useless)

3) Convert i965 over to use nir_shader_info for fields
https://cgit.freedesktop.org/~kwg/mesa/commit/?h=vktess&id=a518388acc7a6db88c7e21829e7a15b15b9304ad
   (now the fields are used.  admittedly I did some bonus code motion in
   this patch...if I'm being pedantic, I should have made that a fourth
   patch to make the prog_data fields be populated in Vulkan paths)

The first patch stands alone, and patches 2-3 stand together.  All are
very small.  You need no additional context to answer questions, and can
say "those look good" and move on rather quickly.

With that in mind, I'd like to ask you to please try and rework this
series along the lines that Jason suggested.  I know it's a bunch of
work, but being disciplined in how we organize our code is a really
useful skill that pays off in the long run.  When reviewers can look
at your code and quickly give a thumbs up, you get to land your patches
a lot more quickly, and that extra effort ultimately saves you (and
others) a whole lot of time.

Sorry, Tim :(

--Ken




Re: [Mesa-dev] [PATCH 07/22] anv/wsi/x11: abstract WSI interface from internals.

2016-10-18 Thread Emil Velikov
On 17 October 2016 at 05:24, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This allows the API and the internals to be split, and the
> internals shared.
> ---
>  src/intel/vulkan/anv_wsi_x11.c | 33 -
>  1 file changed, 24 insertions(+), 9 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_x11.c
> index ccaabea..6eb06c3 100644
> --- a/src/intel/vulkan/anv_wsi_x11.c
> +++ b/src/intel/vulkan/anv_wsi_x11.c
> @@ -233,16 +233,15 @@ visual_has_alpha(xcb_visualtype_t *visual, unsigned 
> depth)
> return (all_mask & ~rgb_mask) != 0;
>  }
>
> -VkBool32 anv_GetPhysicalDeviceXcbPresentationSupportKHR(
> -VkPhysicalDevicephysicalDevice,
> +static VkBool32 anv_get_physical_device_xcb_presentation_support(
> +struct anv_wsi_device *wsi_device,
> +VkAllocationCallbacks *alloc,
Nit: indentation (here and below) seems off.

-Emil


Re: [Mesa-dev] [PATCH 07/25] mesa/i965: eliminate gl_tess_ctrl_program and use new shared shader_info

2016-10-18 Thread Jason Ekstrand
On Tue, Oct 18, 2016 at 8:28 AM, Jason Ekstrand 
wrote:

> On Tue, Oct 18, 2016 at 8:14 AM, Jason Ekstrand 
> wrote:
>
>> I want to make a few comments on how this series is structured.  This is
>> not the way I would have done it and I think the way you structured it
>> makes it substantially less rebasable than it could be and a bit harder to
>> review.  The way *I* would have done this would be something like the
>> following:
>>
>>  1) Move shader_info to common code (patches 1-2)
>>  2) Add a shader_info pointer to gl_program (patch 6), break the fill
>> shader_info stuff from glsl_to_nir into its own function, and call it from
>> somewhere such that it always gets filled out.
>>  3) Add new fields to shader_info *and* make sure they get filled out
>> from other GLSL information
>>  4) Convert i965 over to the new shader_info
>>  5) Convert gallium over to the new shader_info
>>  6) Make GLSL fill out shader_info directly and nuke the old shader
>> metadata.
>>  7) Delete the shader_info fill-out function.
>>
>
Oh, and one more step:

 8) Refactor to get rid of all of the gl_foo_program stuff.  (Maybe
multiple patches?)


>
>> Something along these lines would go a long way towards avoiding the
>> "mega patch" problem where each patch touches 4 or 5 different components.
>> It also makes it clearer to review because you don't add fields and then
>> the reviewer goes "Wait, where does this get set?  Oh, in another patch".
>> I'm not necessarily saying that you have to go back and change your
>> patches.  It's more a suggestion for if you end up doing a v3 or another
>> refactor along these lines in the future.
>>
>
> On the review side, splitting out as I described above would make it much
> easier to review since it would be more-or-less one type of refactor per
> patch.  In this patch, we have several different kinds of refactors:
>
>  1) Move consumers over to reading shader_info
>  2) Remove gl_tess_ctrl_program and related refactors
>  3) Move producer over to writing shader_info
>
> Normally, when reviewing, I would just skim (2) and give (1) and (3) more
> effort.  Having them mixed together means I have to pay constant attention
> to what's going on.  Also, having (2) mixed in makes it harder to verify
> (3) because there's a lot of code motion only some of which matters.
>
>
>>
>>
>> On Mon, Oct 17, 2016 at 11:12 PM, Timothy Arceri <
>> timothy.arc...@collabora.com> wrote:
>>
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_context.h   |  6 ++---
>>>  src/mesa/drivers/dri/i965/brw_draw.c  |  2 +-
>>>  src/mesa/drivers/dri/i965/brw_program.c   |  2 +-
>>>  src/mesa/drivers/dri/i965/brw_tcs.c   | 32
>>> ++-
>>>  src/mesa/drivers/dri/i965/brw_tcs_surface_state.c |  2 +-
>>>  src/mesa/drivers/dri/i965/brw_tes.c   | 20 +++---
>>>  src/mesa/drivers/dri/i965/gen7_hs_state.c |  4 +--
>>>  src/mesa/main/context.c   |  2 +-
>>>  src/mesa/main/mtypes.h| 12 +
>>>  src/mesa/main/shaderapi.c |  4 +--
>>>  src/mesa/main/state.c | 11 
>>>  src/mesa/program/prog_statevars.c |  2 +-
>>>  src/mesa/program/program.c|  4 +--
>>>  src/mesa/program/program.h| 23 
>>>  src/mesa/state_tracker/st_atom.c  |  2 +-
>>>  src/mesa/state_tracker/st_atom_constbuf.c |  2 +-
>>>  src/mesa/state_tracker/st_atom_sampler.c  |  2 +-
>>>  src/mesa/state_tracker/st_atom_shader.c   |  2 +-
>>>  src/mesa/state_tracker/st_atom_texture.c  |  2 +-
>>>  src/mesa/state_tracker/st_cb_program.c| 10 +++
>>>  src/mesa/state_tracker/st_program.c   |  6 ++---
>>>  src/mesa/state_tracker/st_program.h   |  6 ++---
>>>  22 files changed, 58 insertions(+), 100 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
>>> b/src/mesa/drivers/dri/i965/brw_context.h
>>> index c92bb9f..9b7e184 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>>> @@ -337,7 +337,7 @@ struct brw_vertex_program {
>>>
>>>  /** Subclass of Mesa tessellation control program */
>>>  struct brw_tess_ctrl_program {
>>> -   struct gl_tess_ctrl_program program;
>>> +   struct gl_program program;
>>> unsigned id;  /**< serial no. to identify tess ctrl progs, never
>>> re-used */
>>>  };
>>>
>>> @@ -1008,7 +1008,7 @@ struct brw_context
>>>  */
>>> const struct gl_vertex_program *vertex_program;
>>> const struct gl_geometry_program *geometry_program;
>>> -   const struct gl_tess_ctrl_program *tess_ctrl_program;
>>> +   const struct gl_program *tess_ctrl_program;
>>> const struct gl_tess_eval_program *tess_eval_program;
>>> const struct gl_fragment_program *fragment_program;
>>> 

Re: [Mesa-dev] [PATCH 01/22] radv/anv/wsi: drop uneeded parameter

2016-10-18 Thread Emil Velikov
Typo in the summary - s/uneeded/unneeded/

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/11] anv: drop pointless struct decl.

2016-10-18 Thread Emil Velikov
On 17 October 2016 at 03:07, Dave Airlie  wrote:
> From: Dave Airlie 
>
> Signed-off-by: Dave Airlie 
Seems like a typo from the development stage - anv_wsi_inter_a_face

10 and 11 are independent so feel free to land whenever possible.
Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [PATCH 3/3] [Bug 38970] [bisected]piglit glx/glx-pixmap-multi failed

2016-10-18 Thread Ian Romanick
On 09/29/2016 01:55 PM, Anutex wrote:
> I tried to debug this issue by changing the condition to check only for bad
> magic and error.
> And the test passed.
>
> Though I am not sure what the correct behaviour is if we are in this
> condition.
> Maybe we should add some other condition for when the hash table already
> has the bucket data.
> ---
>  src/glx/dri2_glx.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
> index af388d9..a1fd9ff 100644
> --- a/src/glx/dri2_glx.c
> +++ b/src/glx/dri2_glx.c
> @@ -411,12 +411,13 @@ dri2CreateDrawable(struct glx_screen *base, XID 
> xDrawable,
>return NULL;
> }
>  
> -   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw)) {
> +   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw) == -1) {

I'm not 100% sure the existing code is wrong.  __glxHashInsert returns
-1 for an error, and it returns 1 if the key is already in the hash
table.  In that case we'll leak the memory for the new pdraw, right?
That also seems bad.

It seems like instead the code should look up xDrawable in the hash
table and return the value that's already there.  Maybe.  I haven't
looked at this code in years, so I may be forgetting some subtlety.
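
A minimal sketch of the "look up and return the existing value" idea,
assuming the return convention described above (-1 on error, 1 if the key
is already present, 0 on success). The table and every demo_* name are
hypothetical stand-ins, not the real __glxHash API or the dri2 code:

```c
#include <stdlib.h>

#define DEMO_SLOTS 8

/* Hypothetical fixed-size table; stands in for the real hash table. */
struct demo_table {
   unsigned long keys[DEMO_SLOTS];
   void *values[DEMO_SLOTS];
   int used;
};

/* Returns 0 on success, 1 if the key already exists, -1 on error. */
static int demo_insert(struct demo_table *t, unsigned long key, void *value)
{
   for (int i = 0; i < t->used; i++)
      if (t->keys[i] == key)
         return 1;
   if (t->used == DEMO_SLOTS)
      return -1;
   t->keys[t->used] = key;
   t->values[t->used] = value;
   t->used++;
   return 0;
}

static void *demo_lookup(const struct demo_table *t, unsigned long key)
{
   for (int i = 0; i < t->used; i++)
      if (t->keys[i] == key)
         return t->values[i];
   return NULL;
}

/* Returns the object registered for key, creating one only if needed. */
static void *demo_get_or_create(struct demo_table *t, unsigned long key)
{
   void *obj = calloc(1, 16);
   if (obj == NULL)
      return NULL;

   switch (demo_insert(t, key, obj)) {
   case 0:            /* inserted: the table now owns obj */
      return obj;
   case 1:            /* duplicate: drop the new object, reuse the old one */
      free(obj);
      return demo_lookup(t, key);
   default:           /* error: nothing was stored, so free and fail */
      free(obj);
      return NULL;
   }
}
```

Handling the three return values separately avoids both the leak on a
duplicate key and the failure the original test hit. Whether reusing the
existing drawable is actually correct for dri2CreateDrawable is the open
question raised above.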

>(*psc->core->destroyDrawable) (pdraw->driDrawable);
>DRI2DestroyDrawable(psc->base.dpy, xDrawable);
>free(pdraw);
>return None;
> }
> + 
>  

Spurious whitespace change.

> /*
>  * Make sure server has the same swap interval we do for the new
> 



Re: [Mesa-dev] [PATCH 08/11] anv: move to using vk_alloc helpers.

2016-10-18 Thread Emil Velikov
Hi Dave,

On 17 October 2016 at 03:07, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This moves all the alloc/free in anv to the generic helpers.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/intel/vulkan/anv_batch_chain.c| 40 +++---
>  src/intel/vulkan/anv_cmd_buffer.c | 22 -
>  src/intel/vulkan/anv_descriptor_set.c | 12 -
>  src/intel/vulkan/anv_device.c | 26 ++--
>  src/intel/vulkan/anv_image.c  | 14 +--
>  src/intel/vulkan/anv_intel.c  |  4 +--
>  src/intel/vulkan/anv_pass.c   | 10 
>  src/intel/vulkan/anv_pipeline.c   |  6 ++---
>  src/intel/vulkan/anv_pipeline_cache.c |  8 +++---
>  src/intel/vulkan/anv_private.h| 46 
> +--
>  src/intel/vulkan/anv_query.c  |  6 ++---
>  src/intel/vulkan/anv_wsi.c|  2 +-
>  src/intel/vulkan/anv_wsi_wayland.c| 16 ++--
>  src/intel/vulkan/anv_wsi_x11.c| 22 -
>  src/intel/vulkan/gen7_pipeline.c  |  4 +--
>  src/intel/vulkan/gen8_pipeline.c  |  4 +--
>  src/intel/vulkan/genX_pipeline.c  |  6 ++---
>  src/intel/vulkan/genX_state.c |  2 +-
>  18 files changed, 103 insertions(+), 147 deletions(-)
>
Wondering whether one shouldn't include the new header only where needed?
Quick grep shows 33 files which include anv_private.h of which (as per
above) ~half only need vk_alloc.h.

Just an idea.
Emil


Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Jan Ziak
> Regarding C++ templates, the compiler doesn't use them. If u_vector
> (Dave Airlie?) provides the same functionality as your array, I
> suggest we use u_vector instead.

Let me repeat what you just wrote, because it is unbelievable: You are
advising the use of non-templated collection types in C++ code.

> If you can't use u_vector, you should
> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
> Kenneth Graunke) to use C++ templates.

- You are talking about coding rules some Mesa developers agreed upon
and didn't bother writing down for other developers to read

- I am not willing to use u_vector in C++ code

> I'll repeat some stuff about profiling here but also explain my perspective.

So far (which may be a year or so), there is no indication that you
are better at optimizing code than me.

> Never profile with -O0 or disabled function inlining.

Seriously?

> Mesa uses -g -O2
> with --enable-debug, so that's what you should use too. Don't use any
> other -O* variants.

What if I find a case where -O2 prevents me from easily seeing
information necessary to optimize the source code?

> The only profiling tools reporting correct results are perf and
> sysprof.

I used perf on Metro 2033 Redux and saw do_dead_code() there. Then I
used callgrind to see some more code.

> (both use the same mechanism) If you don't enable dwarf in
> perf (also sysprof can't use dwarf), you have to build Mesa with
> -fno-omit-frame-pointer to see call trees. The only reason you would
> want to enable dwarf-based call trees is when you want to see libc
> calls. Otherwise, they won't be displayed or counted as part of call
> trees. For Mesa developers who do profiling often,
> -fno-omit-frame-pointer should be your default.

> Callgrind counts calls (that one you can trust), but the reported time
> is incorrect,

Are you nuts? You cannot seriously be assuming that I didn't know about that.

> because it uses its own virtual model of a CPU. Avoid it
> if you want to measure time spent in functions.

I will *NOT* avoid callgrind because I know how to use it to optimize code.

>Marek

As usual, I would like to notify reviewers of this patch that I
am not willing to wait months to learn whether the code will be merged
or rejected.

If it isn't merged by Thursday (2016-oct-20) I will mark it as
rejected (rejected based on personal rather than scientific grounds).

Jan


Re: [Mesa-dev] [PATCH 2/2] st/glsl_to_tgsi: fix block copies of arrays of structs

2016-10-18 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Oct 17, 2016 at 7:25 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Use a full writemask in this case. This is relevant e.g. when a function
> has an inout argument which is an array of structs.
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 1662f7f..b91ebaf 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -2964,24 +2964,26 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
>
>   if (variable->data.location == FRAG_RESULT_DEPTH)
>  l.writemask = WRITEMASK_Z;
>   else {
>  assert(variable->data.location == FRAG_RESULT_STENCIL);
>  l.writemask = WRITEMASK_Y;
>   }
>} else if (ir->write_mask == 0) {
>   assert(!ir->lhs->type->is_scalar() && !ir->lhs->type->is_vector());
>
> - if (ir->lhs->type->is_array() || ir->lhs->type->is_matrix()) {
> -unsigned num_elements = 
> ir->lhs->type->without_array()->vector_elements;
> + unsigned num_elements = 
> ir->lhs->type->without_array()->vector_elements;
> +
> + if (num_elements) {
>  l.writemask = u_bit_consecutive(0, num_elements);
>   } else {
> +// The type is a struct or an array of (array of) structs.
>  l.writemask = WRITEMASK_XYZW;
>   }
>} else {
>   l.writemask = ir->write_mask;
>}
>
>for (int i = 0; i < 4; i++) {
>   if (l.writemask & (1 << i)) {
>  first_enabled_chan = GET_SWZ(r.swizzle, i);
>  break;
> --
> 2.7.4
>


Re: [Mesa-dev] [PATCH v2 05/16] loader: reimplement loader_get_user_preferred_fd via libdrm

2016-10-18 Thread Nicolai Hähnle

On 18.10.2016 18:01, Emil Velikov wrote:

On 18 October 2016 at 09:49, Nicolai Hähnle  wrote:

On 14.10.2016 20:21, Emil Velikov wrote:


From: Emil Velikov 

Currently not everyone has libudev and with follow-up patches we'll
completely remove the divergent codepaths.

Use the libdrm drm device API to construct the required ID_PATH_TAG-like
string, to preserve the current functionality for libudev users and
allow others to benefit from it as well.

v2: Drop ranty comments, pick the correct device

Cc: Axel Davy 
Signed-off-by: Emil Velikov 
---
 src/loader/loader.c | 247
++--
 1 file changed, 106 insertions(+), 141 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index ad4f946..06df05b 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c


[snip]


@@ -321,17 +232,60 @@ static char *loader_get_dri_config_device_id(void)
 }
 #endif

+static char *drm_construct_id_path_tag(drmDevicePtr device)
+{
+/* Length of "pci-xxxx_xx_xx_x\n" */
+#define PCI_ID_PATH_TAG_LENGTH 17
+   char *tag = NULL;
+
+   if (device->bustype == DRM_BUS_PCI) {
+tag = calloc(PCI_ID_PATH_TAG_LENGTH, sizeof(char));
+if (tag == NULL)
+return NULL;
+
+sprintf(tag, "pci-%04x_%02x_%02x_%1u",
device->businfo.pci->domain,
+device->businfo.pci->bus, device->businfo.pci->dev,
+device->businfo.pci->func);



Defensive programming would suggest to use snprintf.


Correct. It's more like extra defensive in this case but will fix.


Thanks :)
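
A minimal sketch of the snprintf variant discussed above, assuming the same
format string and 17-byte buffer as the patch. demo_pci_tag and its plain
unsigned parameters are hypothetical; the real code reads the fields from
the drmDevicePtr:

```c
#include <stdio.h>
#include <stdlib.h>

/* "pci-" (4) + domain (4) + '_' + bus (2) + '_' + dev (2) + '_' + func (1)
 * is 16 characters, plus the terminating NUL. */
#define DEMO_TAG_LENGTH 17

/* Returns a newly allocated tag, or NULL on allocation failure or if the
 * formatted string would not fit in the buffer. */
static char *demo_pci_tag(unsigned domain, unsigned bus,
                          unsigned dev, unsigned func)
{
   char *tag = calloc(DEMO_TAG_LENGTH, sizeof(char));
   if (tag == NULL)
      return NULL;

   int n = snprintf(tag, DEMO_TAG_LENGTH, "pci-%04x_%02x_%02x_%1u",
                    domain, bus, dev, func);
   if (n < 0 || n >= DEMO_TAG_LENGTH) {
      /* Output error or truncation: do not hand back a partial tag. */
      free(tag);
      return NULL;
   }
   return tag;
}
```

snprintf never writes past the buffer, and its return value shows whether
the output was truncated, so an out-of-range field yields NULL instead of a
heap overflow.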



[snip]


@@ -345,55 +299,66 @@ int loader_get_user_preferred_fd(int default_fd, int
*different_device)
   return default_fd;
}

-   udev = udev_new();
-   if (!udev)
-  goto prime_clean;
+   default_tag = drm_get_id_path_tag_for_fd(default_fd);
+   if (default_tag == NULL)
+  goto err;

-   default_device_id_path_tag = get_id_path_tag_from_fd(udev,
default_fd);
-   if (!default_device_id_path_tag)
-  goto udev_clean;
+   num_devices = drmGetDevices(devices, MAX_DRM_DEVICES);
+   if (num_devices < 0)
+  goto err;

-   is_different_device = 1;
/* two format are supported:
 * "1": choose any other card than the card used by default.
 * id_path_tag: (for example "pci-_02_00_0") choose the card
 * with this id_path_tag.
 */
if (!strcmp(prime,"1")) {
-  free(prime);
-  prime = strdup(default_device_id_path_tag);
-  /* request a card with a different card than the default card */
-  another_tag = 1;
-   } else if (!strcmp(default_device_id_path_tag, prime))
-  /* we are to get a new fd (render-node) of the same device */
-  is_different_device = 0;
-
-   device_name = get_render_node_from_id_path_tag(udev,
-  prime,
-  another_tag);
-   if (device_name == NULL) {
-  is_different_device = 0;
-  goto default_device_clean;
+  /* Hmm... detection for 2-7 seems to be broken. Oh well ...
+   * Pick the first render device that is not our own.
+   */
+  for (i = 0; i < num_devices; i++) {
+ if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
+ !drm_device_matches_tag(devices[i], default_tag)) {
+
+found = true;
+break;
+ }
+  }
+   } else {
+  for (i = 0; i < num_devices; i++) {
+ if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
+drm_device_matches_tag(devices[i], prime)) {
+
+found = true;
+break;
+ }
+  }



I feel like it would be helpful to have a warning here if the device was not
found. This could avoid some confusion when people inevitably typo their
prime setting.


Original code does not have such a message, so let's add it as follow-up ?


Fine by me.

Cheers,
Nicolai



Emil




[Mesa-dev] [PATCH] egl/surfaceless: Fix segfault in eglSwapBuffers

2016-10-18 Thread Chad Versace
Since commit 63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless
platform has allowed creation of pbuffer surfaces. But the vtable entry
for eglSwapBuffers has remained NULL.

Discovered by running a little pbuffer test.

Cc: Gurchetan Singh 
---
 src/egl/drivers/dri2/platform_surfaceless.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_surfaceless.c 
b/src/egl/drivers/dri2/platform_surfaceless.c
index fcf7d69..a55c5f1 100644
--- a/src/egl/drivers/dri2/platform_surfaceless.c
+++ b/src/egl/drivers/dri2/platform_surfaceless.c
@@ -178,6 +178,17 @@ dri2_surfaceless_create_pbuffer_surface(_EGLDriver *drv, 
_EGLDisplay *disp,
 }
 
 static EGLBoolean
+surfaceless_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
+{
+   assert(!surf || surf->Type == EGL_PBUFFER_BIT);
+
+   /* From the EGL 1.5 spec:
+*If surface is a [...] pbuffer surface, eglSwapBuffers has no effect.
+*/
+   return EGL_TRUE;
+}
+
+static EGLBoolean
 surfaceless_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
 {
struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
@@ -223,6 +234,7 @@ static struct dri2_egl_display_vtbl 
dri2_surfaceless_display_vtbl = {
.destroy_surface = surfaceless_destroy_surface,
.create_image = dri2_create_image_khr,
.swap_interval = dri2_fallback_swap_interval,
+   .swap_buffers = surfaceless_swap_buffers,
.swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage,
.swap_buffers_region = dri2_fallback_swap_buffers_region,
.post_sub_buffer = dri2_fallback_post_sub_buffer,
-- 
2.10.0



[Mesa-dev] [PATCH] svga: minor code improvements in svga_validate_pipe_sampler_view()

2016-10-18 Thread Brian Paul
Use the 'texture' local var in more places.
Rename 'pFormat' to 'viewFormat'.
---
 src/gallium/drivers/svga/svga_state_sampler.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_sampler.c 
b/src/gallium/drivers/svga/svga_state_sampler.c
index 53bb80f..445afcc 100644
--- a/src/gallium/drivers/svga/svga_state_sampler.c
+++ b/src/gallium/drivers/svga/svga_state_sampler.c
@@ -135,21 +135,21 @@ svga_validate_pipe_sampler_view(struct svga_context *svga,
   SVGA3dSurfaceFormat format;
   SVGA3dResourceType resourceDim;
   SVGA3dShaderResourceViewDesc viewDesc;
-  enum pipe_format pformat = sv->base.format;
+  enum pipe_format viewFormat = sv->base.format;
 
   /* vgpu10 cannot create a BGRX view for a BGRA resource, so force it to
* create a BGRA view (and vice versa).
*/
-  if (pformat == PIPE_FORMAT_B8G8R8X8_UNORM &&
-  sv->base.texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
- pformat = PIPE_FORMAT_B8G8R8A8_UNORM;
+  if (viewFormat == PIPE_FORMAT_B8G8R8X8_UNORM &&
+  texture->format == PIPE_FORMAT_B8G8R8A8_UNORM) {
+ viewFormat = PIPE_FORMAT_B8G8R8A8_UNORM;
   }
-  else if (pformat == PIPE_FORMAT_B8G8R8A8_UNORM &&
-  sv->base.texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
- pformat = PIPE_FORMAT_B8G8R8X8_UNORM;
+  else if (viewFormat == PIPE_FORMAT_B8G8R8A8_UNORM &&
+  texture->format == PIPE_FORMAT_B8G8R8X8_UNORM) {
+ viewFormat = PIPE_FORMAT_B8G8R8X8_UNORM;
   }
 
-  format = svga_translate_format(ss, pformat,
+  format = svga_translate_format(ss, viewFormat,
  PIPE_BIND_SAMPLER_VIEW);
   assert(format != SVGA3D_FORMAT_INVALID);
 
-- 
1.9.1



Re: [Mesa-dev] [PATCH 6/6] radeonsi: rename prefixes from radeon to si

2016-10-18 Thread Marek Olšák
On Tue, Oct 18, 2016 at 6:28 PM, Emil Velikov  wrote:
> On 17 October 2016 at 14:44, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/drivers/radeonsi/si_pipe.c |   2 +-
>>  src/gallium/drivers/radeonsi/si_shader.c   |  96 ++---
>>  src/gallium/drivers/radeonsi/si_shader_internal.h  |  70 +-
>>  .../drivers/radeonsi/si_shader_tgsi_setup.c| 150 
>> ++---
>>  4 files changed, 159 insertions(+), 159 deletions(-)
>>
> From build POV everything is perfect thanks Marek ! For those
> Reviewed-by: Emil Velikov 

Thanks.

>
> Humble suggestion - set the following for friendlier patches ;-)
> $ git config --global diff.renames true

I forgot to remove one (almost empty) file, so a rename wasn't
detected properly. I do send all my Mesa patches with git send-email
-M.

Marek


Re: [Mesa-dev] [PATCH 6/6] radeonsi: rename prefixes from radeon to si

2016-10-18 Thread Emil Velikov
On 17 October 2016 at 14:44, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c |   2 +-
>  src/gallium/drivers/radeonsi/si_shader.c   |  96 ++---
>  src/gallium/drivers/radeonsi/si_shader_internal.h  |  70 +-
>  .../drivers/radeonsi/si_shader_tgsi_setup.c| 150 
> ++---
>  4 files changed, 159 insertions(+), 159 deletions(-)
>
From build POV everything is perfect thanks Marek ! For those
Reviewed-by: Emil Velikov 

Humble suggestion - set the following for friendlier patches ;-)
$ git config --global diff.renames true

Emil


[Mesa-dev] [PATCH] radeonsi: eliminate trivial constant VS outputs

2016-10-18 Thread Marek Olšák
From: Marek Olšák 

These constant value VS PARAM exports:
- 0,0,0,0
- 0,0,0,1
- 1,1,1,0
- 1,1,1,1
can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports
can be removed from the IR to save export & parameter memory.

After LLVM optimizations, analyze the IR to see which exports are equal to
the ones listed above (or undef) and remove them if they are.

Targeted use cases:
- All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them
  are unused by PS (such as Witcher 2 below).
- VS output arrays with unused elements that the GLSL compiler can't
  eliminate (such as Batman below).

The shader-db deltas are quite interesting:
(not from upstream si-report.py, it won't be upstreamed)

PERCENTAGE DELTAS        Shaders  PARAM exports (affected only)
batman_arkham_origins        589  -67.17 %
bioshock-infinite           1769   -0.47 %
dirt-showdown                548   -2.68 %
dota2                       1747   -3.36 %
f1-2015                      776   -4.94 %
left_4_dead_2               1762   -0.07 %
metro_2033_redux            2670   -0.43 %
portal                       474   -0.22 %
talos_principle              324   -3.63 %
warsow                       176   -2.20 %
witcher2                    1040  -73.78 %

All affected                 991  -65.37 %  ... 9681 -> 3353

Total                      26725  -10.82 %  ... 58490 -> 52162
---
 src/gallium/drivers/radeonsi/si_shader.c| 154 
 src/gallium/drivers/radeonsi/si_shader.h|  11 ++
 src/gallium/drivers/radeonsi/si_state_shaders.c |  17 ++-
 3 files changed, 180 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index a361418..7fc1df4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6593,20 +6593,167 @@ static void si_init_shader_ctx(struct 
si_shader_context *ctx,
bld_base->op_actions[TGSI_OPCODE_EMIT].emit = si_llvm_emit_vertex;
bld_base->op_actions[TGSI_OPCODE_ENDPRIM].emit = si_llvm_emit_primitive;
bld_base->op_actions[TGSI_OPCODE_BARRIER].emit = si_llvm_emit_barrier;
 
bld_base->op_actions[TGSI_OPCODE_MAX].emit = build_tgsi_intrinsic_nomem;
bld_base->op_actions[TGSI_OPCODE_MAX].intr_name = "llvm.maxnum.f32";
bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
 }
 
+/* Return true if the PARAM export has been eliminated. */
+static bool si_eliminate_const_output(struct si_shader_context *ctx,
+ LLVMValueRef inst, unsigned offset)
+{
+   struct si_shader *shader = ctx->shader;
+   unsigned num_outputs = shader->selector->info.num_outputs;
+   double v[4];
+   unsigned i, default_val; /* SPI_PS_INPUT_CNTL_i.DEFAULT_VAL */
+
+   for (i = 0; i < 4; i++) {
+   LLVMBool loses_info;
+   LLVMValueRef p = LLVMGetOperand(inst, 5 + i);
+   if (!LLVMIsConstant(p))
+   return false;
+
+   /* It's a constant expression. Undef outputs are eliminated 
too. */
+   if (LLVMIsUndef(p))
+   v[i] = 0;
+   else
+   v[i] = LLVMConstRealGetDouble(p, &loses_info);
+
+   if (v[i] != 0 && v[i] != 1)
+   return false;
+   }
+
+   /* Only certain combinations of 0 and 1 can be eliminated. */
+   if (v[0] == 0 && v[1] == 0 && v[2] == 0)
+   default_val = v[3] == 0 ? 0 : 1;
+   else if (v[0] == 1 && v[1] == 1 && v[2] == 1)
+   default_val = v[3] == 0 ? 2 : 3;
+   else
+   return false;
+
+   /* The PARAM export can be represented as DEFAULT_VAL. Kill it. */
+   LLVMInstructionEraseFromParent(inst);
+
+   /* Change OFFSET to DEFAULT_VAL. */
+   for (i = 0; i < num_outputs; i++) {
+   if (shader->info.vs_output_param_offset[i] == offset) {
+   shader->info.vs_output_param_offset[i] =
+   EXP_PARAM_DEFAULT_VAL_ + default_val;
+   break;
+   }
+   }
+   return true;
+}
+
+struct si_vs_exports {
+   unsigned num;
+   unsigned offset[SI_MAX_VS_OUTPUTS];
+   LLVMValueRef inst[SI_MAX_VS_OUTPUTS];
+};
+
+static void si_eliminate_const_vs_outputs(struct si_shader_context *ctx)
+{
+   struct si_shader *shader = ctx->shader;
+   struct tgsi_shader_info *info = &shader->selector->info;
+   LLVMBasicBlockRef bb;
+   struct si_vs_exports exports;
+   bool removed_any = false;
+
+   exports.num = 0;
+
+   if ((ctx->type == PIPE_SHADER_VERTEX &&
+(shader->key.vs.as_es || shader->key.vs.as_ls)) ||
+   (ctx->type == PIPE_SHADER_TESS_EVAL && shader->key.tes.as_es))
+

Re: [Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

2016-10-18 Thread Marek Olšák
On Tue, Oct 18, 2016 at 3:55 PM, Eero Tamminen
 wrote:
> Hi,
>
> On 18.10.2016 16:25, Jan Ziak wrote:
>>
>> On Tue, Oct 18, 2016 at 3:12 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> On 18.10.2016 15:07, Jan Ziak wrote:
>>>>
>>>> On Tue Oct 18 09:29:59 UTC 2016, Eero Tamminen wrote:
>>>>>
>>>>> On 18.10.2016 01:07, Jan Ziak wrote:
>>>>>>
>>>>>> - The total number of executed instructions goes down from 64.184 to
>>>>>>   63.797 giga-instructions when Mesa is compiled with "gcc -O0 ..."
>>>>>
>>>>>
>>>>> Please don't do performance related decisions based on data from
>>>>> compiling code with optimizations disabled.  Use -O2 or -O3 (or even
>>>>> better, check both).
>>>>
>>>>
>>>> Options -O2 and -O3 interfere with profiling tools.
>>>>
>>>> I will try using -Og the next time.
>>>
>>>
>>> Just stop and use proper profiling tools like perf that can work with
>>> optimized tools.
>
>
> Valgrind/callgrind/cachegrind works also fine with optimized binaries.
>
> All profiling tools lie, at least a bit. It's better to know their strengths
> and weaknesses so that one knows which ones complement each other. Perf is
> e.g. good at finding hotspots, Valgrind (callgrind) is more reliable in
> telling how they get called.
>
> One may also need a GCC version from this decade.  Really old GCC versions
> didn't include all debug info needed for debugging optimized binaries.

Regarding C++ templates, the compiler doesn't use them. If u_vector
(Dave Airlie?) provides the same functionality as your array, I
suggest we use u_vector instead. If you can't use u_vector, you should
ask for approval from GLSL compiler leads (e.g. Ian Romanick or
Kenneth Graunke) to use C++ templates.


I'll repeat some stuff about profiling here but also explain my perspective.

Never profile with -O0 or disabled function inlining. Mesa uses -g -O2
with --enable-debug, so that's what you should use too. Don't use any
other -O* variants.

The only profiling tools reporting correct results are perf and
sysprof. (both use the same mechanism) If you don't enable dwarf in
perf (also sysprof can't use dwarf), you have to build Mesa with
-fno-omit-frame-pointer to see call trees. The only reason you would
want to enable dwarf-based call trees is when you want to see libc
calls. Otherwise, they won't be displayed or counted as part of call
trees. For Mesa developers who do profiling often,
-fno-omit-frame-pointer should be your default.

Callgrind counts calls (that one you can trust), but the reported time
is incorrect, because it uses its own virtual model of a CPU. Avoid it
if you want to measure time spent in functions.

Marek

