Re: [Mesa-dev] [PATCH] i965: Fix GLX_MESA_query_renderer video memory on 32-bit.

2017-03-30 Thread Kenneth Graunke
On Thursday, March 30, 2017 6:48:38 PM PDT Kenneth Graunke wrote:
> On Thursday, March 30, 2017 4:38:14 PM PDT Chris Wilson wrote:
> > On Thu, Mar 30, 2017 at 04:28:19PM -0700, Kenneth Graunke wrote:
> > > On modern systems with 4GB apertures, the size in bytes is 4294967296,
> > > or (1ull << 32).  The kernel gives us the aperture size as a __u64,
> > > which works out great.
> > > 
> > > Unfortunately, libdrm "helpfully" returns the data as a size_t, which
> > > on 32-bit systems means it truncates the aperture size to 0 bytes.
> > > We've happily reported this value as 0 MB of video memory via
> > > GLX_MESA_query_renderer since it was originally exposed.
> > > 
> > > This patch bypasses libdrm and calls the ioctl ourselves so we can
> > > use a proper uint64_t, avoiding the 32-bit integer overflow.  We now
> > > report a proper video memory size on 32-bit systems.
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_screen.c | 16 
> > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> > > b/src/mesa/drivers/dri/i965/intel_screen.c
> > > index 811a9c5a867..f94e8a77c10 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_screen.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> > > @@ -950,6 +950,17 @@ static const __DRIimageExtension intelImageExtension 
> > > = {
> > >  .createImageWithModifiers   = 
> > > intel_create_image_with_modifiers,
> > >  };
> > >  
> > > +static uint64_t
> > > +get_aperture_size(int fd)
> > > +{
> > > +   struct drm_i915_gem_get_aperture aperture;
> > > +
> > > +   if (drmIoctl(fd, DRM_IOCTL_I915_GEM_GET_APERTURE, &aperture) != 0)
> > > +  return 0;
> > 
> > The aperture is nothing to do with the video memory limits... You want
> > to query the context for the size of the GTT, e.g.
> > https://patchwork.freedesktop.org/patch/62189/
> > 
> > i.e.
> > > static uint64_t get_gtt_size(int fd)
> > > {
> > >    struct drm_i915_gem_context_param p;
> > >    size_t mappable_size, aper_size;
> > >
> > >    memset(&p, 0, sizeof(p));
> > >    p.param = I915_CONTEXT_PARAM_GTT_SIZE;
> > >    if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p) == 0)
> > >       return p.value;
> > >
> > >    /* do something useful for old kernels */
> > >
> > >    drm_intel_get_aperture_sizes(fd, &mappable_size, &aper_size);
> > >
> > >    return aper_size;
> > > }
> 
> It's somewhat debatable what a unified memory GPU should return for a
> "Number of megabytes of video memory available to the renderer" query,
> as there really isn't a concept of video RAM.
> 
> When Ian implemented this, he chose to pick the amount of memory that
> a single batch can reference, which is 3/4 of the aperture.  This may
> be too small - applications can certainly use more memory than that.
> However, their working set for a draw had better fit within this limit,
> or else there will be a performance penalty.  I think it's pretty
> reasonable for what applications want.  They're trying to gauge how
> high-res their textures can be without incurring penalties.
> 
> I know Ian also spoke with a number of game vendors when drafting
> and implementing this extension, so I'm inclined to trust his
> interpretation.
> 
> I don't think exposing GTT_SIZE is useful.  With 48-bit addressing
> and PPGTT, the result of that query will be (1ull << 48) aka 256
> terabytes.  There is no way in hell that an application can use
> that much RAM.  We could restrict it to the total system RAM, but
> even then, it would not be performant to try and use all of RAM.

Okay, I missed that right below, we limit it to the amount of system
RAM.  So it'll never be 256 terabytes.  That's more reasonable.
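
For the numbers involved, here is a minimal sketch of such a clamp. It assumes
the GTT size and the system RAM size are both available as byte counts;
clamp_to_system_ram() is a hypothetical helper for illustration, not the
function from the linked patch.

```c
#include <stdint.h>

/* Full 48-bit PPGTT address space: 1ull << 48 bytes = 256 TiB. */
#define GTT_SIZE_48BIT (1ull << 48)

/* Hypothetical helper: never advertise more than the system RAM size. */
static uint64_t
clamp_to_system_ram(uint64_t gtt_bytes, uint64_t system_ram_bytes)
{
   return gtt_bytes < system_ram_bytes ? gtt_bytes : system_ram_bytes;
}
```

With 16 GiB of RAM, clamp_to_system_ram(GTT_SIZE_48BIT, 16ull << 30) yields
16 GiB rather than 256 TiB.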

Still not sure whether it's the right thing to do, though.

--Ken


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] egl/android: Dequeue buffers inside EGL calls

2017-03-30 Thread Rob Clark
On Fri, Mar 31, 2017 at 12:22 AM, Tapani Pälli  wrote:
>
>
> On 03/30/2017 05:57 PM, Emil Velikov wrote:
>>
>> On 30 March 2017 at 15:30, Tomasz Figa  wrote:
>>>
>>> On Thu, Mar 30, 2017 at 11:17 PM, Emil Velikov 
>>> wrote:


>>>> On 30 March 2017 at 11:55, Tomasz Figa  wrote:
>>>>>
>>>>> Android buffer queues can be abandoned, which results in failing to
>>>>> dequeue next buffer. Currently this would fail somewhere deep within
>>>>> the DRI stack calling loader's getBuffers*(), without any error
>>>>> reporting to the client app. However Android framework code relies on
>>>>> proper signaling of this event, so we move buffer dequeue to
>>>>> createWindowSurface() and swapBuffers() call, which can generate proper
>>>>> EGL errors. To keep the performance benefits of delayed buffer handling,
>>>>> if any, fence wait and DRI image creation is kept delayed until
>>>>> getBuffers*() is called by the DRI driver.
>>>>>
>>>> Thank you Tomasz.
>>>>
>>>> I'm fairly confident that this should resolve the crash [in
>>>> swap_buffers] that Mauro was seeing.
>>>> Mauro can you give it a test ?
>>>
>>>
>>> Ah, I actually noticed a problem with existing code, supposedly fixed
>>> by [1], but I'm afraid it's still wrong.
>>>
>>> Current swap_buffers calls get_back_bo(), but doesn't call
>>> update_buffers(), which is the function that should be called before
>>> to actually dequeue a buffer from Android's buffer queue. Given that,
>>> get_back_bo() would simply fail with !dri2_surf->buffer, because no
>>> buffer was dequeued.
>>>
>> Right - I was wondering why we don't hit that on EGL/GBM or EGL/Wayland.
>> From a quick look - may be because EGL/Android drops the dpy mutex in
>> droid_window_enqueue_buffer().
>>
>>> My patch removes update_buffers() and changes the buffer management so
>>> that there is always a buffer dequeued, starting from surface
>>> creation, unless there was an error somewhere.
>>>
>> Off the top of your head - is there something stopping us from using
>> the same method on $other platforms?
>>
>>> [1]
>>> https://cgit.freedesktop.org/mesa/mesa/commit/src/egl/drivers/dri2/platform_android.c?id=4d4558411db166d2d66f8cec9cb581149dbe1597
>>>


>>>> Not that huge of an expert on the Android specifics, so just a humble
>>>> request:
>>>> Can we seek the code reshuffle (droid_{alloc,free}_local_buffer,
>>
>> Oops silly typo - s/seek/split/.
>>
>>>> other?) separate from the functionality changes ?
>>>
>>>
>>> Sure. Thanks for suggestion.
>>>
>> Please give it a day or two for others to comment.
>
>
> I'm trying to debug why this causes our homescreen (wallpaper) to be black.
> Otherwise I haven't seen any issues with these changes.
>

wallpaper seems to be a special sorta hell..  I wonder if there is
somehow some sort of interaction with what I fixed / worked-around in
a5e733c6b52e93de3000647d075f5ca2f55fcb71 ??

Maybe at least try commenting out the temp-pbuffer thing to get max
texture size, and see if that "fixes" things

BR,
-R


Re: [Mesa-dev] [PATCH 1/3] gm107/ir: Emit SV_CLOCK system value

2017-03-30 Thread Boyan Ding
2017-03-31 11:21 GMT+08:00 Ilia Mirkin :
> Did you check what the blob does? There's clocklo/hi and
> globaltimerlo/hi. Without additional documentation, it's a bit hard to
> tell the difference... Note that envydis's gf100.c/gk110.c disagree on
> which is which. Probably not due to any architectural reasons, but due
> to RE methodology. (From before nvdisasm was
> available/trusted/used/whatever.)

(replying to your concern in 1 and 2 at the same time)

I have checked against the blob and nvdisasm before, and gk110.c in
envydis was actually wrong. I made a PR for that [1].

This is what I get when using clockARB() on GK208:
281c0006 8640 mov b32 $r1 $sr80
289c0002 8640 mov b32 $r0 $sr81
(note $r1 <- $sr80, $r0 <- $sr81, and they are called SR_CLOCKLO and
SR_CLOCKHI in nvdisasm respectively)

I haven't really checked with maxwell+, just believing in the
correctness in envydis and uniformity between architectures. But I
will check when I reach my pascal machine later.
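
The register numbers in that disassembly are decimal, so they line up with the
0x50 base used in the patch. A tiny sketch, assuming Maxwell uses the same
numbering (not yet verified, as noted above); sv_clock_sr_id() is a
hypothetical stand-in for the emitSYS() case:

```c
/* Mirrors "id = 0x50 + index" from the patch: index 0 selects
 * $sr80 (SR_CLOCKLO), index 1 selects $sr81 (SR_CLOCKHI). */
static unsigned
sv_clock_sr_id(unsigned index)
{
   return 0x50 + index;
}
```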

Cheers.
Boyan Ding

[1] https://github.com/envytools/envytools/pull/84

>
> On Thu, Mar 30, 2017 at 10:33 PM, Boyan Ding  wrote:
>> Signed-off-by: Boyan Ding 
>> ---
>>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> index 6de3f396e3..ab9c94b4d0 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> @@ -269,6 +269,7 @@ CodeEmitterGM107::emitSYS(int pos, const Value *val)
>> case SV_INVOCATION_INFO: id = 0x1d; break;
>> case SV_TID: id = 0x21 + val->reg.data.sv.index; break;
>> case SV_CTAID  : id = 0x25 + val->reg.data.sv.index; break;
>> +   case SV_CLOCK  : id = 0x50 + val->reg.data.sv.index; break;
>> default:
>>assert(!"invalid system value");
>>id = 0;
>> --
>> 2.12.0
>>


Re: [Mesa-dev] [RFC PATCH] egl/android: Dequeue buffers inside EGL calls

2017-03-30 Thread Tapani Pälli



On 03/30/2017 05:57 PM, Emil Velikov wrote:

On 30 March 2017 at 15:30, Tomasz Figa  wrote:

On Thu, Mar 30, 2017 at 11:17 PM, Emil Velikov  wrote:


On 30 March 2017 at 11:55, Tomasz Figa  wrote:

Android buffer queues can be abandoned, which results in failing to
dequeue next buffer. Currently this would fail somewhere deep within
the DRI stack calling loader's getBuffers*(), without any error
reporting to the client app. However Android framework code relies on
proper signaling of this event, so we move buffer dequeue to
createWindowSurface() and swapBuffers() call, which can generate proper
EGL errors. To keep the performance benefits of delayed buffer handling,
if any, fence wait and DRI image creation is kept delayed until
getBuffers*() is called by the DRI driver.


Thank you Tomasz.

I'm fairly confident that this should resolve the crash [in
swap_buffers] that Mauro was seeing.
Mauro can you give it a test ?


Ah, I actually noticed a problem with existing code, supposedly fixed
by [1], but I'm afraid it's still wrong.

Current swap_buffers calls get_back_bo(), but doesn't call
update_buffers(), which is the function that should be called before
to actually dequeue a buffer from Android's buffer queue. Given that,
get_back_bo() would simply fail with !dri2_surf->buffer, because no
buffer was dequeued.


Right - I was wondering why we don't hit that on EGL/GBM or EGL/Wayland.
From a quick look - may be because EGL/Android drops the dpy mutex in
droid_window_enqueue_buffer().


My patch removes update_buffers() and changes the buffer management so
that there is always a buffer dequeued, starting from surface
creation, unless there was an error somewhere.
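
The buffer management described here can be sketched as a toy model. All
names below (fake_queue, fake_surface, the result codes) are hypothetical and
only illustrate the idea of dequeuing eagerly so that a failure is reported by
the call the application made; this is not the code from the patch.

```c
#include <stdbool.h>

enum result { RESULT_OK, RESULT_BAD_WINDOW };   /* hypothetical codes */

struct fake_queue { bool abandoned; int next_buffer; };
struct fake_surface { struct fake_queue *queue; int buffer; /* -1: none */ };

/* Dequeue right away; fails if the queue has been abandoned. */
static enum result
surface_dequeue(struct fake_surface *s)
{
   if (s->queue->abandoned) {
      s->buffer = -1;
      return RESULT_BAD_WINDOW;
   }
   s->buffer = s->queue->next_buffer++;
   return RESULT_OK;
}

/* Surface creation already holds a dequeued buffer on success. */
static enum result
surface_create(struct fake_surface *s, struct fake_queue *q)
{
   s->queue = q;
   return surface_dequeue(s);
}

/* Swap: queuing the current buffer is not modelled; the next buffer is
 * dequeued immediately so an abandoned queue is reported here. */
static enum result
surface_swap(struct fake_surface *s)
{
   return surface_dequeue(s);
}
```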


Off the top of your head - is there something stopping us from using
the same method on $other platforms?


[1] 
https://cgit.freedesktop.org/mesa/mesa/commit/src/egl/drivers/dri2/platform_android.c?id=4d4558411db166d2d66f8cec9cb581149dbe1597




Not that huge of an expert on the Android specifics, so just a humble request:
Can we seek the code reshuffle (droid_{alloc,free}_local_buffer,

Oops silly typo - s/seek/split/.


other?) separate from the functionality changes ?


Sure. Thanks for suggestion.


Please give it a day or two for others to comment.


I'm trying to debug why this causes our homescreen (wallpaper) to be 
black. Otherwise I haven't seen any issues with these changes.


Thanks;

// Tapani


Re: [Mesa-dev] [PATCH] glsl: use -O0 optimization for builtin_functions.cpp with MinGW

2017-03-30 Thread Brian Paul

On 03/30/2017 09:23 PM, Brian Paul wrote:

Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code
with -O2 or -O3 causing a random driver crash when running programs
that use GLSL.  Most Mesa demos in the glsl/ directory trigger the
bug, but not the fragcoord.c test.

Use a #pragma to force -O1 for this file for later MinGW versions.
Luckily, this is basically one-time setup code.  I suspect the bug
is related to the sheer size of this file.

This should let us move to newer versions of MinGW-w64 for Mesa.
---
  src/compiler/glsl/builtin_functions.cpp | 20 
  1 file changed, 20 insertions(+)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index e30509a..e32b18c 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -53,6 +53,26 @@
   *name and parameters.
   */

+
+/**
+ * Unfortunately, some versions of MinGW produce bad code if this file
+ * is compiled with -O2 or -O3.  The resulting driver will crash in random
+ * places if the app uses GLSL.
+ * The work-around is to disable optimizations for just this file.  Luckily,
+ * this code is basically just executed once.
+ *
+ * MinGW 4.6.3 (in Ubuntu 13.10) does not have this bug.
+ * MinGW 5.3.1 (in Ubuntu 16.04) definitely has this bug.
+ * MinGW 6.2.0 (in Ubuntu 16.10) definitely has this bug.
+ * MinGW x.y.z - don't know.  Assume versions after 4.6.x are buggy
+ */
+
+#if defined(__MINGW32__) && ((__GNUC__ * 100) + __GNUC_MINOR__ >= 407)
+#warning "disabling optimizations for this file to work around compiler bug"
+#pragma GCC optimize("O0")


Ugh, that should be "O1".  Fixed locally and verified.

-Brian


+#endif
+
+
  #include 
  #include 
  #include "main/core.h" /* for struct gl_shader */





Re: [Mesa-dev] [PATCH 2/3] nvc0/ir: Handle TGSI_OPCODE_CLOCK

2017-03-30 Thread Ilia Mirkin
On Thu, Mar 30, 2017 at 10:33 PM, Boyan Ding  wrote:
> Signed-off-by: Boyan Ding 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 7aaeedf8dd..9fbd3c0d30 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -3410,6 +3410,11 @@ Converter::handleInstruction(const struct 
> tgsi_full_instruction *insn)
>   mkCvt(OP_CVT, TYPE_U32, dst0[c], TYPE_U8, val0);
>}
>break;
> +   case TGSI_OPCODE_CLOCK:
> +  // The shifting is weird, but that's how they made it
> +  mkOp1(OP_RDSV, TYPE_U32, dst0[1], mkSysVal(SV_CLOCK, 0))->fixed = 1;
> +  mkOp1(OP_RDSV, TYPE_U32, dst0[0], mkSysVal(SV_CLOCK, 1))->fixed = 1;

How sure are you about this? Shouldn't clocklo go into dst[0] and
clockhi go into dst[1]? This is confirmed by

"""
clock2x32ARB() returns
the same value encoded as a two-component vector of 32-bit unsigned integers
with the first component containing the 32 least significant bits and the
second component containing the 32 most significant bits.
"""

Did the tests fail without that? Perhaps that indicates something else is wrong?
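
For reference, the component order in the quoted spec text corresponds to the
following recombination. This is a plain C sketch with a hypothetical helper
name; it only illustrates the lo/hi convention and says nothing about which
special register holds which half.

```c
#include <stdint.h>

/* clock2x32ARB() convention: v[0] = low 32 bits, v[1] = high 32 bits. */
static uint64_t
clock_from_2x32(const uint32_t v[2])
{
   return ((uint64_t) v[1] << 32) | v[0];
}
```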

> +  break;
> case TGSI_OPCODE_KILL_IF:
>val0 = new_LValue(func, FILE_PREDICATE);
>mask = 0;
> --
> 2.12.0
>


[Mesa-dev] [PATCH] glsl: use -O0 optimization for builtin_functions.cpp with MinGW

2017-03-30 Thread Brian Paul
Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code
with -O2 or -O3 causing a random driver crash when running programs
that use GLSL.  Most Mesa demos in the glsl/ directory trigger the
bug, but not the fragcoord.c test.

Use a #pragma to force -O1 for this file for later MinGW versions.
Luckily, this is basically one-time setup code.  I suspect the bug
is related to the sheer size of this file.

This should let us move to newer versions of MinGW-w64 for Mesa.
---
 src/compiler/glsl/builtin_functions.cpp | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index e30509a..e32b18c 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -53,6 +53,26 @@
  *name and parameters.
  */
 
+
+/**
+ * Unfortunately, some versions of MinGW produce bad code if this file
+ * is compiled with -O2 or -O3.  The resulting driver will crash in random
+ * places if the app uses GLSL.
+ * The work-around is to disable optimizations for just this file.  Luckily,
+ * this code is basically just executed once.
+ *
+ * MinGW 4.6.3 (in Ubuntu 13.10) does not have this bug.
+ * MinGW 5.3.1 (in Ubuntu 16.04) definitely has this bug.
+ * MinGW 6.2.0 (in Ubuntu 16.10) definitely has this bug.
+ * MinGW x.y.z - don't know.  Assume versions after 4.6.x are buggy
+ */
+
+#if defined(__MINGW32__) && ((__GNUC__ * 100) + __GNUC_MINOR__ >= 407)
+#warning "disabling optimizations for this file to work around compiler bug"
+#pragma GCC optimize("O0")
+#endif
+
+
 #include 
 #include 
 #include "main/core.h" /* for struct gl_shader */
-- 
1.9.1
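
A note on the version test: GCC predefines __GNUC__ and __GNUC_MINOR__, and
the check encodes them as major * 100 + minor. The sketch below shows that
arithmetic with ordinary integers; needs_workaround() is a hypothetical helper
for illustration only.

```c
/* Same encoding as the #if in the patch: major * 100 + minor. */
#define GCC_VERSION_CODE(major, minor) ((major) * 100 + (minor))

/* Hypothetical predicate: treat anything newer than 4.6.x as affected. */
static int
needs_workaround(int major, int minor)
{
   return GCC_VERSION_CODE(major, minor) >= 407;
}
```

Under this encoding 4.6 maps to 406 (not affected), while 5.3 and 6.2 map to
503 and 602 (affected).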



[Mesa-dev] [PATCH] glsl: add null pointer check in print_without_declaration()

2017-03-30 Thread Brian Paul
To avoid/fix a segmentation fault when running the stand-alone GLSL
compiler utility for cases such as the Mesa demos toyball test:

glsl_compiler --dump-builder --version 120 CH11-toyball.vert CH11-toyball.frag
---
 src/compiler/glsl/ir_builder_print_visitor.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_builder_print_visitor.cpp 
b/src/compiler/glsl/ir_builder_print_visitor.cpp
index 825dbe1..164a237 100644
--- a/src/compiler/glsl/ir_builder_print_visitor.cpp
+++ b/src/compiler/glsl/ir_builder_print_visitor.cpp
@@ -581,7 +581,9 @@ ir_builder_print_visitor::print_without_declaration(const 
ir_expression *ir)
  const struct hash_entry *const he =
 _mesa_hash_table_search(index_map, ir->operands[i]);
 
- print_without_indent("r%04X", (unsigned)(uintptr_t) he->data);
+ if (he) {
+print_without_indent("r%04X", (unsigned)(uintptr_t) he->data);
+ }
   }
 
   if (i < num_op - 1)
-- 
1.9.1
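
The guard pattern in this patch can be sketched in isolation as follows. The
linear search is only a stand-in for the hash table lookup, and the struct and
function names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

struct entry { const void *key; unsigned index; };

/* Linear search standing in for a hash table lookup; NULL on a miss. */
static const struct entry *
find_entry(const struct entry *tab, size_t n, const void *key)
{
   for (size_t i = 0; i < n; i++) {
      if (tab[i].key == key)
         return &tab[i];
   }
   return NULL;
}

/* Formats "rXXXX" only when the lookup succeeded.  Returns the number of
 * characters written, or 0 (leaving buf untouched) when e is NULL. */
static int
format_operand(char *buf, size_t len, const struct entry *e)
{
   if (!e)
      return 0;
   return snprintf(buf, len, "r%04X", e->index);
}
```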



Re: [Mesa-dev] [PATCH 1/3] gm107/ir: Emit SV_CLOCK system value

2017-03-30 Thread Ilia Mirkin
Did you check what the blob does? There's clocklo/hi and
globaltimerlo/hi. Without additional documentation, it's a bit hard to
tell the difference... Note that envydis's gf100.c/gk110.c disagree on
which is which. Probably not due to any architectural reasons, but due
to RE methodology. (From before nvdisasm was
available/trusted/used/whatever.)

On Thu, Mar 30, 2017 at 10:33 PM, Boyan Ding  wrote:
> Signed-off-by: Boyan Ding 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 6de3f396e3..ab9c94b4d0 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -269,6 +269,7 @@ CodeEmitterGM107::emitSYS(int pos, const Value *val)
> case SV_INVOCATION_INFO: id = 0x1d; break;
> case SV_TID: id = 0x21 + val->reg.data.sv.index; break;
> case SV_CTAID  : id = 0x25 + val->reg.data.sv.index; break;
> +   case SV_CLOCK  : id = 0x50 + val->reg.data.sv.index; break;
> default:
>assert(!"invalid system value");
>id = 0;
> --
> 2.12.0
>


[Mesa-dev] [PATCH 3/3] nvc0: enable ARB_shader_clock

2017-03-30 Thread Boyan Ding
Signed-off-by: Boyan Ding 
---
 docs/features.txt  | 2 +-
 docs/relnotes/17.1.0.html  | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index d707f01185..228be5ce66 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -293,7 +293,7 @@ Khronos, ARB, and OES extensions that are not part of any 
OpenGL or OpenGL ES ve
   GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
radeonsi, r600, softpipe, swr)
   GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
nvc0, radeonsi, softpipe)
   GL_ARB_shader_ballot  not started
-  GL_ARB_shader_clock   DONE (i965/gen7+, 
radeonsi)
+  GL_ARB_shader_clock   DONE (i965/gen7+, 
nvc0, radeonsi)
   GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
radeonsi)
   GL_ARB_shader_group_vote  DONE (nvc0)
   GL_ARB_shader_stencil_export  DONE (i965/gen9+, 
radeonsi, softpipe, llvmpipe, swr)
diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
index 52b35b5f8f..f4d3fc3b6c 100644
--- a/docs/relnotes/17.1.0.html
+++ b/docs/relnotes/17.1.0.html
@@ -45,7 +45,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, 
llvmpipe
-GL_ARB_shader_clock on radeonsi
+GL_ARB_shader_clock on nvc0, radeonsi
 GL_ARB_transform_feedback2 on i965/gen6
 GL_ARB_transform_feedback_overflow_query on i965/gen6+
 Geometry shaders enabled on swr
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index ab94d9c4e4..880e6ed80b 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -247,6 +247,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_DOUBLES:
case PIPE_CAP_INT64:
case PIPE_CAP_TGSI_TEX_TXF_LZ:
+   case PIPE_CAP_TGSI_CLOCK:
   return 1;
case PIPE_CAP_COMPUTE:
   return (class_3d < GP100_3D_CLASS);
@@ -285,7 +286,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_INT64_DIVMOD:
-   case PIPE_CAP_TGSI_CLOCK:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
-- 
2.12.0



[Mesa-dev] [PATCH 1/3] gm107/ir: Emit SV_CLOCK system value

2017-03-30 Thread Boyan Ding
Signed-off-by: Boyan Ding 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 6de3f396e3..ab9c94b4d0 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -269,6 +269,7 @@ CodeEmitterGM107::emitSYS(int pos, const Value *val)
case SV_INVOCATION_INFO: id = 0x1d; break;
case SV_TID: id = 0x21 + val->reg.data.sv.index; break;
case SV_CTAID  : id = 0x25 + val->reg.data.sv.index; break;
+   case SV_CLOCK  : id = 0x50 + val->reg.data.sv.index; break;
default:
   assert(!"invalid system value");
   id = 0;
-- 
2.12.0



[Mesa-dev] [PATCH 2/3] nvc0/ir: Handle TGSI_OPCODE_CLOCK

2017-03-30 Thread Boyan Ding
Signed-off-by: Boyan Ding 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 7aaeedf8dd..9fbd3c0d30 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -3410,6 +3410,11 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
  mkCvt(OP_CVT, TYPE_U32, dst0[c], TYPE_U8, val0);
   }
   break;
+   case TGSI_OPCODE_CLOCK:
+  // The shifting is weird, but that's how they made it
+  mkOp1(OP_RDSV, TYPE_U32, dst0[1], mkSysVal(SV_CLOCK, 0))->fixed = 1;
+  mkOp1(OP_RDSV, TYPE_U32, dst0[0], mkSysVal(SV_CLOCK, 1))->fixed = 1;
+  break;
case TGSI_OPCODE_KILL_IF:
   val0 = new_LValue(func, FILE_PREDICATE);
   mask = 0;
-- 
2.12.0



[Mesa-dev] [PATCH 0/3] ARB_shader_clock for nvc0

2017-03-30 Thread Boyan Ding
This series depends on Nicolai's work in gallium[1], and it's quite
trivial. The first patch handles clock registers on Maxwell+, and the
second one translates TGSI_OPCODE_CLOCK into special register reads.
The last patch just flips the extension on.

Boyan

[1]https://lists.freedesktop.org/archives/mesa-dev/2017-March/150122.html

-- 
2.12.0


Re: [Mesa-dev] [PATCH] i965: Fix GLX_MESA_query_renderer video memory on 32-bit.

2017-03-30 Thread Kenneth Graunke
On Thursday, March 30, 2017 4:38:14 PM PDT Chris Wilson wrote:
> On Thu, Mar 30, 2017 at 04:28:19PM -0700, Kenneth Graunke wrote:
> > On modern systems with 4GB apertures, the size in bytes is 4294967296,
> > or (1ull << 32).  The kernel gives us the aperture size as a __u64,
> > which works out great.
> > 
> > Unfortunately, libdrm "helpfully" returns the data as a size_t, which
> > on 32-bit systems means it truncates the aperture size to 0 bytes.
> > We've happily reported this value as 0 MB of video memory via
> > GLX_MESA_query_renderer since it was originally exposed.
> > 
> > This patch bypasses libdrm and calls the ioctl ourselves so we can
> > use a proper uint64_t, avoiding the 32-bit integer overflow.  We now
> > report a proper video memory size on 32-bit systems.
> > ---
> >  src/mesa/drivers/dri/i965/intel_screen.c | 16 
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> > b/src/mesa/drivers/dri/i965/intel_screen.c
> > index 811a9c5a867..f94e8a77c10 100644
> > --- a/src/mesa/drivers/dri/i965/intel_screen.c
> > +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> > @@ -950,6 +950,17 @@ static const __DRIimageExtension intelImageExtension = 
> > {
> >  .createImageWithModifiers   = 
> > intel_create_image_with_modifiers,
> >  };
> >  
> > +static uint64_t
> > +get_aperture_size(int fd)
> > +{
> > +   struct drm_i915_gem_get_aperture aperture;
> > +
> > +   if (drmIoctl(fd, DRM_IOCTL_I915_GEM_GET_APERTURE, &aperture) != 0)
> > +  return 0;
> 
> The aperture is nothing to do with the video memory limits... You want
> to query the context for the size of the GTT, e.g.
> https://patchwork.freedesktop.org/patch/62189/
> 
> i.e.
> static uint64_t get_gtt_size(int fd)
> {
>    struct drm_i915_gem_context_param p;
>    size_t mappable_size, aper_size;
>
>    memset(&p, 0, sizeof(p));
>    p.param = I915_CONTEXT_PARAM_GTT_SIZE;
>    if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p) == 0)
>       return p.value;
>
>    /* do something useful for old kernels */
>
>    drm_intel_get_aperture_sizes(fd, &mappable_size, &aper_size);
>
>    return aper_size;
> }

It's somewhat debatable what a unified memory GPU should return for a
"Number of megabytes of video memory available to the renderer" query,
as there really isn't a concept of video RAM.

When Ian implemented this, he chose to pick the amount of memory that
a single batch can reference, which is 3/4 of the aperture.  This may
be too small - applications can certainly use more memory than that.
However, their working set for a draw had better fit within this limit,
or else there will be a performance penalty.  I think it's pretty
reasonable for what applications want.  They're trying to gauge how
high-res their textures can be without incurring penalties.

I know Ian also spoke with a number of game vendors when drafting
and implementing this extension, so I'm inclined to trust his
interpretation.

I don't think exposing GTT_SIZE is useful.  With 48-bit addressing
and PPGTT, the result of that query will be (1ull << 48) aka 256
terabytes.  There is no way in hell that an application can use
that much RAM.  We could restrict it to the total system RAM, but
even then, it would not be performant to try and use all of RAM.

Regardless, this patch fixes a clear bug where we expose a value
of 0 MB for 32-bit applications but 3072 MB for 64-bit applications.
I'd like to fix that before altering the limit that we advertise.
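
To make the 0 MB versus 3072 MB discrepancy concrete, here is a minimal
sketch. It assumes a 32-bit size_t behaves like uint32_t, and that the
reported value is 3/4 of the aperture in megabytes as described above. The
helper names are made up for illustration and are not the driver's code.

```c
#include <stdint.h>

/* Stand-in for storing a 64-bit byte count in a 32-bit size_t:
 * only the low 32 bits survive. */
static uint64_t
as_32bit_size_t(uint64_t bytes)
{
   return (uint32_t) bytes;
}

/* Reported "video memory": 3/4 of the aperture, in megabytes. */
static unsigned
video_memory_mb(uint64_t aperture_bytes)
{
   return (unsigned) ((aperture_bytes / (1024 * 1024)) * 3 / 4);
}
```

A 4 GiB aperture (1ull << 32) truncates to 0 bytes and reports 0 MB; kept as
a 64-bit value it reports 3072 MB.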

Looking again...libdrm_intel sets bufmgr_gem->gtt_size to
drm_i915_gem_get_aperture::aper_available_size - and uses that field
to -ENOSPC your execbuffers.  drm_intel_get_aperture_sizes, and this
query, use drm_i915_gem_get_aperture::aper_size - which is not quite
the same.  Reading the kernel sources, it looks like aper_available_size
subtracts any pinned memory.  At least in a PPGTT world, that's probably
not materially different given how early we're calling it.




[Mesa-dev] [PATCH] radv: always do tess ring size calculations.

2017-03-30 Thread Dave Airlie
From: Dave Airlie 

We could store these in the device, but it's probably
not that much overhead to recalculate them, this is needed
because we will emit the rings if the queue has them created
so we need to emit the register values correctly as well.

This fixes some tess tests failing when run after other tests
inside CTS.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_device.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index af82926..5c48be1 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1393,15 +1393,11 @@ radv_get_preamble_cs(struct radv_queue *queue,
if (needs_tess_rings)
add_tess_rings = true;
}
-
-   if (add_tess_rings) {
-   tess_factor_ring_size = 32768 * queue->device->physical_device->rad_info.max_se;
-   hs_offchip_param = radv_get_hs_offchip_param(queue->device,
-                                                &max_offchip_buffers);
-   tess_offchip_ring_size = max_offchip_buffers *
-   queue->device->tess_offchip_block_dw_size * 4;
-
-   }
+   tess_factor_ring_size = 32768 * queue->device->physical_device->rad_info.max_se;
+   hs_offchip_param = radv_get_hs_offchip_param(queue->device,
+                                                &max_offchip_buffers);
+   tess_offchip_ring_size = max_offchip_buffers *
+   queue->device->tess_offchip_block_dw_size * 4;
 
if (scratch_size <= queue->scratch_size &&
compute_scratch_size <= queue->compute_scratch_size &&
-- 
2.7.4
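
The two size computations that now run unconditionally can be sketched on
their own. The struct below is hypothetical (field names loosely follow the
patch) and the sample values in the note are made up, not real hardware
parameters.

```c
#include <stdint.h>

/* Hypothetical inputs for illustration. */
struct tess_ring_inputs {
   uint32_t max_se;                /* number of shader engines */
   uint32_t max_offchip_buffers;
   uint32_t offchip_block_dw_size; /* block size in dwords */
};

static uint32_t
tess_factor_ring_size(const struct tess_ring_inputs *in)
{
   return 32768 * in->max_se;
}

static uint32_t
tess_offchip_ring_size(const struct tess_ring_inputs *in)
{
   /* Convert dwords to bytes by multiplying by 4. */
   return in->max_offchip_buffers * in->offchip_block_dw_size * 4;
}
```

With max_se = 4, 64 off-chip buffers and 8192-dword blocks, the factor ring is
131072 bytes and the off-chip ring is 2097152 bytes.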



Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-03-30 Thread Chad Versace
On Wed 15 Mar 2017, Jason Ekstrand wrote:
> This cache allows us to easily ensure that we have a unique anv_bo for
> each gem handle.  We'll need this in order to support multiple-import of
> memory objects and semaphores.
> 
> v2 (Jason Ekstrand):
>  - Reject BO imports if the size doesn't match the prime fd size as
>reported by lseek().
> 
> v3 (Jason Ekstrand):
>  - Fix reference counting around cache_release (Chris Wilson)
>  - Move the mutex_unlock() later in cache_release
> ---
>  src/intel/vulkan/anv_allocator.c | 261 
> +++
>  src/intel/vulkan/anv_private.h   |  26 
>  2 files changed, 287 insertions(+)


> +struct anv_cached_bo {
> +   struct anv_bo bo;
> +
> +   uint32_t refcount;
> +};

Extra whitespace.

> +static bool
> +uint32_t_equal(const void *a, const void *b)
> +{
> +   return a == b;
> +}

This can be replaced with hash_table.h:_mesa_key_pointer_equal().
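
To illustrate why plain pointer equality is enough for these keys, here is a
minimal sketch. It assumes the keys are 32-bit handles packed into
pointer-sized values, as the quoted code does with its (uintptr_t) casts. The
helper names handle_to_key, key_to_handle and key_equal are hypothetical and
are not part of Mesa's hash_table.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack a 32-bit handle into a pointer-sized key. */
static inline const void *
handle_to_key(uint32_t handle)
{
   return (const void *)(uintptr_t)handle;
}

/* Hypothetical helper: recover the handle from a key made above. */
static inline uint32_t
key_to_handle(const void *key)
{
   return (uint32_t)(uintptr_t)key;
}

/* Two packed keys are equal exactly when the handles are equal, so
 * comparing the pointer values is enough; nothing is dereferenced. */
static bool
key_equal(const void *a, const void *b)
{
   return a == b;
}
```

Because the key is never dereferenced, any generic pointer-compare callback
behaves the same as a handle-specific one here.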

> +VkResult
> +anv_bo_cache_init(struct anv_bo_cache *cache)
> +{
> +   cache->bo_map = _mesa_hash_table_create(NULL, hash_uint32_t, uint32_t_equal);
> +   if (!cache->bo_map)
> +  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);

Since the table's keys are not real pointers (they're just uint32_t gem
handles), you need to set a custom value for deleted keys.

  _mesa_hash_table_set_deleted_key(cache->bo_map, (void*)(uintptr_t) 0);

> +
> +   if (pthread_mutex_init(&cache->mutex, NULL)) {
> +  _mesa_hash_table_destroy(cache->bo_map, NULL);
> +  return vk_errorf(VK_ERROR_OUT_OF_HOST_MEMORY,
> +   "pthread_mutex_inti failed: %m");

Typo: s/inti/init/

> +   }
> +
> +   return VK_SUCCESS;
> +}
> +
> +void
> +anv_bo_cache_finish(struct anv_bo_cache *cache)
> +{
> +   _mesa_hash_table_destroy(cache->bo_map, NULL);
> +   pthread_mutex_destroy(&cache->mutex);
> +}
> +
> +static struct anv_cached_bo *
> +anv_bo_cache_lookup_locked(struct anv_bo_cache *cache, uint32_t gem_handle)
> +{
> +   struct hash_entry *entry =
> +  _mesa_hash_table_search(cache->bo_map,
> +  (const void *)(uintptr_t)gem_handle);
> +   if (!entry)
> +  return NULL;
> +
> +   struct anv_cached_bo *bo = (struct anv_cached_bo *)entry->data;
> +   assert(bo->bo.gem_handle == gem_handle);
> +
> +   return bo;
> +}
> +
> +VkResult
> +anv_bo_cache_alloc(struct anv_device *device,
> +   struct anv_bo_cache *cache,
> +   uint64_t size, struct anv_bo **bo_out,
> +   VkAllocationCallbacks *alloc)
> +{
> +   struct anv_cached_bo *bo =
> +  vk_alloc(alloc, size, 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> +   if (!bo)
> +  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> +
> +   bo->refcount = 1;
> +
> +   /* The kernel is going to give us whole pages anyway */
> +   size = align_u64(size, 4096);
> +
> +   VkResult result = anv_bo_init_new(&bo->bo, device, size);
> +   if (result != VK_SUCCESS) {
> +  vk_free(alloc, bo);
> +  return result;
> +   }
> +
> +   assert(bo->bo.gem_handle);
> +
> +   pthread_mutex_lock(&cache->mutex);
> +
> +   _mesa_hash_table_insert(cache->bo_map,
> +   (void *)(uintptr_t)bo->bo.gem_handle, bo);
> +
> +   pthread_mutex_unlock(&cache->mutex);
> +
> +   *bo_out = &bo->bo;
> +
> +   return VK_SUCCESS;
> +}
> +
> +VkResult
> +anv_bo_cache_import(struct anv_device *device,
> +struct anv_bo_cache *cache,
> +int fd, uint64_t size, struct anv_bo **bo_out,
> +VkAllocationCallbacks *alloc)
> +{
> +   pthread_mutex_lock(&cache->mutex);
> +
> +   /* The kernel is going to give us whole pages anyway */
> +   size = align_u64(size, 4096);
> +
> +   uint32_t gem_handle = anv_gem_fd_to_handle(device, fd);
> +   if (!gem_handle) {
> +  pthread_mutex_unlock(&cache->mutex);
> +  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> +   }
> +
> +   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle);
> +   if (bo) {
> +  assert(bo->bo.size == size);
> +  __sync_fetch_and_add(&bo->refcount, 1);
> +   } else {
> +  /* For security purposes, we reject BO imports where the size does not
> +   * match exactly.  This prevents a malicious client from passing a
> +   * buffer to a trusted client, lying about the size, and telling the
> +   * trusted client to try and texture from an image that goes
> +   * out-of-bounds.  This sort of thing could lead to GPU hangs or worse
> +   * in the trusted client.  The trusted client can protect itself against
> +   * this sort of attack but only if it can trust the buffer size.
> +   */
> +  off_t import_size = lseek(fd, 0, SEEK_END);

We might as well use lseek64 here, for future-proofing. Someday, on
some systems, a client may try to import a 4GB dma_buf, on which the
client does its own suballocations.
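
A minimal sketch of the size check with a 64-bit offset type follows. It
assumes a glibc-style environment where defining _FILE_OFFSET_BITS to 64
before any include makes off_t 64 bits wide on 32-bit builds, which is one
common alternative to calling lseek64 directly. fd_size_matches is a
hypothetical helper, not a function from the patch.

```c
#define _FILE_OFFSET_BITS 64  /* ask glibc for a 64-bit off_t on 32-bit builds */

#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: returns true only when the
 * size reported by lseek() for fd equals expected_size exactly. */
static bool
fd_size_matches(int fd, uint64_t expected_size)
{
   /* Seeking to the end yields the object's size in bytes. */
   off_t end = lseek(fd, 0, SEEK_END);
   if (end == (off_t)-1)
      return false;

   /* Compare in 64 bits so sizes of 4GB and above are not truncated. */
   return (uint64_t)end == expected_size;
}
```

Note that this moves the file offset to the end as a side effect, the same as
the lseek() call in the quoted patch.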

> +  if (import_size == (off_t)-1 || import_size != size) {
> + anv_gem_close(device, gem_handle);

This anv_gem_close initially looked wrong 

Re: [Mesa-dev] [rfc] radv missing tessellation patches

2017-03-30 Thread Mike Lothian
That gets both tessellation and terraintessellation demos working for me

Feel free to add my Tested-by for the series

On Fri, 31 Mar 2017 at 01:45 Dave Airlie  wrote:

> I somehow in my rebasing lost two patches, not sure where I ate them.
>
> Anyways slot these in as patch 24.1 and 24.2 before the enable patch.
>
> I've updated the radv-wip-tess-submit branch as well.
>
> Dave.
>


Re: [Mesa-dev] [PATCH 1/2] intel: genxml: compress all gen files into one

2017-03-30 Thread Mike Lothian
This prevents me building master

PYTHONPATH=/var/tmp/portage/media-libs/mesa-/work/mesa-/src/compiler/nir
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/compiler/brw_nir_trig_workarounds.py
> compiler/brw_nir_trig_workarounds.c || (rm -f compiler/brw_nir_trig_workarounds.c; false)
/bin/mkdir -p genxml
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen4.xml
> genxml/gen4_pack.h || (rm -f genxml/gen4_pack.h; false)
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen45.xml
> genxml/gen45_pack.h || (rm -f genxml/gen45_pack.h; false)
/bin/mkdir -p genxml
/bin/mkdir -p genxml
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen5.xml
> genxml/gen5_pack.h || (rm -f genxml/gen5_pack.h; false)
/bin/mkdir -p genxml
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen6.xml
> genxml/gen6_pack.h || (rm -f genxml/gen6_pack.h; false)
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen7.xml
> genxml/gen7_pack.h || (rm -f genxml/gen7_pack.h; false)
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen75.xml
> genxml/gen75_pack.h || (rm -f genxml/gen75_pack.h; false)
/bin/mkdir -p genxml
/bin/mkdir -p genxml
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen8.xml
> genxml/gen8_pack.h || (rm -f genxml/gen8_pack.h; false)
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_pack_header.py
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen9.xml
> genxml/gen9_pack.h || (rm -f genxml/gen9_pack.h; false)
/bin/mkdir -p genxml
/bin/mkdir -p genxml
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_bits_header.py
-o genxml/genX_bits.h
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen4.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen45.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen5.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen6.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen7.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen75.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen8.xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen9.xml
/usr/bin/python2.7
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_zipped_file.py
genxml/gen4.xml genxml/gen45.xml genxml/gen5.xml genxml/gen6.xml
genxml/gen7.xml genxml/gen75.xml genxml/gen8.xml genxml/gen9.xml >
genxml/genX_xml.h || (rm -f genxml/genX_xml.h; false)
/bin/mkdir -p isl
/bin/mkdir -p vulkan
/usr/bin/python2.7
 
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/isl/gen_format_layout.py
\
   --csv
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/isl/isl_format_layout.csv
--out isl/isl_format_layout.c
/usr/bin/python2.7
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/vulkan/anv_entrypoints_gen.py
\
   --xml
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/vulkan/registry/vk.xml
--outdir ./vulkan
Traceback (most recent call last):
 File
"/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_zipped_file.py",
line 71, in 
   main()
 File
"/var/tmp/portage/media-libs/mesa-/work/mesa-/src/intel/genxml/gen_zipped_file.py",
line 48, in main
   xml = open(filename).read()
IOError: [Errno 2] No such file or directory: 'genxml/gen4.xml'
make[3]: *** [Makefile:4275: genxml/genX_xml.h] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: Leaving directory
'/var/tmp/portage/media-libs/mesa-/work/mesa--abi_x86_32.x86/src/intel'

make[2]: *** [Makefile:852: all-recursive] Error 1
make[2]: Leaving directory
'/var/tmp/portage/media-libs/mesa-/work/mesa--abi_x86_32.x86/src'
make[1]: *** [Makefile:643: all] Error 2
make[1]: Leaving directory
'/var/tmp/portage/media-libs/mesa-/work/mesa--abi_x86_32.x86/src'
make: *** [Makefile:643: all-recursive] Error 1
* ERROR: 

Re: [Mesa-dev] [PATCH V2] mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled

2017-03-30 Thread Timothy Arceri



On 31/03/17 11:45, Timothy Arceri wrote:

We could re-enable it also but I haven't tested that yet, and I'm
not sure we care much anyway.

V2: don't disable it from with the call itself. We need a custom
marshalling function or we get stuck waiting for thread to
finish.


*from within

Fixed locally


[Mesa-dev] [PATCH V2] mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled

2017-03-30 Thread Timothy Arceri
We could re-enable it also but I haven't tested that yet, and I'm
not sure we care much anyway.

V2: don't disable it from with the call itself. We need a custom
marshalling function or we get stuck waiting for thread to
finish.
---
 src/mapi/glapi/gen/gl_API.xml |  2 +-
 src/mesa/main/marshal.c   | 37 +
 src/mesa/main/marshal.h   |  8 
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index dfaeaaf..148387e 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -2354,21 +2354,21 @@
 
 
 
 
 
 
 
 
 
 
-
+
 
 
 
 
 
 
 
 
 

[Mesa-dev] [PATCH 2/2] radv: setup lds for tessellation

2017-03-30 Thread Dave Airlie
From: Dave Airlie 

This seems to have gotten lost in the rebases. It should fix the
crash in LLVM when running the tessellation demos.

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index dfb4672..048601f 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -548,6 +548,14 @@ static void set_userdata_location_indirect(struct 
ac_userdata_info *ud_info, uin
 }
 #endif
 
+static void declare_tess_lds(struct nir_to_llvm_context *ctx)
+{
+   unsigned lds_size = ctx->options->chip_class >= CIK ? 65536 : 32768;
+   ctx->lds = LLVMBuildIntToPtr(ctx->builder, ctx->i32zero,
+LLVMPointerType(LLVMArrayType(ctx->i32, 
lds_size / 4), LOCAL_ADDR_SPACE),
+   "tess_lds");
+}
+
 static void create_function(struct nir_to_llvm_context *ctx)
 {
LLVMTypeRef arg_types[23];
@@ -785,6 +793,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
ctx->vs_prim_id = LLVMGetParam(ctx->main_function, 
arg_idx++);
ctx->instance_id = LLVMGetParam(ctx->main_function, 
arg_idx++);
}
+   if (ctx->options->key.vs.as_ls)
+   declare_tess_lds(ctx);
break;
case MESA_SHADER_TESS_CTRL:
set_userdata_location_shader(ctx, AC_UD_TCS_OFFCHIP_LAYOUT, 
user_sgpr_idx, 4);
@@ -797,6 +807,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
ctx->tess_factor_offset = LLVMGetParam(ctx->main_function, 
arg_idx++);
ctx->tcs_patch_id = LLVMGetParam(ctx->main_function, arg_idx++);
ctx->tcs_rel_ids = LLVMGetParam(ctx->main_function, arg_idx++);
+
+   declare_tess_lds(ctx);
break;
case MESA_SHADER_TESS_EVAL:
set_userdata_location_shader(ctx, AC_UD_TES_OFFCHIP_LAYOUT, 
user_sgpr_idx, 1);
-- 
2.9.3



[Mesa-dev] [rfc] radv missing tessellation patches

2017-03-30 Thread Dave Airlie
I somehow in my rebasing lost two patches, not sure where I ate them.

Anyways slot these in as patch 24.1 and 24.2 before the enable patch.

I've updated the radv-wip-tess-submit branch as well.

Dave.



[Mesa-dev] [PATCH 1/2] radv: add ia_multi_vgt_param tessellation support.

2017-03-30 Thread Dave Airlie
From: Dave Airlie 

This just ports the relevant radeonsi pieces.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/si_cmd_buffer.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 4673f28..6ee0f17 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -609,13 +609,42 @@ si_get_ia_multi_vgt_param(struct radv_cmd_buffer 
*cmd_buffer,
uint32_t num_prims = radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count,
 draw_vertex_count);
bool multi_instances_smaller_than_primgroup;
 
-   if (radv_pipeline_has_gs(cmd_buffer->state.pipeline))
+   if (radv_pipeline_has_tess(cmd_buffer->state.pipeline))
+   primgroup_size = 
cmd_buffer->state.pipeline->graphics.tess.num_patches;
+   else if (radv_pipeline_has_gs(cmd_buffer->state.pipeline))
primgroup_size = 64;  /* recommended with a GS */
 
multi_instances_smaller_than_primgroup = indirect_draw || 
(instanced_draw &&
   num_prims < 
primgroup_size);
/* TODO TES */
+   if (radv_pipeline_has_tess(cmd_buffer->state.pipeline)) {
+   /* SWITCH_ON_EOI must be set if PrimID is used. */
+   if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_TESS_CTRL]->info.tcs.uses_prim_id
 ||
+   
cmd_buffer->state.pipeline->shaders[MESA_SHADER_TESS_EVAL]->info.tes.uses_prim_id)
+   ia_switch_on_eoi = true;
+
+   /* Bug with tessellation and GS on Bonaire and older 2 SE 
chips. */
+   if ((family == CHIP_TAHITI ||
+family == CHIP_PITCAIRN ||
+family == CHIP_BONAIRE) &&
+   radv_pipeline_has_gs(cmd_buffer->state.pipeline))
+   partial_vs_wave = true;
+
+   /* Needed for 028B6C_DISTRIBUTION_MODE != 0 */
+   if (cmd_buffer->device->has_distributed_tess) {
+   if (radv_pipeline_has_gs(cmd_buffer->state.pipeline)) {
+   partial_es_wave = true;
 
+   if (family == CHIP_TONGA ||
+   family == CHIP_FIJI ||
+   family == CHIP_POLARIS10 ||
+   family == CHIP_POLARIS11)
+   partial_vs_wave = true;
+   } else {
+   partial_vs_wave = true;
+   }
+   }
+   }
/* TODO linestipple */
 
if (chip_class >= CIK) {
-- 
2.9.3



Re: [Mesa-dev] [PATCH] mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled

2017-03-30 Thread Timothy Arceri



On 30/03/17 18:54, Nicolai Hähnle wrote:

On 30.03.2017 07:21, Timothy Arceri wrote:

We could re-enable it also but I haven't tested that yet, and I'm
not sure we care much anyway.


Reviewed-by: Nicolai Hähnle 



Sorry, I've made a silly mistake here. We can't disable the glthread
from within a function call; we need a custom marshaling function.

Please see version 2.


[Mesa-dev] [Bug 96684] [swrast] piglit glsl-array-bounds-01 regression

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96684

--- Comment #4 from Timothy Arceri  ---
I've updated the test to have the same outcome regardless of which branch is
taken.

https://patchwork.freedesktop.org/patch/147474/

This should fix the problem.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 96684] [swrast] piglit glsl-array-bounds-01 regression

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96684

--- Comment #3 from Timothy Arceri  ---
I think the test is wrong, it should not be expecting a specific outcome.

The spec says:

"Behavior is undefined if a shader subscripts an array with an index less than
0 or greater than or equal to the size the array was declared with."

And the test is doing:

float array[] = float [] (1.0, 2.0, 3.0, 4.0);

void main()
{
   int idx = 20;

   if (array[idx] == 5.0)
  gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
   else
  gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0);
}

So the result is undefined.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] i965: Fix GLX_MESA_query_renderer video memory on 32-bit.

2017-03-30 Thread Chris Wilson
On Thu, Mar 30, 2017 at 04:28:19PM -0700, Kenneth Graunke wrote:
> On modern systems with 4GB apertures, the size in bytes is 4294967296,
> or (1ull << 32).  The kernel gives us the aperture size as a __u64,
> which works out great.
> 
> Unfortunately, libdrm "helpfully" returns the data as a size_t, which
> on 32-bit systems means it truncates the aperture size to 0 bytes.
> We've happily reported this value as 0 MB of video memory via
> GLX_MESA_query_renderer since it was originally exposed.
> 
> This patch bypasses libdrm and calls the ioctl ourselves so we can
> use a proper uint64_t, avoiding the 32-bit integer overflow.  We now
> report a proper video memory size on 32-bit systems.
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index 811a9c5a867..f94e8a77c10 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -950,6 +950,17 @@ static const __DRIimageExtension intelImageExtension = {
>  .createImageWithModifiers   = intel_create_image_with_modifiers,
>  };
>  
> +static uint64_t
> +get_aperture_size(int fd)
> +{
> +   struct drm_i915_gem_get_aperture aperture;
> +
> +   if (drmIoctl(fd, DRM_IOCTL_I915_GEM_GET_APERTURE, &aperture) != 0)
> +  return 0;

The aperture has nothing to do with the video memory limits... You want
to query the context for the size of the GTT, e.g.
https://patchwork.freedesktop.org/patch/62189/

i.e.
static uint64_t get_gtt_size(int fd)
{
   struct drm_i915_gem_context_param p;
   size_t mappable_size, aper_size;

   memset(&p, 0, sizeof(p));
   p.param = I915_CONTEXT_PARAM_GTT_SIZE;
   if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p) == 0)
      return p.value;

   /* do something useful for old kernels */

   drm_intel_get_aperture_sizes(fd, &mappable_size, &aper_size);

   return aper_size;
}

-- 
Chris Wilson, Intel Open Source Technology Centre


[Mesa-dev] [PATCH] i965: Fix GLX_MESA_query_renderer video memory on 32-bit.

2017-03-30 Thread Kenneth Graunke
On modern systems with 4GB apertures, the size in bytes is 4294967296,
or (1ull << 32).  The kernel gives us the aperture size as a __u64,
which works out great.

Unfortunately, libdrm "helpfully" returns the data as a size_t, which
on 32-bit systems means it truncates the aperture size to 0 bytes.
We've happily reported this value as 0 MB of video memory via
GLX_MESA_query_renderer since it was originally exposed.

This patch bypasses libdrm and calls the ioctl ourselves so we can
use a proper uint64_t, avoiding the 32-bit integer overflow.  We now
report a proper video memory size on 32-bit systems.
---
 src/mesa/drivers/dri/i965/intel_screen.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 811a9c5a867..f94e8a77c10 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -950,6 +950,17 @@ static const __DRIimageExtension intelImageExtension = {
 .createImageWithModifiers   = intel_create_image_with_modifiers,
 };
 
+static uint64_t
+get_aperture_size(int fd)
+{
+   struct drm_i915_gem_get_aperture aperture;
+
+   if (drmIoctl(fd, DRM_IOCTL_I915_GEM_GET_APERTURE, &aperture) != 0)
+  return 0;
+
+   return aperture.aper_size;
+}
+
 static int
 brw_query_renderer_integer(__DRIscreen *dri_screen,
int param, unsigned int *value)
@@ -972,10 +983,7 @@ brw_query_renderer_integer(__DRIscreen *dri_screen,
* assume that there's some fragmentation, and we start doing extra
* flushing, etc.  That's the big cliff apps will care about.
*/
-  size_t aper_size;
-  size_t mappable_size;
-
-  drm_intel_get_aperture_sizes(dri_screen->fd, &mappable_size, &aper_size);
+  uint64_t aper_size = get_aperture_size(dri_screen->fd);
 
   const unsigned gpu_mappable_megabytes =
  (aper_size / (1024 * 1024)) * 3 / 4;
-- 
2.12.1
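
The truncation described in the commit message can be reproduced with a
minimal sketch. It assumes a 32-bit size_t behaves like uint32_t, and it
reuses the megabyte arithmetic from the hunk above. truncate_to_32 and
usable_megabytes are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Models what a 32-bit size_t does to a 64-bit byte count: only the
 * low 32 bits survive the conversion. */
static uint32_t
truncate_to_32(uint64_t bytes)
{
   return (uint32_t)bytes;
}

/* Same arithmetic as the patch: whole megabytes, then three quarters. */
static unsigned
usable_megabytes(uint64_t aper_size)
{
   return (unsigned)((aper_size / (1024 * 1024)) * 3 / 4);
}
```

With a 4GB aperture (1ull << 32), the 64-bit path yields 3072 MB, while the
value passed through truncate_to_32 first is 0 and so yields 0 MB.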



Re: [Mesa-dev] [PATCH v2 6/6] nvc0: Add support for NV_fill_rectangle for the GM200+

2017-03-30 Thread Ilia Mirkin
On Thu, Mar 30, 2017 at 5:40 PM, Lyude  wrote:
> This enables support for the GL_NV_fill_rectangle extension on the
> GM200+ for Desktop OpenGL.
>
> Signed-off-by: Lyude 
>
> Changes since v1:
> - Fix commit message
> - Add note to reldocs
>
> Signed-off-by: Lyude 
> ---
>  docs/relnotes/17.1.0.html| 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h   | 3 +++
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 3 ++-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c| 4 
>  src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +-
>  5 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
> index ada1e38..e0014bb 100644
> --- a/docs/relnotes/17.1.0.html
> +++ b/docs/relnotes/17.1.0.html
> @@ -48,6 +48,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_ARB_transform_feedback2 on i965/gen6
>  GL_ARB_transform_feedback_overflow_query on i965/gen6+
>  Geometry shaders enabled on swr
> +GL_NV_fill_rectangle on nvc0

Sort please.

>  
>
>  Bug fixes
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
> index 1be5952..accde94 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
> @@ -772,6 +772,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
> SOFTWARE.
>  #define NVC0_3D_VTX_ATTR_MASK_UNK0DD0_ALT__ESIZE   0x0004
>  #define NVC0_3D_VTX_ATTR_MASK_UNK0DD0_ALT__LEN 0x0004
>
> +#define NVC0_3D_FILL_RECTANGLE 0x113c
> +#define NVC0_3D_FILL_RECTANGLE_ENABLE  0x0002
> +
>  #define NVC0_3D_UNK11400x1140
>
>  #define NVC0_3D_UNK11440x1144
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index 945101b..f0e4e12 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -256,6 +256,8 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>return nouveau_screen(pscreen)->vram_domain & NOUVEAU_BO_VRAM ? 1 : 0;
> case PIPE_CAP_TGSI_FS_FBFETCH:
>return class_3d >= NVE4_3D_CLASS; /* needs testing on fermi */
> +   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
> +  return (class_3d >= GM200_3D_CLASS);

Still unnecessary parens.

>
> /* unsupported caps */
> case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> @@ -285,7 +287,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_NATIVE_FENCE_FD:
> case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
> case PIPE_CAP_INT64_DIVMOD:
> -   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 32233a5..803843b 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -261,6 +261,10 @@ nvc0_rasterizer_state_create(struct pipe_context *pipe,
>  SB_IMMED_3D(so, POINT_SPRITE_ENABLE, cso->point_quad_rasterization);
>  SB_IMMED_3D(so, POINT_SMOOTH_ENABLE, cso->point_smooth);
>
> +SB_IMMED_3D(so, FILL_RECTANGLE,
> +cso->fill_front == PIPE_POLYGON_MODE_FILL_RECTANGLE ?
> +NVC0_3D_FILL_RECTANGLE_ENABLE : 0);

Oh, I forgot to mention this last time, but ... this will generate
errors on pre-GM200 GPUs. Please stick this in an if (foo->class_3d >=
GM200_3D_CLASS)

> +
>  SB_BEGIN_3D(so, MACRO_POLYGON_MODE_FRONT, 1);
>  SB_DATA(so, nvgl_polygon_mode(cso->fill_front));
>  SB_BEGIN_3D(so, MACRO_POLYGON_MODE_BACK, 1);
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h
> index 054b1e7..3006ed6 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h
> @@ -23,7 +23,7 @@ struct nvc0_blend_stateobj {
>  struct nvc0_rasterizer_stateobj {
> struct pipe_rasterizer_state pipe;
> int size;
> -   uint32_t state[42];
> +   uint32_t state[43];
>  };
>
>  struct nvc0_zsa_stateobj {
> --
> 2.9.3
>


Re: [Mesa-dev] [PATCH v2 2/6] mesa: Add support for GL_NV_fill_rectangle

2017-03-30 Thread Ilia Mirkin
On Thu, Mar 30, 2017 at 5:40 PM, Lyude  wrote:
> +   /* From the GL_NV_fill_rectangle spec:
> +*
> +* "An INVALID_OPERATION error is generated by Begin or any Draw command 
> if
> +*  only one of the front and back polygon mode is FILL_RECTANGLE_NV."
> +*/
> +   if ((ctx->Polygon.FrontMode == GL_FILL_RECTANGLE_NV) !=
> +   (ctx->Polygon.BackMode == GL_FILL_RECTANGLE_NV)) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  "GL_NV_fill_rectangle can only be used on both the front "

That should probably say "GL_FILL_RECTANGLE_NV must be used as both
front/back polygon mode or neither".

> +  "and back polygon mode, not one or the other");
> +  return GL_FALSE;
> +   }
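
The condition quoted above relies on "!=" between two comparisons acting as
an exclusive-or. A minimal sketch of that check in isolation follows. The
enum values are hypothetical stand-ins for the GL polygon mode enums, and
fill_rectangle_modes_valid is not a Mesa function.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL enums; the real values live in the
 * GL headers. */
enum poly_mode { MODE_POINT, MODE_LINE, MODE_FILL, MODE_FILL_RECTANGLE };

/* Returns true when the front/back pair is acceptable, i.e. when either
 * both modes are FILL_RECTANGLE or neither is. */
static bool
fill_rectangle_modes_valid(enum poly_mode front, enum poly_mode back)
{
   /* "!=" on two booleans acts as XOR: true when exactly one side matches. */
   bool mismatch = (front == MODE_FILL_RECTANGLE) !=
                   (back == MODE_FILL_RECTANGLE);
   return !mismatch;
}
```

Only the two mixed combinations are rejected, which matches the wording
"both front/back polygon mode or neither".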


Re: [Mesa-dev] [RFC] radv initial tessellation support

2017-03-30 Thread Mike Lothian
I tried running the tessellation demo from Sascha Willems on my Tonga and I
got the following error:

vulkantessellation:
/var/tmp/portage/sys-devel/llvm-/work/llvm-/include/llvm/Support/Casting.h:95:
static bool llvm::isa_impl_cl::doit(const From *) [To = llvm::Constant, From = const llvm::Value *]:
Assertion
`Val && "isa<> used on a null pointer"' failed.
Aborted (core dumped)


On Thu, 30 Mar 2017 at 09:01 Dave Airlie  wrote:

This contains the initial tessellation shader support for radv.

It currently passes all the tess tests in CTS when run
under piglit, but when run in CTS itself I seem to get some
misc fails when multiple tests are run in one process. I'll
be chasing that down asap.

But I thought it would be good to get the code out there and
on the list, the main fun is in Patch 21 (and probably lots
of the bugs).

This is also in my radv-wip-tess-submit branch on github.

Dave.



Re: [Mesa-dev] Meson mesademos (Was: [RFC libdrm 0/2] Replace the build system with meson)

2017-03-30 Thread Jose Fonseca

On 30/03/17 19:52, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-29 15:27:58)

On 28/03/17 22:37, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-28 13:45:57)

On 28/03/17 21:32, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-28 09:19:48)

On 28/03/17 00:12, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-27 09:58:59)

On 27/03/17 17:42, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-27 09:31:04)

On 27/03/17 17:24, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-26 14:53:50)

I've pushed the branch to mesa/demos, so we can all collaborate without
wasting time crossporting patches between private branches.

   https://cgit.freedesktop.org/mesa/demos/commit/?h=meson

Unfortunately, I couldn't actually go very far until I hit a wall, as
you can see in the last commit message.


The issue is that Windows has no standard paths for dependencies
includes/libraries (like /usr/include or /usr/lib), nor standard tool
for dependencies (no pkgconfig).  But it seems that Meson presumes any
unknown dependency can be resolved with pkgconfig.


The question is: how do I tell Meson where the GLEW headers/library for
MinGW are supposed to be found?


I know one solution might be Meson Wraps.  Is that the only way?


CMake makes it very easy to do it (via Cache files as explained in my
commit message.)  Is there a way to achieve the same, perhaps via
cross_file properties or something like that?


Jose


I think there are two ways you could solve this:

Wraps are probably the most generically correct method; what I mean by that is
that a proper wrap would solve the problem for everyone, on every operating
system, forever.


Yeah, that sounded like a good solution, particularly for Windows, where it's so
much easier to just build the dependencies as a subproject rather than
fetch dependencies from somewhere, since MSVC RT versions have to match
and so on.

That said, I took a look at GLEW and it doesn't look like a
straightforward project to port to meson, since it uses a huge pile of gnu
makefiles for compilation, without any autoconf/cmake/etc. I still might take a
swing at it since I want to know how hard it would be to write a wrap file for
something like GLEW (and it would probably be a pretty useful project to wrap)
where a meson build system is likely never going to go upstream.


BTW, regarding GLEW, some time ago I actually prototyped using GLAD
instead of GLEW for mesademos:

   https://cgit.freedesktop.org/~jrfonseca/mesademos/log/?h=glad

I find GLAD much nicer than GLEW: it's easier to build, it uses upstream
XML files, it supports EGL, and it's easy to bundle.

Maybe we could migrate mesademos to GLAD as part of this work instead of
trying to get glew "mesonfied".


The other option I think you can use is cross properties[1], which I believe
is the closest thing meson has to cmake's cache files.

I've pushed a couple of commits, the last one implements the cross properties
idea, which gets the build farther, but then it can't find the glut headers,
and I don't understand why, since "cc.has_header('GL/glut')" returns true. I
still think that wraps are a better plan, but I'll have to spend some time today
working on a glew wrap.

[1] https://github.com/mesonbuild/meson/wiki/Cross-compilation (at the bottom
under the heading "Custom Data")


I'm running out of time today, but I'll try to take a look tomorrow.

Jose



I'd had a similar thought, but what about libepoxy? It supports the platforms we
want, and already has a meson build system that works for windows.


I have no experience with libepoxy.  I know GLAD is really easy to
understand, use and integrate.  It's completely agnostic to toolkits like
GLUT/GLFW/etc., and it doesn't try to alias equivalent entrypoints or do
anything smart like libepoxy does.

In particular, I don't fully understand what libepoxy's behavior regarding
wglMakeCurrent is, and whether that will create problems with GLUT,
since GLUT will call wglMakeCurrent.


Jose


Okay, I have libepoxy working for windows. I also got libepoxy working as a
subproject, but it took a bit of hacking on their build system (there are
some things they're doing that make them non-subproject safe; I'll send patches
and work that out with them).

https://github.com/dcbaker/libepoxy.git fix-suproject


Thanks.

GLEW is not the only case though.  There's also FREEGLUT.  So we
can't really avoid the problem of external windows binaries/subprojects.

So I've been thinking, and I suspect it's better if we first get things
working with binary GLEW / FREEGLUT projects, then try the glew ->
libepoxy switch in a 2nd step, so there's less to take in to merge meson into
master.


Clone that repo into $mesa-demos-root/subprojects and things should just work,
or mostly work. I got epoxy compiling, but ran into some issues in the mingw glu
header.

Dylan


I'm pretty sure the problem with MinGW glu is the lack of windows.h.  We
need to do the same as CMakeLists.txt snippet quoted below.

I'm running out of time today, but I'll look into porting 

Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing

2017-03-30 Thread Matt Turner
On Thu, Mar 30, 2017 at 3:26 PM, Grazvydas Ignotas  wrote:
> There are still some distributions trying to support unfortunate people
> with old or exotic CPUs that don't have 64bit atomic operations. When
> compiling for such a machine, gcc conveniently inserts a library call to
> a helper, but its implementation is missing and we get a linker error.
> This allows us to provide our own implementation, which is marked weak
> to prefer a better implementation, should one exist.
>
> v2: changed copyright, some style adjustments
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
> Signed-off-by: Grazvydas Ignotas 
> Reviewed-by: Matt Turner 

Thanks, I'll commit this.

> ---
>  no commit access, but request sent:
>  https://bugs.freedesktop.org/show_bug.cgi?id=100467

Thanks. I commented on the bug and said I approve :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] mesa: remove MESA_GLSL=no_opts env option

2017-03-30 Thread Emil Velikov
On 30 March 2017 at 17:44, Brian Paul  wrote:
> On 03/30/2017 05:26 AM, Timothy Arceri wrote:
>>
>> This is confusing because it only applies to ARB shaders, and because
>
>
> s/ARB shaders/GL_ARB_vertex/fragment_program/
>
>> of that it's also not very useful.
>>
>> If someone requires this for debugging they can just make an ad-hoc
>> code change.
>
>
> I think I'm the original author of this and my intention was that if a user
> ran into a suspected shader compiler bug, they could set this env var to see
> if disabling optimizations worked around the issue.  Seems like a useful
> thing to me, but I'm not sure it got much/any use.  I'm pretty certain an
> earlier version of the GLSL compiler respected it too.  But it probably was
> overlooked and dropped at some point.
>
> I still think this env option could be useful.  But for lack of use, I guess
> it can go.
>
Skimming through your reply reminded me of the following bug/regression
that Vinson reported recently.

https://bugs.freedesktop.org/show_bug.cgi?id=96684

Not sure if I'll get to having another look at it soon, so I'm just
throwing it in here ;-)

-Emil


Re: [Mesa-dev] [PATCH 2/2] st/glsl_to_tgsi: fix 64-bit integer bit shifts

2017-03-30 Thread Marek Olšák
On Thu, Mar 30, 2017 at 5:29 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Fix a bug that was caused by a type mismatch in the shift count between
> GLSL and TGSI. I briefly considered adjusting the TGSI semantics, but
> since both LLVM and AMD GCN require both arguments to be of the same type,
> it makes more sense to keep TGSI as-is -- it reflects the underlying
> implementation better.
>
> I'm also sending out piglit tests that expose this error.
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 20 +++-
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 31c14ed..6ef41f2 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -2095,27 +2095,37 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* 
> ir, st_src_reg *op)
>if (native_integers) {
>   emit_asm(ir, TGSI_OPCODE_NOT, result_dst, op[0]);
>   break;
>}
> case ir_unop_u2f:
>if (native_integers) {
>   emit_asm(ir, TGSI_OPCODE_U2F, result_dst, op[0]);
>   break;
>}
> case ir_binop_lshift:
> -  if (native_integers) {
> - emit_asm(ir, TGSI_OPCODE_SHL, result_dst, op[0], op[1]);
> - break;
> -  }
> case ir_binop_rshift:
>if (native_integers) {
> - emit_asm(ir, TGSI_OPCODE_ISHR, result_dst, op[0], op[1]);
> + unsigned opcode = ir->operation == ir_binop_lshift ? TGSI_OPCODE_SHL
> +: 
> TGSI_OPCODE_ISHR;
> + st_src_reg count;
> +
> + if (glsl_base_type_is_64bit(op[0].type)) {
> +/* GLSL shift operations have 32-bit shift counts, but TGSI uses
> + * 64 bits.
> + */
> +count = get_temp(glsl_type::u64vec(4));

Does count really have to have 4 u64 components?

Marek
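
For readers unfamiliar with the mismatch the patch addresses, here is a
minimal sketch in plain C rather than TGSI. It widens a 32-bit shift count to
64 bits so that both operands of a 64-bit shift have the same type. The name
shl64_with_u32_count is hypothetical, and masking the count to 0..63 is an
assumption made only to keep the C well defined; neither comes from the patch.

```c
#include <stdint.h>

/* Hypothetical helper, not Mesa code: shift a 64-bit value left by a
 * count that arrives as a 32-bit integer. */
uint64_t
shl64_with_u32_count(uint64_t value, uint32_t count)
{
   /* Widen the count so both operands have the same 64-bit type. */
   uint64_t count64 = (uint64_t)count;

   /* Shifting by 64 or more is undefined in C, so mask the count.
    * This masking is an assumption of the sketch. */
   return value << (count64 & 63u);
}
```

With this helper, shifting 1 by a count of 40 yields bit 40 set, which would be
lost if the operation were performed on 32-bit operands.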


Re: [Mesa-dev] [PATCH] i965: Select pipeline and emit state base address in Gen8+ HiZ ops.

2017-03-30 Thread Nanley Chery
On Mon, Mar 20, 2017 at 08:05:22PM -0700, Nanley Chery wrote:

Okay, I've re-read the email. 

> On Mon, Mar 20, 2017 at 08:01:25PM -0700, Nanley Chery wrote:
> > On Thu, Mar 16, 2017 at 05:34:13PM -0700, Kenneth Graunke wrote:
> > > On Wednesday, March 8, 2017 10:27:20 AM PDT Nanley Chery wrote:
> > > > On Wed, Mar 08, 2017 at 10:07:12AM -0800, Nanley Chery wrote:
> > > > > On Wed, Mar 08, 2017 at 02:17:59AM -0800, Kenneth Graunke wrote:
> > > > > > On Thursday, March 2, 2017 4:36:08 PM PST Nanley Chery wrote:
> > > > > > > On Mon, Feb 06, 2017 at 03:55:49PM -0800, Kenneth Graunke wrote:
> > > > > > > > If a HiZ op is the first thing in the batch, we should make sure
> > > > > > > > to select the render pipeline and emit state base address before
> > > > > > > > proceeding.
> > > > > > > > 
> > > > > > > > I believe 3DSTATE_WM_HZ_OP creates 3DPRIMITIVEs internally, and
> > > > > > > > dispatching those on the GPGPU pipeline seems a bit sketchy.  
> > > > > > > > I'm
> > > > > > > 
> > > > > > > Yes, it does seem like we currently allow HZ_OPs within a GPGPU
> > > > > > > pipeline. This patch should fix that problem.
> > > > > > > 
> > > > > > > > not actually sure that STATE_BASE_ADDRESS is necessary, as the
> > > > > > > > depth related commands use graphics addresses, not ones relative
> > > > > > > > to the base address...but we're likely to set it as part of the
> > > > > > > > next operation anyway, so we should just do it right away.
> > > > > > > > 
> > 
> > Why should we do it right away if it will happen later on? I don't see
> > why this part of the patch is necessary.
> > 
> > > > > > > 
> > > > > > > I agree, re-emitting STATE_BASE_ADDRESS doesn't seem necessary. I 
> > > > > > > think
> > > > > > > we should drop this part of the patch and add it back in later if 
> > > > > > > we get
> > > > > > > some data that it's necessary. Leaving it there may be 
> > > > > > > distracting to
> > > > > > > some readers and the BDW PRM warns that it's an expensive command:
> > > > > > > 
> > > > > > >   Execution of this command causes a full pipeline flush, thus its
> > > > > > >   use should be minimized for higher performance.
> > > > > > 
> > > > > > I think it should be basically free, actually.  We track a boolean,
> > > > > > brw->batch.state_base_address_emitted, to avoid emitting it multiple
> > > > > > times per batch.
> > > > > > 
> > > > > > Let's say the first thing in a fresh batch is a HiZ op, followed by
> > > > > > normal drawing.  Previously, we'd do:
> > > > > > 
> > > > > > 1. HiZ op commands
> > > > > > 2. STATE_BASE_ADDRESS (triggered by normal rendering upload)
> > > > > > 3. rest of normal drawing commands
> > > > > > 
> > > > > > Now we'd do:
> > > > > > 
> > > > > > 1. STATE_BASE_ADDRESS (triggered by HiZ op)
> > > > > > 2. HiZ op commands
> > > > > > 3. normal drawing commands (second SBA is skipped)
> > > > > > 
> > > > > > In other words...we're just moving it a bit earlier.  I suppose 
> > > > > > there
> > > > > > could be a batch containing only HiZ ops, at which point we'd pay 
> > > > > > for
> > > > > > a single STATE_BASE_ADDRESS...but that seems really unlikely.
> > > > > > 
> > > > > 
> > > > > Sorry for not stating it up front, but the special case you've 
> > > > > mentioned
> > > > > is exactly what I'd like not to hurt unnecessarily.
> > > > > 
> > > 
> > > Why?  We really think there are going to be batches with only
> > > 3DSTATE_WM_HZ_OP and no normal rendering or BLORP?  It sounds
> > > really hypothetical to me.
> > > 
> > 
> > I've commented on the performance implications of that snippet because
> > it is the only functional change I can see from emitting SBA. That
> > unfortunately seems to have distracted us from the more important
> > question found above. Sorry about that.
> > 

I've commented on the performance implications of that snippet because
it is the only functional change I can see from emitting SBA.
Unfortunately, discussing the impact of this change seems to have
distracted us from the more important question of why we're making this
change. Sorry about that.

> > > > Correct me if I'm wrong, but after thinking about it some more, it seems
> > > > that performance wouldn't suffer by emitting the SBA since the pipeline
> > > > was already flushed at the end of the preceding batch. It may also
> > > > improve since the pipelined HiZ op will likely be followed by other
> > > > pipelined commands. I'm not totally confident in my understanding on
> > > > pipeline flushes by the way. Is this why you'd like to emit the SBA 
> > > > here?
> > > > I think it's fine to leave it if we expound on the rationale.
> > > 
> > > Performance is not a motivation for this patch.  Having the GPU do
> > > work without a pipeline selected or state base addresses in place seems
> > > potentially dangerous.  I was hoping it would help with GPU hangs.  I'm
> > > not certain that it does, and it might be safe to skip this, but it
> > > 
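
The "emit once per batch" behaviour described in the quoted discussion can be
sketched roughly as follows. Only the flag name state_base_address_emitted
comes from the thread; the struct, the counters and the sketch_* function
names are hypothetical stand-ins, not the i965 driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a batch buffer. */
struct sketch_batch {
   bool state_base_address_emitted;
   int sba_count;      /* how many times SBA was actually emitted */
   int command_count;  /* total commands recorded in the batch */
};

/* Emit STATE_BASE_ADDRESS at most once per batch. */
static void
sketch_emit_state_base_address(struct sketch_batch *batch)
{
   if (batch->state_base_address_emitted)
      return;   /* already emitted earlier in this batch: skip */

   batch->sba_count++;
   batch->command_count++;
   batch->state_base_address_emitted = true;
}

/* A HiZ op that makes sure SBA is in place before doing its work. */
void
sketch_emit_hiz_op(struct sketch_batch *batch)
{
   sketch_emit_state_base_address(batch);
   batch->command_count++;
}

/* Normal drawing also requests SBA, but the guard makes it a no-op
 * when an earlier command in the batch has already emitted it. */
void
sketch_emit_draw(struct sketch_batch *batch)
{
   sketch_emit_state_base_address(batch);
   batch->command_count++;
}
```

Under these assumptions, a batch that starts with a HiZ op followed by a draw
records three commands and a single SBA, matching the "second SBA is skipped"
ordering listed above.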

[Mesa-dev] [Bug 100262] libswrAVX2.so Causes hang with QOpenGLWidget

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100262

--- Comment #4 from ch...@circlecvi.com ---
(In reply to Tim Rowley from comment #3)
> Ok, looks like I need to check and make sure there hasn't been a regression
> for this since Mesa 13.  Are you using the dri version of the driver, or
> libgl-x11?
> 
> Interesting that setting the variable fixed the problem for you.  We've seen
> similar problems before with TBB (thread building blocks), where if we bound
> threads inside swr, their threading code would think no cpus were available
> for its use.  Previously the workaround we've suggested is to initialize the
> threading library before creating an OpenGL context.  If that's possible in
> Qt that would be the cleanest way forward, though I could see that potentially
> being hard to do since it renders the UI with OpenGL as well.
> 
> If used, for maximum performance MAX_KNOB_WORKER_THREADS should be the
> number of cores minus one (we have an API thread that feeds the workers).

I am using the libgl-x11 version (--disable-dri option).

Thanks for the info, I will try and putz around with the threading library and
see if it helps.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing

2017-03-30 Thread Grazvydas Ignotas
There are still some distributions trying to support unfortunate people
with old or exotic CPUs that don't have 64bit atomic operations. When
compiling for such a machine, gcc conveniently inserts a library call to
a helper, but its implementation is missing and we get a linker error.
This allows us to provide our own implementation, which is marked weak
to prefer a better implementation, should one exist.

v2: changed copyright, some style adjustments

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
Signed-off-by: Grazvydas Ignotas 
Reviewed-by: Matt Turner 
---
 no commit access, but request sent:
 https://bugs.freedesktop.org/show_bug.cgi?id=100467

 configure.ac  | 12 
 src/util/Makefile.sources |  1 +
 src/util/u_atomic.c   | 75 +++
 3 files changed, 88 insertions(+)
 create mode 100644 src/util/u_atomic.c

diff --git a/configure.ac b/configure.ac
index 70885fb..74b870e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -413,10 +413,22 @@ int main() {
 if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
 DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
 fi
 AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test 
x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
 
+dnl Check if host supports 64bit atomics
+dnl note that lack of support usually results in link (not compile) error
+AC_LINK_IFELSE([AC_LANG_SOURCE([[
+#include <stdint.h>
+uint64_t v;
+int main() {
+return __sync_add_and_fetch(&v, (uint64_t)1);
+}]])], GCC_64BIT_ATOMICS_SUPPORTED=1)
+if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then
+DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS"
+fi
+
 dnl Check for Endianness
 AC_C_BIGENDIAN(
little_endian=no,
little_endian=yes,
little_endian=no,
diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 8ee45d5..e905734 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -41,10 +41,11 @@ MESA_UTIL_FILES := \
string_to_uint_map.h \
strndup.h \
strtod.c \
strtod.h \
texcompress_rgtc_tmp.h \
+   u_atomic.c \
u_atomic.h \
u_endian.h \
u_queue.c \
u_queue.h \
u_string.h \
diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c
new file mode 100644
index 000..44b75fb
--- /dev/null
+++ b/src/util/u_atomic.c
@@ -0,0 +1,75 @@
+/*
+ * Copyright © 2017 Gražvydas Ignotas
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD)
+
+#include <stdint.h>
+#include <pthread.h>
+
+#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)
+#define WEAK __attribute__((weak))
+#else
+#define WEAK
+#endif
+
+static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+WEAK uint64_t
+__sync_add_and_fetch_8(uint64_t *ptr, uint64_t val)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   *ptr += val;
+   r = *ptr;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t
+__sync_sub_and_fetch_8(uint64_t *ptr, uint64_t val)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   *ptr -= val;
+   r = *ptr;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t
+__atomic_fetch_add_8(uint64_t *ptr, uint64_t val, int memorder)
+{
+   return __sync_add_and_fetch(ptr, val);
+}
+
+WEAK uint64_t
+__atomic_fetch_sub_8(uint64_t *ptr, uint64_t val, int memorder)
+{
+   return __sync_sub_and_fetch(ptr, val);
+}
+
+#endif
-- 
2.7.4
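
A minimal standalone sketch of the mutex-guarded fallback idea from the patch
above, assuming a POSIX threads environment. The names
fallback_add_and_fetch_u64 and fallback_fetch_add_u64 are hypothetical and
chosen for illustration; they are not the symbols gcc expects. The sketch also
spells out the two return conventions (new value versus previous value), which
are easy to mix up.

```c
#include <pthread.h>
#include <stdint.h>

/* One global lock serializes every emulated 64-bit operation.
 * Simple, but all callers contend on the same mutex. */
static pthread_mutex_t fallback_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: add val to *ptr and return the NEW value. */
uint64_t
fallback_add_and_fetch_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t r;

   pthread_mutex_lock(&fallback_mutex);
   *ptr += val;
   r = *ptr;
   pthread_mutex_unlock(&fallback_mutex);

   return r;
}

/* Hypothetical helper: add val to *ptr and return the PREVIOUS value. */
uint64_t
fallback_fetch_add_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t old;

   pthread_mutex_lock(&fallback_mutex);
   old = *ptr;
   *ptr += val;
   pthread_mutex_unlock(&fallback_mutex);

   return old;
}
```

Starting from a counter of 5, fallback_add_and_fetch_u64(&v, 3) returns 8, and
a following fallback_fetch_add_u64(&v, 2) also returns 8 while leaving 10
stored in v.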



Re: [Mesa-dev] [PATCH] gallivm: add lp_build_emit_fetch_src() helper

2017-03-30 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Mar 30, 2017 at 7:57 PM, Samuel Pitoiset
 wrote:
> lp_build_emit_fetch() is useful when the source type can be
> inferred from the instruction opcode.
>
> However, for bindless samplers/images we can't do that easily
> because tgsi_opcode_infer_src_type() returns TGSI_TYPE_FLOAT for
> TEX instructions, while we need TGSI_TYPE_UNSIGNED64 if the
> resource register is bindless.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.c | 22 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |  7 +++
>  2 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> index d368f38d09..69863ab93c 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> @@ -323,16 +323,14 @@ lp_build_tgsi_inst_llvm(
>
>
>  LLVMValueRef
> -lp_build_emit_fetch(
> +lp_build_emit_fetch_src(
> struct lp_build_tgsi_context *bld_base,
> -   const struct tgsi_full_instruction *inst,
> -   unsigned src_op,
> +   const struct tgsi_full_src_register *reg,
> +   enum tgsi_opcode_type stype,
> const unsigned chan_index)
>  {
> -   const struct tgsi_full_src_register *reg = &inst->Src[src_op];
> unsigned swizzle;
> LLVMValueRef res;
> -   enum tgsi_opcode_type stype = 
> tgsi_opcode_infer_src_type(inst->Instruction.Opcode);
>
> if (chan_index == LP_CHAN_ALL) {
>swizzle = ~0u;
> @@ -413,7 +411,21 @@ lp_build_emit_fetch(
> }
>
> return res;
> +}
> +
> +
> +LLVMValueRef
> +lp_build_emit_fetch(
> +   struct lp_build_tgsi_context *bld_base,
> +   const struct tgsi_full_instruction *inst,
> +   unsigned src_op,
> +   const unsigned chan_index)
> +{
> +   const struct tgsi_full_src_register *reg = &inst->Src[src_op];
> +   enum tgsi_opcode_type stype =
> +  tgsi_opcode_infer_src_type(inst->Instruction.Opcode);
>
> +   return lp_build_emit_fetch_src(bld_base, reg, stype, chan_index);
>  }
>
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> index b6b3fe369b..22bd2a16ec 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> @@ -645,6 +645,13 @@ lp_build_tgsi_inst_llvm(
> const struct tgsi_full_instruction *inst);
>
>  LLVMValueRef
> +lp_build_emit_fetch_src(
> +   struct lp_build_tgsi_context *bld_base,
> +   const struct tgsi_full_src_register *reg,
> +   enum tgsi_opcode_type stype,
> +   const unsigned chan_index);
> +
> +LLVMValueRef
>  lp_build_emit_fetch(
> struct lp_build_tgsi_context *bld_base,
> const struct tgsi_full_instruction *inst,
> --
> 2.12.1
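
The shape of this refactoring, reduced to a toy: one core helper takes the
source type explicitly, and a thin wrapper infers the type from the opcode and
forwards to it. Everything named sketch_* below is hypothetical; the real
functions return LLVM values, whereas this sketch returns an element size in
bytes as a stand-in.

```c
/* Hypothetical miniature of the types involved. */
enum sketch_type { SKETCH_TYPE_FLOAT, SKETCH_TYPE_UNSIGNED64 };

struct sketch_src { int index; };

struct sketch_inst {
   int opcode;
   struct sketch_src src[3];
};

/* Stand-in for opcode-based inference: always reports float here. */
static enum sketch_type
sketch_infer_src_type(int opcode)
{
   (void)opcode;
   return SKETCH_TYPE_FLOAT;
}

/* Core helper: the caller states the source type explicitly.
 * Returns the element size in bytes as a placeholder result. */
int
sketch_fetch_src(const struct sketch_src *reg, enum sketch_type stype)
{
   (void)reg;
   return stype == SKETCH_TYPE_UNSIGNED64 ? 8 : 4;
}

/* Wrapper: infer the type from the opcode, then forward. */
int
sketch_fetch(const struct sketch_inst *inst, unsigned src_op)
{
   const struct sketch_src *reg = &inst->src[src_op];
   enum sketch_type stype = sketch_infer_src_type(inst->opcode);

   return sketch_fetch_src(reg, stype);
}
```

A caller that knows better than the inference, such as the bindless case
described in the commit message, can call the core helper directly with the
64-bit type, while existing callers keep using the wrapper unchanged.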
>


Re: [Mesa-dev] [PATCH] intel: genxml: add gen7 ERR_INT register

2017-03-30 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 4/4] mesa: remove MESA_GLSL=opt

2017-03-30 Thread Marek Olšák
I agree with Nicolai. For the series:

Reviewed-by: Marek Olšák 

Marek

On Thu, Mar 30, 2017 at 1:26 PM, Timothy Arceri  wrote:
> This is unused.
> ---
>  docs/shading.html |  1 -
>  src/mesa/main/mtypes.h| 15 +++
>  src/mesa/main/shaderapi.c |  2 --
>  3 files changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/docs/shading.html b/docs/shading.html
> index cd01af0..7e3d2e4 100644
> --- a/docs/shading.html
> +++ b/docs/shading.html
> @@ -43,21 +43,20 @@ Contents
>  The MESA_GLSL environment variable can be set to a comma-separated
>  list of keywords to control some aspects of the GLSL compiler and shader
>  execution.  These are generally used for debugging.
>  
>  
>  dump - print GLSL shader code to stdout at link time
>  log - log all GLSL shaders to files.
>  The filenames will be "shader_X.vert" or "shader_X.frag" where X
>  the shader ID.
>  cache_info - print debug information about shader cache
> -opt - force compiler optimizations
>  uniform - print message to stdout when glUniform is called
>  nopvert - force vertex shaders to be a simple shader that just 
> transforms
>  the vertex position with ftransform() and passes through the color and
>  texcoord[0] attributes.
>  nopfrag - force fragment shader to be a simple shader that passes
>  through the color attribute.
>  useprog - log glUseProgram calls to stderr
>  
>  
>  Example:  export MESA_GLSL=dump,nopt
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index daccb2b..e4c6771 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2876,28 +2876,27 @@ struct gl_shader_program
>
> /* True if any of the fragment shaders attached to this program use:
>  * #extension ARB_fragment_coord_conventions: enable
>  */
> GLboolean ARB_fragment_coord_conventions_enable;
>  };
>
>
>  #define GLSL_DUMP  0x1  /**< Dump shaders to stdout */
>  #define GLSL_LOG   0x2  /**< Write shaders to files */
> -#define GLSL_OPT   0x4  /**< Force optimizations (override pragmas) */
> -#define GLSL_UNIFORMS  0x8  /**< Print glUniform calls */
> -#define GLSL_NOP_VERT 0x10  /**< Force no-op vertex shaders */
> -#define GLSL_NOP_FRAG 0x20  /**< Force no-op fragment shaders */
> -#define GLSL_USE_PROG 0x40  /**< Log glUseProgram calls */
> -#define GLSL_REPORT_ERRORS 0x80  /**< Print compilation errors */
> -#define GLSL_DUMP_ON_ERROR 0x100 /**< Dump shaders to stderr on compile 
> error */
> -#define GLSL_CACHE_INFO 0x200 /**< Print debug information about shader 
> cache */
> +#define GLSL_UNIFORMS  0x4  /**< Print glUniform calls */
> +#define GLSL_NOP_VERT  0x8  /**< Force no-op vertex shaders */
> +#define GLSL_NOP_FRAG 0x10  /**< Force no-op fragment shaders */
> +#define GLSL_USE_PROG 0x20  /**< Log glUseProgram calls */
> +#define GLSL_REPORT_ERRORS 0x40  /**< Print compilation errors */
> +#define GLSL_DUMP_ON_ERROR 0x80 /**< Dump shaders to stderr on compile error 
> */
> +#define GLSL_CACHE_INFO 0x100 /**< Print debug information about shader 
> cache */
>
>
>  /**
>   * Context state for GLSL vertex/fragment shaders.
>   * Extended to support pipeline object
>   */
>  struct gl_pipeline_object
>  {
> /** Name of the pipeline object as received from glGenProgramPipelines.
>  * It would be 0 for shaders without separate shader objects.
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 3d77448..187475f 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -76,22 +76,20 @@ _mesa_get_shader_flags(void)
>else if (strstr(env, "dump"))
>   flags |= GLSL_DUMP;
>if (strstr(env, "log"))
>   flags |= GLSL_LOG;
>if (strstr(env, "cache_info"))
>   flags |= GLSL_CACHE_INFO;
>if (strstr(env, "nopvert"))
>   flags |= GLSL_NOP_VERT;
>if (strstr(env, "nopfrag"))
>   flags |= GLSL_NOP_FRAG;
> -  else if (strstr(env, "opt"))
> - flags |= GLSL_OPT;
>if (strstr(env, "uniform"))
>   flags |= GLSL_UNIFORMS;
>if (strstr(env, "useprog"))
>   flags |= GLSL_USE_PROG;
>if (strstr(env, "errors"))
>   flags |= GLSL_REPORT_ERRORS;
> }
>
> return flags;
>  }
> --
> 2.9.3
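
A rough sketch of the keyword-to-flag mapping shown in the shaderapi.c hunk
above, using the flag values as renumbered by this patch. The function name
sketch_parse_shader_flags is hypothetical, only a subset of keywords is
handled, and the sketch takes the string as a parameter instead of reading the
environment.

```c
#include <stddef.h>
#include <string.h>

/* Flag values as renumbered by the patch above (subset). */
#define GLSL_DUMP      0x1
#define GLSL_LOG       0x2
#define GLSL_UNIFORMS  0x4
#define GLSL_NOP_VERT  0x8
#define GLSL_NOP_FRAG 0x10
#define GLSL_USE_PROG 0x20

/* Hypothetical helper: map a comma-separated keyword list to flags. */
unsigned
sketch_parse_shader_flags(const char *env)
{
   unsigned flags = 0;

   if (env == NULL)
      return 0;

   /* Each test is a substring search, not an exact token match. */
   if (strstr(env, "dump"))
      flags |= GLSL_DUMP;
   if (strstr(env, "log"))
      flags |= GLSL_LOG;
   if (strstr(env, "nopvert"))
      flags |= GLSL_NOP_VERT;
   if (strstr(env, "nopfrag"))
      flags |= GLSL_NOP_FRAG;
   if (strstr(env, "uniform"))
      flags |= GLSL_UNIFORMS;
   if (strstr(env, "useprog"))
      flags |= GLSL_USE_PROG;

   return flags;
}
```

Because strstr() matches substrings, any keyword that happens to contain
another keyword sets both flags; for example, the string "nopt" contains
"opt". Tokenizing on commas would avoid that, at the cost of more code.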
>


Re: [Mesa-dev] [PATCH] radeonsi: implement ARB_shader_group_vote

2017-03-30 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Mar 30, 2017 at 10:48 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  docs/features.txt|  2 +-
>  docs/relnotes/17.1.0.html|  1 +
>  src/gallium/drivers/radeonsi/si_pipe.c   |  4 +-
>  src/gallium/drivers/radeonsi/si_shader.c | 82 
> 
>  4 files changed, 87 insertions(+), 2 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index d707f01..1e145e1 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -288,21 +288,21 @@ Khronos, ARB, and OES extensions that are not part of 
> any OpenGL or OpenGL ES ve
>GL_ARB_parallel_shader_compilenot started, but 
> Chia-I Wu did some related work in 2014
>GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
> radeonsi, softpipe, swr)
>GL_ARB_post_depth_coverageDONE (i965)
>GL_ARB_robustness_isolation   not started
>GL_ARB_sample_locations   not started
>GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
> radeonsi, r600, softpipe, swr)
>GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
> nvc0, radeonsi, softpipe)
>GL_ARB_shader_ballot  not started
>GL_ARB_shader_clock   DONE (i965/gen7+, 
> radeonsi)
>GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
> radeonsi)
> -  GL_ARB_shader_group_vote  DONE (nvc0)
> +  GL_ARB_shader_group_vote  DONE (nvc0, radeonsi)
>GL_ARB_shader_stencil_export  DONE (i965/gen9+, 
> radeonsi, softpipe, llvmpipe, swr)
>GL_ARB_shader_viewport_layer_arrayDONE (i965/gen6+)
>GL_ARB_sparse_buffer  not started
>GL_ARB_sparse_texture not started
>GL_ARB_sparse_texture2not started
>GL_ARB_sparse_texture_clamp   not started
>GL_ARB_texture_filter_minmax  not started
>GL_ARB_transform_feedback_overflow_query  DONE (i965/gen6+)
>GL_KHR_blend_equation_advanced_coherent   DONE (i965/gen9+)
>GL_KHR_no_error   not started
> diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
> index 52b35b5..38bc1e8 100644
> --- a/docs/relnotes/17.1.0.html
> +++ b/docs/relnotes/17.1.0.html
> @@ -39,20 +39,21 @@ TBD.
>
>  New features
>
>  
>  Note: some of the new features are only available with certain drivers.
>  
>
>  
>  GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, 
> llvmpipe
>  GL_ARB_shader_clock on radeonsi
> +GL_ARB_shader_group_vote on radeonsi
>  GL_ARB_transform_feedback2 on i965/gen6
>  GL_ARB_transform_feedback_overflow_query on i965/gen6+
>  Geometry shaders enabled on swr
>  
>
>  Bug fixes
>
>  
>  
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 6944c7c..688900e 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -417,20 +417,23 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS:
> case PIPE_CAP_DOUBLES:
> case PIPE_CAP_TGSI_TEX_TXF_LZ:
> return 1;
>
> case PIPE_CAP_INT64:
> case PIPE_CAP_INT64_DIVMOD:
> case PIPE_CAP_TGSI_CLOCK:
> return HAVE_LLVM >= 0x0309;
>
> +   case PIPE_CAP_TGSI_VOTE:
> +   return HAVE_LLVM >= 0x0400;
> +
> case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
> return !SI_BIG_ENDIAN && sscreen->b.info.has_userptr;
>
> case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
> return (sscreen->b.info.drm_major == 2 &&
> sscreen->b.info.drm_minor >= 43) ||
>sscreen->b.info.drm_major == 3;
>
> case PIPE_CAP_TEXTURE_MULTISAMPLE:
> /* 2D tiling on CIK is supported since DRM 2.35.0 */
> @@ -471,21 +474,20 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
>
> /* Unsupported features. */
> case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
> case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
> case PIPE_CAP_USER_VERTEX_BUFFERS:
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> -   case PIPE_CAP_TGSI_VOTE:
> case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> 

[Mesa-dev] [Bug 100262] libswrAVX2.so Causes hang with QOpenGLWidget

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100262

--- Comment #3 from Tim Rowley  ---
Ok, looks like I need to check and make sure there hasn't been a regression for
this since Mesa 13.  Are you using the dri version of the driver, or libgl-x11?

Interesting that setting the variable fixed the problem for you.  We've seen
similar problems before with TBB (thread building blocks), where if we bound
threads inside swr, their threading code would think no cpus were available for
its use.  Previously the workaround we've suggested is to initialize the
threading library before creating an OpenGL context.  If that's possible in Qt
that would be the cleanest way forward, though I could see that potentially being
hard to do since it renders the UI with OpenGL as well.

If used, for maximum performance MAX_KNOB_WORKER_THREADS should be the number
of cores minus one (we have an API thread that feeds the workers).
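
As a small illustration of the "cores minus one" rule of thumb, the sketch
below computes a suggested worker count. The function names are hypothetical,
the floor of one worker is an assumption of the sketch, and
sysconf(_SC_NPROCESSORS_ONLN) is assumed to be available (it is on Linux and
most Unix-like systems).

```c
#include <unistd.h>

/* Hypothetical helper: one thread is reserved for the API thread,
 * but never suggest fewer than one worker (assumption). */
long
suggested_worker_threads(long online_cores)
{
   if (online_cores <= 1)
      return 1;
   return online_cores - 1;
}

/* Query the online core count of the host. sysconf() returns -1 on
 * error, which the helper above maps to a single worker. */
long
suggested_worker_threads_for_host(void)
{
   long cores = sysconf(_SC_NPROCESSORS_ONLN);

   return suggested_worker_threads(cores);
}
```

On an 8-core machine this suggests 7 workers; the result would then be
exported through the variable named above before starting the application.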

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH v2 6/6] nvc0: Add support for NV_fill_rectangle for the GM200+

2017-03-30 Thread Lyude
This enables support for the GL_NV_fill_rectangle extension on the
GM200+ for Desktop OpenGL.

Signed-off-by: Lyude 

Changes since v1:
- Fix commit message
- Add note to reldocs

Signed-off-by: Lyude 
---
 docs/relnotes/17.1.0.html| 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h   | 3 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 3 ++-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c| 4 
 src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +-
 5 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
index ada1e38..e0014bb 100644
--- a/docs/relnotes/17.1.0.html
+++ b/docs/relnotes/17.1.0.html
@@ -48,6 +48,7 @@ Note: some of the new features are only available with 
certain drivers.
 GL_ARB_transform_feedback2 on i965/gen6
 GL_ARB_transform_feedback_overflow_query on i965/gen6+
 Geometry shaders enabled on swr
+GL_NV_fill_rectangle on nvc0
 
 
 Bug fixes
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
index 1be5952..accde94 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
@@ -772,6 +772,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #define NVC0_3D_VTX_ATTR_MASK_UNK0DD0_ALT__ESIZE   0x0004
 #define NVC0_3D_VTX_ATTR_MASK_UNK0DD0_ALT__LEN 0x0004
 
+#define NVC0_3D_FILL_RECTANGLE 0x113c
+#define NVC0_3D_FILL_RECTANGLE_ENABLE  0x0002
+
 #define NVC0_3D_UNK11400x1140
 
 #define NVC0_3D_UNK11440x1144
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 945101b..f0e4e12 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -256,6 +256,8 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return nouveau_screen(pscreen)->vram_domain & NOUVEAU_BO_VRAM ? 1 : 0;
case PIPE_CAP_TGSI_FS_FBFETCH:
   return class_3d >= NVE4_3D_CLASS; /* needs testing on fermi */
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
+  return (class_3d >= GM200_3D_CLASS);
 
/* unsupported caps */
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
@@ -285,7 +287,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_INT64_DIVMOD:
-   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 32233a5..803843b 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -261,6 +261,10 @@ nvc0_rasterizer_state_create(struct pipe_context *pipe,
 SB_IMMED_3D(so, POINT_SPRITE_ENABLE, cso->point_quad_rasterization);
 SB_IMMED_3D(so, POINT_SMOOTH_ENABLE, cso->point_smooth);
 
+SB_IMMED_3D(so, FILL_RECTANGLE,
+cso->fill_front == PIPE_POLYGON_MODE_FILL_RECTANGLE ?
+NVC0_3D_FILL_RECTANGLE_ENABLE : 0);
+
 SB_BEGIN_3D(so, MACRO_POLYGON_MODE_FRONT, 1);
 SB_DATA(so, nvgl_polygon_mode(cso->fill_front));
 SB_BEGIN_3D(so, MACRO_POLYGON_MODE_BACK, 1);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h
index 054b1e7..3006ed6 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h
@@ -23,7 +23,7 @@ struct nvc0_blend_stateobj {
 struct nvc0_rasterizer_stateobj {
struct pipe_rasterizer_state pipe;
int size;
-   uint32_t state[42];
+   uint32_t state[43];
 };
 
 struct nvc0_zsa_stateobj {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 5/6] st/mesa: Add support for NV_fill_rectangle

2017-03-30 Thread Lyude
Signed-off-by: Lyude 

Changes since v1:
- Fix commit name

Signed-off-by: Lyude 
---
 src/mesa/state_tracker/st_atom_rasterizer.c | 2 ++
 src/mesa/state_tracker/st_extensions.c  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c 
b/src/mesa/state_tracker/st_atom_rasterizer.c
index 50be7b6..0b0e045 100644
--- a/src/mesa/state_tracker/st_atom_rasterizer.c
+++ b/src/mesa/state_tracker/st_atom_rasterizer.c
@@ -50,6 +50,8 @@ static GLuint translate_fill( GLenum mode )
   return PIPE_POLYGON_MODE_LINE;
case GL_FILL:
   return PIPE_POLYGON_MODE_FILL;
+   case GL_FILL_RECTANGLE_NV:
+  return PIPE_POLYGON_MODE_FILL_RECTANGLE;
default:
   assert(0);
   return 0;
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 16f8685..eefd21d 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -637,6 +637,7 @@ void st_init_extensions(struct pipe_screen *screen,
   { o(ATI_separate_stencil), PIPE_CAP_TWO_SIDED_STENCIL
},
   { o(ATI_texture_mirror_once),  PIPE_CAP_TEXTURE_MIRROR_CLAMP 
},
   { o(NV_conditional_render),PIPE_CAP_CONDITIONAL_RENDER   
},
+  { o(NV_fill_rectangle),
PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE  },
   { o(NV_primitive_restart), PIPE_CAP_PRIMITIVE_RESTART
},
   { o(NV_texture_barrier),   PIPE_CAP_TEXTURE_BARRIER  
},
   { o(NVX_gpu_memory_info),  PIPE_CAP_QUERY_MEMORY_INFO
},
-- 
2.9.3



[Mesa-dev] [PATCH v2 4/6] gallium: Add NV_fill_rectangle to pipe state

2017-03-30 Thread Lyude
Signed-off-by: Lyude 

Changes since v1:
- Fix accidental widening of bitfields

Signed-off-by: Lyude 
---
 src/gallium/include/pipe/p_defines.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 0f0e260..7f781c4 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -133,6 +133,7 @@ enum {
PIPE_POLYGON_MODE_FILL,
PIPE_POLYGON_MODE_LINE,
PIPE_POLYGON_MODE_POINT,
+   PIPE_POLYGON_MODE_FILL_RECTANGLE,
 };
 
 /** Polygon face specification, eg for culling */
-- 
2.9.3



[Mesa-dev] [PATCH v2 3/6] gallium: Add a cap to check if the driver supports fill_rectangle

2017-03-30 Thread Lyude
Changes since v1:
- Add pipe caps for etnaviv, freedreno, swr and virgl

Signed-off-by: Lyude 
---
 src/gallium/docs/source/screen.rst   | 4 
 src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 20 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 00c9503..c103194 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -376,6 +376,10 @@ The integer capabilities:
   operations are supported.
 * ``PIPE_CAP_TGSI_TEX_TXF_LZ``: Whether TEX_LZ and TXF_LZ opcodes are
   supported.
+* ``PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE``: Whether the
+  PIPE_POLYGON_MODE_FILL_RECTANGLE mode is supported for
+  ``pipe_rasterizer_state::fill_front`` and
+  ``pipe_rasterizer_state::fill_back``.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index ed7fb64..bf8f46d 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -245,6 +245,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_INT64:
case PIPE_CAP_INT64_DIVMOD:
case PIPE_CAP_TGSI_TEX_TXF_LZ:
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
   return 0;
 
/* Stream output. */
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 5657de5..2520034 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -302,6 +302,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_INT64:
case PIPE_CAP_INT64_DIVMOD:
case PIPE_CAP_TGSI_TEX_TXF_LZ:
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index d25c2b3..28be7a9 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -278,6 +278,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
   return 0;
 
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index f6ac9b6..d4d04d4 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -347,6 +347,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 5c7ae24..be73cf0 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -211,6 +211,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_INT64:
case PIPE_CAP_INT64_DIVMOD:
case PIPE_CAP_TGSI_TEX_TXF_LZ:
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 249947a..ee33936 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -263,6 +263,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_DOUBLES:
case PIPE_CAP_INT64:
case PIPE_CAP_INT64_DIVMOD:
+   case PIPE_CAP_POLYGON_MODE_FILL_RECTANGLE:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 

[Mesa-dev] [PATCH v2 2/6] mesa: Add support for GL_NV_fill_rectangle

2017-03-30 Thread Lyude
Since we don't have the bits required to support this in OpenGL ES yet,
this only enables support for desktop OpenGL.

Signed-off-by: Lyude 

Changes since v1:
- Simplify _mesa_PolygonMode() a little bit
- Fix formatting in OpenGL spec excerpts
- Move polygon mode checking into _mesa_valid_to_render()

Signed-off-by: Lyude 
---
 src/mesa/main/api_validate.c | 13 +
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/mtypes.h   |  1 +
 src/mesa/main/polygon.c  | 13 +++--
 4 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 44d164a..0385c72 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -189,6 +189,19 @@ _mesa_valid_to_render(struct gl_context *ctx, const char 
*where)
   return GL_FALSE;
}
 
+   /* From the GL_NV_fill_rectangle spec:
+*
+* "An INVALID_OPERATION error is generated by Begin or any Draw command if
+*  only one of the front and back polygon mode is FILL_RECTANGLE_NV."
+*/
+   if ((ctx->Polygon.FrontMode == GL_FILL_RECTANGLE_NV) !=
+   (ctx->Polygon.BackMode == GL_FILL_RECTANGLE_NV)) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "GL_NV_fill_rectangle can only be used on both the front "
+  "and back polygon mode, not one or the other");
+  return GL_FALSE;
+   }
+
 #ifdef DEBUG
if (ctx->_Shader->Flags & GLSL_LOG) {
   struct gl_program **prog = ctx->_Shader->CurrentProgram;
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index ec71791..f2eac2b 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -320,6 +320,7 @@ EXT(NV_conditional_render   , 
NV_conditional_render
 EXT(NV_depth_clamp  , ARB_depth_clamp  
  , GLL, GLC,  x ,  x , 2001)
 EXT(NV_draw_buffers , dummy_true   
  ,  x ,  x ,  x , ES2, 2011)
 EXT(NV_fbo_color_attachments, dummy_true   
  ,  x ,  x ,  x , ES2, 2010)
+EXT(NV_fill_rectangle   , NV_fill_rectangle
  , GLL, GLC,  x ,  x , 2015)
 EXT(NV_fog_distance , NV_fog_distance  
  , GLL,  x ,  x ,  x , 2001)
 EXT(NV_image_formats, ARB_shader_image_load_store  
  ,  x ,  x ,  x ,  31, 2014)
 EXT(NV_light_max_exponent   , dummy_true   
  , GLL,  x ,  x ,  x , 1999)
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 91e1948..893caad 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4016,6 +4016,7 @@ struct gl_extensions
GLboolean MESA_shader_integer_functions;
GLboolean MESA_ycbcr_texture;
GLboolean NV_conditional_render;
+   GLboolean NV_fill_rectangle;
GLboolean NV_fog_distance;
GLboolean NV_point_sprite;
GLboolean NV_primitive_restart;
diff --git a/src/mesa/main/polygon.c b/src/mesa/main/polygon.c
index 4caf62a..1bb7190 100644
--- a/src/mesa/main/polygon.c
+++ b/src/mesa/main/polygon.c
@@ -131,8 +131,17 @@ _mesa_PolygonMode( GLenum face, GLenum mode )
   _mesa_enum_to_string(face),
   _mesa_enum_to_string(mode));
 
-   if (mode!=GL_POINT && mode!=GL_LINE && mode!=GL_FILL) {
-  _mesa_error( ctx, GL_INVALID_ENUM, "glPolygonMode(mode)" );
+   switch (mode) {
+   case GL_POINT:
+   case GL_LINE:
+   case GL_FILL:
+  break;
+   case GL_FILL_RECTANGLE_NV:
+  if (ctx->Extensions.NV_fill_rectangle)
+ break;
+  /* fall-through */
+   default:
+  _mesa_error(ctx, GL_INVALID_ENUM, "glPolygonMode(mode)");
   return;
}
 
-- 
2.9.3
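
The two checks this patch adds can be sketched in isolation as follows. This is a minimal illustration, not the Mesa code: enum poly_mode and both function names are hypothetical stand-ins for the GL enums, the ctx->Polygon state and the ctx->Extensions.NV_fill_rectangle flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GL_POINT, GL_LINE, GL_FILL and
 * GL_FILL_RECTANGLE_NV. */
enum poly_mode {
   MODE_POINT,
   MODE_LINE,
   MODE_FILL,
   MODE_FILL_RECTANGLE,
};

/* Follows the shape of the switch in _mesa_PolygonMode(): the classic
 * modes are always accepted, the new mode only when the extension is
 * enabled, and anything else is rejected. */
bool
mode_is_valid(enum poly_mode mode, bool has_fill_rectangle)
{
   switch (mode) {
   case MODE_POINT:
   case MODE_LINE:
   case MODE_FILL:
      return true;
   case MODE_FILL_RECTANGLE:
      return has_fill_rectangle;
   default:
      return false;
   }
}

/* Follows the draw-time check: rendering is allowed only when both
 * faces use the rectangle mode or neither does. */
bool
modes_valid_to_render(enum poly_mode front, enum poly_mode back)
{
   return (front == MODE_FILL_RECTANGLE) == (back == MODE_FILL_RECTANGLE);
}
```

Comparing the two booleans for equality is the inverse of the != test in _mesa_valid_to_render(): the draw is rejected exactly when one face uses the rectangle mode and the other does not.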



[Mesa-dev] [PATCH v2 1/6] glapi: Add GL_NV_fill_rectangle

2017-03-30 Thread Lyude
Signed-off-by: Lyude 
---
 src/mapi/glapi/gen/gl_API.xml | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index dfaeaaf..8392e3a 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -12840,6 +12840,10 @@
 
 
 
+
+
+
+
 
   
 
-- 
2.9.3



[Mesa-dev] [PATCH v2 0/6] Add support for GL_NV_fill_rectangle

2017-03-30 Thread Lyude
This adds basic support for GL_NV_fill_rectangle in Gallium, along with
enabling it for the GM200+ in nouveau. It should be noted this only
implements the OpenGL 4.3 bits, since we don't have the features required
yet to add this for OpenGLES.

Lyude (6):
  glapi: Add GL_NV_fill_rectangle
  mesa: Add support for GL_NV_fill_rectangle
  gallium: Add a cap to check if the driver supports fill_rectangle
  gallium: Add NV_fill_rectangle to pipe state
  st/mesa: Add support for NV_fill_rectangle
  nvc0: Add support for NV_fill_rectangle for the GM200+

 docs/relnotes/17.1.0.html|  1 +
 src/gallium/docs/source/screen.rst   |  4 
 src/gallium/drivers/etnaviv/etnaviv_screen.c |  1 +
 src/gallium/drivers/freedreno/freedreno_screen.c |  1 +
 src/gallium/drivers/i915/i915_screen.c   |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h   |  3 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c|  4 
 src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h |  2 +-
 src/gallium/drivers/r300/r300_screen.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.c |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
 src/gallium/drivers/softpipe/sp_screen.c |  1 +
 src/gallium/drivers/svga/svga_screen.c   |  1 +
 src/gallium/drivers/swr/swr_screen.cpp   |  1 +
 src/gallium/drivers/vc4/vc4_screen.c |  1 +
 src/gallium/drivers/virgl/virgl_screen.c |  1 +
 src/gallium/include/pipe/p_defines.h |  2 ++
 src/mapi/glapi/gen/gl_API.xml|  4 
 src/mesa/main/api_validate.c | 13 +
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/mtypes.h   |  1 +
 src/mesa/main/polygon.c  | 13 +++--
 src/mesa/state_tracker/st_atom_rasterizer.c  |  2 ++
 src/mesa/state_tracker/st_extensions.c   |  1 +
 28 files changed, 64 insertions(+), 3 deletions(-)

-- 
2.9.3



[Mesa-dev] [PATCH] intel: tools: add aubinator_error_decode tool

2017-03-30 Thread Lionel Landwerlin
This is pretty much the same tool as the one in i-g-t, only with fancier
decoding of the instructions/registers. It also doesn't support
anything before gen4.

v2 (from Matt): Drop authors
Remove undefined automake variable

Signed-off-by: Lionel Landwerlin 
Acked-by: Matt Turner 
---
 src/intel/Makefile.tools.am  |  19 +-
 src/intel/common/gen_decoder.c   |  10 +
 src/intel/common/gen_decoder.h   |   1 +
 src/intel/tools/.gitignore   |   1 +
 src/intel/tools/aubinator_error_decode.c | 778 +++
 5 files changed, 808 insertions(+), 1 deletion(-)
 create mode 100644 src/intel/tools/aubinator_error_decode.c

diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
index 245bd03eef..7ad5b3367e 100644
--- a/src/intel/Makefile.tools.am
+++ b/src/intel/Makefile.tools.am
@@ -19,7 +19,9 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-noinst_PROGRAMS += tools/aubinator
+noinst_PROGRAMS += \
+   tools/aubinator \
+   tools/aubinator_error_decode
 
 tools_aubinator_SOURCES = \
tools/aubinator.c \
@@ -41,3 +43,18 @@ tools_aubinator_LDADD = \
$(EXPAT_LIBS) \
$(ZLIB_LIBS) \
-lm
+
+
+tools_aubinator_error_decode_SOURCES = \
+   tools/aubinator_error_decode.c
+
+tools_aubinator_error_decode_LDADD = \
+   common/libintel_common.la \
+   $(top_builddir)/src/util/libmesautil.la \
+   $(EXPAT_LIBS) \
+   $(ZLIB_LIBS)
+
+tools_aubinator_error_decode_CFLAGS = \
+   $(AM_CFLAGS) \
+   $(EXPAT_CFLAGS) \
+   $(ZLIB_CFLAGS)
diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 1c3246f265..3af472caef 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -112,6 +112,16 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset)
return NULL;
 }
 
+struct gen_group *
+gen_spec_find_register_by_name(struct gen_spec *spec, const char *name)
+{
+   for (int i = 0; i < spec->nregisters; i++)
+  if (strcmp(spec->registers[i]->name, name) == 0)
+ return spec->registers[i];
+
+   return NULL;
+}
+
 struct gen_enum *
 gen_spec_find_enum(struct gen_spec *spec, const char *name)
 {
diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h
index 1c41de80a4..936b052455 100644
--- a/src/intel/common/gen_decoder.h
+++ b/src/intel/common/gen_decoder.h
@@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct 
gen_device_info *devinfo,
 uint32_t gen_spec_get_gen(struct gen_spec *spec);
 struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const 
uint32_t *p);
 struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset);
+struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, const 
char *name);
 int gen_group_get_length(struct gen_group *group, const uint32_t *p);
 const char *gen_group_get_name(struct gen_group *group);
 uint32_t gen_group_get_opcode(struct gen_group *group);
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6fed2..27437f9eef 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,2 @@
 /aubinator
+/aubinator_error_decode
diff --git a/src/intel/tools/aubinator_error_decode.c 
b/src/intel/tools/aubinator_error_decode.c
new file mode 100644
index 00..a44b1bb44e
--- /dev/null
+++ b/src/intel/tools/aubinator_error_decode.c
@@ -0,0 +1,778 @@
+/*
+ * Copyright © 2007-2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "common/gen_decoder.h"
+#include "util/macros.h"
+

[Mesa-dev] [PATCH] intel: genxml: add gen7 ERR_INT register

2017-03-30 Thread Lionel Landwerlin
v2: add register to gen7.5 (Matt)

Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen7.xml  | 11 +++
 src/intel/genxml/gen75.xml | 11 +++
 2 files changed, 22 insertions(+)

diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index ba9c8e8154..08307b3506 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2665,4 +2665,15 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+  
+
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 979f1e3ee2..9de6caa9db 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -3089,4 +3089,15 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+  
+
 
-- 
2.11.0



Re: [Mesa-dev] [PATCH 3/4] mesa: remove MESA_GLSL=no_opts env option

2017-03-30 Thread Timothy Arceri



On 31/03/17 03:44, Brian Paul wrote:

On 03/30/2017 05:26 AM, Timothy Arceri wrote:

This is confusing because it only applies to ARB shaders, and because


s/ARB shaders/GL_ARB_vertex/fragment_program/


of that it's also not very useful.

If someone requires this for debugging they can just make an ad-hoc
code change.


I think I'm the original author of this and my intention was that if a
user ran into a suspected shader compiler bug, they could set this env
var to see if disabling optimizations worked around the issue.  Seems
like a useful thing to me, but I'm not sure it got much/any use.  I'm
pretty certain an earlier version of the GLSL compiler respected it too.
But it probably was overlooked and dropped at some point.


Yeah, I figured that. At this point, various passes, lowerings and
backends in the GLSL compiler depend on optimisations having been done,
so turning them off would just break things (and I don't think there is
anything wrong with that).


Anyway, thanks for the reviews.



I still think this env option could be useful.  But for lack of use, I
guess it can go.

Reviewed-by: Brian Paul 





Re: [Mesa-dev] [PATCH 4/4] anv: Use subpass dependencies for flushes

2017-03-30 Thread Nanley Chery
On Tue, Mar 14, 2017 at 07:55:53AM -0700, Jason Ekstrand wrote:
> Instead of figuring it all out ourselves, just use the information given
> to us by the client.
> ---
>  src/intel/vulkan/anv_blorp.c   | 88 
> --
>  src/intel/vulkan/genX_cmd_buffer.c | 10 +
>  2 files changed, 18 insertions(+), 80 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 8de339c..41966d6 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1070,80 +1070,6 @@ enum subpass_stage {
>  };
>  
>  static bool
> -attachment_needs_flush(struct anv_cmd_buffer *cmd_buffer,
> -   struct anv_render_pass_attachment *att,
> -   enum subpass_stage stage)
> -{
> -   struct anv_render_pass *pass = cmd_buffer->state.pass;
> -   const uint32_t subpass_idx = anv_get_subpass_id(&cmd_buffer->state);
> -
> -   /* We handle this subpass specially based on the current stage */
> -   enum anv_subpass_usage usage = att->subpass_usage[subpass_idx];
> -   switch (stage) {
> -   case SUBPASS_STAGE_LOAD:
> -  if (usage & (ANV_SUBPASS_USAGE_INPUT | ANV_SUBPASS_USAGE_RESOLVE_SRC))
> - return true;
> -  break;
> -
> -   case SUBPASS_STAGE_DRAW:
> -  if (usage & ANV_SUBPASS_USAGE_RESOLVE_SRC)
> - return true;
> -  break;
> -
> -   default:
> -  break;
> -   }
> -
> -   for (uint32_t s = subpass_idx + 1; s < pass->subpass_count; s++) {
> -  usage = att->subpass_usage[s];
> -
> -  /* If this attachment is going to be used as an input in this or any
> -   * future subpass, then we need to flush its cache and invalidate the
> -   * texture cache.
> -   */
> -  if (att->subpass_usage[s] & ANV_SUBPASS_USAGE_INPUT)
> - return true;
> -
> -  if (usage & (ANV_SUBPASS_USAGE_DRAW | ANV_SUBPASS_USAGE_RESOLVE_DST)) {
> - /* We found another subpass that draws to this attachment.  We'll
> -  * wait to resolve until then.
> -  */
> - return false;
> -  }
> -   }
> -
> -   return false;
> -}
> -
> -static void
> -anv_cmd_buffer_flush_attachments(struct anv_cmd_buffer *cmd_buffer,
> - enum subpass_stage stage)
> -{
> -   struct anv_subpass *subpass = cmd_buffer->state.subpass;
> -   struct anv_render_pass *pass = cmd_buffer->state.pass;
> -
> -   for (uint32_t i = 0; i < subpass->color_count; ++i) {
> -  uint32_t att = subpass->color_attachments[i].attachment;
> -  assert(att < pass->attachment_count);
> -  if (attachment_needs_flush(cmd_buffer, &pass->attachments[att], stage)) {
> - cmd_buffer->state.pending_pipe_bits |=
> -ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT |
> -ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
> -  }
> -   }
> -
> -   if (subpass->depth_stencil_attachment.attachment != VK_ATTACHMENT_UNUSED) 
> {
> -  uint32_t att = subpass->depth_stencil_attachment.attachment;
> -  assert(att < pass->attachment_count);
> -  if (attachment_needs_flush(cmd_buffer, &pass->attachments[att], stage)) {
> - cmd_buffer->state.pending_pipe_bits |=
> -ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT |
> -ANV_PIPE_DEPTH_CACHE_FLUSH_BIT;
> -  }
> -   }
> -}
> -
> -static bool
>  subpass_needs_clear(const struct anv_cmd_buffer *cmd_buffer)
>  {
> const struct anv_cmd_state *cmd_state = &cmd_buffer->state;
> @@ -1327,8 +1253,6 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer 
> *cmd_buffer)
> }
>  
> blorp_batch_finish(&batch);
> -
> -   anv_cmd_buffer_flush_attachments(cmd_buffer, SUBPASS_STAGE_LOAD);
>  }
>  
>  static void
> @@ -1554,9 +1478,15 @@ anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer 
> *cmd_buffer)
>   subpass->color_attachments[i].attachment);
> }
>  
> -   anv_cmd_buffer_flush_attachments(cmd_buffer, SUBPASS_STAGE_DRAW);
> -
> if (subpass->has_resolve) {
> +  /* We are about to do some MSAA resolves.  We need to flush so that the
> +   * result of writes to the MSAA color attachments show up in the 
> sampler
> +   * when we blit to the single-sampled resolve target.
> +   */
> +  cmd_buffer->state.pending_pipe_bits |=
> + ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT |
> + ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
> +
>for (uint32_t i = 0; i < subpass->color_count; ++i) {
>   uint32_t src_att = subpass->color_attachments[i].attachment;
>   uint32_t dst_att = subpass->resolve_attachments[i].attachment;
> @@ -1594,8 +1524,6 @@ anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer 
> *cmd_buffer)
>  
>   ccs_resolve_attachment(cmd_buffer, &batch, dst_att);
>}
> -
> -  anv_cmd_buffer_flush_attachments(cmd_buffer, SUBPASS_STAGE_RESOLVE);
> }
>  
> blorp_batch_finish();
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index acb59d5..02dff44 100644
> --- 

Re: [Mesa-dev] Windows build requires extra tools from release tarball

2017-03-30 Thread Jose Fonseca

On 30/03/17 18:38, Ben Boeckel wrote:

Hi,

I'm trying to compile Mesa on Windows, but am running into the problem
that mako and lex/yacc are required. The generated files are in the
source tree, but from my investigation (and limited knowledge of
SCons), it appears that the SCons build does not care and always runs
the generation step.

Does the SCons build require these tools, or would it be possible to get
it to use the pre-generated source files instead?

Thanks,

--Ben



Yes, SCons always requires these tools.

It's easy to get them though.  Check 
https://cgit.freedesktop.org/mesa/mesa/tree/appveyor.yml


Jose


Re: [Mesa-dev] [PATCH v2 3/4] anv/pass: Record required pipe flushes

2017-03-30 Thread Nanley Chery
On Wed, Mar 15, 2017 at 12:01:16PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/anv_pass.c| 89 
> ++
>  src/intel/vulkan/anv_private.h |  2 +
>  2 files changed, 91 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> index 8d1768d..0f71fb3 100644
> --- a/src/intel/vulkan/anv_pass.c
> +++ b/src/intel/vulkan/anv_pass.c
> @@ -98,6 +98,7 @@ VkResult anv_CreateRenderPass(
> if (pass->subpass_attachments == NULL)
>goto fail_subpass_usages;
>  
> +   bool has_color = false, has_depth = false, has_input = false;
> p = pass->subpass_attachments;
> for (uint32_t i = 0; i < pCreateInfo->subpassCount; i++) {
>const VkSubpassDescription *desc = &pCreateInfo->pSubpasses[i];
> @@ -115,6 +116,7 @@ VkResult anv_CreateRenderPass(
>  uint32_t a = desc->pInputAttachments[j].attachment;
>  subpass->input_attachments[j] = desc->pInputAttachments[j];
>  if (a != VK_ATTACHMENT_UNUSED) {
> +   has_input = true;
> pass->attachments[a].usage |= 
> VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT;
> pass->attachments[a].subpass_usage[i] |= 
> ANV_SUBPASS_USAGE_INPUT;
> pass->attachments[a].last_subpass_idx = i;
> @@ -134,6 +136,7 @@ VkResult anv_CreateRenderPass(
>  uint32_t a = desc->pColorAttachments[j].attachment;
>  subpass->color_attachments[j] = desc->pColorAttachments[j];
>  if (a != VK_ATTACHMENT_UNUSED) {
> +   has_color = true;
> pass->attachments[a].usage |= 
> VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
> pass->attachments[a].subpass_usage[i] |= 
> ANV_SUBPASS_USAGE_DRAW;
> pass->attachments[a].last_subpass_idx = i;
> @@ -170,6 +173,7 @@ VkResult anv_CreateRenderPass(
>   *p++ = subpass->depth_stencil_attachment =
>  *desc->pDepthStencilAttachment;
>   if (a != VK_ATTACHMENT_UNUSED) {
> +has_depth = true;
>  pass->attachments[a].usage |=
> VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT;
>  pass->attachments[a].subpass_usage[i] |= ANV_SUBPASS_USAGE_DRAW;
> @@ -181,10 +185,94 @@ VkResult anv_CreateRenderPass(
>}
> }
>  
> +   pass->subpass_flushes =
> +  vk_zalloc2(&device->alloc, pAllocator,
> + (pass->subpass_count + 1) * sizeof(*pass->subpass_flushes),
> + 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> +   if (pass->subpass_flushes == NULL)
> +  goto fail_subpass_attachments;
> +

I think we should allocate this memory differently. We currently have
two memory allocation methods in this function:
   1. Allocate memory per data structure. This makes pointer
  assignments simple and using gotos improves error handling in this
  case.
   2. Allocate one big block of memory and assign pointers for certain
  offsets into this block. This is a more efficient usage of memory
  and error handling is quite simple in this case.

For this series, I think the memory allocation for ::subpass_flushes
should be lumped into the allocation for the entire render pass object.
It's simple to do because we know the size upfront and we can avoid
having to add the goto infrastructure.
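
As a rough illustration of the second approach, the flush array can live in the same allocation as the object itself. The sketch below uses plain calloc() and a hypothetical, heavily simplified struct pass; it is not the anv code, which goes through the Vulkan allocator and has more members to lay out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified render pass object. */
struct pass {
   uint32_t subpass_count;
   uint32_t *subpass_flushes; /* points into the same allocation */
};

/* Allocate the object and its flush array as one zeroed block. The
 * array gets subpass_count + 1 entries, matching the sizing used in
 * the quoted patch. */
struct pass *
pass_create(uint32_t subpass_count)
{
   size_t flushes_size = ((size_t)subpass_count + 1) * sizeof(uint32_t);
   struct pass *pass = calloc(1, sizeof(*pass) + flushes_size);
   if (pass == NULL)
      return NULL;

   pass->subpass_count = subpass_count;
   /* The array starts right after the struct; sizeof(*pass) is a
    * multiple of the pointer alignment, so uint32_t is aligned. */
   pass->subpass_flushes = (uint32_t *)(pass + 1);
   return pass;
}
```

With this layout there is a single failure point and a single free(), so no extra goto label is needed for the flush array.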

> +   for (uint32_t i = 0; i < pCreateInfo->dependencyCount; i++) {
> +  const VkSubpassDependency *dep = &pCreateInfo->pDependencies[i];
> +  if (dep->dstSubpass != VK_SUBPASS_EXTERNAL) {
> + assert(dep->dstSubpass < pass->subpass_count);
> + pass->subpass_flushes[dep->dstSubpass] |=
> +anv_pipe_invalidate_bits_for_access_flags(dep->dstAccessMask);
> +  }
> +
> +  if (dep->srcSubpass != VK_SUBPASS_EXTERNAL) {
> + assert(dep->srcSubpass < pass->subpass_count);
> + pass->subpass_flushes[dep->srcSubpass + 1] |=
> +anv_pipe_flush_bits_for_access_flags(dep->srcAccessMask);
> +  }
> +   }

Why set the flush bits at srcSubpass + 1? This can cause excessive
flushing. We can avoid this excess by setting the flush bits at
dstSubpass (like we do for the invalidate bits).

I see that the implicit dependency with VK_SUBPASS_EXTERNAL is covered
below, but explicit dependencies should be covered as well. For example,
the above doesn't handle the case where a user declares an external
dependency that has VK_ACCESS_SHADER_WRITE_BIT set in the srcSubpass.
According to PipelineBarrier, this would require a DATA_CACHE flush.

-Nanley

> +
> +   /* From the Vulkan 1.0.39 spec:
> +*
> +*If there is no subpass dependency from VK_SUBPASS_EXTERNAL to the
> +*first subpass that uses an attachment, then an implicit subpass
> +*dependency exists from VK_SUBPASS_EXTERNAL to the first subpass it is
> +*used in. The subpass dependency operates as if defined with the
> +*following parameters:
> +*
> +*VkSubpassDependency implicitDependency = {
> +*

Re: [Mesa-dev] i965: On-demand render target flushing

2017-03-30 Thread Francisco Jerez
"Pohjolainen, Topi"  writes:

> Jason, Curro, do you have any opinion if this is worth pursuing?
> I need something for blorp blits at least - using blorp for texture
> uploads on top of current excessive flushing regresses perf.
>

I wouldn't be surprised if it improves throughput slightly for workloads
doing both compute and 3D rendering, but only testing will tell how
helpful it is in practice...

> When working on gpu hangs on SKL we also identified compute
> flushing caches it shouldn't. I think those could be addressed
> nicely here as well. I can respin if this is something we'd like
> to have.
>
> On Fri, Feb 17, 2017 at 09:32:03PM +0200, Topi Pohjolainen wrote:
>> Currently:
>> 
>> 1) Blorp color clears and resolves emit unconditional render target
>>    flush + command stream after every clear/resolve (including
>>    regular non-fast clears).
>> 
>> 2) Blorp color clears, resolves and blits emit texture and constant
>>    cache resolves even in case only destination is dirty. This is
>>    because brw_render_cache_set_check_flush() does both render target
>>    flush as well as the top-of-pipe read cache flushes.
>> 
>> 3) Similarly to item 2, 3D and compute paths also flush texture and
>>    constant caches even if none of the texture surfaces are dirty.
>> 
>> 4) In case of multiple surfaces needing resolves, all render paths
>>    (blorp, 3D and compute) emit render target, texture and constant
>>    cache flushes after each resolve instead of just once after all
>>    resolves.
>> 
>> This series addresses all four cases. The good news is that even though
>> the current setup isn't optimal, fixing it doesn't actually improve
>> performance in most cases. There is a modest gain in OglDrvRes, which
>> does heavy blorp blitting. I also expect this series to make
>> blorp tex uploads and blorp mipmap generation more competitive.
>> 
>> The bad news is in the final patch: it looks like the current unconditional
>> flushing/stalling has been hiding bugs elsewhere. There are cases
>> which rely on the flushes after non-fast clears. Hunting down the real
>> cause is, however, difficult. I only saw them in the CI system within
>> full runs and was not able to reproduce them myself.
>> 
>> As a first step, the series introduces end-of-pipe synchronization.
>> This is a flush combined with a stall and a post-sync operation that
>> writes a double word (32 bits). Until now this wasn't really
>> needed, as in many cases there was double flushing, which in turn
>> seems to take long enough to hide the need for the sync. I also
>> noticed that one needs to be rather careful with it: performance
>> decreases noticeably when it is used unnecessarily.
>> 
>> I'm not even sure myself whether we want to go this way. The current
>> logic, while not ideal, is rather simple.
>> 
>> Topi Pohjolainen (16):
>>   i965/miptree: Tell if anything got resolved
>>   i965/gen6+: Implement end-of-pipe sync
>>   i965: Hook end-of-pipe-sync after texture resolves
>>   i965: Hook end-of-pipe-sync after image resolves
>>   i965: Hook end-of-pipe-sync after framebuffer resolves
>>   i965: Consider layered rt resolves along with other
>>   i965: Add color resolve end-of-pipe-sync before switch to blit ring
>>   i965/dri2: Add end-of-pipe-sync after color resolves
>>   i965/miptree: Add color resolve end-of-pipe-sync before sharing
>>   i965: Add end-of-pipe sync before non-gpu read of color resolves
>>   i965/blorp: Do more fine grained flushing/syncing
>>   i965/blorp/blit: Refactor hiz/ccs prep for blits
>>   i965/blorp: Use conditional end-of-pipe-sync
>>   i965: Consider surface resolves and sync after blorp ops
>>   i965: Check if fast color clear state transition needs sync
>>   i965/blorp: Drop unnecessary flushes after clear/resolve
>> 
>>  src/mesa/drivers/dri/i965/brw_blorp.c  | 187 ++
>>  src/mesa/drivers/dri/i965/brw_compute.c|   2 +
>>  src/mesa/drivers/dri/i965/brw_context.c| 333 
>> +++--
>>  src/mesa/drivers/dri/i965/brw_context.h|   3 +
>>  src/mesa/drivers/dri/i965/brw_draw.c   |  36 +--
>>  src/mesa/drivers/dri/i965/brw_pipe_control.c   |  91 +++
>>  src/mesa/drivers/dri/i965/genX_blorp_exec.c|  11 -
>>  src/mesa/drivers/dri/i965/intel_blit.c |  16 +-
>>  src/mesa/drivers/dri/i965/intel_copy_image.c   |  10 +-
>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |  25 +-
>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h  |   2 +-
>>  src/mesa/drivers/dri/i965/intel_pixel.c|   4 +
>>  src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |   5 +-
>>  src/mesa/drivers/dri/i965/intel_pixel_read.c   |   7 +-
>>  src/mesa/drivers/dri/i965/intel_tex_image.c|  10 +-
>>  src/mesa/drivers/dri/i965/intel_tex_subimage.c |  11 +-
>>  16 files changed, 557 insertions(+), 196 deletions(-)
>> 
>> -- 
>> 2.5.5
>> 
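
Item 4 in the quoted cover letter (flush once after all resolves rather
than after each one) can be sketched roughly as follows. All names are
hypothetical, and a counter stands in for the real cache flush, so this
only shows the control flow and not the i965 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical surface state: only tracks whether a resolve is pending. */
struct ex_surface {
   bool needs_resolve;
};

/* Counts flushes so the effect of batching is observable. */
struct ex_context {
   unsigned flush_count;
};

/* Resolve one surface and report whether any work was actually done. */
static bool
ex_resolve(struct ex_surface *surf)
{
   if (!surf->needs_resolve)
      return false;

   surf->needs_resolve = false;
   return true;
}

/* Resolve every surface first, then flush at most once. */
static void
ex_resolve_all(struct ex_context *ctx, struct ex_surface *surfs, size_t count)
{
   bool any_resolved = false;

   for (size_t i = 0; i < count; i++) {
      /* '|=' rather than '||' so ex_resolve() runs for every surface. */
      any_resolved |= ex_resolve(&surfs[i]);
   }

   if (any_resolved)
      ctx->flush_count++;   /* stand-in for a single cache flush */
}
```

With three surfaces of which two need a resolve, the counter ends up at 1
instead of 2, and a second call with nothing left to resolve does not
flush at all.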


___
mesa-dev mailing list

Re: [Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool

2017-03-30 Thread Kristian Høgsberg
On Thu, Mar 30, 2017 at 1:38 PM, Lionel Landwerlin
 wrote:
> On 30/03/17 20:09, Chris Wilson wrote:
>>
>> On Thu, Mar 30, 2017 at 11:27:26AM -0700, Matt Turner wrote:
>>>
>>> I think we should figure out how to make this not just a fork of
>>> intel_error_decode. Should intel_error_decode go away?
>>>
>>> There are various tools in i-g-t that I'm definitely in favor of
>>> moving into Mesa (like the assembler and disassembler, and aubdump).
>>> Should this one move too? Do we have buy-in from Chris?
>>
>> Sure, if it means that we do get an actively maintained decoder - just
>> hope that gen2-3 is forthcoming, and perhaps some inferred type
>> decoding.
>
>
> I guess that depends on whether we can get somebody to write genxml for
> them.
> Not sure it's a big priority for us :/
> Maybe a rainy weekend project?
>
>>
>> What's more important is that it is fairly flexible in error-state file
>> format - mostly the lists and order of registers change, different
>> amount of detail in the kernel state trackers etc. I was thinking that
>> maybe we wanted a more parseable format like json (error.json),
>> something a bit more flexible?
>
>
> That would be great.
>
>> What I also want is a use-case for EXEC_OBJECT_CAPTURE. Aside from
>> plain listing in the error state, it would be nice if the decoder could
>> parse 3DSTATE_BASE to work out which buffer was, for example, the
>> instruction buffer and then decode the relevant kernels from 3DSTATE_PS.
>
>
> If I understood things right, I think that's what Matt's working on.

aubinator already does this.

>> Decoding vertex buffers is perhaps less interesting in general than it
>> was for me, but similarly if they could be found within the error state,
>> dumping the vertices for each PRIMITIVE -- though with GS/HS/TS that
>> again is probably useless -- so really just an example of being
>> flexible, if the data is there in the error state, decode it inline.
>
>
> That's something I'm also interested in.
> I would really like to be able to inspect the state of the GPU for each draw
> call in aubinator for example.
> Lines of text aren't the best way to inspect though.
>
> A few months back, I started to write an aubinator.py script in the hope of
> having some UI on top.
> I think we need a slightly better genxml description to have that completely
> automatic though, as it stands we miss some relationships between
> offsets/address & STATE_BASE_ADDRESS for example.

Be careful, before you know it you'll be writing a simulator. It happened to me.

Kristian

>>
>> I was won over at having pager support. I was getting close to moving
>> the decoder to mesa myself, though it would have been a much less
>> refined effort!
>> -Chris
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool

2017-03-30 Thread Lionel Landwerlin

On 30/03/17 20:09, Chris Wilson wrote:
> On Thu, Mar 30, 2017 at 11:27:26AM -0700, Matt Turner wrote:
>> I think we should figure out how to make this not just a fork of
>> intel_error_decode. Should intel_error_decode go away?
>>
>> There are various tools in i-g-t that I'm definitely in favor of
>> moving into Mesa (like the assembler and disassembler, and aubdump).
>> Should this one move too? Do we have buy-in from Chris?
>
> Sure, if it means that we do get an actively maintained decoder - just
> hope that gen2-3 is forthcoming, and perhaps some inferred type
> decoding.

I guess that depends on whether we can get somebody to write genxml for
them.
Not sure it's a big priority for us :/
Maybe a rainy weekend project?

> What's more important is that it is fairly flexible in error-state file
> format - mostly the lists and order of registers change, different
> amount of detail in the kernel state trackers etc. I was thinking that
> maybe we wanted a more parseable format like json (error.json),
> something a bit more flexible?

That would be great.

> What I also want is a use-case for EXEC_OBJECT_CAPTURE. Aside from
> plain listing in the error state, it would be nice if the decoder could
> parse 3DSTATE_BASE to work out which buffer was, for example, the
> instruction buffer and then decode the relevant kernels from 3DSTATE_PS.

If I understood things right, I think that's what Matt's working on.

> Decoding vertex buffers is perhaps less interesting in general than it
> was for me, but similarly if they could be found within the error state,
> dumping the vertices for each PRIMITIVE -- though with GS/HS/TS that
> again is probably useless -- so really just an example of being
> flexible, if the data is there in the error state, decode it inline.

That's something I'm also interested in.
I would really like to be able to inspect the state of the GPU for each
draw call in aubinator for example.
Lines of text aren't the best way to inspect though.

A few months back, I started to write an aubinator.py script in the hope
of having some UI on top.
I think we need a slightly better genxml description to have that
completely automatic though, as it stands we miss some relationships
between offsets/address & STATE_BASE_ADDRESS for example.

> I was won over at having pager support. I was getting close to moving
> the decoder to mesa myself, though it would have been a much less
> refined effort!
> -Chris





[Mesa-dev] [Bug 35268] initial-exec TLS model breaks dlopen'ed libGL

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=35268

Nicholas Fish  changed:

   What|Removed |Added

 CC||x...@seaofdirac.net

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-30 Thread Bartosz Tomczyk
Call it directly when batch queue is empty. This avoids costly thread
synchronisation. This commit improves performance of games that have
previously regressed with mesa_glthread=true.
---
 src/mesa/main/glthread.c | 47 ++-
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..4fcd322163 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
}
 }
 
-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
 {
struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-  return;
-
-   batch = glthread->batch;
+   struct glthread_batch *batch = glthread->batch;
+ 
if (!batch->used)
   return;
 
@@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
   return;
}
 
-   pthread_mutex_lock(&glthread->mutex);
*glthread->batch_queue_tail = batch;
glthread->batch_queue_tail = &batch->next;
pthread_cond_broadcast(&glthread->new_work);
+}
+
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+  return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+  return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_locked(ctx);
pthread_mutex_unlock(&glthread->mutex);
 }
 
@@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (pthread_self() == glthread->thread)
   return;
 
-   _mesa_glthread_flush_batch(ctx);
-
pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy)) {
+  if (glthread->batch && glthread->batch->used) {
+ struct _glapi_table *dispatch = _glapi_get_dispatch();
+ glthread_unmarshal_batch(ctx, glthread->batch);
+ _glapi_set_dispatch(dispatch);
+ glthread_allocate_batch(ctx);
+  }
+   }
+   else {
+  _mesa_glthread_flush_batch_locked(ctx);
+  while (glthread->batch_queue || glthread->busy)
+ pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2



Re: [Mesa-dev] [PATCH] glsl: allow glsl_type::sampler_index() with images

2017-03-30 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Mar 30, 2017 at 12:33 AM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/compiler/glsl_types.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
> index 405aa3679a..cf0fe71d1a 100644
> --- a/src/compiler/glsl_types.cpp
> +++ b/src/compiler/glsl_types.cpp
> @@ -315,7 +315,7 @@ glsl_type::sampler_index() const
>  {
> const glsl_type *const t = (this->is_array()) ? this->fields.array : this;
>
> -   assert(t->is_sampler());
> +   assert(t->is_sampler() || t->is_image());
>
> switch (t->sampler_dimensionality) {
> case GLSL_SAMPLER_DIM_1D:
> --
> 2.12.1
>


Re: [Mesa-dev] [PATCH] st/glsl_to_tgsi: use glsl_type::sampler_index()

2017-03-30 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Mar 30, 2017 at 12:28 AM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 68 
> +-
>  1 file changed, 2 insertions(+), 66 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 46c97783d8..d70018c8a8 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -3883,39 +3883,7 @@ glsl_to_tgsi_visitor::visit_image_intrinsic(ir_call 
> *ir)
> inst->sampler_array_size = sampler_array_size;
> inst->sampler_base = sampler_base;
>
> -   switch (type->sampler_dimensionality) {
> -   case GLSL_SAMPLER_DIM_1D:
> -  inst->tex_target = (type->sampler_array)
> - ? TEXTURE_1D_ARRAY_INDEX : TEXTURE_1D_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_2D:
> -  inst->tex_target = (type->sampler_array)
> - ? TEXTURE_2D_ARRAY_INDEX : TEXTURE_2D_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_3D:
> -  inst->tex_target = TEXTURE_3D_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_CUBE:
> -  inst->tex_target = (type->sampler_array)
> - ? TEXTURE_CUBE_ARRAY_INDEX : TEXTURE_CUBE_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_RECT:
> -  inst->tex_target = TEXTURE_RECT_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_BUF:
> -  inst->tex_target = TEXTURE_BUFFER_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_EXTERNAL:
> -  inst->tex_target = TEXTURE_EXTERNAL_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_MS:
> -  inst->tex_target = (type->sampler_array)
> - ? TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX : TEXTURE_2D_MULTISAMPLE_INDEX;
> -  break;
> -   default:
> -  assert(!"Should not get here.");
> -   }
> -
> +   inst->tex_target = type->sampler_index();
> inst->image_format = st_mesa_format_to_pipe_format(st_context(ctx),
>   _mesa_get_shader_image_format(imgvar->data.image_format));
>
> @@ -4425,39 +4393,7 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
>inst->tex_offset_num_offset = i;
> }
>
> -   switch (sampler_type->sampler_dimensionality) {
> -   case GLSL_SAMPLER_DIM_1D:
> -  inst->tex_target = (sampler_type->sampler_array)
> - ? TEXTURE_1D_ARRAY_INDEX : TEXTURE_1D_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_2D:
> -  inst->tex_target = (sampler_type->sampler_array)
> - ? TEXTURE_2D_ARRAY_INDEX : TEXTURE_2D_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_3D:
> -  inst->tex_target = TEXTURE_3D_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_CUBE:
> -  inst->tex_target = (sampler_type->sampler_array)
> - ? TEXTURE_CUBE_ARRAY_INDEX : TEXTURE_CUBE_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_RECT:
> -  inst->tex_target = TEXTURE_RECT_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_BUF:
> -  inst->tex_target = TEXTURE_BUFFER_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_EXTERNAL:
> -  inst->tex_target = TEXTURE_EXTERNAL_INDEX;
> -  break;
> -   case GLSL_SAMPLER_DIM_MS:
> -  inst->tex_target = (sampler_type->sampler_array)
> - ? TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX : TEXTURE_2D_MULTISAMPLE_INDEX;
> -  break;
> -   default:
> -  assert(!"Should not get here.");
> -   }
> -
> +   inst->tex_target = sampler_type->sampler_index();
> inst->tex_type = ir->type->base_type;
>
> this->result = result_src;
> --
> 2.12.1
>


Re: [Mesa-dev] [PATCH 6/6] radeonsi: enable ARB_shader_clock

2017-03-30 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Thu, Mar 30, 2017 at 9:38 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  docs/features.txt  | 2 +-
>  docs/relnotes/17.1.0.html  | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index b4e54a7..d707f01 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -286,21 +286,21 @@ Khronos, ARB, and OES extensions that are not part of 
> any OpenGL or OpenGL ES ve
>GL_ARB_gpu_shader_int64   DONE (i965/gen8+, 
> nvc0, radeonsi, softpipe, llvmpipe)
>GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
>GL_ARB_parallel_shader_compilenot started, but 
> Chia-I Wu did some related work in 2014
>GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
> radeonsi, softpipe, swr)
>GL_ARB_post_depth_coverageDONE (i965)
>GL_ARB_robustness_isolation   not started
>GL_ARB_sample_locations   not started
>GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
> radeonsi, r600, softpipe, swr)
>GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
> nvc0, radeonsi, softpipe)
>GL_ARB_shader_ballot  not started
> -  GL_ARB_shader_clock   DONE (i965/gen7+)
> +  GL_ARB_shader_clock   DONE (i965/gen7+, 
> radeonsi)
>GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
> radeonsi)
>GL_ARB_shader_group_vote  DONE (nvc0)
>GL_ARB_shader_stencil_export  DONE (i965/gen9+, 
> radeonsi, softpipe, llvmpipe, swr)
>GL_ARB_shader_viewport_layer_arrayDONE (i965/gen6+)
>GL_ARB_sparse_buffer  not started
>GL_ARB_sparse_texture not started
>GL_ARB_sparse_texture2not started
>GL_ARB_sparse_texture_clamp   not started
>GL_ARB_texture_filter_minmax  not started
>GL_ARB_transform_feedback_overflow_query  DONE (i965/gen6+)
> diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
> index ada1e38..52b35b5 100644
> --- a/docs/relnotes/17.1.0.html
> +++ b/docs/relnotes/17.1.0.html
> @@ -38,20 +38,21 @@ TBD.
>
>
>  New features
>
>  
>  Note: some of the new features are only available with certain drivers.
>  
>
>  
>  GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, 
> llvmpipe
> +GL_ARB_shader_clock on radeonsi
>  GL_ARB_transform_feedback2 on i965/gen6
>  GL_ARB_transform_feedback_overflow_query on i965/gen6+
>  Geometry shaders enabled on swr
>  
>
>  Bug fixes
>
>  
>  
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 5423e89..6944c7c 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -414,20 +414,21 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
> case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
> case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
> case PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS:
> case PIPE_CAP_DOUBLES:
> case PIPE_CAP_TGSI_TEX_TXF_LZ:
> return 1;
>
> case PIPE_CAP_INT64:
> case PIPE_CAP_INT64_DIVMOD:
> +   case PIPE_CAP_TGSI_CLOCK:
> return HAVE_LLVM >= 0x0309;
>
> case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
> return !SI_BIG_ENDIAN && sscreen->b.info.has_userptr;
>
> case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
> return (sscreen->b.info.drm_major == 2 &&
> sscreen->b.info.drm_minor >= 43) ||
>sscreen->b.info.drm_major == 3;
>
> @@ -476,21 +477,20 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
> case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> case PIPE_CAP_NATIVE_FENCE_FD:
> case PIPE_CAP_TGSI_FS_FBFETCH:
> case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> case PIPE_CAP_UMA:
> -   case PIPE_CAP_TGSI_CLOCK:
> return 0;
>
> case PIPE_CAP_QUERY_BUFFER_OBJECT:
> return si_have_tgsi_compute(sscreen);
>
> case PIPE_CAP_DRAW_PARAMETERS:
> 

Re: [Mesa-dev] [Request for Comments] - Port documentation to Markdown

2017-03-30 Thread Jean Hertel
Hello Emil,


I'm not sure if you have read the full conversation, but I changed my mind
about this.

I have shifted focus to using Sphinx for website generation along with
documentation.

As far as I can tell, there is already documentation about NIR and Gallium
written in reStructuredText, which Sphinx can use.


I have been very busy at work for the last two weeks, so I haven't made much
progress on the website itself.

My hope is to release a new repository on GitHub this weekend, with the initial
website in reStructuredText format.


Best Regards,

Jean Hertel.

From: Emil Velikov 
Sent: Wednesday, March 29, 2017 12:32
To: Brian Paul
Cc: Jean Hertel; mesa-dev@lists.freedesktop.org
Subject: Re: [Mesa-dev] [Request for Comments] - Port documentation to Markdown

Hi Jean,

On 8 March 2017 at 16:12, Brian Paul  wrote:

>> >One thing that I would prefer not to see is heavy things like Bootstrap.
>> >We definitely don't need it, I think writing our own few lines of CSS
>> >(which can be inspired by anything you want) is better. We have more
>> >than enough people who know how to do it (myself included), it will be
>> >cleaner (we won't need to include the whole forest to get our tree) and
>> >much easier to fix when there's a bug.
>>
>>
>> I would tend to agree but I don't care too much about those details so
>> long as it's maintainable.  My primary concern is that while a lot of
>> random developers in the community are liable to have brushed into CSS a
>> time or two, most probably won't know bootstrap.
>
>
> Yeah, I can't stress that enough.  The site has to be easily maintainable
> by the developers.  I, for one, don't know much about websites beyond html
> and a little CSS.  If you create a new website infrastructure and then
> disappear after a few months we need to be able to take over.  Also, we
> can't funnel documentation updates through a handful of people that know a
> complex system.
>
Have you had some time to look into this ?

It would be great if we can get things rolling, even if not perfect.

Thanks
Emil


[Mesa-dev] [Bug 100262] libswrAVX2.so Causes hang with QOpenGLWidget

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100262

--- Comment #2 from ch...@circlecvi.com ---
(In reply to Tim Rowley from comment #1)
> Not seeing a hang here, but a transparent area where the OpenGL should be
> rendered to.
> 
> what version of Mesa and Qt were you using?
> 
> If you don't mind running an experiment, could you try setting
> KNOB_MAX_WORKER_THREADS=4 in the environment before running hellogl2?  This
> will stop swr from binding threads which might possibly be confusing Qt's
> internal threading.

Hi Tim!

I tested on Qt 4.8.6 and 5.6.1; both cause a hang for me using the SWR driver
on Mesa 13.0.1 and 13.0.5.

Setting KNOB_MAX_WORKER_THREADS=4 seems to correct the issue!

Can you tell me if this is a permanent fix? Also, does this need to change with
the number of CPUs available? I have field deployments experiencing these
issues.

Thanks!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH 0/1] NULL pointer in blob_write_string

2017-03-30 Thread Gregory Hainaut
Hello,

This fixes a crash with Nouveau and the shader cache.

I don't know whether it could impact the current stable version.

As a reminder, I don't have commit access.

Best regards,

Gregory Hainaut (1):
  glsl/blob: handle copy of NULL ptr in blob_write_string

 src/compiler/glsl/blob.c   |   5 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

-- 
2.1.4



[Mesa-dev] [PATCH 1/1] glsl/blob: handle copy of NULL ptr in blob_write_string

2017-03-30 Thread Gregory Hainaut
This typically happens when copying an unnamed shader parameter into the
shader cache.

Note: it is safer to copy an empty string so we can read it back safely.

Fixes piglit crashes in the 'texturegatheroffsets' tests.

Signed-off-by: Gregory Hainaut 
---
 src/compiler/glsl/blob.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/blob.c b/src/compiler/glsl/blob.c
index 769ebf1..f84d7f3 100644
--- a/src/compiler/glsl/blob.c
+++ b/src/compiler/glsl/blob.c
@@ -176,7 +176,10 @@ blob_write_intptr(struct blob *blob, intptr_t value)
 bool
 blob_write_string(struct blob *blob, const char *str)
 {
-   return blob_write_bytes(blob, str, strlen(str) + 1);
+   if (str == NULL)
+  return blob_write_bytes(blob, "", 1);
+   else
+  return blob_write_bytes(blob, str, strlen(str) + 1);
 }
 
 void
-- 
2.1.4
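
For readers without the Mesa tree at hand, the idea can be shown with a
minimal standalone sketch. The ex_blob type and ex_* functions below are
hypothetical stand-ins with a fixed-size buffer, not Mesa's struct blob
API; only the NULL handling mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer standing in for Mesa's struct blob. */
struct ex_blob {
   char data[64];
   size_t size;
};

/* Append len bytes, or fail without writing if they do not fit. */
static bool
ex_blob_write_bytes(struct ex_blob *blob, const void *bytes, size_t len)
{
   if (len > sizeof(blob->data) - blob->size)
      return false;

   memcpy(blob->data + blob->size, bytes, len);
   blob->size += len;
   return true;
}

/* A NULL string is stored as "", so a reader always finds a terminator
 * and strlen(NULL) is never evaluated.
 */
static bool
ex_blob_write_string(struct ex_blob *blob, const char *str)
{
   if (str == NULL)
      str = "";

   return ex_blob_write_bytes(blob, str, strlen(str) + 1);
}
```

Writing NULL followed by "abc" into an empty blob leaves five bytes: a
lone terminator, then "abc" with its terminator.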



Re: [Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool

2017-03-30 Thread Lionel Landwerlin

On 30/03/17 19:27, Matt Turner wrote:

On Wed, Mar 29, 2017 at 1:07 PM, Lionel Landwerlin
 wrote:

This is pretty much the same tool as what i-g-t has, only with a more
fancy decoding of the instructions/registers. It also doesn't support
anything before gen4.

Signed-off-by: Lionel Landwerlin 
---
  src/intel/Makefile.tools.am  |  20 +-
  src/intel/common/gen_decoder.c   |  10 +
  src/intel/common/gen_decoder.h   |   1 +
  src/intel/tools/.gitignore   |   1 +
  src/intel/tools/aubinator_error_decode.c | 783 +++
  5 files changed, 814 insertions(+), 1 deletion(-)
  create mode 100644 src/intel/tools/aubinator_error_decode.c

diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
index 245bd03eef..a3a917d50e 100644
--- a/src/intel/Makefile.tools.am
+++ b/src/intel/Makefile.tools.am
@@ -19,7 +19,9 @@
  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
  # IN THE SOFTWARE.

-noinst_PROGRAMS += tools/aubinator
+noinst_PROGRAMS += \
+   tools/aubinator \
+   tools/aubinator_error_decode

  tools_aubinator_SOURCES = \
 tools/aubinator.c \
@@ -41,3 +43,19 @@ tools_aubinator_LDADD = \
 $(EXPAT_LIBS) \
 $(ZLIB_LIBS) \
 -lm
+
+
+tools_aubinator_error_decode_SOURCES = \
+   tools/aubinator_error_decode.c
+
+tools_aubinator_error_decode_LDADD = \
+   common/libintel_common.la \
+   $(top_builddir)/src/util/libmesautil.la \
+   $(aubinator_DEPS) \

What is aubinator_DEPS?


Left over from the rebase.. Removing!




+   $(EXPAT_LIBS) \
+   $(ZLIB_LIBS)
+
+tools_aubinator_error_decode_CFLAGS = \
+   $(AM_CFLAGS) \
+   $(EXPAT_CFLAGS) \
+   $(ZLIB_CFLAGS)
diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 1c3246f265..3af472caef 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -112,6 +112,16 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset)
 return NULL;
  }

+struct gen_group *
+gen_spec_find_register_by_name(struct gen_spec *spec, const char *name)
+{
+   for (int i = 0; i < spec->nregisters; i++)
+  if (strcmp(spec->registers[i]->name, name) == 0)
+ return spec->registers[i];

Use braces in nested control flow.


Sure.
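
For reference, the braced form would look roughly like the sketch below.
The ex_group and ex_spec types are trimmed-down, hypothetical stand-ins
for the decoder structs so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down versions of the decoder types. */
struct ex_group {
   const char *name;
};

struct ex_spec {
   int nregisters;
   struct ex_group **registers;
};

/* Linear search by name; returns NULL when nothing matches. */
static struct ex_group *
ex_find_register_by_name(struct ex_spec *spec, const char *name)
{
   for (int i = 0; i < spec->nregisters; i++) {
      if (strcmp(spec->registers[i]->name, name) == 0)
         return spec->registers[i];
   }

   return NULL;
}
```

Only the outer for loop gains braces; the single-statement if inside it
stays as it was.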




+
+   return NULL;
+}
+
  struct gen_enum *
  gen_spec_find_enum(struct gen_spec *spec, const char *name)
  {
diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h
index 1c41de80a4..936b052455 100644
--- a/src/intel/common/gen_decoder.h
+++ b/src/intel/common/gen_decoder.h
@@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct 
gen_device_info *devinfo,
  uint32_t gen_spec_get_gen(struct gen_spec *spec);
  struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const 
uint32_t *p);
  struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset);
+struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, const 
char *name);
  int gen_group_get_length(struct gen_group *group, const uint32_t *p);
  const char *gen_group_get_name(struct gen_group *group);
  uint32_t gen_group_get_opcode(struct gen_group *group);
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6fed2..27437f9eef 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,2 @@
  /aubinator
+/aubinator_error_decode
diff --git a/src/intel/tools/aubinator_error_decode.c 
b/src/intel/tools/aubinator_error_decode.c
new file mode 100644
index 00..a477086cd8
--- /dev/null
+++ b/src/intel/tools/aubinator_error_decode.c
@@ -0,0 +1,783 @@
+/*
+ * Copyright © 2007-2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Eric Anholt 
+ *Carl 

[Mesa-dev] Windows build requires extra tools from release tarball

2017-03-30 Thread Ben Boeckel
Hi,

I'm trying to compile Mesa on Windows, but am running into the problem
that mako and lex/yacc are required. The generated files are in
the source tree, but from my investigations (and limited knowledge of
SCons), it appears that the SCons code does not care and always runs the
generation logic.

Does the SCons build require these tools, or would it be possible to get
it to use the pre-generated source files instead?

Thanks,

--Ben
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] intel: genxml: add gen7 ERR_INT register

2017-03-30 Thread Lionel Landwerlin

On 30/03/17 18:53, Matt Turner wrote:

On Wed, Mar 29, 2017 at 1:07 PM, Lionel Landwerlin
 wrote:

Signed-off-by: Lionel Landwerlin 
---
  src/intel/genxml/gen7.xml | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index ba9c8e8154..08307b3506 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2665,4 +2665,15 @@
  


+  

Looks like this exists on HSW too. Do we need to add it to gen75.xml?


Woops, yes indeed!



Re: [Mesa-dev] [PATCH v2 06/18] anv: Implement VK_KHX_external_memory

2017-03-30 Thread Jason Ekstrand
On Thu, Mar 30, 2017 at 11:27 AM, Chad Versace 
wrote:

> On Mon 13 Mar 2017, Jason Ekstrand wrote:
> > There's really nothing for us to do here.  So long as the user doesn't
> > set any crazy environment variables such as INTEL_VK_HIZ=false, all of
> > the compression formats etc. should "just work" at least for opaque
> > handle types.
>
> I think the commit message should go with the opaque fd commit. This
> patch's commit message should say something like,
>
>   Turn it on. Trivially correct. Don't support any
> VkExternalMemoryHandleTypes yet.
>

Good call.  I wrote:

This is the trivial implementation that just exposes the extension
string but exposes zero external handle types.

I moved the other comment to the external_memory_fd commit.


> but in real sentences ;)
>
> > ---
> >  src/intel/vulkan/anv_device.c   | 6 +-
> >  src/intel/vulkan/anv_entrypoints_gen.py | 1 +
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> > index f92a313..385a806 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -314,7 +314,11 @@ static const VkExtensionProperties
> device_extensions[] = {
> > {
> >.extensionName = VK_KHR_DESCRIPTOR_UPDATE_
> TEMPLATE_EXTENSION_NAME,
> >.specVersion = 1,
> > -   }
> > +   },
> > +   {
> > +  .extensionName = VK_KHX_EXTERNAL_MEMORY_EXTENSION_NAME,
> > +  .specVersion = 1,
> > +   },
> >  };
> >
> >  static void *
> > diff --git a/src/intel/vulkan/anv_entrypoints_gen.py
> b/src/intel/vulkan/anv_entrypoints_gen.py
> > index 2c084ae..e8cdfb7 100644
> > --- a/src/intel/vulkan/anv_entrypoints_gen.py
> > +++ b/src/intel/vulkan/anv_entrypoints_gen.py
> > @@ -39,6 +39,7 @@ supported_extensions = [
> > 'VK_KHR_wayland_surface',
> > 'VK_KHR_xcb_surface',
> > 'VK_KHR_xlib_surface',
> > +   'VK_KHX_external_memory',
> > 'VK_KHX_external_memory_capabilities',
> >  ]
> >
> > --
> > 2.5.0.400.gff86faf
> >


Re: [Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool

2017-03-30 Thread Chris Wilson
On Thu, Mar 30, 2017 at 11:27:26AM -0700, Matt Turner wrote:
> I think we should figure out how to make this not just a fork of
> intel_error_decode. Should intel_error_decode go away?
> 
> There are various tools in i-g-t that I'm definitely in favor of
> moving into Mesa (like the assembler and disassembler, and aubdump).
> Should this one move too? Do we have buy-in from Chris?

Sure, if it means that we do get an actively maintained decoder - just
hope that gen2-3 is forthcoming, and perhaps some inferred type
decoding.

What's more important is that it is fairly flexible in error-state file
format - mostly the lists and order of registers change, different
amount of detail in the kernel state trackers etc. I was thinking that
maybe we wanted a more parseable format like json (error.json),
something a bit more flexible?

What I also want is a use-case for EXEC_OBJECT_CAPTURE. Aside from
plain listing in the error state, it would be nice if the decoder could
parse 3DSTATE_BASE to work out which buffer was, for example, the
instruction buffer and then decode the relevant kernels from 3DSTATE_PS.
Decoding vertex buffers is perhaps less interesting in general than it
was for me, but similarly if they could be found within the error state,
dumping the vertices for each PRIMITIVE -- though with GS/HS/TS that
again is probably useless -- so really just an example of being
flexible, if the data is there in the error state, decode it inline.

I was won over at having pager support. I was getting close to moving
the decoder to mesa myself, though it would have been a much less
refined effort!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Mesa-dev] [PATCH] radv: Invalidate L2 for TRANSFER_WRITE barriers

2017-03-30 Thread Bas Nieuwenhuizen
From: Alex Smith 

CP DMA and PKT3_WRITE_DATA (in CmdUpdateBuffer) don't (currently) write
through L2. Therefore, to make these writes visible to later accesses
we must invalidate L2 rather than just writing it back, to avoid the
possibility that stale data is read through L2.

Signed-off-by: Alex Smith 
Reviewed-by: Bas Nieuwenhuizen 
Cc: "17.0" 
[Bas: patch is a backport for 17.0 of the cherry-pick below]
(cherry picked from commit bc5d587a80b64fb3e0a5ea8067e6317fbca2bbc5)
---
 src/amd/vulkan/radv_cmd_buffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 628737c75ac..3aa415b152f 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2580,7 +2580,8 @@ void radv_CmdPipelineBarrier(
flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_DB;
break;
case VK_ACCESS_TRANSFER_WRITE_BIT:
-   flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
+   flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB |
+ RADV_CMD_FLAG_INV_GLOBAL_L2;
break;
default:
break;
-- 
2.12.1
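
As a rough illustration of the rule this patch introduces, the sketch below maps a barrier's source access mask to flush bits. The SKETCH_* constants and sketch_src_flush_bits() are hypothetical stand-ins with made-up values, not the real Vulkan or radv definitions; the only part taken from the patch is that transfer writes get an L2 invalidate in addition to the CB flush.

```c
#include <stdint.h>

/* Hypothetical stand-ins for Vulkan access bits (values are illustrative). */
#define SKETCH_ACCESS_COLOR_WRITE    (1u << 0)
#define SKETCH_ACCESS_DEPTH_WRITE    (1u << 1)
#define SKETCH_ACCESS_TRANSFER_WRITE (1u << 2)

/* Hypothetical stand-ins for the driver's flush flags. */
#define SKETCH_FLUSH_AND_INV_CB (1u << 0)
#define SKETCH_FLUSH_AND_INV_DB (1u << 1)
#define SKETCH_INV_GLOBAL_L2    (1u << 2)

/* Walk every set bit of the source access mask and accumulate the cache
 * operations needed to make those writes visible to later accesses. */
static uint32_t
sketch_src_flush_bits(uint32_t src_access)
{
   uint32_t flush_bits = 0;

   for (unsigned b = 0; b < 32; b++) {
      uint32_t bit = 1u << b;

      if (!(src_access & bit))
         continue;

      switch (bit) {
      case SKETCH_ACCESS_COLOR_WRITE:
         flush_bits |= SKETCH_FLUSH_AND_INV_CB;
         break;
      case SKETCH_ACCESS_DEPTH_WRITE:
         flush_bits |= SKETCH_FLUSH_AND_INV_DB;
         break;
      case SKETCH_ACCESS_TRANSFER_WRITE:
         /* These writes bypass L2 in this sketch, so writing L2 back is
          * not enough: it has to be invalidated as well. */
         flush_bits |= SKETCH_FLUSH_AND_INV_CB | SKETCH_INV_GLOBAL_L2;
         break;
      default:
         break; /* Nothing to do */
      }
   }

   return flush_bits;
}
```

Calling sketch_src_flush_bits(SKETCH_ACCESS_TRANSFER_WRITE) yields both the CB flush and the L2 invalidate, while a colour-only mask yields just the CB flush.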



Re: [Mesa-dev] [PATCH 1/2] intel: genxml: compress all gen files into one

2017-03-30 Thread Jordan Justen
Series Reviewed-by: Jordan Justen 

On 2017-03-25 14:57:15, Lionel Landwerlin wrote:
> Combining all the files into a single string didn't make any
> difference in the size of the aubinator binary.
> 
> With this change we now also embed gen4/4.5/5 descriptions, which
> increases the aubinator size by ~16Kb.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/Makefile.genxml.am| 10 +++---
>  src/intel/Makefile.sources  |  7 -
>  src/intel/common/gen_decoder.c  | 62 
> +
>  src/intel/genxml/.gitignore |  2 +-
>  src/intel/genxml/gen_zipped_file.py | 34 +---
>  5 files changed, 56 insertions(+), 59 deletions(-)
> 
> diff --git a/src/intel/Makefile.genxml.am b/src/intel/Makefile.genxml.am
> index 01a02b63b4..c5cc843191 100644
> --- a/src/intel/Makefile.genxml.am
> +++ b/src/intel/Makefile.genxml.am
> @@ -21,12 +21,12 @@
>  
>  BUILT_SOURCES += \
> $(GENXML_GENERATED_FILES) \
> -   $(AUBINATOR_GENERATED_FILES)
> +   genxml/genX_xml.h
>  
>  EXTRA_DIST += \
> $(GENXML_XML_FILES) \
> $(GENXML_GENERATED_FILES) \
> -   $(AUBINATOR_GENERATED_FILES)
> +   genxml/genX_xml.h
>  
>  SUFFIXES = _pack.h _xml.h .xml
>  
> @@ -36,11 +36,9 @@ $(GENXML_GENERATED_FILES): genxml/gen_pack_header.py
> $(MKDIR_GEN)
> $(PYTHON_GEN) $(srcdir)/genxml/gen_pack_header.py $< > $@ || ($(RM) 
> $@; false)
>  
> -$(AUBINATOR_GENERATED_FILES): genxml/gen_zipped_file.py
> -
> -.xml_xml.h:
> +genxml/genX_xml.h: $(GENXML_XML_FILES) genxml/gen_zipped_file.py
> $(MKDIR_GEN)
> -   $(AM_V_GEN) $(PYTHON2) $(srcdir)/genxml/gen_zipped_file.py $< > $@ || 
> ($(RM) $@; false)
> +   $(AM_V_GEN) $(PYTHON2) $(srcdir)/genxml/gen_zipped_file.py 
> $(GENXML_XML_FILES) > $@
>  
>  EXTRA_DIST += \
> genxml/genX_pack.h \
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 88bcf60f6e..b5992c8d35 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -129,13 +129,6 @@ GENXML_GENERATED_FILES = \
> genxml/gen8_pack.h \
> genxml/gen9_pack.h
>  
> -AUBINATOR_GENERATED_FILES = \
> -   genxml/gen6_xml.h \
> -   genxml/gen7_xml.h \
> -   genxml/gen75_xml.h \
> -   genxml/gen8_xml.h \
> -   genxml/gen9_xml.h
> -
>  ISL_FILES = \
> isl/isl.c \
> isl/isl.h \
> diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
> index 1c3246f265..7b04ac051b 100644
> --- a/src/intel/common/gen_decoder.c
> +++ b/src/intel/common/gen_decoder.c
> @@ -34,11 +34,7 @@
>  
>  #include "gen_decoder.h"
>  
> -#include "genxml/gen6_xml.h"
> -#include "genxml/gen7_xml.h"
> -#include "genxml/gen75_xml.h"
> -#include "genxml/gen8_xml.h"
> -#include "genxml/gen9_xml.h"
> +#include "genxml/genX_xml.h"
>  
>  #define XML_BUFFER_SIZE 4096
>  
> @@ -481,35 +477,6 @@ devinfo_to_gen(const struct gen_device_info *devinfo)
> return value;
>  }
>  
> -static const struct {
> -   int gen;
> -   const uint8_t *data;
> -   size_t data_length;
> -} gen_data[] = {
> -   { .gen = 60, .data = gen6_xml, .data_length = sizeof(gen6_xml) },
> -   { .gen = 70, .data = gen7_xml, .data_length = sizeof(gen7_xml) },
> -   { .gen = 75, .data = gen75_xml, .data_length = sizeof(gen75_xml) },
> -   { .gen = 80, .data = gen8_xml, .data_length = sizeof(gen8_xml) },
> -   { .gen = 90, .data = gen9_xml, .data_length = sizeof(gen9_xml) }
> -};
> -
> -static const uint8_t *
> -devinfo_to_xml_data(const struct gen_device_info *devinfo,
> -uint32_t *data_length)
> -{
> -   int i, gen = devinfo_to_gen(devinfo);
> -
> -   for (i = 0; i < ARRAY_SIZE(gen_data); i++) {
> -  if (gen_data[i].gen == gen) {
> - *data_length = gen_data[i].data_length;
> - return gen_data[i].data;
> -  }
> -   }
> -
> -   unreachable("Unknown generation");
> -   return NULL;
> -}
> -
>  static uint32_t zlib_inflate(const void *compressed_data,
>   uint32_t compressed_len,
>   void **out_ptr)
> @@ -563,9 +530,22 @@ gen_spec_load(const struct gen_device_info *devinfo)
>  {
> struct parser_context ctx;
> void *buf;
> -   const void *zlib_data;
> -   void *text_data;
> -   uint32_t zlib_length = 0, text_length;
> +   uint8_t *text_data;
> +   uint32_t text_offset = 0, text_length = 0, total_length;
> +   uint32_t gen_10 = devinfo_to_gen(devinfo);
> +
> +   for (int i = 0; i < ARRAY_SIZE(genxml_files_table); i++) {
> +  if (genxml_files_table[i].gen_10 == gen_10) {
> + text_offset = genxml_files_table[i].offset;
> + text_length = genxml_files_table[i].length;
> + break;
> +  }
> +   }
> +
> +   if (text_length == 0) {
> +  fprintf(stderr, "unable to find gen (%u) data\n", gen_10);
> +  return NULL;
> +   }
>  
> memset(&ctx, 0, sizeof ctx);
> ctx.parser = 
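
The quoted hunk above looks up one generation's XML inside a single combined blob. A minimal sketch of that kind of lookup follows, assuming a table of (gen_10, offset, length) entries like the one the patch iterates over. The struct, table contents and function name here are hypothetical, and the offsets and lengths are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: where one generation's XML text sits inside
 * the single decompressed blob. */
struct sketch_genxml_entry {
   uint32_t gen_10;  /* generation times ten, e.g. 75 for gen7.5 */
   uint32_t offset;  /* byte offset of this generation's XML */
   uint32_t length;  /* byte length of this generation's XML */
};

/* Made-up offsets and lengths, for illustration only. */
static const struct sketch_genxml_entry sketch_table[] = {
   { 70, 0,    1000 },
   { 75, 1000, 1200 },
   { 80, 2200, 1500 },
};

/* Return the entry for the requested generation, or NULL if the blob
 * carries no description for it. */
static const struct sketch_genxml_entry *
sketch_find_genxml(uint32_t gen_10)
{
   size_t count = sizeof(sketch_table) / sizeof(sketch_table[0]);

   for (size_t i = 0; i < count; i++) {
      if (sketch_table[i].gen_10 == gen_10)
         return &sketch_table[i];
   }

   return NULL;
}
```

Returning NULL for an unknown generation mirrors the patch's choice to report an error and bail out rather than hit an unreachable().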

Re: [Mesa-dev] Meson mesademos (Was: [RFC libdrm 0/2] Replace the build system with meson)

2017-03-30 Thread Dylan Baker
Quoting Jose Fonseca (2017-03-29 15:27:58)
> On 28/03/17 22:37, Dylan Baker wrote:
> > Quoting Jose Fonseca (2017-03-28 13:45:57)
> >> On 28/03/17 21:32, Dylan Baker wrote:
> >>> Quoting Jose Fonseca (2017-03-28 09:19:48)
>  On 28/03/17 00:12, Dylan Baker wrote:
> > Quoting Jose Fonseca (2017-03-27 09:58:59)
> >> On 27/03/17 17:42, Dylan Baker wrote:
> >>> Quoting Jose Fonseca (2017-03-27 09:31:04)
>  On 27/03/17 17:24, Dylan Baker wrote:
> > Quoting Jose Fonseca (2017-03-26 14:53:50)
> >> I've pushed the branch to mesa/demos, so we can all collaborate 
> >> without
> >> wasting time crossporting patches between private branches.
> >>
> >>https://cgit.freedesktop.org/mesa/demos/commit/?h=meson
> >>
> >> Unfortunately, I couldn't actually go very far until I hit a wall, 
> >> as
> >> you can see in the last commit message.
> >>
> >>
> >> The issue is that Windows has no standard paths for dependencies
> >> includes/libraries (like /usr/include or /usr/lib), nor standard 
> >> tool
> >> for dependencies (no pkgconfig).  But it seems that Meson presumes 
> >> any
> >> unknown dependency can be resolved with pkgconfig.
> >>
> >>
> >> The question is: how do I tell Meson where the GLEW 
> >> headers/library for
> >> MinGW are supposed to be found?
> >>
> >>
> >> I know one solution might be Meson Wraps.  Is that the only way?
> >>
> >>
> >> CMake makes it very easy to do it (via Cache files as explained in 
> >> my
> >> commit message.)  Is there a way to achieve the same, perhaps via
> >> cross_file properties or something like that?
> >>
> >>
> >> Jose
> >
> > I think there are two ways you could solve this:
> >
> > Wraps are probably the most generically correct method; what I mean 
> > by that is
> > that a proper wrap would solve the problem for everyone, on every 
> > operating
> > system, forever.
> 
>  Yeah, that sounded like a good solution, particularly for Windows where it's 
>  so
>  much easier to just build the dependencies as a subproject rather 
>  than
>  fetch dependencies from somewhere, since MSVC RT versions have to 
>  match
>  and so.
> 
>   > That said, I took a look at GLEW and it doesn't look like a
> > straightforward project to port to meson, since it uses a huge pile 
> > of gnu
> > makefiles for compilation, without any autoconf/cmake/etc. I still 
> > might take a
> > swing at it since I want to know how hard it would be to write a 
> > wrap file for
> > something like GLEW (and it would probably be a pretty useful 
> > project to wrap)
> > where a meson build system is likely never going to go upstream.
> 
>  BTW, regarding GLEW, some time ago I actually prototyped using GLAD
>  instead of GLEW for mesademos:
> 
> https://cgit.freedesktop.org/~jrfonseca/mesademos/log/?h=glad
> 
>  I find GLAD much nicer than GLEW: it's easier to build, it uses 
>  upstream
>  XML files, it supports EGL, and it's easy to bundle.
> 
>  Maybe we could migrate mesademos to GLAD as part of this work 
>  instead of
>  trying to get glew "mesonfied".
> 
> > The other option I think you can use is cross properties[1], 
> > which I believe
> > is the closest thing meson has to cmake's cache files.
> >
> > I've pushed a couple of commits, the last one implements the cross 
> > properties
> > idea, which gets the build farther, but then it can't find the glut 
> > headers,
> > and I don't understand why, since "cc.has_header('GL/glut')" 
> > returns true. I
> > still think that wraps are a better plan, but I'll have to spend 
> > some time today
> > working on a glew wrap.
> >
> > [1] https://github.com/mesonbuild/meson/wiki/Cross-compilation (at 
> > the bottom
> > under the heading "Custom Data")
> 
>  I'm running out of time today, but I'll try to take a look tomorrow.
> 
>  Jose
> 
> >>>
> >>> I'd had a similar thought, but thought of libepoxy? It supports the 
> >>> platforms we
> >>> want, and already has a meson build system that works for windows.
> >>
> >> I have no experience with libepoxy.  I know GLAD is really easy to
> >> understand, use and integrate.  It's completely agnostic to toolkits 
> >> like
> >> GLUT/GLFW/etc doesn't try to alias equivalent 

[Mesa-dev] [Bug 100262] libswrAVX2.so Causes hang with QOpenGLWidget

2017-03-30 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100262

--- Comment #1 from Tim Rowley  ---
Not seeing a hang here, but a transparent area where the OpenGL should be
rendered to.

What version of Mesa and Qt were you using?

If you don't mind running an experiment, could you try setting
KNOB_MAX_WORKER_THREADS=4 in the environment before running hellogl2?  This
will stop swr from binding threads which might possibly be confusing Qt's
internal threading.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [Mesa-announce] Mesa 17.0.3 release candidate

2017-03-30 Thread Andres Gomez
On Thu, 2017-03-30 at 18:09 +0300, Andres Gomez wrote:
...

> Rejected (2)
> 

...

> Thomas Hellstrom (1):
>   gbm/dri: Flush after unmap
> 
> The commit caused a regression in i965 (and possibly others) since it
This should have said i915 -

-- 
Br,

Andres


Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses

2017-03-30 Thread Jason Ekstrand
On Thu, Mar 30, 2017 at 9:25 AM, Kristian H. Kristensen 
wrote:

> Jason Ekstrand  writes:
>
> > This commit adds support for using the full 48-bit address space on
> > Broadwell and newer hardware.  Thanks to certain limitations, not all
> > objects can be placed above the 32-bit boundary.  In particular, general
> > and state base address need to live within 32 bits.  (See also
> > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
> > to handle this, we add a supports_48bit_address field to anv_bo and only
> > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
> > for all client-allocated memory objects but leave it false for
> > driver-allocated objects.  While this is more conservative than needed,
> > all driver allocations should easily fit in the first 32 bits of address
> > space and keeps things simple because we don't have to think about
> > whether or not any given one of our allocation data structures will be
> > used in a 48-bit-unsafe way.
> > ---
> >  src/intel/vulkan/anv_allocator.c   | 10 --
> >  src/intel/vulkan/anv_batch_chain.c | 14 ++
> >  src/intel/vulkan/anv_device.c  |  4 +++-
> >  src/intel/vulkan/anv_gem.c | 18 ++
> >  src/intel/vulkan/anv_intel.c   |  2 +-
> >  src/intel/vulkan/anv_private.h | 29 +++--
> >  6 files changed, 67 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_
> allocator.c
> > index 45c663b..88c9c13 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -255,7 +255,7 @@ anv_block_pool_init(struct anv_block_pool *pool,
> > assert(util_is_power_of_two(block_size));
> >
> > pool->device = device;
> > -   anv_bo_init(&pool->bo, 0, 0);
> > +   anv_bo_init(&pool->bo, 0, 0, false);
> > pool->block_size = block_size;
> > pool->free_list = ANV_FREE_LIST_EMPTY;
> > pool->back_free_list = ANV_FREE_LIST_EMPTY;
> > @@ -475,7 +475,13 @@ anv_block_pool_grow(struct anv_block_pool *pool,
> struct anv_block_state *state)
> >  * values back into pool. */
> > pool->map = map + center_bo_offset;
> > pool->center_bo_offset = center_bo_offset;
> > -   anv_bo_init(&pool->bo, gem_handle, size);
> > +
> > +   /* Block pool BOs are marked as not supporting 48-bit addresses
> because
> > +* they are used to back STATE_BASE_ADDRESS.
> > +*
> > +* See also anv_bo::supports_48bit_address.
> > +*/
> > +   anv_bo_init(&pool->bo, gem_handle, size, false);
> > pool->bo.map = map;
> >
> >  done:
> > diff --git a/src/intel/vulkan/anv_batch_chain.c
> b/src/intel/vulkan/anv_batch_chain.c
> > index 5d7abc6..b098e4b 100644
> > --- a/src/intel/vulkan/anv_batch_chain.c
> > +++ b/src/intel/vulkan/anv_batch_chain.c
> > @@ -979,7 +979,8 @@ anv_execbuf_finish(struct anv_execbuf *exec,
> >  }
> >
> >  static VkResult
> > -anv_execbuf_add_bo(struct anv_execbuf *exec,
> > +anv_execbuf_add_bo(struct anv_device *device,
> > +   struct anv_execbuf *exec,
> > struct anv_bo *bo,
> > struct anv_reloc_list *relocs,
> > const VkAllocationCallbacks *alloc)
> > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
> >obj->rsvd1 = 0;
> >obj->rsvd2 = 0;
> > +
> > +  if (device->instance->physicalDevice.supports_48bit_addresses &&
> > +  bo->supports_48bit_address)
> > + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > }
>
> Looking at the pointer chasing to get to supports_48bit_address and how
> you had to add an anv_device argument to this entire callchain, I'd
> suggest rolling bo->is_winsys_bo and bo->supports_48bit_address into a
> bo->flags field where you can set EXEC_OBJECT_SUPPORTS_48B_ADDRESS and
> EXEC_OBJECT_WRITE as needed at bo create time. All you need to do then
> is obj->flags = bo->flags and all these computations and conditionals
> move out of the hot path and you can drop the anv_device argument again.
>
> For the case where we set the render write domain for winsys buffers,
> you'll have to test if the EXEC_OBJECT_WRITE flag is set instead of
> testing if bo->is_winsys_bo is true.
>

In an earlier version of the series, I did exactly that. :-)  I'm happy to
change it back.
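
A minimal sketch of the bo->flags idea discussed here, assuming the execbuf flags are decided once when the BO is initialised and then copied verbatim on submission. All names and flag values are hypothetical stand-ins, not the real anv structures or i915 uAPI definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's exec object flags. */
#define SKETCH_EXEC_OBJECT_WRITE                (1u << 0)
#define SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1u << 1)

struct sketch_bo {
   uint32_t gem_handle;
   uint64_t size;
   uint32_t flags;  /* execbuf flags, decided once at creation time */
};

struct sketch_exec_object {
   uint32_t handle;
   uint32_t flags;
};

/* All conditionals live here, outside the submission hot path.
 * can_use_48bit is expected to already combine "the device supports
 * 48-bit addresses" with "this BO is safe above 4 GiB". */
static void
sketch_bo_init(struct sketch_bo *bo, uint32_t gem_handle, uint64_t size,
               bool is_winsys_bo, bool can_use_48bit)
{
   bo->gem_handle = gem_handle;
   bo->size = size;
   bo->flags = 0;

   if (is_winsys_bo)
      bo->flags |= SKETCH_EXEC_OBJECT_WRITE;
   if (can_use_48bit)
      bo->flags |= SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
}

/* Submission copies the precomputed flags; no device pointer is needed. */
static void
sketch_execbuf_add_bo(struct sketch_exec_object *obj,
                      const struct sketch_bo *bo)
{
   obj->handle = bo->gem_handle;
   obj->flags = bo->flags;
}

/* Replaces a test of a separate is_winsys_bo field. */
static bool
sketch_bo_is_written(const struct sketch_bo *bo)
{
   return (bo->flags & SKETCH_EXEC_OBJECT_WRITE) != 0;
}
```

With this layout the device argument no longer has to be threaded through the call chain, at the cost of fixing the flags at BO creation time.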


> > if (relocs != NULL && obj->relocation_count == 0) {
> > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >for (size_t i = 0; i < relocs->num_relocs; i++) {
> >   /* A quick sanity check on relocations */
> >   assert(relocs->relocs[i].offset < bo->size);
> > - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
> > + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL,
> alloc);
> >}
> > }
> >
> > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct 

Re: [Mesa-dev] [PATCH 1/2] nir/constant_expressions: Pull the guts out into a helper block

2017-03-30 Thread Jason Ekstrand
On Thu, Mar 30, 2017 at 4:23 AM, Eduardo Lima Mitev 
wrote:

> Looks good and clearer. Series is:
>
> Reviewed-by: Eduardo Lima Mitev 
>

Thanks!  I've pushed these two and the one for float16 support.


> On 03/14/2017 10:08 PM, Jason Ekstrand wrote:
> > ---
> >  src/compiler/nir/nir_constant_expressions.py | 199
> ++-
> >  1 file changed, 101 insertions(+), 98 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_constant_expressions.py
> b/src/compiler/nir/nir_constant_expressions.py
> > index c6745f1..ad841e3 100644
> > --- a/src/compiler/nir/nir_constant_expressions.py
> > +++ b/src/compiler/nir/nir_constant_expressions.py
> > @@ -266,116 +266,119 @@ struct bool32_vec {
> >  bool w;
> >  };
> >
> > -% for name, op in sorted(opcodes.iteritems()):
> > -static nir_const_value
> > -evaluate_${name}(MAYBE_UNUSED unsigned num_components, unsigned
> bit_size,
> > - MAYBE_UNUSED nir_const_value *_src)
> > -{
> > -   nir_const_value _dst_val = { {0, } };
> > -
> > -   switch (bit_size) {
> > -   % for bit_size in op_bit_sizes(op):
> > -   case ${bit_size}: {
> > -  <%
> > -  output_type = type_add_size(op.output_type, bit_size)
> > -  input_types = [type_add_size(type_, bit_size) for type_ in
> op.input_types]
> > -  %>
> > -
> > -  ## For each non-per-component input, create a variable srcN that
> > -  ## contains x, y, z, and w elements which are filled in with the
> > -  ## appropriately-typed values.
> > -  % for j in range(op.num_inputs):
> > - % if op.input_sizes[j] == 0:
> > -<% continue %>
> > - % elif "src" + str(j) not in op.const_expr:
> > -## Avoid unused variable warnings
> > -<% continue %>
> > - %endif
> > -
> > - const struct ${input_types[j]}_vec src${j} = {
> > - % for k in range(op.input_sizes[j]):
> > -% if input_types[j] == "bool32":
> > -   _src[${j}].u32[${k}] != 0,
> > -% else:
> > -   _src[${j}].${get_const_field(input_types[j])}[${k}],
> > -% endif
> > - % endfor
> > - % for k in range(op.input_sizes[j], 4):
> > -0,
> > - % endfor
> > - };
> > +<%def name="evaluate_op(op, bit_size)">
> > +   <%
> > +   output_type = type_add_size(op.output_type, bit_size)
> > +   input_types = [type_add_size(type_, bit_size) for type_ in
> op.input_types]
> > +   %>
> > +
> > +   ## For each non-per-component input, create a variable srcN that
> > +   ## contains x, y, z, and w elements which are filled in with the
> > +   ## appropriately-typed values.
> > +   % for j in range(op.num_inputs):
> > +  % if op.input_sizes[j] == 0:
> > + <% continue %>
> > +  % elif "src" + str(j) not in op.const_expr:
> > + ## Avoid unused variable warnings
> > + <% continue %>
> > +  %endif
> > +
> > +  const struct ${input_types[j]}_vec src${j} = {
> > +  % for k in range(op.input_sizes[j]):
> > + % if input_types[j] == "bool32":
> > +_src[${j}].u32[${k}] != 0,
> > + % else:
> > +_src[${j}].${get_const_field(input_types[j])}[${k}],
> > + % endif
> >% endfor
> > +  % for k in range(op.input_sizes[j], 4):
> > + 0,
> > +  % endfor
> > +  };
> > +   % endfor
> >
> > -  % if op.output_size == 0:
> > - ## For per-component instructions, we need to iterate over the
> > - ## components and apply the constant expression one component
> > - ## at a time.
> > - for (unsigned _i = 0; _i < num_components; _i++) {
> > -## For each per-component input, create a variable srcN that
> > -## contains the value of the current (_i'th) component.
> > -% for j in range(op.num_inputs):
> > -   % if op.input_sizes[j] != 0:
> > -  <% continue %>
> > -   % elif "src" + str(j) not in op.const_expr:
> > -  ## Avoid unused variable warnings
> > -  <% continue %>
> > -   % elif input_types[j] == "bool32":
> > -  const bool src${j} = _src[${j}].u32[_i] != 0;
> > -   % else:
> > -  const ${input_types[j]}_t src${j} =
> > - _src[${j}].${get_const_field(input_types[j])}[_i];
> > -   % endif
> > -% endfor
> > -
> > -## Create an appropriately-typed variable dst and assign the
> > -## result of the const_expr to it.  If const_expr already
> contains
> > -## writes to dst, just include const_expr directly.
> > -% if "dst" in op.const_expr:
> > -   ${output_type}_t dst;
> > -
> > -   ${op.const_expr}
> > -% else:
> > -   ${output_type}_t dst = ${op.const_expr};
> > -% endif
> > -
> > -## Store 

Re: [Mesa-dev] [PATCH 1/4] anv: Add helpers for converting access flags to pipe bits

2017-03-30 Thread Nanley Chery
On Tue, Mar 14, 2017 at 07:55:50AM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/anv_private.h | 59 
> ++
>  src/intel/vulkan/genX_cmd_buffer.c | 48 ++-
>  2 files changed, 62 insertions(+), 45 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index b11adfd..a0eefe3 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -1156,6 +1156,65 @@ enum anv_pipe_bits {
> ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT | \
> ANV_PIPE_INSTRUCTION_CACHE_INVALIDATE_BIT)
>  
> +static inline enum anv_pipe_bits
> +anv_pipe_flush_bits_for_access_flags(VkAccessFlags flags)
> +{
> +   enum anv_pipe_bits pipe_bits = 0;
> +
> +   unsigned b;
> +   for_each_bit(b, flags) {
> +  switch ((VkAccessFlagBits)(1 << b)) {
> +  case VK_ACCESS_SHADER_WRITE_BIT:
> + pipe_bits |= ANV_PIPE_DATA_CACHE_FLUSH_BIT;
> + break;
> +  case VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT:
> + pipe_bits |= ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
> + break;
> +  case VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT:
> + pipe_bits |= ANV_PIPE_DEPTH_CACHE_FLUSH_BIT;
> + break;
> +  case VK_ACCESS_TRANSFER_WRITE_BIT:
> + pipe_bits |= ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
> + pipe_bits |= ANV_PIPE_DEPTH_CACHE_FLUSH_BIT;
> + break;
> +  default:
> + break; /* Nothing to do */
> +  }
> +   }
> +
> +   return pipe_bits;
> +}
> +
> +static inline enum anv_pipe_bits
> +anv_pipe_invalidate_bits_for_access_flags(VkAccessFlags flags)
> +{
> +   enum anv_pipe_bits pipe_bits = 0;
> +
> +   unsigned b;
> +   for_each_bit(b, flags) {
> +  switch ((VkAccessFlagBits)(1 << b)) {
> +  case VK_ACCESS_INDIRECT_COMMAND_READ_BIT:
> +  case VK_ACCESS_INDEX_READ_BIT:
> +  case VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT:
> + pipe_bits |= ANV_PIPE_VF_CACHE_INVALIDATE_BIT;
> + break;
> +  case VK_ACCESS_UNIFORM_READ_BIT:
> + pipe_bits |= ANV_PIPE_CONSTANT_CACHE_INVALIDATE_BIT;
> + pipe_bits |= ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
> + break;
> +  case VK_ACCESS_SHADER_READ_BIT:
> +  case VK_ACCESS_INPUT_ATTACHMENT_READ_BIT:
> +  case VK_ACCESS_TRANSFER_READ_BIT:
> + pipe_bits |= ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
> + break;
> +  default:
> + break; /* Nothing to do */
> +  }
> +   }
> +
> +   return pipe_bits;
> +}
> +
>  struct anv_vertex_binding {
> struct anv_buffer *  buffer;
> VkDeviceSize offset;
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index a12bd67..acb59d5 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -920,7 +920,6 @@ void genX(CmdPipelineBarrier)(
>  const VkImageMemoryBarrier* pImageMemoryBarriers)
>  {
> ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
> -   uint32_t b;
>  
> /* XXX: Right now, we're really dumb and just flush whatever categories
>  * the app asks for.  One of these days we may make this a bit better
> @@ -951,50 +950,9 @@ void genX(CmdPipelineBarrier)(
>}
> }
>  
> -   enum anv_pipe_bits pipe_bits = 0;
> -
> -   for_each_bit(b, src_flags) {
> -  switch ((VkAccessFlagBits)(1 << b)) {
> -  case VK_ACCESS_SHADER_WRITE_BIT:
> - pipe_bits |= ANV_PIPE_DATA_CACHE_FLUSH_BIT;
> - break;
> -  case VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT:
> - pipe_bits |= ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
> - break;
> -  case VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT:
> - pipe_bits |= ANV_PIPE_DEPTH_CACHE_FLUSH_BIT;
> - break;
> -  case VK_ACCESS_TRANSFER_WRITE_BIT:
> - pipe_bits |= ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
> - pipe_bits |= ANV_PIPE_DEPTH_CACHE_FLUSH_BIT;
> - break;
> -  default:
> - break; /* Nothing to do */
> -  }
> -   }
> -
> -   for_each_bit(b, dst_flags) {
> -  switch ((VkAccessFlagBits)(1 << b)) {
> -  case VK_ACCESS_INDIRECT_COMMAND_READ_BIT:
> -  case VK_ACCESS_INDEX_READ_BIT:
> -  case VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT:
> - pipe_bits |= ANV_PIPE_VF_CACHE_INVALIDATE_BIT;
> - break;
> -  case VK_ACCESS_UNIFORM_READ_BIT:
> - pipe_bits |= ANV_PIPE_CONSTANT_CACHE_INVALIDATE_BIT;
> - pipe_bits |= ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
> - break;
> -  case VK_ACCESS_SHADER_READ_BIT:
> -  case VK_ACCESS_INPUT_ATTACHMENT_READ_BIT:
> -  case VK_ACCESS_TRANSFER_READ_BIT:
> - pipe_bits |= ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
> - break;
> -  default:
> - break; /* Nothing to do */
> -  }
> -   }
> -
> -   

Re: [Mesa-dev] [PATCH v2 06/18] anv: Implement VK_KHX_external_memory

2017-03-30 Thread Chad Versace
On Mon 13 Mar 2017, Jason Ekstrand wrote:
> There's really nothing for us to do here.  So long as the user doesn't
> set any crazy environment variables such as INTEL_VK_HIZ=false, all of
> the compression formats etc. should "just work" at least for opaque
> handle types.

I think the commit message should go with the opaque fd commit. This
patch's commit message should say something like,

  Turn it on. Trivially correct. Don't support any VkExternalMemoryHandleTypes 
yet.

but in real sentences ;)

> ---
>  src/intel/vulkan/anv_device.c   | 6 +-
>  src/intel/vulkan/anv_entrypoints_gen.py | 1 +
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index f92a313..385a806 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -314,7 +314,11 @@ static const VkExtensionProperties device_extensions[] = 
> {
> {
>.extensionName = VK_KHR_DESCRIPTOR_UPDATE_TEMPLATE_EXTENSION_NAME,
>.specVersion = 1,
> -   }
> +   },
> +   {
> +  .extensionName = VK_KHX_EXTERNAL_MEMORY_EXTENSION_NAME,
> +  .specVersion = 1,
> +   },
>  };
>  
>  static void *
> diff --git a/src/intel/vulkan/anv_entrypoints_gen.py 
> b/src/intel/vulkan/anv_entrypoints_gen.py
> index 2c084ae..e8cdfb7 100644
> --- a/src/intel/vulkan/anv_entrypoints_gen.py
> +++ b/src/intel/vulkan/anv_entrypoints_gen.py
> @@ -39,6 +39,7 @@ supported_extensions = [
> 'VK_KHR_wayland_surface',
> 'VK_KHR_xcb_surface',
> 'VK_KHR_xlib_surface',
> +   'VK_KHX_external_memory',
> 'VK_KHX_external_memory_capabilities',
>  ]
>  
> -- 
> 2.5.0.400.gff86faf
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool

2017-03-30 Thread Matt Turner
On Wed, Mar 29, 2017 at 1:07 PM, Lionel Landwerlin wrote:
> This is pretty much the same tool as what i-g-t has, only with a more
> fancy decoding of the instructions/registers. It also doesn't support
> anything before gen4.
>
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/Makefile.tools.am  |  20 +-
>  src/intel/common/gen_decoder.c   |  10 +
>  src/intel/common/gen_decoder.h   |   1 +
>  src/intel/tools/.gitignore   |   1 +
>  src/intel/tools/aubinator_error_decode.c | 783 
> +++
>  5 files changed, 814 insertions(+), 1 deletion(-)
>  create mode 100644 src/intel/tools/aubinator_error_decode.c
>
> diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
> index 245bd03eef..a3a917d50e 100644
> --- a/src/intel/Makefile.tools.am
> +++ b/src/intel/Makefile.tools.am
> @@ -19,7 +19,9 @@
>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
>  # IN THE SOFTWARE.
>
> -noinst_PROGRAMS += tools/aubinator
> +noinst_PROGRAMS += \
> +   tools/aubinator \
> +   tools/aubinator_error_decode
>
>  tools_aubinator_SOURCES = \
> tools/aubinator.c \
> @@ -41,3 +43,19 @@ tools_aubinator_LDADD = \
> $(EXPAT_LIBS) \
> $(ZLIB_LIBS) \
> -lm
> +
> +
> +tools_aubinator_error_decode_SOURCES = \
> +   tools/aubinator_error_decode.c
> +
> +tools_aubinator_error_decode_LDADD = \
> +   common/libintel_common.la \
> +   $(top_builddir)/src/util/libmesautil.la \
> +   $(aubinator_DEPS) \

What is aubinator_DEPS?

> +   $(EXPAT_LIBS) \
> +   $(ZLIB_LIBS)
> +
> +tools_aubinator_error_decode_CFLAGS = \
> +   $(AM_CFLAGS) \
> +   $(EXPAT_CFLAGS) \
> +   $(ZLIB_CFLAGS)
> diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
> index 1c3246f265..3af472caef 100644
> --- a/src/intel/common/gen_decoder.c
> +++ b/src/intel/common/gen_decoder.c
> @@ -112,6 +112,16 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t 
> offset)
> return NULL;
>  }
>
> +struct gen_group *
> +gen_spec_find_register_by_name(struct gen_spec *spec, const char *name)
> +{
> +   for (int i = 0; i < spec->nregisters; i++)
> +  if (strcmp(spec->registers[i]->name, name) == 0)
> + return spec->registers[i];

Use braces in nested control flow.
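
For illustration, the braced form this comment asks for could look like the sketch below. The struct definitions are minimal stand-ins, assumed here only so the snippet compiles on its own; the real types live in gen_decoder.c and have more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced versions of the decoder types. */
struct gen_group {
   const char *name;
};

struct gen_spec {
   int nregisters;
   struct gen_group **registers;
};

struct gen_group *
gen_spec_find_register_by_name(struct gen_spec *spec, const char *name)
{
   /* The outer loop gets braces because its body is itself a
    * control-flow statement.
    */
   for (int i = 0; i < spec->nregisters; i++) {
      if (strcmp(spec->registers[i]->name, name) == 0)
         return spec->registers[i];
   }

   return NULL;
}
```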

> +
> +   return NULL;
> +}
> +
>  struct gen_enum *
>  gen_spec_find_enum(struct gen_spec *spec, const char *name)
>  {
> diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h
> index 1c41de80a4..936b052455 100644
> --- a/src/intel/common/gen_decoder.h
> +++ b/src/intel/common/gen_decoder.h
> @@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct 
> gen_device_info *devinfo,
>  uint32_t gen_spec_get_gen(struct gen_spec *spec);
>  struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const 
> uint32_t *p);
>  struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t 
> offset);
> +struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, 
> const char *name);
>  int gen_group_get_length(struct gen_group *group, const uint32_t *p);
>  const char *gen_group_get_name(struct gen_group *group);
>  uint32_t gen_group_get_opcode(struct gen_group *group);
> diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
> index 0c80a6fed2..27437f9eef 100644
> --- a/src/intel/tools/.gitignore
> +++ b/src/intel/tools/.gitignore
> @@ -1 +1,2 @@
>  /aubinator
> +/aubinator_error_decode
> diff --git a/src/intel/tools/aubinator_error_decode.c 
> b/src/intel/tools/aubinator_error_decode.c
> new file mode 100644
> index 00..a477086cd8
> --- /dev/null
> +++ b/src/intel/tools/aubinator_error_decode.c
> @@ -0,0 +1,783 @@
> +/*
> + * Copyright © 2007-2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE 

Re: [Mesa-dev] [PATCH v2 05/18] anv: Implement VK_KHX_external_memory_capabilities

2017-03-30 Thread Chad Versace
On Mon 13 Mar 2017, Jason Ekstrand wrote:
> From: Chad Versace 
> 
> This is a complete but trivial implementation. It's trivial because we
> support no external memory capabilities yet.  Most of the real work in
> this commit is in reworking the UUIDs advertised by the driver.
> 
> v2 (chadv):
>   - Fix chain traversal in vkGetPhysicalDeviceImageFormatProperties2KHR.
> Extract VkPhysicalDeviceExternalImageFormatInfoKHX from the chain of
> input structs, not the chain of output structs.
>   - In vkGetPhysicalDeviceImageFormatProperties2KHR, iterate over the
> input chain and the output chain separately. Reduces diff in future
> dma_buf patches.
> 
> Co-authored-with: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_device.c   | 53 ---
>  src/intel/vulkan/anv_entrypoints_gen.py |  1 +
>  src/intel/vulkan/anv_formats.c  | 75 
> +
>  src/intel/vulkan/anv_private.h  |  2 +
>  4 files changed, 117 insertions(+), 14 deletions(-)


> +* some bits if ISL info to ensure that this is safe.
  ^^^
Typo: s/if/of/

With that, this patch is
Reviewed-by: Chad Versace 
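
The v2 note about walking the input chain and the output chain separately can be sketched roughly as follows. This is not the anv code: the sketch_* types and enum values are hypothetical stand-ins for Vulkan's sType/pNext header, used so the snippet builds without the Vulkan headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure-type tags (not real VkStructureType values). */
enum sketch_stype {
   SKETCH_STYPE_FORMAT_INFO,
   SKETCH_STYPE_EXTERNAL_FORMAT_INFO,
   SKETCH_STYPE_FORMAT_PROPS,
   SKETCH_STYPE_EXTERNAL_FORMAT_PROPS,
};

/* Hypothetical common header shared by every struct in a chain. */
struct sketch_base {
   enum sketch_stype sType;
   const struct sketch_base *pNext;
};

/* Return the first struct of the wanted type in one chain, or NULL. */
static const struct sketch_base *
sketch_find_in_chain(const struct sketch_base *chain, enum sketch_stype wanted)
{
   for (const struct sketch_base *s = chain; s != NULL; s = s->pNext) {
      if (s->sType == wanted)
         return s;
   }
   return NULL;
}
```

An "info" struct is then looked up only in the input chain and a "properties" struct only in the output chain, so a lookup in the wrong chain simply returns NULL.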


Re: [Mesa-dev] [PATCH 0/7] Aubinator error decode

2017-03-30 Thread Matt Turner
1-3 are

Reviewed-by: Matt Turner 

I think 4 should touch gen75.xml, and I sent a comment.

I cannot find the registers in 5 or 6 in the internal documentation.

I'll review patch 7 separately.


[Mesa-dev] [PATCH 14/14] st/mesa: Use compressed fog mode for atifs.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/state_tracker/st_atifs_to_tgsi.c |  6 +++---
 src/mesa/state_tracker/st_atom_shader.c   | 17 +
 2 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c 
b/src/mesa/state_tracker/st_atifs_to_tgsi.c
index 64879f1b27..90286a1115 100644
--- a/src/mesa/state_tracker/st_atifs_to_tgsi.c
+++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
@@ -705,7 +705,7 @@ transform_inst:
   }
 
   /* compute the 1 component fog factor f */
-  if (ctx->key->fog == 1) {
+  if (ctx->key->fog == FOG_LINEAR) {
  /* LINEAR formula: f = (end - z) / (end - start)
   * with optimized parameters:
   *f = MAD(fogcoord, oparams.x, oparams.y)
@@ -721,7 +721,7 @@ transform_inst:
  SET_SRC(&inst, 1, TGSI_FILE_CONSTANT, MAX_NUM_FRAGMENT_CONSTANTS_ATI, X, X, X, X);
  SET_SRC(&inst, 2, TGSI_FILE_CONSTANT, MAX_NUM_FRAGMENT_CONSTANTS_ATI, Y, Y, Y, Y);
  tctx->emit_instruction(tctx, &inst);
-  } else if (ctx->key->fog == 2) {
+  } else if (ctx->key->fog == FOG_EXP) {
  /* EXP formula: f = exp(-dens * z)
   * with optimized parameters:
   *f = MUL(fogcoord, oparams.z); f= EX2(-f)
@@ -747,7 +747,7 @@ transform_inst:
  SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, W);
  inst.Src[0].Register.Negate = 1;
  tctx->emit_instruction(tctx, &inst);
-  } else if (ctx->key->fog == 3) {
+  } else if (ctx->key->fog == FOG_EXP2) {
  /* EXP2 formula: f = exp(-(dens * z)^2)
   * with optimized parameters:
   *f = MUL(fogcoord, oparams.w); f=MUL(f, f); f= EX2(-f)
diff --git a/src/mesa/state_tracker/st_atom_shader.c 
b/src/mesa/state_tracker/st_atom_shader.c
index f79afe0b1c..ee97c69df3 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -54,19 +54,6 @@
 #include "st_texture.h"
 
 
-/** Compress the fog function enums into a 2-bit value */
-static GLuint
-translate_fog_mode(GLenum mode)
-{
-   switch (mode) {
-   case GL_LINEAR: return 1;
-   case GL_EXP:return 2;
-   case GL_EXP2:   return 3;
-   default:
-  return 0;
-   }
-}
-
 static unsigned
 get_texture_target(struct gl_context *ctx, const unsigned unit)
 {
@@ -132,9 +119,7 @@ update_fp( struct st_context *st )
   _mesa_geometric_samples(st->ctx->DrawBuffer) > 1;
 
if (stfp->ati_fs) {
-  if (st->ctx->Fog.Enabled) {
- key.fog = translate_fog_mode(st->ctx->Fog.Mode);
-  }
+  key.fog = st->ctx->Fog._PackedEnabledMode;
 
   for (unsigned u = 0; u < MAX_NUM_FRAGMENT_REGISTERS_ATI; u++) {
  key.texture_targets[u] = get_texture_target(st->ctx, u);
-- 
2.12.1



[Mesa-dev] [PATCH 11/14] mesa/main: Maintain compressed TexEnv Combine state.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/mtypes.h   |  83 ++
 src/mesa/main/texstate.c | 103 +++
 2 files changed, 186 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 186d79928c..369d2327f2 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1078,6 +1078,87 @@ struct gl_tex_env_combine_state
 };
 
 
+/** Compressed TexEnv effective Combine mode */
+enum gl_tex_env_mode
+{
+   TEXENV_MODE_REPLACE, /* r = a0 */
+   TEXENV_MODE_MODULATE,/* r = a0 * a1 */
+   TEXENV_MODE_ADD, /* r = a0 + a1 */
+   TEXENV_MODE_ADD_SIGNED,  /* r = a0 + a1 - 0.5 */
+   TEXENV_MODE_INTERPOLATE, /* r = a0 * a2 + a1 * (1 - a2) */
+   TEXENV_MODE_SUBTRACT,/* r = a0 - a1 */
+   TEXENV_MODE_DOT3_RGB,/* r = a0 . a1 */
+   TEXENV_MODE_DOT3_RGB_EXT,/* r = a0 . a1 */
+   TEXENV_MODE_DOT3_RGBA,   /* r = a0 . a1 */
+   TEXENV_MODE_DOT3_RGBA_EXT,   /* r = a0 . a1 */
+   TEXENV_MODE_MODULATE_ADD_ATI,/* r = a0 * a2 + a1 */
+   TEXENV_MODE_MODULATE_SIGNED_ADD_ATI, /* r = a0 * a2 + a1 - 0.5 */
+   TEXENV_MODE_MODULATE_SUBTRACT_ATI,   /* r = a0 * a2 - a1 */
+   TEXENV_MODE_ADD_PRODUCTS_NV, /* r = a0 * a1 + a2 * a3 */
+   TEXENV_MODE_ADD_PRODUCTS_SIGNED_NV,  /* r = a0 * a1 + a2 * a3 - 0.5 */
+};
+
+
+/** Compressed TexEnv Combine source */
+enum gl_tex_env_source
+{
+   TEXENV_SRC_TEXTURE0,
+   TEXENV_SRC_TEXTURE1,
+   TEXENV_SRC_TEXTURE2,
+   TEXENV_SRC_TEXTURE3,
+   TEXENV_SRC_TEXTURE4,
+   TEXENV_SRC_TEXTURE5,
+   TEXENV_SRC_TEXTURE6,
+   TEXENV_SRC_TEXTURE7,
+   TEXENV_SRC_TEXTURE,
+   TEXENV_SRC_PREVIOUS,
+   TEXENV_SRC_PRIMARY_COLOR,
+   TEXENV_SRC_CONSTANT,
+   TEXENV_SRC_ZERO,
+   TEXENV_SRC_ONE,
+};
+
+
+/** Compressed TexEnv Combine operand */
+enum gl_tex_env_operand
+{
+   TEXENV_OPR_COLOR,
+   TEXENV_OPR_ONE_MINUS_COLOR,
+   TEXENV_OPR_ALPHA,
+   TEXENV_OPR_ONE_MINUS_ALPHA,
+};
+
+
+/** Compressed TexEnv Combine argument */
+struct gl_tex_env_argument
+{
+#ifdef __GNUC__
+   __extension__ uint8_t Source:4;  /**< TEXENV_SRC_x */
+   __extension__ uint8_t Operand:2; /**< TEXENV_OPR_x */
+#else
+   uint8_t Source;  /**< SRC_x */
+   uint8_t Operand; /**< OPR_x */
+#endif
+};
+
+
+/***
+ * Compressed TexEnv Combine state.
+ */
+struct gl_tex_env_combine_packed
+{
+   uint32_t ModeRGB:4;  /**< Effective mode for RGB as 4 bits */
+   uint32_t ModeA:4;/**< Effective mode for A as 4 bits */
+   uint32_t ScaleShiftRGB:2;/**< 0, 1 or 2 */
+   uint32_t ScaleShiftA:2;  /**< 0, 1 or 2 */
+   uint32_t NumArgsRGB:3;   /**< Number of inputs used for the RGB combiner */
+   uint32_t NumArgsA:3; /**< Number of inputs used for the A combiner */
+   /** Source arguments in a packed manner */
+   struct gl_tex_env_argument ArgsRGB[MAX_COMBINER_TERMS];
+   struct gl_tex_env_argument ArgsA[MAX_COMBINER_TERMS];
+};
+
+
 /**
  * TexGenEnabled flags.
  */
@@ -1180,6 +1261,8 @@ struct gl_texture_unit
/** Points to highest priority, complete and enabled texture object */
struct gl_texture_object *_Current;
 
+   /** Current compressed TexEnv & Combine state */
+   struct gl_tex_env_combine_packed _CurrentCombinePacked;
 };
 
 
diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c
index ada0dfdb66..71757dc154 100644
--- a/src/mesa/main/texstate.c
+++ b/src/mesa/main/texstate.c
@@ -376,6 +376,107 @@ _mesa_update_texture_matrices(struct gl_context *ctx)
 
 
 /**
+ * Translate GL combiner state into a MODE_x value
+ */
+static uint32_t
+tex_combine_translate_mode(GLenum envMode, GLenum mode)
+{
+   switch (mode) {
+   case GL_REPLACE: return TEXENV_MODE_REPLACE;
+   case GL_MODULATE: return TEXENV_MODE_MODULATE;
+   case GL_ADD:
+  if (envMode == GL_COMBINE4_NV)
+return TEXENV_MODE_ADD_PRODUCTS_NV;
+  else
+return TEXENV_MODE_ADD;
+   case GL_ADD_SIGNED:
+  if (envMode == GL_COMBINE4_NV)
+return TEXENV_MODE_ADD_PRODUCTS_SIGNED_NV;
+  else
+return TEXENV_MODE_ADD_SIGNED;
+   case GL_INTERPOLATE: return TEXENV_MODE_INTERPOLATE;
+   case GL_SUBTRACT: return TEXENV_MODE_SUBTRACT;
+   case GL_DOT3_RGB: return TEXENV_MODE_DOT3_RGB;
+   case GL_DOT3_RGB_EXT: return TEXENV_MODE_DOT3_RGB_EXT;
+   case GL_DOT3_RGBA: return TEXENV_MODE_DOT3_RGBA;
+   case GL_DOT3_RGBA_EXT: return TEXENV_MODE_DOT3_RGBA_EXT;
+   case GL_MODULATE_ADD_ATI: return TEXENV_MODE_MODULATE_ADD_ATI;
+   case GL_MODULATE_SIGNED_ADD_ATI: return TEXENV_MODE_MODULATE_SIGNED_ADD_ATI;
+   case GL_MODULATE_SUBTRACT_ATI: return TEXENV_MODE_MODULATE_SUBTRACT_ATI;
+   default:
+  unreachable("Invalid TexEnv Combine mode");
+   }
+}
+
+
+static uint8_t
+tex_combine_translate_source(GLenum src)
+{
+   switch (src) {
+   case GL_TEXTURE0:
+   case GL_TEXTURE1:
+   case GL_TEXTURE2:
+   
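
The argument packing that struct gl_tex_env_argument expresses with bit-fields (a 4-bit source and a 2-bit operand) can also be sketched with explicit shifts and masks. The enum above lists 14 sources and 4 operands, so both fit their fields. The sketch_* helpers below are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

enum {
   SKETCH_SRC_BITS = 4, /* room for 16 sources; the enum defines 14 */
   SKETCH_OPR_BITS = 2, /* room for 4 operands; the enum defines 4 */
};

/* Pack a source index and an operand index into one byte. */
static uint8_t
sketch_pack_arg(unsigned source, unsigned operand)
{
   assert(source < (1u << SKETCH_SRC_BITS));
   assert(operand < (1u << SKETCH_OPR_BITS));
   return (uint8_t)(source | (operand << SKETCH_SRC_BITS));
}

/* Extract the low 4 bits: the source index. */
static unsigned
sketch_arg_source(uint8_t arg)
{
   return arg & ((1u << SKETCH_SRC_BITS) - 1);
}

/* Extract the next 2 bits: the operand index. */
static unsigned
sketch_arg_operand(uint8_t arg)
{
   return (arg >> SKETCH_SRC_BITS) & ((1u << SKETCH_OPR_BITS) - 1);
}
```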

[Mesa-dev] [PATCH 13/14] mesa/main/ff_frag: Use compressed TexEnv Combine state.

2017-03-30 Thread Gustaw Smolarczyk
Along the way, add the missing GL_ONE source support and drop the
nonexistent GL_ZERO and GL_ONE operand support.

Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 335 +++
 1 file changed, 104 insertions(+), 231 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index bdbefc7880..aac9de78ca 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -81,16 +81,6 @@ texenv_doing_secondary_color(struct gl_context *ctx)
return GL_FALSE;
 }
 
-struct mode_opt {
-#ifdef __GNUC__
-   __extension__ GLubyte Source:4;  /**< SRC_x */
-   __extension__ GLubyte Operand:3; /**< OPR_x */
-#else
-   GLubyte Source;  /**< SRC_x */
-   GLubyte Operand; /**< OPR_x */
-#endif
-};
-
 struct state_key {
GLuint nr_enabled_units:4;
GLuint separate_specular:1;
@@ -103,131 +93,23 @@ struct state_key {
   GLuint enabled:1;
   GLuint source_index:4;   /**< TEXTURE_x_INDEX */
   GLuint shadow:1;
+
+  /***
+   * These are taken from struct gl_tex_env_combine_packed
+   * @{
+   */
+  GLuint ModeRGB:4;
+  GLuint ModeA:4;
   GLuint ScaleShiftRGB:2;
   GLuint ScaleShiftA:2;
-
-  GLuint NumArgsRGB:3;  /**< up to MAX_COMBINER_TERMS */
-  GLuint ModeRGB:5; /**< MODE_x */
-
-  GLuint NumArgsA:3;  /**< up to MAX_COMBINER_TERMS */
-  GLuint ModeA:5; /**< MODE_x */
-
-  struct mode_opt OptRGB[MAX_COMBINER_TERMS];
-  struct mode_opt OptA[MAX_COMBINER_TERMS];
+  GLuint NumArgsRGB:3;
+  GLuint NumArgsA:3;
+  struct gl_tex_env_argument ArgsRGB[MAX_COMBINER_TERMS];
+  struct gl_tex_env_argument ArgsA[MAX_COMBINER_TERMS];
+  /** @} */
} unit[MAX_TEXTURE_COORD_UNITS];
 };
 
-#define OPR_SRC_COLOR   0
-#define OPR_ONE_MINUS_SRC_COLOR 1
-#define OPR_SRC_ALPHA   2
-#define OPR_ONE_MINUS_SRC_ALPHA3
-#define OPR_ZERO4
-#define OPR_ONE 5
-#define OPR_UNKNOWN 7
-
-static GLuint translate_operand( GLenum operand )
-{
-   switch (operand) {
-   case GL_SRC_COLOR: return OPR_SRC_COLOR;
-   case GL_ONE_MINUS_SRC_COLOR: return OPR_ONE_MINUS_SRC_COLOR;
-   case GL_SRC_ALPHA: return OPR_SRC_ALPHA;
-   case GL_ONE_MINUS_SRC_ALPHA: return OPR_ONE_MINUS_SRC_ALPHA;
-   case GL_ZERO: return OPR_ZERO;
-   case GL_ONE: return OPR_ONE;
-   default:
-  assert(0);
-  return OPR_UNKNOWN;
-   }
-}
-
-#define SRC_TEXTURE  0
-#define SRC_TEXTURE0 1
-#define SRC_TEXTURE1 2
-#define SRC_TEXTURE2 3
-#define SRC_TEXTURE3 4
-#define SRC_TEXTURE4 5
-#define SRC_TEXTURE5 6
-#define SRC_TEXTURE6 7
-#define SRC_TEXTURE7 8
-#define SRC_CONSTANT 9
-#define SRC_PRIMARY_COLOR 10
-#define SRC_PREVIOUS 11
-#define SRC_ZERO 12
-#define SRC_UNKNOWN  15
-
-static GLuint translate_source( GLenum src )
-{
-   switch (src) {
-   case GL_TEXTURE: return SRC_TEXTURE;
-   case GL_TEXTURE0:
-   case GL_TEXTURE1:
-   case GL_TEXTURE2:
-   case GL_TEXTURE3:
-   case GL_TEXTURE4:
-   case GL_TEXTURE5:
-   case GL_TEXTURE6:
-   case GL_TEXTURE7: return SRC_TEXTURE0 + (src - GL_TEXTURE0);
-   case GL_CONSTANT: return SRC_CONSTANT;
-   case GL_PRIMARY_COLOR: return SRC_PRIMARY_COLOR;
-   case GL_PREVIOUS: return SRC_PREVIOUS;
-   case GL_ZERO:
-  return SRC_ZERO;
-   default:
-  assert(0);
-  return SRC_UNKNOWN;
-   }
-}
-
-#define MODE_REPLACE 0  /* r = a0 */
-#define MODE_MODULATE1  /* r = a0 * a1 */
-#define MODE_ADD 2  /* r = a0 + a1 */
-#define MODE_ADD_SIGNED  3  /* r = a0 + a1 - 0.5 */
-#define MODE_INTERPOLATE 4  /* r = a0 * a2 + a1 * (1 - a2) */
-#define MODE_SUBTRACT5  /* r = a0 - a1 */
-#define MODE_DOT3_RGB6  /* r = a0 . a1 */
-#define MODE_DOT3_RGB_EXT7  /* r = a0 . a1 */
-#define MODE_DOT3_RGBA   8  /* r = a0 . a1 */
-#define MODE_DOT3_RGBA_EXT   9  /* r = a0 . a1 */
-#define MODE_MODULATE_ADD_ATI   10  /* r = a0 * a2 + a1 */
-#define MODE_MODULATE_SIGNED_ADD_ATI11  /* r = a0 * a2 + a1 - 0.5 */
-#define MODE_MODULATE_SUBTRACT_ATI  12  /* r = a0 * a2 - a1 */
-#define MODE_ADD_PRODUCTS   13  /* r = a0 * a1 + a2 * a3 */
-#define MODE_ADD_PRODUCTS_SIGNED14  /* r = a0 * a1 + a2 * a3 - 0.5 */
-#define MODE_UNKNOWN16
-
-/**
- * Translate GL combiner state into a MODE_x value
- */
-static GLuint translate_mode( GLenum envMode, GLenum mode )
-{
-   switch (mode) {
-   case GL_REPLACE: return MODE_REPLACE;
-   case GL_MODULATE: return MODE_MODULATE;
-   case GL_ADD:
-  if (envMode == GL_COMBINE4_NV)
- return MODE_ADD_PRODUCTS;
-  else
- return MODE_ADD;
-   case GL_ADD_SIGNED:
-  if (envMode == GL_COMBINE4_NV)
- return MODE_ADD_PRODUCTS_SIGNED;
-  else
- return 

[Mesa-dev] [PATCH 09/14] mesa/main/ff_frag: Don't retrieve format if not necessary.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 2b4d99c879..e1fe9b58c0 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -402,24 +402,21 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
   const struct gl_texture_unit *texUnit = &ctx->Texture.Unit[i];
   const struct gl_texture_object *texObj = texUnit->_Current;
   const struct gl_tex_env_combine_state *comb = texUnit->_CurrentCombine;
-  const struct gl_sampler_object *samp;
-  GLenum format;
 
   if (!texObj)
  continue;
 
-  samp = _mesa_get_samplerobj(ctx, i);
-  format = _mesa_texture_base_format(texObj);
-
   key->unit[i].enabled = 1;
   inputs_referenced |= VARYING_BIT_TEX(i);
 
   key->unit[i].source_index = texObj->TargetIndex;
 
-  key->unit[i].shadow =
- ((samp->CompareMode == GL_COMPARE_R_TO_TEXTURE) &&
-  ((format == GL_DEPTH_COMPONENT) || 
-   (format == GL_DEPTH_STENCIL_EXT)));
+  const struct gl_sampler_object *samp = _mesa_get_samplerobj(ctx, i);
+  if (samp->CompareMode == GL_COMPARE_R_TO_TEXTURE) {
+ const GLenum format = _mesa_texture_base_format(texObj);
+ key->unit[i].shadow = (format == GL_DEPTH_COMPONENT ||
+   format == GL_DEPTH_STENCIL_EXT);
+  }
 
   key->unit[i].NumArgsRGB = comb->_NumArgsRGB;
   key->unit[i].NumArgsA = comb->_NumArgsA;
-- 
2.12.1
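
The idea of this change, querying the base format only when the compare mode makes it relevant, can be sketched in isolation. All sketch_* names and enum values are hypothetical stand-ins for the Mesa types, and the sketch assumes, as the patch appears to, that the shadow flag starts out zeroed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL enums involved. */
enum { SKETCH_COMPARE_NONE, SKETCH_COMPARE_R_TO_TEXTURE };
enum {
   SKETCH_FORMAT_RGBA,
   SKETCH_FORMAT_DEPTH_COMPONENT,
   SKETCH_FORMAT_DEPTH_STENCIL,
};

struct sketch_texture {
   int base_format;
};

/* Counts how often the (potentially costly) format query runs. */
static int sketch_format_queries;

static int
sketch_base_format(const struct sketch_texture *tex)
{
   sketch_format_queries++;
   return tex->base_format;
}

/* Only look up the format once the cheap compare-mode test has passed. */
static bool
sketch_is_shadow(int compare_mode, const struct sketch_texture *tex)
{
   if (compare_mode != SKETCH_COMPARE_R_TO_TEXTURE)
      return false;

   const int format = sketch_base_format(tex);
   return format == SKETCH_FORMAT_DEPTH_COMPONENT ||
          format == SKETCH_FORMAT_DEPTH_STENCIL;
}
```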



[Mesa-dev] [PATCH 05/14] mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.

2017-03-30 Thread Gustaw Smolarczyk
It's not used.

Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 95c74e2b92..05641997de 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -449,10 +449,8 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
}
 
/* _NEW_FOG */
-   if (ctx->Fog.Enabled) {
+   if (ctx->Fog.Enabled)
   key->fog_mode = translate_fog_mode(ctx->Fog.Mode);
-  inputs_referenced |= VARYING_BIT_FOGC; /* maybe */
-   }
 
/* _NEW_BUFFERS */
key->num_draw_buffers = ctx->DrawBuffer->_NumColorDrawBuffers;
-- 
2.12.1



[Mesa-dev] [PATCH 12/14] mesa/main/ff_frag: Use compressed fog mode.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 18 +-
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index e1fe9b58c0..bdbefc7880 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -117,21 +117,6 @@ struct state_key {
} unit[MAX_TEXTURE_COORD_UNITS];
 };
 
-#define FOG_NONE0
-#define FOG_LINEAR  1
-#define FOG_EXP 2
-#define FOG_EXP23
-
-static GLuint translate_fog_mode( GLenum mode )
-{
-   switch (mode) {
-   case GL_LINEAR: return FOG_LINEAR;
-   case GL_EXP: return FOG_EXP;
-   case GL_EXP2: return FOG_EXP2;
-   default: return FOG_NONE;
-   }
-}
-
 #define OPR_SRC_COLOR   0
 #define OPR_ONE_MINUS_SRC_COLOR 1
 #define OPR_SRC_ALPHA   2
@@ -446,8 +431,7 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
}
 
/* _NEW_FOG */
-   if (ctx->Fog.Enabled)
-  key->fog_mode = translate_fog_mode(ctx->Fog.Mode);
+   key->fog_mode = ctx->Fog._PackedEnabledMode;
 
/* _NEW_BUFFERS */
key->num_draw_buffers = ctx->DrawBuffer->_NumColorDrawBuffers;
-- 
2.12.1



[Mesa-dev] [PATCH 07/14] mesa/main/ff_frag: Store nr_enabled_units only once.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 1b76f40e9a..717f39e9d3 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -396,8 +396,9 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
 
/* _NEW_TEXTURE_OBJECT */
mask = ctx->Texture._EnabledCoordUnits;
+   int i = -1;
while (mask) {
-  const int i = u_bit_scan(&mask);
+  i = u_bit_scan(&mask);
   const struct gl_texture_unit *texUnit = &ctx->Texture.Unit[i];
   const struct gl_texture_object *texObj = texUnit->_Current;
   const struct gl_tex_env_combine_state *comb = texUnit->_CurrentCombine;
@@ -411,7 +412,6 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
   format = _mesa_texture_base_format(texObj);
 
   key->unit[i].enabled = 1;
-  key->nr_enabled_units = i + 1;
   inputs_referenced |= VARYING_BIT_TEX(i);
 
   key->unit[i].source_index = _mesa_tex_target_to_index(ctx,
@@ -441,6 +441,8 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
   }
}
 
+   key->nr_enabled_units = i + 1;
+
/* _NEW_LIGHT | _NEW_FOG */
if (texenv_doing_secondary_color(ctx)) {
   key->separate_specular = 1;
-- 
2.12.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/14] mesa/main: Maintain compressed fog mode.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/enable.c |  1 +
 src/mesa/main/fog.c|  9 +
 src/mesa/main/mtypes.h | 14 ++
 3 files changed, 24 insertions(+)

diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index d9d63a6b4b..ef278a318a 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -385,6 +385,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
GLboolean state)
 return;
  FLUSH_VERTICES(ctx, _NEW_FOG);
  ctx->Fog.Enabled = state;
+ ctx->Fog._PackedEnabledMode = state ? ctx->Fog._PackedMode : FOG_NONE;
  break;
   case GL_LIGHT0:
   case GL_LIGHT1:
diff --git a/src/mesa/main/fog.c b/src/mesa/main/fog.c
index 1ad939cfde..76e65080b7 100644
--- a/src/mesa/main/fog.c
+++ b/src/mesa/main/fog.c
@@ -102,8 +102,13 @@ _mesa_Fogfv( GLenum pname, const GLfloat *params )
  m = (GLenum) (GLint) *params;
 switch (m) {
 case GL_LINEAR:
+   ctx->Fog._PackedMode = FOG_LINEAR;
+   break;
 case GL_EXP:
+   ctx->Fog._PackedMode = FOG_EXP;
+   break;
 case GL_EXP2:
+   ctx->Fog._PackedMode = FOG_EXP2;
break;
 default:
_mesa_error( ctx, GL_INVALID_ENUM, "glFog" );
@@ -113,6 +118,8 @@ _mesa_Fogfv( GLenum pname, const GLfloat *params )
return;
 FLUSH_VERTICES(ctx, _NEW_FOG);
 ctx->Fog.Mode = m;
+ctx->Fog._PackedEnabledMode = ctx->Fog.Enabled ?
+  ctx->Fog._PackedMode : FOG_NONE;
 break;
   case GL_FOG_DENSITY:
 if (*params<0.0F) {
@@ -210,6 +217,8 @@ void _mesa_init_fog( struct gl_context * ctx )
/* Fog group */
ctx->Fog.Enabled = GL_FALSE;
ctx->Fog.Mode = GL_EXP;
+   ctx->Fog._PackedMode = FOG_EXP;
+   ctx->Fog._PackedEnabledMode = FOG_NONE;
ASSIGN_4V( ctx->Fog.Color, 0.0, 0.0, 0.0, 0.0 );
ASSIGN_4V( ctx->Fog.ColorUnclamped, 0.0, 0.0, 0.0, 0.0 );
ctx->Fog.Index = 0.0;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 91e1948c23..186d79928c 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -575,12 +575,26 @@ struct gl_eval_attrib
 
 
 /**
+ * Compressed fog mode.
+ */
+enum gl_fog_mode
+{
+   FOG_NONE,
+   FOG_LINEAR,
+   FOG_EXP,
+   FOG_EXP2,
+};
+
+
+/**
  * Fog attribute group (GL_FOG_BIT).
  */
 struct gl_fog_attrib
 {
GLboolean Enabled;  /**< Fog enabled flag */
GLboolean ColorSumEnabled;
+   uint8_t _PackedMode;/**< Fog mode as 2 bits */
+   uint8_t _PackedEnabledMode; /**< Masked _PackedMode */
GLfloat ColorUnclamped[4];/**< Fog color */
GLfloat Color[4];   /**< Fog color */
GLfloat Density;/**< Density >= 0.0 */
-- 
2.12.1
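
The bookkeeping this patch adds can be summarized in a standalone sketch: the packed mode tracks the last fog mode set, and the "enabled" variant is that mode when fog is on and FOG_NONE otherwise. The sketch_* names are hypothetical; only the update rule is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_fog_mode {
   SKETCH_FOG_NONE,
   SKETCH_FOG_LINEAR,
   SKETCH_FOG_EXP,
   SKETCH_FOG_EXP2,
};

struct sketch_fog_state {
   bool enabled;
   uint8_t packed_mode;         /* last mode set, enabled or not */
   uint8_t packed_enabled_mode; /* packed_mode, or NONE when disabled */
};

/* Recompute the derived field after either input changes. */
static void
sketch_fog_refresh(struct sketch_fog_state *fog)
{
   fog->packed_enabled_mode = fog->enabled ? fog->packed_mode
                                           : SKETCH_FOG_NONE;
}

/* Mirrors the defaults in the patch: disabled, mode EXP. */
static void
sketch_fog_init(struct sketch_fog_state *fog)
{
   fog->enabled = false;
   fog->packed_mode = SKETCH_FOG_EXP;
   sketch_fog_refresh(fog);
}

static void
sketch_fog_set_enabled(struct sketch_fog_state *fog, bool enabled)
{
   fog->enabled = enabled;
   sketch_fog_refresh(fog);
}

static void
sketch_fog_set_mode(struct sketch_fog_state *fog, enum sketch_fog_mode mode)
{
   fog->packed_mode = (uint8_t)mode;
   sketch_fog_refresh(fog);
}
```

Consumers such as the state key can then copy packed_enabled_mode directly instead of testing the enable flag and translating the GL enum each time.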



[Mesa-dev] [PATCH 08/14] mesa/main/ff_frag: Use gl_texture_object::TargetIndex.

2017-03-30 Thread Gustaw Smolarczyk
Instead of computing it once again using _mesa_tex_target_to_index.

Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 717f39e9d3..2b4d99c879 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -414,8 +414,7 @@ static GLuint make_state_key( struct gl_context *ctx,  
struct state_key *key )
   key->unit[i].enabled = 1;
   inputs_referenced |= VARYING_BIT_TEX(i);
 
-  key->unit[i].source_index = _mesa_tex_target_to_index(ctx,
-texObj->Target);
+  key->unit[i].source_index = texObj->TargetIndex;
 
   key->unit[i].shadow =
  ((samp->CompareMode == GL_COMPARE_R_TO_TEXTURE) &&
-- 
2.12.1



[Mesa-dev] [PATCH 06/14] mesa/main/ff_frag: Simplify get_fp_input_mask.

2017-03-30 Thread Gustaw Smolarczyk
Change it into a filter_fp_input_mask transform function that, instead of
returning a mask, filters the input mask passed to it.

Also, simplify the case of vertex program handling by assuming that
fp_inputs is always a combination of VARYING_BIT_COL* and VARYING_BIT_TEX*.
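
The shape of the change can be sketched as a function that takes the referenced inputs and returns them narrowed by whatever the vertex stage can actually produce. The sketch_* types and bit names are hypothetical and much reduced; the real function derives the possible inputs from GL context state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical varying bits. */
#define SKETCH_BIT_COL0 (1u << 0)
#define SKETCH_BIT_COL1 (1u << 1)
#define SKETCH_BIT_TEX0 (1u << 2)
#define SKETCH_BIT_TEX1 (1u << 3)

/* Hypothetical, reduced view of the state the filter consults. */
struct sketch_ctx {
   bool overridden;          /* vertex stage unknown: keep everything */
   bool feedback;            /* feedback mode: only COL0 and TEX0 */
   uint32_t possible_inputs; /* what the vertex stage may produce */
};

static uint32_t
sketch_filter_inputs(uint32_t fp_inputs, const struct sketch_ctx *ctx)
{
   if (ctx->overridden)
      return fp_inputs;

   if (ctx->feedback)
      return fp_inputs & (SKETCH_BIT_COL0 | SKETCH_BIT_TEX0);

   return fp_inputs & ctx->possible_inputs;
}
```

Each early return replaces one branch of the old if/else chain, and the caller no longer has to AND the result with its own mask.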

Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 111 +--
 1 file changed, 55 insertions(+), 56 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index 05641997de..1b76f40e9a 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -284,30 +284,33 @@ need_saturate( GLuint mode )
  * constants instead.
  *
  * This function figures out all the inputs that the fragment program
- * has access to.  The bitmask is later reduced to just those which
- * are actually referenced.
+ * has access to and filters input bitmask.
  */
-static GLbitfield get_fp_input_mask( struct gl_context *ctx )
+static GLbitfield filter_fp_input_mask( GLbitfield fp_inputs,
+   struct gl_context *ctx )
 {
-   /* _NEW_PROGRAM */
-   const GLboolean vertexShader =
-  ctx->_Shader->CurrentProgram[MESA_SHADER_VERTEX] != NULL;
-   const GLboolean vertexProgram = ctx->VertexProgram._Enabled;
-   GLbitfield fp_inputs = 0x0;
-
if (ctx->VertexProgram._Overriden) {
   /* Somebody's messing with the vertex program and we don't have
* a clue what's happening.  Assume that it could be producing
* all possible outputs.
*/
-  fp_inputs = ~0;
+  return fp_inputs;
}
-   else if (ctx->RenderMode == GL_FEEDBACK) {
+
+   if (ctx->RenderMode == GL_FEEDBACK) {
   /* _NEW_RENDERMODE */
-  fp_inputs = (VARYING_BIT_COL0 | VARYING_BIT_TEX0);
+  return fp_inputs & (VARYING_BIT_COL0 | VARYING_BIT_TEX0);
}
-   else if (!(vertexProgram || vertexShader)) {
+
+   /* _NEW_PROGRAM */
+   const GLboolean vertexShader =
+ ctx->_Shader->CurrentProgram[MESA_SHADER_VERTEX] != NULL;
+   const GLboolean vertexProgram = ctx->VertexProgram._Enabled;
+
+   if (!(vertexProgram || vertexShader)) {
   /* Fixed function vertex logic */
+  GLbitfield possible_inputs = 0;
+
   /* _NEW_VARYING_VP_INPUTS */
   GLbitfield64 varying_inputs = ctx->varying_vp_inputs;
 
@@ -315,69 +318,66 @@ static GLbitfield get_fp_input_mask( struct gl_context 
*ctx )
* vertex program:
*/
   /* _NEW_POINT */
-  if (ctx->Point.PointSprite)
- varying_inputs |= VARYING_BITS_TEX_ANY;
+  if (ctx->Point.PointSprite) {
+ /* All texture varyings are possible to use */
+ possible_inputs = VARYING_BITS_TEX_ANY;
+  }
+  else {
+ /* _NEW_TEXTURE_STATE */
+ const GLbitfield possible_tex_inputs =
+   ctx->Texture._TexGenEnabled |
+   ctx->Texture._TexMatEnabled |
+   ((varying_inputs & VERT_BIT_TEX_ANY) >> VERT_ATTRIB_TEX0);
+
+ possible_inputs = (possible_tex_inputs << VARYING_SLOT_TEX0);
+  }
 
   /* First look at what values may be computed by the generated
* vertex program:
*/
   /* _NEW_LIGHT */
   if (ctx->Light.Enabled) {
- fp_inputs |= VARYING_BIT_COL0;
+ possible_inputs |= VARYING_BIT_COL0;
 
  if (texenv_doing_secondary_color(ctx))
-fp_inputs |= VARYING_BIT_COL1;
+possible_inputs |= VARYING_BIT_COL1;
   }
 
-  /* _NEW_TEXTURE_STATE */
-  fp_inputs |= (ctx->Texture._TexGenEnabled |
-ctx->Texture._TexMatEnabled) << VARYING_SLOT_TEX0;
-
   /* Then look at what might be varying as a result of enabled
* arrays, etc:
*/
   if (varying_inputs & VERT_BIT_COLOR0)
- fp_inputs |= VARYING_BIT_COL0;
+ possible_inputs |= VARYING_BIT_COL0;
   if (varying_inputs & VERT_BIT_COLOR1)
- fp_inputs |= VARYING_BIT_COL1;
-
-  fp_inputs |= (((varying_inputs & VERT_BIT_TEX_ANY) >> VERT_ATTRIB_TEX0) 
-<< VARYING_SLOT_TEX0);
+ possible_inputs |= VARYING_BIT_COL1;
 
+  return fp_inputs & possible_inputs;
}
-   else {
-  /* calculate from vp->outputs */
-  struct gl_program *vprog;
-  GLbitfield64 vp_outputs;
-
-  /* Choose GLSL vertex shader over ARB vertex program.  Need this
-   * since vertex shader state validation comes after fragment state
-   * validation (see additional comments in state.c).
-   */
-  if (vertexShader)
- vprog = ctx->_Shader->CurrentProgram[MESA_SHADER_VERTEX];
-  else
- vprog = ctx->VertexProgram.Current;
 
-  vp_outputs = vprog->info.outputs_written;
+   /* calculate from vp->outputs */
+   struct gl_program *vprog;
 
-  /* These get generated in the setup routine regardless of the
-   * vertex program:
-   */
-  /* _NEW_POINT */
-  if (ctx->Point.PointSprite)
- vp_outputs |= 

[Mesa-dev] [PATCH 01/14] mesa/main/ff_frag: Use correct constant.

2017-03-30 Thread Gustaw Smolarczyk
Since fixed-function shaders are restricted to MAX_TEXTURE_COORD_UNITS
texture units, use this constant instead of MAX_TEXTURE_UNITS. This
reduces the array size from 32 to 8.

Signed-off-by: Gustaw Smolarczyk 
Reviewed-by: Eric Anholt 
---
 src/mesa/main/ff_fragment_shader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp b/src/mesa/main/ff_fragment_shader.cpp
index f007ac3b40..7679328d4c 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -123,7 +123,7 @@ struct state_key {
 
   struct mode_opt OptRGB[MAX_COMBINER_TERMS];
   struct mode_opt OptA[MAX_COMBINER_TERMS];
-   } unit[MAX_TEXTURE_UNITS];
+   } unit[MAX_TEXTURE_COORD_UNITS];
 };
 
 #define FOG_NONE 0
-- 
2.12.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
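
To make the size reduction described in this patch concrete, here is a
minimal sketch. The bounds 32 and 8 come from the commit message; the DEMO_
names and the per-unit payload are hypothetical stand-ins, not Mesa's actual
definitions, so only the ratio carries over, not the byte counts.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the commit message; the DEMO_ names are hypothetical. */
#define DEMO_MAX_TEXTURE_UNITS        32
#define DEMO_MAX_TEXTURE_COORD_UNITS   8

/* Hypothetical per-unit payload standing in for the real per-unit key. */
struct demo_unit {
   unsigned enabled:1;
   unsigned source_index:4;
};

struct demo_key_old { struct demo_unit unit[DEMO_MAX_TEXTURE_UNITS]; };
struct demo_key_new { struct demo_unit unit[DEMO_MAX_TEXTURE_COORD_UNITS]; };

/* Bytes saved per key when the array is sized for fixed-function units only. */
static size_t demo_bytes_saved(void)
{
   assert(sizeof(struct demo_key_old) >= sizeof(struct demo_key_new));
   return sizeof(struct demo_key_old) - sizeof(struct demo_key_new);
}
```

With these stand-ins, the key drops 24 of its 32 per-unit entries, which
presumably means less data to fill in and compare on each cache lookup.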


[Mesa-dev] [PATCH 04/14] mesa/main/ff_frag: Remove unused struct.

2017-03-30 Thread Gustaw Smolarczyk
Signed-off-by: Gustaw Smolarczyk 
---
 src/mesa/main/ff_fragment_shader.cpp | 8 
 1 file changed, 8 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp b/src/mesa/main/ff_fragment_shader.cpp
index 9b00c36534..95c74e2b92 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -68,14 +68,6 @@ using namespace ir_builder;
  */
 
 
-struct texenvprog_cache_item
-{
-   GLuint hash;
-   void *key;
-   struct gl_shader_program *data;
-   struct texenvprog_cache_item *next;
-};
-
 static GLboolean
 texenv_doing_secondary_color(struct gl_context *ctx)
 {
-- 
2.12.1



[Mesa-dev] [PATCH 03/14] mesa/main/ff_frag: Reduce the size of nr_enabled_units.

2017-03-30 Thread Gustaw Smolarczyk
Since it holds values from 0 to 8, 4 bits will suffice.

Signed-off-by: Gustaw Smolarczyk 
Reviewed-by: Eric Anholt 
---
 src/mesa/main/ff_fragment_shader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp b/src/mesa/main/ff_fragment_shader.cpp
index d1e89abc08..9b00c36534 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -100,7 +100,7 @@ struct mode_opt {
 };
 
 struct state_key {
-   GLuint nr_enabled_units:8;
+   GLuint nr_enabled_units:4;
GLuint separate_specular:1;
GLuint fog_mode:2;  /**< FOG_x */
GLuint inputs_available:12;
-- 
2.12.1

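
The "4 bits will suffice" argument in this patch can be checked with a small
sketch. The struct below is a hypothetical stand-in that mirrors only the
field width from the patch; a 4-bit unsigned bit-field stores 0..15, so the
range 0..8 named in the commit message fits.

```c
#include <assert.h>

/* Hypothetical stand-in for the key; only the field width mirrors the patch. */
struct demo_state_key {
   unsigned nr_enabled_units:4;   /* a 4-bit unsigned field holds 0..15 */
};

/* Store a count in the 4-bit field and read it back. */
static unsigned demo_store_count(unsigned count)
{
   struct demo_state_key key = {0};
   assert(count <= 15);           /* larger values would wrap modulo 16 */
   key.nr_enabled_units = count;
   return key.nr_enabled_units;
}
```

Calling demo_store_count(8) returns 8 unchanged, so the largest value the
commit message mentions survives the narrower field.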


[Mesa-dev] [PATCH 02/14] mesa/main/ff_frag: Remove enabled_units.

2017-03-30 Thread Gustaw Smolarczyk
Its only use is easily replaced by nr_enabled_units. As for its role in the
cache key, unit[i].enabled already carries the same information.

Signed-off-by: Gustaw Smolarczyk 
Reviewed-by: Eric Anholt 
---
 src/mesa/main/ff_fragment_shader.cpp | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp b/src/mesa/main/ff_fragment_shader.cpp
index 7679328d4c..d1e89abc08 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -101,7 +101,6 @@ struct mode_opt {
 
 struct state_key {
GLuint nr_enabled_units:8;
-   GLuint enabled_units:8;
GLuint separate_specular:1;
GLuint fog_mode:2;  /**< FOG_x */
GLuint inputs_available:12;
@@ -421,7 +420,6 @@ static GLuint make_state_key( struct gl_context *ctx,  struct state_key *key )
   format = _mesa_texture_base_format(texObj);
 
   key->unit[i].enabled = 1;
-  key->enabled_units |= (1<<i);
   key->nr_enabled_units = i + 1;
   inputs_referenced |= VARYING_BIT_TEX(i);
 
@@ -1136,7 +1134,7 @@ emit_instructions(texenv_fragment_program *p)
struct state_key *key = p->state;
GLuint unit;
 
-   if (key->enabled_units) {
+   if (key->nr_enabled_units) {
   /* First pass - to support texture_env_crossbar, first identify
* all referenced texture sources and emit texld instructions
* for each:
-- 
2.12.1

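
Why the emit_instructions test can switch from the bitmask to the count is
sketched below: the count records "highest enabled unit + 1", so it is
nonzero exactly when some bit of the mask is set. The DEMO_ bound and helper
names are hypothetical; only the two expressions in the comments come from
the patch.

```c
#include <assert.h>

#define DEMO_MAX_UNITS 8   /* hypothetical bound for this sketch */

/* Derive "highest enabled unit + 1" from a per-unit enable bitmask,
 * mirroring `enabled_units |= (1<<i)` and `nr_enabled_units = i + 1`. */
static unsigned demo_nr_enabled_units(unsigned enabled_mask)
{
   unsigned nr = 0;
   assert(enabled_mask < (1u << DEMO_MAX_UNITS));
   for (unsigned i = 0; i < DEMO_MAX_UNITS; i++) {
      if (enabled_mask & (1u << i))
         nr = i + 1;
   }
   return nr;
}

/* Returns 1 when the old test (mask != 0) and the new test (nr != 0) agree. */
static int demo_tests_agree(unsigned enabled_mask)
{
   return (enabled_mask != 0) == (demo_nr_enabled_units(enabled_mask) != 0);
}
```

For example, a mask with units 1 and 3 enabled (0xA) yields a count of 4, and
an empty mask yields 0, so both conditions take the same branch.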


[Mesa-dev] [PATCH 00/14] More substantial ff_fragment_shader cache key optimizations.

2017-03-30 Thread Gustaw Smolarczyk
Hello,

This is the continuation of my ff_fragment_shader cache key optimizations.
I have kept trying to reduce the overhead of the make_state_key function, and
it seems that I have gained a little. As this is the first time I have
ventured this far into the Mesa codebase, it's possible that I did something
wrong along the way. Please point out anything you find incorrect.
For example, I was a little bit confused by the indentation used in some code
parts (like using only spaces or a mix of spaces and tabs). I tried to preserve
the indentation of the files I modified.

As before, the number of patches might be a little bit high since some of them
are very simple. I might squash some of them if you prefer that.

The first 3 patches are a rebased resend of a previous series. I have kept
Eric's r-by even though I changed their commit messages a little; I hope
that was OK.

Patches 4-9 contain simple self-contained improvements to the cache key and
its computation.

Patches 10 and 11 move some of the state computation to the point where the
state is changed. To do that, I have added a couple of compressed state
fields to the context object.
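
The idea of maintaining a compressed field where the state changes can be
sketched as follows. Everything here is hypothetical (the demo_ names, the
enum values, and the context layout are not Mesa's); it only illustrates
updating a small derived field in the setters so that key construction
becomes a plain read.

```c
#include <assert.h>

/* Hypothetical compact encodings; not Mesa's actual enum values. */
enum demo_fog { DEMO_FOG_NONE = 0, DEMO_FOG_LINEAR, DEMO_FOG_EXP, DEMO_FOG_EXP2 };

struct demo_context {
   int fog_enabled;            /* written by the enable/disable path */
   enum demo_fog fog_mode;     /* written by the fog-mode path */
   unsigned packed_fog:2;      /* derived value the cache key wants */
};

/* Recompute the derived field; called from every setter that affects it. */
static void demo_update_packed_fog(struct demo_context *ctx)
{
   ctx->packed_fog = ctx->fog_enabled ? (unsigned)ctx->fog_mode
                                      : (unsigned)DEMO_FOG_NONE;
}

static void demo_set_fog_enabled(struct demo_context *ctx, int enabled)
{
   ctx->fog_enabled = enabled != 0;
   demo_update_packed_fog(ctx);
}

static void demo_set_fog_mode(struct demo_context *ctx, enum demo_fog mode)
{
   assert(mode >= DEMO_FOG_LINEAR && mode <= DEMO_FOG_EXP2);
   ctx->fog_mode = mode;
   demo_update_packed_fog(ctx);
}

/* Usage example: apply both setters, then read the key field directly. */
static unsigned demo_fog_key_after(int enabled, enum demo_fog mode)
{
   struct demo_context ctx = {0};
   demo_set_fog_mode(&ctx, mode);
   demo_set_fog_enabled(&ctx, enabled);
   return ctx.packed_fog;
}
```

The trade-off is a little extra work in rarely called setters in exchange
for less work in the key-building path, which runs far more often.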

Patches 12 and 13 use these new fields inside make_state_key, simplifying it
a lot. Along the way, I have fixed an apparent bug (GL_ONE was not handled as
a combine source), though a piglit quick run showed no difference.

Finally, patch 14 uses the new compressed fog state for atifs state handling
in st/mesa, since it was quite simple to modify it. I didn't bother using
the new state for classic dri drivers.

I have run a piglit quick test on radeonsi before and after the series and
there were no differences apart from some unstable test results.

As for performance measurements, I have run a simple minecraft apitrace
through perf-record 5 times and have found that:

1. The fps reported by the apitrace replay is too variable to show any
   difference. It can be considered a wash.
2. perf-report shows something more encouraging. The time spent in
   _mesa_get_fixed_func_fragment_program has dropped from ~0.78% to ~0.37%.
   The standard deviation here is ~0.025%, so the performance gain is
   statistically significant.
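
As a rough check of the numbers in item 2, the drop can be expressed in
units of the quoted standard deviation. This is a crude yardstick under the
assumption that ~0.025% applies to both measurements; it is not a full
significance test, and the helper name is made up for illustration.

```c
#include <assert.h>

/* How many standard deviations separate two measurements.
 * A crude yardstick, not a full significance test. */
static double demo_sigmas_apart(double before, double after, double stddev)
{
   assert(stddev > 0.0);
   double diff = before - after;
   if (diff < 0.0)
      diff = -diff;
   return diff / stddev;
}
```

With before = 0.78, after = 0.37 and stddev = 0.025, the difference is 0.41
percentage points, or roughly 16 standard deviations.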

Regards,
Gustaw

Gustaw Smolarczyk (14):
  mesa/main/ff_frag: Use correct constant.
  mesa/main/ff_frag: Remove enabled_units.
  mesa/main/ff_frag: Reduce the size of nr_enabled_units.
  mesa/main/ff_frag: Remove unused struct.
  mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.
  mesa/main/ff_frag: Simplify get_fp_input_mask.
  mesa/main/ff_frag: Store nr_enabled_units only once.
  mesa/main/ff_frag: Use gl_texture_object::TargetIndex.
  mesa/main/ff_frag: Don't retrieve format if not necessary.
  mesa/main: Maintain compressed fog mode.
  mesa/main: Maintain compressed TexEnv Combine state.
  mesa/main/ff_frag: Use compressed fog mode.
  mesa/main/ff_frag: Use compressed TexEnv Combine state.
  st/mesa: Use compressed fog mode for atifs.

 src/mesa/main/enable.c                    |   1 +
 src/mesa/main/ff_fragment_shader.cpp      | 506 ++
 src/mesa/main/fog.c                       |   9 +
 src/mesa/main/mtypes.h                    |  97 ++
 src/mesa/main/texstate.c                  | 103 ++
 src/mesa/state_tracker/st_atifs_to_tgsi.c |   6 +-
 src/mesa/state_tracker/st_atom_shader.c   |  17 +-
 7 files changed, 388 insertions(+), 351 deletions(-)

-- 
2.12.1


