Re: [Mesa-dev] [PATCH] glsl: Use ir_var_temporary when generating inline functions.

2016-12-19 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Mon, 2016-12-19 at 15:29 -0800, Kenneth Graunke wrote:
> We were using ir_var_auto for the inlined function parameter
> variables,
> which is wrong, as it suggests that those are real variables declared
> by the program.
> 
> Normally this doesn't matter.  However, if you called built-ins at
> global scope, it would pollute the global variable namespace with
> these new parameter temporaries.  If the shader already had variables
> with those names, the linker might see contradictory global variable
> declarations and raise an error.
> 
> Making them temporaries indicates that these are just things
> generated
> by the compiler internally.  This avoids confusing the linker.
> 
> Fixes a new Piglit test: glsl-fs-multiple-builtins.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
> Reported-by: Niels Ole Salscheider 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/opt_function_inlining.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/opt_function_inlining.cpp
> b/src/compiler/glsl/opt_function_inlining.cpp
> index 62c1f4b..78a726b 100644
> --- a/src/compiler/glsl/opt_function_inlining.cpp
> +++ b/src/compiler/glsl/opt_function_inlining.cpp
> @@ -164,7 +164,7 @@ ir_call::generate_inline(ir_instruction *next_ir)
>    parameters[i] = NULL;
>    } else {
>    parameters[i] = sig_param->clone(ctx, ht);
> -  parameters[i]->data.mode = ir_var_auto;
> +  parameters[i]->data.mode = ir_var_temporary;
>  
>    /* Remove the read-only decoration because we're going to
> write
>     * directly to this variable.  If the cloned variable is
> left
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add renderonly library

2016-12-19 Thread Christian Gmeiner
2016-12-19 22:50 GMT+01:00 Thierry Reding :
> On Mon, Dec 19, 2016 at 08:54:04PM +0100, Christian Gmeiner wrote:
>> 2016-12-19 14:08 GMT+01:00 Thierry Reding :
>> > On Wed, Nov 30, 2016 at 02:44:34PM +0100, Christian Gmeiner wrote:
> [...]
>> >>  GALLIUM_WINSYS_CFLAGS = \
>> >>   -I$(top_srcdir)/src \
>> >>   -I$(top_srcdir)/include \
>> >> diff --git a/src/gallium/auxiliary/Makefile.am 
>> >> b/src/gallium/auxiliary/Makefile.am
>> >> index 4a4a4fb..6b63cf1 100644
>> >> --- a/src/gallium/auxiliary/Makefile.am
>> >> +++ b/src/gallium/auxiliary/Makefile.am
>> >> @@ -20,6 +20,16 @@ libgallium_la_SOURCES = \
>> >>   $(NIR_SOURCES) \
>> >>   $(GENERATED_SOURCES)
>> >>
>> >> +if HAVE_LIBDRM
>> >> +
>> >> +AM_CFLAGS += \
>> >> + $(LIBDRM_CFLAGS)
>> >
>> > Same here.
>>
>> I just redone what other have done in that file. There are some if's
>> in that file and I am unsure
>> what to do here.
>
> Maybe Emil has a strong opinion here, I was really just nit-picking, so
> feel free to leave this.
>

:)

>> >> diff --git a/src/gallium/auxiliary/Makefile.sources 
>> >> b/src/gallium/auxiliary/Makefile.sources
>> >> index 5d4fe30..8d3e4a9 100644
>> >> --- a/src/gallium/auxiliary/Makefile.sources
>> >> +++ b/src/gallium/auxiliary/Makefile.sources
>> >> @@ -435,3 +435,7 @@ GALLIVM_SOURCES := \
>> >>   draw/draw_llvm_sample.c \
>> >>   draw/draw_pt_fetch_shade_pipeline_llvm.c \
>> >>   draw/draw_vs_llvm.c
>> >> +
>> >> +RENDERONLY_SOURCES := \
>> >> + renderonly/renderonly.c \
>> >> + renderonly/renderonly.h
>> >> diff --git a/src/gallium/auxiliary/renderonly/renderonly.c 
>> >> b/src/gallium/auxiliary/renderonly/renderonly.c
>> >> new file mode 100644
>> >> index 000..c4ea784
>> >> --- /dev/null
>> >> +++ b/src/gallium/auxiliary/renderonly/renderonly.c
>> >> @@ -0,0 +1,199 @@
>> >> +/*
>> >> + * Copyright (C) 2016 Christian Gmeiner 
>> >> + *
>> >> + * Permission is hereby granted, free of charge, to any person obtaining 
>> >> a
>> >> + * copy of this software and associated documentation files (the 
>> >> "Software"),
>> >> + * to deal in the Software without restriction, including without 
>> >> limitation
>> >> + * the rights to use, copy, modify, merge, publish, distribute, 
>> >> sublicense,
>> >> + * and/or sell copies of the Software, and to permit persons to whom the
>> >> + * Software is furnished to do so, subject to the following conditions:
>> >> + *
>> >> + * The above copyright notice and this permission notice (including the 
>> >> next
>> >> + * paragraph) shall be included in all copies or substantial portions of 
>> >> the
>> >> + * Software.
>> >> + *
>> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>> >> EXPRESS OR
>> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>> >> MERCHANTABILITY,
>> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
>> >> SHALL
>> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> >> OTHER
>> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
>> >> ARISING FROM,
>> >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> >> DEALINGS IN THE
>> >> + * SOFTWARE.
>> >> + *
>> >> + * Authors:
>> >> + *Christian Gmeiner 
>> >> + */
>> >> +
>> >> +#include "renderonly/renderonly.h"
>> >
>> > This include is oddly placed. Should it not go below all other includes?
>> >
>>
>> I think thats a matter of style but I can puit it above #include
>> "state_tracker/drm_driver.h"
>
> I prefer having these in hierarchical order. That is, anything from the
> C runtime goes first, then other system includes, followed by project
> core includes and finally driver-local includes. By that ordering
> renderonly.h would be very last in line. Not sure if anyone in Mesa is
> pedantic enough to insist on a common style, so feel free to leave this
> as-is if you prefer.
>

Okay.. I think this could be fixed after it went in.

>> >> +struct pipe_screen *
>> >> +renderonly_screen_create(int fd, const struct renderonly_ops *ops, void 
>> >> *priv)
>> >
>> > This is slightly nit-picky, but I found it confusing when first reading
>> > through the patches: I know this started out as an effort to support the
>> > various render-only devices out there, but what you're creating here is
>> > really an abstraction for the scanout engine. Reading renderonly_* the
>> > first thing I think of is that this creates a render-only device, where
>> > in fact is creates the scanout engine.
>> >
>> > Perhaps renaming this to struct scanout, struct scanout_engine or any
>> > other number of possibilities would remove that confusion.
>> >
>>
>> In my current version I have only one struct left:
>>
>> struct renderonly {
>>int (*tiling)(int fd, uint32_t handle); /**< for !intermediate_rendering 
>> */
>>bool intermediate_rendering;
>>

[Mesa-dev] [PATCH] egl/dri2: implement query surface hook

2016-12-19 Thread Tapani Pälli
This makes better guarantee that the values we return are
in sync what the underlying drawable currently has.

Together with dEQP change bug #98327 this fixes following test:

   dEQP-EGL.functional.resize.surface_size.grow

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98327
---
 src/egl/drivers/dri2/platform_x11.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index db7d3b9..0c5d577 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -395,6 +395,34 @@ dri2_x11_destroy_surface(_EGLDriver *drv, _EGLDisplay 
*disp, _EGLSurface *surf)
 }
 
 /**
+ * Function utilizes swrastGetDrawableInfo to get surface
+ * geometry from x server and calls default query surface
+ * implementation that returns the updated values.
+ *
+ * In case of errors we still return values that we currently
+ * have.
+ */
+static EGLBoolean
+dri2_query_surface(_EGLDriver *drv, _EGLDisplay *dpy,
+   _EGLSurface *surf, EGLint attribute,
+   EGLint *value)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   int x, y, w = -1, h = -1;
+
+   __DRIdrawable *drawable = dri2_dpy->vtbl->get_dri_drawable(surf);
+   swrastGetDrawableInfo(drawable, , , , , dri2_surf);
+
+   if (w != -1 && h != -1) {
+  surf->Width = w;
+  surf->Height = h;
+   }
+
+   return _eglQuerySurface(drv, dpy, surf, attribute, value);
+}
+
+/**
  * Process list of buffer received from the server
  *
  * Processes the list of buffers received in a reply from the server to either
@@ -1113,6 +1141,7 @@ static struct dri2_egl_display_vtbl 
dri2_x11_swrast_display_vtbl = {
.post_sub_buffer = dri2_fallback_post_sub_buffer,
.copy_buffers = dri2_x11_copy_buffers,
.query_buffer_age = dri2_fallback_query_buffer_age,
+   .query_surface = dri2_query_surface,
.create_wayland_buffer_from_image = 
dri2_fallback_create_wayland_buffer_from_image,
.get_sync_values = dri2_fallback_get_sync_values,
.get_dri_drawable = dri2_surface_get_dri_drawable,
@@ -1132,6 +1161,7 @@ static struct dri2_egl_display_vtbl dri2_x11_display_vtbl 
= {
.post_sub_buffer = dri2_x11_post_sub_buffer,
.copy_buffers = dri2_x11_copy_buffers,
.query_buffer_age = dri2_fallback_query_buffer_age,
+   .query_surface = dri2_query_surface,
.create_wayland_buffer_from_image = 
dri2_fallback_create_wayland_buffer_from_image,
.get_sync_values = dri2_x11_get_sync_values,
.get_dri_drawable = dri2_surface_get_dri_drawable,
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 000/103] i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0

2016-12-19 Thread Samuel Iglesias Gonsálvez
On Mon, 2016-12-19 at 11:31 -0600, Matt Turner wrote:
> On Mon, Dec 19, 2016 at 2:00 AM, Samuel Iglesias Gonsálvez
>  wrote:
> > Hello Matt,
> > 
> > We have done most of the suggestions you made to our patches.
> > However,
> > we have replied to some of your questions/suggestions and we are
> > waiting for a reply before marking them as R-b or not.
> 
> Thank you guys so much.
> 
> > You can clone the new version of the patch series by running this
> > command:
> > 
> > $ git clone -b i965-fp64-gen7-scalar-vec4-rc3 https://github.com/Ig
> > alia
> > /mesa.git
> > 
> > Below is the list of patches that need a R-b (they are marked as
> > UNREVIEWED in the branch).
> > 
> > * i965/vec4: implement hardware workaround for align16 double to
> > float
> > conversion
> > > 
> > >   This always seemed like a really strange hardware bug, and
> > 
> > one
> > >   that no one should ever hit.
> > > 
> > >   I'd prefer that, instead of loading an immediate double and
> > > then
> > >   performing a conversion to float, that we just convert the
> > >   double to float in the compiler and emit an instruction to
> > 
> > load
> > >   that.
> > > 
> > 
> >   We have done this. Does this change get your R-b?
> 
> Yes!
> 
> > 
> > * i965/vec4: fix optimize predicate for doubles
> > 
> >   We have replied here [0].
> 
> Sounds good to me.
> 
> > 
> > * i965/vec4: handle 32 and 64 bit channels in liveness analysis
> > 
> >   It is still unreviewed. Maybe Curro can take a look at it.
> 
> I've also pinged Curro to ask if he'll review it.
> 
> > * i965/vec4: add a SIMD lowering pass
> > 
> >   Replied here [1].
> 
> Silly messy hardware. :)
> 
> > * i965/vec4: Prevent copy propagation from violating pre-gen8
> > restrictions
> > 
> >   Replied here [1].
> > 
> > * i965/vec4: run scalarize_df() after spilling
> > 
> >   Replied here [1].
> 
> Makes sense.
> 
> Yes, all of those should be
> 
> Reviewed-by: Matt Turner 
> 
> Again, thank you so much. This was a large amount of work, and the
> way
> you guys handled it was extremely impressive. I'm only sorry that the
> review of your work wasn't executed as well as your actual work!
> 

Thanks to you for the review! :-)

Sam

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] new EGL internal SyncSurface hook

2016-12-19 Thread Tapani Pälli
Forget about this ... with some more coffee I realized I can just go and 
implement the geometry fetching in QuerySurface hook for dri2 x11, will 
send a patch for that!


Sorry about the noise!

On 12/19/2016 02:36 PM, Tapani Pälli wrote:

Hi;

I've been investigating dEQP EGL bugs. Following test seems to fail now
and then and only on X11 when using DRI2:

   dEQP-EGL.functional.resize.surface_size.grow

This patch is RFC for a 'SyncSurface' API that will query surface
geometry (xcb_get_geometry) before returning current stored values in
'_eglQuerySurface'. Without this we might have wrong values in place,
I believe this is related to following comments in the source code ..

--- 8< ---
src/mesa/drivers/dri/i965/brw_context.c:

/* GLX uses DRI2 invalidate events to handle window resizing.
 * Unfortunately, EGL does not - libEGL is written in XCB (not Xlib),
 * which doesn't provide a mechanism for snooping the event queues.
 *
 * So EGL still relies on viewport hacks to handle window resizing.
 * This should go away with DRI3000.
 */

src/egl/drivers/dri2/platform_x11.c:

/* Since we aren't watching for the server's invalidate events like we're
 * supposed to (due to XCB providing no mechanism for filtering the events
 * the way xlib does), and SwapBuffers is a common cause of invalidate
 * events, just shove one down to the driver, even though we haven't told
 * the driver that we're the kind of loader that provides reliable
 * invalidate events.  This causes the driver to request buffers again at
 * its next draw, so that we get the correct buffers if a pageflip
 * happened.  The driver should still be using the viewport hack to catch
 * window resizes.
 */
--- 8< ---

This patch together with a dEQP change I've attached to bug #98327 makes
this test pass consistently for me on 2 machines, one is running Fedora
with DRI3 and another one Ubuntu with DRI2.

Is this OK approach? Should the name be something else like
SyncSurfaceGeometry or GetSurfaceGeometry?

Any comments appreciated;

Tapani Pälli (1):
  egl: syncronize surface information with driver RFC

 src/egl/drivers/dri2/egl_dri2.c | 12 
 src/egl/drivers/dri2/egl_dri2.h |  2 ++
 src/egl/drivers/dri2/platform_x11.c | 31 +++
 src/egl/main/eglapi.h   |  2 ++
 src/egl/main/eglfallbacks.c |  1 +
 src/egl/main/eglsurface.c   |  3 +++
 6 files changed, 51 insertions(+)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vulkan/wsi/x11: don't crash on null wsi x11 connection

2016-12-19 Thread Edward O'Callaghan
Acked-by: Edward O'Callaghan 

On 12/20/2016 02:59 PM, Arda Coskunses wrote:
> Without this check driver crash when application window
> closed unexpectedly.
> ---
>  src/vulkan/wsi/wsi_common_x11.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
> index 25ba0c1..afb7809 100644
> --- a/src/vulkan/wsi/wsi_common_x11.c
> +++ b/src/vulkan/wsi/wsi_common_x11.c
> @@ -261,6 +261,11 @@ VkBool32 
> wsi_get_physical_device_xcb_presentation_support(
> struct wsi_x11_connection *wsi_conn =
>wsi_x11_get_connection(wsi_device, alloc, connection);
>  
> +   if (!wsi_conn) {
> +  fprintf(stderr, "vulkan: wsi connection lost\n");
> +  return false;
> +   }
> +
> if (!wsi_conn->has_dri3) {
>fprintf(stderr, "vulkan: No DRI3 support\n");
>return false;
> 



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir/algebraic: Add optimizations for "a == a && a CMP b"

2016-12-19 Thread Jason Ekstrand
This sequence shows up The Talos Principal, at least under Vulkan,
and prevents loop analysis from properly computing trip counts in a
few loops.
---
 src/compiler/nir/nir_opt_algebraic.py | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 698ac67..cc70ad5 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -464,6 +464,14 @@ def bitfield_reverse(u):
 
 optimizations += [(bitfield_reverse('x@32'), ('bitfield_reverse', 'x'))]
 
+# For any comparison operation, "cmp", if you have "a != a && a cmp b" then
+# the "a != a" is redundant because it's equivalent to "a is not NaN" and, if
+# a is a NaN then the second comparison will fail anyway.
+for op in ['flt', 'fge', 'feq', 'fne']:
+   optimizations += [
+  (('iand', ('feq', a, a), (op, a, b)), (op, a, b)),
+  (('iand', ('feq', a, a), (op, b, a)), (op, b, a)),
+   ]
 
 # Add optimizations to handle the case where the result of a ternary is
 # compared to a constant.  This way we can take things like
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99076] dEQP-GLES3.functional.negative_api.texture#teximage3d fails due to wrong Error code

2016-12-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99076

Gary Wang  changed:

   What|Removed |Added

 CC||gary.c.w...@intel.com

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965: make use of nir_lower_returns() for GL

2016-12-19 Thread Timothy Arceri
Fixes two new piglit tests:

spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test
spec/glsl-1.10/execution/vs-nested-return-sibling-loop2.shader_test

shader-db results for BDW:

total instructions in shared programs: 12903158 -> 12903134 (-0.00%)
instructions in affected programs: 27100 -> 27076 (-0.09%)
helped: 32
HURT: 6

total cycles in shared programs: 294922518 -> 294922804 (0.00%)
cycles in affected programs: 4372828 -> 4373114 (0.01%)
helped: 31
HURT: 8
---
 src/mesa/drivers/dri/i965/brw_link.cpp  | 6 --
 src/mesa/drivers/dri/i965/brw_program.c | 2 ++
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index 6f37428..19db982 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -160,12 +160,6 @@ process_glsl_ir(struct brw_context *brw,
  brw_do_vector_splitting(shader->ir);
   }
 
-  progress = do_lower_jumps(shader->ir, true, true,
-true, /* main return */
-false, /* continue */
-false /* loops */
-) || progress;
-
   progress = do_common_optimization(shader->ir, true, true,
 options, ctx->Const.NativeIntegers) || 
progress;
} while (progress);
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index a502b8e..c4ab5ee 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -78,6 +78,8 @@ brw_create_nir(struct brw_context *brw,
if (shader_prog) {
   nir = glsl_to_nir(shader_prog, stage, options);
   nir_remove_dead_variables(nir, nir_var_shader_in | nir_var_shader_out);
+  nir_lower_returns(nir);
+  nir_validate_shader(nir);
   NIR_PASS_V(nir, nir_lower_io_to_temporaries,
  nir_shader_get_entrypoint(nir), true, false);
} else {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] nir: update nir_lower_returns to only predicate instructions when needed

2016-12-19 Thread Timothy Arceri
Unless an if statement contains nested returns we can simply add
any following instructions to the branch without the return.

V2: fix handling if_nested_return value when there is a sibling if/loop
that doesn't contain a return. (Spotted by Ken)
---
 src/compiler/nir/nir_lower_returns.c | 37 ++--
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/src/compiler/nir/nir_lower_returns.c 
b/src/compiler/nir/nir_lower_returns.c
index cf49d5b..5eec984 100644
--- a/src/compiler/nir/nir_lower_returns.c
+++ b/src/compiler/nir/nir_lower_returns.c
@@ -30,6 +30,8 @@ struct lower_returns_state {
struct exec_list *cf_list;
nir_loop *loop;
nir_variable *return_flag;
+   /* Are there other return statments nested in the current if */
+   bool if_nested_return;
 };
 
 static bool lower_returns_in_cf_list(struct exec_list *cf_list,
@@ -82,8 +84,10 @@ lower_returns_in_loop(nir_loop *loop, struct 
lower_returns_state *state)
 * flag set to true.  We need to predicate everything following the loop
 * on the return flag.
 */
-   if (progress)
+   if (progress) {
   predicate_following(>cf_node, state);
+  state->if_nested_return = true;
+   }
 
return progress;
 }
@@ -91,10 +95,13 @@ lower_returns_in_loop(nir_loop *loop, struct 
lower_returns_state *state)
 static bool
 lower_returns_in_if(nir_if *if_stmt, struct lower_returns_state *state)
 {
-   bool progress;
+   bool progress, then_progress;
 
-   progress = lower_returns_in_cf_list(_stmt->then_list, state);
-   progress = lower_returns_in_cf_list(_stmt->else_list, state) || progress;
+   bool if_nested_return = state->if_nested_return;
+   state->if_nested_return = false;
+
+   then_progress = lower_returns_in_cf_list(_stmt->then_list, state);
+   progress = lower_returns_in_cf_list(_stmt->else_list, state) || 
then_progress;
 
/* If either of the recursive calls made progress, then there were
 * returns inside of the body of the if.  If we're in a loop, then these
@@ -106,8 +113,25 @@ lower_returns_in_if(nir_if *if_stmt, struct 
lower_returns_state *state)
 * after a return, we need to predicate everything following on the
 * return flag.
 */
-   if (progress && !state->loop)
-  predicate_following(_stmt->cf_node, state);
+   if (progress && !state->loop) {
+  if (state->if_nested_return) {
+ predicate_following(_stmt->cf_node, state);
+  } else {
+ /* If there are no nested returns we can just add the instructions to
+  * the end of the branch that doesn't have the return.
+  */
+ nir_cf_list list;
+ nir_cf_extract(, nir_after_cf_node(_stmt->cf_node),
+nir_after_cf_list(state->cf_list));
+
+ if (then_progress)
+nir_cf_reinsert(, nir_after_cf_list(_stmt->else_list));
+ else
+nir_cf_reinsert(, nir_after_cf_list(_stmt->then_list));
+  }
+   }
+
+   state->if_nested_return = progress || if_nested_return;
 
return progress;
 }
@@ -221,6 +245,7 @@ nir_lower_returns_impl(nir_function_impl *impl)
state.cf_list = >body;
state.loop = NULL;
state.return_flag = NULL;
+   state.if_nested_return = false;
nir_builder_init(, impl);
 
bool progress = lower_returns_in_cf_list(>body, );
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/6] nir: A couple of SPIR-V focused loop optimizations

2016-12-19 Thread Jason Ekstrand
While working with Tim on loop unrolling, I realized that the new NIR loop
unrolling pass couldn't actually unroll any loops coming out of SPIR-V
thanks to the way we handle continues.  This little series adds two
relatively small optimization passes that deal with this by peeling apart
ifs at the tops of loops that take one branch on the first iteration and
the other branch on all subsequent operations.

Jason Ekstrand (6):
  nir: Add a couple quick-and-dirty out-of-SSA helpers
  nir: Correctly handle blocks in cf_node_cf_tree_next
  nir: Expose function_impl versions of copy-prop and dce
  nir: Add an optimization pass to remove trivial continues
  nir: Add a pass for moving SPIR-V continue blocks to the ends of loops
  i965: Use nir_opt_trivial_continues and nir_opt_if

 src/compiler/Makefile.sources|   2 +
 src/compiler/nir/nir.c   |   2 +-
 src/compiler/nir/nir.h   |   9 +
 src/compiler/nir/nir_from_ssa.c  | 189 ++--
 src/compiler/nir/nir_opt_copy_propagate.c|   2 +-
 src/compiler/nir/nir_opt_dce.c   |   2 +-
 src/compiler/nir/nir_opt_if.c| 253 +++
 src/compiler/nir/nir_opt_trivial_continues.c | 141 +++
 src/mesa/drivers/dri/i965/brw_nir.c  |   2 +
 9 files changed, 587 insertions(+), 15 deletions(-)
 create mode 100644 src/compiler/nir/nir_opt_if.c
 create mode 100644 src/compiler/nir/nir_opt_trivial_continues.c

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] nir: Expose function_impl versions of copy-prop and dce

2016-12-19 Thread Jason Ekstrand
---
 src/compiler/nir/nir.h| 2 ++
 src/compiler/nir/nir_opt_copy_propagate.c | 2 +-
 src/compiler/nir/nir_opt_dce.c| 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 0fd6a77..89b0c70 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2545,10 +2545,12 @@ bool nir_opt_constant_folding(nir_shader *shader);
 
 bool nir_opt_global_to_local(nir_shader *shader);
 
+bool nir_copy_prop_impl(nir_function_impl *impl);
 bool nir_copy_prop(nir_shader *shader);
 
 bool nir_opt_cse(nir_shader *shader);
 
+bool nir_opt_dce_impl(nir_function_impl *impl);
 bool nir_opt_dce(nir_shader *shader);
 
 bool nir_opt_dead_cf(nir_shader *shader);
diff --git a/src/compiler/nir/nir_opt_copy_propagate.c 
b/src/compiler/nir/nir_opt_copy_propagate.c
index c26e07f..3cd65a2 100644
--- a/src/compiler/nir/nir_opt_copy_propagate.c
+++ b/src/compiler/nir/nir_opt_copy_propagate.c
@@ -240,7 +240,7 @@ copy_prop_if(nir_if *if_stmt)
return copy_prop_src(_stmt->condition, NULL, if_stmt);
 }
 
-static bool
+bool
 nir_copy_prop_impl(nir_function_impl *impl)
 {
bool progress = false;
diff --git a/src/compiler/nir/nir_opt_dce.c b/src/compiler/nir/nir_opt_dce.c
index 5cefba3..2d2ec29 100644
--- a/src/compiler/nir/nir_opt_dce.c
+++ b/src/compiler/nir/nir_opt_dce.c
@@ -128,7 +128,7 @@ init_block(nir_block *block, struct exec_list *worklist)
return true;
 }
 
-static bool
+bool
 nir_opt_dce_impl(nir_function_impl *impl)
 {
struct exec_list *worklist = rzalloc(NULL, struct exec_list);
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] nir: Add a pass for moving SPIR-V continue blocks to the ends of loops

2016-12-19 Thread Jason Ekstrand
When shaders come in from SPIR-V, we handle continue blocks by placing
the contents of the continue inside of a "if (!first_iteration)".  We do
this so that we can properly handle the fact that continues in SPIR-V
jump to the continue block at the end of the loop rather than jumping
directly to the top of the loop like they do in NIR.  In particular, the
increment step of a simple for loop ends up in the continue block.  This
pass looks for this case in loops that don't actually have any continues
and moves the continue contents to the end of the loop instead.  We need
this because loop unrolling doesn't work if the increment is inside of a
condition.
---
 src/compiler/Makefile.sources |   1 +
 src/compiler/nir/nir.h|   2 +
 src/compiler/nir/nir_opt_if.c | 253 ++
 3 files changed, 256 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_if.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index a0abede..6f701af 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -239,6 +239,7 @@ NIR_FILES = \
nir/nir_opt_dead_cf.c \
nir/nir_opt_gcm.c \
nir/nir_opt_global_to_local.c \
+   nir/nir_opt_if.c \
nir/nir_opt_loop_unroll.c \
nir/nir_opt_peephole_select.c \
nir/nir_opt_remove_phis.c \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index a6c8956..7b582a6 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2557,6 +2557,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
 
 bool nir_opt_gcm(nir_shader *shader, bool value_number);
 
+bool nir_opt_if(nir_shader *shader);
+
 bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode indirect_mask);
 
 bool nir_opt_peephole_select(nir_shader *shader, unsigned limit);
diff --git a/src/compiler/nir/nir_opt_if.c b/src/compiler/nir/nir_opt_if.c
new file mode 100644
index 000..5590684
--- /dev/null
+++ b/src/compiler/nir/nir_opt_if.c
@@ -0,0 +1,253 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_control_flow.h"
+
+/**
+ * This optimization detects if statements at the tops of loops where the
+ * condition is a phi node of two constants and moves half of the if to above
+ * the loop and the other half of the if to the end of the loop.  A simple for
+ * loop "for (int i = 0; i < 4; i++)", when run through the SPIR-V front-end,
+ * ends up looking something like this:
+ *
+ * vec1 32 ssa_0 = load_const (0x)
+ * vec1 32 ssa_1 = load_const (0x)
+ * loop {
+ *block block_1:
+ *vec1 32 ssa_2 = phi block_0: ssa_0, block_7: ssa_5
+ *vec1 32 ssa_3 = phi block_0: ssa_0, block_7: ssa_1
+ *if ssa_2 {
+ *   block block_2:
+ *   vec1 32 ssa_4 = load_const (0x0001)
+ *   vec1 32 ssa_5 = iadd ssa_2, ssa_4
+ *} else {
+ *   block block_3:
+ *}
+ *block block_4:
+ *vec1 32 ssa_6 = load_const (0x0004)
+ *vec1 32 ssa_7 = ilt ssa_5, ssa_6
+ *if ssa_7 {
+ *   block block_5:
+ *} else {
+ *   block block_6:
+ *   break
+ *}
+ *block block_7:
+ * }
+ *
+ * This turns it into something like this:
+ *
+ * // Stuff from block 1
+ * // Stuff from block 3
+ * loop {
+ *block block_1:
+ *vec1 32 ssa_3 = phi block_0: ssa_0, block_7: ssa_1
+ *vec1 32 ssa_6 = load_const (0x0004)
+ *vec1 32 ssa_7 = ilt ssa_5, ssa_6
+ *if ssa_7 {
+ *   block block_5:
+ *} else {
+ *   block block_6:
+ *   break
+ *}
+ *block block_7:
+ *// Stuff from block 1
+ *// Stuff from block 2
+ *vec1 32 ssa_4 = load_const (0x0001)
+ *vec1 32 ssa_5 = iadd ssa_2, ssa_4
+ * }
+ */
+static bool
+opt_peel_loop_initial_if(nir_loop *loop)
+{
+   nir_block *header_block = nir_loop_first_block(loop);
+   nir_block *prev_block =
+  

[Mesa-dev] [PATCH 2/6] nir: Correctly handle blocks in cf_node_cf_tree_next

2016-12-19 Thread Jason Ekstrand
---
 src/compiler/nir/nir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index e522a67..b8d1abd 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1754,7 +1754,7 @@ nir_block *nir_cf_node_cf_tree_last(nir_cf_node *node)
 nir_block *nir_cf_node_cf_tree_next(nir_cf_node *node)
 {
if (node->type == nir_cf_node_block)
-  return nir_cf_node_cf_tree_first(nir_cf_node_next(node));
+  return nir_block_cf_tree_next(nir_cf_node_as_block(node));
else if (node->type == nir_cf_node_function)
   return NULL;
else
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] i965: Use nir_opt_trivial_continues and nir_opt_if

2016-12-19 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_nir.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 0c1fb44..a091861 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -429,6 +429,8 @@ nir_optimize(nir_shader *nir, const struct brw_compiler 
*compiler,
   OPT(nir_opt_algebraic);
   OPT(nir_opt_constant_folding);
   OPT(nir_opt_dead_cf);
+  OPT(nir_opt_trivial_continues);
+  OPT(nir_opt_if);
   if (nir->options->max_unroll_iterations != 0) {
  OPT(nir_opt_loop_unroll, indirect_mask);
   }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] nir: Add an optimization pass to remove trivial continues

2016-12-19 Thread Jason Ekstrand
---
 src/compiler/Makefile.sources|   1 +
 src/compiler/nir/nir.h   |   2 +
 src/compiler/nir/nir_opt_trivial_continues.c | 141 +++
 3 files changed, 144 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_trivial_continues.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index ae3e5f0..a0abede 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -242,6 +242,7 @@ NIR_FILES = \
nir/nir_opt_loop_unroll.c \
nir/nir_opt_peephole_select.c \
nir/nir_opt_remove_phis.c \
+   nir/nir_opt_trivial_continues.c \
nir/nir_opt_undef.c \
nir/nir_phi_builder.c \
nir/nir_phi_builder.h \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 89b0c70..a6c8956 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2563,6 +2563,8 @@ bool nir_opt_peephole_select(nir_shader *shader, unsigned 
limit);
 
 bool nir_opt_remove_phis(nir_shader *shader);
 
+bool nir_opt_trivial_continues(nir_shader *shader);
+
 bool nir_opt_undef(nir_shader *shader);
 
 bool nir_opt_conditional_discard(nir_shader *shader);
diff --git a/src/compiler/nir/nir_opt_trivial_continues.c 
b/src/compiler/nir/nir_opt_trivial_continues.c
new file mode 100644
index 000..ff3205d
--- /dev/null
+++ b/src/compiler/nir/nir_opt_trivial_continues.c
@@ -0,0 +1,141 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+
+static bool
+instr_is_continue(nir_instr *instr)
+{
+   if (instr->type != nir_instr_type_jump)
+  return false;
+
+   return nir_instr_as_jump(instr)->type == nir_jump_continue;
+}
+
+static bool
+lower_trivial_continues_block(nir_block *block, nir_loop *loop)
+{
+   bool progress = false;
+   nir_instr *first_instr = nir_block_first_instr(block);
+   if (!first_instr || instr_is_continue(first_instr)) {
+  /* The block contains only a continue if anything */
+  nir_cf_node *prev_node = nir_cf_node_prev(>cf_node);
+  if (prev_node && prev_node->type == nir_cf_node_if) {
+ nir_if *prev_if = nir_cf_node_as_if(prev_node);
+ progress |= lower_trivial_continues_block(
+nir_if_last_then_block(prev_if), loop);
+ progress |= lower_trivial_continues_block(
+nir_if_last_else_block(prev_if), loop);
+  }
+
+  /* The lower_phis_to_regs helper is smart enough to push phi sources as
+   * far up if-ladders as it possibly can.  In other words, the recursive
+   * calls above shouldn't be adding instructions to this block since it
+   * follows an if statement.
+   */
+  first_instr = nir_block_first_instr(block);
+  assert(!first_instr || instr_is_continue(first_instr));
+   }
+
+   nir_instr *last_instr = nir_block_last_instr(block);
+   if (!last_instr || !instr_is_continue(last_instr))
+  return progress;
+
+   /* We're at the end of a block that goes straight through to the end of the
+* loop and continues to the top.  We can exit normally instead of jump.
+*/
+   nir_lower_phis_to_regs_block(nir_loop_first_block(loop));
+   nir_instr_remove(last_instr);
+   return true;
+}
+
+static bool
+lower_trivial_continues_list(struct exec_list *cf_list,
+ bool list_ends_at_loop_tail,
+ nir_loop *loop)
+{
+   bool progress = false;
+   foreach_list_typed(nir_cf_node, cf_node, node, cf_list) {
+  bool at_loop_tail = list_ends_at_loop_tail &&
+  _node->node == exec_list_get_tail(cf_list);
+  switch (cf_node->type) {
+  case nir_cf_node_block:
+ break;
+
+  case nir_cf_node_if: {
+ nir_if *nif = nir_cf_node_as_if(cf_node);
+ if (lower_trivial_continues_list(>then_list, at_loop_tail, loop))
+progress = true;
+ 

[Mesa-dev] [PATCH 1/6] nir: Add a couple quick-and-dirty out-of-SSA helpers

2016-12-19 Thread Jason Ekstrand
These are designed for use within an optimization pass when SSA becomes
more pain than it's worth.  They're very naive and don't generate
anything close to optimal register-based NIR.  Also, they may result in
shaders which do not validate because of, for instance, registers in phi
sources.  However, the register-based into-SSA pass should be pretty
efficient at cleaning up the mess.
---
 src/compiler/nir/nir.h  |   3 +
 src/compiler/nir/nir_from_ssa.c | 189 +---
 2 files changed, 180 insertions(+), 12 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 51bc6b2..0fd6a77 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2536,6 +2536,9 @@ void nir_convert_loop_to_lcssa(nir_loop *loop);
  */
 void nir_convert_from_ssa(nir_shader *shader, bool phi_webs_only);
 
+bool nir_lower_phis_to_regs_block(nir_block *block);
+bool nir_lower_ssa_defs_to_regs_block(nir_block *block);
+
 bool nir_opt_algebraic(nir_shader *shader);
 bool nir_opt_algebraic_late(nir_shader *shader);
 bool nir_opt_constant_folding(nir_shader *shader);
diff --git a/src/compiler/nir/nir_from_ssa.c b/src/compiler/nir/nir_from_ssa.c
index 15e89a3..603d1a5 100644
--- a/src/compiler/nir/nir_from_ssa.c
+++ b/src/compiler/nir/nir_from_ssa.c
@@ -447,6 +447,19 @@ aggressive_coalesce_block(nir_block *block, struct 
from_ssa_state *state)
return true;
 }
 
+static nir_register *
+create_reg_for_ssa_def(nir_ssa_def *def, nir_function_impl *impl)
+{
+   nir_register *reg = nir_local_reg_create(impl);
+
+   reg->name = def->name;
+   reg->num_components = def->num_components;
+   reg->bit_size = def->bit_size;
+   reg->num_array_elems = 0;
+
+   return reg;
+}
+
 static bool
 rewrite_ssa_def(nir_ssa_def *def, void *void_state)
 {
@@ -463,13 +476,8 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
* the things in the merge set should be the same so it doesn't
* matter which node's definition we use.
*/
-  if (node->set->reg == NULL) {
- node->set->reg = nir_local_reg_create(state->impl);
- node->set->reg->name = def->name;
- node->set->reg->num_components = def->num_components;
- node->set->reg->bit_size = def->bit_size;
- node->set->reg->num_array_elems = 0;
-  }
+  if (node->set->reg == NULL)
+ node->set->reg = create_reg_for_ssa_def(def, state->impl);
 
   reg = node->set->reg;
} else {
@@ -482,11 +490,7 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
   if (def->parent_instr->type == nir_instr_type_load_const)
  return true;
 
-  reg = nir_local_reg_create(state->impl);
-  reg->name = def->name;
-  reg->num_components = def->num_components;
-  reg->bit_size = def->bit_size;
-  reg->num_array_elems = 0;
+  reg = create_reg_for_ssa_def(def, state->impl);
}
 
nir_ssa_def_rewrite_uses(def, nir_src_for_reg(reg));
@@ -810,3 +814,164 @@ nir_convert_from_ssa(nir_shader *shader, bool 
phi_webs_only)
  nir_convert_from_ssa_impl(function->impl, phi_webs_only);
}
 }
+
+
+static void
+place_phi_read(nir_shader *shader, nir_register *reg,
+   nir_ssa_def *def, nir_block *block)
+{
+   if (block != def->parent_instr->block) {
+  /* Try to go up the single-successor tree */
+  bool all_single_successors = true;
+  struct set_entry *entry;
+  set_foreach(block->predecessors, entry) {
+ nir_block *pred = (nir_block *)entry->key;
+ if (pred->successors[0] && pred->successors[1]) {
+all_single_successors = false;
+break;
+ }
+  }
+
+  if (all_single_successors) {
+ /* All predecessors of this block have exactly one successor and it
+  * is this block so they must eventually lead here without
+  * intersecting each other.  Place the reads in the predecessors
+  * instead of this block.
+  */
+ set_foreach(block->predecessors, entry)
+place_phi_read(shader, reg, def, (nir_block *)entry->key);
+ return;
+  }
+   }
+
+   nir_alu_instr *mov = nir_alu_instr_create(shader, nir_op_imov);
+   mov->src[0].src = nir_src_for_ssa(def);
+   mov->dest.dest = nir_dest_for_reg(reg);
+   mov->dest.write_mask = (1 << reg->num_components) - 1;
+   nir_instr_insert(nir_after_block_before_jump(block), >instr);
+}
+
+/** Lower all of the phi nodes in a block to imov's to and from a register
+ *
+ * This provides a very quick-and-dirty out-of-SSA pass that you can run on a
+ * single block to convert all of it's phis to a register and some imov's.
+ * The code that is generated, while not optimal for actual codegen in a
+ * back-end, is easy to generate, correct, and will turn into the same set of
+ * phis after you call regs_to_ssa and do some copy propagation.
+ *
+ * The one intelligent thing this pass does is that it places the moves from
+ * the phi sources as high up the predecessor tree 

Re: [Mesa-dev] [PATCH] glsl: Fix program interface queries relating to interface blocks.

2016-12-19 Thread Kenneth Graunke
On Monday, December 19, 2016 8:41:17 PM PST Matt Turner wrote:
> On Mon, Dec 19, 2016 at 4:13 PM, Kenneth Graunke  
> wrote:
> > For what it's worth, the OpenGL wiki's Program Introspection page(*),
> > under "Interface block member naming" gives an example matching my above
> > reply.  It says:
> >
> > uniform BlockName3
> > {
> >   int mem;
> > } instanceName3[4];
> >
> > This definition will create a single member named "BlockName3.min".
> > The reason this array of four blocks only counts as having one
> > variable is because each of the four blocks uses the same internal
> > definition. There is nothing that could be queried from
> > BlockName3[1] that could not be queried from BlockName3[0].
> >
> > (*) https://www.khronos.org/opengl/wiki/Program_Introspection
> >
> > I think that's a decent explanation of why this is reasonable.  Because
> > the block entries have per-element entries (Block[0], Block[1], etc.
> > you can query whether a block is referenced (i.e. a UBO binding is used).
> 
> This might be a stupid question, but why is the field named "mem" in
> the code, but the following paragraph says "min"? Are the two not
> supposed to be the same?

Oh.  I think that's a typo.  It should be BlockName3.mem.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] vulkan/wsi/x11: don't crash on null wsi x11 connection

2016-12-19 Thread Arda Coskunses
Without this check driver crash when application window
closed unexpectedly.
---
 src/vulkan/wsi/wsi_common_x11.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 25ba0c1..afb7809 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -261,6 +261,11 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support(
struct wsi_x11_connection *wsi_conn =
   wsi_x11_get_connection(wsi_device, alloc, connection);
 
+   if (!wsi_conn) {
+  fprintf(stderr, "vulkan: wsi connection lost\n");
+  return false;
+   }
+
if (!wsi_conn->has_dri3) {
   fprintf(stderr, "vulkan: No DRI3 support\n");
   return false;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Roland Scheidegger
Am 20.12.2016 um 00:12 schrieb Giuseppe Bilotta:
> Hello,
> 
> I realize that I'm a little late to comment about this, but I think
> the formula used for
> tanh should be changed again. Specifically, as suggested by Roland
> 
> On Fri, Dec 9, 2016 at 5:41 AM, Roland Scheidegger  wrote:
>> btw I'm wondering if some vendors wouldn't implement that with slightly
>> simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
>> used for cg apparently according to docs, saving one of the
>> exponentials). Might be worse for accuracy though (and won't solve this
>> problem, though it would now only need a one-sided clamp).

It was changed to this formula.


> 
> Another option is the 1 - 2/(1+expf(2x)), or even better 1 -
> 2/(2+expm1f(2x)).. I've run some tests and this seems to have the same
> accuracy as the
> one mentioned by Roland, with the bonus benefit of not needing any
> clamping. The accuracy seems to actually be better
> than the direct evaluation (difference over sum of exps), except
> around 0 (say, when abs(x) < 1).

The 1 - 2/(1+expf(2x)) is worse for numbers close to zero (probably
provably so, I think you might have one bit more to play with there with
the other formula due to the division by essentially 2). e.g. if you
have 8e-8, libm tanhf() gives me 8e-8 as a result (it looks like it's
actually hard-coded to return the input as result for sufficiently small
values), the (e^2x - 1) / (e^2x + 1) formula gives 5.960464e-08 whereas
1 - 2/(1+expf(2x)) will give you back 0.0f (but, with even smaller
values like 2e-8, both methods will return 0.0f which is pretty wrong in
any case, the relative error can get to enormous levels there).
I'm not sure which method is better for larger values, I think they
might be about the same. Nvidia docs stating they use the slightly more
complex formula for cg though may be a hint that this indeed has some
properties which are nice-to-have. Though arguably it's not that more
complex, since the only part it saves is the one-sided clamp - the most
expensive parts are the exp and the div, neither of which you can get
rid of.
Not sure it really matters though one way or another. If you wanted good
accuracy around 0, you'd have to use a different formula plus a select
(seems like libm implementations actually use 3 cases depending on input
value magnitude - not so hot with vectors, but thankfully glsl doesn't
require 1 ULP accuracy).

Roland



> 
> I've found the relative error away from 0 to be typically in the same
> order of magnitude as the error in tanhf() itself (compared to tanh())
> , and generally less than machine epsilon.. I'm currently looking at
> options to improve the accuracy without clamping and without excessive
> additional computations, might propose a patch in the next couple of
> days.
> 
> Just one question though —not knowing much of the shader language, can
> I expect expm1 to be available?
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Elminate the open-coded version of process_block_array_leaf

2016-12-19 Thread Timothy Arceri
Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Fix program interface queries relating to interface blocks.

2016-12-19 Thread Matt Turner
On Mon, Dec 19, 2016 at 4:13 PM, Kenneth Graunke  wrote:
> For what it's worth, the OpenGL wiki's Program Introspection page(*),
> under "Interface block member naming" gives an example matching my above
> reply.  It says:
>
> uniform BlockName3
> {
>   int mem;
> } instanceName3[4];
>
> This definition will create a single member named "BlockName3.min".
> The reason this array of four blocks only counts as having one
> variable is because each of the four blocks uses the same internal
> definition. There is nothing that could be queried from
> BlockName3[1] that could not be queried from BlockName3[0].
>
> (*) https://www.khronos.org/opengl/wiki/Program_Introspection
>
> I think that's a decent explanation of why this is reasonable.  Because
> the block entries have per-element entries (Block[0], Block[1], etc.
> you can query whether a block is referenced (i.e. a UBO binding is used).

This might be a stupid question, but why is the field named "mem" in
the code, but the following paragraph says "min"? Are the two not
supposed to be the same?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeonsi: Bugfix needed for hashcat

2016-12-19 Thread Christian Inci
Hashcat needs MAX_GLOBAL_BUFFERS to be 21 or even 22 for some modes. It'll 
crash otherwise.
I'm adding an assert to see if programs need it to be even higher.

Signed-off-by: Christian Inci 

---
 src/gallium/drivers/radeonsi/si_compute.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 9d83cb3a..9bad34ed 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -32,7 +32,7 @@
 #include "si_pipe.h"
 #include "sid.h"

-#define MAX_GLOBAL_BUFFERS 20
+#define MAX_GLOBAL_BUFFERS 22

 struct si_compute {
unsigned ir_type;
@@ -195,6 +195,7 @@ static void si_set_global_binding(
unsigned i;
struct si_context *sctx = (struct si_context*)ctx;
struct si_compute *program = sctx->cs_shader_state.program;
+   assert(n <= MAX_GLOBAL_BUFFERS);

if (!resources) {
for (i = first; i < first + n; i++) {
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Elminate the open-coded version of process_block_array_leaf

2016-12-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/link_uniform_blocks.cpp | 40 +++
 1 file changed, 9 insertions(+), 31 deletions(-)

diff --git a/src/compiler/glsl/link_uniform_blocks.cpp 
b/src/compiler/glsl/link_uniform_blocks.cpp
index 3ed3556..ba01269 100644
--- a/src/compiler/glsl/link_uniform_blocks.cpp
+++ b/src/compiler/glsl/link_uniform_blocks.cpp
@@ -205,7 +205,7 @@ struct block {
bool has_instance_name;
 };
 
-static void process_block_array_leaf(char **name, gl_uniform_block *blocks,
+static void process_block_array_leaf(const char *name, gl_uniform_block 
*blocks,
  ubo_visitor *parcel,
  gl_uniform_buffer_variable *variables,
  const struct link_uniform_block_active 
*const b,
@@ -241,7 +241,7 @@ process_block_array(struct uniform_block_array_elements 
*ub_array, char **name,
  parcel, variables, b, block_index,
  binding_offset, ctx, prog, first_index);
   } else {
- process_block_array_leaf(name, blocks,
+ process_block_array_leaf(*name, blocks,
   parcel, variables, b, block_index,
   binding_offset, *block_index - first_index,
   ctx, prog);
@@ -250,7 +250,7 @@ process_block_array(struct uniform_block_array_elements 
*ub_array, char **name,
 }
 
 static void
-process_block_array_leaf(char **name,
+process_block_array_leaf(const char *name,
  gl_uniform_block *blocks,
  ubo_visitor *parcel, gl_uniform_buffer_variable 
*variables,
  const struct link_uniform_block_active *const b,
@@ -261,7 +261,7 @@ process_block_array_leaf(char **name,
unsigned i = *block_index;
const glsl_type *type =  b->type->without_array();
 
-   blocks[i].Name = ralloc_strdup(blocks, *name);
+   blocks[i].Name = ralloc_strdup(blocks, name);
blocks[i].Uniforms = [(*parcel).index];
 
/* The ARB_shading_language_420pack spec says:
@@ -278,7 +278,7 @@ process_block_array_leaf(char **name,
blocks[i]._RowMajor = type->get_interface_row_major();
blocks[i].linearized_array_index = linearized_index;
 
-   parcel->process(type, blocks[i].Name);
+   parcel->process(type, b->has_instance_name ? blocks[i].Name : "");
 
blocks[i].UniformBufferSize = parcel->buffer_size;
 
@@ -367,8 +367,8 @@ create_buffer_blocks(void *mem_ctx, struct gl_context *ctx,
   if ((create_ubo_blocks && !b->is_shader_storage) ||
   (!create_ubo_blocks && b->is_shader_storage)) {
 
+ unsigned binding_offset = 0;
  if (b->array != NULL) {
-unsigned binding_offset = 0;
 char *name = ralloc_strdup(NULL,
block_type->without_array()->name);
 size_t name_length = strlen(name);
@@ -379,31 +379,9 @@ create_buffer_blocks(void *mem_ctx, struct gl_context *ctx,
 i);
 ralloc_free(name);
  } else {
-blocks[i].Name = ralloc_strdup(blocks, block_type->name);
-blocks[i].Uniforms = [parcel.index];
-blocks[i].Binding = (b->has_binding) ? b->binding : 0;
-blocks[i].UniformBufferSize = 0;
-blocks[i]._Packing =
-   gl_uniform_block_packing(block_type->interface_packing);
-blocks[i]._RowMajor = block_type->get_interface_row_major();
-
-parcel.process(block_type,
-   b->has_instance_name ? block_type->name : "");
-
-blocks[i].UniformBufferSize = parcel.buffer_size;
-
-/* Check SSBO size is lower than maximum supported size for SSBO
- */
-if (b->is_shader_storage &&
-parcel.buffer_size > ctx->Const.MaxShaderStorageBlockSize) {
-   linker_error(prog, "shader storage block `%s' has size %d, "
-"which is larger than than the maximum allowed 
(%d)",
-block_type->name, parcel.buffer_size,
-ctx->Const.MaxShaderStorageBlockSize);
-}
-blocks[i].NumUniforms = (unsigned)(ptrdiff_t)
-   ([parcel.index] - blocks[i].Uniforms);
-i++;
+process_block_array_leaf(block_type->name, blocks, ,
+ variables, b, , _offset,
+ 0, ctx, prog);
  }
   }
}
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Ilia Mirkin
On Mon, Dec 19, 2016 at 8:17 PM, Matt Turner  wrote:
> On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
>  wrote:
>> Just one question though —not knowing much of the shader language, can
>> I expect expm1 to be available?
>
> No, expm1 doesn't exist in GLSL.

And - more importantly - not in the hardware either. At least not on
NVIDIA, and on a brief scan, not Intel either (SKL "math"
instruction).

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Matt Turner
On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
 wrote:
> Just one question though —not knowing much of the shader language, can
> I expect expm1 to be available?

No, expm1 doesn't exist in GLSL.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: don't attempt to unlock an unlocked debug state mutex

2016-12-19 Thread Jonathan Gray
Can someone push this to master?

On Sun, Dec 11, 2016 at 07:21:36PM +0100, Eduardo Lima Mitev wrote:
> Looks good.
> 
> Reviewed-by: Eduardo Lima Mitev 
> 
> On 12/11/2016 04:42 PM, Jonathan Gray wrote:
> > Commit 929fcee47e46781c57f2a354ce0a013915c033d1 introduced code that
> > attempts to unlock an unlocked mutex which is undefined behaviour.
> > 
> > On OpenBSD this leads to an abort:
> > 
> > 0  0x124dadfa96ba in thrkill () at :2
> > 1  0x124dadf3da39 in *_libc_abort () at 
> > /usr/src/lib/libc/stdlib/abort.c:52
> > 2  0x124d2c1165b5 in *_libpthread_pthread_mutex_unlock 
> > (mutexp=)
> > at /usr/src/lib/librthread/rthread_sync.c:221
> > 3  0x124d279c02e4 in init_attrib_groups (ctx=0x124df0fda000) at 
> > main/context.c:825
> > 4  _mesa_initialize_context (ctx=ctx@entry=0x124df0fda000, 
> > api=api@entry=API_OPENGL_CORE,
> > visual=visual@entry=0x7f7bdfd0, share_list=share_list@entry=0x0,
> > driverFunctions=driverFunctions@entry=0x7f7bda60) at 
> > main/context.c:1204
> > 5  0x124d27b507ec in st_create_context (api=api@entry=API_OPENGL_CORE,
> > pipe=pipe@entry=0x124dc491, visual=visual@entry=0x7f7bdfd0,
> > share=share@entry=0x0, options=options@entry=0x7f7be128)
> > at state_tracker/st_context.c:545
> > 6  0x124d27b8639f in st_api_create_context (stapi=,
> > smapi=0x124d1b608800, attribs=0x7f7be100, error=0x7f7be0fc, 
> > shared_stctxi=0x0)
> > at state_tracker/st_manager.c:669
> > 7  0x124d27cc5b9c in dri_create_context (api=, 
> > visual=0x124d8a0f8a00,
> > cPriv=0x124de473f240, major_version=, 
> > minor_version=,
> > flags=, notify_reset=false, error=0x7f7be2b4,
> > sharedContextPrivate=0x0) at dri_context.c:123
> > 8  0x124d27cc5029 in driCreateContextAttribs (screen=0x124d8a0f8400,
> > api=, config=0x124d8a0f8a00, shared=,
> > num_attribs=, attribs=, 
> > error=0x7f7be2b4,
> > data=0x124d77814a00) at dri_util.c:448
> > 9  0x124d8e109b00 in drisw_create_context_attribs (base=0x124df3e08700,
> > config_base=0x124d7a0e7300, shareList=, 
> > num_attribs=,
> > attribs=, error=0x7f7be2b4) at drisw_glx.c:476
> > 10 0x124d8e104b4a in glXCreateContextAttribsARB (dpy=0x124d533f,
> > config=0x124d7a0e7300, share_context=0x0, direct=1, 
> > attrib_list=0x7f7be300)
> > at create_context.c:78
> > 
> > Signed-off-by: Jonathan Gray 
> > ---
> >  src/mesa/main/debug_output.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/main/debug_output.c b/src/mesa/main/debug_output.c
> > index 48dbbb3..bc933db 100644
> > --- a/src/mesa/main/debug_output.c
> > +++ b/src/mesa/main/debug_output.c
> > @@ -1282,14 +1282,13 @@ _mesa_init_debug_output(struct gl_context *ctx)
> > */
> >struct gl_debug_state *debug = _mesa_lock_debug_state(ctx);
> >if (!debug) {
> > - goto done;
> > + return;
> >}
> >debug->DebugOutput = GL_TRUE;
> >debug->LogToStderr = GL_TRUE;
> >ctx->Const.ContextFlags |= GL_CONTEXT_FLAG_DEBUG_BIT;
> > +  _mesa_unlock_debug_state(ctx);
> > }
> > -done:
> > -   _mesa_unlock_debug_state(ctx);
> >  }
> >  
> >  
> > 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Use ir_var_temporary when generating inline functions.

2016-12-19 Thread Ilia Mirkin
Makes sense.

Reviewed-by: Ilia Mirkin 

On Mon, Dec 19, 2016 at 6:33 PM, Kenneth Graunke  wrote:
> On Monday, December 19, 2016 3:29:13 PM PST Kenneth Graunke wrote:
>> We were using ir_var_auto for the inlined function parameter variables,
>> which is wrong, as it suggests that those are real variables declared
>> by the program.
>>
>> Normally this doesn't matter.  However, if you called built-ins at
>> global scope, it would pollute the global variable namespace with
>> these new parameter temporaries.  If the shader already had variables
>> with those names, the linker might see contradictory global variable
>> declarations and raise an error.
>>
>> Making them temporaries indicates that these are just things generated
>> by the compiler internally.  This avoids confusing the linker.
>>
>> Fixes a new Piglit test: glsl-fs-multiple-builtins.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
>
> That should be
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99154
>
> (fixed locally)
>
>> Reported-by: Niels Ole Salscheider 
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/compiler/glsl/opt_function_inlining.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/compiler/glsl/opt_function_inlining.cpp 
>> b/src/compiler/glsl/opt_function_inlining.cpp
>> index 62c1f4b..78a726b 100644
>> --- a/src/compiler/glsl/opt_function_inlining.cpp
>> +++ b/src/compiler/glsl/opt_function_inlining.cpp
>> @@ -164,7 +164,7 @@ ir_call::generate_inline(ir_instruction *next_ir)
>>parameters[i] = NULL;
>>} else {
>>parameters[i] = sig_param->clone(ctx, ht);
>> -  parameters[i]->data.mode = ir_var_auto;
>> +  parameters[i]->data.mode = ir_var_temporary;
>>
>>/* Remove the read-only decoration because we're going to write
>> * directly to this variable.  If the cloned variable is left
>>
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Use ir_var_temporary when generating inline functions.

2016-12-19 Thread Kenneth Graunke
On Monday, December 19, 2016 3:29:13 PM PST Kenneth Graunke wrote:
> We were using ir_var_auto for the inlined function parameter variables,
> which is wrong, as it suggests that those are real variables declared
> by the program.
> 
> Normally this doesn't matter.  However, if you called built-ins at
> global scope, it would pollute the global variable namespace with
> these new parameter temporaries.  If the shader already had variables
> with those names, the linker might see contradictory global variable
> declarations and raise an error.
> 
> Making them temporaries indicates that these are just things generated
> by the compiler internally.  This avoids confusing the linker.
> 
> Fixes a new Piglit test: glsl-fs-multiple-builtins.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097

That should be
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99154

(fixed locally)

> Reported-by: Niels Ole Salscheider 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/opt_function_inlining.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/opt_function_inlining.cpp 
> b/src/compiler/glsl/opt_function_inlining.cpp
> index 62c1f4b..78a726b 100644
> --- a/src/compiler/glsl/opt_function_inlining.cpp
> +++ b/src/compiler/glsl/opt_function_inlining.cpp
> @@ -164,7 +164,7 @@ ir_call::generate_inline(ir_instruction *next_ir)
>parameters[i] = NULL;
>} else {
>parameters[i] = sig_param->clone(ctx, ht);
> -  parameters[i]->data.mode = ir_var_auto;
> +  parameters[i]->data.mode = ir_var_temporary;
>  
>/* Remove the read-only decoration because we're going to write
> * directly to this variable.  If the cloned variable is left
> 



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99154] Link time error when using multiple builtin functions

2016-12-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99154

Kenneth Graunke  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |kenn...@whitecape.org
   |org |
 Status|NEW |ASSIGNED

--- Comment #1 from Kenneth Graunke  ---
Patch on list:
https://lists.freedesktop.org/archives/mesa-dev/2016-December/138613.html

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Use ir_var_temporary when generating inline functions.

2016-12-19 Thread Kenneth Graunke
We were using ir_var_auto for the inlined function parameter variables,
which is wrong, as it suggests that those are real variables declared
by the program.

Normally this doesn't matter.  However, if you called built-ins at
global scope, it would pollute the global variable namespace with
these new parameter temporaries.  If the shader already had variables
with those names, the linker might see contradictory global variable
declarations and raise an error.

Making them temporaries indicates that these are just things generated
by the compiler internally.  This avoids confusing the linker.

Fixes a new Piglit test: glsl-fs-multiple-builtins.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
Reported-by: Niels Ole Salscheider 
Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/opt_function_inlining.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/opt_function_inlining.cpp 
b/src/compiler/glsl/opt_function_inlining.cpp
index 62c1f4b..78a726b 100644
--- a/src/compiler/glsl/opt_function_inlining.cpp
+++ b/src/compiler/glsl/opt_function_inlining.cpp
@@ -164,7 +164,7 @@ ir_call::generate_inline(ir_instruction *next_ir)
 parameters[i] = NULL;
   } else {
 parameters[i] = sig_param->clone(ctx, ht);
-parameters[i]->data.mode = ir_var_auto;
+parameters[i]->data.mode = ir_var_temporary;
 
 /* Remove the read-only decoration because we're going to write
  * directly to this variable.  If the cloned variable is left
-- 
2.10.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Giuseppe Bilotta
Hello,

I realize that I'm a little late to comment about this, but I think
the formula used for
tanh should be changed again. Specifically, as suggested by Roland

On Fri, Dec 9, 2016 at 5:41 AM, Roland Scheidegger  wrote:
> btw I'm wondering if some vendors wouldn't implement that with slightly
> simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
> used for cg apparently according to docs, saving one of the
> exponentials). Might be worse for accuracy though (and won't solve this
> problem, though it would now only need a one-sided clamp).

Another option is the 1 - 2/(1+expf(2x)), or even better 1 -
2/(2+expm1f(2x)).. I've run some tests and this seems to have the same
accuracy as the
one mentioned by Roland, with the bonus benefit of not needing any
clamping. The accuracy seems to actually be better
than the direct evaluation (difference over sum of exps), except
around 0 (say, when abs(x) < 1).

I've found the relative error away from 0 to be typically in the same
order of magnitude as the error in tanhf() itself (compared to tanh())
, and generally less than machine epsilon.. I'm currently looking at
options to improve the accuracy without clamping and without excessive
additional computations, might propose a patch in the next couple of
days.

Just one question though —not knowing much of the shader language, can
I expect expm1 to be available?

-- 
Giuseppe "Oblomov" Bilotta
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] st/nine: Upload on secondary context for Draw*Up

2016-12-19 Thread Axel Davy
Avoid synchronization by using the secondary context
for uploading the vertex data for Draw*Up.

v2: Rely on u_upload_mgr to use persistent coherent
buffers. Do not flush.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 9f2575309a..f095ff3c86 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -274,6 +274,12 @@ NineDevice9_ctor( struct NineDevice9 *This,
 /* r600 and radeonsi are thread safe. */
 This->csmt_active = strstr(pScreen->get_name(pScreen), "AMD") != NULL;
 
+/* We rely on u_upload_mgr using persistent coherent buffers (which don't
+ * require flush to work in multi-pipe_context scenario) for vertex and
+ * index buffers */
+if (!GET_PCAP(BUFFER_MAP_PERSISTENT_COHERENT))
+This->csmt_active = false;
+
 if (This->csmt_active) {
 This->csmt_ctx = nine_csmt_create(This);
 if (!This->csmt_ctx)
@@ -472,13 +478,20 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->driver_caps.user_sw_vbufs = 
This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
 This->driver_caps.user_sw_cbufs = 
This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
 
+/* Implicit use of context pipe for vertex and index uploaded when
+ * csmt is not active. Does not need to sync since csmt is unactive,
+ * thus no need to call NineDevice9_GetPipe at each upload. */
 if (!This->driver_caps.user_vbufs)
-This->vertex_uploader = u_upload_create(This->context.pipe, 65536,
+This->vertex_uploader = u_upload_create(This->csmt_active ?
+This->pipe_secondary : 
This->context.pipe,
+65536,
 PIPE_BIND_VERTEX_BUFFER, 
PIPE_USAGE_STREAM);
 This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
 PIPE_BIND_VERTEX_BUFFER, 
PIPE_USAGE_STREAM);
 if (!This->driver_caps.user_ibufs)
-This->index_uploader = u_upload_create(This->context.pipe, 128 * 1024,
+This->index_uploader = u_upload_create(This->csmt_active ?
+This->pipe_secondary : 
This->context.pipe,
+   128 * 1024,
PIPE_BIND_INDEX_BUFFER, 
PIPE_USAGE_STREAM);
 if (!This->driver_caps.user_cbufs) {
 This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
@@ -2835,8 +2848,6 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
 vtxbuf.user_buffer = pVertexStreamZeroData;
 
 if (!This->driver_caps.user_vbufs) {
-/* Implicit use of context pipe */
-(void)NineDevice9_GetPipe(This);
 u_upload_data(This->vertex_uploader,
   0,
   (prim_count_to_vertex_count(PrimitiveType, 
PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
@@ -2900,8 +2911,6 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 
*This,
 
 if (!This->driver_caps.user_vbufs) {
 const unsigned base = MinVertexIndex * VertexStreamZeroStride;
-/* Implicit use of context pipe */
-(void)NineDevice9_GetPipe(This);
 u_upload_data(This->vertex_uploader,
   base,
   NumVertices * VertexStreamZeroStride, /* XXX */
@@ -2915,8 +2924,6 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 
*This,
 vbuf.user_buffer = NULL;
 }
 if (!This->driver_caps.user_ibufs) {
-/* Implicit use of context pipe */
-(void)NineDevice9_GetPipe(This);
 u_upload_data(This->index_uploader,
   0,
   (prim_count_to_vertex_count(PrimitiveType, 
PrimitiveCount)) * ibuf.index_size,
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 84/84 v7] st/nine: Implement new buffer upload path

2016-12-19 Thread Axel Davy
This new buffer upload path enables to lock
faster than the normal path when using
DISCARD/NOOVERWRITE.

v2: Diverse cleanups and fixes.
v3: Fix allocation size for 'lone' buffers and
add more debug info.
v4: Rewrite of the path to handle when DISCARD/NOOVERWRITE
is not used anymore. The resource content is copied to the
new resource used.
v5: flush for safety after unmap (not sure it is really required
here, but safer to flush).
v6: Do not use the path if persistent coherent mapping is unavailable.
Fix buffer creation flags.
v7: Do not flush since it is not needed.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/Makefile.sources   |   2 +
 src/gallium/state_trackers/nine/buffer9.c  |  97 ++-
 src/gallium/state_trackers/nine/buffer9.h  |   9 +-
 src/gallium/state_trackers/nine/device9.c  |   6 +
 src/gallium/state_trackers/nine/device9.h  |   3 +
 src/gallium/state_trackers/nine/indexbuffer9.c |  10 +-
 src/gallium/state_trackers/nine/indexbuffer9.h |   2 -
 .../state_trackers/nine/nine_buffer_upload.c   | 294 +
 .../state_trackers/nine/nine_buffer_upload.h   |  59 +
 src/gallium/state_trackers/nine/vertexbuffer9.c|   3 +-
 10 files changed, 461 insertions(+), 24 deletions(-)
 create mode 100644 src/gallium/state_trackers/nine/nine_buffer_upload.c
 create mode 100644 src/gallium/state_trackers/nine/nine_buffer_upload.h

diff --git a/src/gallium/state_trackers/nine/Makefile.sources 
b/src/gallium/state_trackers/nine/Makefile.sources
index 2bb08a2612..56698a19f1 100644
--- a/src/gallium/state_trackers/nine/Makefile.sources
+++ b/src/gallium/state_trackers/nine/Makefile.sources
@@ -23,6 +23,8 @@ C_SOURCES := \
indexbuffer9.h \
iunknown.c \
iunknown.h \
+   nine_buffer_upload.c \
+   nine_buffer_upload.h \
nine_csmt_helper.h \
nine_debug.c \
nine_debug.h \
diff --git a/src/gallium/state_trackers/nine/buffer9.c 
b/src/gallium/state_trackers/nine/buffer9.c
index c745d77c2c..b22713b351 100644
--- a/src/gallium/state_trackers/nine/buffer9.c
+++ b/src/gallium/state_trackers/nine/buffer9.c
@@ -23,6 +23,7 @@
 
 #include "buffer9.h"
 #include "device9.h"
+#include "nine_buffer_upload.h"
 #include "nine_helpers.h"
 #include "nine_pipe.h"
 
@@ -100,6 +101,10 @@ NineBuffer9_ctor( struct NineBuffer9 *This,
 else
 info->usage = PIPE_USAGE_DYNAMIC;
 
+/* When Writeonly is not set, we don't want to enable the
+ * optimizations */
+This->discard_nooverwrite_only = !!(Usage & D3DUSAGE_WRITEONLY) &&
+ pParams->device->buffer_upload;
 /* if (pDesc->Usage & D3DUSAGE_DONOTCLIP) { } */
 /* if (pDesc->Usage & D3DUSAGE_NONSECURE) { } */
 /* if (pDesc->Usage & D3DUSAGE_NPATCHES) { } */
@@ -161,12 +166,18 @@ NineBuffer9_dtor( struct NineBuffer9 *This )
 list_del(>managed.list2);
 }
 
+if (This->buf)
+nine_upload_release_buffer(This->base.base.device->buffer_upload, 
This->buf);
+
 NineResource9_dtor(>base);
 }
 
 struct pipe_resource *
-NineBuffer9_GetResource( struct NineBuffer9 *This )
+NineBuffer9_GetResource( struct NineBuffer9 *This, unsigned *offset )
 {
+if (This->buf)
+return nine_upload_buffer_resource_and_offset(This->buf, offset);
+*offset = 0;
 return NineResource9_GetResource(>base);
 }
 
@@ -264,6 +275,8 @@ NineBuffer9_Lock( struct NineBuffer9 *This,
 if (Flags & D3DLOCK_DONOTWAIT && !(This->base.usage & D3DUSAGE_DYNAMIC))
 usage |= PIPE_TRANSFER_DONTBLOCK;
 
+This->discard_nooverwrite_only &= !!(Flags & (D3DLOCK_DISCARD | 
D3DLOCK_NOOVERWRITE));
+
 if (This->nmaps == This->maxmaps) {
 struct NineTransfer *newmaps =
 REALLOC(This->maps, sizeof(struct NineTransfer)*This->maxmaps,
@@ -275,8 +288,67 @@ NineBuffer9_Lock( struct NineBuffer9 *This,
 This->maps = newmaps;
 }
 
-This->maps[This->nmaps].is_pipe_secondary = FALSE;
+if (This->buf && !This->discard_nooverwrite_only) {
+struct pipe_box src_box;
+unsigned offset;
+struct pipe_resource *src_res;
+DBG("Disabling nine_subbuffer for a buffer having"
+"used a nine_subbuffer buffer\n");
+/* Copy buffer content to the buffer resource, which
+ * we will now use.
+ * Note: The behaviour may be different from what is expected
+ * with double lock. However applications can't really make 
expectations
+ * about double locks, and don't really use them, so that's ok. */
+src_res = nine_upload_buffer_resource_and_offset(This->buf, );
+u_box_1d(offset, This->size, _box);
+
+pipe = NineDevice9_GetPipe(device);
+pipe->resource_copy_region(pipe, This->base.resource, 0, 0, 0, 0,
+   src_res, 0, _box);
+/* Release previous resource */
+if (This->nmaps >= 1)
+

[Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)

2016-12-19 Thread Alex Deucher
From: Junwei Zhang 

v2: use gfxip names for llvm 4.0+
v3: use tonga for llvm <= 3.8

Signed-off-by: Junwei Zhang 
Reviewed-by: Nicolai Hähnle 
Acked-by: Christian König 
---
 src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
 src/amd/addrlib/r800/ciaddrlib.h  | 1 +
 src/amd/common/amd_family.h   | 1 +
 src/amd/common/amdgpu_id.h| 4 
 src/gallium/drivers/radeon/r600_pipe_common.c | 3 +++
 src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
 src/gallium/drivers/radeonsi/si_pipe.c| 1 +
 src/gallium/drivers/radeonsi/si_state.c   | 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
 9 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp 
b/src/amd/addrlib/r800/ciaddrlib.cpp
index 7c5d29a..c726c4d 100644
--- a/src/amd/addrlib/r800/ciaddrlib.cpp
+++ b/src/amd/addrlib/r800/ciaddrlib.cpp
@@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
 m_settings.isFiji= ASICREV_IS_FIJI_P(uChipRevision);
 m_settings.isPolaris10   = 
ASICREV_IS_POLARIS10_P(uChipRevision);
 m_settings.isPolaris11   = 
ASICREV_IS_POLARIS11_M(uChipRevision);
+m_settings.isPolaris12   = 
ASICREV_IS_POLARIS12_V(uChipRevision);
 break;
 case FAMILY_CZ:
 m_settings.isCarrizo = 1;
@@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
 {
 m_pipes = 16;
 }
-else if (m_settings.isPolaris11)
+else if (m_settings.isPolaris11 || m_settings.isPolaris12)
 {
 m_pipes = 4;
 }
diff --git a/src/amd/addrlib/r800/ciaddrlib.h b/src/amd/addrlib/r800/ciaddrlib.h
index de995fa..2c9a4cc 100644
--- a/src/amd/addrlib/r800/ciaddrlib.h
+++ b/src/amd/addrlib/r800/ciaddrlib.h
@@ -62,6 +62,7 @@ struct CIChipSettings
 UINT_32 isFiji: 1;
 UINT_32 isPolaris10   : 1;
 UINT_32 isPolaris11   : 1;
+UINT_32 isPolaris12   : 1;
 // VI fusion (Carrizo)
 UINT_32 isCarrizo : 1;
 };
diff --git a/src/amd/common/amd_family.h b/src/amd/common/amd_family.h
index 6a713ad..b09bbb8 100644
--- a/src/amd/common/amd_family.h
+++ b/src/amd/common/amd_family.h
@@ -91,6 +91,7 @@ enum radeon_family {
 CHIP_STONEY,
 CHIP_POLARIS10,
 CHIP_POLARIS11,
+CHIP_POLARIS12,
 CHIP_LAST,
 };
 
diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
index f91df55..1683a5a 100644
--- a/src/amd/common/amdgpu_id.h
+++ b/src/amd/common/amdgpu_id.h
@@ -142,6 +142,8 @@ enum {
 
VI_POLARIS11_M_A0 = 90,
 
+   VI_POLARIS12_V_A0 = 100,
+
VI_UNKNOWN= 0xFF
 };
 
@@ -156,6 +158,8 @@ enum {
((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
 #define ASICREV_IS_POLARIS11_M(eChipRev)   \
(eChipRev >= VI_POLARIS11_M_A0)
+#define ASICREV_IS_POLARIS12_V(eChipRev)\
+   (eChipRev >= VI_POLARIS12_V_A0)
 
 /* CZ specific rev IDs */
 enum {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 0b5c6dc..e0b914c 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
r600_common_screen *rscreen)
case CHIP_FIJI: return "AMD FIJI";
case CHIP_POLARIS10: return "AMD POLARIS10";
case CHIP_POLARIS11: return "AMD POLARIS11";
+   case CHIP_POLARIS12: return "AMD POLARIS12";
case CHIP_STONEY: return "AMD STONEY";
default: return "AMD unknown";
}
@@ -889,9 +890,11 @@ const char *r600_get_llvm_processor_name(enum 
radeon_family family)
 #if HAVE_LLVM <= 0x0308
case CHIP_POLARIS10: return "tonga";
case CHIP_POLARIS11: return "tonga";
+   case CHIP_POLARIS12: return "tonga";
 #else
case CHIP_POLARIS10: return "polaris10";
case CHIP_POLARIS11: return "polaris11";
+   case CHIP_POLARIS12: return "polaris11";
 #endif
default: return "";
}
diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index aad2ec1..dcd56ea 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -413,7 +413,8 @@ struct pipe_video_codec *rvce_create_encoder(struct 
pipe_context *context,
enc->use_vui = true;
if (rscreen->info.family >= CHIP_TONGA &&
rscreen->info.family != CHIP_STONEY &&
-   rscreen->info.family != CHIP_POLARIS11)
+   rscreen->info.family != CHIP_POLARIS11 &&
+   rscreen->info.family != CHIP_POLARIS12)
enc->dual_pipe = true;
/* TODO enable B frame with dual instance */
if ((rscreen->info.family >= CHIP_TONGA) &&
diff --git 

Re: [Mesa-dev] [PATCH] glsl: Drop bogus is_vertex_input from add_shader_variable().

2016-12-19 Thread Anuj Phogat
On Fri, Dec 16, 2016 at 9:46 PM, Kenneth Graunke  wrote:
>
> stage_mask is a bitmask of shader stages, so the proper comparison would
> be (1 << MESA_SHADER_VERTEX), not MESA_SHADER_VERTEX itself.
>
> But we only care for structure types, and VS inputs cannot be structs.
> So we can just drop this entirely.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/linker.cpp | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> Sorry, I forgot I split this into two patches...apply this one before
> [PATCH] glsl: Fix program interface queries relating to interface blocks.
>
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index 3660257..5066014 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -3735,10 +3735,6 @@ add_shader_variable(const struct gl_context *ctx,
>  bool use_implicit_location, int location,
>  const glsl_type *outermost_struct_type = NULL)
>  {
> -   const bool is_vertex_input =
> -  programInterface == GL_PROGRAM_INPUT &&
> -  stage_mask == MESA_SHADER_VERTEX;
> -
> switch (type->base_type) {
> case GLSL_TYPE_STRUCT: {
>/* The ARB_program_interface_query spec says:
> @@ -3764,8 +3760,7 @@ add_shader_variable(const struct gl_context *ctx,
>outermost_struct_type))
>  return false;
>
> - field_location +=
> -field->type->count_attribute_slots(is_vertex_input);
> + field_location += field->type->count_attribute_slots(false);
>}
>return true;
> }
> --
> 2.10.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/19] glsl: Use simpler visitor to determine which UBO and SSBO blocks are used

2016-12-19 Thread Ian Romanick
On 12/18/2016 10:09 PM, Timothy Arceri wrote:
> On Thu, 2016-12-15 at 20:10 -0800, Ian Romanick wrote:
>> From: Ian Romanick 
>>
>> Very soon this visitor will get more complicated.  The users of the
>> existing ir_variable_refcount visitor won't need the coming
>> functionality, and this use doesn't need much of the functionality of
>> ir_variable_refcount.
>>
>> Signed-off-by: Ian Romanick 
>> Cc: mesa-sta...@lists.freedesktop.org
>> ---
>>  src/compiler/Makefile.sources   |   2 +
>>  src/compiler/glsl/ir_array_refcount.cpp | 100
>> 
>>  src/compiler/glsl/ir_array_refcount.h   |  65 +
>>  src/compiler/glsl/link_uniforms.cpp |  10 ++--
>>  4 files changed, 172 insertions(+), 5 deletions(-)
>>  create mode 100644 src/compiler/glsl/ir_array_refcount.cpp
>>  create mode 100644 src/compiler/glsl/ir_array_refcount.h
>>
>> diff --git a/src/compiler/Makefile.sources
>> b/src/compiler/Makefile.sources
>> index 17b15de..15f410f 100644
>> --- a/src/compiler/Makefile.sources
>> +++ b/src/compiler/Makefile.sources
>> @@ -29,6 +29,8 @@ LIBGLSL_FILES = \
>>  glsl/glsl_to_nir.cpp \
>>  glsl/glsl_to_nir.h \
>>  glsl/hir_field_selection.cpp \
>> +glsl/ir_array_refcount.cpp \
>> +glsl/ir_array_refcount.h \
>>  glsl/ir_basic_block.cpp \
>>  glsl/ir_basic_block.h \
>>  glsl/ir_builder.cpp \
>> diff --git a/src/compiler/glsl/ir_array_refcount.cpp
>> b/src/compiler/glsl/ir_array_refcount.cpp
>> new file mode 100644
>> index 000..41a0914
>> --- /dev/null
>> +++ b/src/compiler/glsl/ir_array_refcount.cpp
>> @@ -0,0 +1,100 @@
>> +/*
>> + * Copyright © 2016 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person
>> obtaining a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom
>> the
>> + * Software is furnished to do so, subject to the following
>> conditions:
>> + *
>> + * The above copyright notice and this permission notice (including
>> the next
>> + * paragraph) shall be included in all copies or substantial
>> portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>> EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
>> OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +/**
>> + * \file ir_array_refcount.cpp
>> + *
>> + * Provides a visitor which produces a list of variables referenced.
>> + */
>> +
>> +#include "ir.h"
>> +#include "ir_visitor.h"
>> +#include "ir_array_refcount.h"
>> +#include "compiler/glsl_types.h"
>> +#include "util/hash_table.h"
>> +
>> +ir_array_refcount_visitor::ir_array_refcount_visitor()
>> +{
>> +   this->mem_ctx = ralloc_context(NULL);
>> +   this->ht = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
>> +  _mesa_key_pointer_equal);
>> +}
>> +
>> +static void
>> +free_entry(struct hash_entry *entry)
>> +{
>> +   ir_array_refcount_entry *ivre = (ir_array_refcount_entry *)
>> entry->data;
>> +   delete ivre;
>> +}
>> +
>> +ir_array_refcount_visitor::~ir_array_refcount_visitor()
>> +{
>> +   ralloc_free(this->mem_ctx);
>> +   _mesa_hash_table_destroy(this->ht, free_entry);
>> +}
>> +
>> +ir_array_refcount_entry::ir_array_refcount_entry(ir_variable *var)
>> +   : var(var), is_referenced(false)
>> +{
>> +   /* empty */
>> +}
>> +
>> +
>> +ir_array_refcount_entry *
>> +ir_array_refcount_visitor::get_variable_entry(ir_variable *var)
>> +{
>> +   assert(var);
>> +
>> +   struct hash_entry *e = _mesa_hash_table_search(this->ht, var);
>> +   if (e)
>> +  return (ir_array_refcount_entry *)e->data;
>> +
>> +   ir_array_refcount_entry *entry = new
>> ir_array_refcount_entry(var);
>> +   _mesa_hash_table_insert(this->ht, var, entry);
>> +
>> +   return entry;
>> +}
>> +
>> +
>> +ir_visitor_status
>> +ir_array_refcount_visitor::visit(ir_dereference_variable *ir)
>> +{
>> +   ir_variable *const var = ir->variable_referenced();
>> +   ir_array_refcount_entry *entry = this->get_variable_entry(var);
>> +
>> +   if (entry)
> 
> I don't think this can ever be not true. maybe just assert(entry). If
> new fails it will throw an exception rather than return null as far as
> I understand.

I believe you are correct.  I'll change that.

> Otherwise seems fine:
> 
> Reviewed-by: Timothy Arceri 

Re: [Mesa-dev] [PATCH 03/19] glsl: Track the linearized array index for each UBO instance array element

2016-12-19 Thread Ian Romanick
On 12/18/2016 09:54 PM, Timothy Arceri wrote:
> On Thu, 2016-12-15 at 20:10 -0800, Ian Romanick wrote:
>> From: Ian Romanick 
>>
>> Signed-off-by: Ian Romanick 
>> Cc: mesa-sta...@lists.freedesktop.org
>> ---
>>  src/compiler/glsl/link_uniform_blocks.cpp | 17 ++---
>>  src/mesa/main/mtypes.h| 15 +++
>>  2 files changed, 29 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/compiler/glsl/link_uniform_blocks.cpp
>> b/src/compiler/glsl/link_uniform_blocks.cpp
>> index 41b26e7..9adfbd5 100644
>> --- a/src/compiler/glsl/link_uniform_blocks.cpp
>> +++ b/src/compiler/glsl/link_uniform_blocks.cpp
>> @@ -209,13 +209,19 @@ static void process_block_array_leaf(char
>> **name, gl_uniform_block *blocks,
>>   struct gl_context *ctx,
>>   struct gl_shader_program
>> *prog);
>>  
>> +/**
>> + *
>> + * \param first_index Value of \c block_index for the first element
>> of the
>> + *array.
>> + */
>>  static void
>>  process_block_array(struct uniform_block_array_elements *ub_array,
>> char **name,
>>  size_t name_length, gl_uniform_block *blocks,
>>  ubo_visitor *parcel, gl_uniform_buffer_variable
>> *variables,
>>  const struct link_uniform_block_active *const b,
>>  unsigned *block_index, unsigned *binding_offset,
>> -struct gl_context *ctx, struct gl_shader_program
>> *prog)
>> +struct gl_context *ctx, struct gl_shader_program
>> *prog,
>> +unsigned first_index)
>>  {
>> for (unsigned j = 0; j < ub_array->num_array_elements; j++) {
>>size_t new_length = name_length;
>> @@ -227,11 +233,15 @@ process_block_array(struct
>> uniform_block_array_elements *ub_array, char **name,
>>if (ub_array->array) {
>>   process_block_array(ub_array->array, name, new_length,
>> blocks,
>>   parcel, variables, b, block_index,
>> - binding_offset, ctx, prog);
>> + binding_offset, ctx, prog,
>> first_index);
>>} else {
>> + const unsigned i = *block_index;
>> +
>>   process_block_array_leaf(name, blocks,
>>parcel, variables, b, block_index,
>>binding_offset, ctx, prog);
>> +
>> + blocks[i].linearized_array_index = i - first_index;
> 
> Shouldn't this go in the new process_block_array_leaf() too?

It only needs to be one place.  *block_index is modified by
process_block_array_leaf, but the linearized index is the pre-modified
(that's a word... I'm sure of it!) value.  It could go in
process_block_array_leaf instead, but then I'd also need to pass either
first_index or (i - first_index).

I originally had a patch that removed the open-coded copy of
process_block_array_leaf which does not need to set
linearized_array_index.  I dropped that patch because there was a subtle
difference between the two implementations.  I will probably revisit the
idea, however.

> Otherwise this patch is:
> 
> Reviewed-by: Timothy Arceri 
> 
> 
>>}
>> }
>>  }
>> @@ -359,7 +369,8 @@ create_buffer_blocks(void *mem_ctx, struct
>> gl_context *ctx,
>>  
>>  assert(b->has_instance_name);
>>  process_block_array(b->array, , name_length,
>> blocks, ,
>> -variables, b, , _offset,
>> ctx, prog);
>> +variables, b, , _offset,
>> ctx, prog,
>> +i);
>>  ralloc_free(name);
>>   } else {
>>  blocks[i].Name = ralloc_strdup(blocks, block_type-
>>> name);
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index 36d48e2..ac4cac0 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -2493,6 +2493,21 @@ struct gl_uniform_block
>> uint8_t stageref;
>>  
>> /**
>> +* Linearized array index for uniform block instance arrays
>> +*
>> +* Given a uniform block instance array declared with size
>> +* blk[s_0][s_1]..[s_m], the block referenced by
>> blk[i_0][i_1]..[i_m] will
>> +* have the linearized array index
>> +*
>> +*   m-1   m
>> +* i_m + ∑   i_j * ∏ s_k
>> +*   j=0   k=j+1
>> +*
>> +* For a uniform block instance that is not an array, this is
>> always 0.
>> +*/
>> +   uint8_t linearized_array_index;
>> +
>> +   /**
>>  * Layout specified in the shader
>>  *
>>  * This isn't accessible through the API, but it is used while

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Fix program interface queries relating to interface blocks.

2016-12-19 Thread Kenneth Graunke
On Monday, December 19, 2016 1:36:00 PM PST Ian Romanick wrote:
> On 12/16/2016 09:35 PM, Kenneth Graunke wrote:
> > This fixes 555 dEQP tests (using the nougat-cts-dev branch), Piglit's
> > arb_program_interface_query/arb_program_interface_query-resource-query,
> > and GL45-CTS.program_interface_query.separate-programs-{tess-control,
> > tess-eval,geometry}.  Only one dEQP program interface failure remains.
> > 
> > I would have liked to split this up into several distinct changes, but
> > I wasn't sure how to do that given thet tangled nature of these issues.
> > 
> > So, the issues:
> > 
> >* We need to treat interface blocks declared as an array of instances
> >  as a single block - removing the outer array.  The resource list
> >  entry's name should not include the array length.  Properties such
> >  as GL_ARRAY_SIZE should refer to the variable inside the block, not
> >  the interface block's array properties.
> > 
> >* We need to do this prefixing even for structure variables.
> > 
> >* We need to do this for built-ins (such as gl_PerVertex.gl_Position).
> > 
> >* After interface array unwrapping, any variable which is an array
> >  should have [0] appended.  It doesn't matter if it's a TCS/TES/GS
> >  input or TCS output - that looked like an attempt to unwrap for
> >  per-vertex variables, but that didn't consider per-patch variables,
> >  and as far as I can tell there's nothing to justify this.
> > 
> > Several Mesa developers have suggested that Issue 16 contradicts the
> > main specification, but I believe that it doesn't - the main spec just
> > isn't terribly clear.  The main ARB_program_interface query spec says:
> > 
> >   "* For an active interface block not declared as an array of block
> >  instances, a single entry will be generated, using the block name from
> >  the shader source.
> > 
> >* For an active interface block declared as an array of instances,
> >  separate entries will be generated for each active instance.  The name
> >  of the instance is formed by concatenating the block name, the "["
> >  character, an integer identifying the instance number, and the "]"
> >  character."
> > 
> > Issue 16 says that built-ins should be named "gl_PerVertex.gl_Position",
> > but several people suggested the second bullet above means that it
> > should be named "gl_PerVertex[array length].gl_Position".
> > 
> > There are two important things to note.  Those bullet points say
> > "an active interface block", while the others say "variable" or "active
> > shader storage block member".  They also don't mention applying the
> > rules recursively (unlike the other bullets).  Both suggest that
> > these rules apply to blocks themselves, not members of blocks.
> > 
> > In fact, for GL_UNIFORM_BLOCK queries, we do have "block[0]",
> > "block[1]", ... resource list entries - so those rules are real,
> > and actually used.  So if they don't apply to block members, then how
> > should members be named?  Unfortunately, I don't see any rules outside
> > of issue 16 - where the rationale is very unclear.  I hope to clarify
> > the spec in the future.
> 
> Based on my understanding, something like
> 
> out Vertex {
> vec4 p;
> vec4 c[3];
> } vertex[2];
> 
> in a vertex shader should produce
> 
> Vertex[0].p
> Vertex[0].c[0] (with GL_ARRAY_SIZE = 3)
> Vertex[1].p
> Vertex[1].c[0] (with GL_ARRAY_SIZE = 3)
> 
> This is definitely what we would produce for a uniform block.
> 
> What I've never understood is why that isn't done for gl_PerVertex stage
> boundaries where gl_PerVertex is explicitly arrayed.

My expectation is that the above code would produce:

Block entries:

Vertex[0]
Vertex[1]

Variable/member entries:

Vertex.p
Vertex.c[0] (with GL_ARRAY_SIZE = 3)

We would produce similar results for uniform blocks after my patch.

I don't think anything about the rules is gl_PerVertex specific.

> I think this patch is good for now.  I like that it brings all the
> strange edge-case handling into one place.
> 
> Reviewed-by: Ian Romanick 

I've attached some notes I took while looking at dEQP program interface
query tests - it has the shader code and expected results for a lot of
test cases.  I didn't write down the uniform ones, but they follow the
same pattern (as shown above).

For what it's worth, the OpenGL wiki's Program Introspection page(*),
under "Interface block member naming" gives an example matching my above
reply.  It says:

uniform BlockName3
{
  int mem;
} instanceName3[4];

This definition will create a single member named "BlockName3.min".
The reason this array of four blocks only counts as having one
variable is because each of the four blocks uses the same internal
definition. There is nothing that could be queried from
BlockName3[1] that could not be queried from BlockName3[0].

(*) 

Re: [Mesa-dev] [PATCH v2 043/103] i965/vec4: handle 32 and 64 bit channels in liveness analysis

2016-12-19 Thread Francisco Jerez
Iago Toral Quiroga  writes:

> From: "Juan A. Suarez Romero" 
>
> Our current data flow analysis does not take into account that channels
> on 64-bit operands are 64-bit. This is a problem when the same register
> is accessed using both 64-bit and 32-bit channels. This is very common
> in operations where we need to access 64-bit data in 32-bit chunks,
> such as the double packing and packing operations.
>
> This patch changes the analysis by checking the bits that each source
> or destination datatype needs. Actually, rather than bits, we use
> blocks of 32bits, which is the minimum channel size.
>
> Because a vgrf can contain a dvec4 (256 bits), we reserve 8
> 32-bit blocks to map the channels.
>
> v2 (Curro):
>   - Simplify code by making the var_from_reg helpers take an extra
> argument with the register component we want.
>   - Fix a couple of cases where we had to update the code to the new
> way of representing live variables.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp |  2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  2 +-
>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  | 25 +
>  .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 32 
> +++---
>  .../drivers/dri/i965/brw_vec4_live_variables.h | 15 ++
>  5 files changed, 42 insertions(+), 34 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 3191eab..34cab04 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1140,7 +1140,7 @@ vec4_visitor::opt_register_coalesce()
>/* Can't coalesce this GRF if someone else was going to
> * read it later.
> */
> -  if (var_range_end(var_from_reg(alloc, dst_reg(inst->src[0])), 4) > ip)
> +  if (var_range_end(var_from_reg(alloc, dst_reg(inst->src[0])), 8) > ip)
>continue;
>  
>/* We need to check interference with the final destination between 
> this
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> index 1b91db9..bef897a 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> @@ -246,7 +246,7 @@ vec4_visitor::opt_cse_local(bblock_t *block)
>   * more -- a sure sign they'll fail operands_match().
>   */
>  if (src->file == VGRF) {
> -   if (var_range_end(var_from_reg(alloc, dst_reg(*src)), 4) < 
> ip) {
> +   if (var_range_end(var_from_reg(alloc, dst_reg(*src)), 8) < 
> ip) {
>entry->remove();
>ralloc_free(entry);
>break;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> index 950c6c8..6a80810 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
> @@ -57,12 +57,13 @@ vec4_visitor::dead_code_eliminate()
>   if ((inst->dst.file == VGRF && !inst->has_side_effects()) ||
>   (inst->dst.is_null() && inst->writes_flag())){
>  bool result_live[4] = { false };
> -
>  if (inst->dst.file == VGRF) {
> -   for (unsigned i = 0; i < regs_written(inst); i++) {
> -  for (int c = 0; c < 4; c++)
> - result_live[c] |= BITSET_TEST(
> -live, var_from_reg(alloc, offset(inst->dst, i), c));
> +   for (unsigned i = 0; i < 2 * regs_written(inst); i++) {

One of the issues we discussed in the past about this approach is that
it would overestimate the number of register OWORDs accessed by
instructions with size_written < REG_SIZE (or size_read(i) < REG_SIZE),
which will be emitted by the SIMD lowering pass.  Now that the amount of
data read and written by instructions is represented in byte units you
can avoid this problem by using DIV_ROUND_UP(inst->size_written, 16)
instead of 2 * regs_written(inst) above.

> +  for (int c = 0; c < 4; c++) {
> + const unsigned v =
> +var_from_reg(alloc, inst->dst, c, i);
> + result_live[c] |= BITSET_TEST(live, v);
> +  }
> }
>  } else {
> for (unsigned c = 0; c < 4; c++)
> @@ -111,11 +112,12 @@ vec4_visitor::dead_code_eliminate()
>  
>   if (inst->dst.file == VGRF && !inst->predicate &&
>   !inst->is_align1_partial_write()) {
> -for (unsigned i = 0; i < regs_written(inst); i++) {
> +for (unsigned i = 0; i < 2 * regs_written(inst); i++) {

Same rounding issue as above.

> for (int c = 0; c < 4; c++) {
>if (inst->dst.writemask & (1 << c)) {
> - 

Re: [Mesa-dev] [PATCH 1/3] gallium: add renderonly library

2016-12-19 Thread Thierry Reding
On Mon, Dec 19, 2016 at 08:54:04PM +0100, Christian Gmeiner wrote:
> 2016-12-19 14:08 GMT+01:00 Thierry Reding :
> > On Wed, Nov 30, 2016 at 02:44:34PM +0100, Christian Gmeiner wrote:
[...]
> >>  GALLIUM_WINSYS_CFLAGS = \
> >>   -I$(top_srcdir)/src \
> >>   -I$(top_srcdir)/include \
> >> diff --git a/src/gallium/auxiliary/Makefile.am 
> >> b/src/gallium/auxiliary/Makefile.am
> >> index 4a4a4fb..6b63cf1 100644
> >> --- a/src/gallium/auxiliary/Makefile.am
> >> +++ b/src/gallium/auxiliary/Makefile.am
> >> @@ -20,6 +20,16 @@ libgallium_la_SOURCES = \
> >>   $(NIR_SOURCES) \
> >>   $(GENERATED_SOURCES)
> >>
> >> +if HAVE_LIBDRM
> >> +
> >> +AM_CFLAGS += \
> >> + $(LIBDRM_CFLAGS)
> >
> > Same here.
> 
> I just redone what other have done in that file. There are some if's
> in that file and I am unsure
> what to do here.

Maybe Emil has a strong opinion here, I was really just nit-picking, so
feel free to leave this.

> >> diff --git a/src/gallium/auxiliary/Makefile.sources 
> >> b/src/gallium/auxiliary/Makefile.sources
> >> index 5d4fe30..8d3e4a9 100644
> >> --- a/src/gallium/auxiliary/Makefile.sources
> >> +++ b/src/gallium/auxiliary/Makefile.sources
> >> @@ -435,3 +435,7 @@ GALLIVM_SOURCES := \
> >>   draw/draw_llvm_sample.c \
> >>   draw/draw_pt_fetch_shade_pipeline_llvm.c \
> >>   draw/draw_vs_llvm.c
> >> +
> >> +RENDERONLY_SOURCES := \
> >> + renderonly/renderonly.c \
> >> + renderonly/renderonly.h
> >> diff --git a/src/gallium/auxiliary/renderonly/renderonly.c 
> >> b/src/gallium/auxiliary/renderonly/renderonly.c
> >> new file mode 100644
> >> index 000..c4ea784
> >> --- /dev/null
> >> +++ b/src/gallium/auxiliary/renderonly/renderonly.c
> >> @@ -0,0 +1,199 @@
> >> +/*
> >> + * Copyright (C) 2016 Christian Gmeiner 
> >> + *
> >> + * Permission is hereby granted, free of charge, to any person obtaining a
> >> + * copy of this software and associated documentation files (the 
> >> "Software"),
> >> + * to deal in the Software without restriction, including without 
> >> limitation
> >> + * the rights to use, copy, modify, merge, publish, distribute, 
> >> sublicense,
> >> + * and/or sell copies of the Software, and to permit persons to whom the
> >> + * Software is furnished to do so, subject to the following conditions:
> >> + *
> >> + * The above copyright notice and this permission notice (including the 
> >> next
> >> + * paragraph) shall be included in all copies or substantial portions of 
> >> the
> >> + * Software.
> >> + *
> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> >> EXPRESS OR
> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> >> MERCHANTABILITY,
> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
> >> SHALL
> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> >> OTHER
> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> >> ARISING FROM,
> >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> >> IN THE
> >> + * SOFTWARE.
> >> + *
> >> + * Authors:
> >> + *Christian Gmeiner 
> >> + */
> >> +
> >> +#include "renderonly/renderonly.h"
> >
> > This include is oddly placed. Should it not go below all other includes?
> >
> 
> I think thats a matter of style but I can puit it above #include
> "state_tracker/drm_driver.h"

I prefer having these in hierarchical order. That is, anything from the
C runtime goes first, then other system includes, followed by project
core includes and finally driver-local includes. By that ordering
renderonly.h would be very last in line. Not sure if anyone in Mesa is
pedantic enough to insist on a common style, so feel free to leave this
as-is if you prefer.

> >> +struct pipe_screen *
> >> +renderonly_screen_create(int fd, const struct renderonly_ops *ops, void 
> >> *priv)
> >
> > This is slightly nit-picky, but I found it confusing when first reading
> > through the patches: I know this started out as an effort to support the
> > various render-only devices out there, but what you're creating here is
> > really an abstraction for the scanout engine. Reading renderonly_* the
> > first thing I think of is that this creates a render-only device, where
> > in fact is creates the scanout engine.
> >
> > Perhaps renaming this to struct scanout, struct scanout_engine or any
> > other number of possibilities would remove that confusion.
> >
> 
> In my current version I have only one struct left:
> 
> struct renderonly {
>int (*tiling)(int fd, uint32_t handle); /**< for !intermediate_rendering */
>bool intermediate_rendering;
>int kms_fd;
>int gpu_fd;
> };
> 
> And here is how the only function in imx-drm winsys looks like:
> 
> struct pipe_screen *imx_drm_screen_create(int fd)
> {
>struct renderonly ro = {
>   .intermediate_rendering = true,
>   .kms_fd = fd,
>   .gpu_fd 

Re: [Mesa-dev] [PATCH] glsl: Fix program interface queries relating to interface blocks.

2016-12-19 Thread Ian Romanick
On 12/16/2016 09:35 PM, Kenneth Graunke wrote:
> This fixes 555 dEQP tests (using the nougat-cts-dev branch), Piglit's
> arb_program_interface_query/arb_program_interface_query-resource-query,
> and GL45-CTS.program_interface_query.separate-programs-{tess-control,
> tess-eval,geometry}.  Only one dEQP program interface failure remains.
> 
> I would have liked to split this up into several distinct changes, but
> I wasn't sure how to do that given thet tangled nature of these issues.
> 
> So, the issues:
> 
>* We need to treat interface blocks declared as an array of instances
>  as a single block - removing the outer array.  The resource list
>  entry's name should not include the array length.  Properties such
>  as GL_ARRAY_SIZE should refer to the variable inside the block, not
>  the interface block's array properties.
> 
>* We need to do this prefixing even for structure variables.
> 
>* We need to do this for built-ins (such as gl_PerVertex.gl_Position).
> 
>* After interface array unwrapping, any variable which is an array
>  should have [0] appended.  It doesn't matter if it's a TCS/TES/GS
>  input or TCS output - that looked like an attempt to unwrap for
>  per-vertex variables, but that didn't consider per-patch variables,
>  and as far as I can tell there's nothing to justify this.
> 
> Several Mesa developers have suggested that Issue 16 contradicts the
> main specification, but I believe that it doesn't - the main spec just
> isn't terribly clear.  The main ARB_program_interface query spec says:
> 
>   "* For an active interface block not declared as an array of block
>  instances, a single entry will be generated, using the block name from
>  the shader source.
> 
>* For an active interface block declared as an array of instances,
>  separate entries will be generated for each active instance.  The name
>  of the instance is formed by concatenating the block name, the "["
>  character, an integer identifying the instance number, and the "]"
>  character."
> 
> Issue 16 says that built-ins should be named "gl_PerVertex.gl_Position",
> but several people suggested the second bullet above means that it
> should be named "gl_PerVertex[array length].gl_Position".
> 
> There are two important things to note.  Those bullet points say
> "an active interface block", while the others say "variable" or "active
> shader storage block member".  They also don't mention applying the
> rules recursively (unlike the other bullets).  Both suggest that
> these rules apply to blocks themselves, not members of blocks.
> 
> In fact, for GL_UNIFORM_BLOCK queries, we do have "block[0]",
> "block[1]", ... resource list entries - so those rules are real,
> and actually used.  So if they don't apply to block members, then how
> should members be named?  Unfortunately, I don't see any rules outside
> of issue 16 - where the rationale is very unclear.  I hope to clarify
> the spec in the future.

Based on my understanding, something like

out Vertex {
vec4 p;
vec4 c[3];
} vertex[2];

in a vertex shader should produce

Vertex[0].p
Vertex[0].c[0] (with GL_ARRAY_SIZE = 3)
Vertex[1].p
Vertex[1].c[0] (with GL_ARRAY_SIZE = 3)

This is definitely what we would produce for a uniform block.

What I've never understood is why that isn't done for gl_PerVertex stage
boundaries where gl_PerVertex is explicitly arrayed.

I think this patch is good for now.  I like that it brings all the
strange edge-case handling into one place.

Reviewed-by: Ian Romanick 

> Cc: Ian Romanick 
> Cc: Alejandro Piñeiro 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/linker.cpp   | 100 
> -
>  src/mesa/main/shader_query.cpp |  92 -
>  2 files changed, 79 insertions(+), 113 deletions(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index 5066014..5508d58 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -3735,6 +3735,65 @@ add_shader_variable(const struct gl_context *ctx,
>  bool use_implicit_location, int location,
>  const glsl_type *outermost_struct_type = NULL)
>  {
> +   const glsl_type *interface_type = var->get_interface_type();
> +
> +   if (outermost_struct_type == NULL) {
> +  /* Unsized (non-patch) TCS output/TES input arrays are implicitly
> +   * sized to gl_MaxPatchVertices.  Internally, we shrink them to a
> +   * smaller size.
> +   *
> +   * This can cause trouble with SSO programs.  Since the TCS declares
> +   * the number of output vertices, we can always shrink TCS output
> +   * arrays.  However, the TES might not be linked with a TCS, in
> +   * which case it won't know the size of the patch.  In other words,
> +   

[Mesa-dev] [Bug 99154] Link time error when using multiple builtin functions

2016-12-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99154

Bug ID: 99154
   Summary: Link time error when using multiple builtin functions
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: niels_...@salscheider-online.de
QA Contact: intel-3d-b...@lists.freedesktop.org

Created attachment 128570
  --> https://bugs.freedesktop.org/attachment.cgi?id=128570=edit
Simple test case

When using multiple builtin functions, linking of a shader can fail if
temporary variables of the GLSL compiler collide.

builtin_builder::binop creates variables named "x" and "y" for the parameters
of builtin functions. Later, ir_call::generate_inline inlines the call to the
builtins. It clones the variables (without changing the name) and assigns the
function inputs to them.

This can cause a collision of variable names, e. g. when using multiple builtin
functions with parameters of different types. The attached test case can be
used to reproduce the problem.

I can work around the problem by adding a sequence number and possibly a prefix
to the variable names but I am not sure if that is the correct solution.

Thanks to imirkin and robclark for helping me to figure this out on IRC.

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99154] Link time error when using multiple builtin functions

2016-12-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99154

Niels Ole Salscheider  changed:

   What|Removed |Added

 CC||niels_ole@salscheider-onlin
   ||e.de

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] imx: gallium driver for imx-drm scanout driver

2016-12-19 Thread Thierry Reding
On Mon, Dec 19, 2016 at 04:04:34PM +, Emil Velikov wrote:
> On Monday, 19 December 2016, Thierry Reding 
> wrote:
> 
> > On Wed, Nov 30, 2016 at 02:44:36PM +0100, Christian Gmeiner wrote:
> > [...]
> > > +static struct pipe_screen *imx_open_render_node(struct renderonly *ro)
> > > +{
> > > +   return etna_drm_screen_create_rendernode(ro);
> > > +}
> >
> > Patch 2/3 never made it into my inbox for some reason, and had to find
> > it in some archives. I'll have to resort to commenting on the code here.
> > From what I saw, etna_drm_screen_create_rendernode() hard-codes the GPU
> > render node (to /dev/dri/renderD128, I think). It's a little brittle for
> > obvious reasons, but I think you might be able to get away with it,
> > depending on the SoC.
> >
> > On Tegra we have a bunch of people that stick discrete GPUs into the
> > PCIe slot and run a second instance of Nouveau on that. It's an
> > interesting use-case, but it also has the drawback of creating a second
> > renderD device. What's more, it uses the same kernel driver, so hard-
> > coding the device node is going to give us a 50-50 chance on average
> > that we get the right one. Back when I had written the original proposal
> > I had also coded a heuristic that would walk sysfs and select the render
> > node that was on the same bus as the card node. This was before Emil had
> > removed the dependency on udev, but I've rewritten that code to work
> > with Emil's drmDevice*() API, which needs some fleshing out[0].
> >
> > Out of curiosity, is this something that could happen on i.MX devices as
> > well? Or even if not i.MX, I'm fairly sure that there are other SoCs
> > that have a Vivante GPU and a PCI slot that people could use to stick
> > big GPUs into, so you may run into the same issue eventually.
> 
> Thanks Thierry for the nice write-up. Must admit that I was feeling a bit
> lazy.
> 
> Considering the ~ok odds and the fact that we don't have platform support
> for the drm*Device helpers I'm inclined to have this fixed after the merge.
> 
> Let's have etna and(?) tegra and sort bugs during the RCs ;-)

I'm fine if you want etnaviv as-is. I personally would want to hold off
on Tegra for a wee bit longer. Let's see if Alex has a strong opinion.

What are your requirements for release? Will you be freezing the libdrm
dependency during RCs? I've got most of the patches ready to make this
work reliably on Tegra, though I'd still have to port my patches to
Christian's renderonly library.

Thierry


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v2)

2016-12-19 Thread Andreas Boll
2016-12-19 21:41 GMT+01:00 Alex Deucher :
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 0b5c6dc..76a34fe 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
> r600_common_screen *rscreen)
> case CHIP_FIJI: return "AMD FIJI";
> case CHIP_POLARIS10: return "AMD POLARIS10";
> case CHIP_POLARIS11: return "AMD POLARIS11";
> +   case CHIP_POLARIS12: return "AMD POLARIS12";
> case CHIP_STONEY: return "AMD STONEY";
> default: return "AMD unknown";
> }
> @@ -893,6 +894,11 @@ const char *r600_get_llvm_processor_name(enum 
> radeon_family family)
> case CHIP_POLARIS10: return "polaris10";
> case CHIP_POLARIS11: return "polaris11";
>  #endif

LLVM <= 3.8 needs the same fallback as CHIP_POLARIS11.

#if HAVE_LLVM <= 0x0308
needs to return "tonga"

Thanks,
Andreas

> +#if HAVE_LLVM <= 0x0309
> +   case CHIP_POLARIS12: return "polaris11";
> +#else
> +   case CHIP_POLARIS12: return "gfx803";
> +#endif
> default: return "";
> }
>  }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v2)

2016-12-19 Thread Alex Deucher
From: Junwei Zhang 

v2: use gfxip names for llvm 4.0+

Signed-off-by: Junwei Zhang 
Reviewed-by: Nicolai Hähnle 
Acked-by: Christian König 
---
 src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
 src/amd/addrlib/r800/ciaddrlib.h  | 1 +
 src/amd/common/amd_family.h   | 1 +
 src/amd/common/amdgpu_id.h| 4 
 src/gallium/drivers/radeon/r600_pipe_common.c | 6 ++
 src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
 src/gallium/drivers/radeonsi/si_pipe.c| 1 +
 src/gallium/drivers/radeonsi/si_state.c   | 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
 9 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp 
b/src/amd/addrlib/r800/ciaddrlib.cpp
index 7c5d29a..c726c4d 100644
--- a/src/amd/addrlib/r800/ciaddrlib.cpp
+++ b/src/amd/addrlib/r800/ciaddrlib.cpp
@@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
 m_settings.isFiji= ASICREV_IS_FIJI_P(uChipRevision);
 m_settings.isPolaris10   = 
ASICREV_IS_POLARIS10_P(uChipRevision);
 m_settings.isPolaris11   = 
ASICREV_IS_POLARIS11_M(uChipRevision);
+m_settings.isPolaris12   = 
ASICREV_IS_POLARIS12_V(uChipRevision);
 break;
 case FAMILY_CZ:
 m_settings.isCarrizo = 1;
@@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
 {
 m_pipes = 16;
 }
-else if (m_settings.isPolaris11)
+else if (m_settings.isPolaris11 || m_settings.isPolaris12)
 {
 m_pipes = 4;
 }
diff --git a/src/amd/addrlib/r800/ciaddrlib.h b/src/amd/addrlib/r800/ciaddrlib.h
index de995fa..2c9a4cc 100644
--- a/src/amd/addrlib/r800/ciaddrlib.h
+++ b/src/amd/addrlib/r800/ciaddrlib.h
@@ -62,6 +62,7 @@ struct CIChipSettings
 UINT_32 isFiji: 1;
 UINT_32 isPolaris10   : 1;
 UINT_32 isPolaris11   : 1;
+UINT_32 isPolaris12   : 1;
 // VI fusion (Carrizo)
 UINT_32 isCarrizo : 1;
 };
diff --git a/src/amd/common/amd_family.h b/src/amd/common/amd_family.h
index 6a713ad..b09bbb8 100644
--- a/src/amd/common/amd_family.h
+++ b/src/amd/common/amd_family.h
@@ -91,6 +91,7 @@ enum radeon_family {
 CHIP_STONEY,
 CHIP_POLARIS10,
 CHIP_POLARIS11,
+CHIP_POLARIS12,
 CHIP_LAST,
 };
 
diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
index f91df55..1683a5a 100644
--- a/src/amd/common/amdgpu_id.h
+++ b/src/amd/common/amdgpu_id.h
@@ -142,6 +142,8 @@ enum {
 
VI_POLARIS11_M_A0 = 90,
 
+   VI_POLARIS12_V_A0 = 100,
+
VI_UNKNOWN= 0xFF
 };
 
@@ -156,6 +158,8 @@ enum {
((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
 #define ASICREV_IS_POLARIS11_M(eChipRev)   \
(eChipRev >= VI_POLARIS11_M_A0)
+#define ASICREV_IS_POLARIS12_V(eChipRev)\
+   (eChipRev >= VI_POLARIS12_V_A0)
 
 /* CZ specific rev IDs */
 enum {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 0b5c6dc..76a34fe 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
r600_common_screen *rscreen)
case CHIP_FIJI: return "AMD FIJI";
case CHIP_POLARIS10: return "AMD POLARIS10";
case CHIP_POLARIS11: return "AMD POLARIS11";
+   case CHIP_POLARIS12: return "AMD POLARIS12";
case CHIP_STONEY: return "AMD STONEY";
default: return "AMD unknown";
}
@@ -893,6 +894,11 @@ const char *r600_get_llvm_processor_name(enum 
radeon_family family)
case CHIP_POLARIS10: return "polaris10";
case CHIP_POLARIS11: return "polaris11";
 #endif
+#if HAVE_LLVM <= 0x0309
+   case CHIP_POLARIS12: return "polaris11";
+#else
+   case CHIP_POLARIS12: return "gfx803";
+#endif
default: return "";
}
 }
diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index aad2ec1..dcd56ea 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -413,7 +413,8 @@ struct pipe_video_codec *rvce_create_encoder(struct 
pipe_context *context,
enc->use_vui = true;
if (rscreen->info.family >= CHIP_TONGA &&
rscreen->info.family != CHIP_STONEY &&
-   rscreen->info.family != CHIP_POLARIS11)
+   rscreen->info.family != CHIP_POLARIS11 &&
+   rscreen->info.family != CHIP_POLARIS12)
enc->dual_pipe = true;
/* TODO enable B frame with dual instance */
if ((rscreen->info.family >= CHIP_TONGA) &&
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 88685f9..11cca6f 100644
--- 

Re: [Mesa-dev] [PATCH 08/19] glsl: Walk a list of ir_dereference_array to mark array elements as accessed

2016-12-19 Thread Kenneth Graunke
On Thursday, December 15, 2016 8:10:20 PM PST Ian Romanick wrote:
> From: Ian Romanick 
> 
> Signed-off-by: Ian Romanick 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/glsl/ir_array_refcount.cpp |  73 +++-
>  src/compiler/glsl/ir_array_refcount.h   |  10 +
>  src/compiler/glsl/tests/array_refcount_test.cpp | 425 
> 
>  3 files changed, 507 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/ir_array_refcount.cpp 
> b/src/compiler/glsl/ir_array_refcount.cpp
> index 75dca0a..7c9ab8e 100644
> --- a/src/compiler/glsl/ir_array_refcount.cpp
> +++ b/src/compiler/glsl/ir_array_refcount.cpp
> @@ -34,7 +34,7 @@
>  #include "util/hash_table.h"
>  
>  ir_array_refcount_visitor::ir_array_refcount_visitor()
> -   : derefs(0), num_derefs(0), derefs_size(0)
> +   : last_array_deref(0), derefs(0), num_derefs(0), derefs_size(0)
>  {
> this->mem_ctx = ralloc_context(NULL);
> this->ht = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> @@ -159,6 +159,77 @@ ir_array_refcount_visitor::get_array_deref()
> return d;
>  }
>  
> +ir_visitor_status
> +ir_array_refcount_visitor::visit_enter(ir_dereference_array *ir)
> +{
> +   /* It could also be a vector or a matrix.  Individual elements of vectors
> +* are natrices are not tracked, so bail.

 and matrices 

> +*/
> +   if (!ir->array->type->is_array())
> +  return visit_continue;
> +
> +   /* If this array dereference is a child of an array dereference that was
> +* already visited, just continue on.  Otherwise, for an arrays-of-arrays
> +* dereference like x[1][2][3][4], we'd process the [1][2][3][4] sequence,
> +* the [1][2][3] sequence, the [1][2] sequence, and the [1] sequence.  
> This
> +* ensures that we only process the full sequence.
> +*/
> +   if (last_array_deref && last_array_deref->array == ir) {
> +  last_array_deref = ir;
> +  return visit_continue;
> +   }
> +
> +   last_array_deref = ir;
> +
> +   num_derefs = 0;
> +
> +   ir_rvalue *rv = ir;
> +   while (rv->ir_type == ir_type_dereference_array) {
> +  ir_dereference_array *const deref = rv->as_dereference_array();
> +
> +  assert(deref != NULL);
> +  assert(deref->array->type->is_array());
> +
> +  ir_rvalue *const array = deref->array;
> +  const ir_constant *const idx = deref->array_index->as_constant();
> +  array_deref_range *const dr = get_array_deref();
> +
> +  dr->size = array->type->array_size();
> +
> +  if (idx != NULL) {
> + dr->index = idx->get_int_component(0);
> +  } else {
> + /* An unsized array can occur at the end of an SSBO.  We can't track
> +  * accesses to such an array, so bail.
> +  */
> + if (array->type->array_size() == 0)
> +return visit_continue;
> +
> + dr->index = dr->size;
> +  }
> +
> +  rv = array;
> +   }
> +
> +   ir_dereference_variable *const var_deref = rv->as_dereference_variable();
> +
> +   /* If the array being dereferenced is not a variable, bail.  At the very
> +* least, ir_constant and ir_dereference_record are possible.
> +*/
> +   if (var_deref == NULL)
> +  return visit_continue;
> +
> +   ir_array_refcount_entry *const entry =
> +  this->get_variable_entry(var_deref->var);
> +
> +   if (entry == NULL)
> +  return visit_stop;
> +
> +   entry->mark_array_elements_referenced(derefs, num_derefs);
> +
> +   return visit_continue;
> +}
> +
>  
>  ir_visitor_status
>  ir_array_refcount_visitor::visit(ir_dereference_variable *ir)
> diff --git a/src/compiler/glsl/ir_array_refcount.h 
> b/src/compiler/glsl/ir_array_refcount.h
> index 2988046..46ba36c 100644
> --- a/src/compiler/glsl/ir_array_refcount.h
> +++ b/src/compiler/glsl/ir_array_refcount.h
> @@ -140,6 +140,7 @@ public:
> virtual ir_visitor_status visit(ir_dereference_variable *);
>  
> virtual ir_visitor_status visit_enter(ir_function_signature *);
> +   virtual ir_visitor_status visit_enter(ir_dereference_array *);
>  
> /**
>  * Find variable in the hash table, and insert it if not present
> @@ -158,6 +159,15 @@ private:
> array_deref_range *get_array_deref();
>  
> /**
> +* Last ir_dereference_array that was visited
> +*
> +* Used to prevent some redundant calculations.
> +*
> +* \sa ::visit_enter(ir_dereference_array *)
> +*/
> +   ir_dereference_array *last_array_deref;
> +
> +   /**
>  * \name array_deref_range tracking
>  */
> /*@{*/
> diff --git a/src/compiler/glsl/tests/array_refcount_test.cpp 
> b/src/compiler/glsl/tests/array_refcount_test.cpp
> index 29bb133..ecd7f46 100644
> --- a/src/compiler/glsl/tests/array_refcount_test.cpp
> +++ b/src/compiler/glsl/tests/array_refcount_test.cpp
> @@ -23,12 +23,18 @@
>  #include 
>  #include "ir.h"
>  #include "ir_array_refcount.h"
> +#include "ir_builder.h"
> +#include "util/hash_table.h"
> +
> +using 

Re: [Mesa-dev] [PATCH 1/3] gallium: add renderonly library

2016-12-19 Thread Christian Gmeiner
Hi Thierry,

2016-12-19 14:08 GMT+01:00 Thierry Reding :
> On Wed, Nov 30, 2016 at 02:44:34PM +0100, Christian Gmeiner wrote:
>> This a very lightweight library to add basic support for
>> renderonly GPUs. It does all the magic regarding in/exporting
>> buffers etc. This library will likely break android support and
>> hopefully will get replaced with a better solution based on gbm2.
>>
>> Signed-off-by: Christian Gmeiner 
>> ---
>>  src/gallium/Automake.inc  |   5 +
>>  src/gallium/auxiliary/Makefile.am |  10 ++
>>  src/gallium/auxiliary/Makefile.sources|   4 +
>>  src/gallium/auxiliary/renderonly/renderonly.c | 199 
>> ++
>>  src/gallium/auxiliary/renderonly/renderonly.h |  81 +++
>>  5 files changed, 299 insertions(+)
>>  create mode 100644 src/gallium/auxiliary/renderonly/renderonly.c
>>  create mode 100644 src/gallium/auxiliary/renderonly/renderonly.h
>
> Hi Christian,
>
> sorry for being late to the party. I only ran across this by accident.
> Would you mind keeping me in Cc for subsequent versions of this? It's
> been more than two years since I wrote the original proposal for this,
> but I'm still interested in seeing a solution emerge.
>

Sure will add you to CC list. Btw. your timing could not be better I
wanted to send out
v2 today. But I will hold it back and incooperate your feedback.

> Anyway, thanks for picking this up.
>
> From a diff point of view this is certainly much more terse than the
> original proposal. I find it slightly unfortunate that the drivers for
> the render GPU have to be changed. But it's hard to argue with the
> reduction in size.
>
> I still think that Tegra will eventually require more than the stub that
> you have in this series because the same device that exposes the scanout
> engine also contains units that can do video compositing, decoding and
> encoding. There's in fact recently been discussions on how best to
> provide that functionality and Mesa looks like the best long-term target
> currently. Anyway, that's a bridge I think we can cross when we get
> there, this concept looks fine to get started.
>
>> diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc
>> index 6fe2e22..6aadcb9 100644
>> --- a/src/gallium/Automake.inc
>> +++ b/src/gallium/Automake.inc
>> @@ -50,6 +50,11 @@ GALLIUM_COMMON_LIB_DEPS = \
>>   $(PTHREAD_LIBS) \
>>   $(DLOPEN_LIBS)
>>
>> +if HAVE_LIBDRM
>> +GALLIUM_COMMON_LIB_DEPS += \
>> + $(LIBDRM_LIBS)
>> +endif
>
> Nit: Is the conditional necessary here? LIBDRM_LIBS would be empty if
> libdrm wasn't found, right? So no harm in just appending it to the
> GALLIUM_COMMON_LIB_DEPS variable unconditonally?
>

Done.

>>  GALLIUM_WINSYS_CFLAGS = \
>>   -I$(top_srcdir)/src \
>>   -I$(top_srcdir)/include \
>> diff --git a/src/gallium/auxiliary/Makefile.am 
>> b/src/gallium/auxiliary/Makefile.am
>> index 4a4a4fb..6b63cf1 100644
>> --- a/src/gallium/auxiliary/Makefile.am
>> +++ b/src/gallium/auxiliary/Makefile.am
>> @@ -20,6 +20,16 @@ libgallium_la_SOURCES = \
>>   $(NIR_SOURCES) \
>>   $(GENERATED_SOURCES)
>>
>> +if HAVE_LIBDRM
>> +
>> +AM_CFLAGS += \
>> + $(LIBDRM_CFLAGS)
>
> Same here.

I just redone what other have done in that file. There are some if's
in that file and I am unsure
what to do here.

>
>> diff --git a/src/gallium/auxiliary/Makefile.sources 
>> b/src/gallium/auxiliary/Makefile.sources
>> index 5d4fe30..8d3e4a9 100644
>> --- a/src/gallium/auxiliary/Makefile.sources
>> +++ b/src/gallium/auxiliary/Makefile.sources
>> @@ -435,3 +435,7 @@ GALLIVM_SOURCES := \
>>   draw/draw_llvm_sample.c \
>>   draw/draw_pt_fetch_shade_pipeline_llvm.c \
>>   draw/draw_vs_llvm.c
>> +
>> +RENDERONLY_SOURCES := \
>> + renderonly/renderonly.c \
>> + renderonly/renderonly.h
>> diff --git a/src/gallium/auxiliary/renderonly/renderonly.c 
>> b/src/gallium/auxiliary/renderonly/renderonly.c
>> new file mode 100644
>> index 000..c4ea784
>> --- /dev/null
>> +++ b/src/gallium/auxiliary/renderonly/renderonly.c
>> @@ -0,0 +1,199 @@
>> +/*
>> + * Copyright (C) 2016 Christian Gmeiner 
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, 

Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support

2016-12-19 Thread Tom Stellard
On Mon, Dec 19, 2016 at 02:04:05PM -0500, Alex Deucher wrote:
> From: Junwei Zhang 
> 
> Signed-off-by: Junwei Zhang 
> Reviewed-by: Nicolai Hähnle 
> Acked-by: Christian König 
> ---
>  src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
>  src/amd/addrlib/r800/ciaddrlib.h  | 1 +
>  src/amd/common/amd_family.h   | 1 +
>  src/amd/common/amdgpu_id.h| 4 
>  src/gallium/drivers/radeon/r600_pipe_common.c | 6 ++
>  src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
>  src/gallium/drivers/radeonsi/si_pipe.c| 1 +
>  src/gallium/drivers/radeonsi/si_state.c   | 1 +
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
>  9 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp 
> b/src/amd/addrlib/r800/ciaddrlib.cpp
> index 7c5d29a..c726c4d 100644
> --- a/src/amd/addrlib/r800/ciaddrlib.cpp
> +++ b/src/amd/addrlib/r800/ciaddrlib.cpp
> @@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
>  m_settings.isFiji= ASICREV_IS_FIJI_P(uChipRevision);
>  m_settings.isPolaris10   = 
> ASICREV_IS_POLARIS10_P(uChipRevision);
>  m_settings.isPolaris11   = 
> ASICREV_IS_POLARIS11_M(uChipRevision);
> +m_settings.isPolaris12   = 
> ASICREV_IS_POLARIS12_V(uChipRevision);
>  break;
>  case FAMILY_CZ:
>  m_settings.isCarrizo = 1;
> @@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
>  {
>  m_pipes = 16;
>  }
> -else if (m_settings.isPolaris11)
> +else if (m_settings.isPolaris11 || m_settings.isPolaris12)
>  {
>  m_pipes = 4;
>  }
> diff --git a/src/amd/addrlib/r800/ciaddrlib.h 
> b/src/amd/addrlib/r800/ciaddrlib.h
> index de995fa..2c9a4cc 100644
> --- a/src/amd/addrlib/r800/ciaddrlib.h
> +++ b/src/amd/addrlib/r800/ciaddrlib.h
> @@ -62,6 +62,7 @@ struct CIChipSettings
>  UINT_32 isFiji: 1;
>  UINT_32 isPolaris10   : 1;
>  UINT_32 isPolaris11   : 1;
> +UINT_32 isPolaris12   : 1;
>  // VI fusion (Carrizo)
>  UINT_32 isCarrizo : 1;
>  };
> diff --git a/src/amd/common/amd_family.h b/src/amd/common/amd_family.h
> index 6a713ad..b09bbb8 100644
> --- a/src/amd/common/amd_family.h
> +++ b/src/amd/common/amd_family.h
> @@ -91,6 +91,7 @@ enum radeon_family {
>  CHIP_STONEY,
>  CHIP_POLARIS10,
>  CHIP_POLARIS11,
> +CHIP_POLARIS12,
>  CHIP_LAST,
>  };
>  
> diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
> index f91df55..1683a5a 100644
> --- a/src/amd/common/amdgpu_id.h
> +++ b/src/amd/common/amdgpu_id.h
> @@ -142,6 +142,8 @@ enum {
>  
>   VI_POLARIS11_M_A0 = 90,
>  
> + VI_POLARIS12_V_A0 = 100,
> +
>   VI_UNKNOWN= 0xFF
>  };
>  
> @@ -156,6 +158,8 @@ enum {
>   ((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
>  #define ASICREV_IS_POLARIS11_M(eChipRev)   \
>   (eChipRev >= VI_POLARIS11_M_A0)
> +#define ASICREV_IS_POLARIS12_V(eChipRev)\
> + (eChipRev >= VI_POLARIS12_V_A0)
>  
>  /* CZ specific rev IDs */
>  enum {
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 0b5c6dc..033e59c 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
> r600_common_screen *rscreen)
>   case CHIP_FIJI: return "AMD FIJI";
>   case CHIP_POLARIS10: return "AMD POLARIS10";
>   case CHIP_POLARIS11: return "AMD POLARIS11";
> + case CHIP_POLARIS12: return "AMD POLARIS12";
>   case CHIP_STONEY: return "AMD STONEY";
>   default: return "AMD unknown";
>   }
> @@ -893,6 +894,11 @@ const char *r600_get_llvm_processor_name(enum 
> radeon_family family)
>   case CHIP_POLARIS10: return "polaris10";
>   case CHIP_POLARIS11: return "polaris11";
>  #endif
> +#if HAVE_LLVM <= 0x0309
> + case CHIP_POLARIS12: return "polaris11";
> +#else
> + case CHIP_POLARIS12: return "polaris12";

There is a preference to move away from code names and use
gfxip names in the compiler. For LLVM >= 4.0 this should be "gfx803"

-Tom

> +#endif
>   default: return "";
>   }
>  }
> diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
> b/src/gallium/drivers/radeon/radeon_vce.c
> index aad2ec1..dcd56ea 100644
> --- a/src/gallium/drivers/radeon/radeon_vce.c
> +++ b/src/gallium/drivers/radeon/radeon_vce.c
> @@ -413,7 +413,8 @@ struct pipe_video_codec *rvce_create_encoder(struct 
> pipe_context *context,
>   enc->use_vui = true;
>   if (rscreen->info.family >= CHIP_TONGA &&
>   rscreen->info.family != CHIP_STONEY &&
> - rscreen->info.family != CHIP_POLARIS11)
> + 

[Mesa-dev] [PATCH] i965: Don't bail on vertex element processing if we need draw params.

2016-12-19 Thread Kenneth Graunke
BaseVertex, BaseInstance, DrawID, and some edge flag conditions need
vertex buffer and elements structs.  We can't bail early in this case.

Gen4-7 already do this properly.  Gen8+ did not.

Thanks to Ilia Mirkin for helping track this down.

Cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99144
Reported-by: Pierre-Eric Pelloux-Prayer 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 ++--
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index 69ba8e9..3177f9a 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -110,6 +110,22 @@ gen8_emit_vertices(struct brw_context *brw)
   ADVANCE_BATCH();
}
 
+   /* Normally we don't need an element for the SGVS attribute because the
+* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
+* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
+* we're using draw parameters then we need an element for the those
+* values.  Additionally if there is an edge flag element then the SGVS
+* can't be inserted past that so we need a dummy element to ensure that
+* the edge flag is the last one.
+*/
+   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_baseinstance ||
+((vs_prog_data->uses_instanceid ||
+  vs_prog_data->uses_vertexid) &&
+ uses_edge_flag));
+   const unsigned nr_elements =
+  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
+
/* If the VS doesn't read any inputs (calculating vertex position from
 * a state variable for some reason, for example), emit a single pad
 * VERTEX_ELEMENT struct and bail.
@@ -117,7 +133,7 @@ gen8_emit_vertices(struct brw_context *brw)
 * The stale VB state stays in place, but they don't do anything unless
 * a VE loads from them.
 */
-   if (brw->vb.nr_enabled == 0) {
+   if (nr_elements == 0) {
   BEGIN_BATCH(3);
   OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS << 16) | (3 - 2));
   OUT_BATCH((0 << GEN6_VE0_INDEX_SHIFT) |
@@ -172,22 +188,6 @@ gen8_emit_vertices(struct brw_context *brw)
   ADVANCE_BATCH();
}
 
-   /* Normally we don't need an element for the SGVS attribute because the
-* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
-* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
-* we're using draw parameters then we need an element for the those
-* values.  Additionally if there is an edge flag element then the SGVS
-* can't be inserted past that so we need a dummy element to ensure that
-* the edge flag is the last one.
-*/
-   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
-vs_prog_data->uses_baseinstance ||
-((vs_prog_data->uses_instanceid ||
-  vs_prog_data->uses_vertexid) &&
- uses_edge_flag));
-   const unsigned nr_elements =
-  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
-
/* The hardware allows one more VERTEX_ELEMENTS than VERTEX_BUFFERS,
 * presumably for VertexID/InstanceID.
 */
-- 
2.10.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] gallium-docs: Add documentation for when using several contexts

2016-12-19 Thread Axel Davy
Add documentation to explicit what can be expected and what is allowed
when using several contexts.

Signed-off-by: Axel Davy 
---
 src/gallium/docs/source/context.rst | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/gallium/docs/source/context.rst 
b/src/gallium/docs/source/context.rst
index e190cefc85..35f51a0941 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -707,3 +707,26 @@ notifications are single-shot, i.e. subsequent calls to
   since the last call or since the last notification by callback.
 * ``set_device_reset_callback`` sets a callback which will be called when
   a device reset is detected. The callback is only called synchronously.
+
+Using several contexts
+--
+
+Several contexts from the same screen can be used at the same time. Objects
+created on one context cannot be used in another context, but the objects
+created by the screen methods can be used by all contexts.
+
+Transfers
+^
+A transfer on one context is not expected to synchronize properly with
+rendering on other contexts, thus only areas not yet used for rendering should
+be locked.
+
+A flush is required after transfer_unmap to expect other contexts to see the
+uploaded data, unless:
+
+* Using persistent mapping. Associated with coherent mapping, unmapping the
+  resource is also not required to use it in other contexts. Without coherent
+  mapping, memory_barrier(PIPE_BARRIER_MAPPED_BUFFER) should be called on the
+  context that has mapped the resource. No flush is required.
+
+* Mapping the resource with PIPE_TRANSFER_MAP_DIRECTLY.
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFCv2] New gallium documentation to explicit threading requirements

2016-12-19 Thread Axel Davy
This is a new RFC to replace "New gallium flags for using different
contexts in several threads".

Please comment.

Yours,

Axel Davy

Axel Davy (2):
  gallium-docs: Add documentation for threading requirements
  gallium-docs: Add documentation for when using several contexts

 src/gallium/docs/source/context.rst | 23 +++
 src/gallium/docs/source/screen.rst  | 10 ++
 2 files changed, 33 insertions(+)

-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] gallium-docs: Add documentation for threading requirements

2016-12-19 Thread Axel Davy
Add documentation for the requirements related to threading
for screens and contexts.

Signed-off-by: Axel Davy 
---
 src/gallium/docs/source/screen.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 7ac39ffc44..86aa2591ab 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -737,3 +737,13 @@ query group at the specified **index** is returned in 
**info**.
 The function returns non-zero on success.
 The driver-specific query group is described with the
 pipe_driver_query_group_info structure.
+
+
+Thread safety
+-
+
+Screen methods are required to be thread safe. While gallium rendering
+contexts are not required to be thread safe, it is required to be safe to use
+different contexts created with the same screen in different threads without
+locks. It is also required to be safe using screen methods in a thread, while
+using one of its contexts in another (without locks).
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] radeonsi: add Polaris12 PCI ID

2016-12-19 Thread Alex Deucher
From: Junwei Zhang 

Signed-off-by: Junwei Zhang 
Reviewed-by: Nicolai Hähnle 
---
 include/pci_ids/radeonsi_pci_ids.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/pci_ids/radeonsi_pci_ids.h 
b/include/pci_ids/radeonsi_pci_ids.h
index 20c1583..fca47b0 100644
--- a/include/pci_ids/radeonsi_pci_ids.h
+++ b/include/pci_ids/radeonsi_pci_ids.h
@@ -205,3 +205,10 @@ CHIPSET(0x67CF, POLARIS10_, POLARIS10)
 CHIPSET(0x67DF, POLARIS10_, POLARIS10)
 
 CHIPSET(0x98E4, STONEY_, STONEY)
+
+CHIPSET(0x6980, POLARIS12_, POLARIS12)
+CHIPSET(0x6981, POLARIS12_, POLARIS12)
+CHIPSET(0x6985, POLARIS12_, POLARIS12)
+CHIPSET(0x6986, POLARIS12_, POLARIS12)
+CHIPSET(0x6987, POLARIS12_, POLARIS12)
+CHIPSET(0x699F, POLARIS12_, POLARIS12)
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support

2016-12-19 Thread Alex Deucher
From: Junwei Zhang 

Signed-off-by: Junwei Zhang 
Reviewed-by: Nicolai Hähnle 
Acked-by: Christian König 
---
 src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
 src/amd/addrlib/r800/ciaddrlib.h  | 1 +
 src/amd/common/amd_family.h   | 1 +
 src/amd/common/amdgpu_id.h| 4 
 src/gallium/drivers/radeon/r600_pipe_common.c | 6 ++
 src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
 src/gallium/drivers/radeonsi/si_pipe.c| 1 +
 src/gallium/drivers/radeonsi/si_state.c   | 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
 9 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp 
b/src/amd/addrlib/r800/ciaddrlib.cpp
index 7c5d29a..c726c4d 100644
--- a/src/amd/addrlib/r800/ciaddrlib.cpp
+++ b/src/amd/addrlib/r800/ciaddrlib.cpp
@@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
 m_settings.isFiji= ASICREV_IS_FIJI_P(uChipRevision);
 m_settings.isPolaris10   = 
ASICREV_IS_POLARIS10_P(uChipRevision);
 m_settings.isPolaris11   = 
ASICREV_IS_POLARIS11_M(uChipRevision);
+m_settings.isPolaris12   = 
ASICREV_IS_POLARIS12_V(uChipRevision);
 break;
 case FAMILY_CZ:
 m_settings.isCarrizo = 1;
@@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
 {
 m_pipes = 16;
 }
-else if (m_settings.isPolaris11)
+else if (m_settings.isPolaris11 || m_settings.isPolaris12)
 {
 m_pipes = 4;
 }
diff --git a/src/amd/addrlib/r800/ciaddrlib.h b/src/amd/addrlib/r800/ciaddrlib.h
index de995fa..2c9a4cc 100644
--- a/src/amd/addrlib/r800/ciaddrlib.h
+++ b/src/amd/addrlib/r800/ciaddrlib.h
@@ -62,6 +62,7 @@ struct CIChipSettings
 UINT_32 isFiji: 1;
 UINT_32 isPolaris10   : 1;
 UINT_32 isPolaris11   : 1;
+UINT_32 isPolaris12   : 1;
 // VI fusion (Carrizo)
 UINT_32 isCarrizo : 1;
 };
diff --git a/src/amd/common/amd_family.h b/src/amd/common/amd_family.h
index 6a713ad..b09bbb8 100644
--- a/src/amd/common/amd_family.h
+++ b/src/amd/common/amd_family.h
@@ -91,6 +91,7 @@ enum radeon_family {
 CHIP_STONEY,
 CHIP_POLARIS10,
 CHIP_POLARIS11,
+CHIP_POLARIS12,
 CHIP_LAST,
 };
 
diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
index f91df55..1683a5a 100644
--- a/src/amd/common/amdgpu_id.h
+++ b/src/amd/common/amdgpu_id.h
@@ -142,6 +142,8 @@ enum {
 
VI_POLARIS11_M_A0 = 90,
 
+   VI_POLARIS12_V_A0 = 100,
+
VI_UNKNOWN= 0xFF
 };
 
@@ -156,6 +158,8 @@ enum {
((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
 #define ASICREV_IS_POLARIS11_M(eChipRev)   \
(eChipRev >= VI_POLARIS11_M_A0)
+#define ASICREV_IS_POLARIS12_V(eChipRev)\
+   (eChipRev >= VI_POLARIS12_V_A0)
 
 /* CZ specific rev IDs */
 enum {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 0b5c6dc..033e59c 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
r600_common_screen *rscreen)
case CHIP_FIJI: return "AMD FIJI";
case CHIP_POLARIS10: return "AMD POLARIS10";
case CHIP_POLARIS11: return "AMD POLARIS11";
+   case CHIP_POLARIS12: return "AMD POLARIS12";
case CHIP_STONEY: return "AMD STONEY";
default: return "AMD unknown";
}
@@ -893,6 +894,11 @@ const char *r600_get_llvm_processor_name(enum 
radeon_family family)
case CHIP_POLARIS10: return "polaris10";
case CHIP_POLARIS11: return "polaris11";
 #endif
+#if HAVE_LLVM <= 0x0309
+   case CHIP_POLARIS12: return "polaris11";
+#else
+   case CHIP_POLARIS12: return "polaris12";
+#endif
default: return "";
}
 }
diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index aad2ec1..dcd56ea 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -413,7 +413,8 @@ struct pipe_video_codec *rvce_create_encoder(struct 
pipe_context *context,
enc->use_vui = true;
if (rscreen->info.family >= CHIP_TONGA &&
rscreen->info.family != CHIP_STONEY &&
-   rscreen->info.family != CHIP_POLARIS11)
+   rscreen->info.family != CHIP_POLARIS11 &&
+   rscreen->info.family != CHIP_POLARIS12)
enc->dual_pipe = true;
/* TODO enable B frame with dual instance */
if ((rscreen->info.family >= CHIP_TONGA) &&
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 88685f9..11cca6f 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ 

Re: [Mesa-dev] [PATCH v2 000/103] i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0

2016-12-19 Thread Matt Turner
On Mon, Dec 19, 2016 at 2:00 AM, Samuel Iglesias Gonsálvez
 wrote:
> Hello Matt,
>
> We have done most of the suggestions you made to our patches. However,
> we have replied to some of your questions/suggestions and we are
> waiting for a reply before marking them as R-b or not.

Thank you guys so much.

> You can clone the new version of the patch series by running this
> command:
>
> $ git clone -b i965-fp64-gen7-scalar-vec4-rc3 https://github.com/Igalia
> /mesa.git
>
> Below is the list of patches that need a R-b (they are marked as
> UNREVIEWED in the branch).
>
> * i965/vec4: implement hardware workaround for align16 double to float
> conversion
>>
>>   This always seemed like a really strange hardware bug, and
> one
>>   that no one should ever hit.
>>
>>   I'd prefer that, instead of loading an immediate double and
>> then
>>   performing a conversion to float, that we just convert the
>>   double to float in the compiler and emit an instruction to
> load
>>   that.
>>
>
>   We have done this. Does this change get your R-b?

Yes!

>
> * i965/vec4: fix optimize predicate for doubles
>
>   We have replied here [0].

Sounds good to me.

>
> * i965/vec4: handle 32 and 64 bit channels in liveness analysis
>
>   It is still unreviewed. Maybe Curro can take a look at it.

I've also pinged Curro to ask if he'll review it.

> * i965/vec4: add a SIMD lowering pass
>
>   Replied here [1].

Silly messy hardware. :)

> * i965/vec4: Prevent copy propagation from violating pre-gen8
> restrictions
>
>   Replied here [1].
>
> * i965/vec4: run scalarize_df() after spilling
>
>   Replied here [1].

Makes sense.

Yes, all of those should be

Reviewed-by: Matt Turner 

Again, thank you so much. This was a large amount of work, and the way
you guys handled it was extremely impressive. I'm only sorry that the
review of your work wasn't executed as well as your actual work!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] imx: gallium driver for imx-drm scanout driver

2016-12-19 Thread Emil Velikov
On Monday, 19 December 2016, Thierry Reding 
wrote:

> On Wed, Nov 30, 2016 at 02:44:36PM +0100, Christian Gmeiner wrote:
> [...]
> > +static struct pipe_screen *imx_open_render_node(struct renderonly *ro)
> > +{
> > +   return etna_drm_screen_create_rendernode(ro);
> > +}
>
> Patch 2/3 never made it into my inbox for some reason, and had to find
> it in some archives. I'll have to resort to commenting on the code here.
> From what I saw, etna_drm_screen_create_rendernode() hard-codes the GPU
> render node (to /dev/dri/renderD128, I think). It's a little brittle for
> obvious reasons, but I think you might be able to get away with it,
> depending on the SoC.
>
> On Tegra we have a bunch of people that stick discrete GPUs into the
> PCIe slot and run a second instance of Nouveau on that. It's an
> interesting use-case, but it also has the drawback of creating a second
> renderD device. What's more, it uses the same kernel driver, so hard-
> coding the device node is going to give us a 50-50 chance on average
> that we get the right one. Back when I had written the original proposal
> I had also coded a heuristic that would walk sysfs and select the render
> node that was on the same bus as the card node. This was before Emil had
> removed the dependency on udev, but I've rewritten that code to work
> with Emil's drmDevice*() API, which needs some fleshing out[0].
>
> Out of curiosity, is this something that could happen on i.MX devices as
> well? Or even if not i.MX, I'm fairly sure that there are other SoCs
> that have a Vivante GPU and a PCI slot that people could use to stick
> big GPUs into, so you may run into the same issue eventually.
>
> Thanks Thierry for the nice write-up. Must admit that I was feeling a bit
lazy.

Considering the ~ok odds and the fact that we don't have platform support
for the drm*Device helpers I'm inclined to have this fixed after the merge.

Let's have etna and(?) tegra and sort bugs during the RCs ;-)

Emil
P.S. Pardon the html formatting
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] imx: gallium driver for imx-drm scanout driver

2016-12-19 Thread Thierry Reding
On Wed, Nov 30, 2016 at 02:44:36PM +0100, Christian Gmeiner wrote:
[...]
> +static struct pipe_screen *imx_open_render_node(struct renderonly *ro)
> +{
> +   return etna_drm_screen_create_rendernode(ro);
> +}

Patch 2/3 never made it into my inbox for some reason, and had to find
it in some archives. I'll have to resort to commenting on the code here.
From what I saw, etna_drm_screen_create_rendernode() hard-codes the GPU
render node (to /dev/dri/renderD128, I think). It's a little brittle for
obvious reasons, but I think you might be able to get away with it,
depending on the SoC.

On Tegra we have a bunch of people that stick discrete GPUs into the
PCIe slot and run a second instance of Nouveau on that. It's an
interesting use-case, but it also has the drawback of creating a second
renderD device. What's more, it uses the same kernel driver, so hard-
coding the device node is going to give us a 50-50 chance on average
that we get the right one. Back when I had written the original proposal
I had also coded a heuristic that would walk sysfs and select the render
node that was on the same bus as the card node. This was before Emil had
removed the dependency on udev, but I've rewritten that code to work
with Emil's drmDevice*() API, which needs some fleshing out[0].

Out of curiosity, is this something that could happen on i.MX devices as
well? Or even if not i.MX, I'm fairly sure that there are other SoCs
that have a Vivante GPU and a PCI slot that people could use to stick
big GPUs into, so you may run into the same issue eventually.

Thierry

[0]: Interestingly, things would probably work for Tegra regardless of
whether the GPU is the Tegra one or a discrete one, as long as the
discrete is driven by Nouveau. So perhaps an even more interesting
heuristic would be to look for a render node that's driven by the
Nouveau driver...


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac: Respect LLVM_CFLAGS in LLVM version detection

2016-12-19 Thread Tomasz Figa
On Mon, Dec 19, 2016 at 8:06 PM, Emil Velikov  wrote:
> On 19 December 2016 at 08:52, Tomasz Figa  wrote:
>> Hi Tobias,
>>
>> On Sat, Dec 17, 2016 at 2:15 AM, Tobias Droste  wrote:
>>> Hi Tomasz,
>>>
>>> does this actually fix anything?
>>>
>>> Because right now llvm-config.h does not include anything and I doubt it 
>>> will
>>> in the future, as it's just a collection of defines.
>>> The path to the header file itself is given by llvm-config 
>>> ($LLVM_INCLUDEDIR).
>>>
>>> Did you just happen to see this or do you get an error without this patch?
>>
>> We happen to have the setup exactly as described in the patch
>> description, i.e. LLVM in a non-standard directory and with
>> llvm-config.h that includes another header. Without the patch
>> ./configure fails because of LLVM version detection errors.
>>
> I believe you're spot on - be that due to missing include [directives]
> or conditional header inclusion one would need to have the respective
> CFLAGS. Thus the patch is
> Reviewed-by: Emil Velikov 
>
> At the same time - I seems to be lucky enough to have the headers sane
> [neither includes nor defines are required] on my Arch setup.
> Wondering if it's not something specific to the Android way of
> building LLVM

Yeah, that's very likely. :)

Although I can imagine distributors creating a wrapper including one
or another real header, depending on some setting. I might have seen
something like this in Gentoo, but I think it used absolute paths,
which made it compile without any flags.

> - IIRC the Android-x86 guys were explaining some
> interesting things.
> ... but that for another day. I'll commit your patch once I
> double-check Tobias' fixes.

Thanks!

Best regards,
Tomasz
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add renderonly library

2016-12-19 Thread Thierry Reding
On Wed, Nov 30, 2016 at 02:44:34PM +0100, Christian Gmeiner wrote:
> This a very lightweight library to add basic support for
> renderonly GPUs. It does all the magic regarding in/exporting
> buffers etc. This library will likely break android support and
> hopefully will get replaced with a better solution based on gbm2.
> 
> Signed-off-by: Christian Gmeiner 
> ---
>  src/gallium/Automake.inc  |   5 +
>  src/gallium/auxiliary/Makefile.am |  10 ++
>  src/gallium/auxiliary/Makefile.sources|   4 +
>  src/gallium/auxiliary/renderonly/renderonly.c | 199 
> ++
>  src/gallium/auxiliary/renderonly/renderonly.h |  81 +++
>  5 files changed, 299 insertions(+)
>  create mode 100644 src/gallium/auxiliary/renderonly/renderonly.c
>  create mode 100644 src/gallium/auxiliary/renderonly/renderonly.h

Hi Christian,

sorry for being late to the party. I only ran across this by accident.
Would you mind keeping me in Cc for subsequent versions of this? It's
been more than two years since I wrote the original proposal for this,
but I'm still interested in seeing a solution emerge.

Anyway, thanks for picking this up.

From a diff point of view this is certainly much more terse than the
original proposal. I find it slightly unfortunate that the drivers for
the render GPU have to be changed. But it's hard to argue with the
reduction in size.

I still think that Tegra will eventually require more than the stub that
you have in this series because the same device that exposes the scanout
engine also contains units that can do video compositing, decoding and
encoding. There's in fact recently been discussions on how best to
provide that functionality and Mesa looks like the best long-term target
currently. Anyway, that's a bridge I think we can cross when we get
there, this concept looks fine to get started.

> diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc
> index 6fe2e22..6aadcb9 100644
> --- a/src/gallium/Automake.inc
> +++ b/src/gallium/Automake.inc
> @@ -50,6 +50,11 @@ GALLIUM_COMMON_LIB_DEPS = \
>   $(PTHREAD_LIBS) \
>   $(DLOPEN_LIBS)
>  
> +if HAVE_LIBDRM
> +GALLIUM_COMMON_LIB_DEPS += \
> + $(LIBDRM_LIBS)
> +endif

Nit: Is the conditional necessary here? LIBDRM_LIBS would be empty if
libdrm wasn't found, right? So no harm in just appending it to the
GALLIUM_COMMON_LIB_DEPS variable unconditonally?

>  GALLIUM_WINSYS_CFLAGS = \
>   -I$(top_srcdir)/src \
>   -I$(top_srcdir)/include \
> diff --git a/src/gallium/auxiliary/Makefile.am 
> b/src/gallium/auxiliary/Makefile.am
> index 4a4a4fb..6b63cf1 100644
> --- a/src/gallium/auxiliary/Makefile.am
> +++ b/src/gallium/auxiliary/Makefile.am
> @@ -20,6 +20,16 @@ libgallium_la_SOURCES = \
>   $(NIR_SOURCES) \
>   $(GENERATED_SOURCES)
>  
> +if HAVE_LIBDRM
> +
> +AM_CFLAGS += \
> + $(LIBDRM_CFLAGS)

Same here.

> diff --git a/src/gallium/auxiliary/Makefile.sources 
> b/src/gallium/auxiliary/Makefile.sources
> index 5d4fe30..8d3e4a9 100644
> --- a/src/gallium/auxiliary/Makefile.sources
> +++ b/src/gallium/auxiliary/Makefile.sources
> @@ -435,3 +435,7 @@ GALLIVM_SOURCES := \
>   draw/draw_llvm_sample.c \
>   draw/draw_pt_fetch_shade_pipeline_llvm.c \
>   draw/draw_vs_llvm.c
> +
> +RENDERONLY_SOURCES := \
> + renderonly/renderonly.c \
> + renderonly/renderonly.h
> diff --git a/src/gallium/auxiliary/renderonly/renderonly.c 
> b/src/gallium/auxiliary/renderonly/renderonly.c
> new file mode 100644
> index 000..c4ea784
> --- /dev/null
> +++ b/src/gallium/auxiliary/renderonly/renderonly.c
> @@ -0,0 +1,199 @@
> +/*
> + * Copyright (C) 2016 Christian Gmeiner 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
> THE
> + * SOFTWARE.
> + *
> + * Authors:
> + *Christian Gmeiner 
> + */
> 

Re: [Mesa-dev] [PATCH 3/3] imx: gallium driver for imx-drm scanout driver

2016-12-19 Thread Thierry Reding
On Thu, Dec 01, 2016 at 03:00:20PM +, Emil Velikov wrote:
> On 30 November 2016 at 13:44, Christian Gmeiner
>  wrote:
> > The imx (stub) driver is needed to get hardware acceleration from
> > etnaviv on a platform using imx-drm kms driver. This adds support
> > for wayland and native kms egl apps.
> >
> > Signed-off-by: Christian Gmeiner 
> > ---
> >  configure.ac   | 12 +++
> >  src/gallium/Makefile.am|  4 +++
> >  .../auxiliary/pipe-loader/pipe_loader_drm.c|  5 +++
> >  src/gallium/auxiliary/target-helpers/drm_helper.h  | 23 
> >  .../auxiliary/target-helpers/drm_helper_public.h   |  3 ++
> >  src/gallium/drivers/imx/Automake.inc   |  9 +
> >  src/gallium/drivers/imx/Makefile.am|  9 +
> >  src/gallium/winsys/imx/drm/Makefile.am | 33 +
> >  src/gallium/winsys/imx/drm/Makefile.sources|  3 ++
> >  src/gallium/winsys/imx/drm/imx_drm_public.h| 31 
> >  src/gallium/winsys/imx/drm/imx_drm_winsys.c| 41 
> > ++
> >  11 files changed, 173 insertions(+)
> I think you want to add the following to src/gallium/targets/dri/Makefile.am
> 
> include $(top_srcdir)/src/gallium/drivers/imx/Automake.inc
> 
> Otherwise there will be no imx_dri.so module which you can use.
> ^^ is a must have afaics, everything else (mentioned below) can be
> tackled at a later stage.
> 
> A set of targets/pipe-loader/* changes would be nice... unless I beat
> you to it and fold the final round of duplication that we have in the
> pipe-loader/targets topic ;-)
> 
> 
> > +include Makefile.sources
> > +include $(top_srcdir)/src/gallium/Automake.inc
> > +
> > +AM_CFLAGS = \
> > +   -I$(top_srcdir)/src/gallium/drivers \
> Add the following and then ...
>-I$(top_srcdir)/src/gallium/winsys \
> 
> > +   $(GALLIUM_WINSYS_CFLAGS) \
> > +   $(IMX_CFLAGS)
> > +
> > +noinst_LTLIBRARIES = libimxdrm.la
> > +
> > +libimxdrm_la_SOURCES = $(C_SOURCES)
> > \ No newline at end of file
> Please add the missing newlines throughout.
> 
> 
> > diff --git a/src/gallium/winsys/imx/drm/imx_drm_public.h 
> > b/src/gallium/winsys/imx/drm/imx_drm_public.h
> > new file mode 100644
> > index 000..2d93da2
> > --- /dev/null
> > +++ b/src/gallium/winsys/imx/drm/imx_drm_public.h
> > @@ -0,0 +1,31 @@
> > +/*
> > + * Copyright © 2014 NVIDIA Corporation
> > + *
> Disclaimer: IANAL
> 
> Here and other copyright notices could be updated to reflect you.
> Things have changed noticeably that any recemblense with the original
> is conicidential.

Agreed, there's nothing in this file that I'd consider copyright of
NVIDIA. If there's anything copyrightable in this series at all I'd
think it'd be the create/export/import/tiling dance that the render-
only library does.

Thierry


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] egl: syncronize surface information with driver RFC

2016-12-19 Thread Tapani Pälli
This makes sure that the values we return are in sync what the
driver currently has.

Together with dEQP change bug #98327 this fixes following test:

   dEQP-EGL.functional.resize.surface_size.grow

v2: implement callback also for dri3
v3: make optional for dri2 drivers, only x11 dri2 seems
to require this right now

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98327
---
 src/egl/drivers/dri2/egl_dri2.c | 12 
 src/egl/drivers/dri2/egl_dri2.h |  2 ++
 src/egl/drivers/dri2/platform_x11.c | 31 +++
 src/egl/main/eglapi.h   |  2 ++
 src/egl/main/eglfallbacks.c |  1 +
 src/egl/main/eglsurface.c   |  3 +++
 6 files changed, 51 insertions(+)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 52fbdff..112c6ad 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1264,6 +1264,17 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLSurface *surf)
return dri2_dpy->vtbl->destroy_surface(drv, dpy, surf);
 }
 
+static EGLBoolean
+dri2_sync_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+
+   if (dri2_dpy->vtbl->sync_surface)
+  return dri2_dpy->vtbl->sync_surface(drv, dpy, surf);
+
+   return EGL_TRUE;
+}
+
 /**
  * Called via eglMakeCurrent(), drv->API.MakeCurrent().
  */
@@ -2942,6 +2953,7 @@ _eglBuiltInDriverDRI2(const char *args)
dri2_drv->base.API.CreatePixmapSurface = dri2_create_pixmap_surface;
dri2_drv->base.API.CreatePbufferSurface = dri2_create_pbuffer_surface;
dri2_drv->base.API.DestroySurface = dri2_destroy_surface;
+   dri2_drv->base.API.SyncSurface = dri2_sync_surface;
dri2_drv->base.API.GetProcAddress = dri2_get_proc_address;
dri2_drv->base.API.WaitClient = dri2_wait_client;
dri2_drv->base.API.WaitNative = dri2_wait_native;
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index eac58f3..d5cc9b8 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -110,6 +110,8 @@ struct dri2_egl_display_vtbl {
 
EGLBoolean (*destroy_surface)(_EGLDriver *drv, _EGLDisplay *dpy,
  _EGLSurface *surface);
+   EGLBoolean (*sync_surface)(_EGLDriver *drv, _EGLDisplay *dpy,
+  _EGLSurface *surface);
 
EGLBoolean (*swap_interval)(_EGLDriver *drv, _EGLDisplay *dpy,
_EGLSurface *surf, EGLint interval);
diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index db7d3b9..ecbd8de 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -395,6 +395,36 @@ dri2_x11_destroy_surface(_EGLDriver *drv, _EGLDisplay 
*disp, _EGLSurface *surf)
 }
 
 /**
+ * Syncronize surface geometry with server
+ */
+static EGLBoolean
+dri2_x11_sync_surface(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   xcb_get_geometry_cookie_t cookie;
+   xcb_get_geometry_reply_t *reply;
+   EGLBoolean result = EGL_TRUE;
+   xcb_generic_error_t *error;
+
+   cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
+   reply = xcb_get_geometry_reply (dri2_dpy->conn, cookie, );
+   if (reply == NULL)
+  return EGL_FALSE;
+
+   if (error != NULL) {
+  _eglLog(_EGL_WARNING, "error in xcb_get_geometry");
+  result = EGL_FALSE;
+  free(error);
+   } else {
+  surf->Width = reply->width;
+  surf->Height = reply->height;
+   }
+   free(reply);
+   return result;
+}
+
+/**
  * Process list of buffer received from the server
  *
  * Processes the list of buffers received in a reply from the server to either
@@ -1124,6 +1154,7 @@ static struct dri2_egl_display_vtbl dri2_x11_display_vtbl 
= {
.create_pixmap_surface = dri2_x11_create_pixmap_surface,
.create_pbuffer_surface = dri2_x11_create_pbuffer_surface,
.destroy_surface = dri2_x11_destroy_surface,
+   .sync_surface = dri2_x11_sync_surface,
.create_image = dri2_x11_create_image_khr,
.swap_interval = dri2_x11_swap_interval,
.swap_buffers = dri2_x11_swap_buffers,
diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h
index 710c5d8..5152b93 100644
--- a/src/egl/main/eglapi.h
+++ b/src/egl/main/eglapi.h
@@ -94,6 +94,8 @@ struct _egl_api
 const EGLint *attrib_list);
EGLBoolean (*DestroySurface)(_EGLDriver *drv, _EGLDisplay *dpy,
 _EGLSurface *surface);
+   EGLBoolean (*SyncSurface)(_EGLDriver *drv, _EGLDisplay *dpy,
+  _EGLSurface *surface);
EGLBoolean (*QuerySurface)(_EGLDriver *drv, _EGLDisplay *dpy,
   _EGLSurface *surface, EGLint attribute,

[Mesa-dev] [PATCH] new EGL internal SyncSurface hook

2016-12-19 Thread Tapani Pälli
Hi;

I've been investigating dEQP EGL bugs. Following test seems to fail now
and then and only on X11 when using DRI2:

   dEQP-EGL.functional.resize.surface_size.grow

This patch is RFC for a 'SyncSurface' API that will query surface
geometry (xcb_get_geometry) before returning current stored values in
'_eglQuerySurface'. Without this we might have wrong values in place,
I believe this is related to following comments in the source code ..

--- 8< ---
src/mesa/drivers/dri/i965/brw_context.c:

/* GLX uses DRI2 invalidate events to handle window resizing.
 * Unfortunately, EGL does not - libEGL is written in XCB (not Xlib),
 * which doesn't provide a mechanism for snooping the event queues.
 *
 * So EGL still relies on viewport hacks to handle window resizing.
 * This should go away with DRI3000.
 */

src/egl/drivers/dri2/platform_x11.c:

/* Since we aren't watching for the server's invalidate events like we're
 * supposed to (due to XCB providing no mechanism for filtering the events
 * the way xlib does), and SwapBuffers is a common cause of invalidate
 * events, just shove one down to the driver, even though we haven't told
 * the driver that we're the kind of loader that provides reliable
 * invalidate events.  This causes the driver to request buffers again at
 * its next draw, so that we get the correct buffers if a pageflip
 * happened.  The driver should still be using the viewport hack to catch
 * window resizes.
 */
--- 8< ---

This patch together with a dEQP change I've attached to bug #98327 makes
this test pass consistently for me on 2 machines, one is running Fedora
with DRI3 and another one Ubuntu with DRI2.

Is this OK approach? Should the name be something else like
SyncSurfaceGeometry or GetSurfaceGeometry?

Any comments appreciated;

Tapani Pälli (1):
  egl: syncronize surface information with driver RFC

 src/egl/drivers/dri2/egl_dri2.c | 12 
 src/egl/drivers/dri2/egl_dri2.h |  2 ++
 src/egl/drivers/dri2/platform_x11.c | 31 +++
 src/egl/main/eglapi.h   |  2 ++
 src/egl/main/eglfallbacks.c |  1 +
 src/egl/main/eglsurface.c   |  3 +++
 6 files changed, 51 insertions(+)

-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Drop bogus is_vertex_input from add_shader_variable().

2016-12-19 Thread Alejandro Piñeiro
lgtm

Reviewed-by: Alejandro Piñeiro 

On 17/12/16 03:46, Kenneth Graunke wrote:
> stage_mask is a bitmask of shader stages, so the proper comparison would
> be (1 << MESA_SHADER_VERTEX), not MESA_SHADER_VERTEX itself.
>
> But we only care for structure types, and VS inputs cannot be structs.
> So we can just drop this entirely.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/linker.cpp | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> Sorry, I forgot I split this into two patches...apply this one before
> [PATCH] glsl: Fix program interface queries relating to interface blocks.
>
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index 3660257..5066014 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -3735,10 +3735,6 @@ add_shader_variable(const struct gl_context *ctx,
>  bool use_implicit_location, int location,
>  const glsl_type *outermost_struct_type = NULL)
>  {
> -   const bool is_vertex_input =
> -  programInterface == GL_PROGRAM_INPUT &&
> -  stage_mask == MESA_SHADER_VERTEX;
> -
> switch (type->base_type) {
> case GLSL_TYPE_STRUCT: {
>/* The ARB_program_interface_query spec says:
> @@ -3764,8 +3760,7 @@ add_shader_variable(const struct gl_context *ctx,
>outermost_struct_type))
>  return false;
>  
> - field_location +=
> -field->type->count_attribute_slots(is_vertex_input);
> + field_location += field->type->count_attribute_slots(false);
>}
>return true;
> }

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac: Respect LLVM_CFLAGS in LLVM version detection

2016-12-19 Thread Emil Velikov
On 19 December 2016 at 08:52, Tomasz Figa  wrote:
> Hi Tobias,
>
> On Sat, Dec 17, 2016 at 2:15 AM, Tobias Droste  wrote:
>> Hi Tomasz,
>>
>> does this actually fix anything?
>>
>> Because right now llvm-config.h does not include anything and I doubt it will
>> in the future, as it's just a collection of defines.
>> The path to the header file itself is given by llvm-config 
>> ($LLVM_INCLUDEDIR).
>>
>> Did you just happen to see this or do you get an error without this patch?
>
> We happen to have the setup exactly as described in the patch
> description, i.e. LLVM in a non-standard directory and with
> llvm-config.h that includes another header. Without the patch
> ./configure fails because of LLVM version detection errors.
>
I believe you're spot on - be that due to missing include [directives]
or conditional header inclusion one would need to have the respective
CFLAGS. Thus the patch is
Reviewed-by: Emil Velikov 

At the same time - I seems to be lucky enough to have the headers sane
[neither includes nor defines are required] on my Arch setup.
Wondering if it's not something specific to the Android way of
building LLVM - IIRC the Android-x86 guys were explaining some
interesting things.
... but that for another day. I'll commit your patch once I
double-check Tobias' fixes.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Fix program interface queries relating to interface blocks.

2016-12-19 Thread Alejandro Piñeiro
On 17/12/16 17:57, Kenneth Graunke wrote:
> On Saturday, December 17, 2016 5:41:35 PM PST Alejandro Piñeiro wrote:
>> On 17/12/16 03:35, Kenneth Graunke wrote:
>>> This fixes 555 dEQP tests (using the nougat-cts-dev branch), Piglit's
>>> arb_program_interface_query/arb_program_interface_query-resource-query,
>> Somewhat offtopic: FWIW, piglit tests for arb_program_interface_query
>> doesn't tests too much the REFERENCED_BY_XXX property (see below). I
>> have a wip piglit tests locally that still fails. Mentioning this
>> because ...
>>
>>> and GL45-CTS.program_interface_query.separate-programs-{tess-control,
>>> tess-eval,geometry}.  Only one dEQP program interface failure remains.
>> ... as far as I see, just one CTS program interface query keeps failing
>> after your patch, and it is caused by some issues with
>> REFERENCED_BY_XXX. In any case, about this, now Im somewhat more
>> optimistic, after realizing (while testing your patch) that Ian Romanick
>> fixed GL45-CTS.geometry_shader.program_resource.program_resource, and
>> cleaned up referenced_by on the patch "linker: Accurately track
>> gl_uniform_block::stageref" (commit 084105). Will take a look to that
>> next Monday.
> Is the one remaining failure
>
> GL45-CTS.program_interface_query.uniform-block-types
>
> ...?  

Yes that one.

> I believe that one's fixed by Ian's series:
>
> https://lists.freedesktop.org/archives/mesa-dev/2016-December/138359.html

Oh neat! Yes, that series fixes that test, just tested.

Thanks!



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Fix program interface queries relating to interface blocks.

2016-12-19 Thread Alejandro Piñeiro
On 17/12/16 17:57, Kenneth Graunke wrote:
> On Saturday, December 17, 2016 5:41:35 PM PST Alejandro Piñeiro wrote:
>> On 17/12/16 03:35, Kenneth Graunke wrote:
>>> This fixes 555 dEQP tests (using the nougat-cts-dev branch), Piglit's
>>> arb_program_interface_query/arb_program_interface_query-resource-query,
>> Somewhat offtopic: FWIW, piglit tests for arb_program_interface_query
>> doesn't tests too much the REFERENCED_BY_XXX property (see below). I
>> have a wip piglit tests locally that still fails. Mentioning this
>> because ...
>>
>>> and GL45-CTS.program_interface_query.separate-programs-{tess-control,
>>> tess-eval,geometry}.  Only one dEQP program interface failure remains.
>> ... as far as I see, just one CTS program interface query keeps failing
>> after your patch, and it is caused by some issues with
>> REFERENCED_BY_XXX. In any case, about this, now Im somewhat more
>> optimistic, after realizing (while testing your patch) that Ian Romanick
>> fixed GL45-CTS.geometry_shader.program_resource.program_resource, and
>> cleaned up referenced_by on the patch "linker: Accurately track
>> gl_uniform_block::stageref" (commit 084105). Will take a look to that
>> next Monday.
> Is the one remaining failure
>
> GL45-CTS.program_interface_query.uniform-block-types

Yes I meant that test.

>
> ...?  I believe that one's fixed by Ian's series:
>
> https://lists.freedesktop.org/archives/mesa-dev/2016-December/138359.html

Oh neat, just made a quick test, and yes it fixes that test.

BR




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] V7 Loop unrolling in NIR

2016-12-19 Thread Timothy Arceri
On Mon, 2016-12-19 at 17:38 +1100, Timothy Arceri wrote:
> On Sun, 2016-12-18 at 21:58 -0800, Jason Ekstrand wrote:
> > On Dec 18, 2016 9:48 PM, "Timothy Arceri"  > .c
> > om> wrote:
> > V7:
> >  - paritally out of ssa in unrolling pass to avoid phis
> >  - lots of simplification/tidy ups in the analysis pass
> >  - if_uses bug fix (missing functionality) in lcssa fixed
> >  - better support for non trivial loop terminators
> >  - fixed all loop HURT except 1 that is too big to unroll.
> > 
> > total instructions in shared programs: 12584624 -> 12584621 (-
> > 0.00%)
> > instructions in affected programs: 68507 -> 68504 (-0.00%)
> > helped: 70
> > HURT: 170
> > 
> > total cycles in shared programs: 24146 -> 241476226 (-0.01%)
> > cycles in affected programs: 4060722 -> 4036952 (-0.59%)
> > helped: 1241
> > HURT: 1278
> > 
> > total loops in shared programs: 4245 -> 2948 (-30.55%)
> > loops in affected programs: 1535 -> 238 (-84.50%)
> > helped: 1142
> > HURT: 1
> > 
> > That is a *lot* of loops that we were leaving intact for no good
> > reason.  Any idea how many of those were because the glsl pass
> > didn't
> > know what to do weight them vs. the heuristic subtly changing?
> 

Oh whoops I need to learn how to read. Doing things the other way
around if I remove the instruction and iterations limit from the GLSL
IR pass I get ... a frozen PC for a few minutes processing various
shaders and eventually it crashes processing a deus ex shader.

Looking at the history is seems the heuristics might have more to do
with GLSL IR being slow than the actual output being too large.

commit be5f27a84d0d4efb57071d9d7ecda061223d03ef
Author: Eric Anholt 
Date:   Tue Feb 21 13:37:49 2012 -0800

glsl: Refine the loop instruction counting.

Before, we were only counting top-level instructions.  But if we
have
an assignment of a giant expression tree (such as the ones
eventually
generated by glsl-fs-unroll), we were counting the same as an
assignment of a variable deref.

glsl-fs-unroll-explosion now fails in a reasonable amount of time
on
i965 because the unrolling didn't go ridiculously far.

Reviewed-by: Kenneth Graunke 

commit 67007080b716c7e51039a381f407ababd68230f7
Author: Mathias Fröhlich 
Date:   Wed Jan 25 17:35:01 2012 +0100

glsl: Avoid excessive loop unrolling.

Avoid unrollong loops that are either nested loops or
where the loop body times the unroll count is huge.

The change is far from being perfect but it extends the
loop unrolling decision heuristic by some additional
safeguard. In particular this cuts down compilation of
a shader precomputing atmospheric scattering integral
tables containing two nesting levels in a loop from
something way beyond some minutes (I never waited for
it to finish) to some fractions of a second.

This fixes piglit tests glsl-fs-unroll-explosion and
glsl-vs-unroll-explosion on r600g.

Reviewed-by: Eric Anholt 
Signed-off-by: Mathias Fröhlich 

> Are you sending html emails lately? Your comments have been
> indistinguishable from the original email in a number of emails
> lately.
> 
> I disable instruction/iteration unrolling restrictions and it helped
> on
> ly 13 loops.
> 
> For the record V7 unrolls around 500 more loops than V6. I think
> handling phis as a conditional might give us another good jump. I
> would
> need to dig a bit deeper to be sure why so many loops remain.
> 
> I'm happy to work on improving things further once we land the
> initial
> version.
> 
> > 
> > 
> > LOST:   26
> > GAINED: 16
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac: Respect LLVM_CFLAGS in LLVM version detection

2016-12-19 Thread Tomasz Figa
Hi Tobias,

On Sat, Dec 17, 2016 at 2:15 AM, Tobias Droste  wrote:
> Hi Tomasz,
>
> does this actually fix anything?
>
> Because right now llvm-config.h does not include anything and I doubt it will
> in the future, as it's just a collection of defines.
> The path to the header file itself is given by llvm-config ($LLVM_INCLUDEDIR).
>
> Did you just happen to see this or do you get an error without this patch?

We happen to have the setup exactly as described in the patch
description, i.e. LLVM in a non-standard directory and with
llvm-config.h that includes another header. Without the patch
./configure fails because of LLVM version detection errors.

Best regards,
Tomasz
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/6] cso: Optimize cso_save/restore_fragment_samplers

2016-12-19 Thread Nicolai Hähnle

On 19.12.2016 09:11, Michel Dänzer wrote:

On 16/12/16 08:22 PM, Nicolai Hähnle wrote:

On 16.12.2016 10:52, Michel Dänzer wrote:

From: Michel Dänzer 

Only copy/memset the pointers that actually need to be.

Signed-off-by: Michel Dänzer 
---
 src/gallium/auxiliary/cso_cache/cso_context.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/cso_cache/cso_context.c
b/src/gallium/auxiliary/cso_cache/cso_context.c
index f52969d366..03f78eea53 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.c
+++ b/src/gallium/auxiliary/cso_cache/cso_context.c
@@ -1267,8 +1267,10 @@ cso_save_fragment_samplers(struct cso_context
*ctx)
struct sampler_info *saved = >fragment_samplers_saved;

saved->nr_samplers = info->nr_samplers;
-   memcpy(saved->cso_samplers, info->cso_samplers,
sizeof(info->cso_samplers));
-   memcpy(saved->samplers, info->samplers, sizeof(info->samplers));
+   memcpy(saved->cso_samplers, info->cso_samplers, info->nr_samplers *
+  sizeof(*info->cso_samplers));
+   memcpy(saved->samplers, info->samplers, info->nr_samplers *
+  sizeof(*info->samplers));
 }


@@ -1277,9 +1279,20 @@ cso_restore_fragment_samplers(struct
cso_context *ctx)
 {
struct sampler_info *info = >samplers[PIPE_SHADER_FRAGMENT];
struct sampler_info *saved = >fragment_samplers_saved;
+   int delta = info->nr_samplers - saved->nr_samplers;
+
+   memcpy(info->cso_samplers, saved->cso_samplers,
+  saved->nr_samplers * sizeof(*info->cso_samplers));
+   memcpy(info->samplers, saved->samplers,
+  saved->nr_samplers * sizeof(*info->samplers));
+
+   if (delta > 0) {


I'd be more comfortable with an explicit info->nr_samplers >
saved->nr_samplers given that nr_samplers is an unsigned.


How about:

   int delta = (int)info->nr_samplers - saved->nr_samplers;

This generates identical code as without the cast, whereas with the
explicit comparison the ELF text section grows by 48 bytes.


Yes, that sounds good.







Apart from that, patches 1-3:

Reviewed-by: Nicolai Hähnle 


Thanks!



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/6] cso: Optimize cso_save/restore_fragment_samplers

2016-12-19 Thread Michel Dänzer
On 16/12/16 08:22 PM, Nicolai Hähnle wrote:
> On 16.12.2016 10:52, Michel Dänzer wrote:
>> From: Michel Dänzer 
>>
>> Only copy/memset the pointers that actually need to be.
>>
>> Signed-off-by: Michel Dänzer 
>> ---
>>  src/gallium/auxiliary/cso_cache/cso_context.c | 21 +
>>  1 file changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/cso_cache/cso_context.c
>> b/src/gallium/auxiliary/cso_cache/cso_context.c
>> index f52969d366..03f78eea53 100644
>> --- a/src/gallium/auxiliary/cso_cache/cso_context.c
>> +++ b/src/gallium/auxiliary/cso_cache/cso_context.c
>> @@ -1267,8 +1267,10 @@ cso_save_fragment_samplers(struct cso_context
>> *ctx)
>> struct sampler_info *saved = >fragment_samplers_saved;
>>
>> saved->nr_samplers = info->nr_samplers;
>> -   memcpy(saved->cso_samplers, info->cso_samplers,
>> sizeof(info->cso_samplers));
>> -   memcpy(saved->samplers, info->samplers, sizeof(info->samplers));
>> +   memcpy(saved->cso_samplers, info->cso_samplers, info->nr_samplers *
>> +  sizeof(*info->cso_samplers));
>> +   memcpy(saved->samplers, info->samplers, info->nr_samplers *
>> +  sizeof(*info->samplers));
>>  }
>>
>>
>> @@ -1277,9 +1279,20 @@ cso_restore_fragment_samplers(struct
>> cso_context *ctx)
>>  {
>> struct sampler_info *info = >samplers[PIPE_SHADER_FRAGMENT];
>> struct sampler_info *saved = >fragment_samplers_saved;
>> +   int delta = info->nr_samplers - saved->nr_samplers;
>> +
>> +   memcpy(info->cso_samplers, saved->cso_samplers,
>> +  saved->nr_samplers * sizeof(*info->cso_samplers));
>> +   memcpy(info->samplers, saved->samplers,
>> +  saved->nr_samplers * sizeof(*info->samplers));
>> +
>> +   if (delta > 0) {
> 
> I'd be more comfortable with an explicit info->nr_samplers >
> saved->nr_samplers given that nr_samplers is an unsigned.

How about:

   int delta = (int)info->nr_samplers - saved->nr_samplers;

This generates identical code as without the cast, whereas with the
explicit comparison the ELF text section grows by 48 bytes.


> Apart from that, patches 1-3:
> 
> Reviewed-by: Nicolai Hähnle 

Thanks!


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 000/103] i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0

2016-12-19 Thread Samuel Iglesias Gonsálvez
Hello Matt,

We have done most of the suggestions you made to our patches. However,
we have replied to some of your questions/suggestions and we are
waiting for a reply before marking them as R-b or not.

You can clone the new version of the patch series by running this
command:

$ git clone -b i965-fp64-gen7-scalar-vec4-rc3 https://github.com/Igalia
/mesa.git

Below is the list of patches that need a R-b (they are marked as
UNREVIEWED in the branch).

* i965/vec4: implement hardware workaround for align16 double to float
conversion
> 
>   This always seemed like a really strange hardware bug, and
one
>   that no one should ever hit.
> 
>   I'd prefer that, instead of loading an immediate double and
> then
>   performing a conversion to float, that we just convert the
>   double to float in the compiler and emit an instruction to
load
>   that.
> 

  We have done this. Does this change get your R-b?

* i965/vec4: fix optimize predicate for doubles

  We have replied here [0].

* i965/vec4: handle 32 and 64 bit channels in liveness analysis

  It is still unreviewed. Maybe Curro can take a look at it.

* i965/vec4: add a SIMD lowering pass

  Replied here [1].

* i965/vec4: Prevent copy propagation from violating pre-gen8
restrictions

  Replied here [1].

* i965/vec4: run scalarize_df() after spilling

  Replied here [1].

* i965/gen7: expose OpenGL 4.0 on Haswell

  We are currently discussing it with Curro :-)

Thanks!

Sam

[0] https://lists.freedesktop.org/archives/mesa-dev/2016-December/13815
1.html
[1] https://lists.freedesktop.org/archives/mesa-dev/2016-December/13816
0.html

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev