Mesa (master): broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 0024b77e876017b559b76d816d40a2abbd9a0ea1
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0024b77e876017b559b76d816d40a2abbd9a0ea1

Author: Eric Anholt 
Date:   Mon Mar 26 12:39:12 2018 -0700

broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.

This is the actual hardware layout, and we were only swizzling R/B back
around in texturing.  Fixes part of
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in
simulation.
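
Illustrative sketch, not part of the commit: the swizzle column of the format table selects which stored channel feeds each of R, G, B and A. The helpers below use hypothetical names (`enum swz`, `apply_swizzle`, `pack_r10g10b10a2_uint`), and the bit layout with R in the low bits is an assumption based on the format name. They show that ZYXW exchanges R and B while XYZW leaves the channels alone, so a ZYXW entry is only appropriate when the stored layout really is BGR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical channel selectors, named after the SWIZ_* column. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W };

/* out[i] receives the input channel selected by swizzle[i]. */
static void
apply_swizzle(const uint32_t in[4], const enum swz swizzle[4],
              uint32_t out[4])
{
        for (int i = 0; i < 4; i++) {
                assert(swizzle[i] <= SWZ_W);
                out[i] = in[swizzle[i]];
        }
}

/* Assumed bit layout: R in bits 0-9, G in 10-19, B in 20-29, A in 30-31. */
static uint32_t
pack_r10g10b10a2_uint(const uint32_t rgba[4])
{
        return (rgba[0] & 0x3ff) |
               ((rgba[1] & 0x3ff) << 10) |
               ((rgba[2] & 0x3ff) << 20) |
               ((rgba[3] & 0x3) << 30);
}
```

Applying `{SWZ_Z, SWZ_Y, SWZ_X, SWZ_W}` to the channels (1, 2, 3, 0) yields (3, 2, 1, 0), so the packed word differs from the one produced with the identity swizzle.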

---

 src/gallium/drivers/vc5/v3dx_format_table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc5/v3dx_format_table.c b/src/gallium/drivers/vc5/v3dx_format_table.c
index 884f7373a1..4aaf0ecd3d 100644
--- a/src/gallium/drivers/vc5/v3dx_format_table.c
+++ b/src/gallium/drivers/vc5/v3dx_format_table.c
@@ -68,7 +68,7 @@ static const struct vc5_format format_table[] = {
 FORMAT(R8G8B8A8_SNORM,NO,   RGBA8_SNORM, SWIZ_XYZW, 16, 0),
 FORMAT(R8G8B8X8_SNORM,NO,   RGBA8_SNORM, SWIZ_XYZ1, 16, 0),
 FORMAT(R10G10B10A2_UNORM, RGB10_A2, RGB10_A2,SWIZ_XYZW, 16, 0),
-FORMAT(B10G10R10A2_UINT,  RGB10_A2UI,   RGB10_A2UI,  SWIZ_ZYXW, 16, 0),
+FORMAT(R10G10B10A2_UINT,  RGB10_A2UI,   RGB10_A2UI,  SWIZ_XYZW, 16, 0),
 
 FORMAT(A4B4G4R4_UNORM,ABGR, RGBA4,   SWIZ_XYZW, 16, 0),
 

___
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit


Mesa (master): st: Allow accelerated CopyTexImage from RGBA to RGB.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: d491ad1d364afa60eef5cf7b45f69f7007ab3dfd
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=d491ad1d364afa60eef5cf7b45f69f7007ab3dfd

Author: Eric Anholt 
Date:   Wed Mar 21 11:43:28 2018 -0700

st: Allow accelerated CopyTexImage from RGBA to RGB.

There's nothing to worry about here -- the A channel just gets dropped by
the blit.  This avoids a segfault in the fallback path when copying from a
RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an
RGBA16_SINT texture (the fallback path tries to get/fetch to float
buffers, but the float pack/unpack functions are NULL for SINT/UINT).
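
Illustrative sketch, not part of the commit: the rule described above can be modelled as a predicate over "requested" and "allocated" base formats. The types and names here (`enum base_format`, `struct surface`, `can_copy_using_blit`) are hypothetical simplifications; the real helper works on `gl_texture_image` and `gl_renderbuffer`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL base formats involved. */
enum base_format { BASE_ALPHA, BASE_RGB, BASE_RGBA };

/* A surface: what the API asked for vs. what the driver allocated. */
struct surface {
        enum base_format requested;
        enum base_format allocated;
};

static bool
can_copy_using_blit(const struct surface *dst, const struct surface *src)
{
        assert(dst && src);

        /* Destination: a mismatch is only acceptable for RGB stored as
         * RGBA, where the alpha channel is just being dropped.
         */
        if (dst->requested != dst->allocated &&
            !(dst->requested == BASE_RGB && dst->allocated == BASE_RGBA))
                return false;

        /* Source: any mismatch means some channels hold undefined data. */
        return src->requested == src->allocated;
}
```

With this model, an RGB destination backed by RGBA storage accepts a blit from a true RGBA source, while an ALPHA destination backed by RGBA storage still takes the fallback path.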

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5.

v2: Extract the logic to a helper function and explain what's going on
better.
v3: const-qualify args

Reviewed-by: Brian Paul 

---

 src/mesa/state_tracker/st_cb_texture.c | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
index 6345ead639..3a793a7265 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -2281,6 +2281,31 @@ fallback_copy_texsubimage(struct gl_context *ctx,
pipe->transfer_unmap(pipe, src_trans);
 }
 
+static bool
+st_can_copyteximage_using_blit(const struct gl_texture_image *texImage,
+   const struct gl_renderbuffer *rb)
+{
+   GLenum tex_baseformat = _mesa_get_format_base_format(texImage->TexFormat);
+
+   /* We don't blit to a teximage where the GL base format doesn't match the
+* texture's chosen format, except in the case of a GL_RGB texture
+* represented with GL_RGBA (where the alpha channel is just being
+* dropped).
+*/
+   if (texImage->_BaseFormat != tex_baseformat &&
+   ((texImage->_BaseFormat != GL_RGB || tex_baseformat != GL_RGBA))) {
+  return false;
+   }
+
+   /* We can't blit from a RB where the GL base format doesn't match the RB's
+* chosen format (for example, GL RGB or ALPHA with rb->Format of an RGBA
+* type, because the other channels will be undefined).
+*/
+   if (rb->_BaseFormat != _mesa_get_format_base_format(rb->Format))
+  return false;
+
+   return true;
+}
 
 /**
  * Do a CopyTex[Sub]Image1/2/3D() using a hardware (blit) path if possible.
@@ -2324,12 +2349,7 @@ st_CopyTexSubImage(struct gl_context *ctx, GLuint dims,
   goto fallback;
}
 
-   /* The base internal format must match the mesa format, so make sure
-* e.g. an RGB internal format is really allocated as RGB and not as RGBA.
-*/
-   if (texImage->_BaseFormat !=
-   _mesa_get_format_base_format(texImage->TexFormat) ||
-   rb->_BaseFormat != _mesa_get_format_base_format(rb->Format)) {
+   if (!st_can_copyteximage_using_blit(texImage, rb)) {
   goto fallback;
}
 



Mesa (master): broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: ef2cf9cc3c1a4bc96fcc46eb623768a400c3d68d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=ef2cf9cc3c1a4bc96fcc46eb623768a400c3d68d

Author: Eric Anholt 
Date:   Fri Mar 23 15:28:40 2018 -0700

broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.

The next job from this client will turn it back on unless TF gets
disabled, but we don't want the state to leak from this client to another
(which causes GPU hangs).
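
Illustrative sketch, not part of the commit: a toy model of why the epilogue has to undo the enable. `struct hw_state`, `struct job` and both functions are hypothetical; the real driver emits a TRANSFORM_FEEDBACK_SPECS packet rather than writing a flag directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of GPU state that survives from one job to the next. */
struct hw_state {
        bool tf_enabled;
};

struct job {
        /* Set once this job has emitted a packet enabling TF. */
        bool tf_enabled;
};

static void
job_enable_tf(struct hw_state *hw, struct job *job)
{
        assert(hw && job);
        hw->tf_enabled = true;
        job->tf_enabled = true;
}

/* Runs at the end of every job: undo the enable so that the next job,
 * possibly from another client, starts with TF off.
 */
static void
job_epilogue(struct hw_state *hw, const struct job *job)
{
        assert(hw && job);
        if (job->tf_enabled)
                hw->tf_enabled = false;
}
```

Tracking the flag per job keeps the epilogue cheap: jobs that never enabled TF emit nothing extra.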

---

 src/gallium/drivers/vc5/v3dx_job.c| 21 +++--
 src/gallium/drivers/vc5/vc5_context.h |  6 ++
 src/gallium/drivers/vc5/vc5_emit.c|  7 ---
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/vc5/v3dx_job.c b/src/gallium/drivers/vc5/v3dx_job.c
index d4b0adfea0..ca3831c75b 100644
--- a/src/gallium/drivers/vc5/v3dx_job.c
+++ b/src/gallium/drivers/vc5/v3dx_job.c
@@ -33,8 +33,12 @@
 void v3dX(bcl_epilogue)(struct vc5_context *vc5, struct vc5_job *job)
 {
 vc5_cl_ensure_space_with_branch(&job->bcl,
-7 +
-cl_packet_length(OCCLUSION_QUERY_COUNTER));
+cl_packet_length(OCCLUSION_QUERY_COUNTER) +
+#if V3D_VERSION >= 41
+cl_packet_length(TRANSFORM_FEEDBACK_SPECS) +
+#endif
+cl_packet_length(INCREMENT_SEMAPHORE) +
+cl_packet_length(FLUSH_ALL_STATE));
 
 if (job->oq_enabled) {
 /* Disable the OQ at the end of the CL, so that the
@@ -44,6 +48,19 @@ void v3dX(bcl_epilogue)(struct vc5_context *vc5, struct vc5_job *job)
 cl_emit(&job->bcl, OCCLUSION_QUERY_COUNTER, counter);
 }
 
+/* Disable TF at the end of the CL, so that the next job to be
+ * run doesn't start out trying to write TF primitives.  On
+ * V3D 3.x, it's only the TF primitive mode that triggers TF
+ * writes.
+ */
+#if V3D_VERSION >= 41
+if (job->tf_enabled) {
+cl_emit(&job->bcl, TRANSFORM_FEEDBACK_SPECS, tfe) {
+tfe.enable = false;
+};
+}
+#endif /* V3D_VERSION >= 41 */
+
 /* Increment the semaphore indicating that binning is done and
  * unblocking the render thread.  Note that this doesn't act
  * until the FLUSH completes.
diff --git a/src/gallium/drivers/vc5/vc5_context.h b/src/gallium/drivers/vc5/vc5_context.h
index 7272e045c4..f6ed91c27a 100644
--- a/src/gallium/drivers/vc5/vc5_context.h
+++ b/src/gallium/drivers/vc5/vc5_context.h
@@ -294,6 +294,12 @@ struct vc5_job {
  */
 bool oq_enabled;
 
+/**
+ * Set when a packet enabling TF on all further primitives has been
+ * emitted.
+ */
+bool tf_enabled;
+
 bool uses_early_z;
 
 /**
diff --git a/src/gallium/drivers/vc5/vc5_emit.c b/src/gallium/drivers/vc5/vc5_emit.c
index 061d6e7c9d..a98fd037d0 100644
--- a/src/gallium/drivers/vc5/vc5_emit.c
+++ b/src/gallium/drivers/vc5/vc5_emit.c
@@ -585,12 +585,13 @@ v3dX(emit_state)(struct pipe_context *pctx)
   vc5->prog.bind_vs->tf_specs);
 
 #if V3D_VERSION >= 40
+job->tf_enabled = (vc5->prog.bind_vs->num_tf_specs != 0 &&
+   vc5->active_queries);
+
 cl_emit(&job->bcl, TRANSFORM_FEEDBACK_SPECS, tfe) {
 tfe.number_of_16_bit_output_data_specs_following =
 vc5->prog.bind_vs->num_tf_specs;
-tfe.enable =
-(vc5->prog.bind_vs->num_tf_specs != 0 &&
- vc5->active_queries);
+tfe.enable = job->tf_enabled;
 };
 #else /* V3D_VERSION < 40 */
 cl_emit(&job->bcl, TRANSFORM_FEEDBACK_ENABLE, tfe) {



Mesa (master): broadcom/vc5: Move the BCL epilogue code to a per-version compile.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 1fa820cef861b0f2efd001cfb3c4adecf2fa549b
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=1fa820cef861b0f2efd001cfb3c4adecf2fa549b

Author: Eric Anholt 
Date:   Fri Mar 23 15:19:05 2018 -0700

broadcom/vc5: Move the BCL epilogue code to a per-version compile.

I need to do some new packets for transform feedback on 4.1.

---

 src/gallium/drivers/vc5/Makefile.sources |  1 +
 src/gallium/drivers/vc5/meson.build  |  1 +
 src/gallium/drivers/vc5/v3dx_context.h   |  2 ++
 src/gallium/drivers/vc5/v3dx_job.c   | 59 
 src/gallium/drivers/vc5/vc5_job.c| 28 +++
 5 files changed, 67 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/vc5/Makefile.sources b/src/gallium/drivers/vc5/Makefile.sources
index 0259ecc99b..c1e4e0b023 100644
--- a/src/gallium/drivers/vc5/Makefile.sources
+++ b/src/gallium/drivers/vc5/Makefile.sources
@@ -28,6 +28,7 @@ C_SOURCES := \
 VC5_PER_VERSION_SOURCES = \
v3dx_context.h \
v3dx_format_table.c \
+   v3dx_job.c \
v3dx_simulator.c \
vc5_draw.c \
vc5_emit.c \
diff --git a/src/gallium/drivers/vc5/meson.build b/src/gallium/drivers/vc5/meson.build
index 005bf2f9b8..4f20c2697e 100644
--- a/src/gallium/drivers/vc5/meson.build
+++ b/src/gallium/drivers/vc5/meson.build
@@ -44,6 +44,7 @@ files_libvc5 = files(
 
 files_per_version = files(
   'v3dx_format_table.c',
+  'v3dx_job.c',
   'v3dx_simulator.c',
   'vc5_draw.c',
   'vc5_emit.c',
diff --git a/src/gallium/drivers/vc5/v3dx_context.h b/src/gallium/drivers/vc5/v3dx_context.h
index addc7433b3..f9edd1c636 100644
--- a/src/gallium/drivers/vc5/v3dx_context.h
+++ b/src/gallium/drivers/vc5/v3dx_context.h
@@ -34,6 +34,8 @@ void v3dX(emit_rcl)(struct vc5_job *job);
 void v3dX(draw_init)(struct pipe_context *pctx);
 void v3dX(state_init)(struct pipe_context *pctx);
 
+void v3dX(bcl_epilogue)(struct vc5_context *vc5, struct vc5_job *job);
+
 void v3dX(simulator_init_regs)(struct v3d_hw *v3d);
 int v3dX(simulator_get_param_ioctl)(struct v3d_hw *v3d,
 struct drm_vc5_get_param *args);
diff --git a/src/gallium/drivers/vc5/v3dx_job.c b/src/gallium/drivers/vc5/v3dx_job.c
new file mode 100644
index 0000000000..d4b0adfea0
--- /dev/null
+++ b/src/gallium/drivers/vc5/v3dx_job.c
@@ -0,0 +1,59 @@
+/*
+ * Copyright © 2014-2017 Broadcom
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/** @file v3dx_job.c
+ *
+ * V3D version-specific functions for submitting VC5 render jobs to the
+ * kernel.
+ */
+
+#include "vc5_context.h"
+#include "broadcom/cle/v3dx_pack.h"
+
+void v3dX(bcl_epilogue)(struct vc5_context *vc5, struct vc5_job *job)
+{
+vc5_cl_ensure_space_with_branch(&job->bcl,
+7 +
+cl_packet_length(OCCLUSION_QUERY_COUNTER));
+
+if (job->oq_enabled) {
+/* Disable the OQ at the end of the CL, so that the
+ * draw calls at the start of the CL don't inherit the
+ * OQ counter.
+ */
+cl_emit(&job->bcl, OCCLUSION_QUERY_COUNTER, counter);
+}
+
+/* Increment the semaphore indicating that binning is done and
+ * unblocking the render thread.  Note that this doesn't act
+ * until the FLUSH completes.
+ */
+cl_emit(&job->bcl, INCREMENT_SEMAPHORE, incr);
+
+/* The FLUSH_ALL emits any unwritten state changes in each
+ * tile.  We can use this to reset any state that needs to be
+ * present at the start of the next tile, as we do with
+ * OCCLUSION_QUERY_COUNTER above.
+ */
+cl_emit(

Mesa (master): broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 1bf466270d416643e8fcacd6b790e53660303059
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=1bf466270d416643e8fcacd6b790e53660303059

Author: Eric Anholt 
Date:   Fri Mar 23 16:18:02 2018 -0700

broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.

Once we've disabled EZ for some draws, we need to not use EZ on future
draws.  Implementing that made implementing the GT/GE direction trivial.
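
Illustrative sketch, not part of the commit: the per-draw update can be viewed as a small state merge. The version below is a standalone simplification with hypothetical names (`enum ez_state`, `merge_ez_state`); it returns the new job state instead of mutating a job, and it leaves out the tracking of the first decided state.

```c
#include <assert.h>
#include <stdbool.h>

enum ez_state { EZ_UNDECIDED = 0, EZ_GT_GE, EZ_LT_LE, EZ_DISABLED };

/* Combine the job's current EZ state with what the next draw wants. */
static enum ez_state
merge_ez_state(enum ez_state job_state, enum ez_state draw_state,
               bool fs_writes_z)
{
        /* A shader that writes Z may move depth against the EZ direction. */
        if (fs_writes_z)
                return EZ_DISABLED;

        switch (draw_state) {
        case EZ_UNDECIDED:
                /* No preference: keep whatever the job already has. */
                return job_state;
        case EZ_GT_GE:
        case EZ_LT_LE:
                /* First direction wins; a conflicting one disables EZ. */
                if (job_state == EZ_UNDECIDED)
                        return draw_state;
                return job_state == draw_state ? job_state : EZ_DISABLED;
        case EZ_DISABLED:
                return EZ_DISABLED;
        }

        assert(!"unknown EZ state");
        return EZ_DISABLED;
}
```

Note that EZ_DISABLED is absorbing: once a job reaches it, no later draw can bring EZ back, which is the behaviour the commit message asks for.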

Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.

---

 src/gallium/drivers/vc5/vc5_context.h | 20 +--
 src/gallium/drivers/vc5/vc5_draw.c| 47 ---
 src/gallium/drivers/vc5/vc5_emit.c|  9 ---
 src/gallium/drivers/vc5/vc5_rcl.c | 16 +++-
 src/gallium/drivers/vc5/vc5_state.c   | 40 -
 5 files changed, 111 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/vc5/vc5_context.h b/src/gallium/drivers/vc5/vc5_context.h
index f6ed91c27a..f61c37ba92 100644
--- a/src/gallium/drivers/vc5/vc5_context.h
+++ b/src/gallium/drivers/vc5/vc5_context.h
@@ -199,6 +199,13 @@ struct vc5_job_key {
 struct pipe_surface *zsbuf;
 };
 
+enum vc5_ez_state {
+VC5_EZ_UNDECIDED = 0,
+VC5_EZ_GT_GE,
+VC5_EZ_LT_LE,
+VC5_EZ_DISABLED,
+};
+
 /**
  * A complete bin/render job.
  *
@@ -300,7 +307,16 @@ struct vc5_job {
  */
 bool tf_enabled;
 
-bool uses_early_z;
+/**
+ * Current EZ state for drawing. Updated at the start of draw after
+ * we've decided on the shader being rendered.
+ */
+enum vc5_ez_state ez_state;
+/**
+ * The first EZ state that was used for drawing with a decided EZ
+ * direction (so either UNDECIDED, GT, or LT).
+ */
+enum vc5_ez_state first_ez_state;
 
 /**
  * Number of draw calls (not counting full buffer clears) queued in
@@ -429,7 +445,7 @@ struct vc5_rasterizer_state {
 struct vc5_depth_stencil_alpha_state {
 struct pipe_depth_stencil_alpha_state base;
 
-bool early_z_enable;
+enum vc5_ez_state ez_state;
 
 /** Uniforms for stencil state.
  *
diff --git a/src/gallium/drivers/vc5/vc5_draw.c b/src/gallium/drivers/vc5/vc5_draw.c
index 7a409c14d4..25f4883be2 100644
--- a/src/gallium/drivers/vc5/vc5_draw.c
+++ b/src/gallium/drivers/vc5/vc5_draw.c
@@ -327,6 +327,49 @@ vc5_tf_statistics_record(struct vc5_context *vc5,
 }
 
 static void
+vc5_update_job_ez(struct vc5_context *vc5, struct vc5_job *job)
+{
+switch (vc5->zsa->ez_state) {
+case VC5_EZ_UNDECIDED:
+/* If the Z/S state didn't pick a direction but didn't
+ * disable, then go along with the current EZ state.  This
+ * allows EZ optimization for Z func == EQUAL or NEVER.
+ */
+break;
+
+case VC5_EZ_LT_LE:
+case VC5_EZ_GT_GE:
+/* If the Z/S state picked a direction, then it needs to match
+ * the current direction if we've decided on one.
+ */
+if (job->ez_state == VC5_EZ_UNDECIDED)
+job->ez_state = vc5->zsa->ez_state;
+else if (job->ez_state != vc5->zsa->ez_state)
+job->ez_state = VC5_EZ_DISABLED;
+break;
+
+case VC5_EZ_DISABLED:
+/* If the current Z/S state disables EZ because of a bad Z
+ * func or stencil operation, then we can't do any more EZ in
+ * this frame.
+ */
+job->ez_state = VC5_EZ_DISABLED;
+break;
+}
+
+/* If the FS affects the Z of the pixels, then it may update against
+ * the chosen EZ direction (though we could use
+ * ARB_conservative_depth's hints to avoid this)
+ */
+if (vc5->prog.fs->prog_data.fs->writes_z) {
+job->ez_state = VC5_EZ_DISABLED;
+}
+
+if (job->first_ez_state == VC5_EZ_UNDECIDED)
+job->first_ez_state = job->ez_state;
+}
+
+static void
 vc5_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info)
 {
 struct vc5_context *vc5 = vc5_context(pctx);
@@ -384,6 +427,7 @@ vc5_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info)
 
 vc5_start_draw(vc5);
 vc5_update_compiled_shaders(vc5, info->mode);
+vc5_update_job_ez(vc5, job);
 
 #if V3D_VERSION >= 41
 v3d41_emit_state(pctx);
@@ -515,9 +559,6 @@ vc5_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info)
 if (vc5->zsa->base.depth.enabled) {
 job->resolve |= PIPE_CLEAR_DEPTH;
 rsc->initialized_buffers = PIPE_CLEAR_DEPTH;
-
-if (vc5->zsa->early_z_enable)
-job->uses_e

Mesa (master): broadcom/vc5: Implement workaround for GFXH-1431.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 494da6c2dd8f0b570693f7611b58be11061224e0
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=494da6c2dd8f0b570693f7611b58be11061224e0

Author: Eric Anholt 
Date:   Mon Mar 26 10:38:28 2018 -0700

broadcom/vc5: Implement workaround for GFXH-1431.

This should fix some blending errors, but doesn't impact any testcases in
the CTS.
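
Illustrative sketch, not part of the commit: the workaround amounts to widening the condition under which the blend constant color is re-emitted. The names below (`struct ctx`, `DIRTY_*`, `needs_blend_color_emit`) are hypothetical, and the version check is passed as a runtime argument here, whereas the driver uses the compile-time `V3D_VERSION`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits; the driver's real flags are VC5_DIRTY_*. */
#define DIRTY_BLEND       (1u << 0)
#define DIRTY_BLEND_COLOR (1u << 1)

struct ctx {
        uint32_t dirty;
};

/* Re-emit the constant color when it changed, or when the blend config
 * was rewritten on hardware where that write resets the constant color.
 */
static bool
needs_blend_color_emit(const struct ctx *ctx, int v3d_version)
{
        assert(ctx);
        return (ctx->dirty & DIRTY_BLEND_COLOR) ||
               (v3d_version < 41 && (ctx->dirty & DIRTY_BLEND));
}
```

On version 41 and later only a change to the color itself triggers the packet, so newer hardware pays no extra cost.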

---

 src/gallium/drivers/vc5/vc5_emit.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc5/vc5_emit.c b/src/gallium/drivers/vc5/vc5_emit.c
index 71f508c9ee..deb46228da 100644
--- a/src/gallium/drivers/vc5/vc5_emit.c
+++ b/src/gallium/drivers/vc5/vc5_emit.c
@@ -490,7 +490,11 @@ v3dX(emit_state)(struct pipe_context *pctx)
 }
 }
 
-if (vc5->dirty & VC5_DIRTY_BLEND_COLOR) {
+/* GFXH-1431: On V3D 3.x, writing BLEND_CONFIG resets the constant
+ * color.
+ */
+if (vc5->dirty & VC5_DIRTY_BLEND_COLOR ||
+(V3D_VERSION < 41 && (vc5->dirty & VC5_DIRTY_BLEND))) {
 cl_emit(&job->bcl, BLEND_CONSTANT_COLOUR, colour) {
 colour.red_f16 = (vc5->swap_color_rb ?
   vc5->blend_color.hf[2] :



Mesa (master): broadcom/vc5: Limit each transform feedback data spec to 16 dwords.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 9e62aec9cd4853016b4d03a56b5756111a312d65
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=9e62aec9cd4853016b4d03a56b5756111a312d65

Author: Eric Anholt 
Date:   Wed Mar 21 15:18:34 2018 -0700

broadcom/vc5: Limit each transform feedback data spec to 16 dwords.

The length-1 field only has 4 bits, so we need to generate separate specs
when there's too much TF output per buffer.
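
Illustrative sketch, not part of the commit: a 4-bit "length minus one" field can describe at most 16 dwords, so a longer run of outputs has to be cut into several specs. The standalone version below uses hypothetical names (`struct tf_spec`, `split_tf_output`) and a plain struct in place of the packed hardware spec.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TF_SPECS        16
#define MAX_DWORDS_PER_SPEC 16 /* limit of a 4-bit "length - 1" field */

/* Hypothetical unpacked spec: a run of consecutive VPM values. */
struct tf_spec {
        uint32_t first_value;
        uint32_t count;
};

/* Append specs covering [vpm_start, vpm_start + vpm_size) and return the
 * new number of specs in the array.
 */
static uint32_t
split_tf_output(uint32_t vpm_start, uint32_t vpm_size,
                struct tf_spec specs[MAX_TF_SPECS], uint32_t num_specs)
{
        while (vpm_size) {
                uint32_t write_size = vpm_size < MAX_DWORDS_PER_SPEC ?
                                      vpm_size : MAX_DWORDS_PER_SPEC;

                assert(num_specs < MAX_TF_SPECS);
                specs[num_specs].first_value = vpm_start;
                specs[num_specs].count = write_size;
                num_specs++;

                vpm_start += write_size;
                vpm_size -= write_size;
        }

        return num_specs;
}
```

For example, 40 dwords starting at offset 6 become three specs: 16 at offset 6, 16 at offset 22 and 8 at offset 38.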

Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type
and transform_feedback_max_interleaved.

---

 src/gallium/drivers/vc5/vc5_context.h |  2 +-
 src/gallium/drivers/vc5/vc5_program.c | 43 ---
 2 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/vc5/vc5_context.h b/src/gallium/drivers/vc5/vc5_context.h
index 1ab5a6b153..976fba90f8 100644
--- a/src/gallium/drivers/vc5/vc5_context.h
+++ b/src/gallium/drivers/vc5/vc5_context.h
@@ -130,7 +130,7 @@ struct vc5_uncompiled_shader {
 struct pipe_shader_state base;
 uint32_t num_tf_outputs;
 struct v3d_varying_slot *tf_outputs;
-uint16_t tf_specs[PIPE_MAX_SO_BUFFERS];
+uint16_t tf_specs[16];
 uint32_t num_tf_specs;
 
 /**
diff --git a/src/gallium/drivers/vc5/vc5_program.c b/src/gallium/drivers/vc5/vc5_program.c
index 87c21abe8b..a7a089510b 100644
--- a/src/gallium/drivers/vc5/vc5_program.c
+++ b/src/gallium/drivers/vc5/vc5_program.c
@@ -49,6 +49,14 @@ vc5_get_slot_for_driver_location(nir_shader *s, uint32_t driver_location)
 return -1;
 }
 
+/**
+ * Precomputes the TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC array for the shader.
+ *
+ * A shader can have 16 of these specs, and each one of them can write up to
+ * 16 dwords.  Since we allow a total of 64 transform feedback output
+ * components (not 16 vectors), we have to group the writes of multiple
+ * varyings together in a single data spec.
+ */
 static void
 vc5_set_transform_feedback_outputs(struct vc5_uncompiled_shader *so,
 const struct pipe_stream_output_info *stream_output)
@@ -102,19 +110,28 @@ vc5_set_transform_feedback_outputs(struct vc5_uncompiled_shader *so,
 if (!vpm_size)
 continue;
 
-struct V3D33_TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC unpacked = {
-/* We need the offset from the coordinate shader's VPM
- * output block, which has the [X, Y, Z, W, Xs, Ys]
- * values at the start.  Note that this will need some
- * shifting when PSIZ is also present.
- */
-.first_shaded_vertex_value_to_output = vpm_start + 6,
-.number_of_consecutive_vertex_values_to_output_as_32_bit_values_minus_1 = vpm_size - 1,
-.output_buffer_to_write_to = buffer,
-};
-V3D33_TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC_pack(NULL,
-   (void *)&so->tf_specs[so->num_tf_specs++],
-   &unpacked);
+uint32_t vpm_start_offset = vpm_start + 6;
+
+while (vpm_size) {
+uint32_t write_size = MIN2(vpm_size, 1 << 4);
+
+struct V3D33_TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC unpacked = {
+/* We need the offset from the coordinate shader's VPM
+ * output block, which has the [X, Y, Z, W, Xs, Ys]
+ * values at the start.
+ */
+.first_shaded_vertex_value_to_output = vpm_start_offset,
+.number_of_consecutive_vertex_values_to_output_as_32_bit_values_minus_1 = write_size - 1,
+.output_buffer_to_write_to = buffer,
+};
+
+assert(so->num_tf_specs != ARRAY_SIZE(so->tf_specs));
+V3D33_TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC_pack(NULL,
+   (void *)&so->tf_specs[so->num_tf_specs++],
+   &unpacked);
+vpm_start_offset += write_size;
+vpm_size -= write_size;
+}
 }
 
 so->num_tf_outputs = slot_count;



Mesa (master): broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: c2b13627d9d7973687350cab243ade65115cff0d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=c2b13627d9d7973687350cab243ade65115cff0d

Author: Eric Anholt 
Date:   Mon Mar 26 12:18:39 2018 -0700

broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.

Just like TLB without a config uniform, we don't have a register index.

---

 src/broadcom/compiler/vir_dump.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/broadcom/compiler/vir_dump.c b/src/broadcom/compiler/vir_dump.c
index 90a3fb0ac6..88b5dc90ac 100644
--- a/src/broadcom/compiler/vir_dump.c
+++ b/src/broadcom/compiler/vir_dump.c
@@ -71,6 +71,7 @@ vir_print_reg(struct v3d_compile *c, struct qreg reg)
 break;
 
 case QFILE_TLB:
+case QFILE_TLBU:
 fprintf(stderr, "%s", files[reg.file]);
 break;
 



Mesa (master): broadcom/vc5: Fix transform feedback in the presence of point size.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 33878641305345b9bb76ad5ebf2335ec9c17adfa
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=33878641305345b9bb76ad5ebf2335ec9c17adfa

Author: Eric Anholt 
Date:   Wed Mar 21 15:07:19 2018 -0700

broadcom/vc5: Fix transform feedback in the presence of point size.

I had this note to myself, and it turns out that a lot of CTS tests use
XFB with points to get data out without using a fragment shader.  Keep
track of two sets of precomputed TF specs (point size in VPM prologue or
not), and switch between them when we enable/disable point size.
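
Illustrative sketch, not part of the commit: the idea is to precompute both variants once and pick one at draw time. The names below (`struct shader_tf`, `precompute_offsets`, `select_offsets`) and the fixed count of two specs are hypothetical; the driver stores packed 16-bit specs rather than raw offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SPECS 2

/* Hypothetical per-shader data: VPM start offsets for each TF spec. */
struct shader_tf {
        uint32_t first_value[NUM_SPECS];      /* no point size in VPM */
        uint32_t first_value_psiz[NUM_SPECS]; /* shifted up by one */
};

static void
precompute_offsets(struct shader_tf *tf, const uint32_t base[NUM_SPECS])
{
        assert(tf);
        for (int i = 0; i < NUM_SPECS; i++) {
                tf->first_value[i] = base[i];
                /* Point size occupies one extra slot ahead of the data. */
                tf->first_value_psiz[i] = base[i] + 1;
        }
}

/* Pick the variant matching the current primitive and rasterizer state. */
static const uint32_t *
select_offsets(const struct shader_tf *tf, bool drawing_points,
               bool point_size_per_vertex)
{
        assert(tf);
        return (drawing_points && point_size_per_vertex) ?
               tf->first_value_psiz : tf->first_value;
}
```

Because the choice depends on primitive mode and rasterizer state, the specs have to be re-emitted when either of those changes, not only when the stream output targets change.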

---

 src/gallium/drivers/vc5/vc5_context.h |  1 +
 src/gallium/drivers/vc5/vc5_emit.c| 13 ++---
 src/gallium/drivers/vc5/vc5_program.c | 13 -
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/vc5/vc5_context.h b/src/gallium/drivers/vc5/vc5_context.h
index 976fba90f8..7272e045c4 100644
--- a/src/gallium/drivers/vc5/vc5_context.h
+++ b/src/gallium/drivers/vc5/vc5_context.h
@@ -131,6 +131,7 @@ struct vc5_uncompiled_shader {
 uint32_t num_tf_outputs;
 struct v3d_varying_slot *tf_outputs;
 uint16_t tf_specs[16];
+uint16_t tf_specs_psiz[16];
 uint32_t num_tf_specs;
 
 /**
diff --git a/src/gallium/drivers/vc5/vc5_emit.c b/src/gallium/drivers/vc5/vc5_emit.c
index 1db97081df..061d6e7c9d 100644
--- a/src/gallium/drivers/vc5/vc5_emit.c
+++ b/src/gallium/drivers/vc5/vc5_emit.c
@@ -572,10 +572,18 @@ v3dX(emit_state)(struct pipe_context *pctx)
 /* Set up the transform feedback data specs (which VPM entries to
  * output to which buffers).
  */
-if (vc5->dirty & VC5_DIRTY_STREAMOUT) {
+if (vc5->dirty & (VC5_DIRTY_STREAMOUT |
+  VC5_DIRTY_RASTERIZER |
+  VC5_DIRTY_PRIM_MODE)) {
 struct vc5_streamout_stateobj *so = &vc5->streamout;
 
 if (so->num_targets) {
+bool psiz_per_vertex = (vc5->prim_mode == PIPE_PRIM_POINTS &&
+vc5->rasterizer->base.point_size_per_vertex);
+uint16_t *tf_specs = (psiz_per_vertex ?
+  vc5->prog.bind_vs->tf_specs_psiz :
+  vc5->prog.bind_vs->tf_specs);
+
 #if V3D_VERSION >= 40
 cl_emit(&job->bcl, TRANSFORM_FEEDBACK_SPECS, tfe) {
 tfe.number_of_16_bit_output_data_specs_following =
@@ -593,8 +601,7 @@ v3dX(emit_state)(struct pipe_context *pctx)
 };
 #endif /* V3D_VERSION < 40 */
 for (int i = 0; i < vc5->prog.bind_vs->num_tf_specs; i++) {
-cl_emit_prepacked(&job->bcl,
-  &vc5->prog.bind_vs->tf_specs[i]);
+cl_emit_prepacked(&job->bcl, &tf_specs[i]);
 }
 }
 }
diff --git a/src/gallium/drivers/vc5/vc5_program.c b/src/gallium/drivers/vc5/vc5_program.c
index a7a089510b..7bad80a168 100644
--- a/src/gallium/drivers/vc5/vc5_program.c
+++ b/src/gallium/drivers/vc5/vc5_program.c
@@ -127,8 +127,19 @@ vc5_set_transform_feedback_outputs(struct vc5_uncompiled_shader *so,
 
 assert(so->num_tf_specs != ARRAY_SIZE(so->tf_specs));
 V3D33_TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC_pack(NULL,
-   (void *)&so->tf_specs[so->num_tf_specs++],
+   (void *)&so->tf_specs[so->num_tf_specs],
 &unpacked);
+
+/* If point size is being written by the shader, then
+ * all the VPM start offsets are shifted up by one.
+ * We won't know that until the variant is compiled,
+ * though.
+ */
+unpacked.first_shaded_vertex_value_to_output++;
+V3D33_TRANSFORM_FEEDBACK_OUTPUT_DATA_SPEC_pack(NULL,
+   (void *)&so->tf_specs_psiz[so->num_tf_specs],
+   &unpacked);
+so->num_tf_specs++;
 vpm_start_offset += write_size;
 vpm_size -= write_size;
 }



Mesa (master): broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 262208eb3c2c53a1fd807bc76b12088f6ce2c56d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=262208eb3c2c53a1fd807bc76b12088f6ce2c56d

Author: Eric Anholt 
Date:   Fri Mar 23 15:43:50 2018 -0700

broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.

On 3.x, we just don't flag the primitive as needing TF, but those
primitive bits are now allocated to the new primitive types.  Now we need
to actually update the enable flag at draw time.

---

 src/gallium/drivers/vc5/vc5_emit.c  | 7 +++
 src/gallium/drivers/vc5/vc5_query.c | 1 +
 2 files changed, 8 insertions(+)

diff --git a/src/gallium/drivers/vc5/vc5_emit.c b/src/gallium/drivers/vc5/vc5_emit.c
index a98fd037d0..d5bf2824d2 100644
--- a/src/gallium/drivers/vc5/vc5_emit.c
+++ b/src/gallium/drivers/vc5/vc5_emit.c
@@ -604,6 +604,13 @@ v3dX(emit_state)(struct pipe_context *pctx)
 for (int i = 0; i < vc5->prog.bind_vs->num_tf_specs; i++) {
 cl_emit_prepacked(&job->bcl, &tf_specs[i]);
 }
+} else if (job->tf_enabled) {
+#if V3D_VERSION >= 40
+cl_emit(&job->bcl, TRANSFORM_FEEDBACK_SPECS, tfe) {
+tfe.enable = false;
+};
+job->tf_enabled = false;
+#endif /* V3D_VERSION >= 40 */
 }
 }
 
diff --git a/src/gallium/drivers/vc5/vc5_query.c b/src/gallium/drivers/vc5/vc5_query.c
index 5ec9be2e35..9aa80cf536 100644
--- a/src/gallium/drivers/vc5/vc5_query.c
+++ b/src/gallium/drivers/vc5/vc5_query.c
@@ -164,6 +164,7 @@ vc5_set_active_query_state(struct pipe_context *pctx, boolean enable)
 
 vc5->active_queries = enable;
 vc5->dirty |= VC5_DIRTY_OQ;
+vc5->dirty |= VC5_DIRTY_STREAMOUT;
 }
 
 void



Mesa (master): broadcom/vc5: Split transform feedback specs update from buffers.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 09ac5ade8f3855e42e4902d7e1acab540f3f1568
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=09ac5ade8f3855e42e4902d7e1acab540f3f1568

Author: Eric Anholt 
Date:   Fri Mar 23 15:40:36 2018 -0700

broadcom/vc5: Split transform feedback specs update from buffers.

The specs update will be changing based on additional state flags in the
next commit, and this unindents the buffer update code.

---

 src/gallium/drivers/vc5/vc5_emit.c | 59 +-
 1 file changed, 32 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/vc5/vc5_emit.c b/src/gallium/drivers/vc5/vc5_emit.c
index e5a9e0e03a..1db97081df 100644
--- a/src/gallium/drivers/vc5/vc5_emit.c
+++ b/src/gallium/drivers/vc5/vc5_emit.c
@@ -569,6 +569,9 @@ v3dX(emit_state)(struct pipe_context *pctx)
 }
 }
 
+/* Set up the transform feedback data specs (which VPM entries to
+ * output to which buffers).
+ */
 if (vc5->dirty & VC5_DIRTY_STREAMOUT) {
 struct vc5_streamout_stateobj *so = &vc5->streamout;
 
@@ -593,42 +596,44 @@ v3dX(emit_state)(struct pipe_context *pctx)
 cl_emit_prepacked(&job->bcl,
   &vc5->prog.bind_vs->tf_specs[i]);
 }
+}
+}
 
-for (int i = 0; i < so->num_targets; i++) {
-const struct pipe_stream_output_target *target =
-so->targets[i];
-struct vc5_resource *rsc = target ?
-vc5_resource(target->buffer) : NULL;
+/* Set up the transform feedback buffers. */
+if (vc5->dirty & VC5_DIRTY_STREAMOUT) {
+struct vc5_streamout_stateobj *so = &vc5->streamout;
+for (int i = 0; i < so->num_targets; i++) {
+const struct pipe_stream_output_target *target =
+so->targets[i];
+struct vc5_resource *rsc = target ?
+vc5_resource(target->buffer) : NULL;
 
 #if V3D_VERSION >= 40
-if (!target)
-continue;
+if (!target)
+continue;
 
-cl_emit(&job->bcl, TRANSFORM_FEEDBACK_BUFFER, output) {
-output.buffer_address =
+cl_emit(&job->bcl, TRANSFORM_FEEDBACK_BUFFER, output) {
+output.buffer_address =
+cl_address(rsc->bo,
+   target->buffer_offset);
+output.buffer_size_in_32_bit_words =
+target->buffer_size >> 2;
+output.buffer_number = i;
+}
+#else /* V3D_VERSION < 40 */
+cl_emit(&job->bcl, TRANSFORM_FEEDBACK_OUTPUT_ADDRESS, output) {
+if (target) {
+output.address =
 cl_address(rsc->bo,
 target->buffer_offset);
-output.buffer_size_in_32_bit_words =
-target->buffer_size >> 2;
-output.buffer_number = i;
 }
-#else /* V3D_VERSION < 40 */
-cl_emit(&job->bcl, TRANSFORM_FEEDBACK_OUTPUT_ADDRESS, output) {
-if (target) {
-output.address =
-cl_address(rsc->bo,
-   target->buffer_offset);
-}
-};
+};
 #endif /* V3D_VERSION < 40 */
-if (target) {
-vc5_job_add_write_resource(vc5->job,
-   target->buffer);
-}
-/* XXX: buffer_size? */
+if (target) {
+vc5_job_add_write_resource(vc5->job,
+   target->buffer);
 }
-} else {
-/* XXX? */
+/* XXX: buffer_size? */
 }
 }
 


Mesa (master): gallium/u_vbuf: Protect against overflow with large instance divisors.

2018-03-26 Eric Anholt
Module: Mesa
Branch: master
Commit: 0356db022da819176d9d0eacab63d4c2c852f876
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0356db022da819176d9d0eacab63d4c2c852f876

Author: Eric Anholt 
Date:   Tue Mar 20 10:42:12 2018 -0700

gallium/u_vbuf: Protect against overflow with large instance divisors.

GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below.  We want to upload 1 element in this case,
fixing the test on VC5.
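
Illustrative sketch, not part of the commit: the overflow is easy to reproduce in isolation. Both function names below are hypothetical (u_vbuf.c inlines the arithmetic); the first is the usual round-up idiom, the second is the overflow-safe form the patch switches to.

```c
#include <assert.h>
#include <limits.h>

/* The usual idiom: the addition wraps when d is close to UINT_MAX. */
static unsigned
div_round_up_naive(unsigned n, unsigned d)
{
        assert(d != 0);
        return (n + d - 1) / d;
}

/* Divide first, then round up if there was a remainder.  count * d can
 * never exceed n, so nothing here can overflow.
 */
static unsigned
div_round_up_safe(unsigned n, unsigned d)
{
        unsigned count;

        assert(d != 0);
        count = n / d;
        if (count * d != n)
                count++;

        return count;
}
```

With 4 instances and a divisor of `UINT_MAX` (what an unsigned -1 or ~0 turns into), the naive form computes 0 while the safe form computes 1, matching the "upload 1 element" behaviour described above.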

v2: Use some more obvious logic, and explain why we don't use the normal
round_up().

Reviewed-by: Brian Paul 

---

 src/gallium/auxiliary/util/u_vbuf.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_vbuf.c b/src/gallium/auxiliary/util/u_vbuf.c
index 95d7990c6c..8a680d60a6 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -936,7 +936,16 @@ u_vbuf_upload_buffers(struct u_vbuf *mgr,
  size = mgr->ve->src_format_size[i];
   } else if (instance_div) {
  /* Per-instance attrib. */
- unsigned count = (num_instances + instance_div - 1) / instance_div;
+
+ /* Figure out how many instances we'll render given instance_div.  We
+  * can't use the typical div_round_up() pattern because the CTS uses
+  * instance_div = ~0 for a test, which overflows div_round_up()'s
+  * addition.
+  */
+ unsigned count = num_instances / instance_div;
+ if (count * instance_div != num_instances)
+count++;
+
  first += vb->stride * start_instance;
  size = vb->stride * (count - 1) + mgr->ve->src_format_size[i];
   } else {



Mesa (master): winsys/amdgpu: always allow GTT placements on APUs

2018-03-26 Thread Marek Olšák
Module: Mesa
Branch: master
Commit: 7d2079908d9ef05ec3f35b7078833e57846cab5b
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7d2079908d9ef05ec3f35b7078833e57846cab5b

Author: Marek Olšák 
Date:   Wed Mar 21 16:10:29 2018 -0400

winsys/amdgpu: always allow GTT placements on APUs

Reviewed-by: Christian König 

---

 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 7740b46b7b..22b5a73143 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -409,14 +409,12 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct amdgpu_winsys *ws,
if (initial_domain & RADEON_DOMAIN_GTT)
   request.preferred_heap |= AMDGPU_GEM_DOMAIN_GTT;
 
-   /* If VRAM is just stolen system memory, allow both VRAM and
-* GTT, whichever has free space. If a buffer is evicted from
-* VRAM to GTT, it will stay there.
-*
-* DRM 3.6.0 has good BO move throttling, so we can allow VRAM-only
-* placements even with a low amount of stolen VRAM.
+   /* Since VRAM and GTT have almost the same performance on APUs, we could
+* just set GTT. However, in order to decrease GTT(RAM) usage, which is
+* shared with the OS, allow VRAM placements too. The idea is not to use
+* VRAM usefully, but to use it so that it's not unused and wasted.
 */
-   if (!ws->info.has_dedicated_vram && ws->info.drm_minor < 6)
+   if (!ws->info.has_dedicated_vram)
   request.preferred_heap |= AMDGPU_GEM_DOMAIN_GTT;
 
if (flags & RADEON_FLAG_NO_CPU_ACCESS)
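
The placement rule in the hunk above can be sketched on its own. This is only an illustration under assumptions: the DOMAIN_* bits are hypothetical stand-ins for the RADEON_DOMAIN_* and AMDGPU_GEM_DOMAIN_* values, and the VRAM mapping is assumed to mirror the GTT one shown in the context lines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real domain bits. */
enum { DOMAIN_GTT = 1 << 0, DOMAIN_VRAM = 1 << 1 };

static uint32_t preferred_heap(uint32_t initial_domain, bool has_dedicated_vram)
{
   uint32_t heap = 0;

   if (initial_domain & DOMAIN_VRAM)
      heap |= DOMAIN_VRAM;
   if (initial_domain & DOMAIN_GTT)
      heap |= DOMAIN_GTT;

   /* On APUs (no dedicated VRAM) GTT is always allowed, regardless of the
    * DRM minor version, so a VRAM request may also be placed in GTT. */
   if (!has_dedicated_vram)
      heap |= DOMAIN_GTT;

   return heap;
}
```

On a discrete GPU a VRAM-only request stays VRAM-only, while on an APU the same request may land in either heap.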



Mesa (master): radeonsi: don't reallocate on DMABUF export if local BOs are disabled

2018-03-26 Thread Marek Olšák
Module: Mesa
Branch: master
Commit: 769603564ececf8edc6424ba500090bee661dadb
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=769603564ececf8edc6424ba500090bee661dadb

Author: Marek Olšák 
Date:   Thu Mar 15 15:58:57 2018 -0400

radeonsi: don't reallocate on DMABUF export if local BOs are disabled

---

 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeon/r600_texture.c | 4 +++-
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 7 +++
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 73b5da0fe1..73fc36203c 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -313,6 +313,8 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_syncobj_wait_for_submit = info->has_syncobj && info->drm_minor >= 20;
info->has_fence_to_handle = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
+   /* TODO: Enable this once the kernel handles it efficiently. */
+   /*info->has_local_buffers = ws->info.drm_minor >= 20;*/
info->num_render_backends = amdinfo->rb_pipes;
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 34d91bec14..3f08b577c4 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -89,6 +89,7 @@ struct radeon_info {
boolhas_syncobj_wait_for_submit;
boolhas_fence_to_handle;
boolhas_ctx_priority;
+   boolhas_local_buffers;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
index 3a0a79187b..1614df63c9 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -701,6 +701,7 @@ static boolean r600_texture_get_handle(struct pipe_screen* screen,
if (sscreen->ws->buffer_is_suballocated(res->buf) ||
rtex->surface.tile_swizzle ||
(rtex->resource.flags & RADEON_FLAG_NO_INTERPROCESS_SHARING &&
+sscreen->info.has_local_buffers &&
 whandle->type != DRM_API_HANDLE_TYPE_KMS)) {
assert(!res->b.is_shared);
r600_reallocate_texture_inplace(rctx, rtex,
@@ -762,7 +763,8 @@ static boolean r600_texture_get_handle(struct pipe_screen* screen,
/* Move a suballocated buffer into a non-suballocated allocation. */
if (sscreen->ws->buffer_is_suballocated(res->buf) ||
/* A DMABUF export always fails if the BO is local. */
-   rtex->resource.flags & RADEON_FLAG_NO_INTERPROCESS_SHARING) {
+   (rtex->resource.flags & RADEON_FLAG_NO_INTERPROCESS_SHARING &&
+sscreen->info.has_local_buffers)) {
assert(!res->b.is_shared);
 
/* Allocate a new buffer with PIPE_BIND_SHARED. */
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 12d497d292..7740b46b7b 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -423,10 +423,9 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct amdgpu_winsys *ws,
   request.flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
if (flags & RADEON_FLAG_GTT_WC)
   request.flags |= AMDGPU_GEM_CREATE_CPU_GTT_USWC;
-   /* TODO: Enable this once the kernel handles it efficiently. */
-   /*if (flags & RADEON_FLAG_NO_INTERPROCESS_SHARING &&
-   ws->info.drm_minor >= 20)
-  request.flags |= AMDGPU_GEM_CREATE_VM_ALWAYS_VALID;*/
+   if (flags & RADEON_FLAG_NO_INTERPROCESS_SHARING &&
+   ws->info.has_local_buffers)
+  request.flags |= AMDGPU_GEM_CREATE_VM_ALWAYS_VALID;
 
r = amdgpu_bo_alloc(ws->dev, &request, &buf_handle);
if (r) {



Mesa (master): glsl: fix infinite loop caused by bug in loop unrolling pass

2018-03-26 Thread Timothy Arceri
Module: Mesa
Branch: master
Commit: 56b867395dee1a48594b27987d3bf68a4e745dda
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=56b867395dee1a48594b27987d3bf68a4e745dda

Author: Timothy Arceri 
Date:   Mon Mar 26 10:31:26 2018 +1100

glsl: fix infinite loop caused by bug in loop unrolling pass

Just checking for 2 jumps is not enough to be sure we can do a
complex loop unroll. We need to make sure we have also found
2 loop terminators.

Without this we were attempting to unroll a loop where the second
jump was nested inside multiple ifs which loop analysis is unable
to detect as a terminator. We ended up splicing out the first
terminator but failed to actually unroll the loop, this resulted
in the creation of a possible infinite loop.

Fixes: 646621c66da9 "glsl: make loop unrolling more like the nir unrolling path"

Tested-by: Gert Wollny 
Reviewed-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
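
The guard described above can be sketched as a small predicate. This is an illustration only: struct loop_summary and can_try_complex_unroll are hypothetical names that stand in for the loop analysis state and the check in visit_leave().

```c
#include <stdbool.h>

/* Hypothetical summary of what loop analysis recorded for one loop. */
struct loop_summary {
   unsigned num_loop_jumps;   /* every jump found in the loop body */
   unsigned num_terminators;  /* jumps that analysis recognized as exits */
};

/* The complex unroll path is only attempted when both jumps are also
 * recognized terminators.  A jump nested inside several ifs is counted as
 * a jump but not as a terminator, so it fails this check. */
static bool can_try_complex_unroll(const struct loop_summary *ls)
{
   return ls->num_loop_jumps == 2 && ls->num_terminators == 2;
}
```

A loop with two jumps but only one recognized terminator is now skipped instead of being partially rewritten.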

---

 src/compiler/glsl/loop_unroll.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/loop_unroll.cpp b/src/compiler/glsl/loop_unroll.cpp
index 6e06a30fb9..f6efe6475a 100644
--- a/src/compiler/glsl/loop_unroll.cpp
+++ b/src/compiler/glsl/loop_unroll.cpp
@@ -519,7 +519,7 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
 * isn't any additional unknown terminators, or any other jumps nested
 * inside futher ifs.
 */
-   if (ls->num_loop_jumps != 2)
+   if (ls->num_loop_jumps != 2 || ls->terminators.length() != 2)
   return visit_continue;
 
ir_instruction *first_ir =



Mesa (master): gallium: Do not add -Wframe-address option for gcc <= 4.4.

2018-03-26 Thread Vinson Lee
Module: Mesa
Branch: master
Commit: dc94a0506f1d267a761961d3ac905d77de3dae2e
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=dc94a0506f1d267a761961d3ac905d77de3dae2e

Author: Vinson Lee 
Date:   Wed Mar 21 14:59:32 2018 -0700

gallium: Do not add -Wframe-address option for gcc <= 4.4.

This patch fixes these build errors with GCC 4.4.

  Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions

Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee 
Reviewed-by: Timothy Arceri 
Reviewed-by: Jose Fonseca 
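
A version-guarded pragma of this kind can be sketched without the Mesa headers. The SKETCH_* macros and current_frame() below are hypothetical; the version number is derived from the compiler's own __GNUC__ and __GNUC_MINOR__ macros rather than from PIPE_CC_GCC_VERSION, and frame level 0 is used instead of 1 so the call is always safe.

```c
#include <stddef.h>

/* Hypothetical equivalent of a "GCC version" macro: 404 means GCC 4.4. */
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#else
#define SKETCH_GCC_VERSION 0
#endif

/* Only emit the in-function pragmas on compilers newer than GCC 4.4, or on
 * clang, following the threshold used in the patch. */
#if (SKETCH_GCC_VERSION > 404) || defined(__clang__)
#define SKETCH_CAN_SILENCE_IN_FUNCTION 1
#else
#define SKETCH_CAN_SILENCE_IN_FUNCTION 0
#endif

static const void *current_frame(void)
{
#if SKETCH_CAN_SILENCE_IN_FUNCTION
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wframe-address"
#endif
   const void *frame = __builtin_frame_address(0);
#if SKETCH_CAN_SILENCE_IN_FUNCTION
#pragma GCC diagnostic pop
#endif
   return frame;
}
```

Note that in the patched condition `&&` binds tighter than `||`, so it reads as (GCC and newer than 4.4) or clang. The sketch spells that grouping out by computing the version separately.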

---

 src/gallium/auxiliary/util/u_debug_stack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_debug_stack.c b/src/gallium/auxiliary/util/u_debug_stack.c
index 974e639e89..846f648431 100644
--- a/src/gallium/auxiliary/util/u_debug_stack.c
+++ b/src/gallium/auxiliary/util/u_debug_stack.c
@@ -264,7 +264,7 @@ debug_backtrace_capture(struct debug_stack_frame *backtrace,
}
 #endif
 
-#if defined(PIPE_CC_GCC)
+#if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION > 404) || defined(__clang__)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wframe-address"
frame_pointer = ((const void **)__builtin_frame_address(1));



Mesa (master): gallium: Correct minor typo in header comments

2018-03-26 Thread Dylan Baker
Module: Mesa
Branch: master
Commit: 029f1a2d6102cbd6d6c0f3badbb1df36073931b3
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=029f1a2d6102cbd6d6c0f3badbb1df36073931b3

Author: Alyssa Rosenzweig 
Date:   Mon Mar 26 15:56:53 2018 +

gallium: Correct minor typo in header comments

Signed-off-by: Alyssa Rosenzweig 
Reviewed-by: Dylan Baker 

---

 src/gallium/include/state_tracker/graw.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/include/state_tracker/graw.h b/src/gallium/include/state_tracker/graw.h
index 217fa31ba1..78ddf0a87f 100644
--- a/src/gallium/include/state_tracker/graw.h
+++ b/src/gallium/include/state_tracker/graw.h
@@ -33,7 +33,7 @@
  * necessary to implement this interface is orchestrated by the
  * individual target building this entity.
  *
- * For instance, the graw-xlib target includes code to implent these
+ * For instance, the graw-xlib target includes code to implement these
  * interfaces on top of the X window system.
  *
  * Programs using this interface may additionally benefit from some of



Mesa (master): intel/aubinator_error_decode: Decode more registers.

2018-03-26 Thread Rafael Antognolli
Module: Mesa
Branch: master
Commit: 27581d18bc7f34e9c8f0b3a7c568323c7a8b03bf
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=27581d18bc7f34e9c8f0b3a7c568323c7a8b03bf

Author: Rafael Antognolli 
Date:   Wed Mar 21 11:42:23 2018 -0700

intel/aubinator_error_decode: Decode more registers.

Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE.

Reviewed-by: Lionel Landwerlin 
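
The new matches rely on sscanf() assignment suppression: `%*d` consumes an integer without storing it and without counting toward the return value. A minimal sketch, assuming error-state lines of the shape implied by the format string (parse_row_instdone is a hypothetical helper, not part of the tool):

```c
#include <stdio.h>

/* Returns 1 and stores the register value when `line` is a ROW_INSTDONE
 * entry, 0 otherwise.  The two bracketed indices are parsed but discarded
 * via %*d, so a successful match returns exactly 1 from sscanf(). */
static int parse_row_instdone(const char *line, unsigned *reg)
{
   return sscanf(line, "  ROW_INSTDONE[%*d][%*d]: 0x%08x\n", reg) == 1;
}
```

Because the suppressed conversions are not counted, comparing the result against 1 is enough to tell a full match from a partial one.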

---

 src/intel/tools/aubinator_error_decode.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/intel/tools/aubinator_error_decode.c b/src/intel/tools/aubinator_error_decode.c
index db880d74a9..9abd05fd75 100644
--- a/src/intel/tools/aubinator_error_decode.c
+++ b/src/intel/tools/aubinator_error_decode.c
@@ -540,6 +540,18 @@ read_data_file(FILE *file)
print_register(spec, reg_name, reg);
  }
 
+ matched = sscanf(line, "  SC_INSTDONE: 0x%08x\n", &reg);
+ if (matched == 1)
+print_register(spec, "SC_INSTDONE", reg);
+
+ matched = sscanf(line, "  SAMPLER_INSTDONE[%*d][%*d]: 0x%08x\n", &reg);
+ if (matched == 1)
+print_register(spec, "SAMPLER_INSTDONE", reg);
+
+ matched = sscanf(line, "  ROW_INSTDONE[%*d][%*d]: 0x%08x\n", &reg);
+ if (matched == 1)
+print_register(spec, "ROW_INSTDONE", reg);
+
  matched = sscanf(line, "  INSTDONE1: 0x%08x\n", &reg);
  if (matched == 1)
 print_register(spec, "INSTDONE_1", reg);



Mesa (master): intel/genxml: Add SC_INSTDONE register.

2018-03-26 Thread Rafael Antognolli
Module: Mesa
Branch: master
Commit: 4c0ae36143d292745e44103a3394ff0fadfdcbe6
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=4c0ae36143d292745e44103a3394ff0fadfdcbe6

Author: Rafael Antognolli 
Date:   Wed Mar 21 11:42:20 2018 -0700

intel/genxml: Add SC_INSTDONE register.

Reviewed-by: Lionel Landwerlin 

---

 src/intel/genxml/gen10.xml | 27 +++
 src/intel/genxml/gen11.xml | 27 +++
 src/intel/genxml/gen7.xml  | 19 +++
 src/intel/genxml/gen75.xml | 17 +
 src/intel/genxml/gen8.xml  | 24 
 src/intel/genxml/gen9.xml  | 26 ++
 6 files changed, 140 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index cc696e800d..e0bf0e9159 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3459,6 +3459,33 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index 417fac1365..3278f35b82 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3455,6 +3455,33 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 87e05c94ef..bc9fa5b65d 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2397,6 +2397,25 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 68aff857f3..9e2b789006 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2869,6 +2869,23 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 8a4bf34cf7..0a6be59698 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3123,6 +3123,30 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index cfae4a8b65..834f5773ff 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3406,6 +3406,32 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 



Mesa (master): intel/genxml: Add SAMPLER_INSTDONE register.

2018-03-26 Thread Rafael Antognolli
Module: Mesa
Branch: master
Commit: 70d7c70e8dcc74674cebcb9877cf71f44db6c924
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=70d7c70e8dcc74674cebcb9877cf71f44db6c924

Author: Rafael Antognolli 
Date:   Wed Mar 21 11:42:22 2018 -0700

intel/genxml: Add SAMPLER_INSTDONE register.

Reviewed-by: Lionel Landwerlin 

---

 src/intel/genxml/gen10.xml | 23 +++
 src/intel/genxml/gen11.xml | 23 +++
 src/intel/genxml/gen7.xml  | 22 ++
 src/intel/genxml/gen75.xml | 25 +
 src/intel/genxml/gen8.xml  | 23 +++
 src/intel/genxml/gen9.xml  | 23 +++
 6 files changed, 139 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index afdb580b62..aeb9966759 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3504,6 +3504,29 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index a5e67c30bf..6ca0e785ba 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3500,6 +3500,29 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 52ca043b51..4865843fcb 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2436,6 +2436,28 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 9501ec53f8..da06e84ee9 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2908,6 +2908,31 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 10dc787f48..71626c15cd 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3165,6 +3165,29 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 90d3a15eb2..c32f2c3162 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3450,6 +3450,29 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 



Mesa (master): intel/genxml: Add ROW_INSTDONE register.

2018-03-26 Thread Rafael Antognolli
Module: Mesa
Branch: master
Commit: 227edf05f34ed38a3f7829d716c988eff5f7d271
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=227edf05f34ed38a3f7829d716c988eff5f7d271

Author: Rafael Antognolli 
Date:   Wed Mar 21 11:42:21 2018 -0700

intel/genxml: Add ROW_INSTDONE register.

Reviewed-by: Lionel Landwerlin 

---

 src/intel/genxml/gen10.xml | 18 ++
 src/intel/genxml/gen11.xml | 18 ++
 src/intel/genxml/gen7.xml  | 20 
 src/intel/genxml/gen75.xml | 22 ++
 src/intel/genxml/gen8.xml  | 18 ++
 src/intel/genxml/gen9.xml  | 18 ++
 6 files changed, 114 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index e0bf0e9159..afdb580b62 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3486,6 +3486,24 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index 3278f35b82..a5e67c30bf 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3482,6 +3482,24 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index bc9fa5b65d..52ca043b51 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2416,6 +2416,26 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 9e2b789006..9501ec53f8 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2886,6 +2886,28 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 0a6be59698..10dc787f48 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3147,6 +3147,24 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 834f5773ff..90d3a15eb2 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3432,6 +3432,24 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 



Mesa (master): i965: Add negative_equals methods

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: 8f83eea71e233227d34dc8547dac79d2912c2311
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8f83eea71e233227d34dc8547dac79d2912c2311

Author: Ian Romanick 
Date:   Tue Apr  7 16:11:37 2015 -0700

i965: Add negative_equals methods

This method is similar to the existing ::equals methods.  Instead of
testing that two src_regs are equal to each other, it tests that one is
the negation of the other.

v2: Simplify various checks based on suggestions from Matt.  Use
src_reg::type instead of fixed_hw_reg.type in a check.  Also suggested
by Matt.

v3: Rebase on 3 years.  Fix some problems with negative_equals with VF
constants.  Add fs_reg::negative_equals.

v4: Replace the existing default case with BRW_REGISTER_TYPE_UB,
BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF.  Suggested by Matt.
Expand the FINISHME comment to better explain why it isn't already
finished.

Signed-off-by: Ian Romanick 
Reviewed-by: Alejandro Piñeiro  [v3]
Reviewed-by: Matt Turner 
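
For immediates, "negative equals" reduces to a per-type comparison. The sketch below covers two of the cases from the patch using hypothetical free functions instead of struct brw_reg. It assumes the VF immediate packs four 8-bit floats with the sign in bit 7 of each byte, which is what the 0x80808080 mask in the patch implies.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit float immediates: one is the negation of the other when the
 * values compare equal after negating one side. */
static bool imm_f_negative_equal(float a, float b)
{
   return a == -b;
}

/* Packed VF immediates: flip the sign bit of each of the four bytes and
 * compare the raw bits.  Unlike the float case, 0 is NOT treated as the
 * negation of 0 here, because the exact bit pattern is compared. */
static bool imm_vf_negative_equal(uint32_t a, uint32_t b)
{
   return a == (b ^ 0x80808080u);
}
```

The difference in how zero is handled between the two helpers mirrors the comment in the patch about exact bit patterns for VF constants.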

---

 src/intel/compiler/brw_fs.cpp |  7 ++
 src/intel/compiler/brw_ir_fs.h|  1 +
 src/intel/compiler/brw_ir_vec4.h  |  1 +
 src/intel/compiler/brw_reg.h  | 49 +++
 src/intel/compiler/brw_shader.cpp |  6 +
 src/intel/compiler/brw_shader.h   |  1 +
 src/intel/compiler/brw_vec4.cpp   |  7 ++
 7 files changed, 72 insertions(+)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 6eea532f56..3d454c3db1 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -454,6 +454,13 @@ fs_reg::equals(const fs_reg &r) const
 }
 
 bool
+fs_reg::negative_equals(const fs_reg &r) const
+{
+   return (this->backend_reg::negative_equals(r) &&
+   stride == r.stride);
+}
+
+bool
 fs_reg::is_contiguous() const
 {
return stride == 1;
diff --git a/src/intel/compiler/brw_ir_fs.h b/src/intel/compiler/brw_ir_fs.h
index 54797ff0fa..f06a33c516 100644
--- a/src/intel/compiler/brw_ir_fs.h
+++ b/src/intel/compiler/brw_ir_fs.h
@@ -41,6 +41,7 @@ public:
fs_reg(enum brw_reg_file file, int nr, enum brw_reg_type type);
 
bool equals(const fs_reg &r) const;
+   bool negative_equals(const fs_reg &r) const;
bool is_contiguous() const;
 
/**
diff --git a/src/intel/compiler/brw_ir_vec4.h b/src/intel/compiler/brw_ir_vec4.h
index cbaff2feff..95c5119c6c 100644
--- a/src/intel/compiler/brw_ir_vec4.h
+++ b/src/intel/compiler/brw_ir_vec4.h
@@ -43,6 +43,7 @@ public:
src_reg(struct ::brw_reg reg);
 
bool equals(const src_reg &r) const;
+   bool negative_equals(const src_reg &r) const;
 
src_reg(class vec4_visitor *v, const struct glsl_type *type);
src_reg(class vec4_visitor *v, const struct glsl_type *type, int size);
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 7ad144bdfd..68158cc0cc 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -255,6 +255,55 @@ brw_regs_equal(const struct brw_reg *a, const struct brw_reg *b)
return a->bits == b->bits && (df ? a->u64 == b->u64 : a->ud == b->ud);
 }
 
+static inline bool
+brw_regs_negative_equal(const struct brw_reg *a, const struct brw_reg *b)
+{
+   if (a->file == IMM) {
+  if (a->bits != b->bits)
+ return false;
+
+  switch (a->type) {
+  case BRW_REGISTER_TYPE_UQ:
+  case BRW_REGISTER_TYPE_Q:
+ return a->d64 == -b->d64;
+  case BRW_REGISTER_TYPE_DF:
+ return a->df == -b->df;
+  case BRW_REGISTER_TYPE_UD:
+  case BRW_REGISTER_TYPE_D:
+ return a->d == -b->d;
+  case BRW_REGISTER_TYPE_F:
+ return a->f == -b->f;
+  case BRW_REGISTER_TYPE_VF:
+ /* It is tempting to treat 0 as a negation of 0 (and -0 as a negation
+  * of -0).  There are occasions where 0 or -0 is used and the exact
+  * bit pattern is desired.  At the very least, changing this to allow
+  * 0 as a negation of 0 causes some fp64 tests to fail on IVB.
+  */
+ return a->ud == (b->ud ^ 0x80808080);
+  case BRW_REGISTER_TYPE_UW:
+  case BRW_REGISTER_TYPE_W:
+  case BRW_REGISTER_TYPE_UV:
+  case BRW_REGISTER_TYPE_V:
+  case BRW_REGISTER_TYPE_HF:
+ /* FINISHME: Implement support for these types once there is
+  * something in the compiler that can generate them.  Until then,
+  * they cannot be tested.
+  */
+ return false;
+  case BRW_REGISTER_TYPE_UB:
+  case BRW_REGISTER_TYPE_B:
+  case BRW_REGISTER_TYPE_NF:
+ unreachable("not reached");
+  }
+   } else {
+  struct brw_reg tmp = *a;
+
+  tmp.negate = !tmp.negate;
+
+  return brw_regs_equal(&tmp, b);
+   }
+}
+
 struct brw_indirect {
unsigned addr_subnr:4;
int addr_offset:10;
diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index 054962bd7e..9cdf9fcb23 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ 

Mesa (master): i965/vec4: Propagate conditional modifiers from compares to adds

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: cd635d149b23f0522cb1a73d6a007851896883e3
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=cd635d149b23f0522cb1a73d6a007851896883e3

Author: Ian Romanick 
Date:   Wed Mar 21 15:22:51 2018 -0700

i965/vec4: Propagate conditional modifiers from compares to adds

No changes on Broadwell or later as those platforms do not use the vec4
backend.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11682119 -> 11681056 (<.01%)
instructions in affected programs: 150403 -> 149340 (-0.71%)
helped: 950
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1
helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71%
95% mean confidence interval for instructions value: -1.19 -1.04
95% mean confidence interval for instructions %-change: -0.84% -0.79%
Instructions are helped.

total cycles in shared programs: 257495842 -> 257495238 (<.01%)
cycles in affected programs: 270302 -> 269698 (-0.22%)
helped: 271
HURT: 13
helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2
helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28%
HURT stats (abs)   min: 2 max: 12 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -2.41 -1.84
95% mean confidence interval for cycles %-change: -0.31% -0.26%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10430493 -> 10429727 (<.01%)
instructions in affected programs: 120860 -> 120094 (-0.63%)
helped: 766
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 146138718 -> 146138446 (<.01%)
cycles in affected programs: 244114 -> 243842 (-0.11%)
helped: 132
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2
helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19%
95% mean confidence interval for cycles value: -2.12 -2.00
95% mean confidence interval for cycles %-change: -0.18% -0.15%
Cycles are helped.

GM45 and Iron Lake had identical results. (Iron Lake shown)
total instructions in shared programs: 7780251 -> 7780248 (<.01%)
instructions in affected programs: 175 -> 172 (-1.71%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49%

total cycles in shared programs: 177851584 -> 177851578 (<.01%)
cycles in affected programs: 9796 -> 9790 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05%

Signed-off-by: Ian Romanick 
Reviewed-by: Alejandro Piñeiro 
Reviewed-by: Matt Turner 

---

 src/intel/compiler/brw_vec4_cmod_propagation.cpp | 70 ++--
 1 file changed, 65 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 7f1001b6d1..5205da4983 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -50,8 +50,14 @@ opt_cmod_propagation_local(bblock_t *block)
   inst->predicate != BRW_PREDICATE_NONE ||
   !inst->dst.is_null() ||
   (inst->src[0].file != VGRF && inst->src[0].file != ATTR &&
-   inst->src[0].file != UNIFORM) ||
-  inst->src[0].abs)
+   inst->src[0].file != UNIFORM))
+ continue;
+
+  /* An ABS source modifier can only be handled when processing a compare
+   * with a value other than zero.
+   */
+  if (inst->src[0].abs &&
+  (inst->opcode != BRW_OPCODE_CMP || inst->src[1].is_zero()))
  continue;
 
   if (inst->opcode == BRW_OPCODE_AND &&
@@ -60,15 +66,68 @@ opt_cmod_propagation_local(bblock_t *block)
 !inst->src[0].negate))
  continue;
 
-  if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
- continue;
-
   if (inst->opcode == BRW_OPCODE_MOV &&
   inst->conditional_mod != BRW_CONDITIONAL_NZ)
  continue;
 
   bool read_flag = false;
   foreach_inst_in_block_reverse_starting_from(vec4_instruction, scan_inst, inst) {
+ /* A CMP with a second source of zero can match with anything.  A CMP
+  * with a second source that is not zero can only match with an ADD
+  * instruction.
+  */
+ if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero()) {
+bool negate;
+
+if (scan_inst->opcode != BRW_OPCODE_ADD)
+   goto not_match;
+
+/* A CMP is basically a subtraction.  The result of the
+ * subtraction must be the same as the result of the addition.
+ * This means that one of the operands must be negated.  So (a +
+ * b) vs (a == -b) or (a + 

Mesa (master): i965/fs: Allow cmod propagation when src0 is a uniform or shader input

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: 5bbb3d60d358cf906ca7078641ae7fb50c4d4e06
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=5bbb3d60d358cf906ca7078641ae7fb50c4d4e06

Author: Ian Romanick 
Date:   Wed Mar 14 10:19:19 2018 -0700

i965/fs: Allow cmod propagation when src0 is a uniform or shader input

No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help about 900 more shaders.

Signed-off-by: Ian Romanick 
Reviewed-by: Alejandro Piñeiro 
Reviewed-by: Matt Turner 
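
The relaxed source check can be sketched as a simple predicate. The enum and function names are hypothetical stand-ins for the backend's register files and for the condition in opt_cmod_propagation_local().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the backend's register files. */
enum reg_file { FILE_VGRF, FILE_ATTR, FILE_UNIFORM, FILE_IMM };

/* Before the change only VGRF sources were considered; afterwards shader
 * inputs (ATTR) and uniforms are accepted as well. */
static bool src0_file_allowed(enum reg_file file)
{
   return file == FILE_VGRF || file == FILE_ATTR || file == FILE_UNIFORM;
}
```

Other files, such as immediates, still cause the instruction to be skipped.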

---

 src/intel/compiler/brw_fs_cmod_propagation.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index 4625d69f89..b995a51d3c 100644
--- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
@@ -62,7 +62,8 @@ opt_cmod_propagation_local(const gen_device_info *devinfo, bblock_t *block)
inst->opcode != BRW_OPCODE_MOV) ||
   inst->predicate != BRW_PREDICATE_NONE ||
   !inst->dst.is_null() ||
-  inst->src[0].file != VGRF ||
+  (inst->src[0].file != VGRF && inst->src[0].file != ATTR &&
+   inst->src[0].file != UNIFORM) ||
   inst->src[0].abs)
  continue;
 



Mesa (master): i965/vec4: Allow cmod propagation when src0 is a uniform or shader input

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: 780f307ba860e3d8f85df8d6e1e60a1d612b97d9
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=780f307ba860e3d8f85df8d6e1e60a1d612b97d9

Author: Ian Romanick 
Date:   Wed Mar 21 15:22:15 2018 -0700

i965/vec4: Allow cmod propagation when src0 is a uniform or shader input

No shader-db changes.  This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input.  However, this
change allows the next commit to help more shaders.

Signed-off-by: Ian Romanick 
Reviewed-by: Alejandro Piñeiro 
Reviewed-by: Matt Turner 

---

 src/intel/compiler/brw_vec4_cmod_propagation.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 0d72d82a57..7f1001b6d1 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -49,7 +49,8 @@ opt_cmod_propagation_local(bblock_t *block)
inst->opcode != BRW_OPCODE_MOV) ||
   inst->predicate != BRW_PREDICATE_NONE ||
   !inst->dst.is_null() ||
-  inst->src[0].file != VGRF ||
+  (inst->src[0].file != VGRF && inst->src[0].file != ATTR &&
+   inst->src[0].file != UNIFORM) ||
   inst->src[0].abs)
  continue;
 



Mesa (master): i965/fs: Propagate conditional modifiers from compares to adds

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: 020b0055e7a085a6a8c961ad12ce94e58606a1ae
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=020b0055e7a085a6a8c961ad12ce94e58606a1ae

Author: Ian Romanick 
Date:   Fri Mar  9 13:45:01 2018 -0800

i965/fs: Propagate conditional modifiers from compares to adds

The math inside the add and the cmp in this instruction sequence is the
same.  We can utilize this to eliminate the compare.

add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This is reduced to:

add.z.f0(8)     g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };

This optimization pass could do even better.  The nature of converting
vectorized code from the GLSL front end to scalar code in NIR results in
sequences like:

add(8)          g7<1>F          g4<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g6<1>F          g3<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
add(8)          g5<1>F          g2<8,8,1>F      g64.5<0,1,0>F   { align1 1Q compacted };
cmp.z.f0(8)     null<1>F        g2<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g8<1>F          (abs)g5<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g3<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g10<1>F         (abs)g6<8,8,1>F 3e-37F          { align1 1Q };
cmp.z.f0(8)     null<1>F        g4<8,8,1>F      -g64.5<0,1,0>F  { align1 1Q switch };
(-f0) sel(8)    g12<1>F         (abs)g7<8,8,1>F 3e-37F          { align1 1Q };

In this sequence, only the first cmp.z is removed.  With different
scheduling, all 3 could get removed.
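
The folding described above can be sketched with a toy Python model. This is
not the Mesa pass: the Inst class, its fields, and propagate_cmp_to_add are
hypothetical names, register regions and types are ignored, and the backward
scan simply gives up when it meets an instruction that reads or writes the
flag or overwrites one of the cmp operands.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Inst:
    op: str
    dst: Optional[str]            # None models the null register
    src0: str
    src1: str                     # a leading "-" models a negated source
    cmod: Optional[str] = None    # conditional modifier such as "z"
    reads_flag: bool = False      # e.g. a predicated sel


def negate(src: str) -> str:
    return src[1:] if src.startswith("-") else "-" + src


def base(src: str) -> str:
    return src.lstrip("-")


def propagate_cmp_to_add(insts: List[Inst]) -> List[Inst]:
    """Fold 'cmp null, a, -b' into an earlier 'add dst, a, b'."""
    out: List[Optional[Inst]] = list(insts)
    for i, cmp in enumerate(out):
        if cmp is None or cmp.op != "cmp" or cmp.dst is not None:
            continue
        for j in range(i - 1, -1, -1):
            prev = out[j]
            if prev is None:
                continue
            if (prev.op == "add" and prev.cmod is None
                    and prev.src0 == cmp.src0
                    and prev.src1 == negate(cmp.src1)
                    and prev.dst not in (base(prev.src0), base(prev.src1))):
                out[j] = replace(prev, cmod=cmp.cmod)  # add now sets the flag
                out[i] = None                          # cmp is redundant
                break
            # Give up if the flag is touched or a cmp operand is overwritten.
            if (prev.cmod is not None or prev.reads_flag
                    or prev.dst in (base(cmp.src0), base(cmp.src1))):
                break
    return [inst for inst in out if inst is not None]


seq = [
    Inst("add", "g5", "g2", "g64.5"),
    Inst("cmp", None, "g2", "-g64.5", cmod="z"),
    Inst("sel", "g8", "(abs)g5", "3e-37F", reads_flag=True),
]
reduced = propagate_cmp_to_add(seq)
```

In this model the three-instruction sequence collapses to add.z plus sel. Fed
the longer nine-instruction sequence, the scan for the second and third cmp
stops at the preceding sel because it reads the flag, so only the first cmp
is folded, which mirrors the scheduling limitation noted above.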

Skylake
total instructions in shared programs: 14407009 -> 14400173 (-0.05%)
instructions in affected programs: 1307274 -> 1300438 (-0.52%)
helped: 4880
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.45 -1.35
95% mean confidence interval for instructions %-change: -0.72% -0.69%
Instructions are helped.

total cycles in shared programs: 532943169 -> 532923528 (<.01%)
cycles in affected programs: 14065798 -> 14046157 (-0.14%)
helped: 2703
HURT: 339
helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2
helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21%
HURT stats (abs)   min: 1 max: 739 x̄: 39.86 x̃: 12
HURT stats (rel)   min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41%
95% mean confidence interval for cycles value: -8.66 -4.26
95% mean confidence interval for cycles %-change: -0.24% -0.14%
Cycles are helped.

LOST:   0
GAINED: 1

Broadwell
total instructions in shared programs: 14719636 -> 14712949 (-0.05%)
instructions in affected programs: 1288188 -> 1281501 (-0.52%)
helped: 4845
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.43 -1.33
95% mean confidence interval for instructions %-change: -0.72% -0.68%
Instructions are helped.

total cycles in shared programs: 559599253 -> 559581699 (<.01%)
cycles in affected programs: 13315565 -> 13298011 (-0.13%)
helped: 2600
HURT: 269
helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2
helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20%
HURT stats (abs)   min: 1 max: 790 x̄: 53.07 x̃: 20
HURT stats (rel)   min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75%
95% mean confidence interval for cycles value: -8.47 -3.77
95% mean confidence interval for cycles %-change: -0.27% -0.18%
Cycles are helped.

LOST:   0
GAINED: 8

Haswell
total instructions in shared programs: 12978609 -> 12973483 (-0.04%)
instructions in affected programs: 932921 -> 927795 (-0.55%)
helped: 3480
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58%
95% mean confidence interval for instructions value: -1.53 -1.42
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 410270788 -> 410250531 (<.01%)
cycles in affected programs: 10986161 -> 10965904 (-0.18%)
helped: 2087
HURT: 254
helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4
helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21%
HURT stats (abs)   min: 1 max: 519 x̄: 40.49 x̃: 16
HURT stats (rel)   min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47%
95% mean confidence interval for cycles value: -12.82 -4.49
95% mean confidence interval for cycles %-change: -0.31% -0.18%
Cycles are helped.

LOST:   0
GAINED: 5

Ivy Bridge
total instructions in shared programs: 11686082 -> 11681548 (-0.04%)
instructions in affected programs: 937696 -> 933162 (-0.48%)
helped: 3150
HURT: 0
helped stats (

Mesa (master): nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: 2c643fd978c43205b9620820038ba6246ed045e2
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2c643fd978c43205b9620820038ba6246ed045e2

Author: Ian Romanick 
Date:   Wed Mar 14 16:25:07 2018 -0700

nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional

Now that i965 recognizes that a-b generates the same conditions as 'a <
b', there is no reason to condition this transformation on 'is not used
by conditional.'

Since this was the only user of the is_not_used_by_conditional function,
delete it.
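
For readers unfamiliar with the tuple notation used in nir_opt_algebraic.py,
here is a minimal, self-contained sketch of how such a search/replace pair
could be applied to an expression tree. It is an assumption-laden toy: match,
substitute, and rewrite are hypothetical helpers, the real generator emits C
code instead, and the is_used_once condition is not modelled.

```python
# Expressions are nested tuples: (opcode, operand, ...). Strings are
# variables (in patterns) or value names (in expressions); numbers are
# constants.
RULE = (('flt', ('fadd', 'a', ('fneg', 'b')), 0.0), ('flt', 'a', 'b'))


def match(pattern, expr, bindings):
    """Try to match expr against pattern, recording variable bindings."""
    if isinstance(pattern, tuple):
        if (not isinstance(expr, tuple) or len(expr) != len(pattern)
                or expr[0] != pattern[0]):
            return False
        return all(match(p, e, bindings)
                   for p, e in zip(pattern[1:], expr[1:]))
    if isinstance(pattern, str):
        # A variable must bind to the same sub-expression every time.
        if pattern in bindings:
            return bindings[pattern] == expr
        bindings[pattern] = expr
        return True
    return pattern == expr  # numeric constant


def substitute(template, bindings):
    """Build the replacement expression from the bound variables."""
    if isinstance(template, tuple):
        return (template[0],) + tuple(substitute(t, bindings)
                                      for t in template[1:])
    if isinstance(template, str):
        return bindings[template]
    return template


def rewrite(expr, rule=RULE):
    search, replacement = rule
    bindings = {}
    if match(search, expr, bindings):
        return substitute(replacement, bindings)
    return expr
```

For example, rewrite(('flt', ('fadd', 'x', ('fneg', 'y')), 0.0)) yields
('flt', 'x', 'y'), while an expression without the fneg is returned
unchanged.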

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14400775 -> 14400595 (<.01%)
instructions in affected programs: 36712 -> 36532 (-0.49%)
helped: 182
HURT: 26
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90%
95% mean confidence interval for instructions value: -0.97 -0.76
95% mean confidence interval for instructions %-change: -0.59% -0.43%
Instructions are helped.

total cycles in shared programs: 532929592 -> 532926345 (<.01%)
cycles in affected programs: 478660 -> 475413 (-0.68%)
helped: 187
HURT: 22
helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18
helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03%
HURT stats (abs)   min: 1 max: 214 x̄: 30.86 x̃: 11
HURT stats (rel)   min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86%
95% mean confidence interval for cycles value: -19.50 -11.57
95% mean confidence interval for cycles %-change: -1.42% -0.58%
Cycles are helped.

GM45 and Iron Lake had similar results. (Iron Lake shown)
total cycles in shared programs: 177851578 -> 177851810 (<.01%)
cycles in affected programs: 24408 -> 24640 (0.95%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44%
HURT stats (abs)   min: 24 max: 108 x̄: 60.00 x̃: 54
HURT stats (rel)   min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02%
95% mean confidence interval for cycles value: -7.75 85.08
95% mean confidence interval for cycles %-change: -0.39% 1.49%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick 
Reviewed-by: Matt Turner 

---

 src/compiler/nir/nir_opt_algebraic.py |  4 +---
 src/compiler/nir/nir_search_helpers.h | 15 ---
 2 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index b9565cea7b..96232f0e54 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -208,9 +208,7 @@ optimizations = [
# fmax.  If b is > 1.0, the bcsel will be replaced with a b2f.
(('fmin', ('b2f', a), '#b'), ('bcsel', a, ('fmin', b, 1.0), ('fmin', b, 0.0))),
 
-   # ignore this opt when the result is used by a bcsel or if so we can make
-   # use of conditional modifiers on supported hardware.
-   (('flt(is_not_used_by_conditional)', ('fadd(is_used_once)', a, ('fneg', b)), 0.0), ('flt', a, b)),
+   (('flt', ('fadd(is_used_once)', a, ('fneg', b)), 0.0), ('flt', a, b)),
 
(('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
(('~bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
diff --git a/src/compiler/nir/nir_search_helpers.h b/src/compiler/nir/nir_search_helpers.h
index 2e3bd137d6..2d399bd5dc 100644
--- a/src/compiler/nir/nir_search_helpers.h
+++ b/src/compiler/nir/nir_search_helpers.h
@@ -170,19 +170,4 @@ is_not_used_by_if(nir_alu_instr *instr)
return list_empty(&instr->dest.dest.ssa.if_uses);
 }
 
-static inline bool
-is_not_used_by_conditional(nir_alu_instr *instr)
-{
-   if (!is_not_used_by_if(instr))
-  return false;
-
-   nir_foreach_use(use, &instr->dest.dest.ssa) {
-  if (use->parent_instr->type == nir_instr_type_alu &&
-  nir_instr_as_alu(use->parent_instr)->op == nir_op_bcsel)
- return false;
-   }
-
-   return true;
-}
-
 #endif /* _NIR_SEARCH_ */



Mesa (master): i965/vec4: Fix null destination register in 3-source instructions

2018-03-26 Thread Ian Romanick
Module: Mesa
Branch: master
Commit: 91225cb33f0baede872114bd416084b3b52937a1
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=91225cb33f0baede872114bd416084b3b52937a1

Author: Ian Romanick 
Date:   Fri Mar 23 11:46:12 2018 -0700

i965/vec4: Fix null destination register in 3-source instructions

A recent commit (see below) triggered some cases where conditional
modifier propagation and dead code elimination would cause a MAD
instruction like the following to be generated:

mad.l.f0  null, ...

Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases
like this in the scalar backend.  This commit basically ports that code
to the vec4 backend.

NOTE: I have sent a couple tests to the piglit list that reproduce this
bug *without* the commit mentioned below.  This commit fixes those
tests.

Signed-off-by: Ian Romanick 
Reviewed-by: Matt Turner 
Tested-by: Tapani Pälli 
Cc: mesa-sta...@lists.freedesktop.org
Fixes: ee63933a7 ("nir: Distribute binary operations with constants into bcsel")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704

---

 src/intel/compiler/brw_vec4.cpp | 26 ++
 src/intel/compiler/brw_vec4.h   |  1 +
 2 files changed, 27 insertions(+)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 6680410a52..2f352a1118 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -1952,6 +1952,30 @@ is_align1_df(vec4_instruction *inst)
}
 }
 
+/**
+ * Three source instruction must have a GRF/MRF destination register.
+ * ARF NULL is not allowed.  Fix that up by allocating a temporary GRF.
+ */
+void
+vec4_visitor::fixup_3src_null_dest()
+{
+   bool progress = false;
+
+   foreach_block_and_inst_safe (block, vec4_instruction, inst, cfg) {
+  if (inst->is_3src(devinfo) && inst->dst.is_null()) {
+ const unsigned size_written = type_sz(inst->dst.type);
+ const unsigned num_regs = DIV_ROUND_UP(size_written, REG_SIZE);
+
+ inst->dst = retype(dst_reg(VGRF, alloc.allocate(num_regs)),
+inst->dst.type);
+ progress = true;
+  }
+   }
+
+   if (progress)
+  invalidate_live_intervals();
+}
+
 void
 vec4_visitor::convert_to_hw_regs()
 {
@@ -2703,6 +2727,8 @@ vec4_visitor::run()
   OPT(scalarize_df);
}
 
+   fixup_3src_null_dest();
+
bool allocated_without_spills = reg_allocate();
 
if (!allocated_without_spills) {
diff --git a/src/intel/compiler/brw_vec4.h b/src/intel/compiler/brw_vec4.h
index 39ce51c7dc..71880db969 100644
--- a/src/intel/compiler/brw_vec4.h
+++ b/src/intel/compiler/brw_vec4.h
@@ -158,6 +158,7 @@ public:
void opt_set_dependency_control();
void opt_schedule_instructions();
void convert_to_hw_regs();
+   void fixup_3src_null_dest();
 
bool is_supported_64bit_region(vec4_instruction *inst, unsigned arg);
bool lower_simd_width();



Mesa (master): mesa/st/tests: Use tgsi opcode enum also in the test classes

2018-03-26 Thread Brian Paul
Module: Mesa
Branch: master
Commit: a21da49e5c55f8e61253503d865cef936125ea5f
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a21da49e5c55f8e61253503d865cef936125ea5f

Author: Gert Wollny 
Date:   Mon Mar 26 02:17:00 2018 -0600

mesa/st/tests: Use tgsi opcode enum also in the test classes

Fixes: ec478cf9c31K ("st/mesa,tgsi: use enum tgsi_opcode")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737
Signed-off-by: Gert Wollny 
Reviewed-by: Brian Paul 

---

 src/mesa/state_tracker/tests/st_tests_common.cpp |  6 +++---
 src/mesa/state_tracker/tests/st_tests_common.h   | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/tests/st_tests_common.cpp b/src/mesa/state_tracker/tests/st_tests_common.cpp
index ea01ca..63e3d6b2c4 100644
--- a/src/mesa/state_tracker/tests/st_tests_common.cpp
+++ b/src/mesa/state_tracker/tests/st_tests_common.cpp
@@ -43,7 +43,7 @@ using std::tuple;
 /* Implementation of helper and test classes */
 void *FakeCodeline::mem_ctx = nullptr;
 
-FakeCodeline::FakeCodeline(unsigned _op, const vector& _dst,
+FakeCodeline::FakeCodeline(tgsi_opcode _op, const vector& _dst,
const vector& _src, const vector&_to):
op(_op),
max_temp_id(0)
@@ -59,7 +59,7 @@ FakeCodeline::FakeCodeline(unsigned _op, const vector& _dst,
 
 }
 
-FakeCodeline::FakeCodeline(unsigned _op, const vector>& _dst,
+FakeCodeline::FakeCodeline(tgsi_opcode _op, const vector>& _dst,
const vector>& _src,
const vector>&_to,
SWZ with_swizzle):
@@ -84,7 +84,7 @@ FakeCodeline::FakeCodeline(unsigned _op, const vector>& _dst,
});
 }
 
-FakeCodeline::FakeCodeline(unsigned _op, const vector>& _dst,
+FakeCodeline::FakeCodeline(tgsi_opcode _op, const vector>& _dst,
const vector>& _src,
const vector>&_to, RA with_reladdr):
op(_op),
diff --git a/src/mesa/state_tracker/tests/st_tests_common.h b/src/mesa/state_tracker/tests/st_tests_common.h
index 2e18832923..6d855fe581 100644
--- a/src/mesa/state_tracker/tests/st_tests_common.h
+++ b/src/mesa/state_tracker/tests/st_tests_common.h
@@ -40,15 +40,15 @@ struct RA {};
 
 /* A line to describe a TGSI instruction for building mock shaders. */
 struct FakeCodeline {
-   FakeCodeline(unsigned _op): op(_op), max_temp_id(0) {}
-   FakeCodeline(unsigned _op, const std::vector& _dst, const std::vector& _src,
+   FakeCodeline(tgsi_opcode _op): op(_op), max_temp_id(0) {}
+   FakeCodeline(tgsi_opcode _op, const std::vector& _dst, const std::vector& _src,
 const std::vector&_to);
 
-   FakeCodeline(unsigned _op, const std::vector>& _dst,
+   FakeCodeline(tgsi_opcode _op, const std::vector>& _dst,
 const std::vector>& _src,
 const std::vector>&_to, SWZ with_swizzle);
 
-   FakeCodeline(unsigned _op, const std::vector>& _dst,
+   FakeCodeline(tgsi_opcode _op, const std::vector>& _dst,
 const std::vector>& _src,
 const std::vector>&_to, RA with_reladdr);
 
@@ -78,7 +78,7 @@ private:
template 
void read_reg(const st_reg& s);
 
-   unsigned op;
+   tgsi_opcode op;
std::vector dst;
std::vector src;
std::vector tex_offsets;



Mesa (master): meson: fix header check message

2018-03-26 Thread Eric Engeström
Module: Mesa
Branch: master
Commit: 1e36fe5dc490ed64736591acbcd32876ea1a69dd
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=1e36fe5dc490ed64736591acbcd32876ea1a69dd

Author: Eric Engestrom 
Date:   Fri Mar 23 17:18:56 2018 +

meson: fix header check message

before: Checking if "endian.h works" compiles: YES
after:  Checking if "endian.h" compiles: YES

Signed-off-by: Eric Engestrom 
Reviewed-by: Emil Velikov 

---

 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 041d2bfc70..f210eeb253 100644
--- a/meson.build
+++ b/meson.build
@@ -917,7 +917,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
 endif
 
 foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
-  if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
+  if cc.compiles('#include <@0@>'.format(h), name : '@0@'.format(h))
 pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
   endif
 endforeach
