[Mesa-dev] [PATCH 2/9] gallium: use $(top_builddir) when referencing other .la's

2015-04-15 Thread Emil Velikov
Just like every other place in gallium.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/drivers/freedreno/Makefile.am | 2 +-
 src/gallium/drivers/nouveau/Makefile.am   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/Makefile.am 
b/src/gallium/drivers/freedreno/Makefile.am
index e798e44..cbf62c6 100644
--- a/src/gallium/drivers/freedreno/Makefile.am
+++ b/src/gallium/drivers/freedreno/Makefile.am
@@ -28,7 +28,7 @@ ir3_compiler_SOURCES = \
 
 ir3_compiler_LDADD = \
libfreedreno.la \
-   ../../auxiliary/libgallium.la \
+   $(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(top_builddir)/src/glsl/libnir.la \
$(top_builddir)/src/libglsl_util.la \
$(top_builddir)/src/util/libmesautil.la \
diff --git a/src/gallium/drivers/nouveau/Makefile.am 
b/src/gallium/drivers/nouveau/Makefile.am
index 0aefc03..d05f0a1 100644
--- a/src/gallium/drivers/nouveau/Makefile.am
+++ b/src/gallium/drivers/nouveau/Makefile.am
@@ -48,7 +48,7 @@ nouveau_compiler_SOURCES = \
 
 nouveau_compiler_LDADD = \
libnouveau.la \
-   ../../auxiliary/libgallium.la \
+   $(top_builddir)/src/gallium/auxiliary/libgallium.la \
$(top_builddir)/src/util/libmesautil.la \
$(GALLIUM_COMMON_LIB_DEPS)
 
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/skl: Add the header for constant loads outside of the generator

2015-04-15 Thread Neil Roberts
Ben Widawsky b...@bwidawsk.net writes:

 +   void generate_set_simd4x2_header_gen9(vec4_instruction *inst,
 + struct brw_reg dst);

 super bikesheddy, but this function name sounds pretty bad, though I
 understand it's a side-effect of the name of the opcode (which seems
 consistent with similar opcodes). I can't think of anything great
 either though.

Yes, it's a bit awkward, but unless someone proposes a better name I
will leave it. It matches methods like
fs_generator::generate_set_simd4x2_offset which already exists.

  void
 +vec4_generator::generate_set_simd4x2_header_gen9(vec4_instruction *inst,
 + struct brw_reg dst)
 +{
 +   /* Skylake requires a message header in order to use SIMD4x2 mode. */

 This comment is pretty useless here.

Yes, good point. I'll remove it.

 +   brw_push_insn_state(p);
 +   brw_set_default_mask_control(p, BRW_MASK_DISABLE);
 +
 +   brw_MOV(p, dst, retype(brw_vec4_grf(0, 0), BRW_REGISTER_TYPE_UD));
 +

 retype(brw_vec8_grf(0, 0)? It would jive better with the diff.

Ah yes, oops. I've accidentally undone the fix I made in 07c571a39fa1
here. I'll fix it.

 +void
 +vec4_visitor::emit_pull_constant_load_reg(dst_reg dst,
 +  src_reg surf_index,
 +  src_reg offset_reg,
 +  bblock_t *before_block,
 +  vec4_instruction *before_inst)
 +{
 +   assert((before_inst == NULL  before_block == NULL) ||
 +  (before_inst  before_block));
 +
 +   vec4_instruction *pull;
 +
 +   if (brw-gen = 9) {
 +  /* Gen9+ needs a message header in order to use SIMD4x2 mode */
 +  src_reg header(this, glsl_type::uvec4_type, 2);
 +

 Just curious why you didn't go for a uvec8_type. It seems more natural
 for what we want to do, and how the other code handles things.

I don't think I picked uvec4_type here for any particular reason. I
guess it's hard to pick a type that makes sense because it's a virtual
register that needs to be sized for 2 actual registers where the first
one is a chunk of structed data used for the header and the second one
is actually just an integer I think. As Ian pointed out, uvec8_type
doesn't exist.

 +  dst_reg index_reg = retype(offset(dst_reg(header), 1),
 + offset_reg.type);
 +  pull = MOV(writemask(index_reg, WRITEMASK_X), offset_reg);

 I am not finding where the 1 in the offset there comes from? I guess
 that's why you used a uvec4 above, though I think suboffset can
 achieve the same. Can you elaborate for me, I was sort of expecting to
 see a '2' instead of 1?

This isn't a suboffset into a register, the 1 means to add a whole
register. I think the register type is irrelevant in that case. The
header register is allocated as two-registers wide. The first actual
register gets the header and here we are copying in the index into the
second register.

 I wouldn't be opposed to doing the extraction as a prep-patch fwiw.
 Should make things slightly easier to review, but I can live with
 whatever you prefer.

Ok, good idea. I'll do that.

 I couldn't spot any issues otherwise. If you can address my questions
 comments, I think I'm ready to slap on the r-b

Thanks for the review.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] llvmpipe support for ARM?

2015-04-15 Thread Marko Moberg
Hi,

I have been trying to get OpenGL sw rastering to work on Linux based ARM
CortexA15 device but for some reason gallium llvmpipe gives me a
segmentation fault. Has anybody managed to run sw rastering using gallium
llvmpipe on ARM? Softpipe seems to be working ok.

I am using the following versions:
- Mesa 9.2.5
- LLVM3.3
- Wayland/Weston 1.5

The issue is related to opengl/gles2 rendering i.e. weston-simple-egl demo
and Qt openGL demos give seg faults when executed with llvmpipe enabled.

Backtrace from weston-simple-egl looks something like this:
#0  0x6d7de620 in ?? ()
#1  0xb354004c in fs17_variant0_partial ()
#2  0xb6841740 in lp_rast_shade_quads_mask (task=task@entry=0x1e360,
inputs=inputs@entry=0x47a00, x=128, y=80, mask=4369) at lp_rast.c:466
#3  0xb6842d04 in do_block_4_1 (c=optimized out, y=optimized out,
x=optimized out, plane=optimized out, tri=optimized out,
task=optimized out) at lp_rast_tri_tmp.h:61
#4  do_block_16_1 (c=synthetic pointer, y=optimized out,
x=optimized out, plane=0xb4f4ccf0, tri=optimized out,
task=optimized out) at lp_rast_tri_tmp.h:130
#5  lp_rast_triangle_1 (task=0x1e360, arg=...) at lp_rast_tri_tmp.h:232
#6  0xb6840420 in do_rasterize_bin (bin=optimized out, task=0x1e360,
x=optimized out, y=optimized out) at lp_rast.c:607
#7  rasterize_bin (y=optimized out, x=optimized out, bin=optimized
out,
task=0x1e360) at lp_rast.c:626
#8  rasterize_scene (task=task@entry=0x1e360, scene=0xb3646008)
at lp_rast.c:675
#9  0xb6840cb0 in thread_function (init_data=0x1e360) at lp_rast.c:788
#10 0xb6d98ed2 in start_thread () from /lib/libpthread.so.0
#11 0xb6c91058 in ?? () from /lib/libc.so.6
#12 0xb6c91058 in ?? () from /lib/libc.so.6

regards,
Marko

marko.s.mob...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] gallium/ttn: add support for texture offsets

2015-04-15 Thread Rob Clark
On Tue, Apr 14, 2015 at 1:04 PM, Eric Anholt e...@anholt.net wrote:
 Rob Clark robdcl...@gmail.com writes:

 From: Rob Clark robcl...@freedesktop.org

 Signed-off-by: Rob Clark robcl...@freedesktop.org

 1-3 (with the fix to 1 that you posted in irc) are:

 Reviewed-by: Eric Anholt e...@anholt.net

 I don't like the mismatch on bytes vs vec4s in the load_ubo_indirect
 arguments for patch 4, and will be interested in seeing the version you
 were working on to fix that.

so, if I pull out the addressing-modes patch, then for the ubo patch
we are back to 'index *= 16' vec4-byte conversion.  Which is funny
looking, but at the moment my best definition of correct is what
glsl_to_nir and intel driver do, and by that definition, this is the
correct thing to do.

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/9] freedreno: use CXX linker rather than explicit link against libstdc++

2015-04-15 Thread Emil Velikov
Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/drivers/freedreno/Makefile.am | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/Makefile.am 
b/src/gallium/drivers/freedreno/Makefile.am
index 4b2629f..e798e44 100644
--- a/src/gallium/drivers/freedreno/Makefile.am
+++ b/src/gallium/drivers/freedreno/Makefile.am
@@ -21,6 +21,8 @@ libfreedreno_la_SOURCES = \
 
 noinst_PROGRAMS = ir3_compiler
 
+# XXX: Required due to the C++ sources in libnir/libglsl_util
+nodist_EXTRA_ir3_compiler_SOURCES = dummy.cpp
 ir3_compiler_SOURCES = \
ir3/ir3_cmdline.c
 
@@ -29,7 +31,6 @@ ir3_compiler_LDADD = \
../../auxiliary/libgallium.la \
$(top_builddir)/src/glsl/libnir.la \
$(top_builddir)/src/libglsl_util.la \
-   -lstdc++ \
$(top_builddir)/src/util/libmesautil.la \
$(GALLIUM_COMMON_LIB_DEPS) \
$(FREEDRENO_LIBS)
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/9] pipe-loader: add libnir and libglsl_util to the link

2015-04-15 Thread Emil Velikov
Otherwise we'll error out due to unresolved nir symbols. Note that we
still fail to link due to unresolved _mesa_error_no_memory().

Based on commit 101142c4010(xa: support for drivers which use NIR)

Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/targets/pipe-loader/Makefile.am | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/targets/pipe-loader/Makefile.am 
b/src/gallium/targets/pipe-loader/Makefile.am
index 967cdb7..e4048b5 100644
--- a/src/gallium/targets/pipe-loader/Makefile.am
+++ b/src/gallium/targets/pipe-loader/Makefile.am
@@ -52,6 +52,8 @@ endif
 
 PIPE_LIBS += \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+   $(top_builddir)/src/glsl/libnir.la \
+   $(top_builddir)/src/libglsl_util.la \
$(top_builddir)/src/util/libmesautil.la \
$(top_builddir)/src/gallium/drivers/rbug/librbug.la \
$(top_builddir)/src/gallium/drivers/trace/libtrace.la \
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/9] freedreno/ir3: use the weak _mesa_error_no_memory() symbol

2015-04-15 Thread Emil Velikov
We no longer need to (although we can) provide this symbol ourselves.

Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/drivers/freedreno/ir3/ir3_cmdline.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c 
b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
index d0517aa..0b16cc1 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
@@ -360,11 +360,3 @@ int main(int argc, char **argv)
}
dump_info(v, info);
 }
-
-void _mesa_error_no_memory(const char *caller);
-
-void
-_mesa_error_no_memory(const char *caller)
-{
-   fprintf(stderr, Mesa error: out of memory in %s, caller);
-}
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 9/9] r300: do not link against libdrm_intel

2015-04-15 Thread Emil Velikov
Accidentally added since the introduction of the file.

Cc: 10.4 10.5 mesa-sta...@lists.freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/drivers/r300/Automake.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r300/Automake.inc 
b/src/gallium/drivers/r300/Automake.inc
index 9334973..d4ddc40 100644
--- a/src/gallium/drivers/r300/Automake.inc
+++ b/src/gallium/drivers/r300/Automake.inc
@@ -5,7 +5,7 @@ TARGET_CPPFLAGS += -DGALLIUM_R300
 TARGET_LIB_DEPS += \
$(top_builddir)/src/gallium/drivers/r300/libr300.la \
$(RADEON_LIBS) \
-   $(INTEL_LIBS)
+   $(LIBDRM_LIBS)
 
 TARGET_RADEON_WINSYS = \
$(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Emil Velikov
Rather than forcing everyone to provide their own definition of the symbol
provide a common weak one, which anyone can override if needed.

This resolved the build of the standalone pipe-drivers, as it provides a
default symbol which was missing previously.

Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/Makefile.am|  3 ++-
 src/glsl/SConscript|  2 ++
 src/mesa/Android.libmesa_glsl_utils.mk |  6 --
 src/mesa/program/weak_errors.c | 30 ++
 4 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 src/mesa/program/weak_errors.c

diff --git a/src/Makefile.am b/src/Makefile.am
index 18cb4ce..461da27 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -72,4 +72,5 @@ noinst_LTLIBRARIES = libglsl_util.la
 libglsl_util_la_SOURCES = \
mesa/main/imports.c \
mesa/program/prog_hash_table.c \
-   mesa/program/symbol_table.c
+   mesa/program/symbol_table.c \
+   mesa/program/weak_errors.c
diff --git a/src/glsl/SConscript b/src/glsl/SConscript
index 284b375..d18a8a7 100644
--- a/src/glsl/SConscript
+++ b/src/glsl/SConscript
@@ -71,6 +71,7 @@ env.Command('imports.c', '#src/mesa/main/imports.c', 
Copy('$TARGET', '$SOURCE'))
 env.Prepend(CPPPATH = ['#src/mesa/program'])
 env.Command('prog_hash_table.c', '#src/mesa/program/prog_hash_table.c', 
Copy('$TARGET', '$SOURCE'))
 env.Command('symbol_table.c', '#src/mesa/program/symbol_table.c', 
Copy('$TARGET', '$SOURCE'))
+env.Command('weak_errors.c', '#src/mesa/program/weak_errors.c', 
Copy('$TARGET', '$SOURCE'))
 
 compiler_objs = env.StaticObject(source_lists['GLSL_COMPILER_CXX_FILES'])
 
@@ -78,6 +79,7 @@ mesa_objs = env.StaticObject([
 'imports.c',
 'prog_hash_table.c',
 'symbol_table.c',
+'weak_errors.c',
 ])
 
 compiler_objs += mesa_objs
diff --git a/src/mesa/Android.libmesa_glsl_utils.mk 
b/src/mesa/Android.libmesa_glsl_utils.mk
index a9f6ff5..08786e3 100644
--- a/src/mesa/Android.libmesa_glsl_utils.mk
+++ b/src/mesa/Android.libmesa_glsl_utils.mk
@@ -43,7 +43,8 @@ LOCAL_C_INCLUDES := \
 LOCAL_SRC_FILES := \
main/imports.c \
program/prog_hash_table.c \
-   program/symbol_table.c
+   program/symbol_table.c \
+   program/weak_errors.c
 
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
@@ -66,7 +67,8 @@ LOCAL_C_INCLUDES := \
 LOCAL_SRC_FILES := \
main/imports.c \
program/prog_hash_table.c \
-   program/symbol_table.c
+   program/symbol_table.c \
+   program/weak_errors.c
 
 include $(MESA_COMMON_MK)
 include $(BUILD_HOST_STATIC_LIBRARY)
diff --git a/src/mesa/program/weak_errors.c b/src/mesa/program/weak_errors.c
new file mode 100644
index 000..85c35dc
--- /dev/null
+++ b/src/mesa/program/weak_errors.c
@@ -0,0 +1,30 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+#include stdio.h
+#include main/errors.h
+
+__attribute__((weak)) void
+_mesa_error_no_memory(const char *caller)
+{
+   fprintf(stderr, Mesa error: out of memory in %s, caller);
+}
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 8/9] targets/nine: link against libnir/libglsl_util

2015-04-15 Thread Emil Velikov
Similar to commit 127f8767e0a and 101142c4010.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/targets/d3dadapter9/Makefile.am | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/targets/d3dadapter9/Makefile.am 
b/src/gallium/targets/d3dadapter9/Makefile.am
index 1dc55f5..591978f 100644
--- a/src/gallium/targets/d3dadapter9/Makefile.am
+++ b/src/gallium/targets/d3dadapter9/Makefile.am
@@ -74,6 +74,8 @@ endif # HAVE_LD_VERSION_SCRIPT
 d3dadapter9_la_LIBADD = \
$(top_builddir)/src/gallium/auxiliary/libgalliumvl_stub.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+   $(top_builddir)/src/glsl/libnir.la \
+   $(top_builddir)/src/libglsl_util.la \
$(top_builddir)/src/gallium/state_trackers/nine/libninetracker.la \
$(top_builddir)/src/util/libmesautil.la \
$(top_builddir)/src/gallium/winsys/sw/wrapper/libwsw.la \
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/9] st/xa: use the weak _mesa_error_no_memory() symbol

2015-04-15 Thread Emil Velikov
Thus we can remove the workaround that we previously had.

Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/gallium/state_trackers/xa/xa_tracker.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_tracker.c 
b/src/gallium/state_trackers/xa/xa_tracker.c
index 8901998..f69ac8e 100644
--- a/src/gallium/state_trackers/xa/xa_tracker.c
+++ b/src/gallium/state_trackers/xa/xa_tracker.c
@@ -535,15 +535,3 @@ xa_surface_format(const struct xa_surface *srf)
 {
 return srf-fdesc.xa_format;
 }
-
-/*
- * _mesa_error_no_memory() is expected by NIR to be provided by the
- * user.  Normally this is in mesa st, but other state trackers
- * must provide their own.
- */
-void _mesa_error_no_memory(const char *caller);
-void
-_mesa_error_no_memory(const char *caller)
-{
-   debug_printf(Mesa error: out of memory in %s, caller);
-}
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glx: Allow to create any OpenGL ES version.

2015-04-15 Thread Emil Velikov
On 14 April 2015 at 17:22, Ian Romanick i...@freedesktop.org wrote:
 On 04/14/2015 08:36 AM, Emil Velikov wrote:
 On 14 April 2015 at 13:52, Jose Fonseca jfons...@vmware.com wrote:
 On 13/04/15 22:59, Ian Romanick wrote:

 On 04/10/2015 03:36 PM, Jose Fonseca wrote:

 From: José Fonseca jfons...@vmware.com

 The latest version of GLX_EXT_create_context_es2_profile states:

If the version requested is a valid and supported OpenGL-ES version,
and the GLX_CONTEXT_ES_PROFILE_BIT_EXT bit is set in the
GLX_CONTEXT_PROFILE_MASK_ARB attribute (see below), then the context
returned will implement the OpenGL ES version requested.

 We must also export EXT_create_context_es_profile too, as
 EXT_create_context_es2_profile specification is crystal clear:

NOTE: implementations of this extension must export BOTH extension
strings, for backwards compatibility with applications written
against version 1 of this extension.

 Totally untested.  (Just happened to noticed this while implementing
 GLX_EXT_create_context_es2_profile for st/xlib.)

 Reviewed-by: Brian Paul bri...@vmware.com
 Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

 v2: Replicate the drisw_glx.c to dri2_glx.c and dri3_glx.c as suggested
 by Emil Velikov.
 ---
   src/glx/dri2_glx.c   |  5 -
   src/glx/dri3_glx.c   |  5 -
   src/glx/dri_common.c | 32 
   src/glx/drisw_glx.c  |  2 ++
   4 files changed, 26 insertions(+), 18 deletions(-)

 diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
 index 462d560..8192c54 100644
 --- a/src/glx/dri2_glx.c
 +++ b/src/glx/dri2_glx.c
 @@ -1102,9 +1102,12 @@ dri2BindExtensions(struct dri2_screen *psc, struct
 glx_display * priv,
 __glXEnableDirectExtension(psc-base, GLX_ARB_create_context);
 __glXEnableDirectExtension(psc-base,
 GLX_ARB_create_context_profile);

 -  if ((mask  (1  __DRI_API_GLES2)) != 0)
 +  if ((mask  (1  __DRI_API_GLES2)) != 0) {
 +__glXEnableDirectExtension(psc-base,
 +   GLX_EXT_create_context_es_profile);
  __glXEnableDirectExtension(psc-base,

 GLX_EXT_create_context_es2_profile);
 +  }
  }

  for (i = 0; extensions[i]; i++) {
 diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
 index 1ddc723..6973ad1 100644
 --- a/src/glx/dri3_glx.c
 +++ b/src/glx/dri3_glx.c
 @@ -1825,9 +1825,12 @@ dri3_bind_extensions(struct dri3_screen *psc,
 struct glx_display * priv,
  __glXEnableDirectExtension(psc-base, GLX_ARB_create_context);
  __glXEnableDirectExtension(psc-base,
 GLX_ARB_create_context_profile);

 -   if ((mask  (1  __DRI_API_GLES2)) != 0)
 +   if ((mask  (1  __DRI_API_GLES2)) != 0) {
 +  __glXEnableDirectExtension(psc-base,
 + GLX_EXT_create_context_es_profile);
 __glXEnableDirectExtension(psc-base,
GLX_EXT_create_context_es2_profile);
 +   }

  for (i = 0; extensions[i]; i++) {
 /* when on a different gpu than the server, the server pixmaps
 diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
 index 63c8de3..541abbb 100644
 --- a/src/glx/dri_common.c
 +++ b/src/glx/dri_common.c
 @@ -544,9 +544,22 @@ dri2_convert_glx_attribs(unsigned num_attribs, const
 uint32_t *attribs,
 case GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB:
  *api = __DRI_API_OPENGL;
  break;
 -  case GLX_CONTEXT_ES2_PROFILE_BIT_EXT:
 -*api = __DRI_API_GLES2;
 -break;
 +  case GLX_CONTEXT_ES_PROFILE_BIT_EXT:
 + switch (*major_ver) {
 + case 3:
 +*api = __DRI_API_GLES3;
 +break;
 + case 2:
 +*api = __DRI_API_GLES2;
 +break;
 + case 1:
 +*api = __DRI_API_GLES;
 +break;
 + default:
 +*error = __DRI_CTX_ERROR_BAD_API;
 +return false;
 + }
 + break;
 default:
  *error = __DRI_CTX_ERROR_BAD_API;
  return false;
 @@ -577,19 +590,6 @@ dri2_convert_glx_attribs(unsigned num_attribs, const
 uint32_t *attribs,
 return false;
  }

 -   /* The GLX_EXT_create_context_es2_profile spec says:
 -*
 -* ... If the version requested is 2.0, and the
 -* GLX_CONTEXT_ES2_PROFILE_BIT_EXT bit is set in the
 -* GLX_CONTEXT_PROFILE_MASK_ARB attribute (see below), then the
 context
 -* returned will implement OpenGL ES 2.0. This is the only way in
 which
 -* an implementation may request an OpenGL ES 2.0 context.
 -*/
 -   if (*api == __DRI_API_GLES2  (*major_ver != 2 || *minor_ver != 0))
 {
 -  *error = __DRI_CTX_ERROR_BAD_API;
 -  return false;
 -   }


 I guess this text was removed from the extension spec, and now we rely
 on some other layer detecting invalid versions (like 2.1)?


 Yes, the spec replaced that with

 ... If the version requested is a valid and supported OpenGL-ES
 version,
 and the 

Re: [Mesa-dev] llvmpipe support for ARM?

2015-04-15 Thread Jan Vesely
Hi,
I can run arm llvmpipe glxgears on IFC6410 (Snapdragon 600 APQ8064,
Krait 300) without problems.
llvm 3.5
mesa 10.4.3
both are fedora 21 standard packages without modifications.
I haven't tried anything more complex than glxgears

jan

On Wed, 2015-04-15 at 16:30 +0200, Roland Scheidegger wrote:
 There were reports with varying success of running llvmpipe with arm in
 the past but I haven't heard anything for a while. In theory it should
 work (and crashing in the pixel shader is an indication the vertex
 shader worked or at least didn't crash).
 But it's not really tested and hence might easily break. My suggestion
 would be to try a newer Mesa version since 9.2.5 is very old, but I
 really don't know if it will help.
 If that doesn't help, you could try debugging it, you can get the
 crashing instruction in gdb with x/i address and see if that was an
 unaligned access which should have been aligned or something else.
 GALLIVM_DEBUG=ir,asm will get you the IR and compiled code (with debug
 builds).
 
 Roland
 
 
 Am 15.04.2015 um 14:37 schrieb Marko Moberg:
  Hi,
  
  I have been trying to get OpenGL sw rastering to work on Linux based ARM
  CortexA15 device but for some reason gallium llvmpipe gives me a
  segmentation fault. Has anybody managed to run sw rastering using
  gallium llvmpipe on ARM? Softpipe seems to be working ok.
  
  I am using the following versions:
  - Mesa 9.2.5
  - LLVM3.3
  - Wayland/Weston 1.5
  
  The issue is related to opengl/gles2 rendering i.e. weston-simple-egl
  demo and Qt openGL demos give seg faults when executed with llvmpipe
  enabled. 
  
  Backtrace from weston-simple-egl looks something like this:
  #0  0x6d7de620 in ?? ()
  #1  0xb354004c in fs17_variant0_partial ()
  #2  0xb6841740 in lp_rast_shade_quads_mask (task=task@entry=0x1e360,
  inputs=inputs@entry=0x47a00, x=128, y=80, mask=4369) at lp_rast.c:466
  #3  0xb6842d04 in do_block_4_1 (c=optimized out, y=optimized out,
  x=optimized out, plane=optimized out, tri=optimized out,
  task=optimized out) at lp_rast_tri_tmp.h:61
  #4  do_block_16_1 (c=synthetic pointer, y=optimized out,
  x=optimized out, plane=0xb4f4ccf0, tri=optimized out,
  task=optimized out) at lp_rast_tri_tmp.h:130
  #5  lp_rast_triangle_1 (task=0x1e360, arg=...) at lp_rast_tri_tmp.h:232
  #6  0xb6840420 in do_rasterize_bin (bin=optimized out, task=0x1e360,
  x=optimized out, y=optimized out) at lp_rast.c:607
  #7  rasterize_bin (y=optimized out, x=optimized out, bin=optimized
  out,
  task=0x1e360) at lp_rast.c:626
  #8  rasterize_scene (task=task@entry=0x1e360, scene=0xb3646008)
  at lp_rast.c:675
  #9  0xb6840cb0 in thread_function (init_data=0x1e360) at lp_rast.c:788
  #10 0xb6d98ed2 in start_thread () from /lib/libpthread.so.0
  #11 0xb6c91058 in ?? () from /lib/libc.so.6
  #12 0xb6c91058 in ?? () from /lib/libc.so.6
  
  regards,
  Marko
  
  marko.s.mob...@gmail.com mailto:marko.s.mob...@gmail.com
  
  
  
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddevd=AwIGaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_Im=5qZ4KpVhu_wXA8-dTn8xYiul5G713jYwztPr8kfcUZMs=wBNyDvf5DiNBjxXXl7QvyHOvNuTVjZAaw0n7WaQM-bAe=
   
  
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Jan Vesely jan.ves...@rutgers.edu


signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] llvmpipe support for ARM?

2015-04-15 Thread Roland Scheidegger
There were reports with varying success of running llvmpipe with arm in
the past but I haven't heard anything for a while. In theory it should
work (and crashing in the pixel shader is an indication the vertex
shader worked or at least didn't crash).
But it's not really tested and hence might easily break. My suggestion
would be to try a newer Mesa version since 9.2.5 is very old, but I
really don't know if it will help.
If that doesn't help, you could try debugging it, you can get the
crashing instruction in gdb with x/i address and see if that was an
unaligned access which should have been aligned or something else.
GALLIVM_DEBUG=ir,asm will get you the IR and compiled code (with debug
builds).

Roland


Am 15.04.2015 um 14:37 schrieb Marko Moberg:
 Hi,
 
 I have been trying to get OpenGL sw rastering to work on Linux based ARM
 CortexA15 device but for some reason gallium llvmpipe gives me a
 segmentation fault. Has anybody managed to run sw rastering using
 gallium llvmpipe on ARM? Softpipe seems to be working ok.
 
 I am using the following versions:
 - Mesa 9.2.5
 - LLVM3.3
 - Wayland/Weston 1.5
 
 The issue is related to opengl/gles2 rendering i.e. weston-simple-egl
 demo and Qt openGL demos give seg faults when executed with llvmpipe
 enabled. 
 
 Backtrace from weston-simple-egl looks something like this:
 #0  0x6d7de620 in ?? ()
 #1  0xb354004c in fs17_variant0_partial ()
 #2  0xb6841740 in lp_rast_shade_quads_mask (task=task@entry=0x1e360,
 inputs=inputs@entry=0x47a00, x=128, y=80, mask=4369) at lp_rast.c:466
 #3  0xb6842d04 in do_block_4_1 (c=optimized out, y=optimized out,
 x=optimized out, plane=optimized out, tri=optimized out,
 task=optimized out) at lp_rast_tri_tmp.h:61
 #4  do_block_16_1 (c=synthetic pointer, y=optimized out,
 x=optimized out, plane=0xb4f4ccf0, tri=optimized out,
 task=optimized out) at lp_rast_tri_tmp.h:130
 #5  lp_rast_triangle_1 (task=0x1e360, arg=...) at lp_rast_tri_tmp.h:232
 #6  0xb6840420 in do_rasterize_bin (bin=optimized out, task=0x1e360,
 x=optimized out, y=optimized out) at lp_rast.c:607
 #7  rasterize_bin (y=optimized out, x=optimized out, bin=optimized
 out,
 task=0x1e360) at lp_rast.c:626
 #8  rasterize_scene (task=task@entry=0x1e360, scene=0xb3646008)
 at lp_rast.c:675
 #9  0xb6840cb0 in thread_function (init_data=0x1e360) at lp_rast.c:788
 #10 0xb6d98ed2 in start_thread () from /lib/libpthread.so.0
 #11 0xb6c91058 in ?? () from /lib/libc.so.6
 #12 0xb6c91058 in ?? () from /lib/libc.so.6
 
 regards,
 Marko
 
 marko.s.mob...@gmail.com mailto:marko.s.mob...@gmail.com
 
 
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddevd=AwIGaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_Im=5qZ4KpVhu_wXA8-dTn8xYiul5G713jYwztPr8kfcUZMs=wBNyDvf5DiNBjxXXl7QvyHOvNuTVjZAaw0n7WaQM-bAe=
  
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glx: Create proper server dependency for GLX_EXT_create_context_es2_profile

2015-04-15 Thread Emil Velikov
On 14 April 2015 at 17:35, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Previously GLX_EXT_create_context_es2_profile was marked as direct
 only so that it would not depend on server support.  Since the
 extension required functions that are part of
 GLX_ARB_create_context_profile, support for the EXT was disabled if the
 ARB was not supported.

 This was complete rubbish.  If the server supported the ARB but not the
 EXT, sending a request with GLX_CONTEXT_ES2_PROFILE_BIT_EXT would result
 in GLXBadProfileARB.

 Instead of the misguided hack, make GLX_EXT_create_context_es2_profile
 properly depend on server support by not marking it as direct only.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: José Fonseca jfons...@vmware.com
 Cc: Chad Versace chad.vers...@intel.com
 Cc: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/glx/glxextensions.c | 12 +---
  1 file changed, 1 insertion(+), 11 deletions(-)

 diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c
 index ce5d66d..a326f0d 100644
 --- a/src/glx/glxextensions.c
 +++ b/src/glx/glxextensions.c
 @@ -84,7 +84,7 @@ static const struct extension_info known_glx_extensions[] = 
 {
 { GLX(EXT_visual_rating),   VER(0,0), Y, Y, N, N },
 { GLX(EXT_fbconfig_packed_float),   VER(0,0), Y, Y, N, N },
 { GLX(EXT_framebuffer_sRGB),VER(0,0), Y, Y, N, N },
 -   { GLX(EXT_create_context_es2_profile), VER(0,0), Y, N, N, Y },
 +   { GLX(EXT_create_context_es2_profile), VER(0,0), Y, N, N, N },
I've had a look at these a while back, and I don't think I ever got
the idea how an extension can be direct_only without having
direct_support. If you can spare a minute that would be greatly
appreciated, the glx code can be a bit misty at times.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] nir: compiler options for addressing modes

2015-04-15 Thread Rob Clark
On Wed, Apr 15, 2015 at 10:32 AM, Connor Abbott cwabbo...@gmail.com wrote:

 But more immediately, I hit a sort of snag:  I cannot seem to narrow
 from 32b to 16b at the same time I move to address register.  Which
 ends up meaning I need a mov from 32b to 16b followed by a 2nd mov to
 get it into address register...  which sort of defeats the purpose of
 this whole exercise.  I need to do some more r/e around this, but it
 may end up being better the way it was before.  And if we end up
 needing to do the shl in half-precision registers, then solving this
 in NIR would (I think) require NIR to be aware of half-precision.
 Which sounds useful, but -EBIGGER_FIRES

 I don't quite understand... is this just a problem with using
 registers? Would the entire sequence of operations need to be in 16
 bits, or can you have whatever instruction computed your address do
 the conversion to 16-bit as part of the output? If it's the latter,
 you can just re-emit a 16-bit-outputting version of it and use that,
 although it's a bit of a hack.

Well, the problem if the shl is done in NIR, I commenly end up with:

  cov.f32s32 r0.x, r0.x
  shl.b r0.x, r0.x, 2
  cov.s32s16 hr0.x, r0.x
  mova a0.x, hr0.x

vs

  cov.f32s16 hr0.x, r0.x
  shl.b hr0.x, hr0.x, 2
  mova a0.x, hr0.x

(in both cases, with four nop's between each if I don't have other
instructions I can schedule in between, so the extra instruction here
could translate to 4 cycles in a lot of cases)

I tried to see if I could do something like 'shl.b hr0.x, r0.x, 2' but
that didn't seem to work.. possibly the narrowing on alu op's only
works for float.  (The ISA has all these fun holes like that, where
the instruction encoding supports things, but the designers apparently
didn't think it was worth it to spend transistors for less common
cases)


 Long-term, support for half-precision in NIR is definitely in the
 cards, but it'll probably have to wait until fp64 support as they're
 both very similar wrt changes we have to make in the IR. Unless
 someone has a burning desire to do half-precision first :).

It would be useful..  or at least my understanding is half-precision
should be faster for me.  Although I don't see a particularly good way
to solve that until we figure out how to get TGSI out from the middle.
I guess I need to take a look at the tie-ins between
vbo/uniform/texture/etc state in mesa st and tgsi, and figure out how
to make that work without a glsl_to_tgsi pass so drivers can request
NIR directly.  That probably comes after flow control in ir3 backend,
but *eventually* I'll be really interested in proper mediump support.

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] nir: compiler options for addressing modes

2015-04-15 Thread Connor Abbott
On Tue, Apr 14, 2015 at 7:08 PM, Rob Clark robdcl...@gmail.com wrote:
 On Tue, Apr 14, 2015 at 6:24 PM, Connor Abbott cwabbo...@gmail.com wrote:
 On Tue, Apr 14, 2015 at 5:16 PM, Rob Clark robdcl...@gmail.com wrote:
 On Tue, Apr 14, 2015 at 4:59 PM, Jason Ekstrand ja...@jlekstrand.net 
 wrote:
 +   /**
 +* Addressing mode for corresponding _indirect intrinsics:
 +*/
 +   nir_addressing_mode var_addressing_mode;
 +   nir_addressing_mode input_addressing_mode;
 +   nir_addressing_mode output_addressing_mode;
 +   nir_addressing_mode uniform_addressing_mode;
 +   nir_addressing_mode ubo_addressing_mode;

 What is var_addressing_mode?  Sorry for not bringing that up before.


 well, originally in my thinking it was for load_var/store_var..  but
 perhaps that doesn't make sense (given lower_io).  Maybe it makes more
 sense to define is as applying to var_local/var_global (where the
 others apply to shader_in/shader_out/uniform and their equivalent
 intrinsic forms)?

 Maybe it's a bit weird since I don't lower vars to regs before feeding
 to my ir3 frontend, but the whole load_var/store_var for array access,
 and ssa for everything else form works kind of nicely for me.

 BR,
 -R

 I don't think we should be letting the driver define the stride of
 variable array accesses. Variables are supposed to be structured,
 backend-independent things that core NIR can manipulate and optimize
 as it pleases; it shouldn't need to know anything about how the driver
 will index the data. For doing the kinds of optimizations you're
 talking about, you have registers that are backend-dependent, and core
 NIR (other than the lower_locals_to_regs) doesn't need to know what
 the indices mean. What you're doing right now is a hack, and if you
 want to get the benefits of optimizing the index expression in core
 NIR you should be using lower_locals_to_regs(). Having scalars be SSA
 values and arrays be registers can't be that much more complicated
 than having arrays be variables, and that's how it was set up to work
 from the beginning.

 well, it is pretty convenient for me to have direct and indirect array
 access come via intrinsics, since that gives me a nice single point to
 do all the magic I need to do to set up instruction dependencies for
 scheduling and register assignment where the arrays get allocated in
 registers.  Possibly that means we need an option to lower array
 access to some new sort of intrinsic?  Not sure, I'll play with the
 lower_locals_to_regs without first coming out of SSA.. maybe if then
 the only things in regs are array accesses, I can achieve the same
 result.

Yeah, it seems like some sort of load_register intrinsic might be more
useful to you... there are a few reasons I never added it from the
beginning:

- You can't have a pointer to the nir_register * that contains useful
info like the number of vector components. You would have to search
through the list of registers, which takes O(n) time and is a lot more
annoying.
- There's another usecase for registers, namely backends that don't
support SSA at all; there, the possibility of register reads/writes
that are arbitrarily ordered compared to how they're used seems like
no fun. Having more than one way to represent a register load/store
doesn't seem like a great idea either.

although I can't imagine supporting the current way of loading/storing
registers would be that much more complicated. Variables definitely
aren't what you want.


 But more immediately, I hit a sort of snag:  I cannot seem to narrow
 from 32b to 16b at the same time I move to address register.  Which
 ends up meaning I need a mov from 32b to 16b followed by a 2nd mov to
 get it into address register...  which sort of defeats the purpose of
 this whole exercise.  I need to do some more r/e around this, but it
 may end up being better the way it was before.  And if we end up
 needing to do the shl in half-precision registers, then solving this
 in NIR would (I think) require NIR to be aware of half-precision.
 Which sounds useful, but -EBIGGER_FIRES

I don't quite understand... is this just a problem with using
registers? Would the entire sequence of operations need to be in 16
bits, or can you have whatever instruction computed your address do
the conversion to 16-bit as part of the output? If it's the latter,
you can just re-emit a 16-bit-outputting version of it and use that,
although it's a bit of a hack.

Long-term, support for half-precision in NIR is definitely in the
cards, but it'll probably have to wait until fp64 support as they're
both very similar wrt changes we have to make in the IR. Unless
someone has a burning desire to do half-precision first :).


 The other problem is that currently ttn gives addr src in float, which
 is how things are in tgsi land.  I'm not sure if changing this will be
 a problem for Eric.

 An interesting alternative solution to consider, is to allow the
 backend to lower to driver specific specific alu opcodes.. and then
 

[Mesa-dev] [PATCH v2] nir: Try commutative sources in CSE

2015-04-15 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Shader-db results:

GM45 NIR:
total instructions in shared programs: 4082044 - 4081919 (-0.00%)
instructions in affected programs: 27609 - 27484 (-0.45%)
helped:44

Iron Lake NIR:
total instructions in shared programs: 5678776 - 5678646 (-0.00%)
instructions in affected programs: 27406 - 27276 (-0.47%)
helped:45

Sandy Bridge NIR:
total instructions in shared programs: 7329995 - 7329096 (-0.01%)
instructions in affected programs: 142035 - 141136 (-0.63%)
helped:406
HURT:  19

Ivy Bridge NIR:
total instructions in shared programs: 6769314 - 6768359 (-0.01%)
instructions in affected programs: 140820 - 139865 (-0.68%)
helped:423
HURT:  2

Haswell NIR:
total instructions in shared programs: 6183693 - 6183298 (-0.01%)
instructions in affected programs: 96538 - 96143 (-0.41%)
helped:303
HURT:  4

Broadwell NIR:
total instructions in shared programs: 7501711 - 7498170 (-0.05%)
instructions in affected programs: 266403 - 262862 (-1.33%)
helped:705
HURT:  5
GAINED:4

v2: Rebase on top of Connor's fix.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com [v1]
Cc: Jason Ekstrand jason.ekstr...@intel.com
Cc: Connor Abbott cwabbo...@gmail.com
---

The v2 changes were a little more intrusive than before, so it seemed
worth sending out again.

 src/glsl/nir/nir_opt_cse.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
index 56d491c..3c5ece7 100644
--- a/src/glsl/nir/nir_opt_cse.c
+++ b/src/glsl/nir/nir_opt_cse.c
@@ -37,18 +37,19 @@ struct cse_state {
 };
 
 static bool
-nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src)
+nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1,
+   unsigned src2)
 {
-   if (alu1-src[src].abs != alu2-src[src].abs ||
-   alu1-src[src].negate != alu2-src[src].negate)
+   if (alu1-src[src1].abs != alu2-src[src2].abs ||
+   alu1-src[src1].negate != alu2-src[src2].negate)
   return false;
 
-   for (unsigned i = 0; i  nir_ssa_alu_instr_src_components(alu1, src); i++) {
-  if (alu1-src[src].swizzle[i] != alu2-src[src].swizzle[i])
+   for (unsigned i = 0; i  nir_ssa_alu_instr_src_components(alu1, src1); i++) 
{
+  if (alu1-src[src1].swizzle[i] != alu2-src[src2].swizzle[i])
  return false;
}
 
-   return nir_srcs_equal(alu1-src[src].src, alu2-src[src].src);
+   return nir_srcs_equal(alu1-src[src1].src, alu2-src[src2].src);
 }
 
 static bool
@@ -71,9 +72,17 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
   if (alu1-dest.dest.ssa.num_components != 
alu2-dest.dest.ssa.num_components)
  return false;
 
-  for (unsigned i = 0; i  nir_op_infos[alu1-op].num_inputs; i++) {
- if (!nir_alu_srcs_equal(alu1, alu2, i))
-return false;
+  if (nir_op_infos[alu1-op].num_inputs == 2 
+  (nir_op_infos[alu1-op].algebraic_properties  
NIR_OP_IS_COMMUTATIVE)) {
+ return (nir_alu_srcs_equal(alu1, alu2, 0, 0) 
+ nir_alu_srcs_equal(alu1, alu2, 1, 1)) ||
+(nir_alu_srcs_equal(alu1, alu2, 0, 1) 
+ nir_alu_srcs_equal(alu1, alu2, 1, 0));
+  } else {
+ for (unsigned i = 0; i  nir_op_infos[alu1-op].num_inputs; i++) {
+if (!nir_alu_srcs_equal(alu1, alu2, i, i))
+   return false;
+ }
   }
   return true;
}
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] llvmpipe support for ARM?

2015-04-15 Thread Marko Moberg
Thanks for the replies.

I tried to upgrade LLVM from 3.3 to 3.4 after I sent the e-mail but ended
up having an internal compiler error from gcc linaro toolchain compiler.
Nice. I could try a more up-to-date version of Mesa to see if it helps.
Although I have learned that I can not take the most up-to-date version
since something has gotten broken related to SW rendering/wayland/egl.

Marko



On Wed, Apr 15, 2015 at 6:45 PM, Jan Vesely jan.ves...@rutgers.edu wrote:

 Hi,
 I can run arm llvmpipe glxgears on IFC6410 (Snapdragon 600 APQ8064,
 Krait 300) without problems.
 llvm 3.5
 mesa 10.4.3
 both are fedora 21 standard packages without modifications.
 I haven't tried anything more complex than glxgears

 jan

 On Wed, 2015-04-15 at 16:30 +0200, Roland Scheidegger wrote:
  There were reports with varying success of running llvmpipe with arm in
  the past but I haven't heard anything for a while. In theory it should
  work (and crashing in the pixel shader is an indication the vertex
  shader worked or at least didn't crash).
  But it's not really tested and hence might easily break. My suggestion
  would be to try a newer Mesa version since 9.2.5 is very old, but I
  really don't know if it will help.
  If that doesn't help, you could try debugging it, you can get the
  crashing instruction in gdb with x/i address and see if that was an
  unaligned access which should have been aligned or something else.
  GALLIVM_DEBUG=ir,asm will get you the IR and compiled code (with debug
  builds).
 
  Roland
 
 
  Am 15.04.2015 um 14:37 schrieb Marko Moberg:
   Hi,
  
   I have been trying to get OpenGL sw rastering to work on Linux based
 ARM
   CortexA15 device but for some reason gallium llvmpipe gives me a
   segmentation fault. Has anybody managed to run sw rastering using
   gallium llvmpipe on ARM? Softpipe seems to be working ok.
  
   I am using the following versions:
   - Mesa 9.2.5
   - LLVM3.3
   - Wayland/Weston 1.5
  
   The issue is related to opengl/gles2 rendering i.e. weston-simple-egl
   demo and Qt openGL demos give seg faults when executed with llvmpipe
   enabled.
  
   Backtrace from weston-simple-egl looks something like this:
   #0  0x6d7de620 in ?? ()
   #1  0xb354004c in fs17_variant0_partial ()
   #2  0xb6841740 in lp_rast_shade_quads_mask (task=task@entry=0x1e360,
   inputs=inputs@entry=0x47a00, x=128, y=80, mask=4369) at
 lp_rast.c:466
   #3  0xb6842d04 in do_block_4_1 (c=optimized out, y=optimized out,
   x=optimized out, plane=optimized out, tri=optimized out,
   task=optimized out) at lp_rast_tri_tmp.h:61
   #4  do_block_16_1 (c=synthetic pointer, y=optimized out,
   x=optimized out, plane=0xb4f4ccf0, tri=optimized out,
   task=optimized out) at lp_rast_tri_tmp.h:130
   #5  lp_rast_triangle_1 (task=0x1e360, arg=...) at lp_rast_tri_tmp.h:232
   #6  0xb6840420 in do_rasterize_bin (bin=optimized out, task=0x1e360,
   x=optimized out, y=optimized out) at lp_rast.c:607
   #7  rasterize_bin (y=optimized out, x=optimized out, bin=optimized
   out,
   task=0x1e360) at lp_rast.c:626
   #8  rasterize_scene (task=task@entry=0x1e360, scene=0xb3646008)
   at lp_rast.c:675
   #9  0xb6840cb0 in thread_function (init_data=0x1e360) at lp_rast.c:788
   #10 0xb6d98ed2 in start_thread () from /lib/libpthread.so.0
   #11 0xb6c91058 in ?? () from /lib/libc.so.6
   #12 0xb6c91058 in ?? () from /lib/libc.so.6
  
   regards,
   Marko
  
   marko.s.mob...@gmail.com mailto:marko.s.mob...@gmail.com
  
  
  
   ___
   mesa-dev mailing list
   mesa-dev@lists.freedesktop.org
  
 https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddevd=AwIGaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_Im=5qZ4KpVhu_wXA8-dTn8xYiul5G713jYwztPr8kfcUZMs=wBNyDvf5DiNBjxXXl7QvyHOvNuTVjZAaw0n7WaQM-bAe=
  
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev

 --
 Jan Vesely jan.ves...@rutgers.edu

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] nir: Try commutative sources in CSE

2015-04-15 Thread Ilia Mirkin
On Wed, Apr 15, 2015 at 12:37 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Shader-db results:

 GM45 NIR:
 total instructions in shared programs: 4082044 - 4081919 (-0.00%)
 instructions in affected programs: 27609 - 27484 (-0.45%)
 helped:44

 Iron Lake NIR:
 total instructions in shared programs: 5678776 - 5678646 (-0.00%)
 instructions in affected programs: 27406 - 27276 (-0.47%)
 helped:45

 Sandy Bridge NIR:
 total instructions in shared programs: 7329995 - 7329096 (-0.01%)
 instructions in affected programs: 142035 - 141136 (-0.63%)
 helped:406
 HURT:  19

 Ivy Bridge NIR:
 total instructions in shared programs: 6769314 - 6768359 (-0.01%)
 instructions in affected programs: 140820 - 139865 (-0.68%)
 helped:423
 HURT:  2

 Haswell NIR:
 total instructions in shared programs: 6183693 - 6183298 (-0.01%)
 instructions in affected programs: 96538 - 96143 (-0.41%)
 helped:303
 HURT:  4

 Broadwell NIR:
 total instructions in shared programs: 7501711 - 7498170 (-0.05%)
 instructions in affected programs: 266403 - 262862 (-1.33%)
 helped:705
 HURT:  5
 GAINED:4

 v2: Rebase on top of Connor's fix.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com [v1]
 Cc: Jason Ekstrand jason.ekstr...@intel.com
 Cc: Connor Abbott cwabbo...@gmail.com
 ---

 The v2 changes were a little more intrusive than before, so it seemed
 worth sending out again.

  src/glsl/nir/nir_opt_cse.c | 27 ++-
  1 file changed, 18 insertions(+), 9 deletions(-)

 diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
 index 56d491c..3c5ece7 100644
 --- a/src/glsl/nir/nir_opt_cse.c
 +++ b/src/glsl/nir/nir_opt_cse.c
 @@ -37,18 +37,19 @@ struct cse_state {
  };

  static bool
 -nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src)
 +nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1,
 +   unsigned src2)
  {
 -   if (alu1-src[src].abs != alu2-src[src].abs ||
 -   alu1-src[src].negate != alu2-src[src].negate)
 +   if (alu1-src[src1].abs != alu2-src[src2].abs ||
 +   alu1-src[src1].negate != alu2-src[src2].negate)
return false;

 -   for (unsigned i = 0; i  nir_ssa_alu_instr_src_components(alu1, src); 
 i++) {
 -  if (alu1-src[src].swizzle[i] != alu2-src[src].swizzle[i])
 +   for (unsigned i = 0; i  nir_ssa_alu_instr_src_components(alu1, src1); 
 i++) {
 +  if (alu1-src[src1].swizzle[i] != alu2-src[src2].swizzle[i])
   return false;
 }

 -   return nir_srcs_equal(alu1-src[src].src, alu2-src[src].src);
 +   return nir_srcs_equal(alu1-src[src1].src, alu2-src[src2].src);
  }

  static bool
 @@ -71,9 +72,17 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
if (alu1-dest.dest.ssa.num_components != 
 alu2-dest.dest.ssa.num_components)
   return false;

 -  for (unsigned i = 0; i  nir_op_infos[alu1-op].num_inputs; i++) {
 - if (!nir_alu_srcs_equal(alu1, alu2, i))
 -return false;
 +  if (nir_op_infos[alu1-op].num_inputs == 2 
 +  (nir_op_infos[alu1-op].algebraic_properties  
 NIR_OP_IS_COMMUTATIVE)) {

Just a couple of thoughts (could well go into a later patch, or not at all) --

(a) is the NIR_OP_IS_COMMUTATIVE flag ever set when num_inputs != 2?
(b) I see that there is an isub/fsub/etc, which are (hopefully) not
marked as commutative. Would it make sense to first get rid of them
(and introduce negate flags) before CSE is done? I guess it's an
argument that expressions should have a clear canonical form for
maximal CSE, after which various negs can be propagated through.

 + return (nir_alu_srcs_equal(alu1, alu2, 0, 0) 
 + nir_alu_srcs_equal(alu1, alu2, 1, 1)) ||
 +(nir_alu_srcs_equal(alu1, alu2, 0, 1) 
 + nir_alu_srcs_equal(alu1, alu2, 1, 0));
 +  } else {
 + for (unsigned i = 0; i  nir_op_infos[alu1-op].num_inputs; i++) {
 +if (!nir_alu_srcs_equal(alu1, alu2, i, i))
 +   return false;
 + }
}
return true;
 }
 --
 2.1.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/13] i965/fs_cse: Factor out code to create copy instructions

2015-04-15 Thread Matt Turner
On Wed, Apr 1, 2015 at 6:19 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
 
  1 file changed, 38 insertions(+), 37 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 index f2c4098..dd199fa 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 @@ -183,6 +183,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
operands_match(a, b, negate);
  }

 +static fs_inst *
 +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bblock_t *block,

const fs_reg src

With that and Topi's comments addressed,

Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2 v2] i965/skl: Add the header for constant loads outside of the generator

2015-04-15 Thread Neil Roberts
Commit 5a06ee738 added a step to the generator to set up the message
header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7
instruction. That pseudo opcode is implemented in terms of multiple
actual opcodes, one of which writes to one of the source registers in
order to set up the message header. This causes problems because the
scheduler isn't aware that the source register is written to and it
can end up reorganising the instructions incorrectly such that the
write to the source register overwrites a needed value from a previous
instruction. This problem was presenting itself as a rendering error
in the weapon in Enemy Territory: Quake Wars.

Since commit 588859e1 there is an additional problem that the double
register allocated to include the message header would end up being
split into two. This wasn't happening previously because the code to
split registers was explicitly avoided for instructions that are
sending from the GRF.

This patch fixes both problems by splitting the code to set up the
message header into a new pseudo opcode so that it will be done
outside of the generator. This new opcode has the header register as a
destination so the scheduler can recognise that the register is
written to. This has the additional benefit that the scheduler can
optimise the message header slightly better by moving the mov
instructions further away from the send instructions.

On Skylake it appears to fix the following three Piglit tests without
causing any regressions:

 gs-float-array-variable-index
 gs-mat3x4-row-major
 gs-mat4x3-row-major

I think we actually may need to do something similar for the fs
backend and possibly for message headers from regular texture sampling
but I'm not entirely sure.

v2: Make sure the exec-size is retained as 8 for the mov instruction
to initialise the header from g0. This was accidentally lost
during a rebase on top of 07c571a39fa1.
Split the patch into two so that the helper function is a separate
change.
Fix emitting the MOV instruction on Gen7.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89058
---
 src/mesa/drivers/dri/i965/brw_defines.h|  1 +
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  4 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  2 +
 .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   | 52 +++---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 38 
 6 files changed, 63 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index da6ed5b..a97a944 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -948,6 +948,7 @@ enum opcode {
VS_OPCODE_URB_WRITE,
VS_OPCODE_PULL_CONSTANT_LOAD,
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
+   VS_OPCODE_SET_SIMD4X2_HEADER_GEN9,
VS_OPCODE_UNPACK_FLAGS_SIMD4X2,
 
/**
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 335a800..0d6ac0c 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -568,6 +568,10 @@ brw_instruction_name(enum opcode op)
   return pull_constant_load;
case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
   return pull_constant_load_gen7;
+
+   case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
+  return set_simd4x2_header_gen9;
+
case VS_OPCODE_UNPACK_FLAGS_SIMD4X2:
   return unpack_flags_simd4x2;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 0363924..a0ee2cc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -500,6 +500,8 @@ private:
  struct brw_reg dst,
  struct brw_reg surf_index,
  struct brw_reg offset);
+   void generate_set_simd4x2_header_gen9(vec4_instruction *inst,
+ struct brw_reg dst);
void generate_unpack_flags(struct brw_reg dst);
 
void generate_untyped_atomic(vec4_instruction *inst,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
index 980e266..70d2af5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
@@ -44,6 +44,7 @@ can_do_writemask(const struct brw_context *brw,
case SHADER_OPCODE_GEN4_SCRATCH_READ:
case VS_OPCODE_PULL_CONSTANT_LOAD:
case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
+   case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
   return false;
default:
   /* The MATH instruction on Gen6 only executes in align1 mode, which does
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 

[Mesa-dev] [PATCH 1/2] i965/vec4: Add a helper function to emit VS_OPCODE_PULL_CONSTANT_LOAD

2015-04-15 Thread Neil Roberts
There were three places in the visitor that had a similar chunk of
code to emit the VS_OPCODE_PULL_CONSTANT_LOAD opcode using a register
for the offset. This patch combines the chunks into a helper function
to reduce the code duplication. It will also be useful in the next
patch to expand what happens on Gen9+. This shouldn't introduce any
functional changes.
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |   5 ++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 120 +
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp  |  27 ++
 3 files changed, 75 insertions(+), 77 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 700ca69..0363924 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -364,6 +364,11 @@ public:
dst_reg dst,
src_reg orig_src,
int base_offset);
+   void emit_pull_constant_load_reg(dst_reg dst,
+src_reg surf_index,
+src_reg offset,
+bblock_t *before_block,
+vec4_instruction *before_inst);
src_reg emit_resolve_reladdr(int scratch_loc[], bblock_t *block,
 vec4_instruction *inst, src_reg src);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index ffbe04d..f7d542b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1296,6 +1296,63 @@ vec4_visitor::emit_lrp(const dst_reg dst,
}
 }
 
+/**
+ * Emits the instructions needed to perform a pull constant load. before_block
+ * and before_inst can be NULL in which case the instruction will be appended
+ * to the end of the instruction list.
+ */
+void
+vec4_visitor::emit_pull_constant_load_reg(dst_reg dst,
+  src_reg surf_index,
+  src_reg offset_reg,
+  bblock_t *before_block,
+  vec4_instruction *before_inst)
+{
+   assert((before_inst == NULL  before_block == NULL) ||
+  (before_inst  before_block));
+
+   vec4_instruction *pull;
+
+   if (brw-gen = 7) {
+  dst_reg grf_offset = dst_reg(this, glsl_type::int_type);
+
+  /* We have to use a message header on Skylake to get SIMD4x2 mode.
+   * Reserve space for the register.
+   */
+  if (brw-gen = 9) {
+ grf_offset.reg_offset++;
+ alloc.sizes[grf_offset.reg] = 2;
+  }
+
+  grf_offset.type = offset_reg.type;
+
+  pull = MOV(grf_offset, offset_reg);
+
+  if (before_inst)
+ emit_before(before_block, before_inst, pull);
+  else
+ emit(pull);
+
+  pull = new(mem_ctx) vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
+   dst,
+   surf_index,
+   src_reg(grf_offset));
+  pull-mlen = 1;
+   } else {
+  pull = new(mem_ctx) vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD,
+   dst,
+   surf_index,
+   offset_reg);
+  pull-base_mrf = 14;
+  pull-mlen = 1;
+   }
+
+   if (before_inst)
+  emit_before(before_block, before_inst, pull);
+   else
+  emit(pull);
+}
+
 void
 vec4_visitor::visit(ir_expression *ir)
 {
@@ -1774,36 +1831,10 @@ vec4_visitor::visit(ir_expression *ir)
  emit(SHR(dst_reg(offset), op[1], src_reg(4)));
   }
 
-  if (brw-gen = 7) {
- dst_reg grf_offset = dst_reg(this, glsl_type::int_type);
-
- /* We have to use a message header on Skylake to get SIMD4x2 mode.
-  * Reserve space for the register.
-  */
- if (brw-gen = 9) {
-grf_offset.reg_offset++;
-alloc.sizes[grf_offset.reg] = 2;
- }
-
- grf_offset.type = offset.type;
-
- emit(MOV(grf_offset, offset));
-
- vec4_instruction *pull =
-emit(new(mem_ctx) 
vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
-   dst_reg(packed_consts),
-   surf_index,
-   src_reg(grf_offset)));
- pull-mlen = 1;
-  } else {
- vec4_instruction *pull =
-emit(new(mem_ctx) vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD,
-   dst_reg(packed_consts),
-   surf_index,
-   offset));
- pull-base_mrf = 14;
- pull-mlen = 1;
-  }

Re: [Mesa-dev] [PATCH 2/2 v2] i965/skl: Add the header for constant loads outside of the generator

2015-04-15 Thread Ben Widawsky
On Wed, Apr 15, 2015 at 06:58:02PM +0100, Neil Roberts wrote:
 Commit 5a06ee738 added a step to the generator to set up the message
 header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7
 instruction. That pseudo opcode is implemented in terms of multiple
 actual opcodes, one of which writes to one of the source registers in
 order to set up the message header. This causes problems because the
 scheduler isn't aware that the source register is written to and it
 can end up reorganising the instructions incorrectly such that the
 write to the source register overwrites a needed value from a previous
 instruction. This problem was presenting itself as a rendering error
 in the weapon in Enemy Territory: Quake Wars.
 
 Since commit 588859e1 there is an additional problem that the double
 register allocated to include the message header would end up being
 split into two. This wasn't happening previously because the code to
 split registers was explicitly avoided for instructions that are
 sending from the GRF.
 
 This patch fixes both problems by splitting the code to set up the
 message header into a new pseudo opcode so that it will be done
 outside of the generator. This new opcode has the header register as a
 destination so the scheduler can recognise that the register is
 written to. This has the additional benefit that the scheduler can
 optimise the message header slightly better by moving the mov
 instructions further away from the send instructions.
 
 On Skylake it appears to fix the following three Piglit tests without
 causing any regressions:
 
  gs-float-array-variable-index
  gs-mat3x4-row-major
  gs-mat4x3-row-major
 
 I think we actually may need to do something similar for the fs
 backend and possibly for message headers from regular texture sampling
 but I'm not entirely sure.
 
 v2: Make sure the exec-size is retained as 8 for the mov instruction
 to initialise the header from g0. This was accidentally lost
 during a rebase on top of 07c571a39fa1.
 Split the patch into two so that the helper function is a separate
 change.
 Fix emitting the MOV instruction on Gen7.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89058

Very nice fix.
Reviewed-by: Ben Widawsky b...@bwidawsk.net

 ---
  src/mesa/drivers/dri/i965/brw_defines.h|  1 +
  src/mesa/drivers/dri/i965/brw_shader.cpp   |  4 ++
  src/mesa/drivers/dri/i965/brw_vec4.h   |  2 +
  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  1 +
  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   | 52 
 +++---
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 38 
  6 files changed, 63 insertions(+), 35 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index da6ed5b..a97a944 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -948,6 +948,7 @@ enum opcode {
 VS_OPCODE_URB_WRITE,
 VS_OPCODE_PULL_CONSTANT_LOAD,
 VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
 +   VS_OPCODE_SET_SIMD4X2_HEADER_GEN9,
 VS_OPCODE_UNPACK_FLAGS_SIMD4X2,
  
 /**
 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index 335a800..0d6ac0c 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -568,6 +568,10 @@ brw_instruction_name(enum opcode op)
return pull_constant_load;
 case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
return pull_constant_load_gen7;
 +
 +   case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
 +  return set_simd4x2_header_gen9;
 +
 case VS_OPCODE_UNPACK_FLAGS_SIMD4X2:
return unpack_flags_simd4x2;
  
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 0363924..a0ee2cc 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -500,6 +500,8 @@ private:
   struct brw_reg dst,
   struct brw_reg surf_index,
   struct brw_reg offset);
 +   void generate_set_simd4x2_header_gen9(vec4_instruction *inst,
 + struct brw_reg dst);
 void generate_unpack_flags(struct brw_reg dst);
  
 void generate_untyped_atomic(vec4_instruction *inst,
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
 index 980e266..70d2af5 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
 @@ -44,6 +44,7 @@ can_do_writemask(const struct brw_context *brw,
 case SHADER_OPCODE_GEN4_SCRATCH_READ:
 case VS_OPCODE_PULL_CONSTANT_LOAD:
 case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
 +   case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
return 

Re: [Mesa-dev] [PATCH v2] nir: Try commutative sources in CSE

2015-04-15 Thread Jason Ekstrand
On Wed, Apr 15, 2015 at 9:50 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Wed, Apr 15, 2015 at 12:37 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Shader-db results:

 GM45 NIR:
 total instructions in shared programs: 4082044 - 4081919 (-0.00%)
 instructions in affected programs: 27609 - 27484 (-0.45%)
 helped:44

 Iron Lake NIR:
 total instructions in shared programs: 5678776 - 5678646 (-0.00%)
 instructions in affected programs: 27406 - 27276 (-0.47%)
 helped:45

 Sandy Bridge NIR:
 total instructions in shared programs: 7329995 - 7329096 (-0.01%)
 instructions in affected programs: 142035 - 141136 (-0.63%)
 helped:406
 HURT:  19

 Ivy Bridge NIR:
 total instructions in shared programs: 6769314 - 6768359 (-0.01%)
 instructions in affected programs: 140820 - 139865 (-0.68%)
 helped:423
 HURT:  2

 Haswell NIR:
 total instructions in shared programs: 6183693 - 6183298 (-0.01%)
 instructions in affected programs: 96538 - 96143 (-0.41%)
 helped:303
 HURT:  4

 Broadwell NIR:
 total instructions in shared programs: 7501711 - 7498170 (-0.05%)
 instructions in affected programs: 266403 - 262862 (-1.33%)
 helped:705
 HURT:  5
 GAINED:4

 v2: Rebase on top of Connor's fix.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com [v1]
 Cc: Jason Ekstrand jason.ekstr...@intel.com
 Cc: Connor Abbott cwabbo...@gmail.com
 ---

 The v2 changes were a little more intrusive than before, so it seemed
 worth sending out again.

  src/glsl/nir/nir_opt_cse.c | 27 ++-
  1 file changed, 18 insertions(+), 9 deletions(-)

 diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
 index 56d491c..3c5ece7 100644
 --- a/src/glsl/nir/nir_opt_cse.c
 +++ b/src/glsl/nir/nir_opt_cse.c
 @@ -37,18 +37,19 @@ struct cse_state {
  };

  static bool
 -nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src)
 +nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1,
 +   unsigned src2)
  {
 -   if (alu1-src[src].abs != alu2-src[src].abs ||
 -   alu1-src[src].negate != alu2-src[src].negate)
 +   if (alu1-src[src1].abs != alu2-src[src2].abs ||
 +   alu1-src[src1].negate != alu2-src[src2].negate)
return false;

 -   for (unsigned i = 0; i  nir_ssa_alu_instr_src_components(alu1, src); 
 i++) {
 -  if (alu1-src[src].swizzle[i] != alu2-src[src].swizzle[i])
 +   for (unsigned i = 0; i  nir_ssa_alu_instr_src_components(alu1, src1); 
 i++) {
 +  if (alu1-src[src1].swizzle[i] != alu2-src[src2].swizzle[i])
   return false;
 }

 -   return nir_srcs_equal(alu1-src[src].src, alu2-src[src].src);
 +   return nir_srcs_equal(alu1-src[src1].src, alu2-src[src2].src);
  }

  static bool
 @@ -71,9 +72,17 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
if (alu1-dest.dest.ssa.num_components != 
 alu2-dest.dest.ssa.num_components)
   return false;

 -  for (unsigned i = 0; i  nir_op_infos[alu1-op].num_inputs; i++) {
 - if (!nir_alu_srcs_equal(alu1, alu2, i))
 -return false;
 +  if (nir_op_infos[alu1-op].num_inputs == 2 
 +  (nir_op_infos[alu1-op].algebraic_properties  
 NIR_OP_IS_COMMUTATIVE)) {

 Just a couple of thoughts (could well go into a later patch, or not at all) --

 (a) is the NIR_OP_IS_COMMUTATIVE flag ever set when num_inputs != 2?

No it's not.  It would probably be reasonable to turn that into an assert.

With that changed (and piglitted to make sure I'm right),

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 (b) I see that there is an isub/fsub/etc, which are (hopefully) not
 marked as commutative. Would it make sense to first get rid of them
 (and introduce negate flags) before CSE is done? I guess it's an
 argument that expressions should have a clear canonical form for
 maximal CSE, after which various negs can be propagated through.

We (i965) do at the moment.  Eric doesn't for vc4 but that was his choice.

 + return (nir_alu_srcs_equal(alu1, alu2, 0, 0) 
 + nir_alu_srcs_equal(alu1, alu2, 1, 1)) ||
 +(nir_alu_srcs_equal(alu1, alu2, 0, 1) 
 + nir_alu_srcs_equal(alu1, alu2, 1, 0));
 +  } else {
 + for (unsigned i = 0; i  nir_op_infos[alu1-op].num_inputs; i++) {
 +if (!nir_alu_srcs_equal(alu1, alu2, i, i))
 +   return false;
 + }
}
return true;
 }
 --
 2.1.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 

[Mesa-dev] [PATCH] glx: Massive update of comments in struct extension_info

2015-04-15 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

In response to another patch, Emil asked for some clarification how this
stuff works.  Rather than just reply to the e-mail, I decided to update
the exlanation in the code.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Emil Velikov emil.l.veli...@gmail.com
---
 src/glx/glxextensions.c | 69 ++---
 1 file changed, 65 insertions(+), 4 deletions(-)

diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c
index a326f0d..cb8cd66 100644
--- a/src/glx/glxextensions.c
+++ b/src/glx/glxextensions.c
@@ -61,12 +61,73 @@ struct extension_info
 */
unsigned char version_major;
unsigned char version_minor;
+
+   /**
+* The client (i.e., libGL) supports this extension.
+*
+* Except during bring up, all extensions should have this set to Y.  There
+* are a few cases of extensions that have partial (or speculative)
+* support, but these are rare.  There also shouldn't be any new ones
+* added.
+*
+* Generally, extensions require server support and ::client_support to be
+* enabled.  If the display is capable of direct rendering,
+* ::direct_support is also required.
+*
+* \sa ::client_only
+*/
unsigned char client_support;
+
+   /**
+* The direct-renderer (e.g., i965_dri.so) supports this extension.
+*
+* For cases where all of the infrastructure to support the extension is a
+* required part of the loader/driver interface, this can default to Y.
+* For most cases, extended functionality, usually in the form of DRI2
+* extensions, is necessary to support the extension.  The loader will set
+* the flag true if all the requirements are met.
+*
+* If the display is capable of direct rendering, ::direct_support is
+* required for the extension to be enabled.
+*/
unsigned char direct_support;
-   unsigned char client_only;/** Is the extension client-side only? */
-   unsigned char direct_only;/** Is the extension for direct
- * contexts only?
- */
+
+   /**
+* The extension depends only on client support.
+*
+* This is for extensions like GLX_ARB_get_proc_address that are contained
+* entirely in the client library.  There is no dependency on the server or
+* the direct-renderer.
+*
+* These extensions will be enabled if ::client_support is set.
+*
+* \note
+* An extension \b cannot be both client-only and direct-only because being
+* direct-only implies a dependency on the direct renderer.
+*
+* \sa ::client_support, ::direct_only
+*/
+   unsigned char client_only;
+
+   /**
+* The extension only functions with direct-rendering contexts
+*
+* The extension has no GLX protocol, and, therefore, no explicit
+* dependency on the server.  The functionality is contained entirely in
+* the client library and the direct renderer.  A few of the swap-related
+* extensions are intended to behave this way.
+*
+* These extensions will be enabled if both ::client_support and
+* ::direct_support are set.
+*
+* \note
+* An extension \b cannot be both client-only and direct-only because being
+* client-only implies that all functionality is outside the
+* direct-renderer.
+*
+* \sa ::direct_support, ::client_only
+*/
+   unsigned char direct_only;
 };
 
 /* *INDENT-OFF* */
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Render R16G16B16X16 as R16G16B16A16

2015-04-15 Thread Anuj Phogat
This enables using _mesa_meta_pbo_TexSubImage() to upload data
to R16G16B16X16 texture. Earlier it fell back to slower paths.

Jenkins run shows no piglit regressions.

Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index f04bac5..7bec8fa 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -582,6 +582,12 @@ brw_init_surface_formats(struct brw_context *brw)
   case BRW_SURFACEFORMAT_L16_FLOAT:
 render = BRW_SURFACEFORMAT_R16_FLOAT;
 break;
+  case BRW_SURFACEFORMAT_R16G16B16X16_UNORM:
+ render = BRW_SURFACEFORMAT_R16G16B16A16_UNORM;
+ break;
+  case BRW_SURFACEFORMAT_R16G16B16X16_FLOAT:
+ render = BRW_SURFACEFORMAT_R16G16B16A16_FLOAT;
+ break;
   case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
 /* XRGB is handled as ARGB because the chips in this family
  * cannot render to XRGB targets.  This means that we have to
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Update the comment about platforms supporting blorp

2015-04-15 Thread Anuj Phogat
Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 644cb41..7fd812e 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -223,8 +223,8 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
struct intel_mipmap_tree *src_mt = src_irb-mt;
struct intel_mipmap_tree *dst_mt = intel_image-mt;
 
-   /* BLORP is not supported before Gen6. */
-   if (brw-gen  6 || brw-gen = 8)
+   /* BLORP is only supported for Gen6-7. */
+   if (brw-gen  6 || brw-gen  7)
   return false;
 
if (_mesa_get_format_base_format(src_rb-Format) !=
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Change the order of conditions tested in if

2015-04-15 Thread Anuj Phogat
Reduces the number of conditions tested in if to one in case of
non-integer formats. Makes no functional changes.

Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index 7261c01..f04bac5 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -683,9 +683,10 @@ brw_render_target_supported(struct brw_context *brw,
 * available to fake it like we do for XRGB.  Force them to being
 * unsupported.
 */
-   if ((rb-_BaseFormat != GL_RGBA 
+   if (_mesa_is_format_integer_color(format) 
+   (rb-_BaseFormat != GL_RGBA 
rb-_BaseFormat != GL_RG 
-   rb-_BaseFormat != GL_RED)  _mesa_is_format_integer_color(format))
+   rb-_BaseFormat != GL_RED))
   return false;
 
/* Under some conditions, MSAA is not supported for formats whose width is
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Emil Velikov
On 15 April 2015 at 18:33, Matt Turner matts...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 7:08 AM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 Rather than forcing everyone to provide their own definition of the symbol
 provide a common weak one, which anyone can override if needed.

 This resolved the build of the standalone pipe-drivers, as it provides a
 default symbol which was missing previously.

 Cc: Rob Clark robcl...@freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/Makefile.am|  3 ++-
  src/glsl/SConscript|  2 ++
  src/mesa/Android.libmesa_glsl_utils.mk |  6 --
  src/mesa/program/weak_errors.c | 30 ++

 Is program/ really the right place for this? (maybe main/ instead?)
Hmm main/ might be better indeed. If there are any recommendations
about the name I'll gladly take it :)

Jose/Brian

From a quick look I cannot find a suitable alternative for
__attribute__((weak)) for MSVC. Would you know any of the top of your
head or should I just wrap the attribute in ifndef _MSC_VER and leave
the rest ? If I drop the weak_errors.c hunk from the SCons here, and
replace tests/common.c with it things should just work.


Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965/vec4: Add a helper function to emit VS_OPCODE_PULL_CONSTANT_LOAD

2015-04-15 Thread Ben Widawsky
On Wed, Apr 15, 2015 at 06:58:01PM +0100, Neil Roberts wrote:
 There were three places in the visitor that had a similar chunk of
 code to emit the VS_OPCODE_PULL_CONSTANT_LOAD opcode using a register
 for the offset. This patch combines the chunks into a helper function
 to reduce the code duplication. It will also be useful in the next
 patch to expand what happens on Gen9+. This shouldn't introduce any
 functional changes.

Hopefully you agreed and didn't just do it for me :-)
Reviewed-by: Ben Widawsky b...@bwidawsk.net

 ---
  src/mesa/drivers/dri/i965/brw_vec4.h   |   5 ++
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 120 
 +
  src/mesa/drivers/dri/i965/brw_vec4_vp.cpp  |  27 ++
  3 files changed, 75 insertions(+), 77 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 700ca69..0363924 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -364,6 +364,11 @@ public:
   dst_reg dst,
   src_reg orig_src,
   int base_offset);
 +   void emit_pull_constant_load_reg(dst_reg dst,
 +src_reg surf_index,
 +src_reg offset,
 +bblock_t *before_block,
 +vec4_instruction *before_inst);
 src_reg emit_resolve_reladdr(int scratch_loc[], bblock_t *block,
  vec4_instruction *inst, src_reg src);
  
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 index ffbe04d..f7d542b 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 @@ -1296,6 +1296,63 @@ vec4_visitor::emit_lrp(const dst_reg dst,
 }
  }
  
 +/**
 + * Emits the instructions needed to perform a pull constant load. 
 before_block
 + * and before_inst can be NULL in which case the instruction will be appended
 + * to the end of the instruction list.
 + */
 +void
 +vec4_visitor::emit_pull_constant_load_reg(dst_reg dst,
 +  src_reg surf_index,
 +  src_reg offset_reg,
 +  bblock_t *before_block,
 +  vec4_instruction *before_inst)
 +{
 +   assert((before_inst == NULL  before_block == NULL) ||
 +  (before_inst  before_block));
 +
 +   vec4_instruction *pull;
 +
 +   if (brw-gen = 7) {
 +  dst_reg grf_offset = dst_reg(this, glsl_type::int_type);
 +
 +  /* We have to use a message header on Skylake to get SIMD4x2 mode.
 +   * Reserve space for the register.
 +   */
 +  if (brw-gen = 9) {
 + grf_offset.reg_offset++;
 + alloc.sizes[grf_offset.reg] = 2;
 +  }
 +
 +  grf_offset.type = offset_reg.type;
 +
 +  pull = MOV(grf_offset, offset_reg);
 +
 +  if (before_inst)
 + emit_before(before_block, before_inst, pull);
 +  else
 + emit(pull);
 +
 +  pull = new(mem_ctx) vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
 +   dst,
 +   surf_index,
 +   src_reg(grf_offset));
 +  pull-mlen = 1;
 +   } else {
 +  pull = new(mem_ctx) vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD,
 +   dst,
 +   surf_index,
 +   offset_reg);
 +  pull-base_mrf = 14;
 +  pull-mlen = 1;
 +   }
 +
 +   if (before_inst)
 +  emit_before(before_block, before_inst, pull);
 +   else
 +  emit(pull);
 +}
 +
  void
  vec4_visitor::visit(ir_expression *ir)
  {
 @@ -1774,36 +1831,10 @@ vec4_visitor::visit(ir_expression *ir)
   emit(SHR(dst_reg(offset), op[1], src_reg(4)));
}
  
 -  if (brw-gen = 7) {
 - dst_reg grf_offset = dst_reg(this, glsl_type::int_type);
 -
 - /* We have to use a message header on Skylake to get SIMD4x2 mode.
 -  * Reserve space for the register.
 -  */
 - if (brw-gen = 9) {
 -grf_offset.reg_offset++;
 -alloc.sizes[grf_offset.reg] = 2;
 - }
 -
 - grf_offset.type = offset.type;
 -
 - emit(MOV(grf_offset, offset));
 -
 - vec4_instruction *pull =
 -emit(new(mem_ctx) 
 vec4_instruction(VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
 -   dst_reg(packed_consts),
 -   surf_index,
 -   src_reg(grf_offset)));
 - pull-mlen = 1;
 -  } else {
 - vec4_instruction *pull =
 -emit(new(mem_ctx) 

Re: [Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Matt Turner
On Wed, Apr 15, 2015 at 7:08 AM, Emil Velikov emil.l.veli...@gmail.com wrote:
 Rather than forcing everyone to provide their own definition of the symbol
 provide a common weak one, which anyone can override if needed.

 This resolved the build of the standalone pipe-drivers, as it provides a
 default symbol which was missing previously.

 Cc: Rob Clark robcl...@freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/Makefile.am|  3 ++-
  src/glsl/SConscript|  2 ++
  src/mesa/Android.libmesa_glsl_utils.mk |  6 --
  src/mesa/program/weak_errors.c | 30 ++

Is program/ really the right place for this? (maybe main/ instead?)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: remove bogus r600-- triple

2015-04-15 Thread Emil Velikov
On 15 April 2015 at 10:24, Michel Dänzer mic...@daenzer.net wrote:
 On 09.04.2015 03:27, Emil Velikov wrote:
 As mentioned by Michel Dänzer for LLVM = 3.6 we create the
 LLVMTargetMachine (with triple amdgcn--), as we setup the radeonsi
 context. For older LLVM or hardware (r600) the triple is always r600--
 and is created at a later stage - radeon_llvm_compile()

 Cc: Michel Dänzer michel.daen...@amd.com
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/gallium/drivers/radeonsi/si_pipe.c | 2 --
  1 file changed, 2 deletions(-)

 diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
 b/src/gallium/drivers/radeonsi/si_pipe.c
 index e761d20..5ea8868 100644
 --- a/src/gallium/drivers/radeonsi/si_pipe.c
 +++ b/src/gallium/drivers/radeonsi/si_pipe.c
 @@ -85,8 +85,6 @@ static struct pipe_context *si_create_context(struct 
 pipe_screen *screen, void *
   LLVMTargetRef r600_target;
  #if HAVE_LLVM = 0x0306
   const char *triple = amdgcn--;
 -#else
 - const char *triple = r600--;
  #endif
   int shader, i;

 r600_target should probably also be moved inside the

 #if HAVE_LLVM = 0x0306

 as it's only used in that case. But that can be done in a separate
 patch. Either way, this patch is

 Reviewed-by: Michel Dänzer michel.daen...@amd.com

Will keep the r600_target change in a separate patch. I have a few
patches which handle unused variables/used uninitialised warnings.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload

2015-04-15 Thread Matt Turner
On Wed, Apr 1, 2015 at 6:19 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 Instead of the complicated and broken-by-design pile of heuristics we had
 before, we now have a straightforward lowering:

  1) All header sources are copied directly using force_writemask_all and,
 since they are guaranteed to be a single register, there are no
 force_sechalf issues.

  2) All non-header sources are copied using the exact same force_sechalf
 and saturate modifiers as the LOAD_PAYLOAD operation itself.

Let's not do this. Nothing puts a saturate modifier on LOAD_PAYLOAD
today, and it is kind of confusing about what it means. Can't we have
fbwrites that write depth as well. I wouldn't think we wanted to
saturate that.

I don't think it buys us anything. If we just run copy propagation
after lower_load_payload() we'll get the code we want.

  3) In order to accommodate older gens that need interleaved colors,
 lower_load_payload detects when the destination is a COMPR4 register
 and automatically interleaves the non-header sources.  The
 lower_load_payload pass does the right thing here regardless of whether
 or not the hardware actually supports COMPR4.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 154 
 ++-
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   3 -
  2 files changed, 80 insertions(+), 77 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index f8e26c0..bc45a38 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3201,93 +3201,99 @@ fs_visitor::lower_load_payload()
  {
 bool progress = false;

 -   int vgrf_to_reg[alloc.count];
 -   int reg_count = 0;
 -   for (unsigned i = 0; i  alloc.count; ++i) {
 -  vgrf_to_reg[i] = reg_count;
 -  reg_count += alloc.sizes[i];
 -   }
 -
 -   struct {
 -  bool written:1; /* Whether this register has ever been written */
 -  bool force_writemask_all:1;
 -  bool force_sechalf:1;
 -   } metadata[reg_count];
 -   memset(metadata, 0, sizeof(metadata));
 -
 foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
 -  if (inst-dst.file == GRF) {
 - const int dst_reg = vgrf_to_reg[inst-dst.reg] + 
 inst-dst.reg_offset;
 - bool force_sechalf = inst-force_sechalf 
 -  !inst-force_writemask_all;
 - bool toggle_sechalf = inst-dst.width == 16 
 -   type_sz(inst-dst.type) == 4 
 -   !inst-force_writemask_all;
 - for (int i = 0; i  inst-regs_written; ++i) {
 -metadata[dst_reg + i].written = true;
 -metadata[dst_reg + i].force_sechalf = force_sechalf;
 -metadata[dst_reg + i].force_writemask_all = 
 inst-force_writemask_all;
 -force_sechalf = (toggle_sechalf != force_sechalf);
 - }
 -  }
 -
if (inst-opcode == SHADER_OPCODE_LOAD_PAYLOAD) {

We should invert this condition and 'continue' so that we can avoid an
extra level of indentation which only makes the code harder to read.

   assert(inst-dst.file == MRF || inst-dst.file == GRF);
 +
   fs_reg dst = inst-dst;

 - for (int i = 0; i  inst-sources; i++) {
 -dst.width = inst-src[i].effective_width;
 -dst.type = inst-src[i].type;
 -
 -if (inst-src[i].file == BAD_FILE) {
 -   /* Do nothing but otherwise increment as normal */
 -} else if (dst.file == MRF 
 -   dst.width == 8 
 -   brw-has_compr4 
 -   i + 4  inst-sources 
 -   inst-src[i + 4].equals(horiz_offset(inst-src[i], 
 8))) {
 -   fs_reg compr4_dst = dst;
 -   compr4_dst.reg += BRW_MRF_COMPR4;
 -   compr4_dst.width = 16;
 -   fs_reg compr4_src = inst-src[i];
 -   compr4_src.width = 16;
 -   fs_inst *mov = MOV(compr4_dst, compr4_src);
 + /* Get rid of COMPR4.  We'll add it back in if we need it */
 + if (dst.file == MRF  dst.reg  BRW_MRF_COMPR4)
 +dst.reg = dst.reg  ~BRW_MRF_COMPR4;

No point in checking whether the BRW_MRF_COMPR4 bit is set before
clearing it. Just clear it.

 +
 + dst.width = 8;
 + for (uint8_t i = 0; i  inst-header_size; i++) {
 +if (inst-src[i].file != BAD_FILE) {
 +   fs_reg mov_dst = retype(dst, BRW_REGISTER_TYPE_UD);
 +   fs_reg mov_src = retype(inst-src[i], BRW_REGISTER_TYPE_UD);
 +   mov_src.width = 8;
 +   fs_inst *mov = MOV(mov_dst, mov_src);
 mov-force_writemask_all = true;
 inst-insert_before(block, mov);
 -   /* Mark i+4 as BAD_FILE so we don't emit a MOV for it */
 -   inst-src[i + 4].file = BAD_FILE;
 -} else {
 -   fs_inst *mov = MOV(dst, inst-src[i]);
 -   

Re: [Mesa-dev] [PATCH] glx: Allow to create any OpenGL ES version.

2015-04-15 Thread Chad Versace

On Wed 15 Apr 2015, Emil Velikov wrote:

On 14 April 2015 at 17:22, Ian Romanick i...@freedesktop.org wrote:

On 04/14/2015 08:36 AM, Emil Velikov wrote:

On 14 April 2015 at 13:52, Jose Fonseca jfons...@vmware.com wrote:

On 13/04/15 22:59, Ian Romanick wrote:


On 04/10/2015 03:36 PM, Jose Fonseca wrote:


From: José Fonseca jfons...@vmware.com



diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 8192c54..192525a 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1102,7 +1102,7 @@ dri2BindExtensions(struct dri2_screen *psc, struct
glx_display * priv,
   __glXEnableDirectExtension(psc-base, GLX_ARB_create_context);
   __glXEnableDirectExtension(psc-base,
GLX_ARB_create_context_profile);

-  if ((mask  (1  __DRI_API_GLES2)) != 0) {
+  if ((mask  ((1  __DRI_API_GLES2) | (1  __DRI_API_GLES3))) != 0)


Since ES3 is a superset of ES2, this shouldn't be necessary.  I can't
imagine a driver only setting __DRI_API_GLES3... and the common code may
not even make that possible.  We could, however, enable
GLX_EXT_create_context_es2_profile if only __DRI_API_GLES is set.  I
don't really feel like testing any drivers that only support ES 1.x, do
you? :)



Think of it as allowing gles1 support via GLX (apart from gles2/3),
rather than testing gles1 only drivers. I would assume that you
already support gles1 via EGL so things should just work for GLX.
Either way the decision is not mine to make ;-)


Drivers that support GLES1 but not GLES2 are easily created. All that's
needed is rebuilding i965 with --enable-gles1 --disable-gles2, right?

I also have no time to rebuild the xserver. I try that once in a blue
moon, and everytime it leads to broken input drivers and the need for
vodka.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glx: Create proper server dependency for GLX_EXT_create_context_es2_profile

2015-04-15 Thread Chad Versace

On Tue 14 Apr 2015, Ian Romanick wrote:

From: Ian Romanick ian.d.roman...@intel.com

Previously GLX_EXT_create_context_es2_profile was marked as direct
only so that it would not depend on server support.  Since the
extension required functions that are part of
GLX_ARB_create_context_profile, support for the EXT was disabled if the
ARB was not supported.

This was complete rubbish.  If the server supported the ARB but not the
EXT, sending a request with GLX_CONTEXT_ES2_PROFILE_BIT_EXT would result
in GLXBadProfileARB.

Instead of the misguided hack, make GLX_EXT_create_context_es2_profile
properly depend on server support by not marking it as direct only.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: José Fonseca jfons...@vmware.com
Cc: Chad Versace chad.vers...@intel.com
Cc: Emil Velikov emil.l.veli...@gmail.com
---
src/glx/glxextensions.c | 12 +---
1 file changed, 1 insertion(+), 11 deletions(-)


There's a lot of subtlety here, but it looks correct as far as I can
tell.

Reviewed-by: Chad Versace chad.vers...@intel.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Update the comment about platforms supporting blorp

2015-04-15 Thread Chad Versace

On Wed 15 Apr 2015, Anuj Phogat wrote:

Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 644cb41..7fd812e 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -223,8 +223,8 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
   struct intel_mipmap_tree *src_mt = src_irb-mt;
   struct intel_mipmap_tree *dst_mt = intel_image-mt;

-   /* BLORP is not supported before Gen6. */
-   if (brw-gen  6 || brw-gen = 8)
+   /* BLORP is only supported for Gen6-7. */
+   if (brw-gen  6 || brw-gen  7)
  return false;

   if (_mesa_get_format_base_format(src_rb-Format) !=


Reviewed-by: Chad Versace chad.vers...@intel.com






___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965: Make shader_time store names/ids instead of referencing shaders.

2015-04-15 Thread Jason Ekstrand
On Wed, Apr 15, 2015 at 2:24 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 Jason noticed that shader_time was bumping the reference count on the
 gl_shader_program and gl_program structures, in code called during
 compilation.

 Not only were these never unreferenced, but it meant fragment shaders
 might be referenced twice (SIMD8 and SIMD16)...or only once.

 We don't actually need the programs.  We just need their numeric ID and
 their language (GLSL/ARB/FF) or KHR_debug label.  If there's a label, we
 have to strdup it since the underlying program could be deleted.

 To be fair, we're not exactly cleaning that up either, but we at least
 ralloc it out of the shader_time arrays, so if we ever bother cleaning
 those up, they'll go away properly.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_context.h |  4 +--
  src/mesa/drivers/dri/i965/brw_program.c | 52 
 +++--
  2 files changed, 19 insertions(+), 37 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 0bd0ed1..a6d6787 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -1454,8 +1454,8 @@ struct brw_context

 struct {
drm_intel_bo *bo;
 -  struct gl_shader_program **shader_programs;
 -  struct gl_program **programs;
 +  const char **names;
 +  int *ids;
enum shader_time_shader_type *types;
uint64_t *cumulative;
int num_entries;
 diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
 b/src/mesa/drivers/dri/i965/brw_program.c
 index 7ea08e6..81a0c19 100644
 --- a/src/mesa/drivers/dri/i965/brw_program.c
 +++ b/src/mesa/drivers/dri/i965/brw_program.c
 @@ -294,10 +294,8 @@ brw_init_shader_time(struct brw_context *brw)
 brw-shader_time.bo = drm_intel_bo_alloc(brw-bufmgr, shader time,
  max_entries * SHADER_TIME_STRIDE,
  4096);
 -   brw-shader_time.shader_programs = rzalloc_array(brw, struct 
 gl_shader_program *,
 -max_entries);
 -   brw-shader_time.programs = rzalloc_array(brw, struct gl_program *,
 - max_entries);
 +   brw-shader_time.names = rzalloc_array(brw, const char *, max_entries);
 +   brw-shader_time.ids = rzalloc_array(brw, int, max_entries);
 brw-shader_time.types = rzalloc_array(brw, enum shader_time_shader_type,
max_entries);
 brw-shader_time.cumulative = rzalloc_array(brw, uint64_t,
 @@ -434,36 +432,15 @@ brw_report_shader_time(struct brw_context *brw)
 fprintf(stderr, \n);
 fprintf(stderr, type  ID  cycles spent   
 %% of total\n);
 for (int s = 0; s  brw-shader_time.num_entries; s++) {
 -  const char *shader_name;
const char *stage;
/* Work back from the sorted pointers times to a time to print. */
int i = sorted[s] - scaled;
 -  struct gl_shader_program *prog = brw-shader_time.shader_programs[i];

if (scaled[i] == 0)
   continue;

 -  int shader_num = 0;
 -  if (prog) {
 - shader_num = prog-Name;
 -
 - if (prog-Label) {
 -shader_name = prog-Label;
 - } else if (shader_num == 0) {
 -shader_name = ff;
 - } else {
 -shader_name = glsl;
 - }
 -  } else if (brw-shader_time.programs[i]) {
 - shader_num = brw-shader_time.programs[i]-Id;
 - if (shader_num == 0) {
 -shader_name = ff;
 - } else {
 -shader_name = prog;
 - }
 -  } else {
 - shader_name = other;
 -  }
 +  int shader_num = brw-shader_time.ids[i];
 +  const char *shader_name = brw-shader_time.names[i];

switch (brw-shader_time.types[i]) {
case ST_VS:
 @@ -543,19 +520,24 @@ brw_get_shader_time_index(struct brw_context *brw,
struct gl_program *prog,
enum shader_time_shader_type type)
  {

I'm still not happy that this is still in get_shader_time_index but at
least it doesn't use the GL context anymore.

Series is Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 -   struct gl_context *ctx = brw-ctx;
 -
 int shader_time_index = brw-shader_time.num_entries++;
 assert(shader_time_index  brw-shader_time.max_entries);
 brw-shader_time.types[shader_time_index] = type;

 -   _mesa_reference_shader_program(ctx,
 -  
 brw-shader_time.shader_programs[shader_time_index],
 -  shader_prog);
 +   int id = shader_prog ? shader_prog-Name : prog-Id;
 +   const char *name;
 +   if (id == 0) {
 +  name = ff;
 +   } else if (!shader_prog) {
 +  name = prog;
 +   } else if (shader_prog-Label) {
 +  name = 

Re: [Mesa-dev] [PATCH] glx: Allow to create any OpenGL ES version.

2015-04-15 Thread Jose Fonseca

On 14/04/15 17:22, Ian Romanick wrote:

On 04/14/2015 08:36 AM, Emil Velikov wrote:

On 14 April 2015 at 13:52, Jose Fonseca jfons...@vmware.com wrote:

On 13/04/15 22:59, Ian Romanick wrote:


On 04/10/2015 03:36 PM, Jose Fonseca wrote:


From: José Fonseca jfons...@vmware.com

The latest version of GLX_EXT_create_context_es2_profile states:

If the version requested is a valid and supported OpenGL-ES version,
and the GLX_CONTEXT_ES_PROFILE_BIT_EXT bit is set in the
GLX_CONTEXT_PROFILE_MASK_ARB attribute (see below), then the context
returned will implement the OpenGL ES version requested.

We must also export EXT_create_context_es_profile too, as
EXT_create_context_es2_profile specification is crystal clear:

NOTE: implementations of this extension must export BOTH extension
strings, for backwards compatibility with applications written
against version 1 of this extension.

Totally untested.  (Just happened to noticed this while implementing
GLX_EXT_create_context_es2_profile for st/xlib.)

Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

v2: Replicate the drisw_glx.c to dri2_glx.c and dri3_glx.c as suggested
by Emil Velikov.
---
   src/glx/dri2_glx.c   |  5 -
   src/glx/dri3_glx.c   |  5 -
   src/glx/dri_common.c | 32 
   src/glx/drisw_glx.c  |  2 ++
   4 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 462d560..8192c54 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1102,9 +1102,12 @@ dri2BindExtensions(struct dri2_screen *psc, struct
glx_display * priv,
 __glXEnableDirectExtension(psc-base, GLX_ARB_create_context);
 __glXEnableDirectExtension(psc-base,
GLX_ARB_create_context_profile);

-  if ((mask  (1  __DRI_API_GLES2)) != 0)
+  if ((mask  (1  __DRI_API_GLES2)) != 0) {
+__glXEnableDirectExtension(psc-base,
+   GLX_EXT_create_context_es_profile);
  __glXEnableDirectExtension(psc-base,

GLX_EXT_create_context_es2_profile);
+  }
  }

  for (i = 0; extensions[i]; i++) {
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 1ddc723..6973ad1 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1825,9 +1825,12 @@ dri3_bind_extensions(struct dri3_screen *psc,
struct glx_display * priv,
  __glXEnableDirectExtension(psc-base, GLX_ARB_create_context);
  __glXEnableDirectExtension(psc-base,
GLX_ARB_create_context_profile);

-   if ((mask  (1  __DRI_API_GLES2)) != 0)
+   if ((mask  (1  __DRI_API_GLES2)) != 0) {
+  __glXEnableDirectExtension(psc-base,
+ GLX_EXT_create_context_es_profile);
 __glXEnableDirectExtension(psc-base,
GLX_EXT_create_context_es2_profile);
+   }

  for (i = 0; extensions[i]; i++) {
 /* when on a different gpu than the server, the server pixmaps
diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index 63c8de3..541abbb 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -544,9 +544,22 @@ dri2_convert_glx_attribs(unsigned num_attribs, const
uint32_t *attribs,
 case GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB:
  *api = __DRI_API_OPENGL;
  break;
-  case GLX_CONTEXT_ES2_PROFILE_BIT_EXT:
-*api = __DRI_API_GLES2;
-break;
+  case GLX_CONTEXT_ES_PROFILE_BIT_EXT:
+ switch (*major_ver) {
+ case 3:
+*api = __DRI_API_GLES3;
+break;
+ case 2:
+*api = __DRI_API_GLES2;
+break;
+ case 1:
+*api = __DRI_API_GLES;
+break;
+ default:
+*error = __DRI_CTX_ERROR_BAD_API;
+return false;
+ }
+ break;
 default:
  *error = __DRI_CTX_ERROR_BAD_API;
  return false;
@@ -577,19 +590,6 @@ dri2_convert_glx_attribs(unsigned num_attribs, const
uint32_t *attribs,
 return false;
  }

-   /* The GLX_EXT_create_context_es2_profile spec says:
-*
-* ... If the version requested is 2.0, and the
-* GLX_CONTEXT_ES2_PROFILE_BIT_EXT bit is set in the
-* GLX_CONTEXT_PROFILE_MASK_ARB attribute (see below), then the
context
-* returned will implement OpenGL ES 2.0. This is the only way in
which
-* an implementation may request an OpenGL ES 2.0 context.
-*/
-   if (*api == __DRI_API_GLES2  (*major_ver != 2 || *minor_ver != 0))
{
-  *error = __DRI_CTX_ERROR_BAD_API;
-  return false;
-   }



I guess this text was removed from the extension spec, and now we rely
on some other layer detecting invalid versions (like 2.1)?



Yes, the spec replaced that with

 ... If the version requested is a valid and supported OpenGL-ES
version,
 and the GLX_CONTEXT_ES_PROFILE_BIT_EXT bit is set in the
 GLX_CONTEXT_PROFILE_MASK_ARB attribute (see below), then the 

Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload

2015-04-15 Thread Matt Turner
On Fri, Apr 3, 2015 at 7:28 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Thu, Apr 2, 2015 at 3:01 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 Instead of the complicated and broken-by-design pile of heuristics we had
 before, we now have a straightforward lowering:

  1) All header sources are copied directly using force_writemask_all and,
 since they are guaranteed to be a single register, there are no
 force_sechalf issues.

  2) All non-header sources are copied using the exact same force_sechalf
 and saturate modifiers as the LOAD_PAYLOAD operation itself.

  3) In order to accommodate older gens that need interleaved colors,
 lower_load_payload detects when the destination is a COMPR4 register
 and automatically interleaves the non-header sources.  The
 lower_load_payload pass does the right thing here regardless of whether
 or not the hardware actually supports COMPR4.

 I had a quick glance at the series and it seems to be going in the right
 direction.

 One thing I honestly don't like is the ad-hoc and IMHO premature
 treatment of payload headers, it still feels like the LOAD_PAYLOAD
 instruction has more complex semantics than necessary and the benefit is
 unclear to me.

 I suppose that your motivation was to avoid setting force_writemask_all
 in LOAD_PAYLOAD instructions with header.  The optimizer should be able
 to cope with those flags and get rid of them from the resulting moves
 where they are redundant, and if it's not able to it looks like
 something that should be fixed anyway.  The explicit handling of headers
 is responsible for much of the churn in this series and is likely to
 complicate users of LOAD_PAYLOAD and optimization passes that have to
 manipulate them.

 Avoiding force_writemask_all is only half of the motivation and the
 small half at that.  A header source, more properly defined, is a
 single physical register that, conceptually, applies to all channels.
 Effectively, a header source (I should have stated this clearly) has
 two properties:

 1) It has force_writemask_all set
 2) It is exactly one physical hardware register.

 This second property is the more important of the two.  Most of the
 disaster of the previous LOAD_PAYLOAD implementation was that we did a
 pile of guesswork and had a ill-conceved effective width think in
 order to figure out how big the register actually was.  Making the
 user specify which sources are header sources eliminates that
 guesswork.  It also has the nice side-effect that we can do the right
 force_writemask_all and we can properly handle COMPR4 for the the
 user.

 Yeah, true, but this seems like the least orthogonal and most annoying
 to use solution for this problem, it forces the caller to provide
 redundant information, it takes into account the saturate flag on some
 arguments and not others,

There's no way to get a saturated load_payload, as far as I can tell.

Let's just avoid the problem you mention all together and continue not
doing it. I agree that this patch adding support for that is not a
good idea.

 it shuffles sources with respect to the
 specified order when COMPR4 is set, but only for the first four
 non-header sources.

To confirm I understand -- the existing lower_load_payload takes 2x
8-channel color sources and recognizes that src[i] and src[i+4] can be
emitted as a compr4 MOV. The proposed patch changes that to take 1x
16-channel color source and just emits a compr4 MOV.

I don't really find that change particularly contentious. In fact, I
kind of wonder what case we were trying to handle by allowing separate
sources for the first 8-channels and the second 8-channels. Is there
some case where these are going to be produced by different
instructions that can't do compr4? Texturing or math instructions come
to mind, but they can't have MRF destinations anyway.


In another email, you mention that compr4 change in this patch means
that optimization passes need to know some extra information.

I'm not able to think of any places where we'd need that information
in optimization passes. The only passes that really operate on
load_payload are CSE, copy propagate, and register coalescing; and I
don't think any of these will touch things writing to MRFs. Am I
forgetting something?

I think I understand the concern in general -- the sources are copied
into different destination registers dependent on (dst.reg 
BRW_MRF_COMPR4) == 0. That's a little strange, but that strangeness is
in the hardware. I mean, if it's strange for load_payload to do this,
isn't it strange for other hardware instructions to do it? Or is it
just that it's a virtual opcode doing a hardware-specific thing?

I don't know. It seems like it's just hiding a little bit of
complexity (give me 16-channels and I'll emit compr4 MOVs to put them
in the right hardware-specific places).

 I think any of the following 

Re: [Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Brian Paul

On 04/15/2015 11:56 AM, Emil Velikov wrote:

On 15 April 2015 at 18:33, Matt Turner matts...@gmail.com wrote:

On Wed, Apr 15, 2015 at 7:08 AM, Emil Velikov emil.l.veli...@gmail.com wrote:

Rather than forcing everyone to provide their own definition of the symbol
provide a common weak one, which anyone can override if needed.

This resolved the build of the standalone pipe-drivers, as it provides a
default symbol which was missing previously.

Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
  src/Makefile.am|  3 ++-
  src/glsl/SConscript|  2 ++
  src/mesa/Android.libmesa_glsl_utils.mk |  6 --
  src/mesa/program/weak_errors.c | 30 ++


Is program/ really the right place for this? (maybe main/ instead?)

Hmm main/ might be better indeed. If there are any recommendations
about the name I'll gladly take it :)

Jose/Brian

 From a quick look I cannot find a suitable alternative for
__attribute__((weak)) for MSVC. Would you know any of the top of your
head or should I just wrap the attribute in ifndef _MSC_VER and leave
the rest ? If I drop the weak_errors.c hunk from the SCons here, and
replace tests/common.c with it things should just work.


I'd have to do some digging to answer the MSVC question.

I guess I'm not totally clear on the background of what's going on here, 
but let me make a suggestion.


The comment in src/gallium/state_trackers/xa/xa_tracker.c says that it's 
called by NIR, but I can't find any calls to _mesa_error_no_memory() in 
src/glsl/nir/.  I guess the actual calls are in the src/glsl/ code, right?


I ran into a similar thing in the glapi code where I wanted to call some 
core Mesa function, but didn't want a dependency of that nature.


The solution was to use a callback function.

So, could we add a function to the glsl/nir code which registers an 
out-of-memory callback function?  The glsl/nir code could have a default 
fprintf(stderr) fallback.  But Mesa would register a callback that 
actually calls  _mesa_error() to record the error.


That seems cleaner than using a compiler/linker work-around.

-Brian



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders

2015-04-15 Thread Kristian Høgsberg
The ir_tex opcode turns into a sample or sample_c message, which will try to
compute derivatives to determine the lod. This produces garbage for
non-fragment shaders where the sample coordinates don't correspond to
subspans.

We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
lod to 0.

https://bugs.freedesktop.org/show_bug.cgi?id=89457
Signed-off-by: Kristian Høgsberg k...@bitplanet.net
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 0049b2d..4e99366 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1839,6 +1839,15 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
fs_reg dst,
   offset_value.file != BAD_FILE  offset_value.file != IMM;
bool coordinate_done = false;
 
+   /* The sampler can only meaningfully compute LOD for fragment shader
+* messages. For all other stages, we change the opcode to ir_txl and
+* hardcode the LOD to 0.
+*/
+   if (stage != MESA_SHADER_FRAGMENT  op == ir_tex) {
+  op = ir_txl;
+  lod = fs_reg(0.0f);
+   }
+
/* Set up the LOD info */
switch (op) {
case ir_tex:
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Jose Fonseca

On 15/04/15 20:38, Brian Paul wrote:

On 04/15/2015 11:56 AM, Emil Velikov wrote:

On 15 April 2015 at 18:33, Matt Turner matts...@gmail.com wrote:

On Wed, Apr 15, 2015 at 7:08 AM, Emil Velikov
emil.l.veli...@gmail.com wrote:

Rather than forcing everyone to provide their own definition of the
symbol
provide a common weak one, which anyone can override if needed.

This resolved the build of the standalone pipe-drivers, as it
provides a
default symbol which was missing previously.

Cc: Rob Clark robcl...@freedesktop.org
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
  src/Makefile.am|  3 ++-
  src/glsl/SConscript|  2 ++
  src/mesa/Android.libmesa_glsl_utils.mk |  6 --
  src/mesa/program/weak_errors.c | 30
++


Is program/ really the right place for this? (maybe main/ instead?)

Hmm main/ might be better indeed. If there are any recommendations
about the name I'll gladly take it :)

Jose/Brian

 From a quick look I cannot find a suitable alternative for
__attribute__((weak)) for MSVC. Would you know any of the top of your
head or should I just wrap the attribute in ifndef _MSC_VER and leave
the rest ? If I drop the weak_errors.c hunk from the SCons here, and
replace tests/common.c with it things should just work.


I'd have to do some digging to answer the MSVC question.


I don't believe MSVC has the equivalent of __attribute__((weak)).

There's _declspec(selectany) , which allows to pick any of multiple 
instances of the same symbol, but it does not allow to say which of the 
multiple implementations is strong vs weak.



I guess I'm not totally clear on the background of what's going on here,
but let me make a suggestion.

The comment in src/gallium/state_trackers/xa/xa_tracker.c says that it's
called by NIR, but I can't find any calls to _mesa_error_no_memory() in
src/glsl/nir/.  I guess the actual calls are in the src/glsl/ code, right?

I ran into a similar thing in the glapi code where I wanted to call some
core Mesa function, but didn't want a dependency of that nature.

The solution was to use a callback function.

So, could we add a function to the glsl/nir code which registers an
out-of-memory callback function?  The glsl/nir code could have a default
fprintf(stderr) fallback.  But Mesa would register a callback that
actually calls  _mesa_error() to record the error.

That seems cleaner than using a compiler/linker work-around.


I agree with Brian.  This sort of compiler/linker tricks are trick to 
sustain with portable code.  Callbacks seem a much more elegant solution.


Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] gallium/ttn: add support for texture offsets

2015-04-15 Thread Eric Anholt
Rob Clark robdcl...@gmail.com writes:

 On Tue, Apr 14, 2015 at 1:04 PM, Eric Anholt e...@anholt.net wrote:
 Rob Clark robdcl...@gmail.com writes:

 From: Rob Clark robcl...@freedesktop.org

 Signed-off-by: Rob Clark robcl...@freedesktop.org

 1-3 (with the fix to 1 that you posted in irc) are:

 Reviewed-by: Eric Anholt e...@anholt.net

 I don't like the mismatch on bytes vs vec4s in the load_ubo_indirect
 arguments for patch 4, and will be interested in seeing the version you
 were working on to fix that.

 so, if I pull out the addressing-modes patch, then for the ubo patch
 we are back to 'index *= 16' vec4-byte conversion.  Which is funny
 looking, but at the moment my best definition of correct is what
 glsl_to_nir and intel driver do, and by that definition, this is the
 correct thing to do.

OK.  But we should be doing the SHL here, too (also gets us consistency
with glsl_to_nir).


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glx: Create proper server dependency for GLX_EXT_create_context_es2_profile

2015-04-15 Thread Jose Fonseca

On 14/04/15 17:35, Ian Romanick wrote:

From: Ian Romanick ian.d.roman...@intel.com

Previously GLX_EXT_create_context_es2_profile was marked as direct
only so that it would not depend on server support.  Since the
extension required functions that are part of
GLX_ARB_create_context_profile, support for the EXT was disabled if the
ARB was not supported.

This was complete rubbish.  If the server supported the ARB but not the
EXT, sending a request with GLX_CONTEXT_ES2_PROFILE_BIT_EXT would result
in GLXBadProfileARB.

Instead of the misguided hack, make GLX_EXT_create_context_es2_profile
properly depend on server support by not marking it as direct only.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: José Fonseca jfons...@vmware.com
Cc: Chad Versace chad.vers...@intel.com
Cc: Emil Velikov emil.l.veli...@gmail.com
---
  src/glx/glxextensions.c | 12 +---
  1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c
index ce5d66d..a326f0d 100644
--- a/src/glx/glxextensions.c
+++ b/src/glx/glxextensions.c
@@ -84,7 +84,7 @@ static const struct extension_info known_glx_extensions[] = {
 { GLX(EXT_visual_rating),   VER(0,0), Y, Y, N, N },
 { GLX(EXT_fbconfig_packed_float),   VER(0,0), Y, Y, N, N },
 { GLX(EXT_framebuffer_sRGB),VER(0,0), Y, Y, N, N },
-   { GLX(EXT_create_context_es2_profile), VER(0,0), Y, N, N, Y },
+   { GLX(EXT_create_context_es2_profile), VER(0,0), Y, N, N, N },
 { GLX(MESA_copy_sub_buffer),VER(0,0), Y, N, N, N },
 { GLX(MESA_multithread_makecurrent),VER(0,0), Y, N, Y, N },
 { GLX(MESA_query_renderer), VER(0,0), Y, N, N, Y },
@@ -627,16 +627,6 @@ __glXCalculateUsableExtensions(struct glx_screen * psc,
}
 }

-   /* This hack is necessary because GLX_ARB_create_context_profile depends on
-* server support, but GLX_EXT_create_context_es2_profile is direct-only.
-* Without this hack, it would be possible to advertise
-* GLX_EXT_create_context_es2_profile without
-* GLX_ARB_create_context_profile.  That would be a problem.
-*/
-   if (!IS_SET(server_support, ARB_create_context_profile_bit)) {
-  CLR_BIT(usable, EXT_create_context_es2_profile_bit);
-   }
-
 psc-effectiveGLXexts = __glXGetStringFromTable(known_glx_extensions,
 usable);
  }



Makes sense AFAICT.  But I confess I'm not very familiar with these 
subtle interactions between GLX and X server.


Acked-by: Jose Fonseca jfons...@vmware.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Render R16G16B16X16 as R16G16B16A16

2015-04-15 Thread Kenneth Graunke
On Wednesday, April 15, 2015 11:38:51 AM Anuj Phogat wrote:
 This enables using _mesa_meta_pbo_TexSubImage() to upload data
 to R16G16B16X16 texture. Earlier it fell back to slower paths.
 
 Jenkins run shows no piglit regressions.
 
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/dri/i965/brw_surface_formats.c | 6 ++
  1 file changed, 6 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
 b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 index f04bac5..7bec8fa 100644
 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
 +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 @@ -582,6 +582,12 @@ brw_init_surface_formats(struct brw_context *brw)
case BRW_SURFACEFORMAT_L16_FLOAT:
render = BRW_SURFACEFORMAT_R16_FLOAT;
break;
 +  case BRW_SURFACEFORMAT_R16G16B16X16_UNORM:
 + render = BRW_SURFACEFORMAT_R16G16B16A16_UNORM;
 + break;
 +  case BRW_SURFACEFORMAT_R16G16B16X16_FLOAT:
 + render = BRW_SURFACEFORMAT_R16G16B16A16_FLOAT;
 + break;
case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
/* XRGB is handled as ARGB because the chips in this family
 * cannot render to XRGB targets.  This means that we have to
 

LGTM, thanks!

Reviewed-by: Kenneth Graunke kenn...@whitecape.org


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/mesa: also emit labels for TGSI_OPCODE_BGNSUB

2015-04-15 Thread Brian Paul

On 04/14/2015 02:20 PM, Jose Fonseca wrote:

On 14/04/15 17:56, Brian Paul wrote:

Subroutines need labels so they can be identied by CAL instructions.
---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 19 +--
  1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 67a4da7..eb0ce07 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -4771,18 +4771,25 @@ compile_tgsi_instruction(struct st_translate *t,
 }

 switch(inst-op) {
+   case TGSI_OPCODE_BGNSUB:
 case TGSI_OPCODE_BGNLOOP:
 case TGSI_OPCODE_CAL:
 case TGSI_OPCODE_ELSE:
 case TGSI_OPCODE_ENDLOOP:
 case TGSI_OPCODE_IF:
 case TGSI_OPCODE_UIF:
-  assert(num_dst == 0);
-  ureg_label_insn(ureg,
-  inst-op,
-  src, num_src,
-  get_label(t,
-inst-op == TGSI_OPCODE_CAL ?
inst-function-sig_id : 0));
+  {
+ int sig_id = 0;
+ if (inst-op == TGSI_OPCODE_CAL ||
+ inst-op == TGSI_OPCODE_BGNSUB) {


I'm not sure about this.

One thing is the label of the instruction (ie, instrution no), the other
is the operand label.

CAL takes an operand label (the #no of the instruction where BGNSUB
starts).

But BGNSUB shouldn't need a label (what matter is the instrucion #no).
If it does take a label, then it's probably the label of the ENDSUB, not
where the function starts.


I thought we had removed all these pointless labels.  I suppose that's
why the label 0 is used.  In other words, I believe the right thing here
is to add the case statement for TGSI_OPCODE_BGNSUB, but not treat it
like CAL.


Hmm, OK, I'll have to take another look at this when I can find some 
time.  I know labels were kind of iffy...


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/ttn: add UBO support

2015-04-15 Thread Rob Clark
From: Rob Clark robcl...@freedesktop.org

v2: move ishl into ttn (instead of driver backend) to keep the units
consistent between immediate and indirect offsets

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 69 -
 1 file changed, 60 insertions(+), 9 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index b6123e0..6c8d3fc 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -162,6 +162,10 @@ ttn_emit_declaration(struct ttn_compile *c)
  file == TGSI_FILE_OUTPUT ||
  file == TGSI_FILE_CONSTANT);
 
+  /* nothing to do for UBOs: */
+  if ((file == TGSI_FILE_CONSTANT)  decl-Declaration.Dimension)
+ return;
+
   var = rzalloc(b-shader, nir_variable);
   var-data.driver_location = decl-Range.First;
 
@@ -286,7 +290,9 @@ ttn_array_deref(struct ttn_compile *c, nir_intrinsic_instr 
*instr,
 
 static nir_src
 ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned 
index,
-   struct tgsi_ind_register *indirect)
+   struct tgsi_ind_register *indirect,
+   struct tgsi_dimension *dim,
+   struct tgsi_ind_register *dimind)
 {
nir_builder *b = c-build;
nir_src src;
@@ -314,15 +320,18 @@ ttn_src_for_file_and_index(struct ttn_compile *c, 
unsigned file, unsigned index,
  assert(!indirect);
  src.reg.reg = c-temp_regs[index].reg;
   }
+  assert(!dim);
   break;
 
case TGSI_FILE_ADDRESS:
   src.reg.reg = c-addr_reg;
+  assert(!dim);
   break;
 
case TGSI_FILE_IMMEDIATE:
   src = nir_src_for_ssa(c-imm_defs[index]);
   assert(!indirect);
+  assert(!dim);
   break;
 
case TGSI_FILE_SYSTEM_VALUE: {
@@ -330,6 +339,9 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
   nir_intrinsic_op op;
   unsigned ncomp = 1;
 
+  assert(!indirect);
+  assert(!dim);
+
   switch (c-scan-system_value_semantic_name[index]) {
   case TGSI_SEMANTIC_VERTEXID_NOBASE:
  op = nir_intrinsic_load_vertex_id_zero_base;
@@ -361,15 +373,24 @@ ttn_src_for_file_and_index(struct ttn_compile *c, 
unsigned file, unsigned index,
case TGSI_FILE_CONSTANT: {
   nir_intrinsic_instr *load;
   nir_intrinsic_op op;
+  unsigned srcn = 0;
 
   switch (file) {
   case TGSI_FILE_INPUT:
  op = indirect ? nir_intrinsic_load_input_indirect :
  nir_intrinsic_load_input;
+ assert(!dim);
  break;
   case TGSI_FILE_CONSTANT:
- op = indirect ? nir_intrinsic_load_uniform_indirect :
- nir_intrinsic_load_uniform;
+ if (dim) {
+op = indirect ? nir_intrinsic_load_ubo_indirect :
+nir_intrinsic_load_ubo;
+/* convert index from vec4 to byte: */
+index *= 16;
+ } else {
+op = indirect ? nir_intrinsic_load_uniform_indirect :
+nir_intrinsic_load_uniform;
+ }
  break;
   default:
  unreachable(No other load files supported);
@@ -381,8 +402,28 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
   load-num_components = 4;
   load-const_index[0] = index;
   load-const_index[1] = 1;
+  if (dim) {
+ if (dimind) {
+load-src[srcn] =
+   ttn_src_for_file_and_index(c, dimind-File, dimind-Index,
+  NULL, NULL, NULL);
+ } else {
+/* UBOs start at index 1 in TGSI: */
+load-src[srcn] =
+   nir_src_for_ssa(nir_imm_int(b, dim-Index - 1));
+ }
+ srcn++;
+  }
   if (indirect) {
- load-src[0] = ttn_src_for_indirect(c, indirect);
+ load-src[srcn] = ttn_src_for_indirect(c, indirect);
+ if (dim) {
+assert(load-src[srcn].is_ssa);
+/* we also need to covert vec4 to byte here too: */
+load-src[srcn] =
+   nir_src_for_ssa(nir_ishl(b, load-src[srcn].ssa,
+nir_imm_int(b, 4)));
+ }
+ srcn++;
   }
   nir_ssa_dest_init(load-instr, load-dest, 4, NULL);
   nir_instr_insert_after_cf_list(b-cf_node_list, load-instr);
@@ -409,7 +450,8 @@ ttn_src_for_indirect(struct ttn_compile *c, struct 
tgsi_ind_register *indirect)
   src.swizzle[i] = indirect-Swizzle;
src.src = ttn_src_for_file_and_index(c,
 indirect-File,
-indirect-Index, NULL);
+indirect-Index,
+NULL, NULL, NULL);
return 

Re: [Mesa-dev] [PATCH] i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders

2015-04-15 Thread Ian Romanick
On 04/15/2015 12:41 PM, Kristian Høgsberg wrote:
 The ir_tex opcode turns into a sample or sample_c message, which will try to
 compute derivatives to determine the lod. This produces garbage for
 non-fragment shaders where the sample coordinates don't correspond to
 subspans.
 
 We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
 lod to 0.

This seems logical, but... why the heck hasn't this been a problem
before?  Is this some weirdness of just BSW?  Should we perhaps only do
this on BSW?

 https://bugs.freedesktop.org/show_bug.cgi?id=89457

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457

That allows the bin/bugzilla_mesa.sh script to find it for generating
release notes.

 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
  1 file changed, 9 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 0049b2d..4e99366 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1839,6 +1839,15 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
offset_value.file != BAD_FILE  offset_value.file != IMM;
 bool coordinate_done = false;
  
 +   /* The sampler can only meaningfully compute LOD for fragment shader
 +* messages. For all other stages, we change the opcode to ir_txl and
 +* hardcode the LOD to 0.
 +*/
 +   if (stage != MESA_SHADER_FRAGMENT  op == ir_tex) {
 +  op = ir_txl;
 +  lod = fs_reg(0.0f);
 +   }
 +
 /* Set up the LOD info */
 switch (op) {
 case ir_tex:
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90032] build fails, undefined reference to llvm:RTDyldMemoryManager

2015-04-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90032

--- Comment #11 from Felix von Leitner felix-freedesk...@fefe.de ---
Good news! It works now that I recompiled llvm with RTTI support!

I thought I had RTTI in my LLVM but I was doing it wrong.
For the record: the right way to build LLVM with RTTI support is:

cmake [path-to-llvm-sources] -DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_PREFIX=/usr -DLLVM_LIBDIR_SUFFIX=64 -DBUILD_SHARED_LIBS=ON
-DLLVM_ENABLE_RTTI=1

(the -DLLVM_ENABLE_RTTI=1 part is the important part)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Rob Clark
On Wed, Apr 15, 2015 at 6:15 PM, Ian Romanick i...@freedesktop.org wrote:
 On 04/15/2015 10:56 AM, Emil Velikov wrote:
 On 15 April 2015 at 18:33, Matt Turner matts...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 7:08 AM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 Rather than forcing everyone to provide their own definition of the symbol
 provide a common weak one, which anyone can override if needed.

 This resolved the build of the standalone pipe-drivers, as it provides a
 default symbol which was missing previously.

 Cc: Rob Clark robcl...@freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/Makefile.am|  3 ++-
  src/glsl/SConscript|  2 ++
  src/mesa/Android.libmesa_glsl_utils.mk |  6 --
  src/mesa/program/weak_errors.c | 30 ++

 Is program/ really the right place for this? (maybe main/ instead?)
 Hmm main/ might be better indeed. If there are any recommendations
 about the name I'll gladly take it :)

 So... who is going to use this other than src/glsl/main.cpp and
 src/glsl/tests?  It seems like you could just put the (non-weak) symbol
 in a common file that only those things link.

 Is the goal just code sharing, or is the goal partitioning src/glsl from
 src/mesa?  The former probably isn't worth the effort, and the latter
 should have a more systematic approach... and I could get behind that.

fwiw, the goal is sharing libnir (both vc4 and freedreno/ir3 use it,
and I expect others might like to at some point).. but that currently
brings along some glsl dependencies, which bring along the
_mesa_error_no_memory() dependency..

BR,
-R


 Jose/Brian

 From a quick look I cannot find a suitable alternative for
 __attribute__((weak)) for MSVC. Would you know any of the top of your
 head or should I just wrap the attribute in ifndef _MSC_VER and leave
 the rest ? If I drop the weak_errors.c hunk from the SCons here, and
 replace tests/common.c with it things should just work.


 Thanks
 Emil
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/9] prog; add weak _mesa_error_no_memory() symbol and add it to libglsl_util

2015-04-15 Thread Ian Romanick
On 04/15/2015 10:56 AM, Emil Velikov wrote:
 On 15 April 2015 at 18:33, Matt Turner matts...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 7:08 AM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 Rather than forcing everyone to provide their own definition of the symbol
 provide a common weak one, which anyone can override if needed.

 This resolved the build of the standalone pipe-drivers, as it provides a
 default symbol which was missing previously.

 Cc: Rob Clark robcl...@freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/Makefile.am|  3 ++-
  src/glsl/SConscript|  2 ++
  src/mesa/Android.libmesa_glsl_utils.mk |  6 --
  src/mesa/program/weak_errors.c | 30 ++

 Is program/ really the right place for this? (maybe main/ instead?)
 Hmm main/ might be better indeed. If there are any recommendations
 about the name I'll gladly take it :)

So... who is going to use this other than src/glsl/main.cpp and
src/glsl/tests?  It seems like you could just put the (non-weak) symbol
in a common file that only those things link.

Is the goal just code sharing, or is the goal partitioning src/glsl from
src/mesa?  The former probably isn't worth the effort, and the latter
should have a more systematic approach... and I could get behind that.

 Jose/Brian
 
 From a quick look I cannot find a suitable alternative for
 __attribute__((weak)) for MSVC. Would you know any of the top of your
 head or should I just wrap the attribute in ifndef _MSC_VER and leave
 the rest ? If I drop the weak_errors.c hunk from the SCons here, and
 replace tests/common.c with it things should just work.
 
 
 Thanks
 Emil
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/3] util: Use set_foreach instead of rolling our own

2015-04-15 Thread Matt Turner
On Wed, Apr 15, 2015 at 2:14 PM, Thomas Helland
thomashellan...@gmail.com wrote:
 This follows the same pattern as in the hash_table.

 Reviewed-by: Jason Ekstrand jason.ekstrand at intel.com

Fix the email address before committing, whoever commits.

 Signed-off-by: Thomas Helland thomashellan...@gmail.com
 ---
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders

2015-04-15 Thread Kenneth Graunke
On Wednesday, April 15, 2015 02:47:01 PM Ian Romanick wrote:
 On 04/15/2015 12:41 PM, Kristian Høgsberg wrote:
  The ir_tex opcode turns into a sample or sample_c message, which will try to
  compute derivatives to determine the lod. This produces garbage for
  non-fragment shaders where the sample coordinates don't correspond to
  subspans.
  
  We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
  lod to 0.
 
 This seems logical, but... why the heck hasn't this been a problem
 before?  Is this some weirdness of just BSW?  Should we perhaps only do
 this on BSW?

In the vec4 backend, we've always made ir_tex use the SAMPLE_LOD message
with lod set to 0.0 - so Gen4-7.5 have always done this workaround.

When Kristian added SIMD8 VS support (Oct 2014, Gen8+), we started using
SAMPLE (by virtue of not changing the FS behavior).  We honestly have no
idea why it appears to work on Broadwell - it sure seems like it ought
to be broken there.  Maybe we've just been lucky.

There's also a message header bit for Force LOD to Zero, which the
docs say is mandatory if you're going to use the SAMPLE message in
SIMD4x2 stages (i.e. not fragment).  We've wondered whether headerless
messages in non-fragment stages magically get that right.  That could
make the issue less noticable...

Using SAMPLE_LOD is easier than setting the message header bit, and
seems just as effective.  I think we should do it on Broadwell too.

Thanks for fixing this, Kristian!

Cc: 10.5 mesa-sta...@lists.freedesktop.org
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 
  https://bugs.freedesktop.org/show_bug.cgi?id=89457
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457
 
 That allows the bin/bugzilla_mesa.sh script to find it for generating
 release notes.
 
  Signed-off-by: Kristian Høgsberg k...@bitplanet.net
  ---
   src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
   1 file changed, 9 insertions(+)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
  b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  index 0049b2d..4e99366 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  @@ -1839,6 +1839,15 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
  fs_reg dst,
 offset_value.file != BAD_FILE  offset_value.file != IMM;
  bool coordinate_done = false;
   
  +   /* The sampler can only meaningfully compute LOD for fragment shader
  +* messages. For all other stages, we change the opcode to ir_txl and
  +* hardcode the LOD to 0.
  +*/
  +   if (stage != MESA_SHADER_FRAGMENT  op == ir_tex) {
  +  op = ir_txl;
  +  lod = fs_reg(0.0f);
  +   }
  +
  /* Set up the LOD info */
  switch (op) {
  case ir_tex:
  
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 9/9] r300: do not link against libdrm_intel

2015-04-15 Thread Marek Olšák
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Wed, Apr 15, 2015 at 4:08 PM, Emil Velikov emil.l.veli...@gmail.com wrote:
 Accidentally added since the introduction of the file.

 Cc: 10.4 10.5 mesa-sta...@lists.freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/gallium/drivers/r300/Automake.inc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/gallium/drivers/r300/Automake.inc 
 b/src/gallium/drivers/r300/Automake.inc
 index 9334973..d4ddc40 100644
 --- a/src/gallium/drivers/r300/Automake.inc
 +++ b/src/gallium/drivers/r300/Automake.inc
 @@ -5,7 +5,7 @@ TARGET_CPPFLAGS += -DGALLIUM_R300
  TARGET_LIB_DEPS += \
 $(top_builddir)/src/gallium/drivers/r300/libr300.la \
 $(RADEON_LIBS) \
 -   $(INTEL_LIBS)
 +   $(LIBDRM_LIBS)

  TARGET_RADEON_WINSYS = \
 $(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la
 --
 2.3.5

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir/builder: add nir_builder_insert_after_instr()

2015-04-15 Thread Rob Clark
From: Rob Clark robcl...@freedesktop.org

For lowering if/else, I need a way to insert at the end of the previous
block.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/glsl/nir/nir_builder.h | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h
index 587d014..d1419ee 100644
--- a/src/glsl/nir/nir_builder.h
+++ b/src/glsl/nir/nir_builder.h
@@ -29,6 +29,7 @@ struct exec_list;
 typedef struct nir_builder {
struct exec_list *cf_node_list;
nir_instr *before_instr;
+   nir_instr *after_instr;
 
nir_shader *shader;
nir_function_impl *impl;
@@ -47,12 +48,24 @@ nir_builder_insert_after_cf_list(nir_builder *build,
  struct exec_list *cf_node_list)
 {
build-cf_node_list = cf_node_list;
+   build-before_instr = NULL;
+   build-after_instr = NULL;
 }
 
 static inline void
 nir_builder_insert_before_instr(nir_builder *build, nir_instr *before_instr)
 {
+   build-cf_node_list = NULL;
build-before_instr = before_instr;
+   build-after_instr = NULL;
+}
+
+static inline void
+nir_builder_insert_after_instr(nir_builder *build, nir_instr *after_instr)
+{
+   build-cf_node_list = NULL;
+   build-before_instr = NULL;
+   build-after_instr = after_instr;
 }
 
 static inline void
@@ -60,9 +73,12 @@ nir_builder_instr_insert(nir_builder *build, nir_instr 
*instr)
 {
if (build-cf_node_list) {
   nir_instr_insert_after_cf_list(build-cf_node_list, instr);
-   } else {
-  assert(build-before_instr);
+   } else if (build-before_instr) {
   nir_instr_insert_before(build-before_instr, instr);
+   } else {
+  assert(build-after_instr);
+  nir_instr_insert_after(build-after_instr, instr);
+  build-after_instr = instr;
}
 }
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders

2015-04-15 Thread Kristian Høgsberg
On Wed, Apr 15, 2015 at 4:28 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Wednesday, April 15, 2015 02:47:01 PM Ian Romanick wrote:
 On 04/15/2015 12:41 PM, Kristian Høgsberg wrote:
  The ir_tex opcode turns into a sample or sample_c message, which will try 
  to
  compute derivatives to determine the lod. This produces garbage for
  non-fragment shaders where the sample coordinates don't correspond to
  subspans.
 
  We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
  lod to 0.

 This seems logical, but... why the heck hasn't this been a problem
 before?  Is this some weirdness of just BSW?  Should we perhaps only do
 this on BSW?

 In the vec4 backend, we've always made ir_tex use the SAMPLE_LOD message
 with lod set to 0.0 - so Gen4-7.5 have always done this workaround.

 When Kristian added SIMD8 VS support (Oct 2014, Gen8+), we started using
 SAMPLE (by virtue of not changing the FS behavior).  We honestly have no
 idea why it appears to work on Broadwell - it sure seems like it ought
 to be broken there.  Maybe we've just been lucky.

 There's also a message header bit for Force LOD to Zero, which the
 docs say is mandatory if you're going to use the SAMPLE message in
 SIMD4x2 stages (i.e. not fragment).  We've wondered whether headerless
 messages in non-fragment stages magically get that right.  That could
 make the issue less noticable...

More speculation: since SKL adds a new message type (sample_c_lz) for
sample_c with lod=0, it could be that some of that got pulled into
BSW. Maybe pre-SKL, the thread dispatcher set the header bit in the
thread payload for VS so that it carries over in message header for
sample and sample_c and gets the right behavior. If BSW removes that
special case in favor of a dedicated opcode, it would explain why it
breaks on BSW but not BDW. The docs doesn't mention sample_c_lz for
BSW, so who knows...

 Using SAMPLE_LOD is easier than setting the message header bit, and
 seems just as effective.  I think we should do it on Broadwell too.

 Thanks for fixing this, Kristian!

Thanks for the help.

Kristian

 Cc: 10.5 mesa-sta...@lists.freedesktop.org
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org


  https://bugs.freedesktop.org/show_bug.cgi?id=89457

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457

 That allows the bin/bugzilla_mesa.sh script to find it for generating
 release notes.

  Signed-off-by: Kristian Høgsberg k...@bitplanet.net
  ---
   src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
   1 file changed, 9 insertions(+)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
  b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  index 0049b2d..4e99366 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  @@ -1839,6 +1839,15 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
  fs_reg dst,
 offset_value.file != BAD_FILE  offset_value.file != IMM;
  bool coordinate_done = false;
 
  +   /* The sampler can only meaningfully compute LOD for fragment shader
  +* messages. For all other stages, we change the opcode to ir_txl and
  +* hardcode the LOD to 0.
  +*/
  +   if (stage != MESA_SHADER_FRAGMENT  op == ir_tex) {
  +  op = ir_txl;
  +  lod = fs_reg(0.0f);
  +   }
  +
  /* Set up the LOD info */
  switch (op) {
  case ir_tex:
 

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload

2015-04-15 Thread Jason Ekstrand
On Wed, Apr 15, 2015 at 1:31 PM, Matt Turner matts...@gmail.com wrote:
 On Wed, Apr 1, 2015 at 6:19 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 Instead of the complicated and broken-by-design pile of heuristics we had
 before, we now have a straightforward lowering:

  1) All header sources are copied directly using force_writemask_all and,
 since they are guaranteed to be a single register, there are no
 force_sechalf issues.

  2) All non-header sources are copied using the exact same force_sechalf
 and saturate modifiers as the LOAD_PAYLOAD operation itself.

 Let's not do this. Nothing puts a saturate modifier on LOAD_PAYLOAD
 today, and it is kind of confusing about what it means. Can't we have
 fbwrites that write depth as well. I wouldn't think we wanted to
 saturate that.

Sure.  I can drop saturate and just assert that it's not set.  We do
want to keep force_sechalf and force_writemask_all though.

 I don't think it buys us anything. If we just run copy propagation
 after lower_load_payload() we'll get the code we want.

  3) In order to accommodate older gens that need interleaved colors,
 lower_load_payload detects when the destination is a COMPR4 register
 and automatically interleaves the non-header sources.  The
 lower_load_payload pass does the right thing here regardless of whether
 or not the hardware actually supports COMPR4.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 154 
 ++-
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   3 -
  2 files changed, 80 insertions(+), 77 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index f8e26c0..bc45a38 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3201,93 +3201,99 @@ fs_visitor::lower_load_payload()
  {
 bool progress = false;

 -   int vgrf_to_reg[alloc.count];
 -   int reg_count = 0;
 -   for (unsigned i = 0; i  alloc.count; ++i) {
 -  vgrf_to_reg[i] = reg_count;
 -  reg_count += alloc.sizes[i];
 -   }
 -
 -   struct {
 -  bool written:1; /* Whether this register has ever been written */
 -  bool force_writemask_all:1;
 -  bool force_sechalf:1;
 -   } metadata[reg_count];
 -   memset(metadata, 0, sizeof(metadata));
 -
 foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
 -  if (inst-dst.file == GRF) {
 - const int dst_reg = vgrf_to_reg[inst-dst.reg] + 
 inst-dst.reg_offset;
 - bool force_sechalf = inst-force_sechalf 
 -  !inst-force_writemask_all;
 - bool toggle_sechalf = inst-dst.width == 16 
 -   type_sz(inst-dst.type) == 4 
 -   !inst-force_writemask_all;
 - for (int i = 0; i  inst-regs_written; ++i) {
 -metadata[dst_reg + i].written = true;
 -metadata[dst_reg + i].force_sechalf = force_sechalf;
 -metadata[dst_reg + i].force_writemask_all = 
 inst-force_writemask_all;
 -force_sechalf = (toggle_sechalf != force_sechalf);
 - }
 -  }
 -
if (inst-opcode == SHADER_OPCODE_LOAD_PAYLOAD) {

 We should invert this condition and 'continue' so that we can avoid an
 extra level of indentation which only makes the code harder to read.

Sure.  I can do that.

   assert(inst-dst.file == MRF || inst-dst.file == GRF);
 +
   fs_reg dst = inst-dst;

 - for (int i = 0; i  inst-sources; i++) {
 -dst.width = inst-src[i].effective_width;
 -dst.type = inst-src[i].type;
 -
 -if (inst-src[i].file == BAD_FILE) {
 -   /* Do nothing but otherwise increment as normal */
 -} else if (dst.file == MRF 
 -   dst.width == 8 
 -   brw-has_compr4 
 -   i + 4  inst-sources 
 -   inst-src[i + 4].equals(horiz_offset(inst-src[i], 
 8))) {
 -   fs_reg compr4_dst = dst;
 -   compr4_dst.reg += BRW_MRF_COMPR4;
 -   compr4_dst.width = 16;
 -   fs_reg compr4_src = inst-src[i];
 -   compr4_src.width = 16;
 -   fs_inst *mov = MOV(compr4_dst, compr4_src);
 + /* Get rid of COMPR4.  We'll add it back in if we need it */
 + if (dst.file == MRF  dst.reg  BRW_MRF_COMPR4)
 +dst.reg = dst.reg  ~BRW_MRF_COMPR4;

 No point in checking whether the BRW_MRF_COMPR4 bit is set before
 clearing it. Just clear it.

Sure.

 +
 + dst.width = 8;
 + for (uint8_t i = 0; i  inst-header_size; i++) {
 +if (inst-src[i].file != BAD_FILE) {
 +   fs_reg mov_dst = retype(dst, BRW_REGISTER_TYPE_UD);
 +   fs_reg mov_src = retype(inst-src[i], BRW_REGISTER_TYPE_UD);
 +   mov_src.width = 8;
 +   fs_inst *mov = MOV(mov_dst, mov_src);
 mov-force_writemask_all = true;
 

Re: [Mesa-dev] [PATCH] nir: Convert the if-test for num_inputs == 2 to an assertion

2015-04-15 Thread Connor Abbott
Reviewed-by: Connor Abbott cwabo...@gmail.com

On Wed, Apr 15, 2015 at 9:18 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Suggested by Jason on a different patch after some comments /
 questions by Ilia.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: Jason Ekstrand jason.ekstr...@intel.com
 ---
  src/glsl/nir/nir_search.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
 index 73a802b..5ba0160 100644
 --- a/src/glsl/nir/nir_search.c
 +++ b/src/glsl/nir/nir_search.c
 @@ -218,8 +218,8 @@ match_expression(const nir_search_expression *expr, 
 nir_alu_instr *instr,
 if (matched)
return true;

 -   if (nir_op_infos[instr-op].num_inputs == 2 
 -   (nir_op_infos[instr-op].algebraic_properties  
 NIR_OP_IS_COMMUTATIVE)) {
 +   if (nir_op_infos[instr-op].algebraic_properties  NIR_OP_IS_COMMUTATIVE) 
 {
 +  assert(nir_op_infos[instr-op].num_inputs == 2);
if (!match_value(expr-srcs[0], instr, 1, num_components,
 swizzle, state))
   return false;
 --
 2.1.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Convert the if-test for num_inputs == 2 to an assertion

2015-04-15 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Wed, Apr 15, 2015 at 6:18 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Suggested by Jason on a different patch after some comments /
 questions by Ilia.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: Jason Ekstrand jason.ekstr...@intel.com
 ---
  src/glsl/nir/nir_search.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
 index 73a802b..5ba0160 100644
 --- a/src/glsl/nir/nir_search.c
 +++ b/src/glsl/nir/nir_search.c
 @@ -218,8 +218,8 @@ match_expression(const nir_search_expression *expr, 
 nir_alu_instr *instr,
 if (matched)
return true;

 -   if (nir_op_infos[instr-op].num_inputs == 2 
 -   (nir_op_infos[instr-op].algebraic_properties  
 NIR_OP_IS_COMMUTATIVE)) {
 +   if (nir_op_infos[instr-op].algebraic_properties  NIR_OP_IS_COMMUTATIVE) 
 {
 +  assert(nir_op_infos[instr-op].num_inputs == 2);
if (!match_value(expr-srcs[0], instr, 1, num_components,
 swizzle, state))
   return false;
 --
 2.1.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/ttn: add UBO support

2015-04-15 Thread Eric Anholt
Rob Clark robdcl...@gmail.com writes:

 From: Rob Clark robcl...@freedesktop.org

 v2: move ishl into ttn (instead of driver backend) to keep the units
 consistent between immediate and indirect offsets

This looks good to me.

Reviewed-by: Eric Anholt e...@anholt.net


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir: Convert the if-test for num_inputs == 2 to an assertion

2015-04-15 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Suggested by Jason on a different patch after some comments /
questions by Ilia.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Jason Ekstrand jason.ekstr...@intel.com
---
 src/glsl/nir/nir_search.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index 73a802b..5ba0160 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -218,8 +218,8 @@ match_expression(const nir_search_expression *expr, 
nir_alu_instr *instr,
if (matched)
   return true;
 
-   if (nir_op_infos[instr-op].num_inputs == 2 
-   (nir_op_infos[instr-op].algebraic_properties  NIR_OP_IS_COMMUTATIVE)) 
{
+   if (nir_op_infos[instr-op].algebraic_properties  NIR_OP_IS_COMMUTATIVE) {
+  assert(nir_op_infos[instr-op].num_inputs == 2);
   if (!match_value(expr-srcs[0], instr, 1, num_components,
swizzle, state))
  return false;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload

2015-04-15 Thread Matt Turner
On Wed, Apr 15, 2015 at 5:13 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 On Wed, Apr 15, 2015 at 1:31 PM, Matt Turner matts...@gmail.com wrote:
 On Wed, Apr 1, 2015 at 6:19 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 Instead of the complicated and broken-by-design pile of heuristics we had
 before, we now have a straightforward lowering:

  1) All header sources are copied directly using force_writemask_all and,
 since they are guaranteed to be a single register, there are no
 force_sechalf issues.

  2) All non-header sources are copied using the exact same force_sechalf
 and saturate modifiers as the LOAD_PAYLOAD operation itself.

 Let's not do this. Nothing puts a saturate modifier on LOAD_PAYLOAD
 today, and it is kind of confusing about what it means. Can't we have
 fbwrites that write depth as well. I wouldn't think we wanted to
 saturate that.

 Sure.  I can drop saturate and just assert that it's not set.  We do
 want to keep force_sechalf and force_writemask_all though.

I didn't think about those before, but I don't know how a load_payload
could have force_writemask_all set. Have I missed something?

I see that setup_color_payload sets force_sechalf for dual-source
fbwrites -- that's the only case we're going to have force_sechalf
set, right?

That is, the Gen  6 case is going to be handled by passing 16-channel
sources to load_payload and letting it do compr4?

 I don't think it buys us anything. If we just run copy propagation
 after lower_load_payload() we'll get the code we want.
[snip]
 +/* The COMPR4 code took care of the first 4 sources.  We'll let
 + * the regular path handle any remaining sources.  Yes, we are
 + * modifying the instruction but we're about to delete it so
 + * this really doesn't hurt anything.
 + */
 +inst-header_size += 4;

 I mean, while the comment is a true statement, why is doing this any
 better than just...

 + }
 +
 + for (uint8_t i = inst-header_size; i  inst-sources; i++) {

 ... changing this to inst-header_size + 4?

 Because the inst-header_size += 4 is predicated on it being a COMPR4
 destination while the code below handles both the remaining sources
 (in the COMPR4 case) and the regular non-COMPR4 case.

Ahh, right. It'd be fewer lines (no commenting necessary) to just have
a 'start' variable that you set to header_size at the top and +=4
here.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders

2015-04-15 Thread Ian Romanick
On 04/15/2015 04:28 PM, Kenneth Graunke wrote:
 On Wednesday, April 15, 2015 02:47:01 PM Ian Romanick wrote:
 On 04/15/2015 12:41 PM, Kristian Høgsberg wrote:
 The ir_tex opcode turns into a sample or sample_c message, which will try to
 compute derivatives to determine the lod. This produces garbage for
 non-fragment shaders where the sample coordinates don't correspond to
 subspans.

 We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
 lod to 0.

 This seems logical, but... why the heck hasn't this been a problem
 before?  Is this some weirdness of just BSW?  Should we perhaps only do
 this on BSW?
 
 In the vec4 backend, we've always made ir_tex use the SAMPLE_LOD message
 with lod set to 0.0 - so Gen4-7.5 have always done this workaround.
 
 When Kristian added SIMD8 VS support (Oct 2014, Gen8+), we started using
 SAMPLE (by virtue of not changing the FS behavior).  We honestly have no

Yes, that's the part the slipped my mind.  In that case, FWIW,

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

 idea why it appears to work on Broadwell - it sure seems like it ought
 to be broken there.  Maybe we've just been lucky.
 
 There's also a message header bit for Force LOD to Zero, which the
 docs say is mandatory if you're going to use the SAMPLE message in
 SIMD4x2 stages (i.e. not fragment).  We've wondered whether headerless
 messages in non-fragment stages magically get that right.  That could
 make the issue less noticable...
 
 Using SAMPLE_LOD is easier than setting the message header bit, and
 seems just as effective.  I think we should do it on Broadwell too.
 
 Thanks for fixing this, Kristian!
 
 Cc: 10.5 mesa-sta...@lists.freedesktop.org
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 

 https://bugs.freedesktop.org/show_bug.cgi?id=89457

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457

 That allows the bin/bugzilla_mesa.sh script to find it for generating
 release notes.

 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 9 +
  1 file changed, 9 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 0049b2d..4e99366 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1839,6 +1839,15 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
offset_value.file != BAD_FILE  offset_value.file != IMM;
 bool coordinate_done = false;
  
 +   /* The sampler can only meaningfully compute LOD for fragment shader
 +* messages. For all other stages, we change the opcode to ir_txl and
 +* hardcode the LOD to 0.
 +*/
 +   if (stage != MESA_SHADER_FRAGMENT  op == ir_tex) {
 +  op = ir_txl;
 +  lod = fs_reg(0.0f);
 +   }
 +
 /* Set up the LOD info */
 switch (op) {
 case ir_tex:


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev





signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] Hash-table and hash-set, V4

2015-04-15 Thread Eric Anholt
Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Apr 11, 2015 at 4:25 PM, Thomas Helland
 thomashellan...@gmail.com wrote:
 The performance numbers (shader-db runtime) are:

 Difference at 95.0% confidence
   -14.7608 +/- 3.36786
  -9.05064% +/- 2.06501%
 (Original runtime was 160 seconds)

 Good Work!

 I had one comment on the hash set patch.  With that fixed, the series is

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 Probably want to give Eric a a couple of days to look at it too.

Yeah, I'm splitting the series up into each individual change (resizing
cutoff, linear probing, power-of-two, then quadratic probing) so that we
can make sure that they're all good for both the compiler and GL object
lookup.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] util: Change util/set to use quadratic probing

2015-04-15 Thread Thomas Helland
The same rationale applies here as for the hash table.
Power of two size should give better performance,
and using the algorithm hash = sh + i/2 + i*i/2
should result in only distinct hash values when hitting collisions.

Difference at 95.0% confidence
 -7.9505 +/- 2.44011
   -5.04357% +/- 1.54794%

V4: Feedback from Jason Ekstrand
   - Split cleanup of set_rehash into separate commit

V3: Feedback from Eric Anholt
   - Don't change load factor and starting size.

V2: Feedback from Connor Abbott
   - Don't set initial hash address before potential rehash
   - Remove hash_sizes table
   - Correct the quadratic hashing algorithm
   - Use correct comment style

   Feedback from Jason Ekstrand
   - Use unreachable() to detect if we fail to insert

Reviewed-by: Jason Ekstrand jason.ekstrand at intel.com
Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/util/set.c | 110 +
 src/util/set.h |   3 +-
 2 files changed, 41 insertions(+), 72 deletions(-)

diff --git a/src/util/set.c b/src/util/set.c
index f01f869..2bcc743 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -32,6 +32,17 @@
  *Keith Packard kei...@keithp.com
  */
 
+/**
+ * Implements an open-addressing, quadratic probing hash-set.
+ *
+ * We choose set sizes that's a power of two.
+ * This is computationally less expensive than primes.
+ * As a bonus the size and free space can be calculated instead of looked up.
+ * FNV-1a has good avalanche properties, so collision is not an issue.
+ * These sets are sized to have an extra 10% free to avoid
+ * exponential performance degradation as the set fills.
+ */
+
 #include stdlib.h
 #include assert.h
 
@@ -39,51 +50,9 @@
 #include ralloc.h
 #include set.h
 
-/*
- * From Knuth -- a good choice for hash/rehash values is p, p-2 where
- * p and p-2 are both prime.  These tables are sized to have an extra 10%
- * free to avoid exponential performance degradation as the hash table fills
- */
-
 uint32_t deleted_key_value;
 const void *deleted_key = deleted_key_value;
 
-static const struct {
-   uint32_t max_entries, size, rehash;
-} hash_sizes[] = {
-   { 2,5,3},
-   { 4,7,5},
-   { 8,13,   11   },
-   { 16,   19,   17   },
-   { 32,   43,   41   },
-   { 64,   73,   71   },
-   { 128,  151,  149  },
-   { 256,  283,  281  },
-   { 512,  571,  569  },
-   { 1024, 1153, 1151 },
-   { 2048, 2269, 2267 },
-   { 4096, 4519, 4517 },
-   { 8192, 9013, 9011 },
-   { 16384,18043,18041},
-   { 32768,36109,36107},
-   { 65536,72091,72089},
-   { 131072,   144409,   144407   },
-   { 262144,   288361,   288359   },
-   { 524288,   576883,   576881   },
-   { 1048576,  1153459,  1153457  },
-   { 2097152,  2307163,  2307161  },
-   { 4194304,  4613893,  4613891  },
-   { 8388608,  9227641,  9227639  },
-   { 16777216, 18455029, 18455027 },
-   { 33554432, 36911011, 36911009 },
-   { 67108864, 73819861, 73819859 },
-   { 134217728,147639589,147639587},
-   { 268435456,295279081,295279079},
-   { 536870912,590559793,590559791},
-   { 1073741824,   1181116273,   1181116271   },
-   { 2147483648ul, 2362232233ul, 2362232231ul }
-};
-
 static int
 entry_is_free(struct set_entry *entry)
 {
@@ -114,10 +83,9 @@ _mesa_set_create(void *mem_ctx,
if (ht == NULL)
   return NULL;
 
-   ht-size_index = 0;
-   ht-size = hash_sizes[ht-size_index].size;
-   ht-rehash = hash_sizes[ht-size_index].rehash;
-   ht-max_entries = hash_sizes[ht-size_index].max_entries;
+   ht-size_iteration = 2;
+   ht-size = 1  ht-size_iteration;
+   ht-max_entries = ht-size * 0.9;
ht-key_hash_function = key_hash_function;
ht-key_equals_function = key_equals_function;
ht-table = rzalloc_array(ht, struct set_entry, ht-size);
@@ -163,12 +131,11 @@ _mesa_set_destroy(struct set *ht, void 
(*delete_function)(struct set_entry *entr
 static struct set_entry *
 set_search(const struct set *ht, uint32_t hash, const void *key)
 {
-   uint32_t hash_address;
+   uint32_t start_hash_address = hash  (ht-size - 1);
+   uint32_t hash_address = start_hash_address;
+   uint32_t quad_hash = 1;
 
-   hash_address = hash % ht-size;
do {
-  uint32_t double_hash;
-
   struct set_entry *entry = ht-table + hash_address;
 
   if (entry_is_free(entry)) {
@@ -179,10 +146,10 @@ set_search(const struct set *ht, uint32_t hash, const 
void *key)
  }
   }
 
-  double_hash = 1 + hash % ht-rehash;
-
-  

[Mesa-dev] [PATCH 4/3] util: Use set_foreach instead of rolling our own

2015-04-15 Thread Thomas Helland
This follows the same pattern as in the hash_table.

Reviewed-by: Jason Ekstrand jason.ekstrand at intel.com
Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/util/set.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/util/set.c b/src/util/set.c
index 2bcc743..7ff9520 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -196,12 +196,8 @@ set_rehash(struct set *ht, uint32_t new_size_iteration)
ht-entries = 0;
ht-deleted_entries = 0;
 
-   for (entry = old_ht.table;
-entry != old_ht.table + old_ht.size;
-entry++) {
-  if (entry_is_present(entry)) {
- set_add(ht, entry-hash, entry-key);
-  }
+   set_foreach(old_ht, entry) {
+  set_add(ht, entry-hash, entry-key);
}
 
ralloc_free(old_ht.table);
-- 
2.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] llvmpipe support for ARM?

2015-04-15 Thread Roland Scheidegger
That's good to know. I've always thought it's a bit of a pity that it
isn't optimized at all for arm (doesn't use any arm intrinsics, though
with newer llvm versions it is probably not all that important any more,
and I'm not even sure which neon intrinsics would be useful) but someone
would need to do the work. But that it's running is a good start :-).

Roland

Am 15.04.2015 um 17:45 schrieb Jan Vesely:
 Hi,
 I can run arm llvmpipe glxgears on IFC6410 (Snapdragon 600 APQ8064,
 Krait 300) without problems.
 llvm 3.5
 mesa 10.4.3
 both are fedora 21 standard packages without modifications.
 I haven't tried anything more complex than glxgears
 
 jan
 
 On Wed, 2015-04-15 at 16:30 +0200, Roland Scheidegger wrote:
 There were reports with varying success of running llvmpipe with arm in
 the past but I haven't heard anything for a while. In theory it should
 work (and crashing in the pixel shader is an indication the vertex
 shader worked or at least didn't crash).
 But it's not really tested and hence might easily break. My suggestion
 would be to try a newer Mesa version since 9.2.5 is very old, but I
 really don't know if it will help.
 If that doesn't help, you could try debugging it, you can get the
 crashing instruction in gdb with x/i address and see if that was an
 unaligned access which should have been aligned or something else.
 GALLIVM_DEBUG=ir,asm will get you the IR and compiled code (with debug
 builds).

 Roland


 Am 15.04.2015 um 14:37 schrieb Marko Moberg:
 Hi,

 I have been trying to get OpenGL sw rastering to work on Linux based ARM
 CortexA15 device but for some reason gallium llvmpipe gives me a
 segmentation fault. Has anybody managed to run sw rastering using
 gallium llvmpipe on ARM? Softpipe seems to be working ok.

 I am using the following versions:
 - Mesa 9.2.5
 - LLVM3.3
 - Wayland/Weston 1.5

 The issue is related to opengl/gles2 rendering i.e. weston-simple-egl
 demo and Qt openGL demos give seg faults when executed with llvmpipe
 enabled. 

 Backtrace from weston-simple-egl looks something like this:
 #0  0x6d7de620 in ?? ()
 #1  0xb354004c in fs17_variant0_partial ()
 #2  0xb6841740 in lp_rast_shade_quads_mask (task=task@entry=0x1e360,
 inputs=inputs@entry=0x47a00, x=128, y=80, mask=4369) at lp_rast.c:466
 #3  0xb6842d04 in do_block_4_1 (c=optimized out, y=optimized out,
 x=optimized out, plane=optimized out, tri=optimized out,
 task=optimized out) at lp_rast_tri_tmp.h:61
 #4  do_block_16_1 (c=synthetic pointer, y=optimized out,
 x=optimized out, plane=0xb4f4ccf0, tri=optimized out,
 task=optimized out) at lp_rast_tri_tmp.h:130
 #5  lp_rast_triangle_1 (task=0x1e360, arg=...) at lp_rast_tri_tmp.h:232
 #6  0xb6840420 in do_rasterize_bin (bin=optimized out, task=0x1e360,
 x=optimized out, y=optimized out) at lp_rast.c:607
 #7  rasterize_bin (y=optimized out, x=optimized out, bin=optimized
 out,
 task=0x1e360) at lp_rast.c:626
 #8  rasterize_scene (task=task@entry=0x1e360, scene=0xb3646008)
 at lp_rast.c:675
 #9  0xb6840cb0 in thread_function (init_data=0x1e360) at lp_rast.c:788
 #10 0xb6d98ed2 in start_thread () from /lib/libpthread.so.0
 #11 0xb6c91058 in ?? () from /lib/libc.so.6
 #12 0xb6c91058 in ?? () from /lib/libc.so.6

 regards,
 Marko

 marko.s.mob...@gmail.com mailto:marko.s.mob...@gmail.com



 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddevd=AwIGaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_Im=5qZ4KpVhu_wXA8-dTn8xYiul5G713jYwztPr8kfcUZMs=wBNyDvf5DiNBjxXXl7QvyHOvNuTVjZAaw0n7WaQM-bAe=
  


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddevd=AwIFaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_Im=nAdzrx0oFETmuwc9kjHUEBdG2y7GhDKXpaAS5jLedxAs=ldDTgR0M9Ltu-Kyh-nH37uDBxApVBRuxXB086icVVMce=
  
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/radeon: don't crash when getting out-of-bounds TEMP references

2015-04-15 Thread Michel Dänzer
On 12.04.2015 04:11, Marek Olšák wrote:
 From: Marek Olšák marek.ol...@amd.com
 
 ---
  src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 6 ++
  1 file changed, 6 insertions(+)
 
 diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
 b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
 index 1690194..333f7ae 100644
 --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
 +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
 @@ -191,6 +191,8 @@ emit_fetch(
   break;
  
   case TGSI_FILE_TEMPORARY:
 + if (reg-Register.Index = ctx-temps_count)
 + return LLVMGetUndef(tgsi2llvmtype(bld_base, type));
   if (uses_temp_indirect_addressing(bld_base)) {
   ptr = lp_get_temp_ptr_soa(bld, reg-Register.Index, 
 swizzle);
   break;
 @@ -395,6 +397,8 @@ emit_store(
   break;
  
   case TGSI_FILE_TEMPORARY:
 + if (range.First + i = ctx-temps_count)
 + continue;
   if 
 (uses_temp_indirect_addressing(bld_base))
   temp_ptr = 
 lp_get_temp_ptr_soa(bld, i + range.First, chan_index);
   else
 @@ -416,6 +420,8 @@ emit_store(
   break;
  
   case TGSI_FILE_TEMPORARY:
 + if (reg-Register.Index = ctx-temps_count)
 + continue;
   if (uses_temp_indirect_addressing(bld_base)) {
   temp_ptr = NULL;
   break;
 

Looks good to me, but Tom should probably also take a look.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (10.5): radeonsi: Cache LLVMTargetMachineRef in context instead of in screen

2015-04-15 Thread Michel Dänzer
On 09.04.2015 20:27, eveli...@kemper.freedesktop.org (Emil Velikov) wrote:
 @@ -2740,7 +2741,45 @@ static int si_generate_gs_copy_shader(struct si_screen 
 *sscreen,
   return r;
  }
  
 -int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
 +static void si_dump_key(unsigned shader, union si_shader_key *key)
 +{
 + int i;
 +
 + fprintf(stderr, SHADER KEY\n);
 +
 + switch (shader) {
 + case PIPE_SHADER_VERTEX:
 + fprintf(stderr,   instance_divisors = {);
 + for (i = 0; i  Elements(key-vs.instance_divisors); i++)
 + fprintf(stderr, !i ? %u : , %u,
 + key-vs.instance_divisors[i]);
 + fprintf(stderr, }\n);
 +
 + if (key-vs.as_es)
 + fprintf(stderr,   gs_used_inputs = 0x%PRIx64\n,
 + key-vs.gs_used_inputs);
 + fprintf(stderr,   as_es = %u\n, key-vs.as_es);
 + break;
 +
 + case PIPE_SHADER_GEOMETRY:
 + break;
 +
 + case PIPE_SHADER_FRAGMENT:
 + fprintf(stderr,   export_16bpc = 0x%X\n, 
 key-ps.export_16bpc);
 + fprintf(stderr,   last_cbuf = %u\n, key-ps.last_cbuf);
 + fprintf(stderr,   color_two_side = %u\n, 
 key-ps.color_two_side);
 + fprintf(stderr,   alpha_func = %u\n, key-ps.alpha_func);
 + fprintf(stderr,   alpha_to_one = %u\n, key-ps.alpha_to_one);
 + fprintf(stderr,   poly_stipple = %u\n, key-ps.poly_stipple);
 + break;
 +
 + default:
 + assert(0);
 + }
 +}

The si_dump_key function shouldn't have been added.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] i965: Delete some unnecessary code in brw_report_shader_time().

2015-04-15 Thread Kenneth Graunke
It is true that a gl_shader_program with ID 0 will be a fixed-function
fragment program; a gl_program with ID 0 but NULL gl_shader_program
means that it's a fixed-function vertex shader.

But that's not terribly interesting or relevant to what we're doing.
We just need to know that ID 0 means fixed function.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_program.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 0ca63de..7ea08e6 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -447,14 +447,9 @@ brw_report_shader_time(struct brw_context *brw)
   if (prog) {
  shader_num = prog-Name;
 
- /* The fixed function fragment shader generates GLSL IR with a Name
-  * of 0, and nothing else does.
-  */
  if (prog-Label) {
 shader_name = prog-Label;
- } else if (shader_num == 0 
- (brw-shader_time.types[i] == ST_FS8 ||
-  brw-shader_time.types[i] == ST_FS16)) {
+ } else if (shader_num == 0) {
 shader_name = ff;
  } else {
 shader_name = glsl;
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] i965: Make shader_time use 0 instead of -1 for no meaningful ID.

2015-04-15 Thread Kenneth Graunke
0 is not a valid GLSL shader or ARB program ID.  For some reason,
shader_time used -1 instead...so we had code to detect 0, then override
it to -1.

We can just delete that.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_program.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 9e27c2a..0ca63de 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -346,7 +346,7 @@ print_shader_time_line(const char *stage, const char *name,
 {
fprintf(stderr, %-6s%-18s, stage, name);
 
-   if (shader_num != -1)
+   if (shader_num != 0)
   fprintf(stderr, %4d: , shader_num);
else
   fprintf(stderr, : );
@@ -443,7 +443,7 @@ brw_report_shader_time(struct brw_context *brw)
   if (scaled[i] == 0)
  continue;
 
-  int shader_num = -1;
+  int shader_num = 0;
   if (prog) {
  shader_num = prog-Name;
 
@@ -456,7 +456,6 @@ brw_report_shader_time(struct brw_context *brw)
  (brw-shader_time.types[i] == ST_FS8 ||
   brw-shader_time.types[i] == ST_FS16)) {
 shader_name = ff;
-shader_num = -1;
  } else {
 shader_name = glsl;
  }
@@ -464,7 +463,6 @@ brw_report_shader_time(struct brw_context *brw)
  shader_num = brw-shader_time.programs[i]-Id;
  if (shader_num == 0) {
 shader_name = ff;
-shader_num = -1;
  } else {
 shader_name = prog;
  }
@@ -495,10 +493,10 @@ brw_report_shader_time(struct brw_context *brw)
}
 
fprintf(stderr, \n);
-   print_shader_time_line(total, vs, -1, total_by_type[ST_VS], total);
-   print_shader_time_line(total, gs, -1, total_by_type[ST_GS], total);
-   print_shader_time_line(total, fs8, -1, total_by_type[ST_FS8], total);
-   print_shader_time_line(total, fs16, -1, total_by_type[ST_FS16], total);
+   print_shader_time_line(total, vs, 0, total_by_type[ST_VS], total);
+   print_shader_time_line(total, gs, 0, total_by_type[ST_GS], total);
+   print_shader_time_line(total, fs8, 0, total_by_type[ST_FS8], total);
+   print_shader_time_line(total, fs16, 0, total_by_type[ST_FS16], total);
 }
 
 static void
-- 
2.3.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: remove bogus r600-- triple

2015-04-15 Thread Michel Dänzer
On 09.04.2015 03:27, Emil Velikov wrote:
 As mentioned by Michel Dänzer for LLVM = 3.6 we create the
 LLVMTargetMachine (with triple amdgcn--), as we setup the radeonsi
 context. For older LLVM or hardware (r600) the triple is always r600--
 and is created at a later stage - radeon_llvm_compile()
 
 Cc: Michel Dänzer michel.daen...@amd.com
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/gallium/drivers/radeonsi/si_pipe.c | 2 --
  1 file changed, 2 deletions(-)
 
 diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
 b/src/gallium/drivers/radeonsi/si_pipe.c
 index e761d20..5ea8868 100644
 --- a/src/gallium/drivers/radeonsi/si_pipe.c
 +++ b/src/gallium/drivers/radeonsi/si_pipe.c
 @@ -85,8 +85,6 @@ static struct pipe_context *si_create_context(struct 
 pipe_screen *screen, void *
   LLVMTargetRef r600_target;
  #if HAVE_LLVM = 0x0306
   const char *triple = amdgcn--;
 -#else
 - const char *triple = r600--;
  #endif
   int shader, i;

r600_target should probably also be moved inside the

#if HAVE_LLVM = 0x0306

as it's only used in that case. But that can be done in a separate
patch. Either way, this patch is

Reviewed-by: Michel Dänzer michel.daen...@amd.com


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev