[Mesa-dev] [PATCH] llvmpipe: fix pipeline statistics with a null ps

2013-08-13 Thread Zack Rusin
If the fragment shader is null then pixel shader invocations have
to be equal to zero. And if we're running a null ps then clipper
invocations and primitives should be equal to zero but only
if both stencil and depth testing are disabled.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/drivers/llvmpipe/lp_query.c |   30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_query.c 
b/src/gallium/drivers/llvmpipe/lp_query.c
index cea2d07..fb24c36 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -32,6 +32,7 @@
 
 #include "draw/draw_context.h"
 #include "pipe/p_defines.h"
+#include "tgsi/tgsi_scan.h"
 #include "util/u_memory.h"
 #include "os/os_time.h"
 #include "lp_context.h"
@@ -95,6 +96,7 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
   union pipe_query_result *vresult)
 {
    struct llvmpipe_screen *screen = llvmpipe_screen(pipe->screen);
+   struct llvmpipe_context *llvmpipe = llvmpipe_context(pipe);
    unsigned num_threads = MAX2(1, screen->num_threads);
    struct llvmpipe_query *pq = llvmpipe_query(q);
    uint64_t *result = (uint64_t *)vresult;
@@ -166,11 +168,31 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
    case PIPE_QUERY_PIPELINE_STATISTICS: {
       struct pipe_query_data_pipeline_statistics *stats =
          (struct pipe_query_data_pipeline_statistics *)vresult;
-      /* only ps_invocations come from binned query */
-      for (i = 0; i < num_threads; i++) {
-         pq->stats.ps_invocations += pq->end[i];
+      /* If we're running on what's considered a null fragment
+       * shader, i.e. fragment shader consisting of a single
+       * END opcode or if the fragment shader is null then
+       * the number of ps_invocations should be zero */
+      if (llvmpipe->fs && llvmpipe->fs->info.base.num_tokens > 1) {
+         /* only ps_invocations come from binned query */
+         for (i = 0; i < num_threads; i++) {
+            pq->stats.ps_invocations += pq->end[i];
+         }
+         pq->stats.ps_invocations *=
+            LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
+      } else {
+         /*
+          * Clipper primitives and invocations are equal to zero
+          * if we're running a null fragment shader but only
+          * if both stencil and depth testing are disabled.
+          */
+         if (!llvmpipe->depth_stencil->depth.enabled &&
+             !llvmpipe->depth_stencil->stencil[0].enabled &&
+             !llvmpipe->depth_stencil->stencil[1].enabled) {
+            pq->stats.c_primitives = 0;
+            pq->stats.c_invocations = 0;
+         }
+         pq->stats.ps_invocations = 0;
       }
-      pq->stats.ps_invocations *= LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
       *stats = pq->stats;
    }
       break;
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
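
The rule described in the commit message of the llvmpipe patch above can be
sketched in isolation. This is only an illustration, not the driver code: the
struct and function names (fs_state, ds_state, pipeline_stats, fs_is_null,
finalize_stats) are hypothetical stand-ins, and RASTER_BLOCK_SIZE = 4 is an
assumed value used here in place of LP_RASTER_BLOCK_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the llvmpipe state. */
struct fs_state { unsigned num_tokens; };
struct ds_state {
   bool depth_enabled;
   bool stencil_front_enabled;
   bool stencil_back_enabled;
};
struct pipeline_stats {
   uint64_t c_invocations;
   uint64_t c_primitives;
   uint64_t ps_invocations;
};

#define RASTER_BLOCK_SIZE 4 /* assumed value, for illustration only */

/* A shader is treated as null if it is absent or holds a single token. */
static bool
fs_is_null(const struct fs_state *fs)
{
   return fs == NULL || fs->num_tokens <= 1;
}

/* Apply the null-shader rule to the statistics before reporting them. */
static void
finalize_stats(struct pipeline_stats *stats,
               const struct fs_state *fs,
               const struct ds_state *ds,
               const uint64_t *per_thread_blocks,
               unsigned num_threads)
{
   unsigned i;

   if (!fs_is_null(fs)) {
      /* Sum the per-thread block counts, then scale to pixels. */
      for (i = 0; i < num_threads; i++)
         stats->ps_invocations += per_thread_blocks[i];
      stats->ps_invocations *= RASTER_BLOCK_SIZE * RASTER_BLOCK_SIZE;
   } else {
      /* Clipper counters are zeroed only if depth and stencil are off. */
      if (!ds->depth_enabled &&
          !ds->stencil_front_enabled &&
          !ds->stencil_back_enabled) {
         stats->c_primitives = 0;
         stats->c_invocations = 0;
      }
      stats->ps_invocations = 0;
   }
}
```

With a null shader and depth testing enabled, this sketch keeps the clipper
counters and only zeroes ps_invocations, which is the case the commit message
singles out.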


Re: [Mesa-dev] Patches: R600: Merge R600 and SI vector op expansions

2013-08-13 Thread Michel Dänzer
On Mon, 2013-08-12 at 15:25 -0700, Tom Stellard wrote:
 
 The attached patches expand a few more vector operations and also move
 the expansion code into AMDGPUISelLowering.cpp so it can be shared
 between R600 and SI.

This series is

Reviewed-by: Michel Dänzer michel.daen...@amd.com


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer



Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)

2013-08-13 Thread Tom Gall
On Mon, Aug 12, 2013 at 10:09 PM, Chad Versace
chad.vers...@linux.intel.com wrote:
 On 08/06/2013 09:44 PM, Siarhei Siamashka wrote:

 On Tue, 6 Aug 2013 15:54:57 -0700
 Matt Turner matts...@gmail.com wrote:

 On Tue, Aug 6, 2013 at 2:13 PM, Siarhei Siamashka
 siarhei.siamas...@gmail.com wrote:


 But if upstream Mesa treats this configuration as unsupported, then I
 also don't see it progressing anywhere in Gentoo. So could you please
 re-consider this decision?


 As far as I'm aware, ES without Desktop GL is disallowed only because
 it was discovered to be broken
 which is because no one working on Mesa appears to test it.


 I have not done any really serious testing. I'm just playing around [...]



 If you can test it (and provide patches when you notice that it's
 broken) I don't have a problem with allowing ES-only builds.


 I agree. If you can fix Mesa to support ES-only builds and do *serious*
 testing with Piglit and some real ES applications to prove that it works,
 then I'm not opposed to supporting that configuration.

If you're willing to give it a go, I'm willing to spend some time on it.

-- 
Regards,
Tom

Where's the kaboom!? There was supposed to be an earth-shattering
kaboom! Marvin Martian
Tech Lead, Graphics Working Group | Linaro.org │ Open source software
for ARM SoCs
w) tom.gall att linaro.org
h) tom_gall att mac.com


Re: [Mesa-dev] Patches: R600/SI: Rework resource descriptor types and fix piglit compute hangs

2013-08-13 Thread Michel Dänzer
On Mon, 2013-08-12 at 12:53 -0700, Tom Stellard wrote:
 Hi,
 
 The attached patches make the v1i32 type which was used for sample
 coordinates and the v16i8 type which was used for resource descriptors
 illegal.  There is a new pass which will convert v1i32 to i32 and v16i8
 to i128 for all non-compute shaders.
 
 Since v16i8 is a legal type in OpenCL, using this type for resource
 descriptors and making it legal was over-complicating the type legalizer
 and causing some piglit tests to hang.
 
 The v1i32 type is identical to i32 on R600 and there is really no benefit
 in having it be a legal type.

I currently get 231 piglit failures from quick-driver.tests on SI, so it
would be quite an achievement for patch 3 to fix 364 piglit tests. ;)
Please fix / clarify that paragraph of the commit log.

In patch 5, it would be better to reference the URL of the bug report
itself instead of a bug attachment. And you can verify using llc and the
IR in the attachment that the bug is fixed.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer



Re: [Mesa-dev] [PATCH] draw: make sure that the stages setup outputs

2013-08-13 Thread Roland Scheidegger
Am 09.08.2013 16:13, schrieb Zack Rusin:
 Calling the prepare outputs cleans up the slot assignments
 for outputs, unfortunately aapoint and aaline didn't have
 code to reset their slots after the initial setup, this
 was messing up our slot assignments. The unfilled stage
 was just missing the initial assignment of the face slot.
 This fixes all of the reported piglit failures.
 
 Signed-off-by: Zack Rusin za...@vmware.com
 ---
  src/gallium/auxiliary/draw/draw_context.c   |2 +
  src/gallium/auxiliary/draw/draw_pipe.h  |5 +-
  src/gallium/auxiliary/draw/draw_pipe_aaline.c   |   27 ---
  src/gallium/auxiliary/draw/draw_pipe_aapoint.c  |   56 
 ++-
  src/gallium/auxiliary/draw/draw_pipe_unfilled.c |2 +
  5 files changed, 62 insertions(+), 30 deletions(-)
 
 diff --git a/src/gallium/auxiliary/draw/draw_context.c 
 b/src/gallium/auxiliary/draw/draw_context.c
 index 2d4843e..d1fac0c 100644
 --- a/src/gallium/auxiliary/draw/draw_context.c
 +++ b/src/gallium/auxiliary/draw/draw_context.c
 @@ -564,6 +564,8 @@ draw_prepare_shader_outputs(struct draw_context *draw)
 draw_remove_extra_vertex_attribs(draw);
 draw_prim_assembler_prepare_outputs(draw->ia);
 draw_unfilled_prepare_outputs(draw, draw->pipeline.unfilled);
 +   draw_aapoint_prepare_outputs(draw, draw->pipeline.aapoint);
 +   draw_aaline_prepare_outputs(draw, draw->pipeline.aaline);
  }
  
  /**
 diff --git a/src/gallium/auxiliary/draw/draw_pipe.h 
 b/src/gallium/auxiliary/draw/draw_pipe.h
 index 7c9ed6c..ad3165f 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe.h
 +++ b/src/gallium/auxiliary/draw/draw_pipe.h
 @@ -101,7 +101,10 @@ void draw_pipe_passthrough_tri(struct draw_stage *stage, 
 struct prim_header *hea
  void draw_pipe_passthrough_line(struct draw_stage *stage, struct prim_header 
 *header);
  void draw_pipe_passthrough_point(struct draw_stage *stage, struct 
 prim_header *header);
  
 -
 +void draw_aapoint_prepare_outputs(struct draw_context *context,
 +  struct draw_stage *stage);
 +void draw_aaline_prepare_outputs(struct draw_context *context,
 + struct draw_stage *stage);
  void draw_unfilled_prepare_outputs(struct draw_context *context,
 struct draw_stage *stage);
  
 diff --git a/src/gallium/auxiliary/draw/draw_pipe_aaline.c 
 b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
 index aa88459..c44c236 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe_aaline.c
 +++ b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
 @@ -692,13 +692,7 @@ aaline_first_line(struct draw_stage *stage, struct 
 prim_header *header)
return;
 }
  
 -   /* update vertex attrib info */
 -   aaline->pos_slot = draw_current_shader_position_output(draw);;
 -
 -   /* allocate the extra post-transformed vertex attribute */
 -   aaline->tex_slot = draw_alloc_extra_vertex_attrib(draw,
 - TGSI_SEMANTIC_GENERIC,
 - aaline->fs->generic_attrib);
 +   draw_aaline_prepare_outputs(draw, draw->pipeline.aaline);
  
 /* how many samplers? */
 /* we'll use sampler/texture[pstip-sampler_unit] for the stipple */
 @@ -953,6 +947,25 @@ aaline_set_sampler_views(struct pipe_context *pipe,
  }
  
  
 +void
 +draw_aaline_prepare_outputs(struct draw_context *draw,
 +struct draw_stage *stage)
 +{
 +   struct aaline_stage *aaline = aaline_stage(stage);
 +   const struct pipe_rasterizer_state *rast = draw->rasterizer;
 +
 +   /* update vertex attrib info */
 +   aaline->pos_slot = draw_current_shader_position_output(draw);;
 +
 +   if (!rast->line_smooth)
 +  return;
 +
 +   /* allocate the extra post-transformed vertex attribute */
 +   aaline->tex_slot = draw_alloc_extra_vertex_attrib(draw,
 + TGSI_SEMANTIC_GENERIC,
 + aaline->fs->generic_attrib);
 +}
 +
  /**
   * Called by drivers that want to install this AA line prim stage
   * into the draw module's pipeline.  This will not be used if the
 diff --git a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c 
 b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
 index 0d7b88e..7ae1ddd 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
 +++ b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
 @@ -696,28 +696,7 @@ aapoint_first_point(struct draw_stage *stage, struct 
 prim_header *header)
  */
 bind_aapoint_fragment_shader(aapoint);
  
 -   /* update vertex attrib info */
 -   aapoint->pos_slot = draw_current_shader_position_output(draw);
 -
 -   /* allocate the extra post-transformed vertex attribute */
 -   aapoint->tex_slot = draw_alloc_extra_vertex_attrib(draw,
 -  TGSI_SEMANTIC_GENERIC,
 -  aapoint->fs->generic_attrib);
 -   

Re: [Mesa-dev] [PATCH] tgsi_build: fix order of arguments for ind register build

2013-08-13 Thread Brian Paul

On 08/12/2013 06:14 PM, Dave Airlie wrote:

From: Dave Airlie airl...@redhat.com

This was broken when arrayid was added.

Signed-off-by: Dave Airlie airl...@redhat.com
---
  src/gallium/auxiliary/tgsi/tgsi_build.c |  2 +-
  src/gallium/renderer/virgl_hw.h | 39 +
  2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 626faad..9c00cb6 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -875,8 +875,8 @@ static struct tgsi_ind_register
  tgsi_build_ind_register(
 unsigned file,
 unsigned swizzle,
-   unsigned arrayid,
 int index,
+   unsigned arrayid,
 struct tgsi_instruction *instruction,
 struct tgsi_header *header )
  {


For that part, Reviewed-by: Brian Paul bri...@vmware.com

The rest of this patch below is unrelated afaict.

-Brian


diff --git a/src/gallium/renderer/virgl_hw.h b/src/gallium/renderer/virgl_hw.h
index 2a8be61..71989cc 100644
--- a/src/gallium/renderer/virgl_hw.h
+++ b/src/gallium/renderer/virgl_hw.h
@@ -276,4 +276,43 @@ enum virgl_formats {
 VIRGL_FORMAT_MAX,
  };

+struct virgl_caps_bool_set1 {
+unsigned indep_blend_enable:1;
+unsigned indep_blend_func:1;
+unsigned cube_map_array:1;
+unsigned shader_stencil_export:1;
+unsigned conditional_render:1;
+unsigned start_instance:1;
+unsigned primitive_restart:1;
+unsigned blend_eq_sep:1;
+unsigned instanceid:1;
+unsigned vertex_element_instance_divisor:1;
+unsigned seamless_cube_map:1;
+unsigned occlusion_query:1;
+unsigned timer_query:1;
+unsigned streamout_pause_resume:1;
+};
+
+/* endless expansion capabilities - current gallium has 252 formats */
+struct virgl_supported_format_mask {
+uint32_t bitmask[16];
+};
+/* capabilities set 2 - version 1 - 32-bit and float values */
+struct virgl_caps_v1 {
+struct virgl_caps_bool_set1 bset;
+uint32_t glsl_level;
+uint32_t max_texture_array_layers;
+uint32_t max_streamout_buffers;
+uint32_t max_dual_source_render_targets;
+uint32_t max_render_targets;
+struct virgl_supported_format_mask sampler;
+struct virgl_supported_format_mask fb;
+struct virgl_supported_format_mask depthstencil;
+struct virgl_supported_format_mask vertexbuffer;
+};
+
+union virgl_caps {
+uint32_t max_version;
+struct virgl_caps_v1 v1;
+};
  #endif
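
The tgsi_build.c hunk above swaps two adjacent parameters, arrayid and index.
A mismatch like that compiles without complaint because C converts silently
between int and unsigned. The following is a minimal sketch of the effect; the
struct and both builder functions are hypothetical and only mirror the two
parameter orders shown in the diff.

```c
/* Hypothetical, reduced version of an indirect-register record. */
struct ind_register {
   unsigned file;
   unsigned swizzle;
   int index;
   unsigned arrayid;
};

/* Parameter order before the patch: arrayid comes before index. */
static struct ind_register
build_old_order(unsigned file, unsigned swizzle, unsigned arrayid, int index)
{
   struct ind_register r = { file, swizzle, index, arrayid };
   return r;
}

/* Parameter order after the patch: index comes before arrayid. */
static struct ind_register
build_new_order(unsigned file, unsigned swizzle, int index, unsigned arrayid)
{
   struct ind_register r = { file, swizzle, index, arrayid };
   return r;
}
```

A caller that passes (file, swizzle, index, arrayid) gets the two values
exchanged from build_old_order and gets them in the right fields from
build_new_order, with no compiler diagnostic in either case.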





Re: [Mesa-dev] segfault in pstip_bind_sampler_states

2013-08-13 Thread Brian Paul

On 08/12/2013 11:30 AM, Kevin H. Hobbs wrote:

On 08/12/2013 10:29 AM, Brian Paul wrote:

Can you run with valgrind?  That should give us some useful info if
there's a use-after-free.


Sure,

$ valgrind /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkpython
--enable-bt
/home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py/rtImageTest.py
/home/kevin/kitware/VTK/Filters/Hybrid/Testing/Python/largeImageOffset.py
-D /home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Testing -T
/home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary -V
/home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Filters/Hybrid/Testing/Data/Baseline/largeImageOffset.png
-A /home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py >
/tmp/osmesa_valgrind.txt 2>&1



[...]


==30166==
--30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - 
exiting
--30166-- si_code=80;  Faulting address: 0x0;  sp: 0x4030fdd50

valgrind: the 'impossible' happened:
   Killed by fatal signal


Well, that's not too helpful.  Can you send me an executable?  Or, is it 
simple to build the test case?


-Brian



Re: [Mesa-dev] Patches: R600: Improve load / store support for 8-bit and 16-bit types

2013-08-13 Thread Aaron Watry
I've finished running comparison tests on Cayman/Pitcairn.  The
descriptors series and 8/16-bit load/store support series both look
like they're in good condition for the cards I was able to test: Cedar
(5400), A6-3500 Llano and Pitcairn (7850).  No regressions spotted,
just improvements.

And as you said, the descriptors series fixed compute hangs for the
7850 on quite a few kernels which did comparison operations (max/clamp
kernels mostly, maybe some min).

You can definitely get a tested-by for both the descriptors series and this:
Tested-by: Aaron Watry awa...@gmail.com

Quite a few of the tablegen changes are still a bit above my head, so
I don't feel qualified to give a comprehensive review on that.

--Aaron

On Mon, Aug 12, 2013 at 6:00 PM, Aaron Watry awa...@gmail.com wrote:
 It'll take me a while to attempt to parse everything that's going on
 in these patches (and your resource descriptor types series that this
 depends on), but I have sent it all through a piglit run on Evergreen
 (Cedar).  Everything was latest Mesa/LLVM/libclc upstream code as of
 today.

 Baseline: 567/855 tests passed
 Descriptors Series: 575/855 tests passed (main differences here
 were with some int3 load/store issues which were just exposed recently
 and fixed by this series)
 Descriptors + char/short load/store series: 880/1119 tests passed
 (most of the additional tests and passes were char/short tests that no
 longer crash out).

 Specifically, I've double-checked the char/short/uchar/ushort built-in
 functions, as well as the char/short arithmetic tests, and things are
 looking good so far.  I'll try to test on Cayman/SI later.

 --Aaron

 On Mon, Aug 12, 2013 at 2:56 PM, Tom Stellard t...@stellard.net wrote:
 Hi,

 The attached patches improve support for i8 and i16 loads and stores for
 Evergreen and newer GPUs.  This means that byte-addressable stores are
 now supported.

 Please review/test.

 -Tom




Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)

2013-08-13 Thread Siarhei Siamashka
On Thu, 08 Aug 2013 16:19:28 -0700
Ian Romanick i...@freedesktop.org wrote:

 On 08/06/2013 02:13 PM, Siarhei Siamashka wrote:
  Hello,
 
  Some months ago, the commit "configure.ac: Allow OpenGL ES1 and ES2 only
  with enabled OpenGL" dropped support for the OpenGL-free configuration.
 
   
  http://lists.freedesktop.org/archives/mesa-dev/2013-February/033909.html
   
  http://lists.freedesktop.org/archives/mesa-commit/2013-February/041708.html
 
  Could this be possibly reverted to allow me to continue shooting
  myself in the foot? The support for OpenGL ES is pretty horrible
  in the open source software. One nice exception is Qt5 which is doing
  pretty well. But the rest of the software does not generally work out
  of the box without patches or tweaks. You can also hardly find a
  problem-free OpenGL ES compatible open source game (other than Quake3).
 
  I have an open feature request for Gentoo, which is a very configurable
  Linux distribution and should not have any troubles working either with
  or without OpenGL (the choice is up to the user):
 
   https://bugs.gentoo.org/show_bug.cgi?id=476524
 
  But if upstream Mesa treats this configuration as unsupported, then I
  also don't see it progressing anywhere in Gentoo. So could you please
  re-consider this decision?
 
 We've removed all of the #ifdef code inside Mesa that would have made 
 any difference.  It was a nightmare to maintain, and we almost always 
 got it wrong... because nobody was testing that configuration.

I believe this can be changed :-) That's a bit of a chicken/egg problem.
The OpenGL ES support in free software applications and libraries is
so broken, that it's currently a big pain to try this configuration for
anything practical. And the applications/libraries can't be fixed
without having a non-OpenGL environment for development and testing.

The needed tweaks for Mesa are really trivial. Maybe one could also
just compile everything, but delete GL headers, gl.pc and libGL.so
after compilation and before installing Mesa to the system. Still it
is a bit ugly to have the configure script claim that OpenGL ES is not
supported without OpenGL, while in fact it works.

 The only thing this is possibly going to gain you is a trivial amount
 of build time (by not building libGL, etc.).

The compilation time is irrelevant. But it is very useful to be able to
install Mesa without OpenGL headers and without libGL.so, so that the
problematic software just fails at compile time instead of exhibiting
hard to debug problems at runtime.

It seems to be a rather common failure scenario when some big bloatware
application loads both libGL.so (provided by Mesa) and libGLESv2.so
(provided by some proprietary OpenGL ES driver on ARM hardware) into
the same process via indirect library dependencies. These shared
libraries are providing overlapping function names, but are backed by
totally different implementations. And everything blows up as a result
when the application is run, or maybe it even mostly works if you are
lucky.

What's the point of installing both Mesa and the proprietary OpenGL ES
drivers on the same system? I would surely love to have open source
hardware accelerated OpenGL ES drivers on ARM systems today. But they
are not quite here yet. And even assuming that we get perfectly
functional free software OpenGL ES drivers for embedded hardware,
the current buggy applications are not going to magically fix
themselves. Somebody still needs to debug and fix the OpenGL ES
compatibility problems.

The easiest way forward seems to be simply allowing Mesa to be compiled
without desktop OpenGL. This would provide:
1. On x86 desktop systems - a development environment for testing
OpenGL ES applications.
2. On ARM hardware via softpipe/llvmpipe - a reference fallback
implementation.
3. On ARM hardware with both the existing proprietary drivers and Mesa
installed (with the ability to switch between them at any time) -
applications that can run at full speed and be profiled/benchmarked.

Somebody may argue that I'm exaggerating and OpenGL ES support seems
to be not so bad. There were many OpenGL ES related news and
announcements. Also there exists Linaro/Ubuntu distribution and some
videos on youtube showing how it successfully runs something in 3D on
ARM. Still the problem is that in many applications the said OpenGL ES
support is either in the work-in-progress state, or it possibly has
been contributed by somebody some time ago and has already bitrotten.
Also Linaro bundles a bunch of OpenGL ES hacks, which don't seem to be
actively pushed upstream. This all is less than perfect and needs to be
improved. That is unless we are happy with having OpenGL ES just
exclusively for running Qt5 and a few compliant compositing window
managers.

-- 
Best regards,
Siarhei Siamashka


Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)

2013-08-13 Thread Siarhei Siamashka
On Mon, 12 Aug 2013 14:09:39 -0700
Chad Versace chad.vers...@linux.intel.com wrote:

 On 08/06/2013 09:44 PM, Siarhei Siamashka wrote:
  On Tue, 6 Aug 2013 15:54:57 -0700
  Matt Turner matts...@gmail.com wrote:
 
  On Tue, Aug 6, 2013 at 2:13 PM, Siarhei Siamashka
  siarhei.siamas...@gmail.com wrote:
 
  But if upstream Mesa treats this configuration as unsupported, then I
  also don't see it progressing anywhere in Gentoo. So could you please
  re-consider this decision?
 
  As far as I'm aware, ES without Desktop GL is disallowed only because
  it was discovered to be broken
  which is because no one working on Mesa appears to test it.
 
  I have not done any really serious testing. I'm just playing around [...]
 
 
  If you can test it (and provide patches when you notice that it's
  broken) I don't have a problem with allowing ES-only builds.
 
 I agree. If you can fix Mesa to support ES-only builds and do *serious*
 testing with Piglit and some real ES applications to prove that it works,
 then I'm not opposed to supporting that configuration.

That's a really good point about Piglit.

Also if you wonder about what can be run with OpenGL ES only (and
without OpenGL), then here is some initial list:

Qt5 works, KWin works, glmark2 works, OGRE works to some extent (simple
demos run, but not full fledged games). WebGL in Firefox also works,
but still has some issues:
https://bugzilla.mozilla.org/show_bug.cgi?id=788319

But anything that wants GLU as a dependency simply does not build.
Some applications desperately want to link with -lGL for no reason.
If you want an example of such a problematic package, a good one
is mesa demos - http://cgit.freedesktop.org/mesa/demos
While it is supposed to provide the es2gears test program, it does not
build out of the box:

checking for GL... no
checking GL/gl.h usability... no
checking GL/gl.h presence... no
checking for GL/gl.h... no
configure: error: GL not found

And so on. There is definitely some work to do on the applications
front.

PS. I would be grateful if somebody could advise a highly dynamic and
enjoyable OpenGL ES compatible open source 3D game (not Quake3!). Tux
Rider World Challenge seems to be promising (as a GLESv1 testcase),
but needs some porting back to Linux and X11 EGL:
http://www.barlow-server.com/tuxriderworldchallenge

-- 
Best regards,
Siarhei Siamashka


Re: [Mesa-dev] segfault in pstip_bind_sampler_states

2013-08-13 Thread Kevin H. Hobbs
On 08/13/2013 09:50 AM, Brian Paul wrote:
 On 08/12/2013 11:30 AM, Kevin H. Hobbs wrote:
 
 --30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - 
 exiting
 
 Well, that's not too helpful.

I think it may have been helpful.

Before Valgrind crashed it mentions that
osmesa_st_framebuffer_flush_front wrote to an address it should not have.

In the test VTK uses a filter that tiles the rendering of a (not very)
large image.

The render window is initially 150x150 and then after some fooling
around magnification is set to 3.

I think this results in what I see in gdb :

res->width0     = 450
res->height0    = 450
osbuffer->width = 150
bytes           = 1800
dst_stride      = -600

If I read that last one right, then in the for loop we write 1800 bytes to
dst, move back 600 bytes, and write another 1800 bytes.

Are we overwriting 2/3 of what we just wrote?
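
The arithmetic behind that question can be checked with a small sketch. It
assumes 4 bytes per pixel, which is consistent with 450 * 4 = 1800 and
150 * 4 = 600 but is not stated in the gdb output; the helper names row_bytes
and row_overlap are made up for this illustration and are not Mesa functions.

```c
#include <stddef.h>

/* Bytes copied per row for a given width in pixels. */
static size_t
row_bytes(size_t width_px, size_t bytes_per_pixel)
{
   return width_px * bytes_per_pixel;
}

/*
 * Bytes shared by two consecutive row copies of `bytes` bytes each when
 * the destination pointer advances by `stride` bytes between them.
 * The sign of the stride only sets the direction, so its magnitude is used.
 */
static size_t
row_overlap(size_t bytes, ptrdiff_t stride)
{
   size_t step = stride < 0 ? (size_t)(-stride) : (size_t)stride;

   return step >= bytes ? 0 : bytes - step;
}
```

With bytes = 1800 and dst_stride = -600, row_overlap gives 1200, which is two
thirds of each row. A copy of row_bytes(150, 4) = 600 bytes with the same
stride would give an overlap of 0.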

  Can you send me an executable?

Not quickly.

  Or, is it 
 simple to build the test case?
 

I lose track of what's simple and what's not; for me there's a cron job that
just runs a bunch of scripts while I sleep.

The test is a python wrapped test so the whole hour long build of vtk is
hard to avoid.

They are attached for good measure but all I do is:

The mesa I use is built nightly with :
./autogen.sh \
  --prefix=/home/kevin/mesa_nightly \
  --enable-glx \
  --enable-dri \
  --enable-shared-glapi \
  --enable-gallium-llvm \
  --with-gallium-drivers=nouveau,swrast \
  --enable-osmesa


I have VTK cloned :
git clone http://vtk.org/VTK.git

I happen to be on the nightly-master branch for the dashboard but that
shouldn't matter.

ctest builds and tests VTK with :
ctest -S vtk_osmesa.cmake

Since you only want one test a build just like mine is :

VTK_BUILD=~/VTK_Build
VTK_SRC=~/VTK
mkdir $VTK_BUILD
cd $VTK_BUILD
cmake \
\
  -DBUILD_EXAMPLES:BOOL=ON\
  -DBUILD_SHARED_LIBS:BOOL=ON\
\
  -DVTK_BUILD_ALL_MODULES:BOOL=OFF\
  -DVTK_Group_Imaging:BOOL=ON\
  -DVTK_Group_MPI:BOOL=ON\
  -DVTK_Group_Rendering:BOOL=ON\
  -DVTK_Group_StandAlone:BOOL=ON\
  -DVTK_Group_Views:BOOL=ON\
\
  -DVTK_WRAP_JAVA:BOOL=OFF\
  -DVTK_WRAP_PYTHON:BOOL=ON\
  -DVTK_WRAP_TCL:BOOL=ON\
\
  -DOPENGL_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include\
  -DOPENGL_gl_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGL.so\
  -DOPENGL_glu_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGLU.so\
  -DVTK_OPENGL_HAS_OSMESA:BOOL=ON\
  -DOSMESA_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include\
  -DOSMESA_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libOSMesa.so\
\
  -DVTK_USE_OFFSCREEN:BOOL=ON\
  -DVTK_USE_X:BOOL=OFF\
  -DVTK_USE_TK:BOOL=OFF\
  $VTK_SRC

and the test is :

$VTK_BUILD/bin/vtkpython --enable-bt
$VTK_BUILD/Utilities/vtkTclTest2Py/rtImageTest.py
$VTK_SRC/Filters/Hybrid/Testing/Python/largeImageOffset.py -D
$VTK_BUILD/ExternalData/Testing -T $VTK_BUILD/Testing/Temporary
-V
$VTK_BUILD/ExternalData/Filters/Hybrid/Testing/Data/Baseline/largeImageOffset.png
-A $VTK_BUILD/Utilities/vtkTclTest2Py


ctest usually downloads the validation image largeImageOffset.png
and the input file mentioned in the test, iflamigm.3ds.

I don't know if this happens without ctest.
# Client maintainer: hob...@ohio.edu

set(CTEST_SITE "bubbles.hooperlab")
set(CTEST_BUILD_NAME "Fedora-18_OSMesaDevel-x86_64")
set(CTEST_CONFIGURATION_TYPE Release)
set(CTEST_CMAKE_GENERATOR "Unix Makefiles")
set( dashboard_model Nightly )
set( CTEST_BUILD_FLAGS -ij8 )
set( CTEST_TEST_ARGS PARALLEL_LEVEL 8 )
set( CTEST_DASHBOARD_ROOT ${CTEST_SCRIPT_DIRECTORY} )
set( CTEST_SOURCE_DIRECTORY ${CTEST_DASHBOARD_ROOT}/VTK )
set( CTEST_BINARY_DIRECTORY ${CTEST_DASHBOARD_ROOT}/VTK_OSMesa_Build )
set( VTK_USE_LARGE_DATA ON )
set( dashboard_cache 
  BUILD_EXAMPLES:BOOL=ON
  BUILD_SHARED_LIBS:BOOL=ON

  VTK_BUILD_ALL_MODULES:BOOL=OFF
  VTK_Group_Imaging:BOOL=ON
  VTK_Group_MPI:BOOL=ON
  VTK_Group_Rendering:BOOL=ON
  VTK_Group_StandAlone:BOOL=ON
  VTK_Group_Views:BOOL=ON

  VTK_WRAP_JAVA:BOOL=OFF
  VTK_WRAP_PYTHON:BOOL=ON
  VTK_WRAP_TCL:BOOL=ON

  OPENGL_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include
  OPENGL_gl_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGL.so
  OPENGL_glu_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libGLU.so
  VTK_OPENGL_HAS_OSMESA:BOOL=ON
  OSMESA_INCLUDE_DIR:PATH=/home/kevin/mesa_nightly/include
  OSMESA_LIBRARY:FILEPATH=/home/kevin/mesa_nightly/lib/libOSMesa.so

  VTK_USE_OFFSCREEN:BOOL=ON
  VTK_USE_X:BOOL=OFF
  VTK_USE_TK:BOOL=OFF
  )

include(${CTEST_SCRIPT_DIRECTORY}/VTKScripts/vtk_common.cmake)



update_kitware.sh
Description: application/shellscript


update_mesa.sh
Description: application/shellscript


signature.asc
Description: OpenPGP digital signature


[Mesa-dev] [PATCH 0/4] i965: Some small refactors for Broadwell

2013-08-13 Thread Chad Versace
Here are some small refactors for Broadwell. I'd like to see them merged
upstream now to ease the maintenance of internal trees. There's nothing
exciting here, so I request your rubberstamp.

Chad Versace (2):
  i965: Refactor names of sample_positions_8/4x arrays
  i965: Move arrays brw_multisample_positions* to new header

Kenneth Graunke (2):
  i965: Mark a few brw_draw_upload.c functions as non-static
  i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)

 src/mesa/drivers/dri/i965/brw_context.h|  5 ++
 src/mesa/drivers/dri/i965/brw_draw_upload.c| 16 ++---
 src/mesa/drivers/dri/i965/brw_multisample_state.h  | 72 ++
 src/mesa/drivers/dri/i965/brw_state.h  |  4 ++
 src/mesa/drivers/dri/i965/gen6_multisample_state.c | 57 ++---
 src/mesa/drivers/dri/i965/gen7_sol_state.c |  6 +-
 6 files changed, 99 insertions(+), 61 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_multisample_state.h

-- 
1.8.3.1



[Mesa-dev] [PATCH 1/4] i965: Mark a few brw_draw_upload.c functions as non-static

2013-08-13 Thread Chad Versace
From: Kenneth Graunke kenn...@whitecape.org

We will reuse these for Broadwell.

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.h |  5 +
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 16 +---
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 00dd2b4..74e38f1 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1362,6 +1362,11 @@ int brw_disasm (FILE *file, struct brw_instruction 
*inst, int gen);
 /* brw_vs.c */
 gl_clip_plane *brw_select_clip_planes(struct gl_context *ctx);
 
+/* brw_draw_upload.c */
+unsigned brw_get_vertex_surface_type(struct brw_context *brw,
+ const struct gl_client_array *glarray);
+unsigned brw_get_index_type(GLenum type);
+
 /* brw_wm_surface_state.c */
 void brw_init_surface_formats(struct brw_context *brw);
 void
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 897e733..158c9e5 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -222,9 +222,9 @@ static GLuint byte_types_scale[5] = {
  * the appropriate hardware surface type.
  * Format will be GL_RGBA or possibly GL_BGRA for GLubyte[4] color arrays.
  */
-static unsigned
-get_surface_type(struct brw_context *brw,
- const struct gl_client_array *glarray)
+unsigned
+brw_get_vertex_surface_type(struct brw_context *brw,
+const struct gl_client_array *glarray)
 {
    int size = glarray->Size;
 
@@ -342,7 +342,8 @@ get_surface_type(struct brw_context *brw,
}
 }
 
-static GLuint get_index_type(GLenum type)
+unsigned
+brw_get_index_type(GLenum type)
 {
switch (type) {
case GL_UNSIGNED_BYTE:  return BRW_INDEX_BYTE;
@@ -687,7 +688,7 @@ static void brw_emit_vertices(struct brw_context *brw)
    OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS << 16) | (2 * nr_elements - 1));
    for (i = 0; i < brw->vb.nr_enabled; i++) {
       struct brw_vertex_element *input = brw->vb.enabled[i];
-      uint32_t format = get_surface_type(brw, input->glarray);
+      uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
   uint32_t comp0 = BRW_VE1_COMPONENT_STORE_SRC;
   uint32_t comp1 = BRW_VE1_COMPONENT_STORE_SRC;
   uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC;
@@ -748,7 +749,8 @@ static void brw_emit_vertices(struct brw_context *brw)
}
 
    if (brw->gen >= 6 && gen6_edgeflag_input) {
-      uint32_t format = get_surface_type(brw, gen6_edgeflag_input->glarray);
+      uint32_t format =
+         brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
 
       OUT_BATCH((gen6_edgeflag_input->buffer << GEN6_VE0_INDEX_SHIFT) |
 GEN6_VE0_VALID |
@@ -900,7 +902,7 @@ static void brw_emit_index_buffer(struct brw_context *brw)
BEGIN_BATCH(3);
    OUT_BATCH(CMD_INDEX_BUFFER << 16 |
              cut_index_setting |
-             get_index_type(index_buffer->type) << 8 |
+             brw_get_index_type(index_buffer->type) << 8 |
              1);
    OUT_RELOC(brw->ib.bo,
  I915_GEM_DOMAIN_VERTEX, 0,
-- 
1.8.3.1



[Mesa-dev] [PATCH 2/4] i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)

2013-08-13 Thread Chad Versace
From: Kenneth Graunke kenn...@whitecape.org

We will reuse this for Broadwell.

v2: Prefix function name with 'gen7'. (chadv)

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h  | 4 
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 321bffe..a1236b7 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -200,6 +200,10 @@ void gen7_init_vtable_surface_functions(struct brw_context 
*brw);
 void gen7_create_shader_time_surface(struct brw_context *brw,
  uint32_t *out_offset);
 
+/* gen7_sol_state.c */
+void gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
+  const struct brw_vue_map *vue_map);
+
 /* brw_wm_sampler_state.c */
 uint32_t translate_wrap_mode(GLenum wrap, bool using_nearest);
 void upload_default_color(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 034efe8..185e422 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -97,9 +97,9 @@ upload_3dstate_so_buffers(struct brw_context *brw)
  * stream.  We only have one stream of rendering coming out of the GS unit, so
  * we only emit stream 0 (low 16 bits) SO_DECLs.
  */
-static void
-upload_3dstate_so_decl_list(struct brw_context *brw,
-   const struct brw_vue_map *vue_map)
+void
+gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
+ const struct brw_vue_map *vue_map)
 {
    struct gl_context *ctx = &brw->ctx;
/* BRW_NEW_VERTEX_PROGRAM */
-- 
1.8.3.1



[Mesa-dev] [PATCH 3/4] i965: Refactor names of sample_positions_8/4x arrays

2013-08-13 Thread Chad Versace
Place each array in the brw namespace by renaming it:
sample_positions_4x -> brw_multisample_positions_4x
sample_positions_8x -> brw_multisample_positions_8x

This prepares for moving the arrays to a header shared by gen6 and gen8.

CC: Paul Berry stereotype...@gmail.com
Signed-off-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/gen6_multisample_state.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c 
b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
index 268dc79..0ba3642 100644
--- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
@@ -34,7 +34,7 @@
  * e 3
  */
 static uint32_t
-sample_positions_4x[] = { 0xae2ae662 };
+brw_multisample_positions_4x[] = { 0xae2ae662 };
 /* Sample positions are based on a solution to the 8 queens puzzle.
  * Rationale: in a solution to the 8 queens puzzle, no two queens share
  * a row, column, or diagonal.  This is a desirable property for samples
@@ -69,7 +69,7 @@ sample_positions_4x[] = { 0xae2ae662 };
  * f   7
  */
 static uint32_t
-sample_positions_8x[] = { 0xdbb39d79, 0x3ff55117 };
+brw_multisample_positions_8x[] = { 0xdbb39d79, 0x3ff55117 };
 
 
 void
@@ -82,13 +82,13 @@ gen6_get_sample_position(struct gl_context *ctx,
   result[0] = result[1] = 0.5f;
   break;
case 4: {
-      uint8_t val = (uint8_t)(sample_positions_4x[0] >> (8*index));
+      uint8_t val = (uint8_t)(brw_multisample_positions_4x[0] >> (8*index));
       result[0] = ((val >> 4) & 0xf) / 16.0f;
       result[1] = (val & 0xf) / 16.0f;
       break;
    }
    case 8: {
-      uint8_t val = (uint8_t)(sample_positions_8x[index>>2] >> (8*(index & 3)));
+      uint8_t val = (uint8_t)(brw_multisample_positions_8x[index>>2] >> (8*(index & 3)));
       result[0] = ((val >> 4) & 0xf) / 16.0f;
       result[1] = (val & 0xf) / 16.0f;
   break;
@@ -116,12 +116,12 @@ gen6_emit_3dstate_multisample(struct brw_context *brw,
   break;
case 4:
   number_of_multisamples = MS_NUMSAMPLES_4;
-  sample_positions_3210 = sample_positions_4x[0];
+  sample_positions_3210 = brw_multisample_positions_4x[0];
   break;
case 8:
   number_of_multisamples = MS_NUMSAMPLES_8;
-  sample_positions_3210 = sample_positions_8x[0];
-  sample_positions_7654 = sample_positions_8x[1];
+  sample_positions_3210 = brw_multisample_positions_8x[0];
+  sample_positions_7654 = brw_multisample_positions_8x[1];
   break;
default:
       assert(!"Unrecognized num_samples in gen6_emit_3dstate_multisample");
-- 
1.8.3.1
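
Note on the packed sample-position format used above: each byte of the
arrays holds one sample, with the x offset in the high nibble and the y
offset in the low nibble, both in units of 1/16 of a pixel. The following is
a minimal standalone sketch of that decoding, assuming only what the diff
shows. The helper name decode_sample_position and the local array names are
hypothetical and are not Mesa symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch; the array names here are local stand-ins. */
static const uint32_t positions_4x[] = { 0xae2ae662 };
static const uint32_t positions_8x[] = { 0xdbb39d79, 0x3ff55117 };

/* Hypothetical helper, not a Mesa function: unpack sample `index` into
 * x/y offsets in [0, 1).  Each byte stores x in the high nibble and y in
 * the low nibble, in units of 1/16 pixel. */
static void
decode_sample_position(unsigned num_samples, unsigned index, float result[2])
{
   uint8_t val;

   assert(num_samples == 4 || num_samples == 8);
   assert(index < num_samples);

   if (num_samples == 4)
      val = (uint8_t)(positions_4x[0] >> (8 * index));
   else
      val = (uint8_t)(positions_8x[index >> 2] >> (8 * (index & 3)));

   result[0] = ((val >> 4) & 0xf) / 16.0f;
   result[1] = (val & 0xf) / 16.0f;
}
```

For the 4x pattern, sample 0 is byte 0x62, which decodes to x = 6/16 and
y = 2/16. That matches the diagram in the source comment, where sample 0
sits in row 2, column 6.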



[Mesa-dev] [PATCH 4/4] i965: Move arrays brw_multisample_positions* to new header

2013-08-13 Thread Chad Versace
Move the arrays to the new header brw_multisample_state.h, which will be
shared with Broadwell code.

CC: Paul Berry stereotype...@gmail.com
Signed-off-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_multisample_state.h  | 72 ++
 src/mesa/drivers/dri/i965/gen6_multisample_state.c | 47 +-
 2 files changed, 73 insertions(+), 46 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_multisample_state.h

diff --git a/src/mesa/drivers/dri/i965/brw_multisample_state.h 
b/src/mesa/drivers/dri/i965/brw_multisample_state.h
new file mode 100644
index 000..79566f0
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_multisample_state.h
@@ -0,0 +1,72 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <stdint.h>
+
+/**
+ * Sample positions:
+ *   2 6 a e
+ * 2   0
+ * 6   1
+ * a 2
+ * e 3
+ */
+static const uint32_t
+brw_multisample_positions_4x[] = { 0xae2ae662 };
+
+/**
+ * Sample positions are based on a solution to the 8 queens puzzle.
+ * Rationale: in a solution to the 8 queens puzzle, no two queens share
+ * a row, column, or diagonal.  This is a desirable property for samples
+ * in a multisampling pattern, because it ensures that the samples are
+ * relatively uniformly distributed through the pixel.
+ *
+ * There are several solutions to the 8 queens puzzle (see
+ * http://en.wikipedia.org/wiki/Eight_queens_puzzle).  This solution was
+ * chosen because it has a queen close to the center; this should
+ * improve the accuracy of centroid interpolation, since the hardware
+ * implements centroid interpolation by choosing the centermost sample
+ * that overlaps with the primitive being drawn.
+ *
+ * Note: from the Ivy Bridge PRM, Vol2 Part1 p304 (3DSTATE_MULTISAMPLE:
+ * Programming Notes):
+ *
+ * When programming the sample offsets (for NUMSAMPLES_4 or _8 and
+ * MSRASTMODE_xxx_PATTERN), the order of the samples 0 to 3 (or 7
+ * for 8X) must have monotonically increasing distance from the
+ * pixel center. This is required to get the correct centroid
+ * computation in the device.
+ *
+ * Sample positions:
+ *   1 3 5 7 9 b d f
+ * 1 5
+ * 3   2
+ * 5   6
+ * 7 4
+ * 9   0
+ * b 3
+ * d 1
+ * f   7
+ */
+static const uint32_t
+brw_multisample_positions_8x[] = { 0xdbb39d79, 0x3ff55117 };
diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c 
b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
index 0ba3642..c94c900 100644
--- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
@@ -25,52 +25,7 @@
 
 #include "brw_context.h"
 #include "brw_defines.h"
-
-/* Sample positions:
- *   2 6 a e
- * 2   0
- * 6   1
- * a 2
- * e 3
- */
-static uint32_t
-brw_multisample_positions_4x[] = { 0xae2ae662 };
-/* Sample positions are based on a solution to the 8 queens puzzle.
- * Rationale: in a solution to the 8 queens puzzle, no two queens share
- * a row, column, or diagonal.  This is a desirable property for samples
- * in a multisampling pattern, because it ensures that the samples are
- * relatively uniformly distributed through the pixel.
- *
- * There are several solutions to the 8 queens puzzle (see
- * http://en.wikipedia.org/wiki/Eight_queens_puzzle).  This solution was
- * chosen because it has a queen close to the center; this should
- * improve the accuracy of centroid interpolation, since the hardware
- * implements centroid interpolation by choosing the centermost sample
- * that overlaps with the primitive being drawn.
- *
- * Note: from the Ivy Bridge PRM, Vol2 Part1 p304 (3DSTATE_MULTISAMPLE:
- * Programming Notes):
- *
- * When programming the sample offsets (for NUMSAMPLES_4 or _8 and
- * MSRASTMODE_xxx_PATTERN), the order of the samples 0 to 3 (or 7
- * for 

[Mesa-dev] [PATCH] configure: link against -lLLVM to determine build type

2013-08-13 Thread Maarten Lankhorst
Fixes a build failure of 9.2 on Ubuntu, because libLLVM-3.3.so is not present
in /usr/lib/llvm-3.2/lib.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
diff --git a/configure.ac b/configure.ac
index 35f6797..579d8d4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1870,7 +1870,18 @@ if test x$MESA_LLVM != x0; then
 if test x$with_llvm_shared_libs = xyes; then
 dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
 LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
-AS_IF([test -f $LLVM_LIBDIR/lib$LLVM_SO_NAME.so], 
[llvm_have_one_so=yes])
+
+AC_MSG_CHECKING([whether $LLVM_SO_NAME is a monolithic blob])
+save_LIBS="$LIBS"
+save_LDFLAGS="$LDFLAGS"
+LDFLAGS="$LDFLAGS $LLVM_LDFLAGS"
+LIBS="$LIBS -l$LLVM_SO_NAME"
+
+AC_LINK_IFELSE([AC_LANG_CALL([], [LLVMInitializeCore])],
+   [llvm_have_one_so=yes], [llvm_have_one_so=no])
+LIBS=$save_LIBS
+LDFLAGS=$save_LDFLAGS
+AC_MSG_RESULT([$llvm_have_one_so])
 
 if test x$llvm_have_one_so = xyes; then
 dnl LLVM was built using auto*, so there is only one shared object.



Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-13 Thread Chad Versace

On 08/12/2013 03:15 PM, Stéphane Marchesin wrote:

On Mon, Aug 12, 2013 at 3:05 PM, Marek Olšák mar...@gmail.com wrote:

On Mon, Aug 12, 2013 at 11:36 PM, Stéphane Marchesin
stephane.marche...@gmail.com wrote:

Other than hybrid systems (of which
there are none with i915 graphics), is there any case where
__DRI_IMAGE_USE_SHARE can occur?


You could do interesting things like cross-process sharing with it. I
think it's worth doing it, no matter what. It's easy to pick up now,
and hard to fix up later.


Cross-process sharing is mandatory already and exposed via
resource_from_handle and resource_get_handle. I don't think this is
useful for cross-process sharing anyway, because it disables tiling.



Well, for Chrome we're thinking of using it. If one end can map linear
memory and write texture data to it from the CPU, and the other end
can use it as a GL texture, then we have a zero copy cross-process
texture upload. I realize it's not your normal use case, but... :)


Stéphane, have you considered using EGL_EXT_dma_buf_import with
GL_OES_EGL_image_external for Chrome texture sharing? It seems
like a good fit for what Chrome is doing: sharing non-mipmapped
2d texture memory. Both extensions recently landed on master for
i965.

On that note, do you have any interest in moving Chrome/Intel
away from GLX to EGL? Some people on the Intel Media team have
been discussing a desire to do that for some of the media
components.



[Mesa-dev] [PATCH] radeon/llvm: fix compile error with -Werror=format-security

2013-08-13 Thread Maarten Lankhorst
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 1a4d4fd..2dd7bf7 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -124,7 +124,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct 
radeon_llvm_binary *binary,
    r = LLVMTargetMachineEmitToMemoryBuffer(tm, M, LLVMObjectFile, &err,
                                            &out_buffer);
if (r) {
-   fprintf(stderr, err);
+       fprintf(stderr, "%s", err);
FREE(err);
return 1;
}
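
Background on the warning, as a hedged sketch: -Werror=format-security
rejects a call whose format string is not a literal and has no arguments,
because any '%' in the message would be read as a conversion specification.
The helper below is hypothetical and not part of radeon_llvm_emit.c. It only
shows the safe pattern of passing the message as an argument to "%s".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print an externally supplied message without
 * interpreting it as a format string.  Returns what fprintf returns,
 * i.e. the number of characters written or a negative value on error. */
static int
report_error(FILE *stream, const char *msg)
{
   assert(stream != NULL && msg != NULL);

   /* fprintf(stream, msg) would treat any '%' inside msg as the start of
    * a conversion; routing msg through "%s" prints it verbatim. */
   return fprintf(stream, "%s", msg);
}
```

With this pattern, a message that happens to contain "%s" or "%n" is written
out unchanged instead of triggering undefined behavior.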



Re: [Mesa-dev] [PATCH 4/4] i965: Move arrays brw_multisample_positions* to new header

2013-08-13 Thread Kenneth Graunke

On 08/13/2013 08:53 AM, Chad Versace wrote:

Move the arrays to the new header brw_multisample_state.h, which will be
shared with Broadwell code.

CC: Paul Berry stereotype...@gmail.com
Signed-off-by: Chad Versace chad.vers...@linux.intel.com



Hmm.  Looks like I botched my patches in a rebase, but I'm pretty sure 
my new plan was to just put the Broadwell code in 
gen6_multisample_state.c (rather than introducing a new file) to avoid 
having to do this.  It's not much code.


Then again, I don't really mind doing this either.

Patches 3-4 get my R-b, and I'm fine with landing the series.



[Mesa-dev] [PATCH 1/4] ilo: implement new float comparison instructions

2013-08-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com

untested.
---
 src/gallium/drivers/ilo/shader/toy_tgsi.c |   20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c 
b/src/gallium/drivers/ilo/shader/toy_tgsi.c
index d5a3f2f..830aa57 100644
--- a/src/gallium/drivers/ilo/shader/toy_tgsi.c
+++ b/src/gallium/drivers/ilo/shader/toy_tgsi.c
@@ -209,15 +209,18 @@ aos_set_on_cond(struct toy_compiler *tc,
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_ISLT:
case TGSI_OPCODE_USLT:
+   case TGSI_OPCODE_FSLT:
   cond = BRW_CONDITIONAL_L;
   break;
case TGSI_OPCODE_SGE:
case TGSI_OPCODE_ISGE:
case TGSI_OPCODE_USGE:
+   case TGSI_OPCODE_FSGE:
   cond = BRW_CONDITIONAL_GE;
   break;
case TGSI_OPCODE_SEQ:
case TGSI_OPCODE_USEQ:
+   case TGSI_OPCODE_FSEQ:
   cond = BRW_CONDITIONAL_EQ;
   break;
case TGSI_OPCODE_SGT:
@@ -228,6 +231,7 @@ aos_set_on_cond(struct toy_compiler *tc,
   break;
case TGSI_OPCODE_SNE:
case TGSI_OPCODE_USNE:
+   case TGSI_OPCODE_FSNE:
   cond = BRW_CONDITIONAL_NEQ;
   break;
default:
@@ -935,10 +939,10 @@ static const toy_tgsi_translate 
aos_translate_table[TGSI_OPCODE_LAST] = {
[105]  = aos_unsupported,
[106]  = aos_unsupported,
[TGSI_OPCODE_NOP]  = aos_simple,
-   [108]  = aos_unsupported,
-   [109]  = aos_unsupported,
-   [110]  = aos_unsupported,
-   [111]  = aos_unsupported,
+   [TGSI_OPCODE_FSEQ] = aos_set_on_cond,
+   [TGSI_OPCODE_FSGE] = aos_set_on_cond,
+   [TGSI_OPCODE_FSLT] = aos_set_on_cond,
+   [TGSI_OPCODE_FSNE] = aos_set_on_cond,
[TGSI_OPCODE_NRM4] = aos_NRM4,
[TGSI_OPCODE_CALLNZ]   = aos_unsupported,
[TGSI_OPCODE_BREAKC]   = aos_unsupported,
@@ -1551,10 +1555,10 @@ static const toy_tgsi_translate 
soa_translate_table[TGSI_OPCODE_LAST] = {
[105]  = soa_unsupported,
[106]  = soa_unsupported,
[TGSI_OPCODE_NOP]  = soa_passthrough,
-   [108]  = soa_unsupported,
-   [109]  = soa_unsupported,
-   [110]  = soa_unsupported,
-   [111]  = soa_unsupported,
+   [TGSI_OPCODE_FSEQ] = soa_per_channel,
+   [TGSI_OPCODE_FSGE] = soa_per_channel,
+   [TGSI_OPCODE_FSLT] = soa_per_channel,
+   [TGSI_OPCODE_FSNE] = soa_per_channel,
[TGSI_OPCODE_NRM4] = soa_NRM4,
[TGSI_OPCODE_CALLNZ]   = soa_unsupported,
[TGSI_OPCODE_BREAKC]   = soa_unsupported,
-- 
1.7.9.5


[Mesa-dev] [PATCH 3/4] r600/radeonsi: implement new float comparison instructions

2013-08-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com

Also use ordered comparisons for old cmp instructions. Untested.
---
 src/gallium/drivers/r600/r600_shader.c |   18 ---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c|   49 
 2 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 37298cc..fb766c4 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -5743,11 +5743,10 @@ static struct r600_shader_tgsi_instruction 
r600_shader_tgsi_instruction[] = {
{105,   0, ALU_OP0_NOP, tgsi_unsupported},
{106,   0, ALU_OP0_NOP, tgsi_unsupported},
{TGSI_OPCODE_NOP,   0, ALU_OP0_NOP, tgsi_unsupported},
-   /* gap */
-   {108,   0, ALU_OP0_NOP, tgsi_unsupported},
-   {109,   0, ALU_OP0_NOP, tgsi_unsupported},
-   {110,   0, ALU_OP0_NOP, tgsi_unsupported},
-   {111,   0, ALU_OP0_NOP, tgsi_unsupported},
+   {TGSI_OPCODE_FSEQ,  0, ALU_OP2_SETE_DX10, tgsi_op2},
+   {TGSI_OPCODE_FSGE,  0, ALU_OP2_SETGE_DX10, tgsi_op2},
+   {TGSI_OPCODE_FSLT,  0, ALU_OP2_SETGT_DX10, tgsi_op2_swap},
+   {TGSI_OPCODE_FSNE,  0, ALU_OP2_SETNE_DX10, tgsi_op2_swap},
{TGSI_OPCODE_NRM4,  0, ALU_OP0_NOP, tgsi_unsupported},
{TGSI_OPCODE_CALLNZ,0, ALU_OP0_NOP, tgsi_unsupported},
/* gap */
@@ -5936,11 +5935,10 @@ static struct r600_shader_tgsi_instruction 
eg_shader_tgsi_instruction[] = {
{105,   0, ALU_OP0_NOP, tgsi_unsupported},
{106,   0, ALU_OP0_NOP, tgsi_unsupported},
{TGSI_OPCODE_NOP,   0, ALU_OP0_NOP, tgsi_unsupported},
-   /* gap */
-   {108,   0, ALU_OP0_NOP, tgsi_unsupported},
-   {109,   0, ALU_OP0_NOP, tgsi_unsupported},
-   {110,   0, ALU_OP0_NOP, tgsi_unsupported},
-   {111,   0, ALU_OP0_NOP, tgsi_unsupported},
+   {TGSI_OPCODE_FSEQ,  0, ALU_OP2_SETE_DX10, tgsi_op2},
+   {TGSI_OPCODE_FSGE,  0, ALU_OP2_SETGE_DX10, tgsi_op2},
+   {TGSI_OPCODE_FSLT,  0, ALU_OP2_SETGT_DX10, tgsi_op2_swap},
+   {TGSI_OPCODE_FSNE,  0, ALU_OP2_SETNE_DX10, tgsi_op2_swap},
{TGSI_OPCODE_NRM4,  0, ALU_OP0_NOP, tgsi_unsupported},
{TGSI_OPCODE_CALLNZ,0, ALU_OP0_NOP, tgsi_unsupported},
/* gap */
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 7a47746..8ff9abd 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -850,18 +850,16 @@ static void emit_cmp(
LLVMRealPredicate pred;
LLVMValueRef cond;
 
-   /* XXX I'm not sure whether to do unordered or ordered comparisons,
-* but llvmpipe uses unordered comparisons, so for consistency we use
-* unordered.  (The authors of llvmpipe aren't sure about using
-* unordered vs ordered comparisons either.
+   /* Use ordered for everything but NE (which is usual for
+* float comparisons)
 */
    switch (emit_data->inst->Instruction.Opcode) {
-   case TGSI_OPCODE_SGE: pred = LLVMRealUGE; break;
-   case TGSI_OPCODE_SEQ: pred = LLVMRealUEQ; break;
-   case TGSI_OPCODE_SLE: pred = LLVMRealULE; break;
-   case TGSI_OPCODE_SLT: pred = LLVMRealULT; break;
+   case TGSI_OPCODE_SGE: pred = LLVMRealOGE; break;
+   case TGSI_OPCODE_SEQ: pred = LLVMRealOEQ; break;
+   case TGSI_OPCODE_SLE: pred = LLVMRealOLE; break;
+   case TGSI_OPCODE_SLT: pred = LLVMRealOLT; break;
case TGSI_OPCODE_SNE: pred = LLVMRealUNE; break;
-   case TGSI_OPCODE_SGT: pred = LLVMRealUGT; break;
+   case TGSI_OPCODE_SGT: pred = LLVMRealOGT; break;
    default: assert(!"unknown instruction"); pred = 0; break;
    }
 
@@ -872,6 +870,35 @@ static void emit_cmp(
                        cond, bld_base->base.one, bld_base->base.zero, "");
 }
 
+static void emit_fcmp(
+   const struct lp_build_tgsi_action *action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
+   LLVMContextRef context = bld_base->base.gallivm->context;
+   LLVMRealPredicate pred;
+
+   /* Use ordered for everything but NE (which is usual for
+    * float comparisons)
+    */
+   switch (emit_data->inst->Instruction.Opcode) {
+   case TGSI_OPCODE_FSEQ: pred = LLVMRealOEQ; break;
+   case TGSI_OPCODE_FSGE: pred = LLVMRealOGE; break;
+   case TGSI_OPCODE_FSLT: pred = LLVMRealOLT; break;
+   case TGSI_OPCODE_FSNE: pred = LLVMRealUNE; break;
+   default: assert(!"unknown instruction"); pred = 0; break;
+  

[Mesa-dev] [PATCH 2/4] nv50: implement new float comparison instructions

2013-08-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com

untested.
---
 .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp |   17 +
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
index 56eccac..a2ad9f4 100644
--- a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
@@ -440,6 +440,11 @@ nv50_ir::DataType Instruction::inferDstType() const
switch (getOpcode()) {
case TGSI_OPCODE_F2U: return nv50_ir::TYPE_U32;
case TGSI_OPCODE_F2I: return nv50_ir::TYPE_S32;
+   case TGSI_OPCODE_FSEQ:
+   case TGSI_OPCODE_FSGE:
+   case TGSI_OPCODE_FSLT:
+   case TGSI_OPCODE_FSNE:
+  return nv50_ir::TYPE_U32;
case TGSI_OPCODE_I2F:
case TGSI_OPCODE_U2F:
   return nv50_ir::TYPE_F32;
@@ -456,19 +461,23 @@ nv50_ir::CondCode Instruction::getSetCond() const
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_ISLT:
case TGSI_OPCODE_USLT:
+   case TGSI_OPCODE_FSLT:
   return CC_LT;
case TGSI_OPCODE_SLE:
   return CC_LE;
case TGSI_OPCODE_SGE:
case TGSI_OPCODE_ISGE:
case TGSI_OPCODE_USGE:
+   case TGSI_OPCODE_FSGE:
   return CC_GE;
case TGSI_OPCODE_SGT:
   return CC_GT;
case TGSI_OPCODE_SEQ:
case TGSI_OPCODE_USEQ:
+   case TGSI_OPCODE_FSEQ:
   return CC_EQ;
case TGSI_OPCODE_SNE:
+   case TGSI_OPCODE_FSNE:
   return CC_NEU;
case TGSI_OPCODE_USNE:
   return CC_NE;
@@ -556,6 +565,10 @@ static nv50_ir::operation translateOpcode(uint opcode)
NV50_IR_OPCODE_CASE(KILL_IF, DISCARD);
 
NV50_IR_OPCODE_CASE(F2I, CVT);
+   NV50_IR_OPCODE_CASE(FSEQ, SET);
+   NV50_IR_OPCODE_CASE(FSGE, SET);
+   NV50_IR_OPCODE_CASE(FSLT, SET);
+   NV50_IR_OPCODE_CASE(FSNE, SET);
NV50_IR_OPCODE_CASE(IDIV, DIV);
NV50_IR_OPCODE_CASE(IMAX, MAX);
NV50_IR_OPCODE_CASE(IMIN, MIN);
@@ -2354,6 +2367,10 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
case TGSI_OPCODE_SLE:
case TGSI_OPCODE_SNE:
case TGSI_OPCODE_STR:
+   case TGSI_OPCODE_FSEQ:
+   case TGSI_OPCODE_FSGE:
+   case TGSI_OPCODE_FSLT:
+   case TGSI_OPCODE_FSNE:
case TGSI_OPCODE_ISGE:
case TGSI_OPCODE_ISLT:
case TGSI_OPCODE_USEQ:
-- 
1.7.9.5
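
Note on "ordered" versus "unordered" in the float comparison patches of this
series: an ordered comparison is false whenever either operand is NaN, while
an unordered not-equal is true in that case. The sketch below is only an
illustration in plain C and is not driver code. It relies on the IEEE 754
behaviour of C's own operators, where ==, >= and < act as ordered
comparisons and != acts as unordered not-equal. All function names are made
up for the example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Illustrative wrappers; the names are hypothetical. */
static bool cmp_ordered_eq(float a, float b)   { return a == b; }
static bool cmp_ordered_ge(float a, float b)   { return a >= b; }
static bool cmp_ordered_lt(float a, float b)   { return a < b; }
static bool cmp_unordered_ne(float a, float b) { return a != b; }

/* Self-check: with a NaN operand every ordered comparison is false and
 * the unordered not-equal is true. */
static void
check_nan_semantics(void)
{
   const float n = NAN;

   assert(!cmp_ordered_eq(n, n));
   assert(!cmp_ordered_ge(n, 1.0f));
   assert(!cmp_ordered_lt(n, 1.0f));
   assert(cmp_unordered_ne(n, n));
}
```

This assumes the compiler is not run with options such as -ffast-math, which
allow it to ignore NaN semantics.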


[Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported

2013-08-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com

Should get rid of some float-to-int conversions (with negation).
No piglit regressions (with llvmpipe).
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45 ++--
 1 file changed, 15 insertions(+), 30 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index d9b4ed2..65ba449 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -420,8 +420,6 @@ public:
void emit_scalar(ir_instruction *ir, unsigned op,
st_dst_reg dst, st_src_reg src0, st_src_reg src1);
 
-   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
-
void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
 
void emit_scs(ir_instruction *ir, unsigned op,
@@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
 
    this->instructions.push_tail(inst);
 
-   if (native_integers)
-  try_emit_float_set(ir, op, dst);
-
return inst;
 }
 
@@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op)
return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
 }
 
- /**
- * Emits the code to convert the result of float SET instructions to integers.
- */
-void
-glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
-st_dst_reg dst)
-{
-   if ((op == TGSI_OPCODE_SEQ ||
-op == TGSI_OPCODE_SNE ||
-op == TGSI_OPCODE_SGE ||
-op == TGSI_OPCODE_SLT))
-   {
-  st_src_reg src = st_src_reg(dst);
-  src.negate = ~src.negate;
-  dst.type = GLSL_TYPE_FLOAT;
-  emit(ir, TGSI_OPCODE_F2I, dst, src);
-   }
-}
-
 /**
  * Determines whether to use an integer, unsigned integer, or float opcode 
  * based on the operands and input opcode, then emits the result.
@@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
 #define case2fi(f, i)   case4(f, f, i, i)
 #define case2iu(i, u)   case4(i, LAST, i, u)

+#define casecomp(c, f, i, u) \
+   case TGSI_OPCODE_##c: \
+  if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \
+  else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \
+  else if (native_integers) \
+ op = TGSI_OPCODE_##f; \
+  else op = TGSI_OPCODE_##c; \
+  break;
+
switch(op) {
   case2fi(ADD, UADD);
   case2fi(MUL, UMUL);
@@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
   case3(MAX, IMAX, UMAX);
   case3(MIN, IMIN, UMIN);
   case2iu(MOD, UMOD);
-  
-  case2fi(SEQ, USEQ);
-  case2fi(SNE, USNE);
-  case3(SGE, ISGE, USGE);
-  case3(SLT, ISLT, USLT);
-  
+
+  casecomp(SEQ, FSEQ, USEQ, USEQ);
+  casecomp(SNE, FSNE, USNE, USNE);
+  casecomp(SGE, FSGE, ISGE, USGE);
+  casecomp(SLT, FSLT, ISLT, USLT);
+
   case2iu(ISHR, USHR);
 
   case2fi(SSG, ISSG);
-- 
1.7.9.5
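
A small sketch of what the removed try_emit_float_set() step computed, next
to the result the new opcodes are expected to produce directly. It assumes
that the float SET opcodes yield 1.0f or 0.0f and that FSEQ and friends
yield an integer boolean with all bits set for true. The second assumption
comes from the TGSI documentation rather than from this patch. Both function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old path, modelled on the removed code: SEQ produces a float 1.0f/0.0f,
 * then F2I is applied to the negated value, giving -1 or 0. */
static int32_t
old_float_set_to_int(float a, float b)
{
   float set = (a == b) ? 1.0f : 0.0f;   /* SEQ */
   return (int32_t)(-set);               /* F2I with negated source */
}

/* New path under the stated assumption: the comparison opcode writes the
 * integer boolean itself, so no conversion instruction is needed. */
static int32_t
fseq_result(float a, float b)
{
   assert(sizeof(int32_t) == 4);
   return (a == b) ? ~0 : 0;
}
```

Both functions return -1 (0xffffffff when viewed as unsigned) for equal
inputs and 0 otherwise, which is why the extra F2I could be dropped.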


Re: [Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported

2013-08-13 Thread Roland Scheidegger
I tested this with llvmpipe, but it would be good if the respective
driver authors could verify that it works for their drivers. I have no
idea whether I got it right there; I just guessed how it might work, based
mostly on how the other comparison instructions are handled. I hope I
caught all drivers that support integers; if not, GLSL will probably break
quite badly.

Roland


On 13.08.2013 19:04, srol...@vmware.com wrote:
 From: Roland Scheidegger srol...@vmware.com
 
 Should get rid of some float-to-int conversions (with negation).
 No piglit regressions (with llvmpipe).
 ---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45 
 ++--
  1 file changed, 15 insertions(+), 30 deletions(-)
 
 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index d9b4ed2..65ba449 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -420,8 +420,6 @@ public:
 void emit_scalar(ir_instruction *ir, unsigned op,
   st_dst_reg dst, st_src_reg src0, st_src_reg src1);
  
 -   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
 -
 void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);
  
 void emit_scs(ir_instruction *ir, unsigned op,
 @@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op,
  
     this->instructions.push_tail(inst);
  
 -   if (native_integers)
 -  try_emit_float_set(ir, op, dst);
 -
 return inst;
  }
  
 @@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
 op)
 return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
  }
  
 - /**
 - * Emits the code to convert the result of float SET instructions to 
 integers.
 - */
 -void
 -glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
 -  st_dst_reg dst)
 -{
 -   if ((op == TGSI_OPCODE_SEQ ||
 -op == TGSI_OPCODE_SNE ||
 -op == TGSI_OPCODE_SGE ||
 -op == TGSI_OPCODE_SLT))
 -   {
 -  st_src_reg src = st_src_reg(dst);
 -  src.negate = ~src.negate;
 -  dst.type = GLSL_TYPE_FLOAT;
 -  emit(ir, TGSI_OPCODE_F2I, dst, src);
 -   }
 -}
 -
  /**
   * Determines whether to use an integer, unsigned integer, or float opcode 
   * based on the operands and input opcode, then emits the result.
 @@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
 unsigned op,
  #define case2fi(f, i)   case4(f, f, i, i)
  #define case2iu(i, u)   case4(i, LAST, i, u)
 
 +#define casecomp(c, f, i, u) \
 +   case TGSI_OPCODE_##c: \
 +  if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \
 +  else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \
 +  else if (native_integers) \
 + op = TGSI_OPCODE_##f; \
 +  else op = TGSI_OPCODE_##c; \
 +  break;
 +
 switch(op) {
case2fi(ADD, UADD);
case2fi(MUL, UMUL);
 @@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
 unsigned op,
case3(MAX, IMAX, UMAX);
case3(MIN, IMIN, UMIN);
case2iu(MOD, UMOD);
 -  
 -  case2fi(SEQ, USEQ);
 -  case2fi(SNE, USNE);
 -  case3(SGE, ISGE, USGE);
 -  case3(SLT, ISLT, USLT);
 -  
 +
 +  casecomp(SEQ, FSEQ, USEQ, USEQ);
 +  casecomp(SNE, FSNE, USNE, USNE);
 +  casecomp(SGE, FSGE, ISGE, USGE);
 +  casecomp(SLT, FSLT, ISLT, USLT);
 +
case2iu(ISHR, USHR);
  
case2fi(SSG, ISSG);
 


Re: [Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.

2013-08-13 Thread Kenneth Graunke

On 08/09/2013 12:33 AM, Eric Anholt wrote:

Mark Mueller markkmuel...@gmail.com writes:


On Thu, Aug 8, 2013 at 2:19 PM, Eric Anholt e...@anholt.net wrote:


Mark Mueller markkmuel...@gmail.com writes:

Signed-off-by: Mark Mueller markkmuel...@gmail.com
---
  src/mesa/drivers/dri/i965/brw_draw.c | 16 ++--
  1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c

b/src/mesa/drivers/dri/i965/brw_draw.c

index 6170d07..1b5ed55 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context

*ctx,

 bool retval = true;
 GLuint i;
 bool fail_next = false;
+   const int estimated_max_prim_size =
+   512 + /* batchbuffer commands */
+   ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) +

sizeof(struct gen5_sampler_default_color)))) +

+   1024 + /* gen6 VS push constants */
+   1024 + /* gen6 WM push constants */
+   512; /* misc. pad */


What's the point of this change?  Moving loop invariants out of loops is
something basic that your compiler does,



Is that universally true for the code as it looked originally (see below)?
I've worked on embedded Atom and other systems with heavily dumbed down gcc
or other cross compilers. For instance there is a good chance that the
compilers from vehicle infotainment systems that I've worked on recently
would generate assembly for each line of code below inside the loop.


If your compiler isn't doing that, it's a problem with your compiler,
not the code being compiled, and you need to fix that in your build
environment.


Sure, yet it's in the company of fail_next with a similar problem. What
about keeping the definition inside the for loop but adding the const
keyword and adding all of the immediates as one operation?

for (i = 0; i < nr_prims; i++) {

  const int estimated_max_prim_size =
  512 + /* batchbuffer commands */
  ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) +
sizeof(struct gen5_sampler_default_color)))) +
  1024 + /* gen6 VS push constants */
  1024 + /* gen6 WM push constants */
  512; /* misc. pad */


The const keyword doesn't tell the compiler anything; it only keeps the
developer from trying to modify this, which just makes things
irritating when somebody comes along later to add something like "oh,
and on gen11 we need to reserve an extra 4k" or whatever.


Just a friendly reminder that this number is kind of bullshit anyway:

1. 512 bytes for batchbuffer commands?  On what generation?
2. Surface states anybody?  That's potentially 1-3k of space not tracked
3. With hardware contexts, we don't always emit the full state anyway...
4. BLORP shares batches with normal drawing now...

As far as I know, this is only used to flush the batch when it's 
approaching full.  If we filled up the last little bit, and then ran out 
of space, we'd have to start over, wasting CPU time.


If we want per-generation numbers, I think the right solution is to do:

static const int estimated_max_prim_size[] = { ... }
... estimated_max_prim_size[brw->gen - 4] ...

or a helper function, not if-ladders which +=.

But this is all pretty dodgy anyway.

--Ken
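
A minimal sketch of the table-based lookup suggested above, assuming the table starts at gen4. The byte counts are placeholders rather than measured values, and max_prim_size_for_gen and batch_needs_flush are hypothetical helper names, not functions from the i965 driver.

```c
/* Hypothetical per-generation batch space estimates, in bytes.
 * Index 0 corresponds to gen4. The numbers are placeholders. */
static const int estimated_max_prim_size[] = {
   2048, /* gen4 */
   2048, /* gen5 */
   4096, /* gen6 */
   4096, /* gen7 */
};

/* Look up the estimate for a hardware generation, clamping to the
 * first or last table entry for generations outside the table. */
int
max_prim_size_for_gen(int gen)
{
   int last = (int)(sizeof(estimated_max_prim_size) /
                    sizeof(estimated_max_prim_size[0])) - 1;
   int idx = gen - 4;

   if (idx < 0)
      idx = 0;
   if (idx > last)
      idx = last;
   return estimated_max_prim_size[idx];
}

/* Returns 1 when the batch should be flushed before emitting another
 * primitive, i.e. when the free space is below the estimate. */
int
batch_needs_flush(int bytes_free, int gen)
{
   return bytes_free < max_prim_size_for_gen(gen);
}
```

Adding a new generation then means appending one table entry instead of growing an if-ladder.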


Re: [Mesa-dev] Enable GLX TLS by default in Mesa?

2013-08-13 Thread Kenneth Graunke

On 08/09/2013 12:26 AM, Vedran Rodic wrote:

Hi,

I've been burned with the issue of GLX TLS not being enabled by
default in Mesa (Dota 2 seems to rely on it).

What's the rationale of not enabling it by default?

Thanks,

Vedran Rodic


As far as I know, --enable-glx-tls just makes things more efficient.

Nothing should *rely* on it, or even be able to detect it...


Re: [Mesa-dev] [PATCH 1/2] glsl: Fix incorrect pattern matching in ir_set_program_inouts

2013-08-13 Thread Kenneth Graunke

On 08/09/2013 10:27 AM, Paul Berry wrote:

In commit 8fc41df ("glsl: Modify ir_set_program_inouts to handle
geometry shaders"), when attempting to pattern match the "foo" part of
expressions such as:

foo[i][j]
foo[i]

I incorrectly called as_dereference_variable() on the subexpression
"foo[i]" instead of "foo".  As a result, the pattern never matched, so
ir_set_program_inouts would fall back on marking the entire variable
as used, rather than just the portion indexed by the array.

This didn't result in incorrect behaviour, but it could have resulted
in inefficiency by causing the back-end to allocate resources for
unused parts of an input or output array.


Patch 1 is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
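
To make the bug in the quoted description concrete, here is a toy sketch of the pattern match. The node struct and both function names are invented for illustration; the real code works on Mesa's ir_dereference classes, which are not reproduced here.

```c
#include <stddef.h>

/* Toy IR: a node is either a plain variable or an array dereference
 * whose 'array' field points at the thing being indexed. */
enum node_kind { NODE_VARIABLE, NODE_ARRAY_DEREF };

struct node {
   enum node_kind kind;
   const struct node *array; /* NODE_ARRAY_DEREF only */
   const char *name;         /* NODE_VARIABLE only */
};

/* Rough analogue of as_dereference_variable(): returns the node only
 * if it is a plain variable, NULL otherwise. */
const struct node *
as_variable(const struct node *n)
{
   return (n != NULL && n->kind == NODE_VARIABLE) ? n : NULL;
}

/* Match foo[i] or foo[i][j] and return foo, or NULL if the expression
 * has another shape. */
const struct node *
match_indexed_variable(const struct node *n)
{
   const struct node *inner;

   if (n == NULL || n->kind != NODE_ARRAY_DEREF)
      return NULL;

   inner = n->array;
   /* For foo[i][j], step past the inner foo[i] first. Testing foo[i]
    * itself with as_variable() can never succeed, which is the
    * mistake the commit message describes. */
   if (inner != NULL && inner->kind == NODE_ARRAY_DEREF)
      inner = inner->array;

   return as_variable(inner);
}
```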



[Mesa-dev] [PATCH v2 2/2] radeonsi: Don't export unused clip distance vectors from vertex shader

2013-08-13 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

E.g. the Source engine seems to always write to gl_ClipVertex, but normally
doesn't enable any GL_CLIP_DISTANCEn states. This change removes some
irrelevant parts from the generated vertex shader code in such cases.

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v2: Adapt for the possibility to export clip distance vector 1 but not 0.

 src/gallium/drivers/radeonsi/radeonsi_shader.c | 10 +-
 src/gallium/drivers/radeonsi/radeonsi_shader.h |  1 +
 src/gallium/drivers/radeonsi/si_state.c|  4 
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index dd9581d..6bf4b05 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -565,6 +565,7 @@ static void si_llvm_emit_clipvertex(struct 
lp_build_tgsi_context * bld_base,
LLVMValueRef (*pos)[9], unsigned index)
 {
struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
+   struct si_pipe_shader *shader = si_shader_ctx->shader;
struct lp_build_context *base = bld_base-base;
struct lp_build_context *uint = 
si_shader_ctx-radeon_bld.soa.bld_base.uint_bld;
unsigned reg_index;
@@ -583,6 +584,11 @@ static void si_llvm_emit_clipvertex(struct 
lp_build_tgsi_context * bld_base,
for (reg_index = 0; reg_index < 2; reg_index ++) {
LLVMValueRef *args = pos[2 + reg_index];
 
+   if (!(shader->key.vs.ucps_enabled & (1 << reg_index)))
+   continue;
+
+   shader->shader.clip_dist_write |= 0xf << (4 * reg_index);
+
args[5] =
args[6] =
args[7] =
@@ -709,13 +715,15 @@ handle_semantic:
}
break;
case TGSI_SEMANTIC_CLIPDIST:
+   if (!(si_shader_ctx->shader->key.vs.ucps_enabled &
+ (1 << d->Semantic.Index)))
+   continue;
shader->clip_dist_write |=
d->Declaration.UsageMask <<
(d->Semantic.Index << 2);
target = V_008DFC_SQ_EXP_POS + 2 + 
d->Semantic.Index;
break;
case TGSI_SEMANTIC_CLIPVERTEX:
si_llvm_emit_clipvertex(bld_base, pos_args, 
index);
-   shader-clip_dist_write = 0xFF;
continue;
case TGSI_SEMANTIC_FOG:
case TGSI_SEMANTIC_GENERIC:
diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h 
b/src/gallium/drivers/radeonsi/radeonsi_shader.h
index f28a0ea..2d4468a 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
@@ -128,6 +128,7 @@ union si_shader_key {
} ps;
struct {
unsigned instance_divisors[PIPE_MAX_ATTRIBS];
+   unsigned ucps_enabled:2;
} vs;
 };
 
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 58e5a56..0fecb1d 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2040,6 +2040,10 @@ static INLINE void si_shader_selector_key(struct 
pipe_context *ctx,
for (i = 0; i < rctx->vertex_elements->count; ++i)
key->vs.instance_divisors[i] = 
rctx->vertex_elements->elements[i].instance_divisor;
 
+   if (rctx->queued.named.rasterizer->clip_plane_enable & 0xf0)
+   key->vs.ucps_enabled |= 0x2;
+   if (rctx->queued.named.rasterizer->clip_plane_enable & 0xf)
+   key->vs.ucps_enabled |= 0x1;
} else if (sel->type == PIPE_SHADER_FRAGMENT) {
if (sel->fs_write_all)
key->ps.nr_cbufs = rctx->framebuffer.nr_cbufs;
-- 
1.8.4.rc2
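
The two bit manipulations in this patch can be tried in isolation. The sketch below restates them as standalone functions; the function names are hypothetical, while the masks (0xf0, 0xf, and 0xf shifted by 4 * reg_index) are the ones from the diff.

```c
/* Collapse the 8-bit clip_plane_enable mask into the 2-bit
 * ucps_enabled shader key: bit 0 covers planes 0-3, bit 1 covers
 * planes 4-7. */
unsigned
ucps_enabled_from_planes(unsigned clip_plane_enable)
{
   unsigned ucps_enabled = 0;

   if (clip_plane_enable & 0xf0)
      ucps_enabled |= 0x2;
   if (clip_plane_enable & 0xf)
      ucps_enabled |= 0x1;
   return ucps_enabled;
}

/* clip_dist_write mask for the gl_ClipVertex path: four channels per
 * enabled clip distance vector, skipping disabled vectors. */
unsigned
clip_dist_write_mask(unsigned ucps_enabled)
{
   unsigned mask = 0;
   unsigned reg_index;

   for (reg_index = 0; reg_index < 2; reg_index++) {
      if (!(ucps_enabled & (1u << reg_index)))
         continue;
      mask |= 0xfu << (4 * reg_index);
   }
   return mask;
}
```

With no clip planes enabled, both functions return 0, which corresponds to the case from the commit message where nothing needs to be exported.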



[Mesa-dev] [PATCH v2 1/2] radeonsi: Don't leave gaps between position exports from vertex shader

2013-08-13 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

If the vertex shader exports clip distances but not point size, use
position exports 1/2 instead of 2/3 for the clip distances. Fixes
geometry corruption in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---

v2: No need to export unused position vectors, just export to consecutive
position export slots.

 src/gallium/drivers/radeonsi/radeonsi_shader.c | 135 +++--
 src/gallium/drivers/radeonsi/radeonsi_shader.h |   1 +
 src/gallium/drivers/radeonsi/si_state_draw.c   |   6 +-
 3 files changed, 83 insertions(+), 59 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index fee6262..dd9581d 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -562,12 +562,11 @@ static void si_alpha_test(struct lp_build_tgsi_context 
*bld_base,
 }
 
 static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
-   unsigned index)
+   LLVMValueRef (*pos)[9], unsigned index)
 {
struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
struct lp_build_context *base = bld_base-base;
struct lp_build_context *uint = 
si_shader_ctx-radeon_bld.soa.bld_base.uint_bld;
-   LLVMValueRef args[9];
unsigned reg_index;
unsigned chan;
unsigned const_chan;
@@ -582,6 +581,8 @@ static void si_llvm_emit_clipvertex(struct 
lp_build_tgsi_context * bld_base,
}
 
for (reg_index = 0; reg_index < 2; reg_index ++) {
+   LLVMValueRef *args = pos[2 + reg_index];
+
args[5] =
args[6] =
args[7] =
@@ -612,10 +613,6 @@ static void si_llvm_emit_clipvertex(struct 
lp_build_tgsi_context * bld_base,
args[3] = lp_build_const_int32(base-gallivm,
   V_008DFC_SQ_EXP_POS + 2 + 
reg_index);
args[4] = uint-zero;
-   lp_build_intrinsic(base-gallivm-builder,
-  llvm.SI.export,
-  
LLVMVoidTypeInContext(base-gallivm-context),
-  args, 9);
}
 }
 
@@ -630,17 +627,18 @@ static void si_llvm_emit_epilogue(struct 
lp_build_tgsi_context * bld_base)
struct tgsi_parse_context *parse = si_shader_ctx-parse;
LLVMValueRef args[9];
LLVMValueRef last_args[9] = { 0 };
+   LLVMValueRef pos_args[4][9] = { { 0 } };
unsigned semantic_name;
unsigned color_count = 0;
unsigned param_count = 0;
int depth_index = -1, stencil_index = -1;
+   int i;
 
while (!tgsi_parse_end_of_tokens(parse)) {
struct tgsi_full_declaration *d =
parse-FullToken.FullDeclaration;
unsigned target;
unsigned index;
-   int i;
 
tgsi_parse_token(parse);
 
@@ -716,7 +714,7 @@ handle_semantic:
target = V_008DFC_SQ_EXP_POS + 2 + 
d-Semantic.Index;
break;
case TGSI_SEMANTIC_CLIPVERTEX:
-   si_llvm_emit_clipvertex(bld_base, index);
+   si_llvm_emit_clipvertex(bld_base, pos_args, 
index);
shader-clip_dist_write = 0xFF;
continue;
case TGSI_SEMANTIC_FOG:
@@ -734,9 +732,13 @@ handle_semantic:
 
si_llvm_init_export_args(bld_base, d, index, target, 
args);
 
-   if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX ?
-   (semantic_name == TGSI_SEMANTIC_POSITION) :
-   (semantic_name == TGSI_SEMANTIC_COLOR)) {
+   if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX &&
+   target >= V_008DFC_SQ_EXP_POS &&
+   target <= (V_008DFC_SQ_EXP_POS + 3)) {
+   memcpy(pos_args[target - V_008DFC_SQ_EXP_POS],
+  args, sizeof(args));
+   } else if (si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT &&
+  semantic_name == TGSI_SEMANTIC_COLOR) {
if (last_args[0]) {

lp_build_intrinsic(base-gallivm-builder,
   llvm.SI.export,
@@ -806,66 +808,87 @@ handle_semantic:
memcpy(last_args, args, sizeof(args));
}
 
-   if (!last_args[0]) {
-   assert(si_shader_ctx-type == TGSI_PROCESSOR_FRAGMENT);
-
-
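
The rest of this patch is cut off in the archive, so the following is not recovered from it. It is only a sketch of the idea stated in the commit message (export the used position vectors to consecutive slots), with a hypothetical function name and a plain int array standing in for pos_args.

```c
#define NUM_POS_SLOTS 4

/* used[i] is nonzero when logical position vector i is written by the
 * shader (0 = position, 1 = point size etc., 2/3 = clip distances).
 * target[i] receives the consecutive hardware export slot for that
 * vector, or -1 if it is not exported. Returns the export count. */
int
assign_pos_export_slots(const int used[NUM_POS_SLOTS],
                        int target[NUM_POS_SLOTS])
{
   int next = 0;
   int i;

   for (i = 0; i < NUM_POS_SLOTS; i++)
      target[i] = used[i] ? next++ : -1;
   return next;
}
```

With position and both clip distance vectors used but no point size, the clip distances land in slots 1 and 2 rather than 2 and 3, which is the case the commit message calls out.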

Re: [Mesa-dev] segfault in pstip_bind_sampler_states

2013-08-13 Thread Kevin H. Hobbs
On 08/13/2013 11:45 AM, Kevin H. Hobbs wrote:
 On 08/13/2013 09:50 AM, Brian Paul wrote:
 On 08/12/2013 11:30 AM, Kevin H. Hobbs wrote:

 --30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) 
 - exiting

 Well, that's not too helpful.
 

Ha! I can move the segfault all the way back to :

Program received signal SIGSEGV, Segmentation fault.
0x7fffe0ba6d43 in osmesa_st_framebuffer_flush_front
(stctx=0x1518ee0, stfbi=0x1520560,
statt=ST_ATTACHMENT_FRONT_LEFT) at osmesa.c:305
305         u_box_2d(0, 0, res->width0, res->height0, &box);

if I make the magnification on the vtkRenderLargeImage instance high enough.






Re: [Mesa-dev] [PATCH 3/9] glsl: Emit errors for things that look like default precision statements

2013-08-13 Thread Kenneth Graunke

On 08/09/2013 04:38 PM, Ian Romanick wrote:

From: Ian Romanick ian.d.roman...@intel.com

Previously we would emit a warning for empty declarations like

float;

We would also emit the same warning for things like

highp float;

However, this second case is most likely the application trying to set the
default precision.  We should instead generate an error.

Fixes piglit precision-05.vert.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org


AMD successfully compiles precision-05.vert, so I think this probably 
needs to be allowed.




Re: [Mesa-dev] [PATCH 2/4] nv50: implement new float comparison instructions

2013-08-13 Thread Christoph Bumiller
On 13.08.2013 19:04, srol...@vmware.com wrote:
 From: Roland Scheidegger srol...@vmware.com

 untested.

Looks like it should work though, thanks.
nv50 only supported u32 result all along and on nvc0 both cases are
already handled by the rest of the code, too.

 ---
  .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp |   17 +
  1 file changed, 17 insertions(+)

 diff --git a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp 
 b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
 index 56eccac..a2ad9f4 100644
 --- a/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
 +++ b/src/gallium/drivers/nv50/codegen/nv50_ir_from_tgsi.cpp
 @@ -440,6 +440,11 @@ nv50_ir::DataType Instruction::inferDstType() const
 switch (getOpcode()) {
 case TGSI_OPCODE_F2U: return nv50_ir::TYPE_U32;
 case TGSI_OPCODE_F2I: return nv50_ir::TYPE_S32;
 +   case TGSI_OPCODE_FSEQ:
 +   case TGSI_OPCODE_FSGE:
 +   case TGSI_OPCODE_FSLT:
 +   case TGSI_OPCODE_FSNE:
 +  return nv50_ir::TYPE_U32;
 case TGSI_OPCODE_I2F:
 case TGSI_OPCODE_U2F:
return nv50_ir::TYPE_F32;
 @@ -456,19 +461,23 @@ nv50_ir::CondCode Instruction::getSetCond() const
 case TGSI_OPCODE_SLT:
 case TGSI_OPCODE_ISLT:
 case TGSI_OPCODE_USLT:
 +   case TGSI_OPCODE_FSLT:
return CC_LT;
 case TGSI_OPCODE_SLE:
return CC_LE;
 case TGSI_OPCODE_SGE:
 case TGSI_OPCODE_ISGE:
 case TGSI_OPCODE_USGE:
 +   case TGSI_OPCODE_FSGE:
return CC_GE;
 case TGSI_OPCODE_SGT:
return CC_GT;
 case TGSI_OPCODE_SEQ:
 case TGSI_OPCODE_USEQ:
 +   case TGSI_OPCODE_FSEQ:
return CC_EQ;
 case TGSI_OPCODE_SNE:
 +   case TGSI_OPCODE_FSNE:
return CC_NEU;
 case TGSI_OPCODE_USNE:
return CC_NE;
 @@ -556,6 +565,10 @@ static nv50_ir::operation translateOpcode(uint opcode)
 NV50_IR_OPCODE_CASE(KILL_IF, DISCARD);
  
 NV50_IR_OPCODE_CASE(F2I, CVT);
 +   NV50_IR_OPCODE_CASE(FSEQ, SET);
 +   NV50_IR_OPCODE_CASE(FSGE, SET);
 +   NV50_IR_OPCODE_CASE(FSLT, SET);
 +   NV50_IR_OPCODE_CASE(FSNE, SET);
 NV50_IR_OPCODE_CASE(IDIV, DIV);
 NV50_IR_OPCODE_CASE(IMAX, MAX);
 NV50_IR_OPCODE_CASE(IMIN, MIN);
 @@ -2354,6 +2367,10 @@ Converter::handleInstruction(const struct 
 tgsi_full_instruction *insn)
 case TGSI_OPCODE_SLE:
 case TGSI_OPCODE_SNE:
 case TGSI_OPCODE_STR:
 +   case TGSI_OPCODE_FSEQ:
 +   case TGSI_OPCODE_FSGE:
 +   case TGSI_OPCODE_FSLT:
 +   case TGSI_OPCODE_FSNE:
 case TGSI_OPCODE_ISGE:
 case TGSI_OPCODE_ISLT:
 case TGSI_OPCODE_USEQ:
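
As background for the TYPE_U32 and CC_NEU choices in the quoted patch, this sketch contrasts the two result conventions in plain C. It reflects my understanding of the TGSI semantics (legacy SLT yields 1.0 or 0.0, the new FSLT/FSNE yield an all-ones or all-zeros integer); the function names are hypothetical.

```c
#include <stdint.h>

/* Legacy SLT-style comparison: the result is a float, 1.0f or 0.0f. */
float
set_lt_float(float a, float b)
{
   return a < b ? 1.0f : 0.0f;
}

/* FSLT-style comparison: float operands, 32-bit integer mask result
 * (all ones for true, zero for false). */
uint32_t
set_lt_mask(float a, float b)
{
   return a < b ? UINT32_MAX : 0;
}

/* FSNE-style comparison. C's != on floats is already true when either
 * operand is NaN, which matches a "not equal or unordered" condition. */
uint32_t
set_ne_unordered_mask(float a, float b)
{
   return a != b ? UINT32_MAX : 0;
}
```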



Re: [Mesa-dev] Enable GLX TLS by default in Mesa?

2013-08-13 Thread Vedran Rodic
On Tue, Aug 13, 2013 at 7:19 PM, Kenneth Graunke kenn...@whitecape.org wrote:


 As far as I know, --enable-glx-tls just makes things more efficient.

 Nothing should *rely* on it, or even be able to detect it...

Dota 2 crashes without that option when loading the actual game map. I
assumed it adds thread safety.


[Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Uwe Schmidt
Hi,

I have read about the issue of implementing the S3TC Extension in Mesa:
http://dri.freedesktop.org/wiki/S3TC/

As I understood, the problem is, that encoding and decoding S3TC in
software is covered by patents, while passing S3TC compressed data to the
GPU is still ok.

AS NOW:

If force_s3tc_enable is enabled in Mesa3D, uploading an S3TC encoded
texture works if format==internalFormat is true. If format!=internalFormat
is true, it would fail (as far as I know).

SO MY PROPOSAL:

If 'format' is one of the S3TC types, and format!=internalFormat is true,
then set internalFormat:=format.

Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't,
set internalFormat:=format (or any other format, Mesa3D can encode).




[Mesa-dev] [PATCH 2/2] i965: Add Gen7 depth stall flushes before disabling depth in BLORP.

2013-08-13 Thread Kenneth Graunke
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp 
b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index 518d7f5..44e7578 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -756,6 +756,8 @@ static void
 gen7_blorp_emit_depth_disable(struct brw_context *brw,
   const brw_blorp_params *params)
 {
+   intel_emit_depth_stall_flushes(brw);
+
BEGIN_BATCH(7);
OUT_BATCH(GEN7_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2));
OUT_BATCH(BRW_DEPTHFORMAT_D32_FLOAT << 18 | (BRW_SURFACE_NULL << 29));
-- 
1.8.3.4



[Mesa-dev] [PATCH 1/2] i965: Add Gen6 depth stall flushes before disabling depth in BLORP.

2013-08-13 Thread Kenneth Graunke
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.

On Sandybridge, this also requires the post_sync_nonzero flush.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen6_blorp.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp 
b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index a4a9081..129c113 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -914,6 +914,9 @@ static void
 gen6_blorp_emit_depth_disable(struct brw_context *brw,
   const brw_blorp_params *params)
 {
+   intel_emit_post_sync_nonzero_flush(brw);
+   intel_emit_depth_stall_flushes(brw);
+
BEGIN_BATCH(7);
OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2));
OUT_BATCH((BRW_DEPTHFORMAT_D32_FLOAT << 18) |
-- 
1.8.3.4



Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Patrick Baggett
I've been hanging on this list for a while, and this isn't the first time
this has been suggested. The general thing that is repeated is basically
this: if you make an API (e.g. OpenGL) that supports S3TC without a
license, you're in trouble, even if it is a passthrough to the hardware,
which also required a license to produce in the first place. I think the
assumption most people make is that if the hardware vendor paid a license
to implement S3TC in an ASIC, then surely simply passing through data is
OK. After all, it is being done without any knowledge of the algorithm,
etc. From a common sense standpoint, I would agree.
However, the note in the S3TC extension itself[1] mentions explicitly
to be wary of such assumptions in the IP Status section, and notes that *a
license for one API is not a license for another*. This implies that for an
API to make use of S3TC, it requires a license, which Mesa in general, does
not have, while a hardware vendor might. All of this is theoretical as far
as I've read; I don't think anyone has legally challenged this for open
source drivers and posted the results on this mailing list; most people have
simply stayed away from it. I think the patent was granted in
1999, so at least in the USA, hopefully we don't have too many more years
of this garbage.

Patrick

[1] http://www.opengl.org/registry/specs/EXT/texture_compression_s3tc.txt


On Tue, Aug 13, 2013 at 1:53 PM, Uwe Schmidt 
simon.schm...@cs-systemberatung.de wrote:

 Hi,

 I have read about the issue of implementing the S3TC Extension in Mesa:
 http://dri.freedesktop.org/wiki/S3TC/

 As I understood, the problem is, that encoding and decoding S3TC in
 software is covered by patents, while passing S3TC compressed data to the
 GPU is still ok.

 AS NOW:

 If force_s3tc_enable is enabled in Mesa3D, uploading a S3TC encoded
 texture works if format==internalFormat is true. If format!=internalFormat
 is true, it would fail (as i know).

 SO MY PROPOSAL:

 If 'format' is one of the S3TC types, and format!=internalFormat is true,
 then set internalFormat:=format.

 Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't,
 set internalFormat:=format (or any other format, Mesa3D can encode).





Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Ian Romanick

On 08/13/2013 11:53 AM, Uwe Schmidt wrote:

Hi,

I have read about the issue of implementing the S3TC Extension in Mesa:
http://dri.freedesktop.org/wiki/S3TC/

As I understood, the problem is, that encoding and decoding S3TC in
software is covered by patents, while passing S3TC compressed data to the
GPU is still ok.


It's all patented.  Some hardware vendors have licenses for things their 
hardware does.  There's a thing called contributory infringement, too. 
Please don't play arm-chair IP attorney.



AS NOW:

If force_s3tc_enable is enabled in Mesa3D, uploading a S3TC encoded
texture works if format==internalFormat is true. If format!=internalFormat
is true, it would fail (as i know).


It doesn't fail.  Mesa just leaves the data uncompressed.  The only 
failure occurs if the application calls glGetCompressedTexImage to get 
the compressed data back.



SO MY PROPOSAL:

If 'format' is one of the S3TC types, and format!=internalFormat is true,
then set internalFormat:=format.


'format' cannot be a compressed type.  Compressed data can only be 
supplied using glCompressedTexImage2D, and that function only has an 
internalFormat parameter.



Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't,
set internalFormat:=format (or any other format, Mesa3D can encode).


The only format that Mesa can encode is FXT1.  Only Intel hardware 
supports FXT1, and the quality (of Mesa's compressor) is not very good. 
 Picking that format would result in bug reports of "game XYZ looks 
horrible on Intel graphix you suck."  So that leaves us with the only 
option of leaving the data uncompressed.


Until S3 grants its IP to OIN or the patents expire, this is going to 
be the situation.  We've been through this mental exercise over the last 5 
years more times than I can count, and we always come back to the same 
place.
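
The behaviour described in this reply (an S3TC internalFormat with uncompressed source data ends up stored uncompressed when no encoder is available) can be sketched as a small decision function. The enum values are the standard ones from the GL headers and the EXT_texture_compression_s3tc spec; is_s3tc_format and choose_storage_format are hypothetical names and do not correspond to Mesa's actual format-selection code.

```c
#include <stdbool.h>

/* Enum values as defined by OpenGL and EXT_texture_compression_s3tc. */
#define GL_RGB                            0x1907
#define GL_RGBA                           0x1908
#define GL_COMPRESSED_RGB_S3TC_DXT1_EXT   0x83F0
#define GL_COMPRESSED_RGBA_S3TC_DXT1_EXT  0x83F1
#define GL_COMPRESSED_RGBA_S3TC_DXT3_EXT  0x83F2
#define GL_COMPRESSED_RGBA_S3TC_DXT5_EXT  0x83F3

bool
is_s3tc_format(unsigned internal_format)
{
   return internal_format >= GL_COMPRESSED_RGB_S3TC_DXT1_EXT &&
          internal_format <= GL_COMPRESSED_RGBA_S3TC_DXT5_EXT;
}

/* Pick the format actually used for storage when the application
 * passes uncompressed data with the given internalFormat. Without an
 * encoder, S3TC requests fall back to plain RGB or RGBA. */
unsigned
choose_storage_format(unsigned internal_format, bool have_s3tc_encoder)
{
   if (!is_s3tc_format(internal_format) || have_s3tc_encoder)
      return internal_format;

   return internal_format == GL_COMPRESSED_RGB_S3TC_DXT1_EXT ?
          GL_RGB : GL_RGBA;
}
```

Under this sketch the upload itself succeeds either way; only reading the data back in compressed form would have nothing to return.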







Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)

2013-08-13 Thread Ian Romanick

On 08/13/2013 07:49 AM, Siarhei Siamashka wrote:

On Thu, 08 Aug 2013 16:19:28 -0700
Ian Romanick i...@freedesktop.org wrote:


On 08/06/2013 02:13 PM, Siarhei Siamashka wrote:

Hello,

Some months ago, the commit "configure.ac: Allow OpenGL ES1 and ES2 only
with enabled OpenGL" dropped support for the OpenGL-free configuration.

  http://lists.freedesktop.org/archives/mesa-dev/2013-February/033909.html
  
http://lists.freedesktop.org/archives/mesa-commit/2013-February/041708.html

Could this be possibly reverted to allow me to continue shooting
myself in the foot? The support for OpenGL ES is pretty horrible
in the open source software. One nice exception is Qt5 which is doing
pretty well. But the rest of the software does not generally work out
of the box without patches or tweaks. You can also hardly find a
problem-free OpenGL ES compatible open source game (other than Quake3).

I have an open feature request for Gentoo, which is a very configurable
Linux distribution and should not have any troubles working either with
or without OpenGL (the choice is up to the user):

  https://bugs.gentoo.org/show_bug.cgi?id=476524

But if upstream Mesa treats this configuration as unsupported, then I
also don't see it progressing anywhere in Gentoo. So could you please
re-consider this decision?


We've removed all of the #ifdef code inside Mesa that would have made
any difference.  It was a nightmare to maintain, and we almost always
got it wrong... because nobody was testing that configuration.


I believe this can be changed :-) That's a bit of a chicken/egg problem.
The OpenGL ES support in free software applications and libraries is
so broken, that it's currently a big pain to try this configuration for
anything practical. And the applications/libraries can't be fixed
without having a non-OpenGL environment for development and testing.


What does that have to do with building Mesa without desktop GL?  Build 
Mesa *with* ES, and develop your software.



The needed tweaks for Mesa are really trivial. Maybe one could also
just compile everything, but delete GL headers, gl.pc and libGL.so
after compilation and before installing Mesa to the system. Still it
is a bit ugly to have the configure script claim that OpenGL ES is not
supported without OpenGL, while in fact it works.


It's ugly once at package-time instead of ugly continuously at 
development time.



The only thing this is possibly going to gain you is a trivial amount
of build time (by not building libGL, etc.).


The compilation time is irrelevant. But it is very useful to be able to
install Mesa without OpenGL headers and without libGL.so, so that the
problematic software just fails at compile time instead of exhibiting
hard to debug problems at runtime.

It seems to be a rather common failure scenario when some big bloatware
application loads both libGL.so (provided by Mesa) and libGLESv2.so
(provided by some proprietary OpenGL ES driver on ARM hardware) into
the same process via indirect library dependencies. These shared
libraries are providing overlapping function names, but are backed by
totally different implementations. And everything blows up as a result
when the application is run, or maybe it even mostly works if you are
lucky.

What's the point installing both Mesa and the proprietary OpenGL ES
drivers on the same system? I would surely love to have open source
hardware accelerated OpenGL ES drivers on ARM systems today. But they
are not quite here yet. And even assuming that we get perfectly
functional free software OpenGL ES drivers for embedded hardware,
the current buggy applications are not going to be magically fixed
themselves. Somebody still needs to debug and fix the OpenGL ES
compatibility problems.


This all sounds like a packaging problem.  It should be fixed in the 
packaging, not in the upstream project.



The easiest way forward seems to be just allowing to compile Mesa
without desktop OpenGL. It is going to provide:
1. On x86 desktop systems - the development environment for testing
OpenGL ES applications.
2. On ARM hardware via softpipe/llvmpipe - some reference fallback
implementation.
3. Have both the existing proprietary drivers and Mesa installed on
ARM hardware (with the ability to switch between them at any time) -
the applications can run at full speed and be profiled/benchmarked.

Somebody may argue that I'm exaggerating and OpenGL ES support seems
to be not so bad. There were many OpenGL ES related news and
announcements. Also there exists Linaro/Ubuntu distribution and some
videos on youtube showing how it successfully runs something in 3D on
ARM. Still the problem is that in many applications the said OpenGL ES
support is either in the work-in-progress state, or it possibly has
been contributed by somebody some time ago and has already bitrotten.
Also Linaro bundles a bunch of OpenGL ES hacks, which don't seem to be
actively pushed upstream. This all is less than perfect and needs to be
improved. That 

Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Roland Mainz
On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org wrote:
 On 08/13/2013 11:53 AM, Uwe Schmidt wrote:
[snip]
 SO MY PROPOSAL:

 If 'format' is one of the S3TC types, and format!=internalFormat is true,
 then set internalFormat:=format.

 'format' cannot be a compressed type.  Compressed data can only be supplied
 using glCompressedTexImage2D, and that function only has an internalFormat
 parameter.


 Else, if 'internalFormat' is one of the S3TC types, but the 'format'
 isn't,
 set internalFormat:=format (or any other format, Mesa3D can encode).


 The only format that Mesa can encode is FXT1.  Only Intel hardware supports
 FXT1, and the quality (of Mesa's compressor) is not very good.  Picking that
 format would result in bug reports of game XYZ looks horrible on Intel
 graphix you suck.  So that leaves us with the only option of leaving the
 data uncompressed.

 Until S3 grants it's IP to OIN or the patents expire, this is going to be
 the situation.  We've been through this mental exercise of the last 5 years
 more times than I can count, and we always come back to the same place.

Please don't hit me with a stick if this has been asked
|powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code
(sample implementation, ... do not use without a license from S3 and
the ritual scarification of at least one software engineer...) but
having it off in the default build won't work... right ?



Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, CJAVASunUnix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)


Re: [Mesa-dev] segfault in pstip_bind_sampler_states

2013-08-13 Thread Kevin H. Hobbs
On 08/13/2013 01:41 PM, Kevin H. Hobbs wrote:
 
 Ha! I can move the segfault all the way back to :
 
 Program received signal SIGSEGV, Segmentation fault.
 0x7fffe0ba6d43 in osmesa_st_framebuffer_flush_front
 (stctx=0x1518ee0, stfbi=0x1520560,
 statt=ST_ATTACHMENT_FRONT_LEFT) at osmesa.c:305
 305        u_box_2d(0, 0, res->width0, res->height0, &box);
 
 if I make the magnification on the vtkRenderLargeImage instance high enough.
 

This post was all wet; ignore it.






Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Ian Romanick

On 08/13/2013 01:27 PM, Roland Mainz wrote:

On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org wrote:

On 08/13/2013 11:53 AM, Uwe Schmidt wrote:

[snip]

SO MY PROPOSAL:

If 'format' is one of the S3TC types, and format!=internalFormat is true,
then set internalFormat:=format.


'format' cannot be a compressed type.  Compressed data can only be supplied
using glCompressedTexImage2D, and that function only has an internalFormat
parameter.



Else, if 'internalFormat' is one of the S3TC types, but the 'format'
isn't,
set internalFormat:=format (or any other format, Mesa3D can encode).



The only format that Mesa can encode is FXT1.  Only Intel hardware supports
FXT1, and the quality (of Mesa's compressor) is not very good.  Picking that
format would result in bug reports of "game XYZ looks horrible on Intel
graphix you suck."  So that leaves us with the only option of leaving the
data uncompressed.

Until S3 grants its IP to OIN or the patents expire, this is going to be
the situation.  We've been through this mental exercise over the last 5 years
more times than I can count, and we always come back to the same place.


Please don't hit me with a stick if this has been asked
|powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code
(sample implementation, ... do not use without a license from S3 and
the ritual scarification of at least one software engineer...) but
having it off in the default build won't work... right ?


That is more difficult for end-users than the current situation.  In the 
current situation, your distro can build Mesa (no S3TC), and, if you 
live in a country without software patents, you can just drop in the 
libtxc_dxtn library to get compression.  Putting it in Mesa, along with 
making the distros really uncomfortable, would mean you'd have to 
rebuild Mesa.


Did I mention that we've been through this mental exercise a few times?




Bye,
Roland





[Mesa-dev] [PATCH] glsl: Emit better warnings for things that look like default precision statements

2013-08-13 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Previously we would emit a warning for empty declarations like

float;

We would also emit the same warning for things like

highp float;

However, this second case is most likely the application trying to set
the default precision.  This makes the compiler generate a stronger
warning with some suggestion of a fix.

It really seems like this should be an error.  I'll bet that 100% of the
time someone writes 'highp float;' they actually meant 'precision highp
float;'.  Alas, both AMD and NVIDIA accept this syntax, and the spec
doesn't explicitly forbid it.

This makes piglit's precision-05.vert generate the following warnings:

0:12(11): warning: empty declaration with precision qualifier, to set the 
default precision, use `precision lowp float;'
0:13(12): warning: empty declaration with precision qualifier, to set the 
default precision, use `precision mediump int;'

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: 9.2 mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast_to_hir.cpp | 43 ++-
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 40992fb..f96b64b 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2719,6 +2719,10 @@ ast_declarator_list::hir(exec_list *instructions,
*   name of a known structure type.  This is both invalid and weird.
*   Emit an error.
*
+   * - The program text contained something like 'mediump float;'
+   *   when the programmer probably meant 'precision mediump
+   *   float;' Emit an error.
+   *
* Note that if decl_type is NULL and there is a structure involved,
* there must have been some sort of error with the structure.  In this
* case we assume that an error was already generated on this line of
@@ -2727,20 +2731,33 @@ ast_declarator_list::hir(exec_list *instructions,
*/
       assert(this->type->specifier->structure == NULL || decl_type != NULL
              || state->error);
-      if (this->type->specifier->structure == NULL) {
-         if (decl_type != NULL) {
-            _mesa_glsl_warning(&loc, state, "empty declaration");
-         } else {
-            _mesa_glsl_error(&loc, state,
-                             "invalid type `%s' in empty declaration",
-                             type_name);
-         }
-      }
 
-      if (this->type->qualifier.precision != ast_precision_none &&
-          this->type->specifier->structure != NULL) {
-         _mesa_glsl_error(&loc, state, "precision qualifiers can't be applied "
-                          "to structures");
+      if (decl_type == NULL) {
+         _mesa_glsl_error(&loc, state,
+                          "invalid type `%s' in empty declaration",
+                          type_name);
+      } else if (this->type->qualifier.precision != ast_precision_none) {
+         if (this->type->specifier->structure != NULL)
+            _mesa_glsl_error(&loc, state,
+                             "precision qualifiers can't be applied "
+                             "to structures");
+         else {
+            static const char *const precision_names[] = {
+               "highp",
+               "highp",
+               "mediump",
+               "lowp"
+            };
+
+            _mesa_glsl_warning(&loc, state,
+                               "empty declaration with precision qualifier, "
+                               "to set the default precision, use "
+                               "`precision %s %s;'",
+                               precision_names[this->type->qualifier.precision],
+                               type_name);
+         }
+      } else {
+         _mesa_glsl_warning(&loc, state, "empty declaration");
       }
    }
 
-- 
1.8.1.4



Re: [Mesa-dev] Enable GLX TLS by default in Mesa?

2013-08-13 Thread Chad Versace

On 08/13/2013 12:04 PM, Vedran Rodic wrote:

On Tue, Aug 13, 2013 at 7:19 PM, Kenneth Graunke kenn...@whitecape.org wrote:



As far as I know, --enable-glx-tls just makes things more efficient.

Nothing should *rely* on it, or even be able to detect it...


Dota 2 crashes without that option when loading the actual game map. I
assumed it adds thread safety.


Ian explained to me once that Mesa and the X server's GLX must be using
compatible TLS options, otherwise disaster occurs. I no longer recall
his explanation, or which combinations were the dangerous ones.



Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Roland Mainz
On Tue, Aug 13, 2013 at 10:33 PM, Ian Romanick i...@freedesktop.org wrote:
 On 08/13/2013 01:27 PM, Roland Mainz wrote:
 On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org
 wrote:
 On 08/13/2013 11:53 AM, Uwe Schmidt wrote:
[snip]
 Until S3 grants its IP to OIN or the patents expire, this is going to be
 the situation.  We've been through this mental exercise over the last 5
 years
 more times than I can count, and we always come back to the same place.

 Please don't hit me with a stick if this has been asked
 |powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code
 (sample implementation, ... do not use without a license from S3 and
 the ritual scarification of at least one software engineer...) but
 having it off in the default build won't work... right ?

 That is more difficult for end-users than the current situation.  In the
 current situation, your distro can build Mesa (no S3TC), and, if you live in
 a country without software patents, you can just drop in the libtxc_dxtn
 library to get compression.  Putting it in Mesa, along with making the
 distros really uncomfortable, would mean you'd have to rebuild Mesa.

Sounds reasonable to me...

 Did I mention that we've been through this mental exercise a few times?

Erm... I'm wondering... why does the S3TC issue come up every few
months out of its grave and haunt the list (and your nerves) ?



Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)


Re: [Mesa-dev] [PATCH 2/2] i965: Add Gen7 depth stall flushes before disabling depth in BLORP.

2013-08-13 Thread Chad Versace

The series is
Reviewed-by: Chad Versace chad.vers...@linux.intel.com

Yet another instance of the reason to unify blorp with normal draw.


Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Ian Romanick

On 08/13/2013 01:42 PM, Roland Mainz wrote:

On Tue, Aug 13, 2013 at 10:33 PM, Ian Romanick i...@freedesktop.org wrote:

On 08/13/2013 01:27 PM, Roland Mainz wrote:

On Tue, Aug 13, 2013 at 10:20 PM, Ian Romanick i...@freedesktop.org
wrote:

On 08/13/2013 11:53 AM, Uwe Schmidt wrote:

[snip]

Until S3 grants its IP to OIN or the patents expire, this is going to be
the situation.  We've been through this mental exercise over the last 5
years
more times than I can count, and we always come back to the same place.


Please don't hit me with a stick if this has been asked
|powl(INT64_MAX, INT64_MAX)| times... but... erm... adding the code
(sample implementation, ... do not use without a license from S3 and
the ritual scarification of at least one software engineer...) but
having it off in the default build won't work... right ?


That is more difficult for end-users than the current situation.  In the
current situation, your distro can build Mesa (no S3TC), and, if you live in
a country without software patents, you can just drop in the libtxc_dxtn
library to get compression.  Putting it in Mesa, along with making the
distros really uncomfortable, would mean you'd have to rebuild Mesa.


Sounds reasonable to me...


Did I mention that we've been through this mental exercise a few times?


Erm... I'm wondering... why does the S3TC issue come up every few
months out of its grave and haunt the list (and your nerves) ?


I didn't bring it up.  Don't ask me! :)




Bye,
Roland




Re: [Mesa-dev] Enable GLX TLS by default in Mesa?

2013-08-13 Thread Ian Romanick

On 08/13/2013 12:04 PM, Vedran Rodic wrote:

On Tue, Aug 13, 2013 at 7:19 PM, Kenneth Graunke kenn...@whitecape.org wrote:



As far as I know, --enable-glx-tls just makes things more efficient.

Nothing should *rely* on it, or even be able to detect it...


Dota 2 crashes without that option when loading the actual game map. I
assumed it adds thread safety.


With TLS the context pointer and the dispatch pointer are stored in 
thread local storage.  Looking them up (which happens on every GL call) 
is fast.


Without TLS the context pointer and the dispatch pointer are stored 
using pthread_setspecific / pthread_getspecific.  Looking them up is 
hella slow.





Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Patrick Baggett
Erm... I'm wondering... why does the S3TC issue come up every few
 months out of its grave and haunt the list (and your nerves) ?


I think it is because the issue looks deceptively simple. Hardware is
hardware, right? ASICs do the decompression, not software. Surely blindly
copying bits from one device to another *can't* be patent infringement.
Surely, right? :\

Patrick


Re: [Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported

2013-08-13 Thread Brian Paul

On 08/13/2013 11:04 AM, srol...@vmware.com wrote:

From: Roland Scheidegger srol...@vmware.com

Should get rid of some float-to-int conversions (with negation).
No piglit regressions (with llvmpipe).
---
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45 ++--
  1 file changed, 15 insertions(+), 30 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index d9b4ed2..65ba449 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -420,8 +420,6 @@ public:
 void emit_scalar(ir_instruction *ir, unsigned op,
st_dst_reg dst, st_src_reg src0, st_src_reg src1);

-   void try_emit_float_set(ir_instruction *ir, unsigned op, st_dst_reg dst);
-
 void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);

 void emit_scs(ir_instruction *ir, unsigned op,
@@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,

 this->instructions.push_tail(inst);

-   if (native_integers)
-  try_emit_float_set(ir, op, dst);
-
 return inst;
  }

@@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op)
 return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
  }

- /**
- * Emits the code to convert the result of float SET instructions to integers.
- */
-void
-glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned op,
-st_dst_reg dst)
-{
-   if ((op == TGSI_OPCODE_SEQ ||
-op == TGSI_OPCODE_SNE ||
-op == TGSI_OPCODE_SGE ||
-op == TGSI_OPCODE_SLT))
-   {
-  st_src_reg src = st_src_reg(dst);
-  src.negate = ~src.negate;
-  dst.type = GLSL_TYPE_FLOAT;
-  emit(ir, TGSI_OPCODE_F2I, dst, src);
-   }
-}
-
  /**
   * Determines whether to use an integer, unsigned integer, or float opcode
   * based on the operands and input opcode, then emits the result.
@@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
  #define case2fi(f, i)   case4(f, f, i, i)
  #define case2iu(i, u)   case4(i, LAST, i, u)

+#define casecomp(c, f, i, u) \
+   case TGSI_OPCODE_##c: \
+  if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \
+  else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \
+  else if (native_integers) \
+ op = TGSI_OPCODE_##f; \
+  else op = TGSI_OPCODE_##c; \
+  break;
+


Would you mind cleaning up the formatting of that macro...

case x:
  if (type == GLSL_TYPE_INT)
  op = ...
  else if (type == GLSL_TYPE_UINT)
  op = ...
  else if (native_integers)
  op = ...
  else
  op = ...
  break;



 switch(op) {
case2fi(ADD, UADD);
case2fi(MUL, UMUL);
@@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction *ir, 
unsigned op,
case3(MAX, IMAX, UMAX);
case3(MIN, IMIN, UMIN);
case2iu(MOD, UMOD);
-
-  case2fi(SEQ, USEQ);
-  case2fi(SNE, USNE);
-  case3(SGE, ISGE, USGE);
-  case3(SLT, ISLT, USLT);
-
+
+  casecomp(SEQ, FSEQ, USEQ, USEQ);
+  casecomp(SNE, FSNE, USNE, USNE);
+  casecomp(SGE, FSGE, ISGE, USGE);
+  casecomp(SLT, FSLT, ISLT, USLT);
+
case2iu(ISHR, USHR);

case2fi(SSG, ISSG);



Reviewed-by: Brian Paul bri...@vmware.com



Re: [Mesa-dev] [PATCH 4/4] st/mesa: use new float comparison opcodes if native integers are supported

2013-08-13 Thread Roland Scheidegger
On 13.08.2013 23:38, Brian Paul wrote:
 On 08/13/2013 11:04 AM, srol...@vmware.com wrote:
 From: Roland Scheidegger srol...@vmware.com

 Should get rid of some float-to-int conversions (with negation).
 No piglit regressions (with llvmpipe).
 ---
   src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   45
 ++--
   1 file changed, 15 insertions(+), 30 deletions(-)

 diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 index d9b4ed2..65ba449 100644
 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
 @@ -420,8 +420,6 @@ public:
  void emit_scalar(ir_instruction *ir, unsigned op,
   st_dst_reg dst, st_src_reg src0, st_src_reg src1);

 -   void try_emit_float_set(ir_instruction *ir, unsigned op,
 st_dst_reg dst);
 -
  void emit_arl(ir_instruction *ir, st_dst_reg dst, st_src_reg src0);

  void emit_scs(ir_instruction *ir, unsigned op,
 @@ -594,9 +592,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir,
 unsigned op,

  this->instructions.push_tail(inst);

 -   if (native_integers)
 -  try_emit_float_set(ir, op, dst);
 -
  return inst;
   }

 @@ -622,25 +617,6 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir,
 unsigned op)
  return emit(ir, op, undef_dst, undef_src, undef_src, undef_src);
   }

 - /**
 - * Emits the code to convert the result of float SET instructions to
 integers.
 - */
 -void
 -glsl_to_tgsi_visitor::try_emit_float_set(ir_instruction *ir, unsigned
 op,
 - st_dst_reg dst)
 -{
 -   if ((op == TGSI_OPCODE_SEQ ||
 -op == TGSI_OPCODE_SNE ||
 -op == TGSI_OPCODE_SGE ||
 -op == TGSI_OPCODE_SLT))
 -   {
 -  st_src_reg src = st_src_reg(dst);
 -  src.negate = ~src.negate;
 -  dst.type = GLSL_TYPE_FLOAT;
 -  emit(ir, TGSI_OPCODE_F2I, dst, src);
 -   }
 -}
 -
   /**
* Determines whether to use an integer, unsigned integer, or float
 opcode
* based on the operands and input opcode, then emits the result.
 @@ -672,6 +648,15 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction
 *ir, unsigned op,
   #define case2fi(f, i)   case4(f, f, i, i)
   #define case2iu(i, u)   case4(i, LAST, i, u)

 +#define casecomp(c, f, i, u) \
 +   case TGSI_OPCODE_##c: \
 +  if (type == GLSL_TYPE_INT) op = TGSI_OPCODE_##i; \
 +  else if (type == GLSL_TYPE_UINT) op = TGSI_OPCODE_##u; \
 +  else if (native_integers) \
 + op = TGSI_OPCODE_##f; \
 +  else op = TGSI_OPCODE_##c; \
 +  break;
 +
 
 Would you mind cleaning up the formatting of that macro...
 
 case x:
   if (type == GLSL_TYPE_INT)
   op = ...
   else if (type == GLSL_TYPE_UINT)
   op = ...
   else if (native_integers)
   op = ...
   else
   op = ...
   break;
 
 
OK. I copied it from the case4 macro right above it; that's why only one
case (the new one) has any indentation :-).

Roland



  switch(op) {
 case2fi(ADD, UADD);
 case2fi(MUL, UMUL);
 @@ -680,12 +665,12 @@ glsl_to_tgsi_visitor::get_opcode(ir_instruction
 *ir, unsigned op,
 case3(MAX, IMAX, UMAX);
 case3(MIN, IMIN, UMIN);
 case2iu(MOD, UMOD);
 -
 -  case2fi(SEQ, USEQ);
 -  case2fi(SNE, USNE);
 -  case3(SGE, ISGE, USGE);
 -  case3(SLT, ISLT, USLT);
 -
 +
 +  casecomp(SEQ, FSEQ, USEQ, USEQ);
 +  casecomp(SNE, FSNE, USNE, USNE);
 +  casecomp(SGE, FSGE, ISGE, USGE);
 +  casecomp(SLT, FSLT, ISLT, USLT);
 +
 case2iu(ISHR, USHR);

 case2fi(SSG, ISSG);

 
 Reviewed-by: Brian Paul bri...@vmware.com


Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Marek Olšák
I don't think this is our problem. If a distro wants S3TC support, its
maintainers can package libtxc_dxtn. Some distros really do it. If a
distro doesn't want S3TC support, there is nothing we can do about it.

Marek

On Tue, Aug 13, 2013 at 8:53 PM, Uwe Schmidt
simon.schm...@cs-systemberatung.de wrote:
 Hi,

 I have read about the issue of implementing the S3TC Extension in Mesa:
 http://dri.freedesktop.org/wiki/S3TC/

 As I understood, the problem is, that encoding and decoding S3TC in
 software is covered by patents, while passing S3TC compressed data to the
 GPU is still ok.

 AS NOW:

 If force_s3tc_enable is enabled in Mesa3D, uploading a S3TC encoded
 texture works if format==internalFormat is true. If format!=internalFormat
 is true, it would fail (as i know).

 SO MY PROPOSAL:

 If 'format' is one of the S3TC types, and format!=internalFormat is true,
 then set internalFormat:=format.

 Else, if 'internalFormat' is one of the S3TC types, but the 'format' isn't,
 set internalFormat:=format (or any other format, Mesa3D can encode).




Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-13 Thread Kristian Høgsberg
On Mon, Aug 12, 2013 at 3:05 PM, Marek Olšák mar...@gmail.com wrote:
 On Mon, Aug 12, 2013 at 11:36 PM, Stéphane Marchesin
 stephane.marche...@gmail.com wrote:
 Other than hybrid systems (of which
 there are none with i915 graphics), is there any case where
 __DRI_IMAGE_USE_SHARE can occur?

 You could do interesting things like cross-process sharing with it. I
 think it's worth doing it, no matter what. It's easy to pick up now,
 and hard to fix up later.

 Cross-process sharing is mandatory already and exposed via
 resource_from_handle and resource_get_handle. I don't think this is
 useful for cross-process sharing anyway, because it disables tiling.

No, we need a different flag for this.  I can't speak to the gallium
flag, but the __DRI_IMAGE_USE_SHARE flag is used for same-GPU
cross-process sharing under Wayland, either using GEM names or Prime fd
passing.  We can't drop tiling in this case; there's an obvious
performance penalty.  We need a flag to indicate that a buffer will be
used on multiple GPUs.  The next level up in the stack
(src/egl/drivers/dri2/platform_wayland.c in the case of Wayland) needs to
know whether or not a buffer will be used in this way and pass the
flag when applicable.

Kristian


Re: [Mesa-dev] [PATCH v2] gbm: fix linking

2013-08-13 Thread Bryce W. Harrington
Armin K wrote:
 Link to internal libwayland-drm library if Wayland
 EGL platform is enabled. The library needs to be
 built before gbm.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67962

In going through the wayland manual build directions[1] the other day, I
hit this bug when trying to build weston against current git mesa.  This
patch fixed the build breakage.  So...

Tested-by: Bryce Harrington b.harring...@samsung.com

Bryce

1: http://wayland.freedesktop.org/building.html


Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-13 Thread Vedran Rodic
On Mon, Aug 12, 2013 at 3:07 PM,  ville.syrj...@linux.intel.com wrote:
 From: Ville Syrjälä ville.syrj...@linux.intel.com

 IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
 so let's make use of it.

According to the discussion we had on #intel-gfx a few weeks ago, on
IVB all Mesa memory is already marked as cached in DRM allocated PTEs.
So this should not have any effect. Or I'm misunderstanding something.

As I understand, marking everything uncacheable and then marking just
certain things cacheable could make a difference (since AFAIK, you
can't mark select regions as uncacheable after you mark PTEs as
cacheable on IVB).

Can somebody more knowledgeable comment?


[Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.

2013-08-13 Thread Kenneth Graunke
128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
Cc: Topi Pohjolainen topi.pohjolai...@intel.com
Cc: Chad Versace chad.vers...@linux.intel.com
Cc: Paul Berry stereotype...@gmail.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d6643ca..86a2d53 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
    if (brw->gen < 6)
       return I915_TILING_X;
 
+   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
+    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
+    *  or Linear."
+    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
+    * all the way back to 965, but is explicitly permitted on Gen7.
+    */
+   if (brw->gen != 7 && mt->cpp >= 16)
+      return I915_TILING_X;
+
    return I915_TILING_Y | I915_TILING_X;
 }
 
-- 
1.8.3.4



Re: [Mesa-dev] [PATCH] glsl: Emit better warnings for things that look like default precision statements

2013-08-13 Thread Matt Turner
On Tue, Aug 13, 2013 at 1:35 PM, Ian Romanick i...@freedesktop.org wrote:
 and the spec doesn't explicitly forbid it.

I was surprised by this, so I verified it.

In the GLSL ES 3.0 spec:

single_declaration
 fully_specified_type
  type_specifier
   precision_qualifier type_specifier_no_prec

precision_qualifier
 highp, mediump, lowp

type_specifier_no_prec
 type_specifier_nonarray
  expands to list of built-in types

Seems weird, but legitimate.

Have we actually seen 'highp float;' in the wild (outside of piglit)?

Assuming that the two instances of highp in precision_names is
intentional (or was not, but is fixed)

Reviewed-by: Matt Turner matts...@gmail.com


Re: [Mesa-dev] [PATCH 8/9] glsl: Merge precision qualifiers too

2013-08-13 Thread Matt Turner
On Fri, Aug 9, 2013 at 4:38 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 We never noticed this before because we previously didn't enforce GLSL ES
 fragment shader requirements that precision be defined.  There may also
 have been some interaction here with the addition of
 GL_ARB_shading_language_420pack, but it doesn't appear to me that it
 added any new bugs (just perhaps uncovered some old ones).

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: Matt Turner matts...@gmail.com
 Cc: 9.2 mesa-sta...@lists.freedesktop.org
 ---

Reviewed-by: Matt Turner matts...@gmail.com


[Mesa-dev] [PATCH] i965/fs: Fix Sandybridge regressions from SEL optimization.

2013-08-13 Thread Kenneth Graunke
Sandybridge is the only platform that supports an IF instruction
with an embedded comparison.  In this case, we need to emit a CMP
to go along with the SEL.

Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
isinf-and-isnan fs_fbo.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index a36c248..984b08a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1911,10 +1911,19 @@ fs_visitor::try_replace_with_sel()
          emit(MOV(src0, then_mov->src[0]));
       }
 
-      fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
-      sel->predicate = if_inst->predicate;
-      sel->predicate_inverse = if_inst->predicate_inverse;
-      sel->conditional_mod = if_inst->conditional_mod;
+      fs_inst *sel;
+      if (if_inst->conditional_mod) {
+         /* Sandybridge-specific IF with embedded comparison */
+         emit(CMP(reg_null_d, if_inst->src[0], if_inst->src[1],
+                  if_inst->conditional_mod));
+         sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
+         sel->predicate = BRW_PREDICATE_NORMAL;
+      } else {
+         /* Separate CMP and IF instructions */
+         sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
+         sel->predicate = if_inst->predicate;
+         sel->predicate_inverse = if_inst->predicate_inverse;
+      }
    }
 }
 
-- 
1.8.3.4



Re: [Mesa-dev] Another Take on the S3TC issue

2013-08-13 Thread Maxence Le Doré
Maybe it's time to let S3TC go away and think about something else, like
BPTC. I've started something. Currently, it is very dirty.

Please, take a quick look at this :

https://docs.google.com/file/d/0B1BiksMm0x0GVjZPZHgyNm0xLVE/edit?usp=sharing


Yeah, I know. Ugly. But it basically works. I can nearly recognize
this beautiful flower after the original image passes through an
encoding and a decoding pass.

I preferred to start it as a standalone codec to be sure I'm
concentrating on it (and not the surrounding code of the Mesa tree!).
Currently I've only worked on the BC7 parts, not BC6H, but I may start
on that soon.

I'll make the encoder-decoder available on GitHub once it is in
better shape.

I'm going to improve things before the end of this month. Once it is
more acceptable, I will start porting the standalone codec into the
Mesa code (on GitHub again, possibly with many other things regarding
Mesa softpipe).


[Mesa-dev] [PATCH] gallivm: fix border color with normalized texture formats

2013-08-13 Thread sroland
From: Roland Scheidegger srol...@vmware.com

We need to put border color into texture format color space which
essentially means clamping for non-float, normalized formats (not entirely
sure if we're also meant to quantize the float but it's probably ok not to
do it thankfully).
For OpenGL we could do this easily outside generated code due to the
1:1 sampler/texture correspondence but not for d3d10 which is terrible
(as we recalculate a constant over and over again per shader invocation).
Fortunately border color should be rare enough that we don't care THAT much.
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |   66 +
 1 file changed, 53 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 65d6e7b..2a4462b 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -179,24 +179,64 @@ lp_build_sample_texel_soa(struct lp_build_sample_context *bld,
     */
 
    if (use_border) {
-      /* select texel color or border color depending on use_border */
-      LLVMValueRef border_color_ptr = 
+      /* select texel color or border color depending on use_border. */
+      LLVMValueRef border_color_ptr =
          bld->dynamic_state->border_color(bld->dynamic_state,
                                           bld->gallivm, sampler_unit);
+      const struct util_format_description *format_desc;
       int chan;
+      format_desc = util_format_description(bld->static_texture_state->format);
+      /*
+       * Only replace channels which are actually present. The others should
+       * get optimized away eventually by sampler_view swizzle anyway but it's
+       * easier too as we'd need some extra logic for channels where we can't
+       * determine the format directly otherwise.
+       */
       for (chan = 0; chan < 4; chan++) {
-         LLVMValueRef border_chan =
-            lp_build_array_get(bld->gallivm, border_color_ptr,
-                               lp_build_const_int32(bld->gallivm, chan));
-         LLVMValueRef border_chan_vec =
-            lp_build_broadcast_scalar(&bld->float_vec_bld, border_chan);
-
-         if (!bld->texel_type.floating) {
-            border_chan_vec = LLVMBuildBitCast(builder, border_chan_vec,
-                                               bld->texel_bld.vec_type, "");
+         unsigned chan_s;
+         /* reverse-map channel... */
+         for (chan_s = 0; chan_s < 4; chan_s++) {
+            if (chan_s == format_desc->swizzle[chan]) {
+               break;
+            }
+         }
+         if (chan_s <= 3) {
+            LLVMValueRef border_chan =
+               lp_build_array_get(bld->gallivm, border_color_ptr,
+                                  lp_build_const_int32(bld->gallivm, chan));
+            LLVMValueRef border_chan_vec =
+               lp_build_broadcast_scalar(&bld->float_vec_bld, border_chan);
+
+            if (!bld->texel_type.floating) {
+               border_chan_vec = LLVMBuildBitCast(builder, border_chan_vec,
+                                                  bld->texel_bld.vec_type, "");
+            }
+            else {
+               /*
+                * For normalized format need to clamp border color (technically
+                * probably should also quantize the data). Really sucks doing this
+                * here but can't avoid at least for now since this is part of
+                * sampler state and texture format is part of sampler_view state.
+                */
+               unsigned chan_type = format_desc->channel[chan_s].type;
+               unsigned chan_norm = format_desc->channel[chan_s].normalized;
+               if (chan_type == UTIL_FORMAT_TYPE_SIGNED && chan_norm) {
+                  LLVMValueRef clamp_min;
+                  clamp_min = lp_build_const_vec(bld->gallivm, bld->texel_type, -1.0F);
+                  border_chan_vec = lp_build_clamp(&bld->texel_bld, border_chan_vec,
+                                                   clamp_min,
+                                                   bld->texel_bld.one);
+               }
+               else if (chan_type == UTIL_FORMAT_TYPE_UNSIGNED && chan_norm) {
+                  border_chan_vec = lp_build_clamp(&bld->texel_bld, border_chan_vec,
+                                                   bld->texel_bld.zero,
+                                                   bld->texel_bld.one);
+               }
+               /* not exactly sure about all others but I think should be ok? */
+            }
+            texel_out[chan] = lp_build_select(&bld->texel_bld, use_border,
+                                              border_chan_vec, texel_out[chan]);
          }
-         texel_out[chan] = lp_build_select(&bld->texel_bld, use_border,
-                                           border_chan_vec, texel_out[chan]);
       }
    }
 }
-- 
1.7.9.5
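
For readers who want the per-channel logic of the patch above without the LLVM IR builder calls, here is a minimal scalar sketch. It is not the gallivm code: `struct format_desc`, `struct chan_desc`, `enum chan_type`, the `SWIZZLE_*` markers and `border_for_channel` are hypothetical stand-ins, and the integer (bitcast) path of the patch is left out. It only shows the reverse swizzle lookup and the clamp applied to normalized channels.

```c
/* Hypothetical stand-ins for the Gallium format description; only the
 * fields the patch reads are modelled here. */
enum chan_type { CHAN_TYPE_UNSIGNED, CHAN_TYPE_SIGNED, CHAN_TYPE_FLOAT };

#define SWIZZLE_ZERO 4   /* assumed marker: channel reads as constant 0 */
#define SWIZZLE_ONE  5   /* assumed marker: channel reads as constant 1 */

struct chan_desc {
   enum chan_type type;
   int normalized;
};

struct format_desc {
   unsigned swizzle[4];           /* storage channel feeding each of RGBA */
   struct chan_desc channel[4];   /* description of each storage channel */
};

static float clampf(float v, float lo, float hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Returns 1 and stores the adjusted border value in *out when output channel
 * chan is backed by a real storage channel, 0 when there is nothing to replace.
 */
static int border_for_channel(const struct format_desc *desc, unsigned chan,
                              float border, float *out)
{
   unsigned chan_s;

   /* reverse-map channel, mirroring the loop in the patch */
   for (chan_s = 0; chan_s < 4; chan_s++) {
      if (chan_s == desc->swizzle[chan])
         break;
   }
   if (chan_s > 3)
      return 0;

   /* normalized formats can only represent [-1, 1] or [0, 1] */
   if (desc->channel[chan_s].normalized) {
      if (desc->channel[chan_s].type == CHAN_TYPE_SIGNED)
         border = clampf(border, -1.0f, 1.0f);
      else if (desc->channel[chan_s].type == CHAN_TYPE_UNSIGNED)
         border = clampf(border, 0.0f, 1.0f);
   }
   *out = border;
   return 1;
}
```

With a single-channel unsigned normalized format, a border value of 1.5 on the red channel would come back as 1.0, while the green channel (constant zero in the swizzle) is reported as not replaceable.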
___
mesa-dev 

Re: [Mesa-dev] [PATCH] glsl: Emit better warnings for things that look like default precision statements

2013-08-13 Thread Ian Romanick

On 08/13/2013 03:50 PM, Matt Turner wrote:

On Tue, Aug 13, 2013 at 1:35 PM, Ian Romanick i...@freedesktop.org wrote:

and the spec doesn't explicitly forbid it.


I was surprised by this, so I verified it.

In the GLSL ES 3.0 spec:

single_declaration
  fully_specified_type
   type_specifier
precision_qualifier type_specifier_no_prec

precision_qualifier
  highp, mediump, lowp

type_specifier_no_prec
  type_specifier_nonarray
   expands to list of built-in types

Seems weird, but legitimate.


C allows empty declarations too.  I believe it's a side-effect of 
function prototypes without formal parameter names.  If you can do


int foo(int, float, struct S *);

it's easy to end up with a parser that can also do

int;
float;
struct S *;

It's actually more work to reject those (or generate a warning).
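
A small sketch of the prototype form mentioned above, for illustration only. `struct S` and `foo` follow the names used in the message; the body of `foo` is made up. The empty declarations themselves are shown only in a comment, since compilers typically diagnose them.

```c
struct S { int x; };

/* A prototype needs only the parameter types, not their names. */
int foo(int, float, struct S *);

/* The definition is where the names finally appear. */
int foo(int a, float b, struct S *s)
{
   return a + (int)b + s->x;
}

/*
 * A grammar that accepts the nameless parameters above can end up accepting
 * declarations with no declarator at all, such as:
 *
 *    int;
 *    float;
 *    struct S *;
 *
 * Compilers usually emit a diagnostic for these, which is the extra work
 * referred to in the message.
 */
```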


Have we actually seen 'highp float;' in the wild (outside of piglit)?


Not that I know of.


Assuming that the two instances of highp in precision_names is
intentional (or was not, but is fixed)


I had to put something in the ast_precision_none slot, and that seemed 
as good a choice as any.   didn't seem too good. :)



Reviewed-by: Matt Turner matts...@gmail.com


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/hsw: Populate MOCS for STATE_BASE_ADDRESS (v2)

2013-08-13 Thread Chad Versace
From: Ville Syrjälä ville.syrj...@linux.intel.com

Just spotted these unpopulated MOCS fields when comparing the code
against BSpec. Set the MOCS to the same as everywhere else in Haswell:
L3-cacheable.

v2: Annotate state packet fields (chadv).

Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
Reviewed-by: Chad Versace chad.vers...@linux.intel.com
---

Ville, I added comments to explain what new fields get set. If this looks
good to you, then I'll commit it.

 src/mesa/drivers/dri/i965/brw_misc_state.c | 7 +--
 src/mesa/drivers/dri/i965/gen6_blorp.cpp   | 7 ++-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 3bf37b9..9854e26 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -1038,13 +1038,16 @@ static void upload_state_base_address( struct brw_context *brw )
     */
 
    if (brw->gen >= 6) {
+      uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+
       if (brw->gen == 6)
          intel_emit_post_sync_nonzero_flush(brw);
 
       BEGIN_BATCH(10);
       OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
-      /* General state base address: stateless DP read/write requests */
-      OUT_BATCH(1);
+      OUT_BATCH(mocs << 8 | /* General State Memory Object Control State */
+                mocs << 4 | /* Stateless Data Port Access Memory Object Control State */
+                1); /* General State Base Address Modif Enable */
/* Surface state base address:
* BINDING_TABLE_STATE
* SURFACE_STATE
diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp 
b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index a4a9081..b82323d 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -74,9 +74,14 @@ void
 gen6_blorp_emit_state_base_address(struct brw_context *brw,
const brw_blorp_params *params)
 {
+   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+
    BEGIN_BATCH(10);
    OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
-   OUT_BATCH(1); /* GeneralStateBaseAddressModifyEnable */
+   OUT_BATCH(mocs << 8 | /* GeneralStateMemoryObjectControlState */
+             mocs << 4 | /* StatelessDataPortAccessMemoryObjectControlState */
+             1); /* GeneralStateBaseAddressModifEnable */
+
    /* SurfaceStateBaseAddress */
    OUT_RELOC(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0, 1);
/* DynamicStateBaseAddress */
-- 
1.8.3.1
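
To make the bit layout of the second STATE_BASE_ADDRESS dword in this patch easier to see, here is a standalone sketch of the packing. The value given to `GEN7_MOCS_L3` is an assumption for illustration, and `state_base_address_dw1` is a hypothetical helper, not a driver function.

```c
#include <stdint.h>

#define GEN7_MOCS_L3 1   /* assumed value, for illustration only */

/*
 * Packs the dword the patch emits after the command header:
 *   bits 11:8  General State Memory Object Control State
 *   bits  7:4  Stateless Data Port Access Memory Object Control State
 *   bit   0    General State Base Address Modify Enable
 * The field positions are taken from the shifts in the patch.
 */
static uint32_t state_base_address_dw1(int is_haswell)
{
   uint32_t mocs = is_haswell ? GEN7_MOCS_L3 : 0;

   return mocs << 8 | mocs << 4 | 1;
}
```

Under that assumed value the dword is 0x111 on Haswell and 0x1 elsewhere, so non-Haswell hardware sees the same dword as before the patch.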



Re: [Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.

2013-08-13 Thread Ian Romanick

On 08/13/2013 03:37 PM, Kenneth Graunke wrote:

128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867


Also https://bugs.freedesktop.org/show_bug.cgi?id=64261?


Cc: Topi Pohjolainen topi.pohjolai...@intel.com
Cc: Chad Versace chad.vers...@linux.intel.com
Cc: Paul Berry stereotype...@gmail.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d6643ca..86a2d53 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
    if (brw->gen < 6)
       return I915_TILING_X;

+   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
+    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
+    *  or Linear."
+    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
+    * all the way back to 965, but is explicitly permitted on Gen7.
+    */
+   if (brw->gen != 7 && mt->cpp >= 16)
+      return I915_TILING_X;


brw->gen < 7?  It seems reasonable to expect future hardware to not
re-introduce this restriction, right?



+
 return I915_TILING_Y | I915_TILING_X;
  }






Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-13 Thread Chad Versace

On 08/13/2013 03:31 PM, Vedran Rodic wrote:

On Mon, Aug 12, 2013 at 3:07 PM,  ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
so let's make use of it.


According to the discussion we had on #intel-gfx a few weeks ago, on
IVB all Mesa memory is already marked as cached in DRM allocated PTEs.
So this should not have any effect. Or I'm misunderstanding something.

As I understand, marking everything uncacheable and then marking just
certain things cacheable could make a difference (since AFAIK, you
can't mark select regions as uncacheable after you mark PTEs as
cacheable on IVB).

Can somebody more knowledgeable comment?


On Ivybridge, the PTEs mark only contexts as LLC+L3 cacheable. Everything
else is marked as cacheable in LLC, but not L3. So, Ville's patches will
give a perf boost to Mesa running on any kernel that continues that caching
policy.



Re: [Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.

2013-08-13 Thread Chad Versace

On 08/13/2013 05:45 PM, Ian Romanick wrote:

On 08/13/2013 03:37 PM, Kenneth Graunke wrote:

128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867


Also https://bugs.freedesktop.org/show_bug.cgi?id=64261?


Cc: Topi Pohjolainen topi.pohjolai...@intel.com
Cc: Chad Versace chad.vers...@linux.intel.com
Cc: Paul Berry stereotype...@gmail.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d6643ca..86a2d53 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
    if (brw->gen < 6)
       return I915_TILING_X;

+   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
+    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
+    *  or Linear."
+    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
+    * all the way back to 965, but is explicitly permitted on Gen7.
+    */
+   if (brw->gen != 7 && mt->cpp >= 16)
+      return I915_TILING_X;


brw->gen < 7?  It seems reasonable to expect future hardware to not
re-introduce this restriction, right?


Future hardware does re-introduce it.

Reviewed-by: Chad Versace chad.vers...@linux.intel.com
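
The decision being reviewed can be restated as a small standalone function, which also shows why `!= 7` and `< 7` differ once a later generation brings the restriction back. This is only a sketch: `TILING_X`, `TILING_Y` and `choose_tiling` are hypothetical names, not the i965 definitions, and the rest of `intel_miptree_choose_tiling` is omitted.

```c
/* Hypothetical tiling flags; the return value is a mask of allowed modes. */
enum tiling { TILING_X = 1 << 0, TILING_Y = 1 << 1 };

/*
 * gen: hardware generation number
 * cpp: bytes per pixel of the format (16 means a 128 bpp format)
 */
static unsigned choose_tiling(int gen, unsigned cpp)
{
   /* older generations are kept on X-tiling, as in the context lines */
   if (gen < 6)
      return TILING_X;

   /* 128 bpp formats may only be Y-tiled on Gen7 */
   if (gen != 7 && cpp >= 16)
      return TILING_X;

   return TILING_Y | TILING_X;
}
```

With `!= 7`, a hypothetical generation 8 part with a 16-byte-per-pixel format is forced to X-tiling, which is the behaviour the reply above asks for; with `< 7` it would have been allowed Y-tiling.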



Re: [Mesa-dev] [PATCH] i965/fs: Fix Sandybridge regressions from SEL optimization.

2013-08-13 Thread Ian Romanick

On 08/13/2013 04:47 PM, Kenneth Graunke wrote:

Sandybridge is the only platform that supports an IF instruction
with an embedded comparison.  In this case, we need to emit a CMP
to go along with the SEL.

Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
isinf-and-isnan fs_fbo.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 +
  1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index a36c248..984b08a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1911,10 +1911,19 @@ fs_visitor::try_replace_with_sel()
          emit(MOV(src0, then_mov->src[0]));
       }

-      fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
-      sel->predicate = if_inst->predicate;
-      sel->predicate_inverse = if_inst->predicate_inverse;
-      sel->conditional_mod = if_inst->conditional_mod;
+      fs_inst *sel;
+      if (if_inst->conditional_mod) {
+         /* Sandybridge-specific IF with embedded comparison */


This doesn't appear to be SNB-specific code.  Can you explain this?  Is
if_inst->conditional_mod only set on SNB?  I really need to learn more
about the back end...



+         emit(CMP(reg_null_d, if_inst->src[0], if_inst->src[1],
+                  if_inst->conditional_mod));
+         sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
+         sel->predicate = BRW_PREDICATE_NORMAL;
+      } else {
+         /* Separate CMP and IF instructions */
+         sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
+         sel->predicate = if_inst->predicate;
+         sel->predicate_inverse = if_inst->predicate_inverse;
+      }
 }
  }






Re: [Mesa-dev] [PATCH] i965: Force X-tiling for 128 bpp formats on Sandybridge.

2013-08-13 Thread Ian Romanick

On 08/13/2013 05:47 PM, Chad Versace wrote:

On 08/13/2013 05:45 PM, Ian Romanick wrote:

On 08/13/2013 03:37 PM, Kenneth Graunke wrote:

128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867


Also https://bugs.freedesktop.org/show_bug.cgi?id=64261?


Cc: Topi Pohjolainen topi.pohjolai...@intel.com
Cc: Chad Versace chad.vers...@linux.intel.com
Cc: Paul Berry stereotype...@gmail.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d6643ca..86a2d53 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -468,6 +468,15 @@ intel_miptree_choose_tiling(struct brw_context *brw,
    if (brw->gen < 6)
       return I915_TILING_X;

+   /* From the Sandybridge PRM, Volume 1, Part 2, page 32:
+    * "NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
+    *  or Linear."
+    * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
+    * all the way back to 965, but is explicitly permitted on Gen7.
+    */
+   if (brw->gen != 7 && mt->cpp >= 16)
+      return I915_TILING_X;


brw->gen < 7?  It seems reasonable to expect future hardware to not
re-introduce this restriction, right?


Future hardware does re-introduce it.


Of course.  Lol.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com


Reviewed-by: Chad Versace chad.vers...@linux.intel.com




Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-13 Thread Chad Versace

On 08/12/2013 06:07 AM, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
so let's make use of it.

pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2. Most
other things show less gains/no regressions, except furmark which
loses some 10 points.

I didn't have a BYT at hand for testing.

Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
  src/mesa/drivers/dri/i965/brw_draw_upload.c   | 2 +-
  src/mesa/drivers/dri/i965/brw_misc_state.c| 2 +-
  src/mesa/drivers/dri/i965/gen6_blorp.cpp  | 4 ++--
  src/mesa/drivers/dri/i965/gen7_blorp.cpp  | 6 +++---
  src/mesa/drivers/dri/i965/gen7_misc_state.c   | 2 +-
  src/mesa/drivers/dri/i965/gen7_vs_state.c | 2 +-
  src/mesa/drivers/dri/i965/gen7_wm_state.c | 2 +-
  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 4 ++--
  8 files changed, 12 insertions(+), 12 deletions(-)


Conceptually, the patch looks good. The (intel-gen == 7)
checks should be removed from the changes in the gen7 files.



Re: [Mesa-dev] [PATCH] i965/fs: Fix Sandybridge regressions from SEL optimization.

2013-08-13 Thread Kenneth Graunke

On 08/13/2013 05:49 PM, Ian Romanick wrote:

On 08/13/2013 04:47 PM, Kenneth Graunke wrote:

Sandybridge is the only platform that supports an IF instruction
with an embedded comparison.  In this case, we need to emit a CMP
to go along with the SEL.

Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
isinf-and-isnan fs_fbo.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 17 +
  1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index a36c248..984b08a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1911,10 +1911,19 @@ fs_visitor::try_replace_with_sel()
          emit(MOV(src0, then_mov->src[0]));
       }

-      fs_inst *sel = emit(BRW_OPCODE_SEL, then_mov->dst, src0, else_mov->src[0]);
-      sel->predicate = if_inst->predicate;
-      sel->predicate_inverse = if_inst->predicate_inverse;
-      sel->conditional_mod = if_inst->conditional_mod;
+      fs_inst *sel;
+      if (if_inst->conditional_mod) {
+         /* Sandybridge-specific IF with embedded comparison */


This doesn't appear to be SNB-specific code.  Can you explain this?  Is
if_inst->conditional_mod only set on SNB?  I really need to learn more
about the back end...


Normally, control flow looks like:

cmp.l.f0(8)  null   g5<8,8,1>F  0F
(+f0) if(8)

For Sandybridge, the hardware designers extended IF to support built-in 
comparisons, so you can simply do:


if.l(8)  g5<8,8,1>F  0F

They immediately dropped this with Ivybridge; it's not been present on 
any other platform.


fs_inst::conditional_mod represents that conditional modifier (always = 
0, never, less, equal, lequal, greater, notequal, gequal).
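
The two cases described above can be modelled with plain structs to show what the fixed `try_replace_with_sel()` emits in each. Everything here is a hypothetical stand-in (`struct if_inst`, `struct inst`, `replace_if_with_sel`, the enum values); it is not the i965 backend API, only a sketch of the control flow in the patch.

```c
/* Hypothetical model of the conditional modifier; 0 means "none". */
enum cond_mod { COND_NONE = 0, COND_L, COND_GE };

enum opcode { OP_CMP, OP_SEL };

struct inst {
   enum opcode op;
   enum cond_mod cmod;      /* only meaningful for OP_CMP here */
   int predicate;           /* nonzero: instruction is predicated */
   int predicate_inverse;
};

struct if_inst {
   enum cond_mod conditional_mod;  /* set only for the embedded-compare IF */
   int predicate;
   int predicate_inverse;
};

/* Writes the replacement instructions to out[] and returns how many. */
static int replace_if_with_sel(const struct if_inst *if_inst, struct inst out[2])
{
   int n = 0;

   if (if_inst->conditional_mod != COND_NONE) {
      /* The IF carried the comparison itself, so the comparison has to be
       * emitted explicitly before the SEL can be predicated on its result. */
      out[n++] = (struct inst){ OP_CMP, if_inst->conditional_mod, 0, 0 };
      out[n++] = (struct inst){ OP_SEL, COND_NONE, 1, 0 };
   } else {
      /* A separate CMP already set the flag; SEL just inherits the
       * predicate of the IF it replaces. */
      out[n++] = (struct inst){ OP_SEL, COND_NONE, if_inst->predicate,
                                if_inst->predicate_inverse };
   }
   return n;
}
```

The first branch is the only one that changes the instruction count, which matches the commit message: the CMP is needed "to go along with the SEL" only when the IF had an embedded comparison.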



Re: [Mesa-dev] [PATCH 3/3] i965/gen7: Don't use L3$ for render targets

2013-08-13 Thread Chad Versace

On 08/12/2013 06:07 AM, ville.syrj...@linux.intel.com wrote:

From: Ville Syrjälä ville.syrj...@linux.intel.com

According to HSW Bspec L3$ evictions may land in LLC regardless of
LLC MOCS/PTE settings. That means we shouldn't set scanout buffers
as L3 cacheable when writing to them.

So far I've been unable to observe this phenomenon on my IVB, but
better safe than sorry. Especially since this doesn't appear to
hurt performance.

Ideally this should be limited to scanout buffers, but that information
is not available to Mesa. Limiting it to winsys buffers might be a
reasonable compromise, but MOCS setup appears to be done at a
lower layer where that information is already lost, and I was too
lazy to start passing that information down.


Let's try harder to add that plumbing. I'll try to think of something tomorrow.



[Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states

2013-08-13 Thread Marek Olšák
It's the same as in r600g.

I can't merge si_atom with si_pm4_state, because the latter is too big and
isn't even driven by the dirty flag. Also I'm gonna share the whole streamout
state handling with r600g (just one atom though) and therefore I need the same
interface. The advantage is that almost no streamout code will be needed
in radeonsi and the old code can be removed.

We will also need to port r600_flush_emit and the associated code from r600g,
so that cache flushing takes place before state emission (I think this will
be required for streamout).
---
 src/gallium/drivers/radeonsi/r600_hw_context.c |  8 
 src/gallium/drivers/radeonsi/radeonsi_pipe.h   | 10 ++
 src/gallium/drivers/radeonsi/si_state.h|  8 
 src/gallium/drivers/radeonsi/si_state_draw.c   |  8 +++-
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 19e9d1c..c9a613b 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -114,9 +114,17 @@ err:
 void si_need_cs_space(struct r600_context *ctx, unsigned num_dw,
boolean count_draw_in)
 {
+	int i;
+
 	/* The number of dwords we already used in the CS so far. */
 	num_dw += ctx->cs->cdw;
 
+	for (i = 0; i < SI_NUM_ATOMS(ctx); i++) {
+		if (ctx->atoms.array[i]->dirty) {
+			num_dw += ctx->atoms.array[i]->num_dw;
+		}
+	}
+
 	if (count_draw_in) {
 		/* The number of dwords all the dirty states would take. */
 		num_dw += ctx->pm4_dirty_cdwords;
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
index e370149..b4a6e0c 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
@@ -132,6 +132,8 @@ struct r600_constbuf_state
 	uint32_t dirty_mask;
 };
 
+#define SI_NUM_ATOMS(rctx) (sizeof((rctx)->atoms)/sizeof((rctx)->atoms.array[0]))
+
 struct r600_context {
 	struct pipe_context context;
 	struct blitter_context *blitter;
@@ -145,6 +147,14 @@ struct r600_context {
 	void *custom_blend_decompress;
 	struct r600_screen *screen;
 	struct radeon_winsys *ws;
+
+	union {
+		struct {
+			/* Place atoms here. */
+		};
+		struct si_atom *array[0];
+	} atoms;
+
 	struct si_vertex_element *vertex_elements;
 	struct pipe_framebuffer_state framebuffer;
 	unsigned fb_log_samples;
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index b01fbf2..09ef56e 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -29,6 +29,14 @@
 
 #include "radeonsi_pm4.h"
 
+/* This encapsulates a state or an operation which can emitted into the GPU
+ * command stream. */
+struct si_atom {
+	void (*emit)(struct r600_context *ctx, struct si_atom *state);
+	unsigned num_dw;
+	bool dirty;
+};
+
 struct si_state_blend {
 	struct si_pm4_state pm4;
 	uint32_t cb_target_mask;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 2007dc4..b951a39 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -665,7 +665,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 {
 	struct r600_context *rctx = (struct r600_context *)ctx;
 	struct pipe_index_buffer ib = {};
-	uint32_t cp_coher_cntl;
+	uint32_t cp_coher_cntl, i;
 
 	if (!info->count && (info->indexed || !info->count_from_stream_output))
 		return;
@@ -729,6 +729,12 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 
 	si_need_cs_space(rctx, 0, TRUE);
 
+	for (i = 0; i < SI_NUM_ATOMS(rctx); i++) {
+		if (rctx->atoms.array[i]->dirty) {
+			rctx->atoms.array[i]->emit(rctx, rctx->atoms.array[i]);
+		}
+	}
+
 	si_pm4_emit_dirty(rctx);
 	rctx->pm4_dirty_cdwords = 0;
 
-- 
1.8.1.2
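
The union in this patch lets named atom pointers double as an array, so that one loop can walk all of them. Below is a reduced, standalone sketch of that idea. The names `struct ctx`, `struct atom`, `NUM_ATOMS`, `streamout`, `cache_flush`, `count_emit`, `dirty_dwords` and `emit_dirty` are hypothetical; the zero-length array is a GNU C extension, as in the patch, and the anonymous struct needs C11 or the same extension set. Clearing `dirty` inside the emit callback is an assumption of this sketch, not something the patch does.

```c
#include <stdbool.h>
#include <stddef.h>

struct ctx;

struct atom {
	void (*emit)(struct ctx *ctx, struct atom *state);
	unsigned num_dw;   /* dwords this atom writes when emitted */
	bool dirty;
};

struct ctx {
	unsigned emitted_dw;   /* stands in for the command stream */
	union {
		struct {
			/* hypothetical atoms, one pointer each */
			struct atom *streamout;
			struct atom *cache_flush;
		};
		struct atom *array[0];   /* GNU zero-length array over the same storage */
	} atoms;
};

/* Number of pointers in the union, derived from its size. */
#define NUM_ATOMS(c) (sizeof((c)->atoms) / sizeof((c)->atoms.array[0]))

/* Example emit callback: account for the dwords and clear the flag. */
static void count_emit(struct ctx *ctx, struct atom *state)
{
	ctx->emitted_dw += state->num_dw;
	state->dirty = false;
}

/* Space needed by all dirty atoms, as in the si_need_cs_space() hunk. */
static unsigned dirty_dwords(struct ctx *ctx)
{
	unsigned num_dw = 0;

	for (size_t i = 0; i < NUM_ATOMS(ctx); i++) {
		if (ctx->atoms.array[i]->dirty)
			num_dw += ctx->atoms.array[i]->num_dw;
	}
	return num_dw;
}

/* Emit every dirty atom, as in the si_draw_vbo() hunk. */
static void emit_dirty(struct ctx *ctx)
{
	for (size_t i = 0; i < NUM_ATOMS(ctx); i++) {
		if (ctx->atoms.array[i]->dirty)
			ctx->atoms.array[i]->emit(ctx, ctx->atoms.array[i]);
	}
}
```

Adding a new atom then only means adding one pointer to the anonymous struct; both loops pick it up through the size calculation. Every pointer must be initialized before either loop runs, since the loops dereference all slots.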



Re: [Mesa-dev] [PATCH] r600g/llvm: Add missing %s format string to fprintf.

2013-08-13 Thread Tom Stellard
On Sun, Aug 11, 2013 at 07:37:01PM +0200, Jon Severinsson wrote:
 This fixes a compilation warning with -Wformat-security.
 
 CC: 9.2 mesa-sta...@lists.freedesktop.org

Reviewed-by: Tom Stellard thomas.stell...@amd.com

I've pushed this patch, thanks.

-Tom

 ---
  src/gallium/drivers/radeon/radeon_llvm_emit.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
 b/src/gallium/drivers/radeon/radeon_llvm_emit.c
 index 1a4d4fdd..2dd7bf7b 100644
 --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
 +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
 @@ -124,7 +124,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_llvm_binary *binary,
  	r = LLVMTargetMachineEmitToMemoryBuffer(tm, M, LLVMObjectFile, &err,
  								 &out_buffer);
  	if (r) {
 -		fprintf(stderr, err);
 +		fprintf(stderr, "%s", err);
   FREE(err);
   return 1;
   }
 -- 
 1.7.10.4
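
As background on why -Wformat-security flags the old call: a string that is not a literal, passed as the format argument, has any `%` sequences in it interpreted as conversions. The sketch below shows the safe pattern from the patch using `snprintf` so the result can be inspected; `format_error` is a hypothetical helper, not part of the driver.

```c
#include <stdio.h>

/*
 * Copies an externally supplied message into buf. The message is passed as
 * an argument to "%s", so any '%' characters in it are copied literally
 * instead of being treated as conversion specifiers.
 */
static int format_error(char *buf, size_t size, const char *err)
{
	return snprintf(buf, size, "%s", err);
}
```

Calling `snprintf(buf, size, err)` instead would read nonexistent arguments for a message such as "bad %d %s", which is undefined behaviour; the same applies to `fprintf(stderr, err)`.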
 


Re: [Mesa-dev] [PATCH] radeon/llvm: fix compile error with -Werror=format-security

2013-08-13 Thread Tom Stellard
On Tue, Aug 13, 2013 at 06:25:28PM +0200, Maarten Lankhorst wrote:
 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com

An identical patch was sent to the list a few days ago, and I've just
pushed it now.

-Tom

 ---
 diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
 b/src/gallium/drivers/radeon/radeon_llvm_emit.c
 index 1a4d4fd..2dd7bf7 100644
 --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
 +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
 @@ -124,7 +124,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_llvm_binary *binary,
  	r = LLVMTargetMachineEmitToMemoryBuffer(tm, M, LLVMObjectFile, &err,
  								 &out_buffer);
  	if (r) {
 -		fprintf(stderr, err);
 +		fprintf(stderr, "%s", err);
   FREE(err);
   return 1;
   }
 


Re: [Mesa-dev] [PATCH] configure: link against -lLLVM to determine build type

2013-08-13 Thread Tom Stellard
On Tue, Aug 13, 2013 at 05:53:52PM +0200, Maarten Lankhorst wrote:
 Fixes a build failure of 9.2 on ubuntu, because libLLVM-3.3.so is not present 
 in /usr/lib/llvm-3.2/lib.


I'm trying to understand the problem here, could you give a little more
information about how Ubuntu packages LLVM?  Where are the LLVM
libraries installed and what does llvm-config --libdir --ldflags report?

-Tom

 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
 ---
 diff --git a/configure.ac b/configure.ac
 index 35f6797..579d8d4 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -1870,7 +1870,18 @@ if test "x$MESA_LLVM" != x0; then
      if test "x$with_llvm_shared_libs" = xyes; then
          dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
          LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
 -        AS_IF([test -f "$LLVM_LIBDIR/lib$LLVM_SO_NAME.so"], [llvm_have_one_so=yes])
 +
 +        AC_MSG_CHECKING([whether $LLVM_SO_NAME is a monolithic blob])
 +        save_LIBS="$LIBS"
 +        save_LDFLAGS="$LDFLAGS"
 +        LDFLAGS="$LDFLAGS $LLVM_LDFLAGS"
 +        LIBS="$LIBS -l$LLVM_SO_NAME"
 +
 +        AC_LINK_IFELSE([AC_LANG_CALL([], [LLVMInitializeCore])],
 +                       [llvm_have_one_so=yes], [llvm_have_one_so=no])
 +        LIBS="$save_LIBS"
 +        LDFLAGS="$save_LDFLAGS"
 +        AC_MSG_RESULT([$llvm_have_one_so])
  
          if test "x$llvm_have_one_so" = xyes; then
              dnl LLVM was built using auto*, so there is only one shared object.
 


Re: [Mesa-dev] [PATCH 3/4] r600/radeonsi: implement new float comparison instructions

2013-08-13 Thread Tom Stellard
On Tue, Aug 13, 2013 at 07:04:56PM +0200, srol...@vmware.com wrote:
 From: Roland Scheidegger srol...@vmware.com
 
 Also use ordered comparisons for old cmp instructions. Untested.

This patch looks good to me, but I would like to do a piglit run on
radeonsi before you commit.  I will try to do this tomorrow.

-Tom
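
For anyone following the ordered/unordered distinction in the quoted patch: the two families only differ when an operand is NaN. The sketch below spells that out in plain C; the function names are made up, and the mapping to LLVM predicates in the comments reflects the usual meaning of those predicates rather than anything verified against this driver.

```c
#include <math.h>
#include <stdbool.h>

/* True when either operand is NaN (NaN is the only value unequal to itself). */
static bool is_unordered(float a, float b)
{
	return a != a || b != b;
}

/* Ordered less-than (like OLT): false whenever an operand is NaN.
 * C's built-in < already behaves this way. */
static bool ordered_lt(float a, float b)
{
	return a < b;
}

/* Unordered less-than (like ULT): true whenever an operand is NaN. */
static bool unordered_lt(float a, float b)
{
	return is_unordered(a, b) || a < b;
}

/* Ordered equality (like OEQ): false for NaN, as with C's ==. */
static bool ordered_eq(float a, float b)
{
	return a == b;
}

/* Unordered not-equal (like UNE): true for NaN, as with C's !=.
 * This is why NE is the one comparison the patch keeps unordered. */
static bool unordered_ne(float a, float b)
{
	return a != b;
}
```

With these definitions `unordered_ne` is the exact negation of `ordered_eq` for all inputs, which is the usual convention for float comparisons that the patch comment refers to.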

 ---
  src/gallium/drivers/r600/r600_shader.c |   18 ---
  .../drivers/radeon/radeon_setup_tgsi_llvm.c|   49 
 
  2 files changed, 48 insertions(+), 19 deletions(-)
 
 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 37298cc..fb766c4 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -5743,11 +5743,10 @@ static struct r600_shader_tgsi_instruction 
 r600_shader_tgsi_instruction[] = {
   {105,   0, ALU_OP0_NOP, tgsi_unsupported},
   {106,   0, ALU_OP0_NOP, tgsi_unsupported},
   {TGSI_OPCODE_NOP,   0, ALU_OP0_NOP, tgsi_unsupported},
 - /* gap */
 - {108,   0, ALU_OP0_NOP, tgsi_unsupported},
 - {109,   0, ALU_OP0_NOP, tgsi_unsupported},
 - {110,   0, ALU_OP0_NOP, tgsi_unsupported},
 - {111,   0, ALU_OP0_NOP, tgsi_unsupported},
 + {TGSI_OPCODE_FSEQ,  0, ALU_OP2_SETE_DX10, tgsi_op2},
 + {TGSI_OPCODE_FSGE,  0, ALU_OP2_SETGE_DX10, tgsi_op2},
 + {TGSI_OPCODE_FSLT,  0, ALU_OP2_SETGT_DX10, tgsi_op2_swap},
 + {TGSI_OPCODE_FSNE,  0, ALU_OP2_SETNE_DX10, tgsi_op2_swap},
   {TGSI_OPCODE_NRM4,  0, ALU_OP0_NOP, tgsi_unsupported},
   {TGSI_OPCODE_CALLNZ,0, ALU_OP0_NOP, tgsi_unsupported},
   /* gap */
 @@ -5936,11 +5935,10 @@ static struct r600_shader_tgsi_instruction 
 eg_shader_tgsi_instruction[] = {
   {105,   0, ALU_OP0_NOP, tgsi_unsupported},
   {106,   0, ALU_OP0_NOP, tgsi_unsupported},
   {TGSI_OPCODE_NOP,   0, ALU_OP0_NOP, tgsi_unsupported},
 - /* gap */
 - {108,   0, ALU_OP0_NOP, tgsi_unsupported},
 - {109,   0, ALU_OP0_NOP, tgsi_unsupported},
 - {110,   0, ALU_OP0_NOP, tgsi_unsupported},
 - {111,   0, ALU_OP0_NOP, tgsi_unsupported},
 + {TGSI_OPCODE_FSEQ,  0, ALU_OP2_SETE_DX10, tgsi_op2},
 + {TGSI_OPCODE_FSGE,  0, ALU_OP2_SETGE_DX10, tgsi_op2},
 + {TGSI_OPCODE_FSLT,  0, ALU_OP2_SETGT_DX10, tgsi_op2_swap},
 + {TGSI_OPCODE_FSNE,  0, ALU_OP2_SETNE_DX10, tgsi_op2_swap},
   {TGSI_OPCODE_NRM4,  0, ALU_OP0_NOP, tgsi_unsupported},
   {TGSI_OPCODE_CALLNZ,0, ALU_OP0_NOP, tgsi_unsupported},
   /* gap */
 diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
 b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
 index 7a47746..8ff9abd 100644
 --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
 +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
 @@ -850,18 +850,16 @@ static void emit_cmp(
   LLVMRealPredicate pred;
   LLVMValueRef cond;
  
 - /* XXX I'm not sure whether to do unordered or ordered comparisons,
 -  * but llvmpipe uses unordered comparisons, so for consistency we use
 -  * unordered.  (The authors of llvmpipe aren't sure about using
 -  * unordered vs ordered comparisons either.
 + /* Use ordered for everything but NE (which is usual for
 +  * float comparisons)
  	 */
  	switch (emit_data->inst->Instruction.Opcode) {
 -	case TGSI_OPCODE_SGE: pred = LLVMRealUGE; break;
 -	case TGSI_OPCODE_SEQ: pred = LLVMRealUEQ; break;
 -	case TGSI_OPCODE_SLE: pred = LLVMRealULE; break;
 -	case TGSI_OPCODE_SLT: pred = LLVMRealULT; break;
 +	case TGSI_OPCODE_SGE: pred = LLVMRealOGE; break;
 +	case TGSI_OPCODE_SEQ: pred = LLVMRealOEQ; break;
 +	case TGSI_OPCODE_SLE: pred = LLVMRealOLE; break;
 +	case TGSI_OPCODE_SLT: pred = LLVMRealOLT; break;
  	case TGSI_OPCODE_SNE: pred = LLVMRealUNE; break;
 -	case TGSI_OPCODE_SGT: pred = LLVMRealUGT; break;
 +	case TGSI_OPCODE_SGT: pred = LLVMRealOGT; break;
  	default: assert(!"unknown instruction"); pred = 0; break;
  	}
  
 @@ -872,6 +870,35 @@ static void emit_cmp(
  			cond, bld_base->base.one, bld_base->base.zero, "");
  }
  
 +static void emit_fcmp(
 +		const struct lp_build_tgsi_action *action,
 +		struct lp_build_tgsi_context * bld_base,
 +		struct lp_build_emit_data * emit_data)
 +{
 +	LLVMBuilderRef builder = bld_base->base.gallivm->builder;
 +	LLVMContextRef context = bld_base->base.gallivm->context;
 +	LLVMRealPredicate pred;
 +
 +	/* Use ordered for everything but NE (which is usual for
 +	 * float comparisons)
 +	 */
 +	switch (emit_data->inst->Instruction.Opcode) {
 +	case TGSI_OPCODE_FSEQ: pred = LLVMRealOEQ; break;
 +	case TGSI_OPCODE_FSGE: pred = LLVMRealOGE; break;
 + 

Re: [Mesa-dev] [PATCH v2 1/2] radeonsi: Don't leave gaps between position exports from vertex shader

2013-08-13 Thread Tom Stellard
On Tue, Aug 13, 2013 at 07:39:10PM +0200, Michel Dänzer wrote:
 From: Michel Dänzer michel.daen...@amd.com
 
 If the vertex shader exports clip distances but not point size, use
 position exports 1/2 instead of 2/3 for the clip distances. Fixes
 geometry corruption in that case.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974
 
 Cc: mesa-sta...@lists.freedesktop.org
 Signed-off-by: Michel Dänzer michel.daen...@amd.com

I took a look through the LLVM calls in these patches and they look OK
to me.

Reviewed-by: Tom Stellard thomas.stell...@amd.com
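
The slot compaction described in the commit message (clip distances moving to position exports 1/2 when point size is not exported) can be illustrated with a small standalone C sketch. It is not the code from the patch: assign_pos_slots, MAX_POS_EXPORTS and SLOT_UNUSED are hypothetical names, and the sketch only shows the idea of handing out consecutive export slots to the position vectors that were actually written.

```c
#include <stdbool.h>

#define MAX_POS_EXPORTS 4   /* position, misc/point size, clip dist 0, clip dist 1 */
#define SLOT_UNUSED     (-1)

/* used[i] says whether logical position vector i was written by the
 * shader. On return, slot[i] holds the consecutive export slot assigned
 * to vector i, or SLOT_UNUSED if it is not exported. The return value is
 * the number of position exports. */
static int assign_pos_slots(const bool used[MAX_POS_EXPORTS],
                            int slot[MAX_POS_EXPORTS])
{
    int next = 0;

    for (int i = 0; i < MAX_POS_EXPORTS; i++)
        slot[i] = used[i] ? next++ : SLOT_UNUSED;

    return next;
}
```

For a shader that writes position and both clip distance vectors but no point size, used = { true, false, true, true } yields slots 0, unused, 1, 2, so no gap is left at slot 1.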

 ---
 
 v2: No need to export unused position vectors, just export to consecutive
 position export slots.
 
  src/gallium/drivers/radeonsi/radeonsi_shader.c | 135 +++--
  src/gallium/drivers/radeonsi/radeonsi_shader.h |   1 +
  src/gallium/drivers/radeonsi/si_state_draw.c   |   6 +-
  3 files changed, 83 insertions(+), 59 deletions(-)
 
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 index fee6262..dd9581d 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
 +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 @@ -562,12 +562,11 @@ static void si_alpha_test(struct lp_build_tgsi_context *bld_base,
  }
  
  static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
 - unsigned index)
 + LLVMValueRef (*pos)[9], unsigned index)
  {
   struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
   struct lp_build_context *base = &bld_base->base;
   struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
 - LLVMValueRef args[9];
   unsigned reg_index;
   unsigned chan;
   unsigned const_chan;
 @@ -582,6 +581,8 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
   }
  
   for (reg_index = 0; reg_index < 2; reg_index ++) {
 + LLVMValueRef *args = pos[2 + reg_index];
 +
   args[5] =
   args[6] =
   args[7] =
 @@ -612,10 +613,6 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
   args[3] = lp_build_const_int32(base->gallivm,
  V_008DFC_SQ_EXP_POS + 2 + reg_index);
   args[4] = uint->zero;
 - lp_build_intrinsic(base->gallivm->builder,
 -"llvm.SI.export",
 -LLVMVoidTypeInContext(base->gallivm->context),
 -args, 9);
   }
  }
  
 @@ -630,17 +627,18 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
   struct tgsi_parse_context *parse = &si_shader_ctx->parse;
   LLVMValueRef args[9];
   LLVMValueRef last_args[9] = { 0 };
 + LLVMValueRef pos_args[4][9] = { { 0 } };
   unsigned semantic_name;
   unsigned color_count = 0;
   unsigned param_count = 0;
   int depth_index = -1, stencil_index = -1;
 + int i;
  
   while (!tgsi_parse_end_of_tokens(parse)) {
   struct tgsi_full_declaration *d =
   &parse->FullToken.FullDeclaration;
   unsigned target;
   unsigned index;
 - int i;
  
   tgsi_parse_token(parse);
  
 @@ -716,7 +714,7 @@ handle_semantic:
   target = V_008DFC_SQ_EXP_POS + 2 + d->Semantic.Index;
   break;
   case TGSI_SEMANTIC_CLIPVERTEX:
 - si_llvm_emit_clipvertex(bld_base, index);
 + si_llvm_emit_clipvertex(bld_base, pos_args, index);
   shader-clip_dist_write = 0xFF;
   continue;
   case TGSI_SEMANTIC_FOG:
 @@ -734,9 +732,13 @@ handle_semantic:
  
   si_llvm_init_export_args(bld_base, d, index, target, args);
  
 - if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX ?
 - (semantic_name == TGSI_SEMANTIC_POSITION) :
 - (semantic_name == TGSI_SEMANTIC_COLOR)) {
 + if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX &&
 + target >= V_008DFC_SQ_EXP_POS &&
 + target <= (V_008DFC_SQ_EXP_POS + 3)) {
 + memcpy(pos_args[target - V_008DFC_SQ_EXP_POS],
 +args, sizeof(args));
 + } else if (si_shader_ctx->type == TGSI_PROCESSOR_FRAGMENT &&
 +semantic_name == TGSI_SEMANTIC_COLOR) {
   if (last_args[0]) {
   lp_build_intrinsic(base->gallivm->builder,
  "llvm.SI.export",
 @@ -806,66 +808,87 @@