[Mesa-dev] [PATCH] i965: Fix assignment instead of comparison in asserts.

2013-01-25 Thread Vinson Lee
Fixes side effect in assertion defects reported by Coverity.

Signed-off-by: Vinson Lee 
---
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
index d9ed27c..45072da 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
@@ -951,8 +951,8 @@ fs_generator::generate_pack_half_2x16_split(fs_inst *inst,
 {
assert(intel->gen >= 7);
assert(dst.type == BRW_REGISTER_TYPE_UD);
-   assert(x.type = BRW_REGISTER_TYPE_F);
-   assert(y.type = BRW_REGISTER_TYPE_F);
+   assert(x.type == BRW_REGISTER_TYPE_F);
+   assert(y.type == BRW_REGISTER_TYPE_F);
 
/* From the Ivybridge PRM, Vol4, Part3, Section 6.27 f32to16:
 *
-- 
1.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.

2013-01-25 Thread Kenneth Graunke

On 01/25/2013 03:13 PM, Roland Scheidegger wrote:

I'm quite sure there are g965 boards around which indeed support Pentium
4 (and P4-based Celerons) (but yes I guess cmov and at least sse2 are
safe - not that the p4 had a usable cmov implementation as it was
incredibly slow IIRC but it should at least work).

Roland


Sadly I think Roland is right here: the Intel 946GZ is a Gen4 chip that 
appears on motherboards which claim to support Pentium 4s.


That's crazy...I've never heard of such a machine.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.

2013-01-25 Thread Eric Anholt
Roland Scheidegger  writes:

> I'm quite sure there are g965 boards around which indeed support Pentium
> 4 (and P4-based Celerons) (but yes I guess cmov and at least sse2 are
> safe - not that the p4 had a usable cmov implementation as it was
> incredibly slow IIRC but it should at least work).

It looks like everything you could put in a g965 had SSE3, but that
*S*SSE3 is not covered.  Sigh.  -march=nocona -mtune=core2 I think
should work.


pgpOKOJnoM543.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] intel: Un-hardcode lengths from blitter commands.

2013-01-25 Thread Kenneth Graunke
The packet length may change at some point in the future.  Specifying it
explicitly (rather than hardcoding it in the command #define) allows us
to change it much more easily in the future.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/intel/intel_blit.c | 8 
 src/mesa/drivers/dri/intel/intel_reg.h  | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_blit.c 
b/src/mesa/drivers/dri/intel/intel_blit.c
index 4b86f0e..0946972 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.c
+++ b/src/mesa/drivers/dri/intel/intel_blit.c
@@ -194,7 +194,7 @@ intelEmitCopyBlit(struct intel_context *intel,
assert(dst_y < dst_y2);
 
BEGIN_BATCH_BLT(8);
-   OUT_BATCH(CMD);
+   OUT_BATCH(CMD | (8 - 2));
OUT_BATCH(BR13 | (uint16_t)dst_pitch);
OUT_BATCH((dst_y << 16) | dst_x);
OUT_BATCH((dst_y2 << 16) | dst_x2);
@@ -368,7 +368,7 @@ intelClearWithBlit(struct gl_context *ctx, GLbitfield mask)
   }
 
   BEGIN_BATCH_BLT(6);
-  OUT_BATCH(CMD);
+  OUT_BATCH(CMD | (6 - 2));
   OUT_BATCH(BR13);
   OUT_BATCH((y1 << 16) | x1);
   OUT_BATCH((y2 << 16) | x2);
@@ -445,7 +445,7 @@ intelEmitImmediateColorExpandBlit(struct intel_context 
*intel,
   blit_cmd |= XY_DST_TILED;
 
BEGIN_BATCH_BLT(8 + 3);
-   OUT_BATCH(opcode);
+   OUT_BATCH(opcode | (8 - 2));
OUT_BATCH(br13);
OUT_BATCH((0 << 16) | 0); /* clip x1, y1 */
OUT_BATCH((100 << 16) | 100); /* clip x2, y2 */
@@ -587,7 +587,7 @@ intel_set_teximage_alpha_to_one(struct gl_context *ctx,
}
 
BEGIN_BATCH_BLT(6);
-   OUT_BATCH(CMD);
+   OUT_BATCH(CMD | (6 - 2));
OUT_BATCH(BR13);
OUT_BATCH((y1 << 16) | x1);
OUT_BATCH((y2 << 16) | x2);
diff --git a/src/mesa/drivers/dri/intel/intel_reg.h 
b/src/mesa/drivers/dri/intel/intel_reg.h
index 53b1cb9..e4871eb 100644
--- a/src/mesa/drivers/dri/intel/intel_reg.h
+++ b/src/mesa/drivers/dri/intel/intel_reg.h
@@ -240,11 +240,11 @@
 #define PRIM3D_DIB (0x9<<18)
 #define PRIM3D_MASK(0x1f<<18)
 
-#define XY_SETUP_BLT_CMD   (CMD_2D | (0x01 << 22) | 6)
+#define XY_SETUP_BLT_CMD   (CMD_2D | (0x01 << 22))
 
-#define XY_COLOR_BLT_CMD   (CMD_2D | (0x50 << 22) | 4)
+#define XY_COLOR_BLT_CMD   (CMD_2D | (0x50 << 22))
 
-#define XY_SRC_COPY_BLT_CMD (CMD_2D | (0x53 << 22) | 6)
+#define XY_SRC_COPY_BLT_CMD (CMD_2D | (0x53 << 22))
 
 #define XY_TEXT_IMMEDIATE_BLIT_CMD (CMD_2D | (0x31 << 22))
 # define XY_TEXT_BYTE_PACKED   (1 << 16)
-- 
1.8.1.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

Alex Deucher  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Alex Deucher  ---
Fixed by:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=264e6dad28e64755dc1580abdbb4e339c3439883

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: List new extensions added in Mesa 9.1

2013-01-25 Thread Matt Turner
On Fri, Jan 25, 2013 at 6:02 PM, Marek Olšák  wrote:
> These extensions are not new in Mesa:
> - ARB_base_instance (since 9.0)
> - ARB_vertex_type_2_10_10_10_rev (since 8.0)
> - OES_standard_derivatives (since 7.10, I think)

Ah you're right. It was just i965 that added these. I'll drop them
from the list.

> Also, we don't have ARB_ES3_compatibility yet.

We do (on i965), since today: e4f661afc89e6e7608edceb73528a5e54a147a85.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix up CP DMA for VM on cayman and TN

2013-01-25 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Sat, Jan 26, 2013 at 12:49 AM,   wrote:
> From: Alex Deucher 
>
> Need to add the virtual address.
>
> Signed-off-by: Alex Deucher 
> ---
>  src/gallium/drivers/r600/r600.h|4 ++--
>  src/gallium/drivers/r600/r600_hw_context.c |   11 +++
>  2 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
> index 93604fb..06e914f 100644
> --- a/src/gallium/drivers/r600/r600.h
> +++ b/src/gallium/drivers/r600/r600.h
> @@ -172,8 +172,8 @@ void r600_context_streamout_end(struct r600_context *ctx);
>  void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean 
> count_draw_in);
>  void r600_context_block_emit_dirty(struct r600_context *ctx, struct 
> r600_block *block, unsigned pkt_flags);
>  void r600_cp_dma_copy_buffer(struct r600_context *rctx,
> -struct pipe_resource *dst, unsigned dst_offset,
> -struct pipe_resource *src, unsigned src_offset,
> +struct pipe_resource *dst, unsigned long 
> dst_offset,
> +struct pipe_resource *src, unsigned long 
> src_offset,
>  unsigned size);
>
>  int evergreen_context_init(struct r600_context *ctx);
> diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
> b/src/gallium/drivers/r600/r600_hw_context.c
> index caebf5c..e13b502 100644
> --- a/src/gallium/drivers/r600/r600_hw_context.c
> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> @@ -1065,8 +1065,8 @@ void r600_context_streamout_end(struct r600_context 
> *ctx)
>  #define CP_DMA_MAX_BYTE_COUNT ((1 << 21) - 8)
>
>  void r600_cp_dma_copy_buffer(struct r600_context *rctx,
> -struct pipe_resource *dst, unsigned dst_offset,
> -struct pipe_resource *src, unsigned src_offset,
> +struct pipe_resource *dst, unsigned long 
> dst_offset,
> +struct pipe_resource *src, unsigned long 
> src_offset,
>  unsigned size)
>  {
> struct radeon_winsys_cs *cs = rctx->cs;
> @@ -1079,6 +1079,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
> return;
> }
>
> +   dst_offset += r600_resource_va(&rctx->screen->screen, dst);
> +   src_offset += r600_resource_va(&rctx->screen->screen, src);
> +
> /* We flush the caches, because we might read from or write
>  * to resources which are bound right now. */
> rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES |
> @@ -1112,9 +1115,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
>
> r600_write_value(cs, PKT3(PKT3_CP_DMA, 4, 0));
> r600_write_value(cs, src_offset);   /* SRC_ADDR_LO [31:0] 
> */
> -   r600_write_value(cs, sync); /* CP_SYNC [31] | 
> SRC_ADDR_HI [7:0] */
> +   r600_write_value(cs, sync | ((src_offset >> 32) & 0xff)); 
>   /* CP_SYNC [31] | SRC_ADDR_HI [7:0] */
> r600_write_value(cs, dst_offset);   /* DST_ADDR_LO [31:0] 
> */
> -   r600_write_value(cs, 0);/* DST_ADDR_HI [7:0] 
> */
> +   r600_write_value(cs, (dst_offset >> 32) & 0xff);  
>   /* DST_ADDR_HI [7:0] */
> r600_write_value(cs, byte_count);   /* COMMAND [29:22] | 
> BYTE_COUNT [20:0] */
>
> r600_write_value(cs, PKT3(PKT3_NOP, 0, 0));
> --
> 1.7.7.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: List new extensions added in Mesa 9.1

2013-01-25 Thread Marek Olšák
These extensions are not new in Mesa:
- ARB_base_instance (since 9.0)
- ARB_vertex_type_2_10_10_10_rev (since 8.0)
- OES_standard_derivatives (since 7.10, I think)

Also, we don't have ARB_ES3_compatibility yet.

Marek

On Sat, Jan 26, 2013 at 12:08 AM, Matt Turner  wrote:
> I did not list the *_get_program_binary extensions since they're not
> useful to anyone with their current implementation (that supports 0
> binary formats).
> ---
> We should also write something about ES3 and the float-texture & S3TC
> changes.
>
>  docs/relnotes-9.1.html |   12 +++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/docs/relnotes-9.1.html b/docs/relnotes-9.1.html
> index ffca275..14e6c02 100644
> --- a/docs/relnotes-9.1.html
> +++ b/docs/relnotes-9.1.html
> @@ -44,9 +44,19 @@ Note: some of the new features are only available with 
> certain drivers.
>  
>
>  
> +GL_ANGLE_texture_compression_dxt3
> +GL_ANGLE_texture_compression_dxt5
> +GL_ARB_base_instance
> +GL_ARB_ES3_compatibility
> +GL_ARB_internalformat_query
>  GL_ARB_map_buffer_alignment
> -GL_ARB_texture_cube_map_array
> +GL_ARB_shading_language_packing
>  GL_ARB_texture_buffer_object_rgb32
> +GL_ARB_texture_cube_map_array
> +GL_ARB_vertex_type_2_10_10_10_rev
> +GL_EXT_color_buffer_float
> +GL_OES_depth_texture_cube_map
> +GL_OES_standard_derivatives
>  
>
>
> --
> 1.7.8.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59877] Build fail since r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59877

--- Comment #1 from Tom Stellard  ---
Created attachment 73664
  --> https://bugs.freedesktop.org/attachment.cgi?id=73664&action=edit
Possible fix

Does this patch help?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] r600g: only emit gfx cmd when there is actual work in it

2013-01-25 Thread Marek Olšák
You forgot about fences and queries other than timestamp. All queries
must be emitted even if there is no rendering between them (the GL
spec says that if a query is busy, any later query must be busy too,
and empty queries are allowed - we have piglit tests for all that).

Anyway, I think this is not needed and it's also prone to errors as your
patch shows. The current mechanism that prevents an empty CS from
being emitted is sufficient. The CS flushing is skipped if:
- cs->cdw == ctx->start_cs_cmd.num_dw in r600_context_flush, or
- cs->cdw == 0 in radeon_drm_cs_flush

Marek

On Fri, Jan 25, 2013 at 6:50 PM,   wrote:
> From: Jerome Glisse 
>
> Signed-off-by: Jerome Glisse 
> ---
>  src/gallium/drivers/r600/evergreen_compute.c | 2 ++
>  src/gallium/drivers/r600/r600_hw_context.c   | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 6 ++
>  src/gallium/drivers/r600/r600_pipe.h | 1 +
>  src/gallium/drivers/r600/r600_query.c| 2 ++
>  src/gallium/drivers/r600/r600_state_common.c | 1 +
>  6 files changed, 13 insertions(+)
>
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> b/src/gallium/drivers/r600/evergreen_compute.c
> index f4a7905..977595e 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -308,6 +308,8 @@ static void evergreen_emit_direct_dispatch(
> r600_write_value(cs, grid_layout[2]);
> /* VGT_DISPATCH_INITIATOR = COMPUTE_SHADER_EN */
> r600_write_value(cs, 1);
> +
> +   rctx->rings.gfx.cdraw++;
>  }
>
>  static void compute_emit_cs(struct r600_context *ctx, const uint 
> *block_layout,
> diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
> b/src/gallium/drivers/r600/r600_hw_context.c
> index d7518a5..511a276 100644
> --- a/src/gallium/drivers/r600/r600_hw_context.c
> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> @@ -1122,6 +1122,7 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
> size -= byte_count;
> src_offset += byte_count;
> dst_offset += byte_count;
> +   rctx->rings.gfx.cdraw++;
> }
>  }
>
> diff --git a/src/gallium/drivers/r600/r600_pipe.c 
> b/src/gallium/drivers/r600/r600_pipe.c
> index 6767412..af08cff 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -120,6 +120,10 @@ static void r600_flush(struct pipe_context *ctx, 
> unsigned flags)
> struct pipe_query *render_cond = NULL;
> unsigned render_cond_mode = 0;
>
> +   if (!rctx->rings.gfx.cdraw) {
> +   return;
> +   }
> +
> rctx->rings.gfx.flushing = true;
> /* Disable render condition. */
> if (rctx->current_render_cond) {
> @@ -130,6 +134,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned 
> flags)
>
> r600_context_flush(rctx, flags);
> rctx->rings.gfx.flushing = false;
> +   rctx->rings.gfx.cdraw = 0;
> r600_begin_new_cs(rctx);
>
> /* Re-enable render condition. */
> @@ -387,6 +392,7 @@ static struct pipe_context *r600_create_context(struct 
> pipe_screen *screen, void
> goto fail;
> }
>
> +   rctx->rings.gfx.cdraw = 0;
> rctx->rings.gfx.cs = rctx->ws->cs_create(rctx->ws, RING_GFX);
> rctx->rings.gfx.flush = r600_flush_gfx_ring;
> rctx->ws->cs_set_flush_callback(rctx->rings.gfx.cs, 
> r600_flush_from_winsys, rctx);
> diff --git a/src/gallium/drivers/r600/r600_pipe.h 
> b/src/gallium/drivers/r600/r600_pipe.h
> index 31dcd05..5c72756 100644
> --- a/src/gallium/drivers/r600/r600_pipe.h
> +++ b/src/gallium/drivers/r600/r600_pipe.h
> @@ -418,6 +418,7 @@ struct r600_fetch_shader {
>  struct r600_ring {
> struct radeon_winsys_cs *cs;
> boolflushing;
> +   unsignedcdraw;
> void (*flush)(void *ctx, unsigned flags);
>  };
>
> diff --git a/src/gallium/drivers/r600/r600_query.c 
> b/src/gallium/drivers/r600/r600_query.c
> index 0335189..7916f2d 100644
> --- a/src/gallium/drivers/r600/r600_query.c
> +++ b/src/gallium/drivers/r600/r600_query.c
> @@ -149,6 +149,7 @@ static void r600_emit_query_begin(struct r600_context 
> *ctx, struct r600_query *q
> cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF);
> cs->buf[cs->cdw++] = 0;
> cs->buf[cs->cdw++] = 0;
> +   ctx->rings.gfx.cdraw++;
> break;
> default:
> assert(0);
> @@ -201,6 +202,7 @@ static void r600_emit_query_end(struct r600_context *ctx, 
> struct r600_query *que
> cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF);
> cs->buf[cs->cdw++] = 0;
> cs->buf[cs->cdw++] = 0;
> +   ctx->rings.gfx.cdraw++;
> break;
> default:
> assert(0);
> diff --git a/src/gallium/drivers/r600/r600_state_commo

[Mesa-dev] [Bug 59880] New: piglit arb_uniform_buffer_object-dlist regression

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59880

  Priority: medium
Bug ID: 59880
  Keywords: regression
CC: i...@freedesktop.org
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: piglit arb_uniform_buffer_object-dlist regression
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

mesa: c1d35aece0afc2822d6d9f6c22664c04e6fcbba3 (master)

$ ./bin/arb_uniform_buffer_object-dlist -auto
Mesa: User error: GL_INVALID_VALUE in glUniformBlockBinding(block index 1 >= 1)
Mesa: User error: GL_INVALID_VALUE in glGetActiveUniformBlockiv(block index 1
>= 1)
piglit/tests/spec/arb_uniform_buffer_object/dlist.c:129: Binding 1 should be 3,
was 2
Mesa: User error: GL_INVALID_VALUE in glUniformBlockBinding(block index 1 >= 1)
Mesa: User error: GL_INVALID_VALUE in glGetActiveUniformBlockiv(block index 1
>= 1)
piglit/tests/spec/arb_uniform_buffer_object/dlist.c:137: Binding 1 should be 3,
was 2
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at piglit/tests/spec/arb_uniform_buffer_object/dlist.c:148)
PIGLIT: {'result': 'fail' }

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
32f322925592e9eeda6a5624c7320232fc170c03
514f8c7ec7cc1ab18be93cebb5b9bf970b1955a9
f09d77b2af0e6e7553a1e2efca2f12fe2e4dcea8
22233da1ee4b59663966169759960c00c033d0e9
We cannot bisect more!
bisect run cannot continue any more

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59879] New: reducing symbol visibility of shared objects / static libstdc++

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59879

  Priority: medium
Bug ID: 59879
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: reducing symbol visibility of shared objects / static
libstdc++
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: liquid.a...@gmx.net
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: unspecified
 Component: Mesa core
   Product: Mesa

Hello,

this is sort of cleaned up report of bug #37637.

To quickly summarize what happens there: Build r600g with the llvm compiler
backend and try starting ut2003. Segfault happens since apparantly ut's engine
has a version of libstdc++ built in, which now clashes with the libstdc++
shared lib which either r600_dri.so or LLVM (when build as shared) loads. This
is independent of preloading order. When the symbols from the system libstdc++
take preference, then the game engine crashes. When the game engine symbols
take preference, the r600g driver initialization crashes.

The fix for the problem: Since we can't modify the ut2003 binary, we have to
hide the "duplicate" symbols somehow.

This means:
- build r600g with static llvm
- build r600 with static libstdc++
- only make those symbols in r600_dri.so visible which are necessary

Building r600g with static llvm is trivial. The symbol visibility can be
properly handled by ld:
http://sourceware.org/binutils/docs-2.21/ld/VERSION.html#VERSION

My current dri-symbols.map:
{
  global: __dri*; dri*; _glapi*;
  local: *;
};

This hides everything except for the symbols matching __dri*, dri* and _glapi*.
This can be potentially reduced even further. However it's not clear to me what
the loader code in libGL really needs.

This version-script'ing can be properly put into autotools language:
http://www.gnu.org/software/gnulib/manual/html_node/LD-Version-Scripts.html




What I'm struggling with is properly telling autotools to build a shared lib
with static libstdc++. The gcc manpage mentions an option called
"-static-libstdc++", but it doesn't seem to have any effect.

Let's look at the critical calls (I've shortened them somewhat):

OK, we're in src/gallium/targets/dri-r600. The last libtool call that the
Makefile executes is the following one:

bin/sh ../../../../libtool  --tag=CXX   --mode=link g++  -g -O2 -Wall
-fno-strict-aliasing -fno-builtin-memcmp  -module -avoid-version -shared
-no-undefined
-Wl,--version-script=../../../../src/gallium/targets/dri-symbols.map
-L/usr/lib64/llvm  -lpthread -lffi -ldl -lm   -o r600_dri.la -rpath
/usr/local/lib/dri target.lo utils.lo dri_util.lo xmlconfig.lo 
-ldrm   -lexpat -lm -lrt -lpthread -ldl -ldrm   -ldrm_radeon   


libtool itself produces this call from it:

g++ -fPIC -DPIC -shared -nostdlib  -Wl,--whole-archive  -Wl,--no-whole-archive  -L/usr/lib64/llvm  -lexpat
-lrt -lpthread -ldl -ldrm -ldrm_radeon 
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../lib64 -L/lib/../lib64
-L/usr/lib/../lib64
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../x86_64-pc-linux-gnu/lib
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../.. -lstdc++ -lm -lc -lgcc_s
 -O2
-Wl,--version-script=../../../../src/gallium/targets/dri-symbols.map
-Wl,-soname -Wl,r600_dri.so -o .libs/r600_dri.so

Notice the "-lstdc++", dynamic linking to libstdc++. What I'd like libtool (and
eventually autotools) to produce is the following:

g++ -fPIC -DPIC -shared -nostdlib  -Wl,-Bstatic -lstdc++
-Wl,-Bdynamic -Wl,--whole-archive  -Wl,--no-whole-archive 
-L/usr/lib64/llvm  -lexpat -lrt -lpthread -ldl -ldrm
-ldrm_radeon  -L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../lib64 -L/lib/../lib64
-L/usr/lib/../lib64
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../../../x86_64-pc-linux-gnu/lib
-L/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.2/../../.. -lm -lc -lgcc_s  -O2 -Wl,--version-script=../../../../src/gallium/targets/dri-symbols.map
-Wl,-soname -Wl,r600_dri.so -o .libs/r600_dri.so

This is just removing "-lstdc++" and replacing it by "-Wl,-Bstatic -lstdc++
-Wl,-Bdynamic" at a different position (!). Putting it in front of the archive
assembly seems to be critical.

This links fine (no warnings, etc.) and produces a .so that is loaded properly
by libGL's loader and (more importantly) works fine with ut2003.

However it is still a mystery to me how to makes this clear to either libtool
or autotools.

Greets,
Tobias

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59877] New: Build fail since r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59877

  Priority: medium
Bug ID: 59877
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Build fail since r600g: Don't build llvm_wrapper.cpp
when we aren't using LLVM
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: li...@andyfurniss.entadsl.com
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

make -k distclean
git clean -dfx

./autogen.sh --prefix=/usr --disable-egl --enable-texture-float
--enable-gallium-g3dvl --enable-r600-llvm-compiler
--with-gallium-drivers=r600,swrast --with-dri-drivers= && make -j5

Making all in r600
make[4]: Entering directory
`/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/r600'
  CC   r600_asm.lo
In file included from r600_pipe.h:33:0,
 from r600_formats.h:5,
 from r600_asm.c:25:
r600_llvm.h:7:25: fatal error: radeon_llvm.h: No such file or directory
compilation terminated.
make[4]: *** [r600_asm.lo] Error 1
make[4]: Leaving directory
`/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/r600'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/mnt/sdb1/Src64/Mesa-git/mesa/src'
make: *** [all-recursive] Error 1

andy [ /mnt/sdb1/Src64/Mesa-git/mesa ]$ find ./ -name radeon_llvm.h
./src/gallium/drivers/radeon/radeon_llvm.h

Reverting 
264e6dad28e64755dc1580abdbb4e339c3439883
r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM

will build OK (but not work due to undefined symbol)

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59876] New: glGetTexLevelParameteriv broken for indirect rendering

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59876

  Priority: medium
Bug ID: 59876
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: glGetTexLevelParameteriv broken for indirect rendering
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: gl...@gclements.plus.com
  Hardware: Other
Status: NEW
   Version: 9.0
 Component: GLX
   Product: Mesa

With indirect rendering, glGetTexLevelParameteriv() is returning garbage, at
least for GL_TEXTURE_WIDTH and GL_TEXTURE_HEIGHT.

I don't know whether this is in libGL, XCB or the X server. I've tried it with
several X servers (including Xorg, Xvnc, Xvfb, Cygwin's XWin.exe and Xming),
but they're all based on the same underlying code base so that doesn't mean
much.

I managed to track it down as far as the USE_XCB branch of
__indirect_glGetTexLevelParameteriv() in src/glx/indirect.c. The reply contains
the following:

> print *reply
$7 = {
  response_type = 1 '\001', 
  pad0 = 0 '\000', 
  sequence = 65, 
  length = 0, 
  pad1 = "\000\000\000", 
  n = 1, 
  datum = 256, 
  pad2 = "\230y'a\000\000\000\000\000\000\000"
}

As xcb_glx_get_tex_level_parameteriv_data_length(reply) (i.e. reply->n) is
non-zero, it expects to find the value at
xcb_glx_get_tex_level_parameteriv_data() (i.e. following the structure), but
the correct value (256) is actually in reply->datum (which would have been used
if reply->n was zero).

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59873] New: [swrast] piglit ext_framebuffer_multisample-interpolation 0 centroid-edges regression

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59873

  Priority: medium
Bug ID: 59873
  Keywords: regression
CC: bri...@vmware.com
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [swrast] piglit
ext_framebuffer_multisample-interpolation 0
centroid-edges regression
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

mesa: c1d35aece0afc2822d6d9f6c22664c04e6fcbba3 (master)

$ ./bin/ext_framebuffer_multisample-interpolation 0 centroid-edges -auto
Probe at (70,86)
  Left: 1.00 1.00 1.00 1.00
  Right: 0.00 0.00 1.00 1.00
PIGLIT: {'result': 'fail' }

728bf86a23f6de137c0871ea87b09e75e55468a9 is the first bad commit
commit 728bf86a23f6de137c0871ea87b09e75e55468a9
Author: Brian Paul 
Date:   Mon Jan 21 08:59:25 2013 -0700

swrast: move resampleRow setup code in blit_nearest()

The resampleRow setup depends on pixelSize.  For color buffers,
we don't know the pixelSize until we're in the buffer loop.  Move
that code inside the loop.

Fixes: http://bugs.freedesktop.org/show_bug.cgi?id=59541

Reviewed-by: José Fonseca 

:04 04 ab31ec7b1a500e0d1a18fed21d9aa50e1161e548
bbb140aa5922081d252b2cc4eea6f8ec113c4652 Msrc
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gles3: Update gl3.h

2013-01-25 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On Fri, Jan 25, 2013 at 4:07 PM, Matt Turner  wrote:
> Contains a fix for Khronos bug 9557.
> ---
>  include/GLES3/gl3.h |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/GLES3/gl3.h b/include/GLES3/gl3.h
> index b9399e9..09f2b53 100644
> --- a/include/GLES3/gl3.h
> +++ b/include/GLES3/gl3.h
> @@ -2,7 +2,7 @@
>  #define __gl3_h_
>
>  /*
> - * gl3.h last updated on $Date: 2012-09-12 10:13:02 -0700 (Wed, 12 Sep 2012) 
> $
> + * gl3.h last updated on $Date: 2012-10-03 07:52:40 -0700 (Wed, 03 Oct 2012) 
> $
>   */
>
>  #include 
> @@ -796,7 +796,7 @@ typedef struct __GLsync *GLsync;
>  #define GL_TEXTURE_IMMUTABLE_FORMAT  0x912F
>  #define GL_MAX_ELEMENT_INDEX 0x8D6B
>  #define GL_NUM_SAMPLE_COUNTS 0x9380
> -#define GL_TEXTURE_IMMUTABLE_LEVELS  0x8D63
> +#define GL_TEXTURE_IMMUTABLE_LEVELS  0x82DF
>
>  /*-
>   * Entrypoint definitions
> --
> 1.7.8.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] gallivm, draw, llvmpipe: mass rename of unit->texture_unit/sampler_unit

2013-01-25 Thread sroland
From: Roland Scheidegger 

Make it obvious what "unit" this is (no change in functionality).
draw still uses "unit" in places where it changes the shader by adding
texture sampling itself - it seems like this can't work with shaders
using dx10-style sample opcodes (can't mix gl-style and dx10-style
sample instructions in a shader).
---
 src/gallium/auxiliary/draw/draw_llvm_sample.c |   32 -
 src/gallium/auxiliary/gallivm/lp_bld_sample.c |   18 +++---
 src/gallium/auxiliary/gallivm/lp_bld_sample.h |   32 -
 src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |2 +-
 src/gallium/auxiliary/gallivm/lp_bld_sample_aos.h |2 +-
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |   72 ++---
 src/gallium/drivers/llvmpipe/lp_tex_sample.c  |   32 -
 7 files changed, 95 insertions(+), 95 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c 
b/src/gallium/auxiliary/draw/draw_llvm_sample.c
index ac1c031..3f866d4 100644
--- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
+++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
@@ -86,7 +86,7 @@ struct draw_llvm_sampler_soa
 static LLVMValueRef
 draw_llvm_texture_member(const struct lp_sampler_dynamic_state *base,
  struct gallivm_state *gallivm,
- unsigned unit,
+ unsigned texture_unit,
  unsigned member_index,
  const char *member_name,
  boolean emit_load)
@@ -98,14 +98,14 @@ draw_llvm_texture_member(const struct 
lp_sampler_dynamic_state *base,
LLVMValueRef ptr;
LLVMValueRef res;
 
-   debug_assert(unit < PIPE_MAX_SHADER_SAMPLER_VIEWS);
+   debug_assert(texture_unit < PIPE_MAX_SHADER_SAMPLER_VIEWS);
 
/* context[0] */
indices[0] = lp_build_const_int32(gallivm, 0);
/* context[0].textures */
indices[1] = lp_build_const_int32(gallivm, DRAW_JIT_CTX_TEXTURES);
/* context[0].textures[unit] */
-   indices[2] = lp_build_const_int32(gallivm, unit);
+   indices[2] = lp_build_const_int32(gallivm, texture_unit);
/* context[0].textures[unit].member */
indices[3] = lp_build_const_int32(gallivm, member_index);
 
@@ -116,7 +116,7 @@ draw_llvm_texture_member(const struct 
lp_sampler_dynamic_state *base,
else
   res = ptr;
 
-   lp_build_name(res, "context.texture%u.%s", unit, member_name);
+   lp_build_name(res, "context.texture%u.%s", texture_unit, member_name);
 
return res;
 }
@@ -133,7 +133,7 @@ draw_llvm_texture_member(const struct 
lp_sampler_dynamic_state *base,
 static LLVMValueRef
 draw_llvm_sampler_member(const struct lp_sampler_dynamic_state *base,
  struct gallivm_state *gallivm,
- unsigned unit,
+ unsigned sampler_unit,
  unsigned member_index,
  const char *member_name,
  boolean emit_load)
@@ -145,14 +145,14 @@ draw_llvm_sampler_member(const struct 
lp_sampler_dynamic_state *base,
LLVMValueRef ptr;
LLVMValueRef res;
 
-   debug_assert(unit < PIPE_MAX_SAMPLERS);
+   debug_assert(sampler_unit < PIPE_MAX_SAMPLERS);
 
/* context[0] */
indices[0] = lp_build_const_int32(gallivm, 0);
/* context[0].samplers */
indices[1] = lp_build_const_int32(gallivm, DRAW_JIT_CTX_SAMPLERS);
/* context[0].samplers[unit] */
-   indices[2] = lp_build_const_int32(gallivm, unit);
+   indices[2] = lp_build_const_int32(gallivm, sampler_unit);
/* context[0].samplers[unit].member */
indices[3] = lp_build_const_int32(gallivm, member_index);
 
@@ -163,7 +163,7 @@ draw_llvm_sampler_member(const struct 
lp_sampler_dynamic_state *base,
else
   res = ptr;
 
-   lp_build_name(res, "context.sampler%u.%s", unit, member_name);
+   lp_build_name(res, "context.sampler%u.%s", sampler_unit, member_name);
 
return res;
 }
@@ -182,9 +182,9 @@ draw_llvm_sampler_member(const struct 
lp_sampler_dynamic_state *base,
static LLVMValueRef \
draw_llvm_texture_##_name( const struct lp_sampler_dynamic_state *base, \
   struct gallivm_state *gallivm,   \
-  unsigned unit)\
+  unsigned texture_unit)   \
{ \
-  return draw_llvm_texture_member(base, gallivm, unit, _index, #_name, 
_emit_load ); \
+  return draw_llvm_texture_member(base, gallivm, texture_unit, _index, 
#_name, _emit_load ); \
}
 
 
@@ -203,9 +203,9 @@ DRAW_LLVM_TEXTURE_MEMBER(mip_offsets, 
DRAW_JIT_TEXTURE_MIP_OFFSETS, FALSE)
static LLVMValueRef \
draw_llvm_sampler_##_name( const struct lp_sampler_dynamic_state *base, \
   struct gallivm_state *gallivm,   \
-  unsigned unit)\
+  unsigned sampler_unit)   

[Mesa-dev] [Bug 59872] New: [swrast] piglit depth_texture_mode_and_swizzle regression

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59872

  Priority: medium
Bug ID: 59872
  Keywords: regression
CC: cwo...@cworth.org
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [swrast] piglit depth_texture_mode_and_swizzle
regression
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

mesa: c1d35aece0afc2822d6d9f6c22664c04e6fcbba3 (master)

$ ./bin/depth_texture_mode_and_swizzle -auto
Probe at (10,10)
  Expected: 0.50 0.50 0.50 0.50
  Observed: 0.501961 0.501961 0.501961 1.00
Probe at (30,10)
  Expected: 1.00 0.50 0.50 0.50
  Observed: 1.00 0.501961 0.501961 1.00
Probe at (130,10)
  Expected: 0.00 0.00 0.00 0.50
  Observed: 0.00 0.00 0.00 1.00
Probe at (150,10)
  Expected: 1.00 0.00 0.50 0.00
  Observed: 1.00 0.00 0.501961 1.00
PIGLIT: {'result': 'fail' }


570ed2be7d776211e1ca2a7a4c44ee6a1d141714 is the first bad commit
commit 570ed2be7d776211e1ca2a7a4c44ee6a1d141714
Author: Carl Worth 
Date:   Mon Jan 21 12:16:27 2013 -0800

ReadPixels: Force ALPHA to 1 while rebasing RGBA values for GL_RGB format

When performing a ReadPixels operation, we may be reading from a buffer
that
stores alpha values, but that is actually representing a buffer with no
alpha
channel. In this case, while rebasing the values, touch up all alpha values
read to 1.0.

This commit fixes the following piglit (sub) tests:

ARB_texture_float/fbo-colormask-formats
GL_RBG16F_ARB
EXT_texture_snorm/fbo-colormask-formats
GL_RGB16_SNORM
GL_RGB8_SNORM
GL_RGB_SNORM

It likely improves the results of other tests as well, but a PASS remains
elusive due to additional bugs.

Reviewed-by: Brian Paul 
Reviewed-by: Anuj Phogat 

:04 04 144369a7d3779929bad84beca8f3a5b2ccf90640
c25eb37e73f6f6e5435230fe8a799b1b62ed347b Msrc
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.

2013-01-25 Thread Matt Turner
On Thu, Jan 24, 2013 at 7:33 PM, Eric Anholt  wrote:
> While most of our development and testing is on x86-64, some of our
> major consumers of the driver are on i386 still.  This meant they aren't
> taking advantage of SSE for floating point math or cmov instructions,
> unless the user went out of their way to choose a -march flag
> (unlikely).  Given that the driver can only get probed on i965 and newer
> chipsets, which only support core2 and above CPUs, this is safe.
>
> Improves (32-bit) GLbenchmark 2.1 offscreen performance by .76 +/- 0.35%
> (n=19)
> ---
>  configure.ac  |   17 +
>  src/mesa/drivers/dri/i965/Makefile.am |3 ++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index e769eda..0af3176 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -492,6 +492,23 @@ if test "x$enable_asm" = xyes; then
>  fi
>  AC_SUBST([MESA_ASM_FILES])
>
> +# If the user hasn't set an explicit -march flag, then autodetect a few for
> +# use by the i965 driver.
> +if echo $CFLAGS | grep -v march > /dev/null; then
> +case "$host_cpu" in
> +i?86 | x86_64)
> +save_CFLAGS="$CFLAGS"
> +AC_MSG_CHECKING([whether $CC supports -march=core2])
> +CFLAGS="$save_CFLAGS -march=core2"
> +AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [[]])],
> +  [AC_MSG_RESULT([yes]); 
> MARCH_CORE2="-march=core2"],
> +  [AC_MSG_RESULT([no]); MARCH_CORE2=""])
> +CFLAGS="$save_CFLAGS"
> +;;
> +esac
> +fi
> +AC_SUBST([MARCH_CORE2])
> +
>  dnl Check to see if dlopen is in default libraries (like Solaris, which
>  dnl has it in libc), or if libdl is needed to get it.
>  AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
> b/src/mesa/drivers/dri/i965/Makefile.am
> index dc140df..d5d0631 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -38,7 +38,8 @@ AM_CFLAGS = \
> $(DEFINES) \
> $(API_DEFINES) \
> $(VISIBILITY_CFLAGS) \
> -   $(INTEL_CFLAGS)
> +   $(INTEL_CFLAGS) \
> +   $(MARCH_CORE2)
>
>  AM_CXXFLAGS = $(AM_CFLAGS)
>
> --
> 1.7.10.4

Nice. Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gles3: Update gl3.h

2013-01-25 Thread Matt Turner
Contains a fix for Khronos bug 9557.
---
 include/GLES3/gl3.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/GLES3/gl3.h b/include/GLES3/gl3.h
index b9399e9..09f2b53 100644
--- a/include/GLES3/gl3.h
+++ b/include/GLES3/gl3.h
@@ -2,7 +2,7 @@
 #define __gl3_h_
 
 /* 
- * gl3.h last updated on $Date: 2012-09-12 10:13:02 -0700 (Wed, 12 Sep 2012) $
+ * gl3.h last updated on $Date: 2012-10-03 07:52:40 -0700 (Wed, 03 Oct 2012) $
  */
 
 #include 
@@ -796,7 +796,7 @@ typedef struct __GLsync *GLsync;
 #define GL_TEXTURE_IMMUTABLE_FORMAT  0x912F
 #define GL_MAX_ELEMENT_INDEX 0x8D6B
 #define GL_NUM_SAMPLE_COUNTS 0x9380
-#define GL_TEXTURE_IMMUTABLE_LEVELS  0x8D63
+#define GL_TEXTURE_IMMUTABLE_LEVELS  0x82DF
 
 /*-
  * Entrypoint definitions
-- 
1.7.8.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600g: fix up CP DMA for VM on cayman and TN

2013-01-25 Thread alexdeucher
From: Alex Deucher 

Need to add the virtual address.

Signed-off-by: Alex Deucher 
---
 src/gallium/drivers/r600/r600.h|4 ++--
 src/gallium/drivers/r600/r600_hw_context.c |   11 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h
index 93604fb..06e914f 100644
--- a/src/gallium/drivers/r600/r600.h
+++ b/src/gallium/drivers/r600/r600.h
@@ -172,8 +172,8 @@ void r600_context_streamout_end(struct r600_context *ctx);
 void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw, boolean 
count_draw_in);
 void r600_context_block_emit_dirty(struct r600_context *ctx, struct r600_block 
*block, unsigned pkt_flags);
 void r600_cp_dma_copy_buffer(struct r600_context *rctx,
-struct pipe_resource *dst, unsigned dst_offset,
-struct pipe_resource *src, unsigned src_offset,
+struct pipe_resource *dst, unsigned long 
dst_offset,
+struct pipe_resource *src, unsigned long 
src_offset,
 unsigned size);
 
 int evergreen_context_init(struct r600_context *ctx);
diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index caebf5c..e13b502 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -1065,8 +1065,8 @@ void r600_context_streamout_end(struct r600_context *ctx)
 #define CP_DMA_MAX_BYTE_COUNT ((1 << 21) - 8)
 
 void r600_cp_dma_copy_buffer(struct r600_context *rctx,
-struct pipe_resource *dst, unsigned dst_offset,
-struct pipe_resource *src, unsigned src_offset,
+struct pipe_resource *dst, unsigned long 
dst_offset,
+struct pipe_resource *src, unsigned long 
src_offset,
 unsigned size)
 {
struct radeon_winsys_cs *cs = rctx->cs;
@@ -1079,6 +1079,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
return;
}
 
+   dst_offset += r600_resource_va(&rctx->screen->screen, dst);
+   src_offset += r600_resource_va(&rctx->screen->screen, src);
+
/* We flush the caches, because we might read from or write
 * to resources which are bound right now. */
rctx->flags |= R600_CONTEXT_INVAL_READ_CACHES |
@@ -1112,9 +1115,9 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
 
r600_write_value(cs, PKT3(PKT3_CP_DMA, 4, 0));
r600_write_value(cs, src_offset);   /* SRC_ADDR_LO [31:0] */
-   r600_write_value(cs, sync); /* CP_SYNC [31] | 
SRC_ADDR_HI [7:0] */
+   r600_write_value(cs, sync | ((src_offset >> 32) & 0xff));   
/* CP_SYNC [31] | SRC_ADDR_HI [7:0] */
r600_write_value(cs, dst_offset);   /* DST_ADDR_LO [31:0] */
-   r600_write_value(cs, 0);/* DST_ADDR_HI [7:0] */
+   r600_write_value(cs, (dst_offset >> 32) & 0xff);
/* DST_ADDR_HI [7:0] */
r600_write_value(cs, byte_count);   /* COMMAND [29:22] | 
BYTE_COUNT [20:0] */
 
r600_write_value(cs, PKT3(PKT3_NOP, 0, 0));
-- 
1.7.7.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Trying MSAAx2 (r300g) on RS690/AMD Radeon X1200 128MB

2013-01-25 Thread Bryan Quigley
When trying glxgears the screen locks up, and SSH eventually stops
responding as well, but I was able to get these messages from kern.log:

[  790.516059] radeon :01:05.0: >GPU lockup CP stall for more than
1msec
[  790.516076] radeon :01:05.0: >GPU lockup (waiting for
0x215b last fence id 0x2157)
[  790.664495] radeon: wait for empty RBBM fifo failed ! Bad things might
happen.
[  790.793829] Failed to wait GUI idle while programming pipes. Bad things
might happen.
[  790.794831] radeon :01:05.0: >(rs600_asic_reset:357)
RBBM_STATUS=0x9411C100
[  791.292885] radeon :01:05.0: >(rs600_asic_reset:377)
RBBM_STATUS=0x9401C100
[  791.789934] radeon :01:05.0: >(rs600_asic_reset:385)
RBBM_STATUS=0x9400C100

Testing as requested in commit 8ed6b1400.  Using the oibaf PPA on a Quantal
Lubuntu LiveUSB.  I just upgraded mesa from oibaf repo this second time,
but it still crashed when I did everything (Xorg/libdrm). Unfortunately, I
can only do live testing on this machine and using the persistence file
didn't seem to work with the big Xorg changes required, so upgrading mesa
will be done fresh for every test.

Thanks,
Bryan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] R600: Fold remaining CONST_COPY after expand pseudo inst

2013-01-25 Thread Vincent Lejeune
---
 lib/Target/R600/AMDGPUTargetMachine.cpp |   2 +-
 lib/Target/R600/R600LowerConstCopy.cpp  | 170 +---
 2 files changed, 160 insertions(+), 12 deletions(-)

diff --git a/lib/Target/R600/AMDGPUTargetMachine.cpp 
b/lib/Target/R600/AMDGPUTargetMachine.cpp
index 7b069e7..2185be3 100644
--- a/lib/Target/R600/AMDGPUTargetMachine.cpp
+++ b/lib/Target/R600/AMDGPUTargetMachine.cpp
@@ -136,8 +136,8 @@ bool AMDGPUPassConfig::addPreEmitPass() {
 addPass(createAMDGPUCFGPreparationPass(*TM));
 addPass(createAMDGPUCFGStructurizerPass(*TM));
 addPass(createR600ExpandSpecialInstrsPass(*TM));
-addPass(createR600LowerConstCopy(*TM));
 addPass(&FinalizeMachineBundlesID);
+addPass(createR600LowerConstCopy(*TM));
   } else {
 addPass(createSILowerLiteralConstantsPass(*TM));
 addPass(createSILowerControlFlowPass(*TM));
diff --git a/lib/Target/R600/R600LowerConstCopy.cpp 
b/lib/Target/R600/R600LowerConstCopy.cpp
index d14ae20..2557e8f 100644
--- a/lib/Target/R600/R600LowerConstCopy.cpp
+++ b/lib/Target/R600/R600LowerConstCopy.cpp
@@ -13,7 +13,6 @@
 /// fold them inside vector instruction, like DOT4 or Cube ; ISel emits
 /// ConstCopy instead. This pass (executed after ExpandingSpecialInstr) will 
try
 /// to fold them if possible or replace them by MOV otherwise.
-/// TODO : Implement the folding part, using Copy Propagation algorithm.
 //
 
//===--===//
 
@@ -30,6 +29,13 @@ class R600LowerConstCopy : public MachineFunctionPass {
 private:
   static char ID;
   const R600InstrInfo *TII;
+
+  struct ConstPairs {
+unsigned XYPair;
+unsigned ZWPair;
+  };
+
+  bool canFoldInBundle(ConstPairs &UsedConst, unsigned ReadConst) const;
 public:
   R600LowerConstCopy(TargetMachine &tm);
   virtual bool runOnMachineFunction(MachineFunction &MF);
@@ -39,27 +45,169 @@ public:
 
 char R600LowerConstCopy::ID = 0;
 
-
 R600LowerConstCopy::R600LowerConstCopy(TargetMachine &tm) :
 MachineFunctionPass(ID),
 TII (static_cast(tm.getInstrInfo()))
 {
 }
 
+bool R600LowerConstCopy::canFoldInBundle(ConstPairs &UsedConst,
+unsigned ReadConst) const {
+  unsigned ReadConstChan = ReadConst & 3;
+  unsigned ReadConstIndex = ReadConst & (~3);
+  if (ReadConstChan < 2) {
+if (!UsedConst.XYPair) {
+  UsedConst.XYPair = ReadConstIndex;
+}
+return UsedConst.XYPair == ReadConstIndex;
+  } else {
+if (!UsedConst.ZWPair) {
+  UsedConst.ZWPair = ReadConstIndex;
+}
+return UsedConst.ZWPair == ReadConstIndex;
+  }
+}
+
+static bool isControlFlow(const MachineInstr &MI) {
+  return (MI.getOpcode() == AMDGPU::IF_PREDICATE_SET) ||
+  (MI.getOpcode() == AMDGPU::ENDIF) ||
+  (MI.getOpcode() == AMDGPU::ELSE) ||
+  (MI.getOpcode() == AMDGPU::WHILELOOP) ||
+  (MI.getOpcode() == AMDGPU::BREAK);
+}
+
 bool R600LowerConstCopy::runOnMachineFunction(MachineFunction &MF) {
+
   for (MachineFunction::iterator BB = MF.begin(), BB_E = MF.end();
   BB != BB_E; ++BB) {
 MachineBasicBlock &MBB = *BB;
-for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end();
-  I != E;) {
-  MachineInstr &MI = *I;
-  I = llvm::next(I);
-  if (MI.getOpcode() != AMDGPU::CONST_COPY)
+DenseMap RegToConstIndex;
+for (MachineBasicBlock::instr_iterator I = MBB.instr_begin(),
+E = MBB.instr_end(); I != E;) {
+
+  if (I->getOpcode() == AMDGPU::CONST_COPY) {
+MachineInstr &MI = *I;
+I = llvm::next(I);
+unsigned DstReg = MI.getOperand(0).getReg();
+DenseMap::iterator SrcMI =
+RegToConstIndex.find(DstReg);
+if (SrcMI != RegToConstIndex.end()) {
+  SrcMI->second->eraseFromParent();
+  RegToConstIndex.erase(SrcMI);
+}
+MachineInstr *NewMI = 
+TII->buildDefaultInstruction(MBB, &MI, AMDGPU::MOV,
+MI.getOperand(0).getReg(), AMDGPU::ALU_CONST);
+TII->setImmOperand(NewMI, R600Operands::SRC0_SEL,
+MI.getOperand(1).getImm());
+RegToConstIndex[DstReg] = NewMI;
+MI.eraseFromParent();
 continue;
-  MachineInstr *NewMI = TII->buildDefaultInstruction(MBB, I, AMDGPU::MOV,
-  MI.getOperand(0).getReg(), AMDGPU::ALU_CONST);
-  NewMI->getOperand(9).setImm(MI.getOperand(1).getImm());
-  MI.eraseFromParent();
+  }
+
+  std::vector Defs;
+  // We consider all Instructions as bundled because algorithm that  handle
+  // const read port limitations inside an IG is still valid with single
+  // instructions.
+  std::vector Bundle;
+
+  if (I->isBundle()) {
+unsigned BundleSize = I->getBundleSize();
+for (unsigned i = 0; i < BundleSize; i++) {
+  I = llvm::next(I);
+  Bundle.push_back(I);
+}
+  } else if (TII->isALUInstr(I->getOpcode())){
+Bundle.push_back(I);
+  } el

Re: [Mesa-dev] [PATCH] i965: Compile the driver with -march=core2.

2013-01-25 Thread Roland Scheidegger
I'm quite sure there are g965 boards around which indeed support Pentium
4 (and P4-based Celerons) (but yes I guess cmov and at least sse2 are
safe - not that the p4 had a usable cmov implementation as it was
incredibly slow IIRC but it should at least work).

Roland


Am 25.01.2013 04:33, schrieb Eric Anholt:
> While most of our development and testing is on x86-64, some of our
> major consumers of the driver are on i386 still.  This meant they aren't
> taking advantage of SSE for floating point math or cmov instructions,
> unless the user went out of their way to choose a -march flag
> (unlikely).  Given that the driver can only get probed on i965 and newer
> chipsets, which only support core2 and above CPUs, this is safe.
> 
> Improves (32-bit) GLbenchmark 2.1 offscreen performance by .76 +/- 0.35%
> (n=19)
> ---
>  configure.ac  |   17 +
>  src/mesa/drivers/dri/i965/Makefile.am |3 ++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/configure.ac b/configure.ac
> index e769eda..0af3176 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -492,6 +492,23 @@ if test "x$enable_asm" = xyes; then
>  fi
>  AC_SUBST([MESA_ASM_FILES])
>  
> +# If the user hasn't set an explicit -march flag, then autodetect a few for
> +# use by the i965 driver.
> +if echo $CFLAGS | grep -v march > /dev/null; then
> +case "$host_cpu" in
> +i?86 | x86_64)
> +save_CFLAGS="$CFLAGS"
> +AC_MSG_CHECKING([whether $CC supports -march=core2])
> +CFLAGS="$save_CFLAGS -march=core2"
> +AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [[]])],
> +  [AC_MSG_RESULT([yes]); 
> MARCH_CORE2="-march=core2"],
> +  [AC_MSG_RESULT([no]); MARCH_CORE2=""])
> +CFLAGS="$save_CFLAGS"
> +;;
> +esac
> +fi
> +AC_SUBST([MARCH_CORE2])
> +
>  dnl Check to see if dlopen is in default libraries (like Solaris, which
>  dnl has it in libc), or if libdl is needed to get it.
>  AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
> b/src/mesa/drivers/dri/i965/Makefile.am
> index dc140df..d5d0631 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -38,7 +38,8 @@ AM_CFLAGS = \
>   $(DEFINES) \
>   $(API_DEFINES) \
>   $(VISIBILITY_CFLAGS) \
> - $(INTEL_CFLAGS)
> + $(INTEL_CFLAGS) \
> + $(MARCH_CORE2)
>  
>  AM_CXXFLAGS = $(AM_CFLAGS)
>  
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] docs: List new extensions added in Mesa 9.1

2013-01-25 Thread Matt Turner
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
---
We should also write something about ES3 and the float-texture & S3TC
changes.

 docs/relnotes-9.1.html |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/docs/relnotes-9.1.html b/docs/relnotes-9.1.html
index ffca275..14e6c02 100644
--- a/docs/relnotes-9.1.html
+++ b/docs/relnotes-9.1.html
@@ -44,9 +44,19 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+GL_ANGLE_texture_compression_dxt3
+GL_ANGLE_texture_compression_dxt5
+GL_ARB_base_instance
+GL_ARB_ES3_compatibility
+GL_ARB_internalformat_query
 GL_ARB_map_buffer_alignment
-GL_ARB_texture_cube_map_array
+GL_ARB_shading_language_packing
 GL_ARB_texture_buffer_object_rgb32
+GL_ARB_texture_cube_map_array
+GL_ARB_vertex_type_2_10_10_10_rev
+GL_EXT_color_buffer_float
+GL_OES_depth_texture_cube_map
+GL_OES_standard_derivatives
 
 
 
-- 
1.7.8.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59851] AC_ARG_WITH misusage leading to mesa configure failure

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59851

Matt Turner  changed:

   What|Removed |Added

 CC||tstel...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Matt Turner
On Thu, Jan 24, 2013 at 7:44 PM, Matt Turner  wrote:
> Following this email are eight patches that add the 4x8 pack/unpack
> operations that are the difference between what GLSL ES 3.0 and
> ARB_shading_language_packing require.
>
> They require Chad's gles3-glsl-packing series and are available at
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>
> I've also added testing support on top of Chad's piglit patch. The
> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> spot why.
>
> Please give it a look. I'd be nice to get this into 9.1.
>
> Thanks,
> Matt

Thanks for all the review comments. I've fixed the problems spotted
and pushed. The piglit patch is getting a second look-over before it's
pushed.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] r600g: add async for staging buffer upload v2

2013-01-25 Thread j . glisse
From: Jerome Glisse 

v2: Add virtual address to dma src/dst offset for cayman

Signed-off-by: Jerome Glisse 
---
 src/gallium/drivers/r600/evergreen_hw_context.c |  46 ++
 src/gallium/drivers/r600/evergreen_state.c  | 201 
 src/gallium/drivers/r600/evergreend.h   |  15 ++
 src/gallium/drivers/r600/r600.h |  27 
 src/gallium/drivers/r600/r600_buffer.c  |  25 ++-
 src/gallium/drivers/r600/r600_hw_context.c  |  48 +-
 src/gallium/drivers/r600/r600_pipe.c|   6 +-
 src/gallium/drivers/r600/r600_pipe.h|   9 ++
 src/gallium/drivers/r600/r600_state.c   | 190 ++
 src/gallium/drivers/r600/r600_state_common.c|   6 +-
 src/gallium/drivers/r600/r600_texture.c |  24 ++-
 src/gallium/drivers/r600/r600d.h|  15 ++
 12 files changed, 595 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_hw_context.c 
b/src/gallium/drivers/r600/evergreen_hw_context.c
index fa90c9a..ca4f4b3 100644
--- a/src/gallium/drivers/r600/evergreen_hw_context.c
+++ b/src/gallium/drivers/r600/evergreen_hw_context.c
@@ -26,6 +26,7 @@
 #include "r600_hw_context_priv.h"
 #include "evergreend.h"
 #include "util/u_memory.h"
+#include "util/u_math.h"
 
 static const struct r600_reg cayman_config_reg_list[] = {
{R_009100_SPI_CONFIG_CNTL, REG_FLAG_ENABLE_ALWAYS | 
REG_FLAG_FLUSH_CHANGE, 0},
@@ -238,3 +239,48 @@ void evergreen_set_streamout_enable(struct r600_context 
*ctx, unsigned buffer_en
r600_write_context_reg(cs, R_028B94_VGT_STRMOUT_CONFIG, 
S_028B94_STREAMOUT_0_EN(0));
}
 }
+
+void evergreen_dma_copy(struct r600_context *rctx,
+   struct pipe_resource *dst,
+   struct pipe_resource *src,
+   unsigned long dst_offset,
+   unsigned long src_offset,
+   unsigned long size)
+{
+   struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
+   unsigned i, ncopy, csize, sub_cmd, shift;
+   struct r600_resource *rdst = (struct r600_resource*)dst;
+   struct r600_resource *rsrc = (struct r600_resource*)src;
+
+   /* make sure that the dma ring is only one active */
+   rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
+   dst_offset += r600_resource_va(&rctx->screen->screen, dst);
+   src_offset += r600_resource_va(&rctx->screen->screen, src);
+
+   /* see if we use dword or byte copy */
+   if (!(dst_offset & 0x3) && !(src_offset & 0x3) && !(size & 0x3)) {
+   size >>= 2;
+   sub_cmd = 0x00;
+   shift = 2;
+   } else {
+   sub_cmd = 0x40;
+   shift = 0;
+   }
+   ncopy = (size / 0x000f) + !!(size % 0x000f);
+
+   r600_need_dma_space(rctx, ncopy * 5);
+   for (i = 0; i < ncopy; i++) {
+   csize = size < 0x000f ? size : 0x000f;
+   /* emit reloc before writting cs so that cs is always in 
consistent state */
+   r600_context_bo_reloc(rctx, &rctx->rings.dma, rsrc, 
RADEON_USAGE_READ);
+   r600_context_bo_reloc(rctx, &rctx->rings.dma, rdst, 
RADEON_USAGE_WRITE);
+   cs->buf[cs->cdw++] = DMA_PACKET(DMA_PACKET_COPY, sub_cmd, 
csize);
+   cs->buf[cs->cdw++] = dst_offset & 0x;
+   cs->buf[cs->cdw++] = src_offset & 0x;
+   cs->buf[cs->cdw++] = (dst_offset >> 32UL) & 0xff;
+   cs->buf[cs->cdw++] = (src_offset >> 32UL) & 0xff;
+   dst_offset += csize << shift;
+   src_offset += csize << shift;
+   size -= csize;
+   }
+}
diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 86e2c81..5c22e24 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -30,6 +30,20 @@
 #include "util/u_framebuffer.h"
 #include "util/u_dual_blend.h"
 #include "evergreen_compute.h"
+#include "util/u_math.h"
+
+static INLINE unsigned evergreen_array_mode(unsigned mode)
+{
+   switch (mode) {
+   case RADEON_SURF_MODE_LINEAR_ALIGNED:   return 
V_028C70_ARRAY_LINEAR_ALIGNED;
+   break;
+   case RADEON_SURF_MODE_1D:   return 
V_028C70_ARRAY_1D_TILED_THIN1;
+   break;
+   case RADEON_SURF_MODE_2D:   return 
V_028C70_ARRAY_2D_TILED_THIN1;
+   default:
+   case RADEON_SURF_MODE_LINEAR:   return 
V_028C70_ARRAY_LINEAR_GENERAL;
+   }
+}
 
 static uint32_t eg_num_banks(uint32_t nbanks)
 {
@@ -3445,3 +3459,190 @@ void evergreen_update_db_shader_control(struct 
r600_context * rctx)
rctx->db_misc_state.atom.dirty = true;
}
 }
+
+static void evergreen_dma_copy_tile(struct r600_context *rctx,
+   struct pipe_resource *dst,
+   unsigned dst_level,
+   unsigned dst

Re: [Mesa-dev] [PATCH] intel: Use a CPU map of the batch on LLC-sharing architectures.

2013-01-25 Thread Kenneth Graunke

On 01/25/2013 09:31 AM, Eric Anholt wrote:

Kenneth Graunke  writes:


On 01/20/2013 02:59 PM, Eric Anholt wrote:

Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in,
which was an improvement over mapping the batch through the GTT directly
(since any readback or other failure to stream through write combining
correctly would hurt).  However, on LLC-sharing architectures we can do better
by mapping the batch directly, which reduces the cache footprint of the
application since we no longer have this extra copy of a batchbuffer around.

Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4%
(n=21).  Improves Lightsmark performance by 1.1 +/- 0.1% (n=76).  Improves
cairo-gl performance by 1.9% +/- 1.4% (n=57).

No statistically significant difference in GLB2.1 on SNB (n=37).  Improves
cairo-gl performance by 2.1% +/- 0.1% (n=278).


Looks good to me.  Have you tested this on a non-LLC machine?


Not in a long time.  It shouldn't affect performance, since they get the
same behavior as before.


Okay.  I mashed has_llc to false and ran glxgears, which was proof 
enough that the non-LLC path still works.  I figured it did, but just 
wanted to check.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Paul Berry
On 25 January 2013 13:18, Matt Turner  wrote:

> On Fri, Jan 25, 2013 at 9:55 AM, Paul Berry 
> wrote:
> > On 25 January 2013 07:49, Paul Berry  wrote:
> >>
> >> On 24 January 2013 19:44, Matt Turner  wrote:
> >>>
> >>> Following this email are eight patches that add the 4x8 pack/unpack
> >>> operations that are the difference between what GLSL ES 3.0 and
> >>> ARB_shading_language_packing require.
> >>>
> >>> They require Chad's gles3-glsl-packing series and are available at
> >>>
> >>>
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
> >>>
> >>> I've also added testing support on top of Chad's piglit patch. The
> >>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> >>> spot why.
> >>
> >>
> >> I had minor comments on patches 4/8 and 5/8.  The remainder is:
> >>
> >> Reviewed-by: Paul Berry 
> >>
> >> I didn't spot anything that would explain the failure in unpackUnorm4x8
> >> tests.  I'll go have a look at your piglit tests now, and if I don't
> find
> >> anything there either, I'll fire up the simulator and see if I can see
> >> what's going wrong.
> >
> >
> > I found the problem.  On i965, floating point divisions are implemented
> as
> > multiplication by a reciprocal, whereas on the CPU there's a floating
> point
> > division instruction.  Therefore, unpackUnorm4x8's computation of "f /
> > 255.0" doesn't yield consistent results when run on the CPU vs the
> > GPU--there is a tiny difference due to the accumulation of floating point
> > rounding errors.
> >
> > That's why the "fs" and "vs" variants of the tests failed, and the
> "const"
> > variant passed--because Mesa does constant folding using the CPU's
> floating
> > point division instruction, which matches the Python test generator
> > perfectly, whereas the "fs" and "vs" variants use the actual GPU.
> >
> > It's only by dumb luck that this rounding error issue didn't bite us
> until
> > now, because in principle it could equally well have occurred in the
> > unpack2x16 functions.
> >
> > I believe we should relax the test to allow for these tiny rounding
> errors
> > (this is what the other test generators, such as
> > gen_builtin_uniform_tests.py, do).  As an experiment I modified
> > gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> > fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> > "distance(actual, expect${j}) < 0.1".  With this change, the test
> > passes.
> >
> > However, that change isn't good enough to commit to piglit, for two
> reasons:
> >
> > (1) It should only be applied when testing the functions whose definition
> > includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
> > unpackSnorm2x16).  A properly functioning implementation ought to be
> able to
> > get exact answers with all the other packing functions, and we should
> test
> > that it does.
> >
> > (2) IMHO we should test that ouput values of 0.0 and +/- 1.0 are produced
> > without error, since a shader author might conceivably write code that
> > relies on these values being exact.  That is, we should check that the
> > following conversions are exact, with no rounding error:
> >
> > unpackUnorm4x8(0) == vec4(0.0)
> > unpackUnorm4x8(0x) == vec4(1.0)
> > unpackSnorm4x8(0) == vec4(0.0)
> > unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
> > unpackSnorm4x8(0x80808080) == vec4(-1.0)
> > unpackSnorm4x8(0x81818181) == vec4(-1.0)
> > unpackUnorm2x16(0) == vec2(0.0)
> > unpackUnorm2x16(0x) == vec4(1.0)
> > unpackSnorm2x16(0) == vec4(0.0)
> > unpackSnorm2x16(0x7fff7fff) == vec4(1.0)
> > unpackSnorm2x16(0x80008000) == vec4(-1.0)
> > unpackSnorm2x16(0x80018001) == vec4(-1.0)
> >
> > My recommendation: address problem (1) by modifying the templates to
> accept
> > a new parameter that determines whether the test needs to be precise or
> > approximate (e.g. "func.precise").  Address problem (2) by hand-coding a
> few
> > shader_runner tests to check the cases above.  IMHO it would be ok to
> leave
> > the current patch as is (modulo my previous comments) and do a pair of
> > follow-on patches to address problems (1) and (2).
>
> Interesting. Thanks a lot for finding that and writing it up.
>
> Since div() is used in by both the Snorm and Unorm unpacking
> functions, any idea why it only adversely affects the results of
> Unorm? Multiplication by 1/255 yields lower precision than by 1/127?
>

After messing around with numpy for a while, it looks like 1/255 expressed
as a float32 happens to fall almost exactly between two representable
float32 values:

0.0039215683937072754 (representable float32)
0.0039215686274509803 (true value of 1/255)
0.0039215688593685627 (next representable float32)

So regardless of which way the rounding goes the relative error is
approximately 5.9e-8.

By luck, 1/127, 1/32767, and 1/65535 are all much closer to representable
float32's, with relative errors of 3.7e-9, 9.3e-10, and 2.2e-10
respectively.

So yeah, the relative error int

Re: [Mesa-dev] [PATCH 4/8] glsl: Evaluate constant pack/unpack 4x8 expressions

2013-01-25 Thread Matt Turner
On Fri, Jan 25, 2013 at 11:59 AM, Chad Versace
 wrote:
>>> +   *x = unpack_1x8((uint8_t) (u & 0xff));
>>> +   *y = unpack_1x8((uint8_t) (u >> 8));
>>> +   *z = unpack_1x8((uint8_t) (u >> 16));
>>> +   *w = unpack_1x8((uint8_t) (u >> 24));
>>> +}
>>
>> The bitmask (u & 0xff) confused me for a few moments, made me say "Why does 
>> Matt
>> need a bitmask there?". But, then I realized that I did the same for 
>> unpack_2x16,
>> and you likely just copied my pattern. Oh well. I'd prefer that unpack_2x16
>> and unpack_4x8 follow a similar visual pattern rather than clean that up now,
>> so I'm ok with that funny looking bitmask staying in this patch.

Hah, I wondered the same thing about your patch. :)

gcc, and I assume any other compiler we could possible care about,
knows the & 0xff is a nop.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Matt Turner
On Fri, Jan 25, 2013 at 9:55 AM, Paul Berry  wrote:
> On 25 January 2013 07:49, Paul Berry  wrote:
>>
>> On 24 January 2013 19:44, Matt Turner  wrote:
>>>
>>> Following this email are eight patches that add the 4x8 pack/unpack
>>> operations that are the difference between what GLSL ES 3.0 and
>>> ARB_shading_language_packing require.
>>>
>>> They require Chad's gles3-glsl-packing series and are available at
>>>
>>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>>>
>>> I've also added testing support on top of Chad's piglit patch. The
>>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
>>> spot why.
>>
>>
>> I had minor comments on patches 4/8 and 5/8.  The remainder is:
>>
>> Reviewed-by: Paul Berry 
>>
>> I didn't spot anything that would explain the failure in unpackUnorm4x8
>> tests.  I'll go have a look at your piglit tests now, and if I don't find
>> anything there either, I'll fire up the simulator and see if I can see
>> what's going wrong.
>
>
> I found the problem.  On i965, floating point divisions are implemented as
> multiplication by a reciprocal, whereas on the CPU there's a floating point
> division instruction.  Therefore, unpackUnorm4x8's computation of "f /
> 255.0" doesn't yield consistent results when run on the CPU vs the
> GPU--there is a tiny difference due to the accumulation of floating point
> rounding errors.
>
> That's why the "fs" and "vs" variants of the tests failed, and the "const"
> variant passed--because Mesa does constant folding using the CPU's floating
> point division instruction, which matches the Python test generator
> perfectly, whereas the "fs" and "vs" variants use the actual GPU.
>
> It's only by dumb luck that this rounding error issue didn't bite us until
> now, because in principle it could equally well have occurred in the
> unpack2x16 functions.
>
> I believe we should relax the test to allow for these tiny rounding errors
> (this is what the other test generators, such as
> gen_builtin_uniform_tests.py, do).  As an experiment I modified
> gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> "distance(actual, expect${j}) < 0.1".  With this change, the test
> passes.
>
> However, that change isn't good enough to commit to piglit, for two reasons:
>
> (1) It should only be applied when testing the functions whose definition
> includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
> unpackSnorm2x16).  A properly functioning implementation ought to be able to
> get exact answers with all the other packing functions, and we should test
> that it does.
>
> (2) IMHO we should test that ouput values of 0.0 and +/- 1.0 are produced
> without error, since a shader author might conceivably write code that
> relies on these values being exact.  That is, we should check that the
> following conversions are exact, with no rounding error:
>
> unpackUnorm4x8(0) == vec4(0.0)
> unpackUnorm4x8(0x) == vec4(1.0)
> unpackSnorm4x8(0) == vec4(0.0)
> unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
> unpackSnorm4x8(0x80808080) == vec4(-1.0)
> unpackSnorm4x8(0x81818181) == vec4(-1.0)
> unpackUnorm2x16(0) == vec2(0.0)
> unpackUnorm2x16(0x) == vec4(1.0)
> unpackSnorm2x16(0) == vec4(0.0)
> unpackSnorm2x16(0x7fff7fff) == vec4(1.0)
> unpackSnorm2x16(0x80008000) == vec4(-1.0)
> unpackSnorm2x16(0x80018001) == vec4(-1.0)
>
> My recommendation: address problem (1) by modifying the templates to accept
> a new parameter that determines whether the test needs to be precise or
> approximate (e.g. "func.precise").  Address problem (2) by hand-coding a few
> shader_runner tests to check the cases above.  IMHO it would be ok to leave
> the current patch as is (modulo my previous comments) and do a pair of
> follow-on patches to address problems (1) and (2).

Interesting. Thanks a lot for finding that and writing it up.

Since div() is used in by both the Snorm and Unorm unpacking
functions, any idea why it only adversely affects the results of
Unorm? Multiplication by 1/255 yields lower precision than by 1/127?

In investigating the Unorm unpacking failure I did notice that some
values worked (like 0.0, 1.0, and even 0.0078431377), so I don't
expect any problems with precision on the values you suggest.

I agree with your recommended solution. I'll push these patches today
for the 9.1 branch and do follow-on patches to piglit like you
suggest.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallivm: split sampler and texture state

2013-01-25 Thread Brian Paul

On 01/24/2013 07:48 PM, srol...@vmware.com wrote:

From: Roland Scheidegger

Split the sampler interface to use separate sampler and texture (sampler_view)
state. This is needed to support dx10-style sampling instructions.
This is not quite complete since both draw/llvmpipe don't really track
textures/samplers independently yet, as well as the gallivm code not quite
using the right sampler or texture index respectively (but it should work
for the sampling codes used by opengl).
We are however losing some optimizations in the process, apply_max_lod will
no longer work, and we potentially could end up with more (unnecessary)
recompiles (if switching textures with/without mipmaps only so it shouldn't
be too bad).

v2: don't use different callback structs for sampler/sampler view functions
(which just complicates things), fix up sampling code to actually use the
right texture or sampler index, and similar for llvmpipe/draw actually
distinguish between samplers and sampler views.



Looks good AFAICT.  Just a few minor comments (about comments) below.

Reviewed-by: Brian Paul 

Nice work!



---
  src/gallium/auxiliary/draw/draw_llvm.c|  129 +-
  src/gallium/auxiliary/draw/draw_llvm.h|   66 ++--
  src/gallium/auxiliary/draw/draw_llvm_sample.c |   88 --
  src/gallium/auxiliary/draw/draw_private.h |2 +-
  src/gallium/auxiliary/gallivm/lp_bld_sample.c |  108 +++-
  src/gallium/auxiliary/gallivm/lp_bld_sample.h |   66 ++--
  src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |  104 ++--
  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  187 -
  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h   |3 +-
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |6 +-
  src/gallium/drivers/llvmpipe/lp_jit.c |   54 --
  src/gallium/drivers/llvmpipe/lp_jit.h |   24 ++-
  src/gallium/drivers/llvmpipe/lp_setup.c   |   12 +-
  src/gallium/drivers/llvmpipe/lp_state_fs.c|   84 ++---
  src/gallium/drivers/llvmpipe/lp_state_fs.h|   17 +-
  src/gallium/drivers/llvmpipe/lp_state_setup.c |   16 +-
  src/gallium/drivers/llvmpipe/lp_tex_sample.c  |  102 ---
  17 files changed, 711 insertions(+), 357 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index a3a3bbf..9e5ff1c 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -85,11 +85,6 @@ create_jit_texture_type(struct gallivm_state *gallivm, const 
char *struct_name)
 elem_types[DRAW_JIT_TEXTURE_IMG_STRIDE] =
 elem_types[DRAW_JIT_TEXTURE_MIP_OFFSETS] =
LLVMArrayType(int32_type, PIPE_MAX_TEXTURE_LEVELS);
-   elem_types[DRAW_JIT_TEXTURE_MIN_LOD] =
-   elem_types[DRAW_JIT_TEXTURE_MAX_LOD] =
-   elem_types[DRAW_JIT_TEXTURE_LOD_BIAS] = 
LLVMFloatTypeInContext(gallivm->context);
-   elem_types[DRAW_JIT_TEXTURE_BORDER_COLOR] =
-  LLVMArrayType(LLVMFloatTypeInContext(gallivm->context), 4);

 texture_type = LLVMStructTypeInContext(gallivm->context, elem_types,
Elements(elem_types), 0);
@@ -130,18 +125,6 @@ create_jit_texture_type(struct gallivm_state *gallivm, 
const char *struct_name)
 LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, mip_offsets,
target, texture_type,
DRAW_JIT_TEXTURE_MIP_OFFSETS);
-   LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, min_lod,
-  target, texture_type,
-  DRAW_JIT_TEXTURE_MIN_LOD);
-   LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, max_lod,
-  target, texture_type,
-  DRAW_JIT_TEXTURE_MAX_LOD);
-   LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, lod_bias,
-  target, texture_type,
-  DRAW_JIT_TEXTURE_LOD_BIAS);
-   LP_CHECK_MEMBER_OFFSET(struct draw_jit_texture, border_color,
-  target, texture_type,
-  DRAW_JIT_TEXTURE_BORDER_COLOR);

 LP_CHECK_STRUCT_SIZE(struct draw_jit_texture, target, texture_type);

@@ -150,15 +133,63 @@ create_jit_texture_type(struct gallivm_state *gallivm, 
const char *struct_name)


  /**
+ * Create LLVM type for struct draw_jit_sampler
+ */
+static LLVMTypeRef
+create_jit_sampler_type(struct gallivm_state *gallivm, const char *struct_name)
+{
+   LLVMTargetDataRef target = gallivm->target;
+   LLVMTypeRef sampler_type;
+   LLVMTypeRef elem_types[DRAW_JIT_SAMPLER_NUM_FIELDS];
+
+   elem_types[DRAW_JIT_SAMPLER_MIN_LOD] =
+   elem_types[DRAW_JIT_SAMPLER_MAX_LOD] =
+   elem_types[DRAW_JIT_SAMPLER_LOD_BIAS] = 
LLVMFloatTypeInContext(gallivm->context);
+   elem_types[DRAW_JIT_SAMPLER_BORDER_COLOR] =
+  LLVMArrayType(LLVMFloatTypeInContext(gallivm->context), 4);
+
+   sampler_type = LLVMStructTypeInContext(gallivm->context, elem_ty

Re: [Mesa-dev] [PATCH] glx: only advertise GLX_INTEL_swap_event if it's supported

2013-01-25 Thread Brian Paul

On 01/24/2013 06:59 PM, Zack Rusin wrote:

Only drivers supporting DRI2 version>=4 support GLX_INTEL_swap_event.
So lets mark it as such otherwise applications which use this extension
(i.e. everything based on Clutter, e.g. gnome-shell) break horribly on
drivers supporting DRI2 versions only up to 3.

Note: This is a candidate for the 9.0 branch.

Signed-off-by: Zack Rusin
---
  src/glx/dri2_glx.c |5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 1b3cf2b..a51716f 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1062,8 +1062,9 @@ dri2BindExtensions(struct dri2_screen *psc, const 
__DRIextension **extensions)
 __glXEnableDirectExtension(&psc->base, "GLX_MESA_swap_control");
 __glXEnableDirectExtension(&psc->base, "GLX_SGI_make_current_read");

-   /* FIXME: if DRI2 version supports it... */
-   __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
+   if (psc->dri2->base.version>= 4) {
+  __glXEnableDirectExtension(&psc->base, "GLX_INTEL_swap_event");
+   }

 if (psc->dri2->base.version>= 3) {
const unsigned mask = psc->dri2->getAPIMask(psc->driScreen);


Other people are more familiar with this than me, but
Reviewed-by: Brian Paul 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59835] ir_constant_expression.cpp:156: undefined reference to `_mesa_round_to_even'

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59835

--- Comment #2 from Chad Versace  ---
Sorry about that. Next time I change the Android and Autotools system, I'll
remember to change Scons too.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Chad Versace
On 01/25/2013 09:55 AM, Paul Berry wrote:
> On 25 January 2013 07:49, Paul Berry  wrote:
> 
>> On 24 January 2013 19:44, Matt Turner  wrote:
>>
>>> Following this email are eight patches that add the 4x8 pack/unpack
>>> operations that are the difference between what GLSL ES 3.0 and
>>> ARB_shading_language_packing require.
>>>
>>> They require Chad's gles3-glsl-packing series and are available at
>>>
>>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>>>
>>> I've also added testing support on top of Chad's piglit patch. The
>>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
>>> spot why.
>>>
>>
>> I had minor comments on patches 4/8 and 5/8.  The remainder is:
>>
>> Reviewed-by: Paul Berry 
>>
>> I didn't spot anything that would explain the failure in unpackUnorm4x8
>> tests.  I'll go have a look at your piglit tests now, and if I don't find
>> anything there either, I'll fire up the simulator and see if I can see
>> what's going wrong.
>>
> 
> I found the problem.  On i965, floating point divisions are implemented as
> multiplication by a reciprocal, whereas on the CPU there's a floating point
> division instruction.  Therefore, unpackUnorm4x8's computation of "f /
> 255.0" doesn't yield consistent results when run on the CPU vs the
> GPU--there is a tiny difference due to the accumulation of floating point
> rounding errors.
> 
> That's why the "fs" and "vs" variants of the tests failed, and the "const"
> variant passed--because Mesa does constant folding using the CPU's floating
> point division instruction, which matches the Python test generator
> perfectly, whereas the "fs" and "vs" variants use the actual GPU.
> 
> It's only by dumb luck that this rounding error issue didn't bite us until
> now, because in principle it could equally well have occurred in the
> unpack2x16 functions.
> 
> I believe we should relax the test to allow for these tiny rounding errors
> (this is what the other test generators, such as
> gen_builtin_uniform_tests.py, do).  As an experiment I modified
> gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
> fs_unpack_2x16_template, "actual == expect${j}" is replaced with
> "distance(actual, expect${j}) < 0.1".  With this change, the test
> passes.
> 
> However, that change isn't good enough to commit to piglit, for two reasons:
> 
> (1) It should only be applied when testing the functions whose definition
> includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
> unpackSnorm2x16).  A properly functioning implementation ought to be able
> to get exact answers with all the other packing functions, and we should
> test that it does.
> 
> (2) IMHO we should test that ouput values of 0.0 and +/- 1.0 are produced
> without error, since a shader author might conceivably write code that
> relies on these values being exact.  That is, we should check that the
> following conversions are exact, with no rounding error:
> 
> unpackUnorm4x8(0) == vec4(0.0)
> unpackUnorm4x8(0x) == vec4(1.0)
> unpackSnorm4x8(0) == vec4(0.0)
> unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
> unpackSnorm4x8(0x80808080) == vec4(-1.0)
> unpackSnorm4x8(0x81818181) == vec4(-1.0)
> unpackUnorm2x16(0) == vec2(0.0)
> unpackUnorm2x16(0x) == vec4(1.0)
> unpackSnorm2x16(0) == vec4(0.0)
> unpackSnorm2x16(0x7fff7fff) == vec4(1.0)
> unpackSnorm2x16(0x80008000) == vec4(-1.0)
> unpackSnorm2x16(0x80018001) == vec4(-1.0)
> 
> My recommendation: address problem (1) by modifying the templates to accept
> a new parameter that determines whether the test needs to be precise or
> approximate (e.g. "func.precise").  Address problem (2) by hand-coding a
> few shader_runner tests to check the cases above.  IMHO it would be ok to
> leave the current patch as is (modulo my previous comments) and do a pair
> of follow-on patches to address problems (1) and (2).
> 
> Chad, do you have any thoughts on this subject, since you're the original
> author of this test generator?

I don't like the kludge of having a separate shader_test for exact values.
But, I've thought hard on what modifications to the python script would
be needed to solve the problem solely within the script and its generated
shaders, and I like that solution even less.

So, Paul, I think we should go forward with your proposed solution.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Chad Versace
On 01/24/2013 07:44 PM, Matt Turner wrote:
> Following this email are eight patches that add the 4x8 pack/unpack
> operations that are the difference between what GLSL ES 3.0 and
> ARB_shading_language_packing require.
> 
> They require Chad's gles3-glsl-packing series and are available at
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
> 
> I've also added testing support on top of Chad's piglit patch. The
> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> spot why.

By the way, my Mesa series is committed to the master and gles3 branches.
My Piglit patch is on master too.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] glsl: Add infrastructure for ARB_shading_language_packing

2013-01-25 Thread Chad Versace
Patches 1-3, 6-8 are
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/8] glsl: Evaluate constant pack/unpack 4x8 expressions

2013-01-25 Thread Chad Versace
On 01/25/2013 11:38 AM, Chad Versace wrote:
> On 01/24/2013 07:47 PM, Matt Turner wrote:
>> That is, evaluate constant expressions for the following functions:
>>   packSnorm4x8, unpackSnorm4x8
>>   packUnorm4x8, unpackUnorm4x8
>> ---
>>  src/glsl/ir_constant_expression.cpp |  162 
>> +++
>>  1 files changed, 162 insertions(+), 0 deletions(-)
>>
>> diff --git a/src/glsl/ir_constant_expression.cpp 
>> b/src/glsl/ir_constant_expression.cpp
>> index b34c6e8..4796f6f 100644
>> --- a/src/glsl/ir_constant_expression.cpp
>> +++ b/src/glsl/ir_constant_expression.cpp
>> @@ -76,12 +76,24 @@ bitcast_f2u(float f)
>>  }
>>  
>>  /**
>> + * Evaluate one component of a floating-point 4x8 unpacking function.
>> + */
>> +typedef uint8_t
>> +(*pack_1x8_func_t)(float);
>> +
>> +/**
>>   * Evaluate one component of a floating-point 2x16 unpacking function.
>>   */
>>  typedef uint16_t
>>  (*pack_1x16_func_t)(float);
>>  
>>  /**
>> + * Evaluate one component of a floating-point 4x8 unpacking function.
>> + */
>> +typedef float
>> +(*unpack_1x8_func_t)(uint8_t);
>> +
>> +/**
>>   * Evaluate one component of a floating-point 2x16 unpacking function.
>>   */
>>  typedef float
>> @@ -112,6 +124,32 @@ pack_2x16(pack_1x16_func_t pack_1x16,
>>  }
>>  
>>  /**
>> + * Evaluate a 4x8 floating-point packing function.
>> + */
>> +static uint32_t
>> +pack_4x8(pack_1x8_func_t pack_1x8,
>> + float x, float y, float z, float w)
>> +{
>> +   /* From section 8.4 of the GLSL 4.30 spec:
>> +*
>> +*packSnorm4x8
>> +*
>> +*The first component of the vector will be written to the least
>> +*significant bits of the output; the last component will be written 
>> to
>> +*the most significant bits.
>> +*
>> +* The specifications for the other packing functions contain similar
>> +* language.
>> +*/
>> +   uint32_t u = 0;
>> +   u |= ((uint32_t) pack_1x8(x) << 0);
>> +   u |= ((uint32_t) pack_1x8(y) << 8);
>> +   u |= ((uint32_t) pack_1x8(z) << 16);
>> +   u |= ((uint32_t) pack_1x8(w) << 24);
>> +   return u;
>> +}
>> +
>> +/**
>>   * Evaluate a 2x16 floating-point unpacking function.
>>   */
>>  static void
>> @@ -135,6 +173,48 @@ unpack_2x16(unpack_1x16_func_t unpack_1x16,
>>  }
>>  
>>  /**
>> + * Evaluate a 4x8 floating-point unpacking function.
>> + */
>> +static void
>> +unpack_4x8(unpack_1x8_func_t unpack_1x8, uint32_t u,
>> +   float *x, float *y, float *z, float *w)
>> +{
>> +/* From section 8.4 of the GLSL 4.30 spec:
>> + *
>> + *unpackSnorm4x8
>> + *--
>> + *The first component of the returned vector will be extracted from
>> + *the least significant bits of the input; the last component will 
>> be
>> + *extracted from the most significant bits.
>> + *
>> + * The specifications for the other unpacking functions contain similar
>> + * language.
>> + */
>> +   *x = unpack_1x8((uint8_t) (u & 0xff));
>> +   *y = unpack_1x8((uint8_t) (u >> 8));
>> +   *z = unpack_1x8((uint8_t) (u >> 16));
>> +   *w = unpack_1x8((uint8_t) (u >> 24));
>> +}
> 
> The bitmask (u & 0xff) confused me for a few moments, made me say "Why does 
> Matt
> need a bitmask there?". But, then I realized that I did the same for 
> unpack_2x16,
> and you likely just copied my pattern. Oh well. I'd prefer that unpack_2x16
> and unpack_4x8 follow a similar visual pattern rather than clean that up now,
> so I'm ok with that funny looking bitmask staying in this patch.
> 
>> +
>> +/**
>> + * Evaluate one component of packSnorm4x8.
>> + */
>> +static uint8_t
>> +pack_snorm_1x8(float x)
>> +{
>> +/* From section 8.4 of the GLSL 4.30 spec:
>> + *
>> + *packSnorm4x8
>> + *
>> + *The conversion for component c of v to fixed point is done as
>> + *follows:
>> + *
>> + *  packSnorm4x8: round(clamp(c, -1, +1) * 127.0)
>> + */
>> +   return (uint8_t) _mesa_round_to_even(CLAMP(x, -1.0f, +1.0f) * 127.0f);
>> +}
> 
> Conversion from a negative float to a uint, so an intermediate conversion to
> int8_t is needed here. Like Paul said. With that change, this is
> Reviewed-by: Chad Versace 
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/8] glsl: Add support for lowering 4x8 pack/unpack operations

2013-01-25 Thread Chad Versace
On 01/25/2013 11:59 AM, Chad Versace wrote:
> On 01/24/2013 07:47 PM, Matt Turner wrote:
>> Lower them to arithmetic and bit manipulation expressions.
>> ---
>>  src/glsl/ir_optimization.h  |6 +
>>  src/glsl/lower_packing_builtins.cpp |  279 
>> +++
>>  2 files changed, 285 insertions(+), 0 deletions(-)
>>
>> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
>> index ac90b87..8f33018 100644
>> --- a/src/glsl/ir_optimization.h
>> +++ b/src/glsl/ir_optimization.h
>> @@ -54,6 +54,12 @@ enum lower_packing_builtins_op {
>>  
>> LOWER_PACK_HALF_2x16_TO_SPLIT= 0x0040,
>> LOWER_UNPACK_HALF_2x16_TO_SPLIT  = 0x0080,
>> +
>> +   LOWER_PACK_SNORM_4x8 = 0x0100,
>> +   LOWER_UNPACK_SNORM_4x8   = 0x0200,
>> +
>> +   LOWER_PACK_UNORM_4x8 = 0x0400,
>> +   LOWER_UNPACK_UNORM_4x8   = 0x0800,
>>  };
>>  
>>  bool do_common_optimization(exec_list *ir, bool linked,
>> diff --git a/src/glsl/lower_packing_builtins.cpp 
>> b/src/glsl/lower_packing_builtins.cpp
>> index 49176cc..aa6765f 100644
>> --- a/src/glsl/lower_packing_builtins.cpp
>> +++ b/src/glsl/lower_packing_builtins.cpp
>> @@ -85,9 +85,15 @@ public:
>>case LOWER_PACK_SNORM_2x16:
>>   *rvalue = lower_pack_snorm_2x16(op0);
>>   break;
>> +  case LOWER_PACK_SNORM_4x8:
>> + *rvalue = lower_pack_snorm_4x8(op0);
>> + break;
>>case LOWER_PACK_UNORM_2x16:
>>   *rvalue = lower_pack_unorm_2x16(op0);
>>   break;
>> +  case LOWER_PACK_UNORM_4x8:
>> + *rvalue = lower_pack_unorm_4x8(op0);
>> + break;
>>case LOWER_PACK_HALF_2x16:
>>   *rvalue = lower_pack_half_2x16(op0);
>>   break;
>> @@ -97,9 +103,15 @@ public:
>>case LOWER_UNPACK_SNORM_2x16:
>>   *rvalue = lower_unpack_snorm_2x16(op0);
>>   break;
>> +  case LOWER_UNPACK_SNORM_4x8:
>> + *rvalue = lower_unpack_snorm_4x8(op0);
>> + break;
>>case LOWER_UNPACK_UNORM_2x16:
>>   *rvalue = lower_unpack_unorm_2x16(op0);
>>   break;
>> +  case LOWER_UNPACK_UNORM_4x8:
>> + *rvalue = lower_unpack_unorm_4x8(op0);
>> + break;
>>case LOWER_UNPACK_HALF_2x16:
>>   *rvalue = lower_unpack_half_2x16(op0);
>>   break;
>> @@ -137,18 +149,30 @@ private:
>>case ir_unop_pack_snorm_2x16:
>>   result = op_mask & LOWER_PACK_SNORM_2x16;
>>   break;
>> +  case ir_unop_pack_snorm_4x8:
>> + result = op_mask & LOWER_PACK_SNORM_4x8;
>> + break;
>>case ir_unop_pack_unorm_2x16:
>>   result = op_mask & LOWER_PACK_UNORM_2x16;
>>   break;
>> +  case ir_unop_pack_unorm_4x8:
>> + result = op_mask & LOWER_PACK_UNORM_4x8;
>> + break;
>>case ir_unop_pack_half_2x16:
>>   result = op_mask & (LOWER_PACK_HALF_2x16 | 
>> LOWER_PACK_HALF_2x16_TO_SPLIT);
>>   break;
>>case ir_unop_unpack_snorm_2x16:
>>   result = op_mask & LOWER_UNPACK_SNORM_2x16;
>>   break;
>> +  case ir_unop_unpack_snorm_4x8:
>> + result = op_mask & LOWER_UNPACK_SNORM_4x8;
>> + break;
>>case ir_unop_unpack_unorm_2x16:
>>   result = op_mask & LOWER_UNPACK_UNORM_2x16;
>>   break;
>> +  case ir_unop_unpack_unorm_4x8:
>> + result = op_mask & LOWER_UNPACK_UNORM_4x8;
>> + break;
>>case ir_unop_unpack_half_2x16:
>>   result = op_mask & (LOWER_UNPACK_HALF_2x16 | 
>> LOWER_UNPACK_HALF_2x16_TO_SPLIT);
>>   break;
>> @@ -214,6 +238,30 @@ private:
>> }
>>  
>> /**
>> +* \brief Pack four uint8's into a single uint32.
>> +*
>> +* Interpret the given uvec4 as a uint32 quad. Pack the quad into a 
>> uint32
>> +* where the least significant bits specify the first element of the 
>> quad.
>> +* Return the uint32.
>> +*/
> 
> I find the term "uint32 quad" confusing. It is too reminiscient of "quadword".
> This not-so-bright reviewer thought: "A uint32 quadword? Huh? Oh! That means
> a uint32 4-tuple". So, I'd like to see the phrase changed to "uint32 4-tuple"
> or something similar, but this suggestion doesn't block the patch.
> 
>> +   ir_rvalue*
>> +   pack_uvec4_to_uint(ir_rvalue *uvec4_rval)
>> +   {
>> +  assert(uvec4_rval->type == glsl_type::uvec4_type);
>> +
>> +  /* uvec4 u = UVEC4_RVAL; */
>> +  ir_variable *u = factory.make_temp(glsl_type::uvec4_type,
>> +  "tmp_pack_uvec4_to_uint");
>> +  factory.emit(assign(u, uvec4_rval));
>> +
>> +  /* return ((u.w 0xff) << 24) | ((u.z & 0xff) << 16) | ((u.y & 0xff) 
>> << 8) | (u.x & 0xff); */
> ^^^ missing &
>> +  return bit_or(bit_or(lshift(bit_and(swizzle_w(u), constant(0xffu)), 
>> constant(24u)),
>> +   lshift(bit_and(swizzle_z(u), constant(0xffu)), 
>> consta

Re: [Mesa-dev] [PATCH 24/32] glsl: Make the align function available elsewhere in the linker

2013-01-25 Thread Kenneth Graunke

On 01/25/2013 05:43 AM, Ian Romanick wrote:

On 01/24/2013 08:40 PM, Kenneth Graunke wrote:

On 01/22/2013 12:52 AM, Ian Romanick wrote:

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
  src/glsl/glsl_types.cpp  | 12 +++-
  src/glsl/glsl_types.h|  6 ++
  src/glsl/link_uniforms.cpp   | 14 --
  src/glsl/lower_ubo_reference.cpp | 19 +++
  4 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
index 0075550..ddd0148 100644
--- a/src/glsl/glsl_types.cpp
+++ b/src/glsl/glsl_types.cpp
@@ -863,12 +863,6 @@ glsl_type::std140_base_alignment(bool row_major)
const
 return -1;
  }

-static unsigned
-align(unsigned val, unsigned align)
-{
-   return (val + align - 1) / align * align;
-}
-


Why not just eliminate this function altogether and use ALIGN() from
macros.h?  (The implementation is slightly different, but I think it
should work.)


I thought about that.  The ALIGN macro only works when align is a power
of two, and it wasn't obvious to me that all the uses of this function
met that requirement.  I did this refactor right before sending this
series out, and it felt a little like the 11th hour to do something that
could have a functional change.

I'd prefer to revisit this after the release.


Sounds like a good plan.

--Ken

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] configure.ac: Don't set LLVM_LIBS when llvm is disabled

2013-01-25 Thread Tom Stellard
From: Tom Stellard 

---
 configure.ac | 35 +++
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/configure.ac b/configure.ac
index ccf95c5..9cc5c4a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1898,21 +1898,23 @@ dnl by calling llvm-config --libs 
${DRIVER_LLVM_COMPONENTS}, but
 dnl this was causing the same libraries to be appear multiple times
 dnl in LLVM_LIBS.
 
-LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
+if test "x$MESA_LLVM" != x0; then
 
-if test "x$with_llvm_shared_libs" = xyes; then
-dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
-LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
-AC_CHECK_FILE("$LLVM_LIBDIR/lib$LLVM_SO_NAME.so", llvm_have_one_so=yes,)
+LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
 
-if test "x$llvm_have_one_so" = xyes; then
-dnl LLVM was built using auto*, so there is only one shared object.
-LLVM_LIBS="-l$LLVM_SO_NAME"
-else
-dnl If LLVM was built with CMake, there will be one shared object per
-dnl component.
-AC_CHECK_FILE("$LLVM_LIBDIR/libLLVMTarget.so",,
-AC_MSG_ERROR([Could not find llvm shared libraries:
+if test "x$with_llvm_shared_libs" = xyes; then
+dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
+LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
+AC_CHECK_FILE("$LLVM_LIBDIR/lib$LLVM_SO_NAME.so", 
llvm_have_one_so=yes,)
+
+if test "x$llvm_have_one_so" = xyes; then
+dnl LLVM was built using auto*, so there is only one shared object.
+LLVM_LIBS="-l$LLVM_SO_NAME"
+else
+dnl If LLVM was built with CMake, there will be one shared object 
per
+dnl component.
+AC_CHECK_FILE("$LLVM_LIBDIR/libLLVMTarget.so",,
+AC_MSG_ERROR([Could not find llvm shared libraries:
Please make sure you have built llvm with the --enable-shared option
and that your llvm libraries are installed in $LLVM_LIBDIR
If you have installed your llvm libraries to a different directory you
@@ -1925,9 +1927,10 @@ if test "x$with_llvm_shared_libs" = xyes; then
use llvm static libraries then remove these options from your configure
invocation and reconfigure.]))
 
-   dnl We don't need to update LLVM_LIBS in this case because the LLVM
-   dnl install uses a shared object for each compoenent and we have
-   dnl already added all of these objects to LLVM_LIBS. 
+   dnl We don't need to update LLVM_LIBS in this case because the LLVM
+   dnl install uses a shared object for each compoenent and we have
+   dnl already added all of these objects to LLVM_LIBS.
+fi
 fi
 fi
 
-- 
1.7.11.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Paul Berry
On 25 January 2013 07:49, Paul Berry  wrote:

> On 24 January 2013 19:44, Matt Turner  wrote:
>
>> Following this email are eight patches that add the 4x8 pack/unpack
>> operations that are the difference between what GLSL ES 3.0 and
>> ARB_shading_language_packing require.
>>
>> They require Chad's gles3-glsl-packing series and are available at
>>
>> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>>
>> I've also added testing support on top of Chad's piglit patch. The
>> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
>> spot why.
>>
>
> I had minor comments on patches 4/8 and 5/8.  The remainder is:
>
> Reviewed-by: Paul Berry 
>
> I didn't spot anything that would explain the failure in unpackUnorm4x8
> tests.  I'll go have a look at your piglit tests now, and if I don't find
> anything there either, I'll fire up the simulator and see if I can see
> what's going wrong.
>

I found the problem.  On i965, floating point divisions are implemented as
multiplication by a reciprocal, whereas on the CPU there's a floating point
division instruction.  Therefore, unpackUnorm4x8's computation of "f /
255.0" doesn't yield consistent results when run on the CPU vs the
GPU--there is a tiny difference due to the accumulation of floating point
rounding errors.

That's why the "fs" and "vs" variants of the tests failed, and the "const"
variant passed--because Mesa does constant folding using the CPU's floating
point division instruction, which matches the Python test generator
perfectly, whereas the "fs" and "vs" variants use the actual GPU.

It's only by dumb luck that this rounding error issue didn't bite us until
now, because in principle it could equally well have occurred in the
unpack2x16 functions.

I believe we should relax the test to allow for these tiny rounding errors
(this is what the other test generators, such as
gen_builtin_uniform_tests.py, do).  As an experiment I modified
gen_builtin_packing_tests.py so that in vs_unpack_2x16_template and
fs_unpack_2x16_template, "actual == expect${j}" is replaced with
"distance(actual, expect${j}) < 0.1".  With this change, the test
passes.

However, that change isn't good enough to commit to piglit, for two reasons:

(1) It should only be applied when testing the functions whose definition
includes a division (unpackUnorm4x8, unpackSnorm4x8, unpackUnorm2x16, and
unpackSnorm2x16).  A properly functioning implementation ought to be able
to get exact answers with all the other packing functions, and we should
test that it does.

(2) IMHO we should test that ouput values of 0.0 and +/- 1.0 are produced
without error, since a shader author might conceivably write code that
relies on these values being exact.  That is, we should check that the
following conversions are exact, with no rounding error:

unpackUnorm4x8(0) == vec4(0.0)
unpackUnorm4x8(0x) == vec4(1.0)
unpackSnorm4x8(0) == vec4(0.0)
unpackSnorm4x8(0x7f7f7f7f) == vec4(1.0)
unpackSnorm4x8(0x80808080) == vec4(-1.0)
unpackSnorm4x8(0x81818181) == vec4(-1.0)
unpackUnorm2x16(0) == vec2(0.0)
unpackUnorm2x16(0x) == vec4(1.0)
unpackSnorm2x16(0) == vec4(0.0)
unpackSnorm2x16(0x7fff7fff) == vec4(1.0)
unpackSnorm2x16(0x80008000) == vec4(-1.0)
unpackSnorm2x16(0x80018001) == vec4(-1.0)

My recommendation: address problem (1) by modifying the templates to accept
a new parameter that determines whether the test needs to be precise or
approximate (e.g. "func.precise").  Address problem (2) by hand-coding a
few shader_runner tests to check the cases above.  IMHO it would be ok to
leave the current patch as is (modulo my previous comments) and do a pair
of follow-on patches to address problems (1) and (2).

Chad, do you have any thoughts on this subject, since you're the original
author of this test generator?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] r600g: add async for staging buffer upload

2013-01-25 Thread j . glisse
From: Jerome Glisse 

Signed-off-by: Jerome Glisse 
---
 src/gallium/drivers/r600/evergreen_hw_context.c |  44 ++
 src/gallium/drivers/r600/evergreen_state.c  | 197 
 src/gallium/drivers/r600/evergreend.h   |  15 ++
 src/gallium/drivers/r600/r600.h |  27 
 src/gallium/drivers/r600/r600_buffer.c  |  25 ++-
 src/gallium/drivers/r600/r600_hw_context.c  |  48 +-
 src/gallium/drivers/r600/r600_pipe.c|   6 +-
 src/gallium/drivers/r600/r600_pipe.h|   9 ++
 src/gallium/drivers/r600/r600_state.c   | 190 +++
 src/gallium/drivers/r600/r600_state_common.c|   6 +-
 src/gallium/drivers/r600/r600_texture.c |  24 ++-
 src/gallium/drivers/r600/r600d.h|  15 ++
 12 files changed, 589 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_hw_context.c 
b/src/gallium/drivers/r600/evergreen_hw_context.c
index fa90c9a..1c30404 100644
--- a/src/gallium/drivers/r600/evergreen_hw_context.c
+++ b/src/gallium/drivers/r600/evergreen_hw_context.c
@@ -26,6 +26,7 @@
 #include "r600_hw_context_priv.h"
 #include "evergreend.h"
 #include "util/u_memory.h"
+#include "util/u_math.h"
 
 static const struct r600_reg cayman_config_reg_list[] = {
{R_009100_SPI_CONFIG_CNTL, REG_FLAG_ENABLE_ALWAYS | 
REG_FLAG_FLUSH_CHANGE, 0},
@@ -238,3 +239,46 @@ void evergreen_set_streamout_enable(struct r600_context 
*ctx, unsigned buffer_en
r600_write_context_reg(cs, R_028B94_VGT_STRMOUT_CONFIG, 
S_028B94_STREAMOUT_0_EN(0));
}
 }
+
+void evergreen_dma_copy(struct r600_context *rctx,
+   struct pipe_resource *dst,
+   struct pipe_resource *src,
+   unsigned long dst_offset,
+   unsigned long src_offset,
+   unsigned long size)
+{
+   struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
+   unsigned i, ncopy, csize, sub_cmd, shift;
+   struct r600_resource *rdst = (struct r600_resource*)dst;
+   struct r600_resource *rsrc = (struct r600_resource*)src;
+
+   /* make sure that the dma ring is only one active */
+   rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC);
+
+   /* see if we use dword or byte copy */
+   if (!(dst_offset & 0x3) && !(src_offset & 0x3) && !(size & 0x3)) {
+   size >>= 2;
+   sub_cmd = 0x00;
+   shift = 2;
+   } else {
+   sub_cmd = 0x40;
+   shift = 0;
+   }
+   ncopy = (size / 0x000f) + !!(size % 0x000f);
+
+   r600_need_dma_space(rctx, ncopy * 5);
+   for (i = 0; i < ncopy; i++) {
+   csize = size < 0x000f ? size : 0x000f;
+   /* emit reloc before writting cs so that cs is always in 
consistent state */
+   r600_context_bo_reloc(rctx, &rctx->rings.dma, rsrc, 
RADEON_USAGE_READ);
+   r600_context_bo_reloc(rctx, &rctx->rings.dma, rdst, 
RADEON_USAGE_WRITE);
+   cs->buf[cs->cdw++] = DMA_PACKET(DMA_PACKET_COPY, sub_cmd, 
csize);
+   cs->buf[cs->cdw++] = dst_offset & 0x;
+   cs->buf[cs->cdw++] = src_offset & 0x;
+   cs->buf[cs->cdw++] = (dst_offset >> 32UL) & 0xff;
+   cs->buf[cs->cdw++] = (src_offset >> 32UL) & 0xff;
+   dst_offset += csize << shift;
+   src_offset += csize << shift;
+   size -= csize;
+   }
+}
diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 86e2c81..f0511d8 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -30,6 +30,20 @@
 #include "util/u_framebuffer.h"
 #include "util/u_dual_blend.h"
 #include "evergreen_compute.h"
+#include "util/u_math.h"
+
+static INLINE unsigned evergreen_array_mode(unsigned mode)
+{
+   switch (mode) {
+   case RADEON_SURF_MODE_LINEAR_ALIGNED:   return 
V_028C70_ARRAY_LINEAR_ALIGNED;
+   break;
+   case RADEON_SURF_MODE_1D:   return 
V_028C70_ARRAY_1D_TILED_THIN1;
+   break;
+   case RADEON_SURF_MODE_2D:   return 
V_028C70_ARRAY_2D_TILED_THIN1;
+   default:
+   case RADEON_SURF_MODE_LINEAR:   return 
V_028C70_ARRAY_LINEAR_GENERAL;
+   }
+}
 
 static uint32_t eg_num_banks(uint32_t nbanks)
 {
@@ -3445,3 +3459,186 @@ void evergreen_update_db_shader_control(struct 
r600_context * rctx)
rctx->db_misc_state.atom.dirty = true;
}
 }
+
+static void evergreen_dma_copy_tile(struct r600_context *rctx,
+   struct pipe_resource *dst,
+   unsigned dst_level,
+   unsigned dst_x,
+   unsigned dst_y,
+   unsigned dst_z,
+   struct pipe_resource *src,
+   un

[Mesa-dev] [PATCH 1/4] radeon/winsys: add dma ring support to winsys v3

2013-01-25 Thread j . glisse
From: Jerome Glisse 

Add ring support, you can create a cs for each ring. DMA ring is
bit special regarding relocation as you must emit as much relocation
as there is use of the buffer.

v2: - Improved comment on relocation changes
- Use a single thread to queue cs submittion this simplify driver
  code while not impacting performances. Rational for this is that
  you have to wait for all previous submission to have completed
  so there was never a case while we could have 2 different thread
  submitting a command stream at the same time. This code just
  consolidate submission into one single thread per winsys.
v3: - Do not use semaphore for empty queue signaling, instead use
  cond var. This is because it's tricky to maintain an even number
  of call to semaphore wait and semaphore signal (the number of
  cs in the stack would for instance make that number vary).

Signed-off-by: Jerome Glisse 
---
 src/gallium/drivers/r300/r300_context.c   |   2 +-
 src/gallium/drivers/r600/r600_pipe.c  |   2 +-
 src/gallium/drivers/radeonsi/radeonsi_pipe.c  |   2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c |   2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 160 --
 src/gallium/winsys/radeon/drm/radeon_drm_cs.h |   8 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c |  87 
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h |  17 +++
 src/gallium/winsys/radeon/drm/radeon_winsys.h |  20 ++-
 9 files changed, 218 insertions(+), 82 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_context.c 
b/src/gallium/drivers/r300/r300_context.c
index d8af13f..340a7f0 100644
--- a/src/gallium/drivers/r300/r300_context.c
+++ b/src/gallium/drivers/r300/r300_context.c
@@ -379,7 +379,7 @@ struct pipe_context* r300_create_context(struct 
pipe_screen* screen,
  sizeof(struct pipe_transfer), 64,
  UTIL_SLAB_SINGLETHREADED);
 
-r300->cs = rws->cs_create(rws);
+r300->cs = rws->cs_create(rws, RING_GFX);
 if (r300->cs == NULL)
 goto fail;
 
diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index fda5074..e4a35cf 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -289,7 +289,7 @@ static struct pipe_context *r600_create_context(struct 
pipe_screen *screen, void
goto fail;
}
 
-   rctx->cs = rctx->ws->cs_create(rctx->ws);
+   rctx->cs = rctx->ws->cs_create(rctx->ws, RING_GFX);
rctx->ws->cs_set_flush_callback(rctx->cs, r600_flush_from_winsys, rctx);
 
rctx->uploader = u_upload_create(&rctx->context, 1024 * 1024, 256,
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
index cbb3bc4..5792fe2 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
@@ -222,7 +222,7 @@ static struct pipe_context *r600_create_context(struct 
pipe_screen *screen, void
case TAHITI:
si_init_state_functions(rctx);
LIST_INITHEAD(&rctx->active_query_list);
-   rctx->cs = rctx->ws->cs_create(rctx->ws);
+   rctx->cs = rctx->ws->cs_create(rctx->ws, RING_GFX);
rctx->max_db = 8;
si_init_config(rctx);
break;
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 897e962..6daafc3 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -453,7 +453,7 @@ static void *radeon_bo_map(struct radeon_winsys_cs_handle 
*buf,
 } else {
 /* Try to avoid busy-waiting in radeon_bo_wait. */
 if (p_atomic_read(&bo->num_active_ioctls))
-radeon_drm_cs_sync_flush(cs);
+radeon_drm_cs_sync_flush(rcs);
 }
 
 radeon_bo_wait((struct pb_buffer*)bo, RADEON_USAGE_READWRITE);
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
index c5e7f1e..cab2704 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
@@ -90,6 +90,10 @@
 #define RADEON_CS_RING_COMPUTE  1
 #endif
 
+#ifndef RADEON_CS_RING_DMA
+#define RADEON_CS_RING_DMA  2
+#endif
+
 #ifndef RADEON_CS_END_OF_FRAME
 #define RADEON_CS_END_OF_FRAME  0x04
 #endif
@@ -158,10 +162,8 @@ static void radeon_destroy_cs_context(struct 
radeon_cs_context *csc)
 FREE(csc->relocs);
 }
 
-DEBUG_GET_ONCE_BOOL_OPTION(thread, "RADEON_THREAD", TRUE)
-static PIPE_THREAD_ROUTINE(radeon_drm_cs_emit_ioctl, param);
 
-static struct radeon_winsys_cs *radeon_drm_cs_create(struct radeon_winsys *rws)
+static struct radeon_winsys_cs *radeon_drm_cs_create(struct radeon_winsys 
*rws

[Mesa-dev] r600g async dma support

2013-01-25 Thread j . glisse
So design is mostly the same then previously. Few changes, first i use only
one thread to offload all cs submission wether gfx or dma. Reasons is that
using on thread for gfx and one for dma lead to more complex synchronization
with no gain ie when submitting gfx you would need to make sure previous
dma submittion are done and vice et versa. So in the end it's just not a
good idea. Moreover the dma submission is lot faster than the gfx one as
the dma cs are smaller and simpler to parse for the kernel.

Second is that i don't use a stack in r600g to keep track of cs submission
ordering. Instead anytime r600g switch cmd stream ie start writing dma
command after writing gfx one, we first asynchronously flush the gfx
command. This insure that any point in time the driver is only building
command for either gfx or dma ring and everything is serialize from driver
pov. It simplify implementation as there is no need to special case some
corner case such as query/event or streamout buffer.

The last patch is a small optimization that decrease the cpu overhead by
not submitting gfx cmd that does not do anything.

Everything been tested on r7xx and evergreen and i witnessed no regression.

Evergreen can be improved by adding support for partial blit but i am not
sure it's worth it.

Cheers,
Jerome

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/mesa: handle new GLSL IR enumerants in switch statements

2013-01-25 Thread Brian Paul
To silence warnings about unhandled cases.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   14 --
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 643a9bb..2c5ba41 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -984,6 +984,7 @@ type_size(const struct glsl_type *type)
* at link time.
*/
   return 1;
+   case GLSL_TYPE_INTERFACE:
case GLSL_TYPE_VOID:
case GLSL_TYPE_ERROR:
   assert(!"Invalid type in type_size");
@@ -1934,10 +1935,19 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   }
   break;
}
+   case ir_unop_pack_snorm_2x16:
+   case ir_unop_pack_unorm_2x16:
+   case ir_unop_pack_half_2x16:
+   case ir_unop_unpack_snorm_2x16:
+   case ir_unop_unpack_unorm_2x16:
+   case ir_unop_unpack_half_2x16:
+   case ir_unop_unpack_half_2x16_split_x:
+   case ir_unop_unpack_half_2x16_split_y:
+   case ir_binop_pack_half_2x16_split:
case ir_quadop_vector:
-  /* This operation should have already been handled.
+  /* This operation is not supported, or should have already been handled.
*/
-  assert(!"Should not get here.");
+  assert(!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()");
   break;
}
 
-- 
1.7.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel: Use a CPU map of the batch on LLC-sharing architectures.

2013-01-25 Thread Eric Anholt
Kenneth Graunke  writes:

> On 01/20/2013 02:59 PM, Eric Anholt wrote:
>> Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in,
>> which was an improvement over mapping the batch through the GTT directly
>> (since any readback or other failure to stream through write combining
>> correctly would hurt).  However, on LLC-sharing architectures we can do 
>> better
>> by mapping the batch directly, which reduces the cache footprint of the
>> application since we no longer have this extra copy of a batchbuffer around.
>>
>> Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4%
>> (n=21).  Improves Lightsmark performance by 1.1 +/- 0.1% (n=76).  Improves
>> cairo-gl performance by 1.9% +/- 1.4% (n=57).
>>
>> No statistically significant difference in GLB2.1 on SNB (n=37).  Improves
>> cairo-gl performance by 2.1% +/- 0.1% (n=278).
>
> Looks good to me.  Have you tested this on a non-LLC machine?

Not in a long time.  It shouldn't affect performance, since they get the
same behavior as before.


pgpwEqlFHN6a3.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] android: fix stride to be bytes instead of pixels

2013-01-25 Thread Eric Anholt
Tapani Pälli  writes:

> commit 60894edeef973e86a73067276f658b72f84271b6 changed the way dri2
> buffer pitch is interpreted in intel driver createImageFromName
> implementation, caller must set pitch in bytes, not pixels.

Oops, I didn't mean to change behavior of the interface.  It looks like
dri2_create_image_khr_pixmap() is also passing in a number of pixels.  I
can't tell on dri2_create_image_mesa_drm_buffer().

Since it's an interface breakage, so I think we should fix it on the
intel driver side, unless krh agrees that this is the intended interface
all along and that we don't care about new libGL vs old Intel drivers in
this particular case.


pgpKqKbrc6cI6.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] r600g: only emit gfx cmd when there is actual work in it

2013-01-25 Thread j . glisse
From: Jerome Glisse 

Signed-off-by: Jerome Glisse 
---
 src/gallium/drivers/r600/evergreen_compute.c | 2 ++
 src/gallium/drivers/r600/r600_hw_context.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 6 ++
 src/gallium/drivers/r600/r600_pipe.h | 1 +
 src/gallium/drivers/r600/r600_query.c| 2 ++
 src/gallium/drivers/r600/r600_state_common.c | 1 +
 6 files changed, 13 insertions(+)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index f4a7905..977595e 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -308,6 +308,8 @@ static void evergreen_emit_direct_dispatch(
r600_write_value(cs, grid_layout[2]);
/* VGT_DISPATCH_INITIATOR = COMPUTE_SHADER_EN */
r600_write_value(cs, 1);
+
+   rctx->rings.gfx.cdraw++;
 }
 
 static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout,
diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index d7518a5..511a276 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -1122,6 +1122,7 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx,
size -= byte_count;
src_offset += byte_count;
dst_offset += byte_count;
+   rctx->rings.gfx.cdraw++;
}
 }
 
diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 6767412..af08cff 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -120,6 +120,10 @@ static void r600_flush(struct pipe_context *ctx, unsigned 
flags)
struct pipe_query *render_cond = NULL;
unsigned render_cond_mode = 0;
 
+   if (!rctx->rings.gfx.cdraw) {
+   return;
+   }
+
rctx->rings.gfx.flushing = true;
/* Disable render condition. */
if (rctx->current_render_cond) {
@@ -130,6 +134,7 @@ static void r600_flush(struct pipe_context *ctx, unsigned 
flags)
 
r600_context_flush(rctx, flags);
rctx->rings.gfx.flushing = false;
+   rctx->rings.gfx.cdraw = 0;
r600_begin_new_cs(rctx);
 
/* Re-enable render condition. */
@@ -387,6 +392,7 @@ static struct pipe_context *r600_create_context(struct 
pipe_screen *screen, void
goto fail;
}
 
+   rctx->rings.gfx.cdraw = 0;
rctx->rings.gfx.cs = rctx->ws->cs_create(rctx->ws, RING_GFX);
rctx->rings.gfx.flush = r600_flush_gfx_ring;
rctx->ws->cs_set_flush_callback(rctx->rings.gfx.cs, 
r600_flush_from_winsys, rctx);
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index 31dcd05..5c72756 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -418,6 +418,7 @@ struct r600_fetch_shader {
 struct r600_ring {
struct radeon_winsys_cs *cs;
boolflushing;
+   unsignedcdraw;
void (*flush)(void *ctx, unsigned flags);
 };
 
diff --git a/src/gallium/drivers/r600/r600_query.c 
b/src/gallium/drivers/r600/r600_query.c
index 0335189..7916f2d 100644
--- a/src/gallium/drivers/r600/r600_query.c
+++ b/src/gallium/drivers/r600/r600_query.c
@@ -149,6 +149,7 @@ static void r600_emit_query_begin(struct r600_context *ctx, 
struct r600_query *q
cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF);
cs->buf[cs->cdw++] = 0;
cs->buf[cs->cdw++] = 0;
+   ctx->rings.gfx.cdraw++;
break;
default:
assert(0);
@@ -201,6 +202,7 @@ static void r600_emit_query_end(struct r600_context *ctx, 
struct r600_query *que
cs->buf[cs->cdw++] = (3 << 29) | ((va >> 32UL) & 0xFF);
cs->buf[cs->cdw++] = 0;
cs->buf[cs->cdw++] = 0;
+   ctx->rings.gfx.cdraw++;
break;
default:
assert(0);
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index b547d64..d4616ce 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1439,6 +1439,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
r600_trace_emit(rctx);
}
 #endif
+   rctx->rings.gfx.cdraw++;
 
/* Set the depth buffer as dirty. */
if (rctx->framebuffer.state.zsbuf) {
-- 
1.7.11.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: implement GL_ARB_texture_buffer_range v5

2013-01-25 Thread Christoph Bumiller
v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead
of the buffer's current size so we know we always have to use the
full size of the buffer object (i.e. even if it changes without the
user calling TexBuffer again) for the texture.

Clarify invalid offset alignment error message.

v3: Use extra GL_CORE-only section in get_hash_params.py for
TEXTURE_BUFFER_OFFSET_ALIGNMENT.

v4: Remove unnecessary check for profile in _mesa_TexBufferRange.
Add check for extension enable in get_tex_level_parameter_buffer.

v5: Fix position in gl_API.xml.
Add comment about meaning of BufferSize == -1.
---
 src/mapi/glapi/gen/ARB_texture_buffer_range.xml |   22 ++
 src/mapi/glapi/gen/Makefile.am  |1 +
 src/mapi/glapi/gen/gl_API.xml   |4 +
 src/mesa/main/context.c |1 +
 src/mesa/main/extensions.c  |1 +
 src/mesa/main/get.c |1 +
 src/mesa/main/get_hash_params.py|6 ++
 src/mesa/main/mtypes.h  |6 ++
 src/mesa/main/teximage.c|   84 ++-
 src/mesa/main/teximage.h|4 +
 src/mesa/main/texparam.c|   12 +++
 11 files changed, 125 insertions(+), 17 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_texture_buffer_range.xml

diff --git a/src/mapi/glapi/gen/ARB_texture_buffer_range.xml 
b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml
new file mode 100644
index 000..2176c08
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml
@@ -0,0 +1,22 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index f869d28..4d51bbc 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -108,6 +108,7 @@ API_XML = \
ARB_seamless_cube_map.xml \
ARB_sync.xml \
ARB_texture_buffer_object.xml \
+   ARB_texture_buffer_range.xml \
ARB_texture_compression_rgtc.xml \
ARB_texture_float.xml \
ARB_texture_rg.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 404ccea..4cbd724 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8316,6 +8316,10 @@
 
 http://www.w3.org/2001/XInclude"/>
 
+
+
+http://www.w3.org/2001/XInclude"/>
+
 
 
 
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 5e9e539..5058c07 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -564,6 +564,7 @@ _mesa_init_constants(struct gl_context *ctx)
ctx->Const.MaxTextureMaxAnisotropy = MAX_TEXTURE_MAX_ANISOTROPY;
ctx->Const.MaxTextureLodBias = MAX_TEXTURE_LOD_BIAS;
ctx->Const.MaxTextureBufferSize = 65536;
+   ctx->Const.TextureBufferOffsetAlignment = 1;
ctx->Const.MaxArrayLockSize = MAX_ARRAY_LOCK_SIZE;
ctx->Const.SubPixelBits = SUB_PIXEL_BITS;
ctx->Const.MinPointSize = MIN_POINT_SIZE;
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 5d01ac8..207572f 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -130,6 +130,7 @@ static const struct extension extension_table[] = {
{ "GL_ARB_texture_border_clamp",
o(ARB_texture_border_clamp),GLL,2000 },
{ "GL_ARB_texture_buffer_object",   
o(ARB_texture_buffer_object),   GLC,2008 },
{ "GL_ARB_texture_buffer_object_rgb32", 
o(ARB_texture_buffer_object_rgb32), GLC,2009 },
+   { "GL_ARB_texture_buffer_range",
o(ARB_texture_buffer_range),GLC,2012 },
{ "GL_ARB_texture_compression", o(dummy_true),  
GLL,2000 },
{ "GL_ARB_texture_compression_rgtc",
o(ARB_texture_compression_rgtc),GL, 2004 },
{ "GL_ARB_texture_cube_map",o(ARB_texture_cube_map),
GLL,1999 },
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 5f4e2fa..da1e01c 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -353,6 +353,7 @@ EXTRA_EXT(ARB_uniform_buffer_object);
 EXTRA_EXT(ARB_timer_query);
 EXTRA_EXT(ARB_map_buffer_alignment);
 EXTRA_EXT(ARB_texture_cube_map_array);
+EXTRA_EXT(ARB_texture_buffer_range);
 
 static const int
 extra_NV_primitive_restart[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 26a722a..b6bed80 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -701,6 +701,12 @@ descriptor=[
 
 # GL_ARB_texture_cube_map_array
   [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
+]},
+
+# Enums restricted to OpenGL Core profile
+{ "apis"

Re: [Mesa-dev] [PATCH 1/2] r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM

2013-01-25 Thread Alex Deucher
On Fri, Jan 25, 2013 at 10:43 AM, Tom Stellard  wrote:
> From: Tom Stellard 
>
> We were using the NEED_RADEON_GALLIUM conditional to decide whether or not
> to build llvm_wrapper.cpp, which is required for using the LLVM backend.
> llvm_wrapper.cpp needs to be linked against the LLVM IPO libary
> and this library is only added to LLVM_LIBS if either opencl or the
> r600-llvm-compiler is enabled.
>
> The NEED_RADEON_GALLIUM conditional is set to true when enabling the
> radeonsi driver, so if the radeonsi and r600 drivers are enabled without
> also enabling opencl or r600-llvm-compiler, llvm_wrapper.cpp will be
> built, but the IPO library won't be added to LLVM_LIBS.  This was
> causing unresolved symbol errors when buiding with this configuration.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=59831

confirmed this fixes the issue.  for the series:

Tested-by: Alex Deucher 

> ---
>  src/gallium/drivers/r600/Makefile.am | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r600/Makefile.am 
> b/src/gallium/drivers/r600/Makefile.am
> index 995261b..6de7e0f 100644
> --- a/src/gallium/drivers/r600/Makefile.am
> +++ b/src/gallium/drivers/r600/Makefile.am
> @@ -13,7 +13,8 @@ AM_CFLAGS = \
>  libr600_la_SOURCES = \
> $(C_SOURCES)
>
> -if NEED_RADEON_GALLIUM
> +if USE_R600_LLVM_COMPILER
> +if HAVE_GALLIUM_COMPUTE
>
>  libr600_la_SOURCES += \
> $(LLVM_C_SOURCES) \
> @@ -28,6 +29,7 @@ AM_CFLAGS += \
>  AM_CXXFLAGS= \
> $(LLVM_CXXFLAGS)
>  endif
> +endif
>
>  if USE_R600_LLVM_COMPILER
>  AM_CFLAGS += \
> --
> 1.7.11.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600g/llvm: Add dummy export for vs output

2013-01-25 Thread Vincent Lejeune
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=59588
---
 src/gallium/drivers/r600/r600_llvm.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_llvm.c 
b/src/gallium/drivers/r600/r600_llvm.c
index 32b8e56..913dccc 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -374,9 +374,27 @@ static void llvm_emit_epilogue(struct 
lp_build_tgsi_context * bld_base)
}
}
}
+   // Add dummy exports
+   if (ctx->type == TGSI_PROCESSOR_VERTEX) {
+   if (!next_param) {
+   lp_build_intrinsic_unary(base->gallivm->builder, 
"llvm.R600.store.dummy",
+   LLVMVoidTypeInContext(base->gallivm->context),
+   lp_build_const_int32(base->gallivm, 
V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PARAM));
+   }
+   if (!(next_pos-60)) {
+   lp_build_intrinsic_unary(base->gallivm->builder, 
"llvm.R600.store.dummy",
+   LLVMVoidTypeInContext(base->gallivm->context),
+   lp_build_const_int32(base->gallivm, 
V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_POS));
+   }
+   }
+   if (ctx->type == TGSI_PROCESSOR_FRAGMENT) {
+   if (!has_color) {
+   lp_build_intrinsic_unary(base->gallivm->builder, 
"llvm.R600.store.dummy",
+   LLVMVoidTypeInContext(base->gallivm->context),
+   lp_build_const_int32(base->gallivm, 
V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL));
+   }
+   }
 
-   if (!has_color && ctx->type == TGSI_PROCESSOR_FRAGMENT)
-   lp_build_intrinsic(base->gallivm->builder, 
"llvm.R600.store.pixel.dummy", LLVMVoidTypeInContext(base->gallivm->context), 
0, 0);
 }
 
 static void llvm_emit_tex(
-- 
1.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] R600: Make store_dummy intrinsic more general by passing export type

2013-01-25 Thread Vincent Lejeune
---
 lib/Target/R600/R600Instructions.td | 9 +++--
 lib/Target/R600/R600Intrinsics.td   | 4 ++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/Target/R600/R600Instructions.td 
b/lib/Target/R600/R600Instructions.td
index 13293b6..3537906 100644
--- a/lib/Target/R600/R600Instructions.td
+++ b/lib/Target/R600/R600Instructions.td
@@ -608,9 +608,14 @@ multiclass ExportPattern 
cf_inst> {
 0, 61, 7, 0, 7, 7, cf_inst, 0)
   >;
 
-  def : Pat<(int_R600_store_pixel_dummy),
+  def : Pat<(int_R600_store_dummy (i32 imm:$type)),
 (ExportInst
-(v4f32 (IMPLICIT_DEF)), 0, 0, 7, 7, 7, 7, cf_inst, 0)
+(v4f32 (IMPLICIT_DEF)), imm:$type, 0, 7, 7, 7, 7, cf_inst, 0)
+  >;
+
+  def : Pat<(int_R600_store_dummy 1),
+(ExportInst
+(v4f32 (IMPLICIT_DEF)), 1, 60, 7, 7, 7, 7, cf_inst, 0)
   >;
 
   def : Pat<(EXPORT (v4f32 R600_Reg128:$src), (i32 imm:$base), (i32 imm:$type),
diff --git a/lib/Target/R600/R600Intrinsics.td 
b/lib/Target/R600/R600Intrinsics.td
index 4c652a6..b5e4f1e 100644
--- a/lib/Target/R600/R600Intrinsics.td
+++ b/lib/Target/R600/R600Intrinsics.td
@@ -24,6 +24,6 @@ let TargetPrefix = "R600", isTarget = 1 in {
   Intrinsic<[], [llvm_float_ty], []>;
   def int_R600_store_pixel_stencil :
   Intrinsic<[], [llvm_float_ty], []>;
-  def int_R600_store_pixel_dummy :
-  Intrinsic<[], [], []>;
+  def int_R600_store_dummy :
+  Intrinsic<[], [llvm_i32_ty], []>;
 }
-- 
1.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #7 from Tom Stellard  ---
(In reply to comment #5)
> It was false to remove libr600_la_LDFLAGS in this patch:
> http://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=69d639ba8b3cfd95cfbb12b861dbe2eda53f2e25
> 
> And please change all Makefile.am to generate LLVM related LIBADDs this way
> to avoid stupid dependencies if LLVM was compiled with the better cmake
> build system which creates shared instead of static libs / one big shared
> lib and can save memory this way.

Generating different shared libraries depending on the build system used is a
bug in LLVM.  However, until it is fixed we need to support both build systems
even if one is better.

Adding llvm libraries in makefiles using llvm-config will not work when we are
linking against shared libraries generated by an autotools build of LLVM,
because then we will be linking against shared and static libraries at the same
time.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Paul Berry
On 24 January 2013 19:44, Matt Turner  wrote:

> Following this email are eight patches that add the 4x8 pack/unpack
> operations that are the difference between what GLSL ES 3.0 and
> ARB_shading_language_packing require.
>
> They require Chad's gles3-glsl-packing series and are available at
>
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing
>
> I've also added testing support on top of Chad's piglit patch. The
> {vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
> spot why.
>

I had minor comments on patches 4/8 and 5/8.  The remainder is:

Reviewed-by: Paul Berry 

I didn't spot anything that would explain the failure in unpackUnorm4x8
tests.  I'll go have a look at your piglit tests now, and if I don't find
anything there either, I'll fire up the simulator and see if I can see
what's going wrong.


>
> Please give it a look. I'd be nice to get this into 9.1.
>
> Thanks,
> Matt
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #6 from Tom Stellard  ---
This should be fixed by this patch:

http://lists.freedesktop.org/archives/mesa-dev/2013-January/033482.html

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/8] glsl: Add support for lowering 4x8 pack/unpack operations

2013-01-25 Thread Paul Berry
On 24 January 2013 19:47, Matt Turner  wrote:

> Lower them to arithmetic and bit manipulation expressions.
> ---
>  src/glsl/ir_optimization.h  |6 +
>  src/glsl/lower_packing_builtins.cpp |  279
> +++
>  2 files changed, 285 insertions(+), 0 deletions(-)
>
> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
> index ac90b87..8f33018 100644
> --- a/src/glsl/ir_optimization.h
> +++ b/src/glsl/ir_optimization.h
> @@ -54,6 +54,12 @@ enum lower_packing_builtins_op {
>
> LOWER_PACK_HALF_2x16_TO_SPLIT= 0x0040,
> LOWER_UNPACK_HALF_2x16_TO_SPLIT  = 0x0080,
> +
> +   LOWER_PACK_SNORM_4x8 = 0x0100,
> +   LOWER_UNPACK_SNORM_4x8   = 0x0200,
> +
> +   LOWER_PACK_UNORM_4x8 = 0x0400,
> +   LOWER_UNPACK_UNORM_4x8   = 0x0800,
>  };
>
>  bool do_common_optimization(exec_list *ir, bool linked,
> diff --git a/src/glsl/lower_packing_builtins.cpp
> b/src/glsl/lower_packing_builtins.cpp
> index 49176cc..aa6765f 100644
> --- a/src/glsl/lower_packing_builtins.cpp
> +++ b/src/glsl/lower_packing_builtins.cpp
> @@ -85,9 +85,15 @@ public:
>case LOWER_PACK_SNORM_2x16:
>   *rvalue = lower_pack_snorm_2x16(op0);
>   break;
> +  case LOWER_PACK_SNORM_4x8:
> + *rvalue = lower_pack_snorm_4x8(op0);
> + break;
>case LOWER_PACK_UNORM_2x16:
>   *rvalue = lower_pack_unorm_2x16(op0);
>   break;
> +  case LOWER_PACK_UNORM_4x8:
> + *rvalue = lower_pack_unorm_4x8(op0);
> + break;
>case LOWER_PACK_HALF_2x16:
>   *rvalue = lower_pack_half_2x16(op0);
>   break;
> @@ -97,9 +103,15 @@ public:
>case LOWER_UNPACK_SNORM_2x16:
>   *rvalue = lower_unpack_snorm_2x16(op0);
>   break;
> +  case LOWER_UNPACK_SNORM_4x8:
> + *rvalue = lower_unpack_snorm_4x8(op0);
> + break;
>case LOWER_UNPACK_UNORM_2x16:
>   *rvalue = lower_unpack_unorm_2x16(op0);
>   break;
> +  case LOWER_UNPACK_UNORM_4x8:
> + *rvalue = lower_unpack_unorm_4x8(op0);
> + break;
>case LOWER_UNPACK_HALF_2x16:
>   *rvalue = lower_unpack_half_2x16(op0);
>   break;
> @@ -137,18 +149,30 @@ private:
>case ir_unop_pack_snorm_2x16:
>   result = op_mask & LOWER_PACK_SNORM_2x16;
>   break;
> +  case ir_unop_pack_snorm_4x8:
> + result = op_mask & LOWER_PACK_SNORM_4x8;
> + break;
>case ir_unop_pack_unorm_2x16:
>   result = op_mask & LOWER_PACK_UNORM_2x16;
>   break;
> +  case ir_unop_pack_unorm_4x8:
> + result = op_mask & LOWER_PACK_UNORM_4x8;
> + break;
>case ir_unop_pack_half_2x16:
>   result = op_mask & (LOWER_PACK_HALF_2x16 |
> LOWER_PACK_HALF_2x16_TO_SPLIT);
>   break;
>case ir_unop_unpack_snorm_2x16:
>   result = op_mask & LOWER_UNPACK_SNORM_2x16;
>   break;
> +  case ir_unop_unpack_snorm_4x8:
> + result = op_mask & LOWER_UNPACK_SNORM_4x8;
> + break;
>case ir_unop_unpack_unorm_2x16:
>   result = op_mask & LOWER_UNPACK_UNORM_2x16;
>   break;
> +  case ir_unop_unpack_unorm_4x8:
> + result = op_mask & LOWER_UNPACK_UNORM_4x8;
> + break;
>case ir_unop_unpack_half_2x16:
>   result = op_mask & (LOWER_UNPACK_HALF_2x16 |
> LOWER_UNPACK_HALF_2x16_TO_SPLIT);
>   break;
> @@ -214,6 +238,30 @@ private:
> }
>
> /**
> +* \brief Pack four uint8's into a single uint32.
> +*
> +* Interpret the given uvec4 as a uint32 quad. Pack the quad into a
> uint32
> +* where the least significant bits specify the first element of the
> quad.
> +* Return the uint32.
> +*/
> +   ir_rvalue*
> +   pack_uvec4_to_uint(ir_rvalue *uvec4_rval)
> +   {
> +  assert(uvec4_rval->type == glsl_type::uvec4_type);
> +
> +  /* uvec4 u = UVEC4_RVAL; */
> +  ir_variable *u = factory.make_temp(glsl_type::uvec4_type,
> +  "tmp_pack_uvec4_to_uint");
> +  factory.emit(assign(u, uvec4_rval));
>

Rather than do four scalar bit_and(..., constant(0xffu)) instructions
below, how about changing the above line to:

factory.emit(assign(u, bit_and(uvec4_rval, constant(0xffu;

That way we take advantage of vector processing in the GPU to do all four
bit_ands at once.

With that fixed (as well as the copy/paste errors Ian spotted), this patch
is:

Reviewed-by: Paul Berry 


> +
> +  /* return ((u.w 0xff) << 24) | ((u.z & 0xff) << 16) | ((u.y & 0xff)
> << 8) | (u.x & 0xff); */
> +  return bit_or(bit_or(lshift(bit_and(swizzle_w(u), constant(0xffu)),
> constant(24u)),
> +   lshift(bit_and(swizzle_z(u), constant(0xffu)),
> constant(16u))),
> +bit_or(lshift(bit_and(swizzle_y(u), constant(0xffu)),
> constant(8u)),
> +   bit_and(swizzle_

[Mesa-dev] [PATCH 2/2] configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs

2013-01-25 Thread Tom Stellard
From: Tom Stellard 

This is required when LLVM is built with CMake, which creates one
shared library for each component.
---
 configure.ac | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index ccf95c5..90085de 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1662,16 +1662,14 @@ if test "x$enable_gallium_llvm" = xyes; then
 if test "x$LLVM_CONFIG" != xno; then
LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'`
LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 
's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
-if test "x$with_llvm_shared_libs" != xyes; then
-LLVM_COMPONENTS="engine bitwriter"
-if $LLVM_CONFIG --components | grep -q '\'; then
-LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
-fi
+LLVM_COMPONENTS="engine bitwriter"
+if $LLVM_CONFIG --components | grep -q '\'; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
+fi
 
-if test "x$enable_opencl" = xyes; then
-LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
-fi
-   fi
+if test "x$enable_opencl" = xyes; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
+fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"`
@@ -1839,7 +1837,7 @@ if test "x$with_gallium_drivers" != x; then
 if test "x$enable_r600_llvm" = xyes; then
 USE_R600_LLVM_COMPILER=yes;
 fi
-if test "x$enable_opencl" = xyes -a "x$with_llvm_shared_libs" = 
xno; then
+if test "x$enable_opencl" = xyes; then
 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
 fi
 gallium_check_st "radeon/drm" "dri-r600" "xorg-r600" "" 
"xvmc-r600" "vdpau-r600"
-- 
1.7.11.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM

2013-01-25 Thread Tom Stellard
From: Tom Stellard 

We were using the NEED_RADEON_GALLIUM conditional to decide whether or not
to build llvm_wrapper.cpp, which is required for using the LLVM backend.
llvm_wrapper.cpp needs to be linked against the LLVM IPO libary
and this library is only added to LLVM_LIBS if either opencl or the
r600-llvm-compiler is enabled.

The NEED_RADEON_GALLIUM conditional is set to true when enabling the
radeonsi driver, so if the radeonsi and r600 drivers are enabled without
also enabling opencl or r600-llvm-compiler, llvm_wrapper.cpp will be
built, but the IPO library won't be added to LLVM_LIBS.  This was
causing unresolved symbol errors when buiding with this configuration.

https://bugs.freedesktop.org/show_bug.cgi?id=59831
---
 src/gallium/drivers/r600/Makefile.am | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/Makefile.am 
b/src/gallium/drivers/r600/Makefile.am
index 995261b..6de7e0f 100644
--- a/src/gallium/drivers/r600/Makefile.am
+++ b/src/gallium/drivers/r600/Makefile.am
@@ -13,7 +13,8 @@ AM_CFLAGS = \
 libr600_la_SOURCES = \
$(C_SOURCES)
 
-if NEED_RADEON_GALLIUM
+if USE_R600_LLVM_COMPILER
+if HAVE_GALLIUM_COMPUTE
 
 libr600_la_SOURCES += \
$(LLVM_C_SOURCES) \
@@ -28,6 +29,7 @@ AM_CFLAGS += \
 AM_CXXFLAGS= \
$(LLVM_CXXFLAGS)
 endif
+endif
 
 if USE_R600_LLVM_COMPILER
 AM_CFLAGS += \
-- 
1.7.11.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/8] glsl: Evaluate constant pack/unpack 4x8 expressions

2013-01-25 Thread Paul Berry
On 24 January 2013 19:47, Matt Turner  wrote:

> That is, evaluate constant expressions for the following functions:
>   packSnorm4x8, unpackSnorm4x8
>   packUnorm4x8, unpackUnorm4x8
> ---
>  src/glsl/ir_constant_expression.cpp |  162
> +++
>  1 files changed, 162 insertions(+), 0 deletions(-)
>
> diff --git a/src/glsl/ir_constant_expression.cpp
> b/src/glsl/ir_constant_expression.cpp
> index b34c6e8..4796f6f 100644
> --- a/src/glsl/ir_constant_expression.cpp
> +++ b/src/glsl/ir_constant_expression.cpp
> @@ -76,12 +76,24 @@ bitcast_f2u(float f)
>  }
>
>  /**
> + * Evaluate one component of a floating-point 4x8 unpacking function.
> + */
> +typedef uint8_t
> +(*pack_1x8_func_t)(float);
> +
> +/**
>   * Evaluate one component of a floating-point 2x16 unpacking function.
>   */
>  typedef uint16_t
>  (*pack_1x16_func_t)(float);
>
>  /**
> + * Evaluate one component of a floating-point 4x8 unpacking function.
> + */
> +typedef float
> +(*unpack_1x8_func_t)(uint8_t);
> +
> +/**
>   * Evaluate one component of a floating-point 2x16 unpacking function.
>   */
>  typedef float
> @@ -112,6 +124,32 @@ pack_2x16(pack_1x16_func_t pack_1x16,
>  }
>
>  /**
> + * Evaluate a 4x8 floating-point packing function.
> + */
> +static uint32_t
> +pack_4x8(pack_1x8_func_t pack_1x8,
> + float x, float y, float z, float w)
> +{
> +   /* From section 8.4 of the GLSL 4.30 spec:
> +*
> +*packSnorm4x8
> +*
> +*The first component of the vector will be written to the least
> +*significant bits of the output; the last component will be
> written to
> +*the most significant bits.
> +*
> +* The specifications for the other packing functions contain similar
> +* language.
> +*/
> +   uint32_t u = 0;
> +   u |= ((uint32_t) pack_1x8(x) << 0);
> +   u |= ((uint32_t) pack_1x8(y) << 8);
> +   u |= ((uint32_t) pack_1x8(z) << 16);
> +   u |= ((uint32_t) pack_1x8(w) << 24);
> +   return u;
> +}
> +
> +/**
>   * Evaluate a 2x16 floating-point unpacking function.
>   */
>  static void
> @@ -135,6 +173,48 @@ unpack_2x16(unpack_1x16_func_t unpack_1x16,
>  }
>
>  /**
> + * Evaluate a 4x8 floating-point unpacking function.
> + */
> +static void
> +unpack_4x8(unpack_1x8_func_t unpack_1x8, uint32_t u,
> +   float *x, float *y, float *z, float *w)
> +{
> +/* From section 8.4 of the GLSL 4.30 spec:
> + *
> + *unpackSnorm4x8
> + *--
> + *The first component of the returned vector will be extracted
> from
> + *the least significant bits of the input; the last component
> will be
> + *extracted from the most significant bits.
> + *
> + * The specifications for the other unpacking functions contain
> similar
> + * language.
> + */
> +   *x = unpack_1x8((uint8_t) (u & 0xff));
> +   *y = unpack_1x8((uint8_t) (u >> 8));
> +   *z = unpack_1x8((uint8_t) (u >> 16));
> +   *w = unpack_1x8((uint8_t) (u >> 24));
> +}
> +
> +/**
> + * Evaluate one component of packSnorm4x8.
> + */
> +static uint8_t
> +pack_snorm_1x8(float x)
> +{
> +/* From section 8.4 of the GLSL 4.30 spec:
> + *
> + *packSnorm4x8
> + *
> + *The conversion for component c of v to fixed point is done as
> + *follows:
> + *
> + *  packSnorm4x8: round(clamp(c, -1, +1) * 127.0)
> + */
> +   return (uint8_t) _mesa_round_to_even(CLAMP(x, -1.0f, +1.0f) * 127.0f);
> +}
>

IIRC, Brian Paul has a patch out on the list that changes the return type
of _mesa_round_to_even() to float.  If & when that patch lands, this
conversion will result in undefined behaviour, since casing from a negative
float to an unsigned value is undefined by the C standard.

I recommend changing this to "return (uint8_t) (int8_t)
_mesa_round_to_even(...)" and adding a sentence to the comment to explain
why this is necessary.  See the existing pack_snorm_1x16() function, which
used to have the same issue.

With that change, this patch is:

Reviewed-by: Paul Berry 


> +
> +/**
>   * Evaluate one component of packSnorm2x16.
>   */
>  static uint16_t
> @@ -153,6 +233,24 @@ pack_snorm_1x16(float x)
>  }
>
>  /**
> + * Evaluate one component of unpackSnorm4x8.
> + */
> +static float
> +unpack_snorm_1x8(uint8_t u)
> +{
> +/* From section 8.4 of the GLSL 4.30 spec:
> + *
> + *unpackSnorm4x8
> + *--
> + *The conversion for unpacked fixed-point value f to floating
> point is
> + *done as follows:
> + *
> + *   unpackSnorm4x8: clamp(f / 127.0, -1, +1)
> + */
> +   return CLAMP((int8_t) u / 127.0f, -1.0f, +1.0f);
> +}
> +
> +/**
>   * Evaluate one component of unpackSnorm2x16.
>   */
>  static float
> @@ -171,6 +269,24 @@ unpack_snorm_1x16(uint16_t u)
>  }
>
>  /**
> + * Evaluate one component packUnorm4x8.
> + */
> +static uint8_t
> +pack_unorm_1x8(float x)
> +{
> +/* From section 8.4 of the GLSL 4.30 spec:
> + *
> + 

Re: [Mesa-dev] [PATCH] mesa: implement GL_ARB_texture_buffer_range v4

2013-01-25 Thread Ian Romanick

On 01/25/2013 08:54 AM, Christoph Bumiller wrote:

v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead
of the buffer's current size so we know we always have to use the
full size of the buffer object (i.e. even if it changes without the
user calling TexBuffer again) for the texture.


Maybe make this a comment in the code somewhere.  Perhaps at the 
BufferSize declaration in gl_texture_object?



Clarify invalid offset alignment error message.

v3: Use extra GL_CORE-only section in get_hash_params.py for
TEXTURE_BUFFER_OFFSET_ALIGNMENT.

v4: Remove unnecessary check for profile in _mesa_TexBufferRange.
Add check for extension enable in get_tex_level_parameter_buffer.
---
  src/mapi/glapi/gen/ARB_texture_buffer_range.xml |   22 ++
  src/mapi/glapi/gen/Makefile.am  |1 +
  src/mapi/glapi/gen/gl_API.xml   |2 +
  src/mesa/main/context.c |1 +
  src/mesa/main/extensions.c  |1 +
  src/mesa/main/get.c |1 +
  src/mesa/main/get_hash_params.py|6 ++
  src/mesa/main/mtypes.h  |6 ++
  src/mesa/main/teximage.c|   84 ++-
  src/mesa/main/teximage.h|4 +
  src/mesa/main/texparam.c|   12 +++
  11 files changed, 123 insertions(+), 17 deletions(-)
  create mode 100644 src/mapi/glapi/gen/ARB_texture_buffer_range.xml

diff --git a/src/mapi/glapi/gen/ARB_texture_buffer_range.xml 
b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml
new file mode 100644
index 000..2176c08
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml
@@ -0,0 +1,22 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index f869d28..4d51bbc 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -108,6 +108,7 @@ API_XML = \
ARB_seamless_cube_map.xml \
ARB_sync.xml \
ARB_texture_buffer_object.xml \
+   ARB_texture_buffer_range.xml \
ARB_texture_compression_rgtc.xml \
ARB_texture_float.xml \
ARB_texture_rg.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 404ccea..8d700a1 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8151,6 +8151,8 @@

  http://www.w3.org/2001/XInclude"/>

+http://www.w3.org/2001/XInclude"/>
+


Everywhere else we sort alphabetically by name.  Here, however, we sort 
by assigned extension number.  It just happens that the other 3 
extensions shown in this hunk have the same sort order for both.



  http://www.w3.org/2001/XInclude"/>

  http://www.w3.org/2001/XInclude"/>
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 5e9e539..5058c07 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -564,6 +564,7 @@ _mesa_init_constants(struct gl_context *ctx)
 ctx->Const.MaxTextureMaxAnisotropy = MAX_TEXTURE_MAX_ANISOTROPY;
 ctx->Const.MaxTextureLodBias = MAX_TEXTURE_LOD_BIAS;
 ctx->Const.MaxTextureBufferSize = 65536;
+   ctx->Const.TextureBufferOffsetAlignment = 1;
 ctx->Const.MaxArrayLockSize = MAX_ARRAY_LOCK_SIZE;
 ctx->Const.SubPixelBits = SUB_PIXEL_BITS;
 ctx->Const.MinPointSize = MIN_POINT_SIZE;
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 5d01ac8..207572f 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -130,6 +130,7 @@ static const struct extension extension_table[] = {
 { "GL_ARB_texture_border_clamp",
o(ARB_texture_border_clamp),GLL,2000 },
 { "GL_ARB_texture_buffer_object",   
o(ARB_texture_buffer_object),   GLC,2008 },
 { "GL_ARB_texture_buffer_object_rgb32", 
o(ARB_texture_buffer_object_rgb32), GLC,2009 },
+   { "GL_ARB_texture_buffer_range",
o(ARB_texture_buffer_range),GLC,2012 },
 { "GL_ARB_texture_compression", o(dummy_true), 
 GLL,2000 },
 { "GL_ARB_texture_compression_rgtc",
o(ARB_texture_compression_rgtc),GL, 2004 },
 { "GL_ARB_texture_cube_map",o(ARB_texture_cube_map),   
 GLL,1999 },
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 5f4e2fa..da1e01c 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -353,6 +353,7 @@ EXTRA_EXT(ARB_uniform_buffer_object);
  EXTRA_EXT(ARB_timer_query);
  EXTRA_EXT(ARB_map_buffer_alignment);
  EXTRA_EXT(ARB_texture_cube_map_array);
+EXTRA_EXT(ARB_texture_buffer_range);

  static const int
  extra_NV_primitive_restart[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/mai

[Mesa-dev] [PATCH V6 6/8] intel: Create a miptree using offsets in intel_set_texture_image_region

2013-01-25 Thread Abdiel Janulgue
When binding a region to a texture image, re-create the miptree base-level
considering the offset and dimension information exported by DRIImage.

Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/intel/intel_tex_image.c |   31 --
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_image.c 
b/src/mesa/drivers/dri/intel/intel_tex_image.c
index 7361e6a..a4cf883 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_image.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_image.c
@@ -256,7 +256,11 @@ intel_set_texture_image_region(struct gl_context *ctx,
   GLenum target,
   GLenum internalFormat,
   gl_format format,
-   uint32_t offset)
+   uint32_t offset,
+   GLuint width,
+   GLuint height,
+   GLuint tile_x,
+   GLuint tile_y)
 {
struct intel_context *intel = intel_context(ctx);
struct intel_texture_image *intel_image = intel_texture_image(image);
@@ -264,14 +268,22 @@ intel_set_texture_image_region(struct gl_context *ctx,
struct intel_texture_object *intel_texobj = intel_texture_object(texobj);
 
_mesa_init_teximage_fields(&intel->ctx, image,
- region->width, region->height, 1,
+ width, height, 1,
  0, internalFormat, format);
 
ctx->Driver.FreeTextureImageBuffer(ctx, image);
 
-   intel_image->mt = intel_miptree_create_for_region(intel, target,
-image->TexFormat,
-region);
+   intel_image->mt = intel_miptree_create_layout(intel, target, 
image->TexFormat,
+ 0, 0,
+ width, height, 1,
+ true, 0 /* num_samples */,
+ INTEL_MSAA_LAYOUT_NONE);
+   intel_region_reference(&intel_image->mt->region, region);
+   intel_image->mt->total_width = width;
+   intel_image->mt->total_height = height;
+   intel_image->mt->level[0].slice[0].x_offset = tile_x;
+   intel_image->mt->level[0].slice[0].y_offset = tile_y;
+
if (intel_image->mt == NULL)
return;
intel_texobj->needs_validate = true;
@@ -332,7 +344,10 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
_mesa_lock_texture(&intel->ctx, texObj);
texImage = _mesa_get_tex_image(ctx, texObj, target, level);
intel_set_texture_image_region(ctx, texImage, rb->mt->region, target,
- internalFormat, texFormat, 0);
+  internalFormat, texFormat, 0,
+  rb->mt->region->width,
+  rb->mt->region->height,
+  0, 0);
_mesa_unlock_texture(&intel->ctx, texObj);
 }
 
@@ -363,7 +378,9 @@ intel_image_target_texture_2d(struct gl_context *ctx, 
GLenum target,
 
intel_set_texture_image_region(ctx, texImage, image->region,
  target, image->internal_format,
-  image->format, image->offset);
+  image->format, image->offset,
+  image->width,  image->height,
+  image->tile_x, image->tile_y);
 }
 
 void
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V6 8/8] intel: implement create image from texture

2013-01-25 Thread Abdiel Janulgue
Save miptree level info to DRIImage:
- Appropriately-aligned base offset pointing to the image
- Additional x/y adjustment offsets from above.

In non-tile-aligned surface cases where resolving back to the original image
located in mip-levels higher than the base level proves problematic due to
offset alignment issues, report INVALID_OPERATION as per spec wording.

Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/intel/intel_screen.c |  179 +
 1 file changed, 159 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_screen.c 
b/src/mesa/drivers/dri/intel/intel_screen.c
index e0fe8c1..d23246a 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.c
+++ b/src/mesa/drivers/dri/intel/intel_screen.c
@@ -31,11 +31,13 @@
 #include "main/context.h"
 #include "main/framebuffer.h"
 #include "main/renderbuffer.h"
+#include "main/texobj.h"
 #include "main/hash.h"
 #include "main/fbobject.h"
 #include "main/mfeatures.h"
 #include "main/version.h"
 #include "swrast/s_renderbuffer.h"
+#include "egl/main/eglcurrent.h"
 
 #include "utils.h"
 #include "xmlpool.h"
@@ -104,6 +106,10 @@ const GLuint __driNConfigOptions = 15;
 #include "intel_tex.h"
 #include "intel_regions.h"
 
+#ifndef I915
+#include "brw_context.h"
+#endif
+
 #include "i915_drm.h"
 
 #ifdef USE_NEW_INTERFACE
@@ -295,6 +301,87 @@ intel_allocate_image(int dri_format, void *loaderPrivate)
 return image;
 }
 
+static void
+intel_image_set_level_info(__DRIimage *image, struct intel_mipmap_tree *mt,
+   int level, int slice)
+{
+   unsigned int draw_x, draw_y;
+   uint32_t mask_x, mask_y;
+
+   intel_region_get_tile_masks(mt->region, &mask_x, &mask_y, false);
+   intel_miptree_get_image_offset(mt, level, slice, &draw_x, &draw_y);
+
+   image->width = mt->level[level].width;
+   image->height = mt->level[level].height;
+   image->tile_x = draw_x & mask_x;
+   image->tile_y = draw_y & mask_y;
+
+   image->offset = intel_region_get_aligned_offset(mt->region,
+   draw_x & ~mask_x,
+   draw_y & ~mask_y,
+   false);
+}
+
+/**
+ * Sets up a DRIImage structure to point to our shared image in a region
+ */
+static bool
+intel_setup_image_from_mipmap_tree(struct intel_context *intel, __DRIimage 
*image,
+   struct intel_mipmap_tree *mt, GLuint level,
+   GLuint zoffset)
+{
+   bool has_surface_tile_offset = false;
+   uint32_t draw_x, draw_y;
+
+   intel_miptree_check_level_layer(mt, level, zoffset);
+   intel_miptree_get_tile_offsets(mt, level, zoffset, &draw_x, &draw_y);
+
+#ifndef I915
+   has_surface_tile_offset = brw_context(&intel->ctx)->has_surface_tile_offset;
+#endif
+   if (!has_surface_tile_offset &&
+   (draw_x != 0 || draw_y != 0))
+  /* Non-tile aligned sufaces in gen4 hw and earlier have problems 
resolving
+   * back to our destination due to alignment issues. Bail-out and report 
error
+   */
+  return false;
+
+   intel_image_set_level_info(image, mt, level, zoffset);
+   intel_region_reference(&image->region, mt->region);
+
+   return true;
+}
+
+static void
+intel_setup_image_from_dimensions(__DRIimage *image)
+{
+   image->width= image->region->width;
+   image->height   = image->region->height;
+   image->tile_x = 0;
+   image->tile_y = 0;
+}
+
+static inline uint32_t
+intel_dri_format(GLuint format)
+{
+   switch (format) {
+   case MESA_FORMAT_RGB565:
+  return __DRI_IMAGE_FORMAT_RGB565;
+   case MESA_FORMAT_XRGB:
+  return __DRI_IMAGE_FORMAT_XRGB;
+   case MESA_FORMAT_ARGB:
+  return __DRI_IMAGE_FORMAT_ARGB;
+   case MESA_FORMAT_RGBA_REV:
+  return __DRI_IMAGE_FORMAT_ABGR;
+   case MESA_FORMAT_R8:
+  return __DRI_IMAGE_FORMAT_R8;
+   case MESA_FORMAT_RG88:
+  return __DRI_IMAGE_FORMAT_GR88;
+   }
+
+   return MESA_FORMAT_NONE;
+}
+
 static __DRIimage *
 intel_create_image_from_name(__DRIscreen *screen,
 int width, int height, int format,
@@ -317,6 +404,8 @@ intel_create_image_from_name(__DRIscreen *screen,
return NULL;
 }
 
+intel_setup_image_from_dimensions(image);
+
 return image;  
 }
 
@@ -346,26 +435,69 @@ intel_create_image_from_renderbuffer(__DRIcontext 
*context,
image->offset = 0;
image->data = loaderPrivate;
intel_region_reference(&image->region, irb->mt->region);
+   intel_setup_image_from_dimensions(image);
+   image->dri_format = intel_dri_format(image->format);
 
-   switch (image->format) {
-   case MESA_FORMAT_RGB565:
-  image->dri_format = __DRI_IMAGE_FORMAT_RGB565;
-  break;
-   case MESA_FORMAT_XRGB:
-  image->dri_format = __DRI_IMAGE_FORMAT_XRGB;
-  break;
-   case MESA_FORMAT_ARGB:
-  image->dri_format = __DRI_IMAGE_FORMAT_ARGB;
-  break;
-   case MESA_FORMAT_RGBA888

[Mesa-dev] [PATCH V6 7/8] intel: Account for mt->offset in intel_miptree_map

2013-01-25 Thread Abdiel Janulgue
We need to take account the offset from original bo when using glTexSubImage()
and other functions that manipulate the subregion of an exported texture.
Offsets are appended to mapped region address and when blitting from a source
region.

Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 435f12f..ceb5322 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1126,7 +1126,7 @@ intel_miptree_map_gtt(struct intel_context *intel,
assert(y % bh == 0);
y /= bh;
 
-   base = intel_region_map(intel, mt->region, map->mode);
+   base = intel_region_map(intel, mt->region, map->mode) + mt->offset;
 
if (base == NULL)
   map->ptr = NULL;
@@ -1186,7 +1186,7 @@ intel_miptree_map_blit(struct intel_context *intel,
if (!intelEmitCopyBlit(intel,
  mt->region->cpp,
  mt->region->pitch, mt->region->bo,
- 0, mt->region->tiling,
+ mt->offset, mt->region->tiling,
  map->stride / mt->region->cpp, map->bo,
  0, I915_TILING_NONE,
  x, y,
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V6 5/8] i965: Account for offsets when updating SURFACE_STATE.

2013-01-25 Thread Abdiel Janulgue
If the offsets are present, this lets us specify a particular level and slice
in a shared region using the base level of an exported mip-map tree.

Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |   12 +++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |   12 ++--
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index a2a875f..e37de8d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -804,6 +804,7 @@ brw_update_texture_surface(struct gl_context *ctx,
struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
uint32_t *surf;
int width, height, depth;
+   uint32_t tile_x, tile_y;
 
if (tObj->Target == GL_TEXTURE_BUFFER) {
   brw_update_buffer_texture_surface(ctx, unit, binding_table, surf_index);
@@ -837,7 +838,16 @@ brw_update_texture_surface(struct gl_context *ctx,
 
surf[4] = 0;
 
-   surf[5] = (mt->align_h == 4) ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0;
+   intel_miptree_get_tile_offsets(intelObj->mt, 0, 0, &tile_x, &tile_y);
+   assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0));
+   /* Note that the low bits of these fields are missing, so
+* there's the possibility of getting in trouble.
+*/
+   assert(tile_x % 4 == 0);
+   assert(tile_y % 2 == 0);
+   surf[5] = ((tile_x / 4) << BRW_SURFACE_X_OFFSET_SHIFT |
+ (tile_y / 2) << BRW_SURFACE_Y_OFFSET_SHIFT |
+ (mt->align_h == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0));
 
/* Emit relocation to surface contents */
drm_intel_bo_emit_reloc(brw->intel.batch.bo,
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 1e5af95..0eacd0a 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -302,6 +302,7 @@ gen7_update_texture_surface(struct gl_context *ctx,
struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
struct gen7_surface_state *surf;
int width, height, depth;
+   uint32_t tile_x, tile_y;
 
if (tObj->Target == GL_TEXTURE_BUFFER) {
   gen7_update_buffer_texture_surface(ctx, unit, binding_table, surf_index);
@@ -360,12 +361,19 @@ gen7_update_texture_surface(struct gl_context *ctx,
 
/* ss4: ignored? */
 
+   intel_miptree_get_tile_offsets(intelObj->mt, 0, 0, &tile_x, &tile_y);
+   assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0));
+   /* Note that the low bits of these fields are missing, so
+* there's the possibility of getting in trouble.
+*/
+   assert(tile_x % 4 == 0);
+   assert(tile_y % 2 == 0);
surf->ss5.mip_count = intelObj->_MaxLevel - tObj->BaseLevel;
surf->ss5.min_lod = 0;
+   surf->ss5.x_offset = tile_x / 4;
+   surf->ss5.y_offset = tile_y / 2;
 
/* ss5 remaining fields:
-* - x_offset (N/A for textures?)
-* - y_offset (ditto)
 * - cache_control
 */
 
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V6 4/8] intel: add pixel offset calculator for miptree levels

2013-01-25 Thread Abdiel Janulgue
Add helper to calculate fine-grained x and y adjustment pixels
to an image within a miptree level for tiled regions.

Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c |   15 +++
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h |6 ++
 2 files changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index cc74d3c..435f12f 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -688,6 +688,21 @@ intel_miptree_get_image_offset(struct intel_mipmap_tree 
*mt,
*y = mt->level[level].slice[slice].y_offset;
 }
 
+void
+intel_miptree_get_tile_offsets(struct intel_mipmap_tree *mt,
+   GLuint level, GLuint slice,
+   uint32_t *tile_x,
+   uint32_t *tile_y)
+{
+   struct intel_region *region = mt->region;
+   uint32_t mask_x, mask_y;
+
+   intel_region_get_tile_masks(region, &mask_x, &mask_y, false);
+
+   *tile_x = mt->level[level].slice[slice].x_offset & mask_x;
+   *tile_y = mt->level[level].slice[slice].y_offset & mask_y;
+}
+
 static void
 intel_miptree_copy_slice(struct intel_context *intel,
 struct intel_mipmap_tree *dst_mt,
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
index 1b2270a..d822491 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
@@ -460,6 +460,12 @@ void
 intel_miptree_get_dimensions_for_image(struct gl_texture_image *image,
int *width, int *height, int *depth);
 
+void
+intel_miptree_get_tile_offsets(struct intel_mipmap_tree *mt,
+   GLuint level, GLuint slice,
+   uint32_t *tile_x,
+   uint32_t *tile_y);
+
 void intel_miptree_set_level_info(struct intel_mipmap_tree *mt,
   GLuint level,
   GLuint x, GLuint y,
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V6 3/8] intel: Expose intel_miptree_create_internal as intel_miptree_create_layout.

2013-01-25 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c |   37 
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h |   14 -
 2 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 8d814bd..cc74d3c 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -71,18 +71,18 @@ target_to_target(GLenum target)
  *intel_miptree_create_for_region(). If true, then do not create
  *\c stencil_mt.
  */
-static struct intel_mipmap_tree *
-intel_miptree_create_internal(struct intel_context *intel,
- GLenum target,
- gl_format format,
- GLuint first_level,
- GLuint last_level,
- GLuint width0,
- GLuint height0,
- GLuint depth0,
- bool for_region,
-  GLuint num_samples,
-  enum intel_msaa_layout msaa_layout)
+struct intel_mipmap_tree *
+intel_miptree_create_layout(struct intel_context *intel,
+GLenum target,
+gl_format format,
+GLuint first_level,
+GLuint last_level,
+GLuint width0,
+GLuint height0,
+GLuint depth0,
+bool for_region,
+GLuint num_samples,
+enum intel_msaa_layout msaa_layout)
 {
struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1);
int compress_byte = 0;
@@ -262,7 +262,7 @@ intel_miptree_create(struct intel_context *intel,
 tiling = I915_TILING_X;
}
 
-   mt = intel_miptree_create_internal(intel, target, format,
+   mt = intel_miptree_create_layout(intel, target, format,
  first_level, last_level, width0,
  height0, depth0,
  false, num_samples, msaa_layout);
@@ -305,7 +305,6 @@ intel_miptree_create(struct intel_context *intel,
return mt;
 }
 
-
 struct intel_mipmap_tree *
 intel_miptree_create_for_region(struct intel_context *intel,
GLenum target,
@@ -314,11 +313,11 @@ intel_miptree_create_for_region(struct intel_context 
*intel,
 {
struct intel_mipmap_tree *mt;
 
-   mt = intel_miptree_create_internal(intel, target, format,
- 0, 0,
- region->width, region->height, 1,
- true, 0 /* num_samples */,
-  INTEL_MSAA_LAYOUT_NONE);
+   mt = intel_miptree_create_layout(intel, target, format,
+0, 0,
+region->width, region->height, 1,
+true, 0 /* num_samples */,
+INTEL_MSAA_LAYOUT_NONE);
if (!mt)
   return mt;
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
index eb4ad7f..1b2270a 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
@@ -387,6 +387,19 @@ struct intel_mipmap_tree *intel_miptree_create(struct 
intel_context *intel,
enum intel_msaa_layout 
msaa_layout);
 
 struct intel_mipmap_tree *
+intel_miptree_create_layout(struct intel_context *intel,
+GLenum target,
+gl_format format,
+GLuint first_level,
+GLuint last_level,
+GLuint width0,
+GLuint height0,
+GLuint depth0,
+bool for_region,
+GLuint num_samples,
+enum intel_msaa_layout msaa_layout);
+
+struct intel_mipmap_tree *
 intel_miptree_create_for_region(struct intel_context *intel,
GLenum target,
gl_format format,
@@ -398,7 +411,6 @@ intel_miptree_create_for_dri2_buffer(struct intel_context 
*intel,
  gl_format format,
  uint32_t num_samples,
  struct intel_region *region);
-
 /**
  * Create a miptree appropriate as the storage for a non-texture renderbuffer.
  * The miptree has the following properties:
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.free

[Mesa-dev] [PATCH V6 2/8] intel: expose dimensions and offsets of a miptree level in DRIImage

2013-01-25 Thread Abdiel Janulgue
Signed-off-by: Abdiel Janulgue 
---
 src/mesa/drivers/dri/intel/intel_regions.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_regions.h 
b/src/mesa/drivers/dri/intel/intel_regions.h
index 8737a6d..1eef3b5 100644
--- a/src/mesa/drivers/dri/intel/intel_regions.h
+++ b/src/mesa/drivers/dri/intel/intel_regions.h
@@ -174,6 +174,12 @@ struct __DRIimageRec {
uint32_t offsets[3];
struct intel_image_format *planar_format;
 
+   /* particular miptree level */
+   GLuint width;
+   GLuint height;
+   GLuint tile_x;
+   GLuint tile_y;
+
void *data;
 };
 
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V6 1/8] dri2: Create image from texture

2013-01-25 Thread Abdiel Janulgue
Add create image from texture extension and bump version.

Signed-off-by: Abdiel Janulgue 
---
 include/GL/internal/dri_interface.h |   14 +-
 src/egl/drivers/dri2/egl_dri2.c |   85 +++
 2 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 568581d..63cb2d6 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -937,7 +937,7 @@ struct __DRIdri2ExtensionRec {
  * extensions.
  */
 #define __DRI_IMAGE "DRI_IMAGE"
-#define __DRI_IMAGE_VERSION 5
+#define __DRI_IMAGE_VERSION 6
 
 /**
  * These formats correspond to the similarly named MESA_FORMAT_*
@@ -1086,6 +1086,18 @@ struct __DRIimageExtensionRec {
 */
 __DRIimage *(*fromPlanar)(__DRIimage *image, int plane,
   void *loaderPrivate);
+
+/**
+ * Create image from texture.
+ *
+ * \since 6
+ */
+   __DRIimage *(*createImageFromTexture)(__DRIcontext *context,
+ int target,
+ unsigned texture,
+ int depth,
+ int level,
+ void *loaderPrivate);
 };
 
 
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 1f13d79..5d83573 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -490,6 +490,11 @@ dri2_setup_screen(_EGLDisplay *disp)
   disp->Extensions.MESA_drm_image = EGL_TRUE;
   disp->Extensions.KHR_image_base = EGL_TRUE;
   disp->Extensions.KHR_gl_renderbuffer_image = EGL_TRUE;
+  if (dri2_dpy->image->base.version >= 5 &&
+  dri2_dpy->image->createImageFromTexture) {
+ disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE;
+ disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE;
+  }
}
 }
 
@@ -1210,6 +1215,78 @@ dri2_create_image_wayland_wl_buffer(_EGLDisplay *disp, 
_EGLContext *ctx,
 }
 #endif
 
+static _EGLImage *
+dri2_create_image_khr_texture(_EGLDisplay *disp, _EGLContext *ctx,
+  EGLenum target,
+  EGLClientBuffer buffer,
+  const EGLint *attr_list)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
+   struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
+   struct dri2_egl_image *dri2_img;
+   GLuint texture = (GLuint) (uintptr_t) buffer;
+   _EGLImageAttribs attrs;
+   GLuint depth;
+   GLenum gl_target;
+
+   if (texture == 0) {
+  _eglError(EGL_BAD_PARAMETER, "dri2_create_image_khr");
+  return EGL_NO_IMAGE_KHR;
+   }
+
+   if (_eglParseImageAttribList(&attrs, disp, attr_list) != EGL_SUCCESS)
+  return EGL_NO_IMAGE_KHR;
+
+   switch (target) {
+   case EGL_GL_TEXTURE_2D_KHR:
+  depth = 0;
+  gl_target = GL_TEXTURE_2D;
+  break;
+   case EGL_GL_TEXTURE_3D_KHR:
+  depth = attrs.GLTextureZOffset;
+  gl_target = GL_TEXTURE_3D;
+  break;
+   case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_X_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Y_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Y_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Z_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z_KHR:
+  depth = target - EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X_KHR;
+  gl_target = GL_TEXTURE_CUBE_MAP;
+  break;
+   default:
+  _eglError(EGL_BAD_PARAMETER, "dri2_create_image_khr");
+  return EGL_NO_IMAGE_KHR;
+   }
+
+   dri2_img = malloc(sizeof *dri2_img);
+   if (!dri2_img) {
+  _eglError(EGL_BAD_ALLOC, "dri2_create_image_khr");
+  return EGL_NO_IMAGE_KHR;
+   }
+
+   if (!_eglInitImage(&dri2_img->base, disp)) {
+  _eglError(EGL_BAD_ALLOC, "dri2_create_image_khr");
+  free(dri2_img);
+  return EGL_NO_IMAGE_KHR;
+   }
+
+   dri2_img->dri_image =
+  dri2_dpy->image->createImageFromTexture(dri2_ctx->dri_context,
+  gl_target,
+  texture,
+  depth,
+  attrs.GLTextureLevel,
+  dri2_img);
+
+   if (!dri2_img->dri_image) {
+  free(dri2_img);
+  return EGL_NO_IMAGE_KHR;
+   }
+   return &dri2_img->base;
+}
+
 _EGLImage *
 dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
  _EGLContext *ctx, EGLenum target,
@@ -1218,6 +1295,14 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
(void) drv;
 
switch (target) {
+   case EGL_GL_TEXTURE_2D_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_X_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Y_KHR:
+   case EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Y_KHR:
+   case EGL_GL_TEXTURE_

[Mesa-dev] [PATCH V6 0/8] intel: add support for EGL_KHR_gl_image

2013-01-25 Thread Abdiel Janulgue
- Rename draw_x/y to tile_x/y in dri image struct. These are now used as 
  adjustment pixels from our stored aligned offset to the exported image
  instead of the entire x/y offset from the base address.
- Take into consideration the offset from our bo so that sub-image functions 
  resolves properly to the our original image. 
- Move mt->stencil_mt check out of misleading comment in 
  intel_setup_image_from_mipmap_tree

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/8] glsl: Add support for lowering 4x8 pack/unpack operations

2013-01-25 Thread Ian Romanick

On 01/24/2013 10:47 PM, Matt Turner wrote:

Lower them to arithmetic and bit manipulation expressions.
---
  src/glsl/ir_optimization.h  |6 +
  src/glsl/lower_packing_builtins.cpp |  279 +++
  2 files changed, 285 insertions(+), 0 deletions(-)

diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index ac90b87..8f33018 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -54,6 +54,12 @@ enum lower_packing_builtins_op {

 LOWER_PACK_HALF_2x16_TO_SPLIT= 0x0040,
 LOWER_UNPACK_HALF_2x16_TO_SPLIT  = 0x0080,
+
+   LOWER_PACK_SNORM_4x8 = 0x0100,
+   LOWER_UNPACK_SNORM_4x8   = 0x0200,
+
+   LOWER_PACK_UNORM_4x8 = 0x0400,
+   LOWER_UNPACK_UNORM_4x8   = 0x0800,
  };

  bool do_common_optimization(exec_list *ir, bool linked,
diff --git a/src/glsl/lower_packing_builtins.cpp 
b/src/glsl/lower_packing_builtins.cpp
index 49176cc..aa6765f 100644
--- a/src/glsl/lower_packing_builtins.cpp
+++ b/src/glsl/lower_packing_builtins.cpp
@@ -85,9 +85,15 @@ public:
case LOWER_PACK_SNORM_2x16:
   *rvalue = lower_pack_snorm_2x16(op0);
   break;
+  case LOWER_PACK_SNORM_4x8:
+ *rvalue = lower_pack_snorm_4x8(op0);
+ break;
case LOWER_PACK_UNORM_2x16:
   *rvalue = lower_pack_unorm_2x16(op0);
   break;
+  case LOWER_PACK_UNORM_4x8:
+ *rvalue = lower_pack_unorm_4x8(op0);
+ break;
case LOWER_PACK_HALF_2x16:
   *rvalue = lower_pack_half_2x16(op0);
   break;
@@ -97,9 +103,15 @@ public:
case LOWER_UNPACK_SNORM_2x16:
   *rvalue = lower_unpack_snorm_2x16(op0);
   break;
+  case LOWER_UNPACK_SNORM_4x8:
+ *rvalue = lower_unpack_snorm_4x8(op0);
+ break;
case LOWER_UNPACK_UNORM_2x16:
   *rvalue = lower_unpack_unorm_2x16(op0);
   break;
+  case LOWER_UNPACK_UNORM_4x8:
+ *rvalue = lower_unpack_unorm_4x8(op0);
+ break;
case LOWER_UNPACK_HALF_2x16:
   *rvalue = lower_unpack_half_2x16(op0);
   break;
@@ -137,18 +149,30 @@ private:
case ir_unop_pack_snorm_2x16:
   result = op_mask & LOWER_PACK_SNORM_2x16;
   break;
+  case ir_unop_pack_snorm_4x8:
+ result = op_mask & LOWER_PACK_SNORM_4x8;
+ break;
case ir_unop_pack_unorm_2x16:
   result = op_mask & LOWER_PACK_UNORM_2x16;
   break;
+  case ir_unop_pack_unorm_4x8:
+ result = op_mask & LOWER_PACK_UNORM_4x8;
+ break;
case ir_unop_pack_half_2x16:
   result = op_mask & (LOWER_PACK_HALF_2x16 | 
LOWER_PACK_HALF_2x16_TO_SPLIT);
   break;
case ir_unop_unpack_snorm_2x16:
   result = op_mask & LOWER_UNPACK_SNORM_2x16;
   break;
+  case ir_unop_unpack_snorm_4x8:
+ result = op_mask & LOWER_UNPACK_SNORM_4x8;
+ break;
case ir_unop_unpack_unorm_2x16:
   result = op_mask & LOWER_UNPACK_UNORM_2x16;
   break;
+  case ir_unop_unpack_unorm_4x8:
+ result = op_mask & LOWER_UNPACK_UNORM_4x8;
+ break;
case ir_unop_unpack_half_2x16:
   result = op_mask & (LOWER_UNPACK_HALF_2x16 | 
LOWER_UNPACK_HALF_2x16_TO_SPLIT);
   break;
@@ -214,6 +238,30 @@ private:
 }

 /**
+* \brief Pack four uint8's into a single uint32.
+*
+* Interpret the given uvec4 as a uint32 quad. Pack the quad into a uint32
+* where the least significant bits specify the first element of the quad.
+* Return the uint32.
+*/
+   ir_rvalue*
+   pack_uvec4_to_uint(ir_rvalue *uvec4_rval)
+   {
+  assert(uvec4_rval->type == glsl_type::uvec4_type);
+
+  /* uvec4 u = UVEC4_RVAL; */
+  ir_variable *u = factory.make_temp(glsl_type::uvec4_type,
+  "tmp_pack_uvec4_to_uint");
+  factory.emit(assign(u, uvec4_rval));
+
+  /* return ((u.w 0xff) << 24) | ((u.z & 0xff) << 16) | ((u.y & 0xff) << 8) | 
(u.x & 0xff); */
+  return bit_or(bit_or(lshift(bit_and(swizzle_w(u), constant(0xffu)), 
constant(24u)),
+   lshift(bit_and(swizzle_z(u), constant(0xffu)), 
constant(16u))),
+bit_or(lshift(bit_and(swizzle_y(u), constant(0xffu)), 
constant(8u)),
+   bit_and(swizzle_x(u), constant(0xffu;
+   }
+
+   /**
  * \brief Unpack a uint32 into two uint16's.
  *
  * Interpret the given uint32 as a uint16 pair where the uint32's least
@@ -244,6 +292,44 @@ private:
 }

 /**
+* \brief Unpack a uint32 into four uint8's.
+*
+* Interpret the given uint32 as a uint8 quad where the uint32's least
+* significant bits specify the quad's first element. Return the uint8
+* quad as a uvec4.
+*/
+   ir_rvalue*
+   unpack_uint_to_uvec4(ir_rvalue *uint_rval)
+   {
+  assert(uint_rval->type == glsl_type::uint_

Re: [Mesa-dev] [PATCH 0/8] ARB_shading_language_packing

2013-01-25 Thread Ian Romanick

On 01/24/2013 10:44 PM, Matt Turner wrote:

Following this email are eight patches that add the 4x8 pack/unpack
operations that are the difference between what GLSL ES 3.0 and
ARB_shading_language_packing require.

They require Chad's gles3-glsl-packing series and are available at
http://cgit.freedesktop.org/~mattst88/mesa/log/?h=ARB_shading_language_packing

I've also added testing support on top of Chad's piglit patch. The
{vs,fs}-unpackUnorm4x8 tests currently fail, and I've been unable to
spot why.


Do they pass of you modify the tests to use my_unpackUnorm4x8 and 
hand-code a my_unpackUnorm4x8 that does what your lowering pass 
generates?  In other words, is it possible this is exposing an existing bug?



Please give it a look. I'd be nice to get this into 9.1.

Thanks,
Matt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

Johannes Obermayr  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |tstel...@gmail.com
   |org |
 QA Contact||mesa-dev@lists.freedesktop.
   ||org

--- Comment #5 from Johannes Obermayr  ---
It was false to remove libr600_la_LDFLAGS in this patch:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=69d639ba8b3cfd95cfbb12b861dbe2eda53f2e25

And please change all Makefile.am to generate LLVM related LIBADDs this way to
avoid stupid dependencies if LLVM was compiled with the better cmake build
system which creates shared instead of static libs / one big shared lib and can
save memory this way.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59851] New: AC_ARG_WITH misusage leading to mesa configure failure

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59851

  Priority: medium
Bug ID: 59851
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: AC_ARG_WITH misusage leading to mesa configure failure
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: sardemff7+freedesk...@sardemff7.net
  Hardware: All
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

Created attachment 73648
  --> https://bugs.freedesktop.org/attachment.cgi?id=73648&action=edit
Patch to fix mesa configure

Copy of the commit message:

The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] glsl: Add infrastructure for ARB_shading_language_packing

2013-01-25 Thread Ian Romanick

On 01/24/2013 10:47 PM, Matt Turner wrote:

---
  src/glsl/builtins/tools/generate_builtins.py |1 +
  src/glsl/glcpp/glcpp-parse.y |3 +++
  src/glsl/glsl_parser_extras.cpp  |1 +
  src/glsl/glsl_parser_extras.h|2 ++
  src/glsl/standalone_scaffolding.cpp  |1 +
  src/mesa/main/extensions.c   |1 +
  src/mesa/main/mtypes.h   |1 +
  7 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/src/glsl/builtins/tools/generate_builtins.py 
b/src/glsl/builtins/tools/generate_builtins.py
index 2cfb1a3..3db862e 100755
--- a/src/glsl/builtins/tools/generate_builtins.py
+++ b/src/glsl/builtins/tools/generate_builtins.py
@@ -189,6 +189,7 @@ read_builtins(GLenum target, const char *protos, const char 
**functions, unsigne
 st->OES_EGL_image_external_enable = true;
 st->ARB_shader_bit_encoding_enable = true;
 st->ARB_texture_cube_map_array_enable = true;
+   st->ARB_shading_language_packing_enable = true;
 _mesa_glsl_initialize_types(st);

 sh->ir = new(sh) exec_list;
diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 8fba923..e927c7c 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1227,6 +1227,9 @@ glcpp_parser_create (const struct gl_extensions 
*extensions, int api)

  if (extensions->ARB_texture_cube_map_array)
 add_builtin_define(parser, "GL_ARB_texture_cube_map_array", 1);
+
+ if (extensions->ARB_shading_language_packing)
+add_builtin_define(parser, "GL_ARB_shading_language_packing", 
1);
   }
}

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index b460c86..c8dbc89 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -462,6 +462,7 @@ static const _mesa_glsl_extension 
_mesa_glsl_supported_extensions[] = {
 EXT(ARB_uniform_buffer_object,  true,  false, true,  true,  false, 
ARB_uniform_buffer_object),
 EXT(OES_standard_derivatives,   false, false, true,  false,  true, 
OES_standard_derivatives),
 EXT(ARB_texture_cube_map_array, true,  false, true,  true,  false, 
ARB_texture_cube_map_array),
+   EXT(ARB_shading_language_packing,   true,  false, true,  true,  false, 
ARB_shading_language_packing),


This array should be sorted...


  };

  #undef EXT
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 2e6bb0b..53df149 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -272,6 +272,8 @@ struct _mesa_glsl_parse_state {
 bool OES_standard_derivatives_warn;
 bool ARB_texture_cube_map_array_enable;
 bool ARB_texture_cube_map_array_warn;
+   bool ARB_shading_language_packing_enable;
+   bool ARB_shading_language_packing_warn;
 /*@}*/

 /** Extensions supported by the OpenGL implementation. */
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index ccf5b4f..8b12f81 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -101,6 +101,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, 
gl_api api)
 ctx->Extensions.ARB_shader_bit_encoding = true;
 ctx->Extensions.OES_standard_derivatives = true;
 ctx->Extensions.ARB_texture_cube_map_array = true;
+   ctx->Extensions.ARB_shading_language_packing = true;

 ctx->Const.GLSLVersion = 120;

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index fd25d31..fb41760 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -125,6 +125,7 @@ static const struct extension extension_table[] = {
 { "GL_ARB_shader_stencil_export",   
o(ARB_shader_stencil_export),   GL, 2009 },
 { "GL_ARB_shader_texture_lod",  o(ARB_shader_texture_lod), 
 GL, 2009 },
 { "GL_ARB_shading_language_100",
o(ARB_shading_language_100),GLL,2003 },
+   { "GL_ARB_shading_language_packing",
o(ARB_shading_language_packing),GL, 2011 },
 { "GL_ARB_shadow",  o(ARB_shadow), 
 GLL,2001 },
 { "GL_ARB_sync",o(ARB_sync),   
 GL, 2003 },
 { "GL_ARB_texture_border_clamp",
o(ARB_texture_border_clamp),GLL,2000 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cba1e16..254679f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3042,6 +3042,7 @@ struct gl_extensions
 GLboolean ARB_shader_stencil_export;
 GLboolean ARB_shader_texture_lod;
 GLboolean ARB_shading_language_100;
+   GLboolean ARB_shading_language_packing;
 GLboolean ARB_sha

[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #4 from Tom Stellard  ---
The problem is that llvm_wrapper.cpp is being built without --enable-opencl or
--enable-r600-llvm-compiler, so the necessary libraries haven't been added to
LLVM_LIBS.  The fix is to disable building of llvm_wrapper.cpp in this case.  I
will write a patch.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] glsl: Generate an interface type for uniform blocks

2013-01-25 Thread Ian Romanick

On 01/23/2013 09:49 PM, Paul Berry wrote:

On 22 January 2013 00:52, Ian Romanick mailto:i...@freedesktop.org>> wrote:

From: Ian Romanick mailto:ian.d.roman...@intel.com>>

If the block has an instance name, add the instance name to the symbol
table instead of the individual fields.

Fixes the piglit test interface-name-access-without-interface-name.vert
for real.

Signed-off-by: Ian Romanick mailto:ian.d.roman...@intel.com>>
---
  src/glsl/ast_to_hir.cpp | 167
++--
  1 file changed, 118 insertions(+), 49 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 575dd84..a740a3c 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4020,7 +4020,9 @@
ast_process_structure_or_interface_block(exec_list *instructions,
  struct
_mesa_glsl_parse_state *state,
  exec_list *declarations,
  YYLTYPE &loc,
-glsl_struct_field **fields_ret)
+glsl_struct_field **fields_ret,
+ bool is_interface,
+ bool block_row_major)
  {
 unsigned decl_count = 0;

@@ -4062,7 +4064,32 @@
ast_process_structure_or_interface_block(exec_list *instructions,

foreach_list_typed (ast_declaration, decl, link,
   &decl_list->declarations) {
-const struct glsl_type *field_type = decl_type;
+ /* From the GL_ARB_uniform_buffer_object spec:
+  *
+  * "Sampler types are not allowed inside of uniform
+  *  blocks. All other types, arrays, and structures
+  *  allowed for uniforms are allowed within a uniform
+  *  block."
+  */
+ const struct glsl_type *field_type = decl_type;
+
+ if (is_interface && field_type->contains_sampler()) {
+YYLTYPE loc = decl_list->get_location();
+_mesa_glsl_error(&loc, state,
+ "Uniform in non-default uniform block
contains sampler\n");
+ }
+
+ const struct ast_type_qualifier *const qual =
+& decl_list->type->qualifier;
+ if (qual->flags.q.std140 ||
+ qual->flags.q.packed ||
+ qual->flags.q.shared) {
+_mesa_glsl_error(&loc, state,
+ "uniform block layout qualifiers
std140, packed, and "
+ "shared can only be applied to uniform
blocks, not "
+ "members");
+ }
+
  if (decl->is_array) {
 field_type = process_array_type(&loc, decl_type,
decl->array_size,
 state);
@@ -4070,6 +4097,26 @@
ast_process_structure_or_interface_block(exec_list *instructions,
  fields[i].type = (field_type != NULL)
 ? field_type : glsl_type::error_type;
  fields[i].name = decl->identifier;
+
+ if (qual->flags.q.row_major || qual->flags.q.column_major) {
+if (!field_type->is_matrix() && !field_type->is_record()) {
+   _mesa_glsl_error(&loc, state,
+"uniform block layout qualifiers
row_major and "
+"column_major can only be applied
to matrix and "
+"structure types");
+} else
+   validate_matrix_layout_for_type(state, &loc,
field_type);
+ }
+
+ if (field_type->is_matrix() ||
+ (field_type->is_array() &&
field_type->fields.array->is_matrix())) {
+fields[i].row_major = block_row_major;
+if (qual->flags.q.row_major)
+   fields[i].row_major = true;
+else if (qual->flags.q.column_major)
+   fields[i].row_major = false;
+ }
+
  i++;
}
 }
@@ -4092,7 +4139,9 @@ ast_struct_specifier::hir(exec_list *instructions,
state,
&this->declarations,
loc,
-  &fields);
+  &fields,
+   false,
+   false);

 const glsl_type *t =
glsl_type::get_record_instance(fields, decl_count, t

[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #3 from Michel Dänzer  ---
(In reply to comment #2)
> without.  Is that required now?

No, but I do wonder if we shouldn't drop support for linking LLVM statically.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: implement GL_ARB_texture_buffer_range v4

2013-01-25 Thread Christoph Bumiller
v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead
of the buffer's current size so we know we always have to use the
full size of the buffer object (i.e. even if it changes without the
user calling TexBuffer again) for the texture.

Clarify invalid offset alignment error message.

v3: Use extra GL_CORE-only section in get_hash_params.py for
TEXTURE_BUFFER_OFFSET_ALIGNMENT.

v4: Remove unnecessary check for profile in _mesa_TexBufferRange.
Add check for extension enable in get_tex_level_parameter_buffer.
---
 src/mapi/glapi/gen/ARB_texture_buffer_range.xml |   22 ++
 src/mapi/glapi/gen/Makefile.am  |1 +
 src/mapi/glapi/gen/gl_API.xml   |2 +
 src/mesa/main/context.c |1 +
 src/mesa/main/extensions.c  |1 +
 src/mesa/main/get.c |1 +
 src/mesa/main/get_hash_params.py|6 ++
 src/mesa/main/mtypes.h  |6 ++
 src/mesa/main/teximage.c|   84 ++-
 src/mesa/main/teximage.h|4 +
 src/mesa/main/texparam.c|   12 +++
 11 files changed, 123 insertions(+), 17 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_texture_buffer_range.xml

diff --git a/src/mapi/glapi/gen/ARB_texture_buffer_range.xml 
b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml
new file mode 100644
index 000..2176c08
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_texture_buffer_range.xml
@@ -0,0 +1,22 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index f869d28..4d51bbc 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -108,6 +108,7 @@ API_XML = \
ARB_seamless_cube_map.xml \
ARB_sync.xml \
ARB_texture_buffer_object.xml \
+   ARB_texture_buffer_range.xml \
ARB_texture_compression_rgtc.xml \
ARB_texture_float.xml \
ARB_texture_rg.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 404ccea..8d700a1 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8151,6 +8151,8 @@
 
 http://www.w3.org/2001/XInclude"/>
 
+http://www.w3.org/2001/XInclude"/>
+
 http://www.w3.org/2001/XInclude"/>
 
 http://www.w3.org/2001/XInclude"/>
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 5e9e539..5058c07 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -564,6 +564,7 @@ _mesa_init_constants(struct gl_context *ctx)
ctx->Const.MaxTextureMaxAnisotropy = MAX_TEXTURE_MAX_ANISOTROPY;
ctx->Const.MaxTextureLodBias = MAX_TEXTURE_LOD_BIAS;
ctx->Const.MaxTextureBufferSize = 65536;
+   ctx->Const.TextureBufferOffsetAlignment = 1;
ctx->Const.MaxArrayLockSize = MAX_ARRAY_LOCK_SIZE;
ctx->Const.SubPixelBits = SUB_PIXEL_BITS;
ctx->Const.MinPointSize = MIN_POINT_SIZE;
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 5d01ac8..207572f 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -130,6 +130,7 @@ static const struct extension extension_table[] = {
{ "GL_ARB_texture_border_clamp",
o(ARB_texture_border_clamp),GLL,2000 },
{ "GL_ARB_texture_buffer_object",   
o(ARB_texture_buffer_object),   GLC,2008 },
{ "GL_ARB_texture_buffer_object_rgb32", 
o(ARB_texture_buffer_object_rgb32), GLC,2009 },
+   { "GL_ARB_texture_buffer_range",
o(ARB_texture_buffer_range),GLC,2012 },
{ "GL_ARB_texture_compression", o(dummy_true),  
GLL,2000 },
{ "GL_ARB_texture_compression_rgtc",
o(ARB_texture_compression_rgtc),GL, 2004 },
{ "GL_ARB_texture_cube_map",o(ARB_texture_cube_map),
GLL,1999 },
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 5f4e2fa..da1e01c 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -353,6 +353,7 @@ EXTRA_EXT(ARB_uniform_buffer_object);
 EXTRA_EXT(ARB_timer_query);
 EXTRA_EXT(ARB_map_buffer_alignment);
 EXTRA_EXT(ARB_texture_cube_map_array);
+EXTRA_EXT(ARB_texture_buffer_range);
 
 static const int
 extra_NV_primitive_restart[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 26a722a..b6bed80 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -701,6 +701,12 @@ descriptor=[
 
 # GL_ARB_texture_cube_map_array
   [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
+]},
+
+# Enums restricted to OpenGL Core profile
+{ "apis": ["GL_CORE"], "p

Re: [Mesa-dev] [PATCH 24/32] glsl: Make the align function available elsewhere in the linker

2013-01-25 Thread Ian Romanick

On 01/24/2013 08:40 PM, Kenneth Graunke wrote:

On 01/22/2013 12:52 AM, Ian Romanick wrote:

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
  src/glsl/glsl_types.cpp  | 12 +++-
  src/glsl/glsl_types.h|  6 ++
  src/glsl/link_uniforms.cpp   | 14 --
  src/glsl/lower_ubo_reference.cpp | 19 +++
  4 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
index 0075550..ddd0148 100644
--- a/src/glsl/glsl_types.cpp
+++ b/src/glsl/glsl_types.cpp
@@ -863,12 +863,6 @@ glsl_type::std140_base_alignment(bool row_major)
const
 return -1;
  }

-static unsigned
-align(unsigned val, unsigned align)
-{
-   return (val + align - 1) / align * align;
-}
-


Why not just eliminate this function altogether and use ALIGN() from
macros.h?  (The implementation is slightly different, but I think it
should work.)


I thought about that.  The ALIGN macro only works when align is a power 
of two, and it wasn't obvious to me that all the uses of this function 
met that requirement.  I did this refactor right before sending this 
series out, and it felt a little like the 11th hour to do something that 
could have a functional change.


I'd prefer to revisit this after the release.


  unsigned
  glsl_type::std140_size(bool row_major) const
  {
@@ -970,11 +964,11 @@ glsl_type::std140_size(bool row_major) const
for (unsigned i = 0; i < this->length; i++) {
   const struct glsl_type *field_type =
this->fields.structure[i].type;
   unsigned align = field_type->std140_base_alignment(row_major);
- size = (size + align - 1) / align * align;
+ size = glsl_align(size, align);
   size += field_type->std140_size(row_major);
}
-  size = align(size,
-
this->fields.structure[0].type->std140_base_alignment(row_major));
+  size = glsl_align(size,
+
this->fields.structure[0].type->std140_base_alignment(row_major));
return size;
 }

diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
index 8588685..b0db2bf 100644
--- a/src/glsl/glsl_types.h
+++ b/src/glsl/glsl_types.h
@@ -601,6 +601,12 @@ struct glsl_struct_field {
 bool row_major;
  };

+static inline unsigned int
+glsl_align(unsigned int a, unsigned int align)
+{
+   return (a + align - 1) / align * align;
+}
+
  #endif /* __cplusplus */

  #endif /* GLSL_TYPES_H */
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 2a1af6b..439b711 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -29,12 +29,6 @@
  #include "program/hash_table.h"
  #include "program.h"

-static inline unsigned int
-align(unsigned int a, unsigned int align)
-{
-   return (a + align - 1) / align * align;
-}
-
  /**
   * \file link_uniforms.cpp
   * Assign locations for GLSL uniforms.
@@ -421,13 +415,13 @@ private:
   this->uniforms[id].block_index = this->ubo_block_index;

   unsigned alignment = type->std140_base_alignment(ubo_row_major);
- this->ubo_byte_offset = align(this->ubo_byte_offset, alignment);
+ this->ubo_byte_offset = glsl_align(this->ubo_byte_offset,
alignment);
   this->uniforms[id].offset = this->ubo_byte_offset;
   this->ubo_byte_offset += type->std140_size(ubo_row_major);

   if (type->is_array()) {
  this->uniforms[id].array_stride =
-   align(type->fields.array->std140_size(ubo_row_major), 16);
+   glsl_align(type->fields.array->std140_size(ubo_row_major),
16);
   } else {
  this->uniforms[id].array_stride = 0;
   }
@@ -564,7 +558,7 @@ link_assign_uniform_block_offsets(struct gl_shader
*shader)
   unsigned alignment =
type->std140_base_alignment(ubo_var->RowMajor);
   unsigned size = type->std140_size(ubo_var->RowMajor);

- offset = align(offset, alignment);
+ offset = glsl_align(offset, alignment);
   ubo_var->Offset = offset;
   offset += size;
}
@@ -580,7 +574,7 @@ link_assign_uniform_block_offsets(struct gl_shader
*shader)
 *  and rounding up to the next multiple of the base
 *  alignment required for a vec4."
 */
-  block->UniformBufferSize = align(offset, 16);
+  block->UniformBufferSize = glsl_align(offset, 16);
 }
  }

diff --git a/src/glsl/lower_ubo_reference.cpp
b/src/glsl/lower_ubo_reference.cpp
index 1d08009..8d13ec1 100644
--- a/src/glsl/lower_ubo_reference.cpp
+++ b/src/glsl/lower_ubo_reference.cpp
@@ -61,12 +61,6 @@ public:
 bool progress;
  };

-static inline unsigned int
-align(unsigned int a, unsigned int align)
-{
-   return (a + align - 1) / align * align;
-}
-
  void
  lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
  {
@@ -113,7 +107,7 @@
lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
  array_stride = 4;
   } else {
  array_stride = deref_array->type->std140_size(row_major);
-array_stride = align(array_stride, 16);
+array_stride = glsl_align

[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #2 from Alex Deucher  ---
without.  Is that required now?

My configure options are:
./autogen.sh --prefix=/usr --libdir=/usr/lib64 --with-dri-drivers=radeon,r200
--with-gallium-drivers=r300,r600,radeonsi,swrast --enable-gles1 --enable-gles2
--enable-xorg --enable-vdpau --enable-shared-glapi --enable-gbm
--enable-gallium-llvm --with-egl-platforms=drm --enable-glx-tls --enable-debug

Also, I'm using llvm c5c65f9ad0e1e897f6d828248bdf25a6714cdd09 from Tom's tree.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 59831] undefined symbol _ZN4llvm19createGlobalDCEPassEv in r600g

2013-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=59831

--- Comment #1 from Michel Dänzer  ---
Is that with or without --with-llvm-shared-libs for the Mesa build?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/32] UBOs for OpenGL ES 3.0

2013-01-25 Thread Jordan Justen
23 & 26-31
Reviewed-by: Jordan Justen 

On Tue, Jan 22, 2013 at 12:51 AM, Ian Romanick  wrote:
> So here it is.
>
> This is the last of the UBO instance and array instance rework for the
> linker.  It's a giant pile of patches, so let me explain what's going
> on.
>
> Previous to this patch series, information about the layout of a UBO was
> created at compile-time during ast-to-ir translation.  This made it
> somewhere between difficult and impossible to implement several require
> features for OpenGL ES 3.0 conformance.
>
> 1. Uniform blocks with an instance name.  These blocks have different
> scoping rules, and the fields are exposed to applications differently
> through the GL API.  In the shader, these are accessed like structures.
>
> 2. Arrays of uniform blocks.  These basically compound the issues of
> instance names.  For example, to query the layout of an instance array
> block, you do *not* use the array index.
>
> 3. Marking unused block members and unused blocks as not active.  This
> was actually way more annoying to deal than I had expected.  Even with
> the std140 layout, if a block member is never used in a shader, it
> should not show up in the active list.
>
> All of these issues led me to a design that does all of the layout
> during linking.  This allows our usual dead variable elimination and a
> bunch of other nice things.
>
> To do this, I added a new type called GLSL_TYPE_INTERFACE.  Interfaces
> work mostly like structures, but they have additional semantic
> limitations (imposed by the language).  Once that was in place in the
> compiler front-end, the linker just needed to detect unused blocks and
> block members, cross-validate the blocks, and assign the offsets.
>
> The bulk of the added code is in link_uniform_blocks.  This is the real
> work-horse of the whole deal.  The functions that do all the
> intra-shader layouts and name assignments for the blocks live here.
>
> Other than the few cases mentioned in individual commit messages, there
> are no commit-to-commit piglit or gles3conform regressions.  I don't
> believe there are any commit-to-commit build failures, but I'll double
> check that before I push.
>
> With this series, i965 passes all of the gles3conform UBO tests on IVB.
> I believe there is still one issue on SNB, but I haven't tested it.
>
>  src/glsl/Makefile.sources  |   2 +
>  src/glsl/ast.h |  12 ++-
>  src/glsl/ast_to_hir.cpp| 248 
> +++---
>  src/glsl/builtin_types.h   |  74 +++
>  src/glsl/glsl_parser.yy|  82 -
>  src/glsl/glsl_symbol_table.cpp |  14 +--
>  src/glsl/glsl_symbol_table.h   |   1 -
>  src/glsl/glsl_types.cpp|  94 +++
>  src/glsl/glsl_types.h  |  43 -
>  src/glsl/hir_field_selection.cpp   |   3 +-
>  src/glsl/ir.cpp|   1 -
>  src/glsl/ir.h  |  33 ---
>  src/glsl/ir_clone.cpp  |  12 ++-
>  src/glsl/link_uniform_block_active_visitor.cpp | 162 
> +
>  src/glsl/link_uniform_block_active_visitor.h   |  62 +
>  src/glsl/link_uniform_blocks.cpp   | 313 
> 
>  src/glsl/link_uniform_initializers.cpp |   6 +-
>  src/glsl/link_uniforms.cpp | 250 
> +++
>  src/glsl/linker.cpp|  25 ++
>  src/glsl/linker.h  |  45 +-
>  src/glsl/lower_ubo_reference.cpp   | 104 ++---
>  src/glsl/opt_dead_code.cpp |   7 +-
>  src/glsl/tests/uniform_initializer_utils.cpp   |   3 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |   8 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |   6 +-
>  src/mesa/drivers/dri/i965/brw_shader.cpp   |   8 +-
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  10 ++-
>  src/mesa/main/mtypes.h |  27 ++
>  src/mesa/main/uniforms.c   |   2 +-
>  src/mesa/program/ir_to_mesa.cpp|  26 --
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   8 +-
>  31 files changed, 1355 insertions(+), 336 deletions(-)
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 28/32] glsl: Add link_uniform_blocks to calculate all UBO data at link-time

2013-01-25 Thread Jordan Justen
On Tue, Jan 22, 2013 at 12:52 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Calculate all of the block member offsets, the IndexNames, and
> everything else to do with every UBO.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/glsl/link_uniform_blocks.cpp | 248 
> +++
>  src/glsl/linker.h|   7 ++
>  2 files changed, 255 insertions(+)
>
> diff --git a/src/glsl/link_uniform_blocks.cpp 
> b/src/glsl/link_uniform_blocks.cpp
> index c9cbde9..74fe1e2 100644
> --- a/src/glsl/link_uniform_blocks.cpp
> +++ b/src/glsl/link_uniform_blocks.cpp
> @@ -25,8 +25,256 @@
>  #include "ir.h"
>  #include "linker.h"
>  #include "ir_uniform.h"
> +#include "link_uniform_block_active_visitor.h"
> +#include "main/hash_table.h"
>  #include "program.h"
>
> +class ubo_visitor : public uniform_field_visitor {
> +public:
> +   ubo_visitor(void *mem_ctx, gl_uniform_buffer_variable *variables,
> +   unsigned num_variables)
> +  : index(0), offset(0), buffer_size(0), variables(variables),
> +num_variables(num_variables), mem_ctx(mem_ctx), 
> is_array_instance(false)
> +   {
> +  /* empty */
> +   }
> +
> +   void process(const glsl_type *type, const char *name)
> +   {
> +  this->offset = 0;
> +  this->buffer_size = 0;
> +  this->is_array_instance = strchr(name, ']') != NULL;
> +  this->uniform_field_visitor::process(type, name);
> +   }
> +
> +   unsigned index;
> +   unsigned offset;
> +   unsigned buffer_size;
> +   gl_uniform_buffer_variable *variables;
> +   unsigned num_variables;
> +   void *mem_ctx;
> +   bool is_array_instance;
> +
> +private:
> +   virtual void visit_field(const glsl_type *type, const char *name,
> +bool row_major)
> +   {
> +  assert(this->index < this->num_variables);
> +
> +  gl_uniform_buffer_variable *v = &this->variables[this->index++];
> +
> +  v->Name = ralloc_strdup(mem_ctx, name);
> +  v->Type = type;
> +  v->RowMajor = row_major;
> +
> +  if (this->is_array_instance) {
> + v->IndexName = ralloc_strdup(mem_ctx, name);
> +
> + char *open_bracket = strchr(v->IndexName, '[');
> + assert(open_bracket != NULL);
> +
> + char *close_bracket = strchr(open_bracket, ']');
> + assert(close_bracket != NULL);
> +
> + /* Length of the tail without the ']' but with the NUL.
> +  */
> + unsigned len = strlen(close_bracket + 1) + 1;
> +
> + memmove(open_bracket, close_bracket + 1, len);
> + } else {

Missing a space of indentation.

-Jordan

> + v->IndexName = v->Name;
> +  }
> +
> +  unsigned alignment = type->std140_base_alignment(v->RowMajor);
> +  unsigned size = type->std140_size(v->RowMajor);
> +
> +  this->offset = glsl_align(this->offset, alignment);
> +  v->Offset = this->offset;
> +  this->offset += size;
> +
> +  /* From the GL_ARB_uniform_buffer_object spec:
> +   *
> +   * "For uniform blocks laid out according to [std140] rules, the
> +   *  minimum buffer object size returned by the
> +   *  UNIFORM_BLOCK_DATA_SIZE query is derived by taking the offset 
> of
> +   *  the last basic machine unit consumed by the last uniform of the
> +   *  uniform block (including any end-of-array or end-of-structure
> +   *  padding), adding one, and rounding up to the next multiple of
> +   *  the base alignment required for a vec4."
> +   */
> +  this->buffer_size = glsl_align(this->offset, 16);
> +   }
> +
> +   virtual void visit_field(const glsl_struct_field *field)
> +   {
> +  this->offset = glsl_align(this->offset,
> +field->type->std140_base_alignment(false));
> +   }
> +};
> +
> +class count_block_size : public uniform_field_visitor {
> +public:
> +   count_block_size() : num_active_uniforms(0)
> +   {
> +  /* empty */
> +   }
> +
> +   unsigned num_active_uniforms;
> +
> +private:
> +   virtual void visit_field(const glsl_type *type, const char *name,
> +bool row_major)
> +   {
> +  (void) type;
> +  (void) name;
> +  (void) row_major;
> +  this->num_active_uniforms++;
> +   }
> +};
> +
> +struct block {
> +   const glsl_type *type;
> +   bool has_instance_name;
> +};
> +
> +int
> +link_uniform_blocks(void *mem_ctx,
> +struct gl_shader_program *prog,
> +struct gl_shader **shader_list,
> +unsigned num_shaders,
> +struct gl_uniform_block **blocks_ret)
> +{
> +   /* This hash table will track all of the uniform blocks that have been
> +* encountered.  Since blocks with the same block-name must be the same,
> +* the hash is organized by block-name.
> +*/
> +   struct hash_table *block_hash =
> +  _mesa_hash_table_create(mem_ctx, _mesa_key_string_equal);
> +
> +   /* Determine which uniform blocks are active.
> +  

Re: [Mesa-dev] [PATCH 2/2] st/mesa: do proper error checking for u_upload_alloc() calls

2013-01-25 Thread Jose Fonseca
Series is 
Reviewed-by: Jose Fonseca 

- Original Message -
> We weren't properly checking the return value of these calls (and
> calls to u_upload_data()) to detect OOM errors.
> ---
>  src/mesa/state_tracker/st_cb_bitmap.c |5 ++---
>  src/mesa/state_tracker/st_cb_clear.c  |5 ++---
>  src/mesa/state_tracker/st_cb_drawpixels.c |5 ++---
>  src/mesa/state_tracker/st_cb_drawtex.c|7 +++
>  src/mesa/state_tracker/st_draw.c  |   21
>  +
>  5 files changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/src/mesa/state_tracker/st_cb_bitmap.c
> b/src/mesa/state_tracker/st_cb_bitmap.c
> index 843dc5b..63dbdb2 100644
> --- a/src/mesa/state_tracker/st_cb_bitmap.c
> +++ b/src/mesa/state_tracker/st_cb_bitmap.c
> @@ -350,9 +350,8 @@ setup_bitmap_vertex_data(struct st_context *st,
> bool normalized,
>tBot = (GLfloat) height;
> }
>  
> -   u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]),
> vbuf_offset, vbuf,
> -   (void**)&vertices);
> -   if (!vbuf) {
> +   if (u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]),
> +  vbuf_offset, vbuf, (void **) &vertices) !=
> PIPE_OK) {
>return;
> }
>  
> diff --git a/src/mesa/state_tracker/st_cb_clear.c
> b/src/mesa/state_tracker/st_cb_clear.c
> index d01236e..a5aa8f4 100644
> --- a/src/mesa/state_tracker/st_cb_clear.c
> +++ b/src/mesa/state_tracker/st_cb_clear.c
> @@ -141,9 +141,8 @@ draw_quad(struct st_context *st,
> GLuint i, offset;
> float (*vertices)[2][4];  /**< vertex pos + color */
>  
> -   u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]), &offset,
> &vbuf,
> -   (void**)&vertices);
> -   if (!vbuf) {
> +   if (u_upload_alloc(st->uploader, 0, 4 * sizeof(vertices[0]),
> +  &offset, &vbuf, (void **) &vertices) !=
> PIPE_OK) {
>return;
> }
>  
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c
> b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 65f1160..c944b81 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -568,9 +568,8 @@ draw_quad(struct gl_context *ctx, GLfloat x0,
> GLfloat y0, GLfloat z,
> struct pipe_resource *buf = NULL;
> unsigned offset;
>  
> -   u_upload_alloc(st->uploader, 0, 4 * sizeof(verts[0]), &offset,
> &buf,
> -   (void**)&verts);
> -   if (!buf) {
> +   if (u_upload_alloc(st->uploader, 0, 4 * sizeof(verts[0]),
> &offset,
> +  &buf, (void **) &verts) != PIPE_OK) {
>return;
> }
>  
> diff --git a/src/mesa/state_tracker/st_cb_drawtex.c
> b/src/mesa/state_tracker/st_cb_drawtex.c
> index 269068d..5ca0970 100644
> --- a/src/mesa/state_tracker/st_cb_drawtex.c
> +++ b/src/mesa/state_tracker/st_cb_drawtex.c
> @@ -148,10 +148,9 @@ st_DrawTex(struct gl_context *ctx, GLfloat x,
> GLfloat y, GLfloat z,
>GLfloat *vbuf = NULL;
>GLuint attr;
>  
> -  u_upload_alloc(st->uploader, 0,
> -  numAttribs * 4 * 4 * sizeof(GLfloat),
> -  &offset, &vbuffer, (void**)&vbuf);
> -  if (!vbuffer) {
> +  if (u_upload_alloc(st->uploader, 0,
> + numAttribs * 4 * 4 * sizeof(GLfloat),
> + &offset, &vbuffer, (void **) &vbuf) !=
> PIPE_OK) {
>   return;
>}
>
> diff --git a/src/mesa/state_tracker/st_draw.c
> b/src/mesa/state_tracker/st_draw.c
> index de539ca..de62264 100644
> --- a/src/mesa/state_tracker/st_draw.c
> +++ b/src/mesa/state_tracker/st_draw.c
> @@ -84,7 +84,12 @@ all_varyings_in_vbos(const struct gl_client_array
> *arrays[])
>  }
>  
>  
> -static void
> +/**
> + * Basically, translate Mesa's index buffer information into
> + * a pipe_index_buffer object.
> + * \return TRUE or FALSE for success/failure
> + */
> +static boolean
>  setup_index_buffer(struct st_context *st,
> const struct _mesa_index_buffer *ib,
> struct pipe_index_buffer *ibuffer)
> @@ -100,8 +105,12 @@ setup_index_buffer(struct st_context *st,
>ibuffer->offset = pointer_to_offset(ib->ptr);
> }
> else if (st->indexbuf_uploader) {
> -  u_upload_data(st->indexbuf_uploader, 0, ib->count *
> ibuffer->index_size,
> -ib->ptr, &ibuffer->offset, &ibuffer->buffer);
> +  if (u_upload_data(st->indexbuf_uploader, 0,
> +ib->count * ibuffer->index_size, ib->ptr,
> +&ibuffer->offset, &ibuffer->buffer) !=
> PIPE_OK) {
> + /* out of memory */
> + return FALSE;
> +  }
>u_upload_unmap(st->indexbuf_uploader);
> }
> else {
> @@ -110,6 +119,7 @@ setup_index_buffer(struct st_context *st,
> }
>  
> cso_set_index_buffer(st->cso_context, ibuffer);
> +   return TRUE;
>  }
>  
>  
> @@ -220,7 +230,10 @@ st_draw_vbo(struct gl_context *ctx,
>  vbo_get_minmax_indices(ctx, prims, ib, &min_index,
>  &max_ind