Re: [Mesa-dev] [PATCH 1/2] r600g, radeonsi: query the buffer domain from the kernel for DRI2 buffers

2014-02-04 Thread Michel Dänzer
On Wed, 2014-02-05 at 00:01 +0100, Marek Olšák wrote:
> From: Marek Olšák 
> 
> Better than guessing it.
> 
> Yeah we have had this query for a long time...

[...]

> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> index 2ac060b..7c59f26 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> @@ -201,6 +201,28 @@ static boolean radeon_bo_is_busy(struct pb_buffer *_buf,
>  }
>  }
>  
> +static enum radeon_bo_domain radeon_bo_get_current_domain(struct pb_buffer *_buf)
> +{
> +struct radeon_bo *bo = get_radeon_bo(_buf);
> +struct drm_radeon_gem_busy args;
> +
> +memset(&args, 0, sizeof(args));
> +args.handle = bo->handle;
> +
> +drmCommandWriteRead(bo->rws->fd, DRM_RADEON_GEM_BUSY,
> +&args, sizeof(args));
> +
> +/* Zero domains the driver doesn't understand. */
> +args.domain &= RADEON_GEM_DOMAIN_VRAM | RADEON_GEM_DOMAIN_GTT;
> +
> +/* If no domain is set, we must set something... */
> +if (!args.domain)
> +args.domain = RADEON_GEM_DOMAIN_VRAM | RADEON_GEM_DOMAIN_GTT;
> +
> +/* GEM domains and winsys domains are defined the same. */
> +return args.domain;
> +}

The problem with this is that DRM_RADEON_GEM_BUSY doesn't say where the
BO is supposed to be, but where it happens to be right now. E.g. it
could return GTT for a BO that's supposed to be in VRAM but was evicted
(or couldn't get into VRAM in the first place).
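
To make the distinction concrete, here is a minimal sketch in plain C. It is a toy model, not radeon or libdrm code: the struct, the function names and the domain constants are all made up for illustration. The only point is that a "where is it now" query can disagree with "where was it meant to be".

```c
#include <stdint.h>

/* Illustrative stand-ins for the GTT and VRAM domain bits (assumed values). */
enum { MODEL_DOMAIN_GTT = 0x2, MODEL_DOMAIN_VRAM = 0x4 };

/* Toy buffer object: tracks the intended and the current placement. */
struct model_bo {
    uint32_t initial_domain;  /* where the BO is supposed to live */
    uint32_t current_domain;  /* where it happens to be right now */
};

/* Models the kernel moving the BO out of VRAM under memory pressure. */
static void model_evict(struct model_bo *bo)
{
    bo->current_domain = MODEL_DOMAIN_GTT;
}

/* Models a busy-style query: it only ever reports the current placement. */
static uint32_t model_query_busy_domain(const struct model_bo *bo)
{
    return bo->current_domain;
}
```

In this model a BO created for VRAM reports GTT after an eviction, which is the value that must not be mistaken for the intended domain.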


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL)

2014-02-04 Thread Ilia Mirkin
On Tue, Feb 4, 2014 at 2:58 AM, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>
> This fixes the bin/fs-textureOffset-2D piglit test on nv50. Have yet to re-run
> the full piglit suite with this change in place, but it seems pretty obvious.

BTW, I re-ran the gpu piglits and saw no regressions with this change.

>
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 51d3d08..b078f2e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1729,6 +1729,14 @@ Converter::handleTEX(Value *dst[4], int R, int S, int L, int C, int Dx, int Dy)
> if (tgsi.getOpcode() == TGSI_OPCODE_SAMPLE_C_LZ)
>texi->tex.levelZero = true;
>
> +   for (s = 0; s < tgsi.getNumTexOffsets(); ++s) {
> +  for (c = 0; c < 3; ++c) {
> + texi->tex.offset[s][c] = tgsi.getTexOffset(s).getValueU32(c, info);
> + if (texi->tex.offset[s][c])
> +texi->tex.useOffsets = s + 1;
> +  }
> +   }
> +
> bb->insertTail(texi);
>  }
>
> --
> 1.8.3.2
>
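
For readers following the quoted hunk, here is a rough standalone rendering of the offset scan, written in C rather than the C++ of the patch. The function name and the array shape are assumptions for illustration; only the rule "the highest offset set with a non-zero component wins, stored as s + 1" is taken from the patch.

```c
#include <stdint.h>

/*
 * Returns how many texel-offset sets are actually in use: the index of the
 * last set that has any non-zero component, plus one. Returns 0 when every
 * component of every set is zero.
 */
static unsigned count_used_offsets(const int32_t offsets[][3], unsigned num_sets)
{
    unsigned use_offsets = 0;

    for (unsigned s = 0; s < num_sets; ++s) {
        for (unsigned c = 0; c < 3; ++c) {
            if (offsets[s][c])
                use_offsets = s + 1;
        }
    }
    return use_offsets;
}
```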


[Mesa-dev] [PATCH] i965: Drop VECTOR_MASK_ENABLE in Broadwell's 3DSTATE_VS packet.

2014-02-04 Thread Kenneth Graunke
We never set it on previous generations, but I had to set it in
3DSTATE_PS for correct behavior.  For symmetry, I set it in 3DSTATE_VS
as well, but there's no actual need to do so.  Piglit works fine either
way.  The documentation also remarks that there should never be a need
to program this.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_vs_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c b/src/mesa/drivers/dri/i965/gen8_vs_state.c
index 65a62f3..02a0176 100644
--- a/src/mesa/drivers/dri/i965/gen8_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c
@@ -87,7 +87,7 @@ upload_vs_state(struct brw_context *brw)
OUT_BATCH(_3DSTATE_VS << 16 | (9 - 2));
OUT_BATCH(stage_state->prog_offset);
OUT_BATCH(0);
-   OUT_BATCH(GEN6_VS_VECTOR_MASK_ENABLE | floating_point_mode |
+   OUT_BATCH(floating_point_mode |
  ((ALIGN(stage_state->sampler_count, 4) / 4) <<
GEN6_VS_SAMPLER_COUNT_SHIFT) |
  ((prog_data->base.binding_table.size_bytes / 4) <<
-- 
1.8.4.2



[Mesa-dev] [PATCH] i965: Fix General and Indirect Base Addresses on Broadwell.

2014-02-04 Thread Kenneth Graunke
I set the "address modify enable" bit in the wrong DWord.  The first
DWord is the high 16 bits of the address, while the second is the low
32 bits and the enable bit.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_misc_state.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_misc_state.c b/src/mesa/drivers/dri/i965/gen8_misc_state.c
index ddc65a8..72ac2b2 100644
--- a/src/mesa/drivers/dri/i965/gen8_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_misc_state.c
@@ -36,8 +36,8 @@ static void upload_state_base_address(struct brw_context *brw)
BEGIN_BATCH(16);
OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (16 - 2));
/* General state base address: stateless DP read/write requests */
-   OUT_BATCH(1);
OUT_BATCH(0);
+   OUT_BATCH(1);
OUT_BATCH(0);
/* Surface state base address: */
OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0, 1);
@@ -45,8 +45,8 @@ static void upload_state_base_address(struct brw_context *brw)
OUT_RELOC64(brw->batch.bo,
I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0, 1);
/* Indirect object base address: MEDIA_OBJECT data */
-   OUT_BATCH(1);
OUT_BATCH(0);
+   OUT_BATCH(1);
/* Instruction base address: shader kernels (incl. SIP) */
OUT_RELOC64(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0, 1);
 
-- 
1.8.4.2
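
As an aside, the split described in the commit message can be sketched as a small packing helper. This takes the layout stated above at face value (first DWord: high 16 bits; second DWord: low 32 bits plus the enable bit) and is not checked against the hardware documentation. The struct, the function name and the bit constant are hypothetical, and the address is assumed to be 4 KiB aligned so that the low bits are free for flags.

```c
#include <stdint.h>

/* Hypothetical name for the "address modify enable" flag in bit 0. */
#define MODEL_ADDRESS_MODIFY_ENABLE 0x1u

struct model_base_address {
    uint32_t first;   /* high 16 bits of the 48-bit address */
    uint32_t second;  /* low 32 bits of the address, plus the enable bit */
};

/* Splits a 4 KiB aligned 48-bit address into the two DWords. */
static struct model_base_address model_pack_base_address(uint64_t addr)
{
    struct model_base_address d;

    d.first = (uint32_t)(addr >> 32) & 0xffffu;
    d.second = ((uint32_t)addr & ~0xfffu) | MODEL_ADDRESS_MODIFY_ENABLE;
    return d;
}
```

With a zero base address this yields 0 followed by 1, which is the DWord order the patch switches to.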



[Mesa-dev] [PATCH] i965: Implement a CS stall workaround

2014-02-04 Thread Kenneth Graunke
According to the latest documentation, any PIPE_CONTROL with the
"Command Streamer Stall" bit set must also have another bit set,
with five different options:

   - Render Target Cache Flush
   - Depth Cache Flush
   - Stall at Pixel Scoreboard
   - Post-Sync Operation
   - Depth Stall

I chose "Stall at Pixel Scoreboard" since we've used it effectively
in the past, but the choice is fairly arbitrary.

Implementing this in the PIPE_CONTROL emit helpers ensures that the
workaround will always take effect when it ought to.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index fbbd527..719b026 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -441,6 +441,10 @@ void
 brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
 {
if (brw->gen >= 8) {
+  /* Workarounds */
+  if (flags & PIPE_CONTROL_CS_STALL)
+ flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
+
   BEGIN_BATCH(6);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
   OUT_BATCH(flags);
@@ -481,6 +485,10 @@ brw_emit_pipe_control_write(struct brw_context *brw, uint32_t flags,
 uint32_t imm_lower, uint32_t imm_upper)
 {
if (brw->gen >= 8) {
+  /* Workarounds */
+  if (flags & PIPE_CONTROL_CS_STALL)
+ flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD;
+
   BEGIN_BATCH(6);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
   OUT_BATCH(flags);
-- 
1.8.4.2
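
For illustration, the flag fix-up can be looked at in isolation as a pure function. This is only a sketch: the function name is made up, and the two bit values are placeholders that may or may not match the real PIPE_CONTROL_* definitions in the driver.

```c
#include <stdint.h>

/* Placeholder bit positions, assumed for this sketch only. */
#define MODEL_PIPE_CONTROL_CS_STALL            (1u << 20)
#define MODEL_PIPE_CONTROL_STALL_AT_SCOREBOARD (1u << 1)

/*
 * On Gen8+, a CS stall must be accompanied by one of the other listed bits;
 * this sketch always picks "Stall at Pixel Scoreboard". Flags without a CS
 * stall, and older generations, pass through unchanged.
 */
static uint32_t model_apply_cs_stall_workaround(int gen, uint32_t flags)
{
    if (gen >= 8 && (flags & MODEL_PIPE_CONTROL_CS_STALL))
        flags |= MODEL_PIPE_CONTROL_STALL_AT_SCOREBOARD;
    return flags;
}
```

Doing this inside the emit helpers, as the patch does, means callers never need to remember the extra bit.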



[Mesa-dev] [PATCH 2/2] i965: Use the new brw_load_register_mem helper for draw indirect.

2014-02-04 Thread Kenneth Graunke
This makes it work on Broadwell, too.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_draw.c | 56 
 1 file changed, 25 insertions(+), 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 39da953..9de622f 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -217,42 +217,36 @@ static void brw_emit_prim(struct brw_context *brw,
 
   indirect_flag = GEN7_3DPRIM_INDIRECT_PARAMETER_ENABLE;
 
-  BEGIN_BATCH(15);
-
-  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
-  OUT_BATCH(GEN7_3DPRIM_VERTEX_COUNT);
-  OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0,
-prim->indirect_offset + 0);
-  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
-  OUT_BATCH(GEN7_3DPRIM_INSTANCE_COUNT);
-  OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0,
-prim->indirect_offset + 4);
-  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
-  OUT_BATCH(GEN7_3DPRIM_START_VERTEX);
-  OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0,
-prim->indirect_offset + 8);
-
+  brw_load_register_mem(brw, GEN7_3DPRIM_VERTEX_COUNT, bo,
+I915_GEM_DOMAIN_VERTEX, 0,
+prim->indirect_offset + 0);
+  brw_load_register_mem(brw, GEN7_3DPRIM_INSTANCE_COUNT, bo,
+I915_GEM_DOMAIN_VERTEX, 0,
+prim->indirect_offset + 4);
+
+  brw_load_register_mem(brw, GEN7_3DPRIM_START_VERTEX, bo,
+I915_GEM_DOMAIN_VERTEX, 0,
+prim->indirect_offset + 8);
   if (prim->indexed) {
- OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_3DPRIM_BASE_VERTEX);
- OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0,
-   prim->indirect_offset + 12);
- OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_3DPRIM_START_INSTANCE);
- OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0,
-   prim->indirect_offset + 16);
-  }
-  else {
- OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_3DPRIM_START_INSTANCE);
- OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0,
-   prim->indirect_offset + 12);
+ brw_load_register_mem(brw, GEN7_3DPRIM_BASE_VERTEX, bo,
+   I915_GEM_DOMAIN_VERTEX, 0,
+   prim->indirect_offset + 12);
+ brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo,
+   I915_GEM_DOMAIN_VERTEX, 0,
+   prim->indirect_offset + 16);
+  } else {
+ brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo,
+   I915_GEM_DOMAIN_VERTEX, 0,
+   prim->indirect_offset + 12);
+ brw_load_register_mem(brw, GEN7_3DPRIM_BASE_VERTEX, bo,
+   I915_GEM_DOMAIN_VERTEX, 0,
+   prim->indirect_offset + 12);
+ BEGIN_BATCH(3);
  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
  OUT_BATCH(GEN7_3DPRIM_BASE_VERTEX);
  OUT_BATCH(0);
+ ADVANCE_BATCH();
   }
-
-  ADVANCE_BATCH();
}
else {
   indirect_flag = 0;
-- 
1.8.4.2
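
For reference, the byte offsets 0, 4, 8, 12 and 16 used in the hunk above line up with the indirect draw command layouts defined by the OpenGL specification. The sketch below spells those layouts out as C structs; the struct and field names are my own, and the mapping to the 3DPRIM registers in the comments is read off the patch rather than from hardware documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout of a non-indexed indirect draw command (glDrawArraysIndirect). */
struct model_draw_arrays_indirect {
    uint32_t count;           /* offset  0: vertex count */
    uint32_t instance_count;  /* offset  4: instance count */
    uint32_t first;           /* offset  8: start vertex */
    uint32_t base_instance;   /* offset 12: start instance */
};

/* Layout of an indexed indirect draw command (glDrawElementsIndirect). */
struct model_draw_elements_indirect {
    uint32_t count;           /* offset  0: index count */
    uint32_t instance_count;  /* offset  4: instance count */
    uint32_t first_index;     /* offset  8: start index */
    int32_t  base_vertex;     /* offset 12: base vertex */
    uint32_t base_instance;   /* offset 16: start instance */
};
```

This is why the start instance is read from offset 16 in the indexed case but from offset 12 otherwise.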



[Mesa-dev] [PATCH 1/2] i965: Implement a brw_load_register_mem helper function.

2014-02-04 Thread Kenneth Graunke
This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64
distinction.

Placing the function in intel_batchbuffer.c is rather arbitrary; there
wasn't really an obvious place for it.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  7 +++
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 25 +
 2 files changed, 32 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index a0189b7..fec1a8e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1567,6 +1567,13 @@ void brw_write_depth_count(struct brw_context *brw, drm_intel_bo *bo, int idx);
 void brw_store_register_mem64(struct brw_context *brw,
   drm_intel_bo *bo, uint32_t reg, int idx);
 
+/** intel_batchbuffer.c */
+void brw_load_register_mem(struct brw_context *brw,
+   uint32_t reg,
+   drm_intel_bo *bo,
+   uint32_t read_domains, uint32_t write_domain,
+   uint32_t offset);
+
 /*==
  * brw_state_dump.c
  */
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 719b026..1762d92 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -669,3 +669,28 @@ intel_batchbuffer_emit_mi_flush(struct brw_context *brw)
   brw_emit_pipe_control_flush(brw, flags);
}
 }
+
+void
+brw_load_register_mem(struct brw_context *brw,
+  uint32_t reg,
+  drm_intel_bo *bo,
+  uint32_t read_domains, uint32_t write_domain,
+  uint32_t offset)
+{
+   /* MI_LOAD_REGISTER_MEM only exists on Gen7+. */
+   assert(brw->gen >= 7);
+
+   if (brw->gen >= 8) {
+  BEGIN_BATCH(4);
+  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (4 - 2));
+  OUT_BATCH(reg);
+  OUT_RELOC64(bo, read_domains, write_domain, offset);
+  ADVANCE_BATCH();
+   } else {
+  BEGIN_BATCH(3);
+  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(reg);
+  OUT_RELOC(bo, read_domains, write_domain, offset);
+  ADVANCE_BATCH();
+   }
+}
-- 
1.8.4.2
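
To show what the helper hides from callers, here is a toy model of the two packet shapes, using a plain array instead of the real batchbuffer macros. Everything in it is an assumption made for illustration: the opcode value is a placeholder, the model_* names do not exist in the driver, and the 64-bit address is assumed to be emitted low DWord first.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder opcode; the low bits carry the packet length minus two. */
#define MODEL_MI_LOAD_REGISTER_MEM 0x14800000u

struct model_batch {
    uint32_t dw[16];
    size_t used;
};

static void model_emit(struct model_batch *b, uint32_t value)
{
    b->dw[b->used++] = value;  /* no bounds checking in this sketch */
}

/* Emits 3 DWords before Gen8 and 4 DWords on Gen8+ (64-bit address). */
static void model_load_register_mem(struct model_batch *b, int gen,
                                    uint32_t reg, uint64_t addr)
{
    const uint32_t len = gen >= 8 ? 4 : 3;

    model_emit(b, MODEL_MI_LOAD_REGISTER_MEM | (len - 2));
    model_emit(b, reg);
    model_emit(b, (uint32_t)addr);
    if (gen >= 8)
        model_emit(b, (uint32_t)(addr >> 32));
}
```

A caller passes the same arguments on every generation and only the helper needs to know about the extra address DWord.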



[Mesa-dev] [PATCH] i965: Add missing sample shading bits to Gen8's 3DSTATE_PS_EXTRA.

2014-02-04 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index b45da8f..852ba36 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -22,6 +22,7 @@
  */
 
 #include 
+#include "program/program.h"
 #include "brw_state.h"
 #include "brw_defines.h"
 #include "intel_batchbuffer.h"
@@ -29,6 +30,7 @@
 static void
 upload_ps_extra(struct brw_context *brw)
 {
+   struct gl_context *ctx = &brw->ctx;
/* BRW_NEW_FRAGMENT_PROGRAM */
const struct brw_fragment_program *fp =
   brw_fragment_program_const(brw->fragment_program);
@@ -63,6 +65,15 @@ upload_ps_extra(struct brw_context *brw)
if (fp->program.Base.InputsRead & VARYING_BIT_POS)
   dw1 |= GEN8_PSX_USES_SOURCE_DEPTH | GEN8_PSX_USES_SOURCE_W;
 
+   /* _NEW_BUFFERS */
+   bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
+   if (multisampled_fbo &&
+   _mesa_get_min_invocations_per_fragment(ctx, &fp->program, false) > 1)
+  dw1 |= GEN8_PSX_SHADER_IS_PER_SAMPLE;
+
+   if (fp->program.Base.SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN)
+  dw1 |= GEN8_PSX_SHADER_USES_INPUT_COVERAGE_MASK;
+
BEGIN_BATCH(2);
OUT_BATCH(_3DSTATE_PS_EXTRA << 16 | (2 - 2));
OUT_BATCH(dw1);
@@ -71,7 +82,7 @@ upload_ps_extra(struct brw_context *brw)
 
 const struct brw_tracked_state gen8_ps_extra = {
.dirty = {
-  .mesa  = 0,
+  .mesa  = _NEW_BUFFERS,
   .brw   = BRW_NEW_CONTEXT | BRW_NEW_FRAGMENT_PROGRAM,
   .cache = 0,
},
-- 
1.8.4.2



[Mesa-dev] [PATCH] nouveau/video: make sure that firmware is present when checking caps

2014-02-04 Thread Ilia Mirkin
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have creating it fail, and express this poor preparation
with crashes (e.g. flash). Check that firmware is there to increase the
chances of there being a high correlation between reported capabilities
and ability to create a decoder.

Signed-off-by: Ilia Mirkin 
Cc: 10.0 10.1 
---

I tested this on a VP3 card. Would be nice if someone could give the (somewhat
different) vp2 logic a shot. Emil perhaps? If no one confirms after a while
I'll go swap cards in my computer.
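
The caching scheme used by the patch (one "checked" bit and one "present" bit per firmware, so each file is only looked up once) can be sketched on its own as below. The struct and function names are invented for this example; the size threshold of 1000 bytes mirrors the check in the patch, and the bit index is assumed to be smaller than the width of unsigned.

```c
#include <sys/stat.h>

/* One bit per firmware file: "have we looked?" and "was it there?". */
struct model_fw_cache {
    unsigned checked;
    unsigned present;
};

/*
 * Returns non-zero if the firmware at 'path' looks usable. The file system
 * is consulted only on the first call for a given bit; later calls reuse
 * the cached answer.
 */
static int model_firmware_present(struct model_fw_cache *cache,
                                  unsigned bit, const char *path)
{
    const unsigned mask = 1u << bit;

    if (!(cache->checked & mask)) {
        struct stat s;

        if (stat(path, &s) == 0 && s.st_size > 1000)
            cache->present |= mask;
        cache->checked |= mask;
    }
    return (cache->present & mask) != 0;
}
```

Because the result is cached, firmware installed after the first query is not noticed until the cache is reset, which in the patch means a new screen.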

 src/gallium/drivers/nouveau/nouveau_screen.h|  5 ++
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 54 +++-
 src/gallium/drivers/nouveau/nv50/nv84_video.c   | 68 -
 3 files changed, 123 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_screen.h b/src/gallium/drivers/nouveau/nouveau_screen.h
index 7f15d10..51e24fa 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.h
+++ b/src/gallium/drivers/nouveau/nouveau_screen.h
@@ -49,6 +49,11 @@ struct nouveau_screen {
 
boolean hint_buf_keep_sysmem_copy;
 
+   struct {
+   unsigned profiles_checked;
+   unsigned profiles_present;
+   } firmware_info;
+
 #ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
union {
   uint64_t v[29];
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
index ff00b37..660a3d0 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -21,6 +21,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -350,6 +351,53 @@ nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
return 0;
 }
 
+static int
+firmware_present(struct pipe_screen *pscreen, enum pipe_video_profile profile)
+{
+   struct nouveau_screen *screen = nouveau_screen(pscreen);
+   int chipset = screen->device->chipset;
+   int vp3 = chipset < 0xa3 || chipset == 0xaa || chipset == 0xac;
+   int vp5 = chipset >= 0xd0;
+   int ret;
+
+   /* For all chipsets, try to create a BSP object. Assume that if firmware
+* is present for it, firmware is also present for VP/PPP */
+   if (!(screen->firmware_info.profiles_checked & 1)) {
+  struct nouveau_object *bsp = NULL;
+  int oclass;
+  if (chipset < 0xc0)
+ oclass = 0x85b1;
+  else if (vp5)
+ oclass = 0x95b1;
+  else
+ oclass = 0x90b1;
+  nouveau_object_new(screen->channel, 0, oclass, NULL, 0, &bsp);
+  if (bsp)
+ screen->firmware_info.profiles_present |= 1;
+  nouveau_object_del(&bsp);
+  screen->firmware_info.profiles_checked |= 1;
+   }
+
+   if (!(screen->firmware_info.profiles_present & 1))
+  return 0;
+
+   /* For vp3/vp4 chipsets, make sure that the relevant firmware is present */
+   if (!vp5 && !(screen->firmware_info.profiles_checked & (1 << profile))) {
+  char path[PATH_MAX];
+  struct stat s;
+  if (vp3)
+ vp3_getpath(profile, path);
+  else
+ vp4_getpath(profile, path);
+  ret = stat(path, &s);
+  if (!ret && s.st_size > 1000)
+ screen->firmware_info.profiles_present |= (1 << profile);
+  screen->firmware_info.profiles_checked |= (1 << profile);
+   }
+
+   return vp5 || (screen->firmware_info.profiles_present & (1 << profile));
+}
+
 int
 nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
enum pipe_video_profile profile,
@@ -363,8 +411,10 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
   /* VP3 does not support MPEG4, VP4+ do. */
-  return profile >= PIPE_VIDEO_PROFILE_MPEG1 && (
-!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4);
+  return entrypoint == PIPE_VIDEO_ENTRYPOINT_BITSTREAM &&
+ profile >= PIPE_VIDEO_PROFILE_MPEG1 &&
+ (!vp3 || codec != PIPE_VIDEO_FORMAT_MPEG4) &&
+ firmware_present(pscreen, profile);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
   return 1;
case PIPE_VIDEO_CAP_MAX_WIDTH:
diff --git a/src/gallium/drivers/nouveau/nv50/nv84_video.c b/src/gallium/drivers/nouveau/nv50/nv84_video.c
index 3fee6d9..a39f572 100644
--- a/src/gallium/drivers/nouveau/nv50/nv84_video.c
+++ b/src/gallium/drivers/nouveau/nv50/nv84_video.c
@@ -741,16 +741,80 @@ error:
return NULL;
 }
 
+#define FIRMWARE_BSP_KERN  0x01
+#define FIRMWARE_VP_KERN   0x02
+#define FIRMWARE_BSP_H264  0x04
+#define FIRMWARE_VP_MPEG2  0x08
+#define FIRMWARE_VP_H264_1 0x10
+#define FIRMWARE_VP_H264_2 0x20
+#define FIRMWARE_PRESENT(val, fw) (val & FIRMWARE_ ## fw)
+
+static int
+firmware_present(struct pipe_screen *pscreen, enum pipe_video_format codec)
+{
+   struct nouveau_screen *screen = nouveau_screen(pscreen);
+   struct nouveau_object *obj = NULL;
+   struct stat s;
+   int checked = screen->firmware_info.profiles_checked;
+   int present, ret;
+
+  

Re: [Mesa-dev] [PATCH 12/15] i965: Pull out layer_multiplier in intel_update_renderbuffer_wrapper

2014-02-04 Thread Kenneth Graunke
On 01/25/2014 05:00 PM, Chris Forbes wrote:
> On Thu, Jan 23, 2014 at 6:58 AM, Courtney Goeltzenleuchter wrote:
>> On Tue, Jan 21, 2014 at 3:34 AM, Chris Forbes  wrote:
>>>
>>> We're about to need this in another place.
>>>
>>> Signed-off-by: Chris Forbes 
>>>   ---
>>>src/mesa/drivers/dri/i965/intel_fbo.c | 7 +--
>>>1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>>   diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c
>>>   index 4cdf54d..e7c5571 100644
>>>   --- a/src/mesa/drivers/dri/i965/intel_fbo.c
>>>   +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
>>>   @@ -433,16 +433,19 @@ intel_renderbuffer_update_wrapper(struct brw_context *brw,
>>>   intel_miptree_check_level_layer(mt, level, layer);
>>>   irb->mt_level = level;
>>>
>>>   +   int layer_multiplier;
>>>
>>Shouldn't this declaration be at the top of the function?
> 
> Since we're not restricted to C89 in the driver, I put it as close to
> the point of use as possible -- but yes, I can move it to the top of
> the function if you think that's clearer.

FWIW, I prefer putting declarations closer to their use...I'd stick with
the way you did it.  C99 is fine in the driver, since nobody uses MSVC
on it.

--Ken





Re: [Mesa-dev] ARB_texture_gather support for Sandy Bridge

2014-02-04 Thread Kenneth Graunke
On 02/03/2014 01:29 AM, Chris Forbes wrote:
> This series adds a bunch of workarounds to enable ARB_texture_gather
> (in its more restrictive form) on Gen6 hardware.
> 
> These are necessary because Gen6's gather4 instruction doesn't work
> correctly with integer or unsigned integer formats.
> 
> The approach is:
> 
> * For 32-bit wide formats, pretend the surface is FLOAT, and reinterpret
> the bits as INT/UINT. This requires only a surface format override; nothing
> in the shader.
> 
> * For 8- and 16-bit wide formats, pretend the surface is UNORM,
> and recover the appropriate unsigned integer value by multiplying up,
> and then converting to INT/UINT. If INT is required, then fix the sign
> extension of the value by the usual SHL/ASR method.
> 
> This now passes all the applicable ARB_texture_gather piglit tests.

Chris,

This is great!  Much, much simpler than your old code to get this
working.  I see no reason not to commit this and enable it.

I sent a bunch of nits and suggestions.  Mostly, more comments would be
great.  I don't feel a need to see this again, so assuming you take my
suggestions (or some approximation thereof), feel free to push this.

For the series:
Reviewed-by: Kenneth Graunke 
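
For anyone reading along without the series at hand, the 8- and 16-bit recovery step described in the cover letter can be modelled on the CPU roughly like this. It is a sketch of the arithmetic only, not the shader code from the patches: the function name is invented, and the sign extension relies on an arithmetic right shift of signed values, which common compilers provide but ISO C leaves implementation-defined.

```c
#include <stdint.h>

/*
 * Recovers the integer that was stored in a 'bits'-wide texel from the
 * UNORM value the sampler returned for it.
 *   1. Multiply up by (2^bits - 1) and round to get the raw unsigned value.
 *   2. For signed formats, shift left then arithmetic-shift right (SHL/ASR)
 *      so that the top bit of the texel becomes the sign bit.
 */
static int32_t model_recover_from_unorm(float unorm, unsigned bits, int is_signed)
{
    const uint32_t max = (1u << bits) - 1u;
    const uint32_t raw = (uint32_t)(unorm * (float)max + 0.5f);

    if (!is_signed)
        return (int32_t)raw;

    const unsigned shift = 32u - bits;
    return (int32_t)(raw << shift) >> shift;  /* assumes arithmetic shift */
}
```

For example, an R8I texel holding the byte 0xFF reads back as 1.0 through the UNORM view, is multiplied up to 255, and sign-extends to -1.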





Re: [Mesa-dev] [PATCH 1/5] i965: Add Gen6 gather wa to sampler key

2014-02-04 Thread Kenneth Graunke
On 02/03/2014 01:29 AM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_program.h | 11 +++
>  src/mesa/drivers/dri/i965/brw_wm.c  | 20 
>  2 files changed, 31 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program.h b/src/mesa/drivers/dri/i965/brw_program.h
> index 51182ea..c071b5b 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.h
> +++ b/src/mesa/drivers/dri/i965/brw_program.h
> @@ -24,6 +24,12 @@
>  #ifndef BRW_PROGRAM_H
>  #define BRW_PROGRAM_H
>  
> +enum gen6_gather_sampler_wa {
> +   WA_SIGN = 1, /* whether we need to sign extend */

Funny indentation here...probably should align the comments.

> +   WA_8BIT = 2,  /* if we have an 8bit format needing wa */
> +   WA_16BIT = 4, /* if we have a 16bit format needing wa */
> +};
> +
>  /**
>   * Sampler information needed by VS, WM, and GS program cache keys.
>   */
> @@ -50,6 +56,11 @@ struct brw_sampler_prog_key_data {
>  * Whether this sampler uses the compressed multisample surface layout.
>  */
> uint16_t compressed_multisample_layout_mask;
> +
> +   /**
> +* For Sandybridge, which shader w/a we need for gather quirks.
> +*/
> +   uint8_t gen6_gather_wa[MAX_SAMPLERS];
>  };
>  
>  #ifdef __cplusplus
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
> index a0758d2..97016c6 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -317,6 +317,20 @@ brw_wm_debug_recompile(struct brw_context *brw,
> }
>  }
>  
> +static uint8_t
> +_gen6_gather_wa(GLenum internalformat)

We normally don't prefix static functions with '_'...I'd just call this
gen6_gather_wa() or gen6_gather_workaround().

> +{
> +   switch (internalformat) {
> +  case GL_R8I: return WA_SIGN | WA_8BIT;
> +  case GL_R8UI: return WA_8BIT;
> +  case GL_R16I: return WA_SIGN | WA_16BIT;
> +  case GL_R16UI: return WA_16BIT;
> +  /* note that even though GL_R32I and GL_R32UI have format overrides
> +   * in the surface state, there is no shader w/a required */
> +  default: return 0;
> +   }
> +}
> +
>  void
>  brw_populate_sampler_prog_key_data(struct gl_context *ctx,
>  const struct gl_program *prog,
> @@ -372,6 +386,12 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
> key->gather_channel_quirk_mask |= 1 << s;
>   }
>  
> + /* Gen6's gather4 is broken for UINT/SINT; we treat them as UNORM/FLOAT instead
> +  * and fix it in the shader. */

Closing */ should go on its own line.

> + if (brw->gen == 6 && prog->UsesGather) {
> +key->gen6_gather_wa[s] = _gen6_gather_wa(img->InternalFormat);
> + }
> +
>   /* If this is a multisample sampler, and uses the CMS MSAA layout,
>* then we need to emit slightly different code to first sample the
>* MCS surface.






Re: [Mesa-dev] [PATCH 02/10] glsl: serialize methods for IR instructions

2014-02-04 Thread Paul Berry
On 29 January 2014 01:24, Tapani Pälli  wrote:

> +
> +
> +/**
> + * Serialization of exec_list, writes length of the list +
> + * calls serialize_data for each instruction
> + */
> +void
> +serialize_list(exec_list *list, memory_writer &mem)
> +{
> +   uint32_t list_len = 0;
> +   foreach_list(node, list)
> +  list_len++;
> +
> +   mem.write_uint32_t(list_len);
> +
> +   foreach_list(node, list)
> +  ((ir_instruction *)node)->serialize(mem);
>

Nit pick: we could avoid having to walk the list twice by using
mem.overwrite() to write out the length once we've written all the list
elements.  A similar comment applies to ir_call::serialize_data().
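
Roughly what I have in mind, as a standalone C sketch rather than code against the memory_writer from this series. The writer struct, the node type and all function names below are made up for the example, and there is no bounds checking.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal stand-in for a growable writer: fixed buffer plus a cursor. */
struct model_writer {
    uint8_t buf[256];
    size_t pos;
};

struct model_node {
    uint32_t value;
    const struct model_node *next;
};

static void model_write_u32(struct model_writer *w, uint32_t v)
{
    memcpy(w->buf + w->pos, &v, sizeof(v));
    w->pos += sizeof(v);
}

static void model_overwrite(struct model_writer *w, const void *data,
                            size_t len, size_t at)
{
    memcpy(w->buf + at, data, len);
}

/* Writes the length and the elements while walking the list only once. */
static void model_serialize_list(struct model_writer *w,
                                 const struct model_node *head)
{
    const size_t len_pos = w->pos;
    uint32_t len = 0;

    model_write_u32(w, len);  /* placeholder, patched below */
    for (const struct model_node *n = head; n; n = n->next) {
        model_write_u32(w, n->value);
        len++;
    }
    model_overwrite(w, &len, sizeof(len), len_pos);
}
```

The output is byte-for-byte the same as with the two-pass version; only the order in which the length field gets its final value changes.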


> +}
> +
> +
> +static void
> +serialize_glsl_type(const glsl_type *type, memory_writer &mem)
> +{
> +   uint32_t data_len = 0;
> +
> +   mem.write_string(type->name);
> +
> +   unsigned start_pos = mem.position();
> +   mem.write_uint32_t(data_len);
> +
> +   uint32_t type_hash;
> +
> +   /**
> +* notify reader if a user defined type exists already
> +* (has been serialized before)
> +*/
> +   uint8_t user_type_exists = 0;
> +
> +   /* serialize only user defined types */
> +   switch (type->base_type) {
> +   case GLSL_TYPE_ARRAY:
> +   case GLSL_TYPE_STRUCT:
> +   case GLSL_TYPE_INTERFACE:
> +  break;
> +   default:
> +  goto glsl_type_serilization_epilogue;
> +   }
> +
>

With the changes I recommended in the last patch, I believe we can replace
everything from here...


> +   type_hash = _mesa_hash_data(type, sizeof(glsl_type));
>
+
> +   /* check if this type has been written before */
> +   if (mem.data_was_written((void *)type, type_hash))
> +  user_type_exists = 1;
> +   else
> +  mem.mark_data_written((void *)type, type_hash);
>

...to here with:

   user_type_exists = mem.make_unique_id(type, &type_hash);

Although I'd recommend renaming type_hash to type_id so that there's no
confusion about the fact that it's a unique ID and not a hash (hashes
aren't guaranteed to be unique).
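
To illustrate the difference between an ID and a hash, here is a toy version of such a helper in C. It is not the make_unique_id() from my comments on the previous patch: the table type, the fixed capacity and the linear search are assumptions chosen to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_IDS 64

/* Remembers every pointer seen so far; the slot index is the unique ID. */
struct model_id_table {
    const void *keys[MODEL_MAX_IDS];
    uint32_t count;
};

/*
 * Looks up 'ptr' and stores its ID in *id_out. Returns 1 if the pointer
 * had been seen before, 0 if a fresh ID was assigned. Two different
 * pointers can never share an ID, unlike with a hash. If the table is
 * full, *id_out is set to UINT32_MAX and 0 is returned.
 */
static int model_make_unique_id(struct model_id_table *t, const void *ptr,
                                uint32_t *id_out)
{
    for (uint32_t i = 0; i < t->count; i++) {
        if (t->keys[i] == ptr) {
            *id_out = i;
            return 1;
        }
    }
    if (t->count == MODEL_MAX_IDS) {
        *id_out = UINT32_MAX;
        return 0;
    }
    t->keys[t->count] = ptr;
    *id_out = t->count++;
    return 0;
}
```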


> +
> +   mem.write_uint8_t(user_type_exists);
> +   mem.write_uint32_t(type_hash);
> +
> +   /* no need to write again ... */
> +   if (user_type_exists)
> +  goto glsl_type_serilization_epilogue;
>

Nit pick: how about instead of the goto, just do:

if (!user_type_exists) {
   /* glsl type data */
   mem.write_uint32_t((uint32_t) type->length);
   ...
}

data_len = mem.position() - start_pos - sizeof(data_len);
...



> +
> +   /* glsl type data */
> +   mem.write_uint32_t((uint32_t)type->length);
> +   mem.write_uint8_t((uint8_t)type->base_type);
> +   mem.write_uint8_t((uint8_t)type->interface_packing);
> +
> +   if (type->base_type == GLSL_TYPE_ARRAY)
> +  serialize_glsl_type(type->element_type(), mem);
> +   else if (type->base_type == GLSL_TYPE_STRUCT ||
> +  type->base_type == GLSL_TYPE_INTERFACE) {
> +  glsl_struct_field *field = type->fields.structure;
> +  for (unsigned k = 0; k < type->length; k++, field++) {
> + mem.write_string(field->name);
> + serialize_glsl_type(field->type, mem);
> + mem.write_uint8_t((uint8_t)field->row_major);
> + mem.write_int32_t(field->location);
> + mem.write_uint8_t((uint8_t)field->interpolation);
> + mem.write_uint8_t((uint8_t)field->centroid);
> +  }
> +   }
> +
> +glsl_type_serilization_epilogue:
> +
> +   data_len = mem.position() - start_pos - sizeof(data_len);
> +   mem.overwrite(&data_len, sizeof(data_len), start_pos);
> +}
> +
> +
> +void
> +ir_variable::serialize_data(memory_writer &mem)
> +{
> +   /* name can be NULL, see ir_print_visitor for explanation */
> +   const char *non_null_name = name ? name : "parameter";
>

Since mem.write_string handles NULL strings, can we get rid of this?


> +   int64_t unique_id = (int64_t) (intptr_t) this;
> +   uint8_t mode = data.mode;
> +   uint8_t has_constant_value = constant_value ? 1 : 0;
> +   uint8_t has_constant_initializer = constant_initializer ? 1 : 0;
> +
> +   serialize_glsl_type(type, mem);
> +
> +   mem.write_string(non_null_name);
> +   mem.write_int64_t(unique_id);
> +   mem.write_uint8_t(mode);
> +
> +   mem.write(&data, sizeof(data));
> +
> +   mem.write_uint32_t(num_state_slots);
> +   mem.write_uint8_t(has_constant_value);
> +   mem.write_uint8_t(has_constant_initializer);
> +
> +   for (unsigned i = 0; i < num_state_slots; i++) {
> +  mem.write_int32_t(state_slots[i].swizzle);
> +  for (unsigned j = 0; j < 5; j++) {
> + mem.write_int32_t(state_slots[i].tokens[j]);
> +  }
> +   }
> +
> +   if (constant_value)
> +  constant_value->serialize(mem);
> +
> +   if (constant_initializer)
> +  constant_initializer->serialize(mem);
> +
> +   uint8_t has_interface_type = get_interface_type() ? 1 : 0;
> +
> +   mem.write_uint8_t(has_interface_type);
> +   if (has_interface_type)
> +  serialize_glsl_type(get_interface_type(), mem);
> +}
>

Everything's a nit pick except for the suggestion to use make_unique_id().
With that fixed, this patch is

Re: [Mesa-dev] [PATCH 2/5] i965: Add surface format overrides for Gen6 gather

2014-02-04 Thread Kenneth Graunke
On 02/03/2014 01:29 AM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 30 
> 
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index dd96c9b..247b663 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -282,15 +282,35 @@ brw_update_texture_surface(struct gl_context *ctx,
> surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> 6 * 4, 32, surf_offset);
>  
> -   (void) for_gather;   /* no w/a to apply for this gen */
> +   uint32_t tex_format = translate_tex_format(brw, mt->format,
> +  sampler->sRGBDecode);
> +
> +   if (for_gather) {

The cover letter for your patch series has a nice explanation of what's
going on.  It would be great to capture some of that here.  Perhaps by
adding something like:

  /* Sandybridge's gather4 message is broken for integer formats.
   * To work around this, we pretend the surface is UNORM for
   * 8 or 16-bit formats, and emit shader instructions to recover
   * the real INT/UINT value.  For 32-bit formats, we pretend
   * the surface is FLOAT, and simply reinterpret the resulting
   * bits.
   */

> +  switch (tex_format) {
> +  case BRW_SURFACEFORMAT_R8_SINT:
> +  case BRW_SURFACEFORMAT_R8_UINT:
> + tex_format = BRW_SURFACEFORMAT_R8_UNORM;
> + break;
> +
> +  case BRW_SURFACEFORMAT_R16_SINT:
> +  case BRW_SURFACEFORMAT_R16_UINT:
> + tex_format = BRW_SURFACEFORMAT_R16_UNORM;
> + break;
> +
> +  case BRW_SURFACEFORMAT_R32_SINT:
> +  case BRW_SURFACEFORMAT_R32_UINT:
> + tex_format = BRW_SURFACEFORMAT_R32_FLOAT;
> + break;
> +
> +  default:
> + break;
> +  }
> +   }
>  
> surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
> BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
> BRW_SURFACE_CUBEFACE_ENABLES |
> -   (translate_tex_format(brw,
> -mt->format,
> - sampler->sRGBDecode) <<
> -BRW_SURFACE_FORMAT_SHIFT));
> +   tex_format << BRW_SURFACE_FORMAT_SHIFT);
>  
> surf[1] = intelObj->mt->region->bo->offset64 + intelObj->mt->offset; /* 
> reloc */
>  
> 






Re: [Mesa-dev] [PATCH 3/5] i965/fs: Emit shader w/a for Gen6 gather

2014-02-04 Thread Kenneth Graunke
On 02/03/2014 01:29 AM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26 ++
>  2 files changed, 27 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 9c5c13a..3d668b9 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -360,6 +360,7 @@ public:
>fs_reg shadow_comp, fs_reg lod, fs_reg lod2,
>fs_reg sample_index, fs_reg mcs, int sampler);
> fs_reg emit_mcs_fetch(ir_texture *ir, fs_reg coordinate, int sampler);
> +   void emit_gen6_gather_wa(uint8_t wa, fs_reg dst);
> fs_reg fix_math_operand(fs_reg src);
> fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0);
> fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index d88d24c..109f2e8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1699,9 +1699,35 @@ fs_visitor::visit(ir_texture *ir)
>}
> }
>  
> +   if (brw->gen == 6 && ir->op == ir_tg4 && 
> c->key.tex.gen6_gather_wa[sampler]) {

I think it might be easier to read if you did:

   if (brw->gen == 6 && ir->op == ir_tg4) {
  emit_gen6_gather_wa(c->key.tex.gen6_gather_wa[sampler], dst);
   }

and then in the body of the function did:

   if (!wa)
  return;

> +  emit_gen6_gather_wa(c->key.tex.gen6_gather_wa[sampler], dst);
> +   }
> +
> swizzle_result(ir, dst, sampler);
>  }
>  
> +/*

Comments above functions should start with /**, for Doxygen.

> + * Apply workarounds for Gen6 gather with UINT/SINT
> + */
> +void
> +fs_visitor::emit_gen6_gather_wa(uint8_t wa, fs_reg dst)
> +{
> +   int width = (wa & WA_8BIT) ? 8 : 16;
> +
> +   for (int i = 0; i < 4; i++) {
> +  fs_reg dst_f = dst.retype(BRW_REGISTER_TYPE_F);

Adding a comment would help:

  /* Convert from UNORM to UINT. */

> +  emit(MUL(dst_f, dst_f, fs_reg((float)((1 << width) - 1))));

If you like, you could write this using C++ constructor casts
emit(MUL(dst_f, dst_f, fs_reg(float((1 << width) - 1))));
which reduces the parenthesis jumble a bit.  Either way is fine.

> +  emit(MOV(dst, dst_f));
> +
> +  if (wa & WA_SIGN) {

 /* Reinterpret the UINT value as a signed INT value by shifting
  * the sign bit into place, then shifting back preserving sign.
  */

> + emit(SHL(dst, dst, fs_reg(32 - width)));
> + emit(ASR(dst, dst, fs_reg(32 - width)));
> +  }
> +
> +  dst.reg_offset++;
> +   }
> +}
> +
>  /**
>   * Set up the gather channel based on the swizzle, for gather4.
>   */
> 

These suggestions apply to patch 4 as well.
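
For reference, the arithmetic that emit_gen6_gather_wa() asks the shader to
perform can be mirrored on the CPU.  This is only a sketch under assumptions:
the name gather_wa_reference is made up for illustration, and std::lround
stands in for the hardware float-to-integer MOV, whose exact rounding
behaviour is not stated in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical CPU mirror of the Gen6 gather workaround (not Mesa code).
// 'unorm' is the value the sampler returns when an 8- or 16-bit integer
// surface is presented as UNORM.
int32_t gather_wa_reference(float unorm, int width, bool is_signed)
{
   assert(width == 8 || width == 16);

   // Convert from UNORM to UINT: scale [0, 1] back to [0, 2^width - 1].
   // Rounding to nearest is an assumption made for this sketch.
   const float scale = float((1 << width) - 1);
   uint32_t bits = uint32_t(std::lround(unorm * scale));

   if (is_signed) {
      // Reinterpret the UINT value as a signed INT value by shifting the
      // sign bit into place, then shifting back preserving sign.
      const int shift = 32 - width;
      bits <<= shift;
      // The unsigned-to-signed conversion and the right shift of a negative
      // value are only fully defined from C++20 on; mainstream compilers
      // wrap and shift arithmetically, which is what SHL/ASR do.
      return int32_t(bits) >> shift;
   }

   return int32_t(bits);
}
```

For example, a texel stored as 0xFF in an R8_SINT surface would be sampled as
UNORM 1.0, scaled back to 255, and then sign-extended to -1.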





Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
On 4 February 2014 19:52, Paul Berry  wrote:

> Whoops, I discovered another issue:
>
>
> On 29 January 2014 01:24, Tapani Pälli  wrote:
>
>> Class will be used by the shader binary cache implementation.
>>
>> Signed-off-by: Tapani Pälli 
>> ---
>>  src/glsl/memory_writer.h | 188
>> +++
>>  1 file changed, 188 insertions(+)
>>  create mode 100644 src/glsl/memory_writer.h
>>
>> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
>> new file mode 100644
>> index 000..979169f
>> --- /dev/null
>> +++ b/src/glsl/memory_writer.h
>> @@ -0,0 +1,188 @@
>> +/* -*- c++ -*- */
>> +/*
>> + * Copyright © 2013 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +#pragma once
>> +#ifndef MEMORY_WRITER_H
>> +#define MEMORY_WRITER_H
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "main/hash_table.h"
>> +
>> +#ifdef __cplusplus
>> +/**
>> + * Helper class for writing data to memory
>> + *
>> + * This class maintains a dynamically-sized memory buffer and allows
>> + * for data to be efficiently appended to it with automatic resizing.
>> + */
>> +class memory_writer
>> +{
>> +public:
>> +   memory_writer() :
>> +  memory(NULL),
>> +  curr_size(0),
>> +  pos(0)
>> +   {
>> +  data_hash = _mesa_hash_table_create(0, int_equal);
>> +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
>> +   }
>> +
>> +   ~memory_writer()
>> +   {
>> +  free(memory);
>> +  _mesa_hash_table_destroy(data_hash, NULL);
>> +   }
>> +
>> +   /* user wants to claim the memory */
>> +   char *release_memory(size_t *size)
>> +   {
>> +  /* final realloc to free allocated but unused memory */
>> +  char *result = (char *) realloc(memory, pos);
>> +  *size = pos;
>> +  memory = NULL;
>> +  curr_size = 0;
>> +  pos = 0;
>> +  return result;
>> +   }
>> +
>> +/**
>> + * write functions per type
>> + */
>> +#define DECL_WRITER(type) void write_ ##type (const type data) {\
>> +   write(&data, sizeof(type));\
>> +}
>> +
>> +   DECL_WRITER(int32_t);
>> +   DECL_WRITER(int64_t);
>> +   DECL_WRITER(uint8_t);
>> +   DECL_WRITER(uint32_t);
>> +
>> +   void write_bool(bool data)
>> +   {
>> +  uint8_t val = data;
>> +  write_uint8_t(val);
>> +   }
>> +
>> +   /* write function that reallocates more memory if required */
>> +   void write(const void *data, int size)
>> +   {
>> +  if (!memory || pos > (curr_size - size))
>> + if (!grow(size))
>> +return;
>> +
>> +  memcpy(memory + pos, data, size);
>> +
>> +  pos += size;
>> +   }
>> +
>> +   void overwrite(const void *data, int size, int offset)
>> +   {
>> +  if (offset < 0 || offset + size > pos)
>> + return;
>> +  memcpy(memory + offset, data, size);
>> +   }
>> +
>> +   /* length is written to make reading safe */
>> +   void write_string(const char *str)
>> +   {
>> +  uint32_t len = str ? strlen(str) : 0;
>> +  write_uint32_t(len);
>> +
>> +  if (str)
>> + write(str, len + 1);
>> +   }
>> +
>> +   unsigned position()
>> +   {
>> +  return pos;
>> +   }
>> +
>> +   /**
>> +* check if some data was written
>> +*/
>> +   bool data_was_written(void *data, uint32_t id)
>> +   {
>> +  hash_entry *entry =
>> + _mesa_hash_table_search(data_hash, hash_value,
>> +(void*) (intptr_t) id);
>>
>
> This isn't an efficient way to use a hashtable.  You're always passing the
> same hash_value (the one computed in the constructor), which means that
> every single hash entry will be stored in the same chain, and lookups will
> be really slow.
>
> Since you're only using this hashtable to detect duplicate glsl_types,
> there's a much easier approach: take advantage of the fact that glsl_type
> is a flyweigh

[Mesa-dev] slow depth test on Intel gen7+

2014-02-04 Thread Chia-I Wu
Hi,

I have been looking at performance issues in a benchmark for a while, and have
identified three issues so far, all related to depth test.

The first issue is that 16-bit depth buffers are slow.  This is already known
and is fixed for GLES contexts by commit 73bc6061f5c3b6a3bb7a8114bb2e1a.  It is
not fixed for GL contexts because the GL 3.0 spec states, in section 3.9.1,

  Requesting one of these (sized) internal formats for any texture type will
  allocate exactly the internal component sizes and ...

However, all GL versions other than 3.0 do not require exact component sizes.
There are several possible fixes:

 - ignore the GL 3.0 spec and always allocate 24-bit depth buffers for best
   performance
 - add a drirc option to determine whether to ignore the GL 3.0 spec or not
   in this case (the name of the option is to be determined)
 - allocate 24-bit depth buffers for all except GL 3.0 contexts

The second issue is, when HiZ is enabled, the rendering of small triangles
hitting the same HiZ pixel blocks is slow.  There turned out to be a register
to control the HiZ behavior and a patch to change the behavior has been
committed to the kernel drm-intel tree.

The third issue is the rendering of small triangles becomes very slow when
depth test is enabled and

 - GEN7_WM_PSCDEPTH_ON is set in 3DSTATE_WM, or
 - GEN7_WM_KILL_ENABLE is set in 3DSTATE_WM _and_ some pixels are killed

In both cases, early-Z is likely disabled, and that could easily explain the
slowdown.  But there are other ways to disable early-Z, such as

 - set the depth test function to GL_ALWAYS
 - make sure GEN7_WM_KILL_ENABLE is set but do not kill pixels

and they do not hurt the performance as much.  I have yet to figure out the
cause for the dramatic slowdown.  I was hoping there would be another magic
register to fix it, but I have not found one yet.  Any ideas?

-- 
o...@lunarg.com


Re: [Mesa-dev] [PATCH] centroid affects interpolation

2014-02-04 Thread Kenneth Graunke
On 02/04/2014 05:01 AM, Kevin Rogovin wrote:
> Place centroid keyword as an interpolation qualifier.
> Previously was a storage qualifier. Fixes front end
> to accept input of the form "centroid in type variable"

No, it doesn't.  Without your patch, Mesa successfully compiles the
following shader:

#version 130
centroid in float foo;

which is of the form "centroid in type variable".  Chris is right - the
specs are very clear that 'centroid in' was a storage qualifier, and if
420pack is enabled, 'centroid' becomes an auxiliary storage qualifier.





Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
Whoops, I discovered another issue:


On 29 January 2014 01:24, Tapani Pälli  wrote:

> Class will be used by the shader binary cache implementation.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/memory_writer.h | 188
> +++
>  1 file changed, 188 insertions(+)
>  create mode 100644 src/glsl/memory_writer.h
>
> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
> new file mode 100644
> index 000..979169f
> --- /dev/null
> +++ b/src/glsl/memory_writer.h
> @@ -0,0 +1,188 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#pragma once
> +#ifndef MEMORY_WRITER_H
> +#define MEMORY_WRITER_H
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "main/hash_table.h"
> +
> +#ifdef __cplusplus
> +/**
> + * Helper class for writing data to memory
> + *
> + * This class maintains a dynamically-sized memory buffer and allows
> + * for data to be efficiently appended to it with automatic resizing.
> + */
> +class memory_writer
> +{
> +public:
> +   memory_writer() :
> +  memory(NULL),
> +  curr_size(0),
> +  pos(0)
> +   {
> +  data_hash = _mesa_hash_table_create(0, int_equal);
> +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
> +   }
> +
> +   ~memory_writer()
> +   {
> +  free(memory);
> +  _mesa_hash_table_destroy(data_hash, NULL);
> +   }
> +
> +   /* user wants to claim the memory */
> +   char *release_memory(size_t *size)
> +   {
> +  /* final realloc to free allocated but unused memory */
> +  char *result = (char *) realloc(memory, pos);
> +  *size = pos;
> +  memory = NULL;
> +  curr_size = 0;
> +  pos = 0;
> +  return result;
> +   }
> +
> +/**
> + * write functions per type
> + */
> +#define DECL_WRITER(type) void write_ ##type (const type data) {\
> +   write(&data, sizeof(type));\
> +}
> +
> +   DECL_WRITER(int32_t);
> +   DECL_WRITER(int64_t);
> +   DECL_WRITER(uint8_t);
> +   DECL_WRITER(uint32_t);
> +
> +   void write_bool(bool data)
> +   {
> +  uint8_t val = data;
> +  write_uint8_t(val);
> +   }
> +
> +   /* write function that reallocates more memory if required */
> +   void write(const void *data, int size)
> +   {
> +  if (!memory || pos > (curr_size - size))
> + if (!grow(size))
> +return;
> +
> +  memcpy(memory + pos, data, size);
> +
> +  pos += size;
> +   }
> +
> +   void overwrite(const void *data, int size, int offset)
> +   {
> +  if (offset < 0 || offset + size > pos)
> + return;
> +  memcpy(memory + offset, data, size);
> +   }
> +
> +   /* length is written to make reading safe */
> +   void write_string(const char *str)
> +   {
> +  uint32_t len = str ? strlen(str) : 0;
> +  write_uint32_t(len);
> +
> +  if (str)
> + write(str, len + 1);
> +   }
> +
> +   unsigned position()
> +   {
> +  return pos;
> +   }
> +
> +   /**
> +* check if some data was written
> +*/
> +   bool data_was_written(void *data, uint32_t id)
> +   {
> +  hash_entry *entry =
> + _mesa_hash_table_search(data_hash, hash_value,
> +(void*) (intptr_t) id);
>

This isn't an efficient way to use a hashtable.  You're always passing the
same hash_value (the one computed in the constructor), which means that
every single hash entry will be stored in the same chain, and lookups will
be really slow.

Since you're only using this hashtable to detect duplicate glsl_types,
there's a much easier approach: take advantage of the fact that glsl_type
is a flyweight class (in other words, code elsewhere in Mesa ensures that
no duplicate glsl_type objects exist anywhere in memory, so any two
glsl_type pointers can be compared simply using pointer equality).  Because
of that, you should be abl
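
To make the cost of a constant hash value concrete, here is a toy chained
table that counts key comparisons.  It is only an illustration: toy_table and
total_comparisons are invented names, and the structure is not Mesa's
hash_table implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy chained hash table holding integer keys (not Mesa's hash_table).
struct toy_table {
   std::vector<std::vector<uint32_t>> buckets;
   std::size_t comparisons = 0;

   explicit toy_table(std::size_t num_buckets) : buckets(num_buckets) {}

   void insert(uint32_t hash, uint32_t key)
   {
      buckets[hash % buckets.size()].push_back(key);
   }

   bool search(uint32_t hash, uint32_t key)
   {
      // Walk the chain selected by the hash, counting every key compare.
      for (uint32_t k : buckets[hash % buckets.size()]) {
         ++comparisons;
         if (k == key)
            return true;
      }
      return false;
   }
};

// Insert n ids, look each one up again, and report the total number of
// key comparisons.  With constant_hash set, every entry shares one hash
// value, as in the patch under review.
std::size_t total_comparisons(bool constant_hash, uint32_t n)
{
   toy_table table(64);

   for (uint32_t id = 0; id < n; id++)
      table.insert(constant_hash ? 12345u : id, id);
   for (uint32_t id = 0; id < n; id++)
      table.search(constant_hash ? 12345u : id, id);

   return table.comparisons;
}
```

With 64 ids and 64 buckets, hashing by id costs one comparison per lookup
(64 in total), while the constant hash puts everything in one chain and costs
1 + 2 + ... + 64 = 2080 comparisons.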

Re: [Mesa-dev] [PATCH 01/10] glsl: memory_writer helper class for data serialization

2014-02-04 Thread Paul Berry
On 29 January 2014 01:24, Tapani Pälli  wrote:

> Class will be used by the shader binary cache implementation.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/memory_writer.h | 188
> +++
>  1 file changed, 188 insertions(+)
>  create mode 100644 src/glsl/memory_writer.h
>
> diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h
> new file mode 100644
> index 000..979169f
> --- /dev/null
> +++ b/src/glsl/memory_writer.h
> @@ -0,0 +1,188 @@
> +/* -*- c++ -*- */
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#pragma once
> +#ifndef MEMORY_WRITER_H
> +#define MEMORY_WRITER_H
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "main/hash_table.h"
> +
> +#ifdef __cplusplus
> +/**
> + * Helper class for writing data to memory
> + *
> + * This class maintains a dynamically-sized memory buffer and allows
> + * for data to be efficiently appended to it with automatic resizing.
> + */
> +class memory_writer
> +{
> +public:
> +   memory_writer() :
> +  memory(NULL),
> +  curr_size(0),
> +  pos(0)
> +   {
> +  data_hash = _mesa_hash_table_create(0, int_equal);
> +  hash_value = _mesa_hash_data(this, sizeof(memory_writer));
> +   }
> +
> +   ~memory_writer()
> +   {
> +  free(memory);
> +  _mesa_hash_table_destroy(data_hash, NULL);
> +   }
> +
> +   /* user wants to claim the memory */
> +   char *release_memory(size_t *size)
> +   {
> +  /* final realloc to free allocated but unused memory */
> +  char *result = (char *) realloc(memory, pos);
> +  *size = pos;
> +  memory = NULL;
> +  curr_size = 0;
> +  pos = 0;
> +  return result;
> +   }
> +
> +/**
> + * write functions per type
> + */
> +#define DECL_WRITER(type) void write_ ##type (const type data) {\
> +   write(&data, sizeof(type));\
> +}
> +
> +   DECL_WRITER(int32_t);
> +   DECL_WRITER(int64_t);
> +   DECL_WRITER(uint8_t);
> +   DECL_WRITER(uint32_t);
> +
> +   void write_bool(bool data)
> +   {
> +  uint8_t val = data;
> +  write_uint8_t(val);
> +   }
> +
> +   /* write function that reallocates more memory if required */
> +   void write(const void *data, int size)
> +   {
> +  if (!memory || pos > (curr_size - size))
> + if (!grow(size))
> +return;
>

Thanks for making the changes I asked for about error handling.  I think
this is much better than what we had before.  One minor comment: it is
helpful in debugging if we can make sure that the program stops executing
the moment an error condition is detected (because this gives us the best
chance of being able to break in with the debugger and be close to the
cause of the error).  Therefore, I'd recommend doing something like this:

if (!memory || pos > (curr_size - size)) {
   if (!grow(size)) {
  assert(!"Out of memory while serializing a shader");
  return;
   }
}

In release builds this is equivalent to what you wrote, but in debug builds
it ensures that the program will halt if it runs out of memory.  That way
if we ever happen to introduce a bug that makes the serializer go into an
infinite loop writing data, we'll hit the assertion rather than just loop
forever.


> +
> +  memcpy(memory + pos, data, size);
> +
> +  pos += size;
> +   }
> +
> +   void overwrite(const void *data, int size, int offset)
> +   {
> +  if (offset < 0 || offset + size > pos)
> + return;
>

Similarly, there should be an assertion in this return path, since this
should never happen except in the case of a bug elsewhere in Mesa.
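
Putting both suggestions together, a cut-down writer might look like the
sketch below.  It is an assumption-laden illustration rather than the patch
itself: tiny_writer is a made-up name, grow() is a guess at a doubling
strategy (the real grow() is not shown in this thread), and sizes are size_t
instead of int so the bounds checks cannot go negative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for memory_writer, reduced to write/overwrite.
class tiny_writer {
public:
   tiny_writer() : memory(nullptr), curr_size(0), pos(0) {}
   ~tiny_writer() { std::free(memory); }
   tiny_writer(const tiny_writer &) = delete;
   tiny_writer &operator=(const tiny_writer &) = delete;

   bool write(const void *data, std::size_t size)
   {
      if (!memory || size > curr_size - pos) {
         if (!grow(size)) {
            // Stops a debug build at the point of failure; release
            // builds fall through to the early return.
            assert(!"Out of memory while serializing a shader");
            return false;
         }
      }
      std::memcpy(memory + pos, data, size);
      pos += size;
      return true;
   }

   bool overwrite(const void *data, std::size_t size, std::size_t offset)
   {
      if (offset > pos || size > pos - offset) {
         // Only reachable through a bug in the caller.
         assert(!"overwrite() outside of the written range");
         return false;
      }
      std::memcpy(memory + offset, data, size);
      return true;
   }

   std::size_t position() const { return pos; }
   const char *data() const { return memory; }

private:
   // Assumed growth policy: double until the pending write fits.
   bool grow(std::size_t size)
   {
      std::size_t new_size = curr_size ? curr_size : 64;
      while (new_size < pos + size)
         new_size *= 2;
      char *p = static_cast<char *>(std::realloc(memory, new_size));
      if (!p)
         return false;
      memory = p;
      curr_size = new_size;
      return true;
   }

   char *memory;
   std::size_t curr_size;
   std::size_t pos;
};
```

Returning bool from write() and overwrite() is a choice made for the sketch so
callers can still notice the failure in builds where assert() compiles away.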


> +  memcpy(memory + offset, data, size);
> +   }
> +
> +   /* length is written to make reading safe */
> +   void write_string(const char *str)
> +   {
> +  uint32_t len = str ? strlen(str) : 0;
> +  write_uint32_t(len);
>

There's 

Re: [Mesa-dev] [PATCH 1/3] gallivm: handle huge number of immediates

2014-02-04 Thread Zack Rusin
> > reasons. This commit adds code to skip that performance optimization
> > and always use just the dynamically allocated immediates if the
> > number of them is too great.
> 
> So is there any limit on the number of immediates now?

Technically not. Practically other parts of the code will max out and assert at 
anything greater than 4096 which is what sm4 defines as maximum for temps. So 
at least theoretically the gallivm code will just work if that limit is 
increased elsewhere.

z


Re: [Mesa-dev] [PATCH 3/3] tgsi/ureg: increase the number of immediates

2014-02-04 Thread Zack Rusin
Yes, they simply always behave as if they were accessed indirectly from our 
code, but llvm seems to be pretty good at moving all of those accesses to 
registers (aka. eliminating alloca's) if they're not actually indirectly 
indexed, so it all ends up pretty.

z

- Original Message -
> Am 05.02.2014 01:34, schrieb Zack Rusin:
> > ureg_program is allocated on the heap so we can just bump the
> > number of immediates that it can handle. It's needed for d3d10.
> > 
> > Signed-off-by: Zack Rusin 
> > ---
> >  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> > b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> > index f06858e..f928f57 100644
> > --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> > +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> > @@ -77,7 +77,7 @@ struct ureg_tokens {
> >  #define UREG_MAX_SYSTEM_VALUE PIPE_MAX_ATTRIBS
> >  #define UREG_MAX_OUTPUT PIPE_MAX_SHADER_OUTPUTS
> >  #define UREG_MAX_CONSTANT_RANGE 32
> > -#define UREG_MAX_IMMEDIATE 256
> > +#define UREG_MAX_IMMEDIATE 4096
> >  #define UREG_MAX_ADDR 2
> >  #define UREG_MAX_PRED 1
> >  #define UREG_MAX_ARRAY_TEMPS 256
> > 
> 
> Series looks good to me. llvm can still perform all optimizations on
> such immediates right?
> 
> Roland
> 


Re: [Mesa-dev] [PATCH 3/3] i965: Bump MaxTexMbytes from 1GB to 1.5GB.

2014-02-04 Thread Kenneth Graunke
On 02/04/2014 10:37 AM, Daniel Vetter wrote:
> On Sun, Feb 02, 2014 at 03:16:45AM -0800, Kenneth Graunke wrote:
>> Even with the other limits raised, TestProxyTexImage would still reject
>> textures > 1GB in size.  This is an artificial limit; nothing prevents
>> us from having a larger texture.  I stayed shy of 2GB to avoid the
>> larger-than-aperture situation.
>>
>> For 3D textures, this raises the effective limit:
>>  - RGBA8:   645 -> 738
>>  - RGBA16:  512 -> 586
>>  - RGBA32F: 406 -> 465
>>
>> Cc: i...@freedesktop.org
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
>> b/src/mesa/drivers/dri/i965/brw_context.c
>> index 17b75e1..66d6ccb 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.c
>> +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> @@ -306,6 +306,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>>ctx->Const.MaxTextureLevels = MAX_TEXTURE_LEVELS;
>> ctx->Const.Max3DTextureLevels = 12; /* 2048 */
>> ctx->Const.MaxCubeTextureLevels = 14; /* 8192 */
>> +   ctx->Const.MaxTextureMbytes = 1536;
> 
> Original gen4 (i.e. i965g) only has 512 MB of aperture ... Also going this
> high runs the risk that you fool up with fragmentation, but meh.
> 
> You'd need to get at bufmgr_gem->gtt_size somehow. At least the current
> code is safe for address spaces > 4G.
> -Daniel

This whole check is fairly stupid.  It denies things that could work (as
evidenced by this patch), and doesn't prevent things that won't work
(you can always create 2 or 3 large textures, each of which pass this
check, but exceed the aperture when used together.)

So, I don't think that making the formula more precise by getting at
bufmgr_gem->gtt_size is really useful...I think we should just stop
using this code altogether.
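
As a sanity check, the 3D texture limits quoted in the commit message can be
reproduced by finding the largest cube whose single mip level fits under the
size limit.  This is a sketch under assumptions: max_cube_edge is a made-up
helper, and the comparison here is a plain byte count, which may differ in
rounding from what the proxy texture test actually does.

```cpp
#include <cassert>
#include <cstdint>

// Largest n such that an n x n x n texture with the given texel size
// fits within limit_mbytes (illustrative helper, not Mesa code).
uint32_t max_cube_edge(uint64_t limit_mbytes, uint64_t bytes_per_texel)
{
   const uint64_t limit = limit_mbytes * 1024 * 1024;
   uint32_t n = 0;

   // Integer search avoids floating-point cube-root rounding problems
   // around exact cubes such as 512^3.
   while (uint64_t(n + 1) * (n + 1) * (n + 1) * bytes_per_texel <= limit)
      n++;

   return n;
}
```

With 4, 8 and 16 bytes per texel for RGBA8, RGBA16 and RGBA32F, a 1024 MB
limit gives 645, 512 and 406, and a 1536 MB limit gives 738, 586 and 465,
matching the numbers above.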





[Mesa-dev] [PATCH v3 1/8] loader: Get driver name from udev hwdb when available

2014-02-04 Thread Kristian Høgsberg
The udev hwdb is a mechanism for applying udev properties to devices at
hotplug time.  The hwdb text files are compiled into a binary database
that lets udev efficiently look up and apply properties to devices that
match a given modalias.

This patch exports the mesa PCI ID tables as hwdb files and extends the
loader code to try to look up the driver name from the DRI_DRIVER udev
property.  The benefits to this approach are:

 - No longer PCI specific, any device udev can match with a modalias can
   be assigned a DRI driver.

 - Available outside mesa; writing a DRI2 compatible generic DDX with
   glamor needs to know the DRI driver name to send to the client.

 - Can be overridden by custom udev rules.

Signed-off-by: Kristian Høgsberg 
---
 configure.ac| 14 ++
 src/loader/.gitignore   |  1 +
 src/loader/Makefile.am  | 17 +
 src/loader/dump-hwdb.sh | 29 +
 src/loader/loader.c | 33 +
 src/loader/loader.h |  2 +-
 6 files changed, 87 insertions(+), 9 deletions(-)
 create mode 100644 src/loader/.gitignore
 create mode 100755 src/loader/dump-hwdb.sh

diff --git a/configure.ac b/configure.ac
index ba158e8..9bf27d4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -771,6 +771,20 @@ if test "x$have_libdrm" = xyes; then
DEFINES="$DEFINES -DHAVE_LIBDRM"
 fi
 
+# This /lib prefix does not change with 32/64 bits it's always /lib
+case "$prefix" in
+/usr) default_udevhwdbdir=/lib/udev/hwdb.d ;;
+NONE) default_udevhwdbdir=${ac_default_prefix}/lib/udev/hwdb.d ;;
+*)default_udevhwdbdir=$prefix/lib/udev/hwdb.d ;;
+esac
+
+AC_ARG_WITH([udev-hwdb-dir],
+[AS_HELP_STRING([--with-udev-hwdb-dir=DIR],
+[directory for the udev hwdb @<:@/lib/udev/hwdb.d@:>@])],
+[udevhwdbdir="$withval"],
+[udevhwdbdir=$default_udevhwdbdir])
+AC_SUBST([udevhwdbdir])
+
 PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
   have_libudev=yes, have_libudev=no)
 
diff --git a/src/loader/.gitignore b/src/loader/.gitignore
new file mode 100644
index 000..e11c470
--- /dev/null
+++ b/src/loader/.gitignore
@@ -0,0 +1 @@
+20-dri-driver.hwdb
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index bddf7ac..f407c78 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -41,3 +41,20 @@ libloader_la_LIBADD = \
 endif
 
 libloader_la_SOURCES = $(LOADER_C_FILES)
+
+
+dist_udevhwdb_DATA = 20-dri-driver.hwdb
+
+# Update hwdb on installation. Do not bother if installing
+# in DESTDIR, since this is likely for packaging purposes.
+install-data-hook :
+   -test -z "$(DESTDIR)" && udevadm hwdb --update
+
+export DEFINES
+
+20-dri-driver.hwdb :
+   $(srcdir)/dump-hwdb.sh > $@-tmp && mv $@-tmp $@
+
+20-dri-driver.hwdb : dump-hwdb.sh ../../include/pci_ids/*.h
+
+CLEANFILES = 20-dri-driver.hwdb
diff --git a/src/loader/dump-hwdb.sh b/src/loader/dump-hwdb.sh
new file mode 100755
index 000..b4224a9
--- /dev/null
+++ b/src/loader/dump-hwdb.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+set -e
+
+PROP_NAME=DRI_DRIVER
+
+while read vendor driver; do
+pci_id_file=../../include/pci_ids/${driver}_pci_ids.h
+if ! test -r $pci_id_file; then
+printf "pci:v%08x*bc03*\n $PROP_NAME=$driver\n\n" $vendor
+continue
+fi
+
+gcc -E $DEFINES $pci_id_file |
+while IFS=' (,' read c id rest; do
+test -z "$id" && continue
+printf "pci:v%08xd%08x*\n $PROP_NAME=$driver\n\n" $vendor $id
+done
+done <= 0);
+   return (*driver != NULL) || (*chip_id >= 0);
 }
 
 #elif defined(ANDROID) && !defined(__NOT_HAVE_DRM_H)
@@ -210,11 +218,12 @@ out:
 #include 
 
 int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
+loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
 {
drmVersionPtr version;
 
*chip_id = -1;
+   *driver = NULL;
 
version = drmGetVersion(fd);
if (!version) {
@@ -276,7 +285,7 @@ loader_get_pci_id_for_fd(int fd, int *vendor_id, int 
*chip_id)
 #else
 
 int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
+loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
 {
return 0;
 }
@@ -325,7 +334,7 @@ loader_get_driver_for_fd(int fd, unsigned driver_types)
if (!driver_types)
   driver_types = _LOADER_GALLIUM | _LOADER_DRI;
 
-   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id)) {
+   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id, &driver)) {
 
 #ifndef __NOT_HAVE_DRM_H
   /* fallback to drmGetVersion(): */
@@ -345,6 +354,9 @@ loader_get_driver_for_fd(int fd, unsigned driver_types)
   return driver;
}
 
+   if (driver)
+  goto out;
+
for (i = 0; driver_map[i].driver; i++) {
   if (vendor_id != driver_map[i].vendor_id)
  continue;
@@ -365,9 +377,14 @@ loader_get_driver_for_fd(int fd, unsigned driver_types)
}
 
 out:
-   log_(driver ? _LOADER_DEBUG : _LOADER_WARNING,
- "pci id for fd %d: %04x:%04x, driv

[Mesa-dev] [PATCH v3 2/8] gallium-loader: Don't worry about PCI IDs in gallium-loader

2014-02-04 Thread Kristian Høgsberg
There's no reason to look this up in the gallium loader code now that
the generic loader handles all this.  This allows us to not export
loader_get_pci_id_for_fd() from loader.c.

Signed-off-by: Kristian Høgsberg 
---
 src/gallium/auxiliary/pipe-loader/pipe_loader.h | 16 
 src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 10 +-
 src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c  |  1 -
 src/loader/loader.c |  2 +-
 src/loader/loader.h |  3 ---
 5 files changed, 2 insertions(+), 30 deletions(-)

diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.h 
b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
index e0525df..68aacf9 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader.h
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
@@ -41,26 +41,10 @@ extern "C" {
 
 struct pipe_screen;
 
-enum pipe_loader_device_type {
-   PIPE_LOADER_DEVICE_SOFTWARE,
-   PIPE_LOADER_DEVICE_PCI,
-   PIPE_LOADER_DEVICE_PLATFORM,
-   NUM_PIPE_LOADER_DEVICE_TYPES
-};
-
 /**
  * A device known to the pipe loader.
  */
 struct pipe_loader_device {
-   enum pipe_loader_device_type type;
-
-   union {
-  struct {
- int vendor_id;
- int chip_id;
-  } pci;
-   } u; /**< Discriminated by \a type */
-
const char *driver_name;
const struct pipe_loader_ops *ops;
 };
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
index d6869fd..b201bc0 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
@@ -116,15 +116,7 @@ boolean
 pipe_loader_drm_probe_fd(struct pipe_loader_device **dev, int fd)
 {
struct pipe_loader_drm_device *ddev = CALLOC_STRUCT(pipe_loader_drm_device);
-   int vendor_id, chip_id;
-
-   if (loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id)) {
-  ddev->base.type = PIPE_LOADER_DEVICE_PCI;
-  ddev->base.u.pci.vendor_id = vendor_id;
-  ddev->base.u.pci.chip_id = chip_id;
-   } else {
-  ddev->base.type = PIPE_LOADER_DEVICE_PLATFORM;
-   }
+
ddev->base.ops = &pipe_loader_drm_ops;
ddev->fd = fd;
 
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c b/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c
index 95a4f84..c1d5f66 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c
@@ -59,7 +59,6 @@ pipe_loader_sw_probe(struct pipe_loader_device **devs, int ndev)
   if (i < ndev) {
  struct pipe_loader_sw_device *sdev = CALLOC_STRUCT(pipe_loader_sw_device);
 
- sdev->base.type = PIPE_LOADER_DEVICE_SOFTWARE;
  sdev->base.driver_name = "swrast";
  sdev->base.ops = &pipe_loader_sw_ops;
  sdev->ws = backends[i]();
diff --git a/src/loader/loader.c b/src/loader/loader.c
index 4119f77..51bac3e 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -157,7 +157,7 @@ udev_device_new_from_fd(struct udev *udev, int fd)
return device;
 }
 
-int
+static int
 loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
 {
struct udev *udev = NULL;
diff --git a/src/loader/loader.h b/src/loader/loader.h
index 5771280..30fa26e 100644
--- a/src/loader/loader.h
+++ b/src/loader/loader.h
@@ -32,9 +32,6 @@
 #define _LOADER_DRI  (1 << 0)
 #define _LOADER_GALLIUM  (1 << 1)
 
-int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver);
-
 char *
 loader_get_driver_for_fd(int fd, unsigned driver_types);
 
-- 
1.8.4.2



[Mesa-dev] [PATCH v3 4/8] loader: Factor out drmGetVersion() fallback code

2014-02-04 Thread Kristian Høgsberg
Signed-off-by: Kristian Høgsberg 
---
 src/loader/loader.c | 44 +---
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 88210df..7c38f94 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -115,6 +115,29 @@ lookup_driver_for_pci_id(int vendor_id, int chip_id, unsigned int driver_types)
return NULL;
 }
 
+static char *
+fallback_to_kernel_name(int fd)
+{
+   char *driver = NULL;
+
+#ifndef __NOT_HAVE_DRM_H
+   /* fallback to drmGetVersion(): */
+   drmVersionPtr version = drmGetVersion(fd);
+
+   if (!version) {
+  log_(_LOADER_WARNING, "failed to get driver name for fd %d\n", fd);
+  return NULL;
+   }
+
+   driver = strndup(version->name, version->name_len);
+   log_(_LOADER_INFO, "using driver %s for %d\n", driver, fd);
+
+   drmFreeVersion(version);
+#endif
+
+   return driver;
+}
+
 #ifdef HAVE_LIBUDEV
 #include 
 
@@ -357,25 +380,8 @@ loader_get_driver_for_fd(int fd, unsigned int driver_types)
if (!driver_types)
   driver_types = _LOADER_GALLIUM | _LOADER_DRI;
 
-   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id, &driver)) {
-
-#ifndef __NOT_HAVE_DRM_H
-  /* fallback to drmGetVersion(): */
-  drmVersionPtr version = drmGetVersion(fd);
-
-  if (!version) {
- log_(_LOADER_WARNING, "failed to get driver name for fd %d\n", fd);
- return NULL;
-  }
-
-  driver = strndup(version->name, version->name_len);
-  log_(_LOADER_INFO, "using driver %s for %d\n", driver, fd);
-
-  drmFreeVersion(version);
-#endif
-
-  return driver;
-   }
+   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id, &driver))
+  return fallback_to_kernel_name(fd);
 
if (driver == NULL)
   driver = lookup_driver_for_pci_id(vendor_id, chip_id, driver_types);
-- 
1.8.4.2



[Mesa-dev] [PATCH v3 7/8] loader: Move loader_get_driver_for_fd() into the three different #ifdef cases

2014-02-04 Thread Kristian Høgsberg
Having one function that tries to accommodate the different decision trees
for the three different #ifdef cases makes the logic hard to follow.  Now
that most of the complexity in loader_get_driver_for_fd() has been split
out into helper functions, we can just move loader_get_driver_for_fd() into
the #ifdefs and simplify the code for each case.

Signed-off-by: Kristian Høgsberg 
---
 src/loader/loader.c | 62 +++--
 1 file changed, 27 insertions(+), 35 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 7131d46..9bd3561 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -212,8 +212,8 @@ udev_device_new_from_fd(struct udev *udev, int fd)
return device;
 }
 
-static int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
+char *
+loader_get_driver_for_fd(int fd, unsigned int driver_types)
 {
struct udev *udev = NULL;
struct udev_device *device = NULL, *parent;
@@ -227,9 +227,8 @@ loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
(struct udev_device *));
UDEV_SYMBOL(struct udev *, udev_unref, (struct udev *));
const char *hwdb_driver;
-
-   *chip_id = -1;
-   *driver = NULL;
+   int vendor_id, chip_id;
+   char *driver = NULL;
 
udev = udev_new();
device = udev_device_new_from_fd(udev, fd);
@@ -245,25 +244,25 @@ loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
hwdb_driver = udev_device_get_property_value(parent, "DRI_DRIVER");
if (hwdb_driver != NULL) {
   log_(_LOADER_INFO, "using driver %s from udev hwdb", driver);
-  *driver = strdup(hwdb_driver);
+  driver = strdup(hwdb_driver);
   goto out;
}
 
pci_id = udev_device_get_property_value(parent, "PCI_ID");
-   if (pci_id == NULL ||
-   sscanf(pci_id, "%x:%x", vendor_id, chip_id) != 2) {
-  log_(_LOADER_WARNING, "MESA-LOADER: malformed or no PCI ID\n");
-  *chip_id = -1;
+   if (pci_id && sscanf(pci_id, "%x:%x", &vendor_id, &chip_id) == 2) {
+  driver = lookup_driver_for_pci_id(vendor_id, chip_id, driver_types);
   goto out;
}
 
+   driver = fallback_to_kernel_name(fd);
+
 out:
if (device)
   udev_device_unref(device);
if (udev)
   udev_unref(udev);
 
-   return (*driver != NULL) || (*chip_id >= 0);
+   return driver;
 }
 
 #elif defined(ANDROID) && defined(__HAVE_DRM_H)
@@ -338,12 +337,26 @@ loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
return (*chip_id >= 0);
 }
 
+char *
+loader_get_driver_for_fd(int fd, unsigned int driver_types)
+{
+   int vendor_id, chip_id;
+
+   if (!driver_types)
+  driver_types = _LOADER_GALLIUM | _LOADER_DRI;
+
+   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id))
+  return fallback_to_kernel_name(fd);
+
+   return lookup_driver_for_pci_id(vendor_id, chip_id, driver_types);
+}
+
 #else
 
-int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
+char *
+loader_get_driver_for_fd(int fd, unsigned int driver_types)
 {
-   return 0;
+   return fallback_to_kernel_name(fd);
 }
 
 #endif
@@ -381,27 +394,6 @@ out:
return device_name;
 }
 
-char *
-loader_get_driver_for_fd(int fd, unsigned int driver_types)
-{
-   int vendor_id, chip_id;
-   char *driver = NULL;
-
-   if (!driver_types)
-  driver_types = _LOADER_GALLIUM | _LOADER_DRI;
-
-   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id, &driver))
-  return fallback_to_kernel_name(fd);
-
-   if (driver == NULL)
-  driver = lookup_driver_for_pci_id(vendor_id, chip_id, driver_types);
-
-   if (driver == NULL)
-  log_(_LOADER_WARNING, "no driver %s for %d\n", fd);
-
-   return driver;
-}
-
 void
 loader_set_logger(void (*logger)(int level, const char *fmt, ...))
 {
-- 
1.8.4.2



[Mesa-dev] [PATCH v3 5/8] loader: Invert __NOT_HAVE_DRM_H to __HAVE_DRM_H

2014-02-04 Thread Kristian Høgsberg
Fix up double negations for easier readability.

Signed-off-by: Kristian Høgsberg 
---
 src/loader/Makefile.am | 8 
 src/loader/loader.c| 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index f407c78..1e0a140 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -29,15 +29,15 @@ libloader_la_CPPFLAGS = \
$(VISIBILITY_CFLAGS) \
$(LIBUDEV_CFLAGS)
 
-if !HAVE_LIBDRM
-libloader_la_CPPFLAGS += \
-   -D__NOT_HAVE_DRM_H
-else
+if HAVE_LIBDRM
 libloader_la_CPPFLAGS += \
$(LIBDRM_CFLAGS)
 
 libloader_la_LIBADD = \
$(LIBDRM_LIBS)
+else
+libloader_la_CPPFLAGS += \
+   -D__HAVE_DRM_H
 endif
 
 libloader_la_SOURCES = $(LOADER_C_FILES)
diff --git a/src/loader/loader.c b/src/loader/loader.c
index 7c38f94..319bcf5 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -73,7 +73,7 @@
 #endif
 #include "loader.h"
 
-#ifndef __NOT_HAVE_DRM_H
+#ifdef __HAVE_DRM_H
 #include 
 #endif
 
@@ -120,7 +120,7 @@ fallback_to_kernel_name(int fd)
 {
char *driver = NULL;
 
-#ifndef __NOT_HAVE_DRM_H
+#ifdef __HAVE_DRM_H
/* fallback to drmGetVersion(): */
drmVersionPtr version = drmGetVersion(fd);
 
@@ -256,7 +256,7 @@ out:
return (*driver != NULL) || (*chip_id >= 0);
 }
 
-#elif defined(ANDROID) && !defined(__NOT_HAVE_DRM_H)
+#elif defined(ANDROID) && defined(__HAVE_DRM_H)
 
 /* for i915 */
 #include 
-- 
1.8.4.2



[Mesa-dev] [PATCH v3 8/8] loader: Switch loader to match modalias strings instead of PCI IDs

2014-02-04 Thread Kristian Høgsberg
This lets us match any device on any bus, including platform devices.

Signed-off-by: Kristian Høgsberg 
---
 include/pci_ids/pci_id_driver_map.h | 81 -
 src/loader/Makefile.am  | 10 +++--
 src/loader/Makefile.sources |  3 +-
 src/loader/dump-hwdb.sh | 50 ---
 src/loader/loader.c | 56 +++--
 5 files changed, 68 insertions(+), 132 deletions(-)
 delete mode 100644 include/pci_ids/pci_id_driver_map.h

diff --git a/include/pci_ids/pci_id_driver_map.h b/include/pci_ids/pci_id_driver_map.h
deleted file mode 100644
index db9e07f..000
--- a/include/pci_ids/pci_id_driver_map.h
+++ /dev/null
@@ -1,81 +0,0 @@
-#ifndef _PCI_ID_DRIVER_MAP_H_
-#define _PCI_ID_DRIVER_MAP_H_
-
-#include 
-
-#ifndef ARRAY_SIZE
-#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
-#endif
-
-#ifndef __IS_LOADER
-#  error "Only include from loader.c"
-#endif
-
-static const int i915_chip_ids[] = {
-#define CHIPSET(chip, desc, name) chip,
-#include "pci_ids/i915_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int i965_chip_ids[] = {
-#define CHIPSET(chip, family, name) chip,
-#include "pci_ids/i965_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int r100_chip_ids[] = {
-#define CHIPSET(chip, name, family) chip,
-#include "pci_ids/radeon_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int r200_chip_ids[] = {
-#define CHIPSET(chip, name, family) chip,
-#include "pci_ids/r200_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int r300_chip_ids[] = {
-#define CHIPSET(chip, name, family) chip,
-#include "pci_ids/r300_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int r600_chip_ids[] = {
-#define CHIPSET(chip, name, family) chip,
-#include "pci_ids/r600_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int radeonsi_chip_ids[] = {
-#define CHIPSET(chip, name, family) chip,
-#include "pci_ids/radeonsi_pci_ids.h"
-#undef CHIPSET
-};
-
-static const int vmwgfx_chip_ids[] = {
-#define CHIPSET(chip, name, family) chip,
-#include "pci_ids/vmwgfx_pci_ids.h"
-#undef CHIPSET
-};
-
-static const struct {
-   int vendor_id;
-   const char *driver;
-   const int *chip_ids;
-   int num_chips_ids;
-   unsigned driver_types;
-} driver_map[] = {
-   { 0x8086, "i915", i915_chip_ids, ARRAY_SIZE(i915_chip_ids), _LOADER_DRI | _LOADER_GALLIUM },
-   { 0x8086, "i965", i965_chip_ids, ARRAY_SIZE(i965_chip_ids), _LOADER_DRI | _LOADER_GALLIUM },
-   { 0x1002, "radeon", r100_chip_ids, ARRAY_SIZE(r100_chip_ids), _LOADER_DRI },
-   { 0x1002, "r200", r200_chip_ids, ARRAY_SIZE(r200_chip_ids), _LOADER_DRI },
-   { 0x1002, "r300", r300_chip_ids, ARRAY_SIZE(r300_chip_ids), _LOADER_GALLIUM },
-   { 0x1002, "r600", r600_chip_ids, ARRAY_SIZE(r600_chip_ids), _LOADER_GALLIUM },
-   { 0x1002, "radeonsi", radeonsi_chip_ids, ARRAY_SIZE(radeonsi_chip_ids), _LOADER_GALLIUM},
-   { 0x10de, "nouveau", NULL, -1,  _LOADER_GALLIUM  },
-   { 0x15ad, "vmwgfx", vmwgfx_chip_ids, ARRAY_SIZE(vmwgfx_chip_ids), _LOADER_GALLIUM },
-   { 0x, NULL, NULL, 0 },
-};
-
-#endif /* _PCI_ID_DRIVER_MAP_H_ */
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index 1e0a140..f09185f 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -42,6 +42,10 @@ endif
 
 libloader_la_SOURCES = $(LOADER_C_FILES)
 
+loader.c : modalias-map.h
+
+modalias-map.h :
+   $(srcdir)/dump-hwdb.sh modalias-map > $@-tmp && mv $@-tmp $@
 
 dist_udevhwdb_DATA = 20-dri-driver.hwdb
 
@@ -53,8 +57,8 @@ install-data-hook :
 export DEFINES
 
 20-dri-driver.hwdb :
-   $(srcdir)/dump-hwdb.sh > $@-tmp && mv $@-tmp $@
+   $(srcdir)/dump-hwdb.sh hwdb > $@-tmp && mv $@-tmp $@
 
-20-dri-driver.hwdb : dump-hwdb.sh ../../include/pci_ids/*.h
+modalias-map.h 20-dri-driver.hwdb : dump-hwdb.sh ../../include/pci_ids/*.h
 
-CLEANFILES = 20-dri-driver.hwdb
+CLEANFILES = 20-dri-driver.hwdb modalias-map.h
diff --git a/src/loader/Makefile.sources b/src/loader/Makefile.sources
index 51a64ea..1201ef1 100644
--- a/src/loader/Makefile.sources
+++ b/src/loader/Makefile.sources
@@ -1,2 +1,3 @@
 LOADER_C_FILES := \
-   loader.c
\ No newline at end of file
+   loader.c \
+   driver-map.h
\ No newline at end of file
diff --git a/src/loader/dump-hwdb.sh b/src/loader/dump-hwdb.sh
index b4224a9..7d1d97c 100755
--- a/src/loader/dump-hwdb.sh
+++ b/src/loader/dump-hwdb.sh
@@ -2,21 +2,20 @@
 
 set -e
 
-PROP_NAME=DRI_DRIVER
+function modalias_list() {
+while read vendor driver; do
+pci_id_file=../../include/pci_ids/${driver}_pci_ids.h
+if ! test -r $pci_id_file; then
+printf "$driver pci:v%08x*bc03*\n" $vendor
+continue
+fi
 
-while read vendor driver; do
-pci_id_file=../../include/pci_ids/${driver}_pci_ids.h
-if ! test -r $pci_id_file; then
-printf "pci:v%08x*bc03*\n $PROP_NAME=$driver\n\n" $vendor
-continue
-fi
-
-gcc -E $DEFINES $pci_id_file |
-while IFS=' (,' read c id rest; do
-test -z "$id" && continue
-   

[Mesa-dev] [PATCH v3 6/8] loader: Move debug logging to where we find the driver

2014-02-04 Thread Kristian Høgsberg
Trying to figure out where a driver name comes from by looking at
whether or not chip_id is -1 isn't very robust.

Signed-off-by: Kristian Høgsberg 
---
 src/loader/loader.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 319bcf5..7131d46 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -105,11 +105,20 @@ lookup_driver_for_pci_id(int vendor_id, int chip_id, unsigned int driver_types)
  continue;
 
   if (driver_map[i].num_chips_ids == -1)
- return strdup(driver_map[i].driver);
+ goto out;
 
   for (j = 0; j < driver_map[i].num_chips_ids; j++)
  if (driver_map[i].chip_ids[j] == chip_id)
-return strdup(driver_map[i].driver);
+goto out;
+   }
+
+ out:
+   if (driver_map[i].driver) {
+  log_(_LOADER_DEBUG,
+   "pci id: %04x:%04x, driver %s from internal db",
+   vendor_id, chip_id, driver_map[i].driver);
+
+  return strdup(driver_map[i].driver);
}
 
return NULL;
@@ -235,6 +244,7 @@ loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
 
hwdb_driver = udev_device_get_property_value(parent, "DRI_DRIVER");
if (hwdb_driver != NULL) {
+  log_(_LOADER_INFO, "using driver %s from udev hwdb", driver);
   *driver = strdup(hwdb_driver);
   goto out;
}
@@ -386,13 +396,8 @@ loader_get_driver_for_fd(int fd, unsigned int driver_types)
if (driver == NULL)
   driver = lookup_driver_for_pci_id(vendor_id, chip_id, driver_types);
 
-   if (driver && chip_id == -1) {
-  log_(_LOADER_INFO, "using driver %s from udev hwdb", driver);
-   } else {
-  log_(driver ? _LOADER_DEBUG : _LOADER_WARNING,
-   "pci id for fd %d: %04x:%04x, driver %s",
-   fd, vendor_id, chip_id, driver);
-   }
+   if (driver == NULL)
+  log_(_LOADER_WARNING, "no driver %s for %d\n", fd);
 
return driver;
 }
-- 
1.8.4.2



[Mesa-dev] Use hwdb in loader

2014-02-04 Thread Kristian Høgsberg
This series fixes a few problems in the v2 udev hwdb patch (it broke on Ken's
recent #ifdef PRELIMINARY addition) and then cleans up the loader logic a
bit with the goal of making the loader use modalias strings to find
drivers.  modalias strings are of the form

  :

and allow the loader to look up a driver for any device that the Linux
kernel supports, regardless of the bus it's on.
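
To illustrate the idea of matching on modalias strings, here is a minimal
sketch that treats each table entry as a glob pattern and matches it with
POSIX fnmatch().  The struct, the table contents and the function name are
hypothetical and are not taken from this series; the PCI pattern only
mirrors the "pci:v%08x*bc03*" form that dump-hwdb.sh prints.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical table entry: a glob over modalias strings plus the driver
 * to load when it matches.  This is not Mesa's actual data structure. */
struct example_modalias_entry {
   const char *pattern;
   const char *driver;
};

static const struct example_modalias_entry example_map[] = {
   { "pci:v00008086*bc03*", "i965" },     /* Intel, display class (0x03) */
   { "pci:v000010DE*bc03*", "nouveau" },  /* NVIDIA, display class */
   { "platform:example-gpu", "example" }, /* made-up platform device */
   { NULL, NULL },
};

/* Return the driver name for the first matching pattern, or NULL. */
const char *
example_lookup_driver(const char *modalias)
{
   int i;

   for (i = 0; example_map[i].pattern; i++)
      if (fnmatch(example_map[i].pattern, modalias, 0) == 0)
         return example_map[i].driver;

   return NULL;
}
```

Because the bus name is simply the prefix of the string, PCI and platform
devices go through the same lookup path in this sketch.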

Kristian



[Mesa-dev] [PATCH v3 3/8] loader: Factor out code to map PCI ID to driver name

2014-02-04 Thread Kristian Høgsberg
Making this its own function cleans up loader_get_driver_for_fd() a bit
and simplifies the control flow.

Signed-off-by: Kristian Høgsberg 
---
 src/loader/loader.c | 51 +++
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 51bac3e..88210df 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -92,6 +92,29 @@ static void default_logger(int level, const char *fmt, ...)
 
 static void (*log_)(int level, const char *fmt, ...) = default_logger;
 
+static char *
+lookup_driver_for_pci_id(int vendor_id, int chip_id, unsigned int driver_types)
+{
+   int i, j;
+
+   for (i = 0; driver_map[i].driver; i++) {
+  if (vendor_id != driver_map[i].vendor_id)
+ continue;
+
+  if (!(driver_types & driver_map[i].driver_types))
+ continue;
+
+  if (driver_map[i].num_chips_ids == -1)
+ return strdup(driver_map[i].driver);
+
+  for (j = 0; j < driver_map[i].num_chips_ids; j++)
+ if (driver_map[i].chip_ids[j] == chip_id)
+return strdup(driver_map[i].driver);
+   }
+
+   return NULL;
+}
+
 #ifdef HAVE_LIBUDEV
 #include 
 
@@ -326,9 +349,9 @@ out:
 }
 
 char *
-loader_get_driver_for_fd(int fd, unsigned driver_types)
+loader_get_driver_for_fd(int fd, unsigned int driver_types)
 {
-   int vendor_id, chip_id, i, j;
+   int vendor_id, chip_id;
char *driver = NULL;
 
if (!driver_types)
@@ -354,29 +377,9 @@ loader_get_driver_for_fd(int fd, unsigned driver_types)
   return driver;
}
 
-   if (driver)
-  goto out;
-
-   for (i = 0; driver_map[i].driver; i++) {
-  if (vendor_id != driver_map[i].vendor_id)
- continue;
-
-  if (!(driver_types & driver_map[i].driver_types))
- continue;
+   if (driver == NULL)
+  driver = lookup_driver_for_pci_id(vendor_id, chip_id, driver_types);
 
-  if (driver_map[i].num_chips_ids == -1) {
- driver = strdup(driver_map[i].driver);
- goto out;
-  }
-
-  for (j = 0; j < driver_map[i].num_chips_ids; j++)
- if (driver_map[i].chip_ids[j] == chip_id) {
-driver = strdup(driver_map[i].driver);
-goto out;
- }
-   }
-
-out:
if (driver && chip_id == -1) {
   log_(_LOADER_INFO, "using driver %s from udev hwdb", driver);
} else {
-- 
1.8.4.2



Re: [Mesa-dev] [PATCH 3/3] tgsi/ureg: increase the number of immediates

2014-02-04 Thread Roland Scheidegger
Am 05.02.2014 01:34, schrieb Zack Rusin:
> ureg_program is allocated on the heap so we can just bump the
> number of immediates that it can handle. It's needed for d3d10.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
> b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> index f06858e..f928f57 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> @@ -77,7 +77,7 @@ struct ureg_tokens {
>  #define UREG_MAX_SYSTEM_VALUE PIPE_MAX_ATTRIBS
>  #define UREG_MAX_OUTPUT PIPE_MAX_SHADER_OUTPUTS
>  #define UREG_MAX_CONSTANT_RANGE 32
> -#define UREG_MAX_IMMEDIATE 256
> +#define UREG_MAX_IMMEDIATE 4096
>  #define UREG_MAX_ADDR 2
>  #define UREG_MAX_PRED 1
>  #define UREG_MAX_ARRAY_TEMPS 256
> 

Series looks good to me. LLVM can still perform all optimizations on
such immediates, right?

Roland


Re: [Mesa-dev] [PATCH v3 7/9] glsl: add gl_InvocationID variable for ARB_gpu_shader5

2014-02-04 Thread Dave Airlie
On Wed, Feb 5, 2014 at 9:07 AM, Jordan Justen  wrote:
> v2:
>  * Make gl_InstanceID a system value

typo ^^ I assume you mean gl_InvocationID.

Dave.
>
> Signed-off-by: Jordan Justen 
> Reviewed-by: Paul Berry 
> ---
>  src/glsl/builtin_variables.cpp | 2 ++
>  src/mesa/main/mtypes.h | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index d6bc3c0..d9ed2db 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -782,6 +782,8 @@ builtin_variable_generator::generate_gs_special_vars()
> add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> if (state->ARB_viewport_array_enable)
>add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +   if (state->ARB_gpu_shader5_enable)
> +  add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
>
> /* Although gl_PrimitiveID appears in tessellation control and 
> tessellation
>  * evaluation shaders, it has a different function there than it has in
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index b76b984..10d4206 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2015,6 +2015,7 @@ typedef enum
> SYSTEM_VALUE_SAMPLE_ID,  /**< Fragment shader only */
> SYSTEM_VALUE_SAMPLE_POS, /**< Fragment shader only */
> SYSTEM_VALUE_SAMPLE_MASK_IN, /**< Fragment shader only */
> +   SYSTEM_VALUE_INVOCATION_ID,  /**< Geometry shader only */
> SYSTEM_VALUE_MAX /**< Number of values */
>  } gl_system_value;
>
> --
> 1.8.5.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] gallivm: make sure analysis works with large number of immediates

2014-02-04 Thread Brian Paul

On 02/04/2014 05:34 PM, Zack Rusin wrote:

We need to handle a lot more immediates and in order to do that
we also switch from allocating this structure on the stack to
allocating it on the heap.

Signed-off-by: Zack Rusin 



Reviewed-by: Brian Paul 



Re: [Mesa-dev] [PATCH 1/3] gallivm: handle huge number of immediates

2014-02-04 Thread Brian Paul

On 02/04/2014 05:34 PM, Zack Rusin wrote:

We only supported up to 256 immediates, which isn't enough. We had
code which was allocating immediates as an allocated array, but it
was always used along a statically backed array for performance


"along with a"



reasons. This commit adds code to skip that performance optimization
and always use just the dynamically allocated immediates if the
number of them is too great.


So is there any limit on the number of immediates now?


LGTM.  Reviewed-by: Brian Paul 




Re: [Mesa-dev] [PATCH 3/3] tgsi/ureg: increase the number of immediates

2014-02-04 Thread Brian Paul

On 02/04/2014 05:34 PM, Zack Rusin wrote:

ureg_program is allocated on the heap so we can just bump the
number of immediates that it can handle. It's needed for d3d10.

Signed-off-by: Zack Rusin 
---
  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index f06858e..f928f57 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -77,7 +77,7 @@ struct ureg_tokens {
  #define UREG_MAX_SYSTEM_VALUE PIPE_MAX_ATTRIBS
  #define UREG_MAX_OUTPUT PIPE_MAX_SHADER_OUTPUTS
  #define UREG_MAX_CONSTANT_RANGE 32
-#define UREG_MAX_IMMEDIATE 256
+#define UREG_MAX_IMMEDIATE 4096
  #define UREG_MAX_ADDR 2
  #define UREG_MAX_PRED 1
  #define UREG_MAX_ARRAY_TEMPS 256



Reviewed-by: Brian Paul 


Re: [Mesa-dev] [PATCH 27/30] r600g: calculate a better value for array_size

2014-02-04 Thread Dave Airlie
On Wed, Feb 5, 2014 at 12:19 AM, Grigori Goronzy  wrote:
> On 04.02.2014 00:53, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> attempt to calculate a better value for array size to avoid breaking apps.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>   src/gallium/drivers/r600/r600_shader.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/r600/r600_shader.c
>> b/src/gallium/drivers/r600/r600_shader.c
>> index 8fa7054..f0e980b 100644
>> --- a/src/gallium/drivers/r600/r600_shader.c
>> +++ b/src/gallium/drivers/r600/r600_shader.c
>> @@ -1416,7 +1416,7 @@ static int emit_gs_ring_writes(struct
>> r600_shader_ctx *ctx, bool ind)
>>
>> if (ind) {
>> output.array_base = ring_offset >> 2; /* in dwords
>> */
>> -   output.array_size = 0xff
>> +   output.array_size =
>> ctx->shader->gs_max_out_vertices * 4;
>
>
> array_size is 12 bits in size. It overflows when gs_max_out_vertices is set
> to 1024, and no vertices will be written at all. I don't believe it's safe
> to assume a fixed output size per vertex either. This easily breaks GSVS
> writes in case there are many vertex attributes.
>
> Is there anything wrong with just setting array_size to the maximum, 0xfff?
> streamout does the same thing.

Probably not, though fglrx does seem to limit this, at least in the
shaders I've compiled with the AMD shader compiler.

Maybe I should just use 0xfff for now and we can refine it later
if necessary.
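
For reference, the overflow described in the quoted review can be
reproduced with plain arithmetic.  This is only a sketch: the mask and the
helper name are assumptions standing in for a 12-bit field, not the actual
r600g bytecode encoder.

```c
#include <stdint.h>

/* Assumption: the field keeps only the low 12 bits of the value. */
#define EXAMPLE_ARRAY_SIZE_MASK 0xfffu

/* Truncate a value the way a 12-bit bitfield would. */
uint32_t
example_pack_array_size(uint32_t value)
{
   return value & EXAMPLE_ARRAY_SIZE_MASK;
}
```

With gs_max_out_vertices = 1024, the product 1024 * 4 is 4096 (0x1000),
which truncates to 0 in a 12-bit field, whereas 0xfff is the largest value
the field can hold.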

Dave.


[Mesa-dev] [PATCH 3/3] tgsi/ureg: increase the number of immediates

2014-02-04 Thread Zack Rusin
ureg_program is allocated on the heap so we can just bump the
number of immediates that it can handle. It's needed for d3d10.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index f06858e..f928f57 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -77,7 +77,7 @@ struct ureg_tokens {
 #define UREG_MAX_SYSTEM_VALUE PIPE_MAX_ATTRIBS
 #define UREG_MAX_OUTPUT PIPE_MAX_SHADER_OUTPUTS
 #define UREG_MAX_CONSTANT_RANGE 32
-#define UREG_MAX_IMMEDIATE 256
+#define UREG_MAX_IMMEDIATE 4096
 #define UREG_MAX_ADDR 2
 #define UREG_MAX_PRED 1
 #define UREG_MAX_ARRAY_TEMPS 256
-- 
1.8.3.2


[Mesa-dev] [PATCH 1/3] gallivm: handle huge number of immediates

2014-02-04 Thread Zack Rusin
We only supported up to 256 immediates, which isn't enough. We had
code which was allocating immediates as an allocated array, but it
was always used along a statically backed array for performance
reasons. This commit adds code to skip that performance optimization
and always use just the dynamically allocated immediates if the
number of them is too great.
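
As a rough illustration of the approach, here is a plain C sketch of a
fixed-size fast path with a fallback to a dynamically allocated array.
All names and the slot count are hypothetical, and it stores floats where
the real code builds LLVM values; only the "index * 4 + swizzle"
addressing is taken from the patch below.

```c
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_FAST_SLOTS 256   /* stand-in for the fixed-size limit */

struct example_imms {
   float fast[EXAMPLE_FAST_SLOTS][4]; /* fixed-size storage, used when it fits */
   float *array;                      /* heap storage, 4 floats per immediate */
   unsigned count;
   int use_array;
};

/* Decide once, up front, which storage to use.
 * Returns 0 if the allocation fails, 1 otherwise. */
int
example_imms_init(struct example_imms *s, unsigned total)
{
   memset(s, 0, sizeof(*s));
   s->use_array = total > EXAMPLE_FAST_SLOTS;
   if (s->use_array) {
      s->array = calloc(total, 4 * sizeof(float));
      if (!s->array)
         return 0;
   }
   return 1;
}

/* Append one immediate; the caller must not add more than `total`. */
void
example_imms_add(struct example_imms *s, const float v[4])
{
   float *dst = s->use_array ? &s->array[s->count * 4] : s->fast[s->count];

   memcpy(dst, v, 4 * sizeof(float));
   s->count++;
}

/* Fetch one channel; the flat array is addressed as index * 4 + swizzle. */
float
example_imms_fetch(const struct example_imms *s, unsigned index,
                   unsigned swizzle)
{
   if (s->use_array)
      return s->array[index * 4 + swizzle];

   return s->fast[index][swizzle];
}

void
example_imms_fini(struct example_imms *s)
{
   free(s->array);
   s->array = NULL;
}
```

Small shaders keep the direct lookup, and only shaders with more
immediates than the fixed table can hold pay for the array path.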

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |   2 +-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 112 
 2 files changed, 77 insertions(+), 37 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 1a93951..46f7d77 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -482,7 +482,7 @@ struct lp_build_tgsi_soa_context
struct lp_exec_mask exec_mask;
 
uint num_immediates;
-
+   boolean use_immediates_array;
 };
 
 void
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 7c5de21..067e6af 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -1295,33 +1295,42 @@ emit_fetch_immediate(
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef res = NULL;
 
-   if (reg->Register.Indirect) {
-  LLVMValueRef indirect_index;
-  LLVMValueRef index_vec;  /* index into the immediate register array */
+   if (bld->use_immediates_array || reg->Register.Indirect) {
   LLVMValueRef imms_array;
   LLVMTypeRef fptr_type;
 
-  indirect_index = get_indirect_index(bld,
-  reg->Register.File,
-  reg->Register.Index,
-  &reg->Indirect);
-  /*
-   * Unlike for other reg classes, adding pixel offsets is unnecessary -
-   * immediates are stored as full vectors (FIXME??? - might be better
-   * to store them the same as constants) but all elements are the same
-   * in any case.
-   */
-  index_vec = get_soa_array_offsets(&bld_base->uint_bld,
-indirect_index,
-swizzle,
-FALSE);
-
   /* cast imms_array pointer to float* */
   fptr_type = LLVMPointerType(LLVMFloatTypeInContext(gallivm->context), 0);
   imms_array = LLVMBuildBitCast(builder, bld->imms_array, fptr_type, "");
 
-  /* Gather values from the immediate register array */
-  res = build_gather(&bld_base->base, imms_array, index_vec, NULL);
+  if (reg->Register.Indirect) {
+ LLVMValueRef indirect_index;
+ LLVMValueRef index_vec;  /* index into the immediate register array */
+
+ indirect_index = get_indirect_index(bld,
+ reg->Register.File,
+ reg->Register.Index,
+ &reg->Indirect);
+ /*
+  * Unlike for other reg classes, adding pixel offsets is unnecessary -
+  * immediates are stored as full vectors (FIXME??? - might be better
+  * to store them the same as constants) but all elements are the same
+  * in any case.
+  */
+ index_vec = get_soa_array_offsets(&bld_base->uint_bld,
+   indirect_index,
+   swizzle,
+   FALSE);
+
+ /* Gather values from the immediate register array */
+ res = build_gather(&bld_base->base, imms_array, index_vec, NULL);
+  } else {
+ LLVMValueRef lindex = lp_build_const_int32(gallivm,
+reg->Register.Index * 4 + swizzle);
+ LLVMValueRef imms_ptr =  LLVMBuildGEP(builder,
+bld->imms_array, &lindex, 1, "");
+ res = LLVMBuildLoad(builder, imms_ptr, "");
+  }
}
else {
   res = bld->immediates[reg->Register.Index][swizzle];
@@ -2728,51 +2737,71 @@ void lp_emit_immediate_soa(
 {
struct lp_build_tgsi_soa_context *bld = lp_soa_context(bld_base);
struct gallivm_state * gallivm = bld_base->base.gallivm;
-
-   /* simply copy the immediate values into the next immediates[] slot */
+   LLVMValueRef imms[4];
unsigned i;
const uint size = imm->Immediate.NrTokens - 1;
assert(size <= 4);
-   assert(bld->num_immediates < LP_MAX_TGSI_IMMEDIATES);
switch (imm->Immediate.DataType) {
case TGSI_IMM_FLOAT32:
   for( i = 0; i < size; ++i )
- bld->immediates[bld->num_immediates][i] =
-lp_build_const_vec(gallivm, bld_base->base.type, imm->u[i].Float);
+ imms[i] =
+   lp_build_const_vec(gallivm, bld_base->base.type, imm->u[i].Float);
 
   break;
ca
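
The addressing used by the non-indirect branch above can be sketched as follows. This is a simplified scalar model, assuming each immediate occupies four consecutive floats; the real code stores one SoA vector per channel. The names `imm_slot` and `fetch_imm` are hypothetical and are not gallivm functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat slot of channel `swizzle` (0..3) of immediate `index`. */
static size_t imm_slot(unsigned index, unsigned swizzle)
{
    assert(swizzle < 4);
    return (size_t)index * 4 + swizzle;
}

/* Hypothetical helper: read one channel from a flat immediates array. */
static float fetch_imm(const float *imms, unsigned index, unsigned swizzle)
{
    return imms[imm_slot(index, swizzle)];
}
```

The same `index * 4 + swizzle` expression appears in the patch as the constant passed to the GEP, so a direct fetch needs no gather.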

[Mesa-dev] [PATCH 2/3] gallivm: make sure analysis works with large number of immediates

2014-02-04 Thread Zack Rusin
The analysis pass needs to handle many more immediates (the array grows
from 128 to 4096 entries). To make that possible, allocate the analysis
context on the heap instead of on the stack.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
index 184790b..ce0598d 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
@@ -47,7 +47,7 @@ struct analysis_context
struct lp_tgsi_info *info;
 
unsigned num_imms;
-   float imm[128][4];
+   float imm[4096][4];
 
struct lp_tgsi_channel_info temp[32][4];
 };
@@ -487,7 +487,7 @@ lp_build_tgsi_info(const struct tgsi_token *tokens,
struct lp_tgsi_info *info)
 {
struct tgsi_parse_context parse;
-   struct analysis_context ctx;
+   struct analysis_context *ctx;
unsigned index;
unsigned chan;
 
@@ -495,8 +495,8 @@ lp_build_tgsi_info(const struct tgsi_token *tokens,
 
tgsi_scan_shader(tokens, &info->base);
 
-   memset(&ctx, 0, sizeof ctx);
-   ctx.info = info;
+   ctx = CALLOC(1, sizeof(struct analysis_context));
+   ctx->info = info;
 
tgsi_parse_init(&parse, tokens);
 
@@ -518,7 +518,7 @@ lp_build_tgsi_info(const struct tgsi_token *tokens,
goto finished;
 }
 
-analyse_instruction(&ctx, inst);
+analyse_instruction(ctx, inst);
  }
  break;
 
@@ -527,16 +527,16 @@ lp_build_tgsi_info(const struct tgsi_token *tokens,
 const unsigned size =
   parse.FullToken.FullImmediate.Immediate.NrTokens - 1;
 assert(size <= 4);
-if (ctx.num_imms < Elements(ctx.imm)) {
+if (ctx->num_imms < Elements(ctx->imm)) {
for (chan = 0; chan < size; ++chan) {
   float value = parse.FullToken.FullImmediate.u[chan].Float;
-  ctx.imm[ctx.num_imms][chan] = value;
+  ctx->imm[ctx->num_imms][chan] = value;
 
   if (value < 0.0f || value > 1.0f) {
  info->unclamped_immediates = TRUE;
   }
}
-   ++ctx.num_imms;
+   ++ctx->num_imms;
 }
  }
  break;
@@ -551,6 +551,7 @@ lp_build_tgsi_info(const struct tgsi_token *tokens,
 finished:
 
tgsi_parse_free(&parse);
+   FREE(ctx);
 
 
/*
-- 
1.8.3.2
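
For illustration, the stack-to-heap move in this patch can be sketched in isolation. This is a minimal sketch, not the gallivm code: `struct imm_context` and `scan_immediates` are hypothetical stand-ins, and plain `calloc`/`free` replace the CALLOC/FREE macros. Unlike the patch as posted, the sketch checks the allocation result before use.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_IMMS 4096  /* matches the new imm[4096][4] bound in the patch */

/* Hypothetical stand-in for struct analysis_context (64 KiB of immediates). */
struct imm_context {
    unsigned num_imms;
    float imm[MAX_IMMS][4];
};

/*
 * Record `count` 4-channel immediates and report whether any channel lies
 * outside [0, 1]. Returns -1 if the context cannot be allocated.
 */
static int scan_immediates(const float (*values)[4], unsigned count)
{
    struct imm_context *ctx = calloc(1, sizeof(*ctx));
    bool unclamped = false;

    if (!ctx)
        return -1;

    for (unsigned i = 0; i < count && ctx->num_imms < MAX_IMMS; ++i) {
        for (unsigned chan = 0; chan < 4; ++chan) {
            float value = values[i][chan];
            ctx->imm[ctx->num_imms][chan] = value;
            if (value < 0.0f || value > 1.0f)
                unclamped = true;
        }
        ++ctx->num_imms;
    }

    free(ctx);
    return unclamped ? 1 : 0;
}
```

With 4096 entries the immediates array alone is 64 KiB, which is the reason a stack allocation becomes uncomfortable.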


Re: [Mesa-dev] [PATCH] glsl-compiler: ast: Precise locations positions.

2014-02-04 Thread Carl Worth
Sir Anthony  writes:
> 1. Change locations setup in glsl_parser.yy from yylloc to appropriate token 
> locations.
> 2. Addition of two fields in ast_node location to hold end position of token.
> 3. Addition of ast_node method to setup range locations (for aggregate 
> tokens).
> 4. Fix for glcpp-lex.l. It handled spaces wrong and convert two
> adjacent spaces into one, which added location offset for shaders with
> indentation.

Hello, Sir.

Thanks for contributing your patch!

You've described 4 different changes in a single patch.

Could you please split this up into 2 or more patches? Ideally each
patch would include a single change, (and perhaps depend on previous
changes).

This would make the patches easier to review, and improve our ability to
analyze the commit history in the future, (such as with "git bisect").

For my part, I haven't done much in the glsl_lexer.ll nor glsl_parser.yy
files, but I'm one of the primary authors of the glcpp-lex.l code. So I
would like to review the changes there independently from the rest of
the patch.

And on that point, I'm a bit confused by the patch to glcpp-lex.l. Your
commit message says it "handled spaces wrong and convert two adjacent
spaces into one" so I expected to see the patch changing something about
the conversion of adjacent spaces, (but I don't see anything like
that). Or did you mean just that it computed location incorrectly in the
case of collapsing adjacent spaces?

Also, for glcpp, we have a collection of small tests in the glcpp/tests
directory. These tests are invoked by "make check". If you are changing
the behavior of glcpp, then you should be updating any tests where the
desired output is changed. And if you are fixing a bug, but no existing
test changes its output, then we need to be adding a new test to cover
that case.

Thanks again,

-Carl

-- 
carl.d.wo...@intel.com




[Mesa-dev] [PATCH 2/3] gallium: define the behavior of PIPE_USAGE_* flags properly

2014-02-04 Thread Marek Olšák
From: Marek Olšák 

STATIC will be removed in the following commit.
---
 src/gallium/docs/source/screen.rst   | 18 --
 src/gallium/include/pipe/p_defines.h | 13 +++--
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index c26f98c..5932e3b 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -343,12 +343,18 @@ PIPE_USAGE_*
 
 
 The PIPE_USAGE enums are hints about the expected usage pattern of a resource.
-
-* ``PIPE_USAGE_DEFAULT``: Expect many uploads to the resource, intermixed with draws.
-* ``PIPE_USAGE_DYNAMIC``: Expect many uploads to the resource, intermixed with draws.
-* ``PIPE_USAGE_STATIC``: Same as immutable (?)
-* ``PIPE_USAGE_IMMUTABLE``: Resource will not be changed after first upload.
-* ``PIPE_USAGE_STREAM``: Upload will be followed by draw, followed by upload, ...
+Note that drivers must always support read and write CPU access at any time
+no matter which hint they got.
+
+* ``PIPE_USAGE_DEFAULT``: Optimized for fast GPU access.
+* ``PIPE_USAGE_IMMUTABLE``: Optimized for fast GPU access and the resource is
+  not expected to be mapped after the first upload.
+* ``PIPE_USAGE_DYNAMIC``: Expect frequent write-only CPU access. What is
+  uploaded is expected to be used at least several times by the GPU.
+* ``PIPE_USAGE_STATIC``: Same as PIPE_USAGE_DEFAULT.
+* ``PIPE_USAGE_STREAM``: Expect frequent write-only CPU access. What is
+  uploaded is expected to be used only once by the GPU.
+* ``PIPE_USAGE_STAGING``: Optimized for fast CPU access.
 
 
 Methods
diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index 52c12df..1c550f8 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -382,13 +382,14 @@ enum pipe_flush_flags {
 #define PIPE_RESOURCE_FLAG_ST_PRIV (1 << 24) /* state-tracker/winsys private */
 
 /* Hint about the expected lifecycle of a resource.
+ * Sorted according to GPU vs CPU access.
  */
-#define PIPE_USAGE_DEFAULT        0 /* many uploads, draws intermixed */
-#define PIPE_USAGE_DYNAMIC        1 /* many uploads, draws intermixed */
-#define PIPE_USAGE_STATIC         2 /* same as immutable?? */
-#define PIPE_USAGE_IMMUTABLE      3 /* no change after first upload */
-#define PIPE_USAGE_STREAM         4 /* upload, draw, upload, draw */
-#define PIPE_USAGE_STAGING        5 /* supports data transfers from the GPU to the CPU */
+#define PIPE_USAGE_DEFAULT        0 /* fast GPU access */
+#define PIPE_USAGE_IMMUTABLE      1 /* fast GPU access, immutable */
+#define PIPE_USAGE_DYNAMIC        2 /* uploaded data is used multiple times */
+#define PIPE_USAGE_STREAM         3 /* uploaded data is used once */
+#define PIPE_USAGE_STAGING        4 /* fast CPU access */
+#define PIPE_USAGE_STATIC         5 /* same as DEFAULT, will be removed */
 
 
 /**
-- 
1.8.3.2
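
To illustrate how a driver might consume the hints as redefined above, here is a minimal sketch. The placement classes and `placement_for_usage` are hypothetical and not part of the gallium interface; only the numeric hint values are taken from the patch. A real driver is free to choose differently, since these are hints.

```c
/* Values as renumbered by the patch above. */
enum usage_hint {
    USAGE_DEFAULT   = 0,
    USAGE_IMMUTABLE = 1,
    USAGE_DYNAMIC   = 2,
    USAGE_STREAM    = 3,
    USAGE_STAGING   = 4,
    USAGE_STATIC    = 5
};

/* Hypothetical placement classes; not part of the gallium interface. */
enum placement {
    PLACE_GPU_LOCAL,   /* fastest for the GPU */
    PLACE_CPU_WRITE,   /* cheap CPU writes, still readable by the GPU */
    PLACE_CPU_CACHED   /* fastest for CPU reads and writes */
};

/* Map a usage hint to a placement class; unknown hints fall back to GPU-local. */
static enum placement placement_for_usage(enum usage_hint usage)
{
    switch (usage) {
    case USAGE_DYNAMIC:
    case USAGE_STREAM:
        return PLACE_CPU_WRITE;
    case USAGE_STAGING:
        return PLACE_CPU_CACHED;
    case USAGE_DEFAULT:
    case USAGE_IMMUTABLE:
    case USAGE_STATIC:
    default:
        return PLACE_GPU_LOCAL;
    }
}
```

Whatever placement is chosen, the documentation change above still requires CPU read and write access to work for every hint.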



[Mesa-dev] [PATCH 1/3] gallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS

2014-02-04 Thread Marek Olšák
From: Marek Olšák 

Unused.
---
 src/gallium/include/pipe/p_defines.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index 02f507c..52c12df 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -376,7 +376,6 @@ enum pipe_flush_flags {
 
 /* Flags for the driver about resource behaviour:
  */
-#define PIPE_RESOURCE_FLAG_GEN_MIPS       (1 << 0)  /* Driver performs autogen mips */
 #define PIPE_RESOURCE_FLAG_MAP_PERSISTENT (1 << 1)
 #define PIPE_RESOURCE_FLAG_MAP_COHERENT   (1 << 2)
 #define PIPE_RESOURCE_FLAG_DRV_PRIV       (1 << 16) /* driver/winsys private */
-- 
1.8.3.2



[Mesa-dev] [Bug 72895] Missing trees in flightgear 2.12.1 with r600 driver and mesa 10.0.1

2014-02-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=72895

--- Comment #10 from Barto  ---
(In reply to comment #9)
> I believe this is a bug in mesa core, not in a specific driver.
> 
> I haven't been able to isolate the code that renders the trees (there are no
> trees visible anywhere near the airport in the version packaged by debian).
> 

In order to see trees you have to enable the "random vegetation" option in
"View --> Rendering options".

At startup you should then be able to see some trees around the airport, and
you will notice that these trees randomly disappear when you move the plane
or change the point of view.

I will try your patches for bug 73504, and if that is not enough I will record
an apitrace.



[Mesa-dev] [PATCH v3 8/9] i965: support gl_InvocationID for gen7

2014-02-04 Thread Jordan Justen
v2:
 * Make gl_InvocationID a system value

v3:
 * Properly shift from R0.1 into DST.4 by adding
   GS_OPCODE_GET_INSTANCE_ID

Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_defines.h   | 12 
 src/mesa/drivers/dri/i965/brw_shader.cpp  |  2 ++
 src/mesa/drivers/dri/i965/brw_vec4.h  |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 20 
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 16 +---
 5 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 75d09fc..199a699 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -902,6 +902,13 @@ enum opcode {
 *   form the final channel mask.
 */
GS_OPCODE_SET_CHANNEL_MASKS,
+
+   /**
+* Get the "Instance ID" fields from the payload.
+*
+* - dst is the GRF for gl_InvocationID.
+*/
+   GS_OPCODE_GET_INSTANCE_ID,
 };
 
 enum brw_urb_write_flags {
@@ -1538,6 +1545,11 @@ enum brw_message_target {
 # define BRW_GS_EDGE_INDICATOR_0   (1 << 8)
 # define BRW_GS_EDGE_INDICATOR_1   (1 << 9)
 
+/* GS Thread Payload
+ */
+/* R0 */
+# define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT 27
+
 /* 3DSTATE_GS "Output Vertex Size" has an effective maximum of 62.  It's
  * counted in multiples of 16 bytes.
  */
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 6cc2595..5e3005a 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -525,6 +525,8 @@ brw_instruction_name(enum opcode op)
   return "prepare_channel_masks";
case GS_OPCODE_SET_CHANNEL_MASKS:
   return "set_channel_masks";
+   case GS_OPCODE_GET_INSTANCE_ID:
+  return "get_instance_id";
 
default:
   /* Yes, this leaks.  It's in debug code, it should never occur, and if
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index e17b5cd..182a1e1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -616,6 +616,7 @@ private:
void generate_gs_set_dword_2_immed(struct brw_reg dst, struct brw_reg src);
void generate_gs_prepare_channel_masks(struct brw_reg dst);
void generate_gs_set_channel_masks(struct brw_reg dst, struct brw_reg src);
+   void generate_gs_get_instance_id(struct brw_reg dst);
void generate_oword_dual_block_offsets(struct brw_reg m1,
  struct brw_reg index);
void generate_scratch_write(vec4_instruction *inst,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 94d1e79..a48d829 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -639,6 +639,22 @@ vec4_generator::generate_gs_set_channel_masks(struct brw_reg dst,
 }
 
 void
+vec4_generator::generate_gs_get_instance_id(struct brw_reg dst)
+{
+   /* We want to right shift R0.0 & R0.1 by GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT
+* and store into dst.0 & dst.4. So generate the instruction:
+*
+* shr(8) dst<1> R0<1,4,0> GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT { align1 WE_all }
+*/
+   brw_push_insn_state(p);
+   brw_set_access_mode(p, BRW_ALIGN_1);
+   dst = retype(dst, BRW_REGISTER_TYPE_UD);
+   struct brw_reg r0(retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
+   brw_SHR(p, dst, stride(r0, 1, 4, 0), brw_imm_ud(GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT));
+   brw_pop_insn_state(p);
+}
+
+void
 vec4_generator::generate_oword_dual_block_offsets(struct brw_reg m1,
   struct brw_reg index)
 {
@@ -1218,6 +1234,10 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
   generate_gs_set_channel_masks(dst, src[0]);
   break;
 
+   case GS_OPCODE_GET_INSTANCE_ID:
+  generate_gs_get_instance_id(dst);
+  break;
+
case SHADER_OPCODE_SHADER_TIME_ADD:
   brw_shader_time_add(p, src[0],
   prog_data->base.binding_table.shader_time_start);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 40743cc..360703b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -51,9 +51,19 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw,
 dst_reg *
 vec4_gs_visitor::make_reg_for_system_value(ir_variable *ir)
 {
-   /* Geometry shaders don't use any system values. */
-   assert(!"Unreached");
-   return NULL;
+   dst_reg *reg = new(mem_ctx) dst_reg(this, ir->type);
+
+   switch (ir->data.location) {
+   case SYSTEM_VALUE_INVOCATION_ID:
+  this->current_annotation = "initialize gl_InvocationID";
+  emit(GS_OPCODE_GET_INSTANCE_ID,
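
The effect of the `shr(8) dst<1> R0<1,4,0>` instruction generated by this patch can be emulated on the CPU. This is a sketch under the assumption that the source region is read as vertical stride 1, width 4, horizontal stride 0; `emulate_get_instance_id` is a hypothetical name, and only the shift value comes from the patch.

```c
#include <stdint.h>

#define GS_PAYLOAD_INSTANCE_ID_SHIFT 27  /* value from the patch */

/*
 * Emulate "shr(8) dst<1> R0<1,4,0> shift": with vertical stride 1, width 4
 * and horizontal stride 0, channels 0..3 read R0.0 and channels 4..7 read R0.1.
 */
static void emulate_get_instance_id(const uint32_t r0[8], uint32_t dst[8])
{
    const unsigned vstride = 1, width = 4, hstride = 0;

    for (unsigned chan = 0; chan < 8; ++chan) {
        unsigned src = (chan / width) * vstride + (chan % width) * hstride;
        dst[chan] = r0[src] >> GS_PAYLOAD_INSTANCE_ID_SHIFT;
    }
}
```

In this model dst.0 and dst.4 end up holding the instance IDs taken from R0.0 and R0.1, which is the result the comment in `generate_gs_get_instance_id()` asks for.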

[Mesa-dev] [PATCH v3 7/9] glsl: add gl_InvocationID variable for ARB_gpu_shader5

2014-02-04 Thread Jordan Justen
v2:
 * Make gl_InvocationID a system value

Signed-off-by: Jordan Justen 
Reviewed-by: Paul Berry 
---
 src/glsl/builtin_variables.cpp | 2 ++
 src/mesa/main/mtypes.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index d6bc3c0..d9ed2db 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -782,6 +782,8 @@ builtin_variable_generator::generate_gs_special_vars()
add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
if (state->ARB_viewport_array_enable)
   add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+   if (state->ARB_gpu_shader5_enable)
+  add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
 
/* Although gl_PrimitiveID appears in tessellation control and tessellation
 * evaluation shaders, it has a different function there than it has in
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b76b984..10d4206 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2015,6 +2015,7 @@ typedef enum
SYSTEM_VALUE_SAMPLE_ID,  /**< Fragment shader only */
SYSTEM_VALUE_SAMPLE_POS, /**< Fragment shader only */
SYSTEM_VALUE_SAMPLE_MASK_IN, /**< Fragment shader only */
+   SYSTEM_VALUE_INVOCATION_ID,  /**< Geometry shader only */
SYSTEM_VALUE_MAX /**< Number of values */
 } gl_system_value;
 
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 4/9] glsl/linker: produce gl_shader_program Geom.Invocations

2014-02-04 Thread Jordan Justen
Grab the parsed invocation count, check for consistency
during linking, and finally save the result in
gl_shader_program Geom.Invocations.

Signed-off-by: Jordan Justen 
Reviewed-by: Paul Berry 
---
 src/glsl/glsl_parser_extras.cpp |  4 
 src/glsl/linker.cpp | 18 ++
 src/mesa/main/mtypes.h  |  9 +
 3 files changed, 31 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 7e5969a..23a8da9 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1339,6 +1339,10 @@ set_shader_inout_layout(struct gl_shader *shader,
if (state->out_qualifier->flags.q.max_vertices)
   shader->Geom.VerticesOut = state->out_qualifier->max_vertices;
 
+   shader->Geom.Invocations = 0;
+   if (state->in_qualifier->flags.q.invocations)
+  shader->Geom.Invocations = state->in_qualifier->invocations;
+
if (state->gs_input_prim_type_specified) {
   shader->Geom.InputType = state->in_qualifier->prim_type;
} else {
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 93b4754..800de0b 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1206,6 +1206,7 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog,
unsigned num_shaders)
 {
linked_shader->Geom.VerticesOut = 0;
+   linked_shader->Geom.Invocations = 0;
linked_shader->Geom.InputType = PRIM_UNKNOWN;
linked_shader->Geom.OutputType = PRIM_UNKNOWN;
 
@@ -1259,6 +1260,18 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog,
 }
 linked_shader->Geom.VerticesOut = shader->Geom.VerticesOut;
   }
+
+  if (shader->Geom.Invocations != 0) {
+if (linked_shader->Geom.Invocations != 0 &&
+linked_shader->Geom.Invocations != shader->Geom.Invocations) {
+   linker_error(prog, "geometry shader defined with conflicting "
+"invocation count (%d and %d)\n",
+linked_shader->Geom.Invocations,
+shader->Geom.Invocations);
+   return;
+}
+linked_shader->Geom.Invocations = shader->Geom.Invocations;
+  }
}
 
/* Just do the intrastage -> interstage propagation right now,
@@ -1285,6 +1298,11 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog,
   return;
}
prog->Geom.VerticesOut = linked_shader->Geom.VerticesOut;
+
+   if (linked_shader->Geom.Invocations == 0)
+  linked_shader->Geom.Invocations = 1;
+
+   prog->Geom.Invocations = linked_shader->Geom.Invocations;
 }
 
 /**
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5fc15af..48a1e36 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2409,6 +2409,11 @@ struct gl_shader
struct {
   GLint VerticesOut;
   /**
+   * 0 - Invocations count not declared in shader, or
+   * 1 .. MAX_GEOMETRY_SHADER_INVOCATIONS
+   */
+  GLint Invocations;
+  /**
* GL_POINTS, GL_LINES, GL_LINES_ADJACENCY, GL_TRIANGLES, or
* GL_TRIANGLES_ADJACENCY, or PRIM_UNKNOWN if it's not set in this
* shader.
@@ -2606,6 +2611,10 @@ struct gl_shader_program
struct {
   GLint VerticesIn;
   GLint VerticesOut;
+  /**
+   * 1 .. MAX_GEOMETRY_SHADER_INVOCATIONS
+   */
+  GLint Invocations;
   GLenum InputType;  /**< GL_POINTS, GL_LINES, GL_LINES_ADJACENCY_ARB,
   GL_TRIANGLES, or GL_TRIANGLES_ADJACENCY_ARB */
   GLenum OutputType; /**< GL_POINTS, GL_LINE_STRIP or GL_TRIANGLE_STRIP */
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 9/9] i965: support instanced GS on gen7

2014-02-04 Thread Jordan Justen
v3:
 * Properly prevent dual object mode execution when
   the invocation count > 1

Signed-off-by: Jordan Justen 
Reviewed-by: Paul Berry 
---
 src/mesa/drivers/dri/i965/brw_context.h   | 2 ++
 src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
 src/mesa/drivers/dri/i965/brw_vec4_gs.c   | 2 ++
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 6 --
 src/mesa/drivers/dri/i965/gen7_gs_state.c | 2 ++
 5 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index a0189b7..53d930f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -639,6 +639,8 @@ struct brw_gs_prog_data
 
bool include_primitive_id;
 
+   int invocations;
+
/**
 * True if the thread should be dispatched in DUAL_INSTANCE mode, false if
 * it should be dispatched in DUAL_OBJECT mode.
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 199a699..a46d881 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1518,6 +1518,7 @@ enum brw_message_target {
 # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT 0
 # define GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID 1
 # define GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT        20
+# define GEN7_GS_INSTANCE_CONTROL_SHIFT                15
 # define GEN7_GS_DISPATCH_MODE_SINGLE  (0 << 11)
 # define GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE   (1 << 11)
 # define GEN7_GS_DISPATCH_MODE_DUAL_OBJECT (2 << 11)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index abc181b..3c6393f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -48,6 +48,8 @@ do_gs_prog(struct brw_context *brw,
c.prog_data.include_primitive_id =
   (gp->program.Base.InputsRead & VARYING_BIT_PRIMITIVE_ID) != 0;
 
+   c.prog_data.invocations = gp->program.Invocations;
+
/* Allocate the references to the uniforms that will end up in the
 * prog_data associated with the compiled program, and which will be freed
 * by the state cache.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 360703b..f586c41 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -588,9 +588,11 @@ brw_gs_emit(struct brw_context *brw,
}
 
/* Compile the geometry shader in DUAL_OBJECT dispatch mode, if we can do
-* so without spilling.
+* so without spilling. If the GS invocations count > 1, then we can't use
+* dual object mode.
 */
-   if (likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) {
+   if (c->prog_data.invocations <= 1 &&
+   likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) {
   c->prog_data.dual_instanced_dispatch = false;
 
   vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */);
diff --git a/src/mesa/drivers/dri/i965/gen7_gs_state.c b/src/mesa/drivers/dri/i965/gen7_gs_state.c
index d2ba354..b179d19 100644
--- a/src/mesa/drivers/dri/i965/gen7_gs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_gs_state.c
@@ -153,6 +153,8 @@ upload_gs_state(struct brw_context *brw)
  ((brw->max_gs_threads - 1) << max_threads_shift) |
  (brw->gs.prog_data->control_data_header_size_hwords <<
   GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT) |
+ ((brw->gs.prog_data->invocations - 1) <<
+  GEN7_GS_INSTANCE_CONTROL_SHIFT) |
  (brw->gs.prog_data->dual_instanced_dispatch ?
   GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE :
   GEN7_GS_DISPATCH_MODE_DUAL_OBJECT) |
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 3/9] glsl: parse invocations layout qualifier for ARB_gpu_shader5

2014-02-04 Thread Jordan Justen
_mesa_glsl_parse_state in_qualifier->invocations will store the
invocations count.

v3:
 * Use in_qualifier to allow the primitive to be specified
   separately from the invocations count (merge_qualifiers)

Signed-off-by: Jordan Justen 
---
 src/glsl/ast.h  |  8 
 src/glsl/ast_type.cpp   | 23 +++
 src/glsl/glsl_parser.yy | 27 +++
 3 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index e74d8a3..b3721c0 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -460,6 +460,11 @@ struct ast_type_qualifier {
 unsigned prim_type:1;
 unsigned max_vertices:1;
 /** \} */
+
+ /** \name Layout qualifiers for GL_ARB_gpu_shader5 */
+ /** \{ */
+ unsigned invocations:1;
+ /** \} */
   }
   /** \brief Set of flags, accessed by name. */
   q;
@@ -471,6 +476,9 @@ struct ast_type_qualifier {
/** Precision of the type (highp/medium/lowp). */
unsigned precision:2;
 
+   /** Geometry shader invocations for GL_ARB_gpu_shader5. */
+   int invocations;
+
/**
 * Location specified via GL_ARB_explicit_attrib_location layout
 *
diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
index f1c59da..fe15159 100644
--- a/src/glsl/ast_type.cpp
+++ b/src/glsl/ast_type.cpp
@@ -153,6 +153,17 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
   this->max_vertices = q.max_vertices;
}
 
+   if (q.flags.q.invocations) {
+  if (this->flags.q.invocations && this->invocations != q.invocations) {
+_mesa_glsl_error(loc, state,
+ "geometry shader set conflicting invocations "
+ "(%d and %d)", this->invocations, q.invocations);
+return false;
+  }
+  this->invocations = q.invocations;
+  this->flags.q.invocations = 1;
+   }
+
if ((q.flags.i & ubo_mat_mask.flags.i) != 0)
   this->flags.i &= ~ubo_mat_mask.flags.i;
if ((q.flags.i & ubo_layout_mask.flags.i) != 0)
@@ -186,6 +197,7 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
ast_type_qualifier valid_in_mask;
valid_in_mask.flags.i = 0;
valid_in_mask.flags.q.prim_type = 1;
+   valid_in_mask.flags.q.invocations = 1;
 
/* Generate an error when invalid input layout qualifiers are used. */
if ((q.flags.i & ~valid_in_mask.flags.i) != 0) {
@@ -222,5 +234,16 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
   state->in_qualifier->prim_type = q.prim_type;
}
 
+   if (this->flags.q.invocations &&
+   q.flags.q.invocations &&
+   this->invocations != q.invocations) {
+  _mesa_glsl_error(loc, state,
+   "conflicting invocations counts specified");
+  return false;
+   } else if (q.flags.q.invocations) {
+  this->flags.q.invocations = 1;
+  this->invocations = q.invocations;
+   }
+
return true;
 }
diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index e368217..66ad47f 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -1291,6 +1291,29 @@ layout_qualifier_id:
  }
   }
 
+  if (match_layout_qualifier("invocations", $1, state) == 0) {
+ $$.flags.q.invocations = 1;
+
+ if ($3 <= 0) {
+_mesa_glsl_error(& @3, state,
+ "invalid invocations %d specified", $3);
+YYERROR;
+ } else if ($3 > MAX_GEOMETRY_SHADER_INVOCATIONS) {
+_mesa_glsl_error(& @3, state,
+ "invocations (%d) exceeds "
+ "GL_MAX_GEOMETRY_SHADER_INVOCATIONS", $3);
+YYERROR;
+ } else {
+$$.invocations = $3;
+if (!state->is_version(400, 0) &&
+!state->ARB_gpu_shader5_enable) {
+   _mesa_glsl_error(& @3, state,
+"GL_ARB_gpu_shader5 invocations "
+"qualifier specified", $3);
+}
+ }
+  }
+
   /* If the identifier didn't match any known layout identifiers,
* emit an error.
*/
@@ -2338,10 +2361,6 @@ layout_defaults:
  _mesa_glsl_error(& @1, state,
   "input layout qualifiers only valid in "
   "geometry shaders");
-  } else if (!$1.flags.q.prim_type) {
- _mesa_glsl_error(& @1, state,
-  "input layout qualifiers must specify a primitive"
-  " type");
   } else {
  bool first_prim = $1.flags.q.prim_type &&
!state->in_qualifier->flags.q.prim_type;
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 2/9] glsl: Generate error for invalid input layout declarations

2014-02-04 Thread Jordan Justen
Fixes various piglit tests:
spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-*.geom

Signed-off-by: Jordan Justen 
---
 src/glsl/ast_type.cpp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
index 7f3737b..f1c59da 100644
--- a/src/glsl/ast_type.cpp
+++ b/src/glsl/ast_type.cpp
@@ -183,6 +183,17 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
ast_type_qualifier q)
 {
+   ast_type_qualifier valid_in_mask;
+   valid_in_mask.flags.i = 0;
+   valid_in_mask.flags.q.prim_type = 1;
+
+   /* Generate an error when invalid input layout qualifiers are used. */
+   if ((q.flags.i & ~valid_in_mask.flags.i) != 0) {
+  _mesa_glsl_error(loc, state,
+  "invalid input layout qualifiers used");
+  return false;
+   }
+
/* Input layout qualifiers can be specified multiple
 * times in separate declarations, as long as they match.
 */
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 5/9] mesa: initialize gl_geometry_program Invocations field

2014-02-04 Thread Jordan Justen
Signed-off-by: Jordan Justen 
Reviewed-by: Paul Berry 
---
 src/mesa/main/mtypes.h | 1 +
 src/mesa/main/shaderapi.c  | 1 +
 src/mesa/program/program.c | 1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 1 +
 src/mesa/state_tracker/st_program.c| 1 +
 5 files changed, 5 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 48a1e36..b76b984 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2145,6 +2145,7 @@ struct gl_geometry_program
 
GLint VerticesIn;
GLint VerticesOut;
+   GLint Invocations;
GLenum InputType;  /**< GL_POINTS, GL_LINES, GL_LINES_ADJACENCY_ARB,
GL_TRIANGLES, or GL_TRIANGLES_ADJACENCY_ARB */
GLenum OutputType; /**< GL_POINTS, GL_LINE_STRIP or GL_TRIANGLE_STRIP */
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 61ac0e3..a8336c9 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1834,6 +1834,7 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
   struct gl_geometry_program *dst_gp = (struct gl_geometry_program *) dst;
   dst_gp->VerticesIn = src->Geom.VerticesIn;
   dst_gp->VerticesOut = src->Geom.VerticesOut;
+  dst_gp->Invocations = src->Geom.Invocations;
   dst_gp->InputType = src->Geom.InputType;
   dst_gp->OutputType = src->Geom.OutputType;
   dst->UsesClipDistanceOut = src->Geom.UsesClipDistance;
diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index ea8eb0d..7686b31 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -530,6 +530,7 @@ _mesa_clone_program(struct gl_context *ctx, const struct gl_program *prog)
  struct gl_geometry_program *gpc = gl_geometry_program(clone);
  gpc->VerticesOut = gp->VerticesOut;
  gpc->InputType = gp->InputType;
+ gpc->Invocations = gp->Invocations;
  gpc->OutputType = gp->OutputType;
   }
   break;
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0871dd0..235696e 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5180,6 +5180,7 @@ get_mesa_program(struct gl_context *ctx,
   stgp->Base.InputType = shader_program->Geom.InputType;
   stgp->Base.OutputType = shader_program->Geom.OutputType;
   stgp->Base.VerticesOut = shader_program->Geom.VerticesOut;
+  stgp->Base.Invocations = shader_program->Geom.Invocations;
   break;
default:
   assert(!"should not be reached");
diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index cadbe17..f67b0fa 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -1087,6 +1087,7 @@ st_translate_geometry_program(struct st_context *st,
ureg_property_gs_input_prim(ureg, stgp->Base.InputType);
ureg_property_gs_output_prim(ureg, stgp->Base.OutputType);
ureg_property_gs_max_vertices(ureg, stgp->Base.VerticesOut);
+   ureg_property_gs_invocations(ureg, stgp->Base.Invocations);
 
if (stgp->glsl_to_tgsi)
   st_translate_program(st->ctx,
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 6/9] main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support

2014-02-04 Thread Jordan Justen
v3:
 * Add check for ARB_gpu_shader5

Signed-off-by: Jordan Justen 
Reviewed-by: Paul Berry 
---
 src/mesa/main/shaderapi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index a8336c9..1b2158f 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -603,6 +603,12 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname, GLint *param
   if (check_gs_query(ctx, shProg))
  *params = shProg->Geom.VerticesOut;
   return;
+   case GL_GEOMETRY_SHADER_INVOCATIONS:
+  if (!has_core_gs || !ctx->Extensions.ARB_gpu_shader5)
+ break;
+  if (check_gs_query(ctx, shProg))
+ *params = shProg->Geom.Invocations;
+  return;
case GL_GEOMETRY_INPUT_TYPE:
   if (!has_core_gs)
  break;
-- 
1.8.5.3



[Mesa-dev] [PATCH v3 0/9] i965/gen7 instanced GS support for ARB_gpu_shader5

2014-02-04 Thread Jordan Justen
v3:
 * Fix major brokenness of dual instance mode operation
   using Paul's suggestions
 * Update parsing to allow separate primitive and
   invocation declarations. Fixes piglit test:
   spec/arb_gpu_shader5/execution/invocation-id-in-separate-gs
 * New: glsl: Generate error for invalid input layout declarations
   This is made easier by the in_qualifier addition in
   this series, but is otherwise an unrelated bug fix.
 * Added check for valid invocation count values

v2:
 * Convert gl_InvocationID to a system value

No piglit regressions on HSW.

Instanced GS support requires overriding ARB_gpu_shader5 to
be enabled.

Patches are available at:
git://people.freedesktop.org/~jljusten/mesa gs-inv-id

Jordan Justen (9):
  glsl: convert GS input primitive to use ast_type_qualifier
  glsl: Generate error for invalid input layout declarations
  glsl: parse invocations layout qualifier for ARB_gpu_shader5
  glsl/linker: produce gl_shader_program Geom.Invocations
  mesa: initialize gl_geometry_program Invocations field
  main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support
  glsl: add gl_InvocationID variable for ARB_gpu_shader5
  i965: support gl_InvocationID for gen7
  i965: support instanced GS on gen7

 src/glsl/ast.h| 13 +
 src/glsl/ast_to_hir.cpp   |  5 +-
 src/glsl/ast_type.cpp | 69 +++
 src/glsl/builtin_variables.cpp|  2 +
 src/glsl/glsl_parser.yy   | 45 +--
 src/glsl/glsl_parser_extras.cpp   | 10 +++-
 src/glsl/glsl_parser_extras.h |  7 +--
 src/glsl/linker.cpp   | 18 ++
 src/mesa/drivers/dri/i965/brw_context.h   |  2 +
 src/mesa/drivers/dri/i965/brw_defines.h   | 13 +
 src/mesa/drivers/dri/i965/brw_shader.cpp  |  2 +
 src/mesa/drivers/dri/i965/brw_vec4.h  |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 20 +++
 src/mesa/drivers/dri/i965/brw_vec4_gs.c   |  2 +
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 22 ++--
 src/mesa/drivers/dri/i965/gen7_gs_state.c |  2 +
 src/mesa/main/mtypes.h| 11 
 src/mesa/main/shaderapi.c |  7 +++
 src/mesa/program/program.c|  1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp|  1 +
 src/mesa/state_tracker/st_program.c   |  1 +
 21 files changed, 222 insertions(+), 32 deletions(-)

-- 
1.8.5.3



[Mesa-dev] [PATCH v3 1/9] glsl: convert GS input primitive to use ast_type_qualifier

2014-02-04 Thread Jordan Justen
This allows the use of merge_qualifier, which will be needed
for supporting multiple input qualifiers, such as the
primitive type being specified separately from the
invocations count (ARB_gpu_shader5).

state->gs_input_prim_type is moved into state->in_qualifier->prim_type

state->gs_input_prim_type_specified is still processed separately
so we can determine when the input primitive is specified. This
is important since certain scenarios are not supported until after
the primitive type has been specified in the shader code.

Signed-off-by: Jordan Justen 
---
 src/glsl/ast.h  |  5 +
 src/glsl/ast_to_hir.cpp |  5 ++---
 src/glsl/ast_type.cpp   | 35 +++
 src/glsl/glsl_parser.yy | 18 ++
 src/glsl/glsl_parser_extras.cpp |  6 +++---
 src/glsl/glsl_parser_extras.h   |  7 ++-
 6 files changed, 53 insertions(+), 23 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 0bda28d..e74d8a3 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -544,6 +544,11 @@ struct ast_type_qualifier {
bool merge_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
ast_type_qualifier q);
+
+   bool merge_in_qualifier(YYLTYPE *loc,
+   _mesa_glsl_parse_state *state,
+   ast_type_qualifier q);
+
 };
 
 class ast_declarator_list;
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 1bfb4e5..faf47f9 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2767,7 +2767,7 @@ handle_geometry_shader_input_decl(struct 
_mesa_glsl_parse_state *state,
 {
unsigned num_vertices = 0;
if (state->gs_input_prim_type_specified) {
-  num_vertices = vertices_per_prim(state->gs_input_prim_type);
+  num_vertices = vertices_per_prim(state->in_qualifier->prim_type);
}
 
/* Geometry shader input variables must be arrays.  Caller should have
@@ -5236,7 +5236,7 @@ ast_gs_input_layout::hir(exec_list *instructions,
 * was consistent with this one.
 */
if (state->gs_input_prim_type_specified &&
-   state->gs_input_prim_type != this->prim_type) {
+   state->in_qualifier->prim_type != this->prim_type) {
   _mesa_glsl_error(&loc, state,
"geometry shader input layout does not match"
" previous declaration");
@@ -5257,7 +5257,6 @@ ast_gs_input_layout::hir(exec_list *instructions,
}
 
state->gs_input_prim_type_specified = true;
-   state->gs_input_prim_type = this->prim_type;
 
/* If any shader inputs occurred before this declaration and did not
 * specify an array size, their size is determined now.
diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
index 637da0d..7f3737b 100644
--- a/src/glsl/ast_type.cpp
+++ b/src/glsl/ast_type.cpp
@@ -178,3 +178,38 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
return true;
 }
 
+bool
+ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
+   _mesa_glsl_parse_state *state,
+   ast_type_qualifier q)
+{
+   /* Input layout qualifiers can be specified multiple
+* times in separate declarations, as long as they match.
+*/
+   if (this->flags.q.prim_type) {
+  if (q.flags.q.prim_type &&
+  this->prim_type != q.prim_type) {
+ _mesa_glsl_error(loc, state,
+  "conflicting input primitive types specified");
+  }
+   } else if (q.flags.q.prim_type) {
+  /* Make sure this is a valid input primitive type. */
+  switch (q.prim_type) {
+  case GL_POINTS:
+  case GL_LINES:
+  case GL_LINES_ADJACENCY:
+  case GL_TRIANGLES:
+  case GL_TRIANGLES_ADJACENCY:
+ break;
+  default:
+ _mesa_glsl_error(loc, state,
+  "invalid geometry shader input primitive type");
+ return false;
+  }
+
+  state->in_qualifier->flags.q.prim_type = 1;
+  state->in_qualifier->prim_type = q.prim_type;
+   }
+
+   return true;
+}
diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 928c57e..e368217 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -2343,19 +2343,13 @@ layout_defaults:
   "input layout qualifiers must specify a primitive"
   " type");
   } else {
- /* Make sure this is a valid input primitive type. */
- switch ($1.prim_type) {
- case GL_POINTS:
- case GL_LINES:
- case GL_LINES_ADJACENCY:
- case GL_TRIANGLES:
- case GL_TRIANGLES_ADJACENCY:
+ bool first_prim = $1.flags.q.prim_type &&
+   !state->in_qualifier->flags.q.prim_type;
+
+ if (!state->in_qualifier->merge_in_qualifier(& @1, state, $1)) {
+YYERROR;
+ } else if (first_prim) {
 $$ = new(ctx) ast_gs_input_layout(@1, $
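The merge rule that merge_in_qualifier applies above can be sketched on its
own, outside the parser. This is only an illustration: the enum values, the
struct and the result codes below are hypothetical stand-ins and not Mesa's
types. Unlike the patch, which reports a conflict through _mesa_glsl_error
and still returns true, the sketch returns a distinct code for each outcome
so the three cases are easy to tell apart.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL primitive enums used by the patch. */
enum prim {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_LINES_ADJACENCY,
   PRIM_TRIANGLES,
   PRIM_TRIANGLES_ADJACENCY,
   PRIM_TRIANGLE_STRIP       /* not a valid GS *input* primitive */
};

/* Hypothetical result codes; the patch itself returns a bool. */
enum merge_result { MERGE_OK, MERGE_CONFLICT, MERGE_INVALID };

struct in_qualifier {
   bool has_prim_type;
   enum prim prim_type;
};

static bool
is_valid_gs_input_prim(enum prim p)
{
   switch (p) {
   case PRIM_POINTS:
   case PRIM_LINES:
   case PRIM_LINES_ADJACENCY:
   case PRIM_TRIANGLES:
   case PRIM_TRIANGLES_ADJACENCY:
      return true;
   default:
      return false;
   }
}

/* Merge one "layout(...) in;" declaration into the accumulated state. */
static enum merge_result
merge_in_qualifier(struct in_qualifier *accum, const struct in_qualifier *q)
{
   if (!q->has_prim_type)
      return MERGE_OK;               /* nothing to merge */

   if (accum->has_prim_type) {
      /* Repeated declarations are fine as long as they match. */
      return accum->prim_type == q->prim_type ? MERGE_OK : MERGE_CONFLICT;
   }

   /* First declaration: validate, then record it. */
   if (!is_valid_gs_input_prim(q->prim_type))
      return MERGE_INVALID;

   accum->has_prim_type = true;
   accum->prim_type = q->prim_type;
   return MERGE_OK;
}
```

Keeping the accumulated state in one struct is what lets a later qualifier
(such as the invocations count) be merged through the same path.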

[Mesa-dev] [PATCH 2/2] r600g, radeonsi: set resource domains in one place (v2)

2014-02-04 Thread Marek Olšák
From: Marek Olšák 

v2: This doesn't change the behavior. It only moves the tiling check
to r600_init_resource and removes the usage parameter.
---
 src/gallium/drivers/r600/r600_state_common.c|  4 +--
 src/gallium/drivers/radeon/r600_buffer_common.c | 33 -
 src/gallium/drivers/radeon/r600_pipe_common.h   |  2 +-
 src/gallium/drivers/radeon/r600_texture.c   |  7 ++
 src/gallium/drivers/radeonsi/si_descriptors.c   |  4 +--
 5 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index d8fab10..c1d7e29 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -2082,8 +2082,8 @@ static void r600_invalidate_buffer(struct pipe_context 
*ctx, struct pipe_resourc
pb_reference(&rbuffer->buf, NULL);
 
/* Create a new one in the same pipe_resource. */
-   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0, 
alignment,
-  TRUE, rbuffer->b.b.usage);
+   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0,
+  alignment, TRUE);
 
/* We changed the buffer, now we need to bind it where the old one was 
bound. */
/* Vertex buffers. */
diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 2077228..59578e1 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -103,42 +103,41 @@ void *r600_buffer_map_sync_with_rings(struct 
r600_common_context *ctx,
 bool r600_init_resource(struct r600_common_screen *rscreen,
struct r600_resource *res,
unsigned size, unsigned alignment,
-   bool use_reusable_pool, unsigned usage)
+   bool use_reusable_pool)
 {
-   uint32_t initial_domain, domains;
+   struct r600_texture *rtex = (struct r600_texture*)res;
 
-   switch(usage) {
+   switch (res->b.b.usage) {
case PIPE_USAGE_STAGING:
case PIPE_USAGE_DYNAMIC:
case PIPE_USAGE_STREAM:
-   /* These resources participate in transfers, i.e. are used
-* for uploads and downloads from regular resources.
-* We generate them internally for some transfers.
-*/
-   initial_domain = RADEON_DOMAIN_GTT;
-   domains = RADEON_DOMAIN_GTT;
+   /* Transfers are likely to occur more often with these 
resources. */
+   res->domains = RADEON_DOMAIN_GTT;
break;
case PIPE_USAGE_DEFAULT:
case PIPE_USAGE_STATIC:
case PIPE_USAGE_IMMUTABLE:
default:
-   /* Don't list GTT here, because the memory manager would put 
some
-* resources to GTT no matter what the initial domain is.
-* Not listing GTT in the domains improves performance a lot. */
-   initial_domain = RADEON_DOMAIN_VRAM;
-   domains = RADEON_DOMAIN_VRAM;
+   /* Not listing GTT here improves performance in some apps. */
+   res->domains = RADEON_DOMAIN_VRAM;
break;
}
 
+   /* Tiled textures are unmappable. Always put them in VRAM. */
+   if (res->b.b.target != PIPE_BUFFER &&
+   rtex->surface.level[0].mode >= RADEON_SURF_MODE_1D) {
+   res->domains = RADEON_DOMAIN_VRAM;
+   }
+
+   /* Allocate the resource. */
res->buf = rscreen->ws->buffer_create(rscreen->ws, size, alignment,
   use_reusable_pool,
-  initial_domain);
+  res->domains);
if (!res->buf) {
return false;
}
 
res->cs_buf = rscreen->ws->buffer_get_cs_handle(res->buf);
-   res->domains = domains;
util_range_set_empty(&res->valid_buffer_range);
 
if (rscreen->debug_flags & DBG_VM && res->b.b.target == PIPE_BUFFER) {
@@ -327,7 +326,7 @@ struct pipe_resource *r600_buffer_create(struct pipe_screen 
*screen,
rbuffer->b.vtbl = &r600_buffer_vtbl;
util_range_init(&rbuffer->valid_buffer_range);
 
-   if (!r600_init_resource(rscreen, rbuffer, templ->width0, alignment, 
TRUE, templ->usage)) {
+   if (!r600_init_resource(rscreen, rbuffer, templ->width0, alignment, 
TRUE)) {
FREE(rbuffer);
return NULL;
}
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 9fdfdfd..7193a0f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -321,7 +321,7 @@ void *r600_buffer_map_sync_with_rings(struct 
r600_common_context *ctx,
 bool r600_init_res
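The domain choice that this patch moves into r600_init_resource can be
sketched as a small pure function. The enums and the function name below are
hypothetical and only mirror the switch in the diff; the real code reads the
usage from res->b.b.usage and the tiling mode from the texture surface.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PIPE_USAGE_* values in the patch. */
enum usage {
   USAGE_DEFAULT,
   USAGE_STATIC,
   USAGE_IMMUTABLE,
   USAGE_DYNAMIC,
   USAGE_STREAM,
   USAGE_STAGING
};

/* Hypothetical stand-ins for RADEON_DOMAIN_GTT / RADEON_DOMAIN_VRAM. */
enum domain { DOMAIN_GTT = 2, DOMAIN_VRAM = 4 };

static enum domain
choose_domain(enum usage usage, bool is_buffer, bool is_tiled)
{
   enum domain d;

   switch (usage) {
   case USAGE_STAGING:
   case USAGE_DYNAMIC:
   case USAGE_STREAM:
      /* Transfers are likely to occur more often with these resources. */
      d = DOMAIN_GTT;
      break;
   default:
      d = DOMAIN_VRAM;
      break;
   }

   /* Tiled textures are unmappable, so they always go to VRAM. */
   if (!is_buffer && is_tiled)
      d = DOMAIN_VRAM;

   return d;
}
```

With the choice made in one place, the value passed to buffer_create and the
value stored in res->domains cannot drift apart.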

[Mesa-dev] [PATCH 1/2] r600g, radeonsi: query the buffer domain from the kernel for DRI2 buffers

2014-02-04 Thread Marek Olšák
From: Marek Olšák 

Better than guessing it.

Yeah we have had this query for a long time...
---
 src/gallium/drivers/radeon/r600_texture.c |  2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 23 +++
 src/gallium/winsys/radeon/drm/radeon_winsys.h |  5 +
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 878b26f..aa4e8ea 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -627,7 +627,7 @@ r600_texture_create_object(struct pipe_screen *screen,
} else {
resource->buf = buf;
resource->cs_buf = rscreen->ws->buffer_get_cs_handle(buf);
-   resource->domains = RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM;
+   resource->domains = rscreen->ws->buffer_get_current_domain(buf);
}
 
if (rtex->cmask.size) {
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 2ac060b..7c59f26 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -201,6 +201,28 @@ static boolean radeon_bo_is_busy(struct pb_buffer *_buf,
 }
 }
 
+static enum radeon_bo_domain radeon_bo_get_current_domain(struct pb_buffer 
*_buf)
+{
+struct radeon_bo *bo = get_radeon_bo(_buf);
+struct drm_radeon_gem_busy args;
+
+memset(&args, 0, sizeof(args));
+args.handle = bo->handle;
+
+drmCommandWriteRead(bo->rws->fd, DRM_RADEON_GEM_BUSY,
+&args, sizeof(args));
+
+/* Zero domains the driver doesn't understand. */
+args.domain &= RADEON_GEM_DOMAIN_VRAM | RADEON_GEM_DOMAIN_GTT;
+
+/* If no domain is set, we must set something... */
+if (!args.domain)
+args.domain = RADEON_GEM_DOMAIN_VRAM | RADEON_GEM_DOMAIN_GTT;
+
+/* GEM domains and winsys domains are defined the same. */
+return args.domain;
+}
+
 static uint64_t radeon_bomgr_find_va(struct radeon_bomgr *mgr, uint64_t size, 
uint64_t alignment)
 {
 struct radeon_bo_va_hole *hole, *n;
@@ -1089,4 +1111,5 @@ void radeon_bomgr_init_functions(struct radeon_drm_winsys 
*ws)
 ws->base.buffer_from_handle = radeon_winsys_bo_from_handle;
 ws->base.buffer_get_handle = radeon_winsys_bo_get_handle;
 ws->base.buffer_get_virtual_address = radeon_winsys_bo_va;
+ws->base.buffer_get_current_domain = radeon_bo_get_current_domain;
 }
diff --git a/src/gallium/winsys/radeon/drm/radeon_winsys.h 
b/src/gallium/winsys/radeon/drm/radeon_winsys.h
index 55f60d3..fb942c0 100644
--- a/src/gallium/winsys/radeon/drm/radeon_winsys.h
+++ b/src/gallium/winsys/radeon/drm/radeon_winsys.h
@@ -376,6 +376,11 @@ struct radeon_winsys {
  */
 uint64_t (*buffer_get_virtual_address)(struct radeon_winsys_cs_handle 
*buf);
 
+/**
+ * Query the current placement of the buffer from the memory manager.
+ */
+enum radeon_bo_domain (*buffer_get_current_domain)(struct pb_buffer *buf);
+
 /**
  * Command submission.
  *
-- 
1.8.3.2



[Mesa-dev] [Bug 72895] Missing trees in flightgear 2.12.1 with r600 driver and mesa 10.0.1

2014-02-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=72895

--- Comment #9 from Fredrik Höglund  ---
I believe this is a bug in mesa core, not in a specific driver.

I haven't been able to isolate the code that renders the trees (there are no
trees visible anywhere near the airport in the version packaged by debian).

But I have found a number of issues with display lists that affect FlightGear.
I have attached patches for those to bug 73504. What they fix specifically is
occasional crashes and brief flashes of random geometry. It's possible that the
same bugs are also responsible for the missing trees.

If not, can you capture an apitrace that shows the problem?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/2] st/mesa: add support for GL_ARB_viewport_array (v0.2)

2014-02-04 Thread Ilia Mirkin
On Tue, Jan 21, 2014 at 1:09 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> this just ties the mesa code to the pre-existing gallium interface,
> I'm not sure what to do with the CSO stuff yet.
>
> 0.2: fix min/max bounds
>
> Signed-off-by: Dave Airlie 

Series is Acked-by: Ilia Mirkin 

Don't know if I have enough knowledge here for a R-b, but this works
with my nv50 impl now (after changing the piglit tests around :) )

It's a little inefficient that the st does diffing but sets all the
viewports anyways, and then the driver basically has to do that as
well. But perhaps that's the convention. [Also note that your r600
impl last I looked didn't do the diffing in the driver, so it would
set all the viewports/scissors every time. But perhaps I misread.]

> ---
>  src/mesa/state_tracker/st_atom_scissor.c  | 75 
> ---
>  src/mesa/state_tracker/st_atom_viewport.c | 37 ---
>  src/mesa/state_tracker/st_context.h   |  4 +-
>  src/mesa/state_tracker/st_draw_feedback.c |  2 +-
>  src/mesa/state_tracker/st_extensions.c|  9 
>  5 files changed, 72 insertions(+), 55 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_scissor.c 
> b/src/mesa/state_tracker/st_atom_scissor.c
> index a1f72da..50b921a 100644
> --- a/src/mesa/state_tracker/st_atom_scissor.c
> +++ b/src/mesa/state_tracker/st_atom_scissor.c
> @@ -43,51 +43,56 @@
>  static void
>  update_scissor( struct st_context *st )
>  {
> -   struct pipe_scissor_state scissor;
> +   struct pipe_scissor_state scissor[PIPE_MAX_VIEWPORTS];
> const struct gl_context *ctx = st->ctx;
> const struct gl_framebuffer *fb = ctx->DrawBuffer;
> GLint miny, maxy;
> +   int i;
> +   bool changed = false;
> +   for (i = 0 ; i < ctx->Const.MaxViewports; i++) {
> +  scissor[i].minx = 0;
> +  scissor[i].miny = 0;
> +  scissor[i].maxx = fb->Width;
> +  scissor[i].maxy = fb->Height;
>
> -   scissor.minx = 0;
> -   scissor.miny = 0;
> -   scissor.maxx = fb->Width;
> -   scissor.maxy = fb->Height;
> +  if (ctx->Scissor.EnableFlags & (1 << i)) {
> + /* need to be careful here with xmax or ymax < 0 */
> + GLint xmax = MAX2(0, ctx->Scissor.ScissorArray[i].X + 
> ctx->Scissor.ScissorArray[i].Width);
> + GLint ymax = MAX2(0, ctx->Scissor.ScissorArray[i].Y + 
> ctx->Scissor.ScissorArray[i].Height);
>
> -   if (ctx->Scissor.EnableFlags & 1) {
> -  /* need to be careful here with xmax or ymax < 0 */
> -  GLint xmax = MAX2(0, ctx->Scissor.ScissorArray[0].X + 
> ctx->Scissor.ScissorArray[0].Width);
> -  GLint ymax = MAX2(0, ctx->Scissor.ScissorArray[0].Y + 
> ctx->Scissor.ScissorArray[0].Height);
> + if (ctx->Scissor.ScissorArray[i].X > (GLint)scissor[i].minx)
> +scissor[i].minx = ctx->Scissor.ScissorArray[i].X;
> + if (ctx->Scissor.ScissorArray[i].Y > (GLint)scissor[i].miny)
> +scissor[i].miny = ctx->Scissor.ScissorArray[i].Y;
>
> -  if (ctx->Scissor.ScissorArray[0].X > (GLint)scissor.minx)
> - scissor.minx = ctx->Scissor.ScissorArray[0].X;
> -  if (ctx->Scissor.ScissorArray[0].Y > (GLint)scissor.miny)
> - scissor.miny = ctx->Scissor.ScissorArray[0].Y;
> + if (xmax < (GLint) scissor[i].maxx)
> +scissor[i].maxx = xmax;
> + if (ymax < (GLint) scissor[i].maxy)
> +scissor[i].maxy = ymax;
>
> -  if (xmax < (GLint) scissor.maxx)
> - scissor.maxx = xmax;
> -  if (ymax < (GLint) scissor.maxy)
> - scissor.maxy = ymax;
> + /* check for null space */
> + if (scissor[i].minx >= scissor[i].maxx || scissor[i].miny >= 
> scissor[i].maxy)
> +scissor[i].minx = scissor[i].miny = scissor[i].maxx = 
> scissor[i].maxy = 0;
> +  }
>
> -  /* check for null space */
> -  if (scissor.minx >= scissor.maxx || scissor.miny >= scissor.maxy)
> - scissor.minx = scissor.miny = scissor.maxx = scissor.maxy = 0;
> -   }
> -
> -   /* Now invert Y if needed.
> -* Gallium drivers use the convention Y=0=top for surfaces.
> -*/
> -   if (st_fb_orientation(fb) == Y_0_TOP) {
> -  miny = fb->Height - scissor.maxy;
> -  maxy = fb->Height - scissor.miny;
> -  scissor.miny = miny;
> -  scissor.maxy = maxy;
> -   }
> +  /* Now invert Y if needed.
> +   * Gallium drivers use the convention Y=0=top for surfaces.
> +   */
> +  if (st_fb_orientation(fb) == Y_0_TOP) {
> + miny = fb->Height - scissor[i].maxy;
> + maxy = fb->Height - scissor[i].miny;
> + scissor[i].miny = miny;
> + scissor[i].maxy = maxy;
> +  }
>
> -   if (memcmp(&scissor, &st->state.scissor, sizeof(scissor)) != 0) {
> -  /* state has changed */
> -  st->state.scissor = scissor;  /* struct copy */
> -  st->pipe->set_scissor_states(st->pipe, 0, 1, &scissor); /* activate */
> +  if (memcmp(&scissor[i], &st->state.scissor[i], sizeof(scissor[i])) != 0) {
> + /* state has changed */
> + st
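The per-viewport clamping in the quoted update_scissor loop can be sketched
as a standalone function. The struct and function names here are
hypothetical; the arithmetic follows the quoted diff: start from the full
framebuffer, intersect with the scissor box if it is enabled, collapse empty
results to zero, then flip Y for Y=0=top surfaces.

```c
#include <stdbool.h>

/* Hypothetical stand-in for pipe_scissor_state. */
struct rect { int minx, miny, maxx, maxy; };

static int
max_int(int a, int b)
{
   return a > b ? a : b;
}

static struct rect
clamp_scissor(int fb_w, int fb_h, int x, int y, int w, int h,
              bool enabled, bool y0_top)
{
   struct rect r = { 0, 0, fb_w, fb_h };

   if (enabled) {
      /* Be careful with xmax or ymax < 0. */
      int xmax = max_int(0, x + w);
      int ymax = max_int(0, y + h);

      if (x > r.minx) r.minx = x;
      if (y > r.miny) r.miny = y;
      if (xmax < r.maxx) r.maxx = xmax;
      if (ymax < r.maxy) r.maxy = ymax;

      /* Check for null space. */
      if (r.minx >= r.maxx || r.miny >= r.maxy)
         r.minx = r.miny = r.maxx = r.maxy = 0;
   }

   /* Invert Y if the surface uses the Y=0=top convention. */
   if (y0_top) {
      int miny = fb_h - r.maxy;
      int maxy = fb_h - r.miny;
      r.miny = miny;
      r.maxy = maxy;
   }

   return r;
}
```

Computing one rect per viewport index this way also makes it natural to
compare each result against the cached state element by element.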

[Mesa-dev] [Bug 72895] Missing trees in flightgear 2.12.1 with r600 driver and mesa 10.0.1

2014-02-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=72895

--- Comment #8 from Barto  ---
The problem of missing trees in FlightGear is still present in Mesa 10.0.3.

I also notice that if I type:

"export LIBGL_ALWAYS_SOFTWARE=1"

the bug is still there, so it occurs even in software rendering mode.

That suggests the bug will occur regardless of the graphics card.



Re: [Mesa-dev] [PATCH 26/35] meta: Use common GLSL code for blits

2014-02-04 Thread Marek Olšák
On Tue, Feb 4, 2014 at 9:06 PM, Roland Scheidegger  wrote:
> Am 04.02.2014 13:19, schrieb Marek Olšák:
>> On Tue, Feb 4, 2014 at 10:29 AM, Rogovin, Kevin  
>> wrote:
>>>
>>>
>>>> I don't believe our hardware can support GL_ARB_shader_stencil_export.
>>>> The render target write message can take RGBA, depth, and sample masks,
>>>> but not stencil.  Without that, it's not at all obvious how to implement 
>>>> it.
>>>
>>> There is a terrible hack-ish way to do it, but I stress the word terrible 
>>> and hackish and
>>> it may not work correctly depending on the tile modes and all that fun.
>>>
>>> Here goes. Assuming the depth-stencil is D24S8 and that the tile modes
>>> work out, we can do this:
>>>
>>> Bind src depth-stencil as RGBA_8UI, the depth should be in RGB and the 
>>> stencil in A.
>>> Bind the dst  depth-stencil as RGBA_8UI as well. Fragment shader is simple 
>>> unfiltered
>>> read and write to dest. If not writing to depth or stencil, mask out RGB or 
>>> A respectively.
>>>
>>> The above does not handle MSAA->non-MSAA. Going further, it can be done in 
>>> general
>>> on *paper* with GL_ARB_texture_view if that is extended to allow D24S8 to 
>>> be on the same
>>> castable category as RGBA_8UI. The main catch is how the tile modes work 
>>> out, i.e. if the
>>> tile mode for a D24S8 is "compatible" with a RGBA_8UI render target.
>>
>> This is how r300g does it. It blits D24S8 as RGBA_UNORM. Gallium has
>> texture views and it has no limitations on how you can change the
>> format, so it's pretty trivial. r300g changes the format, then calls
>> our "meta" code (u_blitter).
>>
>
> It is not actually obvious this is something which should work in
> gallium in that way. The docs say the format must be "compatible"
> without saying much else (obviously, same number of block bits is a
> requirement). u_format has some function which would check if formats
> are compatible (albeit for a different purpose) and it wouldn't consider
> your example compatible.
> The intention of sampler views is it should allow casting allowed by the
> APIs (d3d10, whatever GL extension).
> The definition of this is definitely loose.

Yeah, it's not very well defined, but that's not important, because
r300g changes the format *internally*, because it knows it can do
that.

Marek
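The masked copy described in the quoted proposal (treat a D24S8 texel as
RGBA8 and mask out RGB or A when only stencil or only depth is written) can
be sketched per texel. This assumes a packed 32-bit texel with depth in the
low 24 bits (the R, G and B bytes) and stencil in the top 8 bits (the A
byte); that layout, the mask names and the function name are assumptions for
illustration, and real hardware layouts and tiling may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing: depth in bits 0..23, stencil in bits 24..31. */
#define TEXEL_DEPTH_MASK   0x00ffffffu
#define TEXEL_STENCIL_MASK 0xff000000u

/* Copy depth and/or stencil from src into dst, like a color write mask. */
static uint32_t
blit_d24s8_texel(uint32_t dst, uint32_t src,
                 bool write_depth, bool write_stencil)
{
   uint32_t mask = 0;

   if (write_depth)
      mask |= TEXEL_DEPTH_MASK;     /* RGB channels enabled */
   if (write_stencil)
      mask |= TEXEL_STENCIL_MASK;   /* A channel enabled */

   return (dst & ~mask) | (src & mask);
}
```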


Re: [Mesa-dev] [PATCH] gallivm: fix F2U opcode

2014-02-04 Thread Zack Rusin
Looks good.

Reviewed-by: Zack Rusin 

- Original Message -
> From: Roland Scheidegger 
> 
> Previously, we were really doing F2I. Also move it to the generic section.
> (Note that for llvmpipe the code generated is definitely bad, due to the lack
> of unsigned conversions with sse. I think though that what llvm does (using
> scalar conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit),
> including lots of domain changes) is quite suboptimal. It could do something
> like
> is_large = arg >= 2^31
> half_arg = 0.5 * arg
> small_c = fptoint(arg)
> large_c = fptoint(half_arg) << 1
> res = select(is_large, large_c, small_c)
> which should be far fewer instructions, but that's something llvm should do
> itself.)
> 
> This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
> GL 3.0 version override to run.)
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |   42
>  ++--
>  1 file changed, 22 insertions(+), 20 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> index caaeb01..b9546db 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> @@ -720,10 +720,23 @@ sub_emit(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> - emit_data->output[emit_data->chan] = LLVMBuildFSub(
> - bld_base->base.gallivm->builder,
> - emit_data->args[0],
> - emit_data->args[1], "");
> +   emit_data->output[emit_data->chan] =
> +  LLVMBuildFSub(bld_base->base.gallivm->builder,
> +emit_data->args[0],
> +emit_data->args[1], "");
> +}
> +
> +/* TGSI_OPCODE_F2U */
> +static void
> +f2u_emit(
> +   const struct lp_build_tgsi_action * action,
> +   struct lp_build_tgsi_context * bld_base,
> +   struct lp_build_emit_data * emit_data)
> +{
> +   emit_data->output[emit_data->chan] =
> +  LLVMBuildFPToUI(bld_base->base.gallivm->builder,
> +  emit_data->args[0],
> +  bld_base->base.int_vec_type, "");
>  }
>  
>  /* TGSI_OPCODE_U2F */
> @@ -733,9 +746,10 @@ u2f_emit(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> -   emit_data->output[emit_data->chan] =
> LLVMBuildUIToFP(bld_base->base.gallivm->builder,
> - emit_data->args[0],
> - 
> bld_base->base.vec_type, "");
> +   emit_data->output[emit_data->chan] =
> +  LLVMBuildUIToFP(bld_base->base.gallivm->builder,
> +  emit_data->args[0],
> +  bld_base->base.vec_type, "");
>  }
>  
>  static void
> @@ -949,6 +963,7 @@ lp_set_default_actions(struct lp_build_tgsi_context *
> bld_base)
> bld_base->op_actions[TGSI_OPCODE_SUB].emit = sub_emit;
>  
> bld_base->op_actions[TGSI_OPCODE_UARL].emit = mov_emit;
> +   bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit;
> bld_base->op_actions[TGSI_OPCODE_U2F].emit = u2f_emit;
> bld_base->op_actions[TGSI_OPCODE_UMAD].emit = umad_emit;
> bld_base->op_actions[TGSI_OPCODE_UMUL].emit = umul_emit;
> @@ -1128,18 +1143,6 @@ f2i_emit_cpu(
>  emit_data->args[0]);
>  }
>  
> -/* TGSI_OPCODE_F2U (CPU Only) */
> -static void
> -f2u_emit_cpu(
> -   const struct lp_build_tgsi_action * action,
> -   struct lp_build_tgsi_context * bld_base,
> -   struct lp_build_emit_data * emit_data)
> -{
> -   /* FIXME: implement and use lp_build_utrunc() */
> -   emit_data->output[emit_data->chan] = lp_build_itrunc(&bld_base->base,
> -emit_data->args[0]);
> -}
> -
>  /* TGSI_OPCODE_FSET Helper (CPU Only) */
>  static void
>  fset_emit_cpu(
> @@ -1832,7 +1835,6 @@ lp_set_default_actions_cpu(
> bld_base->op_actions[TGSI_OPCODE_DIV].emit = div_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_EX2].emit = ex2_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_F2I].emit = f2i_emit_cpu;
> -   bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_FLR].emit = flr_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_FSEQ].emit = fseq_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_FSGE].emit = fsge_emit_cpu;
> --
> 1.7.9.5
> 
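The halving trick outlined in the quoted commit message can be sketched in
scalar C. This is only an illustration of the idea, not what llvm or
gallivm emits; the function name is made up. It assumes a 32-bit float input
in the range [0, 2^32). The SIMD form would compute both conversions and
select between them; scalar C uses a branch instead, because converting an
out-of-range float to a signed integer is undefined behaviour in C.

```c
#include <stdint.h>

/*
 * Convert a float in [0, 2^32) to uint32_t using only a
 * float -> signed 32-bit conversion.
 */
static uint32_t
f2u_via_signed(float arg)
{
   int is_large = arg >= 2147483648.0f;   /* 2^31 */

   if (is_large) {
      /* Floats >= 2^31 are even integers, so halving is exact and the
       * result fits in a signed 32-bit integer. */
      float half_arg = 0.5f * arg;
      return (uint32_t)(int32_t)half_arg << 1;
   }

   /* Small values convert directly. */
   return (uint32_t)(int32_t)arg;
}
```

No bit is lost in the large case for 32-bit floats, because values at or
above 2^31 are spaced 256 apart; the same shortcut would drop the lowest bit
for double inputs.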


Re: [Mesa-dev] [PATCH] gallivm: fix F2U opcode

2014-02-04 Thread Jose Fonseca
Looks good to me.

Jose

- Original Message -
> From: Roland Scheidegger 
> 
> Previously, we were really doing F2I. Also move it to the generic section.
> (Note that for llvmpipe the code generated is definitely bad, due to the lack
> of unsigned conversions with sse. I think though that what llvm does (using
> scalar conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit),
> including lots of domain changes) is quite suboptimal. It could do something
> like
> is_large = arg >= 2^31
> half_arg = 0.5 * arg
> small_c = fptoint(arg)
> large_c = fptoint(half_arg) << 1
> res = select(is_large, large_c, small_c)
> which should be far fewer instructions, but that's something llvm should do
> itself.)
> 
> This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
> GL 3.0 version override to run.)
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |   42
>  ++--
>  1 file changed, 22 insertions(+), 20 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> index caaeb01..b9546db 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> @@ -720,10 +720,23 @@ sub_emit(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> - emit_data->output[emit_data->chan] = LLVMBuildFSub(
> - bld_base->base.gallivm->builder,
> - emit_data->args[0],
> - emit_data->args[1], "");
> +   emit_data->output[emit_data->chan] =
> +  LLVMBuildFSub(bld_base->base.gallivm->builder,
> +emit_data->args[0],
> +emit_data->args[1], "");
> +}
> +
> +/* TGSI_OPCODE_F2U */
> +static void
> +f2u_emit(
> +   const struct lp_build_tgsi_action * action,
> +   struct lp_build_tgsi_context * bld_base,
> +   struct lp_build_emit_data * emit_data)
> +{
> +   emit_data->output[emit_data->chan] =
> +  LLVMBuildFPToUI(bld_base->base.gallivm->builder,
> +  emit_data->args[0],
> +  bld_base->base.int_vec_type, "");
>  }
>  
>  /* TGSI_OPCODE_U2F */
> @@ -733,9 +746,10 @@ u2f_emit(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> -   emit_data->output[emit_data->chan] =
> LLVMBuildUIToFP(bld_base->base.gallivm->builder,
> - emit_data->args[0],
> - 
> bld_base->base.vec_type, "");
> +   emit_data->output[emit_data->chan] =
> +  LLVMBuildUIToFP(bld_base->base.gallivm->builder,
> +  emit_data->args[0],
> +  bld_base->base.vec_type, "");
>  }
>  
>  static void
> @@ -949,6 +963,7 @@ lp_set_default_actions(struct lp_build_tgsi_context *
> bld_base)
> bld_base->op_actions[TGSI_OPCODE_SUB].emit = sub_emit;
>  
> bld_base->op_actions[TGSI_OPCODE_UARL].emit = mov_emit;
> +   bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit;
> bld_base->op_actions[TGSI_OPCODE_U2F].emit = u2f_emit;
> bld_base->op_actions[TGSI_OPCODE_UMAD].emit = umad_emit;
> bld_base->op_actions[TGSI_OPCODE_UMUL].emit = umul_emit;
> @@ -1128,18 +1143,6 @@ f2i_emit_cpu(
>  emit_data->args[0]);
>  }
>  
> -/* TGSI_OPCODE_F2U (CPU Only) */
> -static void
> -f2u_emit_cpu(
> -   const struct lp_build_tgsi_action * action,
> -   struct lp_build_tgsi_context * bld_base,
> -   struct lp_build_emit_data * emit_data)
> -{
> -   /* FIXME: implement and use lp_build_utrunc() */
> -   emit_data->output[emit_data->chan] = lp_build_itrunc(&bld_base->base,
> -emit_data->args[0]);
> -}
> -
>  /* TGSI_OPCODE_FSET Helper (CPU Only) */
>  static void
>  fset_emit_cpu(
> @@ -1832,7 +1835,6 @@ lp_set_default_actions_cpu(
> bld_base->op_actions[TGSI_OPCODE_DIV].emit = div_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_EX2].emit = ex2_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_F2I].emit = f2i_emit_cpu;
> -   bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_FLR].emit = flr_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_FSEQ].emit = fseq_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_FSGE].emit = fsge_emit_cpu;
> --
> 1.7.9.5
> 


Re: [Mesa-dev] [PATCH] gallivm: allow large numbers of temporaries

2014-02-04 Thread Jose Fonseca
Sounds great to me.

Jose

- Original Message -
> The number of allowed temporaries increases with almost every
> iteration of an API. We used to support 128, then we started
> increasing, and the newer APIs support 4096+. So if we notice
> that the number of temporaries is larger than our statically
> allocated storage would allow, we just treat them as indexable
> temporaries and allocate them as an array from the start.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 9db41a9..7c5de21 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -2672,8 +2672,8 @@ lp_emit_declaration_soa(
>assert(last <= bld->bld_base.info->file_max[decl->Declaration.File]);
>switch (decl->Declaration.File) {
>case TGSI_FILE_TEMPORARY:
> - assert(idx < LP_MAX_TGSI_TEMPS);
>   if (!(bld->indirect_files & (1 << TGSI_FILE_TEMPORARY))) {
> +assert(idx < LP_MAX_TGSI_TEMPS);
>  for (i = 0; i < TGSI_NUM_CHANNELS; i++)
> bld->temps[idx][i] = lp_build_alloca(gallivm, vec_type,
> "temp");
>   }
> @@ -3621,6 +3621,15 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
> bld.bld_base.info = info;
> bld.indirect_files = info->indirect_files;
>  
> +   /*
> +* If the number of temporaries is rather large then we just
> +* allocate them as an array right from the start and treat
> +* like indirect temporaries.
> +*/
> +   if (info->file_max[TGSI_FILE_TEMPORARY] >= LP_MAX_TGSI_TEMPS) {
> +  bld.indirect_files |= (1 << TGSI_FILE_TEMPORARY);
> +   }
> +
> bld.bld_base.soa = TRUE;
> bld.bld_base.emit_debug = emit_debug;
> bld.bld_base.emit_fetch_funcs[TGSI_FILE_CONSTANT] = emit_fetch_constant;
> --
> 1.8.3.2
> 


Re: [Mesa-dev] [PATCH] gallivm: allow large numbers of temporaries

2014-02-04 Thread Roland Scheidegger
Am 04.02.2014 03:43, schrieb Zack Rusin:
> The number of allowed temporaries increases with almost every
> iteration of an API. We used to support 128, then we started
> increasing, and the newer APIs support 4096+. So if we notice
> that the number of temporaries is larger than our statically
> allocated storage would allow, we just treat them as indexable
> temporaries and allocate them as an array from the start.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 9db41a9..7c5de21 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -2672,8 +2672,8 @@ lp_emit_declaration_soa(
>assert(last <= bld->bld_base.info->file_max[decl->Declaration.File]);
>switch (decl->Declaration.File) {
>case TGSI_FILE_TEMPORARY:
> - assert(idx < LP_MAX_TGSI_TEMPS);
>   if (!(bld->indirect_files & (1 << TGSI_FILE_TEMPORARY))) {
> +assert(idx < LP_MAX_TGSI_TEMPS);
>  for (i = 0; i < TGSI_NUM_CHANNELS; i++)
> bld->temps[idx][i] = lp_build_alloca(gallivm, vec_type, 
> "temp");
>   }
> @@ -3621,6 +3621,15 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
> bld.bld_base.info = info;
> bld.indirect_files = info->indirect_files;
>  
> +   /*
> +* If the number of temporaries is rather large then we just
> +* allocate them as an array right from the start and treat
> +* like indirect temporaries.
> +*/
> +   if (info->file_max[TGSI_FILE_TEMPORARY] >= LP_MAX_TGSI_TEMPS) {
> +  bld.indirect_files |= (1 << TGSI_FILE_TEMPORARY);
> +   }
> +
> bld.bld_base.soa = TRUE;
> bld.bld_base.emit_debug = emit_debug;
> bld.bld_base.emit_fetch_funcs[TGSI_FILE_CONSTANT] = emit_fetch_constant;
> 

Looks good to me.

Roland
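
For anyone skimming the thread, the fallback in the patch can be sketched
in isolation. This is only an illustration under assumed names:
MAX_STATIC_TEMPS and FILE_TEMPORARY are hypothetical stand-ins for
LP_MAX_TGSI_TEMPS and TGSI_FILE_TEMPORARY (the values here are made up),
and the two helpers do not exist in gallivm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values live in the gallivm/TGSI headers. */
#define MAX_STATIC_TEMPS 256
#define FILE_TEMPORARY   4

/*
 * Return the indirect-file mask to use for a shader whose highest
 * temporary index is max_temp_index. If the temporaries do not fit the
 * statically sized storage, mark the temporary file as indirect so the
 * caller allocates it as one array instead.
 */
uint32_t
effective_indirect_files(uint32_t indirect_files, int max_temp_index)
{
   if (max_temp_index >= MAX_STATIC_TEMPS)
      indirect_files |= 1u << FILE_TEMPORARY;
   return indirect_files;
}

/* True when per-temporary allocas (and the index assert) still apply. */
bool
uses_static_temps(uint32_t indirect_files)
{
   return !(indirect_files & (1u << FILE_TEMPORARY));
}
```

This also shows why the assert had to move inside the non-indirect branch:
once the temporary file is flagged as indirect, indices at or above the
static limit are expected rather than an error.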


Re: [Mesa-dev] [PATCH RFC 00/11] glsl: add Single Static Assignment (SSA)

2014-02-04 Thread Connor Abbott
On Fri, Jan 31, 2014 at 3:34 PM, Paul Berry  wrote:

> On 22 January 2014 09:16, Connor Abbott  wrote:
>
>> This series enables GLSL IR support for SSA, including passes to convert
>> to and from SSA form. SSA is a form of the intermediate representation
>> of a compiler in which each variable is assigned exactly once. SSA form
>> makes many optimizations faster and easier to write, and enables other
>> more powerful optimizations. SSA is used in GCC [1] and LLVM [2] as well
>> as various compiler backends within Mesa itself, such as r600g-sb and
>> Nouveau. Adding support for SSA will allow the various optimizations
>> these backends perform to be implemented in one place, instead of
>> making each driver reinvent the wheel (as several have already done).
>> Additionally, all new backends would receive these optimizations,
>> reducing the burden of writing a compiler backend for a new driver.
>>
>> Even though no optimization passes are now implemented, I am putting out
>> this series to solicit feedback on the design, to make sure I don't have
>> to rewrite things before I go ahead and write these new passes.
>>
>> There are no piglit regressions on Softpipe, except for the
>> spec/OpenGL 2.0/max-samplers test, which only passed before because the
>> compiler happened to unroll the loop; the extra copies caused by the
>> conversion to and from SSA stop the compiler from unrolling, meaning
>> that the resulting GLSL IR code contains an indirect sampler index which
>> glsl-to-tgsi can't handle.
>>
>
> I had a detailed look at your patches and I really like where you're going
> with this.  I sent a lot of feedback in response to patches 4, 5, 7, 9, and
> 10; I don't think any of my feedback would require a major change to your
> overall plan.  Nice work!  Consider patches 1, 2, 3, 6, 8, and 11:
>
> Reviewed-by: Paul Berry 
>

Thanks Paul! I'll implement most of your changes for a v2; I only have a
few things I'd like to comment on first.


>
> Note, however, that probably patch 11 should be postponed until we've
> written something that makes use of the SSA form (i.e. the GVN-GCM
> algorithm you mentioned, or a direct conversion from GLSL IR to an SSA
> backend).
>

Note that both of the backends I wrote for the lima driver are SSA-based, so
I hope to use this for that.

Also, a lot of the optimizations that are made possible with SSA would
either be not as powerful or harder to write with a tree-based IR. For
example, take GVN-GCM: part of an expression tree may be using SSA
variables only, so it would be a candidate for GCM or GVN to optimize, but
as long as only one part of the tree uses a non-SSA temporary, then we
can't optimize the entire thing. I thought that for these sorts of
situations I could implement a hack by just flattening the expression tree
of anything that writes to an SSA variable, but after watching Ian's talk
at FOSDEM I'm worried that this would make the current
running-out-of-memory problems even worse. So, we may have to wait until
"flatland" arrives before we can take full advantage of SSA. Regardless,
this should help us transition to a flat IR anyways (since it makes use-def
chains trivial to compute), so it may be worth it to commit this even
before most of the optimizations are written (we can still do copy
propagation and DCE though).
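
To illustrate the use-def point, here is a toy sketch. It is not GLSL IR,
and every name in it is made up for illustration. Because each SSA value
has exactly one definition, finding the definition of an operand is a
direct lookup rather than the result of a reaching-definitions analysis,
and use counts (enough for a trivial DCE) fall out of one linear pass.

```c
#include <stddef.h>

enum toy_op { TOY_CONST, TOY_ADD, TOY_MUL };

/* One instruction defines exactly one SSA value; its index is the value's name. */
struct toy_instr {
   enum toy_op op;
   int src[2];      /* indices of the defining instructions, -1 if unused */
   int imm;         /* payload for TOY_CONST */
};

/* Use-def: the definition of operand n of instruction i is just an index. */
const struct toy_instr *
toy_def_of(const struct toy_instr *prog, size_t i, int n)
{
   int def = prog[i].src[n];
   return def < 0 ? NULL : &prog[def];
}

/* Count the uses of each value in one linear pass over the program. */
void
toy_count_uses(const struct toy_instr *prog, size_t count, unsigned *uses)
{
   for (size_t i = 0; i < count; i++)
      uses[i] = 0;
   for (size_t i = 0; i < count; i++)
      for (int n = 0; n < 2; n++)
         if (prog[i].src[n] >= 0)
            uses[prog[i].src[n]]++;
}
```

A value whose use count is zero and which has no side effects is dead,
which is about all a basic DCE pass needs to know.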

Connor


Re: [Mesa-dev] [PATCH 05/14] st/omx: initial OpenMAX support v3

2014-02-04 Thread Matt Turner
On Tue, Feb 4, 2014 at 7:17 AM, Christian König  wrote:
> diff --git a/configure.ac b/configure.ac
> index ba158e8..98007f9 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2011,6 +2037,7 @@ AC_CONFIG_FILES([Makefile
> src/gallium/state_trackers/gbm/Makefile
> src/gallium/state_trackers/glx/xlib/Makefile
> src/gallium/state_trackers/osmesa/Makefile
> +   src/gallium/state_trackers/omx/Makefile

omx before osmesa.


Re: [Mesa-dev] [PATCH 26/35] meta: Use common GLSL code for blits

2014-02-04 Thread Roland Scheidegger
Am 04.02.2014 13:19, schrieb Marek Olšák:
> On Tue, Feb 4, 2014 at 10:29 AM, Rogovin, Kevin  
> wrote:
>>
>>
>>> I don't believe our hardware can support GL_ARB_shader_stencil_export.
>>> The render target write message can take RGBA, depth, and sample masks,
>>> but not stencil.  Without that, it's not at all obvious how to implement it.
>>
>> There is a terrible hack-ish way to do it, but I stress the word terrible 
>> and hackish and
>> it may not work correctly depending on the tile modes and all that fun.
>>
>> Here goes. Assuming the depth-stencil is D24S8 we can do this and that the
>> tile modes work out:
>>
>> Bind src depth-stencil as RGBA_8UI, the depth should be in RGB and the 
>> stencil in A.
>> Bind the dst  depth-stencil as RGBA_8UI as well. Fragment shader is simple 
>> unfiltered
>> read and write to dest. If not writing to depth or stencil, mask out RGB or 
>> A respectively.
>>
>> The above does not handle MSAA->non-MSAA. Going further, it can be done in 
>> general
>> on *paper* with GL_ARB_texture_view if that is extended to allow D24S8 to be 
>> on the same
>> castable category as RGBA_8UI. The main catch is how the tile modes work 
>> out, i.e. if the
>> tile mode for a D24S8 is "compatible" with a RGBA_8UI render target.
> 
> This is how r300g does it. It blits D24S8 as RGBA_UNORM. Gallium has
> texture views and it has no limitations on how you can change the
> format, so it's pretty trivial. r300g changes the format, then calls
> our "meta" code (u_blitter).
> 

It is not actually obvious this is something which should work in
gallium in that way. The docs say the format must be "compatible"
without saying much else (obviously, same number of block bits is a
requirement). u_format has some function which would check if formats
are compatible (albeit for a different purpose) and it wouldn't consider
your example compatible.
The intention of sampler views is to allow the casts permitted by the
APIs (d3d10, whatever GL extension).
The definition of this is definitely loose.
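
For concreteness, here is a CPU-side sketch of what the reinterpretation
discussed above amounts to. It assumes a packed 32-bit texel with depth in
the low 24 bits and stencil in the top 8 (so that, viewed as four 8-bit
channels on a little-endian machine, depth lands in RGB and stencil in A).
The real bit layout and the tiling are format- and hardware-specific, and
the function names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEPTH_MASK   0x00ffffffu   /* "RGB" when viewed as 8-bit channels */
#define STENCIL_MASK 0xff000000u   /* "A" */

/* Copy one texel, honouring per-component write masks like a colour mask. */
uint32_t
blit_ds_texel(uint32_t dst, uint32_t src, bool write_depth, bool write_stencil)
{
   uint32_t mask = 0;

   if (write_depth)
      mask |= DEPTH_MASK;
   if (write_stencil)
      mask |= STENCIL_MASK;
   return (dst & ~mask) | (src & mask);
}

/* The one hard requirement mentioned above: equal bits per block. */
bool
same_block_bits(unsigned bits_a, unsigned bits_b)
{
   return bits_a == bits_b;
}
```

Equal block size is necessary but, as noted, says nothing about whether
the two formats are laid out compatibly in memory.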

Roland


Re: [Mesa-dev] [PATCH] centroid affects interpolation

2014-02-04 Thread Chris Forbes
`centroid` has never been an interpolation qualifier, though.

In GLSL 1.20, `centroid varying` is a storage qualifier.
In GLSL 1.30, `centroid in`, `centroid out` are added as storage qualifiers.
In GLSL 4.20 (or ARB_shading_language_420pack), `centroid`, `sample`,
and `patch` are split from `in` and `out`, and made `auxiliary storage
qualifiers`.


On Wed, Feb 5, 2014 at 2:01 AM, Kevin Rogovin  wrote:
> Place centroid keyword as an interpolation qualifier.
> Previously was a storage qualifier. Fixes front end
> to accept input of the form "centroid in type variable"
>
> ---
>  src/glsl/glsl_parser.yy | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index 928c57e..265fc57 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -1353,6 +1353,11 @@ interpolation_qualifier:
>memset(& $$, 0, sizeof($$));
>$$.flags.q.flat = 1;
> }
> +   | CENTROID
> +   {
> +  memset(& $$, 0, sizeof($$));
> +  $$.flags.q.centroid = 1;
> +   }
> | NOPERSPECTIVE
> {
>memset(& $$, 0, sizeof($$));
> @@ -1501,13 +1506,7 @@ type_qualifier:
> }
> ;
>
> -auxiliary_storage_qualifier:
> -   CENTROID
> -   {
> -  memset(& $$, 0, sizeof($$));
> -  $$.flags.q.centroid = 1;
> -   }
> -   | SAMPLE
> +auxiliary_storage_qualifier:SAMPLE
> {
>memset(& $$, 0, sizeof($$));
>$$.flags.q.sample = 1;
> --
> 1.8.1.2
>


Re: [Mesa-dev] [PATCH] r600g,radeonsi: set domains in one place; put DRI2 tiled textures in VRAM

2014-02-04 Thread Christian König

Am 04.02.2014 20:42, schrieb Marek Olšák:

From: Marek Olšák 

This is a rework of "r600g,radeonsi: force VRAM placement for DRI2 buffers".
It mainly consolidates the code determining resource placements. It also takes
Prime into account.
---
  src/gallium/drivers/r600/r600_state_common.c|  4 +-
  src/gallium/drivers/radeon/r600_buffer_common.c | 72 +++--
  src/gallium/drivers/radeon/r600_pipe_common.h   |  3 +-
  src/gallium/drivers/radeon/r600_texture.c   | 11 ++--
  src/gallium/drivers/radeonsi/si_descriptors.c   |  4 +-
  5 files changed, 54 insertions(+), 40 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index d8fab10..c1d7e29 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -2082,8 +2082,8 @@ static void r600_invalidate_buffer(struct pipe_context 
*ctx, struct pipe_resourc
pb_reference(&rbuffer->buf, NULL);
  
  	/* Create a new one in the same pipe_resource. */

-   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0, 
alignment,
-  TRUE, rbuffer->b.b.usage);
+   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0,
+  alignment, TRUE);
  
  	/* We changed the buffer, now we need to bind it where the old one was bound. */

/* Vertex buffers. */
diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 2077228..466630e 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -100,45 +100,56 @@ void *r600_buffer_map_sync_with_rings(struct 
r600_common_context *ctx,
return ctx->ws->buffer_map(resource->cs_buf, NULL, usage);
  }
  
+void r600_set_resource_placement(struct r600_resource *res)

+{
+   if (res->b.b.target == PIPE_BUFFER) {
+   /* Buffer placement is determined from the usage flag directly. 
*/
+   switch (res->b.b.usage) {
+   case PIPE_USAGE_IMMUTABLE:
+   case PIPE_USAGE_DEFAULT:
+   case PIPE_USAGE_STATIC:
+   default:
+   /* Not listing GTT here improves performance in some 
apps. */
+   res->domains = RADEON_DOMAIN_VRAM;
+   break;
+   case PIPE_USAGE_DYNAMIC:
+   case PIPE_USAGE_STREAM:
+   case PIPE_USAGE_STAGING:
+   /* These resources participate in transfers, i.e. are 
used
+* for uploads and downloads from regular resources. */
+   res->domains = RADEON_DOMAIN_GTT;
+   break;
+   }
+   } else {
+   /* Texture placement is determined from the tiling mode,
+* which was determined from the usage flags. */
+   struct r600_texture *rtex = (struct r600_texture*)res;
+
+   if (rtex->surface.level[0].mode >= RADEON_SURF_MODE_1D) {
+   /* Tiled textures are unmappable. Always put them in 
VRAM. */
+   res->domains = RADEON_DOMAIN_VRAM;
+   } else {
+   /* Linear textures should be in GTT for Prime and 
texture
+* transfers. Also, why would anyone want to have a 
linear
+* texture in VRAM? */


VCE doesn't support tiling, and you still want the resource to be in VRAM.

Christian.


+   res->domains = RADEON_DOMAIN_GTT;
+   }
+   }
+}
+
  bool r600_init_resource(struct r600_common_screen *rscreen,
struct r600_resource *res,
unsigned size, unsigned alignment,
-   bool use_reusable_pool, unsigned usage)
+   bool use_reusable_pool)
  {
-   uint32_t initial_domain, domains;
-
-   switch(usage) {
-   case PIPE_USAGE_STAGING:
-   case PIPE_USAGE_DYNAMIC:
-   case PIPE_USAGE_STREAM:
-   /* These resources participate in transfers, i.e. are used
-* for uploads and downloads from regular resources.
-* We generate them internally for some transfers.
-*/
-   initial_domain = RADEON_DOMAIN_GTT;
-   domains = RADEON_DOMAIN_GTT;
-   break;
-   case PIPE_USAGE_DEFAULT:
-   case PIPE_USAGE_STATIC:
-   case PIPE_USAGE_IMMUTABLE:
-   default:
-   /* Don't list GTT here, because the memory manager would put 
some
-* resources to GTT no matter what the initial domain is.
-* Not listing GTT in the domains improves performance a lot. */
-   initial_domain = RADEON_DOMAIN_VRAM;
-   domains = RADEON_DOMAIN_VRAM;
-   break;
-   }
-
res->buf = rscreen

[Mesa-dev] [PATCH] r600g,radeonsi: set domains in one place; put DRI2 tiled textures in VRAM

2014-02-04 Thread Marek Olšák
From: Marek Olšák 

This is a rework of "r600g,radeonsi: force VRAM placement for DRI2 buffers".
It mainly consolidates the code determining resource placements. It also takes
Prime into account.
---
 src/gallium/drivers/r600/r600_state_common.c|  4 +-
 src/gallium/drivers/radeon/r600_buffer_common.c | 72 +++--
 src/gallium/drivers/radeon/r600_pipe_common.h   |  3 +-
 src/gallium/drivers/radeon/r600_texture.c   | 11 ++--
 src/gallium/drivers/radeonsi/si_descriptors.c   |  4 +-
 5 files changed, 54 insertions(+), 40 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index d8fab10..c1d7e29 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -2082,8 +2082,8 @@ static void r600_invalidate_buffer(struct pipe_context 
*ctx, struct pipe_resourc
pb_reference(&rbuffer->buf, NULL);
 
/* Create a new one in the same pipe_resource. */
-   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0, 
alignment,
-  TRUE, rbuffer->b.b.usage);
+   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0,
+  alignment, TRUE);
 
/* We changed the buffer, now we need to bind it where the old one was 
bound. */
/* Vertex buffers. */
diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 2077228..466630e 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -100,45 +100,56 @@ void *r600_buffer_map_sync_with_rings(struct 
r600_common_context *ctx,
return ctx->ws->buffer_map(resource->cs_buf, NULL, usage);
 }
 
+void r600_set_resource_placement(struct r600_resource *res)
+{
+   if (res->b.b.target == PIPE_BUFFER) {
+   /* Buffer placement is determined from the usage flag directly. 
*/
+   switch (res->b.b.usage) {
+   case PIPE_USAGE_IMMUTABLE:
+   case PIPE_USAGE_DEFAULT:
+   case PIPE_USAGE_STATIC:
+   default:
+   /* Not listing GTT here improves performance in some 
apps. */
+   res->domains = RADEON_DOMAIN_VRAM;
+   break;
+   case PIPE_USAGE_DYNAMIC:
+   case PIPE_USAGE_STREAM:
+   case PIPE_USAGE_STAGING:
+   /* These resources participate in transfers, i.e. are 
used
+* for uploads and downloads from regular resources. */
+   res->domains = RADEON_DOMAIN_GTT;
+   break;
+   }
+   } else {
+   /* Texture placement is determined from the tiling mode,
+* which was determined from the usage flags. */
+   struct r600_texture *rtex = (struct r600_texture*)res;
+
+   if (rtex->surface.level[0].mode >= RADEON_SURF_MODE_1D) {
+   /* Tiled textures are unmappable. Always put them in 
VRAM. */
+   res->domains = RADEON_DOMAIN_VRAM;
+   } else {
+   /* Linear textures should be in GTT for Prime and 
texture
+* transfers. Also, why would anyone want to have a 
linear
+* texture in VRAM? */
+   res->domains = RADEON_DOMAIN_GTT;
+   }
+   }
+}
+
 bool r600_init_resource(struct r600_common_screen *rscreen,
struct r600_resource *res,
unsigned size, unsigned alignment,
-   bool use_reusable_pool, unsigned usage)
+   bool use_reusable_pool)
 {
-   uint32_t initial_domain, domains;
-
-   switch(usage) {
-   case PIPE_USAGE_STAGING:
-   case PIPE_USAGE_DYNAMIC:
-   case PIPE_USAGE_STREAM:
-   /* These resources participate in transfers, i.e. are used
-* for uploads and downloads from regular resources.
-* We generate them internally for some transfers.
-*/
-   initial_domain = RADEON_DOMAIN_GTT;
-   domains = RADEON_DOMAIN_GTT;
-   break;
-   case PIPE_USAGE_DEFAULT:
-   case PIPE_USAGE_STATIC:
-   case PIPE_USAGE_IMMUTABLE:
-   default:
-   /* Don't list GTT here, because the memory manager would put 
some
-* resources to GTT no matter what the initial domain is.
-* Not listing GTT in the domains improves performance a lot. */
-   initial_domain = RADEON_DOMAIN_VRAM;
-   domains = RADEON_DOMAIN_VRAM;
-   break;
-   }
-
res->buf = rscreen->ws->buffer_create(rscreen->ws, size, alignment,
   use_reusable_pool,
-   

[Mesa-dev] [PATCH] configure: Use LLVM shared libraries by default

2014-02-04 Thread Tom Stellard
From: Tom Stellard 

Linking with LLVM static libraries is easily broken by changes to
the llvm-config program or when LLVM adds, removes, or changes library
components.  Keeping up with these changes requires a lot of maintenance
effort to keep the build working on the master and stable branches.

Also, because of past issues with LLVM static libraries, the release
manager is currently configuring with --with-llvm-shared-libs when
checking the build before release.  Enabling shared libraries by
default would allow the release manager to run ./configure with
no arguments, and be reasonably confident that the build would succeed.

CC: "10.1" 

---
 configure.ac | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index 4da6c51..9568e7b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1528,11 +1528,11 @@ AC_ARG_ENABLE([gallium-llvm],
 [enable_gallium_llvm="$enableval"],
 [enable_gallium_llvm=auto])
 
-AC_ARG_WITH([llvm-shared-libs],
-[AS_HELP_STRING([--with-llvm-shared-libs],
-[link with LLVM shared libraries @<:@default=disabled@:>@])],
+AC_ARG_ENABLE([llvm-shared-libs],
+[AS_HELP_STRING([--enable-llvm-shared-libs],
+[link with LLVM shared libraries @<:@default=enabled@:>@])],
 [with_llvm_shared_libs="$enableval"],
-[with_llvm_shared_libs=no])
+[with_llvm_shared_libs=yes])
 
 AC_ARG_WITH([llvm-prefix],
 [AS_HELP_STRING([--with-llvm-prefix],
-- 
1.8.1.4



[Mesa-dev] [PATCH] gallivm: fix F2U opcode

2014-02-04 Thread sroland
From: Roland Scheidegger 

Previously, we were really doing F2I. Also move it to the generic section.
(Note that for llvmpipe the generated code is definitely bad, due to the lack
of unsigned conversions with sse. I think, though, that what llvm does (using
scalar conversions to 64bit signed, either with the x87 fpu (32bit) or sse
(64bit), including lots of domain changes) is quite suboptimal; it could do
something like
is_large = arg >= 2^31
half_arg = 0.5 * arg
small_c = fptoint(arg)
large_c = fptoint(half_arg) << 1
res = select(is_large, large_c, small_c)
which should be far fewer instructions, but that's something llvm should do
itself.)

This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
GL 3.0 version override to run.)
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |   42 ++--
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index caaeb01..b9546db 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -720,10 +720,23 @@ sub_emit(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
-   emit_data->output[emit_data->chan] = LLVMBuildFSub(
-   bld_base->base.gallivm->builder,
-   emit_data->args[0],
-   emit_data->args[1], "");
+   emit_data->output[emit_data->chan] =
+  LLVMBuildFSub(bld_base->base.gallivm->builder,
+emit_data->args[0],
+emit_data->args[1], "");
+}
+
+/* TGSI_OPCODE_F2U */
+static void
+f2u_emit(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   emit_data->output[emit_data->chan] =
+  LLVMBuildFPToUI(bld_base->base.gallivm->builder,
+  emit_data->args[0],
+  bld_base->base.int_vec_type, "");
 }
 
 /* TGSI_OPCODE_U2F */
@@ -733,9 +746,10 @@ u2f_emit(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
-   emit_data->output[emit_data->chan] = 
LLVMBuildUIToFP(bld_base->base.gallivm->builder,
-   emit_data->args[0],
-   
bld_base->base.vec_type, "");
+   emit_data->output[emit_data->chan] =
+  LLVMBuildUIToFP(bld_base->base.gallivm->builder,
+  emit_data->args[0],
+  bld_base->base.vec_type, "");
 }
 
 static void
@@ -949,6 +963,7 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
bld_base)
bld_base->op_actions[TGSI_OPCODE_SUB].emit = sub_emit;
 
bld_base->op_actions[TGSI_OPCODE_UARL].emit = mov_emit;
+   bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit;
bld_base->op_actions[TGSI_OPCODE_U2F].emit = u2f_emit;
bld_base->op_actions[TGSI_OPCODE_UMAD].emit = umad_emit;
bld_base->op_actions[TGSI_OPCODE_UMUL].emit = umul_emit;
@@ -1128,18 +1143,6 @@ f2i_emit_cpu(
 emit_data->args[0]);
 }
 
-/* TGSI_OPCODE_F2U (CPU Only) */
-static void
-f2u_emit_cpu(
-   const struct lp_build_tgsi_action * action,
-   struct lp_build_tgsi_context * bld_base,
-   struct lp_build_emit_data * emit_data)
-{
-   /* FIXME: implement and use lp_build_utrunc() */
-   emit_data->output[emit_data->chan] = lp_build_itrunc(&bld_base->base,
-emit_data->args[0]);
-}
-
 /* TGSI_OPCODE_FSET Helper (CPU Only) */
 static void
 fset_emit_cpu(
@@ -1832,7 +1835,6 @@ lp_set_default_actions_cpu(
bld_base->op_actions[TGSI_OPCODE_DIV].emit = div_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_EX2].emit = ex2_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_F2I].emit = f2i_emit_cpu;
-   bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_FLR].emit = flr_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_FSEQ].emit = fseq_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_FSGE].emit = fsge_emit_cpu;
-- 
1.7.9.5
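
As a scalar illustration of the is_large/half_arg outline in the commit
message, a minimal sketch follows. The function name is made up, inputs are
assumed to lie in [0, 2^32), and a branch stands in for the select because
in C an out-of-range float to signed conversion is undefined; a vector
implementation would compute both sides and select.

```c
#include <stdint.h>

/*
 * Float -> uint32 using only a float -> int32 conversion.
 * Inputs outside [0, 2^32) are out of scope for this sketch.
 */
uint32_t
f2u_via_signed(float arg)
{
   const float two_pow_31 = 2147483648.0f;

   if (arg >= two_pow_31) {
      /* Floats this large are multiples of 256, so halving is exact. */
      int32_t half = (int32_t)(0.5f * arg);
      return (uint32_t)half << 1;
   }
   return (uint32_t)(int32_t)arg;
}
```

The halved value is always below 2^31, so it fits the signed conversion,
and the shift restores the original magnitude without losing bits.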


Re: [Mesa-dev] [PATCH] mesa: remove stray bits of GL_EXT_cull_vertex

2014-02-04 Thread Eric Anholt
Brian Paul  writes:

> GL_EXT_cull_vertex was removed back in 2010 in commit 02984e3536
> but these bits still lingered.

Reviewed-by: Eric Anholt 




[Mesa-dev] [PATCH v2] loader: Get driver name from udev hwdb when available

2014-02-04 Thread Kristian Høgsberg
The udev hwdb is a mechanism for applying udev properties to devices at
hotplug time.  The hwdb text files are compiled into a binary database
that lets udev efficiently look up and apply properties to devices that
match a given modalias.

This patch exports the mesa PCI ID tables as hwdb files and extends the
loader code to try to look up the driver name from the DRI_DRIVER udev
property.  The benefits to this approach are:

 - No longer PCI specific, any device udev can match with a modalias can
   be assigned a DRI driver.

 - Available outside mesa; writing a DRI2 compatible generic DDX with
   glamor needs to know the DRI driver name to send to the client.

 - Can be overridden by custom udev rules.

Signed-off-by: Kristian Høgsberg 
---

This v2 rewrites dump-hwdb in shell so we don't have to worry about
cross-compilation and host cc stuff.  One down-side to this is that we
duplicate the vendor id + driver name mapping from pci_id_driver_map.h,
but it's not a whole lot of duplication.

Eric, the install-data-hook is fine, it looks like this:

  [krh@tokamak loader]$ make install udevhwdbdir=$PWD/foo 
  ...
  gmake  install-data-hook
  test -z "" && udevadm hwdb --update
  Failure writing database /etc/udev/hwdb.bin: Permission denied
  gmake[2]: [install-data-hook] Error 1 (ignored)

and doesn't break make install otherwise.

Kristian
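
To make the record format concrete, here is a small C sketch that mirrors
the per-device printf in dump-hwdb.sh as posted. The function name is
invented, and the vendor/device/driver values in the usage note are only
example inputs. The sketch copies the script's lowercase hex format and
does not check it against the modalias strings udev actually sees.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one hwdb record: a modalias glob line followed by an indented
 * property assignment and a blank line. Returns what snprintf returns,
 * i.e. the number of characters the full record needs.
 */
int
format_hwdb_entry(char *buf, size_t size, unsigned vendor, unsigned device,
                  const char *driver)
{
   return snprintf(buf, size, "pci:v%08xd%08x*\n DRI_DRIVER=%s\n\n",
                   vendor, device, driver);
}
```

Calling it with vendor 0x8086, device 0x0166 and driver "i965" yields a
"pci:v00008086d00000166*" match line followed by " DRI_DRIVER=i965".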


 configure.ac| 14 ++
 src/loader/.gitignore   |  1 +
 src/loader/Makefile.am  | 11 +++
 src/loader/dump-hwdb.sh | 29 +
 src/loader/loader.c | 33 +
 src/loader/loader.h |  2 +-
 6 files changed, 81 insertions(+), 9 deletions(-)
 create mode 100644 src/loader/.gitignore
 create mode 100755 src/loader/dump-hwdb.sh

diff --git a/configure.ac b/configure.ac
index ba158e8..9bf27d4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -771,6 +771,20 @@ if test "x$have_libdrm" = xyes; then
DEFINES="$DEFINES -DHAVE_LIBDRM"
 fi
 
+# This /lib prefix does not change with 32/64 bits it's always /lib
+case "$prefix" in
+/usr) default_udevhwdbdir=/lib/udev/hwdb.d ;;
+NONE) default_udevhwdbdir=${ac_default_prefix}/lib/udev/hwdb.d ;;
+*)default_udevhwdbdir=$prefix/lib/udev/hwdb.d ;;
+esac
+
+AC_ARG_WITH([udev-hwdb-dir],
+[AS_HELP_STRING([--with-udev-hwdb-dir=DIR],
+[directory for the udev hwdb @<:@/lib/udev/hwdb.d@:>@])],
+[udevhwdbdir="$withval"],
+[udevhwdbdir=$default_udevhwdbdir])
+AC_SUBST([udevhwdbdir])
+
 PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
   have_libudev=yes, have_libudev=no)
 
diff --git a/src/loader/.gitignore b/src/loader/.gitignore
new file mode 100644
index 000..e11c470
--- /dev/null
+++ b/src/loader/.gitignore
@@ -0,0 +1 @@
+20-dri-driver.hwdb
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index bddf7ac..14c85d0 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -41,3 +41,14 @@ libloader_la_LIBADD = \
 endif
 
 libloader_la_SOURCES = $(LOADER_C_FILES)
+
+
+dist_udevhwdb_DATA = 20-dri-driver.hwdb
+
+# Update hwdb on installation. Do not bother if installing
+# in DESTDIR, since this is likely for packaging purposes.
+install-data-hook :
+   -test -z "$(DESTDIR)" && udevadm hwdb --update
+
+20-dri-driver.hwdb :
+   $(srcdir)/dump-hwdb.sh > $@-tmp && mv $@-tmp $@
diff --git a/src/loader/dump-hwdb.sh b/src/loader/dump-hwdb.sh
new file mode 100755
index 000..2034c75
--- /dev/null
+++ b/src/loader/dump-hwdb.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+set -e
+
+PROP_NAME=DRI_DRIVER
+
+while read vendor driver; do
+pci_id_file=../../include/pci_ids/${driver}_pci_ids.h
+if ! test -r $pci_id_file; then
+printf "pci:v%08x*bc03*\n $PROP_NAME=$driver\n\n" $vendor
+continue
+fi
+
+while IFS=' (,' read c id rest; do
+test -z "$id" && continue
+printf "pci:v%08xd%08x*\n $PROP_NAME=$driver\n\n" $vendor $id
+
+done < $pci_id_file
+done <= 0);
+   return (*driver != NULL) || (*chip_id >= 0);
 }
 
 #elif defined(ANDROID) && !defined(__NOT_HAVE_DRM_H)
@@ -195,11 +203,12 @@ out:
 #include 
 
 int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
+loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
 {
drmVersionPtr version;
 
*chip_id = -1;
+   *driver = NULL;
 
version = drmGetVersion(fd);
if (!version) {
@@ -261,7 +270,7 @@ loader_get_pci_id_for_fd(int fd, int *vendor_id, int 
*chip_id)
 #else
 
 int
-loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
+loader_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id, char **driver)
 {
return 0;
 }
@@ -310,7 +319,7 @@ loader_get_driver_for_fd(int fd, unsigned driver_types)
if (!driver_types)
   driver_types = _LOADER_GALLIUM | _LOADER_DRI;
 
-   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id)) {
+   if (!loader_get_pci_id_for_fd(fd, &vendor_id, &chip_id, &driver)) {
 
 #ifndef __NOT_HAVE_DRM_H
   /* fa

Re: [Mesa-dev] [PATCH 3/3] i965: Bump MaxTexMbytes from 1GB to 1.5GB.

2014-02-04 Thread Daniel Vetter
On Sun, Feb 02, 2014 at 03:16:45AM -0800, Kenneth Graunke wrote:
> Even with the other limits raised, TestProxyTexImage would still reject
> textures > 1GB in size.  This is an artificial limit; nothing prevents
> us from having a larger texture.  I stayed shy of 2GB to avoid the
> larger-than-aperture situation.
> 
> For 3D textures, this raises the effective limit:
>  - RGBA8:   645 -> 738
>  - RGBA16:  512 -> 586
>  - RGBA32F: 406 -> 465
> 
> Cc: i...@freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 17b75e1..66d6ccb 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -306,6 +306,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>ctx->Const.MaxTextureLevels = MAX_TEXTURE_LEVELS;
> ctx->Const.Max3DTextureLevels = 12; /* 2048 */
> ctx->Const.MaxCubeTextureLevels = 14; /* 8192 */
> +   ctx->Const.MaxTextureMbytes = 1536;

Original gen4 (i.e. i965g) only has 512 MB of aperture ... Also, going this
high runs the risk of fouling things up through fragmentation, but meh.

You'd need to get at bufmgr_gem->gtt_size somehow. At least the current
code is safe for address spaces > 4G.
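
As a rough sketch of the point above (not code from the patch): the limit could be derived from the aperture size instead of being hard-coded. Both helper names below are hypothetical, the 3/4 budget is an arbitrary assumption, and the aperture size is passed in as a plain parameter because the thread does not say how bufmgr_gem->gtt_size would be exposed. The second helper reproduces the quoted 3D texture figures, assuming the proxy check simply compares width * height * depth * bytes-per-texel against the limit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: cap the wanted MaxTextureMbytes at a fraction of
 * the aperture.  The 3/4 budget is an assumption for illustration only. */
static unsigned
clamp_max_texture_mbytes(uint64_t aperture_bytes, unsigned wanted_mbytes)
{
   uint64_t budget_mbytes = (aperture_bytes / (1024 * 1024)) * 3 / 4;

   return budget_mbytes < wanted_mbytes ? (unsigned) budget_mbytes
                                        : wanted_mbytes;
}

/* Hypothetical helper: largest n such that an n x n x n texture with the
 * given texel size still fits within limit_mbytes. */
static unsigned
max_cubic_3d_size(unsigned limit_mbytes, unsigned bytes_per_texel)
{
   assert(bytes_per_texel > 0);

   uint64_t limit = (uint64_t) limit_mbytes * 1024 * 1024;
   unsigned n = 0;

   /* Grow n while the next larger cube still fits. */
   while ((uint64_t) (n + 1) * (n + 1) * (n + 1) * bytes_per_texel <= limit)
      n++;

   return n;
}
```

With a 512 MB aperture and the 3/4 budget, a requested 1536 would be cut to 384; with 4 bytes per texel (RGBA8) the second helper gives 645 for a 1024 MB limit and 738 for 1536 MB, matching the commit message.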
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-02-04 Thread Jordan Justen
On Tue, Feb 4, 2014 at 8:43 AM, Paul Berry  wrote:
> On 1 February 2014 23:21, Jordan Justen  wrote:
>>
>> On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry 
>> wrote:
>> > This will allow testing of compute shader functionality before it is
>> > completed.
>> >
>> > To enable ARB_compute_shader functionality in the i965 driver, set
>> > INTEL_COMPUTE_SHADER=1.
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_context.c  | 11 ++-
>> >  src/mesa/drivers/dri/i965/intel_extensions.c |  2 ++
>> >  2 files changed, 12 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
>> > b/src/mesa/drivers/dri/i965/brw_context.c
>> > index 1b42751..76dd9be 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_context.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> > @@ -298,10 +298,17 @@ brw_initialize_context_constants(struct
>> > brw_context *brw)
>> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits =
>> > BRW_MAX_TEX_UNIT;
>> > else
>> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits =
>> > 0;
>> > +   if (getenv("INTEL_COMPUTE_SHADER")) {
>>
>> What about trying to make use of
>> MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader?
>>
>> We could add
>> extensions.c:bool _mesa_is_extension_override_enabled(char *)
>>
>> And then
>> if (_mesa_is_extension_override_enabled("GL_ARB_compute_shader"))
>>
>> Or, similarly, get overrides shoved into ctx->Extensions to allow
>> checking for overrides at this early stage.
>>
>> -Jordan
>
>
> I looked into what would be necessary to do this, and it's unfortunately
> more complicated than it should be due to some flukes about initialization
> order (perhaps this is what you were alluding to when you said "get
> overrides shoved into ctx->Extensions to allow checking for overrides at
> this early stage").

Yeah, I'll admit I looked at it a bit, and decided to keep my feedback
'hand wavy'. :)

>  Currently, our initialization order is:
>
> 1. brwCreateContext() calls brw_initialize_context_constants(), which needs
> to know whether compute shaders are supported in order to set constants like
> MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS correctly.
>
> 2. Later, brwCreateContext() calls intelInitExtensions(), which is the
> function where we'll set ctx->Extensions.ARB_compute_shader to true once
> compute shaders are fully supported.
>
> 3. Much later in initialization, when the context is being made current for
> the first time, core Mesa calls _mesa_make_extension_string(), which calls
> get_extension_override() to modify the ctx->Extensions table based on
> the environment variable override.
>
> To do what you suggested, I would either have to:
>
> A. change 1 to call _mesa_is_extension_override_enabled(); that function, in
> turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we
> couldn't reuse the parsing code in _mesa_make_extension_string(), since that
> parsing code updates ctx->Extensions as a side effect, and it's not time to
> update ctx->Extensions yet.  In addition to the code duplication, we would
> have a danger of bugs, since the override takes effect so late in
> initialization--if we ever accidentally introduced some code in between
> steps 2 and 3 that checked the value of ctx->Extensions.ARB_compute_shader,
> it would see false even if compute shaders were enabled by override.
>
> Or:
>
> B. change the order of initialization so that 2 happens first, followed by 3
> and then 1.  In the long run I think this would be a good thing, but it's a
> big change (since it affects all back ends) and I'm not sure it would be a
> good idea to hold up this patch series waiting for it.
>
>
> How would you feel if I landed the patch series as is, and then worked on a
> follow up patch to do B?

Sure, you can have my r-b for this then.

-Jordan

>> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits =
>> > BRW_MAX_TEX_UNIT;
>> > +  ctx->Const.MaxUniformBufferBindings += 12;
>> > +   } else {
>> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 0;
>> > +   }
>> > ctx->Const.MaxCombinedTextureImageUnits =
>> >ctx->Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits +
>> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits +
>> > -  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits;
>> > +  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits +
>> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
>> >
>> > ctx->Const.MaxTextureLevels = 14; /* 8192 */
>> > if (ctx->Const.MaxTextureLevels > MAX_TEXTURE_LEVELS)
>> > @@ -425,9 +432,11 @@ brw_initialize_context_constants(struct brw_context
>> > *brw)
>> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters =
>> > MAX_ATOMIC_COUNTERS;
>> >ctx->Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters =
>> > MAX_ATOMIC_COUNTERS;
>> >ctx->Const.Program[MESA_SHAD

[Mesa-dev] [PATCH] mesa: remove stray bits of GL_EXT_cull_vertex

2014-02-04 Thread Brian Paul
GL_EXT_cull_vertex was removed back in 2010 in commit 02984e3536
but these bits still lingered.
---
 src/mesa/main/matrix.c |   13 +
 src/mesa/main/mtypes.h |3 ---
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/src/mesa/main/matrix.c b/src/mesa/main/matrix.c
index b213022..99a5013 100644
--- a/src/mesa/main/matrix.c
+++ b/src/mesa/main/matrix.c
@@ -606,16 +606,8 @@ calculate_model_project_matrix( struct gl_context *ctx )
  */
 void _mesa_update_modelview_project( struct gl_context *ctx, GLuint new_state )
 {
-   if (new_state & _NEW_MODELVIEW) {
+   if (new_state & _NEW_MODELVIEW)
   _math_matrix_analyse( ctx->ModelviewMatrixStack.Top );
-
-  /* Bring cull position up to date.
-   */
-  TRANSFORM_POINT3( ctx->Transform.CullObjPos, 
-   ctx->ModelviewMatrixStack.Top->inv,
-   ctx->Transform.CullEyePos );
-   }
-
 
if (new_state & _NEW_PROJECTION)
   update_projection( ctx );
@@ -762,9 +754,6 @@ void _mesa_init_transform( struct gl_context *ctx )
   ASSIGN_4V( ctx->Transform.EyeUserPlane[i], 0.0, 0.0, 0.0, 0.0 );
}
ctx->Transform.ClipPlanesEnabled = 0;
-
-   ASSIGN_4V( ctx->Transform.CullObjPos, 0.0, 0.0, 1.0, 0.0 );
-   ASSIGN_4V( ctx->Transform.CullEyePos, 0.0, 0.0, 1.0, 0.0 );
 }
 
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5fc15af..b9ac2b3 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1423,9 +1423,6 @@ struct gl_transform_attrib
GLboolean RescaleNormals;   /**< GL_EXT_rescale_normal */
GLboolean RasterPositionUnclipped;   /**< GL_IBM_rasterpos_clip */
GLboolean DepthClamp;   /**< GL_ARB_depth_clamp */
-
-   GLfloat CullEyePos[4];
-   GLfloat CullObjPos[4];
 };
 
 
-- 
1.7.10.4



Re: [Mesa-dev] Potential fix for #70410

2014-02-04 Thread Aaron Watry
Yup, that's the same exact patch that I sent to the LLVM list.  It's
been working just fine along with the mesa patch in:
https://bugs.freedesktop.org/attachment.cgi?id=91764

I've been using that on my own system for a while now.  It could
probably use more eyes/testing, but it seems ok from my usage.

--Aaron

On Tue, Feb 4, 2014 at 10:46 AM, Krzysztof A. Sobiecki  wrote:
> "Armin K."  writes:
>> This would be easier to fix in LLVM. The newline is rather unnecessary
>> in the output.
> I'm neither able nor willing to hack LLVM, but
> https://bugs.freedesktop.org/attachment.cgi?id=91751 looks nice?
>
> --
> X was an interactive protocol:
> alpha blending a full-screen image looked like slugs racing down the monitor.
> http://www.keithp.com/~keithp/talks/usenix2000/render.html


Re: [Mesa-dev] [PATCH] gallivm: allow large numbers of temporaries

2014-02-04 Thread Brian Paul

On 02/03/2014 07:43 PM, Zack Rusin wrote:

The number of allowed temporaries increases with almost every
iteration of an API. We used to support 128, then we started
increasing the limit, and the newer APIs support 4096+. So if we notice
that the number of temporaries is larger than our statically
allocated storage would allow, we just treat them as indexable
temporaries and allocate them as an array from the start.

Signed-off-by: Zack Rusin 
---


LGTM.  Reviewed-by: Brian Paul 



Re: [Mesa-dev] [PATCH 30/30] i965/cs: Allow ARB_compute_shader to be enabled via env var.

2014-02-04 Thread Paul Berry
On 1 February 2014 23:21, Jordan Justen  wrote:

> On Thu, Jan 9, 2014 at 6:19 PM, Paul Berry 
> wrote:
> > This will allow testing of compute shader functionality before it is
> > completed.
> >
> > To enable ARB_compute_shader functionality in the i965 driver, set
> > INTEL_COMPUTE_SHADER=1.
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c  | 11 ++-
> >  src/mesa/drivers/dri/i965/intel_extensions.c |  2 ++
> >  2 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> > index 1b42751..76dd9be 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -298,10 +298,17 @@ brw_initialize_context_constants(struct
> brw_context *brw)
> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits =
> BRW_MAX_TEX_UNIT;
> > else
> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 0;
> > +   if (getenv("INTEL_COMPUTE_SHADER")) {
>
> What about trying to make use of
> MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader?
>
> We could add
> extensions.c:bool _mesa_is_extension_override_enabled(char *)
>
> And then
> if (_mesa_is_extension_override_enabled("GL_ARB_compute_shader"))
>
> Or, similarly, get overrides shoved into ctx->Extensions to allow
> checking for overrides at this early stage.
>
> -Jordan
>

I looked into what would be necessary to do this, and it's unfortunately
more complicated than it should be due to some flukes about initialization
order (perhaps this is what you were alluding to when you said "get
overrides shoved into ctx->Extensions to allow checking for overrides at
this early stage").  Currently, our initialization order is:

1. brwCreateContext() calls brw_initialize_context_constants(), which needs
to know whether compute shaders are supported in order to set constants
like MAX_COMPUTE_TEXTURE_IMAGE_UNITS and MAX_UNIFORM_BUFFER_BINDINGS
correctly.

2. Later, brwCreateContext() calls intelInitExtensions(), which is the
function where we'll set ctx->Extensions.ARB_compute_shader to true once
compute shaders are fully supported.

3. Much later in initialization, when the context is being made current for
the first time, core Mesa calls _mesa_make_extension_string(), which calls
get_extension_override() to modify the ctx->Extensions table based on
the environment variable override.

To do what you suggested, I would either have to:

A. change 1 to call _mesa_is_extension_override_enabled(); that function,
in turn, would have to parse the MESA_EXTENSION_OVERRIDE string, but we
couldn't reuse the parsing code in _mesa_make_extension_string(), since
that parsing code updates ctx->Extensions as a side effect, and it's not
time to update ctx->Extensions yet.  In addition to the code duplication,
we would have a danger of bugs, since the override takes effect so late in
initialization--if we ever accidentally introduced some code in between
steps 2 and 3 that checked the value of ctx->Extensions.ARB_compute_shader,
it would see false even if compute shaders were enabled by override.

Or:

B. change the order of initialization so that 2 happens first, followed by
3 and then 1.  In the long run I think this would be a good thing, but it's
a big change (since it affects all back ends) and I'm not sure it would be
a good idea to hold up this patch series waiting for it.


How would you feel if I landed the patch series as is, and then worked on a
follow up patch to do B?
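
For illustration only, option A could look roughly like the sketch below: a parser that answers "is this extension enabled by the override string?" without touching ctx->Extensions. The function name is hypothetical, and the syntax it accepts is an assumption: a space-separated list in which "+NAME" or a bare "NAME" enables, "-NAME" disables, and the last mention wins. The string is taken as a parameter so the logic stays independent of the environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, side-effect-free check: does the override string enable
 * the named extension?  Nothing is written to any context state. */
static bool
extension_override_enabled(const char *override, const char *name)
{
   assert(name != NULL);

   bool enabled = false;
   size_t name_len = strlen(name);

   if (override == NULL)
      return false;

   const char *p = override;
   while (*p != '\0') {
      /* Skip separators between tokens. */
      while (*p == ' ')
         p++;
      if (*p == '\0')
         break;

      /* Find the end of the current token. */
      const char *end = p;
      while (*end != '\0' && *end != ' ')
         end++;

      /* An optional leading '+' or '-' selects enable or disable. */
      bool enable = true;
      if (*p == '+') {
         p++;
      } else if (*p == '-') {
         enable = false;
         p++;
      }

      /* Only an exact, full-length name match counts. */
      if ((size_t) (end - p) == name_len && strncmp(p, name, name_len) == 0)
         enabled = enable;

      p = end;
   }

   return enabled;
}
```

A caller in step 1 would pass getenv("MESA_EXTENSION_OVERRIDE") and "GL_ARB_compute_shader". The duplication concern still applies: this logic would have to be kept in sync by hand with whatever the real parser accepts.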


>
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits =
> BRW_MAX_TEX_UNIT;
> > +  ctx->Const.MaxUniformBufferBindings += 12;
> > +   } else {
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 0;
> > +   }
> > ctx->Const.MaxCombinedTextureImageUnits =
> >ctx->Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits +
> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits +
> > -  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits;
> > +  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits +
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
> >
> > ctx->Const.MaxTextureLevels = 14; /* 8192 */
> > if (ctx->Const.MaxTextureLevels > MAX_TEXTURE_LEVELS)
> > @@ -425,9 +432,11 @@ brw_initialize_context_constants(struct brw_context
> *brw)
> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters =
> MAX_ATOMIC_COUNTERS;
> >ctx->Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters =
> MAX_ATOMIC_COUNTERS;
> >ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters =
> MAX_ATOMIC_COUNTERS;
> > +  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxAtomicCounters =
> MAX_ATOMIC_COUNTERS;
> >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers =
> BRW_MAX_ABO;
> >ctx->Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers =
> BRW_MAX_ABO;
> >ctx->Const.Progr

[Mesa-dev] [Bug 74508] Steam games not launching.

2014-02-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=74508

Kenneth Graunke  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |i...@freedesktop.org
   |org |
 QA Contact||intel-3d-bugs@lists.freedes
   ||ktop.org
  Component|Other   |Drivers/DRI/i965

--- Comment #2 from Kenneth Graunke  ---
You might try:

1. Close Steam
2. Open a terminal, and run:

   export MESA_GLSL_VERSION_OVERRIDE=130
   steam &

3. Launch your game.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] Potential fix for #70410

2014-02-04 Thread Krzysztof A. Sobiecki
"Armin K."  writes:
> This would be easier to fix in LLVM. The newline is rather unnecessary
> in the output.
I'm neither able nor willing to hack LLVM, but
https://bugs.freedesktop.org/attachment.cgi?id=91751 looks nice?

-- 
X was an interactive protocol: 
alpha blending a full-screen image looked like slugs racing down the monitor. 
http://www.keithp.com/~keithp/talks/usenix2000/render.html


Re: [Mesa-dev] Potential fix for #70410

2014-02-04 Thread Armin K.
On 02/04/2014 05:24 PM, Krzysztof A. Sobiecki wrote:
> A small patch to work around a llvm-config-3.5 change, with a newline
> hack.
> 
> 
> 
> 
> 
> 

This would be easier to fix in LLVM. The newline is rather unnecessary
in the output.

-- 
Note: My last name is not Krejzi.


[Mesa-dev] [Bug 70410] egl-static/Makefile: linking fails with llvm >= 3.4

2014-02-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70410

--- Comment #19 from Krzysztof A. Sobiecki  ---
Patch sent to mesa-dev; if it is included I will close this bug. Leaving it open for now.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 09/14] radeon: just don't map VRAM buffers at all

2014-02-04 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Tue, Feb 4, 2014 at 4:17 PM, Christian König  wrote:
> From: Christian König 
>
> Signed-off-by: Christian König 
> ---
>  src/gallium/drivers/radeon/r600_texture.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_texture.c 
> b/src/gallium/drivers/radeon/r600_texture.c
> index 878b26f..eb1e191 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -911,8 +911,8 @@ static void *r600_texture_transfer_map(struct 
> pipe_context *ctx,
> if (rtex->surface.level[level].mode >= RADEON_SURF_MODE_1D)
> use_staging_texture = TRUE;
>
> -   /* Untiled buffers in VRAM, which is slow for CPU reads */
> -   if ((usage & PIPE_TRANSFER_READ) && !(usage & 
> PIPE_TRANSFER_MAP_DIRECTLY) &&
> +   /* Untiled buffers in VRAM, which is slow for CPU reads and writes */
> +   if (!(usage & PIPE_TRANSFER_MAP_DIRECTLY) &&
> (rtex->resource.domains == RADEON_DOMAIN_VRAM)) {
> use_staging_texture = TRUE;
> }
> --
> 1.8.3.2
>


Re: [Mesa-dev] Potential fix for #70410

2014-02-04 Thread Aaron Watry
On Tue, Feb 4, 2014 at 10:28 AM, Armin K.  wrote:
> On 02/04/2014 05:24 PM, Krzysztof A. Sobiecki wrote:
>> A small patch to work around a llvm-config-3.5 change, with a newline
>> hack.
>>
>
> This would be easier to fix in LLVM. The newline is rather unnecessary
> in the output.
>

I fully agree...  The patch that I sent to llvm-commits [1] went
completely ignored, and we'll probably need to get someone to provide
feedback.

[1] 
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140106/201074.html

--Aaron


[Mesa-dev] Potential fix for #70410

2014-02-04 Thread Krzysztof A. Sobiecki
A small patch to work around a llvm-config-3.5 change, with a newline
hack.

Signed-off-by: Krzysztof Sobiecki  gmail.com>
Tested-by: Kai Wasserbäch 
---

LLVM 3.5 added --system-libs to llvm-config, fix build failure.
Fixes #70410

diff --git a/configure.ac b/configure.ac
index ba158e8..c31d962 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1843,7 +1843,12 @@ dnl in LLVM_LIBS.
 
 if test "x$MESA_LLVM" != x0; then
 
-LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
+if test $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5 ; then
+LLVM_LIBS="`$LLVM_CONFIG --system-libs --libs ${LLVM_COMPONENTS} |tr "\n" " "`"
+dnl Because my llvm-config adds a new line...
+else
+LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
+fi
 
 if test "x$with_llvm_shared_libs" = xyes; then
 dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,

-- 
X was an interactive protocol: 
alpha blending a full-screen image looked like slugs racing down the monitor. 
http://www.keithp.com/~keithp/talks/usenix2000/render.html


[Mesa-dev] [PATCH] centroid affects interpolation

2014-02-04 Thread Kevin Rogovin
Treat the centroid keyword as an interpolation qualifier.
Previously it was parsed as a storage qualifier. This fixes the front end
so that it accepts input of the form "centroid in type variable".

---
 src/glsl/glsl_parser.yy | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 928c57e..265fc57 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -1353,6 +1353,11 @@ interpolation_qualifier:
   memset(& $$, 0, sizeof($$));
   $$.flags.q.flat = 1;
}
+   | CENTROID
+   {
+  memset(& $$, 0, sizeof($$));
+  $$.flags.q.centroid = 1;
+   }
| NOPERSPECTIVE
{
   memset(& $$, 0, sizeof($$));
@@ -1501,13 +1506,7 @@ type_qualifier:
}
;
 
-auxiliary_storage_qualifier:
-   CENTROID
-   {
-  memset(& $$, 0, sizeof($$));
-  $$.flags.q.centroid = 1;
-   }
-   | SAMPLE
+auxiliary_storage_qualifier:SAMPLE
{
   memset(& $$, 0, sizeof($$));
   $$.flags.q.sample = 1;
-- 
1.8.1.2



[Mesa-dev] [PATCH 13/14] st/omx: initial OpenMAX H264 encoder v4

2014-02-04 Thread Christian König
From: Christian König 

v2 (chk): fix eos handling
v3 (leo): implement scaling configuration support
v4 (leo): fix bitrate bug

Signed-off-by: Christian König 
Signed-off-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c |   2 +-
 src/gallium/state_trackers/omx/Makefile.am |   3 +-
 src/gallium/state_trackers/omx/entrypoint.c|   9 +-
 src/gallium/state_trackers/omx/vid_dec.c   |  15 +-
 src/gallium/state_trackers/omx/vid_enc.c   | 842 +
 src/gallium/state_trackers/omx/vid_enc.h   |  80 +++
 6 files changed, 942 insertions(+), 9 deletions(-)
 create mode 100644 src/gallium/state_trackers/omx/vid_enc.c
 create mode 100644 src/gallium/state_trackers/omx/vid_enc.h

diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c 
b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index e9502be..149748b 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -110,7 +110,7 @@ static void rate_control(struct rvce_encoder *enc)
RVCE_CS(enc->pic.rate_ctrl.peak_bits_picture_integer); // 
encPeakBitsPerPictureInteger
RVCE_CS(enc->pic.rate_ctrl.peak_bits_picture_fraction); // 
encPeakBitsPerPictureFractional
RVCE_CS(0x); // encMinQP
-   RVCE_CS(0x); // encMaxQP
+   RVCE_CS(0x0033); // encMaxQP
RVCE_CS(0x); // encSkipFrameEnable
RVCE_CS(0x); // encFillerDataEnable
RVCE_CS(0x); // encEnforceHRD
diff --git a/src/gallium/state_trackers/omx/Makefile.am 
b/src/gallium/state_trackers/omx/Makefile.am
index 1983248..53839d9 100644
--- a/src/gallium/state_trackers/omx/Makefile.am
+++ b/src/gallium/state_trackers/omx/Makefile.am
@@ -32,4 +32,5 @@ libomxtracker_la_SOURCES = \
entrypoint.c \
vid_dec.c \
vid_dec_mpeg12.c \
-   vid_dec_h264.c
+   vid_dec_h264.c \
+   vid_enc.c
diff --git a/src/gallium/state_trackers/omx/entrypoint.c 
b/src/gallium/state_trackers/omx/entrypoint.c
index bc8664b..c67b8c9 100644
--- a/src/gallium/state_trackers/omx/entrypoint.c
+++ b/src/gallium/state_trackers/omx/entrypoint.c
@@ -41,6 +41,7 @@
 
 #include "entrypoint.h"
 #include "vid_dec.h"
+#include "vid_enc.h"
 
 pipe_static_mutex(omx_lock);
 static Display *omx_display = NULL;
@@ -52,13 +53,17 @@ int omx_component_library_Setup(stLoaderComponentType 
**stComponents)
OMX_ERRORTYPE r;
 
if (stComponents == NULL)
-  return 1;
+  return 2;
 
r = vid_dec_LoaderComponent(stComponents[0]);
if (r != OMX_ErrorNone)
   return r;
 
-   return 1;
+   r = vid_enc_LoaderComponent(stComponents[1]);
+   if (r != OMX_ErrorNone)
+  return r;
+
+   return 2;
 }
 
 struct vl_screen *omx_get_screen(void)
diff --git a/src/gallium/state_trackers/omx/vid_dec.c 
b/src/gallium/state_trackers/omx/vid_dec.c
index 7be1dad..b8b519e 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -534,7 +534,6 @@ static void vid_dec_FillOutput(vid_dec_PrivateType *priv, 
struct pipe_video_buff
struct pipe_sampler_view **views;
struct pipe_transfer *transfer;
struct pipe_box box = { };
-
uint8_t *src, *dst;
 
views = buf->get_sampler_view_planes(buf);
@@ -561,8 +560,6 @@ static void vid_dec_FillOutput(vid_dec_PrivateType *priv, 
struct pipe_video_buff
util_copy_rect(dst, views[1]->texture->format, def->nStride, 0, 0,
   box.width, box.height, src, transfer->stride, 0, 0);
pipe_transfer_unmap(priv->pipe, transfer);
-
-   output->nFilledLen = output->nAllocLen;
 }
 
 static void vid_dec_FrameDecoded(OMX_COMPONENTTYPE *comp, 
OMX_BUFFERHEADERTYPE* input,
@@ -574,8 +571,16 @@ static void vid_dec_FrameDecoded(OMX_COMPONENTTYPE *comp, 
OMX_BUFFERHEADERTYPE*
if (!input->pInputPortPrivate)
   input->pInputPortPrivate = priv->Flush(priv);
 
-   if (input->pInputPortPrivate)
-  vid_dec_FillOutput(priv, input->pInputPortPrivate, output);
+   if (input->pInputPortPrivate) {
+  if (output->pInputPortPrivate) {
+ struct pipe_video_buffer *tmp = output->pOutputPortPrivate;
+ output->pOutputPortPrivate = input->pInputPortPrivate;
+ input->pInputPortPrivate = tmp;
+  } else {
+ vid_dec_FillOutput(priv, input->pInputPortPrivate, output);
+  }
+  output->nFilledLen = output->nAllocLen;
+   }
 
if (eos && input->pInputPortPrivate)
   vid_dec_FreeInputPortPrivate(input);
diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
new file mode 100644
index 000..3833f24
--- /dev/null
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -0,0 +1,842 @@
+/**
+ *
+ * Copyright 2013 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files 

[Mesa-dev] [PATCH 08/14] r600/video: disable tiling for now

2014-02-04 Thread Christian König
From: Christian König 

Signed-off-by: Christian König 
---
 src/gallium/drivers/r600/r600_uvd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_uvd.c 
b/src/gallium/drivers/r600/r600_uvd.c
index e0db492..c3da7f8 100644
--- a/src/gallium/drivers/r600/r600_uvd.c
+++ b/src/gallium/drivers/r600/r600_uvd.c
@@ -77,7 +77,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct 
pipe_context *pipe,
template.height = align(tmpl->height / array_size, 
VL_MACROBLOCK_HEIGHT);
 
vl_video_buffer_template(&templ, &template, resource_formats[0], 1, 
array_size, PIPE_USAGE_STATIC, 0);
-   if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
+   //if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
templ.bind = PIPE_BIND_LINEAR;
resources[0] = (struct r600_texture *)
pipe->screen->resource_create(pipe->screen, &templ);
@@ -86,7 +86,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct 
pipe_context *pipe,
 
if (resource_formats[1] != PIPE_FORMAT_NONE) {
vl_video_buffer_template(&templ, &template, 
resource_formats[1], 1, array_size, PIPE_USAGE_STATIC, 1);
-   if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
+   //if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
templ.bind = PIPE_BIND_LINEAR;
resources[1] = (struct r600_texture *)
pipe->screen->resource_create(pipe->screen, &templ);
@@ -96,7 +96,7 @@ struct pipe_video_buffer *r600_video_buffer_create(struct 
pipe_context *pipe,
 
if (resource_formats[2] != PIPE_FORMAT_NONE) {
vl_video_buffer_template(&templ, &template, 
resource_formats[2], 1, array_size, PIPE_USAGE_STATIC, 2);
-   if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
+   //if (ctx->b.chip_class < EVERGREEN || tmpl->interlaced)
templ.bind = PIPE_BIND_LINEAR;
resources[2] = (struct r600_texture *)
pipe->screen->resource_create(pipe->screen, &templ);
-- 
1.8.3.2



[Mesa-dev] [PATCH 07/14] radeon/video: directly create buffers in the right domain

2014-02-04 Thread Christian König
From: Christian König 

Avoid moving things around on start of stream.

Signed-off-by: Christian König 
---
 src/gallium/drivers/radeon/radeon_uvd.c   | 6 +++---
 src/gallium/drivers/radeon/radeon_video.c | 9 ++---
 src/gallium/drivers/radeon/radeon_video.h | 4 +++-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c 
b/src/gallium/drivers/radeon/radeon_uvd.c
index e12b6fb..3075905 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -815,12 +815,12 @@ struct pipe_video_codec *ruvd_create_decoder(struct 
pipe_context *context,
for (i = 0; i < NUM_BUFFERS; ++i) {
unsigned msg_fb_size = FB_BUFFER_OFFSET + FB_BUFFER_SIZE;
STATIC_ASSERT(sizeof(struct ruvd_msg) <= FB_BUFFER_OFFSET);
-   if (!rvid_create_buffer(dec->ws, &dec->msg_fb_buffers[i], 
msg_fb_size)) {
+   if (!rvid_create_buffer(dec->ws, &dec->msg_fb_buffers[i], 
msg_fb_size, RADEON_DOMAIN_VRAM)) {
RVID_ERR("Can't allocated message buffers.\n");
goto error;
}
 
-   if (!rvid_create_buffer(dec->ws, &dec->bs_buffers[i], 
bs_buf_size)) {
+   if (!rvid_create_buffer(dec->ws, &dec->bs_buffers[i], 
bs_buf_size, RADEON_DOMAIN_GTT)) {
RVID_ERR("Can't allocated bitstream buffers.\n");
goto error;
}
@@ -829,7 +829,7 @@ struct pipe_video_codec *ruvd_create_decoder(struct 
pipe_context *context,
rvid_clear_buffer(dec->ws, dec->cs, &dec->bs_buffers[i]);
}
 
-   if (!rvid_create_buffer(dec->ws, &dec->dpb, dpb_size)) {
+   if (!rvid_create_buffer(dec->ws, &dec->dpb, dpb_size, 
RADEON_DOMAIN_VRAM)) {
RVID_ERR("Can't allocated dpb.\n");
goto error;
}
diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index 3471202..455b147 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -59,9 +59,12 @@ unsigned rvid_alloc_stream_handle()
 }
 
 /* create a buffer in the winsys */
-bool rvid_create_buffer(struct radeon_winsys *ws, struct rvid_buffer *buffer, 
unsigned size)
+bool rvid_create_buffer(struct radeon_winsys *ws, struct rvid_buffer *buffer,
+   unsigned size, enum radeon_bo_domain domain)
 {
-   buffer->buf = ws->buffer_create(ws, size, 4096, false, 
RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM);
+   buffer->domain = domain;
+
+   buffer->buf = ws->buffer_create(ws, size, 4096, false, domain);
if (!buffer->buf)
return false;
 
@@ -87,7 +90,7 @@ bool rvid_resize_buffer(struct radeon_winsys *ws, struct 
radeon_winsys_cs *cs,
struct rvid_buffer old_buf = *new_buf;
void *src = NULL, *dst = NULL;
 
-   if (!rvid_create_buffer(ws, new_buf, new_size))
+   if (!rvid_create_buffer(ws, new_buf, new_size, new_buf->domain))
goto error;
 
src = ws->buffer_map(old_buf.cs_handle, cs, PIPE_TRANSFER_READ);
diff --git a/src/gallium/drivers/radeon/radeon_video.h 
b/src/gallium/drivers/radeon/radeon_video.h
index 7833ddc..55d2ca4 100644
--- a/src/gallium/drivers/radeon/radeon_video.h
+++ b/src/gallium/drivers/radeon/radeon_video.h
@@ -43,6 +43,7 @@
 /* video buffer representation */
 struct rvid_buffer
 {
+   enum radeon_bo_domain   domain;
struct pb_buffer*   buf;
struct radeon_winsys_cs_handle* cs_handle;
 };
@@ -51,7 +52,8 @@ struct rvid_buffer
 unsigned rvid_alloc_stream_handle(void);
 
 /* create a buffer in the winsys */
-bool rvid_create_buffer(struct radeon_winsys *ws, struct rvid_buffer *buffer, 
unsigned size);
+bool rvid_create_buffer(struct radeon_winsys *ws, struct rvid_buffer *buffer,
+   unsigned size, enum radeon_bo_domain domain);
 
 /* destroy a buffer */
 void rvid_destroy_buffer(struct rvid_buffer *buffer);
-- 
1.8.3.2



Re: [Mesa-dev] [PATCH 27/30] r600g: calculate a better value for array_size

2014-02-04 Thread Grigori Goronzy

On 04.02.2014 00:53, Dave Airlie wrote:

> From: Dave Airlie 
>
> attempt to calculate a better value for array size to avoid breaking apps.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/drivers/r600/r600_shader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r600/r600_shader.c 
> b/src/gallium/drivers/r600/r600_shader.c
> index 8fa7054..f0e980b 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -1416,7 +1416,7 @@ static int emit_gs_ring_writes(struct r600_shader_ctx 
> *ctx, bool ind)
>
> if (ind) {
> output.array_base = ring_offset >> 2; /* in dwords */
> -   output.array_size = 0xff
> +   output.array_size = ctx->shader->gs_max_out_vertices * 
> 4;


array_size is a 12-bit field. With gs_max_out_vertices set to 1024 the 
product is 4096, which does not fit in 12 bits, so the field overflows 
and no vertices will be written at all. I don't believe it's safe to 
assume a fixed output size per vertex either. This easily breaks GSVS 
writes when there are many vertex attributes.


Is there anything wrong with just setting array_size to the maximum, 
0xfff? streamout does the same thing.
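
The overflow can be reproduced in isolation. The following is only a 
minimal sketch: it assumes the 12-bit field keeps the low 12 bits of 
whatever is stored in it, and the names pack_array_size and 
clamp_array_size are hypothetical helpers, not part of r600g.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: array_size is a 12-bit field, so its maximum is 0xfff. */
#define ARRAY_SIZE_BITS 12u
#define ARRAY_SIZE_MASK ((1u << ARRAY_SIZE_BITS) - 1u)

/* Hypothetical helper: the value left after storing 'requested' in a
 * 12-bit field that silently drops the upper bits. */
uint32_t pack_array_size(uint32_t requested)
{
    return requested & ARRAY_SIZE_MASK;
}

/* Hypothetical helper: saturate at the field maximum instead of wrapping. */
uint32_t clamp_array_size(uint32_t requested)
{
    uint32_t clamped = requested > ARRAY_SIZE_MASK ? ARRAY_SIZE_MASK
                                                   : requested;

    assert(clamped <= ARRAY_SIZE_MASK);
    return clamped;
}
```

Under that assumption pack_array_size(1024 * 4) yields 0, while 
clamp_array_size(1024 * 4) yields 0xfff, the same maximum suggested 
above.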



> output.index_gpr = ctx->gs_export_gpr_treg;
> } else
> output.array_base = ring_offset >> 2; /* in dwords */





[Mesa-dev] [PATCH 14/14] st/omx: add workaround for bug in Bellagio

2014-02-04 Thread Christian König
From: Christian König 

Not blocking for the message thread can lead to accessing freed up memory.

Signed-off-by: Christian König 
---
 src/gallium/state_trackers/omx/entrypoint.c | 13 +
 src/gallium/state_trackers/omx/entrypoint.h |  2 ++
 src/gallium/state_trackers/omx/vid_dec.c|  3 +--
 src/gallium/state_trackers/omx/vid_enc.c|  3 +--
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/omx/entrypoint.c 
b/src/gallium/state_trackers/omx/entrypoint.c
index c67b8c9..89aae41 100644
--- a/src/gallium/state_trackers/omx/entrypoint.c
+++ b/src/gallium/state_trackers/omx/entrypoint.c
@@ -103,3 +103,16 @@ void omx_put_screen(void)
}
pipe_mutex_unlock(omx_lock);
 }
+
+OMX_ERRORTYPE omx_workaround_Destructor(OMX_COMPONENTTYPE *comp)
+{
+   omx_base_component_PrivateType* priv = 
(omx_base_component_PrivateType*)comp->pComponentPrivate;
+
+   priv->state = OMX_StateInvalid;
+   tsem_up(priv->messageSem);
+
+   /* wait for thread to exit */;
+   pthread_join(priv->messageHandlerThread, NULL);
+
+   return omx_base_component_Destructor(comp);
+}
diff --git a/src/gallium/state_trackers/omx/entrypoint.h 
b/src/gallium/state_trackers/omx/entrypoint.h
index 41454be..af7c337 100644
--- a/src/gallium/state_trackers/omx/entrypoint.h
+++ b/src/gallium/state_trackers/omx/entrypoint.h
@@ -43,4 +43,6 @@ extern int omx_component_library_Setup(stLoaderComponentType 
**stComponents);
 struct vl_screen *omx_get_screen(void);
 void omx_put_screen(void);
 
+OMX_ERRORTYPE omx_workaround_Destructor(OMX_COMPONENTTYPE *comp);
+
 #endif
diff --git a/src/gallium/state_trackers/omx/vid_dec.c 
b/src/gallium/state_trackers/omx/vid_dec.c
index b8b519e..e2a2891 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -247,8 +247,7 @@ static OMX_ERRORTYPE vid_dec_Destructor(OMX_COMPONENTTYPE 
*comp)
if (priv->screen)
   omx_put_screen();
 
-   omx_base_filter_Destructor(comp);
-   return OMX_ErrorNone;
+   return omx_workaround_Destructor(comp);
 }
 
 static OMX_ERRORTYPE vid_dec_SetParameter(OMX_HANDLETYPE handle, OMX_INDEXTYPE 
idx, OMX_PTR param)
diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 3833f24..c1d8795 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -268,8 +268,7 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE 
*comp)
if (priv->screen)
   omx_put_screen();
 
-   omx_base_filter_Destructor(comp);
-   return OMX_ErrorNone;
+   return omx_workaround_Destructor(comp);
 }
 
 static OMX_ERRORTYPE vid_enc_SetParameter(OMX_HANDLETYPE handle, OMX_INDEXTYPE 
idx, OMX_PTR param)
-- 
1.8.3.2
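
The ordering the workaround relies on (mark the component invalid, wake 
the message thread, join it, and only then free the memory) can be 
sketched outside of Bellagio. This is a simplified illustration using 
plain POSIX threads and semaphores; struct demo_component, 
demo_create and demo_destroy are hypothetical names and do not reflect 
Bellagio's actual types or its tsem API.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical stand-in for the component's private data. */
struct demo_component {
    int invalid;        /* set to 1 when the component is being destroyed */
    int handled;        /* number of wake-ups the thread has processed */
    sem_t message_sem;  /* wakes the message thread */
    pthread_t thread;   /* the message handler thread */
};

/* Message thread: sleeps on the semaphore and exits once it sees the
 * component marked invalid. It touches *c until it returns. */
static void *demo_message_thread(void *arg)
{
    struct demo_component *c = arg;

    for (;;) {
        sem_wait(&c->message_sem);
        c->handled++;
        if (c->invalid)
            break;
    }
    return NULL;
}

struct demo_component *demo_create(void)
{
    struct demo_component *c = calloc(1, sizeof(*c));

    if (!c)
        return NULL;
    if (sem_init(&c->message_sem, 0, 0) != 0) {
        free(c);
        return NULL;
    }
    if (pthread_create(&c->thread, NULL, demo_message_thread, c) != 0) {
        sem_destroy(&c->message_sem);
        free(c);
        return NULL;
    }
    return c;
}

/* Returns how many wake-ups the thread handled before exiting. */
int demo_destroy(struct demo_component *c)
{
    int handled;

    assert(c != NULL);

    c->invalid = 1;                 /* written before the post below */
    sem_post(&c->message_sem);      /* wake the thread so it can exit */
    pthread_join(c->thread, NULL);  /* wait for thread to exit */

    /* Only now is it safe to release memory the thread was using. */
    handled = c->handled;
    sem_destroy(&c->message_sem);
    free(c);
    return handled;
}
```

Without the pthread_join call, free() could run while the thread is 
still reading the structure, which is the kind of use-after-free the 
commit message describes.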



[Mesa-dev] [PATCH 09/14] radeon: just don't map VRAM buffers at all

2014-02-04 Thread Christian König
From: Christian König 

Signed-off-by: Christian König 
---
 src/gallium/drivers/radeon/r600_texture.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 878b26f..eb1e191 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -911,8 +911,8 @@ static void *r600_texture_transfer_map(struct pipe_context 
*ctx,
if (rtex->surface.level[level].mode >= RADEON_SURF_MODE_1D)
use_staging_texture = TRUE;
 
-   /* Untiled buffers in VRAM, which is slow for CPU reads */
-   if ((usage & PIPE_TRANSFER_READ) && !(usage & 
PIPE_TRANSFER_MAP_DIRECTLY) &&
+   /* Untiled buffers in VRAM, which is slow for CPU reads and writes */
+   if (!(usage & PIPE_TRANSFER_MAP_DIRECTLY) &&
(rtex->resource.domains == RADEON_DOMAIN_VRAM)) {
use_staging_texture = TRUE;
}
-- 
1.8.3.2



[Mesa-dev] [PATCH 01/14] radeon: update legal notes on UVD

2014-02-04 Thread Christian König
From: Christian König 

Signed-off-by: Christian König 
---
 docs/README.UVD | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/docs/README.UVD b/docs/README.UVD
index 36b467e..38ea864 100644
--- a/docs/README.UVD
+++ b/docs/README.UVD
@@ -11,3 +11,34 @@ INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED 
WITHOUT A LICENSE
 UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSES IS
 AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers Green Circle, Suite 400E,
 Greenwood Village, Colorado 80111 U.S.A.
+
+WARRANTY DISCLAIMER: THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
+KIND.  AMD DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
+BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
+PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, THAT THE SOFTWARE WILL RUN
+UNINTERRUPTED OR ERROR-FREE OR WARRANTIES ARISING FROM CUSTOM OF TRADE OR
+COURSE OF USAGE.  THE ENTIRE RISK ASSOCIATED WITH THE USE OF THE SOFTWARE IS
+ASSUMED BY YOU.  Some jurisdictions do not allow the exclusion of implied
+warranties, so the above exclusion may not apply to You.
+
+LIMITATION OF LIABILITY AND INDEMNIFICATION:  AMD AND ITS LICENSORS WILL NOT,
+UNDER ANY CIRCUMSTANCES BE LIABLE FOR ANY PUNITIVE, DIRECT, INCIDENTAL,
+INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE SOFTWARE OR
+THIS AGREEMENT EVEN IF AMD AND ITS LICENSORS HAVE BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.  In no event shall AMD's total liability to You
+for all damages, losses, and causes of action (whether in contract, tort
+(including negligence) or otherwise) exceed the amount of $100 USD.  You agree
+to defend, indemnify and hold harmless AMD and its licensors, and any of their
+directors, officers, employees, affiliates or agents from and against any and
+all loss, damage, liability and other expenses (including reasonable
+attorneys' fees), resulting from Your use of the Software or violation of the
+terms and conditions of this Agreement.
+
+U.S. GOVERNMENT RESTRICTED RIGHTS: The Software is provided with "RESTRICTED
+RIGHTS." Use, duplication, or disclosure by the Government is subject to the
+restrictions as set forth in FAR 52.227-14 and DFAR252.227-7013, et seq., or
+its successor.  Use of the Software by the Government constitutes
+acknowledgement of AMD's proprietary rights in them.
+
+EXPORT RESTRICTIONS: The Software may be subject to export restrictions as
+stated in the Software License Agreement.
-- 
1.8.3.2



[Mesa-dev] [PATCH 10/14] vl: add H264 encoding interface

2014-02-04 Thread Christian König
From: Christian König 

Signed-off-by: Christian König 
Signed-off-by: Leo Liu 
---
 src/gallium/auxiliary/vl/vl_decoder.c |  2 +-
 src/gallium/drivers/radeon/radeon_video.c |  5 ++--
 src/gallium/include/pipe/p_video_codec.h  | 13 +
 src/gallium/include/pipe/p_video_enums.h  |  4 +--
 src/gallium/include/pipe/p_video_state.h  | 45 +++
 5 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_decoder.c 
b/src/gallium/auxiliary/vl/vl_decoder.c
index fc01067..97e549d 100644
--- a/src/gallium/auxiliary/vl/vl_decoder.c
+++ b/src/gallium/auxiliary/vl/vl_decoder.c
@@ -39,7 +39,7 @@ vl_profile_supported(struct pipe_screen *screen, enum 
pipe_video_profile profile
assert(screen);
switch (u_reduce_video_profile(profile)) {
   case PIPE_VIDEO_FORMAT_MPEG12:
- return true;
+ return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
   default:
  return false;
}
diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index 455b147..173fd68 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -233,10 +233,11 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_FORMAT_MPEG12:
case PIPE_VIDEO_FORMAT_MPEG4:
case PIPE_VIDEO_FORMAT_MPEG4_AVC:
-   return true;
+   return entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
case PIPE_VIDEO_FORMAT_VC1:
/* FIXME: VC-1 simple/main profile is broken */
-   return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED;
+   return profile == PIPE_VIDEO_PROFILE_VC1_ADVANCED &&
+  entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE;
default:
return false;
}
diff --git a/src/gallium/include/pipe/p_video_codec.h 
b/src/gallium/include/pipe/p_video_codec.h
index 0e3827a..d4cdacb 100644
--- a/src/gallium/include/pipe/p_video_codec.h
+++ b/src/gallium/include/pipe/p_video_codec.h
@@ -87,6 +87,14 @@ struct pipe_video_codec
 const unsigned *sizes);
 
/**
+* encode to a bitstream
+*/
+   void (*encode_bitstream)(struct pipe_video_codec *codec,
+struct pipe_video_buffer *source,
+struct pipe_resource *destination,
+void **feedback);
+
+   /**
 * end decoding of the current frame
 */
void (*end_frame)(struct pipe_video_codec *codec,
@@ -98,6 +106,11 @@ struct pipe_video_codec
 * should be called before a video_buffer is acessed by the state tracker 
again
 */
void (*flush)(struct pipe_video_codec *codec);
+
+   /**
+* get encoder feedback
+*/
+   void (*get_feedback)(struct pipe_video_codec *codec, void *feedback, 
unsigned *size);
 };
 
 /**
diff --git a/src/gallium/include/pipe/p_video_enums.h 
b/src/gallium/include/pipe/p_video_enums.h
index 7ec29c0..10205ac 100644
--- a/src/gallium/include/pipe/p_video_enums.h
+++ b/src/gallium/include/pipe/p_video_enums.h
@@ -72,8 +72,8 @@ enum pipe_video_entrypoint
PIPE_VIDEO_ENTRYPOINT_UNKNOWN,
PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
PIPE_VIDEO_ENTRYPOINT_IDCT,
-   PIPE_VIDEO_ENTRYPOINT_MC
+   PIPE_VIDEO_ENTRYPOINT_MC,
+   PIPE_VIDEO_ENTRYPOINT_ENCODE
 };
 
-
 #endif /* PIPE_VIDEO_ENUMS_H */
diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index 79e588f..f9721dc 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -110,6 +110,24 @@ enum pipe_h264_slice_type
PIPE_H264_SLICE_TYPE_SI = 0x4
 };
 
+enum pipe_h264_enc_picture_type
+{
+   PIPE_H264_ENC_PICTURE_TYPE_P = 0x00,
+   PIPE_H264_ENC_PICTURE_TYPE_B = 0x01,
+   PIPE_H264_ENC_PICTURE_TYPE_I = 0x02,
+   PIPE_H264_ENC_PICTURE_TYPE_IDR = 0x03,
+   PIPE_H264_ENC_PICTURE_TYPE_SKIP = 0x04
+};
+
+enum pipe_h264_enc_rate_control_method
+{
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_DISABLE = 0x00,
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT_SKIP = 0x01,
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_VARIABLE_SKIP = 0x02,
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT = 0x03,
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_VARIABLE = 0x04
+};
+
 struct pipe_picture_desc
 {
enum pipe_video_profile profile;
@@ -325,6 +343,33 @@ struct pipe_h264_picture_desc
struct pipe_video_buffer *ref[16];
 };
 
+struct pipe_h264_enc_rate_control
+{
+   enum pipe_h264_enc_rate_control_method rate_ctrl_method;
+   unsigned target_bitrate;
+   unsigned peak_bitrate;
+   unsigned frame_rate_num;
+   unsigned frame_rate_den;
+   unsigned vbv_buffer_size;
+   unsigned target_bits_picture;
+   unsigned peak_bits_picture_integer;
+   unsigned peak_bits_picture_fraction;
+};
+
+struct pipe_h264_enc_picture_desc
+{
+   struct pipe_picture_desc base;

Re: [Mesa-dev] [PATCH] R600/SI: Add pattern for zero-extending i1 to i32

2014-02-04 Thread Tom Stellard
On Tue, Feb 04, 2014 at 12:56:39PM +0900, Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> Fixes opencl-example if_* tests with radeonsi.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469
> Signed-off-by: Michel Dänzer 
Reviewed-by: Tom Stellard 
> ---
>  lib/Target/R600/SIInstructions.td | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/Target/R600/SIInstructions.td 
> b/lib/Target/R600/SIInstructions.td
> index 7e37821..59fe2ae 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -1827,6 +1827,11 @@ def : Pat <
>(V_CNDMASK_B32_e64 (i32 0), (i32 -1), $src0)
>  >;
>  
> +def : Pat <
> +  (i32 (zext i1:$src0)),
> +  (V_CNDMASK_B32_e64 (i32 0), (i32 1), $src0)
> +>;
> +
>  // 1. Offset as 8bit DWORD immediate
>  def : Pat <
>(SIload_constant i128:$sbase, IMM8bitDWORD:$offset),
> -- 
> 1.9.rc1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/14] radeon/vce: initial VCE support v6

2014-02-04 Thread Christian König
From: Christian König 

v2 (chk): revert feedback buffer hack
v3 (slava): fixed bitstream size calculation
v4 (chk): always create buffers in the right domain
v5 (chk): flush async
v6 (chk): rework fw interface add version check

Signed-off-by: Christian König 
Signed-off-by: Leo Liu 
Signed-off-by: Slava Grigorev 
---
 src/gallium/drivers/r600/r600_uvd.c|  20 ++
 src/gallium/drivers/radeon/Makefile.sources|   4 +-
 src/gallium/drivers/radeon/radeon_vce.c| 303 +
 src/gallium/drivers/radeon/radeon_vce.h| 106 
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 348 +
 src/gallium/drivers/radeon/radeon_video.c  |   3 +-
 src/gallium/drivers/radeonsi/si_uvd.c  |  20 ++
 7 files changed, 802 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/drivers/radeon/radeon_vce.c
 create mode 100644 src/gallium/drivers/radeon/radeon_vce.h
 create mode 100644 src/gallium/drivers/radeon/radeon_vce_40_2_2.c

diff --git a/src/gallium/drivers/r600/r600_uvd.c 
b/src/gallium/drivers/r600/r600_uvd.c
index c3da7f8..700a0b3 100644
--- a/src/gallium/drivers/r600/r600_uvd.c
+++ b/src/gallium/drivers/r600/r600_uvd.c
@@ -47,6 +47,7 @@
 #include "r600_pipe.h"
 #include "radeon/radeon_video.h"
 #include "radeon/radeon_uvd.h"
+#include "radeon/radeon_vce.h"
 #include "r600d.h"
 
 /**
@@ -164,9 +165,28 @@ static struct radeon_winsys_cs_handle* 
r600_uvd_set_dtb(struct ruvd_msg *msg, st
return luma->resource.cs_buf;
 }
 
+/* get the radeon resources for VCE */
+static void r600_vce_get_buffer(struct pipe_resource *resource,
+   struct radeon_winsys_cs_handle **handle,
+   struct radeon_surface **surface)
+{
+   struct r600_texture *res = (struct r600_texture *)resource;
+
+   if (handle)
+   *handle = res->resource.cs_buf;
+
+   if (surface)
+   *surface = &res->surface;
+}
+
 /* create decoder */
 struct pipe_video_codec *r600_uvd_create_decoder(struct pipe_context *context,
   const struct 
pipe_video_codec *templat)
 {
+   struct r600_context *ctx = (struct r600_context *)context;
+
+   if (templat->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+   return rvce_create_encoder(context, templat, ctx->b.ws, 
r600_vce_get_buffer);
+
return ruvd_create_decoder(context, templat, r600_uvd_set_dtb);
 }
diff --git a/src/gallium/drivers/radeon/Makefile.sources 
b/src/gallium/drivers/radeon/Makefile.sources
index e0ccab9..bbfb8ad 100644
--- a/src/gallium/drivers/radeon/Makefile.sources
+++ b/src/gallium/drivers/radeon/Makefile.sources
@@ -5,7 +5,9 @@ C_SOURCES := \
r600_streamout.c \
 r600_texture.c \
radeon_video.c \
-   radeon_uvd.c
+   radeon_uvd.c \
+   radeon_vce.c \
+   radeon_vce_40_2_2.c
 
 LLVM_C_FILES := \
radeon_setup_tgsi_llvm.c \
diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
new file mode 100644
index 000..1642750
--- /dev/null
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -0,0 +1,303 @@
+/**
+ *
+ * Copyright 2013 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+/*
+ * Authors:
+ *  Christian König 
+ *
+ */
+
+#include 
+
+#include "pipe/p_video_codec.h"
+
+#include "util/u_video.h"
+#include "util/u_memory.h"
+
+#include "vl/vl_video_buffer.h"
+
+#include "../../winsys/radeon/drm/radeon_winsys.h"
+#include "r600_pipe_common.h"
+#include "radeon_video.h"
+#include "radeon_vce.h"
+
+#define CPB_SIZE (40 * 1024 * 1024)
+
+static void flush(struct rvce_encoder *enc)
+{
+   enc->

[Mesa-dev] [PATCH 11/14] radeon/winsys: add VCE support v3

2014-02-04 Thread Christian König
From: Christian König 

v2: add fw version query
v3: add README.VCE

Signed-off-by: Christian König 
---
 docs/README.VCE   | 43 +++
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 10 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 20 ++-
 src/gallium/winsys/radeon/drm/radeon_winsys.h |  2 ++
 4 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 docs/README.VCE

diff --git a/docs/README.VCE b/docs/README.VCE
new file mode 100644
index 000..d4b4cc5
--- /dev/null
+++ b/docs/README.VCE
@@ -0,0 +1,43 @@
+The software may implement third party technologies (e.g. third party
+libraries) that are not licensed to you by AMD and for which you may need
+to obtain licenses from other parties.  Unless explicitly stated otherwise,
+these third party technologies are not licensed hereunder.  Such third
+party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
+AVC, and VC-1.  
+
+For MPEG-2 Intermediate Products: ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
+THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD IS EXPRESSLY
+PROHIBITED WITHOUT A LICENSE UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT
+PORTFOLIO, WHICH LICENSES IS AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers
+Green Circle, Suite 400E, Greenwood Village, Colorado 80111 U.S.A.
+
+WARRANTY DISCLAIMER: THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
+KIND.  AMD DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
+BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
+PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, THAT THE SOFTWARE WILL RUN
+UNINTERRUPTED OR ERROR-FREE OR WARRANTIES ARISING FROM CUSTOM OF TRADE OR
+COURSE OF USAGE.  THE ENTIRE RISK ASSOCIATED WITH THE USE OF THE SOFTWARE IS
+ASSUMED BY YOU.  Some jurisdictions do not allow the exclusion of implied
+warranties, so the above exclusion may not apply to You.
+
+LIMITATION OF LIABILITY AND INDEMNIFICATION:  AMD AND ITS LICENSORS WILL NOT,
+UNDER ANY CIRCUMSTANCES BE LIABLE FOR ANY PUNITIVE, DIRECT, INCIDENTAL,
+INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE SOFTWARE OR
+THIS AGREEMENT EVEN IF AMD AND ITS LICENSORS HAVE BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.  In no event shall AMD's total liability to You
+for all damages, losses, and causes of action (whether in contract, tort
+(including negligence) or otherwise) exceed the amount of $100 USD.  You agree
+to defend, indemnify and hold harmless AMD and its licensors, and any of their
+directors, officers, employees, affiliates or agents from and against any and
+all loss, damage, liability and other expenses (including reasonable
+attorneys' fees), resulting from Your use of the Software or violation of the
+terms and conditions of this Agreement.
+
+U.S. GOVERNMENT RESTRICTED RIGHTS: The Software is provided with "RESTRICTED
+RIGHTS." Use, duplication, or disclosure by the Government is subject to the
+restrictions as set forth in FAR 52.227-14 and DFAR252.227-7013, et seq., or
+its successor.  Use of the Software by the Government constitutes
+acknowledgement of AMD's proprietary rights in them.
+
+EXPORT RESTRICTIONS: The Software may be subject to export restrictions as
+stated in the Software License Agreement.
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
index d8ad297..ccba0c0 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c
@@ -99,6 +99,10 @@
 #define RADEON_CS_RING_UVD  3
 #endif
 
+#ifndef RADEON_CS_RING_VCE
+#define RADEON_CS_RING_VCE  4
+#endif
+
 #ifndef RADEON_CS_END_OF_FRAME
 #define RADEON_CS_END_OF_FRAME  0x04
 #endif
@@ -538,6 +542,12 @@ static void radeon_drm_cs_flush(struct radeon_winsys_cs 
*rcs, unsigned flags, ui
 cs->cst->cs.num_chunks = 3;
 break;
 
+case RING_VCE:
+cs->cst->flags[0] = 0;
+cs->cst->flags[1] = RADEON_CS_RING_VCE;
+cs->cst->cs.num_chunks = 3;
+break;
+
 default:
 case RING_GFX:
 cs->cst->flags[0] = 0;
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 427ee7d..d57b296 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -97,10 +97,18 @@
 #define RADEON_INFO_RING_WORKING 0x15
 #endif
 
+#ifndef RADEON_INFO_VCE_FW_VERSION
+#define RADEON_INFO_VCE_FW_VERSION 0x1b
+#endif
+
 #ifndef RADEON_CS_RING_UVD
 #define RADEON_CS_RING_UVD 3
 #endif
 
+#ifndef RADEON_CS_RING_VCE
+#define RADEON_CS_RING_VCE 4
+#endif
+
 static struct util_hash_table *fd_tab = NULL;
 
 /* Enable/disable feature access for one command stream.
@@ -341,13 +349,23 @@ static boolean do_winsys_init(struct radeon_drm_winsys 
*ws)
 ws->info.r600_has_dma = T
