[Mesa-dev] [PATCH] r600g: Don't leak bytecode on shader compile failure

2014-04-09 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74868

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/r600/r600_shader.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index ddf79ee..b198359 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -155,7 +155,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 	r = r600_shader_from_tgsi(rctx, shader, key);
 	if (r) {
 		R600_ERR("translation from TGSI failed !\n");
-		return r;
+		goto error;
 	}
 
 	/* disable SB for geom shaders - it can't handle the CF_EMIT instructions */
@@ -169,7 +169,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 		r = r600_bytecode_build(&shader->shader.bc);
 		if (r) {
 			R600_ERR("building bytecode failed !\n");
-			return r;
+			goto error;
 		}
 	}
 
@@ -182,7 +182,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 		                             dump, use_sb);
 		if (r) {
 			R600_ERR("r600_sb_bytecode_process failed !\n");
-			return r;
+			goto error;
 		}
 	}
 
@@ -192,16 +192,16 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 			r = r600_sb_bytecode_process(rctx, &shader->gs_copy_shader->shader.bc,
 			                             &shader->gs_copy_shader->shader, dump, 0);
 			if (r)
-				return r;
+				goto error;
 		}
 
 		if ((r = store_shader(ctx, shader->gs_copy_shader)))
-			return r;
+			goto error;
 	}
 
 	/* Store the shader in a buffer. */
 	if ((r = store_shader(ctx, shader)))
-		return r;
+		goto error;
 
 	/* Build state. */
 	switch (shader->shader.processor_type) {
@@ -235,9 +235,13 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 		}
 		break;
 	default:
-		return -EINVAL;
+		goto error;
 	}
 	return 0;
+
+error:
+	r600_pipe_shader_destroy(ctx, shader);
+	return r;
 }
 
 void r600_pipe_shader_destroy(struct pipe_context *ctx, struct r600_pipe_shader *shader)
-- 
1.9.0
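
For readers unfamiliar with the idiom, here is a minimal sketch of the single-exit cleanup pattern the patch switches to. Everything in it (demo_shader, demo_shader_create, demo_live_buffers, the fail_at parameter) is hypothetical and only stands in for the r600 code; it is not the driver's implementation. One detail worth keeping in mind with this pattern: the label returns r, so any path that jumps there while r is still 0 (for example from a switch default case reached after an earlier step succeeded) would need to assign an error code first.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shader that owns a bytecode buffer. */
struct demo_shader {
	unsigned char *bytecode;
};

/* Number of bytecode buffers currently allocated; lets a caller check for leaks. */
static int demo_live_buffers;

static void demo_shader_destroy(struct demo_shader *s)
{
	if (s->bytecode) {
		free(s->bytecode);
		s->bytecode = NULL;
		demo_live_buffers--;
	}
}

/* fail_at picks the step that fails: 0 = none, 1..3 = that step. */
static int demo_shader_create(struct demo_shader *s, int fail_at)
{
	int r;

	assert(s != NULL);
	s->bytecode = malloc(64);
	if (!s->bytecode)
		return -ENOMEM;	/* nothing to clean up yet */
	demo_live_buffers++;

	r = (fail_at == 1) ? -EIO : 0;	/* stands in for the translation step */
	if (r)
		goto error;

	r = (fail_at == 2) ? -EIO : 0;	/* stands in for storing the shader */
	if (r)
		goto error;

	if (fail_at == 3) {
		/* r is 0 here, so the code is assigned before the jump. */
		r = -EINVAL;
		goto error;
	}
	return 0;

error:
	/* Single exit for all failures: release what was built so far. */
	demo_shader_destroy(s);
	return r;
}
```

Calling demo_shader_create() with any non-zero fail_at leaves demo_live_buffers unchanged, which is the property the patch is after for the bytecode.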

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/5] util: Rework endian handling in python code

2014-04-09 Thread Michel Dänzer

José, any chance you could work with Richard to get this in?


On Mon, 2014-04-07 at 11:35 +0100, Richard Sandiford wrote:
> Ping.
>
> Richard Sandiford rsand...@linux.vnet.ibm.com writes:
> > Ping (with fixed subject)
> >
> > Richard Sandiford rsand...@linux.vnet.ibm.com writes:
> >> This is a refresh of:
> >>
> >>   http://lists.freedesktop.org/archives/mesa-dev/2013-June/040594.html
> >>
> >> At the moment the python code uses sys.byteorder to decide whether
> >> u_format_table.c should be for big or little endian.  With this series
> >> it instead generates both forms, using blocks like:
> >>
> >>   #ifdef PIPE_ARCH_BIG_ENDIAN
> >>   ...
> >>   #else
> >>   ...
> >>   #endif
> >>
> >> in cases where endianness matters.
> >>
> >> Doing it this way is more cross-compiler-friendly.  It also means people
> >> working on LE systems can see what the differences would be for BE.
> >>
> >> Tested on x86_64 and z.  I don't have commit access so please apply if OK.
> >>
> >> Thanks,
> >> Richard
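
To illustrate the "generate both forms" idea from the quoted cover letter, here is a small sketch of code that carries both byte orders and lets the preprocessor pick one at build time. It is not taken from u_format_table.c; DEMO_ARCH_BIG_ENDIAN, demo_pack_rgba8 and demo_byte_in_memory are hypothetical names, and the detection through the GCC/Clang __BYTE_ORDER__ macros is an assumption standing in for PIPE_ARCH_BIG_ENDIAN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang byte-order macros stand in for PIPE_ARCH_BIG_ENDIAN. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define DEMO_ARCH_BIG_ENDIAN 1
#endif

/* Pack four channels so that they land in memory in R, G, B, A order.
 * Both variants live in the source; the target decides which one is built. */
static uint32_t demo_pack_rgba8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
#ifdef DEMO_ARCH_BIG_ENDIAN
	return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
	       ((uint32_t)b << 8) | (uint32_t)a;
#else
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)g << 8) | (uint32_t)r;
#endif
}

/* Copy the packed word to memory and report the byte at the given position. */
static uint8_t demo_byte_in_memory(uint32_t word, unsigned pos)
{
	uint8_t bytes[4];

	assert(pos < 4);
	memcpy(bytes, &word, sizeof(bytes));
	return bytes[pos];
}
```

Because the choice is made by the compiler for the target rather than by the machine running the generator, the same generated file works when cross-compiling, and the unused branch stays visible to people working on the other byte order.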
 




-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer



[Mesa-dev] [Bug 77208] VdpPresentationQueueGetTime does not return a monotonic time

2014-04-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77208

Christian König deathsim...@vodafone.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #2 from Christian König deathsim...@vodafone.de ---
That's a known issue; it's a design problem of DRI2.

Essentially there is no global time you could return from
VdpPresentationQueueGetTime. Instead, what you return is always an
estimate based on the vsync counter, and that is unfortunately per output
instead of global.

In addition to that, I never bothered adding the difference between the last
flip and the current time to the result of VdpPresentationQueueGetTime. So when
you don't have a page flip, the result of VdpPresentationQueueGetTime will
probably stand still.

*** This bug has been marked as a duplicate of bug 66384 ***



Re: [Mesa-dev] [PATCH] docs: Expand ARB_gpu_shader5 to describe status of individual features

2014-04-09 Thread Chris Forbes
On Wed, Apr 9, 2014 at 7:23 AM, Ian Romanick i...@freedesktop.org wrote:

> I believe UBO array indices are also dynamically uniform.

I was surprised to find this when building the list too, but I believe
it's unrestricted.

The GLSL 4.0 spec, section 4.3.7 (bottom of p43), says:

    "Any integral expression can be used to index a uniform block array."

The corresponding language for arrays of samplers is:

    "...a dynamically uniform integral expression, otherwise results
    are undefined."


Re: [Mesa-dev] [PATCH 7/7] winsys/radeon: fix a race condition in initialization of radeon_winsys::screen

2014-04-09 Thread Christian König

On 09.04.2014 05:44, Michel Dänzer wrote:
> On Wed, 2014-04-09 at 02:15 +0200, Marek Olšák wrote:
>> From: Marek Olšák marek.ol...@amd.com
>>
>> Create the screen in the winsys while the mutex is locked.
>> This also results in a nice code cleanup!
>
> [...]
>
>> diff --git a/src/gallium/targets/egl-static/egl_pipe.c b/src/gallium/targets/egl-static/egl_pipe.c
>> index eb1cff9..ce734fb 100644
>> --- a/src/gallium/targets/egl-static/egl_pipe.c
>> +++ b/src/gallium/targets/egl-static/egl_pipe.c
>> @@ -119,19 +119,9 @@ pipe_r300_create_screen(int fd)
>>   {
>>   #if _EGL_PIPE_R300
>>      struct radeon_winsys *sws;
>> -   struct pipe_screen *screen;
>> -
>> -   sws = radeon_drm_winsys_create(fd);
>> -   if (!sws)
>> -      return NULL;
>> -
>> -   screen = r300_screen_create(sws);
>> -   if (!screen)
>> -      return NULL;
>>
>> -   screen = debug_screen_wrap(screen);
>> -
>> -   return screen;
>> +   sws = radeon_drm_winsys_create(fd, r300_screen_create);
>> +   return sws ? debug_screen_wrap(sws->screen) : NULL;
>
> I think it would be clearer to keep this as:
>
>    sws = radeon_drm_winsys_create(fd, r300_screen_create);
>    if (!sws)
>       return NULL;
>
>    return debug_screen_wrap(sws->screen);
>
>
> Either way though, the series is
>
> Reviewed-by: Michel Dänzer michel.daen...@amd.com


I actually like the shorter form, but anyway, thanks a lot for this; it
removes quite a bunch of TODOs from my list. I didn't think that creating
the screen from the winsys was OK with the design, but if you are fine
with it I'm not about to complain.


For the series:

Reviewed-by: Christian König christian.koe...@amd.com
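
For anyone following along, the shape of the change under review is roughly the following: the winsys constructor takes a screen-creation callback and invokes it while still holding the lookup mutex, so a second caller can never observe a winsys whose screen is not set yet. The sketch below is an illustration under that reading only; demo_winsys, demo_screen, demo_winsys_create and the fixed-size table are hypothetical and much simpler than the real radeon winsys.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct demo_winsys;
struct demo_screen { int id; };
typedef struct demo_screen *(*demo_screen_create_t)(struct demo_winsys *ws);

struct demo_winsys {
	int fd;
	struct demo_screen *screen;
};

#define DEMO_MAX_WINSYS 8
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_winsys *demo_table[DEMO_MAX_WINSYS]; /* one winsys per fd */
static int demo_screens_created;

/* Hypothetical driver callback; counts how often a screen is really created. */
static struct demo_screen *demo_screen_create(struct demo_winsys *ws)
{
	struct demo_screen *s = calloc(1, sizeof(*s));

	(void)ws;
	if (s)
		s->id = ++demo_screens_created;
	return s;
}

static struct demo_winsys *demo_winsys_create(int fd, demo_screen_create_t create)
{
	struct demo_winsys *ws = NULL;
	int i, free_slot = -1;

	assert(create != NULL);
	pthread_mutex_lock(&demo_mutex);

	/* Reuse an existing winsys for this fd; its screen is already set. */
	for (i = 0; i < DEMO_MAX_WINSYS; i++) {
		if (demo_table[i] && demo_table[i]->fd == fd) {
			ws = demo_table[i];
			goto out;
		}
		if (!demo_table[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		goto out;

	ws = calloc(1, sizeof(*ws));
	if (!ws)
		goto out;
	ws->fd = fd;

	/* The screen is created before the lock is dropped and before the
	 * winsys becomes visible in the table. */
	ws->screen = create(ws);
	if (!ws->screen) {
		free(ws);
		ws = NULL;
		goto out;
	}
	demo_table[free_slot] = ws;
out:
	pthread_mutex_unlock(&demo_mutex);
	return ws;
}
```

With this shape the caller reduces to the two lines discussed above: create the winsys, then wrap ws->screen if the result is non-NULL.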


[Mesa-dev] [PATCH 4/7] radeon/vce: implement B-frame support

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Slava Grigorev slava.grigo...@amd.com
Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/drivers/radeon/radeon_vce.h|  2 +-
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 73 ++
 2 files changed, 53 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.h 
b/src/gallium/drivers/radeon/radeon_vce.h
index f815cad..7cc87be 100644
--- a/src/gallium/drivers/radeon/radeon_vce.h
+++ b/src/gallium/drivers/radeon/radeon_vce.h
@@ -45,7 +45,7 @@
 #define RVCE_READWRITE(buf, domain) RVCE_CS(RVCE_RELOC(buf, 
RADEON_USAGE_READWRITE, domain) * 4)
 #define RVCE_END() *begin = (enc-cs-buf[enc-cs-cdw] - begin) * 4; }
 
-#define RVCE_NUM_CPB_FRAMES 2
+#define RVCE_NUM_CPB_FRAMES 3
 
 struct r600_common_screen;
 
diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c 
b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index 1327d64..3b67b31 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -54,6 +54,11 @@ static struct rvce_cpb_slot *l0_slot(struct rvce_encoder 
*enc)
return LIST_ENTRY(struct rvce_cpb_slot, enc-cpb_slots.next, list);
 }
 
+static struct rvce_cpb_slot *l1_slot(struct rvce_encoder *enc)
+{
+   return LIST_ENTRY(struct rvce_cpb_slot, enc-cpb_slots.next-next, 
list);
+}
+
 static void frame_offset(struct rvce_encoder *enc, struct rvce_cpb_slot *slot,
 unsigned *luma_offset, unsigned *chroma_offset)
 {
@@ -99,8 +104,8 @@ static void create(struct rvce_encoder *enc)
 
RVCE_BEGIN(0x0101); // create cmd
RVCE_CS(0x); // encUseCircularBuffer
-   RVCE_CS(0x0041); // encProfile
-   RVCE_CS(0x000a); // encLevel
+   RVCE_CS(0x004d); // encProfile: Main
+   RVCE_CS(0x002a); // encLevel: 4.2
RVCE_CS(0x); // encPicStructRestriction
RVCE_CS(enc-base.width); // encImageWidth
RVCE_CS(enc-base.height); // encImageHeight
@@ -175,12 +180,12 @@ static void pic_control(struct rvce_encoder *enc)
RVCE_CS(0x); // encSPSID
RVCE_CS(0x); // encPPSID
RVCE_CS(0x0040); // encConstraintSetFlags
-   RVCE_CS(0x); // encBPicPattern
+   RVCE_CS(MAX2(enc-base.max_references, 1) - 1); // encBPicPattern
RVCE_CS(0x); // weightPredModeBPicture
RVCE_CS(MIN2(enc-base.max_references, 2)); // 
encNumberOfReferenceFrames
RVCE_CS(enc-base.max_references + 1); // encMaxNumRefFrames
-   RVCE_CS(0x); // encNumDefaultActiveRefL0
-   RVCE_CS(0x); // encNumDefaultActiveRefL1
+   RVCE_CS(0x0001); // encNumDefaultActiveRefL0
+   RVCE_CS(0x0001); // encNumDefaultActiveRefL1
RVCE_CS(0x); // encSliceMode
RVCE_CS(0x); // encMaxSliceSize
RVCE_END();
@@ -275,7 +280,7 @@ static void encode(struct rvce_encoder *enc)
RVCE_CS(0x); // encInputPic(Addr|Array)Mode
RVCE_CS(0x); // encInputPicTileConfig
RVCE_CS(enc-pic.picture_type); // encPicType
-   RVCE_CS(enc-pic.picture_type == 3); // encIdrFlag
+   RVCE_CS(enc-pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR); // 
encIdrFlag
RVCE_CS(0x); // encIdrPicId
RVCE_CS(0x); // encMGSKeyPic
RVCE_CS(0x0001); // encReferenceFlag
@@ -283,7 +288,17 @@ static void encode(struct rvce_encoder *enc)
RVCE_CS(0x); // num_ref_idx_active_override_flag
RVCE_CS(0x); // num_ref_idx_l0_active_minus1
RVCE_CS(0x); // num_ref_idx_l1_active_minus1
-	for (i = 0; i < 4; ++i) {
+
+	i = enc->pic.frame_num - enc->pic.ref_idx_l0;
+	if (i > 1 && enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P) {
+		RVCE_CS(0x0001); // encRefListModificationOp
+		RVCE_CS(i - 1);  // encRefListModificationNum
+	} else {
+		RVCE_CS(0x); // encRefListModificationOp
+		RVCE_CS(0x); // encRefListModificationNum
+	}
+
+	for (i = 0; i < 3; ++i) {
 		RVCE_CS(0x); // encRefListModificationOp
 		RVCE_CS(0x); // encRefListModificationNum
 	}
@@ -291,22 +306,14 @@ static void encode(struct rvce_encoder *enc)
RVCE_CS(0x); // encDecodedPictureMarkingOp
RVCE_CS(0x); // encDecodedPictureMarkingNum
RVCE_CS(0x); // encDecodedPictureMarkingIdx
-   }
-   for (i = 0; i  4; ++i) {
RVCE_CS(0x); // encDecodedRefBasePictureMarkingOp
RVCE_CS(0x); // encDecodedRefBasePictureMarkingNum
}
 
+   // encReferencePictureL0[0]
RVCE_CS(0x); // pictureStructure
-
-   if (enc-pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR) { 
-   

[Mesa-dev] [PATCH 5/7] st/omx/enc: separate input buffer private and task structure

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Keep tasks as a linked list; this way we can associate
more than one encoding task with each buffer.

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/omx/vid_enc.c | 182 +--
 src/gallium/state_trackers/omx/vid_enc.h |   4 +
 2 files changed, 127 insertions(+), 59 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 080730b..88d15a9 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -54,12 +54,18 @@
 #include entrypoint.h
 #include vid_enc.h
 
-struct input_buf_private {
+struct encode_task {
+   struct list_head list;
+
struct pipe_video_buffer *buf;
struct pipe_resource *bitstream;
void *feedback;
 };
 
+struct input_buf_private {
+   struct list_head tasks;
+};
+
 struct output_buf_private {
struct pipe_resource *bitstream;
struct pipe_transfer *transfer;
@@ -79,6 +85,8 @@ static OMX_ERRORTYPE 
vid_enc_AllocateOutBuffer(omx_base_PortType *comp, OMX_INOU
 static OMX_ERRORTYPE vid_enc_FreeOutBuffer(omx_base_PortType *port, OMX_U32 
idx, OMX_BUFFERHEADERTYPE *buf);
 static void vid_enc_BufferEncoded(OMX_COMPONENTTYPE *comp, 
OMX_BUFFERHEADERTYPE* input, OMX_BUFFERHEADERTYPE* output);
 
+static void enc_ReleaseTasks(struct list_head *head);
+
 static void vid_enc_name(char str[OMX_MAX_STRINGNAME_SIZE])
 {
snprintf(str, OMX_MAX_STRINGNAME_SIZE, OMX_VID_ENC_BASE_NAME, 
driver_descriptor.name);
@@ -243,6 +251,9 @@ static OMX_ERRORTYPE vid_enc_Constructor(OMX_COMPONENTTYPE 
*comp, OMX_STRING nam
priv-scale.xWidth = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
priv-scale.xHeight = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
 
+   LIST_INITHEAD(priv-free_tasks);
+   LIST_INITHEAD(priv-used_tasks);
+
return OMX_ErrorNone;
 }
 
@@ -251,6 +262,9 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE 
*comp)
vid_enc_PrivateType* priv = comp-pComponentPrivate;
int i;
 
+   enc_ReleaseTasks(priv-free_tasks);
+   enc_ReleaseTasks(priv-used_tasks);
+
if (priv-ports) {
   for (i = 0; i  priv-sPortTypesParam[OMX_PortDomainVideo].nPorts; ++i) {
  if(priv-ports[i])
@@ -563,9 +577,10 @@ static OMX_ERRORTYPE 
vid_enc_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
 static OMX_ERRORTYPE vid_enc_FreeInBuffer(omx_base_PortType *port, OMX_U32 
idx, OMX_BUFFERHEADERTYPE *buf)
 {
struct input_buf_private *inp = buf-pInputPortPrivate;
-   pipe_resource_reference(inp-bitstream, NULL);
-   inp-buf-destroy(inp-buf);
-   FREE(inp);
+   if (inp) {
+  enc_ReleaseTasks(inp-tasks);
+  FREE(inp);
+   }
return base_port_FreeBuffer(port, idx, buf);
 }
 
@@ -607,22 +622,25 @@ static OMX_ERRORTYPE 
vid_enc_FreeOutBuffer(omx_base_PortType *port, OMX_U32 idx,
return base_port_FreeBuffer(port, idx, buf);
 }
 
-static OMX_ERRORTYPE enc_NeedInputPortPrivate(omx_base_PortType *port, 
OMX_BUFFERHEADERTYPE *buf)
+static struct encode_task *enc_NeedTask(omx_base_PortType *port)
 {
+   OMX_VIDEO_PORTDEFINITIONTYPE *def = port-sPortParam.format.video;
OMX_COMPONENTTYPE* comp = port-standCompContainer;
vid_enc_PrivateType *priv = comp-pComponentPrivate;
-   OMX_VIDEO_PORTDEFINITIONTYPE *def = port-sPortParam.format.video;
-   struct input_buf_private **inp = (struct input_buf_private 
**)buf-pInputPortPrivate;
+
struct pipe_video_buffer templat = {};
+   struct encode_task *task;
 
-   if (*inp) {
-  pipe_resource_reference((*inp)-bitstream, NULL);
-  return OMX_ErrorNone;
+   if (!LIST_IS_EMPTY(priv-free_tasks)) {
+  task = LIST_ENTRY(struct encode_task, priv-free_tasks.next, list);
+  LIST_DEL(task-list);
+  return task;
}
 
-   if (!(*inp = CALLOC(1, sizeof(struct input_buf_private {
-  return OMX_ErrorInsufficientResources;
-   }
+   /* allocate a new one */
+   task = CALLOC_STRUCT(encode_task);
+   if (!task)
+  return NULL;
 
templat.buffer_format = PIPE_FORMAT_NV12;
templat.chroma_format = PIPE_VIDEO_CHROMA_FORMAT_420;
@@ -630,25 +648,46 @@ static OMX_ERRORTYPE 
enc_NeedInputPortPrivate(omx_base_PortType *port, OMX_BUFFE
templat.height = def-nFrameHeight;
templat.interlaced = false;
 
-   if (!((*inp)-buf = priv-s_pipe-create_video_buffer(priv-s_pipe, 
templat))) {
-  FREE(*inp);
-  return OMX_ErrorInsufficientResources;
+   task-buf = priv-s_pipe-create_video_buffer(priv-s_pipe, templat);
+   if (!task-buf) {
+  FREE(task);
+  return NULL;
}
 
-   return OMX_ErrorNone;
+   return task;
+}
+
+static void enc_MoveTasks(struct list_head *from, struct list_head *to)
+{
+   to-prev-next = from-next;
+   from-next-prev = to-prev;
+   from-prev-next = to;
+   to-prev = from-prev;
+   LIST_INITHEAD(from);
 }
 
-static OMX_ERRORTYPE enc_LoadImage(omx_base_PortType *port, 
OMX_BUFFERHEADERTYPE *buf)
+static void enc_ReleaseTasks(struct list_head *head)
+{
+   struct encode_task 
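
The list splice performed by enc_MoveTasks() in the hunk above can be tried in isolation with the sketch below. It uses a hypothetical minimal circular list (demo_list and its helpers) rather than Mesa's list_head macros, and the early return for an empty source list is an optional shortcut of this sketch, not something present in the posted hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list with a sentinel head (hypothetical). */
struct demo_list {
	struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *head)
{
	head->prev = head->next = head;
}

static int demo_list_empty(const struct demo_list *head)
{
	return head->next == head;
}

static void demo_list_addtail(struct demo_list *item, struct demo_list *head)
{
	item->next = head;
	item->prev = head->prev;
	head->prev->next = item;
	head->prev = item;
}

/* Append every element of 'from' to the tail of 'to', leaving 'from' empty. */
static void demo_list_move_all(struct demo_list *from, struct demo_list *to)
{
	assert(from != NULL && to != NULL && from != to);
	if (demo_list_empty(from))
		return;

	to->prev->next = from->next;   /* old tail of 'to' -> first of 'from' */
	from->next->prev = to->prev;   /* first of 'from' -> old tail of 'to' */
	from->prev->next = to;         /* last of 'from' closes the circle */
	to->prev = from->prev;         /* 'to' now ends with last of 'from' */
	demo_list_init(from);
}
```

Moving whole lists this way is constant time, which is presumably why the tasks are kept on lists instead of being copied between arrays.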

[Mesa-dev] [PATCH 1/7] radeon/vce: remove RVCE_NUM_CPB_EXTRA_FRAMES

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Doesn't seem to be needed any more.

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/drivers/radeon/radeon_vce.c| 2 +-
 src/gallium/drivers/radeon/radeon_vce.h| 1 -
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 3 +--
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c b/src/gallium/drivers/radeon/radeon_vce.c
index 4b824f9..012b4f8 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -262,7 +262,7 @@ struct pipe_video_codec *rvce_create_encoder(struct pipe_context *context,
 	vpitch = align(tmp_surf->npix_y, 16);
 	tmp_buf->destroy(tmp_buf);
 	if (!rvid_create_buffer(enc->ws, &enc->cpb,
-			pitch * vpitch * 1.5 * (RVCE_NUM_CPB_FRAMES + RVCE_NUM_CPB_EXTRA_FRAMES),
+			pitch * vpitch * 1.5 * RVCE_NUM_CPB_FRAMES,
 			RADEON_DOMAIN_VRAM)) {
 		RVID_ERR("Can't create CPB buffer.\n");
 		goto error;
diff --git a/src/gallium/drivers/radeon/radeon_vce.h b/src/gallium/drivers/radeon/radeon_vce.h
index 9dc0c68..3ea738b 100644
--- a/src/gallium/drivers/radeon/radeon_vce.h
+++ b/src/gallium/drivers/radeon/radeon_vce.h
@@ -44,7 +44,6 @@
 #define RVCE_END() *begin = (&enc->cs->buf[enc->cs->cdw] - begin) * 4; }
 
 #define RVCE_NUM_CPB_FRAMES 2
-#define RVCE_NUM_CPB_EXTRA_FRAMES 2
 
 struct r600_common_screen;
 
diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index 26c3629..c41b2d0 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -224,9 +224,8 @@ static void frame_offset(struct rvce_encoder *enc, unsigned frame_num,
 	unsigned pitch = align(enc->luma->level[0].pitch_bytes, 128);
 	unsigned vpitch = align(enc->luma->npix_y, 16);
 	unsigned fsize = pitch * (vpitch + vpitch / 2);
-	unsigned base_offset = RVCE_NUM_CPB_EXTRA_FRAMES * fsize;
 
-	*luma_offset = base_offset + (frame_num % RVCE_NUM_CPB_FRAMES) * fsize;
+	*luma_offset = (frame_num % RVCE_NUM_CPB_FRAMES) * fsize;
 	*chroma_offset = *luma_offset + pitch * vpitch;
 }
 
 
-- 
1.8.3.2
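
As a worked example of the offset arithmetic that remains in frame_offset() after this patch, here is a standalone sketch. demo_align, demo_frame_offset and DEMO_NUM_CPB_FRAMES are hypothetical stand-ins for the driver's helpers, and the pitch and height values mentioned afterwards are just sample inputs.

```c
#include <assert.h>

#define DEMO_NUM_CPB_FRAMES 2

/* Round v up to a multiple of a; a must be a power of two. */
static unsigned demo_align(unsigned v, unsigned a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (v + a - 1) & ~(a - 1);
}

/* Byte offsets of the luma and chroma planes of a frame inside the CPB. */
static void demo_frame_offset(unsigned pitch_bytes, unsigned height,
                              unsigned frame_num,
                              unsigned *luma_offset, unsigned *chroma_offset)
{
	unsigned pitch = demo_align(pitch_bytes, 128);
	unsigned vpitch = demo_align(height, 16);
	/* 4:2:0 frame: a full luma plane plus a half-height chroma plane. */
	unsigned fsize = pitch * (vpitch + vpitch / 2);

	/* Frames cycle through the slots; no extra frames are skipped first. */
	*luma_offset = (frame_num % DEMO_NUM_CPB_FRAMES) * fsize;
	*chroma_offset = *luma_offset + pitch * vpitch;
}
```

For a 1920 byte pitch and 1080 lines, the aligned height is 1088, one frame takes 1920 * 1632 = 3133440 bytes, and the chroma plane of a frame starts 1920 * 1088 = 2088960 bytes after its luma plane.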



[Mesa-dev] [PATCH 2/7] vl: add interface for H264 B-frame encoding

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 11 ++-
 src/gallium/include/pipe/p_video_state.h   |  3 +++
 src/gallium/state_trackers/omx/vid_enc.c   |  8 +++-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c 
b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index c41b2d0..33a58f3 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -156,8 +156,8 @@ static void pic_control(struct rvce_encoder *enc)
RVCE_CS(0x0040); // encConstraintSetFlags
RVCE_CS(0x); // encBPicPattern
RVCE_CS(0x); // weightPredModeBPicture
-   RVCE_CS(0x0001); // encNumberOfReferenceFrames
-   RVCE_CS(0x0001); // encMaxNumRefFrames
+   RVCE_CS(MIN2(enc-base.max_references, 2)); // 
encNumberOfReferenceFrames
+   RVCE_CS(enc-base.max_references + 1); // encMaxNumRefFrames
RVCE_CS(0x); // encNumDefaultActiveRefL0
RVCE_CS(0x); // encNumDefaultActiveRefL1
RVCE_CS(0x); // encSliceMode
@@ -297,8 +297,9 @@ static void encode(struct rvce_encoder *enc)
RVCE_CS(0x); // chromaOffset
}
else if(enc-pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P) {
-   frame_offset(enc, enc-pic.frame_num - 1, luma_offset, 
chroma_offset);
+   frame_offset(enc, enc-pic.ref_idx_l0, luma_offset, 
chroma_offset);
RVCE_CS(0x); // encPicType
+   // TODO: Stores these in the CPB backtrack
RVCE_CS(enc-pic.frame_num - 1); // frameNumber
RVCE_CS(enc-pic.frame_num - 1); // pictureOrderCount
RVCE_CS(luma_offset); // lumaOffset
@@ -322,8 +323,8 @@ static void encode(struct rvce_encoder *enc)
RVCE_CS(0x); // encReferenceRefBasePictureLumaOffset
RVCE_CS(0x); // encReferenceRefBasePictureChromaOffset
RVCE_CS(0x); // pictureCount
-   RVCE_CS(0x); // frameNumber
-   RVCE_CS(0x); // pictureOrderCount
+   RVCE_CS(enc-pic.frame_num); // frameNumber
+   RVCE_CS(enc-pic.pic_order_cnt); // pictureOrderCount
RVCE_CS(0x); // numIPicRemainInRCGOP
RVCE_CS(0x); // numPPicRemainInRCGOP
RVCE_CS(0x); // numBPicRemainInRCGOP
diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index f9721dc..0256a8f 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -368,6 +368,9 @@ struct pipe_h264_enc_picture_desc
 
enum pipe_h264_enc_picture_type picture_type;
unsigned frame_num;
+   unsigned pic_order_cnt;
+   unsigned ref_idx_l0;
+   unsigned ref_idx_l1;
 };
 
 #ifdef __cplusplus
diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 8ec0439..080730b 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -769,11 +769,17 @@ static void enc_ControlPicture(omx_base_PortType *port,
 
if (!(priv-frame_num % OMX_VID_ENC_IDR_PERIOD_DEFAULT) || 
priv-force_pic_type.IntraRefreshVOP) {
   picture-picture_type = PIPE_H264_ENC_PICTURE_TYPE_IDR;
+  picture-ref_idx_l0 = 0;
+  picture-ref_idx_l1 = 0;
   priv-frame_num = 0;
-   } else
+   } else {
   picture-picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
+  picture-ref_idx_l0 = priv-frame_num - 1;
+  picture-ref_idx_l1 = 0;
+   }

picture-frame_num = priv-frame_num++;
+   picture-pic_order_cnt = picture-frame_num;
priv-force_pic_type.IntraRefreshVOP = OMX_FALSE; 
 }
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 3/7] radeon/vce: add proper CPB backtrack

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Remember which frames we encoded at which position.

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/drivers/radeon/radeon_vce.c| 87 --
 src/gallium/drivers/radeon/radeon_vce.h| 15 +
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 44 -
 3 files changed, 123 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index 012b4f8..a7dfcda 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -80,6 +80,57 @@ static void dump_feedback(struct rvce_encoder *enc, struct 
rvid_buffer *fb)
 #endif
 
 /**
+ * reset the CPB handling
+ */
+static void reset_cpb(struct rvce_encoder *enc)
+{
+	unsigned i;
+
+	LIST_INITHEAD(&enc->cpb_slots);
+	for (i = 0; i < RVCE_NUM_CPB_FRAMES; ++i) {
+		struct rvce_cpb_slot *slot = &enc->cpb_array[i];
+		slot->index = i;
+		slot->picture_type = PIPE_H264_ENC_PICTURE_TYPE_SKIP;
+		slot->frame_num = 0;
+		slot->pic_order_cnt = 0;
+		LIST_ADDTAIL(&slot->list, &enc->cpb_slots);
+	}
+}
+
+/**
+ * sort l0 and l1 to the top of the list
+ */
+static void sort_cpb(struct rvce_encoder *enc)
+{
+	struct rvce_cpb_slot *i, *l0 = NULL, *l1 = NULL;
+
+	LIST_FOR_EACH_ENTRY(i, &enc->cpb_slots, list) {
+		if (i->frame_num == enc->pic.ref_idx_l0)
+			l0 = i;
+
+		if (i->frame_num == enc->pic.ref_idx_l1)
+			l1 = i;
+
+		if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P && l0)
+			break;
+
+		if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_B &&
+		    l0 && l1)
+			break;
+	}
+
+	if (l1) {
+		LIST_DEL(&l1->list);
+		LIST_ADD(&l1->list, &enc->cpb_slots);
+	}
+
+	if (l0) {
+		LIST_DEL(&l0->list);
+		LIST_ADD(&l0->list, &enc->cpb_slots);
+	}
+}
+
+
+/**
  * destroy this video encoder
  */
 static void rvce_destroy(struct pipe_video_codec *encoder)
@@ -97,6 +148,7 @@ static void rvce_destroy(struct pipe_video_codec *encoder)
}
rvid_destroy_buffer(enc-cpb);
enc-ws-cs_destroy(enc-cs);
+   FREE(enc-cpb_array);
FREE(enc);
 }
 
@@ -118,6 +170,12 @@ static void rvce_begin_frame(struct pipe_video_codec 
*encoder,
 
enc-get_buffer(vid_buf-resources[0], enc-handle, enc-luma);
enc-get_buffer(vid_buf-resources[1], NULL, enc-chroma);
+
+   if (pic-picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
+   reset_cpb(enc);
+   else if (pic-picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
+pic-picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
+   sort_cpb(enc);

if (!enc-stream_handle) {
struct rvid_buffer fb;
@@ -167,7 +225,17 @@ static void rvce_end_frame(struct pipe_video_codec 
*encoder,
   struct pipe_picture_desc *picture)
 {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
+   struct rvce_cpb_slot *slot = LIST_ENTRY(
+   struct rvce_cpb_slot, enc-cpb_slots.prev, list);
+
flush(enc);
+
+   /* update the CPB backtrack with the just encoded frame */
+   LIST_DEL(slot-list);
+   slot-picture_type = enc-pic.picture_type;
+   slot-frame_num = enc-pic.frame_num;
+   slot-pic_order_cnt = enc-pic.pic_order_cnt;
+   LIST_ADD(slot-list, enc-cpb_slots);
 }
 
 static void rvce_get_feedback(struct pipe_video_codec *encoder,
@@ -213,7 +281,7 @@ struct pipe_video_codec *rvce_create_encoder(struct 
pipe_context *context,
struct rvce_encoder *enc;
struct pipe_video_buffer *tmp_buf, templat = {};
struct radeon_surface *tmp_surf;
-   unsigned pitch, vpitch;
+   unsigned cpb_size;
 
if (!rscreen-info.vce_fw_version) {
RVID_ERR(Kernel doesn't supports VCE!\n);
@@ -258,16 +326,22 @@ struct pipe_video_codec *rvce_create_encoder(struct 
pipe_context *context,
}
 
get_buffer(((struct vl_video_buffer *)tmp_buf)-resources[0], NULL, 
tmp_surf);
-   pitch = align(tmp_surf-level[0].pitch_bytes, 128);
-   vpitch = align(tmp_surf-npix_y, 16);
+   cpb_size = align(tmp_surf-level[0].pitch_bytes, 128);
+   cpb_size = cpb_size * align(tmp_surf-npix_y, 16);
+   cpb_size = cpb_size * 3 / 2;
+   cpb_size = cpb_size * RVCE_NUM_CPB_FRAMES;
tmp_buf-destroy(tmp_buf);
-   if (!rvid_create_buffer(enc-ws, enc-cpb,
-   pitch * vpitch * 1.5 * RVCE_NUM_CPB_FRAMES,
-   RADEON_DOMAIN_VRAM)) {
+   if (!rvid_create_buffer(enc-ws, enc-cpb, cpb_size, 
RADEON_DOMAIN_VRAM)) {
RVID_ERR(Can't create CPB buffer.\n);
goto error;
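
The bookkeeping this patch introduces can be summarised with a small model: slots are kept in most-recently-used order, a referenced frame is moved to the front before encoding, and the frame just encoded overwrites the slot at the tail and then moves to the front. The sketch below models only that ordering with an array instead of a linked list and handles a single reference (l0); all names (demo_cpb, demo_cpb_slot, DEMO_NUM_CPB_SLOTS and the functions) are hypothetical and it is not the driver code.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NUM_CPB_SLOTS 3

struct demo_cpb_slot {
	unsigned index;     /* position of the frame inside the CPB buffer */
	unsigned frame_num; /* frame currently stored at that position */
};

struct demo_cpb {
	struct demo_cpb_slot order[DEMO_NUM_CPB_SLOTS]; /* order[0] is the head */
};

/* Start over, as done for an IDR picture. */
static void demo_cpb_reset(struct demo_cpb *cpb)
{
	unsigned i;

	for (i = 0; i < DEMO_NUM_CPB_SLOTS; ++i) {
		cpb->order[i].index = i;
		cpb->order[i].frame_num = 0;
	}
}

static void demo_cpb_move_to_front(struct demo_cpb *cpb, unsigned pos)
{
	struct demo_cpb_slot tmp;

	assert(pos < DEMO_NUM_CPB_SLOTS);
	tmp = cpb->order[pos];
	memmove(&cpb->order[1], &cpb->order[0], pos * sizeof(tmp));
	cpb->order[0] = tmp;
}

/* Bring the slot holding the l0 reference frame to the head of the list. */
static void demo_cpb_select_l0(struct demo_cpb *cpb, unsigned ref_frame_num)
{
	unsigned pos;

	for (pos = 0; pos < DEMO_NUM_CPB_SLOTS; ++pos) {
		if (cpb->order[pos].frame_num == ref_frame_num) {
			demo_cpb_move_to_front(cpb, pos);
			return;
		}
	}
}

/* The encoded frame replaces the slot at the tail, then becomes the head. */
static void demo_cpb_end_frame(struct demo_cpb *cpb, unsigned frame_num)
{
	cpb->order[DEMO_NUM_CPB_SLOTS - 1].frame_num = frame_num;
	demo_cpb_move_to_front(cpb, DEMO_NUM_CPB_SLOTS - 1);
}
```

The effect is that a slot which was just used as a reference is the last candidate for being overwritten, so the encoder always knows which buffer position still holds which frame.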

[Mesa-dev] [PATCH 7/7] st/omx/enc: enable B-frames

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/omx/vid_enc.c | 10 +++---
 src/gallium/state_trackers/omx/vid_enc.h |  1 +
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 7633cd6..1e6a189 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -563,7 +563,7 @@ static OMX_ERRORTYPE 
vid_enc_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
 priv-scale.xWidth : 
port-sPortParam.format.video.nFrameWidth;
  templat.height = priv-scale_buffer[priv-current_scale_buffer] ?
 priv-scale.xHeight : 
port-sPortParam.format.video.nFrameHeight;
- templat.max_references = 1;
+ templat.max_references = OMX_VID_ENC_P_PERIOD_DEFAULT;
 
  priv-codec = priv-s_pipe-create_video_codec(priv-s_pipe, 
templat);
 
@@ -907,13 +907,17 @@ static OMX_ERRORTYPE 
vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
}
 
/* -- determine picture type - */
-   if (!(priv->pic_order_cnt % OMX_VID_ENC_IDR_PERIOD_DEFAULT) || priv->force_pic_type.IntraRefreshVOP) {
+   if (!(priv->pic_order_cnt % OMX_VID_ENC_IDR_PERIOD_DEFAULT) ||
+       priv->force_pic_type.IntraRefreshVOP) {
       enc_ClearBframes(port, inp);
       picture_type = PIPE_H264_ENC_PICTURE_TYPE_IDR;
       priv->force_pic_type.IntraRefreshVOP = OMX_FALSE; 
       priv->frame_num = 0;
+   } else if (!(priv->pic_order_cnt % OMX_VID_ENC_P_PERIOD_DEFAULT) ||
+              (buf->nFlags & OMX_BUFFERFLAG_EOS)) {
+      picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
    } else {
-      picture_type = PIPE_H264_ENC_PICTURE_TYPE_P; 
+      picture_type = PIPE_H264_ENC_PICTURE_TYPE_B;
    }
 
    task->pic_order_cnt = priv->pic_order_cnt++;
diff --git a/src/gallium/state_trackers/omx/vid_enc.h 
b/src/gallium/state_trackers/omx/vid_enc.h
index 6f6226a..c01c959 100644
--- a/src/gallium/state_trackers/omx/vid_enc.h
+++ b/src/gallium/state_trackers/omx/vid_enc.h
@@ -60,6 +60,7 @@
 #define OMX_VID_ENC_SCALING_WIDTH_DEFAULT 0x
 #define OMX_VID_ENC_SCALING_HEIGHT_DEFAULT 0x
 #define OMX_VID_ENC_IDR_PERIOD_DEFAULT 1000
+#define OMX_VID_ENC_P_PERIOD_DEFAULT 4
 
 #define OMX_VID_ENC_NUM_SCALING_BUFFERS 4
 
-- 
1.8.3.2
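
To see which picture types the new decision in vid_enc_EncodeFrame() produces, the rule can be pulled out into a pure function as below. This is a sketch: demo_pic_type and demo_pick_type are hypothetical names, and the two periods simply mirror the defaults from vid_enc.h shown in this patch (IDR every 1000 pictures, P every 4).

```c
#include <assert.h>

enum demo_pic_type {
	DEMO_PIC_IDR,
	DEMO_PIC_P,
	DEMO_PIC_B
};

#define DEMO_IDR_PERIOD 1000 /* mirrors OMX_VID_ENC_IDR_PERIOD_DEFAULT */
#define DEMO_P_PERIOD 4      /* mirrors OMX_VID_ENC_P_PERIOD_DEFAULT */

/* Decide the picture type from the display-order counter and two flags. */
static enum demo_pic_type demo_pick_type(unsigned pic_order_cnt,
                                         int force_idr, int end_of_stream)
{
	assert(DEMO_P_PERIOD > 0 && DEMO_IDR_PERIOD > 0);

	if (!(pic_order_cnt % DEMO_IDR_PERIOD) || force_idr)
		return DEMO_PIC_IDR;

	/* The last picture of a stream must not be a B-frame, because a
	 * B-frame needs a later reference picture. */
	if (!(pic_order_cnt % DEMO_P_PERIOD) || end_of_stream)
		return DEMO_PIC_P;

	return DEMO_PIC_B;
}
```

With these defaults the display-order pattern starts IDR B B B P B B B P, which is also why max_references is raised to the P period in the first hunk.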



[Mesa-dev] [PATCH 6/7] st/omx/enc: implement frame reordering and B-frames

2014-04-09 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/omx/vid_enc.c | 91 +---
 src/gallium/state_trackers/omx/vid_enc.h |  5 +-
 2 files changed, 76 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 88d15a9..7633cd6 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -58,6 +58,7 @@ struct encode_task {
struct list_head list;
 
    struct pipe_video_buffer *buf;
+   unsigned pic_order_cnt;
    struct pipe_resource *bitstream;
    void *feedback;
 };
@@ -247,12 +248,14 @@ static OMX_ERRORTYPE vid_enc_Constructor(OMX_COMPONENTTYPE *comp, OMX_STRING nam
 
    priv->force_pic_type.IntraRefreshVOP = OMX_FALSE; 
    priv->frame_num = 0;
+   priv->pic_order_cnt = 0;
 
    priv->scale.xWidth = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
    priv->scale.xHeight = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
 
    LIST_INITHEAD(&priv->free_tasks);
    LIST_INITHEAD(&priv->used_tasks);
+   LIST_INITHEAD(&priv->b_frames);
 
    return OMX_ErrorNone;
 }
@@ -264,6 +267,7 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE *comp)
 
    enc_ReleaseTasks(&priv->free_tasks);
    enc_ReleaseTasks(&priv->used_tasks);
+   enc_ReleaseTasks(&priv->b_frames);
 
    if (priv->ports) {
       for (i = 0; i < priv->sPortTypesParam[OMX_PortDomainVideo].nPorts; ++i) {
@@ -803,23 +807,13 @@ static void enc_ControlPicture(omx_base_PortType *port, struct pipe_h264_enc_pic
    picture->quant_p_frames = priv->quant.nQpP;
    picture->quant_b_frames = priv->quant.nQpB;
 
-   if (!(priv->frame_num % OMX_VID_ENC_IDR_PERIOD_DEFAULT) || priv->force_pic_type.IntraRefreshVOP) {
-      picture->picture_type = PIPE_H264_ENC_PICTURE_TYPE_IDR;
-      picture->ref_idx_l0 = 0;
-      picture->ref_idx_l1 = 0;
-      priv->frame_num = 0;
-   } else {
-      picture->picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
-      picture->ref_idx_l0 = priv->frame_num - 1;
-      picture->ref_idx_l1 = 0;
-   }
-   
-   picture->frame_num = priv->frame_num++;
-   picture->pic_order_cnt = picture->frame_num;
-   priv->force_pic_type.IntraRefreshVOP = OMX_FALSE; 
+   picture->frame_num = priv->frame_num;
+   picture->ref_idx_l0 = priv->ref_idx_l0;
+   picture->ref_idx_l1 = priv->ref_idx_l1;
 }
 
-static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task)
+static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task,
+                           enum pipe_h264_enc_picture_type picture_type)
 {
    OMX_COMPONENTTYPE* comp = port->standCompContainer;
    vid_enc_PrivateType *priv = comp->pComponentPrivate;
@@ -834,6 +828,9 @@ static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task)
    /* -- allocate output buffer - */
    task->bitstream = pipe_buffer_create(priv->s_pipe->screen, PIPE_BIND_VERTEX_BUFFER,
                                         PIPE_USAGE_STREAM, size);
+
+   picture.picture_type = picture_type;
+   picture.pic_order_cnt = task->pic_order_cnt;
    enc_ControlPicture(port, &picture);
 
    /* -- encode frame - */
@@ -842,11 +839,39 @@ static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task)
    priv->codec->end_frame(priv->codec, vbuf, &picture.base);
 }
 
+static void enc_ClearBframes(omx_base_PortType *port, struct input_buf_private *inp)
+{
+   OMX_COMPONENTTYPE* comp = port->standCompContainer;
+   vid_enc_PrivateType *priv = comp->pComponentPrivate;
+   struct encode_task *task;
+
+   if (LIST_IS_EMPTY(&priv->b_frames))
+      return;
+
+   task = LIST_ENTRY(struct encode_task, priv->b_frames.prev, list);
+   LIST_DEL(&task->list);
+
+   /* promote last frame to P frame */
+   priv->ref_idx_l0 = priv->ref_idx_l1;
+   enc_HandleTask(port, task, PIPE_H264_ENC_PICTURE_TYPE_P);
+   LIST_ADDTAIL(&task->list, &inp->tasks);
+   priv->ref_idx_l1 = priv->frame_num++;
+
+   /* handle B frames */
+   LIST_FOR_EACH_ENTRY(task, &priv->b_frames, list) {
+      enc_HandleTask(port, task, PIPE_H264_ENC_PICTURE_TYPE_B);
+      priv->ref_idx_l0 = priv->frame_num++;
+   }
+
+   enc_MoveTasks(&priv->b_frames, &inp->tasks);
+}
+
 static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEADERTYPE *buf)
 {
    OMX_COMPONENTTYPE* comp = port->standCompContainer;
    vid_enc_PrivateType *priv = comp->pComponentPrivate;
    struct input_buf_private *inp = buf->pInputPortPrivate;
+   enum pipe_h264_enc_picture_type picture_type;
    struct encode_task *task;
    OMX_ERRORTYPE err;
 
@@ -863,8 +888,10 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
       return OMX_ErrorInsufficientResources;
 
    if (buf->nFilledLen == 0) {
-      if (buf->nFlags & OMX_BUFFERFLAG_EOS)
+      if (buf->nFlags & OMX_BUFFERFLAG_EOS) {
          buf->nFilledLen = buf->nAllocLen;
+         enc_ClearBframes(port, inp);
+      }
       return 

[Mesa-dev] [PATCH 04/10] glsl/linker: initialize explicit uniform locations

2014-04-09 Thread Tapani Pälli
This patch initializes the UniformRemapTable for explicit locations. This
needs to happen before optimizations so that inactive uniforms, which may
get trimmed away, still have their explicit locations reserved correctly.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/glsl/linker.cpp | 99 +
 1 file changed, 99 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 7c194a2..1b4cb63 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -74,6 +74,7 @@
 #include "link_varyings.h"
 #include "ir_optimization.h"
 #include "ir_rvalue_visitor.h"
+#include "ir_uniform.h"
 
 extern "C" {
 #include "main/shaderobj.h"
@@ -2089,6 +2090,100 @@ check_image_resources(struct gl_context *ctx, struct gl_shader_program *prog)
       linker_error(prog, "Too many combined image uniforms and fragment outputs");
 }
 
+
+/**
+ * Initializes explicit location slots point to -1 for a variable,
+ * checks for overlaps between other uniforms using explicit locations.
+ */
+static bool
+reserve_explicit_locations(struct gl_shader_program *prog,
+                           string_to_uint_map *map, ir_variable *var)
+{
+   unsigned max_loc = var->data.location + var->type->component_slots() - 1;
+
+   /* Resize remap table if locations do not fit in the current one. */
+   if (max_loc + 1 > prog->NumUniformRemapTable) {
+      prog->UniformRemapTable =
+         reralloc(prog, prog->UniformRemapTable,
+                  gl_uniform_storage *,
+                  max_loc + 1);
+      prog->NumUniformRemapTable = max_loc + 1;
+   }
+
+   for (unsigned i = 0; i < var->type->component_slots(); i++) {
+      unsigned loc = var->data.location + i;
+
+      /* Check if location is already used. */
+      if (prog->UniformRemapTable[loc] == (gl_uniform_storage *) -1) {
+
+         /* Possibly same uniform from a different stage, this is ok. */
+         unsigned hash_loc;
+         if (map->get(hash_loc, var->name) && hash_loc == loc - i)
+               continue;
+
+         /* ARB_explicit_uniform_location specification states:
+          *
+          *     "No two default-block uniform variables in the program can have
+          *     the same location, even if they are unused, otherwise a compiler
+          *     or linker error will be generated."
+          */
+         linker_error(prog, "location qualifier "
+                      "for uniform %s "
+                      "overlaps previously used location",
+                      var->name);
+         return false;
+      }
+
+      prog->UniformRemapTable[loc] = (gl_uniform_storage *) -1;
+   }
+
+   /* Note, base location used for arrays. */
+   map->put(var->data.location, var->name);
+
+   return true;
+}
+
+/**
+ * Check and reserve all explicit uniform locations, called before
+ * any optimizations happen to handle also inactive uniforms and
+ * inactive array elements that may get trimmed away.
+ */
+static void
+check_explicit_uniform_locations(struct gl_context *ctx,
+                                 struct gl_shader_program *prog)
+{
+   if (!ctx->Extensions.ARB_explicit_uniform_location)
+      return;
+
+   /* This map is used to detect if overlapping explicit locations
+    * occur with the same uniform (from different stage) or a different one.
+    */
+   string_to_uint_map *uniform_map = new string_to_uint_map;
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+      struct gl_shader *sh = prog->_LinkedShaders[i];
+
+      if (!sh)
+         continue;
+
+      foreach_list(node, sh->ir) {
+         ir_variable *var = ((ir_instruction *)node)->as_variable();
+         if ((var && var->data.mode == ir_var_uniform) &&
+             var->data.explicit_location) {
+            if (!reserve_explicit_locations(prog, uniform_map, var))
+               return;
+
+            /* Initialize locations that were allocated but left unused. */
+            for (unsigned i = 0; i < prog->NumUniformRemapTable; i++)
+               if (prog->UniformRemapTable[i] != (gl_uniform_storage *) -1)
+                  prog->UniformRemapTable[i] = NULL;
+         }
+      }
+   }
+
+   delete uniform_map;
+}
+
 void
 link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
 {
@@ -2232,6 +2327,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
          break;
    }
 
+   check_explicit_uniform_locations(ctx, prog);
+   if (!prog->LinkStatus)
+      goto done;
+
    /* Validate the inputs of each stage with the output of the preceding
     * stage.
     */
-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
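
For readers following the series without a Mesa tree at hand, the reservation step in the patch above can be sketched in plain C roughly as follows. This is a standalone illustration, not the patch's code: remap_table, reserve_range, SLOT_FREE and SLOT_RESERVED are made-up names, the SLOT_RESERVED value stands in for the (gl_uniform_storage *) -1 marker, and the exception for the same uniform appearing in another stage (handled through string_to_uint_map in the patch) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these are Mesa names. */
#define SLOT_FREE      ((uintptr_t) 0)
#define SLOT_RESERVED  UINTPTR_MAX   /* plays the role of (gl_uniform_storage *) -1 */

struct remap_table {
   uintptr_t *slots;
   unsigned count;
};

/* Reserve num_slots consecutive locations starting at base.
 * Returns false on allocation failure or if any slot is already reserved. */
static bool
reserve_range(struct remap_table *table, unsigned base, unsigned num_slots)
{
   assert(table != NULL);

   unsigned end = base + num_slots;

   /* Grow the table when the range does not fit; new slots start out free. */
   if (end > table->count) {
      uintptr_t *grown = realloc(table->slots, end * sizeof(*grown));
      if (grown == NULL)
         return false;
      for (unsigned i = table->count; i < end; i++)
         grown[i] = SLOT_FREE;
      table->slots = grown;
      table->count = end;
   }

   /* Refuse the whole range if any slot in it is already taken. */
   for (unsigned i = base; i < end; i++) {
      if (table->slots[i] == SLOT_RESERVED)
         return false;
   }

   for (unsigned i = base; i < end; i++)
      table->slots[i] = SLOT_RESERVED;

   return true;
}
```

Unlike the loop in the patch, the sketch checks the whole range before marking anything, so a rejected reservation leaves no slot marked. That is a choice made for the illustration only.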


[Mesa-dev] [PATCH 00/10] GL_ARB_explicit_uniform_location v2

2014-04-09 Thread Tapani Pälli
Hi,

These patches implement the extension. There are no Piglit regressions, and
all the tests for the extension pass. Location initialization and assignment
are now done as Ian suggested, which removed quite a bit of code, since there
is no longer any need to store inactive uniforms temporarily.

Here's a branch with the patches:
http://cgit.freedesktop.org/~tpalli/mesa/log/?h=exp_uniform_loc_v2

// Tapani


Tapani Pälli (10):
  glapi: add GL_ARB_explicit_uniform_location
  mesa: add enable bit for ARB_explicit_uniform_location
  mesa: add new enum MAX_UNIFORM_LOCATIONS
  glsl/linker: initialize explicit uniform locations
  glsl/linker: assign explicit uniform locations
  mesa: support inactive uniforms in glUniform* functions
  glsl: add enable bit for ARB_explicit_uniform_location
  glsl: parser changes for GL_ARB_explicit_uniform_location
  Enable GL_ARB_explicit_uniform_location in the drivers.
  docs: update ARB_explicit_uniform_location status

 docs/GL3.txt |  2 +-
 src/glsl/ast_to_hir.cpp  | 37 +++
 src/glsl/glcpp/glcpp-parse.y |  3 +
 src/glsl/glsl_lexer.ll   |  1 +
 src/glsl/glsl_parser_extras.cpp  |  1 +
 src/glsl/glsl_parser_extras.h| 16 +
 src/glsl/ir_uniform.h|  5 +-
 src/glsl/link_uniforms.cpp   | 56 ++--
 src/glsl/linker.cpp  | 99 
 src/mapi/glapi/gen/gl_API.xml|  6 ++
 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 src/mesa/main/context.c  | 10 ++-
 src/mesa/main/extensions.c   |  1 +
 src/mesa/main/get.c  |  1 +
 src/mesa/main/get_hash_params.py |  1 +
 src/mesa/main/mtypes.h   |  6 ++
 src/mesa/main/tests/enum_strings.cpp |  1 +
 src/mesa/main/uniform_query.cpp  | 15 +
 src/mesa/state_tracker/st_extensions.c   |  1 +
 19 files changed, 254 insertions(+), 9 deletions(-)

-- 
1.9.0



[Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS

2014-04-09 Thread Tapani Pälli
This patch adds the new implementation-dependent value required by the
GL_ARB_explicit_uniform_location extension. The default value for
user-assignable locations is calculated as the sum of MaxUniformComponents
over all stages.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/mesa/main/context.c  | 10 +-
 src/mesa/main/get.c  |  1 +
 src/mesa/main/get_hash_params.py |  1 +
 src/mesa/main/mtypes.h   |  5 +
 src/mesa/main/tests/enum_strings.cpp |  1 +
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 860ae86..8b77df1 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -610,8 +610,16 @@ _mesa_init_constants(struct gl_context *ctx)
    ctx->Const.MaxUniformBlockSize = 16384;
    ctx->Const.UniformBufferOffsetAlignment = 1;
 
-   for (i = 0; i < MESA_SHADER_STAGES; i++)
+   /* GL_ARB_explicit_uniform_location, initial value calculated
+    * as sum of MaxUniformComponents for each stage.
+    */
+   ctx->Const.MaxUserAssignableUniformLocations = 0;
+
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
       init_program_limits(ctx, i, &ctx->Const.Program[i]);
+      ctx->Const.MaxUserAssignableUniformLocations +=
+         ctx->Const.Program[i].MaxUniformComponents;
+   }
 
    ctx->Const.MaxProgramMatrices = MAX_PROGRAM_MATRICES;
    ctx->Const.MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH;
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6d95790..8b50441 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array);
 EXTRA_EXT(ARB_compute_shader);
 EXTRA_EXT(ARB_gpu_shader5);
 EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5);
+EXTRA_EXT(ARB_explicit_uniform_location);
 
 static const int
 extra_ARB_color_buffer_float_or_glcore[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 06d0bba..5709d42 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -474,6 +474,7 @@ descriptor=[
   [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING)", "NO_EXTRA" ],
   [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH)", "NO_EXTRA" ],
   [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE)", "NO_EXTRA" ],
+  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations)", "NO_EXTRA" ],
   [ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth)", "NO_EXTRA" ],
   [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst)", "NO_EXTRA" ],
   [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes)", "NO_EXTRA" ],
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ac6bbe..fefbe06 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3311,6 +3311,11 @@ struct gl_constants
    GLuint UniformBufferOffsetAlignment;
    /** @} */
 
+   /**
+    * GL_ARB_explicit_uniform_location
+    */
+   GLuint MaxUserAssignableUniformLocations;
+
    /** GL_ARB_geometry_shader4 */
    GLuint MaxGeometryOutputVertices;
    GLuint MaxGeometryTotalOutputComponents;
diff --git a/src/mesa/main/tests/enum_strings.cpp b/src/mesa/main/tests/enum_strings.cpp
index 3795700..298ff6a 100644
--- a/src/mesa/main/tests/enum_strings.cpp
+++ b/src/mesa/main/tests/enum_strings.cpp
@@ -787,6 +787,7 @@ const struct enum_info everything[] = {
    { 0x8256, "GL_RESET_NOTIFICATION_STRATEGY_ARB" },
    { 0x8257, "GL_PROGRAM_BINARY_RETRIEVABLE_HINT" },
    { 0x8261, "GL_NO_RESET_NOTIFICATION_ARB" },
+   { 0x826E, "GL_MAX_UNIFORM_LOCATIONS" },
    { 0x82DF, "GL_TEXTURE_IMMUTABLE_LEVELS" },
    { 0x8362, "GL_UNSIGNED_BYTE_2_3_3_REV" },
    { 0x8363, "GL_UNSIGNED_SHORT_5_6_5" },
-- 
1.9.0
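
As a side note for reviewers, the default computed in _mesa_init_constants() above is just a sum over the shader stages. A minimal standalone sketch of that arithmetic follows. The names stage_limits and default_max_uniform_locations are invented for the example, and the per-stage numbers in the usage note are made up rather than taken from any driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-stage limits; Mesa keeps these in ctx->Const.Program[]. */
struct stage_limits {
   unsigned max_uniform_components;
};

/* Sum MaxUniformComponents over all stages, as the patch does for the
 * initial value of MaxUserAssignableUniformLocations. */
static unsigned
default_max_uniform_locations(const struct stage_limits *stages,
                              size_t num_stages)
{
   assert(stages != NULL || num_stages == 0);

   unsigned total = 0;
   for (size_t i = 0; i < num_stages; i++)
      total += stages[i].max_uniform_components;
   return total;
}
```

With three stages reporting 1024, 512 and 1024 components, the sketch yields 2560 user-assignable locations. The patch sets this value in _mesa_init_constants(), next to the other default constants.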



[Mesa-dev] [PATCH 02/10] mesa: add enable bit for ARB_explicit_uniform_location

2014-04-09 Thread Tapani Pälli
Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/mesa/main/extensions.c | 1 +
 src/mesa/main/mtypes.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index a72284c..8605189 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -99,6 +99,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_draw_indirect",                       o(ARB_draw_indirect),                       GLC,            2010 },
    { "GL_ARB_draw_instanced",                      o(ARB_draw_instanced),                      GL,             2008 },
    { "GL_ARB_explicit_attrib_location",            o(ARB_explicit_attrib_location),            GL,             2009 },
+   { "GL_ARB_explicit_uniform_location",           o(ARB_explicit_uniform_location),           GL,             2012 },
    { "GL_ARB_fragment_coord_conventions",          o(ARB_fragment_coord_conventions),          GL,             2009 },
    { "GL_ARB_fragment_program",                    o(ARB_fragment_program),                    GLL,            2002 },
    { "GL_ARB_fragment_program_shadow",             o(ARB_fragment_program_shadow),             GLL,            2003 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4d014d1..7ac6bbe 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3508,6 +3508,7 @@ struct gl_extensions
    GLboolean ARB_fragment_shader;
    GLboolean ARB_framebuffer_object;
    GLboolean ARB_explicit_attrib_location;
+   GLboolean ARB_explicit_uniform_location;
    GLboolean ARB_geometry_shader4;
    GLboolean ARB_gpu_shader5;
    GLboolean ARB_half_float_vertex;
-- 
1.9.0



[Mesa-dev] [PATCH 01/10] glapi: add GL_ARB_explicit_uniform_location

2014-04-09 Thread Tapani Pälli
Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/mapi/glapi/gen/gl_API.xml | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 9200cd6..d269d7d 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8312,6 +8312,12 @@
 
 <!-- ARB extensions #128...#131 -->
 
+<category name="GL_ARB_explicit_uniform_location" number="128">
+    <enum name="MAX_UNIFORM_LOCATIONS" count="1" value="0x826E">
+        <size name="Get" mode="get"/>
+    </enum>
+</category>
+
 <xi:include href="ARB_invalidate_subdata.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
 
 <!-- ARB extensions #134...#138 -->
-- 
1.9.0



[Mesa-dev] [PATCH 06/10] mesa: support inactive uniforms in glUniform* functions

2014-04-09 Thread Tapani Pälli
Support inactive uniforms that have an explicit location set in the
glUniform* functions.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/mesa/main/uniform_query.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 5f1af08..e33800a 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -253,6 +253,21 @@ validate_uniform_parameters(struct gl_context *ctx,
       return false;
    }
 
+   /* If the driver storage pointer in remap table is -1, we ignore silently.
+    *
+    * GL_ARB_explicit_uniform_location spec says:
+    *     "What happens if Uniform* is called with an explicitly defined
+    *     uniform location, but that uniform is deemed inactive by the
+    *     linker?
+    *
+    *     RESOLVED: The call is ignored for inactive uniform variables and
+    *     no error is generated."
+    *
+    */
+   if (ctx->Extensions.ARB_explicit_uniform_location &&
+      shProg->UniformRemapTable[location] == (gl_uniform_storage *) -1)
+      return false;
+
    _mesa_uniform_split_location_offset(shProg, location, loc, array_index);
 
    if (shProg->UniformStorage[*loc].array_elements == 0 && count > 1) {
-- 
1.9.0
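
To make the three possible outcomes of the lookup easier to see, here is a small standalone sketch of how a remap-table entry might be classified. It is not the code from uniform_query.cpp: uniform_storage, inactive_marker, location_kind and classify_location are hypothetical names, and the sketch uses the address of a static object as its sentinel where the patch uses (gl_uniform_storage *) -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gl_uniform_storage. */
struct uniform_storage {
   const char *name;
};

/* Sentinel for "explicit location reserved, uniform inactive".
 * The patch uses (gl_uniform_storage *) -1 for the same purpose. */
static struct uniform_storage inactive_marker;

enum location_kind {
   LOCATION_INVALID,    /* out of range or never assigned */
   LOCATION_INACTIVE,   /* reserved explicitly, call is silently ignored */
   LOCATION_ACTIVE      /* backed by real storage */
};

static enum location_kind
classify_location(struct uniform_storage *const *table, unsigned count,
                  int location)
{
   assert(table != NULL || count == 0);

   if (location < 0 || (unsigned) location >= count ||
       table[location] == NULL)
      return LOCATION_INVALID;

   if (table[location] == &inactive_marker)
      return LOCATION_INACTIVE;

   return LOCATION_ACTIVE;
}
```

In terms of the patch, LOCATION_INACTIVE corresponds to the new early "return false" that raises no GL error, following the RESOLVED issue quoted from the spec.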



[Mesa-dev] [PATCH 07/10] glsl: add enable bit for ARB_explicit_uniform_location

2014-04-09 Thread Tapani Pälli
Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index a42f3d2..d6415ab 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -505,6 +505,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(ARB_draw_buffers,               true,  false,     dummy_true),
    EXT(ARB_draw_instanced,             true,  false,     ARB_draw_instanced),
    EXT(ARB_explicit_attrib_location,   true,  false,     ARB_explicit_attrib_location),
+   EXT(ARB_explicit_uniform_location,  true,  false,     ARB_explicit_uniform_location),
    EXT(ARB_fragment_coord_conventions, true,  false,     ARB_fragment_coord_conventions),
    EXT(ARB_texture_rectangle,          true,  false,     dummy_true),
    EXT(EXT_texture_array,              true,  false,     EXT_texture_array),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 3ad205c..c53c583 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -345,6 +345,8 @@ struct _mesa_glsl_parse_state {
    bool ARB_draw_instanced_warn;
    bool ARB_explicit_attrib_location_enable;
    bool ARB_explicit_attrib_location_warn;
+   bool ARB_explicit_uniform_location_enable;
+   bool ARB_explicit_uniform_location_warn;
    bool ARB_fragment_coord_conventions_enable;
    bool ARB_fragment_coord_conventions_warn;
    bool ARB_texture_rectangle_enable;
-- 
1.9.0



[Mesa-dev] [PATCH 09/10] Enable GL_ARB_explicit_uniform_location in the drivers.

2014-04-09 Thread Tapani Pälli
Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 src/mesa/state_tracker/st_extensions.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 15fcd30..f8abf98 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -170,6 +170,7 @@ intelInitExtensions(struct gl_context *ctx)
    ctx->Extensions.ARB_draw_instanced = true;
    ctx->Extensions.ARB_ES2_compatibility = true;
    ctx->Extensions.ARB_explicit_attrib_location = true;
+   ctx->Extensions.ARB_explicit_uniform_location = true;
    ctx->Extensions.ARB_fragment_coord_conventions = true;
    ctx->Extensions.ARB_fragment_program = true;
    ctx->Extensions.ARB_fragment_program_shadow = true;
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 3e1e45d..5b11e7b 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -534,6 +534,7 @@ void st_init_extensions(struct st_context *st)
    ctx->Extensions.ARB_ES2_compatibility = GL_TRUE;
    ctx->Extensions.ARB_draw_elements_base_vertex = GL_TRUE;
    ctx->Extensions.ARB_explicit_attrib_location = GL_TRUE;
+   ctx->Extensions.ARB_explicit_uniform_location = GL_TRUE;
    ctx->Extensions.ARB_fragment_coord_conventions = GL_TRUE;
    ctx->Extensions.ARB_fragment_program = GL_TRUE;
    ctx->Extensions.ARB_fragment_shader = GL_TRUE;
-- 
1.9.0



[Mesa-dev] [PATCH 08/10] glsl: parser changes for GL_ARB_explicit_uniform_location

2014-04-09 Thread Tapani Pälli
This patch adds a preprocessor define for the extension and stores explicit
location data for uniforms during AST->HIR conversion. It also makes the
layout token available when the extension is enabled.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/glsl/ast_to_hir.cpp   | 37 +
 src/glsl/glcpp/glcpp-parse.y  |  3 +++
 src/glsl/glsl_lexer.ll|  1 +
 src/glsl/glsl_parser_extras.h | 14 ++
 4 files changed, 55 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 8d55ee3..7431ad7 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2170,6 +2170,43 @@ validate_explicit_location(const struct ast_type_qualifier *qual,
 {
    bool fail = false;
 
+   /* Checks for GL_ARB_explicit_uniform_location. */
+   if (qual->flags.q.uniform) {
+
+      if (!state->check_explicit_uniform_location_allowed(loc, var))
+         return;
+
+      const struct gl_context *const ctx = state->ctx;
+      unsigned max_loc = qual->location + var->type->component_slots() - 1;
+
+      /* ARB_explicit_uniform_location specification states:
+       *
+       *     "The explicitly defined locations and the generated locations
+       *     must be in the range of 0 to MAX_UNIFORM_LOCATIONS minus one."
+       *
+       *     "Valid locations for default-block uniform variable locations
+       *     are in the range of 0 to the implementation-defined maximum
+       *     number of uniform locations."
+       */
+      if (qual->location < 0) {
+         _mesa_glsl_error(loc, state,
+                          "explicit location < 0 for uniform %s", var->name);
+         return;
+      }
+
+      if (max_loc >= ctx->Const.MaxUserAssignableUniformLocations) {
+         _mesa_glsl_error(loc, state, "location qualifier for uniform %s "
+                          ">= MAX_UNIFORM_LOCATIONS (%u)",
+                          var->name,
+                          ctx->Const.MaxUserAssignableUniformLocations);
+         return;
+      }
+
+      var->data.explicit_location = true;
+      var->data.location = qual->location;
+      return;
+   }
+
    /* Between GL_ARB_explicit_attrib_location an
     * GL_ARB_separate_shader_objects, the inputs and outputs of any shader
     * stage can be assigned explicit locations.  The checking here associates
diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index f28d853..6d42138 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2087,6 +2087,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
              if (extensions->ARB_explicit_attrib_location)
                 add_builtin_define(parser, "GL_ARB_explicit_attrib_location", 1);
 
+             if (extensions->ARB_explicit_uniform_location)
+                add_builtin_define(parser, "GL_ARB_explicit_uniform_location", 1);
+
              if (extensions->ARB_shader_texture_lod)
                 add_builtin_define(parser, "GL_ARB_shader_texture_lod", 1);
 
diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index 7602351..83f0b6d 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -393,6 +393,7 @@ layout  {
  || yyextra->AMD_conservative_depth_enable
  || yyextra->ARB_conservative_depth_enable
  || yyextra->ARB_explicit_attrib_location_enable
+ || yyextra->ARB_explicit_uniform_location_enable
   || yyextra->has_separate_shader_objects()
  || yyextra->ARB_uniform_buffer_object_enable
  || yyextra->ARB_fragment_coord_conventions_enable
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c53c583..20879a0 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -152,6 +152,20 @@ struct _mesa_glsl_parse_state {
       return true;
    }
 
+   bool check_explicit_uniform_location_allowed(YYLTYPE *locp,
+                                                const ir_variable *var)
+   {
+      /* Requires OpenGL 3.3 or ARB_explicit_attrib_location. */
+      if (ctx->Version < 33 && !ctx->Extensions.ARB_explicit_attrib_location) {
+         _mesa_glsl_error(locp, this, "%s explicit location requires "
+                          "GL_ARB_explicit_attrib_location extension "
+                          "or OpenGL 3.3", mode_string(var));
+         return false;
+      }
+
+      return true;
+   }
+
    bool has_explicit_attrib_location() const
    {
       return ARB_explicit_attrib_location_enable || is_version(330, 300);
-- 
1.9.0
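
The range check added to validate_explicit_location() above boils down to "location >= 0 and location + slots - 1 < MAX_UNIFORM_LOCATIONS". A standalone sketch of that test, under the assumption that a variable always occupies at least one slot, could look like this. explicit_location_in_range is a made-up helper name, not a Mesa function.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when [location, location + slots - 1] lies inside
 * [0, max_locations - 1]. Hypothetical helper for illustration only. */
static bool
explicit_location_in_range(int location, unsigned slots,
                           unsigned max_locations)
{
   /* Assumption of this sketch: a variable uses at least one slot. */
   if (location < 0 || slots == 0)
      return false;

   unsigned base = (unsigned) location;

   /* Compare against the remaining room instead of adding base + slots,
    * so the arithmetic cannot wrap around. */
   return base < max_locations && slots <= max_locations - base;
}
```

The patch computes max_loc = location + component_slots() - 1 and compares it with MaxUserAssignableUniformLocations. The sketch expresses the same bound by subtraction, which is only a stylistic difference for realistic values.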



[Mesa-dev] [PATCH 05/10] glsl/linker: assign explicit uniform locations

2014-04-09 Thread Tapani Pälli
This patch refactors the existing uniform processing so that explicit
locations are taken into account during variable processing. These locations
are temporarily stored in gl_uniform_storage before the actual locations
are set.

The 'remap_location' variable in gl_uniform_storage is changed to be
signed so that we can use 0 as a valid explicit location and -1 as the
marker that no explicit location has been defined.

When locations are set, UniformRemapTable is first populated with the
uniforms that have an explicit location set (both inactive and active ones);
the rest are placed after the explicit location slots.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/glsl/ir_uniform.h  |  5 +++--
 src/glsl/link_uniforms.cpp | 56 +-
 2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
index 3508509..9dc4a8e 100644
--- a/src/glsl/ir_uniform.h
+++ b/src/glsl/ir_uniform.h
@@ -181,9 +181,10 @@ struct gl_uniform_storage {
 
    /**
     * The 'base location' for this uniform in the uniform remap table. For
-    * arrays this is the first element in the array.
+    * arrays this is the first element in the array. It needs to be signed
+    * so that we can use 0 as valid location and -1 as initial value
     */
-   unsigned remap_location;
+   int remap_location;
 };
 
 #ifdef __cplusplus
 #ifdef __cplusplus
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 29dc0b1..0f99082 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -387,6 +387,9 @@ public:
    void set_and_process(struct gl_shader_program *prog,
                         ir_variable *var)
    {
+      current_var = var;
+      field_counter = 0;
+
       ubo_block_index = -1;
       if (var->is_in_uniform_block()) {
          if (var->is_interface_instance() && var->type->is_array()) {
@@ -543,6 +546,22 @@ private:
          return;
       }
 
+      /* Assign explicit locations. */
+      if (current_var->data.explicit_location) {
+         /* Set sequential locations for struct fields. */
+         if (current_var->type->is_record()) {
+            const unsigned entries = MAX2(1, this->uniforms[id].array_elements);
+            this->uniforms[id].remap_location =
+               current_var->data.location + field_counter;
+               field_counter += entries;
+         } else {
+            this->uniforms[id].remap_location = current_var->data.location;
+         }
+      } else {
+         /* Initialize to -1 to indicate that no explicit location is set */
+         this->uniforms[id].remap_location = -1;
+      }
+
       this->uniforms[id].name = ralloc_strdup(this->uniforms, name);
       this->uniforms[id].type = base_type;
       this->uniforms[id].initialized = 0;
@@ -598,6 +617,17 @@ public:
    gl_texture_index targets[MAX_SAMPLERS];
 
    /**
+    * Current variable being processed.
+    */
+   ir_variable *current_var;
+
+   /**
+    * Field counter is used to take care that uniform structures
+    * with explicit locations get sequential locations.
+    */
+   unsigned field_counter;
+
+   /**
     * Mask of samplers used by the current shader stage.
     */
    unsigned shader_samplers_used;
@@ -799,10 +829,6 @@ link_assign_uniform_locations(struct gl_shader_program *prog)
    prog->UniformStorage = NULL;
    prog->NumUserUniformStorage = 0;
 
-   ralloc_free(prog->UniformRemapTable);
-   prog->UniformRemapTable = NULL;
-   prog->NumUniformRemapTable = 0;
-
    if (prog->UniformHash != NULL) {
       prog->UniformHash->clear();
    } else {
@@ -915,9 +941,29 @@ link_assign_uniform_locations(struct gl_shader_program *prog)
              sizeof(prog->_LinkedShaders[i]->SamplerTargets));
    }
 
-   /* Build the uniform remap table that is used to set/get uniform locations */
+   /* Reserve all the explicit locations of the active uniforms. */
+   for (unsigned i = 0; i < num_user_uniforms; i++) {
+      if (uniforms[i].remap_location != -1) {
+         /* How many new entries for this uniform? */
+         const unsigned entries = MAX2(1, uniforms[i].array_elements);
+
+         /* Set remap table entries point to correct gl_uniform_storage. */
+         for (unsigned j = 0; j < entries; j++) {
+            unsigned element_loc = uniforms[i].remap_location + j;
+            assert(prog->UniformRemapTable[element_loc] ==
+                   (gl_uniform_storage *) -1);
+            prog->UniformRemapTable[element_loc] = &uniforms[i];
+         }
+      }
+   }
+
+   /* Reserve locations for rest of the uniforms. */
    for (unsigned i = 0; i < num_user_uniforms; i++) {
 
+      /* Explicit ones have been set already. */
+      if (uniforms[i].remap_location != -1)
+         continue;
+
       /* how many new entries for this uniform? */
       const unsigned entries = MAX2(1, uniforms[i].array_elements);
 
-- 
1.9.0


[Mesa-dev] [PATCH 10/10] docs: update ARB_explicit_uniform_location status

2014-04-09 Thread Tapani Pälli
Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index bf51e3a..245a045 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -148,7 +148,7 @@ GL 4.3:
   GL_ARB_compute_shader                                started (Paul Berry)
   GL_ARB_copy_image                                    not started
   GL_KHR_debug                                         DONE (all drivers)
-  GL_ARB_explicit_uniform_location                     not started
+  GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                       not started
   GL_ARB_framebuffer_no_attachments                    not started
   GL_ARB_internalformat_query2                         not started
-- 
1.9.0



Re: [Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS

2014-04-09 Thread Petri Latvala

On 04/09/2014 12:56 PM, Tapani Pälli wrote:

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 06d0bba..5709d42 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -474,6 +474,7 @@ descriptor=[
   [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING)", "NO_EXTRA" ],
   [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH)", "NO_EXTRA" ],
   [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE)", "NO_EXTRA" ],
+  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations)", "NO_EXTRA" ],
   [ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth)", "NO_EXTRA" ],
   [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst)", "NO_EXTRA" ],
   [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes)", "NO_EXTRA" ],



Should that NO_EXTRA be extra_ARB_explicit_uniform_location?


--
Petri Latvala



Re: [Mesa-dev] [PATCH] r600g: Don't leak bytecode on shader compile failure

2014-04-09 Thread Marek Olšák
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
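
For readers less familiar with the idiom, the patch above routes every failure through a single cleanup label so that nothing allocated earlier is leaked. The following is only a minimal sketch of that shape in plain C. The names toy_shader, toy_shader_create and toy_shader_destroy are hypothetical stand-ins and not the r600g functions, and setting r before each goto is a choice made for the sketch so that its return value is always well defined.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shader object that owns a bytecode buffer. */
struct toy_shader {
    unsigned char *bytecode;
    int type;                  /* 0 and 1 are the "known" types here */
};

static int toy_live_buffers;   /* number of buffers not yet freed */

static void toy_shader_destroy(struct toy_shader *s)
{
    if (s->bytecode) {
        free(s->bytecode);
        s->bytecode = NULL;
        toy_live_buffers--;
    }
}

/* fail_step: 0 = succeed, 1 = fail while "translating", 2 = fail while
 * "building".  Every failure after the allocation jumps to one label. */
static int toy_shader_create(struct toy_shader *s, int type, int fail_step)
{
    int r = 0;

    s->type = type;
    s->bytecode = malloc(64);
    if (!s->bytecode)
        return -ENOMEM;        /* nothing to clean up yet */
    toy_live_buffers++;

    if (fail_step == 1) {
        r = -EIO;
        goto error;
    }
    if (fail_step == 2) {
        r = -EIO;
        goto error;
    }

    switch (type) {
    case 0:
    case 1:
        break;
    default:
        r = -EINVAL;
        goto error;
    }
    return 0;

error:
    toy_shader_destroy(s);     /* single cleanup point for all failures */
    return r;
}
```

Calling toy_shader_create(&s, 0, 1) in this sketch returns -EIO and leaves toy_live_buffers at zero, which is the property the patch is after.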


[Mesa-dev] [Bug 77240] New: khrplatform.h not installed if EGL is disabled

2014-04-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77240

          Priority: medium
            Bug ID: 77240
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: khrplatform.h not installed if EGL is disabled
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: eric.le.bihan@free.fr
          Hardware: All
            Status: NEW
           Version: unspecified
         Component: Other
           Product: Mesa

Created attachment 97136
  --> https://bugs.freedesktop.org/attachment.cgi?id=97136&action=edit
Patch to fix missing khrplatform.h

KHR/khrplatform.h is required by the EGL, GLES and VG headers, but is
only installed if Mesa3d is compiled with EGL support. Configuring with

  $ ./configure --disable-egl --enable-gles1 --enable-gles2 ...

will result in an incomplete header set. When compiling Cairo with 
OpenGLESv2 support, the build will fail because of the missing header:

  /usr/include/GLES2/gl2platform.h:20:29: fatal error: KHR/khrplatform.h: No
such file or directory

The attached patch fixes the issue for me.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 77208] VdpPresentationQueueGetTime does not return a monotonic time

2014-04-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77208

--- Comment #3 from Andy Furniss adf.li...@gmail.com ---
(In reply to comment #1)
 Oh, it seems the pausing issue could be caused by interaction with power
 management. This is what a user posted:
 
 When polling '/sys/kernel/debug/dri/0/radeon_pm_info' you can see that this 
 only happens when the power level switches from an UVD power level to a 
 non-UVD power level. Pause -> 1-2 seconds -> non-UVD power level -> Play -> 
 Stutter

Not for me though, my HD4890 doesn't have uvd, and I just tried forcing dpm to
low and high rather than auto and the pausing issue is still there with both.



[Mesa-dev] [Bug 76856] -Wl, --no-undefined gives undefined references to libc symbols on OpenBSD

2014-04-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=76856

Emil Velikov emil.l.veli...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Emil Velikov emil.l.veli...@gmail.com ---
Pushed to master

commit 11623be934f8573910484de2a5fb50c95f0a1d44
Author: Jonathan Gray j...@jsg.id.au
Date:   Thu Apr 3 15:46:01 2014 +1100

automake: don't enable -Wl,--no-undefined on OpenBSD

OpenBSD does not have DT_NEEDED entries for libc by design,
over concerns how the symbols would be referenced after
changing the major version of the library.

So avoid -no-undefined checks on OpenBSD as they will fail.

v2: don't include the -no-undefined libtool option in the variable
and change -Wl,--no-undefined references in Automake.inc as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76856
Signed-off-by: Jonathan Gray j...@jsg.id.au
Reviewed-by: Emil Velikov emil.l.veli...@gmail.com
Reviewed-by: Matt Turner matts...@gmail.com



[Mesa-dev] [Bug 76377] DRI3 should only be enabled on Linux due to a udev dependency

2014-04-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=76377

Emil Velikov emil.l.veli...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Emil Velikov emil.l.veli...@gmail.com ---
Both patches are in master now, and are tagged for the 10.1 stable branch.



Re: [Mesa-dev] [PATCH 1/2] glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path

2014-04-09 Thread Emil Velikov
On 16/03/14 14:10, Emil Velikov wrote:
 With commit 1f1928db001(glx: Drop _Xglobal_lock while we create and
 initialize glx display) we've split the big _Xglobal_lock handling in
 a more fine grained manner.
 
 Unfortunately we forgot to drop the unlock_mutex on the error paths,
 leading to undefined behaviour as the mutex is already unlocked.
 
Gents, Kristian,

Can someone spare a few minutes to review this patch? It addresses a bug that is more than three years old.

Cheers,
-Emil

 Cc: Kristian Høgsberg k...@bitplanet.net
 Cc: 9.2 10.0 10.1  mesa-sta...@lists.freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/glx/glxext.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)
 
 diff --git a/src/glx/glxext.c b/src/glx/glxext.c
 index 4a195bd..de73036 100644
 --- a/src/glx/glxext.c
 +++ b/src/glx/glxext.c
 @@ -826,7 +826,6 @@ __glXInitialize(Display * dpy)
     dpyPriv->codes = XInitExtension(dpy, __glXExtensionName);
     if (!dpyPriv->codes) {
        free(dpyPriv);
 -      _XUnlockMutex(_Xglobal_lock);
        return NULL;
     }
  
 @@ -842,7 +841,6 @@ __glXInitialize(Display * dpy)
                       &dpyPriv->majorVersion, &dpyPriv->minorVersion)
         || (dpyPriv->majorVersion == 1 && dpyPriv->minorVersion < 1)) {
        free(dpyPriv);
 -      _XUnlockMutex(_Xglobal_lock);
        return NULL;
     }
  
 @@ -907,7 +905,7 @@ __glXInitialize(Display * dpy)
     dpyPriv->next = glx_displays;
     glx_displays = dpyPriv;
  
 -_XUnlockMutex(_Xglobal_lock);
 +   _XUnlockMutex(_Xglobal_lock);
  
 return dpyPriv;
  }
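
To make the failure mode concrete, here is a small sketch of the lock/unlock balance the patch restores. It is an illustration under stated assumptions, not the GLX code: toy_lock is a hypothetical lock that counts stale unlocks instead of invoking undefined behaviour, and init_buggy / init_fixed only mimic the control flow described in the commit message (lock dropped during initialisation, re-taken to publish the result).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock that records misuse rather than misbehaving. */
struct toy_lock {
    int held;
    int bad_unlocks;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    if (!l->held)
        l->bad_unlocks++;      /* with a real mutex this is the UB case */
    l->held = 0;
}

/* Shape of the code before the fix: the error path still unlocks. */
static int init_buggy(struct toy_lock *l, int ext_ok)
{
    toy_lock_acquire(l);
    /* ... look for an already-initialised display ... */
    toy_lock_release(l);       /* lock dropped for the slow part */

    if (!ext_ok) {
        toy_lock_release(l);   /* stale unlock: lock is not held here */
        return -1;
    }

    toy_lock_acquire(l);       /* re-taken to publish the result */
    toy_lock_release(l);
    return 0;
}

/* Shape of the code after the fix: error paths just return. */
static int init_fixed(struct toy_lock *l, int ext_ok)
{
    toy_lock_acquire(l);
    toy_lock_release(l);

    if (!ext_ok)
        return -1;

    toy_lock_acquire(l);
    toy_lock_release(l);
    return 0;
}
```

In the sketch, the failing path of init_buggy leaves bad_unlocks at 1, while init_fixed leaves it at 0 on both the failing and the successful path.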
 



Re: [Mesa-dev] [PATCH 3/7] linker: Fold set_uniform_binding into call site

2014-04-09 Thread Kenneth Graunke
On 04/04/2014 02:01 PM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 In the next patch, we'll see that using
 gl_shader_program::UniformStorage is not correct for uniform blocks.
 That means we can't use ::UniformStorage to select between the sampler
 path and the block path.  Instead we want to just use the type of the
 variable.  That's never passed to set_uniform_binding, and it's easier

Ehhhmm... then

 to just remove the function (especially for later patches in the series)
 than to add another parameter.
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
 Cc: 10.1 mesa-sta...@lists.freedesktop.org
 Cc: git...@socker.lepus.uberspace.de
 ---
  src/glsl/link_uniform_initializers.cpp | 33 -
  1 file changed, 12 insertions(+), 21 deletions(-)
 
 diff --git a/src/glsl/link_uniform_initializers.cpp 
 b/src/glsl/link_uniform_initializers.cpp
 index 6f15e69..bbdeec9 100644
 --- a/src/glsl/link_uniform_initializers.cpp
 +++ b/src/glsl/link_uniform_initializers.cpp
 @@ -151,25 +151,6 @@ set_block_binding(void *mem_ctx, gl_shader_program *prog,
  }
  
  void
 -set_uniform_binding(void *mem_ctx, gl_shader_program *prog,
 -const char *name, const glsl_type *type, int binding)

...what exactly is this? ^

 -{
 -   struct gl_uniform_storage *const storage =
 -  get_storage(prog->UniformStorage, prog->NumUserUniformStorage, name);
 -
 -   if (storage == NULL) {
 -  assert(storage != NULL);
 -  return;
 -   }
 -
 -   if (storage->type->is_sampler()) {
 -  set_sampler_binding(mem_ctx, prog, name, type, binding);
 -   } else if (storage->block_index != -1) {
 -  set_block_binding(mem_ctx, prog, name, type, binding);
 -   }
 -}
 -
 -void
  set_uniform_initializer(void *mem_ctx, gl_shader_program *prog,
   const char *name, const glsl_type *type,
   ir_constant *val)
 @@ -268,8 +249,18 @@ link_set_uniform_initializers(struct gl_shader_program 
 *prog)
   mem_ctx = ralloc_context(NULL);
  
   if (var->data.explicit_binding) {
 -linker::set_uniform_binding(mem_ctx, prog, var->name,
 -var->type, var->data.binding);
 +const glsl_type *const type = var->type;

Here you're using type, which is var->type, which is exactly what we
were already passing.

AFAICT all you needed to do was change:

   if (storage->type->is_sampler())

to

   if (type->is_sampler() || (type->is_array() &&
       type->fields.array->is_sampler()))

in set_uniform_binding.
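
As a rough illustration of the condition suggested here, the check "is a sampler, or is an array whose element type is a sampler" can be sketched in plain C. The toy_type structure and its helpers are hypothetical and only model the two properties the condition looks at; they are not the glsl_type API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_base { TOY_FLOAT, TOY_SAMPLER, TOY_ARRAY };

/* Hypothetical, much simplified type descriptor. */
struct toy_type {
    enum toy_base base;
    const struct toy_type *element;   /* element type, only for TOY_ARRAY */
};

static bool toy_is_sampler(const struct toy_type *t)
{
    return t->base == TOY_SAMPLER;
}

static bool toy_is_array(const struct toy_type *t)
{
    return t->base == TOY_ARRAY;
}

/* Mirrors the suggested test: a sampler, or a one-level array of samplers. */
static bool toy_takes_sampler_binding(const struct toy_type *t)
{
    return toy_is_sampler(t) ||
           (toy_is_array(t) && toy_is_sampler(t->element));
}
```

Like the quoted condition, the sketch only looks one array level deep.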

 +
 +if (type->is_sampler()
 +|| (type->is_array() && type->fields.array->is_sampler())) {
 +   linker::set_sampler_binding(mem_ctx, prog, var->name,
 +   type, var->data.binding);
 +} else if (var->is_in_uniform_block()) {
 +   linker::set_block_binding(mem_ctx, prog, var->name,
 + type, var->data.binding);
 +} else {
 +   assert(!"Explicit binding not on a sampler or UBO.");
 +}
   } else if (var->constant_value) {
  linker::set_uniform_initializer(mem_ctx, prog, var->name,
  var->type, var->constant_value);
 






Re: [Mesa-dev] [PATCH 5/7] linker: Set block bindings based on UniformBlocks rather than UniformStorage

2014-04-09 Thread Kenneth Graunke
On 04/04/2014 02:01 PM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 For blocks, gl_shader_program::UniformStorage isn't very useful.  The
 names stored there are the names of the elements of the block, so
 finding blocks with an instance name is hard.  There is also only one
 entry in ::UniformStorage for each element of a block array, and that is
 a deal breaker.
 
 Using ::UniformBlocks is what _mesa_GetUniformBlockIndex does.  I
 contemplated sharing code between set_block_binding and
 _mesa_GetUniformBlockIndex, but building the stand-alone compiler and
 the unit tests make this hard.  I plan to return to this effort shortly.
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
 Cc: 10.1 mesa-sta...@lists.freedesktop.org
 Cc: git...@socker.lepus.uberspace.de
 ---
  src/glsl/link_uniform_initializers.cpp | 32 +---
  1 file changed, 21 insertions(+), 11 deletions(-)
 
 diff --git a/src/glsl/link_uniform_initializers.cpp 
 b/src/glsl/link_uniform_initializers.cpp
 index c633850..491eb69 100644
 --- a/src/glsl/link_uniform_initializers.cpp
 +++ b/src/glsl/link_uniform_initializers.cpp
 @@ -46,6 +46,18 @@ get_storage(gl_uniform_storage *storage, unsigned 
 num_storage,
 return NULL;
  }
  
 +static unsigned
 +get_uniform_block_index(const gl_shader_program *shProg,
 +const char *uniformBlockName)
 +{
 +   for (unsigned i = 0; i < shProg->NumUniformBlocks; i++) {
 +  if (!strcmp(shProg->UniformBlocks[i].Name, uniformBlockName))
 +  return i;
 +   }
 +
 +   return GL_INVALID_INDEX;
 +}
 +
  void
  copy_constant_to_storage(union gl_constant_value *storage,
const ir_constant *val,
 @@ -123,29 +135,24 @@ set_sampler_binding(gl_shader_program *prog, const char 
 *name, int binding)
  }
  
  void
 -set_block_binding(gl_shader_program *prog, const char *name, int binding)
 +set_block_binding(gl_shader_program *prog, const char *block_name, int 
 binding)
  {
 -   struct gl_uniform_storage *const storage =
 -  get_storage(prog->UniformStorage, prog->NumUserUniformStorage, name);
 +   const unsigned block_index = get_uniform_block_index(prog, block_name);
  
 -   if (storage == NULL) {
 -  assert(storage != NULL);
 +   if (block_index == GL_INVALID_INDEX) {
 +  assert(block_index != GL_INVALID_INDEX);
return;
 }
  
 -   if (storage->block_index != -1) {
/* This is a field of a UBO.  val is the binding index. */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
 - int stage_index = 
 prog->UniformBlockStageIndex[i][storage->block_index];
 + int stage_index = prog->UniformBlockStageIndex[i][block_index];
  
   if (stage_index != -1) {
  struct gl_shader *sh = prog->_LinkedShaders[i];
  sh->UniformBlocks[stage_index].Binding = binding;
   }
}
 -   }
 -
 -   storage->initialized = true;

Why is it not necessary to set storage->initialized = true?  It goes
away here and never seems to come back.

  }
  
  void
 @@ -253,7 +260,10 @@ link_set_uniform_initializers(struct gl_shader_program 
 *prog)
  || (type->is_array() && type->fields.array->is_sampler())) {
 linker::set_sampler_binding(prog, var->name, 
 var->data.binding);
  } else if (var->is_in_uniform_block()) {
 -   linker::set_block_binding(prog, var->name, var->data.binding);
 +   const glsl_type *const iface_type = var->get_interface_type();
 +
 +   linker::set_block_binding(prog, iface_type->name,
 + var->data.binding);
  } else {
 assert(!"Explicit binding not on a sampler or UBO.");
  }
 





Re: [Mesa-dev] [PATCH v2 6/6] glsl: Ignore loop-too-large heuristic if there's bad variable indexing.

2014-04-09 Thread Kenneth Graunke
On 04/08/2014 09:20 PM, Kenneth Graunke wrote:
 Many shaders use a pattern such as:
 
 for (int i = 0; i < NUM_LIGHTS; i++) {
...access a uniform array, or shader input/output array...
 }
 
 where NUM_LIGHTS is a small constant (such as 2, 4, or 8).
 
 The expectation is that the compiler will unroll those loops, turning
 the array access into constant indexing, which is more efficient, and
 which may enable array splitting and other optimizations.
 
 In many cases, our heuristic fails - either there's another tiny nested
 loop inside, or the estimated number of instructions is just barely
 beyond the threshold.  So, we fail to unroll the loop, leaving the
 variable indexing in place.
 
 Drivers which don't support the particular flavor of variable indexing
 will call lower_variable_index_to_cond_assign(), which generates piles
 and piles of immensely inefficient code.  We'd like to avoid generating
 that.
 
 This patch detects unsupported forms of variable-indexing in loops, where
 the array index is a loop induction variable.  In that case, it bypasses
 the loop-too-large heuristic and forces unrolling.
 
 Improves performance in a PCF soft-shadow microbenchmark by 2x.

Sorry...this number is incorrect.  It improves performance by 21%.

The 2x figure was due to a bug in the older version where I literally
unrolled everything, which also got rid of variable-indexing of uniform
arrays.  (The program still worked even with the buggy patch...we just
made different unrolling decisions.  So, we should still be able to
attain 2x...just not with this patch alone.)
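
For anyone following along, the effect under discussion can be pictured in plain C rather than GLSL. This is a hedged sketch: sum_loop, sum_unrolled and read_lowered are hypothetical names, and read_lowered is only a rough picture of the kind of per-element conditional code that lowering a variable index produces, not the actual output of lower_variable_index_to_cond_assign().

```c
#include <assert.h>

#define NUM_LIGHTS 4

/* Loop form: lights[i] is indexed by the loop induction variable. */
static float sum_loop(const float lights[NUM_LIGHTS])
{
    float total = 0.0f;
    for (int i = 0; i < NUM_LIGHTS; i++)
        total += lights[i];
    return total;
}

/* Unrolled form: every array index is now a constant. */
static float sum_unrolled(const float lights[NUM_LIGHTS])
{
    float total = 0.0f;
    total += lights[0];
    total += lights[1];
    total += lights[2];
    total += lights[3];
    return total;
}

/* Rough picture of a lowered variable index: one compare-and-assign per
 * possible element, executed for every single access. */
static float read_lowered(const float lights[NUM_LIGHTS], int i)
{
    float v = 0.0f;
    if (i == 0) v = lights[0];
    if (i == 1) v = lights[1];
    if (i == 2) v = lights[2];
    if (i == 3) v = lights[3];
    return v;
}
```

If the loop is left rolled on hardware without indirect addressing, each lights[i] turns into something like read_lowered(), which is why forcing the unroll is attractive even when the size heuristic says no.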

 No changes in shader-db.
 
 v2: Check ir->array for being an array or matrix, rather than the
 ir_dereference_array itself.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/glsl/loop_unroll.cpp | 61 
 +---
  1 file changed, 58 insertions(+), 3 deletions(-)
 
 v1 of 6/6 had several bugs, which ended up cancelling out in many cases
 and making things look like they were working.  I think this one is
 actually good.  Sorry for the noise...
 
 diff --git a/src/glsl/loop_unroll.cpp b/src/glsl/loop_unroll.cpp
 index 1ce4d58..da53280 100644
 --- a/src/glsl/loop_unroll.cpp
 +++ b/src/glsl/loop_unroll.cpp
 @@ -63,13 +63,17 @@ is_break(ir_instruction *ir)
  class loop_unroll_count : public ir_hierarchical_visitor {
  public:
 int nodes;
 +   bool unsupported_variable_indexing;
 /* If there are nested loops, the node count will be inaccurate. */
 bool nested_loop;
  
 -   loop_unroll_count(exec_list *list)
 +   loop_unroll_count(exec_list *list, loop_variable_state *ls,
 + const struct gl_shader_compiler_options *options)
 +  : ls(ls), options(options)
 {
nodes = 0;
nested_loop = false;
 +  unsupported_variable_indexing = false;
  
run(list);
 }
 @@ -91,6 +95,54 @@ public:
nested_loop = true;
return visit_continue;
 }
 +
 +   virtual ir_visitor_status visit_enter(ir_dereference_array *ir)
 +   {
 +  /* Check for arrays variably-indexed by a loop induction variable.
 +   * Unrolling the loop may convert that access into constant-indexing.
 +   *
 +   * Many drivers don't support particular kinds of variable indexing,
 +   * and have to resort to using lower_variable_index_to_cond_assign to
 +   * handle it.  This results in huge amounts of horrible code, so we'd
 +   * like to avoid that if possible.  Here, we just note that it will
 +   * happen.
 +   */
 +  if ((ir->array->type->is_array() || ir->array->type->is_matrix()) &&
 +  !ir->array_index->as_constant()) {
 + ir_variable *array = ir->array->variable_referenced();
 + loop_variable *lv = ls->get(ir->array_index->variable_referenced());
 + if (array && lv && lv->is_induction_var()) {
 +switch (array->data.mode) {
 +case ir_var_auto:
 +case ir_var_temporary:
 +case ir_var_const_in:
 +case ir_var_function_in:
 +case ir_var_function_out:
 +case ir_var_function_inout:
 +   if (options->EmitNoIndirectTemp)
 +  unsupported_variable_indexing = true;
 +   break;
 +case ir_var_uniform:
 +   if (options->EmitNoIndirectUniform)
 +  unsupported_variable_indexing = true;
 +   break;
 +case ir_var_shader_in:
 +   if (options->EmitNoIndirectInput)
 +  unsupported_variable_indexing = true;
 +   break;
 +case ir_var_shader_out:
 +   if (options->EmitNoIndirectOutput)
 +  unsupported_variable_indexing = true;
 +   break;
 +}
 + }
 +  }
 +  return visit_continue;
 +   }
 +
 +private:
 +   loop_variable_state *ls;
 +   const struct gl_shader_compiler_options *options;
  };
  
  
 @@ -257,9 +309,12 @@ 

Re: [Mesa-dev] [PATCH 1/7] linker: Split set_uniform_binding into separate functions for blocks and samplers

2014-04-09 Thread Kenneth Graunke
On 04/04/2014 02:01 PM, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 The two code paths are quite different, and there are some problems in
 the handling of uniform blocks.  Future changes will cause these paths
 to diverge further.  Ultimately, selecting between the two functions
 will happen at the set_uniform_binding call site, and
 set_uniform_binding will be deleted.
 
 NOTE: This patch just moves code around.
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
 Cc: 10.1 mesa-sta...@lists.freedesktop.org
 Cc: git...@socker.lepus.uberspace.de
 ---
  src/glsl/link_uniform_initializers.cpp | 42 
 +++---
  1 file changed, 39 insertions(+), 3 deletions(-)

Assuming you have a reasonable response to my comment on patch 5, this
series is:

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

though, I'm not sure how much that's worth - I had to re-read the GLSL
rules and re-discover how our compiler IR for this stuff works.  The
code seems right, but I could be totally missing something obvious.

On that note...is it just me, or is the compiler IR for uniform blocks
rather ugly and messy?

Anyway, thanks a ton for doing this, Ian.  Sorry for dropping the ball
when we first implemented 420pack.

 diff --git a/src/glsl/link_uniform_initializers.cpp 
 b/src/glsl/link_uniform_initializers.cpp
 index 9d6977d..9a10350 100644
 --- a/src/glsl/link_uniform_initializers.cpp
 +++ b/src/glsl/link_uniform_initializers.cpp
 @@ -84,7 +84,7 @@ copy_constant_to_storage(union gl_constant_value *storage,
  }
  
  void
 -set_uniform_binding(void *mem_ctx, gl_shader_program *prog,
 +set_sampler_binding(void *mem_ctx, gl_shader_program *prog,
  const char *name, const glsl_type *type, int binding)
  {
 struct gl_uniform_storage *const storage =
 @@ -95,7 +95,7 @@ set_uniform_binding(void *mem_ctx, gl_shader_program *prog,
return;
 }
  
 -   if (storage->type->is_sampler()) {
 +   {
unsigned elements = MAX2(storage->array_elements, 1);
  
/* From section 4.4.4 of the GLSL 4.20 specification:
 @@ -118,7 +118,24 @@ set_uniform_binding(void *mem_ctx, gl_shader_program 
 *prog,
  }
   }
}
 -   } else if (storage->block_index != -1) {
 +   }
 +
 +   storage->initialized = true;
 +}
 +
 +void
 +set_block_binding(void *mem_ctx, gl_shader_program *prog,
 +  const char *name, const glsl_type *type, int binding)
 +{
 +   struct gl_uniform_storage *const storage =
 +  get_storage(prog->UniformStorage, prog->NumUserUniformStorage, name);
 +
 +   if (storage == NULL) {
 +  assert(storage != NULL);
 +  return;
 +   }
 +
 +   if (storage->block_index != -1) {
/* This is a field of a UBO.  val is the binding index. */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
   int stage_index = 
 prog->UniformBlockStageIndex[i][storage->block_index];
 @@ -134,6 +151,25 @@ set_uniform_binding(void *mem_ctx, gl_shader_program 
 *prog,
  }
  
  void
 +set_uniform_binding(void *mem_ctx, gl_shader_program *prog,
 +const char *name, const glsl_type *type, int binding)
 +{
 +   struct gl_uniform_storage *const storage =
 +  get_storage(prog->UniformStorage, prog->NumUserUniformStorage, name);
 +
 +   if (storage == NULL) {
 +  assert(storage != NULL);
 +  return;
 +   }
 +
 +   if (storage->type->is_sampler()) {
 +  set_sampler_binding(mem_ctx, prog, name, type, binding);
 +   } else if (storage->block_index != -1) {
 +  set_block_binding(mem_ctx, prog, name, type, binding);
 +   }
 +}
 +
 +void
  set_uniform_initializer(void *mem_ctx, gl_shader_program *prog,
   const char *name, const glsl_type *type,
   ir_constant *val)
 






[Mesa-dev] [PATCH] svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()

2014-04-09 Thread Brian Paul
Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
for AA lines (when the device doesn't support that feature).  We need to
initialize this list before we setup the swtnl pieces.

Found/fixed by Charmaine Lee.

Cc: 10.0 mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/svga/svga_context.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_context.c 
b/src/gallium/drivers/svga/svga_context.c
index 0ba09ce..8389384 100644
--- a/src/gallium/drivers/svga/svga_context.c
+++ b/src/gallium/drivers/svga/svga_context.c
@@ -123,6 +123,8 @@ struct pipe_context *svga_context_create( struct 
pipe_screen *screen,
if (svga == NULL)
   goto no_svga;
 
+   LIST_INITHEAD(&svga->dirty_buffers);
+
    svga->pipe.screen = screen;
    svga->pipe.priv = priv;
    svga->pipe.destroy = svga_destroy;
@@ -185,8 +187,6 @@ struct pipe_context *svga_context_create( struct 
pipe_screen *screen,
 
    svga->dirty = ~0;
 
-   LIST_INITHEAD(&svga->dirty_buffers);
-
    check_for_workarounds(svga);
 
    return &svga->pipe;
-- 
1.7.10.4
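
The ordering problem fixed above can be shown with a toy circular list. This is a sketch under assumptions: toy_list, toy_context and toy_setup_swtnl are hypothetical names, and the setup step merely stands in for any code that may walk dirty_buffers before context creation has finished. Instead of crashing, the sketch reports the uninitialised head as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list head. */
struct toy_list {
    struct toy_list *prev, *next;
};

static void toy_list_init(struct toy_list *h)
{
    h->prev = h;
    h->next = h;
}

static int toy_list_is_initialised(const struct toy_list *h)
{
    return h->next != NULL && h->prev != NULL;
}

/* Walks the list.  Fine on an empty initialised head; on a zeroed head the
 * first step would dereference NULL. */
static int toy_list_length(const struct toy_list *h)
{
    int n = 0;
    for (const struct toy_list *it = h->next; it != h; it = it->next)
        n++;
    return n;
}

struct toy_context {
    struct toy_list dirty_buffers;
    int swtnl_ready;
};

/* Stand-in for a setup step that may flush and therefore walk the list. */
static int toy_setup_swtnl(struct toy_context *c)
{
    if (!toy_list_is_initialised(&c->dirty_buffers))
        return -1;             /* the real code would crash here */
    (void)toy_list_length(&c->dirty_buffers);
    c->swtnl_ready = 1;
    return 0;
}

/* init_first selects the fixed ordering (1) or the old ordering (0). */
static int toy_context_create(struct toy_context *c, int init_first)
{
    int r;

    /* Mimic freshly zeroed memory. */
    c->dirty_buffers.prev = NULL;
    c->dirty_buffers.next = NULL;
    c->swtnl_ready = 0;

    if (init_first)
        toy_list_init(&c->dirty_buffers);

    r = toy_setup_swtnl(c);

    if (!init_first)
        toy_list_init(&c->dirty_buffers);
    return r;
}
```

With init_first set, setup succeeds and the list is empty; with the old ordering the setup step sees a head whose pointers are still NULL.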



Re: [Mesa-dev] [PATCH 1/4] xa: handle solid-fill src/mask

2014-04-09 Thread Thomas Hellstrom
On 04/03/2014 04:07 PM, Brian Paul wrote:
 On 04/02/2014 11:17 PM, Thomas Hellstrom wrote:
 On 04/01/2014 05:04 PM, Rob Clark wrote:
 From: Rob Clark robcl...@freedesktop.org

 Add support to properly handle solid-fill src and/or mask.  Without
 this we fall back to sw a lot for common things like text rendering.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
   src/gallium/state_trackers/xa/xa_composite.c |  88 
   src/gallium/state_trackers/xa/xa_priv.h  |   7 +-
   src/gallium/state_trackers/xa/xa_renderer.c  | 289
 ---
   src/gallium/state_trackers/xa/xa_tgsi.c  |  31 ++-
   4 files changed, 242 insertions(+), 173 deletions(-)

 Rob,
 While testing this patch it looks like we sometimes set two samplers,
 and the first one is NULL.
 The SVGA driver asserts on that condition.
 We might need to move the active sampler to the first entry in that
 case, and adjust tex coords and shader accordingly.

 I'll discuss with BrianP.

 I think the root problem is a disagreement between texture samplers
 and sampler views.  If a texture sampler is non-null, the
 corresponding sampler view should be non-null too, and vice versa.

 We're tripping over an assertion when a sampler view is non-null but
 the corresponding sampler is NULL.

 I'm going to write a patch for the driver to be more resilient in that
 situation.

 -Brian

Brian, This is a different problem.
Here, the state tracker sets up sampler[0] and sampler_view[0] to NULL,
but sampler[1] and sampler_view[1] to NON-NULL, but
samplers and sampler views are consistent.

The question is whether that's OK, or whether that's not allowed.

/Thomas
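
If it turns out that a NULL slot 0 is not allowed, the idea mentioned earlier in the thread (moving the active sampler to the first entry) could look roughly like the sketch below. Everything here is hypothetical: toy_compact is not an existing helper, the slots are plain pointers, and the remap table is only meant to show that texture coordinates and the shader would need to follow the move.

```c
#include <assert.h>
#include <stddef.h>

/* Moves bound (non-NULL) sampler/view pairs to the front, keeping their
 * order.  remap[i] receives the new slot of old slot i, or -1 if slot i
 * was unbound.  Returns the number of bound slots. */
static unsigned toy_compact(const void *samplers[], const void *views[],
                            int remap[], unsigned count)
{
    unsigned out = 0;

    for (unsigned i = 0; i < count; i++) {
        if (samplers[i] != NULL) {
            samplers[out] = samplers[i];
            views[out] = views[i];
            remap[i] = (int)out;
            out++;
        } else {
            remap[i] = -1;
        }
    }

    /* Clear the slots that are no longer in use. */
    for (unsigned i = out; i < count; i++) {
        samplers[i] = NULL;
        views[i] = NULL;
    }
    return out;
}
```

For the case described above (slot 0 unbound, slot 1 bound) the sketch yields one bound pair in slot 0 and a remap of {-1, 0}.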











Re: [Mesa-dev] [PATCH] svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()

2014-04-09 Thread Thomas Hellstrom
Reviewed-by: Thomas Hellstrom thellst...@vmware.com

On 04/09/2014 07:40 PM, Brian Paul wrote:
 Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
 for AA lines (when the device doesn't support that feature).  We need to
 initialize this list before we setup the swtnl pieces.

 Found/fixed by Charmaine Lee.

 Cc: 10.0 mesa-sta...@lists.freedesktop.org
 ---
  src/gallium/drivers/svga/svga_context.c |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/gallium/drivers/svga/svga_context.c 
 b/src/gallium/drivers/svga/svga_context.c
 index 0ba09ce..8389384 100644
 --- a/src/gallium/drivers/svga/svga_context.c
 +++ b/src/gallium/drivers/svga/svga_context.c
 @@ -123,6 +123,8 @@ struct pipe_context *svga_context_create( struct 
 pipe_screen *screen,
 if (svga == NULL)
goto no_svga;
  
 +   LIST_INITHEAD(&svga->dirty_buffers);
 +
     svga->pipe.screen = screen;
     svga->pipe.priv = priv;
     svga->pipe.destroy = svga_destroy;
 @@ -185,8 +187,6 @@ struct pipe_context *svga_context_create( struct 
 pipe_screen *screen,
  
     svga->dirty = ~0;
  
 -   LIST_INITHEAD(&svga->dirty_buffers);
 -
     check_for_workarounds(svga);
  
     return &svga->pipe;


Re: [Mesa-dev] [PATCH] svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()

2014-04-09 Thread Jakob Bornecrantz
On Wed, Apr 9, 2014 at 7:40 PM, Brian Paul bri...@vmware.com wrote:
 Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
 for AA lines (when the device doesn't support that feature).  We need to
 initialize this list before we setup the swtnl pieces.

 Found/fixed by Charmaine Lee.

 Cc: 10.0 mesa-sta...@lists.freedesktop.org
 ---
  src/gallium/drivers/svga/svga_context.c |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Jakob Bornecrantz ja...@vmware.com

Cheers, Jakob


Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask

2014-04-09 Thread Thomas Hellstrom
Hi, Rob!

On 04/08/2014 10:48 PM, Rob Clark wrote:
 From: Rob Clark robcl...@freedesktop.org

 Add support to properly handle solid-fill src and/or mask.  Without this
 we fall back to sw a lot for common things like text rendering.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/state_trackers/xa/xa_composite.c | 115 +--
  src/gallium/state_trackers/xa/xa_priv.h  |  13 +-
  src/gallium/state_trackers/xa/xa_renderer.c  | 298 
 ---
  src/gallium/state_trackers/xa/xa_tgsi.c  |  36 +++-
  4 files changed, 263 insertions(+), 199 deletions(-)

 diff --git a/src/gallium/state_trackers/xa/xa_composite.c 
 b/src/gallium/state_trackers/xa/xa_composite.c
 index 7ae35a1..b70fd47 100644
 --- a/src/gallium/state_trackers/xa/xa_composite.c
 +++ b/src/gallium/state_trackers/xa/xa_composite.c
 @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend,
  boolean supported = FALSE;
  
  /*
 - * Temporarily disable component alpha since it appears buggy.
 - */
 -if (mask_pic && mask_pic->component_alpha)
 - return FALSE;
 -
 -/*


I'll attach the rendercheck logs of two early regressions. The first one
(log1.txt) happens because we enable component_alpha here.
The second one is with component alpha disabled again.

/Thomas


rendercheck 1.4
Render extension version 0.11
Window format: r8g8b8
Found server-supported format: a8
Found server-supported format: a8r8g8b8
Found server-supported format: x8r8g8b8
Found server-supported format: b8g8r8a8
Found server-supported format: b8g8r8x8
Found server-supported format: r8g8b8
Found server-supported format: b8g8r8
Found server-supported format: r5g5b5
Found server-supported format: b5g5r5
Found server-supported format: x1r5g5b5
Found server-supported format: x1b5g5r5
Found server-supported format: r5g6b5
Found server-supported format: b5g6r5
Found server-supported format: x8b8g8r8
Found server-supported format: x2r10g10b10
Found server-supported format: x2b10g10r10
Beginning testing of filling of 1x1R pictures
Beginning testing of filling of 10x10 pictures
Beginning dest coords test
Beginning src coords test
Beginning mask coords test
mask coords test error of 32. at (1, 0) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (2, 0) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (3, 0) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (4, 0) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (0, 1) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 64. at (1, 1) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 1.00 1.00 1.00
mask coords test error of 32. at (2, 1) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 64. at (3, 1) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 1.00 1.00 1.00
mask coords test error of 32. at (4, 1) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (0, 2) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 64. at (1, 2) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 1.00 1.00 1.00
mask coords test error of 32. at (2, 2) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 64. at (3, 2) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 1.00 1.00 1.00
mask coords test error of 32. at (4, 2) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (0, 3) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 64. at (1, 3) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 1.00 1.00 1.00
mask coords test error of 64. at (2, 3) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 1.00 1.00 1.00
mask coords test error of 32. at (3, 3) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (4, 3) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (0, 4) --
   RGBA
got:   0.00 0.00 0.00 1.00
expected:  1.00 0.00 0.00 1.00
mask coords test error of 32. at (1, 4) --
   RG

Re: [Mesa-dev] [Mesa-stable] [PATCH 1/2] glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path

2014-04-09 Thread Ian Romanick
On 03/16/2014 07:10 AM, Emil Velikov wrote:
 With commit 1f1928db001 ("glx: Drop _Xglobal_lock while we create and
 initialize glx display") we've split the big _Xglobal_lock handling in
 a more fine-grained manner.
 
 Unfortunately we forgot to drop the unlock_mutex on the error paths,
 leading to undefined behaviour as the mutex is already unlocked.
 
 Cc: Kristian Høgsberg k...@bitplanet.net
 Cc: 9.2 10.0 10.1  mesa-sta...@lists.freedesktop.org
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com

Sorry for not looking at this sooner... I checked the code, and this
patch is obviously correct.  The lock was released just a few lines
earlier (outside the patch).

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

 ---
  src/glx/glxext.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)
 
 diff --git a/src/glx/glxext.c b/src/glx/glxext.c
 index 4a195bd..de73036 100644
 --- a/src/glx/glxext.c
 +++ b/src/glx/glxext.c
 @@ -826,7 +826,6 @@ __glXInitialize(Display * dpy)
 dpyPriv->codes = XInitExtension(dpy, __glXExtensionName);
 if (!dpyPriv->codes) {
free(dpyPriv);
 -  _XUnlockMutex(_Xglobal_lock);
return NULL;
 }
  
 @@ -842,7 +841,6 @@ __glXInitialize(Display * dpy)
&dpyPriv->majorVersion, &dpyPriv->minorVersion)
 || (dpyPriv->majorVersion == 1 && dpyPriv->minorVersion < 1)) {
free(dpyPriv);
 -  _XUnlockMutex(_Xglobal_lock);
return NULL;
 }
  
 @@ -907,7 +905,7 @@ __glXInitialize(Display * dpy)
 dpyPriv->next = glx_displays;
 glx_displays = dpyPriv;
  
 -_XUnlockMutex(_Xglobal_lock);
 +   _XUnlockMutex(_Xglobal_lock);
  
 return dpyPriv;
  }
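
For anyone skimming the thread, the shape of the fix can be sketched as below. This is only an illustration under stated assumptions, not the glxext.c code: CountingLock, DisplayPriv and initialize_display() are hypothetical names, and the counting lock stands in for _Xglobal_lock so that an unbalanced unlock trips an assertion instead of being undefined behaviour.

```cpp
#include <cassert>

// Hypothetical stand-in for _Xglobal_lock: it only counts how often it is
// held, so an unbalanced unlock trips the assertion below.
struct CountingLock {
    int depth = 0;
    void lock() { ++depth; }
    void unlock() {
        assert(depth > 0 && "unlock of a mutex that is not held");
        --depth;
    }
};

// Hypothetical, heavily simplified per-display record.
struct DisplayPriv {
    bool linked = false;
};

static CountingLock global_lock;

// Sketch of the control flow after the patch. `extension_present` stands in
// for the XInitExtension()/QueryVersion() checks in the real function.
DisplayPriv *initialize_display(bool extension_present)
{
    global_lock.lock();
    // ... search the list of already-initialized displays ...
    global_lock.unlock();   // dropped before the slow initialization

    DisplayPriv *priv = new DisplayPriv();
    if (!extension_present) {
        delete priv;
        return nullptr;     // no unlock here: the lock is not held
    }

    global_lock.lock();
    priv->linked = true;    // link into the global list under the lock
    global_lock.unlock();
    return priv;
}
```

The point is just that every unlock pairs with a lock taken in the same stretch of code, so the early-return paths have nothing left to release.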
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] xa: handle solid-fill src/mask

2014-04-09 Thread Brian Paul

On 04/09/2014 11:50 AM, Thomas Hellstrom wrote:

On 04/03/2014 04:07 PM, Brian Paul wrote:

On 04/02/2014 11:17 PM, Thomas Hellstrom wrote:

On 04/01/2014 05:04 PM, Rob Clark wrote:

From: Rob Clark robcl...@freedesktop.org

Add support to properly handle solid-fill src and/or mask.  Without this
we fall back to sw a lot for common things like text rendering.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
   src/gallium/state_trackers/xa/xa_composite.c |  88 
   src/gallium/state_trackers/xa/xa_priv.h  |   7 +-
   src/gallium/state_trackers/xa/xa_renderer.c  | 289
---
   src/gallium/state_trackers/xa/xa_tgsi.c  |  31 ++-
   4 files changed, 242 insertions(+), 173 deletions(-)


Rob,
While testing this patch it looks like we sometimes set two samplers,
and the first one is NULL.
The SVGA driver asserts on that condition.
We might need to move the active sampler to the first entry in that
case, and adjust tex coords and shader accordingly.

I'll discuss with BrianP.


I think the root problem is a disagreement between texture samplers
and sampler views.  If a texture sampler is non-null, the
corresponding sampler view should be non-null too, and vice versa.

We're tripping over an assertion when a sampler view is non-null but
the corresponding sampler is NULL.

I'm going to write a patch for the driver to be more resilient in that
situation.

-Brian


Brian, This is a different problem.
Here, the state tracker sets up sampler[0] and sampler_view[0] to NULL,
but sampler[1] and sampler_view[1] to NON-NULL, but
samplers and sampler views are consistent.

The question is whether that's OK, or whether that's not allowed.


I think that's OK.

-Brian
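
To make the two conditions discussed above concrete, here is a minimal sketch. It assumes a made-up BoundTextures struct with three slots; none of these names come from gallium or the SVGA driver. slots_consistent() is the "sampler and view agree per slot" check, and first_active_slot() shows that a NULL slot 0 followed by a populated slot 1 still passes that check.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical placeholder types; the real state objects carry much more.
struct Sampler {};
struct SamplerView {};

constexpr std::size_t kMaxSlots = 3;   // arbitrary size for the sketch

struct BoundTextures {
    std::array<const Sampler *, kMaxSlots> samplers{};      // all nullptr
    std::array<const SamplerView *, kMaxSlots> views{};     // all nullptr
};

// True when every slot has either both a sampler and a view, or neither.
bool slots_consistent(const BoundTextures &state)
{
    for (std::size_t i = 0; i < kMaxSlots; ++i) {
        if ((state.samplers[i] == nullptr) != (state.views[i] == nullptr))
            return false;
    }
    return true;
}

// Index of the first populated slot, or kMaxSlots if nothing is bound.
// Leading empty slots are simply skipped.
std::size_t first_active_slot(const BoundTextures &state)
{
    assert(slots_consistent(state));
    for (std::size_t i = 0; i < kMaxSlots; ++i) {
        if (state.samplers[i] != nullptr)
            return i;
    }
    return kMaxSlots;
}
```

Whether a driver should accept the gap or the state tracker should compact the slots is exactly the open question in this thread; the sketch takes no side on that.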



Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask

2014-04-09 Thread Rob Clark
On Wed, Apr 9, 2014 at 1:59 PM, Thomas Hellstrom thellst...@vmware.com wrote:
 Hi, Rob!

 On 04/08/2014 10:48 PM, Rob Clark wrote:
 From: Rob Clark robcl...@freedesktop.org

 Add support to properly handle solid-fill src and/or mask.  Without this
 we fall back to sw a lot for common things like text rendering.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/state_trackers/xa/xa_composite.c | 115 +--
  src/gallium/state_trackers/xa/xa_priv.h  |  13 +-
  src/gallium/state_trackers/xa/xa_renderer.c  | 298 
 ---
  src/gallium/state_trackers/xa/xa_tgsi.c  |  36 +++-
  4 files changed, 263 insertions(+), 199 deletions(-)

 diff --git a/src/gallium/state_trackers/xa/xa_composite.c 
 b/src/gallium/state_trackers/xa/xa_composite.c
 index 7ae35a1..b70fd47 100644
 --- a/src/gallium/state_trackers/xa/xa_composite.c
 +++ b/src/gallium/state_trackers/xa/xa_composite.c
 @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend,
  boolean supported = FALSE;

  /*
 - * Temporarily disable component alpha since it appears buggy.
 - */
 -if (mask_pic && mask_pic->component_alpha)
 - return FALSE;
 -
 -/*


oh, I guess that hunk should have been a different patch anyways (even
if it worked)

 I'll attach the rendercheck logs of two early regressions. The first one
 (log1.txt) happens because we enable component_alpha here.
 The second one is with component alpha disabled again.

hmm.. that almost looks like a vertex shader issue (if I'm
understanding what rendercheck is saying properly).  Like the mask
coords are wrong?

BR,
-R

 /Thomas




[Mesa-dev] [PATCH 1/3] i965: Add reads_accumulator_implicitly() function.

2014-04-09 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 16 
 src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
 2 files changed, 17 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index f194437..c8796b3 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -664,6 +664,22 @@ backend_instruction::can_do_saturate() const
 }
 
 bool
+backend_instruction::reads_accumulator_implicitly() const
+{
+   switch (opcode) {
+   case BRW_OPCODE_MAC:
+   case BRW_OPCODE_MACH:
+   /* FINISHME: Enable these if we ever start emitting them.
+* case BRW_OPCODE_SADA:
+* case BRW_OPCODE_SADA2:
+*/
+  return true;
+   default:
+  return false;
+   }
+}
+
+bool
 backend_instruction::has_side_effects() const
 {
switch (opcode) {
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index 6bd7dc8..9ef08e5 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -47,6 +47,7 @@ public:
bool is_control_flow() const;
bool can_do_source_mods() const;
bool can_do_saturate() const;
+   bool reads_accumulator_implicitly() const;
 
/**
 * True if the instruction has side effects other than writing to
-- 
1.8.3.2



[Mesa-dev] [PATCH 2/3] i965: Add is_accumulator() function.

2014-04-09 Thread Matt Turner
From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
---
As a follow-on patch series, we should move common fields from fs_reg
and vec4's reg into a backend_reg and consolidate these functions.

 src/mesa/drivers/dri/i965/brw_fs.cpp   |  8 
 src/mesa/drivers/dri/i965/brw_fs.h |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 +
 src/mesa/drivers/dri/i965/brw_vec4.h   |  2 ++
 4 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 85a5463..e576545 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -501,6 +501,14 @@ fs_reg::is_valid_3src() const
return file == GRF || file == UNIFORM;
 }
 
+bool
+fs_reg::is_accumulator() const
+{
+   return file == HW_REG &&
+  fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+  fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
+}
+
 int
 fs_visitor::type_size(const struct glsl_type *type)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 3d21ee5..1dadccd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -83,6 +83,7 @@ public:
bool is_null() const;
bool is_valid_3src() const;
bool is_contiguous() const;
+   bool is_accumulator() const;
 
fs_reg apply_stride(unsigned stride);
/** Smear a channel of the reg to all channels. */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 740d9ff..38d2b93 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -151,6 +151,15 @@ src_reg::src_reg(dst_reg reg)
 swizzles[2], swizzles[3]);
 }
 
+bool
+src_reg::is_accumulator() const
+{
+   return file == HW_REG &&
+  fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+  fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
+}
+
+
 void
 dst_reg::init()
 {
@@ -221,6 +230,14 @@ dst_reg::is_null() const
 }
 
 bool
+dst_reg::is_accumulator() const
+{
+   return file == HW_REG &&
+  fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+  fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
+}
+
+bool
 vec4_instruction::is_send_from_grf()
 {
switch (opcode) {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 159a5bd..b3549a5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -128,6 +128,7 @@ public:
bool equals(src_reg *r);
bool is_zero() const;
bool is_one() const;
+   bool is_accumulator() const;
 
src_reg(class vec4_visitor *v, const struct glsl_type *type);
 
@@ -195,6 +196,7 @@ public:
explicit dst_reg(src_reg reg);
 
bool is_null() const;
+   bool is_accumulator() const;
 
int writemask; /** Bitfield of WRITEMASK_[XYZW] */
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 3/3] i965: Add writes_accumulator flag

2014-04-09 Thread Matt Turner
From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com

Our hardware has an accumulator register, which can be used to store
intermediate results across multiple instructions.  Many instructions
can implicitly write a value to the accumulator in addition to their
normal destination register.  This is enabled by the AccWrEn flag.

This patch introduces a new flag, inst->writes_accumulator, which
allows us to express the AccWrEn notion in the IR.  It also creates an
ALU2_ACC macro to easily define emitters for instructions that
implicitly write the accumulator.

Previously, we only supported implicit accumulator writes from the
ADDC, SUBB, and MACH instructions.  We always enabled them on those
instructions, and left them disabled for other instructions.

To take advantage of the MAC (multiply-accumulate) instruction, we
need to be able to set AccWrEn on other types of instructions.

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
---
I split out is_accumulator() into a separate patch, and made some
fixes to the scheduling code. Let me know if these changes look good
to you, JP. (Patch formatted with -U15 so as to show other sections of
the scheduling code during review.)

 src/mesa/drivers/dri/i965/brw_fs.cpp   | 26 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  7 +--
 .../drivers/dri/i965/brw_schedule_instructions.cpp | 58 ++
 src/mesa/drivers/dri/i965/brw_shader.h |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 15 ++
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  7 +--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 17 +--
 7 files changed, 95 insertions(+), 36 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index e576545..0eece60 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -52,30 +52,32 @@ extern "C" {
 #include "glsl/glsl_types.h"
 
 void
 fs_inst::init()
 {
memset(this, 0, sizeof(*this));
this->conditional_mod = BRW_CONDITIONAL_NONE;
 
this->dst = reg_undef;
this->src[0] = reg_undef;
this->src[1] = reg_undef;
this->src[2] = reg_undef;
 
/* This will be the case for almost all instructions. */
this->regs_written = 1;
+
+   this->writes_accumulator = false;
 }
 
 fs_inst::fs_inst()
 {
init();
this->opcode = BRW_OPCODE_NOP;
 }
 
 fs_inst::fs_inst(enum opcode opcode)
 {
init();
this->opcode = opcode;
 }
 
 fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
@@ -139,63 +141,72 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
 
 #define ALU1(op)\
fs_inst *\
fs_visitor::op(fs_reg dst, fs_reg src0)  \
{\
   return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0);  \
}
 
 #define ALU2(op)\
fs_inst *\
fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1) \
{\
   return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\
}
 
+#define ALU2_ACC(op)\
+   fs_inst *\
+   fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1) \
+   {\
+  fs_inst *inst = new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\
+  inst->writes_accumulator = true;  \
+  return inst;  \
+   }
+
 #define ALU3(op)\
fs_inst *\
fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1, fs_reg src2)\
{\
   return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1, src2);\
}
 
 ALU1(NOT)
 ALU1(MOV)
 ALU1(FRC)
 ALU1(RNDD)
 ALU1(RNDE)
 ALU1(RNDZ)
 ALU2(ADD)
 ALU2(MUL)
-ALU2(MACH)
+ALU2_ACC(MACH)
 ALU2(AND)
 ALU2(OR)
 ALU2(XOR)
 ALU2(SHL)
 ALU2(SHR)
 ALU2(ASR)
 ALU3(LRP)
 ALU1(BFREV)
 ALU3(BFE)
 ALU2(BFI1)
 ALU3(BFI2)
 ALU1(FBH)
 ALU1(FBL)
 ALU1(CBIT)
 ALU3(MAD)
-ALU2(ADDC)
-ALU2(SUBB)
+ALU2_ACC(ADDC)
+ALU2_ACC(SUBB)
 ALU2(SEL)
 
 /** Gen4 predicated IF. */
 fs_inst *
 fs_visitor::IF(uint32_t predicate)
 {
fs_inst *inst = new(mem_ctx) fs_inst(BRW_OPCODE_IF);
inst->predicate = predicate;
return inst;
 }
 
 /** Gen6 IF with embedded comparison. */
 fs_inst *
 fs_visitor::IF(fs_reg src0, fs_reg src1, 

Re: [Mesa-dev] [PATCH 1/5] i965: Add writes_accumulator flag

2014-04-09 Thread Matt Turner
On Fri, Apr 4, 2014 at 6:51 AM, Juha-Pekka Heikkila
juhapekka.heikk...@gmail.com wrote:
 diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
 b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
 index a951459..92f82fd 100644
 --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
 @@ -758,6 +758,7 @@ fs_instruction_scheduler::calculate_deps()
 schedule_node *last_fixed_grf_write = NULL;
 int reg_width = v->dispatch_width / 8;

 +   schedule_node *last_accumulator_write = NULL;
 /* The last instruction always needs to still be the last
  * instruction.  Either it's flow control (IF, ELSE, ENDIF, DO,
  * WHILE) and scheduling other things after it would disturb the
 @@ -822,6 +823,10 @@ fs_instruction_scheduler::calculate_deps()

The line before this was
  if (inst->reads_flag()) {
  add_dep(last_conditional_mod[inst->flag_subreg], n);
}

 +  if (inst->writes_accumulator || inst->dst.is_accumulator()) {
 + add_dep(last_accumulator_write, n);
 +  }

But we're checking if we're writing the accumulator here, instead of reading it.

We're also not giving the scheduler any benefits from its new
knowledge of accumulator dependencies, because we're still calling
add_barrier_deps() above when we don't recognize the destination. I
hope you don't mind, but I split the is_accumulator() additions into a
separate patch, fixed up the scheduler hunks and sent the revised
patch. Let me know if it looks right to you.
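
A rough sketch of the read/write distinction mentioned above, for readers following along. It is a toy model under stated assumptions, not the brw_schedule_instructions.cpp code: ToyInst, Edge and accumulator_deps() are hypothetical names, and the model tracks only the accumulator, ignoring every other register file and all latency information.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy instruction: only records how it touches the accumulator.
struct ToyInst {
    bool reads_acc;    // e.g. MAC/MACH reading it implicitly, or an acc source
    bool writes_acc;   // e.g. the writes_accumulator flag, or an acc destination
};

// (earlier, later): `later` must stay after `earlier`.
using Edge = std::pair<int, int>;

std::vector<Edge> accumulator_deps(const std::vector<ToyInst> &insts)
{
    std::vector<Edge> edges;
    int last_write = -1;                 // index of the last accumulator write
    std::vector<int> reads_since_write;  // pure readers since that write

    for (int i = 0; i < static_cast<int>(insts.size()); ++i) {
        const ToyInst &inst = insts[i];

        // A reader must follow the last write (read-after-write); a writer
        // must follow it as well (write-after-write).
        if (last_write >= 0 && (inst.reads_acc || inst.writes_acc))
            edges.push_back({last_write, i});

        if (inst.writes_acc) {
            // A writer must not overtake earlier readers (write-after-read).
            for (int r : reads_since_write)
                edges.push_back({r, i});
            reads_since_write.clear();
            last_write = i;
        } else if (inst.reads_acc) {
            reads_since_write.push_back(i);
        }
    }
    return edges;
}
```

Fed a write, then a read-and-write, then a read (the mul/mach/mov shape), this yields a simple chain of two edges and leaves unrelated instructions free to move.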


Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask

2014-04-09 Thread Rob Clark
On Wed, Apr 9, 2014 at 1:59 PM, Thomas Hellstrom thellst...@vmware.com wrote:
 Hi, Rob!

 On 04/08/2014 10:48 PM, Rob Clark wrote:
 From: Rob Clark robcl...@freedesktop.org

 Add support to properly handle solid-fill src and/or mask.  Without this
 we fall back to sw a lot for common things like text rendering.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/state_trackers/xa/xa_composite.c | 115 +--
  src/gallium/state_trackers/xa/xa_priv.h  |  13 +-
  src/gallium/state_trackers/xa/xa_renderer.c  | 298 
 ---
  src/gallium/state_trackers/xa/xa_tgsi.c  |  36 +++-
  4 files changed, 263 insertions(+), 199 deletions(-)

 diff --git a/src/gallium/state_trackers/xa/xa_composite.c 
 b/src/gallium/state_trackers/xa/xa_composite.c
 index 7ae35a1..b70fd47 100644
 --- a/src/gallium/state_trackers/xa/xa_composite.c
 +++ b/src/gallium/state_trackers/xa/xa_composite.c
 @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend,
  boolean supported = FALSE;

  /*
 - * Temporarily disable component alpha since it appears buggy.
 - */
 -if (mask_pic && mask_pic->component_alpha)
 - return FALSE;
 -
 -/*


 I'll attach the rendercheck logs of two early regressions. The first one
 (log1.txt) happens because we enable component_alpha here.
 The second one is with component alpha disabled again.

hmm, so for the second test, it works for me with --sync:


[robclark@reptile:~]$ rendercheck --sync -v -t mcoords
rendercheck 1.4
Render extension version 0.11
Window format: r8g8b8
Found server-supported format: a8
Found server-supported format: a8r8g8b8
Found server-supported format: x8r8g8b8
Found server-supported format: b8g8r8a8
Found server-supported format: b8g8r8x8
Found server-supported format: r8g8b8
Found server-supported format: b8g8r8
Found server-supported format: r5g5b5
Found server-supported format: b5g5r5
Found server-supported format: x1r5g5b5
Found server-supported format: x1b5g5r5
Found server-supported format: r5g6b5
Found server-supported format: b5g6r5
Found server-supported format: x8b8g8r8
Found server-supported format: x2r10g10b10
Found server-supported format: x2b10g10r10
Beginning mask coords test
1 tests passed of 1 total
Successful Groups:
mcoords
[robclark@reptile:~]$



but not without (although the error I get is a bit different..
although maybe different rendercheck args?)


[robclark@reptile:~]$ rendercheck -v -t mcoords
rendercheck 1.4
Render extension version 0.11
Window format: r8g8b8
Found server-supported format: a8
Found server-supported format: a8r8g8b8
Found server-supported format: x8r8g8b8
Found server-supported format: b8g8r8a8
Found server-supported format: b8g8r8x8
Found server-supported format: r8g8b8
Found server-supported format: b8g8r8
Found server-supported format: r5g5b5
Found server-supported format: b5g5r5
Found server-supported format: x1r5g5b5
Found server-supported format: x1b5g5r5
Found server-supported format: r5g6b5
Found server-supported format: b5g6r5
Found server-supported format: x8b8g8r8
Found server-supported format: x2r10g10b10
Found server-supported format: x2b10g10r10
Beginning mask coords test
mask coords test error of 255. at (0, 0) --
   R G B A
got:   1.000 1.000 1.000 1.000
expected:  1.000 0.000 0.000 1.000
mask coords test error of 255. at (1, 1) --
   R G B A
got:   1.000 0.000 0.000 1.000
expected:  1.000 1.000 1.000 1.000
mask coords test error of 255. at (2, 1) --
   R G B A
got:   1.000 1.000 1.000 1.000
expected:  1.000 0.000 0.000 1.000
mask coords test error of 255. at (3, 1) --
   R G B A
got:   1.000 0.000 0.000 1.000
expected:  1.000 1.000 1.000 1.000
mask coords test error of 255. at (4, 1) --
   R G B A
got:   1.000 1.000 1.000 1.000
expected:  1.000 0.000 0.000 1.000
mask coords test error of 255. at (1, 2) --
   R G B A
got:   1.000 0.000 0.000 1.000
expected:  1.000 1.000 1.000 1.000
mask coords test error of 255. at (2, 2) --
   R G B A
got:   1.000 1.000 1.000 1.000
expected:  1.000 0.000 0.000 1.000
mask coords test error of 255. at (3, 2) --
   R G B A
got:   1.000 0.000 0.000 1.000
expected:  1.000 1.000 1.000 1.000
mask coords test error of 255. at (4, 2) --
   R G B A
got:   1.000 1.000 1.000 1.000
expected:  1.000 0.000 0.000 1.000
mask coords test error of 255. at (1, 3) --
   R G B A
got:   1.000 0.000 0.000 1.000
expected:  1.000 1.000 1.000 1.000
mask coords test error of 255. at (3, 3) --
   R G B A
got:   1.000 1.000 1.000 1.000
expected:  1.000 0.000 0.000 1.000
expected vs tested:
1 0
10101 11010
10101 11010
10011 11001
1 1
0 tests passed of 1 total
Successful Groups:

Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask

2014-04-09 Thread Rob Clark
On Wed, Apr 9, 2014 at 5:12 PM, Rob Clark robdcl...@gmail.com wrote:
 On Wed, Apr 9, 2014 at 1:59 PM, Thomas Hellstrom thellst...@vmware.com 
 wrote:
 Hi, Rob!

 On 04/08/2014 10:48 PM, Rob Clark wrote:
 From: Rob Clark robcl...@freedesktop.org

 Add support to properly handle solid-fill src and/or mask.  Without this
 we fall back to sw a lot for common things like text rendering.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/state_trackers/xa/xa_composite.c | 115 +--
  src/gallium/state_trackers/xa/xa_priv.h  |  13 +-
  src/gallium/state_trackers/xa/xa_renderer.c  | 298 
 ---
  src/gallium/state_trackers/xa/xa_tgsi.c  |  36 +++-
  4 files changed, 263 insertions(+), 199 deletions(-)

 diff --git a/src/gallium/state_trackers/xa/xa_composite.c 
 b/src/gallium/state_trackers/xa/xa_composite.c
 index 7ae35a1..b70fd47 100644
 --- a/src/gallium/state_trackers/xa/xa_composite.c
 +++ b/src/gallium/state_trackers/xa/xa_composite.c
 @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend,
  boolean supported = FALSE;

  /*
 - * Temporarily disable component alpha since it appears buggy.
 - */
 -if (mask_pic && mask_pic->component_alpha)
 - return FALSE;
 -
 -/*


 I'll attach the rendercheck logs of two early regressions. The first one
 (log1.txt) happens because we enable component_alpha here.
 The second one is with component alpha disabled again.

 hmm, so for the second test, it works for me with --sync:

oh, and if I bring back disabling of mask pic w/ component alpha, then
it passes for me both with and without --sync..

BR,
-R

 

[Mesa-dev] [PATCH] i965: Don't make instructions with a null dest a barrier to scheduling.

2014-04-09 Thread Matt Turner
Now that we properly track accumulator dependencies, the scheduler is
able to schedule instructions between the mach and mov in the common
integer multiplication pattern:

   mul  acc0, x, y
   mach null, x, y
   mov  dest, acc0

Since a null destination implies no dependency on the destination, we
can also safely schedule instructions (that don't write the accumulator)
between the mul and mach.
---
This depends on JP's accumulator series.

 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 3538da5..910b73a 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -864,7 +864,8 @@ fs_instruction_scheduler::calculate_deps()
   } else if (inst->dst.is_accumulator()) {
  add_dep(last_accumulator_write, n);
  last_accumulator_write = n;
-  } else if (inst->dst.file != BAD_FILE) {
+  } else if (inst->dst.file != BAD_FILE &&
+ !inst->dst.is_null()) {
 add_barrier_deps(n);
   }
 
@@ -983,7 +984,8 @@ fs_instruction_scheduler::calculate_deps()
  }
   } else if (inst->dst.is_accumulator()) {
  last_accumulator_write = n;
-  } else if (inst->dst.file != BAD_FILE) {
+  } else if (inst->dst.file != BAD_FILE &&
+ !inst->dst.is_null()) {
 add_barrier_deps(n);
   }
 
@@ -1089,7 +1091,8 @@ vec4_instruction_scheduler::calculate_deps()
   } else if (inst->dst.is_accumulator()) {
  add_dep(last_accumulator_write, n);
  last_accumulator_write = n;
-  } else if (inst->dst.file != BAD_FILE) {
+  } else if (inst->dst.file != BAD_FILE &&
+ !inst->dst.is_null()) {
  add_barrier_deps(n);
   }
 
@@ -1173,7 +1176,8 @@ vec4_instruction_scheduler::calculate_deps()
  last_fixed_grf_write = n;
   } else if (inst->dst.is_accumulator()) {
  last_accumulator_write = n;
-  } else if (inst->dst.file != BAD_FILE) {
+  } else if (inst->dst.file != BAD_FILE &&
+ !inst->dst.is_null()) {
  add_barrier_deps(n);
   }
 
-- 
1.8.3.2
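
The decision this patch changes can be summarised with the small sketch below. It is a simplification under stated assumptions: DstKind, DepAction and classify_destination() are hypothetical names, and the real calculate_deps() functions distinguish more register files (MRF, fixed GRF and so on) than the five cases modelled here.

```cpp
#include <cassert>

// Hypothetical, coarse classification of an instruction's destination.
enum class DstKind { None, Null, Accumulator, Grf, Unrecognized };

// What the dependency pass does for that destination.
enum class DepAction { Nothing, TrackGrf, TrackAccumulator, Barrier };

DepAction classify_destination(DstKind kind)
{
    switch (kind) {
    case DstKind::Grf:
        return DepAction::TrackGrf;          // per-register write tracking
    case DstKind::Accumulator:
        return DepAction::TrackAccumulator;  // last_accumulator_write
    case DstKind::None:                      // BAD_FILE: no destination
    case DstKind::Null:                      // null register: result discarded
        return DepAction::Nothing;
    case DstKind::Unrecognized:
        return DepAction::Barrier;           // stay conservative
    }
    return DepAction::Barrier;               // unreachable, keeps compilers quiet
}
```

Only the Null case changes behaviour relative to the old code, which fell through to the barrier branch for it.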



Re: [Mesa-dev] [PATCH 1/3] i965: Add reads_accumulator_implicitly() function.

2014-04-09 Thread Eric Anholt
Matt Turner matts...@gmail.com writes:

 ---
  src/mesa/drivers/dri/i965/brw_shader.cpp | 16 
  src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
  2 files changed, 17 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index f194437..c8796b3 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -664,6 +664,22 @@ backend_instruction::can_do_saturate() const
  }
  
  bool
 +backend_instruction::reads_accumulator_implicitly() const
 +{
 +   switch (opcode) {
 +   case BRW_OPCODE_MAC:
 +   case BRW_OPCODE_MACH:
 +   /* FINISHME: Enable these if we ever start emitting them.
 +* case BRW_OPCODE_SADA:
 +* case BRW_OPCODE_SADA2:
 +*/

Let's just uncomment SADA2 right away to prevent pain in the future.
SAD2 doesn't read the acc, though.

Other than that, the first 2 patches are:

Reviewed-by: Eric Anholt e...@anholt.net

I think scheduling is still broken in the last one, because you're
removing the barrier deps on implicit-accumulator opcodes and replacing
them with explicit dependencies, but you're not tracking the accumulator
updates by almost-all-instructions pre-gen6.  The scheduler would be
free to slip in some unrelated instruction after the MUL in the
following snippet from brw_vec4_visitor.cpp:

emit(MUL(acc, op[0], op[1]));
emit(MACH(dst_null_d(), op[0], op[1]));
emit(MOV(result_dst, src_reg(acc)));

(err, why are we doing MACH and MOV instead of just MACH into
result_dst?)




Re: [Mesa-dev] [PATCH 2/3] i965: Add is_accumulator() function.

2014-04-09 Thread Kenneth Graunke
On 04/09/2014 01:47 PM, Matt Turner wrote:
 From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 
 Reviewed-by: Matt Turner matts...@gmail.com
 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
 As a follow-on patch series, we should move common fields from fs_reg
 and vec4's reg into a backend_reg and consolidate these functions.

Yeah, there's been some talk about creating one of those.

But even at present, src_reg and dst_reg both inherit from reg.
is_accumulator() should be defined there, not in both subclasses.
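
Something along these lines is what defining it once in the base class would look like. This is a standalone sketch, not the brw_vec4.h declarations: the enums, the FixedHwReg struct and the kArfAccumulator value are placeholders chosen so the example compiles on its own.

```cpp
#include <cassert>

// Placeholder enums and constant; stand-ins for the driver's definitions.
enum class RegFile { BadFile, Grf, Uniform, HwReg };
enum class HwFile { General, Architecture };
constexpr unsigned kArfAccumulator = 0x20;   // placeholder register number

struct FixedHwReg {
    HwFile file;
    unsigned nr;
};

// Common base: the accumulator test lives here exactly once.
class Reg {
public:
    RegFile file = RegFile::BadFile;
    FixedHwReg fixed_hw_reg{HwFile::General, 0};

    bool is_accumulator() const
    {
        return file == RegFile::HwReg &&
               fixed_hw_reg.file == HwFile::Architecture &&
               fixed_hw_reg.nr == kArfAccumulator;
    }
};

// The subclasses only add what is specific to sources or destinations.
class SrcReg : public Reg {
public:
    unsigned swizzle = 0;
};

class DstReg : public Reg {
public:
    unsigned writemask = 0xf;
};
```

Both subclasses then pick up is_accumulator() by inheritance, and the two identical definitions in brw_vec4.cpp would collapse into one.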

 
  src/mesa/drivers/dri/i965/brw_fs.cpp   |  8 
  src/mesa/drivers/dri/i965/brw_fs.h |  1 +
  src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 +
  src/mesa/drivers/dri/i965/brw_vec4.h   |  2 ++
  4 files changed, 28 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 85a5463..e576545 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -501,6 +501,14 @@ fs_reg::is_valid_3src() const
 return file == GRF || file == UNIFORM;
  }
  
 +bool
 +fs_reg::is_accumulator() const
 +{
 +   return file == HW_REG &&
 +  fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
 +  fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
 +}
 +
  int
  fs_visitor::type_size(const struct glsl_type *type)
  {
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 3d21ee5..1dadccd 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -83,6 +83,7 @@ public:
 bool is_null() const;
 bool is_valid_3src() const;
 bool is_contiguous() const;
 +   bool is_accumulator() const;
  
 fs_reg apply_stride(unsigned stride);
 /** Smear a channel of the reg to all channels. */
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 index 740d9ff..38d2b93 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 @@ -151,6 +151,15 @@ src_reg::src_reg(dst_reg reg)
  swizzles[2], swizzles[3]);
  }
  
 +bool
 +src_reg::is_accumulator() const
 +{
 +   return file == HW_REG &&
 +  fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
 +  fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
 +}
 +
 +
  void
  dst_reg::init()
  {
 @@ -221,6 +230,14 @@ dst_reg::is_null() const
  }
  
  bool
 +dst_reg::is_accumulator() const
 +{
 +   return file == HW_REG &&
 +  fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
 +  fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
 +}
 +
 +bool
  vec4_instruction::is_send_from_grf()
  {
 switch (opcode) {
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 159a5bd..b3549a5 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -128,6 +128,7 @@ public:
 bool equals(src_reg *r);
 bool is_zero() const;
 bool is_one() const;
 +   bool is_accumulator() const;
  
 src_reg(class vec4_visitor *v, const struct glsl_type *type);
  
 @@ -195,6 +196,7 @@ public:
 explicit dst_reg(src_reg reg);
  
 bool is_null() const;
 +   bool is_accumulator() const;
  
 int writemask; /** Bitfield of WRITEMASK_[XYZW] */
  
 




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] i965: Add reads_accumulator_implicitly() function.

2014-04-09 Thread Matt Turner
On Wed, Apr 9, 2014 at 3:06 PM, Eric Anholt e...@anholt.net wrote:
 Matt Turner matts...@gmail.com writes:

 ---
  src/mesa/drivers/dri/i965/brw_shader.cpp | 16 
  src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
  2 files changed, 17 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index f194437..c8796b3 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -664,6 +664,22 @@ backend_instruction::can_do_saturate() const
  }

  bool
 +backend_instruction::reads_accumulator_implicitly() const
 +{
 +   switch (opcode) {
 +   case BRW_OPCODE_MAC:
 +   case BRW_OPCODE_MACH:
 +   /* FINISHME: Enable these if we ever start emitting them.
 +* case BRW_OPCODE_SADA:
 +* case BRW_OPCODE_SADA2:
 +*/

 Let's just uncomment SADA2 right away to prevent pain in the future.
 SAD2 doesn't read the acc, though.

 Other than that, the first 2 patches are:

 Reviewed-by: Eric Anholt e...@anholt.net

 I think scheduling is still broken in the last one, because you're
 removing the barrier deps on implicit-accumulator opcodes and replacing
 them with explicit dependencies, but you're not tracking the accumulator
 updates by almost-all-instructions pre-gen6.  The scheduler would be
 free to slip in some unrelated instruction after the MUL in the
 following snippet from brw_vec4_visitor.cpp:

Ah, that is true.

I went looking for text about this, since I didn't know about it until
you mentioned it recently. I see in the GM45 docs an 'Accumulator
Disable' bit in cr0. I wonder whether all of the false
write-after-write dependencies on the accumulator actually cause
stalls, and if so whether we should attempt to disable accumulator
writes. We don't seem to have any cases where we rely on implicit
accumulator updates that we couldn't replace with explicit accumulator
destinations.


 emit(MUL(acc, op[0], op[1]));
 emit(MACH(dst_null_d(), op[0], op[1]));
 emit(MOV(result_dst, src_reg(acc)));

 (err, why are we doing MACH and MOV instead of just MACH into
 result_dst?)

MACH writes the *high* 32 bits of the result into its destination
(which is what makes it useful for *mulExtended()).
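
For illustration, the high/low split can be modelled on the CPU. This is a
minimal sketch in plain C++, not driver code: umul_extended() and
mul_extended_result are made-up names, and it uses the unsigned case to keep
the arithmetic well defined. The high half corresponds to what MACH writes to
its destination; the low half is presumably what the quoted MUL/MACH/MOV
sequence reads back from the accumulator.

```cpp
#include <cstdint>

// Hypothetical helper type: both halves of a 32x32 -> 64-bit product.
struct mul_extended_result {
   uint32_t high; // upper 32 bits (the half MACH is described as writing)
   uint32_t low;  // lower 32 bits
};

// Compute the full 64-bit product and split it into two 32-bit halves.
inline mul_extended_result umul_extended(uint32_t a, uint32_t b)
{
   const uint64_t product = static_cast<uint64_t>(a) * static_cast<uint64_t>(b);

   mul_extended_result r;
   r.high = static_cast<uint32_t>(product >> 32);
   r.low = static_cast<uint32_t>(product & 0xffffffffu);
   return r;
}
```

A plain 32-bit multiply only needs the low half, which is why writing MACH
straight into result_dst would return the wrong half.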


Re: [Mesa-dev] [PATCH 3/3] i965: Add writes_accumulator flag

2014-04-09 Thread Kenneth Graunke
On 04/09/2014 01:47 PM, Matt Turner wrote:
 From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 
 Our hardware has an accumulator register, which can be used to store
 intermediate results across multiple instructions.  Many instructions
 can implicitly write a value to the accumulator in addition to their
 normal destination register.  This is enabled by the AccWrEn flag.
 
 This patch introduces a new flag, inst->writes_accumulator, which
 allows us to express the AccWrEn notion in the IR.  It also creates
 an ALU2_ACC macro to easily define emitters for instructions that
 implicitly write the accumulator.
 
 Previously, we only supported implicit accumulator writes from the
 ADDC, SUBB, and MACH instructions.  We always enabled them on those
 instructions, and left them disabled for other instructions.
 
 To take advantage of the MAC (multiply-accumulate) instruction, we
 need to be able to set AccWrEn on other types of instructions.
 
 Reviewed-by: Matt Turner matts...@gmail.com
 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
 I split out is_accumulator() into a separate patch, and made some
 fixes to the scheduling code. Let me know if these changes look good
 to you, JP. (Patch formatted with -U15 so as to show other sections of
 the scheduling code during review)
 
  src/mesa/drivers/dri/i965/brw_fs.cpp   | 26 ++
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  7 +--
  .../drivers/dri/i965/brw_schedule_instructions.cpp | 58 
 ++
  src/mesa/drivers/dri/i965/brw_shader.h |  1 +
  src/mesa/drivers/dri/i965/brw_vec4.cpp | 15 ++
  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  7 +--
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 17 +--
  7 files changed, 95 insertions(+), 36 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index e576545..0eece60 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
[snip]
 @@ -2113,40 +2124,35 @@ fs_visitor::dead_code_eliminate()
  
   for (int i = 0; i < inst->regs_written; i++) {
  int var = live_intervals->var_from_vgrf[inst->dst.reg];
  assert(live_intervals->end[var + inst->dst.reg_offset + i] >= pc);
  if (live_intervals->end[var + inst->dst.reg_offset + i] != pc) {
 dead = false;
 break;
  }
   }
  
   if (dead) {
  /* Don't dead code eliminate instructions that write to the
   * accumulator as a side-effect. Instead just set the destination
   * to the null register to free it.
   */
 -switch (inst-opcode) {
 -case BRW_OPCODE_ADDC:
 -case BRW_OPCODE_SUBB:
 -case BRW_OPCODE_MACH:
 +if (inst->writes_accumulator) {
 inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
 -   break;

Pre-existing bug: we ought to set progress = true in this case.

 -default:
 +} else {
 inst->remove();
 progress = true;
 -   break;
  }
   }
}
  
pc++;
 }
  
 if (progress)
invalidate_live_intervals();
  
 return progress;
  }
  
  struct dead_code_hash_key
  {
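
To spell out the progress = true remark: a pass that rewrites an
instruction without reporting progress can hide that change from whatever
reruns passes until nothing changes. The following is a minimal sketch with
made-up types (toy_inst and its flags are not Mesa structures). The
dst_is_null guard is an assumption added here so that a second run over
already-nulled instructions reports no progress.

```cpp
#include <vector>

// Hypothetical stand-in for an IR instruction; not the real fs_inst.
struct toy_inst {
   bool dead;               // result is never read
   bool writes_accumulator; // has an accumulator side effect
   bool dst_is_null;        // destination already points at the null register
   bool removed;            // instruction has been deleted
};

// Returns true if anything changed, including the "null the destination" case.
inline bool toy_dead_code_eliminate(std::vector<toy_inst> &insts)
{
   bool progress = false;

   for (toy_inst &inst : insts) {
      if (!inst.dead || inst.removed)
         continue;

      if (inst.writes_accumulator) {
         // Keep the instruction for its side effect, but free its destination.
         if (!inst.dst_is_null) {
            inst.dst_is_null = true;
            progress = true; // the rewrite counts as progress too
         }
      } else {
         inst.removed = true;
         progress = true;
      }
   }

   return progress;
}
```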






Re: [Mesa-dev] [PATCH 2/3] i965: Add is_accumulator() function.

2014-04-09 Thread Matt Turner
On Wed, Apr 9, 2014 at 3:13 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 On 04/09/2014 01:47 PM, Matt Turner wrote:
 From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com

 Reviewed-by: Matt Turner matts...@gmail.com
 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
 As a follow-on patch series, we should move common fields from fs_reg
 and vec4's reg into a backend_reg and consolidate these functions.

 Yeah, there's been some talk about creating one of those.

 But even at present, src_reg and dst_reg both inherit from reg.
 is_accumulator() should be defined there, not in both subclasses.

Yeah. That's what made me notice. I don't care to do a partial fix now.
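
As a rough picture of the consolidation being discussed, the check could
live once in the shared base class. This is only a sketch under assumptions:
the enum and constant names below are stand-ins (not the BRW_* definitions),
and toy_reg, toy_src_reg and toy_dst_reg are simplified placeholders for the
real reg/src_reg/dst_reg hierarchy.

```cpp
// Stand-in register files; the real enums live in the i965 headers.
enum toy_register_file { TOY_GRF, TOY_HW_REG };
enum toy_hw_file { TOY_GENERAL_FILE, TOY_ARCHITECTURE_FILE };

// Stand-in for the accumulator's architecture register number.
constexpr unsigned TOY_ARF_ACCUMULATOR = 0x20;

struct toy_fixed_reg {
   toy_hw_file file;
   unsigned nr;
};

// Base class: is_accumulator() is defined exactly once here.
struct toy_reg {
   toy_register_file file = TOY_GRF;
   toy_fixed_reg fixed_hw_reg = { TOY_GENERAL_FILE, 0 };

   bool is_accumulator() const
   {
      return file == TOY_HW_REG &&
             fixed_hw_reg.file == TOY_ARCHITECTURE_FILE &&
             fixed_hw_reg.nr == TOY_ARF_ACCUMULATOR;
   }
};

// Both subclasses inherit the check instead of duplicating it.
struct toy_src_reg : toy_reg { unsigned swizzle = 0; };
struct toy_dst_reg : toy_reg { unsigned writemask = 0xf; };

// Helper that builds an accumulator register of either subclass.
template <typename T>
T make_toy_accumulator()
{
   T r;
   r.file = TOY_HW_REG;
   r.fixed_hw_reg.file = TOY_ARCHITECTURE_FILE;
   r.fixed_hw_reg.nr = TOY_ARF_ACCUMULATOR;
   return r;
}
```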


[Mesa-dev] [PATCH] i965/fs: Reset reg_from when we can't coalesce.

2014-04-09 Thread Matt Turner
Not resetting this prevented coalescing after a failed attempt if
the sources for both MOVs were the same.

total instructions in shared programs: 1654531 - 1650224 (-0.26%)
instructions in affected programs: 423167 - 418860 (-1.02%)
GAINED:2
LOST:  0
---
 src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
index 6e30d16..4e3b611 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
@@ -162,6 +162,7 @@ fs_visitor::register_coalesce()
  if (!can_coalesce_vars(live_intervals, instructions, inst,
 var_to[i], var_from[i])) {
 can_coalesce = false;
+reg_from = -1;
 break;
  }
   }
-- 
1.8.3.2



[Mesa-dev] [PATCH 1/2] i965/gs: Add dummy source to prepare_channel_masks instruction.

2014-04-09 Thread Matt Turner
The generator uses its destination as a source implicitly, which breaks
some assumptions in dead code elimination. Giving the instruction a
source allows us to reason about it better.

Reviewed-by: Eric Anholt e...@anholt.net
---
I can't use the source in the generator because a shl(1) instruction
is emitted from generate_gs_prepare_channel_masks(), so we rely on a
bunch of bits in src being in dst even though we're not writing the
whole register.
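
A toy liveness scan shows why the missing source matters. This is a
minimal sketch, not the vec4 dead code elimination pass: toy_inst,
find_dead_writes() and the register numbers are made up. Register 1 plays
the role of channel_mask, and a destination of -1 means "not a tracked
virtual register" (the MRF write).

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical instruction: one destination register and explicit sources.
struct toy_inst {
   std::string op;
   int dst;               // -1 if the destination is not tracked
   std::vector<int> srcs; // registers the IR says are read
};

// Backward scan: a write is "dead" if nothing reads it before it is
// overwritten. Only the explicit sources are consulted, as in an IR pass.
inline std::vector<int> find_dead_writes(const std::vector<toy_inst> &prog)
{
   std::set<int> live;
   std::vector<int> dead;

   for (int n = static_cast<int>(prog.size()) - 1; n >= 0; n--) {
      const toy_inst &inst = prog[n];
      if (inst.dst >= 0) {
         if (live.count(inst.dst) == 0)
            dead.push_back(n);
         live.erase(inst.dst);
      }
      for (int s : inst.srcs)
         live.insert(s);
   }
   return dead;
}

// SHL writes register 1, PREPARE_CHANNEL_MASKS rewrites it, SET reads it.
inline std::vector<toy_inst> channel_mask_sequence(bool with_dummy_source)
{
   std::vector<toy_inst> prog;
   prog.push_back(toy_inst{"SHL", 1, {2, 3}});
   prog.push_back(toy_inst{"PREPARE_CHANNEL_MASKS", 1,
                           with_dummy_source ? std::vector<int>{1}
                                             : std::vector<int>{}});
   prog.push_back(toy_inst{"SET_CHANNEL_MASKS", -1, {1}});
   return prog;
}
```

Without the source, the scan flags the SHL (index 0) as a dead write, even
though the generated code still reads its result; with the source in place
nothing is flagged.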

 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 2 ++
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++-
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index a74514f..47aac75 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1221,6 +1221,8 @@ 
vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
   break;
 
case GS_OPCODE_PREPARE_CHANNEL_MASKS:
+  assert(dst.file == src[0].file &&
+ dst.reg == src[0].reg);
   generate_gs_prepare_channel_masks(dst);
   break;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 13d6d38..1321a94 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -408,7 +408,8 @@ vec4_gs_visitor::emit_control_data_bits()
  src_reg channel_mask(this, glsl_type::uint_type);
  inst = emit(SHL(dst_reg(channel_mask), one, channel));
  inst->force_writemask_all = true;
- emit(GS_OPCODE_PREPARE_CHANNEL_MASKS, dst_reg(channel_mask));
+ emit(GS_OPCODE_PREPARE_CHANNEL_MASKS, dst_reg(channel_mask),
+   channel_mask);
  emit(GS_OPCODE_SET_CHANNEL_MASKS, mrf_reg, channel_mask);
   }
 
diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index b854db5..49e1a97 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -758,6 +758,8 @@ 
gen8_vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
   break;
 
case GS_OPCODE_PREPARE_CHANNEL_MASKS:
+  assert(dst.file == src[0].file &&
+ dst.reg == src[0].reg);
   generate_gs_prepare_channel_masks(dst);
   break;
 
-- 
1.8.3.2



Re: [Mesa-dev] [PATCH 00/18] Implement GL_ARB_multi_bind

2014-04-09 Thread Fredrik Höglund
On Tuesday 08 April 2014, Kenneth Graunke wrote:
 On 01/21/2014 03:35 PM, Fredrik Höglund wrote:
  So here is my take on GL_ARB_multi_bind.
  
  I tried to come up with names for the new hash table functions that
  don't suggest that they should be used to do unlocked insertions/lookups.
  I'm not entirely happy with the ones I came up with though, so I'm
  hoping someone will have better suggestions.
  
  When binding 32 textures glBindTextures() seems to be about three times
  faster than calling glActiveTexture() + glBindTexture() in a loop.
  When binding 4 textures it's about twice as fast.
  
  I hope to land this series this week if there are no major issues.
  
  Note that I haven't been able to test the glBindImageTextures()
  implementation.
  
  This series is also available at:
  
  git://people.freedesktop.org/~fredrik/mesa arb-multi-bind
 
 Hi Fredrik,
 
 Where are we at with this?  It sounds like there were a few review
 comments and suggestions - were you planning to send out a v2?

I plan on sending out a new version shortly.

Fredrik



[Mesa-dev] [PATCH] fixup! i965: Add writes_accumulator flag

2014-04-09 Thread Matt Turner
---
Eric, how about this squashed in?

On Gen < 6, any accumulator use, with the exception of the implied
update that nearly every instruction does, causes a barrier dep.
Implicit writes, noted by ::writes_accumulator, also cause a barrier dep.

On Gen >= 6, we just track the accumulator dependencies with
last_accumulator_write.
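
The difference between the two strategies can be seen on a three-instruction
toy program. This is a minimal sketch under assumptions, not the scheduler
itself: toy_inst and accumulator_deps() are made-up names, only a forward
pass is modelled, and a barrier is approximated as "ordered against every
earlier and every later instruction".

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical instruction flags; not the real fs_inst/vec4_instruction.
struct toy_inst {
   bool reads_accumulator;
   bool writes_accumulator;
};

// A dependency edge (earlier instruction index, later instruction index).
typedef std::pair<int, int> toy_edge;

inline std::set<toy_edge>
accumulator_deps(const std::vector<toy_inst> &insts, int gen)
{
   const bool gen6plus = gen >= 6;
   std::set<toy_edge> deps;
   int last_accumulator_write = -1;
   int last_barrier = -1;

   for (int n = 0; n < static_cast<int>(insts.size()); n++) {
      // Everything after a barrier stays behind it.
      if (last_barrier >= 0)
         deps.emplace(last_barrier, n);

      if (!insts[n].reads_accumulator && !insts[n].writes_accumulator)
         continue;

      if (gen6plus) {
         // Precise tracking: depend only on the last accumulator write.
         if (last_accumulator_write >= 0)
            deps.emplace(last_accumulator_write, n);
         if (insts[n].writes_accumulator)
            last_accumulator_write = n;
      } else {
         // Barrier: nothing earlier may be moved past this instruction.
         for (int p = 0; p < n; p++)
            deps.emplace(p, n);
         last_barrier = n;
      }
   }
   return deps;
}
```

For MUL (writes acc), an unrelated instruction, then MACH (reads and writes
acc), the gen >= 6 path yields the single edge MUL -> MACH and leaves the
unrelated instruction free to move, while the pre-gen6 path pins all three in
order.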

 .../drivers/dri/i965/brw_schedule_instructions.cpp | 74 --
 1 file changed, 55 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 910b73a..8cc6908 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -742,6 +742,8 @@ fs_instruction_scheduler::is_compressed(fs_inst *inst)
 void
 fs_instruction_scheduler::calculate_deps()
 {
+   const bool gen6plus = v->brw->gen >= 6;
+
/* Pre-register-allocation, this tracks the last write per VGRF (so
 * different reg_offsets within it can interfere when they shouldn't).
 * After register allocation, reg_offsets are gone and we track individual
@@ -801,7 +803,7 @@ fs_instruction_scheduler::calculate_deps()
 } else {
add_dep(last_fixed_grf_write, n);
 }
- } else if (inst->src[i].is_accumulator()) {
+ } else if (inst->src[i].is_accumulator() && gen6plus) {
 add_dep(last_accumulator_write, n);
 } else if (inst->src[i].file != BAD_FILE &&
inst->src[i].file != IMM &&
@@ -826,7 +828,11 @@ fs_instruction_scheduler::calculate_deps()
   }
 
   if (inst->reads_accumulator_implicitly()) {
- add_dep(last_accumulator_write, n);
+ if (gen6plus) {
+add_dep(last_accumulator_write, n);
+ } else {
+add_barrier_deps(n);
+ }
   }
 
   /* write-after-write deps. */
@@ -861,7 +867,7 @@ fs_instruction_scheduler::calculate_deps()
  } else {
 last_fixed_grf_write = n;
  }
-  } else if (inst->dst.is_accumulator()) {
+  } else if (inst->dst.is_accumulator() && gen6plus) {
  add_dep(last_accumulator_write, n);
  last_accumulator_write = n;
   } else if (inst->dst.file != BAD_FILE &&
@@ -882,8 +888,12 @@ fs_instruction_scheduler::calculate_deps()
   }
 
   if (inst->writes_accumulator) {
- add_dep(last_accumulator_write, n);
- last_accumulator_write = n;
+ if (gen6plus) {
+add_dep(last_accumulator_write, n);
+last_accumulator_write = n;
+ } else {
+add_barrier_deps(n);
+ }
   }
}
 
@@ -923,7 +933,7 @@ fs_instruction_scheduler::calculate_deps()
 } else {
add_dep(n, last_fixed_grf_write);
 }
- } else if (inst->src[i].is_accumulator()) {
+ } else if (inst->src[i].is_accumulator() && gen6plus) {
 add_dep(n, last_accumulator_write);
  } else if (inst->src[i].file != BAD_FILE &&
inst->src[i].file != IMM &&
@@ -948,7 +958,11 @@ fs_instruction_scheduler::calculate_deps()
   }
 
   if (inst->reads_accumulator_implicitly()) {
- add_dep(n, last_accumulator_write);
+ if (gen6plus) {
+add_dep(n, last_accumulator_write);
+ } else {
+add_barrier_deps(n);
+ }
   }
 
   /* Update the things this instruction wrote, so earlier reads
@@ -982,7 +996,7 @@ fs_instruction_scheduler::calculate_deps()
  } else {
 last_fixed_grf_write = n;
  }
-  } else if (inst->dst.is_accumulator()) {
+  } else if (inst->dst.is_accumulator() && gen6plus) {
  last_accumulator_write = n;
   } else if (inst->dst.file != BAD_FILE &&
  !inst->dst.is_null()) {
@@ -1000,7 +1014,11 @@ fs_instruction_scheduler::calculate_deps()
   }
 
   if (inst->writes_accumulator) {
- last_accumulator_write = n;
+ if (gen6plus) {
+last_accumulator_write = n;
+ } else {
+add_barrier_deps(n);
+ }
   }
}
 }
@@ -1008,6 +1026,8 @@ fs_instruction_scheduler::calculate_deps()
 void
 vec4_instruction_scheduler::calculate_deps()
 {
+   const bool gen6plus = v->brw->gen >= 6;
+
schedule_node *last_grf_write[grf_count];
schedule_node *last_mrf_write[BRW_MAX_MRF];
schedule_node *last_conditional_mod = NULL;
@@ -1047,7 +1067,7 @@ vec4_instruction_scheduler::calculate_deps()
 (inst->src[i].fixed_hw_reg.file ==
  BRW_GENERAL_REGISTER_FILE)) {
 add_dep(last_fixed_grf_write, n);
- } else if (inst->src[i].is_accumulator()) {
+ } else if (inst->src[i].is_accumulator() && gen6plus) {
 assert(last_accumulator_write);
 add_dep(last_accumulator_write, n);
  } else if (inst->src[i].file != BAD_FILE &&
@@ -1074,8 +1094,12 @@ vec4_instruction_scheduler::calculate_deps()
   }
 
  

[Mesa-dev] state tracker texture sizing fun

2014-04-09 Thread Dave Airlie
So I was looking at adding ARB_texture_query_levels support to
gallium, and hit a bit of a saga in the state tracker texture
finalising code.

commits involved in this are below,

So to fix the query levels test I essentially wanted to revert
529b7b355d392b1534ccd8ff7b428dc21cbfdc21 so that the hw was programmed
with the correct last levels and the query tests would pass,

I then did an llvmpipe piglit run, and found two major regressions,
texture arrays broke and getteximage broke,

arrays are broken simply because st_texture_image_copy is broken for
arrays, that seems like a not insane fix,

however getteximage is broken because of Cooper's commit. The test
sets GL_NEAREST for everything, then uploads a number of layers into
the textures with TexImage2D and reads them back out with GetTexImage;
due to that sampler check we fail completely. I think we should drop
Cooper's change: it lacks justification, and it ties the texture backing
store and samplers together a bit much for my liking,

Dave.

commit 529b7b355d392b1534ccd8ff7b428dc21cbfdc21
Author: Brian Paul bri...@vmware.com
Date:   Mon May 3 13:04:29 2010 -0600

st/mesa: restore original last_layer comparison

Commit e648d4a1d1c0c5f70916e38366b863f0bec79a62 changed the original
less-than test to a not-equal test.  This was an effort to save some
memory by switching the texture layout to a non-mipmapped layout when
we mis-guessed about the original layout (thus saving some memory).

However, this causes us to hit a new (apparently broken) code path
when copying the old texture's data to the new texture.  Simply
undo this change for the time being until the other/new bug is fixed.

Fixes fd.o bug 27933.

commit e648d4a1d1c0c5f70916e38366b863f0bec79a62
Author: Brian Paul bri...@vmware.com
Date:   Thu Apr 29 15:32:36 2010 -0600

st/mesa: ignore gl_texture_object::BaseLevel when allocating gallium texture

Previously, when we created a gallium texture for a corresponding Mesa
    texture we'd only allocate space for mipmap levels >= BaseLevel.

This patch undoes that mechanism.  This fixes a render-to-texture bug
when rendering to level 0 when BaseLevel=1.

Also, it makes sense to allocate the whole texture object memory when
    BaseLevel > 0 since a common use of GL_TEXTURE_BASE_LEVEL is to
progressively load/render mipmaps.  Eventually, the app almost always
fills in the level=0 mipmap image.

Finally, the texture image code is bit easier to understand now.

ae2daacbac7242938cffe0e2409071e030e00863
Author: Cooper Yuan coopery...@gmail.com
Date:   Thu Oct 1 17:54:27 2009 +0800

st/mesa: fix non-mipmap lastLevel calculation.

reviewed by Brian Paul.


[Mesa-dev] [PATCH] st/mesa: fix sampler_view REALLOC/FREE macro mix-up

2014-04-09 Thread Brian Paul
We were using REALLOC() from u_memory.h but FREE() from imports.h.
This mismatch caused us to trash the heap on Windows after we
deleted a texture object.

This fixes a regression from commit 6c59be7776e4d.
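
To illustrate the class of bug, here is a minimal sketch of why an
allocation must be released by the matching free. It rests on an assumption
the commit message does not confirm: that one allocator hides a bookkeeping
header in front of each block. debug_malloc(), debug_free() and live_blocks
are made-up names, not the u_memory.h implementation.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical header stored in front of every block; the union keeps the
// user pointer suitably aligned.
union block_header {
   std::size_t size;
   std::max_align_t align;
};

int live_blocks = 0; // number of blocks currently outstanding

// Allocate size bytes and return a pointer just past the hidden header.
inline void *debug_malloc(std::size_t size)
{
   block_header *h = static_cast<block_header *>(
      std::malloc(sizeof(block_header) + size));
   if (!h)
      return nullptr;
   h->size = size;
   live_blocks++;
   return h + 1;
}

// Step back to the header before handing the block to the C library.
// Passing a debug_malloc() pointer to plain free() would skip this step
// and give free() an address that malloc() never returned.
inline void debug_free(void *ptr)
{
   if (!ptr)
      return;
   block_header *h = static_cast<block_header *>(ptr) - 1;
   live_blocks--;
   std::free(h);
}
```

Routing the release through a helper that sits next to the allocation, as
st_texture_free_sampler_views() does in the patch, keeps both halves of the
pair on the same definition.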
---
 src/mesa/state_tracker/st_cb_texture.c |2 +-
 src/mesa/state_tracker/st_texture.c|   12 
 src/mesa/state_tracker/st_texture.h|3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index 353415b..304dc91 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -155,7 +155,7 @@ st_DeleteTextureObject(struct gl_context *ctx,
 
    pipe_resource_reference(&stObj->pt, NULL);
    st_texture_release_all_sampler_views(stObj);
-   FREE(stObj->sampler_views);
+   st_texture_free_sampler_views(stObj);
_mesa_delete_texture_object(ctx, texObj);
 }
 
diff --git a/src/mesa/state_tracker/st_texture.c 
b/src/mesa/state_tracker/st_texture.c
index 8d559df..cfa0605 100644
--- a/src/mesa/state_tracker/st_texture.c
+++ b/src/mesa/state_tracker/st_texture.c
@@ -483,3 +483,15 @@ st_texture_release_all_sampler_views(struct 
st_texture_object *stObj)
    for (i = 0; i < stObj->num_sampler_views; ++i)
       pipe_sampler_view_reference(&stObj->sampler_views[i], NULL);
 }
+
+
+void
+st_texture_free_sampler_views(struct st_texture_object *stObj)
+{
+   /* NOTE:
+* We use FREE() here to match REALLOC() above.  Both come from
+* u_memory.h, not imports.h.  If we mis-match MALLOC/FREE from
+* those two headers we can trash the heap.
+*/
+   FREE(stObj->sampler_views);
+}
diff --git a/src/mesa/state_tracker/st_texture.h 
b/src/mesa/state_tracker/st_texture.h
index 87de9f9..f2afaf1 100644
--- a/src/mesa/state_tracker/st_texture.h
+++ b/src/mesa/state_tracker/st_texture.h
@@ -241,4 +241,7 @@ st_texture_release_sampler_view(struct st_context *st,
 extern void
 st_texture_release_all_sampler_views(struct st_texture_object *stObj);
 
+void
+st_texture_free_sampler_views(struct st_texture_object *stObj);
+
 #endif
-- 
1.7.10.4



[Mesa-dev] [PATCH 3/5] mesa: s/FREE/free/ in vdpau code

2014-04-09 Thread Brian Paul
---
 src/mesa/main/vdpau.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/vdpau.c b/src/mesa/main/vdpau.c
index c2cf206..d974593 100644
--- a/src/mesa/main/vdpau.c
+++ b/src/mesa/main/vdpau.c
@@ -88,7 +88,7 @@ unregister_surface(struct set_entry *entry)
}
 
    _mesa_set_remove(ctx->vdpSurfaces, entry);
-   FREE(surf);
+   free(surf);
 }
 
 void GLAPIENTRY
@@ -145,7 +145,7 @@ register_surface(struct gl_context *ctx, GLboolean isOutput,
 
   if (tex->Immutable) {
  _mesa_unlock_texture(ctx, tex);
- FREE(surf);
+ free(surf);
  _mesa_error(ctx, GL_INVALID_OPERATION,
  "VDPAURegisterSurfaceNV(texture is immutable)");
  return (GLintptr)NULL;
@@ -155,7 +155,7 @@ register_surface(struct gl_context *ctx, GLboolean isOutput,
  tex->Target = target;
   else if (tex->Target != target) {
  _mesa_unlock_texture(ctx, tex);
- FREE(surf);
+ free(surf);
  _mesa_error(ctx, GL_INVALID_OPERATION,
  "VDPAURegisterSurfaceNV(target mismatch)");
  return (GLintptr)NULL;
@@ -254,7 +254,7 @@ _mesa_VDPAUUnregisterSurfaceNV(GLintptr surface)
}
 
    _mesa_set_remove(ctx->vdpSurfaces, entry);
-   FREE(surf);
+   free(surf);
 }
 
 void GLAPIENTRY
-- 
1.7.10.4



[Mesa-dev] [PATCH 4/5] xlib: s/FREE/free/

2014-04-09 Thread Brian Paul
---
 src/mesa/drivers/x11/xm_api.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/x11/xm_api.c b/src/mesa/drivers/x11/xm_api.c
index 4779595..d860569 100644
--- a/src/mesa/drivers/x11/xm_api.c
+++ b/src/mesa/drivers/x11/xm_api.c
@@ -855,7 +855,7 @@ XMesaVisual XMesaCreateVisual( XMesaDisplay *display,
 accum_red_size, accum_green_size,
 accum_blue_size, accum_alpha_size,
 0)) {
-  FREE(v);
+  free(v);
   return NULL;
}
 
-- 
1.7.10.4



[Mesa-dev] [PATCH 2/5] mesa: s/FREE/free/ in _mesa_free_errors_data()

2014-04-09 Thread Brian Paul
---
 src/mesa/main/errors.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 9151718..d80fda0 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -980,7 +980,7 @@ _mesa_free_errors_data(struct gl_context *ctx)
   for (i = 0; i <= ctx->Debug->GroupStackDepth; i++) {
  free_errors_data(ctx, i);
   }
-  FREE(ctx->Debug);
+  free(ctx->Debug);
   /* set to NULL just in case it is used before context is completely gone. */
   ctx->Debug = NULL;
}
-- 
1.7.10.4



[Mesa-dev] [PATCH 1/5] mesa: use malloc/free instead of MALLOC/FREE in attrib stack code

2014-04-09 Thread Brian Paul
We moved away from MALLOC/FREE in the rest of core Mesa a while ago.
---
 src/mesa/main/attrib.c |   20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 5a626f2..c656845 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -217,7 +217,7 @@ push_attrib(struct gl_context *ctx, struct gl_attrib_node 
**head,
 {
void *attribute;
 
-   attribute = MALLOC(attr_size);
+   attribute = malloc(attr_size);
if (attribute == NULL) {
   _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushAttrib);
   return false;
@@ -227,7 +227,7 @@ push_attrib(struct gl_context *ctx, struct gl_attrib_node 
**head,
   memcpy(attribute, attr_data, attr_size);
}
else {
-  FREE(attribute);
+  free(attribute);
   _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushAttrib);
   return false;
}
@@ -277,7 +277,7 @@ _mesa_PushAttrib(GLbitfield mask)
 attr-DrawBuffer[i] = ctx-DrawBuffer-ColorDrawBuffer[i];
   }
   else {
- FREE(attr);
+ free(attr);
  _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushAttrib);
  goto end;
   }
@@ -374,7 +374,7 @@ _mesa_PushAttrib(GLbitfield mask)
   attr-FragmentProgram = ctx-FragmentProgram.Enabled;
 
   if (!save_attrib_data(head, GL_ENABLE_BIT, attr)) {
- FREE(attr);
+ free(attr);
  _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushAttrib);
  goto end;
   }
@@ -440,7 +440,7 @@ _mesa_PushAttrib(GLbitfield mask)
  attr-ReadBuffer = ctx-ReadBuffer-ColorReadBuffer;
   }
   else {
- FREE(attr);
+ free(attr);
  _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushAttrib);
  goto end;
   }
@@ -491,7 +491,7 @@ _mesa_PushAttrib(GLbitfield mask)
   }
 
   if (!save_attrib_data(head, GL_TEXTURE_BIT, texstate)) {
- FREE(texstate);
+ free(texstate);
  _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushAttrib(GL_TEXTURE_BIT));
  goto end;
   }
@@ -1626,7 +1626,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
   }
   else {
  _mesa_error( ctx, GL_OUT_OF_MEMORY, glPushClientAttrib );
- FREE(attr);
+ free(attr);
  goto end;
   }
 
@@ -1642,7 +1642,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
   }
   else {
  _mesa_error( ctx, GL_OUT_OF_MEMORY, glPushClientAttrib );
- FREE(attr);
+ free(attr);
  goto end;
}
}
@@ -1656,7 +1656,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
   }
 
   if (!init_array_attrib_data(ctx, attr)) {
- FREE(attr);
+ free(attr);
  goto end;
   }
 
@@ -1666,7 +1666,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
   else {
  free_array_attrib_data(ctx, attr);
  _mesa_error(ctx, GL_OUT_OF_MEMORY, glPushClientAttrib);
- FREE(attr);
+ free(attr);
  /* goto to keep safe from possible later changes */
  goto end;
   }
-- 
1.7.10.4



[Mesa-dev] [PATCH 5/5] mesa: remove the MALLOC, CALLOC and FREE macros

2014-04-09 Thread Brian Paul
No longer used anywhere.  These also caused trouble in the Gallium
state tracker code where we include both core Mesa and Gallium util
headers (and the macros were defined differently in each world.)
Removing these macros should help avoid macro mix-ups in the future.
---
 src/mesa/main/imports.h |6 --
 1 file changed, 6 deletions(-)

diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
index 9e221cc..17a9bd0 100644
--- a/src/mesa/main/imports.h
+++ b/src/mesa/main/imports.h
@@ -49,16 +49,10 @@ extern "C" {
 /** Memory macros */
 /*@{*/
 
-/** Allocate \p BYTES bytes */
-#define MALLOC(BYTES)  malloc(BYTES)
-/** Allocate and zero \p BYTES bytes */
-#define CALLOC(BYTES)  calloc(1, BYTES)
 /** Allocate a structure of type \p T */
 #define MALLOC_STRUCT(T)   (struct T *) malloc(sizeof(struct T))
 /** Allocate and zero a structure of type \p T */
 #define CALLOC_STRUCT(T)   (struct T *) calloc(1, sizeof(struct T))
-/** Free memory */
-#define FREE(PTR)  free(PTR)
 
 /*@}*/
 
-- 
1.7.10.4



[Mesa-dev] [PATCH 1/4] glxinfo: Print XFB, TBO, and UBO limits

2014-04-09 Thread Fredrik Höglund
---
 src/xdemos/glxinfo.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index a116e4a..a77e808 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -659,6 +659,28 @@ print_limits(const char *extensions, const char *oglstring)
   { 1, GL_MAX_COLOR_ATTACHMENTS, GL_MAX_COLOR_ATTACHMENTS, 
GL_ARB_framebuffer_object },
   { 1, GL_MAX_SAMPLES, GL_MAX_SAMPLES, GL_ARB_framebuffer_object },
 #endif
+#if defined (GL_EXT_transform_feedback)
+ { 1, GL_MAX_TRANSFORM_FEEDBACK_BUFFERS, 
GL_MAX_TRANSFORM_FEEDBACK_BUFFERS, GL_EXT_transform_feedback },
+ { 1, GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS_EXT, 
GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS, GL_EXT_transform_feedback 
},
+ { 1, GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS_EXT, 
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS, GL_EXT_transform_feedback, },
+ { 1, GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS_EXT, 
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS, GL_EXT_transform_feedback },
+#endif
+#if defined (GL_ARB_texture_buffer_object)
+  { 1, GL_TEXTURE_BUFFER_OFFSET_ALIGNMENT, 
GL_TEXTURE_BUFFER_OFFSET_ALIGNMENT, GL_ARB_texture_buffer_object },
+  { 1, GL_MAX_TEXTURE_BUFFER_SIZE, GL_MAX_TEXTURE_BUFFER_SIZE, 
GL_ARB_texture_buffer_object },
+#endif
+#if defined (GL_ARB_uniform_buffer_object)
+  { 1, GL_MAX_VERTEX_UNIFORM_BLOCKS, GL_MAX_VERTEX_UNIFORM_BLOCKS, 
GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_FRAGMENT_UNIFORM_BLOCKS, GL_MAX_FRAGMENT_UNIFORM_BLOCKS, 
GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_GEOMETRY_UNIFORM_BLOCKS, GL_MAX_GEOMETRY_UNIFORM_BLOCKS , 
GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_COMBINED_UNIFORM_BLOCKS, GL_MAX_COMBINED_UNIFORM_BLOCKS, 
GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_UNIFORM_BUFFER_BINDINGS, GL_MAX_UNIFORM_BUFFER_BINDINGS, 
GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_UNIFORM_BLOCK_SIZE, GL_MAX_UNIFORM_BLOCK_SIZE, 
GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS, 
GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS, GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS, 
GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS, GL_ARB_uniform_buffer_object },
+  { 1, GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS, 
GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS, GL_ARB_uniform_buffer_object },
+  { 1, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, 
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, GL_ARB_uniform_buffer_object },
+#endif
   { 0, (GLenum) 0, NULL, NULL }
};
GLint i, max[2];
-- 
1.8.5.3



[Mesa-dev] [PATCH 4/4] glxinfo: Remove the ARB suffixes from core enums

2014-04-09 Thread Fredrik Höglund
The suffix is only removed from the printed names in case someone
wants to build glxinfo against an old implementation.
---
 src/xdemos/glxinfo.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index f8a4e51..6e00dd3 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -538,20 +538,20 @@ static void
 print_shader_limits(GLenum target)
 {
static const struct token_name vertex_limits[] = {
-  { GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB, "GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB" },
-  { GL_MAX_VARYING_FLOATS_ARB, "GL_MAX_VARYING_FLOATS_ARB" },
-  { GL_MAX_VERTEX_ATTRIBS_ARB, "GL_MAX_VERTEX_ATTRIBS_ARB" },
-  { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS_ARB" },
-  { GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB" },
-  { GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS_ARB" },
-  { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS_ARB" },
+  { GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB, "GL_MAX_VERTEX_UNIFORM_COMPONENTS" },
+  { GL_MAX_VARYING_FLOATS_ARB, "GL_MAX_VARYING_FLOATS" },
+  { GL_MAX_VERTEX_ATTRIBS_ARB, "GL_MAX_VERTEX_ATTRIBS" },
+  { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS" },
+  { GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS" },
+  { GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS" },
+  { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS" },
   { GL_MAX_VERTEX_OUTPUT_COMPONENTS  , "GL_MAX_VERTEX_OUTPUT_COMPONENTS"   },
   { (GLenum) 0, NULL }
};
static const struct token_name fragment_limits[] = {
-  { GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB, "GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB" },
-  { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS_ARB" },
-  { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS_ARB" },
+  { GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB, "GL_MAX_FRAGMENT_UNIFORM_COMPONENTS" },
+  { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS" },
+  { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS" },
   { GL_MAX_FRAGMENT_INPUT_COMPONENTS , "GL_MAX_FRAGMENT_INPUT_COMPONENTS"  },
   { (GLenum) 0, NULL }
};
@@ -567,12 +567,12 @@ print_shader_limits(GLenum target)
 
switch (target) {
case GL_VERTEX_SHADER:
-  printf("GL_VERTEX_SHADER_ARB:\n");
+  printf("GL_VERTEX_SHADER:\n");
   print_shader_limit_list(vertex_limits);
   break;
 
case GL_FRAGMENT_SHADER:
-  printf("GL_FRAGMENT_SHADER_ARB:\n");
+  printf("GL_FRAGMENT_SHADER:\n");
   print_shader_limit_list(fragment_limits);
   break;
 
@@ -637,22 +637,22 @@ print_limits(const char *extensions, const char *oglstring)
   { 2, GL_ALIASED_POINT_SIZE_RANGE, "GL_ALIASED_POINT_SIZE_RANGE", NULL },
   { 2, GL_SMOOTH_POINT_SIZE_RANGE, "GL_SMOOTH_POINT_SIZE_RANGE", NULL },
 #if defined(GL_ARB_texture_cube_map)
-  { 1, GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB, "GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB", "GL_ARB_texture_cube_map" },
+  { 1, GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB, "GL_MAX_CUBE_MAP_TEXTURE_SIZE", "GL_ARB_texture_cube_map" },
 #endif
 #if defined(GL_NV_texture_rectangle)
-  { 1, GL_MAX_RECTANGLE_TEXTURE_SIZE_NV, "GL_MAX_RECTANGLE_TEXTURE_SIZE_NV", "GL_NV_texture_rectangle" },
+  { 1, GL_MAX_RECTANGLE_TEXTURE_SIZE_NV, "GL_MAX_RECTANGLE_TEXTURE_SIZE", "GL_NV_texture_rectangle" },
 #endif
 #if defined(GL_ARB_multitexture)
-  { 1, GL_MAX_TEXTURE_UNITS_ARB, "GL_MAX_TEXTURE_UNITS_ARB", "GL_ARB_multitexture" },
+  { 1, GL_MAX_TEXTURE_UNITS_ARB, "GL_MAX_TEXTURE_UNITS", "GL_ARB_multitexture" },
 #endif
 #if defined(GL_EXT_texture_lod_bias)
-  { 1, GL_MAX_TEXTURE_LOD_BIAS_EXT, "GL_MAX_TEXTURE_LOD_BIAS_EXT", "GL_EXT_texture_lod_bias" },
+  { 1, GL_MAX_TEXTURE_LOD_BIAS_EXT, "GL_MAX_TEXTURE_LOD_BIAS", "GL_EXT_texture_lod_bias" },
 #endif
 #if defined(GL_EXT_texture_filter_anisotropic)
   { 1, GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, "GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT", "GL_EXT_texture_filter_anisotropic" },
 #endif
 #if defined(GL_ARB_draw_buffers)
-  { 1, GL_MAX_DRAW_BUFFERS_ARB, "GL_MAX_DRAW_BUFFERS_ARB", "GL_ARB_draw_buffers" },
+  { 1, GL_MAX_DRAW_BUFFERS_ARB, "GL_MAX_DRAW_BUFFERS", "GL_ARB_draw_buffers" },
 #endif
 #if defined(GL_ARB_blend_func_extended)
   { 1, GL_MAX_DUAL_SOURCE_DRAW_BUFFERS, "GL_MAX_DUAL_SOURCE_DRAW_BUFFERS", "GL_ARB_blend_func_extended" },
-- 
1.8.5.3



[Mesa-dev] [PATCH 2/4] glxinfo: Print GL_ARB_vertex_attrib_binding limits

2014-04-09 Thread Fredrik Höglund
---
 src/xdemos/glxinfo.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index a77e808..f97ba3e 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -681,6 +681,11 @@ print_limits(const char *extensions, const char *oglstring)
   { 1, GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS, "GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS", "GL_ARB_uniform_buffer_object" },
   { 1, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, "GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT", "GL_ARB_uniform_buffer_object" },
 #endif
+#if defined (GL_ARB_vertex_attrib_binding)
+  { 1, GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, "GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "GL_ARB_vertex_attrib_binding" },
+  { 1, GL_MAX_VERTEX_ATTRIB_STRIDE, "GL_MAX_VERTEX_ATTRIB_STRIDE", "GL_ARB_vertex_attrib_binding" },
+  { 1, GL_MAX_VERTEX_ATTRIB_BINDINGS, "GL_MAX_VERTEX_ATTRIB_BINDINGS", "GL_ARB_vertex_attrib_binding" },
+#endif
   { 0, (GLenum) 0, NULL, NULL }
};
GLint i, max[2];
-- 
1.8.5.3



[Mesa-dev] [PATCH 3/4] glxinfo: Print GL_EXT_texture_array limits

2014-04-09 Thread Fredrik Höglund
---
 src/xdemos/glxinfo.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index f97ba3e..f8a4e51 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -628,6 +628,9 @@ print_limits(const char *extensions, const char *oglstring)
   { 1, GL_MAX_TEXTURE_STACK_DEPTH, "GL_MAX_TEXTURE_STACK_DEPTH", NULL },
   { 1, GL_MAX_TEXTURE_SIZE, "GL_MAX_TEXTURE_SIZE", NULL },
   { 1, GL_MAX_3D_TEXTURE_SIZE, "GL_MAX_3D_TEXTURE_SIZE", NULL },
+#if defined(GL_EXT_texture_array)
+  { 1, GL_MAX_ARRAY_TEXTURE_LAYERS_EXT, "GL_MAX_ARRAY_TEXTURE_LAYERS", "GL_EXT_texture_array" },
+#endif
   { 2, GL_MAX_VIEWPORT_DIMS, "GL_MAX_VIEWPORT_DIMS", NULL },
   { 2, GL_ALIASED_LINE_WIDTH_RANGE, "GL_ALIASED_LINE_WIDTH_RANGE", NULL },
   { 2, GL_SMOOTH_LINE_WIDTH_RANGE, "GL_SMOOTH_LINE_WIDTH_RANGE", NULL },
-- 
1.8.5.3



Re: [Mesa-dev] [PATCH 1/5] mesa: use malloc/free instead of MALLOC/FREE in attrib stack code

2014-04-09 Thread Kenneth Graunke
On 04/09/2014 06:39 PM, Brian Paul wrote:
> We moved away from MALLOC/FREE in the rest of core Mesa a while ago.
> ---
>  src/mesa/main/attrib.c |   20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)

Series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org






[Mesa-dev] [PATCH 2/2] mesa/st: set min/max texture gather offset to driver-reported value

2014-04-09 Thread Ilia Mirkin
It was always getting set to -8/7 unconditionally.  Use the
driver-reported value instead.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
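To make the intent concrete, a minimal sketch of the pattern, assuming simplified stand-in types: `example_screen`, `example_cap` and `example_constants` are hypothetical and only mimic the shape of `pipe_screen`, `pipe_cap` and the context constants. The driver callback returns -32/31, the range mentioned for nvc0 in patch 1/2.

```c
/* Hypothetical stand-ins; not the real gallium or Mesa definitions. */
enum example_cap {
   EXAMPLE_CAP_MIN_TEXTURE_GATHER_OFFSET,
   EXAMPLE_CAP_MAX_TEXTURE_GATHER_OFFSET
};

struct example_screen {
   int (*get_param)(struct example_screen *screen, enum example_cap cap);
};

struct example_constants {
   int MinProgramTextureGatherOffset;
   int MaxProgramTextureGatherOffset;
};

/* A driver callback reporting a wider range than -8/7. */
static int
example_driver_get_param(struct example_screen *screen, enum example_cap cap)
{
   (void) screen;
   switch (cap) {
   case EXAMPLE_CAP_MIN_TEXTURE_GATHER_OFFSET:
      return -32;
   case EXAMPLE_CAP_MAX_TEXTURE_GATHER_OFFSET:
      return 31;
   }
   return 0;
}

/* Copy the driver-reported limits instead of leaving fixed defaults. */
static void
example_init_limits(struct example_screen *screen, struct example_constants *c)
{
   c->MinProgramTextureGatherOffset =
      screen->get_param(screen, EXAMPLE_CAP_MIN_TEXTURE_GATHER_OFFSET);
   c->MaxProgramTextureGatherOffset =
      screen->get_param(screen, EXAMPLE_CAP_MAX_TEXTURE_GATHER_OFFSET);
}
```

With constants preset to -8/7, calling `example_init_limits` with this callback would overwrite them with -32/31.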
 src/mesa/state_tracker/st_extensions.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 845d29c..673a855 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -275,6 +275,9 @@ void st_init_limits(struct st_context *st)
    c->MaxProgramTexelOffset = screen->get_param(screen, PIPE_CAP_MAX_TEXEL_OFFSET);
 
    c->MaxProgramTextureGatherComponents = screen->get_param(screen, PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS);
+   c->MinProgramTextureGatherOffset = screen->get_param(screen, PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET);
+   c->MaxProgramTextureGatherOffset = screen->get_param(screen, PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET);
+
    c->UniformBooleanTrue = ~0;
 
    c->MaxTransformFeedbackBuffers =
-- 
1.8.3.2



[Mesa-dev] [PATCH 1/2] gallium: add a way to query min/max texture gather offsets

2014-04-09 Thread Ilia Mirkin
Defaults to providing the same offsets as MIN/MAX_TEXEL_OFFSET. For
nvc0, the offset can be -32/31.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
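As a rough illustration of how most drivers pick up the default: the new cap is added as an extra case label on the existing texel-offset case, so both queries return the same value. The enum below is a hypothetical stand-in for the `PIPE_CAP_*` names, not the real `p_defines.h` values, and `sketch_get_param` is not any driver's actual function.

```c
/* Hypothetical stand-ins for the PIPE_CAP_* enums. */
enum sketch_cap {
   SKETCH_CAP_MIN_TEXEL_OFFSET,
   SKETCH_CAP_MAX_TEXEL_OFFSET,
   SKETCH_CAP_MIN_TEXTURE_GATHER_OFFSET,
   SKETCH_CAP_MAX_TEXTURE_GATHER_OFFSET
};

static int
sketch_get_param(enum sketch_cap cap)
{
   switch (cap) {
   /* The gather cap shares the return of the texel-offset case. */
   case SKETCH_CAP_MIN_TEXTURE_GATHER_OFFSET:
   case SKETCH_CAP_MIN_TEXEL_OFFSET:
      return -8;
   case SKETCH_CAP_MAX_TEXTURE_GATHER_OFFSET:
   case SKETCH_CAP_MAX_TEXEL_OFFSET:
      return 7;
   }
   return 0;
}
```

A driver with a wider gather range would give the gather caps their own case and return values instead of sharing the texel-offset ones.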
 src/gallium/docs/source/screen.rst   | 4 
 src/gallium/drivers/freedreno/freedreno_screen.c | 2 ++
 src/gallium/drivers/i915/i915_screen.c   | 2 ++
 src/gallium/drivers/ilo/ilo_screen.c | 2 ++
 src/gallium/drivers/llvmpipe/lp_screen.c | 2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 4 
 src/gallium/drivers/r300/r300_screen.c   | 2 ++
 src/gallium/drivers/r600/r600_pipe.c | 2 ++
 src/gallium/drivers/radeonsi/si_pipe.c   | 2 ++
 src/gallium/drivers/svga/svga_screen.c   | 2 ++
 src/gallium/include/pipe/p_defines.h | 2 ++
 13 files changed, 30 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 943d880..5c255d0 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -193,6 +193,10 @@ The integer capabilities:
   for buffers.
 * ``PIPE_CAP_TEXTURE_QUERY_LOD``: Whether the ``LODQ`` instruction is
   supported.
+* ``PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET``: The minimum offset that can be used
+  in conjunction with a texture gather opcode.
+* ``PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET``: The maximum offset that can be used
+  in conjunction with a texture gather opcode.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index 96c769e..08556a4 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -240,9 +240,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_QUERY_TIMESTAMP:
return 0;
 
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MIN_TEXEL_OFFSET:
return -8;
 
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MAX_TEXEL_OFFSET:
return 7;
 
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index 892c3ea..b484d36 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -253,6 +253,8 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
   return I915_MAX_TEXTURE_2D_LEVELS;
case PIPE_CAP_MIN_TEXEL_OFFSET:
case PIPE_CAP_MAX_TEXEL_OFFSET:
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 7f2e01f..4bea564 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -361,8 +361,10 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
   return true;
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MIN_TEXEL_OFFSET:
   return -8;
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MAX_TEXEL_OFFSET:
   return 7;
case PIPE_CAP_CONDITIONAL_RENDER:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 6eb7d64..8fbc58f 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -176,8 +176,10 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
   return 1;
/* this is a lie could support arbitrary large offsets */
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MIN_TEXEL_OFFSET:
   return -8;
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MAX_TEXEL_OFFSET:
   return 7;
case PIPE_CAP_CONDITIONAL_RENDER:
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index c34b1da..57a2f7d 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -106,6 +106,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_MIN_TEXEL_OFFSET:
case PIPE_CAP_MAX_TEXEL_OFFSET:
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c