Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Christian König

Am 08.08.2013 02:20, schrieb Marek Olšák:

FMASK is bound as a separate texture. For every texture, there can be
an FMASK. Therefore a separate array of resource slots has to be added.

This adds a new mechanism for emitting resource descriptors, its features are:
- resource descriptors are stored in an ordinary buffer (not in a CS)


Having resource descriptors outside of the CS has two problems that we 
need to solve first:


1. Fine grained descriptor updates doesn't work, I already tried that. 
The problem is that unlike previous asics descriptors are now a memory 
block, so no longer part of the CP context. So when we (for example) 
have a draw command executing and the next draw command is using new 
resources for a specific slot we would either block until the first draw 
command is finished (which is bad for performance) or change the 
descriptors while they are still in use (which results in VM faults).


2. If my understand is correct when they are embedded the descriptors 
are preloaded into the caches while executing the IB, so to archive the 
same speed with descriptors outside of the IB you need to add additional 
commands to the constant IB which is new to SI and we currently doesn't 
support in the CS interface.


Regards,
Christian.


- descriptors of disabled resources are set to zeros
- fine-grained resource updates (it can update one resource slot while not
   touching the other slots)
- updates are done with the WRITE_DATA packet
- it implements the si_atom interface for packet emission
- only used for FMASK textures right now

The primary motivation for this is that FMASK textures naturally need
fine-grained resource updates and I also need to query in the shader
if a resource is NULL.
---
  src/gallium/drivers/radeonsi/Makefile.sources  |   1 +
  src/gallium/drivers/radeonsi/r600_hw_context.c |   3 +
  src/gallium/drivers/radeonsi/r600_resource.h   |   1 +
  src/gallium/drivers/radeonsi/r600_texture.c|   1 +
  src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   9 +-
  src/gallium/drivers/radeonsi/radeonsi_pipe.h   |   6 +-
  src/gallium/drivers/radeonsi/radeonsi_pm4.c|   7 +
  src/gallium/drivers/radeonsi/radeonsi_pm4.h|   2 +
  src/gallium/drivers/radeonsi/si_descriptors.c  | 188 +
  src/gallium/drivers/radeonsi/si_state.c|  58 +++-
  src/gallium/drivers/radeonsi/si_state.h|  36 +
  11 files changed, 305 insertions(+), 7 deletions(-)
  create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c

diff --git a/src/gallium/drivers/radeonsi/Makefile.sources 
b/src/gallium/drivers/radeonsi/Makefile.sources
index b3ffa72..68c8282 100644
--- a/src/gallium/drivers/radeonsi/Makefile.sources
+++ b/src/gallium/drivers/radeonsi/Makefile.sources
@@ -10,6 +10,7 @@ C_SOURCES := \
r600_translate.c \
radeonsi_pm4.c \
radeonsi_compute.c \
+   si_descriptors.c \
si_state.c \
si_state_streamout.c \
si_state_draw.c \
diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 7ed7496..b595477 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -289,6 +289,9 @@ void si_context_flush(struct r600_context *ctx, unsigned 
flags)
 * next draw command
 */
si_pm4_reset_emitted(ctx);
+
+   si_sampler_views_begin_new_cs(ctx, 
ctx-fmask_sampler_views[PIPE_SHADER_VERTEX]);
+   si_sampler_views_begin_new_cs(ctx, 
ctx-fmask_sampler_views[PIPE_SHADER_FRAGMENT]);
  }
  
  void si_context_emit_fence(struct r600_context *ctx, struct si_resource *fence_bo, unsigned offset, unsigned value)

diff --git a/src/gallium/drivers/radeonsi/r600_resource.h 
b/src/gallium/drivers/radeonsi/r600_resource.h
index e5dd36a..ab5c7b7 100644
--- a/src/gallium/drivers/radeonsi/r600_resource.h
+++ b/src/gallium/drivers/radeonsi/r600_resource.h
@@ -44,6 +44,7 @@ struct r600_fmask_info {
unsigned offset;
unsigned size;
unsigned alignment;
+   unsigned pitch;
unsigned bank_height;
unsigned slice_tile_max;
unsigned tile_mode_index;
diff --git a/src/gallium/drivers/radeonsi/r600_texture.c 
b/src/gallium/drivers/radeonsi/r600_texture.c
index cd3d1aa..b613564 100644
--- a/src/gallium/drivers/radeonsi/r600_texture.c
+++ b/src/gallium/drivers/radeonsi/r600_texture.c
@@ -463,6 +463,7 @@ static void r600_texture_get_fmask_info(struct r600_screen 
*rscreen,
out-slice_tile_max -= 1;
  
  	out-tile_mode_index = fmask.tiling_index[0];

+   out-pitch = fmask.level[0].nblk_x;
out-bank_height = fmask.bankh;
out-alignment = MAX2(256, fmask.bo_alignment);
out-size = fmask.bo_size;
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
index ad955e3..3112124 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c
+++ 

Re: [Mesa-dev] [PATCH 18/20] radeonsi: port texture improvements from r600g

2013-08-08 Thread Christian König

Am 08.08.2013 02:20, schrieb Marek Olšák:

This started as an attempt to add support for MSAA texture transfers and
MSAA depth-stencil decompression for the DB-CB copy path.
It has gotten a bit out of control, but it's for the greater good.

Some changes do not make much sense, they are there just to make it look
like the other driver.

With a few cosmetic modifications, r600_texture.c can be shared with
a symlink.


The patch itself looks good to me. The original plan was to put that 
stuff under drivers/radeon, and only have the SI specific parts under 
radeonsi.


Didn't worked as intended and we should try to share that code between 
r600 and radeonsi again.


Christian.


---
  src/gallium/drivers/r600/r600_blit.c |   3 -
  src/gallium/drivers/r600/r600_texture.c  |   3 +-
  src/gallium/drivers/radeonsi/r600_blit.c |  53 ++-
  src/gallium/drivers/radeonsi/r600_resource.h |   7 +-
  src/gallium/drivers/radeonsi/r600_texture.c  | 528 ---
  src/gallium/drivers/radeonsi/radeonsi_pipe.c |   8 +-
  src/gallium/drivers/radeonsi/radeonsi_pipe.h |  11 +-
  src/gallium/drivers/radeonsi/si_state.c  |  21 +-
  8 files changed, 365 insertions(+), 269 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_blit.c 
b/src/gallium/drivers/r600/r600_blit.c
index 2230e7b..1c22a75 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -171,9 +171,6 @@ void r600_blit_decompress_depth(struct pipe_context *ctx,
zsurf = ctx-create_surface(ctx, 
texture-resource.b.b, surf_tmpl);
  
  surf_tmpl.format = flushed_depth_texture-resource.b.b.format;

-   surf_tmpl.u.tex.level = level;
-   surf_tmpl.u.tex.first_layer = layer;
-   surf_tmpl.u.tex.last_layer = layer;
cbsurf = ctx-create_surface(ctx,

flushed_depth_texture-resource.b.b, surf_tmpl);
  
diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c

index 36cca17..742e982 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -458,8 +458,7 @@ r600_texture_create_object(struct pipe_screen *screen,
rtex-is_depth = 
util_format_has_depth(util_format_description(rtex-resource.b.b.format));
  
  	rtex-surface = *surface;

-   r = r600_setup_surface(screen, rtex,
-  pitch_in_bytes_override);
+   r = r600_setup_surface(screen, rtex, pitch_in_bytes_override);
if (r) {
FREE(rtex);
return NULL;
diff --git a/src/gallium/drivers/radeonsi/r600_blit.c 
b/src/gallium/drivers/radeonsi/r600_blit.c
index 2db952c..d5ed87a 100644
--- a/src/gallium/drivers/radeonsi/r600_blit.c
+++ b/src/gallium/drivers/radeonsi/r600_blit.c
@@ -102,23 +102,31 @@ static void r600_blitter_end(struct pipe_context *ctx)
r600_context_queries_resume(rctx);
  }
  
-void si_blit_uncompress_depth(struct pipe_context *ctx,

+static unsigned u_max_sample(struct pipe_resource *r)
+{
+   return r-nr_samples ? r-nr_samples - 1 : 0;
+}
+
+void r600_blit_decompress_depth(struct pipe_context *ctx,
struct r600_texture *texture,
struct r600_texture *staging,
unsigned first_level, unsigned last_level,
-   unsigned first_layer, unsigned last_layer)
+   unsigned first_layer, unsigned last_layer,
+   unsigned first_sample, unsigned last_sample)
  {
struct r600_context *rctx = (struct r600_context *)ctx;
-   unsigned layer, level, checked_last_layer, max_layer;
+   unsigned layer, level, sample, checked_last_layer, max_layer, 
max_sample;
float depth = 1.0f;
const struct util_format_description *desc;
-   void *custom_dsa;
+   void **custom_dsa;
struct r600_texture *flushed_depth_texture = staging ?
staging : texture-flushed_depth_texture;
  
  	if (!staging  !texture-dirty_level_mask)

return;
  
+	max_sample = u_max_sample(texture-resource.b.b);

+
desc = 
util_format_description(flushed_depth_texture-resource.b.b.format);
switch (util_format_has_depth(desc) | util_format_has_stencil(desc)  
1) {
default:
@@ -144,30 +152,35 @@ void si_blit_uncompress_depth(struct pipe_context *ctx,
checked_last_layer = last_layer  max_layer ? last_layer : 
max_layer;
  
  		for (layer = first_layer; layer = checked_last_layer; layer++) {

-   struct pipe_surface *zsurf, *cbsurf, surf_tmpl;
+   for (sample = first_sample; sample = last_sample; 
sample++) {
+   struct pipe_surface *zsurf, *cbsurf, surf_tmpl;
  
-			surf_tmpl.format = texture-real_format;

-   surf_tmpl.u.tex.level = level;
-  

Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:
 
 diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
 b/src/gallium/drivers/radeonsi/si_state_draw.c
 index 746ace6..4208fa7 100644
 --- a/src/gallium/drivers/radeonsi/si_state_draw.c
 +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
 @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
 struct si_pipe_shader *s
   /* Last 2 reserved SGPRs are used for VCC */
   num_sgprs = num_user_sgprs + 2;
   }
 + num_sgprs += 1; /* XXX this fixes VM faults */
   assert(num_sgprs = 104);

You can't just increment num_sgprs unconditionally here, or the
assertion will spuriously fail again when num_sgprs was 104 to begin
with.

More fundamentally, I'd really like us to understand why / under what
circumstances the value needs to be incremented at all. You mentioned on
IRC that you were using an LLVM 3.3 tree, is this change still necessary
with LLVM master / trunk? If yes, it would be good to look into at least
some example shaders where the change is really necessary, to try and
understand the circumstances.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:
 
 diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
 b/src/gallium/drivers/radeonsi/r600_hw_context.c
 index 25c972b..382382b 100644
 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
 +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
 @@ -179,6 +179,17 @@ static void r600_flush_framebuffer(struct r600_context 
 *ctx)
   si_pm4_emit(ctx, pm4);
   si_pm4_free_state(ctx, pm4, ~0);
  
 + /* flush CB_META */
 + pm4 = si_pm4_alloc_state(ctx);
 +
 + if (pm4 == NULL)
 + return;
 +
 + si_cmd_flush_and_inv_cb_meta(pm4);
 + si_pm4_emit(ctx, pm4);
 + si_pm4_free_state(ctx, pm4, ~0);
 + ctx-flush_and_inv_cb_meta = false;
 +
   ctx-flags = ~R600_CONTEXT_DST_CACHES_DIRTY;
  }

Can't you add to the existing pm4 here, instead of allocating a new one?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states

2013-08-08 Thread Christian König

Am 08.08.2013 02:20, schrieb Marek Olšák:

It's the same as in r600g. Look how simple it is.


That concept has the problem that we don't necessary know in which order 
the state is emitted.


Why not just add an emit callback to si_pm4_state for the short term 
instead?


For the long term we should probably teach the kernel interface to 
accept a bunch of pointers to smaller IB fragments instead of one large 
IB. That would allow us to not only save the copy of commands for the 
CSO states, but also allows us to emit optional IB fragments that are 
only inserted if the we had a context switch in between.


Christian.


---
  src/gallium/drivers/radeonsi/r600_hw_context.c |  8 
  src/gallium/drivers/radeonsi/radeonsi_pipe.h   |  9 +
  src/gallium/drivers/radeonsi/si_state.h| 10 ++
  src/gallium/drivers/radeonsi/si_state_draw.c   |  8 +++-
  4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 382382b..7ed7496 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -114,9 +114,17 @@ err:
  void si_need_cs_space(struct r600_context *ctx, unsigned num_dw,
boolean count_draw_in)
  {
+   int i;
+
/* The number of dwords we already used in the CS so far. */
num_dw += ctx-cs-cdw;
  
+	for (i = 0; i  ctx-num_atoms; i++) {

+   if (ctx-atoms[i]-dirty) {
+   num_dw += ctx-atoms[i]-num_dw;
+   }
+   }
+
if (count_draw_in) {
/* The number of dwords all the dirty states would take. */
num_dw += ctx-pm4_dirty_cdwords;
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
index e370149..5fa9bdc 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
@@ -145,6 +145,10 @@ struct r600_context {
void*custom_blend_decompress;
struct r600_screen  *screen;
struct radeon_winsys*ws;
+
+   struct si_atom  *atoms[SI_MAX_ATOMS];
+   unsignednum_atoms;
+
struct si_vertex_element*vertex_elements;
struct pipe_framebuffer_state   framebuffer;
unsignedfb_log_samples;
@@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct pipe_screen 
*screen, struct pipe_
return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf);
  }
  
+static INLINE void si_add_atom(struct r600_context *rctx, struct si_atom *atom)

+{
+   rctx-atoms[rctx-num_atoms++] = atom;
+}
+
  #endif
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index b01fbf2..4aabdef 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -29,6 +29,16 @@
  
  #include radeonsi_pm4.h
  
+#define SI_MAX_ATOMS 2

+
+/* This encapsulates a state or an operation which can emitted into the GPU
+ * command stream. */
+struct si_atom {
+   void (*emit)(struct r600_context *ctx, struct si_atom *state);
+   unsignednum_dw;
+   booldirty;
+};
+
  struct si_state_blend {
struct si_pm4_state pm4;
uint32_tcb_target_mask;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 4208fa7..bcae778 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -664,7 +664,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
  {
struct r600_context *rctx = (struct r600_context *)ctx;
struct pipe_index_buffer ib = {};
-   uint32_t cp_coher_cntl;
+   uint32_t cp_coher_cntl, i;
  
  	if (!info-count  (info-indexed || !info-count_from_stream_output))

return;
@@ -728,6 +728,12 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
  
  	si_need_cs_space(rctx, 0, TRUE);
  
+	for (i = 0; i  rctx-num_atoms; i++) {

+   if (rctx-atoms[i]-dirty) {
+   rctx-atoms[i]-emit(rctx, rctx-atoms[i]);
+   }
+   }
+
si_pm4_emit_dirty(rctx);
rctx-pm4_dirty_cdwords = 0;
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Christian König

Am 08.08.2013 10:30, schrieb Michel Dänzer:

On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 746ace6..4208fa7 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
struct si_pipe_shader *s
/* Last 2 reserved SGPRs are used for VCC */
num_sgprs = num_user_sgprs + 2;
}
+   num_sgprs += 1; /* XXX this fixes VM faults */
assert(num_sgprs = 104);

You can't just increment num_sgprs unconditionally here, or the
assertion will spuriously fail again when num_sgprs was 104 to begin
with.

More fundamentally, I'd really like us to understand why / under what
circumstances the value needs to be incremented at all. You mentioned on
IRC that you were using an LLVM 3.3 tree, is this change still necessary
with LLVM master / trunk? If yes, it would be good to look into at least
some example shaders where the change is really necessary, to try and
understand the circumstances.


Additional to that I'm not sure if we should add another sgpr pointer to 
the shader, the docs speak of quite a performance penalty for each. 
Since they are (optional) attributes to the textures anyway they could 
probably be interleaved with the texture resources.


Christian.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/20] MSAA support for Radeon SI and more

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 02:21 +0200, Marek Olšák wrote:
 
 This is MSAA support for the radeonsi driver. It implements MSAA
 rendering, texturing, and colorbuffer compression. The only missing
 feature is the MSAA fast color clear; other than that, it operates at
 maximum performance the hardware is capable of.

Nice!


 There are also some minor changes which are not related to MSAA.

Apart from the issues being discussed, the series looks good to me.


 Also the whole r600_texture.c file from r600g is ported to radeonsi.

Great. As Christian mentioned, we've been meaning to share code between
radeonsi and r600g, feel free to take any opportunities you see for that
anytime.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.

2013-08-08 Thread Eric Anholt
Previously, we were asserting that each driver specified an NConfigOptions
exactly equal to the number of options they supplied, leading to frequent
bugs when people would forget to adjust the value when adjusting driver
options.  Instead, just overallocate the table by a bit and leave sanity
checking to the assert in findOption().
---
 src/gallium/state_trackers/dri/common/dri_screen.c |  5 +--
 src/mesa/drivers/dri/common/dri_util.c |  4 +-
 src/mesa/drivers/dri/common/xmlconfig.c| 44 --
 src/mesa/drivers/dri/common/xmlconfig.h|  4 +-
 src/mesa/drivers/dri/i915/intel_screen.c   |  5 +--
 src/mesa/drivers/dri/i965/intel_screen.c   |  5 +--
 src/mesa/drivers/dri/radeon/radeon_screen.c|  5 +--
 7 files changed, 14 insertions(+), 58 deletions(-)

diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c 
b/src/gallium/state_trackers/dri/common/dri_screen.c
index 3b42b5a..779741e 100644
--- a/src/gallium/state_trackers/dri/common/dri_screen.c
+++ b/src/gallium/state_trackers/dri/common/dri_screen.c
@@ -74,8 +74,6 @@ PUBLIC const char __driConfigOptions[] =
 
 #define false 0
 
-static const uint __driNConfigOptions = 13;
-
 static const __DRIconfig **
 dri_fill_in_modes(struct dri_screen *screen)
 {
@@ -417,8 +415,7 @@ dri_init_screen_helper(struct dri_screen *screen,
else
   screen-target = PIPE_TEXTURE_RECT;
 
-   driParseOptionInfo(screen-optionCacheDefaults,
-  __driConfigOptions, __driNConfigOptions);
+   driParseOptionInfo(screen-optionCacheDefaults, __driConfigOptions);
 
driParseConfigFiles(screen-optionCache,
   screen-optionCacheDefaults,
diff --git a/src/mesa/drivers/dri/common/dri_util.c 
b/src/mesa/drivers/dri/common/dri_util.c
index 9ed9df4..fa520ea 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -52,8 +52,6 @@ PUBLIC const char __dri2ConfigOptions[] =
   DRI_CONF_SECTION_END
DRI_CONF_END;
 
-static const uint __dri2NConfigOptions = 1;
-
 /*/
 /** \name Screen handling functions  */
 /*/
@@ -112,7 +110,7 @@ dri2CreateNewScreen(int scrn, int fd,
return NULL;
 }
 
-driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions, 
__dri2NConfigOptions);
+driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions);
 driParseConfigFiles(psp-optionCache, psp-optionInfo, psp-myNum, 
dri2);
 
 return psp;
diff --git a/src/mesa/drivers/dri/common/xmlconfig.c 
b/src/mesa/drivers/dri/common/xmlconfig.c
index 5c97c20..b95e452 100644
--- a/src/mesa/drivers/dri/common/xmlconfig.c
+++ b/src/mesa/drivers/dri/common/xmlconfig.c
@@ -132,16 +132,6 @@ static GLuint findOption (const driOptionCache *cache, 
const char *name) {
 return hash;
 }
 
-/** \brief Count the real number of options in an option cache */
-static GLuint countOptions (const driOptionCache *cache) {
-GLuint size = 1  cache-tableSize;
-GLuint i, count = 0;
-for (i = 0; i  size; ++i)
-   if (cache-info[i].name)
-   count++;
-return count;
-}
-
 /** \brief Like strdup but using malloc and with error checking. */
 #define XSTRDUP(dest,source) do { \
 GLuint len = strlen (source); \
@@ -685,25 +675,18 @@ static void optInfoEndElem (void *userData, const 
XML_Char *name) {
 }
 }
 
-void driParseOptionInfo (driOptionCache *info,
-const char *configOptions, GLuint nConfigOptions) {
+void driParseOptionInfo (driOptionCache *info, const char *configOptions) {
 XML_Parser p;
 int status;
 struct OptInfoData userData;
 struct OptInfoData *data = userData;
-GLuint realNoptions;
-
-  /* determine hash table size and allocate memory:
-   * 3/2 of the number of options, rounded up, so there remains always
-   * at least one free entry. This is needed for detecting undefined
-   * options in configuration files without getting a hash table overflow.
-   * Round this up to a power of two. */
-GLuint minSize = (nConfigOptions*3 + 1) / 2;
-GLuint size, log2size;
-for (size = 1, log2size = 0; size  minSize; size = 1, ++log2size);
-info-tableSize = log2size;
-info-info = calloc(size, sizeof (driOptionInfo));
-info-values = calloc(size, sizeof (driOptionValue));
+
+/* Make the hash table big enough to fit more than the maximum number of
+ * config options we've ever seen in a driver.
+ */
+info-tableSize = 6;
+info-info = calloc(1  info-tableSize, sizeof (driOptionInfo));
+info-values = calloc(1  info-tableSize, sizeof (driOptionValue));
 if (info-info == NULL || info-values == NULL) {
fprintf (stderr, %s: %d: out of memory.\n, __FILE__, __LINE__);
abort();
@@ -728,17 +711,6 @@ void driParseOptionInfo (driOptionCache *info,
XML_FATAL (%s., 

Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering

2013-08-08 Thread Marek Olšák
Do you mean the surface_sync pm4 or the one in si_state?

Marek

On Thu, Aug 8, 2013 at 10:31 AM, Michel Dänzer mic...@daenzer.net wrote:
 On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:

 diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
 b/src/gallium/drivers/radeonsi/r600_hw_context.c
 index 25c972b..382382b 100644
 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
 +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
 @@ -179,6 +179,17 @@ static void r600_flush_framebuffer(struct r600_context 
 *ctx)
   si_pm4_emit(ctx, pm4);
   si_pm4_free_state(ctx, pm4, ~0);

 + /* flush CB_META */
 + pm4 = si_pm4_alloc_state(ctx);
 +
 + if (pm4 == NULL)
 + return;
 +
 + si_cmd_flush_and_inv_cb_meta(pm4);
 + si_pm4_emit(ctx, pm4);
 + si_pm4_free_state(ctx, pm4, ~0);
 + ctx-flush_and_inv_cb_meta = false;
 +
   ctx-flags = ~R600_CONTEXT_DST_CACHES_DIRTY;
  }

 Can't you add to the existing pm4 here, instead of allocating a new one?


 --
 Earthling Michel Dänzer   |   http://www.amd.com
 Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Marek Olšák
Interleaving might not be a good idea, but they could be in the same
array, like this:

0..15: textures
16..31: FMASK textures

I'll test LLVM master, but we should probably bump the LLVM version
requirement in configure.ac to prevent users from using LLVM 3.3.

Marek

On Thu, Aug 8, 2013 at 10:37 AM, Christian König
deathsim...@vodafone.de wrote:
 Am 08.08.2013 10:30, schrieb Michel Dänzer:

 On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:

 diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
 b/src/gallium/drivers/radeonsi/si_state_draw.c
 index 746ace6..4208fa7 100644
 --- a/src/gallium/drivers/radeonsi/si_state_draw.c
 +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
 @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context
 *ctx, struct si_pipe_shader *s
 /* Last 2 reserved SGPRs are used for VCC */
 num_sgprs = num_user_sgprs + 2;
 }
 +   num_sgprs += 1; /* XXX this fixes VM faults */
 assert(num_sgprs = 104);

 You can't just increment num_sgprs unconditionally here, or the
 assertion will spuriously fail again when num_sgprs was 104 to begin
 with.

 More fundamentally, I'd really like us to understand why / under what
 circumstances the value needs to be incremented at all. You mentioned on
 IRC that you were using an LLVM 3.3 tree, is this change still necessary
 with LLVM master / trunk? If yes, it would be good to look into at least
 some example shaders where the change is really necessary, to try and
 understand the circumstances.


 Additional to that I'm not sure if we should add another sgpr pointer to the
 shader, the docs speak of quite a performance penalty for each. Since they
 are (optional) attributes to the textures anyway they could probably be
 interleaved with the texture resources.

 Christian.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states

2013-08-08 Thread Marek Olšák
Why is the order so important? Note that cache flushes shouldn't be
treated like states and neither should the draw packets and any state
that comes from pipe_draw_info. The only things left are the register
updates and resource updates where the order doesn't matter, does it?
Anyway, the order is determined by the order of si_add_atom calls,
which should be done once in create_context.

Marek

On Thu, Aug 8, 2013 at 10:32 AM, Christian König
deathsim...@vodafone.de wrote:
 Am 08.08.2013 02:20, schrieb Marek Olšák:

 It's the same as in r600g. Look how simple it is.


 That concept has the problem that we don't necessary know in which order the
 state is emitted.

 Why not just add an emit callback to si_pm4_state for the short term
 instead?

 For the long term we should probably teach the kernel interface to accept a
 bunch of pointers to smaller IB fragments instead of one large IB. That
 would allow us to not only save the copy of commands for the CSO states, but
 also allows us to emit optional IB fragments that are only inserted if the
 we had a context switch in between.

 Christian.


 ---
   src/gallium/drivers/radeonsi/r600_hw_context.c |  8 
   src/gallium/drivers/radeonsi/radeonsi_pipe.h   |  9 +
   src/gallium/drivers/radeonsi/si_state.h| 10 ++
   src/gallium/drivers/radeonsi/si_state_draw.c   |  8 +++-
   4 files changed, 34 insertions(+), 1 deletion(-)

 diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c
 b/src/gallium/drivers/radeonsi/r600_hw_context.c
 index 382382b..7ed7496 100644
 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
 +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
 @@ -114,9 +114,17 @@ err:
   void si_need_cs_space(struct r600_context *ctx, unsigned num_dw,
 boolean count_draw_in)
   {
 +   int i;
 +
 /* The number of dwords we already used in the CS so far. */
 num_dw += ctx-cs-cdw;
   + for (i = 0; i  ctx-num_atoms; i++) {
 +   if (ctx-atoms[i]-dirty) {
 +   num_dw += ctx-atoms[i]-num_dw;
 +   }
 +   }
 +
 if (count_draw_in) {
 /* The number of dwords all the dirty states would take.
 */
 num_dw += ctx-pm4_dirty_cdwords;
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 index e370149..5fa9bdc 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 @@ -145,6 +145,10 @@ struct r600_context {
 void*custom_blend_decompress;
 struct r600_screen  *screen;
 struct radeon_winsys*ws;
 +
 +   struct si_atom  *atoms[SI_MAX_ATOMS];
 +   unsignednum_atoms;
 +
 struct si_vertex_element*vertex_elements;
 struct pipe_framebuffer_state   framebuffer;
 unsignedfb_log_samples;
 @@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct
 pipe_screen *screen, struct pipe_
 return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf);
   }
   +static INLINE void si_add_atom(struct r600_context *rctx, struct
 si_atom *atom)
 +{
 +   rctx-atoms[rctx-num_atoms++] = atom;
 +}
 +
   #endif
 diff --git a/src/gallium/drivers/radeonsi/si_state.h
 b/src/gallium/drivers/radeonsi/si_state.h
 index b01fbf2..4aabdef 100644
 --- a/src/gallium/drivers/radeonsi/si_state.h
 +++ b/src/gallium/drivers/radeonsi/si_state.h
 @@ -29,6 +29,16 @@
 #include radeonsi_pm4.h
   +#define SI_MAX_ATOMS 2
 +
 +/* This encapsulates a state or an operation which can emitted into the
 GPU
 + * command stream. */
 +struct si_atom {
 +   void (*emit)(struct r600_context *ctx, struct si_atom *state);
 +   unsignednum_dw;
 +   booldirty;
 +};
 +
   struct si_state_blend {
 struct si_pm4_state pm4;
 uint32_tcb_target_mask;
 diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
 b/src/gallium/drivers/radeonsi/si_state_draw.c
 index 4208fa7..bcae778 100644
 --- a/src/gallium/drivers/radeonsi/si_state_draw.c
 +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
 @@ -664,7 +664,7 @@ void si_draw_vbo(struct pipe_context *ctx, const
 struct pipe_draw_info *info)
   {
 struct r600_context *rctx = (struct r600_context *)ctx;
 struct pipe_index_buffer ib = {};
 -   uint32_t cp_coher_cntl;
 +   uint32_t cp_coher_cntl, i;
 if (!info-count  (info-indexed ||
 !info-count_from_stream_output))
 return;
 @@ -728,6 +728,12 @@ void si_draw_vbo(struct pipe_context *ctx, const
 struct pipe_draw_info *info)
 si_need_cs_space(rctx, 0, TRUE);
   + for (i = 0; i  rctx-num_atoms; i++) {
 +   if (rctx-atoms[i]-dirty) {
 +   rctx-atoms[i]-emit(rctx, rctx-atoms[i]);
 +   }
 

Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Marek Olšák
.On Thu, Aug 8, 2013 at 9:47 AM, Christian König
deathsim...@vodafone.de wrote:
 Am 08.08.2013 02:20, schrieb Marek Olšák:

 FMASK is bound as a separate texture. For every texture, there can be
 an FMASK. Therefore a separate array of resource slots has to be added.

 This adds a new mechanism for emitting resource descriptors, its features
 are:
 - resource descriptors are stored in an ordinary buffer (not in a CS)


 Having resource descriptors outside of the CS has two problems that we need
 to solve first:

 1. Fine grained descriptor updates doesn't work, I already tried that. The
 problem is that unlike previous asics descriptors are now a memory block, so
 no longer part of the CP context. So when we (for example) have a draw
 command executing and the next draw command is using new resources for a
 specific slot we would either block until the first draw command is finished
 (which is bad for performance) or change the descriptors while they are
 still in use (which results in VM faults).

So what would the proper solution be here? Do I need to flush some
caches or would moving the descriptor updates to the constant IB fix
that?


 2. If my understand is correct when they are embedded the descriptors are
 preloaded into the caches while executing the IB, so to archive the same
 speed with descriptors outside of the IB you need to add additional commands
 to the constant IB which is new to SI and we currently doesn't support in
 the CS interface.

There seems to be support for the constant IB. The CS ioctl chunk ID
is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in
si_vm_packet3_ce_check. Is there anything missing?

Marek


 Regards,
 Christian.


 - descriptors of disabled resources are set to zeros
 - fine-grained resource updates (it can update one resource slot while not
touching the other slots)
 - updates are done with the WRITE_DATA packet
 - it implements the si_atom interface for packet emission
 - only used for FMASK textures right now

 The primary motivation for this is that FMASK textures naturally need
 fine-grained resource updates and I also need to query in the shader
 if a resource is NULL.
 ---
   src/gallium/drivers/radeonsi/Makefile.sources  |   1 +
   src/gallium/drivers/radeonsi/r600_hw_context.c |   3 +
   src/gallium/drivers/radeonsi/r600_resource.h   |   1 +
   src/gallium/drivers/radeonsi/r600_texture.c|   1 +
   src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   9 +-
   src/gallium/drivers/radeonsi/radeonsi_pipe.h   |   6 +-
   src/gallium/drivers/radeonsi/radeonsi_pm4.c|   7 +
   src/gallium/drivers/radeonsi/radeonsi_pm4.h|   2 +
   src/gallium/drivers/radeonsi/si_descriptors.c  | 188
 +
   src/gallium/drivers/radeonsi/si_state.c|  58 +++-
   src/gallium/drivers/radeonsi/si_state.h|  36 +
   11 files changed, 305 insertions(+), 7 deletions(-)
   create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c

 diff --git a/src/gallium/drivers/radeonsi/Makefile.sources
 b/src/gallium/drivers/radeonsi/Makefile.sources
 index b3ffa72..68c8282 100644
 --- a/src/gallium/drivers/radeonsi/Makefile.sources
 +++ b/src/gallium/drivers/radeonsi/Makefile.sources
 @@ -10,6 +10,7 @@ C_SOURCES := \
 r600_translate.c \
 radeonsi_pm4.c \
 radeonsi_compute.c \
 +   si_descriptors.c \
 si_state.c \
 si_state_streamout.c \
 si_state_draw.c \
 diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c
 b/src/gallium/drivers/radeonsi/r600_hw_context.c
 index 7ed7496..b595477 100644
 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
 +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
 @@ -289,6 +289,9 @@ void si_context_flush(struct r600_context *ctx,
 unsigned flags)
  * next draw command
  */
 si_pm4_reset_emitted(ctx);
 +
 +   si_sampler_views_begin_new_cs(ctx,
 ctx-fmask_sampler_views[PIPE_SHADER_VERTEX]);
 +   si_sampler_views_begin_new_cs(ctx,
 ctx-fmask_sampler_views[PIPE_SHADER_FRAGMENT]);
   }
 void si_context_emit_fence(struct r600_context *ctx, struct
 si_resource *fence_bo, unsigned offset, unsigned value)
 diff --git a/src/gallium/drivers/radeonsi/r600_resource.h
 b/src/gallium/drivers/radeonsi/r600_resource.h
 index e5dd36a..ab5c7b7 100644
 --- a/src/gallium/drivers/radeonsi/r600_resource.h
 +++ b/src/gallium/drivers/radeonsi/r600_resource.h
 @@ -44,6 +44,7 @@ struct r600_fmask_info {
 unsigned offset;
 unsigned size;
 unsigned alignment;
 +   unsigned pitch;
 unsigned bank_height;
 unsigned slice_tile_max;
 unsigned tile_mode_index;
 diff --git a/src/gallium/drivers/radeonsi/r600_texture.c
 b/src/gallium/drivers/radeonsi/r600_texture.c
 index cd3d1aa..b613564 100644
 --- a/src/gallium/drivers/radeonsi/r600_texture.c
 +++ b/src/gallium/drivers/radeonsi/r600_texture.c
 @@ -463,6 +463,7 @@ static void r600_texture_get_fmask_info(struct
 r600_screen *rscreen,

Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states

2013-08-08 Thread Christian König

Am 08.08.2013 14:02, schrieb Marek Olšák:

Why is the order so important? Note that cache flushes shouldn't be
treated like states and neither should the draw packets and any state
that comes from pipe_draw_info. The only things left are the register
updates and resource updates where the order doesn't matter, does it?


Unfortunately it does. In theory the CP context should prevent us from 
dealing with such details and allow us to emit the resource updates and 
register writes in any order, but AFAIK in practice we have some erratas 
on this (that reminds me to remind Alex to show you the documents on that).



Anyway, the order is determined by the order of si_add_atom calls,
which should be done once in create_context.


Then why don't you just use a constant structure for this?

Christian.



Marek

On Thu, Aug 8, 2013 at 10:32 AM, Christian König
deathsim...@vodafone.de wrote:

Am 08.08.2013 02:20, schrieb Marek Olšák:


It's the same as in r600g. Look how simple it is.


That concept has the problem that we don't necessary know in which order the
state is emitted.

Why not just add an emit callback to si_pm4_state for the short term
instead?

For the long term we should probably teach the kernel interface to accept a
bunch of pointers to smaller IB fragments instead of one large IB. That
would allow us to not only save the copy of commands for the CSO states, but
also allows us to emit optional IB fragments that are only inserted if the
we had a context switch in between.

Christian.



---
   src/gallium/drivers/radeonsi/r600_hw_context.c |  8 
   src/gallium/drivers/radeonsi/radeonsi_pipe.h   |  9 +
   src/gallium/drivers/radeonsi/si_state.h| 10 ++
   src/gallium/drivers/radeonsi/si_state_draw.c   |  8 +++-
   4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 382382b..7ed7496 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -114,9 +114,17 @@ err:
   void si_need_cs_space(struct r600_context *ctx, unsigned num_dw,
 boolean count_draw_in)
   {
+   int i;
+
 /* The number of dwords we already used in the CS so far. */
 num_dw += ctx-cs-cdw;
   + for (i = 0; i  ctx-num_atoms; i++) {
+   if (ctx-atoms[i]-dirty) {
+   num_dw += ctx-atoms[i]-num_dw;
+   }
+   }
+
 if (count_draw_in) {
 /* The number of dwords all the dirty states would take.
*/
 num_dw += ctx-pm4_dirty_cdwords;
diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
index e370149..5fa9bdc 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
@@ -145,6 +145,10 @@ struct r600_context {
 void*custom_blend_decompress;
 struct r600_screen  *screen;
 struct radeon_winsys*ws;
+
+   struct si_atom  *atoms[SI_MAX_ATOMS];
+   unsignednum_atoms;
+
 struct si_vertex_element*vertex_elements;
 struct pipe_framebuffer_state   framebuffer;
 unsignedfb_log_samples;
@@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct
pipe_screen *screen, struct pipe_
 return rscreen-ws-buffer_get_virtual_address(rresource-cs_buf);
   }
   +static INLINE void si_add_atom(struct r600_context *rctx, struct
si_atom *atom)
+{
+   rctx-atoms[rctx-num_atoms++] = atom;
+}
+
   #endif
diff --git a/src/gallium/drivers/radeonsi/si_state.h
b/src/gallium/drivers/radeonsi/si_state.h
index b01fbf2..4aabdef 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -29,6 +29,16 @@
 #include radeonsi_pm4.h
   +#define SI_MAX_ATOMS 2
+
+/* This encapsulates a state or an operation which can emitted into the
GPU
+ * command stream. */
+struct si_atom {
+   void (*emit)(struct r600_context *ctx, struct si_atom *state);
+   unsignednum_dw;
+   booldirty;
+};
+
   struct si_state_blend {
 struct si_pm4_state pm4;
 uint32_tcb_target_mask;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 4208fa7..bcae778 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -664,7 +664,7 @@ void si_draw_vbo(struct pipe_context *ctx, const
struct pipe_draw_info *info)
   {
 struct r600_context *rctx = (struct r600_context *)ctx;
 struct pipe_index_buffer ib = {};
-   uint32_t cp_coher_cntl;
+   uint32_t cp_coher_cntl, i;
 if (!info-count  (info-indexed ||

Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Christian König

Am 08.08.2013 14:38, schrieb Marek Olšák:

.On Thu, Aug 8, 2013 at 9:47 AM, Christian König
deathsim...@vodafone.de wrote:

Am 08.08.2013 02:20, schrieb Marek Olšák:


FMASK is bound as a separate texture. For every texture, there can be
an FMASK. Therefore a separate array of resource slots has to be added.

This adds a new mechanism for emitting resource descriptors, its features
are:
- resource descriptors are stored in an ordinary buffer (not in a CS)


Having resource descriptors outside of the CS has two problems that we need
to solve first:

1. Fine grained descriptor updates doesn't work, I already tried that. The
problem is that unlike previous asics descriptors are now a memory block, so
no longer part of the CP context. So when we (for example) have a draw
command executing and the next draw command is using new resources for a
specific slot we would either block until the first draw command is finished
(which is bad for performance) or change the descriptors while they are
still in use (which results in VM faults).

So what would the proper solution be here? Do I need to flush some
caches or would moving the descriptor updates to the constant IB fix
that?


Actually the current implementation worked better than anything else I 
tried.


When you really need the resource descriptors in a separate buffer you 
need to use one buffer for each draw call and always write the full 
buffer contents (no partial updates). Flushing anything won't really 
help either. The only solution I see using one buffer is to block until 
the last draw call is finished with WAIT_REG_MEM, but that would be 
quite disastrous for performance.



2. If my understand is correct when they are embedded the descriptors are
preloaded into the caches while executing the IB, so to archive the same
speed with descriptors outside of the IB you need to add additional commands
to the constant IB which is new to SI and we currently doesn't support in
the CS interface.

There seems to be support for the constant IB. The CS ioctl chunk ID
is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in
si_vm_packet3_ce_check. Is there anything missing?


The userspace side seems to be missing and except for throwing NOP 
packets into it we never tested it. I know from the closed source side 
that it actually was quite tricky for them to get working.


Additional to that please note that I'm not 100% sure that just putting 
the descriptors into the IB is really helping here. It was just the most 
simplest solution to avoid allocating a new buffer on each draw call.


Cheers,
Christian.


Marek


Regards,
Christian.



- descriptors of disabled resources are set to zeros
- fine-grained resource updates (it can update one resource slot while not
touching the other slots)
- updates are done with the WRITE_DATA packet
- it implements the si_atom interface for packet emission
- only used for FMASK textures right now

The primary motivation for this is that FMASK textures naturally need
fine-grained resource updates and I also need to query in the shader
if a resource is NULL.
---
   src/gallium/drivers/radeonsi/Makefile.sources  |   1 +
   src/gallium/drivers/radeonsi/r600_hw_context.c |   3 +
   src/gallium/drivers/radeonsi/r600_resource.h   |   1 +
   src/gallium/drivers/radeonsi/r600_texture.c|   1 +
   src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   9 +-
   src/gallium/drivers/radeonsi/radeonsi_pipe.h   |   6 +-
   src/gallium/drivers/radeonsi/radeonsi_pm4.c|   7 +
   src/gallium/drivers/radeonsi/radeonsi_pm4.h|   2 +
   src/gallium/drivers/radeonsi/si_descriptors.c  | 188
+
   src/gallium/drivers/radeonsi/si_state.c|  58 +++-
   src/gallium/drivers/radeonsi/si_state.h|  36 +
   11 files changed, 305 insertions(+), 7 deletions(-)
   create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c

diff --git a/src/gallium/drivers/radeonsi/Makefile.sources
b/src/gallium/drivers/radeonsi/Makefile.sources
index b3ffa72..68c8282 100644
--- a/src/gallium/drivers/radeonsi/Makefile.sources
+++ b/src/gallium/drivers/radeonsi/Makefile.sources
@@ -10,6 +10,7 @@ C_SOURCES := \
 r600_translate.c \
 radeonsi_pm4.c \
 radeonsi_compute.c \
+   si_descriptors.c \
 si_state.c \
 si_state_streamout.c \
 si_state_draw.c \
diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 7ed7496..b595477 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -289,6 +289,9 @@ void si_context_flush(struct r600_context *ctx,
unsigned flags)
  * next draw command
  */
 si_pm4_reset_emitted(ctx);
+
+   si_sampler_views_begin_new_cs(ctx,
ctx-fmask_sampler_views[PIPE_SHADER_VERTEX]);
+   si_sampler_views_begin_new_cs(ctx,
ctx-fmask_sampler_views[PIPE_SHADER_FRAGMENT]);
   }
 void 

Re: [Mesa-dev] [PATCH 15/20] radeonsi: add basic infrastructure for atom-based states

2013-08-08 Thread Marek Olšák
On Thu, Aug 8, 2013 at 2:55 PM, Christian König deathsim...@vodafone.de wrote:
 Am 08.08.2013 14:02, schrieb Marek Olšák:

 Why is the order so important? Note that cache flushes shouldn't be
 treated like states and neither should the draw packets and any state
 that comes from pipe_draw_info. The only things left are the register
 updates and resource updates where the order doesn't matter, does it?


 Unfortunately it does. In theory the CP context should prevent us from
 dealing with such details and allow us to emit the resource updates and
 register writes in any order, but AFAIK in practice we have some erratas on
 this (that reminds me to remind Alex to show you the documents on that).


 Anyway, the order is determined by the order of si_add_atom calls,
 which should be done once in create_context.


 Then why don't you just use a constant structure for this?

Yes, it indeed looks like a good idea to do so for clarity now that I
know the order does matter.

Marek


 Christian.



 Marek

 On Thu, Aug 8, 2013 at 10:32 AM, Christian König
 deathsim...@vodafone.de wrote:

 Am 08.08.2013 02:20, schrieb Marek Olšák:

 It's the same as in r600g. Look how simple it is.


 That concept has the problem that we don't necessary know in which order
 the
 state is emitted.

 Why not just add an emit callback to si_pm4_state for the short term
 instead?

 For the long term we should probably teach the kernel interface to accept
 a
 bunch of pointers to smaller IB fragments instead of one large IB. That
 would allow us to not only save the copy of commands for the CSO states,
 but
 also allows us to emit optional IB fragments that are only inserted if
 the
 we had a context switch in between.

 Christian.


 ---
src/gallium/drivers/radeonsi/r600_hw_context.c |  8 
src/gallium/drivers/radeonsi/radeonsi_pipe.h   |  9 +
src/gallium/drivers/radeonsi/si_state.h| 10 ++
src/gallium/drivers/radeonsi/si_state_draw.c   |  8 +++-
4 files changed, 34 insertions(+), 1 deletion(-)

 diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c
 b/src/gallium/drivers/radeonsi/r600_hw_context.c
 index 382382b..7ed7496 100644
 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
 +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
 @@ -114,9 +114,17 @@ err:
void si_need_cs_space(struct r600_context *ctx, unsigned num_dw,
  boolean count_draw_in)
{
 +   int i;
 +
  /* The number of dwords we already used in the CS so far. */
  num_dw += ctx-cs-cdw;
+ for (i = 0; i  ctx-num_atoms; i++) {
 +   if (ctx-atoms[i]-dirty) {
 +   num_dw += ctx-atoms[i]-num_dw;
 +   }
 +   }
 +
  if (count_draw_in) {
  /* The number of dwords all the dirty states would
 take.
 */
  num_dw += ctx-pm4_dirty_cdwords;
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 index e370149..5fa9bdc 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
 @@ -145,6 +145,10 @@ struct r600_context {
  void*custom_blend_decompress;
  struct r600_screen  *screen;
  struct radeon_winsys*ws;
 +
 +   struct si_atom  *atoms[SI_MAX_ATOMS];
 +   unsignednum_atoms;
 +
  struct si_vertex_element*vertex_elements;
  struct pipe_framebuffer_state   framebuffer;
  unsignedfb_log_samples;
 @@ -329,4 +333,9 @@ static INLINE uint64_t r600_resource_va(struct
 pipe_screen *screen, struct pipe_
  return
 rscreen-ws-buffer_get_virtual_address(rresource-cs_buf);
}
+static INLINE void si_add_atom(struct r600_context *rctx, struct
 si_atom *atom)
 +{
 +   rctx-atoms[rctx-num_atoms++] = atom;
 +}
 +
#endif
 diff --git a/src/gallium/drivers/radeonsi/si_state.h
 b/src/gallium/drivers/radeonsi/si_state.h
 index b01fbf2..4aabdef 100644
 --- a/src/gallium/drivers/radeonsi/si_state.h
 +++ b/src/gallium/drivers/radeonsi/si_state.h
 @@ -29,6 +29,16 @@
  #include radeonsi_pm4.h
+#define SI_MAX_ATOMS 2
 +
 +/* This encapsulates a state or an operation which can emitted into the
 GPU
 + * command stream. */
 +struct si_atom {
 +   void (*emit)(struct r600_context *ctx, struct si_atom *state);
 +   unsignednum_dw;
 +   booldirty;
 +};
 +
struct si_state_blend {
  struct si_pm4_state pm4;
  uint32_tcb_target_mask;
 diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
 b/src/gallium/drivers/radeonsi/si_state_draw.c
 index 4208fa7..bcae778 100644
 --- a/src/gallium/drivers/radeonsi/si_state_draw.c
 +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
 @@ -664,7 +664,7 @@ void si_draw_vbo(struct 

Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 13:44 +0200, Marek Olšák wrote:
 Do you mean the surface_sync pm4 or the one in si_state?

I mean this:

diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 25c972b..d9fc151 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -176,6 +176,7 @@ static void r600_flush_framebuffer(struct r600_context *ctx)
S_0085F0_CB7_DEST_BASE_ENA(1) |
S_0085F0_DB_ACTION_ENA(1) |
S_0085F0_DB_DEST_BASE_ENA(1));
+   si_cmd_flush_and_inv_cb_meta(pm4);
si_pm4_emit(ctx, pm4);
si_pm4_free_state(ctx, pm4, ~0);


 On Thu, Aug 8, 2013 at 10:31 AM, Michel Dänzer mic...@daenzer.net wrote:
  On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:
 
  diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
  b/src/gallium/drivers/radeonsi/r600_hw_context.c
  index 25c972b..382382b 100644
  --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
  +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
  @@ -179,6 +179,17 @@ static void r600_flush_framebuffer(struct 
  r600_context *ctx)
si_pm4_emit(ctx, pm4);
si_pm4_free_state(ctx, pm4, ~0);
 
  + /* flush CB_META */
  + pm4 = si_pm4_alloc_state(ctx);
  +
  + if (pm4 == NULL)
  + return;
  +
  + si_cmd_flush_and_inv_cb_meta(pm4);
  + si_pm4_emit(ctx, pm4);
  + si_pm4_free_state(ctx, pm4, ~0);
  + ctx-flush_and_inv_cb_meta = false;
  +
ctx-flags = ~R600_CONTEXT_DST_CACHES_DIRTY;
   }
 
  Can't you add to the existing pm4 here, instead of allocating a new one?
 
 
  --
  Earthling Michel Dänzer   |   http://www.amd.com
  Libre software enthusiast |  Debian, X and DRI developer
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 13:51 +0200, Marek Olšák wrote:
 I'll test LLVM master, but we should probably bump the LLVM version
 requirement in configure.ac to prevent users from using LLVM 3.3.

That might be a good idea for the Mesa release after 9.2 anyway, but
another possibility might be to only enable features requiring LLVM
changes when Mesa is built against LLVM 3.4 or newer (there are already
some examples of that, guarded by #if HAVE_LLVM = 0x0304), and fixing
any problems with running against LLVM 3.3.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Marek Olšák
On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de wrote:
 Am 08.08.2013 14:38, schrieb Marek Olšák:

 .On Thu, Aug 8, 2013 at 9:47 AM, Christian König
 deathsim...@vodafone.de wrote:

 Am 08.08.2013 02:20, schrieb Marek Olšák:

 FMASK is bound as a separate texture. For every texture, there can be
 an FMASK. Therefore a separate array of resource slots has to be added.

 This adds a new mechanism for emitting resource descriptors, its
 features
 are:
 - resource descriptors are stored in an ordinary buffer (not in a CS)


 Having resource descriptors outside of the CS has two problems that we
 need
 to solve first:

 1. Fine grained descriptor updates doesn't work, I already tried that.
 The
 problem is that unlike previous asics descriptors are now a memory block,
 so
 no longer part of the CP context. So when we (for example) have a draw
 command executing and the next draw command is using new resources for a
 specific slot we would either block until the first draw command is
 finished
 (which is bad for performance) or change the descriptors while they are
 still in use (which results in VM faults).

 So what would the proper solution be here? Do I need to flush some
 caches or would moving the descriptor updates to the constant IB fix
 that?


 Actually the current implementation worked better than anything else I
 tried.

 When you really need the resource descriptors in a separate buffer you need
 to use one buffer for each draw call and always write the full buffer
 contents (no partial updates). Flushing anything won't really help either.
 The only solution I see using one buffer is to block until the last draw
 call is finished with WAIT_REG_MEM, but that would be quite disastrous for
 performance.


 2. If my understand is correct when they are embedded the descriptors are
 preloaded into the caches while executing the IB, so to archive the same
 speed with descriptors outside of the IB you need to add additional
 commands
 to the constant IB which is new to SI and we currently doesn't support in
 the CS interface.

 There seems to be support for the constant IB. The CS ioctl chunk ID
 is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in
 si_vm_packet3_ce_check. Is there anything missing?


 The userspace side seems to be missing and except for throwing NOP packets
 into it we never tested it. I know from the closed source side that it
 actually was quite tricky for them to get working.

 Additional to that please note that I'm not 100% sure that just putting the
 descriptors into the IB is really helping here. It was just the most
 simplest solution to avoid allocating a new buffer on each draw call.

I understand. I don't really need to have resource descriptors in a
separate buffer, all I need is these 3 basic features a gallium driver
should support:
- fine-grained resource updates (mainly for performance, see below)
- ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0)
- no GPU crash if a shader is using SAMPLER[15] but there are no samplers bound

FYI, partial sampler view and sampler state updates are coming to
gallium, Brian Paul already has some patches, it's just a matter of
time now. Vertex and constant buffer states already support partial
updates.

Marek


 Cheers,
 Christian.

 Marek

 Regards,
 Christian.


 - descriptors of disabled resources are set to zeros
 - fine-grained resource updates (it can update one resource slot while
 not
 touching the other slots)
 - updates are done with the WRITE_DATA packet
 - it implements the si_atom interface for packet emission
 - only used for FMASK textures right now

 The primary motivation for this is that FMASK textures naturally need
 fine-grained resource updates and I also need to query in the shader
 if a resource is NULL.
 ---
src/gallium/drivers/radeonsi/Makefile.sources  |   1 +
src/gallium/drivers/radeonsi/r600_hw_context.c |   3 +
src/gallium/drivers/radeonsi/r600_resource.h   |   1 +
src/gallium/drivers/radeonsi/r600_texture.c|   1 +
src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   9 +-
src/gallium/drivers/radeonsi/radeonsi_pipe.h   |   6 +-
src/gallium/drivers/radeonsi/radeonsi_pm4.c|   7 +
src/gallium/drivers/radeonsi/radeonsi_pm4.h|   2 +
src/gallium/drivers/radeonsi/si_descriptors.c  | 188
 +
src/gallium/drivers/radeonsi/si_state.c|  58 +++-
src/gallium/drivers/radeonsi/si_state.h|  36 +
11 files changed, 305 insertions(+), 7 deletions(-)
create mode 100644 src/gallium/drivers/radeonsi/si_descriptors.c

 diff --git a/src/gallium/drivers/radeonsi/Makefile.sources
 b/src/gallium/drivers/radeonsi/Makefile.sources
 index b3ffa72..68c8282 100644
 --- a/src/gallium/drivers/radeonsi/Makefile.sources
 +++ b/src/gallium/drivers/radeonsi/Makefile.sources
 @@ -10,6 +10,7 @@ C_SOURCES := \
  r600_translate.c \
  radeonsi_pm4.c \
  

Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Tom Stellard
On Thu, Aug 08, 2013 at 01:51:39PM +0200, Marek Olšák wrote:
 Interleaving might not be a good idea, but they could be in the same
 array, like this:
 
 0..15: textures
 16..31: FMASK textures
 
 I'll test LLVM master, but we should probably bump the LLVM version
 requirement in configure.ac to prevent users from using LLVM 3.3.
 

What is it that works with LLVM master, but doesn't work with LLVM 3.3?

-Tom

 Marek
 
 On Thu, Aug 8, 2013 at 10:37 AM, Christian König
 deathsim...@vodafone.de wrote:
  Am 08.08.2013 10:30, schrieb Michel Dänzer:
 
  On Don, 2013-08-08 at 02:20 +0200, Marek Olšák wrote:
 
  diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
  b/src/gallium/drivers/radeonsi/si_state_draw.c
  index 746ace6..4208fa7 100644
  --- a/src/gallium/drivers/radeonsi/si_state_draw.c
  +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
  @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context
  *ctx, struct si_pipe_shader *s
  /* Last 2 reserved SGPRs are used for VCC */
  num_sgprs = num_user_sgprs + 2;
  }
  +   num_sgprs += 1; /* XXX this fixes VM faults */
  assert(num_sgprs = 104);
 
  You can't just increment num_sgprs unconditionally here, or the
  assertion will spuriously fail again when num_sgprs was 104 to begin
  with.
 
  More fundamentally, I'd really like us to understand why / under what
  circumstances the value needs to be incremented at all. You mentioned on
  IRC that you were using an LLVM 3.3 tree, is this change still necessary
  with LLVM master / trunk? If yes, it would be good to look into at least
  some example shaders where the change is really necessary, to try and
  understand the circumstances.
 
 
  Additional to that I'm not sure if we should add another sgpr pointer to the
  shader, the docs speak of quite a performance penalty for each. Since they
  are (optional) attributes to the textures anyway they could probably be
  interleaved with the texture resources.
 
  Christian.
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Tom Stellard
On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote:
 ---
  src/gallium/drivers/radeonsi/radeonsi_shader.c |  7 ++--
  src/gallium/drivers/radeonsi/radeonsi_shader.h | 58 
 ++
  src/gallium/drivers/radeonsi/si_state_draw.c   |  1 +
  3 files changed, 36 insertions(+), 30 deletions(-)
 
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
 b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 index 18dde61..2806045 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
 +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 @@ -1209,7 +1209,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
  {
   struct lp_build_tgsi_context *bld_base = 
 si_shader_ctx-radeon_bld.soa.bld_base;
   struct gallivm_state *gallivm = bld_base-base.gallivm;
 - LLVMTypeRef params[20], f32, i8, i32, v2i32, v3i32;
 + LLVMTypeRef params[SI_MAX_PARAMS], f32, i8, i32, v2i32, v3i32;
   unsigned i;
  
   i8 = LLVMInt8TypeInContext(gallivm-context);
 @@ -1221,6 +1221,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
   params[SI_PARAM_CONST] = LLVMPointerType(LLVMVectorType(i8, 16), 
 CONST_ADDR_SPACE);
   params[SI_PARAM_SAMPLER] = params[SI_PARAM_CONST];
   params[SI_PARAM_RESOURCE] = LLVMPointerType(LLVMVectorType(i8, 32), 
 CONST_ADDR_SPACE);
 + params[SI_PARAM_FMASK_RESOURCE] = params[SI_PARAM_RESOURCE];
  
   if (si_shader_ctx-type == TGSI_PROCESSOR_VERTEX) {
   params[SI_PARAM_VERTEX_BUFFER] = params[SI_PARAM_SAMPLER];
 @@ -1229,7 +1230,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
   params[SI_PARAM_DUMMY_0] = i32;
   params[SI_PARAM_DUMMY_1] = i32;
   params[SI_PARAM_INSTANCE_ID] = i32;
 - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 9);
 + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 10);
  
   } else {
   params[SI_PARAM_PRIM_MASK] = i32;
 @@ -1249,7 +1250,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
   params[SI_PARAM_ANCILLARY] = f32;
   params[SI_PARAM_SAMPLE_COVERAGE] = f32;
   params[SI_PARAM_POS_FIXED_PT] = f32;
 - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 20);
 + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 21);
   }
  
   radeon_llvm_shader_type(si_shader_ctx-radeon_bld.main_fn, 
 si_shader_ctx-type);
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h 
 b/src/gallium/drivers/radeonsi/radeonsi_shader.h
 index 2ce34b9..836b144 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
 +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
 @@ -34,43 +34,47 @@
  #define SI_SGPR_CONST0
  #define SI_SGPR_SAMPLER  2
  #define SI_SGPR_RESOURCE 4
 -#define SI_SGPR_VERTEX_BUFFER6
 -#define SI_SGPR_START_INSTANCE   8
 +#define SI_SGPR_FMASK_RESOURCE   6
 +#define SI_SGPR_VERTEX_BUFFER8
 +#define SI_SGPR_START_INSTANCE   10
  
 -#define SI_VS_NUM_USER_SGPR  9
 -#define SI_PS_NUM_USER_SGPR  6
 +#define SI_VS_NUM_USER_SGPR  11
 +#define SI_PS_NUM_USER_SGPR  8
  
  /* LLVM function parameter indices */
  #define SI_PARAM_CONST   0
  #define SI_PARAM_SAMPLER 1
  #define SI_PARAM_RESOURCE2
 +#define SI_PARAM_FMASK_RESOURCE  3
  
  /* VS only parameters */
 -#define SI_PARAM_VERTEX_BUFFER   3
 -#define SI_PARAM_START_INSTANCE  4
 -#define SI_PARAM_VERTEX_ID   5
 -#define SI_PARAM_DUMMY_0 6
 -#define SI_PARAM_DUMMY_1 7
 -#define SI_PARAM_INSTANCE_ID 8
 +#define SI_PARAM_VERTEX_BUFFER   4
 +#define SI_PARAM_START_INSTANCE  5
 +#define SI_PARAM_VERTEX_ID   6
 +#define SI_PARAM_DUMMY_0 7
 +#define SI_PARAM_DUMMY_1 8
 +#define SI_PARAM_INSTANCE_ID 9
  
  /* PS only parameters */
 -#define SI_PARAM_PRIM_MASK   3
 -#define SI_PARAM_PERSP_SAMPLE4
 -#define SI_PARAM_PERSP_CENTER5
 -#define SI_PARAM_PERSP_CENTROID  6
 -#define SI_PARAM_PERSP_PULL_MODEL7
 -#define SI_PARAM_LINEAR_SAMPLE   8
 -#define SI_PARAM_LINEAR_CENTER   9
 -#define SI_PARAM_LINEAR_CENTROID 10
 -#define SI_PARAM_LINE_STIPPLE_TEX11
 -#define SI_PARAM_POS_X_FLOAT 12
 -#define SI_PARAM_POS_Y_FLOAT 13
 -#define SI_PARAM_POS_Z_FLOAT 14
 -#define SI_PARAM_POS_W_FLOAT 15
 -#define SI_PARAM_FRONT_FACE  16
 -#define SI_PARAM_ANCILLARY   17
 -#define SI_PARAM_SAMPLE_COVERAGE 18
 -#define SI_PARAM_POS_FIXED_PT19
 +#define SI_PARAM_PRIM_MASK   4
 +#define SI_PARAM_PERSP_SAMPLE5
 +#define SI_PARAM_PERSP_CENTER6
 +#define SI_PARAM_PERSP_CENTROID  7
 +#define SI_PARAM_PERSP_PULL_MODEL8
 +#define SI_PARAM_LINEAR_SAMPLE   9
 +#define 

Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Tom Stellard
On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote:
 ---
  src/gallium/drivers/radeonsi/radeonsi_shader.c |  7 ++--
  src/gallium/drivers/radeonsi/radeonsi_shader.h | 58 
 ++
  src/gallium/drivers/radeonsi/si_state_draw.c   |  1 +
  3 files changed, 36 insertions(+), 30 deletions(-)
 
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
 b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 index 18dde61..2806045 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
 +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
 @@ -1209,7 +1209,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
  {
   struct lp_build_tgsi_context *bld_base = 
 si_shader_ctx-radeon_bld.soa.bld_base;
   struct gallivm_state *gallivm = bld_base-base.gallivm;
 - LLVMTypeRef params[20], f32, i8, i32, v2i32, v3i32;
 + LLVMTypeRef params[SI_MAX_PARAMS], f32, i8, i32, v2i32, v3i32;
   unsigned i;
  
   i8 = LLVMInt8TypeInContext(gallivm-context);
 @@ -1221,6 +1221,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
   params[SI_PARAM_CONST] = LLVMPointerType(LLVMVectorType(i8, 16), 
 CONST_ADDR_SPACE);
   params[SI_PARAM_SAMPLER] = params[SI_PARAM_CONST];
   params[SI_PARAM_RESOURCE] = LLVMPointerType(LLVMVectorType(i8, 32), 
 CONST_ADDR_SPACE);
 + params[SI_PARAM_FMASK_RESOURCE] = params[SI_PARAM_RESOURCE];
  
   if (si_shader_ctx-type == TGSI_PROCESSOR_VERTEX) {
   params[SI_PARAM_VERTEX_BUFFER] = params[SI_PARAM_SAMPLER];
 @@ -1229,7 +1230,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
   params[SI_PARAM_DUMMY_0] = i32;
   params[SI_PARAM_DUMMY_1] = i32;
   params[SI_PARAM_INSTANCE_ID] = i32;
 - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 9);
 + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 10);
  
   } else {
   params[SI_PARAM_PRIM_MASK] = i32;
 @@ -1249,7 +1250,7 @@ static void create_function(struct si_shader_context 
 *si_shader_ctx)
   params[SI_PARAM_ANCILLARY] = f32;
   params[SI_PARAM_SAMPLE_COVERAGE] = f32;
   params[SI_PARAM_POS_FIXED_PT] = f32;
 - radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 20);
 + radeon_llvm_create_func(si_shader_ctx-radeon_bld, params, 21);
   }
  
   radeon_llvm_shader_type(si_shader_ctx-radeon_bld.main_fn, 
 si_shader_ctx-type);
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h 
 b/src/gallium/drivers/radeonsi/radeonsi_shader.h
 index 2ce34b9..836b144 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
 +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
 @@ -34,43 +34,47 @@
  #define SI_SGPR_CONST0
  #define SI_SGPR_SAMPLER  2
  #define SI_SGPR_RESOURCE 4
 -#define SI_SGPR_VERTEX_BUFFER6
 -#define SI_SGPR_START_INSTANCE   8
 +#define SI_SGPR_FMASK_RESOURCE   6
 +#define SI_SGPR_VERTEX_BUFFER8
 +#define SI_SGPR_START_INSTANCE   10
  
 -#define SI_VS_NUM_USER_SGPR  9
 -#define SI_PS_NUM_USER_SGPR  6
 +#define SI_VS_NUM_USER_SGPR  11
 +#define SI_PS_NUM_USER_SGPR  8
  
  /* LLVM function parameter indices */
  #define SI_PARAM_CONST   0
  #define SI_PARAM_SAMPLER 1
  #define SI_PARAM_RESOURCE2
 +#define SI_PARAM_FMASK_RESOURCE  3
  
  /* VS only parameters */
 -#define SI_PARAM_VERTEX_BUFFER   3
 -#define SI_PARAM_START_INSTANCE  4
 -#define SI_PARAM_VERTEX_ID   5
 -#define SI_PARAM_DUMMY_0 6
 -#define SI_PARAM_DUMMY_1 7
 -#define SI_PARAM_INSTANCE_ID 8
 +#define SI_PARAM_VERTEX_BUFFER   4
 +#define SI_PARAM_START_INSTANCE  5
 +#define SI_PARAM_VERTEX_ID   6
 +#define SI_PARAM_DUMMY_0 7
 +#define SI_PARAM_DUMMY_1 8
 +#define SI_PARAM_INSTANCE_ID 9
  
  /* PS only parameters */
 -#define SI_PARAM_PRIM_MASK   3
 -#define SI_PARAM_PERSP_SAMPLE4
 -#define SI_PARAM_PERSP_CENTER5
 -#define SI_PARAM_PERSP_CENTROID  6
 -#define SI_PARAM_PERSP_PULL_MODEL7
 -#define SI_PARAM_LINEAR_SAMPLE   8
 -#define SI_PARAM_LINEAR_CENTER   9
 -#define SI_PARAM_LINEAR_CENTROID 10
 -#define SI_PARAM_LINE_STIPPLE_TEX11
 -#define SI_PARAM_POS_X_FLOAT 12
 -#define SI_PARAM_POS_Y_FLOAT 13
 -#define SI_PARAM_POS_Z_FLOAT 14
 -#define SI_PARAM_POS_W_FLOAT 15
 -#define SI_PARAM_FRONT_FACE  16
 -#define SI_PARAM_ANCILLARY   17
 -#define SI_PARAM_SAMPLE_COVERAGE 18
 -#define SI_PARAM_POS_FIXED_PT19
 +#define SI_PARAM_PRIM_MASK   4
 +#define SI_PARAM_PERSP_SAMPLE5
 +#define SI_PARAM_PERSP_CENTER6
 +#define SI_PARAM_PERSP_CENTROID  7
 +#define SI_PARAM_PERSP_PULL_MODEL8
 +#define SI_PARAM_LINEAR_SAMPLE   9
 +#define 

Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 08:02 -0700, Tom Stellard wrote:
 On Thu, Aug 08, 2013 at 01:51:39PM +0200, Marek Olšák wrote:
  Interleaving might not be a good idea, but they could be in the same
  array, like this:
  
  0..15: textures
  16..31: FMASK textures
  
  I'll test LLVM master, but we should probably bump the LLVM version
  requirement in configure.ac to prevent users from using LLVM 3.3.
  
 
 What is it that works with LLVM master, but doesn't work with LLVM 3.3?

Apart from MSAA, at least (looking at #if HAVE_LLVM = 0x0304 guards in
radeonsi_shader.c) TGSI_OPCODE_DDX/Y and TGSI_OPCODE_TXD.

So, I do wonder if it's worth allowing the R600 backend from LLVM 3.3
for the Mesa release after 9.2?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallivm: set non-existing values really to zero in size queries for d3d10

2013-08-08 Thread sroland
From: Roland Scheidegger srol...@vmware.com

My previous attempt at doing so double-failed miserably (minification of
zero still gives one, and even if it would not the value was never written
anyway).
While here also rename the confusingly named int_vec bld as we have int vecs
of different sizes, and rename need_nr_mips (as this also changes out-of-bounds
behavior) to is_sviewinfo too.
---
 src/gallium/auxiliary/draw/draw_llvm_sample.c |4 +--
 src/gallium/auxiliary/gallivm/lp_bld_sample.h |2 +-
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |   34 ++---
 src/gallium/drivers/llvmpipe/lp_tex_sample.c  |4 +--
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c 
b/src/gallium/auxiliary/draw/draw_llvm_sample.c
index f10cba3..97b0255 100644
--- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
+++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
@@ -271,7 +271,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct 
lp_build_sampler_soa *base,
   struct lp_type type,
   unsigned texture_unit,
   unsigned target,
-  boolean need_nr_mips,
+  boolean is_sviewinfo,
   boolean scalar_lod,
   LLVMValueRef explicit_lod, /* optional */
   LLVMValueRef *sizes_out)
@@ -286,7 +286,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct 
lp_build_sampler_soa *base,
type,
texture_unit,
target,
-   need_nr_mips,
+   is_sviewinfo,
scalar_lod,
explicit_lod,
sizes_out);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h 
b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
index db3ea1d..75e8c59 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
@@ -498,7 +498,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
 struct lp_type int_type,
 unsigned texture_unit,
 unsigned target,
-boolean need_nr_mips,
+boolean is_viewinfo,
 boolean scalar_lod,
 LLVMValueRef explicit_lod,
 LLVMValueRef *sizes_out);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index e403ac8..65d6e7b 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -1944,7 +1944,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
 struct lp_type int_type,
 unsigned texture_unit,
 unsigned target,
-boolean need_nr_mips,
+boolean is_sviewinfo,
 boolean scalar_lod,
 LLVMValueRef explicit_lod,
 LLVMValueRef *sizes_out)
@@ -1954,7 +1954,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
int dims, i;
boolean has_array;
unsigned num_lods = 1;
-   struct lp_build_context bld_int_vec;
+   struct lp_build_context bld_int_vec4;
 
/*
 * Do some sanity verification about bound texture and shader dcl target.
@@ -1997,24 +1997,19 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
 
assert(!int_type.floating);
 
-   lp_build_context_init(bld_int_vec, gallivm, lp_type_int_vec(32, 128));
+   lp_build_context_init(bld_int_vec4, gallivm, lp_type_int_vec(32, 128));
 
if (explicit_lod) {
   /* FIXME: this needs to honor per-element lod */
   lod = LLVMBuildExtractElement(gallivm-builder, explicit_lod, 
lp_build_const_int32(gallivm, 0), );
   first_level = dynamic_state-first_level(dynamic_state, gallivm, 
texture_unit);
   level = LLVMBuildAdd(gallivm-builder, lod, first_level, level);
-  lod = lp_build_broadcast_scalar(bld_int_vec, level);
+  lod = lp_build_broadcast_scalar(bld_int_vec4, level);
} else {
-  lod = bld_int_vec.zero;
+  lod = bld_int_vec4.zero;
}
 
-   if (need_nr_mips) {
-  size = bld_int_vec.zero;
-   }
-   else {
-  size = bld_int_vec.undef;
-   }
+   size = bld_int_vec4.undef;
 
size = LLVMBuildInsertElement(gallivm-builder, size,
  dynamic_state-width(dynamic_state, gallivm, 
texture_unit),
@@ -2032,7 +2027,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
 lp_build_const_int32(gallivm, 2), );
}
 
-   size = 

Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Michel Dänzer
On Don, 2013-08-08 at 08:00 -0700, Tom Stellard wrote:
 On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote:
  
  diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
  b/src/gallium/drivers/radeonsi/si_state_draw.c
  index 746ace6..4208fa7 100644
  --- a/src/gallium/drivers/radeonsi/si_state_draw.c
  +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
  @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
  struct si_pipe_shader *s
  /* Last 2 reserved SGPRs are used for VCC */
  num_sgprs = num_user_sgprs + 2;
  }
  +   num_sgprs += 1; /* XXX this fixes VM faults */
 
 One problem is that the compiler is under reporting the number of SGPRs,
 when there are unused USER_SGPRs in the shader.  It should always be
 reporting a number greater than or equal to the number of USER_SGPRs.

That itself shouldn't be a problem, as the radeonsi code ensures this.


 I think Michel mentioned this earlier, but there may also be a problem
 with the way we determine usage of the VCC register in the shader, maybe
 it is being used for more instructions than we realize.

I was just worrying that AMDGPUAsmPrinter::EmitProgramInfoSI() might not
take implicit access to VCC into account, but Christian confirmed it
should. AFAICT we should be encoding any such implicit access correctly
(except for V_DIV_SCALE_F* and V_DIV_FMAS_F*, which aren't used yet).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Christian König

Am 08.08.2013 16:33, schrieb Marek Olšák:

On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de wrote:

Am 08.08.2013 14:38, schrieb Marek Olšák:


.On Thu, Aug 8, 2013 at 9:47 AM, Christian König
deathsim...@vodafone.de wrote:

Am 08.08.2013 02:20, schrieb Marek Olšák:


FMASK is bound as a separate texture. For every texture, there can be
an FMASK. Therefore a separate array of resource slots has to be added.

This adds a new mechanism for emitting resource descriptors, its
features
are:
- resource descriptors are stored in an ordinary buffer (not in a CS)


Having resource descriptors outside of the CS has two problems that we
need
to solve first:

1. Fine grained descriptor updates doesn't work, I already tried that.
The
problem is that unlike previous asics descriptors are now a memory block,
so
no longer part of the CP context. So when we (for example) have a draw
command executing and the next draw command is using new resources for a
specific slot we would either block until the first draw command is
finished
(which is bad for performance) or change the descriptors while they are
still in use (which results in VM faults).

So what would the proper solution be here? Do I need to flush some
caches or would moving the descriptor updates to the constant IB fix
that?


Actually the current implementation worked better than anything else I
tried.

When you really need the resource descriptors in a separate buffer you need
to use one buffer for each draw call and always write the full buffer
contents (no partial updates). Flushing anything won't really help either..
The only solution I see using one buffer is to block until the last draw
call is finished with WAIT_REG_MEM, but that would be quite disastrous for
performance.



2. If my understand is correct when they are embedded the descriptors are
preloaded into the caches while executing the IB, so to archive the same
speed with descriptors outside of the IB you need to add additional
commands
to the constant IB which is new to SI and we currently doesn't support in
the CS interface.

There seems to be support for the constant IB. The CS ioctl chunk ID
is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in
si_vm_packet3_ce_check. Is there anything missing?


The userspace side seems to be missing and except for throwing NOP packets
into it we never tested it. I know from the closed source side that it
actually was quite tricky for them to get working.

Additional to that please note that I'm not 100% sure that just putting the
descriptors into the IB is really helping here. It was just the most
simplest solution to avoid allocating a new buffer on each draw call.

I understand. I don't really need to have resource descriptors in a
separate buffer, all I need is these 3 basic features a gallium driver
should support:
- fine-grained resource updates (mainly for performance, see below)
- ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0)
- no GPU crash if a shader is using SAMPLER[15] but there are no samplers bound

FYI, partial sampler view and sampler state updates are coming to
gallium, Brian Paul already has some patches, it's just a matter of
time now. Vertex and constant buffer states already support partial
updates.


That shouldn't be to much off a problem.

Just allocate a state at startup and initialize it with the proper pm4 
commands for 16 samplers, then update the resource descriptors in that 
state when we change the bound 
textures/samplers/views/constants/whatever. All we need to do then is 
setting the emitted state to NULL so that it gets re-emitted in the next 
draw command.


Christian.


Marek


Cheers,
Christian.


Marek


Regards,
Christian.



- descriptors of disabled resources are set to zeros
- fine-grained resource updates (it can update one resource slot while
not
 touching the other slots)
- updates are done with the WRITE_DATA packet
- it implements the si_atom interface for packet emission
- only used for FMASK textures right now

The primary motivation for this is that FMASK textures naturally need
fine-grained resource updates and I also need to query in the shader
if a resource is NULL.
---
src/gallium/drivers/radeonsi/Makefile.sources  |   1 +
src/gallium/drivers/radeonsi/r600_hw_context.c |   3 +
src/gallium/drivers/radeonsi/r600_resource.h   |   1 +
src/gallium/drivers/radeonsi/r600_texture.c|   1 +
src/gallium/drivers/radeonsi/radeonsi_pipe.c   |   9 +-
src/gallium/drivers/radeonsi/radeonsi_pipe.h   |   6 +-
src/gallium/drivers/radeonsi/radeonsi_pm4.c|   7 +
src/gallium/drivers/radeonsi/radeonsi_pm4.h|   2 +
src/gallium/drivers/radeonsi/si_descriptors.c  | 188
+
src/gallium/drivers/radeonsi/si_state.c|  58 +++-
src/gallium/drivers/radeonsi/si_state.h|  36 +
11 files changed, 305 insertions(+), 7 deletions(-)
create mode 100644 

[Mesa-dev] [Bug 67910] New: make check fail on lp_test_arit test

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67910

  Priority: medium
Bug ID: 67910
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: make check fail on lp_test_arit test
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: lordhea...@gmail.com
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

* llvm-3.4svn r187791
* mesa-git trunk 43076a55c2c914a619b8b3a3f93e15a7eee6d2ad

--output of 'make check'--
make[5]: Entering directory
`/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe'
PASS: lp_test_format
FAIL: lp_test_arit
PASS: lp_test_blend
PASS: lp_test_conv
PASS: lp_test_printf
make[6]: Entering directory
`/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe'
make[6]: Nothing to be done for `all'.
make[6]: Leaving directory
`/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe'

Testsuite summary for Mesa 9.3.0-devel

# TOTAL: 5
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

See src/gallium/drivers/llvmpipe/test-suite.log
Please report to https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa

make[5]: *** [test-suite.log] Error 1
make[5]: Leaving directory
`/home/lordh/archdevel/mesa/git/mesatst/src/gallium/drivers/llvmpipe'


--test-suite.log file--
===
   Mesa 9.3.0-devel: src/gallium/drivers/llvmpipe/test-suite.log
===

# TOTAL: 5
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: lp_test_arit
==

floor(-0): ref = -1, out = 0, precision = -0.00 bits, FAIL
ceil(0): ref = 1, out = 0, precision = -0.00 bits, FAIL
fract(-0): ref = 0.9994, out = -0, precision = -0.00 bits, FAIL

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Marek Olšák
On Thu, Aug 8, 2013 at 6:57 PM, Christian König deathsim...@vodafone.de wrote:
 Am 08.08.2013 16:33, schrieb Marek Olšák:

 On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de
 wrote:

 Am 08.08.2013 14:38, schrieb Marek Olšák:

 .On Thu, Aug 8, 2013 at 9:47 AM, Christian König
 deathsim...@vodafone.de wrote:

 Am 08.08.2013 02:20, schrieb Marek Olšák:

 FMASK is bound as a separate texture. For every texture, there can be
 an FMASK. Therefore a separate array of resource slots has to be
 added.

 This adds a new mechanism for emitting resource descriptors, its
 features
 are:
 - resource descriptors are stored in an ordinary buffer (not in a CS)


 Having resource descriptors outside of the CS has two problems that we
 need
 to solve first:

 1. Fine grained descriptor updates doesn't work, I already tried that.
 The
 problem is that unlike previous asics descriptors are now a memory
 block,
 so
 no longer part of the CP context. So when we (for example) have a draw
 command executing and the next draw command is using new resources for
 a
 specific slot we would either block until the first draw command is
 finished
 (which is bad for performance) or change the descriptors while they are
 still in use (which results in VM faults).

 So what would the proper solution be here? Do I need to flush some
 caches or would moving the descriptor updates to the constant IB fix
 that?


 Actually the current implementation worked better than anything else I
 tried.

 When you really need the resource descriptors in a separate buffer you
 need
 to use one buffer for each draw call and always write the full buffer
 contents (no partial updates). Flushing anything won't really help
 either..

 The only solution I see using one buffer is to block until the last draw
 call is finished with WAIT_REG_MEM, but that would be quite disastrous
 for
 performance.


 2. If my understand is correct when they are embedded the descriptors
 are
 preloaded into the caches while executing the IB, so to archive the
 same
 speed with descriptors outside of the IB you need to add additional
 commands
 to the constant IB which is new to SI and we currently doesn't support
 in
 the CS interface.

 There seems to be support for the constant IB. The CS ioctl chunk ID
 is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in
 si_vm_packet3_ce_check. Is there anything missing?


 The userspace side seems to be missing and except for throwing NOP
 packets
 into it we never tested it. I know from the closed source side that it
 actually was quite tricky for them to get working.

 Additional to that please note that I'm not 100% sure that just putting
 the
 descriptors into the IB is really helping here. It was just the most
 simplest solution to avoid allocating a new buffer on each draw call.

 I understand. I don't really need to have resource descriptors in a
 separate buffer, all I need is these 3 basic features a gallium driver
 should support:
 - fine-grained resource updates (mainly for performance, see below)
 - ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0)
 - no GPU crash if a shader is using SAMPLER[15] but there are no samplers
 bound

 FYI, partial sampler view and sampler state updates are coming to
 gallium, Brian Paul already has some patches, it's just a matter of
 time now. Vertex and constant buffer states already support partial
 updates.


 That shouldn't be to much off a problem.

 Just allocate a state at startup and initialize it with the proper pm4
 commands for 16 samplers, then update the resource descriptors in that state
 when we change the bound textures/samplers/views/constants/whatever. All we
 need to do then is setting the emitted state to NULL so that it gets
 re-emitted in the next draw command.

That would re-emit all 16 shader resources even if just one of them
needs to be changed. I was trying to avoid this inefficiency. Is it
really impossible to emit just one resource descriptor and keep the
others unchanged? This is a basic D3D10/11 feature, for example:

void ID3D11DeviceContext::VSSetShaderResources(
  [in]  UINT StartSlot,
  [in]  UINT NumViews,
  [in]  ID3D11ShaderResourceView *const *ppShaderResourceViews
);

If the constant engine is required to implement this interface
efficiently, then I'd like to work on constant IB support.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67910] make check fail on lp_test_arit test

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67910

Roland Scheidegger srol...@vmware.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Roland Scheidegger srol...@vmware.com ---


*** This bug has been marked as a duplicate of bug 67672 ***

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67672] 9.2 git, Test failure in src/gallium/drivers/llvmpipe

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67672

Roland Scheidegger srol...@vmware.com changed:

   What|Removed |Added

 CC||lordhea...@gmail.com

--- Comment #14 from Roland Scheidegger srol...@vmware.com ---
*** Bug 67910 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Remember to call intel_prepare_render() before blitting.

2013-08-08 Thread Ian Romanick

On 08/06/2013 02:36 PM, Kenneth Graunke wrote:

Otherwise, blits to the window system buffer may cause crashes,
since dst_irb-mt may be NULL.

This code is lifted straight out of brw_blorp_framebuffer()'s
try_blorp_blit() helper.

Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
Cc: 9.2 mesa-sta...@lists.freedesktop.org


Reviewed-by: Ian Romanick ian.d.roman...@intel.com


---
  src/mesa/drivers/dri/i965/intel_fbo.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index fb0bb71..1692325 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -695,6 +695,11 @@ intel_blit_framebuffer_with_blitter(struct gl_context *ctx,
  {
 struct brw_context *brw = brw_context(ctx);

+   /* Sync up the state of window system buffers.  We need to do this before
+* we go looking for the buffers.
+*/
+   intel_prepare_render(brw);
+
 if (mask  GL_COLOR_BUFFER_BIT) {
GLint i;
const struct gl_framebuffer *drawFb = ctx-DrawBuffer;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Remember to call intel_prepare_render() before blitting.

2013-08-08 Thread Chad Versace

On 08/06/2013 02:36 PM, Kenneth Graunke wrote:

Otherwise, blits to the window system buffer may cause crashes,
since dst_irb-mt may be NULL.

This code is lifted straight out of brw_blorp_framebuffer()'s
try_blorp_blit() helper.

Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
Cc: 9.2 mesa-sta...@lists.freedesktop.org
---
  src/mesa/drivers/dri/i965/intel_fbo.c | 5 +
  1 file changed, 5 insertions(+)


Reviewed-by: Chad Versace chad.vers...@linux.intel.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/20] radeonsi: reserve a parameter slot for FMASK resources in shaders

2013-08-08 Thread Tom Stellard
On Thu, Aug 08, 2013 at 05:36:09PM +0200, Michel Dänzer wrote:
 On Don, 2013-08-08 at 08:00 -0700, Tom Stellard wrote:
  On Thu, Aug 08, 2013 at 02:20:54AM +0200, Marek Olšák wrote:
   
   diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
   b/src/gallium/drivers/radeonsi/si_state_draw.c
   index 746ace6..4208fa7 100644
   --- a/src/gallium/drivers/radeonsi/si_state_draw.c
   +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
   @@ -241,6 +241,7 @@ static void si_pipe_shader_ps(struct pipe_context 
   *ctx, struct si_pipe_shader *s
 /* Last 2 reserved SGPRs are used for VCC */
 num_sgprs = num_user_sgprs + 2;
 }
   + num_sgprs += 1; /* XXX this fixes VM faults */
  
  One problem is that the compiler is under reporting the number of SGPRs,
  when there are unused USER_SGPRs in the shader.  It should always be
  reporting a number greater than or equal to the number of USER_SGPRs.
 
 That itself shouldn't be a problem, as the radeonsi code ensures this.


The way radeonsi checks for this on pixel shaders won't work in all
cases.  For example, if there are 8 USER_SGPR and the shader uses VCC,
then the backend should report 10 SGPRs used.  This value is eventually
rounded up to the next multiple of 8, so what is actually stored in
shader-num_sgprs is 16.

Now lets say that the shader doesn't see 2 of the USER_SGPRs, it will
report 8 SGPRs (6 USER_SGPRs + 2 for VCC).  This value is rounded up to
the nearest multiple of 8, which is still 8, and stored in shader-num_sgprs.

Since the number of SGPRs reported matches the number of USER_SGPRs,
the code in si_pipe_shader_ps() will accepted the reported number of
SGPRs as valid, even though it isn't.

-Tom
 
  I think Michel mentioned this earlier, but there may also be a problem
  with the way we determine usage of the VCC register in the shader, maybe
  it is being used for more instructions than we realize.
 
 I was just worrying that AMDGPUAsmPrinter::EmitProgramInfoSI() might not
 take implicit access to VCC into account, but Christian confirmed it
 should. AFAICT we should be encoding any such implicit access correctly
 (except for V_DIV_SCALE_F* and V_DIV_FMAS_F*, which aren't used yet).
 
 
 -- 
 Earthling Michel Dänzer   |   http://www.amd.com
 Libre software enthusiast |  Debian, X and DRI developer
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67740] render2.c: In function '__indirect_glMap1d'

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67740

Ian Romanick i...@freedesktop.org changed:

   What|Removed |Added

 Status|NEW |NEEDINFO
 CC||cwo...@cworth.org,
   ||i...@freedesktop.org

--- Comment #1 from Ian Romanick i...@freedesktop.org ---
Are you building from GIT or from the 9.1.6 release tarball?

Does this occur with 9.1.5?

What options are you passing to configure?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.

2013-08-08 Thread Ian Romanick

On 08/08/2013 03:59 AM, Eric Anholt wrote:

Previously, we were asserting that each driver specified an NConfigOptions
exactly equal to the number of options they supplied, leading to frequent
bugs when people would forget to adjust the value when adjusting driver
options.  Instead, just overallocate the table by a bit and leave sanity
checking to the assert in findOption().


This makes sense.  I thought part of the reason for __driNConfigOptions 
being exact was that it was used by external driconf GUI tools.  Will 
this change break those tools?



---
  src/gallium/state_trackers/dri/common/dri_screen.c |  5 +--
  src/mesa/drivers/dri/common/dri_util.c |  4 +-
  src/mesa/drivers/dri/common/xmlconfig.c| 44 --
  src/mesa/drivers/dri/common/xmlconfig.h|  4 +-
  src/mesa/drivers/dri/i915/intel_screen.c   |  5 +--
  src/mesa/drivers/dri/i965/intel_screen.c   |  5 +--
  src/mesa/drivers/dri/radeon/radeon_screen.c|  5 +--
  7 files changed, 14 insertions(+), 58 deletions(-)

diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c 
b/src/gallium/state_trackers/dri/common/dri_screen.c
index 3b42b5a..779741e 100644
--- a/src/gallium/state_trackers/dri/common/dri_screen.c
+++ b/src/gallium/state_trackers/dri/common/dri_screen.c
@@ -74,8 +74,6 @@ PUBLIC const char __driConfigOptions[] =

  #define false 0

-static const uint __driNConfigOptions = 13;
-
  static const __DRIconfig **
  dri_fill_in_modes(struct dri_screen *screen)
  {
@@ -417,8 +415,7 @@ dri_init_screen_helper(struct dri_screen *screen,
 else
screen-target = PIPE_TEXTURE_RECT;

-   driParseOptionInfo(screen-optionCacheDefaults,
-  __driConfigOptions, __driNConfigOptions);
+   driParseOptionInfo(screen-optionCacheDefaults, __driConfigOptions);

 driParseConfigFiles(screen-optionCache,
   screen-optionCacheDefaults,
diff --git a/src/mesa/drivers/dri/common/dri_util.c 
b/src/mesa/drivers/dri/common/dri_util.c
index 9ed9df4..fa520ea 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -52,8 +52,6 @@ PUBLIC const char __dri2ConfigOptions[] =
DRI_CONF_SECTION_END
 DRI_CONF_END;

-static const uint __dri2NConfigOptions = 1;
-
  /*/
  /** \name Screen handling functions  */
  /*/
@@ -112,7 +110,7 @@ dri2CreateNewScreen(int scrn, int fd,
return NULL;
  }

-driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions, 
__dri2NConfigOptions);
+driParseOptionInfo(psp-optionInfo, __dri2ConfigOptions);
  driParseConfigFiles(psp-optionCache, psp-optionInfo, psp-myNum, 
dri2);

  return psp;
diff --git a/src/mesa/drivers/dri/common/xmlconfig.c 
b/src/mesa/drivers/dri/common/xmlconfig.c
index 5c97c20..b95e452 100644
--- a/src/mesa/drivers/dri/common/xmlconfig.c
+++ b/src/mesa/drivers/dri/common/xmlconfig.c
@@ -132,16 +132,6 @@ static GLuint findOption (const driOptionCache *cache, 
const char *name) {
  return hash;
  }

-/** \brief Count the real number of options in an option cache */
-static GLuint countOptions (const driOptionCache *cache) {
-GLuint size = 1  cache-tableSize;
-GLuint i, count = 0;
-for (i = 0; i  size; ++i)
-   if (cache-info[i].name)
-   count++;
-return count;
-}
-
  /** \brief Like strdup but using malloc and with error checking. */
  #define XSTRDUP(dest,source) do { \
  GLuint len = strlen (source); \
@@ -685,25 +675,18 @@ static void optInfoEndElem (void *userData, const 
XML_Char *name) {
  }
  }

-void driParseOptionInfo (driOptionCache *info,
-const char *configOptions, GLuint nConfigOptions) {
+void driParseOptionInfo (driOptionCache *info, const char *configOptions) {
  XML_Parser p;
  int status;
  struct OptInfoData userData;
  struct OptInfoData *data = userData;
-GLuint realNoptions;
-
-  /* determine hash table size and allocate memory:
-   * 3/2 of the number of options, rounded up, so there remains always
-   * at least one free entry. This is needed for detecting undefined
-   * options in configuration files without getting a hash table overflow.
-   * Round this up to a power of two. */
-GLuint minSize = (nConfigOptions*3 + 1) / 2;
-GLuint size, log2size;
-for (size = 1, log2size = 0; size  minSize; size = 1, ++log2size);
-info-tableSize = log2size;
-info-info = calloc(size, sizeof (driOptionInfo));
-info-values = calloc(size, sizeof (driOptionValue));
+
+/* Make the hash table big enough to fit more than the maximum number of
+ * config options we've ever seen in a driver.
+ */
+info-tableSize = 6;
+info-info = calloc(1  info-tableSize, sizeof (driOptionInfo));
+info-values = calloc(1  info-tableSize, sizeof 

Re: [Mesa-dev] [i965] i965/draw: Move constant formation outside of for loop and use an enum.

2013-08-08 Thread Mark Mueller
On Tue, Aug 6, 2013 at 12:31 PM, Ian Romanick i...@freedesktop.org wrote:

 On 08/06/2013 12:10 PM, mmueller wrote:Signed-off-by: mmueller 
 markkmuel...@gmail.com

 ---
   src/mesa/drivers/dri/i965/brw_**draw.c | 18 +-
   1 file changed, 9 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/**brw_draw.c
 b/src/mesa/drivers/dri/i965/**brw_draw.c
 index 6170d07..e11d0d8 100644
 --- a/src/mesa/drivers/dri/i965/**brw_draw.c
 +++ b/src/mesa/drivers/dri/i965/**brw_draw.c
 @@ -367,6 +367,15 @@ static bool brw_try_draw_prims( struct gl_context
 *ctx,
  bool retval = true;
  GLuint i;
  bool fail_next = false;
 +   enum {
 +   estimated_max_prim_size =
 +   512 + /* batchbuffer commands */
 +   ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) +
 sizeof(struct gen5_sampler_default_color +
 +   1024 + /* gen6 VS push constants */
 +   1024 + /* gen6 WM push constants */
 +   512 /* misc. pad */
 +   };
 +


 I think this would be better as 'const int estimated_max_prim_size = ...'.
  Using an enum is mostly the same (but also enforces that it's a
 compile-time constant), but it looks weird. :)


OK. A new patch is on the way *sigh*, I knew this day was coming for
the last 15 years, but I don't feel that was enough time to prepare. Now
I'm forced to come out of the closet - I'm a lifelong member of PETA*.

All too often I clearly communicate something to the compiler yet it still
goes off and does something else. Before I started using enums to replace
#defines and const PODs in my code, I too thought enums were weird and
ugly. After time I came to realized that it was an effective means to get
the compiler to do some useful things. After a while they don't look so
weird.

Obviously in this case it's of no concern.

Cheers

*PETA - Promoting Enums Through Awareness (www.PE.TA, not to be confused
with People Eating Tasty Animals: PETA.org)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.

2013-08-08 Thread Eric Anholt
Ian Romanick i...@freedesktop.org writes:

 On 08/08/2013 03:59 AM, Eric Anholt wrote:
 Previously, we were asserting that each driver specified an NConfigOptions
 exactly equal to the number of options they supplied, leading to frequent
 bugs when people would forget to adjust the value when adjusting driver
 options.  Instead, just overallocate the table by a bit and leave sanity
 checking to the assert in findOption().

 This makes sense.  I thought part of the reason for __driNConfigOptions 
 being exact was that it was used by external driconf GUI tools.  Will 
 this change break those tools?

I spent an embarassingly long time trying to find how the external tool
used the config symbols, only to find out that it doesn't.  driconf
calls out to xdriinfo, which does a
glXGetProcAddress(glXGetDriverConfig), and parses the XML returned
From calling the returned function.  It's like a sane interface! (except
for not checking for an extension string trying all this)


pgpLliH2uKKgC.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/20] radeonsi: add FMASK texture binding slots and resource setup

2013-08-08 Thread Alex Deucher
On Thu, Aug 8, 2013 at 1:34 PM, Marek Olšák mar...@gmail.com wrote:
 On Thu, Aug 8, 2013 at 6:57 PM, Christian König deathsim...@vodafone.de 
 wrote:
 Am 08.08.2013 16:33, schrieb Marek Olšák:

 On Thu, Aug 8, 2013 at 3:09 PM, Christian König deathsim...@vodafone.de
 wrote:

 Am 08.08.2013 14:38, schrieb Marek Olšák:

 .On Thu, Aug 8, 2013 at 9:47 AM, Christian König
 deathsim...@vodafone.de wrote:

 Am 08.08.2013 02:20, schrieb Marek Olšák:

 FMASK is bound as a separate texture. For every texture, there can be
 an FMASK. Therefore a separate array of resource slots has to be
 added.

 This adds a new mechanism for emitting resource descriptors, its
 features
 are:
 - resource descriptors are stored in an ordinary buffer (not in a CS)


 Having resource descriptors outside of the CS has two problems that we
 need
 to solve first:

 1. Fine grained descriptor updates doesn't work, I already tried that.
 The
 problem is that unlike previous asics descriptors are now a memory
 block,
 so
 no longer part of the CP context. So when we (for example) have a draw
 command executing and the next draw command is using new resources for
 a
 specific slot we would either block until the first draw command is
 finished
 (which is bad for performance) or change the descriptors while they are
 still in use (which results in VM faults).

 So what would the proper solution be here? Do I need to flush some
 caches or would moving the descriptor updates to the constant IB fix
 that?


 Actually the current implementation worked better than anything else I
 tried.

 When you really need the resource descriptors in a separate buffer you
 need
 to use one buffer for each draw call and always write the full buffer
 contents (no partial updates). Flushing anything won't really help
 either..

 The only solution I see using one buffer is to block until the last draw
 call is finished with WAIT_REG_MEM, but that would be quite disastrous
 for
 performance.


 2. If my understand is correct when they are embedded the descriptors
 are
 preloaded into the caches while executing the IB, so to archive the
 same
 speed with descriptors outside of the IB you need to add additional
 commands
 to the constant IB which is new to SI and we currently doesn't support
 in
 the CS interface.

 There seems to be support for the constant IB. The CS ioctl chunk ID
 is RADEON_CHUNK_ID_CONST_IB and the allowed packets are listed in
 si_vm_packet3_ce_check. Is there anything missing?


 The userspace side seems to be missing and except for throwing NOP
 packets
 into it we never tested it. I know from the closed source side that it
 actually was quite tricky for them to get working.

 Additional to that please note that I'm not 100% sure that just putting
 the
 descriptors into the IB is really helping here. It was just the most
 simplest solution to avoid allocating a new buffer on each draw call.

 I understand. I don't really need to have resource descriptors in a
 separate buffer, all I need is these 3 basic features a gallium driver
 should support:
 - fine-grained resource updates (mainly for performance, see below)
 - ability to unbind resources (e.g. by setting IMG_RSRC_WORD1 to 0)
 - no GPU crash if a shader is using SAMPLER[15] but there are no samplers
 bound

 FYI, partial sampler view and sampler state updates are coming to
 gallium, Brian Paul already has some patches, it's just a matter of
 time now. Vertex and constant buffer states already support partial
 updates.


 That shouldn't be to much off a problem.

 Just allocate a state at startup and initialize it with the proper pm4
 commands for 16 samplers, then update the resource descriptors in that state
 when we change the bound textures/samplers/views/constants/whatever. All we
 need to do then is setting the emitted state to NULL so that it gets
 re-emitted in the next draw command.

 That would re-emit all 16 shader resources even if just one of them
 needs to be changed. I was trying to avoid this inefficiency. Is it
 really impossible to emit just one resource descriptor and keep the
 others unchanged? This is a basic D3D10/11 feature, for example:

 void ID3D11DeviceContext::VSSetShaderResources(
   [in]  UINT StartSlot,
   [in]  UINT NumViews,
   [in]  ID3D11ShaderResourceView *const *ppShaderResourceViews
 );

 If the constant engine is required to implement this interface
 efficiently, then I'd like to work on constant IB support.

You'll need to either store them in memory or re-emit them if you
store them in the IB.  The CE is mainly there so that it can prime the
TC in parallel with the command stream processing.

Alex
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] draw: cleanup the extra attribs

2013-08-08 Thread Zack Rusin
Before inserting new front face and prim id outputs cleanup
the old extra outputs, otherwise our cache will use previous
output slots which will break as soon as outputs of the current
shader don't match the last.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_context.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index af9caee..2dc6772 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -555,6 +555,7 @@ draw_get_shader_info(const struct draw_context *draw)
 void
 draw_prepare_shader_outputs(struct draw_context *draw)
 {
+   draw_remove_extra_vertex_attribs(draw);
draw_ia_prepare_outputs(draw, draw-pipeline.ia);
draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled);
 }
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] draw: reset the vertex id when injecting new primitive id

2013-08-08 Thread Zack Rusin
Without reseting the vertex id, with primitives where the same
vertex is used with different primitives (e.g. tri/lines strips)
our vbuf module won't re-emit those vertices with the changed
primitive id. So lets reset the vertex id whenever injecting
new primitive id to make sure that the vertex data is correctly
emitted.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_pipe_ia.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_ia.c 
b/src/gallium/auxiliary/draw/draw_pipe_ia.c
index ecbb233..d64f19b 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_ia.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_ia.c
@@ -68,6 +68,15 @@ inject_primid(struct draw_stage *stage,
 
for (i = 0; i  num_verts; ++i) {
   struct vertex_header *v = header-v[i];
+  /* We have to reset the vertex_id because it's used by
+   * vbuf to figure out if the vertex had already been
+   * emitted. For line/tri strips the first vertex of
+   * subsequent primitives would already be emitted,
+   * but since we're changing the primitive id on the vertex
+   * we want to make sure it's reemitted with the correct
+   * data.
+   */
+  v-vertex_id = UNDEFINED_VERTEX_ID;
   memcpy(v-data[slot][0], primid, sizeof(primid));
   memcpy(v-data[slot][1], primid, sizeof(primid));
   memcpy(v-data[slot][2], primid, sizeof(primid));
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler

2013-08-08 Thread Zack Rusin
We can't be injecting the primitive id's in the pipeline because
by that time the primitives have already been decomposed. To
properly number the primitives we need to handle the adjacency
primitives by hand. This patch moves the prim id injection into
the original primitive assembler and completely removes the
useless pipeline stage.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/Makefile.sources   |1 -
 src/gallium/auxiliary/draw/draw_context.c|8 +-
 src/gallium/auxiliary/draw/draw_pipe.c   |4 -
 src/gallium/auxiliary/draw/draw_pipe.h   |7 -
 src/gallium/auxiliary/draw/draw_pipe_ia.c|  259 --
 src/gallium/auxiliary/draw/draw_pipe_validate.c  |   14 --
 src/gallium/auxiliary/draw/draw_prim_assembler.c |  168 +-
 src/gallium/auxiliary/draw/draw_prim_assembler.h |   12 +
 src/gallium/auxiliary/draw/draw_private.h|4 +-
 9 files changed, 180 insertions(+), 297 deletions(-)
 delete mode 100644 src/gallium/auxiliary/draw/draw_pipe_ia.c

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index b0172de..acbcef7 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -13,7 +13,6 @@ C_SOURCES := \
draw/draw_pipe_clip.c \
draw/draw_pipe_cull.c \
draw/draw_pipe_flatshade.c \
-draw/draw_pipe_ia.c \
draw/draw_pipe_offset.c \
draw/draw_pipe_pstipple.c \
draw/draw_pipe_stipple.c \
diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index 2dc6772..2d4843e 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -40,6 +40,7 @@
 #include util/u_prim.h
 #include draw_context.h
 #include draw_pipe.h
+#include draw_prim_assembler.h
 #include draw_vs.h
 #include draw_gs.h
 
@@ -95,6 +96,10 @@ draw_create_context(struct pipe_context *pipe, boolean 
try_llvm)
if (!draw_init(draw))
   goto err_destroy;
 
+   draw-ia = draw_prim_assembler_create(draw);
+   if (!draw-ia)
+  goto err_destroy;
+
return draw;
 
 err_destroy:
@@ -206,6 +211,7 @@ void draw_destroy( struct draw_context *draw )
   draw-render-destroy( draw-render );
*/
 
+   draw_prim_assembler_destroy(draw-ia);
draw_pipeline_destroy( draw );
draw_pt_destroy( draw );
draw_vs_destroy( draw );
@@ -556,7 +562,7 @@ void
 draw_prepare_shader_outputs(struct draw_context *draw)
 {
draw_remove_extra_vertex_attribs(draw);
-   draw_ia_prepare_outputs(draw, draw-pipeline.ia);
+   draw_prim_assembler_prepare_outputs(draw-ia);
draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled);
 }
 
diff --git a/src/gallium/auxiliary/draw/draw_pipe.c 
b/src/gallium/auxiliary/draw/draw_pipe.c
index 8140299..f1ee6cb 100644
--- a/src/gallium/auxiliary/draw/draw_pipe.c
+++ b/src/gallium/auxiliary/draw/draw_pipe.c
@@ -49,7 +49,6 @@ boolean draw_pipeline_init( struct draw_context *draw )
draw-pipeline.clip  = draw_clip_stage( draw );
draw-pipeline.flatshade = draw_flatshade_stage( draw );
draw-pipeline.cull  = draw_cull_stage( draw );
-   draw-pipeline.ia= draw_ia_stage( draw );
draw-pipeline.validate  = draw_validate_stage( draw );
draw-pipeline.first = draw-pipeline.validate;
 
@@ -62,7 +61,6 @@ boolean draw_pipeline_init( struct draw_context *draw )
!draw-pipeline.clip ||
!draw-pipeline.flatshade ||
!draw-pipeline.cull ||
-   !draw-pipeline.ia ||
!draw-pipeline.validate)
   return FALSE;
 
@@ -97,8 +95,6 @@ void draw_pipeline_destroy( struct draw_context *draw )
   draw-pipeline.flatshade-destroy( draw-pipeline.flatshade );
if (draw-pipeline.cull)
   draw-pipeline.cull-destroy( draw-pipeline.cull );
-   if (draw-pipeline.ia)
-  draw-pipeline.ia-destroy( draw-pipeline.ia );
if (draw-pipeline.validate)
   draw-pipeline.validate-destroy( draw-pipeline.validate );
if (draw-pipeline.aaline)
diff --git a/src/gallium/auxiliary/draw/draw_pipe.h 
b/src/gallium/auxiliary/draw/draw_pipe.h
index 70822a4..7c9ed6c 100644
--- a/src/gallium/auxiliary/draw/draw_pipe.h
+++ b/src/gallium/auxiliary/draw/draw_pipe.h
@@ -91,10 +91,6 @@ extern struct draw_stage *draw_stipple_stage( struct 
draw_context *context );
 extern struct draw_stage *draw_wide_line_stage( struct draw_context *context );
 extern struct draw_stage *draw_wide_point_stage( struct draw_context *context 
);
 extern struct draw_stage *draw_validate_stage( struct draw_context *context );
-extern struct draw_stage *draw_ia_stage(struct draw_context *context);
-
-boolean draw_ia_stage_required(const struct draw_context *context,
-   unsigned prim);
 
 extern void draw_free_temp_verts( struct draw_stage *stage );
 extern boolean draw_alloc_temp_verts( struct draw_stage *stage, unsigned nr );
@@ -108,9 +104,6 @@ void 

Re: [Mesa-dev] [PATCH 2/3] draw: reset the vertex id when injecting new primitive id

2013-08-08 Thread Zack Rusin
Don't worry about this one too much. The next patch removes draw_pipe_ia.c 
anyway...

- Original Message -
 Without reseting the vertex id, with primitives where the same
 vertex is used with different primitives (e.g. tri/lines strips)
 our vbuf module won't re-emit those vertices with the changed
 primitive id. So lets reset the vertex id whenever injecting
 new primitive id to make sure that the vertex data is correctly
 emitted.
 
 Signed-off-by: Zack Rusin za...@vmware.com
 ---
  src/gallium/auxiliary/draw/draw_pipe_ia.c |9 +
  1 file changed, 9 insertions(+)
 
 diff --git a/src/gallium/auxiliary/draw/draw_pipe_ia.c
 b/src/gallium/auxiliary/draw/draw_pipe_ia.c
 index ecbb233..d64f19b 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe_ia.c
 +++ b/src/gallium/auxiliary/draw/draw_pipe_ia.c
 @@ -68,6 +68,15 @@ inject_primid(struct draw_stage *stage,
  
 for (i = 0; i  num_verts; ++i) {
struct vertex_header *v = header-v[i];
 +  /* We have to reset the vertex_id because it's used by
 +   * vbuf to figure out if the vertex had already been
 +   * emitted. For line/tri strips the first vertex of
 +   * subsequent primitives would already be emitted,
 +   * but since we're changing the primitive id on the vertex
 +   * we want to make sure it's reemitted with the correct
 +   * data.
 +   */
 +  v-vertex_id = UNDEFINED_VERTEX_ID;
memcpy(v-data[slot][0], primid, sizeof(primid));
memcpy(v-data[slot][1], primid, sizeof(primid));
memcpy(v-data[slot][2], primid, sizeof(primid));
 --
 1.7.10.4
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallivm: use texture target from shader instead of static state for size query

2013-08-08 Thread Zack Rusin
Series looks good to me.

Reviewed-by: Zack Rusin za...@vmware.com

- Original Message -
 From: Roland Scheidegger srol...@vmware.com
 
 d3d10 has no notion of distinct array resources neither at the resource nor
 sampler view level. However, shader dcl of resources certainly has, and
 d3d10 expects resinfo to return the values according to that - in particular
 a resource might have been a 1d texture with some array layers, then the
 sampler view might have only used 1 layer so it can be accessed both as 1d
 or 1d array texture (I think - the former definitely works). resinfo of a
 resource decleared as array needs to return number of array layers but
 non-array resource needs to return 0 (and not 1). Hence fix this by passing
 the target from the shader decl to emit_size_query and use that (in case of
 OpenGL the target will come from the instruction itself).
 Could probably do the same for actual sampling, though it may not matter
 there
 (as the bogus components will essentially get clamped away), possibly could
 wreak havoc though if it REALLY doesn't match (which is of course an error
 but still).
 ---
  src/gallium/auxiliary/draw/draw_llvm_sample.c |2 +
  src/gallium/auxiliary/gallivm/lp_bld_sample.h |1 +
  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |   32 ++-
  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h   |1 +
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |   43
  -
  src/gallium/drivers/llvmpipe/lp_tex_sample.c  |2 +
  6 files changed, 77 insertions(+), 4 deletions(-)
 
 diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c
 b/src/gallium/auxiliary/draw/draw_llvm_sample.c
 index 3016d7c..f10cba3 100644
 --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
 +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
 @@ -270,6 +270,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct
 lp_build_sampler_soa *base,
struct gallivm_state *gallivm,
struct lp_type type,
unsigned texture_unit,
 +  unsigned target,
boolean need_nr_mips,
boolean scalar_lod,
LLVMValueRef explicit_lod, /* optional
*/
 @@ -284,6 +285,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct
 lp_build_sampler_soa *base,
 sampler-dynamic_state.base,
 type,
 texture_unit,
 +   target,
 need_nr_mips,
 scalar_lod,
 explicit_lod,
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 index dff8be2..db3ea1d 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 @@ -497,6 +497,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
  struct lp_sampler_dynamic_state *dynamic_state,
  struct lp_type int_type,
  unsigned texture_unit,
 +unsigned target,
  boolean need_nr_mips,
  boolean scalar_lod,
  LLVMValueRef explicit_lod,
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 index b0bb58b..e403ac8 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 @@ -1943,6 +1943,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
  struct lp_sampler_dynamic_state *dynamic_state,
  struct lp_type int_type,
  unsigned texture_unit,
 +unsigned target,
  boolean need_nr_mips,
  boolean scalar_lod,
  LLVMValueRef explicit_lod,
 @@ -1955,9 +1956,36 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
 unsigned num_lods = 1;
 struct lp_build_context bld_int_vec;
  
 -   dims = texture_dims(static_state-target);
 +   /*
 +* Do some sanity verification about bound texture and shader dcl target.
 +* Not entirely sure what's possible but assume array/non-array
 +* always compatible (probably not ok for OpenGL but d3d10 has no
 +* distinction of arrays at the resource level).
 +* Everything else looks bogus (though not entirely sure about rect/2d).
 +* Currently disabled because it causes assertion failures if there's
 +* nothing bound (or rather a dummy texture, not that this case would
 +* return the right values).
 +*/
 +   if (0  

Re: [Mesa-dev] [PATCH] gallivm: set non-existing values really to zero in size queries for d3d10

2013-08-08 Thread Zack Rusin
Looks good.

Reviewed-by: Zack Rusin za...@vmware.com

- Original Message -
 From: Roland Scheidegger srol...@vmware.com
 
 My previous attempt at doing so double-failed miserably (minification of
 zero still gives one, and even if it would not the value was never written
 anyway).
 While here also rename the confusingly named int_vec bld as we have int vecs
 of different sizes, and rename need_nr_mips (as this also changes
 out-of-bounds
 behavior) to is_sviewinfo too.
 ---
  src/gallium/auxiliary/draw/draw_llvm_sample.c |4 +--
  src/gallium/auxiliary/gallivm/lp_bld_sample.h |2 +-
  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |   34
  ++---
  src/gallium/drivers/llvmpipe/lp_tex_sample.c  |4 +--
  4 files changed, 22 insertions(+), 22 deletions(-)
 
 diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c
 b/src/gallium/auxiliary/draw/draw_llvm_sample.c
 index f10cba3..97b0255 100644
 --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
 +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
 @@ -271,7 +271,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct
 lp_build_sampler_soa *base,
struct lp_type type,
unsigned texture_unit,
unsigned target,
 -  boolean need_nr_mips,
 +  boolean is_sviewinfo,
boolean scalar_lod,
LLVMValueRef explicit_lod, /* optional
*/
LLVMValueRef *sizes_out)
 @@ -286,7 +286,7 @@ draw_llvm_sampler_soa_emit_size_query(const struct
 lp_build_sampler_soa *base,
 type,
 texture_unit,
 target,
 -   need_nr_mips,
 +   is_sviewinfo,
 scalar_lod,
 explicit_lod,
 sizes_out);
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 index db3ea1d..75e8c59 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
 @@ -498,7 +498,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
  struct lp_type int_type,
  unsigned texture_unit,
  unsigned target,
 -boolean need_nr_mips,
 +boolean is_viewinfo,
  boolean scalar_lod,
  LLVMValueRef explicit_lod,
  LLVMValueRef *sizes_out);
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 index e403ac8..65d6e7b 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
 @@ -1944,7 +1944,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
  struct lp_type int_type,
  unsigned texture_unit,
  unsigned target,
 -boolean need_nr_mips,
 +boolean is_sviewinfo,
  boolean scalar_lod,
  LLVMValueRef explicit_lod,
  LLVMValueRef *sizes_out)
 @@ -1954,7 +1954,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
 int dims, i;
 boolean has_array;
 unsigned num_lods = 1;
 -   struct lp_build_context bld_int_vec;
 +   struct lp_build_context bld_int_vec4;
  
 /*
  * Do some sanity verification about bound texture and shader dcl target.
 @@ -1997,24 +1997,19 @@ lp_build_size_query_soa(struct gallivm_state
 *gallivm,
  
 assert(!int_type.floating);
  
 -   lp_build_context_init(bld_int_vec, gallivm, lp_type_int_vec(32, 128));
 +   lp_build_context_init(bld_int_vec4, gallivm, lp_type_int_vec(32, 128));
  
 if (explicit_lod) {
/* FIXME: this needs to honor per-element lod */
lod = LLVMBuildExtractElement(gallivm-builder, explicit_lod,
lp_build_const_int32(gallivm, 0), );
first_level = dynamic_state-first_level(dynamic_state, gallivm,
texture_unit);
level = LLVMBuildAdd(gallivm-builder, lod, first_level, level);
 -  lod = lp_build_broadcast_scalar(bld_int_vec, level);
 +  lod = lp_build_broadcast_scalar(bld_int_vec4, level);
 } else {
 -  lod = bld_int_vec.zero;
 +  lod = bld_int_vec4.zero;
 }
  
 -   if (need_nr_mips) {
 -  size = bld_int_vec.zero;
 -   }
 -   else {
 -  size = bld_int_vec.undef;
 -   }
 +   size = bld_int_vec4.undef;
  
 size = LLVMBuildInsertElement(gallivm-builder, size,
  

[Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.

2013-08-08 Thread Mark Mueller
Signed-off-by: Mark Mueller markkmuel...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_draw.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 6170d07..1b5ed55 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
bool retval = true;
GLuint i;
bool fail_next = false;
+   const int estimated_max_prim_size =
+   512 + /* batchbuffer commands */
+   ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + 
sizeof(struct gen5_sampler_default_color +
+   1024 + /* gen6 VS push constants */
+   1024 + /* gen6 WM push constants */
+   512; /* misc. pad */
 
if (ctx-NewState)
   _mesa_update_state( ctx );
@@ -405,16 +411,6 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
brw-state.dirty.brw |= BRW_NEW_VERTICES;
 
for (i = 0; i  nr_prims; i++) {
-  int estimated_max_prim_size;
-
-  estimated_max_prim_size = 512; /* batchbuffer commands */
-  estimated_max_prim_size += (BRW_MAX_TEX_UNIT *
- (sizeof(struct brw_sampler_state) +
-  sizeof(struct gen5_sampler_default_color)));
-  estimated_max_prim_size += 1024; /* gen6 VS push constants */
-  estimated_max_prim_size += 1024; /* gen6 WM push constants */
-  estimated_max_prim_size += 512; /* misc. pad */
-
   /* Flush the batch if it's approaching full, so that we don't wrap while
* we've got validated state that needs to be in the same batch as the
* primitives.
-- 
1.8.3.1


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Enable LTO by default on i965/libdricore release builds.

2013-08-08 Thread Eric Anholt
We can't just smash it on globally due to (probably resolvable) issues
with the asm in glapi.  And we don't want to penalize developers with
longer build times for their normal debug environment.

Due to libdricore making almost all of our symbols public, the effect is
very small -- cairo-gl with INTEL_NO_HW=1 shows -0.798709% +/- 0.333703%
change in runtime (n=30).
---

If we were to avoid dricore, there's an additional 5% improvement available
(see the megadriver branch of my tree).

 configure.ac  | 25 +
 src/mesa/Makefile.am  |  4 ++--
 src/mesa/drivers/dri/i965/Makefile.am |  1 +
 src/mesa/libdricore/Makefile.am   |  8 +++-
 src/mesa/program/Makefile.am  |  4 ++--
 5 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/configure.ac b/configure.ac
index 62d06e0..26c230d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -314,6 +314,7 @@ AC_ARG_ENABLE([debug],
 [enable_debug=$enableval],
 [enable_debug=no]
 )
+enable_lto=yes
 if test x$enable_debug = xyes; then
 DEFINES_FOR_BUILD=$DEFINES_FOR_BUILD -DDEBUG
 if test x$GCC_FOR_BUILD = xyes; then
@@ -330,7 +331,31 @@ if test x$enable_debug = xyes; then
 if test x$GXX = xyes; then
 CXXFLAGS=$CXXFLAGS -g -O0
 fi
+
+# Disable LTO by default on debug builds, since it's so expensive at
+# compile time.
+enable_lto=no
+fi
+
+AC_ARG_ENABLE([lto],
+[AS_HELP_STRING([--disable-lto],
+[Enable link-time optimization @:@default=auto, disabled if debug is 
enabled@:@])],
+[enable_lto=$enableval])
+
+if test x$enable_lto = xyes; then
+# Enable LTO if the compiler supports it and not disabled by the user,
+# which allows the compiler to inline Mesa's many small,
+# non-manually-inlined helper functions.
+save_CFLAGS=$CFLAGS
+AC_MSG_CHECKING([whether $CC supports LTO])
+LTOFLAGS=-flto -fuse-linker-plugin
+CFLAGS=$CFLAGS $LTOFLAGS
+AC_LINK_IFELSE([AC_LANG_PROGRAM()],
+  [AC_MSG_RESULT([yes])],
+  [AC_MSG_RESULT([no]); LTOFLAGS=]);
+CFLAGS=$save_CFLAGS
 fi
+AC_SUBST([LTOFLAGS])
 
 dnl
 dnl library names
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index e9c16e7..ad04cc6 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -94,8 +94,8 @@ BUILDDIR = $(top_builddir)/src/mesa/
 include Makefile.sources
 
 AM_CPPFLAGS = $(DEFINES) $(INCLUDE_DIRS)
-AM_CFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CFLAGS)
-AM_CXXFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CXXFLAGS)
+AM_CFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CFLAGS) $(LTOFLAGS)
+AM_CXXFLAGS = $(LLVM_CFLAGS) $(VISIBILITY_CXXFLAGS) $(LTOFLAGS)
 
 MESA_ASM_FILES_FOR_ARCH =
 
diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
b/src/mesa/drivers/dri/i965/Makefile.am
index 27c67d1..5b3e9cc 100644
--- a/src/mesa/drivers/dri/i965/Makefile.am
+++ b/src/mesa/drivers/dri/i965/Makefile.am
@@ -36,6 +36,7 @@ AM_CFLAGS = \
-I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
+   $(LTOFLAGS) \
$(INTEL_CFLAGS)
 
 if HAVE_OPENGL_ES1
diff --git a/src/mesa/libdricore/Makefile.am b/src/mesa/libdricore/Makefile.am
index 56ceeb7..22a358d 100644
--- a/src/mesa/libdricore/Makefile.am
+++ b/src/mesa/libdricore/Makefile.am
@@ -31,6 +31,9 @@ AM_CPPFLAGS = \
$(DEFINES) \
-DUSE_DRICORE
 
+AM_CFLAGS = $(LTOFLAGS)
+AM_CXXFLAGS = $(LTOFLAGS)
+
 libdricore@VERSION@_la_SOURCES = \
$(MESA_FILES) \
$(LIBGLCPP_GENERATED_FILES) \
@@ -38,7 +41,10 @@ libdricore@VERSION@_la_SOURCES = \
$(LIBGLSL_FILES) \
$(BUILTIN_COMPILER_GENERATED_CXX_FILES) \
$(top_builddir)/src/glsl/builtin_function.cpp
-libdricore@VERSION@_la_LDFLAGS = -version-number 1:0
+libdricore@VERSION@_la_LDFLAGS = \
+-version-number 1:0 \
+$(LTOFLAGS) \
+$()
 libdricore@VERSION@_la_LIBADD = \
 ../program/libdricore_program.la \
 $()
diff --git a/src/mesa/program/Makefile.am b/src/mesa/program/Makefile.am
index ab565e2..f6b7c94 100644
--- a/src/mesa/program/Makefile.am
+++ b/src/mesa/program/Makefile.am
@@ -24,8 +24,8 @@ include ../Makefile.sources
 AM_CPPFLAGS = $(DEFINES) $(INCLUDE_DIRS)
 AM_CFLAGS = $(VISIBILITY_CFLAGS)
 AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
-libdricore_program_la_CFLAGS = $(NOVISIBILITY_CFLAGS)
-libdricore_program_la_CXXFLAGS = $(NOVISIBILITY_CXXFLAGS)
+libdricore_program_la_CFLAGS = $(NOVISIBILITY_CFLAGS) $(LTOFLAGS)
+libdricore_program_la_CXXFLAGS = $(NOVISIBILITY_CXXFLAGS) $(LTOFLAGS)
 
 SRCDIR = $(top_srcdir)/src/mesa/
 BUILDDIR = $(top_builddir)/src/mesa/
-- 
1.8.4.rc1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] dri: Choose a decent global driNConfigOptions.

2013-08-08 Thread Kenneth Graunke

On 08/08/2013 03:59 AM, Eric Anholt wrote:

Previously, we were asserting that each driver specified an NConfigOptions
exactly equal to the number of options they supplied, leading to frequent
bugs when people would forget to adjust the value when adjusting driver
options.  Instead, just overallocate the table by a bit and leave sanity
checking to the assert in findOption().


Looks reasonable to me.  It'll be nice to not have to change this all 
the time.  I didn't verify that the assertions you mention actually 
catch things, though.


Reviewed-by: Kenneth Graunke kenn...@whitecape.org
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.

2013-08-08 Thread Eric Anholt
Mark Mueller markkmuel...@gmail.com writes:
 Signed-off-by: Mark Mueller markkmuel...@gmail.com
 ---
  src/mesa/drivers/dri/i965/brw_draw.c | 16 ++--
  1 file changed, 6 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
 b/src/mesa/drivers/dri/i965/brw_draw.c
 index 6170d07..1b5ed55 100644
 --- a/src/mesa/drivers/dri/i965/brw_draw.c
 +++ b/src/mesa/drivers/dri/i965/brw_draw.c
 @@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
 bool retval = true;
 GLuint i;
 bool fail_next = false;
 +   const int estimated_max_prim_size =
 +   512 + /* batchbuffer commands */
 +   ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) + 
 sizeof(struct gen5_sampler_default_color +
 +   1024 + /* gen6 VS push constants */
 +   1024 + /* gen6 WM push constants */
 +   512; /* misc. pad */

What's the point of this change?  Moving loop invariants out of loops is
something basic that your compiler does, and you're moving the
definition farther from its use.


pgpMfzBtdnQ0v.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl: Do not export private symbols

2013-08-08 Thread Chad Versace

On 08/06/2013 05:30 PM, Jakob Bornecrantz wrote:

On Wed, Aug 7, 2013 at 2:00 AM, Chad Versace
chad.vers...@linux.intel.comwrote:


On 08/06/2013 03:02 PM, Jakob Bornecrantz wrote:


On Wed, Aug 7, 2013 at 12:00 AM, Jakob Bornecrantz wallbra...@gmail.com

wrote:


  On Tue, Aug 6, 2013 at 9:21 PM, Chad Versace 

chad.vers...@linux.intel.com


wrote:



  libEGL was incorrectly exporting *all* symbols, public and private.

This patch adds -fvisibility=hidden to libEGL's linker flags to ensure
that only symbols annotated with __attribute__((visibility(**
default)))
get exported.

Sanity-checked on X11/EGL with Piglit, and on Wayland by running
weston-gears.



You need to test this patch on a non-inbuilt driver, egl_gallium or
egl_glx,
none of which require you to have any hardware. Or force egl_dri2 to not
be built in.



Do that and you have my
Reviewed-by: Jakob Bornecrantz ja...@vmware.com



Ok. I'll run this on egl_gallium or egl_glx, and then report back.



Either is fine, tho I have no idea if egl_glx even works.


All looks good. I tested es2gears_x11 using egl_gallium with Gallium's
swrast driver. I also verified with gdb that those drivers were actually
used.

I'll commit to master now.

By the way, I talked to krh today, and he suggested that we delete egl_glx
rather than allow it to bitrot.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67921] New: [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67921

  Priority: medium
Bug ID: 67921
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [bisected commit 883987] crosscompiling fails with
util/u_cpu_detect.c:247:4: error: 'asm' undeclared
(first use in this function)
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: alexandre.f.dem...@gmail.com
  Hardware: Other
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

Commit 883987503fc79691398eb024f37480ff083805a3 breaks crosscompilation when
building with the following:
baseExec=./autogen.sh --prefix=/usr \
--enable-debug \
--enable-osmesa \
--enable-gbm \
--enable-xvmc \
--enable-vdpau \
--enable-gles1 \
--enable-gles2 \
--enable-openvg \
--enable-xorg \
--enable-xa \
--enable-egl \
--enable-gallium-egl \
--enable-glx-tls \
--enable-texture-float \
--enable-wgl \
--with-gallium-drivers=r600,swrast,svga \
--with-dri-drivers= \
--with-egl-platforms=x11,drm,wayland
export CFLAGS='-m32'
export CXXFLAGS='-m32'
export LLVM_CONFIG='/usr/bin/llvm-config32'
export PKG_CONFIG_PATH=/usr/lib/i386-linux-gnu/pkgconfig
Exec32=$baseExec --enable-32-bit --build=x86_64-pc-linux-gnu
--host=i686-pc-linux-gnu --libdir=/usr/lib/i386-linux-gnu
--includedir=/usr/include/i386-linux-gnu
$Exec32

---
It fails with:
  CC   util/u_cpu_detect.lo
util/u_cpu_detect.c: In function 'sse2_has_daz':
util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)
asm volatile (fxsave %0 :: m (fxarea));
^
util/u_cpu_detect.c:247:4: note: each undeclared identifier is reported only
once for each function it appears in
util/u_cpu_detect.c:247:8: error: expected ';' before 'volatile'
asm volatile (fxsave %0 :: m (fxarea));
^
make[3]: *** [util/u_cpu_detect.lo] Error 1

---
However, compiling normally for my 64bit architecture works fine.


Bad commit 883987503fc79691398eb024f37480ff083805a3
util: try much harder to set DAZ flag
Author
Roland Scheideggersrol...@vmware.com
Author date
8/5/13 9:30 PM

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67921] [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67921

Alexandre Demers alexandre.f.dem...@gmail.com changed:

   What|Removed |Added

 CC||srol...@vmware.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler

2013-08-08 Thread Roland Scheidegger
Am 08.08.2013 21:46, schrieb Zack Rusin:
 We can't be injecting the primitive id's in the pipeline because
 by that time the primitives have already been decomposed. To
 properly number the primitives we need to handle the adjacency
 primitives by hand. This patch moves the prim id injection into
 the original primitive assembler and completely removes the
 useless pipeline stage.
 
 Signed-off-by: Zack Rusin za...@vmware.com
 ---
  src/gallium/auxiliary/Makefile.sources   |1 -
  src/gallium/auxiliary/draw/draw_context.c|8 +-
  src/gallium/auxiliary/draw/draw_pipe.c   |4 -
  src/gallium/auxiliary/draw/draw_pipe.h   |7 -
  src/gallium/auxiliary/draw/draw_pipe_ia.c|  259 
 --
  src/gallium/auxiliary/draw/draw_pipe_validate.c  |   14 --
  src/gallium/auxiliary/draw/draw_prim_assembler.c |  168 +-
  src/gallium/auxiliary/draw/draw_prim_assembler.h |   12 +
  src/gallium/auxiliary/draw/draw_private.h|4 +-
  9 files changed, 180 insertions(+), 297 deletions(-)
  delete mode 100644 src/gallium/auxiliary/draw/draw_pipe_ia.c
 
 diff --git a/src/gallium/auxiliary/Makefile.sources 
 b/src/gallium/auxiliary/Makefile.sources
 index b0172de..acbcef7 100644
 --- a/src/gallium/auxiliary/Makefile.sources
 +++ b/src/gallium/auxiliary/Makefile.sources
 @@ -13,7 +13,6 @@ C_SOURCES := \
   draw/draw_pipe_clip.c \
   draw/draw_pipe_cull.c \
   draw/draw_pipe_flatshade.c \
 -draw/draw_pipe_ia.c \
   draw/draw_pipe_offset.c \
   draw/draw_pipe_pstipple.c \
   draw/draw_pipe_stipple.c \
 diff --git a/src/gallium/auxiliary/draw/draw_context.c 
 b/src/gallium/auxiliary/draw/draw_context.c
 index 2dc6772..2d4843e 100644
 --- a/src/gallium/auxiliary/draw/draw_context.c
 +++ b/src/gallium/auxiliary/draw/draw_context.c
 @@ -40,6 +40,7 @@
  #include util/u_prim.h
  #include draw_context.h
  #include draw_pipe.h
 +#include draw_prim_assembler.h
  #include draw_vs.h
  #include draw_gs.h
  
 @@ -95,6 +96,10 @@ draw_create_context(struct pipe_context *pipe, boolean 
 try_llvm)
 if (!draw_init(draw))
goto err_destroy;
  
 +   draw-ia = draw_prim_assembler_create(draw);
 +   if (!draw-ia)
 +  goto err_destroy;
 +
 return draw;
  
  err_destroy:
 @@ -206,6 +211,7 @@ void draw_destroy( struct draw_context *draw )
draw-render-destroy( draw-render );
 */
  
 +   draw_prim_assembler_destroy(draw-ia);
 draw_pipeline_destroy( draw );
 draw_pt_destroy( draw );
 draw_vs_destroy( draw );
 @@ -556,7 +562,7 @@ void
  draw_prepare_shader_outputs(struct draw_context *draw)
  {
 draw_remove_extra_vertex_attribs(draw);
 -   draw_ia_prepare_outputs(draw, draw-pipeline.ia);
 +   draw_prim_assembler_prepare_outputs(draw-ia);
 draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled);
  }
  
 diff --git a/src/gallium/auxiliary/draw/draw_pipe.c 
 b/src/gallium/auxiliary/draw/draw_pipe.c
 index 8140299..f1ee6cb 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe.c
 +++ b/src/gallium/auxiliary/draw/draw_pipe.c
 @@ -49,7 +49,6 @@ boolean draw_pipeline_init( struct draw_context *draw )
 draw-pipeline.clip  = draw_clip_stage( draw );
 draw-pipeline.flatshade = draw_flatshade_stage( draw );
 draw-pipeline.cull  = draw_cull_stage( draw );
 -   draw-pipeline.ia= draw_ia_stage( draw );
 draw-pipeline.validate  = draw_validate_stage( draw );
 draw-pipeline.first = draw-pipeline.validate;
  
 @@ -62,7 +61,6 @@ boolean draw_pipeline_init( struct draw_context *draw )
 !draw-pipeline.clip ||
 !draw-pipeline.flatshade ||
 !draw-pipeline.cull ||
 -   !draw-pipeline.ia ||
 !draw-pipeline.validate)
return FALSE;
  
 @@ -97,8 +95,6 @@ void draw_pipeline_destroy( struct draw_context *draw )
draw-pipeline.flatshade-destroy( draw-pipeline.flatshade );
 if (draw-pipeline.cull)
draw-pipeline.cull-destroy( draw-pipeline.cull );
 -   if (draw-pipeline.ia)
 -  draw-pipeline.ia-destroy( draw-pipeline.ia );
 if (draw-pipeline.validate)
draw-pipeline.validate-destroy( draw-pipeline.validate );
 if (draw-pipeline.aaline)
 diff --git a/src/gallium/auxiliary/draw/draw_pipe.h 
 b/src/gallium/auxiliary/draw/draw_pipe.h
 index 70822a4..7c9ed6c 100644
 --- a/src/gallium/auxiliary/draw/draw_pipe.h
 +++ b/src/gallium/auxiliary/draw/draw_pipe.h
 @@ -91,10 +91,6 @@ extern struct draw_stage *draw_stipple_stage( struct 
 draw_context *context );
  extern struct draw_stage *draw_wide_line_stage( struct draw_context *context 
 );
  extern struct draw_stage *draw_wide_point_stage( struct draw_context 
 *context );
  extern struct draw_stage *draw_validate_stage( struct draw_context *context 
 );
 -extern struct draw_stage *draw_ia_stage(struct draw_context *context);
 -
 -boolean draw_ia_stage_required(const struct draw_context *context,
 -   unsigned prim);
  
  extern void 

Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler

2013-08-08 Thread Zack Rusin
  Series looks good though I'm unsure why the pipeline stage doesn't work.
 Where does that decomposition happen? Is that something like GS
 outputting multiple prims in the same topology which all need the same id?

No, it's because the pipeline stage is ran on the decomposed primitives. The 
issue is that the pipeline stage is ran after stream output and stream output 
requires decomposed primitives, meaning that by the time we get to the pipeline 
we lost the original primitive info. The d3d10 wants the primitive id's to be 
injected into vertices but in the order in which they are traversed on the 
original (striped) primitives, so we need to do it when doing the original 
decomposition where we have access to the original topology and can number the 
vertices correctly.

z
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler

2013-08-08 Thread Roland Scheidegger
Am 09.08.2013 00:40, schrieb Zack Rusin:
   Series looks good though I'm unsure why the pipeline stage doesn't work.
 Where does that decomposition happen? Is that something like GS
 outputting multiple prims in the same topology which all need the same id?
 
 No, it's because the pipeline stage is ran on the decomposed primitives. The 
 issue is that the pipeline stage is ran after stream output and stream output 
 requires decomposed primitives, meaning that by the time we get to the 
 pipeline we lost the original primitive info. The d3d10 wants the primitive 
 id's to be injected into vertices but in the order in which they are 
 traversed on the original (striped) primitives, so we need to do it when 
 doing the original decomposition where we have access to the original 
 topology and can number the vertices correctly.
 
 z
 

I see I totally forgot stream out needs decomposed primitives, and I
guess stream out (and prim assembler) can't run as an ordinary pipeline
stage?

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Reject VertexAttribPointer() with GL_BGRA and !normalized.

2013-08-08 Thread Ian Romanick

On 07/30/2013 09:57 AM, Ian Romanick wrote:

On 07/23/2013 03:50 PM, Matt Turner wrote:

On Mon, Jul 15, 2013 at 6:37 AM, Fredrik Höglund fred...@kde.org wrote:

On Monday 15 July 2013, Kenneth Graunke wrote:

Fixes Piglit's ARB_vertex_attrib_bgra/api-errors test.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/main/varray.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 529d933..48f15bd 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -463,6 +463,16 @@ _mesa_VertexAttribPointer(GLuint index, GLint
size, GLenum type,
return;
 }

+   /* From the GL_ARB_vertex_array_bgra specification:
+* The error INVALID_VALUE is generated by VertexAttribPointer if
+*  size is BGRA and normalized is FALSE.
+*/
+   if (size == GL_BGRA  !normalized) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  glVertexAttribPointer(BGRA and !normalized));
+  return;
+   }
+


I think this code belongs in update_array(), since it already handles
other
BGRA errors, and also checks if the extension is supported.

This also reminds me that we need to decide if we should make the error
code depend on the GL version and entry point.

Both OpenGL 3.3 and GL_ARB_vertex_attrib_binding changes the error code
to GL_INVALID_OPERATION. The extension only does it for the entry points
it adds.

I personally don't think it's worth adding the additional complexity for
that since the conditions under which the errors to be generated haven't
changed.


Ian,

I remember you mentioning when we were showing the interns how to
write piglit tests that the return value seemed wrong. Can you
elaborate here? This is holding up Fredrik's ARB_vertex_attrib_binding
patches.


I believe the ARB_vertex_array_bgra should have specified
INVALID_OPERATION as the error code.  Section 2.5 (GL Errors) provides
the general guidance that INVALID_VALUE is generated for Numeric
argument out of range.  This is error is that a GLboolean is false
instead of true.  Clearly false is in range for GLboolean, so
INVALID_VALUE is not appropriate.

Since OpenGL 3.3 corrects the error code, we should follow that behavior.


Any chance of an updated patch in time for 9.2? :)


Thanks,
Matt


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] OpenGL ES only configuration (without desktop OpenGL support)

2013-08-08 Thread Ian Romanick

On 08/06/2013 02:13 PM, Siarhei Siamashka wrote:

Hello,

Some months ago, the commit configure.ac: Allow OpenGL ES1 and ES2 only
with enabled OpenGL dropped support for the OpenGL-free configuration.

 http://lists.freedesktop.org/archives/mesa-dev/2013-February/033909.html
 http://lists.freedesktop.org/archives/mesa-commit/2013-February/041708.html

Could this be possibly reverted to allow me to continue shooting
myself in the foot? The support for OpenGL ES is pretty horrible
in the open source software. One nice exception is Qt5 which is doing
pretty well. But the rest of the software does not generally work out
of the box without patches or tweaks. You can also hardly find a
problem-free OpenGL ES compatible open source game (other than Quake3).

I have an open feature request for Gentoo, which is a very configurable
Linux distribution and should not have any troubles working either with
or without OpenGL (the choice is up to the user):

 https://bugs.gentoo.org/show_bug.cgi?id=476524

But if upstream Mesa treats this configuration as unsupported, then I
also don't see it progressing anywhere in Gentoo. So could you please
re-consider this decision?


We've removed all of the #ifdef code inside Mesa that would have made 
any difference.  It was a nightmare to maintain, and we almost always 
got it wrong... because nobody was testing that configuration.  The only 
thing this is possibly going to gain you is a trivial amount of build 
time (by not building libGL, etc.).



Thanks.



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67924] New: [softpipe] piglit tesselation quad_strip wireframe regression

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67924

  Priority: medium
Bug ID: 67924
  Keywords: regression
CC: za...@vmware.com
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [softpipe] piglit tesselation quad_strip wireframe
regression
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

mesa: 2c2e64edaba0f6aeb181ca5b51eb8dea8e9b39f9 (master)


$ ./bin/ext_transform_feedback-tessellation quad_strip wireframe -auto
Probe at (48,129)
  Expected: 0.00 0.00 0.00 0.00
  Observed: 0.501961 0.501961 0.501961 0.501961
PIGLIT: {'result': 'fail' }


d6b3a193d4d525c5048ebf793e6a63fd98f92d64 is the first bad commit
commit d6b3a193d4d525c5048ebf793e6a63fd98f92d64
Author: Zack Rusin za...@vmware.com
Date:   Wed Jul 31 07:34:49 2013 -0400

draw: inject frontface info into wireframe outputs

Draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines', unfortunately that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix it allow draw module to inject a fake face semantic
into outputs from which the backends can figure out the original
frontfacing info of the primitives.

Signed-off-by: Zack Rusin za...@vmware.com
Reviewed-by: Roland Scheidegger srol...@vmware.com
Reviewed-by: Jose Fonseca jfons...@vmware.com

:04 04 c03c3c58f6b2d41a576aa2c1388323bc4fdfb21e
cb98c963ea0f7797496053122ad6b5c3f56a1861 Msrc
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] meta: Don't call _mesa_Ortho with width or height of 0

2013-08-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Fixes failures in oglconform fbo mipmap.manual.color,
mipmap.manual.colorAndDepth, mipmap.automatic, and
mipmap.manualIterateTexTargets subtests.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/common/meta.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index c62927c..69b06ed 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -704,9 +704,14 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
   _mesa_LoadIdentity();
   _mesa_MatrixMode(GL_PROJECTION);
   _mesa_LoadIdentity();
-  _mesa_Ortho(0.0, ctx-DrawBuffer-Width,
-  0.0, ctx-DrawBuffer-Height,
-  -1.0, 1.0);
+
+  /* glOrtho with width = 0 or height = 0 generates GL_INVALID_VALUE.
+   * This can occur when there is no draw buffer.
+   */
+  if (ctx-DrawBuffer-Width != 0  ctx-DrawBuffer-Height != 0)
+ _mesa_Ortho(0.0, ctx-DrawBuffer-Width,
+ 0.0, ctx-DrawBuffer-Height,
+ -1.0, 1.0);
}
 
if (state  MESA_META_CLIP) {
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] mesa: Make detach_renderbuffer available outside fbobject.c

2013-08-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Also add a return value indicating whether any work was done.

This will be used by the next patch.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/fbobject.c | 42 +-
 src/mesa/main/fbobject.h |  6 ++
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index 74f294c..d121167 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1227,19 +1227,43 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint 
renderbuffer)
  * the renderbuffer.
  * This is used when a renderbuffer object is deleted.
  * The spec calls for unbinding.
+ *
+ * \returns
+ * \c true if the renderbuffer was detached from an attachment point.  \c
+ * false otherwise.
  */
-static void
-detach_renderbuffer(struct gl_context *ctx,
-struct gl_framebuffer *fb,
-struct gl_renderbuffer *rb)
+bool
+_mesa_detach_renderbuffer(struct gl_context *ctx,
+  struct gl_framebuffer *fb,
+  const void *att)
 {
-   GLuint i;
+   unsigned i;
+   bool progress = false;
+
for (i = 0; i  BUFFER_COUNT; i++) {
-  if (fb-Attachment[i].Renderbuffer == rb) {
+  if (fb-Attachment[i].Texture == att
+  || fb-Attachment[i].Renderbuffer == att) {
  _mesa_remove_attachment(ctx, fb-Attachment[i]);
+ progress = true;
   }
}
-   invalidate_framebuffer(fb);
+
+   /* Section 4.4.4 (Framebuffer Completeness), subsection Whole Framebuffer
+* Completeness, of the OpenGL 3.1 spec says:
+*
+* Performing any of the following actions may change whether the
+* framebuffer is considered complete or incomplete:
+*
+* ...
+*
+*- Deleting, with DeleteTextures or DeleteRenderbuffers, an object
+*  containing an image that is attached to a framebuffer object
+*  that is bound to the framebuffer.
+*/
+   if (progress)
+  invalidate_framebuffer(fb);
+
+   return progress;
 }
 
 
@@ -1264,11 +1288,11 @@ _mesa_DeleteRenderbuffers(GLsizei n, const GLuint 
*renderbuffers)
 }
 
 if (_mesa_is_user_fbo(ctx-DrawBuffer)) {
-   detach_renderbuffer(ctx, ctx-DrawBuffer, rb);
+   _mesa_detach_renderbuffer(ctx, ctx-DrawBuffer, rb);
 }
 if (_mesa_is_user_fbo(ctx-ReadBuffer)
  ctx-ReadBuffer != ctx-DrawBuffer) {
-   detach_renderbuffer(ctx, ctx-ReadBuffer, rb);
+   _mesa_detach_renderbuffer(ctx, ctx-ReadBuffer, rb);
 }
 
/* Remove from hash table immediately, to free the ID.
diff --git a/src/mesa/main/fbobject.h b/src/mesa/main/fbobject.h
index 0a2a5cc..ab138cf 100644
--- a/src/mesa/main/fbobject.h
+++ b/src/mesa/main/fbobject.h
@@ -28,6 +28,7 @@
 
 #include compiler.h
 #include glheader.h
+#include stdbool.h
 
 struct gl_context;
 struct gl_texture_object;
@@ -113,6 +114,11 @@ _mesa_is_legal_color_format(const struct gl_context *ctx, 
GLenum baseFormat);
 extern GLenum
 _mesa_base_fbo_format(struct gl_context *ctx, GLenum internalFormat);
 
+extern bool
+_mesa_detach_renderbuffer(struct gl_context *ctx,
+  struct gl_framebuffer *fb,
+  const void *att);
+
 extern GLboolean GLAPIENTRY
 _mesa_IsRenderbuffer(GLuint renderbuffer);
 
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] mesa: Use _mesa_detach_renderbuffer when deleting a texture

2013-08-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The functional change is that now invalidate_framebuffer is called if
the texture is actually detached from one of the currently bound FBOs.
Previously this was only done for renderbuffers.

The remaining changes make the texture delete path look more similar to
the renderbuffer delete path.  This includes adding relevant spec
quotations to justify the behavior.

Fixes piglit fbo-incomplete delete texture of bound FBO test.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/fbobject.c | 23 +++
 src/mesa/main/texobj.c   | 42 +++---
 2 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index d121167..1034c7a 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1223,10 +1223,8 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint 
renderbuffer)
 
 
 /**
- * If the given renderbuffer is anywhere attached to the framebuffer, detach
- * the renderbuffer.
- * This is used when a renderbuffer object is deleted.
- * The spec calls for unbinding.
+ * Remove the specified renderbuffer or texture from any attachment point in
+ * the framebuffer.
  *
  * \returns
  * \c true if the renderbuffer was detached from an attachment point.  \c
@@ -1287,6 +1285,23 @@ _mesa_DeleteRenderbuffers(GLsizei n, const GLuint 
*renderbuffers)
_mesa_BindRenderbuffer(GL_RENDERBUFFER_EXT, 0);
 }
 
+/* Section 4.4.2 (Attaching Images to Framebuffer Objects),
+ * subsection Attaching Renderbuffer Images to a Framebuffer, of
+ * the OpenGL 3.1 spec says:
+ *
+ * If a renderbuffer object is deleted while its image is
+ * attached to one or more attachment points in the currently
+ * bound framebuffer, then it is as if FramebufferRenderbuffer
+ * had been called, with a renderbuffer of 0, for each
+ * attachment point to which this image was attached in the
+ * currently bound framebuffer. In other words, this
+ * renderbuffer image is first detached from all attachment
+ * points in the currently bound framebuffer. Note that the
+ * renderbuffer image is specifically not detached from any
+ * non-bound framebuffers. Detaching the image from any
+ * non-bound framebuffers is the responsibility of the
+ * application.
+ */
 if (_mesa_is_user_fbo(ctx-DrawBuffer)) {
_mesa_detach_renderbuffer(ctx, ctx-DrawBuffer, rb);
 }
diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 1666a66..7c8f04d 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -1041,23 +1041,35 @@ static void
 unbind_texobj_from_fbo(struct gl_context *ctx,
struct gl_texture_object *texObj)
 {
-   const GLuint n = (ctx-DrawBuffer == ctx-ReadBuffer) ? 1 : 2;
-   GLuint i;
+   bool progress = false;
 
-   for (i = 0; i  n; i++) {
-  struct gl_framebuffer *fb = (i == 0) ? ctx-DrawBuffer : ctx-ReadBuffer;
-  if (_mesa_is_user_fbo(fb)) {
- GLuint j;
- for (j = 0; j  BUFFER_COUNT; j++) {
-if (fb-Attachment[j].Type == GL_TEXTURE 
-fb-Attachment[j].Texture == texObj) {
-  /* Vertices are already flushed by _mesa_DeleteTextures */
-  ctx-NewState |= _NEW_BUFFERS;
-   _mesa_remove_attachment(ctx, fb-Attachment + j); 
-}
- }
-  }
+   /* Section 4.4.2 (Attaching Images to Framebuffer Objects), subsection
+* Attaching Texture Images to a Framebuffer, of the OpenGL 3.1 spec
+* says:
+*
+* If a texture object is deleted while its image is attached to one
+* or more attachment points in the currently bound framebuffer, then
+* it is as if FramebufferTexture* had been called, with a texture of
+* zero, for each attachment point to which this image was attached in
+* the currently bound framebuffer. In other words, this texture image
+* is first detached from all attachment points in the currently bound
+* framebuffer. Note that the texture image is specifically not
+* detached from any other framebuffer objects. Detaching the texture
+* image from any other framebuffer objects is the responsibility of
+* the application.
+*/
+   if (_mesa_is_user_fbo(ctx-DrawBuffer)) {
+  progress = _mesa_detach_renderbuffer(ctx, ctx-DrawBuffer, texObj);
}
+   if (_mesa_is_user_fbo(ctx-ReadBuffer)
+ctx-ReadBuffer != ctx-DrawBuffer) {
+  progress = _mesa_detach_renderbuffer(ctx, ctx-ReadBuffer, texObj)
+ || progress;
+   }
+
+   if (progress)
+  /* Vertices are 

Re: [Mesa-dev] Mesa CVS: compilation fails in u_cpu_detect.c

2013-08-08 Thread Roland Scheidegger
Am 09.08.2013 01:05, schrieb Dieter Nützel:
 After git pull 8. Aug 19:18 compilation fails in
 src/gallium/auxiliary/util/u_cpu_detect.c
 
 gmake[3]: Entering directory `/opt/mesa/src/gallium/auxiliary'
   CC   util/u_cpu_detect.lo
 util/u_cpu_detect.c: In function 'sse2_has_daz':
 util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this
 function)
 util/u_cpu_detect.c:247:4: note: each undeclared identifier is reported
 only once for each function it appears in
 util/u_cpu_detect.c:247:8: error: expected ';' before 'volatile'
 gmake[3]: *** [util/u_cpu_detect.lo] Fehler 1
 gmake[3]: Leaving directory `/opt/mesa/src/gallium/auxiliary'
 gmake[2]: *** [all-recursive] Fehler 1
 gmake[2]: Leaving directory `/opt/mesa/src/gallium/auxiliary'
 gmake[1]: *** [all-recursive] Fehler 1
 gmake[1]: Leaving directory `/opt/mesa/src'
 make: *** [all-recursive] Fehler 1
 2.482u 0.628s 0:05.66 54.7% 0+0k 0+16io 0pf+0w
 
 ./configure --prefix=/usr --with-dri-drivers= --with-gallium-drivers=r600
 --enable-vdpau --enable-texture-float --enable-r600-llvm-compiler
 
 This helps:
 
 --- src/gallium/auxiliary/util/u_cpu_detect.c.orig  2013-08-08
 19:18:16.113550440 +0200
 +++ src/gallium/auxiliary/util/u_cpu_detect.c   2013-08-09
 00:08:20.681470540 +0200
 @@ -244,7 +244,7 @@
 
 fxarea.mxcsr_mask = 0;
  #if (defined(PIPE_CC_GCC) || defined(PIPE_CC_SUNPRO))
 -   asm volatile (fxsave %0 :: m (fxarea));
 +   __asm __volatile (fxsave %0 :: m (fxarea));
  #elif (defined(PIPE_CC_MSVC) || defined(PIPE_CC_ICL))
 _fxsave(fxarea);
  #endif
 
 Regards,
   Dieter

Fixed thanks!

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67925] New: ARB_vertex_array_bgra api-errors test fails when normalized is false

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67925

  Priority: medium
Bug ID: 67925
CC: co...@octayn.net, fred...@kde.org,
i...@freedesktop.org, kenn...@whitecape.org
  Assignee: mesa-dev@lists.freedesktop.org
Blocks: 67224
   Summary: ARB_vertex_array_bgra api-errors test fails when
normalized is false
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: i...@freedesktop.org
  Hardware: Other
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

There are two patches for this out on the list.  Sort it out and commit
something. :)

http://lists.freedesktop.org/archives/mesa-dev/2013-July/041774.html
http://lists.freedesktop.org/archives/mesa-dev/2013-July/042487.html

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67921] [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67921

Roland Scheidegger srol...@vmware.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Roland Scheidegger srol...@vmware.com ---
(In reply to comment #0)
 However, compiling normally for my 64bit architecture works fine.
Yes, the code in question is never compiled on x64. Weird though when I hacked
around the ifdefs to test it on x64 it worked just fine.
Anyway fixed with 8f40fa0e7f47093d6e93ca4dd12569a6f948dae6.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67926] New: [softpipe] piglit tesselation polygon wireframe regression

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67926

  Priority: medium
Bug ID: 67926
  Keywords: have-backtrace, regression
CC: za...@vmware.com
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [softpipe] piglit tesselation polygon wireframe
regression
  Severity: critical
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

[softpipe] piglit tesselation polygon wireframe regression

mesa: 2c2e64edaba0f6aeb181ca5b51eb8dea8e9b39f9 (master)


$ ./bin/ext_transform_feedback-tessellation polygon wireframe -auto
Segmentation fault (core dumped)

(gdb) bt
#0  0x7fb5df2ba92e in sp_setup_line (setup=0x9ed330, v0=0xac4280,
v1=0xbc21c0) at sp_setup.c:1074
#1  0x7fb5df2a8f42 in sp_vbuf_draw_elements (vbr=0x9ed290,
indices=0x9effd0, nr=6) at sp_prim_vbuf.c:180
#2  0x7fb5df18f6c8 in vbuf_flush_vertices (vbuf=0x9eff10) at
draw/draw_pipe_vbuf.c:327
#3  0x7fb5df18f8ca in vbuf_flush (stage=0x9eff10, flags=4) at
draw/draw_pipe_vbuf.c:392
#4  0x7fb5df18ddde in unfilled_flush (stage=0x933860, flags=4) at
draw/draw_pipe_unfilled.c:209
#5  0x7fb5df189303 in cull_flush (stage=0x9e7210, flags=4) at
draw/draw_pipe_cull.c:231
#6  0x7fb5df188acd in clip_flush (stage=0x9e0b10, flags=4) at
draw/draw_pipe_clip.c:748
#7  0x7fb5df182226 in draw_pipeline_flush (draw=0x9dc800, flags=4) at
draw/draw_pipe.c:353
#8  0x7fb5df17b9b6 in draw_do_flush (draw=0x9dc800, flags=4) at
draw/draw_context.c:722
#9  0x7fb5df17ad9e in draw_flush (draw=0x9dc800) at draw/draw_context.c:225
#10 0x7fb5df2a8b2a in softpipe_draw_vbo (pipe=0x945080,
info=0x7056a420) at sp_draw_arrays.c:145
#11 0x7fb5df17968a in cso_draw_vbo (cso=0xa775a0, info=0x7056a420) at
cso_cache/cso_context.c:1413
#12 0x7fb5df084526 in st_draw_vbo (ctx=0xa0b920, prims=0x7056a4e0,
nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=4, 
tfb_vertcount=0x0) at ../../src/mesa/state_tracker/st_draw.c:286
#13 0x7fb5df04082d in vbo_draw_arrays (ctx=0xa0b920, mode=9, start=0,
count=5, numInstances=1, baseInstance=0)
at ../../src/mesa/vbo/vbo_exec_array.c:624
#14 0x7fb5df04127e in vbo_exec_DrawArrays (mode=9, start=0, count=5) at
../../src/mesa/vbo/vbo_exec_array.c:776
#15 0x7fb5e2c43d9c in stub_glDrawArrays (mode=9, first=0, count=5) at
piglit/tests/util/generated_dispatch.c:6223
#16 0x004021b0 in draw (prog=3, use_xfb=false, y_offset=64, mode=9,
num_vertices=5)
at piglit/tests/spec/ext_transform_feedback/tessellation.c:446
#17 0x00402836 in piglit_display () at
piglit/tests/spec/ext_transform_feedback/tessellation.c:618
#18 0x7fb5e2c35e1c in display () at
piglit/tests/util/piglit-framework-gl/piglit_glut_framework.c:60
#19 0x7fb5e23d3fc4 in fghRedrawWindow (window=0x91a140) at
freeglut_main.c:210
#20 fghcbDisplayWindow (window=0x91a140, enumerator=0x7056a9c0) at
freeglut_main.c:227
#21 0x7fb5e23d7719 in fgEnumWindows
(enumCallback=enumCallback@entry=0x7fb5e23d3f20 fghcbDisplayWindow,
enumerator=enumerator@entry=0x7056a9c0)
at freeglut_structure.c:394
#22 0x7fb5e23d445c in fghDisplayAll () at freeglut_main.c:249
#23 glutMainLoopEvent () at freeglut_main.c:1450
#24 0x7fb5e23d4d81 in glutMainLoop () at freeglut_main.c:1498
#25 0x7fb5e2c3604b in run_test (gl_fw=0x7fb5e2f1e040 glut_fw, argc=3,
argv=0x7056ad78)
at piglit/tests/util/piglit-framework-gl/piglit_glut_framework.c:142
#26 0x7fb5e2c340e9 in piglit_gl_test_run (argc=3, argv=0x7056ad78,
config=0x7056ac60)
at piglit/tests/util/piglit-framework-gl.c:141
#27 0x00401a81 in main (argc=3, argv=0x7056ad78) at
piglit/tests/spec/ext_transform_feedback/tessellation.c:131
(gdb) frame 0
#0  0x7fb5df2ba92e in sp_setup_line (setup=0x9ed330, v0=0xac4280,
v1=0xbc21c0) at sp_setup.c:1074
1074   int x1 = (int) v1[0][0];
(gdb) print v1[0]
Cannot access memory at address 0xbc21c0


d6b3a193d4d525c5048ebf793e6a63fd98f92d64 is the first bad commit
commit d6b3a193d4d525c5048ebf793e6a63fd98f92d64
Author: Zack Rusin za...@vmware.com
Date:   Wed Jul 31 07:34:49 2013 -0400

draw: inject frontface info into wireframe outputs

Draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines', unfortunately that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix it allow draw module to inject a fake face semantic
into outputs from which the backends can figure out the original
frontfacing info of the primitives.

Signed-off-by: Zack Rusin za...@vmware.com
Reviewed-by: Roland Scheidegger srol...@vmware.com

[Mesa-dev] [PATCH] glsl: Don't allow const on out or inout function parameters

2013-08-08 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Fixes piglit tests const-inout-parameter.frag and
const-out-parameter.frag.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: 9.2 mesa-sta...@lists.freedesktop.org
Cc: Matt Turner matts...@gmail.com
---
This currently causes a regression in
relaxed-parameter-qualifier-ordering.vert.  I believe that test is
incorrect (by using const with out and inout parameters), and a patch
has been sent to the piglit list already.

 src/glsl/ast_to_hir.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 482ab3c..6266456 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1967,6 +1967,21 @@ apply_type_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
   _mesa_glsl_shader_target_name(state-target));
}
 
+   /* Section 6.1.1 (Function Calling Conventions) of the GLSL 1.10 spec says:
+*
+* However, the const qualifier cannot be used with out or inout.
+*
+* The same section of the GLSL 4.40 spec further clarifies this saying:
+*
+* The const qualifier cannot be used with out or inout, or a
+* compile-time error results.
+*/
+   if (is_parameter  qual-flags.q.constant  qual-flags.q.out) {
+  _mesa_glsl_error(loc, state,
+  `const' may not be applied to `out' or `inout' 
+  function parameters);
+   }
+
/* If there is no qualifier that changes the mode of the variable, leave
 * the setting alone.
 */
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67925] ARB_vertex_array_bgra api-errors test fails when normalized is false

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67925

--- Comment #1 from Fredrik Höglund fred...@kde.org ---
I have yet a third version in my arb-vertex-attrib-binding branch:

http://cgit.freedesktop.org/~fredrik/mesa/commit/?h=arb-vertex-attrib-bindingid=df98f9f8

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] draw: rewrite primitive assembler

2013-08-08 Thread Zack Rusin
 Am 09.08.2013 00:40, schrieb Zack Rusin:
Series looks good though I'm unsure why the pipeline stage doesn't work.
  Where does that decomposition happen? Is that something like GS
  outputting multiple prims in the same topology which all need the same id?
  
  No, it's because the pipeline stage is ran on the decomposed primitives.
  The issue is that the pipeline stage is ran after stream output and stream
  output requires decomposed primitives, meaning that by the time we get to
  the pipeline we lost the original primitive info. The d3d10 wants the
  primitive id's to be injected into vertices but in the order in which they
  are traversed on the original (striped) primitives, so we need to do it
  when doing the original decomposition where we have access to the original
  topology and can number the vertices correctly.
  
  z
  
 
 I see I totally forgot stream out needs decomposed primitives, and I
 guess stream out (and prim assembler) can't run as an ordinary pipeline
 stage?

I was thinking about that when I was doing it and I thought it should be 
possible to rewrite SO as a pipeline stage, but we'd need to change the 
interface to include some sort of a prepare stage and then redo the code in so. 
Once so would be in a pipeline then we could think about primitive assembler, 
but that would require also more changes to the pipeline because we want to 
know if the primitives are adjacency primitives and pipeline stages get only 
tris/lines/points... and this was the point at which I went screw it, i'm 
injecting prim ids in the primitive assembler.

z
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 67928] New: [softpipe] piglit tesselation quads wireframe regression

2013-08-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67928

  Priority: medium
Bug ID: 67928
  Keywords: regression
CC: za...@vmware.com
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [softpipe] piglit tesselation quads wireframe
regression
  Severity: critical
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

mesa: 2c2e64edaba0f6aeb181ca5b51eb8dea8e9b39f9 (master)

$ ./bin/ext_transform_feedback-tessellation quads wireframe -auto
*** Error in `./bin/ext_transform_feedback-tessellation': double free or
corruption (!prev): 0x01bc1910 ***


d6b3a193d4d525c5048ebf793e6a63fd98f92d64 is the first bad commit
commit d6b3a193d4d525c5048ebf793e6a63fd98f92d64
Author: Zack Rusin za...@vmware.com
Date:   Wed Jul 31 07:34:49 2013 -0400

draw: inject frontface info into wireframe outputs

Draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines', unfortunately that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix it allow draw module to inject a fake face semantic
into outputs from which the backends can figure out the original
frontfacing info of the primitives.

Signed-off-by: Zack Rusin za...@vmware.com
Reviewed-by: Roland Scheidegger srol...@vmware.com
Reviewed-by: Jose Fonseca jfons...@vmware.com

:04 04 c03c3c58f6b2d41a576aa2c1388323bc4fdfb21e
cb98c963ea0f7797496053122ad6b5c3f56a1861 Msrc
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Don't allow const on out or inout function parameters

2013-08-08 Thread Matt Turner
On Thu, Aug 8, 2013 at 4:54 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 Fixes piglit tests const-inout-parameter.frag and
 const-out-parameter.frag.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 Cc: 9.2 mesa-sta...@lists.freedesktop.org
 Cc: Matt Turner matts...@gmail.com
 ---
 This currently causes a regression in
 relaxed-parameter-qualifier-ordering.vert.  I believe that test is
 incorrect (by using const with out and inout parameters), and a patch
 has been sent to the piglit list already.

Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600g: disable GPUVM by default

2013-08-08 Thread Alex Deucher
Cayman and trinity systems still seem to suffer from
stability problems with GPUVM.  This also fixes compute
on these asics.  It can still be enabled for testing
by setting env var RADEON_VA=true.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=65958

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
CC: 9.2 mesa-sta...@lists.freedesktop.org
CC: 9.1 mesa-sta...@lists.freedesktop.org
---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 033e78f..69c42a0 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -404,7 +404,7 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws)
   ws-info.r600_ib_vm_max_size))
 ws-info.r600_virtual_address = FALSE;
 }
-   if (ws-gen == DRV_R600  !debug_get_bool_option(RADEON_VA, TRUE))
+   if (ws-gen == DRV_R600  !debug_get_bool_option(RADEON_VA, FALSE))
ws-info.r600_virtual_address = FALSE;
 }
 
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [i965][V2] i965/draw: Move constant formation outside of for loop and use an enum.

2013-08-08 Thread Mark Mueller
On Thu, Aug 8, 2013 at 2:19 PM, Eric Anholt e...@anholt.net wrote:

 Mark Mueller markkmuel...@gmail.com writes:
  Signed-off-by: Mark Mueller markkmuel...@gmail.com
  ---
   src/mesa/drivers/dri/i965/brw_draw.c | 16 ++--
   1 file changed, 6 insertions(+), 10 deletions(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
 b/src/mesa/drivers/dri/i965/brw_draw.c
  index 6170d07..1b5ed55 100644
  --- a/src/mesa/drivers/dri/i965/brw_draw.c
  +++ b/src/mesa/drivers/dri/i965/brw_draw.c
  @@ -367,6 +367,12 @@ static bool brw_try_draw_prims( struct gl_context
 *ctx,
  bool retval = true;
  GLuint i;
  bool fail_next = false;
  +   const int estimated_max_prim_size =
  +   512 + /* batchbuffer commands */
  +   ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) +
 sizeof(struct gen5_sampler_default_color +
  +   1024 + /* gen6 VS push constants */
  +   1024 + /* gen6 WM push constants */
  +   512; /* misc. pad */

 What's the point of this change?  Moving loop invariants out of loops is
 something basic that your compiler does,


Is that universally true for the code as it looked originally (see below)?
I've worked on embedded Atom and other systems with heavily dumbed down gcc
or other cross compilers. For instance there is a good chance that the
compilers from vehicle infotainment systems that I've worked on recently
would generate assembly for each line of code below inside the loop.

original code:

  int estimated_max_prim_size;


  estimated_max_prim_size = 512; /* batchbuffer commands */
  estimated_max_prim_size += (BRW_MAX_TEX_UNIT *
  (sizeof(struct brw_sampler_state) +
   sizeof(struct gen5_sampler_default_color)));
  estimated_max_prim_size += 1024; /* gen6 VS push constants */
  estimated_max_prim_size += 1024; /* gen6 WM push constants */
  estimated_max_prim_size += 512; /* misc. pad */
.

.



 and you're moving the
 definition farther from its use.


Sure, yet it's in the company of fail_next with a similar problem. What
about keeping the definition inside the for loop but adding the const
keyword and adding all of the immediates as one operation?

   for (i = 0; i  nr_prims; i++) {

 const int estimated_max_prim_size =
 512 + /* batchbuffer commands */
 ((BRW_MAX_TEX_UNIT * (sizeof(struct brw_sampler_state) +
sizeof(struct gen5_sampler_default_color +
 1024 + /* gen6 VS push constants */
 1024 + /* gen6 WM push constants */
 512; /* misc. pad */

I'm also fine with dropping the whole thing and moving on as the real point
of this was to calibrate myself to the contribution process. It's been good
for that.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev