Re: [Mesa-dev] [PATCH 10/18] i965: Speculatively flush the batch after transform feedback

2015-07-07 Thread Kenneth Graunke
On Tuesday, July 07, 2015 04:46:22 PM Chris Wilson wrote:
> On Tue, Jul 07, 2015 at 10:12:20AM +0100, Chris Wilson wrote:
> > On Mon, Jul 06, 2015 at 09:05:18PM -0700, Kristian Høgsberg wrote:
> > > On Mon, Jul 6, 2015 at 12:36 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> > > > On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:
> > > > > Since the purpose of transform feedback tends to be for the client to
> > > > > act upon the results to change the geometry in the scene, it is likely
> > > > > that the client will soon be waiting upon the results. Flush the batch
> > > > > early so that we don't build up a long queue of commands afterwards that
> > > > > could delay the readback.
> > > > > ---
> > > > >  src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++
> > > > >  1 file changed, 6 insertions(+)
> > > > >
> > > > > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > > > index 857ebe5..13dbe5b 100644
> > > > > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > > > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > > > @@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context *ctx,
> > > > >
> > > > >     brw_batch_end(&brw->batch);
> > > > >
> > > > > +   /* We will likely want to read the results in the very near future, so
> > > > > +    * push this primitive to hardware if it is currently idle.
> > > > > +    */
> > > > > +   if (!brw_batch_busy(&brw->batch))
> > > > > +      brw_batch_flush(&brw->batch);
> > > > > +
> > > > >     /* EndTransformFeedback() means that we need to update the number of
> > > > >      * vertices written.  Since it's only necessary if DrawTransformFeedback()
> > > > >      * is called and it means mapping a buffer object, we delay computing it
> > > >
> > > > We need some data to justify this change.
> > >
> > > I think even the theory is not correct - transform feedback is
> > > typically fed back into the GPU (as new geometry, eg) rather than
> > > consumed by the CPU, and in that case the flush is not helpful. But at
> > > the end of the day, data will tell.
> >
> > How are they fed back? Can the xfb buffer be bound to the vertex buffer?
> > (Genuine question! The only examples I've seen were for testing by the
> > CPU.)

Yes, it can.  Just glBindBuffer() some buffers around.  Or, I suspect
one could bind it as a texture buffer object or SSBO and then use a
compute shader on the results.

With GL 4.x, the "avoid synchronizing with the CPU" mentality is a lot
more prevalent, due to the advent of compute shaders.

 
> I've reviewed the code again, and gen7_end_transform_feedback() is always
> followed by brw_compute_xfb_vertices_written (and a read of the sol
> buffer) afaict, maybe not immediately but always before the next
> transform feedback.

Sadly, yes.  We have a primitive count and we need a vertex count - so,
a tiny bit of math.  Ideally, we would use the Gen7.5 MI_MATH+ feature
to do this, eliminating the CPU-GPU synchronization point.

> Also afaict it is not possible to map the sol buffer directly into the
> application.
> -Chris

It definitely is - the application creates GL buffer objects and binds
them for use with transform feedback.  They can certainly
glMapBufferRange() those buffers.

--Ken
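
As a rough illustration of the "tiny bit of math" mentioned above (primitive count to vertex count), here is a minimal standalone sketch. It is not the i965 code: the enum and both function names are hypothetical, and the primitive counter is passed in as a plain argument rather than read back from the GPU. It only relies on transform feedback recording points, lines, or triangles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical primitive kinds; a real driver would key this off the
 * GL primitive mode used for transform feedback. */
enum xfb_prim { XFB_POINTS, XFB_LINES, XFB_TRIANGLES };

/* Number of vertices recorded per primitive of the given kind. */
static unsigned
xfb_vertices_per_prim(enum xfb_prim mode)
{
   switch (mode) {
   case XFB_POINTS:    return 1;
   case XFB_LINES:     return 2;
   case XFB_TRIANGLES: return 3;
   }
   return 0;
}

/* prims_written stands in for the hardware primitive counter. */
static uint64_t
xfb_vertices_written(uint64_t prims_written, enum xfb_prim mode)
{
   return prims_written * xfb_vertices_per_prim(mode);
}
```

The multiplication itself is trivial; the cost discussed in this thread comes from having to read the primitive counter on the CPU before doing it.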


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] gallium/hud: replace byte units flag with pipe_driver_query_type

2015-07-07 Thread Brian Paul
Instead of using a boolean 'is bytes' value, use the pipe_driver_query_type
enum type.  This will let us add support for time values in the next patch.
---
 src/gallium/auxiliary/hud/hud_context.c  | 20 
 src/gallium/auxiliary/hud/hud_driver_query.c |  9 +++--
 src/gallium/auxiliary/hud/hud_private.h  |  5 +++--
 3 files changed, 18 insertions(+), 16 deletions(-)
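
For readers skimming the diff below, the unit selection it touches can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the Mesa function: `value_type`, `VALUE_BYTES`, `VALUE_UINT64` and `format_value` are made-up names standing in for `pipe_driver_query_type` and `number_to_human_readable`, and the fixed two-decimal output is a simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the gallium enum named in the patch. */
enum value_type { VALUE_UINT64, VALUE_BYTES };

/* Format num with a unit suffix chosen from the value type:
 * binary (1024-based) units for bytes, metric units otherwise. */
static void
format_value(uint64_t num, enum value_type type, char *out, size_t out_size)
{
   static const char *byte_units[] =
      {"", " KB", " MB", " GB", " TB", " PB", " EB"};
   static const char *metric_units[] =
      {"", " k", " M", " G", " T", " P", " E"};
   const char **units = (type == VALUE_BYTES) ? byte_units : metric_units;
   double divisor = (type == VALUE_BYTES) ? 1024 : 1000;
   double d = (double)num;
   int unit = 0;

   assert(out != NULL && out_size > 0);

   /* Scale down until the value fits the current unit (at most 6 steps). */
   while (d >= divisor && unit < 6) {
      d /= divisor;
      unit++;
   }

   /* Simplification: always print two decimals. */
   snprintf(out, out_size, "%.2f%s", d, units[unit]);
}
```

Keying the choice off an enum rather than a boolean leaves room for further value kinds (such as the time values mentioned in the commit message) to pick their own unit table.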

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 6a124f7..9f42da9 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -231,14 +231,16 @@ hud_draw_string(struct hud_context *hud, unsigned x, unsigned y,
 }
 
 static void
-number_to_human_readable(uint64_t num, boolean is_in_bytes, char *out)
+number_to_human_readable(uint64_t num, enum pipe_driver_query_type type,
+                         char *out)
 {
    static const char *byte_units[] =
       {"", " KB", " MB", " GB", " TB", " PB", " EB"};
    static const char *metric_units[] =
       {"", " k", " M", " G", " T", " P", " E"};
-   const char **units = is_in_bytes ? byte_units : metric_units;
-   double divisor = is_in_bytes ? 1024 : 1000;
+   const char **units =
+      (type == PIPE_DRIVER_QUERY_TYPE_BYTES) ? byte_units : metric_units;
+   double divisor = (type == PIPE_DRIVER_QUERY_TYPE_BYTES) ? 1024 : 1000;
    int unit = 0;
    double d = num;
 
@@ -301,7 +303,7 @@ hud_pane_accumulate_vertices(struct hud_context *hud,
                    hud->font.glyph_height / 2;
 
       number_to_human_readable(pane->max_value * i / 5,
-                               pane->uses_byte_units, str);
+                               pane->type, str);
       hud_draw_string(hud, x, y, str);
    }
 
@@ -312,7 +314,7 @@ hud_pane_accumulate_vertices(struct hud_context *hud,
       unsigned y = pane->y2 + 2 + i*hud->font.glyph_height;
 
       number_to_human_readable(gr->current_value,
-                               pane->uses_byte_units, str);
+                               pane->type, str);
       hud_draw_string(hud, x, y, "  %s: %s", gr->name, str);
       i++;
    }
@@ -869,12 +871,14 @@ hud_parse_env_var(struct hud_context *hud, const char *env)
       else if (strcmp(name, "samples-passed") == 0 &&
                has_occlusion_query(hud->pipe->screen)) {
          hud_pipe_query_install(pane, hud->pipe, "samples-passed",
-                                PIPE_QUERY_OCCLUSION_COUNTER, 0, 0, FALSE);
+                                PIPE_QUERY_OCCLUSION_COUNTER, 0, 0,
+                                PIPE_DRIVER_QUERY_TYPE_UINT64);
       }
       else if (strcmp(name, "primitives-generated") == 0 &&
                has_streamout(hud->pipe->screen)) {
          hud_pipe_query_install(pane, hud->pipe, "primitives-generated",
-                                PIPE_QUERY_PRIMITIVES_GENERATED, 0, 0, FALSE);
+                                PIPE_QUERY_PRIMITIVES_GENERATED, 0, 0,
+                                PIPE_DRIVER_QUERY_TYPE_UINT64);
       }
       else {
          boolean processed = FALSE;
@@ -901,7 +905,7 @@ hud_parse_env_var(struct hud_context *hud, const char *env)
             if (i < Elements(pipeline_statistics_names)) {
                hud_pipe_query_install(pane, hud->pipe, name,
                                       PIPE_QUERY_PIPELINE_STATISTICS, i,
-                                      0, FALSE);
+                                      0, PIPE_DRIVER_QUERY_TYPE_UINT64);
                processed = TRUE;
             }
          }
diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
b/src/gallium/auxiliary/hud/hud_driver_query.c
index ee71678..c47d232 100644
--- a/src/gallium/auxiliary/hud/hud_driver_query.c
+++ b/src/gallium/auxiliary/hud/hud_driver_query.c
@@ -150,7 +150,7 @@ void
 hud_pipe_query_install(struct hud_pane *pane, struct pipe_context *pipe,
const char *name, unsigned query_type,
unsigned result_index,
-   uint64_t max_value, boolean uses_byte_units)
+   uint64_t max_value, enum pipe_driver_query_type type)
 {
struct hud_graph *gr;
struct query_info *info;
@@ -178,8 +178,7 @@ hud_pipe_query_install(struct hud_pane *pane, struct pipe_context *pipe,
    hud_pane_add_graph(pane, gr);
    if (pane->max_value < max_value)
       hud_pane_set_max_value(pane, max_value);
-   if (uses_byte_units)
-      pane->uses_byte_units = TRUE;
+   pane->type = type;
 }
 
 boolean
@@ -189,7 +188,6 @@ hud_driver_query_install(struct hud_pane *pane, struct pipe_context *pipe,
    struct pipe_screen *screen = pipe->screen;
    struct pipe_driver_query_info query;
    unsigned num_queries, i;
-   boolean uses_byte_units;
    boolean found = FALSE;
 
    if (!screen->get_driver_query_info)
@@ -208,9 +206,8 @@ hud_driver_query_install(struct hud_pane *pane, struct pipe_context *pipe,
if (!found)
   return FALSE;
 
-   uses_byte_units = query.type == PIPE_DRIVER_QUERY_TYPE_BYTES;
hud_pipe_query_install(pane, pipe, 

[Mesa-dev] [PATCH] st/dri: don't set PIPE_BIND_SCANOUT for MSAA surfaces

2015-07-07 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/state_trackers/dri/dri2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c
index a8323a3..5aa785c 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -556,7 +556,7 @@ dri2_allocate_textures(struct dri_context *ctx,
 
          if (drawable->textures[statt]) {
             templ.format = drawable->textures[statt]->format;
-            templ.bind = drawable->textures[statt]->bind;
+            templ.bind = drawable->textures[statt]->bind & ~PIPE_BIND_SCANOUT;
             templ.nr_samples = drawable->stvis.samples;
 
             /* Try to reuse the resource.
-- 
2.1.0



Re: [Mesa-dev] [PATCH] st/dri: don't set PIPE_BIND_SCANOUT for MSAA surfaces

2015-07-07 Thread Brian Paul

On 07/07/2015 10:29 AM, Marek Olšák wrote:
> From: Marek Olšák <marek.ol...@amd.com>
>
> ---
>   src/gallium/state_trackers/dri/dri2.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c
> index a8323a3..5aa785c 100644
> --- a/src/gallium/state_trackers/dri/dri2.c
> +++ b/src/gallium/state_trackers/dri/dri2.c
> @@ -556,7 +556,7 @@ dri2_allocate_textures(struct dri_context *ctx,
>
>           if (drawable->textures[statt]) {
>              templ.format = drawable->textures[statt]->format;
> -            templ.bind = drawable->textures[statt]->bind;
> +            templ.bind = drawable->textures[statt]->bind & ~PIPE_BIND_SCANOUT;
>              templ.nr_samples = drawable->stvis.samples;
>
>              /* Try to reuse the resource.



LGTM.

Reviewed-by: Brian Paul bri...@vmware.com



Re: [Mesa-dev] [Mesa-stable] [PATCH] opencl: use versioned .so in mesa.icd

2015-07-07 Thread Emil Velikov
Ccing Tom

Thank you Igor !

On 07/07/15 11:05, Igor Gnatenko wrote:
> We must have a versioned library in mesa.icd, because the ICD loader would
> fail if the mesa-devel package wasn't installed.
>
> Reported-by: Fabian Deutsch <fabian.deut...@gmx.de>
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73512
> Cc: "10.6" <mesa-sta...@lists.freedesktop.org>
> Signed-off-by: Igor Gnatenko <i.gnatenko.br...@gmail.com>

Similar to the default location of the .icd file, this is another picky
topic. Regardless, I think we should go ahead with this patch.

Why ? First let's see what others do:
 - nvidia - versioned soname, resides in lib. The full soname is used.
 - catalyst - no soname, resides in lib. libamdocl32/64.so is used.
 - beignet - unversioned soname, resides in lib/foo. Full library path
is used and no version.
 - the spec - does not mention anything about soname, versioning or
location. The example gives a plain libVendorAOpenCL.so.
Based off this one can assume that it should live in lib, although
everything else remains open.

As our lovely build always sets SONAME (even when -module is set), the
Fedora guys are doing (have been shipping) it correctly.

With all that said, do we have any comments/objections against this patch ?

Thanks,
Emil



Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-07 Thread Francisco Jerez
Matt Turner <matts...@gmail.com> writes:

> On Sun, Jul 5, 2015 at 4:45 PM, Francisco Jerez <curroje...@riseup.net> wrote:
> > Hi Matt,
> >
> > Matt Turner <matts...@gmail.com> writes:
> >
> > > On Fri, Jul 3, 2015 at 3:46 AM, Francisco Jerez <curroje...@riseup.net> wrote:
> > > > Heh, I happened to come across this comment yesterday while looking for
> > > > the remaining no16 calls and wondered why on earth it couldn't do the
> > > > same that the normal interpolation code does.  After this patch and a
> > > > series coming up that will remove all SIMD8 fallbacks from the texturing
> > > > code, the only case left still applicable to Gen7 hardware and later
> > > > will be "SIMD16 explicit accumulator operands unsupported".  Anyone?
> > >
> > > I can explain the problem:
> > >
> > > Prior to Gen7, there were two accumulator registers usable for most
> > > datatypes (acc0, acc1). On Gen7, they removed integer-support from
> > > acc1, which was necessary to implement SIMD16 integer multiplication
> > > using the normal MUL/MACH sequence.
> >
> > IIRC they got rid of the acc1 register on IVB altogether, but managed to
> > emulate it for floating point types by taking advantage of the extra
> > precision not normally used for floating point arithmetic (the fake acc1
> > basically uses the same storage in the EU that holds the 32 MSBs of each
> > component of acc0), which explains the apparent asymmetry between integer
> > and floating point data types.
>
> I've never read anything that told me that -- what have you seen?

Heh, I'll try to dig up my reference and send it to you in private.


> > > I implemented 32-bit integer multiplication without using the
> > > accumulator in:
> > >
> > > commit f7df169ba13d22338e9276839a7e9629ca0a6b4f
> > > Author: Matt Turner <matts...@gmail.com>
> > > Date:   Wed May 13 18:34:03 2015 -0700
> > >
> > >     i965/fs: Implement integer multiply without mul/mach.
> > >
> > > The remaining cases of "SIMD16 explicit accumulator operands
> > > unsupported" are ADDC, SUBB, and 32x32 -> high 32-bit multiplication.
> > > The remaining multiplication case can probably be reimplemented
> > > without the accumulator, like I did for the low 32-bit result.
> >
> > Hmm, I have the suspicion that high 32-bit multiplication is the one
> > legit use-case of the accumulator we have left, any algorithm breaking
> > it up into individual 32/16-bit MULs would end up doing more
> > multiplications than the two MUL/MACH instructions we do now, because we
> > wouldn't be able to take advantage of the full precision implemented in
> > the hardware if we truncate the 48-bit intermediate results to fit in a
> > 32-bit register.
>
> That's probably true. It's just that Sandybridge and earlier don't
> expose the functionality (but could do 64-bit integer multiplication
> just fine), Ivybridge has the quarter-control/accumulator bug, Haswell
> works fine if you split the multiplication sequence into SIMD8, and
> Broadwell lets you do 32x32 -> 64-bit multiplication without the
> accumulator.
>
> So you have only two platforms where you have to use the
> accumulator, and one of them is broken (but I guess can be trivially
> fixed by some force-writemask-all hackery).


I guess there's also VLV, CHV and BXT, AFAIK the latter two have some
level of support for 64-bit multiplication (with the annoying alignment
restriction on the operands) but it might be easier for them to use the
accumulator path like earlier hardware.

> The best SIMD16 code for [iu]mulExtended() where both lsb and msb
> results are used is probably 2 sets of mul/mach/mov (with some kind of
> work around for Ivybridge), but that's kind of hard to recognize.

It's probably also the best SIMD16 code (on chips without reasonable
support for 64-bit multiply that is) for computing the high 32 bits of
the result, regardless of whether the optimizer is able to recognise that
the low 32 bits of the computation also come out as a side product, and
whether or not the low 32 bits are used by the shader.

A potential solution could be to have the visitor emit full 64-bit MULs
speculatively for any 32-bit integer multiplication (high or low),
together with a MOV to chop off the unnecessary bits, a later
optimization pass (run after CSE to give the optimizer the opportunity
to merge the 64-bits MULs from the high and low 32-bit computations)
would demote 64-bit MULs for which only the lowest 32-bits of the result
are used to 32-bit MULs, later on the SIMD width lowering pass would
split 16-wide 64-bit MULs in half, and a later pass would lower them
into the MUL/MACH sequence on platforms that don't support full 64-bit
MULs natively.

Not sure if it's worth doing at this point.  I can have a look into
implementing the lowering pass for 64-bit MULs so we can start taking
advantage of the SIMD width lowering pass and get rid of the no16() call
right away, but the additional optimization pass to demote 64-bit MULs
(and speculative emission of 64-bit MULs from the visitor) can probably
wait until we have some use-case?
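
To make the multiplication-count argument from earlier in this thread concrete, here is a plain-C sketch of computing the high 32 bits of an unsigned 32x32 multiply from 16-bit pieces. It is only an illustration of the arithmetic, not i965 compiler code or anything proposed in the thread: `umul_high32` is a hypothetical name, and the EU's actual MUL/MACH operand widths are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of a * b using only 32-bit arithmetic on 16-bit halves. */
static uint32_t
umul_high32(uint32_t a, uint32_t b)
{
   uint32_t a_lo = a & 0xffff, a_hi = a >> 16;
   uint32_t b_lo = b & 0xffff, b_hi = b >> 16;

   /* Four partial products, each at most 0xfffe0001, so none overflow. */
   uint32_t lo_lo = a_lo * b_lo;
   uint32_t hi_lo = a_hi * b_lo;
   uint32_t lo_hi = a_lo * b_hi;
   uint32_t hi_hi = a_hi * b_hi;

   /* Sum of everything landing in bits [47:16]; its top part is the carry
    * into the high word. Three terms below 2^16 each cannot overflow. */
   uint32_t mid = (lo_lo >> 16) + (hi_lo & 0xffff) + (lo_hi & 0xffff);

   return hi_hi + (hi_lo >> 16) + (lo_hi >> 16) + (mid >> 16);
}
```

This needs four multiplications plus carry handling, which is the sense in which a split-up version does more work than a two-instruction MUL/MACH pair.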

> > How about we use the SIMD width lowering pass to split the computation
> > in half?  It should be quite straightforward but will probably require
> > adding a new 

Re: [Mesa-dev] [PATCH 0/8] Render node only opencl and pipe-loader cleanups

2015-07-07 Thread Emil Velikov
On 30/06/15 16:09, Emil Velikov wrote:
> Hello all,
>
> As mentioned over IRC a few weeks back, here is a series that removes
> support for non-render node devices.
>
> The two main motivations being:
>  - Currently we force X/xcb onto everyone that wants to use OpenCL
> (headless OpenCL systems/farms anyone ?)
>  - Nice overall cleanup - 43 insertions(+), 279 deletions(-)
>
>
> Note that the final patches touch related code - from removing an unused
> function (pipe_loader_sw_probe_xlib) to using loader_open_device() over
> open(), with the former caring about CLOEXEC.

Francisco, Tom,

Can you guys please take a look at the series. Even an Ack would be
greatly appreciated.

Thanks
Emil
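
For context on the CLOEXEC remark quoted above, this is a minimal sketch of what an open helper that "cares about CLOEXEC" can look like. It is an assumption-laden illustration, not a copy of Mesa's loader_open_device(): the name `open_device_cloexec` is hypothetical, and the fallback path is just one common way to handle systems without O_CLOEXEC.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Open path read/write with the close-on-exec flag set, so the fd does
 * not leak into child processes across exec(). */
static int
open_device_cloexec(const char *path)
{
   int fd;
#ifdef O_CLOEXEC
   /* Preferred: set the flag atomically at open time. */
   fd = open(path, O_RDWR | O_CLOEXEC);
   if (fd == -1 && errno == EINVAL)
#endif
   {
      /* Fallback: open first, then set FD_CLOEXEC with fcntl(). */
      fd = open(path, O_RDWR);
      if (fd != -1)
         fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);
   }
   return fd;
}
```

A plain open() without the flag would hand the device fd to any program the application later executes, which is the difference being pointed at.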


[Mesa-dev] [PATCH v3 5/6] i965: Upload binding tables in hw-generated binding table format.

2015-07-07 Thread Abdiel Janulgue
When hardware-generated binding tables are enabled, use the hw-generated
binding table format when uploading binding table state.

Normally, the CS will just consume the binding table pointer commands
as pipelined state. When the RS is enabled however, the RS flushes whatever
edited surface state entries of our on-chip binding table to the binding
table pool before passing the command on to the CS.

Note that the binding table pointer offset is relative to the binding table
pool base address when the resource streamer is enabled, instead of the
surface state base address.
v2: Fix possible buffer overflow when allocating a chunk out of the
hw-binding table pool (Ken).
v3: Remove extra newline and add missing brace around if-statement (Matt).

Cc: kenn...@whitecape.org
Cc: matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 72 --
 1 file changed, 56 insertions(+), 16 deletions(-)
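
The pool reservation scheme in the diff below can be sketched on its own as a small bump allocator. This is a hedged, standalone illustration: `struct bt_pool`, `bt_pool_reserve` and `hw_bt_pointer_field` are hypothetical names, the pool is plain host memory bookkeeping rather than a GEM buffer object, and the 128-byte slack, 64-byte alignment and shift by one are simply mirrored from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a power-of-two alignment. */
#define ALIGN_POT(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Hypothetical stand-in for the hw_bt_pool bookkeeping. */
struct bt_pool {
   uint32_t size;         /* total bytes in the pool */
   uint32_t next_offset;  /* next free byte */
};

/* Reserve bytes from the pool and return the offset of the reservation.
 * Wraps back to the start when the request would run into the slack
 * kept at the end of the pool. */
static uint32_t
bt_pool_reserve(struct bt_pool *pool, uint32_t bytes)
{
   assert(pool->size > 128 && bytes < pool->size - 128);

   if (pool->next_offset + bytes >= pool->size - 128)
      pool->next_offset = 0;

   uint32_t offset = pool->next_offset;

   /* Each reservation starts on a 64-byte boundary. */
   pool->next_offset += ALIGN_POT(bytes, 64u);
   return offset;
}

/* Offset as written into the binding table pointer packet when
 * hw-generated binding tables are enabled (bits [16:6] -> [15:5]). */
static uint32_t
hw_bt_pointer_field(uint32_t offset)
{
   return offset >> 1;
}
```

Note that wrapping simply reuses the start of the pool; whether earlier entries are still in flight at that point is outside the scope of this sketch.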

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index b3d592b..cc56dbf 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -50,6 +50,26 @@ static const GLuint stage_to_bt_edit[MESA_SHADER_FRAGMENT + 1] = {
    _3DSTATE_BINDING_TABLE_EDIT_PS,
 };
 
+static uint32_t
+reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)
+{
+   if (brw->hw_bt_pool.next_offset + bytes >= brw->hw_bt_pool.bo->size - 128) {
+      gen7_reset_hw_bt_pool_offsets(brw);
+   }
+
+   uint32_t offset = brw->hw_bt_pool.next_offset;
+
+   /* From the Haswell PRM, Volume 2b: Command Reference: Instructions,
+    * 3DSTATE_BINDING_TABLE_POINTERS_xS:
+    *
+    * "If HW Binding Table is enabled, the offset is relative to the
+    *  Binding Table Pool Base Address and the alignment is 64 bytes."
+    */
+   brw->hw_bt_pool.next_offset += ALIGN(bytes, 64);
+
+   return offset;
+}
+
 /**
  * Upload a shader stage's binding table as indirect state.
  *
@@ -70,30 +90,50 @@ brw_upload_binding_table(struct brw_context *brw,
 
       stage_state->bind_bo_offset = 0;
    } else {
-      /* Upload a new binding table. */
-      if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
-         brw->vtbl.emit_buffer_surface_state(
-            brw, &stage_state->surf_offset[
-                    prog_data->binding_table.shader_time_start],
-            brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW,
-            brw->shader_time.bo->size, 1, true);
+      /* When RS is enabled use hw-binding table uploads, otherwise fallback to
+       * software-uploads.
+       */
+      if (brw->use_resource_streamer) {
+         gen7_update_binding_table_from_array(brw, stage_state->stage,
+                                              stage_state->surf_offset,
+                                              prog_data->binding_table
+                                              .size_bytes / 4);
+      } else {
+         /* Upload a new binding table. */
+         if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
+            brw->vtbl.emit_buffer_surface_state(
+               brw, &stage_state->surf_offset[
+                       prog_data->binding_table.shader_time_start],
+               brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW,
+               brw->shader_time.bo->size, 1, true);
+         }
+
+         uint32_t *bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
+                                          prog_data->binding_table.size_bytes,
+                                          32,
+                                          &stage_state->bind_bo_offset);
+
+         /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */
+         memcpy(bind, stage_state->surf_offset,
+                prog_data->binding_table.size_bytes);
       }
-
-      uint32_t *bind = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
-                                       prog_data->binding_table.size_bytes, 32,
-                                       &stage_state->bind_bo_offset);
-
-      /* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */
-      memcpy(bind, stage_state->surf_offset,
-             prog_data->binding_table.size_bytes);
    }
 
    brw->ctx.NewDriverState |= brw_new_binding_table;
 
    if (brw->gen >= 7) {
+      if (brw->use_resource_streamer) {
+         stage_state->bind_bo_offset =
+            reserve_hw_bt_space(brw, prog_data->binding_table.size_bytes);
+      }
       BEGIN_BATCH(2);
       OUT_BATCH(packet_name << 16 | (2 - 2));
-      OUT_BATCH(stage_state->bind_bo_offset);
+      /* Align SurfaceStateOffset[16:6] format to [15:5] PS Binding Table field
+       * when hw-generated binding table is enabled.
+       */
+      OUT_BATCH(brw->use_resource_streamer ?
+                (stage_state->bind_bo_offset >> 1) :
+                stage_state->bind_bo_offset);
       ADVANCE_BATCH();
    }
 }
-- 
1.9.1


[Mesa-dev] [Bug 91254] (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91254

Tomasz C. <toma...@o2.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |toma...@o2.pl

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 91254] (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91254

Bug ID: 91254
   Summary: (regresion) video using VA-API on Intel slow and
freeze system with mesa 10.6 or 10.6.1
   Product: Mesa
   Version: 10.6
  Hardware: x86-64 (AMD64)
   URL: https://bugs.archlinux.org/task/45459
OS: Linux (All)
Status: NEW
  Severity: major
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: toma...@o2.pl
QA Contact: i...@freedesktop.org

After upgrading from mesa 10.5.7 to 10.6.0-1 on Intel graphics, the display
slows down and freezes whenever VA-API is in use.

Additional info:

* package version(s)
mesa 10.6.0-1
mesa-libgl 10.6.0-1

* config and/or log files etc.
System: 4.0.6-1-ck x86_64 (64 bit), (tested also 4.0.5-2), Desktop: KDE (Plasma
5.3)
CPU: Dual core Intel Core i5 M 450 (-HT-MCP-) cache: 3072 KB 
Graphics: Card: Intel Core Processor Integrated Graphics Controller
Display Server: X.Org 1.17.2 driver: intel Resolution: 1920x1080@60.00hz
GLX Renderer: Mesa DRI Intel Ironlake Mobile GLX Version: 2.1 Mesa 10.6.0

The problem occurs on an Intel Core i5 M 450 (first-generation Intel Core,
Nehalem). It was also tested on an i3-3220T (third generation, Ivy Bridge)
and an i3-4005U (fourth generation, Haswell), where it works properly.
Second generation (Sandy Bridge) was not tested.

Steps to reproduce:

Method 1
- install mesa 10.6.0-1 and mesa-libgl 10.6.0-1 (or 10.6.1)
- install mpv and configure it:
vo=opengl
hwdec=vaapi
- play any video

Method 2
- install mesa 10.6.0-1 and mesa-libgl 10.6.0-1
- install kodi
- enable VA-API (Settings > Video > Acceleration)
- play any video

Symptoms: display slows down and freezes

Tested on:
- xf86-video-intel 1:2.99.917+364+gb24e758-1 and 2.99.917-5
- AccelMethod, SNA, UXA, glamor
- Linux 4.0.6, 4.0.5-2, 4.1.1

The problem shows up with most video files, but not all.
You can test with: Jellyfish Video Bitrate Test Files http://jell.yfish.us/

Only downgrading mesa and mesa-libgl to 10.5.7-1 helps.

Downgrading xf86-video-intel to 2.99.917-5 does not help, hence the suspicion
that the problem is in mesa.

Upgrading libva and libva-intel-driver from 1.5.1 to 1.6.0 does not resolve
this bug.

The bug is also reported at:
https://bugs.archlinux.org/task/45459
https://bbs.archlinux.org/viewtopic.php?id=198982



[Mesa-dev] [PATCH v5 3/6] i965: Enable hardware-generated binding tables on render path.

2015-07-07 Thread Abdiel Janulgue
This patch implements the binding table enable command, which is also
used to allocate a binding table pool where hardware-generated
binding table entries are flushed into. Each binding table offset in
the binding table pool is unique to each shader stage that is
enabled within a batch.

Also insert the required brw_tracked_state objects to enable
hw-generated binding tables in normal render path.

v2: - Use MOCS in binding table pool alloc for GEN8
    - Fix spurious offset when allocating binding table pool entry
      and start from zero instead.
v3: - Include GEN8 fix for spurious offset above.
v4: - Fixup wrong packet length in enable/disable hw-binding table
      for GEN8 (Ville).
    - Don't invoke HW-binding table disable command when we don't
      have resource streamer (Chris).
v5: - Reorder the state cache invalidate flush so it happens in-between
      enabling hw-generated binding tables and the previous sw-binding
      table GPU state (Chris).

Cc: kenn...@whitecape.org
Cc: syrj...@sci.fi
Cc: ch...@chris-wilson.co.uk
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 96 ++
 src/mesa/drivers/dri/i965/brw_context.c|  4 ++
 src/mesa/drivers/dri/i965/brw_context.h|  6 ++
 src/mesa/drivers/dri/i965/brw_state.h  |  6 ++
 src/mesa/drivers/dri/i965/brw_state_upload.c   |  4 ++
 src/mesa/drivers/dri/i965/gen7_disable.c   |  4 +-
 src/mesa/drivers/dri/i965/gen8_disable.c   |  4 +-
 src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
 8 files changed, 124 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 98ff0dd..2f32976 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -170,6 +170,102 @@ const struct brw_tracked_state brw_gs_binding_table = {
    .emit = brw_gs_upload_binding_table,
 };
 
+/**
+ * Hardware-generated binding tables for the resource streamer
+ */
+void
+gen7_disable_hw_binding_tables(struct brw_context *brw)
+{
+   if (!brw->use_resource_streamer)
+      return;
+
+   int pkt_len = brw->gen >= 8 ? 4 : 3;
+
+   BEGIN_BATCH(pkt_len);
+   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (pkt_len - 2));
+   if (brw->gen >= 8) {
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+      OUT_BATCH(0);
+   } else {
+      OUT_BATCH(HSW_BT_POOL_ALLOC_MUST_BE_ONE);
+      OUT_BATCH(0);
+   }
+   ADVANCE_BATCH();
+
+   /* From the Haswell PRM, Volume 7: 3D Media GPGPU,
+    * 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note:
+    *
+    * "When switching between HW and SW binding table generation, SW must
+    * issue a state cache invalidate."
+    */
+   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
+}
+
+void
+gen7_enable_hw_binding_tables(struct brw_context *brw)
+{
+   if (!brw->use_resource_streamer)
+      return;
+
+   if (!brw->hw_bt_pool.bo) {
+      /* We use a single re-usable buffer object for the lifetime of the
+       * context and size it to maximum allowed binding tables that can be
+       * programmed per batch:
+       *
+       * From the Haswell PRM, Volume 7: 3D Media GPGPU,
+       * 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note:
+       * "A maximum of 16,383 Binding tables are allowed in any batch buffer"
+       */
+      static const int max_size = 16383 * 4;
+      brw->hw_bt_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "hw_bt",
+                                              max_size, 64);
+      brw->hw_bt_pool.next_offset = 0;
+   }
+
+   /* From the Haswell PRM, Volume 7: 3D Media GPGPU,
+    * 3DSTATE_BINDING_TABLE_POOL_ALLOC > Programming Note:
+    *
+    * "When switching between HW and SW binding table generation, SW must
+    * issue a state cache invalidate."
+    */
+   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
+
+   int pkt_len = brw->gen >= 8 ? 4 : 3;
+   uint32_t dw1 = BRW_HW_BINDING_TABLE_ENABLE;
+   if (brw->is_haswell)
+      dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_POOL_MOCS) |
+             HSW_BT_POOL_ALLOC_MUST_BE_ONE;
+   else if (brw->gen >= 8)
+      dw1 |= BDW_MOCS_WB;
+
+   BEGIN_BATCH(pkt_len);
+   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (pkt_len - 2));
+   if (brw->gen >= 8) {
+      OUT_RELOC64(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
+      OUT_BATCH(brw->hw_bt_pool.bo->size);
+   } else {
+      OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
+      OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
+                brw->hw_bt_pool.bo->size);
+   }
+   ADVANCE_BATCH();
+}
+
+void
+gen7_reset_hw_bt_pool_offsets(struct brw_context *brw)
+{
+   brw->hw_bt_pool.next_offset = 0;
+}
+
+const struct brw_tracked_state gen7_hw_binding_tables = {
+   .dirty = {
+  .mesa = 0,
+  .brw = BRW_NEW_BATCH,
+   },
+   .emit = gen7_enable_hw_binding_tables
+};
+
 /** @} */
 
 /**
diff --git 

Re: [Mesa-dev] [PATCH 10/18] i965: Speculatively flush the batch after transform feedback

2015-07-07 Thread Chris Wilson
On Mon, Jul 06, 2015 at 09:05:18PM -0700, Kristian Høgsberg wrote:
> On Mon, Jul 6, 2015 at 12:36 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> > On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:
> > > Since the purpose of transform feedback tends to be for the client to
> > > act upon the results to change the geometry in the scene, it is likely
> > > that the client will soon be waiting upon the results. Flush the batch
> > > early so that we don't build up a long queue of commands afterwards that
> > > could delay the readback.
> > > ---
> > >  src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > index 857ebe5..13dbe5b 100644
> > > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > @@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context *ctx,
> > >
> > >     brw_batch_end(&brw->batch);
> > >
> > > +   /* We will likely want to read the results in the very near future, so
> > > +    * push this primitive to hardware if it is currently idle.
> > > +    */
> > > +   if (!brw_batch_busy(&brw->batch))
> > > +      brw_batch_flush(&brw->batch);
> > > +
> > >     /* EndTransformFeedback() means that we need to update the number of
> > >      * vertices written.  Since it's only necessary if DrawTransformFeedback()
> > >      * is called and it means mapping a buffer object, we delay computing it
> >
> > We need some data to justify this change.
>
> I think even the theory is not correct - transform feedback is
> typically fed back into the GPU (as new geometry, eg) rather than
> consumed by the CPU, and in that case the flush is not helpful. But at
> the end of the day, data will tell.

How are they fed back? Can the xfb buffer be bound to the vertex buffer?
(Genuine question! The only examples I've seen were for testing by the
CPU.)

The point of the patch was really more about getting people to think
about the idea of making sure we queue work early that we need in the
near future, and breaking such work up into packets that are naturally
fenced by the kernel.

However, Jesse made a good point that spinning on a manual semaphore for
such feedback (if needed by the CPU) is likely far superior than using
the kernel wait interfaces.

For the query object, we would reserve the first slot for the semaphore
tracking, then after every query pair would add a PIPE_CONTROL dword
write to that slot with the new seqno. For reporting we need only map
async and spin until that value is greater than the query we want to
report back to the user.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
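
The seqno-slot idea described above can be sketched on the CPU side as below. This is a minimal illustration under several assumptions that are not in the message: the slot is modelled as ordinary memory instead of a mapped buffer written by PIPE_CONTROL, "ready" is taken to mean the slot value has reached the wanted seqno, the signed-difference comparison for wraparound is an addition of this sketch, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True once the sequence number in the slot has reached wanted.
 * The signed difference keeps the test valid across 32-bit wraparound. */
static bool
seqno_passed(uint32_t current, uint32_t wanted)
{
   return (int32_t)(current - wanted) >= 0;
}

/* Spin on the first slot of the (asynchronously mapped) query buffer
 * until the GPU has written a seqno at or past the one we need. */
static void
wait_for_seqno(const volatile uint32_t *slot, uint32_t wanted)
{
   while (!seqno_passed(*slot, wanted)) {
      /* A real implementation would want a pause/yield and a timeout. */
   }
}
```

A caller would record the seqno assigned to each query pair when emitting it and pass that value as `wanted` when the application asks for the result.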


Re: [Mesa-dev] [PATCH 0/8] Render node only opencl and pipe-loader cleanups

2015-07-07 Thread Francisco Jerez
Emil Velikov emil.l.veli...@gmail.com writes:

 On 30/06/15 16:09, Emil Velikov wrote:
 Hello all,
 
 As mentioned over IRC a few weeks back, here is a series that removes 
 support for non-render node devices.
 
 The two main motivations being:
  - Currently we force X/xcb onto everyone that wants to use OpenCL
 (headless OpenCL systems/farms anyone ?)
  - Nice overall cleanup - 43 insertions(+), 279 deletions(-)
 
 
 Note that the final patches touch related code - from removing a unused 
 function (pipe_loader_sw_probe_xlib) to using loader_open_device() over 
 open(), with the former caring about CLOEXEC.

 Francisco, Tom,

 Can you guys please take a look at the series. Even an Ack would be
 greatly appreciated.


Looks OK to me, assuming that Tom is OK with the general approach the
series is:
Reviewed-by: Francisco Jerez curroje...@riseup.net

 Thanks
 Emil




Re: [Mesa-dev] [PATCH 0/8] Render node only opencl and pipe-loader cleanups

2015-07-07 Thread Tom Stellard
On Tue, Jul 07, 2015 at 05:43:19PM +0100, Emil Velikov wrote:
 On 30/06/15 16:09, Emil Velikov wrote:
  Hello all,
  
  As mentioned over IRC a few weeks back, here is a series that removes 
  support for non-render node devices.
  
  The two main motivations being:
   - Currently we force X/xcb onto everyone that wants to use OpenCL
  (headless OpenCL systems/farms anyone ?)

Is this really true?  I don't see where lack of xcb prevents users from
building OpenCL.

-Tom

   - Nice overall cleanup - 43 insertions(+), 279 deletions(-)
  
  
  Note that the final patches touch related code - from removing a unused 
  function (pipe_loader_sw_probe_xlib) to using loader_open_device() over 
  open(), with the former caring about CLOEXEC.
 
 Francisco, Tom,
 
 Can you guys please take a look at the series. Even an Ack would be
 greatly appreciated.
 

I have no problems with merging these.

-Tom
 Thanks
 Emil


Re: [Mesa-dev] [PATCH v2 5/5] i965/gen9: Allocate YF/YS tiled buffer objects

2015-07-07 Thread Anuj Phogat
On Tue, Jul 7, 2015 at 2:35 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Tuesday, June 23, 2015 01:23:05 PM Anuj Phogat wrote:
 In case of I915_TILING_{X,Y} we need to pass tiling format to libdrm
 using drm_intel_bo_alloc_tiled(). But, In case of YF/YS tiled buffers
 libdrm need not know about the tiling format because these buffers
 don't have hardware support to be tiled or detiled through a fenced
 region. libdrm still need to know buffer alignment value for its use
 in kernel when resolving the relocation.

 Using drm_intel_bo_alloc_for_render() for YF/YS tiled buffers
 satisfy both the above conditions.

 V2: Delete min/max buffer size restrictions not valid for i965+.
 Remove redundant align to tile size statements.
 Remove some redundant code now when there are no min/max buffer size.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Cc: Ben Widawsky b...@bwidawsk.net
 ---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 62 
 +--
  1 file changed, 58 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
 b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 index 80c52f2..5bcb094 100644
 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 @@ -558,6 +558,48 @@ intel_lower_compressed_format(struct brw_context *brw, 
 mesa_format format)
 }
  }

 +/* This function computes Yf/Ys tiled bo size, alignment and pitch. */
 +static uint64_t
 +intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
 +uint64_t *pitch)

 Hi Anuj,

 This patch has a subtle bug: you've specified pitch and stride to be
 uint64_t here, but below when you call it

 [snip]
 @@ -616,11 +658,23 @@ intel_miptree_create(struct brw_context *brw,
        alloc_flags |= BO_ALLOC_FOR_RENDER;

     unsigned long pitch;
 -   mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree", total_width,
 -                                     total_height, mt->cpp, &mt->tiling,
 -                                     &pitch, alloc_flags);
     mt->etc_format = etc_format;
 -   mt->pitch = pitch;
 +
 +   if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
 +      unsigned alignment = 0;
 +      unsigned long size;
 +      size = intel_get_yf_ys_bo_size(mt, &alignment, &pitch);

 ...you're passing a pointer to an unsigned long.  On 32-bit builds,
 unsigned long is a 4 byte value, while uint64_t is 8 bytes.  This could
 lead to stack corruption.  (GCC warns about this during a 32-bit build.)

Thanks for noticing this, Ken. I don't think I ever did a 32-bit build
with these patches :(.

 I assumed the solution was to make everything uint32_t, but apparently
 drm_intel_bo_alloc_tiled actually expects an unsigned long.  So we can't
 change that.

How about changing the parameter type of pitch to unsigned long*
and types of size and stride to unsigned long? This fixes the 32 bit
build warnings.
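
A small sketch of the two ways the width mismatch could be avoided, under
the assumption that the caller must end up with an unsigned long pitch for
libdrm. The helpers get_bo_size, get_bo_size64 and caller_with_temp are
hypothetical stand-ins, not the functions from the patch, and the pitch
value is a placeholder.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Option 1: type the out-parameter to match what the caller already holds,
 * so the pointer and the storage always agree in size. */
static unsigned long get_bo_size(unsigned long height, unsigned long *pitch)
{
   assert(pitch);
   *pitch = 4096;            /* example value only */
   return *pitch * height;
}

/* Option 2: keep a 64-bit helper... */
static uint64_t get_bo_size64(uint64_t height, uint64_t *pitch)
{
   assert(pitch);
   *pitch = 4096;            /* example value only */
   return *pitch * height;
}

/* ...and narrow explicitly at the call site through a real 64-bit
 * temporary, instead of passing a pointer to a narrower variable. */
static unsigned long caller_with_temp(unsigned long height,
                                      unsigned long *pitch_out)
{
   uint64_t pitch64 = 0;     /* storage really is 8 bytes on every build */
   uint64_t size = get_bo_size64(height, &pitch64);

   assert(pitch_out);
   assert(pitch64 <= ULONG_MAX && size <= ULONG_MAX);
   *pitch_out = (unsigned long)pitch64;
   return (unsigned long)size;
}
```

Either way the 8-byte store never lands on a 4-byte stack slot, which is
the corruption a 32-bit build would otherwise risk.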

 Then I looked at your code, and realized that nothing even uses the
 pitch value.  Is there some point to the parameter existing at all?

The pitch value is later assigned to mt->pitch. I could have avoided
passing the pitch parameter and instead assigned mt->pitch in
drm_intel_bo_alloc_for_render(). But I used the current approach
to keep the mt->pitch assignments in a single place.

I'm working on some refactoring to make this code look better.

 --Ken

 +      assert(size);
 +      mt->bo = drm_intel_bo_alloc_for_render(brw->bufmgr, "miptree",
 +                                             size, alignment);
 +      mt->pitch = pitch;
 +   } else {
 +      mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree",
 +                                        total_width, total_height, mt->cpp,
 +                                        &mt->tiling, &pitch,
 +                                        alloc_flags);
 +      mt->pitch = pitch;
 +   }

 /* If the BO is too large to fit in the aperture, we need to use the
  * BLT engine to support it.  Prior to Sandybridge, the BLT paths can't



Re: [Mesa-dev] [PATCH 10/18] i965: Speculatively flush the batch after transform feedback

2015-07-07 Thread Chris Wilson
On Tue, Jul 07, 2015 at 10:31:07AM -0700, Kenneth Graunke wrote:
 On Tuesday, July 07, 2015 04:46:22 PM Chris Wilson wrote:
  On Tue, Jul 07, 2015 at 10:12:20AM +0100, Chris Wilson wrote:
   On Mon, Jul 06, 2015 at 09:05:18PM -0700, Kristian Høgsberg wrote:
On Mon, Jul 6, 2015 at 12:36 PM, Kenneth Graunke 
kenn...@whitecape.org wrote:
 On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:
 Since the purpose of transform feedback tends to be for the client to
 act upon the results to change the geometry in the scene, it is 
 likely
 that the client will soon be waiting upon the results. Flush the 
 batch
 early so that we don't build up a long queue of commands afterwards 
 that
 could delay the readback.
 ---
  src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++
  1 file changed, 6 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
 b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 index 857ebe5..13dbe5b 100644
 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 @@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context 
 *ctx,

     brw_batch_end(&brw->batch);

 +   /* We will likely want to read the results in the very near future, so
 +    * push this primitive to hardware if it is currently idle.
 +    */
 +   if (!brw_batch_busy(&brw->batch))
 +      brw_batch_flush(&brw->batch);
 +
 /* EndTransformFeedback() means that we need to update the 
 number of
  * vertices written.  Since it's only necessary if 
 DrawTransformFeedback()
  * is called and it means mapping a buffer object, we delay 
 computing it


 We need some data to justify this change.

I think even the theory is not correct - transform feedback is
typically fed back into the GPU (as new geometry, eg) rather than
consumed by the CPU, and in that case the flush is not helpful. But at
the end of the day, data will tell.
   
   How are they fed back? Can the xfb buffer be bound to the vertex buffer?
   (Genuine question! The only examples I've seen were for testing by the
   CPU.)
 
 Yes, it can.  Just glBindBuffer() some buffers around.  Or, I suspect
 one could bind it as a texture buffer object or SSBO and then use a
 compute shader on the results.
 
 With GL 4.x, the "avoid synchronizing with the CPU" mentality is a lot
 more prevalent, due to the advent of compute shaders.
 
  
  I've reviewed the code again, and gen7_end_transform_feedback() is always
  followed by brw_compute_xfb_vertices_written (and a read of the sol
  buffer) afaict, maybe not immediately but always before the next
  transform feedback.
 
 Sadly, yes.  We have a primitive count and we need a vertex count - so,
 a tiny bit of math.  Ideally, we would use the Gen7.5 MI_MATH+ feature
 to do this, eliminating the CPU-GPU synchronization point.
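
The "tiny bit of math" mentioned above can be sketched as follows. This is
an illustration only: the enum and function names are hypothetical, and it
assumes that each primitive written by transform feedback contributes a
fixed number of vertices (points 1, lines 2, triangles 3).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical primitive modes for the transform feedback output. */
enum xfb_prim { XFB_POINTS, XFB_LINES, XFB_TRIANGLES };

/* Number of vertices each written primitive contributes. */
static unsigned vertices_per_prim(enum xfb_prim mode)
{
   switch (mode) {
   case XFB_POINTS:    return 1;
   case XFB_LINES:     return 2;
   case XFB_TRIANGLES: return 3;
   }
   assert(!"unknown transform feedback primitive mode");
   return 0;
}

/* Convert the primitive count read back from the SOL buffer into the
 * vertex count that DrawTransformFeedback() needs. */
static uint64_t xfb_vertices_written(uint64_t prims_written,
                                     enum xfb_prim mode)
{
   return prims_written * vertices_per_prim(mode);
}
```

Doing this on the CPU is what forces the readback; the same multiply done
on the GPU would remove the synchronization point.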
 
  Also afaict it is not possible to map the sol buffer directly into the
  application.
  -Chris
 
 It definitely is - the application creates GL buffer objects and binds
 them for use with transform feedback.  They can certainly
 glMapBufferRange() those buffers.

The trouble I see is that the values stored currently are implementation
dependent and often reset. How is the application meant to use them
directly?

(Just trying to understand a bit better. If it is that the current
implementation is stalling when not required, then trying to speed
those stalls up really is just lipstick on a pig and irrelevant. The
patch was just trying to make a suggestion that feeding the gpu around
expected stall points works best with the current batch-level granularity
of our fences. Using intrabatch semaphores for the query objects seems a
more promising avenue than doing batch flushes anyway.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Mesa-dev] [PATCH 17/18] loader: Look for any version of currently linked libudev.so

2015-07-07 Thread Emil Velikov
On 06/07/15 11:33, Chris Wilson wrote:
 Since there was an ABI break and linking twice against libudev.so.0 and
 libudev.so.1 causes the application to quickly crash, we first check if
 the application is currently linked against libudev before dlopening a
 local handle. However, for backwards/forwards compatibility, we need to
 inspect the application for current linkage against all known versions
 first.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
I'm ever so slightly concerned that RTLD_NOLOAD is not part of the POSIX
standard, so it's missing on some platforms (*BSD seems OK, while
Solaris and MacOS are not).

Then again, this code is not built for them, so we are safe. Plus it does
save nasty crashes :-) Feel free to add the Cc: mesa-stable tag.
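
A rough sketch of the check being discussed, assuming a platform whose
dlfcn.h provides RTLD_NOLOAD (it is an extension, hence the #ifdef). The
function name find_loaded_library is hypothetical and is not the loader's
actual helper.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Return a handle to the first library in the NULL-terminated `names`
 * list that is already mapped into the process, or NULL if none is (or if
 * RTLD_NOLOAD is not available on this platform). */
static void *find_loaded_library(const char *const *names)
{
   assert(names);
#ifdef RTLD_NOLOAD
   for (size_t i = 0; names[i]; i++) {
      /* RTLD_NOLOAD only succeeds if the library is already resident. */
      void *handle = dlopen(names[i], RTLD_NOLOAD | RTLD_LAZY);
      if (handle)
         return handle;   /* caller owns this reference; dlclose() it */
   }
#endif
   return NULL;
}
```

For the libudev case the list would be something like
{ "libudev.so.1", "libudev.so.0", NULL }, so that whichever version the
application already links against is reused instead of loading a second one.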

Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

Note(s) to self: 1) what was the main obstacle to dropping libudev and
sysfs 2) all that handling is completely broken in our configure.ac :-\

-Emil



[Mesa-dev] [PATCH] glsl: fix Bug 85252 - Segfault in compiler while processing ternary operator with void arguments

2015-07-07 Thread Renaud Gaubert
This is done by returning an rvalue of type void in the
ast_function_expression::hir function instead of a void expression.

This produces (in the case of the ternary) an hir with a call
to the void-returning function and an assignment of a void variable
which will be optimized out (the assignment) during the optimization
pass.

This fix results in having a valid subexpression in the many
different cases where the subexpressions are functions whose
return values are void.

This prevents a NULL dereference in the following cases:
  * binary operator
  * unary operators
  * ternary operator
  * comparison operators (except equal and nequal operator)

Equal and nequal had to be handled as a special case because
instead of segfaulting on a forbidden syntax it was now accepting
expressions with a void return value on either (or both) side of
the expression.

Piglit tests are on the way

Signed-off-by: Renaud Gaubert ren...@lse.epita.fr
Reviewed-by: Gabriel Laskar gabr...@lse.epita.fr
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85252
---
 src/glsl/ast_function.cpp |  6 +-
 src/glsl/ast_to_hir.cpp   | 10 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 92e26bf..776a754 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -1785,7 +1785,11 @@ ast_function_expression::hir(exec_list *instructions,
          /* an error has already been emitted */
          value = ir_rvalue::error_value(ctx);
       } else {
-         value = generate_call(instructions, sig, &actual_parameters, state);
+         value = generate_call(instructions, sig, &actual_parameters, state);
+         if (!value) {
+           ir_variable *const tmp = new(ctx) ir_variable(glsl_type::void_type, "void_var", ir_var_temporary);
+           value = new(ctx) ir_dereference_variable(tmp);
+         }
       }
 
       return value;
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 8cb46be..00cc16c 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1270,7 +1270,15 @@ ast_expression::do_hir(exec_list *instructions,
        *    applied to one operand that can make them match, in which
        *    case this conversion is done."
        */
-      if ((!apply_implicit_conversion(op[0]->type, op[1], state)
+
+      if (op[0]->type == glsl_type::void_type || op[1]->type == glsl_type::void_type) {
+
+         _mesa_glsl_error(& loc, state, "`%s':  wrong operand types: no operation "
+           "`%1$s' exists that takes a left-hand operand of type 'void' or a "
+           "right operand of type 'void'", (this->oper == ast_equal) ? "==" : "!=");
+
+         error_emitted = true;
+      } else if ((!apply_implicit_conversion(op[0]->type, op[1], state)
            && !apply_implicit_conversion(op[1]->type, op[0], state))
           || (op[0]->type != op[1]->type)) {
          _mesa_glsl_error(& loc, state, "operands of `%s' must have the same "
-- 
2.4.5



Re: [Mesa-dev] [PATCH 2/2] gallium/hud: add PIPE_DRIVER_QUERY_TYPE_MICROSECONDS for HUD

2015-07-07 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Tue, Jul 7, 2015 at 5:37 PM, Brian Paul bri...@vmware.com wrote:
 This allows drivers to report queries in units of microseconds and
 have the HUD display us (microseconds), ms (milliseconds) or s
 (seconds) on the graph.
 ---
  src/gallium/auxiliary/hud/hud_context.c | 25 -
  src/gallium/include/pipe/p_defines.h| 11 ++-
  2 files changed, 26 insertions(+), 10 deletions(-)

 diff --git a/src/gallium/auxiliary/hud/hud_context.c 
 b/src/gallium/auxiliary/hud/hud_context.c
 index 9f42da9..cb55220 100644
 --- a/src/gallium/auxiliary/hud/hud_context.c
 +++ b/src/gallium/auxiliary/hud/hud_context.c
 @@ -238,8 +238,9 @@ number_to_human_readable(uint64_t num, enum pipe_driver_query_type type,
        {"", " KB", " MB", " GB", " TB", " PB", " EB"};
     static const char *metric_units[] =
        {"", " k", " M", " G", " T", " P", " E"};
 -   const char **units =
 -      (type == PIPE_DRIVER_QUERY_TYPE_BYTES) ? byte_units : metric_units;
 +   static const char *time_units[] =
 +      {" us", " ms", " s"};  /* based on microseconds */
 +   const char *suffix;
     double divisor = (type == PIPE_DRIVER_QUERY_TYPE_BYTES) ? 1024 : 1000;
     int unit = 0;
     double d = num;
 @@ -249,12 +250,26 @@ number_to_human_readable(uint64_t num, enum pipe_driver_query_type type,
        unit++;
     }

 +   switch (type) {
 +   case PIPE_DRIVER_QUERY_TYPE_MICROSECONDS:
 +      assert(unit < ARRAY_SIZE(time_units));
 +      suffix = time_units[unit];
 +      break;
 +   case PIPE_DRIVER_QUERY_TYPE_BYTES:
 +      assert(unit < ARRAY_SIZE(byte_units));
 +      suffix = byte_units[unit];
 +      break;
 +   default:
 +      assert(unit < ARRAY_SIZE(metric_units));
 +      suffix = metric_units[unit];
 +   }
 +
     if (d >= 100 || d == (int)d)
 -      sprintf(out, "%.0f%s", d, units[unit]);
 +      sprintf(out, "%.0f%s", d, suffix);
     else if (d >= 10 || d*10 == (int)(d*10))
 -      sprintf(out, "%.1f%s", d, units[unit]);
 +      sprintf(out, "%.1f%s", d, suffix);
     else
 -      sprintf(out, "%.2f%s", d, units[unit]);
 +      sprintf(out, "%.2f%s", d, suffix);
  }

  static void
 diff --git a/src/gallium/include/pipe/p_defines.h 
 b/src/gallium/include/pipe/p_defines.h
 index 153897a..b0cd23d 100644
 --- a/src/gallium/include/pipe/p_defines.h
 +++ b/src/gallium/include/pipe/p_defines.h
 @@ -788,11 +788,12 @@ union pipe_color_union

  enum pipe_driver_query_type
  {
 -   PIPE_DRIVER_QUERY_TYPE_UINT64 = 0,
 -   PIPE_DRIVER_QUERY_TYPE_UINT   = 1,
 -   PIPE_DRIVER_QUERY_TYPE_FLOAT  = 2,
 -   PIPE_DRIVER_QUERY_TYPE_PERCENTAGE = 3,
 -   PIPE_DRIVER_QUERY_TYPE_BYTES  = 4,
 +   PIPE_DRIVER_QUERY_TYPE_UINT64   = 0,
 +   PIPE_DRIVER_QUERY_TYPE_UINT = 1,
 +   PIPE_DRIVER_QUERY_TYPE_FLOAT= 2,
 +   PIPE_DRIVER_QUERY_TYPE_PERCENTAGE   = 3,
 +   PIPE_DRIVER_QUERY_TYPE_BYTES= 4,
 +   PIPE_DRIVER_QUERY_TYPE_MICROSECONDS = 5,
  };

  enum pipe_driver_query_group_type
 --
 1.9.1



Re: [Mesa-dev] [PATCH] gallium/hud: display percentages with % suffix

2015-07-07 Thread Marek Olšák
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Tue, Jul 7, 2015 at 9:17 PM, Brian Paul bri...@vmware.com wrote:
 ---
  src/gallium/auxiliary/hud/hud_context.c | 3 +++
  1 file changed, 3 insertions(+)

 diff --git a/src/gallium/auxiliary/hud/hud_context.c 
 b/src/gallium/auxiliary/hud/hud_context.c
 index cb55220..bd57190 100644
 --- a/src/gallium/auxiliary/hud/hud_context.c
 +++ b/src/gallium/auxiliary/hud/hud_context.c
 @@ -255,6 +255,9 @@ number_to_human_readable(uint64_t num, enum pipe_driver_query_type type,
        assert(unit < ARRAY_SIZE(time_units));
        suffix = time_units[unit];
        break;
 +   case PIPE_DRIVER_QUERY_TYPE_PERCENTAGE:
 +      suffix = "%";
 +      break;
     case PIPE_DRIVER_QUERY_TYPE_BYTES:
        assert(unit < ARRAY_SIZE(byte_units));
        suffix = byte_units[unit];
 --
 1.9.1



Re: [Mesa-dev] [PATCH v2 5/5] i965/gen9: Allocate YF/YS tiled buffer objects

2015-07-07 Thread Anuj Phogat
On Tue, Jul 7, 2015 at 12:11 PM, Anuj Phogat anuj.pho...@gmail.com wrote:
 On Tue, Jul 7, 2015 at 2:35 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Tuesday, June 23, 2015 01:23:05 PM Anuj Phogat wrote:
 In case of I915_TILING_{X,Y} we need to pass tiling format to libdrm
 using drm_intel_bo_alloc_tiled(). But, In case of YF/YS tiled buffers
 libdrm need not know about the tiling format because these buffers
 don't have hardware support to be tiled or detiled through a fenced
 region. libdrm still need to know buffer alignment value for its use
 in kernel when resolving the relocation.

 Using drm_intel_bo_alloc_for_render() for YF/YS tiled buffers
 satisfy both the above conditions.

 V2: Delete min/max buffer size restrictions not valid for i965+.
 Remove redundant align to tile size statements.
 Remove some redundant code now when there are no min/max buffer size.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Cc: Ben Widawsky b...@bwidawsk.net
 ---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 62 
 +--
  1 file changed, 58 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
 b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 index 80c52f2..5bcb094 100644
 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 @@ -558,6 +558,48 @@ intel_lower_compressed_format(struct brw_context *brw, 
 mesa_format format)
 }
  }

 +/* This function computes Yf/Ys tiled bo size, alignment and pitch. */
 +static uint64_t
 +intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
 +uint64_t *pitch)

 Hi Anuj,

 This patch has a subtle bug: you've specified pitch and stride to be
 uint64_t here, but below when you call it

 [snip]
 @@ -616,11 +658,23 @@ intel_miptree_create(struct brw_context *brw,
        alloc_flags |= BO_ALLOC_FOR_RENDER;

     unsigned long pitch;
 -   mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree", total_width,
 -                                     total_height, mt->cpp, &mt->tiling,
 -                                     &pitch, alloc_flags);
     mt->etc_format = etc_format;
 -   mt->pitch = pitch;
 +
 +   if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
 +      unsigned alignment = 0;
 +      unsigned long size;
 +      size = intel_get_yf_ys_bo_size(mt, &alignment, &pitch);

 ...you're passing a pointer to an unsigned long.  On 32-bit builds,
 unsigned long is a 4 byte value, while uint64_t is 8 bytes.  This could
 lead to stack corruption.  (GCC warns about this during a 32-bit build.)

 Thanks for noticing this Ken. I think I never did 32 bit build with these
 patches :(.

 I assumed the solution was to make everything uint32_t, but apparently
 drm_intel_bo_alloc_tiled actually expects an unsigned long.  So we can't
 change that.

 How about changing the parameter type of pitch to unsigned long*
 and types of size and stride to unsigned long? This fixes the 32 bit
 build warnings.

 Then I looked at your code, and realized that nothing even uses the
 pitch value.  Is there some point to the parameter existing at all?

 pitch value is later assigned to mt->pitch. I could have avoided
 passing pitch parameter and instead assign mt->pitch in
 drm_intel_bo_alloc_for_render(). But, I used the current approach
Correction: assign mt->pitch in intel_get_yf_ys_bo_size()
 to keep mt->pitch assignments at a single place.

 I'm working on some refactoring to make this code look better.

 --Ken

 +      assert(size);
 +      mt->bo = drm_intel_bo_alloc_for_render(brw->bufmgr, "miptree",
 +                                             size, alignment);
 +      mt->pitch = pitch;
 +   } else {
 +      mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree",
 +                                        total_width, total_height, mt->cpp,
 +                                        &mt->tiling, &pitch,
 +                                        alloc_flags);
 +      mt->pitch = pitch;
 +   }

 /* If the BO is too large to fit in the aperture, we need to use the
  * BLT engine to support it.  Prior to Sandybridge, the BLT paths can't



[Mesa-dev] [PATCHv2] i965/gen9: Use custom MOCS entries set up by the kernel.

2015-07-07 Thread Francisco Jerez
Instead of relying on hardware defaults, the i915 kernel driver is
going to program custom MOCS tables system-wide on Gen9 hardware.  The
WT entry previously used for renderbuffers had a number of problems:
It disabled caching on eLLC, it used a reserved L3 cacheability
setting, and it used to override the PTE controls making renderbuffers
always WT on LLC regardless of the kernel's setting.  Instead use an
entry from the new MOCS tables with parameters: TC=LLC/eLLC, LeCC=PTE,
L3CC=WB.

The WB entry previously used for anything other than renderbuffers
has moved to a different index in the new MOCS tables but it should
have the same caching semantics as the old entry.

Even though the corresponding kernel change (drm/i915: Added
Programming of the MOCS) is in a way an ABI break it doesn't seem
necessary to check that the kernel is recent enough because the change
should only affect Gen9 which is still unreleased hardware.

v2: Update MOCS values for the new Android-incompatible tables
introduced in v7 of the kernel patch.

Cc: 10.6 mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_defines.h| 11 ++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c |  3 +--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 66b9abc..8ab8d62 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -2491,12 +2491,13 @@ enum brw_wm_barycentric_interp_mode {
 #define BDW_MOCS_WT  0x58
 #define BDW_MOCS_PTE 0x18
 
-/* Skylake: MOCS is now an index into an array of 64 different configurable
- * cache settings.  We still use only either write-back or write-through; and
- * rely on the documented default values.
+/* Skylake: MOCS is now an index into an array of 62 different caching
+ * configurations programmed by the kernel.
  */
-#define SKL_MOCS_WB (0b001001 << 1)
-#define SKL_MOCS_WT (0b000101 << 1)
+/* TC=LLC/eLLC, LeCC=WB, LRUM=3, L3CC=WB */
+#define SKL_MOCS_WB  (2 << 1)
+/* TC=LLC/eLLC, LeCC=PTE, LRUM=3, L3CC=WB */
+#define SKL_MOCS_PTE (1 << 1)
 
 #define MEDIA_VFE_STATE 0x7000
 /* GEN7 DW2, GEN8+ DW3 */
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index bd3eb00..dfaf762 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -401,8 +401,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
       irb->mt_layer : (irb->mt_layer / MAX2(mt->num_samples, 1));
    GLenum gl_target =
       rb->TexImage ? rb->TexImage->TexObject->Target : GL_TEXTURE_2D;
-   /* FINISHME: Use PTE MOCS on Skylake. */
-   uint32_t mocs = brw->gen >= 9 ? SKL_MOCS_WT : BDW_MOCS_PTE;
+   uint32_t mocs = brw->gen >= 9 ? SKL_MOCS_PTE : BDW_MOCS_PTE;
 
    intel_miptree_used_for_rendering(mt);
 
-- 
2.4.3



Re: [Mesa-dev] [PATCH 0/8] Render node only opencl and pipe-loader cleanups

2015-07-07 Thread Emil Velikov
On 07/07/15 19:42, Tom Stellard wrote:
 On Tue, Jul 07, 2015 at 05:43:19PM +0100, Emil Velikov wrote:
 On 30/06/15 16:09, Emil Velikov wrote:
 Hello all,

 As mentioned over IRC a few weeks back, here is a series that removes 
 support for non-render node devices.

 The two main motivations being:
  - Currently we force X/xcb onto everyone that wants to use OpenCL
 (headless OpenCL systems/farms anyone ?)
 
 Is this really true?  I don't see where lack of xcb prevents users from
 building OpenCL.
 
Ouch, I just realised how silly the wording is. Sorry about that.

Currently, if you have xcb at build time it will get picked up, regardless
of whether xcb is present at runtime. It is something nasty that I refer
to as a hidden dependency.


 -Tom
 
  - Nice overall cleanup - 43 insertions(+), 279 deletions(-)


 Note that the final patches touch related code - from removing a unused 
 function (pipe_loader_sw_probe_xlib) to using loader_open_device() over 
 open(), with the former caring about CLOEXEC.

 Francisco, Tom,

 Can you guys please take a look at the series. Even an Ack would be
 greatly appreciated.

 
 I have no problems with merging these.
 
If you'd like to take a look I can give it another week. Alternatively
I'll fix the patch 1/8 summary (same "force" mistake) and will push these
in a day or so.

Thanks
Emil


[Mesa-dev] [PATCH] gallium/hud: display percentages with % suffix

2015-07-07 Thread Brian Paul
---
 src/gallium/auxiliary/hud/hud_context.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index cb55220..bd57190 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -255,6 +255,9 @@ number_to_human_readable(uint64_t num, enum pipe_driver_query_type type,
       assert(unit < ARRAY_SIZE(time_units));
       suffix = time_units[unit];
       break;
+   case PIPE_DRIVER_QUERY_TYPE_PERCENTAGE:
+      suffix = "%";
+      break;
    case PIPE_DRIVER_QUERY_TYPE_BYTES:
       assert(unit < ARRAY_SIZE(byte_units));
       suffix = byte_units[unit];
-- 
1.9.1



Re: [Mesa-dev] [PATCH 10/18] i965: Speculatively flush the batch after transform feedback

2015-07-07 Thread Chris Wilson
On Tue, Jul 07, 2015 at 10:12:20AM +0100, Chris Wilson wrote:
 On Mon, Jul 06, 2015 at 09:05:18PM -0700, Kristian Høgsberg wrote:
  On Mon, Jul 6, 2015 at 12:36 PM, Kenneth Graunke kenn...@whitecape.org 
  wrote:
   On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:
   Since the purpose of transform feedback tends to be for the client to
   act upon the results to change the geometry in the scene, it is likely
   that the client will soon be waiting upon the results. Flush the batch
   early so that we don't build up a long queue of commands afterwards that
   could delay the readback.
   ---
src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++
1 file changed, 6 insertions(+)
  
   diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
   b/src/mesa/drivers/dri/i965/gen7_sol_state.c
   index 857ebe5..13dbe5b 100644
   --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
   +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
   @@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context *ctx,
  
       brw_batch_end(&brw->batch);
  
   +   /* We will likely want to read the results in the very near future, so
   +    * push this primitive to hardware if it is currently idle.
   +    */
   +   if (!brw_batch_busy(&brw->batch))
   +      brw_batch_flush(&brw->batch);
   +
   /* EndTransformFeedback() means that we need to update the number of
* vertices written.  Since it's only necessary if 
   DrawTransformFeedback()
* is called and it means mapping a buffer object, we delay 
   computing it
  
  
   We need some data to justify this change.
  
  I think even the theory is not correct - transform feedback is
  typically fed back into the GPU (as new geometry, eg) rather than
  consumed by the CPU, and in that case the flush is not helpful. But at
  the end of the day, data will tell.
 
 How are they fed back? Can the xfb buffer be bound to the vertex buffer?
 (Genuine question! The only examples I've seen were for testing by the
 CPU.)

I've reviewed the code again, and gen7_end_transform_feedback() is always
followed by brw_compute_xfb_vertices_written (and a read of the sol
buffer) afaict, maybe not immediately but always before the next
transform feedback.

Also afaict it is not possible to map the sol buffer directly into the
application.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Roland Scheidegger
Am 07.07.2015 um 22:35 schrieb Jose Fonseca:
 On 07/07/15 21:28, Ilia Mirkin wrote:
 On Tue, Jul 7, 2015 at 4:24 PM, Jose Fonseca jfons...@vmware.com wrote:
 I'm not experienced with the semantics around resources that can be
 read/written by shaders, so I can't really make educated comments.

 But overall this looks good to me FWIW.

 On 05/07/15 14:25, Marek Olšák wrote:

 From: Marek Olšák marek.ol...@amd.com

 Other approaches are being considered:

 1) Don't use resource wrappers (views) and pass all view parameters
  (format, layer range, level) to set_shader_images just like
  set_vertex_buffers, set_constant_buffer, or even
 glBindImageTexture
 do.


 I don't know how much pipe drivers leverage this nowadays, but these
 structures are convenient placeholders for driver data, particular
 when they
 don't support something (e.g., a certain format, or need some
 swizzling),
 natively.


 2) Use pipe_sampler_view instead of pipe_image_view,
  and maybe even use set_sampler_views instead of set_shader_images.
  set_sampler_views would have to use start_slot =
 PIPE_MAX_SAMPLERS
 for
  all writable images to allow for OpenGL textures in the lower
 slots.


 If pipe_sampler_view  and pipe_image_view are the same, we could
 indeed use
 one structure for both.  While still keeping the separate
 create/bind/destroy functions.

 The big difference is that a sampler view has a first/last layer and
 first/last level, while image views are more like surfaces which just
 have the one of each. But they also need a byte range for buffer
 images.
 
 D3D11_TEX2D_ARRAY_UAV allows to specify first/last layer
 https://msdn.microsoft.com/en-us/library/windows/desktop/ff476242.aspx ,
 so it sounds that once pipe_image_view is updated to handle D3D11, the
 difference would reduce to the absence of last_level
 
 Of course we could just ignore that and guarantee that first==last for
 images.
 
 Yes, it might not be a bad idea.
 

You could of course argue then isn't it really more like pipe_surface?
At least in d3d11 clearly they are much closer in concept to rts. The
actual structures are of course mostly the same in gallium, the
differences boil down to pipe_surface having (long obsolete)
width/height parameters and a writable flag, whereas sampler views
instead have swizzling fields (I don't think they'd have any use for
this), support multiple levels (again, not needed for shader images /
uavs), and have a target parameter (in d3d10, rts actually have a target
parameter too, but it is of no practical consequence, hence there was no
need for that in gallium - I'm not sure if it would be required for
shader images / uavs, uavs certainly have such target parameter too but
I'm not sure it matters).
But in any case, I'm pretty impartial to what structure is used, as long
as it is created/destroyed separately.

Roland



Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Jose Fonseca
I'm not experienced with the semantics around resources that can be 
read/written by shaders, so I can't really make educated comments.


But overall this looks good to me FWIW.

On 05/07/15 14:25, Marek Olšák wrote:

From: Marek Olšák marek.ol...@amd.com

Other approaches are being considered:

1) Don't use resource wrappers (views) and pass all view parameters
(format, layer range, level) to set_shader_images just like
set_vertex_buffers, set_constant_buffer, or even glBindImageTexture do.


I don't know how much pipe drivers leverage this nowadays, but these 
structures are convenient placeholders for driver data, particularly when 
they don't support something natively (e.g., a certain format, or need 
some swizzling).




2) Use pipe_sampler_view instead of pipe_image_view,
and maybe even use set_sampler_views instead of set_shader_images.
set_sampler_views would have to use start_slot = PIPE_MAX_SAMPLERS for
all writable images to allow for OpenGL textures in the lower slots.


If pipe_sampler_view and pipe_image_view are the same, we could indeed 
use one structure for both, while still keeping the separate 
create/bind/destroy functions.


This would enable drivers to treat them uniformly internally if they 
wanted (e.g., by concatenating all view bindings into a single array as 
you described), or to keep separate internal objects if they wanted.


This seems the best of both worlds.

There is even a precedent: {create,bind,delete}_{fs,vs,gs}_state. 
These all use the same template structure, but drivers are free to 
create joint or disjoint private structures for each kind.  And in fact 
llvmpipe (and all draw-based drivers) end up using different private 
objects.


Jose


Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Jose Fonseca

On 07/07/15 21:28, Ilia Mirkin wrote:

On Tue, Jul 7, 2015 at 4:24 PM, Jose Fonseca jfons...@vmware.com wrote:

I'm not experienced with the semantics around resources that can be
read/written by shaders, so I can't really make educated comments.

But overall this looks good to me FWIW.

On 05/07/15 14:25, Marek Olšák wrote:


From: Marek Olšák marek.ol...@amd.com

Other approaches are being considered:

1) Don't use resource wrappers (views) and pass all view parameters
 (format, layer range, level) to set_shader_images just like
 set_vertex_buffers, set_constant_buffer, or even glBindImageTexture
do.



I don't know how much pipe drivers leverage this nowadays, but these
structures are convenient placeholders for driver data, particular when they
don't support something (e.g., a certain format, or need some swizzling),
natively.



2) Use pipe_sampler_view instead of pipe_image_view,
 and maybe even use set_sampler_views instead of set_shader_images.
 set_sampler_views would have to use start_slot = PIPE_MAX_SAMPLERS
for
 all writable images to allow for OpenGL textures in the lower slots.



If pipe_sampler_view  and pipe_image_view are the same, we could indeed use
one structure for both.  While still keeping the separate
create/bind/destroy functions.


The big difference is that a sampler view has a first/last layer and
first/last level, while image views are more like surfaces which just
have the one of each. But they also need a byte range for buffer
images.


D3D11_TEX2D_ARRAY_UAV allows specifying first/last layer 
https://msdn.microsoft.com/en-us/library/windows/desktop/ff476242.aspx , 
so it sounds like once pipe_image_view is updated to handle D3D11, the 
difference would reduce to the absence of last_level.



Of course we could just ignore that and guarantee that first==last for images.


Yes, it might not be a bad idea.

Jose


Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Ilia Mirkin
On Tue, Jul 7, 2015 at 4:24 PM, Jose Fonseca jfons...@vmware.com wrote:
 I'm not experienced with the semantics around resources that can be
 read/written by shaders, so I can't really make educated comments.

 But overall this looks good to me FWIW.

 On 05/07/15 14:25, Marek Olšák wrote:

 From: Marek Olšák marek.ol...@amd.com

 Other approaches are being considered:

 1) Don't use resource wrappers (views) and pass all view parameters
 (format, layer range, level) to set_shader_images just like
 set_vertex_buffers, set_constant_buffer, or even glBindImageTexture
 do.


 I don't know how much pipe drivers leverage this nowadays, but these
 structures are convenient placeholders for driver data, particular when they
 don't support something (e.g., a certain format, or need some swizzling),
 natively.


 2) Use pipe_sampler_view instead of pipe_image_view,
 and maybe even use set_sampler_views instead of set_shader_images.
 set_sampler_views would have to use start_slot = PIPE_MAX_SAMPLERS
 for
 all writable images to allow for OpenGL textures in the lower slots.


 If pipe_sampler_view  and pipe_image_view are the same, we could indeed use
 one structure for both.  While still keeping the separate
 create/bind/destroy functions.

The big difference is that a sampler view has a first/last layer and
first/last level, while image views are more like surfaces which just
have the one of each. But they also need a byte range for buffer
images.

Of course we could just ignore that and guarantee that first==last for images.


 This would enable drivers to treat them uniformly internally if they wanted
 (e.g., by concatenating all view bindings into a single array as you
 described), or to keep separate internal objects if they wanted.

 This seems the best of both worlds.

 There is even a precedent: {create,bind,delete}_{fs,vs,gs}_state. These all
 use the same template structure, but drivers are free to create joint or
 disjoint private structures for each kind.  And in fact llvmpipe (and all
 draw-based drivers) end up using different private objects.

 Jose



Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Ilia Mirkin
On Tue, Jul 7, 2015 at 4:35 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 07/07/15 21:28, Ilia Mirkin wrote:

 On Tue, Jul 7, 2015 at 4:24 PM, Jose Fonseca jfons...@vmware.com wrote:

 I'm not experienced with the semantics around resources that can be
 read/written by shaders, so I can't really make educated comments.

 But overall this looks good to me FWIW.

 On 05/07/15 14:25, Marek Olšák wrote:


 From: Marek Olšák marek.ol...@amd.com

 Other approaches are being considered:

 1) Don't use resource wrappers (views) and pass all view parameters
  (format, layer range, level) to set_shader_images just like
  set_vertex_buffers, set_constant_buffer, or even glBindImageTexture
 do.



 I don't know how much pipe drivers leverage this nowadays, but these
 structures are convenient placeholders for driver data, particular when
 they
 don't support something (e.g., a certain format, or need some swizzling),
 natively.


 2) Use pipe_sampler_view instead of pipe_image_view,
  and maybe even use set_sampler_views instead of set_shader_images.
  set_sampler_views would have to use start_slot = PIPE_MAX_SAMPLERS
 for
  all writable images to allow for OpenGL textures in the lower
 slots.



 If pipe_sampler_view  and pipe_image_view are the same, we could indeed
 use
 one structure for both.  While still keeping the separate
 create/bind/destroy functions.


 The big difference is that a sampler view has a first/last layer and
 first/last level, while image views are more like surfaces which just
 have the one of each. But they also need a byte range for buffer
 images.


 D3D11_TEX2D_ARRAY_UAV allows to specify first/last layer
 https://msdn.microsoft.com/en-us/library/windows/desktop/ff476242.aspx , so
 it sounds that once pipe_image_view is updated to handle D3D11, the
 difference would reduce to the absence of last_level

Erm. Duh. OpenGL needs first/last layer too. And pipe_surface has it
too, so it all works out well :)


 Of course we could just ignore that and guarantee that first==last for
 images.


 Yes, it might not be a bad idea.

 Jose


[Mesa-dev] [PATCH 2/2] gallium/hud: add PIPE_DRIVER_QUERY_TYPE_MICROSECONDS for HUD

2015-07-07 Thread Brian Paul
This allows drivers to report queries in units of microseconds and
have the HUD display us (microseconds), ms (milliseconds) or s
(seconds) on the graph.
---
 src/gallium/auxiliary/hud/hud_context.c | 25 -
 src/gallium/include/pipe/p_defines.h| 11 ++-
 2 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 9f42da9..cb55220 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -238,8 +238,9 @@ number_to_human_readable(uint64_t num, enum 
pipe_driver_query_type type,
       {"", " KB", " MB", " GB", " TB", " PB", " EB"};
    static const char *metric_units[] =
       {"", " k", " M", " G", " T", " P", " E"};
-   const char **units =
-      (type == PIPE_DRIVER_QUERY_TYPE_BYTES) ? byte_units : metric_units;
+   static const char *time_units[] =
+      {" us", " ms", " s"};  /* based on microseconds */
+   const char *suffix;
    double divisor = (type == PIPE_DRIVER_QUERY_TYPE_BYTES) ? 1024 : 1000;
    int unit = 0;
    double d = num;
@@ -249,12 +250,26 @@ number_to_human_readable(uint64_t num, enum 
pipe_driver_query_type type,
       unit++;
    }
 
+   switch (type) {
+   case PIPE_DRIVER_QUERY_TYPE_MICROSECONDS:
+      assert(unit < ARRAY_SIZE(time_units));
+      suffix = time_units[unit];
+      break;
+   case PIPE_DRIVER_QUERY_TYPE_BYTES:
+      assert(unit < ARRAY_SIZE(byte_units));
+      suffix = byte_units[unit];
+      break;
+   default:
+      assert(unit < ARRAY_SIZE(metric_units));
+      suffix = metric_units[unit];
+   }
+
    if (d >= 100 || d == (int)d)
-      sprintf(out, "%.0f%s", d, units[unit]);
+      sprintf(out, "%.0f%s", d, suffix);
    else if (d >= 10 || d*10 == (int)(d*10))
-      sprintf(out, "%.1f%s", d, units[unit]);
+      sprintf(out, "%.1f%s", d, suffix);
    else
-      sprintf(out, "%.2f%s", d, units[unit]);
+      sprintf(out, "%.2f%s", d, suffix);
 }
 
 static void
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 153897a..b0cd23d 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -788,11 +788,12 @@ union pipe_color_union
 
 enum pipe_driver_query_type
 {
-   PIPE_DRIVER_QUERY_TYPE_UINT64 = 0,
-   PIPE_DRIVER_QUERY_TYPE_UINT   = 1,
-   PIPE_DRIVER_QUERY_TYPE_FLOAT  = 2,
-   PIPE_DRIVER_QUERY_TYPE_PERCENTAGE = 3,
-   PIPE_DRIVER_QUERY_TYPE_BYTES  = 4,
+   PIPE_DRIVER_QUERY_TYPE_UINT64   = 0,
+   PIPE_DRIVER_QUERY_TYPE_UINT = 1,
+   PIPE_DRIVER_QUERY_TYPE_FLOAT= 2,
+   PIPE_DRIVER_QUERY_TYPE_PERCENTAGE   = 3,
+   PIPE_DRIVER_QUERY_TYPE_BYTES= 4,
+   PIPE_DRIVER_QUERY_TYPE_MICROSECONDS = 5,
 };
 
 enum pipe_driver_query_group_type
-- 
1.9.1
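
For anyone who wants to try the unit selection outside the HUD, the scaling
logic can be approximated in a standalone form. This is only a sketch, not
the HUD code: the enum query_unit_kind, the function format_with_unit(), and
the clamp on the unit index are assumptions made for illustration (the patch
itself uses assert() rather than a clamp, and sprintf() rather than
snprintf()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for enum pipe_driver_query_type. */
enum query_unit_kind {
   UNIT_PLAIN,
   UNIT_BYTES,
   UNIT_MICROSECONDS
};

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Scale num into the largest unit that fits and append the unit suffix. */
static void
format_with_unit(uint64_t num, enum query_unit_kind kind,
                 char *out, size_t len)
{
   static const char *const byte_units[] =
      {"", " KB", " MB", " GB", " TB", " PB", " EB"};
   static const char *const metric_units[] =
      {"", " k", " M", " G", " T", " P", " E"};
   static const char *const time_units[] =
      {" us", " ms", " s"};  /* input is in microseconds */
   const char *const *units;
   size_t count;
   const double divisor = (kind == UNIT_BYTES) ? 1024 : 1000;
   double d = (double)num;
   size_t unit = 0;

   assert(out != NULL && len > 0);

   switch (kind) {
   case UNIT_MICROSECONDS:
      units = time_units;
      count = COUNT_OF(time_units);
      break;
   case UNIT_BYTES:
      units = byte_units;
      count = COUNT_OF(byte_units);
      break;
   default:
      units = metric_units;
      count = COUNT_OF(metric_units);
      break;
   }

   /* Assumption: stop at the last suffix instead of asserting. */
   while (d > divisor && unit + 1 < count) {
      d /= divisor;
      unit++;
   }

   /* Print only as many decimals as the value needs, at most two. */
   if (d >= 100 || d == (double)(int64_t)d)
      snprintf(out, len, "%.0f%s", d, units[unit]);
   else if (d >= 10 || d * 10 == (double)(int64_t)(d * 10))
      snprintf(out, len, "%.1f%s", d, units[unit]);
   else
      snprintf(out, len, "%.2f%s", d, units[unit]);
}
```

With the clamp, a very long duration simply stays in seconds (for example
two hours of microseconds prints as a large number of seconds) instead of
indexing past the end of the three-entry time table.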



[Mesa-dev] [PATCH] mesa: use implementation specified MAX_VERTEX_ATTRIBS rather than hardcoded value

2015-07-07 Thread Timothy Arceri
---
 src/glsl/linker.cpp | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 6a69c15..2f5a36f 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -3084,12 +3084,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   }
}
 
-   /* FINISHME: The value of the max_attribute_index parameter is
-* FINISHME: implementation dependent based on the value of
-* FINISHME: GL_MAX_VERTEX_ATTRIBS.  GL_MAX_VERTEX_ATTRIBS must be
-* FINISHME: at least 16, so hardcode 16 for now.
-*/
-   if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 16)) {
+   if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 
ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs)) {
   goto done;
}
 
-- 
2.4.3



Re: [Mesa-dev] [PATCH] mesa: use implementation specified MAX_VERTEX_ATTRIBS rather than hardcoded value

2015-07-07 Thread Ilia Mirkin
Assuming the comment is correct, this is

Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

src/mesa/main/get_hash_params.py:  [ "MAX_VERTEX_ATTRIBS_ARB",
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAttribs)",
"extra_ARB_vertex_program_api_es2" ],

Quickly looked over the code, and the comment does seem correct.

Perhaps not going over 80 chars by so much would be better, your call
whether to fix that or not.


On Tue, Jul 7, 2015 at 7:42 PM, Timothy Arceri t_arc...@yahoo.com.au wrote:
 ---
  src/glsl/linker.cpp | 7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)

 diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
 index 6a69c15..2f5a36f 100644
 --- a/src/glsl/linker.cpp
 +++ b/src/glsl/linker.cpp
 @@ -3084,12 +3084,7 @@ link_shaders(struct gl_context *ctx, struct 
 gl_shader_program *prog)
}
 }

 -   /* FINISHME: The value of the max_attribute_index parameter is
 -* FINISHME: implementation dependent based on the value of
 -* FINISHME: GL_MAX_VERTEX_ATTRIBS.  GL_MAX_VERTEX_ATTRIBS must be
 -* FINISHME: at least 16, so hardcode 16 for now.
 -*/
 -   if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 16)) {
 +   if (!assign_attribute_or_color_locations(prog, MESA_SHADER_VERTEX, 
 ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs)) {
goto done;
 }

 --
 2.4.3



Re: [Mesa-dev] [PATCH 10/18] i965: Speculatively flush the batch after transform feedback

2015-07-07 Thread Kenneth Graunke
On Tuesday, July 07, 2015 09:02:16 PM Chris Wilson wrote:
 On Tue, Jul 07, 2015 at 10:31:07AM -0700, Kenneth Graunke wrote:
  On Tuesday, July 07, 2015 04:46:22 PM Chris Wilson wrote:
   On Tue, Jul 07, 2015 at 10:12:20AM +0100, Chris Wilson wrote:
On Mon, Jul 06, 2015 at 09:05:18PM -0700, Kristian Høgsberg wrote:
 On Mon, Jul 6, 2015 at 12:36 PM, Kenneth Graunke 
 kenn...@whitecape.org wrote:
  On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:
  Since the purpose of transform feedback tends to be for the client 
  to
  act upon the results to change the geometry in the scene, it is 
  likely
  that the client will soon be waiting upon the results. Flush the 
  batch
  early so that we don't build up a long queue of commands 
  afterwards that
  could delay the readback.
  ---
   src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++
   1 file changed, 6 insertions(+)
 
  diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
  b/src/mesa/drivers/dri/i965/gen7_sol_state.c
  index 857ebe5..13dbe5b 100644
  --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
  +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
  @@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context 
  *ctx,
 
  brw_batch_end(brw->batch);
 
  +   /* We will likely want to read the results in the very near 
  future, so
  +    * push this primitive to hardware if it is currently idle.
  +    */
  +   if (!brw_batch_busy(brw->batch))
  +      brw_batch_flush(brw->batch);
  +
  /* EndTransformFeedback() means that we need to update the 
  number of
   * vertices written.  Since it's only necessary if 
  DrawTransformFeedback()
   * is called and it means mapping a buffer object, we delay 
  computing it
 
 
  We need some data to justify this change.
 
 I think even the theory is not correct - transform feedback is
 typically fed back into the GPU (as new geometry, eg) rather than
 consumed by the CPU, and in that case the flush is not helpful. But at
 the end of the day, data will tell.

How are they fed back? Can the xfb buffer be bound to the vertex buffer?
(Genuine question! The only examples I've seen were for testing by the
CPU.)
  
  Yes, it can.  Just glBindBuffer() some buffers around.  Or, I suspect
  one could bind it as a texture buffer object or SSBO and then use a
  compute shader on the results.
  
  With GL 4.x, the "avoid synchronizing with the CPU" mentality is a lot
  more prevalent, due to the advent of compute shaders.
  
   
   I've reviewed the code again, and gen7_end_transform_feedback() is always
   followed by brw_compute_xfb_vertices_written (and a read of the sol
   buffer) afaict, maybe not immediately but always before the next
   transform feedback.
  
  Sadly, yes.  We have a primitive count and we need a vertex count - so,
  a tiny bit of math.  Ideally, we would use the Gen7.5 MI_MATH+ feature
  to do this, eliminating the CPU-GPU synchronization point.
  
   Also afaict it is not possible to map the sol buffer directly into the
   application.
   -Chris
  
  It definitely is - the application creates GL buffer objects and binds
  them for use with transform feedback.  They can certainly
  glMapBufferRange() those buffers.
 
 The trouble I see is that the values stored currently are implementation
 dependent and often reset. How is the application meant to use them
 directly?
 
 (Just trying to understand a bit better. If it is that the current
 implementation is stalling when not required, then trying to speed
 those stalls up really is just lipstick on a pig and irrelevant. The
 patch was just trying to make a suggestion that feeding the gpu around
 expected stall points works best with the current batch-level granularity
 of our fences. Using intrabatch semaphores for the query objects seems a
 more promising avenue than doing batch flushes anyway.)
 -Chris

I think we misunderstood each other.  By SOL buffer do you mean
prim_count_bo?  If so, that's not visible to applications.

Stream out (aka transform feedback) works by writing geometry data
coming out of the VS/HS/DS/GS stages (whichever is last) into an
application buffer.  So I assumed you meant that buffer.  But the
format of /that/ data is absolutely controlled by the application.

The mechanism for counting the primitives written (to implement
glDrawTransformFeedback()) is entirely up to the driver.  It's not
the best.  Prior to MI_MATH existing, it was the best I could think of.
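
To make the "tiny bit of math" from upthread concrete: the hardware counter
gives primitives written, and the draw needs vertices, so the conversion is
a multiply by the vertices per primitive of the transform feedback mode.
The following is only a sketch under that assumption; the enum and both
function names are made up for illustration and this is not the code in
brw_compute_xfb_vertices_written().

```c
#include <stdint.h>

/* Hypothetical enum: the three primitive modes transform feedback records. */
enum xfb_prim_mode {
   XFB_POINTS,
   XFB_LINES,
   XFB_TRIANGLES
};

/* Number of vertices each recorded primitive contributes to the buffer. */
static unsigned
vertices_per_primitive(enum xfb_prim_mode mode)
{
   switch (mode) {
   case XFB_POINTS:
      return 1;
   case XFB_LINES:
      return 2;
   case XFB_TRIANGLES:
      return 3;
   }
   return 0; /* unreachable for valid modes */
}

/* Convert the primitive count from the hardware into a vertex count. */
static uint64_t
xfb_vertices_written(uint64_t prims_written, enum xfb_prim_mode mode)
{
   return prims_written * vertices_per_primitive(mode);
}
```

Doing this on the CPU is what forces the read of prim_count_bo; the same
multiply done on the GPU would avoid that synchronization point.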

--Ken




[Mesa-dev] [RFC] loader: libudev vs sysfs vs libdrm

2015-07-07 Thread Emil Velikov
Hello all,

A recent patch by Chris, fixing some libudev fun in our loader, made
me wonder whether we can clean it up a bit.

Having three different ways of retrieving the vendor/device ID does
feel a bit excessive. Plus as one gets fixed others are likely to
break - and they do.
So here is a summary of each method, from portability POV.
 - libudev: widely common across Linux distributions (but not all).
 - sysfs: written by Gary Wong to target GNU Hurd and *BSD. The *BSD
folk never got to using it though :-\
 - libdrm: used as a last-resort fallback after the above two. The
sole option used by *BSD, MacOS and Android.

libdrm seems like a nice middle ground that can be used everywhere.
Which begs the question: from a technical POV, is there any
advantage/disadvantage of using one over the other ?

I do recall Kristian and Eric participating in this discussion before,
but the only thing I can find is along the lines of "linux distros
should be using libudev" :-(


Can anyone shed some light/cast their 2c ?

Thanks
Emil


Re: [Mesa-dev] [PATCH] i965/vs: Fix matNxM vertex attributes where M != 4.

2015-07-07 Thread Chris Forbes
Reviewed-by: Chris Forbes chr...@ijw.co.nz

On Thu, Jul 2, 2015 at 8:08 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 Matrix vertex attributes have their columns padded out to vec4s, which
 I was failing to account for.  Scalar NIR expects them to be packed,
 however.

 Cc: mesa-sta...@lists.freedesktop.org
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)

 I still need to write proper Piglit tests for this.  We have basically a 
 single
 test for matrix vertex attributes, and that's a mat4 (which worked).

 But I figure we probably shouldn't hold up the bugfix on that.

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index caf1300..37b1ed7 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -91,12 +91,19 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
        * So, we need to copy from fs_reg(ATTR, var->location) to
        * offset(nir_inputs, var->data.driver_location).
        */
 -         unsigned components = var->type->without_array()->components();
 +         const glsl_type *const t = var->type->without_array();
 +         const unsigned components = t->components();
 +         const unsigned cols = t->matrix_columns;
 +         const unsigned elts = t->vector_elements;
           unsigned array_length = var->type->is_array() ? var->type->length : 
 1;
           for (unsigned i = 0; i < array_length; i++) {
 -            for (unsigned j = 0; j < components; j++) {
 -               bld.MOV(retype(offset(input, bld, components * i + j), type),
 -                       offset(fs_reg(ATTR, var->data.location + i, type), 
 bld, j));
 +            for (unsigned j = 0; j < cols; j++) {
 +               for (unsigned k = 0; k < elts; k++) {
 +                  bld.MOV(offset(retype(input, type), bld,
 +                                 components * i + elts * j + k),
 +                          offset(fs_reg(ATTR, var->data.location + i, type),
 +                                 bld, 4 * j + k));
 +               }
 +   }
  }
   }
   break;
 --
 2.4.4
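
The index arithmetic in the new loop can be checked in isolation. The
sketch below copies one matrix from a vec4-padded layout (each column
occupies four slots) into a packed layout (columns back to back), using the
same 4 * j + k and elts * j + k offsets as the patch. The function names
and the use of plain float arrays are assumptions for illustration; the
driver moves registers, not floats, and also handles the array index i.

```c
#include <assert.h>

/* Slot of element k of column j when every column is padded to a vec4. */
static unsigned
padded_slot(unsigned j, unsigned k)
{
   return 4 * j + k;
}

/* Slot of element k of column j when columns of elts elements are packed. */
static unsigned
packed_slot(unsigned elts, unsigned j, unsigned k)
{
   return elts * j + k;
}

/* Copy a cols x elts matrix from the padded layout to the packed layout. */
static void
pack_matrix_attribute(const float *padded, float *packed,
                      unsigned cols, unsigned elts)
{
   assert(cols >= 1 && cols <= 4);
   assert(elts >= 1 && elts <= 4);

   for (unsigned j = 0; j < cols; j++) {
      for (unsigned k = 0; k < elts; k++)
         packed[packed_slot(elts, j, k)] = padded[padded_slot(j, k)];
   }
}
```

For a mat4 both layouts coincide (elts == 4), which is consistent with the
existing mat4 test passing before the fix.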



[Mesa-dev] [Bug 91254] (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91254

--- Comment #2 from Tomasz C. toma...@o2.pl ---
On:
mesa-git 10.7.0_devel.71031
mesa-libgl-git 10.7.0_devel.71031
(compiled from git master)
the problem still exists, same as with 10.6 and 10.6.1.

If I roll these two packages back to version 10.5.7, it works correctly.

How can I help locate the source of the problem?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] Mesa 10.6.2 cut-off

2015-07-07 Thread Emil Velikov
Hi all,

As requested by Ilia, a bit of a heads-up:

Any patches sent to mesa-stable and/or landed in master after 12 PM
(noon) GMT, on the 8th of July won't feature in 10.6.2.

Cheers,
Emil


[Mesa-dev] [Bug 91254] (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91254

--- Comment #3 from Chris Wilson ch...@chris-wilson.co.uk ---
You have two end points, a bisection would be very useful and only take a few
minutes (maybe an hour at most?).



Re: [Mesa-dev] [PATCH 04/18] i965: Introduce a context-local batch manager

2015-07-07 Thread Chris Wilson
On Tue, Jul 07, 2015 at 01:14:53PM +0300, Abdiel Janulgue wrote:
 On 07/06/2015 01:33 PM, Chris Wilson wrote:
  @@ -600,7 +593,10 @@ brw_emit_null_surface_state(struct brw_context *brw,
    1 << BRW_SURFACE_WRITEDISABLE_B_SHIFT |
    1 << BRW_SURFACE_WRITEDISABLE_A_SHIFT);
  }
  -   surf[1] = bo ? bo->offset64 : 0;
  +   surf[1] = brw_batch_reloc(brw->batch, *out_offset + 4,
  +                             bo, 0,
  +                             I915_GEM_DOMAIN_RENDER,
  +                             I915_GEM_DOMAIN_RENDER);
 
 null check for bo?

I put the NULL check into the inline variant of brw_batch_reloc() for a
bit of syntactic sugar for these cases.
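
Roughly, the pattern is an inline wrapper that returns 0 for a NULL bo and
only calls the out-of-line relocation path otherwise, which preserves the
old "bo ? bo->offset64 : 0" behaviour at the call site. The sketch below
illustrates only that shape: every demo_* name and type is invented, and
the real brw_batch_reloc() in the series takes read/write domain arguments
that are left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented stand-ins for the buffer object and batch types. */
struct demo_bo {
   uint64_t offset64;
};

struct demo_batch {
   unsigned reloc_count;
};

/* Out-of-line path: record a relocation and return the presumed address. */
static uint64_t
demo_batch_reloc_slow(struct demo_batch *batch, uint32_t batch_offset,
                      struct demo_bo *bo, uint32_t delta)
{
   (void)batch_offset; /* a real implementation would store this */
   batch->reloc_count++;
   return bo->offset64 + delta;
}

/* Inline variant: callers may pass NULL and get a zero address back. */
static inline uint64_t
demo_batch_reloc(struct demo_batch *batch, uint32_t batch_offset,
                 struct demo_bo *bo, uint32_t delta)
{
   if (bo == NULL)
      return 0;

   return demo_batch_reloc_slow(batch, batch_offset, bo, delta);
}
```

With that in place the null surface state code needs no explicit check of
its own.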
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Mesa-dev] [PATCH 01/18] i965: Query whether we have kernel support for the TIMESTAMP register once

2015-07-07 Thread Martin Peres

On 06/07/15 19:12, Chris Wilson wrote:

On Mon, Jul 06, 2015 at 04:19:36PM +0300, Martin Peres wrote:


On 06/07/15 16:15, Martin Peres wrote:

On 06/07/15 16:13, Chris Wilson wrote:

On Mon, Jul 06, 2015 at 03:10:48PM +0300, Martin Peres wrote:

On 06/07/15 13:33, Chris Wilson wrote:

Move the query for the TIMESTAMP register from context init to the
screen, so that it is only queried once for all contexts.

On 32bit systems, some old kernels trigger a hw bug resulting in the
TIMESTAMP register being shifted and the low bits always zero. Detect
this by repeating the read a few times and check the register is
incrementing.

You do not do the latter. You only check for the low bits.

I guess the counter is supposed to be monotonically increasing and
with a resolution of a few microseconds which would make this
perfectly valid. Could you confirm and make sure to add this
information in the commit message please?

The counter should increment every 80ns. What's misleading in what I
wrote? It describes the hw bug and how to detect it.

Well, it is not misleading, it just lacks this information.

If it incremented every second, the patch would be stupid because
the timestamp could be at 0 and polling 10 times at intervals of a few
us would always yield the same result. That's all :)

Oh, forgot to say: With this information added in the commit message
and the commit message duplicated as a comment in
intel_detect_timestamp(), the patch is:

How about:

On 32bit systems, some old kernels trigger a hw bug resulting in the
TIMESTAMP register being shifted and the low 32bits always zero. Detect
this by repeating the read a few times and check the register is
incrementing every 80ns as expected and not stuck on zero (as would be
the case with the buggy kernel/hw.).
-Chris

Perfect!
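
For reference, the detection described in the agreed wording boils down to
a short loop: read the register a few times and accept it as soon as the
low 32 bits are non-zero. The sketch below shows that loop with the
register read abstracted behind a callback so it can be exercised without
hardware. The callback type, the function name, and the two fake readers
are assumptions for illustration; this is not intel_detect_timestamp().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback: reads the TIMESTAMP register, false on failure. */
typedef bool (*timestamp_read_fn)(void *data, uint64_t *value);

/*
 * Return true if the low 32 bits of the counter are seen to be non-zero
 * within the given number of reads. With a counter ticking every 80ns, a
 * healthy register is very unlikely to read zero there repeatedly.
 */
static bool
timestamp_low_bits_advance(timestamp_read_fn read_reg, void *data,
                           int attempts)
{
   for (int i = 0; i < attempts; i++) {
      uint64_t value;

      if (!read_reg(data, &value))
         return false; /* kernel does not support the read at all */

      if ((value & 0xffffffffu) != 0)
         return true;
   }

   return false; /* low bits stuck at zero: shifted, buggy register */
}

/* Fake reader modelling a healthy counter. */
static bool
fake_good_read(void *data, uint64_t *value)
{
   uint64_t *ticks = data;

   *ticks += 3;
   *value = *ticks;
   return true;
}

/* Fake reader modelling the bug: the value is shifted into the high half. */
static bool
fake_shifted_read(void *data, uint64_t *value)
{
   uint64_t *ticks = data;

   *ticks += 3;
   *value = *ticks << 32;
   return true;
}
```

Doing this once at screen creation, as the patch proposes, means the cost
of the repeated reads is not paid per context.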



Re: [Mesa-dev] [PATCH 10/18] i965: Speculatively flush the batch after transform feedback

2015-07-07 Thread Martin Peres



On 06/07/15 22:36, Kenneth Graunke wrote:

On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:

Since the purpose of transform feedback tends to be for the client to
act upon the results to change the geometry in the scene, it is likely
that the client will soon be waiting upon the results. Flush the batch
early so that we don't build up a long queue of commands afterwards that
could delay the readback.
---
  src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 857ebe5..13dbe5b 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context *ctx,
  
    brw_batch_end(brw->batch);
  
+   /* We will likely want to read the results in the very near future, so
+    * push this primitive to hardware if it is currently idle.
+    */
+   if (!brw_batch_busy(brw->batch))
+      brw_batch_flush(brw->batch);
+
 /* EndTransformFeedback() means that we need to update the number of
  * vertices written.  Since it's only necessary if DrawTransformFeedback()
  * is called and it means mapping a buffer object, we delay computing it


We need some data to justify this change.


I actually get a negative perf improvement out of this one, -0.9% on a 
customer benchmark.



Re: [Mesa-dev] [PATCH 2/5] i965/gen9: Plugin the code for selecting YF/YS tiling on skl+

2015-07-07 Thread Kenneth Graunke
On Wednesday, June 10, 2015 03:30:47 PM Anuj Phogat wrote:
 Buffers with Yf/Ys tiling end up using meta upload / download
 paths or the blitter for cases where they used tiled_memcpy paths
 in case of Y tiling. This has exposed some bugs in meta path. To
 avoid any piglit regressions on SKL this patch keeps the Yf/Ys
 tiling disabled at the moment.
 
 V3: Make brw_miptree_choose_tr_mode() actually choose TRMODE. (Ben)
 Few cosmetic changes.
 V4: Get rid of brw_miptree_choose_tr_mode().
 Take care of all tile resource modes {Yf, Ys, none} for all
 generations at one place.
 
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Cc: Ben Widawsky b...@bwidawsk.net
 ---
  src/mesa/drivers/dri/i965/brw_tex_layout.c | 97 
 --
  1 file changed, 79 insertions(+), 18 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
 b/src/mesa/drivers/dri/i965/brw_tex_layout.c
 index b9ac4cf..c0ef5cc 100644
 --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
 +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
 @@ -807,27 +807,88 @@ brw_miptree_layout(struct brw_context *brw,
 enum intel_miptree_tiling_mode requested,
 struct intel_mipmap_tree *mt)
  {
 -   mt->tr_mode = INTEL_MIPTREE_TRMODE_NONE;
 +   const unsigned bpp = mt->cpp * 8;
 +   const bool is_tr_mode_yf_ys_allowed =
 +      brw->gen >= 9 &&
 +      !for_bo &&
 +      !mt->compressed &&
 +      /* Enable YF/YS tiling only for color surfaces because depth and
 +       * stencil surfaces are not supported in blitter using fast copy
 +       * blit and meta PBO upload, download paths. No other paths
 +       * currently support Yf/Ys tiled surfaces.
 +       * FIXME:  Remove this restriction once we have a tiled_memcpy()
 +       * path to do depth/stencil data upload/download to Yf/Ys tiled
 +       * surfaces.
 +       */
 +      _mesa_is_format_color_format(mt->format) &&
 +      (requested == INTEL_MIPTREE_TILING_Y ||
 +       requested == INTEL_MIPTREE_TILING_ANY) &&
 +      (bpp && is_power_of_two(bpp)) &&
 +      /* FIXME: To avoid piglit regressions keep the Yf/Ys tiling
 +       * disabled at the moment.
 +       */
 +      false;

I must say, I was a bit surprised to see this land as is.  You've got a
lot of conditions there, only to finish them up with "&& false" - with a
comment saying that your code isn't passing Piglit yet.  That doesn't
really meet our usual qualifications for merging.

Coverity also pointed out that your if (is_tr_mode_yf_ys_allowed) block
below is dead code, issuing new warnings.
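
To illustrate the Coverity warning, here is a minimal sketch (not the driver code; yf_ys_allowed and choose_first_mode are hypothetical names). A condition chain that ends in "&& false" always evaluates to false, so the branch it guards can never execute:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tile resource modes in the patch. */
enum tr_mode { TRMODE_YF, TRMODE_YS, TRMODE_NONE };

/* Mirrors the shape of the patch's condition: however the real checks
 * evaluate, the trailing "&& false" forces the result to false. */
static bool yf_ys_allowed(int gen, bool compressed)
{
   return gen >= 9 &&
          !compressed &&
          false; /* feature switched off */
}

/* The first branch is unreachable for every input, which is what a
 * static analyzer reports as dead code. */
static enum tr_mode choose_first_mode(int gen, bool compressed)
{
   if (yf_ys_allowed(gen, compressed))
      return TRMODE_YF;
   return TRMODE_NONE;
}
```

Dropping the literal (or replacing it with a runtime switch) would make the branch reachable again; whether that is safe depends on the Piglit failures mentioned in the FIXME.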

Forgive my ignorance, but what's the purpose of Yf/Ys tiling?  My
understanding was that Ys is primarily in support of a new OpenGL
feature - GL_ARB_sparse_texture(*) - which isn't yet enabled:

https://www.opengl.org/registry/specs/ARB/sparse_texture.txt

Is Yf tiling supposed to be more efficient than legacy Y-tiling?  If so,
then switching to it is an optimization, isn't it?  We usually require
data indicating some kind of performance improvement (any kind!) before
landing a bunch of code for optimizations.  Obviously that's pretty
tricky with pre-release hardware, so I'd settle for "it's complete
and functions correctly."

At any rate, it's merged, and hopefully you're able to get it working...

  
 -   intel_miptree_set_alignment(brw, mt);
 -   intel_miptree_set_total_width_height(brw, mt);
 +   /* Lower index (Yf) is the higher priority mode */
 +   const uint32_t tr_mode[3] = {INTEL_MIPTREE_TRMODE_YF,
 +                                INTEL_MIPTREE_TRMODE_YS,
 +                                INTEL_MIPTREE_TRMODE_NONE};
 +   int i = is_tr_mode_yf_ys_allowed ? 0 : ARRAY_SIZE(tr_mode) - 1;
  
 -   if (!mt->total_width || !mt->total_height) {
 -      intel_miptree_release(&mt);
 -      return;
 -   }
 +   while (i < ARRAY_SIZE(tr_mode)) {
 +      if (brw->gen < 9)
 +         assert(tr_mode[i] == INTEL_MIPTREE_TRMODE_NONE);
 +      else
 +         assert(tr_mode[i] == INTEL_MIPTREE_TRMODE_YF ||
 +                tr_mode[i] == INTEL_MIPTREE_TRMODE_YS ||
 +                tr_mode[i] == INTEL_MIPTREE_TRMODE_NONE);
  
 -   /* On Gen9+ the alignment values are expressed in multiples of the block
 -    * size
 -    */
 -   if (brw->gen >= 9) {
 -      unsigned int i, j;
 -      _mesa_get_format_block_size(mt->format, &i, &j);
 -      mt->align_w /= i;
 -      mt->align_h /= j;
 -   }
 +      mt->tr_mode = tr_mode[i];
 +      intel_miptree_set_alignment(brw, mt);
 +      intel_miptree_set_total_width_height(brw, mt);
  
 -   if (!for_bo)
 -      mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
 +      if (!mt->total_width || !mt->total_height) {
 +         intel_miptree_release(&mt);
 +         return;
 +      }
 +
 +      /* On Gen9+ the alignment values are expressed in multiples of the
 +       * block size.
 +       */
 +      if (brw->gen >= 9) {
 +         unsigned int i, j;
 +         _mesa_get_format_block_size(mt->format, &i, &j);
 +         mt->align_w /= i;
 +         mt->align_h /= j;
 +      }
 +
 +  

[Mesa-dev] [PATCH] opencl: use versioned .so in mesa.icd

2015-07-07 Thread Igor Gnatenko
We must reference the versioned library in mesa.icd, because the ICD
loader would otherwise fail when the mesa-devel package is not installed.

Reported-by: Fabian Deutsch fabian.deut...@gmx.de
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73512
Cc: 10.6 mesa-sta...@lists.freedesktop.org
Signed-off-by: Igor Gnatenko i.gnatenko.br...@gmail.com
---
 configure.ac   | 3 +++
 src/gallium/targets/opencl/Makefile.am | 2 +-
 src/gallium/targets/opencl/mesa.icd| 1 -
 src/gallium/targets/opencl/mesa.icd.in | 1 +
 4 files changed, 5 insertions(+), 2 deletions(-)
 delete mode 100644 src/gallium/targets/opencl/mesa.icd
 create mode 100644 src/gallium/targets/opencl/mesa.icd.in

diff --git a/configure.ac b/configure.ac
index d240c06..a7141a3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -64,6 +64,8 @@ m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
 dnl Set internal versions
 OSMESA_VERSION=8
 AC_SUBST([OSMESA_VERSION])
+OPENCL_VERSION=1
+AC_SUBST([OPENCL_VERSION])
 
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.38
@@ -2376,6 +2378,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/targets/libgl-xlib/Makefile
src/gallium/targets/omx/Makefile
src/gallium/targets/opencl/Makefile
+   src/gallium/targets/opencl/mesa.icd
src/gallium/targets/osmesa/Makefile
src/gallium/targets/osmesa/osmesa.pc
src/gallium/targets/pipe-loader/Makefile
diff --git a/src/gallium/targets/opencl/Makefile.am 
b/src/gallium/targets/opencl/Makefile.am
index 70e60e2..af6d760 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -5,7 +5,7 @@ lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la
 lib@OPENCL_LIBNAME@_la_LDFLAGS = \
$(LLVM_LDFLAGS) \
-no-undefined \
-   -version-number 1:0 \
+   -version-number @OPENCL_VERSION@:0 \
$(GC_SECTIONS) \
$(LD_NO_UNDEFINED)
 
diff --git a/src/gallium/targets/opencl/mesa.icd 
b/src/gallium/targets/opencl/mesa.icd
deleted file mode 100644
index 6a6a870..000
--- a/src/gallium/targets/opencl/mesa.icd
+++ /dev/null
@@ -1 +0,0 @@
-libMesaOpenCL.so
diff --git a/src/gallium/targets/opencl/mesa.icd.in 
b/src/gallium/targets/opencl/mesa.icd.in
new file mode 100644
index 000..1b77b4e
--- /dev/null
+++ b/src/gallium/targets/opencl/mesa.icd.in
@@ -0,0 +1 @@
+lib@OPENCL_LIBNAME@.so.@OPENCL_VERSION@
-- 
2.4.5
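
For illustration only, the sketch below shows the name that the mesa.icd.in template would expand to. It assumes OPENCL_LIBNAME expands to MesaOpenCL (inferred from the deleted mesa.icd, which contained libMesaOpenCL.so); icd_library_name is a hypothetical helper and not part of the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the library name that the mesa.icd.in template expands to:
 *    lib@OPENCL_LIBNAME@.so.@OPENCL_VERSION@
 * Returns 0 on success, -1 if the buffer is too small. */
static int icd_library_name(char *buf, size_t len,
                            const char *libname, unsigned version)
{
   int n = snprintf(buf, len, "lib%s.so.%u", libname, version);
   return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With libname "MesaOpenCL" and version 1 this yields libMesaOpenCL.so.1, which is the name the runtime package installs, as opposed to the unversioned name the old mesa.icd listed.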



[Mesa-dev] [Bug 91254] (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91254

--- Comment #1 from Chris Wilson ch...@chris-wilson.co.uk ---
I suspect this is a dup of bug 90839. Does the regression remain on master
or the 10.6 branch?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 04/18] i965: Introduce a context-local batch manager

2015-07-07 Thread Abdiel Janulgue


On 07/06/2015 01:33 PM, Chris Wilson wrote:

 +/*
 + * Add a relocation entry for the target buffer into the current batch.
 + *
 + * This is the heart of performing fast relocations, both here and in
 + * the corresponding kernel relocation routines.
 + *
 + * - Instead of passing in handles for the kernel to convert back into
 + *   the buffer for every relocation, we tell the kernel which
 + *   execobject slot corresponds with the relocation. The kernel is
 + *   able to use a simple LUT constructed as it first looks up each buffer
 + *   for the batch rather than search a small, overfull hashtable. As both
 + *   the number of relocations and buffers in a batch grow, the simple
 + *   LUT is much more efficient (though the LUT itself is less cache
 + *   friendly).
 + *   However, as the batch buffer is by definition the last object in
 + *   the execbuffer array we have to perform a pass to relabel the
 + *   target of all relocations pointing to the batch. (Except when
 + *   the kernel supports batch-first, in which case we can do the relocation
 + *   target processing for the batch inline.)
 + *
 + * - If the kernel has not moved the buffer, it will still be in the same
 + *   location as last time we used it. If we tell the kernel that all the
 + *   relocation entries are the same as the offset for the buffer, then
 + *   the kernel need only check that all the buffers are still in the same
 + *   location and then skip performing relocations entirely. A huge win.
 + *
 + * - As a consequence of telling the kernel to skip processing the relocations,
 + *   we need to tell the kernel about the read/write domains and special needs
 + *   of the buffers.
 + *
 + * - Alternatively, we can request the kernel place the buffer exactly
 + *   where we want it and forgo all relocations to that buffer entirely.
 + *   The buffer is effectively pinned for its lifetime (if the kernel
 + *   does have to move it, for example to swap it out to recover memory,
 + *   the kernel will return it back to our requested location at the start
 + *   of the next batch.) This of course imposes a lot of constraints on where
 + *   we can say the buffers are, they must meet all the alignment constraints
 + *   and not overlap.
 + *
 + * - Essential to all these techniques is that we always use the same
 + *   presumed_offset for the relocations as for submitting the execobject.
 + *   That value must be written into the batch and it must match the value
 + *   we tell the kernel. (This breaks down when using relocation trees shared
 + *   between multiple contexts, hence the need for context-local batch
 + *   management.)
 + *
 + * In contrast to libdrm, we can build the execbuffer array along with
 + * the batch by forgoing the ability to handle general relocation trees.
 + * This avoids having multiple passes to build the execbuffer parameter,
 + * and also gives us a means to cheaply track when a buffer has been
 + * referenced by the batch.
 + */
 +uint64_t __brw_batch_reloc(struct brw_batch *batch,
 +   uint32_t batch_offset,
 +   struct brw_bo *target_bo,
 +   uint64_t target_offset,
 +   unsigned read_domains,
 +   unsigned write_domain)
 +{
 +   assert(target_bo->refcnt);
 +   if (unlikely(target_bo->batch != batch)) {
 +      /* XXX legal sharing between contexts/threads? */
 +      target_bo = brw_bo_import(batch, target_bo->base, true);
 +      if (unlikely(target_bo == NULL))
 +         longjmp(batch->jmpbuf, -ENOMEM);
 +      target_bo->refcnt--; /* kept alive by the implicit active reference */
 +   }
 +   assert(target_bo->batch == batch);
 +
 +   if (target_bo->exec == NULL) {
 +      int n;
 +
 +      /* reserve one exec entry for the batch */
 +      if (unlikely(batch->emit.nexec + 1 == batch->exec_size))
 +         __brw_batch_grow_exec(batch);
 +
 +      n = batch->emit.nexec++;
 +      target_bo->target_handle = has_lut(batch) ? n : target_bo->handle;
 +      target_bo->exec = memset(batch->exec + n, 0, sizeof(*target_bo->exec));
 +      target_bo->exec->handle = target_bo->handle;
 +      target_bo->exec->alignment = target_bo->alignment;
 +      target_bo->exec->offset = target_bo->offset;
 +      if (target_bo->pinned)
 +         target_bo->exec->flags = EXEC_OBJECT_PINNED;
 +
 +      /* Track the total amount of memory in use by all active requests */
 +      if (target_bo->read.rq == NULL) {
 +         batch->rss += target_bo->size;
 +         if (batch->rss > batch->peak_rss)
 +            batch->peak_rss = batch->rss;
 +      }
 +      target_bo->read.rq = batch->next_request;
 +      list_move_tail(&target_bo->read.link, &batch->next_request->read);
 +
 +      batch->aperture += target_bo->size;
 +   }
 +
 +   if (!target_bo->pinned) {
 +      int n;
 +
 +      if (unlikely(batch->emit.nreloc == batch->reloc_size))
 +         __brw_batch_grow_reloc(batch);
 +
 +      n = batch->emit.nreloc++;
 +  
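
The presumed-offset idea from the comment in the quoted patch can be sketched roughly as follows. This is a simplified model under stated assumptions, not the kernel's or the patch's implementation; struct exec_entry and can_skip_relocations are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one buffer in an execbuffer array. */
struct exec_entry {
   uint64_t presumed_offset; /* address userspace wrote into the batch */
   uint64_t actual_offset;   /* address where the buffer really resides */
};

/* Relocation processing can be skipped only if every buffer is still at
 * the address that userspace assumed when it built the batch. */
static bool can_skip_relocations(const struct exec_entry *e, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (e[i].presumed_offset != e[i].actual_offset)
         return false;
   }
   return true;
}
```

The sketch only models the check itself. It also shows why the value written into the batch and the value reported to the kernel must agree: a mismatch for a single buffer forces the slow path for the whole submission.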

Re: [Mesa-dev] [PATCH v2 5/5] i965/gen9: Allocate YF/YS tiled buffer objects

2015-07-07 Thread Kenneth Graunke
On Tuesday, June 23, 2015 01:23:05 PM Anuj Phogat wrote:
 In case of I915_TILING_{X,Y} we need to pass the tiling format to libdrm
 using drm_intel_bo_alloc_tiled(). But in case of YF/YS tiled buffers
 libdrm need not know about the tiling format because these buffers
 don't have hardware support to be tiled or detiled through a fenced
 region. libdrm still needs to know the buffer alignment value for its use
 in the kernel when resolving the relocation.
 
 Using drm_intel_bo_alloc_for_render() for YF/YS tiled buffers
 satisfies both of the above conditions.
 
 V2: Delete min/max buffer size restrictions not valid for i965+.
 Remove redundant align to tile size statements.
 Remove some redundant code now when there are no min/max buffer size.
 
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Cc: Ben Widawsky b...@bwidawsk.net
 ---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 62 
 +--
  1 file changed, 58 insertions(+), 4 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
 b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 index 80c52f2..5bcb094 100644
 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 @@ -558,6 +558,48 @@ intel_lower_compressed_format(struct brw_context *brw, 
 mesa_format format)
 }
  }
  
 +/* This function computes Yf/Ys tiled bo size, alignment and pitch. */
 +static uint64_t
 +intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
 +uint64_t *pitch)

Hi Anuj,

This patch has a subtle bug: you've specified pitch and stride to be
uint64_t here, but below when you call it

[snip]
 @@ -616,11 +658,23 @@ intel_miptree_create(struct brw_context *brw,
alloc_flags |= BO_ALLOC_FOR_RENDER;
  
 unsigned long pitch;
 -   mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree", total_width,
 -                                     total_height, mt->cpp, &mt->tiling,
 -                                     &pitch, alloc_flags);
     mt->etc_format = etc_format;
 -   mt->pitch = pitch;
 +
 +   if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
 +      unsigned alignment = 0;
 +      unsigned long size;
 +      size = intel_get_yf_ys_bo_size(mt, &alignment, &pitch);

...you're passing a pointer to an unsigned long.  On 32-bit builds,
unsigned long is a 4 byte value, while uint64_t is 8 bytes.  This could
lead to stack corruption.  (GCC warns about this during a 32-bit build.)

I assumed the solution was to make everything uint32_t, but apparently
drm_intel_bo_alloc_tiled actually expects an unsigned long.  So we can't
change that.

Then I looked at your code, and realized that nothing even uses the
pitch value.  Is there some point to the parameter existing at all?
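
One way to avoid the width mismatch, sketched under assumptions (compute_size and get_pitch are hypothetical stand-ins, not functions from the patch), is to receive the value in a uint64_t temporary and narrow it explicitly:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical callee that writes a 64-bit pitch through its
 * out-parameter, like the uint64_t *pitch argument in the patch. */
static uint64_t compute_size(uint64_t *pitch)
{
   *pitch = 4096;
   return 8 * 4096;
}

/* Safe caller: receive the value in a uint64_t temporary, then narrow it
 * explicitly.  Passing the address of an unsigned long directly would let
 * the callee write 8 bytes into a 4-byte object on 32-bit builds. */
static unsigned long get_pitch(void)
{
   uint64_t pitch64 = 0;
   uint64_t size = compute_size(&pitch64);

   assert(size != 0);
   assert(pitch64 <= ULONG_MAX); /* value fits in unsigned long */
   return (unsigned long)pitch64;
}
```

If the pitch turns out to be unused, dropping the parameter altogether would be simpler than either approach.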

--Ken

 +      assert(size);
 +      mt->bo = drm_intel_bo_alloc_for_render(brw->bufmgr, "miptree",
 +                                             size, alignment);
 +      mt->pitch = pitch;
 +   } else {
 +      mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree",
 +                                        total_width, total_height, mt->cpp,
 +                                        &mt->tiling, &pitch,
 +                                        alloc_flags);
 +      mt->pitch = pitch;
 +   }
  
 /* If the BO is too large to fit in the aperture, we need to use the
  * BLT engine to support it.  Prior to Sandybridge, the BLT paths can't
 




Re: [Mesa-dev] [PATCH 05/18] i965: Reuse our VBO for streaming fast-clear vertices

2015-07-07 Thread Martin Peres

On 06/07/15 19:43, Kenneth Graunke wrote:

On Monday, July 06, 2015 11:33:10 AM Chris Wilson wrote:

Rather than allocating a fresh page every time we clear a buffer, keep
that page around between invocations by tracking the last used offset
and only allocating a fresh page when we wrap.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 17 ++---
  1 file changed, 14 insertions(+), 3 deletions(-)

This looks okay to me.  Do you have any performance data to justify the
extra complexity?


I actually see a performance regression on a customer benchmark
(-1.3%). Could it be because we are waiting on the VBO at some
point?


Which benchmark did you use to measure a perf improvement?
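
For reference, the scheme described in the quoted commit message (track the last used offset, start a fresh page only on wrap) can be sketched as below. This is an illustrative model with hypothetical names (stream_page, stream_alloc, STREAM_PAGE_SIZE), not the code from the patch:

```c
#include <assert.h>

#define STREAM_PAGE_SIZE 4096u

/* Hypothetical streaming-upload state. */
struct stream_page {
   unsigned offset;     /* next free byte in the current page */
   unsigned generation; /* bumped each time a fresh page is needed */
};

/* Reserve size bytes and return their offset in the current page.
 * A fresh page is started only when the request no longer fits. */
static unsigned stream_alloc(struct stream_page *p, unsigned size)
{
   assert(size <= STREAM_PAGE_SIZE);
   if (p->offset + size > STREAM_PAGE_SIZE) {
      p->generation++; /* stands in for allocating a new buffer */
      p->offset = 0;
   }
   unsigned start = p->offset;
   p->offset += size;
   return start;
}
```

Note that reusing a page this way is only safe if earlier contents are not overwritten while the GPU may still read them, which is why the sketch never rewinds within a generation.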


Re: [Mesa-dev] [PATCH 04/18] i965: Introduce a context-local batch manager

2015-07-07 Thread Abdiel Janulgue


On 07/07/2015 01:19 PM, Chris Wilson wrote:
 On Tue, Jul 07, 2015 at 01:14:53PM +0300, Abdiel Janulgue wrote:
 On 07/06/2015 01:33 PM, Chris Wilson wrote:
 @@ -600,7 +593,10 @@ brw_emit_null_surface_state(struct brw_context *brw,
                   1 << BRW_SURFACE_WRITEDISABLE_B_SHIFT |
                   1 << BRW_SURFACE_WRITEDISABLE_A_SHIFT);
     }
 -   surf[1] = bo ? bo->offset64 : 0;
 +   surf[1] = brw_batch_reloc(&brw->batch, *out_offset + 4,
 +                             bo, 0,
 +                             I915_GEM_DOMAIN_RENDER,
 +                             I915_GEM_DOMAIN_RENDER);

 null check for bo?
 
 I put the NULL check into the inline variant of brw_batch_reloc() for a
 bit of syntactic sugar for these cases.
 -Chris
 

You're right. I failed to notice the inline variant.


Re: [Mesa-dev] [PATCH 04/18] i965: Introduce a context-local batch manager

2015-07-07 Thread Kenneth Graunke
Hi Chris,

I made a genuine effort to review this patch, hoping to better understand
the various changes and what you were trying to accomplish.  I spent many
hours reading and trying to enumerate changes - or potential changes I
needed to look hard at to convince myself whether they were correct.

I came up with a frighteningly long list of changes:

* Relocation handling changes considerably (the original point of
  Kristian's endeavour which led up to this).

* Fencing, busy tracking, and sync objects are completely reworked.

* Render-to-texture cache flushing and dirty buffer tracking is
  completely reworked.

* Gen7 SOL buffer offset resetting now uses MI_LOAD_REGISTER_IMM rather
  than the execbuf2 parameter, requiring the command validator on Haswell.
  This effectively bumps the kernel requirement from v3.6 to v4.2-rc1,
  which will simply not fly with distributions at this time.

* glBufferSubData() now uses intel_upload_data() rather than allocating
  a temporary BO.  This is the first use of the upload buffer by the
  BLT engine, and could imply that the upload buffer's lifetime now
  extends across batches - longer than before.  Separable change that
  requires separate evaluation and justification.

* Per-buffer cache-coherency checking rather than brw->has_llc?

* glBufferSubData()'s prefer_stall_to_blit flag appears to depend on
  per-buffer cache-coherency rather than being set globally.  Could
  impact performance of buffer uploads.

* Potential missing flushes (which can cause hangs or misrendering):

  - It looks like calling brw_bo_busy() with BUSY_FLUSH causes a flush
when necessary.  However, some instances of the old bo_busy,
bo_references, batch_flush pattern are replaced without that flag.
    One occurrence was in BufferSubData(); I did not spend time to
check every case.

  - Flushes are often done implicitly by e.g. brw_bo_read calling
brw_bo_map with the appropriate flags, and many explicit checks
and flushes are removed.  Not bad, but needs careful review.

  - Gen6+ query object code might have dropped an implicit flush
guaranteeing that when the GL application requests the result,
any pending work will be kicked off so they can poll/spin
repeatedly until the result arrives.

  - New code to avoid redundant flushes.

* perf_debug() warnings are removed all over the code for some reason:

  - Unsynchronized maps/BufferSubData not working on !LLC platforms?
If they work now, that's a huge change!  If not, why drop the warning?

  - Warnings about stalls on mapping buffers and miptrees are gone now.
These have been useful in tracking down performance problems.  They
might not always be accurate, but surely removing them should be done
separately with justification?

  - Warnings about stalls on query objects are gone.  I've used these when
analyzing application performance.  Why?

  - Warnings about implicit flushes are gone.

* BO unmap calls appear to be missing in some places.  A few map calls
  have moved around in hard-to-follow ways.  Unclear how lifetimes of
  buffers and lifetimes of maps are affected.

* Possible mmap vs. pwrite preference changes?  Hard to follow.

* Texture upload (tiled_memcpy) changes, which is notoriously fragile
  and can lose all of the performance benefit if the compiler isn't able
  to optimize it just right.  Ideally separate.

* Assertions change to GL errors in brw_get_graphics_reset_status().

* Aperture space checking significantly reworked, especially for the BLT
  paths.  Honestly, a lot nicer, but couldn't this be separated?

* The bo_reuse driconf option is removed.

* Gen4-5 structure changes.

* brw_get_timestamp() - removes initialization of result to 0.
  Probably unnecessary and OK to delete; should be separate.

* New helper functions and coding patterns.  Separable.

* Noise (renaming, moving code between files, some other trivial changes
  like removing 'brw' variables and moving code into else blocks).

* ...I probably missed some things.

Based upon this, I cannot in good conscience consider merging this patch.
The potential for breakage is staggering.  As a proof-of-concept, you've
done an excellent job in proving we can do much better, and introduced a
lot of good ideas.  But there's a lot of work left to be done before we
can consider applying it to our production quality driver.

Please advise whether you would like to work towards making a mergeable,
incremental patch series, or if someone else should embark on that
endeavour.

--Ken




Re: [Mesa-dev] [PATCH] opencl: use versioned .so in mesa.icd

2015-07-07 Thread Michel Dänzer
On 07.07.2015 19:05, Igor Gnatenko wrote:
 We must reference the versioned library in mesa.icd, because the ICD
 loader would otherwise fail when the mesa-devel package is not installed.
 
 Reported-by: Fabian Deutsch fabian.deut...@gmx.de
 Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73512
 Cc: 10.6 mesa-sta...@lists.freedesktop.org
 Signed-off-by: Igor Gnatenko i.gnatenko.br...@gmail.com
 ---
  configure.ac   | 3 +++
  src/gallium/targets/opencl/Makefile.am | 2 +-
  src/gallium/targets/opencl/mesa.icd| 1 -
  src/gallium/targets/opencl/mesa.icd.in | 1 +
  4 files changed, 5 insertions(+), 2 deletions(-)
  delete mode 100644 src/gallium/targets/opencl/mesa.icd
  create mode 100644 src/gallium/targets/opencl/mesa.icd.in
 
 diff --git a/configure.ac b/configure.ac
 index d240c06..a7141a3 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -64,6 +64,8 @@ m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
  dnl Set internal versions
  OSMESA_VERSION=8
  AC_SUBST([OSMESA_VERSION])
 +OPENCL_VERSION=1
 +AC_SUBST([OPENCL_VERSION])
  
  dnl Versions for external dependencies
  LIBDRM_REQUIRED=2.4.38
 @@ -2376,6 +2378,7 @@ AC_CONFIG_FILES([Makefile
   src/gallium/targets/libgl-xlib/Makefile
   src/gallium/targets/omx/Makefile
   src/gallium/targets/opencl/Makefile
 + src/gallium/targets/opencl/mesa.icd
   src/gallium/targets/osmesa/Makefile
   src/gallium/targets/osmesa/osmesa.pc
   src/gallium/targets/pipe-loader/Makefile
 diff --git a/src/gallium/targets/opencl/Makefile.am 
 b/src/gallium/targets/opencl/Makefile.am
 index 70e60e2..af6d760 100644
 --- a/src/gallium/targets/opencl/Makefile.am
 +++ b/src/gallium/targets/opencl/Makefile.am
 @@ -5,7 +5,7 @@ lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la
  lib@OPENCL_LIBNAME@_la_LDFLAGS = \
   $(LLVM_LDFLAGS) \
   -no-undefined \
 - -version-number 1:0 \
 + -version-number @OPENCL_VERSION@:0 \
   $(GC_SECTIONS) \
   $(LD_NO_UNDEFINED)
  
 diff --git a/src/gallium/targets/opencl/mesa.icd 
 b/src/gallium/targets/opencl/mesa.icd
 deleted file mode 100644
 index 6a6a870..000
 --- a/src/gallium/targets/opencl/mesa.icd
 +++ /dev/null
 @@ -1 +0,0 @@
 -libMesaOpenCL.so
 diff --git a/src/gallium/targets/opencl/mesa.icd.in 
 b/src/gallium/targets/opencl/mesa.icd.in
 new file mode 100644
 index 000..1b77b4e
 --- /dev/null
 +++ b/src/gallium/targets/opencl/mesa.icd.in
 @@ -0,0 +1 @@
 +lib@OPENCL_LIBNAME@.so.@OPENCL_VERSION@
 

Acked-by: Michel Dänzer michel.daen...@amd.com


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] clover: little OpenCL status code logging clean

2015-07-07 Thread EdB
s/build_error/compile_error/ in order to match the stored OpenCL status code.
Make program::build catch and log every OpenCL error.
Make TGSI raise errors the same way the LLVM path does.
---
Note that the compile_error class is kept for later use.

 .../state_trackers/clover/core/compiler.hpp|  3 ++-
 src/gallium/state_trackers/clover/core/error.hpp   |  4 ++--
 src/gallium/state_trackers/clover/core/program.cpp |  4 ++--
 .../state_trackers/clover/llvm/invocation.cpp  | 18 +++---
 .../state_trackers/clover/tgsi/compiler.cpp| 28 +-
 5 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp 
b/src/gallium/state_trackers/clover/core/compiler.hpp
index c68aa39..2076417 100644
--- a/src/gallium/state_trackers/clover/core/compiler.hpp
+++ b/src/gallium/state_trackers/clover/core/compiler.hpp
@@ -37,7 +37,8 @@ namespace clover {
                                const std::string &opts,
                                std::string &r_log);
 
-   module compile_program_tgsi(const std::string &source);
+   module compile_program_tgsi(const std::string &source,
+                               std::string &r_log);
 }
 
 #endif
diff --git a/src/gallium/state_trackers/clover/core/error.hpp 
b/src/gallium/state_trackers/clover/core/error.hpp
index 780b973..59a5af4 100644
--- a/src/gallium/state_trackers/clover/core/error.hpp
+++ b/src/gallium/state_trackers/clover/core/error.hpp
@@ -65,9 +65,9 @@ namespace clover {
   cl_int code;
};
 
-   class build_error : public error {
+   class compile_error : public error {
public:
-      build_error(const std::string &what = "") :
+      compile_error(const std::string &what = "") :
  error(CL_COMPILE_PROGRAM_FAILURE, what) {
   }
};
diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
b/src/gallium/state_trackers/clover/core/program.cpp
index 0d6cc40..6eebd9c 100644
--- a/src/gallium/state_trackers/clover/core/program.cpp
+++ b/src/gallium/state_trackers/clover/core/program.cpp
@@ -56,14 +56,14 @@ program::build(const ref_vector<device> &devs, const char *opts,
 
       try {
          auto module = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
-                       compile_program_tgsi(_source) :
+                       compile_program_tgsi(_source, log) :
                        compile_program_llvm(_source, headers,
                                             dev.ir_format(),
                                             dev.ir_target(), build_opts(dev),
                                             log));
          _binaries.insert({ dev, module });
          _logs.insert({ dev, log });
-      } catch (const build_error &) {
+      } catch (const error &) {
          _logs.insert({ dev, log });
          throw;
       }
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 9b91fee..967284d 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -108,7 +108,7 @@ namespace {
  name, llvm::MemoryBuffer::getMemBuffer(source));
 
   if (!c.ExecuteAction(act))
- throw build_error(log);
+ throw compile_error(log);
}
 
module
@@ -256,7 +256,7 @@ namespace {
   r_log = log;
 
   if (!ExecSuccess)
- throw build_error();
+ throw compile_error();
 
       // Get address spaces map to be able to find kernel argument address space
   memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
@@ -485,7 +485,7 @@ namespace {
   LLVMDisposeMessage(err_message);
 
   if (err) {
- throw build_error();
+ throw compile_error();
   }
}
 
@@ -505,7 +505,7 @@ namespace {
       if (LLVMGetTargetFromTriple(triple.c_str(), &target, &error_message)) {
  r_log = std::string(error_message);
  LLVMDisposeMessage(error_message);
- throw build_error();
+ throw compile_error();
   }
 
   LLVMTargetMachineRef tm = LLVMCreateTargetMachine(
@@ -514,7 +514,7 @@ namespace {
 
   if (!tm) {
          r_log = "Could not create TargetMachine: " + triple;
- throw build_error();
+ throw compile_error();
   }
 
   if (dump_asm) {
@@ -567,7 +567,7 @@ namespace {
 const char *name;
             if (gelf_getshdr(section, &symtab_header) != &symtab_header) {
                r_log = "Failed to read ELF section header.";
-   throw build_error();
+   throw compile_error();
 }
 name = elf_strptr(elf, section_str_index, symtab_header.sh_name);
             if (!strcmp(name, ".symtab")) {
@@ -577,9 +577,9 @@ namespace {
  }
  if (!symtab) {
             r_log = "Unable to find symbol table.";
-throw build_error();
+throw compile_error();
  }
-  } 

[Mesa-dev] [Bug 91259] FS compile failed: Register spilling not supported with m14 used

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91259

Bug ID: 91259
   Summary: FS compile failed: Register spilling not supported
with m14 used
   Product: Mesa
   Version: 10.6
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: maxweiss.1...@googlemail.com
QA Contact: mesa-dev@lists.freedesktop.org

Failed to run a JavaFX application.

System:
- Arch Linux (last full package update 07 July 2015) (64 bit)
- Intel HD 3000
- Oracle JDK 8
- mesa 10.6.1-1

Symptoms:
A GUI with blurry fonts and inverted colors after a pile of exceptions.

Example Exception trace:
---
Please report at https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa
Program link log: FS compile failed: Register spilling not supported with m14 used

java.lang.RuntimeException: Error creating shader program
at com.sun.prism.es2.ES2Shader.createFromSource(ES2Shader.java:158)
at com.sun.prism.es2.ES2Shader.createFromSource(ES2Shader.java:173)
at com.sun.prism.es2.ES2ResourceFactory.createShader(ES2ResourceFactory.java:219)
at com.sun.scenario.effect.impl.prism.ps.PPSRenderer.createShader(PPSRenderer.java:203)
at com.sun.scenario.effect.impl.prism.ps.PPSLinearConvolvePeer.createShader(PPSLinearConvolvePeer.java:102)
at com.sun.scenario.effect.impl.prism.ps.PPSOneSamplerPeer.filterImpl(PPSOneSamplerPeer.java:90)
at com.sun.scenario.effect.impl.prism.ps.PPSEffectPeer.filter(PPSEffectPeer.java:54)
at com.sun.scenario.effect.LinearConvolveCoreEffect.filterImageDatas(LinearConvolveCoreEffect.java:85)
at com.sun.scenario.effect.LinearConvolveCoreEffect.filterImageDatas(LinearConvolveCoreEffect.java:41)
at com.sun.scenario.effect.FilterEffect.filter(FilterEffect.java:195)
at com.sun.scenario.effect.impl.prism.PrEffectHelper.render(PrEffectHelper.java:166)
at com.sun.javafx.sg.prism.EffectFilter.render(EffectFilter.java:61)
at com.sun.javafx.sg.prism.NGNode.renderEffect(NGNode.java:2379)
at com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2064)
at com.sun.javafx.sg.prism.NGImageView.doRender(NGImageView.java:103)
at com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1959)
at com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
at com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2067)
at com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1959)
at com.sun.javafx.tk.quantum.ViewPainter.doPaint(ViewPainter.java:474)
at com.sun.javafx.tk.quantum.ViewPainter.paintImpl(ViewPainter.java:327)
at com.sun.javafx.tk.quantum.UploadingPainter.run(UploadingPainter.java:133)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at com.sun.javafx.tk.RenderJob.run(RenderJob.java:58)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at com.sun.javafx.tk.quantum.QuantumRenderer$PipelineRunnable.run(QuantumRenderer.java:125)
at java.lang.Thread.run(Thread.java:745)



[Mesa-dev] [Bug 91259] FS compile failed: Register spilling not supported with m14 used

2015-07-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91259

--- Comment #1 from maxweiss.1...@googlemail.com ---
Sorry I've failed to post the whole trace, here's the rest:

java.lang.IllegalStateException: Operation requires resource lock
at com.sun.prism.impl.ManagedResource.assertLocked(ManagedResource.java:96)
at com.sun.prism.impl.BaseTexture.assertLocked(BaseTexture.java:267)
at com.sun.prism.impl.ps.BaseShaderContext.setTexture(BaseShaderContext.java:689)
at com.sun.prism.impl.ps.BaseShaderContext.validateTextureOp(BaseShaderContext.java:585)
at com.sun.prism.impl.ps.BaseShaderContext.validateTextureOp(BaseShaderContext.java:501)
at com.sun.prism.impl.BaseGraphics.drawTextureRaw(BaseGraphics.java:703)
at com.sun.scenario.effect.impl.prism.ps.PPSOneSamplerPeer.filterImpl(PPSOneSamplerPeer.java:117)
at com.sun.scenario.effect.impl.prism.ps.PPSEffectPeer.filter(PPSEffectPeer.java:54)
at com.sun.scenario.effect.LinearConvolveCoreEffect.filterImageDatas(LinearConvolveCoreEffect.java:85)
at com.sun.scenario.effect.LinearConvolveCoreEffect.filterImageDatas(LinearConvolveCoreEffect.java:41)
at com.sun.scenario.effect.FilterEffect.filter(FilterEffect.java:195)
at com.sun.scenario.effect.impl.prism.PrEffectHelper.render(PrEffectHelper.java:166)
at com.sun.javafx.sg.prism.EffectFilter.render(EffectFilter.java:61)
at com.sun.javafx.sg.prism.NGNode.renderEffect(NGNode.java:2379)
at com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2064)
at com.sun.javafx.sg.prism.NGImageView.doRender(NGImageView.java:103)
at com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1959)
at com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
at com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2067)
at com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1959)
at com.sun.javafx.tk.quantum.ViewPainter.doPaint(ViewPainter.java:474)
at com.sun.javafx.tk.quantum.ViewPainter.paintImpl(ViewPainter.java:327)
at com.sun.javafx.tk.quantum.UploadingPainter.run(UploadingPainter.java:133)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at com.sun.javafx.tk.RenderJob.run(RenderJob.java:58)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at com.sun.javafx.tk.quantum.QuantumRenderer$PipelineRunnable.run(QuantumRenderer.java:125)
at java.lang.Thread.run(Thread.java:745)
